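For a linear model $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x})$ with squared error function:

$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left( t_n - \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}_n) \right)^2$$

Here $t_n$ is the target for input $\mathbf{x}_n$ and $\boldsymbol{\phi}(\mathbf{x}_n)$ is its feature vector.
The optimal parameters are found by:
- Setting the gradient to zero: $\nabla_{\mathbf{w}} E(\mathbf{w}) = \mathbf{0}$
- Solving the resulting equation for $\mathbf{w}$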
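Simplification: write the error in matrix form, where $\boldsymbol{\Phi}$ is the design matrix whose $n$-th row is $\boldsymbol{\phi}(\mathbf{x}_n)^\top$ and $\mathbf{t} = (t_1, \dots, t_N)^\top$ is the target vector:

$$E(\mathbf{w}) = \frac{1}{2} (\mathbf{t} - \boldsymbol{\Phi}\mathbf{w})^\top (\mathbf{t} - \boldsymbol{\Phi}\mathbf{w})$$

Setting the gradient to zero:

$$\nabla_{\mathbf{w}} E(\mathbf{w}) = -\boldsymbol{\Phi}^\top (\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}) = \mathbf{0}$$

Solve for $\mathbf{w}$:

$$\boldsymbol{\Phi}^\top \boldsymbol{\Phi}\, \mathbf{w} = \boldsymbol{\Phi}^\top \mathbf{t}$$

The optimal parameter vector is (provided $\boldsymbol{\Phi}^\top \boldsymbol{\Phi}$ is invertible):

$$\mathbf{w}^* = (\boldsymbol{\Phi}^\top \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^\top \mathbf{t}$$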
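
The closed-form least-squares solution can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the names `fit_least_squares`, `Phi`, and `t` are chosen here, and the toy data (a noise-free line with intercept 1 and slope 2) is an assumption made only to keep the example checkable.

```python
import numpy as np

def fit_least_squares(Phi, t):
    """Return w solving the normal equations Phi^T Phi w = Phi^T t.

    Phi: (N, M) design matrix, t: (N,) target vector (illustrative names).
    """
    # Solve the linear system directly instead of forming an explicit inverse.
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ t)

# Toy data (assumed for illustration): targets lie exactly on t = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
Phi = np.column_stack([np.ones_like(x), x])  # columns: bias feature, x
t = 1.0 + 2.0 * x

w = fit_least_squares(Phi, t)

# At the optimum the gradient -Phi^T (t - Phi w) should vanish.
grad = -Phi.T @ (t - Phi @ w)
print(w, grad)
```

When the design matrix is ill-conditioned, `np.linalg.lstsq(Phi, t, rcond=None)` is usually a more robust way to get the same minimizer than solving the normal equations.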
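For the regularized case (MAP estimation), the penalty $\frac{\lambda}{2} \lVert \mathbf{w} \rVert^2$ is added to the squared error, and the optimal parameter vector becomes:

$$\mathbf{w}^* = (\boldsymbol{\Phi}^\top \boldsymbol{\Phi} + \lambda \mathbf{I})^{-1} \boldsymbol{\Phi}^\top \mathbf{t}$$

where $\lambda$ is the regularization parameter, $\boldsymbol{\Phi}$ is the design matrix, and $\mathbf{t}$ is the target vector.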
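
A hedged NumPy sketch of the regularized solution is given below. It assumes the quadratic penalty (lambda / 2) * ||w||^2 on the weights, and the names `fit_ridge`, `Phi`, `t`, and `lam`, as well as the toy data, are illustrative choices rather than part of the notes.

```python
import numpy as np

def fit_ridge(Phi, t, lam):
    """Return w solving (Phi^T Phi + lam * I) w = Phi^T t.

    Phi: (N, M) design matrix, t: (N,) targets, lam: regularization
    parameter (all names are illustrative).
    """
    n_features = Phi.shape[1]
    A = Phi.T @ Phi + lam * np.eye(n_features)
    return np.linalg.solve(A, Phi.T @ t)

# Toy data (assumed for illustration): targets lie exactly on t = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
Phi = np.column_stack([np.ones_like(x), x])
t = 1.0 + 2.0 * x

w_unreg = fit_ridge(Phi, t, lam=0.0)   # lam = 0 recovers plain least squares
w_reg = fit_ridge(Phi, t, lam=10.0)    # larger lam shrinks the weights

# Gradient of the regularized error at w_reg: -Phi^T (t - Phi w) + lam * w
grad_reg = -Phi.T @ (t - Phi @ w_reg) + 10.0 * w_reg
print(w_unreg, w_reg)
```

Note that this sketch penalizes every weight, including the bias term; leaving the bias unpenalized is a common variation and would require a different matrix in place of `np.eye(n_features)`.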