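For a linear model with a squared error function:

$$E(\mathbf{w}) = \frac{1}{2}\sum_{n=1}^{N}\left(t_n - \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}_n)\right)^2$$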

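The optimal parameters are found by:

  1. Setting the gradient of the error with respect to the parameters to zero: $\nabla_{\mathbf{w}} E(\mathbf{w}) = \mathbf{0}$
  2. Solving the resulting equation for $\mathbf{w}$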

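Simplification: write the error in matrix form, where $\boldsymbol{\Phi}$ is the design matrix (row $n$ is $\boldsymbol{\phi}(\mathbf{x}_n)^\top$) and $\mathbf{t}$ is the vector of targets:

$$E(\mathbf{w}) = \frac{1}{2}\left(\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\right)^\top\left(\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\right)$$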
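
As a rough illustration of the matrix form, the sketch below builds a design matrix and checks numerically that the summed and matrix versions of the squared error agree. The polynomial basis, the helper names (`design_matrix`, `error_sum`, `error_matrix`) and the toy data are assumptions made for this example, not part of the notes.

```python
import numpy as np

def design_matrix(x, degree):
    """Polynomial design matrix: row n is (1, x_n, x_n^2, ..., x_n^degree)."""
    # increasing=True puts the constant column first.
    return np.vander(x, degree + 1, increasing=True)

def error_sum(w, Phi, t):
    """Squared error written as an explicit sum over data points."""
    return 0.5 * sum((t_n - w @ phi_n) ** 2 for phi_n, t_n in zip(Phi, t))

def error_matrix(w, Phi, t):
    """The same error written in matrix form: 0.5 * (t - Phi w)^T (t - Phi w)."""
    r = t - Phi @ w
    return 0.5 * r @ r

# Hypothetical toy data and an arbitrary parameter vector.
x = np.array([0.0, 1.0, 2.0, 3.0])
t = np.array([1.0, 3.0, 2.0, 5.0])
Phi = design_matrix(x, degree=2)  # shape (N, M) = (4, 3)
w = np.array([0.5, 1.0, -0.1])

print(error_sum(w, Phi, t), error_matrix(w, Phi, t))
```

The two printed values should match up to floating-point rounding; the matrix version is the one that makes the gradient easy to write down.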

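Setting the gradient to zero:

$$\nabla_{\mathbf{w}} E(\mathbf{w}) = -\boldsymbol{\Phi}^\top\left(\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\right) = \mathbf{0}$$

Solve for $\mathbf{w}$:

$$\boldsymbol{\Phi}^\top\boldsymbol{\Phi}\,\mathbf{w} = \boldsymbol{\Phi}^\top\mathbf{t}$$

The optimal parameter vector is (assuming $\boldsymbol{\Phi}^\top\boldsymbol{\Phi}$ is invertible):

$$\mathbf{w}^{*} = \left(\boldsymbol{\Phi}^\top\boldsymbol{\Phi}\right)^{-1}\boldsymbol{\Phi}^\top\mathbf{t}$$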
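
The closed-form solution can be sketched in NumPy as follows. This is a minimal example under stated assumptions: the function name `fit_least_squares`, the basis functions (1, x) and the noise-free toy data are all made up for illustration, and the design matrix is assumed to have full column rank.

```python
import numpy as np

def fit_least_squares(Phi, t):
    """Solve the normal equations Phi^T Phi w = Phi^T t for w."""
    # Solving the linear system avoids forming the explicit inverse,
    # which is usually more accurate. Requires Phi^T Phi to be invertible.
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ t)

# Hypothetical toy data generated exactly by t = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
t = 1.0 + 2.0 * x
Phi = np.column_stack([np.ones_like(x), x])  # basis functions (1, x)

w_star = fit_least_squares(Phi, t)

# Gradient of the squared error at the solution; it should vanish.
grad = -Phi.T @ (t - Phi @ w_star)
print(w_star, grad)
```

For poorly conditioned design matrices, `np.linalg.lstsq(Phi, t, rcond=None)` is often a safer choice than solving the normal equations directly, since it does not square the condition number.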

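For the regularized case (MAP estimation), with a quadratic penalty $\frac{\lambda}{2}\mathbf{w}^\top\mathbf{w}$ added to the error:

$$\mathbf{w}_{\text{MAP}} = \left(\lambda\mathbf{I} + \boldsymbol{\Phi}^\top\boldsymbol{\Phi}\right)^{-1}\boldsymbol{\Phi}^\top\mathbf{t}$$

where $\lambda$ is the regularization parameter.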
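
A small sketch of the regularized solution, assuming the penalty is quadratic, i.e. (lam / 2) * w^T w added to the squared error, which corresponds to a zero-mean Gaussian prior on the parameters. The function name `fit_regularized`, the toy data and the values of `lam` are hypothetical choices for this example.

```python
import numpy as np

def fit_regularized(Phi, t, lam):
    """Solve (lam * I + Phi^T Phi) w = Phi^T t for w."""
    M = Phi.shape[1]  # number of parameters
    return np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Hypothetical noisy toy data, roughly t = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
t = np.array([1.1, 2.9, 5.2, 6.8])
Phi = np.column_stack([np.ones_like(x), x])  # basis functions (1, x)

w_0 = fit_regularized(Phi, t, lam=0.0)      # reduces to unregularized least squares
w_small = fit_regularized(Phi, t, lam=1.0)
w_large = fit_regularized(Phi, t, lam=100.0)

# Larger lam shrinks the parameter vector toward zero.
for w in (w_0, w_small, w_large):
    print(w, np.linalg.norm(w))
```

With `lam > 0` the system matrix is always invertible, so the regularized solution exists even when `Phi.T @ Phi` alone is singular.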