Finding Optimum Parameters for Linear Learning Model

For a linear model $y (x, w) = w^{T} ϕ (x)$ with the sum-of-squares error function:

E (w) = \frac{1}{2} \sum_{n = 1}^{N} (t_{n} - w^{T} ϕ (x_{n}))^{2}

The optimal parameters $w^{*}$ are found by:

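Simplification: write the error in matrix form, where $Φ$ is the design matrix whose $n$-th row is $ϕ (x_{n})^{T}$ and $t = (t_{1}, \dots, t_{N})^{T}$ is the vector of targets: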

E (w) = \frac{1}{2} ∥ t - Φ w ∥^{2}
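
As a sanity check on the matrix form, here is a minimal NumPy sketch that builds a design matrix and evaluates the error both ways. The polynomial basis $ϕ (x) = (1, x, x^{2}, \dots)$, the sample data, and the function names `design_matrix`, `error_sum`, and `error_matrix` are assumptions made for illustration; the notes do not fix a particular basis.

```python
import numpy as np

def design_matrix(x, degree):
    """Row n is phi(x_n) = [1, x_n, ..., x_n^degree] (assumed polynomial basis)."""
    return np.vander(x, degree + 1, increasing=True)

def error_sum(w, x, t, degree):
    """E(w) as the explicit sum over the N data points."""
    total = 0.0
    for x_n, t_n in zip(x, t):
        phi_n = x_n ** np.arange(degree + 1)  # basis vector for one input
        total += (t_n - w @ phi_n) ** 2
    return 0.5 * total

def error_matrix(w, Phi, t):
    """E(w) = 0.5 * ||t - Phi w||^2, the matrix form."""
    residual = t - Phi @ w
    return 0.5 * residual @ residual

# Made-up data, purely for illustration.
x = np.array([0.0, 0.5, 1.0, 1.5])
t = np.array([1.0, 0.2, -0.4, 0.3])
w = np.array([0.5, -1.0, 0.25])

Phi = design_matrix(x, degree=2)
e_sum = error_sum(w, x, t, degree=2)
e_mat = error_matrix(w, Phi, t)
```

Both evaluations should agree up to floating-point rounding, since the matrix form only stacks the per-point residuals into one vector.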

Setting the gradient with respect to $w$ to zero:

\nabla_{w} E (w) = - Φ^{T} (t - Φ w) = 0
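
To see where this gradient comes from, the squared norm can be expanded term by term (a short worked step using only the symbols already defined):

E (w) = \frac{1}{2} (t - Φ w)^{T} (t - Φ w) = \frac{1}{2} t^{T} t - w^{T} Φ^{T} t + \frac{1}{2} w^{T} Φ^{T} Φ w

Differentiating each term with respect to $w$ gives

\nabla_{w} E (w) = - Φ^{T} t + Φ^{T} Φ w = - Φ^{T} (t - Φ w)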

Rearranging gives the normal equations for $w$:

Φ^{T} Φ w = Φ^{T} t

Assuming $Φ^{T} Φ$ is invertible, the optimal parameter vector is:

w^{*} = (Φ^{T} Φ)^{- 1} Φ^{T} t
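
The closed-form solution can be sketched in NumPy as follows. This is one possible implementation, not the only one: it solves the normal equations with `np.linalg.solve` rather than forming the inverse explicitly, which is generally better conditioned. The function name `fit_least_squares`, the straight-line basis $ϕ (x) = (1, x)$, and the noiseless synthetic data are assumptions made for illustration.

```python
import numpy as np

def fit_least_squares(Phi, t):
    """Solve Phi^T Phi w = Phi^T t for w (assumes Phi^T Phi is invertible)."""
    A = Phi.T @ Phi
    b = Phi.T @ t
    return np.linalg.solve(A, b)

# Synthetic, noiseless data generated from known weights (illustrative only).
x = np.linspace(0.0, 1.0, 20)
Phi = np.column_stack([np.ones_like(x), x])  # assumed basis: phi(x) = (1, x)
true_w = np.array([1.0, -2.0])
t = Phi @ true_w

w_star = fit_least_squares(Phi, t)

# At the optimum the gradient -Phi^T (t - Phi w) should vanish.
gradient = -Phi.T @ (t - Phi @ w_star)
```

On noiseless data like this, `w_star` should match `true_w` up to rounding. In practice `np.linalg.lstsq(Phi, t, rcond=None)` is a common alternative that also copes with a rank-deficient $Φ$.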

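For the regularized case (MAP estimation with a Gaussian prior on $w$), a penalty term $\frac{λ}{2} ∥ w ∥^{2}$ is added to $E (w)$, and the same steps give: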

w_{\mathrm{MAP}} = (λ I + Φ^{T} Φ)^{- 1} Φ^{T} t

Here $λ \geq 0$ is the regularization parameter and $I$ is the identity matrix; setting $λ = 0$ recovers the unregularized solution $w^{*}$.
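
A minimal sketch of the regularized solution, under the same caveats as any illustration here: the function name `fit_map`, the cubic polynomial basis, the noisy sine data, and the value of `lam` are all assumptions, not part of the notes.

```python
import numpy as np

def fit_map(Phi, t, lam):
    """Solve (lam * I + Phi^T Phi) w = Phi^T t for w."""
    M = Phi.shape[1]  # number of basis functions
    return np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Made-up noisy data with a fixed seed so the run is reproducible.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
Phi = np.vander(x, 4, increasing=True)  # assumed cubic polynomial basis
t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

w_ls = fit_map(Phi, t, lam=0.0)   # lam = 0 reduces to ordinary least squares
w_map = fit_map(Phi, t, lam=1.0)  # lam > 0 shrinks the weights toward zero
```

With $λ > 0$ the matrix $λ I + Φ^{T} Φ$ is always invertible, so the regularized solve is well defined even when $Φ^{T} Φ$ alone is singular.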

Ashu's Online Notes