Remember:

  • ½ factor (makes the derivative cleaner)
  • Sum over all data points
  • Squared difference between:
    • model prediction
    • target/actual value

Mnemonic: “Half the Sum of Squared differences”. Note: sometimes written with a 1/N factor instead of ½, for averaging over the data points (mean squared error).
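The three ingredients above can be sketched as a small function (a minimal sketch; the names `predictions` and `targets` are illustrative, not from the original):

```python
import numpy as np

def half_sse(predictions, targets):
    """Half the sum of squared differences between predictions and targets."""
    diffs = predictions - targets      # prediction − target, per data point
    return 0.5 * np.sum(diffs ** 2)    # ½ · Σ (difference)²

# e.g. predictions [1.0, 2.0] vs targets [0.0, 2.0] → ½ · (1² + 0²) = 0.5
print(half_sse(np.array([1.0, 2.0]), np.array([0.0, 2.0])))
```

Dividing by the number of points instead of halving gives the mean squared error variant mentioned above.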

Generalized Squared Error

  • w is the parameter vector (weights) that the model learns; it is optimized during training.
  • φ(x) is the basis function vector, which transforms the raw input x into a higher-dimensional feature space.
  • The dot product wᵀφ(x) gives the model’s prediction.
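The pieces above fit together as follows (a sketch, assuming a polynomial basis φ(x) = [1, x, x²] as one common choice; the function names are illustrative):

```python
import numpy as np

def poly_basis(x, degree=2):
    """Basis function vector φ(x) = [1, x, x², …] for a single raw input x."""
    return np.array([x ** d for d in range(degree + 1)])

def generalized_squared_error(w, xs, ts):
    """½ · Σ (wᵀφ(x_n) − t_n)² over all data points."""
    preds = np.array([w @ poly_basis(x) for x in xs])  # wᵀφ(x): model prediction
    return 0.5 * np.sum((preds - ts) ** 2)

# With w = [0, 1, 0] the model predicts y(x) = x, so targets equal to the
# inputs give zero error.
w = np.array([0.0, 1.0, 0.0])
xs = np.array([0.0, 1.0, 2.0])
ts = xs.copy()
print(generalized_squared_error(w, xs, ts))
```

Swapping `poly_basis` for any other feature map changes the model while the loss computation stays the same.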