Remember:

  • ½ factor (makes the derivative cleaner)
  • Sum over all data points
  • Squared difference between:
    • model prediction
    • target/actual value

Mnemonic: “Half the Sum of Squared differences”. Note: sometimes written with a 1/N factor instead of ½, for averaging over the data points (mean squared error).
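The three ingredients above can be sketched as a small function (a minimal sketch; the names `predictions` and `targets` are illustrative, not from the original):

```python
import numpy as np

def half_sse(predictions, targets):
    """Half the sum of squared differences between predictions and targets."""
    diffs = predictions - targets      # prediction − target, per data point
    return 0.5 * np.sum(diffs ** 2)    # ½ · Σ (difference)²

# e.g. predictions [1.0, 2.0] vs targets [0.0, 2.0] → ½ · (1² + 0²) = 0.5
print(half_sse(np.array([1.0, 2.0]), np.array([0.0, 2.0])))
```

Dividing by the number of points instead of halving gives the mean squared error variant mentioned above.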

Generalized Squared Error

  • w is the parameter vector (weights) that the model learns; it is optimized during training.
  • φ(x) is the basis function vector, which transforms the raw input x into a higher-dimensional feature space.
  • The dot product wᵀφ(x) gives the model’s prediction.
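The pieces above fit together as follows (a sketch, assuming a polynomial basis φ(x) = [1, x, x²] as one common choice; the function names are illustrative):

```python
import numpy as np

def poly_basis(x, degree=2):
    """Basis function vector φ(x) = [1, x, x², …] for a single raw input x."""
    return np.array([x ** d for d in range(degree + 1)])

def generalized_squared_error(w, xs, ts):
    """½ · Σ (wᵀφ(x_n) − t_n)² over all data points."""
    preds = np.array([w @ poly_basis(x) for x in xs])  # wᵀφ(x): model prediction
    return 0.5 * np.sum((preds - ts) ** 2)

# With w = [0, 1, 0] the model predicts y(x) = x, so targets equal to the
# inputs give zero error.
w = np.array([0.0, 1.0, 0.0])
xs = np.array([0.0, 1.0, 2.0])
ts = xs.copy()
print(generalized_squared_error(w, xs, ts))
```

Swapping `poly_basis` for any other feature map changes the model while the loss computation stays the same.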