Bayesian Predictive Distribution

The Bayesian approach[1] (left plot) shows wider uncertainty bands because it accounts for parameter uncertainty on top of data noise. The uncertainty also varies with the input: it is larger where there is less data.

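  • Data-dependent mean: the model output evaluated at the posterior mean of the parameters, which is itself computed from the training data
  • Input-dependent variance: the noise variance plus a parameter-uncertainty term that depends on the input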
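
For reference, in a linear basis-function model with Gaussian noise, the mean and variance above have a closed form. The notation below ($\phi$, $\Phi$, $m_N$, $S_N$, $\alpha$, $\beta$) follows the common textbook convention and is an assumption here, as is the zero-mean isotropic Gaussian prior on the weights:

```latex
% Assumed model: t = w^T \phi(x) + noise, with noise precision \beta
% and prior p(w) = N(w | 0, \alpha^{-1} I).
\begin{align}
p(t \mid x, \mathbf{t}, \alpha, \beta)
  &= \mathcal{N}\!\left(t \mid m_N^\top \phi(x),\; \sigma_N^2(x)\right) \\
\sigma_N^2(x)
  &= \frac{1}{\beta} + \phi(x)^\top S_N\, \phi(x) \\
m_N &= \beta\, S_N \Phi^\top \mathbf{t},
  \qquad S_N^{-1} = \alpha I + \beta\, \Phi^\top \Phi
\end{align}
```

Under these assumptions, the first term of $\sigma_N^2(x)$ is the data noise, and the second term reflects parameter uncertainty; the second term is what makes the variance depend on $x$.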

Predictive Distribution (Maximum Likelihood)

The maximum-likelihood predictive distribution uses a single “best” set of parameters estimated from the data.

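  • Fixed mean: the model output evaluated at the maximum-likelihood parameters
  • Fixed variance: the maximum-likelihood noise variance (constant for all inputs)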
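
As a sketch of what this looks like for a linear basis-function model with Gaussian noise, the maximum-likelihood predictive distribution can be written as follows. The symbols ($\phi$, $\Phi$, $w_{\mathrm{ML}}$, $\beta_{\mathrm{ML}}$) are assumed textbook notation, not taken from the text above:

```latex
% Assumed model: t = w^T \phi(x) + noise, with noise precision \beta.
\begin{align}
p(t \mid x, w_{\mathrm{ML}}, \beta_{\mathrm{ML}})
  &= \mathcal{N}\!\left(t \mid w_{\mathrm{ML}}^\top \phi(x),\; \beta_{\mathrm{ML}}^{-1}\right) \\
w_{\mathrm{ML}} &= (\Phi^\top \Phi)^{-1} \Phi^\top \mathbf{t} \\
\frac{1}{\beta_{\mathrm{ML}}}
  &= \frac{1}{N} \sum_{n=1}^{N} \left(t_n - w_{\mathrm{ML}}^\top \phi(x_n)\right)^2
\end{align}
```

The variance $\beta_{\mathrm{ML}}^{-1}$ contains no $x$, which is why the uncertainty band has the same width everywhere.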

Key Difference Summary

Predictive Distribution (Maximum Likelihood):

  • Uses fixed parameters, the maximum-likelihood weights and noise variance (point estimates)
  • Uncertainty comes only from data noise (aleatoric uncertainty)
  • Constant variance across all inputs

Bayesian Predictive Distribution:

  • Integrates over all possible parameters weighted by their posterior probability
  • Uncertainty comes from both data noise AND parameter uncertainty (aleatoric + epistemic)
  • Input-dependent variance - uncertainty varies with location
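
The contrast can be checked numerically. The following is a minimal NumPy sketch, not a definitive implementation. It assumes a linear model with polynomial basis functions, a zero-mean isotropic Gaussian prior with precision `alpha`, and a known noise precision `beta`. The function names, the synthetic sine data, and the hyperparameter values are all illustrative choices.

```python
import numpy as np


def design_matrix(x, degree=3):
    """Polynomial basis (assumed for illustration): columns 1, x, ..., x^degree."""
    return np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)


def ml_predictive(x_train, t_train, x_new, degree=3):
    """Plug-in predictive: point-estimate weights, one noise variance for all inputs."""
    Phi = design_matrix(x_train, degree)
    w_ml, *_ = np.linalg.lstsq(Phi, t_train, rcond=None)  # least-squares weights
    noise_var = np.mean((t_train - Phi @ w_ml) ** 2)       # ML noise variance
    Phi_new = design_matrix(x_new, degree)
    mean = Phi_new @ w_ml
    var = np.full(len(Phi_new), noise_var)                 # same value at every input
    return mean, var


def bayes_predictive(x_train, t_train, x_new, alpha, beta, degree=3):
    """Bayesian predictive under a N(0, I/alpha) prior and known noise precision beta."""
    Phi = design_matrix(x_train, degree)
    S_N = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
    m_N = beta * S_N @ Phi.T @ t_train                     # posterior mean of weights
    Phi_new = design_matrix(x_new, degree)
    mean = Phi_new @ m_N
    # Noise variance plus the parameter-uncertainty term phi(x)^T S_N phi(x).
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_new, S_N, Phi_new)
    return mean, var


# Synthetic data (assumed): all training inputs lie in [0, 0.5].
rng = np.random.default_rng(0)
noise_std = 0.2
beta = 1.0 / noise_std**2
x_train = rng.uniform(0.0, 0.5, size=20)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, noise_std, size=20)

# One query inside the training range, one well outside it.
x_new = np.array([0.25, 0.9])

ml_mean, ml_var = ml_predictive(x_train, t_train, x_new)
bayes_mean, bayes_var = bayes_predictive(x_train, t_train, x_new, alpha=2.0, beta=beta)

print("ML variance:      ", ml_var)
print("Bayesian variance:", bayes_var)
```

In this setup the maximum-likelihood variance is identical at both query points, while the Bayesian variance never falls below the noise floor `1 / beta` and is larger at the query point far from the training data.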

Footnotes

  1. Help with the notation: the predictive distribution is the probability distribution of a new, unseen output, conditioned on the new input and on all the evidence from the training data (the training inputs and targets).