Predictive Coding — Fundamentals

Millennial Talks
4 min read · Nov 29, 2020

As a student of cognitive science, I’ve come to realise that the concept of predictive coding is about as important for neuroscience as the theory of evolution is for biology, and that Bayes’ law is about as important for cognitive science as the Schrödinger equation is for physics. It wouldn’t be an exaggeration to call predictive coding the fundamental principle of cognition: it implies that all our sensing, feeling, thinking, and doing is ultimately a matter of making predictions.

So what is predictive coding?

Throughout a human’s life, the brain acts as a model that is constantly adapting itself to the world by gathering statistics and making predictions accordingly. Just as the heart’s main function is to pump blood through the body, the brain’s main function is to make predictions about the body. For example, your brain predicts incoming sensory data: what you’re about to perceive from within (interoception) as well as from without (exteroception).

Until the last decade, perception was mostly viewed as a feed-forward process. For instance, you hear something, and the sensory data travels all the way from your ears into your brain, where it is fed forward through multiple levels of auditory processing, ultimately leading to a motor reaction. However, recent research has put forward the idea that the human brain is Bayesian in nature, which suggests that sensory data isn’t processed in that way. Instead, the brain uses predictive coding, also referred to as predictive processing, to predict what your ears will hear before the actual data arrives through the auditory channels.

Your brain runs an internal model of the outer world’s causal order, which continuously churns out predictions about what you expect to perceive. These predictions are then matched against what you actually experience, and a prediction error results from the divergence between predicted and actual sensory data. The better the prediction, the better the fit, and the less prediction error propagates up the hierarchy.
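As a toy illustration (the signals and numbers below are invented for the example, not real neural data), a prediction error is simply the gap between what the model expected and what actually arrived:

```python
import numpy as np

# What the internal model expects to sense vs. what the senses actually deliver
# (made-up values purely for illustration).
predicted_signal = np.array([0.9, 0.2, 0.4])
actual_signal = np.array([1.0, 0.1, 0.4])

prediction_error = actual_signal - predicted_signal
print("prediction error:", prediction_error)                  # small values = good fit
print("error magnitude:", np.linalg.norm(prediction_error))   # what propagates upward
```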

This internal model is referred to as a generative model due to its ability to generate predictions. It is structured as a bidirectional hierarchical cascade:

  • The model is a cascade because it spans multiple processing stages across multiple cortical regions of the brain.
  • The model is hierarchical because it involves higher and lower layers of processing: lower levels process basic data (e.g. sensory stimuli, affective signals, motor commands), higher levels process categorisations (e.g. object recognition, emotion classification, action selection), and the highest levels process mental states (e.g. mental imagery, emotional experience, conscious goals, planning, reasoning).
  • The model is bidirectional because signals propagate continuously in both directions: predictions travel downward to lower-level pyramidal neurons, while prediction errors travel upward to higher-level pyramidal neurons.

Each processing layer predicts the activity of the layer beneath it and receives an error signal from that layer in return, a dynamic process that loops all the way down to the incoming sensory information. A model is effective when the predictions it produces at each layer of the hierarchical cascade are accurate, generating only minimal prediction errors.
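To make the cascade concrete, here is a minimal sketch (the layer sizes, random weights, and learning rate are my own toy choices, not a claim about real cortical circuitry): each level sends a prediction down to the level beneath it, receives the resulting prediction error back, and nudges its own state to explain that error away.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hierarchy: a sensory level at the bottom, more abstract levels above.
sizes = [8, 4, 2]
# W[i] maps level i+1's state into a prediction of level i's activity (top-down).
W = [rng.normal(scale=0.3, size=(sizes[i], sizes[i + 1]))
     for i in range(len(sizes) - 1)]

sensory_input = rng.normal(size=sizes[0])          # what the senses deliver (clamped)
states = [sensory_input] + [rng.normal(size=s) for s in sizes[1:]]
lr = 0.1

for step in range(100):
    # Downward sweep: each level predicts the activity of the level beneath it.
    predictions = [W[i] @ states[i + 1] for i in range(len(sizes) - 1)]
    # Upward sweep: each level passes its prediction error to the level above.
    errors = [states[i] - predictions[i] for i in range(len(sizes) - 1)]
    # Higher levels adjust their states to explain away the errors beneath them...
    for i in range(len(sizes) - 1):
        states[i + 1] += lr * W[i].T @ errors[i]
    # ...while intermediate levels also settle toward what is predicted of them.
    for i in range(1, len(sizes) - 1):
        states[i] -= lr * errors[i]

print("remaining sensory prediction error:",
      np.linalg.norm(sensory_input - W[0] @ states[1]))
```

After enough iterations the higher levels have settled on states that “explain” the sensory input, and the error left at the bottom is small.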

If your predictions do not match the actual data, you get a large prediction error, which updates your internal model so as to minimise further inconsistencies between expectation and evidence, between model and fact. Your brain hates unfulfilled expectations, so it organises its world model, and drives action, in such a way that more of its predictions come true.
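The sketch above adjusted the model’s states; persistent errors also update the model itself. As a toy sketch of that learning step (a simple delta rule with invented numbers, not the brain’s actual algorithm):

```python
# One hypothetical weight in the generative model: how strongly a hidden cause
# is expected to show up in the senses. Repeated prediction errors slowly pull
# this weight toward whatever value makes future predictions match the data.
prediction_weight = 0.2
cause = 1.0              # a hidden cause, held fixed for the illustration
learning_rate = 0.3

for observation in [0.9, 1.1, 1.0, 0.95, 1.05]:           # actual sensory samples
    predicted = prediction_weight * cause
    error = observation - predicted                        # expectation vs. evidence
    prediction_weight += learning_rate * error * cause     # shrink future errors
    print(f"predicted {predicted:.2f}, observed {observation:.2f}, error {error:+.2f}")

print("updated weight:", round(prediction_weight, 2))
```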

All of this can be summarised in a schematic diagram. The black circles in that diagram (at the origins of the prediction errors) represent something I have not yet clarified, namely expected precision. Precision determines how much weight a prediction error carries. If your brain expects a certain prediction error to be imprecise or uninformative, it reduces that weight, and therefore the degree to which the error can modify your internal model.

Expected precision is formally equivalent to inverse variance, and functionally equivalent to attention. When you pay a lot of attention to an object, you become confident that the information you get from it is relatively accurate. Say you’ve been watching a Ferrari carefully in good lighting conditions, so you’re pretty sure it’s red. The more attentive you are, the more weight you place on possible prediction errors. By comparison, if you’re not very focused, your brain turns down the “error volume”, i.e. the effect that a prediction error can have on your model.
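A minimal sketch of precision weighting (all numbers invented): treating precision as inverse variance, the precision of the incoming evidence relative to the prior determines how far a prediction error is allowed to move the estimate; high attention behaves like high sensory precision, and low attention like a turned-down error volume.

```python
def precision_weighted_update(prior_estimate, prior_variance,
                              observation, obs_variance):
    """Move a prior estimate toward an observation, weighted by precision."""
    prior_precision = 1.0 / prior_variance
    obs_precision = 1.0 / obs_variance               # expected precision of the evidence
    error = observation - prior_estimate             # prediction error
    gain = obs_precision / (prior_precision + obs_precision)   # weight on the error
    return prior_estimate + gain * error

# Watching the Ferrari carefully in good light: precise evidence, the error counts a lot.
print(precision_weighted_update(prior_estimate=0.5, prior_variance=1.0,
                                observation=1.0, obs_variance=0.1))    # ~0.95
# Barely paying attention: noisy evidence, the "error volume" is turned down.
print(precision_weighted_update(prior_estimate=0.5, prior_variance=1.0,
                                observation=1.0, obs_variance=10.0))   # ~0.55
```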

In the upcoming post, we’ll dig into what exactly it means when researchers say that the human brain is Bayesian in nature.

