July 6, 2016
Cognitive Neuroimaging Unit, Neurospin, CEA, University Paris-Saclay, France
Bayesian concepts are appealing to many researchers in fundamental and applied research, including neuroscience. Bayesian tools, part of probability theory, are useful whenever quantitative analysis is needed, such as in statistics, data mining, or forecasting. In neuroscience, however, Bayesian concepts have much more far-reaching implications: they are essential to the way we think about the brain.
BAYES’ RULE BASICS
The mathematical foundation of Bayesian concepts is the so-called Bayes’ rule, named after one of its contributors, the 18th-century British Reverend Thomas Bayes. Let's consider a practical example of how Bayes' rule works. A medical doctor faced with the following data D, a patient with a cough, contemplates three hypothetical diseases: lung cancer (H1), a cold (H2) or gastroenteritis (H3). According to Bayes’ rule, the relative merit of each hypothesis can be broken down into two parts. First, patients usually cough when afflicted by lung cancer or a cold but rarely in the case of gastroenteritis. Therefore, the likelihood of the cough is high under H1 and H2 and low under H3. Second, a cold and gastroenteritis are much more prevalent than lung cancer in the general population. The a priori probability of H2 and H3 is therefore much higher than that of H1. Given that only H2 scores high on both prior probability and current evidence, the most likely disease given the symptom is a cold.
Stated more generally, Bayes’ rule says that our degree of belief in a hypothesis H given some current data D depends on the a priori probability of this hypothesis (what we know about it, independent of the current data), and the likelihood of the current data given this hypothesis. Formally, degrees of belief and likelihoods correspond to probabilities [1] and Bayes’ rule reads:
p(H|D) = p(D|H)*p(H)/p(D).
Bayes’ rule distinguishes between our belief a priori in the hypothesis p(H) and our belief in this hypothesis a posteriori, p(H|D), once particular data are considered to evaluate it. The notation p(D|H) is a shorthand for the probability of D given that we know H (the so-called likelihood of the data) and p(H|D) for the probability of H given that we know D.
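The doctor's reasoning can be made concrete with a short sketch. The numbers below are hypothetical, chosen only to mirror the qualitative description (cough is common with lung cancer and a cold, rare with gastroenteritis; lung cancer is much less prevalent than the other two). The sketch also assumes, for simplicity, that the three diseases are the only possible causes and are mutually exclusive, so that p(D) is the sum of p(D|H)*p(H) over the three hypotheses.

```python
# Hypothetical numbers for illustration only; they are not taken from the article.
# p(H): prior probability of each disease among the three candidates.
priors = {"lung cancer": 0.01, "cold": 0.60, "gastroenteritis": 0.39}
# p(D|H): probability of observing a cough under each disease.
likelihoods = {"lung cancer": 0.90, "cold": 0.80, "gastroenteritis": 0.05}


def posterior(priors, likelihoods):
    """Apply Bayes' rule, p(H|D) = p(D|H) * p(H) / p(D), to every hypothesis."""
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    evidence = sum(joint.values())  # p(D), assuming the hypotheses are exhaustive
    return {h: joint[h] / evidence for h in joint}


post = posterior(priors, likelihoods)
most_likely = max(post, key=post.get)

for hypothesis, probability in post.items():
    print(f"p({hypothesis} | cough) = {probability:.3f}")
print("Most likely disease:", most_likely)
```

With these assumed values, lung cancer has a high likelihood but a low prior, gastroenteritis has a high prior but a low likelihood, and only the cold scores high on both, so it dominates the posterior.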
Several aspects of Bayes’ rule are noteworthy. First, it is extremely general: H and D may be any sort of variables as long as they can be assigned a probability. Second, Bayes’ rule is quantitative: the posterior probability on the left-hand side accepts only one value, which depends on the terms on the right-hand side. This means that Bayes’ rule offers a unique way to combine uncertain quantities such as current evidence and prior knowledge in order to estimate the likelihood of a conclusion. In that sense, Bayes’ rule is normative: any other estimate is an over- or under-estimation of the likelihood of the conclusion. This normative nature of Bayes’ rule can be seen as an extension of classical logic. With classical logic, one can derive the validity of a conclusion, which is either true or false, from premises that are known for sure. With Bayes’ rule, one can derive the likelihood of a conclusion, which varies on a continuum, from premises that suffer from uncertainty. Another key aspect of Bayes’ rule is its symmetry: p(H|D) and p(D|H) appear on opposite sides of the equation, which allows going from one to the other. The likelihood of current data given a particular hypothesis, p(D|H), corresponds to solving a direct or “forward” problem: estimating what should be observed given a known cause. Bayes’ rule allows reversing the logic to infer what might be the unknown cause of particular observations, p(H|D).
HOW THE BRAIN IS BAYESIAN
With these mathematical foundations in mind, the brain can be said to be Bayesian in at least three ways. A first key idea is that the brain computes and represents quantities that are probabilistic [2]. In the perceptual domain, this means that every feature of a visual scene is represented by probabilities. For instance, the orientation of a line is not encoded as a single tilt value, but as a distribution of tilt values across several neurons in the visual cortex. Indeed, each of these neurons is tuned for a particular orientation and it responds more intensely when the input data conform to its preferred orientation. Such a neuron therefore acts as a “likelihood detector”: its activity signals the probability of the line having its preferred orientation. Because different neurons are tuned to different orientations, their activity collectively encodes the likelihood of the tilt [3]. This probabilistic view may contrast with the apparent “oneness” of perception. When viewing a scene, we access only one percept at a time, and not distinct hypothetical percepts associated with probabilities. However, recent theories show that this all-or-none processing is the exception rather than the rule in the brain. This “oneness” results from conscious processes that select and amplify one possible interpretation among many [4]. By contrast, most brain processes operate without consciousness and rely on distributions of values and probabilistic computations.
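The idea that a population of tuned neurons collectively encodes a likelihood over tilt can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not in the text: bell-shaped tuning curves, independent Poisson spike counts, and made-up parameter values (12 neurons, a peak of about 20 spikes). It is not the specific model of the cited studies.

```python
import math

# Hypothetical population: 12 neurons whose preferred orientations tile 0-180 degrees.
PREFERRED = [i * 15.0 for i in range(12)]


def tuning(theta, preferred, gain=20.0, kappa=2.0, baseline=1.0):
    """Mean spike count of one neuron for a line tilted at `theta` degrees.

    Bell-shaped curve with a 180-degree period (orientation, not direction).
    """
    delta = math.radians(2.0 * (theta - preferred))
    return baseline + gain * math.exp(kappa * (math.cos(delta) - 1.0))


def log_likelihood(counts, theta):
    """log p(counts | theta) for independent Poisson neurons.

    Terms that do not depend on theta are dropped.
    """
    total = 0.0
    for count, preferred in zip(counts, PREFERRED):
        rate = tuning(theta, preferred)
        total += count * math.log(rate) - rate
    return total


def normalized_likelihood(counts, step=1.0):
    """Evaluate the likelihood on a grid of tilts and normalize it to sum to 1."""
    grid = [i * step for i in range(int(180 / step))]
    logs = [log_likelihood(counts, theta) for theta in grid]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return grid, [w / total for w in weights]


# Noise-free example: each neuron fires its mean count (rounded) for a 40-degree line.
true_tilt = 40.0
counts = [round(tuning(true_tilt, p)) for p in PREFERRED]
grid, curve = normalized_likelihood(counts)
decoded_tilt = grid[curve.index(max(curve))]
print("Spike counts:", counts)
print("Most likely tilt:", decoded_tilt)
```

No single neuron reports the tilt here; the full curve over candidate tilts, including its width, is carried by the pattern of activity across the population.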
A second Bayesian view of the brain is that the internal knowledge and percepts represented by neurons are constructed following Bayes' rule. This internal knowledge therefore constitutes a posterior belief about the causes of the inputs received by the brain [5,6]. This inference is usually fraught with uncertainty as the brain must make sense of the world based on inputs that are limited and ambiguous. For instance, different three-dimensional shapes in the world may result in the same image once they are projected onto our eyes. There is therefore a real challenge for the brain to perceive the world despite the paucity and the ambiguity of its inputs. This is an old idea in psychology, identified by the 19th-century German scientist von Helmholtz. The Bayesian framework is made to handle inference from uncertain data, and it even offers a principled remedy: combining the uncertain evidence provided by sensory inputs with prior knowledge.
There is ample experimental evidence that perception relies on prior information to compensate for the poverty of the inputs received. Many biases and visual illusions reveal this automatic reliance on prior information. For instance, when observers are asked to evaluate the tilt of a line, they tend to perceive lines that are nearly vertical as purely vertical, and nearly horizontal as purely horizontal. These orientations are indeed much more frequent in our world. The perceived orientation of a line that weakly departs from these frequent orientations is therefore dominated by our prior expectations [7]. Studies in non-human animals showed that these priors are learned during development from experience. As a result, priors become part of our cortical networks in such a way that they shape their spontaneous activity [8]. When there is no stimulus to drive neuronal activity, the spontaneous activity is dominated by prior expectations. This is because in the absence of input data, the posterior probability in Bayes' rule boils down to the prior probability.
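The pull of perception toward frequent orientations, and the fact that the posterior reduces to the prior without input, can both be sketched with the simplest possible case. The sketch assumes that the prior and the sensory likelihood are both Gaussian over tilt (a simplification; real orientation priors are not Gaussian), and all numbers are hypothetical.

```python
def combine(prior_mean, prior_sd, observed, sensory_sd):
    """Posterior mean and standard deviation for a Gaussian prior and likelihood.

    Each source is weighted by its precision (1 / variance), as Bayes' rule
    prescribes in the Gaussian case.
    """
    prior_precision = 1.0 / prior_sd ** 2
    sensory_precision = 1.0 / sensory_sd ** 2
    posterior_precision = prior_precision + sensory_precision
    posterior_mean = (
        prior_precision * prior_mean + sensory_precision * observed
    ) / posterior_precision
    return posterior_mean, posterior_precision ** -0.5


# Hypothetical values: tilt is measured in degrees away from vertical.
# The prior favors vertical (mean 0); the senses report a tilt of 5 degrees.
noisy_mean, noisy_sd = combine(prior_mean=0.0, prior_sd=10.0, observed=5.0, sensory_sd=10.0)
sharp_mean, sharp_sd = combine(prior_mean=0.0, prior_sd=10.0, observed=5.0, sensory_sd=1.0)

# With essentially uninformative input, the posterior falls back on the prior.
blind_mean, blind_sd = combine(prior_mean=0.0, prior_sd=10.0, observed=5.0, sensory_sd=1e6)

print(f"Noisy input:  perceived tilt = {noisy_mean:.2f}")
print(f"Sharp input:  perceived tilt = {sharp_mean:.2f}")
print(f"No real input: perceived tilt = {blind_mean:.2f}, sd = {blind_sd:.2f}")
```

The less reliable the sensory evidence, the closer the estimate sits to the prior's preferred orientation; when the evidence carries no information, the estimate is the prior itself.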
Lastly, Bayes' rule allows for inferring the causes of current observations. By building on this knowledge of the causes, one can in turn predict future observations [5]. This predictive nature of Bayes' rule is the third pillar of the Bayesian view of the brain. Brain imaging and recordings of neurons show that the brain constantly uses previous observations to form expectations about upcoming events. Such expectations can build up rapidly even in very simple contexts. For instance, upon hearing the four tones “bip”, “bip”, “bip”, “bip” in a row, you may expect that the fifth sound will be another “bip”. Several brain regions increase their activity if the fifth sound is “bop” instead of “bip” [9–11]. This increased activity signals that there is an error: the current expectation appears violated. Interestingly, this error signal is larger when the expectation was stronger: an even larger response is recorded if the deviant sound occurs after ten repetitions of “bip” as compared to only four such repetitions. These error signals are actually quantitative: in this simple experiment, they match the expected frequency of sounds, as inferred with Bayes' rule from the sounds already presented. It is noteworthy that individuals with schizophrenia exhibit significantly lower error signals on electroencephalograms than do healthy individuals in this kind of paradigm, suggesting that statistical inference might be impaired in this pathology [12,13]. Other experiments used carefully designed sequences of stimuli to show that the brain is capable of learning more complex statistics and even abstract rules [14,15]. Remarkably, experiments in infants showed that this Bayesian machinery operates early in life. Young babies are already capable of quantitative predictions based only on a few observations [16,17].
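The “bip”/“bop” example can be turned into a minimal ideal-observer sketch. It rests on assumptions that are not in the text: the observer treats the sounds as independent draws with an unknown frequency, starts from a uniform prior on that frequency, and the size of the error signal is equated with the surprise, -log2 of the predicted probability of the sound that actually occurred. The function names are hypothetical, and this is not the model used in the cited studies.

```python
import math


def predict_bip(n_bip, n_bop, prior_bip=1.0, prior_bop=1.0):
    """Predicted probability that the next sound is "bip".

    Bayes' rule with a Beta(prior_bip, prior_bop) prior on the frequency of
    "bip" (uniform by default) gives this simple ratio of counts.
    """
    return (n_bip + prior_bip) / (n_bip + n_bop + prior_bip + prior_bop)


def surprise_of_bop(n_bip, n_bop=0):
    """Surprise, in bits, of hearing "bop" after the sounds counted so far."""
    return -math.log2(1.0 - predict_bip(n_bip, n_bop))


expect_after_4 = predict_bip(4, 0)
expect_after_10 = predict_bip(10, 0)
surprise_after_4 = surprise_of_bop(4)
surprise_after_10 = surprise_of_bop(10)

print(f"After 4 bips:  p(bip) = {expect_after_4:.3f}, surprise of bop = {surprise_after_4:.2f} bits")
print(f"After 10 bips: p(bip) = {expect_after_10:.3f}, surprise of bop = {surprise_after_10:.2f} bits")
```

Under these assumptions the expectation of another “bip” grows with each repetition, so a deviant “bop” is more surprising after ten repetitions than after four, in line with the larger error signal described above.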
A major strength of the Bayesian view of the brain is its unifying power. The few examples reported here show that many brain processes can be accounted for by Bayesian principles. It is true across species (in humans and other animals), spatial scales (from single neurons to neuronal networks to brain-scale circuits), cognitive domains (perception, learning, decision making) and stages of development (in neonates, infants and adults). It may even be true of evolution. This is because Bayes' rule is normative: if a particular process deviates from it, then other processes, closer to Bayes' rule, will do better. By selection, processes should gradually approach Bayes' rule, as we see in well-tuned systems such as the human visual cortex.
FUTURE CHALLENGES
This Bayesian view has proved quite successful in neuroscience, although it remains controversial [18,19]. Challenges also remain for the future. The most critical one is that Bayesian principles constrain what computations should be, but they leave their implementation entirely open. Indeed, there are often many different ways to carry out the same computation. Future work will aim at identifying the specific algorithms that the brain uses for Bayesian computations.
Another challenge is that Bayesian views have been applied so far mostly to perception because this is the domain in which neuroscience is the most advanced. However, future work will probe Bayesian computations in other domains, such as decision making [20–23]. It should also probe the extent to which Bayesian computations and their associated uncertainty levels are accessible to introspection. Recent studies showed that the “sense of confidence”, the degree of belief that we attach to our percepts, memories and decisions, is actually much more sophisticated in humans than previously envisaged [24,25].
Further readings:
https://en.wikipedia.org/wiki/Bayes'_theorem
http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004305
https://elifesciences.org/content/5/e11476
Supporting references:
1. Jaynes ET. Probability Theory: The Logic of Science. Cambridge University Press; 2003.
2. Knill DC, Pouget A. The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 2004;27: 712–719. doi:10.1016/j.tins.2004.10.007
3. Deneve S, Latham PE, Pouget A. Reading population codes: a neural implementation of ideal observers. Nat Neurosci. 1999;2: 740–745.
4. Dehaene S. Consciousness and the Brain: Deciphering How the Brain Codes Our Thoughts. New York: Viking; 2014.
5. Friston K. The free-energy principle: a unified brain theory? Nat Rev Neurosci. 2010;11: 127–138. doi:10.1038/nrn2787
6. Rao RP. An optimal estimation approach to visual perception and learning. Vision Res. 1999;39: 1963–1989.
7. Girshick AR, Landy MS, Simoncelli EP. Cardinal rules: visual orientation perception reflects knowledge of environmental statistics. Nat Neurosci. 2011;14: 926–932. doi:10.1038/nn.2831
8. Berkes P, Orbán G, Lengyel M, Fiser J. Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science. 2011;331: 83–87. doi:10.1126/science.1195870
9. Huettel SA. Decisions under Uncertainty: Probabilistic Context Influences Activation of Prefrontal and Parietal Cortices. J Neurosci. 2005;25: 3304–3311. doi:10.1523/JNEUROSCI.5070-04.2005
10. Karoui IE, King J-R, Sitt J, Meyniel F, Gaal SV, Hasboun D, et al. Event-Related Potential, Time-frequency, and Functional Connectivity Facets of Local and Global Auditory Novelty Processing: An Intracranial Study in Humans. Cereb Cortex. 2014; bhu143. doi:10.1093/cercor/bhu143
11. Squires KC, Wickens C, Squires NK, Donchin E. The effect of stimulus sequence on the waveform of the cortical event-related potential. Science. 1976;193: 1142–1146. doi:10.1126/science.959831
12. Fletcher PC, Frith CD. Perceiving is believing: a Bayesian approach to explaining the positive symptoms of schizophrenia. Nat Rev Neurosci. 2009;10: 48–58. doi:10.1038/nrn2536
13. Michie PT, Malmierca MS, Harms L, Todd J. The neurobiology of MMN and implications for schizophrenia. Biol Psychol. 2016;116: 90–97. doi:10.1016/j.biopsycho.2016.01.011
14. Wacongne C, Changeux J-P, Dehaene S. A Neuronal Model of Predictive Coding Accounting for the Mismatch Negativity. J Neurosci. 2012;32: 3665–3678. doi:10.1523/JNEUROSCI.5003-11.2012
15. Wang L, Uhrig L, Jarraya B, Dehaene S. Representation of Numerical and Sequential Patterns in Macaque and Human Brains. Curr Biol. 2015;25: 1966–1974. doi:10.1016/j.cub.2015.06.035
16. Frank MC, Tenenbaum JB. Three ideal observer models for rule learning in simple languages. Cognition. 2011;120: 360–371. doi:10.1016/j.cognition.2010.10.005
17. Téglás E, Vul E, Girotto V, Gonzalez M, Tenenbaum JB, Bonatti LL. Pure Reasoning in 12-Month-Old Infants as Probabilistic Inference. Science. 2011;332: 1054–1059. doi:10.1126/science.1196404
18. Bowers JS, Davis CJ. Bayesian just-so stories in psychology and neuroscience. Psychol Bull. 2012;138: 389–414. doi:10.1037/a0026450
19. Griffiths TL, Chater N, Norris D, Pouget A. How the Bayesians got their beliefs (and what those beliefs actually are): Comment on Bowers and Davis (2012). Psychol Bull. 2012;138: 415–422. doi:10.1037/a0026884
20. Beck JM, Ma WJ, Kiani R, Hanks T, Churchland AK, Roitman J, et al. Probabilistic Population Codes for Bayesian Decision Making. Neuron. 2008;60: 1142–1152. doi:10.1016/j.neuron.2008.09.021
21. Chater N, Tenenbaum JB, Yuille A. Probabilistic models of cognition: Conceptual foundations. Trends Cogn Sci. 2006;10: 287–291.
22. Pouget A, Beck JM, Ma WJ, Latham PE. Probabilistic brains: knowns and unknowns. Nat Neurosci. 2013;16: 1170–1178. doi:10.1038/nn.3495
23. Solway A, Botvinick MM. Goal-directed decision making as probabilistic inference: A computational framework and potential neural correlates. Psychol Rev. 2012;119: 120–154. doi:10.1037/a0026435
24. Meyniel F, Schlunegger D, Dehaene S. The Sense of Confidence during Probabilistic Learning: A Normative Account. PLoS Comput Biol. 2015;11: e1004305. doi:10.1371/journal.pcbi.1004305
25. Meyniel F, Sigman M, Mainen ZF. Confidence as Bayesian Probability: From Neural Origins to Behavior. Neuron. 2015;88: 78–92. doi:10.1016/j.neuron.2015.09.039