
Bayesian Inference

Thomas Nichols

With thanks to Lee Harrison
•  Bayesian segmentation and normalisation
•  Spatial priors on activation extent
•  Posterior probability maps (PPMs)
•  Dynamic Causal Modelling
Attention to Motion

Paradigm
- fixation only
- observe static dots: + photic → V1
- observe moving dots: + motion → V5
- task on moving dots: + attention → V5 + parietal cortex

Results
[Figure: Attention – No attention contrast; labelled regions SPC, V3A, V5+]
Büchel & Friston 1997, Cereb. Cortex
Büchel et al. 1998, Brain
Attention to Motion

Paradigm
- fixation only
- observe static dots
- observe moving dots
- task on moving dots

Dynamic Causal Models
Model 1 (forward): attentional modulation of V1→V5
Model 2 (backward): attentional modulation of SPC→V5
[Figure: both models comprise the regions V1, V5 and SPC, with the inputs Photic, Motion and Attention]

Bayesian model selection: Which model is optimal?


Responses to Uncertainty
Long term memory

Short term memory


Responses to Uncertainty

Paradigm: stimuli are a sequence of randomly sampled discrete events
Model: simple computational model of an observer's response to uncertainty, based on the number of past events (extent of memory)
Question: which regions are best explained by the short-term / long-term memory model?
[Figure: events 1 to 4 plotted against trials 1, 2, ..., 40]
Overview
•  Introductory remarks

•  Some probability densities/distributions

•  Probabilistic (generative) models

•  Bayesian inference

•  A simple example – Bayesian linear regression

•  SPM applications
–  Segmentation
–  Dynamic causal modeling
–  Spatial models of fMRI time series
Probability distributions and densities

k=2
Generative models
[Figure: generation and estimation; axes: space, time]
Bayesian statistics

posterior ∝ likelihood ∙ prior
(likelihood: new data; prior: prior knowledge)

Bayes theorem allows one to formally incorporate prior knowledge into computing statistical probabilities.

The posterior probability of the parameters given the data is an optimal combination of prior knowledge and new data, weighted by their relative precision.
Bayes’ rule
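Given data y and parameters θ, their joint probability can be written in two ways:

$$p(y, \theta) = p(y \mid \theta)\, p(\theta) = p(\theta \mid y)\, p(y)$$

Eliminating p(y,θ) gives Bayes' rule:

$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}$$

Posterior: p(θ|y)   Likelihood: p(y|θ)   Prior: p(θ)   Evidence: p(y)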
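A minimal sketch of Bayes' rule evaluated on a discrete grid of parameter values. The data (7 successes in 10 Bernoulli trials), the flat prior and the function name `grid_posterior` are illustrative assumptions, not taken from the slides.

```python
def grid_posterior(thetas, prior, likelihood):
    """Apply Bayes' rule on a discrete grid of parameter values."""
    # Unnormalised posterior: likelihood times prior at each grid point
    joint = [likelihood(t) * p for t, p in zip(thetas, prior)]
    # Evidence p(y): the normalising constant, here a sum over the grid
    evidence = sum(joint)
    posterior = [j / evidence for j in joint]
    return posterior, evidence


# Hypothetical data: 7 successes in 10 Bernoulli trials
successes, trials = 7, 10
thetas = [i / 10 for i in range(11)]       # candidate success rates
prior = [1 / len(thetas)] * len(thetas)    # flat prior over the grid


def likelihood(theta):
    # Binomial likelihood up to a constant factor, which cancels in the posterior
    return theta ** successes * (1 - theta) ** (trials - successes)


posterior, evidence = grid_posterior(thetas, prior, likelihood)
best = thetas[posterior.index(max(posterior))]
print(best)
```

With a flat prior the posterior is proportional to the likelihood, so the most probable grid point is the observed success rate.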
Principles of Bayesian inference

•  Formulation of a generative model: likelihood p(y|θ) and prior distribution p(θ)
•  Observation of data y
•  Update of beliefs based upon observations, given a prior state of knowledge
Univariate Gaussian
Normal densities

Posterior mean = precision-weighted combination of prior mean and data mean
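The precision weighting can be sketched for the univariate Gaussian case as follows. The function and argument names are hypothetical, and `data_precision` is assumed to be the precision of the data mean (n/σ² for n observations with noise variance σ²).

```python
def gaussian_posterior(prior_mean, prior_precision, data_mean, data_precision):
    """Combine a Gaussian prior with a Gaussian likelihood for a single parameter."""
    # Precisions (inverse variances) add
    posterior_precision = prior_precision + data_precision
    # The posterior mean weights each mean by its relative precision
    posterior_mean = (
        prior_precision * prior_mean + data_precision * data_mean
    ) / posterior_precision
    return posterior_mean, posterior_precision


# Prior N(0, 1); data mean 2 with precision 3: the posterior sits closer to the data
mean, precision = gaussian_posterior(0.0, 1.0, 2.0, 3.0)
print(mean, precision)
```

The more precise source dominates: here the data are three times as precise as the prior, so the posterior mean lies three quarters of the way from the prior mean to the data mean.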
Bayesian GLM: univariate case
Normal densities
Bayesian GLM: multivariate case
Normal densities

One step if C_e and C_p are known. Otherwise iterative estimation.
[Figure: densities plotted over β1 and β2]
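For reference, the one-step solution can be written out as below. The notation is an assumption, since the slide's own equations did not survive extraction: a GLM y = Xβ + e with error covariance C_e and a Gaussian prior on β with mean η_p and covariance C_p.

```latex
y = X\beta + e, \qquad e \sim \mathcal{N}(0, C_e), \qquad \beta \sim \mathcal{N}(\eta_p, C_p)

C_{\beta \mid y} = \left( X^{T} C_e^{-1} X + C_p^{-1} \right)^{-1}

\eta_{\beta \mid y} = C_{\beta \mid y} \left( X^{T} C_e^{-1} y + C_p^{-1} \eta_p \right)
```

When C_e and C_p are unknown, their estimates and the posterior moments depend on each other, which is why estimation becomes iterative.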
Approximate inference: optimization
[Figure: true posterior and approximate posterior (mean-field approximation), iteratively improved; objective function (free energy) plotted against value of parameter]
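The role of the free energy as objective function can be summarised by the standard variational decomposition below. The symbols q, F and the sign convention (F as a lower bound on the log evidence) are assumptions; the slide does not state them.

```latex
\log p(y) = F(q) + \mathrm{KL}\left[ q(\theta) \,\|\, p(\theta \mid y) \right]

F(q) = \int q(\theta) \log \frac{p(y, \theta)}{q(\theta)} \, d\theta

q(\theta) = \prod_i q(\theta_i) \quad \text{(mean-field approximation)}
```

Because the KL divergence is non-negative, increasing F with respect to q moves the approximate posterior towards the true posterior and tightens the bound on log p(y).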
Simple example – linear regression
[Figure panels: Data and model fit; Bases (explanatory variables); Sum of squared errors (ordinary least squares)]

Over-fitting: model fits noise
Inadequate cost function: blind to overly complex models
Solution: include uncertainty in model parameters
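The over-fitting point can be illustrated with a small sketch: fitting polynomial bases of increasing order by ordinary least squares never increases the sum of squared errors, even when the extra bases only fit noise. The data, the noise values and the helper names `solve` and `ols_sse` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols_sse(xs, ys, order):
    """Fit a polynomial of the given order by OLS; return the sum of squared errors."""
    p = order + 1
    X = [[x ** k for k in range(p)] for x in xs]  # polynomial bases
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(p)]
    beta = solve(XtX, Xty)  # normal equations
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))


# Hypothetical data: a straight line plus fixed "noise" values
xs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1]
ys = [2.0 * x + e for x, e in zip(xs, noise)]

sse = {order: ols_sse(xs, ys, order) for order in (1, 3, 5)}
print(sse)
```

With six data points, the order-5 polynomial interpolates the data and drives the error to (numerically) zero, although the data were generated from a straight line. The sum of squared errors alone therefore cannot penalise the overly complex model.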


Bayesian linear regression: priors and likelihood
Model:

Prior:

[Figure: sample curves from the prior (before observing any data) and the mean curve]
Bayesian linear regression:
priors and likelihood
Model:

Prior:

Likelihood:
Bayesian linear regression:
posterior
Model:

Prior:

Likelihood:

Bayes Rule:

Posterior:
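A minimal sketch of the posterior for the simplest case, a single regressor without intercept. The notation is assumed rather than taken from the slides: model y_i = x_i β + e_i with noise precision λ (`lam`), and a zero-mean Gaussian prior on β with precision α (`alpha`).

```python
def posterior_slope(xs, ys, alpha, lam):
    """Posterior mean and precision of beta for y = x * beta + noise.

    alpha: prior precision of beta (prior mean assumed to be zero)
    lam:   noise precision
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    precision = alpha + lam * sxx   # prior precision plus data precision
    mean = lam * sxy / precision    # shrunk towards the prior mean of zero
    return mean, precision


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

ols_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
mean, precision = posterior_slope(xs, ys, alpha=14.0, lam=1.0)
print(ols_slope, mean, precision)
```

Here the prior precision equals the data precision (both 14), so the posterior mean lies halfway between the prior mean (0) and the OLS estimate (2). As alpha approaches zero the posterior mean approaches the OLS estimate.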
Posterior Probability Maps (PPMs)
Posterior distribution: probability of the effect given the data
•  mean: size of effect
•  precision: variability

Posterior probability map: images of the probability (confidence) that an activation exceeds some specified threshold s_th, given the data y

Two thresholds:
•  activation threshold s_th: percentage of whole brain mean signal (physiologically relevant size of effect)
•  probability p_th that voxels must exceed to be displayed (e.g. 95%)
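The two thresholds can be sketched as follows, assuming a Gaussian posterior at each voxel. The function names and example values are hypothetical and do not correspond to SPM's implementation.

```python
import math


def exceedance_probability(post_mean, post_sd, s_th):
    """P(effect > s_th | y) for a Gaussian posterior N(post_mean, post_sd**2)."""
    z = (s_th - post_mean) / post_sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def ppm(post_means, post_sds, s_th, p_th=0.95):
    """Keep the probability at voxels where it exceeds p_th; mask the rest with None."""
    probs = [exceedance_probability(m, s, s_th) for m, s in zip(post_means, post_sds)]
    return [p if p > p_th else None for p in probs]


# Two hypothetical voxels with the same uncertainty but different effect sizes
result = ppm(post_means=[3.0, 1.0], post_sds=[1.0, 1.0], s_th=1.0)
print(result)
```

The first voxel's posterior mean lies two standard deviations above the activation threshold, so it is displayed; the second sits exactly on the threshold (probability 0.5) and is masked.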
Bayesian linear regression:
model selection
Bayes Rule:

normalizing constant

Model evidence:
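In standard notation (assumed here, as the slide's equations were lost in extraction), the model evidence is the normalizing constant of Bayes' rule written conditionally on a model m:

```latex
p(\theta \mid y, m) = \frac{p(y \mid \theta, m)\, p(\theta \mid m)}{p(y \mid m)}

p(y \mid m) = \int p(y \mid \theta, m)\, p(\theta \mid m)\, d\theta

B_{12} = \frac{p(y \mid m_1)}{p(y \mid m_2)} \quad \text{(Bayes factor comparing models } m_1 \text{ and } m_2\text{)}
```

Because the evidence averages the likelihood over the prior, a model whose extra parameters are not needed to explain the data is penalised automatically.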
aMRI segmentation

PPM of belonging to… grey matter, white matter, CSF


Dynamic Causal Modelling: generative model for fMRI and ERPs

Neural state equation:

fMRI
•  Hemodynamic forward model: neural activity → BOLD
•  Neural model: 1 state variable per region, bilinear state equation, no propagation delays

ERPs
•  Electric/magnetic forward model: neural activity → EEG, MEG, LFP
•  Neural model: 8 state variables per region, nonlinear state equation, propagation delays

[Figure label: inputs]
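For the fMRI case, the bilinear state equation is usually written in the form below. The symbols are assumptions, since the slide's equation did not survive extraction: z is the vector of regional neural states, u the inputs, A the fixed connectivity, B^(j) the modulation of connections by input j, and C the direct influence of inputs on regions.

```latex
\dot{z} = \left( A + \sum_{j} u_j B^{(j)} \right) z + C u
```

In the attention-to-motion example, an attentional modulation of the V1→V5 connection would correspond to a non-zero entry in the B matrix associated with the attention input.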
Bayesian Model Selection for fMRI

[Figure: four models m1 to m4, each comprising stim, V1, V5, PPC and attention]
[Figure: bar chart of the marginal likelihood of models m1 to m4 (axis ticks 0, 5, 10, 15)]
[Figure: estimated effective synaptic strengths for the best model (m4): 0.10, 0.39, 0.26, 1.25, 0.26, 0.13, 0.46]

[Stephan et al., Neuroimage, 2008]
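A minimal sketch of how log-evidences for competing models can be turned into posterior model probabilities, assuming a uniform prior over models. The log-evidence values are hypothetical and are not the values from the figure.

```python
import math


def posterior_model_probabilities(log_evidences):
    """Posterior p(m | y) under a uniform prior over models."""
    # Subtract the maximum before exponentiating to avoid numerical underflow
    shift = max(log_evidences)
    weights = [math.exp(le - shift) for le in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]


def log_bayes_factor(log_ev_a, log_ev_b):
    """Log Bayes factor in favour of model a over model b."""
    return log_ev_a - log_ev_b


# Hypothetical log-evidences for four models m1..m4
log_ev = [-310.0, -305.0, -303.0, -300.0]
probs = posterior_model_probabilities(log_ev)
best = probs.index(max(probs))  # zero-based index of the winning model
print(probs, best, log_bayes_factor(log_ev[3], log_ev[2]))
```

Only differences in log-evidence matter: a difference of 3 corresponds to a Bayes factor of about 20 in favour of the better model.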


fMRI time series analysis with spatial priors
[Figure: graphical model with nodes: degree of smoothness (spatial precision matrix); prior precision of GLM coeff; prior precision of AR coeff; prior precision of data noise; GLM coeff; AR coeff (correlated noise); observations]
[Figure panels: aMRI; Smooth Y (RFT); ML estimate of β; VB estimate of β]

Penny et al 2005
fMRI time series analysis with spatial priors: posterior probability maps

Posterior density q(βn): probability of getting an effect, given the data
•  mean: size of effect (Mean: Cbeta_*.img)
•  covariance: uncertainty (Std dev: SDbeta_*.img)

PPM (spmP_*.img): display only voxels that exceed e.g. 95%
[Figure: posterior density with the activation threshold and the probability mass pn marked]
fMRI time series analysis with spatial priors: Bayesian model selection

Log-evidence maps: compute log-evidence for each model (model 1 to model K) and each subject (subject 1 to subject N)
BMS maps: PPM and EPM for model k; probability that model k generated the data

Joao et al, 2009
Reminder…
Long term memory

Short term memory


Compare two models

Short-term memory model: IT indices: H, h, I, i
Long-term memory model: IT indices are smoother
[Figure labels: onsets; missed trials]

H=entropy; h=surprise; I=mutual information; i=mutual surprise
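A rough sketch of how two of these indices, entropy (H) and surprise (h), could be computed for an observer who estimates event probabilities from a limited number of past events. The add-one smoothing, the zero-based event coding and all function names are assumptions; the slides do not specify the estimator.

```python
import math


def event_probabilities(history, n_events, memory):
    """Estimate event probabilities from the last `memory` events."""
    window = history[-memory:] if memory > 0 else []
    counts = [1] * n_events  # add-one smoothing (an assumption)
    for event in window:
        counts[event] += 1
    total = sum(counts)
    return [c / total for c in counts]


def surprise(p_event):
    """h: surprise of an event with estimated probability p_event, in bits."""
    return -math.log2(p_event)


def entropy(probs):
    """H: expected surprise under the estimated distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def it_indices(sequence, n_events, memory):
    """Return (entropy, surprise) per trial, estimated before each event is seen."""
    out = []
    for t, event in enumerate(sequence):
        probs = event_probabilities(sequence[:t], n_events, memory)
        out.append((entropy(probs), surprise(probs[event])))
    return out


# Hypothetical sequence of four event types, coded 0..3
sequence = [0, 0, 1, 0, 3, 2, 0, 0]
short_term = it_indices(sequence, n_events=4, memory=2)
long_term = it_indices(sequence, n_events=4, memory=len(sequence))
print(short_term[:3])
print(long_term[:3])
```

A short memory window makes the estimates react quickly to recent events, whereas a long window averages over many trials, which is consistent with the slide's remark that the long-term indices are smoother.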


Group data: Bayesian Model Selection maps

Regions best explained by short-term memory model: primary visual cortex
Regions best explained by long-term memory model: frontal cortex (executive control)
Thank you
