08 Bayes PDF

Bayesian Inference
Thomas Nichols
With thanks to Lee Harrison
Bayesian segmentation and normalisation
Spatial priors on activation extent
Posterior probability maps (PPMs)
Dynamic Causal Modelling
Attention to Motion
Paradigm
- fixation only
- observe static dots + photic → V1
- observe moving dots + motion → V5
- task on moving dots + attention → V5 + parietal cortex
Results
Attention – No attention: SPC, V3A, V5+
Büchel & Friston 1997, Cereb. Cortex
Büchel et al. 1998, Brain
Attention to Motion
Paradigm
- fixation only
- observe static dots
- observe moving dots
- task on moving dots
Dynamic Causal Models
Model 1 (forward): attentional modulation of V1→V5
Model 2 (backward): attentional modulation of SPC→V5
[Figure: both models contain regions V1, V5 and SPC, with inputs Photic, Motion and Attention]
Bayesian model selection: Which model is optimal?

Responses to Uncertainty
Long term memory
Short term memory

Responses to Uncertainty
Paradigm: stimuli are a sequence of randomly sampled discrete events
Model: simple computational model of an observer's response to uncertainty, based on the number of past events (extent of memory)
Question: which regions are best explained by the short / long term memory model?
[Figure: events 1 2 3 4; trials 1, 2, ... 40; upcoming events marked "?"]
Overview
•  Introductory remarks
•  Some probability densities/distributions
•  Probabilistic (generative) models
•  Bayesian inference
•  A simple example – Bayesian linear regression
•  SPM applications
–  Segmentation
–  Dynamic causal modeling
–  Spatial models of fMRI time series
Probability distributions and densities
[Figure: example probability distributions and densities; panel label k=2]
Generative models
[Figure: generation and estimation; axis labels: space, time]
Bayesian statistics
posterior ∝ likelihood ∙ prior
(likelihood: new data; prior: prior knowledge)
Bayes theorem allows one to formally incorporate prior knowledge into computing statistical probabilities.
The posterior probability of the parameters given the data is an optimal combination of prior knowledge and new data, weighted by their relative precision.
Bayes’ rule
Given data y and parameters θ, their joint probability can be written in 2 ways:
p(y,θ) = p(y|θ) p(θ) = p(θ|y) p(y)
Eliminating p(y,θ) gives Bayes’ rule:
p(θ|y) = p(y|θ) p(θ) / p(y)
Posterior: p(θ|y)   Likelihood: p(y|θ)   Prior: p(θ)   Evidence: p(y)
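A minimal sketch of Bayes' rule on a discrete grid, to make the roles of likelihood, prior, evidence and posterior concrete. The binomial setting (y successes in n trials with rate theta), the flat prior and the function name grid_posterior are assumptions chosen for illustration; they are not taken from the slides.

```python
# Bayes' rule on a discrete grid of parameter values (illustrative only).
# Assumed setup: y counts successes in n Bernoulli trials with success rate theta.
from math import comb


def grid_posterior(y, n, thetas, prior):
    """Return (p(theta | y) on the grid, evidence p(y))."""
    # Likelihood p(y | theta) for every grid point
    likelihood = [comb(n, y) * t**y * (1 - t) ** (n - y) for t in thetas]
    # Joint p(y, theta) = p(y | theta) * p(theta)
    joint = [lik * p for lik, p in zip(likelihood, prior)]
    # Evidence p(y): the joint summed over all parameter values
    evidence = sum(joint)
    # Posterior p(theta | y) = joint / evidence
    posterior = [j / evidence for j in joint]
    return posterior, evidence


thetas = [i / 10 for i in range(11)]       # grid 0.0, 0.1, ..., 1.0
prior = [1 / len(thetas)] * len(thetas)    # flat prior (an assumption)
posterior, evidence = grid_posterior(y=7, n=10, thetas=thetas, prior=prior)
best = thetas[posterior.index(max(posterior))]
print(f"posterior mode = {best}, evidence p(y) = {evidence:.4f}")
```

With a flat prior the posterior mode coincides with the maximum-likelihood value; a non-flat prior would pull it toward the prior's mass.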
Principles of Bayesian inference
•  Formulation of a generative model
–  likelihood p(y|θ)
–  prior distribution p(θ)
•  Observation of data
–  y
•  Update of beliefs based upon observations, given a prior state of knowledge
Univariate Gaussian
Normal densities
Posterior mean = precision-weighted combination of prior mean and data mean
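The precision-weighted combination can be sketched in a few lines. The symbols are assumptions, since the slide's own equations are not available here: a single observation y with noise precision lam_e, and a Gaussian prior with mean mu_p and precision lam_p.

```python
def gaussian_posterior(y, lam_e, mu_p, lam_p):
    """Posterior mean and precision for a Gaussian likelihood and Gaussian prior.

    Assumed model (illustrative): y ~ N(mu, 1/lam_e), mu ~ N(mu_p, 1/lam_p).
    """
    lam = lam_e + lam_p                      # precisions add
    mu = (lam_e * y + lam_p * mu_p) / lam    # precision-weighted average
    return mu, lam


# Data four times more precise than the prior: the posterior sits close to the data.
mu, lam = gaussian_posterior(y=2.0, lam_e=4.0, mu_p=0.0, lam_p=1.0)
print(mu, lam)
```

When both precisions are equal, the posterior mean falls exactly halfway between the prior mean and the data.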
Bayesian GLM: univariate case
Normal densities
Bayesian GLM: multivariate case
Normal densities
One step if Ce and Cp are known.
Otherwise iterative estimation.
[Figure axes: β1, β2]
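For reference, the standard one-step solution for a Gaussian linear model with known covariances can be written as follows. Ce and Cp are the error and prior covariances named on the slide; the design matrix X, the prior mean \eta_p and the posterior symbols are notation assumed here, not recovered from the original.

```latex
\begin{align*}
y &= X\beta + e, \qquad e \sim \mathcal{N}(0, C_e), \qquad \beta \sim \mathcal{N}(\eta_p, C_p) \\
C_{\beta \mid y} &= \left( X^{\top} C_e^{-1} X + C_p^{-1} \right)^{-1} \\
\eta_{\beta \mid y} &= C_{\beta \mid y} \left( X^{\top} C_e^{-1} y + C_p^{-1} \eta_p \right)
\end{align*}
```

As in the univariate case, precisions add and the posterior mean is a precision-weighted combination of the data and the prior mean.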
Approximate inference: optimization
[Figure: true posterior and approximate posterior (mean-field approximation) plotted against the value of the parameter; the objective function (free energy) is iteratively improved]
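The role of the free energy as an objective function can be summarised by the usual variational decomposition. The notation below (q for the approximate posterior, F for the free energy, m for the model) is assumed for illustration and is not taken from the slide.

```latex
\begin{align*}
\log p(y \mid m) &= F(q) + \mathrm{KL}\!\left[\, q(\theta) \,\|\, p(\theta \mid y, m) \,\right] \\
F(q) &= \big\langle \log p(y, \theta \mid m) \big\rangle_{q} - \big\langle \log q(\theta) \big\rangle_{q} \\
q(\theta) &= \prod_i q_i(\theta_i) \qquad \text{(mean-field approximation)}
\end{align*}
```

Because the KL term is non-negative, F is a lower bound on the log evidence; increasing F iteratively moves the approximate posterior toward the true posterior.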
Simple example – linear regression
Ordinary least squares
[Figure panels: Data; Data and model fit; Bases (explanatory variables); Sum of squared errors]
Over-fitting: model fits noise
Inadequate cost function: blind to overly complex models
Solution: include uncertainty in model parameters

Bayesian linear regression: priors and likelihood
Model:
Prior:
[Figure: sample curves from prior (before observing any data); mean curve]
Likelihood:
Bayes Rule:
Posterior:
[Figure: posterior]
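The model, prior, likelihood and posterior steps can be sketched for the simplest case, a straight-line fit. The specific forms are assumptions for illustration, not the slide's equations: y = w0 + w1*x + noise, a zero-mean isotropic Gaussian prior with precision alpha, and Gaussian noise with precision lam. The function name posterior_line_fit is hypothetical.

```python
def posterior_line_fit(xs, ys, alpha, lam):
    """Posterior over (intercept, slope) for y = w0 + w1*x + noise.

    Assumed (not from the slides): prior w ~ N(0, I/alpha), noise precision lam.
    Posterior precision A = alpha*I + lam*X'X; posterior mean = lam * inv(A) X'y.
    """
    n = len(xs)
    sx, sxx = sum(xs), sum(x * x for x in xs)
    sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
    # Entries of the symmetric 2x2 posterior precision matrix A
    a00, a01, a11 = alpha + lam * n, lam * sx, alpha + lam * sxx
    det = a00 * a11 - a01 * a01
    # Posterior covariance = inverse of A (closed form for a 2x2 matrix)
    cov = [[a11 / det, -a01 / det], [-a01 / det, a00 / det]]
    b0, b1 = lam * sy, lam * sxy             # the vector lam * X'y
    mean = [cov[0][0] * b0 + cov[0][1] * b1,
            cov[1][0] * b0 + cov[1][1] * b1]
    return mean, cov


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                    # exactly y = 1 + 2x
weak, _ = posterior_line_fit(xs, ys, alpha=1e-6, lam=1.0)     # nearly flat prior
strong, _ = posterior_line_fit(xs, ys, alpha=100.0, lam=1.0)  # tight prior around zero
print(weak, strong)
```

With a nearly flat prior the posterior mean approaches the ordinary least squares solution; a tight prior shrinks the weights toward zero, which is how parameter uncertainty penalises overly complex fits.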
Posterior Probability Maps (PPMs)
Posterior distribution: probability of the effect given the data
mean: size of effect 
precision: variability
Posterior probability map: images of the probability (confidence) that an activation exceeds some specified threshold sth, given the data y
Two thresholds:
•  activation threshold sth: percentage of whole brain mean signal (physiologically relevant size of effect)
•  probability pth that voxels must exceed to be displayed (e.g. 95%)
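The two thresholds can be illustrated with a short sketch. It assumes a Gaussian posterior per voxel, described by a mean and a standard deviation; the function names and the example values are hypothetical and do not reflect SPM's implementation.

```python
from math import erf, sqrt


def exceedance_probability(post_mean, post_sd, s_th):
    """P(effect > s_th | y) for an assumed Gaussian posterior N(post_mean, post_sd^2)."""
    z = (s_th - post_mean) / post_sd
    # 1 - Phi(z), written with the error function
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))


def ppm(post_means, post_sds, s_th, p_th=0.95):
    """Per-voxel probabilities; voxels not exceeding p_th are masked with None."""
    probs = [exceedance_probability(m, s, s_th)
             for m, s in zip(post_means, post_sds)]
    return [p if p > p_th else None for p in probs]


# Three hypothetical voxels with posterior means and standard deviations
print(ppm([2.0, 1.2, 0.1], [0.5, 0.5, 0.5], s_th=1.0))
```

Only the first voxel is displayed here: its posterior mass above the activation threshold exceeds 95%, whereas the other two leave too much mass below it.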
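Model selection
Bayes Rule: p(θ|y,m) = p(y|θ,m) p(θ|m) / p(y|m)
The normalizing constant p(y|m) is the model evidence.
Model evidence: p(y|m) = ∫ p(y|θ,m) p(θ|m) dθ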
aMRI segmentation
PPM of belonging to… grey matter, white matter, CSF

Dynamic Causal Modelling: generative model for fMRI and ERPs
Neural state equation (driven by inputs)
fMRI
•  Hemodynamic forward model: neural activity→BOLD
•  Neural model: 1 state variable per region, bilinear state equation, no propagation delays
ERPs
•  Electric/magnetic forward model: neural activity→EEG, MEG, LFP
•  Neural model: 8 state variables per region, nonlinear state equation, propagation delays
Bayesian Model Selection for fMRI
[Figure: four models m1 to m4, each with regions V1, V5 and PPC, a stimulus input (stim) and an attention input]
[Figure: marginal likelihood for models m1 to m4; axis ticks 0, 5, 10, 15]
[Figure: estimated effective synaptic strengths for best model (m4), connecting stim, V1, V5, PPC and attention; values shown: 0.10, 0.39, 0.26, 1.25, 0.26, 0.13, 0.46]
[Stephan et al., Neuroimage, 2008]

fMRI time series analysis with spatial priors
[Figure: graphical model with nodes: degree of smoothness (spatial precision matrix); prior precision of GLM coeff; prior precision of AR coeff; prior precision of data noise; GLM coeff; AR coeff (correlated noise); observations]
[Figure panels: aMRI; Smooth Y (RFT); ML estimate of β; VB estimate of β]
Penny et al 2005
fMRI time series analysis with spatial priors: posterior probability maps
Posterior density q(βn): probability of getting an effect, given the data
•  mean: size of effect; Mean (Cbeta_*.img)
•  covariance: uncertainty; Std dev (SDbeta_*.img)
PPM (spmP_*.img): display only voxels that exceed e.g. 95%
[Figure labels: activation threshold; probability mass pn]
Bayesian model selection
Log-evidence maps: compute log-evidence for each model/subject (subject 1 to subject N, model 1 to model K)
BMS maps: PPM and EPM for model k
Probability that model k generated data
Joao et al, 2009
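As a rough illustration of turning log-evidences into the probability that model k generated the data, the sketch below uses a fixed-effects comparison with a uniform prior over models. This is a deliberately simple assumption and is not necessarily the scheme used in the cited work; the log-evidence values and the function name are invented for the example, and in practice one such computation would be run per voxel.

```python
from math import exp


def model_posteriors(log_evidence):
    """Posterior model probabilities from log_evidence[s][k] = log p(y_s | m_k).

    Assumptions (illustrative): subjects are independent (fixed effects) and
    all models are equally probable a priori.
    """
    n_models = len(log_evidence[0])
    # Under independence, log-evidences add across subjects
    group = [sum(subj[k] for subj in log_evidence) for k in range(n_models)]
    # Softmax; subtracting the maximum avoids numerical underflow
    top = max(group)
    weights = [exp(g - top) for g in group]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log-evidences for 3 subjects and 2 models at one voxel
log_ev = [[-120.0, -117.0], [-98.5, -97.0], [-143.0, -141.5]]
post = model_posteriors(log_ev)
print(post)
```

The summed log-evidence difference of 6 in favour of the second model translates into a posterior probability above 0.99 for that model.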
Reminder…
Long term memory
Short term memory

Compare two models
Short-term memory model: IT indices H, h, I, i
Long-term memory model: IT indices are smoother
[Figure labels: onsets; missed trials]
H=entropy; h=surprise; I=mutual information; i=mutual surprise

Group data: Bayesian Model Selection maps
Regions best explained by short-term memory model: primary visual cortex
Regions best explained by long-term memory model: frontal cortex (executive control)
Thank you
