TRISEP 2014
Sudbury, Ontario
Cartoon:
xkcd.com
6/5/14
Lecture Outline
Lecture 1:
Lecture 2:
Lecture 3:
Lecture 4:
Figures of Merit

Our jobs as scientists are to:

- Measure quantities as precisely as we can. Figure of merit: the uncertainty on the measurement.
- Discover new particles and phenomena. Figure of merit: the significance of evidence or observation -- try to be first! Related: the limit on a new process.

To be counterbalanced by:

Integrity: All sources of systematic uncertainty must be included in the interpretation. Large collaborations and peer review help to identify and assess systematic uncertainty.
Figures of Merit (expected)

The same figures of merit apply when designing a measurement, with "expected" in place of observed: the expected uncertainty on the measurement, the expected significance of evidence or observation (try to be first!), and the expected limit on a new process. The same counterbalance applies: all sources of systematic uncertainty must be included in the interpretation.

Expected sensitivity is used in experiment and analysis design.
Main Injector commissioned in 2002. Recycler used as another antiproton accumulator. Start-of-store luminosities exceeding 200×10^30 cm^-2 s^-1 are now routine.

[Figure: the Tevatron complex -- Booster, pbar source, Main Injector and Recycler, CDF and D0 detectors. Tevatron circumference: 6.28 km]
T. Junk, TRISEP 2014, Lecture 1
The Binomial Distribution:

\mathrm{Binom}(k \mid N, p) = \binom{N}{k} p^k (1-p)^{N-k}, \qquad \sigma(k) = \sqrt{\mathrm{Var}(k)} = \sqrt{Np(1-p)}

k is the number of successful trials. Example: events passing a selection cut, with a fixed total N.

The Gaussian approximate formula I remember, for measuring p by observing k out of N (e.g. acceptance measurements):

\hat{p} = k/N, \qquad \sigma_p = \sqrt{\hat{p}(1-\hat{p})/N}
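As a quick numerical sketch (illustrative numbers, not from the slides), the acceptance formula above can be wrapped in a few lines:

```python
import math

def efficiency_with_error(k, N):
    """Estimate p = k/N and its Gaussian-approximation uncertainty
    sigma_p = sqrt(p*(1-p)/N) for a binomial acceptance measurement."""
    p = k / N
    sigma = math.sqrt(p * (1.0 - p) / N)
    return p, sigma

# Hypothetical example: 800 events pass a cut out of 1000 generated
p, sigma = efficiency_with_error(800, 1000)
print(f"p = {p:.3f} +/- {sigma:.4f}")
```

Note the approximation degrades near p = 0 or p = 1, where the Gaussian error band can spill outside [0, 1].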
The Poisson Distribution:

\mathrm{Poiss}(k \mid \mu) = \frac{e^{-\mu}\mu^k}{k!}, \qquad \sigma(k) = \sqrt{\mu}

Normalized to unit area in two different senses:

\sum_{k=0}^{\infty} \mathrm{Poiss}(k \mid \mu) = 1, \qquad \int_0^{\infty} \mathrm{Poiss}(k \mid \mu)\, d\mu = 1

[Plot: Poisson distribution for \mu = 6]
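The sum-over-k normalization and the width σ(k) = √μ can be verified directly for the plotted example μ = 6 (a sketch, truncating the sum where the terms are negligible):

```python
import math

def poisson_pmf(k, mu):
    # Poiss(k | mu) = exp(-mu) * mu^k / k!
    return math.exp(-mu) * mu**k / math.factorial(k)

mu = 6.0
ks = range(100)  # tail beyond k=100 is negligible for mu = 6
total = sum(poisson_pmf(k, mu) for k in ks)              # sums to 1
mean = sum(k * poisson_pmf(k, mu) for k in ks)           # equals mu
var = sum((k - mean) ** 2 * poisson_pmf(k, mu) for k in ks)  # equals mu
print(total, mean, math.sqrt(var))
```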
Suppose I have two independent (for now) random variables, x and y, and add them to get z = x+y. If I know P(x)=f(x) and P(y)=g(y), how do I deduce P(z)? There are many ways of getting the same z, so we need to add up the probability of getting each one.

p(z) = \int f(x)\, g(z-x)\, dx = (f * g)(z)

This is a convolution of the two probability density functions f(x) and g(y). Detector resolution is usually modeled as a convolution of the quantity measured and the random sources of mismeasurement that get added to it.
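The convolution integral can be approximated on a grid; here is a sketch with an arbitrary choice of pdfs (an exponential and a uniform, not from the slides) showing that the result is again a normalized pdf:

```python
import math

# Discretized p(z) = integral f(x) g(z-x) dx for z = x + y
dx = 0.01
xs = [i * dx for i in range(2000)]            # grid on [0, 20)
f = [math.exp(-x) for x in xs]                # exponential pdf
g = [0.5 if x < 2.0 else 0.0 for x in xs]     # uniform pdf on [0, 2)

def conv_at(j):
    # (f * g)(z_j) ~ sum_i f(x_i) g(z_j - x_i) dx
    return sum(f[i] * g[j - i] for i in range(j + 1)) * dx

pz = [conv_at(j) for j in range(len(xs))]
area = sum(pz) * dx
print(area)  # close to 1: the convolution of two pdfs is a pdf
```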
\mathrm{Gauss}(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}

\mathrm{Gauss}\left(z,\ \mu_x+\mu_y,\ \sqrt{\sigma_x^2+\sigma_y^2}\right) = \int \mathrm{Gauss}(x, \mu_x, \sigma_x)\, \mathrm{Gauss}(z-x, \mu_y, \sigma_y)\, dx
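The Gaussian-convolution identity above can be spot-checked by sampling: the sum of two independent Gaussians should have mean μ_x + μ_y and width added in quadrature (illustrative parameters):

```python
import random
import statistics

# z = x + y for independent Gaussians: mean adds, widths add in quadrature
random.seed(1)
mu_x, sig_x = 1.0, 0.6
mu_y, sig_y = 2.0, 0.8
z = [random.gauss(mu_x, sig_x) + random.gauss(mu_y, sig_y)
     for _ in range(200000)]
print(statistics.mean(z))   # near mu_x + mu_y = 3.0
print(statistics.stdev(z))  # near sqrt(0.36 + 0.64) = 1.0
```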
If, in an experiment, all we have is a measurement n, we often use that to estimate \mu. We then draw error bars on the data. This is just a convention, and can be misleading. (We still recommend you do it, however.)

But: https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhysicsResultsEXO11066
Correct presentation of data and predictions.

Sometimes another convention is adopted for showing error bars on the data.
Core is approximately Gaussian. [Figure: distribution with statistical and systematic components]
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i

is an unbiased estimator of the mean.

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}

Worth remembering this formula!
\sigma_{\mathrm{est}}\,(\mu_{\mathrm{true}}\ \mathrm{known}) = \sqrt{\frac{1}{N}\sum_i (x_i - \mu_{\mathrm{true}})^2}

BUT: The true mean is usually not known, and we use the same data to estimate the mean as to estimate the width. One degree of freedom is used up by the extraction of the mean. This narrows the distribution of deviations from the average, as the average is closer to the data events than the true mean may be. An unbiased estimator of the width is:

\sigma_{\mathrm{est}}\,(\mu_{\mathrm{true}}\ \mathrm{unknown}) = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{N-1}}
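The N versus N−1 point can be demonstrated with pseudoexperiments (a sketch with arbitrary sample size and trial count): dividing by N systematically underestimates the variance when the mean comes from the same data, while N−1 removes the bias.

```python
import random

# Pseudoexperiment check of the N vs N-1 denominators for unit-variance data
random.seed(2)
N, trials = 5, 20000
var_n, var_nm1 = 0.0, 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(N)]
    m = sum(xs) / N                       # mean estimated from the same data
    s2 = sum((x - m) ** 2 for x in xs)
    var_n += s2 / N                       # biased: expectation (N-1)/N = 0.8
    var_nm1 += s2 / (N - 1)               # unbiased: expectation 1.0
var_n /= trials
var_nm1 /= trials
print(var_n, var_nm1)
```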
\sigma_{\sigma} \approx \frac{\sigma}{\sqrt{2(N-1)}}

I once had to use this formula for my thesis: momentum-weighted charge determination of Z decay to b\bar{b} in e+e- collisions. [Figure: b and \bar{b} jets]
Propagation of Uncertainties

Covariance: if u and v are correlated, the propagated uncertainty on a combination includes a covariance term. This can even vanish! (anticorrelation)

In general, the correlation coefficient is

\rho_{xy} = \frac{\sigma_{xy}^2}{\sigma_x \sigma_y}

http://en.wikipedia.org/wiki/Correlation_and_dependence
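As a sketch of the covariance term (illustrative numbers): for z = x + y, the propagated variance is σ_x² + σ_y² + 2ρσ_xσ_y, which vanishes for full anticorrelation with equal uncertainties.

```python
import math

def sigma_sum(sig_x, sig_y, rho):
    # Propagated uncertainty on z = x + y including the covariance term
    var = sig_x**2 + sig_y**2 + 2.0 * rho * sig_x * sig_y
    return math.sqrt(max(0.0, var))  # guard against tiny negative rounding

print(sigma_sum(0.3, 0.4, 0.0))   # uncorrelated: add in quadrature -> 0.5
print(sigma_sum(0.3, 0.3, -1.0))  # full anticorrelation: -> 0.0
```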
NDOF = ?
The \chi^2 Distribution

Plot from Wikipedia: k = number of degrees of freedom. [PDF and cumulative distribution curves; mean = k]
Smaller data samples: harder to discern mismodeling.

Or Really Suspicious
\chi^2 is really bad for any smooth function. Uncertainties probably underestimated.

x_{\mathrm{best}} = \sum_i w_i a_i, \qquad \mathrm{with} \quad \sum_i w_i = 1
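A minimal sketch of the weighted combination (illustrative inputs): inverse-variance weights, normalized to sum to 1, minimize the variance of the combination.

```python
def combine(values, sigmas):
    # x_best = sum_i w_i a_i with w_i proportional to 1/sigma_i^2
    wts = [1.0 / s**2 for s in sigmas]
    norm = sum(wts)
    x_best = sum(w * a for w, a in zip(wts, values)) / norm
    sigma_best = (1.0 / norm) ** 0.5   # combined uncertainty
    return x_best, sigma_best

x, s = combine([10.0, 12.0], [1.0, 2.0])
print(x, s)  # combination sits closer to the more precise input
```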
Advantages:
- Approximately unbiased.
- Usually close to optimal.
- Invariant under transformation of parameters. Fit for a mass or mass^2: doesn't matter.

Unbinned likelihood fits are quite popular. Just need L = P(data | parameters).

Warnings:
- Need to estimate what the bias is, if any. Monte Carlo pseudoexperiment approach: generate lots of random fake data samples with known true values of the parameters sought, fit them, and see if the averages differ from the inputs.
- More subtle -- the uncertainties could be biased. Run pseudoexperiments and histogram the pulls (fit - input)/error; you should get a Gaussian centered on zero with unit width, or there's bias.
- Handling of systematic uncertainties on nuisance parameters by maximization can give misleadingly small uncertainties -- need to study L for other values than just the maximum (L can be bimodal).
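The pull study described above can be sketched in a few lines; here the "fit" is simply the sample mean with error σ/√N, a case where the pulls are exactly standard normal (all numbers illustrative):

```python
import random
import statistics

# Pseudoexperiment pull study: pull = (fit - input) / error
random.seed(3)
true_mu, sigma, N = 5.0, 2.0, 25
pulls = []
for _ in range(10000):
    xs = [random.gauss(true_mu, sigma) for _ in range(N)]
    fit = sum(xs) / N            # estimator of the mean
    err = sigma / N ** 0.5       # its known uncertainty
    pulls.append((fit - true_mu) / err)
# Unbiased fit with correct errors: mean ~ 0, width ~ 1
print(statistics.mean(pulls), statistics.stdev(pulls))
```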
Example of a problem. Total:

N_{\mathrm{tot}} = \sum_{i=1}^{K} n_i

Weighted average (from BLUE), taking \sigma_i^2 = n_i:

n_{\mathrm{avg}} = \frac{\sum_{i=1}^{K} n_i/\sigma_i^2}{\sum_{i=1}^{K} 1/\sigma_i^2} = \frac{\sum_{i=1}^{K} n_i/n_i}{\sum_{i=1}^{K} 1/n_i} = \frac{K}{\sum_{i=1}^{K} 1/n_i}
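The formula above is the harmonic mean of the counts, which is biased low for Poisson data. A pseudoexperiment sketch (illustrative μ, K, and trial count; Poisson sampling via Knuth's algorithm, with zero counts floored at 1 to avoid dividing by zero):

```python
import math
import random

random.seed(4)

def poisson(mu):
    # Knuth's algorithm, adequate for small mu
    L, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

mu, K, trials = 10.0, 10, 2000
avg_harmonic = 0.0
for _ in range(trials):
    ns = [max(poisson(mu), 1) for _ in range(K)]   # floor at 1
    avg_harmonic += K / sum(1.0 / n for n in ns)   # harmonic mean
avg_harmonic /= trials
print(avg_harmonic)  # noticeably below the true mean of 10
```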
[Histogram of Pull = (x-\mu)/\sigma for 10000 pseudoexperiments: fitted Mean = -0.25, Width = 0.97]
\chi^2/\mathrm{DOF} = 18.5/13, probability = 13.8%

Didn't expect a 3\sigma result in 18 measurements, but then again, the total \chi^2 is okay.

Electroweak fit (March 2012), measurement vs. fit, with pulls |O_meas - O_fit|/\sigma_meas:

Quantity        Measurement         Fit
m_Z [GeV]       91.1875 ± 0.0021    91.1874
Γ_Z [GeV]       2.4952 ± 0.0023     2.4959
σ_had [nb]      41.540 ± 0.037      41.478
R_l             20.767 ± 0.025      20.742
R_c             0.1721 ± 0.0030     0.1723
A_fb^{0,b}      0.0992 ± 0.0016     0.1038
A_fb^{0,c}      0.0707 ± 0.0035     0.0742
A_b             0.923 ± 0.020       0.935
A_c             0.670 ± 0.027       0.668
A_l (SLD)       0.1513 ± 0.0021     0.1481
m_W [GeV]       80.385 ± 0.015      80.377
Γ_W [GeV]       2.085 ± 0.042       2.092
m_t [GeV]       173.20 ± 0.90       173.26

(Rows for Δα_had^(5)(m_Z), A_fb^{0,l}, A_l(Pτ), R_b, and sin²θ_eff^lept were garbled in extraction.)
For displaying joint estimation of several parameters. From the 2011 PDG Statistics Review:
http://pdg.lbl.gov/2011/reviews/rpp2011-rev-statistics.pdf

[Figure: likelihood of a Nuisance Parameter vs. the Parameter of Interest, e.g. M_top,rec]

http://www.physics.csbsju.edu/stats/KS-test.html
p(z) = 2\sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 z^2}

You can also compute the p-value by running pseudoexperiments and finding the distribution of the KS distance. Distributions are usually binned, though; the analytic formula no longer applies. Run pseudoexperiments instead. See ROOT's TH1::KolmogorovTest(), which computes both D and p.
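The asymptotic KS p-value series above converges quickly and is easy to evaluate directly (a sketch; the truncation at 100 terms is an arbitrary but generous choice):

```python
import math

def ks_pvalue(z, terms=100):
    # p(z) = 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 z^2)
    return 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * z * z)
                     for j in range(1, terms + 1))

print(ks_pvalue(1.0))  # roughly 0.27: a KS distance of z = 1 is unremarkable
print(ks_pvalue(2.0))  # well below 1e-3: strong evidence of disagreement
```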
Extra Slides
x_1 = \frac{\frac{x_2}{\sigma_2^2} + \frac{x_3}{\sigma_3^2}}{\frac{1}{\sigma_2^2} + \frac{1}{\sigma_3^2}}, \qquad \sigma_1^2 = \frac{1}{\frac{1}{\sigma_2^2} + \frac{1}{\sigma_3^2}}
x_1 - x_2 = x_2\left(\frac{\sigma_1^2}{\sigma_2^2} - 1\right) + x_3\left(\frac{\sigma_1^2}{\sigma_3^2}\right)

\sigma_{x_1 - x_2}^2 = \sigma_2^2\left(\frac{\sigma_1^2}{\sigma_2^2} - 1\right)^2 + \sigma_3^2\left(\frac{\sigma_1^2}{\sigma_3^2}\right)^2

Check: if the new cut is the same as the old cut, no difference in measurements! Assumes: Gaussian, uncorrelated measurement pieces.
1D or 2D Presentation

[Figure: contours in the (Parameter 1, Parameter 2) plane: the 2\Delta\ln L = 1 contour and the 2\Delta\ln L = 2.3 contour, each marked 68%]

I prefer, when showing a 2D plot, showing the contours which cover in 2D. The 2\Delta\ln L = 1 contour only covers for the 1D parameters, one at a time.
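The 1 vs. 2.3 thresholds follow from the χ² distribution with 1 and 2 degrees of freedom (Gaussian approximation); both can be computed with closed forms, a sketch:

```python
import math

# 1 dof: P(delta_chi2 <= c) = erf(sqrt(c/2))
# 2 dof: P(delta_chi2 <= c) = 1 - exp(-c/2)
cov_1d = math.erf(math.sqrt(1.0 / 2.0))    # c = 1 covers ~68.3% in 1D
cov_2d_at_1 = 1.0 - math.exp(-1.0 / 2.0)   # but only ~39% in 2D
c_2d = -2.0 * math.log(1.0 - 0.683)        # ~2.30 needed for 68.3% in 2D
print(cov_1d, cov_2d_at_1, c_2d)
```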
68% and 95% contours:
http://www-cdf.fnal.gov/physics/new/top/2011/WhelDil/index.html