
MALMQUIST BIAS

(also known as Eddington bias)


...... or Eddington's solution of a Fredholm integral equation of the 1st kind

\[ F(x) = \int U(x - z)\, K(z)\, dz \]

Expand the integral argument to give

\[ U(x - z) = U(x) - U'(x)\, z + U''(x)\, \frac{z^2}{2!} - \;...... \]

Integrate term by term

\[ F(x) = \sum_n (-1)^n \frac{\mu_n}{n!}\, U^{(n)}(x) \]

where $\mu_n$ are the moments of the integration kernel K. Now rewrite

\[ U(x) = F(x) + \sum_n A_n\, F^{(n)}(x) \]

For a central kernel (ie. $\mu_1 = 0$) and equating coefficients

\[ U(x) = F(x) - \frac{\mu_2}{2!} F^{(2)}(x) + \frac{\mu_3}{3!} F^{(3)}(x) - \left[ \frac{\mu_4}{4!} - \left( \frac{\mu_2}{2!} \right)^2 \right] F^{(4)}(x) + \;...... \]

For a Gaussian kernel $\mu_{odd} = 0$, $\mu_2 = \sigma^2$, $\mu_4 = 3\sigma^4$, ...... therefore

\[ U(x) = F(x) - \frac{\sigma^2}{2} F^{(2)}(x) + \frac{\sigma^4}{8} F^{(4)}(x) + \;...... \]

and for, say, a luminosity function of the form $N(m) = 10^{\alpha (m - m_0)}$

\[ \frac{dN}{dm} = \alpha \ln 10\; N(m) \qquad \frac{d^2 N}{dm^2} = (\ln 10)^2 \alpha^2\, N(m) \]

\[ N_{obs}(m) = N(m) + \frac{\sigma^2}{2} (\ln 10)^2 \alpha^2\, N(m) \qquad \text{equivalently} \quad \Delta m = \ln 10\; \alpha\, \frac{\sigma^2}{2} \]
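As a numerical illustration, a minimal numpy sketch of this first-order correction; the slope alpha = 0.6 and error sigma = 0.3 mag used below are arbitrary example values:

import numpy as np

def eddington_corrected_counts(m, alpha=0.6, sigma=0.3, m0=20.0):
    # True counts N(m) = 10**(alpha*(m - m0)) and the observed counts after
    # convolution with Gaussian magnitude errors of width sigma, keeping only
    # the first-order term: N_obs = N * (1 + 0.5*(ln10*alpha*sigma)**2).
    N_true = 10.0 ** (alpha * (m - m0))
    N_obs = N_true * (1.0 + 0.5 * (np.log(10.0) * alpha * sigma) ** 2)
    dm = np.log(10.0) * alpha * sigma ** 2 / 2.0   # equivalent magnitude shift
    return N_true, N_obs, dm

m = np.array([18.0, 20.0, 22.0])
N_true, N_obs, dm = eddington_corrected_counts(m)
print(N_obs / N_true, dm)   # constant fractional excess of ~9%, shift of ~0.06 mag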
PRINCIPAL COMPONENTS ANALYSIS - PCA
Given a series of n-dimensional vectors $x_k$; $k = 1, 2, ..... m$, what is the optimal linear transformation to reduce the dimensionality of the data? Define

\[ x'_k = x_k - \langle x_k \rangle_k \qquad\qquad x'_k = \sum_{j=1}^{p} a_{kj}\, e_j + \epsilon_k \]

where $\langle \epsilon_k^{T} \epsilon_k \rangle_k$ is to be minimised subject to $|e_j|^2 = 1$, $e_j^{T} e_k = \delta_{jk}$, and $\frac{1}{m} \sum_{jk} a^2_{kj}$ maximised.

\[ a_{kj} = e_j^{T} x'_k \qquad \text{and} \qquad \frac{1}{m} \sum_{jk} a^2_{kj} = \sum_j e_j^{T} C\, e_j \qquad \text{where} \quad C = \frac{1}{m} \sum_{k=1}^{m} x'_k\, x'^{T}_k \]

ie. C is symmetric and +ve definite.

\[ C\, e_j = \lambda_j\, e_j \qquad \text{and} \qquad \frac{1}{m} \sum_{jk} a^2_{kj} = \sum_{j=1}^{p} \lambda_j \]

Therefore sorting the eigenvectors of the data covariance matrix by eigenvalue defines the optimum compression/feature extraction scheme.
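A minimal numpy sketch of the scheme above (function and variable names are illustrative only):

import numpy as np

def pca(X, p):
    # X: (m, n) array of m data vectors; return the top-p principal
    # components and the projection coefficients a_kj.
    Xc = X - X.mean(axis=0)                  # x'_k = x_k - <x_k>
    C = (Xc.T @ Xc) / X.shape[0]             # covariance matrix (symmetric, +ve definite)
    lam, e = np.linalg.eigh(C)               # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:p]        # sort by eigenvalue, keep the p largest
    e_p = e[:, order]                        # columns are the eigenvectors e_j
    a = Xc @ e_p                             # a_kj = e_j . x'_k
    return a, e_p, lam[order]

# usage: compress 10-dimensional data to 3 components
X = np.random.randn(500, 10) @ np.random.randn(10, 10)
a, components, eigvals = pca(X, 3)
print(eigvals)    # retained variance per component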
INDEPENDENT COMPONENT ANALYSIS - ICA
This is an alternative approach for identifying independent features (components) in the data, but this time defined by the requirement that they are as statistically independent as possible. (ICA is closely related to blind source separation and projection pursuit.)

Start again from a series of n-dimensional vectors $x_k$, $k = 1, 2, ..... m$ and define

\[ x'_k = x_k - \langle x_k \rangle_k \qquad\qquad x'_k = \sum_{j=1}^{m} a_{kj}\, s_j \]

where $s_j$ are the sought-after independent components. Defining X and S as the matrices with column vectors $x'_k$, $s_k$ respectively and A as the matrix with elements $a_{kj}$ we have

\[ X = A\, S \qquad S = A^{-1} X \qquad S = W\, X \]

where the weight matrix W defines the independent components.

Independence already implies uncorrelated, but ICA also aims to maximise the non-Gaussianity of the $s_k$, which is equivalent to minimising the entropy of the distribution of the values of the components of $s_k$, which in this case is also equivalent to minimising the mutual information of the vectors $s_k$.

The simplest algorithm is FastICA, which solves for the $s_k$ one at a time using a fixed-point iteration scheme (Hyvärinen & Oja 1997).

\[ s_k = \sum_{j=1}^{m} w_{kj}\, x_j = w_k^{T} X \]

which in practice is done by minimising $\langle G(w_k^{T} X) \rangle$ subject to $w_k^{T} w_k = 1$, where $G(u) = \tanh(u)$ !!
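A sketch of the one-unit FastICA fixed-point update with the tanh nonlinearity, assuming numpy and data that have already been centred and whitened; this follows the standard Hyvärinen & Oja update rather than any particular implementation:

import numpy as np

def fastica_one_unit(X, n_iter=200, tol=1e-8, seed=None):
    # Extract one independent component from whitened, zero-mean data X of
    # shape (n, m): n mixed variables, m samples, using g(u) = tanh(u).
    rng = np.random.default_rng(seed)
    n, m = X.shape
    w = rng.standard_normal(n)
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        u = w @ X                                # projections w^T x for every sample
        g = np.tanh(u)
        g_prime = 1.0 - g ** 2
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w   # fixed-point update
        w_new /= np.linalg.norm(w_new)           # re-impose w^T w = 1
        if abs(abs(w_new @ w) - 1.0) < tol:      # converged (the sign of w is arbitrary)
            w = w_new
            break
        w = w_new
    return w, w @ X                              # weight vector and recovered source

Further components follow by deflation, ie. orthogonalising each new w against those already found before re-normalising.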
ARTIFICIAL NEURAL NETWORKS - ANNs
Used for feature extraction, classification, data compression, prediction .....

Input layer: x{1...m}; hidden layer(s): X{1...h}; output layer: y{1...p}

\[ y_k = f_k\!\left( \sum_{i=1}^{h} w_{ki}\, X_i + w_{k0} \right) \qquad \text{eg.} \quad f(z) = \frac{1}{1 + e^{-z}} \quad \text{(sigmoid)} \]

\[ X_k = g_k\!\left( \sum_{i=1}^{m} w'_{ki}\, x_i + w'_{k0} \right) \qquad f(z) = g(z) \quad \text{common} \]

Minimise $\langle \sum_i [y_i(t) - d_i(t)]^2 \rangle_t$, where t denotes the training set and $d_i(t)$ the desired outcome.

Solution: back propagation of errors (Werbos 1974)

output units: \[ \delta_j = (d_j - y_j)\, y_j (1 - y_j) \quad \text{(sigmoid function)} \]

hidden units: \[ \delta_j = y_j (1 - y_j) \sum_k w_{jk}\, \delta_k \]

adjust weights iteratively: \[ w_{ij}(t) = \eta\, \delta_j\, y_i + w_{ij}(t-1) \]

and loop through the entire training set $n_{loop} \gg 1$ times.
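A minimal numpy sketch of one back-propagation step for a single hidden layer with sigmoid units throughout; the learning rate eta and the array layouts are illustrative choices:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, d, W_hid, W_out, eta=0.1):
    # One training example: input x (m,), target d (p,).
    # W_hid: (h, m+1) hidden weights, W_out: (p, h+1) output weights
    # (the last column of each holds the bias w_k0).
    x1 = np.append(x, 1.0)                    # append bias input
    X = sigmoid(W_hid @ x1)                   # hidden activations X_k
    X1 = np.append(X, 1.0)
    y = sigmoid(W_out @ X1)                   # outputs y_k
    delta_out = (d - y) * y * (1.0 - y)                         # output-unit deltas
    delta_hid = X * (1.0 - X) * (W_out[:, :-1].T @ delta_out)   # hidden-unit deltas
    W_out += eta * np.outer(delta_out, X1)    # w_ij(t) = w_ij(t-1) + eta * delta_j * y_i
    W_hid += eta * np.outer(delta_hid, x1)
    return y, W_hid, W_out

Sweeping this repeatedly over the whole training set (n_loop >> 1 passes) implements the iterative adjustment above.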
GENETIC ALGORITHMS
Generally used for NP-hard problems, ie. not those scaling as $N^2$, $N \ln N$, $N^3$ ....... but more of the variety no. of solutions $= N!$, $N^M$ ....... ie. the solution space is combinatorial or has a complex topology.

Examples include: scheduling timetables, airline routes, travelling salesman-type problems, fibre configuration, $\chi^2$ template minimisation ......

1. devise a gene-like encoding scheme for the parameters of interest ($N_{gene}$)
2. randomly generate large nos. of trial solutions (eg. $N_{trial}$ = 1000+)
3. devise a fitness score (0 to 1) to quantify them (eg. constraints, $\chi^2$)
4. breed new offspring solutions $\propto$ fitness, with $P_{crossover}$ = 0.5 - 1.0
5. allow genetic mutations in offspring, $P_{mutate} \approx 1 / (N_{trial} N_{gene})$
6. test the new generation and rescale the fitness score to the range 0 to 1
7. test convergence; end, or repeat from 4 (a toy code sketch follows this list).
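A toy numpy sketch of the loop above; the bit-counting fitness function, single-point crossover and the particular parameter values are illustrative only:

import numpy as np

def ga_maximise(fitness, n_gene=32, n_trial=1000, n_generations=100,
                p_crossover=0.8, seed=None):
    # Toy GA following steps 1-7: binary genes, fitness supplied by the caller (step 3).
    rng = np.random.default_rng(seed)
    p_mutate = 1.0 / (n_trial * n_gene)                  # step 5 guideline
    pop = rng.integers(0, 2, size=(n_trial, n_gene))     # step 2: random trial solutions
    for _ in range(n_generations):
        f = np.array([fitness(g) for g in pop], dtype=float)
        f = (f - f.min()) / (f.max() - f.min() + 1e-12)  # step 6: rescale to 0-1
        prob = (f + 1e-9) / (f + 1e-9).sum()             # breeding probability ~ fitness
        parents = pop[rng.choice(n_trial, size=(n_trial, 2), p=prob)]   # step 4
        cut = rng.integers(1, n_gene, size=n_trial)      # single-point crossover
        mask = np.arange(n_gene) < cut[:, None]
        child = np.where(mask, parents[:, 0], parents[:, 1])
        no_cross = rng.random(n_trial) >= p_crossover
        child[no_cross] = parents[no_cross, 0]           # some offspring are plain copies
        flip = rng.random(child.shape) < p_mutate        # step 5: rare mutations
        pop = np.where(flip, 1 - child, child)
    scores = [fitness(g) for g in pop]                   # step 7: convergence monitoring left to the caller
    return pop[int(np.argmax(scores))]

best = ga_maximise(lambda g: g.sum(), n_generations=30)  # maximise the number of 1-bits
print(best.sum())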
OUTLINE PROOF - MAXIMUM LIKELIHOOD METHOD
The likelihood is the probability of observing a particular dataset, therefore

\[ \int L(x\,|\,\theta)\, dx = 1 \]

differentiate with respect to $\theta$

\[ \int \frac{\partial L}{\partial \theta}\, dx = 0 = \int \frac{1}{L} \frac{\partial L}{\partial \theta}\, L\, dx = \int \frac{\partial \ln L}{\partial \theta}\, L\, dx \]

differentiate the RH term with respect to $\theta$ again

\[ \int \left[ \frac{\partial^2 \ln L}{\partial \theta^2}\, L + \left( \frac{\partial \ln L}{\partial \theta} \right)^2 L \right] dx = 0 \]

therefore

\[ \left\langle -\frac{\partial^2 \ln L}{\partial \theta^2} \right\rangle = \left\langle \left( \frac{\partial \ln L}{\partial \theta} \right)^2 \right\rangle \]

Let t be an unbiased estimator of some function of $\theta$, say $\tau(\theta)$, then

\[ \langle t \rangle = \int t\, L\, dx = \tau(\theta) \qquad\Rightarrow\qquad \tau'(\theta) = \frac{\partial \tau(\theta)}{\partial \theta} = \int t\, \frac{\partial \ln L}{\partial \theta}\, L\, dx \]

therefore from above

\[ \tau'(\theta) = \int \left( t - \tau(\theta) \right) \frac{\partial \ln L}{\partial \theta}\, L\, dx \]

Use the Schwarz inequality on $\tau'^2$ to generate

\[ \tau'(\theta)^2 \le \int (t - \tau)^2\, L\, dx \int \left( \frac{\partial \ln L}{\partial \theta} \right)^2 L\, dx \]

Therefore, for the case $\tau(\theta) = \theta$,

\[ \mathrm{var}\{t\} \ge \frac{1}{\left\langle \left( \frac{\partial \ln L}{\partial \theta} \right)^2 \right\rangle} = \frac{1}{\left\langle -\frac{\partial^2 \ln L}{\partial \theta^2} \right\rangle} \]
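As a quick check (numpy assumed, numbers arbitrary), a Monte-Carlo comparison for the Poisson case treated in the worked examples below, where the MLE of the mean attains this bound:

import numpy as np

rng = np.random.default_rng(1)
lam, n_obs, n_trials = 5.0, 20, 20000

# Poisson likelihood: lnL = sum_i(-lam + N_i*ln(lam) - ln N_i!), so
# <-d^2 lnL / d lam^2> = n_obs/lam and the bound is var{lam_hat} >= lam/n_obs.
counts = rng.poisson(lam, size=(n_trials, n_obs))
lam_hat = counts.mean(axis=1)             # the MLE is the sample mean

print("empirical variance    :", lam_hat.var())
print("minimum variance bound:", lam / n_obs)   # the two should agree closely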
WORKED EXAMPLES
What are the correct 1-$\sigma$ error bars to use for a Poisson distribution, eg. the number density of objects in various parameter ranges?

Observe N objects in the interval and consider the MLE of the model density parameter $\lambda$.

Poisson $\Rightarrow$

\[ P(N\,|\,\lambda) = \frac{\lambda^N}{N!}\, e^{-\lambda} \qquad \ln L(\lambda) = -\lambda + N \ln \lambda - \ln N! \qquad \frac{\partial \ln L}{\partial \lambda} = 0 \;\Rightarrow\; \hat{\lambda} = N \]

The error on the estimate at the p-$\sigma$ level is where $\ln L = \ln L\,|_{max} - \tfrac{1}{2} p^2$; substituting for $\hat{\lambda}$,

\[ \ln L(\hat{\lambda}) = -N + N \ln N - \ln N! \qquad\Rightarrow\qquad 1 + \frac{p^2}{2N} = \frac{\lambda}{\hat{\lambda}} - \ln\!\left( \frac{\lambda}{\hat{\lambda}} \right) \]

For N = 1 the 1-$\sigma$ range is $0.3 < \lambda/\hat{\lambda} < 2.4$.

In the limit of large N let $\lambda/\hat{\lambda} = 1 + \epsilon$, then (for p = 1)

\[ 1 + \frac{1}{2N} = 1 + \frac{\epsilon^2}{2} - \frac{\epsilon^3}{3} + \;......... \qquad \lim_{N \to \infty} \epsilon = \frac{1}{\sqrt{N}} \]
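A short numerical check of the quoted ranges, assuming scipy is available, solving $\lambda/\hat{\lambda} - \ln(\lambda/\hat{\lambda}) = 1 + p^2/2N$ on either side of the maximum:

import numpy as np
from scipy.optimize import brentq

def poisson_sigma_range(N, p=1.0):
    # Solve x - ln(x) = 1 + p**2/(2N) for x = lam/lam_hat on either side of x = 1.
    rhs = 1.0 + p ** 2 / (2.0 * N)
    f = lambda x: x - np.log(x) - rhs
    return brentq(f, 1e-9, 1.0), brentq(f, 1.0, 100.0)

print(poisson_sigma_range(1))     # roughly (0.30, 2.36): the range quoted above
print(poisson_sigma_range(100))   # roughly (0.90, 1.11): approaching 1 +/- 1/sqrt(N)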
What is the optimum aperture to use for photometry of radially symmetric Gaussian, Exponential and Moffat profile images?

Gaussian $\Rightarrow$

\[ I(r) = \frac{I_{tot}}{2\pi \sigma_G^2}\, e^{-r^2 / 2\sigma_G^2} \qquad \mathrm{FWHM} = 2 \sigma_G \sqrt{2 \ln 2} \]

\[ I(<R) = I_{tot}\left( 1 - e^{-R^2/2\sigma_G^2} \right) \qquad \mathrm{MVB} = \sqrt{4\pi\sigma_G^2}\; \sigma_{noise} \]

\[ \mathrm{Efficiency} = \sqrt{\frac{4\sigma_G^2}{R^2}}\left( 1 - e^{-R^2/2\sigma_G^2} \right) \]

Exponential $\Rightarrow$

\[ I(r) = \frac{I_{tot}}{2\pi a^2}\, e^{-r/a} \qquad \mathrm{FWHM} = 2a \ln 2 \]

\[ I(<R) = I_{tot}\left( 1 - e^{-R/a} - \frac{R}{a}\, e^{-R/a} \right) \qquad \mathrm{MVB} = \sqrt{8\pi a^2}\; \sigma_{noise} \]

\[ \mathrm{Efficiency} = \sqrt{\frac{8a^2}{R^2}}\left( 1 - e^{-R/a}\left[ 1 + R/a \right] \right) \]

Moffat $\Rightarrow$

\[ I(r) = I_o \left[ 1 + (r/\alpha)^2 \right]^{-\beta} \qquad \mathrm{FWHM} = 2\alpha \sqrt{2^{1/\beta} - 1} \]

\[ I(<R) = \frac{\pi \alpha^2}{\beta - 1}\, I_o \left\{ 1 - \left[ 1 + (R/\alpha)^2 \right]^{-\beta + 1} \right\} \qquad \mathrm{MVB} = \sqrt{\frac{\pi \alpha^2 (2\beta - 1)}{(\beta - 1)^2}}\; \sigma_{noise} \]

\[ \mathrm{Efficiency} = \sqrt{\frac{\alpha^2 (2\beta - 1)}{R^2 (\beta - 1)^2}} \left\{ 1 - \left[ 1 + (R/\alpha)^2 \right]^{-\beta + 1} \right\} \]
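A short numerical check for the Gaussian case, assuming scipy is available, which maximises the efficiency expression above over the aperture radius R:

import numpy as np
from scipy.optimize import minimize_scalar

def efficiency_gaussian(R, sigma_g=1.0):
    # Aperture S/N relative to the minimum variance bound for a Gaussian profile.
    return np.sqrt(4.0 * sigma_g ** 2 / R ** 2) * (1.0 - np.exp(-R ** 2 / (2.0 * sigma_g ** 2)))

res = minimize_scalar(lambda R: -efficiency_gaussian(R), bounds=(0.1, 10.0), method="bounded")
print("optimum aperture radius:", res.x, "sigma_G")   # about 1.6 sigma_G
print("peak efficiency        :", -res.fun)           # about 0.9 of the best possible S/N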
