
Sep 28, 2018


Biometrics

© All Rights Reserved


Lecture 7

I will have the Biometrics Classes…

László Pótó

From the sample to the population…

Remember: biometrics is about drawing conclusions from the collected data (the sample) about the unknown population.

Typical questions:

- Does a given lab value (of a group of patients) differ from the "healthy" value? (What is the expected value for healthy people?)

- Is a measuring tool or process precise enough (pipette, drug content of pills, box of sugar, and so on)?

- Does a complete series of measurements prove that the values are over a certain limit (air or water pollution, …)?

The problem: how to draw conclusions from x̄ and sx about µ and σ.

x̄ and sx (and n, i.e. the measures of the sample) are known, but what about µ and σ? In other words: which population does the sample come from?

Two methods:

- estimation

- hypothesis testing

The estimation

Point estimation: x̄ → µ and sx → σ

We did this when we supposed, based on the body-height data of 50 students, that µ = 170 cm and σ = 8 cm for the population.

Interval estimation: x̄, sx → a, b (two values), so that we can say that µ is inside the (a, b) interval with a given probability.

This "given probability" is the confidence of the estimation. It is chosen close to 100%, such as 99%, 95% or 90%.

Some such intervals with a given confidence are already known.

For the body-height (bh) data (µ = 170 cm, σ = 8 cm), from last week's table:

µ ± 2σ: 154-186 cm (95%)    µ ± 3σ: 146-194 cm (99.7%)

For the means of 16-data samples (µ = 170 cm, σ/√n = 2 cm):

µ ± 2σ/√n: 166-174 cm (95%)    µ ± 3σ/√n: 164-176 cm (99.7%)
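As a small check of the arithmetic above, the following Python sketch recomputes these intervals. The function name `sigma_interval` is only an illustrative choice and does not come from the lecture.

```python
import math


def sigma_interval(mu, sigma, k, n=1):
    """Return (low, high) = mu ± k * sigma / sqrt(n).

    n = 1 gives the range for single data; n > 1 gives the range
    for the means of n-data samples.
    """
    half_width = k * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width


mu, sigma = 170.0, 8.0  # body-height values used in the lecture (cm)

for n in (1, 16):
    for k, confidence in ((2, "95%"), (3, "99.7%")):
        low, high = sigma_interval(mu, sigma, k, n)
        print(f"n = {n:2d}, ±{k}: {low:.0f}-{high:.0f} cm ({confidence})")
```

With n = 1 the loop reproduces the ranges for single data, and with n = 16 the narrower ranges for sample means.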

The question now is: how do we estimate the unknown µ and σ based on the known sample? We get there in 3 steps.

1st step: Distribution of the sample mean around µ

Example for the bh data: the means of 16-data samples are distributed around µ within µ ± z × σ/√n.

(Figure: distribution of the sample means, with the 68%, 95% and 99.7% ranges marked.)

Let's take all 100 different means and draw the ±2σ/√n (95%) interval around each of them!

How many of those 100 intervals contain µ?

(Figure: 95 cases contain µ, 5 cases do not.)

The centers (means) of some of those ±2σ/√n intervals lie inside the µ ± 2σ/√n range. These intervals contain µ, but some means lie outside, and their intervals do not. "Out of 100" means: the 95 intervals centered "inside" contain µ (confidence), but 5 do not (error risk).
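The "95 out of 100" statement can be tried out with a small simulation. The sketch below assumes normally distributed data with the lecture's µ = 170 cm and σ = 8 cm; the function name, the number of repeats and the seed are arbitrary illustrative choices.

```python
import math
import random


def coverage_known_sigma(mu=170.0, sigma=8.0, n=16, z=2.0,
                         repeats=10_000, seed=1):
    """Fraction of mean ± z*sigma/sqrt(n) intervals that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)  # same length for every interval
    hits = 0
    for _ in range(repeats):
        # Draw one n-data sample and take its mean.
        mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / repeats


coverage = coverage_known_sigma()
print(f"share of intervals containing mu: {coverage:.3f}")
```

The share should land close to 95% (slightly above, because the multiplier 2 is a rounded value of 1.96).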

2nd step: σ is unknown, so replace σ/√n with sx/√n

The S.E. (sx/√n) can be smaller or larger than σ/√n, so what happens to the confidence?

(Figure: the intervals over an axis from 164 to 176 cm; 95 cases, 5 cases.)

How many of the 95 will become shorter, and how many of those 5 longer? (Based on the binomial distribution…)

The confidence is decreasing!

It depends on the sample size!

(Figure: the t curves compared with the normal curve, axis from -3 to 3.)

- a little lower in the middle and wider at the tails than the normal curve

- a different curve for each n

3rd step: increase the length of the intervals!

Instead of the x̄ ± z*σ/√n intervals, use x̄ ± t*sx/√n!

t values (p = 95%)

n-1     t
2       4.30
5       2.57
8       2.31
10      2.23
15      2.13
20      2.09
50      2.01
1000    1.96
Z       1.96

In the case of 16 data, the mean ± 2.13*sx/√n interval contains the expected value of the population with 95% probability.

(Figure: the 100 intervals, numbered 1., 2., 3., … 95., 96., … 100.)

With this multiplier, 95 of the 100 intervals contain µ again, although the intervals now differ both in center (the means are different) and in length (the standard deviations are different).

x̄ ± t*sx/√n is the p% confidence interval for µ (at n = 16 and p = 95%: t = 2.13).
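The effect of steps 2 and 3 can be illustrated by simulation. The sketch below assumes normally distributed data (µ = 170 cm, σ = 8 cm, n = 16) and compares the share of intervals that contain µ when the sample S.D. is combined with the normal multiplier 1.96 and with the t multiplier 2.13. The function name, repeat count and seed are illustrative assumptions.

```python
import math
import random
import statistics


def coverage_estimated_sd(multiplier, mu=170.0, sigma=8.0, n=16,
                          repeats=10_000, seed=1):
    """Fraction of mean ± multiplier*sx/sqrt(n) intervals that contain mu."""
    rng = random.Random(seed)  # same seed: both multipliers see the same samples
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        # The interval length now changes from sample to sample.
        half_width = multiplier * statistics.stdev(sample) / math.sqrt(n)
        if abs(mean - mu) <= half_width:
            hits += 1
    return hits / repeats


z_coverage = coverage_estimated_sd(1.96)  # normal multiplier with sx
t_coverage = coverage_estimated_sd(2.13)  # t multiplier for df = 15
print(f"1.96 * S.E.: {z_coverage:.3f}   2.13 * S.E.: {t_coverage:.3f}")
```

With the normal multiplier the share should fall somewhat below 95%, and the longer t intervals should bring it back to about 95%.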

Calculation of the confidence interval - 1

1st example: based on the body-height data of a group of 16:

Let us suppose the mean is 175 cm and the S.D. is 10 cm.

Can the expected value of the population be 170 cm?

The 95% C.I.: 175 ± 2.13*10/√16 cm = 175 ± 2.13*2.5 cm = 175 ± 5.33 cm = (169.67, 180.33) cm

170 cm is inside, so it is a possible µ (with 95% confidence).

2nd example: the mean of 9 data is 3.9 mg/100 ml, the S.D. is 0.6 mg/100 ml.

Can the expected value (for this type of patient) be equal to 4.2 mg/100 ml, the mean of healthy people?

The 95% C.I.: 3.9 ± 2.3*0.6/√9 mg/100 ml = 3.9 ± 2.3*0.2 (…) = 3.9 ± 0.46 = (3.44, 4.36) mg/100 ml

4.2 is inside, so it is a possible µ (with 5% error risk).

Calculation of the confidence interval - 2

The drug content of pills at a pharmaceutical factory was checked by measuring a sample of 16 pills.

The measures are: n = 16, mean = 102.1 mg, S.D. = 4 mg (note: three numbers). Can the expected value be 100 mg?

The 95% conf. int. (in mg): 102.1 ± 2.13*4/√16 = 102.1 ± 2.13 = (99.97, 104.23) mg

100 mg is inside it, so 100 mg is a possible µ (with 95% confidence, or 5% error risk).
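The three worked examples can be recomputed with a short Python sketch. The t values are copied from the lecture's table; the names `T_95`, `confidence_interval_95` and `is_possible` are illustrative assumptions. The table lists 2.31 for n-1 = 8, while the second example rounds it to 2.3, so that interval differs slightly in the third decimal.

```python
import math

# Two-sided 95% t values by degrees of freedom (n - 1), from the lecture table.
T_95 = {2: 4.30, 5: 2.57, 8: 2.31, 10: 2.23, 15: 2.13, 20: 2.09,
        50: 2.01, 1000: 1.96}


def confidence_interval_95(mean, sd, n):
    """Return the 95% C.I. mean ± t * sd / sqrt(n) as (low, high)."""
    t = T_95[n - 1]  # raises KeyError for sample sizes missing from the table
    half_width = t * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def is_possible(mu0, interval):
    """True if the hypothetical expected value lies inside the interval."""
    low, high = interval
    return low <= mu0 <= high


examples = [
    ("body height (cm)", 175.0, 10.0, 16, 170.0),
    ("lab value (mg/100 ml)", 3.9, 0.6, 9, 4.2),
    ("drug content (mg)", 102.1, 4.0, 16, 100.0),
]
for name, mean, sd, n, mu0 in examples:
    ci = confidence_interval_95(mean, sd, n)
    print(f"{name}: ({ci[0]:.2f}, {ci[1]:.2f}), "
          f"{mu0} possible: {is_possible(mu0, ci)}")
```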

Interpretation: if we repeated the experiment 100 times, we would have 100 datasets with 100 different means and S.D.s. Calculating the 95% C.I. from each in the above way, about 95 of the 100 different C.I.s would contain the real expected value (µ) and only about 5 would not.

But please note that we cannot know which of those 100 our single C.I. is: one of the 95 (that "contain" µ) or one of the 5 (that do not)!

Let's see the second method for answering this kind of question: hypothesis testing!

The hypothesis testing – 1

An „everyday life” model

I remember hearing something like the noise of heavy rain during the night. How can I decide in the morning whether it was rain or just a dream?

1, Formulate the starting hypothesis (here: "no rain").

2, Decide what I mean by "probable" and by "not probable" (this is more or less obvious here).

3, Estimate how probable the observed fact would be under the hypothesis of point 1 (suppose it IS true for now).

4, Decide about the hypothesis ("no rain" in this case):

a, When the result of point 3 is "not probable", reject it.

b, When the result of point 3 is "probable", do not reject it.

5, Conclusion

Checking the method: try the opposite hypothesis at point 1.

The hypothesis testing – 2.

Hypothesis testing in biometrics

The "drug content of 16 pills" example.

Mean: 102.1 mg, S.D.: 4 mg. Can the expected value be 100 mg?

1, Suppose that µ = 100 mg is true! There is no significant difference, the difference is just due to chance. This is the "null" hypothesis: H0.

2, Let's choose 5% as the lower end of "probable". This "border for decision" is α, so let α = 0.05 now.

3, If µ = 100 mg, then how probable is it that the mean of 16 data differs from 100 mg by at least 2.1 mg?

- As we saw last week, the difference between the mean and µ is t*S.E. (here S.E. = 4 mg/√16 = 1 mg), where t follows the t distribution with df = n-1 (here 15).

- In our case t = 2.1/1 = 2.1 (times the S.E.). On the t15 curve, 2.13 would "cut off" a 5% area (probability), so the probability of a difference of "at least 2.1 times" the S.E. is > 5%. So p > 0.05 ("probable"). (Figure)

4, Decide about the hypothesis ("µ = 100 mg"):

Because p > α at point 3 ("probable"), do not reject it!

5, Conclusion: the mean is not significantly different from the hypothetical expected value. So µ can be 100 mg!
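Steps 3 and 4 of this example can be written out as a minimal Python sketch. It decides by comparing |t| with the table value 2.13 for df = 15 (the value that cuts off a 5% area), as the slide does; the names used are illustrative assumptions.

```python
import math

# Two-sided 5% limit of the t curve for df = 15, from the lecture's t table.
T_CRITICAL_95 = {15: 2.13}


def one_sample_t_statistic(mean, sd, n, mu0):
    """Distance of the sample mean from mu0, measured in S.E. units."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


n = 16
t = one_sample_t_statistic(mean=102.1, sd=4.0, n=n, mu0=100.0)

# |t| above the table value would mean p < 0.05, i.e. "not probable".
reject_h0 = abs(t) > T_CRITICAL_95[n - 1]
print(f"t = {t:.2f}, reject H0 at alpha = 0.05: {reject_h0}")
```

Here t stays below 2.13, so H0 (µ = 100 mg) is not rejected, in line with the conclusion above.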

What did we do here?

We checked how far the mean is from a hypothetical (H0) expected value, in S.E. units: "t" times.

When the difference "t" is big (= the area under the t curve outside ±t is small, meaning that a difference of at least this size has small probability if H0 is true), then our sample (the fact) is against our hypothesis (the null hypothesis).

See the everyday life model of hypothesis testing: reject the null hypothesis!

When the difference "t" is small (= the area under the t curve outside ±t is big, meaning that a difference of at least this size has large probability if H0 is true), then our sample (the fact) is not against our hypothesis (the null hypothesis).

See the everyday life model of hypothesis testing: do not reject the null hypothesis!

The probability (area) can be calculated from "t" (and n) using the probability density function: by computer as "p =" (exact), or from a table as "p <".

This is the one sample t test.

Another (special) case

The effect of diet + training was checked: did it lower the blood cholesterol? The lab data of the 12 patients (2 datasets, but in paired arrangement):

serial    1    2    3    4    5    6    7    8    9   10   11   12
before  201  231  221  260  228  237  326  235  240  267  284  201
after   200  236  216  233  224  216  296  195  207  247  210  209
diff     -1    5   -5  -27   -4  -21  -30  -40  -33  -20  -74    8

The "difference": x̄ = -20.17, sx = 23.13, S.E. = sx/√n = 23.13/√12 = 6.68

3, The differences (in pairs): mean -20.17, S.D. 23.13, so the given "difference" from 0 (the t value) is -20.17/6.68 = -3.02. The probability of "at least this difference" is 1.17% (figure), less than α.

4, Here p < α, so reject H0.

5, Conclusion: the diet + training was effective. The difference was significant.
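The numbers of this paired example can be recomputed from the table with Python's standard `statistics` module. The lecture's t table has no row for 11 degrees of freedom, so this sketch compares |t| with the value for 10 (2.23), which is slightly larger and therefore a conservative limit; that shortcut is an assumption of the sketch, not the lecture's method.

```python
import math
import statistics

before = [201, 231, 221, 260, 228, 237, 326, 235, 240, 267, 284, 201]
after = [200, 236, 216, 233, 224, 216, 296, 195, 207, 247, 210, 209]

# The paired differences are treated as the "one sample".
diffs = [a - b for b, a in zip(before, after)]

mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)        # sample S.D. (divides by n - 1)
se = sd_diff / math.sqrt(len(diffs))     # standard error of the mean
t = mean_diff / se                       # difference from 0 in S.E. units

T_95_DF10 = 2.23  # table value for df = 10; conservative for df = 11
significant = abs(t) > T_95_DF10

print(f"mean = {mean_diff:.2f}, S.D. = {sd_diff:.2f}, "
      f"S.E. = {se:.2f}, t = {t:.2f}, significant: {significant}")
```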

The one sample t test - summary

The variable (data) is continuous and normally distributed…

…and the question is about the mean (and the expected value). (Is there some difference or effect, or was it just due to chance?)

How probable is "at least this difference" (t times the S.E.) due to chance only, when H0 is true?

The probability is the area outside the (-t, t) interval under the t probability density function (by computer or from tables).

- If this probability (p) is less than a predefined limit (α), it means:

- the difference is big, the sample mean is "far" from the hypothetical expected value,

- and it is rather unlikely to find a difference "at least this big" due to chance only,

so reject the null hypothesis.

- If this probability (p) is not less than the predefined limit (α), there is no reason to reject the null hypothesis.

For paired data (paired t test): in this case the difference data are the "one sample" and µ = 0 is H0.
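The "area outside the (-t, t) interval" can also be computed directly. The sketch below integrates the standard t probability density function with Simpson's rule, using only Python's `math` module; the function names and the step count are illustrative assumptions, and a statistics package would normally do this job.

```python
import math


def t_pdf(x, df):
    """Probability density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Area under the t curve outside (-|t|, |t|), by Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t  # the curve is symmetric around 0


p_pills = two_sided_p(2.1, df=15)          # drug content example
p_cholesterol = two_sided_p(-3.02, df=11)  # paired cholesterol example
print(f"pills: p = {p_pills:.3f}, cholesterol: p = {p_cholesterol:.4f}")
```

For the pills the area should come out just above 5%, and for the cholesterol differences close to the 1.17% quoted in the paired example.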

An (already known) example

The effect of diet + training was checked: did it lower the blood cholesterol? The lab data of the 12 patients are (2 datasets, but in paired arrangement):

serial    1    2    3    4    5    6    7    8    9   10   11   12
before  201  231  221  260  228  237  326  235  240  267  284  201
after   200  236  216  233  224  216  296  195  207  247  210  209
diff     -1    5   -5  -27   -4  -21  -30  -40  -33  -20  -74    8

The "difference": in 10 cases out of 12 the treatment was effective (negative difference), while in 2 it was not (positive difference).

1, H0: the treatment was ineffective, so the number of positive differences follows B(12, 0.5).

2, α = 0.05 (border value).

3, How probable is "at least that difference" from the expected k = 6? This probability is 2*(p(k=0) + p(k=1) + p(k=2)) (figure) = 2*(0.02% + 0.29% + 1.61%) ≈ 2*1.93% = 3.86%. It is less than α = 5%.

4, Here p < α, so reject H0.

5, Conclusion: the diet + training was effective. The difference was significant.

This method is the sign test.

Please note: a normal distribution was not assumed! The method can be applied in exactly those cases where the data are not normally distributed (where the t test is not applicable); this test works well there.
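The binomial calculation of the sign test can be sketched in a few lines of Python. The function names are illustrative; dropping zero differences is a common convention that is assumed here (none occur in this dataset).

```python
import math


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n trials, B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def sign_test_two_sided(diffs):
    """Two-sided sign test p value under H0: + and - are equally likely."""
    nonzero = [d for d in diffs if d != 0]  # assumption: ties are dropped
    n = len(nonzero)
    positives = sum(d > 0 for d in nonzero)
    k = min(positives, n - positives)       # count of the rarer sign
    tail = sum(binom_pmf(i, n) for i in range(k + 1))
    return min(1.0, 2 * tail)


diffs = [-1, 5, -5, -27, -4, -21, -30, -40, -33, -20, -74, 8]
p = sign_test_two_sided(diffs)
print(f"p = {p:.4f}, reject H0 at alpha = 0.05: {p < 0.05}")
```

With 2 positive differences out of 12, the function sums p(k=0), p(k=1) and p(k=2) and doubles the result, as in step 3.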

Goals for the 7th week - What was it today?

The 3-step "road" to last week's method of statistical inference: the interval estimation (understand it)

The confidence interval for the expected value: x̄ ± t*sx/√n (calculate)

The hypothesis testing

an everyday life model for the method's 5 steps (was there rain?)

1. Formulate the starting hypothesis (H0)

2. What is the limit between "small" and "large" probability (α)?

3. What is the probability of observing "at least this difference" from µ (when H0 holds)? t is the size, p is the probability (significance).

4. Decide about H0: when p < α, reject it ("not probable"); when p ≥ α, do not reject it ("probable").

5. Conclusion (based on H0, what is the meaning of the decision?)

3 methods: the one sample (+ paired) t tests and the sign test

Coming next: compare the two methods, errors, …

From the textbooks :

Moore: pp. 340-364 and 411-434
