Total credits = 15: half for introduction to statistics and the other half for research
Overall module aim
Learning outcome
Sir Nicholas, who has died aged 71, was England's senior divorce court judge. He had a rare neurological
disease called fronto-temporal lobe dementia that had only recently been diagnosed.
Leading judge who hanged himself after dementia diagnosis left wife a note
saying she had 'a life to live', inquest hears :Telegraph Reporters
7 June 2017 • 11:39am
Marina Fagan, a 51-year-old mother of four, was discharged following a two-day stay at
Whipps Cross hospital, in Leytonstone, after investigations ruled out a brain haemorrhage.
She returned to A&E the same day as her headache persisted but was advised to get her
GP to refer her to an outpatient clinic. Her condition was finally diagnosed 11 days after she
was first admitted to hospital. She died six days later, on October 6, 2015.
So we need more research & more neurologists
Statistical thinking is involved in all these phases, along with substantive scientific knowledge.
Introduction to data analysis
      Percentiles    Smallest
 1%       24            24
 5%       24            32
10%       24            37        Obs            8
25%       34.5          39        Sum of Wgt.    8
Displaying Data
• Graphs
– Bar Charts
– Histograms
– Line Graphs
– ……
[Figure: Histogram of the 2012/13 ordinary hospital admissions with a neurological condition among England CCGs; y-axis: Frequency]
The box indicates the median and the two quartiles (1st quartile = 2269, median = 2895 and
3rd quartile = 4013). The vertical lines above and below the box indicate the range of values,
with outliers shown as separate points.
• Outliers are identified by assessing whether or not they fall within a set
of numerical boundaries called "inner fences" and "outer fences".
• A point that falls outside the data set's inner fences is classified as a
minor outlier, while one that falls outside the outer fences is classified
as a major outlier.
• Multiply the inter-quartile range (Q3 - Q1) by 1.5, then add this number to
Q3 and subtract it from Q1 to find the upper and lower boundaries of the inner fences.
• Multiply the inter-quartile range (Q3 - Q1) by 3 (instead of 1.5), then add
this number to Q3 and subtract it from Q1 to find the upper and lower
boundaries of the outer fences.
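The fence rules above can be sketched in a few lines of Python. This is only an illustration: it uses the quartiles quoted for the hospital admissions box plot (Q1 = 2269, Q3 = 4013), and the function name `classify` and the test values passed to it are hypothetical. The upper outer fence comes out as 9245, which matches the cut-off used in the hospital admissions example.

```python
# Quartiles quoted for the 2012/13 hospital admissions box plot.
q1, q3 = 2269, 4013
iqr = q3 - q1  # inter-quartile range

# Inner fences lie 1.5 x IQR beyond the quartiles; outer fences lie 3 x IQR beyond.
inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outer = (q1 - 3 * iqr, q3 + 3 * iqr)


def classify(value):
    """Label a value as a major outlier, a minor outlier, or not an outlier."""
    if value < outer[0] or value > outer[1]:
        return "major outlier"
    if value < inner[0] or value > inner[1]:
        return "minor outlier"
    return "not an outlier"


print("IQR:", iqr)
print("Inner fences:", inner)
print("Outer fences:", outer)
print(classify(7000))  # hypothetical value between the upper inner and outer fences
```

Because admissions cannot be negative, only the upper fences matter in practice for these data.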
Identifying outliers in your data-example hospital admissions
15
• count if Ordinary1213 > 9245
4
As the data are positively skewed, report the median and inter-quartile range.
Types of data
Quantitative
  Continuous                       Discrete
  Blood pressure                   Number of children (parity)
  Age                              Number of cigarettes per day
  Concentration of a pollutant     Counts of deaths in small areas
Categorical
  Ordinal (ordered categories)               Nominal (unordered categories)
  Grade of breast cancer                     Sex (male/female)
  Disease severity (mild/moderate/severe)    Exposed/unexposed
  Social class (I, II, III, IV, V)           Ethnicity (white/asian/black/other)
Comments
Transformations
Log transformation
• Log transform stretches scale at lower end and compresses it at upper end
[Figure: curve y = log(x) for x from 0 to 10, with log(1) = 0 and log(e) = 1 marked; accompanying panels labelled "Number of patients"]
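A small numerical check may help make the stretching and compressing concrete. The sketch below assumes the natural logarithm (as the marked point log(e) = 1 suggests) and compares two equal one-unit steps on the original scale, one at the lower end and one at the upper end; the variable names are illustrative only.

```python
import math

# Equal one-unit steps on the original scale, at the lower and upper ends.
low_step = math.log(2) - math.log(1)    # step from 1 to 2
high_step = math.log(10) - math.log(9)  # step from 9 to 10

print(round(low_step, 3))   # about 0.693
print(round(high_step, 3))  # about 0.105
```

The same one-unit gap occupies far more of the log scale near 1 than near 10, which is why a log transformation can pull in the long right tail of positively skewed data.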
Suppose you are running a study at UCLH aiming to lower low-density
lipoprotein (LDL) cholesterol levels in patients with cardiovascular
disease. Your study is a double-blind, placebo-controlled RCT.
Patients were randomly assigned to receive evolocumab (either 140 mg
every 2 weeks or 420 mg monthly) or matching placebo as
subcutaneous injections. Out of the first 20 patients:
Group: 11 patients received evolocumab and 9 patients received placebo.
Gender: 12 female and 8 male.
Statin use: High intensity – 12 patients
Medium intensity – 6 patients
Low intensity – 2 patients
Using patient IDs 1 to 20 and appropriate codes, display the above information in
a spreadsheet. Ignore between-variable information for now.
Data display in a spreadsheet - coding
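One possible way to lay out the coded data is sketched below in Python, writing the table as CSV text that any spreadsheet program can open. The numeric codes and column names are assumptions chosen for illustration, and the allocation of categories to individual patient IDs is arbitrary: it only reproduces the stated totals, since relationships between variables are being ignored.

```python
import csv
import io

# Hypothetical coding scheme; any consistent, documented scheme would do.
GROUP = {"placebo": 0, "evolocumab": 1}
GENDER = {"male": 0, "female": 1}
STATIN = {"low": 1, "medium": 2, "high": 3}

# Arbitrary allocation that only reproduces the totals given in the exercise.
groups = ["evolocumab"] * 11 + ["placebo"] * 9
genders = ["female"] * 12 + ["male"] * 8
statins = ["high"] * 12 + ["medium"] * 6 + ["low"] * 2

# One row per patient, with IDs running from 1 to 20.
rows = [
    {"id": i + 1, "group": GROUP[g], "gender": GENDER[s], "statin": STATIN[t]}
    for i, (g, s, t) in enumerate(zip(groups, genders, statins))
]

# Write the table as CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "group", "gender", "statin"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Keeping the code dictionaries next to the data serves as a record of what each number means.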
Suppose the patients are aged between 50 and 70, with a mean age of 60 years.
Can you now add an extra column for the patients' ages?
In your own study you might have different variables, but you need to present them in a similar
way!
Data display in a spreadsheet – type in extra column Age
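As an illustration only, the sketch below builds a made-up age column that satisfies the stated constraints (ages between 50 and 70 with a mean of 60). The individual ages are hypothetical, not study data; they are simply 60 minus and plus 1 to 10 years so that the mean is exactly 60.

```python
from statistics import mean

# Hypothetical ages: 50..59 and 61..70, symmetric around 60 by construction.
ages = [60 - d for d in range(10, 0, -1)] + [60 + d for d in range(1, 11)]

# Pair each age with a patient ID from 1 to 20, ready to add as a column.
age_column = [{"id": i, "age": a} for i, a in enumerate(ages, start=1)]

print(min(ages), max(ages), mean(ages))
for row in age_column[:3]:
    print(row)
```

Unlike the coded categorical columns, age is entered as the measured number itself, with no code needed.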
Check twice that your coding is correct, and make sure you
have not entered any wrong information or mistyped any number.
Check that the relevant research data match your findings
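Part of this checking can be automated. The sketch below flags duplicate IDs, codes outside the allowed set, and ages outside the expected range. The allowed codes, the age limits, the column names and the function name `find_problems` are all assumptions made for this example, and the sample rows contain deliberate mistakes.

```python
# Hypothetical allowed codes and age limits for the example study.
ALLOWED = {"group": {0, 1}, "gender": {0, 1}, "statin": {1, 2, 3}}
AGE_RANGE = (50, 70)


def find_problems(rows):
    """Return (patient id, column, value) for every entry that looks wrong."""
    problems = []
    seen_ids = set()
    for row in rows:
        # A repeated ID usually means a row was entered twice.
        if row["id"] in seen_ids:
            problems.append((row["id"], "id", row["id"]))
        seen_ids.add(row["id"])
        # Every coded column must use one of its defined codes.
        for column, codes in ALLOWED.items():
            if row[column] not in codes:
                problems.append((row["id"], column, row[column]))
        # Ages outside the expected range are likely typing slips.
        if not AGE_RANGE[0] <= row["age"] <= AGE_RANGE[1]:
            problems.append((row["id"], "age", row["age"]))
    return problems


sample = [
    {"id": 1, "group": 1, "gender": 1, "statin": 3, "age": 58},
    {"id": 2, "group": 0, "gender": 0, "statin": 4, "age": 62},   # undefined statin code
    {"id": 3, "group": 1, "gender": 1, "statin": 2, "age": 560},  # likely typing slip
]
print(find_problems(sample))
```

Automated checks like these catch impossible values only; a plausible but wrong entry still needs to be checked against the source records.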
Recap