
RESEARCH METHODOLOGY
LECTURE 6
VARIABLES, HYPOTHESIS AND ERRORS

Imran Siddiqi
Dept of CS

Bahria University, Islamabad


imran.siddiqi@gmail.com

CONTENTS
Variables and Concepts
Types of variables
Hypothesis and Types
Testing the Hypothesis
Errors and Types

VARIABLE

"An attribute that is observable, measurable, and has a dimension that can vary."

For example, temperature is a variable that is observable, measurable, and varies from high to low.

CONCEPTS VS. VARIABLES

Concepts are mental images or perceptions
  Meaning varies from individual to individual
Variables are measurable
  With varying degrees of accuracy
A variable can be measured; a concept cannot.
Concepts in your study?
  Need to convert them to variables

CONCEPTS, INDICATORS & VARIABLES

Concepts in your study: how to measure them?
Identify the indicators
  A set of criteria reflective of the concept
Convert the indicators to variables

Example
Concept: Rich
Indicators: Income and assets
  Income (in $) is also a variable
  Assets: house, cars, investments
  Convert each of these into dollars
Based on total income and total value of assets, decide whether a given person is rich or not.

CONVERTING CONCEPTS TO VARIABLES

Concept          Indicators           Variables                   Decision level
Rich             1. Income            1. Income/year              1. If > $100,000
                 2. Assets            2. Total value of:          2. If > $250,000
                                         home, cars
High academic    1. Marks exam        1. Percentage               1. If > 80%
achievement      2. Marks practical   2. Percentage               2. If > 80%
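The decision levels in the table can be read as a small decision procedure. The following is a minimal sketch in Python, assuming that a person counts as rich only when both thresholds are exceeded (the slides do not say whether the two criteria combine with "and" or "or"). The function and parameter names are illustrative and do not come from the lecture.

```python
def is_rich(income_per_year, home_value, cars_value,
            income_threshold=100_000, assets_threshold=250_000):
    """Operationalise the concept 'rich' via the indicators income and assets.

    Assumption: both decision levels from the table must be exceeded.
    """
    total_assets = home_value + cars_value  # every asset expressed in dollars
    return income_per_year > income_threshold and total_assets > assets_threshold


def is_high_achiever(exam_pct, practical_pct, cutoff=80.0):
    """Operationalise 'high academic achievement' via two percentage marks.

    Assumption: both marks must be above the cutoff.
    """
    return exam_pct > cutoff and practical_pct > cutoff
```

For instance, `is_rich(120_000, 200_000, 60_000)` evaluates to `True` under these assumptions, while lowering the car value to 40,000 brings total assets below the threshold and gives `False`.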

TYPES OF VARIABLES

Classification can be based on:
  The causal relationship
  The design of study
  The unit of measurement

TYPES FROM VIEW OF CAUSATION

Change variables (independent variables)
  The cause responsible for bringing about a change in a phenomenon or situation
  Variable that is believed to cause or influence the dependent variable

Outcome variables (dependent variables)
  Variable that is influenced by the independent variable

Extraneous variables
  Variables affecting the cause-and-effect relationship

EXAMPLES

        Cause                           Effect
        (independent variable)          (dependent variable)
Does    smoking                 cause   lung cancer?
Does    nursing care            cause   rapid recovery?
Does    drug (a)                cause   improvement?

EXAMPLES

Extraneous variables
  Variables that confound the relationship between the dependent and independent variables, and therefore need to be controlled.
  E.g., "air pollution" is an extraneous variable that interferes with studying the relationship between smoking (the independent variable) and lung cancer (the dependent variable).

VARIABLES VIEWPOINT OF STUDY DESIGN

Active variables
  Variables that do not pre-exist, so the researcher has to create them.
  These variables can be manipulated, changed or controlled.

Attribute variables
  A pre-existing characteristic or attribute which the researcher simply observes and measures.
  These variables cannot be manipulated, changed or controlled.

EXAMPLE

Study designed to measure the effectiveness of three teaching models A, B, C
  The researcher may change the teaching model (active variable)
  No control over the characteristics of the student population: age, gender or motivation to study (attribute variables)

VARIABLES MEASUREMENT VIEWPOINT

Categorical Variables (Qualitative)
Continuous Variables (Quantitative)

Categorical Variables
  Measured on nominal scales
  Two types
    Dichotomous variables
      Vary in only two values
      E.g. alive or dead, day or night
    Polytomous variables
      More than two categories
      E.g. religion: Muslim, Christian, Jew

Continuous Variables
  Have continuity in measurement: can take any value on the scale on which they are measured
  E.g. age, income

HYPOTHESIS

Hypothesis
  Brings clarity, specificity and focus to a research problem
  It is possible to conduct a study without a hypothesis as well
Hypothesis: how to construct
  Arises from hunches or educated guesses

HYPOTHESIS - EXAMPLES

Betting on a horse race
  Hunch: horse #6 will win
  Whether the hunch is true or false is known only after the race

Distribution of smokers
  Hunch: there are more male smokers at your workplace than female smokers
  Test the hunch: ask them
  Conclude whether the hunch was right or wrong

HYPOTHESIS - EXAMPLES

Public health
  A disease is very common in people coming from a specific sub-group of the population
  Finding every possible cause would take enormous time and resources
  Narrow down based on your study: identify the most probable cause, e.g. contaminated water
  Perform a study: collect information to verify your hunch
  Verification: the hunch is correct or not

HYPOTHESIS - EXAMPLES

In example 1
  Waited for the event to take place
In examples 2 & 3
  Designed a study to test the validity of your hunch

HYPOTHESIS

The researcher does not know about a phenomenon, situation or condition
But does have a hunch, assumption or guess
Conclude through verification
The hunch may be
  Right
  Wrong
  Partially right

HYPOTHESIS - DEFINITIONS

A tentative statement about something, the validity of which is usually unknown.
A proposition that is stated in a testable form and that predicts a particular relationship between two or more variables.
A hypothesis is written in such a way that it can be proven or disproven by valid and reliable data; it is in order to obtain these data that we perform our study.

HYPOTHESIS - CONSIDERATIONS

A hypothesis should be simple, specific and clear
  No ambiguity: ambiguity in the hypothesis makes verification difficult
  Unidimensional: should test one relationship at a time
  Must be familiar with the subject area (literature review) before suggesting the hypothesis

HYPOTHESIS - CONSIDERATIONS

The average age of male students in the class is higher than that of female students
  Clear
  Specific
  Testable

HYPOTHESIS - CONSIDERATIONS

Suicide rates vary inversely with social cohesion
  Clear
  Specific
  Testable? Difficult: what is social cohesion, and how can it be measured?

HYPOTHESIS - CONSIDERATIONS

A hypothesis should be capable of verification
  Through data collection and analysis
  What if the hypothesis cannot be tested?
    You may formulate a hypothesis for which methods of verification are not available
    You may end up developing a technique

A hypothesis should be operationalisable
  Expressed in terms that can be measured

TYPES OF HYPOTHESIS

Categories of hypothesis
  Research hypothesis
    Your hypothesis, which you want to test
  Alternate hypothesis
    Specifies the relationship that will be considered true in case the research hypothesis proves to be wrong

WAYS OF FORMULATING HYPOTHESIS

There is no significant difference in the proportion of male and female smokers in the study population
A greater proportion of females than males are smokers in the study population
A total of 60% of females and 30% of males in the study population are smokers
There are twice as many female smokers as male smokers in the study population

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of No Difference
  When you formulate a hypothesis stipulating that there is no difference between two situations, groups or outcomes
  E.g. there is no significant difference in the proportion of male and female smokers in the study population
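One common way to check a hypothesis of no difference between two proportions is a two-proportion z-test. The slides do not name a specific test, so the sketch below is only an illustration under that assumption. The sample sizes (100 females, 100 males) are hypothetical, and the 0.05 significance level is a conventional choice rather than something stated in the lecture.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test of H0: the two population proportions are equal.

    Uses the normal approximation with a pooled proportion.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Everyone (or no one) is a smoker in both groups: no evidence of a difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Hypothetical sample: 60 of 100 females and 30 of 100 males are smokers.
z, p_value = two_proportion_z_test(60, 100, 30, 100)
reject_h0 = p_value < 0.05  # conventional significance level (assumption)
```

With these made-up counts the p-value falls well below 0.05, so the hypothesis of no difference would be rejected; with equal proportions in both groups the statistic is zero and the hypothesis is retained.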

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of Difference
  A hypothesis in which a researcher stipulates that there will be a difference but does not specify its magnitude
  E.g. a greater proportion of females than males are smokers in the study population

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of Point-Prevalence
  The researcher has enough knowledge about the behaviour/situation
  Able to express the hypothesis in quantitative units
  E.g. a total of 60% of females and 30% of males in the study population are smokers

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of Association
  Expressed as a relationship
  E.g. twice as many female smokers as male smokers

HYPOTHESIS TESTING

Hypothesis testing - H0
Null hypothesis
  Usually corresponds to a default "state of nature", for example "this person is healthy", "this accused is not guilty" or "this product is not broken".
Alternate hypothesis
  Negation of the null hypothesis, for example "this person is not healthy", "this accused is guilty" or "this product is broken".
Errors depend directly on the null hypothesis.


HYPOTHESIS TESTING

                   True state of nature
Your decision      H0 is true          H0 is false
Reject H0          Type I error        Correct decision
Accept H0          Correct decision    Type II error

HYPOTHESIS TESTING

H0 = This person is healthy

Telling the person that he is sick when in fact he was healthy: Type I error
Telling the person that he is sick when in fact he was sick: Correct
Telling the person that he is healthy when in fact he was sick: Type II error
Telling the person that he is healthy when in fact he was healthy: Correct

Traditionally, the probability of a type I error is denoted by α and that of a type II error by β.
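The four outcomes of the decision table can be written as a short function. This is a minimal sketch; the function name and the boolean encoding of "true state" and "decision" are illustrative choices, not part of the lecture.

```python
def classify_decision(h0_is_true, reject_h0):
    """Map (true state of nature, your decision) to the outcome in the table."""
    if h0_is_true and reject_h0:
        return "Type I error"    # rejected a true null hypothesis
    if not h0_is_true and not reject_h0:
        return "Type II error"   # accepted a false null hypothesis
    return "Correct decision"


# H0 = "this person is healthy"; rejecting H0 means telling the person he is sick.
outcomes = {
    "told sick, was healthy": classify_decision(h0_is_true=True, reject_h0=True),
    "told sick, was sick": classify_decision(h0_is_true=False, reject_h0=True),
    "told healthy, was sick": classify_decision(h0_is_true=False, reject_h0=False),
    "told healthy, was healthy": classify_decision(h0_is_true=True, reject_h0=False),
}
```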

HYPOTHESIS TESTING

H0 = Defendant is innocent

EXAMPLE: AIRPORT TRAVELERS

                   True state of nature
Your decision      Innocent          Terrorist
Terrorist          False positive    True positive
Innocent           True negative     False negative

EXAMPLE: FACE DETECTION

[Figure: face detection result with regions labelled as true positives, false negatives, false positives, and true negatives (the rest of the image)]

PERFORMANCE MEASURES

Recall: $R = \frac{TP}{TP + FN}$
Precision: $P = \frac{TP}{TP + FP}$
F-measure: $F = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$

EXAMPLE: FACE DETECTION

How many faces were detected out of the total?
  Recall = $\frac{TP}{TP + FN}$ = 3/4 = 75%
Did the system detect extra objects other than faces?
  Precision = $\frac{TP}{TP + FP}$ = 3/6 = 50%
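The face detection numbers can be reproduced with a few lines of Python. This is a minimal sketch; the counts TP = 3, FN = 1 and FP = 3 are inferred from the ratios 3/4 and 3/6 in the example, and the function names are illustrative.

```python
def recall(tp, fn):
    """Fraction of actual faces that were detected."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of detections that are actual faces."""
    return tp / (tp + fp)


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


tp, fn, fp = 3, 1, 3      # counts inferred from the example
r = recall(tp, fn)        # 3/4 = 0.75
p = precision(tp, fp)     # 3/6 = 0.5
f = f_measure(p, r)       # about 0.6
```

The F-measure of about 0.6 sits between the two values but closer to the lower one, which is the usual behaviour of a harmonic mean.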

EXAMPLE - BIOMETRICS

Biometric access control system
  Fingerprint, iris, face, hand geometry etc.

Enrollment
  Enroll all the authorized users: take their fingerprints, facial images or iris scans etc.

Validation
  A person arrives
  Take data (fingerprint, iris, face)
  Compare with the database
  If matched with an individual: allow
  Else: decline
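The enrollment and validation steps can be sketched as follows. Everything specific here is an assumption made for illustration: biometric samples are represented as short feature vectors, similarity is measured with cosine similarity, and the threshold of 0.95 is arbitrary. A real system would use a dedicated feature extractor and matcher.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def enroll(database, user_id, template):
    """Enrollment: store the template of an authorized user."""
    database[user_id] = template


def validate(database, sample, threshold=0.95):
    """Validation: compare the sample with every enrolled template."""
    best_id, best_score = None, -1.0
    for user_id, template in database.items():
        score = cosine_similarity(sample, template)
        if score > best_score:
            best_id, best_score = user_id, score
    if best_score >= threshold:
        return "allow", best_id
    return "decline", None


# Hypothetical users and feature vectors.
db = {}
enroll(db, "alice", [0.9, 0.1, 0.4])
enroll(db, "bob", [0.2, 0.8, 0.5])

genuine_attempt = validate(db, [0.88, 0.12, 0.41])   # close to alice's template
impostor_attempt = validate(db, [0.10, 0.20, 0.97])  # close to no one
```

The choice of `threshold` decides how strict the system is, which is exactly where the two kinds of error discussed next come from.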

EXAMPLE - BIOMETRICS

What kinds of errors can the system make?

Source: http://www.idteck.com/support/biometrics.asp

EXAMPLE

The False Rejection Rate (FRR) is the frequency with which an authorized person is denied access

The False Acceptance Rate (FAR) is the frequency with which a non-authorized person is accepted as authorized

EXAMPLE - BIOMETRICS

Challenge
  How to find a similarity threshold value for acceptance/rejection
  Find the system response to a large number of inquiries from authorized as well as unauthorized users
  Record the similarity scores of authorized and unauthorized cases
  Plot the respective histograms/distributions


Move the decision boundary (threshold) to the right
  FAR will decrease and FRR will increase

Move the decision boundary (threshold) to the left
  FAR will increase and FRR will decrease

Which boundary to choose?
  Depends upon your application: which errors are less serious?
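The effect of moving the threshold can be checked numerically. The sketch below assumes that a higher similarity score means a better match and that a sample is accepted when its score is at or above the threshold. The score lists are made up for illustration and the function name is not from the lecture.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a given similarity threshold.

    FAR: fraction of unauthorized attempts that are accepted.
    FRR: fraction of authorized attempts that are rejected.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Hypothetical similarity scores recorded from authorized and unauthorized users.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95, 0.72, 0.88, 0.59]
impostor = [0.12, 0.35, 0.48, 0.61, 0.27, 0.70, 0.55, 0.40]

low_threshold = far_frr(genuine, impostor, 0.50)   # lenient system
high_threshold = far_frr(genuine, impostor, 0.75)  # strict system
```

On this toy data the lenient threshold accepts three of eight impostors and rejects no authorized user, while the strict threshold does the opposite, in line with the trade-off described above.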

HOW TO QUANTIFY SYSTEM PERFORMANCE

At different thresholds the system has different values of FAR and FRR
If someone asks you what the performance of your system is, how do you answer?

HOW TO QUANTIFY SYSTEM PERFORMANCE

Equal Error Rate - EER
  Change the value of the threshold and plot FAR and FRR
  The point where both are equal is the EER
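A threshold sweep for the EER can be sketched as below. With a finite set of scores FAR and FRR may never be exactly equal, so this sketch picks the threshold where the gap between them is smallest and reports their average there. That is one common convention, not necessarily the one used in the lecture. The scores are hypothetical and the acceptance rule (score at or above threshold) is an assumption.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when samples scoring >= threshold are accepted."""
    far = sum(1 for s in impostor_scores if s >= threshold) / len(impostor_scores)
    frr = sum(1 for s in genuine_scores if s < threshold) / len(genuine_scores)
    return far, frr


def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep candidate thresholds and return (threshold, EER estimate)."""
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in candidates:
        far, frr = far_frr(genuine_scores, impostor_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    _, threshold, eer = best
    return threshold, eer


# Hypothetical similarity scores.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95, 0.72, 0.88, 0.59]
impostor = [0.12, 0.35, 0.48, 0.61, 0.27, 0.70, 0.55, 0.40]

threshold, eer = equal_error_rate(genuine, impostor)
```

A single number such as the EER makes two systems comparable, although the operating threshold actually deployed still depends on the application.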

HOW TO QUANTIFY SYSTEM PERFORMANCE

The Receiver Operating Characteristic (ROC) curve

[Figure: ROC curve with three operating regions marked]
  High security: cannot afford false acceptances
  Balance
  User comfort: fewer false rejections

REFERENCES

The material in these slides is based on the following resources.

Research Methodology, Ranjit Kumar, Chapter 6
http://en.wikipedia.org/wiki/Type_I_and_type_II_errors
http://www.intuitor.com/statistics/T1T2Errors.html
http://www.fingerprint-it.com
http://fingerchip.pagesperso-orange.fr