METHODOLOGY

LECTURE 6

VARIABLES, HYPOTHESIS AND ERRORS

Imran Siddiqi

Dept of CS

CONTENTS

Variables and Concepts

Types of variables

Hypothesis and Types

Testing the Hypothesis

Errors and Types

VARIABLE

"An attribute that is observable, measurable, and has a dimension that can vary".

In other words, a variable is something that is observable, measurable, and varies from high to low.

Meaning

With

concept can not be.

Need

VARIABLES

Concepts in your study: how to measure them?

Identify the indicators

Example

Concept: Rich

Indicators: Income and Assets

Income (in $) is also a variable

Assets: convert each of these into dollars

Decide whether a given person is rich or not.

CONVERTING CONCEPTS TO VARIABLES

Concept                     Indicators           Variables                        Decision level
Rich                        1. Income            1. Income/year                   1. If > $100,000
                            2. Assets            2. Total value of home, cars     2. If > $250,000
High academic achievement   1. Marks exam        1. Percentage                    1. If > 80%
                            2. Marks practical   2. Percentage                    2. If > 80%
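The step from variables to a decision can be sketched in a few lines of Python. This is only an illustration: the cut-offs are the ones in the table for the concept "Rich", while the dictionary layout, the function name and the example person are assumptions. The slide does not say how the two decisions are combined, so the sketch reports them separately.

```python
# Decision levels copied from the table; the dictionary layout and the
# function name are illustrative assumptions, not part of the lecture.
DECISION_LEVELS = {
    "income_per_year": 100_000,  # indicator 1: income, if > $100,000
    "total_assets": 250_000,     # indicator 2: assets, if > $250,000
}


def apply_decision_levels(measurements, levels=DECISION_LEVELS):
    """Return, per variable, whether the measured value exceeds its decision level."""
    return {name: measurements[name] > cutoff for name, cutoff in levels.items()}


# Hypothetical person: home and car values are first converted to dollars
# and summed into a single asset figure.
person = {"income_per_year": 120_000, "total_assets": 180_000 + 20_000}
print(apply_decision_levels(person))
```

Keeping the per-indicator results separate leaves the final rule (both indicators, or either one) to the researcher.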

TYPES OF VARIABLES

The causal relationship

The design of the study

The unit of measurement

The phenomenon or situation

Independent variable: the variable that is believed to cause or influence the dependent variable

Variable

variable.

Extraneous variables

Variables

EXAMPLES

Does smoking cause lung cancer?

Does nursing care cause rapid recovery?

Does drug (a) cause improvement?

Cause: independent variable

Effect: dependent variable

EXAMPLES

Extraneous variables

An extraneous variable interferes with the relationship between the dependent and independent variables, thus it needs to be controlled.

E.g., "air pollution" is an extraneous variable that interferes with studying the relationship between smoking (the independent variable) and lung cancer (the dependent variable).

DESIGN

Active variables

Variables

create them.

These variables can be manipulated, changed or controlled.

Attribute Variables

A researcher simply observes and measures.

These variables cannot be manipulated, changed or controlled.

EXAMPLE

Three teaching models: A, B, C

The researcher may change the teaching model (active variable)

No control over the characteristics of the student population: age, gender or motivation to study (attribute variables)

VARIABLES MEASUREMENT VIEWPOINT

Categorical Variables (Qualitative)

Continuous Variables (Quantitative)

VARIABLES MEASUREMENT VIEWPOINT

Categorical Variables

Measured on nominal scales

Two types

Dichotomous variables, e.g. alive or dead, day or night

Polytomous variables, e.g. religion: Muslim, Christian, Jew

VARIABLES MEASUREMENT VIEWPOINT

Continuous Variables

Have continuity in measurement: they can take any value on the scale on which they are measured

E.g. age, income etc.

Hypothesis


HYPOTHESIS

Hypothesis

Brings

well

Hypothesis how to construct

Arise


HYPOTHESIS - EXAMPLES

Hunch

Hunch is true or false: known only after the race

Distribution of smokers

Hunch

female smokers

Test the hunch: ask them

Conclude: the hunch was right or wrong

HYPOTHESIS - EXAMPLES

Public health

A

from a specific sub-group of population

Finding every possible cause requires enormous time and resources

Narrow down: based on your study, identify the most probable cause, e.g. contaminated water

Perform a study: collect information to verify your hunch

Verification: hunch correct or not

HYPOTHESIS - EXAMPLES

In example 1

Waited

In example 2 & 3

Designed


HYPOTHESIS

The researcher does not know about a phenomenon, situation or condition

But does have a hunch, assumption or guess

Conclude through verification

The hunch may be:

Right

Wrong

Partially right

HYPOTHESIS - DEFINITIONS

validity of which is usually unknown

A proposition that is stated in a testable form and that predicts a particular relationship between two or more variables.

A hypothesis is written in such a way that it can be proven or disproven by valid and reliable data; it is in order to obtain these data that we perform our study.

HYPOTHESIS - CONSIDERATIONS

clear

No

difficult

Unidimensional: should test one relationship at a time

Must be familiar with the subject area (literature review) before suggesting the hypothesis

HYPOTHESIS - CONSIDERATIONS

The average age of male students in the class is higher than that of female students

Clear

Specific

Testable

HYPOTHESIS - CONSIDERATIONS

Suicide rates vary inversely with social cohesion

Clear

Specific

Testable? Difficult

HYPOTHESIS - CONSIDERATIONS

Data

Hypothesis cannot be tested?

May formulate a hypothesis for which methods of verification are not available

Expressed


TYPE OF HYPOTHESIS

Categories of hypothesis

Research hypothesis: what you want to test

Alternate hypothesis: considered as true in case the research hypothesis proves to be wrong.

WAYS OF FORMULATING HYPOTHESIS

There is no significant difference in the proportion of male and female smokers in the study population

A greater proportion of females than males are smokers in the study population

A total of 60% of females and 30% of males in the study population are smokers

There are twice as many female smokers as male smokers in the study population

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of No Difference

Stipulates that there is no difference between two situations, groups or outcomes

There is no significant difference in the proportion of male and female smokers in the study population
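A hypothesis of no difference between two proportions can be checked with a two-proportion z-test. The slides do not name a specific test, so the choice of test here is an assumption, and the group sizes (100 females, 100 males) are hypothetical; only the 60% and 30% smoking rates come from the lecture's own example. A minimal sketch:

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: both groups have the same proportion.

    x1, x2 are the counts of "successes" (smokers); n1, n2 are group sizes.
    Assumes the pooled proportion is strictly between 0 and 1.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical sample: 60 of 100 females and 30 of 100 males smoke.
z, p_value = two_proportion_z_test(60, 100, 30, 100)
print(f"z = {z:.2f}, p = {p_value:.5f}")
```

A small p-value would lead to rejecting the hypothesis of no difference; the test says nothing about the magnitude of the difference.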

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of Difference

Stipulates that there will be a difference but does not specify its magnitude

A greater proportion of females than males are smokers in the study population

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of Point-Prevalence

A

behaviour/situation

Able

A total of 60% of females and 30% of males in the study population are smokers

WAYS OF FORMULATING HYPOTHESIS

Hypothesis of Association

Expressed as a relationship

There are twice as many female smokers as male smokers in the study population

HYPOTHESIS TESTING

Hypothesis testing - H0

Null hypothesis (H0): e.g. "this person is healthy", "this accused is not guilty" or "this product is not broken".

Alternate hypothesis: e.g. "this person is not healthy", "this accused is guilty" or "this product is broken".

Errors

HYPOTHESIS TESTING

                 True state of nature
Your Decision    H0 is True          H0 is False
Reject H0        Type I error        Correct Decision
Accept H0        Correct Decision    Type II error

HYPOTHESIS TESTING

Telling the person that he is sick when in fact he was healthy: Type I error

Telling the person that he is sick when in fact he was sick: Correct

Telling the person that he is healthy when in fact he was sick: Type II error

Telling the person that he is healthy when in fact he was healthy: Correct

The probability of Type I errors is denoted by α and that of Type II errors by β
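The decision table can be written as a small function. This is a sketch only; the function and argument names are illustrative. In the health example it takes H0 to be "this person is healthy", so telling the person he is sick corresponds to rejecting H0.

```python
def decision_outcome(h0_is_true, reject_h0):
    """Classify a decision about H0 against the true state of nature."""
    if reject_h0:
        # Rejecting a true H0 is a Type I error.
        return "Type I error" if h0_is_true else "Correct decision"
    # Accepting a false H0 is a Type II error.
    return "Correct decision" if h0_is_true else "Type II error"


# H0 = "this person is healthy"
print(decision_outcome(h0_is_true=True, reject_h0=True))    # told sick, was healthy
print(decision_outcome(h0_is_true=False, reject_h0=False))  # told healthy, was sick
```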

HYPOTHESIS TESTING

H0 = Defendant is Innocent

                 True state
Your Decision    Innocent          Terrorist
Terrorist        False positive    True positive
Innocent         True negative     False negative
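Given a list of true states and a list of decisions, the four cells of this table can be counted as follows. The labels "Terrorist" and "Innocent" come from the slide; the function name and the four example cases are hypothetical.

```python
from collections import Counter


def confusion_counts(truth, decisions, positive="Terrorist"):
    """Count TP, FP, TN and FN, treating `positive` as the positive class."""
    counts = Counter()
    for actual, decided in zip(truth, decisions):
        if decided == positive:
            counts["TP" if actual == positive else "FP"] += 1
        else:
            counts["FN" if actual == positive else "TN"] += 1
    return counts


# Hypothetical cases, one of each kind.
truth = ["Innocent", "Terrorist", "Innocent", "Terrorist"]
decisions = ["Terrorist", "Terrorist", "Innocent", "Innocent"]
counts = confusion_counts(truth, decisions)
print(dict(counts))
```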

True Positives

False Negative

False Positives


PERFORMANCE MEASURES

Recall: $R = \frac{TP}{TP + FN}$

Precision: $P = \frac{TP}{TP + FP}$

F measure: $F = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
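The three measures translate directly into code. The sketch below follows the definitions above; the counts in the example are hypothetical, and the functions assume non-zero denominators.

```python
def recall(tp, fn):
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of reported positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


def f_measure(p, r):
    """Harmonic mean of precision and recall: 2 * P * R / (P + R)."""
    return 2 * p * r / (p + r)


# Hypothetical detector output: 8 true positives, 2 false positives, 2 false negatives.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=2)
print(p, r, f_measure(p, r))
```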

Recall $R = \frac{TP}{TP + FN}$: how many faces were detected out of the total?

Precision $P = \frac{TP}{TP + FP}$: did the system detect extra objects other than faces?

EXAMPLE - BIOMETRICS

Finger

Enrollment

Enroll

facial images or iris scans etc.

Validation

A person arrives

Take data (fingerprint, iris, face)

Compare with database

If matched with an individual: allow

Else: decline
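The validation steps can be sketched as below. Everything specific here is an assumption made for illustration: biometric data are represented as short feature vectors, the similarity function is a toy one based on Euclidean distance, and the threshold of 0.8 is arbitrary. A real system would use a matcher suited to the biometric in question.

```python
import math


def similarity(a, b):
    """Toy similarity in (0, 1]: equals 1 for identical feature vectors."""
    return 1.0 / (1.0 + math.dist(a, b))


def validate(sample, database, threshold=0.8):
    """Compare a sample with every enrolled template and allow or decline."""
    best_id, best_score = None, 0.0
    for person_id, template in database.items():
        score = similarity(sample, template)
        if score > best_score:
            best_id, best_score = person_id, score
    if best_score >= threshold:
        return "Allow", best_id
    return "Decline", None


# Hypothetical enrollment database of feature vectors.
database = {"alice": (0.1, 0.9, 0.3), "bob": (0.7, 0.2, 0.5)}
print(validate((0.12, 0.88, 0.31), database))  # close to alice's template
print(validate((0.9, 0.9, 0.9), database))     # not close to anyone
```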

EXAMPLE - BIOMETRICS

Enrollment


http://www.idteck.com/support/biometrics.asp

EXAMPLE

False rejection: an authorized person is rejected access

False acceptance: an unauthorized person is accepted as authorized

EXAMPLE - BIOMETRICS

Challenge

How

acceptance/rejection

Find the system response to a large number of inquiries from authorized as well as unauthorized users.

Record similarity scores of authorized and unauthorized cases

Plot respective histograms/distributions
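Once the similarity scores are recorded, the false acceptance rate (FAR) and false rejection rate (FRR) at a given threshold follow by counting. The sketch assumes that a user is accepted when the score is at or above the threshold; the score lists are hypothetical.

```python
def far_frr(authorized_scores, unauthorized_scores, threshold):
    """Return (FAR, FRR) when users scoring >= threshold are accepted."""
    false_accepts = sum(s >= threshold for s in unauthorized_scores)
    false_rejects = sum(s < threshold for s in authorized_scores)
    far = false_accepts / len(unauthorized_scores)
    frr = false_rejects / len(authorized_scores)
    return far, frr


# Hypothetical similarity scores from authorized and unauthorized attempts.
authorized = [0.9, 0.8, 0.75, 0.6, 0.95]
unauthorized = [0.2, 0.4, 0.65, 0.3, 0.5]

print(far_frr(authorized, unauthorized, threshold=0.7))
print(far_frr(authorized, unauthorized, threshold=0.55))
```

Comparing the two calls shows the trade-off: the stricter threshold removes the false acceptance at the cost of rejecting one authorized user.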

EXAMPLE - BIOMETRICS


Raising the threshold on the similarity score: FAR will decrease and FRR will increase

Lowering the threshold: FAR will increase and FRR will decrease

The choice depends upon your application: which errors are less serious?

PERFORMANCE

At different thresholds the system has different values of FAR and FRR

If someone asks you what the performance of your system is, how do you answer?

PERFORMANCE

Equal Error Rate - EER

Change the value of the threshold and plot FAR and FRR

The point where both are equal is the EER
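The threshold sweep can be sketched as follows. With a finite set of scores FAR and FRR may never be exactly equal, so this sketch takes the threshold where they are closest and reports their average; that tie-breaking rule, the threshold grid and the scores are all assumptions made for illustration.

```python
def far_frr(authorized_scores, unauthorized_scores, threshold):
    """Return (FAR, FRR) when users scoring >= threshold are accepted."""
    far = sum(s >= threshold for s in unauthorized_scores) / len(unauthorized_scores)
    frr = sum(s < threshold for s in authorized_scores) / len(authorized_scores)
    return far, frr


def equal_error_rate(authorized_scores, unauthorized_scores, thresholds):
    """Sweep the thresholds and return (threshold, EER) where FAR and FRR are closest."""
    best = None
    for t in thresholds:
        far, frr = far_frr(authorized_scores, unauthorized_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    _, threshold, eer = best
    return threshold, eer


# Hypothetical similarity scores and a grid of thresholds from 0.00 to 1.00.
authorized = [0.9, 0.8, 0.75, 0.6, 0.95]
unauthorized = [0.2, 0.4, 0.65, 0.3, 0.5]
thresholds = [i / 100 for i in range(101)]

threshold, eer = equal_error_rate(authorized, unauthorized, thresholds)
print(f"threshold = {threshold:.2f}, EER = {eer:.2f}")
```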

PERFORMANCE

Curve

High security: cannot afford false acceptances (FAR)

Balance

User comfort: fewer false rejections

