Types of Data

 Quantitative

 Qualitative
Quantitative Data Analysis

Key components of a Data Analysis Plan

 Research objectives

 Research questions

 Level of analysis / technique

 How data will be presented

 Tabular

 Graphical / Pictures

Statistical Applications
 Two main groups of statistical applications are:
1) Descriptive statistics
 Summarize data

 Summarize the relationship between two or more variables.

2) Inferential statistics
 Generalize from a sample to a population.
 Population includes all cases in which the researcher is interested
 Samples include carefully chosen subsets of the population
 Test hypotheses.

 Descriptive statistics simply describe what is, or what the data show. Inferential statistics try to reach conclusions that extend beyond the immediate data alone, for instance, inferring from the sample data what the population might think.

Descriptive Statistics
 Univariate descriptive statistics include:
 Percentages, averages, and various charts and graphs.
Example: On average, participants are 30.3 years of age.
 Bivariate descriptive statistics describe the strength and direction of
the relationship between two variables.
Example: Older students have higher GPAs.
 Multivariate descriptive statistics describe the relationships
between three or more variables.
 Example: Grades increase with age for females but not for males.

 Basic descriptive statistics: percentages, ratios and rates, tables, central tendency (mean, median, mode), measures of dispersion (M.D., S.D., variance), correlation, simple graphics, etc.

Levels of data analysis

 Univariate analysis

 Bi-variate analysis

 Multivariate analysis

Univariate analysis
 Involves examination across cases of one variable at a time.
Some techniques used are: percentage, central tendency,
dispersion, etc.

Table-1: Distribution of GB Members by Amount of Land Owned

Amount of Land (Dec.)   No. of members   Percent   Cum. Percent
0-10                    938              44.5      44.5
11-50                   1112             52.8      97.2
50+                     58               2.8       100.0
Total                   2108             100.0     -

Descriptive Statistics: Minimum = 0; Maximum = 99; Mean = 20.37; Std. Deviation = 17.233; Variance = 296.974

Bivariate Analysis
 It involves the examination across cases of two variables at a time to see how they are related. Some techniques used are: percentage, central tendency, dispersion, correlation, regression, inferential statistics, etc.

Table-2: Number of Family Members by Status of Membership in GB

Membership    Number of family members                                               Total
Status        Up to 2       3             4             5             6+
              No.    %      No.    %      No.    %      No.    %      No.    %      No.    %
Continuing    154   62.6    385   70.1    400   70.9    334   76.8    242   77.1    1515   71.9
Dropout        92   37.4    164   29.9    164   29.1    101   23.2     72   22.9     593   28.1
Total         246   11.7    549   26.0    564   26.8    435   20.6    314   14.9    2108  100.0

Mean = 4.07; Standard Deviation = 1.38
Chi-square = 20.92; DF = 4; Significance = 0.00
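
The chi-square value reported under Table-2 can be reproduced by hand from the observed counts. The following is a minimal Python sketch, not SPSS's own procedure: the function name `chi_square` is hypothetical, and the critical value 9.488 (df = 4, 5% level) is taken from a standard chi-square table.

```python
# Observed counts from Table-2 (rows: Continuing, Dropout; columns: family size).
observed = [
    [154, 385, 400, 334, 242],
    [92, 164, 164, 101, 72],
]

def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under the null hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

chi2, df = chi_square(observed)
CRITICAL_5PCT_DF4 = 9.488  # assumption: value from a standard chi-square table
reject_null = chi2 > CRITICAL_5PCT_DF4
print(round(chi2, 2), df, reject_null)
```

Because the calculated value is well above the tabulated value, the null hypothesis of no relation between family size and membership status is rejected, which agrees with the significance shown in the table.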
Multivariate analysis
 It involves the examination across cases of more than two variables at a
time. Here we can control one or more variables to single out influence of
extraneous variable(s) on dependent variable. Both descriptive and
inferential statistical techniques are used.
Table-3: Members by No. of Loans Received, Training Received and Default Status

                                    Default Status
No. of Loans   Training          Not Defaulter   Defaulter   Total    Sig. (Chi-square)
1              No       No.      79              9           88
                        %        89.8            10.2        100.0
               Yes      No.      11              5           16
                        %        68.8            31.3        100.0
               Total    No.      90              14          104      0.02
                        %        86.5            13.5        100.0
2-3            No       No.      76              47          123
                        %        61.8            38.2        100.0
               Yes      No.      24              35          59
                        %        40.7            59.3        100.0
               Total    No.      100             82          182      0.01
                        %        54.9            45.1        100.0
4-7            No       No.      8               7           15
                        %        53.3            46.7        100.0
               Yes      No.      9               28          37
                        %        24.3            75.7        100.0
               Total    No.      17              35          52       0.04
                        %        32.7            67.3        100.0

Introduction to SPSS

About SPSS
 SPSS stands for
Statistical Package for the Social Sciences
or
Statistical Product and Service Solutions

Data Entry into SPSS
 Click the SPSS icon to open SPSS

 The SPSS Data Editor will open

 The SPSS Data Editor has two views:

 Data View

 Variable View

SPSS Data Editor
 Data View
 Rows are cases
 Columns are variables

SPSS Data Editor
 Variable view

Data Entry into SPSS
 Use the ‘Variable View’ window to define the variables before data entry

Data Entry into SPSS – Variable Name
 First column is ‘Name’ - Name of variable
 It must be unique
 It must start with a letter, for example ‘m1age’
 Certain characters cannot be used. For example, we can use ‘_’ but not ‘-’; spaces are not accepted
 The name will be used to identify the variable later in analysis and other manipulations
 Variable names should be short but easily recognizable (current versions allow up to 64 characters)

Variable Name- Example

 We can use ‘cat_dog’ but not ‘cat-dog’ or ‘cat dog’.
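
The naming rules above can be checked mechanically. The following is a minimal Python sketch that encodes only the rules stated on these slides (unique, starts with a letter, underscore allowed, no hyphen or space). The function name `is_valid_name` is hypothetical, uniqueness is treated here as case-insensitive by assumption, and SPSS applies further rules (such as reserved keywords) that are not modelled.

```python
import re

# Simplified pattern: a letter first, then letters, digits or underscores.
# Assumption: this approximates the slide's rules, not SPSS's full rule set.
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

def is_valid_name(name, existing_names=()):
    """Return True if `name` follows the simplified rules and is not already used."""
    if name.lower() in (n.lower() for n in existing_names):
        return False  # names must be unique
    return bool(NAME_PATTERN.match(name))

for candidate in ["cat_dog", "cat-dog", "cat dog", "m1age"]:
    print(candidate, is_valid_name(candidate))
```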

Data Entry into SPSS – Variable Type
 2nd column is ‘Type’ - Type of variable

 Several types are available

 ‘Numeric’ is the most commonly used

 ‘String’ may be used for full names, addresses, company names, etc.; string variables cannot be used in numeric operations.

Type of variable - Example

Clicking on this box will bring up the variable type menu

Variable Type- Example

Data Entry into SPSS - Width
 3rd column is ‘Width’ – the maximum number of characters (digits) in a value
 Number of characters SPSS will allow for the variable (number of digits in coded data)
 Example - 1: If there are ‘12’ categories of a response, the width will be ‘2’, as the largest code has two digits; with a width of ‘2’ we can enter codes up to 99
 Example - 2: If income is to be entered as it is (interval scale) and the highest income would be ‘9900000’, the width will be ‘7’, as there are seven digits in the maximum value

Data entry into SPSS – Example of Width

 We can change a width by clicking in the width cell for the desired variable and typing a new number or using the arrow keys at the edge of the cell

Data Entry into SPSS - Decimals
 4th column is ‘Decimals’ – How many decimal places are there in the values?
 The number of decimal places we want to keep. Usually applicable for numeric variables.
 Usually, for entry of codes against categories, no decimal places are needed
 We can change decimals by clicking in the decimals cell for the desired variable and typing a new number or using the arrow keys at the edge of the cell
Data Entry into SPSS - Label
 5th column is ‘Label’ – A description that identifies what a variable represents.
 It is free text and can be up to 255 characters long
 Will appear as the table heading
 Should be detailed enough yet as short as possible
 Should be worded as we would like to present the variable in our report.

Data Entry into SPSS – Example of Label

 To change or edit a variable label, simply click anywhere within the cell.

Data Entry into SPSS - Values
 6th column is ‘Values’ – Numerical value (code) of each category (if the variable has categories)
 Not needed for interval scale variables where quantitative values are entered directly, but needed for categorical variables
 Enter the code in ‘Value’ for a specific category, write the name of that category in ‘Label’, and click ‘Add’; click OK when all categories have been entered.
 Repeat the same process for every category
 We can add a new category at any time
Data Entry into SPSS - Example of ‘value’ and ‘value labels’
 Example: Define Value and Labels
 How is the law and order situation of the country?
 Excellent
 Good
 Bad
 Very bad
 We can code these answers as follows:
Code   Label
1      Excellent
2      Good
3      Bad
4      Very bad
Data Entry into SPSS – Example of values

Clicking here opens up the ‘value labels’ dialogue box

Data Entry into SPSS – Example of value labels

 Click in the Value field to type a specific numeric value
 Click in the Label field to type the corresponding label
 Click on the Add button to add this pair of value and label to the list
 Repeat the same for all categories
 Any value, label, or both can be changed or deleted by using the ‘Change’ and ‘Remove’ buttons and then clicking on ‘OK’
Data Entry into SPSS – Missing Values
 7th column is ‘Missing’ – Here we can define data values to be treated as ‘missing values’ later on.
 There are two types of missing values in SPSS: system-missing and user-defined.
 System-missing data is assigned by SPSS when nothing is entered or a function cannot be performed, and is shown as a single period.
 If there are no ‘missing values’, select ‘No missing values’
 If there are ‘missing values’, then define them.
 For example, we can define ‘9’ or ‘99’ or ‘999’ as missing values, as in the next slide
Data Entry into SPSS – Example of Missing Values

 All these values will be treated by SPSS as missing (i.e., these values will be ignored)
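
What "ignored" means can be illustrated outside SPSS. The following is a minimal Python sketch under stated assumptions: `None` stands in for a system-missing entry, the codes 9, 99 and 999 are the user-defined missing values from the example above, the helper `valid_values` is hypothetical, and the ages are made up for illustration.

```python
from statistics import mean

USER_MISSING = {9, 99, 999}  # codes declared as missing, as in the slide's example

def valid_values(values, missing_codes=USER_MISSING):
    """Drop system-missing entries (None) and user-defined missing codes."""
    return [v for v in values if v is not None and v not in missing_codes]

ages = [25, 31, 99, None, 40, 999]  # made-up responses for illustration
clean = valid_values(ages)
print(clean, mean(clean))  # the mean is computed on valid cases only
```

Without this step the codes 99 and 999 would be averaged in as if they were real ages, which is why missing codes should never coincide with a plausible real value.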
Data Entry into SPSS - Columns
 8th column is ‘Columns’ – How wide the column displayed for each variable should be.
 It is different from ‘width’, which indicates how many digits of the number will be displayed.
 The column size indicates how much space is allocated rather than the degree to which it is filled.
 Keep it as short as possible, so that more variables can be seen on the screen at a time

Data Entry into SPSS - Align

 9th column is ‘Align’ – This is where to place the entry within the column, i.e., left-justified, right-justified, or centered
 If numerical, select ‘right-justified’

Data Entry into SPSS - Measure
 10th column is ‘Measure’ – Level of
measurement in which the variable is measured.
 It can be changed anytime even during analysis

Using SPSS for Results

Output in SPSS

 All SPSS outputs are shown in a separate output window

 More than one data set can be open at a time

 Output(s) can / shall be saved separately

 Will be shown practically

SPSS Menu: ‘File’

 Mostly like the File menu of other software

 Use ‘Display Data File Information’ for the names and categories of variables in the file

 Will be shown practically

SPSS Menu: ‘Edit’
 Works like the Edit menu of other software
 Use ‘Insert Variable’ for insertion of new
variable
 Use ‘Insert Cases’ for insertion of new cases
 Both can also be done by clicking on screen
 Use ‘Options’ for choosing different items
(Display, Output Labels, Tables, etc.) as required

 Will be shown practically

SPSS Menu: ‘View’

 Use View menu like other programmes


 Will be shown practically

SPSS Menu: ‘Data’
 Use ‘Sort Cases’ to arrange cases by ascending or descending
order according to the values of a particular variable
 Use ‘Sort Variables’ to arrange variables by ascending or
descending order according to a particular property of a
variable
 Use ‘Transpose’ to convert variables into cases or cases into
variables
 Use ‘Merge Files (add cases or add variables)’ to add cases or
variables of other file(s) into one file
 Use ‘Split File’ to have outputs separately by the values of a particular variable
 Use ‘Weight Cases’ to weight particular cases

 Will be shown practically


SPSS Menu: ‘Transform’
 Use ‘Compute (+, -, x, /, etc.)’ to calculate as
necessary.
 Any formula can be used
 Use ‘Count’ to count a particular value across the
variables
 Use ‘Recode into same variable’ to recode the
values of a variable in the same variable
 Use ‘Recode into different variable’ to recode the
values of a variable as a different variable
 Will be shown practically
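
The logic behind ‘Recode into different variable’, ‘Compute’ and ‘Count’ can be sketched in a few lines of Python. This is an illustration only, not SPSS syntax: the function `recode_land` is hypothetical, the cut-points follow the land categories of Table-1, the land values are made up, and the conversion of 100 decimals to one acre is an assumption about the unit ‘Dec.’.

```python
def recode_land(decimals):
    """Recode an amount of land (in decimals) into the three categories of Table-1."""
    if decimals <= 10:
        return 1  # 0-10
    if decimals <= 50:
        return 2  # 11-50
    return 3      # 50+

land = [0, 8, 15, 50, 75]                   # made-up values for illustration

# 'Recode into different variable': the original values are kept, a new variable is added.
land_cat = [recode_land(x) for x in land]

# 'Compute': any formula applied case by case (assumes 100 decimals = 1 acre).
land_acres = [x / 100 for x in land]

# 'Count': how many times a particular value occurs.
n_middle = sum(1 for c in land_cat if c == 2)

print(land_cat, land_acres, n_middle)
```

Recoding into a different variable is usually the safer choice, since the original measurements remain available if the categories need to change later.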

SPSS Menu: ‘Analyze’
 Use ‘Descriptive Statistics’ > ‘Frequencies’ for frequency distribution
 Frequency: number of times a particular value occurs in a data series
 A frequency table arranges data by values and their corresponding frequencies
 We can easily produce a frequency table using data in SPSS
 SPSS frequency output has five columns
1) Value / Category
2) Frequency
3) Percent
4) Valid percent
5) Cumulative percent

 Will be shown practically

SPSS Menu: ‘Analyze’ - Continued
 Use ‘Descriptive Statistics’ > ‘Descriptives’ for some specific statistics
 Use ‘Descriptive Statistics’ > ‘Crosstabs’ for bi- and multivariate tables
 Select desired statistics and statistical tests

 Use ‘Custom Tables’ for a table of our own preference

 Will be shown practically


SPSS Menu: ‘Analyze’ - Continued
 Use ‘Compare Means’ for comparing a particular mean, along with some other statistics, among different groups of respondents
 Some other functions can also be calculated
 For correlation, use ‘Correlate’ and select the desired type of correlation
 For regression, use ‘Regression’ and select the desired type of regression
 For non-parametric tests, use ‘Nonparametric Tests’ and select the desired test
 Use ‘Multiple Response’ to analyze multiple response variables

 Will be shown practically

SPSS Menu: ‘Analyze’ - Continued

 Use ‘Multiple Response’ to analyze multiple response variables
 First, Define Multiple Response Sets:
 Transfer the variables in the set from the variable list to the ‘Variables in Set’ box
 Select the appropriate option between ‘Dichotomies’ and ‘Categories’
• In case of Categories, mention the range
 Write the ‘name’ of the variable derived from the multiple variables
 Write the ‘label’ of the variable derived from the multiple variables
 Click ‘Add’ then ‘Close’
 Second, use the variable created for ‘Frequencies’ or ‘Crosstabs’

 Will be shown practically


Frequency Distribution / Table
 Report the number of times each score of a
variable occurred.
 The categories of the frequency distribution must
be:
 Mutually exclusive: No chance of a case to be
included in two categories.
 Exhaustive: All cases in the data set will be included
in the categories.
 Frequency: Number of times a particular value occurs in a data
series
 Frequency table: Arranging data by values and their
corresponding frequencies
Calculating Frequency using SPSS
 We can easily produce a frequency table using data in SPSS
 Select: Analyze > Descriptive Statistics > Frequencies
 SPSS frequency output has five columns
1) Value / Category
2) Frequency
3) Percent
4) Valid percent
5) Cumulative percent
 Practical exercise
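
How the five columns relate to each other can be seen by building the table by hand. The following is a minimal Python sketch, not SPSS output: the function `frequency_table` is hypothetical, the answers are made up and coded 1 = Excellent to 4 = Very bad, and 9 is assumed to be the user-defined missing code. Percent is based on all cases, while valid percent and cumulative percent are based on non-missing cases only.

```python
from collections import Counter

def frequency_table(values, missing_codes=(9,)):
    """Return rows of (value, frequency, percent, valid percent, cumulative percent)."""
    counts = Counter(values)
    n_total = len(values)
    n_valid = sum(c for v, c in counts.items() if v not in missing_codes)
    rows, cumulative = [], 0.0
    for value in sorted(v for v in counts if v not in missing_codes):
        freq = counts[value]
        valid_pct = 100 * freq / n_valid
        cumulative += valid_pct
        rows.append((value, freq, round(100 * freq / n_total, 1),
                     round(valid_pct, 1), round(cumulative, 1)))
    return rows

# Made-up answers: 1 = Excellent, 2 = Good, 3 = Bad, 4 = Very bad, 9 = missing.
answers = [1, 2, 2, 3, 2, 4, 9, 1, 2, 3]
rows = frequency_table(answers)
for row in rows:
    print(row)
```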

Frequency Table for Multiple Variables
 Determine the frequency of a combination of
variables.
 Select: Analyze > Descriptive Statistics > Crosstabs
Rows: Preferably independent variable
Columns: Preferably dependent variable
Layer: Preferably control variable
 Send the desired variables from the variable list to rows, columns or layer
 Select desired statistics by clicking ‘Statistics’
 Select desired counts and percentages by clicking ‘Cells’
 Select: Analyze > Descriptive Statistics > Custom Tables

Percentage and Proportion

Use of Percentage and Proportion
 Calculated to standardize the frequencies so that
comparison among values / categories and between
data sets becomes valid
 Report relative size in the total data set
 Compare the number of cases in a specific category to
the number of cases in all categories.
 Compare a part (specific category) to a whole (all
categories).
 The part is the numerator (f ).
 The whole is the denominator (N).

Ratios
 Compare the relative sizes of categories.
 Compare parts to parts.
 Ratio = f1 / f2
 f1 = number of cases in first category
 f2 = number of cases in second category
 Example: In case of 23 females and 19 males:
 the ratio of males to females is:
19/23 = 0.83, that is, for every female there are 0.83 males.
 the ratio of females to males is:
23/19 = 1.21, that is, for every male, there are 1.21 females.

Rate
 Expresses the number of actual occurrences of an event
(births, deaths, homicides) vs. the number of possible
occurrences per some unit of time.
 Birth rate is the number of births divided by the
population size times 1000 per year.
 If a village of 2300 had 17 births last year, the birth rate is:
 (17/2300) * 1000 = (.00739) * 1000 = 7.39

That is, the village had 7.39 births for every 1000 residents.
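
The proportion, percentage, ratio and rate calculations above are all simple divisions. The following is a minimal Python sketch that reuses the numbers from the slides (23 females and 19 males; 17 births in a village of 2300); the function names are hypothetical.

```python
def proportion(f, n):
    """Part (f) divided by whole (N)."""
    return f / n

def percentage(f, n):
    """Proportion expressed per 100."""
    return 100 * f / n

def ratio(f1, f2):
    """Part compared with part."""
    return f1 / f2

def rate(events, population, per=1000):
    """Actual occurrences per `per` possible occurrences."""
    return events / population * per

females, males = 23, 19
total = females + males

print(round(percentage(females, total), 1))  # share of females in the group
print(round(ratio(males, females), 2))       # males per female
print(round(ratio(females, males), 2))       # females per male
print(round(rate(17, 2300), 2))              # births per 1000 residents
```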

Measures of Central Tendency
 Central tendency is the tendency of the individual values in a data set to cluster around a central point of the data set.
 It summarizes the data allowing:
 Description of data briefly and conveniently
 Comparison among groups
 Base for advanced statistical analysis

Calculate Measures of Central Tendency in SPSS
 Select: Analyze > Descriptive Statistics > Frequencies > Statistics > Desired type in Central tendency
or
 Select: Analyze > Descriptive Statistics > Descriptives > Options > Mean
or
 Select: Analyze > Compare Means > Options, then transfer Mean from the left box to the right box

Calculating Mean(s) Using SPSS

Will be shown practically

Calculate Measures of Dispersion in SPSS
 Select: Analyze > Descriptive Statistics > Frequencies > Statistics > Desired type in Dispersion
or
 Select: Analyze > Descriptive Statistics > Descriptives > Options > Desired type in Dispersion
or
 Select: Analyze > Compare Means > Options, then transfer the desired type of Dispersion from the left box to the right box
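
For comparison with the SPSS menus, the same measures of central tendency and dispersion can be computed with Python's standard `statistics` module. This is a minimal sketch on made-up scores; `variance` and `stdev` use the sample (n - 1) formulas, which is also what SPSS reports.

```python
from statistics import mean, median, mode, stdev, variance

scores = [2, 3, 4, 4, 5, 6]  # made-up data for illustration

# Central tendency
print(mean(scores), median(scores), mode(scores))

# Dispersion: range, sample variance and sample standard deviation
data_range = max(scores) - min(scores)
print(data_range, variance(scores), round(stdev(scores), 3))
```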

Producing charts
 Select: Analyze > Descriptive Statistics > Frequencies > Reset > Desired variable > Charts > Bar/Pie/Histogram > Frequencies/Percentages > OK

 Practice with real data

Chart using Excel
 Charts can be constructed easily, with more options, in Excel using tables produced by SPSS
1. Open Excel
2. Calculate the desired table in SPSS
3. Copy the table by right-clicking it in the SPSS output window
4. Paste the table in Excel
5. Edit the table to make it fit for a chart
6. Select the table
7. Click chart
8. Edit the chart according to your choice

 Practice with real data


Test of hypothesis
 A hypothesis test is a statistical test that is
used to determine whether there is enough
evidence in a sample of data to infer that a
certain condition is true for the entire
population.
 A hypothesis test examines two
opposing hypotheses about a population: the
null hypothesis and the
alternative hypothesis.

Steps of hypothesis testing
 Hypothesis testing – 7 steps
1) Construct null hypothesis
2) Construct alternative hypothesis
3) Select a significance level and a critical region
(region of rejection of the null hypothesis).
Consider:
a) Whether both ends (tails) of the distribution should be
included.
b) How the critical region of a certain size will contribute
to Type I or Type II errors.
4) Select a statistical test to use
5) Calculate the test statistic
6) Find the tabulated (critical) value
7) Make a decision (accept or reject the null hypothesis)
1) Null hypothesis
 Null (Latin nullus): “not any”, i.e., no difference between groups
 A neutral position
 Predicts that two groups will not differ
 Denoted by H0
H0: µ1 = µ2
 The purpose of statistical test is to evaluate the
null hypothesis (H0) at a specified level of
significance
2) Alternative hypothesis
3 (a) Levels of significance
 The level of significance is the probability, fixed by the researcher in advance, of rejecting the null hypothesis when it is actually true (a Type I error).
 A result is “statistically significant” at a certain level. For example, a result might be significant at p<.05.
 “p” is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true, that is, by chance alone, and
 .05 is the 5% threshold chosen for that probability.
 Therefore, p<.05 means that a result this extreme would occur in fewer than 5% of samples if the null hypothesis were true, so the observed result is unlikely to be due to chance alone and is attributed to the influence of the independent variable.
3 (b) Critical region
 The set of values of the test statistic for which the null hypothesis is rejected.
 Values of the test statistic for which we reject the null in favor of the alternative hypothesis
 May be located on one side or on both sides of the distribution
Critical region for one-tailed test
Critical regions for a two-tailed test
4) Test selection
 When a researcher is ready to test a specific hypothesis generated from a theory or to answer a research question posed, he or she is faced with the task of choosing an appropriate statistical procedure.
Factors in choosing a test

1. The nature of the hypothesis


2. The levels of measurement of the
variables to be tested
3. Sample size
4. Number of groups in analysis
Levels of measurement and statistical test
 There are four levels or scales of
measurement.
 Each level is classified according to certain
characteristics.
 Data that fall in the first level are limited to
certain statistical tests.
 Choices of statistical tests (and the power of
the tests) increase as the levels go up.
5) Calculate the test statistic
 You can do it manually, or

 You can do it using software, such as SPSS

Degrees of Freedom
 One sample t-test or paired t-test = N-1
 Independent t-test = N-2
 Chi-square test = (# rows - 1) x (# columns - 1)
 ANOVA
df between groups = (# levels or groups – 1)
df within groups = (# subjects - # of levels)
 Correlations = N-2
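
The degrees-of-freedom rules listed above translate directly into small helper functions. The following is a minimal Python sketch; all function names are hypothetical, and `n_total` for the independent t-test means the combined size of both groups.

```python
def df_one_sample_t(n):
    """One-sample or paired t-test: N - 1."""
    return n - 1

def df_independent_t(n_total):
    """Independent t-test: N - 2, with N the combined size of the two groups."""
    return n_total - 2

def df_chi_square(rows, columns):
    """Chi-square test on a contingency table: (rows - 1) x (columns - 1)."""
    return (rows - 1) * (columns - 1)

def df_anova(n_subjects, n_groups):
    """One-way ANOVA: (df between groups, df within groups)."""
    return n_groups - 1, n_subjects - n_groups

def df_correlation(n):
    """Correlation: N - 2."""
    return n - 2

print(df_chi_square(2, 5))  # a table with 2 rows and 5 columns, as in Table-2
print(df_anova(30, 3))      # 3 groups of 10 subjects each
```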
7) Making the decision
 If the calculated value is lower than the tabulated value, the null hypothesis is not rejected (accepted). That is, there is no evidence of a significant difference between/among groups
 If the calculated value is higher than the tabulated value, the null hypothesis is rejected. That is, there is a significant difference between/among groups.
Use of some common statistical tests

Chi-square test
 Application
 When data are presented in ‘contingency tables’, i.e., in categories.
 Use
 The χ2 test is used to test hypotheses about the association between categorical variables.

 Calculation using SPSS


 Will be shown practically.
Test of Mean (T-test & Z-test)
 Application
 The data must be continuous.
 The data must follow the normal probability distribution.
 The sample is a simple random sample from its population.

 Use
 It is used to determine whether there is a significant
difference between the means of two groups
 Whether the sample mean represents the population mean

 T-test if the sample size is small (<30)
 Z-test if the sample size is large (≥30)
 Calculation using SPSS
 Will be shown practically
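
To show what the software calculates, the following is a minimal Python sketch of the independent two-sample t statistic with pooled variance. It assumes equal variances in the two groups; the function `independent_t` is hypothetical, the two samples are made up, and the critical value 2.306 (two-tailed, 5% level, df = 8) is taken from a standard t table.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / std_error
    return t, n_a + n_b - 2

group_a = [1, 2, 3, 4, 5]    # made-up scores for illustration
group_b = [2, 4, 6, 8, 10]

t_value, df = independent_t(group_a, group_b)
CRITICAL_T_DF8 = 2.306  # assumption: two-tailed 5% value from a standard t table
reject_null = abs(t_value) > CRITICAL_T_DF8
print(round(t_value, 3), df, reject_null)
```

In this made-up example the calculated value is smaller in absolute size than the tabulated value, so the null hypothesis of equal means is not rejected.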
F-test
 F-Test for testing equality of variances of two
normal populations
 Variances are a measure of dispersion, or how far the
data are scattered from the mean.
 F-Test is also used for testing equality of several
means. The test for equality of several means is
carried out by the technique called ANOVA.
 If a researcher wants to test whether or not two
independent samples have been drawn from a
normal population with the same variability, then
he generally employs the F-test.
ANOVA
 ANOVA - the Analysis of Variance
 ANOVA can be used to test the means of more than 2 groups.
 Example: The hypothesis is that all students of Economics, Social Work and Sociology have equal mean IQ scores. Each dept. has 500 students. A random sample of 10 from each dept. is taken and tested.
 We have mean scores of 113, 96 and 93 respectively
 Now the question is: do all 500 students per dept. have the same mean IQ?
 We can decide it through ANOVA
 Calculation using SPSS
 Will be shown practically
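
The F ratio behind a one-way ANOVA compares the variation between group means with the variation within groups. The following is a minimal Python sketch; the function `one_way_anova` is hypothetical, and because the slide gives only the three sample means and not the individual scores, the three small groups below are made-up numbers for illustration.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F ratio, df between groups, df within groups)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: group means against the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: each value against its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up scores for three groups
f_ratio, df_between, df_within = one_way_anova(groups)
print(round(f_ratio, 2), df_between, df_within)
```

The calculated F is then compared with the tabulated F for the two degrees of freedom, following the same decision rule as for the other tests.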

Thank you
