
DATA PROCESSING

The data preparation process


DATA EDITING

DATA CODING

DATA CLASSIFICATION

DATA TABULATION

EXPLORATORY DATA ANALYSIS

Benefits of data editing


The data obtained is complete in all respects.

It is accurate in terms of information recorded and responses sought.

Questionnaires are legible and are correctly deciphered, especially the open-ended questions.

The response format is in the form that was instructed.

The data is structured in a manner that entering the information will not be a problem.

Data editing
Field editing: usually done at the end of every field day by the field investigator(s), who must review the filled forms for any inconsistencies, nonresponse, illegible responses or incomplete questionnaires.

Centralized in-house editing: usually done at the researcher's end. Ways of handling problem responses:
Backtracking
Allocating missing values
Plug value
Discarding unsatisfactory responses
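The plug-value idea can be sketched in Python. This is only an illustration: it assumes missing responses are recorded as None and that the plug value defaults to the mean of the valid responses. The function name and the sample values are hypothetical, and other plug rules (for example a neutral scale point) are equally possible.

```python
from statistics import mean

def plug_missing(values, plug=None):
    """Replace None entries with a plug value (default: mean of the valid entries)."""
    valid = [v for v in values if v is not None]
    if not valid:
        raise ValueError("no valid responses to compute a plug value from")
    if plug is None:
        plug = mean(valid)  # assumption: the mean is used as the plug value
    return [plug if v is None else v for v in values]

# Hypothetical km/day responses with two missing answers
km_per_day = [20, 25, None, 15, 20, None]
filled = plug_missing(km_per_day)
```

Passing an explicit plug argument replaces the default, which is useful when the researcher has decided on a fixed neutral value in advance.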

Data coding
Coding is the process of identifying and assigning a numeral to each response given by the respondent.

Field
Record
File
Data matrix

Data coding
Sample record: Excel sheet for two-wheeler owners

Unit        Occupation  Vehicle     Km/day      Marital status  Family size
(Column 1)  (Column 2)  (Column 3)  (Column 4)  (Column 5)      (Column 6)
1           4           1           20          1               3
2           3           2           25          2               1
3           5           1           25          1               4
4           2           1           15          2               2
5           4           2           20          2               4
6           5           2           35          2               6
7           1           1           40          1               3
8           5           2           20          2               4
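The sample sheet can be read as a data matrix in which each row is a record and each column is a field. The sketch below holds it in plain Python; the field names and the helper function are illustrative choices, and the vehicle codes are assumed to be the unlabeled row of 1s and 2s in the sheet.

```python
# Field names follow the columns of the sample sheet (names are illustrative)
fields = ["unit", "occupation", "vehicle", "km_per_day", "marital_status", "family_size"]

# Each tuple is one record (one respondent).
# Assumption: the vehicle codes are the unlabeled row of 1s and 2s in the sheet.
records = [
    (1, 4, 1, 20, 1, 3),
    (2, 3, 2, 25, 2, 1),
    (3, 5, 1, 25, 1, 4),
    (4, 2, 1, 15, 2, 2),
    (5, 4, 2, 20, 2, 4),
    (6, 5, 2, 35, 2, 6),
    (7, 1, 1, 40, 1, 3),
    (8, 5, 2, 20, 2, 4),
]

def column(name):
    """Return all values of one field across the records."""
    idx = fields.index(name)
    return [row[idx] for row in records]

km = column("km_per_day")
```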

Code book formulation


Appropriate to the research objective
Comprehensive
Mutually exclusive
Single variable entry

Pre-Coding closed-ended questions


Dichotomous questions:

Do you eat ready-to-eat food? Yes = 1; No = 0 (X-1)

Ranking questions:

Q.No.  Variable name          Symbol  Coding instructions
1.     Balika Badhu           X 10a   Number from 1-10
2.     Sathiya                X 10b   Number from 1-10
3.     Sasural Genda Phool    X 10c   Number from 1-10
4.     Bidai                  X 10d   Number from 1-10
5.     Pathshala              X 10e   Number from 1-10
6.     Bandini                X 10f   Number from 1-10
7.     Laptaganj              X 10g   Number from 1-10
8.     Sajan Ghar Jaaana Hai  X 10h   Number from 1-10
9.     Tere Liye              X 10i   Number from 1-10
10.    Uttaran                X 10j   Number from 1-10
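A coded ranking response can be checked before data entry. The sketch below assumes that a valid ranking uses each number from 1 to 10 exactly once (no ties); the coding instructions only say "Number from 1-10", so the no-ties rule, the function name and the sample responses are assumptions.

```python
def is_valid_ranking(ranks, n_items=10):
    """True if ranks uses each number from 1 to n_items exactly once."""
    return sorted(ranks) == list(range(1, n_items + 1))

# Hypothetical coded responses for X 10a .. X 10j
ok = is_valid_ranking([3, 1, 2, 5, 4, 7, 6, 10, 9, 8])
bad = is_valid_ranking([1, 1, 2, 3, 4, 5, 6, 7, 8, 9])  # rank 1 used twice, 10 missing
```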

Pre-Coding Closed-ended questions


Checklists/multiple responses

How many columns will you make for the following question?

Which of the following newspapers do you read? (Tick all that you read.)

Times of India ______
Hindustan Times ______
Mail Today ______
Indian Express ______
Deccan Chronicle ______
Asian Age ______
Mint ______
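A common convention for tick-all-that-apply questions is one column per option, each coded 1 if ticked and 0 otherwise, which gives seven columns here. The sketch below illustrates that convention; the function name and the sample respondent are hypothetical.

```python
NEWSPAPERS = [
    "Times of India", "Hindustan Times", "Mail Today", "Indian Express",
    "Deccan Chronicle", "Asian Age", "Mint",
]

def code_checklist(ticked):
    """Return one 0/1 code per newspaper, in the fixed column order."""
    unknown = set(ticked) - set(NEWSPAPERS)
    if unknown:
        raise ValueError(f"unrecognised options: {sorted(unknown)}")
    return [1 if paper in ticked else 0 for paper in NEWSPAPERS]

# Hypothetical respondent who reads two of the seven papers
row = code_checklist({"Times of India", "Mint"})
```

Keeping one column per option preserves every combination of answers, which a single multi-valued column would not.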

Pre-Coding Closed-ended questions


Scaled questions

Col. no.  Variable name          Symbol  Coding instructions
1.        Individual shops more  X 1a    A number from 1 to 5: SA = 5, A = 4, N = 3, D = 2, SD = 1
2.        Well informed          X 1b    - do -
3.        Knows what to buy      X 1c    - do -
4.        More spending money    X 1d    - do -
5.        More shopping options  X 1e    - do -
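The scale coding (SA = 5 down to SD = 1) amounts to a simple lookup. A minimal sketch, with a hypothetical respondent and variable symbols written without spaces:

```python
LIKERT_CODES = {"SA": 5, "A": 4, "N": 3, "D": 2, "SD": 1}

def code_scaled(responses):
    """Map {variable symbol: response label} to numeric codes."""
    return {var: LIKERT_CODES[label] for var, label in responses.items()}

# Hypothetical answers to the five scaled items
coded = code_scaled({"X1a": "SA", "X1b": "N", "X1c": "D", "X1d": "A", "X1e": "SD"})
```

An unknown label raises a KeyError, which flags a response that does not match the instructed format.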

Sample code book extract


Question No.  Variable name                   Symbol  Coding instruction
1.            Buy ready to eat food products  X1      Yes = 1, No = 0
2.            Use ready to eat food products  X2      Yes = 1, No = 0
22.           Age                             X22     Less than 20 yrs = 1, 21 to 26 years = 2, 27 to 35 years = 3, 36 to 45 years = 4, More than 45 years = 5
23.           Gender                          X23     Male = 1, Female = 2
24.           Marital status                  X24     Single = 1, Married = 2, Divorced/widow = 3
25.           No. of children                 X25     Exact no. to be written
26.           Family size                     X26     One to two = 1, Three to five = 2, Six & more = 3
27.           Monthly household income        X27     Rs.20000 to Rs.34999 = 1, Rs.35000 to Rs.50000 = 2, Rs.50001 to Rs.74999 = 3, Rs.75000 & above = 4
28.           Education                       X28     Less than graduation = 1, Graduation = 2, Post graduation & above = 3
29.           Occupation                      X29     Student = 1, Businessman = 2, Professional = 3, Service = 4, Housewife = 5, Others = 6
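A code book of this kind can be applied mechanically. The sketch below covers a few of the variables (X23, X24, X26, X28) as an illustration; the dictionary layout, function names and the sample respondent are assumptions, not part of the code book itself.

```python
# Categorical codes copied from the code book extract
CODE_BOOK = {
    "X23": {"Male": 1, "Female": 2},
    "X24": {"Single": 1, "Married": 2, "Divorced/widow": 3},
    "X28": {"Less than graduation": 1, "Graduation": 2, "Post graduation & above": 3},
}

def code_family_size(n):
    """X26: one to two = 1, three to five = 2, six & more = 3."""
    if n < 1:
        raise ValueError("family size must be at least 1")
    if n <= 2:
        return 1
    if n <= 5:
        return 2
    return 3

def code_response(answers):
    """Turn raw answers keyed by variable symbol into numeric codes."""
    coded = {var: CODE_BOOK[var][ans] for var, ans in answers.items() if var in CODE_BOOK}
    if "X26" in answers:
        coded["X26"] = code_family_size(answers["X26"])
    return coded

# Hypothetical respondent
coded = code_response({"X23": "Female", "X24": "Married", "X26": 4, "X28": "Graduation"})
```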

Post-Coding Open-ended questions


If you think Lean was a success so far, please specify the three most significant reasons that, in your opinion, have contributed to its success.

Col. no.  Symbol  Coding instructions  Variable name
63.       X 63a   Yes = 1, No = 0      Improvement at work place by eliminating waste
64.       X 63b   Yes = 1, No = 0      To meet increasing demands of customers
65.       X 63c   Yes = 1, No = 0      To improve quality
66.       X 63d   Yes = 1, No = 0      To achieve corporate goal
67.       X 63e   Yes = 1, No = 0      It reduces cycle time of the manufacturing & production
68.       X 63f   Yes = 1, No = 0      Reduced response time
69.       X 63g   Yes = 1, No = 0      Enhanced innovation and creativity

Classification and tabulation of data


Classification by attributes: mostly categorical

Classification by class intervals: this could be exclusive or inclusive

Tabulation: arrangement of data into an orderly arrangement of rows and columns in order to subject it to statistical analysis
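The difference between exclusive and inclusive class intervals can be sketched as follows. The sketch assumes the usual textbook convention: in an exclusive arrangement the upper limit of one class is the lower limit of the next and belongs to the next class (10-20, 20-30), while in an inclusive arrangement both limits belong to the class (10-19, 20-29). Function names and class limits are illustrative.

```python
def classify_exclusive(value, lower, width, n_classes):
    """Exclusive classes [lower, lower+width), [lower+width, lower+2*width), ...
    The upper limit of a class belongs to the next class."""
    idx = int((value - lower) // width)
    if not 0 <= idx < n_classes:
        return None  # value falls outside all classes
    return (lower + idx * width, lower + (idx + 1) * width)

def classify_inclusive(value, classes):
    """Inclusive classes given as (low, high) pairs; both limits belong to the class."""
    for low, high in classes:
        if low <= value <= high:
            return (low, high)
    return None

# The value 20 lands in 20-30 under the exclusive scheme and in 20-29 under the inclusive one
exclusive_class = classify_exclusive(20, lower=10, width=10, n_classes=3)
inclusive_class = classify_inclusive(20, [(10, 19), (20, 29), (30, 39)])
```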

Exploratory data analysis


Sample characteristics: age group of the sample

Age group   Frequency  Percent
20-25       27         27.0
26-30       37         37.0
31-35       9          9.0
36-40       22         22.0
41-45       3          3.0
46 & above  2          2.0
Total       100        100.0
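The percent column follows directly from the frequencies. A minimal sketch using the figures from the table; the variable names are illustrative.

```python
# Frequencies copied from the age-group table
frequencies = {
    "20-25": 27, "26-30": 37, "31-35": 9,
    "36-40": 22, "41-45": 3, "46 & above": 2,
}

total = sum(frequencies.values())
# Percent of the sample in each age group, rounded to one decimal place
percent = {group: round(100 * count / total, 1) for group, count in frequencies.items()}
```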

Exploratory data analysis: pie charts

[Pie chart: Age Group; slices for 20-25, 26-30, 31-35, 36-40, 41-45, 46 & above]

Exploratory data analysis: bar charts

[Bar chart: Age Group; x-axis: age group (20-25, 26-30, 31-35, 36-40, 41-45, 46 & above), y-axis: frequency (0 to 40)]

Exploratory data analysis: histograms

[Histogram; x-axis: purchase in gms (10.00 to 40.00), y-axis: frequency. Mean = 18.3553, Std. Dev. = 6.55777, N = 15]

Statistical software packages


MS Excel
MINITAB
Statistical Analysis System (SAS)
Statistical Package for the Social Sciences (SPSS)

Types of statistical methods


Descriptive
Inferential

Descriptive statistics includes statistical methods involving the collection, presentation and characterization of a set of data in order to describe its various features. These methods provide simple summaries about the sample and the measures. Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data.

With inferential statistics, we try to reach conclusions that extend beyond the immediate data alone. For instance, we use inferential statistics to infer from the sample data what the population parameter might be.

Descriptive Statistics
Collect data, e.g., survey

Present data, e.g., tables and graphs

Summarize data, e.g., sample mean $\bar{X} = \frac{\sum X_i}{n}$
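The sample mean is the sum of the observations divided by their number. A short sketch, using the km/day column of the two-wheeler sample sheet as data; the function name is illustrative.

```python
def sample_mean(xs):
    """Sum of the observations divided by the number of observations."""
    if not xs:
        raise ValueError("sample mean is undefined for an empty sample")
    return sum(xs) / len(xs)

# Km/day values from the two-wheeler sample sheet
km_per_day = [20, 25, 25, 15, 20, 35, 40, 20]
x_bar = sample_mean(km_per_day)  # 200 / 8
```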

Inferential Statistics
Estimation, e.g., estimate the population mean weight using the sample mean weight

Hypothesis testing, e.g., test the claim that the population mean weight is 120 pounds

Inference is the process of drawing conclusions or making decisions about a population based on sample results.
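Both ideas can be illustrated with a small sketch. The weights below are hypothetical, and the one-sample t statistic is used here as one common way to test a claim about a population mean; the slides do not name a specific test.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t statistic for H0: population mean equals mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error

# Hypothetical sample of weights in pounds
weights = [118, 122, 125, 119, 121, 124, 120, 123]

estimate = mean(weights)          # estimation: point estimate of the population mean
t_stat = one_sample_t(weights, 120)  # hypothesis testing: claim that the mean is 120
```

The resulting statistic would then be compared with a critical value of the t distribution with n - 1 degrees of freedom to decide whether the claim is rejected.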

Data
Data are a collection of any number of related observations.

A collection of data is called a data set, and a single observation is called a data point.

Types of Data
Data
  Categorical (defined categories or groups)
    Examples: marital status; "Are you registered to vote?"; eye color
  Numerical
    Discrete (counted items)
      Examples: number of children; defects per hour
    Continuous (measured characteristics)
      Examples: weight; voltage

Qualitative & quantitative data


Qualitative
Deals with descriptions. Data can be observed but not measured. E.g. color, texture, smell, beauty etc.

Quantitative
Deals with numbers. Data which can be measured. E.g. length, height, area, volume, speed etc.

Concept on Discrete & Continuous data


Data are discrete if only a finite number of values are possible, or if there is a gap on the number line between any two possible values.

Ex. A 5-question quiz is given in a Math class. The number of correct answers on a student's quiz is an example of discrete data. The number of correct answers would have to be one of the following: 0, 1, 2, 3, 4, or 5.

Continuous data can take any numerical value within a range. This type of data is usually associated with some sort of physical measurement.

Ex. The height of trees at a nursery is an example of continuous data. Is it possible for a tree to be 76.2" tall? Sure. How about 76.29"? Yes. How about 76.2914563782"? Yes.
