Simple and Cross Tabulation

Abu Bashar

Data exploration
Graphical plots of the data: to get a first overview of the main characteristics of the data set, especially the distribution of the original variables across the whole sample and for subsamples
Univariate descriptive statistics and one-way tabulation: to synthesize the main characteristics of each variable in the data set
Multivariate descriptive statistics and cross-tabulation: to get a first understanding of the relationships between different variables by examining two or more variables jointly

Graphs
Univariate plots of qualitative or discrete data
Univariate plots of quantitative data
Bivariate and multivariate plots of quantitative data
Bivariate and multivariate plots of quantitative versus qualitative data

Univariate qualitative or discrete data


Bar chart
[Figure: Number of sampled households by household size; x-axis: Household size; y-axis: Count]

Line chart
[Figure: Sampled households by size; x-axis: Household size; y-axis: Count]

Pie chart
[Figure: Sampled households by household size; pies show counts for household sizes 1, 2, 3, 4, 5, 6 and 9; slice labels: n=1, n=24, n=12, n=70, n=149, n=67, n=177]

Univariate continuous data (1)


Histogram
[Figure: Number of sampled households by household income; x-axis: Anonymised hhold inc + allowances; y-axis: Count]

Error bar chart
[Figure: Household income by quartile (Low income, Medium-low income, Medium-high income, High income); y-axis: Anonymised hhold inc + allowances; error bars show Mean +/- 1.0 SD]

Box-Whiskers Diagram
[Figure: Weekly household income plus allowances; the diagram marks the Minimum, Lower quartile, Median, Upper quartile and Maximum]
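The five values marked on a box-whiskers diagram can be computed directly. The sketch below is only an illustration: the income figures are hypothetical, and the "inclusive" quartile method is one of several conventions (SPSS may use a different one).

```python
from statistics import quantiles

# Hypothetical weekly household incomes, for illustration only.
incomes = [120.0, 185.5, 240.0, 310.2, 355.0, 420.9, 510.0, 640.3, 820.0, 1450.0]

# Quartiles with the "inclusive" method (an assumed convention).
lower_quartile, median, upper_quartile = quantiles(incomes, n=4, method="inclusive")

five_number_summary = {
    "Minimum": min(incomes),
    "Lower quartile": lower_quartile,
    "Median": median,
    "Upper quartile": upper_quartile,
    "Maximum": max(incomes),
}

for label, value in five_number_summary.items():
    print(f"{label:<15}{value:>10.2f}")
```

The box spans the lower to the upper quartile, the line inside the box is the median, and the whiskers reach towards the minimum and maximum.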

Univariate continuous data (2)


Pareto charts
Bars ordered in decreasing order of the frequencies they represent
The line indicates the cumulative proportion
Useful for quality control (ANALYZE/QUALITY CONTROL in SPSS)

Q-Q plots
[Figure: Normal Q-Q Plot of EFS: Total Food & non-alcoholic beverage; x-axis: Observed Value; y-axis: Expected Normal Value]
Compare the empirical (observed) data distribution and some theoretical distribution
When the observed distribution is close to the theoretical one, the plotted values tend to lie on a straight line.
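A minimal sketch of how the points of a normal Q-Q plot can be obtained, assuming a hypothetical sample and the plotting position (i - 0.5) / n, which is one common convention and not necessarily the one SPSS uses. The function name `normal_qq_points` is introduced here for illustration.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(sample):
    """Return (observed, expected normal) pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Reference normal distribution with the sample's own mean and SD (an assumption).
    reference = NormalDist(mean(ordered), stdev(ordered))
    points = []
    for i, observed in enumerate(ordered, start=1):
        p = (i - 0.5) / n  # plotting position for the i-th ordered value
        points.append((observed, reference.inv_cdf(p)))
    return points


# Hypothetical weekly food expenditures, for illustration only.
sample = [12.0, 18.5, 22.1, 25.0, 27.3, 30.8, 33.2, 36.9, 41.0, 58.4]
qq = normal_qq_points(sample)
for observed, expected in qq:
    print(f"{observed:7.2f}  {expected:7.2f}")
```

Plotting the pairs gives the Q-Q plot: the closer the sample is to a normal distribution, the closer the points fall to the line where observed equals expected.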

[Figure: Pareto Chart, Total Revenues by Income Quartile; x-axis: Anonymised hhold inc + allowances (Banded); bars: High income 170054.19, Medium-high income 78867.66, Medium-low income 45640.06, Low income 23488.32; line: cumulative Percent]
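The cumulative line of a Pareto chart can be reproduced from the bar values. The sketch below uses the four revenue totals shown in the chart; the assignment of each total to an income band is read off the bar order and is an assumption.

```python
from itertools import accumulate

# Revenue totals from the chart; the band-to-value mapping is assumed from bar order.
revenues = {
    "Low income": 23488.32,
    "Medium-low income": 45640.06,
    "Medium-high income": 78867.66,
    "High income": 170054.19,
}

# Bars are placed in decreasing order of the quantity they represent.
ordered = sorted(revenues.items(), key=lambda item: item[1], reverse=True)
total = sum(value for _, value in ordered)

# The line shows the running total as a percentage of the grand total.
running_totals = list(accumulate(value for _, value in ordered))
cumulative_pct = [100 * running / total for running in running_totals]

for (label, value), pct in zip(ordered, cumulative_pct):
    print(f"{label:<20}{value:>12.2f}{pct:>8.1f}%")
```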

Bivariate and multivariate plots


Clustered Bar Chart
[Figure: Average household expenditure for selected categories by income range; x-axis: Anonymised hhold inc + allowances (Banded): Low income, Medium-low income, Medium-high income, High income; y-axis: Mean; series: EFS: Total Food & non-alcoholic beverage, EFS: Total Clothing and Footwear, EFS: Total Recreation, EFS: Total Restaurants and Hotels]

Scatterplot
[Figure: Beer and sausage expenditure; axes: Beer and lager (brought home), Sausages]

Multi-variable Line Chart
[Figure: Mean Weekly Household Expenditure by Category with Confidence Intervals; axis: Value; one entry per EFS expenditure category]

Multi-variable Pie Chart
[Figure: Household expenditure by category; cases weighted by Annual weight; slice shares: 9.22%, 11.84%, 3.36%, 6.16%, 1.93%, 10.16%, 11.24%, 16.89%, 9.42%, 2.88%, 15.55%, 1.35%]

EFS expenditure categories named in the chart legends: Total Food & non-alcoholic beverage; Total Alcoholic Beverages, Tobacco; Total Clothing and Footwear; Total Housing, Water, Electricity; Total Furnishings, HH Equipment, Carpets; Total Health expenditure; Total Transport costs; Total Communication; Total Recreation; Total Education; Total Restaurants and Hotels

Bivariate and multivariate plots


Pareto Chart
[Figure: Total Weekly Expenditure for Selected Categories; bars: EFS: Total Food & non-alcoholic beverage 22,725, EFS: Total Restaurants and Hotels 18,110, EFS: Total Alcoholic Beverages, Tobacco 6,335; left axis: Count; right axis: Percent]

Stacked Pareto Chart
[Figure: Soft Drink and Fruit Juice Consumption; x-axis: Number of children; series: Soft drinks, Fruit juices, Cumulative; left axis: Count; right axis: Percent]

Clustered Bar Chart
[Figure: Alcohol expenditure away from home, Means by income quartile; x-axis: Anonymised hhold inc + allowances (Banded): Low income, Medium-low income, Medium-high income, High income; y-axis: Mean; series: Wine from grape or other fruit (away from home), Ciders and Perry (away from home), Beer and lager (away from home)]

Frequency table
Frequency Table for variable q1 in the Trust dataset
How many people do you regularly buy food for home consumption (including yourself)?

Response category          Count       %
1 - Extremely unlikely        91    18.3
2                            176    35.4
3                            100    20.1
4 Neither                     94    18.9
5                             21     4.2
6                             13     2.6
7 - Extremely likely           2     0.4
Total                        497   100.0
Missing values                 3
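The percentages in the frequency table are computed on valid responses only, with missing values excluded from the base. A short sketch using the counts reported for q1 (variable names are illustrative):

```python
# Counts per response category for q1, as reported in the frequency table.
counts = {1: 91, 2: 176, 3: 100, 4: 94, 5: 21, 6: 13, 7: 2}
missing = 3

# Percentages use the number of valid responses as the base.
valid_total = sum(counts.values())
percents = {category: round(100 * count / valid_total, 1)
            for category, count in counts.items()}

for category, count in counts.items():
    print(f"{category:<10}{count:>6}{percents[category]:>8.1f}")
print(f"{'Total':<10}{valid_total:>6}{100.0:>8.1f}")
print(f"{'Missing':<10}{missing:>6}")
```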

Descriptive statistics
Chicken (Kg.): In a typical week how much fresh or frozen chicken do you buy for your household consumption (Kg.)?
Chicken (Euro): In a typical week how much do you spend on fresh or frozen chicken (Euro)?

                      Chicken (Kg.)   Chicken (Euro)        Age
N Valid                         446              443        500
N Missing                        54               57          0
Mean                         1.0582           5.6677     45.582
Std. Error of Mean           .06843           .19640      .7100
Median                        .9100           5.0000     45.000
Mode                           1.00             3.00       45.0
Std. Deviation              1.44514          4.13383    15.8763
Variance                      2.088           17.089    252.055
Minimum                         .00              .00       18.0
Maximum                       25.03            30.00       87.0
Percentiles 25                .5000           3.0000     32.000
Percentiles 50                .9100           5.0000     45.000
Percentiles 75               1.3600           7.5000     57.000
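Several rows of the descriptive statistics output are linked: the variance is the square of the standard deviation, and the standard error of the mean is the standard deviation divided by the square root of the number of valid cases. The sketch below checks these relations against the reported figures; the dictionary keys and the helper `standard_error` are illustrative names.

```python
from math import sqrt

# Values reported in the descriptive statistics output.
reported = {
    "Chicken (Kg.)": {"n": 446, "sd": 1.44514, "variance": 2.088, "se": 0.06843},
    "Chicken (Euro)": {"n": 443, "sd": 4.13383, "variance": 17.089, "se": 0.19640},
    "Age": {"n": 500, "sd": 15.8763, "variance": 252.055, "se": 0.7100},
}


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of valid N."""
    return sd / sqrt(n)


for name, stats in reported.items():
    se = standard_error(stats["sd"], stats["n"])
    variance = stats["sd"] ** 2
    print(f"{name:<16} SE={se:.5f}  variance={variance:.3f}")
```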

Cross-tabulation
Food & non-alcoholic beverage (Binned) * Anonymised hhold inc + allowances (Banded) Cross tabulation
Cells show Count (% of Total); columns are bands of Anonymised hhold inc + allowances (Banded)

Food & non-alcoholic beverage (Binned)   Low income    Medium-low income   Medium-high income   High income   Total
20 or less                               47 (9.4%)     19 (3.8%)           18 (3.6%)            4 (.8%)       88 (17.6%)
From 20 to 40                            57 (11.4%)    48 (9.6%)           24 (4.8%)            22 (4.4%)     151 (30.2%)
From 40 to 60                            17 (3.4%)     31 (6.2%)           45 (9.0%)            40 (8.0%)     133 (26.6%)
More than 60                             4 (.8%)       27 (5.4%)           38 (7.6%)            59 (11.8%)    128 (25.6%)
Total                                    125 (25.0%)   125 (25.0%)         125 (25.0%)          125 (25.0%)   500 (100.0%)
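The margins and the "% of Total" entries of the cross-tabulation follow from the cell counts alone. A sketch using the counts reported in the table (rows are food expenditure bins, columns are income bands; variable names are illustrative):

```python
row_labels = ["20 or less", "From 20 to 40", "From 40 to 60", "More than 60"]
col_labels = ["Low income", "Medium-low income", "Medium-high income", "High income"]

# Cell counts as reported in the cross-tabulation.
counts = [
    [47, 19, 18, 4],
    [57, 48, 24, 22],
    [17, 31, 45, 40],
    [4, 27, 38, 59],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(column) for column in zip(*counts)]
grand_total = sum(row_totals)

# Each cell as a percentage of the whole sample.
pct_of_total = [[round(100 * cell / grand_total, 1) for cell in row] for row in counts]

for label, row, total in zip(row_labels, pct_of_total, row_totals):
    print(f"{label:<15}", row, f"row total: {total}")
print("Column totals:", dict(zip(col_labels, col_totals)))
```

Row or column percentages are obtained the same way by dividing each cell by its row total or column total instead of the grand total.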

3-variables frequency table


Children, income and age of HRP
Table % by Number of children (Banded), within Anonymised hhold inc + allowances (Banded) and Age of HRP anonymised (Binned)
Columns of Number of children (Banded): No children, One children, Two children, More than two children
Empty cells are not printed, so each row lists its non-empty cells in column order.

Income band            Age of HRP             Non-empty cells (Table %)     Total
Low income             Less than 30 years     1.4%  .8%  .2%                 2.4%
Low income             From 30 to 55 years    2.6%  1.0%  .4%  .2%           4.2%
Low income             More than 55 years     18.4%                         18.4%
Medium-low income      Less than 30 years     1.0%  .2%  .2%                 1.4%
Medium-low income      From 30 to 55 years    4.8%  2.2%  2.6%  1.6%        11.2%
Medium-low income      More than 55 years     12.0%  .4%                    12.4%
Medium-high income     Less than 30 years     .6%  .4%  .8%  .2%             2.0%
Medium-high income     From 30 to 55 years    7.6%  2.6%  3.8%  2.4%        16.4%
Medium-high income     More than 55 years     6.4%  .2%                      6.6%
High income            Less than 30 years     1.6%  .4%                      2.0%
High income            From 30 to 55 years    9.2%  2.6%  4.6%  1.6%        18.0%
High income            More than 55 years     4.8%  .2%                      5.0%

Quantitative by categorical
Mean (Standard Deviation) by Age of HRP - anonymised (Binned)

                             Less than 30 years   From 30 to 55 years   More than 55 years
Books                        .589 (1.59)          2.701 (8.26)          1.112 (3.78)
Ice cream                    .000 (.00)           .067 (.39)            .010 (.11)
Internet subscription fees   .263 (.96)           .396 (1.80)           .139 (.84)
Cinemas                      1.240 (5.78)         .644 (3.37)           .107 (.84)
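A quantitative-by-categorical table reports a mean and standard deviation of the quantitative variable within each category. The sketch below shows the grouping step on hypothetical records; the spending values are invented for illustration and are not taken from the survey.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (age band, weekly cinema spend) records, for illustration only.
records = [
    ("Less than 30 years", 2.0), ("Less than 30 years", 4.0), ("Less than 30 years", 6.0),
    ("From 30 to 55 years", 0.0), ("From 30 to 55 years", 1.0), ("From 30 to 55 years", 2.0),
    ("More than 55 years", 0.0), ("More than 55 years", 0.0), ("More than 55 years", 3.0),
]

# Collect the quantitative values under each category.
groups = defaultdict(list)
for band, spend in records:
    groups[band].append(spend)

# Mean and sample standard deviation per category.
summary = {band: (mean(values), stdev(values)) for band, values in groups.items()}

for band, (group_mean, group_sd) in summary.items():
    print(f"{band:<22}{group_mean:.3f} ({group_sd:.2f})")
```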
