
Statistics

Paper 6

TOPIC 1: DATA REPRESENTATIONS
Subtopic: Introduction to Statistics and Construction of Frequency Distribution

What is Statistics?
Statistics is a subject that deals with collecting, organizing, summarizing, presenting and analyzing data. The information obtained helps in decision-making.

Types of Data:
a) Discrete data can take only exact values. Examples: the number of cars passing a checkpoint in 30 minutes, the number of children in a family, the number of tomatoes on each plant in a greenhouse.
b) Continuous data cannot take exact values; it can be given only within a certain range or measured to a certain degree of accuracy. Examples: the heights of 20 children in a school, the length of a hibiscus plant, the speeds of vehicles passing a particular point, the masses of cooking apples from a tree.

Frequency Distribution
Frequency is the number of times the same data value occurs. A frequency distribution is a method of summarizing data in tabular form (a frequency table), in which the values of the data are often grouped into classes. The steps for constructing a frequency distribution are as follows:
a) Choose a suitable number of classes. The classes are also referred to as class intervals.
b) Determine the range, that is, the difference between the highest and lowest observations. An observation is any numerical information recorded during the survey or study.
c) Determine the class size. The class size is the difference between the upper and lower class boundaries.
d) State the class limits for each class. Classes must not overlap: make sure that every observation can go into one class only.
e) Determine the frequency of each class.
Note: It is also important to understand the class mark. The class mark is the midpoint of each class.

Example 1: Discrete Data
Table 1 below represents the scores obtained in a test by 30 students.

5  7  6  7  7  10  6  4  9  8
8  7  10 7  8  8   4  7  7  5
9  8  7  7  8  6   9  9  7  8

Construct a frequency distribution of the scores obtained in a test by 30 students.

First we construct a tally chart:

Scores    Tally    Frequency
4
5
6
7
8
9
10
Total:

Prepared by Hani
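The tally for Example 1 can be checked with a short script. This is only a minimal sketch: the function name `frequency_table` is an illustrative choice, not part of the notes, and the scores are copied from Table 1.

```python
from collections import Counter

# Scores of the 30 students, copied from Table 1.
scores = [5, 7, 6, 7, 7, 10, 6, 4, 9, 8,
          8, 7, 10, 7, 8, 8, 4, 7, 7, 5,
          9, 8, 7, 7, 8, 6, 9, 9, 7, 8]

def frequency_table(data):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(data)
    return sorted(counts.items())

table = frequency_table(scores)
for value, freq in table:
    # One stroke per observation stands in for the hand-drawn tally marks.
    print(f"{value:>6}  {'|' * freq:<10}  {freq}")
print(f"{'Total':>6}  {'':<10}  {sum(freq for _, freq in table)}")
```

The total printed on the last line should equal the number of students, which is a quick check that no observation was missed or counted twice.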


The frequency distribution of the scores obtained in a test by 30 students is:

Scores    Frequency

Example 2: Continuous Data
Table 2 below represents the heights of 20 children in a school. The heights have been measured correct to the nearest cm.

133 130 136 131 120 125 138 144 133 128
131 134 127 135 141 137 127 133 143 129

To form a frequency distribution for the heights of the 20 children, we usually group the information into classes or intervals, for example:

Height (cm)            (Alternative ways of writing the interval)
119.5 ≤ h < 124.5      119.5-124.5      120-124
124.5 ≤ h < 129.5      124.5-129.5      125-129
129.5 ≤ h < 134.5      129.5-134.5      130-134
134.5 ≤ h < 139.5      134.5-139.5      135-139
139.5 ≤ h < 144.5      139.5-144.5      140-144

The values 119.5, 124.5, 129.5, ... are called class boundaries. The upper class boundary of one interval is the lower class boundary of the next interval. The next step is to construct the frequency distribution as shown below:

Height (cm)            Tally    Frequency
119.5 ≤ h < 124.5
124.5 ≤ h < 129.5
129.5 ≤ h < 134.5
134.5 ≤ h < 139.5
139.5 ≤ h < 144.5
Total:

The frequency distribution of the heights of the 20 children is:

Height (cm)    Frequency
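For continuous data the same counting can be done against the class boundaries. The sketch below assumes the five classes listed above and treats each class as lower boundary ≤ h < upper boundary; the function name `grouped_frequencies` is illustrative only.

```python
# Heights of the 20 children (cm), copied from Table 2.
heights = [133, 130, 136, 131, 120, 125, 138, 144, 133, 128,
           131, 134, 127, 135, 141, 137, 127, 133, 143, 129]

# Class boundaries used in Example 2.
boundaries = [119.5, 124.5, 129.5, 134.5, 139.5, 144.5]

def grouped_frequencies(data, boundaries):
    """Count how many values fall in each class lower <= h < upper."""
    classes = list(zip(boundaries[:-1], boundaries[1:]))
    counts = [0] * len(classes)
    for value in data:
        for i, (lower, upper) in enumerate(classes):
            if lower <= value < upper:
                counts[i] += 1
                break  # every observation goes into one class only
    return classes, counts

classes, counts = grouped_frequencies(heights, boundaries)
for (lower, upper), freq in zip(classes, counts):
    print(f"{lower} <= h < {upper}: {freq}")
print("Total:", sum(counts))
```

Because each boundary ends in .5 and the heights are whole numbers of centimetres, no observation can fall exactly on a boundary, so every height lands in exactly one class.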


More Examples:

1) Frequency distribution to show the lengths of 30 rods.

Length (mm)    Frequency
27-31          4
32-36          11
37-46          12
47-51          3

The interval 27-31 means: _________________________________________________
The class boundaries: _______________________________________________________
The class widths: ______________

2) Frequency distribution to show the marks in a test of 100 students.

Mark     Frequency
30-39    10
40-49    14
50-59    26
60-69    20
70-79    18
80-89    12

This distribution can be interpreted in two ways:
a) If discrete data:
The interval 40-49 means: _________________________________________________
The class boundaries: _______________________________________________________
The class widths: _______________
b) If continuous data:
The interval 40-49 means: _________________________________________________
The class boundaries: _______________________________________________________
The class widths: _______________

3) Frequency distribution to show the speeds of 50 cars passing a checkpoint.

Speed (km/h)    Frequency
20-30           2
30-40           7
40-60           20
60-80           16
80-100          5

The interval 30-40 means: _________________________________________________
The class boundaries: _______________________________________________________
The class widths: ______________

4) Frequency distribution to show the lengths of 50 telephone calls.

Length of call (min)    Frequency
0-                      9
3-                      12
6-                      15
9-                      10
12-                     4
18-                     0

The interval 3- means 3 min ≤ time < 6 min, so any time from 3 minutes up to (but not including) 6 minutes comes into this interval.
The class boundaries: _______________________________________________________
The class widths: ______________

5) Frequency distribution to show the masses of 40 packages brought to a particular counter.

Mass (g)    Frequency
-100        8
-250        10
-500        16
-800        6

The interval -250 means: _________________________________________________
The class boundaries: _______________________________________________________
The class widths: ______________

6) Frequency distribution to show the ages (in completed years)

Age (years)    Frequency
21-24          4
25-28          2
29-32          2
33-40          1
41-52          1

The interval 21-24: since the ages are given in completed years (not to the nearest year), 21-24 means 21 ≤ age < 25. Someone who is 24 years and 11 months old would come into this category.
The class boundaries: _______________________________________________________
The class widths: ______________
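The two conventions met in these examples can be compared side by side. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that a value recorded "to the nearest unit" extends half a unit either side of the stated limits (as with the heights in Example 2), while a value given in completed units runs from the lower limit up to, but not including, the next whole unit (as with the ages).

```python
def boundaries_nearest_unit(lower_limit, upper_limit, unit=1):
    """Class boundaries when values are rounded to the nearest unit."""
    half = unit / 2
    return lower_limit - half, upper_limit + half

def boundaries_completed_units(lower_limit, upper_limit, unit=1):
    """Class boundaries when values are given in completed units (e.g. ages)."""
    return lower_limit, upper_limit + unit

def class_width(boundaries):
    """Class width = upper class boundary - lower class boundary."""
    lower, upper = boundaries
    return upper - lower

# Heights recorded to the nearest cm: the class 120-124.
height_class = boundaries_nearest_unit(120, 124)
print(height_class, class_width(height_class))

# Ages in completed years: the class 21-24.
age_class = boundaries_completed_units(21, 24)
print(age_class, class_width(age_class))
```

The stated limits look similar in both cases, but the boundaries, and therefore the class widths, differ depending on how the data were recorded.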


TOPIC 1: DATA REPRESENTATIONS
Subtopic: Graphical Representation of Frequency Distribution


a) Histogram
A histogram is the graphical representation of a frequency distribution. It consists of a series of vertical bars with no gaps between them. Each bar represents the data in two ways:
i) The width of each bar is proportional to the class width that it represents.
ii) The area of each bar is proportional to the frequency of the class that it represents.

Steps for constructing a histogram:
i) Mark the class boundaries on the horizontal axis. The base of each bar runs from the lower class boundary to the upper class boundary.
ii) For equal class intervals, the heights of the rectangles are equal to the frequencies of their respective classes.
iii) For unequal class intervals, the area of each rectangle must be proportional to its class frequency, so the heights of the rectangles must be adjusted. The best way to do this is to calculate:

Frequency density = Frequency / Class width

We then use frequency density as the height of each rectangle.

Examples:
Draw a histogram using the information obtained from previous examples 1, 4 and 5.
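As a rough illustration of the frequency density calculation, the sketch below uses the telephone call data from example 4, where the classes have unequal widths. The function name `frequency_densities` is illustrative, and the open-ended class "18-" (frequency 0) is left out because it has no upper boundary.

```python
# Example 4 (telephone calls): class boundaries in minutes and class frequencies.
# The open-ended class "18-" has frequency 0 and is omitted (assumption of this sketch).
boundaries = [0, 3, 6, 9, 12, 18]
frequencies = [9, 12, 15, 10, 4]

def frequency_densities(boundaries, frequencies):
    """Frequency density = frequency / class width, for each class."""
    densities = []
    for lower, upper, freq in zip(boundaries[:-1], boundaries[1:], frequencies):
        width = upper - lower
        densities.append(freq / width)
    return densities

densities = frequency_densities(boundaries, frequencies)
for lower, upper, density in zip(boundaries[:-1], boundaries[1:], densities):
    print(f"{lower:>2} to {upper:<2}  frequency density = {density:.2f}")
```

Multiplying each density by its class width gives back the class frequency, which is exactly the statement that the area of each bar is proportional to the frequency.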

b) Frequency Polygon
A frequency polygon shows approximately the smooth curve that would describe a frequency distribution. One way to form a frequency polygon is to connect the midpoints of the tops of the bars of a histogram with line segments (or a smooth curve). Although the midpoints could easily be plotted without the histogram and joined by line segments, it is beneficial to show the histogram and the frequency polygon together. To complete the frequency polygon, a class with zero frequency is added before the first class and after the last class.

Examples:
Draw a frequency polygon using the information obtained from previous examples 1, 4 and 5.
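The points to plot for a frequency polygon can be listed before drawing. The sketch below is one possible way to do this for the heights in Example 2; it assumes equal class widths (so frequency can be used as the height), takes the class frequencies 1, 5, 7, 4, 3 from tallying Table 2, and uses the illustrative function name `polygon_points`.

```python
# Class boundaries (cm) and class frequencies for the heights in Example 2.
boundaries = [119.5, 124.5, 129.5, 134.5, 139.5, 144.5]
frequencies = [1, 5, 7, 4, 3]

def polygon_points(boundaries, frequencies):
    """Return (midpoint, frequency) points, with a zero-frequency class at each end."""
    width = boundaries[1] - boundaries[0]  # assumes every class has this same width
    midpoints = [(lower + upper) / 2
                 for lower, upper in zip(boundaries[:-1], boundaries[1:])]
    # Midpoints of the extra classes added before the first and after the last class.
    xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))

points = polygon_points(boundaries, frequencies)
for x, y in points:
    print(f"midpoint {x}: frequency {y}")
```

Joining these points in order with straight line segments gives the polygon; the two zero-frequency points bring it down to the horizontal axis at each end.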


c) Stem and Leaf Diagram
A very useful way of grouping data into classes while still retaining the original data is to draw a stem and leaf diagram, also known as a stemplot.

Example 1: These are the marks of 20 students in an assignment:

60 51 53 42 45 42 32 69 28 75
28 32 42 42 45 51 53 60 69 75

Note that the lowest mark is 28 and the highest mark is 75. In a stem and leaf diagram all the intervals must be of equal width, so it seems sensible to choose the intervals 20-29, 30-39, 40-49, 50-59, 60-69, 70-79 for this data. Let the stem represent the tens and the leaf represent the units.
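A stemplot of this kind can be sketched in a few lines, using the 20 marks exactly as listed in Example 1. The function name `stem_and_leaf` and the printed layout are illustrative choices, not a prescribed format.

```python
from collections import defaultdict

# Marks of the 20 students, exactly as listed in Example 1.
marks = [60, 51, 53, 42, 45, 42, 32, 69, 28, 75,
         28, 32, 42, 42, 45, 51, 53, 60, 69, 75]

def stem_and_leaf(data):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    leaves = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)  # e.g. 28 -> stem 2, leaf 8
        leaves[stem].append(leaf)
    # Keep every stem from lowest to highest, even an empty one,
    # so that all the intervals have equal width.
    return {stem: leaves.get(stem, [])
            for stem in range(min(leaves), max(leaves) + 1)}

plot = stem_and_leaf(marks)
for stem, row in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in row)}")
print("Key: 2 | 8 means 28")
```

Sorting the data first means the leaves on each row come out in increasing order, and the key on the last line explains how to read the diagram.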

Back-to-Back Stemplots
Example 2: The table gives the test marks for 31 students obtained during the months of May and July.

May 10 21 35 July 24 29 13 27 40 40 28 41 41 30 27 28 35 28 30 38 38 26 29 39 18 10 26 25 35 27 19 23 28 27 20 12 16 14 18 23 18 23 14 37 26 15 16 27 19 42 10 20 7 11 24 19 24 28 29 29 18 31 9

Draw a back-to-back stemplot to illustrate the data.

Note: The key is essential in explaining how the stemplot has been formed. In a stem and leaf diagram or stemplot:
a) Equal intervals must be chosen.
b) A key is essential.
