numbers. Statistical analysis is done to make sense of, and draw some inferences from, the data. There is a wide range of possible techniques that can be used. The following provides a brief summary of some of the most common techniques for summarising data, and explains when to use each one.
Summarising Data: Grouping and Visualising
The first thing to do with any data is to summarise it, which means to present it in a way that best tells the story. The starting point is usually to group the raw data into categories, and/or to visualise it. For example, if you think you may be interested in differences by age, the first thing to do is probably to group your data in age categories, perhaps ten- or five-year chunks. One of the most common techniques used for summarising is using graphs, particularly bar charts, which show every data point in order, or histograms, which are bar charts grouped into broader categories. An example is shown below, which uses three sets of data, grouped by four categories. This might, for example, be men, women, and no gender specified, grouped by age categories 20-29, 30-39, 40-49 and 50-59.
Bar Chart
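The grouping step described above can be sketched in a few lines of Python. This is only a minimal illustration: the ages are invented, and the helper name age_band and its width parameter are assumptions made for the example, not part of the original material.

```python
from collections import Counter

# Hypothetical raw ages, invented purely for illustration.
ages = [23, 35, 41, 29, 52, 38, 47, 31, 26, 58, 44, 33]


def age_band(age, width=10):
    """Return a label such as '20-29' for the band that contains `age`."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Count how many people fall into each ten-year band.
counts = Counter(age_band(a) for a in ages)

# Print a simple text bar chart, one row per band in ascending order.
for band in sorted(counts):
    print(f"{band}: {'#' * counts[band]}")
```

Passing width=5 to age_band would give five-year chunks instead of ten-year ones.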
An alternative to a histogram is a line chart, which plots each data point and joins them up with a line. The same data as in the bar chart are displayed in the line graph below.
Line Chart
Pie charts are best used when you are interested in the relative size of each group, and what proportion of the total fits into each category, as they illustrate very clearly which groups are bigger.
Pie Chart
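The proportions that a pie chart displays are simply each group's share of the total. A minimal Python sketch, using invented group counts, might look like this:

```python
# Hypothetical group counts, invented purely for illustration.
group_counts = {"20-29": 3, "30-39": 4, "40-49": 3, "50-59": 2}

total = sum(group_counts.values())

# Share of the total for each group, expressed as a percentage.
shares = {group: 100 * n / total for group, n in group_counts.items()}

for group, pct in shares.items():
    print(f"{group}: {pct:.1f}%")

# The group with the largest share is the biggest slice of the pie.
largest = max(shares, key=shares.get)
print("Largest group:", largest)
```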
Measures of Location: Averages
The average gives you information about the size of the effect of whatever you are testing, in other words, whether it is large or small. There are three measures of average: mean, median and mode. When people say "average", they usually mean the mean. It has the advantage that it uses all the data values obtained and can be used for further statistical analysis. However, it can be skewed by outliers, values which are atypically large or small. As a result, researchers sometimes use the median instead. This is the mid-point of all the data when the values are placed in order. The median is not skewed by extreme values, but it is harder to use for further statistical analysis. The mode is the most common value in a data set. It cannot be used for further statistical analysis. The values of mean, median and mode are generally not the same, which is why it is really important to be clear which average you are talking about.
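The effect of an outlier on the three averages can be seen in a short Python sketch using the standard statistics module. The data values are invented for illustration only.

```python
import statistics

# Hypothetical data set, invented purely for illustration.
values = [2, 3, 3, 4, 5, 6, 7]

# The same data with one atypically large value (an outlier) added.
with_outlier = values + [100]

for label, data in [("without outlier", values), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)      # uses every value, so outliers pull it
    median = statistics.median(data)  # mid-point of the ordered values
    mode = statistics.mode(data)      # most common value
    print(f"{label}: mean={mean:.2f}, median={median}, mode={mode}")
```

With these numbers the mean rises from about 4.29 to 16.25 when the outlier is added, while the median only moves from 4 to 4.5 and the mode stays at 3.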
KISS Principle: Keep It Simple and Straightforward
Exercise 1
What can be concluded from the statistics above?
Exercise 2
Chart: Sum of HSDPA User Num and Sum of HSUPA User Num (vertical axis from 0 to 20)
What else is lacking here?
Exercise 3
What are the other associated items to these results?
Summary & Findings
Causes and effects?
What is the problem?
What are the symptoms?
What is the root cause?
What are the possible issues related to it?
What are the possible solutions?
What are the gaps identified?
Is there an explanation for the problem?