Quantitative Methods For Business Management

MAS1403
Quantitative Methods for Business Management
Semester 1
Dr. Daniel Henderson
School of Mathematics & Statistics

MAS1403: Quantitative Methods for Business Management
2016/17
Lecturer: Dr. Daniel Henderson, Room 2.21 Herschel Building.
Email: daniel.henderson@ncl.ac.uk
www.mas.ncl.ac.uk/ndah6/teaching/MAS1403/
Lectures: Mondays at 12pm, in the Curtis Auditorium, Herschel Building.
Tutorials: One per week. There are 6 groups; check the module webpage to see which tutorial to attend.
Practicals: Occasionally. Check the full schedule overleaf for dates. These will take place instead of the tutorials.
Drop-in: Mon 1-2pm, Wed 1-2pm. Optional office hours where I will be available in my office for any help with the work.
Lecture notes and handouts

You will be provided with a booklet containing lecture notes and tutorial exercises.
You should bring your booklet to every class!
There will often be gaps in the lecture notes for you to complete during the lecture, so make sure you've got them with you!
All lecture notes, slides and solutions to tutorial exercises will be available to download from the course website (see above). There
should be a link to this website from within Blackboard. Some additional handouts may only be available in lectures and tutorials.
You will notice that my lecture slides are colour-coded: Green for "announcements", Blue for "listen and learn" and Red for "write"!
Assessment
Assessment for this course is via examination (60% at end of Semester 2), assignments (10% each semester) and computer-based
assessments (10% each semester). Ordinarily, if you fail this module you cannot proceed to Stage 2 of your degree!
Exam: May/June 2017. A two-hour, open-book, computer-based exam based on the whole course. Answer all questions.
Assignments: Dec 2016, May 2017. About three big questions in each, some of which will use your own personal datasets and
some of which will require you to use the computer package Minitab.
CBAs: Throughout the year. Three CBAs in each semester. Available in practice mode for one week and then exam
mode the next week. Some multiple choice questions, but mainly data response/calculations.
Every student will get a different set of questions from a bank of hundreds!
Must be done in your own time.
Late Work Policy:

It is not possible to extend submission deadlines for coursework in this module and no late work can be accepted. For details of the
policy (including procedures in the event of illness etc.) please look at the School web site:
http://www.ncl.ac.uk/maths/students/resources/late-missed/
Other Stuff
Email: Check your University email every day; announcements about the course will be made regularly!
Calculator: There is no way around it, you must have a scientific calculator for this course, and it must be on the University's
approved list! I recommend the Casio fx-85GT PLUS (about £10). You can get advice on how to use the Statistics
mode of your calculator in tutorials, and some video presentations on use of the calculator will be available from the
module webpage. You should bring your calculator to every class. You will be stuck without one!
MAS1403 - Provisional Schedule for Semester 1
Week 1 (week commencing 3/10/16) Topic 1: Data collection, display and summaries
Mon 3rd October Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 6th October Tutorial 10 - 11 Herschel Building, Lecture Theatre 3
Thu 6th October Tutorial 11 - 12 King George VI Building, Lecture Theatre 6
Fri 7th October Tutorial 9 - 10 Percy Building, G.13
Fri 7th October Tutorial 11 - 12 Herschel Building, Lecture Theatre 3
Week 2 (week commencing 10/10/16)
Mon 10th October Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 13th October Practical 10 - 11 Armstrong Building, 2.96 (PC)
Thu 13th October Practical 11 - 12 King George VI Building, Lawn cluster
Thu 13th October Practical 12 - 1 King George VI Building, Lawn cluster
Fri 14th October Practical 9 - 10 Herschel Building, Blue Zone - Herschel cluster
Fri 14th October Practical 10 - 11 Armstrong Building, 2.96 (PC)
Fri 14th October Practical 11 - 12 King George VI Building, Lawn cluster

CBA1 opens in practice mode

Fri 21st October Tutorial 9 - 10 Percy Building, G.13
Fri 21st October Tutorial 10 - 11 Percy Building, G.13
Fri 21st October Tutorial 11 - 12 Herschel Building, Lecture Theatre 3
Week 4 (week commencing 24/10/16) Topic 2: Probability and decision making

CBA1 opens in assessed mode; deadline: midnight Friday 28th October

Fri 28th October Tutorial 11 - 12 Herschel Building, Lecture Theatre 3
Mon 31st October Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 3rd November Tutorial 10 - 11 Herschel Building, Lecture Theatre 3
Thu 3rd November Tutorial 11 - 12 King George VI Building, Lecture Theatre 6
Thu 3rd November Tutorial 12 - 1 King George VI Building, Lecture Theatre 1
Fri 4th November Tutorial 9 - 10 Percy Building, G.13
Fri 4th November Tutorial 11 - 12 Herschel Building, Lecture Theatre 3

CBA2 opens in practice mode
Mon 7th November Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 10th November Tutorial 10 - 11 Herschel Building, Lecture Theatre 3
Thu 10th November Tutorial 11 - 12 King George VI Building, Lecture Theatre 6
Week 7 (week commencing 14/11/16) Topic 3: Probability models
CBA2 opens in assessed mode; deadline: midnight Friday 18th November
Assignment 1 available

Thu 17th November Practical 10 - 11 Armstrong Building, 2.96 (PC)
Thu 17th November Practical 11 - 12 King George VI Building, Lawn cluster
Thu 17th November Practical 12 - 1 King George VI Building, Lawn cluster
Fri 18th November Practical 9 - 10 Herschel Building, Blue Zone - Herschel cluster
Fri 18th November Practical 10 - 11 Armstrong Building, 2.96 (PC)
Fri 18th November Practical 11 - 12 King George VI Building, Lawn cluster
Mon 21st November Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 24th November Tutorial 10 - 11 Herschel Building, Lecture Theatre 3

Thu 1st December Tutorial 10 - 11 Herschel Building, Lecture Theatre 3
Thu 1st December Tutorial 11 - 12 King George VI Building, Lecture Theatre 6
Thu 1st December Tutorial 12 - 1 King George VI Building, Lecture Theatre 1
Fri 2nd December Tutorial 9 - 10 Percy Building, G.13
Fri 2nd December Tutorial 10 - 11 Percy Building, G.13
Fri 2nd December Tutorial 11 - 12 Herschel Building, Lecture Theatre 3

CBA3 opens in practice mode and assessed mode
Mon 5th December Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 8th December Tutorial 10 - 11 Herschel Building, Lecture Theatre 3
Thu 8th December Tutorial 11 - 12 King George VI Building, Lecture Theatre 6
Fri 9th December Tutorial 9 - 10 Percy Building, G.13
Fri 9th December Tutorial 11 - 12 Herschel Building, Lecture Theatre 3

Assignment 1 deadline: 4pm, Thursday 15th December
CBA3 deadline: midnight, Friday 16th December
Mon 12th December Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 15th December Tutorial 10 - 11 Herschel Building, Lecture Theatre 3
Fri 16th December Tutorial 11 - 12 Herschel Building, Lecture Theatre 3
Christmas vacation!
Week 12 (week commencing 9/1/17) Revision week
Mon 9th January Lecture 12 - 1 Herschel Building, Curtis Auditorium

Thu 12th January Tutorial 10 - 11 Herschel Building, Lecture Theatre 3
Thu 12th January Tutorial 11 - 12 King George VI Building, Lecture Theatre 6
Thu 12th January Tutorial 12 - 1 King George VI Building, Lecture Theatre 1
Fri 13th January Tutorial 9 - 10 Percy Building, G.13
Fri 13th January Tutorial 10 - 11 Percy Building, G.13
Fri 13th January Tutorial 11 - 12 Herschel Building, Lecture Theatre 3
MAS1403 Quantitative Methods for Business Management
1 Collecting and presenting data
1.1 Definitions
The quantities measured in a study are called random variables and a particular outcome is
called an observation. A collection of observations is the data. The collection of all possible
outcomes is the population.
We can rarely observe the whole population. Instead, we observe some subset of this called
the sample. The difficulty is in obtaining a representative sample.
Data/random variables are of different types:
  Qualitative (i.e. non-numerical)
    Categorical: outcomes take values from a set of categories, e.g. mode of transport to Uni:
      {car, metro, bus, walk, other}.
  Quantitative (i.e. numerical)
    Discrete: things that are countable, e.g. number of people taking this module.
    Ordinal: e.g. response to a questionnaire, from 1 (strongly disagree) to 5 (strongly agree).
    Continuous: things that we measure rather than count, e.g. height, weight, time.
Example 1

Identify the type of data described in each of the following examples:
(a) The time between emails arriving in your inbox is recorded.
(b) An opinion poll was taken asking people what is their favourite chocolate bar.
(c) The number of students attending a MAS1403 tutorial is recorded.
1.2 Sampling techniques

We typically aim for the sample to be representative of the population. The larger the sample
size the more precise information we have about the population.
There are three main types of sampling: random, quasi-random, non-random.
Simple random sampling (random)
Each element in the population is equally likely to be drawn into the sample.
All elements are put in a hat and the sample is drawn from the hat at random.
Advantages: easy to implement; each element has an equal chance of being selected.
Disadvantages: we often don't have a complete list of the population; not all elements
might be equally accessible; it is possible, purely by chance, to pick an unrepresentative
sample.
Stratified sampling (random)
We take a simple random sample from each stratum, or group, within the population.
The sample sizes are usually proportional to the population sizes.
Advantages: sampling within each stratum ensures that that stratum is properly
represented in the sample; simple random sampling within each stratum has the
advantages listed under simple random sampling above.
Disadvantages: need information on the size and composition of each group; as
with simple random sampling, we need a list of all elements within each stratum.
Systematic sampling (quasi-random)
The first element from the population is selected at random, and then every kth item
is chosen after this. This type of sampling is often used in a production line setting.
Advantages: its simplicity, which makes it easy to implement.
Disadvantages: not completely random; if there is a pattern in the production process
it is easy to obtain a biased sample; only really suited to structured populations.
Judgemental sampling (non-random)
The person interested in obtaining the data decides who should be surveyed; for
example, the head of a service department might suggest particular clients to survey
based on his judgement, and they might be people who he thinks will give him the
responses he wants!
Advantages: very focussed and aimed at the target population.
Disadvantages: relies on the judgement of the person conducting the questionnaire/survey,
and so cannot be guaranteed to be representative; is prone to bias.
Accessibility sampling (non-random)
Here, the most easily accessible elements are sampled.
Advantages: easy to implement.
Disadvantages: prone to bias.
Quota sampling (non-random)
Similar to stratified sampling, but uses judgemental sampling within each stratum instead
of random sampling. We sample within each stratum until our quotas have been
reached.
Advantages: results can be very accurate as this technique is very targeted.
Disadvantages: the identification of appropriate quotas can be problematic; this
sampling technique relies heavily on the judgement of the interviewer.
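The three random and quasi-random techniques above can be sketched in a few lines of Python. This is only an illustrative sketch: the population of 300 numbered items, the two group names and the fixed seed are assumptions made up for the example, and the function names are hypothetical rather than part of any package used on the module.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n distinct elements; every element is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, n, rng):
    """Simple random sample from each stratum, sized in proportion to the stratum.

    Rounding means the sizes may not always add up to exactly n.
    """
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)  # proportional allocation
        sample[name] = rng.sample(members, size)
    return sample

def systematic_sample(population, k, rng):
    """Random start among the first k elements, then every kth element."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(1403)  # fixed seed (an assumption) so the sketch is reproducible

# Hypothetical population: 300 items labelled 0..299, split into two groups.
population = list(range(300))
strata = {"group_a": population[:100], "group_b": population[100:]}

srs = simple_random_sample(population, 30, rng)
strat = stratified_sample(strata, 30, rng)
syst = systematic_sample(population, 10, rng)

print(len(srs), {name: len(s) for name, s in strat.items()}, len(syst))
```

With groups of size 100 and 200, a stratified sample of 30 takes 10 and 20 elements respectively, mirroring the "proportional to the population sizes" rule.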
Example 2
(a) A toy company, Toys 4 U, is to be inspected for the quality and safety of the toys it produces.
The inspection team takes a sample of toys from the production line by choosing the first
toy at random, and then selecting every 100th toy thereafter. What form of sampling are the
team using?
(b) Another inspection team is to investigate the quality of the smartphone covers made by a
local factory. In a typical working day the factory produces 100 covers for the new i-Phone
and 200 covers for the latest Samsung phone. Suggest a suitable form of sampling to check
the quality of the smartphone covers produced.
Solution

1.3 Frequency tables

Once we have collected our data, often the first stage of any analysis is to present them in a
simple and easily understood way. Tables are perhaps the simplest means of presenting data.
The way we construct the table depends on the type of data.
Example: discrete data

The following table shows the raw data for car sales at a new car showroom over a two week
period in July.
Date        Cars Sold    Date         Cars Sold
1st July    9            8th July     10
2nd July    8            9th July     5
3rd July    6            10th July    8
4th July    7            11th July    4
Presenting these data in a relative frequency table by number of days on which different numbers
of cars were sold, we get the following table:

Cars Sold Tally Frequency Relative Frequency %
Totals
Example: continuous data

The following data set represents the service time in seconds for callers to a credit card call
centre.
196.3 199.7 206.7 203.8 203.1

200.8 201.3 205.6 181.6 201.7
180.2 193.3 188.2 199.9 204.7
We can present these data in a relative frequency table as follows:

Class Interval      Tally     Frequency   Relative Frequency %
180 ≤ time < 185    ||        2           13.33
185 ≤ time < 190    |         1           6.67
190 ≤ time < 195    |         1           6.67
195 ≤ time < 200    |||       3           20.00
200 ≤ time < 205    |||| |    6           40.00
205 ≤ time < 210    ||        2           13.33
Totals                        15          100
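As a check on the table above, the same counts can be reproduced with a short Python sketch. The function name and its parameters are hypothetical choices for this illustration; the class intervals follow the "lower ≤ time < upper" convention used in the table.

```python
# Service times (seconds) from the call centre example above.
service_times = [196.3, 199.7, 206.7, 203.8, 203.1,
                 200.8, 201.3, 205.6, 181.6, 201.7,
                 180.2, 193.3, 188.2, 199.9, 204.7]

def grouped_frequency_table(data, start, width, n_classes):
    """Return (lower, upper, frequency, relative frequency %) for each class."""
    rows = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        freq = sum(1 for x in data if lower <= x < upper)  # lower <= x < upper
        rel = 100 * freq / len(data)
        rows.append((lower, upper, freq, rel))
    return rows

table = grouped_frequency_table(service_times, start=180, width=5, n_classes=6)
for lower, upper, freq, rel in table:
    print(f"{lower} <= time < {upper}  {'|' * freq:<6}  {freq:>2}  {rel:6.2f}")
```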
1.4 Exercises
1. Identify the type of data described in each of the following examples:
(a) An opinion poll was taken asking people which party they would vote for in a general
election.
(b) In a steel production process the temperature of the molten steel is measured and recorded
every 60 seconds.
(c) A market researcher stops you in Northumberland Street and asks you to rate between 1
(disagree strongly) and 5 (agree strongly) your response to opinions presented to you.
(d) The hourly number of units produced by a beer bottling plant is recorded.
2. A credit card company wants to investigate the spending habits of its customers. From its
lists, the first customer is selected at random; thereafter, every 30th customer is selected.
(a) Is this an example of simple random sampling, stratified sampling, systematic sampling,
or judgemental sampling?
(b) Is this form of sampling random, quasi-random or non-random?
3. The number of telephone calls made by 20 students in a day is shown below.
3 5 1 0 0 2 1 0 3 1 4 3 2 0 1 1 1 2 0 4
Put these data into a relative frequency table.
4. The following data are the recorded length (in seconds) of 25 mobile phone calls made by
one student.
281.4 293.4 306.5 286.6 298.4

312.7 327.7 311.5 314.8 303.3
270.7 293.9 310.9 346.4 304.6
304.1 320.7 283.6 337.5 259.6
305.4 317.9 289.5 286.9 300.5
Complete the following percentage relative frequency table for these data.
Class Interval      Tally   Frequency   Relative Frequency %

250 ≤ time < 270
270 ≤ time < 290
290 ≤ time < 310
310 ≤ time < 330
330 ≤ time < 350
Totals                      25          100
2 Graphical methods for presenting data

Once we have collected our data, often the best way to summarise them is through an appropriate
graph. Graphs are more eye-catching than tables, and give us an "at-a-glance" picture
of the main features of our data: its distribution, location, spread, outliers etc.
2.1 Stem-and-leaf plots

Example 1
The observations below are the recorded time it takes to get through to an operator at a telephone
call centre (in seconds).
54 56 50 67 55 38 49 45 39 50
45 51 47 53 29 42 44 61 51 50
30 39 65 54 44 54 72 65 58 62
Represent the data in a stem-and-leaf plot.
Stem Leaf
n= stem unit = leaf unit =
Some notes on stem-and-leaf plots.
Always show the stem units and the leaf units.
The stem unit will usually be either 10 or 1; the corresponding leaf unit is then
1 or 0.1 respectively.
Order the leaves from smallest to largest.
If you have observations recorded to 2 d.p., always round down, e.g. 2.97 would become
2.9 rather than 3.0.
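The construction can also be sketched in Python, here for the call centre waiting times in Example 1. This is a minimal sketch that assumes whole-number observations with stem unit 10 and leaf unit 1; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(data, stem_unit=10):
    """Group whole-number observations into stems with leaves in increasing order."""
    plot = defaultdict(list)
    for x in sorted(data):               # sorting first keeps the leaves ordered
        plot[x // stem_unit].append(x % stem_unit)
    return dict(plot)

times = [54, 56, 50, 67, 55, 38, 49, 45, 39, 50,
         45, 51, 47, 53, 29, 42, 44, 61, 51, 50,
         30, 39, 65, 54, 44, 54, 72, 65, 58, 62]

plot = stem_and_leaf(times)
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
print(f"n = {len(times)}, stem unit = 10, leaf unit = 1")
```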
2.2 Bar charts

A commonly used and clear way of presenting categorical data or any ungrouped discrete data.
Example 2
The following frequency table represents the modes of transport used daily by 30 students to
get to university.
Mode Frequency
Car 10
Walk 7
Bike 4
Bus 4
Metro 4
Train 1
Total 30
This gives the following bar chart:
[Bar chart: Frequency (vertical axis) for each mode of transport (Car, Walk, Bike, Bus, Metro, Train).]
This bar chart clearly shows that the most popular mode of transport is the car and the least
popular is the train (in our small sample).
2.3 Histograms
Histograms can be thought of as bar charts for continuous data. First construct a grouped
frequency table then draw a bar for each class interval. Important point: unlike bar charts, there
are no gaps between the bars in a histogram.
Example 3
The following frequency table summarises the service times (in seconds) at a telephone call
centre.
Service time        Frequency   Relative Frequency (%)
175 ≤ time < 180    1           2
180 ≤ time < 185    3           6
185 ≤ time < 190    3           6
190 ≤ time < 195    6           12
195 ≤ time < 200    10          20
200 ≤ time < 205    12          24
205 ≤ time < 210    8           16
210 ≤ time < 215    3           6
215 ≤ time < 220    3           6
220 ≤ time < 225    1           2
Totals              50          100
The histogram for these data is:
[Figure: two histograms of the service times, with Time (s) from 175 to 225 on the horizontal axis. The left panel has Frequency on the vertical axis; the right panel has Relative frequency (%) on the vertical axis.]
We can also plot relative frequency (%) on the vertical axis: this gives a percentage relative
frequency histogram. These are useful for comparing datasets of different sizes.
2.4 Relative frequency polygons

The relative frequency polygon is exactly the same as the relative frequency histogram, but
instead of having bars we join the midpoints of the top of each bar with a straight line. These
are useful for illustrating the relative differences between two or more groups.
Example 4
Consider the following data on gross weekly income (in £) collected from two sites in Newcastle.
Weekly Income (£)       West Road (%)   Jesmond Road (%)
0 ≤ income < 100        9.3             0.0
100 ≤ income < 200      26.2            0.0
200 ≤ income < 300      21.3            4.5
300 ≤ income < 400      17.3            16.0
400 ≤ income < 500      11.3            29.7
500 ≤ income < 600      6.0             22.9
600 ≤ income < 700      4.0             17.7
700 ≤ income < 800      3.3             4.6
800 ≤ income < 900      1.3             2.3
900 ≤ income < 1000     0.0             2.3
The following plot shows percentage relative frequency polygons for the two groups.
Example comments: The distribution of incomes on West Road is skewed towards lower values,
whilst those on Jesmond Road are more symmetric. The graph clearly shows that income
in the Jesmond Road area is higher than that in the West Road area. The spread of incomes is
roughly the same in the two areas. There are no obvious outliers.
2.5 Cumulative frequency polygons

These are very useful for comparing datasets.
Construct a percentage relative frequency table for your data.
Add a cumulative column by adding up the percentages as you go along.
Plot the upper endpoint of each class interval against the cumulative value.
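The cumulative column in these steps can be sketched in Python using the income percentages from Example 4 in Section 2.4. This is an illustrative sketch only; the variable and function names are hypothetical, and the cumulative values are rounded to one decimal place to avoid floating-point noise.

```python
from itertools import accumulate

# Upper endpoint of each class interval (weekly income in pounds).
upper_endpoints = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

# Percentage relative frequencies from the income table in Section 2.4.
west_road = [9.3, 26.2, 21.3, 17.3, 11.3, 6.0, 4.0, 3.3, 1.3, 0.0]
jesmond_road = [0.0, 0.0, 4.5, 16.0, 29.7, 22.9, 17.7, 4.6, 2.3, 2.3]

def cumulative_percentages(percentages):
    """Running total of the percentages, rounded to 1 d.p."""
    return [round(total, 1) for total in accumulate(percentages)]

cum_west = cumulative_percentages(west_road)
cum_jesmond = cumulative_percentages(jesmond_road)

# Each (upper endpoint, cumulative %) pair is one point on the polygon.
for endpoint, w, j in zip(upper_endpoints, cum_west, cum_jesmond):
    print(f"income < {endpoint:>4}: West Road {w:5.1f}%   Jesmond Road {j:5.1f}%")
```

At every endpoint the Jesmond Road cumulative percentage is no larger than the West Road one, which is what a line shifted to the right looks like in numbers.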
Example 5
The following plot contains the cumulative frequency polygons for the income data at both the
West Road and Jesmond Road sites.
It clearly shows the line for Jesmond Road is shifted to the right of that for West Road. This tells
us that the surveyed incomes are higher on Jesmond Road. We can compare the percentages of
people earning different income levels between the two sites quickly and easily.
2.6 Scatter plots

Scatter plots are used to plot two variables which you believe might be related, for example,
advertising expenditure and sales.
Example 6
The following data represents monthly output and total costs at a factory.
Total costs (£)   Monthly output (units)

10,300 2,400
12,000 3,900
12,000 3,100
13,500 4,500
12,200 4,100
14,200 5,400
10,800 1,100
18,200 7,800
16,200 7,200
19,500 9,500
17,100 6,400
19,200 8,300
For scatter plots, we comment on whether there is a linear association between the two variables.
If so, is it positive (uphill) or negative (downhill)? Is the association strong, moderate
or weak?
The plot above shows a clear positive, roughly linear, relationship between the two variables:
the more units made, the more it costs in total.
2.7 Time Series Plots

Data collected over time can be plotted by using a scatter plot, but with time as the (horizontal)
x-axis, and where the points are connected by lines: a time series plot.
Example 7
Consider the following data on the number of computers sold (in thousands) by quarter (January-
March, April-June, July-September, October-December) at a large warehouse outlet, starting in
quarter 1 2000.
Q1 Q2 Q3 Q4
2000 86.7 94.9 94.2 106.5
2001 105.9 102.4 103.1 115.2
2002 113.7 108.0 113.5 132.9
2003 126.3 119.4 128.9 142.3
2004 136.4 124.6 127.9
The time series plot is:
For time series plots, look out for trend and seasonal cycles in the data. Also look out for any
outliers.
The above plot clearly shows us two things: firstly, that there is an upwards trend to the data
(sales increase over time), and secondly that there is some regular variation around this trend
(sales are usually higher in quarters 1 and 4 than in quarters 2 and 3).
2.8 Exercises
1. The following table shows the weight (in kilograms) of 50 sacks of potatoes leaving a farm
shop (the data have been ordered from smallest to largest).
8.1 8.2 8.5 8.7 8.8

8.9 9.2 9.3 9.3 9.4
9.5 9.5 9.6 9.6 9.6
9.7 9.7 9.9 9.9 10.0
10.0 10.0 10.0 10.0 10.1
10.2 10.2 10.2 10.3 10.3
10.4 10.4 10.4 10.5 10.6
10.6 10.6 10.6 10.6 10.7
10.8 10.9 11.0 11.2 11.3
11.3 11.3 11.5 11.6 12.8
Display these data in a stem and leaf plot. State clearly both the stem and the leaf units.
Comment on the distribution of the data.
2. Which is more suitable for representing the data from Question 1 (above), a bar chart or a
histogram? Justify your answer.
3. A small clothes shop has records of daily sales both before and after a local radio advertising
campaign. Relative frequency polygons of the sales data are shown below.
[Figure: Relative frequency polygons of sales (before and after). Vertical axis: Rel. freq. (%); horizontal axis: Daily sales (£), from 0 to 10000; two lines, Before and After.]
Comment, with justification, on the success, or otherwise, of the advertising campaign.
3 Numerical summaries for data

Numerical summaries are numbers which summarise the main features of your data. You should
use both a measure of location and a measure of spread to summarise your dataset.
3.1 Measures of location

A measure of location is a value which is typical of the observations in our sample.
1. The mean
The sample mean is the average of our data: the total divided by the sample size. It is given
by the formula
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i , \]
which, put more simply, means "add them up and divide by how many you've got".
Example 1
Suppose we ask 7 Stage 2 Business Management students how many units of alcohol they drank
last week and get: 16, 52, 0, 6, 10, 0, 21. The sample mean alcohol consumption of these n = 7
students is
If your data are given in the form of a frequency table, then you multiply each observation by
its frequency, add these numbers together and then divide by how many you've got. If you
have a grouped frequency table, then you don't know the value of each observation and so just
use the midpoint of the class interval.
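Both versions of the calculation can be sketched in Python. The first uses the raw alcohol data from Example 1; the second uses the grouped call centre table from Section 1.3, with each class represented by its midpoint. The function names are hypothetical, and the grouped result is only an approximation to the mean of the raw data.

```python
def mean(data):
    """Add the observations up and divide by how many there are."""
    return sum(data) / len(data)

def mean_from_frequency_table(values, frequencies):
    """Multiply each value by its frequency, add up, divide by the total frequency."""
    total = sum(v * f for v, f in zip(values, frequencies))
    return total / sum(frequencies)

# Raw data: units of alcohol for the n = 7 students in Example 1.
units = [16, 52, 0, 6, 10, 0, 21]
sample_mean = mean(units)

# Grouped data: classes 180-185, ..., 205-210 from Section 1.3, using midpoints.
midpoints = [182.5, 187.5, 192.5, 197.5, 202.5, 207.5]
frequencies = [2, 1, 1, 3, 6, 2]
grouped_mean = mean_from_frequency_table(midpoints, frequencies)

print(sample_mean, round(grouped_mean, 2))
```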
2. The median
This is just the observation "in the middle", when the data are put into order from smallest to
largest:
\[ \text{median} = \left( \frac{n+1}{2} \right)^{\text{th}} \text{ smallest observation.} \]
Example 2
Ordering the student alcohol data from the previous example gives 0, 0, 6, 10, 16, 21, 52.
Clearly the middle value is 10, so the median is 10 units per week.
Example 3
Suppose we also asked four Stage 2 Marketing and Management students how many units of

alcohol they drank last week, and got: 21,0,12,14. Calculate the median.
Solution
The median is often used if the dataset has an asymmetric profile, since it is not distorted by
extreme observations (outliers).
3. The mode
The mode is simply the most frequently occurring observation. For example, consider the
following data: 2, 2, 2, 3, 3, 4, 5. The mode is 2 as it occurs most often. The modal class is
easily obtained from a grouped frequency table or a histogram; it is the class with the highest
frequency.
3.2 Measures of spread

A measure of spread quantifies how spread out (or how variable) our data are.
1. The range
Range = largest value − smallest value. For example, the range of the data: 2, 2, 2, 3, 3, 4, 5 is
5 − 2 = 3.
Advantage: very simple to calculate.
Disadvantages: sensitive to extreme observations; only suitable for comparing (roughly)

equally sized samples.
2. The inter-quartile range (IQR)

The IQR measures the range of the middle half of the data, and so is less affected by extreme
observations. It is given by Q3 − Q1, where
\[ Q_1 = \frac{(n+1)}{4}\text{th smallest observation (lower quartile)} \]
\[ Q_3 = \frac{3(n+1)}{4}\text{th smallest observation (upper quartile).} \]
Example 4
Calculate the inter-quartile range for the following data.
8.7, 9.0, 9.0, 9.2, 9.3, 9.3, 9.5, 9.6, 9.6, 9.6, 9.7, 9.7, 9.9, 10.3, 10.4, 10.5, 10.7, 10.8
Solution
n = 18, so the position of Q1 is (18 + 1)/4 = 4.75, therefore
\[ Q_1 = 9.2 + 0.75 \times (9.3 - 9.2) = 9.2 + 0.075 = 9.275. \]
Similarly, the position of Q3 is 3 × (18 + 1)/4 = 14.25, therefore
\[ Q_3 = 10.3 + 0.25 \times (10.4 - 10.3) = 10.3 + 0.025 = 10.325. \]
And so
\[ \text{IQR} = Q_3 - Q_1 = 10.325 - 9.275 = 1.05. \]
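The position rule used for the median and the quartiles can be sketched as one small Python function. This is an illustrative sketch: the function name is hypothetical, and it assumes that a fractional position is handled by linear interpolation between the two neighbouring observations, as in the solution above.

```python
def value_at_position(data, p):
    """Return the p(n + 1)th smallest observation, interpolating if needed.

    p = 0.25 gives Q1, p = 0.5 the median and p = 0.75 gives Q3.
    """
    x = sorted(data)
    n = len(x)
    position = p * (n + 1)
    k = int(position)        # whole part: the kth smallest observation
    frac = position - k      # fractional part: how far towards the next one
    if k < 1:
        return x[0]
    if k >= n:
        return x[-1]
    return x[k - 1] + frac * (x[k] - x[k - 1])

data = [8.7, 9.0, 9.0, 9.2, 9.3, 9.3, 9.5, 9.6, 9.6, 9.6,
        9.7, 9.7, 9.9, 10.3, 10.4, 10.5, 10.7, 10.8]

q1 = value_at_position(data, 0.25)
q3 = value_at_position(data, 0.75)
iqr = q3 - q1

# The same function gives the median of the alcohol data from Example 2.
median_units = value_at_position([16, 52, 0, 6, 10, 0, 21], 0.5)

print(round(q1, 3), round(q3, 3), round(iqr, 3), median_units)
```

Other software may use slightly different quartile conventions, so results from a calculator or statistics package can differ a little from this rule.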
3. The variance and standard deviation

The sample variance is the standard measure of spread used in statistics. It can be thought of as
"the average squared deviation from the mean", and is given by
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 . \]
The following formula is easier for calculations:
\[ s^2 = \frac{1}{n-1} \left\{ \sum_{i=1}^{n} x_i^2 - (n \times \bar{x}^2) \right\} . \]
In practice most people simply use the Statistics mode on their calculator (mode SD or Stat).
The sample standard deviation is just the square root of the variance, and is often preferred as
it is in the original units of the data.
Example 5
Consider again the data on the number of units of alcohol consumed by a sample of 7 students
last week: 16, 52, 0, 6, 10, 0, 21. Calculate the sample variance and the sample standard
deviation.
Solution
We have already calculated the sample mean as \bar{x} = 15. Now
\[ \sum x^2 = 16^2 + 52^2 + 0^2 + 6^2 + 10^2 + 0^2 + 21^2 = 3537 \]
\[ n(\bar{x})^2 = 7 \times 15^2 = 1575 \]
and so the sample variance is
\[ s^2 = \frac{1}{7-1}(3537 - 1575) = \frac{1962}{6} = 327 \]
and the sample standard deviation is
\[ s = \sqrt{s^2} = \sqrt{327} = 18.08 \text{ units per week.} \]
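The calculation in Example 5 can be checked with a short Python sketch that evaluates both the definition and the calculation formula, and compares them with the standard library's `statistics.variance`, which also uses the n − 1 divisor. The variable names are hypothetical choices for this illustration.

```python
import math
import statistics

units = [16, 52, 0, 6, 10, 0, 21]
n = len(units)
x_bar = sum(units) / n

# Definition: squared deviations from the mean, divided by n - 1.
var_definition = sum((x - x_bar) ** 2 for x in units) / (n - 1)

# Calculation formula: sum of squares minus n times the squared mean, over n - 1.
sum_of_squares = sum(x ** 2 for x in units)
var_shortcut = (sum_of_squares - n * x_bar ** 2) / (n - 1)

# The standard deviation is the square root of the variance.
sd = math.sqrt(var_shortcut)

print(sum_of_squares, var_definition, var_shortcut, round(sd, 2))
print(statistics.variance(units))  # library value, for comparison
```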
3.3 Box plots

Box plots (or box and whisker plots) are another graphical method for displaying data.
Example 6
Suppose that, from our data, we obtain the following summary statistics:
Minimum Lower Quartile (Q1) Median (Q2) Upper Quartile (Q3) Maximum
10 40 43 45 50
A box plot is constructed as follows.
Box plots are particularly useful for highlighting differences between groups.
Example 7
It clearly shows that although there is overlap between the three sets of data, the first and second
datasets contain roughly similar responses and that these are quite different from those in the
third set. Note that the asterisks (*) at the ends of the whiskers are the way Minitab highlights
outlying values.
3.4 Exercises
1. Recall the following data from Exercise 1 in Chapter 2 on the weight (in kg) of 50 sacks of
potatoes leaving a farm shop.
8.1 8.2 8.5 8.7 8.8

8.9 9.2 9.3 9.3 9.4
9.5 9.5 9.6 9.6 9.6
9.7 9.7 9.9 9.9 10.0
10.0 10.0 10.0 10.0 10.1
10.2 10.2 10.2 10.3 10.3
10.4 10.4 10.4 10.5 10.6
10.6 10.6 10.6 10.6 10.7
10.8 10.9 11.0 11.2 11.3
11.3 11.3 11.5 11.6 12.8
(a) Calculate the mean of the data.

(b) Calculate the median of the data.
(c) Calculate the range of the data.
(d) Calculate the interquartile range.
(e) Calculate the sample standard deviation.
(f) Draw a box plot for these data and comment on it.
(g) Put the data in a grouped frequency table.
(h) Find the modal class.
2. Chloe collected the following data on the weight, in grams, of large chocolate chip cookies
produced by Millie's Cookie Company.
27.1 22.4 26.5 23.4 25.6 26.3 51.3 24.9 26.0 25.4
To summarise, Chloe was going to calculate the mean and standard deviation for this sample.
However, her friend Mark warned her that the mean and standard deviation might be
inappropriate measures of location and spread for these data.
(a) Do you agree with Mark? If so, why?

(b) Calculate measures of location and spread that you feel are more suitable.
3. An internet marketing firm was interested in the amount of time customers spend on their
website. They recorded the lengths of visits to the website for a sample of 100 customers
and whether the customer was male or female. The standard deviations of the lengths of
visits were 12.2 seconds for males and 18.5 seconds for females. Which group has the more
variable visit lengths, based on this sample, males or females?
4 Introduction to Probability
4.1 Definitions
An experiment is an activity where we do not know for certain what will happen, but we will
observe what happens. An outcome is one of the possible things that can happen. The sample
space is the set of all possible outcomes. An event is a set of outcomes.
All probabilities are measured on a scale ranging from zero to one, and can be expressed as
fractions, decimal numbers or percentages.
Notation: P(A) represents the probability of the event A, e.g. P(Rain tomorrow). P(\bar{A}) is the
probability that A does not occur ("not A").
The collection of all possible outcomes, that is the sample space, has a probability of 1. Two
events are said to be mutually exclusive if both cannot occur simultaneously. Two events are
said to be independent if the occurrence of one does not affect the probability of the other
occurring.

Example 1
Do you think the following pairs of events are independent?
A: Molly plays table tennis, and B: Molly is good at maths
C: Henry gets over 60 in MAS1403, and D: Henry gets under 40 in MAS1403
4.2 Measuring probability

1. Classical interpretation
Used when all possible outcomes are equally likely. In general, calculations follow from the
formula
\[ P(\text{Event}) = \frac{\text{Total number of outcomes in which event occurs}}{\text{Total number of possible outcomes}} . \]
2. Frequentist interpretation
When the outcomes of an experiment are not equally likely, we can perform the same experiment
a large number of times and observe the outcome. The probability of an event can be
estimated using the following formula:
\[ P(\text{Event}) = \frac{\text{Number of times an event occurs}}{\text{Total number of times experiment performed}} . \]
3. Subjective interpretation
Probabilities are formulated subjectively using an individual's (sometimes expert) opinion.
(Useful when the experiment can't be repeated.) For example, when we board an aeroplane,
we judge the probability of it crashing to be sufficiently small that we are happy to undertake
the journey.
4.3 Examples
1. Chicken King is a fast-food chain with 700 outlets in the UK. The geographic location of its
restaurants is tabulated below:
                            Region
Population          NE     SE     SW     NW    Total
Under 10,000        35     42     21     70    168
10,000-100,000      70    105     84     35    294
Over 100,000       175     28     35      0    238
Total              280    175    140    105    700
A health and safety organisation selects a restaurant at random for a hygiene inspection.
Assuming that each restaurant is equally likely to be selected, calculate the following
probabilities.
(a) P (NE restaurant chosen),

(b) P (Restaurant chosen from a city with a population over 100,000),
(c) P (SW and city with a population under 10,000).
Solution
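One way to organise the calculation is the classical formula: count the restaurants in which the event occurs and divide by the 700 equally likely restaurants. The Python sketch below does this with exact fractions; the dictionary layout and the function name are hypothetical choices for this illustration.

```python
from fractions import Fraction

# Restaurant counts from the table: population band -> region -> count.
counts = {
    "Under 10,000":   {"NE": 35,  "SE": 42,  "SW": 21, "NW": 70},
    "10,000-100,000": {"NE": 70,  "SE": 105, "SW": 84, "NW": 35},
    "Over 100,000":   {"NE": 175, "SE": 28,  "SW": 35, "NW": 0},
}
total = sum(sum(row.values()) for row in counts.values())

def probability(event):
    """Classical probability: favourable outcomes / all outcomes.

    `event` is a function (population, region) -> bool.
    """
    favourable = sum(n for population, row in counts.items()
                     for region, n in row.items() if event(population, region))
    return Fraction(favourable, total)

p_ne = probability(lambda population, region: region == "NE")
p_over = probability(lambda population, region: population == "Over 100,000")
p_sw_under = probability(
    lambda population, region: region == "SW" and population == "Under 10,000")

print(p_ne, p_over, p_sw_under)
print(float(p_ne), float(p_over), float(p_sw_under))
```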
2. The spinner shown below is spun once.
Assuming each sector on the board is the same size, calculate the following probabilities.
(a) P (lands on a red shape) =
(b) P (lands on a triangle) =
(c) P (lands on a 4-sided shape) =
3. On the probability scale, how likely do you think it is that Newcastle United will be promoted
this season? Which approach to probability would you use to estimate this?

4.4 The addition rule

The addition rule describes the probability of any of two or more events occurring. The addition
rule for two events A and B is
$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B).$$
This describes the probability of either event A or event B (or both) happening.

Example 2
Prospective interns at internet startup BlueFox face two aptitude tests. If 35 percent of applicants
pass the first test, 25 percent pass the second test, and 15 percent pass both tests, what percentage
of applicants pass at least one test?
Solution

We are told P(pass 1st test) = 0.35, P(pass 2nd test) = 0.25 and P(pass 1st and 2nd test) =
0.15. Therefore, using the addition rule,
$$\begin{aligned}
P(\text{pass at least one test}) &= P(\text{pass 1st or 2nd test}) \\
&= P(\text{pass 1st test}) + P(\text{pass 2nd test}) - P(\text{pass 1st and 2nd test}) \\
&= 0.35 + 0.25 - 0.15 \\
&= 0.45.
\end{aligned}$$
So 45% of the applicants pass at least one of the tests.
Note: if events A and B are mutually exclusive then $P(A \text{ and } B) = 0$, since A and B can't
occur together. Therefore,
$$P(A \text{ or } B) = P(A) + P(B).$$
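The addition rule is easy to check numerically. The Python sketch below is illustrative only: the function name `prob_a_or_b` is our own choice rather than part of the course material, and exact fractions are used so that the BlueFox arithmetic is not blurred by floating-point rounding.

```python
from fractions import Fraction


def prob_a_or_b(p_a, p_b, p_a_and_b=0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    Leave p_a_and_b at 0 when A and B are mutually exclusive.
    """
    return p_a + p_b - p_a_and_b


# BlueFox example: 35% pass the first test, 25% the second, 15% both.
p_first = Fraction(35, 100)
p_second = Fraction(25, 100)
p_both = Fraction(15, 100)

p_at_least_one = prob_a_or_b(p_first, p_second, p_both)
print(float(p_at_least_one))  # 0.45
```

Subtracting P(A and B) stops the applicants who pass both tests from being counted twice.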
4.5 Exercises
1. Do you think the following pairs of events are independent or dependent? Explain.
(a) E: An individual has a high IQ

F : An individual is accepted for a University place
(b) E1 : An individual has a large outstanding credit card debt
E2 : An individual is allowed to extend his bank overdraft
2. The following data refer to a class of 18 students. Suppose that we will choose one student
at random from this class.
Student       Height  Weight  Shoe      Student       Height  Weight  Shoe
Number   Sex  (m)     (kg)    Size      Number   Sex  (m)     (kg)    Size
1        M    1.91    70      11.0      10       M    1.78    76       8.5
2        F    1.73    89       6.5      11       M    1.88    64       9.0
3        M    1.73    73       7.0      12       M    1.88    83       9.0
4        M    1.63    54       8.0      13       M    1.70    55       8.0
5        F    1.73    58       6.5      14       M    1.76    57       8.0
6        M    1.70    60       8.0      15       M    1.78    60       8.0
7        M    1.82    76      10.0      16       F    1.52    45       3.5
8        M    1.67    54       7.5      17       M    1.80    67       7.5
9        F    1.55    47       4.0      18       M    1.92    83      12.0
Find the probabilities for the following events.
(a) The student is female.

(b) The student's weight is greater than 70 kg.
(c) The student's weight is greater than 70 kg and the student's shoe size is greater than 8.
(d) The student's weight is greater than 70 kg or the student's shoe size is greater than 8.
3. The regional manager of supermarket Freshco is interested in predicting sales patterns of
breakfast cereal. If 85% of Freshco customers buy branded cereals (e.g. Kellogg's etc.), 60%
of customers buy Freshco's own-brand cereals, and 50% of customers buy both branded
and Freshco's own-brand cereal, what percentage of Freshco customers do not buy breakfast
cereal?
5 Conditional probability
5.1 The multiplication rule

The multiplication rule describes the probability of two (or more) events occurring. The
probability of two events A and B both occurring is
$$P(A \text{ and } B) = P(A) \times P(B \mid A),$$
where $P(B \mid A)$ is the conditional probability of B given that A has already happened.
Example 1
A small company has 10 employees: 4 male and 6 female. You, as the manager, select two
employees at random to attend a training session. What is the probability that you select two
female employees?
Solution

Rearranging the above expression for the multiplication rule gives a formula for calculating a
conditional probability:
$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}.$$
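A minimal Python sketch of the multiplication rule and its rearrangement, using the figures from Example 1 (6 of the 10 employees are female, and the second employee is chosen from the remaining 9). The function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def prob_a_and_b(p_a, p_b_given_a):
    """Multiplication rule: P(A and B) = P(A) x P(B|A)."""
    return p_a * p_b_given_a


def prob_b_given_a(p_a_and_b, p_a):
    """Rearranged rule: P(B|A) = P(A and B) / P(A)."""
    return p_a_and_b / p_a


# First pick: 6 females out of 10 employees.
p_first_female = Fraction(6, 10)
# Second pick, given the first was female: 5 females left out of 9.
p_second_female_given_first = Fraction(5, 9)

p_two_females = prob_a_and_b(p_first_female, p_second_female_given_first)
print(p_two_females)  # 1/3

# Dividing back out recovers the conditional probability.
print(prob_b_given_a(p_two_females, p_first_female))  # 5/9
```

The conditional probability changes between picks because the selection is made without replacement.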
Example 2
Recall that prospective interns at internet startup BlueFox face two aptitude tests. If 35 percent
of applicants pass the first test, 25 percent pass the second test, and 15 percent pass both tests,
what percentage of applicants pass the second test given that they passed the first test?
Solution

Independent events: two events A and B are independent if $P(B \mid A) = P(B)$, in which case
$$P(A \text{ and } B) = P(A) \times P(B).$$
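One way to test for independence is to compare P(A and B) with P(A) × P(B). The sketch below does this in Python for the BlueFox figures quoted earlier; the helper `are_independent` is a hypothetical name, and the fair-coin comparison at the end is an assumed illustration rather than an example from these notes.

```python
from fractions import Fraction


def are_independent(p_a, p_b, p_a_and_b):
    """Return True when P(A and B) equals P(A) x P(B).

    Exact equality is only sensible because the inputs are Fractions;
    with floats a small tolerance would be needed instead.
    """
    return p_a_and_b == p_a * p_b


# BlueFox figures: P(1st) = 0.35, P(2nd) = 0.25, P(both) = 0.15.
p_first = Fraction(35, 100)
p_second = Fraction(25, 100)
p_both = Fraction(15, 100)

print(p_both / p_first)                            # P(2nd | 1st) = 3/7
print(are_independent(p_first, p_second, p_both))  # False

# Assumed contrast: two flips of a fair coin, P(HH) = 1/4.
print(are_independent(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4)))  # True
```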
Example 3
Are the outcomes of the two aptitude tests at internet startup BlueFox independent? Justify your
answer.
Solution

Example 4
Employees at a Marketing firm are classified by age and sex as follows:
           under 30   30 to 50   over 50   Total
Male        0.275      0.125      0.025
Female      0.325      0.175      0.075
So, for example, 27.5% of employees are Male and under 30 years of age.
From this table, calculate
(a) P (Male) (d) P (30 to 50|Male)
(b) P (30 to 50) (e) Are the events Male and 30 to 50 independent?
(c) P (Male|30 to 50) (f) P (Male)

Solution

5.2 Tree diagrams

Tree diagrams (or probability trees) are simple, clear ways of presenting probabilistic informa-
tion.
Example 5
Suppose we have a biased coin, with P (Head) = 0.75. Then the following tree diagram displays
all outcomes, along with their associated probabilities, for two consecutive flips of the coin:
First flip    Second flip    Probability
├─ 0.75  H
│   ├─ 0.75  H               0.75 × 0.75 = 0.5625
│   └─ 0.25  T               0.75 × 0.25 = 0.1875
└─ 0.25  T
    ├─ 0.75  H               0.25 × 0.75 = 0.1875
    └─ 0.25  T               0.25 × 0.25 = 0.0625
Important: multiply probabilities along branches (multiplication rule); the probabilities at the
ends of the branches should add up to 1.
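The biased-coin tree of Example 5 can be reproduced by listing every path and multiplying along its branches. This Python sketch assumes, as the tree does, that the two flips are independent; the variable names are our own.

```python
from fractions import Fraction
from itertools import product

# Branch probabilities for a single flip of the biased coin.
p = {"H": Fraction(3, 4), "T": Fraction(1, 4)}

# Multiply along each path of two flips (multiplication rule).
paths = {}
for first, second in product("HT", repeat=2):
    paths[first + second] = p[first] * p[second]

for outcome, prob in paths.items():
    print(outcome, float(prob))
# HH 0.5625
# HT 0.1875
# TH 0.1875
# TT 0.0625

total = sum(paths.values())
print(total)  # 1
```

Adding the probabilities of the relevant end points answers questions such as "exactly one head": here that is HT plus TH.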
Example 6
A small company has 10 employees: 4 male and 6 female. You, as the manager, select two
employees at random to attend a training session. What is the probability that you select one
male and one female employee?
Solution

Example 7
Joe has a Business Management exam on Thursday morning. On Wednesday night he is free to
choose one (and only one) of the following activities: (a) go to the cinema, (b) go to the pub,
(c) stay home and watch TV, (d) stay home and study. The probabilities that he elects these
alternatives are 0.14, 0.45, 0.25 and 0.16, respectively. His conditional probabilities of passing
the exam given (a), (b), (c) and (d) are 0.4, 0.05, 0.5 and 0.8 respectively. Find
(i) the probability that Joe goes to the pub and passes his exam;
(ii) the probability that Joe passes his exam;
(iii) the probability that Joe went to the pub, given that he passed his exam.
Solution
Use the space provided below to construct a tree diagram for this example.
(i) P (Joe goes to Pub and passes exam) =
(ii) P (Joe passes exam) =
(iii) P (Joe went to Pub | Joe passed exam) =
5.3 Exercises
1. An on-line retailer conducts a survey of 200 customers and obtains the following results.
                     Age
          Under 30   30 to 45   Over 45
Male         60         20         40
Female       40         30         10
A customer is selected at random.
(a) What is the probability that the customer is male and aged 30 to 45?
(b) Given that this customer is aged 30 to 45, what is the probability that they are male?
(c) Given that this customer is female, what is the probability that they are 45 or under?
(d) Now suppose that two customers are selected at random. What is the probability that
both are Male?
2. If Vinny goes to the cinema, there is a 60% chance he will then also go to the bar afterwards.
However, if he doesn't go to the cinema, this reduces to just 30%. On Friday night, Vinny
decides to go to the cinema only if his friend Julia also goes. Vinny has no idea about Julia's
intentions this Friday and so is just as likely to go to the cinema as he is to not go. Let C
be the event that Vinny goes to the cinema, and B the event that Vinny goes to the bar, this
Friday. Using a probability tree diagram, or otherwise, find
(a) $P(C)$
(b) $P(\bar{C})$
(c) $P(B \mid C)$
(d) $P(B \mid \bar{C})$
(e) $P(C \text{ and } B)$
(f) $P(B)$
6 Decision-making using probability
6.1 Expected Monetary Value

The Expected Monetary Value (EMV) of a single event is simply the probability of that event
multiplied by its monetary value.
Example 1
Suppose you win £5 if you pull an ace from a pack of cards. The EMV would be
$$\text{EMV(Ace)} = P(\text{Ace}) \times \text{Monetary value (Ace)} = \frac{4}{52} \times \pounds 5 \approx \pounds 0.38.$$
Your expected return would be 38 pence; if you repeated this bet a large number of times, you
would come out, on average, 38 pence better off per bet. Therefore you would want to pay no
more than 38p for such a bet.
In general, for more complicated problems involving several options,
$$\text{EMV} = \sum \{P(\text{Event}) \times \text{Monetary value of Event}\},$$
where the sum is over all possible events. We choose the option with the largest EMV.
Example 2
Synaptec is a small technology company with a new product that they wish to launch on to the
market. It could:
• take a direct approach, launching onto the domestic market through traditional channels;
• launch only on the internet; or
• license the product to a larger company in return for a licence fee, paid irrespective of
  the success of the product.
Initial market research suggests that demand for the product can be classed into three categories:
high, medium or low, and these categories will occur with probabilities 0.2, 0.35 and 0.45,
respectively. Likely profits (in £K) to be earned under each option are
             High   Medium   Low
Direct        100     55     −25
Internet       46     25      15
Licence        20     20      20
How should the company launch the product?

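The EMV of each option can be calculated as follows:
$$\begin{aligned}
\text{EMV(Direct)} &= (0.2 \times 100) + (0.35 \times 55) + (0.45 \times (-25)) = \pounds 28\text{K} \\
\text{EMV(Internet)} &= (0.2 \times 46) + (0.35 \times 25) + (0.45 \times 15) = \pounds 24.7\text{K} \\
\text{EMV(Licence)} &= (0.2 \times 20) + (0.35 \times 20) + (0.45 \times 20) = \pounds 20\text{K}.
\end{aligned}$$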
On the basis of expected monetary value, the best choice is the Direct approach as this
maximises EMV.
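The Synaptec calculation can be reproduced with a short Python sketch. The dictionary layout and the function name `emv` are our own illustrative choices; the probabilities and profits (in £K) are those given in Example 2.

```python
from fractions import Fraction

# Demand probabilities from the market research.
probabilities = {
    "High": Fraction(20, 100),
    "Medium": Fraction(35, 100),
    "Low": Fraction(45, 100),
}

# Likely profits (in £K) for each launch option.
profits = {
    "Direct": {"High": 100, "Medium": 55, "Low": -25},
    "Internet": {"High": 46, "Medium": 25, "Low": 15},
    "Licence": {"High": 20, "Medium": 20, "Low": 20},
}


def emv(payoffs, probs):
    """Sum of probability x monetary value over all demand levels."""
    return sum(probs[level] * payoffs[level] for level in probs)


emvs = {option: emv(payoffs, probabilities) for option, payoffs in profits.items()}
for option, value in emvs.items():
    print(option, float(value))
# Direct 28.0
# Internet 24.7
# Licence 20.0

best = max(emvs, key=emvs.get)
print(best)  # Direct
```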
6.2 Decision trees

When we include a decision in a tree diagram (see Chapter 5) we use a rectangular node, called
a decision node to represent the decision. The diagram is then called a decision tree.
Example 3
The decision tree for the last example (Example 2) would look like this:
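Decision      Demand (probability)    Profit (£K)
├─ Direct     (EMV = +28)
│   ├─ H  0.2                          100
│   ├─ M  0.35                          55
│   └─ L  0.45                         −25
├─ Internet   (EMV = +24.7)
│   ├─ H  0.2                           46
│   ├─ M  0.35                          25
│   └─ L  0.45                          15
└─ Licence    (EMV = +20)
    ├─ H  0.2                           20
    ├─ M  0.35                          20
    └─ L  0.45                          20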
Key points:
• There are no probabilities at a decision node but we evaluate the expected monetary values
  of the options.
• In a decision tree the first node (on the left) is always a decision node.
• There may also be other decision nodes.
• If there is another decision node then we evaluate the options there and choose the best one
  (based on EMV), and the expected monetary value of this option becomes the expected
  monetary value of the branch leading to the decision node.
• We work backwards through the tree (from right to left), evaluating EMVs and making
  decisions at each decision node.
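The "work backwards" procedure in the key points can be sketched as a small recursive function. This is only one possible representation: the node dictionaries and the helper names (`payoff`, `chance`, `decision`, `rollback`, `best_option`) are assumptions made for illustration, applied here to the Synaptec tree from Example 3.

```python
from fractions import Fraction as F


def payoff(value):
    """End point of the tree with a known monetary value."""
    return {"type": "payoff", "value": value}


def chance(*branches):
    """Chance node: each branch is a (probability, child) pair."""
    return {"type": "chance", "branches": branches}


def decision(**options):
    """Decision node: named options, no probabilities attached."""
    return {"type": "decision", "options": options}


def rollback(node):
    """EMV of a node, evaluated from the right of the tree to the left."""
    if node["type"] == "payoff":
        return node["value"]
    if node["type"] == "chance":
        return sum(p * rollback(child) for p, child in node["branches"])
    # Decision node: its value is the EMV of the best option.
    return max(rollback(child) for child in node["options"].values())


def best_option(node):
    """Name of the option with the largest EMV at a decision node."""
    return max(node["options"], key=lambda name: rollback(node["options"][name]))


def demand(high, medium, low):
    """Chance node for high/medium/low demand (probabilities 0.2, 0.35, 0.45)."""
    return chance((F(20, 100), payoff(high)),
                  (F(35, 100), payoff(medium)),
                  (F(45, 100), payoff(low)))


tree = decision(Direct=demand(100, 55, -25),
                Internet=demand(46, 25, 15),
                Licence=demand(20, 20, 20))

print(best_option(tree), float(rollback(tree)))  # Direct 28.0
```

Because `rollback` calls itself on every child, a decision node nested further along a branch is evaluated first and its best EMV is passed back, as described in the key points.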
Example 4
Charlotte Watson, the manager of a small sales company, has the opportunity to buy a fixed
quantity of a new type of Android tablet which she can then offer for sale to clients.
The decision to buy the product and offer it for sale would involve a fixed cost of £200,000. The
number of tablets that will be sold is uncertain, but Charlotte judges that:
• Sales will be poor with probability 0.2; this will result in an income of £100,000.
• Sales will be moderate with probability 0.5; this will result in an income of £220,000.
• Sales will be good with probability 0.3; this will result in an income of £350,000.
For an additional fixed cost of £30,000, market research can be conducted to aid the
decision-making process. The outcome of the market research can be either positive or negative,
with probabilities 0.58 and 0.42, respectively. Knowing the outcome of the market research changes
the probabilities for the main sales project as follows:
                    Main sales probabilities
Market research     Poor    Moderate    Good
Positive            0.15    0.45        0.4
Negative            0.6     0.35        0.05
Charlotte has various options:
• Buy the tablets, without market research.
• Pay for the market research.
• Do nothing.
If she pays for the market research then, depending on the outcome, she can:
• Buy the tablets.
• Do nothing.
(a) Draw a decision tree for this problem.
(b) Use expected monetary value to determine the optimal course of action for Charlotte.
The following page is left blank for your solution to this question
6.3 Exercises
1. Picoplex Technologies have developed a new manufacturing process which they believe will
revolutionise the smartphone industry. They are, however, uncertain how they should go
about exploiting this advance.
Initial indications of the likely success of marketing the process are 55%, 30%, 15% for
high success, medium success and probable failure, respectively. The company has
three options; they can go ahead and develop the technology themselves, licence it or sell
the rights to it. The financial outcomes (in £ millions) for each option are given in the table
below.
             high success   medium success   failure
Develop           80              40            100
Licence           40              30              0
Sell              25              25             25
(a) Draw a decision tree to represent the company's problem.

(b) Calculate the Expected Monetary Value for all possible decisions the company may take
and hence determine the optimal decision for the company.
2. The manager of a small business has the opportunity to buy a fixed quantity of a new product
and offer it for sale for a limited time.
The decision to buy the product and offer it for sale would involve a fixed cost of £150,000.
The amount that would be sold is uncertain but the manager judges that:
• There is a probability of 0.3 that sales will be poor with an income of £80,000.
• There is a probability of 0.5 that sales will be medium with an income of £160,000.
• There is a probability of 0.2 that sales will be good with an income of £240,000.
For an additional fixed cost of £20,000, the product can be sold for a trial period before a
final decision is made. No income is made from this trial. The result of the trial will be
poor with probability 0.33, medium with probability 0.40 or good with probability
0.27. Knowing the outcome of the trial changes the probabilities for the main sales project:
                 Main sales probabilities
Trial outcome    Poor    Medium    Good
Poor             0.7     0.2       0.1
Medium           0.2     0.6       0.2
Good             0.1     0.2       0.7
The manager also has the option to do nothing.
(a) Draw a decision tree for this problem.

(b) Use expected monetary value to determine the optimal course of action for this business.
7 Discrete probability models
7.1 Probability distributions

The probability distribution of a discrete random variable X is the list of all possible values
X can take and the probabilities associated with them.
Example 1
If the random variable X is the outcome of a roll of a fair six-sided die then the probability
distribution for X is:
r 1 2 3 4 5 6 Sum
P (X = r) 1/6 1/6 1/6 1/6 1/6 1/6 1
Key point: For a discrete random variable the probabilities of each possible value sum up to 1.
7.2 The binomial distribution

Suppose the following statements hold:
• There is a fixed number of trials or experiments (n).
• There are only two possible outcomes for each trial (success or failure).
• There is a constant probability of success, p.
• The outcome of each trial is independent of any other trial.
Then the number of successes, X, follows a binomial distribution.
Example 2
Which of the following scenarios could be adequately modelled by a binomial distribution?

The number of sixes on 3 rolls of a fair six-sided die.
The number of students who pass MAS1403 this year.
7.2.1 Calculating probabilities
If X follows a binomial distribution we write $X \sim \text{Bin}(n, p)$, and
$$P(X = r) = {}^{n}C_{r}\, p^{r} (1 - p)^{n - r}, \qquad r = 0, 1, \ldots, n.$$
Here, ${}^{n}C_{r}$ is the number of ways of getting r successes out of n trials, and is given by
$${}^{n}C_{r} = \frac{n!}{r!\,(n - r)!},$$
where $r! = 1 \times 2 \times 3 \times \cdots \times (r - 1) \times r$ is known as "r factorial".
Important: most scientific calculators have an nCr button!
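For readers without a calculator to hand, the binomial formula can be evaluated in Python, where `math.comb(n, r)` (available from Python 3.8) plays the role of the nCr button. The sketch below uses the "number of sixes on 3 rolls of a fair die" scenario from Example 2; the function name `binomial_pmf` is our own.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p): nCr x p^r x (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


n = 3                # three rolls of the die
p = Fraction(1, 6)   # probability of a six on any one roll

for r in range(n + 1):
    print(r, binomial_pmf(r, n, p))
# 0 125/216
# 1 25/72
# 2 5/72
# 3 1/216

# The probabilities of all possible values sum to 1.
print(sum(binomial_pmf(r, n, p) for r in range(n + 1)))  # 1
```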
Example 3
What is the probability of getting 2 sixes from three rolls of a fair six-sided die?
Solution

Example 4
If $X \sim \text{Bin}(10, 0.2)$ calculate:
(a) P (X = 2) (c) P (X < 3)
(b) P (X 2) (d) P (X > 1)

Solution

7.2.2 Mean and variance
If $X \sim \text{Bin}(n, p)$, then its mean (or expected value) and variance are
$$E[X] = n \times p \quad \text{and} \quad \text{Var}(X) = n \times p \times (1 - p).$$
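As a rough check on these two formulas, the mean and variance can also be computed directly from the probability distribution. The Python sketch below does this for the number of sixes on 3 rolls of a fair die, i.e. Bin(3, 1/6); the function and variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


n, p = 3, Fraction(1, 6)

# Mean and variance from the definition, summing over r = 0, ..., n.
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
variance = sum((r - mean) ** 2 * binomial_pmf(r, n, p) for r in range(n + 1))

print(mean, variance)              # 1/2 5/12
print(n * p, n * p * (1 - p))      # 1/2 5/12, from the formulas
```

The standard deviation is the square root of the variance.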
Example 5
If $X \sim \text{Bin}(10, 0.2)$ calculate:
(a) E[X]
(b) Var(X)
(c) SD(X)
Solution

Example 6
A salesperson has a 50% chance of making a sale on a customer visit and she arranges 6 visits
in a day.
(a) Assuming sales at each visit are independent, suggest an appropriate distribution for the
number of sales she makes in a day.
(b) Calculate her expected number of sales.
Solution

7.3 Exercises
1. Consider the following probability distribution for the discrete random variable X. One of
the values is missing.
r -2 -1 0 1 2
P (X = r) 0.1 0.2 ? 0.3 0.2
What is the missing value, P (X = 0)?
2. Let X be the number of sixes rolled on four rolls of a fair six-sided die.
(a) Calculate the probability distribution of X, i.e. the values P (X = r) for r = 0, 1, 2, 3, 4.

(b) Calculate P (X 2).
(c) Calculate P (X > 2).
(d) Calculate the mean and variance of X.
(e) What is the most likely number of sixes from four rolls of the die?
8 More discrete probability models
8.1 The Poisson distribution

Suppose the following hold:
• Events occur independently, at a constant rate ($\lambda$);
• There is no natural upper limit to the number of events.
Then the number of events, X, occurring in a given interval, has a Poisson distribution with
parameter $\lambda$.
Example 1
Which of the following random variables could be modelled by a Poisson distribution? Suggest
an alternative if the Poisson distribution is not appropriate, and state the values of any
parameters.
(a) Calls are received at a call centre at a constant rate of 3 per minute on average. Let X be
the number of calls received in a 1 minute period.
(b) An operator at a tele-sales marketing firm has 20 calls to make in an hour. History suggests
that calls will be answered 55% of the time. Let Y be the number of answered calls in an
hour.
(c) Newcastle United score goals at a constant rate of 2.4 in 90 minutes, on average. Let Z be
the number of goals scored in 45 minutes.
Solution

8.1.1 Probabilities, means and variances
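If X follows a Poisson distribution we write $X \sim \text{Po}(\lambda)$, and
$$P(X = r) = \frac{\lambda^{r} e^{-\lambda}}{r!}, \qquad r = 0, 1, \ldots$$
If $X \sim \text{Po}(\lambda)$, then its mean and variance are
$$E[X] = \lambda \quad \text{and} \quad \text{Var}(X) = \lambda.$$
[Approximation to binomial: If $X \sim \text{Bin}(n, p)$ with n large, p small and both $np$ and
$n(1 - p) > 5$ then X is approximately $\text{Po}(np)$.]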
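The Poisson formula is straightforward to evaluate in Python. The sketch below uses the call-centre rate of 3 per minute from Example 1(a); the function names are our own, and the values n = 500 and p = 0.02 used to illustrate the binomial approximation are assumed purely for demonstration.

```python
from math import comb, exp, factorial


def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam): lam^r x e^(-lam) / r!."""
    return lam**r * exp(-lam) / factorial(r)


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


lam = 3  # calls per minute
print(round(poisson_pmf(0, lam), 4))  # 0.0498

# Mean and variance from the distribution; terms beyond r = 59 are negligible.
probs = [poisson_pmf(r, lam) for r in range(60)]
mean = sum(r * pr for r, pr in enumerate(probs))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
print(round(mean, 6), round(variance, 6))  # both close to lam

# Assumed illustration of the approximation: Bin(500, 0.02) against Po(10).
print(binomial_pmf(10, 500, 0.02), poisson_pmf(10, 500 * 0.02))
```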
Example 2
If $X \sim \text{Po}(5)$ calculate:
(a) P (X = 4) (d) E[X]
(b) P (X 1) (e) SD(X)
(c) P (X > 0) (f) SD(X)

Solution

Example 3
A new Mercedes-Benz car franchise forecasts that it will sell around three of its most expensive
models each day.
(a) What probability distribution might be reasonable to use to model the number of cars sold
each day?
(b) What is the expected number and standard deviation of the number of cars sold each day?
(c) What is the probability that 3 cars are sold on a particular day?
(d) What is the probability that no cars are sold on a particular day?
(e) What is the probability that at least one car is sold on a particular day?
(f) Sales will be monitored over the next seven days and the sales team at the franchise will
receive a warning if they make no sales on at least 1 of the 7 days. What is the probability
that they receive a warning?
Solution

8.2 Exercises (on Chapters 7 & 8)

1. Which of the following random variables could be modelled with a binomial distribution and
which could be modelled with a Poisson distribution? In each case state the value(s) of the
parameter(s) of the distribution.
(a) A salesperson has a 30% chance of making a sale on a customer visit. She arranges 10
visits in a day. Let X be the number of sales she makes in a day.
(b) Calls to the British Passport Office in Durham occur at a rate of 7 per hour on average.
Let Y be the number of calls at the passport office in a 1 hour period.
(c) History suggests that 10% of eggs from a family-run farm are bad. Let Z be the number
of bad eggs in a box of a dozen (i.e. 12) eggs.
2. An operator at a call centre has 8 calls to make in an hour. History suggests that they will be
answered 40% of the time. Let X be the number of answered calls in an hour.
(a) What probability distribution does X have?

(b) What is the mean and standard deviation of X?
(c) Calculate the probability of getting a response exactly 7 times.
(d) Calculate the probability of getting fewer than 2 responses.
3. Calls are received at a telephone exchange at an average rate of 4 per minute. Let Y be the
number of calls received in one minute.
(a) What probability distribution does Y have?

(b) What is the mean and standard deviation of Y ?
(c) Calculate the probability that there are 6 calls in one minute.
(d) Calculate the probability that there are no more than 2 calls in a minute.
(e) Calculate the probability that there are more than 2 calls in a minute.
9 Continuous probability models
9.1 The Normal distribution

The Normal distribution is possibly the best-known and most-used continuous probability
distribution: you will use it a lot in Semester 2 of MAS1403. Its probability density function
(pdf) has a symmetrical bell-shaped profile:
[Figure: the Normal pdf $f(x)$, plotted over the range $\mu - 4\sigma$ to $\mu + 4\sigma$.]
We can think of the pdf as a smoothed percentage relative frequency histogram: the area under
the curve is 1.
The Normal distribution has two parameters: the mean, $\mu$, and the standard deviation, $\sigma$.
[Figure: three panels of density against x, titled "Normal pdf with mean 30 and sd 10", "Normal pdfs with means 10, 30, 50 and sd 10" and "Normal pdfs with mean 30 and sds 5, 10, 15".]
If a random variable X has a Normal distribution with mean $\mu$ and variance $\sigma^2$, then we write
$$X \sim N(\mu, \sigma^2).$$

9.1.1 The standard Normal distribution
The standard Normal distribution, usually denoted by
$$Z \sim N(0, 1),$$
has a mean of zero and a variance of 1, and we have tables of probabilities for this particular
Normal distribution; see the Probability Tables for the Standard Normal Distribution at the end
of these notes.
Example 1
Find the following probabilities when $Z \sim N(0, 1)$.

(a) P (Z 1.46) (d) P (1.2 < Z 1.5)
(b) P (Z 0.01) (e) P (Z < 1.5)
(c) P (Z > 1.5) (f) P (Z = z)
Solution

9.1.2 Probabilities from any Normal distribution
Any Normally distributed random variable $X \sim N(\mu, \sigma^2)$ can be transformed into the standard
Normal distribution using the formula:
$$Z = \frac{X - \mu}{\sigma},$$
therefore
$$P(X \le x) = P\left(Z \le \frac{x - \mu}{\sigma}\right),$$
which can be looked up in tables.
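The same standardisation can be carried out in Python, with `statistics.NormalDist` (Python 3.8 or later) standing in for the printed tables. The sketch below uses a Normal distribution with mean 30 and sd 10, as in the plots earlier in this chapter; the point x = 40 and the function name `normal_cdf` are assumptions chosen for illustration.

```python
from statistics import NormalDist


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), via Z = (X - mu) / sigma."""
    z = (x - mu) / sigma
    return NormalDist().cdf(z)  # standard Normal, as in the tables


# X ~ N(30, 10^2): x = 40 is one standard deviation above the mean, so z = 1.
print(round(normal_cdf(40, 30, 10), 4))  # 0.8413

# Probabilities of the form P(X > x) follow by subtracting from 1.
print(round(1 - normal_cdf(40, 30, 10), 4))  # 0.1587
```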
Example 2
If $X \sim N(10, 2^2)$ calculate $P(X \le 8)$.
Solution
Example 3
Suppose X is the IQ of a randomly selected 18–19 year old and that X follows a normal
distribution with mean $\mu = 100$ and standard deviation $\sigma = 15$. Thus, we have:
$$X \sim N(100, 15^2).$$

Find the following probabilities.
(a) The probability that an 18–19 year old has an IQ less than 110.
(b) The probability that an 18–19 year old has an IQ greater than 110.
(c) The probability that an 18–19 year old has an IQ greater than 125.
(d) The probability that an 18–19 year old has an IQ between 95 and 115.
Solutions
This page has been left blank for your solutions to the last example
9.2 Exercises
1. A company promises delivery within 20 working days of receipt of order. However, in
reality, they deliver according to a normal distribution with a mean of 16 days and a standard
deviation of 2.5 days.
(a) What proportion of customers receive their order late?

(b) What proportion of customers receive their orders between 10 and 15 days of placing
their order?
(c) A new order processing system promises to reduce the standard deviation of delivery
times to 1.5 days. If this system is used, what proportion of customers will receive their
deliveries within 20 days?
2. A drinks machine is regulated by its manufacturer so that it dispenses an average of 200ml

per cup. However, the machine is not particularly accurate and actually dispenses an amount
that has a normal distribution with standard deviation 15ml.
(a) What percentage of cups contain below the minimum permissible volume of 170ml?
(b) What percentage of cups contain over 225ml?
(c) What percentage of cups contain between 175ml and 225ml?
(d) How many cups would you expect to overflow if 240ml cups are used for the next 10000
drinks?
10 More continuous probability distributions
10.1 The normal distribution: using tables in reverse

Suppose we are told that $P(Z \le z) = 0.95$. What is the value of z?
From the tables at the end of these notes, we can see that
$$P(Z \le 1.64) = 0.9495 \quad \text{and} \quad P(Z \le 1.65) = 0.9505.$$
Since 0.95 lies halfway between these two values, $z = 1.645$.
Now suppose that $X \sim N(100, 15^2)$, as in the IQ example from Chapter 9. Below what IQ are
95% of the population?
We know that $P(Z \le 1.645) = 0.95$ and $z = (x - \mu)/\sigma$ so
$$1.645 = \frac{x - \mu}{\sigma} = \frac{x - 100}{15},$$
therefore
$$x = 1.645 \times 15 + 100 \approx 124.7.$$
In other words, 95% of IQs are less than about 125.
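Reading the tables in reverse corresponds to what software calls the inverse cdf. As a hedged sketch, Python's `statistics.NormalDist.inv_cdf` (Python 3.8 or later) reproduces the IQ calculation; the variable names are our own.

```python
from statistics import NormalDist

# Value of z with P(Z <= z) = 0.95 for the standard Normal distribution.
z = NormalDist().inv_cdf(0.95)
print(round(z, 3))  # 1.645

# Undo the standardisation: x = z * sigma + mu for X ~ N(100, 15^2).
mu, sigma = 100, 15
x = z * sigma + mu
print(round(x, 1))  # 124.7
```

The software value of z is not exactly 1.645, because the tables are rounded to four decimal places, but the difference does not affect the answer to one decimal place.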
10.2 The uniform distribution

The uniform distribution is the simplest continuous distribution. As the name suggests, it
describes a variable for which all possible outcomes are equally likely.
If the random variable X follows a uniform distribution, we write
$$X \sim U(a, b).$$
Probabilities can be calculated using the formula
$$P(X \le x) = \begin{cases} 0 & \text{for } x < a \\ \dfrac{x - a}{b - a} & \text{for } a \le x \le b \\ 1 & \text{for } x > b, \end{cases}$$
and the mean and variance are given by
$$E[X] = \frac{a + b}{2}, \qquad \text{Var}(X) = \frac{(b - a)^2}{12}.$$
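The three uniform formulas translate directly into code. In the Python sketch below, the interval U(0, 10) and the function names are hypothetical and chosen only to give round numbers.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for X ~ U(a, b)."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def uniform_mean(a, b):
    """E[X] = (a + b) / 2."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Var(X) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12


a, b = 0, 10  # assumed interval, for illustration only
print(uniform_cdf(2.5, a, b))  # 0.25
print(uniform_mean(a, b))      # 5.0

# P(c < X <= d) is the difference of two cdf values.
print(uniform_cdf(7.5, a, b) - uniform_cdf(2.5, a, b))  # 0.5
```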
10.3 The exponential distribution

The exponential distribution is another common distribution that is used to describe
continuous random variables. It is often used to model lifetimes of products and times between
random events such as arrivals of customers in a queueing system or arrivals of orders. The
distribution has one parameter, $\lambda$. If our random variable X follows an exponential distribution,
then we say
$$X \sim \text{Exp}(\lambda).$$
Probabilities can be calculated using
$$P(X \le x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0, \end{cases}$$
and the mean and variance are given by
$$E[X] = \frac{1}{\lambda}, \qquad \text{Var}(X) = \frac{1}{\lambda^2}.$$
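A short Python sketch of the exponential formulas follows. The rate of 0.5 per unit time and the function names are assumptions made for illustration, not values taken from the exercises.

```python
from math import exp


def exponential_cdf(x, lam):
    """P(X <= x) for X ~ Exp(lam)."""
    if x < 0:
        return 0.0
    return 1 - exp(-lam * x)


def exponential_mean(lam):
    """E[X] = 1 / lam."""
    return 1 / lam


def exponential_variance(lam):
    """Var(X) = 1 / lam^2."""
    return 1 / lam**2


lam = 0.5  # assumed rate, for illustration only
print(round(exponential_cdf(2, lam), 4))  # 0.6321
print(exponential_mean(lam), exponential_variance(lam))  # 2.0 4.0

# P(X > x) is the complement of the cdf.
print(round(1 - exponential_cdf(2, lam), 4))  # 0.3679
```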
10.3.1 Poisson process
The exponential distribution and the Poisson distribution are related through the notion of events
occurring randomly in time (at a constant average rate, $\lambda$). This is known as a Poisson process.
Consider a series of randomly occurring events such as calls at a call centre. The times of calls
might look like
[Figure: a timeline from 0 to 5 minutes with the times of individual calls marked.]
There are two ways of viewing these data. One is as the number of calls in each minute (here 2,
0, 2, 1 and 1) and the other is as the times between successive calls. For the Poisson process,
• the number of calls in each one minute interval has a Poisson distribution with
  parameter $\lambda$, and
• the time between successive calls has an exponential distribution with parameter $\lambda$.
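The two views of a Poisson process can be explored by simulation. The Python sketch below is only a rough illustration under assumed settings (a rate of 2 calls per minute, 10,000 simulated minutes and a fixed random seed): it generates exponential gaps between calls and then counts the calls falling in each minute.

```python
import random


def simulate_arrivals(lam, minutes, seed=1):
    """Arrival times of a Poisson process with rate lam per minute."""
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(lam)  # exponential gap before the first call
    while t < minutes:
        times.append(t)
        t += rng.expovariate(lam)  # exponential gap to the next call
    return times


lam = 2.0         # assumed average rate: 2 calls per minute
minutes = 10_000  # assumed length of the simulation

times = simulate_arrivals(lam, minutes)

# View 1: number of calls in each one-minute interval.
counts = [0] * minutes
for t in times:
    counts[int(t)] += 1

# View 2: times between successive calls.
gaps = [later - earlier for earlier, later in zip(times, times[1:])]

mean_count = sum(counts) / minutes
mean_gap = sum(gaps) / len(gaps)
print(round(mean_count, 2), round(mean_gap, 2))  # expected to be near lam and 1/lam
```

With a long enough simulation, the average count per minute should settle near the rate and the average gap near its reciprocal, matching the means of the Poisson and exponential distributions.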
10.4 Exercises
1. An express coach is due to arrive in Newcastle from London at 11pm. However, in practice,
it is equally likely to arrive anywhere between 15 minutes early and 45 minutes late, depending
on traffic conditions. Let the random variable X denote the amount of time (in minutes) that
the coach is delayed.
(a) Calculate the mean of the delay time.

(b) What is the probability that the coach is less than 5 minutes late?
(c) What is the probability that the coach is more than 20 minutes late?
(d) What is the probability that the coach arrives between 10.55 and 11.20pm?
(e) What is the probability that the coach arrives before 11pm?
2. The time (in minutes) between requests to a network server can be modelled by an exponential
distribution with rate parameter $\lambda = 2.5$.
(a) What is the expected time between requests?

(b) What is the probability that the time between requests is less than 1 minute and 30
seconds?
(c) What is the probability that the time between requests is greater than 1 minute?
(d) What is the probability that the time between requests is between 1 minute and 1 minute
and 30 seconds?
(e) What is the probability that the time between requests is between 30 seconds and 50
seconds?
Probability Tables for the Standard Normal Distribution

The table contains values of $P(Z \le z)$, where $Z \sim N(0, 1)$.
z -0.09 -0.08 -0.07 -0.06 -0.05 -0.04 -0.03 -0.02 -0.01 0.00
-2.9 0.0014 0.0014 0.0015 0.0015 0.0016 0.0016 0.0017 0.0018 0.0018 0.0019
-2.8 0.0019 0.0020 0.0021 0.0021 0.0022 0.0023 0.0023 0.0024 0.0025 0.0026
-2.7 0.0026 0.0027 0.0028 0.0029 0.0030 0.0031 0.0032 0.0033 0.0034 0.0035
-2.6 0.0036 0.0037 0.0038 0.0039 0.0040 0.0041 0.0043 0.0044 0.0045 0.0047
-2.5 0.0048 0.0049 0.0051 0.0052 0.0054 0.0055 0.0057 0.0059 0.0060 0.0062
-2.4 0.0064 0.0066 0.0068 0.0069 0.0071 0.0073 0.0075 0.0078 0.0080 0.0082
-2.3 0.0084 0.0087 0.0089 0.0091 0.0094 0.0096 0.0099 0.0102 0.0104 0.0107
-2.2 0.0110 0.0113 0.0116 0.0119 0.0122 0.0125 0.0129 0.0132 0.0136 0.0139
-2.1 0.0143 0.0146 0.0150 0.0154 0.0158 0.0162 0.0166 0.0170 0.0174 0.0179
-2.0 0.0183 0.0188 0.0192 0.0197 0.0202 0.0207 0.0212 0.0217 0.0222 0.0228
-1.9 0.0233 0.0239 0.0244 0.0250 0.0256 0.0262 0.0268 0.0274 0.0281 0.0287
-1.8 0.0294 0.0301 0.0307 0.0314 0.0322 0.0329 0.0336 0.0344 0.0351 0.0359
-1.7 0.0367 0.0375 0.0384 0.0392 0.0401 0.0409 0.0418 0.0427 0.0436 0.0446
-1.6 0.0455 0.0465 0.0475 0.0485 0.0495 0.0505 0.0516 0.0526 0.0537 0.0548
-1.5 0.0559 0.0571 0.0582 0.0594 0.0606 0.0618 0.0630 0.0643 0.0655 0.0668
-1.4 0.0681 0.0694 0.0708 0.0721 0.0735 0.0749 0.0764 0.0778 0.0793 0.0808
-1.3 0.0823 0.0838 0.0853 0.0869 0.0885 0.0901 0.0918 0.0934 0.0951 0.0968
-1.2 0.0985 0.1003 0.1020 0.1038 0.1056 0.1075 0.1093 0.1112 0.1131 0.1151
-1.1 0.1170 0.1190 0.1210 0.1230 0.1251 0.1271 0.1292 0.1314 0.1335 0.1357
-1.0 0.1379 0.1401 0.1423 0.1446 0.1469 0.1492 0.1515 0.1539 0.1562 0.1587
-0.9 0.1611 0.1635 0.1660 0.1685 0.1711 0.1736 0.1762 0.1788 0.1814 0.1841
-0.8 0.1867 0.1894 0.1922 0.1949 0.1977 0.2005 0.2033 0.2061 0.2090 0.2119
-0.7 0.2148 0.2177 0.2206 0.2236 0.2266 0.2296 0.2327 0.2358 0.2389 0.2420
-0.6 0.2451 0.2483 0.2514 0.2546 0.2578 0.2611 0.2643 0.2676 0.2709 0.2743
-0.5 0.2776 0.2810 0.2843 0.2877 0.2912 0.2946 0.2981 0.3015 0.3050 0.3085
-0.4 0.3121 0.3156 0.3192 0.3228 0.3264 0.3300 0.3336 0.3372 0.3409 0.3446
-0.3 0.3483 0.3520 0.3557 0.3594 0.3632 0.3669 0.3707 0.3745 0.3783 0.3821
-0.2 0.3859 0.3897 0.3936 0.3974 0.4013 0.4052 0.4090 0.4129 0.4168 0.4207
-0.1 0.4247 0.4286 0.4325 0.4364 0.4404 0.4443 0.4483 0.4522 0.4562 0.4602
0.0 0.4641 0.4681 0.4721 0.4761 0.4801 0.4840 0.4880 0.4920 0.4960 0.5000
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986


2 Graphical methods for presenting data

2.1 Stemandleaf plots

Represent the data in a stem-and leaf plot. 

Some notes on stemandleaf plots.

Always show the stem units and the leaf units.

2.2 Bar charts

This gives the following bar chart:

Car Walk Bike Bus Metro Train

Service time Frequency Relative Frequency (%)

The histogram for these data is:

We can present these data in a relative frequency as follows:

Represent the data in a stem-and leaf plot.

A box plot is constructed as follows.