Fallsem2018-19 Mee2013 Eth Mb218 Vl2018191002622 Reference Material I Simulation III Unit

RANDOM NUMBER
A random number is a number generated by a process whose outcome is
unpredictable and which cannot subsequently be reliably reproduced.
Random numbers are numbers that occur in a sequence such that two
conditions are met:
(1) the values are uniformly distributed over a defined interval or set,
and
(2) it is impossible to predict future values based on past or present
ones.
Random numbers are important in statistical analysis and probability
theory.
The most common set from which random numbers are derived is the set of
single-digit decimal numbers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
1/31/2018 Dr. DEGA NAGARAJU, CIMR, VIT, Vellore 1
RANDOM NUMBER GENERATION
Random numbers are a necessary basic ingredient in the simulation of
almost all discrete systems.
Most computer languages have a subroutine, object, or function that will
generate a random number.
Simulation languages generate random numbers that are used to generate
event times and other random variables.
Random Number Generation
Desirable attributes:
- Uniformity
- Independence
- Efficiency
- Replicability (a study should produce the same results if repeated
  exactly)
- Long cycle length

Random Number Generation (cont.)
Each random number $R_i$ is an independent sample drawn from a continuous
uniform distribution between 0 and 1. That is, the pdf is given by
$f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
Fig. The pdf for random numbers (f(x) plotted against x from 0 to 1)
Random Number Generation (cont.)
The expected value of each $R_i$ is given by
$E(R) = \int_0^1 x\,dx = \left[\frac{x^2}{2}\right]_0^1 = \frac{1}{2}$
and the variance is given by
$V(R) = \int_0^1 x^2\,dx - [E(R)]^2 = \left[\frac{x^3}{3}\right]_0^1 - \left(\frac{1}{2}\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$

Mid-square method:
Square the current four-digit number, pad the square to eight digits, and
take the middle four digits as the next number.
$x_1 = 2421$, $x_1^2 = 05861241$, $R_2 = 0.8612$
$x_2 = 8612$, $x_2^2 = 74166544$, $R_3 = 0.1665$
$x_3 = 1665$, $x_3^2 = 02772225$, $R_4 = 0.7722$
$x_4 = 7722$
However, one may come across the following situations:
1. The series may vanish because a random number obtained is 0000.
2. A random number reproduces itself. Ex: x7=8625, x8=8625, x9=8625
3. A loop occurs. Ex: x11=6100, x12=2100, x13=4100, x14=8100, x15=6100
and the process continues in a circle.
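The mid-square steps above can be sketched in Python as follows. This is a minimal illustration, not code from the original notes: the function name `mid_square` and its `digits` parameter are assumptions made for this example.

```python
def mid_square(seed, count, digits=4):
    """Return `count` successive mid-square values starting from `seed`.

    Hypothetical helper: each value is the middle `digits` digits of the
    square of the previous value, with the square zero-padded to 2*digits.
    """
    values = []
    x = seed
    for _ in range(count):
        squared = str(x * x).zfill(2 * digits)   # e.g. 5861241 -> "05861241"
        start = digits // 2
        x = int(squared[start:start + digits])   # keep the middle digits
        values.append(x)
    return values


values = mid_square(2421, 3)
print(values)                           # the integers x2, x3, x4
print([v / 10**4 for v in values])      # the corresponding R values in [0, 1)
print(mid_square(6100, 4))              # the loop 6100 -> 2100 -> 4100 -> 8100 -> 6100
```

Running the sketch with other seeds is a quick way to see how soon a given starting value degenerates into a fixed point or a short loop.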

Congruence method/Residue method:
Proposed by Lehmer (1951), this method produces a sequence of integers
between zero and m-1 according to the following recursive relationship:
$r_{i+1} = (a r_i + b) \bmod m$
where a, b and m are constants and $r_i$, $r_{i+1}$ are the ith and (i+1)th
random numbers.
The expression implies multiplying $r_i$ by a, adding b and then dividing
by m. Then $r_{i+1}$ is the remainder or residue.
To begin the process of random number generation, in addition to a, b
and m, the value of $r_0$ is also required. It may be any random number
and is called the seed.
Random numbers between 0 and 1 can be generated by
$R_i = \frac{r_i}{m}$, i = 1, 2, ...
Modulo means the remainder after division.

$r_{i+1} = (a r_i + b) \bmod m$
Numerical Illustration:
Let a = 16, b = 18 and m = 23, with $r_0 = 1$.
$r_1$: $(16 \cdot 1 + 18)/23 = 34/23 = 1$, remainder 11
$r_2$: $(16 \cdot 11 + 18)/23 = 194/23 = 8$, remainder 10
$r_3$: $(16 \cdot 10 + 18)/23 = 178/23 = 7$, remainder 17
$r_4$: $(16 \cdot 17 + 18)/23 = 290/23 = 12$, remainder 14
$r_5$: $(16 \cdot 14 + 18)/23 = 242/23 = 10$, remainder 12
$r_6$: $(16 \cdot 12 + 18)/23 = 210/23 = 9$, remainder 3
$r_7$: $(16 \cdot 3 + 18)/23 = 66/23 = 2$, remainder 20
$r_8$: $(16 \cdot 20 + 18)/23 = 338/23 = 14$, remainder 16
$r_9$: $(16 \cdot 16 + 18)/23 = 274/23 = 11$, remainder 21
$r_{10}$: $(16 \cdot 21 + 18)/23 = 354/23 = 15$, remainder 9
$r_{11}$: $(16 \cdot 9 + 18)/23 = 162/23 = 7$, remainder 1
Random numbers obtained: 1, 11, 10, 17, 14, 12, 3, 20, 16, 21, 9, 1.

Linear Congruential Method:
$X_{i+1} = (aX_i + c) \bmod m$, i = 0, 1, 2, ...

(Example)
Let X0 = 27, a = 17, c = 43, and m = 100, then
X1 = (17*27 + 43) mod 100 = 2
R1 = 2 / 100 = 0.02
X2 = (17*2 + 43) mod 100 = 77
R2 = 77 / 100 = 0.77
.........
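The recurrence used in the examples above can be sketched in a few lines of Python. The function name `lcg` and its parameter names are illustrative assumptions, not part of the original notes; the constants are the ones used in the two worked examples.

```python
def lcg(seed, a, c, m, count):
    """Return X_1 .. X_count from X_{i+1} = (a*X_i + c) mod m.

    Hypothetical helper written for this illustration only.
    """
    xs = []
    x = seed
    for _ in range(count):
        x = (a * x + c) % m      # multiply, add the increment, keep the remainder
        xs.append(x)
    return xs


# Example with X0 = 27, a = 17, c = 43, m = 100
xs = lcg(seed=27, a=17, c=43, m=100, count=3)
rs = [x / 100 for x in xs]       # R_i = X_i / m
print(xs, rs)

# Numerical illustration with a = 16, b = 18, m = 23, r0 = 1
print(lcg(seed=1, a=16, c=18, m=23, count=11))
```

Setting `c=0` gives the multiplicative type and `a=1` gives the additive type, so the same sketch covers all three variants.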

The congruence random number generator may be of the additive,
multiplicative or mixed type.
If a = 1, $r_{i+1} = (r_i + b) \bmod m$: additive type
If b = 0, $r_{i+1} = a r_i \bmod m$: multiplicative type
When a is not 1 and b is not 0, the generator is of the mixed type.
The multiplicative methods are considered better than the additive
methods and as good as the mixed methods.
The selection of the values for the constants a, b and m is very
important, because the length of the sequence of random numbers, after
which the sequence repeats, depends on them.

If b = 0, $r_{i+1} = a r_i \bmod m$: multiplicative type
$r_{i+1} = 16 r_i \bmod 23$
Taking $r_0 = 1$:
$r_1$: $(16 \cdot 1)/23 = 0$, remainder 16
$r_2$: $(16 \cdot 16)/23 = 11$, remainder 3
$r_3$: $(16 \cdot 3)/23 = 2$, remainder 2
In this way the following sequence of random numbers can be
generated: 1, 16, 3, 2, 9, 6, 4, 18, 12, 8, 13, 1.
If a = 1, $r_{i+1} = (r_i + b) \bmod m$: additive type
$r_{i+1} = (r_i + 18) \bmod 23$
Taking $r_0 = 1$:
$r_1$: $(1 + 18)/23 = 0$, remainder 19
$r_2$: $(19 + 18)/23 = 1$, remainder 14
$r_3$: $(14 + 18)/23 = 1$, remainder 9
$r_4$: $(9 + 18)/23 = 1$, remainder 4
This method results in the following string of 23 random numbers
before the sequence repeats:
1, 19, 14, 9, 4, 22, 17, 12, 7, 2, 20, 15, 10, 5, 0, 18, 13, 8, 3, 21,
16, 11, 6, 1.

Ultimate test of the linear congruential method:
How closely do the generated numbers R1, R2, ... approximate uniformity
and independence?
Secondary properties:
Maximum density
Maximum period

Maximum Density:
$R_i = \frac{X_i}{m}$, i = 1, 2, ...
Numbers generated from the above equation can only assume values from the set
I = {0, 1/m, 2/m, ..., (m-1)/m},
since each $X_i$ is an integer in the set {0, 1, 2, ..., m-1}.
Each $R_i$ is discrete on I, instead of continuous on the interval [0, 1].
This approximation appears to be of little consequence, provided that the
modulus m is a very large integer.
Values such as $m = 2^{31} - 1$ and $m = 2^{48}$ are in common use.

The max period (P):
Linear Congruential Method (random number generation)
$X_{i+1} = (aX_i + c) \bmod m$, i = 0, 1, 2, ... where
X0: seed value    a: multiplier
c: increment      m: modulus
- For m a power of 2, say $m = 2^b$, and c ≠ 0, the longest possible period
  is $P = m = 2^b$, which is achieved provided that c is relatively prime to m
  (that is, the greatest common factor of c and m is 1), and a = 1 + 4k,
  where k is an integer.
- For m a power of 2, say $m = 2^b$, and c = 0, the longest possible period
  is $P = m/4 = 2^{b-2}$, which is achieved provided that the seed X0 is odd
  and the multiplier, a, is given by a = 3 + 8k or a = 5 + 8k, for some
  k = 0, 1, ...
- For m a prime number and c = 0, the longest possible period is P = m - 1,
  which is achieved provided that the multiplier, a, has the property that
  the smallest integer k such that $a^k - 1$ is divisible by m is k = m - 1.
(Example)
Using the multiplicative congruential method, find the period of the
generator for a = 13, $m = 2^6 = 64$, and X0 = 1, 2, 3, and 4. The solution
is given in the table below. When the seed is 1 or 3, the sequence has
period 16. However, a period of length eight is achieved when the seed
is 2 and a period of length four occurs when the seed is 4.

Table: Period Determination Using Various Seeds
 i    Xi    Xi    Xi    Xi
 0     1     2     3     4
 1    13    26    39    52
 2    41    18    59    36
 3    21    42    63    20
 4    17    34    51     4
 5    29    58    23
 6    57    50    43
 7    37    10    47
 8    33     2    35
 9    45           7
10     9          27
11    53          31
12    49          19
13    61          55
14    25          11
15     5          15
16     1           3
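The period of a congruential generator for a given seed can be checked by iterating until the seed reappears. The sketch below is illustrative only; the function name `period` and its defensive cut-off after m steps are assumptions made for this example.

```python
def period(a, m, seed, c=0):
    """Count the steps until X_{i+1} = (a*X_i + c) mod m returns to `seed`.

    Hypothetical helper. Returns None if the seed is never revisited
    within m steps (the seed then lies on the tail of a cycle, not on it).
    """
    x = (a * seed + c) % m
    steps = 1
    while x != seed:
        x = (a * x + c) % m
        steps += 1
        if steps > m:
            return None
    return steps


# Multiplicative generator with a = 13, m = 2**6 = 64 and seeds 1 to 4
for seed in (1, 2, 3, 4):
    print(seed, period(13, 64, seed))
```

The same helper applies to the earlier generators with m = 23 by passing the increment through `c`.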
Test for Random Numbers
1. Frequency test: Uses the Kolmogorov-Smirnov or the chi-square test to
compare the distribution of the set of numbers generated to a uniform
distribution.
2. Runs test: Tests the runs up and down or the runs above and below the
mean by comparing the actual values to expected values. The statistic for
comparison is the chi-square.
3. Autocorrelation test: Tests the correlation between numbers and
compares the sample correlation to the expected correlation of zero.

Test for Random Numbers (contd.)
4. Gap test: Counts the number of digits that appear between repetitions
of a particular digit and then uses the Kolmogorov-Smirnov test to compare
with the expected number of gaps.
5. Poker test: Treats numbers grouped together as a poker hand. Then the
hands obtained are compared to what is expected using the chi-square test.

In testing for uniformity, the hypotheses are as follows:
$H_0: R_i \sim U[0, 1]$
$H_1: R_i \not\sim U[0, 1]$
The null hypothesis, H0, reads that the numbers are distributed uniformly
on the interval [0, 1].
Failure to reject the null hypothesis means that no evidence of
nonuniformity has been detected on the basis of this test.
This does not imply that further testing of the generator for uniformity
is unnecessary.

In testing for independence, the hypotheses are as follows:
$H_0: R_i \sim$ independently
$H_1: R_i \not\sim$ independently
This null hypothesis, H0, reads that the numbers are independent.
Failure to reject the null hypothesis means that no evidence of
dependence has been detected on the basis of this test.
This does not imply that further testing of the generator for
independence is unnecessary.

For each test, a level of significance α must be stated.
The level of significance α is the probability of rejecting the null
hypothesis given that the null hypothesis is true:
α = P(reject H0 | H0 true)
Frequently, α is set to 0.01 or 0.05.
(Hypothesis)
             Actually True        Actually False
Accept       1 - α                β (Type II error)
Reject       α (Type I error)     1 - β

Frequency Tests
Test for uniformity

Two different methods of testing uniformity:
- Kolmogorov-Smirnov test
- Chi-square test
Usage: to measure the degree of agreement between the distribution of a
sample of generated random numbers and the theoretical uniform
distribution.
Basis of both tests: the null hypothesis of no significant difference
between the sample distribution and the theoretical distribution.

Kolmogorov-Smirnov test
Compares the continuous cdf, F(x), of the uniform distribution to the empirical
cdf, SN(x), of the sample of N observations.
By definition:
F(x) = x, 0 ≤ x ≤ 1
If the sample from the random-number generator is R1, R2, ..., RN, then the
empirical cdf, SN(x), is defined by
$S_N(x) = \frac{\text{number of } R_1, R_2, \ldots, R_N \text{ which are } \le x}{N}$
As N becomes larger, SN(x) should become a better approximation to F(x),
provided that the null hypothesis is true.
This test is based on the largest absolute deviation between F(x) and SN(x) over
the range of the random variable. That is,
$D = \max |F(x) - S_N(x)|$
The sampling distribution of D is known and is tabulated as a function of N.
Kolmogorov-Smirnov Test
Procedural Steps:
Step 1: Rank the data from smallest to largest. Let R(i) denote the ith smallest
observation, so that
$R_{(1)} \le R_{(2)} \le R_{(3)} \le \cdots \le R_{(N)}$
Step 2: Compute
$D^+ = \max_{1 \le i \le N} \left\{ \frac{i}{N} - R_{(i)} \right\}$
$D^- = \max_{1 \le i \le N} \left\{ R_{(i)} - \frac{i-1}{N} \right\}$
Step 3: Compute $D = \max(D^+, D^-)$.
Step 4: Determine the critical value Dα from the table for the specified significance
level α and the given sample size.
Step 5: If the sample statistic D is greater than the critical value, Dα, the null
hypothesis that the data are a sample from a uniform distribution is rejected.
If D ≤ Dα, conclude that no difference has been detected between the true
distribution of {R1, R2, ..., RN} and the uniform distribution.

Example Problem:
The sequence of numbers 0.44, 0.81, 0.14, 0.05, 0.93 has been generated.
Use the Kolmogorov-Smirnov test with α = 0.05 to determine if the
hypothesis that the numbers are uniformly distributed on the interval [0, 1]
can be rejected.
R(i)               0.05   0.14   0.44   0.81   0.93
i/N                0.20   0.40   0.60   0.80   1.00
i/N - R(i)         0.15   0.26   0.16    -     0.07
(i-1)/N            0.00   0.20   0.40   0.60   0.80
R(i) - (i-1)/N     0.05    -     0.04   0.21   0.13
(A dash marks a negative difference, which cannot be the maximum.)
$D^+ = \max_{1 \le i \le N} \left\{ \frac{i}{N} - R_{(i)} \right\} = 0.26$
$D^- = \max_{1 \le i \le N} \left\{ R_{(i)} - \frac{i-1}{N} \right\} = 0.21$
$D = \max(D^+, D^-) = 0.26$
The critical value of D from the table for the given level of
significance α = 0.05 and sample size N = 5 is 0.565.
Since the computed value 0.26 is less than the tabulated critical value
0.565, the hypothesis of no difference between the distribution of the
generated numbers and the uniform distribution is not rejected.
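The five procedural steps can be sketched in Python as below. The function name `ks_uniform` is an assumption made for this illustration, and the critical value is simply the tabulated figure quoted in the example rather than something the code derives.

```python
def ks_uniform(sample):
    """Return (D_plus, D_minus, D) for testing `sample` against U[0, 1].

    Hypothetical helper following the procedural steps in the text.
    """
    ordered = sorted(sample)                      # Step 1: rank the data
    n = len(ordered)
    # Step 2: enumerate() is 0-based, so rank i corresponds to index i - 1
    d_plus = max((k + 1) / n - r for k, r in enumerate(ordered))
    d_minus = max(r - k / n for k, r in enumerate(ordered))
    return d_plus, d_minus, max(d_plus, d_minus)  # Step 3


d_plus, d_minus, d = ks_uniform([0.44, 0.81, 0.14, 0.05, 0.93])
critical = 0.565   # tabulated value for alpha = 0.05, N = 5 (quoted in the text)
print(round(d_plus, 2), round(d_minus, 2), round(d, 2))
print("reject uniformity" if d > critical else "do not reject uniformity")
```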


The Chi-Square Test:
The chi-square test uses the sample statistic
$\chi_0^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
where
$O_i$ = observed number in the ith class
$E_i$ = expected number in the ith class
n = number of classes
For the uniform distribution, $E_i$, the expected number in each class, is
$E_i = \frac{N}{n}$ (for equally spaced classes)
where N is the total number of observations.
Note: The sampling distribution of $\chi_0^2$ is approximately the chi-
square distribution with n-1 degrees of freedom.
Example:
Use the chi-square test with α = 0.05 to test whether the data shown below are
uniformly distributed:
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.79 0.64 0.70 0.81 0.94 0.74 0.22 0.74
0.96 0.99 0.77 0.67 0.56 0.41 0.52 0.73 0.99 0.02
0.47 0.30 0.17 0.82 0.56 0.05 0.45 0.31 0.78 0.05
0.79 0.71 0.23 0.19 0.82 0.93 0.65 0.37 0.39 0.42
0.99 0.17 0.99 0.46 0.05 0.66 0.10 0.42 0.18 0.49
0.37 0.51 0.54 0.01 0.81 0.28 0.69 0.34 0.75 0.49
0.72 0.43 0.56 0.97 0.30 0.94 0.96 0.58 0.73 0.05
0.06 0.39 0.84 0.24 0.40 0.64 0.40 0.19 0.79 0.62
0.18 0.26 0.97 0.88 0.64 0.47 0.60 0.11 0.29 0.78
Take n=10 intervals of equal length, namely [0.0, 0.1), [0.1, 0.2), [0.2, 0.3),
[0.3, 0.4), [0.4, 0.5), [0.5, 0.6), [0.6, 0.7), [0.7, 0.8), [0.8, 0.9) and [0.9, 1.0)
Computations for Chi-Square Test
Interval Oi Ei Oi-Ei (Oi-Ei)2 (Oi-Ei)2/ Ei
1 8 10 -2 4 0.4
2 8 10 -2 4 0.4
3 10 10 0 0 0.0
4 9 10 -1 1 0.1
5 12 10 2 4 0.4
6 8 10 -2 4 0.4
7 10 10 0 0 0.0
8 14 10 4 16 1.6
9 10 10 0 0 0.0
10 11 10 1 1 0.1
Total 100 100 0 3.4
The value of $\chi_0^2$ is 3.4. This is compared with the critical value
$\chi^2_{0.05,9} = 16.9$.
Since $\chi_0^2$ is much smaller than the tabulated value of $\chi^2_{0.05,9}$,
the null hypothesis of a uniform distribution is not rejected.
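The statistic in the computation table can be reproduced with a short Python sketch. The function name `chi_square_uniform` is an assumption for this example; the observed counts are taken from the table and the critical value is the tabulated figure quoted in the text.

```python
def chi_square_uniform(observed):
    """Chi-square statistic for equally wide classes under a uniform model.

    Hypothetical helper: every class has the same expected count N / n.
    """
    n_classes = len(observed)
    expected = sum(observed) / n_classes
    return sum((o - expected) ** 2 / expected for o in observed)


observed = [8, 8, 10, 9, 12, 8, 10, 14, 10, 11]   # O_i from the table
chi2 = chi_square_uniform(observed)
critical = 16.9                                    # chi-square(0.05, 9) from the text
print(round(chi2, 2))
print("reject uniformity" if chi2 > critical else "do not reject uniformity")
```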
Autocorrelation Test
Tests the correlation between numbers and compares the sample
correlation to the expected correlation of zero.
Used to test the dependence between numbers in a sequence.

Tests for Auto Correlation:
Consider the following sequence of numbers:
0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.83 0.93
0.99 0.15 0.33 0.35 0.91 0.41 0.60 0.27 0.75 0.88
0.68 0.49 0.05 0.43 0.95 0.58 0.19 0.36 0.69 0.87
Test whether the 3rd, 8th, 13th and so on, numbers in the sequence given above are
autocorrelated. (Use α = 0.05)
Here i = 3 (beginning with the third number)
m = 5 (every five numbers)
The value M is the largest integer such that i+(M+1)m ≤ N,
where N is the total number of values in the sequence
M = 4 (largest integer such that 3+(M+1)5 ≤ 30)

$\hat{\rho}_{im}$ = autocorrelation between the numbers $R_i, R_{i+m}, R_{i+2m}, \ldots, R_{i+(M+1)m}$.
$\hat{\rho}_{im} = \frac{1}{M+1} \left[ \sum_{k=0}^{M} R_{i+km} R_{i+(k+1)m} \right] - 0.25$
$\hat{\rho}_{35} = \frac{1}{4+1} \left[ (0.23)(0.28) + (0.28)(0.33) + (0.33)(0.27) + (0.27)(0.05) + (0.05)(0.36) \right] - 0.25 = -0.1945$
The standard deviation of the estimator is
$\sigma_{\hat{\rho}_{im}} = \frac{\sqrt{13M + 7}}{12(M+1)}$
$\sigma_{\hat{\rho}_{35}} = \frac{\sqrt{13(4) + 7}}{12(4+1)} = 0.128$
Then the test statistic is
$Z_0 = \frac{\hat{\rho}_{im}}{\sigma_{\hat{\rho}_{im}}}$
$Z_0 = \frac{-0.1945}{0.1280} \approx -1.52$
After computing $Z_0$, do not reject the null hypothesis of
independence if $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$.
$z_{0.025} = 1.96$
Therefore the hypothesis of independence cannot be rejected
on the basis of this test.
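The calculation above can be checked with the following Python sketch. The function name `autocorrelation_test` is an assumption made for this illustration; `i` is the 1-based starting position and `m` the lag, as in the example.

```python
import math


def autocorrelation_test(numbers, i, m):
    """Return (M, rho_hat, sigma, z0) for the lag-m autocorrelation test.

    Hypothetical helper; `i` is 1-based, as in the worked example.
    """
    n = len(numbers)
    big_m = (n - i) // m - 1          # largest M with i + (M + 1) * m <= N
    # R_i, R_{i+m}, ..., R_{i+(M+1)m}, converted to 0-based indexing
    picked = [numbers[i - 1 + k * m] for k in range(big_m + 2)]
    products = sum(picked[k] * picked[k + 1] for k in range(big_m + 1))
    rho_hat = products / (big_m + 1) - 0.25
    sigma = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    return big_m, rho_hat, sigma, rho_hat / sigma


data = [0.12, 0.01, 0.23, 0.28, 0.89, 0.31, 0.64, 0.28, 0.83, 0.93,
        0.99, 0.15, 0.33, 0.35, 0.91, 0.41, 0.60, 0.27, 0.75, 0.88,
        0.68, 0.49, 0.05, 0.43, 0.95, 0.58, 0.19, 0.36, 0.69, 0.87]
big_m, rho_hat, sigma, z0 = autocorrelation_test(data, i=3, m=5)
print(big_m, round(rho_hat, 4), round(sigma, 3), round(z0, 2))
print("reject independence" if abs(z0) > 1.96 else "do not reject independence")
```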

Exercise Problem:
Test the following sequence of numbers for uniformity and independence, using
procedures you learned here:
0.594 0.928 0.515 0.055 0.507 0.351 0.262 0.797 0.788 0.442
0.097 0.798 0.227 0.127 0.474 0.825 0.007 0.182 0.929 0.852

Gap Test:
Counts the number of digits that appear between repetitions of a
particular digit and then uses the Kolmogorov-Smirnov test to compare
with the expected number of gaps.
Determines the significance of the interval between the recurrences of
the same digit.
A gap of length x occurs between the recurrences of some specified digit.

Gap Test:
The following example illustrates the length of gaps associated with the digit
3:
4, 1, 3, 5, 1, 7, 2, 8, 2, 0, 7, 9, 1, 3, 5, 2, 7, 9, 4, 1, 6, 3
3, 9, 6, 3, 4, 8, 2, 3, 1, 9, 4, 4, 6, 8, 4, 1, 3, 8, 9, 5, 5, 7
3, 9, 5, 9, 8, 5, 3, 2, 2, 3, 7, 4, 7, 0, 3, 6, 3, 5, 9, 9, 5, 5
5, 0, 4, 6, 8, 0, 4, 7, 0, 3, 3, 0, 9, 5, 7, 9, 5, 1, 6, 6, 3, 8
8, 8, 9, 2, 9, 1, 8, 5, 4, 4, 5, 0, 2, 3, 9, 7, 1, 2, 0, 3, 6, 3
To facilitate the analysis, digit 3 has been underlined.
There are eighteen 3's in the list, so only 17 gaps can occur.
The first gap is of length 10, the second gap is of length 7, and so on.
The frequency of the gaps is of interest.

The probability of the first gap is determined as follows:
$P(\text{gap of } 10) = \underbrace{P(\text{no } 3) \cdots P(\text{no } 3)}_{10 \text{ of these terms}} \, P(3) = (0.9)^{10} (0.1)$
since the probability that any digit is not a 3 is 0.9, and the probability that
any digit is a 3 is 0.1.
In general,
P(t followed by exactly x non-t digits) = $(0.9)^x (0.1)$, x = 0, 1, 2, ...
To fully analyze a set of numbers for independence using the gap test,
every digit, 0, 1, 2, ..., 9, must be analyzed.
The observed frequencies of the various gap sizes for all the digits are
recorded and compared to the theoretical frequency using the
Kolmogorov-Smirnov test for discretized data.
The theoretical frequency distribution for randomly ordered digits is given
by
$P(\text{gap} \le x) = F(x) = 0.1 \sum_{n=0}^{x} (0.9)^n = 1 - 0.9^{x+1}$
Procedural Steps:
When applying the test to random numbers, class intervals such as [0, 0.1),
[0.1, 0.2), ... play the role of random digits.
1. Specify the cdf for the theoretical frequency distribution given by the
equation $F(x) = 1 - 0.9^{x+1}$, based on the selected class interval width.
2. Arrange the observed sample of gaps in a cumulative distribution
with these same classes.
3. Find D, the maximum deviation between F(x) and SN(x), as in the equation
$D = \max |F(x) - S_N(x)|$
4. Determine the critical value Dα from Table A.8 for the specified
value of α and the sample size N.
5. If the calculated value of D is greater than the tabulated value of
Dα, the null hypothesis of independence is rejected.
Example:
Based on the frequency with which gaps occur, analyze the 110 digits
above to test whether they are independent. Use α = 0.05.
The number of gaps is given by the number of data values minus the
number of distinct digits, or 110-10 = 100 in the example.
The number of gaps associated with the various digits is as follows.
Digit             0   1   2   3    4    5    6   7   8   9
Number of gaps    7   8   8   17   10   13   7   8   9   13
Gap Test Example:
G.L       Frequency   R.F    C.R.F   F(x)     |F(x) - SN(x)|
0-3       35          0.35   0.35    0.3439   0.0061
4-7       22          0.22   0.57    0.5695   0.0005
8-11      17          0.17   0.74    0.7176   0.0224
12-15     9           0.09   0.83    0.8147   0.0153
16-19     5           0.05   0.88    0.8784   0.0016
20-23     6           0.06   0.94    0.9202   0.0198
24-27     3           0.03   0.97    0.9477   0.0223
28-31     0           0.0    0.97    0.9657   0.0043
32-35     0           0.0    0.97    0.9775   0.0075
36-39     2           0.02   0.99    0.9852   0.0048
40-43     0           0.0    0.99    0.9903   0.0003
44-47     1           0.01   1.00    0.9936   0.0064
G.L: Gap Length    R.F: Relative Frequency
C.R.F: Cumulative Relative Frequency
The critical value of D is given by
$D_{0.05} = \frac{1.36}{\sqrt{100}} = 0.136$
Since $D = \max |F(x) - S_N(x)| = 0.0224$
is less than $D_{0.05}$, do not reject the hypothesis of independence on the basis
of this test.
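The gap counting and the comparison against F(x) can be sketched in Python as follows. The helper names `gap_lengths` and `gap_ks_statistic` are assumptions made for this illustration; the digits are the 110 values listed earlier and the class frequencies are those of the table.

```python
import math


def gap_lengths(digits, target):
    """Lengths of the gaps between successive occurrences of `target`."""
    positions = [k for k, d in enumerate(digits) if d == target]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def gap_ks_statistic(class_counts, class_width=4):
    """Max |F(x) - S_N(x)| over classes 0..w-1, w..2w-1, ... (hypothetical helper)."""
    total = sum(class_counts)
    cumulative, d = 0, 0.0
    for k, count in enumerate(class_counts):
        cumulative += count
        upper = (k + 1) * class_width - 1      # largest gap length in class k
        f = 1 - 0.9 ** (upper + 1)             # theoretical cdf at that length
        d = max(d, abs(f - cumulative / total))
    return d


digits = [4, 1, 3, 5, 1, 7, 2, 8, 2, 0, 7, 9, 1, 3, 5, 2, 7, 9, 4, 1, 6, 3,
          3, 9, 6, 3, 4, 8, 2, 3, 1, 9, 4, 4, 6, 8, 4, 1, 3, 8, 9, 5, 5, 7,
          3, 9, 5, 9, 8, 5, 3, 2, 2, 3, 7, 4, 7, 0, 3, 6, 3, 5, 9, 9, 5, 5,
          5, 0, 4, 6, 8, 0, 4, 7, 0, 3, 3, 0, 9, 5, 7, 9, 5, 1, 6, 6, 3, 8,
          8, 8, 9, 2, 9, 1, 8, 5, 4, 4, 5, 0, 2, 3, 9, 7, 1, 2, 0, 3, 6, 3]

gaps_for_3 = gap_lengths(digits, 3)
total_gaps = sum(len(gap_lengths(digits, t)) for t in range(10))
d = gap_ks_statistic([35, 22, 17, 9, 5, 6, 3, 0, 0, 2, 0, 1])
critical = 1.36 / math.sqrt(100)               # D_0.05 for N = 100 gaps
print(gaps_for_3[:2], len(gaps_for_3), total_gaps)
print(round(d, 4), "reject" if d > critical else "do not reject")
```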

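% Theoretical cdf F(x) = 1 - 0.9^(x+1) at the upper end of each gap-length class
clear all;
clc;
x = [3 7 11 15 19 23 27 31 35 39 43 47];   % largest gap length in each class
P = 1 - 0.9.^(x+1)                         % no semicolon, so P is displayed
% Output:
% P =
%   0.3439 0.5695 0.7176 0.8147
%   0.8784 0.9202 0.9477 0.9657
%   0.9775 0.9852 0.9903 0.9936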

Runs Tests:
Runs up and runs down: Consider a generator that provided a set of 40
numbers in the following sequence:
0.08 0.09 0.23 0.29 0.42 0.55 0.58 0.72 0.89 0.91
0.11 0.16 0.18 0.31 0.41 0.53 0.71 0.73 0.74 0.84
0.02 0.09 0.30 0.32 0.45 0.47 0.69 0.74 0.91 0.95
0.12 0.13 0.29 0.36 0.38 0.54 0.68 0.86 0.88 0.91
Both the Kolmogorov-Smirnov test and the chi-square test would indicate
that the numbers are uniformly distributed.
However, a glance at the ordering shows that the numbers are successively
larger in blocks of 10 values.

If these numbers are rearranged as follows, there is far less reason to doubt
their independence:
0.41 0.68 0.89 0.84 0.74 0.91 0.55 0.71 0.36 0.30
0.09 0.72 0.86 0.08 0.54 0.02 0.11 0.29 0.16 0.18
0.88 0.91 0.95 0.69 0.09 0.38 0.23 0.32 0.91 0.53
0.31 0.42 0.73 0.12 0.74 0.45 0.13 0.47 0.58 0.29
The runs test examines the arrangement of numbers in a sequence to test
the hypothesis of independence.

Example: Looking at a sequence of coin tosses will help with some
terminology. Consider the following sequence generated by tossing a coin
10 times:
H T T H H T T T H T
There are three mutually exclusive outcomes, or events, with respect to the
sequence. Two of the possibilities are rather obvious: the toss can result
in a head or a tail. The third possibility is "no event".
The first head is preceded by no event and the last tail is succeeded by no
event.

A run is defined as a succession of similar events preceded and
followed by a different event.
The length of the run is the number of events that occur in the run.
H T T H H T T T H T
In the coin flipping example discussed previously: there are six runs.
Length of the first run: one
Length of the second run: two
Length of the third run: two
Length of the fourth run: three
Length of the fifth run: one
Length of the sixth run: one

There are two possible concerns in a runs test for a sequence of numbers:
First concern: the number of runs
Second concern: the length of runs
The types of runs counted in the first case might be runs up and runs down.
An up run is a sequence of numbers each of which is succeeded by a larger
number.
A down run is a sequence of numbers each of which is succeeded by a smaller
number.

Consider the following sequence of 15 numbers:
0.87 0.15 0.23 0.45 0.69 0.32 0.30 0.19 0.24
0.18 0.65 0.82 0.93 0.22 0.81
The numbers are given “+” or a “-” depending on whether they are followed
by a larger number or a smaller number. Since there are 15 numbers, and
they are all different, there will be 14 +’s and –’s. The last number is
followed by “no event” and hence will get neither a + nor a -.
The sequence of 14 +'s and -'s is as follows:
- + + + - - - + - + + + - +
There are eight runs:
Length of first run: one
Length of second run: three
Length of third run: three, and so on.
In total there are four runs up and four runs down.
Consider the following sequence of numbers:
0.08 0.18 0.23 0.36 0.42 0.55 0.63 0.72 0.89 0.91
This sequence has one run, a run up.
Note: It is unlikely that a valid random number generator would produce such a
sequence.
Next, consider the following sequence:
0.08 0.93 0.15 0.96 0.26 0.84 0.28 0.79 0.36 0.57
This sequence has nine runs. Five up and four down.
Note: It is unlikely that a sequence of 10 numbers would have this many runs.
What is more likely is that the number of runs will be somewhere between
the two extremes. These two extremes can be formalized as follows.
If N = number of numbers in a sequence,
the maximum number of runs = N-1
the minimum number of runs = one.
If a = total number of runs in a truly random sequence,
the mean and variance of a are given by
Mean, $\mu_a = \frac{2N - 1}{3}$    Variance, $\sigma_a^2 = \frac{16N - 29}{90}$
For N > 20, the distribution of a is reasonably approximated by a normal
distribution, $N(\mu_a, \sigma_a^2)$.
This approximation can be used to test the independence of numbers from a
generator.
In that case the standardized normal test statistic is developed by subtracting
the mean from the observed number of runs (a) and dividing by the standard
deviation. That is, the test statistic is
$Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{a - [(2N - 1)/3]}{\sqrt{(16N - 29)/90}}$
where $Z_0 \sim N(0, 1)$. Failure to reject the hypothesis of independence occurs
when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where α is the level of significance.
Based on runs up and runs down, determine whether the following
sequence of 40 numbers is such that the hypothesis of independence can be
rejected where α = 0.05.
0.41 0.68 0.89 0.94 0.74 0.91 0.55 0.62 0.36 0.27
0.19 0.72 0.75 0.08 0.54 0.02 0.01 0.36 0.16 0.28
0.18 0.01 0.95 0.69 0.18 0.47 0.23 0.32 0.82 0.53
0.31 0.42 0.73 0.04 0.83 0.45 0.13 0.57 0.63 0.29
The sequence of runs up and runs down is as follows:
+ + + - + - + - - - + + - + - - + - + - - + - -
+ - + + - - + + - + - - + + -
There are 26 runs in this sequence. With N=40 and a=26,
Mean, $\mu_a = \frac{2N - 1}{3} = \frac{2(40) - 1}{3} = 26.33$
Variance, $\sigma_a^2 = \frac{16N - 29}{90} = \frac{16(40) - 29}{90} = 6.79$
$Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{26 - 26.33}{\sqrt{6.79}} = -0.13$
Now the critical value is $z_{0.025} = 1.96$, so the independence of the numbers
cannot be rejected on the basis of this test.
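Counting runs up and down and forming Z0 can be sketched in Python as below. The function name `runs_up_down` is an assumption made for this example, and the sketch assumes that no two neighbouring numbers are equal, as is the case in the sequences used here.

```python
import math


def runs_up_down(numbers):
    """Return (number of runs a, Z0) for the runs-up-and-down test.

    Hypothetical helper; assumes neighbouring values are never equal.
    """
    ups = [b > a for a, b in zip(numbers, numbers[1:])]     # True = "+", False = "-"
    runs = 1 + sum(1 for s, t in zip(ups, ups[1:]) if s != t)
    n = len(numbers)
    mean = (2 * n - 1) / 3
    variance = (16 * n - 29) / 90
    return runs, (runs - mean) / math.sqrt(variance)


fifteen = [0.87, 0.15, 0.23, 0.45, 0.69, 0.32, 0.30, 0.19, 0.24,
           0.18, 0.65, 0.82, 0.93, 0.22, 0.81]
forty = [0.41, 0.68, 0.89, 0.94, 0.74, 0.91, 0.55, 0.62, 0.36, 0.27,
         0.19, 0.72, 0.75, 0.08, 0.54, 0.02, 0.01, 0.36, 0.16, 0.28,
         0.18, 0.01, 0.95, 0.69, 0.18, 0.47, 0.23, 0.32, 0.82, 0.53,
         0.31, 0.42, 0.73, 0.04, 0.83, 0.45, 0.13, 0.57, 0.63, 0.29]

print(runs_up_down(fifteen)[0])          # number of runs in the 15-number sequence
a, z0 = runs_up_down(forty)
print(a, round(z0, 2))
print("reject independence" if abs(z0) > 1.96 else "do not reject independence")
```

The normal approximation is only meant for N > 20, so for the 15-number sequence only the run count is used.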

Runs above and below the mean:
The test for runs up and runs down is not completely adequate to assess
the independence of a group of numbers.
Consider the following 40 numbers:
0.63 0.72 0.79 0.81 0.52 0.94 0.83 0.93 0.87 0.67
0.54 0.83 0.89 0.55 0.88 0.77 0.74 0.95 0.82 0.86
0.43 0.32 0.36 0.18 0.08 0.19 0.18 0.27 0.36 0.34
0.31 0.45 0.49 0.43 0.46 0.35 0.25 0.39 0.47 0.41
The sequence of runs up and runs down is as follows.
+ + + - + - + - - - + + - + - - + - + - - + - -
+ - + + - - + + - + - - + + -
This is the same as in the previous example. Thus, the numbers would pass the
runs-up and runs-down test.

However, it can be observed that the first 20 numbers are all above the mean
[(0.99+0.00)/2=0.495] and the last 20 numbers are all below the mean. Such an
occurrence is highly unlikely.
The previous runs analysis can be used to test for this condition, if the
definition of a run is changed.
Runs will be described as being above the mean or below the mean.
A "+" sign will be used to denote an observation above the mean, and
a "-" sign will denote an observation below the mean.
Consider the following sequence of 20 two-digit random numbers:
0.40 0.84 0.75 0.18 0.13 0.92 0.57 0.77 0.30 0.71
0.42 0.05 0.78 0.74 0.68 0.03 0.18 0.51 0.10 0.37
The pluses and minuses are as follows:
- + + - - + + + - + - - + + + - - + - -
Length of the first run (below the mean): one
Length of the second run (above the mean): two, and so on.
Total number of runs: 11
Number of runs above the mean: five
Number of runs below the mean: six

Let
n1 = number of individual observations above the mean
n2 = number of individual observations below the mean
b = total number of runs
Notice that
Maximum number of runs: N = n1 + n2
Minimum number of runs: one
For the given n1 and n2,
Mean, $\mu_b = \frac{2 n_1 n_2}{N} + \frac{1}{2}$    Variance, $\sigma_b^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - N)}{N^2 (N - 1)}$
For either n1 or n2 greater than 20, b is approximately normally
distributed.
The test statistic is
$Z_0 = \frac{b - (2 n_1 n_2 / N) - 1/2}{\left[ \frac{2 n_1 n_2 (2 n_1 n_2 - N)}{N^2 (N - 1)} \right]^{1/2}}$
Failure to reject the hypothesis of independence occurs when
$-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$
where α is the level of significance.

Determine whether there is an excessive number of runs above or below
the mean for the sequence of numbers given in the following example.
0.41 0.68 0.89 0.94 0.74 0.91 0.55 0.62 0.36 0.27
0.19 0.72 0.75 0.08 0.54 0.02 0.01 0.36 0.16 0.28
0.18 0.01 0.95 0.69 0.18 0.47 0.23 0.32 0.82 0.53
0.31 0.42 0.73 0.04 0.83 0.45 0.13 0.57 0.63 0.29
The assignment of +'s and -'s results in the following:
- + + + + + + + - - - + + - + - - - - -
- - + + - - - - + + - - + - + - - + + -
The values are n1=18; n2=22; N=n1+n2=40; b=17.
Mean, $\mu_b = \frac{2 n_1 n_2}{N} + \frac{1}{2} = \frac{2(18)(22)}{40} + \frac{1}{2} = 20.3$
Variance, $\sigma_b^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - N)}{N^2 (N - 1)} = \frac{2(18)(22)[2(18)(22) - 40]}{(40)^2 (40 - 1)} = 9.54$
Since n2 is greater than 20, the normal approximation is acceptable,
resulting in a Z0 value of
$Z_0 = \frac{17 - 20.3}{\sqrt{9.54}} = -1.07$
Since $z_{0.025} = 1.96$, the hypothesis of independence cannot be rejected
on the basis of this test.
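The same computation can be sketched in Python. The function name `runs_above_below` is an assumption made for this illustration, and the mean is fixed at 0.495 as in the text rather than estimated from the sample.

```python
import math


def runs_above_below(numbers, mean=0.495):
    """Return (n1, n2, b, Z0) for the runs-above-and-below-the-mean test.

    Hypothetical helper; values equal to `mean` would count as below it.
    """
    above = [x > mean for x in numbers]                  # True = "+", False = "-"
    n1 = sum(above)
    n2 = len(numbers) - n1
    b = 1 + sum(1 for s, t in zip(above, above[1:]) if s != t)
    n = n1 + n2
    mu = 2 * n1 * n2 / n + 0.5
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return n1, n2, b, (b - mu) / math.sqrt(variance)


forty = [0.41, 0.68, 0.89, 0.94, 0.74, 0.91, 0.55, 0.62, 0.36, 0.27,
         0.19, 0.72, 0.75, 0.08, 0.54, 0.02, 0.01, 0.36, 0.16, 0.28,
         0.18, 0.01, 0.95, 0.69, 0.18, 0.47, 0.23, 0.32, 0.82, 0.53,
         0.31, 0.42, 0.73, 0.04, 0.83, 0.45, 0.13, 0.57, 0.63, 0.29]
twenty = [0.40, 0.84, 0.75, 0.18, 0.13, 0.92, 0.57, 0.77, 0.30, 0.71,
          0.42, 0.05, 0.78, 0.74, 0.68, 0.03, 0.18, 0.51, 0.10, 0.37]

n1, n2, b, z0 = runs_above_below(forty)
print(n1, n2, b, round(z0, 2))
print(runs_above_below(twenty)[2])       # run count for the 20-number sequence
```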

RUNS TEST: LENGTH OF RUNS
Yet another concern is the length of runs.
As an example of what might occur, consider the following sequence of
numbers.
0.16, 0.27, 0.58, 0.63, 0.45, 0.21, 0.72, 0.87, 0.27, 0.15, 0.92, 0.85, …
Assume that this sequence continues in a like fashion: two numbers below
the mean followed by two numbers above the mean.
A test of runs above and below the mean would detect no departure from
independence.
However, it is to be expected that runs other than of length two should occur.

Let
N = total number of numbers
$Y_i$ = number of runs of length i in a sequence.
For an independent sequence,
the expected value of $Y_i$ for runs up and runs down is given by
$E(Y_i) = \frac{2}{(i+3)!} \left[ N(i^2 + 3i + 1) - (i^3 + 3i^2 - i - 4) \right]$, $i \le N - 2$
$E(Y_i) = \frac{2}{N!}$, $i = N - 1$

For runs above and below the mean, the expected value of $Y_i$ is
approximately given by
$E(Y_i) = \frac{N w_i}{E(I)}$, N > 20
where $w_i$, the approximate probability that a run has length i, is given by
$w_i = \left( \frac{n_1}{N} \right)^i \left( \frac{n_2}{N} \right) + \left( \frac{n_1}{N} \right) \left( \frac{n_2}{N} \right)^i$, N > 20
and where E(I), the approximate expected length of a run, is given by
$E(I) = \frac{n_1}{n_2} + \frac{n_2}{n_1}$, N > 20

The approximate expected total number of runs (of all lengths) in a
sequence of length N, E(A), is given by
$E(A) = \frac{N}{E(I)}$, N > 20
The appropriate test is the chi-square test with $O_i$ being the observed
number of runs of length i. Then the test statistic is
$\chi_0^2 = \sum_{i=1}^{L} \frac{[O_i - E(Y_i)]^2}{E(Y_i)}$
where L = N-1 for runs up and down and L = N for runs above and
below the mean. If the null hypothesis of independence is true, then $\chi_0^2$ is
approximately chi-square distributed with L-1 degrees of freedom.

Given the following sequence of numbers, can the hypothesis that the
numbers are independent be rejected on the basis of the length of runs up
and down at α = 0.05?
0.30 0.48 0.36 0.01 0.54 0.34 0.96 0.06 0.61 0.85
0.48 0.86 0.14 0.86 0.89 0.37 0.49 0.60 0.04 0.83
0.42 0.83 0.37 0.21 0.90 0.89 0.91 0.79 0.57 0.99
0.95 0.27 0.41 0.81 0.96 0.31 0.09 0.06 0.23 0.77
0.73 0.47 0.13 0.55 0.11 0.75 0.36 0.25 0.23 0.72
0.60 0.84 0.70 0.30 0.26 0.38 0.05 0.19 0.73 0.44
For this sequence the +'s and -'s are as follows:
+ - - + - + - + + - + - + + - + + - + -
+ - - + - + - - + - - + + + - - - + + -
- - + - + - - - + - + - - - + - + + -
The lengths of the runs in the sequence are as follows:
1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1,
1, 2, 1, 2, 3, 3, 2, 3, 1, 1, 1, 3, 1, 1, 1, 3, 1, 1, 2, 1

The number of observed runs of each length is as follows:
Run length, i        1    2    3
Observed runs, Oi    26   9    5
The expected numbers of runs of lengths one, two, and three are
computed as
$E(Y_1) = \frac{2}{4!} \left[ 60(1 + 3 + 1) - (1 + 3 - 1 - 4) \right] = 25.08$
$E(Y_2) = \frac{2}{5!} \left[ 60(4 + 6 + 1) - (8 + 12 - 2 - 4) \right] = 10.77$
$E(Y_3) = \frac{2}{6!} \left[ 60(9 + 9 + 1) - (27 + 27 - 3 - 4) \right] = 3.04$
The mean total number of runs (up and down) is
$\mu_a = \frac{2(60) - 1}{3} = 39.67$
Thus far, the $E(Y_i)$ for i = 1, 2, and 3 total 38.89.
The expected number of runs of length 4 or more is the difference
$\mu_a - \sum_{i=1}^{3} E(Y_i)$, or 0.78.
As observed by Hines and Montgomery [1990], there is no general
agreement regarding the minimum value of expected frequencies in
applying the chi-square test. Values 3, 4, and 5 are widely used, and a
minimum of 5 was suggested.
Should an expected frequency be too small, it can be combined with the
expected frequency in an adjacent class interval. The corresponding
observed frequencies would then be combined also, and L would be
reduced by one.
With the foregoing calculations and procedures in mind, the critical value
$\chi^2_{0.05,1}$ is 3.84. (The degrees of freedom equal the number of class
intervals minus 1.)
Table: Length of Runs Up and Down: χ2 Test
Run           Observed number    Expected number    [Oi - E(Yi)]^2 / E(Yi)
length, i     of runs, Oi        of runs, E(Yi)
1             26                 25.08              0.03
2             9  } 14            10.77 } 14.59      0.02
3 or more     5  }               3.82  }
Total         40                 39.67              0.05
Since $\chi_0^2 = 0.05$ is less than the critical value, the hypothesis of
independence cannot be rejected on the basis of this test.
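The expected run counts and the combined chi-square statistic for this example can be reproduced with the Python sketch below. The function name `expected_runs_up_down` is an assumption made for this illustration; the observed counts are those of the table. Because the code keeps unrounded intermediate values, its statistic differs slightly in the second decimal from the table, which sums rounded terms.

```python
import math


def expected_runs_up_down(n, i):
    """E(Y_i): expected number of runs up/down of length i among n numbers.

    Hypothetical helper implementing the two-case formula from the text.
    """
    if i <= n - 2:
        return 2 / math.factorial(i + 3) * (
            n * (i ** 2 + 3 * i + 1) - (i ** 3 + 3 * i ** 2 - i - 4))
    return 2 / math.factorial(n)          # case i = n - 1


n = 60
observed = {1: 26, 2: 9, 3: 5}            # O_i from the example
expected = {i: expected_runs_up_down(n, i) for i in (1, 2, 3)}
mean_total = (2 * n - 1) / 3              # mean total number of runs
rest = mean_total - sum(expected.values())  # expected runs of length 4 or more

# Combine lengths 2, 3 and >= 4 so that every expected frequency is at least 5
o = [observed[1], observed[2] + observed[3]]
e = [expected[1], expected[2] + expected[3] + rest]
chi2 = sum((oi - ei) ** 2 / ei for oi, ei in zip(o, e))
print({i: round(v, 2) for i, v in expected.items()}, round(rest, 2))
print("reject independence" if chi2 > 3.84 else "do not reject independence")
```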

Given the following sequence of numbers, can the hypothesis that the
numbers are independent be rejected on the basis of the length of runs above
and below the mean at α = 0.05?
0.30 0.48 0.36 0.01 0.54 0.34 0.96 0.06 0.61 0.85
0.48 0.86 0.14 0.86 0.89 0.37 0.49 0.60 0.04 0.83
0.42 0.83 0.37 0.21 0.90 0.89 0.91 0.79 0.57 0.99
0.95 0.27 0.41 0.81 0.96 0.31 0.09 0.06 0.23 0.77
0.73 0.47 0.13 0.55 0.11 0.75 0.36 0.25 0.23 0.72
0.60 0.84 0.70 0.30 0.26 0.38 0.05 0.19 0.73 0.44
For this sequence, the +'s and -'s are as follows:
- - - - + - + - + + - + - + + - - + - +
- + - - + + + + + + + - - + + - - - - +
+ - - + - + - - - + + + + - - - - - + -

The number of runs of each length is as follows.
Run length, i        1    2    3    ≥4
Observed runs, Oi    17   9    1    5
There are 28 values above the mean (n1=28) and 32 values below the mean
(n2=32). The probabilities of runs of various lengths, $w_i$, are
$w_i = \left( \frac{n_1}{N} \right)^i \left( \frac{n_2}{N} \right) + \left( \frac{n_1}{N} \right) \left( \frac{n_2}{N} \right)^i$
$w_1 = \left( \frac{28}{60} \right)^1 \left( \frac{32}{60} \right) + \left( \frac{28}{60} \right) \left( \frac{32}{60} \right)^1 = 0.498$
$w_2 = \left( \frac{28}{60} \right)^2 \left( \frac{32}{60} \right) + \left( \frac{28}{60} \right) \left( \frac{32}{60} \right)^2 = 0.249$
$w_3 = \left( \frac{28}{60} \right)^3 \left( \frac{32}{60} \right) + \left( \frac{28}{60} \right) \left( \frac{32}{60} \right)^3 = 0.125$

The expected length of a run, E(I), is
n1 n 2 28 32
E  I      2.02
n 2 n1 32 28
The expected numbers of runs of various lengths as
Nw i
E  Yi   , N  20
E  I
60  0.498 
E  Y1    14.79
2.02
60  0.249 
E  Y2    7.40
2.02
60  0.125 
E  Y3    3.71
2.02
The total number of runs expected is
N 60
E A    29.7 N  20
E  I  2.02
This indicates that approximately 3.8 runs of length four or more can be
expected.
Proceeding by combining adjacent cells in which E(Yi) < 5 produces the
following table.
Table: Length of Runs Above and Below the Mean: $\chi^2$ Test
Run length, i   Observed runs, O_i   Expected runs, E(Y_i)   [O_i - E(Y_i)]^2 / E(Y_i)
1               17                   14.79                   0.33
2               9                    7.40                    0.35
3               1 } 6                3.71 } 7.51             0.30
>= 4            5 }                  3.80 }
Total           32                   29.70                   0.98

The critical value $\chi^2_{0.05,2}$ is 5.99. (The degrees of freedom equal the
number of class intervals minus one.) Since $\chi_0^2 = 0.98$ is less than the
critical value, the hypothesis of independence cannot be rejected on the basis of
this test.
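The calculation above can be retraced with a short script. This is only a sketch: the function name `runs_above_below_chi2` and the rule of merging trailing cells until the last expected frequency reaches 5 are assumptions made for illustration, and because the script carries unrounded intermediate values its expected frequencies differ from the slide's in the second decimal place.

```python
def runs_above_below_chi2(n1, n2, observed):
    """Chi-square statistic for lengths of runs above and below the mean.

    observed[i-1] is the number of runs of length i; the last entry
    counts all runs of that length or longer.
    """
    N = n1 + n2
    p1, p2 = n1 / N, n2 / N
    expected_len = n1 / n2 + n2 / n1        # E(I)
    expected_total = N / expected_len       # E(A)
    k = len(observed)
    # E(Y_i) = N * w_i / E(I) for i = 1 .. k-1, remainder for "k or more".
    expected = [N * (p1**i * p2 + p1 * p2**i) / expected_len
                for i in range(1, k)]
    expected.append(expected_total - sum(expected))
    # Merge trailing cells while the last expected frequency is below 5.
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < 5:
        last_e, last_o = exp.pop(), obs.pop()
        exp[-1] += last_e
        obs[-1] += last_o
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    return expected, chi2, len(exp) - 1     # last item: degrees of freedom


expected, chi2, dof = runs_above_below_chi2(28, 32, [17, 9, 1, 5])
print([round(e, 2) for e in expected], round(chi2, 2), dof)
```

With two degrees of freedom the statistic is compared against 5.99, as in the text.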

Poker Test:
The poker test for independence is based on the frequency with which certain
digits are repeated in a series of numbers.
The following example shows an unusual amount of repetition:
0.255, 0.577, 0.331, 0.414, 0.828, 0.909, 0.303, 0.001, ……..
In each case, a pair of like digits appears in the number that was generated. In
three-digit numbers there are only three possibilities, as follows:
1. The individual digits can all be different.
2. The individual digits can all be the same.
3. There can be one pair of like digits.
The probability associated with each of these possibilities is given by the
following:
$$P(\text{three different digits}) = P(\text{second different from the first}) \times P(\text{third different from the first and second}) = (0.9)(0.8) = 0.72$$
$$P(\text{three like digits}) = P(\text{second digit same as the first}) \times P(\text{third digit same as the first}) = (0.1)(0.1) = 0.01$$
$$P(\text{exactly one pair}) = 1 - 0.72 - 0.01 = 0.27$$
or, equivalently,
$$P(\text{exactly one pair}) = \binom{3}{2}(0.1)(0.9) = 0.27$$
The following example shows how the poker test (in conjunction with
the chi-square test ) is used to ascertain independence.

A sequence of 1000 three-digit numbers has been generated and an analysis
indicates that 680 have three different digits, 289 contain exactly one pair of
like digits, and 31 contain three like digits. Based on the poker test, are these
numbers independent? Let α = 0.05
Combination, i           Observed frequency, O_i   Expected frequency, E_i   (O_i - E_i)^2 / E_i
Three different digits   680                       720                       2.22
Three like digits        31                        10                        44.10
Exactly one pair         289                       270                       1.33
Total                    1000                      1000                      47.65
The appropriate degrees of freedom are one less than the number of class
intervals. Since 47.65 > $\chi^2_{0.05,2}$ = 5.99, the independence of the numbers is
rejected on the basis of this test.
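As a cross-check, the class probabilities can be obtained by brute-force enumeration of all 1000 three-digit strings, and the statistic recomputed from the observed counts. The helper name `poker_class` is illustrative only. Computed without rounding the individual terms, the statistic comes out near 47.66; the table's 47.65 is the sum of the rounded terms.

```python
from itertools import product


def poker_class(digits):
    """Classify a three-digit tuple by how many distinct digits it has."""
    distinct = len(set(digits))
    return {3: "all different", 2: "one pair", 1: "all alike"}[distinct]


# Count each class over 000 .. 999 to get the theoretical probabilities.
counts = {"all different": 0, "one pair": 0, "all alike": 0}
for digits in product(range(10), repeat=3):
    counts[poker_class(digits)] += 1
probs = {name: c / 1000 for name, c in counts.items()}

observed = {"all different": 680, "one pair": 289, "all alike": 31}
n = sum(observed.values())
chi2 = sum((observed[k] - n * probs[k]) ** 2 / (n * probs[k]) for k in observed)
print(counts, round(chi2, 2))
```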

Random Number:
A random number is a number generated by a process whose outcome is
unpredictable and which cannot subsequently be reliably reproduced.
Random numbers are numbers that occur in a sequence such that two
conditions are met:
(1) the values are uniformly distributed over a defined interval or set,
and
(2) it is impossible to predict future values based on past or present
ones.
Random numbers are important in statistical analysis and probability
theory.
The most common set from which random numbers are derived is the set of
single-digit decimal numbers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.

Definition of Random Variable
A real variable X whose value is determined by the outcome of a random experiment is
called a random variable.
A random variable, usually written X, is a variable whose possible values are numerical
outcomes of a random phenomenon. There are two types of random variables, discrete and
continuous.
Example: Consider a random experiment of throwing a die. Then X, the number of points
on the die, is a random variable, since X takes the values 1, 2, 3, 4, 5, and 6, each with
probability 1/6.
Discrete Random Variable: The random variable takes values only in a countable set such
as {0, 1, 2, 3, ..., n}.
Example: Number of printing mistakes on each page of a book
Number of telephone calls received by a telephone operator
Continuous Random Variable: The random variable takes on all values within a certain
interval.
Example: The height, age, and weight of individuals
Amount of rainfall on a rainy day
A random variable can also be described as a variable whose value is unknown, or as a
function that assigns a value to each of an experiment's outcomes.
Random variables are often designated by letters and can be classified as
discrete, which are variables that take specific values, or
continuous, which are variables that can take any value within a
continuous range.

Random Variate Generation
Previous discussions and examples indicated the usefulness of statistical
distributions to model activities that are generally unpredictable or
uncertain.
Example:
Interarrival times and service times at queues, and demands for a product,
are quite often unpredictable in nature, at least to a certain extent.
Usually such variables are modeled as random variables with some specified
statistical distribution, and standard statistical procedures exist for
estimating the parameters of the hypothesized distribution and for testing
the validity of the assumed statistical model.
Assumptions:
• A distribution has been completely specified.
• Samples are generated from this specified distribution, and the generated
samples are used as input to a simulation model.
Purpose of the current discussion:
To illustrate and explain some widely used techniques for generating random
variates, not to give a state-of-the-art survey of the most efficient techniques.
Most simulation modelers will use existing routines available in programming
libraries, or the routines built into the simulation language being used.
However, some programming languages do not have built-in routines for all
of the regularly used distributions, and some computer installations do not
have random-variate generation libraries, in which case the modeler must
construct an acceptable routine.
Hence, it is worthwhile to understand how random-variate generation occurs.

All the techniques discussed here assume that a source of
uniform (0, 1) random numbers, R1, R2, ..., is readily available, where each
R_i has probability density function (pdf)
$$f_R(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
and cumulative distribution function (cdf)
$$F_R(x) = \begin{cases} 0, & x < 0 \\ x, & 0 \le x \le 1 \\ 1, & x > 1 \end{cases}$$
Here, R1, R2, ... represent random numbers uniformly distributed on (0, 1).

INVERSE TRANSFORM TECHNIQUE
Used to generate samples from the
Exponential distribution,
Uniform distribution,
Weibull distribution, and
Triangular distribution.
This technique is also used for sampling from a wide variety of discrete
distributions.
It is the most straightforward technique, but not always the most
computationally efficient.

Exponential distribution:
For the exponential distribution, the probability density function (pdf) is
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
and the cumulative distribution function (cdf) is given by
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
where
λ = mean number of occurrences per time unit
Example:
X1, X2, X3, ... are interarrival times (following an exponential distribution).
λ is the mean number of arrivals per time unit.
For any i, E(X_i) = 1/λ is the mean interarrival time.

Objective: develop a procedure for generating values X1, X2, X3, ...
Step-by-step approach:
Step 1: Compute the cdf of the desired random variable X. For the
exponential distribution, the cdf is
$$F(x) = 1 - e^{-\lambda x}, \quad x \ge 0$$
Step 2: Set F(X) = R on the range of X:
$$1 - e^{-\lambda X} = R \quad \text{on the range } X \ge 0$$
Step 3: Solve the equation F(X) = R for X in terms of R:
$$1 - e^{-\lambda X} = R$$
$$e^{-\lambda X} = 1 - R$$
$$-\lambda X = \ln(1 - R)$$
$$X = -\frac{1}{\lambda}\ln(1 - R), \quad \text{that is, } X = F^{-1}(R)$$
Step 4: Generate uniform random numbers R1, R2, R3, ... and compute
the desired random variates by
$$X_i = F^{-1}(R_i)$$
$$X_i = -\frac{1}{\lambda}\ln(1 - R_i) \quad \text{for } i = 1, 2, 3, \ldots$$
or, more simply,
$$X_i = -\frac{1}{\lambda}\ln R_i$$
(since both R_i and 1 - R_i are uniformly distributed on (0, 1)).
Example: Generation of exponential variates X_i with mean 1, given
random numbers R_i, using X_i = -ln(1 - R_i)
i     1        2        3        4        5
R_i   0.1306   0.0422   0.6597   0.7965   0.7696
X_i   0.1400   0.0431   1.078    1.592    1.468
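The table can be reproduced in a few lines. The function name `exponential_variate` and its `lam` parameter are illustrative names, not part of the original material; the sketch uses the 1 - R form of the formula, which is the one the tabulated values follow.

```python
import math


def exponential_variate(r, lam=1.0):
    """Inverse transform for the exponential: X = -(1/lam) * ln(1 - R).

    Expects 0 <= r < 1 and lam > 0.
    """
    return -math.log(1.0 - r) / lam


R = [0.1306, 0.0422, 0.6597, 0.7965, 0.7696]
X = [exponential_variate(r) for r in R]   # mean 1, so lam = 1
for i, (r, x) in enumerate(zip(R, X), start=1):
    print(i, r, round(x, 4))
```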
Fig. Empirical histogram of 200 uniform random numbers and theoretical uniform
density on (0, 1); empirical histogram of 200 exponential variates and theoretical
exponential density with mean 1.
Fig. Graphical view of the inverse transform technique: $R_1 = 1 - e^{-x_1}$, so
$X_1 = -\ln(1 - R_1)$.

Uniform Distribution:
Consider a random variable X that is uniformly distributed on the
interval [a, b]. The pdf of X is given by
$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
Steps to be followed:
Step 1: The cdf is given by
$$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$$
Step 2: Set $F(X) = \dfrac{X - a}{b - a} = R$.
Step 3: Solving for X in terms of R yields X = a + (b - a)R.
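Step 3 translates directly into code. In this sketch the interval [2, 5] and the seed are arbitrary choices for illustration, and Python's `random.Random` stands in for the assumed source of uniform (0, 1) random numbers.

```python
import random


def uniform_variate(r, a, b):
    """Inverse transform for the uniform distribution: X = a + (b - a) * R."""
    return a + (b - a) * r


rng = random.Random(2018)                 # arbitrary seed, for repeatability
sample = [uniform_variate(rng.random(), 2.0, 5.0) for _ in range(1000)]
print(min(sample), max(sample))           # both fall inside [2, 5]
```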

Weibull Distribution:
Used as a model for time to failure for machines or electronic components.
When the location parameter ν is set to zero, the pdf is given by
$$f(x) = \begin{cases} \dfrac{\beta}{\alpha^{\beta}}\, x^{\beta - 1} e^{-(x/\alpha)^{\beta}}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$
where α > 0 and β > 0 are the scale and shape parameters of the distribution.
Steps to generate a Weibull variate:
Step 1: The cdf is given by
$$F(x) = 1 - e^{-(x/\alpha)^{\beta}}, \quad x \ge 0$$
Step 2: Let
$$F(X) = 1 - e^{-(X/\alpha)^{\beta}} = R$$
Step 3: Solving for X in terms of R yields
$$X = \alpha\,[-\ln(1 - R)]^{1/\beta}$$
Note:
By comparing the following equations
$$X = -\frac{1}{\lambda}\ln(1 - R) \qquad X = \alpha\,[-\ln(1 - R)]^{1/\beta}$$
it can be seen that
if X is a Weibull variate, then $X^{\beta}$ is an exponential variate with mean $\alpha^{\beta}$.
Conversely,
if Y is an exponential variate with mean µ, then $Y^{1/\beta}$ is a Weibull variate
with shape parameter β and scale parameter $\alpha = \mu^{1/\beta}$.
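A minimal sketch of the Weibull generator follows. The function name and the parameter values (α = 2, β = 3) are illustrative assumptions. The last lines check the note numerically for one value of R: raising the Weibull variate to the power β gives α^β times the unit-mean exponential variate -ln(1 - R).

```python
import math


def weibull_variate(r, alpha, beta):
    """Inverse transform for the Weibull (location 0):
    X = alpha * [-ln(1 - R)] ** (1 / beta), for 0 <= r < 1."""
    return alpha * (-math.log(1.0 - r)) ** (1.0 / beta)


alpha, beta, r = 2.0, 3.0, 0.5
x = weibull_variate(r, alpha, beta)

# Round trip through the cdf F(x) = 1 - exp(-(x / alpha) ** beta).
recovered_r = 1.0 - math.exp(-((x / alpha) ** beta))

# X ** beta equals alpha ** beta times the exponential variate -ln(1 - R).
exp_part = -math.log(1.0 - r)
print(x, recovered_r, x ** beta, alpha ** beta * exp_part)
```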

Triangular Distribution:
Consider a random variable X which has pdf
$$f(x) = \begin{cases} x, & 0 \le x \le 1 \\ 2 - x, & 1 < x \le 2 \\ 0, & \text{otherwise} \end{cases}$$
This distribution is called a triangular distribution with end points (0, 2) and
mode at 1. Its cdf is given by
$$F(x) = \begin{cases} 0, & x \le 0 \\ \dfrac{x^2}{2}, & 0 < x \le 1 \\ 1 - \dfrac{(2 - x)^2}{2}, & 1 < x \le 2 \\ 1, & x > 2 \end{cases}$$
For 0 ≤ X ≤ 1,
$$R = \frac{X^2}{2}$$
From this equation, 0 ≤ X ≤ 1 implies that 0 ≤ R ≤ 1/2, in which case $X = \sqrt{2R}$.
For 1 ≤ X ≤ 2,
$$R = 1 - \frac{(2 - X)^2}{2}$$
From this equation, 1 ≤ X ≤ 2 implies that 1/2 ≤ R ≤ 1, in which case $X = 2 - \sqrt{2(1 - R)}$.
Thus, X is generated by
$$X = \begin{cases} \sqrt{2R}, & 0 \le R \le \dfrac{1}{2} \\ 2 - \sqrt{2(1 - R)}, & \dfrac{1}{2} < R \le 1 \end{cases}$$

Empirical continuous distributions:
If the modeler has been unable to find a theoretical distribution that
provides a good model for the input data, then it may be necessary to use
the empirical distribution of the data.
One possibility is to simply resample the observed data itself. This is known
as using the empirical distribution and it makes particularly good sense
when the input process is known to take on a finite number of values.
On the other hand, if the data are drawn from what is believed to be a
continuous valued input process, then it makes sense to interpolate between
the observed data points to fill in the gaps.

Five observations of fire crew response times (in minutes) to incoming alarms
have been collected to be used in a simulation investigating possible alternative
staffing and crew scheduling policies. The data are
2.76 1.83 0.80 1.45 1.24
Before collecting more data, it is desired to develop a preliminary simulation
model which uses a response time distribution based on these five
observations. Thus, a method for generating random variates from the
response time distribution is needed.
Assume that response times X have a range 0 ≤ X ≤ c, where c is unknown
but will be estimated by
$$\hat{c} = \max\{X_i : i = 1, \ldots, n\} = 2.76,$$
where X_i, i = 1, 2, ..., n, are the raw data and
n = 5 is the number of observations.
Arrange the data from smallest to largest and let x(1) ≤ x(2) ≤ ... ≤ x(n).
Since the smallest possible value is believed to be 0, define x(0) = 0.
Assign a probability of 1/n = 1/5 to each interval x(i-1) < x ≤ x(i).
The slope of the ith line segment is given by
$$a_i = \frac{x_{(i)} - x_{(i-1)}}{i/n - (i-1)/n} = \frac{x_{(i)} - x_{(i-1)}}{1/n}$$
The inverse cdf is calculated by
$$X = \hat{F}^{-1}(R) = x_{(i-1)} + a_i\left(R - \frac{i-1}{n}\right)$$
when (i-1)/n < R ≤ i/n.
Table: Summary of Fire Crew Response-Time Data
i   Interval, x(i-1) < x ≤ x(i)   Probability, 1/n   Cumulative probability, i/n   Slope, a_i
1   0.00 < x ≤ 0.80               0.2                0.2                           4.00
2   0.80 < x ≤ 1.24               0.2                0.4                           2.20
3   1.24 < x ≤ 1.45               0.2                0.6                           1.05
4   1.45 < x ≤ 1.83               0.2                0.8                           1.90
5   1.83 < x ≤ 2.76               0.2                1.0                           4.65
For example, if the random number R1 = 0.71 is generated, then R1 is seen to
lie in the fourth interval (between 3/5 = 0.60 and 4/5 = 0.80), so that
$$X_1 = x_{(4-1)} + a_4\left(R_1 - \frac{4-1}{n}\right) = 1.45 + 1.90(0.71 - 0.60) = 1.66$$
Fig: Empirical cdf of fire-crew response times. [Plot of cumulative probability
against response time x: a piecewise-linear curve through (0, 0), (0.8, 0.2),
(1.24, 0.4), (1.45, 0.6), (1.83, 0.8), and (2.76, 1), with R1 = 0.71 mapped to
X1 = 1.45 + 1.90(0.71 - 0.60) = 1.66.]
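The table lookup and interpolation for the fire-crew data can be sketched as follows. The function name `empirical_inverse_cdf` and its `x0` argument are illustrative. The interval index is found as i = ceil(R·n), which matches the condition (i-1)/n < R ≤ i/n for 0 < R ≤ 1, up to floating-point rounding when R falls exactly on an interval boundary.

```python
import math


def empirical_inverse_cdf(r, data, x0=0.0):
    """Piecewise-linear inverse cdf with probability 1/n on each interval."""
    x = [x0] + sorted(data)                 # x(0), x(1), ..., x(n)
    n = len(data)
    i = min(max(math.ceil(r * n), 1), n)    # interval with (i-1)/n < r <= i/n
    slope = (x[i] - x[i - 1]) / (1.0 / n)   # a_i
    return x[i - 1] + slope * (r - (i - 1) / n)


response_times = [2.76, 1.83, 0.80, 1.45, 1.24]
x1 = empirical_inverse_cdf(0.71, response_times)
print(round(x1, 2))
```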

If a large sample of data is available (and sample sizes from several hundred
to tens of thousands are possible with modern, automated data collection),
then it may be more convenient and computationally efficient to first
summarize the data into a frequency distribution with a much smaller
number of intervals and then fit a continuous empirical cdf to the frequency
distribution.
Now the slope of the ith line segment is given by
$$a_i = \frac{x_{(i)} - x_{(i-1)}}{c_i - c_{i-1}}$$
where c_i is the cumulative probability of the first i intervals of the
frequency distribution and x(i-1) < x ≤ x(i) is the ith interval.
The inverse cdf is calculated by
$$X = \hat{F}^{-1}(R) = x_{(i-1)} + a_i\,(R - c_{i-1}) \quad \text{when } c_{i-1} < R \le c_i$$

Suppose that 100 broken-widget repair times have been collected. The data is
collected in terms of the number of the observations in various intervals. For
example, there were 31 observations between 0 and 0.5 hour, 10 between 0.5
and 1 hour, 25 between 1 and 1.5 hour, 34 between 1.5 and 2.0 hour.
Assume that all repairs take at least 15 minutes, so that X ≥ 0.25 hour always.
Hence, set x(0) = 0.25
Table: Summary of Repair Time Data
i   Interval (hours)   Frequency   Relative frequency   Cumulative frequency, c_i   Slope, a_i
1   0.25 ≤ x ≤ 0.5     31          0.31                 0.31                        0.81
2   0.5 < x ≤ 1.0      10          0.10                 0.41                        5.00
3   1.0 < x ≤ 1.5      25          0.25                 0.66                        2.00
4   1.5 < x ≤ 2.0      34          0.34                 1.00                        1.47

$$X = \hat{F}^{-1}(R) = x_{(i-1)} + a_i\,(R - c_{i-1}) \quad \text{when } c_{i-1} < R \le c_i$$
When the random number R1 = 0.83 is generated, R1 lies between
c3 = 0.66 and c4 = 1.00, so
X1 = x(4-1) + a4(R1 - c(4-1)) = 1.5 + 1.47(0.83 - 0.66) = 1.75
When R2 = 0.31, X2 = ?
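For grouped data the interval is located by searching the cumulative frequencies rather than by index arithmetic. The sketch below uses `bisect_left` from the standard library for that search; the function name and list names are illustrative. Slopes are computed from the breakpoints instead of being read from the rounded table column.

```python
from bisect import bisect_left


def grouped_inverse_cdf(r, breakpoints, cum_probs):
    """Inverse of a piecewise-linear empirical cdf built from grouped data.

    breakpoints = [x(0), ..., x(k)], cum_probs = [c_0 = 0, c_1, ..., c_k = 1].
    """
    i = max(1, bisect_left(cum_probs, r))   # c_{i-1} < r <= c_i
    slope = ((breakpoints[i] - breakpoints[i - 1])
             / (cum_probs[i] - cum_probs[i - 1]))
    return breakpoints[i - 1] + slope * (r - cum_probs[i - 1])


x_pts = [0.25, 0.5, 1.0, 1.5, 2.0]          # interval end points, hours
c = [0.0, 0.31, 0.41, 0.66, 1.00]           # cumulative frequencies
x1 = grouped_inverse_cdf(0.83, x_pts, c)
print(round(x1, 2))
```

The same call with a different R can be used to check the exercise posed for R2.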

Fig. Generating Variates from the Empirical Distribution for Repair Time Data.
[Plot of cumulative frequency against repair time: a piecewise-linear curve
through (0.25, 0), (0.5, 0.31), (1, 0.41), (1.5, 0.66), and (2, 1), with
R1 = 0.83 mapped to X1 = 1.75.]
Acceptance-Rejection Technique
Suppose that an analyst needed to devise a method for generating random
variates, X, uniformly distributed between 1/4 and 1. One way to proceed
would be to follow these steps:
Step 1: Generate a random number R.
Step 2a: If R ≥ 1/4 , accept X = R, then go to step 3.
Step 2b: If R < 1/4 , reject R, and return to step 1.
Step 3: If another uniform random variate on [1/4, 1] is
needed, repeat the procedure beginning at step 1. If
not, stop.
Each time step 1 is executed, a new random number R must be generated.
Step 2a is an “acceptance” and step 2b is a “rejection” in this acceptance-
rejection technique.

To summarize the technique, random variates (R) with some distribution
(here uniform on [0, 1]) are generated until some condition (R ≥ 1/4) is
satisfied.
When the condition is finally satisfied, the desired random variate, X (here
uniform on [1/4, 1]), can be computed (X = R).
This procedure can be shown to be correct by recognizing that the accepted
values of R are conditioned values; that is, R itself does not have the desired
distribution, but R conditioned on the event {R ≥ 1/4} does have the desired
distribution.
To show this, take 1/4 ≤ a < b ≤ 1; then
$$P(a < R \le b \mid 1/4 \le R \le 1) = \frac{P(a < R \le b)}{P(1/4 \le R \le 1)} = \frac{b - a}{3/4}$$
which is the probability that a uniform variate on [1/4, 1] falls in (a, b].
The above equation says that the probability distribution of R, given that R is
between 1/4 and 1 (all other values of R are thrown out), is the desired
distribution. Therefore, if 1/4 ≤ R ≤ 1, set X = R.
The efficiency of an acceptance-rejection technique depends heavily on being
able to minimize the number of rejections.
In this example, the probability of a rejection is P(R < 1/4) = 1/4, so that the
number of rejections is a geometrically distributed random variable with
probability of "success" p = 3/4 and mean number of rejections
(1/p - 1) = 4/3 - 1 = 1/3.
The mean number of random numbers R required to generate one variate X
is one more than the mean number of rejections; hence, it is 4/3 = 1.33.
In other words, generating 1000 values of X would require approximately
1333 random numbers R.
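Steps 1 to 3 and the efficiency estimate can be checked with a small simulation. This is a sketch under stated assumptions: Python's `random.Random` plays the role of the uniform source, the seed and sample size are arbitrary, and the function name is illustrative.

```python
import random


def uniform_quarter_to_one(rng):
    """Acceptance-rejection for a uniform variate on [1/4, 1).

    Returns (X, draws), where draws is the number of random numbers used.
    """
    draws = 0
    while True:
        r = rng.random()      # Step 1: generate R
        draws += 1
        if r >= 0.25:         # Step 2a: accept X = R
            return r, draws
        # Step 2b: reject R and return to Step 1


rng = random.Random(12345)    # arbitrary seed, for repeatability
results = [uniform_quarter_to_one(rng) for _ in range(10_000)]
mean_draws = sum(d for _, d in results) / len(results)
print(round(mean_draws, 3))   # close to 4/3
```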

For the uniform distribution on [1/4, 1], the inverse transform technique is
undoubtedly much easier to apply and more efficient than the
acceptance-rejection technique.
Advantage of the acceptance-rejection technique:
For some important distributions, such as the normal, gamma, and beta, the
inverse cdf does not exist in closed form, and therefore the inverse transform
technique is difficult to apply.
In the following subsections, the acceptance-rejection technique is
illustrated for the generation of random variates for the Poisson and
gamma distributions.

Poisson Distribution
A Poisson random variable, N, with mean α > 0 has pmf
$$p(n) = P(N = n) = \frac{e^{-\alpha}\alpha^{n}}{n!}, \quad n = 0, 1, 2, \ldots$$
More importantly, N can be interpreted as the number of arrivals from a
Poisson arrival process in one unit of time.
In that case the interarrival times of successive customers, A1, A2, ...,
are exponentially distributed with rate α,
where α is the mean number of arrivals per unit time.
Recall that an exponential variate with rate λ is generated by X_i = (-1/λ) ln R_i.
Thus there is a relationship between the discrete Poisson distribution and
the continuous exponential distribution, namely:
N = n
if and only if
$$A_1 + A_2 + \cdots + A_n \le 1 < A_1 + \cdots + A_n + A_{n+1}$$
N = n says there were exactly n arrivals during one unit of time.
The inequality says that the nth arrival occurred before time 1 while the
(n+1)st arrival occurred after time 1.
Clearly these two statements are equivalent.
Now generate exponential interarrival times until some arrival, say n+1,
occurs after time 1; then set N = n.
For efficient generation purposes, the above inequality is simplified by first
using the equation A_i = (-1/α) ln R_i to obtain
$$\sum_{i=1}^{n} -\frac{1}{\alpha}\ln R_i \le 1 < \sum_{i=1}^{n+1} -\frac{1}{\alpha}\ln R_i$$
Next multiply through by -α, which reverses the sign of the inequality, and
use the fact that a sum of logarithms is the logarithm of a product, to get
$$\ln \prod_{i=1}^{n} R_i = \sum_{i=1}^{n} \ln R_i \ge -\alpha > \sum_{i=1}^{n+1} \ln R_i = \ln \prod_{i=1}^{n+1} R_i$$
Finally, use the relation $e^{\ln x} = x$ for any positive number x to obtain
$$\prod_{i=1}^{n} R_i \ge e^{-\alpha} > \prod_{i=1}^{n+1} R_i$$

Steps to generate a Poisson random variate, N:
Step 1: Set n = 0, P = 1.
Step 2: Generate a random number R_{n+1} and replace P by P·R_{n+1}.
Step 3: If P < e^{-α}, then accept N = n. Otherwise, reject the current n, increase
n by one, and return to step 2.
Note: Upon completion of step 2, P is equal to the rightmost expression in the
inequality above, the product of R_1 through R_{n+1}.
If P ≥ e^{-α} in step 3, then n is rejected and the generation process must
proceed through at least one more trial.
How many random numbers are required, on average, to generate one
Poisson variate, N?
If N = n, then n+1 random numbers are required.
So the average number is given by
E(N + 1) = α + 1,
which is quite large if the mean, α, of the Poisson distribution is large.

Example:
Buses arrive at the bus stop at Peachtree and North Avenue according to a
Poisson process with a mean of one bus per 15 minutes. Generate a random
variate, N, which represents the number of arriving buses during a 1-hour
time slot.
Now, N is Poisson distributed with a mean of four buses per hour. First,
compute e^{-α} = e^{-4} = 0.0183.
Use the sequence of 12 random numbers from Table A.1; seven of them are
consumed before acceptance:
n   R_{n+1}   P        Accept/Reject            Result
0   0.4357    0.4357   P ≥ e^{-α} (reject)
1   0.4146    0.1806   P ≥ e^{-α} (reject)
2   0.8353    0.1508   P ≥ e^{-α} (reject)
3   0.9952    0.1502   P ≥ e^{-α} (reject)
4   0.8004    0.1202   P ≥ e^{-α} (reject)
5   0.7945    0.0955   P ≥ e^{-α} (reject)
6   0.1530    0.0146   P < e^{-α} (accept)      N = 6
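The bus example can be replayed by feeding the tabulated random numbers through Steps 1 to 3. In this sketch the function name `poisson_variate` is illustrative, and the random numbers are passed in as a list so that the run is reproducible; a real generator would draw them on demand.

```python
import math


def poisson_variate(alpha, random_numbers):
    """Acceptance-rejection for a Poisson variate with mean alpha.

    Returns (N, numbers_used).
    """
    threshold = math.exp(-alpha)
    n, p = 0, 1.0                      # Step 1
    for r in random_numbers:
        p *= r                         # Step 2: P <- P * R_{n+1}
        if p < threshold:              # Step 3: accept N = n
            return n, n + 1
        n += 1                         # reject n and try n + 1
    raise ValueError("ran out of random numbers before acceptance")


R = [0.4357, 0.4146, 0.8353, 0.9952, 0.8004, 0.7945, 0.1530]
N, used = poisson_variate(4.0, R)
print(N, used)
```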
Note:
Here α = 4.
Larger values of α usually require more random numbers.
To generate 1000 Poisson variates with α = 4, approximately
1000(α + 1) = 5000 random numbers are required.
When α is large, say α ≥ 15, the rejection technique outlined here
becomes quite expensive,
but fortunately an approximate technique based on the normal distribution
works quite well.
When the mean α is large,
$$Z = \frac{N - \alpha}{\sqrt{\alpha}}$$
is approximately normally distributed with mean zero and variance 1, which
suggests an approximate technique.

First generate a standard normal variate Z by either of the equations
$$Z_1 = (-2\ln R_1)^{1/2}\cos(2\pi R_2)$$
$$Z_2 = (-2\ln R_1)^{1/2}\sin(2\pi R_2)$$
Then generate the desired Poisson variate, N, by
$$N = \left\lceil \alpha + \sqrt{\alpha}\,Z - 0.5 \right\rceil$$
where $\lceil \cdot \rceil$ is the round-up function.
(If $\alpha + \sqrt{\alpha}\,Z - 0.5 < 0$, then set N = 0.) The “0.5” used in the formula
makes the round-up function become a “round to the nearest integer”
function. The above equation for N is used as an alternative to the
acceptance-rejection method.
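A minimal sketch of the normal approximation is given below. The function names, the value α = 25, the seed, and the sample size are assumptions for illustration. R1 is taken as 1 minus a library draw so that it is never zero inside the logarithm.

```python
import math
import random


def standard_normal_pair(r1, r2):
    """Two standard normal variates from two uniforms (0 < r1 <= 1)."""
    radius = math.sqrt(-2.0 * math.log(r1))
    angle = 2.0 * math.pi * r2
    return radius * math.cos(angle), radius * math.sin(angle)


def poisson_normal_approx(alpha, z):
    """N = ceil(alpha + sqrt(alpha) * Z - 0.5), set to 0 if negative."""
    return max(0, math.ceil(alpha + math.sqrt(alpha) * z - 0.5))


rng = random.Random(7)                     # arbitrary seed
alpha = 25.0
sample = []
for _ in range(10_000):
    z1, z2 = standard_normal_pair(1.0 - rng.random(), rng.random())
    sample.append(poisson_normal_approx(alpha, z1))
    sample.append(poisson_normal_approx(alpha, z2))
sample_mean = sum(sample) / len(sample)
print(round(sample_mean, 2))               # close to alpha
```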

Gamma Distribution:
Several acceptance-rejection techniques for generating gamma random
variates have been developed.
For any value of the shape parameter β ≥ 1, the mean number of trials
required is between 1.13 and 1.47.
The acceptance-rejection technique would be a highly efficient method for the
Erlang distribution if β = k were large.
The routine generates gamma random variates with scale parameter θ and
shape parameter β,
where the mean is 1/θ and
the variance is 1/(βθ²).

Steps Involved:
Step 1: Compute $a = (2\beta - 1)^{1/2}$, $b = 2\beta - \ln 4 + 1/a$.
Step 2: Generate R1 and R2.
Step 3: Compute $X = \beta\,[R_1/(1 - R_1)]^{a}$.
Step 4a: If $X > b - \ln(R_1^2 R_2)$, reject X and return to step 2.
Step 4b: If $X \le b - \ln(R_1^2 R_2)$, use X as the desired variate.
The variates generated in step 4b have mean and variance both equal to β.
If it is desired to have mean 1/θ and variance 1/(βθ²),
Step 5: Replace X by X/(βθ).
In step 3, $X = \beta\,[R_1/(1 - R_1)]^{a}$ is not gamma distributed, but rejection of certain
values of X in step 4a guarantees that the accepted values in step 4b do have
the gamma distribution.
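The sketch below is not a term-by-term transcription of the steps above. It uses the closely related rejection scheme published by Cheng (1977), in which the candidate is drawn from a log-logistic distribution; its constants (a = 1/sqrt(2β - 1), b = β - ln 4, q = β + 1/a) and its acceptance test differ from the steps as printed. The function name, the seed, and the test value β = 2.5 are illustrative assumptions. It covers β ≥ 1 and includes the final rescaling of Step 5.

```python
import math
import random


def gamma_variate_cheng(beta, rng, theta=None):
    """Gamma variate with shape beta >= 1 by acceptance-rejection.

    With theta=None the variate has mean and variance beta; otherwise it
    is rescaled as in Step 5 to have mean 1/theta.
    """
    a = 1.0 / math.sqrt(2.0 * beta - 1.0)
    b = beta - math.log(4.0)
    q = beta + 1.0 / a
    while True:
        r1, r2 = rng.random(), rng.random()
        if not 0.0 < r1 < 1.0 or r2 == 0.0:
            continue                               # avoid log(0), 1/0
        v = a * math.log(r1 / (1.0 - r1))
        x = beta * math.exp(v)                     # candidate variate
        if b + q * v - x >= math.log(r1 * r1 * r2):
            return x if theta is None else x / (beta * theta)


rng = random.Random(99)                            # arbitrary seed
beta = 2.5
sample = [gamma_variate_cheng(beta, rng) for _ in range(20_000)]
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)
print(round(mean, 2), round(var, 2))               # both close to beta
```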
