Random numbers are numbers that occur in a sequence such that two
conditions are met:
(1) the values are uniformly distributed over a defined interval or set,
and
(2) it is impossible to predict future values based on past or present
ones.
The most common set from which random numbers are derived is the set of
single-digit decimal numbers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}.
1/31/2018 Dr. DEGA NAGARAJU, CIMR, VIT, Vellore 1
RANDOM NUMBER GENERATION
Fig. The pdf for random numbers (x-axis from 0 to 1)
Random Number Generation (cont.)
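The expected value of each R_i is given by
$$E(R) = \int_0^1 x\,dx = \left[\frac{x^2}{2}\right]_0^1 = \frac{1}{2}$$
and the variance is given by
$$V(R) = \int_0^1 x^2\,dx - [E(R)]^2 = \left[\frac{x^3}{3}\right]_0^1 - \left(\frac{1}{2}\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$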
Random numbers obtained: 1, 11, 10, 17, 14, 12, 3, 20, 16, 21,
9, 1.
Linear Congruential Method:
Secondary Properties:
Maximum density
Maximum period
$m = 2^{31} - 1$ and $m = 2^{48}$ are in common use
i     X_i    X_i    X_i    X_i
0     1      2      3      4
1     13     26     39     52
2     41     18     59     36
3     21     42     63     20
4     17     34     51     4
5     29     58     23
6     57     50     43
7     37     10     47
8     33     2      35
9     45            7
10    9             27
11    53            31
12    49            19
13    61            55
14    25            11
15    5             15
16    1             3
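The table above is consistent with a multiplicative congruential generator X_{i+1} = (a X_i) mod m with a = 13 and m = 64. These parameter values are inferred from the tabulated numbers and are not stated in the text. Under that assumption, a minimal Python sketch (the function name `lcg_sequence` is illustrative) reproduces each column and its period:

```python
def lcg_sequence(seed, a=13, c=0, m=64):
    """Return one full cycle of X_{i+1} = (a*X_i + c) mod m, starting at seed.

    a = 13 and m = 64 are assumed values inferred from the table,
    not parameters given in the text. With a odd and m a power of two,
    the map is a bijection, so the sequence always returns to the seed.
    """
    values = [seed]
    x = (a * seed + c) % m
    while x != seed:          # stop once the sequence returns to the seed
        values.append(x)
        x = (a * x + c) % m
    return values


for seed in (1, 2, 3, 4):
    cycle = lcg_sequence(seed)
    print(f"X0 = {seed}: period {len(cycle)}, first values {cycle[:6]}")
```

Under these assumed parameters the seeds 1 and 3 give a period of 16, seed 2 gives 8, and seed 4 gives 4, which matches the column lengths in the table.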
Test for Random Numbers
2. Runs test: Tests the runs up and down or the runs above and
below the mean by comparing the actual values to expected
values. The statistic for comparison is the chi-square.
This test is based on the largest absolute deviation between F(x) and S_N(x) over
the range of the random variable. That is,
$$D = \max |F(x) - S_N(x)|$$
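i                   1      2      3      4      5
i/N                 0.20   0.40   0.60   0.80   1.00
i/N - R(i)          0.15   0.26   0.16   -      0.07
(i-1)/N             0.00   0.20   0.40   0.60   0.80
R(i) - (i-1)/N      0.05   -      0.04   0.21   0.13
(A dash marks a negative difference, which is omitted.)
$$D^+ = \max_{1 \le i \le N} \left\{ \frac{i}{N} - R_{(i)} \right\} = 0.26$$
$$D^- = \max_{1 \le i \le N} \left\{ R_{(i)} - \frac{i-1}{N} \right\} = 0.21$$
$$D = \max(D^+, D^-) = 0.26$$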
The critical value of D from the table, for level of significance α = 0.05 and
sample size N = 5, is 0.565.
Since the computed value, 0.26, is less than the tabulated critical value, the
hypothesis of no difference between the distribution of the generated numbers
and the uniform distribution is not rejected.
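The Kolmogorov-Smirnov computation can be sketched in Python as follows. The five input values are an assumption: they are back-computed from the tabulated differences (for example 0.20 - 0.15 = 0.05) rather than listed in the text, and the function name `ks_statistic` is illustrative.

```python
def ks_statistic(numbers):
    """Return (D+, D-, D) for testing numbers against the uniform(0, 1) cdf."""
    ranked = sorted(numbers)
    n = len(ranked)
    # D+ : largest amount by which i/N exceeds R(i), with i = 1..N
    d_plus = max((i + 1) / n - r for i, r in enumerate(ranked))
    # D- : largest amount by which R(i) exceeds (i-1)/N
    d_minus = max(r - i / n for i, r in enumerate(ranked))
    return d_plus, d_minus, max(d_plus, d_minus)


# Assumed sample, reconstructed from the table of differences.
sample = [0.05, 0.14, 0.44, 0.81, 0.93]
d_plus, d_minus, d = ks_statistic(sample)

CRITICAL_D = 0.565  # tabulated value for alpha = 0.05, N = 5 (from the text)
print(round(d_plus, 2), round(d_minus, 2), round(d, 2), d < CRITICAL_D)
```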
$$\chi_0^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
where
O_i = observed number in the ith class
E_i = expected number in the ith class
n = number of classes
For the uniform distribution, E_i, the expected number in each class, is
$$E_i = \frac{N}{n}$$ (for equally spaced classes)
where N is the total number of observations.
Note: The sampling distribution of $\chi_0^2$ is approximately the chi-square
distribution with n-1 degrees of freedom.
Example:
Use the chi-square test with α = 0.05 to test whether the data shown below are
uniformly distributed:
0.34 0.90 0.25 0.89 0.87 0.44 0.12 0.21 0.46 0.67
0.83 0.76 0.79 0.64 0.70 0.81 0.94 0.74 0.22 0.74
0.96 0.99 0.77 0.67 0.56 0.41 0.52 0.73 0.99 0.02
0.47 0.30 0.17 0.82 0.56 0.05 0.45 0.31 0.78 0.05
0.79 0.71 0.23 0.19 0.82 0.93 0.65 0.37 0.39 0.42
0.99 0.17 0.99 0.46 0.05 0.66 0.10 0.42 0.18 0.49
0.37 0.51 0.54 0.01 0.81 0.28 0.69 0.34 0.75 0.49
0.72 0.43 0.56 0.97 0.30 0.94 0.96 0.58 0.73 0.05
0.06 0.39 0.84 0.24 0.40 0.64 0.40 0.19 0.79 0.62
0.18 0.26 0.97 0.88 0.64 0.47 0.60 0.11 0.29 0.78
Take n = 10 intervals of equal length, namely [0.0, 0.1), [0.1, 0.2), [0.2, 0.3),
[0.3, 0.4), [0.4, 0.5), [0.5, 0.6), [0.6, 0.7), [0.7, 0.8), [0.8, 0.9) and [0.9, 1.0).
Computations for Chi-Square Test
Interval   O_i    E_i    O_i - E_i   (O_i - E_i)^2   (O_i - E_i)^2 / E_i
1          8      10     -2          4               0.4
2          8      10     -2          4               0.4
3          10     10     0           0               0.0
4          9      10     -1          1               0.1
5          12     10     2           4               0.4
6          8      10     -2          4               0.4
7          10     10     0           0               0.0
8          14     10     4           16              1.6
9          10     10     0           0               0.0
10         11     10     1           1               0.1
Total      100    100    0                           3.4
The value of $\chi_0^2$ is 3.4. This is compared with the critical value
$\chi^2_{0.05,9} = 16.9$. Since $\chi_0^2$ is much smaller than the tabulated value
of $\chi^2_{0.05,9}$, the null hypothesis of uniform distribution is not rejected.
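As a minimal sketch, the chi-square statistic can be recomputed from the observed counts in the computation table. The function name `chi_square_uniform` is illustrative, and the critical value 16.9 is simply copied from the text rather than computed.

```python
def chi_square_uniform(observed):
    """Chi-square statistic against equal expected counts E_i = N / n."""
    n_classes = len(observed)
    total = sum(observed)
    expected = total / n_classes
    return sum((o - expected) ** 2 / expected for o in observed)


# Observed counts O_i for the ten intervals, taken from the table.
observed = [8, 8, 10, 9, 12, 8, 10, 14, 10, 11]
chi0 = chi_square_uniform(observed)

CRITICAL_0_05_9 = 16.9  # tabulated chi-square value for alpha = 0.05, 9 d.f.
print(round(chi0, 2), "reject" if chi0 > CRITICAL_0_05_9 else "do not reject")
```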
Autocorrelation Test
$$\sigma_{\hat{\rho}_{35}} = \frac{\sqrt{13(4) + 7}}{12(4 + 1)} = 0.128$$
Test the following sequence of numbers for uniformity and independence, using
procedures you learned here:
0.594 0.928 0.515 0.055 0.507 0.351 0.262 0.797 0.788 0.442
0.097 0.798 0.227 0.127 0.474 0.825 0.007 0.182 0.929 0.852
The probability that any digit is not a 3 is 0.9, and the probability that
any digit is a 3 is 0.1.
In general,
$$P(t \text{ followed by exactly } x \text{ non-}t \text{ digits}) = (0.9)^x (0.1), \quad x = 0, 1, 2, \ldots$$
To fully analyze a set of numbers for independence using the gap test,
every digit, 0, 1, 2, …….9, must be analyzed.
The frequencies of the various gap sizes for all the digits are then
recorded and compared to the theoretical frequencies using the
Kolmogorov-Smirnov test for discretized data.
The theoretical frequency distribution for randomly ordered digits is given by
$$P(\text{gap} \le x) = F(x) = 0.1 \sum_{n=0}^{x} (0.9)^n = 1 - 0.9^{x+1}$$
Procedural Steps:
When applying the test to random numbers, class intervals such as [0,0.1),
[0.1,0.2),…… play the role of random digits.
Example:
Based on the frequency with which gaps occur, analyze the 110 digits
above to test whether they are independent. Use α = 0.05.
Digit            0   1   2   3    4    5    6   7   8   9
Number of gaps   7   8   8   17   10   13   7   8   9   13
Gap Test Example:
G.L      Frequency   R.F    C.R.F, S_N(x)   F(x)     |F(x) - S_N(x)|
0-3      35          0.35   0.35            0.3439   0.0061
4-7      22          0.22   0.57            0.5695   0.0005
8-11     17          0.17   0.74            0.7176   0.0224
12-15    9           0.09   0.83            0.8147   0.0153
16-19    5           0.05   0.88            0.8784   0.0016
20-23    6           0.06   0.94            0.9202   0.0198
24-27    3           0.03   0.97            0.9477   0.0223
28-31    0           0.0    0.97            0.9657   0.0043
32-35    0           0.0    0.97            0.9775   0.0075
36-39    2           0.02   0.99            0.9852   0.0048
40-43    0           0.0    0.99            0.9903   0.0003
44-47    1           0.01   1.00            0.9936   0.0064
G.L: Gap Length; R.F: Relative Frequency;
C.R.F: Cumulative Relative Frequency
The critical value of D is given by
$$D_{0.05} = \frac{1.36}{\sqrt{100}} = 0.136$$
Since D = max |F(x) - S_N(x)| = 0.0224 is less than D_{0.05}, do not reject the
hypothesis of independence on the basis of this test.
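The gap-test comparison can be sketched in Python by recomputing F(x) = 1 - 0.9^(x+1) at the upper end of each gap-length class and comparing it with the cumulative relative frequency. The class bounds and frequencies are taken from the table; the names `gap_cdf` and `gap_test_statistic` are illustrative.

```python
import math


def gap_cdf(x, p=0.1):
    """Theoretical P(gap <= x) when each digit occurs with probability p."""
    return 1 - (1 - p) ** (x + 1)


def gap_test_statistic(upper_bounds, frequencies):
    """Largest |F(x) - S_N(x)| over the gap-length classes."""
    total = sum(frequencies)
    cumulative = 0
    d = 0.0
    for upper, freq in zip(upper_bounds, frequencies):
        cumulative += freq
        s_n = cumulative / total          # cumulative relative frequency
        d = max(d, abs(gap_cdf(upper) - s_n))
    return d


upper_bounds = [3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47]
frequencies = [35, 22, 17, 9, 5, 6, 3, 0, 0, 2, 0, 1]

d = gap_test_statistic(upper_bounds, frequencies)
critical = 1.36 / math.sqrt(sum(frequencies))   # large-sample value, alpha = 0.05
print(round(d, 4), round(critical, 3), d < critical)
```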
However, a glance at the ordering shows that the numbers are successively
larger in blocks of 10 values.
0.41 0.68 0.89 0.84 0.74 0.91 0.55 0.71 0.36 0.30
0.09 0.72 0.86 0.08 0.54 0.02 0.11 0.29 0.16 0.18
0.88 0.91 0.95 0.69 0.09 0.38 0.23 0.32 0.91 0.53
0.31 0.42 0.73 0.12 0.74 0.45 0.13 0.47 0.58 0.29
H T T H H T T T H T
There are
three mutually exclusive outcomes, or events, with respect to the sequence.
Two of the possibilities are rather obvious. That is the toss can result in a
head or a tail. The third possibility is “no event”.
The first head is preceded by no event and the last tail is succeeded by no
event.
H T T H H T T T H T
In the coin flipping example discussed previously: there are six runs.
Length of the first run: one
Length of the second run: two
Length of the third run: two
Length of the fourth run: three
Length of the fifth run: one
Length of the sixth run: one
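A run is a maximal block of identical consecutive outcomes, so the run lengths listed above can be checked with a short Python sketch (the function name `run_lengths` is illustrative):

```python
from itertools import groupby


def run_lengths(outcomes):
    """Return the length of each run of identical consecutive outcomes."""
    return [len(list(group)) for _, group in groupby(outcomes)]


tosses = "HTTHHTTTHT"          # the coin-flipping sequence from the text
lengths = run_lengths(tosses)
print(len(lengths), lengths)
```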
The types of runs counted in the first case might be runs up and runs down.
There are eight runs:
Length of first run: one
Length of second run: three
Length of third run: three and so on.
In total, there are four runs up and four runs down.
Consider the following sequence of numbers:
0.08 0.18 0.23 0.36 0.42 0.55 0.63 0.72 0.89 0.91
This sequence has one run, a run up.
Note: It is unlikely that a valid random number generator would produce such a
sequence.
Next, consider the following sequence:
0.08 0.93 0.15 0.96 0.26 0.84 0.28 0.79 0.36 0.57
This sequence has nine runs. Five up and four down.
Note: It is unlikely that a sequence of 10 numbers would have this many runs.
What is more likely is that the number of runs will be somewhere between
the two extremes. These two extremes can be formalized as follows.
If
N = number of numbers in a sequence,
The maximum number of runs = N-1
The minimum number of runs = one.
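Counting runs up and down amounts to labelling each step between consecutive numbers as "up" or "down" and grouping equal labels. The sketch below applies this to the two sequences given above; the function name `runs_up_down` is illustrative, and treating a tie as "down" is an arbitrary assumption that does not arise in these data.

```python
def runs_up_down(numbers):
    """Return a list of (direction, length) pairs for the runs up and down."""
    runs = []
    for previous, current in zip(numbers, numbers[1:]):
        direction = "up" if current > previous else "down"  # ties count as down
        if runs and runs[-1][0] == direction:
            runs[-1] = (direction, runs[-1][1] + 1)          # extend current run
        else:
            runs.append((direction, 1))                      # start a new run
    return runs


increasing = [0.08, 0.18, 0.23, 0.36, 0.42, 0.55, 0.63, 0.72, 0.89, 0.91]
alternating = [0.08, 0.93, 0.15, 0.96, 0.26, 0.84, 0.28, 0.79, 0.36, 0.57]

print(runs_up_down(increasing))
runs = runs_up_down(alternating)
print(len(runs), sum(1 for d, _ in runs if d == "up"))
```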
If
a = total number of runs in a truly random sequence
The mean and variance of a are given by
$$\mu_a = \frac{2N - 1}{3}, \qquad \sigma_a^2 = \frac{16N - 29}{90}$$
For N > 20, the distribution of a is reasonably approximated by a normal
distribution, $N(\mu_a, \sigma_a^2)$.
This approximation can be used to test the independence of numbers from a
generator.
In that case the standardized normal test statistic is developed by subtracting
the mean from the observed number of runs (a) and dividing by the standard
deviation. That is the test statistic is
$$Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{a - (2N - 1)/3}{\sqrt{(16N - 29)/90}}$$
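$$\mu_a = \frac{2N - 1}{3} = \frac{2(40) - 1}{3} = 26.33$$
$$\sigma_a^2 = \frac{16N - 29}{90} = \frac{16(40) - 29}{90} = 6.79$$
$$Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{26 - 26.33}{\sqrt{6.79}} = -0.13$$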
Now the critical value is Z0.025=1.96, so the independence of the numbers
cannot be rejected on the basis of this test.
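The arithmetic for the runs up and down statistic can be checked with a short Python sketch, using the values N = 40 and a = 26 from the worked example; the function name `runs_up_down_z` is illustrative.

```python
import math


def runs_up_down_z(a, n):
    """Return (mean, variance, Z0) for a observed runs up and down in n numbers."""
    mean = (2 * n - 1) / 3
    variance = (16 * n - 29) / 90
    z0 = (a - mean) / math.sqrt(variance)
    return mean, variance, z0


mean, variance, z0 = runs_up_down_z(a=26, n=40)
Z_CRITICAL = 1.96                       # z_{0.025}, as quoted in the text
print(round(mean, 2), round(variance, 2), round(z0, 2), abs(z0) <= Z_CRITICAL)
```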
Notice that
Maximum number of runs: N = n1 + n 2
Minimum number of runs: one
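The variance of the number of runs b above and below the mean is
$$\sigma_b^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - N)}{N^2 (N - 1)}$$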
Failure to reject the hypothesis of independence occurs when
$$-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$$
where α is the level of significance.
$$Z_0 = \frac{17 - 20.3}{\sqrt{9.54}} = -1.07$$
0.16, 0.27, 0.58, 0.63, 0.45, 0.21, 0.72, 0.87, 0.27, 0.15, 0.92, 0.85, …
Assume that this sequence continues in a like fashion: two numbers below
the mean followed by two numbers above the mean.
A test of runs above and below the mean would detect no departure from
independence.
However, it is to be expected that runs other than of length two should occur.
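For runs up and down, the expected number of runs of length i is
$$E(Y_i) = \frac{2}{(i+3)!}\left[N(i^2 + 3i + 1) - (i^3 + 3i^2 - i - 4)\right], \quad i \le N - 2$$
$$E(Y_i) = \frac{2}{N!}, \quad i = N - 1$$
For runs above and below the mean, the expected total number of runs is
$$E(A) = \frac{N}{E(I)}, \quad N > 20$$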
The appropriate test is the chi-square test with Oi being the observed
number of runs of length i. Then the test statistic is
$$\chi_0^2 = \sum_{i=1}^{L} \frac{[O_i - E(Y_i)]^2}{E(Y_i)}$$
where L = N-1 for runs up and down and L = N for runs above and
below the mean. If the null hypothesis of independence is true, then $\chi_0^2$ is
approximately chi-square distributed with L-1 degrees of freedom.
With the foregoing calculations and procedures in mind, the critical value
is $\chi^2_{0.05,1} = 3.84$. (The degrees of freedom equal the number of class
intervals minus 1.)
Table: Length of Runs Up and Down: χ² Test
Run Length, i      Observed Number    Expected Number      [O_i - E(Y_i)]^2 / E(Y_i)
                   of Runs, O_i       of Runs, E(Y_i)
1                  26                 25.08                0.03
2                  9                  10.77
≥ 3                5                  3.82
2 and ≥ 3 combined 14                 14.59                0.02
Total              40                 39.67                0.05
Since $\chi_0^2 = 0.05$ is less than the critical value, the hypothesis of
independence cannot be rejected on the basis of this test.
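The expected run counts for runs up and down can be recomputed with a short Python sketch. It assumes the standard formula E(Y_i) = 2/(i+3)! [N(i² + 3i + 1) - (i³ + 3i² - i - 4)] for i ≤ N - 2 and a sample size of N = 60; N is not stated next to the table and is inferred here from the table total 39.67 = (2N - 1)/3. The function name is illustrative.

```python
from math import factorial


def expected_runs_up_down(i, n):
    """Expected number of runs up and down of length i among n numbers."""
    if 1 <= i <= n - 2:
        bracket = n * (i * i + 3 * i + 1) - (i ** 3 + 3 * i * i - i - 4)
        return 2 / factorial(i + 3) * bracket
    if i == n - 1:
        return 2 / factorial(n)
    raise ValueError("run length must satisfy 1 <= i <= n - 1")


N = 60                                   # assumed, inferred from the table total
e1 = expected_runs_up_down(1, N)
e2 = expected_runs_up_down(2, N)
expected_total = (2 * N - 1) / 3         # mean total number of runs
e3_plus = expected_total - e1 - e2       # runs of length three or more

print(round(e1, 2), round(e2, 2), round(e3_plus, 2), round(expected_total, 2))
```

The class for length three or more is obtained by subtraction from the expected total, which is why its expected count differs from E(Y_3) alone.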
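$$E(I) = \frac{n_1}{n_2} + \frac{n_2}{n_1} = \frac{28}{32} + \frac{32}{28} = 2.02$$
The expected numbers of runs of various lengths are
$$E(Y_i) = \frac{N w_i}{E(I)}, \quad N > 20$$
$$E(Y_1) = \frac{60(0.498)}{2.02} = 14.79$$
$$E(Y_2) = \frac{60(0.249)}{2.02} = 7.40$$
$$E(Y_3) = \frac{60(0.125)}{2.02} = 3.71$$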
The total number of runs expected is
$$E(A) = \frac{N}{E(I)} = \frac{60}{2.02} = 29.7, \quad N > 20$$
This indicates that approximately 3.8 runs of length four or more can be
expected.
Proceeding by combining adjacent cells in which E(Yi) < 5 produces the
following table.
Table: Length of Runs Above and Below the Mean: χ² Test
Run Length, i      Observed Number    Expected Number      [O_i - E(Y_i)]^2 / E(Y_i)
                   of Runs, O_i       of Runs, E(Y_i)
1                  17                 14.79                0.33
2                  9                  7.40                 0.35
3                  1                  3.71
≥ 4                5                  3.80
3 and ≥ 4 combined 6                  7.51                 0.30
Total              32                 29.70                0.98
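The expected counts for runs above and below the mean can be sketched in Python with n1 = 28 and n2 = 32 from the example. The weight w_i = (n1/N)^i (n2/N) + (n1/N)(n2/N)^i is the standard definition and is an assumption here, since the text shows only its numerical values 0.498, 0.249 and 0.125. Function and variable names are illustrative.

```python
def expected_runs_above_below(n1, n2, max_length):
    """Return (E(I), {i: E(Y_i)}, E(A)) for runs above and below the mean."""
    n = n1 + n2
    e_i = n1 / n2 + n2 / n1                       # expected run length E(I)

    def w(i):
        # Assumed standard weight: probability that a run has length i.
        return (n1 / n) ** i * (n2 / n) + (n1 / n) * (n2 / n) ** i

    by_length = {i: n * w(i) / e_i for i in range(1, max_length + 1)}
    total = n / e_i                               # expected total runs E(A)
    return e_i, by_length, total


e_i, by_length, total = expected_runs_above_below(28, 32, max_length=3)
four_or_more = total - sum(by_length.values())    # runs of length >= 4

print(round(e_i, 2), {i: round(v, 2) for i, v in by_length.items()},
      round(total, 1), round(four_or_more, 1))
```

Because the sketch keeps full precision for E(I) instead of the rounded 2.02, its results can differ from the tabulated values in the second decimal place.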
In each case, a pair of like digits appears in the number that was generated. In
three digit numbers there are only three possibilities, as follows:
Combination, i           Observed Frequency, O_i   Expected Frequency, E_i   (O_i - E_i)^2 / E_i
Three different digits   680                       720                       2.22
Three like digits        31                        10                        44.10
Exactly one pair         289                       270                       1.33
Total                    1000                      1000                      47.65
The appropriate degrees of freedom are one less than the number of class
intervals. Since 47.65 > $\chi^2_{0.05,2} = 5.99$, the independence of numbers is
rejected on the basis of this test.
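A minimal Python sketch of this poker test for three-digit numbers follows. The class probabilities are derived from independent, equally likely digits: 0.9 × 0.8 = 0.72 for three different digits, 0.1 × 0.1 = 0.01 for three like digits, and the remainder 0.27 for exactly one pair. The names used are illustrative.

```python
P_DIFFERENT = 0.9 * 0.8                      # second and third digits both new
P_ALL_LIKE = 0.1 * 0.1                       # second and third match the first
P_ONE_PAIR = 1 - P_DIFFERENT - P_ALL_LIKE    # everything else


def poker_chi_square(observed):
    """observed = (three different, three like, exactly one pair) counts."""
    n = sum(observed)
    expected = [n * P_DIFFERENT, n * P_ALL_LIKE, n * P_ONE_PAIR]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi0 = poker_chi_square((680, 31, 289))
CRITICAL_0_05_2 = 5.99                       # tabulated value quoted in the text
print(round(chi0, 2), chi0 > CRITICAL_0_05_2)
```

The unrounded statistic is slightly above the tabulated 47.65 because the table sums terms that were rounded individually.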
Example: Consider a random experiment of throwing a die. Then ‘X’ the number of points
on the die is a random variable, since ‘X’ takes the values 1, 2, 3, 4, 5, and 6 each with the
probability 1/6.
Discrete Random Variable: Random variable takes the values only on the set {0, 1, 2, 3,
…..n}
Example: Number of printing mistakes in each page of a book
Number of telephone calls received by the telephone operator
Continuous Random Variable: Random variable takes on all values within a certain
interval
Example: The height, age and weight of individuals
Amount of rainfall on a rainy day
Random variable: a variable whose value is unknown, or a function that assigns
values to each of an experiment's outcomes.
Continuous random variables are variables that can have any value within a
continuous range.
Usually such variables are modeled as random variables with some specified
statistical distribution, and standard statistical procedures exist for
estimating the parameters of the hypothesized distribution and for testing
the validity of the assumed statistical model.
Assumptions:
A distribution has been completely specified.
Samples are generated from this specified distribution and the generated
samples are used as input to a simulation model.
Purpose of the current discussion:
To illustrate and explain some widely used techniques for generating random
variates.
However, some programming languages do not have built in routines for all
of the regularly used distributions, and some computer installations do not
have random variate generation libraries, in which case the modeler must
construct an acceptable routine.
$$F_R(x) = \begin{cases} 0, & x < 0 \\ x, & 0 \le x \le 1 \\ 1, & x > 1 \end{cases}$$
Here, R1, R2, …represent random numbers uniformly distributed on (0, 1).
Exponential distribution,
Uniform distribution,
Weibull distribution and
Triangular distributions
This technique is also used for sampling from a wide variety of discrete
distributions.
It is the most straightforward technique, but not always the most
computationally efficient.
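$$X_i = F^{-1}(R_i)$$
$$X_i = -\frac{1}{\lambda} \ln(1 - R_i) \quad \text{for } i = 1, 2, 3, \ldots$$
$$X_i = -\frac{1}{\lambda} \ln R_i$$
(since both R_i and 1 - R_i are uniformly distributed on (0,1)).
Here λ is the rate of the exponential distribution; with λ = 1: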
X1 = -ln(1-R1)
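The inverse-transform step for the exponential distribution can be sketched in Python as below. The rate parameter name `lam` and the function name are illustrative; `lam=1.0` corresponds to X1 = -ln(1 - R1).

```python
import math
import random


def exponential_variate(r, lam=1.0):
    """Map a uniform(0, 1) number r to an exponential variate with rate lam."""
    return -math.log(1 - r) / lam


rng = random.Random(2018)                  # seeded only to make the demo repeatable
samples = [exponential_variate(rng.random(), lam=1.0) for _ in range(5)]
print([round(x, 4) for x in samples])
```

Using 1 - r rather than r keeps the argument of the logarithm strictly positive, because `random()` returns values in [0, 1).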
Conversely
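and for 1 ≤ x ≤ 2,
$$R = 1 - \frac{(2 - x)^2}{2}$$
From this equation, 1 ≤ X ≤ 2 implies that 1/2 ≤ R ≤ 1, in which case
$$X = 2 - \sqrt{2(1 - R)}$$
Thus, X is generated by
$$X = \begin{cases} \sqrt{2R}, & 0 \le R \le \frac{1}{2} \\ 2 - \sqrt{2(1 - R)}, & \frac{1}{2} < R \le 1 \end{cases}$$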
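The piecewise rule for this triangular distribution on (0, 2), namely X = sqrt(2R) for 0 ≤ R ≤ 1/2 and X = 2 - sqrt(2(1 - R)) for 1/2 < R ≤ 1, can be sketched in Python as follows (the function name is illustrative):

```python
import math


def triangular_variate(r):
    """Inverse-transform variate for the triangular distribution on (0, 2)
    with mode 1, given a uniform(0, 1) number r."""
    if not 0 <= r <= 1:
        raise ValueError("r must lie in [0, 1]")
    if r <= 0.5:
        return math.sqrt(2 * r)              # left branch, 0 <= X <= 1
    return 2 - math.sqrt(2 * (1 - r))        # right branch, 1 < X <= 2


for r in (0.0, 0.125, 0.5, 0.875, 1.0):
    print(r, triangular_variate(r))
```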
One possibility is to simply resample the observed data itself. This is known
as using the empirical distribution and it makes particularly good sense
when the input process is known to take on a finite number of values.
On the other hand, if the data are drawn from what is believed to be a
continuous valued input process, then it makes sense to interpolate between
the observed data points to fill in the gaps.
Arrange the data from smallest to largest and let x(1) ≤ x(2) ≤ ... ≤ x(n).
Since the smallest possible value is believed to be 0, define x(0) = 0
Assign a probability of 1/n = 1/5 to each interval x(i-1) < x < x(i)
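The slope of the ith line segment is given by
$$a_i = \frac{x_{(i)} - x_{(i-1)}}{1/n}$$
and the inverse cdf is calculated by
$$X = \hat{F}^{-1}(R) = x_{(i-1)} + a_i \left( R - \frac{i-1}{n} \right)$$
when (i-1)/n < R ≤ i/n.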
Fig. Empirical cdf of the response times (x-axis: Response Times, 0 to 3; y-axis: Cumulative Probability). Labeled points: (0, 0), (1.45, 0.6), (1.83, 0.8). R1 = 0.71 is read across to the cdf and down to X1.
When R2=0.31, X2 = ?
Fig. Generating Variates from the Empirical Distribution for Repair Time Data (x-axis: Repair Times, 0 to 2.5; y-axis: Cumulative Frequency). Labeled points: (0.25, 0), (0.5, 0.31), (1, 0.41), (1.5, 0.66), (2, 1). R1 = 0.83 gives X1 = 1.75.
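Reading a variate off a piecewise-linear empirical cdf is a table lookup followed by linear interpolation. The Python sketch below uses the five labeled points of the repair-time figure as the cdf; the function name `empirical_inverse` and the list-of-pairs representation are illustrative choices.

```python
def empirical_inverse(r, points):
    """Invert a piecewise-linear cdf.

    points: (x, cumulative probability) pairs sorted by x, with cumulative
    values rising from 0 to 1.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= r <= c1:
            slope = (x1 - x0) / (c1 - c0)    # slope of this line segment
            return x0 + slope * (r - c0)
    raise ValueError("r must lie in [0, 1]")


# Points read from the repair-time figure.
repair_cdf = [(0.25, 0.0), (0.5, 0.31), (1.0, 0.41), (1.5, 0.66), (2.0, 1.0)]

x1 = empirical_inverse(0.83, repair_cdf)
print(round(x1, 2))
```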
Acceptance-Rejection Technique
Suppose that an analyst needed to devise a method for generating random
variates, X, uniformly distributed between 1/4 and 1. One way to proceed
would be to follow these steps:
Step 1: Generate a random number R.
Step 2a: If R ≥ 1/4 , accept X = R, then go to step 3.
Step 2b: If R < 1/4 , reject R, and return to step 1.
Step 3: If another uniform random variate on [1/4, 1] is
needed, repeat the procedure beginning at step 1. If
not, stop.
Each time step 1 is executed, a new random number R must be generated.
Step 2a is an “acceptance” and step 2b is a “rejection” in this acceptance-
rejection technique.
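The steps above translate directly into a loop. This is a minimal Python sketch; the function name is illustrative and the seeded generator is used only so that the demonstration is repeatable.

```python
import random


def uniform_quarter_to_one(rng):
    """Acceptance-rejection sampling of a uniform variate on [1/4, 1)."""
    while True:
        r = rng.random()          # Step 1: generate a random number R
        if r >= 0.25:             # Step 2a: accept X = R
            return r
        # Step 2b: R < 1/4, so reject it and return to step 1


rng = random.Random(12345)
samples = [uniform_quarter_to_one(rng) for _ in range(1000)]   # Step 3 repeated
print(min(samples) >= 0.25, max(samples) < 1.0)
```

On average one in four candidates is rejected, so each accepted variate costs about 4/3 random numbers.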
For some important distributions, such as the normal, gamma and beta, the
inverse cdf does not exist in closed form, and therefore the inverse transform
technique is difficult to apply.
$$A_1 + A_2 + \cdots + A_n \le 1 < A_1 + \cdots + A_n + A_{n+1}$$
From the above relation, the nth arrival occurred before time 1 while the
(n+1) st arrival occurred after time 1.
Now generate exponential inter arrival times until some arrival, say n+1,
occurs after time 1; then set N = n
For efficient generation purposes, the above relation is simplified first using
the equation A_i = (-1/α) ln R_i to obtain
$$\prod_{i=1}^{n} R_i \ge e^{-\alpha} > \prod_{i=1}^{n+1} R_i$$
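As a minimal sketch of this generation scheme: substituting A_i = (-1/α) ln R_i turns the condition on the arrival times into a condition on a running product of random numbers, so N is the number of factors that can be multiplied before the product first drops below e^(-α). The function name `poisson_variate` is illustrative.

```python
import math
import random


def poisson_variate(alpha, rng):
    """Poisson variate with mean alpha via a running product of uniforms."""
    threshold = math.exp(-alpha)
    n = 0
    product = rng.random()               # R_1
    while product >= threshold:          # arrival n + 1 still occurs before time 1
        n += 1
        product *= rng.random()          # multiply in the next random number
    return n


rng = random.Random(2018)                # seeded only to make the demo repeatable
samples = [poisson_variate(2.0, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print(round(sample_mean, 2))
```

Each variate consumes N + 1 random numbers, so the cost grows with α.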
When α is large, say α ≥ 15, the rejection technique outlined here
becomes quite expensive, but fortunately an approximate technique based on
the normal distribution works quite well.
When the mean α is large,
$$Z = \frac{N - \alpha}{\sqrt{\alpha}}$$
is approximately normally distributed with mean zero and variance 1, which
suggests an approximate technique.
For any value of the shape parameter β ≥ 1, the mean number of trials
required is between 1.13 and 1.47.
The acceptance-rejection technique would be a highly efficient method for the
Erlang distribution if β = k were large.
The routine generates gamma random variates with scale parameter θ and
shape parameter β, where
mean = 1/θ
variance = 1/(βθ²)