
System Simulation

Dr. Dessouky

Description
Simulation is a very powerful and widely used management
science technique for the analysis and study of complex systems.
Simulation may be defined as a technique that imitates the
operation of a real-world system as it evolves over time. This is
normally done by developing a simulation model. A simulation
model usually takes the form of a set of assumptions about the
operation of the system, expressed as mathematical or logical
relations between the objects of interest in the system.
Simulation has its advantages and disadvantages. We will focus
our attention on simulation models and the simulation technique.

Simulation
What is simulation:
The process of designing a mathematical
or logical model of a real-system and then
conducting computer-based experiments
with the model to describe, explain, and
predict the behavior of the real system.

Simulation
Where simulation fits in
Programming

Simulation
Analysis
Modeling

Probability &
Statistics

Basic Terminology
In most simulation studies, we are concerned with
the simulation of some system.
Thus, in order to model a system, we must
understand the concept of a system.
Definition: A system is a collection of entities that
act and interact toward the accomplishment of some
logical end.
Systems generally tend to be dynamic: their status
changes over time. To describe this status, we use
the concept of the state of a system.

Example Simulation Model

Ford - # of Panels per day (throughput)


Emergency Room (beds, doctors, nurses), (minor, moderate, major,
critical)
TRW Ballistic Missile Survivability against Soviet Threat
Paramount Farms Pistachio
Miami University Parking
HMT Disks Throughput
Christopher Ranch Garlic Capacity
Power Integration Semiconductor Capacity, and random machine
down times

Value of Simulation
Empirical method versus mathematical
model
Allows you to estimate the extreme values,
not just the expected value

Simulation
What is simulation
Simulation is the actual running of the
model system to gain insight into its
performance.

Simulation
Why use simulation
Simulation is used to better understand the
expected performance of the real system
and to test the effectiveness of the system
design.

Simulation
Why use simulation
Without building them
experimental system
new concepts

Without disturbing them


costly experimentation
unsafe experimentation.

Without destroying them


Determine limits of stress

Queuing systems
Performance measures (output)
Data requirements (input)
Uses of model
Kendall's notation

Queuing systems
System performance measures (outputs)

Expected number of customers in system
Expected number of customers in queue
Expected time in system
Expected time in queue
Server utilization
Probability of n customers in system
Throughput

Queuing systems
Data requirements (inputs)

Interarrival time distribution
Service time distribution
Number of servers
Queue discipline
System capacity
Size of input population
Kendall's notation (M/M/s/FCFS/K/M)

Alternative to simulation

Simulation
Analytic models
Physical experimentation
Visit other sites

Simulation vs. analytic modeling


Advantage:

various performance measures


greater realism
easier to understand
model the steady-state as well as the transient
behavior.

Disadvantage:
May not provide you with the optimal solution
time to construct model will be longer.

Simulation vs. Physical


Advantage:

High Speed
Not disruptive
Replication easy
Control variations
Generally less costly

Disadvantage:
Realism
Validity

Simulation vs. Alternatives

[Figure: the alternatives compared by realism and cost]

Representing system
System:
a collection of mutually interacting objects
designed to accomplish a goal (machines
repair system)

Entities:
denotes an element/object within the boundary of the
system (machines, operators, repairman)
Entity: the object on which work is performed
Resource: the object performing the work

Representing system
Attribute:
Characteristic or property of an entity
(machine ID, Type of breakdown, time that
machine went down)

Activity:
transforms the state of an object usually over
some time (repairman service time, machine
run time)

Representing system
State of the system:
Numeric values that contain all the
information necessary to describe the system
at any time.

Delays:
Processes that take a conditional length of
time in the system

Representing system
Events:
Change the state of the system(end of service
of machine,machine breaks down)

Queue:
a set used to model waiting

Ex. Elevator systems


Entities
Elevators, people

Sets
People waiting at each floor

Attributes
Elevators: capacity, speed, destination,
current location of each elevator
People: inter-arrival time at each floor,
destination of each person

Ex. Elevator systems


State of system:
# of people on each elevator
# of people in each floor

Activities
Load/Unloading passenger
Travel to next floor (speed and distance)
Persons travel to elevator

Ex. Elevator systems


Delays:
Persons waiting for elevator

Events:

Elevator arrival
End unloading
End Loading
Person Arrival

Static Simulation vs. Dynamic


Simulation
There are two types of simulation models,
static and dynamic.
Definition: A static simulation model is a
representation of a system at a particular
point in time.
We usually refer to a static simulation as a
Monte Carlo simulation.

Static Simulation vs. Dynamic


Simulation
Definition: A dynamic simulation is a
representation of a system as it evolves
over time.
Within these two classifications, a
simulation may be deterministic or
stochastic.
A deterministic simulation model is one
that contains no random variables; a
stochastic simulation model contains one
or more random variables.

Discrete Event vs. Continuous


Event Simulation
Discrete event:
state of system changes only at discrete points
in time(events)
ex. Machine repair problem

Programming
Look at system only when events occur; time is
advanced from event to event.

Discrete Event vs. Continuous


Event Simulation
Continuous event:
state of system changes continuously over
time
Ex. Level of fluid in tank

Programming:
Advances time in small intervals. Use differential
equations to represent flows.

An Example of a Discrete-Event
Simulation
To simulate a queuing system, we first have to
describe it.
We assume arrivals are drawn from an infinite
calling population.
There is unlimited waiting room capacity, and
customers will be served in the order of their arrival
(FCFS).
Arrivals occur one at a time in a random fashion.
All arrivals are eventually served with the
distribution of service times as shown in the book.

Service times are also assumed to be random.


After service, all customers return to the calling
population.
For this example, we use the following variables
to define the state of the system: (1) the number of
customers in the system; (2) the status of the
server that is, whether the server is busy or idle;
and (3)the time of the next arrival.
An event is defined as a situation that causes the
state of the system to change instantaneously.

All the information about events is maintained in a
list called the event list.
Time in a simulation is maintained using a variable
called the clock time.
We begin this simulation with an empty system and
arbitrarily assume that our first event, an arrival,
takes place at clock time 0.
Next we schedule the departure time of the first
customer.
Departure time = clock time now + generated service time

Also, we now schedule the next arrival into the system
by randomly generating an interarrival time from the
interarrival time distribution and setting the arrival time
as
Arrival time = clock time now + generated interarrival time

Both these events and their scheduled times are
maintained on the event list.
This approach to simulation is called the next-event
time-advance mechanism, because of the way the
clock time is updated. We advance the simulation clock
to the time of the most imminent event.

As we move from event to event, we carry out the
appropriate actions for each event, including any
scheduling of future events.
The jump to the next event in the next-event mechanism
may be a large one or a small one; that is, the jumps in
this method are variable in size.
We contrast this approach with the fixed-increment
time-advance method.
With this method, we advance the simulation clock in
increments of $\Delta t$ time units, where $\Delta t$ is some
appropriate time unit, usually 1 time unit.

For most models, however, the next-event
mechanism tends to be more efficient
computationally.
Consequently, we use only the next-event
approach for the development of the models for
the rest of the chapter.
To demonstrate the simulation model, we need to
define several variables:
TM = clock time of the simulation
AT = scheduled time of the next arrival

DT = scheduled time of the next departure


SS = status of the server (1=busy, 0=idle)
WL = length of the waiting line
MX = length (in time units) of a simulation run

We now begin the simulation by initializing
all the variables. This simple example
illustrates some of the basic concepts in
simulation and the way in which simulation
can be used to analyze a particular problem.
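The next-event time-advance mechanism described above can be sketched in Python. This is a minimal illustration, not the book's program: it reuses the slide's variable names (TM, AT, DT, SS, WL, MX) and assumes exponentially distributed interarrival and service times, whereas the slides take these distributions from the book. The function name and the statistics it returns are hypothetical.

```python
import random


def simulate_single_server(mx, mean_iat, mean_service, seed=0):
    """Single-server FCFS queue using next-event time advance.

    Assumption: interarrival and service times are exponential.
    """
    rng = random.Random(seed)
    tm = 0.0              # TM: clock time of the simulation
    at = 0.0              # AT: next arrival (first arrival at clock time 0)
    dt = float("inf")     # DT: next departure (none while the server is idle)
    ss = 0                # SS: server status (1 = busy, 0 = idle)
    wl = 0                # WL: length of the waiting line
    arrivals = departures = 0
    busy_time = 0.0       # time-weighted sum of SS
    queue_area = 0.0      # time-weighted sum of WL

    while min(at, dt) <= mx:
        next_time = min(at, dt)           # most imminent event
        busy_time += ss * (next_time - tm)
        queue_area += wl * (next_time - tm)
        tm = next_time                    # jump the clock to that event
        if at <= dt:                      # arrival event
            arrivals += 1
            if ss == 0:
                ss = 1
                dt = tm + rng.expovariate(1.0 / mean_service)
            else:
                wl += 1
            at = tm + rng.expovariate(1.0 / mean_iat)
        else:                             # departure event
            departures += 1
            if wl > 0:
                wl -= 1
                dt = tm + rng.expovariate(1.0 / mean_service)
            else:
                ss = 0
                dt = float("inf")

    # Account for the time between the last event and the end of the run.
    busy_time += ss * (mx - tm)
    queue_area += wl * (mx - tm)
    return {
        "arrivals": arrivals,
        "departures": departures,
        "SS": ss,
        "WL": wl,
        "utilization": busy_time / mx,
        "avg_queue_length": queue_area / mx,
    }


if __name__ == "__main__":
    print(simulate_single_server(mx=1000.0, mean_iat=1.0, mean_service=0.5))
```

The event list here holds only two entries (AT and DT), which is enough for a single server; a model with more event types would typically keep the list in a priority queue.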

World View: the structural concepts and views
under which the simulation is guided for the
development of the simulation model
Event Orientation: defines the changes in state that
occur at each event time
Process Orientation: describes the process through
which the entities in the system flow
Activity Scanning Orientation: describes the activities in
which the entities in the system engage

Discrete Event Simulation


Event scheduling
Write modules that describe changes in the
state of the system at each event
Main program advances time
One subprogram for each event
General purpose programming language

Discrete Event Simulation


Process interaction
Write modules that describe the progress of
entities through the system
As entities move, the system changes state
Entities are held to represent activities and
delays
Promodel programming language

Event scheduling
Time is advanced from event to event
Future events list: an ordered list of
upcoming events
As events are scheduled, they are added to the
list
As events occur they are removed from list

Activities in event ( one / event type)

Event scheduling
List is required to keep track of entities in
a set
Statistics: two types
Sample statistics: average of some values
(W)
$$W = \frac{W_1 + W_2 + \cdots + W_n}{n} = \frac{\text{total wait}}{\text{number of waits}}$$

Time-average statistics: time weighted (L)

$$L = \frac{0(t_1) + 1(t_2 - t_1) + 2(t_3 - t_2) + 1(t_4 - t_3)}{t_4}$$
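The difference between the two kinds of statistics can be sketched as follows. The function names and the event times in the usage note (t1 = 2, t2 = 5, t3 = 6, t4 = 10) are hypothetical values chosen for illustration; only the formulas come from the slide.

```python
def sample_average(waits):
    """Sample statistic: W = (W1 + W2 + ... + Wn) / n."""
    return sum(waits) / len(waits)


def time_average(change_times, values, end_time):
    """Time-average statistic.

    values[i] is the level (e.g. number in queue) that holds from
    change_times[i] until the next change time, or until end_time
    for the last value.
    """
    area = 0.0
    for i, level in enumerate(values):
        if i + 1 < len(change_times):
            t_next = change_times[i + 1]
        else:
            t_next = end_time
        area += level * (t_next - change_times[i])
    return area / end_time
```

With levels 0, 1, 2, 1 starting at times 0, 2, 5, 6 and a run ending at time 10, `time_average([0, 2, 5, 6], [0, 1, 2, 1], 10)` evaluates the slide's formula as (0·2 + 1·3 + 2·1 + 1·4) / 10.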

Activity scanning
Activity scanning
Time is modeled in fixed time increments to
check if activity occurred
Small time increments are inefficient
Large time increments may miss activity
describes the activities in which the entities in
the system engage.

Process Oriented
Process oriented:
Many simulation models include elements
which occur in defined patterns
The logic associated with such a system or
events can be generalized and defined by a
single statement
A simulation language could then translate
such statement into the appropriate sequence
of events
describes the processes through which the
entities in the system flow.

Process Oriented
Process oriented:
These statements, define a sequence of events
which are automatically executed by the
simulation language as the entities move
through the process
Create arrival entities every t time units
However, since we are normally restricted to a
set of standardized statements provided by the
simulation language, our model flexibility is
not as great as with the event orientation

Feature provided by a language


Conceptual framework(entities, attributes,
resource, queues)
Maintenance of event list
Random variable generation
Animation
Debugging function
Output analysis
Input analysis
Report generation

Simulation Languages
One of the most important aspects of a simulation
study is the computer programming.
Several special-purpose computer simulation
languages have been developed to simplify
programming.
The best known and most readily available
simulation languages include GPSS, GASP IV,
and SLAM.
Most simulation languages use one of two different
modeling approaches or orientations; event
scheduling or process interaction.

GPSS uses the process-interaction approach.


SLAM allows the modeler to use either approach
or even a mixture of the two, whichever is the
most appropriate for the model being analyzed.
Of the general-purpose languages, FORTRAN is
the most commonly used in simulation.
In fact, several simulation languages, including
GASP IV and SLAM, use a FORTRAN base.

To use GASP IV we must provide a main program,
an initialization routine, and the event routines.
For the rest of the program, we use the GASP
routines.
Because of these prewritten routines, GASP IV
provides a great deal of programming flexibility.
GPSS, in contrast to GASP, is a highly structured
special-purpose language.
GPSS does not require writing a program in the
usual sense.

Building a GPSS model then consists of
combining these sets of blocks into a flow
diagram so that it represents the path an
entity takes as it passes through the system.
SLAM was developed by Pritsker and
Pegden (1979). It allows us to develop
simulation models as network models,
discrete-event models, continuous models, or
any combination of these.

The decision of which language to use is one of
the most important that a modeler or an analyst
must make in performing a simulation study.
Simulation languages offer several advantages.
The most important of these is that the special-purpose
languages provide a natural framework
for simulation modeling and most of the features
needed in programming a simulation model.

The Simulation Modeling Steps


We now discuss the process for a complete
simulation study and present a systematic
approach of carrying out a simulation.
A simulation study normally consists of several
distinct stages. (See Figure in the book)
However, not all simulation studies consist of all
these stages or follow the order stated here.
On the other hand, there may even be considerable
overlap between some of these stages.

Problem/Model Formulation
State the objective of the study.
Identify the Problem. Determine any underlying
causes if possible.
Determine the input variables.
Controllable Variables.
Uncontrollable Variables.

Make assumptions / boundaries that were used to
simplify the model.
Determine Performance measures used to
measure the objective. (Output)

Data collection/acquisition
Determine the Data Collection System or
Estimates to be used.

Observe the system


Historical or Similar Systems
Theoretical Estimates
Engineering Estimates
Operator Estimates
Vendor Estimates

Identify the data collected.


How it was collected.
How it was represented in the model.

Model Construction or
Development
Identify The Real System
Determine Conceptual Model -Activities
and Events
Develop the Logical Model.
Identify the Programming Language used.
Computer Implementation (Promodel,
Arena, Slam Systems).

Model Construction or
Development
Modeling Tips

Art vs. Science


Over Simplification vs. Unnecessary Detail
Start Simple
Add stronger assumptions

Model Verification and


Validation
Verification: Determining whether
simulation model works as intended.
Verifying the Model.

Structure: Walk Through of the Model


Debugger.
Trace = print or writing in process calculations.
Animation.
Model testing
Analytical Model.

Model Verification and


Validation
Verification.
Logical Model.
Are events represented correctly?
Are mathematical formulas and relationships
correct?
Are statistical measures formulated correctly?

Computer Model/Simulation Model.


Does the code contain all aspects of the logical
model?
Are the statistics and formulas calculated correctly?
Does the model contain coding errors?

Model Verification and


Validation
Validation: Determining whether the simulation
model is a credible representation
of the real system.
Compare the model with the actual system
by performing statistical tests (t-test and
confidence interval).
Conceptual Model.
Does the model contain all relevant elements,
events and relationships?
Will the model answer the questions of concern?

Model Verification and


Validation
Logical Model.
Does the model contain all events included in the
conceptual model?
Does the model contain all the relationships of the
conceptual model?

Computer Model/Simulation Model.


Is the computer model a valid representation of the
real system?
Can the computer model duplicate the performance
of the real system?
Does the computer model output have credibility
with system experts and decision makers?

Experimentation and Analysis of


Results
Experimentation The execution of the
simulation model to obtain output values
Analysis of Results The process of analyzing
the simulation outputs to draw inferences and
make recommendations for problem resolution

Implementation and Documentation


The process of implementing decisions
resulting from the simulation and
documenting the model and its use.

Manual Simulation Example


Given the following arrival times for a single
server system what will be the average number
in the queue, average number in the system,
average time in system, average time in queue,
the number of completed jobs, number in the
queue, number in the system, and server
utilization at time 15 if the service time is 3 time
units for each entity.

1, 3, 5, 9,13,15,17
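The manual exercise above can be checked with a short script. This is a sketch under stated conventions, which are assumptions rather than part of the slide: the run covers the interval [0, 15], an entity arriving exactly at time 15 is counted as present, and the average times in system and in queue are taken over jobs completed by time 15. The function name is hypothetical.

```python
def manual_single_server(arrivals, service_time, t_end):
    """Trace a single-server FCFS system with constant service time."""
    starts, departs = [], []
    free_at = 0.0                      # time at which the server next becomes free
    for a in arrivals:
        s = max(a, free_at)            # service starts when the server is free
        d = s + service_time
        starts.append(s)
        departs.append(d)
        free_at = d

    def clip(t):
        return min(t, t_end)           # only count time inside [0, t_end]

    queue_area = sum(clip(s) - clip(a) for a, s in zip(arrivals, starts))
    busy_area = sum(clip(d) - clip(s) for s, d in zip(starts, departs))
    done = [(a, s, d) for a, s, d in zip(arrivals, starts, departs) if d <= t_end]

    return {
        "avg_number_in_queue": queue_area / t_end,
        "avg_number_in_system": (queue_area + busy_area) / t_end,
        "avg_time_in_system": sum(d - a for a, _, d in done) / len(done),
        "avg_time_in_queue": sum(s - a for a, s, _ in done) / len(done),
        "completed_jobs": len(done),
        "number_in_queue": sum(1 for a, s in zip(arrivals, starts) if a <= t_end < s),
        "number_in_system": sum(1 for a, d in zip(arrivals, departs) if a <= t_end < d),
        "utilization": busy_area / t_end,
    }


if __name__ == "__main__":
    result = manual_single_server([1, 3, 5, 9, 13, 15, 17], 3, 15)
    for name, value in result.items():
        print(name, value)
```

Under these conventions the departures occur at times 4, 7, 10 and 13, so four jobs are complete at time 15 and the server has been busy from time 1 onward. A different boundary convention (for example, excluding the arrival at exactly 15) would change the instantaneous counts but not the time averages.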

Data Collection
Activities may be represented as
Constants
Random variables

Collection of data
Design a data collection form
Record more than single attribute in case you
need to use data in a different way.
Use several sessions to get representative data
Use control charts

Data Collection
Machine

Begin Repair End Repair

Time
Elapsed

Data Collection
Testing data
Independence
Randomness
Homogeneity

Data Collection
Test of Independence
Ho: Measure A is independent of measure B
H1: Measure A is not independent of measure
B.
Inventory and day of week

Data Collection
Test of Randomness
H0: $f(x_i \mid x_j) = f(x_i)$: independent
H1: $f(x_i \mid x_j) \neq f(x_i)$: dependent
For example, when simulating a production
process in which the items can be defective or
good, it would be important to know whether
successive items are randomly distributed or
whether runs of good items are followed by runs of
defective items.

Data Collection
Test of Homogeneity
Tests for whether multiple sets of data can be
considered as coming from the same statistical
population are generally referred to as tests of
homogeneity; they are distribution free.
H0: $G(x) = H(x)$
H1: $G(x) \neq H(x)$
Two different workers working on the same
machine.

Random Variable
Two types
Discrete
Continuous

Random Variable
Probability mass function
Discrete
$P(X = x_i) = p(x_i)$
$\sum_i p(x_i) = 1$

Random Variable

Probability density function
Continuous
e.g. $f(x) = \lambda e^{-\lambda x}$, $x > 0$
$P(X = a) = 0$
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
$P(a < x < b) = \int_a^b f(x)\,dx$

Random Variable
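Cumulative distribution function (CDF)
$$F(x) = P(X \le x) = \sum_{x_i \le x} p(x_i) \quad \text{(discrete)}$$
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt \quad \text{(continuous)}$$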

Random Variable
Expected value
$\mu = E(x)$
$= \sum_i x_i\, p(x_i)$ (discrete)
$= \int_{-\infty}^{\infty} x f(x)\,dx$ (continuous)

Random Variable
Variance
$$V(x) = E[(x - \mu)^2] = E[x^2 - 2\mu x + \mu^2] = E[x^2] - (E(x))^2 = \sum_i x_i^2\, p(x_i) - \Big(\sum_i x_i\, p(x_i)\Big)^2$$

Random Variable
Standard deviation
$$SD(X) = \sqrt{V(X)}$$
Sums of R.V. (with $x_1$, $x_2$ independent)
$$Y = a_1 x_1 + a_2 x_2$$
$$E(Y) = a_1 E(x_1) + a_2 E(x_2)$$
$$V(Y) = a_1^2 V(x_1) + a_2^2 V(x_2)$$

Random Variable

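Sample mean: $\bar{X} = \dfrac{\sum_i X_i}{n}$
Sample variance: $S^2 = \dfrac{\sum_i (X_i - \bar{X})^2}{n - 1} = \dfrac{\sum_i X_i^2 - n\bar{X}^2}{n - 1}$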

Poisson Probability Distribution


Consider a discrete r.v. which is often useful
when dealing with the number of occurrences
of an event over a specified interval of time.
Suppose we want to find the probability
distribution of the accidents at the intersection
of Rural and Apache during a one week
period.
The R.V. we are interested in is the number of
accidents.

Poisson Probability Distribution


i. The Poisson distribution provides a good model for the probability
distribution of the number of rare events that occur in space, time,
or volume, where $\lambda$ is the average rate at which events occur.
ii. Define: A r.v. X is said to have a Poisson distribution if the p.m.f. of
X is
$$P(x) = f(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, \ldots$$
where $\lambda$ is the rate per unit time or per unit area
iii. $E[X] = \lambda$, $V(X) = \lambda$

Exponential Distribution
Previously, we discussed the Poisson random variable,
which was the number of events occurring in a given
interval. This number was a discrete r.v. and the
probabilities associated with it could be described by the
Poisson Probability Distribution.
Not only is the number of events a r.v., but the waiting
time between event is also a random variable. This r.v. is a
continuous r.v. for it can assume any positive value.
This r.v. is an exponential r.v. which can be described by
the exponential distribution.

Exponential Distribution
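i. Pdf:
$$f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0,\ \lambda > 0 \\ 0 & \text{otherwise} \end{cases}$$
where $\lambda$ = rate at which events occur

ii. Correspondingly,
$$F(x) = P(X \le x) = \int_0^x \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}, \quad x \ge 0$$
$$E[X] = \frac{1}{\lambda}, \qquad V(X) = \frac{1}{\lambda^2}$$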

iii. An important application of the exponential distribution is to
model the distribution of component lifetimes. One reason for its
popularity is the memoryless property of the
exponential distribution.
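The memoryless property follows directly from the CDF $F(x) = 1 - e^{-\lambda x}$. As a short worked derivation (the symbols $s$ and $t$ are introduced here for illustration and denote any non-negative times):

$$P(X > s + t \mid X > s) = \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)$$

In words: a component that has already survived $s$ time units has the same chance of lasting another $t$ units as a new component has of lasting $t$ units.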

The Uniform Distribution


The simplest distribution is the one in which a continuous r.v. can assume
any value within an interval [a, b].
Def:
A continuous r.v. X is said to have a uniform distribution on the
interval [a, b] if the probability density function (pdf) of X is:
$$f(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

The Uniform Distribution


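The cumulative distribution is
$$F(x) = P(X \le x) = \int_a^x f(t)\,dt = \int_a^x \frac{1}{b - a}\,dt = \frac{x - a}{b - a}, \quad a \le x \le b$$
$$E[X] = \int_a^b x f(x)\,dx = \int_a^b x \left(\frac{1}{b - a}\right) dx = \frac{b + a}{2}$$
$$V(X) = \frac{(b - a)^2}{12}$$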

The Uniform Distribution


Note:
An important uniform distribution is
that for when a = 0 and b = 1, namely
U(0, 1)
A U(0,1) r.v. can be used to simulate
observation of other random variables
of the discrete and continuous type.

The Triangular Distribution


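Continuous distribution
$$f(x) = \begin{cases} \dfrac{2(x - a)}{(b - a)(c - a)} & a \le x \le b \\[2mm] \dfrac{2(c - x)}{(c - b)(c - a)} & b < x \le c \\[2mm] 0 & \text{elsewhere} \end{cases}$$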

The Triangular Distribution


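$$F(x) = \begin{cases} 0 & x \le a \\[1mm] \dfrac{(x - a)^2}{(b - a)(c - a)} & a < x \le b \\[2mm] 1 - \dfrac{(c - x)^2}{(c - b)(c - a)} & b < x \le c \\[2mm] 1 & x > c \end{cases}$$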

The Triangular Distribution


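$$E(x) = \frac{a + b + c}{3}$$
$$V(x) = \frac{a^2 + b^2 + c^2 - ab - ac - bc}{18}$$
Parameter estimates from data $x_1, \ldots, x_n$:
$$a = \min\{x_1, \ldots, x_n\}, \qquad c = \max\{x_1, \ldots, x_n\}, \qquad b = 3\bar{x} - a - c$$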

Normal Distribution
It is a fact that measurements on many random variables will follow a bell-shaped distribution.
Random variables of this type are closely approximated by a normal
probability distribution.
A continuous r.v. X is said to have a normal distribution if the pdf of X is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < \mu < \infty,\ \sigma > 0,\ -\infty < x < \infty$$
The distribution contains 2 parameters ($\mu$ and $\sigma^2$). These are the expected
value and the variance, and hence locate the center of the distribution and
measure its spread.

Normal Distribution
The Standard Normal Distribution
To compute $P(a \le x \le b)$ when $X \sim N(\mu, \sigma^2)$, we must evaluate
$$\int_a^b f(x)\,dx = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,dx$$
Note: None of the standard integration techniques can be used
to evaluate this integral. Instead, for $\mu = 0$ and $\sigma^2 = 1$, the integral has
been evaluated numerically and the values tabulated. Using the
table, probabilities for any other values of $\mu$ and $\sigma^2$ can be
determined.

Normal Distribution
The normal distribution with parameter values
$\mu = 0$ and $\sigma^2 = 1$ is called the standard normal
distribution. A r.v. that has a standard normal
distribution is called a standard normal random
variable (denoted by Z). The pdf of Z is:
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$

Normal Distribution
The cumulative distribution of Z is
$$P(Z \le z) = \int_{-\infty}^{z} f(y)\,dy$$
and is denoted by $\Phi(z)$

Note: The N(0,1) table returns the cumulative
probability up to z, or $\Phi(z)$

Normal Distribution
Non-standard Normal Distribution
The table only provides probabilities for r.v.
following the N(0,1) distribution. Thus, when
$X \sim N(\mu, \sigma^2)$ (i.e. not $\mu = 0$, $\sigma^2 = 1$), probabilities
involving X are computed by standardizing
the r.v. to the N(0,1) scale: $Z = (X - \mu)/\sigma$.
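Standardizing can be illustrated in a few lines. This sketch replaces the printed N(0,1) table with the error function from Python's standard library, using the identity $\Phi(z) = \tfrac{1}{2}\,[1 + \mathrm{erf}(z/\sqrt{2})]$; the function names are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """Phi(z): cumulative probability of N(0,1) up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_probability(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), computed by standardizing."""
    za = (a - mu) / sigma
    zb = (b - mu) / sigma
    return standard_normal_cdf(zb) - standard_normal_cdf(za)


if __name__ == "__main__":
    # Probability that X ~ N(10, 2^2) lies within one standard deviation of its mean.
    print(normal_probability(8.0, 12.0, 10.0, 2.0))
```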

Selecting a Distribution
Theoretical prior knowledge
Random arrival => exponential IAT
Sum of large manufactures => Normal CLT

Compare histogram with probability mass


or probability density

Data Collection
Little variability: model as a constant.
Variability: model as a random variable.
Empirical vs. theoretical: select a
distribution, estimate the parameters of the
distribution, run a goodness-of-fit test.

X2 goodness of fit test


Compare observed versus theoretical
density
A collection of data can be viewed as a sample
from a specified p.d.f.
H0: the $X_i$ are IID r.v. with density f(x)
H1: the $X_i$ are not IID r.v. with density f(x)

X2 goodness of fit test


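Critical value
If H0 is true, $TS \sim \chi^2$ with $k - 1 - (\text{number of parameters estimated})$ degrees of freedom
A large T.S. would cause rejection of H0
Reject H0 if T.S. > $\chi^2$ critical
$$TS = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ and $E_i$ are the observed and expected counts in interval i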

X2 goodness of fit test

Issues: the test is an art

Number of intervals > 2
Size of intervals: $E_i$ roughly the same, and > 5
Requires a relatively large amount of data
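The chi-square test statistic is simple to compute once the data are binned. A minimal sketch, assuming equal-width intervals on [0, 1) for a U(0,1) hypothesis; the helper names are hypothetical, and the critical value is left to a chi-square table rather than computed here.

```python
def uniform_bin_counts(samples, k):
    """Count how many samples in [0, 1) fall in each of k equal-width intervals."""
    counts = [0] * k
    for u in samples:
        index = min(int(u * k), k - 1)   # guard against u == 1.0
        counts[index] += 1
    return counts


def chi_square_statistic(observed, expected):
    """TS = sum over intervals of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    observed = [12, 8, 10, 10]           # hypothetical counts, 40 observations
    expected = [10, 10, 10, 10]          # 40 observations / 4 intervals
    print(chi_square_statistic(observed, expected))
```

The statistic is then compared with the tabulated $\chi^2$ critical value for $k - 1 - (\text{number of parameters estimated})$ degrees of freedom.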

K-S test
Compare observed with theoretical CDF
Limited to continuous distribution, known
parameters
H0: Xi are IID r.v. with CDF F(x)
H1: Xi are not IID r.v. with CDF F(x)
Test statistic From table

K-S test
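Critical value
A large T.S. would cause rejection
Critical values:
$\alpha = 0.01$: $1.63/\sqrt{n}$
$\alpha = 0.05$: $1.36/\sqrt{n}$
$\alpha = 0.10$: $1.22/\sqrt{n}$

$$TS = \max\left\{ \max_i \left( \frac{i}{n} - \hat{F}(X_i) \right),\ \max_i \left( \hat{F}(X_i) - \frac{i - 1}{n} \right) \right\}$$
where the $X_i$ are sorted in ascending order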
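The K-S statistic can be sketched directly from its definition. The function names are hypothetical; the critical-value helper uses only the three large-sample constants listed on the slide.

```python
import math


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF and the hypothesized CDF."""
    xs = sorted(samples)
    n = len(xs)
    # enumerate() is 0-based, so the i-th order statistic has index i - 1.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


def ks_critical_value(n, alpha):
    """Large-sample critical value c / sqrt(n) for alpha in {0.01, 0.05, 0.10}."""
    constants = {0.01: 1.63, 0.05: 1.36, 0.10: 1.22}
    return constants[alpha] / math.sqrt(n)


if __name__ == "__main__":
    data = [0.5, 0.1, 0.9]               # hypothetical observations
    ts = ks_statistic(data, lambda x: x)  # U(0,1) has CDF F(x) = x on [0, 1]
    print(ts, ks_critical_value(len(data), 0.05))
```

Three observations are far too few for the large-sample critical values to be trusted; the example only shows the mechanics.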

Parameter estimation
Set of data $x_1, x_2, \ldots, x_n$
$$\bar{x} = \frac{\sum_i x_i}{n}, \qquad s^2 = \frac{\sum_i x_i^2 - n\bar{x}^2}{n - 1}$$
Method of moments => equate E(X),
V(X) to $\bar{x}$ and $S^2$

Parameter estimation
Maximum likelihood => find parameter
that max the likelihood of obtaining the
given sample
Produces efficient and consistent estimates
Not always unbiased
Superior properties to methods of moments

Common sense.

Statistical Analysis of Simulations


As previously mentioned, output data from
simulation always exhibit random variability, since
random variables are input to the simulation model.
We must utilize statistical methods to analyze
output from simulations.
The overall measure of variability is generally
stated in the form of a confidence interval at a given
level of confidence.
Thus, the purpose of the statistical analysis is to
estimate this confidence interval.

Output analysis
Need multiple observations to estimate
variability
Y1, Y2, Y3, ..., Yn
Estimate a confidence interval for the
measure of performance
Estimate the number of observations
required to obtain the desired precision

Output analysis
What is an observation?
Is observation a sample statistic or time
average statistic?
Is this a steady state simulation or
terminating simulation?
Are the observations independent or
correlated?

Terminating vs Steady State Simulation


Often, the type of model determines which
type of output analysis is appropriate for a
particular simulation.
However, the system or model may not
always be the best indicator of which
simulation would be the most appropriate.
It is quite possible to use the terminating
simulation approach for systems more
suited to steady-state simulations, and vice
versa.

Observation vs Time Based


Observation (Sample)
Average Time In System
Average Time In Queue

Time Based
Average Number in System
Average Number in Queue
Machine Utilization

Terminating simulation
Simulation in which the output measure of
performance is defined over a specific
interval of time with a specific starting
condition and a specific ending condition

Retail sales during a business day


Project network
Time to produce a batch of parts in a work cell
Military Simulations

Terminating simulation
Has a specified starting and ending
condition.
Each observation must have the same
starting and ending.
Observations are obtained by replication.
Use a different seed for random number
generation.

Steady state simulation


Simulation in which the output measure of
performance is defined over an infinite
interval of time independent of the initial
state of the system and stopping condition
Average production from an assembly line of
well trained employees
Inventory simulation

Steady state simulation


Independent of starting and ending
condition.
Remove initial condition bias
Specify warm-up period (transient period) .
Set initial condition to steady state.
Have a very long run length

Steady state simulation


1. Individual Yi: average of individuals.
2. Replication Yi: average of each one.
3. Batch means: batch by time, by number.

Terminating vs. Steady state simulation


Terminating
Observations are obtained by replication
Each observation must reflect the specified
starting and ending condition
Use a different seed for each replication
Y1, Y2, ..., Yr => one independent
observation per replication

Confidence interval for steady state


simulation
Y1, Y2, ..., Yn
Trying to estimate a long run performance
measure independent of starting and
ending conditions
Two problems
Initial condition bias
Dependent observations

Confidence interval for steady state


simulation
Outline
Removing initial condition bias
Creating independent observation
Replication/ deletion
Batch means
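Deletion of the warm-up period followed by batching can be sketched as below. The function name, the choice to batch by number of observations (rather than by time), and the rule of dropping leftover observations at the end are assumptions made for illustration.

```python
def batch_means(observations, warmup, n_batches):
    """Drop the first `warmup` observations, then average equal-size batches.

    Leftover observations that do not fill a whole batch are discarded.
    """
    data = observations[warmup:]
    size = len(data) // n_batches
    return [
        sum(data[i * size:(i + 1) * size]) / size
        for i in range(n_batches)
    ]


if __name__ == "__main__":
    series = list(range(14))             # hypothetical output series 0..13
    print(batch_means(series, warmup=2, n_batches=3))
```

Each batch mean is then treated as one approximately independent observation when forming a confidence interval; longer batches reduce the correlation between neighbouring batch means.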

Confidence interval for replication


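Let $Y_1, Y_2, \ldots, Y_R$ be measures of
performance from R independent
replications.
Independent -> different seed for each run

$$\bar{Y} \pm t_{R-1,\,1-\alpha/2} \sqrt{\frac{S^2}{R}}, \qquad S^2 = \frac{\sum_{i=1}^{R} Y_i^2 - R\bar{Y}^2}{R - 1}$$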

Confidence interval for replication


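Approximate due to need for Yi ~ Normal
$(1 - \alpha)$ confidence interval => probability
of containing the true mean
$$Var(\bar{Y}) = Var\left(\tfrac{1}{R} Y_1 + \cdots + \tfrac{1}{R} Y_R\right) = \frac{1}{R^2}\big(Var(Y_1) + \cdots + Var(Y_R)\big) = \frac{R S^2}{R^2} = \frac{S^2}{R}$$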
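The replication confidence interval can be computed as follows. This is a minimal sketch: the function name is hypothetical, and the t critical value is passed in by the caller (read from a t table) because the standard library has no t quantile function.

```python
import math


def replication_confidence_interval(ys, t_critical):
    """Return (mean, half_length) for R independent replication results.

    t_critical is the tabulated t value for R - 1 degrees of freedom.
    """
    r = len(ys)
    mean = sum(ys) / r
    s2 = (sum(y * y for y in ys) - r * mean ** 2) / (r - 1)   # sample variance
    half_length = t_critical * math.sqrt(s2 / r)              # t * sqrt(S^2 / R)
    return mean, half_length


if __name__ == "__main__":
    # Hypothetical results of 3 replications; 4.303 is the tabulated
    # two-sided 95% t value for 2 degrees of freedom.
    mean, half = replication_confidence_interval([10.0, 12.0, 14.0], 4.303)
    print(mean - half, mean + half)
```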

Number of replication needed


Suppose we desire a confidence interval
$\bar{Y} \pm I$, where I is the half-length

Based on a preliminary run of $R_0$
replications, we have an estimate of $S^2$ and the
confidence interval

$$\bar{Y} \pm t_{R_0 - 1,\,1-\alpha/2} \frac{S}{\sqrt{R_0}}$$

Number of replication needed


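Find R such that
$$I \ge t_{R-1,\,1-\alpha/2} \sqrt{\frac{S^2}{R}}$$

If R is large, $t_{R-1,\,1-\alpha/2} \approx z_{1-\alpha/2}$, so
$$R \ge \left(\frac{z_{1-\alpha/2}\, S}{I}\right)^2$$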
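The large-R approximation gives a quick estimate of how many replications are needed. A sketch, assuming the normal approximation to the t value; the function name and the numbers in the example are hypothetical.

```python
import math


def replications_needed(s, half_length, z=1.96):
    """Smallest R with z * S / sqrt(R) <= half_length (normal approximation).

    s is the standard deviation estimated from the preliminary replications;
    z = 1.96 corresponds to a 95% two-sided interval.
    """
    return math.ceil((z * s / half_length) ** 2)


if __name__ == "__main__":
    print(replications_needed(s=2.0, half_length=0.5))
```

Because the normal quantile is smaller than the t quantile, the result is a lower bound; when R comes out small it is worth rechecking the inequality with the tabulated t value.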

Test for comparing two means


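H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 \neq 0$
Two approaches:
Form a $(1 - \alpha)$ confidence interval on $\mu_1 - \mu_2$:
$$\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\,r} \sqrt{V(\bar{Y}_1 - \bar{Y}_2)}$$

Reject H0 if the confidence interval does not contain 0.

Perform a t test:

$$t = \frac{(\bar{Y}_1 - \bar{Y}_2) - 0}{\sqrt{V(\bar{Y}_1 - \bar{Y}_2)}}$$

Reject H0 if $|t| > t_{r,\,\alpha/2}$, where r is the degrees of freedom.

Assumptions
Case 1: $Y_{11}, Y_{12}, \ldots, Y_{1R_1}$ with sample mean $\bar{Y}_1$ and variance $s_1^2$
Case 2: $Y_{21}, Y_{22}, \ldots, Y_{2R_2}$ with sample mean $\bar{Y}_2$ and variance $s_2^2$

Observations are independent
Observations are normally distributed
Variances are unknown/known.
Variances are equal/unequal
Observations are paired/unpaired.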

Test for comparing two means


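Equal Variance
1. Assumptions: independent, normal, unknown, unpaired, equal
variance.
2. $$S_p^2 = \frac{(R_1 - 1) S_1^2 + (R_2 - 1) S_2^2}{R_1 + R_2 - 2} = \frac{\sum_i (Y_{1i} - \bar{Y}_1)^2 + \sum_i (Y_{2i} - \bar{Y}_2)^2}{R_1 + R_2 - 2}$$
3. $$Var(\bar{Y}_1 - \bar{Y}_2) = Var(\bar{Y}_1) + Var(\bar{Y}_2) = \frac{S_p^2}{R_1} + \frac{S_p^2}{R_2}$$
4. $(1 - \alpha)$ confidence interval: $$\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\,R_1 + R_2 - 2} \sqrt{\frac{S_p^2}{R_1} + \frac{S_p^2}{R_2}}$$
5. t-test: $$t = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\dfrac{1}{R_1} + \dfrac{1}{R_2}}}, \qquad t_{crit} = t_{R_1 + R_2 - 2,\,\alpha/2}$$
6. Note: Many simulations do not have equal variance.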

Test for comparing two means

One sided test


Need to make hypothesis in advance
Use t test, adjust critical value

Test for comparing two means


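Test for normal population with known variance
Assumptions: independent, normal, known variance,
unpaired, unequal variance.
2 populations: $X_1 \sim N(\mu_1, \sigma_1^2)$ & $X_2 \sim N(\mu_2, \sigma_2^2)$
Sample m from $X_1$ & sample n from $X_2$
Want to test whether $\mu_1 = \mu_2$
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \neq \mu_2$
Test statistic (under H0, $\mu_1 - \mu_2 = 0$):
$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$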

Test for comparing two means


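Unequal Variance
1. Assumptions: independent, normal, unknown variance,
unpaired, unequal variance.
2. $$Var(\bar{Y}_1 - \bar{Y}_2) = Var(\bar{Y}_1) + Var(\bar{Y}_2) = \frac{S_1^2}{R_1} + \frac{S_2^2}{R_2}$$
3. $(1 - \alpha)$ confidence interval: $$\bar{Y}_1 - \bar{Y}_2 \pm t_{\alpha/2,\,\nu} \sqrt{\frac{S_1^2}{R_1} + \frac{S_2^2}{R_2}}$$
with degrees of freedom
$$\nu = \frac{\left(\dfrac{S_1^2}{R_1} + \dfrac{S_2^2}{R_2}\right)^2}{\dfrac{\left(S_1^2 / R_1\right)^2}{R_1 - 1} + \dfrac{\left(S_2^2 / R_2\right)^2}{R_2 - 1}}$$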
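The degrees-of-freedom expression for the unequal-variance case is awkward to evaluate by hand, so a small helper is useful. The function name and the sample values are hypothetical.

```python
def unequal_variance_df(s1_sq, r1, s2_sq, r2):
    """Approximate degrees of freedom for the unequal-variance t interval."""
    a = s1_sq / r1                       # estimated Var of the first sample mean
    b = s2_sq / r2                       # estimated Var of the second sample mean
    return (a + b) ** 2 / (a ** 2 / (r1 - 1) + b ** 2 / (r2 - 1))


if __name__ == "__main__":
    # Hypothetical: sample variances 4 and 9, ten replications each.
    print(unequal_variance_df(4.0, 10, 9.0, 10))
```

The result is generally not an integer; a common conservative choice is to round it down before looking up the t value.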

Test for comparing two means


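Paired Test
Assumptions: independent, normal, unknown variance,
equal # of replications
Case 1: $Y_{11}, Y_{12}, \ldots, Y_{1R}$
Case 2: $Y_{21}, Y_{22}, \ldots, Y_{2R}$
Differences: $d_1, d_2, \ldots, d_R$, where $d_i = y_{1i} - y_{2i}$
$$\bar{d} = \frac{\sum_i d_i}{R}, \qquad S_d^2 = \frac{\sum_i (d_i - \bar{d})^2}{R - 1}, \qquad V(\bar{d}) = \frac{S_d^2}{R}$$
H0: $\mu_1 - \mu_2 = 0 \iff \mu_d = 0$
H1: $\mu_1 - \mu_2 \neq 0 \iff \mu_d \neq 0$
$(1 - \alpha)$ confidence interval: $$\bar{d} \pm t_{\alpha/2,\,R - 1} \sqrt{\frac{S_d^2}{R}}$$
$$t = \frac{\bar{d}}{\sqrt{S_d^2 / R}}$$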
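The paired test reduces to a one-sample computation on the differences. A minimal sketch with a hypothetical function name and made-up replication results; the critical value again comes from a t table with R - 1 degrees of freedom.

```python
import math


def paired_t(y1, y2):
    """Return (d_bar, s_d_squared, t) for paired observations y1[i], y2[i]."""
    if len(y1) != len(y2):
        raise ValueError("paired test needs equal numbers of replications")
    r = len(y1)
    d = [a - b for a, b in zip(y1, y2)]
    d_bar = sum(d) / r
    s_d_sq = sum((x - d_bar) ** 2 for x in d) / (r - 1)
    t = d_bar / math.sqrt(s_d_sq / r)
    return d_bar, s_d_sq, t


if __name__ == "__main__":
    print(paired_t([10.0, 12.0, 14.0], [9.0, 10.0, 11.0]))
```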

Test for comparing two variances


F-test for equal variance
1. H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 \neq \sigma_2^2$

2. Test statistic: $F = \dfrac{S_1^2}{S_2^2}$
3. Critical value: $F_{R_1 - 1,\,R_2 - 1,\,\alpha/2}$

4. Example
F = 5.4/2.55 = 2.12
$\alpha$ = .10, $F_{critical} = F_{9,9,.05}$ = 3.18, cannot reject H0

Common Random Number


The process of comparing cases with the
same set of random numbers
creating identical condition

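Observation
Confidence interval
$$(\bar{Y}_1 - \bar{Y}_2) \pm t_{R - 1,\,\alpha/2} \sqrt{V(\bar{Y}_1 - \bar{Y}_2)}$$
$$V(\bar{Y}_1 - \bar{Y}_2) = V(\bar{Y}_1) + V(\bar{Y}_2) - 2\,Cov(\bar{Y}_1, \bar{Y}_2)$$

Use the paired test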

Random Numbers
Generation of U(0,1) random numbers:
algorithm used by the RND function
Generation of random variates from
various distributions: algorithm used by
EXPONENTIAL, UNIFORM, and so on
(these algorithms use U(0,1) random
numbers).

Random Number Generation


Desirable properties

Fast and efficient


Capable of repeating same sequence
Statistically equivalent to U(0,1)
Independent and dense
Large cycle length or period
Low storage requirements

Old method: tables

Random Number Generation


Pseudo random number generators
A non random sequence of numbers each
completely determined by its predecessor, the
algorithm, and initially, the seed.

Linear Congruential Generator


Zi = ( a * Zi-1 + C ) mod m
Z0 = seed
Ui = Zi / m (Random Number)
If we choose a, C, and m correctly, then
we achieve the maximum (full) period
0 <= Zi <= m-1
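
A minimal sketch of the recurrence, using small illustrative parameters (a = 5, C = 3, m = 16, seed 7) that are not from the slides and are far too small for real use:

```python
def lcg(seed, a, c, m):
    """Yield U(0,1) pseudorandom numbers from Z_i = (a * Z_{i-1} + c) mod m."""
    z = seed
    while True:
        z = (a * z + c) % m
        yield z / m

gen = lcg(seed=7, a=5, c=3, m=16)
first = [next(gen) for _ in range(4)]
print(first)   # [0.375, 0.0625, 0.5, 0.6875]
```

Restarting the generator with the same seed reproduces the same sequence, which is the repeatability property listed among the desirable properties.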

Linear Congruential Generator


Rule for full period:
(a) C is relatively prime to m; that is, there is no
integer other than 1 that exactly divides both C and m
(b) Every prime factor of m is also a prime factor
of a-1
(c) If m is exactly divisible by 4, then a-1 must
be exactly divisible by 4
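
The three conditions can be checked mechanically. The sketch below is an illustration with hypothetical helper names (`prime_factors`, `has_full_period`, `period`); the brute-force `period` function is only practical for small m.

```python
from math import gcd

def prime_factors(n):
    """Return the set of prime factors of n by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(a, c, m):
    """Check the three full-period conditions for Z_i = (a*Z_{i-1} + c) mod m."""
    return (gcd(c, m) == 1
            and all((a - 1) % p == 0 for p in prime_factors(m))
            and (m % 4 != 0 or (a - 1) % 4 == 0))

def period(a, c, m, seed=0):
    """Count steps until the seed recurs (None if it does not within m steps)."""
    z = seed
    for steps in range(1, m + 1):
        z = (a * z + c) % m
        if z == seed:
            return steps
    return None

print(has_full_period(5, 3, 16), period(5, 3, 16))   # True 16
print(has_full_period(3, 3, 16), period(3, 3, 16))   # False 8
```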

Linear Congruential Generator


A full period does NOT mean always a
good random number generator

Multiplicative Generators
Zi = a * Zi-1 mod m
Z0 = seed
Saves an addition, more popular

Multiplicative Generators
C = 0

m divides both m and C

Condition (a), that C is relatively prime to m, is violated
Not full period
P = m - 1 is the largest available period

Multiplicative Generators
2^b is not a good choice for m
only 2^(b-2) of the possible numbers can be generated
Let m = 2^b - 1

Testing a random number generator


Testing the distribution
Generate 1000 or more observations
Chi-square ($\chi^2$) test or K-S test for U(0, 1)
Use 100 intervals
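
A minimal sketch of the chi-square statistic for uniformity, assuming equal-width intervals on [0, 1). The function name `chi_square_uniform` is hypothetical, and Python's built-in generator stands in for the generator under test. The resulting statistic is compared with a chi-square table value with k - 1 degrees of freedom.

```python
import random

def chi_square_uniform(samples, k=100):
    """Chi-square statistic comparing observed interval counts with n/k."""
    counts = [0] * k
    for u in samples:
        counts[min(int(u * k), k - 1)] += 1   # interval index of u
    expected = len(samples) / k
    return sum((obs - expected) ** 2 / expected for obs in counts)

rng = random.Random(12345)
stat = chi_square_uniform([rng.random() for _ in range(1000)], k=100)
print(stat)
```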

Test for independence


Runs up
Tests designed to compare observed and
expected distribution

E(X) = 0.5, V(X) = 1/12 for U(a, b) with a = 0, b = 1

Random variate generation


Assume a random number generator is
available to generate Ui ~ U(0, 1)
Goal: Generate Xi from a specified
distribution f(x) or p(x), or F(x)
Three methods
Inverse transformation method
Convolution method
Acceptance\Rejection method

Random variate generation


Apply these methods to the five
distributions we are using in this class

Uniform
Triangular
Exponential
Normal
Poisson

Inverse transformation method


General idea: use the CDF
Select Ui
Find the corresponding xi
That is, $x_i = F^{-1}(U_i)$

Advantage of inverse transformation method


One Ui per xi

Disadvantage
A closed-form CDF (or inverse CDF) may not always exist

Inverse transformation method


Exponential distribution
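$f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
$F(x) = 1 - e^{-\lambda x}$, $x \ge 0$
$U_i = F(x_i) = 1 - e^{-\lambda x_i}$
$1 - U_i = e^{-\lambda x_i}$
$\ln(1 - U_i) = -\lambda x_i$
$x_i = -\frac{1}{\lambda} \ln(1 - U_i)$, or equivalently $x_i = -\frac{1}{\lambda} \ln(U_i)$, since $1 - U_i$ is also U(0,1)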
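
The last line of the derivation translates directly into code. This is a small sketch with a hypothetical function name; the rate 2.0 and the seed are arbitrary choices for the demonstration.

```python
import math
import random

def exponential_variate(lam, u):
    """Inverse transformation: x = -(1/lam) * ln(1 - u) for u in [0, 1)."""
    return -math.log(1.0 - u) / lam

rng = random.Random(1)
sample = [exponential_variate(2.0, rng.random()) for _ in range(20_000)]
sample_mean = sum(sample) / len(sample)
print(sample_mean)   # should be close to 1/lam = 0.5
```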

Inverse transformation method


Triangular distribution

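With minimum a, mode b, and maximum c, the CDF is
$F(x) = \frac{(x - a)^2}{(b - a)(c - a)}$, $a \le x \le b$
$F(x) = 1 - \frac{(c - x)^2}{(c - b)(c - a)}$, $b \le x \le c$

Is $u_i \le \frac{b - a}{c - a}$?
Yes: $x_i = a + \sqrt{(b - a)(c - a)\, u_i}$
No: $x_i = c - \sqrt{(c - b)(c - a)(1 - u_i)}$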
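
The yes/no branch for the triangular distribution can be sketched as follows. The function name is hypothetical; a, b, and c are taken to be the minimum, mode, and maximum.

```python
import math

def triangular_variate(a, b, c, u):
    """Inverse transformation for a triangular(a, b, c) distribution, u ~ U(0,1)."""
    if u <= (b - a) / (c - a):
        # Left branch: invert F(x) = (x - a)^2 / ((b - a)(c - a)).
        return a + math.sqrt((b - a) * (c - a) * u)
    # Right branch: invert F(x) = 1 - (c - x)^2 / ((c - b)(c - a)).
    return c - math.sqrt((c - b) * (c - a) * (1.0 - u))

print(triangular_variate(0.0, 1.0, 2.0, 0.125))   # 0.5
print(triangular_variate(0.0, 1.0, 2.0, 0.875))   # 1.5
```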

Convolution Method
Applicable to situations where the random
variable of interest can be expressed as a
sum of other random variables that are IID
(independent and identically distributed)
X = Y1 + Y2 + Y3 + ... + Yn
Idea: Generate Y1, ..., Yn and add these up
to calculate X

Convolution Method
Normal distribution
Focus: Generating Zi ~ N(0, 1)

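$Z_i = \frac{x_i - \mu}{\sigma}$
$x_i = \mu + \sigma Z_i \sim N(\mu, \sigma^2)$

Generating Zi

$f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}$

Inverse transformation: F(x) does not exist in closed form

Acceptance/Rejection: the domain is not bounded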

Convolution Method
Normal distribution
Generate Ui
Generate Zi
Then $X_i = \mu + \sigma Z_i$

where Zi ~ N(0,1)

Acceptance\Rejection Method
Applicable to distribution functions that
are hard to integrate
Idea
Find a majorizing function t(x) where t(x) >= f(x)
Sample a value of x from t(x) (normalized to a density); call it x*
Sample Ui; if Ui < f(x*) / t(x*), accept x*; otherwise reject it and repeat

Simplification: for this class we will
always use a rectangular majorizing function

9.3 Random Numbers and Monte Carlo Simulation
The procedure of generating these times from the
given probability distributions is known as
sampling from probability distributions, or
random variate generation, or Monte Carlo
sampling.
We will discuss several different methods of
sampling from discrete distributions.
The principle of sampling from discrete
distributions is based on the frequency
interpretation of probability.

In addition to obtaining the right frequencies, the
sampling procedure should be independent; that is,
each generated service time should be independent
of the service times that precede it and follow it.
This procedure of segmentation and using a roulette
wheel is equivalent to generating integer random
numbers between 00 and 99.
This follows from the fact that each random number
in a sequence has an equal probability of showing
up, and each random number is independent of the
numbers that precede and follow it.

A random number, Ri, is defined as an
independent random sample drawn from a
continuous uniform distribution whose
probability density function (pdf) is given by
$f(x) = 1$ for $0 \le x \le 1$, and $f(x) = 0$ otherwise

Random Number Generators

Since our interest in random numbers is for use
within simulations, we need to be able to generate
them on a computer.
This is done by using mathematical functions called
random number generators.
Most random number generators use some form of
congruential relationship. Examples of such
generators include the linear congruential generator, the
multiplicative generator, and the mixed generator.
The linear congruential generator is by far the most
widely used.

Each random number generated using these methods
will be a decimal number between 0 and 1.
Random numbers generated using congruential
methods are called pseudorandom numbers.
Random number generators must have these
important characteristics:
1. The routine must be fast;
2. The routine should not require a lot of core storage;
3. The random numbers should be replicable; and
4. The routine should have a sufficiently long cycle.

Most programming languages have built-in
library functions that provide random (or
pseudorandom) numbers directly.

Computer Generation of Random Numbers
We now take the method of Monte Carlo sampling
a stage further and develop a procedure using
random numbers generated on a computer.
The idea is to transform the U(0,1) random
numbers into integer random numbers between 00
and 99 and then to use these integer random
numbers to achieve the segmentation by numbers.
We now formalize this procedure and use it to
generate random variates for a discrete random
variable.

The procedure consists of two steps:
1. We develop the cumulative probability
distribution (cdf) for the given random
variable, and
2. We use the cdf to allocate the integer random
numbers directly to the various values of the
random variable.
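
The two steps can be sketched as follows. The service times and probabilities are hypothetical values chosen for illustration, not data from the slides; the cdf assigns 00 to 19, 20 to 69, and 70 to 99 to the three values.

```python
import random
from itertools import accumulate

values = [1, 2, 3]            # hypothetical service times
probs = [0.20, 0.50, 0.30]    # hypothetical probabilities (sum to 1)

# Step 1: cumulative distribution, expressed in hundredths.
upper = [round(100 * c) for c in accumulate(probs)]   # [20, 70, 100]

def variate(two_digit):
    """Step 2: map an integer random number 00..99 to a value of the variable."""
    for value, bound in zip(values, upper):
        if two_digit < bound:
            return value
    raise ValueError("random number must be in 0..99")

rng = random.Random(3)
draws = [variate(int(100 * rng.random())) for _ in range(10)]
print(draws)
```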

9.4 An Example of Monte Carlo Simulation
The book uses a Monte Carlo simulation to
simulate a news vendor problem.
The procedure in this simulation is different from
the queuing simulation, in that the present
simulation does not evolve over time in the same
way.
Here, every day is an independent simulation.
Such simulations are commonly referred to as
Monte Carlo simulations.

9.5 Simulations with Continuous Random Variables
In many simulations, it is more realistic and
practical to use continuous random variables.
We present and discuss several procedures for
generating random variates from continuous
distributions.
The basic principle is similar to the discrete case.
We first generate a U(0,1) random number and then
transform it into a random variate from the
specified distribution.

The selection of a particular algorithm will
depend on the distribution from which we want
to generate, taking into account such factors as
the exactness of the random variates, the
computational and storage efficiencies, and the
complexity of the algorithm.
The two most commonly used algorithms are the
inverse transformation method (ITM) and the
acceptance-rejection method (ARM).

Inverse Transformation Method


The inverse transformation method is generally used
for distributions whose cumulative distribution
function can be obtained in closed form.
Examples include the exponential, the uniform, the
triangular, and the Weibull distributions.
For distributions whose cdf does not exist in closed
form, it may be possible to use some numerical
method, such as a power-series expansion, within
the algorithm to evaluate the cdf.

The ITM is relatively easy to describe and
execute.
It consists of the following steps:
Step 1: Given the probability density function f(x) for a
random variable X, obtain the cumulative distribution
function F(x) as $F(x) = \int_{-\infty}^{x} f(t)\, dt$
Step 2: Generate a random number r.
Step 3: Set F(x) = r and solve for x.

We consider the distribution given by the function
$f(x) = \frac{x}{2}$ for $0 \le x \le 2$, and $f(x) = 0$ otherwise

A function of this type is called a ramp function.
To obtain random variates from the distribution
using the inverse transformation method, we first
compute the cdf as
$F(x) = \int_0^x \frac{t}{2}\, dt = \frac{x^2}{4}$

In Step 2, we generate a random number r.
Finally, in Step 3, we set F(x) = r and solve for x:
$\frac{x^2}{4} = r$
$x = \pm 2\sqrt{r}$

Since the service times are defined only for positive
values of x, a service time of $x = -2\sqrt{r}$ is not
feasible. This leaves $x = 2\sqrt{r}$ as the solution
for x. This equation is called a random variate
generator or a process generator.
Thus, to obtain a service time, we first generate a
random number and then transform it using the
preceding equation.

As this example shows, the major
advantage of the inverse transformation
method is its simplicity and ease of
application.

Acceptance Rejection Method


There are several important distributions,
including the Erlang (used in queuing models) and
the beta (used in PERT), whose cumulative
distribution functions do not exist in closed form.
For these distributions, we must resort to other
methods of generating random variates, one of
which is the acceptance-rejection method
(ARM).
This method is generally used for distributions
whose domains are defined over finite intervals.

Given a distribution whose pdf, f(x), is defined
over the interval $a \le x \le b$, the algorithm consists
of the following steps:
Step 1: Select a constant M such that M is the largest
value of f(x) over the interval [a, b].
Step 2: Generate two random numbers, r1 and r2.
Step 3: Compute x* = a + (b - a)r1. (This ensures that
each member of [a, b] has an equal chance to be chosen
as x*.)
Step 4: Evaluate the function f(x) at the point x*. Let
this be f(x*).
Step 5: If $r_2 \le \frac{f(x^*)}{M}$,
deliver x* as a random variate from the distribution whose
pdf is f(x). Otherwise, reject x* and go back to Step 2.
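
The five steps can be sketched as a short loop. The density used here, f(x) = 6x(1 - x) on [0, 1] with M = 1.5, is a hypothetical example chosen because its maximum is easy to find; the function names are also illustrative.

```python
import random

def accept_reject(f, a, b, M, rng):
    """Return one variate from pdf f on [a, b], where M >= max f(x) on [a, b]."""
    while True:
        r1, r2 = rng.random(), rng.random()   # Step 2
        x_star = a + (b - a) * r1             # Step 3
        if r2 <= f(x_star) / M:               # Steps 4 and 5
            return x_star                     # accepted; otherwise loop to Step 2

def example_pdf(x):
    """Hypothetical pdf on [0, 1] with maximum 1.5 at x = 0.5."""
    return 6.0 * x * (1.0 - x)

rng = random.Random(42)
sample = [accept_reject(example_pdf, 0.0, 1.0, 1.5, rng) for _ in range(20_000)]
sample_mean = sum(sample) / len(sample)
print(sample_mean)   # should be close to 0.5
```

For this density about two of every three candidates are accepted, since the area under f is 1 and the area of the enclosing rectangle is 1.5.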

Note that the algorithm continues looping back to
Step 2 until a random variate is accepted.
This may take several iterations. For this reason, the
algorithm can be relatively inefficient.
The efficiency, however, is highly dependent on the
shape of the distribution.

There are several ways by which the method can
be made more efficient.
One of these is to use a function in Step 1 instead
of a constant.
We now give an intuitive justification of the
validity of the ARM.
In particular, we want to show that the ARM does
generate observations from the given random
variable X.

Direct and Convolution Methods for the Normal Distribution
Both the inverse transformation method and the
acceptance-rejection method are inappropriate for
the normal distribution, because (1) the cdf does
not exist in closed form and (2) the distribution
is not defined over a finite interval.
Other methods must be used: first an algorithm based on
convolution techniques, and then a direct
transformation algorithm that produces two
standard normal variates with mean 0 and
variance 1.

The Convolution Algorithm

In the convolution algorithm, we make direct use of
the Central Limit Theorem.
The Central Limit Theorem states that the sum Y of
n independent and identically distributed random
variables (say Y1, Y2, ..., Yn), each with mean $\mu$ and
finite variance $\sigma^2$, is approximately normally
distributed with mean $n\mu$ and variance $n\sigma^2$.
If we want to generate a normal variate X with
mean $\mu$ and variance $\sigma^2$, we first generate Z using
this process generator and then transform it using the
relation $X = \mu + \sigma Z$. This relation is unique to the normal distribution.
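
A common way to apply this idea is to sum n = 12 U(0,1) random numbers, which have total mean 6 and total variance 1. The choice n = 12 is an assumption made here for illustration; the slides do not state a value, and the function name is hypothetical.

```python
import random
from statistics import mean, variance

def normal_by_convolution(mu, sigma, rng, n=12):
    """Approximate N(mu, sigma^2) variate from the sum of n U(0,1) numbers."""
    total = sum(rng.random() for _ in range(n))
    # The sum has mean n/2 and variance n/12; standardize it to get Z.
    z = (total - n / 2.0) / (n / 12.0) ** 0.5
    return mu + sigma * z

rng = random.Random(7)
sample = [normal_by_convolution(10.0, 2.0, rng) for _ in range(20_000)]
print(mean(sample), variance(sample))   # should be close to 10 and 4
```

The result is only approximately normal: with n = 12 the generated Z can never fall outside the range -6 to 6.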

The Direct Method


The direct method for the normal distribution
was developed by Box and Muller (1958).
Although it is not as efficient as some of the newer
techniques, it is easy to apply and execute.
The algorithm generates two U(0,1) random
numbers, r1 and r2, and then transforms them into
two normal variates, each with mean 0 and
variance 1, using the direct transformation
$Z_1 = (-2 \ln r_1)^{1/2} \sin(2\pi r_2)$
$Z_2 = (-2 \ln r_1)^{1/2} \cos(2\pi r_2)$

It is easy to transform these standardized normal
variates into normal variates X1 and X2 from the
distribution with mean $\mu$ and variance $\sigma^2$, using
the equations
$X_1 = \mu + \sigma Z_1$
$X_2 = \mu + \sigma Z_2$
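
The direct transformation and the rescaling to mean and standard deviation can be sketched as follows. The function names are hypothetical, and r1 is drawn as 1 - random() so that it is never exactly zero.

```python
import math
import random
from statistics import mean, variance

def box_muller(r1, r2):
    """Transform two U(0,1) numbers (r1 > 0) into two N(0,1) variates."""
    radius = math.sqrt(-2.0 * math.log(r1))
    angle = 2.0 * math.pi * r2
    return radius * math.sin(angle), radius * math.cos(angle)

def normal_pair(mu, sigma, rng):
    """Return X1, X2 with mean mu and standard deviation sigma."""
    z1, z2 = box_muller(1.0 - rng.random(), rng.random())
    return mu + sigma * z1, mu + sigma * z2

rng = random.Random(11)
sample = [x for _ in range(10_000) for x in normal_pair(5.0, 3.0, rng)]
print(mean(sample), variance(sample))   # should be close to 5 and 9
```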

9.6 An Example of a Stochastic Simulation
Cabot Inc. is a large mail order firm in Chicago.
Orders arrive into the warehouse via telephones. At
present, Cabot maintains 10 operators on-line 24
hours a day.
The operators take the orders and feed them directly
into a central computer, using terminals.
Each operator has one terminal. At present, the
company has a total of 11 terminals.
That is, if all terminals are working, there will be 1
spare terminal.

Cabot managers believe that the terminal system
needs evaluation, because the downtime of operators
due to broken terminals has been excessive.
They feel that the problem can be solved by the
purchase of some additional terminals for the spares
pool.
It has been determined that a new terminal will cost
a total of $75 per week.
It has also been determined that the cost of terminal
downtime, in terms of delays, lost orders, and so on
is $1000 per week.

Given this information, the Cabot managers would like
to determine how many additional terminals they
should purchase.
This model is a version of the machine repair problem.
It is easy to find an analytical solution to the problem
using the birth-death processes.
However, in analyzing the historical data for the
terminals, it has been determined that although the
breakdown times can be represented by the exponential
distribution, the repair times can be adequately
represented only by the triangular distribution.

This implies that analytical methods cannot be used
and that we must use simulation.
To simulate this system, we first require the
parameters of both the distributions.
The data show that the breakdown rate is
exponential and equal to 1 per week per terminal.
In other words, the time between breakdowns for a terminal
is exponential with a mean equal to 1 week.
Analysis of the repair times shows that this
distribution can be represented by the triangular
distribution which has a mean of 0.075 week.

$f(x) = -10 + 400x$ for $0.025 \le x \le 0.075$
$f(x) = 50 - 400x$ for $0.075 \le x \le 0.125$

The repair staff on average can repair 13.33
terminals per week.
To find the optimal number of terminals, we must
balance the cost of the additional terminals against
the increased revenues generated as a result of the
increase in the number of terminals.
In this simulation we increase the number of
terminals in the system, n, from the present total of
11 in increments of 1.

For this fixed value of n, we then run our simulation
model to estimate the net revenue.
Net revenue here is defined as the difference
between the increase in revenues due to the
additional terminals and the cost of these additional
terminals.
We keep on adding terminals until the net revenue
position reaches a peak.
To calculate the net revenue, we first compute the
average number of on-line terminals, ELn, for a
fixed number of terminals in the system, n.

Once we have a value of ELn, we can
compute the expected weekly downtime
costs, given by 1000(10 - ELn).
Then the increase in revenue as a result of
increasing the number of terminals from 11
to n is 1000(ELn - EL11). Mathematically,
we compute ELn as
$EL_n = \frac{\int_0^T N(t)\, dt}{T} = \frac{\sum_{i=1}^{m} A_i}{T}$

where
T = length of simulation
N(t) = number of terminals on-line at time t ($0 \le t \le T$)
Ai = area of rectangle under N(t) between e(i-1) and ei
(where ei is the time of the ith event)
m = number of events that occur in the interval [0,T]
Between time 0 and time e1, the time of the first
event, the total on-line time for all the terminals is
given by 10e1, since each terminal is on-line for a
period of e1 time units.

If we now run this simulation over T time units and
sum up the areas A1, A2, A3, ..., we can get an
estimate for EL10 by dividing this sum by T. This
statistic is called a time-average statistic.
We would like to set up the process in such a way that
it will be possible to collect the statistics to
compute the areas A1, A2, A3, ....
That is, as we move from event to event, we would
like to keep track of at least the number of terminals
on-line between the events and the time between
events.
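
The bookkeeping for a time-average statistic can be sketched as follows. The function name and the short event list are hypothetical; each event records the time at which it occurs and the number of terminals on-line immediately afterwards.

```python
def time_average(initial_level, events, T):
    """Time average of N(t) over [0, T].

    events: list of (event_time, level_after_event), sorted by time, all <= T.
    """
    area, last_time, level = 0.0, 0.0, initial_level
    for time, new_level in events:
        area += level * (time - last_time)   # rectangle A_i between two events
        last_time, level = time, new_level
    area += level * (T - last_time)          # last rectangle, up to time T
    return area / T

# 10 terminals on-line until t = 2, then 9 until t = 3, then 10 until T = 4.
print(time_average(10, [(2.0, 9), (3.0, 10)], T=4.0))   # 9.75
```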

We first define the state of the system as the
number of terminals in the repair facility.
The only time the state of the system will change is
when there is either a breakdown or a completion
of a repair.
Therefore, there are two events in this simulation:
breakdown and completion of repairs.
To set up the simulation, our first task is to
determine the process generators for both the
breakdown and the repair times.

We use the ITM to develop the process generators.
For the exponential distribution the process
generator is simply $x = -\ln r$
In the case of the repair times, applying the ITM gives
us
$x = 0.025 + \sqrt{0.005\, r}$ $(0 \le r \le 0.5)$

and
$x = 0.125 - \sqrt{0.005\, (1 - r)}$ $(0.5 \le r \le 1.0)$
as the process generators.

For each n, we start the simulation in the state
where there are no terminals in the repair facility.
In this state, all 10 operators are on-line and any
remaining terminals are in the spares pool.
Our first action in the simulation is to schedule the
first series of events, the breakdown times for the
terminals presently on-line.
Having scheduled these events, we next determine
the first event, the first breakdown, by searching
through the current event list.

We then move the simulation clock to the time
of this event and process this breakdown.
To process a breakdown, we take two separate
series of actions:
1. Determine whether a spare is available.
2. Determine whether the repair staff is idle.

These actions are summarized in the system
flow diagram shown in the book in Figure 17.
Otherwise, we process a completion of a repair.

To process the completion of a repair, we also
undertake two series of actions.
1. At the completion of a repair, we have an additional
working terminal, so we determine whether the terminal
goes directly to an operator or to the spares pool.
2. We check the repair queue to see whether any terminals
are waiting to be repaired.

We proceed with the simulation by moving from
event to event until the termination time T.
At this time, we calculate all the relevant measures
of performance from the statistical counters.

Our key measure is the net revenue for the current
value of n.
If this revenue is greater than the revenue for a
system with n-1 terminals, we increase the value of
n by 1 and repeat the simulation with n +1
terminals in the system.
Otherwise, the net revenue has reached a peak.
The simulation outlined in this example can be
used to analyze other policy options that
management may have.

The simulation model provides a very
flexible mechanism for evaluating
alternative policies.
