This paper surveys efficient techniques for estimating, via
simulation, the probabilities of certain rare events in queueing
and reliability models. The rare events of interest are long waiting
times or buffer overflows in queueing systems, and system failure
events in reliability models of highly dependable computing systems.
The general approach to speeding up euch simulations is to accelerate
the occurrence of the rare events by using importance sampling. In
importance sampling, the system is simulated using a new set of
input probability distributions, and unbiased estimates are recovered
by multiplying the simulation output by a likelihood ratio. Our
focus is on describing asymptotically optimal importance sampling
techniques. Using asymptoti-cally optimal importance sampling, the
number of samples required to get accurate estimates grows slowly
compared to the rate at which the probability of the rare event
approaches zero. In practice, this means that run lengths can be
reduced by many orders of magnitude, compared to standard simulation.
In certain cases, asymptotically optimal importance sampling results
in estimates having bounded relative error. With bounded relative
error, only a fixed number of samples are required to get accurate
estimates, no matter how rare the event of interest is. The queueing
systems studied include simple queues (e.g., GI/GI/ 1), Jackson
networks, discrete time queues with multiple autocorrelated arrival
processes that arise in the analysis of Asyn-chronous Transfer Mode
communications switches, and tree structured networks of such
switches, Both Markovian and non-Markovian reliability models are
treated.
|