Chapter Three: Generation of Random Numbers and Random Variables
Chapter Objectives
Explain the importance of random number generation in stochastic simulation.
Describe the characteristics and requirements of a good pseudo-random number generator (PRNG).
Understand the basic algorithms for generating uniform random numbers (linear congruential generator).
Apply transformation techniques to generate random variables with specific probability distributions from uniform random numbers.
Analyze and validate the quality of sequences of random numbers and random variables generated.
Use software tools and libraries (such as Excel, Python, AnyLogic) to generate and test random numbers and random variables.
Random number generation is fundamental to process simulation, as it allows for modeling the uncertainty and variability inherent in complex systems. High-quality random numbers ensure realistic and accurate simulations, which is crucial for analysis, prediction, and decision-making in fields such as engineering, economics, and computer science. For this reason, the use of efficient pseudorandom number generators and the validation of their suitability through statistical testing are essential pillars of any simulation environment.
Introduction
In stochastic simulation, random numbers and random variables are fundamental building blocks. Through them, we can model uncertainty, variability, and chance events that occur in real systems—such as arrival times, service times, demand, machine failures, and more.
However, computers are deterministic machines; they cannot generate truly random numbers, but only pseudo-random numbers—sequences that "appear" random but are produced by mathematical algorithms.
A solid understanding of random number and random variable generation is essential for building valid and reliable simulation models.
1. Pseudo-Random Number Generators (PRNGs)
What Are Pseudo-Random Numbers?
A pseudo-random number is a value produced by a deterministic algorithm that simulates randomness.
In simulation, we usually require random numbers uniformly distributed between 0 and 1 (Uniform[0,1]).
These numbers serve as a basis for generating random variables with any distribution.
Requirements for a Good PRNG
A good random number generator should:
Produce numbers uniformly distributed in the interval [0,1].
Have a long period before repeating the sequence.
Be computationally efficient (fast).
Be reproducible (the same seed produces the same sequence).
Pass statistical tests for independence and randomness.
Linear Congruential Generator (LCG)
One of the most common and oldest algorithms for generating pseudo-random numbers.
The LCG is defined by the recurrence:

X(i+1) = (a·Xi + c) mod m

where:
X0 is the seed (initial value),
a is the multiplier,
c is the increment,
m is the modulus.
The normalized random number is:

Ui = Xi / m
Example
Let a = 5, c = 1, m = 16, X0 = 7.
Compute the first four random numbers:
X1 = (5·7 + 1) mod 16 = 36 mod 16 = 4, U1 = 4/16 = 0.25
X2 = (5·4 + 1) mod 16 = 21 mod 16 = 5, U2 = 5/16 = 0.3125
X3 = (5·5 + 1) mod 16 = 26 mod 16 = 10, U3 = 10/16 = 0.625
X4 = (5·10 + 1) mod 16 = 51 mod 16 = 3, U4 = 3/16 = 0.1875
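The recurrence above is a few lines of code. A minimal sketch in Python (function and parameter names are illustrative) that reproduces the worked example:

```python
def lcg(a, c, m, seed):
    """Linear congruential generator: X_{i+1} = (a*X_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x, x / m  # integer X_i and normalized U_i = X_i / m

# Reproduce the worked example: a=5, c=1, m=16, X0=7
gen = lcg(5, 1, 16, 7)
samples = [next(gen) for _ in range(4)]
print(samples)  # [(4, 0.25), (5, 0.3125), (10, 0.625), (3, 0.1875)]
```

With m = 16 the period is at most 16, which is why such tiny parameters are only used for illustration.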
Parameters in Practice
The choice of parameters a, c, and m is critical for the quality of the generator.
In practice, well-tested generators are used (e.g., Mersenne Twister in Python).
2. Statistical Tests for Random Numbers
To ensure the quality of the numbers generated, several statistical tests are used:
Uniformity Test: Checks whether the numbers are uniformly distributed in [0,1].
Independence Test: Checks for correlations or patterns between numbers.
Runs Test: Detects long sequences of increasing or decreasing numbers.
Chi-Square Test: Compares observed frequencies with expected frequencies.
Kolmogorov-Smirnov Test: Compares the empirical distribution with the uniform distribution.
In simulation, it is recommended to use established libraries (e.g., NumPy, random in Python) rather than building your own generators.
3. Generation of Random Variables
The goal is to generate random variables with a specific distribution (e.g., Exponential, Normal, Poisson) from uniform random numbers.
General Steps
Generate a uniform random number U in [0,1].
Transform U using a method appropriate for the desired distribution.
Methods for Generating Random Variables
a) Inverse Transform Method
If $F(x)$ is the cumulative distribution function (CDF) of the desired distribution and $U$ is Uniform[0,1], then:

$X = F^{-1}(U)$

will have the desired distribution.
Example: For the Exponential distribution with mean $1/\lambda$:

$F(x) = 1 - e^{-\lambda x} \implies x = -\frac{1}{\lambda}\ln(1-U)$
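The exponential inverse transform can be sketched directly in Python (names are illustrative):

```python
import math
import random

def exponential_inverse(lam, u=None):
    """Exponential(rate=lam) draw via the inverse transform x = -ln(1-U)/lam."""
    if u is None:
        u = random.random()
    return -math.log(1.0 - u) / lam

# Deterministic check: F(1/lam) = 1 - e^{-1}, so U = 1 - e^{-1} maps back to x = 1/lam
print(exponential_inverse(2.0, u=1 - math.exp(-1)))  # ≈ 0.5
```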
b) Acceptance-Rejection Method
Used when the inverse CDF is not easily obtainable.
Involves generating candidate values and accepting or rejecting them according to a rule.
c) Other Methods
Box-Muller Method for generating Normal random variables.
Composition and convolution when dealing with sums of random variables.
4. Generation in Excel, Python, and AnyLogic
Excel
RAND(): Generates Uniform[0,1].
RANDBETWEEN(a,b): Generates random integers between a and b.
Use formulas to transform to other distributions.
Python
random.random(): Uniform[0,1].
numpy.random: Library with generators for many distributions (normal, exponential, Poisson, etc.).
scipy.stats: Advanced random variable generation and statistical tests.
AnyLogic
Built-in functions:
uniform(), exponential(), normal(), poisson(), etc. Parameters can be specified and the functions used directly in models.
5. Validation and Analysis of Generated Variables
Always check that the generated data matches the expected theoretical distribution.
Use histograms, summary statistics, and formal statistical tests.
In simulation, the accuracy of random variable generation directly affects the quality of results.
Step By Step Guide
Pseudorandom number generation: Lehmer's method.
Stochastic simulation, which is at the core of modeling most industrial engineering systems, is fundamentally based on the ability to generate sequences of numbers that behave as if they were realizations of uniformly distributed random variables in the interval (0,1). These numbers, commonly denoted as
Ui∼U(0,1)
are the basic ingredient for later generating random variables from more complex distributions (Exponential, Normal, Poisson, etc.) that represent the various uncertain phenomena in a system.
Pseudo-Random vs. Truly Random Numbers
Computers are deterministic machines. Therefore, they cannot generate truly random numbers in the strict sense.
Instead, they use mathematical algorithms to produce sequences of numbers that appear random and that pass various statistical tests of randomness.
These are called pseudo-random numbers.
Desirable properties of a sequence of pseudo-random numbers:
The numbers should be uniformly distributed over the interval (0,1)
Each generated number should be independent of the previous numbers in the sequence.
The sequence will eventually repeat (it is periodic). The period (length of the sequence before it begins to repeat) should be as long as possible, ideally much longer than the number of random numbers needed for any simulation.
Given the same initial seed, the generator must produce exactly the same sequence of numbers. This is crucial for model debugging and for comparing different system configurations under the same random conditions.
The algorithm must be fast at generating numbers, as a simulation may require millions of them.
The generator must produce the same (or statistically equivalent) results on different computing platforms.
Linear Congruential Generators (LCGs)
Linear Congruential Generators are a class of algorithms for generating pseudorandom numbers based on a linear recurrence relation. The general formula for an LCG is:

X(i+1) = (a·Xi + c) mod m
where:
Xi is the i-th integer in the sequence.
X0 is the seed or initial value.
a is the multiplier.
c is the increment.
m is the modulus.
The parameters a, c, m, and X0 are non-negative integers. The term mod m means the remainder of dividing (a·Xi + c) by m. The pseudorandom numbers Ui in (0,1) are obtained as Ui = Xi / m.
Example
Consider the following values:
( a = 5 )
( c = 3 )
( m = 16 )
Seed X0 = 7
The sequence can be generated as follows:
X1 = (5×7 + 3) mod 16 = 38 mod 16 = 6
X2 = (5×6 + 3) mod 16 = 33 mod 16 = 1
X3 = (5×1 + 3) mod 16 = 8 mod 16 = 8
We repeat this process to generate more numbers. This method is simple and efficient, but the choice of a, c, and m is crucial to obtain good randomness and a long period.
Lehmer's Method (Multiplicative Congruential Generator)
Lehmer's Method, also known as the Multiplicative Congruential Generator, is an algorithm for generating pseudorandom numbers through a recursive relation of the form:

X(i+1) = (a·Xi) mod m
where:
a is the multiplier,
m is the modulus,
X0 is the initial seed of the generator.
This method is a variant of the linear congruential generator that omits the constant increment term c. The values of a and m are carefully chosen to maximize the period length and the quality of the randomness.
Example
Suppose a = 7, m = 31, and the initial seed X0 = 3.
We calculate the first numbers in the sequence:
X1 = (7×3) mod 31 = 21
X2 = (7×21) mod 31 = 147 mod 31 = 23
X3 = (7×23) mod 31 = 161 mod 31 = 6
We can continue this process to generate more numbers in the sequence.
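As a sketch, the Lehmer recurrence takes only a few lines of Python (names are illustrative), and it reproduces the example above:

```python
def lehmer(a, m, seed):
    """Multiplicative congruential (Lehmer) generator: X_{i+1} = (a * X_i) mod m."""
    x = seed
    while True:
        x = (a * x) % m
        yield x

# Reproduce the example: a=7, m=31, X0=3
gen = lehmer(7, 31, 3)
seq = [next(gen) for _ in range(3)]
print(seq)  # [21, 23, 6]
```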
Parameter Choice:
The quality of the Lehmer generator depends critically on the choice of m, a, and X0.
Modulus m: Must be a large prime number. A common choice, especially on 32-bit machines, is m = 2^31 − 1, which is a Mersenne prime.
Multiplier a: Must be carefully chosen to ensure a full (or nearly full) period and good statistical properties. A full period for a multiplicative generator with prime modulus m is m − 1, which means that the sequence X1, X2, …, X(m−1) contains all integers from 1 to m − 1 exactly once, in some order.
Extensive research has been done to find "good" multipliers. For example, for m = 2^31 − 1, a commonly cited multiplier is a = 7^5 = 16807, although others such as a = 48271 have also been proposed for their good properties and implementation efficiency.
Seed X0: Must be an integer between 1 and m−1. Different seeds will produce different (albeit related) sequences.
Implementation: One challenge in the implementation is to calculate aXi without overflowing if the product exceeds the computer's maximum integer representation capacity, before taking the modulo m.
Generating U(0,1) numbers is the first critical step. The quality of these numbers directly impacts the validity of the entire simulation study.
If the base generator is flawed (e.g., it's not uniform, or the numbers are not independent), then the random variables generated from it will also be flawed, and the simulation results will be unreliable, no matter how sophisticated the rest of the model is.
Randomness and Independence Tests
Since pseudorandom number generators are deterministic algorithms, it is crucial to subject their output sequences to rigorous statistical tests to assess whether they behave sufficiently like truly random, independent sequences drawn from a U(0,1) distribution.
No generator is perfect, and no test can "prove" that a generator is good; rather, tests can reveal whether a generator is "bad" by failing to meet certain statistical criteria.
The two fundamental properties that we want to evaluate are uniformity and independence.
Uniformity Tests
Chi-Square Test: This is a statistical tool used to assess uniformity in a sequence of pseudorandom numbers. It compares the observed frequencies of numbers in different intervals with the expected frequencies, which allows us to determine whether the numbers are uniformly distributed.
The interval (0,1) is divided into k subintervals of equal length (1/k).
A sample of N random numbers Ui is generated.
The number of observations (Oj) that fall within each subinterval j is counted.
Under the null hypothesis of uniformity, the expected frequency (Ej) for each subinterval is N/k.
The test statistic is calculated: χ0² = Σ(j=1 to k) (Oj − Ej)² / Ej.
This statistic is compared to a critical value from the Chi-square distribution with k−1 degrees of freedom and a significance level α (e.g., 0.05). If χ02 is greater than the critical value (or if the p-value is less than α), the uniformity hypothesis is rejected.
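The test above is a short loop in code. A minimal sketch (function name is illustrative), checked here on a deliberately even grid of numbers where every bin receives exactly n/k observations:

```python
def chi_square_uniformity(numbers, k=10):
    """Chi-square uniformity statistic: sum over bins of (O_j - E_j)^2 / E_j."""
    n = len(numbers)
    expected = n / k
    observed = [0] * k
    for u in numbers:
        observed[min(int(u * k), k - 1)] += 1  # clamp u = 1.0 into the last bin
    return sum((o - expected) ** 2 / expected for o in observed)

# A perfectly even grid puts n/k observations in every bin, so the statistic is 0
grid = [(i + 0.5) / 100 for i in range(100)]
print(chi_square_uniformity(grid, k=10))  # 0.0
```

For a real generator, the returned value would be compared with the χ² critical value for k − 1 degrees of freedom (16.92 for k = 10 and α = 0.05).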
Kolmogorov-Smirnov Test:
Sort the N generated numbers U(1) ≤ U(2) ≤ … ≤ U(N).
Calculate the empirical cumulative distribution function (CDF) Fe(x)=(number of Ui≤x)/N.
For a U(0,1) distribution, the theoretical CDF is Ft(x)=x for 0≤x≤1.
The K-S test statistic is the maximum absolute vertical difference between Fe(x) and Ft(x): DN = max(0≤x≤1) |Fe(x) − Ft(x)|. In practice, it is calculated over the ordered sample as DN = max over i of max( i/N − U(i), U(i) − (i−1)/N ).
DN is compared to a critical value of the Kolmogorov-Smirnov distribution for sample size N and significance level α. If DN is greater than the critical value, the uniformity hypothesis is rejected.
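The practical calculation of DN can be sketched in a few lines (the sample values below are hypothetical, chosen only for illustration):

```python
def ks_uniform(numbers):
    """K-S statistic D_N for a sample against the U(0,1) CDF F(x) = x."""
    u = sorted(numbers)
    n = len(u)
    d_plus = max((i + 1) / n - u[i] for i in range(n))
    d_minus = max(u[i] - i / n for i in range(n))
    return max(d_plus, d_minus)

# Hypothetical sample of five numbers
print(ks_uniform([0.05, 0.29, 0.42, 0.71, 0.93]))  # ≈ 0.18
```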
Worked Example: Chi-Square Test for Normality
Imagine we have collected n = 100 observations of a continuous variable and we want to verify whether these data can be modeled by a normal distribution.
1. Estimate Parameters and Define Intervals
First, we need the mean and standard deviation of our sample. Suppose for our data:
Sample mean (xˉ)=50
Sample standard deviation (s)=10
Next, we divide the data range into k intervals. It is crucial that the expected frequency for each interval (Ei) be sufficiently large, typically Ei≥5. For this example, we will select k=5 intervals.
The interval boundaries are defined and then standardized (converted to Z scores) using x̄ and s:
| Interval (X) | Interval (Z) |
| --- | --- |
| X < 40 | Z < (40−50)/10 = −1.0 |
| 40 ≤ X < 45 | −1.0 ≤ Z < (45−50)/10 = −0.5 |
| 45 ≤ X < 55 | −0.5 ≤ Z < (55−50)/10 = 0.5 |
| 55 ≤ X < 60 | 0.5 ≤ Z < (60−50)/10 = 1.0 |
| X ≥ 60 | Z ≥ 1.0 |
2. Calculate Observed Frequencies (O_i)
We count the number of data in our sample that fall within each of the defined intervals:
| Interval (X) | Observed frequency (O_i) |
| --- | --- |
| X < 40 | 18 |
| 40 ≤ X < 45 | 15 |
| 45 ≤ X < 55 | 36 |
| 55 ≤ X < 60 | 13 |
| X ≥ 60 | 18 |
| Total | 100 |
3. Calculate Expected Frequencies (E_i)
For each interval, we determine the probability that a value from a standard normal distribution falls within its Z limits. We then multiply this probability by the total sample size (n=100) to obtain E_i.
| Interval (X) | Interval (Z) | P(interval) | E_i = 100 · P |
| --- | --- | --- | --- |
| X < 40 | Z < −1.0 | P(Z < −1.0) ≈ 0.1587 | 15.87 |
| 40 ≤ X < 45 | −1.0 ≤ Z < −0.5 | P(−1.0 ≤ Z < −0.5) ≈ 0.1498 | 14.98 |
| 45 ≤ X < 55 | −0.5 ≤ Z < 0.5 | P(−0.5 ≤ Z < 0.5) ≈ 0.3829 | 38.29 |
| 55 ≤ X < 60 | 0.5 ≤ Z < 1.0 | P(0.5 ≤ Z < 1.0) ≈ 0.1498 | 14.98 |
| X ≥ 60 | Z ≥ 1.0 | P(Z ≥ 1.0) ≈ 0.1587 | 15.87 |
| Total | | ≈ 1.0 | ≈ 100 |
The probabilities P(interval) are obtained from a standard normal distribution table (Z-table) or using statistical software.
4. Calculate the χ2 Statistic
We use the observed (O_i) and expected (E_i) frequencies to calculate the statistic χ² = Σ (O_i − E_i)² / E_i:
| Interval (X) | O_i | E_i | O_i − E_i | (O_i − E_i)² | (O_i − E_i)² / E_i |
| --- | --- | --- | --- | --- | --- |
| X < 40 | 18 | 15.87 | 2.13 | 4.5369 | ≈ 0.2859 |
| 40 ≤ X < 45 | 15 | 14.98 | 0.02 | 0.0004 | ≈ 0.00002 |
| 45 ≤ X < 55 | 36 | 38.29 | −2.29 | 5.2441 | ≈ 0.1369 |
| 55 ≤ X < 60 | 13 | 14.98 | −1.98 | 3.9204 | ≈ 0.2617 |
| X ≥ 60 | 18 | 15.87 | 2.13 | 4.5369 | ≈ 0.2859 |
| Total | | | | | χ² ≈ 0.9704 |
The calculated value of the statistic is χcalculated2≈0.9704.
5. Determine Degrees of Freedom (df)
The degrees of freedom for this test are calculated as:

df = k − 1 − m

Where:
k = number of intervals (5 in our example).
m = number of distribution parameters estimated from the sample. For a normal distribution, we estimate the mean and standard deviation, so m = 2.
So, df=5−1−2=2
6. Making a Statistical Decision
We choose a significance level, typically α=0.05.
We look for the critical value of χ2 in a Chi-square distribution table (or using software) for df=2 and α=0.05. The value χcritical2(2,0.05) is approximately 5.991.
Decision Rule:
If χcalculated2>χcritical2, we reject H_0.
If χcalculated2≤χcritical2, we fail to reject H_0.
In our example: 0.9704≤5.991.
7. Conclusion
Since the calculated value of χ2(0.9704) is less than the critical value (5.991) for a significance level of 0.05, we fail to reject the null hypothesis (H_0).
This means that we do not have sufficient statistical evidence to conclude that the sample data do not follow a normal distribution.
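The arithmetic of this example can be verified with a few lines of Python, taking the frequencies from the tables above:

```python
observed = [18, 15, 36, 13, 18]
expected = [15.87, 14.98, 38.29, 14.98, 15.87]

# Chi-square goodness-of-fit statistic: sum of (O_i - E_i)^2 / E_i
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = 5 - 1 - 2  # df = k - 1 - m, with m = 2 estimated parameters

print(round(chi2, 2), df)  # 0.97 2 -- below the critical value 5.991, so H0 is not rejected
```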
Kolmogorov-Smirnov (KS) Normality Test
Hypothesis
H0: The data follow a normal distribution.
H1: The data do not follow a normal distribution.
Test Statistic (D)
The Kolmogorov-Smirnov test statistic, D, is the maximum absolute difference between the sample CDF Sn(x) and the theoretical CDF of the normal distribution F(x):

D = sup over x of |Sn(x) − F(x)|

For practical calculations with an ordered sample x(1), x(2), …, x(n), first D+ and D− are calculated:

D+ = max(1≤i≤n) ( i/n − F(x(i)) )
D− = max(1≤i≤n) ( F(x(i)) − (i−1)/n )

Then, the statistic D is the maximum of these two values:

D = max(D+, D−)
Where:
n is the sample size.
x(i) is the i-th smallest value in the sample (ordered data).
Sn(x(i))=i/n is the value of the ECDF at x(i)
F(x(i)) is the value of the theoretical CDF of the normal distribution evaluated at x(i), using the specified mean (μ) and standard deviation (σ) (or estimated from the sample for the Lilliefors test).
Step-by-Step Example
Suppose we have a small sample of n = 5 observations: X=10,12,15,16,20. We want to test whether they come from a normal distribution.
Steps:
Sort the Data: The data are already sorted: x(1)=10,x(2)=12,x(3)=15,x(4)=16,x(5)=20.
Estimate Normal Distribution Parameters (for Lilliefors test): We calculate the sample mean (xˉ) and the sample standard deviation (s).
xˉ=(10+12+15+16+20)/5=73/5=14.6
s = √( Σ(xi − x̄)² / (n−1) ) = √( [(10−14.6)² + (12−14.6)² + (15−14.6)² + (16−14.6)² + (20−14.6)²] / 4 )
s = √( (21.16 + 6.76 + 0.16 + 1.96 + 29.16) / 4 ) = √(59.2/4) = √14.8 ≈ 3.847
Calculate F(x(i)), Sn(x(i)), and the differences: For each x(i), we calculate F(x(i)) using the Normal CDF with μ = 14.6 and σ = 3.847. This involves standardizing each x(i) to z(i) = (x(i) − x̄)/s and then finding P(Z ≤ z(i)).
| i | x(i) | z(i) = (x(i) − 14.6)/3.847 | F(x(i)) = P(Z ≤ z(i)) | S_n(x(i)) = i/n | (i−1)/n | D_i⁺ = i/n − F(x(i)) | D_i⁻ = F(x(i)) − (i−1)/n |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 10 | (10−14.6)/3.847 ≈ −1.196 | ≈ 0.1159 | 1/5 = 0.2 | 0/5 = 0.0 | 0.2 − 0.1159 = 0.0841 | 0.1159 − 0.0 = 0.1159 |
| 2 | 12 | (12−14.6)/3.847 ≈ −0.676 | ≈ 0.2495 | 2/5 = 0.4 | 1/5 = 0.2 | 0.4 − 0.2495 = 0.1505 | 0.2495 − 0.2 = 0.0495 |
| 3 | 15 | (15−14.6)/3.847 ≈ 0.104 | ≈ 0.5414 | 3/5 = 0.6 | 2/5 = 0.4 | 0.6 − 0.5414 = 0.0586 | 0.5414 − 0.4 = 0.1414 |
| 4 | 16 | (16−14.6)/3.847 ≈ 0.364 | ≈ 0.6420 | 4/5 = 0.8 | 3/5 = 0.6 | 0.8 − 0.6420 = 0.1580 | 0.6420 − 0.6 = 0.0420 |
| 5 | 20 | (20−14.6)/3.847 ≈ 1.404 | ≈ 0.9198 | 5/5 = 1.0 | 4/5 = 0.8 | 1.0 − 0.9198 = 0.0802 | 0.9198 − 0.8 = 0.1198 |
Compute the Statistic D:
D+ = max(0.0841, 0.1505, 0.0586, 0.1580, 0.0802) = 0.1580
D− = max(0.1159, 0.0495, 0.1414, 0.0420, 0.1198) = 0.1414
D = max(D+, D−) = max(0.1580, 0.1414) = 0.1580
Make a Statistical Decision:
A significance level α is chosen (e.g., α = 0.05).
The calculated value D = 0.1580 is compared with a critical value D_critical. For the Lilliefors test, since μ and σ were estimated from the sample, the critical values differ from those of the standard K-S test and depend on n.
For example, for n = 5 and α = 0.05, the Lilliefors critical value is approximately 0.337.
Decision rule: If D_calculated > D_critical, H0 is rejected. Otherwise, H0 is not rejected.
In our example: 0.1580 < 0.337.
Conclusion: Since D_calculated (0.1580) is less than the Lilliefors critical value (0.337) for α = 0.05, we fail to reject the null hypothesis (H0). We conclude that there is insufficient evidence to claim that the data do not follow a normal distribution.
Additional Considerations
The Kolmogorov-Smirnov test is generally more powerful than the χ² test for assessing the normality of continuous data, especially with small samples.
When the parameters of the normal distribution (mean and/or standard deviation) are estimated from the sample, it is crucial to use the Lilliefors critical values, not those of the standard K-S test, since the latter would be too conservative and would lead to failing to reject H0 too often.
The test is sensitive to deviations in the center, the tails, and the shape of the distribution.
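The worked Lilliefors example can be reproduced end to end with the standard library, using math.erf for the Normal CDF (function names are illustrative):

```python
import math

def normal_cdf(x, mu, sigma):
    """Normal CDF via the error function (no external libraries needed)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

data = sorted([10, 12, 15, 16, 20])
n = len(data)
mean = sum(data) / n                                         # 14.6
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # ≈ 3.847

d_plus = max((i + 1) / n - normal_cdf(x, mean, s) for i, x in enumerate(data))
d_minus = max(normal_cdf(x, mean, s) - i / n for i, x in enumerate(data))
d = max(d_plus, d_minus)
print(round(d, 3))  # 0.158, below the Lilliefors critical value 0.337
```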
Independence Tests
These tests verify whether the generated numbers are independent of each other (that is, whether the value of one number does not influence the value of the following numbers).
Runs Test: A "run" is a subsequence of observations with a common property.
For example, an upward run is a sequence of numbers where each is greater than the previous one. The total number of runs in the sequence is analyzed. If there are too many or too few runs compared to what would be expected from a truly independent sequence, the hypothesis of independence is rejected.
Autocorrelation Test: This test measures the linear correlation between numbers separated by a certain lag k in the sequence. The lag-k autocorrelation, ρk, estimates the correlation between Ui and Ui+k. For an independent sequence, all autocorrelations (for k ≥ 1) are expected to be close to zero. A test statistic (based on the Normal approximation for ρk when N is large) is constructed to check whether ρk is significantly different from zero.
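As a sketch, the sample lag-k autocorrelation can be estimated and checked against zero (function name is illustrative; Python's random module uses the Mersenne Twister, which should show near-zero autocorrelation):

```python
import random

def autocorrelation(numbers, lag):
    """Sample lag-k autocorrelation estimate of a sequence."""
    n = len(numbers)
    mean = sum(numbers) / n
    var = sum((u - mean) ** 2 for u in numbers)
    cov = sum((numbers[i] - mean) * (numbers[i + lag] - mean) for i in range(n - lag))
    return cov / var

random.seed(0)
u = [random.random() for _ in range(10000)]
rho1 = autocorrelation(u, 1)
print(round(rho1, 3))  # should be close to 0 for an independent sequence
```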
It is important to note that a generator must pass a diverse and rigorous set of these statistical tests before being considered acceptable for use in serious simulation studies. The fact that a sequence passes a test does not guarantee that it is "perfect," only that no evidence against the assumption of uniformity or independence has been found under that particular test.
Random Variable Generation Methods.
Once a reliable source of pseudorandom numbers Ui∼U(0,1) is available, the next step is to transform these uniform numbers into realizations of random variables that follow the specific probability distributions (Exponential, Normal, Poisson, etc.) that have been identified as models for the uncertain components of the system under study.
There are several methods for performing this transformation.
Inverse Transform Method (ITM)
This is the most fundamental method and, when applicable, often the most efficient and straightforward.
Principle It is based on the fact that if X is a random variable with a continuous and strictly increasing cumulative distribution function (CDF) F(x), and U is a random variable U(0,1), then the random variable Y=F−1(U) has the same distribution as X. Here, F−1(u) is the inverse function of the CDF, defined as the value y such that F(y)=u.
Algorithm for Continuous VAs:
Generate a number u∼U(0,1).
Compute x = F⁻¹(u). This x is the desired observation from the distribution of X.
Application to Discrete Random Variables:
If X is discrete with pmf p(xj) and CDF F(xj) = P(X ≤ xj), then u ∼ U(0,1) is generated and the value xj is found such that F(x(j−1)) < u ≤ F(xj) (where F(x0) = 0). Then xj is the generated observation. This can be implemented using a sequential or more efficient search.
Direct Application Examples, where F−1(u) has closed form:
Uniform Continuous U(a,b): F(x)=(x−a)/(b−a). Inverting, x=a+(b−a)u.
Exponential (mean μ=1/λ): F(x)=1−e−λx. Inverting, x=−(1/λ)ln(1−u). Since if u∼U(0,1), then 1−u∼U(0,1), one can use x=−(1/λ)ln(u) equivalently.
Triangular: The CDF is piecewise linear, and its inverse can also be found analytically piecewise.
Weibull: F(x) = 1 − e^(−(x/α)^β). Inverting, x = α(−ln(1−u))^(1/β).
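The Weibull inversion above is a one-liner in code; a sketch (names are illustrative), checked against the known quantile F(α) = 1 − e⁻¹:

```python
import math

def weibull_inverse(alpha, beta, u):
    """Weibull draw via the inverse transform x = alpha * (-ln(1-u))^(1/beta)."""
    return alpha * (-math.log(1.0 - u)) ** (1.0 / beta)

# Since F(alpha) = 1 - e^{-1}, the input u = 1 - e^{-1} must map back to x = alpha
print(weibull_inverse(2.0, 1.5, 1 - math.exp(-1)))  # ≈ 2.0
```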
Advantages:
It is exact if F−1(u) can be calculated precisely. It is intuitive. It preserves monotonicity (if u1<u2, then F−1(u1)≤F−1(u2)), which is useful for some variance reduction techniques.
Disadvantages:
The function F−1(u) does not always have a simple analytic or closed-form expression
(e.g., for the Normal, Gamma, Beta, Binomial, and Poisson distributions with large parameters). In these cases, numerical approximations or alternative methods are required.
Acceptance-Rejection (A-R) Method.
This is a general method that can be used when the ITM is difficult or inefficient to apply.
Suppose we want to generate a random variable X with PDF f(x). We choose another "majorizing" PDF g(x) (from which it is easy to generate variables, for example, using the ITM) and a constant c ≥ 1 such that f(x) ≤ c·g(x) for all x where f(x) > 0. The idea is to generate a candidate Y from g(y), and then "accept" it as a realization of X with probability f(Y)/(c·g(Y)).
General Algorithm.
Generate a candidate Y from the distribution with PDF g(y).
Generate a number U∼U(0,1) (independent of Y).
If U ≤ f(Y) / (c·g(Y)), then accept X = Y.
Otherwise, reject Y and return to step 1.
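The algorithm above can be sketched for a concrete target. Here, as an illustrative assumption, the target is the Beta(2,2) density f(x) = 6x(1−x) on [0,1], with a uniform envelope g(x) = 1 and c = 1.5:

```python
import random

def beta22_rejection():
    """Acceptance-rejection for f(x) = 6x(1-x) on [0,1] (a Beta(2,2) density).

    Envelope: g(x) = 1 on [0,1] with c = 1.5, since max f = f(0.5) = 1.5.
    """
    c = 1.5
    while True:
        y = random.random()           # step 1: candidate Y from g
        u = random.random()           # step 2: independent U(0,1)
        if u <= 6 * y * (1 - y) / c:  # step 3: accept with probability f(Y)/(c*g(Y))
            return y

random.seed(1)
sample = [beta22_rejection() for _ in range(20000)]
print(round(sum(sample) / len(sample), 2))  # Beta(2,2) has mean 0.5
```

Since 1/c = 2/3, roughly two of every three candidates are accepted.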
Efficiency.
The probability of acceptance at each iteration is 1/c. Therefore, we want c to be as close to 1 as possible, which means the "envelope" function c⋅g(x) should be as close to f(x) as possible.
The expected number of iterations to generate an observation is c.
Uses
Useful for distributions where the ITM is not straightforward, such as the Gamma and Beta distributions, and for the Normal (for example, the Ziggurat algorithm, which uses simple majorizing pieces for the body of the density and exponential envelopes for the tails).
Box-Muller Method for the Normal Distribution
This is a specific and elegant method for generating pairs of independent standard normal random variables Z1, Z2 ∼ N(0,1) from two independent random numbers U1, U2 ∼ U(0,1):

Z1 = √(−2 ln U1) · cos(2π U2)
Z2 = √(−2 ln U1) · sin(2π U2)

To generate a variable X ∼ N(μ, σ²), first generate Z1 or Z2 and then transform it: X = μ + σZ1.
Polar Variant (Marsaglia-Bray): There is a modification that avoids the direct use of trigonometric functions, using an acceptance-rejection method to generate points within a unit circle, which can be computationally more efficient on some architectures.
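The Box-Muller transform can be sketched directly (names are illustrative); the check below verifies that the generated values have mean near 0 and variance near 1:

```python
import math
import random

def box_muller():
    """One pair of independent N(0,1) draws from two U(0,1) numbers."""
    u1 = 1.0 - random.random()  # shift to (0, 1] so log(u1) is always defined
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

random.seed(7)
z = [value for _ in range(10000) for value in box_muller()]
mean = sum(z) / len(z)
var = sum(x * x for x in z) / len(z)
print(round(mean, 2), round(var, 2))  # both close to the N(0,1) values 0 and 1
```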
Generation for Empirical Distributions
When a set of observed data x1,…,xN from a real system is available, and a theoretical distribution that fits them well cannot be found (or one is unwilling to assume one), the empirical distribution can be used.
For Discrete Data: If there are k distinct values v1,…,vk with relative frequencies p1,…,pk, the MTI for discrete data is applied.
For Continuous Data: A stepwise empirical CDF can be constructed (if the data are used directly) or a piecewise linear CDF (if the data are grouped into a histogram and interpolated). Then, the MTI is applied.
Special Techniques
The composition technique is used to generate random variables when a complex distribution can be expressed as a weighted combination of several simpler distributions. If a random variable X is known to have a distribution function that is a mixture of k different distributions with associated probabilities p1, p2, …, pk (where Σ pi = 1), the composition technique allows its simulation as follows:
An index i is chosen with probability p_i.
A random variable X_i is generated from the i-th distribution.
The generated variable X = X_i then follows the desired distribution, combining the characteristics of the different participating distributions according to the specified weights.
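The steps above can be sketched as follows; the 50/50 mixture of U(0,1) and U(2,3) used for the check is a hypothetical example (its mean is 0.5·0.5 + 0.5·2.5 = 1.5):

```python
import random

def mixture_sample(weights, samplers):
    """Composition: choose component i with probability p_i, then draw from it."""
    r = random.random()
    cumulative = 0.0
    for p, draw in zip(weights, samplers):
        cumulative += p
        if r <= cumulative:
            return draw()
    return samplers[-1]()  # guard against floating-point round-off in the weights

random.seed(3)
xs = [mixture_sample([0.5, 0.5], [random.random, lambda: random.uniform(2.0, 3.0)])
      for _ in range(10000)]
print(round(sum(xs) / len(xs), 1))  # close to the mixture mean 1.5
```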
The convolution technique is used to generate random variables that are the sum of other independent random variables. It is particularly useful when a random variable Z is known to be the sum of variables X1, X2, …, Xn that follow known distributions.
To simulate (Z) using convolution:
Generate n independent random variables X1, X2, …, Xn following their respective distributions.
Calculate Z = X1 + X2 + … + Xn.
The result is that Z follows the distribution resulting from the sum of the distributions of X1, X2, …, Xn.
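A classic instance of convolution is the Erlang distribution: the sum of k independent Exponential(λ) variables. A sketch (names are illustrative), combining convolution with the exponential inverse transform:

```python
import math
import random

def erlang(k, lam):
    """Convolution: the sum of k independent Exponential(lam) draws is Erlang(k, lam)."""
    return sum(-math.log(1.0 - random.random()) / lam for _ in range(k))

random.seed(5)
zs = [erlang(3, 2.0) for _ in range(10000)]
mean_z = sum(zs) / len(zs)
print(round(mean_z, 1))  # theoretical mean is k/lam = 1.5
```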
The rejection technique is used to generate a random variable from a complex distribution when it is difficult to sample directly. A density function g(x) is required such that c·g(x) is greater than or equal to the desired density f(x) for all possible values of X. The procedure is:
Generate X following the distribution of g(x).
Generate a uniform random number U between 0 and 1.
Accept X if U ≤ f(X) / (c·g(X)), where c is a constant such that c·g(x) ≥ f(x) for all x.
Repeat the process until an X is accepted, ensuring that it follows the f(x) distribution.
Recommended Reading:
Law, A. M. (2015). Simulation Modeling and Analysis (5th ed.). McGraw-Hill Education. (Chapters on random number generation and random variables).
Banks, J., Carson, J. S., II, Nelson, B. L., & Nicol, D. M. (2010). Discrete-Event System Simulation (5th ed.). Prentice Hall. (Chapters on random numbers and random variable generation).
Leemis, L. M., & Park, S. K. (2006). Discrete-Event Simulation: A First Course. (Chapters 2, 6, 7).
Hillier, F. S., & Lieberman, G. J. (2010). Introduction to Operations Research (9th ed.). McGraw-Hill. (Chapter 20 on Simulation, sections 20.3, 20.4).
Cassandras, C. G., & Lafortune, S. (2008). Introduction to Discrete Event Systems (2nd ed.). Springer. (Section 10.5 on Random Variable Generation).