Chapter Two: Fundamentals of Probability and Statistics for Simulation

Chapter Objectives:

  1. Apply basic concepts of probability theory to model uncertainty in industrial engineering systems.

  2. Distinguish between discrete and continuous random variables, and describe their fundamental properties.

  3. Use and interpret probability mass functions (PMF), probability density functions (PDF), and cumulative distribution functions (CDF).

  4. Calculate and interpret the expected value, variance, and other descriptive measures of random variables.

  5. Identify the most common probability distributions (Bernoulli, Binomial, Poisson, Geometric, Uniform, Exponential, Normal, Triangular, Gamma, Weibull, Lognormal) and select the most appropriate one to model random phenomena in logistics and production.

  6. Understand the statement and practical implications of the Central Limit Theorem (CLT).

  7. Describe the process of parameter estimation from sample data and the need to perform goodness-of-fit tests.


Basic Concepts of Probability

Industrial engineering deals with systems that operate under uncertainty. Product demand fluctuates, production times vary, machines fail unexpectedly. Probability is the branch of mathematics that provides the language and tools to quantify and analyze this uncertainty. A solid understanding of its principles is therefore an indispensable prerequisite for stochastic simulation.

Sample Space: it is the set of all possible outcomes of a random experiment. For example, if a manufactured part is inspected, the sample space could be {defective, not defective}. If the cycle time of a product is measured, the sample space could be all positive real numbers.


Montgomery, D. C., & Runger, G. C. (2018). Applied Statistics and Probability for Engineers (7th ed.). Wiley.


The foundation of stochastic simulation lies in the ability to numerically represent uncertainty. Production and logistics processes are full of variability.

Probability theory provides the conceptual framework for describing these phenomena. Concepts such as event independence are crucial; for example, when modeling a system with multiple machines, it is important to determine if the failure of one machine is independent of failures of another. If they are not (perhaps due to widespread poor maintenance or power fluctuations), the model must capture this dependence, often through conditional probabilities.


Example of Probability Using the Dartboard Exercise

There is a square board with an inscribed circle. The circle has a radius r and, since it is perfectly inscribed, the side of the square is 2r. Darts are thrown at the square randomly, meaning each point inside the square has the same probability of being hit.

The goal is to determine the probability that a dart that lands inside the square also lands inside the circle.

P(A) = \frac{\text{Measure of the favorable outcome space}}{\text{Measure of the total sample space}}

For this problem:

  • The total sample space (S) is all possible points where the dart can land, i.e., the area of the square.

  • The favorable event (E) is that the dart lands inside the circle, which corresponds to the area of the circle.

Area of the Circle: the area of a circle is A = \pi r^2. For r = 1, the area is A_{\text{circle}} = \pi (1)^2 = \pi.

Area of the Square: if the circle of radius 1 is centered and concentric with the square, the side of the square (l) equals the diameter of the circle, i.e., l = 2r. For r = 1, the side of the square is l = 2(1) = 2, and its area is A_{\text{square}} = l^2 = (2)^2 = 4.

Calculation of Probability: The probability that the dart lands inside the circle is the ratio of the two areas.

P(\text{Dart in circle}) = \frac{A_{\text{circle}}}{A_{\text{square}}} = \frac{\pi}{4}

Numerically, this is approximately:

\frac{\pi}{4} \approx \frac{3.14159}{4} \approx 0.7854

This means there is approximately a 78.54% probability that the dart lands inside the circle.
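The dartboard experiment is also a classic Monte Carlo illustration: throwing many random darts and counting the fraction that hits the circle estimates the same probability numerically. The following Python sketch is illustrative (the function name, dart count, and seed are choices made here, not part of the original example):

```python
import random

def estimate_circle_probability(n_darts: int, seed: int = 42) -> float:
    """Throw n_darts uniformly at the square [-1, 1] x [-1, 1] and
    count the fraction landing inside the inscribed circle of radius 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_darts):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # dart landed inside the circle
            hits += 1
    return hits / n_darts

print(estimate_circle_probability(100_000))  # close to pi/4 ≈ 0.7854
```

As the number of darts grows, the estimate converges to π/4, which is the basic idea behind Monte Carlo estimation of probabilities.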

Theorem of Mutually Exclusive Events


The theorem of mutually exclusive events states that two events cannot occur simultaneously. In other words, if one event happens, the other cannot. Mathematically, two events A and B are mutually exclusive if their intersection is empty, that is, A \cap B = \emptyset.

Example:

Imagine rolling a six-sided die. Define the following events:

  • Event A: rolling an even number (2, 4, 6)

  • Event B: rolling an odd number (1, 3, 5)

Clearly, a number cannot be even and odd at the same time. Therefore, A and B are mutually exclusive events.

Property

For mutually exclusive events, the probability that any of them occurs is the sum of their individual probabilities: P(A \cup B) = P(A) + P(B)

In the context of discrete-event simulation, the concept of mutually exclusive events is fundamental for modeling decision points, branching, and entity routing. When an entity (a customer, product, document) reaches a point in the process where it must follow one of several possible paths, these paths represent mutually exclusive events. The entity cannot take two paths at once.

This logic is implemented in AnyLogic mainly through the SelectOutput block from the Process Modeling Library (PML).

  • Practical Implementation: A SelectOutput block has one input and multiple outputs. An arriving entity is routed to one and only one of these outputs, making the choice of each route a mutually exclusive event.

This routing can be defined in two main ways:

  1. Probabilistic Routing: The block can be set to route entities based on probabilities. For example, 50% of entities go to output 1, 30% to output 2, and 20% to output 3. The sum of probabilities must be 1, ensuring each entity takes exactly one route.

  2. Conditional Routing: The route selection is based on a boolean condition (true/false). For example: IF (entity.type == "Urgent") THEN Output1 ELSE Output2. The condition entity.type == "Urgent" can only be true or false for a given entity, never both. Therefore, taking Output 1 and taking Output 2 are mutually exclusive events.
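Both routing rules can be sketched outside AnyLogic as well. The following Python snippet is an illustrative sketch (it is not AnyLogic code; the function and variable names are invented here) of SelectOutput-style probabilistic routing: a single uniform draw is compared against cumulative probabilities, so each entity takes exactly one output.

```python
import random

def select_output(probabilities, rng=random):
    """Route an entity to exactly one output index, chosen with the
    given probabilities (mutually exclusive and collectively exhaustive)."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    u = rng.random()  # one uniform draw in [0, 1)
    cumulative = 0.0
    for i, p in enumerate(probabilities):
        cumulative += p
        if u < cumulative:
            return i
    return len(probabilities) - 1  # guard against floating-point rounding

# Probabilistic routing with 50% / 30% / 20%, as in the shoe factory case
counts = [0, 0, 0]
rng = random.Random(7)
for _ in range(10_000):
    counts[select_output([0.5, 0.3, 0.2], rng)] += 1
print(counts)  # roughly [5000, 3000, 2000]
```

Because the single draw u falls into exactly one cumulative interval, the three routes are mutually exclusive by construction.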

Shoe Factory Example:

Consider the "Shoe Factory" case.

Process Description: The document states that, in the first stage, the cutting process depends on the shoe type. 50% of units are Type 1, 30% are Type 2, and 20% are Type 3.

Event Analysis: A shoe unit arriving at the cutting station cannot be both Type 1 and Type 2 at the same time. Therefore, the events "the unit is Type 1", "the unit is Type 2", and "the unit is Type 3" are mutually exclusive.

Implementation in AnyLogic: To model this system, a Source block is used to generate the arrival of shoe units, immediately followed by a SelectOutput block. This SelectOutput is configured with three outputs, each connected to a different cutting process, with probabilities of 0.5, 0.3, and 0.2, respectively.


Random Variables: Discrete and Continuous

A random variable (RV) is a function that assigns a numeric value to each possible outcome of a random experiment. Instead of working with qualitative events, random variables allow us to quantify outcomes and apply mathematical and statistical tools.

An RV is discrete if the set of values it can take is finite or countably infinite (that is, can be counted, like the integers).

Examples in Industrial Engineering:

  • Number of customer arrivals at a service system in an hour.

  • Number of defective items in a production lot.

  • Number of trucks waiting to load at a dock.

  • Number of units demanded of a product in a day.

An RV is continuous if it can take any value within an interval of the real numbers. Typical examples include the processing time of a part, the time between machine failures, or the weight of a packaged product.


Definitions: Probability, Density, and Cumulative Distribution Functions

Probability Density Function (PDF): Describes the probability that a continuous random variable takes a value within a specific range, where the area under the curve equals the probability.

The probability that X lies between two values a and b is the integral of the PDF from a to b:

P(a \leq X \leq b) = \int_{a}^{b} f(x) \, dx

Example:

If X is a continuous random variable with PDF f(x) = \frac{1}{2} for 0 \leq x \leq 2, and zero elsewhere, the probability that X is between 0.5 and 1.5 is calculated as:

P(0.5 \leq X \leq 1.5) = \int_{0.5}^{1.5} \frac{1}{2} \, dx = \left[\frac{1}{2}x\right]_{0.5}^{1.5} = \frac{1}{2}(1.5) - \frac{1}{2}(0.5) = 0.5

Therefore, the probability that X is between 0.5 and 1.5 is 0.5.
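This integral can also be checked numerically. The sketch below (helper names are illustrative) approximates the area under the uniform PDF on [0.5, 1.5] with a midpoint rule:

```python
def uniform_pdf(x, a=0.0, b=2.0):
    """PDF of a continuous uniform random variable on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def integrate(f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

print(integrate(uniform_pdf, 0.5, 1.5))  # ≈ 0.5, matching the analytic result
```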

Cumulative Distribution Function (CDF): Provides the probability that a random variable, discrete or continuous, is less than or equal to a specific value, representing the sum or integration of values up to that point.

Example: Rolling a Die

Consider a simple experiment: rolling a standard six-sided die.

  • Discrete Random Variable (X): The result of the roll, which can take discrete values {1,2,3,4,5,6}.

  • Probability Mass Function (PMF): For a fair die, each result has the same probability of occurring.

p(x) = P(X = x) = \frac{1}{6}, \quad \text{for } x \in \{1,2,3,4,5,6\}

Calculation of the Cumulative Distribution Function (CDF)

The CDF, denoted as F(x), gives us the probability that the outcome of the roll is less than or equal to a specific value x. It is calculated by summing the probabilities of all outcomes up to that value.

  • Probability of getting a 1 or less: F(1) = P(X \leq 1) = P(X=1) = \frac{1}{6}

  • Probability of getting a 2 or less: F(2) = P(X \leq 2) = P(X=1) + P(X=2) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6}

  • Probability of getting a 3 or less: F(3) = P(X \leq 3) = P(X=1) + P(X=2) + P(X=3) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6}

| Outcome x_i | Individual Probability p(x_i) | Cumulative Probability F(x_i) |
| --- | --- | --- |
| 1 | 1/6 | 1/6 |
| 2 | 1/6 | 2/6 |
| 3 | 1/6 | 3/6 |
| 4 | 1/6 | 4/6 |
| 5 | 1/6 | 5/6 |
| 6 | 1/6 | 1 |


For Discrete Random Variables, Probability Mass Function (PMF)

The PMF of a discrete random variable X, denoted as p(x) or P(X = x), specifies the probability that X takes exactly the value x.

  • Properties:

    1. 0 \leq p(x) \leq 1 for all x.

    2. \sum_{\text{all } x} p(x) = 1 (the sum of the probabilities of all possible values is 1).

  • Example:

    • If X is the result of rolling a fair die, p(x) = 1/6 for x \in \{1,2,3,4,5,6\}, and p(x) = 0 for any other x.

  • For the binomial distribution, used in the example below, the PMF is:

P(X = k) = {n \choose k} \cdot p^k (1-p)^{n-k}

Where:

  • n is the number of trials (sample size).

  • k is the number of successes (defective components).

  • p is the probability of success in a single trial.

  • {n \choose k} = \frac{n!}{k!(n-k)!} is the binomial coefficient, which counts the number of ways to choose k elements from a set of n.

Practical Example: Quality Inspection in a Production Lot

Scenario: An industrial engineer is supervising an electronic components production line. On average, 5% of the components produced are defective. The engineer randomly takes a sample of 10 components from a large batch for a quality inspection.

The interest is in the number of defective components found in that sample.

Random Variable: Define the discrete random variable X as: X = "the number of defective components in the sample of 10".

The possible values X can take are {0,1,2,3,4,5,6,7,8,9,10}.

Solution: In a spreadsheet, substitute the values into the binomial PMF and plot the results.
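As an alternative to a spreadsheet, the same PMF values can be computed directly from the binomial formula above. A short Python sketch (the function name is illustrative):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.05  # sample of 10 components, 5% defect rate
for k in range(4):
    print(f"P(X = {k}) = {binomial_pmf(k, n, p):.4f}")
# P(X = 0) = 0.95**10 ≈ 0.5987: more than half the samples contain no defectives
```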

For Continuous Random Variables: Probability Density Function (PDF)

For a continuous random variable X, the PDF, denoted as f(x), does not give the probability that X equals a specific value (this probability is always zero for continuous random variables). Instead, the area under the curve of f(x) over an interval [a, b] represents the probability that X falls within that interval:

P(a \leq X \leq b) = \int_a^b f(x) \, dx

Properties:

  1. f(x) \geq 0 for all x.

  2. \int_{-\infty}^{\infty} f(x) \, dx = 1 (the total area under the density curve is 1).

Cumulative Distribution Function (CDF)

The CDF, denoted as F(x), applies to both discrete and continuous random variables. It gives the probability that the random variable X takes a value less than or equal to x: F(x) = P(X \leq x).

  • For Discrete Random Variables: F(x) = \sum_{k \leq x} p(k)

  • For Continuous Random Variables: F(x) = \int_{-\infty}^x f(t) \, dt

  • General Properties:

    1. 0 \leq F(x) \leq 1

    2. F(x) is a non-decreasing function (that is, if a < b, then F(a) \leq F(b)).

    3. \lim_{x \to -\infty} F(x) = 0 and \lim_{x \to \infty} F(x) = 1

  • Usefulness: The CDF is very useful for calculating interval probabilities: P(a < X \leq b) = F(b) - F(a)

circle-info

The CDF plays a particularly important role in simulation, as it is the basis of the inverse transform method, a fundamental technique for generating random observations from a specific probability distribution using uniform random numbers.
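As a preview of the inverse transform method, note that the exponential CDF F(x) = 1 - e^{-\lambda x} can be inverted in closed form: x = -\ln(1 - u)/\lambda, where u is a uniform random number in [0, 1). A Python sketch (the rate value and sample size are illustrative):

```python
import math
import random

def exponential_inverse_transform(lam: float, u: float) -> float:
    """Invert the exponential CDF F(x) = 1 - exp(-lam * x):
    x = -ln(1 - u) / lam, for u in [0, 1)."""
    return -math.log(1.0 - u) / lam

rng = random.Random(1)
lam = 2.0  # rate; mean of the exponential is 1/lam = 0.5
sample = [exponential_inverse_transform(lam, rng.random()) for _ in range(100_000)]
print(sum(sample) / len(sample))  # close to the theoretical mean 0.5
```

Feeding uniform random numbers through the inverse CDF yields observations that follow the target distribution, which is exactly the technique discussed above.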


Expected Value, Variance, and Other Measures

To summarize the characteristics of a probability distribution and its associated random variable, various numerical measures are used.

The most important are expected value and variance.

The expected value of a random variable X, denoted as E[X] or \mu_X, represents the average value X would take if the random experiment were repeated a very large number of times. It is a measure of the central tendency of the distribution.

  • For discrete X with PMF p(x): E[X] = \sum_{\text{all } x} x \cdot p(x)

  • For continuous X with PDF f(x): E[X] = \int_{-\infty}^{\infty} x f(x) \, dx

Properties of Expected Value (if a and b are constants):

  • E[a] = a

  • E[aX] = aE[X]

  • E[aX + b] = aE[X] + b

  • E[X + Y] = E[X] + E[Y] (for any random variables X and Y)

  • E[g(X)] = \sum_x g(x) p(x) (discrete case) or E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx (continuous case)

The variance of X, denoted \text{Var}(X) or \sigma_X^2, measures the dispersion of the distribution around its mean: \text{Var}(X) = E[(X - \mu_X)^2] = E[X^2] - (E[X])^2. Its square root, \sigma_X, is the standard deviation, expressed in the same units as X.
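These definitions translate directly into code. The sketch below (helper names are illustrative) computes the expected value and variance of the fair-die random variable used earlier:

```python
def expected_value(values, pmf):
    """E[X] = sum of x * p(x) over all possible values."""
    return sum(x * pmf(x) for x in values)

def variance(values, pmf):
    """Var(X) = E[(X - mu)^2], computed from the PMF."""
    mu = expected_value(values, pmf)
    return sum((x - mu) ** 2 * pmf(x) for x in values)

die = range(1, 7)
p = lambda x: 1 / 6  # PMF of a fair six-sided die
print(expected_value(die, p))  # 3.5
print(variance(die, p))        # 35/12 ≈ 2.9167
```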

These descriptive measures are fundamental in simulation for two main reasons:

1. Input Modeling

When selecting probability distributions for a model's input variables (e.g., service times, demand), their parameters are often directly related to the mean, variance, or other measures (e.g., the exponential distribution is defined by its mean, the normal by its mean and standard deviation).

2. Output Analysis

Performance metrics estimated from simulation runs are often expected values (e.g., average queue time, average resource utilization) and measures of their variability (e.g., cycle time variance, standard deviation of profit).


Common Probability Distributions and Their Application in Industrial Engineering

Choosing an appropriate probability distribution to represent a random phenomenon is a critical step in building a valid simulation model. Below are some of the most commonly used distributions in industrial engineering, logistics, and production, along with their characteristics and typical applications.

| Distribution | Type | Key Parameters | Mean | Variance | Common Applications in IE (Logistics/Production) |
| --- | --- | --- | --- | --- | --- |
| Bernoulli | Discrete | p (prob. of success) | p | p(1-p) | Defective/non-defective part, machine up/down. |
| Binomial | Discrete | n (trials), p | np | np(1-p) | Number of defectives in a lot, number of customers buying a product. |
| Poisson | Discrete | λ (mean rate) | λ | λ | Arrivals/hour, failures/day, defects/unit. |
| Geometric | Discrete | p (prob. of success) | 1/p | (1-p)/p² | Number of inspections until first defect. |
| Uniform Disc. | Discrete | a, b (min, max) or N values | (a+b)/2 | ((b-a+1)²-1)/12 | Random selection among N equally likely options. |
| Uniform Cont. | Continuous | a, b (min, max) | (a+b)/2 | (b-a)²/12 | Uncertainty with only known limits, process times with no more info. |
| Exponential | Continuous | λ (rate) or μ=1/λ (mean) | 1/λ | 1/λ² | Interarrival times (Poisson), service times, life of non-aging components. |
| Normal | Continuous | μ (mean), σ (std. dev.) | μ | σ² | Part dimensions, errors, sums of RVs (CLT), task durations. |
| Triangular | Continuous | a (min), b (max), c (mode) | (a+b+c)/3 | (a²+b²+c²-ab-ac-bc)/18 | Activity durations (PERT), process times with limited data. |
| Erlang | Continuous | k (shape), λ (rate) | k/λ | k/λ² | Sum of k exponentials, repair times in phases. |
| Weibull | Continuous | α (scale), β (shape) | αΓ(1+1/β) | α²[Γ(1+2/β)-(Γ(1+1/β))²] | Failure times (wear-out, infant mortality, constant rate). |
| Lognormal | Continuous | μₗₙ (mean of ln X), σₗₙ (std. dev. of ln X) | exp(μₗₙ + σₗₙ²/2) | (exp(σₗₙ²) - 1) exp(2μₗₙ + σₗₙ²) | Repair times, some multiplicative processes. |

Central Limit Theorem (CLT)

The Central Limit Theorem (CLT) is one of the most important and remarkable results in probability theory and statistics, with profound implications for simulation.

The Central Limit Theorem (CLT) states that, given a sufficiently large sample of independent and identically distributed random variables, the distribution of the sum or sample mean will approximate a normal distribution, regardless of the original distribution of the variables. This means that even if the individual data are not normally distributed, their mean will tend to follow a normal distribution as the sample size increases. This enables statistical inference and the use of techniques such as confidence intervals and hypothesis testing.

Formally, if we have a sequence of random variables:

X_1, X_2, \ldots, X_n

that are independent and identically distributed, with finite mean \mu and finite variance \sigma^2, then as n (the number of variables) becomes large, the distribution of the sum

S_n = X_1 + X_2 + \ldots + X_n

approaches a normal distribution. More formally, the standardized variable

Z_n = \frac{S_n - n\mu}{\sigma \sqrt{n}} = \frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}}

converges in distribution to a standard normal variable N(0,1) as n \to \infty.


Importance of the CLT in Simulation

Many phenomena in industrial engineering, such as the total time to complete a product that passes through multiple processing stages, can be considered as the sum of individual activity durations. Even if the individual activity times are not normally distributed, their sum (if there are enough activities and they are approximately independent) will tend to be normally distributed thanks to the CLT. This justifies the use of the normal distribution as a model for many aggregated quantities in systems.
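The CLT is easy to verify empirically. The sketch below (the sample size and number of replications are illustrative choices) draws many samples from a clearly non-normal exponential distribution and shows that the sample means cluster around μ with spread close to σ/√n:

```python
import random
import statistics

rng = random.Random(3)
n, runs = 30, 20_000

# Means of samples of size n from an exponential with mean 1 (so mu = sigma = 1)
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(runs)
]

print(statistics.fmean(sample_means))  # close to mu = 1
print(statistics.stdev(sample_means))  # close to sigma / sqrt(n) = 1/sqrt(30) ≈ 0.1826
```

Plotting a histogram of `sample_means` would show the familiar bell shape, even though the underlying exponential data are strongly skewed.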


Parameter Estimation and Goodness-of-Fit Tests

What is a Goodness-of-Fit Test?

A goodness-of-fit test is a statistical tool used to determine how well a set of data fits an expected theoretical distribution. It evaluates how likely it is that an observed sample was generated from a population that follows a specific distribution.

What Is It For?

The goodness-of-fit test is mainly used to:

  1. Validate Statistical Models: Helps verify if the chosen statistical model is appropriate for the data.

  2. Compare Distributions: Allows comparing the theoretical distribution with the observed distribution.

  3. Decision Making: Facilitates decisions in fields such as process quality, as understanding the data distribution can lead to better adjustments and solutions.

How Is It Used?

  1. Select the Theoretical Distribution: Choose the distribution that best describes the data (e.g., Normal, Binomial, Poisson, etc.).

  2. Perform the Statistical Test: Use tests such as Chi-square, Kolmogorov-Smirnov, or Anderson-Darling.

  3. Interpret the Results: Analyze the p-value of the test to determine whether to reject or not the null hypothesis that the data follow the selected theoretical distribution.


Parameter Estimation

Once a candidate family of distributions has been selected (based on system knowledge and/or a preliminary analysis of the data, such as histograms), the next step is to estimate the parameters of that distribution using the available sample data (e.g., observed service times, historical demands).

The most common methods include:

Method of Moments: equates the sample moments (e.g., sample mean, sample variance) with the theoretical moments of the distribution (which are functions of the parameters) and solves the resulting system of equations for the parameters. It is conceptually simple but not always the most efficient.

Maximum Likelihood Estimation (MLE): chooses the parameter values that maximize the likelihood of having observed the sample; it generally has better statistical properties, at the cost of more involved calculations.
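As an illustration of the method of moments, the sketch below fits a gamma distribution, whose mean is shape × scale and whose variance is shape × scale², to a set of repair-time observations. The data values here are invented purely for illustration:

```python
import statistics

def fit_gamma_moments(data):
    """Method-of-moments estimates for a gamma distribution:
    mean = shape * scale, variance = shape * scale**2."""
    mean = statistics.fmean(data)
    var = statistics.variance(data)
    scale = var / mean          # solve the two moment equations
    shape = mean / scale        # equivalently, mean**2 / var
    return shape, scale

# Hypothetical repair-time observations (hours)
repair_times = [1.2, 0.8, 2.5, 1.9, 1.1, 3.0, 0.6, 1.7, 2.2, 1.4]
shape, scale = fit_gamma_moments(repair_times)
print(f"shape ≈ {shape:.2f}, scale ≈ {scale:.2f}")
```

By construction, the fitted distribution reproduces the sample mean and variance exactly; whether it fits the data well overall is what the goodness-of-fit tests below assess.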


Goodness-of-Fit Tests

After selecting a distribution and estimating its parameters, it is crucial to formally assess how well that fitted theoretical distribution agrees with the observed empirical data. Goodness-of-fit tests quantify this agreement.

Hypotheses:

  • Null Hypothesis (H_0): The observed data come from the specified theoretical distribution.

  • Alternative Hypothesis (H_1): The observed data do not come from the specified theoretical distribution.

Chi-square ($\chi^2$) Test:

  1. Applicable to both discrete and continuous data.

  2. Compares the observed frequencies (O_i) in each category or interval with the expected frequencies (E_i) under the null hypothesis.

  3. The test statistic is \chi^2 = \sum (O_i - E_i)^2 / E_i.

  4. If the value of the statistic is large (exceeding a critical value from the Chi-square distribution with certain degrees of freedom, or if the p-value is small), we reject H_0, concluding that the theoretical distribution is not a good fit.
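The Chi-square statistic takes only a few lines of Python to compute. In the sketch below the observed counts are invented for illustration, and 11.07 is the standard critical value of the Chi-square distribution at α = 0.05 with 6 − 1 = 5 degrees of freedom:

```python
# Observed counts from 600 rolls of a die, tested against a fair die
observed = [95, 108, 92, 110, 99, 96]
expected = [100] * 6  # 600 rolls / 6 equally likely faces

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square statistic = {chi2:.2f}")

critical = 11.07  # chi-square critical value, alpha = 0.05, 5 degrees of freedom
print("reject H0" if chi2 > critical else "fail to reject H0")
```

Here the statistic (2.70) is well below the critical value, so there is no evidence against the fair-die hypothesis.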

Kolmogorov-Smirnov (K-S) Test:

  1. Mainly for continuous data (though there are adaptations for discrete).

  2. Compares the empirical cumulative distribution function F_e(x), built from the data, with the theoretical CDF F_t(x) of the hypothesized distribution.

  3. The test statistic is D = \max_x |F_e(x) - F_t(x)|, the maximum absolute vertical difference between the two CDFs.

  4. If D is large (exceeding a critical value, or if the p-value is small), we reject H_0. It is more sensitive to differences in shape, median, or spread than the Chi-square test.
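The D statistic is equally simple to compute directly. The sketch below (sample values are invented for illustration) evaluates both sides of each step of the empirical CDF, since the maximum gap can occur just before or just after a data point:

```python
import math

def ks_statistic(data, cdf):
    """D = max |F_e(x) - F_t(x)|, evaluated at the sample points
    using both sides of each empirical-CDF step."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, abs(i / n - f), abs((i - 1) / n - f))
    return d

# Compare a small sample against a hypothesized Exponential(lambda = 1) CDF
sample = [0.1, 0.5, 0.9, 1.4, 2.2, 0.3, 0.7, 1.1]
expo_cdf = lambda x: 1.0 - math.exp(-x)
print(f"D = {ks_statistic(sample, expo_cdf):.3f}")
```

In practice D would then be compared against a K-S critical value (which depends on n and α) to decide whether to reject H_0.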

Interpretation of the p-value

A small p-value (typically < 0.05 or < 0.10) indicates that there is enough evidence to reject the null hypothesis, suggesting that the chosen distribution does not fit the data well. A large p-value does not "prove" that H_0 is true, only that there is not enough evidence to reject it.


Bibliography

  1. Montgomery, D. C., & Runger, G. C. (2018). Applied Statistics and Probability for Engineers (7th ed.). Wiley.

  2. Walpole, R. E., Myers, R. H., Myers, S. L., & Ye, K. (2017). Probability and Statistics for Engineers and Scientists (9th ed.). Pearson Education.

  3. Ross, S. M. (2014). Introduction to Probability Models (11th ed.). Academic Press.

  4. Leemis, L. M., & Park, S. K. (2006). Discrete-Event Simulation: A First Course. (Chapters 6, 7, and 9)

  5. Winston, W. L. (2005). Operations Research: Applications and Algorithms. (Chapter 12)

  6. Hillier, F. S., & Lieberman, G. J. (2010). Introduction to Operations Research. (Probability and statistics chapters)
