Monte-Carlo Methods: PRNGs

Data Science
Computational Mathematics
Number Theory
Author

Rafiq Islam

Published

August 11, 2024

Introduction

The Monte Carlo method is a widely used statistical technique that leverages the power of randomness to solve complex mathematical problems and simulate the behavior of various systems. It’s a method that has found applications across diverse fields, including physics, finance, engineering, and biology. In this blog post, we’ll dive deeper into the Monte Carlo method and explore the mathematics behind it, along with a discussion of random number generators like Linear Congruential Generators (LCGs) and the infamous RANDU.

The Monte Carlo Method

The Monte Carlo method is based on the idea of using randomness to approximate solutions to problems that may be deterministic in nature but are too complex for analytical methods. The name “Monte Carlo” is a nod to the randomness associated with the famous casino in Monaco.

The basic principle behind the Monte Carlo method is to simulate the behavior of a system by generating random samples and using them to estimate the desired quantities. Let’s consider a mathematical problem where we need to compute an integral that does not have a straightforward analytical solution:

\[\begin{align*} I &= \int_{a}^{b} f(x) \, dx \end{align*}\]

The Monte Carlo method approximates this integral by sampling random points \(x_i\) uniformly from the interval \([a, b]\) and evaluating the function \(f(x)\) at these points. The integral can then be approximated as:

\[\begin{align*} I \approx \frac{b - a}{N} \sum_{i=1}^{N} f(x_i) \end{align*}\]

where \(N\) is the number of random samples. As \(N\) increases, the approximation becomes more accurate, thanks to the Law of Large Numbers.

This approach is particularly useful for high-dimensional integrals, where traditional numerical integration methods become computationally expensive or infeasible.

Random Number Generators (RNGs)

At the heart of the Monte Carlo method lies the generation of random numbers. In practice, most simulations do not use true random numbers but rather pseudorandom numbers generated by deterministic algorithms. These pseudorandom number generators (PRNGs) produce sequences that mimic the properties of true randomness.

Linear Congruential Generator (LCG)

One of the most commonly used PRNGs is the Linear Congruential Generator (LCG). The LCG generates a sequence of numbers \(X_1, X_2, X_3, \ldots\) using the recursive relation:

\[\begin{align*} X_{n+1} &= (aX_n + c) \mod m \end{align*}\]

where:

  • \(X_n\) is the \(n\)-th number in the sequence.
  • \(a\) is the multiplier.
  • \(c\) is the increment.
  • \(m\) is the modulus.

The sequence starts with an initial value \(X_0\), known as the seed, and the parameters \(a\), \(c\), and \(m\) are carefully chosen to maximize the period and quality of the generated sequence.

The quality of the LCG depends on the choice of these parameters. For instance, to achieve a full period (i.e., the sequence cycles through all possible values before repeating), the following conditions must be met:

  1. \(c\) and \(m\) must be relatively prime.
  2. \(a - 1\) must be divisible by all prime factors of \(m\).
  3. If \(m\) is divisible by 4, then \(a - 1\) must also be divisible by 4.

A well-known example of an LCG is the minstd_rand generator used in the C++ Standard Library, which uses \(a = 16807\), \(c = 0\), and \(m = 2^{31} - 1\).

The RANDU Generator

RANDU is an example of a poorly designed LCG that became notorious for its flaws. It is defined by the recurrence relation:

\[\begin{align*} X_{n+1} &= (65539X_n) \mod 2^{31} \end{align*}\]

Although RANDU was widely used in the 1960s and 1970s due to its simplicity, it was later discovered to produce sequences with significant correlations. For example, points generated using RANDU tend to lie on a small number of planes in three-dimensional space, which can severely impact the accuracy of Monte Carlo simulations.

The generator’s flaws arise from poor parameter selection. In RANDU, the modulus \(m = 2^{31}\) and the multiplier \(a = 65539\) result in a sequence with poor distribution properties. As a consequence, RANDU’s generated numbers do not pass modern statistical tests for randomness, rendering it unsuitable for serious applications.

Let’s solve some math problems and visualize randomness.

Problem 1

Given an LCG with parameters \(a,c,m\), prove that

which shows that the \((n+k)th\) term can be computed directly from the \(nth\) term.

Solution:

We know from D. H. Lehmer’s linear congruential generator that
\[\begin{equation} x_n \equiv ax_{n-1}+c \mod m \end{equation}\]
where \(a\) is called the multiplier, \(c\) is called the shift or increment, and \(m\) is called the modulus of the generator. The given equation is also an LCG. We can prove this by induction method. Since \(k\ge 0\) so, let \(k=0\). Then the given relation can be written as

If \(k=1\). Then the given relation can be written as
\[\begin{align*} x_{n+1}& \equiv ax_n+\frac{a-1}{a-1}c \mod m\\ &\equiv ax_n+c \mod m \end{align*}\]

If \(k=2\). Then the given relation can be written as
\[\begin{align*} x_{n+2}& \equiv a^2x_n+\frac{a^2-1}{a-1}c \mod m\\ &\equiv a^2x_n+(a+1)c \mod m\\ &\equiv a^2x_n+ac+c \mod m \\ &\equiv a(ax_n+c)+c \mod m\\ &\equiv ax_{n+1}+c \mod m \end{align*}\]

Now for any \(k=p\) where \(p\in \mathbb{N}\), \[\begin{align*} x_{n+p}& \equiv a^px_n+\frac{a^p-1}{a-1}c \mod m \\ \end{align*}\]

Now by the method of induction, the given equation would be a lcg if it holds for any \(k=p\in \mathbb{N}\) then it must hold for \(k=p+1\) where \(p\in \mathbb{N}\). Now from equation (1) \[\begin{align*} x_{n+p+1} &\equiv ax_{(n+p+1)-1}+c \mod m\\ & \equiv ax_{n+p}+c \mod m \\ & \equiv a(a^px_n+\frac{a^p-1}{a-1}c) +c \mod m\\ & \equiv a^{p+1}x_n+(a\frac{a^p-1}{a-1}+1)c \mod m\\ & \equiv a^{p+1}x_n+\frac{a^{p+1}-1}{a-1}c \mod m\\ \end{align*}\]

Which proves that \(x_{n+k}=a^kx_n+\frac{(a^k-1)}{a-1}c (\mod m)\); \((a\ge 2, k\ge0)\) is an lcg such that \((n+k)th\) term can be computed directly from the \(nth\) term.

Problem 2

(a)
If \(U\) and \(V\) are independently distributed random variables from the uniform distribution \(U(0,1)\) show that \(U+V (\mod 1)\) is also \(U(0,1)\).

Solution

Let \(Z=U+V\) where \(U\) and \(V\) are independently distributed random variables from the uniform distribution \(U(0,1)\). So the minimum value that \(Z\) can have is \(0\) and the maximum value could be \(2\). If \(f_U(u)\) is the PDF of \(U\) and \(f_V(v)\) is the PDF of \(V\) then the PDF of \(Z\) can be found from the convolution of two distribution as follows \[\begin{align*} f_Z(z)=\int_{-\infty}^{+\infty}f_U(u)f_V(z-u)du=\begin{cases} z & \text{for} \hspace{2mm} 0 < z < 1\\ 2-z & \text{for} \hspace{2mm} 1 \le z <2\\ 0 & \text{otherwise} \end{cases} \end{align*}\] Now for any \(x\in (0,1)\) \[\begin{align*} \mathbb{P}(U+V (\mod 1) \le x) &= \mathbb{P}(Z \le x)+ \mathbb{P}(1\le Z \le x+1)\\ &= \int_{0}^{x} z dz +\int_{1}^{1+x}(2-z)dz\\ &=x \end{align*}\]

which is the CDF of a random variable distributed \(U(0,1)\)

(b)
A random number generator is designed by

where \(X_0=0, Y_0=1, X_{n+1}=(9X_n+3) \mod 8\) and \(Y_{n+1}=3Y_n \mod 7\) for \(n=0,1,2,\cdots\). Calculate \(R_0,R_1,R_2, \cdots , R_5.\). What is the period of the generator \(\{R_n\}\)?

Solution

rand.gen<-function(n){
  RN<-vector(length = n)
  x<-rep(n)
  y<-rep(n)
  x[1]<-0;
  y[1]<-1;
  RN[1]<-(x[1]/8+y[1]/7)%% 1
  for (i in 1:n) {
    x[i+1]<-(9*x[i]+3)%% 8
    y[i+1]<-(3*y[i]) %% 7
    RN[i+1]<-(x[i+1]/8+y[i+1]/7)%% 1
  }
  return(data.frame(X_values=x,Y_values=y,R_values=RN))
}
rand.gen(4)  
  X_values Y_values   R_values
1        0        1 0.14285714
2        3        3 0.80357143
3        6        2 0.03571429
4        1        6 0.98214286
5        4        4 0.07142857

So the unique values are

     R_values
1  0.14285714
2  0.80357143
3  0.03571429
4  0.98214286
5  0.07142857
6  0.58928571
7  0.39285714
8  0.05357143
9  0.28571429
10 0.23214286
11 0.32142857
12 0.83928571
13 0.64285714
14 0.30357143
15 0.53571429
16 0.48214286
17 0.57142857
18 0.08928571
19 0.89285714
20 0.55357143
21 0.78571429
22 0.73214286
23 0.82142857
24 0.33928571

So from the above data we can see that the period is \(24\).

Problem 3

Write a code that would implement RANDU. For debugging purpose print \(x_{1000}\) when the seed is \(x_0=1\)

(a)
Using RANDU generate \(u_1,u_2,\cdots, u_{20,002}\) where \(u=\frac{x_n}{M}\). For all triplets in your sequence, \(u_i, u_{i+1}, u_{i+2}\), in which \(0.5\le u_{i+1} \le 0.51\) plot \(u_i\) versus \(u_{i+2}\). Comment on the pattern of your scatterplot.

(b)
Generate a sequence of lenght 1002. Use a program that plots points in 3 dimensions and rotates the axes to rotate the points until you can see the 15 planes.

library("rgl")
library("rglwidget")

N = 1002
A = matrix(0, ncol=3, nrow=N)
seed <- as.double(1)

RANDU <- function() {
  seed <<- ((2^16 + 3) * seed) %% (2^31)
  round(seed/(2^31), 6)
}

for (i in 1:N) {
  A[i, ] <- c(RANDU(), RANDU(), RANDU())
}
B = as.data.frame(A)

bg3d(color = "#f4f4f4")
plot3d(B$V1, B$V2, B$V3, type="s", size=1, lit=TRUE, col = rainbow(1000))
spin <- spin3d(axis= c(0, 0, 1), rpm = 5)
play3d(spin, duration = 10)

# Render the 3D plot as a WebGL widget
rglwidget()
rgl.close()

Problem 4 Approximation of pi using Monte-Carlo Method

A circle with radius \(r=1\) has the area \(A=\pi r^2= \pi\) and a square with length \(l=1\) has the area \(B=1\). Now if we consider the following situation, where a quarter of a unit circle is inscribed inside a unit square like this

# I used python to generate this plot. Else everywhere the codes are in R
import matplotlib.pyplot as plt
import numpy as np 

num_points = 100
pts = np.random.rand(num_points,2)

fig, axes = plt.subplots()
theta = np.linspace(0, np.pi/2,100)
x = np.cos(theta)
y = np.sin(theta)

axes.plot(x, y, 'b')
axes.plot([0,1],[0,0],'k')
axes.plot([1,1],[0,1],'k')
axes.plot([1,0],[1,1],'k')
axes.plot([0,0],[1,0],'k')

for p in pts:
    if p[0]**2+p[1]**2 <=1:
        axes.plot(p[0], p[1], 'go')
    else:
        axes.plot(p[0], p[1], 'ro')
axes.set_aspect('equal')
plt.gca().set_facecolor('#f4f4f4') 
plt.gcf().patch.set_facecolor('#f4f4f4')
plt.show()

Quarter circle inscribe in a unit square

we get,

\[\begin{align*} \frac{\text{Area of the quarter of a unit circle: C}}{\text{Area of a unit square: S}}&= \frac{\frac{\pi}{4}}{1}=\frac{\pi}{4}\hspace{4mm}\implies \pi = \frac{4C}{S} \end{align*}\]

The above relation tells us some interesting fact. If we uniformly create as many points as possible inside the square then the number of points inside the circle will be approximately 4 times the number of the points outside the circular region.

monte_carlo_pi <- function(n) {
  inside_circle <- 0
  for (i in 1:n) {
    x <- runif(1)
    y <- runif(1)
    if (x^2 + y^2 <= 1) {
      inside_circle <- inside_circle + 1
    }
  }
  pi_estimate <- (inside_circle / n) * 4
  return(pi_estimate)
}

n <- 10000
cat(sprintf("Monte Carlo estimated value of π from %d points = %f\n", n, monte_carlo_pi(n)))
Monte Carlo estimated value of π from 10000 points = 3.124000

That’s all for this post.

Reference

Share on

You may also like

Back to top

Citation

BibTeX citation:
@online{islam2024,
  author = {Islam, Rafiq},
  title = {Monte-Carlo {Methods:} {PRNGs}},
  date = {2024-08-11},
  url = {https://mrislambd.github.io/jobandintern/montecarlo1/},
  langid = {en}
}
For attribution, please cite this work as:
Islam, Rafiq. 2024. “Monte-Carlo Methods: PRNGs.” August 11, 2024. https://mrislambd.github.io/jobandintern/montecarlo1/.