Generating Correlated Uniform Variates

Keywords

covariance matrices, Matlab, R, random numbers, random variables

Review Status

Reviewed; revised 2009-02-12.

Overview

This tutorial describes a method for generating uniformly distributed random variates that exhibit a prespecified linear correlation.

The Problem

Assume one wants to create a vector $U$ of random variates where each element is distributed uniformly over the interval $[0,1]$, and the elements of $U$ are linearly correlated as specified in a matrix $\Sigma$.

Solution

Assume a multivariate random variable $X$ is defined as

(1)
\begin{equation} X = F^{-1}(U) \end{equation}

where $F$ is some distribution function and $U\! \sim \! \mathcal{U}[0,1]$. A well-known fact (sometimes exploited to generate random variates with non-uniform distributions) is that $X$ is then a random variable with distribution $F$. Reversing this relation, one can create uniformly distributed variates from variates following other distributions by inserting the latter into their respective distribution functions. Such a transformation, however, will typically affect the dependence between the original variables.
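As a language-neutral sketch (the tutorial's own implementations below are in Matlab and R), the inverse-transform relation can be illustrated in plain Python with the exponential distribution, $F(x) = 1 - e^{-x}$, whose inverse is available in closed form:

```python
import math
import random

random.seed(42)

# draw uniforms U ~ U[0,1]
u = [random.random() for _ in range(10000)]

# inverse transform: X = F^{-1}(U) for F(x) = 1 - exp(-x),
# so X follows the exponential distribution
x = [-math.log(1.0 - ui) for ui in u]

# mapping back through F recovers the original uniforms
u_back = [1.0 - math.exp(-xi) for xi in x]

print(max(abs(a - b) for a, b in zip(u, u_back)))  # ~0 up to rounding
```

The round trip $F(F^{-1}(U)) = U$ is exact up to floating-point rounding, which is what lets the transformation at the end of the method turn normals back into uniforms.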

Bravais-Pearson (or linear) correlation is not invariant to transformations in general (only to certain linear ones), so if the original non-uniform variates $X$ are linearly correlated, there is no guarantee that this linear correlation will be preserved after the transformation. Other measures of dependence, however, are more robust to transformations of the variables. One such measure is Spearman correlation (also called rank correlation, or fractile correlation), which is invariant to any strictly increasing transformation. Hence, as long as $F$ is strictly monotonically increasing, Spearman correlation will be preserved. A useful result relating Bravais-Pearson and Spearman correlation is that if $X\! \sim \! F$ and $Y\! \sim \! G$ (where $G$ is another distribution function), then

(2)
\begin{align} \rho^S_{X,Y} = \rho^B_{F(X),G(Y)}\, . \end{align}

Here $\rho^B$ is the Bravais-Pearson correlation and $\rho^S$ the Spearman correlation. So the linear correlation between the uniforms obtained from transforming the original variates equals the Spearman correlation between the original variates. (For details, see .) What is needed, then, are random variables (following some distribution) for which realisations with a given Spearman correlation can easily be generated.
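The invariance claim can be checked numerically. The following Python sketch (the helper functions `ranks`, `pearson`, and `spearman` are ad hoc, not part of the tutorial's code) computes Spearman correlation as the Pearson correlation of ranks, and shows it is unchanged under strictly increasing transforms such as $e^x$ and $y^3$, whereas the Pearson correlation generally changes:

```python
import math
import random

def ranks(v):
    """Rank of each element (1-based; continuous data, so no ties expected)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    sa = math.sqrt(sum((ai - ma) ** 2 for ai in a))
    sb = math.sqrt(sum((bi - mb) ** 2 for bi in b))
    return cov / (sa * sb)

def spearman(a, b):
    return pearson(ranks(a), ranks(b))

random.seed(1)
# a dependent pair: y is x plus independent noise
x = [random.gauss(0, 1) for _ in range(2000)]
y = [xi + random.gauss(0, 1) for xi in x]

# strictly increasing transforms preserve the ranks exactly
x_t = [math.exp(xi) for xi in x]
y_t = [yi ** 3 for yi in y]

print(spearman(x, y), spearman(x_t, y_t))   # identical
print(pearson(x, y), pearson(x_t, y_t))     # generally differ
```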

Assume $F$ is the normal distribution function. Thus, let the multivariate random variable $Z$ be distributed as

(3)
\begin{align} Z\! \sim \! \mathcal{N}(0,\Sigma)\, . \end{align}

Assume further that the marginals of $Z$ are standardised, that is $\Sigma$ is also the Bravais-Pearson correlation matrix of $Z$. (See here for how to create linearly correlated normal variates.)

For the normal distribution, the exact relationship between Spearman correlation and Bravais-Pearson correlation is known (see ) and given by

(4)
\begin{align} \rho^B = 2 \sin\bigg(\frac{\pi}{6} \rho^S\bigg)\, . \end{align}

Equations (2) and (4) suggest a simple way to obtain a desired Bravais-Pearson correlation for uniforms: Set up the desired Bravais-Pearson correlation matrix $\Sigma$ (these are actually Spearman correlations for the normals, but will be Bravais-Pearson correlations for the uniforms). Then, find the linear correlation matrix $\Sigma^{\mathrm{adj}}$ corresponding to the Spearman matrix $\Sigma$ (i.e., adjust $\Sigma$ according to Equation (4)). Create normal variates with the adjusted correlations $\Sigma^{\mathrm{adj}}$, and transform these normals into uniforms.

Summarised in pseudocode:

(5)
\begin{eqnarray} &&{\scriptscriptstyle 1:\quad} \text{set } \Sigma \\ &&{\scriptscriptstyle 2:\quad} \text{compute } \Sigma^{\mathrm{adj}} = 2 \cdot \sin(\pi/6 \cdot \Sigma) \\ &&{\scriptscriptstyle 3:\quad} \text{create } Z\! \sim \!\mathcal{N}(0,\Sigma^{\mathrm{adj}})\\ &&{\scriptscriptstyle 4:\quad} \text{compute } U = F(Z) \end{eqnarray}

(The computations in step 2 need to be done elementwise on $\Sigma$.)
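The four steps above can be sketched in pure Python for the bivariate case (the tutorial's reference implementations in Matlab and R follow below; the Cholesky and Pearson helpers here are written out by hand only to keep the sketch self-contained):

```python
import math
import random

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / l[j][j]
    return l

def norm_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

random.seed(7)
n = 20000

# step 1: desired linear correlation between the two uniforms
rho = 0.7
# step 2: adjust elementwise via Equation (4)
rho_adj = 2.0 * math.sin(math.pi / 6.0 * rho)
sigma_adj = [[1.0, rho_adj], [rho_adj, 1.0]]

# step 3: correlated standard normals via the Cholesky factor
l = cholesky(sigma_adj)
z = []
for _ in range(n):
    e = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    z.append([l[0][0] * e[0], l[1][0] * e[0] + l[1][1] * e[1]])

# step 4: transform the normals into uniforms
u = [[norm_cdf(zi[0]), norm_cdf(zi[1])] for zi in z]

# empirical linear correlation of the uniforms should be close to rho
print(pearson([ui[0] for ui in u], [ui[1] for ui in u]))  # ≈ 0.7
```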

In fact, even without the adjustment (step 2), the method works very well for the normal-to-uniform case, since the maximum absolute difference between $\rho^B$ and $\rho^S$ after the transformation is less than 0.02. For a more detailed discussion, see for instance .
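The 0.02 bound can be verified directly: since $\rho^B = 2 \sin(\pi \rho^S / 6)$, the discrepancy incurred by skipping the adjustment is $|2\sin(\pi x/6) - x|$ for $x \in [-1,1]$, which a simple grid scan in Python shows peaks at roughly 0.018:

```python
import math

# scan |2*sin(pi*x/6) - x| over rho_S in [-1, 1]
diffs = [abs(2.0 * math.sin(math.pi * x / 6.0) - x)
         for x in (i / 10000.0 for i in range(-10000, 10001))]
print(max(diffs))  # about 0.018, below the 0.02 bound
```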

A Matlab implementation

The sample code creates 1,000 realisations of four correlated random variates, where the first two variates have a normal distribution and the other two are uniformly distributed.

% generate normals, check correlations
X = randn(1000,4);
corrcoef(X)

% desired correlation
M = [1.0  0.7  0.6  0.6;
     0.7  1.0  0.6  0.6;
     0.6  0.6  1.0  0.8;
     0.6  0.6  0.8  1.0];

% adjust entries involving the uniforms (Equation (4));
% the normal-normal entry M(1,2) stays as specified
for i = 1:4
    for j = max(3,i):4
        if i ~= j
            M(i, j) = 2 * sin(pi * M(i, j) / 6);
            M(j, i) = 2 * sin(pi * M(j, i) / 6);
        end
    end
end

% induce correlation, check correlations
C = chol(M);
Y = X * C;
corrcoef(Y)

% create uniforms, check correlations
Y(:,3:4) = normcdf(Y(:,3:4));
corrcoef(Y)

% plot results (marginals)
for i = 1:4
    subplot(2, 2, i);
    hist(Y(:, i))
    title(['Y ', int2str(i)])
end


An R implementation

The sample code creates 1,000 realisations of four correlated random variates, where the first two variates have a normal distribution and the other two are uniformly distributed.

# generate normals, check correlations
X <- array(rnorm(4000), dim = c(1000, 4))
cor(X)

# desired correlation
M <- matrix(c(1.0, 0.7, 0.6, 0.6,
              0.7, 1.0, 0.6, 0.6,
              0.6, 0.6, 1.0, 0.8,
              0.6, 0.6, 0.8, 1.0), nrow = 4)

# adjust entries involving the uniforms (Equation (4));
# the normal-normal entry M[1, 2] stays as specified
for (i in 1:4) {
    for (j in max(i, 3):4) {
        if (i != j) {
            M[i, j] <- 2 * sin(pi * M[i, j] / 6)
            M[j, i] <- 2 * sin(pi * M[j, i] / 6)
        }
    }
}

# induce correlation, check correlations
C <- chol(M)
Y <- X %*% C
cor(Y)

# create uniforms, check correlations
Y[, 3:4] <- pnorm(Y[, 3:4])
cor(Y)

# plot results (marginals)
par(mfrow = c(2, 2))
for (i in 1:4) {
    hist(Y[, i], main = paste("Y", i), xlab = "")
}