'Repairing' an Indefinite Correlation Matrix

Enrico Schumann

## Keywords

Matrix algebra, Matlab, R, Covariance matrices

Unreviewed

# Overview

There are cases where a correlation matrix is indefinite. This may happen if one arbitrarily changes entries of the matrix (e.g., for stress tests), or if the correlations are computed pairwise, but some series had missing data. A simple repair mechanism, based on the spectral decomposition of the correlation matrix, is described here.

# Computing a correlation matrix

Assume we have $T$ return observations of $n$ assets, collected in a matrix $X$ (each series is one column). The estimator for the variance-covariance matrix of these returns can analytically be written as

(1)
\begin{align} \hat{\Sigma} = \frac{1}{T} X' \underbrace{\big(I-\frac{1}{T}\iota \iota'\big)}_{M}X = \frac{1}{T} X' M X. \end{align}

Here $I$ is the identity matrix of order $T$ and $\iota$ is a $T \times 1$ vector of ones. The matrix $M$ transforms the columns of $X$ into deviations from their respective means. Since $M$ is symmetric and idempotent, we have $X'M'MX=X'MX$. $M$ is of rank $T-1$; thus, if $X$ has full column rank, one needs at least $n+1$ observations to obtain a full-rank covariance matrix.
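As a quick numerical check of equation (1), the following R sketch builds $M$ explicitly and compares the result with R's built-in `cov`, which divides by $T-1$ rather than $T$. The simulated data and the names `nobs`, `iota` and `Sigma_hat` are assumptions made for illustration.

```r
# simulated returns: nobs observations of n assets (illustrative data only)
set.seed(1)
nobs <- 50
n    <- 4
X    <- matrix(rnorm(nobs * n), nrow = nobs, ncol = n)

# centring matrix M = I - (1/T) * iota iota'
iota <- rep(1, nobs)
M    <- diag(nobs) - iota %*% t(iota) / nobs

# estimator from equation (1)
Sigma_hat <- t(X) %*% M %*% X / nobs

# M is idempotent, and M %*% X holds the demeaned columns
stopifnot(isTRUE(all.equal(M %*% M, M)))
stopifnot(isTRUE(all.equal(M %*% X, scale(X, center = TRUE, scale = FALSE),
                           check.attributes = FALSE)))

# cov() uses the divisor nobs - 1, so rescale before comparing
stopifnot(isTRUE(all.equal(Sigma_hat, cov(X) * (nobs - 1) / nobs)))
```

Building the $T \times T$ matrix $M$ is only sensible for illustration; in practice one would demean the columns directly.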

The variance-covariance matrix can be rewritten as

(2)
\begin{align} \Sigma = \underbrace{ \left[\begin{array}{cccc} \sigma_{1} & 0 & \ldots & 0\\ 0 & \sigma_{2} & \ldots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \ldots & \sigma_{n}\\ \end{array} \right]}_{D} \underbrace{ \left[ \begin{array}{cccc} 1 & \rho_{12} & \ldots & \rho_{1n}\\ \rho_{21} & 1 & \ldots & \rho_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \rho_{n1} & \rho_{n2} & \ldots & 1\\ \end{array} \right]}_{C} \underbrace{\left[ \begin{array}{cccc} \sigma_{1} & 0 & \ldots & 0\\ 0 & \sigma_{2} & \ldots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \ldots & \sigma_{n}\\ \end{array} \right]}_{D} \end{align}

where $C$ is the correlation matrix and $D$ is a matrix with the assets' standard deviations as its diagonal elements and zeros elsewhere.
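The decomposition in equation (2) can be verified numerically. The sketch below uses simulated data (an assumption for illustration) and the function `cov2cor` from R's `stats` package to obtain $C$ from $\Sigma$.

```r
# illustrative covariance matrix of three simulated series
set.seed(2)
X     <- matrix(rnorm(300), ncol = 3) %*% diag(c(0.5, 1, 2))
Sigma <- cov(X)

# D: standard deviations on the diagonal; C: correlation matrix
D <- diag(sqrt(diag(Sigma)))
C <- cov2cor(Sigma)

# Sigma = D C D, and conversely C = D^{-1} Sigma D^{-1}
stopifnot(isTRUE(all.equal(D %*% C %*% D, Sigma)))
stopifnot(isTRUE(all.equal(solve(D) %*% Sigma %*% solve(D), C)))
```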

## Definiteness and rank

The diagonal matrix $D$ is always of full rank, except if at least one standard deviation is zero. (But note that the correlation between a constant and a random variable is not defined.) In any case, $D$ is at least positive semidefinite, since standard deviations cannot be negative. Because $x'\Sigma x = (Dx)'C(Dx)$ for any vector $x$, a variance-covariance matrix that is not positive semidefinite implies that the correlation matrix $C$ is not positive semidefinite either. This suggests that it suffices to investigate the correlation matrix.

# Repairing a correlation matrix

A correlation matrix may become indefinite (i.e., have at least one positive and at least one negative eigenvalue) if one arbitrarily changes entries of the matrix (e.g., for stress tests), or if the correlations are computed pairwise, but some series had missing data.
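A small example may help to see how this happens. The sketch below starts from a valid correlation matrix and changes a single pair of entries, as one might do in a stress test; the numbers are chosen purely for illustration.

```r
# a valid correlation matrix to start from
C <- matrix(c(1.0, 0.9, 0.9,
              0.9, 1.0, 0.9,
              0.9, 0.9, 1.0), nrow = 3, byrow = TRUE)

# 'stress' a single pair: flip the sign of the correlation
# between assets 2 and 3
C[2, 3] <- C[3, 2] <- -0.9

# eigenvalues in decreasing order: 1.9, 1.9 and -0.8,
# so the matrix is now indefinite
ev <- eigen(C, symmetric = TRUE, only.values = TRUE)$values
print(ev)
stopifnot(min(ev) < 0, max(ev) > 0)
```

The modified matrix still looks like a correlation matrix (symmetric, unit diagonal, all entries between $-1$ and $1$), yet it implies that assets 2 and 3 are both strongly positively correlated with asset 1 while being strongly negatively correlated with each other, which is not possible.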

If the correlation matrix is not positive semidefinite, then at least one eigenvalue is negative. Thus, a simple strategy is to replace all negative eigenvalues by zero and to reconstruct the matrix from the modified spectral decomposition. Since the resulting matrix will in general no longer have a main diagonal of ones (which is required for a correlation matrix), one needs to rescale it. For more details, see [2, ch. 6].
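In symbols, the procedure might be written as follows. The notation ($\Lambda$, $\Lambda^{+}$, $B$, $S$, $C^{*}$) is introduced here for illustration and is not taken from the reference.

\begin{align} C &= V \Lambda V', & \Lambda &= \mathrm{diag}(\lambda_1,\ldots,\lambda_n), \\ \Lambda^{+} &= \mathrm{diag}\big(\max(\lambda_1,0),\ldots,\max(\lambda_n,0)\big), & B &= V \Lambda^{+} V', \\ S &= \mathrm{diag}\big(b_{11}^{-1/2},\ldots,b_{nn}^{-1/2}\big), & C^{*} &= S B S. \end{align}

The repaired matrix $C^{*}$ has ones on its main diagonal by construction, and it is positive semidefinite because $x'C^{*}x = (Sx)'B(Sx) \geq 0$ for any vector $x$. The rescaling requires every diagonal element $b_{ii}$ of $B$ to be strictly positive.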

# Implementation

The procedure can be implemented as follows. (The code is not necessarily the most efficient, nor the most convenient.)

## R

```r
# compute eigenvectors/-values
E   <- eigen(C, symmetric = TRUE)
V   <- E$vectors
D   <- E$values

# replace negative eigenvalues by zero
D   <- pmax(D,0)

# reconstruct correlation matrix
BB  <- V %*% diag(D) %*% t(V)

# rescale correlation matrix
T   <- 1/sqrt(diag(BB))
TT  <- outer(T,T)
C   <- BB * TT
```
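For repeated use, the steps could be wrapped in a function and checked on a small indefinite matrix. This is only a sketch: `repair_cor` is a hypothetical name, the test matrix is made up for illustration, and `cov2cor` from the `stats` package is swapped in for the explicit rescaling step.

```r
# hypothetical helper wrapping the steps: clip eigenvalues, rebuild, rescale
repair_cor <- function(C) {
    E  <- eigen(C, symmetric = TRUE)
    BB <- E$vectors %*% diag(pmax(E$values, 0)) %*% t(E$vectors)
    cov2cor(BB)   # divides element (i,j) by sqrt(BB[i,i] * BB[j,j])
}

# indefinite example: eigenvalues are 1.9, 1.9 and -0.8
C <- matrix(c( 1.0,  0.9,  0.9,
               0.9,  1.0, -0.9,
               0.9, -0.9,  1.0), nrow = 3, byrow = TRUE)

C_new <- repair_cor(C)

# unit diagonal, and no eigenvalue below zero (up to rounding error)
stopifnot(isTRUE(all.equal(diag(C_new), rep(1, 3))))
stopifnot(min(eigen(C_new, symmetric = TRUE)$values) > -1e-8)
```

In this example the repaired correlations come out at 0.5, 0.5 and -0.5, so the repair can move the entries quite far from the original values.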


## Matlab

```matlab
% compute eigenvectors/-values
[V,D]   = eig(C);

% replace negative eigenvalues by zero
D       = max(D, 0);

% reconstruct correlation matrix
BB      = V * D * V';

% rescale correlation matrix
T       = 1 ./ sqrt(diag(BB));
TT      = T * T';
C       = BB .* TT;
```