Next: Minimal Spanning Trees Up: Methods Previous: The correlation of stock   Contents

A Random Matrix Theory based analysis of stock correlations

Studying the eigensystem of the correlation matrix reveals financial information in its eigenvalues and in the corresponding eigenvectors. By comparing the eigenvalue spectrum of the correlation matrix with that of a random matrix, we can extract information about the market and about the sectors that constitute it. A random correlation matrix is defined by [63]:

\begin{displaymath}
{\sf C'}=\frac{1}{T}{\sf G}'{\sf G}'^T
\end{displaymath} (2.13)

where ${\sf G}'$ is an $N\times T$ matrix whose rows are mutually uncorrelated time series with zero mean and unit variance. For such a matrix the spectrum of eigenvalues can be calculated analytically. In the limit $N\rightarrow \infty$ and $T\rightarrow \infty$, with $Q=T/N$ fixed and greater than $1$, the probability density function of the eigenvalues of the random matrix is:
\begin{displaymath}
P_{RM}(\lambda)=\frac{Q}{2\pi}\frac{\sqrt{(\lambda_{max}-\lambda)(\lambda-\lambda_{min})}}{\lambda}
\end{displaymath} (2.14)

where
\begin{displaymath}
\lambda_{min}^{max}=\left(1\pm \frac{1}{\sqrt{Q}}\right)^2
\end{displaymath} (2.15)

are the bounds of the interval in which the probability density function is non-zero. $P_{RM}(\lambda)$ for $Q=34.6$ ($T=2321$ and $N=67$, the values for our London Stock Exchange data) is shown in Figure 2.1, where it is compared with the distribution of eigenvalues of a correlation matrix computed from shuffled time series of the original London Stock Exchange stock data.

Figure 2.1: Eigenvalue spectrum of a random correlation matrix, computed from equation 2.14 with $Q=34.6$ (bold line), compared with the normalised distribution of eigenvalues of a correlation matrix computed from shuffled time series. The shuffled series should behave like random time series while keeping the same distribution as the original time series of returns.
\begin{figure}\begin{center}\epsfysize =80mm
\epsffile{RealSpectrumEigenvalue.eps}\end{center}\end{figure}
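The comparison described above can be sketched numerically. The following is a minimal illustration, not the code used to produce Figure 2.1: the function names are hypothetical, and the returns are synthetic (a common factor plus Gaussian noise) because the London Stock Exchange data are not reproduced here. It evaluates equations 2.14 and 2.15 for $T=2321$ and $N=67$, then diagonalises the correlation matrix of the synthetic series before and after shuffling each series independently in time.

```python
import numpy as np

def mp_bounds(Q):
    """Edges of the random-matrix eigenvalue spectrum (equation 2.15)."""
    lam_min = (1.0 - 1.0 / np.sqrt(Q)) ** 2
    lam_max = (1.0 + 1.0 / np.sqrt(Q)) ** 2
    return lam_min, lam_max

def mp_density(lam, Q):
    """Eigenvalue density of equation 2.14; zero outside [lam_min, lam_max]."""
    lam = np.asarray(lam, dtype=float)
    lam_min, lam_max = mp_bounds(Q)
    inside = (lam >= lam_min) & (lam <= lam_max)
    dens = np.zeros_like(lam)
    x = lam[inside]
    dens[inside] = Q / (2.0 * np.pi) * np.sqrt((lam_max - x) * (x - lam_min)) / x
    return dens

def shuffled_correlation_eigenvalues(returns, rng):
    """Shuffle every series independently in time, then diagonalise."""
    shuffled = np.array([rng.permutation(row) for row in returns])
    return np.linalg.eigvalsh(np.corrcoef(shuffled))

N, T = 67, 2321                      # sizes quoted in the text
Q = T / N
lam_min, lam_max = mp_bounds(Q)

rng = np.random.default_rng(0)
# ASSUMPTION: synthetic stand-in for real returns (rows = stocks).
common = rng.standard_normal(T)
returns = 0.5 * common + rng.standard_normal((N, T))

eig_real = np.linalg.eigvalsh(np.corrcoef(returns))
eig_shuffled = shuffled_correlation_eigenvalues(returns, rng)

print("random-matrix bounds:", lam_min, lam_max)
print("largest eigenvalue, original series:", eig_real[-1])
print("largest eigenvalue, shuffled series:", eig_shuffled[-1])
```

Shuffling destroys the correlations between series but leaves each series' own distribution untouched, which is why the shuffled spectrum is expected to stay close to the interval of equation 2.15 while the original spectrum does not.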

If we compare the correlation matrix constructed from the real time series with the random matrix (Figure 3.3), we see that the largest eigenvalues of the real matrix are much larger than the largest eigenvalue of the random matrix. The largest eigenvalue represents something that is common to all stocks: in the corresponding eigenvector, the components of all stocks have the same sign (Figure 3.4), so all stocks participate in the same way.

The largest eigenvalue and its corresponding eigenvector can be interpreted as the collective response of the market to external factors, so it can be compared with the market index [13,14]. One way to check this is to measure the correlation between the market index and the projection of the time series onto the eigenvector associated with the largest eigenvalue (Figure 3.5). The projection is given by:

\begin{displaymath}
R^N(t)=\sum_{i=1}^{N}u_i^{N}R_i(t)
\end{displaymath} (2.16)

where $R^N(t)$ is the return of the portfolio of $N$ stocks defined by the eigenvector $u^N$, whose components $u_i^N$ act as the portfolio weights. We call it the market mode.
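As a rough illustration of equation 2.16, the sketch below builds the market mode from the eigenvector of the largest eigenvalue and correlates it with an index. Everything here is an assumption made for the example: the function name `market_mode` is hypothetical, the returns are synthetic, and the "index" is simply the common factor used to generate them rather than a real market index.

```python
import numpy as np

def market_mode(returns):
    """Largest eigenvalue, its eigenvector u, and the projection of eq. 2.16."""
    corr = np.corrcoef(returns)                # N x N, rows are stocks
    eigvals, eigvecs = np.linalg.eigh(corr)    # eigenvalues in ascending order
    u = eigvecs[:, -1]
    if u.sum() < 0:                            # eigenvector sign is arbitrary
        u = -u
    mode = u @ returns                         # R^N(t) = sum_i u_i R_i(t)
    return eigvals[-1], u, mode

rng = np.random.default_rng(0)
N, T = 20, 1000
# ASSUMPTION: synthetic one-factor returns; `index` plays the market index.
index = rng.standard_normal(T)
loadings = rng.uniform(0.5, 1.5, size=(N, 1))
returns = loadings * index + rng.standard_normal((N, T))

lam, u, mode = market_mode(returns)
corr_with_index = np.corrcoef(mode, index)[0, 1]

print("largest eigenvalue:", lam)
print("all components share a sign:", bool(np.all(u > 0)))
print("correlation of market mode with index:", corr_with_index)
```

The sign flip only fixes a convention: an eigenvector and its negative are equally valid, so the sketch orients it so that the components are positive when they all agree in sign.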

If we filter the real time series by removing the market mode from every stock, we obtain a new correlation matrix of the residuals, ${\sf C}^{res}$ [14]. One way to filter out the market mode is to use the one-factor model, or Capital Asset Pricing Model [64], in which the return of stock $i$ is expressed as:

\begin{displaymath}
R_i(t)=\alpha_i + \beta_i R^N(t) + \epsilon_i(t)
\end{displaymath} (2.17)

The first term is a constant offset (equal to the mean return of the stock when the market mode has zero mean), the second term is the influence of the market mode and the last term is the residual. If we fit every time series ($R_i(t)$ for every $i$) to the time series of the market mode ($R^N(t)$) by least-squares regression, we obtain the parameters $\alpha_i$ and $\beta_i$:
\begin{displaymath}
\begin{array}{rcl}
\alpha_i & = & \langle R_i\rangle-\beta_i \langle R^N\rangle \\
\beta_i & = & \displaystyle \frac{cov(R_i,R^N)}{\sigma_{R^N}^2}
\end{array}
\end{displaymath} (2.18)

where $\sigma_{R^N}$ is the standard deviation of the market mode and $cov()$ is the covariance.

The residuals are given by:

\begin{displaymath}
\epsilon_i(t) = R_i(t) - \alpha_i - \beta_i R^N(t)
\end{displaymath} (2.19)

and we can compute the correlation matrix of the residuals, ${\sf C}^{res}$, from these new time series.
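The filtering step of equations 2.17 to 2.19 can be sketched as follows. This is a minimal example under stated assumptions, not the procedure applied to the real data: `one_factor_filter` is a hypothetical name, the returns are synthetic, and the market mode is taken as the projection onto the eigenvector of the largest eigenvalue of the correlation matrix.

```python
import numpy as np

def one_factor_filter(returns, mode):
    """Fit equation 2.17 stock by stock; return alpha, beta and residuals."""
    mode_c = mode - mode.mean()
    ret_c = returns - returns.mean(axis=1, keepdims=True)
    var_mode = np.mean(mode_c ** 2)                 # sigma_{R^N}^2
    cov_with_mode = (ret_c @ mode_c) / mode.size    # cov(R_i, R^N) for each i
    beta = cov_with_mode / var_mode                 # equation 2.18
    alpha = returns.mean(axis=1) - beta * mode.mean()
    residuals = returns - alpha[:, None] - beta[:, None] * mode   # equation 2.19
    return alpha, beta, residuals

rng = np.random.default_rng(1)
N, T = 20, 1000
# ASSUMPTION: synthetic one-factor returns (rows = stocks).
index = rng.standard_normal(T)
returns = rng.uniform(0.5, 1.5, size=(N, 1)) * index + rng.standard_normal((N, T))

# Market mode from the eigenvector of the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(returns))
mode = eigvecs[:, -1] @ returns

alpha, beta, residuals = one_factor_filter(returns, mode)
c_res = np.corrcoef(residuals)                      # correlation matrix of residuals

top_raw = eigvals[-1]
top_res = np.linalg.eigvalsh(c_res)[-1]
print("largest eigenvalue before filtering:", top_raw)
print("largest eigenvalue after filtering: ", top_res)
```

Because the synthetic data contain a single common factor, removing the market mode should bring the largest eigenvalue down sharply. With real data, the text that follows explains that some eigenvalues stay outside the random-matrix range.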

If we analyse the eigenvalue spectrum of the new filtered matrix, some eigenvalues remain far outside the range obtained from equation 2.14. These are the eigenvalues that represent different sectors. Our main aim is to find a way to filter out this information as well, so that what remains is a set of time series containing only random information. Our approach is to use a multifactor model that includes not only a market mode but also a sector mode [65,66]. A sector mode is defined for each sector in our portfolio and is associated with the largest eigenvalue of the correlation matrix of the stocks of that sector alone, in the same way as the market mode was defined for the whole market. Our multifactor model can be written as:

\begin{displaymath}
R_i(t)=\alpha_i + \beta_i R^N(t) + \sum_{j=1}^{N_S}\gamma_{ij} R^{S_j}(t) + \epsilon_i(t)
\end{displaymath} (2.20)

where $j$ indexes the $N_S$ sectors present in the portfolio. The new term contains the sector mode $R^{S_j}(t)$ and a parameter $\gamma_{ij}$ that is non-zero only if stock $i$ belongs to sector $j$. As before, we can now filter our time series by subtracting the market and sector modes:
\begin{displaymath}
\epsilon_i(t) = R_i(t) - \alpha_i - \beta_i R^N(t) - \sum_{j=1}^{N_S}\gamma_{ij} R^{S_j}(t)
\end{displaymath} (2.21)

We use least-squares fitting to find the values of the parameters, where $j$ now denotes the sector to which stock $i$ belongs:
\begin{displaymath}
\begin{array}{rcl}
\alpha_i & = & \langle R_i\rangle-\beta_i \langle R^N\rangle-\gamma_{ij} \langle R^{S_j}\rangle \\
\beta_i & = & \displaystyle \frac{cov(R_i,R^N)-\gamma_{ij}\, cov(R^{S_j},R^N)}{\sigma_{R^N}^2} \\
\gamma_{ij} & = & \displaystyle \frac{cov(R_i,R^{S_j})\,\sigma_{R^N}^2-cov(R_i,R^N)\, cov(R^{S_j},R^N)}{\sigma_{R^{S_j}}^2 \sigma_{R^N}^2 - \left[cov(R^{S_j},R^N)\right]^2}
\end{array}
\end{displaymath} (2.22)
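The closed-form expressions of equation 2.22 can be checked against a generic least-squares solver. The sketch below does this for a single stock and a single sector; the function names and the synthetic market mode, sector mode and stock returns are assumptions made for the example, not quantities from the London Stock Exchange data.

```python
import numpy as np

def cov(a, b):
    """Sample covariance with 1/T normalisation."""
    return np.mean((a - a.mean()) * (b - b.mean()))

def two_factor_fit(r, market, sector):
    """Parameters of equation 2.22 and residual of equation 2.21 for one stock."""
    var_m = cov(market, market)          # sigma_{R^N}^2
    var_s = cov(sector, sector)          # sigma_{R^{S_j}}^2
    c_sm = cov(sector, market)           # cov(R^{S_j}, R^N)
    gamma = (cov(r, sector) * var_m - cov(r, market) * c_sm) / (var_s * var_m - c_sm ** 2)
    beta = (cov(r, market) - gamma * c_sm) / var_m
    alpha = r.mean() - beta * market.mean() - gamma * sector.mean()
    resid = r - alpha - beta * market - gamma * sector
    return alpha, beta, gamma, resid

rng = np.random.default_rng(2)
T = 2000
# ASSUMPTION: synthetic modes; the sector mode is correlated with the market mode.
market = rng.standard_normal(T)
sector = 0.6 * market + rng.standard_normal(T)
r = 0.001 + 0.8 * market + 0.5 * sector + 0.3 * rng.standard_normal(T)

alpha, beta, gamma, resid = two_factor_fit(r, market, sector)

# Cross-check with an ordinary least-squares solve on [1, R^N, R^{S_j}].
design = np.column_stack([np.ones(T), market, sector])
coef = np.linalg.lstsq(design, r, rcond=None)[0]

print("closed form :", alpha, beta, gamma)
print("lstsq solve :", coef)
```

The two routes should agree to numerical precision, since equation 2.22 is the solution of the normal equations for a regression on two correlated factors plus a constant.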

We will need to check whether this model is sufficient to represent the returns of the stocks, or whether other terms are needed, for example a term accounting for correlations between stocks of different sectors.


Ricardo Coelho 2007-05-08