Calibrating Option Pricing Models with Heuristics

## Keywords

Differential Evolution, Particle Swarm Optimisation, Heston Model, Stochastic Volatility

Unreviewed

# Abstract

Calibrating option pricing models to market prices often leads to optimisation problems to which standard methods (like such based on gradients) cannot be applied. We investigate one particular example, Heston's stochastic volatility model. We discuss how to price options under this model, and how to calibrate the parameters of the model with heuristique techniques, Differential Evolution and Particle Swarm Optimisation.

# 1 Introduction

Implied volatilities obtained by inverting the Black-Scholes-Merton (BSM) model vary systematically with strike and maturity; this relationship is called the volatility surface. Different strategies are possible for incorporating this surface into a model. We can accept that volatility is not constant across strikes and maturities, and directly model the volatility surface and its evolution. This approach is not consistent since we assume that a single underlier has different volatilities, but still, it is the approach that is mostly used in practice. An alternative is to model the option prices such that the BSM-volatility surface is obtained, for instance by including jumps, locally varying volatility, or by making volatility stochastic; a well-known example for the latter approach is the Heston model. It is the model that we look at in this paper.

As so often in finance, the success of the BSM-model stems not so much from its empirical quality, but from its computational convenience. This convenience comes in two flavours. Firstly, there are closed-form pricing equations (the Gaussian distribution function is not available analytically, but fast and precise approximations exist). Secondly, calibrating the model requires only one parameter to be determined, the volatility, which can be computed from market prices readily with Newton's method or another zero-finding technique. For the Heston-model, both tasks become more difficult. Pricing requires numerical integration, and calibration requires to find (in a riskneutral world) five parameters instead of only one for BSM.

# 2 Pricing with the characteristic function

There are several generic approaches to price options. The essence of BSM is a no-arbitrage argument which leads to a partial differential equation that can be solved numerically (in the particular case of BSM, even analytically). A more recent approach builds on the characteristic function of the (log) stock price. European options can be priced by the following equation ([2]):

(1)
\begin{align} C_0 = \mathrm{e}^{-q \tau} S_0 \Pi_1 - \mathrm{e}^{-r \tau}X \Pi_2 \end{align}

where $C_0$ is the call price today (time $0$), $S_0$ is the spot price of the underlier, and $X$ is the strike price; $r$ and $q$ are the riskfree rate and dividend yield; time to expiration is denoted $\tau$. The $\Pi_j$ are calculated as

(2)
\begin{split} \Pi_1 &= \frac{1}{2} + \frac{1}{\pi} \int_0^{\infty}\mathsf{Re}\left(\frac{\mathrm{e}^{-\mathrm{i} \omega\log(X)}\phi(\omega-\mathrm{i})}{\mathrm{i} \omega\phi(-\mathrm{i})}\right)\mathrm{d}\omega \label{eqPi1}\, ,\\ \Pi_2 &= \frac{1}{2} + \frac{1}{\pi} \int_0^{\infty}\mathsf{Re}\left(\frac{\mathrm{e}^{-\mathrm{i} \omega\log(X)}\phi(\omega)}{\mathrm{i} \omega}\right)\mathrm{d}\omega\, . \end{split}

The symbol $\phi$ stands for the characteristic function of the log stock price; the function $\mathsf{Re}(\cdot)$ returns the real part of a complex number. For a given $\phi$ we can compute $\Pi_1$ and $\Pi_2$ by numerical integration, and hence obtain option prices from Equation (1).

## Black-Scholes-Merton

In a BSM-world, the stock price $S_t$ under the risk-neutral measure follows

(3)
\begin{align} \mathrm{d} S_t = r S_t\mathrm{d} t + \sqrt{v} S_t \mathrm{d} z \end{align}

where $r$ is the riskfree rate, and $z$ is a Wiener process. The volatility $\sqrt{v}$ is constant. The well-known pricing formula for the BSM-call is given by

(4)
\begin{align} C_0 &= \mathrm{e}^{-q\tau} S_0 \operatorname{N}(d_1) - X \mathrm{e}^{-r\tau}\operatorname{N}(d_2) \end{align}

with

(5)
\begin{split} d_1 =& \frac{1}{{\displaystyle\sqrt{v \tau}}}\left({\displaystyle \log{\bigg(\frac{S_0}{X}\bigg)} + \bigg(r-q + \frac{v}{2}\bigg)\tau}\right)\\ d_2 =& \frac{1}{{\displaystyle\sqrt{v \tau}}}\left({\displaystyle \log{\bigg(\frac{S_0}{X}\bigg)} + \bigg(r-q - \frac{v}{2}\bigg)\tau}\right) = d_1-\sqrt{v \tau} \end{split}

and $\operatorname{N}(\cdot)$ the Gaussian distribution function.

Given the dynamics of $S$, the log-price $s_{\tau}=\log(S_{\tau})$ follows a Gaussian distribution with

$s_{\tau} \sim \operatorname{\mathcal{N}}\!\left(s_0 + \tau(r - \frac{1}{2}v), \tau v \right)$ ,

where $s_0$ is the natural logarithm of the current spot price. The characteristic function of $s_{\tau}$ is given by

(6)
\begin{split} \displaystyle \phi_{\mathrm{BSM}}(\omega) &= \mathsf{E}(\mathrm{e}^{\mathrm{i} \omega s_{\tau}})\\ &= \mathrm{e}^{\mathrm{i} \omega \big(s_0 + \tau(r - \frac{1}{2}v)\big) + \frac{1}{2}\mathrm{i}^2 \omega^2 v}\\ &= \mathrm{e}^{\mathrm{i} \omega s_0 + \mathrm{i} \omega \tau r - \frac{1}{2} (\mathrm{i} \omega + \omega^2 ) \tau v}\, . \end{split}

With a continuous dividend yield $q$, we replace $r$ by $r-q$. Inserting (6) into Equation (1) should, up to numerical precision, give the same result as Equation (4).

## Heston

Under the Heston model the stock price $S$ and its variance $v$ are described by

(7)
\begin{split} \mathrm{d} S_t &= r S_t \mathrm{d} t + \sqrt{v_t} S_t \mathrm{d} z^{(1)}\\ \mathrm{d} v_t &= \kappa (\theta - v_t)\mathrm{d} t + \sigma \sqrt{v_t}\mathrm{d} z^{(2)}\,. \end{split}

The long-run variance is called $\theta$, mean reversion speed is $\kappa$ and $\sigma$ is the volatility of volatility. The Wiener processes have correlation $\rho$. For $\sigma \to 0$, the Heston dynamics approach those of BSM. The characteristic function of the log-price in the Heston model is as follows, see Albrecher et al. [1].

(8)
\begin{split} \displaystyle \phi_{\mathrm{Heston}} &= \mathrm{e}^{A} \times \mathrm{e}^{B}\times \mathrm{e}^{C}\\ \nonumber A &= \mathrm{i} \omega (\log S_0 + (r-q)\tau)\\ \nonumber B &= \cfrac{\theta \kappa}{\sigma^2}\Bigg((\kappa -\rho \sigma \mathrm{i} \omega - d)\tau - 2\log\bigg(\cfrac{1-g_2\mathrm{e}^{-d\tau}}{1-g_2}\bigg)\Bigg)\\ \nonumber C &= \cfrac{\cfrac{v_0}{\sigma^2}\Big(\kappa - \rho \sigma \mathrm{i} \omega - d\Big)\Big(1-\mathrm{e}^{-d\tau}\Big)}{1-g_2\mathrm{e}^{-d\tau}}\\ \nonumber d &= \sqrt{(\rho \sigma \mathrm{i} \omega-\kappa)^2 + \sigma^2(\mathrm{i} \omega + \omega^2)}\\ \nonumber g_2&= \cfrac{\kappa - \rho \sigma \mathrm{i} \omega - d}{\kappa - \rho \sigma \mathrm{i} \omega + d} \end{split}

# Calibrating the model

Calibrating an option pricing model means to find parameters such that the model's prices are consistent with market prices, leading to an optimisation problem of the form

(9)
\begin{align} \min\ \ \sum_{i=1}^M \frac{\left|C_i^{\text{model}} - C_i^{\text{market}}\right|}{C_i^{\text{market}}} \end{align}

where $M$ is the number of market prices. Alternatively, we could specify relative deviations, use squares instead of absolute values, or introduce weighting schemes. The choice of the objective function depends on the application at hand; ultimately, it is an empirical question to determine a good objective function. Since here we are interested in numerical aspects, we will use specification (9). The problem is not convex, and hence standard methods (eg, based on derivatives of the objective function) may fail. Hence, we deploy heuristic methods, Differential Evolution and Particle Swarm, to solve problem (9).

## Calibrating the Heston model

We create artificial data sets to test DE and PS. The spot price $S_0$ is 100, the riskfree rate $r$ is 2%, there are no dividends. We compute prices for strikes from 70 to 130 (in steps of size 2), and maturities of 1/12, 3/12, 6/12, 9/12, 1, 2 and 3 years. Hence our surface comprises 31 $\times$ 7 = 217 prices. We test different parameters for the Heston model. Given a set of parameters, we compute option prices and store them as the true prices. Then we run DE and PS to solve problem (9), and see if we can recover the parameters; the setup implies that a perfect fit is possible.

Error distributions for different optimisation methods. The x-axis gives the obtained objective function value. A value of 0.03 means that the average price error was 3%. An error of 0% would have been possible.

DE and PS are coded in Matlab (following the descriptions here for DE, and here for PS). The stopping criterion is a fixed number of function evaluations (population size $\times$ generations); we run three settings: 5000, 20000 and 45000. On an Intel P8700 single core at 2.53GHz with 2 GB of RAM, a single run takes about 20, 80, and 180 seconds, respectively. (Each evaluation of the objective function involves the pricing of 217 options, hence for the last setting, in one run we compute 217$\times$150$\times$300=9765000 prices.)

We also run a gradient search and a direct search (Nelder-Mead), using the Matlab implementations fminunc and fminsearch. Again, we conduct several restarts for each set of parameters, starting values were chosen randomly in the same way the initial populations for DE and PS was set up.

For each run we store value for the objective function (the mean percentage error; Equation (9)), and corresponding estimation errors for the parameters. Results for unconstrained optimisation runs are given in the figure on the right.

The panels show the distributions of values of the objective function (pooled over all parameter sets); optimal values would be zero. For each level of function evaluations, we have results from 100 runs. The increasing number of function evaluations is marked by an increasingly dark grey. (Note that this does not concern gradient search and Nelder-Mead. Here the distributions just demonstrate that results remained similar for repeated runs.) DE and PS work well, with error distributions converging to zero, even though there are cases in which the methods fails to recover the true parameters. Gradient search and direct search fair worse; gradient search in particular returned NaN in between 10 to 20% of cases. All results come from unconstrained optimisation runs: in some cases, gradient and direct search converged to economically meaningless or impossible values (eg, correlations outside the range from -1 to 1, or negative variances).