# Calibrating Financial Models using a Non-Parametric Technique

Traditionally, asset returns have been modeled using diffusion processes, which assume that the sample path of the process being modeled is continuous. However, empirical evidence suggests that asset returns exhibit jumps, such as those that occurred during the financial crisis of 2008. The presence of jumps has implications for derivative pricing and asset allocation and thus needs to be taken into account.

In this blog post, a non-parametric method (which involves matching moments) is used to calibrate diffusion and jump-diffusion models using only the log returns observed on the stock over time. A non-parametric approach is attractive because it requires only minimal assumptions about the process that generated the returns. In addition, for complex models such as jump-diffusion processes, optimizing the parametric likelihood/transition density function is not trivial, especially since the transition density may not be bounded, as shown here.

Intuitively, the non-parametric approach involves matching the true moments of the process with the sample moments. The assumption made by this non-parametric method is that the process that generated the returns is governed by a stochastic differential equation (SDE), and that the SDE parameters depend only on the current level of the process and not on time.

# The models

In this blog post, we are going to use the non-parametric approach to calibrate the following continuous-time, time-homogeneous Markov models $X = \{X_t, t \geq 0 \}$:

Diffusion model: $dX_t = \mu(X_t)dt+ \sigma(X_t)dW_t \qquad (1)$

Jump-diffusion model: $dX_t = \mu(X_t)dt+ \sigma(X_t)dW_t +dJ_t \qquad (2)$

where $\mu(X_t)$ and $\sigma(X_t)$ are the drift and the diffusion volatility, and $dJ_t$ is the jump process. The jump process that we will be focusing on is the compound Poisson process. That is, $dJ_t = d(\sum_{i=0}^{N_t}(Y_i-1))$, where $N_t$ is a Poisson process with rate $\lambda(X_t)$ and $Y_i$ is the $i$th jump size. In this blog post we will assume that $\log(Y_i)$ is normally distributed with mean zero and variance $\sigma_y^2.$

Our aim is thus to estimate  the following parameters:

• $\mu(X_t)$ , which is the drift
• $\sigma(X_t)$, which is the diffusive volatility
• $\lambda(X_t)$, which is the jump intensity and
• $\sigma_y$, which is the volatility of the jump size.

Note that all of the above parameters are allowed to depend on the current level of the process, but not on time. The output of the estimation process will be parameter estimates that are functions of the current stock price (except for $\sigma_y$, whose estimate will be a single number as $\sigma_y$ is assumed to be independent of the stock price level $X_t$).

For a detailed background on the models that we aim to calibrate in this blog post refer to this article.

# Calibrating The Models

The non-parametric method of estimating the parameters of diffusion processes and jump-diffusion processes is explored in detail by Bandi and Nguyen in their paper. These authors used kernel smoothers to estimate the parameters of the underlying process. This is the approach that we will be taking in this blog post. That is, we will be using kernel smoothers to estimate the parameters of the standard diffusion and the jump-diffusion models described above.

#### Calibrating the Diffusion model

The SDE of a standard diffusion process is given by equation (1). The parameters which need to be estimated are $\mu(X_t)$ and $\sigma(X_t)$. These parameters completely characterize the standard diffusion process. We are going to estimate the parameters using the infinitesimal moments of the SDE as these will be functions of the parameters.

The infinitesimal moments of the standard diffusion process described in equation (1) above are given as:

$M^1(a)=\lim_{\delta t\to 0}\frac{1}{\delta t} E[X_{t+\delta t}-X_{t}|X_t=a]=\mu (a)$

$M^2(a)=\lim_{\delta t\to 0}\frac{1}{\delta t} E[(X_{t+\delta t}-X_{t})^2|X_t=a]=\sigma^2(a)$

$M^k(a)=\lim_{\delta t\to 0}\frac{1}{\delta t} E[(X_{t+\delta t}-X_{t})^k|X_t=a]=0$ for $k>2.$

The parameters can then be determined by estimating the infinitesimal conditional moments using:

$\hat \mu (a) = \hat M^1(a)=\frac{\sum_{i=1}^{n-1}K\left(\frac{X_{i\delta t}-a}{h}\right)(X_{(i+1)\delta t}-X_{i\delta t})^1}{\delta t \sum_{i=1}^{n-1}K\left(\frac{X_{i\delta t}-a}{h}\right)}$

$\hat \sigma^2(a) = \hat M^2(a)=\frac{\sum_{i=1}^{n-1}K\left(\frac{X_{i\delta t}-a}{h}\right)(X_{(i+1)\delta t}-X_{i\delta t})^2}{\delta t \sum_{i=1}^{n-1}K\left(\frac{X_{i\delta t}-a}{h}\right)}$

where $n$ is the sample size, $K(.)$ is a symmetric kernel function, $h$ is the smoothing parameter (which may be different for each moment) and $\hat \mu (a)$ and $\hat \sigma^2(a)$ are the estimators of $\mu (a)$ and $\sigma^2(a)$ respectively.

Note that $h$ is a smoothing hyperparameter that the user must choose subjectively. In this blog post we use the Gaussian kernel.
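The two estimators above can be sketched in a few lines. This is a minimal, standard-library Python illustration rather than the code used for the post; the function names `gaussian_kernel` and `drift_and_variance` are hypothetical, and the path `x` is assumed to be a plain list of levels sampled every `dt`.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian density, used as the kernel K(.)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def drift_and_variance(x, a, dt, h_drift, h_var):
    """Kernel estimates of mu(a) and sigma^2(a) from a path x sampled every dt.

    A separate bandwidth is allowed for each moment, as in the text.
    """
    num1 = den1 = num2 = den2 = 0.0
    for i in range(len(x) - 1):
        dx = x[i + 1] - x[i]                        # one-step increment
        w1 = gaussian_kernel((x[i] - a) / h_drift)  # weight for the first moment
        w2 = gaussian_kernel((x[i] - a) / h_var)    # weight for the second moment
        num1 += w1 * dx
        den1 += w1
        num2 += w2 * dx ** 2
        den2 += w2
    return num1 / (dt * den1), num2 / (dt * den2)
```

Evaluating `drift_and_variance` over a grid of levels `a` gives the estimated drift and variance as functions of the level of the process. If `a` lies many bandwidths away from every observation, the weights underflow to zero and the division fails, so the grid should stay within the observed range.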

#### Calibrating the Jump Diffusion model

The calibration approach for the jump-diffusion process (described by equation (2)) is the same as that for the standard diffusion model, except that the jump-diffusion model has non-zero higher-order (greater than 2) moments.

The infinitesimal moments of the jump-diffusion process in equation (2) are given as:

$M^1(a)=\lim_{\delta t\to 0}\frac{1}{\delta t} E[X_{t+\delta t}-X_{t}|X_t=a]=\mu (a)$

$M^2(a)=\lim_{\delta t\to 0}\frac{1}{\delta t} E[(X_{t+\delta t}-X_{t})^2|X_t=a]=\sigma^2(a)+\lambda(a) \sigma_y^2$

$M^k(a)=\lim_{\delta t\to 0}\frac{1}{\delta t} E[(X_{t+\delta t}-X_{t})^k|X_t=a]=\lambda(a) E[Y^k]$ for $k>2$
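As a short supporting derivation, assume that the jump size entering these moments is Gaussian with mean zero and variance $\sigma_y^2$ (an assumption made here for illustration; it is the reading under which the higher-order moments take a simple closed form). The odd moments of such a variable vanish and the even moments are

$E[Y^{2m}] = (2m-1)!!\,\sigma_y^{2m}$, so that $E[Y^2]=\sigma_y^2$, $E[Y^4]=3\sigma_y^4$ and $E[Y^6]=15\sigma_y^6.$

Substituting into $M^k(a)=\lambda(a)E[Y^k]$ gives

$M^4(a)=3\lambda(a)\sigma_y^4, \qquad M^6(a)=15\lambda(a)\sigma_y^6, \qquad \frac{M^6(a)}{5M^4(a)}=\sigma_y^2.$

The ratio of the sixth to the fourth moment therefore isolates $\sigma_y^2$, after which the fourth moment identifies $\lambda(a)$.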

The estimate of the $k$th  infinitesimal conditional moment is given as:

$\hat M^k(a)=\frac{\sum_{i=1}^{n-1}K\left(\frac{X_{i\delta t}-a}{h}\right)(X_{(i+1)\delta t}-X_{i\delta t})^k}{\delta t \sum_{i=1}^{n-1}K\left(\frac{X_{i\delta t}-a}{h}\right)}.$

A way to extract the parameter estimates from the above equations is to use the following sequential algorithm:

1. Obtain an estimate of $\sigma_y^2$ from $\hat\sigma_y^2=\frac{1}{n}\sum_{i=1}^{n} \frac{\hat M^6(X_{i\delta t})}{5\hat M^4(X_{i\delta t})}$
2. Obtain $\lambda(X_t)$ from $\hat\lambda(X_t)=\frac{\hat M^4(X_{t})}{3\hat\sigma_y^4}$
3. Obtain $\sigma(X_t)$ from $\hat\sigma^2(X_t)= \hat M^2(X_{t})-\hat\lambda(X_t)(\hat\sigma_y^2)$ (note that this implies that the estimated variance $\hat\sigma^2(X_t)$ can be negative under certain conditions!)
4. Obtain the drift from $\hat M^1(X_{t}).$
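The four steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the post's own code: the names `kernel_moment` and `calibrate_jump_diffusion` are hypothetical, the path `x` is a plain list sampled every `dt`, and step 1 averages over every observed level, which costs on the order of $n^2$ operations.

```python
import math


def kernel_moment(x, a, h, dt, k):
    """Gaussian-kernel estimate of the k-th infinitesimal moment at level a."""
    num = den = 0.0
    for i in range(len(x) - 1):
        # The kernel's normalising constant cancels in the ratio, so it is omitted.
        w = math.exp(-0.5 * ((x[i] - a) / h) ** 2)
        num += w * (x[i + 1] - x[i]) ** k
        den += w
    return num / (dt * den)


def calibrate_jump_diffusion(x, grid, dt, h1, h2, h4, h6):
    """Return (sigma_y^2 estimate, list of per-level estimates on `grid`)."""
    # Step 1: average M6 / (5 * M4) over the observed levels.
    ratios = [
        kernel_moment(x, xi, h6, dt, 6) / (5.0 * kernel_moment(x, xi, h4, dt, 4))
        for xi in x
    ]
    sigma_y2 = sum(ratios) / len(ratios)

    rows = []
    for a in grid:
        # Step 2: jump intensity from the fourth moment.
        lam = kernel_moment(x, a, h4, dt, 4) / (3.0 * sigma_y2 ** 2)
        # Step 3: diffusive variance; nothing forces this to be non-negative.
        var = kernel_moment(x, a, h2, dt, 2) - lam * sigma_y2
        # Step 4: drift from the first moment.
        drift = kernel_moment(x, a, h1, dt, 1)
        rows.append({"level": a, "lambda": lam, "sigma2": var, "drift": drift})
    return sigma_y2, rows
```

Because step 3 is a difference of two estimates, a caller that needs a volatility should check the sign of `sigma2` before taking a square root.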

The Gaussian kernel was used to calculate $M^k(.)$ when estimating the parameters for the jump-diffusion model.

# A simulation study

In order to assess the validity of the non-parametric model calibration, we performed a simulation study. In this study, we simulated returns from a diffusion and from a jump-diffusion model using specific parameters and then tried to recover these parameters using the non-parametric approach.

For the simulation we set $\delta t =\frac{1}{8 \times 252}$ (corresponding to hourly intervals on each business day) and simulated the respective processes over 5 years.
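One way to generate such paths is a simple Euler scheme. This is a hedged Python sketch, not the simulation code used for the post. It assumes constant parameters, at most one jump per step (occurring with probability $\lambda\,\delta t$) and zero-mean Gaussian jump sizes added directly to the process. The name `simulate_path` and its arguments are hypothetical.

```python
import math
import random


def simulate_path(mu, sigma, lam, sigma_y, dt, n_steps, x0=0.0, seed=None):
    """Euler scheme for dX = mu dt + sigma dW + dJ with constant coefficients."""
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        # Bernoulli approximation of the Poisson arrivals (assumes lam * dt << 1).
        jump = rng.gauss(0.0, sigma_y) if rng.random() < lam * dt else 0.0
        x.append(x[-1] + mu * dt + sigma * dw + jump)
    return x


dt = 1.0 / (8 * 252)   # hourly steps on business days
n_steps = 8 * 252 * 5  # five years of hourly observations
```

Setting `lam=0` gives the pure diffusion of equation (1), while a positive `lam` gives a path with jumps as in equation (2).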

#### Results for the diffusion model

The standard diffusion model was simulated with  the following parameters:

• $\mu(X_t)=0.055$, which is the drift and
• $\sigma(X_t)=0.305$ which is the standard diffusive volatility.

That is, the SDE that generated the returns was as follows:

$dX_t = 0.055dt+ 0.305dW_t$.

We used $h=3$ to estimate the drift and $h=2$ to estimate the standard diffusive vol. The estimated parameters are shown in the figures below:

Note that the bootstrap confidence bands were generated by re-sampling with replacement from the simulated stochastic process and estimating the parameters again. We used 200 bootstrap estimates to construct the confidence interval.
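One possible reading of this resampling scheme is sketched in Python here. It is an illustration under assumptions rather than the post's procedure: the pairs (level, next increment) are resampled with replacement, the kernel estimate is recomputed on each resample, and the band is read off the empirical percentiles. The function names and the default `level=0.90` are hypothetical choices.

```python
import math
import random


def kernel_moment_pairs(levels, increments, a, h, dt, k):
    """k-th moment estimate at level a from (level, increment) pairs."""
    num = den = 0.0
    for xi, dx in zip(levels, increments):
        w = math.exp(-0.5 * ((xi - a) / h) ** 2)  # Gaussian weight, constant cancels
        num += w * dx ** k
        den += w
    return num / (dt * den)


def bootstrap_band(x, a, h, dt, k, n_boot=200, level=0.90, seed=None):
    """Percentile bootstrap band for the k-th moment estimate at level a."""
    rng = random.Random(seed)
    levels = x[:-1]
    increments = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    m = len(levels)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(m) for _ in range(m)]  # resample pairs with replacement
        estimates.append(kernel_moment_pairs(
            [levels[j] for j in idx], [increments[j] for j in idx], a, h, dt, k))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lo = estimates[int(math.floor(tail * (n_boot - 1)))]
    hi = estimates[int(math.ceil((1.0 - tail) * (n_boot - 1)))]
    return lo, hi
```

Resampling pairs treats the increments as exchangeable given the level, which matches the time-homogeneous Markov assumption but ignores any remaining serial dependence.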

The wide confidence bands on the drift suggest that estimating the drift is not easy using this non-parametric approach. The estimate does improve when more data is used, but not quickly enough. This suggests that one would need $\delta t$ to be very close to zero (or many, many years of data) to estimate the drift reliably, which is not practical.
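A rough back-of-the-envelope calculation illustrates the difficulty. Assume constant coefficients and equal kernel weights (a simplification made here, not part of the estimator above), so that the drift estimate collapses to $(X_T-X_0)/T$, with $T$ the length of the sample in years. Then

$\mathrm{Var}\left[\frac{X_T-X_0}{T}\right]=\frac{\sigma^2}{T}, \qquad \mathrm{s.e.}(\hat\mu)=\frac{0.305}{\sqrt{5}}\approx 0.136,$

which is roughly two and a half times the true drift of 0.055. Bringing the standard error down to half the drift would require about $T=(2\times 0.305/0.055)^2\approx 123$ years of data.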

On the other hand, the vol has been estimated rather accurately and has narrow bootstrap confidence bands.

#### Results for the Jump Diffusion model

The jump-diffusion model was simulated using the following parameters:

• $\mu(X_t)=0$ (the drift is set to zero because a statistically significant estimate of it could not be obtained when calibrating the standard diffusion process above, so we won't bother trying to estimate it for the jump-diffusion model)
• $\sigma(X_t)=0.15$, which is the diffusive volatility
• $\lambda(X_t)= 26$, which is the jump intensity and
• $\sigma_y=0.03$, which is the volatility of the jump size.

We used $h= 2$ to estimate the second moment, $h=1$ to estimate the fourth moment and $h=0.9$ to estimate the 6th moment. The results of the estimated parameters are shown in the pictures below:

The jump size volatility was estimated to be 0.03026806, with a bootstrap confidence interval of (0.02864, 0.0325). This interval includes the true value of 0.03, which is good.

Both the jump intensity and the diffusive vol seem to be accurately estimated by this non-parametric approach, as both have narrow confidence bands that include the true values used in the simulation. However, the estimates of both parameters vary roughly linearly with the level of the process rather than staying constant, which cannot be correct as the true values are constant. This deficiency is likely due to the choice of the hyperparameter $h$.

# An empirical analysis

The Johannesburg Stock Exchange (JSE) building in Sandton. It has operated as a market place for the trading of financial products for nearly 125 years.

We now fit one of the models to empirical data to see if it produces sensible parameter estimates. The data used in this blog were observations from the JSE Top 40 index measured every minute for the 139 trading days between 6 October 2010 and  20 April 2011.

The dataset comprised 65,532 observations. On a given trading day, values of the index were observed between 08:59 and 17:00. However, continuous trading on the JSE occurs between 09:00 and 16:50. This had to be taken into account when analyzing the data, as observations lying outside the interval 09:00 to 16:50 do not correspond to continuous trading activity. In addition, observations which appeared to be outliers were removed.

Because the data set contained returns spanning weekends, we adjusted the infinitesimal moment estimators by linearly interpolating the returns over Saturday and Sunday.
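The trading-hours filter can be sketched as follows, together with thinning the minute data to a coarser grid. This is a hedged Python illustration only. The record layout (a list of `(datetime, index level)` tuples), the function name `prepare_log_prices` and the `step_minutes` parameter are hypothetical, and the outlier removal and weekend adjustment described above are not reproduced.

```python
import math
from datetime import datetime, time


def prepare_log_prices(records, open_time=time(9, 0), close_time=time(16, 50),
                       step_minutes=15):
    """Keep in-session observations on a regular grid and take logs.

    records: iterable of (datetime, index level) pairs, one per minute.
    Returns a list of (datetime, log level) pairs.
    """
    kept = []
    for stamp, level in records:
        in_session = open_time <= stamp.time() <= close_time
        # Keep only timestamps that fall on the step_minutes grid.
        on_grid = (stamp.hour * 60 + stamp.minute) % step_minutes == 0
        if in_session and on_grid and level > 0:
            kept.append((stamp, math.log(level)))
    return kept
```

With the default arguments, a full trading day of minute data is reduced to the observations at 09:00, 09:15 and so on up to 16:45.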

The final data set was sampled at 15-minute intervals, as the Lee and Mykland jump test referred to in the blog post about jump tests is most powerful at 15-minute intervals. We first performed the Lee and Mykland jump test to determine which model to fit to the JSE Top 40 returns. The results suggested that the JSE Top 40 log returns do contain jumps, so we fitted the jump-diffusion model to the data. The results (ignoring the drift) are shown below:

Note that the solid lines are the estimated parameters while the dashed lines are the 90 per cent bootstrap confidence bands.

The estimate of the jump size volatility was 0.0019 with a 90 per cent bootstrap confidence interval of (0.0017, 0.0021), which is quite narrow. This suggests that the jump size vol has been estimated precisely.

The plot for the intensity function was not as expected. One would have expected the jump intensity to be higher at larger process values as the process is likely to be more variable or have jump behavior at larger process values. However, the plot shows that  the jump intensity increases up until 10.21, and then starts decreasing. In addition, the jump intensity estimate has large confidence bands. This makes the functional form of the intensity inconclusive.

The estimate of the diffusive vol of the jump-diffusion model appears to be precise as it has narrow confidence bands. It also appears to increase slightly with the process values, which is what was expected. However, the confidence bands suggest that it could be modeled as a constant, since a horizontal straight line can be fitted within them.

# Conclusion

In this blog post, a non-parametric technique was used to estimate the parameters of simulated standard diffusion and jump-diffusion models using the procedure by Bandi and Nguyen (2003).

We then fitted the jump-diffusion model to the JSE Top 40 index, and the results indicated that the diffusion coefficient of the jump-diffusion model is arguably constant. Furthermore, the functional form of the intensity function was inconclusive due to the wide confidence bands. This may be due to misspecification of the jump size distribution.

A possible extension to the non-parametric model fitting procedure used in this blog post is to allow the parameters to depend explicitly on time, as the procedure used here does not. This model could then be calibrated using a deep neural network.

# Code Used

The R code used in this blog post is given below: