Volatility smoothing algorithms to remove arbitrage from volatility surfaces

Implied volatility surfaces and smiles constructed by fitting a cubic spline to raw market data may contain arbitrage. In fact, even if the market data points used do not contain arbitrage, cubic interpolation between data points may introduce it. It is therefore usually desirable to find the best fit of a cubic spline to the data points, under the restriction that the result be arbitrage free. Unlike the basic interpolation approach, the spline need not pass through the data points. This is called volatility smoothing.

We recommend the approach of M. R. Fengler in his paper Arbitrage-Free Smoothing of the Implied Volatility Surface. Instead of fitting a spline to the graph of volatility vs moneyness, Fengler fits it to call price vs moneyness. An advantage of this is that the no-arbitrage restrictions take a simpler form in terms of call price.

The surface fitting is done using a least squares fitting, with a number of constraints. The heart of the algorithm is therefore a constrained quadratic optimization procedure. In python, this can be achieved using scipy.optimize.minimize with the parameter method='SLSQP'. The mathematical difficulty is mainly around understanding the constraints and implementing them accurately.
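To make the shape of the optimization concrete, here is a minimal sketch of a constrained least squares fit with SLSQP. It is deliberately simplified and is not Fengler's formulation: instead of a cubic spline it fits one call price per pillar, and it assumes zero interest rates and dividends, so the no-arbitrage conditions reduce to the fitted prices being convex in strike with slopes between -1 and 0. The function names, the sample data and the uniform weights are all illustrative assumptions.

```python
import math

import numpy as np
from scipy.optimize import minimize


def bs_call(spot, strike, vol, t):
    """Black-Scholes call price, assuming zero rates and dividends."""
    d1 = (math.log(spot / strike) + 0.5 * vol**2 * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return spot * cdf(d1) - strike * cdf(d2)


def smooth_call_prices(strikes, prices, weights):
    """Weighted least squares fit of call prices under no-arbitrage constraints."""
    k = np.asarray(strikes, dtype=float)
    c = np.asarray(prices, dtype=float)
    w = np.asarray(weights, dtype=float)
    dk = np.diff(k)

    def slopes(g):
        return np.diff(g) / dk

    # SLSQP treats each "ineq" function as required to be >= 0.
    constraints = [
        {"type": "ineq", "fun": lambda g: np.diff(slopes(g))},  # convex in strike
        {"type": "ineq", "fun": lambda g: slopes(g)[0] + 1.0},  # first slope >= -1
        {"type": "ineq", "fun": lambda g: -slopes(g)[-1]},      # last slope <= 0
        {"type": "ineq", "fun": lambda g: g[-1]},               # prices non-negative
    ]
    result = minimize(
        lambda g: np.sum(w * (g - c) ** 2),
        x0=c,
        jac=lambda g: 2.0 * w * (g - c),
        method="SLSQP",
        constraints=constraints,
        options={"ftol": 1e-12, "maxiter": 200},
    )
    return result.x


# Illustrative data: flat 20% vol, with the at-the-money price pushed out of place.
strikes = [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
raw = [bs_call(1.0, k, 0.2, 1.0) for k in strikes]
raw[3] += 0.03
fitted = smooth_call_prices(strikes, raw, np.ones(len(strikes)))
```

Because convexity forces the slopes to be increasing, bounding only the first and last slope is enough to keep every slope in [-1, 0]. A spline-based version adds the spline's own linear relations between values and second derivatives, but the call to the optimizer has the same structure.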

We’ve implemented Fengler’s algorithm in python. The algorithm runs very quickly on a single vol surface. However, since historical volatility data has, for each date, a large number of vol surfaces (one for each tenor), the number of surfaces to be processed can easily proliferate into the millions. In this case one may wish to consider a C++ implementation or at least a multicore implementation in python.

To illustrate the algorithm, we start with 8 pillar points (moneyness/volatility pairs) which make up the raw data of a vol surface. We’ve deliberately chosen data which contains significant arbitrage. We’ve calculated the Black-Scholes call prices corresponding to these points and plotted them as the blue dots in the below graph.

The orange line is the arbitrage free cubic spline generated by our implementation of Fengler’s approach. You can see that it very effectively solves the problem of the out of place at the money data point which is entirely inconsistent with an arbitrage free surface.

We can also convert the call prices back to implied volatilities, yielding the following graph. For this graph, we have simply joined the data points with straight lines for illustration purposes.

We found we had to make one addition to Fengler’s approach as described in his paper. Fengler considers a set of weights for each data point in the fitting. We found we had to weight each data point by 1/vega to achieve an accurate result. This is because at the wings of the volatility surface, where vega is very small, a small change in call price corresponds to a huge change in volatility. This means that when converting the fitted call prices back to volatilities, the surface will otherwise be a very poor fit in the wings.
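The effect of small vega in the wings can be seen with a few lines of code. The sketch below uses the standard Black-Scholes vega and the first-order relation that a price error divided by vega approximates the corresponding volatility error. The spot, volatility, maturity and strike values are illustrative assumptions, not the data used in our graphs.

```python
import math


def bs_vega(spot, strike, vol, t, rate=0.0):
    """Black-Scholes vega: sensitivity of the option price to volatility."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * t) / (vol * math.sqrt(t))
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return spot * pdf * math.sqrt(t)


def vol_error_from_price_error(price_error, vega):
    """First-order estimate of the volatility error implied by a price error."""
    return price_error / vega


price_error = 1e-4  # the same small call price error at every strike
vol_errors = {}
for strike in (0.5, 0.7, 1.0, 1.3, 1.5):
    vega = bs_vega(spot=1.0, strike=strike, vol=0.2, t=0.25)
    vol_errors[strike] = vol_error_from_price_error(price_error, vega)
    print(f"strike={strike:.1f}  vega={vega:.6f}  vol error={vol_errors[strike]:.4f}")
```

The same price error translates into a far larger volatility error at the extreme strikes than at the money, which is why an unweighted fit in price space can look poor in the wings once converted back to volatility.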

Fengler’s paper is not limited to one dimensional volatility surfaces (that is, smiles). It can also be used for two dimensional volatility surfaces which incorporate both moneyness and maturity. His paper details how to extend the method to include maturity.

We provide volatility smoothing consulting, along with a wide range of quantitative finance consulting services.

Does barrier option valuation depend on volatility and interest rate term structure?

It’s well-known that vanilla option valuation does not depend on the term structure of volatility and interest rates. This means that the price depends only on the average volatility (more precisely, the average variance) and the average interest rate between the valuation date and maturity, not on how those quantities are distributed within the interval.

A way to visualize this and understand it intuitively is as follows. Consider a large set of paths of the underlying which have been generated by a Monte Carlo routine. The value of the option is the average over all paths of the quantity \(\max(S(T) - K, 0)\). Now, imagine stretching and compressing the paths in different places as if they were plasticine, corresponding to concentrating volatility more in some places than others. It’s as if the underlying were moving faster in some regions, and slower in others, yet \(S(T)\) remains the same for each path. Thus, the price remains the same.

Interest rates affect the underlying’s drift term. Yet, as for volatility, \(S(T)\) depends only on the total proportional increase that the drift term bestows on the underlying, not on where in the interval this increase occurs.
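This can be checked numerically. The sketch below assumes piecewise-constant, deterministic volatility and interest rates, prices a vanilla call by Monte Carlo under two schedules that are mirror images of each other, and compares both with the Black-Scholes price computed from the root-mean-square volatility and the time-averaged rate. The schedules, the path count and the seed are illustrative assumptions.

```python
import math
import random


def bs_call(spot, strike, vol, rate, t):
    """Black-Scholes call price with constant volatility and rate."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return spot * cdf(d1) - strike * math.exp(-rate * t) * cdf(d2)


def effective_parameters(schedule):
    """schedule is a list of (duration, vol, rate) pieces; returns (rms vol, mean rate, T)."""
    total_t = sum(d for d, _, _ in schedule)
    variance = sum(d * v * v for d, v, _ in schedule)
    rate_integral = sum(d * r for d, _, r in schedule)
    return math.sqrt(variance / total_t), rate_integral / total_t, total_t


def mc_call(spot, strike, schedule, n_paths=20000, seed=42):
    """Monte Carlo call price, stepping through the schedule piece by piece."""
    rng = random.Random(seed)
    discount = math.exp(-sum(d * r for d, _, r in schedule))
    payoff_sum = 0.0
    for _ in range(n_paths):
        log_s = math.log(spot)
        for d, vol, rate in schedule:
            log_s += (rate - 0.5 * vol * vol) * d + vol * math.sqrt(d) * rng.gauss(0.0, 1.0)
        payoff_sum += max(math.exp(log_s) - strike, 0.0)
    return discount * payoff_sum / n_paths


front_loaded = [(0.5, 0.28, 0.04), (0.5, 0.04, 0.00)]
back_loaded = [(0.5, 0.04, 0.00), (0.5, 0.28, 0.04)]

vol_eff, rate_eff, maturity = effective_parameters(front_loaded)
reference = bs_call(100.0, 100.0, vol_eff, rate_eff, maturity)
price_front = mc_call(100.0, 100.0, front_loaded)
price_back = mc_call(100.0, 100.0, back_loaded, seed=43)
print(reference, price_front, price_back)
```

Both Monte Carlo estimates should agree with the closed-form reference to within sampling noise, even though the two schedules distribute volatility and interest very differently over the life of the option.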

What about barrier options? There are a few cases to consider.

First, we consider the case of a full barrier option. This means that the barrier is monitored for the full length of the deal from the valuation date to maturity, as opposed to only being monitored for a subset of it. We also assume that the underlying’s drift term is zero (this typically occurs when interest rates are zero, for example). In this case, valuation is actually still independent of volatility term structure. This can be understood by realizing that stretching or compressing the paths in different places does not change whether they breach the barrier, but only when they breach the barrier. Thus whether a given path has knocked-in or knocked-out remains unchanged.

Next, we consider the case of a partial or window barrier option. This means that the barrier is only monitored some of the time, with the monitoring period starting after the valuation date and/or ending before maturity. We still assume that the underlying drift is zero. As mentioned above, while a different volatility term structure does not change whether a path breaches the barrier, it does change when it does. Thus, it can affect whether the path breaches the barrier inside the monitoring window or outside, thus changing whether the path knocks in/out or not. Thus, for partial and window barrier options, valuation is not independent of volatility term structure.
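A small Monte Carlo experiment illustrates the window barrier case. The sketch below assumes zero drift, a discretely monitored up-and-out barrier, and two volatility schedules with the same total variance: one concentrates volatility before the monitoring window, the other inside it. All of the numbers (barrier level, schedules, step and path counts, seed) are illustrative assumptions.

```python
import math
import random


def knock_out_probability(step_vols, monitored_steps, barrier, spot=100.0,
                          maturity=1.0, n_paths=5000, seed=7):
    """Estimate the probability that a zero-drift path breaches an upper barrier.

    step_vols holds one volatility per time step. The barrier is checked only
    at the end of steps whose (1-based) index is in monitored_steps.
    """
    rng = random.Random(seed)
    dt = maturity / len(step_vols)
    log_barrier = math.log(barrier)
    knocked = 0
    for _ in range(n_paths):
        log_s = math.log(spot)
        for step, vol in enumerate(step_vols, start=1):
            log_s += -0.5 * vol * vol * dt + vol * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            if step in monitored_steps and log_s >= log_barrier:
                knocked += 1
                break
    return knocked / n_paths


# Same total variance, distributed differently over 40 steps.
front_loaded = [0.28] * 20 + [0.04] * 20   # volatile before the window
back_loaded = [0.04] * 20 + [0.28] * 20    # volatile inside the window

window = range(20, 41)   # monitor only the second half of the deal
full = range(1, 41)      # monitor every step

p_window_front = knock_out_probability(front_loaded, window, barrier=120.0)
p_window_back = knock_out_probability(back_loaded, window, barrier=120.0)
p_full_front = knock_out_probability(front_loaded, full, barrier=120.0)
p_full_back = knock_out_probability(back_loaded, full, barrier=120.0)
print(p_window_front, p_window_back, p_full_front, p_full_back)
```

With the window covering only the second half, the schedule that places its volatility inside the window should knock out noticeably more often. With full monitoring the two estimates should be close, differing only by Monte Carlo noise and the effect of discrete monitoring.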

Finally, let’s consider the case of a non-zero drift term. In this case, valuation is not independent of volatility or interest rate term structure regardless of whether it is a full barrier option or a partial/window barrier option. To understand this, consider that the movements in the underlying due to volatility are proportional to the current underlying price. If the underlying is monotonically drifting upwards throughout the monitoring window, then volatility applied early on will cause smaller absolute changes in the underlying than if it were applied towards the end of the monitoring window. Thus, if the volatility term structure concentrates volatility towards the end of the interval, after the underlying has had time to drift upwards, the resulting movements are more likely to carry the underlying above an upper barrier. Thus, volatility term structure and interest rate term structure affect knock out / knock in probability and thus affect valuation.

GPS consulting – mathematics and software development for global positioning systems

GPS satellites and receivers are being applied in a huge number of industries including aviation, agriculture, financial fraud identification, robotics (navigation), and landscape surveying.

Developing software to process GPS data requires an understanding of the mathematics involved in GPS coordinate systems, including coordinate transformations between latitude/longitude/height and ECEF coordinates. GPS data often must be combined with other sensor data and run through a mathematical calculation to produce the required output data or system behaviour.
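As an example of the kind of transformation involved, here is a minimal sketch of the standard conversion from geodetic latitude, longitude and height to ECEF coordinates on the WGS 84 ellipsoid. The function and constant names are our own choices for illustration.

```python
import math

WGS84_A = 6378137.0              # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563    # flattening


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert geodetic latitude/longitude (degrees) and ellipsoidal height (m) to ECEF (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + height_m) * math.sin(lat)
    return x, y, z


print(geodetic_to_ecef(0.0, 0.0, 0.0))    # a point on the equator at the prime meridian
print(geodetic_to_ecef(90.0, 0.0, 0.0))   # the north pole
```

The forward conversion is closed-form. The reverse direction, from ECEF back to latitude, longitude and height, has no equally simple formula and is usually handled iteratively or with a dedicated closed-form algorithm.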

Our consultants can assist you in formulating the correct mathematical equations for your GPS application, and implement them in a variety of languages like python or C++.

Financial Computation using Nvidia GPUs.

While GPUs were originally invented for image processing, their powerful capabilities are now being applied to computation problems that have nothing to do with graphics. As GPUs have about 20x as many cores as CPUs, they can be up to 100x faster for highly parallelizable computations such as machine learning and data analysis.

Did you know that Google has used Nvidia GPUs to train its Google Translate machine learning algorithms?

In particular, Nvidia GPUs find many applications in the financial services industry, which is increasingly making use of massive data sets and AI / deep learning. GPU computation is ideal for Monte Carlo simulations, used extensively in the finance industry, as each path can be processed independently and simultaneously.

CUDA is a program development environment from Nvidia which allows users to execute the highly parallelizable part of their code on an Nvidia GPU.

Converting Volatility Surfaces from Moneyness to Delta Using an Iterative Method

It often comes up in quantitative finance that you want to convert a vol surface plotted against moneyness to a vol surface plotted against delta.

See Options, Futures and Other Derivatives by John Hull for a reference on pricing formulas for European options. In the Black-Scholes framework, the delta of a call option is given by

\[\Delta = N(d_1), \]

where \(N\) is the standard normal cumulative distribution function, and

\[d_1 = \frac{\log(S_0/K) + (r + \sigma^2/2)T}{\sigma \sqrt{T}}. \]

(For a put, it is \(\Delta = N(d_1) - 1\).) Rearranging for moneyness, we have

\[ \frac{S_0}{K} = \exp\left(N^{-1}(\Delta) \sigma \sqrt{T} - (r + \sigma^2/2)T \right). \]
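As a quick sanity check of the rearrangement, the sketch below computes the call delta from a moneyness value and then recovers the moneyness from that delta, for a constant volatility. The function names and the sample inputs are illustrative; the normal distribution functions come from Python's standard library.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def call_delta(moneyness, vol, rate, t):
    """Black-Scholes call delta N(d1), with moneyness defined as S0/K."""
    d1 = (math.log(moneyness) + (rate + 0.5 * vol**2) * t) / (vol * math.sqrt(t))
    return STD_NORMAL.cdf(d1)


def moneyness_from_delta(delta, vol, rate, t):
    """Invert the delta formula for moneyness S0/K at a known volatility."""
    quantile = STD_NORMAL.inv_cdf(delta)
    return math.exp(quantile * vol * math.sqrt(t) - (rate + 0.5 * vol**2) * t)


for m in (0.7, 1.0, 1.3):
    delta = call_delta(m, vol=0.2, rate=0.02, t=1.0)
    recovered = moneyness_from_delta(delta, vol=0.2, rate=0.02, t=1.0)
    print(f"moneyness={m:.2f}  delta={delta:.4f}  recovered={recovered:.6f}")
```

With a constant volatility the inversion is explicit. The difficulty discussed next arises because, on a real surface, the volatility itself depends on the moneyness we are trying to find.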

Now, our volatility surface would typically be specified using a number of moneyness and volatility pairs \((m_i,v_i)\) where the moneyness values would typically be something like

\[ \{m_i\} = \{0.7, 0.8, 0.9,1,1.1,1.2,1.3\}. \]

When calling for a volatility value for a moneyness in between these numbers, the firm would have implemented an interpolation function,

\[I: \text{ moneyness} \to \text{ volatility},\]

which would typically use a monotonic cubic spline. Inverting this function may be a lot of work, as it would require working out the exact coefficients generated by the cubic spline fitting. Even with an explicit formula, the spline is defined piecewise, which makes inverting it complicated.

Given some delta \(\Delta\) , we want to find a volatility \(\sigma\) such that the moneyness corresponding to that volatility according to the cubic spline interpolation is the same as the moneyness from the above formula. This requires solving the following equation for moneyness \(m\):

\[ m = \exp\left(N^{-1}(\Delta) I(m) \sqrt{T} - (r + I(m) ^2/2)T \right). \]

An equation like this should be solved numerically. This is doubly true due to the complicated definition of the function \(I\). While inverting \(I\) would be difficult, evaluating it is easy. This motivates solving using fixed point methods which only require the function to be evaluated.

What we are looking for is a fixed point of a map \(f\), i.e. a point \(m\) such that \(f(m) = m\). Thus, in the remainder of the article, we’ll look at an iterative fixed point method for solving this equation. The idea is simple. We start with some initial point \(m_0\), and repeatedly apply the map

\[ f(m) = \exp\left(N^{-1}(\Delta) I(m) \sqrt{T} - (r + I(m) ^2/2)T \right) \]

until the change in \(m\) is less than some small tolerance.
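The iteration can be sketched in a few lines. In the code below, the function smile is a hypothetical stand-in for the firm's interpolation function \(I\): we use a simple quadratic smile rather than a monotonic cubic spline, purely so the example is self-contained. The starting point, tolerance and sample inputs are also illustrative assumptions.

```python
import math
from statistics import NormalDist


def smile(m):
    """Hypothetical interpolation function I: moneyness -> volatility."""
    return 0.20 + 0.10 * (m - 1.0) ** 2


def moneyness_map(m, delta, rate, t, interp=smile):
    """One application of the map f(m) from the text."""
    vol = interp(m)
    quantile = NormalDist().inv_cdf(delta)
    return math.exp(quantile * vol * math.sqrt(t) - (rate + 0.5 * vol**2) * t)


def solve_moneyness(delta, rate, t, m0=1.0, tol=1e-12, max_iter=100):
    """Iterate m -> f(m) until successive values differ by less than tol."""
    m = m0
    for iteration in range(1, max_iter + 1):
        m_next = moneyness_map(m, delta, rate, t)
        if abs(m_next - m) < tol:
            return m_next, iteration
        m = m_next
    raise RuntimeError("fixed-point iteration did not converge")


m_star, n_iter = solve_moneyness(delta=0.25, rate=0.02, t=1.0)
vol_star = smile(m_star)
print(f"moneyness={m_star:.6f}  volatility={vol_star:.6f}  iterations={n_iter}")
```

Only evaluations of the interpolation function are needed, so the same loop works unchanged if smile is replaced by a call into an existing spline implementation. The iteration count here reflects a very tight tolerance; a looser tolerance stops much earlier.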

The critical question is: under what circumstances does this iterative procedure actually converge?

According to the Banach fixed-point theorem, this process will converge to a unique fixed point if \(f\) is a contraction mapping, which in the context of a real-valued function means

\[|f(m_1) - f(m_2)| \leq L |m_1 - m_2|, \]

for some constant \(L \in [0,1)\). This is a Lipschitz condition with constant \(L\), and for a differentiable \(f\) on an interval the smallest such constant is

\[L = \sup_m |f'(m)|, \]

where the supremum is of course taken over the domain of interest. Thus, our procedure will converge if \(\sup_m |f'(m)| < 1\), provided \(f\) maps the domain of interest into itself. We calculate,

\[ f'(m) = f(m) I'(m) \left( N^{-1}(\Delta) \sqrt{T} - I(m)T \right). \]

Numerical evidence shows that this derivative does not in general have an absolute value smaller than one, but typically does after just one iteration of our map. Our experience is that this method will almost always converge for all “reasonable” volatility surfaces, and usually within only 2 or 3 iterations!

A possible alternative to searching for a fixed point is to use Newton’s method to search for a zero of the function \(F(m) = m - f(m).\)
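A minimal sketch of the Newton alternative follows, using the derivative formula for \(f'(m)\) given above. As before, smile is a hypothetical quadratic stand-in for the interpolation function \(I\), and smile_slope is its derivative; with a real spline, the slope would have to come from the spline implementation or from a finite difference.

```python
import math
from statistics import NormalDist


def smile(m):
    """Hypothetical interpolation function I: moneyness -> volatility."""
    return 0.20 + 0.10 * (m - 1.0) ** 2


def smile_slope(m):
    """Derivative I'(m) of the hypothetical smile."""
    return 0.20 * (m - 1.0)


def f(m, delta, rate, t):
    quantile = NormalDist().inv_cdf(delta)
    vol = smile(m)
    return math.exp(quantile * vol * math.sqrt(t) - (rate + 0.5 * vol**2) * t)


def f_prime(m, delta, rate, t):
    """f'(m) = f(m) I'(m) (N^{-1}(delta) sqrt(T) - I(m) T)."""
    quantile = NormalDist().inv_cdf(delta)
    return f(m, delta, rate, t) * smile_slope(m) * (quantile * math.sqrt(t) - smile(m) * t)


def newton_moneyness(delta, rate, t, m0=1.0, tol=1e-12, max_iter=50):
    """Newton's method on F(m) = m - f(m), whose derivative is 1 - f'(m)."""
    m = m0
    for _ in range(max_iter):
        step = (m - f(m, delta, rate, t)) / (1.0 - f_prime(m, delta, rate, t))
        m -= step
        if abs(step) < tol:
            return m
    raise RuntimeError("Newton iteration did not converge")


m_newton = newton_moneyness(delta=0.25, rate=0.02, t=1.0)
print(m_newton, abs(f_prime(m_newton, 0.25, 0.02, 1.0)))
```

The printed value of \(|f'(m)|\) at the solution also gives a direct check of the contraction condition near the fixed point for this particular smile.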

Order Imbalance in Algorithmic Trading

An order imbalance occurs when the buy volume significantly exceeds the sell volume in the order book, or vice versa. Order imbalances are often caused by news of a significant development that is perceived to affect the value of the stock. It is well-known that order imbalances are an effective predictor of future stock price movement. If demand to buy exceeds the available liquidity, the price will likely move up. If demand to sell is too high for the interest on the buy side to absorb, the price will likely fall. Thus, anyone engaging in algorithmic trading will want to develop algorithms that respond effectively to imbalance signals.

A reasonable definition of order imbalance is

\[ I = \frac{V_b - V_a}{ V_b + V_a },\]

where \(V_b\) and \(V_a\) are the best (or L1) bid and ask volumes. Alternatively, and depending on the application, these volumes may be defined to include multiple levels of the limit order book (a machine learning algorithm would be well suited to determining the complicated relationship between the volume at different levels and the most probable price movement).
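In code, the definition is a one-liner. The sketch below also includes a multi-level variant that simply sums the volumes over the supplied levels; that equal weighting of levels is an assumption made for illustration, and the sample volumes are invented.

```python
def order_imbalance(bid_volume, ask_volume):
    """Order imbalance I = (V_b - V_a) / (V_b + V_a), a value in [-1, 1]."""
    total = bid_volume + ask_volume
    if total <= 0:
        raise ValueError("total volume must be positive")
    return (bid_volume - ask_volume) / total


def multi_level_imbalance(bid_volumes, ask_volumes):
    """Imbalance using the summed volume over several levels of the book."""
    return order_imbalance(sum(bid_volumes), sum(ask_volumes))


print(order_imbalance(900, 100))                                # heavy buy pressure
print(order_imbalance(100, 900))                                # heavy sell pressure
print(multi_level_imbalance([900, 200, 100], [100, 500, 600]))  # balanced once deeper levels count
```

The last call shows why the choice of levels matters: the top of the book is heavily bid, but the imbalance vanishes when the next two levels are included.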

A simple approach is described in the book High Frequency Trading by Easley et al. The authors define a “microprice” quantity as a weighted average of the bid and ask price by

\[P_\text{micro} = P_b \frac{V_a}{V_a + V_b} + P_a \frac{V_b}{V_a + V_b}, \]

where \(P_b\) and \(P_a\) are the best (or L1) bid and ask prices, and \(V_b\) and \(V_a\) are the corresponding L1 volumes. The microprice will be closer to the bid price if there is higher volume on the ask side, and closer to the ask price if there is higher volume on the bid side. They then propose to cross the spread on a buy order when \(P_\text{micro}\) is sufficiently close to the ask price, i.e.,

\[P_\text{micro} > P_a - k(P_a - P_b), \]

and analogously for a sell order. Here, \(k\) is some constant specifying the tolerance, which would have to be determined by some kind of tick data analysis technique such as machine learning.
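The rule translates directly into code. In the sketch below, the buy condition is the inequality above; the sell condition is our reading of "analogously", namely that the microprice is within the same tolerance of the bid. The quotes, volumes and the value of \(k\) are invented for illustration.

```python
def microprice(bid_price, ask_price, bid_volume, ask_volume):
    """Volume-weighted average of the best bid and ask, weighted by the opposite side's volume."""
    total = bid_volume + ask_volume
    return bid_price * ask_volume / total + ask_price * bid_volume / total


def should_cross_to_buy(bid_price, ask_price, bid_volume, ask_volume, k):
    """Cross the spread on a buy order when the microprice is close to the ask."""
    micro = microprice(bid_price, ask_price, bid_volume, ask_volume)
    return micro > ask_price - k * (ask_price - bid_price)


def should_cross_to_sell(bid_price, ask_price, bid_volume, ask_volume, k):
    """Assumed mirror-image rule: cross on a sell order when the microprice is close to the bid."""
    micro = microprice(bid_price, ask_price, bid_volume, ask_volume)
    return micro < bid_price + k * (ask_price - bid_price)


# Heavy bid-side volume pulls the microprice towards the ask.
print(microprice(100.00, 100.02, 900, 100))
print(should_cross_to_buy(100.00, 100.02, 900, 100, k=0.25))
print(should_cross_to_sell(100.00, 100.02, 100, 900, k=0.25))
```

A smaller \(k\) makes the algorithm more patient, since the microprice must sit closer to the far touch before the spread is crossed.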

In the book Algorithmic and High Frequency Trading by Cartea et al. the authors discuss a Markov chain approach to modelling the order imbalance. To discretise the problem, order imbalance values are placed into five buckets. A transition matrix is fitted to data. The transition matrix represents the probability of being in each of the five buckets at the next time step, given the current bucket. They also generate data showing the probability of positive and negative price moves based on the current order imbalance bucket.
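The bucketing and fitting steps can be sketched as follows. The bucket edges are an assumption chosen for illustration (five equal-width buckets on [-1, 1]), and the transition matrix is fitted in the simplest way, by counting observed transitions and normalising each row. The short imbalance series is invented.

```python
def bucket(imbalance, edges=(-0.6, -0.2, 0.2, 0.6)):
    """Map an imbalance in [-1, 1] to a bucket index 0..4 (assumed edges)."""
    return sum(imbalance > edge for edge in edges)


def transition_matrix(states, n_states=5):
    """Estimate P[i][j] = probability of moving from bucket i to bucket j at the next step."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for buckets never visited are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


imbalances = [-0.9, -0.5, 0.0, 0.1, 0.5, 0.9, 0.5, 0.0, -0.5, -0.9]
states = [bucket(x) for x in imbalances]
matrix = transition_matrix(states)
for row in matrix:
    print([round(p, 3) for p in row])
```

With real tick data the same counting approach can be extended to record, for each bucket, how often the next price move was up or down.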

Financial Model Validation Consulting Services

Mathematicians have an ability to think clearly and precisely that is rare among finance professionals. We’re excellently placed to provide model validation consulting services. Learn how I found a critical conceptual error in risk modelling work by one of the largest financial consulting firms in the world.

There are two main kinds of models that quantitative analysts are called on to validate in the financial services: derivative pricing models, and risk models.

Validating derivative pricing models

Much of derivative pricing theory is now completely standard and well-worn. This means that there is no question, in principle, of how to price certain derivatives. Validating derivative pricing models is thus often mainly about checking the correctness of the coding implementation. A standard way to do this is to build a second, independent model against which to compare the output of the original model. Since it’s impossible to run the two models with all possible inputs, usually one would try to generate a set of test parameters which cover every significant discrete case, such as each possible ordering of date parameters and date coincidence. Another important step is to compare the behaviour of the model to the product description, as just because the two models agree does not necessarily mean they are correctly implementing the intent in the product description.

However, not all derivatives can be priced with a well-known and standard method. Barrier derivatives can require careful work to ensure the model is converging correctly. Custom derivatives can arise which require some ingenuity to price. Examples like high-dimensional derivatives with a large number of underlying assets can require novel mathematics to price, as standard methods are simply not fast enough on current computer hardware. As mathematicians, we’re excellently placed to help you price these bespoke derivatives.

See also our derivative pricing consulting services.

Validating risk models

If pricing derivatives mainly involves applying existing theory, designing and validating risk models can be much more creative. There is no standard way to estimate financial risk, and, in our experience, much of the risk modelling work in the financial services industry is of very little value. We can build bespoke risk models, taking into account data limitations, that will be of far higher quality than what you could get from most major consulting firms.

Looking for an external model validation consultant? Please get in touch to discuss how we can meet your needs.

Derivative Pricing Consulting Services

We can help you price all kinds of derivatives from vanilla to exotic, including:

  • Calls, puts, forwards and futures (FX/equity)
  • American options and exotic options with early exercise optionality
  • Knock in / knock out barrier options and window barrier options. See our article about barrier options and volatility/interest rate term structure. Also, be sure to see the paper by KS Moon for improving the efficiency of Monte Carlo pricing using a Brownian bridge.
  • Fixed and variable coupons
  • Warrants
  • Pnotes (promissory notes)
  • Dividend futures

We use a variety of derivative pricing methods including Monte Carlo, Black-Scholes, Finite Difference, and Longstaff-Schwartz.

Also check out our article on converting volatility surfaces from moneyness to delta using an iterative procedure.

Algorithmic Trading Consulting Services

We can backtest and optimize your trading strategies against historical data.

We can also automate your trading strategies by writing C++/python code to interact directly with the exchange.

Do you have an idea for a trading strategy, but want to prove that it will work through backtesting against historical data? Or do you have a successful trading strategy but want to optimize the parameters of the strategy to maximise returns?

Or perhaps you’ve heard about machine learning and would like to find out how you could incorporate it into your trading. Machine learning can be used to trawl through large amounts of data looking for statistically significant signals to use in your trading. It can also be used to determine the optimal way to combine a number of possible signals or ideas into a single algorithm.

We provide cloud-based PhD quant support for traders. We offer trading algorithm development services for equity and FX markets on all major exchanges. We also offer bitcoin and cryptocurrency algorithmic trading services on major exchanges like Binance and Bitmex.

Our consulting services for algorithmic trading include:

  • Backtesting of strategies, strategy optimization and statistical analysis
  • Automating algorithms (trading bots) in languages like C++ and python
  • Applying machine learning techniques like neural networks to trading
  • Processing and analysis of large amounts of data to search for trading signals.
  • Pricing of vanilla and exotic derivatives
  • Mathematical and statistical research projects
  • General quantitative analysis – see our main page Quant Consulting.

Individual traders and smaller financial institutions may lack the quantitative expertise to design or implement trading algorithms, which involves elements of coding, mathematics, statistics and data analysis. Quantitative finance is a field where complex mathematics thrives, so that even sizable firms may wish to undertake projects which are beyond their in-house mathematical expertise. In particular, many firms are interested in dipping their feet into machine learning trading techniques, but lack the necessary internal resources.

Our staff of experienced mathematical researchers can solve sophisticated quantitative problems efficiently, and communicate the results clearly to professionals of all backgrounds. We specialize in advanced mathematical and statistical analysis, and we love a challenge! We can explain and implement the results of sophisticated academic papers and turn them into practical outcomes for your business.

Want to learn more about how cloud-based quant support can supercharge your trading? Contact us today for a frank discussion about the merits of quantitative (or algorithmic) trading.

For examples of the applications of algorithms to trading, see our article on optimal execution algorithms, our article on market making, or our article on algorithms to take advantage of order imbalances.

If you’re just getting started with algorithmic trading, check out our introductory guides to algo trading on various exchanges.

For some examples of backtesting and optimizing trading strategies, take a look at the following articles.

Optimal Execution in Algorithmic Trading

Individual investors who only trade in small volumes usually do not need to consider an execution strategy. But institutional investors who wish to trade a large number of shares, such as investment banks, hedge funds and mutual funds, encounter the issue that large trades cause adverse price movements. If they attempt to trade the whole amount in one go, liquidity will thin out and they’ll quickly move through less and less favorable bid/ask levels in the limit order book. For this reason, traders will often attempt to split large orders into a series of smaller trades over a period of time. But there are trade offs to a more passive execution strategy. Firstly, there is the opportunity risk that prices will move unfavorably before you liquidate the whole order. Secondly, other traders may notice what you are doing and react accordingly.

Electronic exchanges typically make available not just the best bid and ask prices for a given security, but several layers of bid and ask prices along with the corresponding volumes. This data, known as “market microstructure”, is exactly what we need to inform trading algorithms which attempt to optimize execution.
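To see how a large order moves through the book, consider the following minimal sketch. It assumes a static snapshot of the ask side, given as (price, volume) pairs sorted from best to worst, and computes the average price paid by a market buy order of a given size. The book and order sizes are invented for illustration.

```python
def sweep_asks(ask_levels, quantity):
    """Average fill price for a market buy that consumes the ask levels in order."""
    remaining = quantity
    cost = 0.0
    for price, volume in ask_levels:
        fill = min(remaining, volume)
        cost += fill * price
        remaining -= fill
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough liquidity in the displayed book")
    return cost / quantity


book = [(100.00, 500), (100.01, 300), (100.03, 400)]
print(sweep_asks(book, 500))    # fills entirely at the best ask
print(sweep_asks(book, 1000))   # walks through three levels
```

The larger order pays a worse average price because it exhausts the best level and continues into deeper ones. A real execution algorithm must also account for the book refilling and reacting between child orders, which this static snapshot ignores.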

In this article, we’ll explore some of the mathematics around algorithms which optimize execution strategies.

See also our article on optimal liquidation using the Almgren-Chriss model.

Optimal execution as an optimal stopping problem

The problem of optimally executing a large trade can be cast as one of a familiar class of problems in mathematics – optimal stopping problems.

  • Optimal stopping problems in mathematics are a category of problems where, at any time \(t\), you can choose to “stop”, and receive a certain reward which changes with time. The trouble is you only know what the reward is for the current time and earlier times. You do not know what the reward will be in the future. Should you “cash in your chips” now, or wait, and hope the reward will be better in the future?

The relevance to finance here is obvious. Choosing when to execute trades or exercise an American option are both examples of optimal stopping problems.
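A toy example, unrelated to any particular market, shows how such problems are typically solved by working backwards from the final time. Suppose you may roll a fair die up to a fixed number of times and, after each roll, either stop and keep the face value or roll again. The sketch below computes the expected reward under optimal play; the game itself is purely an illustrative assumption.

```python
def stopping_values(n_rolls, faces=(1, 2, 3, 4, 5, 6)):
    """values[k] is the expected reward with k rolls remaining under optimal play."""
    values = [0.0]  # with no rolls left there is nothing to collect
    for _ in range(n_rolls):
        continuation = values[-1]
        # After seeing a face, stop if it beats the value of continuing.
        values.append(sum(max(face, continuation) for face in faces) / len(faces))
    return values


values = stopping_values(3)
print(values)
```

The optimal rule is to stop whenever the current face is at least the value of continuing with the remaining rolls. The same backward logic underlies the pricing of American options, where the continuation value must be estimated rather than computed exactly.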

Optimal execution using stochastic control theory

In the book Algorithmic and High Frequency Trading by Cartea et al., the authors describe the use of stochastic optimal control and stopping methods to attack this problem.

  • Control theory is a field of mathematics that has applications to a wide range of engineering problems. Abstractly, the concept is to find a given “input” to a dynamical system to achieve a desired “output” from the system. A simple example is cruise control, in which the throttle (system input) must be dynamically adjusted to achieve the desired constant speed (system output), for example, when the car begins going uphill. In the case of finance, the dynamical system is typically a stock price \(S(t)\) (typically assumed to follow geometric Brownian motion) or other market information, the system input is the choice of trading strategy, and the system output is the profit of the market participant. The goal is to choose the strategy or “input” which maximises profit.
  • Stochastic control theory is a subfield of control theory in which the time evolution of the dynamical system is not completely determined by the system input, but also contains a stochastic or probabilistic element. Financial applications of control theory obviously fall squarely into this category, since market behaviour such as stock prices are not determined solely by the actions of a single market participant, but have a very significant random element.

Finding an optimal strategy or algorithm for market interaction is a problem that arises across a wide range of trading and investing problems. Many of these problems can be cast as stochastic control problems.

Optimal execution using machine learning

Another possible approach to the optimal execution problem is to put to one side attempts to find an optimal theoretical solution, and allow algorithms to trawl through the vast quantities of freely available data and try to determine an effective strategy empirically.

Machine learning is increasingly in vogue in a wide range of fields, including finance. See this useful summary of a report issued by J.P.Morgan about the future of data science and big data in the financial services industry.

In their paper Reinforcement Learning for Optimized Trade Execution, Nevmyvaka et al. examine the effectiveness of machine learning in finding effective execution strategies. See also the more recent paper Double Deep Q-Learning for Optimal Execution by Ning et al.

The execution strategy to be optimized will take as input, at regular time intervals \(t_i\) for \(i=0,\ldots,n\) , a set of observable market variables (principally market microstructure). It produces as output a limit order, or ask price, at which we are willing to execute all remaining inventory. The algorithm may not wait around forever for the best possible price – so it is reasonable to assume that there is a maximum time \(t_n = T\) at which all remaining inventory must be executed regardless of market prices.

By taking a large number of different stocks, and by considering the same stock at different times, we have a large number of data sets of the form \(S(t_i)\) for \(i=0,\ldots,n\). Our execution strategy must depend on time, because as we near the end of the interval, we are running out of time to transact the remaining inventory. As mentioned in the paper by Nevmyvaka et al., it’s reasonable to make the Markovian assumption that the optimal strategy depends only on market microstructure at the current time step, and not on what it may have been at previous time steps. Thus, whether proceeding by machine learning or by some other method, determining the optimal execution strategy for each step can be done by working backwards from the final time step, in a similar manner to how one prices an American option using Monte Carlo. At time \(t_n = T\) we already know what the strategy is – we must execute all remaining shares regardless of market prices. At time \(t_{n-1}\), for each individual data set we would like to execute at time \(t_{n-1}\) any shares that can be executed at a price equal to or better than the price available at time \( T \). The machine learning algorithm must determine, on average after considering all the data sets, what is the optimal map from the market microstructure (bid/ask levels and volumes) to a price at which we are willing to transact at that time step. Continuing to work backwards, we eventually come up with an optimal strategy at each step.

The question remains – what kind of relationship should we assume between the market microstructure and the optimal transaction price at the same time step? We might, with some careful thought, make a guess as to the form of the mathematical function, so that the machine learning algorithm will optimize the parameters. Another option is to use a neural network which may succeed in finding the form of the relationship by itself. Experience shows that making little effort and expecting a machine learning algorithm to work magic is not always successful. Guiding the process using human theoretical insight and human empirical observation, and then using machine learning techniques merely to optimize, will often yield the best results.