Introductory Guides to Algorithmic Trading

Each exchange (or broker) provides slightly different services and features to traders wishing to automate their strategies as algos.

  • Some exchanges provide their own high-level trading language allowing people unfamiliar with conventional coding languages to implement and test simple algorithms. I don’t favor this approach as I would rather take advantage of the powerful features of a language like Python.
  • Some exchanges provide a simple API key allowing you to interface with the exchange from your favorite language. Others require that you download additional software in order to interface (and authenticate) with the exchange.
  • Some exchanges provide a test account allowing you to test your code without having to risk money on live trades.
  • Some exchanges do not allow customers to connect to their API unless they meet certain requirements. For example, TradeStation requires that your account have $10,000 of cash deposited before they will email you your API key. This is very bothersome if you initially intend to just develop and test your algorithm, and only invest your money at an appropriate time in the future.

Here we provide a few exchange specific guides, outlining how to get started interfacing with the exchange, grabbing price histories and posting buy/sell orders:

Algo Trading Crypto on Binance Using Python

Consulting Services

Are you interested in developing an automated algorithm to trade crypto on Binance? Have a successful strategy already that you want automated in order to monitor a large number of data streams 24/7? Want your strategy backtested and optimized? We offer algorithmic trading consulting services for spot, futures and option trading on Binance, including: trading bot implementation in python or C++, data analysis, backtesting and machine learning. Please get in touch to learn more!

About Binance

Binance is one of the world’s largest cryptocurrency exchanges, offering:

  • Spot trading on around 100 digital currencies including Bitcoin and Ethereum
  • Up to 125x leverage on perpetual futures contracts
  • At the money American call and put options with 5 minute to 1 day expiries

Note that Binance has been banned by regulators in some countries such as the US and the UK due to concerns about the compliance of cryptocurrency exchanges with anti-money-laundering laws (competitor Bitmex is in a similar situation). Binance.US is an alternative which is designed to comply with US regulations.

Crypto exchanges are keen to develop cryptocurrency derivative products such as futures and European or American options. But keep in mind that some countries like Australia, Germany, Italy and the Netherlands only allow trading in spot, as they have banned derivatives including futures, options and leverage. Regulators are concerned that retail investors may be unaware of the risk involved in derivative products given the high volatility of cryptocurrencies.

Initial setup

Since cryptocurrency markets do not close overnight, algorithmic trading using a crypto bot is the only practical way to monitor your positions 24/7.

First you need to make sure you have an installation of python. I recommend downloading the Anaconda distribution which comes with the Spyder IDE. In addition, you’ll want the python library python-binance, which can be obtained by opening an anaconda prompt from the start menu and typing

pip install python-binance

In addition, an API key is needed to give your Python installation permission to trade on your Binance account. After creating a Binance account, go to the profile icon in the top right corner and click “API Management”. Then place these lines at the top of your Python code:

from binance import Client, ThreadedWebsocketManager, ThreadedDepthCacheManager

api_key = "YOUR_API_KEY"
api_secret = "YOUR_SECRET_KEY"
client = Client(api_key, api_secret)

Here, api_key and api_secret are of course the two keys you obtained from Binance.

Backtesting data

Binance market data for backtesting purposes can be downloaded here. Spot and futures data are available with three file types as shown below. As the raw data comes without headers, I’ve included screenshots below showing the headers for convenience.

AggTrades

Klines

Trades
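Since the files come without headers, it helps to attach column names when loading them. The following is a minimal sketch for the Klines file type only. The column order in KLINE_COLUMNS is an assumption based on the usual Binance kline layout and should be checked against the exchange documentation, the function name read_klines is hypothetical, and the sample row is made up for illustration.

```python
import csv
import io

# Assumed column order for headerless kline files (verify against the exchange docs).
KLINE_COLUMNS = [
    "open_time", "open", "high", "low", "close", "volume",
    "close_time", "quote_asset_volume", "number_of_trades",
    "taker_buy_base_volume", "taker_buy_quote_volume", "ignore",
]

def read_klines(handle):
    """Parse headerless kline rows into dicts with numeric prices and volume."""
    rows = []
    for raw in csv.DictReader(handle, fieldnames=KLINE_COLUMNS):
        row = dict(raw)
        for key in ("open", "high", "low", "close", "volume"):
            row[key] = float(row[key])
        row["open_time"] = int(row["open_time"])  # timestamp as stored in the file
        rows.append(row)
    return rows

# Made-up sample row standing in for a downloaded file.
sample = io.StringIO(
    "1609459200000,28923.63,28961.66,28913.12,28961.66,27.457032,"
    "1609459259999,794382.04,1292,16.777195,485390.82,0\n"
)
klines = read_klines(sample)
print(klines[0]["close"])
```

For a real file, pass an open file handle instead of the StringIO object, for example open(path, newline="").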

Basic commands to get you started

From there, you can read the Binance API documentation to learn the basic commands.

To get the current price of BTCUSDT you can use

client.get_symbol_ticker(symbol="BTCUSDT")
Out: {'symbol': 'BTCUSDT', 'price': '51096.07000000'}

If you want to receive an updated price only when it has changed, you can stream prices by creating a threaded websocket manager. The function “update_price” defines what to do whenever some new information “info” is received from the exchange. In this case it appends the latest last, bid and ask prices onto a list of historical prices and prints the new entry to the console.

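btc_history = []  # list of price snapshots, newest last

def update_price(info):
    # Called with each ticker message received from the exchange.
    btc_latest = {}
    btc_latest['last'] = info['c']  # last traded price
    btc_latest['bid'] = info['b']   # best bid price
    btc_latest['ask'] = info['a']   # best ask price
    btc_history.append(btc_latest)
    print(btc_history[-1])

twm = ThreadedWebsocketManager()
twm.start()
twm.start_symbol_ticker_socket(callback=update_price, symbol='BTCUSDT')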

To buy/sell, simply use client.create_order:

order = client.create_order(
    symbol='BTCUSDT',
    side='BUY',
    type='LIMIT',
    timeInForce='GTC',
    quantity=100,
    price=51097)
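In practice, exchanges typically reject orders whose price or quantity is not a whole multiple of the symbol's price increment (tick size) or quantity increment (step size). The sketch below shows one way to round values down before passing them to an order call. The helper name round_down_to_step and the increments used here are hypothetical; the real increments must be looked up for each symbol.

```python
from decimal import Decimal, ROUND_DOWN

def round_down_to_step(value, step):
    """Round value down to a whole multiple of step, returned as a string."""
    value_d, step_d = Decimal(str(value)), Decimal(str(step))
    steps = (value_d / step_d).to_integral_value(rounding=ROUND_DOWN)
    return format(steps * step_d, "f")

tick_size = "0.01"     # hypothetical price increment
step_size = "0.00001"  # hypothetical quantity increment

price = round_down_to_step(51097.128, tick_size)
quantity = round_down_to_step(0.0012345678, step_size)
print(price, quantity)
```

Decimal arithmetic is used instead of floats so that the rounded values do not pick up binary rounding noise such as 51097.119999.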

Market Making Algorithms and Models

A market maker provides liquidity to the market by standing ready to both buy and sell an asset at stated bid and ask prices. They are common for both Forex markets and stock exchanges, and there are even many firms acting as market makers for bitcoin and other cryptocurrencies. The value of market making to traders is that they are able to execute trades immediately, rather than having to wait for a matching order to appear. In exchange, the market maker generates a profit by setting an appropriate spread between the bid and ask prices. A market making algorithm must determine appropriate bid and ask prices to maximise profits. There are two trade offs that a market maker must consider when trying to achieve optimal market behaviour.

Firstly, there is a trade-off between volume and margin. If the market maker’s bid-ask spread is too wide (conservative), few of his quotes will be filled. On the other hand, if his spread is too narrow (aggressive), many quotes will be filled but he will make very little money from each trade. So the bid-ask spread must be sufficiently attractive to other market participants while still remaining profitable for the market maker.

Secondly, while market makers can profit from the bid ask spread, they are exposed to risk due to price changes on the inventory of the asset that they must hold. If the price drops, the inventory may have to be sold at less than it was acquired for. The market maker must therefore design a quoting algorithm which optimally sets bid and ask prices to generate a profit, while also minimising inventory risk. A market maker may hope to buy and sell in approximately equal quantities to avoid accumulating a large inventory. Market making algorithms are relevant not just to genuine market makers, but to any market participant that both buys and sells an asset. One mechanism a market making algorithm can use to reduce inventory risk is to provide more conservative bid estimates when it is already long a significant inventory.

Market making strategies differ from more general trading strategies in that the latter may take on a large position based on some view of the direction the market will move in, while the market maker attempts to avoid this risky bet as much as possible.

The Avellaneda-Stoikov model

The Avellaneda-Stoikov model is a simple market making model that can be solved for the bid and ask quotes the market maker should post at each time \(t\).

We consider the case of a market maker on a single asset with price trajectory \(S_t\) evolving under Brownian motion

\[ dS_t = \sigma dW_t.\]

While this implies a normally distributed price rather than lognormally distributed, the difference is not significant over small time horizons where \(S_t\) does not move too much from its original value.

Let \(S_t^b\) and \(S_t^a\) represent the bid and ask quotes of the market maker at time \(t\), and let \(N_t^b\) and \(N_t^a\) represent the total number of units the market maker has bought at his bid and sold at his ask respectively. The model assumes that sellers arrive to sell to the market maker at random, with an average frequency that decreases as the bid price \(S_t^b\) drops further below \(S_t\). Similarly, buyers arrive to purchase from the market maker with an average frequency that decreases as the ask price \(S_t^a\) rises further above \(S_t\). This means that the more conservatively the market maker sets his bid and ask quotes, the less likely he is to make trades.

Furthermore, the model assumes that the market maker must keep his inventory \(q_t\) between some values \(-Q\) and \(Q\). He does this by not posting a bid quote when his inventory reaches \(Q\), and by not posting an ask quote when it reaches \(-Q\).

For simplicity, the model assumes that each buyer purchases exactly one unit. Since the market maker earns \(S_t^a\) whenever a buyer arrives, and spends \(S_t^b\) whenever a seller arrives, his cash account satisfies the equation

\[dX_t = S_t^a dN_t^a - S_t^b dN_t^b.\]

We assume that the market maker wishes to optimize his behavior over some time interval \([0,T]\). We want to find functions of time \(S_t^b\) and \(S_t^a\) which maximise the expected value of his final holdings of cash and inventory

\[X_T + q_TS_T.\]

However, in such problems it is also typical to penalise the variance of this quantity in the optimization to factor in risk aversion. One can optimize such a function using stochastic control theory. For the exact form of the solutions and for more details see The Financial Mathematics of Market Liquidity by Gueant.
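To give a feel for what the solution looks like, here is a minimal sketch of the well-known approximate quotes from the original Avellaneda-Stoikov paper. It is not the bounded-inventory solution treated by Gueant. It assumes no inventory limits, an order arrival intensity of the form \(A e^{-k\delta}\) for a quote placed a distance \(\delta\) from the mid price, and a risk aversion parameter \(\gamma\). Under those assumptions the quotes are centred on a reservation price \(S_t - q_t \gamma \sigma^2 (T-t)\) with total spread \(\gamma \sigma^2 (T-t) + \frac{2}{\gamma}\log(1 + \gamma/k)\). The function name and parameter values are illustrative only.

```python
import math

def avellaneda_stoikov_quotes(mid, inventory, gamma, sigma, k, time_left):
    """Approximate bid and ask quotes (unbounded-inventory approximation)."""
    # Reservation price: the mid price shifted against the current inventory.
    reservation = mid - inventory * gamma * sigma**2 * time_left
    # Total spread, quoted symmetrically around the reservation price.
    spread = gamma * sigma**2 * time_left + (2.0 / gamma) * math.log(1.0 + gamma / k)
    return reservation - spread / 2.0, reservation + spread / 2.0

# Illustrative parameters, not calibrated to any market.
flat_bid, flat_ask = avellaneda_stoikov_quotes(100.0, 0, 0.1, 2.0, 1.5, 1.0)
long_bid, long_ask = avellaneda_stoikov_quotes(100.0, 5, 0.1, 2.0, 1.5, 1.0)
print(flat_bid, flat_ask)
print(long_bid, long_ask)
```

With a long inventory both quotes shift downwards, which makes the market maker less likely to buy more and more likely to sell.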

Optimal Liquidation Algorithms – the Almgren-Chriss Model

Unwinding or liquidating a position is a trade off. Liquidate too quickly and you may suffer price slippage as the market order walks the book. Liquidate too slowly with more conservative limit orders, and you are exposed to the risk of adverse price moves. The concept of splitting a large order into a number of smaller orders to be executed over a certain time period is well-known to traders. Exchanges and many other market participants are therefore motivated to develop liquidation algorithms which behave optimally. In this post we’ll discuss the Almgren-Chriss model. For more details consult The Financial Mathematics of Market Liquidity by Gueant.

We assume a trader wants to unwind \(q_0\) shares in a time interval \([0,T]\). Writing \(q_t\) for the trader’s inventory at time \(t\), we write

\[dq_t = v_t dt, \]

where \(v_t < 0\) is the rate of liquidation. If the trades were executed in a finite number of discrete blocks, then \(v_t\) would be a sum of delta functions, for example. The mid price of the stock is modelled as

\[ dS_t = \sigma dW_t + kv_t dt\]

for \(k>0\). The first term is simply Brownian motion, although note that the decision is made for simplicity to assume a normally distributed price instead of the usual lognormally distributed price. The second term means that the price drops by an amount proportional to the number of stocks our trader executes. This is the permanent market impact.

But the most significant equation here is the equation representing how the rate of liquidation \(v_t\) affects the price obtained for the shares. This is the instantaneous part of the market impact, which in the model has no permanent impact on the market price. We assume that the price obtained for the shares executed at time \(t\) is

\[S_t + g\left(\frac{v_t}{V_t}\right),\]

where \(V_t\) represents the total market volume and \(g<0\) when \(v_t < 0\). The choice of increasing function \(g\) is actually the key to the model. It quantifies how much worse the average price obtained for the shares traded at time \(t\) is when the rate of liquidation \(v_t\) is higher (i.e. more negative). The original model of Almgren and Chriss chose the function \(g\) to be linear. This means that if the trader liquidates twice as many shares at time \(t\), the average price obtained for those shares will be twice as far from the mid price. The cash earnt by the trader is then simply the number of shares liquidated multiplied by the average price obtained, i.e.

\[dX_t = - v_t\left( S_t + g\left(\frac{v_t}{V_t}\right) \right) dt.\]

If the midprice were assumed to be close to constant over time, the optimal strategy would be to liquidate as slowly as possible. This would mean that the shares would all be sold at close to the mid price. However, liquidators are not only unwilling to wait forever, but also typically wish to liquidate the portfolio at close to the current market price. Liquidating over a longer time interval means that the price may fluctuate away from the current price. Some kind of “risk appetite” consideration must therefore be included in the model.

This requirement is not actually encoded in the differential equation for \(X_t\) above. Rather, it is encoded in the quantity we wish to optimize. The way this is done is to not simply optimize the final cash holding \(X_T\), but also to penalise its variance. This can be done by choosing the function to be optimized as something like \(\mathbb{E}(X_T) - \frac{\gamma}{2} \mathbb{V}(X_T)\) or \(\mathbb{E}(-e^{- \gamma X_T})\), for some constant \(\gamma > 0\). How much one penalises variance by choosing \(\gamma\) is essentially an arbitrary decision in the model. Of course, longer trading horizons give rise to more variance in \(X_T\) because \(S_t\) becomes less predictable when allowed more time to drift. Thus this parameter will determine the rate of liquidation based on risk appetite.

Finding the optimal trading strategy \(q(t)\) is a variational problem which requires minimising the function

\[J(q) = \int_0^T{\left(V_tL\left(\frac{q'(t)}{V_t}\right) + \frac{1}{2} \gamma \sigma^2 q(t)^2\right)dt},\]

where \(L(\rho) = \rho g(\rho) \).
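As a worked special case, suppose \(g\) is linear, \(g(\rho) = \eta \rho\) with \(\eta > 0\), so that \(L(\rho) = \eta \rho^2\), and suppose the market volume is a constant \(V\). Both are simplifying assumptions. The Euler-Lagrange equation for \(J\) then reduces to \(q''(t) = \kappa^2 q(t)\) with \(\kappa = \sqrt{\gamma \sigma^2 V / (2\eta)}\), and the boundary conditions \(q(0) = q_0\), \(q(T) = 0\) give \(q(t) = q_0 \sinh(\kappa (T-t)) / \sinh(\kappa T)\). The sketch below evaluates this schedule; the parameter values are made up for illustration.

```python
import math

def almgren_chriss_inventory(t, q0, T, gamma, sigma, eta, V):
    """Inventory q(t) minimising J(q) when L(rho) = eta * rho**2 and V_t = V."""
    kappa = math.sqrt(gamma * sigma**2 * V / (2.0 * eta))
    return q0 * math.sinh(kappa * (T - t)) / math.sinh(kappa * T)

# Illustrative parameters only.
params = dict(q0=100_000, T=1.0, gamma=1e-6, sigma=0.5, eta=0.1, V=4_000_000)
schedule = [almgren_chriss_inventory(i / 10, **params) for i in range(11)]
for i, q in enumerate(schedule):
    print(f"t = {i / 10:.1f}  inventory = {q:,.0f}")
```

A larger \(\gamma\) increases \(\kappa\) and front-loads the liquidation, while \(\gamma \to 0\) approaches a straight-line schedule.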

Gueant also discusses several extensions of the model, including:

  • Incorporating a drift term into the equation for the evolution of the stock price to allow the trader an opinion on the future trajectory of the stock
  • Placing a lower and/or upper bound on the liquidation rate
  • Considering the liquidation of portfolios of multiple stocks

The Almgren-Chriss model implemented in practice

If you attempted to implement the Almgren-Chriss model in practice, there are a number of issues that would arise. In particular, you would need to specify the parameters of the model, which may be difficult to determine.

The first is the shape of the market impact function, which represents the manner in which the price moves as you execute a certain volume of the asset. A simple assumption is a linear market impact function. However, it depends on the structure of the order book, which could take many different shapes, and may change over time. If you have access to the order book data, you could investigate whether the order book shape is sufficiently constant over time to warrant doing some kind of backtest/fitting. But your execution strategy would cease to be optimal if the shape of the order book deviated from your assumptions. And if you don’t have access to the order book data, this is going to be much harder.

The second is the risk appetite parameter, or how much one penalizes the variance in the final PnL. There are two competing factors in the optimal solution. First, the slower you liquidate the better the price you get. Second, the slower you liquidate the more likely the price will move. It’s pretty much arbitrary how to choose to balance these two competing factors. And, of course, there may be other reasons why you need to liquidate your entire inventory within a certain amount of time, regardless.

The third is your view on the likely future movement of the asset. Clearly, this will have a profound impact on your execution strategy. For example, if you believed the price was going to drop significantly soon, you’d want to use a high rate of liquidation to make sure you had liquidated your inventory before the asset drops too much. But if you had no view on the future asset trajectory, you could neglect this issue.

And finally, something not considered in the model is the need to make sure your execution strategy is unpredictable so other market participants can’t anticipate your trades. A predictable rate of execution is a great way to get taken advantage of.

Despite the above, studying this model is a great way to clarify your thinking before designing an execution strategy that suits your own specific application.

Volatility smoothing algorithms to remove arbitrage from volatility surfaces

Need help building a volatility smoothing algorithm? Our quant consulting service can help. Contact us today.

Implied volatility surfaces and smiles constructed by fitting a cubic spline to raw market data may contain arbitrage. In fact, even if the market data points used do not contain arbitrage, cubic interpolation between data points may introduce it. It is therefore usually desirable to find the best fit of a cubic spline to the data points, under the restriction that the result be arbitrage free. Unlike the basic interpolation approach, the spline need not pass through the data points. This is called volatility smoothing.

We recommend the approach of M. R. Fengler in his paper Arbitrage-Free Smoothing of the Implied Volatility Surface. Instead of fitting a spline to the graph of volatility vs moneyness, Fengler uses call price vs moneyness. An advantage of this is that the no-arbitrage restrictions take on a simpler form in terms of call price.
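To illustrate why call prices are convenient, the standard static no-arbitrage conditions in the strike direction are easy to check on a grid: call prices must be non-increasing in strike, must not fall faster than the discount factor, and must be convex in strike. The sketch below checks these conditions on discrete points expressed in strike rather than moneyness. It is a simple diagnostic under those assumptions, not Fengler's algorithm, and the prices are made up.

```python
def call_price_arbitrage_violations(strikes, calls, discount_factor=1.0, tol=1e-12):
    """List violations of the static no-arbitrage conditions in strike."""
    violations = []
    slopes = [
        (calls[i + 1] - calls[i]) / (strikes[i + 1] - strikes[i])
        for i in range(len(strikes) - 1)
    ]
    for i, slope in enumerate(slopes):
        # Call prices must fall with strike, but no faster than the discount factor.
        if slope > tol or slope < -discount_factor - tol:
            violations.append(("slope", strikes[i], strikes[i + 1]))
    for i in range(len(slopes) - 1):
        # Convexity in strike: slopes must be non-decreasing (no butterfly arbitrage).
        if slopes[i + 1] < slopes[i] - tol:
            violations.append(("convexity", strikes[i + 1]))
    return violations

strikes = [80, 90, 100, 110, 120]
calls = [21.0, 12.5, 9.0, 2.6, 0.9]  # made-up prices with a misplaced point at 100
violations = call_price_arbitrage_violations(strikes, calls)
print(violations)
```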

The surface fitting is done using a least squares fitting, with a number of constraints. The heart of the algorithm is therefore a constrained quadratic optimization procedure. In Python, this can be achieved using scipy.optimize.minimize with the parameter method='SLSQP'. The mathematical difficulty is mainly around understanding the constraints and implementing them accurately.
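The following is a heavily simplified sketch of the constrained least squares idea. Rather than fitting Fengler's cubic spline, it adjusts the call prices at the given strikes directly so that they are decreasing and convex in strike, which is a discrete stand-in for the spline constraints. The function name, the equal default weights and the data are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def smooth_call_prices(strikes, prices, weights=None, discount_factor=1.0):
    """Weighted least squares fit of call prices under discrete no-arbitrage constraints."""
    strikes = np.asarray(strikes, dtype=float)
    prices = np.asarray(prices, dtype=float)
    w = np.ones_like(prices) if weights is None else np.asarray(weights, dtype=float)

    def objective(c):
        return np.sum(w * (c - prices) ** 2)

    def slopes(c):
        return np.diff(c) / np.diff(strikes)

    # SLSQP treats each "ineq" constraint as fun(c) >= 0, element by element.
    constraints = [
        {"type": "ineq", "fun": lambda c: np.diff(slopes(c))},           # convex in strike
        {"type": "ineq", "fun": lambda c: -slopes(c)},                   # non-increasing
        {"type": "ineq", "fun": lambda c: slopes(c) + discount_factor},  # slope >= -DF
        {"type": "ineq", "fun": lambda c: c},                            # non-negative
    ]
    result = minimize(objective, prices, method="SLSQP", constraints=constraints)
    return result.x

strikes = [80, 90, 100, 110, 120]
raw = [21.0, 12.5, 9.0, 2.6, 0.9]  # made-up prices that violate convexity at 100
smoothed = smooth_call_prices(strikes, raw)
print(np.round(smoothed, 3))
```

The fitted prices no longer pass exactly through the raw data, which is the trade-off described above.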

We’ve implemented Fengler’s algorithm in python. The algorithm runs very quickly on a single vol surface. However, since historical volatility data has, for each date, a large number of vol surfaces (one for each tenor), the number of surfaces to be processed can easily proliferate into the millions. In this case one may wish to consider a C++ implementation or at least a multicore implementation in python.

To illustrate the algorithm, we start with 8 pillar points (moneyness/volatility pairs) which make up the raw data of a vol surface. We’ve deliberately chosen data which contains significant arbitrage. We’ve calculated the Black-Scholes call prices corresponding to these points and plotted them as the blue dots in the below graph.

The orange line is the arbitrage free cubic spline generated by our implementation of Fengler’s approach. You can see that it very effectively solves the problem of the out of place at-the-money data point which is entirely inconsistent with an arbitrage free surface.

We can also convert the call prices back to implied volatilities, yielding the following graph. For this graph, we have simply joined the data points by straight lines for illustration purposes.

We found we had to make one addition to Fengler’s approach as described in his paper. Fengler considers a set of weights for each data point in the fitting. We found we had to weight each data point by 1/vega to achieve an accurate result. This is because at the wings of the volatility surface, where vega is very small, a small change in call price corresponds to a huge change in volatility. This means that when converting the fitted call prices back to volatilities, the surface will otherwise be a very poor fit in the wings.
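The effect is easy to see numerically. The sketch below computes Black-Scholes vega for a few strikes and the corresponding 1/vega weights, assuming no dividends and a flat volatility; the inputs are illustrative.

```python
import math
from statistics import NormalDist

def black_scholes_vega(spot, strike, rate, vol, maturity):
    """Sensitivity of the Black-Scholes price to volatility (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * maturity) / (
        vol * math.sqrt(maturity)
    )
    return spot * math.sqrt(maturity) * NormalDist().pdf(d1)

strikes = [60, 80, 100, 120, 140]
weights = [1.0 / black_scholes_vega(100.0, k, 0.0, 0.2, 0.25) for k in strikes]
for k, w in zip(strikes, weights):
    print(f"strike {k}: weight {w:.4g}")
```

The weights grow very quickly away from the money, so the fit is forced to match wing call prices much more tightly than at-the-money ones.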

Fengler’s paper is not limited to one dimensional volatility surfaces (that is, smiles). It can also be used for two dimensional volatility surfaces which incorporate both moneyness and maturity. His paper details how to extend the method to include maturity.

We provide volatility smoothing consulting, along with a wide range of quantitative finance consulting services.

You may also wish to check out our article on converting volatility surfaces between moneyness and delta.

Does barrier option valuation depend on volatility and interest rate term structure?

It’s well-known that vanilla option valuation does not depend on the term structure of volatility and interest rates. This means that the price depends only on the average volatility and average interest rate between the valuation date and maturity, not on how those quantities are distributed within the interval.

A way to visualize this and understand it intuitively is as follows. Consider a large set of paths of the underlying which have been generated by a Monte Carlo routine. The value of the option is the average over all paths of the quantity \(\max(S(T) - K, 0)\). Now, imagine stretching and compressing the paths in different places as if they were plasticine, corresponding to concentrating volatility more in some places than others. It’s as if the underlying were moving faster in some regions, and slower in others, yet \(S(T)\) remains the same for each path. Thus, the price remains the same.

Interest rates affect the underlying’s drift term. Yet, as for volatility, \(S(T)\) depends only on the total proportional increase that the drift term bestows on the underlying, not on where in the interval this increase occurs.

What about barrier options? There are a few cases to consider.

First, we consider the case of a full barrier option. This means that the barrier is monitored for the full length of the deal from the valuation date to maturity, as opposed to only being monitored for a subset of it. We also assume that the underlying’s drift term is zero (this typically occurs when interest rates are zero, for example). In this case, valuation is actually still independent of volatility term structure. This can be understood by realizing that stretching or compressing the paths in different places does not change whether they breach the barrier, but only when they breach the barrier. Thus whether a given path has knocked-in or knocked-out remains unchanged.

Next, we consider the case of a partial or window barrier option. This means that the barrier is only monitored some of the time, with the monitoring period starting after the valuation date and/or ending before maturity. We still assume that the underlying drift is zero. As mentioned above, while a different volatility term structure does not change whether a path breaches the barrier, it does change when it does. Thus, it can affect whether the path breaches the barrier inside the monitoring window or outside, thus changing whether the path knocks in/out or not. Thus, for partial and window barrier options, valuation is not independent of volatility term structure.
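The stretching argument can be checked with a small Monte Carlo experiment. The sketch below assumes zero drift, an up-and-out call, and a volatility that takes one value on \([0, T/2]\) and another on \([T/2, T]\). Paths are simulated on a grid in accumulated variance rather than calendar time, so two term structures with the same total variance share exactly the same paths and differ only in which part of each path falls inside the monitoring window. All names and parameter values are illustrative.

```python
import math
import random

S0, STRIKE, BARRIER, T = 100.0, 100.0, 120.0, 1.0
N_PATHS, N_STEPS = 2000, 100

def up_and_out_call(sigma_first, sigma_second, window_only, seed=42):
    """Zero-drift Monte Carlo price of an up-and-out call on a variance-time grid.

    The volatility is sigma_first on [0, T/2] and sigma_second on [T/2, T].
    If window_only is True the barrier is monitored on [0, T/2] only.
    """
    rng = random.Random(seed)  # same seed gives the same paths in variance time
    total_var = 0.5 * T * (sigma_first**2 + sigma_second**2)
    window_var = 0.5 * T * sigma_first**2  # variance accumulated by time T/2
    d_var = total_var / N_STEPS
    log_barrier = math.log(BARRIER)
    payoff_sum = 0.0
    for _ in range(N_PATHS):
        log_s, var, alive = math.log(S0), 0.0, True
        for _ in range(N_STEPS):
            # Keep drawing after a knock-out so the random streams stay aligned.
            log_s += -0.5 * d_var + math.sqrt(d_var) * rng.gauss(0.0, 1.0)
            var += d_var
            monitored = (not window_only) or var <= window_var + 1e-12
            if monitored and log_s >= log_barrier:
                alive = False
        if alive:
            payoff_sum += max(math.exp(log_s) - STRIKE, 0.0)
    return payoff_sum / N_PATHS

full_front = up_and_out_call(0.3, 0.1, window_only=False)
full_back = up_and_out_call(0.1, 0.3, window_only=False)
window_front = up_and_out_call(0.3, 0.1, window_only=True)
window_back = up_and_out_call(0.1, 0.3, window_only=True)
print("full barrier:  ", full_front, full_back)
print("window barrier:", window_front, window_back)
```

The two full barrier prices coincide because every path knocks out or survives regardless of the term structure. The window barrier prices differ because front-loaded volatility pushes more of each path's movement into the monitored half.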

Finally, let’s consider the case of a non-zero drift term. In this case, valuation is not independent of volatility or interest rate term structure regardless of whether it is a full barrier option or a partial/window barrier option. To understand this, consider that the movements in the underlying due to volatility are proportional to the current underlying price. If the underlying is monotonically drifting upwards throughout the monitoring window, then volatility applied early on will cause smaller changes in the underlying than if it were applied towards the end of the monitoring window. Thus, if the volatility term structure concentrates volatility towards the end of the interval, after the underlying has had time to drift upwards, the underlying is more likely to rise above an upper barrier. Thus, volatility term structure and interest rate term structure affect knock out / knock in probability and thus affect valuation.

GPS consulting – mathematics and software development for global positioning systems

GPS satellites and receivers are being applied in a huge number of industries including aviation, agriculture, financial fraud identification, robotics (navigation), and landscape surveying.

Developing software to process GPS data requires an understanding of the mathematics involved in GPS coordinate systems, including coordinate transformations between latitude/longitude/height and ECEF coordinates. GPS data often must be combined with other sensor data and run through a mathematical calculation to produce the required output data or system behaviour.
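As an example of the kind of transformation involved, the sketch below converts latitude, longitude and height above the ellipsoid to ECEF coordinates using the standard closed-form formula. It assumes the WGS84 ellipsoid; the function name is illustrative.

```python
import math

WGS84_A = 6378137.0                   # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563         # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert geodetic latitude/longitude/height to ECEF x, y, z in metres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z

print(geodetic_to_ecef(0.0, 0.0, 0.0))   # point on the equator at the prime meridian
print(geodetic_to_ecef(90.0, 0.0, 0.0))  # north pole
```

The inverse transformation from ECEF back to latitude, longitude and height has no equally simple closed form and is usually done iteratively.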

Our consultants can assist you in formulating the correct mathematical equations for your GPS application, and implement them in a variety of languages like python or C++.

Financial Computation using Nvidia GPUs

While GPUs were originally invented for image processing, their powerful capabilities are now being applied to computation problems that have nothing to do with graphics. As GPUs have about 20x as many cores as CPUs, they can be up to 100x faster for highly parallelizable computations such as machine learning and data analysis.

Did you know that Google has used Nvidia GPUs to train its Google Translate machine learning algorithms?

In particular, Nvidia GPUs find many applications in the financial services industry, which is increasingly making use of massive data sets and AI / deep learning. GPU computation is ideal for Monte Carlo simulations, used extensively in the finance industry, as each path can be processed independently and simultaneously.

CUDA is a program development environment from Nvidia which allows users to execute the highly parallelizable part of their code on an Nvidia GPU.

Converting Volatility Surfaces from Moneyness to Delta Using an Iterative Method

It often comes up in quantitative finance that you want to convert a vol surface plotted against moneyness to a vol surface plotted against delta.

See Options, Futures and Other Derivatives by John Hull for a reference on pricing formulas for European options. In the Black-Scholes framework, the delta of a call option is given by

\[\Delta = N(d_1), \]

where \(N\) is the standard normal cumulative distribution function, and

\[d_1 = \frac{\log(S_0/K) + (r + \sigma^2/2)T}{\sigma \sqrt{T}}. \]

(For a put, it is \(\Delta = N(d_1) - 1\).) Rearranging for moneyness, we have

\[ \frac{S_0}{K} = \exp\left(N^{-1}(\Delta) \sigma \sqrt{T} - (r + \sigma^2/2)T \right). \]

Now, our volatility surface would typically be specified using a number of moneyness and volatility pairs \((m_i,v_i)\) where the moneyness values would typically be something like

\[ \{m_i\} = \{0.7, 0.8, 0.9,1,1.1,1.2,1.3\}. \]

When calling for a volatility value for a moneyness in between these numbers, the firm would have implemented an interpolation function,

\[I: \text{ moneyness} \to \text{ volatility},\]

which would typically use a monotonic cubic spline. Inverting this function may be a lot of work, as it would require working out the exact coefficients generated by the cubic spline fitting. Even with an explicit formula, the spline is defined piecewise, which makes inverting it complicated.

Given some delta \(\Delta\) , we want to find a volatility \(\sigma\) such that the moneyness corresponding to that volatility according to the cubic spline interpolation is the same as the moneyness from the above formula. This requires solving the following equation for moneyness \(m\):

\[ m = \exp\left(N^{-1}(\Delta) I(m) \sqrt{T} - (r + I(m) ^2/2)T \right). \]

An equation like this should be solved numerically. This is doubly true due to the complicated definition of the function \(I\). While inverting \(I\) would be difficult, evaluating it is easy. This motivates solving using fixed point methods which only require the function to be evaluated.

What we are looking for is a fixed point of the map \(f\) , i.e. a point \(m\) such that \(f(m) = m\). Thus, in the remainder of the article, we’ll look at an iterative fixed point method for solving this equation. The idea is simple. We start with some initial point \(m_0\), and repeatedly apply the map

\[ f(m) = \exp\left(N^{-1}(\Delta) I(m) \sqrt{T} - (r + I(m) ^2/2)T \right) \]

until the change in \(m\) is less than some small tolerance.
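A minimal sketch of this iteration is shown below. The quadratic function smile is a hypothetical stand-in for the firm's interpolation function \(I\), and the function name, starting point and tolerance are illustrative choices. Moneyness is taken to be \(S_0/K\), as in the formula above.

```python
import math
from statistics import NormalDist

def smile(m):
    """Hypothetical stand-in for the interpolation function I(m)."""
    return 0.2 + 0.3 * (m - 1.0) ** 2

def moneyness_for_delta(delta, T, r, interp=smile, m0=1.0, tol=1e-10, max_iter=100):
    """Fixed-point iteration m -> f(m); returns (moneyness, volatility)."""
    z = NormalDist().inv_cdf(delta)  # N^{-1}(delta)
    m = m0
    for _ in range(max_iter):
        vol = interp(m)
        m_next = math.exp(z * vol * math.sqrt(T) - (r + 0.5 * vol**2) * T)
        if abs(m_next - m) < tol:
            return m_next, interp(m_next)
        m = m_next
    raise RuntimeError("fixed-point iteration did not converge")

m, vol = moneyness_for_delta(delta=0.25, T=0.5, r=0.02)
print(m, vol)
```

Plugging the returned moneyness and volatility back into the formula for \(d_1\) should reproduce the requested delta, which is a useful sanity check on any implementation.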

The critical question is: under what circumstances does this iterative procedure actually converge?

According to the Banach fixed-point theorem, this process will converge to a unique fixed point if \(f\) is a contraction mapping, which in the context of a real-valued function means

\[|f(m_1) - f(m_2)| \leq L |m_1 - m_2|, \]

for some constant \(L \in [0,1)\). This is also known as the Lipschitz condition, and it is well known that

\[L = \sup_m |f'(m)|, \]

where the supremum is of course taken over the domain of interest. Thus, our procedure will converge if \(\sup_m |f'(m)| < 1\). We calculate,

\[ f'(m) = f(m) I'(m) \left( N^{-1}(\Delta) \sqrt{T} - I(m)T \right). \]

Numerical evidence shows that this derivative does not in general have an absolute value smaller than one, but typically does after just one iteration of our map. Our experience is that this method will almost always converge for all “reasonable” volatility surfaces, and usually within only 2 or 3 iterations!

A possible alternative to searching for a fixed point is to use Newton’s method to search for a zero of the function \(F(m) = m - f(m).\)

Order Imbalance in Algorithmic Trading

An order imbalance occurs when the buy volume significantly exceeds the sell volume in the order book, or vice versa. Imbalances are often caused by news of a significant development that is perceived to affect the value of the stock. It is well-known that order imbalances are an effective predictor of future stock price movement. If demand to buy exceeds the available liquidity, the price will likely move up. If demand to sell is too high for the interest on the buy side to absorb, the price will likely fall. Thus, anyone engaging in algorithmic trading will want to develop algorithms that respond effectively to imbalance signals.

A reasonable definition of order imbalance is

\[ I = \frac{V_b - V_a}{ V_b + V_a },\]

where \(V_b\) and \(V_a\) are the best (or L1) bid and ask volumes. Alternatively, and depending on the application, these volumes may be defined to include multiple levels of the limit order book (a machine learning algorithm would be well suited to determining the complicated relationship between the volume at different levels and the most probable price movement).

A simple approach is described in the book High Frequency Trading by Easley et al. The authors define a “microprice” quantity as a weighted average of the bid and ask price by

\[P_\text{micro} = P_b \frac{V_a}{V_a + V_b} + P_a \frac{V_b}{V_a + V_b}, \]

where \(P_b\) and \(P_a\) are the best (or L1) bid and ask prices, and \(V_b\) and \(V_a\) are the corresponding L1 volumes. The micro price will be closer to the bid price if there is higher volume on the ask side, and closer to the ask price if there is higher volume on the bid side. They then propose to cross the spread on a buy order when \(P _\text{micro}\) is sufficiently close to the ask price, i.e.,

\[P_\text{micro} > P_a - k(P_a - P_b), \]

and analogously for a sell order. Here, \(k\) is some constant specifying the tolerance, which would have to be determined by some kind of tick data analysis technique such as machine learning.
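The imbalance, the microprice and the crossing rule translate directly into code. The sketch below uses L1 quantities only; the function names, the sample quotes and the value of \(k\) are made up for illustration.

```python
def order_imbalance(bid_volume, ask_volume):
    """Imbalance in [-1, 1]; positive when the bid side is heavier."""
    return (bid_volume - ask_volume) / (bid_volume + ask_volume)

def microprice(bid, ask, bid_volume, ask_volume):
    """Bid and ask prices weighted by the volume on the opposite side."""
    total = bid_volume + ask_volume
    return bid * ask_volume / total + ask * bid_volume / total

def should_cross_to_buy(bid, ask, bid_volume, ask_volume, k):
    """True when the microprice is within k spreads of the ask."""
    return microprice(bid, ask, bid_volume, ask_volume) > ask - k * (ask - bid)

# Made-up top-of-book snapshot with a heavy bid side.
bid, ask, bid_volume, ask_volume = 100.0, 100.5, 900, 100
imbalance = order_imbalance(bid_volume, ask_volume)
micro = microprice(bid, ask, bid_volume, ask_volume)
cross = should_cross_to_buy(bid, ask, bid_volume, ask_volume, k=0.2)
print(imbalance, micro, cross)
```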

In the book Algorithmic and High Frequency Trading by Cartea et al. the authors discuss a Markov chain approach to modelling the order imbalance. To discretise the problem, order imbalance values are placed into five buckets. A transition matrix is fitted to data. The transition matrix represents the probability of being in each of the five buckets at the next time step, given the current bucket. They also generate data showing the probability of positive and negative price moves based on the current order imbalance bucket.
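Fitting such a transition matrix amounts to counting how often each bucket is followed by each other bucket. The sketch below assumes five equal-width buckets on \([-1, 1]\), which is a simplifying choice rather than necessarily the bucketing used by Cartea et al., and uses a short made-up imbalance series.

```python
def imbalance_bucket(imbalance, n_buckets=5):
    """Map an imbalance in [-1, 1] to a bucket index 0..n_buckets-1."""
    index = int((imbalance + 1.0) / 2.0 * n_buckets)
    return min(max(index, 0), n_buckets - 1)

def fit_transition_matrix(imbalances, n_buckets=5):
    """Estimate P(next bucket | current bucket) from consecutive observations."""
    counts = [[0] * n_buckets for _ in range(n_buckets)]
    buckets = [imbalance_bucket(x, n_buckets) for x in imbalances]
    for current, following in zip(buckets, buckets[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observations are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Made-up imbalance observations at consecutive time steps.
series = [-0.9, -0.5, 0.0, 0.5, 0.9, 0.5, 0.0, 0.5]
matrix = fit_transition_matrix(series)
for row in matrix:
    print(row)
```

With real tick data the series would be far longer, and the same counting approach can be extended to tabulate the sign of the next price move for each bucket.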