You might think that taking advantage of a price discrepancy between venues sounds pretty simple: if the prices aren’t the same, buy on the cheaper exchange and sell on the more expensive one. It seems entirely simple and totally risk-free! Unfortunately, as we’ll see in this article, the reality is neither simple nor risk-free.
The complexities of execution
The first things that deviate from the simple picture above are fees, slippage and the bid-ask spread. If the bid on exchange A is higher than the ask on exchange B, \(\text{Bid}_A > \text{Ask}_B\), then the profit is actually
\[\text{profit} = \text{Bid}_A - \text{Ask}_B - \text{fees} - \text{slippage}.\]
The slippage (the price movement when we transact a volume larger than the first order book entry) contains a component from each exchange, and must be calculated from the order book data. Alternatively, we could drop the slippage term in this equation and instead replace the bid and ask with their volume-weighted average prices (VWAP) for the quantity being traded.
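To make the VWAP version of the profit equation concrete, here is a minimal sketch. It assumes each side of the order book is a list of `(price, size)` tuples with the best price first, and that fees are a simple proportional rate on the traded notional. The function names, the book layout and the fee model are all illustrative assumptions, and the result is the total profit for the whole quantity rather than a per-unit figure.

```python
def vwap(levels, quantity):
    """Average price paid or received when filling `quantity` by walking `levels`.

    `levels` is a list of (price, size) tuples, best price first (assumed layout).
    Raises ValueError if the book is too thin to fill the whole quantity.
    """
    remaining = quantity
    notional = 0.0
    for price, size in levels:
        take = min(remaining, size)
        notional += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("not enough depth to fill quantity")
    return notional / quantity


def net_profit(bids_a, asks_b, quantity, fee_rate_a, fee_rate_b):
    """Total profit from selling `quantity` into A's bids and buying it from B's asks.

    Slippage is captured implicitly by using VWAPs instead of top-of-book prices.
    Fees are assumed proportional to notional on each venue.
    """
    sell_price = vwap(bids_a, quantity)
    buy_price = vwap(asks_b, quantity)
    fees = quantity * (sell_price * fee_rate_a + buy_price * fee_rate_b)
    return quantity * (sell_price - buy_price) - fees


# Hypothetical order book snapshots
bids_a = [(100.5, 1.0), (100.4, 2.0)]
asks_b = [(100.0, 0.5), (100.1, 3.0)]
print(net_profit(bids_a, asks_b, quantity=2.0, fee_rate_a=0.001, fee_rate_b=0.001))
```

Note how an apparent top-of-book gap of 0.5 per unit shrinks once we walk the book and pay fees on both legs.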
However, there’s still more to take into account: latency and price movement. If an arbitrage opportunity appears at time \(t\), your two trades aren’t actually executed until times \(t+L_A\) and \(t+L_B\), where the latencies \(L_A\) and \(L_B\) can be different and consist of things like:
- The time between an arbitrage opportunity entering existence on the exchanges, and the information reaching your system
- The time taken for your system to process the data and become aware of the opportunity
- The time taken for your buy/sell orders to reach the exchanges
- The time taken for your orders to be processed and executed by the exchanges
Adverse price movement may occur during the latency periods, leading to the sum of the two trades no longer being profitable.
No free lunch
Now you may be thinking – what if I only post limit orders? Then the worst case scenario is that my trades don’t execute and I lose nothing!
But hold on, that’s not true – the worst case scenario is actually that one of the trades executes and the other doesn’t – leaving you holding inventory you didn’t want, whose value may fall.
To avoid this difficulty, cross-venue arbitrage algorithms often use market orders. Although this means the two trades could lose money, it also ensures that the trades always execute, and execute quickly, which reduces adverse price movement risk.
What this means is that, contrary to what you might have assumed, there is no risk-free way to try to exploit cross-exchange arbitrage. It also means that price prediction and probabilistic modelling become a significant part of any cross-exchange arbitrage system.
Probabilistic modelling and machine learning
As we’ve seen, by the time your system detects an arbitrage, the relevant prices are already stale. And the prices may move still further by the time your orders are executed on the exchange. In fact our profit equation is now
\[\text{profit} = \text{Bid}_A (t+L_A) - \text{Ask}_B (t+L_B) - \text{fees} - \text{slippage},\]
where the bid and ask at the execution times are random variables. The slippage is also a random variable, and should really be separated into \( \text{slippage} = \text{slippage}_A + \text{slippage}_B\).
A simple way to model the price at a time in the future is to assume a normal model
\[dP = \sigma dW,\]
where \(\sigma\) is the volatility and \(W\) is a Wiener process (Brownian motion). Now you may object that market prices are often modelled using geometric Brownian motion, where the price moves also get bigger when the price gets bigger. But keep in mind that as long as a normal model is periodically recalibrated (after the price has changed substantially), this effect is captured by a normal model anyway. By recalibrating \(\sigma\) to the most recent data, we naturally arrive at the idea of a profitability threshold, where the trades are only executed if the observed arbitrage is sufficiently large relative to recent volatility.
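As a rough illustration of such a threshold, the sketch below estimates \(\sigma\) from recent evenly spaced prices and then asks how likely the observed edge is to survive the latency periods. It assumes the two venues' price moves are independent and driftless, so the change in the spread is normal with variance \(\sigma_A^2 L_A + \sigma_B^2 L_B\). That independence assumption, the function names and the 95% cut-off are illustrative choices of ours, not a recommendation.

```python
import math
import statistics


def estimate_sigma(prices, dt):
    """Normal-model volatility (price units per sqrt(second)) from prices sampled every `dt` seconds."""
    changes = [later - earlier for earlier, later in zip(prices, prices[1:])]
    return statistics.stdev(changes) / math.sqrt(dt)


def prob_profitable(edge, sigma_a, sigma_b, latency_a, latency_b):
    """Probability that the edge is still positive once both trades execute.

    `edge` is the observed profit net of fees and slippage. The spread change is
    assumed normal with mean zero (independent moves on the two venues).
    """
    spread_std = math.sqrt(sigma_a**2 * latency_a + sigma_b**2 * latency_b)
    if spread_std == 0:
        return 1.0 if edge > 0 else 0.0
    z = edge / spread_std
    # Standard normal CDF written with the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def should_trade(edge, sigma_a, sigma_b, latency_a, latency_b, min_prob=0.95):
    """Trade only if the edge is large enough relative to recent volatility."""
    return prob_profitable(edge, sigma_a, sigma_b, latency_a, latency_b) >= min_prob


# Hypothetical numbers: volatility 0.1 per sqrt(second), 50 ms latency on each leg
print(should_trade(0.5, 0.1, 0.1, 0.05, 0.05))
print(should_trade(0.01, 0.1, 0.1, 0.05, 0.05))
```

Recalibrating simply means calling `estimate_sigma` again on a fresh window of prices, which moves the threshold up in volatile periods and down in quiet ones.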
Of course, more complex models are possible, including models that try to predict price movement by looking at volume imbalances on the order book, models that use trends/momentum or mean reversion, and models that attempt to use machine learning on a large number of signals. Undertaking this kind of project involves both 1) developing a theoretical model, and 2) calibrating the model to recent historical data.
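As an example of the kind of signal such models might consume, here is one common way to define an order book volume imbalance. The book layout (lists of `(price, size)` tuples, best price first), the function name and the choice of depth are assumptions for illustration. Whether the signal actually predicts price movement is something you would have to test against historical data.

```python
def book_imbalance(bids, asks, depth=5):
    """Volume imbalance over the top `depth` levels, in the range [-1, 1].

    Positive values mean more resting bid volume than ask volume.
    Returns 0.0 for an empty book.
    """
    bid_volume = sum(size for _, size in bids[:depth])
    ask_volume = sum(size for _, size in asks[:depth])
    total = bid_volume + ask_volume
    if total == 0:
        return 0.0
    return (bid_volume - ask_volume) / total


# Hypothetical snapshot: three times as much volume resting on the bid side
print(book_imbalance([(100.0, 3.0)], [(100.1, 1.0)]))
```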
If your trades are large enough that slippage becomes significant, you would also want to model and calibrate optimal trade size. And if you were to use limit orders, you’d want to model the probability that an order would be filled and estimate when the trade would be filled.
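For a sense of how trade size interacts with slippage, the sketch below finds the profit-maximising quantity on a single static snapshot. It keeps matching exchange A's bids against exchange B's asks for as long as the next unit is still profitable after fees. It deliberately ignores latency and price movement, so treat it as a starting point rather than an optimal sizing model. The book layout, proportional fee model and names are assumptions.

```python
def best_trade_size(bids_a, asks_b, fee_rate_a=0.0, fee_rate_b=0.0):
    """Greedy walk through both books; returns (quantity, profit) for one snapshot.

    `bids_a` and `asks_b` are lists of (price, size) tuples, best price first.
    Stops as soon as the marginal unit no longer makes money after fees.
    """
    i = j = 0
    bid_left = bids_a[0][1] if bids_a else 0.0
    ask_left = asks_b[0][1] if asks_b else 0.0
    quantity = 0.0
    profit = 0.0
    while i < len(bids_a) and j < len(asks_b):
        # Net proceeds from selling one unit on A minus net cost of buying it on B
        margin = bids_a[i][0] * (1 - fee_rate_a) - asks_b[j][0] * (1 + fee_rate_b)
        if margin <= 0:
            break
        take = min(bid_left, ask_left)
        quantity += take
        profit += take * margin
        bid_left -= take
        ask_left -= take
        if bid_left <= 0:  # current bid level exhausted, move deeper into A's book
            i += 1
            bid_left = bids_a[i][1] if i < len(bids_a) else 0.0
        if ask_left <= 0:  # current ask level exhausted, move deeper into B's book
            j += 1
            ask_left = asks_b[j][1] if j < len(asks_b) else 0.0
    return quantity, profit


# Hypothetical order book snapshots
bids_a = [(100.5, 1.0), (100.2, 2.0), (99.9, 5.0)]
asks_b = [(100.0, 0.5), (100.1, 3.0), (100.3, 2.0)]
print(best_trade_size(bids_a, asks_b))
```

In this snapshot the walk stops once A's remaining bids fall below B's remaining asks, so trading any larger size would only reduce the total profit.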
Conclusion
Cross-exchange arbitrage appears simple and risk-free, but once fees, slippage, latency, and execution risk are taken into account, it’s a far more subtle and mathematically involved problem than it at first appears. It becomes a probabilistic trading strategy rather than a deterministic one, and it carries risk.
The role of the arbitrage detection algorithm is not simply to identify price differences, but to estimate expected profit under uncertainty. This requires careful modelling of order book dynamics, execution latency, and price movement.
See also our article on triangular arbitrage.