Triangular Arbitrage in FX and Crypto trading

In a previous article we discussed the unexpected complexities of trying to take advantage of cross-exchange arbitrages. In this article we’ll focus on triangular arbitrages either on a single exchange or between multiple exchanges.

A triangular arbitrage is where we convert currencies \(C_1 \to C_2 \to C_3 \to C_1\). If the exchange rates satisfy \(R_{12}R_{23}R_{31} > 1\), then we end up with more of \(C_1\) than we started with.

We can formulate this as a graph theory problem as follows. Taking the logarithm of both sides gives

\[\text{log}(R_{12})+\text{log}(R_{23})+\text{log}(R_{31}) > 0.\]

Let’s construct a complete graph where the vertices are the currencies \(C_1,C_2,C_3,\ldots\), and the edge between currencies \(C_i\) and \(C_j\) is \(\text{log}(R_{ij})\).

We’re interested in finding cycles where the sum of the edges is greater than 0. In fact, despite the name “triangular arbitrage”, there’s no reason to restrict ourselves to a cycle involving only three currencies. If we can find an arbitrage arising from a cycle of more than three currencies, that’s potentially exploitable as well.

It turns out that we have good algorithms, such as the Bellman-Ford algorithm, which can compute the shortest or longest paths from all vertices to all other vertices in a graph.

Challenges in practice

Just like in the previous article, this idealised model encounters several complexities when you try to apply it in practice:

  • Each edge should actually be replaced by two edges representing bid and ask, for example \(R^{\text{bid}}_{12}\) and \(R^{\text{ask}}_{12}\).
  • Due to slippage, the graph weights may need to depend on trade size.
  • The edges need to be adjusted by the trading fees
  • Latency makes the graph weights stochastic, meaning they need to be modelled by some appropriately chosen model
  • In the case that you place limit orders, partial execution risk is far greater due to the larger number of trades involved

In essence this problem has more moving parts, but is otherwise quite similar to the cross-exchange arbitrage we previously considered. The main difference is we now have a large number of stochastic equations to formulate and calibrate to the data (one for each edge). Conceptually, it’s not too different.

Note also that 1) arbitrage opportunities may exist only briefly, and 2) the edge weights change constantly. Thus the algorithm needs to be re-run continuously. The efficiency of the algorithm is therefore paramount to take advantage of arbitrage opportunities before they disappear, and reduce the latency.