Generating a volatility surface in C++ from options data

This is some code I wrote to generate a volatility surface in C++ using options and futures data, and use it to calculate price and Greeks for an ATM option. To simulate live trading, the code updates the volatility surface and option price as new data becomes available.

Note that the code is currently configured to execute for the first 150,000 lines of market data. This can be changed by altering max_remaining_lines (around line 62 of the code).

Simplifications

Currently, the code creates only a single volatility smile for a single maturity. Of course, the code can quickly be adapted to generate a whole volatility surface, as this is just a number of smiles at different expiries with a suitable method of interpolating between them.

I used the future as a proxy for the forward, just based on the data I had. This is not strictly correct, as they can differ by a convexity correction. Also, rather than incorporate interest rate data, interest rates are currently assumed to be zero. Since the expiry used was short, this was acceptable.

Calculating Implied Volatilities

Implied volatilities are calculated using Newton’s method.

I’ve found that Newton’s method can fail to converge when converting volatility pillars from moneyness to delta, particularly for problematic vol surface data such as data containing arbitrage. However, these issues may not arise for a single implied volatility calculation. But convergence guarantees, and appropriate behaviour for all input data, should be further investigated for live trading code.
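One way to obtain a convergence guarantee is to fall back on a bracketing method. The Black price is increasing in volatility, so bisection converges whenever the target price lies between the prices at the two ends of the bracket. The following is a minimal sketch under stated assumptions, not part of the code below: the names normCdf, blackPrice and impliedVolBisection are illustrative, and the default bracket of 1e-6 to 10 is an arbitrary choice whose upper end mirrors the 1000% cutoff used for erroneous data.

```cpp
#include <cmath>
#include <limits>

// Standard normal cumulative distribution function.
double normCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Black price of a call ('C') or put (anything else) on a forward F.
// Assumes vol > 0 and T > 0.
double blackPrice(double vol, double F, double K, double T, double r, char callput)
{
    const double sd = vol * std::sqrt(T);
    const double d1 = (std::log(F / K) + 0.5 * sd * sd) / sd;
    const double d2 = d1 - sd;
    const double df = std::exp(-r * T);
    if (callput == 'C') { return df * (F * normCdf(d1) - K * normCdf(d2)); }
    return df * (K * normCdf(-d2) - F * normCdf(-d1));
}

// Implied vol by bisection on [lo, hi] (illustrative defaults).
// Returns NaN if the target price is not bracketed, e.g. a price below
// intrinsic value or above the price at vol = hi.
double impliedVolBisection(double targetPrice, double F, double K, double T, double r,
                           char callput, double lo = 1e-6, double hi = 10.0,
                           double tol = 1e-10)
{
    if (blackPrice(lo, F, K, T, r, callput) > targetPrice ||
        blackPrice(hi, F, K, T, r, callput) < targetPrice)
    {
        return std::numeric_limits<double>::quiet_NaN();
    }
    while (hi - lo > tol)
    {
        const double mid = 0.5 * (lo + hi);
        // The price increases with vol, so keep the half that brackets the target.
        if (blackPrice(mid, F, K, T, r, callput) < targetPrice) { lo = mid; }
        else { hi = mid; }
    }
    return 0.5 * (lo + hi);
}
```

Bisection is slower than Newton's method, so a reasonable design might be to try Newton first and call something like this only when Newton fails or leaves the bracket.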

It’s also worth checking out our article on removing arbitrage from volatility surfaces.

Volatility Surface Construction

I took ATM to be the strike equal to the Futures value at expiry (that is, ATMF). Another possible convention is to use the strike for which a call and put have a delta of the same absolute magnitude.
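For reference, the second convention has a closed form under Black's model. This is a short derivation assuming forward deltas without premium adjustment; the symbol K_DN for the delta-neutral strike and sigma for the volatility at that strike are notation introduced here.

```latex
\Delta_C = e^{-rT} N(d_1), \qquad
\Delta_P = -e^{-rT} N(-d_1), \qquad
d_1 = \frac{\ln(F/K) + \tfrac{1}{2}\sigma^2 T}{\sigma\sqrt{T}}

|\Delta_C| = |\Delta_P|
\;\Longrightarrow\; N(d_1) = N(-d_1) = \tfrac{1}{2}
\;\Longrightarrow\; d_1 = 0
\;\Longrightarrow\; K_{\mathrm{DN}} = F\, e^{\sigma^2 T / 2}
```

For short expiries the exponent is small, so the delta-neutral strike sits only slightly above the ATMF strike.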

For simplicity, I adopted the sticky strike convention to avoid having to convert volatility surfaces to delta. That is, the pillars of the vol surface are fixed strikes. I also adopted the common market convention of parameterising the vol surface using calls for strikes above ATM, and puts for strikes below ATM.

While cubic spline is market standard, this would have required either using a third party library or implementing it manually. Given my time requirements for this project, I chose linear interpolation between strike pillars. Flat extrapolation was chosen before the first and after the last pillar.

Volatility interpolation will not proceed if one of the two required pillar vols is missing. If the additional complexity was worthwhile, one could proceed to use the next pillar over if available. One could also attempt to use put in place of call (or vice versa) if the intended market price was not available. Note that the volatility surface output displays “-1” when volatility data is not yet available. As long as a valid volatility has been calculated at least once in the past, this will continue to be used. A possible modification would involve removing volatilities based on market prices that are either too old or no longer existing in the market.
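The staleness modification could be sketched as follows. This is a hypothetical illustration, not part of the code below: the VolPillar struct, the freshVol function and the maximum-age parameter are all assumptions, with timestamps taken to be in nanoseconds like the market data.

```cpp
// A vol pillar that remembers when its underlying quote last updated.
struct VolPillar
{
    double vol = -1.0;            // -1 means "no data yet", as in the surface output
    double lastUpdateNanos = 0.0; // timestamp of the quote behind this vol
};

// Returns the pillar vol, or -1 if there is no vol yet or the quote
// behind it is older than maxAgeNanos.
double freshVol(const VolPillar& pillar, double nowNanos, double maxAgeNanos)
{
    if (pillar.vol < 0) { return -1.0; }
    if (nowNanos - pillar.lastUpdateNanos > maxAgeNanos) { return -1.0; }
    return pillar.vol;
}
```

Returning -1 for stale pillars means the existing "missing pillar" handling in the interpolation would apply unchanged.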

Data Quality

The code uses the mid price if both bid and ask quantities are non-zero. If only one is non-zero, it will use that value. If both quantities are zero, the data is ignored and the volatility surface is not updated. That is, the row is interpreted as missing data.
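The rule above can be isolated into a small helper. This is a minimal sketch; the function name selectPrice and its signature are illustrative, and the code below implements the same logic inline in the main loop.

```cpp
// Chooses the price to use for a quote row.
// Returns false (leaving price untouched) when both quantities are zero,
// i.e. the row is treated as missing data.
bool selectPrice(double bid, int bidQty, double ask, int askQty, double& price)
{
    if (bidQty <= 0 && askQty <= 0) { return false; }
    if (bidQty > 0 && askQty > 0) { price = 0.5 * (bid + ask); } // mid price
    else if (bidQty > 0) { price = bid; }                        // only the bid is live
    else { price = ask; }                                        // only the ask is live
    return true;
}
```

Using a one-sided price in place of a mid biases the implied vol, so a stricter alternative would be to skip one-sided rows as well.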

Some data rows in the options input data seemed erroneous. In this case, Newton’s method could not find a volatility solution. I imposed a max_iterations of 100, and discarded such rows. Similarly, a volatility > 10 (1000%) was assumed to be due to erroneous data.

Greeks

The greeks delta, gamma, vega and theta are calculated using finite differences and a reasonable choice of step size. Although infinitesimal greeks are available via the Black-Scholes formula, traders may be more interested in the consequences of small, finite movements rather than infinitesimal movements. The greeks can be quite sensitive to the chosen step size.

Because I used Black’s formula for pricing, which takes the forward as input, and spot data was not available, I output forward delta and forward gamma (that is, I shifted the forward value instead of the spot value). Forward delta differs from spot delta by a factor of forward/spot, and gamma by that factor squared. Theta is chosen to be a 1-day theta.
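As a sanity check on the finite-difference approach, the forward delta can be compared against the closed-form Black delta, and the forward/spot conversion written out. This is a minimal sketch with illustrative names (normCdf, blackPrice, forwardDeltaFD, forwardGammaFD, forwardDeltaAnalytic, spotDelta and spotGamma are not functions in the code below). The conversion assumes the forward is proportional to spot, so that dF/dS = F/S.

```cpp
#include <cmath>

// Standard normal cumulative distribution function.
double normCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Black price of a call ('C') or put (anything else) on a forward F.
// Assumes vol > 0 and T > 0.
double blackPrice(double vol, double F, double K, double T, double r, char callput)
{
    const double sd = vol * std::sqrt(T);
    const double d1 = (std::log(F / K) + 0.5 * sd * sd) / sd;
    const double d2 = d1 - sd;
    const double df = std::exp(-r * T);
    if (callput == 'C') { return df * (F * normCdf(d1) - K * normCdf(d2)); }
    return df * (K * normCdf(-d2) - F * normCdf(-d1));
}

// Central-difference forward delta: bump F by +/- h/2 and divide by the full width h.
double forwardDeltaFD(double vol, double F, double K, double T, double r,
                      char callput, double h)
{
    return (blackPrice(vol, F + 0.5 * h, K, T, r, callput)
          - blackPrice(vol, F - 0.5 * h, K, T, r, callput)) / h;
}

// Central-difference forward gamma: bump F by +/- h and divide by h squared.
double forwardGammaFD(double vol, double F, double K, double T, double r,
                      char callput, double h)
{
    return (blackPrice(vol, F + h, K, T, r, callput)
          - 2.0 * blackPrice(vol, F, K, T, r, callput)
          + blackPrice(vol, F - h, K, T, r, callput)) / (h * h);
}

// Closed-form forward delta under Black's model.
double forwardDeltaAnalytic(double vol, double F, double K, double T, double r,
                            char callput)
{
    const double sd = vol * std::sqrt(T);
    const double d1 = (std::log(F / K) + 0.5 * sd * sd) / sd;
    const double df = std::exp(-r * T);
    return (callput == 'C') ? df * normCdf(d1) : -df * normCdf(-d1);
}

// Chain rule with dF/dS = F/S (assumes the forward is proportional to spot).
double spotDelta(double fwdDelta, double F, double S) { return fwdDelta * F / S; }
double spotGamma(double fwdGamma, double F, double S) { return fwdGamma * (F / S) * (F / S); }
```

Comparing forwardDeltaFD against forwardDeltaAnalytic for a range of step sizes is a quick way to see how sensitive the finite-difference Greeks are to the chosen step.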

The decision was made to output updated option price and greeks whenever option call price changed by a tolerance of at least 0.00001. This condition could be modified depending on what exactly the reason is for outputting changed prices.

Code Structure

The primary auxiliary functions are:

  • “Black” to price calls and puts
  • “ImpliedVol” to use Newton’s method to calculate a volatility from a market price
  • “PriceAndGreeks” to return a 5-vector of price, delta, gamma, vega and theta.
  • “interpolateVol” to use linear interpolation and flat extrapolation to extract a volatility from the vol surface data.
  • “UpdateVol” to update a volatility at a particular strike if either the corresponding market price has changed, or the forward has changed.

Presently, the remainder of the code, including file input/output and the primary loop through market data lines, occurs in the “main” function. Given more time, the main function could be broken down into multiple functions for cleaner, more readable and more reusable code (although the value of this depends on the ultimate purpose and future direction of the project).

Code Testing

The code begins with a few simple unit tests to test pricing, calculation of implied volatilities and volatility interpolation. Basic reasonableness checks have been done on the output data, but more extensive testing should be done before putting such code into production.

C++ code

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cmath>
#include <algorithm>
#include <limits>

using namespace std;

vector<vector<string>> getData(string fileName, string delimiter);
vector<string> getDataSingleLine(string fileName, string delimiter);
void writeData(vector<vector<string>> Z, string fileName, string delimiter);
vector<string> split(const string& str, const string& delim);
double Black(double V, double F, double K, double T, double r, char callput);
double Ncdf(double x);
double ImpliedVol(double initialVolGuess, double targetPrice, double tol, double F, double K, double T, double r, char callput);
vector<double> PriceAndGreeks(double V, double F, double K, double T, double r, char callput);
double interpolateVol(double strike, double F, vector<double> CallStrikes, vector<double> CallVols, vector<double> PutVols);
void UpdateVol(int updated_symbol, double FuturePrice, double T, vector<double>& CallStrikes, vector<double>& CallVols, vector<double>& PutStrikes, vector<double>& PutVols, vector<double>& CallPrices, vector<double>& PutPrices);

int main()
{
    // Unit tests
    cout << "Unit tests" << '\n';
    cout << "Call price should be 4.26894926: " << Black(0.1,100,99,1,0.05,'C') << '\n';
    cout << "Call implied vol should be 0.1: " << ImpliedVol(0.05, 4.26894926, 0.00001, 100, 99, 1, 0.05, 'C') << '\n';
    cout << "Put price should be 3.31771983: " << Black(0.1,100,99,1,0.05,'P') << '\n';
    cout << "Put implied vol should be 0.1: " << ImpliedVol(0.15, 3.31771983, 0.00001, 100, 99, 1, 0.05, 'P') << '\n';
    vector<double> teststrikes{1,2,3,4,5};
    vector<double> test_vols_call{.10,.12,.14,.16,.18};
    vector<double> test_vols_put{.102,.122,.142,.162,.182};
    cout << "Interpolated vol should be .102: " << interpolateVol(0.9, 3, teststrikes, test_vols_call, test_vols_put) << '\n';
    cout << "Interpolated vol should be .18: " << interpolateVol(5.1, 3, teststrikes, test_vols_call, test_vols_put) << '\n';
    cout << "Interpolated vol should be .131: " << interpolateVol(2.5, 3, teststrikes, test_vols_call, test_vols_put) << '\n';

    // Load data
    string market_data_filename = "market_data.csv";
    string static_info_filename = "static_info.csv";

    // Output files
    ofstream fittingfile("fitting_output.csv");
    ofstream optionfile("option_output.csv");

    // Format static data.
    // This code assumes order of headings and that calls are labelled 1 to 145 and puts 146 to 290 with none missing.
    // Code would need to be modified to handle more general input files or potential missing data.
    vector<vector<string>> static_info = getData(static_info_filename, ",");
    double expiry_time = stod(static_info[1][5]); // Assumes all options have the same expiry

    vector<double> CallStrikes(145);
    vector<double> PutStrikes(145);

    for(int i = 1; i < static_info.size(); i++)
    {
        string symbol = static_info[i][0].substr(4); // numeric part of the symbol, after the 4-character prefix
        if (static_info[i][3].compare("C") == 0) {CallStrikes[stoi(symbol)-1] = stod(static_info[i][4]);}
        else if (static_info[i][3].compare("P") == 0) {PutStrikes[stoi(symbol)-146] = stod(static_info[i][4]);}
    }

    // Load market data
    // To simulate the arrival of live trading data, data is loaded one line at a time, with volatility surface calculations updated every line.

    ifstream file(market_data_filename);
    int max_remaining_lines = 150000; // Used to run code for a limited number of rows only
	string line = "";

	double current_time;
	int updated_symbol; // symbol that is updated this data row
	double T; // time to expiry
	getline(file, line); // First row are headings

	// Most recent price update for each strike. -1 represents no data yet.
    vector<double> CallPrices(145,-1);
    vector<double> PutPrices(145,-1);
    vector<double> CallVols(145,-1);
    vector<double> PutVols(145,-1);
    vector<double> OptionCallPrice(5,-1);

    double FuturePrice = -1;

    // Output files
    vector<string> heading_row;
    heading_row.push_back("Timestamp");
    for(int i = 1; i < CallStrikes.size(); i++)
    {
        heading_row.push_back(to_string(CallStrikes[i])); // Assuming call strikes and put strikes are the same, which seems to be the case
    }
    for(int l = 0; l < heading_row.size(); l++)
    {
        fittingfile << heading_row[l] << ",";
    }
    fittingfile << "\n";

    vector<string> heading_row2 {"Timestamp","Vol","Call price","Call delta","Call gamma","Call vega","Call theta","Put price","Put delta","Put gamma","Put vega","Put theta"};
    for(int l = 0; l < heading_row2.size(); l++)
    {
        optionfile << heading_row2[l] << ",";
    }
    optionfile << "\n";

    vector<vector<string>> pricing_output;

    // Main loop. Reads each line in market data input and updates futures price, vol surface and option price/greeks where appropriate.
	while (getline(file, line) && max_remaining_lines > 0)
	{
		vector<string> column = split(line, ",");

		max_remaining_lines = max_remaining_lines - 1;

		current_time = stod(column[1]);
        T = (expiry_time - current_time)*pow(10,-9)/31536000; // convert nanoseconds to years

        updated_symbol = stoi(column[0].substr(4, column[0].size() - 4));

        // Code to find most appropriate price to use for current symbol
        double current_bid = stod(column[2]);
        int bid_quantity = stod(column[3]);
        double current_ask = stod(column[4]);
        int ask_quantity = stod(column[5]);
        if (bid_quantity == 0 && ask_quantity == 0) {continue;} // Interpreted as no data update
        if (current_bid == 0 && current_ask == 0) {continue;} // Interpreted as no data update
        double mid = 0;
        // if one has quantity zero, use the other
        if (bid_quantity == 0 and ask_quantity > 0){mid = current_ask;}
        else if (bid_quantity > 0 and ask_quantity == 0){mid = mid + current_bid;}
        else {mid = (current_bid + current_ask)/2;}

        // Update prices
        if (updated_symbol == 0) {FuturePrice = mid;}
        else if (updated_symbol <= 145)
        {
            CallPrices[updated_symbol-1] = mid;
        }
        else
        {
            PutPrices[updated_symbol-146] = mid;
        }

        // Update volatilities
        if (FuturePrice < 0) {continue;} // unable to calculate volatilities without a forward

        if (updated_symbol == 0) // If Future price has changed, all vols must be updated
        {
             for(int l = 1; l < CallVols.size()+ PutVols.size() + 1; l++)
             {
                 UpdateVol(l, FuturePrice, T, CallStrikes, CallVols, PutStrikes, PutVols, CallPrices, PutPrices);
             }
        }
        else{UpdateVol(updated_symbol, FuturePrice, T, CallStrikes, CallVols, PutStrikes, PutVols, CallPrices, PutPrices);}

        // If V or F have changed, output ATM option price and greeks.
        double impliedATMvol = interpolateVol(FuturePrice, FuturePrice, CallStrikes, CallVols, PutVols);

        // If required vol pillars are still missing from vol surface data, no option price can be output
        if (impliedATMvol > 0)
        {
            double oldprice = OptionCallPrice[0];
            OptionCallPrice = PriceAndGreeks(impliedATMvol, FuturePrice, FuturePrice, T, 0, 'C');
            if(abs(OptionCallPrice[0] - oldprice) > 0.00001) // only write a row to file if price has changed.
            {
                vector<double> OptionPutPrice = PriceAndGreeks(impliedATMvol, FuturePrice, FuturePrice, T, 0, 'P');
                optionfile << to_string(current_time) << ",";
                optionfile << to_string(impliedATMvol) << ",";
                for(int l = 0; l < OptionCallPrice.size(); l++)
                {
                    optionfile << to_string(OptionCallPrice[l]) << ",";
                }
                for(int l = 0; l < OptionPutPrice.size(); l++)
                {
                    optionfile << to_string(OptionPutPrice[l]) << ",";
                }
                optionfile << "\n";
            }
        }

        // Output fitting csv
        // epoch nanoseconds, fitted volatility at each strike
        vector<string> fitting_row;
        fitting_row.push_back(to_string(current_time));
        for(int i = 1; i < CallVols.size(); i++)
        {
            if (CallStrikes[i] < FuturePrice) {fitting_row.push_back(to_string(PutVols[i]));}
            else {fitting_row.push_back(to_string(CallVols[i]));}
        }

        for(int l = 0; l < fitting_row.size(); l++)
        {
            fittingfile << fitting_row[l] << ",";
        }
        fittingfile << "\n";

    }

	file.close();
	fittingfile.close();
	optionfile.close();

    return 0;
}

// Update the vol at a given strike
void UpdateVol(int updated_symbol, double FuturePrice, double T, vector<double>& CallStrikes, vector<double>& CallVols, vector<double>& PutStrikes, vector<double>& PutVols, vector<double>& CallPrices, vector<double>& PutPrices)
{
        double tolerance = 0.0000001;
        double r = 0; // Assumption of zero interest rates
        double initialVolGuess;

        if (updated_symbol <= 145)
        {
            if(CallPrices[updated_symbol-1] < 0) {return;}

            double mid = CallPrices[updated_symbol-1];
            if (CallVols[updated_symbol-1] < 0) {initialVolGuess = 0.15;}
            else {initialVolGuess = CallVols[updated_symbol-1];}
            double impvol = ImpliedVol(initialVolGuess, mid, tolerance, FuturePrice, CallStrikes[updated_symbol-1], T, 0, 'C');
            if (impvol > 0 && impvol < 10)
            {
                CallVols[updated_symbol-1] = impvol; // ImpliedVol returns -1 if Newton fails to converge
            }
        }
        else
        {
            if(PutPrices[updated_symbol-146] < 0) {return;}

            double mid = PutPrices[updated_symbol-146]; // puts are labelled 146 to 290
            if (PutVols[updated_symbol-146] < 0) {initialVolGuess = 0.15;}
            else {initialVolGuess = PutVols[updated_symbol-146];}
            double impvol = ImpliedVol(initialVolGuess, mid, tolerance, FuturePrice, PutStrikes[updated_symbol-146], T, 0, 'P');
            if (impvol > 0 && impvol < 10)
            {
                PutVols[updated_symbol-146] = impvol;
            }
        }

}

// Linearly interpolate the vol smile for some strike
double interpolateVol(double strike, double F, vector<double> CallStrikes, vector<double> CallVols, vector<double> PutVols)
{
    // Assume callstrike list is identical to putstrike list, and callstrike list is ordered

    // Flat extrapolation
    if (strike <= CallStrikes[0]) {return PutVols[0];}
    if (strike >= CallStrikes[CallStrikes.size() - 1]) {return CallVols[CallVols.size() - 1];}

    int rightindex = 0;
    for(int i = 0; i < CallStrikes.size(); i++)
    {
        if (strike < CallStrikes[i]){rightindex = i; break;}
    }

    double right_strike = CallStrikes[rightindex];
    double left_strike = CallStrikes[rightindex - 1];
    double right_vol;
    double left_vol;

    // Use put for strike below ATM and call otherwise
    if (left_strike < F) {left_vol = PutVols[rightindex - 1];}
    else {left_vol = CallVols[rightindex - 1];}

    if (right_strike < F) {right_vol = PutVols[rightindex];}
    else {right_vol = CallVols[rightindex];}

    // If vols don't exist yet return -1 (checked only after both pillar vols have been read)
    if (left_vol < 0 or right_vol < 0) {return -1;}

    return left_vol + (right_vol - left_vol)*(strike - left_strike)/(right_strike - left_strike);
}

vector<double> PriceAndGreeks(double V, double F, double K, double T, double r, char callput)
{
    double deltastep = 0.0001*F;
    double gammastep = 0.01*F;
    double vegastep = 0.0001*V;
    double thetastep = min(1.0/365,T);

    vector<double> result(5,0);
    double price = Black(V, F, K, T, r, callput);
    result[0] = price;
    // Central differences bump by +/- step/2, so the total width, and hence the divisor, is the full step
    result[1] = (Black(V, F + deltastep/2, K, T, r, callput) - Black(V, F - deltastep/2, K, T, r, callput))/deltastep;
    result[2] = (Black(V, F + gammastep, K, T, r, callput) + Black(V, F - gammastep, K, T, r, callput) - 2*price) / pow(gammastep, 2);
    result[3] = (Black(V + vegastep/2, F, K, T, r, callput) - Black(V - vegastep/2, F, K, T, r, callput))/vegastep;
    // Difference over a one-day step (or T if shorter), divided by the step, so the result is a rate per year
    result[4] = (Black(V, F, K, T - thetastep, r, callput) - price)/thetastep;

    return result; // price, delta, gamma, vega, theta
}

// Uses Newton's method to calculate implied volatility from market price
double ImpliedVol(double initialVolGuess, double targetPrice, double tol, double F, double K, double T, double r, char callput)
{
    double derivativeStep = min(0.00001, tol/100);
    double currentVol = initialVolGuess;
    double functionValue = Black(currentVol, F, K, T, r, callput) - targetPrice;
    double derivativeValue;
    int max_iterations = 100;

    while (abs(functionValue) > tol){
        derivativeValue = (Black(currentVol + derivativeStep, F, K, T, r, callput) - functionValue - targetPrice)/derivativeStep;
        currentVol = currentVol - functionValue / derivativeValue;
        functionValue = Black(currentVol, F, K, T, r, callput) - targetPrice;
        max_iterations = max_iterations - 1;
        if (max_iterations == 0) {return -1;}
    }

    return currentVol;
}

// Prices a put or call option using Black's formula and the forward value
double Black(double V, double F, double K, double T, double r, char callput)
{
    // At or past expiry the option is worth its intrinsic value; this also avoids dividing by zero below
    if (T <= 0) {return (callput == 'C') ? fmax(F-K, 0.0) : fmax(K-F, 0.0);}
    double d1 = (log(F/K) + pow(V,2)*T/2)/(V*sqrt(T));
    double d2 = d1 - V*sqrt(T);
    if (callput == 'C')
    {
        return exp(-r*T)*(F*Ncdf(d1)-K*(Ncdf(d2)));
    }
    else if (callput == 'P')
    {
        return exp(-r*T)*(-F*Ncdf(-d1)+K*(Ncdf(-d2)));
    }
    return numeric_limits<double>::quiet_NaN(); // unrecognised callput flag
}

// Standard normal cumulative distribution function
double Ncdf(double x)
{
    return erfc(-x / sqrt(2))/2.0;
}

void writeData(vector<vector<string>> Z, string fileName, string delimiter)
{
	ofstream file(fileName);

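    for(size_t k = 0; k < Z.size(); k++)
    {
        // Use each row's own length so that rows of different lengths are written correctly
        for(size_t l = 0; l + 1 < Z[k].size(); l++)
        {
            file << Z[k][l] << delimiter;
        }
        if (!Z[k].empty()) {file << Z[k].back();}
        file << "\n";
    }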

	file.close();
}

vector<vector<string>> getData(string fileName, string delimiter)
{
	ifstream file(fileName);

	vector<vector<string> > Z;

	string line = "";

	while (getline(file, line))
	{
		vector<string> vec = split(line, delimiter);
		Z.push_back(vec);
	}

	file.close();

	return Z;
}

vector<string> getDataSingleLine(string fileName, string delimiter)
{
	ifstream file(fileName);

	vector<string> data;

	string line = "";

    getline(file, line);
    data = split(line, delimiter);

	file.close();

	return data;
}

vector<string> split(const string& str, const string& delim)
{
    vector<string> tokens;
    size_t prev = 0, pos = 0;
    do
    {
        pos = str.find(delim, prev);
        if (pos == string::npos) pos = str.length();
        string token = str.substr(prev, pos-prev);
        tokens.push_back(token);
        prev = pos + delim.length();
    }
    while (pos < str.length() && prev < str.length());
    return tokens;
}

How to Hire a Quant or Mathematician (whether Freelancer, Consultant or Permanent!)

Hiring managers are put in the position of speculating on the future job performance of candidates. This is true whether they are looking to hire a permanent staff member, or engage a consultant or freelancer for a shorter time period or on a less than fulltime basis.

Yet, their field of expertise is in executing their own jobs, not in appraising the capabilities of another person. While hiring managers might receive interview training from their firm’s HR department, this merely shifts the burden of developing an effective candidate assessment process onto HR personnel. Anyone who has done one of those personality tests which ask you the same highly ambiguous questions over and over in slightly different ways, will know that the only people with no scepticism about the validity of this methodology are HR themselves.

While some of what I say may be applicable to other kinds of roles, I want to focus on hiring for quants (quantitative finance) and mathematicians. As a quant and mathematician myself who has been through a few interviews during my career, I’ve got a few opinions on this, so strap in!

So how do managers try to interview quants and mathematicians, and why doesn’t it work?

A difficult task under any circumstances

Judging other people is a difficult task at the best of times. Consider the following examples:

  • J.K. Rowling was rejected by 12 publishers before she found success. Given that the Harry Potter franchise is now worth tens of billions, it’s safe to say that those publishers were not as good at picking the winners as they might have thought they were. Even though picking which authors will make them money is supposed to be the primary skill of a publishing house.
  • Most elite trading firms (and even the less elite ones!) like to screen their candidates with online coding and maths tests, apparently believing that this will allow them to select the “smartest” people. And in the news, I occasionally see articles alleging that some kid has an IQ higher than Einstein’s. The implication being that this kid should go on to produce achievements at least as great as Einstein’s. Yet, documentaries which have followed up on high-IQ children years later have found that they’ve become competent professionals, but not the singular individuals their singular IQ scores suggested they would. And Einstein struggled to find a teaching position after graduating, and instead spent 7 years working at a patent office, which in retrospect was probably not the best use of his abilities. Why did not one person in society, including academics and professors, anticipate his potential? Couldn’t they have just sent him to do an online test like the Susquehannas, Citadels or Towers of the world? I suspect that Einstein, a slow, deep thinker, would not have done unusually well on those kinds of tests.
  • Rachmaninoff’s first symphony was savaged by critics, with one comparing it to the seven plagues of Egypt. Yet today he is one of the most enduringly popular romantic composers.

Google famously analysed how job performance aligned with interview performance:

“‘We looked at tens of thousands of interviews, and everyone who had done the interviews and what they scored the candidate, and how that person ultimately performed in their job. We found zero relationship. It’s a complete random mess.’ He also admitted that brainteasers are useless and only good for making the interviewers feel smart.”

So why don’t interview questions work?

It’s important to remember that the purpose of an interview is to try to determine how a candidate will perform in the work environment. Therefore, the candidate should be observed under circumstances as close as possible to the work environment. Yet, the interview differs profoundly from the normal work environment in several critical ways:

The interviewer fails to understand how specific their questions are, and underestimates transferrable skills.

In my opinion, many managers, and perhaps the majority, make the mistake of disregarding general skills and abilities, and general candidate quality, in favour of very specific past experience.

A former manager of mine called it “looking for the person who was doing the job before”. Ideally, the manager is looking for the person who just quit that very role. Or, failing that, someone who has been doing almost exactly the same role at a competitor.

This is reflected in the interview by the asking of very specific questions. Since the number of topics in either mathematics or quantitative finance is almost unlimited, the candidate may well not have spent time on those very specific topics. Or, they may have done so many years ago but can no longer recall off the top of their head.

For example, if you were to ask me for the definition of a (mathematical) group, I would struggle to recall it off the top of my head. Likewise if you were to ask me to write down the Cauchy-Riemann equations. Although these are both first year university topics, I simply haven’t looked at them in quite a while. However, if I needed one of these during the course of my work day, I would look it up, and seconds later I’d be moving forward with my work. It’s very unwise to interview experienced professionals by testing whether they can recall first year university topics off the top of their heads under pressure. Yet, interviews for quants (as well as software developers) are often conducted in this way. And I’ll give some real world examples of this below.

I remember when I was doing Josh Waitzkin’s chess tutorials that come with the Chess Master computer program, he talked about how, after studying a huge number of chess games, he had forgotten all the specifics of those games. Yet, something remained. A kind of intuition or deep understanding that didn’t depend on remembering any particular specifics.

An interviewer can be very impressed with a candidate’s knowledge, or surprised that they don’t know anything, all based on the luck of whether they ask questions the candidate happened to have thought about fairly recently. Furthermore, since the interviewer chooses questions that they know well or have prepared, it can easily appear to them that they know a lot more than any of the candidates that they interview. If the candidate were able to choose questions for the interviewer to answer, an identical dynamic may occur. Sometimes, the interviewer’s limited knowledge leads them to test the candidate’s memory of elementary facts, while the candidate’s knowledge is much broader than they realise. Interview questions constitute a set of measure zero in the set of all knowledge in the field.

Another thing to keep in mind is that, just because someone has been doing a certain kind of role for many years, doesn’t necessarily mean they are good at it. There are many university lecturers who have been teaching for 30 years, and yet the students find their courses poorly structured and confusing. This means that hiring someone with past experience in the exact same role may not be preferable to choosing a high quality candidate whose work history is not exactly the same as the present role.

I’ve also found that some quants and software developers can have difficulty with seemingly mundane tasks like understanding and responding to emails or proofreading their reports, even though they may pass technical interview questions.

The candidate has no opportunity to prepare for the tasks.

In the workplace, people don’t come up to you and insist you produce a random fact off the top of your head in 10 seconds. Nor do they insist you engage in rapid problem solving challenges while they wait and glare at you.

When you are assigned a task in the workplace, you probably won’t instantly understand all aspects of it. It might require information you don’t yet know, or information you once knew but have forgotten because you haven’t needed to use it in your job for a few years. Either way, you use Google or Wikipedia, you look up some books, and soon you’re moving the task towards completion.

Usually when you start a new role, the first couple of months involve a period of learning. This is because, even though you may have many years of experience in similar roles, every firm and every role has its own set of specific financial products, calculational methodologies and coding tools.

Some people suggest “preparing” for interviews. This is both difficult and a waste of time, since you could spend time preparing information and find the interviewer asks you something completely different. It’s silly to try to know everything all at once. A reasonable person researches the specific facts they need for a task, when they need to. Indeed, researching a new problem or task which is not exactly the same as something you did before is a very important skill, much more important than memorisation. And it’s a skill which is totally untested in an interview.

Now, universities also try to assess people – they do this using exams. But there is one key difference between the assessment of universities and the assessment of interviewers. When you are given an exam at university, you are first told what knowledge and skills you need to master, and afforded the opportunity to do so throughout the semester. Of course, you won’t know exactly which questions will be asked in the exam but, if the lecturer has done a good job, the exam questions should be a subset of those you were made aware that you needed to learn to do. You are not being assessed on whether you know x or can do y off the top of your head in seconds. Rather you are being assessed on, if the need to know x or do y arises in your job, can you go away and learn x or learn to do y?

Studies have shown that interviews not only add nothing of value above just considering a candidate’s university marks, but can actually be worse than just judging candidates by their university marks (see this article in the New York Times). Why? Because university exams are an objective measure of whether a candidate is able to achieve a task assigned to them, given an appropriate amount of time to think, research and learn. Exactly like the workplace! Interviews, on the other hand, are not representative of either the university environment or the workplace environment.

When I was a university lecturer in mathematics, I watched some students struggle when transitioning from early undergraduate courses to more advanced courses. These students had perfected a learning strategy of memorizing how to do the specific kinds of problems they suspected were going to be on the exam. But in advanced courses, they were asked to creatively generate their own proofs that did not necessarily match a pattern of anything they had seen before. What was needed here was an approach of developing general skills and conceptual understanding, not memorising how to do certain very specific problems.

And as a mathematics or physics researcher, there is no point in memorising specific topics. Because you have no idea what knowledge or skills the next research project you undertake will require. Rather, the skillset you acquire is the ability to quickly look up and learn things, when you need to know them.

A prospective consulting client once presented to me a paper on quantitative finance that he had been reading, and asked me if I was “familiar with it”. When you consider that someone could spend their entire lives reading papers in a given discipline and still not be familiar with almost all of them, it’s unlikely this client will find a consultant who has coincidentally read the exact paper he’s been looking at. Another client was looking for an expert in “Markov chains”. Not an expert in mathematics with a PhD, who could apply their general skills to many different problems including Markov chains, but someone who specifically specialized in the exact topic the client was interested in. Just like the kinds of interviews I’ve been discussing, these clients were focused on very specific knowledge rather than the broad applicability of general capabilities.

As a very experienced classical pianist, I can provide a good analogy here. If an interviewer were to try to test my claim of being an experienced pianist by challenging me to play Für Elise, I can tell you that I wouldn’t be able to do so very well. The reason is that, although this is an easy piece, I haven’t played it in ten or fifteen years. In fact, I may never have properly learnt this piece even as a student. Even though it is an easy piece, I still need time to prepare it and learn/relearn what the notes are. However, I can perform Rachmaninoff’s third piano concerto for the interviewer, one of the most challenging pieces written for piano, simply because I have prepared this piece. A pianist does not have the ability to play any piece, even an easy one, off the top of their head. The skillset of a pianist is rather to go away and prepare and master a piece, when they are assigned the task of doing so.

The candidate is under a lot of pressure in an interview.

Finally, another key issue that interviewers need to be very aware of is that the interview may be testing how the candidate behaves when under a specific kind of pressure that doesn’t arise in the real workplace. Furthermore, when under pressure, memory may function but careful thinking may be difficult. This would again cause the interviewer to select people who have memorised certain facts, over people who can think about them and figure them out when they need them.

I’ve had interviewers ask me probability questions that 14 year olds would solve in their high school maths classes. It’s strange that an experienced quantitative professional would test another experienced quantitative professional with questions from early high school. This can only really be testing one of two things: 1) Can you remember how to solve, off the top of your head, a problem you haven’t thought about in 20 years? 2) Can we prove that when you’re nervous you might make a mistake on a simple problem? I believe that neither of these is a useful gauge of workplace performance.

Case studies

As case studies, I offer some of the interviews and discussions with clients that I myself have encountered!

Get them to do the work before deciding whether you want to pay them to do the work.

Occasionally I get clients who want to know, right off the bat, how I’ll solve the problem, what the outcome will be, and how long it will take. Needless to say, these questions cannot be answered at time t=0 of a research project. Rather, the first step is for the consultant to begin to read through the documents and papers provided by the client, and begin to build up an understanding of the project. Answers about which technical approach is appropriate, or whether the project is even possible, will begin to take shape over time. In fact, clarifying these questions may be most of the work of the project, rather than something that happens before the project begins.

It reminds me of academics who finish their research project, before applying for an academic grant to fund the now finished research. They then use this money to fund their next project instead. The idea is that, once the research is finished, you can show the grant board exactly what methods you plan to use, how long it will take you, and that you’re certain this approach will work. If you instead go to the grant board saying you’re going to “attempt” to solve the problem, using an as yet unknown method, and have no idea how long it will take or if you’ll even succeed, then it will be much harder to convince them to fund you!

Building a model is a totally different skill to checking whether the model has been designed correctly. Apparently.

At one point, I was interviewing for a model validation role. The interviewer didn’t like that I hadn’t done derivative model validation before. It didn’t matter that I had a mathematics PhD, great coding skills and several years’ experience in derivative modelling. He believed that building mathematical models within a fairly mature derivative pricing system was not the same thing as validating a model from scratch. And, apparently, that the skills required for the two roles did not have sufficient overlap.

Shortly thereafter, I got a job doing model validation at a different bank – and of course my general skills and abilities allowed me to perform the role well.

Then a bit later, I heard from a recruiter about a firm that would not consider people working in model validation for a particular role. They held this view because they were looking for someone to “build” models instead of validate them.

For those who don’t know, model validation usually involves building an independent model against which to benchmark the system. It therefore is essentially “building models” anyway.

Then I saw a job advert from Murex which stated that the candidate must have experience developing for Murex 1.3. They were not looking for an experienced quant dev. Or even an experienced dev that was working at the same firm 2 years ago and had a lot of experience developing for Murex 1.29.

By endlessly subdividing the industry into more and more specific categories, no candidate is ever quite the right fit for a role.

Mathematics PhDs know less about maths than traders?

I once had an interview for a machine learning role at a prop trading firm. The interviewer was not a mathematician – he was a trader who had at some point studied some machine learning.

“How would you solve a linear regression?”, he asked.

Now, keep in mind that he is talking to someone with a PhD in pure mathematics, who has actually taught 3rd and 4th year mathematics courses at university, and who has several years of postdoctoral research experience. Isn’t it obvious from my background that I don’t need to be assessed on my ability to work with one of the most simple concepts from statistics? I told him that there was an exact formula involving matrices.

“Okay, walk me through that” he persisted.

I told him that I did not recall the formula off the top of my head, but would simply look it up if I needed it.

He next wanted to know if there was anything one needed to do to the data before performing a linear regression. I recalled that the last time I did a linear regression I had to scale the data so that all variables had numbers of a similar order of magnitude.

“Well thaaats interestinggggg! Because it’s scale invariant!”

The trader was probably quite pleased with himself for seemingly tripping me up, and for getting to use a fancy-sounding term he had learnt.

I remembered later that the last time I had implemented a linear regression in C++ I had used the gradient descent method. You see, implementing matrix inverses and determinants in C++ is a bit of a pain, and gradient descent converges in only about 5 iterations. It was actually the gradient descent part of the algorithm that required the data scaling. If you solve a linear regression using the matrix formula, you probably don’t need to scale the data. So you see that in a way I was right, but only when solving the regression using the specific method that I had been using. A fact which couldn’t come to light in the short timeframe and pressured questioning of an interview.
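To make the distinction concrete, here is a minimal sketch in Python (not the C++ code mentioned above). The data, learning rates and step counts are all made up for illustration. It fits a one-variable regression with the exact formula and with plain gradient descent, on both raw and standardised inputs. The exact formula fits equally well either way, while gradient descent only converges quickly once the data is scaled.

```python
from statistics import fmean, pstdev

def fit_closed_form(xs, ys):
    """Exact least-squares fit for one feature; returns (intercept, slope)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_gradient_descent(xs, ys, lr, steps):
    """Minimise mean squared error by plain gradient descent from (0, 0)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [b0 + b1 * x - y for x, y in zip(xs, ys)]
        b0 -= lr * 2.0 / n * sum(residuals)
        b1 -= lr * 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
    return b0, b1

def max_abs_error(params, xs, ys):
    """Largest absolute prediction error of the fitted line on the data."""
    b0, b1 = params
    return max(abs(b0 + b1 * x - y) for x, y in zip(xs, ys))

# Hypothetical data: a feature in the thousands, response exactly linear in it.
xs_raw = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
ys = [3.0 + 0.002 * x for x in xs_raw]

# Standardise the feature to mean 0 and standard deviation 1.
mu, sd = fmean(xs_raw), pstdev(xs_raw)
xs_std = [(x - mu) / sd for x in xs_raw]

err_closed_raw = max_abs_error(fit_closed_form(xs_raw, ys), xs_raw, ys)
err_closed_std = max_abs_error(fit_closed_form(xs_std, ys), xs_std, ys)
err_gd_std = max_abs_error(fit_gradient_descent(xs_std, ys, lr=0.4, steps=20), xs_std, ys)
# On the raw data a learning rate of 1e-7 or above diverges, so a smaller one is used.
err_gd_raw = max_abs_error(fit_gradient_descent(xs_raw, ys, lr=1e-8, steps=20), xs_raw, ys)

print(err_closed_raw, err_closed_std, err_gd_std, err_gd_raw)
```

In this toy setup the first three errors are essentially zero, while gradient descent on the raw data is still far from the true line after the same 20 steps.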

“You’ve got machine learning on your CV!”, the trader exclaimed, implying that I clearly knew nothing about machine learning.

As I’ve described already, a mathematics PhD can pick these concepts up very quickly when they need them, but don’t necessarily know them off the tops of their heads. And whether someone has memorised very elementary facts has nothing to do with whether they have the skills to engage in complex research.

There was another trading firm that I interviewed with for what appeared to be a heavily machine learning focused role. I say this because the job description mentioned machine learning not once, but three times. So in the interview, I brought up the topic of machine learning. At first, he didn’t know what I was talking about. Apparently he didn’t know that the job description mentioned machine learning (who wrote the job description, ChatGPT?). Then he said they don’t do machine learning because it overfits. Well, why did they put it in the job description three times then? This is a bit off topic, but it’s so funny I couldn’t resist bringing it up.

Relying on what other people think because you can’t think for yourself

I once had a phone interview with a well-known fund manager in Sydney. I won’t say who he is, other than to say he’s often in the financial news giving his opinions about the economy. He said to me, “If you were paid a lot of money by Westpac, then I’d know you were worth a lot of money!” For those Northern Hemisphere readers, Westpac is an Australian bank that I wasn’t working for at the time of that conversation. The idea was, that if someone else was willing to pay me a lot of money, then he’d believe he should offer me a lot of money. Otherwise he wouldn’t. Relying on the judgement of others to the complete exclusion of your own doesn’t seem wise.

It reminds me of a study that found women mainly want to go out with men who already have girlfriends. The authors of the study found women would rate a man as more attractive if they were told he had a girlfriend, or even if the photo of him showed a random woman smiling in his direction. Apparently, the fact that those men had already been chosen by another girl convinced other girls that they were worth choosing. Unfortunately, none of those men were available, so it seems a poor strategy.

Letting HR design the interview for a role they don’t understand

Years ago, before I started working in quantitative finance, I interviewed with a large Australian telecommunications company called Telstra.

Some manager there had attended a conference where he’d heard about people using statistical methods to model the occurrence of faults in networks, allowing them to move their workers around more efficiently to keep the network operating. Thus, he’d had the idea of hiring a quantitative PhD to do this kind of modelling at Telstra.

What astonished me, is that the interview included not one question about my ability to do statistical modelling. The managers believed that the skills required for statistical modelling didn’t need to be tested and could simply be taken for granted. Indeed, the two managers interviewing me knew little about statistical modelling and simply weren’t qualified to determine whether I was qualified. While I would say that statistical modelling skills were 90% of what was required for the role, these two managers considered them largely irrelevant.

Instead, the interview was a series of HR questions such as, “name a time you’ve worked well on a team”, and “what would you do if you needed someone to do something and they didn’t want to do it”. I remember the female manager kept giggling about how she was soon going on holiday to drink cocktails on the beach.

I was entirely unprepared for these sorts of silly questions. Apparently, so were all the other candidates. Indeed, an HR guy from Telstra called me to inform me that they’d decided not to move forward with any of the PhDs they had interviewed because none of them seemed “ready for the role”. While Telstra thought these PhDs could be taught what they were lacking, Telstra was “looking for someone to hit the ground running”.

In the coming years, I kept reading in the news about how Telstra’s network was down again.

Smart people should know everything. Even things you haven’t told them yet!

I’ll end with an anecdote from the physicist Richard Feynman.

In one of his books, Feynman describes an anecdote from when he was working on the atomic bomb project.

Some engineers presented him with a stack of blueprints representing a proposed chemical facility, and gave him a whirlwind explanation of these very complicated blueprints, leaving him in a daze. He was struggling to guess what the squares with crosses in the middle represented – were they valves or perhaps windows? Since everyone was looking at him, waiting for him to say something, he eventually pointed randomly at one of the squares with crosses in and said, “what happens if this one gets stuck?”

After some frantic whispering, the engineers said “you’re absolutely right, sir”, and rolled up their blueprints and exited the room.

“I knew you were a genius”, the lieutenant said.

Just like in your average job interview, Feynman was being asked to perform a task very quickly, with inadequate information, under pressure. In this case, Feynman got lucky.

Remember, if someone can do something quickly, it’s not because they are a genius – it’s because they’ve done it before.

Market Risk Consulting and FRTB

Looking for market risk consulting and advisory services? Or looking for quant developers to build market risk software? Our PhD qualified quants have you covered! Contact us to learn more.

See also our model validation consulting services.

Market risk is the risk of losses in a portfolio due to adverse movements in market variables such as interest rates, exchange rates, equity prices, and volatilities of equities and exchange rates.

The standard approach is to base the calculation on the empirical distribution of daily shifts in market variables over some period, typically the most recent year of data or data from a particularly stressed historical period. Each set of daily shifts is applied to the inputs of the pricing models for each asset in the bank’s portfolio. Assuming, say, 250 business days’ worth of shift data, this gives rise to 250 possible valuation changes in the portfolio. Market risk capital is often taken to be a certain quantile of this empirical distribution, say the 99% quantile or worst move out of 100. This is called VaR (value at risk). However, there is an alternative approach called expected shortfall, which averages the losses beyond the chosen quantile.
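As a rough illustration, the sketch below computes VaR and expected shortfall from a list of scenario valuation changes. The function name, the randomly generated P&L figures and the quantile convention (take the k worst scenarios, with k rounded up) are assumptions for illustration only. Real systems differ on how they pick or interpolate the quantile.

```python
import math
import random

def historical_var_es(pnl_changes, confidence=0.99):
    """Return (VaR, ES) as positive loss numbers from scenario P&L changes."""
    losses = sorted((-p for p in pnl_changes), reverse=True)  # largest loss first
    # Number of tail scenarios; round() guards against floating point noise.
    k = max(1, math.ceil(round(len(losses) * (1.0 - confidence), 9)))
    tail = losses[:k]
    return tail[-1], sum(tail) / k

# Hypothetical portfolio valuation changes for 250 historical scenarios.
rng = random.Random(42)
scenario_pnl = [rng.gauss(0.0, 1_000_000.0) for _ in range(250)]

var_99, es_99 = historical_var_es(scenario_pnl)
print(f"99% VaR: {var_99:,.0f}   99% ES: {es_99:,.0f}")
```

With 250 scenarios this convention takes the third-worst loss as VaR and the average of the three worst losses as expected shortfall, so expected shortfall is never smaller than VaR.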

A significant amount of the work involved in market risk management therefore goes into developing, validating and revalidating the asset pricing models. This typically includes:

  • Spot, forward and futures trades
  • Fixed income products and interest rate products like bonds, inflation bonds, FRNs and swaps.
  • Derivatives on equities, FX and interest rates, including European/American exercise and features such as barriers.

For some asset classes, accurate models may be computationally prohibitive. This is the case for barrier options for example, whose sensitivity to volatility term structure requires the use of a local volatility or stochastic volatility model. This may necessitate pricing barrier options using a constant volatility assumption, and then developing an auxiliary model to calculate and compensate for the capital error. In many cases, historical volatility data may have to be processed through a cleaning algorithm to remove arbitrage and unusual data from volatility surfaces.

In addition to building asset pricing models, developing or validating a market risk system involves correctly calculating shifts in market data. There are many subtle pitfalls in this process, including dealing with interpolation between curve pillars. Cubic spline interpolation can lead to strange behaviour which must be carefully considered. Some market risk systems will switch from cubic spline to linear interpolation to increase valuation speed, which introduces an error that must be quantified and shown to be acceptable.

Furthermore, when calculating market risk for bonds, it’s necessary to calculate shifts in both the underlying interest rate curve and the zspread on top of it (alternatively, some systems prefer to work in terms of survival curves). When validating zspread shifts for bonds, we’ve found that there are some subtle issues in calculating these which need to be handled carefully. Since zspread shifts are calibrated primarily to the bond maturity, one must be careful to shift all historical curves forward to today’s date before calculating the zspread shift. One must also be careful about whether one is using absolute or relative shifts for the interest rate, if the interest rate shift is to be consistent with the zspread shift.
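To see why the absolute versus relative choice matters, here is a small sketch with made-up numbers. The function names and rates are hypothetical. The same historical move gives quite different scenario rates depending on the convention, so the shift must be applied the same way it was measured.

```python
def absolute_shift(rate_prev, rate_next):
    """Historical shift measured as a difference in rates."""
    return rate_next - rate_prev

def relative_shift(rate_prev, rate_next):
    """Historical shift measured as a proportional change in rates."""
    return rate_next / rate_prev - 1.0

def apply_absolute(rate_today, shift):
    return rate_today + shift

def apply_relative(rate_today, shift):
    return rate_today * (1.0 + shift)

# Hypothetical history: rates moved from 5.0% to 5.5% in one day.
hist_prev, hist_next = 0.050, 0.055
rate_today = 0.010  # today's rate is much lower than in the historical window

abs_scenario = apply_absolute(rate_today, absolute_shift(hist_prev, hist_next))
rel_scenario = apply_relative(rate_today, relative_shift(hist_prev, hist_next))

print(f"absolute: {abs_scenario:.4%}   relative: {rel_scenario:.4%}")
```

Here the absolute convention moves today's 1.0% rate to 1.5%, while the relative convention moves it only to 1.1%.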

The Fundamental Review of the Trading Book (FRTB) is a new international standard for market risk measurement. Developed by the Basel Committee, it’s designed to improve upon deficiencies in market risk management that came to light during the GFC. Most larger banks would aim to implement the Internal Models Approach (IMA), as the alternative Standardized Approach is typically far more punitive in terms of the amount of market risk capital that must be held. The implementation of the FRTB regulation is generally expected to increase market risk capital, particularly around products that are illiquid or hard to model.

Because of its importance to regulators, the requirement to implement and comply with new FRTB regulation, and the complexity involved in calculating market risk for a large and diverse portfolio, market risk management is currently a highly active field. We offer a wide range of market risk consulting services including:

  • Development of VaR and expected shortfall calculations
  • Development and validation of asset pricing models
  • Development and validation of market data shift calculations including interest rate curves, FX curves and zspreads / survival curves.

To discuss how our sophisticated cloud-based quant consulting and advisory business can supercharge your financial services firm, contact us today.

Does backtesting work? / Why doesn’t backtesting work?

If a trading strategy seems to backtest successfully, why doesn’t it always work in live trading?

It’s widely acknowledged that a strategy that worked in the past may not work in the future. Market conditions change, other participants change their algorithms, adapt to your attempts to pick them off, and so on. This means you need to continually monitor and adjust even profitable strategies.

But there’s something even more problematic about backtesting strategies, which fewer people understand clearly. This is that a profitable backtest does not prove that a strategy “worked”, even in the past. This is because most backtests do not achieve any kind of “statistical significance”.

As everyone knows, it’s trivial to tailor a strategy that works beautifully on any given piece of historical data. It’s easy to contrive a strategy that fits the idiosyncratic features of a particular historical dataset, and then show that it is profitable when backtested. But when no mechanism actually exists relating the signal to future movement, the strategy will fail in live testing.

So how does one tell the difference? How can one show that a backtest is not only profitable, but statistically significant?

See also our backtesting software development and our algorithmic trading consulting services.

Statistical hypothesis testing in trading

If you’ve studied some basic statistics, you’ve probably heard of hypothesis testing.

In hypothesis testing, it’s not enough for a model to fit the data. It’s got to fit the data in a way that is “statistically significant”. This means that it’s unlikely that the model would fit the data to the extent that it does, by chance or for any other reason than that the model really is valid. The only way for the model not to be valid is to invoke an “unlikely coincidence”.

One proposes some hypothesis about the data, and then considers the probability (called the p-value) that the apparent agreement between the data and the hypothesis occurred by chance. By convention, if the p-value is less than 5%, the hypothesis is considered statistically significant.

It’s worthwhile to place backtesting within this framework of hypothesis testing to help understand what, if anything, we can really conclude from a given backtest.

Coin toss trading

Let’s keep it simple to start with. Let’s suppose we have an algorithm which predicts, at time steps \(t_1,…,t_n\), whether the asset will subsequently move up (change \(\geq 0\)) or down (change \(<0\)) over some time interval \(\Delta T\). We then run a backtest and find that our algorithm was right some fraction \(0 \leq x \leq 1\) of the time.

If our algorithm was right more than half of the time during the backtest, what’s the probability that our algorithm was right only by chance? This is calculated using the binomial distribution. To see some numbers, let’s suppose our algorithm makes 20 predictions (\(n=20\)) and is right for 12 of them. The probability of this happening entirely by chance is about 25%. If it’s right for 14 of them, the probability of this happening by chance is about 5.8%. This is approaching statistical significance according to convention. The idea is that it’s “unlikely” that our strategy is right by chance, therefore the mechanism proposed by the strategy is likely correct. So if our algorithm got 15 or more correct during the backtest, we’re in the money, right? Not so fast.
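The numbers above can be reproduced with a few lines of Python. This is a minimal sketch which assumes each prediction is an independent fair coin toss. The function name is just for illustration.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

for k in (12, 14, 15):
    print(k, round(prob_at_least(k, 20), 4))
# 12 0.2517
# 14 0.0577
# 15 0.0207
```

Note that these are probabilities of getting at least that many right, which is why 15 or more correct falls below the conventional 5% threshold.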

To take an extreme example, let’s suppose that our piece of historical data was a spectacular bitcoin bull run that went up 20 times in a row. And let’s suppose that our strategy is “Bitcoin only goes up!” Then our calculation above would prove that the strategy works with a statistical significance of 0.0001%! What’s gone wrong here?

When calculating the p-value for a linear regression, standard statistics usually assumes that the “noise” in the data is random and normally distributed. One mistake we have made in the above analysis is assuming that the actual price trajectory is like a coin toss – equally likely to go up or down. But market movements are not random. They can, for example, be highly autocorrelated. And they can go up in a highly non-random way for quite some time, before turning around and going down.

Secondly, we presumably looked at the data before deciding on the strategy. If you’re allowed to look at the data first, it’s easy to contrive a strategy that exactly matches what the data happened to do. In this case, it’s not “unlikely” that our strategy is profitable by mere coincidence, because we simply chose the strategy that we could see matched the data.

Another thing that can destroy statistical validity is testing multiple models. Suppose a given model has a p-value of 0.05, that is, it has only a 5% chance of appearing correct by chance. But now suppose you test 20 different models. Suddenly it’s quite likely that one of them will backtest successfully by chance alone. This sort of scenario can easily arise when one tests their strategy for many different choices of parameter, and chooses the one that works. This is why strategy “optimization” needs to be done carefully.
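To put a rough number on this, the sketch below assumes the 20 tests are independent, which real parameter sweeps usually are not. It computes the chance that at least one model passes by luck, along with the Bonferroni-adjusted threshold, which is one common (and conservative) correction.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance that at least one of n independent tests passes by luck alone."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the overall error rate below alpha."""
    return alpha / n_tests

fwer = familywise_error_rate(0.05, 20)
per_test = bonferroni_threshold(0.05, 20)
print(round(fwer, 4), per_test)  # 0.6415 0.0025
```

So with 20 independent tries at the 5% level, there is roughly a 64% chance that at least one model looks significant by chance.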

So how do you backtest successfully?

In practice, we wouldn’t be checking whether the asset goes up or down. Instead, we’d likely check, across all pairs of buy and sell decisions, whether the sell price minus the buy price amounted to a profit greater than buy and hold. We would then ask, what is the probability that this apparent fit occurred by chance, and the strategy doesn’t really work? If it seems unlikely that the observed fit could be a coincidence, we may be onto a winner.
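One possible way to estimate that probability is a permutation test: keep the same number of in-the-market periods, but shuffle when they occur, and see how often random timing does as well as the strategy. The sketch below is only an illustration of that idea. The price path is synthetic, the function names are made up, and it is not necessarily how any particular backtesting system does it.

```python
import random

def strategy_pnl(prices, positions):
    """P&L of holding positions[i] units over the move from prices[i] to prices[i + 1]."""
    return sum(pos * (prices[i + 1] - prices[i]) for i, pos in enumerate(positions))

def permutation_p_value(prices, positions, n_permutations=2000, seed=0):
    """Fraction of randomly timed strategies doing at least as well as the observed one."""
    rng = random.Random(seed)
    observed = strategy_pnl(prices, positions)
    shuffled = list(positions)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # same time in the market, random timing
        if strategy_pnl(prices, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

# Synthetic upward-drifting price path (purely illustrative).
rng = random.Random(1)
prices = [100.0]
for _ in range(200):
    prices.append(prices[-1] + rng.gauss(0.05, 1.0))

# "Only goes up" strategy: always long. Its timing carries no information.
always_long = [1] * 200
# Cheating strategy that peeks at the next move, to show what a real edge looks like.
oracle = [1 if prices[i + 1] > prices[i] else 0 for i in range(200)]

p_always_long = permutation_p_value(prices, always_long)
p_oracle = permutation_p_value(prices, oracle)
print(p_always_long, p_oracle)
```

The always-long strategy gets a p-value of 1 however strong the trend, because shuffling its positions changes nothing, whereas the look-ahead strategy gets a very small p-value. This addresses the timing question only. It does not fix the problems of choosing the strategy after seeing the data or of testing many variants.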

On the other hand, a trader may have some external or pre-existing reason for believing that a strategy could work. In this case, he/she may not require the same degree of statistical significance. This is analogous to Bayesian statistics where one incorporates a prior belief into their statistical analysis.

Now, HFT (high frequency trading) backtests can often achieve statistical significance much more easily because of the large amount of data and the large number of buy/sell decisions in a short space of time. More pedestrian strategies will have a harder time.

So does machine learning work for trading?

People often ask whether machine learning techniques are effective for developing trading strategies. The answer is: it depends on how they’re applied. When machine learning models are fit to data, they produce certain “p-value” statistics which are vulnerable to all the issues we’ve discussed. Therefore, some care is needed to ensure the models are in fact statistically significant.

Custom-built Backtesting Software for Traders

We create custom-built backtesting software in languages like C++ and python for individual traders and institutions to test and optimize their strategies. We also offer general algorithmic trading consulting services.

Finding that backtesting doesn’t seem to work for you? This article may help you understand the statistical reasons for this.

We create backtesting software for all asset classes including backtesting strategies on equities, FX, options, futures and cryptocurrencies.

Whether you’re a lone day trader looking to test your strategy, or a sizable organisation looking to get your feet wet with algorithmic trading and machine learning, our cloud-based quant consulting service has got you covered. This includes:

  • Python scripts which can evaluate your strategy for different parameters and determine the parameters that give optimal profitability.
  • Applications which analyse large amounts of data, use machine learning techniques to find statistically significant signals, and find the optimal way to combine all meaningful signals into a single strategy.
  • Software to automate your strategies by connecting directly to the exchange to grab the latest price information and post buy/sell orders.

There are many advantages of custom-built backtesting software over the simple built-in functionality offered by some exchanges:

  • Code offers unfettered ability to do complex calculations on historical data, including analysing for the presence of sophisticated technical analysis patterns.
  • The software can analyse a wide variety of datasets when making trading decisions, including data from other assets and data from outside the exchange.
  • The power of python – make use of python’s mathematical tools, machine learning and data analysis libraries
  • The software can grow in scope and complexity as your business grows, as you expand into new strategies and products.

Partner with our experienced PhD quants to supercharge your trading business. Contact us today to discuss how we can design custom-built backtesting software to meet your needs.

To learn more about what’s involved in automating a strategy, see our simple guides for using python to connect to Interactive Brokers and Binance.

Check out our articles on backtesting moving average crossover strategies on Forex and on Bitcoin, and our article on cryptocurrency correlation strategies.

Learn more about our algorithmic trading consulting services. More generally, we offer a wide range of Quant consulting services for financial organisations of all sizes.

Cryptocurrency (Defi) Risk and Quant Consulting Services

Looking for PhD quantitative support and risk management for your cryptocurrency business? Look no further! Contact us to learn how our quants can help.

As decentralized finance continues to grow in size, there is increasingly a need for quants (quantitative analysts) to bring to bear their skills from the world of traditional finance. Due to the relative immaturity of the industry, there is a huge opportunity for cryptocurrency startups to gain a competitive advantage through quantitative skills and tools.

Applications include derivative pricing models, risk modelling including market risk, credit risk and liquidity risk, and developing and backtesting trading algorithms. There are even many novel applications including the mathematics of decentralized oracles and so-called automated market makers.

We offer cloud-based PhD quant consulting and advisory services to the defi industry, all conveniently delivered remotely to anywhere in the world.

Decentralized finance needs a decentralized quant consulting service!

Derivative pricing models

The cryptocurrency derivatives market is still in its early stages. From TradFi, we already have the mathematical techniques to price options in the form of Black-Scholes. And we even have the tools to price exotic derivatives like American, Asian and barrier options. However, we do need sufficient liquidity in the options market in order to derive implied volatilities. We can develop robust libraries of derivative pricing models so your firm can price any kind of cryptocurrency derivative. See our main article on Cryptocurrency Derivatives.
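For reference, a minimal sketch of the Black-Scholes formula for a European option is shown below. It assumes a non-dividend-paying underlying, defaults to a zero interest rate, and uses purely illustrative inputs. The implied volatility would in practice come from the options market, as discussed above.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes(spot, strike, vol, expiry, rate=0.0, is_call=True):
    """Black-Scholes price of a European option on a non-dividend-paying asset."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol * vol) * expiry) / (vol * sqrt(expiry))
    d2 = d1 - vol * sqrt(expiry)
    if is_call:
        return spot * norm_cdf(d1) - strike * exp(-rate * expiry) * norm_cdf(d2)
    return strike * exp(-rate * expiry) * norm_cdf(-d2) - spot * norm_cdf(-d1)

# Illustrative inputs: at-the-money, one year to expiry, 20% volatility, zero rates.
call = black_scholes(100.0, 100.0, vol=0.2, expiry=1.0)
put = black_scholes(100.0, 100.0, vol=0.2, expiry=1.0, is_call=False)
print(round(call, 4), round(put, 4))  # 7.9656 7.9656
```

With zero rates and strike equal to spot, put-call parity forces the call and put prices to coincide, which is a handy sanity check on the implementation.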

Risk modelling for defi

With the crypto industry continuing to grow in size, risk management should play as important a role in managing customers and assets as it currently does in conventional finance. Given a number of high profile collapses in the industry, effective and reputable risk management could help to allay customer concerns about holding digital assets or interacting with your firm. It’s particularly useful to consider how the extensive existing literature on risk and risk modelling can be carried over to the crypto space. Market risk and liquidity risk modelling are standard challenges arising in other kinds of finance, and one can consult the literature in order to develop similar frameworks for the crypto space. We do however need to take due notice of the higher volatility which creates some additional challenges.

Market risk

There’s some uncertainty about whether digital assets should be modelled more like exchange rates or more like equities. But there’s no doubt that exchange rates between two digital currencies, or between a digital currency and a conventional currency, exhibit a high degree of volatility, raising some new challenges for market risk modelling.

Borrowing and lending businesses have to take collateral to insure their loans. Similarly, exchanges need to take margin to ensure counterparties can meet their obligations when trades move against them. In both cases, firms are exposed to market risk on the value of the collateral, which could also be interpreted as FX risk between the relevant currencies where multiple cryptocurrencies are involved. In particular for borrowing and lending, one needs to be concerned about changes in relative value between collateral in one token and loaned amount in another. A mathematical model is needed which can set parameters like the LTV (loan to value ratio) or liquidation trigger level in order to avoid the value of the collateral ever falling below the value of the loaned amount.

A standard way to model market risk is VaR (value at risk). We calculate the relative shift in each asset or market variable over each of the last 250 days, and apply each shift to today’s portfolio. We can then calculate the 99% worst quantile (typically assuming normally-distributed price moves) and make sure margin/LTV is sufficient. Actually, it may be advisable to work out what the liquidation window would be, and use that as our timeframe for VaR calculations. This may have implications for how much collateral you’re comfortable holding in any given coin.
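As a rough sketch of how this could feed into a liquidation trigger, the code below takes a price history for the collateral (quoted in the loan currency) and finds the 99% worst relative drop over the liquidation window. It then returns the highest LTV at which the collateral would still cover the loan after such a drop. The function names, the toy prices and the use of an empirical quantile rather than a fitted normal distribution are all assumptions for illustration. Liquidity, fees and volatility of the loaned token are ignored.

```python
import math

def worst_relative_drop(prices, window, confidence=0.99):
    """Empirical loss quantile of relative price moves over `window` steps."""
    moves = [prices[i + window] / prices[i] - 1.0 for i in range(len(prices) - window)]
    losses = sorted((-m for m in moves), reverse=True)  # largest drop first
    k = max(1, math.ceil(round(len(losses) * (1.0 - confidence), 9)))
    return max(0.0, losses[k - 1])

def max_liquidation_ltv(prices, window, confidence=0.99, buffer=0.0):
    """Highest trigger LTV at which collateral still covers the loan after the quantile drop."""
    return max(0.0, 1.0 - worst_relative_drop(prices, window, confidence) - buffer)

# Toy collateral price history; in practice this would be around 250 daily observations.
collateral_prices = [100.0, 90.0, 99.0, 99.0, 108.9]
trigger = max_liquidation_ltv(collateral_prices, window=1)
print(round(trigger, 4))  # 0.9
```

The logic is that if liquidation starts at an LTV of L/C and the collateral can fall by a fraction d before the position is closed out, we need C(1 - d) ≥ L, that is, L/C ≤ 1 - d. The optional `buffer` argument leaves room for costs the sketch does not model.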

Liquidity and execution risk

In addition to market risk, crypto firms are exposed to liquidity risk when trying to dispose of assets and collateral. The larger a firm grows and the larger its market share becomes, the more liquidity risk becomes a key concern. Liquidity risk may be of particular concern for emerging markets such as digital currencies.

Modelling liquidity risk involves looking not just at the mid price of the assets (as market risk tends to), but also considering the market depth and bid-ask spread. Both of these quantities can be examined in a VaR framework in a similar way to market risk. Large spreads are likely correlated with adverse price moves. Some research indicates that spreads may not be normally distributed, as is often assumed for price moves in market risk. Data analysis can be performed to determine the appropriate spread modelling assumptions for cryptocurrencies.

One can also consider modelling market risk and liquidity risk together in one model, looking at the 99% quantile of adverse price/spread/market depth moves and backtesting a portfolio or risk management protocols against the historical data or hypothetical scenarios.
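As a rough sketch of a combined model, the example below applies each historical mid-price shift together with the spread observed on the same day, and takes a quantile of the resulting liquidation losses. It assumes we exit at the bid (paying half the relative spread) and ignores market depth entirely; the function name and inputs are hypothetical.

```python
import numpy as np

def liquidity_adjusted_var(mid, rel_spread, position, confidence=0.99):
    """Historical VaR including the cost of crossing half the bid-ask spread.

    mid        : daily mid prices, oldest first
    rel_spread : daily bid-ask spread as a fraction of mid, same length as `mid`
    position   : today's position value
    """
    mid = np.asarray(mid, dtype=float)
    rel_spread = np.asarray(rel_spread, dtype=float)
    shifts = mid[1:] / mid[:-1] - 1.0
    # Value realised if we liquidate at the bid after each historical move
    exit_value = position * (1.0 + shifts) * (1.0 - 0.5 * rel_spread[1:])
    loss = position - exit_value
    # Using same-day pairs keeps any dependence between moves and spreads
    return np.quantile(loss, confidence)

# Tiny hand-checkable example (made-up numbers)
mid = [100, 90, 99, 99]
spread = [0.0, 0.02, 0.02, 0.02]
print(liquidity_adjusted_var(mid, spread, position=100, confidence=1.0))
```

Because shifts and spreads are taken from the same day, no distributional assumption is needed for either quantity.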

Also of relevance here are liquidation algorithms and order splitting. While some illiquidity scenarios will be completely outside our control, in other scenarios the crisis arises only if we try to transact too much too quickly (which of course we may have good reason to attempt in a stressed scenario). Thus researching and backtesting liquidation algorithms is also important.

In particular, it’s important to backtest against a stressed period in the market’s history, to understand how we would respond. This would include situations where collateral value declines dramatically or quickly, many customers wish to withdraw their collateral simultaneously, or periods of higher than usual volatility.

Another important tool is scenario analysis. This is where we consider a range of hypothetical qualitative and non-modellable scenarios, such as liquidity providers shutting down completely, to evaluate how we would respond.

Correlation and principal component analysis (PCA)

Since cryptocurrencies move together to a significant extent, we can separate market risk into systematic risk (i.e. FX rates between cryptocurrencies and USD, which could perhaps be taken as BTCUSD) vs idiosyncratic risk (cryptocurrencies moving independently of each other, ETHBTC for example).

If both the asset and the collateral are digital currencies (and not pegged to a conventional currency), then their price relationship is not affected by an overall move in the crypto space. Thus we would be interested in looking at the risk of relative movement only.

We can start with a correlation analysis of price moves over different time intervals of both traditional and digital currencies. The figure below shows the correlations of one-day price moves over a one-year period. It’s clear that there is significant correlation.

We can also do PCA (principal component analysis) to determine how the various coins relevant to the firm move together / contrary to each other. This helps us to understand what benefit is derived from diversification. PCA is often applied to rates curves to determine to what extent short / long tenors move together.

The below PCA analysis shows that 75% of the variance in the price shifts can be explained by all coins moving in the same direction (notice how in the first vector/row all values have the same sign). Interestingly, 9% of the variance can be explained by DOGE moving contrary to all the other coins (notice how it is the only one with a positive value in the second row). The next two rows, explaining 4.7% and 4.5% of the variance respectively, show the coins moving in different directions to each other.

[75%, 9%, 4.7%, 4.5%]
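A minimal sketch of this kind of PCA is given below. It is not the code behind the figures above: it assumes the analysis is run on the correlation matrix of one-day moves, and it uses synthetic returns driven by one common factor instead of real coin data.

```python
import numpy as np

def pca_explained_variance(returns):
    """PCA on the correlation matrix of price moves.

    returns : (n_days, n_coins) array of one-day price moves
    Returns (explained_variance_ratio, components), with one component per row,
    sorted from most to least important.
    """
    returns = np.asarray(returns, dtype=float)
    corr = np.corrcoef(returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs.T

# Synthetic example: four "coins" sharing one market factor plus their own noise
rng = np.random.default_rng(1)
market = rng.standard_normal(365)
returns = np.column_stack([market + 0.5 * rng.standard_normal(365) for _ in range(4)])

ratio, components = pca_explained_variance(returns)
print(np.round(ratio, 3))        # share of variance explained by each component
print(np.round(components, 2))   # first row: all entries share the same sign
```

The first row plays the role of the "all coins move together" vector, and the remaining rows describe relative moves between coins.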

Backtesting and trading algorithms

We design software to backtest, optimize or automate trading or investing strategies.

Backtesting can be performed on historical data, or on hypothetical synthetic data to test the strategy against a wide range of possible market conditions. Usually, trading strategies have a range of possible parameters which we can set for optimal profitability by examining their behaviour on historical data. If you’re still placing your buy and sell orders manually, we can automate execution by writing code to interact directly with the exchange. This not only allows faster reaction, it also allows sophisticated data analysis and machine learning to be incorporated into your strategy. And automation is particularly important for crypto markets which still operate even while you sleep.

The benefits of backtesting actually extend beyond trading. Almost any business or risk strategy can be backtested against historical or hypothetical data in order to test the profitability or robustness of the business.

For more details, see for example our article on statistical arbitrage / pairs trading for crypto, and backtesting a moving average crossover strategy on bitcoin. More generally, we offer algorithmic trading consulting services on both traditional and digital asset classes.

Decentralized oracles

How does a decentralized oracle convert multiple data sources into a single price or datum? For example, some data sources (such as different exchanges) may receive different weightings, and more recent data points may be weighted differently than older ones. The oracle might also trim the data between two quantiles to remove the influence of outliers. Some decentralized oracles, such as the Flare Time Series Oracle (FTSO), offer a reward for participants that submit data close to the final price. How would you go about succeeding as such a participant, or just predicting the final price for your own use? This is where machine learning algorithms come in.
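To make the weighting and trimming ideas concrete, here is a hedged sketch of one possible aggregation rule: a weighted average of the submissions lying between two weighted quantiles. It is a generic illustration and not the actual algorithm of FTSO or any other oracle; the function name, quantile levels and prices are assumptions.

```python
import numpy as np

def aggregate_prices(prices, weights, lower_q=0.25, upper_q=0.75):
    """Weighted average of the submissions between two weighted quantiles.

    prices  : submitted prices, one per data source
    weights : weight of each submission (could encode source quality or recency)
    """
    prices = np.asarray(prices, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(prices)
    p, w = prices[order], weights[order]
    # Position of each submission's midpoint in the cumulative weight, from 0 to 1
    cum = (np.cumsum(w) - 0.5 * w) / w.sum()
    keep = (cum >= lower_q) & (cum <= upper_q)
    return np.average(p[keep], weights=w[keep])

# Five equally weighted submissions, one of them an outlier (made-up numbers)
submissions = [100.0, 101.0, 102.0, 103.0, 250.0]
print(aggregate_prices(submissions, weights=[1, 1, 1, 1, 1]))
```

With equal weights the outlier at 250 falls outside the kept band, so only 101, 102 and 103 contribute to the final price.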

We can build predictive algorithms for decentralized oracles, using machine learning, which predict how oracles combine a large number of inputs into a final output.

Automated Market Makers

Automated market makers, or AMMs, provide liquidity to the market by allowing exchange between two or more cryptocurrencies. They do this by incentivizing people to contribute coins to the pool, and by penalizing the exchange rate if the liquidity ratio swings too far in the direction of one of the coins.
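One common way to penalize the exchange rate is the constant-product rule x * y = k used by Uniswap v2-style pools. The sketch below assumes that rule and a hypothetical pool holding 1,000 units of each coin; other AMMs use different formulas.

```python
def swap_output(reserve_in, reserve_out, amount_in, fee=0.003):
    """Coins received for `amount_in` under the constant-product rule x * y = k.

    reserve_in, reserve_out : pool balances of the coin paid in and the coin paid out
    fee                     : fraction of the input kept by the pool (0.3% assumed here)
    """
    amount_in_after_fee = amount_in * (1.0 - fee)
    return reserve_out * amount_in_after_fee / (reserve_in + amount_in_after_fee)

# Hypothetical pool with 1,000 of coin A and 1,000 of coin B (spot rate 1:1)
reserve_a, reserve_b = 1_000.0, 1_000.0
for amount in (1.0, 100.0, 500.0):
    out = swap_output(reserve_a, reserve_b, amount, fee=0.0)
    print(f"swap {amount:6.1f} A -> {out:8.3f} B (effective rate {out / amount:.4f})")
```

The effective rate worsens as the trade grows relative to the pool, which is exactly the penalty for pushing the liquidity ratio towards one coin.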

The mathematics around the profit and risk of automated market makers and the liquidity tokens they issue requires some careful thought. We can carry out this analysis and backtest your business strategy.

For details on the Uniswap algorithm, see this article on the CURVE exchange.

Statistical Arbitrage / Pairs Trading on Cryptocurrency

The relative immaturity of crypto markets may mean there are more opportunities for arbitrage than on more conventional markets. In this article, we investigate whether price moves in crypto coins are correlated. Specifically, we see whether the last three price moves in a selection of coins can be used to predict the next move in a coin of interest. You could consider this strategy to be a type of statistical arbitrage, or a pairs trading strategy (albeit over a small interval of time).

We implement a vector autoregression model on a selection of nine major crypto coins, whose tickers are: SOL-USD, BTC-USD, ETH-USD, BNB-USD, XRP-USD, ADA-USD, MATIC-USD, DOGE-USD, DOT-USD. All data is grabbed directly from Yahoo Finance using the yfinance Python package. We use one week of data (the most recent at the time of writing) and a 1 minute time interval.

A VAR model is a variety of linear regression that attempts to predict the next move of a particular coin, based on the last few price moves of the coin and all the other coins. The idea is two-fold. Firstly, if two coins tend to correlate but one moves first, it may portend a move of the other coin. Secondly, the VAR model will attempt to find a trend in the coin itself. In fact, a VAR model includes a moving average crossover as a subset of what it can fit. An interesting feature is that it can potentially use moving averages in other coins as a predictive signal. However, I only used the three previous price moves as inputs to the model as using more than this didn’t appear to improve the result in this case.
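The lead-lag idea can be illustrated with a toy example. The sketch below is not the article's model: it fits a VAR of order 1 by ordinary least squares using numpy only, on two synthetic series where the second is built to follow the first with a one-step delay. The coefficient 0.8 and the noise levels are assumptions chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
leader = rng.standard_normal(n)          # price moves of the coin that moves first
follower = np.zeros(n)
# The follower repeats 80% of the leader's previous move, plus its own noise
follower[1:] = 0.8 * leader[:-1] + 0.5 * rng.standard_normal(n - 1)
moves = np.column_stack([leader, follower])

# VAR(1): moves[t] ≈ c + moves[t-1] @ A, fitted by ordinary least squares
lagged = np.column_stack([np.ones(n - 1), moves[:-1]])
coef, *_ = np.linalg.lstsq(lagged, moves[1:], rcond=None)
A = coef[1:]   # A[i, j]: effect of series i's previous move on series j's next move

print(np.round(A, 2))
```

The entry A[0, 1] recovers a value close to 0.8, while A[1, 0] is close to zero: the follower is predictable from the leader but not the other way around.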

The results show that the algorithm is effective at predicting the next move of many coins, but does not appear to be effective for bitcoin.

Results

The code produces a scatterplot of the actual vs predicted price moves, along with the correlation and p value between the two. Note that since the code grabs the most recent data at the time of execution, the numbers may differ between runs. Below I show two coins where the algorithm is effective and one where it isn’t.

XRP-USD

LinregressResult(slope=0.34169299506953943, intercept=2.8755621199597785e-06, rvalue=0.5873282315680474, pvalue=1.551167162089073e-22, stderr=0.031321080004877454, intercept_stderr=5.55906329231313e-06)

XRP-USD shows a strong correlation of 0.59 between the actual and predicted next move, with a negligible p value demonstrating statistical significance.

SOL-USD

LinregressResult(slope=0.475485553436831, intercept=0.00019601607035287644, rvalue=0.7072734886815194, pvalue=3.3905483806325366e-36, stderr=0.031474953721131224, intercept_stderr=0.0004961483745867841)

SOL-USD shows a strong correlation of 0.71, with a negligible p value demonstrating statistical significance.

BTC-USD

LinregressResult(slope=-0.002493212361084034, intercept=1.8926030981056112, rvalue=-0.015123422722563179, pvalue=0.8195518097975965, stderr=0.010916717835828317, intercept_stderr=0.2832675090566032)

By contrast, BTC-USD shows a negligible correlation of -0.015 and a p value of 0.82, showing no statistical significance at all. My interpretation of this is that the smaller coins are more likely to be affected by price moves in Bitcoin, rather than the other way around.

For many coins, the algorithm is able to predict the next price move with strong correlation. Thus, the algorithm could be the starting point for an effective strategy for a variety of cryptocoins.

Future development

A good next step for developing this idea would be to explore using a time interval of less than one minute. Particularly in live prediction, one would not want to wait up to a minute to analyse the data and make a decision. Ideally, the algorithm would analyse and take action every time the exchange updated the price of one or more coins. It would also be interesting to develop a model that accesses data for a very large number of assets (including not just crypto but other asset types, economic parameters etc) and search for correlations. One could eventually explore using big data / machine learning techniques to search for these relationships.

Python code

Below is the python code used for this article. You can specify which coin you are trying to predict using the index_to_predict variable. In order to protect against overfitting to a particular piece of historical data, the variable test_fraction specifies how much of the data to set aside for testing (I’ve used the last 20%).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.api import VAR
from statsmodels.tsa.statespace.tools import diff
from scipy.stats import linregress
import yfinance as yf

# Set data and period interval
period = "1w"
# Valid intervals: [1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo]
interval = '1m'

# Number of previous moves to use for fitting
VAR_order = 3

tickers = ["SOL-USD","BTC-USD","ETH-USD","BNB-USD","XRP-USD","ADA-USD","MATIC-USD","DOGE-USD","DOT-USD"]
# Specify which coin to forecast
index_to_predict = 0
test_fraction = 0.2 # fraction of data to use for testing

data = yf.download(tickers = tickers,  # list of tickers
                period = period,         # time period
                interval = interval,       # trading interval
                ignore_tz = True,      # ignore timezone when aligning data from different exchanges?
                prepost = False)       # download pre/post market hours data?

X = np.zeros((data.shape[0],len(tickers)))

for (i,asset) in enumerate(tickers):
    X[:,i] = list(data['Close'][asset])
 
# Deal with missing data.
NANs = np.argwhere(np.isnan(X))
for i in range(len(NANs)):
    row = NANs[i][0]
    X[row,:] = X[row-1,:]
 
# Difference data
Xd = diff(X) 

# Determine test and fitting ranges
test_start = round(len(Xd)*(1-test_fraction))
Xd_fit = Xd[:test_start]
Xd_test = Xd[test_start:]

model = VAR(Xd_fit)
results = model.fit(VAR_order)
summary = results.summary()
print(summary)

lag = results.k_ar

predicted = []
actual = []
for i in range(lag,len(Xd_test)):
    actual.append(Xd_test[i,index_to_predict ])
    predicted.append(results.forecast(Xd_test[i-lag:i], 1)[0][index_to_predict])
    
plt.title(tickers[index_to_predict])
plt.scatter(actual, predicted)
plt.xlabel("Actual")
plt.ylabel("Predicted")

print(linregress(actual, predicted))

Backtesting Algo Trading Strategies using Synthetic Data

In a couple of previous articles, we backtested and optimized a moving average crossover strategy for both Bitcoin and FX.

Now while backtesting on historical data is a key part of developing a trading strategy, it also has some limitations that are important to be aware of.

Firstly, you usually have a limited amount of historical data. And even if you could obtain data going back as many years as you wanted, the relevance of older data is questionable. Also, when you have a finite amount of data, there’s always the problem of overfitting. This occurs when we optimize too much for the idiosyncrasies of one particular dataset, rather than trends and characteristics which are likely to be persistent.

And even if you had say, ten years of data, if there happened to be no GFC during those ten years you’d have no idea how the strategy would perform under that scenario. And what about scenarios that have never happened yet? A strategy that is perfectly optimized for historical data may not perform well in the future because there’s no guarantee that future asset and market behaviour will mimic past behaviour.

This is where backtesting using synthetic data comes in.

Synthetic data is data that is artificially generated. It can be generated so as to try to mimic certain properties of a real historical dataset, or it can be generated to test your strategy against hypothetical scenarios and asset behaviour not seen in the historical data.

Now the downside of backtesting using synthetic data is that it may not accurately depict the real behaviour of the asset. However, even real historical data may not be representative of future behaviour.

With synthetic data, one can generate any amount of data, say 100 years or even more. This means:

  • No problems with overfitting – you can generate an unlimited amount of data to test whether optimized parameters work in general or only on a specific piece of data.
  • The large amount of data should contain a wider range of possible data patterns to test your strategy on.
  • Mathematical optimization algorithms for finding optimal parameters work far more reliably, as the profit function becomes much smoother and less noisy.
  • It’s easy to explore how properties of the data (eg, volatility) affect the optimal parameters. This can ultimately allow you to use adaptive parameters for your strategy, which change based on changing characteristics of the data such as volatility.
  • It allows you to test how robust your strategy is on hypothetical scenarios that might not have occurred in the historical data. For example, if you backtested your strategy on data with lower volatility, will it still be profitable if the asset volatility increases? Would your strategy be profitable (or at least minimize losses) during a market crash?

How to generate synthetic data

In general, when generating asset paths for stocks, cryptocurrencies or exchange rates, the starting point is geometric Brownian motion. For some applications you may wish to add random jumps to the Brownian motion, the timing of which can be generated from a Poisson process.
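As a minimal sketch of this starting point, the function below simulates geometric Brownian motion with optional jumps whose timing follows a Poisson process. The jump sizes are assumed to be normally distributed in log terms, and all parameter names and values are illustrative rather than calibrated to any market.

```python
import numpy as np

def gbm_with_jumps(S0, num_points, mu, sigma, t_step, jump_rate, jump_std, seed=0):
    """Geometric Brownian motion with Poisson-timed jumps.

    jump_rate : expected number of jumps per unit time (per year if t_step is in years)
    jump_std  : standard deviation of each log jump (assumed normal, mean zero)
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(num_points - 1)
    # Number of jumps in each time step is Poisson with mean jump_rate * t_step
    n_jumps = rng.poisson(jump_rate * t_step, num_points - 1)
    # The sum of n independent N(0, jump_std^2) jumps is N(0, n * jump_std^2)
    jumps = jump_std * np.sqrt(n_jumps) * rng.standard_normal(num_points - 1)
    log_steps = (mu - 0.5 * sigma**2) * t_step + sigma * np.sqrt(t_step) * z + jumps
    return S0 * np.exp(np.concatenate([[0.0], np.cumsum(log_steps)]))

# 1000 daily points with 10% volatility and about five jumps per year
path = gbm_with_jumps(S0=1.0, num_points=1000, mu=0.0, sigma=0.1,
                      t_step=1/365, jump_rate=5.0, jump_std=0.03)
print(path[:5])
```

Setting jump_rate to zero recovers plain geometric Brownian motion.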

However, when generating synthetic data for backtesting purposes you will probably find that your strategy is completely ineffective when applied to geometric Brownian motion alone. This is because real asset price data contains non-random features such as trends, mean reversions, and other patterns which are exactly what algorithmic traders are looking for.

This means that we have to add trend effects to our synthetic data. However, it does raise a significant issue: how do we know that the artificial trends and patterns we add to the data are representative of those present in real data?

What you’ll find is that the profitability of the strategy is largely determined by the relative magnitudes of the geometric motion and the trend term. If the trend is too strong, the strategy will be phenomenally profitable. Too weak, and the strategy will have nothing but random noise to work with.

We will not concern ourselves too much with generating highly natural or realistic data in the present article as our primary purpose here is to study how synthetic data can allow us to explore the behaviour of the strategy, and how its optimal parameters relate to the properties of the data.

We generate synthetic FX data using the code below. We assume the initial FX rate is S0 = 1, and volatility is 10%. Since, barring some kind of hyperinflation event, an exchange rate does not usually become unboundedly large or small, we add in some mean reversion that tends to bring it back to its original value.

While there are many ways of defining a trend, our trend is a value which starts at 0 and drifts randomly up or down, driven by its own sequence of random shocks. There is also a mean reversion term which tends to bring the trend back to 0. The trend value is added to the change in the FX rate each time step.

import numpy as np

def generate_path(S0, num_points, r, t_step, V, rand, rand_trend, mean, mean_reversion):
    S = S0*np.ones(num_points)
    trend = 0
    for i in range(1, len(S)):
        trend = trend + rand_trend[i-1]*S[i-1]/2000 - trend/10
        S[i] = mean_reversion*(mean - S[i-1]) + S[i-1]*np.exp((r - 0.5*V**2)*t_step + np.sqrt(t_step)*V*rand[i-1]) + 0.7*trend
    
    return S

# Generate synthetic data
S0 = 1
num_points = 50000
seed = 123
rs = np.random.RandomState(seed)

rand = rs.standard_normal(num_points-1)
rand_trend = rs.standard_normal(num_points-1)
r=0
V=0.1
t_step = 1/365
mean = 1
mean_reversion = 0.004
close = generate_path(S0, num_points, r, t_step, V, rand, rand_trend, mean, mean_reversion)

To get an idea of what our synthetic data looks like, below I’ve generated and plotted 1000 days of synthetic FX data.

Backtesting an FX moving average strategy on synthetic data

We generate 100,000 days of synthetic FX data and run our moving average backtesting script from the previous two articles.

Plotting 100,000 datapoints on a graph along with the short and long moving averages produces a seriously congested graph. However, we can see that the synthetic data does not stray too far from its initial value of 1. In fact, we could probably stand to relax the mean reversion a bit if it was realistic data we were after. The greatest variation from the initial value of 1 seems to be about 30%, which is too low over such a long time period. Now the profitability of the strategy is not particularly meaningful here since, as mentioned, it is largely determined by the strength of the trend as compared to the Brownian motion that the user specifies. If the strategy is profitable, the profit will also be very high when the strategy is executed over 100,000 days.

What is interesting, though, is to strengthen the trend of the data (say, changing 0.7*trend to 1*trend in the earlier code snippet) and plot graphs of profitability vs parameter values. When backtesting against real historical data, we often found that the resulting graphs were noisy and multi-peaked, making it difficult to determine the optimal parameters. Using a much larger quantity of synthetic data with a strong trend, we find the graphs are clean and clear.

We can clearly see that the optimal values are alpha close to 1, say 0.975, threshold as small as possible, short days also very small and long days about 22.

What happens if we make the volatility of the data smaller by a factor of 3?

It seems the optimal alpha has reduced to about 0.9. Also, the optimal number of days used for the long term average has increased from low twenties to high twenties.

It’s unlikely you’d be able to extract this insight from backtesting on historical data. Firstly, you wouldn’t be able to adjust the volatility of the data on a whim, and secondly, the graphs would be too noisy to clearly see the relationship. What this example illustrates is that synthetic data can be used to study how various properties of the data affect the optimal parameters of the strategy. This could be used to create a strategy with “adaptive” parameters which change based on the most recent characteristics of the data, such as volatility. A very simple example of this, in line with the result above, would be increasing the number of days used in the long term average during periods of low volatility.

Testing stressed artificial scenarios

Another utility of synthetic data is the ability to generate particular scenarios on which to test your strategy. This might include periods of high volatility, steeply declining or rising data, or sudden jumps. To achieve this, one can generate some synthetic data using the method already described, and then manually adjust the data points to create a particular scenario. This will help you to understand what kind of data may “break” your strategy and how you might be able to adjust parameters, or add in additional conditional behaviour or fail safes.
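One way to adjust the data points is sketched below: a crash of a chosen size is spread over a window of the path, and all later points are shifted by the same factor so the path stays continuous. The function name apply_crash and the example numbers are hypothetical, and a flat path stands in for the synthetic data to keep the effect easy to see.

```python
import numpy as np

def apply_crash(close, start, drop, duration):
    """Return a copy of `close` with a crash injected.

    start    : index of the first affected point
    drop     : total size of the crash, e.g. 0.3 for a 30% fall
    duration : number of steps over which the fall is spread
    """
    stressed = np.array(close, dtype=float)      # copy, the input is left untouched
    factors = np.ones(len(stressed))
    end = min(start + duration, len(stressed))
    # Multiplicative shock falling smoothly from 1 towards (1 - drop)
    ramp = (1.0 - drop) ** (np.arange(1, duration + 1) / duration)
    factors[start:end] = ramp[:end - start]
    # Everything after the crash window stays at the lower level
    factors[end:] = 1.0 - drop
    return stressed * factors

# Flat illustrative path; in practice this would be the generated synthetic data
close = np.ones(10)
stressed = apply_crash(close, start=3, drop=0.3, duration=4)
print(np.round(stressed, 3))
```

The same pattern works for sudden jumps (duration of 1) or rallies (a negative drop), and a period of high volatility could be created in a similar way by scaling the moves within a window.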

Find out more about our algorithmic trading consulting services.

Python code

Below we include the python code used to generate the numbers and graphs in this article.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import minimize

def generate_path(S0, num_points, r, t_step, V, rand, rand_trend, mean, mean_reversion):
    S = S0*np.ones(num_points)
    trend = 0
    for i in range(1, len(S)):
        trend = trend + rand_trend[i-1]*S[i-1]/2000 - trend/10
        S[i] = mean_reversion*(mean - S[i-1]) + S[i-1]*np.exp((r - 0.5*V**2)*t_step + np.sqrt(t_step)*V*rand[i-1]) + 0.7*trend
    
    return S

# Generate synthetic data
S0 = 1
num_points = 100000
seed = 123
rs = np.random.RandomState(seed)
rand = rs.standard_normal(num_points-1)
rand_trend = rs.standard_normal(num_points-1)
r=0
V=0.1
t_step = 1/365
mean = 1
mean_reversion = 0.004
close = generate_path(S0, num_points, r, t_step, V, rand, rand_trend, mean, mean_reversion)

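def moving_avg(close, index, days, alpha):
    # Exponentially weighted average of the `days` points ending at `index`.
    # A fractional `days` adds one extra, older point scaled by the fractional part.
    partial = days - np.floor(days)
    days = int(np.floor(days))
    
    # av_period is ordered oldest first, so the weights run from alpha**(days-1)
    # for the oldest point up to 1 for the most recent point.
    weights = [alpha**i for i in range(days - 1, -1, -1)]
    av_period = list(close[max(index - days + 1, 0): index+1])  

    if partial > 0:
        weights = [alpha**(days)*partial] + weights
        av_period = [close[max(index - days, 0)]] + av_period

    return np.average(av_period, weights=weights)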

def calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold):
    strategy = [0]*(len(close) - start_offset)
    short = [0]*(len(close) - start_offset)
    long = [0]*(len(close) - start_offset)
    boughtorsold = 1   
    for i in range(0, len(close) - start_offset):   
        short[i] = moving_avg(close, i + start_offset, short_days, alpha)
        long[i] = moving_avg(close, i + start_offset, long_days, alpha)
        if short[i] >= long[i]*(1+threshold) and boughtorsold != 1:
            boughtorsold = 1
            strategy[i] = 1
        if short[i] <= long[i]*(1-threshold) and boughtorsold != -1:
            boughtorsold = -1
            strategy[i] = -1       
    return (strategy, short, long)
                
def price_strategy(strategy, close, short_days, long_days, alpha, start_offset):
    cash = 1/close[start_offset]  # Start with one unit of CCY2, converted into CCY1
    bought = 1
    for i in range(0, len(close) - start_offset):
        if strategy[i] == 1:
            cash = cash/close[i + start_offset]       
            bought = 1
        if strategy[i] == -1:   
            cash = cash*close[i + start_offset]           
            bought = -1    
    # Sell at end
    if bought == 1:
        cash = cash*close[-1]
    return cash
        
def graph_strategy(close, strategy, short, long, start_offset):
    x = list(range(0, len(close) - start_offset))
    plt.figure(0)
    plt.plot(x, close[start_offset:], label = "Synthetic FX data")
    plt.plot(x, short, label = "short_av")
    plt.plot(x, long, label = "long_av")
    buyidx = []
    sellidx = []
    for i in range(len(strategy)):
        if strategy[i] == 1:
            buyidx.append(i)
        elif strategy[i] == -1:
            sellidx.append(i)
    marker_height = (1+0.1)*min(close) - 0.1*max(close)
    plt.scatter(buyidx, [marker_height]*len(buyidx), label = "Buy", marker="|")
    plt.scatter(sellidx, [marker_height]*len(sellidx), label = "Sell", marker="|")
    plt.title('Moving average crossover')
    plt.xlabel('Timestep')
    plt.ylabel('Price')    
    plt.legend(loc=1, prop={'size': 6})    
    #plt.legend()

def plot_param(x, close, start_offset, param_index, param_values):
    profit = []
    x2 = x.copy()
    for value in param_values:
        x2[param_index] = value
        short_days = x2[0]
        long_days = x2[1]
        alpha = x2[2]
        threshold = x2[3]        
        (strat, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
        profit.append(price_strategy(strat, close, short_days, long_days, alpha, start_offset) - 1)
    plt.figure(param_index+1)
    param_names = ["short_days", "long_days", "alpha", "threshold"]
    name = param_names[param_index]
    plt.title('Strategy profit vs ' + name)
    plt.xlabel(name)
    plt.ylabel('Profit')    
    plt.plot(param_values, profit, label = "Profit")

def evaluate_params(x, close, start_offset):   
    short_days = x[0]
    long_days = x[1]
    alpha = x[2]
    threshold = x[3]
    (strat1, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
    profit = price_strategy(strat1, close, short_days, long_days, alpha, start_offset)
    return -profit #Since we minimise

#Initial strategy parameters.
short_days = 5
long_days = 30
alpha = 0.99
start_offset = 100
threshold = 0.01

x = [short_days, long_days, alpha, threshold]

#Price strategy
(strat1, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
profit = price_strategy(strat1, close, short_days, long_days, alpha, start_offset)
print("Strategy profit is: " + str(profit - 1))
print("Buy and hold profit is: " + str(close[-1]/close[start_offset] - 1))

#Graph strategy
graph_strategy(close, strat1, short, long, start_offset)

#Graph parameter dependence
plot_param(x, close, start_offset, 2, np.arange(0.7, 1, 0.02))
plot_param(x, close, start_offset, 3, np.arange(0.01, 0.1, 0.001))
plot_param(x, close, start_offset, 0, np.arange(2, long_days, 2))
plot_param(x, close, start_offset, 1, np.arange(short_days, 60, 2))

Algo Trading Forex – Backtesting and Optimizing an FX Moving Average Strategy

Our quant consulting service can help you backtest and optimize your moving average strategy. Find out more about our algorithmic trading consulting services and contact us today.

See also our article on backtesting an FX moving average strategy on synthetic data.

Some distinguishing characteristics of Forex trading, as opposed to stock trading, are 24 hour trading, virtually no limit on liquidity, and the potential for significant leverage on trades. These features are all favourable for an algorithmic approach to trading.

The crossing over of short and long term moving averages is a well-known signal used in algo trading. The idea is that when the short term average rises above the long term average, it could indicate the price is beginning to rise. Similarly, when the short term average falls below the long term average, it could indicate the price is beginning to fall.

In a previous article, we investigated backtesting a moving average crossover strategy on bitcoin. We found that our backtesting allowed us to pick optimal parameters for the strategy that very significantly improved its profitability, and resulted in a far higher return than simply buying and holding the asset.

In this article we conduct the same analysis on four currency pairs – GBPUSD, EURUSD, AUDUSD and XAUUSD. Forex data going back to January 2001 has been sourced from forextester.

We assume that the trader initially converts one USD to the other currency, converts back and forth between the two currencies based on whether he believes the exchange rate is trending up or down, and converts back to USD at the end. The profit is compared against that of simply holding the non-USD currency for the entire time.

As before our strategy has four parameters to optimize: alpha, threshold, short days and long days.

  • short_days – The number of days used for the short term average
  • long_days – The number of days used for the long term average
  • alpha – This is a parameter used in the exponential moving average calculation which determines how much less weight is given to data further in the past. A value of 1 means all data gets the same weighting.
  • threshold – A threshold of 10% means that instead of executing when the two averages cross, we require that the short average pass the long average by 10%. The idea is to prevent repeatedly entering/exiting a trade if the price is jumping about near the crossover point.

We plot graphs showing the profit of the strategy for different values of each parameter. From these graphs, we choose optimal parameters for each currency pair. The full results, graphs and the python code used in this analysis are available below.

Conclusions

There are many pitfalls and caveats when doing this kind of optimization, and what seem like the optimal parameters should never be accepted naively. Things to keep in mind:

  • We are finding the optimal parameters on a particular piece of historical data. There is no guarantee that these parameters will be effective on future data which may differ in significant ways from historical data.
  • We have to be careful we are not “overfitting”, which occurs when we optimize too much for the idiosyncrasies of our particular dataset, rather than trends which are likely to persist. One way to guard against this is to split the data into pieces and perform the optimization on each piece. This gives an idea of how much variation to expect in the optimal parameters.
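The splitting idea can be sketched as follows. The helper best_param_per_chunk and the simplified toy_profit strategy are hypothetical stand-ins, not the strategy code used for the results in this article, and the price series is a simulated random walk.

```python
import numpy as np

def toy_profit(close, long_days, short_days=5):
    """Hypothetical stand-in strategy: hold the asset whenever the short simple
    average is above the long simple average, and sum the price moves earned."""
    long_days = int(long_days)
    def simple_av(n):
        return np.convolve(close, np.ones(n) / n, mode="valid")
    long_av = simple_av(long_days)
    short_av = simple_av(short_days)[long_days - short_days:]   # align end dates
    held = short_av[:-1] > long_av[:-1]        # decide at t, earn the move t -> t+1
    moves = np.diff(close[long_days - 1:])
    return float(np.sum(moves[held]))

def best_param_per_chunk(close, n_chunks, param_values, profit_fn):
    """Optimize one parameter separately on each contiguous piece of the data."""
    chunks = np.array_split(np.asarray(close, dtype=float), n_chunks)
    best = []
    for chunk in chunks:
        profits = [profit_fn(chunk, value) for value in param_values]
        best.append(param_values[int(np.argmax(profits))])
    return best

# Simulated random walk standing in for an exchange rate
rng = np.random.default_rng(7)
close = 1.0 + np.cumsum(0.005 * rng.standard_normal(2000))
print(best_param_per_chunk(close, 2, [10, 20, 40], toy_profit))
```

If the best value differs widely between pieces, that is a warning sign that the optimum found on the full dataset is fitting noise.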

A question that arises in our set of graphs below is how to choose parameters when the graph of profit against parameter value is noisy, multi-peaked or has highly unstable peaks (e.g. two nearby parameter values that have radically different profitability).

We find different currency pairs exhibit differing optimal parameters. A critical question is how much of this variation is genuine, and can be expected to hold into the future, and how much is just noise in the data. Generally speaking, however, we find the following parameters are effective in most cases:

  • Alpha large, say 0.95
  • Threshold small – 0.01 or 0.02. This indicates that the threshold variable is often not useful and could perhaps be removed.
  • Short days between 30 and 60
  • Long days between 120 and 160

This information is certainly useful, and helps us to configure our moving average strategy much better than if we were merely guessing at, for example, how many days to take the long average over. It’s important to realise that even the most carefully optimized strategy can fail when conditions emerge that were not present in its historical backtesting dataset.

It’s worth noting that these optimal parameters differ from those we found for bitcoin. In that case, we found a long days parameter of just 35 and a short days parameter of just 10 were optimal. This probably reflects the much higher volatility of bitcoin as compared to Forex markets.

This article highlights some of the difficulties involved in backtesting strategies on historical data. One way to resolve many of these issues is to test and optimize your strategy on synthetic data. This will be the subject of a future article.

Results

GBPUSD

Strategy profit is: 0.2644639031872955
Buy and hold profit is: -0.23343631994507374

We find that the optimal alpha is about 0.95. The threshold graph displays a general downward trend so we choose threshold = 0.02. The short days graph displays a clear downward trend so we choose short_days = 30. The long days graph displays a peak at about 160. It’s not entirely clear whether or not this peak is simply an artifact of the particular dataset we are looking at. If we believed this, we might choose a value more like 300 since the graph displays a general upward trend. Regardless, we choose long_days = 160 in this case.

Using these values, our strategy has a profit of 0.26 USD, vs a loss of 0.23 USD from buying and holding GBP.
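The profit figures are quoted per unit of the second currency (USD here), which is converted into the first currency at the start. The bookkeeping can be sketched as below. This is a simplified illustration with made-up prices, and the function names `buy_and_hold_profit` and `switching_profit` are hypothetical; the full listing further down is what produced the reported numbers.

```python
def buy_and_hold_profit(close, start):
    """Profit in quote currency from converting 1 unit at close[start] and back at the end."""
    base_units = 1.0 / close[start]       # 1 USD -> GBP
    return base_units * close[-1] - 1.0   # GBP -> USD, minus the initial 1 USD


def switching_profit(close, start, signals):
    """Profit from alternating between base and quote currency.

    signals[i] is 1 (convert to base), -1 (convert to quote) or 0, acted on
    at close[start + i]. The position starts fully converted into base currency.
    """
    holding_base = True
    amount = 1.0 / close[start]
    for i, s in enumerate(signals):
        price = close[start + i]
        if s == -1 and holding_base:
            amount *= price        # base -> quote
            holding_base = False
        elif s == 1 and not holding_base:
            amount /= price        # quote -> base
            holding_base = True
    if holding_base:
        amount *= close[-1]        # convert back at the final price
    return amount - 1.0


close = [1.25, 1.30, 1.20, 1.10, 1.15]   # made-up GBPUSD closes
hold = buy_and_hold_profit(close, 0)
# Sell GBP at 1.30, buy it back at 1.10, then hold to the end.
switched = switching_profit(close, 0, [0, -1, 0, 1, 0])
```

In this toy series buy and hold loses money because the price ends below where it started, while stepping out of GBP during the decline ends with a gain.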

Another thing we can do here is guard against overfitting by splitting the dataset into two halves, repeating the process on each half, and seeing whether the optimal parameters change much. Surprisingly, we find that the optimal parameters are about the same for both halves of the data. This will often not be the case, however.

First half:

Second half:

EURUSD

Strategy profit is: 0.9485592864080428
Buy and hold profit is: 0.086715457972943

We choose alpha = 0.95, threshold = 0.01, short_days = 30, long_days = 130.

Using these values, our strategy has a profit of 0.95 USD, vs a profit of 0.087 USD from buying and holding EUR.

AUDUSD

Strategy profit is: 1.0978809301386634
Buy and hold profit is: 0.24194800155219265

From these graphs, it seems an optimal choice of parameters might be alpha = 0.95, threshold = 0.01, short_days = 80, long_days = 125. Using these parameters, our strategy returns 1.1 USD vs 0.24 USD for buy and hold. However, it’s also clear that the parameters alpha, short_days and long_days are highly unstable. The optimum occurs at a narrow peak, with relatively nearby parameter values giving dramatically lower performance.

XAUUSD

Strategy profit is: 6.267892218264149
Buy and hold profit is: 4.984864864864865

We first try parameters of alpha = 0.85, threshold = 0.02, short_days = 60, long_days = 300. This results in a profit of 6.27 USD versus a buy and hold profit of 4.98 USD.

Interestingly, there is another optimum given by alpha = 0.95, threshold = 0.1, short_days = 60, long_days = 300.

Strategy profit is: 5.640074352374969
Buy and hold profit is: 4.984864864864865

The way the threshold and alpha graphs change when we change the base threshold and alpha shows the interdependence of the parameters.
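One way to look at this interdependence directly is to evaluate profit on a grid of two parameters at once rather than one at a time. The sketch below is only an illustration: `grid_profit`, `best_threshold_for_each_alpha` and `toy_profit` are hypothetical names, and the stand-in objective is invented so that the best threshold shifts with alpha. In practice the profit function would wrap `calculate_strategy` and `price_strategy` from the listing.

```python
def grid_profit(profit_fn, alphas, thresholds):
    """Evaluate profit_fn on every (alpha, threshold) pair.

    Returns a dict mapping (alpha, threshold) to profit.
    """
    return {(a, t): profit_fn(a, t) for a in alphas for t in thresholds}


def best_threshold_for_each_alpha(grid, alphas, thresholds):
    """For each alpha, the threshold with the highest profit on the grid."""
    return {a: max(thresholds, key=lambda t: grid[(a, t)]) for a in alphas}


def toy_profit(alpha, threshold):
    """Stand-in objective with an interaction term (an assumption, not real data)."""
    return -(threshold - (1.0 - alpha)) ** 2


alphas = [0.85, 0.90, 0.95]
thresholds = [0.05, 0.10, 0.15]
grid = grid_profit(toy_profit, alphas, thresholds)
best = best_threshold_for_each_alpha(grid, alphas, thresholds)
```

If the best threshold were the same for every alpha, the two parameters could safely be tuned separately. Here it moves from 0.15 to 0.05 as alpha increases, which is the kind of coupling that one-dimensional plots hide.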

Python code

The Python code used in this analysis is made available below.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import minimize

#data = pd.read_csv("GBPUSD.txt")
idx = 0
Names = ["GBPUSD", "EURUSD", "AUDUSD", "XAUUSD"]
Name = Names[idx]
data = pd.read_csv(Name + ".txt")
#data = data.iloc[::-1] #reverses order of dates

# Filter for daily data only
data = data.drop_duplicates(subset=['<DTYYYYMMDD>'], keep='last')

close = np.array(data["<CLOSE>"])
#close = close[:3389]

def moving_avg(close, index, days, alpha):
    # float values allowed for days for use with optimization routines.
    partial = days - np.floor(days)
    days = int(np.floor(days))
    
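    # weights[k] = alpha**k is paired with av_period[k]; av_period runs from
    # the oldest price in the window (k = 0) to the price at index (k = days - 1).
    weights = [alpha**i for i in range(days)]
    av_period = list(close[max(index - days + 1, 0): index+1])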

    if partial > 0:
        weights = [alpha**(days)*partial] + weights
        av_period = [close[max(index - days, 0)]] + av_period

    return np.average(av_period, weights=weights)



def calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold):
    strategy = [0]*(len(close) - start_offset)
    short = [0]*(len(close) - start_offset)
    long = [0]*(len(close) - start_offset)
    boughtorsold = 1   
    for i in range(0, len(close) - start_offset):   
        short[i] = moving_avg(close, i + start_offset, short_days, alpha)
        long[i] = moving_avg(close, i + start_offset, long_days, alpha)
        if short[i] >= long[i]*(1+threshold) and boughtorsold != 1:
            boughtorsold = 1
            strategy[i] = 1
        if short[i] <= long[i]*(1-threshold) and boughtorsold != -1:
            boughtorsold = -1
            strategy[i] = -1       
    return (strategy, short, long)
                

def price_strategy(strategy, close, short_days, long_days, alpha, start_offset):
    cash = 1/close[start_offset]  # Start with one unit of CCY2, converted into CCY1
    bought = 1
    for i in range(0, len(close) - start_offset):
        #print(strategy[i])
        if strategy[i] == 1:
            cash = cash/close[i + start_offset]       
            bought = 1
        if strategy[i] == -1:   
            cash = cash*close[i + start_offset]           
            bought = -1    
    # Sell at end
    if bought == 1:
        cash = cash*close[-1]
    #if bought == -1:
    #    cash = cash - close[-1]
    return cash
        
def graph_strategy(close, strategy, short, long, start_offset):
    x = list(range(0, len(close) - start_offset))
    plt.figure(0)
    plt.plot(x, close[start_offset:], label = Name)
    plt.plot(x, short, label = "short_av")
    plt.plot(x, long, label = "long_av")
    buyidx = []
    sellidx = []
    for i in range(len(strategy)):
        if strategy[i] == 1:
            buyidx.append(i)
        elif strategy[i] == -1:
            sellidx.append(i)
    marker_height = (1+0.1)*min(close) - 0.1*max(close)
    plt.scatter(buyidx, [marker_height]*len(buyidx), label = "Buy", marker="|")
    plt.scatter(sellidx, [marker_height]*len(sellidx), label = "Sell", marker="|")
    plt.title('Moving average crossover')
    plt.xlabel('Timestep')
    plt.ylabel('Price')        
    plt.legend()

def plot_param(x, close, start_offset, param_index, param_values):
    profit = []
    x2 = x.copy()
    for value in param_values:
        x2[param_index] = value
        short_days = x2[0]
        long_days = x2[1]
        alpha = x2[2]
        threshold = x2[3]        
        (strat, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
        profit.append(price_strategy(strat, close, short_days, long_days, alpha, start_offset) - 1)
    plt.figure(param_index+1)
    param_names = ["short_days", "long_days", "alpha", "threshold"]
    name = param_names[param_index]
    plt.title('Strategy profit vs ' + name)
    plt.xlabel(name)
    plt.ylabel('Profit')    
    plt.plot(param_values, profit, label = "Profit")

def evaluate_params(x, close, start_offset):   
    short_days = x[0]
    long_days = x[1]
    alpha = x[2]
    threshold = x[3]
    (strat1, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
    profit = price_strategy(strat1, close, short_days, long_days, alpha, start_offset)
    return -profit #Since we minimise

#Initial strategy parameters.
short_days = 30
long_days = 300
alpha = 0.95
start_offset = 300
threshold = 0.02#0.01

#short_days = 10
#long_days = 30
#alpha = 0.75
#alpha = 0.92
#threshold = 0.02

x = [short_days, long_days, alpha, threshold]

#Price strategy
(strat1, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
profit = price_strategy(strat1, close, short_days, long_days, alpha, start_offset)
print("Strategy profit is: " + str(profit - 1))
print("Buy and hold profit is: " + str(close[-1]/close[start_offset] - 1))

#Graph strategy
graph_strategy(close, strat1, short, long, start_offset)

#Graph parameter dependence
plot_param(x, close, start_offset, 2, np.arange(0.7, 1, 0.02))
plot_param(x, close, start_offset, 3, np.arange(0.01, 0.1, 0.001))
plot_param(x, close, start_offset, 0, np.arange(20, long_days, 10))
plot_param(x, close, start_offset, 1, np.arange(short_days, 300, 10))

Backtesting and Optimizing a Bitcoin/Crypto Moving Average Crossover Algorithm on Binance Data

Our quant consulting service can help you backtest and optimize your moving average strategy. Find out more about our algorithmic trading consulting services and contact us today.

See also our article on backtesting a moving average strategy on synthetic data.

The crossing over of short and long term moving averages is a well-known signal used in algo trading. The idea is that when the short term average rises above the long term average, it could indicate the price is beginning to rise. Similarly, when the short term average falls below the long term average, it could indicate the price is beginning to fall. The Binance academy refers to these as the golden cross and death cross respectively. Moving averages can be applied to Bitcoin or other cryptocurrencies as a strategy or one component of a strategy.
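As a rough illustration of the idea, the sketch below detects both kinds of cross using plain unweighted averages on a made-up price series. The names `simple_moving_avg` and `find_crosses` are hypothetical, and the full code at the bottom of this post uses a weighted average and a threshold instead.

```python
def simple_moving_avg(prices, index, days):
    """Unweighted mean of the `days` prices ending at `index` (inclusive)."""
    window = prices[max(index - days + 1, 0): index + 1]
    return sum(window) / len(window)


def find_crosses(prices, short_days, long_days):
    """Return (index, label) pairs where the short average crosses the long one."""
    crosses = []
    prev_diff = None
    for i in range(long_days - 1, len(prices)):
        diff = (simple_moving_avg(prices, i, short_days)
                - simple_moving_avg(prices, i, long_days))
        if prev_diff is not None:
            if prev_diff <= 0 and diff > 0:
                crosses.append((i, "golden"))   # short rises above long
            elif prev_diff >= 0 and diff < 0:
                crosses.append((i, "death"))    # short falls below long
        prev_diff = diff
    return crosses


# Prices fall, then rise, then fall again.
prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 9, 8, 7, 6]
print(find_crosses(prices, short_days=2, long_days=4))
# → [(6, 'golden'), (10, 'death')]
```

Both signals arrive a couple of steps after the actual turning points at indices 4 and 8, which is the lag inherent in any averaging window.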

The moving average strategy has a number of parameters that need to be determined:

  • short_days – The number of days used for the short term average
  • long_days – The number of days used for the long term average
  • alpha – This is a parameter used in the exponential moving average calculation which determines how much less weight is given to data further in the past. A value of 1 means all data gets the same weighting.
  • threshold – A threshold of 10% means that instead of executing when the two averages cross, we require that the short average pass the long average by 10%. The idea is to prevent repeatedly entering/exiting a trade if the price is jumping about near the crossover point.
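The roles of alpha and threshold can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the listing's own implementation: the helper names are hypothetical, and the weighting assumes the most recent price has age 0 and therefore weight 1, with each older price weighted by a further factor of alpha.

```python
def exp_weights(days, alpha):
    """Weights alpha**age for ages days-1, ..., 1, 0 (oldest price first).

    Assumption: the most recent price has age 0 and therefore weight 1.
    """
    return [alpha ** age for age in range(days - 1, -1, -1)]


def weighted_avg(window, alpha):
    """Exponentially weighted mean of `window`, ordered oldest to newest."""
    weights = exp_weights(len(window), alpha)
    return sum(w * p for w, p in zip(weights, window)) / sum(weights)


def signal(short_av, long_av, threshold, position):
    """Return the new position: 1 (long), -1 (short) or the unchanged one."""
    if short_av >= long_av * (1 + threshold):
        return 1
    if short_av <= long_av * (1 - threshold):
        return -1
    return position  # inside the band: keep the current position


window = [100.0, 102.0, 104.0]
flat = weighted_avg(window, alpha=1.0)     # equal weights: plain mean
tilted = weighted_avg(window, alpha=0.5)   # weights 0.25, 0.5, 1
```

With alpha = 1 the result is the plain mean of 102, while alpha = 0.5 pulls the average towards the latest price. The threshold creates a band around the long average inside which the position is left alone, so small wiggles around the crossover point do not trigger repeated trades.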

Of course, one can also consider combining moving averages with other signals/strategies like pairs trading to improve its effectiveness.

In this article we create Python code to backtest a moving average crossover strategy on historical bitcoin spot data from Binance.

We grab the daily Binance BTCUSD close spot prices for the past 945 days here. The data ranges from September 2019 to April 2022. For our purposes we will ignore the complexities introduced by the order book, and assume there is always enough liquidity at the top of the book for the quantity we want to trade. This will likely be the case for individual investors trading their own money, for example. We will also ignore transaction costs, since these are usually negligible compared to price changes unless we are examining a high frequency strategy.

The full Python code is included at the bottom of this post. It features the following functions:

  • moving_avg – This computes a moving average over a number of days ending at a given index. The function was modified to accept a non-integer number of days in case it needed to work with an optimization algorithm.
  • calculate_strategy – This takes as input the strategy parameters and calculates where the strategy buys/sells.
  • price_strategy – This takes the strategy created by calculate_strategy and calculates the profit on this strategy on the current data.
  • graph_strategy – This generates a graph of the price data, along with the short and long term moving averages and indicators showing where the strategy buys/sells
  • plot_param – This plots profit as a function of one of the parameters to help gauge optimal values
  • evaluate_params – This is a reformatting of the price_strategy function to make it work with optimization algorithms like scipy.optimize.

We assume that we always either hold or short bitcoin based on whether the strategy predicts that the price is likely to rise or fall. To begin, we choose some initial parameters more or less at random.

#Initial strategy parameters.
short_days = 15
long_days = 50
alpha = 1
start_offset = long_days
threshold = 0.05

Then we execute the code, which produces the following output.

Strategy profit is: 31758.960000000006
Buy and hold profit is: 33115.64

This means that with these randomly chosen parameters, our strategy actually performs slightly worse than simply buying and holding. The code produces this graph, which shows the bitcoin price, the long and short averages, and markers along the bottom indicating where the strategy decided to buy and sell. It’s apparent that the strategy executes essentially where the orange and green lines cross over (although the threshold parameter mentioned earlier will affect this).

The code also produces the following graphs which show how the strategy profit varies with each parameter.

These graphs are quite “jumpy”, i.e. have a significant amount of noise. An attempt to find the optimal parameters using optimization algorithms would probably just “overfit” to the noise (or just get stuck at a local maximum). It’s clear that to deploy optimization algorithms to find the optimal parameters, you would probably need to apply some kind of multidimensional smoothing algorithm to create a smoother objective function. You could also attempt to use more data, however going back further means the data may become less relevant.
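To illustrate why smoothing matters, the sketch below applies a simple centred moving average to a made-up profit curve before picking the best parameter. It is one-dimensional only and the helper names `smooth` and `argmax` are hypothetical. A real application would need to smooth across all four parameters at once.

```python
def smooth(values, half_width):
    """Centred moving average; the window shrinks near the ends of the list."""
    smoothed = []
    for i in range(len(values)):
        lo = max(i - half_width, 0)
        hi = min(i + half_width + 1, len(values))
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def argmax(values):
    """Index of the largest value (the first one if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up profit curve: a broad hump around indices 3 to 5 plus an isolated spike at index 8.
profit = [1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 3.0, 1.0, 6.0, 0.0]
raw_best = argmax(profit)                 # 8: the isolated spike
smooth_best = argmax(smooth(profit, 1))   # 4: the middle of the broad hump
```

The raw maximum lands on a spike whose neighbours perform poorly, whereas the smoothed maximum sits in a region where nearby parameter values also do well.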

Instead, we proceed using visual inspection only. However, it’s important to realise that these parameters are potentially interdependent so one has to be careful optimizing them one at a time. We could consider looking at three dimensional graphs to get a better idea of what’s going on. From the first plot, it appears that an alpha of 0.92 is optimal. The threshold graph is very jumpy, but it appears as if the graph has an overall downward trend. Thus we choose the threshold to be a small value, say 0.02. The short averaging days graph also has a downward trend, so let’s make it something small like 10. The long averaging days graph indicates that the strategy performs poorly for smaller values. Let’s try making it 35. Now let’s run the code again with these new parameters.

Strategy profit is: 41372.97999999998
Buy and hold profit is: 33115.64

So our strategy is now more profitable than buy and hold, at 41.4k vs 33.1k.

Our optimized strategy has performed considerably better than our initial choice of parameters, and considerably better than buy and hold. But before we get excited, we have to consider the possibility that we have overfitted. That is, we have found parameters that perform exceptionally on this particular dataset, but may perform poorly on other datasets (particularly future datasets). One way to explore this question is to break the data up into segments and run the code individually on each segment to see whether the optimal parameters disagree. Let’s try the 400 to 600 region from the graph first.

Strategy profit is: 32264.25
Buy and hold profit is: 17038.710000000003

In this region, the strategy is still dramatically more profitable than buy and hold, and an alpha of 0.75 still seems to be approximately optimal. Now let’s look at the region from 600 to 800 on the original graph.

Strategy profit is: 8708.76000000001
Buy and hold profit is: 10350.419999999998

In this region, the strategy actually performs worse than buy and hold (although the loss is massively outweighed by the profit from the 400 to 600 region). While an alpha of 0.75 still seems approximately optimal, the strategy doesn’t perform significantly better than buy and hold for any value of alpha.

Find out more about our algorithmic trading consulting services.

Also check out our article on statistical arbitrage / pairs trading of cryptocurrency.

Python code

Below is the full Python code used to fit the strategy and create the graphs used in this article. The four parameters can be manually altered under “initial strategy parameters”.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import minimize

data = pd.read_csv("Binance_BTCUSDT_d.csv")
data = data.iloc[::-1] #reverses order of dates
close = np.array(data.close)

def moving_avg(close, index, days, alpha):
    # float values allowed for days for use with optimization routines.
    partial = days - np.floor(days)
    days = int(np.floor(days))
    
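    # weights[k] = alpha**k is paired with av_period[k]; av_period runs from
    # the oldest price in the window (k = 0) to the price at index (k = days - 1).
    weights = [alpha**i for i in range(days)]
    av_period = list(close[max(index - days + 1, 0): index+1])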

    if partial > 0:
        weights = [alpha**(days)*partial] + weights
        av_period = [close[max(index - days, 0)]] + av_period
    
    return np.average(av_period, weights=weights)    

def calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold):
    strategy = [0]*(len(close) - start_offset)
    short = [0]*(len(close) - start_offset)
    long = [0]*(len(close) - start_offset)
    boughtorsold = 1   
    for i in range(0, len(close) - start_offset):   
        short[i] = moving_avg(close, i + start_offset, short_days, alpha)
        long[i] = moving_avg(close, i + start_offset, long_days, alpha)
        if short[i] >= long[i]*(1+threshold) and boughtorsold != 1:
            boughtorsold = 1
            strategy[i] = 1
        if short[i] <= long[i]*(1-threshold) and boughtorsold != -1:
            boughtorsold = -1
            strategy[i] = -1
                
    return (strategy, short, long)

def price_strategy(strategy, close, short_days, long_days, alpha, start_offset):
    cash = -close[start_offset]  # subtract initial purchase cost
    bought = 1
    for i in range(0, len(close) - start_offset):
        #print(strategy[i])
        if strategy[i] == 1:
            # Note the factor of 2 is due to selling and shorting
            cash = cash - 2*close[i + start_offset]       
            bought = 1
        if strategy[i] == -1:   
            cash = cash + 2*close[i + start_offset]           
            bought = -1    
    # Sell at end
    if bought == 1:
        cash = cash + close[-1]
    if bought == -1:
        cash = cash - close[-1]
    return cash
        
def graph_strategy(close, strategy, short, long, start_offset):
    x = list(range(0, len(close) - start_offset))
    plt.figure(0)
    plt.plot(x, close[start_offset:], label = "BTCUSDT")
    plt.plot(x, short, label = "short_av")
    plt.plot(x, long, label = "long_av")
    buyidx = []
    sellidx = []
    for i in range(len(strategy)):
        if strategy[i] == 1:
            buyidx.append(i)
        elif strategy[i] == -1:
            sellidx.append(i)
    plt.scatter(buyidx, [0]*len(buyidx), label = "Buy", marker="|")
    plt.scatter(sellidx, [0]*len(sellidx), label = "Sell", marker="|")
    plt.title('Moving average crossover')
    plt.xlabel('Timestep')
    plt.ylabel('Price')        
    plt.legend()

def plot_param(x, close, start_offset, param_index, param_values):
    profit = []
    x2 = x.copy()
    for value in param_values:
        x2[param_index] = value
        short_days = x2[0]
        long_days = x2[1]
        alpha = x2[2]
        threshold = x2[3]        
        (strat, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
        profit.append(price_strategy(strat, close, short_days, long_days, alpha, start_offset))
    plt.figure(param_index+1)
    param_names = ["short_days", "long_days", "alpha", "threshold"]
    name = param_names[param_index]
    plt.title('Strategy profit vs ' + name)
    plt.xlabel(name)
    plt.ylabel('Profit')    
    plt.plot(param_values, profit, label = "Profit")

def evaluate_params(x, close, start_offset):   
    short_days = x[0]
    long_days = x[1]
    alpha = x[2]
    threshold = x[3]
    (strat1, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
    profit = price_strategy(strat1, close, short_days, long_days, alpha, start_offset)
    return -profit #Since we minimise

#Initial strategy parameters.
short_days = 15
long_days = 50
alpha = 1
start_offset = long_days
threshold = 0.05

#short_days = 10
#long_days = 30
#alpha = 0.75
#alpha = 0.92
#threshold = 0.02

x = [short_days, long_days, alpha, threshold]

#Price strategy
(strat1, short, long) =  calculate_strategy(close, short_days, long_days, alpha, start_offset, threshold)
profit = price_strategy(strat1, close, short_days, long_days, alpha, start_offset)
print("Strategy profit is: " + str(profit))
print("Buy and hold profit is: " + str(close[-1] - close[start_offset]))

#Graph strategy
graph_strategy(close, strat1, short, long, start_offset)

#Graph parameter dependence
plot_param(x, close, start_offset, 2, np.arange(0.7, 1, 0.02))
plot_param(x, close, start_offset, 3, np.arange(0.01, 0.1, 0.001))
plot_param(x, close, start_offset, 0, np.arange(5, long_days, 1))
plot_param(x, close, start_offset, 1, np.arange(short_days, 50, 1))

#Optimization
#x0 = [short_days, long_days, alpha, threshold]
#start_offset = 70

#bnds = ((1, start_offset), (1, start_offset), (0.001,1), (0,None))
#cons = ({'type': 'ineq', 'fun': lambda x:  x[1] - x[0]})

# SLSQP accepts both bounds and constraints; BFGS supports neither.
#result = minimize(evaluate_params, x0, args=(close, start_offset), bounds=bnds, constraints=cons, method='SLSQP')
#print(result)
#evaluate_params(result.x, close, start_offset)