Quant Basics 4: Analysing A Single Backtest


August 24, 2017

In the previous posts we have downloaded market data, developed a vectorised backtest and calculated PnL, Sharpe ratio and drawdown. In this post we will set up, run and analyse a single backtest. This is the basis for running parameter sweeps and optimisations with hundreds or thousands of backtests.

So, let’s set some important parameters first:

tickers = ['AAPL','MSFT','CSCO','XOM']
start = '2013-01-01'
end = '2017-06-01'

Next, we run our data-gathering code:

p = prices(tickers,start,end,backend='google')
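
A note for readers following along today: the Google Finance backend used above has since been discontinued, so this call may no longer work out of the box. As a rough sketch of what a prices()-style helper could look like with a different data source (this is not the author's implementation, which is in the GitHub repo; prices_sketch, the field argument and the yfinance dependency are assumptions for illustration):

import yfinance as yf

def prices_sketch(tickers, start, end, field='Close'):
    # Download OHLCV data for all tickers in one call; the columns come back
    # as a MultiIndex of (field, ticker)
    data = yf.download(tickers, start=start, end=end)
    # Keep only the chosen price field, one column per ticker, and drop gaps
    return data[field][tickers].dropna()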

Now we wrap the whole backtest into a single function: signals with moving-average lookback periods of 10 and 20, followed by PnL, Sharpe ratio and drawdown:

def run_single(tickers, p):
    # Moving-average signals with lookbacks of 10 and 20
    sig = calc_signals(tickers,p,10,20)
    # PnL curve, Sharpe ratio and drawdown for this parameter set
    pnl = calc_pnl(sig,p)
    sharpe = calc_sharpe(pnl)
    ddwn = calc_ddwn(pnl)
    return pnl,sharpe,ddwn

pnl,sharpe,ddwn = run_single(tickers,p)

When we run this we get the following results:

pnl[-1] = 201.14
sharpe = 1.01
drawdown = -39.86
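
As a reminder, the helpers calc_signals, calc_pnl, calc_sharpe and calc_ddwn were developed in the previous posts of this series. For readers who don't have them handy, the two performance metrics can be sketched roughly as follows (a minimal sketch assuming a cumulative daily PnL series and annualisation with the square root of 252; the actual implementations in the repo may differ):

import numpy as np

def calc_sharpe_sketch(pnl, periods_per_year=252):
    # Daily PnL changes taken from the cumulative PnL curve
    daily = pnl.diff().dropna()
    # Annualised Sharpe ratio, ignoring the risk-free rate
    return np.sqrt(periods_per_year) * daily.mean() / daily.std()

def calc_ddwn_sketch(pnl):
    # Largest drop of the cumulative PnL curve below its running maximum
    return (pnl - pnl.cummax()).min()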

The PnL curve does not look too bad for this strategy; the drawdown is acceptable and so is the Sharpe ratio. However, there are some important points to consider:

  • We chose the moving-average parameters at random and could simply have been lucky
  • We used an arbitrary time period, and the strategy could perform very differently at other times
  • We used an arbitrary set of stocks that we know today; at the start of the backtest period this selection could have looked very different
  • We used equal position sizes for each equity, so more expensive stocks may contribute more to the PnL

It is good to be aware of these points and we will address them later on. For now, let's just look at the final one: equal position sizing. In our code we effectively hold one unit of each stock, because of the way the signal is constructed and the PnL is calculated. This weights the more expensive stocks higher. One way to overcome that would be to adjust the position sizes when we create the signals, but that turns into a very complex problem if we try to vectorise it. Another, much quicker approach is to normalise our prices with a scaler. We do this in our prices() function after importing the scaling function from scikit-learn, a wonderful machine learning package, like so:

from sklearn.preprocessing import MinMaxScaler

This normalises the prices to the range between zero and one, which is somewhat equivalent to adjusting position sizes. There is an element of look-ahead bias here, since we scale the prices over the whole series, but this dependency is fairly weak and the result is more than sufficient for our purposes. Let's replace line 36 in our prices() function with this:

    # Scale each price series to the range [0, 1] and rebuild the DataFrame
    scaled = MinMaxScaler((0,1)).fit_transform(p[field])
    pp = pd.DataFrame(scaled,index=p[field].index,columns=tickers)

As a result, our PnL curve looks a lot less impressive but far more realistic. Note that after scaling we can no longer compare PnLs between the two versions of the strategy, since the normalised data only range from zero to one.
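
To make the look-ahead caveat concrete, here is a toy illustration with made-up numbers: the scaled value of an early price depends on the maximum of the whole series, which is only known in hindsight.

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy price series: the late high of 200 changes how earlier prices are scaled
toy = pd.DataFrame({'AAPL': [100.0, 110.0, 120.0, 200.0]})

# Scaling over the whole series uses the future maximum of 200
full = MinMaxScaler((0,1)).fit_transform(toy)
print(full[1,0])      # 0.1 -> (110 - 100) / (200 - 100)

# Scaling only the history known on day three gives a different value
partial = MinMaxScaler((0,1)).fit_transform(toy.iloc[:3])
print(partial[1,0])   # 0.5 -> (110 - 100) / (120 - 100)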

Now we've run our backtest, calculated PnL, Sharpe and drawdown, and used a trick to adapt position sizing so that all stocks carry a similar weight. In the next section I will show you how to run a Monte Carlo parameter sweep to optimise this strategy, and later we will find ways to get around the data-mining bias that arises from doing so.

The code base for this section can be found on GitHub.

5 Responses

  1. Amir Nejad says:

    How can you normalize data to 0,1 but plot data below zero?

  2. Tom Starke says:

    Hi Amir,

    Could you be a bit more specific please? The plots in this post are PnL curves and they can definitely go below zero. If you found any normalised price curves anywhere that go below zero please let me know.

    Thanks, Tom

  3. Rich O'Regan says:

    Hey Amir, if I understand correctly, Tom normalised the MARKET DATA of each of the stocks individually, i.e. AAPL and MSFT prices went from 0 (min) to 1 (max).
    Personally, I think I'd have calculated profit as a percentage of market price and used percent cumulation – but the charts would essentially look pretty similar, I reckon.

  4. an Admirer says:

    i’m not getting the same results as you when i ran the code with your parameters. pretty far from it. can you double check your data?

  5. Tom Starke says:

    It's difficult to reply to this since you haven't given any more details. You might want to try running the Jupyter notebook from the GitHub repo (the link can be found at the end of the article). Let me know how you go.
