Sketching the Option Backtester v2 (with Code downloadable for ALL readers)
In the last post, we wrote code to test for the pnl of a system that continuously rebalances and shorts the atm straddle on index options.
We made a few notes at the time - the code is rather inflexible, in that the logic is tied to the database. The problem arises from option data being massive, which means we need to load the data into RAM on the fly instead of preloading it. To deal with performance concerns we want some vectorisation and preprocessing, but the issue arises again when testing with derivative contracts: they expire, so we do not know the contract selection ahead of time - and if we do not know which contracts we are trading ahead of time…there is no precompute to be had.
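To make the contract-selection point concrete, here is a minimal sketch of choosing the straddle on a given rebalance date. It is not the attached code: `Quote` and `select_atm_straddle` are hypothetical names, the quotes are made up, "next monthly" is approximated as the nearest listed expiry after the rebalance date, and "atm" as the strike closest to spot that has both a call and a put.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Quote:
    """One option quote (hypothetical schema, for illustration only)."""
    expiry: date
    strike: float
    right: str   # "C" for call, "P" for put
    mid: float


def select_atm_straddle(chain: List[Quote], today: date, spot: float) -> Tuple[Quote, Quote]:
    """Pick the (call, put) pair at the strike nearest spot, on the nearest live expiry."""
    live = [q for q in chain if q.expiry > today]   # expired contracts drop out
    if not live:
        raise ValueError("no unexpired contracts in chain")
    expiry = min(q.expiry for q in live)
    calls = {q.strike: q for q in live if q.expiry == expiry and q.right == "C"}
    puts = {q.strike: q for q in live if q.expiry == expiry and q.right == "P"}
    strikes = sorted(set(calls) & set(puts))        # need both legs of the straddle
    if not strikes:
        raise ValueError("no strike with both a call and a put")
    atm = min(strikes, key=lambda k: abs(k - spot))
    return calls[atm], puts[atm]


# Made-up chain: two expiries, a couple of strikes.
chain = [
    Quote(date(2023, 1, 20), 100.0, "C", 2.1),
    Quote(date(2023, 1, 20), 100.0, "P", 1.9),
    Quote(date(2023, 1, 20), 105.0, "C", 0.6),
    Quote(date(2023, 1, 20), 105.0, "P", 4.4),
    Quote(date(2023, 2, 17), 100.0, "C", 3.5),
    Quote(date(2023, 2, 17), 100.0, "P", 3.2),
]
call, put = select_atm_straddle(chain, today=date(2023, 1, 10), spot=101.3)
```

The selection depends on both the date and the spot at that date, which is why the contracts cannot simply be fixed up front the way a column of closes can for a linear product.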
Apart from the code being slow, another issue is room for human error. When dealing with linear products, we designed a powerful no-code quantitative backtesting solution that encodes systematic rules as mathematical formulas; to test a 20/50 day moving average crossover trend following strategy, all it took was to write ‘mac_20/50(close)’ and let our quant engine unravel the logic, with very little room for error. This means…even for future strategies, we worry less about making mistakes in our signal generation, since we do not write ANY Python code.
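For readers who have not seen the linear-product posts, this is roughly the rule that a formula like ‘mac_20/50(close)’ stands for. It is NOT our quant engine, just a hand-written sketch; the function names `sma` and `mac_signal`, and the +1/-1/0 signal convention, are assumptions for illustration.

```python
from typing import List, Optional


def sma(values: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` observations are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]   # drop the observation leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def mac_signal(close: List[float], fast: int = 20, slow: int = 50) -> List[int]:
    """+1 when the fast average is above the slow one, -1 when below, 0 otherwise."""
    signal = []
    for f, s in zip(sma(close, fast), sma(close, slow)):
        if f is None or s is None:
            signal.append(0)                # not enough history yet
        elif f > s:
            signal.append(1)
        elif f < s:
            signal.append(-1)
        else:
            signal.append(0)
    return signal


# Tiny made-up series with short windows so the crossover is visible.
toy_signal = mac_signal([1.0, 2.0, 3.0, 2.0, 1.0], fast=2, slow=3)
```

Even a rule this small has places to slip (window alignment, the warm-up period, which bar the signal is acted on), which is the argument for encoding it once in a formula rather than rewriting it per strategy.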
The issue with our option backtesting code is that everything is subject to implementation error - data loading, data preprocessing, contract selection, pnl compute, signal generation. This is a clear operational hazard, and will at some point bite us - I have made the mistake of thinking I had great alpha, when it actually came from a copy-by-reference Python implementation error that introduced forward-looking bias.
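To show the kind of bug this is (not the actual bug I hit, just a toy reconstruction with made-up names and prices): if we store a reference to a mutable object as "what we knew at time t" and keep mutating that object, every stored snapshot quietly learns about the future.

```python
import copy


def record_history(updates, take_copy):
    """Return what the strategy 'knew' after each update (hypothetical helper)."""
    book = {}       # mutable state that grows as new data arrives
    history = []
    for day, price in updates:
        book[day] = price
        # Without a copy, every entry in `history` is the SAME dict object.
        history.append(copy.deepcopy(book) if take_copy else book)
    return history


updates = [("d1", 100.0), ("d2", 105.0), ("d3", 90.0)]
leaky = record_history(updates, take_copy=False)
clean = record_history(updates, take_copy=True)

# The first "snapshot" of the leaky version already contains d3's price.
leaks_future = "d3" in leaky[0]
```

A signal computed from `leaky[0]` can see the day-3 price on day 1, and the backtest will happily report the resulting "alpha".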
So for the quant devs who have been on hangukquant for a while now, you should know the principles that allow us to alleviate these issues…abstraction and modularisation. Well, these are software engineering design principles, but they apply to quant dev.
Why? Modularisation means each software component has a well-defined job. Abstraction means that we trust that the software component fulfils that job, and we don’t care how the job was done. Complex but stable systems can only be built on these principles.
But sometimes, when we are developing a system…we can’t even answer…what jobs are there? Imagine you have a cereal bowl, a coffee mug and a water bottle - and you decide to get rid of the clutter by drinking everything from a bowl. That works because the common functionality is that each one holds some liquid. We need to know what the jobs are that need to be performed.
With that motivation, we write code to test for the pnl of a system that continuously rebalances and shorts the atm straddle on SINGLE stock options for the next monthlies.
I just want to make a few quick notes: the attached code is NOT reviewed for logic errors and we have not spent much time trying to debug it. The REASON being that over the coming posts, we are going to massively improve the code, in the direction discussed above. So, please do NOT use this for your research…what I want you to DO is to review the code attached, and put it side by side with the code attached to the previous post on testing index options. Do NOT trust the pnl of either code.
There should be striking similarities in the code, which means there should be obvious ways to abstract logic out of each individual class into a parent-child relationship. This will be our first step towards writing a more robust system. (code, change .cbz to .zip and you are good)
The explanations for the code are the same as before - only the data load and signal compute are different.
We are going to release the improvements over a series of posts, where the code will be made available to paid readers. Unfortunately, as far as sophistication and quant dev go, there is a reason why it is difficult to find material like this in textual format anywhere else. If you are advanced enough, you can just peruse the code attached and break down the logic, but for those who want to learn all of the thought processes and more in-depth programming, I am going to set up a new quant lecture series, QT202: Quantitative Options Backtesting, in January or February, and you can learn from there if you want to.
Here is the file directory, for reference - it is a bit different from the posts with index data:
The data is from https://historicaloptiondata.com/ (unaffiliated).
Again, to match your own data source, data provider and database (SQL or whatever you have)…you need to implement your own compute and load buffer logic.
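As a preview of the direction, here is a minimal sketch of what such a parent-child split could look like, with the shared loop in the parent and the data load and signal compute left to each child. This is NOT the attached code or the final design: `OptionBacktester`, `InMemoryShortStraddle`, `load_buffer`, `compute_signal` and the toy numbers are all hypothetical, and the pnl bookkeeping is deliberately oversimplified.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

Buffer = Dict[str, float]      # contract id -> pnl per unit held over the day (toy data)
Positions = Dict[str, float]   # contract id -> signed quantity


class OptionBacktester(ABC):
    """Hypothetical parent: owns the loop that every option strategy shares."""

    def run(self, dates: List[str]) -> float:
        pnl = 0.0
        for d in dates:
            buffer = self.load_buffer(d)                # child-specific: where data lives
            positions = self.compute_signal(d, buffer)  # child-specific: what to trade
            pnl += sum(qty * buffer[c] for c, qty in positions.items())  # shared
        return pnl

    @abstractmethod
    def load_buffer(self, d: str) -> Buffer:
        """Load only the data needed for date `d` (database, files, ...)."""

    @abstractmethod
    def compute_signal(self, d: str, buffer: Buffer) -> Positions:
        """Select contracts and sizes for date `d`."""


class InMemoryShortStraddle(OptionBacktester):
    """Toy child: reads from a dict and shorts one unit of every contract it sees."""

    def __init__(self, data: Dict[str, Buffer]):
        self.data = data

    def load_buffer(self, d: str) -> Buffer:
        return self.data[d]

    def compute_signal(self, d: str, buffer: Buffer) -> Positions:
        return {contract: -1.0 for contract in buffer}


# Made-up per-day pnl numbers, purely to exercise the loop.
toy = {"d1": {"C100": -1.0, "P100": 0.5}, "d2": {"C100": 2.0, "P100": -0.5}}
total = InMemoryShortStraddle(toy).run(["d1", "d2"])
```

Swapping the data provider then means writing a new `load_buffer`, and swapping index for single stock options means writing a new `compute_signal`, while the loop in the parent only has to be debugged once.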