I have just completed a few days of testing several modeling/backtesting programs. I thought, perhaps, the other members of the list might find the results useful; as I am new here, hopefully this can serve as my first productive contribution.
I do a lot of testing & modeling on (a) daily bars (looking to execute intraday trades) and (b) tick data (for short-term trades). In the case of the first, I need to create signals on a daily series but execute orders against a 1M intraday series. So, in order to test an idea over five years of data, I have to do it over 5Y of 1M data. This gets into performance issues when you start wanting to run a test on several issues and want to run multiple revisions of the test. As for testing on tick data, I am sure you are familiar (at least conceptually) with the performance issues there.
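The daily-signal / intraday-execution split described above can be sketched roughly like this. This is a toy illustration on synthetic data, not code from any of the platforms tested; the bar counts, column names, and the placeholder signal rule are all my own assumptions:

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute closes: 10 trading days, 5 one-minute bars each (toy numbers).
rows = []
for day in range(10):
    base = pd.Timestamp("2024-01-01") + pd.Timedelta(days=day)
    for m in range(5):
        rows.append((base + pd.Timedelta(minutes=m), 100.0 + day + 0.1 * m))
minute = pd.DataFrame(rows, columns=["ts", "close"]).set_index("ts")

# Build the daily series the signals are computed on.
daily_close = minute["close"].resample("D").last().dropna()

# A placeholder daily signal: long when today's close beat yesterday's.
signal = (daily_close > daily_close.shift(1)).astype(int)

# Map each daily signal to the first 1-minute bar of the NEXT day, so the
# order is filled intraday without look-ahead bias.
first_bar_of_day = minute.groupby(minute.index.normalize()).head(1)
exec_prices = first_bar_of_day["close"].copy()
exec_prices.index = exec_prices.index.normalize()
orders = signal.shift(1).dropna()          # yesterday's signal...
fills = exec_prices.reindex(orders.index)  # ...filled at today's first 1M bar
```

The point of the shift is the one that matters for testing on 1M data: the signal is known at the daily close, so the earliest realistic fill is the next intraday bar.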
I took a look at NinjaTrader, TradeStation, OpenQuant & AmiBroker. I created an ASCII file of 5Y of EURUSD, USDJPY, GBPUSD, AUDUSD & USDCAD 1M data. In each program, I ran a simple EMA crossover test (10/50). It was an obnoxious test, resulting in 300,000+ trades, but it was easy to implement and made a good stress test. What I wanted to see was whether I could get: a) reliable (i.e. reproducible) results from a single-security test and b) a test of the five FX pairs as a portfolio (again, in a reliable/reproducible manner).
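For reference, the 10/50 EMA crossover used in the stress test amounts to something like the following. This is a minimal sketch in Python on synthetic data, not code from any of the platforms; counting a "trade" as a signal flip is a simplification of mine:

```python
import numpy as np
import pandas as pd

def crossover_trades(close: pd.Series, fast: int = 10, slow: int = 50) -> int:
    """Count signal flips of a fast/slow EMA crossover system."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    long = (ema_fast > ema_slow).astype(int)
    return int((long.diff().abs() == 1).sum())  # each flip = one order

# Synthetic price path; a fixed seed makes the reproducibility point concrete:
rng = np.random.default_rng(42)
close = pd.Series(1.10 + 0.0005 * rng.standard_normal(5000).cumsum())

run1 = crossover_trades(close)
run2 = crossover_trades(close)
assert run1 == run2  # same data in, same trade count out -- every time
```

This is exactly the property being tested below: given identical input data, the trade count (and P&L) must be identical on every run.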
1. Ninja: I use Ninja daily for scalping and for executing some short-term system code. I had dim hopes for the backtesting since I am familiar with the program - Ninja really is an execution platform first and an analytical platform second. The results were more or less what I expected: I could get it to test one issue with reproducible results at an okay speed (about three minutes), but it would start to go into fits when I ran it against all five at once. The results of the five-issue test would vary from instance to instance - it would usually show the results for the first 3 issues correctly, but on the last 2 it would suffer some kind of memory issue and give me numbers that were totally off. In one pass, it even managed to corrupt itself and I had to reload all the data.
2. TradeStation: I have a lot of time invested in TradeStation and I was already familiar with its problems - mainly, that over a large test set, TS will return different test results. I have talked with TS support and posted on the message board about this, but I never got anyone interested in what I found to be a critical issue. The results of this test were as expected: I could not get two results to match. Any time I would refresh data from the TS data servers and run the test again, I would get a different result. The discrepancy could be as large as test 1 showing -$190K and test 2 showing +$74K. I do not understand how anyone can use this tool for successful modeling over a large dataset; just making up a number would have been as useful. I even exported and imported the data to ensure that it wasn't an issue with the TS data servers. Same inconsistency. I couldn't test all five pairs together since TS does not do portfolio backtesting. As for the time of a single test, it is hard to tell with TS what, exactly, it is waiting for or trying to do at any given moment, but I would say it usually took about 2-4 minutes per test (although I have no idea what it was doing in those 2-4 minutes, since the results it returned seemed random at times).
3. OpenQuant: This platform looked interesting. I set it up, imported the data and ran a test. After 20 minutes I noticed that the first EURUSD 1M test was only 10% complete. I closed it down and uninstalled it. The performance was just not going to work for me.
4. AmiBroker: You can likely guess the results, as I am here as a new member. AB was able to test EURUSD in about 30 seconds and was able to do the portfolio of all 5 pairs in about two minutes. No matter how many times I ran the test, the numbers it returned were the same. This fact is crucial to my work: I understand that data will have errors in it, but I at least need my modeling software to return results consistently (which AB did). I even created new databases and populated them with the same data - all test results returned were the same.
As you all know, AB was, by far, the cheapest option out of all of the above.
I hope some members find these results useful/interesting and I think that these results really are a credit to Tomasz.
I would be interested to hear from anyone who has tested AB against any other off-the-shelf tools that I did not look at.
This sort of direct comparison is somewhat rare to find. Thanks for sharing, Tim!