Designing a market timing system to maximize the probability it will work
Written by Administrator   
December 20, 1999
The purpose of this article is to provide some insight into how we at Merriman Capital Management (MCM) develop trading systems and, in particular, to answer the important question of how to assess a trading system’s potential to work in the future. The term I use to describe this trait is robustness. A robust trading system is generally not the highest-performing system when backtested; rather, it is an all-weather trading system that is capable of achieving performance objectives while handling a wide variety of price movements and market conditions.

Before describing the robust system design and check-out process, I would like to take a step back and describe how we use market timing at MCM.

Just as you diversify a portfolio by investing in various asset types, as a market timer I like to manage what I think of as a diversified portfolio of timing systems. This is important because, although robust trading systems can handle most market conditions, they still have weaknesses. We obtain diversification by three primary methods. The first, an extremely effective way to diversify and enhance year-to-year consistency, is to trade many markets or types of mutual funds, such as enhanced index funds, sector funds, emerging market funds, small cap funds and international funds. Just as with buy and hold strategies, trading many markets will enhance returns and reduce risk.

A second effective diversification strategy is to use trading systems that operate at different time scales. For instance, we may employ an intermediate-term trading system that generates two to four trades per year and a shorter-term trading system that trades 20 to 40 times per year. For those times when whipsaws are hurting intermediate-term system returns, such as with the S&P500 this year, a properly designed short-term timing system will likely be more effective, with the overall effect of smoothing out the ride for the entire portfolio.

The third diversification approach is one we have used and advocated for many years: the use of multiple timing systems. We choose every timing system carefully, but we know that each will underperform at times. When one is lagging, chances are that others will be doing well to smooth out overall portfolio performance.

It is very important that we take a portfolio-level viewpoint when designing each trading system. Then, we can accept the weakness in one system knowing that the strengths of others are designed to compensate for this weakness. I want to emphasize that the multiple-system approach is not a hedging strategy that ultimately reduces performance. Every system in the portfolio must individually meet our performance objectives.


With that background provided, I will now move to the primary topic of this article, the design of the robust trading system. You may have seen advertisements for systems promising high returns and 100% success rates in picking market tops and bottoms. Most if not all of these systems achieved this performance in backtesting by providing so many rules and exceptions that they are what we call over-optimized. They might perfectly "predict" last week’s market, for instance, but they are relatively useless once the market environment changes in any significant way. Thus there is very little chance that they will work in the future.

For this article, I will focus on the process used for intermediate trading systems, those that try to catch major moves that occur over time scales of three to six months, and which typically generate two to four trades a year.

The design of a new timing system starts with an idea. It may be prompted by a technical article, by a series of misadventures encountered with an older system, or it might be simply a hunch. The primary tool for screening, testing and evaluating potential new systems is backtesting by computer. In other words, we apply the system’s rules to data and markets in the past to determine how it would have performed. It’s a process of trial and error, repeated as the system is refined. We call this process optimization. The process typically generates many more ideas. However, computerized backtesting will typically eliminate 19 out of 20 of them.

The problem in this process is the risk that systems which work well on hypothetical data in the past won’t perform as expected in the future. How do we assess a system’s likelihood of working in the future? Listed below are five general criteria for evaluating intermediate-term system robustness:

  1. Sensitivity analysis on system parameters
  2. Testing in many types of markets and fund types
  3. System implicit risk analysis
  4. System consistency and recent performance
  5. Can the system edge be described in simple and logical terms?

Let’s look at each of these in more detail.


Sensitivity Analysis

First, we desire simple systems with no more than three to five parameters to optimize. Think of parameters as the quantitative component of rules or as conditions that must be met. For example, a simple system might dictate that a signal is created when some factor, say a mutual fund price, crosses the line of its own 39-week simple moving average. That’s one rule, and when expressing this rule mathematically, we typically need one parameter. In this case, that parameter is 39 weeks.

This simple system, which we have totally fabricated for this example and which is not a real system, might also require that the price move up or down from its intersection with the line by at least 1 percent. That’s a second rule, and with it comes another parameter, 1 percent. And the system might require something seemingly unrelated, such as that average interest rates on Treasury bills be no more than some set percentage. That’s a third rule, with a third parameter. If all three conditions were met, the system might generate a buy or sell signal.
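The fabricated three-rule example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 6 percent Treasury bill ceiling, and the choice to apply the interest rate condition to buy signals only are all hypothetical, and the 1 percent move is interpreted as the price standing at least 1 percent above or below the moving average.

```python
from statistics import mean


def sma(prices, window):
    """Simple moving average of the last `window` prices, or None if too few."""
    if len(prices) < window:
        return None
    return mean(prices[-window:])


def signal(weekly_prices, tbill_rate, window=39, band=0.01, max_rate=0.06):
    """Return "buy", "sell", or "hold" for the most recent week.

    window   -- moving-average look-back in weeks (parameter 1)
    band     -- required distance from the average, as a fraction (parameter 2)
    max_rate -- hypothetical T-bill rate ceiling for buys (parameter 3, assumed)
    """
    average = sma(weekly_prices, window)
    if average is None:
        return "hold"  # not enough history to evaluate the rules
    price = weekly_prices[-1]
    # Assumption: the rate condition gates buy signals only.
    if price >= average * (1 + band) and tbill_rate <= max_rate:
        return "buy"
    if price <= average * (1 - band):
        return "sell"
    return "hold"
```

Three rules, three parameters: `window`, `band` and `max_rate` are the only numbers that optimization could tune in this sketch.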

Typically when we design a system, we start with a general rule and then fine-tune it by testing various parameters. Systems with more than five parameters, especially intermediate-term systems, are more susceptible to over-optimization to past markets, as discussed earlier.

We also perform a sensitivity analysis to make sure that system performance does not radically change as the parameters are varied. For example, if changing a look-back period of a moving average from 39 weeks to 35 weeks makes a big difference in long-term performance, I am very suspicious of that system’s ability to be robust in the future. It seems too fine-tuned.
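A sensitivity check of this kind might look like the sketch below. It assumes a deliberately simplified moving-average backtest (hold the fund when the previous price was above its average, otherwise sit in cash at zero return); `backtest_sma` and `sensitivity` are illustrative names, not MCM's actual tools.

```python
def backtest_sma(prices, window, start):
    """Growth of 1.0: hold the fund for period t only if the prior price was
    above the average of the `window` prices before t; otherwise hold cash."""
    equity = 1.0
    for t in range(start, len(prices)):
        average = sum(prices[t - window:t]) / window
        if prices[t - 1] > average:  # signal is known before period t begins
            equity *= prices[t] / prices[t - 1]
    return equity


def sensitivity(prices, windows):
    """Final equity per look-back, plus the best-to-worst spread as a
    fraction of the best result."""
    start = max(windows)  # common start so every look-back is scored on the same periods
    results = {w: backtest_sma(prices, w, start) for w in windows}
    best, worst = max(results.values()), min(results.values())
    return results, (best - worst) / best
```

Calling `sensitivity(prices, [35, 37, 39, 41, 43])` on a long price history and finding a large spread between neighboring look-backs would be the warning sign described above.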


Test in Many Markets and Fund Types

A significant indication of robustness is being able to use a system optimized for one market, e.g. the S&P500, on many different markets without having to change any of the parameters. If I found that a system optimized on the S&P500 could also add significant value for timing a Japan fund, a small-cap fund, a value fund, a European fund, and an emerging markets fund, my confidence in that system would be boosted considerably. This is one form of what is called out-of-sample testing. Out-of-sample testing provides a good degree of confidence that the system will work as advertised in the future.
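As a rough sketch of this cross-market check, the code below runs one strategy, with its parameters frozen, over several markets and reports where it added value over buy and hold. The names are hypothetical, `strategy` stands for any function that maps a price list to final equity from a starting value of 1.0, and comparing against buy and hold from the first price is a simplification.

```python
def added_value(prices, strategy):
    """Strategy's final equity minus buy-and-hold's, both starting from 1.0."""
    return strategy(prices) - prices[-1] / prices[0]


def cross_market_check(markets, strategy):
    """Apply one fixed-parameter strategy to every market.

    markets  -- dict mapping a market name to its list of prices
    strategy -- callable: prices -> final equity (parameters already frozen)
    Returns the added value per market and the share of markets it helped.
    """
    report = {name: added_value(prices, strategy) for name, prices in markets.items()}
    share_helped = sum(value > 0 for value in report.values()) / len(report)
    return report, share_helped
```

The key design point is that `strategy` is passed in already configured, so nothing can be re-tuned per market.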

Another form of backtesting is to specifically examine the performance in challenging markets such as long-term bear markets, fast-rising bull markets, stair-stepping markets, trading ranges, and extended periods of low volatility. Examples of long-term downturns include the S&P500 from 1966 to 1982 and the Japanese Nikkei from 1990 to 1998. Recently, fast-rising markets such as the NASDAQ have dominated the news; we also want to ensure that the trading system captures a decent percentage of such moves. Observing how the system handles such markets can identify weaknesses and establish reasonable expectations for the system.


System Implicit Risk Analysis

Implicit risk analysis is basically the mental exercise of imagining all ways that a system may significantly underperform its objectives. The analysis goes beyond simply looking at the standard deviation of returns, maximum drawdowns and other mathematical measures of risk.

For example, a system that triggers buy and sell signals based solely on valuation measures, such as the Standard & Poor's 500 Index dividend yield, is at risk that those valuation measures will become outdated for any number of fundamental reasons. Such a risk will not be identified by backtesting. Other examples include the risk of system obsolescence, politically driven selloffs and recoveries, and what I call "nightmare formations" for trend-following systems. Nightmare formations are price patterns which produce repeated unprofitable whipsaws.

All trading systems have inherent weaknesses. We try to incorporate multiple systems that complement each other so that one may perform well when another is lagging. Careful selection helps us minimize the chances that an entire portfolio will be significantly hurt.
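For reference, the mathematical measures that implicit risk analysis is meant to go beyond, the standard deviation of returns and the maximum drawdown, can be computed as in this minimal sketch. The function names are illustrative, and `equity` is assumed to be a list of portfolio values over time.

```python
from statistics import stdev


def period_returns(equity):
    """Fractional return from each value to the next."""
    return [later / earlier - 1 for earlier, later in zip(equity, equity[1:])]


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                      # highest value seen so far
        worst = max(worst, (peak - value) / peak)    # deepest fall from that peak
    return worst


def return_volatility(equity):
    """Sample standard deviation of period returns."""
    return stdev(period_returns(equity))
```

These numbers describe only what happened in the tested history; risks such as an outdated valuation measure do not appear in them.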


Consistency and Recent Performance

Given the same backtested 10-year performance, I would much prefer a system that provides consistent year-to-year results to a system which obtained the majority of its gains in one or two years. Consistent returns provide an indication that the system, over many trades, is taking advantage of a particular edge. I use the word edge much in the same way a casino has an edge at roulette, meaning that over a large number of trades, a system with an edge will make money. I don’t mean to imply that one or two bad years out of ten is a sign of a bad system – in fact all intermediate-term systems have their bad years.

But a consistent system is more likely to be taking advantage of some edge than an erratic system. And a consistent system is less likely to be relying on occasional good luck. That makes a consistent system more trustworthy. To discount good luck, I will often throw out the best year when I’m evaluating system characteristics.
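The effect of throwing out the best year can be illustrated with a short sketch. The function names and the sample return series in the usage note are made up for illustration; yearly returns are expressed as fractions.

```python
from math import prod


def annualized(yearly_returns):
    """Compound annual return implied by a list of fractional yearly returns."""
    growth = prod(1 + r for r in yearly_returns)
    return growth ** (1 / len(yearly_returns)) - 1


def drop_best(yearly_returns):
    """Copy of the returns with the single best year removed."""
    trimmed = list(yearly_returns)
    trimmed.remove(max(trimmed))
    return trimmed
```

A hypothetical steady system earning 10 percent every year for ten years still shows 10 percent annualized after `drop_best`, while a hypothetical erratic system that was flat for nine years and doubled in one falls from roughly 7 percent annualized to zero.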

I also want to verify that the system has been working in recent markets. For intermediate-term systems, I will often test the performance over the past 10 years and over many markets. I will also test the system as far back as possible, though I use older hypothetical results with caution, because the markets are always changing.


Can the system edge be described in simple and logical terms?

The previous four criteria are designed to provide as much practical confidence in the system as possible. However, a last criterion to be met is that the system edge be describable in simple and logical terms. If a system depends on the phase of the moon, last year’s Super Bowl winner, or on the exponential moving average of the fast Fourier transform of the Fibonacci oscillator, then we have cause to reject the system. We need to solidly understand the basis for the system’s success.

In the past, I have rejected many systems which had good backtested performance, consistency, and small drawdowns because they failed to meet one of these five tests for robustness. These concepts are not new to most professional traders. But these criteria provide the most straightforward and prudent approach to assessing the likelihood that a system will meet its objectives in future markets.
 
 
 
 
