    • October 25, 2016

      Cross Validation, or How to Minimize Cheating

      We have learned to be skeptical of simulated returns. A marketing guy visits to pitch an investment strategy. You ask a few questions and find that the strategy was researched and tested on the same data. You are not surprised when actual results are not even close to the simulated results. It turns out that the research staff tortured the data to get the best fit. In short, they cheated.

      Our analysts use cross validation techniques to avoid cheating. Let’s consider an example. We have a data set with 300 observations. We divide it into three parts (1, 2 and 3), each with 100 observations. We train a model on parts 1 and 2, and test the model on part 3. Then we train on parts 1 and 3 and test on part 2. Finally, we train on parts 2 and 3 and test on part 1. Every observation is used for testing exactly once, and never by a model that was trained on it. This is an example of k-fold cross validation with k=3.
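
The three train/test rounds described above can be sketched in a few lines of Python. This is only an illustration of how the observation indices are divided, not our production code: the function name `k_fold_splits` is made up for this post, and it assumes the number of observations divides evenly by k.

```python
def k_fold_splits(n_obs, k):
    """Yield (train_indices, test_indices) for k contiguous, equal-sized folds.

    Hypothetical helper: assumes n_obs is divisible by k, as in the
    300-observation example.
    """
    fold_size = n_obs // k
    folds = [list(range(i * fold_size, (i + 1) * fold_size)) for i in range(k)]
    # Test on the last part first (part 3, then 2, then 1), as in the text.
    for test_fold in reversed(range(k)):
        train = [i for f, fold in enumerate(folds) if f != test_fold for i in fold]
        yield train, folds[test_fold]


splits = list(k_fold_splits(300, 3))
for train, test in splits:
    # Indices are 0-based internally; add 1 to report observation numbers.
    print(f"train on {len(train)} obs, test on obs {test[0] + 1}-{test[-1] + 1}")
```

Each round trains on 200 observations and tests on the remaining 100, and no observation appears on both sides of the same round.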

      Another example of cross validation is called “leave one out” cross validation. In this case, a model is trained on all but one observation and tested on the single held-out observation. On a data set with 100 observations, this is done 100 times, once for each observation. Leave one out is just k-fold cross validation where k equals the number of observations in the data set.
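
Here is a minimal sketch of leave one out in Python. The "model" is a deliberately trivial placeholder that predicts the average of the training observations, and the data are made up; both are assumptions for illustration only. The point is the loop structure: one fit and one test per observation.

```python
def leave_one_out_errors(values):
    """Return the squared prediction error for each held-out observation.

    Placeholder model (an assumption for this sketch): predict the mean
    of the training observations.
    """
    errors = []
    for i, held_out in enumerate(values):
        train = values[:i] + values[i + 1:]   # everything except observation i
        prediction = sum(train) / len(train)  # "train" the placeholder model
        errors.append((held_out - prediction) ** 2)
    return errors


data = [float(x % 7) for x in range(100)]  # made-up data, 100 observations
errors = leave_one_out_errors(data)
print(len(errors), "models trained and tested")
```

With 100 observations the loop runs 100 times, which is why leave one out can become expensive for large data sets or slow models.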

      One wrinkle we have to consider is that we are generally working with time series data. The observations are ordered and earlier data is correlated with later data. One way to deal with time series data is to divide a data set into two parts. Train a model on the first 70% of the data and test it on the last 30% of the data. To compare multiple models, one can divide the data into three parts. Train each model on the first 60% of the data. Test each model on the next 20% of the data. Choose the best model and then test it on the last 20% of the data, which played no part in either training or model selection.
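
The 60/20/20 division can be sketched as follows. The function name `chronological_split` and its parameters are hypothetical, chosen for this illustration; the only essential idea is that the data are cut in time order and never shuffled.

```python
def chronological_split(n_obs, train_frac=0.6, validate_frac=0.2):
    """Split 0-based observation indices into train / validate / test ranges.

    Hypothetical helper: the ranges are contiguous and in time order, so
    the test range always comes after everything used to fit or choose
    a model.
    """
    train_end = int(round(n_obs * train_frac))
    validate_end = train_end + int(round(n_obs * validate_frac))
    return (
        range(0, train_end),
        range(train_end, validate_end),
        range(validate_end, n_obs),
    )


train, validate, test = chronological_split(300)
print(len(train), len(validate), len(test))
```

Setting `train_frac=0.7` and `validate_frac=0.0` gives the simpler two-part 70/30 split, with an empty middle range.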

      One can use k-fold cross validation for a time series data set. Consider an example with k=3. We have a daily model that looks at returns over the past five days in order to forecast the next day’s return. The data set has 300 observations that we can divide into three parts, each with 100 observations. As we discussed earlier, we train on Parts 1 and 2 and test on Part 3; then train on Parts 1 and 3 and test on Part 2; and finally train on Parts 2 and 3 and test on Part 1. Note that when time series models are involved, data is often “lost” for training. In this case we can only train the model on 95 observations out of the 100 observations of each subset, because the model needs to look back five days in order to create an estimate.
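
To see where the five "lost" observations go, here is a small sketch that builds training samples from one 100-observation part. The helper name `lagged_samples` and the return values are made up for illustration; the model itself is left out.

```python
def lagged_samples(returns, lookback=5):
    """Pair each return with the `lookback` returns that precede it.

    Hypothetical helper: the first `lookback` observations have no
    complete history, so they cannot be used as forecast targets.
    """
    samples = []
    for t in range(lookback, len(returns)):
        features = returns[t - lookback:t]  # the previous five days
        target = returns[t]                 # the day being forecast
        samples.append((features, target))
    return samples


block = [0.001 * (i % 10 - 5) for i in range(100)]  # made-up daily returns
samples = lagged_samples(block, lookback=5)
print(len(samples))  # 95 usable samples from 100 observations
```

The longer the look-back window, the more observations each part gives up this way.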

      Walk forward analyses are often used to simulate investment strategies. One might train a strategy on the first 700 observations of a data set with 1,000 observations. The strategy could then be tested on observations 701 through 800. The strategy is then trained on the first 800 observations (or perhaps observations 101 through 800) and tested on observations 801 through 900. The process is executed a final time to train through observation 900 and then test on observations 901 through 1000.
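
The walk forward schedule above can be generated mechanically. The sketch below is illustrative only: the function name `walk_forward_windows` and the `rolling` flag are assumptions made for this post. With `rolling=False` the training window grows from observation 1; with `rolling=True` it keeps a fixed length and slides forward, which corresponds to the "observations 101 through 800" variant.

```python
def walk_forward_windows(n_obs, initial_train, test_size, rolling=False):
    """Yield ((train_start, train_end), (test_start, test_end)) pairs.

    Hypothetical helper: observation numbers are 1-based and inclusive,
    matching the numbering used in the text.
    """
    train_end = initial_train
    while train_end + test_size <= n_obs:
        train_start = train_end - initial_train + 1 if rolling else 1
        yield (train_start, train_end), (train_end + 1, train_end + test_size)
        train_end += test_size  # step forward by one test period


for train, test in walk_forward_windows(1000, 700, 100):
    print(f"train on {train[0]}-{train[1]}, test on {test[0]}-{test[1]}")
```

In every step the test window lies strictly after the training window, so the strategy is never evaluated on data it has already seen.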

      Using cross validation will not guarantee stellar real-time investment returns. But it will reduce the probability of finding spurious relationships or relationships that are unlikely to hold up over time. So the next time you see an investment strategy presentation with simulated returns, ask a few questions to see if there was any cheating involved in the research. You could save yourself a lot of money.

       

      For a technical discussion of cross validation techniques, readers can consult:

       

      S. Arlot and A. Celisse (2010). A Survey of Cross-Validation Procedures for Model Selection

      http://projecteuclid.org/euclid.ssu/1268143839

       

      Here is a link to an article about cross validation and time series analysis:

       

      C. Bergmeir, R. Hyndman and B. Koo (2015). A Note on the Validity of Cross-Validation for Evaluating Time Series Prediction

      http://robjhyndman.com/papers/cv-wp.pdf

      For less technical discussions, try searching for the terms “cross validation techniques” or “cross validation for time series analysis” in an Internet search engine like Google or Yahoo!

       

      ©2016 Hull Tactical Asset Allocation, LLC (“HTAA”) is a Registered Investment Adviser. The information set forth in HTAA’s market commentaries and writings are of a general nature and are provided solely for the use of HTAA, its clients and prospective clients. This information does not constitute investment advice, which can be provided only after the delivery of HTAA’s Form ADV and once a properly executed investment advisory agreement has been entered into by the client and HTAA. These materials reflect the opinion of HTAA on the date of production and are subject to change at any time without notice. Due to various factors, including changing market conditions or tax laws, the content may no longer be reflective of current opinions or positions. Past performance does not guarantee future results. All investments are subject to risks.

