Novy-Marx has a really interesting piece discussing “data-mining”:
Ferson, Sarkissian and Simin (2003) warn that persistence in expected returns generates spurious regression bias in predictive regressions of stock returns, even though stock returns are themselves only weakly autocorrelated. Despite this fact a growing literature attempts to explain the performance of stock market anomalies with highly persistent investor sentiment. The data suggest, however, that the potential misspecification bias may be large. Predictive regressions of real returns on simulated regressors are too likely to reject the null of independence, and it is far too easy to find real variables that have “significant power” predicting returns. Standard OLS predictive regressions find that the party of the U.S. President, cold weather in Manhattan, global warming, the El Nino phenomenon, atmospheric pressure in the Arctic, the conjunctions of the planets, and sunspots, all have “significant power” predicting the performance of anomalies. These issues appear particularly acute for anomalies prominent in the sentiment literature, including those formed on the basis of size, distress, asset growth, investment, profitability, and idiosyncratic volatility.
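To get a feel for the spurious regression problem the abstract describes, here is a rough simulation sketch in Python. It is not the paper's procedure. Everything in it is an assumption picked for illustration: returns are modeled as a persistent AR(1) expected-return component plus noise, the "predictor" is an unrelated AR(1) series, and the persistence (0.98), the share of return variance that is predictable (10%), the sample length (600 months) and the function names are all made up. The point is only to see how often a textbook OLS t-test "finds" predictability that is not there.

```python
import numpy as np

def ar1(rng, n, rho):
    """Simulate a stationary AR(1) series with unit variance."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    shocks = rng.standard_normal(n) * np.sqrt(1.0 - rho**2)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + shocks[t]
    return x

def slope_tstat(y, x):
    """Conventional OLS t-statistic on b in y_t = a + b * x_t + e_t."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    sxx = x_c @ x_c
    b = (x_c @ y_c) / sxx
    resid = y_c - b * x_c
    se = np.sqrt((resid @ resid) / (len(y) - 2) / sxx)
    return b / se

def rejection_rate(share, rho=0.98, n_obs=600, n_sims=500, seed=0):
    """Fraction of simulations where |t| > 1.96 although the regressor
    is independent of returns by construction.

    share: assumed fraction of return variance coming from the
           persistent expected-return component (0 gives i.i.d. returns).
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        expected = np.sqrt(share) * ar1(rng, n_obs + 1, rho)
        noise = np.sqrt(1.0 - share) * rng.standard_normal(n_obs + 1)
        returns = expected + noise
        regressor = ar1(rng, n_obs + 1, rho)  # unrelated to returns
        # Predictive regression: next-period return on today's regressor.
        t = slope_tstat(returns[1:], regressor[:-1])
        rejections += abs(t) > 1.96
    return rejections / n_sims

print("persistent expected returns:", rejection_rate(share=0.10))
print("i.i.d. returns (control):   ", rejection_rate(share=0.0))
```

If the test were well behaved, both numbers would sit near the nominal 5%. In this setup the control should land close to that, while the persistent case should reject far more often, which is the flavor of the problem with persistent sentiment-style predictors.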
Some of the more intriguing time series that apparently predict when anomalies will perform the best (I highlighted the titles for effect):
Anyway, in the end Fama is probably going to end up being right and all of us will go down in flames trying to trade crazy quantitative strategies or value strategies that are supposed to make us 15-20% a year. But at least we’ll have fun along the way!