It’s been a tough year for the U Penn prof.
First, the market has been cut nearly in half, something many buy & hold investors admit is possible but is supposed to be a rare occurrence. But as we have seen in 2008, in 2001-03, and in 1972-73, these sorts of brutal bear markets are a lot more common than the theorists like to admit. These 100-year floods seem to come along with disturbing frequency.
Now, Stocks for the Long Run, Siegel’s well-regarded book (now in its 4th edition), is under increased attack: Its assumptions have been shown to be false, its conclusion has been called into question, and now its methodology has been attacked as statistically invalid.
In a WSJ article this morning, Jason Zweig puts together a pretty compelling critique, “Does Stock-Market Data Really Go Back 200 Years?”:
“There is just one problem with tracing stock performance all the way back to 1802: It isn’t really valid.
Prof. Siegel based his early numbers on data first gathered decades ago by two economists, Walter Buckingham Smith and Arthur Harrison Cole.
For the years 1802 through 1820, Profs. Smith and Cole collected prices on three dozen banking, insurance, transportation and other stocks — but ended up including only seven, all banks, in their stock-market index. Through 1845, they tracked 19 insurance stocks, but rejected 95% of them, adding only one to their index. For 1834 onward, they added a maximum of 27 railroad stocks.
To be a good measure of stock returns, an index should be comprehensive (by including many stocks) and representative (by including the stocks commonly held by investors). The Smith and Cole indexes are neither, as the professors signaled in their 1935 book, “Fluctuations in American Business.” They cherry-picked their indexes by throwing out any stock that didn’t survive for the whole period, whose share prices were too hard to find or whose returns seemed “inflexible,” “erratic,” or “non-typical.””
Thus, the data underlying Siegel’s Stocks for the Long Run excludes 97% of all the stocks in the early history of the US market, the result of cherry-picking winners, ignoring survivorship bias, and engaging in data smoothing.
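To see why throwing out stocks that didn’t survive flatters an index, here is a minimal sketch. The ten stocks and their returns are entirely made up for illustration; none of them come from Smith and Cole, Siegel, or Zweig. All it shows is the mechanism: drop the failures and the average of what remains looks far better than what an investor holding everything would have earned.

```python
# Hypothetical universe: total return over some period for ten made-up stocks.
# A return of -1.0 means the company failed and shareholders lost everything.
# These numbers are invented for illustration only.
period_returns = {
    "Bank A": 0.80,
    "Bank B": 0.55,
    "Bank C": 1.20,
    "Insurer D": 0.30,
    "Insurer E": -1.00,
    "Canal F": -1.00,
    "Turnpike G": -0.60,
    "Railroad H": 0.10,
    "Railroad I": -0.20,
    "Railroad J": -1.00,
}


def average(values):
    """Simple (equal-weighted) average of a list of returns."""
    return sum(values) / len(values)


all_stocks = list(period_returns.values())

# "Survivors only" index: discard every stock that did not last the period.
survivors = [r for r in all_stocks if r > -1.0]

print(f"Stocks in full universe:   {len(all_stocks)}")
print(f"Stocks in survivor index:  {len(survivors)}")
print(f"Average return, all:       {average(all_stocks):+.1%}")
print(f"Average return, survivors: {average(survivors):+.1%}")
```

The gap between the two averages is the survivorship bias. How big it was in the actual early-1800s data depends on the real failure rate, which this sketch does not attempt to estimate.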
What did this do to the results? As you would imagine, it juiced them significantly. The era of 1802-1870 ended up with a much bigger dividend yield than it should have had. Siegel originally started at 5.0%, but over ensuing editions that crept up to 6.4%. The net impact was to raise the average annual real returns during the first half of the 19th century from 5.7% to 7.0%.
Because returns compound, artificially raising the returns in the early part of the data series lifts the average annual return for the entire series, and inflates the ending wealth figure even more.
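A rough back-of-the-envelope sketch of that compounding effect, using the 5.7% and 7.0% figures above. Everything else here is an assumption for illustration: the early period is taken as 50 years, the rest of the record as 150 years, and the 6.5% later-period real return is a placeholder, not a number from Siegel or Zweig.

```python
def growth(rate, years):
    """Real value of $1 compounded at `rate` per year for `years` years."""
    return (1.0 + rate) ** years


def blended_annual_return(segments):
    """Geometric average annual return across (rate, years) segments."""
    total_years = sum(years for _, years in segments)
    total_growth = 1.0
    for rate, years in segments:
        total_growth *= growth(rate, years)
    return total_growth ** (1.0 / total_years) - 1.0


EARLY_YEARS = 50    # assumption: "first half of the 19th century"
LATER_YEARS = 150   # assumption: remainder of a roughly 200-year record
LATER_RATE = 0.065  # hypothetical placeholder for the later-period real return

# Ending value of $1 over the early period at each of the two real returns.
low_early = growth(0.057, EARLY_YEARS)
high_early = growth(0.070, EARLY_YEARS)

# Long-run average when only the early-period return differs.
low_blend = blended_annual_return([(0.057, EARLY_YEARS), (LATER_RATE, LATER_YEARS)])
high_blend = blended_annual_return([(0.070, EARLY_YEARS), (LATER_RATE, LATER_YEARS)])

print(f"$1 after {EARLY_YEARS} years at 5.7%: ${low_early:,.2f}")
print(f"$1 after {EARLY_YEARS} years at 7.0%: ${high_early:,.2f}")
print(f"Ending-wealth ratio: {high_early / low_early:.2f}x")
print(f"Long-run average with 5.7% early: {low_blend:.2%}")
print(f"Long-run average with 7.0% early: {high_blend:.2%}")
```

Under these assumptions the higher early figure leaves roughly 80% more wealth at the end of the early period, and that head start carries through every later year. It also adds about three-tenths of a percentage point to the long-run average annual return, even though nothing after the early period changed.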
As Zweig sardonically notes, “Another emperor of the late bull market, it seems, has turned out to have no clothes.”
Does Stock-Market Data Really Go Back 200 Years?
WSJ, July 11, 2009