Pundit tracker is an interesting new site that proclaims its mission as bringing accountability to the prediction industry.
The site notes the media’s lack of institutional memory. This creates perverse incentives: pundits learn that brash predictions generate news coverage. If it turns out that they are wrong, well, it hardly matters, as no one ever remembers or calls them out on it. On those occasions when the blind squirrel finds a nut, they can selectively tout that correct call for self-promotional purposes. The entire cycle then repeats.
I am especially keen on these Pundit excuses:
Too early: “I was simply too early; just wait and see, that stock market crash is still coming.” (see: Broken Clock Pundits)
Black swan: “Sure, our credit rating models failed, but who could have predicted that housing prices would fall across the country at the same time?”
Close enough: “Hey, I said the stock market would go up more than 10% and it went up 8%. I was basically right.”
Self-negated: “It was our own beliefs and actions that spared the world from catastrophe.” (see: When Prophecy Fails)
Hedged: “I only said that it could happen.” (see: The 40% Rule) Note: when pundits are correct, they strangely fail to mention the hedge.
Pundit tracker wants to create a permanent catalogue / track record for the punditocracy’s predictions.
It is an interesting site that has the potential to correct some abuses. It will really have an impact once the media starts to use it in questioning or even booking their guests.
More from the site after the jump…
This is from the site’s “about” page:
The absence of media memory creates significant moral hazard in the world of punditry. Nuance and restraint do not play well on soundbite- and ratings-driven media, so shelf space is granted to those who espouse more extreme views. Ideally, these pundits would gain or lose credibility based on the outcomes of their calls. The 24-hour news cycle, however, means that the media is always latching on to the new flavor of the day. Rare is the postmortem to evaluate prior stories.
Pundits are highly incentivized to adhere to the following playbook:
- make a brash prediction
- if wrong, don’t worry… no one will remember
- if right, selectively tout for self-promotion
- repeat cycle
By cataloging and scoring the predictions of pundits, we hope to bring some balance to the equation. Pundits who demonstrate a track record of making accurate, out-of-consensus calls will appropriately receive their due. Meanwhile, those who are bombastic solely to garner media attention will be exposed. We are initially tracking three types of pundits: Financial, Political, and Sports.
While we will try our best to catalog as many pundit calls as possible, we ultimately need your help to make this site a success. If you come across any new predictions, please submit them here.
How does your scoring system work?
The traditional method to score pundits employs what’s called a “hit rate” or “batting average” approach: take the number of correct calls and divide it by the number of total calls. Make ten calls and get seven right, and the hit rate is 70%. The problem is that this figure is useless without context. The daily prediction “the sun will rise tomorrow” would (hopefully) yield a perfect hit rate, after all.
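To make the weakness of the hit rate concrete, here is a minimal Python sketch. The function name `hit_rate` and the sample data are my own illustration, not anything published by the site.

```python
def hit_rate(outcomes):
    """Return the fraction of correct calls.

    `outcomes` is a list of booleans, one per graded call
    (True = the prediction came true).
    """
    if not outcomes:
        raise ValueError("need at least one graded call")
    return sum(outcomes) / len(outcomes)


# Ten calls, seven right: the example from the text.
seven_of_ten = [True] * 7 + [False] * 3

# A year of "the sun will rise tomorrow" calls (illustrative data).
sunrise_calls = [True] * 365

print(f"{hit_rate(seven_of_ten):.0%}")   # 70%
print(f"{hit_rate(sunrise_calls):.0%}")  # 100%
```

The sunrise forecaster "beats" the seven-for-ten pundit on this measure, which is exactly why the raw figure says nothing without knowing how hard the calls were.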
Our solution is to calibrate each prediction for boldness. We measure this by asking our users how likely they think a given prediction is to occur. If everyone says “unlikely,” then the call is bold, and the pundit, if correct, should receive more credit than he would for a call deemed “likely”. This moment-in-time gauge of consensus opinion underpins our scoring algorithm.
Pundits who have made at least 25 graded calls are awarded a letter grade (A through F range, C being average) based on this boldness-adjusted accuracy metric.
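The site does not publish its actual formula, so the following is only one plausible way to do a boldness adjustment, not PunditTracker's algorithm. The assumption here is that the users' consensus probability is treated like fair betting odds: a correct call made at consensus probability `p` pays `1 / p` per unit staked, and a wrong call pays nothing. The function name, the data layout, and the sample numbers are all hypothetical. The letter-grade cutoffs are not described on the site, so they are left out.

```python
def boldness_adjusted_score(calls):
    """Average payout per unit staked at consensus odds (an assumed metric).

    `calls` is a list of (consensus_probability, was_correct) tuples.
    A score near 1.0 means the pundit did about as well as the crowd
    expected; above 1.0 means the pundit beat the consensus.
    """
    if not calls:
        raise ValueError("need at least one graded call")
    total = 0.0
    for probability, was_correct in calls:
        if not 0.0 < probability <= 1.0:
            raise ValueError("consensus probability must be in (0, 1]")
        if was_correct:
            total += 1.0 / probability  # bolder calls pay more when right
    return total / len(calls)


# Cautious pundit: 40 "likely" calls (80% consensus), 32 correct.
cautious = [(0.8, True)] * 32 + [(0.8, False)] * 8

# Bold pundit: 40 "unlikely" calls (25% consensus), 20 correct.
bold = [(0.25, True)] * 20 + [(0.25, False)] * 20

print(f"{boldness_adjusted_score(cautious):.2f}")  # 1.00
print(f"{boldness_adjusted_score(bold):.2f}")      # 2.00
```

In this made-up example the cautious pundit has the higher hit rate (80% versus 50%) but only matches what the crowd already expected, while the bold pundit was right twice as often as the consensus implied. Both samples have 40 calls, above the 25-call minimum the site requires before assigning a grade.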
Are your grades predictive?
Our grades are solely based on a pundit’s historical predictions; think of them as a report card. Whether or not they speak to the accuracy of a pundit’s future calls is predicated on how much skill there is in punditry. If forecasting is purely a game of luck, for instance, then our pundit grades will ultimately revert to the mean. Put differently, our grades would be a contra-indicator: both A-grade and F-grade pundits would revert to a C grade. We do not yet have an informed view on this matter but anticipate that the data we gather over time will help answer the question. Regardless, by playing the role of public scorekeeper, we hope we can help correct any mismatches between a pundit’s reputation and track record.
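The reversion-to-the-mean point can be illustrated with a small simulation. This is a toy model under stated assumptions, not data from the site: every simulated pundit is a coin-flipper with no skill, each makes 25 calls in a "past" period and 25 in a "future" period, and the function name and parameters are hypothetical.

```python
import random


def simulate_luck_only(n_pundits=1000, n_calls=25, p_correct=0.5, seed=0):
    """Compare past and future hit rates of the luckiest 10% of pundits.

    Every pundit has the same chance `p_correct` of being right on any
    call, so differences in past records are pure luck.
    Returns (past_average, future_average) for the top decile.
    """
    rng = random.Random(seed)

    def period_hit_rate():
        hits = sum(rng.random() < p_correct for _ in range(n_calls))
        return hits / n_calls

    # One (past, future) pair of hit rates per pundit.
    records = [(period_hit_rate(), period_hit_rate()) for _ in range(n_pundits)]

    # Pick the "A students": the best 10% by past record.
    records.sort(key=lambda record: record[0], reverse=True)
    top = records[: n_pundits // 10]

    past_average = sum(record[0] for record in top) / len(top)
    future_average = sum(record[1] for record in top) / len(top)
    return past_average, future_average


past, future = simulate_luck_only()
print(f"top decile, past hit rate:   {past:.2f}")
print(f"top decile, future hit rate: {future:.2f}")
```

If forecasting were pure luck, the past stars should look impressive in the first number and land close to 50% in the second, which is the C-grade outcome the site describes.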
Why don’t you have more historical calls, particularly for Finance and Politics?
Our boldness rating reflects a moment-in-time gauge of consensus opinion for a given prediction. We believe it is very difficult to judge the boldness of a historical prediction without introducing hindsight bias into the equation. As such, we are only including calls that we evaluated (using a volunteer team) for boldness at the time that they were made. The Sports category is unique in that we can use archived sportsbook odds to provide an unbiased measure of consensus opinion at the time of the prediction.