12 April 2011

Jeopardy! ToC Odds: Now With Error Bars!

Christopher Short (cshort on the J! board) ended his run on Jeopardy! yesterday, leaving as a 6-time champion; congratulations, Chris! This means that it's time to update the Tournament of Champions standings. One improvement that I'll be putting in the table today, at the request of RCraig (a.k.a. Roger Craig, also a 6-time champion), is an error range for each estimate.

The data set I use to make my estimates is a compilation of all 3+-time winners since the 2006 ToC, covering the 2007, 2009, 2010, and (upcoming) 2011 Tournaments. As I explained in my previous post, for each champion in this year's field, I use this data set to estimate the number of players who are likely to appear before the next ToC that would be ranked more highly than the player in question.

If the data set I used to make my estimates were infinite (and if the distribution didn't change with time), I would know the probability of generating a player above a given level of accomplishment exactly, and no error bars would be needed. Right now, my data set has 99 listings, which is a long way from infinite, I'm afraid; this means that I know the relevant probabilities imperfectly. To estimate the uncertainty in a given probability, I will once again turn to the Poisson distribution. As you may remember, a convenient feature of the Poisson distribution is that its standard deviation is the square root of its mean. So, if my data set contains n entries above the accomplishment level in question, the uncertainty in that value is the square root of n. This isn't to say that I'm unsure of the actual data, since even someone like me with a math degree can count; the uncertainty here is in how well the actual data reflects the underlying distribution. (That was a terribly frequentist way of putting it; I'm sure a hardcore Bayesian would disagree with my wording.) I can then feed the values at one standard deviation's remove from the actual count into my estimation method and thus generate error bars.
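For anyone who wants to play along at home, here is a minimal Python sketch of that last step. It is not my actual estimation code: `toy_estimator` is a made-up stand-in, and its `expected_new_champions` and `open_slots` arguments are hypothetical values chosen only so the script runs, so it will not reproduce the percentages in this post. The part that matters is `with_error_bars`, which shifts the count by the square root of n in each direction and re-runs whatever estimator it is handed.

```python
import math
from typing import Callable, Tuple


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        total += term
    return total


def toy_estimator(n_above: float,
                  dataset_size: int = 99,
                  expected_new_champions: float = 20.0,  # hypothetical
                  open_slots: int = 3) -> float:         # hypothetical
    """Stand-in estimator (an assumption, not the method from the post):
    chance that fewer than `open_slots` higher-ranked players turn up."""
    lam = expected_new_champions * n_above / dataset_size
    return poisson_cdf(open_slots - 1, lam)


def with_error_bars(estimator: Callable[[float], float],
                    n_above: float,
                    n_sigma: float = 1.0) -> Tuple[float, float, float]:
    """Return (central, optimistic, pessimistic) estimates.

    The count is treated as Poisson, so its standard deviation is sqrt(n).
    Fewer higher-ranked players is the optimistic case for the champion.
    """
    sigma = math.sqrt(n_above)
    central = estimator(n_above)
    optimistic = estimator(max(n_above - n_sigma * sigma, 0.0))
    pessimistic = estimator(n_above + n_sigma * sigma)
    return central, optimistic, pessimistic


if __name__ == "__main__":
    central, best, worst = with_error_bars(toy_estimator, 48)
    print(f"estimate {central:.2%}, error range {best:.2%} to {worst:.2%}")
```

Passing the estimator in as a function keeps the error-bar logic separate from the estimate itself, so the same wrapper works no matter how the central estimate is computed.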

This might be clearer with an example: Anthony Fox (ARF!) won four times and earned $51,998. Of the 99 champions in my database, 48 are ranked more highly, and I estimate his chances at the ToC at 17.4% based on this. The square root of 48 is ~6.93, so I ran my estimation algorithm using values of 41.07 and 54.93 to give probability estimates of 27.5% and 10.4%, respectively. So, I'd report that my estimate of Anthony making the ToC is 17.4%, with an error range of 27.5% to 10.4%.

It is worth noting that these error estimates reflect one standard deviation of uncertainty; that is to say, there is a ~68% chance that the real chance of a given player lies within my reported error range. One could just as well choose to report an error range of two standard deviations (41.5% to 6.2% for the above example), which would be ~95% likely to encompass the real probability.
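The ~68% and ~95% figures are the usual normal-distribution coverage values, which is only an approximation for a Poisson count (a reasonable one when the count is not too small). A quick Python sketch, with function names of my own invention, shows where those figures come from and what range of counts each choice corresponds to for a count of 48:

```python
import math
from typing import Tuple


def normal_coverage(n_sigma: float) -> float:
    """Chance that a normally distributed value falls within
    n_sigma standard deviations of its mean."""
    return math.erf(n_sigma / math.sqrt(2.0))


def count_range(n_above: int, n_sigma: float) -> Tuple[float, float]:
    """Range of counts at n_sigma Poisson standard deviations (sqrt(n))
    on either side of the observed count, floored at zero."""
    sigma = math.sqrt(n_above)
    return max(n_above - n_sigma * sigma, 0.0), n_above + n_sigma * sigma


if __name__ == "__main__":
    for k in (1, 2):
        low, high = count_range(48, k)
        print(f"{k} sigma: counts {low:.2f} to {high:.2f}, "
              f"coverage ~{normal_coverage(k):.1%}")
```

Each end of the count range is then fed through the estimation method to get the corresponding end of the error range.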

So, with those formalities out of the way, on to the table:

Competitor            # of Wins   Money Won   ToC Chance   Error Range
Tom Nissley           8 Wins      $235,405    100.0%       100.0%-100.0%
Roger Craig           6 Wins      $230,200    100.0%       100.0%-100.0%
Christopher Short     6 Wins      $94,752     100.0%       100.0%-100.0%
Tom Kunzen            5 Wins      $133,402    100.0%       100.0%-100.0%
Paul Kursky           5 Wins      $109,411    99.9%        100.0%-99.1%
Kara Spak             5 Wins      $83,401     98.8%        99.7%-96.9%
John Krizel           4 Wins      $105,204    93.6%        97.4%-87.6%
Buddy Wright          4 Wins      $88,804     71.3%        82.9%-58.7%
Paul Wampler          4 Wins      $72,001     45.5%        60.0%-32.3%
Anthony Fox           4 Wins      $51,998     17.4%        27.5%-10.6%
Megan Barnes          3 Wins      $103,203    5.2%         9.7%-2.7%
Alison Stone Roberg   3 Wins      $85,102     1.2%         2.5%-0.5%
Sarah Wilkinson       3 Wins      $72,701     0.1%         0.2%-0.0%

Based on these, the most likely field for the ToC at the moment includes all of the 5+-time winners, John Krizel, Buddy Wright, and 5 people who are yet to air or yet to play, with Paul Wampler as the alternate.

Note that the estimates for players on the bubble are extremely sensitive to uncertainty in the data used for the prediction, while the sensitivity is much lower when one moves to the extremes.

It's interesting to see that with ~110 games (out of 272) still left to play, we have already converged to having only 3 players with probabilities between 10% and 90%.

A final note about the results: As Robert K S pointed out on my last post, it's entirely possible, based on history, that a 5-time champion who somehow doesn't make this year's ToC will be invited to the next ToC. So, the probabilities for the 5-timers should be viewed as the probability that they will make the 2011 ToC.

4 comments:

  1. Can you add to your figuring that I will be nine months pregnant when they tape the TOC? I think that 5% is more like a goose egg for my chances of actually appearing, even if I qualify.

  2. Congratulations! This is excellent news, even if the timing is non-ideal with regard to the ToC.

    I do hope and anticipate that, in the event that you do qualify for this year's ToC, the producers will invite you to next year's.

  3. Oh, I don't know about that. If I was a 5-day champ I'm sure they would, but I'm not sure they'd save a spot for someone who squeaked in.

  4. I certainly hope that they wouldn't deny you a spot in a ToC; that would do poor justice to both you, personally, and to women of childbearing age in general. (Indeed, I suspect the fear of a bout of justified outrage from various feminist/mothers/parents groups might be enough to induce them to do the right thing.) It would also be kind of mean, and "mean" is just about the last adjective I would apply to any of the folks I met at J!
