The data set I use to make my estimates is a compilation of all 3+-time winners since the 2006 ToC, covering the 2007, 2009, 2010, and (upcoming) 2011 Tournaments. As I explained in my previous post, for each champion in this year's field, I use this data set to estimate the number of players who are likely to appear before the next ToC that would be ranked more highly than the player in question.
If the data set I used to make my estimates were infinite (and if the distribution didn't change with time), I would know the probability of generating a player above a given level of accomplishment exactly, and no error bars would be needed. Right now, my data set has 99 listings, which is a long way from infinite, I'm afraid; this means that I know the relevant probabilities imperfectly. To estimate the uncertainty in a given probability, I will once again turn to the Poisson distribution. As you may remember, a convenient feature of the Poisson distribution is that its standard deviation is the square root of its mean. So, if my data set contains n entries above the accomplishment level in question, the uncertainty in that count is the square root of n. This isn't to say that I'm unsure of the actual data, since even someone like me with a math degree can count; the uncertainty here is in how well the actual data reflect the underlying distribution. (That was a terribly frequentist way of putting it; I'm sure a hardcore Bayesian would disagree with my wording.) I can then feed the counts one standard deviation above and below the actual count into my estimation method and thus generate error bars.
This might be clearer with an example: Anthony Fox (ARF!) won four times and earned $51,998. Of the 99 champions in my database, 48 are ranked more highly, and I estimate his chances at the ToC at 17.4% based on this. The square root of 48 is ~6.93, so I ran my estimation algorithm using values of 41.07 and 54.93 to give probability estimates of 27.5% and 10.4%, respectively. So, I'd report that my estimate of Anthony making the ToC is 17.4%, with an error range of 27.5% to 10.4%.
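The count arithmetic in this example can be sketched in a few lines of Python. This covers only the n ± √n step, not the estimation method that turns those counts into probabilities; the function name `poisson_count_bounds` and its `k` parameter are illustrative assumptions, not part of the original analysis.

```python
import math


def poisson_count_bounds(n, k=1.0):
    """Return (low, high) = n -/+ k*sqrt(n).

    Treats the observed count n as the mean of a Poisson distribution,
    whose standard deviation is sqrt(n). k is the number of standard
    deviations (hypothetical parameter, added for illustration).
    """
    sigma = math.sqrt(n)
    return n - k * sigma, n + k * sigma


# 48 champions are ranked above Anthony Fox in the 99-entry data set.
low, high = poisson_count_bounds(48)
print(f"{low:.2f} to {high:.2f}")  # prints "41.07 to 54.93"
```

The lower count yields the more optimistic probability and the higher count the more pessimistic one, since fewer higher-ranked players means less competition for a ToC slot.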
It is worth noting that these error estimates reflect one standard deviation of uncertainty; that is to say, there is a ~68% chance that a given player's true probability lies within my reported error range. One could just as well choose to report an error range of two standard deviations (41.5% to 6.2% for the above example), which would be ~95% likely to encompass the true probability.
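The ~68% and ~95% figures are the usual coverage values for a normal distribution, which a Poisson distribution with a reasonably large mean resembles. As a quick check under that normal approximation (an assumption here, since small counts are noticeably skewed), the coverage within k standard deviations is erf(k/√2); the helper name `normal_coverage` is illustrative.

```python
import math


def normal_coverage(k):
    """Probability that a normal variable falls within k standard
    deviations of its mean. Used as an approximation to the Poisson
    case, which is an assumption of this sketch."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    print(f"{k} sigma: {normal_coverage(k):.1%}")
```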
So, with those formalities out of the way, on to the table:
|Competitor|# of Wins|Money Won|ToC Chance|Error Range|
|---|---|---|---|---|
|Tom Nissley|8 Wins|$235,405|100.0%|100.0%-100.0%|
|Roger Craig|6 Wins|$230,200|100.0%|100.0%-100.0%|
|Christopher Short|6 Wins|$94,752|100.0%|100.0%-100.0%|
|Tom Kunzen|5 Wins|$133,402|100.0%|100.0%-100.0%|
|Paul Kursky|5 Wins|$109,411|99.9%|100.0%-99.1%|
|Kara Spak|5 Wins|$83,401|98.8%|99.7%-96.9%|
|John Krizel|4 Wins|$105,204|93.6%|97.4%-87.6%|
|Buddy Wright|4 Wins|$88,804|71.3%|82.9%-58.7%|
|Paul Wampler|4 Wins|$72,001|45.5%|60.0%-32.3%|
|Anthony Fox|4 Wins|$51,998|17.4%|27.5%-10.6%|
|Megan Barnes|3 Wins|$103,203|5.2%|9.7%-2.7%|
|Alison Stone Roberg|3 Wins|$85,102|1.2%|2.5%-0.5%|
|Sarah Wilkinson|3 Wins|$72,701|0.1%|0.2%-0.0%|
Based on these, the most likely field for the ToC at the moment includes all of the 5+-time winners, John Krizel, Buddy Wright, and 5 people who are yet to air or yet to play, with Paul Wampler as the alternate.
Note that the estimates for players on the bubble are extremely sensitive to uncertainty in the data used for the prediction, while the sensitivity is much lower when one moves to the extremes.
It's interesting to see that with ~110 games (out of 272) still left to play, we have already converged to having only 3 players with probabilities between 10% and 90%.
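Both observations can be checked directly against the table. The sketch below copies the ToC chance and error-range endpoints from the table as given (the data layout and variable names are illustrative assumptions), then lists the players between 10% and 90% and compares the widths of their error ranges with everyone else's.

```python
# (name, ToC chance %, upper end of range %, lower end of range %),
# copied from the table above.
rows = [
    ("Tom Nissley", 100.0, 100.0, 100.0),
    ("Roger Craig", 100.0, 100.0, 100.0),
    ("Christopher Short", 100.0, 100.0, 100.0),
    ("Tom Kunzen", 100.0, 100.0, 100.0),
    ("Paul Kursky", 99.9, 100.0, 99.1),
    ("Kara Spak", 98.8, 99.7, 96.9),
    ("John Krizel", 93.6, 97.4, 87.6),
    ("Buddy Wright", 71.3, 82.9, 58.7),
    ("Paul Wampler", 45.5, 60.0, 32.3),
    ("Anthony Fox", 17.4, 27.5, 10.6),
    ("Megan Barnes", 5.2, 9.7, 2.7),
    ("Alison Stone Roberg", 1.2, 2.5, 0.5),
    ("Sarah Wilkinson", 0.1, 0.2, 0.0),
]

# Players "on the bubble": chance strictly between 10% and 90%.
bubble = [r for r in rows if 10.0 < r[1] < 90.0]
others = [r for r in rows if not (10.0 < r[1] < 90.0)]

print([name for name, *_ in bubble])

# Width of each error range, in percentage points.
bubble_widths = [hi - lo for _, _, hi, lo in bubble]
other_widths = [hi - lo for _, _, hi, lo in others]
print(f"narrowest bubble range: {min(bubble_widths):.1f} points")
print(f"widest non-bubble range: {max(other_widths):.1f} points")
```

Every bubble player's error range is wider than any range at the extremes, which is the sensitivity pattern described above.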
A final note about the results: As Robert K S pointed out on my last post, it's entirely possible, based on history, that a 5-time champion who somehow doesn't make this year's ToC will be invited to the next ToC. So, the probabilities for the 5-timers should be read specifically as the probability that they will make the 2011 ToC, not the probability that they will ever play in a ToC.