Friday, March 9, 2018

My Top 5000 brackets for the 2018 NCAA Tournament


image source: ballislife.com



Seeking Glory in Predicting a Perfect NCAA Tournament Bracket

With the NCAA tournament around the corner and most people filling out a bracket or two, I thought I'd programmatically fill out my top 5000 brackets to see if filling out thousands of brackets gives me a leg up on the competition.  Once the field is announced (Sunday 3-11-18) I will post all my 'smart brackets' online in case you want to pick from them when filling out your own bracket this year.  The brackets are up! Scroll down to the bottom of the page to download them.



First, perfection is not likely

A perfect bracket is not realistic.  It would require picking all 63 games correctly.  The longest perfect streak occurred in 2017 when someone predicted 39 straight games correctly.


Why is it so hard? 

In theory, there are 2^63 possible brackets you could fill out.  That is 9.2 quintillion (9.2 e18) possible combinations.  To put that into perspective, there are 3 trillion trees on earth.  Imagine you could clone the earth.  You would have to clone the earth 3 million times and only then would you have a tree for every possible bracket.
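The arithmetic behind that comparison is easy to check. This is just a quick sketch using the two figures quoted above (63 games, 3 trillion trees); the variable names are my own.

```python
# Each of the 63 games has two possible outcomes, so there are 2^63 brackets.
total_brackets = 2 ** 63
print(f"{total_brackets:,}")  # 9,223,372,036,854,775,808

# The 3 trillion trees figure quoted in the text.
trees_on_earth = 3_000_000_000_000

# How many Earths' worth of trees it takes to get one tree per bracket.
earths_needed = total_brackets / trees_on_earth
print(f"{earths_needed:,.0f}")  # a little over 3 million
```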



Are the odds really that low?

Thankfully, the odds of a perfect bracket are better than 1 in 9.2 quintillion.  That figure would hold only if each team were equally likely to win each game, and that is not the case.  For instance, a #16 seed has never won a game.  The #15 seed doesn't fare much better, winning only 6.1% of the time. Taking these historical win percentages by seed into account improves the odds, but never to the point where they are great.


So why try?

Curiosity:  I have always been interested in the odds associated with the tournament.  I found it interesting that people could spend so much time trying to "beat the game" when the odds were so ridiculously stacked against them.  To be clear, I used to be one of those people.

Prize money:  While ESPN and CBS each offer $10k in prizes, Warren Buffett's Berkshire Hathaway will award $100,000 to the best bracket entered by one of its employees.  In addition, if you are the winner and your bracket has a perfect sweet 16, you win $1 million a year for life.  I do not work for Berkshire, but I could imagine some group getting together to "pool their bets", whereby each person enters a unique bracket and, if one of them wins, they all split the prize.

In reality though, the payoff from pooling isn't spectacular.  Suppose all of the estimated 100,000 people who entered a bracket last year did so again this year, they all entered unique brackets, and one of them won with a perfect sweet 16 along the way, earning $1 million a year for life.  $1 million split 100,000 ways is a cool ten bucks a year for each person.

At the very least though, one of Berkshire's 377,291 employees could use one of my 5,000 statistics-based brackets and give themselves a better chance of winning the $100k prize than they would have by either throwing darts or succumbing to their own biases.

To help you:  You could probably write an entirely different article about all the biases we have in making predictions.  For instance, pick the winner of Gonzaga vs IUPUI.  Did you pick Gonzaga? You probably did, and it's because of the relative familiarity of the name.  Another example of bias: how many people pick their alma mater to go further in the tournament than the statistics suggest?  "Yeah, Texas isn't that good this year but I'll go ahead and put them down for the final four anyway." I imagine a disproportionately high number.


How will the prediction work?

I found statistics, going back to 1985, showing the likely winner when any two given seeds match up against one another. They tell us, for instance, that a #2 seed beats a #6 seed 72.2% of the time, or that a #12 seed beats a #5 seed 32.9% of the time. They do not, however, include matchups that have not yet occurred, such as a #12 seed playing a #11 seed.  Nonetheless, I transformed this data into a 16x16 matrix and now have a simple way of looking up the odds for the simulation.
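To make the matrix idea concrete, here is a minimal sketch of how such a lookup could be built. It is not my actual code: only the three percentages quoted in this post are filled in, the names (`known`, `build_matrix`, `win_prob`) are made up for illustration, and the coin-flip fallback for matchups with no history is purely an assumption.

```python
import math

N_SEEDS = 16

# (row seed, column seed) -> how often the row seed has beaten the column seed.
# Only the three matchups quoted in this post are included here.
known = {
    (2, 6): 0.722,
    (12, 5): 0.329,
    (1, 4): 0.699,
}

def build_matrix(known):
    """Build a 16x16 table; NaN marks matchups with no recorded history."""
    m = [[math.nan] * N_SEEDS for _ in range(N_SEEDS)]
    for (row, col), p in known.items():
        m[row - 1][col - 1] = p
        m[col - 1][row - 1] = 1.0 - p  # the mirror cell is the complement
    return m

def win_prob(m, seed_a, seed_b):
    """Probability that seed_a beats seed_b according to the table."""
    p = m[seed_a - 1][seed_b - 1]
    if math.isnan(p):
        # Assumption: with no history (e.g. a #12 vs a #11), call it a coin flip.
        return 0.5
    return p

matrix = build_matrix(known)
print(win_prob(matrix, 12, 5))   # 0.329
print(win_prob(matrix, 12, 11))  # 0.5 (no history, fallback)
```

Storing the complement in the mirror cell means a single lookup works no matter which team is listed first.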


 Historical Win Percentage For Two Seeded Teams in the NCAA Tournament
(the winning percent applies to the team in the "row", e.g. row 1, column 4 means a 1 seed beats a 4 seed 69.9% of the time)



Predicting 5000 brackets

Once the bracket is announced, I will input all 64 teams competing in the tournament into my program.  Because the bracket is symmetric, the code can work out the correct seed for each team from the order in which the teams are entered.  I then load in the historical stats and the simulation is ready to run.
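Here is a rough sketch of how seeds can be recovered from input order alone. It assumes the teams are typed in top to bottom, one 16-team region at a time, in the standard first-round layout (1 vs 16, 8 vs 9, 5 vs 12, and so on). The function name and the placeholder team names are hypothetical, not taken from my program.

```python
# Seeds as they appear top to bottom within one region of a standard bracket.
REGION_SEED_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]

def assign_seeds(teams):
    """Pair each team with its seed, based only on its position in the list."""
    if len(teams) != 64:
        raise ValueError("expected 64 teams, got %d" % len(teams))
    # The same 16-seed pattern repeats for each of the four regions.
    return [(team, REGION_SEED_ORDER[i % 16]) for i, team in enumerate(teams)]

# Placeholder names; the real input would be the announced field.
teams = ["Team %d" % (i + 1) for i in range(64)]
seeded = assign_seeds(teams)
print(seeded[:4])
# [('Team 1', 1), ('Team 2', 16), ('Team 3', 8), ('Team 4', 9)]
```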

To give you an idea of just how hard it is to predict a perfect bracket, I ran 1,000,000 simulations on the already completed 2017 tournament.  Only 4 of the simulated brackets were perfect through the round of 32, and none were perfect through the sweet 16. Out of 1,000,000 simulated brackets using the historical odds, the best one missed 10 games, predicting 53 of 63 correctly. This is not bad considering the task at hand, but it is a far cry from the glory of a perfect bracket.
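The simulation itself boils down to a weighted coin flip for each game, round after round, and then counting matches against the real results. Below is a self-contained sketch of that idea. The `toy_win_prob` function uses a made-up 70/30 split in favor of the better seed purely as a stand-in for the historical matrix, and the team names are placeholders.

```python
import random

REGION_SEED_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]

def toy_win_prob(seed_a, seed_b):
    """Stand-in odds (NOT the historical numbers): better seed wins 70%."""
    if seed_a == seed_b:
        return 0.5
    return 0.7 if seed_a < seed_b else 0.3

def simulate_bracket(seeded_teams, win_prob, rng):
    """Play all 63 games; return the winners' names, round by round."""
    winners = []
    current = list(seeded_teams)  # (name, seed) pairs in bracket order
    while len(current) > 1:
        next_round = []
        for i in range(0, len(current), 2):
            a, b = current[i], current[i + 1]
            # Weighted coin flip: team a advances with probability p_a.
            p_a = win_prob(a[1], b[1])
            next_round.append(a if rng.random() < p_a else b)
        winners.extend(name for name, _ in next_round)
        current = next_round
    return winners

def count_correct(predicted, actual):
    """Number of games where the predicted winner matches the real winner."""
    return sum(p == a for p, a in zip(predicted, actual))

# Placeholder field of 64 teams, seeded by position.
field = [("Team %d" % (i + 1), REGION_SEED_ORDER[i % 16]) for i in range(64)]
bracket = simulate_bracket(field, toy_win_prob, random.Random(2018))
print(len(bracket))   # 63
print(bracket[-1])    # the simulated champion
```

Running `simulate_bracket` in a loop and keeping the highest `count_correct` score against a finished tournament is one way to reproduce the kind of experiment described above.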




A simulation of 1,000,000 trials showing 4 perfect round of 32 brackets (none perfect through the sweet 16) as well as the best bracket: 53 of 63 games correct





###########################################

How to download and use the brackets

Step 1: Click here, or the link below, and a download box should open


5,000 Brackets 

 

Step 2: To use the bracket file, open it in Notepad (or TextEdit on a Mac).  Turn word wrap off if it is not off already.  What you should see is below:



Step 3: There is a complete bracket on each line of the file, so each line contains the predicted winner of all 63 games of the tournament.  You simply pick a line (1-5000) and enter the names listed as your winners for each game.  You'll enter the winners round by round: the first 32 teams on each line correspond to the first 32 games (round 1), the next 16 teams correspond to round 2, and so on.  Start at the top left, picking Virginia.  Go down the left side of the bracket, then go down the right side of the bracket.

 Tips
  • You can use Ctrl-F in Notepad and enter a number from 1-5000 and it will jump to that bracket. 
  • If you want a bracket to have a particular tournament champ, you can find the right-most team on each line (that is the predicted champ). 
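If you would rather slice up a line with a few lines of code than count names by hand, the round layout from Step 3 can be sketched like this. It assumes you have already split one line of the file into a list of its 63 team names (how to do that depends on the file's separator, which I am not assuming here). The function name and the placeholder picks are hypothetical.

```python
# Games per round: round 1, round 2, sweet 16, elite 8, final four, final.
ROUND_SIZES = [32, 16, 8, 4, 2, 1]

def split_into_rounds(picks):
    """Split a list of 63 winners into six lists, one per round."""
    if len(picks) != sum(ROUND_SIZES):
        raise ValueError("expected 63 picks, got %d" % len(picks))
    rounds, start = [], 0
    for size in ROUND_SIZES:
        rounds.append(picks[start:start + size])
        start += size
    return rounds

# Placeholder picks standing in for one line of the file.
picks = ["Pick %d" % (i + 1) for i in range(63)]
rounds = split_into_rounds(picks)
print([len(r) for r in rounds])  # [32, 16, 8, 4, 2, 1]
print(rounds[-1][0])             # Pick 63 (the right-most name is the champ)
```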


Questions? Let me know in the comments.

Good luck! 

 

 
