Recently, I played the game Yahtzee® for the first time in ages. I had forgotten how much I enjoy this game; it is a nice blend of strategy and chance. The highest-scoring play in the game is a ‘Yahtzee’, *i.e.*, rolling five of a kind. With four of us playing, not one of us scored a Yahtzee during the game… except for a single instance, when the roller had already placed a ‘0’ in the Yahtzee column. This made me wonder how rare scoring a Yahtzee really is.

Calculating the odds of rolling a Yahtzee with five six-sided dice on a single roll is fairly straightforward. The first die can show anything, and each of the other four must match it:

(1/1)*(1/6)*(1/6)*(1/6)*(1/6) = 1/1296 ≈ 0.077%
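As a quick sanity check on this figure, all 6^5 = 7,776 possible outcomes of a single roll can be enumerated by brute force. The sketch below is in Python (not the R used for the simulator), and the function name is purely illustrative.

```python
from fractions import Fraction
from itertools import product

def single_roll_yahtzee_probability(num_dice=5, sides=6):
    """Exact chance that every die shows the same face on one roll."""
    rolls = list(product(range(1, sides + 1), repeat=num_dice))
    # A roll is a Yahtzee when it contains exactly one distinct face.
    hits = sum(1 for roll in rolls if len(set(roll)) == 1)
    return Fraction(hits, len(rolls))

p = single_roll_yahtzee_probability()
print(p, f"{float(p):.4%}")  # 1/1296 0.0772%
```

Only 6 of the 7,776 outcomes are five of a kind, which reduces to the same 1/1296.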

However, the calculation gets more complicated once you consider that you are allowed to hold one or more dice and roll up to three times in an attempt to score a Yahtzee.

I thought this might be an interesting problem to tackle with some programming, so I coded a Yahtzee simulator in R, where I could quickly simulate 1,000,000 attempts to score a Yahtzee.

Once I was finished with the code, I did some research to see whether I could validate my simulation’s probabilities against any published analyses. I ended up finding quite a few articles that have tackled this already! One in particular I found to be extremely well done, and I used its results to validate the findings from my code.

I found that coding this was a good exercise because the ‘hold and roll’ rules of the game result in quite a few scenarios that must be accounted for to obtain accurate results. I quickly realized that the probabilities I was obtaining were too low because I had not considered that it may be advantageous for the roller to change which number they are trying to match from one roll to the next. For example, if I roll [1 2 3 5 6] on Roll #1, I will (arbitrarily, given no matches) hold the 6 and re-roll the other four dice. If those four dice come up [2 2 2 6] on Roll #2, the hand now holds two 6s and three 2s, so the roller should keep the three 2s and re-roll both 6s for the best chance at a Yahtzee.
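The hold-and-roll logic described here can be sketched in a few lines. This is a minimal Python illustration, not the original R code. It assumes the roller always keeps the largest group of matching dice (ties broken arbitrarily) and re-rolls the rest. The names `best_face`, `play_turn`, and `simulate` are hypothetical, and the run is kept to 100,000 turns so it finishes quickly.

```python
import random
from collections import Counter

def best_face(dice):
    """Return (face, count) for the most common face in the hand."""
    return Counter(dice).most_common(1)[0]

def play_turn(rng, num_dice=5, max_rolls=3):
    """Play one turn; return the roll number on which a Yahtzee landed, else None."""
    dice = [rng.randint(1, 6) for _ in range(num_dice)]
    for roll_number in range(1, max_rolls + 1):
        # Re-evaluated on every roll, so the target face can switch mid-turn.
        face, count = best_face(dice)
        if count == num_dice:
            return roll_number
        if roll_number < max_rolls:
            rerolled = [rng.randint(1, 6) for _ in range(num_dice - count)]
            dice = [face] * count + rerolled
    return None

def simulate(num_turns, seed=None):
    """Fraction of turns in which a Yahtzee first appeared on roll 1, 2, or 3."""
    rng = random.Random(seed)
    hits = Counter(play_turn(rng) for _ in range(num_turns))
    return {roll: hits[roll] / num_turns for roll in (1, 2, 3)}

rates = simulate(100_000, seed=1)
for roll, rate in rates.items():
    print(f"Yahtzee on roll {roll}: {rate:.4%}")
print(f"Overall: {sum(rates.values()):.4%}")
```

Because `best_face` looks at the whole hand after every roll, the [6 2 2 2 6] case switches from 6s to 2s without any special handling. Choosing the target face once and never revisiting it is the kind of shortcut that would push the estimate too low.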

After running the code, the percentage of simulations in which a Yahtzee occurred on the first roll was 0.0745%, on the second roll was 1.1668%, and on the third roll was 3.3620%. Summing these together, the **overall chance of scoring a Yahtzee was 4.6033%**. These numbers agree quite well with the calculated probabilities from the reference above (0.077%, 1.186%, 3.34%, and 4.603%, respectively).

I was also interested in determining the most likely outcome if a user were to attempt a Yahtzee, so I amended the code to track the highest number of matching dice at the conclusion of each turn. The odds from the simulations were as follows: 1 (no match): 0.0783%, 2: 25.6249%, 3: 45.2652%, 4: 24.4283%, 5 (Yahtzee): 4.6033%. Interestingly, the odds of finishing all three rolls without a single match are far lower than the odds of rolling a Yahtzee! Again, these probabilities match quite nicely with the calculations from the aforementioned reference.
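The same distribution can also be worked out exactly rather than simulated, by treating the number of matching dice as a state and enumerating every re-roll from each state. The Python sketch below is a different technique from the simulation, and not necessarily how the reference derived its numbers. It assumes the roller always holds the largest matching group, and all function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

NUM_DICE = 5
FACES = range(1, 7)

def max_matching(dice):
    """Size of the largest group of equal faces in the hand."""
    return max(Counter(dice).values())

def transition(held):
    """Exact distribution of the best match count after holding `held`
    matching dice (all showing face 1, by symmetry) and re-rolling the rest."""
    free = NUM_DICE - held
    outcomes = Counter(
        max_matching((1,) * held + rolled) for rolled in product(FACES, repeat=free)
    )
    return {count: Fraction(n, 6 ** free) for count, n in outcomes.items()}

def match_distribution(rolls=3):
    """Distribution of the best match count after `rolls` rolls."""
    dist = transition(0)  # first roll: nothing is held yet
    for _ in range(rolls - 1):
        nxt = {}
        for held, p in dist.items():
            for count, q in transition(held).items():
                nxt[count] = nxt.get(count, 0) + p * q
        dist = nxt
    return dist

dist = match_distribution()
for count in sorted(dist):
    print(f"{count} matching: {float(dist[count]):.4%}")
```

Taking the maximum over the whole hand after each re-roll builds the "switch targets" case into the transitions, so the exact values should land close to the simulated percentages above.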

A final point of observation: the ‘true’ odds of scoring a Yahtzee during a game are likely lower than this simulation suggests. In the code, I assumed that the roller was always attempting a Yahtzee and making the decisions that would give the greatest probability of scoring one. There are going to be times in a game when it is advantageous not to try for a Yahtzee (*e.g.*, attempting a small or large straight) or times when you may wish to stop your turn before the third roll (*e.g.*, after rolling a Full House).

Using the calculated probability of rolling a Yahtzee and applying it to our four-person game (with each player getting 13 turns, and treating every turn as an independent attempt at a Yahtzee), the probability of no one scoring a Yahtzee is as follows:

(1 - 0.046)^(13*4) ≈ 8.6%
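This arithmetic is easy to reproduce, and to rerun for other table sizes. The short Python sketch below treats every turn as an independent Yahtzee attempt with the same success rate. As noted earlier, that is an optimistic simplification, and the function name is illustrative only.

```python
def prob_no_yahtzee(p_turn, turns_per_player=13, players=4):
    """Chance that no turn in the whole game produces a Yahtzee,
    assuming independent turns with identical success probability."""
    return (1 - p_turn) ** (turns_per_player * players)

print(f"{prob_no_yahtzee(0.046):.1%}")  # 8.6%
```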

What a shame, because it is so much fun to shout “Yahtzee!”