When Pim and I looked at the prizes for Menu For Hope, we were surprised to see that a number of people had won more than one prize. I know, theoretically, that an even distribution among prize winners is one possible outcome while lists with duplicates are much more likely. But it's one thing to know it, and another thing to accept it. An iPod's shuffle feature is truly random, but it often seems as if the shuffle is favoring one artist or album. (That's why iTunes now offers a "smart" shuffle that diminishes randomness in favor of distributing artists/albums.) And in the TV show Numb3rs, an evenly distributed set of points is a clue that someone has tried to make something look random. True random numbers aren't spread out equally.
Just to convince myself, I wrote some code to demonstrate this unintuitive aspect of randomness.
First, draw 1 number from a set of 10. The probability for each number is 1/10th. Do this 10,000 times to get a bunch of results to chart, and you end up with an almost even distribution across those 10 numbers, demonstrating the fairness of the random draw:
1 1024
2 1042
3 992
4 952
5 974
6 1034
7 995
8 993
9 1001
10 993
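For anyone who wants to try this at home, here is a minimal sketch of that first experiment in Python. It is not the code I originally ran; the function name `single_draw_counts` and its parameters are made up for illustration, and your counts will differ from the ones above on every run.

```python
import random
from collections import Counter

def single_draw_counts(turns=10_000, pool_size=10, seed=None):
    """Draw one number per turn and tally how often each number comes up."""
    rng = random.Random(seed)  # pass a seed only if you want a repeatable run
    # randint is inclusive on both ends, so this draws from 1..pool_size
    return Counter(rng.randint(1, pool_size) for _ in range(turns))

if __name__ == "__main__":
    counts = single_draw_counts()
    for number in range(1, 11):
        print(number, counts[number])
```

Each count should land somewhere near 1,000, give or take a few dozen.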
In that run, each "turn" was a single draw. Now make each turn 10 draws. In Menu For Hope, many contributors entered multiple raffles, and drawing more than once from the same pool simulates this. Intuition says each number should come up once, because each has a 1-in-10 chance and there are 10 draws. But it doesn't work that way, at least not necessarily. On one run, I got this set of results:
2
5
6
7
2
10
2
6
8
3
Even though the probability of each number is 1/10, you end up with some numbers repeated. There are three 2s and two 6s.
When I ran the 10-draw turn 10,000 times, I got these results:
1 unique numbers 0
2 unique numbers 0
3 unique numbers 6
4 unique numbers 187
5 unique numbers 1274
6 unique numbers 3457
7 unique numbers 3528
8 unique numbers 1373
9 unique numbers 169
10 unique numbers 6
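A rough Python sketch of this second experiment, again not my original code. The name `unique_count_histogram` and its parameters are hypothetical. It makes 10 draws per turn, counts how many distinct numbers showed up, and tallies that count over 10,000 turns.

```python
import random
from collections import Counter

def unique_count_histogram(turns=10_000, draws_per_turn=10, pool_size=10, seed=None):
    """Tally, over many turns, how many distinct numbers each turn produced."""
    rng = random.Random(seed)
    histogram = Counter()
    for _ in range(turns):
        turn = [rng.randint(1, pool_size) for _ in range(draws_per_turn)]
        histogram[len(set(turn))] += 1  # a set keeps one copy of each number
    return histogram

if __name__ == "__main__":
    histogram = unique_count_histogram()
    for k in range(1, 11):
        print(k, "unique numbers", histogram[k])
```

The exact counts change from run to run, but the bulk of the turns should pile up at 6 and 7 unique numbers.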
Here I counted the unique numbers in each set. 10 unique numbers means a perfect distribution. 1 unique number means that every random draw hit the same number. I didn't implement the logic to separate duplicates (the 6 in the previous example) from the triples (the 2 in the prior example). Or quadruples, for that matter. This was a simple simulation.
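If you did want to separate the duplicates from the triples, one way it might be done in Python is to count how often each number was drawn and then count those counts. This is only a sketch; `multiplicity_profile` is a name I made up, and the example list is the single 10-draw run shown earlier.

```python
from collections import Counter

def multiplicity_profile(turn):
    """Map 'times drawn' to 'how many distinct numbers were drawn that often'."""
    per_number = Counter(turn)            # number -> times it was drawn
    return Counter(per_number.values())   # times drawn -> how many such numbers

example = [2, 5, 6, 7, 2, 10, 2, 6, 8, 3]  # the sample run from earlier
profile = multiplicity_profile(example)
print(sorted(profile.items()))  # [(1, 5), (2, 1), (3, 1)]
```

That reads as five numbers drawn once, one drawn twice (the 6) and one drawn three times (the 2).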
The odds of randomly getting an even distribution are pretty low. Compared with a perfect spread, you are roughly 600 times more likely to have 6 unique numbers, and the same goes for 7, meaning that some of those numbers are duplicated (or tripled or whatever). You’re 200 times more likely to have a scenario where half the numbers are missing. You’re 30 times more likely to have just 4 unique numbers: a whole host of repeats. The numbers looked similar on multiple runs.
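Those ratios come straight from the table of 10,000 turns: each row divided by the 6 turns that came out perfectly even. A quick check in Python, using only the counts reported above (the variable names are just for illustration):

```python
# Counts copied from the 10,000-turn results above: unique numbers -> turns
observed = {3: 6, 4: 187, 5: 1274, 6: 3457, 7: 3528, 8: 1373, 9: 169, 10: 6}

perfect = observed[10]  # turns where all 10 numbers were different
ratios = {k: count / perfect for k, count in observed.items()}

for k in (4, 5, 6, 7):
    print(k, "unique numbers:", round(ratios[k]), "times as common as a perfect spread")
```

With only 6 perfect turns in the sample, these ratios are rough; a different run could easily shift them.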
This makes sense if you think about it. In the second draw of the 10-draw turn, you have a 10 percent chance of drawing a duplicate. If you don’t, you have two unique numbers, and on the third draw you have a 20 percent chance of duplicating one of the existing values. By the time you get to the tenth draw, there’s a 90 percent chance that you’ll draw a number that’s already been drawn, assuming all the others were unique.
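You can chain those per-draw chances together to get the exact odds of a perfect turn: the first draw is always new, the second is new 9 times out of 10, the third 8 out of 10, and so on. Here is a small Python sketch of that multiplication, with `prob_all_unique` as a hypothetical helper name:

```python
from fractions import Fraction

def prob_all_unique(draws=10, pool_size=10):
    """Probability that every draw in a turn hits a number not drawn before."""
    p = Fraction(1)
    for already_drawn in range(draws):
        # chance that the next draw misses all numbers drawn so far
        p *= Fraction(pool_size - already_drawn, pool_size)
    return p

p = prob_all_unique()
print(float(p))            # roughly 0.00036
print(float(p) * 10_000)   # roughly 3.6 perfect turns expected per 10,000
```

So a handful of perfect turns out of 10,000, like the 6 in my run, is about what the arithmetic predicts.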
Menu For Hope, of course, is much more complicated. Raffle ticket purchasers can stack the odds in their favor by buying more tickets for a given raffle. Prizes with fewer bidders have different odds than those with more bidders. And it was about 100 draws across 9,000 raffle tickets. But even the simple version in this post, where each of just 10 numbers has an equal probability, shows that duplicates and triplicates should be commonplace. So it's not surprising that some people won multiple prizes: It would have been more surprising if none had.