Winning the Scratch Lottery

Subtitle: An Experiment in OCR, Robotics, and Statistics

I know someone who has won the million-dollar lottery. Twice. Ask any mathematician about playing the lottery and you'll likely be told not to waste your money because you can't win. That would be bad advice because the lottery is rigged. In your favor.

# Mohan Srivastava’s Discovery

Mohan Srivastava is a geological statistician from Toronto who helps mining companies find gold. In 2003, he struck gold in another way by figuring out how to beat the Canadian scratch lottery, as he explained to Jonah Lehrer of Wired Magazine. People think the lottery is a completely random game, but it isn’t. The lottery commission knows a priori how much it expects to earn from each game by controlling the number of winning tickets.

Printing companies need to produce the right number of winning tickets at each payoff level. Let’s look at a typical bingo ticket, one that I bought in Delaware several years ago.

To play, you scratch off all the “?”s in the area labeled “Caller’s Card” and then scratch off matching numbers in each of the four Bingo cards above. If you complete any horizontal, vertical, or diagonal line, you’ve got a winner. Srivastava figured that on a winning ticket, each number in a winning row, column, or diagonal can appear only once across all of the cards. Otherwise, it would be difficult for the printer to keep track of where the winning numbers appear and how they contribute to winning cards.

# How to Beat the Lottery

I’ll describe a potential path to scamming the lottery. First, you need to buy a lot of lottery tickets. After taking pictures of them, convert the images into useful numbers with Optical Character Recognition (OCR). Count how many times each number appears across the four cards. We’ll only keep track of those appearing exactly once. For these unique numbers, create a binary array of their positions on each card. Identify Bingos by summing rows, columns, and diagonals. Finally, on the tickets flagged as winners, scratch off the Caller’s Card and the corresponding numbers on each of the four cards to reveal the wins. Return the unused cards to the store for refunds and cash in the winning tickets.

Most states have apps to scan lottery tickets with your phone, but a more rapid method would be handy. Something like this Epson scanner would work. A less expensive option would be the Scanner Bin - The Clever Document Scanning Solution, so long as you could find a way to rapidly and automatically move the tickets out of the bin after capturing each image. Let’s assume for now that we can find a convenient method to take lots of pictures of lottery tickets in a hurry.

SikuliX by RaiMan automates keyboard, mouse, and screen functions programmatically. It uses OpenCV to find images on the screen. Search for this pattern to locate each of the four Bingo cards:

SikuliX returns the location of the pattern on the screen, and can capture images of each card with a snipping tool like Greenshot. SikuliX can also find the locations of other special symbols such as the “FREE” in the middle of the cards and the little stack of money at random places on each card.

Image processing in Python will let us remove the lines by subtracting the mask shown above from each card, as well as the special symbols. With a clean image, Tesseract OCR gives the text equivalent of each number found in the card images. Using the online OCR program OCRSpace without any image cleaning returned these numbers for each card:

There are errors, but the OCR did correctly convert the “FREE” at the center of each Bingo card. Cleaning up the images before running the OCR should take care of most of the errors.
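Whatever OCR engine is used, its raw text still has to be turned into a grid and checked for misreads. The following is a minimal sketch in plain Python, assuming the OCR returns 25 whitespace-separated tokens per card in reading order. The function name `parse_card` and the sample text are hypothetical and made up for illustration; they are not real OCR output from the ticket.

```python
def parse_card(ocr_text, size=5):
    """Turn OCR text for one card into a size x size grid.

    Each cell is an int (0-99), the string "FREE", or None if the token
    could not be read as a number. Unreadable tokens are also returned as
    (row, col, token) so they can be re-scanned or checked by hand.
    """
    tokens = ocr_text.split()
    if len(tokens) != size * size:
        raise ValueError(f"expected {size * size} tokens, got {len(tokens)}")

    grid, suspects = [], []
    for row in range(size):
        cells = []
        for col in range(size):
            token = tokens[row * size + col]
            if token.upper() == "FREE":
                cells.append("FREE")
            elif token.isdecimal() and 0 <= int(token) <= 99:
                cells.append(int(token))
            else:
                cells.append(None)  # likely an OCR misread
                suspects.append((row, col, token))
        grid.append(cells)
    return grid, suspects


# Hypothetical sample text with one deliberate misread ("l7" instead of 17).
sample_text = """
11 l7 33 48 61
 4 22 41 55 70
 9 25 FREE 52 66
14 30 38 47 75
 2 19 44 59 63
"""
grid, suspects = parse_card(sample_text)
```

Flagging suspect cells rather than guessing at them keeps a single misread from silently changing the counts in the next step.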

The next step is to identify the numbers that occur exactly once across the cards. We’ll set up a $100 \times 1$ zero vector with one entry for each number from 0 to 99, and then add 1 to the entry corresponding to each number found on the cards. So, the 11 in the upper left corner of Cards 1-3 means a 3 will appear in the 12th entry of the vector (index 11, counting from zero). After every number has been counted, we can search the count vector for 1’s. We’ll then set up four $5 \times 5$ zero arrays, one for each of the four cards, and put 1’s into the locations containing the singleton numbers.
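The counting step above can be sketched in plain Python. This is only an illustration under a few assumptions: each card is already a $5 \times 5$ list of integers from 0 to 99, the `FREE` marker and the function names are hypothetical, and the two demo cards are made-up numbers rather than values from a real ticket.

```python
FREE = None  # hypothetical marker for the FREE square and special symbols


def count_numbers(cards):
    """Return a 100-entry list; entry n is how often n appears on the cards."""
    counts = [0] * 100
    for card in cards:
        for row in card:
            for value in row:
                if value is not FREE:
                    counts[value] += 1
    return counts


def singleton_masks(cards, counts):
    """Return one 0/1 grid per card, with 1 where the number appears once."""
    return [
        [
            [1 if value is not FREE and counts[value] == 1 else 0 for value in row]
            for row in card
        ]
        for card in cards
    ]


# Made-up demo: two cards that share the numbers 20 through 24.
card_1 = [[5 * r + c for c in range(5)] for r in range(5)]       # 0..24
card_2 = [[5 * r + c + 20 for c in range(5)] for r in range(5)]  # 20..44
card_1[2][2] = FREE
card_2[2][2] = FREE

counts = count_numbers([card_1, card_2])
masks = singleton_masks([card_1, card_2], counts)
```

In this demo the shared numbers 20 through 24 get a count of 2, so the bottom row of the first mask and the top row of the second are all zeros. The FREE cells are left at 0 here and would be set to 1 before summing lines.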

By summing these card arrays horizontally and vertically we can find rows or columns totaling 5 (converting the FREE and special symbols to 1’s) to identify winning Bingos. It’s only a little harder to check the diagonals.
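The row, column, and diagonal sums can be sketched as follows. This assumes each card has already been reduced to a $5 \times 5$ grid of 0’s and 1’s, with the FREE square and special symbols set to 1; the function name `winning_lines` and the demo grid are hypothetical.

```python
def winning_lines(mask):
    """Return the names of all lines in a square 0/1 grid that are all 1's."""
    n = len(mask)
    totals = {}
    for i in range(n):
        totals[f"row {i}"] = sum(mask[i])
        totals[f"col {i}"] = sum(mask[r][i] for r in range(n))
    # The two diagonals need index arithmetic instead of a plain row or column sum.
    totals["main diagonal"] = sum(mask[i][i] for i in range(n))
    totals["anti-diagonal"] = sum(mask[i][n - 1 - i] for i in range(n))
    return [name for name, total in totals.items() if total == n]


# Made-up demo grid: the middle row and the main diagonal are complete.
demo_mask = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
lines = winning_lines(demo_mask)  # ["row 2", "main diagonal"]
```

A ticket would be flagged as a likely winner if any of its four cards returns a non-empty list.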

Of course, it would be nice to fully automate the card handling process with a SainSmart robotic arm, which could also be used to scratch off the lottery tickets. But, in any case, I think we’ve arrived at phase 3.

Many states have adopted scratch lotteries as a way to pay for their schools, but the lottery is a very regressive tax on the mathematically challenged. By filtering out the winning tickets, we’d effectively be increasing this regressive taxation. On the other hand, the Wired article suggested that some people are using the lottery to launder money, so it might be considered a social service to foil their efforts. Given the moral issues and all the software and hardware required, you have to wonder: is it worth all this effort for a lousy \$100K a year?

Update December 16, 2020: An elderly mathematician hacked the lottery for 26 million