Markov chains, lottery misses and twitter

Yet again, I've been drawn into a little maths problem by an off-hand Twitter remark.

Closely followed by

I couldn’t resist. I replied with

This was based on a simple idea. You choose 6 numbers on a ticket. The draw is 6 main balls and 1 bonus ball drawn from 49 without replacement (Wikipedia – Lotto UK). To match no numbers (not even the bonus) in a single draw, all 7 balls drawn must be numbers you have not selected. I call this probability \(P_{0}\) and it is given as:

\begin{aligned}
P_{0} & =\frac{43}{49}\times\frac{42}{48}\times\frac{41}{47}\times\frac{40}{46}\times\frac{39}{45}\times\frac{38}{44}\times\frac{37}{43} \\
& = \frac{16112057}{42950292} \\
& \approx 0.375
\end{aligned}
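As a quick check, \(P_{0}\) can be computed exactly with Python's `fractions` module (a sketch; the hypergeometric cross-check via `math.comb` is my addition):

```python
from fractions import Fraction
from math import comb

# P0: probability that none of your 6 picks appear among the 7 balls
# drawn (6 main + bonus), multiplying the per-ball miss probabilities.
p0 = Fraction(1)
for k in range(7):
    p0 *= Fraction(43 - k, 49 - k)

# Equivalent hypergeometric form: all 7 balls come from the 43 numbers
# you did not choose.
assert p0 == Fraction(comb(43, 7), comb(49, 7))
assert p0 == Fraction(16112057, 42950292)  # the fraction quoted above

print(float(p0))  # ≈ 0.3751
```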

I then reasoned that, to get 5 misses in a row (\(P_{5}\)), you simply multiply the probabilities of the independent events…

\begin{aligned}
P_{5} = (P_{0})^{5}
\approx 0.00743
\end{aligned}
and
\begin{aligned}
P_{10} = (P_{0})^{10}
\approx 0.0000552
\end{aligned}
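Since consecutive draws are independent, these powers are easy to verify (a minimal sketch, reusing the exact \(P_{0}\) product; the variable names are mine):

```python
# Single-draw miss probability (6 main balls + bonus), from the product above.
p0 = 1.0
for k in range(7):
    p0 *= (43 - k) / (49 - k)

# Consecutive draws are independent, so n misses in a row is just p0**n.
p5 = p0 ** 5
p10 = p0 ** 10
print(round(p5, 5))   # 0.00743
print(round(p10, 7))  # 5.52e-05
```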

As always with probability, though, it is essential to think carefully about what is being asked (a recurring theme of my maths blogging and twittering). I was drawn back to the problem by this:

Scott, as usual, is right. I had answered "What's the probability that one specific series of 10 draws will not match your numbers?". The real question is: how likely is that chain of events to occur at some point in a given time period? Of course, there are many, many opportunities for the chain to occur if you play the lottery every week and have done for some time… Still, I couldn't quite work out how he had arrived at his number, so I decided to rekindle my dim recollection of the statistical properties of Markov chains. I couldn't remember much beyond the defining property: each step in the chain depends only on the current state, not on past history. Google leapt to my aid, and I'll link a couple of resources here. For those with no knowledge whatsoever of the term, the ever-useful Wikipedia is reasonable on Markov chains. The resource that helped me refresh a basic knowledge was from the statslab at Cambridge; course and PDF.

I decided to set my problem description up as shown below.

[Figure: Markov process diagram for chains of non-matching draws]

The first state is a "base" state, \(B\), in which the most recent draw matched at least one of your numbers. The next ten states, \(C_{1}\) to \(C_{10}\), are reached one by one for each consecutive draw in which you match no numbers. The numbers on the arrows are the probabilities that each transition will be followed. The final state is known as an "absorbing" state, because once the Markov process enters it, it never leaves (the probability of remaining in this state at any subsequent draw is 1). This state represents the real-world scenario where at least one run of 10 or more consecutive "no match" draws has been witnessed.

This can be represented as a probability matrix, or transition matrix, as shown below, where \(P_{i,j}\) represents the probability of moving from state \(i\) to state \(j\):

\[ P = \small{\begin{bmatrix} 0.625 & 0.375 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0.625 & 0 & 0.375 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0.625 & 0 & 0 & 0.375 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0.625 & 0 & 0 & 0 & 0.375 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0.625 & 0 & 0 & 0 & 0 & 0.375 & 0 & 0 & 0 & 0 & 0 \\ 0.625 & 0 & 0 & 0 & 0 & 0 & 0.375 & 0 & 0 & 0 & 0 \\ 0.625 & 0 & 0 & 0 & 0 & 0 & 0 & 0.375 & 0 & 0 & 0 \\ 0.625 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.375 & 0 & 0 \\ 0.625 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.375 & 0 \\ 0.625 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.375\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}}\]

The problem can be rephrased as "what is the absorption probability for state \(C_{10}\) after \(n\) draws?", or "what is the probability that the process will be in \(C_{10}\) after \(n\) draws?". This can be obtained by raising the transition matrix to the \(n\)th power and reading off \(P_{0,10}\), the probability of being in \(C_{10}\) having started in \(B\) (state 0).

So, the probability of seeing 10 consecutive draws with no numbers matched depends (unsurprisingly) on how many times you play the numbers. If we assume 10 years, then the period of play began in November 2003. At least one website has archived all the draws since the lottery's inception, so we can read off directly that there were 1044 draws between 7th November 2003 and 7th November 2013, when the 10 consecutive misses were witnessed. By my reckoning, this means the probability of observing this event at least once over that 10-year period was \({(P^{1044})}_{0,10} \approx 0.0350\), or 3.5%. This is about half the probability that Scott suggested. I checked with him and he didn't include the Wednesday draw, so we could instead look at \({(P^{520})}_{0,10} \approx 0.0174\) (1.74%) – even less probable! I rather suspect, though, that he has forgotten to take account of the 7th ball drawn (the bonus), which also has to be missed. Dropping it changes the numbers in \(P\) significantly, replacing each 0.625 with 0.564 and each 0.375 with 0.436. With that modification, I think \({(P^{520})}_{0,10} \approx 0.0692\). Back to where we started…
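The matrix-power calculation can be reproduced in a few lines of plain Python. This is a sketch (the function and constant names are mine); it uses the rounded single-draw probabilities shown in the matrix, which is what reproduces the figures quoted here:

```python
# States: 0 = base B, 1..10 = run of that many consecutive no-match
# draws, with state 10 absorbing ("a run of 10 has been witnessed").
def transition_matrix(q, run=10):
    n = run + 1
    P = [[0.0] * n for _ in range(n)]
    for i in range(run):
        P[i][0] = 1.0 - q   # matched at least one number: back to base
        P[i][i + 1] = q     # missed everything again: run grows by one
    P[run][run] = 1.0       # absorbing state
    return P

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, e):
    """Matrix power by repeated squaring."""
    n = len(P)
    R = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    while e:
        if e & 1:
            R = mat_mul(R, P)
        P = mat_mul(P, P)
        e >>= 1
    return R

# Rounded single-draw miss probabilities, as in the matrix above:
Q_WITH_BONUS = 0.375  # all 7 balls miss your 6 numbers
Q_MAIN_ONLY = 0.436   # all 6 main balls miss (bonus ignored)

p_10yr = mat_pow(transition_matrix(Q_WITH_BONUS), 1044)[0][10]
p_sat = mat_pow(transition_matrix(Q_WITH_BONUS), 520)[0][10]
p_sat_nobonus = mat_pow(transition_matrix(Q_MAIN_ONLY), 520)[0][10]

print(round(p_10yr, 4))        # ≈ 0.035  (Wed + Sat draws, 10 years)
print(round(p_sat, 4))         # ≈ 0.0174 (Saturday draws only)
print(round(p_sat_nobonus, 4)) # ≈ 0.0692 (Saturdays, bonus ignored)
```

Using the exact \(P_{0} \approx 0.37513\) rather than the rounded 0.375 shifts the third decimal place slightly but changes none of the conclusions.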

Interestingly, I don't think it makes any difference at all whether (or when) you change your numbers. That's one to ponder fully another day. Or maybe Scott has a view… 🙂
