*By Paul Butler – January 27, 2019*

Let’s play a game. I have a coin that lands **heads** 60% of the time and **tails** 40% of the time. You have $25 and can bet on either side of the coin — every time you’re right you double your bet, and every time you are wrong you lose it.

What would your betting strategy be? Try your luck:

*(In the original post, an interactive version of the game appears here: you start with a cash balance of $25.00 and use a slider to set each bet.)*

If you reached the maximum reward, congrats! You did better than 79% of participants who played this game for real money in a 2016 experiment by Haghani and Dewey.

Let’s consider a strategy for playing the game.

The standard approach to betting games is to maximize the **expected value** of each bet, that is, the amount you can expect to win (or lose) on average. In this game the odds are in your favor: every dollar you bet on *heads* has an expected return of $1.20, so to maximize the expected return on each bet you should bet everything you have on heads on every round.

This strategy might make sense if you can only play for a few rounds, but here’s the problem: as you play many rounds with this strategy, it becomes nearly certain that you will lose everything on one bad bet. Like the dog in Aesop’s fable, greed will have left you with less than you started with.
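To see how quickly the all-in strategy goes broke, here's a quick Monte Carlo sketch (the function name and trial counts are my own, not from the original experiment):

```python
import random

def all_in(rounds: int, p_heads: float = 0.6, start: float = 25.0) -> float:
    """Bet the entire bankroll on heads every round; return final wealth."""
    cash = start
    for _ in range(rounds):
        if random.random() < p_heads:
            cash *= 2   # right: the bet is doubled
        else:
            return 0.0  # wrong: the whole bankroll is lost
    return cash

random.seed(0)
trials = 100_000
ruined = sum(all_in(20) == 0.0 for _ in range(trials))
print(f"Went broke within 20 rounds: {ruined / trials:.2%}")
print(f"Chance of surviving 20 rounds: {0.6 ** 20:.6f}")  # survival needs 20 straight heads
```

Surviving $t$ rounds requires $t$ straight heads, which has probability $0.6^t$; at $t = 20$ that is about 0.004%, so ruin is all but certain.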

If maximizing the expected value is going to leave us almost certainly broke, is there a better strategy?

Suppose we are interested in comparing two strategies, which we can call $a$ and $b$. Let’s say that the random variables $V_{t,a}$ and $V_{t,b}$ represent the winnings of each strategy (respectively) after $t$ rounds.

How do we know which strategy is better? As we have seen, comparing them by expected value, i.e. $\mathbb{E}[V_{t,a}]$ to $\mathbb{E}[V_{t,b}]$, won’t work. Instead, let’s ask a subtly different question: which strategy is *more likely* to do better than the other after some number of rounds? Symbolically, the value we are interested in is:

$P(V_{t,a} > V_{t,b})$

One way to think about this probability is that if we were to play strategy $a$ for $t$ flips, and then start over with another $25 and play strategy $b$ for $t$ flips, what is the probability that strategy $a$ made more? If the probability is over 50%, we’ll say that $a$ is a *better* strategy than $b$.

If we have a means of saying that one strategy is *better* than another, we can define a *best* strategy as one for which no other strategy is better.

Does such a strategy exist? It’s less obvious than it might seem! For example, consider a (very generous) slot machine where you can pull one of three levers labeled $A$, $B$, and $C$. The payout for $A$ is drawn uniformly from $\{4,5,6\}$; $B$ is drawn uniformly from $\{2,3,9\}$; and $C$ is drawn uniformly from $\{1,7,8\}$.

The tables below show the nine equally-probable outcomes of a draw for each pair of these three random variables.

**$A$ vs. $B$:**

|         | $A = 4$ | $A = 5$ | $A = 6$ |
|---------|---------|---------|---------|
| $B = 2$ | 4 > 2   | 5 > 2   | 6 > 2   |
| $B = 3$ | 4 > 3   | 5 > 3   | 6 > 3   |
| $B = 9$ | 4 < 9   | 5 < 9   | 6 < 9   |

**$B$ vs. $C$:**

|         | $B = 2$ | $B = 3$ | $B = 9$ |
|---------|---------|---------|---------|
| $C = 1$ | 2 > 1   | 3 > 1   | 9 > 1   |
| $C = 7$ | 2 < 7   | 3 < 7   | 9 > 7   |
| $C = 8$ | 2 < 8   | 3 < 8   | 9 > 8   |

**$C$ vs. $A$:**

|         | $C = 1$ | $C = 7$ | $C = 8$ |
|---------|---------|---------|---------|
| $A = 4$ | 1 < 4   | 7 > 4   | 8 > 4   |
| $A = 5$ | 1 < 5   | 7 > 5   | 8 > 5   |
| $A = 6$ | 1 < 6   | 7 > 6   | 8 > 6   |

As you can see, the levers form a cycle: $A$ beats $B$ in six outcomes of nine, $B$ beats $C$ in five of nine, and $C$ beats $A$ in six of nine. There's no strategy (lever) which beats every other strategy more often than not for a single draw.
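This cycle is easy to verify by brute force; a minimal sketch (the names `A`, `B`, `C` mirror the levers above, and `win_rate` is my own helper name):

```python
from itertools import product

A, B, C = (4, 5, 6), (2, 3, 9), (1, 7, 8)

def win_rate(x, y):
    """Fraction of the nine equally likely draws in which lever x pays more than lever y."""
    return sum(a > b for a, b in product(x, y)) / 9

print(win_rate(A, B))  # 6/9 -- A is better than B
print(win_rate(B, C))  # 5/9 -- B is better than C
print(win_rate(C, A))  # 6/9 -- C is better than A: a cycle, so no lever is best
```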

Returning to our coin flip game, we now have two questions:

- Does a best strategy exist for coin-flip betting?
- If so, what is it?

Before we dive into the analysis, I’d like to make a few observations and assumptions to simplify the math.

**First**, we are interested in the long-run outcome over a large number of turns. To be mathematically precise, we are interested in the best strategy in the limit as $t$ approaches infinity.

**Second**, let’s assume that we’re dealing with an infinitely divisible currency. This way we don’t have to round bets to the nearest cent, which would complicate the math.

**Third**, let’s limit ourselves to strategies where we bet a fixed fraction $\ell$ of our wealth on every flip. Each possible value of $\ell$ between 0 and 1 therefore corresponds to a different strategy. Notationally, we will use $\ell_x$ to denote the fixed fraction used by the strategy $x$. It turns out that the optimal strategy under this restriction is also the best strategy overall, but the original Kelly paper leaves this as an open question; see *Optimal Gambling Systems for Favorable Games* for a proof of this.

Note that because of the second and third assumptions above, for a given value of $\ell$, the amount won or lost depends only on the *number of bets* won and lost, but not the order they are in. For example, if you play four rounds alternating wins and losses, you end up with

$\begin{aligned} &25 (1+\ell)(1-\ell)(1+\ell)(1-\ell) \\
= &25(1+\ell)^2(1-\ell)^2
\end{aligned}$

This is the same as if you first lose two and then win two:

$\begin{aligned}&25 (1-\ell)(1-\ell)(1+\ell)(1+\ell) \\
= &25 (1+\ell)^2(1-\ell)^2 \end{aligned}$

More generally, let’s say we run $t$ trials, winning $w$ and losing $t-w$. Then the amount we end up with is:

$V_{t,x} = 25 (1+\ell_x)^{w}(1-\ell_x)^{t-w}$
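As a sanity check, the order-independence is easy to confirm in code (a sketch; `wealth` is my own helper name):

```python
def wealth(ell: float, wins: int, losses: int, start: float = 25.0) -> float:
    """Final wealth after `wins` winning and `losses` losing bets of fraction `ell`."""
    return start * (1 + ell) ** wins * (1 - ell) ** losses

ell = 0.2
# Win-lose-win-lose...
alternating = 25 * (1 + ell) * (1 - ell) * (1 + ell) * (1 - ell)
# ...equals lose-lose-win-win: only the counts matter, not the order.
grouped = 25 * (1 - ell) * (1 - ell) * (1 + ell) * (1 + ell)
assert abs(alternating - grouped) < 1e-12
assert abs(alternating - wealth(ell, wins=2, losses=2)) < 1e-12
```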

Since the quantity we are actually interested in is $P(V_{t,a} > V_{t,b})$ for any strategies $a$ and $b$, consider another quantity:

$\begin{aligned} W_{t,x} &= \left(\frac{V_{t,x}}{25}\right)^{\frac{1}{t}} \\
&= (1+\ell_x)^{w/t}(1-\ell_x)^{1-(w/t)} \end{aligned}$

$W_{t,x}$ is a useful variable because $P(V_{t,a} > V_{t,b}) = P(W_{t,a} > W_{t,b})$.

(This is true because neither dividing by 25 nor taking the $t$-th root changes the ordering of those values. More precisely, $W_{t,x}$ is a *monotonic* function of $V_{t,x}$.)
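To make the monotonicity claim concrete, here's a two-line check with hypothetical wealth values:

```python
# If strategy a ends with more money than strategy b, the transformed values
# keep that order, because x -> (x / 25) ** (1 / t) is increasing for x > 0.
t = 10
V_a, V_b = 180.0, 90.0  # hypothetical final wealths after t rounds
W_a = (V_a / 25) ** (1 / t)
W_b = (V_b / 25) ** (1 / t)
assert (V_a > V_b) == (W_a > W_b)
```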

Notice that $W_{t,x}$ depends on just two values: the fraction $\ell_x$ and the fraction of the rounds that we win, $w/t$. To develop an intuition for the relationship between the three values, let’s plot $W_{t,x}$ on the left axis vs. $w/t$ on the bottom axis, for two different values of $\ell$.

Each point on the bottom axis represents a possible realization of the random variable $w/t$, ranging from the case where the coin comes up tails on every flip (the left edge of the plot) to the case where the coin comes up heads every time (the right edge of the plot). Not all of these values are equally likely. Since heads has 60% probability, the black vertical line at $w/t = 0.6$ represents the **median value** of $w/t$. In other words, if we play the game for $t$ rounds, the resulting value of $w/t$ is as likely to be to the left of that line as it is to the right of it. (This line is actually an approximation of the median, but it becomes an increasingly accurate one as $t$ becomes large.)

Using this fact, we can determine whether $P(W_{t,a} > W_{t,b}) > 0.5$ just by looking at the chart: if (and only if) the curve for $W_{t,a}$ intersects the median line *above* the curve for $W_{t,b}$, then $P(W_{t,a} > W_{t,b}) > 0.5$.

Consider why this is true. As long as $\ell_a \neq \ell_b$, the curves intersect at only one value of $w/t$. If that point is *right of the median*, then whichever curve is greater *at the median* is also greater everywhere *to the left of it*. Since, by definition, realizations of $w/t$ fall at or below the median 50% of the time, whichever curve is greater at the median is greater at least 50% of the time. But this curve is also greater *between* the median and the intersection, so we can actually say the curve that is greater at the median is greater *more than* 50% of the time.

A symmetric argument applies if the intersection is to the left of the median. If the curves intersect *at* the median, neither strategy is considered better than the other.

By putting together all the pieces we have seen so far, we know that $P(W_{t,a} > W_{t,b}) > 0.5$ if and only if $\mathrm{median}(W_{t,a}) > \mathrm{median}(W_{t,b})$.

Since $P(W_{t,a} > W_{t,b}) = P(V_{t,a} > V_{t,b})$, we now have a way to say whether or not strategy $a$ is better than $b$ just by comparing their medians!
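We can sanity-check this equivalence with a simulation: compare a moderate fraction ($\ell = 0.2$) against an aggressive one ($\ell = 0.8$) over many paired runs (the parameters and helper name here are my own choices):

```python
import random
import statistics

def final_wealth(ell: float, t: int, p_heads: float = 0.6, start: float = 25.0) -> float:
    """Play t rounds, betting the fixed fraction ell on heads each time."""
    cash = start
    for _ in range(t):
        bet = ell * cash
        cash += bet if random.random() < p_heads else -bet
    return cash

random.seed(1)
t, trials = 100, 5_000
a = [final_wealth(0.2, t) for _ in range(trials)]  # moderate fraction
b = [final_wealth(0.8, t) for _ in range(trials)]  # over-betting

higher_median = statistics.median(a) > statistics.median(b)
win_rate = sum(x > y for x, y in zip(a, b)) / trials
print(higher_median)  # True: the 0.2 strategy has the higher median...
print(win_rate)       # ...and it also beats 0.8 in well over half the paired runs
```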

This is handy for two reasons.

- It proves that a best strategy exists. We can’t have a situation like we did with the slot machines where $a$ is better than $b$ and $b$ is better than $c$ but $c$ is better than $a$, because that would imply that $\mathrm{median}(W_{t,a}) > \mathrm{median}(W_{t,b}) > \mathrm{median}(W_{t,c}) > \mathrm{median}(W_{t,a})$ which is not possible.
- It gives us an easy way to find the best strategy: find the strategy that has the maximum median return.

*(The original post includes an interactive plot here, showing the median return after ten bets as a function of $\ell$ for both a heads-betting and a tails-betting strategy, with the optimal $\ell$ marked by a dot.)*

When the probability of heads is **60%**, the highest median return after ten bets is achieved by betting **heads** with $\ell = 0.2$, yielding a median return of **$30.58**. As the probability of heads drops to 50%, the optimal bet shrinks to zero, and below 50% the tails-betting strategy takes over.

As you can see, when the probability of heads is 60%, the ideal strategy is betting 20% of your wealth. The game below is just like the one you played earlier, with the addition of a slider that lets you set the $\ell$ parameter and automatically calculates the bet from it. See what happens when you use the optimal strategy.

*(In the original post, a second interactive game appears here: you start again with $25.00, an autobet slider sets $\ell$, and the bet is computed automatically from your current cash.)*

Notice that there is a direct relationship between the probability of heads and the amount you should bet. In general, the optimal strategy in games like this that pay out 1:1 is to subtract 0.5 from your probability of winning, double the result, and bet that fraction of your current wealth. (See the appendix for an analytical derivation of this.)
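That rule of thumb is a one-liner (a sketch; for a win probability below 0.5 you would bet the other side instead, so this version simply refuses to bet):

```python
def kelly_bet(p_win: float, wealth: float) -> float:
    """Bet size for a 1:1 payout game: wealth * 2 * (p_win - 0.5), floored at zero."""
    return wealth * max(0.0, 2 * (p_win - 0.5))

print(round(kelly_bet(0.6, 25.0), 2))  # 5.0 -- bet $5 of $25 when heads is 60%
print(round(kelly_bet(0.5, 25.0), 2))  # 0.0 -- fair coin: don't bet
```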

To recap, we’ve seen that:

- Maximizing expected value can lead to a bad long-run strategy.
- Rather than maximizing expected value, we may be able to find a strategy which returns more than any other strategy at least half the time.
- But not always! For some sets of random variables, whichever one you pick, there is always a “better” one, meaning that there is no “best.”
- In the case of our betting game, a strategy is better than another if it has a higher *median*.
- This means that we can, for the betting game, find a *best* strategy for which there is no *better*.
- We do this by maximizing the median outcome with respect to the size of our bet, $\ell$.
- The resulting optimal bet for this type of game, as a fraction of wealth, is $\ell = 2(p - 0.5)$.

And with that result, we’ve arrived at our destination. This formula (actually, a slightly more general version of it) is commonly known as the Kelly Criterion after J. L. Kelly, Jr. In his original analysis, Kelly took a more direct route than we did to the solution, but I like to think of our path as the scenic route: this approach to the problem helped me develop an intuition for some of the more paradoxical aspects of working with probability and infinite series, and I hope it did for you as well. Happy betting!

**Appendix: deriving the optimal bet**

To find the optimal strategy, we want to find the value of $\ell$ that maximizes the median value of $(1 + \ell)^{w/t} (1 - \ell)^{1-(w/t)}$.

Since $\log$ is an increasing function, this is the same as maximizing the median of $\frac{w}{t}\log(1 + \ell) + (1 - \frac{w}{t})\log(1 - \ell)$.

We know that in the limit, $\mathrm{median}(w/t)$ is just the probability of heads. Let’s call this value $p$, so that the value we are interested in maximizing is $p\log(1 + \ell) + (1 - p)\log(1 - \ell)$.

To maximize this, we take the derivative with respect to $\ell$ and set it to zero:

$0 = \frac{p}{1 + \ell} - \frac{1 - p}{1 - \ell} = \frac{p(1 - \ell)-(1 - p)(1 + \ell)}{(1 + \ell)(1 - \ell)}$

The denominator is positive for $0 \leq \ell < 1$, so the numerator must be zero:

$\begin{aligned}
0 &= p(1 - \ell)-(1 - p)(1 + \ell) \\
&= p - p\ell - 1 - \ell + p + p\ell \\
&= 2p - 1 - \ell
\end{aligned}$

This gives us the value of $\ell$ which maximizes the median return:

$\ell = 2 (p - 0.5)$
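A quick numerical check of the closed form: grid-searching the growth expression recovers the same $\ell$ (the grid resolution and helper name are my own choices):

```python
from math import log

def median_growth(ell: float, p: float) -> float:
    """Per-round log growth at the median outcome: p*log(1+ell) + (1-p)*log(1-ell)."""
    return p * log(1 + ell) + (1 - p) * log(1 - ell)

p = 0.6
grid = [i / 10_000 for i in range(9_999)]  # ell from 0.0000 to 0.9998
best = max(grid, key=lambda ell: median_growth(ell, p))
print(best)           # 0.2, matching ell = 2 * (p - 0.5)
print(2 * (p - 0.5))  # the closed-form answer
```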