A practical mathematics of optimal bluff, in poker as in real life


(Second part. First part here: https://www.setthings.com/en/poker-bluff/. Image: World Series of Poker celebrity poker tournament – Rio Casino, Las Vegas, https://www.flickr.com/photos/kalooz/3724732892)

Neutral point of bluff

Alice may decide to call in order to punish a possible bluff (with frequency α), and Bob may decide to bluff (with frequency β).

If Bob has a weak hand, his choice is between bluffing and folding. If he decides to bluff, the risk he takes depends on Alice's strategy: he risks losing the amount R of his raise (with frequency a + α, where a is the frequency with which Alice holds a winning hand and α the frequency with which she calls anyway), but he wins the pot P when Alice does not call (the rest of the time). The equilibrium point for Bob is reached when R (a + α) = P (1 − (a + α)), that is, (a + α) = P / (P + R); this equilibrium depends only on the probability that Alice calls. If Alice calls less often, Bob can bluff more often; if Alice calls more often, Bob should bluff less often; and if Alice plays exactly at this neutral point, Bob's gain does not depend on his bluff rate.
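The break-even condition for a bluff can be checked numerically. The following is only a minimal sketch: the function names `neutral_call_frequency` and `bluff_ev` are hypothetical, and taking the pot as the unit (P = 1) with a raise of twice the pot is an assumption chosen for illustration.

```python
def neutral_call_frequency(pot: float, raise_: float) -> float:
    """Total calling frequency (a + alpha) at which a bluff breaks even."""
    return pot / (pot + raise_)


def bluff_ev(pot: float, raise_: float, call_freq: float) -> float:
    """Average result of one bluff: win the pot when not called, lose the raise when called."""
    return pot * (1 - call_freq) - raise_ * call_freq


# Assumed example values: pot of 1 unit, raise of twice the pot.
pot, raise_ = 1.0, 2.0
neutral = neutral_call_frequency(pot, raise_)

# Below the neutral point the bluff pays, above it the bluff loses.
for call_freq in (0.20, neutral, 0.50):
    ev = bluff_ev(pot, raise_, call_freq)
    print(f"call frequency {call_freq:.3f} -> bluff EV {ev:+.3f}")
```

With these assumed values the bluff is profitable when Alice calls less than one time in three, unprofitable when she calls more, and exactly neutral in between.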

Against a raise from Bob, Alice may choose to punish a possible bluff by calling the bet, even if she holds a potentially losing hand. The following table describes Alice's three possible situations and her gains after Bob has raised:

  • (Gains for Alice) >>> Alice has a full house or four of a kind, frequency a = 5.6% >>> Alice calls to punish a bluff, frequency α >>> Alice folds to the raise, frequency 1 − (a + α)
  • Bob hit his draw, frequency b = 14.6% >>> Alice wins the pot and the raise, P + R >>> Alice loses her call, −(R − r) >>> Bob wins the pot, 0
  • Bob decides to bluff, frequency β >>> Alice wins the pot and the raise, P + R >>> Alice wins the pot and the raise, P + R >>> Bob wins the pot, 0

Alice loses her call (R − r) in the b = 14.6% of cases where Bob's draw came in, but with frequency β she wins both the pot and Bob's raise. Equilibrium is reached when (R − r)·b = β·(P + R), which depends only on Bob's bluff rate (and, of course, on the size of his raise). If β < b·(R − r)/(P + R), trying to punish a bluff loses money on average, so it is better to fold and let a possible bluff go. If instead β exceeds this limit, punishing a possible bluff wins money on average, so it is better to call systematically.
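The same kind of check works from Alice's side. This is an illustrative sketch only: the names `neutral_bluff_frequency` and `call_ev` are hypothetical, and since the text does not give a value for r, the default r = 0 is an assumption made purely for the example.

```python
def neutral_bluff_frequency(b: float, pot: float, raise_: float, r: float = 0.0) -> float:
    """Bluff frequency beta at which calling breaks even: beta = b (R - r) / (P + R)."""
    return b * (raise_ - r) / (pot + raise_)


def call_ev(b: float, beta: float, pot: float, raise_: float, r: float = 0.0) -> float:
    """Balance of a systematic call: win P + R against bluffs, lose R - r against made draws."""
    return beta * (pot + raise_) - b * (raise_ - r)


# Assumed example values: b = 14.6% as in the text, pot of 1 unit, raise of twice the pot, r = 0.
b, pot, raise_ = 0.146, 1.0, 2.0
neutral = neutral_bluff_frequency(b, pot, raise_)

# Below the neutral bluff rate calling loses, above it calling wins.
for beta in (0.05, neutral, 0.15):
    print(f"bluff frequency {beta:.3f} -> call balance {call_ev(b, beta, pot, raise_):+.3f}")
```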

The neutral point is reached for both players when Alice calls raises with probability a + α = P / (P + R), and Bob bluffs with probability β = b·(R − r) / (P + R). If Bob keeps exactly this bluff rate, he earns the same on average whatever Alice's strategy: he earns more from his bluffs if Alice calls less often, and more from his successful draws if Alice calls more often. Similarly, if Alice keeps strictly to this calling rate, she earns the same on average whatever Bob's strategy.

Note that the optimal bluff frequency β is always less than b, the frequency of the hand Bob claims to hold: when a rational bluffer is called, he shows a strong hand more than half the time. Conversely, if a bluffer shows a strong hand less than half the time when called, his bluffing is not rational but psychological, and the right strategy is to call more often.

  • A typical raise is twice the value of the pot.
  • At the neutral point, a raise of twice the pot is a bluff once out of three.
  • At the neutral point, a raise of twice the pot must be called one time in three.

Benefits provided by bluffing

What is the point of playing at the neutral point? The gain can be calculated simply by assuming that Alice is at the neutral point and that Bob never bluffs (since it is enough for one of the two players to be at the neutral point for the result to be neutral). The following table shows Bob's possible gains after he has raised:

  • (Gains for Bob) >>> Alice has a full house or four of a kind, frequency a = 5.6% >>> Alice calls to punish a bluff, frequency α >>> Alice folds to the raise, frequency 1 − (a + α)
  • Bob hit his draw and raises >>> Bob loses the pot and the raise, −R >>> Alice loses her call, Bob wins the pot and the raise, P + R >>> Bob wins the pot, P

 

  • Without bluffing, when his draw hits, Bob wins the pot with frequency (1 − a) and loses his raise with frequency a = 5.6%. Overall, his gain without bluffing is P (1 − a) − R·a.
  • With bluffing, he forces Alice to play at the neutral point by calling with an additional frequency α, so he now earns P (1 − a − α) − R·a + (P + R)·α.
  • The difference between the two situations is (P + R)·α − P·α = α·R: on average, on his successful draws, Bob wins exactly the amount of the raises that Alice is forced to call.
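The size of this extra gain can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, the with-bluff gain is read as P (1 − a − α) − R·a + (P + R)·α (the reading consistent with the stated difference α·R), and the pot of 1 unit with a raise of twice the pot is an assumed example.

```python
def gain_without_bluff(pot: float, raise_: float, a: float) -> float:
    """Bob's average result on a made draw when Alice only calls with her winning hands."""
    return pot * (1 - a) - raise_ * a


def gain_with_bluff(pot: float, raise_: float, a: float, alpha: float) -> float:
    """Same, when Alice also calls with losing hands at frequency alpha."""
    return pot * (1 - a - alpha) - raise_ * a + (pot + raise_) * alpha


# Assumed example values: pot of 1 unit, raise of twice the pot, a = 5.6% as in the text.
pot, raise_, a = 1.0, 2.0, 0.056
alpha = pot / (pot + raise_) - a  # extra calls Alice needs to reach her neutral point

extra = gain_with_bluff(pot, raise_, a, alpha) - gain_without_bluff(pot, raise_, a)
print(f"alpha = {alpha:.3f}, extra gain = {extra:.3f}, alpha * R = {alpha * raise_:.3f}")
```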

We see that the effect of bluffing is not felt on the weak hands (where the result is statistically neutral), but on the strong hands:

  • The point of an optimal bluffing strategy is to get more value from statistically winning hands, by forcing the opponent to call them more often.

Bluffing on a weak hand is therefore neither an attempt to deceive nor an attempt to force one's way against the odds, but simply an investment, carefully calculated to increase earnings over the whole game.

A typical exchange ending with a raise from Bob then implicitly means:

  • (Alice) Opening the pot (I have at least a high pair).
  • (Bob) Call (I have at least that).
  • (Alice) Two cards (it is a pair or three of a kind).
  • (Bob) A card (it’s a draw or two pair).
  • (Alice) Bet (my pair has improved).
  • (Bob) I raise to twice the pot (I claim to have hit my draw, but of course I lie twice in five ... up to you).
  • (Alice) Call (you'll laugh, but I hit my full house ...) or call (I don't have a winning hand, but I am keeping up my 33% of calls to punish a bluffer like you) or fold (you are entitled to win this kind of raise in 66% of cases; I hope you had the hand, and you will not even have the pleasure of showing it ...).

Reasons to play at the neutral point

For Bob, playing at the neutral point has a direct financial benefit: on average he earns more money than without bluffing, either on his strong hands, which Alice calls more often, or through the bluffs that go unpunished. Playing at the neutral point also has a psychological and statistical advantage: since the profitability of the hand does not depend on psychological factors, it ensures steady play without financial surprises. The only downside is having to bluff only within rational limits, rather than following one's inspiration. That said, he can still do so from time to time: it will be statistically undetectable.

For Alice, playing at the neutral point is not financially beneficial in itself, because it is statistically a waste of money against a player who evidently never bluffs, or who bluffs with a frequency clearly below his neutral point. However, it is insurance against heavy bluffers and erratic players: by playing at the neutral point, she can play without having to guess what her opponent's strategy hides. This insurance has a cost, but on average it is the same amount that she will win when she is in a position to bluff: on average, it is a zero-sum strategy. Of course, this does not stop her from calling less often on hands where she thinks the opponent cannot be bluffing, if her intuition is strong.

  • Against a player who clearly does not control his rate of bluffing or calling, do not play at the neutral point, but in a way that exploits his systematic errors. If he bluffs too much, call more; if he calls too much, bluff less; and so on. On average, a player who knows his neutral points wins money against a beginner who never respects the balance: it is enough to wait.
  • If the opponent clearly plays at the neutral point, there is no reason to change one's own strategy: as long as he does not move away from it, the average gain is the same. At most, one can try deviating from the neutral point to see whether he follows, and play cat and mouse when he does.

Optimal raise level

We know that Alice can calculate her neutral point from the size of Bob's raise. Bob's gain from the bluffing strategy, α·R, is therefore obtained by replacing α with its value:

$$\mathrm{Gain} = R\left(\frac{P}{P+R} - a\right) = P\left(1 - \frac{P}{P+R}\right) - aR$$

We see that, as a function of R (the size of the raise relative to the pot), the gain follows a branch of a hyperbola, and is maximal where its derivative vanishes:

$$\frac{d(\mathrm{Gain})}{dR} = \frac{P^2}{(P+R)^2} - a = 0, \quad \text{that is to say} \quad \frac{P}{P+R} = \sqrt{a}$$

In the case presented, the optimum would be a raise of about three times the pot, because the probability that Alice hits the dangerous full house is only a = 5.6%, which is relatively low. For a raise of three times the pot:

  • The maximum acceptable bluff rate is about 60% of the rate of the claimed hand: a big raise will be a bluff in 37.5% of cases.
  • The calling rate to maintain is only 1/4 (counting winning hands).
  • The gain from the strategy is then 0.58 times the pot (instead of about 0.55 for a raise of twice the pot).

In fact, it is not very important to play exactly at the optimum, because around this value the average gain varies little. As a rule of thumb, if Alice's probability of winning is about 10%, Bob's optimal raise is close to twice the pot.
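The optimum and the flatness of the gain around it can be explored numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, the gain is expressed in pot units with x = R / P, and the optimum is obtained by solving P / (P + R) = √a for x.

```python
import math


def bluff_gain(raise_over_pot: float, a: float) -> float:
    """Gain in pot units: x * (1 / (1 + x) - a), with x = R / P."""
    x = raise_over_pot
    return x * (1.0 / (1.0 + x) - a)


def optimal_raise_over_pot(a: float) -> float:
    """Solve P / (P + R) = sqrt(a) for x = R / P."""
    return 1.0 / math.sqrt(a) - 1.0


# a = 5.6% is the case in the text; a = 10% is the rule-of-thumb case.
for a in (0.056, 0.10):
    x_star = optimal_raise_over_pot(a)
    print(f"a = {a:.3f}: optimal raise about {x_star:.2f} x pot, "
          f"gain about {bluff_gain(x_star, a):.3f} x pot")

# How little the gain moves around the optimum for a = 5.6%.
for x in (2.0, 3.0, 4.0):
    print(f"raise {x:.0f} x pot -> gain {bluff_gain(x, 0.056):.3f} x pot")
```

Working in pot units keeps a single free variable, which makes it easy to compare raise sizes without fixing a particular pot.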

Using raises of twice the pot (R = 2P), Bob bluffs with a frequency equal to two-thirds of his probability of holding the strong hand he claims (b = 14.6%), or about 10%. When his draw misses (which happens 85.4% of the time), in order to get full value from his winning draws in the long run, he must still bluff on 10% / 85.4% = 11.7% of his losing hands, raising aggressively to twice the amount of the pot.
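This arithmetic can be reproduced as follows. The sketch is illustrative only: the function `bluff_plan` and its dictionary keys are hypothetical names, and it takes r = 0, which is an assumption consistent with the two-thirds ratio used in this paragraph.

```python
def bluff_plan(b: float, raise_over_pot: float) -> dict:
    """Bluff frequencies at the neutral point, taking r = 0 (an assumption)."""
    ratio = raise_over_pot / (1.0 + raise_over_pot)  # R / (P + R)
    beta = b * ratio                                 # bluffs as a share of all hands
    return {
        "beta": beta,
        "share_of_missed_draws": beta / (1.0 - b),   # how often to bluff when the draw misses
        "share_of_raises": beta / (b + beta),        # how many of the raises are bluffs
    }


# b = 14.6% as in the text, raise of twice the pot.
plan = bluff_plan(b=0.146, raise_over_pot=2.0)
for name, value in plan.items():
    print(f"{name}: {value:.1%}")
```

Without intermediate rounding, β comes out at about 9.7% and the share of missed draws at about 11.4%; the 11.7% figure corresponds to rounding β to 10% first.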

The cat and the mouse

The neutral point is stable in the sense that if one of the two players uses it, the average gain is independent of the other player's strategy. But it is a strategy that has a cost: on average, you must call the raise, typically one time in three.

If Alice constantly plays at her neutral point, Bob may attempt a psychological game: make her believe in an eccentric strategy, and guess when Alice will change her strategy so as to reverse his own behavior. Bob can be provocative over a series of small hands, by clearly no longer playing at his own neutral point. Bluffing "clearly" too often is relatively easy: Alice eventually realizes that when she calls, the hands shown are not at the statistically expected level.

When Bob clearly departs from his neutral point, Alice can adjust her calling rate accordingly and make Bob pay for his inconsistency: if Bob's strategy is stable, Alice can profit. But to do this, she must herself move away from the neutral point and call much more often, which exposes her to a counter-move from Bob …

For example, out of fifteen raises by Bob, Alice should call five times on average. It would be normal for Bob to have bluffed twice on average, or even three times, but four or five exposed bluffs show that Bob was certainly not playing at his neutral point. Alice will then be tempted to call more often, allowing Bob (who has meanwhile returned to the optimal bluff level) to collect the price of his raises more often. Conversely, if Bob never bluffs over this sequence, Alice will be tempted to call less often, allowing Bob to increase his bluff rate with impunity.
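How unusual a given number of exposed bluffs is can be estimated with a simple binomial model. This is a rough sketch under loud assumptions that are not in the original text: each of Alice's five calls is treated as an independent trial, and the chance that a called raise is a bluff is taken to be 2/5 (the share implied by two bluffs in five calls). The helper `prob_at_least` is a hypothetical name.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of at least k successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_calls = 5      # Alice calls five of the fifteen raises
p_bluff = 2 / 5  # assumed share of bluffs among Bob's raises at his neutral point

for k in range(n_calls + 1):
    print(f"at least {k} exposed bluffs: {prob_at_least(k, n_calls, p_bluff):.3f}")

# Chance of seeing no bluff at all over the five calls.
print(f"no exposed bluff: {(1 - p_bluff) ** n_calls:.3f}")
```

Under these assumptions, four or more exposed bluffs in five calls, or no exposed bluff at all, each occur less than one time in ten even when Bob plays at his neutral point, so a single sequence is suggestive rather than conclusive.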

Translated from Wikipedia
