A Licoin
World Champion Tiebreaks: A Counter-Intuitive Proposal.
The Winner is the one who wins the Coin Toss. In a World Championship, the players may draw the Classical games.
So who wins?
A tiebreak is needed. Currently this involves playing Rapid and Blitz games.
The problem is that Rapid and Blitz tiebreaks favor whichever player is stronger in those formats.
The answer is that the tiebreak shall be a coin toss.
1. This is because if two players play equally well in Classical then a 50% chance is fairer than Rapid and Blitz. The better player in those formats has an advantage, which is unfair as this is the Classical World Championship. Rapid and Blitz have their own Championships.
2. This tiebreak incentivizes fighting chess because the contenders want to win the championship outright. Having a lead in the points is a greater advantage than a coin toss. The match still goes on. The coin flip is used only if the players finish with a tied score.
Rapid and Blitz are not supposed to decide the outcome of the Classical World Championship. I believe it's unfair to have a disadvantage in the Classical World Championship if you are a weaker Rapid and Blitz player, as the Classical World Championship is all about Classical chess. Not other formats.
Therefore if a player plays as well as their opponent, then they will have an equal chance in the coin toss. 50%.
Imagine being the weaker Rapid and Blitz player with, as an example, a 30% chance of winning tiebreaks. Having a 50% chance instead of 30% would feel fairer, in recognition of your equal Classical performance against your opponent.
We can have a big screen at the end of a tied match with a charismatic presenter hosting the coin flip.
More revenue can be made as tickets will be sold for the Coin Toss, which will be devoured by a blood-thirsty audience, too eager to see one of the players break down emotionally.
The screen will show the 'Licoin' as depicted in the thumbnail of this blog.
A player (current World Champion) will guess the color of a pawn which the presenter holds in their clenched hand.
If they guess right they will choose 'horsey' or 'no horsey', corresponding to the two sides of the coin.
If it lands on their choice then they will be The Champion. Otherwise, the challenger will become The Champion.
The camera should pan to the Champion's thrilled and overjoyed face. Then it should pan to the emotionally broken coin toss loser.
Confetti and Balloons should then be deployed from the ceiling of the venue at this very moment.
---
Originally I shared the paper below, which recommended Total Pawn Loss Value (TPLV): the summed difference between the engine's eval of the best move and its eval of the move the player chose. I then recommended Total Win Percentage Loss (TWPL) instead, since a 2-pawn loss is counted the same whether the eval goes from +10 to +8 or from +2 to 0, even though in the first case the player is still winning, while in the second case the position goes from winning to equal.
In the comment section two things were pointed out:
"we assume chess is a draw starting at 0.0 eval, and we reach a draw in the end counting as 0.0 eval; how can the sum of differences be anything else than 0? (Or perhaps I should say: equal for both players)." - @DaBassie
"Consider this situation: Both players play 1.e4 as white in all the games. Player A, plays Caro-Kann defense as black, and Player B, the Pirc. All of the games are won by white. When player B is white, his opponent plays the Caro-Kann, the game is hard and player B has to play for positional advantages, eventually winning his games because of slight positional mistakes by his opponent. Player A lost, but playing close to the engine moves, because his defense was ideal for that. When player A is white, his opponent plays the Pirc, the game is very sharp and player A goes for a direct attack against the king, eventually winning his games because of a tactical combination. Player B lost, but since the position was very sharp and his mistakes were tactical, not merely strateginal as in the other case, his positions were very different than those of the engine. What's the lesson here, winning by tactics is more valuable than winning exploiting positional advantages. Possibly, players who are very good calculators will be in a better spot than positional masters." - @ChesslyChessdriller
So this is why the TWPL cannot work. First, logically the TWPL should be the same for both players in drawn games. Second, in won games tactical players may have an advantage: the relative TWPL difference would be greater for them because blunders weigh heavily, whereas the opponent of a positional player may not make a blunder, only many inaccuracies.
Therefore I recommend the coin toss as a tiebreaker.
The following text is from before the above concerns were raised:
----
An interesting proposal is to make the tiebreak winner simply the player with the lowest Total Pawn Loss Value (TPLV). This is calculated by adding up, over all of the player's moves, the difference between the best move's eval and the eval of the move played. This is different from average centipawn loss, which takes the average of that difference across the moves. Averaging can result in low centipawn loss values when the games are long.
An example of TPLV: The top move is +0.5 and the player makes a move which makes eval go to -0.5. That is a 1 pawn loss. So these losses for each game would be summed across the moves for each game. And all the values from the games would be added up to get a final pawn loss value across the match.
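To make the difference between a total and an average concrete, here is a minimal Python sketch. It is only an illustration, not the paper's implementation: the function names, the `(best_eval, played_eval)` data layout and the clamping of negative losses to zero are my own assumptions, and the two games are made-up data.

```python
from typing import List, Tuple

# Each entry is (best_move_eval, played_move_eval) in pawns,
# both from the point of view of the player who moved.
MoveEvals = List[Tuple[float, float]]


def total_pawn_loss(moves: MoveEvals) -> float:
    """Sum of per-move losses: the TPLV of one player in one game."""
    # Assumption: a played move never scores better than the best move,
    # so any negative difference (engine noise) is clamped to zero.
    return sum(max(0.0, best - played) for best, played in moves)


def average_pawn_loss(moves: MoveEvals) -> float:
    """Mean per-move loss, the analogue of average centipawn loss."""
    return total_pawn_loss(moves) / len(moves) if moves else 0.0


# Hypothetical games: one 1-pawn error (+0.5 -> -0.5), all other moves perfect.
short_game = [(0.5, -0.5)] + [(0.0, 0.0)] * 19  # 20 moves
long_game = [(0.5, -0.5)] + [(0.0, 0.0)] * 79   # 80 moves

print(total_pawn_loss(short_game), average_pawn_loss(short_game))
print(total_pawn_loss(long_game), average_pawn_loss(long_game))
```

Both games contain the same single mistake, so the total is 1 pawn in each, while the average is four times smaller in the longer game.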
One reason for this is to determine the winner of Classical events based on Classical games. Having Rapid/Blitz/Armageddon tiebreaks favors the player who is better in these formats. The quality of the games drops, and luck plays a larger role.
Most people will be thinking "no no no" while thinking about this suggestion. Probably because it seems too basic, too boring or the system could be gamed.
This proposal was put forward in a paper 'AI-powered mechanisms as judges: Breaking ties in chess' published by Nejat Anbarci and Mehmet S Ismail in 2024.
To reduce the growing incidence of ties, many elite tournaments have resorted to fast chess tiebreakers. However, these tiebreakers significantly reduce the quality of games. To address this issue, we propose a novel AI-driven method for an objective tiebreaking mechanism.
This method evaluates the quality of players’ moves by comparing them to the optimal moves suggested by powerful chess engines. If there is a tie, the player with the higher quality measure wins the tiebreak. This approach not only enhances the fairness and integrity of the competition but also maintains the game’s high standards.
Former World Champion Veselin Topalov commented on this proposal:
We would like to thank FIDE World Chess Champion (2005-2006) and former world no. 1 Grandmaster Veselin Topalov for his comments and encouragement: “I truly find your idea quite interesting and I don’t immediately find a weak spot in it. There might be some situations when the winner of the tiebreak does not really deserve it, but that’s also the case with any game of chess” (e-mail communication, 11.08.2022, from Topalov regarding our tiebreaking proposal).
One point is that if a player offers a draw and it is accepted, then the eval of that draw is '0'. For example, being Black with an advantage of -1 and then making an accepted draw offer counts as an eval of '0', meaning that the draw offer counts as a loss of 1 pawn in the total pawn loss value.
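The draw-offer rule can be sketched in a few lines of Python. This is my own illustration under stated assumptions: the function name and the White-point-of-view sign convention are hypothetical, and the sketch only charges the player who offers the draw (it does not model what happens to the player who accepts).

```python
def draw_offer_loss(eval_white_pov: float, offered_by_white: bool) -> float:
    """Pawn loss charged to a player whose draw offer is accepted.

    eval_white_pov: engine eval before the offer, positive = good for White.
    The accepted draw is scored as 0, so the loss is the advantage that
    the offering player gave up. Assumption: never negative.
    """
    advantage = eval_white_pov if offered_by_white else -eval_white_pov
    return max(0.0, advantage - 0.0)


# Black is better by one pawn (eval -1) and makes an accepted draw offer.
print(draw_offer_loss(-1.0, offered_by_white=False))  # 1.0
```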
Anbarci and Ismail noted that this situation occurred in the 2018 World Championship between Carlsen and Caruana. Carlsen had that advantage and offered a draw.
Magnus Carlsen offered a draw in a better position against Fabiano Caruana in the last classical game in their world championship match in 2018. This was because Carlsen was a much better player in rapid/blitz time-control than his opponent. Indeed, he won the rapid tiebreaks convincingly with a score of 3–0. Note that Carlsen made the best decision given the championship match, but due to the tiebreak system his decision was not the best (i.e., manipulation-proof) in the particular game.
Carlsen, of course, knew what he was doing when he offered a draw. During the post-game interview, he said “My approach was not to unbalance the position at that point” [34]. Indeed, in our opinion, Carlsen would not have offered a draw in their last game under the TPLV-based tiebreaking system because his TPLV was already lower in this game than Caruana.
After the draw offer was accepted, the game ended in a draw and the evaluation of the position is obviously 0. As a result, Carlsen lost 1=−(−1−0) pawn-unit with his offer, as calculated in Table 4. Thus, his final TPLV is 6.2 = 5.2 + 1 as Table 4 shows. (For comparison, Carlsen’s and Caruana’s average centipawn losses in the entire match were 4.13 and 4.24, respectively).
(A draw offer in that position would make Caruana the winner of the tiebreak under our method. As previously mentioned, we use TPLV to break a tie in matches; though, it could also be used as a tiebreaker in individual drawn games).
---
This system provides an incentive for playing Classical games, on top of making tiebreaks fairer.
A player will not offer draws in advantageous positions for this reason, which prevents faster and more variable time controls from determining the winner of events.
| Year | Players | # Games | C-TPLV 1 | C-TPLV 2 | Δ (1 − 2) | Δ as % of lower C-TPLV |
|---|---|---|---|---|---|---|
| 1910 | Schlechter-Lasker | 10 | 51.3 | 151.5 | -100.2 | 195.3 |
| 1951 | Botvinnik-Bronstein | 24 | 245.4 | 145.1 | 100.3 | 69.1 |
| 1954 | Botvinnik-Smyslov | 24 | 115.2 | 119.1 | -3.9 | 3.4 |
| 1987 | Kasparov-Karpov | 24 | 93.9 | 88.8 | 5.1 | 5.7 |
| 2004 | Leko-Kramnik | 14 | 117.1 | 22.4 | 94.7 | 422.8 |
| 2006 | Topalov-Kramnik | 12 | 41.2 | 229.8 | -188.6 | 457.8 |
| 2012 | Anand-Gelfand | 12 | 42.3 | 43.9 | -1.6 | 3.8 |
| 2016 | Karjakin-Carlsen | 12 | 41.9 | 40.6 | 1.3 | 3.2 |
| 2018 | Carlsen-Caruana | 12 | 27.5 | 27.8 | -0.3 | 1.1 |
Table 2. Summary statistics: The player with the lower cumulative (C) TPLV is declared the tiebreak winner. 'AI-powered mechanisms as judges: Breaking ties in chess' published by Nejat Anbarci and Mehmet S Ismail in 2024.
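As a quick check of how the last two columns relate to the two C-TPLV columns, here is a small Python sketch using three rows copied from the table. The formulas are my reading of the numbers (the difference of the two columns, and that difference as a percentage of the lower value), not something quoted from the paper, and the `summarize` helper is hypothetical.

```python
# (year, players, C-TPLV 1, C-TPLV 2) copied from Table 2.
rows = [
    (1910, "Schlechter-Lasker", 51.3, 151.5),
    (2016, "Karjakin-Carlsen", 41.9, 40.6),
    (2018, "Carlsen-Caruana", 27.5, 27.8),
]


def summarize(tplv_1: float, tplv_2: float):
    """Return (delta, percent, column with the lower C-TPLV)."""
    delta = tplv_1 - tplv_2
    # Assumes both values are positive, as they are in the table.
    pct = 100 * abs(delta) / min(tplv_1, tplv_2)
    lower = 1 if delta < 0 else 2 if delta > 0 else None
    return round(delta, 1), round(pct, 1), lower


for year, players, tplv_1, tplv_2 in rows:
    delta, pct, lower = summarize(tplv_1, tplv_2)
    print(year, players, delta, pct, "lower C-TPLV in column", lower)
```

The column with the lower C-TPLV is the one that would win the tiebreak under the rule in the caption.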
---
TPLVs per round/game in World Chess Championship matches. 'AI-powered mechanisms as judges: Breaking ties in chess' published by Nejat Anbarci and Mehmet S Ismail in 2024.
This system can also be used for tournaments.
Many elite tournaments, including the world chess championship as mentioned earlier, use Armageddon as a final tiebreaker. Most recently, the Armageddon tiebreaker was used in the 2022 US Women Chess Championship when Jennifer Yu and Irina Krush both tied for the first place, scoring each 9 points out of 13. Both players made big blunders in the Armageddon game; Irina Krush made an illegal move under time pressure and eventually lost the game and the championship.
According to our TPLV-based tiebreak method, Irina Krush would have been the US champion because she played a significantly better chess in the tournament according to Stockfish: Irina Krush’s games were about two pawn-units better on average than Jennifer Yu’s.
It's a question of perspective. Rapid and blitz tiebreaks are the standard. Living in a world with this new system, rapid and blitz tiebreaks would look like an odd choice for Classical events.
However, pawn loss is not ideal since a 2-pawn loss is treated the same when going from +10 to +8 and from +2 to 0. In both cases the loss is counted the same even though in the first case the player is still winning, while in the second case the position goes from winning to equal.
This means that win percentage loss should be calculated instead. Centipawn advantage can be converted into a win percentage by a formula. Lichess uses Win% = 50 + 50 * (2 / (1 + exp(-0.00368208 * centipawns)) - 1). This was based on 75k positions in 2300+ rapid games on Lichess. https://github.com/lichess-org/lila/pull/11148
A model can be created from OTB Classical games between GMs to produce a more accurate Win Percentage graph.
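For anyone who wants to try the formula, here is a direct Python transcription. The function name `win_percent` is my own; the constant and the shape of the formula are the ones quoted above.

```python
import math


def win_percent(centipawns: float) -> float:
    """Convert a centipawn eval into a win percentage (Lichess formula)."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * centipawns)) - 1)


# Evals are in centipawns, so +2 pawns is 200 and +10 pawns is 1000.
for cp in (0, 200, 800, 1000):
    print(cp, round(win_percent(cp), 1))
```

An eval of 0 maps to exactly 50%, and the curve flattens out as the advantage grows.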
A possibility is to exclude opening book moves to avoid discriminating against openings which are evaluated lower such as the Pirc or King's Indian.
As an example using the Lichess model, in the scenario above going from +10 to +8 converts to going from a 97.5% chance of winning to 95.0%, which is a difference of 2.5 percentage points. (A Classical OTB model with GMs would probably reduce the difference (2.5%) for a +10 to +8 eval change.)
And +2 to 0 converts from 67.6% to 50%, a loss of 17.6 percentage points.
Under the pawn loss metric, these two instances would be equivalent.
So from here on, I will recommend the Total Win Percentage Loss (TWPL) as a more suitable metric than Total Pawn Loss Value.
The Total Win Percentage Loss (TWPL) sums up the win percentage loss across all the moves in the match, like the Total Pawn Loss Value did.
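A minimal Python sketch of the TWPL sum, assuming the Lichess win percentage formula and evals given in centipawns from the moving player's point of view. The function names, the data layout and the clamping of negative losses to zero are assumptions for illustration only.

```python
import math
from typing import Iterable, Tuple


def win_percent(centipawns: float) -> float:
    """Lichess conversion from a centipawn eval to a win percentage."""
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * centipawns)) - 1)


def total_win_percent_loss(moves: Iterable[Tuple[float, float]]) -> float:
    """Sum the win percentage lost on each move.

    Each move is (best_move_eval, played_move_eval) in centipawns.
    Assumption: negative differences are treated as zero loss.
    """
    return sum(
        max(0.0, win_percent(best) - win_percent(played))
        for best, played in moves
    )


# The two 2-pawn slips discussed in the text.
still_winning = [(1000, 800)]  # +10 -> +8
now_equal = [(200, 0)]         # +2 -> 0

print(round(total_win_percent_loss(still_winning), 1))
print(round(total_win_percent_loss(now_equal), 1))
```

The same 2-pawn drop costs far less win percentage when the player is already completely winning.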
If players try to play like a computer to get a better TWPL, then that means they are playing better chess and will get a game advantage anyway. 'Gaming' the system by playing better chess is valid. Is a player with a certain style going to lose TWPL unfairly? No, because their opponent would have to play better than them and if they play better then it's fair.
There is the question of style. Players with a tactical, complex style tend to have lower accuracy. But this is fine for a two-player event as TWPL is a relative measure. We are only interested in whether one player is better than the other. As an example, Firouzja playing tactical chess doesn't hurt him. It only hurts him if his opponent manages to defend. The opponent also has to cope with the pressure. The opponent making errors will give them a worse TWPL than Firouzja. The only way they can have a better TWPL than Firouzja is if they defend his attack with an advantage. In which case they deserve to have a better TWPL. For this reason, TWPL may not be suitable for tournaments as solid players would be favored over tactical/complex players.
Another question is whether a player who is losing will resign early to avoid losing more TWPL. This won't happen because the TWPL accounts for the fact that in extreme positions the win percentage loss is lower, as changes in evaluation won't affect the outcome as much (e.g. +10 to +8 is not much compared to +2 to 0). A player who is losing will not resign prematurely, as they have the chance to improve their relative TWPL by improving their position: the winning opponent would then accumulate win percentage loss at a faster rate due to how the win percentage works.
Losing games is worse than having a lower TWPL score. The tiebreak doesn't change the fact that the players want to win games. Having a lead in the classical section is better than leading the TWPL. If the player has a practical chance of defending then they will play on.
Another interesting question is whether players would change their playstyles to be more solid. But what would be the benefit of playing safer openings? It would just make the TWPL a coin toss. In a regular match the GMs don't play Petrov every game because they know they have to win. Besides, the player with a lower TWPL would still have to play for a win anyway. Now in the current WCH with Rapid and Blitz tiebreaks we already have instances of players changing their playstyle to be more solid. Carlsen agreed to a draw in a better position against Caruana because he didn't want to risk it in the 2018 match. Under the new tiebreak that wouldn't occur.
If a player wants to go for complicated play in a game, TWPL would not make a difference. As an example, under the current WCH setup they would take a risk. If they feel confident enough to play better than their opponent in that position, why would their confidence level change with the TWPL system? It doesn't matter how badly they play, just as long as they play better than their opponent. It's relative.
One important concern is the reliability of engine evals. Different engines with different depths/time spent can give different evals. Even the same engine can give different evals with the same settings. Engine evals should be consistent to make accuracy determinations valid. The conversion from TPLV to TWPL means that reliability will be increased, because evals at the higher end, which can be more variable, translate into only small win percentage changes.
We will have to compare evals within the latest Stockfish and between previous Stockfish versions to see the margin of error, that is, by how much the Total Win Percentage Loss changes between runs of the same version and between versions. If the margin of error is low then we can proceed. The strongest and most consistent engine should be used in world class events (Stockfish), with optimal settings. The engine used, depth, search parameters and hardware should be disclosed publicly when measuring the TWPL and these settings should be kept consistent throughout the event. To address the concern that TWPL fluctuations may be random and arbitrary because of inconsistent engine evals, a threshold could be used, for example, where TWPL values that are within a certain percentage of each other are treated as equal. Determining the right threshold will need to be investigated empirically.
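The threshold idea could look something like the Python sketch below. Everything here is hypothetical: the function name, measuring the gap as a percentage of the lower TWPL, and the 5% value in the usage lines are placeholders, since the right threshold would have to be found empirically.

```python
from typing import Optional


def twpl_winner(twpl_a: float, twpl_b: float, threshold_pct: float) -> Optional[str]:
    """Return 'A' or 'B' for the player with the lower TWPL.

    Returns None when the two values are equal or within threshold_pct
    of each other, meaning the TWPL cannot separate the players.
    """
    gap = abs(twpl_a - twpl_b)
    if gap == 0:
        return None
    lower = min(twpl_a, twpl_b)
    # Assumption: the gap is measured relative to the lower of the two values.
    if lower > 0 and 100 * gap / lower <= threshold_pct:
        return None
    return "A" if twpl_a < twpl_b else "B"


# Hypothetical 5% threshold.
print(twpl_winner(100.0, 103.0, threshold_pct=5.0))  # None
print(twpl_winner(100.0, 120.0, threshold_pct=5.0))  # A
```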
Even if the margin is low, what if the match is very tight? In this case the TWPL difference may be more random, but this would still be more fair than a Rapid and Blitz playoff, where the weaker Rapid/Blitz player would have a greater chance of losing in spite of their tied Classical performance.
I think that the above scenario will be quite unlikely anyway, as during the course of a match/tournament the players and the public know each player's TWPL (Total Win Percentage Loss). A player trailing in the TWPL will play for a win. This reduces draws as players can't rely on going into a fast play tiebreak. A player leading can't offer a draw in better positions or they lose TWPL. The changing values of the TWPL throughout matches or tournaments add drama and tension.
Another question is about the TWPL of draws. In a drawn game both players should have the same TWPL, as the net win percentage loss should be 0%. A draw should not produce different TWPLs. This is only a concern if players draw all the games. Otherwise won games can be used to differentiate the more clinical player. Players drawing all the games is rare (only Carlsen-Caruana).
If players draw all their games then the Classical World Champion will be decided by a coin flip. This is because if two players play equally well then a 50% chance is fairer than Rapid and Blitz, where the better player in that format has an advantage. The above scenario is unlikely both statistically and also because the contenders want to win the championship outright. Having a lead in the points is a greater advantage than a coin toss. The Classical World Championship should be based on Classical chess. Rapid and Blitz have their own World Championships.
Ultimately, this tiebreak mechanism is a means to an end. It does not change the fact that players will want to win by points, in contrast to situations like the 2018 World Championship. The Championship will ideally be decided by a player winning more games. In the event of a tie the Winner is based on Classical Chess games, as opposed to Rapid and Blitz games. But we want to see a win by most Classical wins as opposed to a tiebreak win. This tiebreak system incentivizes wins in Classical Chess compared to the current Rapid and Blitz playoff.
---
As mentioned before, the above text is from before the concerns given at the start of the blog were raised. This is why I now recommend the coin toss tiebreaker.
1. This is because if two players play equally well in Classical then a 50% chance is fairer than Rapid and Blitz. The better player in those formats has an advantage, which is unfair as this is the Classical World Championship. Rapid and Blitz have their own Championships.
2. This tiebreak incentivizes fighting chess because the contenders want to win the championship outright. Having a lead in the points is a greater advantage than a coin toss.
Rapid and Blitz are not supposed to decide the outcome of the Classical World Championship. I believe it's unfair to have a disadvantage in the Classical World Championship if you are a weaker Rapid and Blitz player, as the Classical World Championship is all about Classical chess. Not other formats.
Therefore if a player plays as well as their opponent, then they will have an equal chance in the coin toss. 50%.
Imagine being the weaker Rapid and Blitz player with, as an example, a 30% chance of winning tiebreaks. Having a 50% chance instead of 30% would feel fairer, in recognition of your equal Classical performance against your opponent.
---
Share your thoughts on this system in the comments section.
