Noobmasterplayer123
Human Estimate Eval Function
If a human or human-like model assessed this position, what centipawn score would reflect their perception?
Intro
Hey all, I'm back with another mathematical paper on a function I have been working on in my free time. It's called the Human Estimate Eval function (HEE), and it mathematically computes a human's chess eval in centipawns from predicted WDL (Win/Draw/Loss) probabilities. In this paper I will present 2 functions: an objective HEE, which can be computed from Maia3's outputs and approximates a human eval based on the Maia3 neural net's predicted WDL, and a subjective HEE, which uses a human's manual WDL predictions. Before I go into the math, a word on the motivation: I was inspired to work on this function by ChessDojo's no-engine streams, where the senseis used a human eval bar rather than an engine. That got me interested in exploring the mathematical side, just to see if I could come up with anything useful. After a couple of late nights of work, it seems we might have a useful candidate function!
Calculations
I want to start with the intuition behind how we can calculate the HEE. I first wanted to see if I could do quick heuristic checks like KMAPS, but that seemed too engine-like. I wanted this function to be adaptable to neural nets like Maia that are trained on human games, so that whatever objective HEE function we have, its value relates directly to the predictions of a neural net trained on human games. That way we really do get a human, engine-like centipawn eval.
When evaluating positions, we humans tend to look at the quality of the position, or mainly which side has more chances to win. The metrics behind that conclusion can be arbitrary themes like material, space, positional imbalances, etc. Since these themes are quite subjective to each person, we can just take objective Win/Draw/Loss values instead. Let the quality score Q compress the Win/Draw/Loss probabilities into a single scalar that represents the overall favorability of a position; we can use this Q as the starting point of our calculations. In chess terms, the subjective quality score can be calculated as follows

Here, the P(result) function gives the probability of that result. The subjective human quality score Qh is represented as above: we basically take the probability of win + draw over the sum of the W/D/L probabilities. This function is bounded to [0, 1].
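As a minimal sketch of the description above (not the exact formula from the post), here is Qh read literally as "win + draw over the sum of W/D/L". The function name `q_human` is hypothetical, and counting a draw at full weight is an assumption taken from that wording.

```python
def q_human(p_win: float, p_draw: float, p_loss: float) -> float:
    """Subjective quality score Qh in [0, 1] from a human's W/D/L guesses.

    Assumption: a literal reading of "win + draw over the sum of W/D/L".
    Inputs may be percentages or fractions; they are normalised by their sum.
    """
    if min(p_win, p_draw, p_loss) < 0:
        raise ValueError("probabilities must be non-negative")
    total = p_win + p_draw + p_loss
    if total <= 0:
        raise ValueError("at least one of W/D/L must be positive")
    return (p_win + p_draw) / total


if __name__ == "__main__":
    # A human who guesses 50% win, 30% draw, 20% loss.
    print(q_human(50, 30, 20))
```

Dividing by the sum means the guesses do not have to add up to exactly 100.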
Since the above relies purely on a human's guess of the WDL probabilities, we can also define a stable objective Q function, which takes Maia3's WDL probabilities and applies a tweak so that the resulting Q value is bounded to [-1, 1]. The following represents the Qm function.

Now that we have a quality score, we need some way to transform it into a raw, engine-like centipawn value, which gives us the HEE. I looked around for a function that does this, and I found the answer in my own ease metric blog! Let's look at Lichess's converter function below
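For reference, here is a small sketch of that conversion in Python. It assumes the logistic form and the constant 0.00368208 from Lichess's winning-chances code; the names `LICHESS_K` and `q_from_cp` are made up for illustration, and any clamping of the centipawn input that Lichess applies is left out.

```python
import math

# Assumed constant from Lichess's centipawn -> winning chances conversion.
LICHESS_K = 0.00368208


def q_from_cp(cp: float) -> float:
    """Map a centipawn eval to a winning-chance score Q in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-LICHESS_K * cp)) - 1.0


if __name__ == "__main__":
    for cp in (-300, -100, 0, 100, 300):
        print(f"{cp:+5d} cp -> Q = {q_from_cp(cp):+.3f}")
```

An equal position (0 cp) maps to Q = 0, and the score saturates towards +1 or -1 as the eval grows for either side.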

Looking at this function, I noticed 2 things: the centipawn value cp and Q. The function takes a centipawn value and generates a winning-chance quality score Q. We want to do the opposite: we already have the subjective and objective Q values, and we want the centipawn value to get the HEE. The trick is that we can just invert the above function by following the steps below
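The inversion can be written out as follows. This is a sketch that assumes the Lichess conversion has the logistic form with constant k = 0.00368208; the symbol k is introduced here only for readability.

```latex
\begin{align*}
Q &= \frac{2}{1 + e^{-k \cdot cp}} - 1, \qquad k = 0.00368208 \\
Q + 1 &= \frac{2}{1 + e^{-k \cdot cp}} \\
1 + e^{-k \cdot cp} &= \frac{2}{1 + Q} \\
e^{-k \cdot cp} &= \frac{2}{1 + Q} - 1 = \frac{1 - Q}{1 + Q} \\
-k \cdot cp &= \ln\!\left(\frac{1 - Q}{1 + Q}\right) \\
cp &= \frac{1}{k} \ln\!\left(\frac{1 + Q}{1 - Q}\right)
\end{align*}
```

The last line is only defined for -1 < Q < 1, so a Q of exactly +1 or -1 has to be handled separately (for example by clamping).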

After doing a couple of math tricks above, we get the following generic HEE function
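Here is a minimal Python sketch of such an inverse, assuming the Lichess logistic conversion with constant 0.00368208. The names `hee_from_q` and `q_from_cp`, and the small `eps` clamp that keeps the logarithm finite at Q = ±1, are my own illustration choices and not necessarily what the demo uses.

```python
import math

# Assumed constant from Lichess's centipawn -> winning chances conversion.
LICHESS_K = 0.00368208


def q_from_cp(cp: float) -> float:
    """Forward direction: centipawns -> quality score Q in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-LICHESS_K * cp)) - 1.0


def hee_from_q(q: float, eps: float = 1e-6) -> float:
    """Inverse direction: quality score Q -> estimated centipawns.

    Q is clamped to [-1 + eps, 1 - eps] so that a "certain" win or loss
    gives a large but finite eval instead of a division by zero.
    """
    q = max(-1.0 + eps, min(1.0 - eps, q))
    return math.log((1.0 + q) / (1.0 - q)) / LICHESS_K


if __name__ == "__main__":
    # Round trip: cp -> Q -> cp should return the starting value.
    for cp in (-250.0, 0.0, 137.0):
        print(cp, round(hee_from_q(q_from_cp(cp)), 3))
```

The round trip is a quick way to check that the inverse really undoes the forward conversion.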
Now we can get the objective HEE function, which takes Maia3's WDL through the Qm function. Here, we plug in the Qm we came up with earlier

We can also get a subjective HEE function that takes a human's manually predicted WDL; here we again plug in the original Qh we came up with earlier
And with that, we have mathematically defined a Human Estimate Eval function that outputs centipawns based on the input Q function
HEE Demo
I have coded a demo page where you can enter a position's FEN, play out chess positions, and see Maia3's HEE based on my formula above. The page shows Q and estimated eval bars for Maia's 600 to 2600 rating-band nets, and it also has a subjective HEE picker where you can input W/D/L probabilities and see the eval bar in action.

Above, I put in the position from round 4 of the Candidates, GM Sindarov vs GM Caruana. According to Maia3's 2600 WDL Q function, the human eval at that level would be +1.37, while the engine says +2.1. This HEE is fairly close to the engine eval, yet reasonable for human play at the 2600 level.
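As a rough sanity check on these two numbers, we can push them back through the Lichess-style conversion and see which Q values they correspond to. This sketch assumes the constant 0.00368208; the helper name `q_from_cp` is hypothetical.

```python
import math

# Assumed constant from Lichess's centipawn -> winning chances conversion.
LICHESS_K = 0.00368208


def q_from_cp(cp: float) -> float:
    """Centipawns -> quality score Q in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-LICHESS_K * cp)) - 1.0


if __name__ == "__main__":
    # Evals quoted in the text, in pawns.
    for label, pawns in (("Maia3 2600 HEE", 1.37), ("engine", 2.10)):
        q = q_from_cp(pawns * 100)
        print(f"{label}: {pawns:+.2f} -> Q = {q:.3f}")
```

Under that assumption, +1.37 corresponds to a Q of roughly 0.25 and +2.1 to roughly 0.37, so the gap between the human estimate and the engine is about 0.12 on the Q scale.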
The above allows chess players to enter their own predictions and calculates the subjective HEE using the subjective HEE function I discussed earlier. Feel free to try the HEE function on the chessagine human eval page and play around with it on various positions! If you want to check out the source code and mathematical explanations, check out the GitHub
Conclusion
In conclusion, I was able to come up with a mathematical HEE function that outputs an estimated, engine-like centipawn value based on objective and subjective Q functions of WDL values. I want to note that this function has its limitations, since the eval doesn't fully account for practical over-the-board situations or engine-like heuristics. Still, it's a reasonable function when paired with a neural net like Maia3, which outputs the WDL based on what it has seen during its training on human games.
That's it from me in this blog,
Thanks for reading!
Noob
Credits
- Maia 3 and University of Toronto
- Lichess Lila PR Converter function PR
- Extended Ease metric, Noobmasterplayer123
