Can LLMs like ChatGPT Understand Chess? • page 1/3 • Community Blog Discussions • adjva4.dpdns.org

RuyLopez1000

Comments on https://adjva4.dpdns.org/@/ruylopez1000/blog/can-llms-like-chatgpt-understand-chess/zUyffwCx

Willem_D

Asking an LLM to play chess is like asking a penguin to climb a tree.

sinarody1812

@Willem_D said:

Asking an LLM to play chess is like asking a penguin to climb a tree.

@RuyLopez1000 said:

Comments on https://adjva4.dpdns.org/@/ruylopez1000/blog/can-llms-like-chatgpt-understand-chess/zUyffwCx

Noobmasterplayer123

They can't understand; they are language models. Testing LLMs' chess ability is pointless. It's like asking someone who is great at communicating to walk into a PhD class and use their communication skills to explain hard math ideas, just because they know how to "talk". I have long supported multiple systems working together: a setup where small units each do what they are great at.

LittleFireRat

I've developed an app that can accurately give great information on chess positions. I offer demonstrations of it. The trick is to have it describe a position and then describe the follow-up actions in a human way. In doing so I have developed a chess coach AI that can consistently give great advice, with detailed descriptions of how to correct errors in your play. I am actively improving its features, and no, it doesn't use templates, and no, it isn't giving vague advice. I can share it with anyone on Discord at Ace7712. Everyone who has seen it agrees that it gives great advice and that it does most of what a coach would in terms of explaining a position.

Willem_D

Jokes aside, I wonder what temperature setting these LLMs had.

Any temperature >0 will make the LLM non-deterministic, meaning if you ask it the same question multiple times you can get different answers. Not because it doesn't know, but because it intentionally varies its output (in order to be creative).
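To make the temperature point concrete, here is a minimal sketch of temperature sampling. It is not how any particular LLM is implemented: the function name `sample_move` and the scores in `logits` are made up for illustration, and a real model samples tokens rather than whole moves.

```python
import math
import random

def sample_move(logits, temperature, rng=random):
    """Pick one key from `logits`, a hypothetical {move: raw score} mapping."""
    if temperature == 0:
        # Greedy decoding: always return the highest-scoring option.
        return max(logits, key=logits.get)
    # Softmax with temperature: divide each score by T before exponentiating.
    scaled = {move: score / temperature for move, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {move: math.exp(s - peak) for move, s in scaled.items()}
    moves = list(weights)
    return rng.choices(moves, weights=[weights[m] for m in moves], k=1)[0]

# Made-up scores for three candidate first moves.
logits = {"e4": 2.0, "d4": 1.8, "Nf3": 1.0}
print(sample_move(logits, 0))    # same answer on every call
print(sample_move(logits, 1.0))  # may differ from call to call
```

With temperature 0 the top-scoring move is returned every time. With a temperature above 0, lower-scoring moves are picked some of the time, so repeating the same question can give different answers.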

I suspect future LLMs will have the option to simply call some Stockfish API (if such a thing exists), asking an engine to evaluate a position and recommend moves. Microsoft Copilot, for instance, will often create a Python script on the fly and execute it in order to calculate difficult things. For instance, if you ask for the distance between two points on the Earth (which is not easy, as the Earth is an oblate spheroid), it will create a Python script that implements Vincenty's formulae to approximate the distance.
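For reference, the kind of script described above might look roughly like this. It is a sketch of Vincenty's inverse formula on the WGS-84 ellipsoid, not the code Copilot actually generates. The function name `vincenty_distance` and its parameters are my own choices, and nearly antipodal points (where the iteration can fail to converge) are not handled beyond raising an error.

```python
import math

# WGS-84 ellipsoid constants
A_AXIS = 6378137.0                  # semi-major axis in metres
FLATTENING = 1 / 298.257223563
B_AXIS = (1 - FLATTENING) * A_AXIS  # semi-minor axis in metres

def vincenty_distance(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Geodesic distance in metres between two points given in degrees."""
    f = FLATTENING
    # Reduced latitudes of the two points.
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l  # first guess for the longitude difference on the auxiliary sphere
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # Along the equator cos_sq_alpha is 0; the term is taken as 0 there.
        cos_2sm = (cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha
                   if cos_sq_alpha else 0.0)
        c = f / 16 * cos_sq_alpha * (4 + f * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (
                cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("no convergence (nearly antipodal points?)")

    u_sq = cos_sq_alpha * (A_AXIS ** 2 - B_AXIS ** 2) / B_AXIS ** 2
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2)
        - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2)
        * (-3 + 4 * cos_2sm ** 2)))
    return B_AXIS * big_a * (sigma - delta_sigma)

# One degree of longitude along the equator, roughly 111.32 km.
print(vincenty_distance(0.0, 0.0, 0.0, 1.0))
```

The point for this thread is that the LLM does not do the arithmetic itself. It writes a deterministic program and reads off the result, which is the same division of labour as handing a position to an engine.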

RuyLopez1000

@Willem_D said:

Jokes aside, I wonder what temperature setting these LLMs had.

Any temperature >0 will make the LLM non-deterministic, meaning if you ask it the same question multiple times you can get different answers. Not because it doesn't know, but because it intentionally varies its output (in order to be creative).

Geometric Stability (Song et al. 2025): Did not state the temperature settings for the LLMs. For the FEN/PGN/natural-language task, the paper only says 'We query the LLM three times under identical prompting (same temperature and instruction style)'.

ChessQA (Wen et al. 2025): Used the 'default sampling parameters for generation'.

Fluid Intelligence (Pleiss et al. 2026): Set the temperature to 0.

I added the info to the blog.

Noobmasterplayer123

@Willem_D said:

Jokes aside, I wonder what temperature setting these LLMs had.

Any temperature >0 will make the LLM non-deterministic, meaning if you ask it the same question multiple times you can get different answers. Not because it doesn't know, but because it intentionally varies its output (in order to be creative).

I suspect future LLMs will have the option to simply call some Stockfish API (if such a thing exists), asking an engine to evaluate a position and recommend moves. Microsoft Copilot, for instance, will often create a Python script on the fly and execute it in order to calculate difficult things. For instance, if you ask for the distance between two points on the Earth (which is not easy, as the Earth is an oblate spheroid), it will create a Python script that implements Vincenty's formulae to approximate the distance.

It does exist: it's called ChessAgine MCP, and it calls various chess tools for chess context. You can read more about it here: https://adjva4.dpdns.org/@/Noobmasterplayer123/blog/chessagine-mcp/ETaQkCX6

Noobmasterplayer123

@RuyLopez1000, maybe you should talk about MCP servers and the various newer ways people are trying to integrate LLMs with chess tools. That could be a future blog.

Toadofsky

Can humans understand chess? Some priors are required (players need to be taught the rules of the game, sometimes more than once and with examples; LLMs require training data).