Understanding soundness and motivations in chess puzzles, problems, and studies
1 Nov. 2021 | by Peter Wong
What makes chess puzzles, problems, and endgame studies sound – or unsound? Why do some puzzles seem to contain “sub-optimal” moves by the opposing side? Is a problem or a study faulty if its thematic variation isn’t apparently “forced”? These sorts of questions are often raised, and it’s certainly important to be able to distinguish between valid and incorrect positions. After all, the brilliant solution of a puzzle or a composition comes to nought if an alternative line solves as well. Understanding the issue of soundness also helps us to appreciate why certain moves appear as the main line of a solution. We will thus look at the motivations that underlie “best” moves and how they actually vary in tactics puzzles, composed problems, and studies, due to different goals and conventions. We’ll also consider the limitations of chess engines and tablebases in analysing such positions, and see why the software’s purported solutions are often misleading. Once these factors are taken into account, we can reach the right verdict on the validity of any puzzle or composition, and avoid some common fallacies.
Three types of white goals
Tactics puzzles: The side to play obtains a winning material advantage (or gives mate).
Directmate problems: White to play and forces mate in N moves.
Endgame studies: White to play and win or White to play and draw.
Directmates and studies – two kinds of composed positions – are always solved from White’s perspective. That isn’t the case in tactics puzzles, but for ease of discussion, we’ll assume that in such positions it’s always White who gains the winning material advantage, and Black defends. For simplicity’s sake, I will ignore Draw studies, though the general principles would still apply.
Two types of unsoundness
Alternative white play different from the intended solution also achieves the specific goal.
Black manages to prevent White from achieving the goal, i.e. there is no solution.
The first type underlines the essential element of unique play in all forms of puzzles and compositions. In problems and studies, an alternative first move that also solves is called a cook, and a cooked composition is normally ruined. Alternatives on the second and subsequent moves are called duals, and they are serious flaws in thematic variations but tolerated in subsidiary lines. Importantly, note that these faults apply to white play only. Alternative black moves, unless they actually defeat White’s goal (as mentioned above), are not flaws at all; they simply generate variations.
This is why it’s pointless to criticise a black move in the main variation of a puzzle or a study for being “not forced,” unless you could indicate an alternative black move that actually stops White from winning. A forced move is one where all alternatives would lead to a worse result for the player. Thus in a sound position, White’s correct moves are forced (alternatives would fail to win); but since the position is objectively lost for Black (best play by White is always assumed), not only the black moves of the main line but all of their alternatives would also lead to the same fatal result. In such a situation, the idea of a “forced” black move – one that gives a better result than its alternatives – is ultimately meaningless.
Three types of black strategies
Tactics puzzles and Small-Target strategy
Chess.com Daily Puzzle
“No Need To Worry”, 1 Aug. 2018
The solution is 1.h4! and Black, instead of playing 1…Ke7 to handle the mating threat, chooses 1…Rxh7. White answers with 2.Bg5 mate. Firstly, the puzzle is sound because in the main line, every white move is unique while alternative black moves, including 1…Ke7, would fail to stop White’s goal of gaining a winning material advantage (2.Qxg7+). It’s therefore not useful to say that 1…Rxh7 is a “blunder” or that 1…Ke7 would be a “better” alternative, when both lose ultimately. This is an important distinction between puzzles and the practical game. In puzzles (along with compositions), only objective results matter, based on perfect white play; the “practical” chances in a game, such as those available from playing 1…Ke7 against an imperfect opponent, are not relevant.
Now, since 1.h4! forces a win against 1…Ke7 too, why not make this more “sensible” black move the main defence of the puzzle? The reason is that this black move fails to employ Small-Target strategy: it does not restrict White to a single winning reply. Tactics puzzles are generally “mined” from actual master games, and the selection of suitable positions depends on engine evaluations and the detection of unique solutions. On the evaluation of moves, I’m not sure what exact figures are used by Chess.com, but a move scoring around 3 points or more is normally viewed as winning. Let’s check how Stockfish on Chess.com evaluates the alternative 1…Ke7.
We see that besides the natural 2.Qxg7+, at least two other white moves also win against 1…Ke7. Thus if this black move had been used as the puzzle defence, any white move chosen as the correct response would not be unique. That is to say, the puzzle would actually become unsound. By contrast, Stockfish confirms that after 1…Rxh7, 2.Bg5 is the sole winning move, i.e. the black capture entails Small-Target strategy.
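The uniqueness test behind this comparison can be sketched in a few lines of Python. This is only an illustration: the 3.0 cutoff follows the rough figure mentioned above rather than any documented Chess.com setting, the scores are invented placeholders rather than real Stockfish output, and the unnamed "alternative" entries are stand-ins for whatever other white moves the engine lists.

```python
# Assumed cutoff in pawns ("around 3 points or more"); not a documented Chess.com value.
WIN_THRESHOLD = 3.0
MATE = 1000.0  # placeholder score standing in for a forced mate


def winning_moves(evals, threshold=WIN_THRESHOLD):
    """Return, in sorted order, the white moves scoring at or above the threshold."""
    return sorted(move for move, score in evals.items() if score >= threshold)


def is_small_target(evals, threshold=WIN_THRESHOLD):
    """True when exactly one white reply wins, so the puzzle line stays unique."""
    return len(winning_moves(evals, threshold)) == 1


# Invented placeholder evaluations of White's second move, for illustration only.
after_ke7 = {"Qxg7+": 6.0, "alternative A": 4.5, "alternative B": 3.5, "alternative C": 0.5}
after_rxh7 = {"Bg5#": MATE, "alternative D": 0.0}

print(is_small_target(after_ke7))   # False: several replies win
print(is_small_target(after_rxh7))  # True: only one reply wins
```

A black defence qualifies as the puzzle's main line only in the second case, where the count of winning replies is exactly one.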
Directmate problems and Delay-the-Mate strategy
Mate in 4
In this four-move problem, the black rook is preventing both Sb3 and Qa2 mate, and it cannot move without permitting one of these mates. So White wants to preserve the zugzwang by playing a waiting move, but shifting the king enables Black to refute by giving a series of checks, e.g. 1.Kh7? Ra7+! 2.g7 Rxg7+ 3.Kh6 (3.Kxg7 stalemate) Rh7+ 4.Qxh7 too slow. The key-move 1.g7! also exposes the white king to checks, but White manages to mate in time with a surprising promotion: 1…Ra6+ 2.Kh7 Rh6+ 3.Kg8 Rh8+ 4.gxh8=Q (or 3…Ra6 4.Sb3, 3…Rb6 4.Qa2). Another main defence occurs after 2.Kh7 with 2…Ra3. Now 3.g8=Q? fails to 3…Ra7+, but 3.h6 cleverly brings back the initial zugzwang: 3…R~file 4.Sb3, 3…R~rank 4.Qa2.
The problem is sound because: (1) Only the key forces mate in four moves (no cooks), and every subsequent white move in the main lines is unique too (no duals), and (2) Black applies the Delay-the-Mate strategy but still cannot extend the play to beyond four moves.
Directmate problems are typically misconstrued as faulty when their move limits are ignored, as if they were tactics puzzles or studies which don’t require their goals to be achieved within a certain number of moves. Thus alternative white play that forces mate but takes longer to do so is viewed as an equally valid solution, “spoiling” the problem. I call this sort of misunderstanding the fallacy of Any-win-will-do.
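The role of the move limit can be made concrete with a small Python sketch. The function names are hypothetical, and the mate lengths are assumptions for illustration: the figure 5 for 1.Kh7 is an invented stand-in for "too slow", and "other try" represents any first move that forces no mate at all.

```python
def solving_first_moves(mate_lengths, limit):
    """Return the white first moves that force mate within `limit` moves.

    `mate_lengths` maps each candidate first move to the number of moves in
    which it forces mate, or None if it forces no mate at all.
    """
    return sorted(
        move for move, length in mate_lengths.items()
        if length is not None and length <= limit
    )


def verdict(mate_lengths, limit):
    """Classify a directmate by counting first moves that meet the stipulation."""
    solutions = solving_first_moves(mate_lengths, limit)
    if not solutions:
        return "unsound: no solution"
    if len(solutions) > 1:
        return "unsound: cooked"
    return "sound: key is " + solutions[0]


# Assumed figures for illustration only (see the note above).
candidates = {"1.g7": 4, "1.Kh7": 5, "other try": None}

print(verdict(candidates, limit=4))             # sound: key is 1.g7
print(verdict(candidates, limit=float("inf")))  # unsound: cooked
```

Dropping the limit, as the Any-win-will-do fallacy does, is what turns the slower mate into an apparent cook.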
Endgame studies and Avoid-Theoretical-Loss strategy
Shakhmatnaya Moskva 1961, 1st Prize
White to play and win
Although White is a piece ahead in this position, Black is on the verge of queening the g-pawn and thus forcing a draw. 1.Sxg3? Kxg3 2.Bb7 only draws as Black could sacrifice the bishop for the remaining pawn. White starts with 1.Bf1!, threatening 2.Bg2 to neutralise the black pawn, after which White only needs to guide the f-pawn forward to win. Black therefore has to try something drastic. 1…Bb5 2.Bg2; not 2.Bxb5? g2 and Black promotes. 2…Bf1 – a neat sacrifice because after 3.Bxf1 g2, Black’s double-threat of …g1=Q/gxf1=Q seems impossible to stop except by 4.Bxg2, and that delivers stalemate! However, White has a remarkable resource, 4.Sg3, which not only defends the bishop but answers 4…g1=Q with 5.Sf5 mate (4…Kxg3 5.Bxg2 wins).
This study finishes with White mating after just five moves, but obviously it’s not a mate-in-5 problem. Why doesn’t Black extend the play at several points, such as on the last move with 4…gxf1=Q forcing 5.Sxf1? The reason, as discussed, is that these alternatives would create theoretical won positions for White, a sort of “giving up.” It’s ironic that Black allowing White to mate in the main line is considered more of a fight, but the point of the thematic defence 4…g1=Q is that it’s the culmination of Black’s sacrificial plan – a safe queen promotion that should have pulled off a draw, if not for the dramatic mating response. (Of course, most studies don’t finish with a mate but with White reaching a theoretical won position, one in which the solution must end because unique white play no longer exists.)
Endgame studies are sometimes wrongly thought to rely on “weak” black moves for their thematic variations to work. “Stronger” alternatives are thought to refute the intended solutions by prolonging the length of play, just as in a directmate problem, a black defence that holds off mate further than the intended solution would render the position unsound. This belief overlooks that studies (unlike directmate problems) don’t impose a move limit, and thus such black alternatives fail to thwart White’s goal and merely constitute side-variations. This kind of error may be called the fallacy of Delaying-loss-is-good-enough.
Why Stockfish and tablebases aren’t gods
Suppose we test a Win study with Stockfish and it finds no cooks or black refutations that could salvage a draw. In such a won position, the engine will display what it considers the top variation(s), based on game logic where the losing side drags out the play – what I’d call Delay-the-Loss strategy. While this strategy is akin to Delay-the-Mate, the engine has no inkling of the Small-Target and Avoid-Theoretical-Loss strategies, because they are relevant to puzzles and studies, not to the practical game. This is why engines are hopeless at detecting the main variation of a given position. To have solved a study, an engine would have analysed hundreds or thousands of variations, buried among which is the main line. But armed with the Delay-the-Loss strategy, the software is likely to pick out an uninteresting side-variation for your consumption. No doubt this engine behaviour, favouring the longest line, plays a big role in fostering the Delaying-loss-is-good-enough fallacy.
Unlike studies, directmate problems – especially the shorter mate-in-2 or mate-in-3 ones – usually contain multiple variations of equal importance. These thematic variations are of the same length, chosen from all the full-length variations that arise from the Delay-the-Mate strategy. Since this strategy is just a form of Delay-the-Loss which Stockfish likes, the software would automatically display one of the full-length variations. Engines of course don’t understand which lines are thematically interesting, so it’s hit-or-miss what they choose to show. Longer problems sometimes have just one full-length variation, in which case it can’t be missed. But in shorter problems, even if a thematic line gets picked by chance, that would still present a misleading picture because the point of such compositions is the relationship between several variations.
Tablebases are in the same boat as engines when handling composed positions. The software has no trouble finding correct white moves or black refutations if they exist. But in a sound study where all black moves lose, ordering them according to a tablebase metric number (e.g. depth-to-mate) won’t tell us which of these moves belong to a main variation. In directmate problems, the depth-to-mate numbers do reveal which black moves produce full-length variations, but when there is more than one, deciding which moves are thematic is again a judgement call for humans.
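As a rough Python sketch of what a depth-to-mate ordering can and cannot tell us in a directmate, the function below picks out the black moves that postpone mate the longest. The defence names and figures are hypothetical, chosen to resemble a mate-in-3 after the key.

```python
def full_length_defences(dtm_after_black):
    """Return the black moves that postpone mate the longest.

    `dtm_after_black` maps each black move to the number of white moves
    still needed to mate after it (a depth-to-mate style figure).
    """
    longest = max(dtm_after_black.values())
    return sorted(move for move, depth in dtm_after_black.items() if depth == longest)


# Hypothetical defences after the key of a mate-in-3, for illustration only.
defences = {"defence a": 2, "defence b": 2, "defence c": 1, "defence d": 2}

print(full_length_defences(defences))  # ['defence a', 'defence b', 'defence d']
```

The metric narrows the field to three full-length defences here, but nothing in the numbers marks any of them as thematic; that choice remains with the human.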
To properly test the correctness of a puzzle or a composition with an engine or tablebases, a single analysis of the initial position is thus not really sufficient. You also need to manually enter each black defence of the main variation(s), and check the software’s white response. At every such step – along with the starting position – if more than one white move accomplishes the relevant goal, or if White fails to achieve the goal at all, the position is unsound. Such a claim, though, is legitimate only when the various goals and conventions of puzzles and compositions are discerned.
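The step-by-step check described here might be organised along the following lines in Python. It is a sketch under stated assumptions: the function and its input format are hypothetical, and the sets of goal-achieving moves would have to come from your own engine or tablebase analysis of each position. The first example uses the white moves of the study above, assuming each is the only winning move at its turn; the second adds an invented extra move purely to show how a dual would be reported.

```python
def check_main_line(steps):
    """Check every white turn of a main line for the two kinds of unsoundness.

    `steps` is a list of (intended_move, achieving_moves) pairs, one per white
    turn, where `achieving_moves` is the set of white moves that the analysis
    says accomplish the goal in that position.
    Returns a list of problems; an empty list means no flaw was found.
    """
    problems = []
    for number, (intended, achieving) in enumerate(steps, start=1):
        if not achieving:
            problems.append(f"move {number}: no white move achieves the goal")
        elif intended not in achieving:
            problems.append(f"move {number}: intended {intended} does not achieve the goal")
        elif len(achieving) > 1:
            extras = sorted(achieving - {intended})
            kind = "cook" if number == 1 else "dual"  # first-move alternatives are cooks
            problems.append(f"move {number}: {kind} by {', '.join(extras)}")
    return problems


# Main line of the study above, assuming each white move is the sole winner.
study = [("Bf1", {"Bf1"}), ("Bg2", {"Bg2"}), ("Bxf1", {"Bxf1"}),
         ("Sg3", {"Sg3"}), ("Sf5", {"Sf5"})]
print(check_main_line(study))  # []

# Same line with an invented second winning move at move 4, for illustration only.
flawed = study[:3] + [("Sg3", {"Sg3", "invented move"})] + study[4:]
print(check_main_line(flawed))  # ['move 4: dual by invented move']
```

Each pair corresponds to one position you would enter manually after playing Black's main-line defence, which is why a single analysis of the starting position cannot fill in the whole list.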