Understanding soundness and motivations in chess puzzles, problems, and studies

1 Nov. 2021 | by Peter Wong

What makes chess puzzles, problems, and endgame studies sound – or unsound? Why do some puzzles seem to contain “sub-optimal” moves by the opposing side? Is a problem or a study faulty if its thematic variation isn’t apparently “forced”? These sorts of questions are often raised, and it’s certainly important to be able to distinguish between valid and invalid solutions. After all, the brilliant play of a puzzle or a composition comes to nought if an alternative line solves as well. Understanding the issue of soundness also helps us to appreciate why certain moves appear as the main line of a solution. We will thus look at the motivations that underlie “best” moves and note how they vary in tactics puzzles, composed problems, and studies, due to different goals and conventions. We’ll also consider the limitations of chess engines and tablebases in analysing such positions, and see why the software’s purported solutions are often misleading. Once these factors are taken into account, we can reach the right verdict on the validity of any puzzle or composition, and avoid some common fallacies.

Three types of white goals

Simply put, puzzles and compositions are sound or correct when their particular goals are achieved only through their intended play. This is one reason why the task of a composed problem or study needs to be clearly stated next to its diagram (as discussed in my blog Chess problems vs puzzles…, about the crucial distinction between composed problems and tactics puzzles). Otherwise, without a specified goal, how would you even know that you have achieved it or found an alternative solution? Tactics puzzles are typically shown without a specific task, but the vast majority of them share the one listed below. Here are the three major types of chess puzzles/problems, as defined by their respective goals:

Tactics puzzles: White to play and gain a winning material advantage.
Directmates: White to play and force mate within a stipulated number of moves.
Endgame studies: White to play and win, with no move limit.

Directmates and studies – two kinds of composed positions – are always solved from White’s perspective. That isn’t the case in tactics puzzles, but for ease of discussion, we’ll assume that in such positions it’s always White who gains the winning material advantage, and Black defends. For simplicity’s sake, I will ignore Draw studies, though the general principles would still apply.

Two types of unsoundness

There are two general ways by which a puzzle or a composition can become unsound:

(1) More than one white move achieves the goal at some point of the intended play, so the solution is not unique.
(2) A black defence thwarts White’s goal, so the intended solution does not work at all.

The first type underlines the essential element of unique play in all forms of puzzles and compositions. In problems and studies, an alternative first move that also solves is called a cook, and a cooked composition is normally ruined. Alternatives on the second and subsequent moves are called duals, and they are serious flaws in thematic variations but tolerated in subsidiary lines. Importantly, note that these faults apply to white play only. Alternative black moves, unless they actually thwart White’s goal (as mentioned above), are not flaws at all; they simply generate variations.

This is why it’s pointless to criticise a black move in the main variation of a puzzle or a study for being “not forced” or “not the best,” unless you could indicate an alternative black move that actually stops White from winning. A forced move is one where all alternatives would lead to a worse result for the player. Thus in a sound position, White’s correct moves are forced (alternatives would fail to win); but since the position is objectively lost for Black (best play by White is always assumed), not only the black moves of the main line but all of their alternatives would also lead to the same fatal result. In such a situation, the idea of a “forced” black move – a “best” one that gives a better result than its alternatives – is ultimately meaningless.

Three types of black strategies

Even though black moves in puzzles and compositions are not strictly forced, they are intelligently motivated. These moves follow certain principles that allow them to function as the main defences. Let us consider three such principles, or black strategies, that lie behind the selection of moves in a main variation. The first is a basic rule that applies in all three major types of puzzles/problems, and it’s demonstrated below with a tactics puzzle from Chess.com. The other two strategies are specific to directmate problems and studies respectively. As White’s goals in these two types of compositions are different, Black’s aims in countering them are also subtly changed.

Tactics puzzles and Small-Target strategy

The first principle of black play in a main variation derives naturally from the key element of unique play in puzzles and compositions. To be considered a main defence, a black move must leave White with only one response that achieves the given goal. Black moves that compel White to choose carefully in this sense exhibit what I call Small-Target strategy. In any given situation, various black moves could follow this strategy, so it’s not a sufficient condition for inclusion in a main line (other factors are involved, like themes) – but it’s a necessary one. The principle explains why in many puzzles Black apparently makes “poor” moves. Here’s a Daily Puzzle that illustrates the issues.

Chess.com Daily Puzzle
“No Need To Worry”, 1 Aug. 2018

The solution is 1.h4! and Black, instead of playing 1…Ke7 to handle the mating threat, chooses 1…Rxh7. White answers with 2.Bg5 mate. Firstly, the puzzle is sound because in the main line, every white move is uniquely forced while alternative black moves, including 1…Ke7, would fail to stop White’s goal of gaining a winning material advantage (2.Qxg7+). It’s therefore not useful to say that 1…Rxh7 is a “blunder” or that 1…Ke7 would be a “better” alternative, when both lose ultimately. This is an important distinction between puzzles and the practical game. In puzzles (along with compositions), only objective results matter, based on perfect white play which will inevitably convert the material advantage. The “practical chances” in a game, such as those available from playing 1…Ke7 against an imperfect opponent, are not relevant.

Now, since 1.h4! forces a win against 1…Ke7 too, why not make this more “sensible” black move the main defence of the puzzle? The reason is that this black move (besides being less instructive) fails to employ Small-Target strategy. Tactics puzzles are generally extracted from actual games, and the selection of suitable positions depends on engine evaluations and the detection of unique solutions. On the evaluation of moves, I’m not sure what exact figures are used by Chess.com, but a move evaluated at +3 (about three pawns) or more in White’s favour is normally viewed as winning. Let’s check how the Stockfish engine on Chess.com evaluates the alternative 1…Ke7.

We see that besides the natural 2.Qxg7+, at least two other white moves also win against 1…Ke7. Thus if this black move had been used as the puzzle defence, any white move chosen as the correct response (e.g. 2.Qxg7+) would not be unique, and solvers who pick an alternative win (e.g. 2.f6+) would be unfairly penalised. In other words, the puzzle would actually become unsound. By contrast, Stockfish confirms that after 1…Rxh7, 2.Bg5 is the sole winning move, i.e. the black capture entails Small-Target strategy.
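
The uniqueness test behind Small-Target strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3.0 threshold is the rule of thumb mentioned earlier, not Chess.com’s actual figure; the evaluation numbers are invented for the example and are not Stockfish output; and the function names are hypothetical.

```python
# Rule-of-thumb threshold from the text: +3 or more counts as winning.
# The exact figure used by Chess.com is not known.
WIN_THRESHOLD = 3.0


def winning_replies(evals, threshold=WIN_THRESHOLD):
    """Return the white replies whose evaluation reaches the winning threshold."""
    return sorted(move for move, score in evals.items() if score >= threshold)


def is_small_target(evals, threshold=WIN_THRESHOLD):
    """A black defence shows Small-Target strategy if exactly one white reply wins."""
    return len(winning_replies(evals, threshold)) == 1


# Invented evaluations (in pawns) of White's replies, for illustration only.
after_ke7 = {"2.Qxg7+": 9.5, "2.f6+": 6.0, "another move": 4.2, "a quiet move": 0.3}
# A forced mate is represented here by a very large score.
after_rxh7 = {"2.Bg5#": 1000.0, "any other move": -2.0}

print(winning_replies(after_ke7))   # several winning replies: not a puzzle defence
print(is_small_target(after_ke7))
print(is_small_target(after_rxh7))  # a single winning reply: suitable main defence
```

With numbers of this kind, 1…Ke7 leaves White several winning replies and so fails the test, whereas 1…Rxh7 leaves exactly one.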

Directmate problems and Delay-the-Mate strategy

In directmate problems, White’s goal is not simply to win but to mate in the fewest moves. Black’s opposing motive is therefore quite specific as well – to hold off mate for as long as possible. This Delay-the-Mate strategy is pretty intuitive; thus in a main variation we won’t see Black conceding a short mate of the type seen in the above puzzle example. Another difference is that when solving directmates, you need to work out Black’s best (i.e. delay-the-mate) defences in addition to White’s correct moves, unlike the case in Chess.com puzzles where the interface provides you with the black moves.

Herbert Hultberg
Eskilstuna-Kuriren 1935

Mate in 4

In this four-move problem, the black rook is preventing both Sb3 and Qa2 mate, and it cannot move without permitting one of these mates. So White wants to preserve the zugzwang by playing a waiting move, but shifting the king enables Black to refute by giving a series of checks, e.g. 1.Kh7? Ra7+! 2.g7 Rxg7+ 3.Kh6 (3.Kxg7 stalemate) Rh7+ 4.Qxh7 too slow. The key-move 1.g7! also exposes the white king to checks, but White manages to mate in time with a surprising promotion: 1…Ra6+ 2.Kh7 Rh6+ 3.Kg8 Rh8+ 4.gxh8=Q (or 3…Ra6 4.Sb3, 3…Rb6 4.Qa2). Another main defence occurs after 2.Kh7 with 2…Ra3. Now 3.g8=Q? fails to 3…Ra7+, but 3.h6 cleverly brings back the initial zugzwang: 3…R~file 4.Sb3, 3…R~rank 4.Qa2.

The problem is sound because: (1) only the key forces mate in four moves (no cooks), and every subsequent white move in the main lines is unique too (no duals), and (2) Black applies the Delay-the-Mate strategy but still cannot extend the play beyond four moves.
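
These two conditions can be expressed as a small search over a tree of moves. The Python sketch below is a toy model, not a chess engine: the tree contains only the lines quoted in the solution above (a real test would need every legal move), "escape" is a made-up marker for lines where mate comes too late or not at all, and all names are hypothetical.

```python
import math

MATE = "mate"      # White's last move delivered mate
ESCAPE = "escape"  # mate is too slow or absent in this line (treated as infinite)


def moves_to_mate(node, white_to_move):
    """Number of white moves needed to force mate from this node."""
    if node == MATE:
        return 0
    if node == ESCAPE:
        return math.inf
    if white_to_move:
        # White chooses the quickest mate.
        return min(1 + moves_to_mate(child, False) for child in node.values())
    # Black follows Delay-the-Mate and chooses the longest line.
    return max(moves_to_mate(child, True) for child in node.values())


def solving_moves(white_node, limit):
    """White moves at this node that still force mate within `limit` moves."""
    return [move for move, child in white_node.items()
            if 1 + moves_to_mate(child, False) <= limit]


# Partial tree of the Hultberg problem, built only from the lines given in the text.
after_3_Kg8 = {
    "3...Rh8+": {"4.gxh8=Q": MATE},
    "3...Ra6": {"4.Sb3": MATE},
    "3...Rb6": {"4.Qa2": MATE},
}
after_2_Ra3 = {
    "3.h6": {"3...R~file": {"4.Sb3": MATE}, "3...R~rank": {"4.Qa2": MATE}},
    "3.g8=Q": {"3...Ra7+": ESCAPE},
}
after_2_Kh7 = {"2...Rh6+": {"3.Kg8": after_3_Kg8}, "2...Ra3": after_2_Ra3}
root = {
    "1.g7": {"1...Ra6+": {"2.Kh7": after_2_Kh7}},
    "1.Kh7": {"1...Ra7+": ESCAPE},
}

print(solving_moves(root, 4))         # the keys that mate within four moves
print(solving_moves(after_2_Ra3, 2))  # White's choices two moves before the limit
print(moves_to_mate(root, True))      # length of play against the best defence
```

A single entry in the first list means no cook, a single entry at each later white turn means no dual, and the last value shows that Delay-the-Mate cannot stretch the play past the stipulated four moves.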

Directmate problems are typically misconstrued as faulty when their move limits are ignored, as if they were tactics puzzles or studies which don’t require their goals to be achieved within a certain number of moves. Thus alternative white play that forces mate but takes longer to do so is viewed as an equally valid solution, “spoiling” the problem. I call this sort of misunderstanding the fallacy of Any-win-will-do.

Endgame studies and Avoid-Theoretical-Loss strategy

Endgame studies have certain features that make them well suited to use as tactics puzzles: the general goal of “winning” without a move limit, unique white play, and (usually) one main thematic variation. However, studies work as advanced puzzles because they require knowledge of endgame theory. The real goal in a study is: White to play and attain a theoretical won ending position. Therefore Black’s corresponding counter-strategy can be seen as, specifically, how to avoid theoretical loss positions. For example, the K+Q vs K+R endgame is a difficult one but it’s known to be won for the queen side. Suppose in a study, Black has two candidate moves, one of which leads to a white K+Q vs black K+R position, and the other results in a quicker loss but requires White to play some interesting unique moves. The main variation will utilise the second candidate move. Not only does the latter option involve Small-Target strategy, but the artistic aims of studies favour original play over the technicality of winning a position known to theory. The first candidate move, no matter how long it could drag things out, is relegated to a side-variation.

Ernest Pogosyants
Shakhmatnaya Moskva 1961, 1st Prize

White to play and win

Although White is a piece ahead in this position, Black is on the verge of queening the g-pawn and thus forcing a draw. 1.Sxg3? Kxg3 2.Bb7 only draws as Black could sacrifice the bishop for the remaining pawn. White starts with 1.Bf1!, threatening 2.Bg2 to neutralise the black pawn, after which White only needs to guide the f-pawn forward to win. Black therefore has to try something drastic. 1…Bb5 2.Bg2; not 2.Bxb5? g2 and Black promotes. 2…Bf1 – a neat sacrifice because after 3.Bxf1 g2, Black’s double-threat of …g1=Q/gxf1=Q seems impossible to stop except by 4.Bxg2, and that delivers stalemate! However, White has a remarkable resource, 4.Sg3, which not only defends the bishop but answers 4…g1=Q with 5.Sf5 mate (4…Kxg3 5.Bxg2 wins).

This study finishes with White mating after just five moves, but obviously it’s not a mate-in-5 problem. Why doesn’t Black extend the play at several points, such as on the last move with 4…gxf1=Q forcing 5.Sxf1? The reason, as discussed, is that these alternatives would create theoretical loss positions for Black, a sort of “giving up.” It’s ironic that Black allowing White to mate in the main line is considered more of a fight, but the point of the thematic defence 4…g1=Q is that it’s the culmination of Black’s sacrificial plan – a safe queen promotion that should have pulled off a draw, if not for the dramatic mating response. (Of course, most studies don’t finish with a mate but with White reaching a theoretical won position, one in which the solution must end because unique white play no longer exists.)

Endgame studies are sometimes wrongly thought to rely on “weak” black moves for their thematic variations to work. “Stronger” alternatives are thought to refute the intended solutions by prolonging the length of play, just as in a directmate problem, a black defence that holds off mate further than the intended solution would render the position unsound. This belief overlooks that studies (unlike directmate problems) don’t impose a move limit, and thus such black alternatives fail to thwart White’s goal; they merely constitute side-variations. This kind of misconception may be called the fallacy of Delaying-loss-is-good-enough.

Why Stockfish and tablebases aren’t Gods

Chess engines are extremely capable at solving puzzles and compositions, and serve as a great tool for testing their soundness. However, because engines are designed for playing the game rather than solving positions with varying conventions, the solutions they provide should be interpreted with care. The issues are similar when dealing with studies and puzzles, but directmate problems are, well, problematic in their own way.

Suppose we test a Win study with Stockfish and it finds no cooks or black refutations that could salvage a draw. In such a won position, the engine will display what it considers the top variation, based on game logic where the losing side drags out the play – what I’d call Delay-the-Loss strategy. While this strategy is akin to Delay-the-Mate, the engine has no inkling of the Small-Target and Avoid-Theoretical-Loss strategies, because they are relevant to puzzles and studies, not to the practical game. This is why engines are hopeless at detecting the main variation of a given position. To have solved a study, an engine would have analysed thousands of variations, buried among which is the main line. But armed with the Delay-the-Loss strategy, the software is likely to pick out an uninteresting side-variation for your consumption. No doubt this engine behaviour, favouring the longest line, plays a big role in fostering the Delaying-loss-is-good-enough fallacy.

Unlike studies, directmate problems – especially the shorter mate-in-2 or mate-in-3 ones – usually contain multiple variations of equal importance. These thematic variations are of the same length, chosen from all the full-length variations that arise from the Delay-the-Mate strategy. Since this strategy is just a form of Delay-the-Loss which Stockfish likes, the software would automatically display one of the full-length variations. Engines of course don’t understand which lines are thematically interesting, so it’s hit-or-miss what they choose to show. Longer problems sometimes have just one full-length variation, in which case it can’t be missed. But in shorter problems, even if a thematic line gets picked by chance, that would still present a misleading picture because the point of such compositions is the relationship between several variations.

Tablebases are in the same boat as engines when handling composed positions. The software has no trouble finding correct white moves or black refutations if they exist. But in a sound study where all black moves lose, ordering them according to a tablebase metric number (e.g. depth-to-mate) won’t tell us which of these moves belong to a main variation. In directmate problems, the depth-to-mate numbers do reveal which black moves produce full-length variations, but when there is more than one, deciding which moves are thematic is again a judgement call for humans.
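
As a rough illustration of what depth-to-mate numbers can and cannot tell us in a directmate, the Python sketch below filters black defences down to the full-length ones. The defence labels and depth values are invented placeholders, not tablebase output, and the function name is hypothetical.

```python
def full_length_defences(depth_by_defence):
    """Return the black defences that postpone mate the longest (Delay-the-Mate)."""
    longest = max(depth_by_defence.values())
    return sorted(d for d, depth in depth_by_defence.items() if depth == longest)


# Invented depth-to-mate values: white moves still needed after each black defence.
depths = {"defence a": 3, "defence b": 3, "defence c": 2, "defence d": 1}

print(full_length_defences(depths))
```

The filter narrows the field to the full-length variations, but when it returns more than one defence it says nothing about which of them is thematic.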

To properly test the correctness of a puzzle or a composition with an engine or tablebases, a single analysis of the initial position is thus not really sufficient. You also need to manually enter each black defence of the main variation(s), and check the software’s white response. At every such step – along with the starting position – if more than one white move accomplishes the relevant goal, or if White fails to achieve the goal at all, the position is unsound. Such a claim, though, is legitimate only when the various goals and conventions of puzzles and compositions are discerned.
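
The step-by-step procedure might be sketched as follows in Python. The `goal_moves` table is a hypothetical stand-in for what an engine or tablebase would report at each position (the white moves that achieve the goal); here it is simply filled in on the assumption that the Pogosyants study is sound, and the "cooked" variant is a pretend counterexample, not a real finding.

```python
def check_main_line(main_line, goal_moves):
    """Walk the main line and check that each white move is the only one that works.

    main_line  -- alternating white and black moves, starting with White
    goal_moves -- maps the moves played so far (a tuple) to the list of white
                  moves that achieve the goal in that position
    """
    played = ()
    for index, move in enumerate(main_line):
        if index % 2 == 0:  # White to move
            candidates = goal_moves[played]
            if not candidates:
                return f"unsound: no white move achieves the goal after {played}"
            if len(candidates) > 1:
                return f"unsound: alternatives {candidates} after {played}"
            if candidates != [move]:
                return f"unsound: intended move {move} fails after {played}"
        played += (move,)
    return "sound along the main line"


# Main line of the Pogosyants study, as given in the text.
line = ["1.Bf1", "1...Bb5", "2.Bg2", "2...Bf1", "3.Bxf1",
        "3...g2", "4.Sg3", "4...g1=Q", "5.Sf5#"]

# Assumed analysis: at every white turn only the intended move reaches the goal.
analysis = {tuple(line[:i]): [line[i]] for i in range(0, len(line), 2)}

# Pretend counterexample: imagine a second winning first move had been found.
cooked = dict(analysis)
cooked[()] = ["1.Bf1", "some other move"]

print(check_main_line(line, analysis))
print(check_main_line(line, cooked))
```

An empty list at any white turn corresponds to a black refutation, and a list with more than one move corresponds to a cook or dual. Side-variations would need the same walk, and the definition of "achieves the goal" has to match the type of position being tested.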