Mathias Weller
April 21, 2008
Abstract
In this work, we give an overview of the research done so far on Sudoku
grid enumeration and on solving and generating Sudoku puzzles. We examine
possible extensions and generalizations of previous work on solving and
generating Sudoku puzzles, focusing mainly on rule-based solvers. A possible
way to influence the difficulty of a generated Sudoku puzzle is described,
and we introduce new deduction rules for solving a puzzle based on the rules
described by David Eppstein in his paper "Nonrepetitive Paths and Cycles in
Graphs with Application to Sudoku". We then generalize these new rules
further, leading to an efficient constraint propagation algorithm that is
able to solve puzzles that could not be solved by applying only Eppstein's
deduction rules. The implementation of this strategy and how it may be used
to implement the special cases is explained, followed by a practical
evaluation of the solving power of all presented solvers.
Contents
1 Introduction
1.1 The Sudoku Game
1.2 Sudoku Variants
2 Prior Work
2.1 Counting Sudoku Grids
2.2 Complexity
2.2.1 Short Introduction to NP-completeness
2.2.2 Sudoku Decision Problem
2.2.3 Complexity of the Sudoku Decision Problem
2.3 Generating Sudoku Puzzles
2.3.1 Incremental Generation
2.3.2 Decremental Generation
2.4 Judging the Difficulty of Generated Sudoku Puzzles
2.5 Finding Solutions to Sudoku Puzzles
2.5.1 Solving Sudoku Puzzles via Backtracking
2.5.2 Solving Sudoku Puzzles via Constraint Programming
2.5.3 Solving Sudoku Puzzles via Logic Deduction
2.6 Graph Coloring
3 Generalization and Contribution
3.1 Counting Sudoku
3.2 Generating Sudoku Puzzles
3.2.1 Finding a Full Sudoku Grid
3.2.2 Deletion Witnesses
3.3 Judging the Difficulty of Generated Sudoku Puzzles
3.4 Finding Solutions to Sudoku Puzzles
3.4.1 Extension of Bilocation and Bivalue
3.4.2 Group-Modified Rules
3.4.3 Limited Constraint Propagation
4 Experimental Results
5 Outlook and Future Work
1 Introduction
Sudoku (or Number Place, as it is called in the US) is a well-known logic
puzzle popular for its appearance in newspapers and magazines. Its popularity
is expressed in examples like a Boston Japanese restaurant that hands out
$10 gift certificates to patrons who can finish a Sudoku puzzle before their
sushi is served. There are official tournaments in Europe and the US with the
possibility to win monetary prizes [Sem05]. Solving a Sudoku puzzle is usually
very satisfying for the puzzler, or, to quote Henry Dudeney, "A good puzzle,
like virtue, is its own reward" [Dud02]. Sudoku is a derivative of Latin
Square, a puzzle first described by Leonhard Euler in 1783. The Sudoku puzzle
was first created for Dell Magazines by Howard Garnes, an architect from
Indianapolis, and introduced to the US public in 1979. It was not until seven
years later that Sudoku was successful in Japan, where it was first published
by the Nikoli company under its current name, which is Japanese for "single
number". At the beginning of the 21st century, the puzzle spread all over the
world. This international success partially relies on using numbers instead
of letters or words [Sem05]. When solving Sudoku puzzles, one naturally
stumbles upon a variety of questions: Does my puzzle have a solution? If so,
is it the only one for my puzzle? If not, how many solutions are there and is
there a systematic way of determining all solutions? Does the puzzle become
harder if there are fewer hints? What is the minimum number of hints needed
to assure a unique solution? In this article, we will consider some of these
and other questions.
1.1 The Sudoku Game
Sudoku is a puzzle game played on a grid that consists of 9 × 9 cells, each
belonging to three groups: one of nine rows, one of nine columns and one of
nine blocks (sometimes called boxes or subsquares). Three blocks in a row are
called a band, three vertically stacked blocks are called a stack, and a chute
is either a band or a stack (see Figure 1). A Sudoku grid is full if each
group contains the numerals from 1 to 9 exactly once. Figure 1 shows a full
Sudoku grid. A Sudoku puzzle is a Sudoku grid that is partially filled,
meaning that a set of fixed cells (cells whose numerals are given, i.e. that
cannot be chosen by the solver), also called hints or clues, is provided by a
puzzle master, whereas the other cells are blank. Figure 2 shows a possible
Sudoku puzzle for the grid in Figure 1. The objective of the puzzle game is to
fill the Sudoku grid by assigning a numeral to each blank cell in such a way
that each numeral is unique in each of its three groups. A solution to a
Sudoku puzzle is a full Sudoku grid that is consistent with the puzzle,
meaning that all hints of the puzzle appear in the full grid as well. Figure 1
is a solution to the puzzle in Figure 2. A Sudoku grid is called proper or
unisolvent if it has only one solution, ambiguous if it has more than one
solution and invalid if it has no solution at all, due to contradicting hints.
Most daily newspaper Sudoku puzzles provide about 28 to 30 clues, but for the
difficulty of the puzzle, the number of hints matters less than the
complexity of the logical leaps required to assign numerals to the blank cells
1 3 8 2 7 6 5 4 9
2 7 5 4 1 9 8 6 3
6 4 9 5 8 3 1 7 2
5 8 3 1 6 7 2 9 4
9 1 6 3 4 2 7 5 8
7 2 4 9 5 8 3 1 6
3 5 2 7 9 4 6 8 1
4 6 1 8 3 5 9 2 7
8 9 7 6 2 1 4 3 5
Figure 1: A full Sudoku grid. (In the original figure, a second copy of the
grid on the right marks the first band and the second stack.)
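1 . . . 7 . . . 9
2 . 5 . . . . 6 .
. . . . . 3 1 . .
5 8 . 1 6 . . . .
. . 6 . . . 7 . .
. . . . 5 8 . 1 6
. . 2 7 . . . . .
. 6 . . . . 9 . 7
8 . . . 2 . . . 5
Figure 2: A proper Sudoku puzzle (blank cells are shown as dots).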
[Sem05]. As opposed to the difficulty, the number of clues plays a crucial
role in determining the properness of a puzzle. So far, no proper 9 × 9 Sudoku
puzzle with fewer than 17 hints is known, whereas there are several proper
puzzles with exactly 17 hints. The minimal number of hints necessary for an
n × n puzzle to have a unique solution is yet unknown [HM07], although a lower
bound of n − 1 is easy to prove: if a puzzle had only n − 2 hints, there would
be at least two numerals that are not specified by the puzzle. These two
numerals may be exchanged throughout the solution to the puzzle in order to
obtain another solution for it, and thus the puzzle cannot be proper (see also
Section 2.6).
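The definitions of a full grid and of consistency, together with the numeral-swap argument for the lower bound, can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the names `groups`, `is_full_grid`, `is_consistent` and `swap_numerals` are hypothetical helpers introduced here, blank cells are encoded as 0, and the grid is the one from Figure 1.

```python
from typing import List

Grid = List[List[int]]

# The full grid shown in Figure 1.
FIGURE_1: Grid = [
    [1, 3, 8, 2, 7, 6, 5, 4, 9],
    [2, 7, 5, 4, 1, 9, 8, 6, 3],
    [6, 4, 9, 5, 8, 3, 1, 7, 2],
    [5, 8, 3, 1, 6, 7, 2, 9, 4],
    [9, 1, 6, 3, 4, 2, 7, 5, 8],
    [7, 2, 4, 9, 5, 8, 3, 1, 6],
    [3, 5, 2, 7, 9, 4, 6, 8, 1],
    [4, 6, 1, 8, 3, 5, 9, 2, 7],
    [8, 9, 7, 6, 2, 1, 4, 3, 5],
]


def groups(grid: Grid, m: int = 3) -> List[List[int]]:
    """Return the numerals of every row, column and block (order m)."""
    n = m * m
    rows = [list(row) for row in grid]
    cols = [[grid[r][c] for r in range(n)] for c in range(n)]
    blocks = [
        [grid[br * m + r][bc * m + c] for r in range(m) for c in range(m)]
        for br in range(m)
        for bc in range(m)
    ]
    return rows + cols + blocks


def is_full_grid(grid: Grid, m: int = 3) -> bool:
    """A grid is full if each group contains 1..n exactly once."""
    n = m * m
    return all(sorted(g) == list(range(1, n + 1)) for g in groups(grid, m))


def is_consistent(puzzle: Grid, grid: Grid) -> bool:
    """True if every hint (non-zero cell) of the puzzle appears in the grid."""
    return all(
        p == 0 or p == g
        for prow, grow in zip(puzzle, grid)
        for p, g in zip(prow, grow)
    )


def swap_numerals(grid: Grid, a: int, b: int) -> Grid:
    """Exchange the numerals a and b throughout the grid."""
    swap = {a: b, b: a}
    return [[swap.get(v, v) for v in row] for row in grid]


# A puzzle whose hints never mention the numerals 3 and 4.
puzzle = [[0 if v in (3, 4) else v for v in row] for row in FIGURE_1]
other = swap_numerals(FIGURE_1, 3, 4)
print(is_full_grid(other), is_consistent(puzzle, other), other != FIGURE_1)
```

Because the puzzle above has two distinct solutions, it is ambiguous, which is exactly the situation the lower-bound argument describes.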
In order to do complexity analysis for solving Sudoku puzzles, we parametrize
them. For this, the size of each group, meaning the number of different
numerals (n = 9), the order of the Sudoku grid, meaning the length of a side
of the blocks (m = √n = 3), or the number of cells in the Sudoku grid
(h = n² = 81) may be used. Since they are all polynomial in one another, an
algorithm that is efficient with regard to any of these parameters is
efficient with regard to all of them; hence efficiency is invariant under the
choice of parameter. For reasons of compatibility with other papers, n will
refer to the number of different numerals in the remainder of the article, if
not explicitly stated otherwise. Though it is common to use numerals to fill
the Sudoku grid, letters, pictures or any kind of items divisible into at
least n disjoint equivalence classes are suitable as well.
Sudoku is closely related to the Latin Square problem: given an n × n square
of cells and a set of fixed cells, find a completely filled n × n grid that
is a superset of the fixed cells such that each item is unique in its column
and row while still using only n different types of items. Figure 3 shows an
example of a Latin Square puzzle and its solution.
6 . 4 . . 2 1 . .
. 3 . 9 4 . . . 8
. 4 . 6 . . 3 7 .
. 5 . 2 8 . 9 . .
8 . . 1 . 3 . . 2
. . 1 . . 7 . 3 .
. 2 8 . . 6 . 1 7
4 . 9 5 . 1 . 8 .
. . . . . . . . 3

6 8 4 3 7 2 1 5 9
1 3 6 9 4 5 7 2 8
9 4 2 6 1 8 3 7 5
7 5 3 2 8 4 9 6 1
8 6 7 1 5 3 4 9 2
5 9 1 8 2 7 6 3 4
3 2 8 4 9 6 5 1 7
4 7 9 5 3 1 2 8 6
2 1 5 7 6 9 8 4 3
Figure 3: A 9 × 9 Latin Square puzzle (top, blank cells shown as dots) and
its solution (bottom).
1 3 7 4
6 2 3
7 8
4 3 7
2 7 4
7 8
3 8 1
8 5 3 6
Figure 4: An 8 × 8 Sudoku puzzle with 2 × 4 groups.
Figure 5: The 12 pentomino groups.
1.2 Sudoku Variants
Although this article is mainly about plain Sudoku as described above, this
short introduction to Sudoku variants may be of interest to the reader.
As already mentioned, a Sudoku grid may have any dimension n, although
9 × 9 puzzles are by far the most common. There are also puzzles that do not
have the same number of stacks and bands. An example is the 8 × 8 Sudoku
puzzle shown in Figure 4. The groups may be irregular, which allows for 5 × 5
grids with pentomino groups (groups of irregular shape that contain exactly
five cells, see Figure 5). This type of puzzle is also known as Logi-5. Apart
from geometric differences, there are several Sudoku variants that impose new
rules or modify the existing rules of the puzzle: The Sudoku X variant
requires the numerals in the cells on the diagonals to be unique for each
diagonal (see Figure 6) [Mon05]. The Hypersudoku variant, also called
Windocu, consists of a normal Sudoku grid that is supplemented with
additional regions that have to contain each numeral exactly once. These
regions overlap the blocks, thereby giving additional information (see
Figure 6). The Samurai Sudoku variant consists of five 9 × 9 Sudoku puzzles
arranged in a quincunx (a formation of five entities similar to a cross; for
example, the five dots on a side of a die form a quincunx) such that the grid
in the
5 4 6 1 2 7 8 3 9
8 9 7 4 5 3 2 1 6
2 3 1 9 6 8 7 5 4
1 7 8 6 4 5 3 9 2
6 2 9 7 3 1 5 4 8
3 5 4 8 9 2 6 7 1
7 1 2 5 8 9 4 6 3
9 6 3 2 7 4 1 8 5
4 8 5 3 1 6 9 2 7
Figure 6: Left: A 9 × 9 Sudoku X grid. Right: A 9 × 9 Hypersudoku grid.
middle is overlapped by the other four grids at its four corners, such that
the middle grid shares one block with each of the other grids while the outer
grids are disjoint (see Figure 7) [Tel06]. The circular Sudoku variant employs
a circular formation of cells that is divided into segments and rings. Each
cell has to be assigned a numeral such that each ring and each pair of
neighboring sectors contain each numeral exactly once (see Figure 8) [PMH06].
A variant combining the idea of the Rubik's Cube with Sudoku puzzles is the
Sudokucube, a 3 × 3 × 3 cube that can be solved by turning planes of subcubes
in such a way that each side becomes a valid Sudoku grid. Hence the cube
contains the numbers 1 to 9 exactly 6 times each. Variants that use letters
instead of numerals may enforce the formation of a valid word at some place in
the grid. For almost every Sudoku variant, there is another variant with the
nonconsecutive property, meaning that no two neighboring cells may be assigned
consecutive numerals. Other variants may modify the way in which hints are
given; for example, the 2005 U.S. Puzzle Championship featured a puzzle that
contained ranges of numerals as hints.
2 Prior Work
In this chapter, publications about Sudoku puzzles are introduced. We start
by considering the problem of counting possible Sudoku grids in Section 2.1.
A general complexity consideration in Section 2.2 leads into the topics of
generating (Section 2.3) and solving (Section 2.5) Sudoku puzzles. In
Section 2.4, the problem of judging the difficulty of a generated puzzle is
addressed. We show parallels to the Graph-n-Coloring problem in Section 2.6.
Chapter 3 introduces thoughts and ideas developed from the approaches of
Chapter 2, and finally, results of applying some of these ideas are given in
Chapter 4.
9 3 7 1 8 6 4 2 5 1 6 4 3 9 5 2 7 8
1 2 4 5 7 9 3 8 6 8 2 3 1 7 4 5 6 9
6 5 8 2 3 4 7 9 1 9 5 7 2 8 6 4 1 3
2 6 1 4 5 7 9 3 8 4 8 5 7 6 9 1 3 2
8 7 5 3 9 2 1 6 4 7 9 2 5 3 1 6 8 4
4 9 3 8 6 1 5 7 2 3 1 6 8 4 2 7 9 5
5 8 2 7 1 3 6 4 9 7 1 5 2 3 8 4 1 7 9 5 6
3 1 6 9 4 8 2 5 7 8 3 9 6 4 1 9 5 3 8 2 7
7 4 9 6 2 5 8 1 3 2 6 4 5 7 9 6 2 8 3 4 1
3 6 4 1 9 2 8 5 7
1 7 8 5 4 6 9 2 3
5 9 2 3 8 7 4 1 6
9 2 7 3 5 6 4 8 1 9 5 3 7 6 2 5 1 3 4 9 8
5 3 8 4 7 1 9 2 6 4 7 1 3 8 5 4 6 9 2 1 7
4 6 1 9 2 8 7 3 5 6 2 8 1 9 4 8 7 2 6 3 5
6 7 3 5 8 9 1 4 2 5 3 7 2 9 4 1 8 6
2 8 4 1 6 3 5 7 9 4 1 6 3 8 7 9 5 2
1 5 9 7 4 2 8 6 3 8 2 9 6 5 1 7 4 3
7 1 5 2 3 4 6 9 8 6 4 3 1 2 8 5 7 9
3 9 6 8 1 7 2 5 4 9 5 8 7 4 6 3 2 1
8 4 2 6 9 5 3 1 7 2 7 1 9 3 5 8 6 4
Figure 7: A 9 × 9 Samurai Sudoku grid.
Figure 8: A circular Sudoku puzzle with n = 8.
2.1 Counting Sudoku Grids
This section is a short summary of what has been done so far on determining
the number of full Sudoku grids of specific dimensions. For a more detailed
view, please refer to the literature given in the section.
First of all, we are interested in the number of different full Sudoku grids
of a certain order. To calculate this number, we first need a definition of
difference regarding Sudoku grids. Therefore, an equality relation is to be
provided that relates equal Sudoku grids; two grids that are not related are
considered different. For the following lemmas, two Sudoku grids are
considered equal if every cell of a grid contains the same numeral as the cell
at the same position in the other grid. This equality relation will be
referred to as E.
Lemma 2.1 ([HM07]) There are $N_{4 \times 4} = 288$ valid full 4 × 4 Sudoku
grids.
Lemma 2.2 ([FJ06]) There are
$N_{9 \times 9} = 6{,}670{,}903{,}752{,}021{,}072{,}936{,}960$
valid full 9 × 9 Sudoku grids.
Remark The lemmas were proved using a combination of symmetry considerations
and brute force calculation, which has not yet allowed for the calculation of
the exact number of valid 16 × 16 Sudoku grids, so this is an open problem.
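For the smallest case, the count in Lemma 2.1 can be reproduced by plain exhaustive search. The following Python sketch is not the method used in the cited work (which relies on symmetry considerations); it is a naive enumeration, with the hypothetical helper name `count_full_grids`, that is only practical for order m = 2.

```python
def count_full_grids(m: int = 2) -> int:
    """Count all full Sudoku grids of order m by filling cells row by row."""
    n = m * m
    grid = [[0] * n for _ in range(n)]

    def allowed(r: int, c: int, v: int) -> bool:
        # v must not already occur in the row, the column or the block.
        if any(grid[r][j] == v for j in range(n)):
            return False
        if any(grid[i][c] == v for i in range(n)):
            return False
        br, bc = r - r % m, c - c % m
        return all(
            grid[br + i][bc + j] != v for i in range(m) for j in range(m)
        )

    def fill(pos: int) -> int:
        if pos == n * n:
            return 1  # every cell is assigned: one more full grid
        r, c = divmod(pos, n)
        total = 0
        for v in range(1, n + 1):
            if allowed(r, c, v):
                grid[r][c] = v
                total += fill(pos + 1)
                grid[r][c] = 0
        return total

    return fill(0)


print(count_full_grids(2))  # 288, matching Lemma 2.1
```

The same routine is hopeless for 9 × 9 grids, where the count in Lemma 2.2 has 22 digits; this is why symmetry arguments are needed there.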
This result may satisfy for the time being, but the fact that in order to
calculate these numbers a Sudoku grid and the version of the grid that is
simply rotated by 90° […] := $E_{T1}$ is being referred to when speaking of
essentially different Sudoku grids. For irregular puzzle sizes, the
transposition is not validity preserving. The following transformations are
referred to when speaking of essentially different irregular Sudoku grids:
Permuting numerals
Permuting rows in the same band
Permuting bands
Permuting columns in the same stack
Permuting stacks
Lemma 2.3 ([HM07]) There are 2 essentially different 4 × 4 Sudoku grids
(see Figure 9).
1 2 3 4     1 2 3 4
3 4 1 2     3 4 2 1
2 1 4 3     2 1 4 3
4 3 2 1     4 3 1 2
Figure 9: Representatives of the only two equivalence classes of 4 × 4 Sudoku
grids with respect to essentially different Sudoku grids.
Lemma 2.4 ([RJ06a]) There are 5,472,730,538 essentially different 9 × 9
Sudoku grids.
Other Sudoku variants were analyzed as well. Applying the transformations
listed in Definition 2.1 to different grid sizes results in different numbers
of full grids. An overview of these results is given in Table 1.

Grid type   Block type   Number of essentially different Sudoku grids
4 × 4       2 × 2        2 (see Lemma 2.3)
6 × 6       2 × 3        49 [RJ06b]
8 × 8       2 × 4        1,673,187 [Rus06]
10 × 10     2 × 5        4,743,933,602,050,718 [Pet06]
9 × 9       3 × 3        5,472,730,538 (see Lemma 2.4)

Table 1: Number of essentially different Sudoku grids for different puzzle
sizes. Note that different transformations apply for irregular Sudoku puzzle
sizes.
2.2 Complexity
From the point of view of a student of theoretical computer science, a very
important consideration is the complexity analysis of a problem. In this section,
we will discuss the decision variant of the Sudoku problem. This will be
defined in Section 2.2.2, after a short introduction to NP-completeness.
2.2.1 Short Introduction to NP-completeness
This section provides a brief overview of the topic of NP-completeness.
First of all, it is important to know some terms: In computer science, an
algorithm is called deterministic if each step is determined only by prior
steps and the input data. A deterministic algorithm is called efficient if
its running time is bounded by a polynomial in the size of the input data.
The set of problems that are solvable efficiently is denoted by P, while NP
denotes the set of problems whose solutions are efficiently verifiable. Let A
and B be problems in NP; then a function f is called a reduction from A to B
if for any input d, d ∈ A ⇔ f(d) ∈ B and the computation of f(d) is
deterministic and efficient. So the question whether d ∈ A can be answered by
applying the reduction f to d and testing whether f(d) ∈ B. If such a
function exists for two problems A and B, A is called reducible to B. Note
that if f(d) ∈ B can be determined efficiently, so can d ∈ A. Also note that
the binary reducible-relation is transitive, meaning that if A is reducible
to B and B is reducible to C, A is also reducible to C. A problem Q in NP is
called NP-complete if all problems in NP can be reduced to it. Hence, if Q
were solvable efficiently, all problems in NP would be. For example, the SAT
problem, which is to tell whether a given Boolean formula in conjunctive
normal form has a satisfying assignment, in other words, whether the formula
can evaluate to true, is NP-complete. It is yet unknown whether efficiently
finding solutions to the problems in NP is possible. This is called the
P vs. NP problem.
2.2.2 Sudoku Decision Problem
We refer to the Sudoku problem as the problem of finding a solution to a
given Sudoku puzzle. Much like SAT, where the decision problem is to find out
whether a satisfying assignment of all variables of a given formula exists,
the decision problem for Sudoku is to find out whether a solution to a given
Sudoku puzzle exists. Note that it does not matter whether the puzzle is
ambiguous or not; the uniqueness of a solution is not of interest. The
decision variant of the Latin Square problem is defined analogously.
2.2.3 Complexity of the Sudoku Decision Problem
The Sudoku decision problem is in NP. Obviously, the size of an n × n Sudoku
grid is polynomial in n and thus a given solution to the grid can be verified
efficiently. It has been shown that the decision problem of Sudoku is
NP-complete by reducing Latin Square, which is known to be NP-complete, to
Sudoku [YS03]. In the following, a sketch of the proof is presented: To solve
an n × n Latin Square, we construct a k × k Sudoku grid with k = n² as
follows: let S(i, j) denote the numeral in the cell of the Sudoku grid whose
column is i and whose row is j and let L(r, s) denote the cell of the Latin
Square whose column is r and whose row is s. The Sudoku grid is then
constructed respecting the equation
\[
S(i, j) =
\begin{cases}
r\left(L\left(i - 1, \frac{j - 1}{n}\right)\right), & \text{if } \left\lfloor \frac{i - 1}{n} \right\rfloor = (j - 1) \bmod n = 0, \\
tr_n((i, j)), & \text{otherwise,}
\end{cases}
\]
with r being a bijection that maps the n numerals of the Latin Square to n of
the k numerals of the Sudoku:
\[
r(x) = (x - 1) \cdot n + 1,
\]
and
\[
tr_n((i, j)) = \left( ((i - 1) \bmod n) \cdot n + \left\lfloor \frac{i - 1}{n} \right\rfloor + j - 1 \right) \bmod n^2 + 1.
\]
This construction enforces the assignment of all numerals d with
\[
\exists x \in \{0, \ldots, n\} : d = r(x)
\]
to the cells with
\[
\left\lfloor \frac{i - 1}{n} \right\rfloor = (j - 1) \bmod n = 0
\]
but does not enforce any ordering on them other than the Latin Square rules
for the resulting grid to comply with the Sudoku rules. Figure 10 shows an
example for the reduction of a 3 × 3 Latin Square: together, the gray cells
make up a solution to the given Latin Square. The numerals 1, 4 and 7 in the
gray cells of the Sudoku grid are translated to the numerals 1, 2 and 3 in the
Latin Square.
2
2
1
4 2 5 8 3 6 9
2 5 8 3 6 9 4 7 1
3 6 9 4 7 1 5 8 2
4 5 8 2 6 9 3
5 8 2 6 9 3 7 1 4
6 9 3 7 1 4 8 2 5
1 8 2 5 9 3 6
8 2 5 9 3 6 1 4 7
9 3 6 1 4 7 2 5 8
Figure 10: An example for the reduction from Latin Square to Sudoku.
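The construction can be tried out with a short Python sketch. It assumes the reading of the formulas in which ⌊(i − 1)/n⌋ is an integer floor, columns i and rows j are numbered from 1, and a blank Latin Square cell is encoded as 0; the function names and the `latin[row][column]` layout are choices made for this illustration, not part of the original proof.

```python
from typing import List


def r(x: int, n: int) -> int:
    """Map Latin Square numeral x in 1..n to the Sudoku numeral (x-1)*n + 1."""
    return (x - 1) * n + 1


def tr(i: int, j: int, n: int) -> int:
    """Fixed numeral for column i, row j (both 1-based) outside the gray cells."""
    return (((i - 1) % n) * n + (i - 1) // n + j - 1) % (n * n) + 1


def is_latin_cell(i: int, j: int, n: int) -> bool:
    """Gray cells: the first n columns of every n-th row."""
    return (i - 1) // n == 0 and (j - 1) % n == 0


def reduce_latin_to_sudoku(latin: List[List[int]], n: int) -> List[List[int]]:
    """Build the n^2 x n^2 Sudoku puzzle; latin[s][c] is row s, column c (0-based)."""
    k = n * n
    sudoku = [[0] * k for _ in range(k)]
    for j in range(1, k + 1):
        for i in range(1, k + 1):
            if is_latin_cell(i, j, n):
                value = latin[(j - 1) // n][i - 1]
                sudoku[j - 1][i - 1] = r(value, n) if value else 0
            else:
                sudoku[j - 1][i - 1] = tr(i, j, n)
    return sudoku


# An empty 3 x 3 Latin Square yields a puzzle whose gray cells are blank.
for row in reduce_latin_to_sudoku([[0] * 3 for _ in range(3)], 3):
    print(row)
```

For n = 3 the second row of the output is 2 5 8 3 6 9 4 7 1, which agrees with the second row of Figure 10.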
2.3 Generating Sudoku Puzzles
Generating a Sudoku puzzle is the task of choosing a subset of cells of the
Sudoku grid to contain hints that enable the solver to calculate a solution
for the puzzle. To be satisfactory for human solvers, the solution implied by
the hints should be unique, so it is desirable to generate proper puzzles.
Basically, there are two different methods to create a proper Sudoku puzzle.
Incremental generation assigns numerals to one cell after another until
sufficient hints are given for the puzzle to have a unique solution.
Decremental generation removes numerals from the cells of a full Sudoku grid
for as long as desired or for as long as the solution stays unique.
2.3.1 Incremental Generation
Several Sudoku programmer forums advise implementing Sudoku generators that
(randomly) pick cells and assign a (random) non-conflicting numeral to them
until an automated solver can solve the puzzle. The disadvantage of this
method is that determining whether a numeral contradicts another in a
partially filled Sudoku grid in general requires a solver. When assigning a
random numeral to a random cell, the puzzle may become invalid, so the
generator must either utilize backtracking to find another cell or numeral,
or discard the whole puzzle and start over when a puzzle becomes invalid.
1 2 3 4 5 6 7 8 9
4 5 6 7 8 9 1 2 3
7 8 9 1 2 3 4 5 6
2 3 4 5 6 7 8 9 1
5 6 7 8 9 1 2 3 4
8 9 1 2 3 4 5 6 7
3 4 5 6 7 8 9 1 2
6 7 8 9 1 2 3 4 5
9 1 2 3 4 5 6 7 8
Figure 11: Trivial Sudoku grid generated by
$S(x, y) = ((\lfloor x/m \rfloor + m \cdot (x \bmod m) + y) \bmod n) + 1$,
where x is the number of the row of the cell starting with 0 and y is the
number of its column starting with 0, n is the number of numerals and
$m = \sqrt{n}$ is the order of the Sudoku grid.
2.3.2 Decremental Generation
To generate a Sudoku puzzle decrementally, we have to create a completely
filled grid first. There are multiple ways to achieve this: for instance, we
could just take an existing Sudoku grid or generate a trivial Sudoku grid by
employing a mathematical formula (see Figure 11). The transformation of an
existing grid using validity-preserving transformations will also yield a new
Sudoku grid. We can also employ an algorithm for incremental generation of
Sudoku puzzles and apply a solver to the result. This last method may seem
intricate but may be of interest for complexity analysis. After generating a
full Sudoku grid, numerals are removed from this grid for as long as the
solution stays unique. Therein lies the problem of indirect generation of
Sudoku puzzles, because determining whether a Sudoku grid is proper is not
trivial and usually requires a solver. If the removal of a numeral causes the
puzzle to no longer be proper, backtracking is used or the puzzle is
discarded.
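As a starting point for decremental generation, the formula from the caption of Figure 11 and one validity-preserving transformation (permuting numerals) can be sketched in Python. The function names `trivial_grid`, `relabel` and `is_full_grid` are hypothetical, and the sketch only produces full grids; the removal of hints, which needs a solver, is not shown.

```python
import random
from typing import List

Grid = List[List[int]]


def trivial_grid(m: int = 3) -> Grid:
    """Full grid S(x, y) = ((x // m + m * (x % m) + y) % n) + 1, 0-based x, y."""
    n = m * m
    return [
        [((x // m + m * (x % m) + y) % n) + 1 for y in range(n)]
        for x in range(n)
    ]


def relabel(grid: Grid, rng: random.Random) -> Grid:
    """Apply a random permutation of the numerals (validity preserving)."""
    n = len(grid)
    numerals = list(range(1, n + 1))
    shuffled = numerals[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(numerals, shuffled))
    return [[mapping[v] for v in row] for row in grid]


def is_full_grid(grid: Grid, m: int = 3) -> bool:
    """Check that every row, column and block contains 1..n exactly once."""
    n = m * m
    want = list(range(1, n + 1))
    rows_ok = all(sorted(row) == want for row in grid)
    cols_ok = all(sorted(col) == want for col in zip(*grid))
    blocks_ok = all(
        sorted(grid[br + r][bc + c] for r in range(m) for c in range(m)) == want
        for br in range(0, n, m)
        for bc in range(0, n, m)
    )
    return rows_ok and cols_ok and blocks_ok


base = trivial_grid(3)
variant = relabel(base, random.Random(0))
print(base[1])  # [4, 5, 6, 7, 8, 9, 1, 2, 3], the second row of Figure 11
print(is_full_grid(base), is_full_grid(variant))
```

Row and column permutations within a band or stack could be added in the same style to obtain less regular-looking grids.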
2.4 Judging the Difficulty of Generated Sudoku Puzzles
With the generation of a Sudoku puzzle comes the task of judging its
difficulty. To the best of our knowledge, all Sudoku puzzle generators
determine the difficulty of a puzzle after its generation, which has the
disadvantage that one cannot choose the difficulty of the puzzle to be
generated. In order to get a puzzle of a desired difficulty, the generator
may have to be run multiple times. Eppstein's generator judges a puzzle by
the logic rules needed to solve it. Each rule is assigned a value, and the
difficulty value of the puzzle equals the maximum difficulty value of all
rules needed to solve it, where the solver only applies a difficult rule if
all simpler rules have been exhausted [Epp05b]. This means that if we were to
generate a Sudoku puzzle of a certain difficulty, we would need an automated
solver.
2.5 Finding Solutions to Sudoku Puzzles
Finding solutions to Sudoku puzzles is easily done by a simple backtracking
algorithm, explained in Section 2.5.1. However, there are two main reasons
why this is not desirable: backtracking in general takes too much time, and
it is not suited to judging the difficulty of a Sudoku puzzle. For the
purpose of simulating a human solver, and thus evaluating the difficulty of a
Sudoku puzzle in the context of human strategies, solving it with a set of
deduction rules is of great interest. For these reasons this article focuses
on (efficient) non-backtracking algorithms for solving Sudoku puzzles and
only briefly introduces other options.
2.5.1 Solving Sudoku Puzzles via Backtracking
To solve a given Sudoku puzzle we can traverse the search tree of all
compatible Sudoku grids, that is, grids that extend the puzzle. This leads to
a trial and error backtracking algorithm:
1. Find an unfixed cell in the grid.
2. Choose a possible numeral for it.
3. With the new fixed cell, solve the grid (recursively).
4. If the choice leads to an invalid grid, track back and try another possible
numeral.
The worst case running time of such an algorithm is in the order of
$n^{n-k}$, with k being the number of fixed cells; hence, if
$n - k \in \omega(1)$, it exceeds polynomial boundaries. It is easy to see
that performing backtracking on a constant part of a Sudoku puzzle is
generally not enough to solve it. However, in practice, the backtracking
algorithm can be modified so that it often takes linear time to solve a given
puzzle: instead of randomly picking a cell to branch from, choose the one with
the smallest number of possible numerals. Although it has a superpolynomial
worst case running time, the backtracking algorithm is capable of solving any
proper Sudoku puzzle and of determining every solution to an ambiguous Sudoku
puzzle.
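The four steps above, together with the heuristic of branching on the cell with the fewest possible numerals, can be sketched as follows. This is a minimal illustration rather than a tuned solver; the names `candidates` and `solve`, the `limit` parameter and the encoding of blank cells as 0 are assumptions made here. The example puzzle is the 4 × 4 grid from Figure 9 with its ambiguous rectangle blanked out.

```python
from typing import List

Grid = List[List[int]]


def candidates(grid: Grid, r: int, c: int, m: int) -> List[int]:
    """Numerals that do not yet occur in the row, column or block of (r, c)."""
    n = m * m
    used = set(grid[r]) | {grid[i][c] for i in range(n)}
    br, bc = r - r % m, c - c % m
    used |= {grid[br + i][bc + j] for i in range(m) for j in range(m)}
    return [v for v in range(1, n + 1) if v not in used]


def solve(puzzle: Grid, m: int = 3, limit: int = 2) -> List[Grid]:
    """Return up to `limit` solutions of the puzzle (0 marks a blank cell)."""
    n = m * m
    grid = [row[:] for row in puzzle]
    solutions: List[Grid] = []

    def search() -> None:
        # Step 1: pick the unfixed cell with the fewest possible numerals.
        best = None
        for r in range(n):
            for c in range(n):
                if grid[r][c] == 0:
                    cand = candidates(grid, r, c, m)
                    if best is None or len(cand) < len(best[2]):
                        best = (r, c, cand)
        if best is None:  # no blank cell left: the grid is a solution
            solutions.append([row[:] for row in grid])
            return
        r, c, cand = best
        for v in cand:  # Steps 2 and 3: choose a numeral, solve recursively
            grid[r][c] = v
            search()
            grid[r][c] = 0  # Step 4: track back
            if len(solutions) >= limit:
                return

    search()
    return solutions


ambiguous = [
    [1, 2, 3, 4],
    [3, 4, 0, 0],
    [2, 1, 4, 3],
    [4, 3, 0, 0],
]
print(len(solve(ambiguous, m=2, limit=10)))  # 2: both grids of Figure 9
```

Calling `solve` with `limit=2` is enough to decide whether a puzzle is invalid (no solution), proper (exactly one) or ambiguous (at least two).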
2.5.2 Solving Sudoku Puzzles via Constraint Programming
Constraint Programming is the problem of finding an assignment to a given set
of variables in a given domain that complies with a given set of constraints.
For example, alphametic puzzles can be solved by Constraint Programming.
A famous alphametic puzzle is shown in Figure 12. Applied to this puzzle,
the Constraint Programming algorithm will come up with an assignment of the
variables respecting the given constraints.
    s e n d
  + m o r e
  ---------
  m o n e y
Figure 12: A popular alphametic puzzle. The objective is to find values for
$s, e, n, d, m, o, r, y \in \{0, \ldots, 9\}$ with $s \neq 0$, $m \neq 0$,
$(1000s + 100e + 10n + d) + (1000m + 100o + 10r + e) = 10000m + 1000o + 100n + 10e + y$,
and no two different variables being assigned the same value.
One may utilize Constraint Programming to solve Sudoku puzzles by
implementing the fundamental rules of Sudoku as constraints over the domain
{1, …, n}: each numeral must be unique in its column, row and block. In
general, Constraint Programming is NP-complete and as suitable for solving
any given Sudoku puzzle as the backtracking algorithm. On the Internet, there
are many examples and tutorials on how to tweak constraint programming for
Sudoku, effectively improving the performance, for example by cutting down
symmetric branches.
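To make the alphametic example concrete, the constraints from the caption of Figure 12 can be checked by exhaustive search in Python. This is deliberately not a constraint propagation engine: it simply enumerates all assignments of distinct digits and tests the constraints, which is feasible here because there are fewer than two million candidates. The function name `solve_send_more_money` is a hypothetical choice for this sketch.

```python
from itertools import permutations
from typing import Dict, List


def solve_send_more_money() -> List[Dict[str, int]]:
    """Return all assignments satisfying send + more = money."""
    solutions = []
    # permutations() guarantees that no two letters share a digit.
    for s, e, n, d, m, o, r, y in permutations(range(10), 8):
        if s == 0 or m == 0:  # leading digits must not be zero
            continue
        send = 1000 * s + 100 * e + 10 * n + d
        more = 1000 * m + 100 * o + 10 * r + e
        money = 10000 * m + 1000 * o + 100 * n + 10 * e + y
        if send + more == money:
            solutions.append(dict(s=s, e=e, n=n, d=d, m=m, o=o, r=r, y=y))
    return solutions


print(solve_send_more_money())
# [{'s': 9, 'e': 5, 'n': 6, 'd': 7, 'm': 1, 'o': 0, 'r': 8, 'y': 2}]
```

The single result corresponds to 9567 + 1085 = 10652. A real constraint solver would prune most of these candidates early instead of testing each one.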
2.5.3 Solving Sudoku Puzzles via Logic Deduction
This method tries to mimic a human solver by applying a set of rules that rule
out possibilities for numerals in certain cells or fix unfixed cells in the
grid, thus simplifying the task of solving the puzzle. As long as each of
these rules can be implemented efficiently, the whole solving process is
efficient as well, because the number of cells is polynomial in n and the
number of possible numerals per cell is at most n. Hence not every given
Sudoku puzzle is solvable by a solver using only logic deduction, unless
P = NP. However, it is an open problem whether there is a ruleset that is
able to solve all proper Sudoku puzzles. A set of deduction rules to solve a
Sudoku puzzle is the following [Epp05a]:
Eliminate
If there is only one numeral left for a cell, assign it to this cell.
Locate
If there is only one cell left for a numeral in a group, assign it to this cell.
Align
Eliminate possibilities for numerals that would leave no choices for another
group. This means that if all cells of a group g that may contain a numeral x
share two of their three groups (g and g′), all possibilities of x in cells
of g′ […].
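The first two rules are simple enough to be sketched directly. The following Python fragment applies Eliminate and Locate repeatedly until neither makes progress; it does not implement Align or any of the other rules, and the name `solve_by_rules` as well as the encoding of blank cells as 0 are assumptions of this illustration, not taken from [Epp05a].

```python
from typing import List, Set, Tuple

Grid = List[List[int]]
Cell = Tuple[int, int]


def solve_by_rules(puzzle: Grid, m: int = 3) -> Grid:
    """Apply Eliminate and Locate until no rule fires; 0 marks a blank cell."""
    n = m * m
    grid = [row[:] for row in puzzle]
    rows: List[List[Cell]] = [[(r, c) for c in range(n)] for r in range(n)]
    cols: List[List[Cell]] = [[(r, c) for r in range(n)] for c in range(n)]
    blocks: List[List[Cell]] = [
        [(br + r, bc + c) for r in range(m) for c in range(m)]
        for br in range(0, n, m)
        for bc in range(0, n, m)
    ]
    groups = rows + cols + blocks

    def candidates(r: int, c: int) -> Set[int]:
        # Numerals not yet used in any of the three groups of cell (r, c).
        used = {grid[i][j] for g in groups if (r, c) in g for (i, j) in g}
        return set(range(1, n + 1)) - used

    progress = True
    while progress:
        progress = False
        # Eliminate: a cell with exactly one remaining numeral gets it.
        for r in range(n):
            for c in range(n):
                if grid[r][c] == 0:
                    cand = candidates(r, c)
                    if len(cand) == 1:
                        grid[r][c] = cand.pop()
                        progress = True
        # Locate: a numeral with exactly one remaining cell in a group goes there.
        for g in groups:
            present = {grid[r][c] for r, c in g}
            for x in range(1, n + 1):
                if x in present:
                    continue
                places = [
                    (r, c) for r, c in g
                    if grid[r][c] == 0 and x in candidates(r, c)
                ]
                if len(places) == 1:
                    r, c = places[0]
                    grid[r][c] = x
                    present.add(x)
                    progress = True
    return grid


easy = [[1, 0, 0, 4], [0, 4, 1, 0], [2, 0, 0, 3], [0, 3, 2, 0]]
for row in solve_by_rules(easy, m=2):
    print(row)
```

If the returned grid still contains zeros, these two rules alone were not sufficient, which is the situation that motivates stronger rules.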
As shown in Figure 9, the only two essentially different 4 × 4 Sudoku grids
are in fact equal under flipping an ambiguous rectangle. Hence, with respect
to E […] 9 × 9 (see Lemma 2.4) is not applicable to the ambiguous rectangle
transformation, because of its dependency on numerals, not just geometric
shapes. Thus, there is at the moment no better way than to look at all
5,472,730,538 equivalence classes (Lemma 2.4) and to check all pairs of
classes for equality by brute force. However, because of the sheer size of
these classes it is overwhelmingly costly to handle them. In the following,
we will estimate how many comparisons it would take to calculate the number
of different Sudoku grids taking the ambiguous rectangle transformation into
account. If the average number of grids in an equivalence class, that is, the
number of full grids (Lemma 2.2) divided by the number of essentially
different grids (Lemma 2.4), is
\[
k \approx \frac{6.6 \times 10^{21}}{5.4 \times 10^{9}} \approx 1.2 \times 10^{12}
\]
and a uniform distribution of grids in each class is assumed, the estimated
number of comparisons is $N_{9 \times 9}/2$, which is approximately
$3.3 \times 10^{21}$. Hence even if we compared a trillion Sudoku grids per
second it would take 104 years to finish the calculation. However, it is still
interesting how many 9 × 9 grids are essentially different with respect to
E […]
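[…] $\mid c, c' \in W \wedge \exists x\, (c \overset{l}{\underset{x}{\sim}} c' \vee c \overset{v}{\underset{x}{\sim}} c')\}$
with the edge-labeling function
\[
\begin{aligned}
&\mathit{label} : E \to \mathcal{P}(\{L, V\} \times \{1, \ldots, n\}) \\
&(L, x) \in \mathit{label}(c, c') \Leftrightarrow c \overset{l}{\underset{x}{\sim}} c' \\
&(V, x) \in \mathit{label}(c, c') \Leftrightarrow c \overset{v}{\underset{x}{\sim}} c'
\end{aligned}
\]
is called Force-Propagation-Graph, or FPG for short. Note that an edge may
have multiple labels. Let (x, y) be a label; then x = type((x, y)) is the type
of the label and y = numeral((x, y)) is the numeral of the label. The function
\[
d : (\{L, V\} \times \{1, \ldots, n\})^2 \to \mathbb{N}
\]
calculates the distance of two labels. It is much like the Hamming distance in
that it specifies how many parts of the labels differ.
\[
d((p, q), (x, y)) =
\begin{cases}
0, & \text{if } p = x \wedge q = y, \\
1, & \text{if } p = x \Leftrightarrow q \neq y, \\
2, & \text{else.}
\end{cases}
\]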
A path in the FPG of length p + 2 is called alternating if for each edge
$e_i$ in the path, there is a label $b_i \in \mathit{label}(e_i)$ such that
\[
\forall i \in \{0, \ldots, p\} : d(b_i, b_{i+1}) = 1.
\]
That means that only one part of the label may differ from edge to edge. An
alternating cycle is defined analogously.
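The label distance and the alternating-path condition can be written down in a few lines of Python. The sketch assumes that d simply counts the differing components of two labels, as the comparison with the Hamming distance suggests; labels are modeled as (type, numeral) tuples and a path is given only by the label sets of its consecutive edges. The names `label_distance` and `is_alternating` are hypothetical.

```python
from typing import List, Set, Tuple

Label = Tuple[str, int]  # (type, numeral), e.g. ("L", 1) or ("V", 3)


def label_distance(a: Label, b: Label) -> int:
    """Number of components (type, numeral) in which two labels differ."""
    return int(a[0] != b[0]) + int(a[1] != b[1])


def is_alternating(edge_labels: List[Set[Label]]) -> bool:
    """Check whether one label per edge can be chosen so that labels of
    consecutive edges always have distance exactly 1."""
    if not edge_labels:
        return True
    # Labels of the current edge that can end a valid choice sequence.
    reachable = set(edge_labels[0])
    for labels in edge_labels[1:]:
        reachable = {
            b for b in labels
            if any(label_distance(a, b) == 1 for a in reachable)
        }
        if not reachable:
            return False
    return True


path = [{("V", 1)}, {("L", 1)}, {("L", 2), ("V", 3)}]
print(is_alternating(path))  # True: V1 -> L1 -> L2 changes one part per step
```

For an alternating cycle, the same check would additionally have to compare the label chosen for the last edge with the one chosen for the first edge; that wrap-around is not covered by this sketch.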
The additional rules are defined as follows.
1. Alternating Cycle Rule (ACR)
Suppose there is an alternating cycle in the FPG. Let $c_i$ be a cell of the
cycle and $e_i$ and $e_{i+1}$ its incident edges. If there are two numerals x
and y with $(L, x) \in \mathit{label}(e_i)$ and
$(L, y) \in \mathit{label}(e_{i+1})$, remove all possible numerals except x
and y from consideration for $c_i$ (see Figure 13).
2. Repetitive Cycle Rule (RCR)
Suppose there is an alternating path of p + 1 edges in the FPG that starts
and ends at the same vertex (cell) but is not an alternating cycle; this
means that the edges incident to the starting cell prevent the alternating
path from being an alternating cycle. Let $e_0, e_p$ denote these two edges
and $b_0, b_p$ the labels of $e_0$ and $e_p$ that were used to form the
alternating path (note that $d(b_0, b_1) = d(b_{p-1}, b_p) = 1$, but
$d(b_0, b_p) \neq 1$). Then, the starting cell may not be assigned the
numeral of the label whose type is V, if there is one, and must be assigned
the numeral of the label whose type is L, if there is one. These two numerals
cannot be the same because the equality would yield $d(b_0, b_p) = 1$ and
thus the alternating path would also be an alternating cycle. Also, if both
labels were of the same type, then for the same reason, their numerals would
not differ.
While being an improvement over applying the Bilocation and Bivalue rules
separately, the Alternating Cycle and Repetitive Cycle rules alone are not
powerful enough to provide a substantial gain of solving power, as shown in
Section 4. Further generalization of the rules will be considered in the
following sections.
7 5 8 4 6 9
8 4 6 5 1
3 7 2 8 5 4
5 6 3
5 2 1
3 9 6
5 8 4 6
1
5 1
Figure 13: An example for the application of the ACR. The grid on the right
shows the alternating cycle and the labels of its edges. Note the two marked
cells in the top row. The left one may only contain 1 or 2, whereas the right one
may only contain 1 or 3. Because each of them may only contain two numerals
one of which is 1, the two cells are connected by an edge labeled V1, which
stands for bivalued by 1, with respect to the top row. That means that by
assigning 1 to any of them, the other cell is forced not to contain 1 but the
other possible numeral. Not being able to contain the 1 propagates by the edge
labeled L1. This label means that the two cells are bilocated by 1 with respect
to their group, meaning that if 1 cannot be assigned to any of them, the other
cell is forced to contain 1. The other edges are formed in the same manner.
3.4.2 Group-Modified Rules
The definition of the FPG can be extended to support propagation through grouping. That means that propagation may occur among cells that do not have to be bivalued or bilocated, but just in the same group. Since there are a lot of cells that are related by being in the same group, the extended FPG will be much bigger (although still polynomial in n) than the FPG. The size may be too much for a human solver to handle, which is why this was not included in the (previous) definition of the FPG. However, for an automated solver, this is still of interest, so we will define the extended FPG in the following:
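Definition For an $n \times n$ Sudoku grid $S$, the graph $G' = (W, E')$ with
$$W = \{c \mid c \text{ is a cell in } S\}$$
$$E' = \{\{c, c'\} \mid c, c' \in W \wedge \exists x\, (c \overset{g}{\underset{x}{\sim}} c')\}$$
$$\mathrm{label}' : E' \to \mathcal{P}(\{L, V, G\} \times \{1, \ldots, n\})$$
$$(L, x) \in \mathrm{label}'(c, c') \Leftrightarrow c \overset{l}{\underset{x}{\sim}} c'$$
$$(V, x) \in \mathrm{label}'(c, c') \Leftrightarrow c \overset{v}{\underset{x}{\sim}} c'$$
$$(G, x) \in \mathrm{label}'(c, c') \Leftrightarrow c \overset{g}{\underset{x}{\sim}} c'$$
is called the extended FPG (EFPG). The function $d' : (\{L, V, G\} \times \{1, \ldots, n\})^2 \to \mathbb{N}$ calculates the distance of two labels.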
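$$d'((p, q), (x, y)) = \begin{cases} \infty, & \text{if } p = x = G \wedge q \neq y \\ 0, & \text{if } (p = L \Leftrightarrow x = L) \wedge q = y \\ 2, & \text{if } (p = L \Leftrightarrow x \neq L) \wedge q \neq y \\ 1, & \text{otherwise.} \end{cases}$$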
Analogous to the FPG, a path in the EFPG of length $p + 2$ is called alternating if for each edge $e_i$ in the path, there is a label $b_i \in \mathrm{label}'(e_i)$ such that
$$\forall i \in \{0, \ldots, p\} : d'(b_i, b_{i+1}) = 1.$$
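The extended distance can be sketched as a small function. This sketch rests on an assumed reading of d': two G labels with different numerals never chain (modelled here as an infinite distance), labels that agree in their numeral and in whether they are of type L have distance 0, labels that differ in both respects have distance 2, and every other pair has distance 1. The function name `extended_distance` is hypothetical.

```python
import math
from typing import Tuple

Label = Tuple[str, int]  # (type, numeral) with type "L", "V" or "G"


def extended_distance(a: Label, b: Label) -> float:
    """Assumed reading of d' for labels a = (p, q) and b = (x, y)."""
    (p, q), (x, y) = a, b
    if p == "G" and x == "G" and q != y:
        return math.inf          # two plain group edges with different numerals
    same_l_status = (p == "L") == (x == "L")
    if same_l_status and q == y:
        return 0                 # nothing relevant changes between the edges
    if not same_l_status and q != y:
        return 2                 # both the L status and the numeral change
    return 1                     # exactly one of the two changes
```

Under this reading, V1 followed by L1 and V1 followed by G2 both have distance 1 and may appear consecutively in an alternating path, while G1 followed by G2 may not.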
Now, both additional rules stated in the previous section may also be used with $d'$ instead of $d$:
1. Extended Alternating Cycle Rule (EACR)
Suppose there is an alternating cycle in the EFPG. Let $c_i$ be a cell of the cycle and $e_i$ and $e_{i+1}$ its incident edges. If there are two numerals $x$ and $y$ with $(L, x) \in \mathrm{label}'(e_i)$ and $(L, y) \in \mathrm{label}'(e_{i+1})$, remove all possible numerals except $x$ and $y$ from consideration for $c_i$.
2. Extended Repetitive Cycle Rule (ERCR)
Suppose there is an alternating path of $p + 1$ edges in the EFPG that starts and ends at the same vertex (cell) but is not an alternating cycle. Let $e_0, e_p$ denote the two edges that are incident to the starting cell and $b_0, b_p$ the labels of $e_0$ and $e_p$ that were used to form the alternating path (note that $d'(b_0, b_1) = d'(b_{p-1}, b_p) = 1$, but $d'(b_0, b_p) \neq 1$). Then the starting cell may not be assigned the numeral of the label whose type is V or G if there are any, and must be assigned the numeral of the label whose type is L if there are any.
With the Extended Alternating Cycle and Repetitive Cycle Rules we are closer
to the goal of making use of the four limited constraint propagation mechanisms
mentioned in Section 3.4. However, there is still a more abstract formulation
than these two rules, which for example takes into account multiple cells having
influence on the content of a single cell. In the following we will introduce this
formulation and explain our implementation of it.
3.4.3 Limited Constraint Propagation
After uniting Bilocation and Bivalue rules, there are still unconsidered constraint propagation rules as mentioned in Section 3.4. To take them into account, a limited constraint propagation algorithm was implemented. The idea is to build a graph by analyzing the Sudoku grid with respect to the following interpretation of the fundamental rules of Sudoku:
Each cell contains at least one numeral.
This implies that, if there is only one numeral left for a cell, it has to be
assigned to it (Eliminate).
Each cell contains at most one numeral.
This implies that, if a numeral has been assigned to a cell, no other numeral
may be assigned to it (Cell-Flood).
Each group contains each numeral at least once.
This implies that, if a numeral can only be assigned to one cell in a group,
it has to be assigned to this exact cell (Locate).
Each group contains each numeral at most once.
This implies that, if a numeral has been assigned to a cell in a group, no
other cell in this group may be assigned this numeral (Group-Flood).
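Before turning to the graph representation, the four implications above can be illustrated directly on candidate sets. The following is a minimal sketch, assuming a grid of side n = b * b whose groups are rows, columns and b x b boxes, and a dictionary `cands` that maps each cell to the set of numerals still possible for it. All function names are hypothetical and do not reflect the implementation described in this work.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]


def make_groups(b: int) -> List[List[Cell]]:
    """Rows, columns and b x b boxes of an n x n grid with n = b * b."""
    n = b * b
    rows = [[(r, c) for c in range(n)] for r in range(n)]
    cols = [[(r, c) for r in range(n)] for c in range(n)]
    boxes = [[(br * b + i, bc * b + j) for i in range(b) for j in range(b)]
             for br in range(b) for bc in range(b)]
    return rows + cols + boxes


def assign(cands: Dict[Cell, Set[int]], groups: List[List[Cell]], cell: Cell, k: int) -> None:
    """Cell-Flood and Group-Flood: fix k in cell and strike k from every group peer."""
    cands[cell] = {k}
    for group in groups:
        if cell in group:
            for other in group:
                if other != cell:
                    cands[other].discard(k)


def eliminate(cands: Dict[Cell, Set[int]]) -> Set[Tuple[Cell, int]]:
    """Eliminate: cells for which only one numeral is left."""
    return {(cell, next(iter(s))) for cell, s in cands.items() if len(s) == 1}


def locate(cands: Dict[Cell, Set[int]], groups: List[List[Cell]], n: int) -> Set[Tuple[Cell, int]]:
    """Locate: numerals that fit into exactly one cell of some group."""
    found = set()
    for group in groups:
        for k in range(1, n + 1):
            places = [cell for cell in group if k in cands[cell]]
            if len(places) == 1:
                found.add((places[0], k))
    return found
```

On a 4 x 4 grid, assigning 1, 2 and 3 to the first three cells of the top row leaves only 4 for the fourth cell, which both `eliminate` and `locate` report.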
In the following the algorithm and its implementation will be described. The general structure of a constraint-propagation node, or fp node, is shown in Figure 14. The assignment of a numeral to a cell of a Sudoku grid is represented by an fp node containing this numeral. Not assigning a certain numeral to a cell is represented by an fp node whose numeral is negative. An fp node can be triggered, meaning that it was determined to be true. For example, if a cell must contain the numeral $k$, the triggered-property of the fp node of $k$ in this cell is set to true. If a numeral $k$ of a cell has been determined to cause a violation of the above rules, the node of $-k$ of this cell is triggered.
Figure 14: An fp node has an array of triggers and an array of impacts.
Every fp node has a list of triggers. A trigger of a node f is a list of fp nodes that collectively imply f, meaning that f is a logical consequence of the totality of all nodes of the trigger. Triggers have a type that describes the nature of their implication. Types may be Eliminate, Locate, Cell-Flood and Group-Flood:
Eliminate:
For all $k \in \{1, \ldots, n\}$ and all cells c of the grid, the totality of all negative nodes of c except the node of $-k$ trigger the node of $k$ of this cell, see Figure 16.
Cell-Flood:
For all $k \in \{1, \ldots, n\}$ and all cells c of the grid, the node of $k$ triggers all negative nodes of c except $-k$, see Figure 16.
Locate:
For all $k \in \{1, \ldots, n\}$, all groups g and all cells c of g, the totality of all nodes with the numeral $-k$ of all cells of g except c trigger the node with $k$ of c, see Figure 17.
Group-Flood:
For all $k \in \{1, \ldots, n\}$, all groups g and all cells c of g, the node of $k$ of c triggers all nodes with $-k$ of all cells of g except c, see Figure 17.
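The node structure and the four trigger types described above might be represented as follows. This is a sketch under assumptions, not the data structure of the original implementation: the class names `FpNode` and `Trigger`, the helper `add_trigger`, and the method `fires` are all hypothetical. Each node also records the triggers it participates in, so that a change of its triggered property can be propagated.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(eq=False)
class FpNode:
    cell: Tuple[int, int]
    numeral: int          # positive k: "cell holds k"; negative k: "cell does not hold k"
    triggered: bool = False
    triggers: List["Trigger"] = field(default_factory=list, repr=False)
    impacts: List["Trigger"] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Trigger:
    kind: str             # "Eliminate", "Locate", "Cell-Flood" or "Group-Flood"
    members: List[FpNode]
    target: FpNode

    def fires(self) -> bool:
        """A trigger implies its target once all of its members are triggered."""
        return all(node.triggered for node in self.members)


def add_trigger(kind: str, members: List[FpNode], target: FpNode) -> Trigger:
    """Register a trigger at its target and as an impact at each member."""
    trigger = Trigger(kind, list(members), target)
    target.triggers.append(trigger)
    for node in members:
        node.impacts.append(trigger)
    return trigger


# Example: the Eliminate trigger for numeral 1 in cell (0, 0) of a 4 x 4 grid.
cell = (0, 0)
nodes = {k: FpNode(cell, k) for k in (1, 2, 3, 4, -1, -2, -3, -4)}
eliminate_1 = add_trigger("Eliminate", [nodes[-2], nodes[-3], nodes[-4]], nodes[1])
```

Once the nodes for -2, -3 and -4 are marked as triggered, `eliminate_1.fires()` becomes true, which corresponds to the numeral 1 being the only one left for the cell.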
Likewise, every node has a list of impacts, which point to the triggers it participates in (those have to be considered when changing the triggered property of a node). With this data structure it is possible to represent most of the implications that assigning or not assigning a certain numeral to a certain cell may have. The graph structure in which these constraints are organized is built by the following algorithm:
for all cells c in the grid begin
    for all possibilities k of c begin
        f = fp_node(c, k)
        s = new set
        for all possible negative numerals -m begin
            if m != k then add fp_node(c, -m) to s
        make s an impact of f by Cell-Flood
        make s a trigger of f by Eliminate
        for all groups g that contain c begin
            t = new set
            for all cells d of g begin
                if d != c then add fp_node(d, -k) to t
            make t an impact of f by Group-Flood
            make t a trigger of f by Locate
Figure 15: An fp node may be triggered by several other fp nodes and may itself have impact on multiple fp nodes.
Figure 16: Illustration of the Eliminate and Cell-Flood implementation.
Figure 17: Illustration of the Locate and Group-Flood implementation.
Lemma 3.1 The size of the graph structure is polynomial in the size of the Sudoku grid.
Proof Let n be the number of different numerals in the grid. Note that the number of groups containing a given cell is 3 and does not depend on n. Hence, in each run of the outer loops, 4 triggers and 4 impacts will be added, each of size $O(n)$. The outer loops will run $O(n^2) \cdot O(n)$ times since there are $n^2$ cells in a grid and each cell contains at most $2n$ possibilities ($\{-n, \ldots, n\} \setminus \{0\}$). In total, the graph structure will be of size $O(n^4)$.
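To make the bound concrete, the count from the proof can be evaluated for the standard grid with n = 9. The figures below take the proof's counting at face value (2n possibilities per cell, 4 triggers and 4 impacts per possibility) and assume that each of these sets holds at most n - 1 nodes, as in the construction above; they are an upper estimate, not a measurement of the implementation.

```latex
\[
\underbrace{n^2}_{\text{cells}} \cdot \underbrace{2n}_{\text{possibilities}}
  = 81 \cdot 18 = 1458 \quad \text{runs of the inner loop body for } n = 9,
\]
\[
1458 \cdot \underbrace{8}_{\text{triggers and impacts}} \cdot \underbrace{(n - 1)}_{\text{nodes per set}}
  = 1458 \cdot 8 \cdot 8 = 93312 \;\le\; 16\,n^4 = 104976 .
\]
```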
The following algorithm is an implementation of the Limited Constraint Propagation method to solve Sudoku puzzles:
build the graph structure G
account for the given hints
do
    for all cells c in the grid begin
        for all possibilities k of c begin
            f = fp_node(c, k)
            f' = fp_node(c, -k)
            if f' is a consequence of f in G then trigger f'
while changes occurred
Remark Being a consequence of f in G is determined by a modified BFS³ algorithm that gathers all consequences of f in a set while traversing the graph. A node f'' is a consequence of f if it has a trigger that consists exclusively of nodes that are either triggered, a consequence of f, or f itself. This way, the solver will only trigger nodes that are logical consequences of nodes that are triggered already. Hence at any given time, all triggered nodes are consequences of the hints given in the puzzle and therefore, if the solver finds a full Sudoku grid, it is indeed the solution to the puzzle. Note that if a puzzle is ambiguous, no solution will be found. Triggering f' causes all nodes that have a trigger that does not contain any more untriggered nodes to be triggered as well, which may induce a chain-triggering of multiple nodes. The node f is then deleted since f' represents the impossibility of f.
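The solver loop and the consequence search can be sketched in runnable form. This is a simplified illustration rather than the implementation evaluated in this work: nodes are plain tuples (cell, k) instead of fp node objects, the grid is assumed to have side n = b * b with rows, columns and b x b boxes as groups, the deletion of f is not modelled, and the names `build_triggers`, `consequences` and `solve` are hypothetical. No attempt is made at efficiency.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]
Node = Tuple[Cell, int]  # (cell, k): positive k means "cell = k", negative k "cell is not |k|"


def build_triggers(b: int):
    """Return (triggers, impacts) for an n x n grid with n = b * b.

    triggers[i] = (members, target): all members together imply target.
    impacts[node] lists the indices of the triggers the node is a member of.
    """
    n = b * b
    cells = [(r, c) for r in range(n) for c in range(n)]
    triggers: List[Tuple[frozenset, Node]] = []
    impacts: Dict[Node, List[int]] = {(cell, s * k): [] for cell in cells
                                      for k in range(1, n + 1) for s in (1, -1)}

    def add(members, target):
        for m in members:
            impacts[m].append(len(triggers))
        triggers.append((frozenset(members), target))

    for (r, c) in cells:
        row = [(r, j) for j in range(n)]
        col = [(i, c) for i in range(n)]
        r0, c0 = r - r % b, c - c % b
        box = [(r0 + i, c0 + j) for i in range(b) for j in range(b)]
        for k in range(1, n + 1):
            f = ((r, c), k)
            negatives = [((r, c), -m) for m in range(1, n + 1) if m != k]
            add(negatives, f)                  # Eliminate
            for node in negatives:
                add([f], node)                 # Cell-Flood
            for group in (row, col, box):
                others = [(d, -k) for d in group if d != (r, c)]
                add(others, f)                 # Locate
                for node in others:
                    add([f], node)             # Group-Flood
    return triggers, impacts


def consequences(start, known, triggers, impacts) -> Set[Node]:
    """Breadth-first closure: everything implied by `known` plus the nodes in `start`."""
    reached = set(known) | set(start)
    queue = deque(start)
    while queue:
        node = queue.popleft()
        for i in impacts[node]:
            members, target = triggers[i]
            if target not in reached and members.issubset(reached):
                reached.add(target)
                queue.append(target)
    return reached


def solve(hints: Dict[Cell, int], b: int) -> Dict[Cell, int]:
    """Trigger the hints, then repeatedly trigger f' whenever f implies f'."""
    n = b * b
    triggers, impacts = build_triggers(b)
    triggered = consequences(list(hints.items()), set(), triggers, impacts)
    changed = True
    while changed:
        changed = False
        for cell in [(r, c) for r in range(n) for c in range(n)]:
            for k in [s * m for m in range(1, n + 1) for s in (1, -1)]:
                f, f_neg = (cell, k), (cell, -k)
                if f in triggered or f_neg in triggered:
                    continue
                if f_neg in consequences([f], triggered, triggers, impacts):
                    triggered = consequences([f_neg], triggered, triggers, impacts)
                    changed = True
    return {cell: k for cell, k in triggered if k > 0}
```

Calling `solve(hints, 2)` with a dictionary of given cells for a 4 x 4 puzzle returns the cells whose numeral could be determined. A result with fewer than n * n entries means that this form of propagation was not sufficient for the puzzle.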
Lemma 3.2 The solver runs in polynomial time.
Proof Obviously, the inner for-loops will be executed at most $O(n^3)$ times. The outer while-loop will run for as long as nodes are being triggered. Since once a node is triggered, it will not become untriggered again, and the size of the data structure containing all nodes is polynomial in n, the outer while-loop will run a number of times that is polynomial in n. Finding a path in a graph of polynomial size will also take polynomial time and so will triggering a node. All in all, the worst case running time stays polynomial in n.
With this technique, we now have a more powerful polynomial time solving mechanism that allows for the implementation of the AC, RC, EAC and ERC rules by limiting the set of edges that are subject to the path algorithm used: The Graph-Accessibility algorithm that finds a path in the graph structure was modified to only use edges that imply its incident vertices to be bilocated, bivalued or, in case of the extended rules, grouped in such a way as described in the respective sections: All Cell-Flood triggers are allowed, Locate and Eliminate triggers are allowed only if the trigger contains at most one untriggered node. If a Group-Flood trigger is encountered, the AC and RC implementation makes sure that the cells of both nodes have only two possible numerals and thus imply a Bivalue rule. The EAC and ERC implementation makes sure that no two Group-Flood triggers that do not imply Bivaluation are consecutive.
³ breadth-first search
Strategy                          Test Puzzles    Test Puzzles Solved
ACR and RCR                       5464            6 (0.11%)
EACR and ERCR                     5464            4261 (78%)
Limited Constraint Propagation    5464            5464 (100%)
Table 2: Comparison of the three solving strategies presented in Section 3.4.
To show how Alternating Cycles are being detected and exploited, suppose there is an Alternating Cycle. Let c be a cell in the cycle having more than two possible numerals. Let x, y and z be three of them with x and y being part of the labels of the two incident edges used to form the cycle. The path algorithm will find a path from the node (c, z) to the node (c, -z) along the cycle, thereby removing z from consideration for this cell as proposed by the Alternating Cycle rule. To show how Repetitive Cycles are being detected and exploited, suppose there is an alternating path that starts and ends in the same cell c. If (V, z) is a label of the first or last edge that is used to form this path, then the path algorithm will find a path from the node (c, z) to (c, -z), thereby
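removing z from consideration for this cell. If (L, z') is a label of the first or last edge that is used to form this path, then the path algorithm will find a path from the node (c, -z') to (c, z'), thereby removing -z'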