A Computational Survey of the Černý and Pin Conjectures

Author:
Michael Martis
School of Mathematics
and Statistics
Supervisor:
Dr. Gordon Royle
School of Mathematics
and Statistics
This thesis is presented for the partial requirements of the degree of

Bachelor of Science with Honours
of the University of Western Australia
December 13, 2014
Abstract
Finite automata are combinatorial objects designed to model computations that
only use a finite number of states. A finite automaton is said to be synchronising if
there is a fixed input sequence (called a synchronising word) by which it is reset;
that is, taken to a known state, regardless of its current configuration. The study of
synchronising automata has applications in robotics, biocomputing, network security
and circuitry, and stands on its own as a fascinating combinatorial research topic.
In 1964, Jan Černý constructed, for each natural number $n$, an $n$-state synchronising
automaton requiring a word of length $(n-1)^2$ to reset. It is conjectured that
these automata are maximally difficult to reset. That is, it is conjectured that no
$n$-state synchronising automaton requires a word of length greater than $(n-1)^2$ to
reset. Černý's conjecture, as this claim is known, remains open, and has received
extensive academic attention since its formulation.
For non-synchronising automata, the concept of a synchronising word is generalised
to that of a terminal word. If a word $w$ leaves an automaton $\mathcal{A}$ in one of $k$
possible states, and $k$ is the smallest such value achievable for $\mathcal{A}$, then
$\mathcal{A}$ is said to have rank $k$, and $w$ is said to be a terminal word of $\mathcal{A}$.
For instance, a synchronising automaton has rank 1, and its terminal and synchronising
words coincide. Pin's conjecture, a generalisation of Černý's conjecture proposed in 1978,
claims that if an $n$-state automaton has rank $k$, then it admits a terminal word of
length $(n-k)^2$ or less. While other such generalisations have been disproven, Pin's
conjecture remains open. This dissertation involves an extensive computational search
for counter-examples to Pin's conjecture, and for other automata of theoretical interest.
We show that there are no counter-examples, and no significant non-synchronising
edge cases, to Pin's conjecture among arity-2 automata of size greater than 2 and
less than 9. We also extend a number of known results and algorithms to operate on
non-synchronising automata, identify new families of slowly-converging automata
and provide the research community with the tools developed for this dissertation.
Acknowledgements
I must thank, first and foremost, Professor Gordon Royle, without whose vision and
guidance this dissertation would not be possible. I am also, as always, indebted to
my family, whose love and support I cherish above all else. Lastly, I thank the other
denizens of the Honours Room, with whom shared laughter kept me sane this year.
Contents
Abstract
Acknowledgements
List of Figures
List of Tables
1 Motivations and Definitions
2 Relevant Literature
3 Elementary Results
4 Pin's Bound on Reset Word Length
5 Nuclear Words and Strongly Connected Automata
6 Eppstein's Reset Word Algorithm
7 Enumerating Small Automata
8 Results
9 Concluding Remarks
Appendices
A A Listing of Significant Automata
A.1 Slowly-converging strongly-connected automata
A.1.1 Slowly-converging automata of size 3
A.2 Large slowly-converging automata
A.2.2 Slowly-converging automata of size 10
B Terminal-Threshold Histograms
B.1 Terminal-threshold histograms of all automata
B.1.1 Terminal-threshold histogram of size-3 automata
B.2 Terminal-threshold histograms of strongly-connected automata
C Source Code
C.1 Generating automata
C.1.1 The gena method
C.1.2 The GAP method
C.2 Canonicalising automata
C.3 Analysing automata
Index
List of Figures
1.1 The graph of the DFA $C_4$.
1.2 The graph of an arity-2, size-4, rank-2 DFA $A$.
2.1 The graph of the DFA $C_n$.
2.2 A graph depicting the pertinent input symbol of a one-cluster DFA.
2.3 The graph of Kari's counter-example to the deficiency conjecture.
3.1 The graph of the power automaton of $C_4$.
4.1 The shortest path between a state $P$ of $\mathcal{P}(\mathcal{A})$ and a smaller state $(P)v$.
5.1 The graph of a DFA $B$ with an illustrative nucleus.
5.2 The graph of $A \setminus A'$, the quotient automaton of the DFA $A$ by the subautomaton $A'$ with state set $\{1, 2\}$.
7.1 The graph of the union of $A$ and $B$ under = (0, 2, 1).
7.2 The graph of a DFA admitting non-trivial automorphisms.
7.3 The graph of the DFA $A$, its corresponding graph $G_A$, the canonicalised graph $G'_A$, and the canonicalised DFA $A'$.
8.1 The graph of the DFA $E_{0,n}$ for an even, natural number $n$ with $n > 4$.
8.2 The graph of the DFA $E_{1,n}$ for an odd, natural number $n$ with $n > 3$.
8.3 The slowest-converging automata produced in our restricted enumeration of DFAs.
List of Tables
2.1 The number $N(n)$ of non-isomorphic arity-2, size-$n$ automata, for $2 \le n \le 11$.
2.2 The number $N_{sc}(\ell)$ of synchronising, strongly-connected, arity-2, size-11 automata with reset threshold $\ell$, for $76 \le \ell \le 100$.
4.1 The state of the variables in GreedyTerminalWord($C_4$) after each iteration of the while loop.
8.1 The number of non-isomorphic arity-2, size-$n$, rank-$k$ automata, for $3 \le n \le 8$ and $1 \le k \le n$.
8.2 The number of non-isomorphic arity-2, size-$n$, rank-$k$ automata with terminal threshold of $C(n, k) = (n-k)^2$, for $3 \le n \le 8$ and $1 \le k \le n$.
8.3 The number of arity-2, size-$n$, rank-1 DFAs with terminal threshold $T$, for $3 \le n \le 8$.
8.4 The number of arity-2, size-$n$, rank-2 DFAs with terminal threshold $T$, for $3 \le n \le 8$.
8.5 The maximum terminal threshold of a strongly-connected, arity-2, size-$n$, rank-$k$ DFA (upper left), versus $C(n, k)$ (lower right), for $3 \le n \le 8$ and $1 \le k \le n$.
8.6 The observed (for $5 \le n \le 16$) and extrapolated terminal thresholds of three slowly-converging families of strongly-connected automata.
8.7 The number $N_{n,k}(\ell)$ of size-$n$, rank-$k$ automata constructed from a cyclic transformation and a transformation of deficiency 1, with terminal threshold $\ell$.
8.8 The observed (for $10 \le n \le 16$) and extrapolated rank $r$, shortest terminal word $w$, and terminal threshold $T$ of automata from the families $S_{n,1}$, $S_{n,2}$ and $C_{n,1}$.
CHAPTER 1
Motivations and Definitions

A deterministic finite-state automaton (abbreviated to DFA), or deterministic
finite-state machine, is an abstract model of a computational process.
Definition 1.1. A DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$ consists of a finite state set $Q$, a finite
transition alphabet $\Sigma$, and a transition function $\delta : Q \times \Sigma \to Q$.
In particular, this definition forgoes the notions of an initial state, accepting
states, and output symbols, which are often used, but are not relevant for our
purposes. Note that, by insisting on the use of a transition function (as opposed to a
transition map), we restrict our attention to deterministic and complete automata.
We say that $\mathcal{A}$ has size $|Q|$ and arity $|\Sigma|$.
Conceptually, a DFA represents a process that, at any point, is in one of a
finite number of possible states (represented by the elements of $Q$). The DFA is
transitioned from its current state to another under the application of any symbol
in the transition alphabet; the symbol $\sigma \in \Sigma$ causes a DFA in state $q \in Q$ to
transition to state $\delta(q, \sigma)$. Finite-state automata are used to model and analyse
problems in electronic circuit design, computer networking, artificial intelligence
and many other fields. A DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$ can be visually represented by a graph,
with nodes representing states in $Q$, and the nodes representing states $q_1$ and $q_2$
joined by a directed, $\sigma$-labelled edge precisely when $\delta(q_1, \sigma) = q_2$.
By way of example, consider the automaton $C_4$ depicted in Figure 1.1. Specifically,
$C_4 = \langle \mathbb{Z}_4, \{a, b\}, \delta \rangle$, with $\delta(q, a) = (q + 1) \bmod 4$, and $\delta(q, b) = q \bmod 3$.
Figure 1.1: The graph of the DFA $C_4$.
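As a minimal sketch of the definition above, the automaton $C_4$ can be encoded directly from its two transition formulas. The names `STATES`, `ALPHABET`, `delta` and `edges` are illustrative choices made here, not identifiers from the dissertation's own source code.

```python
# Illustrative encoding of C_4 on the state set Z_4 = {0, 1, 2, 3}.
STATES = range(4)
ALPHABET = "ab"

def delta(q, sigma):
    """Transition function of C_4: 'a' adds 1 mod 4, 'b' reduces mod 3."""
    if sigma == "a":
        return (q + 1) % 4
    if sigma == "b":
        return q % 3
    raise ValueError(f"unknown symbol: {sigma!r}")

# One labelled edge (q1, sigma, q2) per state/symbol pair, as in the graph of a DFA.
edges = [(q, s, delta(q, s)) for q in STATES for s in ALPHABET]

for q1, s, q2 in edges:
    print(f"{q1} --{s}--> {q2}")
```

Because $\delta$ is a total function, every state has exactly one outgoing edge per symbol, which is the sense in which the automaton is deterministic and complete.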

Any ordered sequence of alphabet symbols is called a word, and the set of words
over the alphabet $\Sigma$ is denoted by $\Sigma^*$. We often use the natural extension of $\delta$ acting
on words: $\delta(q, \sigma w) = \delta(\delta(q, \sigma), w)$, where $\sigma \in \Sigma$ and $w \in \Sigma^*$. For example, the word
$ab$ causes the DFA in Figure 1.1 to transition from state 2 to state $\delta(\delta(2, a), b) = 0$.
In many applications involving DFAs, the following question naturally arises:
given a DFA in an unknown state, is there a word that will reset the automaton
to a known state? These kinds of synchronisation problems surface in surprisingly
varied contexts.
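The extension of the transition function to words can be sketched as a left-to-right fold over the symbols of the word. The helper name `apply_word` is an assumption made for illustration, and the code treats any symbol other than `a` as `b`.

```python
def delta(q, sigma):
    # Transition function of C_4; sigma is assumed to be 'a' or 'b'.
    return (q + 1) % 4 if sigma == "a" else q % 3

def apply_word(q, word):
    """Extend delta to words by reading symbols left to right."""
    for sigma in word:
        q = delta(q, sigma)
    return q

print(apply_word(2, "ab"))  # 2 -a-> 3 -b-> 0, so this prints 0
```

The empty word leaves every state unchanged, since the loop body never runs.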
For instance, consider the testing of circuit chips. It is often not possible to know
in which state a freshly-manufactured circuit chip will be. So, before any meaningful
testing can be performed on the new chip, it must be reset to a known state. In
order to do this, we may find an input sequence that always leaves the chip in a
certain state, regardless of its original condition [14].
As another example, we look to the time-sensitive, wireless transmission of information,
such as that performed every day by mobile phone carriers, or that required for
deep-space communication. With a sender and receiver separated
by time and distance, it is often not possible for each to be aware of the other's
state. Hence, the states of the sender and receiver must be synchronised before any
cooperative behaviour can occur. It suffices to encode, in each message, an input
sequence that always takes the target system to a known state [14].
Lastly, consider biocomputing applications, in which DNA molecules are used
to implement finite-state automata. Thousands of billions of identical automata
are used to perform mass parallel computation on differing inputs. At any point in
time, these automata will be in a range of different states. It is clearly advantageous,
then, to have the same chemical word reset an automaton in any state. The entire
solution of DNA molecules can then be reset via the addition of a suitable quantity
of the chemical reset word [36].
We say that a DFA admitting such a reset word is synchronising.
Definition 1.2. A DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$ is called synchronising if there is some word
$w \in \Sigma^*$ and state $q_0 \in Q$ such that $\delta(q, w) = q_0$ for each $q \in Q$.
We call any such $w$ a reset or synchronising word of $\mathcal{A}$, with reset or synchronising
state $q_0$. The length of the shortest possible reset word of $\mathcal{A}$ is sometimes called the
reset threshold of $\mathcal{A}$.
While the problems mentioned above may be more naturally addressed through
the addition of explicit reset machinery, this is not always possible. Adding such
measures may be prohibitive (for reasons of cost, efficiency, resource usage etc.), or
impractical (in the case of existing legacy systems, for instance). There are even
systems (such as Edward Moore's Gedankenexperiments [22]) that are fundamentally
altered by the addition of states, for which such an approach is nonsensical.
Furthermore, synchronising DFAs have proven fascinating combinatorial objects in their
own right, and are often studied as such, divorced from any particular application
[14].
The automaton in Figure 1.1 is synchronising, with synchronising word $w = ba^3ba^3b$
and synchronising state 0. That is, $\delta(0, w) = \delta(1, w) = \delta(2, w) = \delta(3, w) = 0$.
Synchronising DFAs have been studied implicitly since the inception of automata
theory, but the notion was first explicitly considered in 1964 by Černý [7]. His paper
gave the construction, for any integer $n > 2$, of an $n$-state synchronising DFA $C_n$
with shortest reset word of length $(n-1)^2$. It has since been conjectured that no
$n$-state synchronising DFA requires a longer word to reset. That is:
Conjecture 1.3 (The Černý conjecture). If $\mathcal{A}$ is an $n$-state synchronising DFA, then
$\mathcal{A}$ has a reset word of length $(n-1)^2$ or less.
The DFA in Figure 1.1 is the Černý automaton for $n = 4$; Černý's original
proof establishes that the word $ba^3ba^3b$ of length $9 = (4-1)^2$ is the shortest word
that resets $C_4$. We use, as in Volkov's 2008 survey of the subject [36], the Černý
function $C(n)$ to denote the maximum length of a shortest reset word on an $n$-state
DFA, for any integer $n > 2$. The existence of Černý's family of automata
implies that $C(n) \ge (n-1)^2$ for any $n > 2$. Černý's conjecture, then, is that the
equality $C(n) = (n-1)^2$ holds. It is not known whether the Černý conjecture is
true or false, and answering this question has emerged as a major open problem in
automata theory.
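For an automaton as small as $C_4$, the claim that $ba^3ba^3b$ is a shortest reset word can be checked by brute force over all words of length at most 9. The following is only a sketch of that check (not the method used in this dissertation); the names `image` and `shortest_reset_words` are hypothetical.

```python
from itertools import product

def delta(q, sigma):
    # Transition function of C_4; sigma is assumed to be 'a' or 'b'.
    return (q + 1) % 4 if sigma == "a" else q % 3

def image(word):
    """The set of states C_4 can be in after reading `word` from any state."""
    result = set()
    for q in range(4):
        for sigma in word:
            q = delta(q, sigma)
        result.add(q)
    return result

def shortest_reset_words(max_len):
    """All reset words of the least length not exceeding max_len, if any."""
    for length in range(max_len + 1):
        found = ["".join(w) for w in product("ab", repeat=length)
                 if len(image(w)) == 1]
        if found:
            return found
    return []

words = shortest_reset_words(9)
print(words)
```

The search examines $2^{10} - 1 = 1023$ words in total, so it is instantaneous here, but the same approach grows exponentially with word length and is not practical for larger automata.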
We associate, with each word $w \in \Sigma^*$ in the alphabet of a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$,
the natural transformation of $w$ on $Q$, also denoted by $w$. That is, we define the
function $w : Q \to Q$ to be $q \mapsto \delta(q, w)$ for each word $w$ of $\mathcal{A}$. To maintain
notational consistency between transformation composition and word concatenation,
we write transformations in postfix notation. For instance, $a : \mathbb{Z}_4 \to \mathbb{Z}_4$ given by
$(q)a = (q + 1) \bmod 4$, $b : \mathbb{Z}_4 \to \mathbb{Z}_4$ given by $(q)b = q \bmod 3$, and $ba^3ba^3b :
\mathbb{Z}_4 \to \mathbb{Z}_4$ given by $(q)ba^3ba^3b = 0$ are all transformations that arise from words of
$C_4$. Note that the transformation associated with the word $uv$ is the composition of
the transformations associated with $u$ and $v$. The set of all such transformations,
along with the composition operation, forms a semigroup; analysis can proceed
algebraically from this construction, and results have been proven for DFAs with
semigroups satisfying various properties [1, 33].
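The correspondence between word concatenation and postfix composition can be illustrated by storing each transformation of $\mathbb{Z}_4$ as a tuple of images. This is a sketch under that representation, which is an assumption made here; `compose` and `transformation` are illustrative names.

```python
def compose(u, v):
    """Postfix composition: apply u first, then v, so (q)uv = ((q)u)v."""
    return tuple(v[u[q]] for q in range(len(u)))

# Transformations of Z_4 induced by the letters of C_4, stored as image tuples.
LETTERS = {
    "a": (1, 2, 3, 0),  # (q)a = (q + 1) mod 4
    "b": (0, 1, 2, 0),  # (q)b = q mod 3
}

def transformation(word):
    """The transformation of Z_4 induced by a word over {a, b}."""
    t = tuple(range(4))  # the empty word induces the identity
    for sigma in word:
        t = compose(t, LETTERS[sigma])
    return t

print(transformation("ab"))         # (1, 2, 0, 0)
print(transformation("baaabaaab"))  # (0, 0, 0, 0)
```

Writing composition in this order means that the tuple for $uv$ is obtained from the tuples for $u$ and $v$ without reversing the word, which is the notational convenience the postfix convention is meant to provide.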
The rank of $w$ is defined as $|(Q)w|$, i.e. the number of distinct states $\mathcal{A}$ could be
in after the application of $w$. For instance, the word $b$ of $C_4$ has rank $|\{0, 1, 2\}| = 3$,
while the word $ba^3ba^3b$ has rank 1. Synchronising words, then, are characterised by
having rank 1. The rank of a DFA is defined to be the minimum rank of a word
constructed from its alphabet, with words of this rank called terminal words. We
also define the terminal threshold of a DFA as the length of its shortest terminal
word. This generalisation of synchronising words suggests a natural generalisation
of the Černý conjecture, first formulated by Pin [24]:
Conjecture 1.4 (Pin's conjecture). If $\mathcal{A}$ is an $n$-state DFA of rank $k$, then $\mathcal{A}$ has a
terminal word of length $(n-k)^2$ or less.
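The rank of a word, and the bound $(n-k)^2$ appearing in Pin's conjecture, can be computed directly for the running example $C_4$. The sketch below uses the hypothetical helper names `image`, `word_rank` and `pin_bound`.

```python
def delta(q, sigma):
    # Transition function of C_4; sigma is assumed to be 'a' or 'b'.
    return (q + 1) % 4 if sigma == "a" else q % 3

def image(word, states=range(4)):
    """The set (Q)w of states reached by reading `word` from each state."""
    result = set()
    for q in states:
        for sigma in word:
            q = delta(q, sigma)
        result.add(q)
    return result

def word_rank(word):
    """The rank |(Q)w| of a word of C_4."""
    return len(image(word))

def pin_bound(n, k):
    """The conjectured bound (n - k)^2 for an n-state DFA of rank k."""
    return (n - k) ** 2

print(word_rank("b"))          # 3, since the image is {0, 1, 2}
print(word_rank("baaabaaab"))  # 1, since the image is {0}
print(pin_bound(4, 1))         # 9
```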
This conjecture is sometimes called the Černý-Pin conjecture, to recognise the
importance of Černý's work in its development. Note that, in the specific case that $k$
is equal to 1, the Černý and Pin conjectures coincide. We illustrate Pin's conjecture
by example.
Figure 1.2: The graph of an arity-2, size-4, rank-2 DFA $A$.

The automaton $A$ depicted in Figure 1.2 is not synchronising: as both input
symbols permute the state set $\{1, 2\}$, no word of $A$ can map states 1 and 2 to the
same image. The word $aba$, however, has image $\{1, 2\}$ of size 2. So $A$ is a rank-2
DFA. Pin's conjecture, then, asserts that $A$ must admit a terminal (i.e. rank-2)
word of length $(4-2)^2 = 4$ or less. Clearly, $aba$ is such a word, so $A$ conforms to
Pin's conjecture. Incidentally, it also holds the maximal terminal threshold among all
arity-2, size-4, rank-2 automata. That is, all arity-2, size-4, rank-2 automata admit
a terminal word of length 3 or less. Pin's conjecture, like the Černý conjecture,
remains an open problem in automata theory. It has received, however, far less
attention from the research community than its progenitor.
In this dissertation, we present the results of a computational enumeration of all
automata of small size, and analyse them with respect to Pin's conjecture. In order
to do so, we extend known results and algorithms to operate on non-synchronising
automata, and present and implement non-trivial computational tools for the
generation and analysis of automata. Our enumeration of small automata significantly
enhances the current body of knowledge on Pin's conjecture. In particular, it shows
that there are no counter-examples to Pin's conjecture, and that there are not even
significant edge cases for Pin's conjecture, among small automata. Furthermore, we
provide the tools developed during the course of this dissertation, to assist future
researchers in the field.
CHAPTER 2
Relevant Literature
The concept of synchronising automata, and the Černý conjecture itself, are both
long-standing topics of research in the field of automata theory. A wealth of relevant
literature regarding both subjects is available; indeed, many of the fundamental
concepts and results in these areas have been independently rediscovered multiple
times [36]. In the years since its formulation, combinatorial, algebraic, probabilistic
and computational strategies have been used to approach the Černý conjecture.
As well as incrementally improving the best known upper bound on the length of
shortest reset words, researchers have found success in proving the Černý conjecture
for certain classes of DFAs, and in answering related conjectures.
Volkov [36] traces discussion of synchronising DFAs back to the 1956 book
Introduction to Cybernetics, by English psychiatrist and cyberneticist William Ross
Ashby. Ashby [4] wrote of a comically haunted house, in which two ghostly noises
(singing and laughing) respond, in a convoluted manner, to the burning of incense
and the playing of an organ. The behaviour of the poltergeists can be naturally
encoded into an automaton, which happens to be synchronising. To find a sequence
of actions that silences the ghosts, then, is to find a reset word of the corresponding
automaton. Throughout the 1950s, Moore's [22] Gedankenexperiments on sequential
machines, and Ginsburg's [12] uniform experiments, paved the way to Černý's
1964 paper, and to the study of synchronising automata in general.
Černý's original paper [7] did not, in fact, formulate the conjecture that now
bears his name [36]. The paper instead gave the construction of an infinite family
of automata. The general form of a Černý automaton is depicted in Figure 2.1.
The Černý automaton $C_n$ is an $n$-state synchronising DFA that requires a word
of length $(n-1)^2$ to reset. It has arity two, admitting input symbols $a$ and $b$. The
input symbol $a$ acts as a cyclic permutation on the states of the DFA. The input
symbol $b$ fixes all but the last state of the DFA, which it maps back to state 0. More
precisely, we have $C_n = \langle \mathbb{Z}_n, \{a, b\}, \delta \rangle$, with:

$$\delta(q, \sigma) = \begin{cases} (q + 1) \bmod n & \text{if } \sigma = a, \\ q \bmod (n - 1) & \text{if } \sigma = b. \end{cases}$$
The DFA $C_n$ has shortest reset word $(ba^{n-1})^{n-2}b$, of length $(n-1)^2$, with reset
state 0. Černý's original paper proved this fact, thus placing a lower bound on
the Černý function $C(n)$. In the same paper, Černý established an exponential-order
upper bound on $C(n)$, noting that future researchers could expect improvement
mainly for the upper bound. Only two years later, Starke [28] published a
German-language paper providing the first polynomial upper bound on $C(n)$, which
was incrementally improved upon, during the subsequent decades, by a number of
mathematicians [8, 17] including Černý himself. In 2012, Trahtman [35] claimed to
have proved the Černý conjecture. This proof appears to be incorrect; over the next
two years, a number of revisions to the paper were published, culminating in its
revocation from arXiv.org in March 2014.
Figure 2.1: The graph of the DFA $C_n$.
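The construction of $C_n$, and the form of its reset word $(ba^{n-1})^{n-2}b$, can be checked numerically for small $n$. The sketch below only confirms that the word has length $(n-1)^2$ and resets $C_n$ to state 0; it does not prove minimality. The function names are assumptions made for illustration.

```python
def cerny_delta(n):
    """Return the transition function of the Cerny automaton C_n on Z_n."""
    def delta(q, sigma):
        # sigma is assumed to be 'a' or 'b'.
        return (q + 1) % n if sigma == "a" else q % (n - 1)
    return delta

def cerny_reset_word(n):
    """The word (b a^(n-1))^(n-2) b."""
    return ("b" + "a" * (n - 1)) * (n - 2) + "b"

def image(n, word):
    """The set of states C_n can be in after reading `word` from any state."""
    delta = cerny_delta(n)
    result = set()
    for q in range(n):
        for sigma in word:
            q = delta(q, sigma)
        result.add(q)
    return result

for n in range(3, 9):
    w = cerny_reset_word(n)
    print(n, len(w), image(n, w))
```

The length follows from the shape of the word: $(n-2)$ blocks of length $n$ plus a final $b$ give $n(n-2) + 1 = (n-1)^2$ symbols.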
Pin [25] deduced, in 1983, the most well-known upper bound on $C(n)$: $\frac{1}{6}(n^3 - n)$.
His paper detailed a straightforward algorithm to generate a short (though not
necessarily optimal) reset word for a DFA, and placed a bound on the length of the
resultant word by analogy to a known, purely combinatorial problem. Combinatorial
results by Frankl [11] were then leveraged to complete the proof. This upper bound
stood unimproved for almost 30 years, until a separate argument by Trahtman [34]
established a new cubic upper bound (with lower constant factor) on $C(n)$, in 2011.
Černý's bound, or stronger bounds, have been proven to hold for numerous
subclasses of automata. Many of these classes arise from placing restrictions on
the graph of a DFA. For example, consider circular automata: automata in which
some input symbol cyclically permutes the state set. Each of Černý's automata,
for example, is circular, as are many of the known individual edge cases to the
Černý conjecture. Pin [23] proved, in 1978, that a synchronising circular automaton
with a prime number of states must conform to Černý's conjecture. In a significant
paper published twenty years after Pin's original work, Dubuc [9] generalised this
result to all synchronising circular automata. In 2011, Pin's original result was
also generalised to apply to all synchronising one-cluster automata with prime cycle
order, by Steinberg [30]. An automaton is said to be one-cluster if some input symbol
acts as a transformation containing (though not necessarily consisting entirely of)
a single cycle. Equivalently, an automaton is one-cluster if the graph of one of its
input symbols is connected. See Figure 2.2 for an example of such a DFA.
Figure 2.2: A graph depicting the pertinent input symbol of a one-cluster DFA.
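The one-cluster condition can be tested by counting the cycles in the graph of each input symbol: a symbol's graph is connected exactly when it contains a single cycle. The sketch below stores each symbol as a tuple of images; the function names and the five-state example transformation are hypothetical and do not reproduce the automaton of Figure 2.2.

```python
def cycle_count(t):
    """Number of cycles in the graph q -> t[q] of a transformation t."""
    on_cycle = set()
    count = 0
    for start in range(len(t)):
        # After len(t) steps, any walk has entered a cycle.
        q = start
        for _ in range(len(t)):
            q = t[q]
        if q not in on_cycle:
            count += 1
            # Mark every state of the newly found cycle.
            while q not in on_cycle:
                on_cycle.add(q)
                q = t[q]
    return count

def is_one_cluster(letters):
    """True if some input symbol's graph contains exactly one cycle."""
    return any(cycle_count(t) == 1 for t in letters.values())

c4 = {"a": (1, 2, 3, 0), "b": (0, 1, 2, 0)}
print(cycle_count(c4["a"]), cycle_count(c4["b"]))  # 1 3
print(is_one_cluster(c4))                          # True

# A hypothetical symbol with a 3-cycle {0, 1, 2} and a tail 4 -> 3 -> 0.
print(cycle_count((1, 2, 0, 0, 3)))                # 1
```

Every circular automaton passes this test through its cyclic symbol, which is consistent with one-cluster automata generalising circular ones.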
In 2009, Carpi and D'Alessandro [5] provided a quadratic order bound on the
reset threshold of strongly-transitive automata (a different generalisation of circular
automata) and, less than one year later, generalised this result to include locally
strongly-transitive automata [6]. These results mirrored Rystsov's [27] treatment
of regular automata, a closely related concept. Future researchers in this area may
benefit from the work done by Steinberg [29], which elucidates a common strategy
used to prove many of these results. It appears that analogous work, proving Pin's
conjecture for certain classes of automata, has not been performed.
Computational tools are widely used to analyse automata. A number of efficient
algorithms exist to check whether a DFA is synchronising, and to generate reset
words. Trahtman and Eppstein appear to be the most prolific researchers in this
area. Eppstein [10] provides the most efficient known algorithm for generating reset
words, which requires $O(|Q|^3 + |Q|^2|\Sigma|)$ time and $O(|Q|^2 + |Q||\Sigma|)$ space to process
a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$. Trahtman [31] proposes modifications to this algorithm
designed to increase its average-case efficiency and decrease the length of the reset
word returned, as well as an alternate algorithm motivated by semigroup theory.
None of these algorithms, however, aims to generate a shortest reset word of a DFA.
In general, this task is NP-hard, and the associated decision problem is NP-complete [10].
There are also known algorithms for finding a terminal word of a DFA [26], although
this area of research appears less mature.
J. D. Mitchell's GAP package Semigroups [21] provides numerous computational
tools for the analysis and manipulation of finite automata, and was used extensively
during the production of this dissertation.
Computational enumeration of small synchronising automata is also commonly
performed. Taken together, Trahtman [31], and Kisielewicz and Szykuła [16], claim
to have exhaustively generated and examined all automata of size less than 8 and
arity less than 5, or size less than 12 and arity less than 3: a task involving the
generation of over one thousand trillion automata and purportedly requiring over 4
years of CPU time. Table 2.1 reveals the magnitude of the task; $N(n)$, as proven
by Harrison [13] and listed in the On-Line Encyclopedia of Integer Sequences [20], is
the number of non-isomorphic arity-2, $n$-state automata that exist.
n     N(n)
2     4
3     67
4     1455
5     41829
6     1540566
7     68342769
8     3540690574
9     209612913688
10    13957423185476
11    1032436318249157
Table 2.1: The number $N(n)$ of non-isomorphic arity-2, size-$n$ automata, for $2 \le n \le 11$.
While Trahtman's paper gives no indication as to the methods used to generate
automata, Kisielewicz and Szykuła provide an account of an efficient algorithm
for generating automata, which was re-implemented and used for this dissertation
(see Chapter 7 for details on this process). The algorithm generates all
automata of a given size, including some isomorphic automata, which must then be
removed through a canonicalisation process. The authors, however, purport that
modifications to the process can be used to generate only non-isomorphic, or only
strongly-connected, automata. Almeida, Moreira and Reis [2] describe a normal
form for the canonical representation of, and an algorithm for the generation of, all
non-isomorphic initially-connected DFAs (a class of DFAs containing all
strongly-connected automata). It should be noted that the computational enumeration of
even moderately sized automata (8 states and above) proves difficult, due to the
sheer number of such objects that exist.
Computational searches have revealed that no counter-examples to the Černý
conjecture exist within the given parameter space, and that edge cases (automata
with shortest reset word of length $(n-1)^2$) are exceptionally rare; aside from
Černý's family, no infinite families of edge-case automata have been recognised, and
only eight individual edge-case automata (all either arity-2 or -3) are known [31]
to exist. Furthermore, it appears that edge cases become less frequent with an
increase in automaton size. The largest known sporadic edge case has only 6 states,
while slowly-synchronising automata (automata requiring relatively long words to
reset) have been observed to grow sparser as automaton size increases. Indeed,
Trahtman conjectures that all edge cases to the Černý conjecture have been found.
It appears that the corresponding computational search for counter-examples to
Pin's conjecture has yet to be performed, and hence the automata mentioned above
also serve as the only known edge cases to Pin's conjecture. J. D. Mitchell's online
home page [19] provides a collection of all arity-2 automata of size less than 8,
indexed by size, strong connectedness and synchronicity.
Notable slowly-synchronising automata have also been studied and categorised.
Ananichev, Gusev and Volkov [3] construct eight infinite families of automata with
reset threshold close to Černý's bound, through the labelling of primitive digraphs
with large exponents. It is within this context that the concepts of reset-threshold
gaps and islands appear. Consider Table 2.2, originally from Kisielewicz and
Szykuła's paper on the computational enumeration of automata. Note that the
automata are clustered by reset threshold. Here, $C_{11}$ is the Černý construction for
$n = 11$, and the other classes of automata are defined by Ananichev, Gusev and
Volkov.
$\ell$: 76 77 78 79 80 81 82 83 84 85 89 90 91 92 93 99 100
$N_{sc}(\ell)$: 3 2 0 0 9 22 12 2 1 0 0 0 3 2 1 0 0 0 1 0
Classes: D11 H11 G11 E11 W11 D11 C11 00 H11 D11 F11 B11
Table 2.2: The number $N_{sc}(\ell)$ of synchronising, strongly-connected, arity-2, size-11
automata with reset threshold $\ell$, for $76 \le \ell \le 100$.
This table exhibits four islands of automata ($76 \le \ell \le 77$, $80 \le \ell \le 84$,
$90 \le \ell \le 92$ and $\ell = 100$), between which are three gaps containing no automata.
The single gap for size-7, -8, -9 and -10 automata was originally found by Trahtman
in 2006 [32], while the two gaps that occur for size 9 were first noted by Ananichev,
Gusev and Volkov in 2012 [3]. Kisielewicz and Szykuła's third gap was found for
size 11 in 2013, and led them to formulate the following natural conjecture.
Conjecture (The gap conjecture). For any integer $g \ge 1$, there exists a large enough
integer $n \ge 1$ such that there are at least $g$ gaps in the distribution of reset thresholds
among arity-2, size-$n$, synchronising automata.
It appears, once again, that the corresponding identification of what might be
called slowly-converging automata (that is, not necessarily synchronising automata with
terminal threshold close to Pin's bound) has not been performed. Similarly, the
distributions of shortest terminal word lengths are not known.
Pin's conjecture itself was not formulated until more than 15 years after the
Černý conjecture. It has its origin in another 1978 paper by Pin [24], wherein a
related conjecture was proposed.
Conjecture (The deficiency conjecture). If an $n$-state DFA admits a word of rank
no greater than $k$, then there exists such a word of length no greater than $(n-k)^2$.
By setting $k$ to 1, we see that the deficiency conjecture is (another) generalisation
of the Černý conjecture. Note, however, that Pin's conjecture is not equivalent to
the deficiency conjecture. In particular, Pin's conjecture concerns only the shortest
terminal word of a DFA, whereas the deficiency conjecture concerns the shortest
word of each possible rank. In its paper of origin, Pin proved the conjecture for
$k = n$, $n-1$, $n-2$ and $n-3$, but no further progress was made. The conjecture
remained open for more than 20 years, until Jarkko Kari [15], in 2001, uncovered the
automaton depicted in Figure 2.3.
Figure 2.3: The graph of Kari's counter-example to the deficiency conjecture.

The shortest rank-1 word admitted by Kari's automaton is $baabababaabbabaabaababaab$,
of length 25, while the shortest rank-2 words admitted are $baaba(babaab)^2$
and $(baababaa)^2b$, both of length 17. The correctness of the deficiency conjecture,
however, would guarantee the existence of a rank-2 or rank-1 word of length $(6-2)^2 =
16$ or less. While this DFA disproves the deficiency conjecture, it still conforms
to Pin's conjecture; it is synchronising, with shortest reset word of length $25 =
(6-1)^2$. Incidentally, it is also one of the eight known individual edge cases to the
Černý conjecture. With the original conjecture closed, researchers considered its
restriction to automata of rank $k$. Thus arose Pin's conjecture, as formulated in this
dissertation.
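The arithmetic behind Kari's counter-example can be laid out explicitly. The short sketch below only measures the words quoted above against the two bounds for $n = 6$; it does not encode Kari's automaton itself, so it cannot confirm the ranks of the words.

```python
n = 6

# Words quoted in the text for Kari's 6-state automaton.
rank1_word = "baabababaabbabaabaababaab"
rank2_words = ["baaba" + "babaab" * 2, "baababaa" * 2 + "b"]

deficiency_bound = (n - 2) ** 2  # bound claimed for words of rank at most 2
pin_bound = (n - 1) ** 2         # Pin's bound for a rank-1 (synchronising) DFA

print(len(rank1_word), pin_bound)                         # 25 25
print([len(w) for w in rank2_words], deficiency_bound)    # [17, 17] 16
```

The rank-2 words exceed the deficiency bound by one symbol, while the reset word meets Pin's bound exactly, which is why the automaton refutes one conjecture and is an edge case of the other.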
CHAPTER 3
Elementary Results
We begin by discussing elementary results relevant to the Černý and Pin conjectures.
While no original results are presented here, Appendix C contains implementations
of the concepts and algorithms discussed in this chapter. In particular, code written
for this dissertation fills a number of gaps in the existing functionality of the
GAP package Semigroups.
The standard notion of the power automaton proves a useful starting point.
Definition 3.1. The power automaton $\mathcal{P}(\mathcal{A}) = \langle \mathcal{P}'(Q), \Sigma, \delta' \rangle$ of a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$
is the DFA with transition function $\delta' : \mathcal{P}'(Q) \times \Sigma \to \mathcal{P}'(Q)$ given by
$\delta'(\{q_1, \ldots, q_k\}, \sigma) = \{\delta(q_1, \sigma), \ldots, \delta(q_k, \sigma)\}$,
where $\mathcal{P}'(Q)$ is the set of all non-empty subsets of $Q$.
We illustrate the power automaton by example. Figure 3.1 depicts the power
automaton of $C_4$, the Černý automaton illustrated in the introduction of this
dissertation. Note that each state in $\mathcal{P}(C_4)$ is a non-empty subset of $\mathbb{Z}_4$, and that the
transitions of $C_4$, naturally extended to act on each element of a set, give rise to the
transitions of $\mathcal{P}(C_4)$.
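The power automaton of a small DFA can be tabulated directly from Definition 3.1. The following is a minimal sketch for $C_4$, assuming subsets are stored as `frozenset` objects; it is not the GAP implementation from Appendix C, and `power_automaton` is a hypothetical name.

```python
from itertools import combinations

def delta(q, sigma):
    # Transition function of C_4; sigma is assumed to be 'a' or 'b'.
    return (q + 1) % 4 if sigma == "a" else q % 3

def power_automaton(n, alphabet, delta):
    """Map each (non-empty subset, symbol) pair to the image subset."""
    subsets = [frozenset(c)
               for k in range(1, n + 1)
               for c in combinations(range(n), k)]
    return {(S, s): frozenset(delta(q, s) for q in S)
            for S in subsets for s in alphabet}

P = power_automaton(4, "ab", delta)

print(len({S for S, _ in P}))                       # 15 non-empty subsets of Z_4
print(sorted(P[(frozenset({0, 1, 2, 3}), "b")]))    # [0, 1, 2]
print(sorted(P[(frozenset({0, 3}), "b")]))          # [0]
```

The table has $(2^n - 1)|\Sigma|$ entries, so building it in full is only reasonable for small $n$; a search that explores only the subsets reachable from $Q$ avoids most of this cost.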
We now pause to make explicit some routine terminology.
Definition 3.2. In a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$, a state $q \in Q$ is said to be reachable from
a state $p \in Q$ if there is some word $w \in \Sigma^*$ such that $\delta(p, w) = q$.
Definition 3.3. A state $q \in Q$ of a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$ is said to be a sink state if
it is reachable from all states in $Q$.
For example, the state $\{0\}$ of $\mathcal{P}(C_4)$ is reachable from every state in the power
automaton, as 0 is a reset state for $C_4$. Hence, the state $\{0\}$ is a sink of $\mathcal{P}(C_4)$. Conversely, no
2-element state of $\mathcal{P}(C_4)$ is reachable from a 1-element state (the power automaton
cannot transition from a smaller state to a larger state, by the definition of its
transition function), so no 2-element state is a sink of $\mathcal{P}(C_4)$. We require the concept of
reachability for the following proposition.
Proposition 3.4. Let $\Omega = \langle Q, \Sigma, \delta \rangle$ be a DFA, $\mathcal{P}(\Omega) = \langle \mathcal{P}'(Q), \Sigma, \delta' \rangle$ be its power automaton, and $r$ be the size of the smallest set reachable from $Q$ in $\mathcal{P}(\Omega)$. Then a word of $\Omega$ is terminal if and only if it takes the state $Q$ of $\mathcal{P}(\Omega)$ to a set of size $r$.
Proof. From the definition of reachability, a state $S$ of $\mathcal{P}(\Omega)$ is reachable from $Q$ if and only if $\delta'(Q, w) = (Q)w = S$ for some $w \in \Sigma^*$. Then those sets reachable from $Q$ in $\mathcal{P}(\Omega)$ are precisely the images of words of $\Omega$. So $S$ is minimally sized if and only if $w$ has minimal rank in $\Omega$. Hence, a word $w$ of $\Omega$ takes the state $Q$ of $\mathcal{P}(\Omega)$ to a set of size $r$ if and only if $w$ is terminal.
Figure 3.1: The graph of the power automaton of $C_4$.
If we think of states in the power automaton $\mathcal{P}(\Omega)$ as representing the possible states in which a DFA $\Omega$ could be, then we naturally arrive at the above result. It can also be used to bound the length of a shortest terminal word.
Corollary 3.5. An $n$-state DFA admits a terminal word of length less than $2^n$.
Proof. Suppose $\Omega = \langle Q, \Sigma, \delta \rangle$ is an $n$-state DFA. Note that words taking the state $Q$ to the state $Q'$ in $\mathcal{P}(\Omega)$ precisely correspond to paths between $Q$ and $Q'$ in the graph of $\mathcal{P}(\Omega)$. We know that a terminal word must take the state $Q$ of $\mathcal{P}(\Omega)$ to a smallest state $Q'$ reachable from $Q$. Then, as $|\mathcal{P}'(Q)| = 2^n - 1$, there must exist a path of length less than $2^n$ between $Q$ and $Q'$ in the graph of $\mathcal{P}(\Omega)$. This path yields a terminal word of $\Omega$, of length less than $2^n$.
From the results proved above, we see that a breadth-first search through the power automaton can be used to identify the shortest terminal word of a DFA. For example, the path yielding the shortest terminal word (specifically, a reset word) for $C_4$ is coloured red in Figure 3.1. While requiring exponential time in the worst case, this method remains viable for small automata, and was used in the computational searches performed for this dissertation. In fact, as finding a shortest terminal word of an automaton is NP-complete [10], it is unlikely that any efficient algorithm can perform the same task. We provide the GAP function ShortestWordsByRank (included in Appendix C) to calculate the shortest word of each rank admitted by a DFA, using a breadth-first search through the power automaton.
An equally simple, but much more efficient, algorithm exists to determine whether a DFA is synchronising, if a reset word need not be produced. We say that a word $w$ unifies (or is a unifier for) two states $u$ and $v$ if $(u)w = (v)w$.
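The breadth-first search through the power automaton mentioned above can be sketched as follows. This is a Python illustration under our own assumptions, not the ShortestWordsByRank function of Appendix C: words are plain strings, a DFA is a dictionary mapping each letter to a state map, and `cerny(n)` is a hypothetical helper that builds the Černý automaton with a acting as the cyclic shift and b sending $n - 1$ to $0$.

```python
from collections import deque


def cerny(n):
    """Assumed encoding of the Cerny automaton C_n on states 0..n-1."""
    return {
        "a": {q: (q + 1) % n for q in range(n)},
        "b": {q: (0 if q == n - 1 else q) for q in range(n)},
    }


def image(subset, word, delta):
    """Image of a set of states under a word."""
    current = frozenset(subset)
    for letter in word:
        current = frozenset(delta[letter][q] for q in current)
    return current


def shortest_terminal_word(states, delta):
    """Breadth-first search from the full state set through the power automaton.

    Returns a shortest word whose image has the smallest reachable size.
    """
    start = frozenset(states)
    word_to = {start: ""}
    queue = deque([start])
    best = start
    while queue:
        current = queue.popleft()
        # Subsets are dequeued in order of distance, so the first subset of
        # any given size that we meet is also a nearest one of that size.
        if len(current) < len(best):
            best = current
        for letter in sorted(delta):
            nxt = frozenset(delta[letter][q] for q in current)
            if nxt not in word_to:
                word_to[nxt] = word_to[current] + letter
                queue.append(nxt)
    return word_to[best]
```

For example, `shortest_terminal_word(range(4), cerny(4))` returns a reset word of length $9 = (4 - 1)^2$, in line with the Černý bound. The search visits up to $2^n - 1$ subsets, so it is only practical for small automata.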
Proposition 3.6. A DFA is synchronising if and only if, for all states $u$ and $v$, there exists a word that unifies $u$ and $v$.
Proof. Suppose $\Omega = \langle Q, \Sigma, \delta \rangle$ is synchronising. Then there is some word $w \in \Sigma^*$ and state $q_0 \in Q$ such that $\delta(q, w) = q_0$ for any $q \in Q$. Specifically, $w$ unifies any pair of states, as $\delta(u, w) = \delta(v, w)$ for all $u, v \in Q$. Conversely, suppose there is some word-producing function $w : Q^2 \to \Sigma^*$ such that $\delta(u, w(u, v)) = \delta(v, w(u, v))$ for all $u, v \in Q$. Consider any state $Q'$ of $\mathcal{P}(\Omega)$ with at least two elements, and choose two distinct elements $p, q \in Q'$. The word $w(p, q)$, then, maps $Q'$ to a smaller set in $\mathcal{P}(\Omega)$. Repeating this process, we can construct a path from any state in $\mathcal{P}(\Omega)$ to a singleton set. From Proposition 3.4, then, $\Omega$ must be synchronising.
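In practice, we can check this property via the analysis of another useful construction: the pair automaton.
Definition 3.7. If $\mathcal{P}(\Omega) = \langle \mathcal{P}'(Q), \Sigma, \delta' \rangle$ is the power automaton of a DFA $\Omega = \langle Q, \Sigma, \delta \rangle$, we define the pair automaton $\mathcal{P}^{[2]}(\Omega)$ by $\mathcal{P}^{[2]}(\Omega) = \langle M, \Sigma, \delta'|_M \rangle$, where $M \subseteq \mathcal{P}'(Q)$ is the set of all 1- and 2-element subsets of $Q$.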
The pair automaton is conceptually similar to the power automaton, but only contains state sets of size 1 and 2. In particular, the graph of $\mathcal{P}^{[2]}(\Omega)$ is induced on the graph of $\mathcal{P}(\Omega)$ by the set of size-1 and size-2 states. For example, considering only the size-1 and size-2 states in Figure 3.1 yields $\mathcal{P}^{[2]}(C_4)$. This notion is treated more rigorously in Chapter 5. We provide the GAP function PairSemigroup for generating the pair automaton, in Appendix C.
Observe that the words unifying the states $p, q \in Q$ of a DFA $\Omega = \langle Q, \Sigma, \delta \rangle$ are precisely the words taking the pair state $\{p, q\}$ of $\mathcal{P}^{[2]}(\Omega)$ to a singleton state. Hence a DFA can be checked for synchronicity by constructing the pair automaton and checking that some 1-element state is reachable from each 2-element state. This approach yields an $O(|Q|^2 |\Sigma|)$ algorithm, which is implemented in the PolyIsSynchronisingSemigroup¹ function in Appendix C. The following results relate properties of an automaton to those of its pair automaton.
¹ This function is named awkwardly in an attempt to avoid a collision with the exponential-time IsSynchronisingSemigroup function provided by the Semigroups GAP package.
Proposition 3.8. A DFA $\Omega$ is synchronising if and only if $\mathcal{P}^{[2]}(\Omega)$ has a sink state.
Proof. Let $\Omega = \langle Q, \Sigma, \delta \rangle$ be a DFA, and $\mathcal{P}^{[2]}(\Omega) = \langle Q', \Sigma, \delta' \rangle$ be its pair automaton. If $\Omega$ is synchronising, then there exists a $w \in \Sigma^*$ and $q_0 \in Q$ such that $\delta(q, w) = q_0$ for all $q \in Q$. It is clear, then, that $w$ takes any state in $\mathcal{P}^{[2]}(\Omega)$ to $\{q_0\}$. Hence $\mathcal{P}^{[2]}(\Omega)$ has a sink. Conversely, suppose $\mathcal{P}^{[2]}(\Omega)$ has a sink. Note that this sink must be a singleton set $\{q_0\}$, otherwise there would be no possible word taking a 1-element set to the sink. Hence, for each two-element state $\{u, v\}$ in $\mathcal{P}^{[2]}(\Omega)$, there is a word $w_{\{u,v\}}$ taking $\{u, v\}$ to $\{q_0\}$. From the definition of the power automaton, then, we must have $\delta(u, w_{\{u,v\}}) = \delta(v, w_{\{u,v\}})$. So, from Proposition 3.6, $\Omega$ is synchronising.
Proposition 3.9. The DFA $\mathcal{P}^{[2]}(\Omega)$ is synchronising if and only if $\Omega$ is synchronising, and the set of synchronising words for $\mathcal{P}^{[2]}(\Omega)$ is identical to the set of synchronising words for $\Omega$.
Proof. Let $\Omega = \langle Q, \Sigma, \delta \rangle$ be a DFA, and $\mathcal{P}^{[2]}(\Omega) = \langle Q', \Sigma, \delta' \rangle$ be its pair automaton. Suppose that $w \in \Sigma^*$ is a synchronising word for $\Omega$. Then there is some $q_0 \in Q$ such that $\delta(q, w) = q_0$ for all $q \in Q$. Clearly, then, $w$ takes any state in $\mathcal{P}^{[2]}(\Omega)$ to $\{q_0\}$, so $w$ synchronises $\mathcal{P}^{[2]}(\Omega)$. Conversely, suppose $w \in \Sigma^*$ is a synchronising word of $\mathcal{P}^{[2]}(\Omega)$. As above, there must be some $q_0 \in Q$ such that $\delta'(S, w) = \{q_0\}$ for all states $S$ in $\mathcal{P}^{[2]}(\Omega)$. From the definition of $\delta'$, we must then have $\delta(q, w) = q_0$ for all $q \in Q$. So $w$ synchronises $\Omega$. As the sets of synchronising words for $\mathcal{P}^{[2]}(\Omega)$ and $\Omega$ are identical, we must have $\mathcal{P}^{[2]}(\Omega)$ synchronising if and only if $\Omega$ is synchronising.
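The polynomial-time synchronisation test based on the pair automaton can be sketched as below. This is an illustrative Python version, not the PolyIsSynchronisingSemigroup function of Appendix C; the function name and the dictionary encoding of a DFA (letter to state map) are assumptions made for the example. It searches backwards from the singleton states and checks that every 2-element state is reached.

```python
from collections import deque
from itertools import combinations


def is_synchronising(states, delta):
    """Pair-automaton test: every pair of states must admit a unifying word."""
    states = list(states)
    singletons = [frozenset([q]) for q in states]
    pairs = [frozenset(p) for p in combinations(states, 2)]

    # Reversed edges of the pair automaton; only pairs are needed as sources.
    preds = {s: [] for s in singletons + pairs}
    for s in pairs:
        for letter in delta:
            target = frozenset(delta[letter][q] for q in s)
            preds[target].append(s)

    # Backward breadth-first search from the singleton states.
    seen = set(singletons)
    queue = deque(singletons)
    while queue:
        target = queue.popleft()
        for s in preds[target]:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return all(p in seen for p in pairs)


# Assumed encoding of C_4 ('a' cyclic shift, 'b' sends 3 to 0).
C4 = {
    "a": {q: (q + 1) % 4 for q in range(4)},
    "b": {q: (0 if q == 3 else q) for q in range(4)},
}
```

Each of the $O(|Q|^2)$ pair states is expanded once per input symbol, which matches the $O(|Q|^2 |\Sigma|)$ bound quoted above, up to the cost of hashing small sets.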
CHAPTER 4
Pin's Bound on Reset Word Length

In this chapter, we examine the most well-known upper bound on the reset threshold of an $n$-state automaton: $\frac{1}{6}(n^3 - n)$, as proposed by Pin in his 1983 paper [25]. In particular, we follow Pin's argument to place a cubic upper bound on the terminal threshold of any (not necessarily synchronising) DFA. This bound is based upon a greedy algorithm for computing a terminal word of a DFA $\Omega = \langle Q, \Sigma, \delta \rangle$, wherein a (not necessarily optimal) path from $Q$ to a minimal set in the graph of $\mathcal{P}(\Omega)$ is computed. The algorithm maintains its current position in the power automaton, and selects, in each iteration, the shortest word that decreases the size of this state. We present pseudocode for the algorithm:
procedure GreedyTerminalWord($Q$, $\Sigma$, $\delta$)
    $P \leftarrow Q$
    $w \leftarrow \varepsilon$
    while states in $P$ can be unified do
        $v \leftarrow$ the shortest word $v'$ such that $|(P)v'| < |P|$
        $w \leftarrow wv$
    end while
    return $w$
end procedure
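The pseudocode above can be rendered as the following Python sketch. It is our own illustration rather than code from Appendix C: words are strings, a DFA is a dictionary from letters to state maps, ties between equally short words are broken by trying letters in alphabetical order, and `C4` assumes the encoding of the Černý automaton in which a is the cyclic shift and b sends 3 to 0.

```python
from collections import deque


def shortest_shrinking_word(subset, delta):
    """Breadth-first search for a shortest word v with |(P)v| < |P|.

    Returns (v, (P)v), or None if no word shrinks the subset.
    """
    start = frozenset(subset)
    word_to = {start: ""}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for letter in sorted(delta):
            nxt = frozenset(delta[letter][q] for q in current)
            if len(nxt) < len(start):
                return word_to[current] + letter, nxt
            if nxt not in word_to:
                word_to[nxt] = word_to[current] + letter
                queue.append(nxt)
    return None


def greedy_terminal_word(states, delta):
    """Repeatedly append a shortest word that shrinks the current subset."""
    current = frozenset(states)
    word = ""
    while True:
        step = shortest_shrinking_word(current, delta)
        if step is None:  # no states of the current subset can be unified
            return word
        v, current = step
        word += v


# Assumed encoding of C_4 ('a' cyclic shift, 'b' sends 3 to 0).
C4 = {
    "a": {q: (q + 1) % 4 for q in range(4)},
    "b": {q: (0 if q == 3 else q) for q in range(4)},
}
```

With this encoding, `greedy_terminal_word(range(4), C4)` returns `"baababaaab"`, a reset word of length 10, one symbol longer than the optimum of 9.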
If it is not immediately clear that the above procedure produces a terminal word, we refer to the proof of Proposition 6.2, wherein the correctness of a similar algorithm is verified. We derive the stated bound via an analysis of the word returned by the above procedure. In particular, we examine the word $v$ calculated in each iteration of the while loop. Suppose that $P \subseteq Q$ is a state set of size $k$, and that $v = \sigma_1 \sigma_2 \cdots \sigma_t$ (with $\sigma_i \in \Sigma$ for $i = 1, 2, \ldots, t$) is the shortest word such that $|(P)v| < |P|$. Then define $P_1 = P$, $P_2 = (P_1)\sigma_1$, $\ldots$, $P_t = (P_{t-1})\sigma_{t-1}$. The sequence $P_1, P_2, \ldots, P_t, (P)v$, then, gives the shortest path from $P$ to a state smaller than $P$ in the graph of $\mathcal{P}(\Omega)$. Specifically, we must have $|P_i| = k$ for $i = 1, \ldots, t$. Furthermore, as $|(P_t)\sigma_t| = |(P)v| < k$, there must be two distinct states $q_t, q_t' \in P_t$ that $\sigma_t$ unifies. We may then choose, for each $i = 1, \ldots, t - 1$, two distinct states $q_i, q_i' \in P_i$ such that $(q_i)\sigma_i = q_{i+1}$ and $(q_i')\sigma_i = q_{i+1}'$. This configuration is visually represented in Figure 4.1.
Let us denote the set $\{q_i, q_i'\}$ by $R_i$, for $i = 1, \ldots, t$. Then we cannot have $R_j \subseteq P_i$ for $1 \le i < j \le t$. If there were such an $i$ and $j$, then the word $\sigma_j \cdots \sigma_t$ would unify a pair of states in $P_i$, and hence the word $\sigma_1 \cdots \sigma_{i-1} \sigma_j \cdots \sigma_t$ would unify a pair of states in $P$, yet be shorter than $v$. Using this observation, we may place an upper bound on the terminal thresholds that are possible by answering the following purely combinatorial question:
Question. Suppose $Q$ is a size-$n$ set, $P_1, \ldots, P_t$ is a sequence of size-$k$ subsets of $Q$, and $R_1, \ldots, R_t$ is a sequence of size-2 subsets of $Q$. If $R_i \subseteq P_i$ for $1 \le i \le t$, but $R_j \not\subseteq P_i$ for $1 \le i < j \le t$, then how large can $t$ be?
Figure 4.1: The shortest path between a state $P$ of $\mathcal{P}(\Omega)$ and a smaller state $(P)v$.
Frankl's 1982 paper [11] answered this question, proving the tight bound $t \le \binom{n-k+2}{2}$. If the given DFA has rank $r$, then the while-loop in GreedyTerminalWord executes at most $n - r$ times (each iteration shrinks $P$ by at least one state, from size $n$ down to size $r$), and the algorithm will produce a terminal word of length no greater than
$$\sum_{k=r+1}^{n} \binom{n-k+2}{2} = \frac{1}{6}(n - r)(n - r + 1)(n - r + 2)$$
by a standard inductive argument. Specifically, in the case that $r = 1$ (i.e. for a synchronising DFA), we arrive at the known bound of $\frac{1}{6} n(n - 1)(n + 1) = \frac{1}{6}(n^3 - n)$.
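As a quick numerical sanity check on this bound, the sum of Frankl's bound $\binom{n-k+2}{2}$ over the set sizes $k = r + 1, \ldots, n$ can be compared with the closed form $\frac{1}{6}(n - r)(n - r + 1)(n - r + 2)$. The function names in this Python sketch are ours.

```python
from math import comb


def greedy_length_bound(n, r):
    """Sum of Frankl's bound over the set sizes k = r + 1, ..., n."""
    return sum(comb(n - k + 2, 2) for k in range(r + 1, n + 1))


def closed_form(n, r):
    """The closed form (n - r)(n - r + 1)(n - r + 2) / 6."""
    return (n - r) * (n - r + 1) * (n - r + 2) // 6


def pin_bound(n):
    """The r = 1 case, (n^3 - n) / 6."""
    return (n**3 - n) // 6
```

For $n = 4$ and $r = 1$ both expressions evaluate to 10, compared with the Černý value $(4 - 1)^2 = 9$.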
To understand why this process does not yield Černý's upper bound, it is useful to observe its execution upon a Černý automaton. We here apply it to $C_4$, with reference to Figures 1.1 and 3.1. Table 4.1 lists the state of the algorithm after each iteration of the while loop.
Iteration        $v$          $P$                 $w$
Initial state    (none)       $\{0, 1, 2, 3\}$    $\varepsilon$
1st              $b$          $\{0, 1, 2\}$       $b$
2nd              $a^2 b$      $\{0, 2\}$          $b a^2 b$
3rd              $a b a^3 b$  $\{0\}$             $b a^2 b a b a^3 b$
Table 4.1: The state of the variables in GreedyTerminalWord($C_4$) after each iteration of the while loop.
Note that the produced reset word has length 10, rather than the optimal length of 9. This is due to the selection of $v = a^2 b$ in the 2nd iteration of the while loop, which necessitates the use of a much longer word in the 3rd iteration. Only by selecting the locally non-optimal choice of $v = a^3 b$ can the minimal reset word be produced. This illustrates a fundamental difficulty in the analysis of terminal words: a shortest terminal word need not be constructed from locally optimal subwords.
CHAPTER 5
Nuclear Words and Strongly Connected Automata

In this chapter, we reproduce Rystsov's original work [26] to derive a fundamental result about strongly-connected automata. Along the way, we also re-derive the most efficient known algorithm for finding a short terminal word of a DFA. The results presented in this chapter are not fundamentally original, but include a number of original contributions. In particular, we present, and prove the correctness of, a potentially new¹ algorithm for producing nuclear words, and fill in the details of some proofs offered by Rystsov. We begin by codifying the notion of an automaton contained within another.
Definition 5.1. If $\Omega = \langle Q, \Sigma, \delta \rangle$ is a DFA, $Q' \subseteq Q$, and $\Omega' = \langle Q', \Sigma, \delta|_{Q'} \rangle$ is a DFA, then $\Omega'$ is said to be a subautomaton of $\Omega$.
In particular, the graph of $\Omega'$ is induced by $Q'$ in the graph of $\Omega$. For example, $\mathcal{P}^{[2]}(\Omega)$ is a subautomaton of $\mathcal{P}(\Omega)$, and $\Omega$ (identifying each state $q$ with the singleton $\{q\}$) is a subautomaton of $\mathcal{P}^{[2]}(\Omega)$, for any DFA $\Omega$. Note that not all state sets give rise to subautomata. For instance, the state set $Q_1 = \{0, 3\}$ does not give rise to a subautomaton of $A$ (depicted in Figure 1.2), on account of the transition from the state $0 \in Q_1$ to the state $1 \notin Q_1$. Specifically, the transitions of a subautomaton may not leave the state set of the subautomaton. In fact, the only non-trivial subautomaton $A_2$ of $A$ arises from the state set $Q_2 = \{1, 2\}$. We are particularly interested in subautomata wherein every state is reachable from every other. This notion is known as strong connectedness.
Definition 5.2. Define an equivalence relation $R$ on states of a DFA $\Omega = \langle Q, \Sigma, \delta \rangle$ as follows: two states $p, q \in Q$ are related by $R$ if and only if there exist words $w, w' \in \Sigma^*$ such that $(p)w = q$ and $(q)w' = p$. A strongly-connected component of $\Omega$ is an equivalence class of $R$.
In other words, a strongly-connected component of a DFA is a maximal set of states mutually reachable from one another. The state set $Q_1$ of $A$, for example, is a strongly-connected component, as states 0 and 3 are reachable from each other, but not from states 1 or 2. This illustrates that even a strongly-connected component need not be a subautomaton.
Definition 5.3. A DFA is called strongly-connected if its entire state set forms a strongly-connected component.
Any Černý automaton, for instance, is strongly-connected (the input symbol $a$ cyclically permutes its state set), as is each known edge case for the Černý conjecture, and the subautomaton $A_2$ of $A$. We now have the machinery in place to describe a particularly important subautomaton of any DFA: its nucleus.
¹ Our algorithm meets the specification of an existing algorithm presented by Rystsov in a Russian-language paper which we were unable to obtain.
Definition 5.4. The nucleus of a DFA is the union of its strongly-connected subautomata.
From the discussion above, it is clear that any Černý automaton is its own nucleus, and that $A_2$ is the nucleus of $A$. To illustrate some of the subtleties of this definition, consider the DFA $B$ in Figure 5.1.
Figure 5.1: The graph of a DFA $B$ with an illustrative nucleus.
The state 0 is not part of any strongly-connected subautomaton, but each non-zero state is in a singleton strongly-connected subautomaton of its own; a single state with a single transition to itself is an automaton, which is trivially strongly-connected. Hence, the nucleus of $B$ arises from the state set $\{1, 2\}$.
The nucleus of a DFA, as a union of subautomata, is clearly a subautomaton of the DFA. We call a state in the nucleus a nuclear state, a word nuclear if it takes every state to a nuclear state, and the set of non-nuclear states the conucleus. So, to refer back to the DFA $B$ above: states 1 and 2 are nuclear states, 0 is in the conucleus (in fact, $\{0\}$ is the complete conucleus), and the word $a$ is nuclear.
We now derive an algorithm for computing a short nuclear word of a DFA. We must begin, then, by describing an algorithm to find the nucleus of a DFA. Each algorithm defined below operates on a DFA $\Omega = \langle Q, \Sigma, \delta \rangle$, with nucleus $N$ and conucleus of size $m$.
Identifying strongly-connected components in a graph is a commonly encountered task, for which there exist well-known linear-time algorithms, such as Tarjan's algorithm. The next result shows that these algorithms can also be used to identify strongly-connected subautomata.
Lemma 5.5. Any strongly-connected subautomaton is a strongly-connected component that is also a subautomaton.
Proof. Suppose $\Omega = \langle Q, \Sigma, \delta \rangle$ is a DFA, and $Q' \subseteq Q$ is the state set of some strongly-connected subautomaton $\Omega'$ of $\Omega$. As $\Omega'$ is a strongly-connected DFA, $Q'$ is a set of states mutually reachable from one another. To show that $Q'$ is a strongly-connected component of $\Omega$, then, we must establish that $Q'$ is a maximal set of states mutually reachable from one another. Suppose that $q' \in Q'$ and $q \in Q$ are states mutually reachable from one another. Then, as $\Omega'$ is a subautomaton and $q$ is reachable from some state in $\Omega'$, we must have $q \in Q'$. Hence, $Q'$ is maximal, and a strongly-connected component of $\Omega$.
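Hence, we present the following algorithm for identifying the nucleus of a DFA $\Omega$.
procedure Nucleus($Q$, $\Sigma$, $\delta$)
    $C \leftarrow$ StronglyConnectedComponents($Q$, $\Sigma$, $\delta$)
    $N \leftarrow \emptyset$
    for $c \in C$ do
        if $\{\delta(q, \sigma) : q \in c, \sigma \in \Sigma\} \subseteq c$ then
            $N \leftarrow N \cup c$
        end if
    end for
    return $N$
end procedure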
The algorithm proceeds by finding the strongly-connected components of $\Omega$, and taking the union of all such components that are also subautomata.
Proposition 5.6. The Nucleus procedure finds the nucleus of a DFA in $O(|Q||\Sigma|)$ time.
Proof. The fact that the Nucleus procedure identifies all strongly-connected subautomata follows from Lemma 5.5. The specified time complexity follows from the existence of linear-time algorithms for identifying strongly-connected components, from the fact that each state transition is examined once, and from the fact that each state is added to the nucleus set at most once.
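The Nucleus procedure can be sketched in Python as follows. For brevity this sketch swaps Tarjan's linear-time algorithm for a plain quadratic reachability computation, so it illustrates the logic of the procedure but not its $O(|Q||\Sigma|)$ running time. The four-state DFA `EXAMPLE` is hypothetical: it is chosen so that $\{0, 3\}$ is a strongly-connected component that is not a subautomaton, in the spirit of the DFA $A$, and is not a transcription of Figure 1.2.

```python
def reachable_from(q, delta):
    """States reachable from q (including q itself)."""
    seen = {q}
    stack = [q]
    while stack:
        p = stack.pop()
        for letter in delta:
            nxt = delta[letter][p]
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def strongly_connected_components(states, delta):
    """Quadratic-time stand-in for Tarjan's algorithm."""
    reach = {q: reachable_from(q, delta) for q in states}
    components, assigned = [], set()
    for q in states:
        if q not in assigned:
            component = frozenset(p for p in reach[q] if q in reach[p])
            components.append(component)
            assigned |= component
    return components


def nucleus(states, delta):
    """Union of the components that no transition leaves."""
    result = set()
    for component in strongly_connected_components(states, delta):
        if all(delta[letter][q] in component for q in component for letter in delta):
            result |= component
    return result


# Hypothetical DFA: {0, 3} is a component, but 0 --b--> 1 leaves it.
EXAMPLE = {
    "a": {0: 3, 3: 0, 1: 2, 2: 1},
    "b": {0: 1, 3: 3, 1: 1, 2: 2},
}
```

Here `nucleus(range(4), EXAMPLE)` evaluates to `{1, 2}`, since the component $\{0, 3\}$ is rejected by the closure test.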
This next procedure finds, for each state, a word taking that state to the nucleus. We will see that a nuclear word can be constructed from the combination of such words. We first require, however, the following result.
Lemma 5.7. If $\Omega$ is a DFA with conucleus of size $m$, then any state in $\Omega$ can be taken to the nucleus by a word of length no greater than $m$.
Proof. First we show that there is a path from any conuclear state to a nuclear state. Suppose $q_1$ is a state in the conucleus of $\Omega$. As every state is part of some strongly-connected component, $q_1$ is a member of a strongly-connected component $C_1$. By assumption, $C_1$ is not also a subautomaton (otherwise, $q_1$ would be in the nucleus). So there is some state $q_2$ reachable from $q_1$ such that $q_2 \notin C_1$. Now, $q_2$ is part of some strongly-connected component $C_2 \ne C_1$. If $C_2$ is also a subautomaton, then we have found a path from $q_1$ to a nuclear state and need not continue. Assume this is not the case. Then, as before, we can find a state $q_3$ reachable from $q_2$ and a strongly-connected component $C_3 \ne C_2$ with $q_3 \in C_3$. We can continue in this manner until we select a state $q_k$ in a strongly-connected subautomaton, or until we must select a state $q_k$ such that $q_k \in C_j$ for some $j < k$. Note that there is no other eventuality, as $\Omega$ contains a finite number of strongly-connected components (i.e. the sequence $C_1, C_2, \ldots, C_k$ must contain a repeated element, if $k$ is large enough). In the latter case, $q_{j+1}$ is reachable from any state in $C_j$ (as $q_{j+1}$ is reachable from $q_j \in C_j$), and any state in $C_j$ is reachable from $q_{j+1}$ (as $q_k \in C_j$ is reachable from $q_{j+1}$). Hence, $q_{j+1} \in C_j$, a contradiction. So the process described above must eventually produce a path from the arbitrary state $q_1$ of $\Omega$ to a nuclear state $q_k$.
Having established that such a path exists, it is simple to place a bound on its length. As there are $m$ conuclear states, and any shortest path from a conuclear state to a nuclear state consists of a sequence of edges leaving distinct conuclear states, no shortest path from a conuclear state to a nuclear state may exceed $m$ edges in length.
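procedure NucleusTransitions($Q$, $\Sigma$, $\delta$)
    $N \leftarrow$ Nucleus($Q$, $\Sigma$, $\delta$)
    $S \leftarrow$ queue containing $N$
    $P \leftarrow$ map undefined on all states
    for $q \in N$ do
        $P[q] \leftarrow \varepsilon$
    end for
    while not Empty($S$) do
        $q \leftarrow$ Dequeue($S$)
        for $p \in Q$, $\sigma \in \Sigma$ with $\delta(p, \sigma) = q$ and $P[p]$ undefined do
            $P[p] \leftarrow \sigma P[q]$
            Enqueue($S$, $p$)
        end for
    end while
    return $P$
end procedure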
The procedure performs a breadth-first search in the transposed graph of the automaton (i.e. the graph of the DFA, with edge-transitions reversed). Proceeding from the set of nucleus states, it generates words taking each non-nuclear state back to the nucleus.
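A Python sketch of this reversed breadth-first search is given below. It departs from the pseudocode in two labelled ways: the nucleus is passed in as an argument rather than recomputed, and words are stored as strings rather than as transformations. The DFA `CHAIN` is a hypothetical example whose nucleus is $\{3\}$, so that $m = 3$.

```python
from collections import deque


def nucleus_transitions(states, delta, nucleus):
    """Map each state to a shortest word taking it into `nucleus`."""
    # Reversed transitions: preds[q] lists (p, letter) with delta(p, letter) = q.
    preds = {q: [] for q in states}
    for letter in sorted(delta):
        for p in states:
            preds[delta[letter][p]].append((p, letter))

    word_to_nucleus = {q: "" for q in nucleus}
    queue = deque(sorted(nucleus))
    while queue:
        q = queue.popleft()
        for p, letter in preds[q]:
            if p not in word_to_nucleus:
                # Prepend the letter: p --letter--> q, then follow q's word.
                word_to_nucleus[p] = letter + word_to_nucleus[q]
                queue.append(p)
    return word_to_nucleus


# Hypothetical DFA: 'a' walks 0 -> 1 -> 2 -> 3 and then stays at 3.
CHAIN = {
    "a": {0: 1, 1: 2, 2: 3, 3: 3},
    "b": {0: 0, 1: 0, 2: 2, 3: 3},
}
WORDS = nucleus_transitions(range(4), CHAIN, {3})
```

For `CHAIN` the result is `{3: "", 2: "a", 1: "aa", 0: "aaa"}`, so no word is longer than the conucleus size $m = 3$.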
Proposition 5.8. The NucleusTransitions procedure identifies, for each state, a word of length no greater than $m$ taking that state to a nuclear state, in $O(|Q|^2 |\Sigma|)$ time.
Proof. The following invariant is maintained during execution of the above procedure: immediately before state $p$ is processed by an iteration of the while-loop, $P[p]$ contains a shortest word taking $p$ to the nucleus. Clearly the invariant is satisfied for each state $p$ in the nucleus; $P[p]$ is initially set to contain the empty word, and does not change during the execution of the procedure. As the algorithm proceeds via a breadth-first search on the transposed automaton, states are visited in increasing order of distance from a nuclear state. Hence, we argue by induction on distance from a nuclear state.
Suppose $p \in Q$ is any state in the conucleus of the DFA, and that the given invariant holds for all states closer to the nucleus than $p$. From Lemma 5.7, each state can be taken to the nucleus by some word. Let $q \in Q$ be the first-processed state that is a direct successor to $p$ on a shortest path from $p$ to the nucleus. That is, let $q$ be the first-processed state such that $q = (p)\sigma$, where $\sigma w$ (with $\sigma \in \Sigma$ and $w \in \Sigma^*$) is some shortest word taking $p$ to the nucleus. Note, then, that $q$ is closer to the nucleus than $p$, and that $w$ is a shortest word taking $q$ to the nucleus. As states are visited in increasing order of distance from a nuclear state, $q$ is processed before $p$ in the execution of NucleusTransitions. From the inductive hypothesis, before $q$ is processed, $P[q]$ contains a shortest word $v \in \Sigma^*$ taking $q$ to the nucleus (i.e. $|v| = |w|$).
Furthermore, before the processing of $q$, we must have $P[p]$ undefined. For the sake of contradiction, suppose this is not the case. Then there is some state $q' \in Q$, processed before $q$, such that $q' = (p)\sigma'$ for some symbol $\sigma' \in \Sigma$. Inductively, $v' = P[q']$ is a shortest word taking $q'$ to the nucleus. As $q'$ is processed before $q$, we must have that $|v'| \le |v|$. Hence the word $\sigma' v'$ is a shortest word taking $p$ to the nucleus. This contradicts the assumption that $q$ is the first-processed state that is a direct successor to $p$ on a shortest path from $p$ to the nucleus.
So, during the processing of $q$, $P[p]$ is undefined and is hence set to $\sigma P[q] = \sigma v$. Note that $(p)\sigma v = (q)v$, so $P[p]$ takes $p$ to the nucleus. Furthermore, $\sigma v$ is a shortest word taking $p$ to the nucleus, as $|v| = |w|$. No defined entry in $P$ is ever modified, so $P[p]$ still contains a shortest word taking $p$ to the nucleus immediately before the processing of state $p$. As proven in Lemma 5.7, each such word must be of length no greater than $m$. Hence, the correctness of the algorithm follows inductively.
Before analysing the time efficiency of the algorithm, we pause to comment on the implementation of word concatenation. In this dissertation, we assume that concatenating two words $u, v \in \Sigma^*$ entails computing the transformation $uv$, that is, the image $(q)uv$ of each state $q \in Q$. This mirrors the implementations given in Appendix C. So prepending an input symbol to a word is an $O(|Q|)$ operation. During execution of the algorithm, every transition from every state is examined exactly once, every state is enqueued exactly once, and two words are composed (an $O(|Q|)$ operation) only when a state is enqueued. Hence, the time complexity of the algorithm is $O(|Q|^2 |\Sigma|)$.
Finally, we detail an algorithm for generating a nuclear word. Having found words taking each state to the nucleus, the procedure below combines them into one nuclear word.
procedure NuclearWord($Q$, $\Sigma$, $\delta$)
    $P \leftarrow$ NucleusTransitions($Q$, $\Sigma$, $\delta$)
    $w \leftarrow \varepsilon$
    for $q \in Q$ do
        $w \leftarrow w P[(q)w]$
    end for
    return $w$
end procedure
Before proving the correctness and efficiency of the NuclearWord procedure, we must note a simple fact. As the nucleus of a DFA is a subautomaton, it follows that any state taken to the nucleus by a word $w$ will also be taken to the nucleus by any word that has $w$ as a prefix.
Proposition 5.9. The NuclearWord procedure returns a nuclear word of length no greater than $m^2$ in $O(|Q|^2 |\Sigma|)$ time.
Proof. To begin with, we note that the processing of each nuclear state does not affect the word returned by the NuclearWord procedure. Suppose $N$ is the nucleus of $\Omega$. As $P[q] = \varepsilon$ for each state $q \in N$, and $(q)w \in N$ for any $w \in \Sigma^*$, $q \in N$, we have $w P[(q)w] = w\varepsilon = w$ for each $q \in N$.
Hence, the returned word consists of the concatenation of (at most) $m$ words, as there are $m$ conuclear states in $\Omega$. As each of these words is no greater than $m$ symbols long, the output of the NuclearWord procedure is a word of length no greater than $m^2$. Suppose that the conuclear states of $\Omega$ are labelled $q_1, q_2, \ldots, q_m$, and that $q_i$ is the $i$th conuclear state processed by the for-loop in the procedure. To prove that the returned word is nuclear, we verify the following inductive claim: after $q_k$ has been processed, the accumulated word takes $q_j$ to the nucleus, for all $j$ such that $1 \le j \le k$.
After $q_1$ has been processed, it is clear that the accumulated word takes $q_1$ to the nucleus (from the correctness of the NucleusTransitions procedure). Suppose, then, that the inductive hypothesis holds true for $k$. That is, suppose the accumulated word $w$ takes all $q_j$ to the nucleus for $1 \le j \le k$. After $q_{k+1}$ has been processed, the accumulated word $w'$ is $w P[(q_{k+1})w]$. Consider the effect of applying the word $w'$ to the state $q_{k+1}$: the application of $w$ takes $q_{k+1}$ to $(q_{k+1})w$, and the succeeding application of $P[(q_{k+1})w]$ takes $(q_{k+1})w$ to the nucleus. Furthermore, from the fact noted above, $w'$ takes each $q_j$ to the nucleus for $1 \le j \le k$. So the inductive hypothesis holds for $k + 1$. Hence, the correctness of the algorithm follows inductively.
The for-loop performs one $O(|Q|)$ word composition per state in the automaton, leaving the $O(|Q|^2 |\Sigma|)$ complexity of the NucleusTransitions procedure to dominate the running time of the algorithm.
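The NuclearWord procedure can be sketched in Python as below. As before, this is an illustration under our own assumptions rather than the Appendix C code: the nucleus is supplied by the caller, words are strings, and `CHAIN` is a hypothetical DFA with nucleus $\{3\}$. The helper `words_to_nucleus` plays the role of NucleusTransitions.

```python
from collections import deque


def apply_word(q, word, delta):
    """The state (q)word."""
    for letter in word:
        q = delta[letter][q]
    return q


def words_to_nucleus(states, delta, nucleus):
    """Shortest word from each state into the nucleus (reversed BFS)."""
    preds = {q: [] for q in states}
    for letter in sorted(delta):
        for p in states:
            preds[delta[letter][p]].append((p, letter))
    words = {q: "" for q in nucleus}
    queue = deque(sorted(nucleus))
    while queue:
        q = queue.popleft()
        for p, letter in preds[q]:
            if p not in words:
                words[p] = letter + words[q]
                queue.append(p)
    return words


def nuclear_word(states, delta, nucleus):
    """Accumulate w <- w P[(q)w] over all states q."""
    to_nucleus = words_to_nucleus(states, delta, nucleus)
    word = ""
    for q in states:
        word += to_nucleus[apply_word(q, word, delta)]
    return word


# Hypothetical DFA with nucleus {3} and conucleus {0, 1, 2}.
CHAIN = {
    "a": {0: 1, 1: 2, 2: 3, 3: 3},
    "b": {0: 0, 1: 0, 2: 2, 3: 3},
}
WORD = nuclear_word(range(4), CHAIN, {3})
```

For `CHAIN` the result is `"aaa"`: state 0 contributes the whole word, and every later state is already inside the nucleus after reading it, so it contributes the empty word.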
We have proved, by construction, the following result.
Corollary 5.10. A DFA with conucleus of size $m$ admits a nuclear word of length no greater than $m^2$.
Furthermore, we are now in a position to appreciate the relationship between the nucleus of a DFA and its rank.
Corollary 5.11. If $\Omega = \langle Q, \Sigma, \delta \rangle$ is a DFA with nucleus $N$, then every terminal word of $\Omega$ maps $Q$ into $N$, and the rank of $\Omega$ is equal to the rank of $N$.
Proof. Let $w \in \Sigma^*$ be a terminal word for $\Omega$, and suppose, for the sake of contradiction, that $(Q)w \not\subseteq N$. From Proposition 5.9, there exists a nuclear word $w'$ mapping every state to $N$. Then $(Q)w'w \subsetneq (Q)w$, a contradiction. So $w$ maps $Q$ into $N$. Furthermore, $w$ is a terminal word for $N$. If this was not the case, then appending some terminal word of $N$ to $w$ would result in a lower rank word for $\Omega$, also a contradiction. Hence, the rank of $\Omega$ is equal to the rank of $N$.
Rystsov notes that a short nuclear word can be generated in $O(|Q|^2 |\Sigma|)$ time, and justifies this claim with reference to a result in one of his earlier (Russian) papers. The algorithm described above meets the specification given by Rystsov, but we do not know whether the two algorithms coincide. Corollary 5.10 is required for the following important theoretical result.
Theorem 5.12. If Pin's conjecture holds for all strongly-connected DFAs, then it holds for all DFAs.
Proof. Assume that Pin's conjecture holds for all strongly-connected DFAs, and suppose $\Omega = \langle Q, \Sigma, \delta \rangle$ is a DFA, with nucleus $N$ and conucleus of size $m$. We first show that Pin's conjecture holds for the nucleus of $\Omega$. Suppose $N$ has size $n$, rank $r$, and is the union of $k$ strongly-connected DFAs $N_1, \ldots, N_k$, where $N_i$ has size $n_i$ and rank $r_i$ for $i = 1, \ldots, k$. Then $n = \sum_{i=1}^{k} n_i$ and $r = \sum_{i=1}^{k} r_i$. By assumption, $N_i$ admits a terminal word of length no greater than $(n_i - r_i)^2$, for $i = 1, \ldots, k$. By concatenating terminal words for each subautomaton, we obtain a terminal word for $N$, of length at most
$$\sum_{i=1}^{k} (n_i - r_i)^2 = \sum_{i=1}^{k} n_i^2 - 2\sum_{i=1}^{k} n_i r_i + \sum_{i=1}^{k} r_i^2 \le n^2 - 2nr + r^2 = (n - r)^2,$$
where the inequality holds because each $n_i - r_i$ is non-negative, so the sum of their squares is at most the square of their sum, $n - r$. Hence Pin's conjecture holds for the nucleus $N$ of $\Omega$.
We now show that Pin's conjecture holds for $\Omega$. Firstly, note that $\Omega$ has $n' = m + n$ states, and rank $r$ (from Corollary 5.11). We can construct a terminal word for $\Omega$ by appending a terminal word for $N$ to a nuclear word for $\Omega$. From Corollary 5.10, there exists a nuclear word for $\Omega$ of size no greater than $m^2$, and as shown above, there is a terminal word for $N$ of size no greater than $(n - r)^2$. Hence, the constructed terminal word for $\Omega$ has length no greater than
$$m^2 + (n - r)^2 = m^2 + (n' - m - r)^2 \le (n' - m - r + m)^2 = (n' - r)^2.$$
So Pin's conjecture holds for the arbitrary DFA $\Omega$.
A corollary of this fact, which proves rather useful in the computational search for interesting automata, is that the nucleus of any counter-example to the Černý or Pin conjectures must also be a counter-example to the Černý or Pin conjectures. Hence searches for such a counter-example are usually limited to strongly-connected automata, that is, automata that are equal to their nucleus.
As well as justifying the above result, the NuclearWord procedure can also be used to generate a terminal word of a DFA $\Omega$. We produce a new automaton whose nuclear words correspond precisely to the terminal words of the given automaton. In order to perform this construction, we must quotient one automaton by another. If $\Omega$ is a DFA, and $Q'$ is the state set of some subautomaton $\Omega'$ of $\Omega$, then the quotient automaton $\Omega \setminus \Omega'$ is the automaton formed by replacing the states in $Q'$ of $\Omega$ with one new state $q'$, and leaving the transitions between all other states unchanged. More precisely, quotienting is defined as follows.
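Definition 5.13. If $\Omega = \langle Q, \Sigma, \delta \rangle$ is a DFA, with subautomaton $\Omega' = \langle Q', \Sigma, \delta' \rangle$, then the automaton $\Omega \setminus \Omega' = \langle (Q \setminus Q') \cup \{q'\}, \Sigma, \delta'' \rangle$ with $q' \notin Q$ and
$$\delta''(q, \sigma) = \begin{cases} q' & \text{if } q = q' \text{ or } \delta(q, \sigma) \in Q', \\ \delta(q, \sigma) & \text{otherwise} \end{cases}$$
is called the quotient automaton of $\Omega$ by $\Omega'$.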
Figure 5.2: The graph of $A \setminus A'$, the quotient automaton of the DFA $A$ by the subautomaton $A'$ with state set $\{1, 2\}$.
By way of example, Figure 5.2 depicts $A \setminus A'$, the quotient of the automaton $A$, from Figure 1.2, by the subautomaton $A'$ with state set $\{1, 2\}$. The states 1 and 2 of $A$ have been collapsed into a new state $q'$, and all transitions into states 1 or 2 have been replaced with transitions into $q'$. Notice, in particular, that since $A'$ is a subautomaton, there are no transitions from states 1 or 2 to any states outside of $A'$. Hence, we are faced with no dilemmas when defining transitions out of the new state $q'$; we may set each such transition to map to $q'$ itself.
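The quotient construction of Definition 5.13 can be sketched in Python as follows. The function name, the label `"q'"` for the new state, and the four-state DFA `EXAMPLE` are all assumptions made for illustration; `EXAMPLE` has $\{1, 2\}$ as a subautomaton but is not a transcription of the DFA $A$ of Figure 1.2.

```python
def quotient(states, delta, sub_states, new_state="q'"):
    """Collapse the subautomaton on `sub_states` into the single state `new_state`."""
    sub_states = set(sub_states)
    # The construction only makes sense if no transition leaves sub_states.
    if any(row[q] not in sub_states for row in delta.values() for q in sub_states):
        raise ValueError("sub_states is not closed under the transitions")

    kept = [q for q in states if q not in sub_states] + [new_state]
    new_delta = {}
    for letter, row in delta.items():
        new_row = {}
        for q in kept:
            if q == new_state or row[q] in sub_states:
                new_row[q] = new_state  # absorbed by the collapsed state
            else:
                new_row[q] = row[q]     # transition left unchanged
        new_delta[letter] = new_row
    return kept, new_delta


# Hypothetical DFA in which {1, 2} is closed under both letters.
EXAMPLE = {
    "a": {0: 3, 3: 0, 1: 2, 2: 1},
    "b": {0: 1, 3: 3, 1: 1, 2: 2},
}
KEPT, QUOTIENT = quotient(range(4), EXAMPLE, {1, 2})
```

In the result, the transition from state 0 under b, which originally entered $\{1, 2\}$, now leads to `"q'"`, and both letters fix `"q'"`.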
Suppose $\Omega$ is a DFA, with pair automaton $\mathcal{P}^{[2]}(\Omega)$. As stated above, $\Omega$ is a subautomaton of $\mathcal{P}^{[2]}(\Omega)$. Further, the set of states in $\mathcal{P}^{[2]}(\Omega)$ from which all states of $\Omega$ are unreachable gives rise to another subautomaton $D$ of $\mathcal{P}^{[2]}(\Omega)$. Then the subautomaton $\Omega \cup D$ of $\mathcal{P}^{[2]}(\Omega)$ contains the nucleus of $\mathcal{P}^{[2]}(\Omega)$. To see why this is the case, consider any strongly-connected subautomaton $S$ of $\mathcal{P}^{[2]}(\Omega)$. Suppose firstly that $S$ contains a 1-element state. Then, as $S$ is strongly-connected, and no 2-element state is reachable from a 1-element state, $S$ may only contain 1-element states. So $S$ is contained in $\Omega$. Alternately, if $S$ does not contain a 1-element state, then no state of $\Omega$ is reachable from any state in $S$. Hence, $S$ is contained in $D$.
Define the quotient pair automaton as $\mathcal{P}^{[2]}(\Omega) \setminus (\Omega \cup D)$. From the discussion above, it is clear that the nuclear words of the quotient pair automaton are precisely those words that map every state of $\mathcal{P}^{[2]}(\Omega)$ into $\Omega \cup D$. That is, the quotient pair automaton has a single nuclear state. Furthermore, it exhibits the property we desire.
Proposition 5.14. A word of a DFA is terminal if and only if it is nuclear in the quotient pair automaton.
Proof. Let $\Omega = \langle Q, \Sigma, \delta \rangle$ be a DFA, $w \in \Sigma^*$ be a terminal word of $\Omega$, and $p, q \in Q$ be states of $\Omega$. Suppose firstly that $w$ unifies $p$ and $q$. Then $w$ takes the state $\{p, q\}$ of $\mathcal{P}^{[2]}(\Omega)$ into the subautomaton $\Omega$. Alternately, suppose that $(p)w \ne (q)w$. There can be no word, then, that unifies $(p)w$ and $(q)w$; if there was such a word $w'$, then the word $ww'$ would have rank less than $w$, contradicting the fact that $w$ is terminal. Hence, no state in $\Omega$ is reachable from the state $(\{p, q\})w$ of $\mathcal{P}^{[2]}(\Omega)$, and so $w$ takes $\{p, q\}$ to a state in $D$. So $w$ is a nuclear word of the quotient pair automaton.
Conversely, suppose $w$ is a nuclear word of the quotient pair automaton. Then $w$ maps any state $\{p, q\}$ of $\mathcal{P}^{[2]}(\Omega)$ to $\Omega$ or $D$. Suppose, for the sake of contradiction, that $w$ is not terminal. That is, suppose there is a word $w' \in \Sigma^*$ such that $ww'$ has rank less than $w$. Then $|(Q)ww'| < |(Q)w|$, and hence there must be two distinct states $(p)w, (q)w \in (Q)w$ that $w'$ unifies. However, as $(p)w \ne (q)w$, $w$ must map the state $\{p, q\}$ of $\mathcal{P}^{[2]}(\Omega)$ to a state in $D$. Then, as $D$ is a subautomaton containing only 2-element states, $ww'$ must also map $\{p, q\}$ to a 2-element state in $\mathcal{P}^{[2]}(\Omega)$. So we have $(p)ww' \ne (q)ww'$, a contradiction. Therefore, no word can have rank less than $w$ and, hence, $w$ is a terminal word of $\Omega$.
So we can compute a terminal word for a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$ by using the algorithm described above to generate a nuclear word for its quotient pair automaton. The quotient pair automaton can be constructed in $O(|Q|^2|\Sigma|)$ time, and has $O(|Q|^2)$ states. Hence, the final algorithm requires $O(|Q|^4|\Sigma|)$ time, and produces a terminal word of length less than $|Q|^4$. This also gives a (weak) upper bound on the length of shortest terminal words. We provide, in Appendix C, a GAP implementation RystsovTerminalWordOfSemigroup of the procedure described in this chapter.
CHAPTER 6
Eppstein's Reset Word Algorithm

In this chapter, we discuss the best known technique for computing a reset word of a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$: Eppstein's $O(|Q|^3 + |Q|^2|\Sigma|)$ time algorithm [10]. We find that, with a small modification, this algorithm can be used to identify a terminal word without an increase in time complexity. Our primary original contributions in this chapter are the identification of this small modification, a detailed proof of the algorithm's correctness, and an implementation of the algorithm in question.
Much like Pin's algorithm (discussed in Chapter 4), Eppstein's algorithm attempts to construct a (not necessarily optimal) path from state $Q$ to a singleton set in $\mathcal{P}(\mathcal{A})$, by iteratively selecting a word $v$ that maps the current state set $P$ to a smaller state set $(P)v$. Where the two algorithms differ, however, is in their selection of such an intermediate word. Pin's algorithm attempts to minimise the length of the returned reset word by choosing $v$ to be the shortest word that decreases the size of $P$, while Eppstein's algorithm, for the sake of time efficiency, arbitrarily chooses two states in $P$, and sets $v$ to be the shortest word that unifies these states. Eppstein's algorithm, then, can be more precisely specified as follows:
procedure ResetWord($Q$, $\Sigma$, $\delta$)
    $P \gets Q$
    $w \gets \varepsilon$
    while $|P| > 1$ do
        $\{s, t\} \gets$ an arbitrary pair of distinct states in $P$
        $v \gets$ the shortest word unifying $s$ and $t$
        $w \gets wv$
        $P \gets (P)v$
    end while
    return $w$
end procedure
It is clear that the returned word (if it can be constructed) maps $Q$ to a singleton set in $\mathcal{P}(\mathcal{A})$. From Proposition 3.4, then, such a word is a reset word for $\mathcal{A}$. To verify that the above algorithm may always proceed on a synchronising DFA, we need only appeal to Proposition 3.6, which guarantees that a word $v$ may be successfully selected in each iteration of the while-loop.
We present a slight generalisation of Eppstein's algorithm, designed to compute a terminal word of any DFA, instead of a reset word of a synchronising DFA. The algorithm begins, as Eppstein's does, by constructing a unifying word for each pair of states from $Q$. To do so, it utilises the pair automaton $\mathcal{P}^{[2]}(\mathcal{A})$ of $\mathcal{A}$. With a breadth-first search through the transposition of $\mathcal{P}^{[2]}(\mathcal{A})$ (that is, the pair automaton with every transition reversed), the shortest word taking each pair state to a singleton state can be constructed. More precisely, we present the following algorithm to construct, for each pair of states $s, t \in Q$ that can be unified, a word $U[\{s, t\}]$ unifying $s$ and $t$:
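procedure UnifyingWords($Q$, $\Sigma$, $\delta$)
    $\langle Q', \Sigma, \delta' \rangle \gets$ PairAutomaton($Q$, $\Sigma$, $\delta$)
    $S \gets$ queue containing $\{u\}$ for $u \in Q$
    $U \gets$ map undefined on all state sets
    for $u \in Q$ do
        $U[\{u\}] \gets \varepsilon$
    end for
    while not Empty($S$) do
        $q \gets$ Dequeue($S$)
        for $q' \in Q'$, $\sigma \in \Sigma$ with $\delta'(q', \sigma) = q$ and $U[q']$ undefined do
            $U[q'] \gets \sigma U[q]$
            Enqueue($S$, $q'$)
        end for
    end while
    return $U$
end procedure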
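The breadth-first search above can be sketched in Python as follows. This is a minimal illustration rather than the GAP implementation from Appendix C: the function name `unifying_words`, the representation of the DFA (states `0..n-1`, with `delta[q][a]` the successor of `q` under symbol `a`) and the use of strings for words are all assumptions made for the example.

```python
from collections import deque
from itertools import combinations


def unifying_words(n, alphabet, delta):
    """Shortest unifying word for every pair of states that can be unified.

    Hypothetical representation: states are 0..n-1 and delta[q][a] is the
    successor of state q under symbol a. Words are plain strings.
    """
    singletons = [frozenset({u}) for u in range(n)]
    pairs = [frozenset(c) for c in combinations(range(n), 2)]

    # Transposed pair automaton: preds[p] lists (a, r) such that r maps to p under a.
    preds = {}
    for r in singletons + pairs:
        for a in alphabet:
            image = frozenset(delta[q][a] for q in r)
            preds.setdefault(image, []).append((a, r))

    # Breadth-first search outwards from the singleton states.
    words = {s: "" for s in singletons}
    queue = deque(singletons)
    while queue:
        q = queue.popleft()
        for a, r in preds.get(q, []):
            if r not in words:            # only undefined entries are set
                words[r] = a + words[q]   # prepend the symbol, as in the pseudocode
                queue.append(r)

    # Report only the genuine pairs; pairs that cannot be unified are absent.
    return {p: w for p, w in words.items() if len(p) == 2}
```

For the three-state automaton with `a` acting as the cycle 0, 1, 2 and `b` sending state 2 to state 0 (fixing the others), the table maps the pair {0, 2} to `"b"`, {1, 2} to `"ab"` and {0, 1} to `"aab"`.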
Proposition 6.1. The UnifyingWords procedure returns, for each pair of distinct states $\{s, t\} \subseteq Q$ that can be unified, the shortest word $U[\{s, t\}]$ unifying $s$ and $t$, in $O(|Q|^3 + |Q|^2|\Sigma|)$ time.
Proof. The proof of this proposition proceeds in a manner similar to the proof of Proposition 5.8. We verify that the following invariant holds during the execution of the algorithm: immediately before state $p \in Q'$ is processed by an iteration of the while-loop, $U[p]$ contains a shortest word taking $p$ to a singleton state of $\mathcal{P}^{[2]}(\mathcal{A})$.
Before continuing, we note that two distinct states $s, t \in Q$ can be unified if and only if there is some path from the state $\{s, t\}$ of $\mathcal{P}^{[2]}(\mathcal{A})$ to a singleton set. As UnifyingWords performs a breadth-first search proceeding from the singleton states in the transposition of $\mathcal{P}^{[2]}(\mathcal{A})$, a state $p \in Q'$ is eventually processed if and only if there is a path from $p$ to a singleton state. Furthermore, such states are processed in increasing order of distance from a singleton state. Hence we prove the above claim inductively, on distance from a singleton state.
Clearly, every singleton state $\{u\} \subseteq Q$ of $\mathcal{P}^{[2]}(\mathcal{A})$ is mapped to a singleton state by $\varepsilon$. Note that $U[\{u\}]$ is set to $\varepsilon$ for each $u \in Q$ at the beginning of the algorithm, and cannot be modified during the remaining steps (as only undefined entries in $U$ are updated). So the claim holds for all singleton states. Suppose, then, that $p \in Q'$ is a pair state of $\mathcal{P}^{[2]}(\mathcal{A})$, and that the given invariant holds for those states of $\mathcal{P}^{[2]}(\mathcal{A})$ closer to a singleton state than $p$. Let $q \in Q'$ be the first-processed state that is a direct successor of $p$ on a shortest path from $p$ to a singleton set. That is, let $q$ be the first-processed state such that $q = (p)\sigma$, where $\sigma w$ (with $\sigma \in \Sigma$ and $w \in \Sigma^*$) is some shortest word taking $p$ to a singleton state. Then $q$ is closer to a singleton state than $p$, and $w$ is a shortest word taking $q$ to a singleton state. As states are visited in increasing order of distance from a singleton state, $q$ is processed before $p$ in the execution of UnifyingWords. From the inductive hypothesis, before $q$ is processed, $U[q]$ contains a shortest word $v$ taking $q$ to a singleton state (i.e. $|v| = |w|$).
At the point at which $q$ is processed by an iteration of the while-loop, we must have $U[p]$ undefined. For the sake of contradiction, suppose that this was not the case. Then there must exist some state $q' \in Q'$, processed before $q$, such that $q' = (p)\sigma'$ for some $\sigma' \in \Sigma$. From the inductive hypothesis, $v' = U[q']$ is some shortest word taking $q'$ to a singleton state, and $|v'| \leq |v|$, as $q'$ is processed before $q$ (where states are processed in increasing order of distance from a singleton state). Then $\sigma' v'$ is a shortest word taking $p$ to a singleton state. This contradicts the assumption that $q$ is the first-processed state that is a direct successor of $p$ on a shortest path from $p$ to a singleton state.
So, during the processing of $q$, $U[p]$ is undefined, and is hence set to $\sigma U[q] = \sigma v$. As $(p)\sigma v = (q)v$, $U[p]$ takes $p$ to a singleton set. Furthermore, as $|v| = |w|$, $\sigma v$ is a shortest word taking $p$ to a singleton set. As noted above, once $U[p]$ has been set, it is never modified. Hence, immediately before the processing of $p$, $U[p]$ contains a shortest word mapping $p$ to a singleton set. So the correctness of the algorithm follows inductively.
We now discuss the time efficiency of the algorithm. The construction of the pair automaton can be performed in $O(|Q|^2|\Sigma|)$ time. During the execution of the breadth-first search, each of the $O(|Q|^2)$ states and $O(|Q|^2|\Sigma|)$ transitions of $\mathcal{P}^{[2]}(\mathcal{A})$ is examined at most once. The examination of each state entails the prepending of an input symbol to a word, an $O(|Q|)$ operation (see the proof of Proposition 5.8 for discussion of this implementation detail). Hence the final time complexity is $O(|Q|^3 + |Q|^2|\Sigma|)$.
The main algorithm simply combines the unifying words together to create a terminal word. Note, specifically, the method of state-pair selection within the main loop: while Eppstein's original algorithm unifies an arbitrary pair of states in $P$, the algorithm detailed below attempts to find a unifying word for each possible pair of states in $P$. This is because, in a non-synchronising DFA, not all pairs of states can be unified. As long as there exists a pair of states in $P$ that can be unified, however, a smaller state set can still be reached.
procedure TerminalWord($Q$, $\Sigma$, $\delta$)
    $U \gets$ UnifyingWords($Q$, $\Sigma$, $\delta$)
    $P \gets Q$
    $w \gets \varepsilon$
    repeat
        for each pair of distinct states $\{s, t\} \subseteq P$ do
            if $U[\{s, t\}]$ defined then
                $v \gets U[\{s, t\}]$
                $w \gets wv$
                $P \gets (P)v$
                break
            end if
        end for
    until $|P|$ unchanged
    return $w$
end procedure
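The main loop can be sketched in Python along the following lines. This is an illustrative sketch under stated assumptions, not the TerminalWordOfSemigroup function of Appendix C: the names `apply_word` and `terminal_word` are hypothetical, the DFA is given as `delta[q][a]` over states `0..n-1`, and the table of unifying words is passed in as a dictionary keyed by two-element frozensets, with no entry for pairs that cannot be unified.

```python
from itertools import combinations


def apply_word(delta, states, word):
    """Image of a set of states under a word, reading symbols left to right."""
    for a in word:
        states = {delta[q][a] for q in states}
    return states


def terminal_word(n, delta, unifiers):
    """Greedily concatenate unifying words until no pair can be unified.

    unifiers maps frozenset({s, t}) to a word unifying s and t (hypothetical
    input format). Returns the word built and the final image of the state set.
    """
    current = set(range(n))
    word = ""
    while True:
        for s, t in combinations(sorted(current), 2):
            v = unifiers.get(frozenset({s, t}))
            if v is not None:
                word += v
                current = apply_word(delta, current, v)
                break
        else:
            # No pair in the current set can be unified, so the set cannot shrink.
            return word, current
```

The `for ... else` construct plays the role of the "until $|P|$ unchanged" test: the `else` branch runs only when the scan over pairs finishes without finding a usable entry.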
Proposition 6.2. The TerminalWord procedure returns a terminal word of length no greater than $|Q|^3$ in $O(|Q|^3 + |Q|^2|\Sigma|)$ time, for a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$.
Proof. In each iteration of the main loop, a unifier is appended to the word $w$, and $P$ is updated to the state $(Q)w$ of $\mathcal{P}(\mathcal{A})$. The loop exits, then, when there is no possible word that takes $(Q)w$ to a smaller set in $\mathcal{P}(\mathcal{A})$. This can be verified by noting that any word that takes $(Q)w$ to a smaller set must necessarily unify two states in $(Q)w$. Suppose $w'$ is a terminal word of $\mathcal{A}$. Then $|(Q)ww'| = |(Q)w|$, as no state smaller than $(Q)w$ is reachable from $(Q)w$. As $(Q)ww' \subseteq (Q)w'$, we have $|(Q)w'| \geq |(Q)w|$. Hence the rank of $w'$ is no less than the rank of $w$. So $w$ is terminal.
The main loop of the algorithm executes at most $|Q|$ times (as the size of $P$ decreases during each iteration), and each iteration appends a word of length less than $|Q|^2$ to $w$ (as a shortest path between states in $\mathcal{P}^{[2]}(\mathcal{A})$ uses less than $|Q|^2$ input symbols). Hence, the final length of $w$ does not exceed $|Q|^3$. This proves the correctness of the TerminalWord procedure.
A simple analysis of each step in the algorithm justifies the specified time complexity. The initial call to UnifyingWords requires $O(|Q|^3 + |Q|^2|\Sigma|)$ time. This is followed by at most $|Q|$ iterations of the main loop. During each iteration of the loop, $O(|Q|^2)$ constant-time map look-ups (one for each pair of states in $P$) are performed. If a useful state pair is found, then one word concatenation and one application of a state transformation occurs. The word concatenation requires $O(|Q|^2)$ time (as the length of each word stored in $U$ is $O(|Q|^2)$), and the application of $v$ to $P$ requires $O(|Q|)$ time. Hence, the main loop runs in $O(|Q|^3)$ time, leaving the $O(|Q|^3 + |Q|^2|\Sigma|)$ time bound of the UnifyingWords procedure to dominate the process.
The TerminalWord procedure given above is an efficient generalisation of Eppstein's reset word algorithm to non-synchronising automata. It appears that the given modification has not been previously considered, and the resulting algorithm represents a new level of efficiency for computing terminal words. In particular, it is a factor of $|Q|$ faster than the algorithm proposed by Rystsov in 1992 [26], which was detailed in Chapter 5. This algorithm has been implemented as the GAP function TerminalWordOfSemigroup, and is available in Appendix C.
CHAPTER 7
Enumerating Small Automata

As this dissertation largely consists of the analysis of small automata, efficient methods for generating and enumerating such automata were required. We propose two
methods for generating small automata, each of which was used to verify the correctness of the other.
The first method was initially proposed by Kisielewicz and Szykuła in a 2013 paper [16]. The paper gives an adequate, high-level account of an algorithm for the generation of automata, but is lacking in detail. In particular, implementation details of the two core optimisations are omitted, as are any proofs of the algorithm's correctness. The authors promise a more complete treatment of the algorithm in a future paper, which has yet to materialise. Here we rectify this problem, producing what we believe to be the first public, detailed account of the algorithm, its implementation, and its correctness. Included in Appendix C is a (somewhat unoptimised) C++ implementation of this method, gena.cpp. To the best of our knowledge, this also serves as the first publicly available implementation of the specified process.
The proposed method proceeds recursively to generate all $n$-state automata with arity $k$. Given an ordered list $A$ of all pairwise non-isomorphic $n$-state automata over a fixed alphabet of size $k - 1$, and an ordered list $B$ of all pairwise non-isomorphic $n$-state automata over a fixed alphabet of size 1, the algorithm computes all $n$-state automata over a fixed alphabet of size $k$, by extending, in every possible way, each automaton in $A$ with the transitions from each automaton in $B$. The resulting list of automata may contain some isomorphic entries, and some automata of arity $k - 1$, which must be removed via a separate process (detailed below). The initial list of all pairwise non-isomorphic $n$-state automata over a fixed alphabet of size 1 must be obtained through some other means; in our case, the list of pairwise non-isomorphic transformations of degree $n$ on J. D. Mitchell's webpage [19] was used.
We begin by codifying the notion of equivalence between automata.
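Definition 7.1. A DFA $\mathcal{A}_1 = \langle Q_1, \Sigma_1, \delta_1 \rangle$ is said to be isomorphic to a DFA $\mathcal{A}_2 = \langle Q_2, \Sigma_2, \delta_2 \rangle$ if there exist bijections $\varphi : Q_1 \to Q_2$ and $\psi : \Sigma_1 \to \Sigma_2$ such that $\delta_2(\varphi(q), \psi(\sigma)) = \varphi(\delta_1(q, \sigma))$ for all $q \in Q_1$ and $\sigma \in \Sigma_1$. The map $\varphi$ is called an isomorphism from $\mathcal{A}_1$ to $\mathcal{A}_2$.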
We write $A \simeq B$ to denote that $A$ is isomorphic to $B$. The relation of being isomorphic is an equivalence relation between automata, and its equivalence classes are called isomorphism classes. An isomorphism from a DFA $\mathcal{A}$ to itself is called an automorphism. We say that an isomorphism $\varphi$ is strong if its associated symbol map $\psi$ is the identity function.
Two automata $\mathcal{A}_1$ and $\mathcal{A}_2$ are isomorphic exactly when the states and symbols in $\mathcal{A}_2$ can be relabelled to yield $\mathcal{A}_1$. Clearly, then, two isomorphic automata share all properties pertinent to the Černý and Pin conjectures. Hence, when enumerating automata, we wish to only consider one DFA from each isomorphism class. To do so, we use an algorithm guaranteed to generate at least one DFA from each such class, then retain exactly one automaton from each class through a canonicalisation process. The canonicalisation process assigns, to each DFA, a representative of that DFA's isomorphism class, such that isomorphic DFAs are assigned the same representative. This is the standard notion of canonicalisation, as used in the influential paper "Practical graph isomorphism, II", for instance [18].
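To make the notion of a canonical representative concrete, the following Python sketch computes one by brute force: it tries every relabelling of states and every reordering of symbols, and keeps the lexicographically least transition table. This is only an illustration of the definition and is not the nauty-based process used in this dissertation; the function name `canonical_form` and the table representation (`delta[q]` is a tuple of successors, one per symbol) are assumptions, and the approach is only usable for very small automata.

```python
from itertools import permutations


def canonical_form(delta):
    """Lexicographically least relabelling of a DFA's transition table.

    Hypothetical representation: delta[q] is a tuple holding the successor of
    state q under each symbol. Isomorphic DFAs receive the same result.
    """
    n, k = len(delta), len(delta[0])
    best = None
    for phi in permutations(range(n)):        # phi[q] is the new name of state q
        for order in permutations(range(k)):  # order[c] is the old symbol placed in column c
            table = [None] * n
            for q in range(n):
                table[phi[q]] = tuple(phi[delta[q][a]] for a in order)
            candidate = tuple(table)
            if best is None or candidate < best:
                best = candidate
    return best
```

Two DFAs are then isomorphic exactly when `canonical_form` returns the same table for both, which is the property required of any canonicalisation process.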
The algorithm referred to above involves the extension of a DFA $\mathcal{A}_A$ with the transitions of a DFA $\mathcal{A}_B$. We now precisely define this process.
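Definition 7.2. If $\mathcal{A}_A = \langle Q, \Sigma_A, \delta_A \rangle$ is some size-$n$, arity-$k$ DFA, $\mathcal{A}_B = \langle Q, \Sigma_B, \delta_B \rangle$ is some size-$n$, arity-1 DFA with $\Sigma_A \cap \Sigma_B = \emptyset$, and $\pi$ is some permutation of $Q$, then the union of $\mathcal{A}_A$ and $\mathcal{A}_B$ under $\pi$, denoted $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$, is the DFA $\langle Q, \Sigma_A \cup \Sigma_B, \delta \rangle$, where
$$\delta(q, \sigma) = \begin{cases} \delta_A(q, \sigma) & \text{if } \sigma \in \Sigma_A \\ \pi^{-1}(\delta_B(\pi(q), \sigma)) & \text{if } \sigma \in \Sigma_B \end{cases}$$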
We can visualise the union process as gluing each state $q$ of $\mathcal{A}_B$ on top of the state $\pi^{-1}(q)$ of $\mathcal{A}_A$. For example, consider the union in Figure 7.1, also depicted in Kisielewicz and Szykuła's original paper [16]. Note that, for instance, the edge from state 0 to state 2 in $\mathcal{A}_B$ is transformed into an edge from state $1 = \pi^{-1}(0)$ to state $0 = \pi^{-1}(2)$ in $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$. In particular, we use the opposite meaning of $\pi$ to that conventionally signalled by the given notation. That is, what we denote by $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$ might conventionally be written as $\mathcal{A}_A \cup^{\pi^{-1}} \mathcal{A}_B$. For the sake of brevity, when we write $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$, with $\mathcal{A}_A$ and $\mathcal{A}_B$ automata, and $\pi$ a permutation, we tacitly assume that $\mathcal{A}_A$ and $\mathcal{A}_B$ share the same state set, that $\pi$ is a permutation of this state set, and that $\mathcal{A}_B$ has an alphabet of size 1 that is disjoint from the alphabet of $\mathcal{A}_A$.
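The union construction can be sketched in a few lines of Python. The representation is an assumption made for illustration: `delta_a[q]` is a tuple of successors of `q` (one per symbol of the first automaton), `delta_b[q]` is the successor of `q` under the single symbol of the second automaton, and `pi[q]` is the image of `q` under the permutation. The function name `union_under` is hypothetical, and the transition tables in the usage note are invented for the example rather than taken from Figure 7.1.

```python
def union_under(delta_a, delta_b, pi):
    """Transition table of the union of two DFAs under the permutation pi.

    The single symbol of the second DFA becomes the last column of the result,
    acting on state q as pi^{-1}(delta_b(pi(q))).
    """
    n = len(pi)
    inverse = [0] * n
    for q, image in enumerate(pi):
        inverse[image] = q
    return [tuple(delta_a[q]) + (inverse[delta_b[pi[q]]],) for q in range(n)]
```

With the cycle (0, 2, 1) written as `pi = [2, 0, 1]`, an edge from state 0 to state 2 in the one-letter automaton (`delta_b[0] == 2`) becomes an edge from state 1 to state 0 in the union, matching the relabelling described above.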
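We use the following notation in the proof of Proposition 7.4.
Definition 7.3. If $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$ is a DFA, and $\Sigma' \subseteq \Sigma$, then the DFA $\langle Q, \Sigma', \delta|_{Q \times \Sigma'} \rangle$ is called the restriction of $\mathcal{A}$ to $\Sigma'$, and is denoted $\mathcal{A}|_{\Sigma'}$.
In other words, $\mathcal{A}|_{\Sigma'}$ is the DFA that arises by discarding from $\mathcal{A}$ all transitions except those defined by symbols in $\Sigma'$.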
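Proposition 7.4. If $A$ is an ordered list of all pairwise non-isomorphic automata with state set $Q$ and alphabet $\Sigma_A$ of size $k - 1$, $B$ is an ordered list of all pairwise non-isomorphic automata with state set $Q$ and alphabet $\Sigma_B$ of size 1 (with $\Sigma_B \cap \Sigma_A = \emptyset$), and $\mathcal{A}$ is a DFA with state set $Q$ and alphabet $\Sigma_A \cup \Sigma_B$, then $\mathcal{A} \simeq \mathcal{A}_A \cup^{\pi} \mathcal{A}_B$ for some $\mathcal{A}_A \in A$, $\mathcal{A}_B \in B$, and permutation $\pi$ of $Q$.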
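Figure 7.1: The graph of the union of $\mathcal{A}_A$ and $\mathcal{A}_B$ under $\pi = (0, 2, 1)$.
Proof. Suppose $\mathcal{A} = \langle Q, \Sigma_A \cup \Sigma_B, \delta \rangle$ is a DFA. As $\mathcal{A}|_{\Sigma_A}$ is a DFA with state set $Q$ and alphabet $\Sigma_A$, there exists an isomorphism $\varphi_a$ from $\mathcal{A}|_{\Sigma_A}$ to some $\mathcal{A}_A = \langle Q, \Sigma_A, \delta_A \rangle \in A$. Similarly, there exists an isomorphism $\varphi_b$ from $\mathcal{A}|_{\Sigma_B}$ to some $\mathcal{A}_B = \langle Q, \Sigma_B, \delta_B \rangle \in B$. Then let $\pi = \varphi_b \varphi_a^{-1}$. Certainly $\pi$ is a permutation of $Q$, as both $\varphi_a$ and $\varphi_b$ are bijective functions from $Q$ to $Q$. We claim that $\varphi_a$ is a strong isomorphism from $\mathcal{A}$ to the DFA $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B = \langle Q, \Sigma_A \cup \Sigma_B, \delta' \rangle$. So we must show that $\varphi_a$ satisfies the definition of such an isomorphism. Suppose $q \in Q$ and $\sigma \in \Sigma_A$. Then
\begin{align*}
\delta'(\varphi_a(q), \sigma) &= \delta_A(\varphi_a(q), \sigma) && \text{from the definition of } \delta' \text{ when } \sigma \in \Sigma_A \\
&= \varphi_a(\delta|_{Q \times \Sigma_A}(q, \sigma)) && \text{as } \varphi_a \text{ is an isomorphism from } \mathcal{A}|_{\Sigma_A} \text{ to } \mathcal{A}_A \\
&= \varphi_a(\delta(q, \sigma)) && \text{as } \sigma \in \Sigma_A
\end{align*}
Alternatively, suppose $\sigma \in \Sigma_B$. Then
\begin{align*}
\delta'(\varphi_a(q), \sigma) &= \pi^{-1}(\delta_B(\pi(\varphi_a(q)), \sigma)) && \text{from the definition of } \delta' \text{ when } \sigma \in \Sigma_B \\
&= (\varphi_b \varphi_a^{-1})^{-1}(\delta_B(\varphi_b \varphi_a^{-1} \varphi_a(q), \sigma)) && \text{as } \pi = \varphi_b \varphi_a^{-1} \\
&= \varphi_a \varphi_b^{-1}(\delta_B(\varphi_b(q), \sigma)) && \text{by cancellation and re-bracketing} \\
&= \varphi_a \varphi_b^{-1} \varphi_b(\delta|_{Q \times \Sigma_B}(q, \sigma)) && \text{as } \varphi_b \text{ is an isomorphism from } \mathcal{A}|_{\Sigma_B} \text{ to } \mathcal{A}_B \\
&= \varphi_a(\delta|_{Q \times \Sigma_B}(q, \sigma)) && \text{by cancellation} \\
&= \varphi_a(\delta(q, \sigma)) && \text{as } \sigma \in \Sigma_B
\end{align*}
So $\varphi_a$ is an isomorphism between $\mathcal{A}$ and $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$. Hence, $\mathcal{A} \simeq \mathcal{A}_A \cup^{\pi} \mathcal{A}_B$, as required.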
So, with lists $A$ and $B$ as defined in Proposition 7.4, we can construct every possible arity-$k$ automaton with the given state set and alphabet, by computing $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$ for each $\mathcal{A}_A \in A$, $\mathcal{A}_B \in B$ and permutation $\pi$ of $Q$. This approach, however, is non-optimal. In particular, there are $|Q|!$ possible permutations of $Q$, many of which generate isomorphic automata.
Kisielewicz and Szykuła's algorithm is designed to generate all pairwise non-strongly-isomorphic automata that arise from the union of a given $\mathcal{A}_A \in A$ and $\mathcal{A}_B \in B$, without requiring the examination of every possible permutation of $Q$. To more closely mirror the included implementation of the procedure, all related pseudo-code and analysis presented in this chapter assumes that $\mathcal{A}_A$ and $\mathcal{A}_B$ use the state set $\mathbb{Z}_n$.
In order to avoid the examination of redundant permutations, we must precompute data about the structure of $\mathcal{A}_A$ and $\mathcal{A}_B$. For each $i \in \mathbb{Z}_n$, we require (up to) one state PrevA[$i$] $\in \mathbb{Z}_n$ (based on the structure of $\mathcal{A}_A$), and for each $S \subseteq \mathbb{Z}_n$ and $j \in \mathbb{Z}_n \setminus S$, we require one boolean value PrevB[$S$][$j$] (based on the structure of $\mathcal{A}_B$). The meaning of these data, and the manner in which they are computed, will be elucidated shortly. Before doing so, however, we present the algorithm in its entirety. The procedure accepts the required source automata, the data tables mentioned above, and a partial permutation $\pi_i$. A partial permutation $\pi_i$ is an injective map from $\mathbb{Z}_i$ to $\mathbb{Z}_n$. We use $S$ to denote the image set of $\pi_i$. During execution of the algorithm, the empty partial permutation $\pi_0$ is recursively extended to a number of complete permutations (with some extensions possibly ignored, based on values in the two data tables). For each complete permutation $\pi$ reached, the DFA $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$ is output.
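procedure PermutB($\mathcal{A}_A$, $\mathcal{A}_B$, PrevA, PrevB, $\pi_i$, $S$)
    if $i = n$ then
        output $\mathcal{A}_A \cup^{\pi_i} \mathcal{A}_B$
    else
        $m \gets \pi_i(\text{PrevA}[i]) + 1$ if PrevA[$i$] defined, otherwise 0
        for $j = m, \ldots, n - 1$ do
            if $j \notin S$ and not PrevB[$S$][$j$] then
                $\pi_{i+1} \gets$ the extension of $\pi_i$ with $\pi_{i+1}(i) = j$
                PermutB($\mathcal{A}_A$, $\mathcal{A}_B$, PrevA, PrevB, $\pi_{i+1}$, $S \cup \{j\}$)
            end if
        end for
    end if
end procedure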
Consider, first, the unoptimised variant of the algorithm, obtained by leaving PrevA[$i$] undefined for all $i \in \mathbb{Z}_n$, and by setting PrevB[$S$][$j$] to false for each $S \subseteq \mathbb{Z}_n$ and $j \in \mathbb{Z}_n \setminus S$. In this situation, a call to PermutB simply extends the partial permutation $\pi_i$ in every possible way (checking that the new image $j$ is not in the current image set $S$, to maintain injectivity), with each recursive call leading to a complete permutation. Notice that permutations are constructed in lexicographical (i.e. dictionary) order with respect to the image sequence $\pi(0), \ldots, \pi(n - 1)$.
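The recursion can be sketched as a Python generator. This is a simplified illustration, not the code of gena.cpp: it yields the complete permutations themselves rather than the corresponding unions, and the function name `permut_b` and the table formats are assumptions (`prev_a[i]` is a state or `None`; `prev_b` is a dictionary keyed by `(frozenset, state)` pairs, with missing keys treated as false).

```python
def permut_b(n, prev_a, prev_b, partial=(), used=frozenset()):
    """Yield complete permutations of range(n), pruned by the two tables.

    partial holds the images chosen so far (partial[i] is the image of i) and
    used is its image set. With prev_a all None and prev_b empty, every
    permutation is produced, in lexicographic order.
    """
    i = len(partial)
    if i == n:
        yield partial
        return
    h = prev_a[i]
    start = partial[h] + 1 if h is not None else 0
    for j in range(start, n):
        if j not in used and not prev_b.get((used, j), False):
            yield from permut_b(n, prev_a, prev_b, partial + (j,), used | {j})
```

Calling `list(permut_b(3, [None, None, None], {}))` reproduces all six permutations of three states in dictionary order, which is the unoptimised behaviour described above.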
We now define the two major optimisations to be implemented (each used to prune the tree of permutations considered) and show that the optimised algorithm still generates some DFA isomorphic to each possible union of $\mathcal{A}_A$ and $\mathcal{A}_B$. We employ the following relation between states of a DFA.
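Definition 7.5. If $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$ is a DFA, $q, q' \in Q$ and $S \subseteq Q$, then $q \simeq_S q'$ if there is some strong automorphism $\varphi$ of $\mathcal{A}$ such that $\varphi(s) = s$ for all $s \in S$, and $\varphi(q) = q'$. In this case, we say that $q$ and $q'$ are conjugate with respect to $S$.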
For a concrete example of this relation, consider the DFA depicted in Figure 7.2, originally from Kisielewicz and Szykuła's paper [16]. From the existence of the automorphism $\varphi = (0, 1)(2, 3)$ (written here in cycle notation), we see that $0 \simeq_S 1$ and $2 \simeq_S 3$ for $S \subseteq \{4\}$. There is, however, no pair of (distinct) states conjugate with respect to any other $S \subseteq \mathbb{Z}_5$.
Figure 7.2: The graph of a DFA admitting non-trivial automorphisms.
With these notions in place, we define the first optimisation to be implemented, which takes advantage of symmetries in the structure of $\mathcal{A}_A$.
Optimisation 1. For each state $i \in \mathbb{Z}_n$, precompute the greatest state $h \in \mathbb{Z}_n$ such that $h < i$ and $i \simeq_{S_h} h$ in $\mathcal{A}_A$, where $S_h = \{0, \ldots, h - 1\}$. If such a state $h$ exists, set PrevA[$i$] to $h$. If not, leave PrevA[$i$] undefined.
With Optimisation 1 implemented, we will see that the following property holds: if $\pi_i$ is a partial permutation, and PrevA[$i$] is defined, then any extension $\pi$ of $\pi_i$ with $\pi(i) < \pi_i(\text{PrevA}[i])$ will generate a DFA isomorphic to one that has already been output. The proof of this fact requires the following proposition.
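Proposition 7.6. If $\mathcal{A}_A = \langle Q, \Sigma_A, \delta_A \rangle$ and $\mathcal{A}_B = \langle Q, \Sigma_B, \delta_B \rangle$ are DFAs, $\pi$ is a permutation of $Q$ and $\varphi$ is an automorphism of $\mathcal{A}_A$, then $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi\varphi} \mathcal{A}_B$.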
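Proof. Firstly, note that $\pi\varphi$ is a permutation of $Q$, as both $\pi$ and $\varphi$ are bijective transformations on $Q$. Let $\delta_1$ denote the transition function of $\mathcal{A}_A \cup^{\pi\varphi} \mathcal{A}_B$ and $\delta_2$ denote the transition function of $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$. We show that $\varphi$ is a strong isomorphism from $\mathcal{A}_A \cup^{\pi\varphi} \mathcal{A}_B$ to $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$. Suppose that $q \in Q$, and $\sigma \in \Sigma_A$. Then
\begin{align*}
\delta_2(\varphi(q), \sigma) &= \delta_A(\varphi(q), \sigma) \\
&= \varphi(\delta_A(q, \sigma)) && \text{as } \varphi \text{ is an automorphism of } \mathcal{A}_A \\
&= \varphi(\delta_1(q, \sigma))
\end{align*}
Alternatively, suppose that $\sigma \in \Sigma_B$. Then
\begin{align*}
\delta_2(\varphi(q), \sigma) &= \pi^{-1}(\delta_B(\pi(\varphi(q)), \sigma)) && \text{from the definition of } \delta_2 \text{ when } \sigma \in \Sigma_B \\
&= \varphi\varphi^{-1}\pi^{-1}(\delta_B(\pi\varphi(q), \sigma)) && \text{as } \varphi\varphi^{-1} \text{ is the identity transformation on } Q \\
&= \varphi(\pi\varphi)^{-1}(\delta_B(\pi\varphi(q), \sigma)) && \text{by re-bracketing} \\
&= \varphi(\delta_1(q, \sigma))
\end{align*}
Hence $\varphi$ is a strong isomorphism between $\mathcal{A}_A \cup^{\pi\varphi} \mathcal{A}_B$ and $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$. So $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi\varphi} \mathcal{A}_B$, as required.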
We now prove that the PermutB procedure, augmented with Optimisation 1,
generates at least one automaton from each possible isomorphism class.
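Lemma 7.7. If $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$ is the union of DFAs $\mathcal{A}_A$ and $\mathcal{A}_B$ (each with state set $\mathbb{Z}_n$) under a permutation $\pi$, then PermutB, augmented with Optimisation 1, generates the DFA $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B$, where $\pi'$ is the lexicographically least permutation such that $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi} \mathcal{A}_B$.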
Proof. Suppose that $\pi'$, as defined above, is not computed during the execution of the optimised procedure. Then there must be some (possibly empty) partial permutation $\pi'_i$ that can be extended to $\pi'$, with $h = \text{PrevA}[i] < i$ defined and $\pi'(i) < \pi'_i(h) = \pi'(h)$. From the definition of the PrevA table, then, there must exist some automorphism $\varphi$ of $\mathcal{A}_A$ such that $\varphi(j) = j$ for all $j < h$, and $\varphi(i) = h$. Then the permutation $\pi'\varphi^{-1}$ is lexicographically less than $\pi'$, as $\pi'\varphi^{-1}(j) = \pi'(j)$ for $j < h$, and $\pi'\varphi^{-1}(h) = \pi'(i) < \pi'(h)$. However, from Proposition 7.6, $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi'\varphi^{-1}} \mathcal{A}_B$, as $\varphi^{-1}$ is an automorphism of $\mathcal{A}_A$. So $\pi'$ is not the lexicographically least permutation such that $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi} \mathcal{A}_B$, which is a contradiction. Hence the lemma is established.
We now move on to describe the second optimisation, which takes advantage of symmetries in the structure of $\mathcal{A}_B$.
Optimisation 2. For each subset $S \subseteq \mathbb{Z}_n$ and $j \in \mathbb{Z}_n \setminus S$, precompute whether there exists some $h \in \mathbb{Z}_n \setminus S$ such that $h < j$ and $h \simeq_S j$ in $\mathcal{A}_B$. If such a state exists, set PrevB[$S$][$j$] to true. If not, set PrevB[$S$][$j$] to false.
With Optimisation 2 implemented, we will see that the following property holds: if $\pi_i$ is a partial permutation with image $S$, $j \notin S$, and PrevB[$S$][$j$] is true, then any extension $\pi$ of $\pi_i$ with $\pi(i) = j$ will generate a DFA isomorphic to one that has already been output. The proof of this fact requires the following proposition.
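Proposition 7.8. If $\mathcal{A}_A = \langle Q, \Sigma_A, \delta_A \rangle$ and $\mathcal{A}_B = \langle Q, \Sigma_B, \delta_B \rangle$ are DFAs, $\pi$ is a permutation of $Q$ and $\varphi$ is an automorphism of $\mathcal{A}_B$, then $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B = \mathcal{A}_A \cup^{\varphi\pi} \mathcal{A}_B$.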

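Proof. Firstly, note that $\varphi\pi$ is a permutation of $Q$, as both $\varphi$ and $\pi$ are bijective transformations on $Q$. Let $\delta_1$ denote the transition function of $\mathcal{A}_A \cup^{\varphi\pi} \mathcal{A}_B$ and $\delta_2$ denote the transition function of $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$. We show that $\delta_1 = \delta_2$. Suppose $q \in Q$. For $\sigma \in \Sigma_A$, we immediately have $\delta_1(q, \sigma) = \delta_A(q, \sigma) = \delta_2(q, \sigma)$, from the definition of the union construction. Suppose, then, that $\sigma \in \Sigma_B$. We have
\begin{align*}
\delta_1(q, \sigma) &= (\varphi\pi)^{-1}(\delta_B(\varphi\pi(q), \sigma)) \\
&= \pi^{-1}\varphi^{-1}(\delta_B(\varphi(\pi(q)), \sigma)) && \text{by re-bracketing} \\
&= \pi^{-1}(\delta_B(\varphi^{-1}\varphi\pi(q), \sigma)) && \text{as } \varphi \text{ is an automorphism of } \mathcal{A}_B \\
&= \pi^{-1}(\delta_B(\pi(q), \sigma)) && \text{as } \varphi^{-1}\varphi \text{ is the identity map on } Q \\
&= \delta_2(q, \sigma)
\end{align*}
Clearly $\delta_1 = \delta_2$, and hence $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B = \mathcal{A}_A \cup^{\varphi\pi} \mathcal{A}_B$, as required.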

We now prove that the PermutB procedure, augmented with Optimisation 2,
generates at least one automaton from each possible isomorphism class.
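Lemma 7.9. If $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B$ is the union of DFAs $\mathcal{A}_A$ and $\mathcal{A}_B$ (each with state set $\mathbb{Z}_n$) under a permutation $\pi$, then PermutB, augmented with Optimisation 2, generates the DFA $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B$, where $\pi'$ is the lexicographically least permutation such that $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi} \mathcal{A}_B$.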
Proof. Suppose that $\pi'$, as defined above, is not computed during the execution of the optimised procedure. Then there must be some (possibly empty) partial permutation $\pi'_i$ that can be extended to $\pi'$, with PrevB[$S$][$j$] = true, where $S$ is the image set of $\pi'_i$, and $j = \pi'(i) \in \mathbb{Z}_n \setminus S$. From the definition of the PrevB table, then, there must exist some automorphism $\varphi$ of $\mathcal{A}_B$ such that $\varphi(k) = k$ for all $k \in S$, and $\varphi(h) = j$ for some $h < j$. Then the permutation $\varphi^{-1}\pi'$ is lexicographically less than $\pi'$, as $k < i \implies \pi'(k) \in S \implies \varphi^{-1}\pi'(k) = \pi'(k)$, and $\varphi^{-1}\pi'(i) = \varphi^{-1}(j) = h < j = \pi'(i)$. However, from Proposition 7.8, $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B = \mathcal{A}_A \cup^{\varphi^{-1}\pi'} \mathcal{A}_B$, as $\varphi^{-1}$ is an automorphism of $\mathcal{A}_B$. So $\pi'$ is not the lexicographically least permutation such that $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi} \mathcal{A}_B$, which is a contradiction. Hence the lemma is established.
The final algorithm for generating automata simply selects, in turn, each possible combination of automata $\mathcal{A}_A$ from list $A$ and $\mathcal{A}_B$ from list $B$, precomputes the tables PrevA and PrevB, and calls the PermutB procedure (with both optimisations enabled). For completeness, we list the algorithm, and a proof of its correctness.
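procedure GenerateAutomata($A$, $B$)
    for $\mathcal{A}_B \in B$ do
        PrevB $\gets$ CalculatePrevB($\mathcal{A}_B$)
        for $\mathcal{A}_A \in A$ do
            PrevA $\gets$ CalculatePrevA($\mathcal{A}_A$)
            PermutB($\mathcal{A}_A$, $\mathcal{A}_B$, PrevA, PrevB, $\pi_0$, $\{\}$)
        end for
    end for
end procedure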
Theorem 7.10. If $A$ is an ordered list of all pairwise non-isomorphic automata with state set $\mathbb{Z}_n$ and alphabet $\Sigma_A$ of size $k - 1$, $B$ is an ordered list of all pairwise non-isomorphic automata with state set $\mathbb{Z}_n$ and alphabet $\Sigma_B$ of size 1 (with $\Sigma_A \cap \Sigma_B = \emptyset$), and $\mathcal{A}$ is a DFA with state set $\mathbb{Z}_n$ and alphabet $\Sigma_A \cup \Sigma_B$, then an automaton isomorphic to $\mathcal{A}$ is output during execution of GenerateAutomata($A$, $B$).
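Proof. By Proposition 7.4, $\mathcal{A} \simeq \mathcal{A}_A \cup^{\pi} \mathcal{A}_B$ for some $\mathcal{A}_A \in A$, $\mathcal{A}_B \in B$ and permutation $\pi$ of $\mathbb{Z}_n$. Let $\pi'$ be the lexicographically least permutation of $\mathbb{Z}_n$ such that $\mathcal{A}_A \cup^{\pi} \mathcal{A}_B \simeq \mathcal{A}_A \cup^{\pi'} \mathcal{A}_B$. Then, by Lemmas 7.7 and 7.9, neither Optimisation 1 nor Optimisation 2 prevents PermutB from outputting $\mathcal{A}_A \cup^{\pi'} \mathcal{A}_B$. Hence the theorem is proven.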
We now pause to briefly discuss the implementation of the optimisations mentioned above. Each optimisation requires the computation and analysis of automorphisms of a DFA. These automorphisms can be computed with the help of existing software tools (notably McKay and Piperno's nauty and Traces suite [18]), or directly, by searching for automorphisms among all state permutations. The latter option was used in gena.cpp, and proved adequate for the generation of small automata. When taking this approach, we can reduce the number of permutations that must be examined by considering the indegree (that is, the number of incoming transitions) of each state. As an automorphism may only map between states of equal indegree, we may construct each potential automorphism out of disjoint permutations, each one permuting only those states of a given indegree.
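The direct search, together with the computation of the PrevA table from Optimisation 1, might look as follows in Python. This is a sketch under stated assumptions and does not reproduce gena.cpp: the names `strong_automorphisms` and `prev_a_table` are hypothetical, and `delta[q]` is assumed to be a tuple of successors of state `q`, one per symbol.

```python
from itertools import permutations, product


def strong_automorphisms(delta):
    """All state permutations phi with phi(delta(q, a)) = delta(phi(q), a).

    Candidates are assembled from permutations within each class of states
    sharing the same indegree, as suggested in the text.
    """
    n = len(delta)
    indegree = [0] * n
    for row in delta:
        for target in row:
            indegree[target] += 1
    classes = {}
    for q in range(n):
        classes.setdefault(indegree[q], []).append(q)
    blocks = list(classes.values())

    result = []
    for choice in product(*(permutations(block) for block in blocks)):
        phi = [0] * n
        for block, images in zip(blocks, choice):
            for q, image in zip(block, images):
                phi[q] = image
        if all(phi[delta[q][a]] == delta[phi[q]][a]
               for q in range(n) for a in range(len(delta[q]))):
            result.append(tuple(phi))
    return result


def prev_a_table(delta):
    """PrevA[i]: greatest h < i with phi(i) = h for some phi fixing 0..h-1, else None."""
    autos = strong_automorphisms(delta)
    table = []
    for i in range(len(delta)):
        candidates = [h for h in range(i)
                      if any(phi[i] == h and all(phi[s] == s for s in range(h))
                             for phi in autos)]
        table.append(max(candidates) if candidates else None)
    return table
```

For the one-letter cyclic automaton `[(1,), (2,), (0,)]`, the three rotations are the only strong automorphisms and `prev_a_table` returns `[None, 0, 0]`, so the image of state 0 must precede the images of states 1 and 2.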
Note that the automorphisms of $\mathcal{A}_B$ fixing each of the $2^n$ subsets of $\mathbb{Z}_n$ must be generated during the precomputation of the PrevB table. This requires significantly more processing time than the precomputation of the PrevA table. Hence, in the complete algorithm, the PrevB table is computed once per automaton $\mathcal{A}_B$ in list $B$, after which all unions involving $\mathcal{A}_B$ are output. In contrast, the PrevA table is computed multiple times for each automaton $\mathcal{A}_A$ in list $A$ (once for each automaton $\mathcal{A}_B$ with which $\mathcal{A}_A$ is coupled). For a greater insight into the implementation of such procedures, see the source code of gena, found in Appendix C.
As mentioned above, some isomorphic DFAs are output by the described algorithm. Specifically, some DFAs generated from the union of different source automata may be equivalent up to a permutation of alphabet symbols (i.e. isomorphic,
but not strongly isomorphic), and some DFAs generated from the union of the same
source automata may be isomorphic (i.e. the pruning defined above is not perfect).
We now detail a process by which output DFAs can be canonicalised. The process begins by constructing a coloured graph corresponding to a given DFA. This graph is then canonicalised, using McKay and Piperno's nauty and Traces software suite [18]. A canonicalised version of the original DFA is then recovered from the canonicalised graph.
Figure 7.3: The graph of the DFA $A$, its corresponding graph $G_A$, the canonicalised graph $G'_A$, and the canonicalised DFA $A'$.
Given a DFA $\mathcal{A} = \langle Q, \Sigma, \delta \rangle$, we construct the corresponding graph $G_{\mathcal{A}}$ as follows. For each state $q \in Q$, we introduce one state vertex $v_q$ to $G_{\mathcal{A}}$; for each $q \in Q$ and $\sigma \in \Sigma$, we introduce one transition vertex $v_{q,\sigma}$ to $G_{\mathcal{A}}$; and for each alphabet symbol $\sigma \in \Sigma$, we introduce one symbol vertex $v_\sigma$ to $G_{\mathcal{A}}$. We colour all state vertices one colour, all transition vertices a second colour, and all symbol vertices a third colour. Lastly, for each $q \in Q$ and $\sigma \in \Sigma$, we introduce the directed edges $(v_q, v_{q,\sigma})$, $(v_{q,\sigma}, v_{\delta(q,\sigma)})$ and $(v_\sigma, v_{q,\sigma})$ to $G_{\mathcal{A}}$. We then compute $G'_{\mathcal{A}}$, the canonicalised version of the graph $G_{\mathcal{A}}$. A canonicalised DFA $\mathcal{A}'$ is extracted from $G'_{\mathcal{A}}$ in the natural way. That is, the transition from $q \in Q$ to $q' \in Q$ by $\sigma \in \Sigma$ is included in $\mathcal{A}'$ precisely when the edges $(v_q, v_{q,\sigma})$, $(v_{q,\sigma}, v_{q'})$ and $(v_\sigma, v_{q,\sigma})$ are all present in $G'_{\mathcal{A}}$. This construction is more easily understood by example. Consider, for instance, the DFA $A$, from Figure 1.2. The canonicalisation of $A$ proceeds as depicted in Figure 7.3.
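The graph construction and the recovery step can be sketched in Python as below. The canonical labelling itself (performed by nauty in this dissertation) is deliberately left out, so the sketch only shows that the coloured graph determines the DFA. The function names, the tagged-tuple vertex encoding and the colour numbers 0, 1 and 2 are assumptions made for the example; `delta` is taken to be a dictionary with `delta[q][a]` the successor of `q` under `a`.

```python
def coloured_graph(delta, alphabet):
    """Coloured digraph of a DFA, returned as (colours, edges).

    Vertices are tagged tuples: ("state", q), ("transition", q, a) and
    ("symbol", a), coloured 0, 1 and 2 respectively.
    """
    colours = {}
    edges = set()
    for q in delta:
        colours[("state", q)] = 0
    for a in alphabet:
        colours[("symbol", a)] = 2
    for q in delta:
        for a in alphabet:
            t = ("transition", q, a)
            colours[t] = 1
            edges.add((("state", q), t))             # source state to transition
            edges.add((t, ("state", delta[q][a])))   # transition to target state
            edges.add((("symbol", a), t))            # symbol to transition
    return colours, edges


def recover_dfa(colours, edges):
    """Rebuild the transition function from a coloured graph of the above shape."""
    source, symbol, target = {}, {}, {}
    for u, v in edges:
        if colours[u] == 0:
            source[v] = u[1]
        elif colours[u] == 2:
            symbol[v] = u[1]
        else:
            target[u] = v[1]
    delta = {}
    for t in target:
        delta.setdefault(source[t], {})[symbol[t]] = target[t]
    return delta
```

A DFA with $n$ states and $k$ symbols yields $n + nk + k$ vertices and $3nk$ edges, and `recover_dfa(*coloured_graph(delta, alphabet))` returns the original transition function.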
We provide a C program canon_dfa in Appendix C, which implements the process described above. The program accepts the description of a DFA in the format
used by the Semigroups GAP package, and outputs a canonicalised version of the
DFA in the same format. Due to the efficiency of the nauty library, canon_dfa
can process arity-2 DFAs of up to 1000 states in less than a minute. It is one of
the only publicly available tools for the efficient canonicalisation of DFAs, and is
the only such tool (to the best of our knowledge) that can be easily used with the
Semigroups GAP package.
We also used a second, group-theoretic, method to generate DFAs for the purposes of this dissertation. This second method was primarily intended as a means of
verifying the correctness of the algorithms listed above, compared to which it proved
significantly less efficient. A GAP implementation of the procedure, along with a
discussion of its operation, is included in Appendix C. This implementation is simple and transparent, and its behaviour is almost completely dependent on existing,
tested code. Hence, we are very confident in its accuracy. Despite its simplicity, the
second method was efficient enough to enumerate automata of size less than 7. The
output of the GAP routine was canonicalised (also with GAP), and compared to the
corresponding output of gena and canon_dfa, for automata of size 4, 5 and 6. It was
verified that the number of automata returned by each process matched the value
in the OEIS [20], and that each process produced equivalent output. For reasons of
efficiency, however, and due to the fact that the second method was not designed to
generate automata of arity greater than 2, we chose to use the more sophisticated
combination of gena and canon_dfa as the primary method of automata generation
for this dissertation.
CHAPTER 8
Results
In this chapter, we detail the results of our computational enumeration of all arity-2 DFAs of size less than 9. No counterexamples to either the Černý or Pin conjectures were found among DFAs in this parameter space. Furthermore, it appears that the task of computationally generating significant non-synchronising slowly-converging automata is infeasible. More information about the enumeration, including terminal-threshold histograms and notable slowly-converging automata, can be found in Appendix A. In the following discussion, we use $C(n, k) = (n - k)^2$ (for integers $1 \leq k \leq n$) to denote Pin's conjectured upper bound on the terminal threshold of a size-$n$, rank-$k$ DFA.
Table 8.1 lists the number of automata generated in our complete enumeration.
Harrison [13] used Pólya enumeration to compute the number of arity-2 automata
of each size; we have verified that our results match this theoretical calculation.
As shown in the table, over 3.6 billion automata were generated. The
computational requirements for such a task were significant; weeks of computation
were performed, and terabytes of raw data were produced, by our 96-core cluster of
machines. In comparison, Kisielewicz and Szykuła [16] claim to have enumerated
all arity-2 DFAs of size less than 12 (over one thousand trillion automata), and
Trahtman [31] claims to have generated all DFAs of size less than 8 and arity less
than 5 (a truly staggering number of automata).
k \ n         3      4       5         6          7            8        Total
1            49   1111   34256   1318679   60477796   3210707511   3272539402
2            13    250    5505    165099    6002354    258435436    264608657
3             5     71    1564     41810    1365970     52696831     54106251
4             -     23     415     11699     372155     13813770     14198062
5             -      -      89      2795     100050      3853376      3956310
6             -      -       -       484      21540       971895       993919
7             -      -       -         -       2904       189753       192657
8             -      -       -         -          -        22002        22002
Total        67   1455   41829   1540566   68342769   3540690574   3610617260
Table 8.1: The number of non-isomorphic arity-2, size-n, rank-k automata, for 3 ≤ n ≤ 8
and 1 ≤ k ≤ n.
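As a quick arithmetic check on Table 8.1, the row and column totals can be recomputed from the individual entries. The sketch below only transcribes the table and sums it; the names COUNTS, row_total, column_total and grand_total are our own and purely illustrative.

```python
# COUNTS[k][n] is the number of non-isomorphic arity-2, size-n, rank-k
# automata, transcribed from Table 8.1 (entries with k > n do not exist).
COUNTS = {
    1: {3: 49, 4: 1111, 5: 34256, 6: 1318679, 7: 60477796, 8: 3210707511},
    2: {3: 13, 4: 250, 5: 5505, 6: 165099, 7: 6002354, 8: 258435436},
    3: {3: 5, 4: 71, 5: 1564, 6: 41810, 7: 1365970, 8: 52696831},
    4: {4: 23, 5: 415, 6: 11699, 7: 372155, 8: 13813770},
    5: {5: 89, 6: 2795, 7: 100050, 8: 3853376},
    6: {6: 484, 7: 21540, 8: 971895},
    7: {7: 2904, 8: 189753},
    8: {8: 22002},
}


def row_total(k):
    """Number of rank-k automata over all sizes 3..8."""
    return sum(COUNTS[k].values())


def column_total(n):
    """Number of size-n automata over all ranks."""
    return sum(row.get(n, 0) for row in COUNTS.values())


def grand_total():
    """Total number of automata in the enumeration."""
    return sum(row_total(k) for k in COUNTS)
```

Summing the entries in this way reproduces the totals printed in the table, including the overall count of 3610617260 automata.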
We begin our analysis with Table 8.2. It lists the number of edge cases (that
is, automata with terminal threshold equal to Pin's bound) among arity-2 DFAs of
size no greater than eight.
k \ n       3     4      5      6       7        8
1           2     2      1      2       1        1
2          13     2      2      1       2        1
3           5    71      8      8       4        8
4           -    23    415     22      22       11
5           -     -     89   2795      86       86
6           -     -      -    484   21540      322
7           -     -      -      -    2904   189753
8           -     -      -      -       -    22002
Table 8.2: The number of non-isomorphic arity-2, size-n, rank-k automata with
terminal threshold of C(n, k) = (n − k)², for 3 ≤ n ≤ 8 and 1 ≤ k ≤ n.
The information in this table is at once exciting and dispiriting. On one hand,
the relatively frequent rate at which edge cases occur (particularly for higher-rank
DFAs) suggests that uncovering edge cases to Pin's conjecture is significantly easier
than uncovering edge cases to Černý's conjecture. On the other hand, the sheer
number of such edge cases makes their analysis difficult: it is a formidable task to
extract, from such a large set of candidate DFAs, patterns that could form the basis
of a counterexample to, or even an infinite family of edge cases for, Pin's conjecture.
T \ n             3      4       5         6          7            8
C(n, 1) − 0       2      2       1         2          1            1
C(n, 1) − 1       4      5       4         0          0            0
C(n, 1) − 2      29     11      11         2          0            0
C(n, 1) − 3      14     21      23        11          0            0
C(n, 1) − 4       0     64      43        22          3            0
C(n, 1) − 5       -    104      46        45          3            1
C(n, 1) − 6       -    375     151        61         13            1
C(n, 1) − 7       -    478     294       112         39            3
C(n, 1) − 8       -     51     551       208         75            1
C(n, 1) − 9       -      0     986       378        123            5
C(n, 1) − 10      -      -    1855       718        203           12
< C(n, 1) − 10    -      -   30291   1317120   60477336   3210706586
Table 8.3: The number of arity-2, size-n, rank-1 DFAs with terminal threshold T,
for 3 ≤ n ≤ 8.
We persist by restricting our attention to DFAs of rank 1 and 2, in the hope that
this proves instructive for the analysis of higher-rank automata. Listed in Table
8.3 is the number of rank-1 DFAs with reset threshold close to Černý's bound. To
account for the wide range of values exhibited, we denote reset thresholds by their
deficiency from Černý's bound. For example, the first row of the table
(T = C(n, 1) − 0) lists the edge cases for the Černý conjecture.
This table confirms known facts about the distribution of rank-1 automata. In
particular, it verifies that the edge cases identified by Trahtman [31] are the only
arity-2 synchronising DFAs with reset threshold reaching Černý's bound. The first
gap (i.e. contiguous sequence of 0s in a column, as detailed in the discussion of
Table 2.2) is also verified to appear at T = C(n, 1) − 1, for 6 ≤ n ≤ 8.
T \ n             3      4      5        6         7           8
C(n, 2) − 0      13      2      2        1         2           1
C(n, 2) − 1       0     10      5        4         0           0
C(n, 2) − 2       -    100     15       11         2           0
C(n, 2) − 3       -    138     24       23        11           0
C(n, 2) − 4       -      0    102       43        22           3
C(n, 2) − 5       -      -    218       50        45           3
C(n, 2) − 6       -      -   1154      157        63          13
C(n, 2) − 7       -      -   3033      366       113          39
C(n, 2) − 8       -      -    952      754       216          75
C(n, 2) − 9       -      -      0     1519       410         123
C(n, 2) − 10      -      -      -     3016       828         205
< C(n, 2) − 10    -      -      -   159155   6000642   258434974
Table 8.4: The number of arity-2, size-n, rank-2 DFAs with terminal threshold T,
for 3 ≤ n ≤ 8.
Table 8.4 contains analogous data for rank-2 automata. As existing literature
concerns only rank-1 automata, this table presents new information about the
distribution of small DFAs. There are a number of interesting features to note. We
immediately see that, as above, there appear to be significantly more slowly-converging
automata of rank 2 than there are slowly-synchronising automata. We also see that
a gap is visible at T = C(n, 2) − 1 for 7 ≤ n ≤ 8. The gaps that appear in this
table, however, are smaller than those in Table 8.3, and no gap appears in the n = 6
column. Indeed, this trend persists among automata of higher rank: there is a single
one-symbol-long gap that appears at T = C(8, 3) − 1 for size-8, rank-3 DFAs. In
general, it appears that the frequency and size of terminal-threshold-histogram gaps
decrease as the rank of the DFAs in question increases. We may suggest, then, that
the gap conjecture be generalised to automata of all ranks. That is, we may make
the following claim: for any integers g ≥ 1 and k ≥ 1, there exists a large enough
integer n ≥ 1 such that there are at least g gaps in the distribution of terminal
thresholds among rank-k, size-n automata. While the breadth of the data collected
may be insufficient to convincingly justify this claim, it certainly seems reasonable.
It is natural to be particularly interested in the 21 automata in the first row of
Table 8.4, whose terminal threshold reaches Pin's bound. Unfortunately, these automata
prove to be uniformly uninteresting. Each such automaton of size n > 3 consists of a
synchronising DFA of size n − 1 coupled with a single isolated state. As disconnected
automata, these DFAs are not significant: a disconnected DFA that is an edge
case for (or counterexample to) Pin's conjecture must necessarily contain a smaller,
connected DFA that is also an edge case for (or counterexample to) Pin's conjecture,
for reasons similar to those used in the proof of Theorem 5.12. This pattern
also persists for higher-rank automata: there is no size-n, rank-k edge case to Pin's
conjecture that is connected, for 4 ≤ n ≤ 8 and 1 < k < n − 1. The fact that
all edge cases take on this form is a testament to the primacy of Černý's family of
automata, and of the Černý conjecture, even in the context of non-synchronising
automata.
We have reason, then, to generalise the conjecture made by Trahtman in his
2006 computational survey of synchronising automata [31]: that the eight known
sporadic examples of edge-case automata, along with Černý's infinite family, are the
only edge cases to the Černý conjecture of any size or arity. Taking into account
our computational findings (and pending the analysis of higher-arity automata),
we may offer the following modification: the known sporadic edge-case automata,
and Černý's infinite family of automata, are the only connected edge cases to Pin's
conjecture, among automata of size n > 3 and rank k with 1 ≤ k < n − 1.
So we must instead turn our attention to connected slowly-converging automata.
Even this restriction, however, proves inadequate; it appears that the connected
automata that converge most slowly do so by virtue of containing a large,
slowly-synchronising (i.e. rank-1) subautomaton. Hence, we consider only strongly-connected
slowly-converging automata. This is consistent with the approach used in Trahtman's [31],
and also Kisielewicz and Szykuła's [16], papers on the enumeration of
small DFAs, in which only strongly-connected automata are considered for the sake
of computational efficiency. Table 8.5 gives the maximum terminal threshold among
arity-2, strongly-connected DFAs of size n and rank k.
It appears, then, that most strongly-connected automata converge significantly
more quickly than Pin's upper bound. Based on these results, we are not confident
that computational methods can be used to find a non-synchronising counterexample
to Pin's conjecture. We also suggest that rank-1 DFAs remain the focus of
future work, on account of the slow rate at which they synchronise, in comparison
to Pin's bound. These results also serve as a strong indicator that Pin's conjecture
is true, at least for automata of rank greater than 1.
We now examine the most slowly-converging strongly-connected, rank-2 automata
identified in the course of our computational enumeration. Extrapolating
from the edge cases of size 6 and 8, we find that automata of the form E_{0,n} (for
even n > 4), depicted in Figure 8.1, are among the most slowly-converging.
Specifically, E_{0,8} is found to be the unique arity-2, size-8, rank-2, strongly-connected
automaton with terminal threshold no less than 25, and E_{0,6} is found to be the equal
sixth slowest-converging arity-2, size-6, rank-2, strongly-connected automaton.
Extrapolating from the edge cases of size 5 and 7, we find that automata of
n
k
1
2
3
4
5
6
7
8
9
4
16
9
3
1
1
0
16
25
25
9
9
3
1
0
36
15
16
4
1
49
25
12
9
49
36
19
6
4
36
25
11
3
1
25
16
1
0
16
6
4
1
9
3
1
0
4
1
1
0
0
Table 8.5: The maximum terminal threshold of a strongly-connected, arity-2, size-n,
rank-k DFA (upper left), versus C(n, k) (lower right), for 3 ≤ n ≤ 8 and 1 ≤ k ≤ n.
the form E_{1,n} (for odd n > 3), depicted in Figure 8.2, are among the most
slowly-converging. Specifically, both E_{1,5} and E_{1,7} are the arity-2, rank-2,
strongly-connected automata of equal largest terminal threshold for their respective sizes.
Experimental analysis of these automata suggests an unusual fact: both families
have a linear-order terminal threshold (see Table 8.6 for confirmation of this
fact). Given that the families E_{0,n} and E_{1,n} comprise many of the slowest-converging
strongly-connected, rank-2 automata generated, we may suspect that there is no
family of such automata requiring a quadratic-order terminal word. This conjecture,
however, is not correct. Define the automaton C_{n,k} = ⟨Z_n, {a, b}, δ⟩ by
\[
\delta(q, \sigma) =
\begin{cases}
q + 1 \bmod n & \text{if } \sigma = a, \\
q & \text{if } \sigma = b \text{ and } q \in \mathbb{Z}_{n-1}, \\
k & \text{otherwise.}
\end{cases}
\]
We have, for instance, that C_n = C_{n,0} and F_n = C_{n,1}, where F_n is described
in Ananichev, Gusev and Volkov's paper on slowly-synchronising automata [3].
Then, for even n > 2, C_{n,1} is an experimentally-verified family of rank-2 DFAs with
quadratic-order terminal threshold. Our enumeration did not identify the automata
in this family, on account of the small coefficients in the polynomial describing their
terminal thresholds. Specifically, it is not until n = 16 that C_{n,1} has a larger terminal
threshold than E_{0,n}. This fact does not bode well for the computational identification
of slowly-converging automata, as it suggests that the slowest-converging families
of automata only distinguish themselves at sizes for which complete enumeration is
implausible.
Figure 8.1: The graph of the DFA E_{0,n} for an even natural number n with n > 4.
Figure 8.2: The graph of the DFA E_{1,n} for an odd natural number n with n > 3.
n          5   6   7   8   9  10  11  12  13  14  15  16   general n
E_{0,n}    -   9   -  25   -  41   -  57   -  73   -  89   8n − 39
E_{1,n}    7   -  19   -  31   -  43   -  55   -  67   -   6n − 23
C_{n,1}    -   9   -  19   -  33   -  51   -  73   -  99   n²/2 − 2n + 3
Table 8.6: The observed (for 5 ≤ n ≤ 16) and extrapolated terminal thresholds of
three slowly-converging families of strongly-connected automata.
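The thresholds reported for the family C_{n,k} can be checked by brute force for small n. The following is a minimal sketch, not the program used for our enumeration: it builds C_{n,k} from the definition given earlier and runs a breadth-first search over the subsets reachable from the full state set. The names make_c and rank_and_threshold are illustrative, and the search visits up to 2^n subsets, so it is only suitable for small automata.

```python
from collections import deque


def make_c(n, k):
    """Transition lists of C_{n,k} on states 0..n-1 for the letters a and b."""
    a = [(q + 1) % n for q in range(n)]            # a cyclically shifts every state
    b = [q if q < n - 1 else k for q in range(n)]  # b fixes Z_{n-1} and sends n-1 to k
    return {"a": a, "b": b}


def rank_and_threshold(delta, n):
    """Return (rank, terminal threshold) by BFS over reachable subsets."""
    start = frozenset(range(n))
    depth = {start: 0}          # length of a shortest word reaching each subset
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for transition in delta.values():
            image = frozenset(transition[q] for q in subset)
            if image not in depth:
                depth[image] = depth[subset] + 1
                queue.append(image)
    rank = min(len(subset) for subset in depth)
    threshold = min(d for subset, d in depth.items() if len(subset) == rank)
    return rank, threshold
```

For example, rank_and_threshold(make_c(4, 0), 4) returns (1, 9), the Černý bound for n = 4, while rank_and_threshold(make_c(4, 1), 4) returns (2, 3), realised by the word bab. Larger even n can be compared against the C_{n,1} row of Table 8.6 in the same way.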
While it is possible that the methods described in this dissertation could be used
to completely enumerate larger automata, time, space, and computational resource
constraints made this task infeasible. In an attempt to combat this fact, we chose
to generate a subset of all size-n automata, for 9 ≤ n ≤ 12, in the hope of identifying
some automata from which slowly-converging families could be extrapolated. We
were required to choose, then, some subset of automata in which the most
slowly-converging examples could be found. Based on an examination of known edge
cases to Černý's conjecture, we chose to enumerate those automata that are the
union of a cyclic transformation and a deficiency-1 transformation. Recall that
a cyclic transformation cyclically permutes the state set of a DFA. A deficiency-1
transformation is a transformation τ : Q → Q where |τ(Q)| = |Q| − 1. It seems,
intuitively, that the union of such transformations must be slow to converge.
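To make the restricted search space concrete, the sketch below enumerates the deficiency-1 transformations of a small state set and pairs each with one fixed cyclic transformation. It is illustrative only: fixing the cycle q ↦ q + 1 mod n, the helper names, and the brute-force filter over all n^n maps are our assumptions here, and no isomorphism rejection is attempted.

```python
from itertools import product


def deficiency_one_maps(n):
    """Yield every map on {0..n-1}, as a tuple, whose image has exactly n-1 elements."""
    for t in product(range(n), repeat=n):
        if len(set(t)) == n - 1:
            yield t


def restricted_automata(n):
    """Pair the cycle q -> q+1 mod n (letter a) with each deficiency-1 map (letter b)."""
    cycle = tuple((q + 1) % n for q in range(n))
    for t in deficiency_one_maps(n):
        yield {"a": cycle, "b": t}
```

A deficiency-1 map is determined by the pair of states it merges, the state missing from its image, and a bijection between the remaining classes and images, so there are C(n, 2) · n! of them: 18 for n = 3 and 144 for n = 4. This growth is one reason a practical enumeration would need to discard isomorphic copies.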
We now present the results of this restricted enumeration. Slowly-synchronising
(i.e. rank-1) automata have been studied by Ananichev, Gusev and Volkov [3], and
are hence not considered here. Listed in Table 8.7 is the distribution of terminal
thresholds for the non-synchronising automata generated in the manner specified
above.
We first note that, due to the block structure of an automaton with a cyclic
transformation, a size-n, rank-k automaton is generated only if k divides n. Secondly,
we note that the terminal thresholds of the automata generated fall well
below Pin's bound. This is consistent with the results established for smaller
automata, as each generated automaton contains a cyclic transformation, and is hence
strongly-connected.
As the edge-cases for higher-rank automata are too numerous to analyse, we
consider only the slowest-converging rank-2 and rank-3 automata of size 9 through 12.
We immediately find that the rank-2 automata of size 10 and 12 share a common
form: the most slowly-converging automaton is C_{n,1}, and the two
second-most-slowly-converging automata belong to the families which we will denote
S_{n,1} and S_{n,2}, and define below. Herein, we focus our attention on these two
families. The important edge-cases are depicted in Figure 8.3, at the end of the chapter.
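Based on the common form of rank-2 edge-cases of size 10 and 12, we define
S_{n,1} = ⟨Z_n, {a, b}, δ_1⟩ and S_{n,2} = ⟨Z_n, {a, b}, δ_2⟩, for n ≥ 4 and n even, as follows:
\[
\delta_1(q, a) = q + 1 \bmod n
\quad\text{and}\quad
\delta_1(q, b) =
\begin{cases}
q + 1 & \text{if } q < 2 \text{ or } q \text{ even}, \\
q - 1 & \text{if } q > 2 \text{ and } q \text{ odd},
\end{cases}
\]
\[
\delta_2(q, a) = q + 1 \bmod n
\quad\text{and}\quad
\delta_2(q, b) =
\begin{cases}
q - 1 \bmod n & \text{if } q \text{ even}, \\
2 & \text{if } q = n - 1, \\
q + 1 & \text{if } q \neq n - 1 \text{ and } q \text{ odd}.
\end{cases}
\]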
Table 8.8 lists experimental data collected about the families defined above.
These data verify that the families Sn,1 and Sn,2 have a quadratic order terminal
threshold that is always one less than the terminal threshold of Cn,1 .
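The definitions of S_{n,1} and S_{n,2} can be exercised directly for small even n. The sketch below is illustrative rather than part of our tooling: it encodes δ_1 and δ_2, checks that every letter sends even states to odd states and vice versa, and applies the word (b² a^{n−2})^{n/2−2} b², which is our reading of the shortest terminal word listed for both families in Table 8.8. All function names are hypothetical.

```python
def make_s1(n):
    """S_{n,1} for even n >= 4, following the definition of delta_1."""
    a = [(q + 1) % n for q in range(n)]
    b = [q + 1 if (q < 2 or q % 2 == 0) else q - 1 for q in range(n)]
    return {"a": a, "b": b}


def make_s2(n):
    """S_{n,2} for even n >= 4, following the definition of delta_2."""
    a = [(q + 1) % n for q in range(n)]
    b = []
    for q in range(n):
        if q % 2 == 0:
            b.append((q - 1) % n)   # even states step back (0 wraps to n-1)
        elif q == n - 1:
            b.append(2)             # the last state is sent to 2
        else:
            b.append(q + 1)         # remaining odd states step forward
    return {"a": a, "b": b}


def image(delta, word, n):
    """Image of the full state set under the given word."""
    states = set(range(n))
    for letter in word:
        states = {delta[letter][q] for q in states}
    return states


def flips_parity(delta, n):
    """True if every letter maps even states to odd states and odd to even."""
    return all((t[q] - q) % 2 == 1 for t in delta.values() for q in range(n))


def candidate_word(n):
    """The word (b^2 a^(n-2))^(n/2-2) b^2, of length n^2/2 - 2n + 2."""
    return ("bb" + "a" * (n - 2)) * (n // 2 - 2) + "bb"
```

Because every letter flips parity, the image of the full state set under any word contains states of both parities, so the rank is at least 2; a word whose image has exactly two states is therefore terminal. For n = 6, image(make_s1(6), candidate_word(6), 6) is {2, 3} and image(make_s2(6), candidate_word(6), 6) is {1, 2}. This shows only that the word is terminal, not that it is shortest.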
Size-9, rank-3 (C(9, 3) = 36)
ℓ               0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18
N_{9,3}(ℓ)      0    0    0    0    0    0  144   36  274  186   86  142  102  104  112   76   24    8    2
Size-10, rank-2 (C(10, 2) = 64)
ℓ            < 21    21    22    23    24    25    26    27    28    29    30    31    32    33
N_{10,2}(ℓ) 50766  2924  1505   949   575   394   267   132    57    19     5     4     2     1
Size-10, rank-5 (C(10, 5) = 25)
ℓ               0    1    2    3    4    5    6    7    8    9   10   11   12
N_{10,5}(ℓ)     0    0    0    0    0  384  256  432  336  192  160   64   96
Size-12, rank-2 (C(12, 2) = 100)
ℓ              < 35    35    36   37   38   39   40   41   42   43   44   45   46   47   48   49   50   51
N_{12,2}(ℓ) 2574059  2664  1530  692  266   79   21    3    4    2    4    1    0    0    0    0    2    1
Size-12, rank-3 (C(12, 3) = 81)
ℓ            < 20    20    21    22    23    24    25    26    27    28    29    30    31    32    33
N_{12,3}(ℓ) 95175  5504  6018  4199  4443  3824  2038   857   431   202   127    39    19     3     1
Size-12, rank-4 (C(12, 4) = 64)
ℓ            < 12    12    13    14    15    16    17    18    19    20    21    22    23    24
N_{12,4}(ℓ) 11062  3480  2970  2916  2756  1466  2022  1566   910   700   960   228    32    36
Size-12, rank-6 (C(12, 6) = 36)
ℓ               0    1    2    3    4    5     6     7     8     9    10    11    12    13    14    15
N_{12,6}(ℓ)     0    0    0    0    0    0  3840  2464  4864  4896  2272  1472  1120   960   384   768
Table 8.7: The number N_{n,k}(ℓ) of size-n, rank-k automata constructed from a cyclic
transformation and a transformation of deficiency 1, with terminal threshold ℓ.
Observing the data in Table 8.7 (particularly the size-12, rank-2 histogram), we
may be tempted to claim that we have identified the slowest-converging families
of strongly-connected rank-2 automata. We must bear in mind, however, that we
are only considering those automata that arise from the combination of a cyclic
transformation and a transformation of deficiency 1. For instance, the families B_n,
D''_n, E_n and H_n (introduced by Ananichev, Gusev and Volkov [3]) are among the
most slowly-synchronising families that have been identified, yet do not arise from
a combination of such transformations. We are, however, confident that we have
identified some of the most slowly-converging families of strongly-connected, arity-2,
rank-2 automata that exist.
        S_{n,1}                                      S_{n,2}                                      C_{n,1}
n       r   w                           T            r   w                           T            r   w                            T
10      2   (b^2 a^8)^3 b^2             32           2   (b^2 a^8)^3 b^2             32           2   (bab a^7)^3 bab              33
12      2   (b^2 a^10)^4 b^2            50           2   (b^2 a^10)^4 b^2            50           2   (bab a^9)^4 bab              51
14      2   (b^2 a^12)^5 b^2            72           2   (b^2 a^12)^5 b^2            72           2   (bab a^11)^5 bab             73
16      2   (b^2 a^14)^6 b^2            98           2   (b^2 a^14)^6 b^2            98           2   (bab a^13)^6 bab             99
n       2   (b^2 a^{n−2})^{n/2−2} b^2   n²/2 − 2n + 2   2   (b^2 a^{n−2})^{n/2−2} b^2   n²/2 − 2n + 2   2   (bab a^{n−3})^{n/2−2} bab   n²/2 − 2n + 3
Table 8.8: The observed (for 10 ≤ n ≤ 16) and extrapolated rank r, shortest
terminal word w, and terminal threshold T of automata from the families S_{n,1},
S_{n,2} and C_{n,1}.
More detailed results of our restricted enumeration are available in Appendix
A. Listings of the slowest-converging automata generated, and reset-threshold
histograms of the synchronising automata generated, are included. We note that this
enumeration is the first in the study of either the Černý or Pin conjectures that
considers any number of size-12 automata, with Trahtman [31] and Kisielewicz and
Szykuła [16] considering automata of size up to 9 and 11, respectively. No
counterexamples to Černý's or Pin's conjectures were found in this parameter space, and the
most slowly-synchronising size-12 automata generated were those already identified
by Ananichev, Gusev and Volkov [3].
While not treated in this dissertation, the results of our restricted enumeration
leave open a number of questions that deserve attention. To begin with, an attempt
to place the rank-3 automata S_{9,1}, S_{9,2} and U_{12} into slowly-converging families
proves to be an interesting challenge. In particular, the natural family arising from
S_{9,1} appears to have an anomalous sequence of terminal thresholds, and it is not
immediately clear how to extend S_{9,2} and U_{12} into automata of larger size. Also
outside the scope of this dissertation, but obviously desirable, is a combinatorial
proof of the terminal thresholds listed in Table 8.8.
To summarise, our enumeration has elucidated the following facts:
• There are no arity-2 counterexamples to Pin's conjecture of size greater than
  2 and less than 9, but there are many edge cases in this range.
• For n > 3, such edge cases are all constructed from an edge case to the Černý
  conjecture coupled with a single isolated state.
• There are no significant (i.e. strongly-connected) non-synchronising automata
  in this range with terminal threshold close to Pin's bound.
• The families C_{n,1}, S_{n,1} and S_{n,2} are likely among the slowest-converging
  strongly-connected arity-2, rank-2 families of automata that exist.
Figure 8.3: The slowest-converging automata produced in our restricted enumeration
of DFAs (panels: S10,1, S10,2, T9,1, T9,2 and U12).
CHAPTER 9
Concluding Remarks
As one of the oldest unresolved conjectures in automata theory, the Černý conjecture
has stood the test of time as a fascinating combinatorial problem. As part of
this dissertation, we have reviewed the wealth of literature written about Černý's
conjecture, and understood some of the most fundamental results in the study of
synchronising automata.
In many ways, Pin's conjecture appears to be the right generalisation of Černý's
conjecture. Specifically, results and algorithms related to Černý's conjecture tend
to naturally extend to Pin's conjecture. Chapters 5 and 6 are two striking examples
of this fact.
While computation has been routinely used to approach Černý's conjecture, it
appears that Pin's conjecture had not been exposed to a similar level of computational
analysis. The work in this dissertation represents the first concerted effort to
computationally attack Pin's conjecture, including the extension of known results
and algorithms to operate on non-synchronising automata, and the development of
a number of tools for the generation and analysis of automata.
Our enumeration has revealed that there are no arity-2, size-n counterexamples
to Pin's conjecture for 3 ≤ n ≤ 8. Moreover, an analysis of our results has reinforced
the primacy of the Černý conjecture: every observed edge case for Pin's conjecture
consists of an edge case for the Černý conjecture, coupled with a single isolated
state. This result inspired us to examine strongly-connected non-synchronising
automata, from which a number of slowly-converging families were identified. The
relative rate at which these examples converge, however, suggests that significant
non-synchronising edge cases for Pin's conjecture do not exist.
For the benefit of future researchers, the appendices of this document list the
code developed in the course of this dissertation, along with more results from
our enumeration, including full terminal-threshold histograms and many edge-case
automata.
Beyond the results presented in this dissertation, there are numerous avenues
available for further research into Pin's conjecture. To begin with, automata of
arity greater than 2 could be enumerated and examined in the manner described
by this dissertation. In the same vein as the work done by Ananichev, Gusev and
Volkov [3], the experimentally-verified terminal thresholds of families identified in
this dissertation could be proven using combinatorial arguments. Similarly, Pin's
conjecture could be proven for families of automata for which the Černý conjecture is
known to hold. Lastly, group-theoretic results could be applied to Pin's conjecture,
as they have been applied to Černý's conjecture in existing literature.
Bibliography
[1] Jorge Almeida, Stuart Margolis, Benjamin Steinberg, and Mikhail Volkov. Representation
theory of finite semigroups, semigroup radicals and formal language
theory. Trans. Amer. Math. Soc., 361(3):1429–1461, 2009.
[2] Marco Almeida, Nelma Moreira, and Rogério Reis. Enumeration and generation
with a string automata representation. Theoret. Comput. Sci., 387(2):93–102,
2007.
[3] D. S. Ananichev, M. V. Volkov, and V. V. Gusev. Primitive digraphs with
large exponents and slowly synchronizing automata. Zap. Nauchn. Sem. S.-Peterburg.
Otdel. Mat. Inst. Steklov. (POMI), 402(Kombinatorika i Teoriya
Grafov. IV):9–39, 218, 2012.
[4] W.R. Ashby. An Introduction to Cybernetics. University paperbacks. J. Wiley,
1956.
[5] Arturo Carpi and Flavio D'Alessandro. Strongly transitive automata and the
Černý conjecture. Acta Inform., 46(8):591–607, 2009.
[6] Arturo Carpi and Flavio D'Alessandro. The synchronization problem for locally
strongly transitive automata. In Mathematical foundations of computer science
2009, volume 5734 of Lecture Notes in Comput. Sci., pages 211–222. Springer,
Berlin, 2009.
[7] Ján Černý. A remark on homogeneous experiments with finite automata.
Mat.-Fyz. Časopis Sloven. Akad. Vied, 14:208–216, 1964.
[8] Ján Černý, Alica Pirická, and Blanka Rosenauerová. On directable automata.
Kybernetika (Prague), 7:289–298, 1971.
[9] L. Dubuc. Sur les automates circulaires et la conjecture de Černý. RAIRO
Inform. Théor. Appl., 32(1-3):21–34, 1998.
[10] David Eppstein. Reset sequences for monotonic automata. SIAM J. Comput.,
19(3):500–510, 1990.
[11] P. Frankl. An extremal problem for two families of sets. European J. Combin.,
3(2):125–127, 1982.
[12] Seymour Ginsburg. On the length of the smallest uniform experiment which
distinguishes the terminal states of a machine. J. Assoc. Comput. Mach.,
5:266–280, 1958.
[13] Michael A. Harrison. A census of finite automata. Canad. J. Math., 17:100–113,
1965.
[14] Helmut Jürgensen. Synchronization. Inform. and Comput., 206(9-10):1033–1044,
2008.
[15] Jarkko Kari. A counter example to a conjecture concerning synchronizing words
in finite automata. Bull. Eur. Assoc. Theor. Comput. Sci. EATCS, (73):146,
2001.
[16] Andrzej Kisielewicz and Marek Szykuła. Generating small automata and the
Černý conjecture. In Implementation and application of automata, volume 7982
of Lecture Notes in Comput. Sci., pages 340–348. Springer, Heidelberg, 2013.
[17] Zvi Kohavi. Switching and finite automata theory. McGraw-Hill Book Co., New
York-Düsseldorf-London, 1970. McGraw-Hill Computer Science Series.
[18] Brendan D. McKay and Adolfo Piperno. Practical graph isomorphism, II.
Journal of Symbolic Computation, 60(0):94–112, 2014.
[19] J. D. Mitchell. j.d. mitchell - semigroups data. http://www-groups.mcs.
st-andrews.ac.uk/~jamesm/data.php. Accessed: 2014-08-06.
[20] J. D. Mitchell. The on-line encyclopedia of integer sequences, sequence a225772.
https://oeis.org/A225772. Accessed: 2014-08-06.
[21] J. D. Mitchell. Semigroups - GAP package, Version 2.0, April 2014.
[22] Edward F. Moore. Gedanken-experiments on sequential machines. In Automata
studies, Annals of mathematics studies, no. 34, pages 129–153. Princeton
University Press, Princeton, N. J., 1956.
[23] J.-E. Pin. Sur un cas particulier de la conjecture de Cerny. In Automata,
languages and programming (Fifth Internat. Colloq., Udine, 1978), volume 62
of Lecture Notes in Comput. Sci., pages 345–352. Springer, Berlin-New York,
1978.
[24] J.-E. Pin. Le problème de la synchronisation et la conjecture de Černý. In
Noncommutative structures in algebra and geometric combinatorics (Naples,
1978), volume 109 of Quad. Ricerca Sci., pages 37–48. CNR, Rome, 1981.
[25] J.-E. Pin. On two combinatorial problems arising from automata theory. In
Combinatorial mathematics (Marseille-Luminy, 1981), volume 75 of North-Holland
Math. Stud., pages 535–548. North-Holland, Amsterdam, 1983.
[26] I. K. Rystsov. On the rank of a finite automaton. Kibernet. Sistem. Anal.,
(3):3–10, 187, 1992.
[27] I. K. Rystsov. Quasioptimal bound for the length of reset words for regular
automata. Acta Cybernet., 12(2):145–152, 1995.
[28] Peter H. Starke. Abstract automata. North-Holland Publishing Co.,
Amsterdam-London; American Elsevier Publishing Co., Inc., New York, 1972.
Translated from the German by I. Shepherd.
[29] Benjamin Steinberg. The averaging trick and the Černý conjecture. Internat.
J. Found. Comput. Sci., 22(7):1697–1706, 2011.
[30] Benjamin Steinberg. The Černý conjecture for one-cluster automata with prime
length cycle. Theoret. Comput. Sci., 412(39):5487–5491, 2011.
[31] A. N. Trahtman. An efficient algorithm finds noticeable trends and examples
concerning the Černy conjecture. In Mathematical foundations of computer
science 2006, volume 4162 of Lecture Notes in Comput. Sci., pages 789–800.
Springer, Berlin, 2006.
[32] A. N. Trahtman. Notable trends concerning the synchronization of graphs
and automata. In CTW2006 - Cologne-Twente Workshop on Graphs and Combinatorial
Optimization, volume 25 of Electron. Notes Discrete Math., pages
173–175. Elsevier Sci. B. V., Amsterdam, 2006.
[33] A. N. Trahtman. The Černý conjecture for aperiodic automata. Discrete Math.
Theor. Comput. Sci., 9(2):3–10, 2007.
[34] A. N. Trahtman. Modifying the upper bound on the length of minimal synchronizing
word. In Fundamentals of computation theory, volume 6914 of Lecture
Notes in Comput. Sci., pages 173–180. Springer, Heidelberg, 2011.
[35] A. N. Trahtman. The Černy conjecture. CoRR, abs/1202.4626, 2012.
[36] Mikhail V. Volkov. Synchronizing automata and the Černý conjecture. In
Language and automata theory and applications, volume 5196 of Lecture Notes
in Comput. Sci., pages 11–27. Springer, Berlin, 2008.
Appendices
CHAPTER A
A Listing of Significant Automata

A.1 Slowly-converging strongly-connected automata
This appendix lists the slowly-converging, strongly-connected automata found
during our computational enumeration of all arity-2 automata of size n, for 3 ≤ n ≤ 8.
We produce, for each automaton in question, a string representing the DFA (for
use with the Semigroups GAP package), its shortest terminal word, the terminal
subset (i.e. image) of this word, and the rank of the DFA. The automata are ordered
first by size, then by rank, then by terminal threshold. It is impractical to list
higher-rank automata, due to the frequency with which they occur. Hence, only
automata of rank 1, 2 and 3 are listed.
A.1.1 Slowly-converging automata of size 3
Automaton      Shortest terminal word   Terminal threshold   Terminal subset   Rank
t121113312     abba                     4                    {1}               1
t1321113312    abba                     4                    {1}               1
t1313213211    b                        1                    {1, 2}            2
t1321113311    a                        1                    {1, 2}            2
A.1.2 Slowly-converging automata of size 4
Automaton       Shortest terminal word   Terminal threshold   Terminal subset   Rank
t144123144234   baaabaaab                9                    {4}               1
t14424313312    abbababba                9                    {4}               1
t142143144324   bab                      3                    {2, 4}            2
t142413144234   bab                      3                    {3, 4}            2
t142413144324   bab                      3                    {2, 4}            2
A.1.3 Slowly-converging automata of size 5
Automaton         Shortest terminal word   Terminal threshold   Terminal subset   Rank
t15523451551234   abbbbabbbbabbbba         16                   {5}               1
t1553254144213    abbabba                  7                    {4, 5}            2
t15545231531254   abbabba                  7                    {3, 5}            2
t1555234142431    abbabba                  7                    {4, 5}            2
t15552341524513   abbabba                  7                    {4, 5}            2
t1552354142413    aba                      3                    {3, 4, 5}         3
t1553254142413    aba                      3                    {2, 4, 5}         3
A.1.4 Slowly-converging automata of size 6
Automaton          Shortest terminal word      Terminal threshold   Terminal subset   Rank
t1656123416623465  baabababaabbabaabaababaab   25                   {6}               1
t1661234516623456  baaaaabaaaaabaaaaabaaaaab   25                   {6}               1
t1621634516653624  babaaabaaab                 11                   {3, 6}            2
t1626431516623546  baaabaaabab                 11                   {4, 6}            2
t1625613416643265  babaaaabab                  10                   {2, 6}            2
t1626134516632456  babaaaabab                  10                   {5, 6}            2
t1626413516623546  babaaaabab                  10                   {4, 6}            2
t1623164516642536  baabab                      6                    {3, 5, 6}         3
t1623614516642536  baabab                      6                    {3, 5, 6}         3
t1624163516632546  baabab                      6                    {4, 5, 6}         3
t1624613516632546  baabab                      6                    {4, 5, 6}         3
t1636214516632546  baabab                      6                    {4, 5, 6}         3
t1636214516652346  baabab                      6                    {4, 5, 6}         3
t1665234616312645  abaaba                      6                    {4, 5, 6}         3
A.1.5 Slowly-converging automata of size 7
Automaton
Terminal threshold
Terminal subset
Rank
t177123456177234567
Automaton
Terminal threshold
Terminal subset
Rank
t177452376176231745
Automaton
Terminal threshold
Terminal subset
Rank
t177752346172634715
Automaton
Terminal threshold
Terminal subset
Rank
t174213765177327456
Automaton
Terminal threshold
Terminal subset
Rank
t177327456144213
baaaaaabaaaaaabaaaaaabaaaaaabaaaaaab
36
{7}
1
abbabbababababbabba
19
{6, 7}
2
abbabbaaabababbabba
19
{6, 7}
2
baabaabababbaabaab
18
{6, 7}
2
abbabbabababbabba
17
{6, 7}
2
Automaton
t17745237616623145
Shortest terminal word abbbabbbabbbabbba
Terminal threshold
Terminal subset
Rank
17
{6, 7}
2
Automaton
Terminal threshold
Terminal subset
Rank
t17732547616624315
Automaton
Terminal threshold
Terminal subset
Rank
t177326745173254176
Automaton
Terminal threshold
Terminal subset
Rank
t17747235616326541
Automaton
Terminal threshold
Terminal subset
Rank
t173127465177472356
Automaton
Terminal threshold
Terminal subset
Rank
t177325476176271345
Automaton
Terminal threshold
Terminal subset
Rank
t175321476177243765
abbababbabbababba
17
{5, 7}
2
abaabbabaabbababa
17
{4, 7}
2
abbababbabbababba
17
{6, 7}
2
baabaababaabbbaab
17
{6, 7}
2
abbababbabbababba
17
{4, 7}
2
baabaababaab
12
{4, 6, 7}
3
Automaton
Terminal threshold
Terminal subset
Rank
t175712346177243765
Automaton
Terminal threshold
Terminal subset
Rank
t177273465173452176
Automaton
Terminal threshold
Terminal subset
Rank
t177273465173752146
Automaton
Terminal threshold
Terminal subset
Rank
t176721345177543276
Automaton
Terminal threshold
Terminal subset
Rank
t177452376176321745
Automaton
Terminal threshold
Terminal subset
Rank
t177752346172643715
baabaababaab
12
{2, 3, 7}
3
abbababbabba
12
{5, 6, 7}
3
abbababbabba
12
{3, 6, 7}
3
baabaababaab
12
{2, 4, 7}
3
abbababbabba
12
{3, 6, 7}
3
abbababbabba
12
{3, 6, 7}
3
Automaton
t177762435172573146
Shortest terminal word abbababbabba
Terminal threshold
12
Terminal subset
Rank
A.1.6
{2, 3, 7}
3
Automaton
t18812345671882345678
Shortest terminal word baaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaa
aab
Terminal threshold
49
Terminal subset
{8}
Rank
1
Automaton
Terminal threshold
Terminal subset
Rank
t1878583762174436715
Automaton
Terminal threshold
Terminal subset
Rank
t18784783621555314
Automaton
Terminal threshold
Terminal subset
Rank
t18264315871882354876
Automaton
Terminal threshold
Terminal subset
Rank
t18825438761826134587
abbbaabbabbbaabbabbbabbba
25
{3, 7}
2
abbbabbbabbabbabbbabbba
23
{6, 7}
2
baaabaaabababaaabaaabab
23
{7, 8}
2
ababbbaabaabbababbbaaba
23
{3, 8}
2
Automaton
t188743785616513264
Shortest terminal word abbbabbbabbabbabbbabbba
Terminal threshold
23
Terminal subset
Rank
{5, 7}
2
Automaton
Terminal threshold
Terminal subset
Rank
t18866543871871746825
Automaton
Terminal threshold
Terminal subset
Rank
t1721374561884523687
Automaton
Terminal threshold
Terminal subset
Rank
t1882436587172735416
Automaton
Terminal threshold
Terminal subset
Rank
t18371248561883265487
Automaton
Terminal threshold
Terminal subset
Rank
t18216534871884328756
A.2
abbbabbabbbabbbabbabbba
23
{7, 8}
2
babaaabaaabaaab
15
{2, 7, 8}
3
abbbabbbabbbaba
15
{5, 7, 8}
3
babaaabaaabaaab
15
{4, 5, 8}
3
babaaabaaabaaab
15
{2, 5, 8}
3
Large slowly-converging automata
This section lists the slowly-converging automata found during the enumeration of all n-state automata arising from the union of a cyclic transformation and a deficiency-1 transformation, for 9 ≤ n ≤ 12 (see Chapter 8 for more details on this endeavour).

For each size, we begin with the reset-threshold histogram of the slowest-synchronising automata generated. Known slowly-synchronising automata are labelled using the notation of Ananichev, Gusev and Volkov [3]. We then give, for each slowly-converging automaton generated, a string representing the DFA (for use with the Semigroups GAP package), its shortest terminal word, the terminal subset (i.e. image) of this word, and the rank of the DFA. The automata are ordered first by size, then by rank, then by terminal threshold.
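To make the record format concrete, the following C++ sketch decodes an automaton string and applies a word to the full state set, returning the image of that word. It rests on two assumptions that are not stated in this section: that the encoding is the one suggested by the strings listed here (a leading t, then for each transformation the number of characters per value, the degree, and the images), and that letters act from left to right with a denoting the first transformation. The names parse_generators and image_of_word are illustrative and are not taken from the thesis code.

```cpp
#include <cstdlib>
#include <set>
#include <string>
#include <vector>

// A transformation stored as 0-based images: t[q] is the state reached from q.
using Transformation = std::vector<int>;

// Decodes a string such as "t1991234567819923456789" (assumed encoding:
// 't', then repeatedly <width><degree><images>, each value `width` chars wide).
std::vector<Transformation> parse_generators(const std::string& s) {
    std::vector<Transformation> gens;
    std::size_t pos = 1;  // skip the leading 't'
    while (pos < s.size()) {
        const int width = s[pos] - '0';
        ++pos;
        const int degree = std::atoi(s.substr(pos, width).c_str());
        pos += width;
        Transformation t(degree);
        for (int i = 0; i < degree; ++i) {
            // The strings use 1-based states; shift to 0-based internally.
            t[i] = std::atoi(s.substr(pos, width).c_str()) - 1;
            pos += width;
        }
        gens.push_back(t);
    }
    return gens;
}

// Applies `word` letter by letter (left to right, 'a' = first transformation)
// to every state and returns the resulting set of 1-based states.
std::set<int> image_of_word(const std::vector<Transformation>& gens,
                            const std::string& word) {
    const int n = static_cast<int>(gens[0].size());
    std::vector<int> state(n);
    for (int q = 0; q < n; ++q) {
        state[q] = q;
    }
    for (char c : word) {
        const Transformation& t = gens[c - 'a'];
        for (int& q : state) {
            q = t[q];
        }
    }
    std::set<int> image;
    for (int q : state) {
        image.insert(q + 1);
    }
    return image;
}
```

Under these assumptions, the size of the returned set for a terminal word equals the rank of the automaton, and the set itself is the terminal subset reported in each record.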
A.2.1
ℓ        36 37 38 39 40 43 50 51 52 57 58 64
N(ℓ)     22 5 3 1 5 0 1 0 1 0 1 0 2 1 0 1
Classes  C9,4  C9,3  G9  W9  D90  C9  F9
The number N(ℓ) of synchronising, size-9 automata constructed from a cyclic transformation and a transformation of deficiency 1 with reset threshold ℓ, for 36 ≤ ℓ ≤ 64.
Automaton
t1991234567819923456789
Shortest terminal word baaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaab
aaaaaaaabaaaaaaaab
Terminal threshold
64
Terminal subset
{9}
Rank
1
Automaton
t1929134567819992345678
Shortest terminal word baaaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaa
aaabaaaaaaab
Terminal threshold
58
Terminal subset
{9}
Rank
1
Automaton
t1999234567819912345678
Shortest terminal word abaaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaabaaaaaaa
abaaaaaaaaba
Terminal threshold
57
Terminal subset
{9}
Rank
1
Automaton
t1992345678919291345678
Shortest terminal word abbbbbbbabbbbbbbabbbbbbbabbbbbbbabbbbbbbabbbbbbbabbbb

bbba
Terminal threshold
57
Terminal subset
{9}
Rank
1
Automaton
t1929934567819249315678
Shortest terminal word aababaaaaaababaaaaaababaaaaaababaaaaaababaaaaa
ababaa
Terminal threshold
52
Terminal subset
{9}
Rank
1
Automaton
t1929934567819291345678
Shortest terminal word aabaaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaabaaaaaa
aabaa
Terminal threshold
50
Terminal subset
{9}
Rank
1
Automaton
Terminal threshold
Terminal subset
Rank
t1949231567819923456789
Automaton
Terminal threshold
Terminal subset
Rank
t1999732465819239145867
Automaton
Terminal threshold
Terminal subset
Rank
t1939527496819241638597
Automaton
t1949239567819482395671
baaaaabaaaaabaaaaabaaaaabaaaaabaaaaabaaaaab
43
{9}
1
abbaaabbbaaabbbaaabbbaaabbbaaabbbaaabbba
40
{9}
1
abbbbbaaaaaabbbbbaaaaaabbbbbaaaaaabbbbba
40
{9}
1

Terminal threshold
Terminal subset
Rank
aaaabaaabaaaabaaabaaaabaaabaaaabaaabaaaa
40
{9}
1
Automaton
Terminal threshold
Terminal subset
Rank
t1989234756119699235478
Automaton
Terminal threshold
Terminal subset
Rank
t1993254896719719324658
Automaton
Terminal threshold
Terminal subset
Rank
t1991234567819923456879
Automaton
Terminal threshold
Terminal subset
Rank
t1931527496819249638597
Automaton
Terminal threshold
Terminal subset
Rank
t1991234567819932456789
Automaton
Terminal threshold
Terminal subset
t1993124567819924356789
babbbabbbbbbabbabbbbbbbababbbbbbbbaabbbb
40
{9}
1
abbbbabbbbabbbbabbbbbabbbbbabbbbbabbbbba
40
{9}
1
baabaabaaababaaaabaabaaaaaaaabaaaaaaaab
39
{9}
1
bbbbbaaaaaabbbbbaaaaaabbbbbaaaaaabbbbb
38
{9}
1
baabaabaabaababaabaaabbaaabaaaababaaab
38
{9}
1
baaaabaabaaaabaaabaaaababaabaaaabaaaab
38
{9}
Rank
Automaton
Terminal threshold
Terminal subset
Rank
t1993245678919291345678
Automaton
Terminal threshold
Terminal subset
Rank
t1929634157819923548769
Automaton
Terminal threshold
Terminal subset
Rank
t1992347856919912364578
Automaton
Terminal threshold
Terminal subset
Rank
t1993246578919319524678
Automaton
Terminal threshold
Terminal subset
Rank
t1943297659819492315678
Automaton
Terminal threshold
Terminal subset
Rank
t1993246587919317524968
Automaton
t1995672349819238914567
abbbababbbbabbbababbbbabbbabbabbabbba
37
{9}
1
baaabbabaaaababbaaababaaaabbaabaaaaab
37
{9}
1
abbababbabbbabbbaabbbbbabbbbaababbbba
37
{9}
1
abbbabaabbbabbbabbabbbabbbbababbbbbba
37
{9}
1
aabababaabbbbabbaabbabbbbbbababbbbbaa
37
{9}
1
abaabaabbbbaabaaba
18
{5, 7, 9}
3

Terminal threshold
Terminal subset
Rank
A.2.2
abaabaabbbbaabaaba
18
{3, 7, 9}
3
ℓ        43 44 45 46 47 48 49 57 60 65 73 74 81
N(ℓ)     5 5 3 4 1 0 2 0 1 0 1 0 1 0 1 1 0 1 0
Classes  C10,2  W10  D10  C10
43 ≤ ℓ ≤ 81.
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21010 1 2 3 4 5 6 7 8 921010 2 3 4 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 210 1 3 4 5 6 7 8 92101010 2 3 4 5 6 7 8 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t2101010 2 3 4 5 6 7 8 921010 1 2 3 4 5 6 7 8 9
Automaton
t210 310 2 1 4 5 6 7 8 921010 2 3 4 5 6 7 8 910
baaaaaaaaabaaaaaaaaabaaaaaaaaabaaaaaaaaabaaaa
aaaaabaaaaaaaaabaaaaaaaaabaaaaaaaaab
81
{10}
1
baaaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaabaaaaaaa
abaaaaaaaabaaaaaaaabaaaaaaaab
74
{10}
1
abaaaaaaaaabaaaaaaaaabaaaaaaaaabaaaaaaaaabaaa
aaaaaabaaaaaaaaabaaaaaaaaaba
73
{10}
1
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
baaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaa
aabaaaaaaabaaaaaaab
65
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 3 6 210 4 5 1 7 8 9210 310 210 4 5 6 7 8 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 310 2 1 4 5 6 7 8 9210 310 210 4 5 6 7 8 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 71010 2 3 4 6 5 8 9210 910 2 3 4 5 8 6 7 1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 81010 2 3 4 5 6 7 9210 7 9 210 4 3 6 5 8 1
Automaton
t21010 2 4 3 5 6 7 8 910210 2 310 1 4 5 6 7 8 9
{10}
1
baaaaaaaaabaaaaaabaaaaaabaaaaaabaaaaaabaaaaaab
aaaaaabaaaaaab
60
{10}
1
baaaaaabaaaaaabaaaaaabaaaaaabaaaaaabaaaaaabaaa
aaabaaaaaab
57
{10}
1
aaaaabaaaabaaaaaabaaabaaaaaaabaabaaaaaaaababaa
aaa
49
{10}
1
aaaaabaaabaaaaaaababaaaaaaaaabaaaabaaaaaabaaba
aaa
49
{10}
1
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
abbbbabbabbbbbbbbababbbbabbaabbbabbbbabbbabbbba
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 510 2 1 3 9 4 6 7 821010 6 8 210 4 3 7 5 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t2101010 2 3 4 5 6 9 7 8210 2 810 1 3 4 5 7 6 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 310 2 1 4 5 6 9 7 821010 810 2 3 4 5 7 6 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21010 2 3 5 4 6 7 8 910210 410 1 2 3 5 6 7 8 9
Automaton
t2101010 2 3 4 5 6 9 7 8210 8 110 2 3 4 5 7 6 9
47
{10}
1
baaaabbaaaabbaaaabbaaaabbaaaabbaaaabbaaababaab
46
{10}
1
abbbbabbbbabbbbabbbbbabbbbbabbbbbabbbbbabbbbba
46
{10}
1
baabbbbaabbbbaabbbbaabbbbaabbbbaabbbbaabbbbaab
46
{10}
1
abbbbabababbbbbbabbabbbbbabbbbbbbbababbabbbbba
46
{10}
1
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
abbbbabbbbabbbbabbbbabbbbbabbbbbabbbbbabbbbba
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 210 1 3 4 5 6 9 7 821010 810 2 3 4 5 7 6 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21010 1 2 3 4 5 6 7 8 921010 2 3 4 5 6 7 9 810
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 2 610 3 4 8 1 5 7 921010 7 5 4 3 9 210 6 8
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21010 1 2 3 4 5 6 7 8 921010 2 3 4 5 7 6 8 910
Automaton
t21010 1 2 3 4 5 6 7 8 921010 3 2 4 5 6 7 8 910
45
{10}
1
baabbbbaabbbbaabbbbaabbbbaabbbbaabbbbaabbbbab
45
{10}
1
baaaaaabaaababaabaabaaababaaaaabaabaaaaaaaaab
45
{10}
1
baaaababaaaababaaaababaaaaabaabbaababaaaaaab
44
{10}
1
baaaabaabaabaaabaaaaabaaaaabaaaababaabaaaaab
44
{10}
1
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
baabaabaabaabaaababaabaaabaaaabaaaabababaaab
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21010 3 1 2 4 5 6 7 8 921010 2 4 3 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 310 2 6 4 7 1 5 8 921010 3 2 5 4 8 9 6 710
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 310 2 1 4 5 6 7 8 921010 210 3 4 5 6 7 8 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 410 2 3 1 5 6 7 8 9210 31010 2 4 5 6 7 8 9
Automaton
t21010 2 4 3 5 6 9 8 71021010 3 1 2 4 5 6 7 8 9
44
{10}
1
baaaaabaabaaabaabaaabaabaaababaabaaaaabaaaab
44
{10}
1
baaaaaababaabbaabaaaaababaaaababaaaababaaaab
44
{10}
1
baaaaaabaabaaaabaabaaaabaabaaaaaaabaaaaaaab
43
{10}
1
babaaaaaabbaaaaaabbaaaaaabbaaaaaabbaaaaaabb
43
{10}
1
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
abbababbabbbabbbabbbaababbbbabbbaababbbabba
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 310 2 1 4 5 6 7 8 921010 4 5 2 3 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21010 1 2 3 4 5 6 7 8 921010 8 2 3 4 5 6 7 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 210 1 3 4 5 6 7 8 921010 2 3 4 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t210 210 1 3 4 5 6 7 8 9210 210 4 3 6 5 8 710 9
Automaton
t210 210 4 3 6 5 910 7 8210 8 1 2 3 4 5 610 7 9
43
{10}
1
bababbaaaaaabababaaaaaaabaaabaaaabaaabaaaab
43
{10}
1
baabaabaabbaaabaababaaaaabaababbbabaaaaaaab
43
{10}
1
babaaaaaaababaaaaaaababaaaaaaabab
33
{9, 10}
2
bbaaaaaaaabbaaaaaaaabbaaaaaaaabb
32
{9, 10}
2
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
A.2.3
aabbbbbbbabbabbbbbbabbabbbbbbbaa
32
{8, 10}
2
ℓ        60 64 68 73 76 82 83 84 91 92 100
N(ℓ)     5 0 2 0 1 0 2 0 1 0 2 0 1 0 2 1 0 1 0
Classes  C11,4  C11,3  C11,2  G11  W11  D11  C11  F11
60 ≤ ℓ ≤ 100.
Automaton
Shortest
terminal word
t21111 1 2 3 4 5 6 7 8 91021111 2 3 4 5 6 7 8 91011
baaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaab
aaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaaba
aaaaaaaaab
100
Terminal
threshold
Terminal subset {11}
Rank
1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t2111111 2 3 4 5 6 7 8 910211 211 1 3 4 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
t21111 2 3 4 5 6 7 8 91011211 211 1 3 4 5 6 7 8 910
abbaaaaaaaaabbaaaaaaaaabbaaaaaaaaabbaaaaaaaaab
baaaaaaaaabbaaaaaaaaabbaaaaaaaaabbaaaaaaaaabba
92
{11}
1
abbbbbbbbbabbbbbbbbbabbbbbbbbbabbbbbbbbbabbbbbbbbbabbb
bbbbbbabbbbbbbbbabbbbbbbbbabbbbbbbbba
91
{11}
Rank
Automaton
Shortest
terminal word
t2111111 2 3 4 5 6 7 8 91021111 1 2 3 4 5 6 7 8 910
abaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaa
baaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaab
a
91
Terminal
threshold
Rank
1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 21111 3 4 5 6 7 8 910211 2 411 3 1 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 311 2 1 4 5 6 7 8 91021111 2 3 4 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 21111 3 4 5 6 7 8 910211 211 1 3 4 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
t211 3 6 211 4 5 1 7 8 910211 311 211 4 5 6 7 8 910
aababaaaaaaaababaaaaaaaababaaaaaaaababaaaaaaaa
babaaaaaaaababaaaaaaaababaaaaaaaababaa
84
{11}
1
baaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaab
aaaaaaaabaaaaaaaabaaaaaaaabaaaaaaaab
82
{11}
1
aabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaa
abaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaa
82
{11}
1
baaaaaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaaaabaa
aaaaabaaaaaaabaaaaaaabaaaaaaab
76

Rank
1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 411 2 3 1 5 6 7 8 91021111 2 3 4 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 311 211 4 5 6 7 8 910211 311 2 1 4 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 411 2 311 5 6 7 8 910211 4 8 2 311 5 6 7 1 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 511 2 3 4 1 6 7 8 91021111 2 3 4 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
t211 411 2 3 1 5 6 7 8 910211 411 2 311 5 6 7 8 910
baaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaa
aabaaaaaaabaaaaaaabaaaaaaab
73
{11}
1
aaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaa
aabaaaaaaaaaabaaaaaaaaaabaaa
73
{11}
1
aaaabaaabaaaaaabaaabaaaaaabaaabaaaaaabaaabaaaa
aabaaabaaaaaabaaabaaaa
68
{11}
1
aaabaaaaaabaaaaaab
64
{11}
1
aaabaaaaaabaaaaaab
64

Rank
1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 311 5 2 7 4 9 611 810211 2 4 1 6 3 8 510 711 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 510 2 3 411 6 7 8 9 1211 511 2 3 411 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 81111 2 3 4 5 7 6 9102111011 2 3 4 5 6 9 7 8 1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 2 311 1 4 5 6 710 8 92111111 9 3 2 4 5 6 8 710
Automaton
Shortest
terminal word
Terminal
threshold
t21111 4 5 2 3 7 61011 8 9211 9 1 211 4 5 3 6 8 710
abbbbbbaaaaaaabbbbbbaaaaaaabbbbbbaaaaaaabbbbbbaaa
aaaabbbbbba
60
{11}
1
baaaaaaaaaabaaaaabaaaaabaaaaabaaaaabaaaaabaaaa
abaaaaabaaaaab
60
{11}
1
aaaaabbaaaaaaaaaababaaaaaaaaabaabaaaaaaaabaaab
aaaaaaabaaaaba
60
{11}
1
baaabbbbaaabbbbaaabbbbaaabbbbaaabbbbaaabbbbaaabbbb
aaabbbbaab
60
{11}
1
abbbbbabbbbbabbbbbabbbbbabbbbbbabbbbbbabbbbbbabbbbbba
bbbbbba
60

Rank
1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 3 1 5 2 7 4 9 611 810211 2 411 6 3 8 510 711 9
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 611 2 3 4 5 1 7 8 91021111 2 3 4 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21111 311 2 4 5 6 710 8 9211 9 111 3 2 4 5 6 8 710
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 511 2 3 411 6 7 8 910211 511 2 3 4 1 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
t211 511 2 3 7 4 9 611 810211 2 3 4 6 1 8 510 711 9
bbbbbbaaaaaaabbbbbbaaaaaaabbbbbbaaaaaaabbbbbbaaaa
aaabbbbbb
58
{11}
1
baaaaabaaaaabaaaaabaaaaabaaaaabaaaaabaaaaabaaa
aabaaaaab
55
{11}
1
ababbbaabbbbaabbbbaabbbbaabbbbaabbbbaabbbbaabbbbaab
bbba
55
{11}
1
aaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaa
aaaabaaaaa
55
{11}
1
abbbbbabbbbbabbbbbabbbbbabbbbbabbbbbabbbbbabbbbbabbbb
ba
55

Rank
1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 2 311 1 4 5 6 7 8 91021111 2 4 3 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 411 1 2 3 5 6 7 8 91021111 2 3 5 4 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t21111 4 5 2 3 6 7 8 91011211 311 2 1 4 5 6 7 8 910
Automaton
Shortest
terminal word
Terminal
threshold
Terminal subset
Rank
t211 3 4 211 1 5 6 7 8 91021111 2 3 5 4 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
t211 511 2 1 3 4 6 7 8 91021111 2 3 4 7 6 5 8 91011
baaaabaabbaabbaaaabaaaabaaaabaaaabaaaaaabaababa
aabaaaab
55
{11}
1
baabaaaaabababaaaaaaaababaabaaaaaabaaabaaaaabaa
abaaaaab
55
{11}
1
abaabbbbababbbbabbbbbbabbabbbbbabbbbabbbbbabbbabbbbb
a
53
{11}
1
baaabbaabaaabbaabaaaaabaabaaaaaaababaaaabaabaab
aaaaab
53
{11}
1
babbaabaaaaabababbaaaaabaabaaaaaababaaabaaaabaa
aaaaab
53

Rank
1
A.2.4
ℓ        61 62 63 66 67 71 76 81 111 112 121
N(ℓ)     3 3 1 0 2 3 0 5 0 1 0 1 0 1 1 0 1 0
Classes  C12,6  C12,4  W12  D12  C12
61 ≤ ℓ ≤ 121.
Automaton
t21212 1 2 3 4 5 6 7 8 9101121212 2 3 4 5 6 7 8 9101112
Shortest
baaaaaaaaaaabaaaaaaaaaaabaaaaaaaaaaabaaaaaaaaaaab
terminal word aaaaaaaaaaabaaaaaaaaaaabaaaaaaaaaaabaaaaaaaaaaaba
aaaaaaaaaabaaaaaaaaaaab
Terminal
121
threshold
Terminal
{12}
subset
Rank
1
Automaton
t212 212 1 3 4 5 6 7 8 910112121212 2 3 4 5 6 7 8 91011
Shortest
baaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaa
terminal word aaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaa
aabaaaaaaaaaab
Terminal
112
threshold
Terminal
{12}
subset
Rank
1
Automaton
t21212 1 2 3 4 5 6 7 8 910112121212 2 3 4 5 6 7 8 91011
Shortest
baaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaa
terminal word aaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaaabaaaaaaaaa
abaaaaaaaaaab
Terminal
111
threshold
Terminal
{12}
subset
Rank
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 512 2 3 4 1 6 7 8 9101121212 2 3 4 5 6 7 8 9101112
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 512 2 3 412 6 7 8 91011212 510 2 3 412 6 7 8 9 111
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 512 2 3 412 6 7 8 91011212 512 2 3 4 1 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 3 512 6 2 4 8 71112 91021210 112 2 3 5 6 4 7 9 811
Automaton
t212 91212 3 2 4 5 6 8 7101121211 312 2 4 5 6 710 8 9 1
baaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaaaabaaaaaaab
aaaaaaabaaaaaaabaaaaaaabaaaaaaab
81
{12}
1
aaaaabaaaabaaaaaabaaaabaaaaaabaaaabaaaaaabaaaabaa
aaaabaaaabaaaaaabaaaabaaaaa
76
{12}
1
aaaaabaaaaaaaaaaabaaaaaaaaaaabaaaaaaaaaaabaaaaaaa
aaaabaaaaaaaaaaabaaaaa
71
{12}
1
aabbbbbaabbbbbaabbbbbaabbbbbaabbbbbababbbbaabbbbbaabbbbb
aabbbbbaabbbbba
71
{12}
1
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
aaaaaabaaaaabaaaaaaabaaaabaaaaaaaabaaabaaaaaaaaab
aabaaaaaaaaaababaaaaaa
71
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t2121012 212 4 3 6 5 7 8 911212 91112 2 3 4 5 6 8 710 1
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 2 3 5 1 7 4 9 611 81210212 412 2 6 3 8 510 712 911
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t2121212 2 3 6 4 5 7 811 910212 21012 1 4 3 6 5 7 9 811
Automaton
Shortest
terminal word
Terminal
threshold
t212 312 2 1 6 4 5 7 811 910212121012 2 4 3 6 5 7 9 811
{12}
1
aaaaabbaaaaaaaaaabaabaaaaaaaabaaaabaaaaabbaaaaaaa
aaaababaaaaaaaaabaaaba
71
{12}
1
baaaaaabaaaaaabaaaaaabaaaaaabaaaaaabaaaaaabaaaaaa
baaaaaabaaaaaabaaaaaab
71
{12}
1
abbbbbabbbbbabbbbbabbbbbabbbbbbabbbbbbabbbbbbabbbbbbabbbb
bbabbbbbba
67
{12}
1
baabbbbbaabbbbbaabbbbbaabbbbbaabbbbbaabbbbbaabbbbbaabbbb
baabbbbbaab
67
Terminal
subset
Rank
{12}
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 2 3 512 7 4 9 611 81210212 4 1 2 6 3 8 510 712 911
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t2121212 2 3 6 4 5 7 811 91021210 112 2 4 3 6 5 7 9 811
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 212 1 3 6 4 5 7 811 910212121012 2 4 3 6 5 7 9 811
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 212 1 3 4 5 6 7 8 9101121212 2 3 4 5 6 7 8 9101112
aaaaaabbbbbbbbaaaaaabbbbbbbbaaaaaaaaaaabbbbbbbbaaaaa
abbbbbbbbaaaaaa
67
{12}
1
abbbbbabbbbbabbbbbabbbbbabbbbbabbbbbbabbbbbbabbbbbbabbbbb
babbbbbba
66
{12}
1
baabbbbbaabbbbbaabbbbbaabbbbbaabbbbbaabbbbbaabbbbbaabbbb
baabbbbbab
66
{12}
1
babaaaaaaaaababaaaaaaaaababaaaaaaaaababaaaaaaaaaba
b
51
{11, 12}
2
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 212 4 3 6 5 8 710 91211212 212 1 3 4 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 212 4 3 6 5 8 71112 91021210 1 2 3 4 5 6 7 812 911
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t21212 2 3 412 5 7 8 6 91110212 2 512 1 3 4 6 9 7 81011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 212 6 9 3 4 5 7 1 8101121212 2 3 4 5 8 7 61110 912
aabbbbbbbbbbaabbbbbbbbbbaabbbbbbbbbbaabbbbbbbbbbaa
50
{11, 12}
2
aabbbbbbbbbabbabbbbbbbbabbabbbbbbbbabbabbbbbbbbbaa
50
{10, 12}
2
ababbbaaaabaaaabbbaaaabaaaabbbaaaabaaaabbbaba
45
{11, 12}
2
babaaaabababaaaaaaababbabaaaaabbabaabbabaaab
44
{6, 12}
2
Automaton
t212 212 1 9 6 4 3 5 7 8101121212 3 2 4 5 8 7 61110 912
Shortest
baaababbaababbaaaaababbabaaaaaaabababaaaabab
terminal word
Terminal
44
threshold
Terminal
subset
Rank
{9, 12}
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 21212 3 4 5 6 7 8 91011212 2 412 3 1 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 412 212 3 5 6 711 9 810212 2 310 1 6 4 8 5 712 911
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 2 312 9 7 1 4 5 6 8101121212 2 6 4 5 310 811 7 912
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 4 3 6 8 212 511 1 9 71021212 2 5 9 3 7 610 4 81211
aabbbbaaaaaaaabbbbaaaaaaaabbbbaaaaaaaabbbbaa
44
{11, 12}
2
aabbbababbababbbbababbababbbbababbababbbbbaa
44
{10, 12}
2
babaaaababaaaaaaababaaababaaaabab
33
{9, 10, 12}
3
baaaababaabbaaaababbabaaaaaabaab
32
{4, 5, 12}
3
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t21212 4 2 5 3 6 8 7 9111012212 2 312 1 4 5 6 7 8 91011
Automaton
Shortest
terminal word
Terminal
threshold
Terminal
subset
Rank
t212 312 2 6 4 710 5 8 1 91121212 3 2 5 4 8 9 6 7111012
abaaabbabbbbababaaabbabbbbabbaba
32
{10, 11, 12}
3
baaaabbaababaaaaaababbabaaaaabab
32
{5, 10, 12}
3
APPENDIX B
Terminal-Threshold Histograms
This appendix lists the histograms of terminal thresholds created during our computational enumeration of all arity-2, size-n automata, for 3 ≤ n ≤ 8. Each table gives the number of size-n automata of rank k with terminal threshold T.
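The two quantities tabulated in this appendix, the rank k and the terminal threshold T, can both be read off a breadth-first search over subsets of states. The C++ sketch below illustrates that idea and is not the program used to produce the tables: it starts from the full state set, records the smallest subset size ever reached (the rank), and records the distance at which that size first appears (the terminal threshold). The name terminal_info, the 0-based state convention, and the restriction to small n (subsets are stored as bitmasks) are assumptions made for illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

struct TerminalInfo {
    int rank;       // size of the smallest reachable image
    int threshold;  // length of a shortest word reaching an image of that size
};

// letters[x][q] is the 0-based state reached from state q on letter x.
// Intended for small automata only (roughly n <= 20), since all 2^n
// subsets are indexed.
TerminalInfo terminal_info(const std::vector<std::vector<int>>& letters) {
    const int n = static_cast<int>(letters[0].size());
    const std::uint32_t full = (std::uint32_t{1} << n) - 1;

    std::vector<int> dist(std::size_t{1} << n, -1);
    std::queue<std::uint32_t> frontier;
    dist[full] = 0;
    frontier.push(full);

    TerminalInfo best{n, 0};
    while (!frontier.empty()) {
        const std::uint32_t subset = frontier.front();
        frontier.pop();

        // Subsets leave the queue in order of distance, so the first time a
        // new minimum size is seen, its distance is the shortest possible.
        const int size = static_cast<int>(std::bitset<32>(subset).count());
        if (size < best.rank) {
            best.rank = size;
            best.threshold = dist[subset];
        }

        for (const auto& letter : letters) {
            std::uint32_t next = 0;
            for (int q = 0; q < n; ++q) {
                if (subset & (std::uint32_t{1} << q)) {
                    next |= std::uint32_t{1} << letter[q];
                }
            }
            if (dist[next] < 0) {
                dist[next] = dist[subset] + 1;
                frontier.push(next);
            }
        }
    }
    return best;
}
```

Tallying the returned (rank, threshold) pairs over every automaton of a given size would give one histogram of the kind shown below. A permutation automaton, for example, contributes to the row T = 0 in the column k = n.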
B.1
Terminal-threshold histograms of all automata
We begin with the terminal-threshold histograms of all enumerated automata. Note that, as discussed in Chapter 8, the most slowly-converging, non-synchronising automata listed in these tables are disconnected.
B.1.1
Terminal-threshold histogram of size-3 automata

T \ k      1      2      3
0          0      0      5
1         14     13
2         29
3          4
4          2
B.1.2
Terminal-threshold histogram of size-4 automata

T \ k      1      2      3      4
0          0      0      0     23
1         51    138     71
2        478    100
3        375     10
4        104      2
5         64
6         21
7         11
8          5
9          2
B.1.3
Terminal-threshold histogram of size-5 automata

T \ k      1      2      3      4      5
0          0      0      0      0     89
1        174    952   1019    415
2       7003   3033    499
3      12229   1154     38
4       7350    218      8
5       3535    102
6       1855     24
7        986     15
8        551      5
9        294      2
10       151
11        46
12        43
13        23
14        11
15         4
16         1
B.1.4
Terminal-threshold histogram of size-6 automata

T \ k       1       2       3       4       5       6
0           0       0       0       0       0     484
1         570    6585   12129    8708    2795
2      103769   71230   22481    2805
3      363010   54306    5714     164
4      350652   19565     917      22
5      217845    7469     386
6      118630    3016     107
7       71184    1519      48
8       40956     754      20
9       23098     366       8
10      12677     157
11       6494      50
12       4096      43
13       2714      23
14       1425      11
15        718       4
16        378       1
17        208
18        112
19         61
20         45
21         22
22         11
23          2
24          0
25          2
B.1.5
Terminal-threshold histogram of size-7 automata

k
T
0
0
0
0
0
0
0 2904
1
1837 41565 128457 148320 80332 21540
2 1625796 1533823 735711 185959 18750
3 10816132 2214125 349514 31925 882
4 15553259 1188288 96471 4046
86
5 12573692 533341 31815 1350
6 7908014 232278 12782
341
7 4875236 123488 5760
137
8 3004633 64881 2896
55
9 1749654 33912 1408
22
10
985608 16869
632
11
551444
8171
188
12
327420
5023
180
13
210221
3245
92
14
123451
1633
44
15
69507
828
16
16
41203
410
4
17
24608
216
18
14131
113
19
8291
63
20
5738
45
21
3536
22
22
2047
11
23
1006
2
24
479
0
25
393
2
26
203
27
123
28
75
29
30
31
32
33
34
35
36
B.1.6
k
T
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
39
13
3
3
0
0
0
1
0
5833
27368343
333762012
695438027
716459494
531635252
351545569
225016129
137371011
79514375
45572201
26404390
16197318
9716381
5689365
3442881
2123727
1271965
782029
521917
342692
209699
120956
68917
47757
30582
19624
12377
0
259162
33046085
86484027
65012120
35558128
17313391
9329153
5208031
2827812
1486177
789205
457898
289937
161976
90061
50949
29704
16560
9731
6636
4000
2264
1057
482
428
205
123
75
0
1266793
21387325
18312915
7027155
2588790
1051071
518132
264120
135214
67422
31603
20380
12739
6502
3203
1601
843
449
246
180
88
44
8
0
8
0
0
0
0 22002
2223237 1875604 821669 189753
8208652 1726812 144346
2613233 219006 5558
536201 23110
322
144647
6450
50599
1576
20182
517
9416
215
4438
86
1733
522
481
253
121
44
11
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
B.2
7103
3833
2070
1524
901
516
364
222
92
39
12
5
1
3
1
1
0
0
0
0
1
39
13
3
3
0
0
0
1
Terminal-threshold histograms of strongly-connected automata
As discussed in Chapter 5, we need only consider strongly-connected automata when searching for edge cases or counterexamples to the Černý or Pin conjectures. Furthermore, as discussed in Chapter 8, the terminal-threshold distributions of strongly-connected automata are significantly different from those listed above. Hence, we produce here the terminal-threshold histograms of all enumerated strongly-connected automata, in the same format as above.
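As a reminder of the property being filtered on, an automaton is strongly connected when every state can be reached from every other state by some word. The C++ sketch below checks this directly with one graph search per start state. It is a minimal illustration under assumed conventions (0-based states, one image vector per letter, and the hypothetical name is_strongly_connected) and is not the filter used for the enumeration.

```cpp
#include <vector>

// letters[x][q] is the 0-based state reached from state q on letter x.
// Returns true when every state is reachable from every other state.
bool is_strongly_connected(const std::vector<std::vector<int>>& letters) {
    const int n = static_cast<int>(letters[0].size());
    for (int start = 0; start < n; ++start) {
        std::vector<bool> seen(n, false);
        std::vector<int> stack{start};
        seen[start] = true;
        int reached = 1;

        // Depth-first search along the transitions of every letter.
        while (!stack.empty()) {
            const int q = stack.back();
            stack.pop_back();
            for (const auto& letter : letters) {
                const int r = letter[q];
                if (!seen[r]) {
                    seen[r] = true;
                    ++reached;
                    stack.push_back(r);
                }
            }
        }
        if (reached != n) {
            return false;
        }
    }
    return true;
}
```

Running one search per state costs on the order of n² letter applications for a fixed alphabet, which is negligible at the sizes tabulated here. A single forward and backward search from one state would suffice for larger inputs.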
B.2.1

T \ k      1      2      3
0          0      0      4
1          3      2
2         14
3          3
4          2
B.2.2

T \ k      1      2      3      4
0          0      0      0     15
1          4     21      8
2        115     17
3        146      3
4         49
5         44
6         18
7         10
8          5
9          2
B.2.3

T \ k      1      2      3      4      5
0          0      0      0      0     55
1          4     26     50     49
2        536    122     24
3       1880     77      2
4       1991     15
5       1162     18
6        782      0
7        494      4
8        310
9        196
10       123
11        38
12        37
13        19
14        11
15         4
16         1
B.2.4

k
T
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
B.2.5
0
5
4204
30315
49765
46174
29487
21650
13370
8128
4569
3103
1993
1433
756
442
275
165
94
47
40
18
11
2
0
2
0
84
1190
1301
621
296
121
106
45
26
3
2
0 0 0 330
219 315 299
387 93
147 8
18
18
7

k
T
0
0
0
0
0
0
0 2100
1
7 100 557 1253 1853 2022
2
30956 6851 3538 1463 447
3 457530 18413 2212 366 32
4 1149576 13132 815 33
5 1466376 8499 213 16
6 1238981 3232 179
7
7 881418 2369 89
8 618764 999 67
9 390284 814 10
10 229088 322 14
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
B.2.6
136729
82188
51980
34012
21605
14623
9086
5533
3333
2437
1660
1011
556
331
243
157
96
62
33
10
3
3
0
0
0
1
293
80
95
15
34
9
7
1
2
0
8

k
T
0
1
2
3
4
5
6
7
8
9
10
0
0
270155
7620491
28686239
47993654
51103869
42504761
30501056
20731218
12731689
0
113
49713
259616
285052
223497
113114
68984
32199
20845
10878
0
1220
27119
32499
17456
7037
3088
1706
977
475
336
0
5487
17679
7641
2210
536
251
114
54
27
0
0
0 16423
9918 14523 15956
8330 2693
1449 185
119
50
17
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
7611127
4383243
2587263
1542258
939234
589899
386141
248803
169184
112995
78121
50282
32393
22100
15098
10463
7090
4608
2836
1814
1269
1000
638
373
279
136
68
32
9
4
1
3
1
1
0
0
0
0
1
7151
3585
2434
1275
1136
432
340
117
102
31
23
7
5
0
1
147
121
41
26
4
APPENDIX C
Source Code
Listed in this appendix is the source code of the programs written and used for this dissertation.
C.1
Generating automata
C.1.1
The gena method
Here we present the source code for the C++ program gena, which implements a method of automata generation originally described by Kisielewicz and Szykuła [16]. For more information on this process, see Chapter 7.
// compile with
// g++ -std=c++11 gena.cpp -o gena

#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>
#include <fstream>
#include <sstream>
#include <cstdio>
#include <cstdlib>
#include <cstring>
using namespace std;

const char* about =
    "gena is an implementation of the algorithm described in the\n"
    "paper Generating Small Automata and the Cerny Conjecture\n"
    "by Andrzej Kisielewicz and Marek Szykula.\n"
    "\n"
    "The program takes two text files describing all non-isomorphic\n"
    "n-state automata with k-1 and 1 symbol(s) in their alphabet,\n"
    "respectively, and generates all n-state automata with k symbols\n"
    "in their alphabet. The program is designed to avoid generating\n"
    "isomorphic automata, but some may be generated nevertheless.\n"
    "\n"
    "The format used to describe an automaton is that of J.D.\n"
    "Mitchell's GAP library Semigroups.\n"
    "\n"
    "The program can be invoked with\n"
    "gena n <file a> <file b> [<a start> <a end> <b start> <b end>]\n"
    "where the optional parameters are used to generate only those\n"
    "automata arising from the combination of automata on lines\n"
    "[a start, a end) of file a and lines [b start, b end) of file\n"
    "b. Hence automata generation can be parallelised, so long as\n"
    "every pair of automata from a and b are combined at least once.\n";

typedef vector<int> perm;
typedef vector<int> trans;
typedef vector<trans> automaton;

// Iterates through all automorphisms of an automaton that fix a given
// set of states
class aut_iter {
    vector<vector<int>> images;
    vector<int> indeg, pos;
    const int fixes;
    const automaton atm;

    // Gets the image of a point under the current perm
    int im(int i) const {
        return (fixes & (1<<i)) == 0 ? images[indeg[i]][pos[i]] : i;
    }

    // Makes next perm out of list of disjoint perms
    bool next_perm() {
        int i;
        for(i = images.size() - 1; i >= 0; --i) {
            if(next_permutation(images[i].begin(), images[i].end())) {
                break;
            }
        }

        return i >= 0;
    }

    // Returns true if current perm is a strong automorphism on atm
    bool is_aut() const {
        for(const trans& t : atm) {
            for(int i = 0; i < t.size(); ++i) {
                if(im(t[i]) != t[im(i)]) {
                    return false;
                }
            }
        }

        return true;
    }

public:
    aut_iter(const automaton& atm, int fixes = 0) :
        atm(atm), fixes(fixes), indeg(atm[0].size()),
        pos(atm[0].size()) {
        // calculate indegrees for non-fixed states
        int maxindeg = 0;
        for(const trans& t : atm) {
            for(int i : t) {
                if((fixes & (1<<i)) == 0) {
                    maxindeg = max(maxindeg, ++indeg[i]);
                }
            }
        }

        // store perm as composition of perms on all states
        // of each indegree
        images.resize(maxindeg + 1);
        for(int i = 0; i < atm[0].size(); ++i) {
            if((fixes & (1<<i)) == 0) {
                pos[i] = images[indeg[i]].size();
                images[indeg[i]].push_back(i);
            }
        }
    }

    // Populates p with current automorphism
    void cur(perm& p) const {
        p.resize(atm[0].size());
        for(int i = 0; i < p.size(); ++i) {
            p[i] = im(i);
        }
    }

    // Calculates next automorphism, returning false
    // if there is none
    bool next() {
        while(next_perm()) {
            if(is_aut()) {
                return true;
            }
        }

        return false;
    }
};
123
124
125
126
127
128
129
130
// Sets p_inv to the inverse of p
void inv_perm(const perm& p, perm& p_inv) {
    p_inv.resize(p.size());
    for(int i = 0; i < p.size(); ++i) {
        p_inv[p[i]] = i;
    }
}

// Generates the PrevA array
void gen_prev_a(const automaton& a, vector<int>& prev_a) {
    const int n = a[0].size();

    prev_a.resize(n);
    fill(prev_a.begin(), prev_a.end(), -1);

    aut_iter aut(a);

    do {
        perm pi, pi_inv;
        aut.cur(pi);
        inv_perm(pi, pi_inv);

        for(int h = 0; h < n; ++h) {
            if(h > 0 && pi[h-1] != h-1) {
                break;
            }

            const int i = pi_inv[h];
            if(i > h) {
                prev_a[i] = max(prev_a[i], h);
            }
        }
    }
    while(aut.next());
}

// Generates the PrevB array
void gen_prev_b(const automaton& b, vector<vector<bool>>& prev_b) {
    const int n = b[0].size();

    prev_b.resize(1<<n);
    fill(prev_b.begin(), prev_b.end(), vector<bool>(n));

    for(int S = 0; S < (1<<n); ++S) {
        aut_iter aut(b, S);

        do {
            perm pi, pi_inv;
            aut.cur(pi);
            inv_perm(pi, pi_inv);

            for(int j = 0; j < n; ++j) {
                if(pi_inv[j] < j) {
                    prev_b[S][j] = true;
                }
            }
        }
        while(aut.next());
    }
}

// Combines a and b with pi
void union_automata(const automaton& a, const automaton& b,
                    const perm& p, automaton& c) {
    perm p_inv;
    inv_perm(p, p_inv);

    // number of states
    const int n = a[0].size();
    const trans& t = b[0];

    c = a;
    c.push_back(trans(n));

    // relabel the transformation of b by p and append it to a
    for(int i = 0; i < n; ++i) {
        c.back()[i] = p_inv[t[p[i]]];
    }
}

// Reads the next n characters of in and parses them as an integer
int read_chars(istream& in, int n) {
    char tmp[BUFSIZ];
    in.get(tmp, n + 1);

    return atoi(tmp);
}

// Reads in an automaton as a set of transformations
bool read_automaton(automaton& atm, int size, istream& in = cin) {
    atm.clear();

    string line, trans;
    getline(in, line);
    istringstream line_ss(line);

    // skip the leading 't'
    line_ss.ignore();
    while(!line_ss.eof()) {
        // m is the field width, n the degree of the transformation
        const int m = read_chars(line_ss, 1);
        const int n = read_chars(line_ss, m);

        atm.emplace_back();
        for(int i = 0; i < n; ++i) {
            atm.back().push_back(read_chars(line_ss, m) - 1);
        }

        // pad transformation if necessary
        for(int i = n; i < size; ++i) {
            atm.back().push_back(i);
        }
    }

    return !atm.empty();
}

// Prints out the automaton as a set of transformations
void write_automaton(const automaton& atm, ostream& out = cout) {
    // find number of characters required to print numbers
    int digits = atm[0].size(), m = 0;
    while(digits != 0) {
        m++;
        digits /= 10;
    }

    out << setfill(' ');

    out << 't';
    for(const trans& t : atm) {
        out << m;
        out << setw(m) << t.size();

        for(int i : t) {
            out << setw(m) << i + 1;
        }
    }

    out << endl;
}

// Generates all automata that can be constructed from A and an
// automaton isomorphic to B
void permutation_procedure(const automaton& a, const automaton& b,
                           const vector<int>& prev_a,
                           const vector<vector<bool>>& prev_b,
                           int i, perm pi, int S,
                           ostream& out = cout) {
    // number of states
    const int n = a[0].size();

    if(i == n) {
        automaton c;
        union_automata(a, b, pi, c);
        write_automaton(c, out);
    }
    else {
        const int m = prev_a[i] >= 0 ? pi[prev_a[i]]+1 : 0;

        for(int j = m; j < n; ++j) {
            if((S & (1<<j)) == 0 && !prev_b[S][j]) {
                perm pi_ext(pi);
                pi_ext.push_back(j);

                const int S_ext = S | (1<<j);
                // pass out along, so that a non-default stream
                // is honoured at every depth of the recursion
                permutation_procedure(a, b, prev_a, prev_b,
                                      i+1, pi_ext, S_ext, out);
            }
        }
    }
}

// Returns the offset in the file of the start of the given line
int line_offset(istream& in, int line) {
    in.clear();
    in.seekg(0);

    string dump;
    int i = 1;
    while(in.peek() != EOF && i != line) {
        getline(in, dump);
        i++;
    }

    return i == line ? (int)in.tellg() : -1;
}

// Generates (mostly) non-isomorphic automata from combinations of
// the automata described between lines [a_start_line, a_end_line) of
// file af and between lines [b_start_line, b_end_line) of bf
void gena(int size, istream& af, istream& bf,
          int a_start_line, int a_end_line,
          int b_start_line, int b_end_line) {
    const int a_start_pos = max(line_offset(af, a_start_line), 0);
    const int a_end_pos = line_offset(af, a_end_line);
    const int b_start_pos = max(line_offset(bf, b_start_line), 0);
    const int b_end_pos = line_offset(bf, b_end_line);

    bf.clear();
    bf.seekg(b_start_pos);

    while(bf.peek() != EOF && bf.tellg() != b_end_pos) {
        automaton b;
        read_automaton(b, size, bf);

        vector<vector<bool>> prev_b;
        gen_prev_b(b, prev_b);

        af.clear();
        af.seekg(a_start_pos);

        while(af.peek() != EOF && af.tellg() != a_end_pos) {
            automaton a;
            read_automaton(a, size, af);

            vector<int> prev_a;
            gen_prev_a(a, prev_a);

            permutation_procedure(a, b, prev_a, prev_b, 0, {}, 0);
        }
    }
}

int main(int argc, char *argv[]) {
    if(argc < 4 || argc > 8) {
        cout << about << endl;
        return 1;
    }

    int size = atoi(argv[1]);
    ifstream af(argv[2]), bf(argv[3]);

    if(!af.good()) {
        cout << "error: reading '" << argv[2] << "'" << endl;
        return 1;
    }

    if(!bf.good()) {
        cout << "error: reading '" << argv[3] << "'" << endl;
        return 1;
    }

    // a range of -1 means "the whole file"
    int a_start_line = -1, a_end_line = -1,
        b_start_line = -1, b_end_line = -1;

    if(argc > 4) {
        a_start_line = atoi(argv[4]);
        a_end_line = a_start_line + 1;
    }

    if(argc > 5) {
        a_end_line = atoi(argv[5]);
    }

    if(argc > 6) {
        b_start_line = atoi(argv[6]);
        b_end_line = b_start_line + 1;
    }

    if(argc > 7) {
        b_end_line = atoi(argv[7]);
    }

    gena(size, af, bf, a_start_line, a_end_line,
         b_start_line, b_end_line);

    return 0;
}
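
The one-line format exchanged by read_automaton and write_automaton can be read off the two routines above: a leading t, then, for each transformation, a one-digit field width m, the degree n padded to m characters, and n one-based images of m characters each. The following is a minimal sketch of that format under those assumptions. The names encode_line and decode_line are hypothetical and do not appear in gena, and the decoder adds bounds checks that the original routine does not perform.

```cpp
#include <cassert>
#include <cstdlib>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

using trans = std::vector<int>;        // trans[i] is the zero-based image of state i
using automaton = std::vector<trans>;  // one transformation per input letter

// Hypothetical helper: writes atm as 't' followed by one block per transformation.
std::string encode_line(const automaton& atm) {
    assert(!atm.empty());
    // m is the number of characters needed to print the largest image
    int m = 0;
    for (std::size_t d = atm[0].size(); d != 0; d /= 10) ++m;

    std::ostringstream out;
    out << 't';
    for (const trans& t : atm) {
        out << m << std::setw(m) << t.size();
        for (int image : t) out << std::setw(m) << image + 1;  // one-based on disk
    }
    return out.str();
}

// Hypothetical helper: inverse of encode_line; stops at the first malformed block.
automaton decode_line(const std::string& line) {
    automaton atm;
    std::size_t pos = 1;  // skip the leading 't'
    while (pos < line.size()) {
        const int m = line[pos++] - '0';
        if (m <= 0 || m > 9 || pos + m > line.size()) break;
        const int n = std::atoi(line.substr(pos, m).c_str());
        pos += m;
        if (n < 0 || pos + static_cast<std::size_t>(n) * m > line.size()) break;

        trans t;
        for (int i = 0; i < n; ++i, pos += m) {
            t.push_back(std::atoi(line.substr(pos, m).c_str()) - 1);
        }
        atm.push_back(t);
    }
    return atm;
}
```

For instance, under this reading the two transformations (2, 1, 3) and (1, 1, 2) on three states share the single line t1321313112.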
C.1.2
The GAP method
As discussed in Chapter 7, a secondary method of automata generation was
written, and used to verify the correctness of the gena procedure. Here, we reproduce
the GAP code of this secondary routine, which accepts two state transformations
(each representing an arity-1 DFA), and outputs every possible arity-2 DFA arising
from their combination. The argument n is the number of states in each arity-1
DFA, f is the name of a file containing the list of input transformations, and first
and second are the indices of the desired input transformations (representing DFAs
$A$ and $B$, respectively) in the file f.

 1  GenerateAutomata := function(n, f, first, second)
 2    local sym, ftrans, strans, stab, cand;
 3
 4    ftrans := ReadGenerators(f, first)[1];
 5    strans := ReadGenerators(f, second)[1];
 6
 7    sym := SymmetricGroup(n);
 8    cand := strans^sym;
 9
10    stab := Stabilizer(sym, ftrans, POW);
11    return List(Orbits(stab, cand), o -> [ftrans, o[1]]);
12  end;

We now step through the execution of this routine. We begin by noting one
technicality. Let pi be a permutation in GAP code, representing a state permutation
$\pi$. Then, due to a difference in notational convention, the GAP code
[ftrans, strans^pi] represents the arity-2 DFA $A \cup B^{\pi^{-1}}$. Lines 4 and 5
above read the appropriate $A$ and $B$ from the given file. Lines 7 and 8 compute
the set $R$ of every possible relabelling of $B$. Line 10 calculates the stabiliser
$S$ of $A$, that is, the group of state permutations whose application leaves the
transition function of $A$ unchanged. Line 11 outputs the automaton $A \cup B'$ for
one $B'$ in each orbit of $S$ on $R$.
From the discussion above, it is clear that any DFA output by the procedure is
of the form $A \cup B^{\pi^{-1}\sigma^{-1}}$, for some state permutation $\pi$ and
$\sigma \in S$. Hence each output DFA is some union of $A$ and $B$. Furthermore, it
is not difficult to see that every possible union of $A$ and $B$ is output by the
above routine. For any possible union $A \cup B^{\pi^{-1}}$ and $\sigma \in S$, we
have that $A \cup B^{\pi^{-1}} \simeq A \cup B^{\pi^{-1}\sigma^{-1}}$; as $\sigma$
(and hence $\sigma^{-1}$) fixes the transformation from $A$,
$A \cup B^{\pi^{-1}\sigma^{-1}}$ is simply the renaming of $A \cup B^{\pi^{-1}}$ by
$\sigma$. Hence, we need only output the union of $A$ with one DFA in
each orbit of $S$ on $R$.
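
The orbit argument above can be checked by brute force on very small cases. The sketch below is not the GAP routine itself: it assumes zero-based states, enumerates all permutations with std::next_permutation, and picks the lexicographically least member of each orbit as its representative. The names relabel and orbit_representatives are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <set>
#include <vector>

using trans = std::vector<int>;  // trans[i] is the image of state i
using perm = std::vector<int>;   // a bijection on {0, ..., n-1}

// Relabels t by p: state p[i] of the result is sent to p[t[i]].
trans relabel(const trans& t, const perm& p) {
    trans r(t.size());
    for (std::size_t i = 0; i < t.size(); ++i) r[p[i]] = p[t[i]];
    return r;
}

// Returns one relabelling of b from each orbit of the stabiliser of a.
std::vector<trans> orbit_representatives(const trans& a, const trans& b) {
    assert(a.size() == b.size());
    perm p(a.size());
    std::iota(p.begin(), p.end(), 0);

    std::vector<perm> stab;        // permutations that leave a unchanged
    std::set<trans> relabellings;  // every relabelling of b (the set R)
    do {
        if (relabel(a, p) == a) stab.push_back(p);
        relabellings.insert(relabel(b, p));
    } while (std::next_permutation(p.begin(), p.end()));

    // The least element of an orbit is the same whichever member we start from,
    // so collecting the least elements yields exactly one entry per orbit.
    std::set<trans> reps;
    for (const trans& r : relabellings) {
        trans best = r;
        for (const perm& s : stab) best = std::min(best, relabel(r, s));
        reps.insert(best);
    }
    return std::vector<trans>(reps.begin(), reps.end());
}
```

Pairing a with each returned transformation gives one arity-2 DFA per orbit. The enumeration is factorial in the number of states, so this is only a cross-check for small n, not a substitute for the group-theoretic computation.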
C.2
Canonicalising automata
Here we present the source code for the C program canon_dfa, which was used
to canonicalise automata output by gena. Compilation of the program requires
the nauty package[18]. For more information on the canonicalisation process, see
Chapter 7.

/* compile with
 * gcc canon_dfa.c nautinv.o nausparse.c nauty.a -o canon_dfa
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "nausparse.h"
#include "nautinv.h"

#define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
#define MAX(X,Y) ((X) > (Y) ? (X) : (Y))

const char* usage =
    "canon_dfa is a tool for the canonicalisation of automata.\n"
    "The program reads in a sequence of automata from the standard\n"
    "input, one per line. The canonicalised version of each automaton\n"
    "is written to the standard output.\n"
    "\n"
    "The format used to describe an automaton is that of J.D.\n"
    "Mitchell's GAP library Semigroups.\n"
    "\n"
    "The program can be invoked with\n"
    "canon_dfa n\n"
    "where n is the number of states in each input automaton.\n";

/* returns TRUE if str is NULL or consists only of whitespace */
int is_blank(const char* str) {
    if(str == NULL) {
        return TRUE;
    }

    const char* c;
    for(c = str; *c != '\0'; ++c) {
        if(*c != ' ' && *c != '\t' &&
           *c != '\n' && *c != '\r') {
            return FALSE;
        }
    }

    return TRUE;
}

/* read in a dfa in J. D. Mitchell's Semigroups GAP package
   format */
int load_dfa(int orig_nv, const char* line, sparsegraph* g,
             int** lab, int** ptn) {
    char tmp_str[BUFSIZ];
    int i;

    /* count number of verts and edges */
    int arity = 0, orig_nde = 0;
    const int line_len = strlen(line);
    int cur_pos = 1;

    while(cur_pos < line_len) {
        /* m is number of characters needed to represent
           each number */
        const int m = line[cur_pos] - '0';
        cur_pos++;

        /* probably trailing whitespace */
        if(m <= 0 || m > 9) {
            break;
        }

        /* n is the degree of the current transformation */
        strncpy(tmp_str, line + cur_pos, m);
        tmp_str[m] = '\0';
        const int n = atoi(tmp_str);
        cur_pos += m;

        orig_nv = MAX(orig_nv, n);
        arity++;

        cur_pos += n * m;
    }

    /* account for a vert in the middle of each edge */
    orig_nde = arity * orig_nv;
    g->nv = orig_nv + orig_nde + arity;
    g->nde = 3 * orig_nde;

    /* allocate space for graph structure */
    g->v = (size_t*)malloc(sizeof(size_t) * g->nv);
    g->d = (int*)malloc(sizeof(int) * g->nv);
    g->e = (int*)malloc(sizeof(int) * g->nde);
    *lab = (int*)malloc(sizeof(int) * g->nv);
    *ptn = (int*)malloc(sizeof(int) * g->nv);

    if(g->v == NULL || g->d == NULL || g->e == NULL ||
       *lab == NULL || *ptn == NULL) {
        if(g->v != NULL) {
            free(g->v);
        }

        if(g->d != NULL) {
            free(g->d);
        }

        if(g->e != NULL) {
            free(g->e);
        }

        if(*lab != NULL) {
            free(*lab);
        }

        if(*ptn != NULL) {
            free(*ptn);
        }

        return FALSE;
    }

    /* fill in arity-based information */

    /* first comes original edges, which now end at
       introduced vertices */
    for(i = 0; i < orig_nv; ++i) {
        g->v[i] = arity * i;
    }
    for(i = 0; i < orig_nv; ++i) {
        g->d[i] = arity;
    }

    /* then comes introduced vertices, each of which has a
       single outgoing edge to complete the original edge
       into which they were inserted */
    for(i = 0; i < orig_nde; ++i) {
        g->v[orig_nv + i] = orig_nde + i;
    }
    for(i = 0; i < orig_nde; ++i) {
        g->d[orig_nv + i] = 1;
    }

    /* then comes one vertex per transformation, with one
       outgoing edge ending in each vertex inserted due to
       that transformation */
    for(i = 0; i < arity; ++i) {
        g->v[orig_nv + orig_nde + i] =
            2 * orig_nde + i * orig_nv;
    }
    for(i = 0; i < arity; ++i) {
        g->d[orig_nv + orig_nde + i] = orig_nv;
    }

    /* original vertices, introduced (edge) vertices, and
       transformation vertices are all coloured differently */
    for(i = 0; i < g->nv; ++i) {
        (*lab)[i] = i;
        (*ptn)[i] = 1;
    }
    (*ptn)[orig_nv - 1] = 0;
    (*ptn)[orig_nv + orig_nde - 1] = 0;
    (*ptn)[g->nv - 1] = 0;

    /* read in edges */
    cur_pos = 1;
    int ei;
    for(ei = 0; ei < arity; ++ei) {
        const int m = line[cur_pos] - '0';
        cur_pos++;

        /* probably trailing whitespace */
        if(m <= 0 || m > 9) {
            break;
        }

        strncpy(tmp_str, line + cur_pos, m);
        tmp_str[m] = '\0';
        const int n = atoi(tmp_str);
        cur_pos += m;

        /* read transformation images */
        int vi = 0;
        for(vi = 0; vi < orig_nv; ++vi) {
            /* if enough images are not specified, extend
               with id transformation */
            int dest;
            if(vi < n) {
                /* each image stored in a block of m chars */
                strncpy(tmp_str, line + cur_pos, m);
                tmp_str[m] = '\0';
                dest = atoi(tmp_str) - 1;
                cur_pos += m;

                /* bail out if dest is non-sensical */
                if(dest < 0 || dest >= orig_nv) {
                    free(g->v);
                    free(g->d);
                    free(g->e);
                    free(*lab);
                    free(*ptn);
                    return FALSE;
                }
            }
            else {
                dest = vi;
            }

            /* edge from original vertex to introduced edge vertex */
            g->e[vi * arity + ei] = orig_nv + orig_nv * ei + vi;
            /* edge from introduced vertex to destination */
            g->e[orig_nde + orig_nv * ei + vi] = dest;
            /* edge from transformation vertex to edge vertex */
            g->e[2 * orig_nde + ei * orig_nv + vi] =
                orig_nv + orig_nv * ei + vi;
        }
    }

    return TRUE;
}

/* write the dfa encoded by g to out, in the same format as
   read by load_dfa */
int write_dfa(const sparsegraph* g, const int* lab,
              const int* ptn, char* out) {
    int i, j;
    int orig_nv = -1, orig_nde = -1, arity = -1;

    /* find original graph info */
    for(i = 0; i < g->nv; ++i) {
        if(ptn[i] == 0) {
            if(orig_nv == -1) {
                orig_nv = i + 1;
            }
            else if(orig_nde == -1) {
                orig_nde = i - orig_nv + 1;
            }
            else {
                arity = i - orig_nv - orig_nde + 1;
            }
        }
    }

    /* allocate required data structures */
    int* edge_trans = (int*)malloc(sizeof(int) * g->nv);
    int* out_order = (int*)malloc(sizeof(int) * arity * orig_nv);
    if(edge_trans == NULL || out_order == NULL) {
        if(edge_trans != NULL) {
            free(edge_trans);
        }

        /* release out_order only if it was actually allocated */
        if(out_order != NULL) {
            free(out_order);
        }

        return FALSE;
    }

    /* fill out edge_trans with the transformation id of
       each edge vertex */
    for(i = 0; i < arity; ++i) {
        const int e_off = g->v[orig_nv + orig_nde + i];
        for(j = 0; j < orig_nv; ++j) {
            edge_trans[g->e[e_off + j]] = i;
        }
    }

    /* store vertex images in correct output order */
    for(i = 0; i < orig_nv; ++i) {
        for(j = 0; j < arity; ++j) {
            const int edge_vert = g->e[g->v[i] + j];
            const int dest_vert = g->e[g->v[edge_vert]];
            out_order[edge_trans[edge_vert] * orig_nv + i] =
                dest_vert;
        }
    }

    /* look for duplicate transformations */
    /* could be moved before canonicalisation if performance
       proved inadequate */
    for(i = 0; i < arity; ++i) {
        for(j = i + 1; j < arity; ++j) {
            const int* t1 = out_order + i * orig_nv;
            const int* t2 = out_order + j * orig_nv;
            if(memcmp(t1, t2, orig_nv * sizeof(int)) == 0) {
                free(edge_trans);
                free(out_order);
                return FALSE;
            }
        }
    }

    /* output to string */

    /* find number of characters needed to represent
       maximum image */
    int digits = orig_nv, m = 0;
    while(digits != 0) {
        m++;
        digits /= 10;
    }

    /* write line */
    out[0] = 't';
    int str_pos = 1;
    for(i = 0; i < arity; ++i) {
        str_pos += sprintf(out + str_pos, "%d", m);
        str_pos +=
            sprintf(out + str_pos, "%*d", m, orig_nv);

        for(j = 0; j < orig_nv; ++j) {
            str_pos += sprintf(out + str_pos, "%*d", m,
                               out_order[i * orig_nv + j] + 1);
        }
    }
    out[str_pos] = '\0';

    free(edge_trans);
    free(out_order);
    return TRUE;
}

int main(int argc, char* argv[]) {
    char line[BUFSIZ];
    int i, *lab, *ptn, *orbits;

    if(argc != 2) {
        printf("%s", usage);
        return 1;
    }

    const int n = atoi(argv[1]);

    while(fgets(line, BUFSIZ, stdin) != NULL) {
        sparsegraph g, cg;
        SG_INIT(g);
        SG_INIT(cg);

        DEFAULTOPTIONS_SPARSEDIGRAPH(options);
        statsblk stats;
        options.defaultptn = FALSE;
        options.getcanon = TRUE;
        options.invarproc = adjacencies_sg;

        if(!load_dfa(n, line, &g, &lab, &ptn)) {
            printf("read error\n");
            return 1;
        }

        orbits = (int*)malloc(sizeof(int) * g.nv);
        if(orbits == NULL) {
            printf("error allocating memory\n");
            free(g.v);
            free(g.d);
            free(g.e);
            free(lab);
            free(ptn);
            return 1;
        }

        sparsenauty(&g, lab, ptn, orbits, &options,
                    &stats, &cg);
        sortlists_sg(&cg);

        if(write_dfa(&cg, lab, ptn, line)) {
            printf("%s\n", line);
        }

        free(g.v);
        free(g.d);
        free(g.e);
        free(cg.v);
        free(cg.d);
        free(cg.e);
        free(lab);
        free(ptn);
        free(orbits);
    }

    return 0;
}
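
The graph that load_dfa hands to nauty can be summarised as follows: every transition is subdivided by a fresh edge vertex, and one further vertex per transformation points at all edge vertices created for that transformation. An automaton with n states and k transformations therefore becomes a digraph with n + kn + k vertices and 3kn arcs. The sketch below rebuilds the same vertex numbering with plain adjacency lists instead of nauty's sparsegraph arrays; digraph, encode_dfa and arc_count are hypothetical names, and the code is written in C++ rather than C purely for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using trans = std::vector<int>;        // trans[i] is the zero-based image of state i
using automaton = std::vector<trans>;  // one transformation per input letter

struct digraph {
    std::vector<std::vector<int>> out;  // out[v] lists the heads of arcs leaving v
};

// Hypothetical helper mirroring the vertex numbering used by load_dfa:
//   states:                  0 .. n-1
//   edge vertices:           n + n*ei + vi
//   transformation vertices: n + n*arity + ei
digraph encode_dfa(const automaton& atm) {
    assert(!atm.empty());
    const int n = static_cast<int>(atm[0].size());
    const int arity = static_cast<int>(atm.size());

    digraph g;
    g.out.resize(n + n * arity + arity);
    for (int ei = 0; ei < arity; ++ei) {
        assert(static_cast<int>(atm[ei].size()) == n);
        const int trans_vert = n + n * arity + ei;
        for (int vi = 0; vi < n; ++vi) {
            const int edge_vert = n + n * ei + vi;
            g.out[vi].push_back(edge_vert);          // state to edge vertex
            g.out[edge_vert].push_back(atm[ei][vi]); // edge vertex to destination
            g.out[trans_vert].push_back(edge_vert);  // transformation to edge vertex
        }
    }
    return g;
}

// Total number of arcs in g.
std::size_t arc_count(const digraph& g) {
    std::size_t total = 0;
    for (const std::vector<int>& heads : g.out) total += heads.size();
    return total;
}
```

The three vertex classes occupy contiguous index ranges, which is what allows load_dfa to colour them differently by placing a zero in ptn at the last index of each range.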
C.3
Analysing automata
Here we present the GAP source code for a suite of utilities, designed to assist with the analysis of automata. These tools were used extensively during our
enumeration of small automata. The code listed here builds on top of, and hence
requires, J. D. Mitchell's Semigroups GAP package.

LoadPackage("semigroups");

# Takes a set of integers and converts it to a bitset index.
SetToIndex := function(s)
  return Sum(List(s, e -> 2^e));
end;

# Returns a list wherein the ith entry is the shortest word
# of rank i, or blank if such a word does not exist.
ShortestWordsByRank := function(s, n)
  local pos, queue, visited, gens, words, q, w, c, k;
  gens := GeneratorsOfSemigroup(s);

  # each queue elmt is a pair (state_set, word)
  # where word takes {1..n} to state_set
  queue := [[[1..n], []]];
  pos := 1;

  visited := [];
  visited[SetToIndex([1..n])] := true;

  words := [];

  while Length(queue) - pos >= 0 do
    q := queue[pos][1];
    w := queue[pos][2];
    pos := pos + 1;

    if not IsBound(words[Length(q)]) then
      words[Length(q)] := w;
    fi;

    for k in [1..Length(gens)] do
      c := OnSets(q, gens[k]);

      if not IsBound(visited[SetToIndex(c)]) then
        visited[SetToIndex(c)] := true;
        Append(queue, [[c, Concatenation(w, [k])]]);
      fi;
    od;
  od;

  return words;
end;

# Deprecated name: use ShortestWordsByRank instead
TransformationSemigroupRanks := ShortestWordsByRank;

# Takes an edge-labelled adjacency list representation of
# a complete deterministic automaton and computes the
# transformations that give rise to the DFA. Assumes that
# the edges are all labelled with positive integers in the
# range [1..n] for some positive n.
AdjacencyListToSemigroup := function(adj)
  local gens, v, n, pair;
  n := Size(adj);

  # reconstruct transformations acting on vertices of
  # the graph
  gens := [];
  for v in [1..n] do
    for pair in adj[v][2] do
      # first time we've seen this transition
      if not IsBound(gens[pair[2]]) then
        gens[pair[2]] := [1..n];
      fi;

      gens[pair[2]][v] := pair[1];
    od;
  od;

  return Semigroup(List(gens, Transformation));
end;

# Given a semigroup s (with transformations acting on
# [1..n]), generates the adjacency list representation
# of the graph of the semigroup, with p[1] being the
# image of i under generator p[2], for each p in adj[i][2].
# adj[i][1] contains the label of state i, which can be a
# number or list of numbers. If undir is set to true, the
# returned graph is undirected (with both directions of a
# transformation labelled with the corresponding generator
# number).
SemigroupToAdjacencyList := function(s, n, undir)
  local i, t, gens, adj;

  gens := GeneratorsOfSemigroup(s);
  adj := List([1..n], i -> [i, []]);

  for t in [1..Size(gens)] do
    for i in [1..n] do
      Append(adj[i][2], [[i^gens[t], t]]);

      if undir then
        Append(adj[i^gens[t]][2], [[i, t]]);
      fi;
    od;
  od;

  return adj;
end;

# Takes a set of (1 or 2) states and generates the index
# of that state set in the pair automaton.
StateSetToPairIndex := function(st)
  local i, j;

  i := st[1];
  if Size(st) > 1 then
    j := st[2];
  else
    j := i;
  fi;

  return j * (j - 1) / 2 + i;
end;

# Takes an index in the pair automaton and generates the
# state set in terms of the vertices from the original
# automaton.
PairIndexToStateSet := function(i)
  local j;

  j := 1;
  while i > j do
    i := i - j;
    j := j + 1;
  od;

  return Set([i, j]);
end;

# Gives the adjacency list corresponding to the pair
# automaton of the semigroup s (with n states). Each
# node is labelled with the set of states which it
# represents. States of the pair automaton are ordered
# as follows: [1], [1, 2], [2], [1, 3], [2, 3], [3],
# [1, 4] ...
SemigroupToPairAdjacencyList := function(s, n)
  local t, pn, gens, v, dest, StToIndex, adj;

  gens := GeneratorsOfSemigroup(s);
  pn := n*(n+1)/2;
  adj := List([1..pn], i -> [PairIndexToStateSet(i), []]);

  # fill in adjacency list
  for t in [1..Size(gens)] do
    for v in [1..pn] do
      dest := OnSets(PairIndexToStateSet(v), gens[t]);
      Append(adj[v][2], [[StateSetToPairIndex(dest), t]]);
    od;
  od;

  return adj;
end;

# Takes a set of states and returns the index of
# the corresponding state-set in the power automaton.
StateSetToPowerIndex := function(st)
  return Sum(List(st, s -> 2^(s-1)));
end;

# Takes an index of a power automaton state and
# returns the corresponding state set in the original
# automaton.
PowerIndexToStateSet := function(i)
  local st, j;

  st := [];
  j := 1;

  while i <> 0 do
    if i mod 2 <> 0 then
      st[Size(st) + 1] := j;
    fi;

    i := Int(i / 2);
    j := j + 1;
  od;

  return st;
end;

# Gives the adjacency list corresponding to the power
# automaton of the semigroup s (with n states). Each
# node is labelled with the set of states which it
# represents. States in the power automaton are ordered
# such that if the ith bit is set in state j of the power
# automaton, then state i of the original automaton is in
# state-set j of the power automaton
SemigroupToPowerAdjacencyList := function(s, n)
  local t, pn, gens, states, orig, dest,
        StToIndex, IndexToSt, adj;

  gens := GeneratorsOfSemigroup(s);
  pn := 2^n-1;
  adj := List([1..pn], i -> [PowerIndexToStateSet(i), []]);

  # fill in adjacency list
  for t in [1..Size(gens)] do
    for orig in [1..pn] do
      dest := OnSets(PowerIndexToStateSet(orig), gens[t]);

      Append(adj[orig][2], [[StateSetToPowerIndex(dest), t]]);
    od;
  od;

  return adj;
end;

# Functions for generating the pair/power semigroups
# of a semigroup

PairSemigroup := function(s, n)
  return AdjacencyListToSemigroup(
    SemigroupToPairAdjacencyList(s, n));
end;

# Named to avoid collision with Semigroups example
# function
PowerSemigroupOfSemigroup := function(s, n)
  return AdjacencyListToSemigroup(
    SemigroupToPowerAdjacencyList(s, n));
end;

# Gives a string representing a state set
PrettyStateSet := function(state)
  local str;
  str := String(state);

  # weird hack to detect if state is a number
  if [] > state then
    return str;
  else
    # lop off opening/closing brackets
    return str{[3..Size(str)-2]};
  fi;
end;

# Writes the given adjacency list graph adj to the given
# file f. If loops is set to false, transitions from a
# node to itself will not be rendered. The graph is written
# in graphviz format.
RenderAdjacencyList := function(f, adj, loops)
  local alpha, gens, orig, pair, dest, symb;

  # can only handle 26 different edge labels
  # (which must be numeric)
  alpha := List([97..122], i -> [CharInt(i)]);

  AppendTo(f, "digraph g {\n");
  AppendTo(f, "node [shape=circle,fixedsize=true,width=.3,");
  AppendTo(f, "fontsize=10,fontname=helvetica];\n");
  AppendTo(f, "edge [arrowhead=vee,arrowsize=.5,fontsize=10");
  AppendTo(f, ",fontname=helvetica];\n");

  # write state info
  for orig in [1..Size(adj)] do
    AppendTo(f, "n");
    AppendTo(f, orig);
    AppendTo(f, " [label=\"");
    AppendTo(f, PrettyStateSet(adj[orig][1]));
    AppendTo(f, "\"];\n");
  od;

  # write transition info
  for orig in [1..Size(adj)] do
    for pair in adj[orig][2] do
      dest := pair[1];
      symb := pair[2];

      if loops or orig <> dest then
        AppendTo(f, "n");
        AppendTo(f, orig);
        AppendTo(f, " -> n");
        AppendTo(f, dest);
        AppendTo(f, " [label=\"");
        AppendTo(f, alpha[symb]);
        AppendTo(f, "\"];\n");
      fi;
    od;
  od;

  AppendTo(f, "}\n");
end;

# Specialisations of the above rendering function to
# semigroups

RenderAutomaton := function(f, s, n, loops)
  RenderAdjacencyList(f,
    SemigroupToAdjacencyList(s, n, false), loops);
end;

RenderPairAutomaton := function(f, s, n, loops)
  RenderAdjacencyList(f,
    SemigroupToPairAdjacencyList(s, n), loops);
end;

RenderPowerAutomaton := function(f, s, n, loops)
  RenderAdjacencyList(f,
    SemigroupToPowerAdjacencyList(s, n), loops);
end;

# Given an adjacency list representation of a labelled
# graph (as given above), returns the number of connected
# components in the graph. The returned value is only
# meaningful if the graph is undirected.
NrConnectedComponents := function(adj)
  local n, pair, v, vis, cur, queue, pos, comp;

  n := Size(adj);

  # has node i been visited?
  vis := List([1..n], i -> false);
  # number of connected components seen so far
  comp := 0;

  # explore each connected component,
  for v in [1..n] do
    # skipping vertex if it belongs to an
    # already-explored component
    if vis[v] then
      continue;
    fi;

    vis[v] := true;

    # use a list as a queue
    queue := [v];
    pos := 1;

    while Length(queue) - pos >= 0 do
      cur := queue[pos];
      pos := pos + 1;

      for pair in adj[cur][2] do
        if not vis[pair[1]] then
          vis[pair[1]] := true;
          queue[Length(queue) + 1] := pair[1];
        fi;
      od;
    od;

    comp := comp + 1;
  od;

  return comp;
end;

# Given a semigroup s (with generating transformations
# acting on [1..n]), returns true if the (undirected)
# graph of s has exactly one connected component
IsConnectedSemigroup := function(s, n)
  return NrConnectedComponents(
    SemigroupToAdjacencyList(s, n, true)) = 1;
end;

# Returns a list wherein the ith element is the set of
# states in the ith strongly connected component of adj.
# Taken shamefully from
# http://en.wikipedia.org/wiki/
# Tarjan's_strongly_connected_components_algorithm
AdjacencyListSCC := function(adj)
  local n, index, stack, indices, lowlink, onstack, v, w,
        StrongConnect, pair, scc, curcomp;

  n := Size(adj);
  indices := List([1..n], i -> 0);
  lowlink := List([1..n], i -> 0);
  onstack := List([1..n], i -> false);

  index := 1;
  stack := [];

  scc := [];

  StrongConnect := function(v)
    # w and pair are declared per call, so that a recursive
    # call cannot overwrite the values held by its caller
    local w, pair;

    # set the depth index for v to the smallest
    # unused index
    indices[v] := index;
    lowlink[v] := index;
    index := index + 1;

    stack[Size(stack) + 1] := v;
    onstack[v] := true;

    # consider successors of v
    for pair in adj[v][2] do
      w := pair[1];

      # successor w has not yet been visited
      # so recurse on it
      if indices[w] = 0 then
        StrongConnect(w);
        lowlink[v] := Minimum(lowlink[v], lowlink[w]);
      # successor w is in stack S and hence in the
      # current scc
      elif onstack[w] then
        lowlink[v] := Minimum(lowlink[v], lowlink[w]);
      fi;
    od;

    # if v is a root node, pop the stack and generate scc
    if lowlink[v] = indices[v] then
      curcomp := [];

      repeat
        w := stack[Size(stack)];
        Unbind(stack[Size(stack)]);
        onstack[w] := false;

        Append(curcomp, [w]);
      until v = w;

      Append(scc, [curcomp]);
    fi;
  end;

  for v in [1..n] do
    if indices[v] = 0 then
      StrongConnect(v);
    fi;
  od;

  return scc;
end;

# Returns true if given semigroup (with transformations
# acting on [1..n]) is strongly connected.
IsStronglyConnectedSemigroup := function(s, n)
  return Size(AdjacencyListSCC(
    SemigroupToAdjacencyList(s, n, false))) = 1;
end;

# Returns the nucleus (i.e. the union of strongly-connected
# subautomata) of the given graph.
AdjacencyListNucleus := function(adj)
  local scc, curcomp, v, pair, good, nucleus;

  scc := List(AdjacencyListSCC(adj), Set);
  nucleus := [];

  # A strongly-connected subautomaton is a scc where no
  # edge leaves the scc
  for curcomp in scc do
    good := true;

    for v in curcomp do
      for pair in adj[v][2] do
        if not pair[1] in curcomp then
          good := false;
        fi;
      od;
    od;

    if good then
      Append(nucleus, curcomp);
    fi;
  od;

  return nucleus;
end;

# Returns the transposed version of the graph. That is,
# the graph where there is an edge from nodes i to j iff
# there is an edge from nodes j to i in the original graph.
TransposedAdjacencyList := function(adj)
  local tadj, n, orig, pair, v;

  n := Size(adj);
  tadj := List([1..n], i -> [adj[i][1], []]);

  for v in [1..n] do
    for pair in adj[v][2] do
      Append(tadj[pair[1]][2], [[v, pair[2]]]);
    od;
  od;

  return tadj;
end;

# Generates, for each node, a list of edge labels
# representing a word that takes that node to the given
# destination node set. The optional third argument is
# the list of transformations corresponding to each edge
# transition. These can be provided to prevent redundant
# recomputation, or for efficiency's sake.
ShortestTransitions := function(arg)
  local adj, dest, trans,
        tadj, words, queue, pos, v, pair;

  adj := arg[1];
  dest := arg[2];

  if IsBound(arg[3]) then
    trans := arg[3];
  else
    trans := GeneratorsOfSemigroup(
      AdjacencyListToSemigroup(adj));
  fi;

  tadj := TransposedAdjacencyList(adj);

  # words taking each node to the nucleus - also serves
  # as visited table. The first element in an entry is the
  # list of input symbols that make up the word, the second
  # is the resulting transformation
  words := [];

  for v in dest do
    words[v] := [[], Transformation([])];
  od;

  # bfs through transposed graph, constructing words as we go
  queue := List(dest);
  pos := 1;

  while Size(queue) - pos >= 0 do
    v := queue[pos];
    pos := pos + 1;

    for pair in tadj[v][2] do
      if not IsBound(words[pair[1]]) then
        words[pair[1]] :=
          [Concatenation([pair[2]], words[v][1]),
           trans[pair[2]] * words[v][2]];
        queue[Size(queue) + 1] := pair[1];
      fi;
    od;
  od;

  return words;
end;

# Generates, for each node, a list of edge labels
# representing a word that takes that node to the
# nucleus.
NucleusTransitions := function(adj)
  return ShortestTransitions(adj,
    AdjacencyListNucleus(adj));
end;

# Returns a nucleur word for an automaton.

NuclearWord := function(adj)
local words, trans, nucword, nuctrans, v, n, pair;
548
549
n := Size(adj);
550
551
552
553
554
# establish how edge transitions affect automaton

words := NucleusTransitions(adj);
nucword := [];
nuctrans := Transformation([1..n]);
127
128
555
556
557
558
559
for v in [1..n] do
Append(nucword, words[v^nuctrans][1]);
nuctrans := nuctrans * words[v^nuctrans][2];
od;
560
561
562
return nucword;
end;

# Returns the quotient of the graph adj by the list of
# vertices in subatma. The quotient of a graph G by a
# subautomaton A is the graph where all states are the
# same as G except for the states in A, which are replaced
# with a single new state. Transitions into any state in A
# are modified to transition into the new state.
QuotientAdjacencyList := function(adj, subatma)
  local sortatma, n, qn, qadj, v, qv, vmap, pair;

  n := Size(adj);
  qn := n - Size(subatma) + 1;
  qadj := [];

  sortatma := Set(subatma);

  # construct vmap taking v in adj to vmap[v] in qadj
  qv := 1;
  vmap := [];
  for v in [1..n] do
    if v in sortatma then
      vmap[v] := qn;
    else
      vmap[v] := qv;
      qv := qv + 1;
    fi;
  od;

  # new state has same label as first quotiented state
  qadj[qn] := [adj[sortatma[1]][1], []];

  for v in [1..n] do
    if vmap[v] <> qn then
      qadj[vmap[v]] := [adj[v][1], []];
    fi;

    # copy each outgoing edge, renumbering its target
    for pair in adj[v][2] do
      Append(qadj[vmap[v]][2],
        [[vmap[pair[1]], pair[2]]]);
    od;
  od;

  return qadj;
end;
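
# Illustrative usage (a sketch, not part of the original listing).
# It assumes the adjacency-list format implied by the code above:
# adj[v] = [label, [[target, symbol], ...]]. The four-state,
# one-symbol graph and the name quotAdj are hypothetical.
quotAdj := [ [ [1], [ [2, 1] ] ],
             [ [2], [ [3, 1] ] ],
             [ [3], [ [2, 1] ] ],
             [ [4], [ [1, 1] ] ] ];;
# Collapsing states 2 and 3 leaves three states: old states 1 and 4
# become 1 and 2, and the merged state becomes 3. Both edges inside
# the collapsed set are kept as loops on the new state.
QuotientAdjacencyList(quotAdj, [2, 3]);
# expected value:
# [ [ [ 1 ], [ [ 3, 1 ] ] ],
#   [ [ 4 ], [ [ 1, 1 ] ] ],
#   [ [ 2 ], [ [ 3, 1 ], [ 3, 1 ] ] ] ]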

# Returns a list wherein the ith entry is true if state
# i can be reached from a node in the starting set.
ReachabilityAdjacencyList := function(adj, start)
  local n, vis, queue, pos, v, pair;

  n := Size(adj);

  vis := List([1..n], i -> false);
  vis{start} := List([1..Size(start)], i -> true);

  queue := List(start);
  pos := 1;

  # bfs from the starting set, marking states as visited
  while Size(queue) - pos >= 0 do
    v := queue[pos];
    pos := pos + 1;

    for pair in adj[v][2] do
      if not vis[pair[1]] then
        vis[pair[1]] := true;
        queue[Size(queue) + 1] := pair[1];
      fi;
    od;
  od;

  return vis;
end;
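
# Illustrative usage (a sketch, not part of the original listing).
# It assumes the adjacency-list format implied by the code above:
# adj[v] = [label, [[target, symbol], ...]]. The three-state,
# one-symbol graph and the name reachAdj are hypothetical.
reachAdj := [ [ [1], [ [2, 1] ] ],
              [ [2], [ [2, 1] ] ],
              [ [3], [ [1, 1] ] ] ];;
# State 1 leads to state 2, which loops on itself; nothing leads
# to state 3, so it stays unvisited.
ReachabilityAdjacencyList(reachAdj, [1]);
# expected value: [ true, true, false ]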

# Determines (in O(n^2) time) whether the given semigroup
# is synchronising or not.
PolyIsSynchronisingSemigroup := function(s, n)
  local padj, tadj, onestate, vis;

  padj := SemigroupToPairAdjacencyList(s, n);
  onestate := List([1..n], i -> StateSetToPairIndex([i]));
  tadj := TransposedAdjacencyList(padj);
  vis := ReachabilityAdjacencyList(tadj, onestate);

  # synchronising iff every pair state can reach a singleton
  return ForAll(vis, i -> i);
end;

# Identifies the states of the "original automaton + pair
# states from which a singleton state cannot be reached"
# subautomaton of the pair automaton
TerminalWordSubautomaton := function(adj)
  local n, onestate, tadj, vis;

  n := Size(adj);
  onestate := Filtered([1..n], i -> Size(adj[i][1]) = 1);

  # perform a bfs on the transposed graph, starting from
  # original automaton states, to find which pair states
  # we cannot reach
  tadj := TransposedAdjacencyList(adj);
  vis := ReachabilityAdjacencyList(tadj, onestate);

  return Concatenation(onestate,
    Filtered([1..n], i -> not vis[i]));
end;

# Generates a terminal word of a semigroup by quotienting
# the pair automaton by a specific subautomaton and
# constructing a nuclear word for the resulting automaton.
# Runs in O(n^4) time.
RystsovTerminalWordOfSemigroup := function(s, n)
  local padj, subatma, qadj;

  padj := SemigroupToPairAdjacencyList(s, n);
  subatma := TerminalWordSubautomaton(padj);
  qadj := QuotientAdjacencyList(padj, subatma);

  return NuclearWord(qadj);
end;

# Generates a terminal word of the subautomaton by
# calculating the shortest word unifying each pair
# of states (if one exists), then combining them into
# one. Runs in O(n^3) time.
EppsteinTerminalWordOfSemigroup := function(s, n)
  local trans, padj, words, onestate, curstate,
        allpairs, pair, index, stop, termword;

  padj := SemigroupToPairAdjacencyList(s, n);

  # get words mapping each pair of states to a single
  # state
  onestate := List([1..n],
    i -> StateSetToPairIndex([i]));

  # use original automaton transitions for efficiency
  # (O(n) vs. O(n^2)) and so we don't have to keep
  # translating between pair and regular indices
  trans := GeneratorsOfSemigroup(s);
  words := ShortestTransitions(padj, onestate,
    trans);

  # combine unifying words into terminal word
  curstate := [1..n];
  termword := [];

  repeat
    stop := true;
    allpairs := Combinations(curstate, 2);

    # unify the first possible pair of states in
    # curstate
    for pair in allpairs do
      index := StateSetToPairIndex(pair);
      if IsBound(words[index]) then
        Append(termword, words[index][1]);
        curstate := OnSets(curstate,
          words[index][2]);
        stop := false;
        break;
      fi;
    od;
  until stop;

  return termword;
end;

# The Eppstein algo is more efficient than the Rystsov algo
TerminalWordOfSemigroup := EppsteinTerminalWordOfSemigroup;

# Returns the rank of a semigroup by calculating a
# terminal word and checking its rank.
# Runs in O(n^3) time.
RankOfSemigroup := function(s, n)
  local gens, word, trans;

  # generators are indexed by the symbols of the word
  gens := GeneratorsOfSemigroup(s);
  word := TerminalWordOfSemigroup(s, n);
  trans := EvaluateWord(gens, word);
  return RankOfTransformation(trans, n);
end;
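
# Illustrative check (a sketch, not part of the original listing) of
# the "evaluate a word, then take its rank" idea, using only GAP
# built-ins. The generators are an assumed encoding of the Černý
# automaton C4 on states 1..4: a cycles the states and b sends
# state 4 to state 1. The names a, b and w are hypothetical.
a := Transformation([2, 3, 4, 1]);;
b := Transformation([1, 2, 3, 1]);;
# the word b a^3 b a^3 b has length 9 = (4 - 1)^2; GAP applies
# the leftmost factor of a product first
w := b * a^3 * b * a^3 * b;;
RankOfTransformation(w, 4);  # 1, so the word is synchronising
OnSets([1 .. 4], w);         # [ 1 ]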

# Returns the shortest terminal word of the given
# semigroup. Runs in exponential time.
ShortestTerminalWord := function(s, n)
  local words, rank;

  words := ShortestWordsByRank(s, n);

  # the smallest rank with a recorded word is the rank of s
  rank := First([1..n], i -> IsBound(words[i]));

  return words[rank];
end;

Index

Černý
    Černý, 3, 5
    Černý automaton, 3, 5
    Černý conjecture, 3
    Černý function, 3
    edge cases to Černý's conjecture, 41
algorithms
    automata generation, 36, 97, 106
    canonicalisation, 37, 107
    short reset word, 7, 26
    short terminal word, 7, 24, 26, 28
    shortest terminal word, 12, 25
    synchronisation check, 13
    utilities, 115
automata
    circular automata, 6, 46
    locally strongly-transitive automata, 7
    one-cluster automata, 6
    regular automata, 7
    strongly-transitive automata, 7
automorphism, 30, 37
block structure, 46
canonicalisation, 31, 37
conjugate automata, 34
connected automata, 43
conucleus, 18, 19
counting automata, 8, 40
deficiency-1 transformation, 46
deterministic finite-state automaton, 1
disconnected automata, 43
edge cases to Černý's conjecture, 8
enumeration, 30, 40
GAP, 7, 11, 13, 29, 39, 106
gap, 9, 42
generating automata, 8
indegree, 37
island, 9
isomorphism, 30, 37
lexicographical order, 33, 35, 36
nauty, 37, 39
nuclear state, 18
nuclear word, 18, 21, 25
nucleus, 17–20, 22
number of automata, 8, 40, 42
orbit, 106
pair automaton, 13, 24, 26
permutation, 31, 106
Pin
    edge cases to Pin's conjecture, 40, 42, 43, 46, 48
    Pin's conjecture, 3, 9, 10
    Pin's cubic bound, 6, 15
power automaton, 11
pruning, 34
quotient automaton, 24
quotient pair automaton, 25
rank, 3, 22
rank-2, 42, 44, 46, 48
rank-3, 46
reachability, 11
reset threshold, 2, 41
restriction of an automaton, 31
semigroup, 3
sink state, 11
slowly-converging, 9, 40, 43–46, 48
slowly-synchronising automata, 8
stabiliser, 106
strong isomorphism, 30
strongly-connected automata, 24, 43, 46, 48
strongly-connectedness, 17, 18, 23
subautomaton, 17, 24
synchronising, 2
synchronising state, 2
synchronising word, 2
terminal threshold, 3, 43, 44, 46
terminal word, 3, 25, 26
the deficiency conjecture, 9
the gap conjecture, 9, 42
transformation, 3
transition alphabet, 1
transition function, 1
transposition, 20
unifying word, 13, 26
union, 31
word, 1
word concatenation, 21

