Abstract
We develop systematic string techniques to study brane world effective actions for models with magnetized (or equivalently intersecting) D-branes. In particular, we derive the dependence on all NSNS moduli
of the kinetic terms of the chiral matter in generic non-supersymmetric brane configurations with non-commuting open string fluxes. Near an N = 1 supersymmetric point the effective action is consistent with a
Fayet–Iliopoulos supersymmetry breaking and the normalization of the scalar kinetic terms is nothing else
than the Kähler metric. We also discuss, from a stringy perspective, D- and F-term breaking mechanisms,
and how, in this generic set-up, the Kähler metric enters the physical Yukawa couplings.
© 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.nuclphysb.2006.02.044
properties in the compact space. This can happen when constant magnetic fields are switched
on along the D-brane world-volume [1], or when the D-branes intersect with some non-trivial
angles [2] (actually, these two situations can usually be connected by means of some T-dualities,
see for instance [3]). By exploiting these basic features, a new class of string models has been
studied in recent years, starting from [4–6], providing various interesting phenomenological
applications. Recent reviews on this subject, often named intersecting brane worlds (IBW) [7],
are Refs. [8–12]; the detailed derivation of some results can also be found in the PhD theses
[13–16]. One of the nice features of this class of string models is that they are calculable. This
means that, by using known string techniques, it is possible to compute explicitly the Standard
Model-like effective action. Moreover, all the parameters appearing in such a low-energy action
are functions of the microscopic data specifying the D-brane configuration and the geometry of
the compact space. The explicit derivation of the effective action is certainly possible whenever
the string vacuum under consideration is described, from the world-sheet point of view, by some
tractable conformal field theory (CFT). Even if this is a rather particular set of points in the whole
moduli space of the D-brane/string compactifications, it contains already some very interesting
situations, like those involving orbifolds or orientifolds and, as we already said, also the case of
D-branes with constant magnetic fields. Thanks to the simplicity of the underlying string theory,
it has been possible to study various features of the IBW models which go beyond the analysis
of the spectrum and of its quantum numbers. For instance, several authors studied how the Higgs
mechanism [17] and the Yukawa couplings [18–21] are realized in intersecting brane models (or
in the T-dual case of magnetized D-branes [22]); some of their phenomenological implications
are discussed in [23,24]; threshold corrections [25,26] have been computed; proton decay can be
studied quantitatively [27,28]; it has also been shown that the problem of moduli stabilization
can be partly addressed in the framework of solvable string models, by using D-brane world-volume fluxes [29]. The issue of complete moduli stabilization has been thoroughly studied, see
for instance Refs. [26,30–41]; however, generically these constructions go beyond the class of
solvable models we consider in this paper. Various recent papers discuss phenomenological
features of open string models, where the techniques analyzed in this paper might be useful, see
for instance Refs. [42–48].
In this paper we describe in some generality the string theory techniques necessary to compute the effective actions for this class of string models, where the Standard Model fields live
on intersecting or magnetized D-branes. The technique we use is conceptually simple and well-known: one can reconstruct the effective action by requiring that it reproduces the low energy
limit of the string amplitudes. Thus this is a two-step procedure: first it is necessary to compute a
string amplitude contributing to the particular term of the effective action one is interested in; then
one can extract the low-energy amplitude by sending the string length to zero with all four-dimensional momenta and masses kept fixed. We focus on the dynamics of the fields coming
from the open strings and try to determine the dependence of the relevant pieces of the four-dimensional effective action on the closed string moduli, whose dynamics is kept frozen (i.e. we
work in a limit where gravity is non-dynamical on the brane). This technique has been explicitly
applied in Heterotic string theory by Dixon, Louis and Kaplunovsky [49] and more recently in
the context of IBW in Ref. [21]. Here we follow the same approach, with the goal to generalize it
in various directions. First we show that this technique is not limited to supersymmetric models.
On the contrary, it is most effective in situations where supersymmetry is spontaneously broken,
because in these cases we can use the presence of mass terms to fix unambiguously the overall
normalization of the string amplitudes, which actually plays an important role in the form of the
resulting effective action. Then we show that, by using the language of magnetized D-branes, it
is possible to treat in a simple fashion the case of six-dimensional compactifications that are not
factorized in products of two-dimensional tori $T^2$ (the model discussed in Ref. [29] is in fact
already of this type even if the compact space is $T^2 \times T^2 \times T^2$, since the magnetic fields on the
brane world-volume do not respect the factorization of the geometry). In this more generic situation, contrary to the completely factorized case, the magnetic fields living on different D-branes
do not need to commute. The presence of non-commuting or oblique fluxes is an important feature in order to achieve the stabilization of the off-diagonal moduli in $T^6$ (see Refs. [26,33] for
recent developments in this direction). Here we will take also a non-trivial metric and B field, and
show that computations remain manageable even if the compact space does not have a factorized
structure at all. In a full-fledged model some of the NSNS moduli are absent due to the presence
of orientifolds. However, it is known that these moduli need not be trivial, but can be frozen
to some non-zero (discrete) values [50]. So they will affect the form of the effective action, and
need to be taken into account in our computations.
As an explicit example of this approach to the derivation of the effective action, we focus here
on the kinetic term for the scalar fields living at the D-brane intersections. This term is particularly interesting for two reasons. In models where we have N = 1 supersymmetry (possibly
spontaneously broken), this term contains the Kähler metric. This function, together with the superpotential and the normalization of the kinetic terms for the gauge fields, specifies completely
any N = 1 gauge theory action [51]. However, in contrast to the other two building blocks, the
Kähler metric enters in a non-holomorphic piece of the action and so has no protection against
string (or quantum) corrections. There is also a stringy reason that makes the Kähler metric interesting. The open strings stretched between two different D-branes, like those living at the
D-brane intersections, behave like the twisted sectors of the (Heterotic) orbifold models. This
means that the terms of the effective actions involving this kind of fields cannot be derived by
simple dimensional compactification from the flat ten-dimensional string theory or from the Born–Infeld
action. Therefore, the computation of scattering amplitudes represents basically the only
possible way to reconstruct these terms of the effective action. Our analysis shows that the full
(NSNS) moduli dependence of the Kähler metric is encoded in a disk amplitude with two open
strings and one closed string inserted.
We also consider the scalar fields associated to open strings that start and end on the same
D-brane (corresponding to the string untwisted sector) and compute their metric. In this case the
low-energy dynamics can be readily derived also from the Born–Infeld action, upon compactification. Then we can check that the full moduli dependence of the metric for the untwisted fields
is correctly extracted from a three point function involving two scalars and a generic closed string
modulus, showing the validity of this diagrammatic approach.
1.1. Organization of the paper
In Section 2 we review the basics of the open string quantization and this will serve also to set
up our notations. As in the usual case, the open strings stretched between magnetized or tilted
D-branes are more easily analyzed by using the doubling trick, that is by rewriting the bosonic
and fermionic open string coordinates $x(\tau,\sigma)$ and $\psi(\tau,\sigma)$ in terms of holomorphic CFTs. The
properties of these holomorphic fields depend on the angles or magnetic fluxes of the D-branes and
on the moduli of the compact space. In particular, in Section 3, we derive the relation between
the twists $\theta_i$ of the holomorphic fields and the closed string moduli of the NSNS sector. We also
write the vertex operators related to these moduli and, in doing so, we clarify some details about
the off-shell continuation of string amplitudes. In fact this off-shell continuation is necessary,
if one wants to derive the full effective action and not just the S-matrix elements. In Section 4
we briefly review how to derive the open string spectrum for the IBW models and how to write
the vertex operators for open string states. We also provide a careful analysis of the field theory
limit in the non-supersymmetric case and give the relation between the string twist parameters
$\theta_i$ and the surviving field theory mass terms. Then, in Section 5, which contains the main results of this paper, we compute the dependence of the Kähler metric on the NSNS moduli. We
follow the procedure used in Ref. [21]: we compute a disk amplitude with two open strings, representing the fields present in the kinetic terms we are interested in, and a closed string related
to a NSNS modulus. Clearly this amplitude is related to the variation of the quadratic part of
the effective action when one of the closed string moduli is modified and the others are kept
fixed. Since the string computation is exact in all NSNS parameters, the above result translates
into a differential equation for the Kähler metric. So we can fix its dependence on the NSNS
v.e.v.s exactly to all orders in $\alpha'$. As anticipated, we consider a compactification on a generic
non-factorized six-dimensional torus, which is equivalent to resumming all possible insertions of soft
gravitons in the compact space. Thus our result truly depends, through the $\theta_i$'s, on all NSNS
moduli, without any constraint coming from particular hypotheses that are usually assumed,
like the requirement of switching off the moduli breaking the $T^2 \times T^2 \times T^2$ factorized structure of the compact space or the supersymmetric constraints on the $\theta_i$'s [21]. Our results hence
generalize (and partially correct, as we shall show) previous results in the literature. We also
discuss, from a string theory perspective, how supersymmetry breaking is implemented in these
models. We show via a string computation that the masses the twisted scalars have for generic
$\theta_i$ originate from a Fayet–Iliopoulos term (a D-term supersymmetry breaking). On the contrary,
F-term breaking, which might be present for a generic choice of the open string fluxes, does not
affect the value of these tree-level masses. Clearly both mechanisms break supersymmetry in
the bulk, too. Previous works on the issue of D- and F-term breaking in IBW are Refs. [52–55].
Finally, we show that, in the case of factorized fluxes, our results can be easily translated with
three T-dualities in the type IIA configuration where the magnetized D9-branes are described
as intersecting D6-branes with generic angles. In Section 6, we discuss Yukawa couplings, focusing on the quantum (world-sheet) contribution. This part can be perturbatively expanded in
$\alpha'$ and usually does not enter in the superpotential, which receives only non-perturbative contributions via world-sheet instantons. This non-renormalization property [56–58] was proven in
the context of Heterotic models and it is certainly interesting to check whether it holds also in
the IBW models. In the factorized case this non-renormalization property has been checked in
[21], where the authors showed that the quantum part of the Yukawa couplings can be expressed
solely in terms of the Kähler metric. Following our approach, we can prove that the same property holds in a non-factorized case with commuting fluxes. In a generic case, the explicit check
of the non-renormalization of the superpotential is difficult, since it requires computing a correlator among non-Abelian twists. Of course, it is possible to reverse the logic and assume that the
non-renormalization theorem is valid in a general setup also in open string models. In this case
our results provide strong constraints on the form of the three-point correlator for non-Abelian
twists which must have a surprisingly simple form. Appendix A contains the derivation of the
formula providing, in a generic situation, the dependence of the open string twists on the closed
string moduli. This enters crucially in getting the results presented in Section 5. In Appendix B
we apply exactly the same technique described in Section 5 to the matter fields arising from the
open strings that start and end on the same D-brane. In this way we are able to derive the full
dependence on the NSNS moduli of the Born–Infeld action and of the open string metric from
a disk amplitude with two open strings and one closed string. This represents a nice test of our
diagrammatic approach to the computation of the low-energy action.
1.2. Outlook
The main motivation for this work is to provide the techniques to generalize, in the context
of brane world models, previous results in the literature to potentially more realistic models.
Once all the ingredients to compute effective actions for such a generic situation are available,
it becomes possible to address many questions in phenomenologically interesting models that
have been recently constructed. In this respect, string theory techniques prove to be an efficient
tool to compute low energy effective actions whenever this cannot be done otherwise. There are,
however, a number of issues we have not addressed in this work and which we leave to future
investigations. The most technically difficult but interesting thing to do would be to compute directly the Yukawa three-point function in the generic case, i.e. for non-Abelian twists. Moreover,
in our string computations, we neglect the contribution coming from world-sheet instantons; it is
of course very important to include them systematically in our approach. We have not discussed
the dependence on the RR moduli, but these should be included in a complete low energy effective description. Similarly, we have not considered open string moduli (i.e. Wilson lines), whose
stabilization, in IBW models, has not been addressed in much detail, so far. Finally, the Kähler
potential is a D-term and hence is not protected by non-renormalization theorems; it would then be
very interesting to compute higher loop corrections in the string coupling (recent results in this
direction can be found in Refs. [59–61]).
2. Open strings in closed string background
In this section we review the quantization of open strings moving in a 2d-dimensional Euclidean space with a constant metric G and a constant NSNS anti-symmetric tensor B. This case is
relevant in discussing systems of magnetized D-branes or, after T-dualities, systems of intersecting D-branes.
2.1. Bosonic sector
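We begin our analysis by considering the bosonic sector described by the string coordinates
$x^M$ ($M = 1, \ldots, 2d$) whose action (in a Euclidean world-sheet) is
$$ S_{\rm bos} = -\frac{1}{4\pi\alpha'}\int d^2\xi\,\Big[\partial^\alpha x^M \partial_\alpha x^N\, G_{MN} + i\,\epsilon^{\alpha\beta}\,\partial_\alpha x^M \partial_\beta x^N\, B_{MN}\Big] - i\sum_\sigma q_\sigma \int_{C_\sigma} dx^M A^\sigma_M , \tag{2.1} $$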
where the index $\sigma$ on C, A and q takes the values $\sigma = 0$ or $\sigma = \pi$ and labels the string endpoints; $q_\sigma$ is the charge with respect to a background gauge field $A^\sigma$ along the boundary $C_\sigma$. Our
conventions are such that $q_\pi = -q_0 = 1$ and $\epsilon^{\tau\sigma} = 1$; the string coordinates $x^M$ and the gauge
fields $A^\sigma$ are dimensionless, while the background metric G and the B field have dimensions of
(length)$^2$. In the following we will consider only the case in which G and B are constant, and
the gauge fields $A^\sigma$ are linear with constant field strengths $F_\sigma$. Then, it is easy to realize that the
field equations $\partial^\alpha\partial_\alpha x^M = 0$ must be supplemented by the following boundary conditions
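$$ \Big[G_{MN}\,\partial_\sigma x^N + i\,(\mathcal{F}_\sigma)_{MN}\,\partial_\tau x^N\Big]_{\sigma=0,\pi} = 0, \tag{2.2} $$
where
$$ \mathcal{F}_\sigma = B + 2\pi\alpha' F_\sigma . \tag{2.3} $$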
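In terms of the complex variable $z = e^{\tau + i\sigma}$ and of the reflection matrices
$$ R_\sigma = (G - \mathcal{F}_\sigma)^{-1}(G + \mathcal{F}_\sigma), \tag{2.4} $$
the boundary conditions (2.2) become
$$ \bar\partial x^M\big|_{\sigma=0,\pi} = (R_\sigma)^M{}_N\,\partial x^N\big|_{\sigma=0,\pi}. \tag{2.5} $$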
A convenient way to solve these equations is to define, in the complex z-plane, multi-valued
chiral fields $X^M(z)$ such that
$$ X^M(e^{2\pi i} z) = (R_\pi^{-1} R_0)^M{}_N\, X^N(z) \equiv R^M{}_N\, X^N(z), \quad \text{where } R \equiv R_\pi^{-1} R_0 . \tag{2.6} $$
Then, putting the branch cut in the z-plane just below the negative real axis, a solution to the
boundary conditions (2.5) is
$$ x^M(z,\bar z) = q^M + \frac{1}{2}\Big[X^M(z) + (R_0)^M{}_N\, X^N(\bar z)\Big], \tag{2.7} $$
where z is restricted to the upper half-complex plane, and $q^M$ are constant zero-modes.
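Let us observe that the reflection matrix $R_\sigma$ defined in (2.4) leaves the metric G invariant:
$$ {}^tR_\sigma\, G\, R_\sigma = G, \tag{2.8} $$
and so does, as a consequence, the monodromy matrix R. Then, introducing the vielbein $E^A{}_M$
such that $G_{MN} = E^A{}_M E^B{}_N \delta_{AB}$, we see that in the new basis $R^A{}_B$ is simply an SO(2d) matrix,
so that it is always possible to find an orthonormal frame and a unitary transformation to put the
monodromy matrix in a diagonal form, namely
$$ \mathcal{E} R\, \mathcal{E}^{-1} \equiv \mathcal{R} = {\rm diag}\big(e^{2\pi i\theta_1}, \ldots, e^{2\pi i\theta_d}, e^{-2\pi i\theta_1}, \ldots, e^{-2\pi i\theta_d}\big) \tag{2.9} $$
for $0 \le \theta_i < 1$. Let us point out that in the resulting complex basis $\mathcal{Z} = (Z^i, \bar Z^i)$ given by $\mathcal{Z} = \mathcal{E}X$, the metric is
$$ \mathcal{G} = {}^t\mathcal{E}^{-1}\, G\, \mathcal{E}^{-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{2.10} $$
and the monodromy properties (2.6) become
$$ Z^i(e^{2\pi i} z) = e^{2\pi i\theta_i} Z^i(z) \quad\text{and}\quad \bar Z^i(e^{2\pi i} z) = e^{-2\pi i\theta_i} \bar Z^i(z). \tag{2.11} $$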
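The chain of definitions (2.4), (2.6), (2.8) and (2.9) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names, the random background and the choice $\alpha' = 1$ are illustrative assumptions. It builds the reflection matrices for a generic (non-factorized, non-commuting) background, verifies that the monodromy matrix preserves G, and reads off the twists from the phases of its eigenvalues.

```python
import numpy as np

def reflection_matrix(G, B, F, alpha_prime=1.0):
    """R_sigma = (G - calF)^(-1) (G + calF), with calF = B + 2*pi*alpha'*F, as in (2.3)-(2.4)."""
    calF = B + 2.0 * np.pi * alpha_prime * F
    return np.linalg.solve(G - calF, G + calF)

def monodromy_and_twists(G, B, F0, Fpi, alpha_prime=1.0):
    """Monodromy R = R_pi^(-1) R_0 of (2.6) and the phases theta in [0, 1) of its eigenvalues."""
    R0 = reflection_matrix(G, B, F0, alpha_prime)
    Rpi = reflection_matrix(G, B, Fpi, alpha_prime)
    R = np.linalg.solve(Rpi, R0)
    eigenvalues = np.linalg.eigvals(R)
    theta = np.sort(np.mod(np.angle(eigenvalues) / (2.0 * np.pi), 1.0))
    return R, eigenvalues, theta

def random_antisymmetric(rng, n):
    """Random real antisymmetric n x n matrix (illustrative data only)."""
    A = rng.normal(size=(n, n))
    return A - A.T

rng = np.random.default_rng(0)
n = 4                                   # n = 2d with d = 2
A = rng.normal(size=(n, n))
G = A @ A.T + n * np.eye(n)             # symmetric, positive-definite metric
B = random_antisymmetric(rng, n)
F0 = random_antisymmetric(rng, n)       # flux at sigma = 0
Fpi = random_antisymmetric(rng, n)      # flux at sigma = pi, not commuting with F0

R, eigenvalues, theta = monodromy_and_twists(G, B, F0, Fpi)
print("R^t G R = G :", np.allclose(R.T @ G @ R, G))
print("|eigenvalues| :", np.abs(eigenvalues))
print("twists (sorted) :", theta)
```

Since the eigenvalues come in complex-conjugate pairs, the sorted phases pair up as $\theta_i$ and $1-\theta_i$, which is the structure displayed in (2.9).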
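The fields $Z^i$ and $\bar Z^i$ then have the mode expansions$^2$
$$ \partial Z^i(z) = -i\sqrt{2\alpha'}\left(\sum_{n=1}^\infty \bar a^i_{n-\theta_i}\, z^{-n+\theta_i-1} + \sum_{n=0}^\infty a^{\dagger i}_{n+\theta_i}\, z^{n+\theta_i-1}\right), \tag{2.12a} $$
$$ \partial \bar Z^i(z) = -i\sqrt{2\alpha'}\left(\sum_{n=0}^\infty a^i_{n+\theta_i}\, z^{-n-\theta_i-1} + \sum_{n=1}^\infty \bar a^{\dagger i}_{n-\theta_i}\, z^{n-\theta_i-1}\right). \tag{2.12b} $$
$^2$ We have included appropriate prefactors to recover the standard expansions for $\theta_i = 0$ of dimensionful string fields
$Z^i$. This means that the matrix $\mathcal{E}$ of the change of basis has dimensions of (length).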
We remark that the shifts $\theta_i$ are related to the eigenvalues of the monodromy matrix R, and not
to the two individual reflection matrices $R_0$ and $R_\pi$, which, in general, do not commute with
each other. If one or more $\theta_i$'s are zero, particular care must be paid due to the appearance of
extra zero-modes in the corresponding chiral bosons. In the following, however, we will consider
the generic case in which all shifts are non-vanishing. Canonical quantization implies that also
the zero-modes $q^M$ in (2.7) are operators that do not commute among themselves [62]. This fixes the
degeneracy of the open string vacuum, but we will not need this information in what follows.
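The modes appearing in (2.12) obey the following commutation relations
$$ \big[\bar a^i_{n-\theta_i}, \bar a^{\dagger j}_{m-\theta_j}\big] = (n-\theta_i)\,\delta^{ij}\delta_{n,m} \quad \forall\, n, m \ge 1, \tag{2.13a} $$
$$ \big[a^i_{n+\theta_i}, a^{\dagger j}_{m+\theta_j}\big] = (n+\theta_i)\,\delta^{ij}\delta_{n,m} \quad \forall\, n, m \ge 0. \tag{2.13b} $$
In particular the oscillators $\bar a^i_{n-\theta_i}$ and $a^i_{n+\theta_i}$ are annihilation operators, whereas $\bar a^{\dagger i}_{n-\theta_i}$ and $a^{\dagger i}_{n+\theta_i}$
are the corresponding creation operators with respect to the twisted vacuum $|\Theta\rangle \equiv |\{\theta_i\}\rangle$, i.e. for
any i
$$ \bar a^i_{n-\theta_i}|\Theta\rangle = 0 \quad \forall\, n \ge 1 \qquad\text{and}\qquad a^i_{n+\theta_i}|\Theta\rangle = 0 \quad \forall\, n \ge 0. \tag{2.14} $$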
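The contribution of the bosons $Z^i$ to the Virasoro generators can be easily derived from the
action (2.1), and in particular one finds that
$$ L_0^{(Z)} = \sum_{i=1}^d\left[\sum_{n=1}^\infty \bar a^{\dagger i}_{n-\theta_i}\bar a^i_{n-\theta_i} + \sum_{n=0}^\infty a^{\dagger i}_{n+\theta_i} a^i_{n+\theta_i} + \frac{1}{2}\theta_i(1-\theta_i)\right] = N^{(Z)} + c^{(Z)}, \tag{2.15} $$
where in the last step we have distinguished the operator $N^{(Z)}$ which measures the number of $a$
and $\bar a$ oscillators from the c-number $c^{(Z)}$ due to the normal ordering with respect to the twisted
vacuum introduced above. From this expression, we can see that $|\Theta\rangle$ is related to the SL(2, R)
invariant vacuum $|0\rangle$ through the action of d twist fields [63] $\sigma_{\theta_i}(z)$ of conformal dimensions
$$ h_{\sigma_i} = \frac{1}{2}\theta_i(1-\theta_i) \tag{2.16} $$
as follows
$$ |\Theta\rangle = \lim_{z\to 0}\prod_{i=1}^d \sigma_{\theta_i}(z)\,|0\rangle. \tag{2.17} $$
On the other hand, the conjugate vacuum $\langle\Theta|$ is obtained by acting at infinity with the conjugate
twist fields $\bar\sigma_{\theta_i}(z)$ of conformal dimensions $h_{\bar\sigma_i} = h_{\sigma_i}$, namely
$$ \langle\Theta| = \lim_{z\to\infty}\langle 0|\prod_{i=1}^d \bar\sigma_{\theta_i}(z)\, z^{2h_{\sigma_i}}. \tag{2.18} $$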
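Normalizing the vacuum states in such a way that $\langle\Theta|\Theta\rangle = 1$, from (2.17) and (2.18) it immediately follows that
$$ \bar\sigma_{\theta_i}(z)\,\sigma_{\theta_j}(w) \sim \frac{\delta_{ij}}{(z-w)^{\theta_i(1-\theta_i)}}, \qquad \partial Z^i(z)\,\partial\bar Z^j(w) \sim -\frac{2\alpha'\,\delta^{ij}}{(z-w)^2}, \tag{2.19a} $$
$$ \partial Z^i(z)\,\sigma_{\theta_j}(w) \sim \sqrt{2\alpha'}\,\frac{\delta^{ij}\,\tau_{\theta_j}(w)}{(z-w)^{1-\theta_j}}, \qquad \partial\bar Z^i(z)\,\sigma_{\theta_j}(w) \sim \sqrt{2\alpha'}\,\frac{\delta^{ij}\,\tau'_{\theta_j}(w)}{(z-w)^{\theta_j}}. \tag{2.19b} $$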
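Exploiting the mode expansions (2.12) and the properties of the twisted vacuum, it is straightforward to show that
$$ \langle\Theta|\,\partial Z^i(z)\,\partial\bar Z^j(w)\,|\Theta\rangle = -\frac{2\alpha'\,\delta^{ij}}{(z-w)^2}\left(\frac{w}{z}\right)^{-\theta_i}\left[1 - \theta_i\left(1 - \frac{w}{z}\right)\right]. \tag{2.20} $$
Then, by performing a projective SL(2, R) transformation, we can move the position of the twist
fields to arbitrary positions and obtain
$$ A^i_{\rm bos}(z_1,\ldots,z_4|\theta_i) \equiv \frac{\langle\bar\sigma_{\theta_i}(z_1)\,\partial Z^i(z_2)\,\partial\bar Z^i(z_3)\,\sigma_{\theta_i}(z_4)\rangle}{\langle\bar\sigma_{\theta_i}(z_1)\,\sigma_{\theta_i}(z_4)\rangle\,\langle\partial Z^i(z_2)\,\partial\bar Z^i(z_3)\rangle} = \omega^{-\theta_i}\big[1 - \theta_i(1-\omega)\big], \tag{2.21} $$
where
$$ \omega = \frac{(z_1-z_2)(z_3-z_4)}{(z_1-z_3)(z_2-z_4)}. \tag{2.22} $$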
(2.23)
which can be proved with an explicit calculation along the same lines outlined above.
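As a cross-check of the twisted two-point function, the oscillator sum that follows from the mode expansions (2.12) and the commutators (2.13) can be compared with its closed form. The sketch below is an illustration under stated assumptions: the overall factor $-2\alpha'\delta^{ij}$ is stripped off, $|w| < |z|$ is assumed so that the sum converges, and the function names and sample values are hypothetical.

```python
import numpy as np

def mode_sum(z, w, theta, n_max=400):
    """Truncated sum over n >= 1 of (n - theta) * z^(-n+theta-1) * w^(n-theta-1).

    Only the barred oscillators contribute between twisted vacua, each with
    weight (n - theta) from the commutator (2.13a).
    """
    n = np.arange(1, n_max + 1)
    return np.sum((n - theta) * z ** (-n + theta - 1.0) * w ** (n - theta - 1.0))

def closed_form(z, w, theta):
    """(w/z)^(-theta) * [1 - theta * (1 - w/z)] / (z - w)^2."""
    omega = w / z
    return omega ** (-theta) * (1.0 - theta * (1.0 - omega)) / (z - w) ** 2

z, w, theta = 2.0, 0.7, 0.3             # sample point with |w| < |z|
print(mode_sum(z, w, theta), closed_form(z, w, theta))
```

Sending the twist fields to $z_1 \to \infty$ and $z_4 \to 0$ reduces the anharmonic ratio to $w/z$, so the bracket is the same function of the cross-ratio that appears after the SL(2, R) transformation; for $\theta = 0$ the closed form reduces to the untwisted $1/(z-w)^2$.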
2.2. Fermionic sector
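Let us now turn to the fermionic sector described by world-sheet spinors $\psi^M$, whose Euclidean world-sheet action is [64]
$$ S_{\rm ferm} = \frac{i}{4\pi\alpha'}\int d^2\xi\; \bar\psi^M \rho^\alpha\partial_\alpha \psi^N\,(G_{MN} + B_{MN}) - \frac{i}{2}\sum_\sigma q_\sigma\int_{C_\sigma} d\tau\; \bar\psi^M\rho^\tau\psi^N\, F^\sigma_{MN}, \tag{2.24} $$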
where $\rho^\alpha$ are the 2-dimensional Dirac matrices. Denoting by $\psi^M_-$ and $\psi^M_+$, respectively, the upper
and lower components of $\psi^M$, from the above action one finds that the standard field equations
$\rho^\alpha\partial_\alpha\psi^M = 0$ must be supplemented by the following boundary conditions
$$ \psi^M_-\big|_{\sigma=0} = (R_0)^M{}_N\,\psi^N_+\big|_{\sigma=0} \quad\text{and}\quad \psi^M_-\big|_{\sigma=\pi} = -\eta\,(R_\pi)^M{}_N\,\psi^N_+\big|_{\sigma=\pi}, \tag{2.25} $$
where $\eta = 1$ for the NS sector and $\eta = -1$ for the R sector. The solution to these equations can
be conveniently written in terms of multi-valued chiral fermions $\psi^M(z)$ such that
$$ \psi^M(e^{2\pi i}z) = \eta\,(R_\pi^{-1}R_0)^M{}_N\,\psi^N(z). \tag{2.26} $$
Indeed, remembering that the fermionic fields have conformal dimensions 1/2, we have
M
+
(z) z1/2 +M (z) = M (z)
and
(2.27)
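for any z with ${\rm Im}\, z \ge 0$. In the complex basis $\Psi = \mathcal{E}\psi = (\psi^i, \bar\psi^i)$, where the monodromy matrix
$\mathcal{R}$ is diagonal, we can rewrite (2.26) simply as
$$ \psi^i(e^{2\pi i}z) = \eta\, e^{2\pi i\theta_i}\,\psi^i(z) \quad\text{and}\quad \bar\psi^i(e^{2\pi i}z) = \eta\, e^{-2\pi i\theta_i}\,\bar\psi^i(z), \tag{2.28} $$
and expand the fields in modes as
$$ \psi^i(z) = \sqrt{2\alpha'}\sum_{n=0+\nu}^\infty\left(\bar\psi^i_{n-\theta_i}\, z^{-n+\theta_i-\frac12} + \psi^{\dagger i}_{n+\theta_i}\, z^{n+\theta_i-\frac12}\right), \tag{2.29a} $$
$$ \bar\psi^i(z) = \sqrt{2\alpha'}\sum_{n=0+\nu}^\infty\left(\psi^i_{n+\theta_i}\, z^{-n-\theta_i-\frac12} + \bar\psi^{\dagger i}_{n-\theta_i}\, z^{n-\theta_i-\frac12}\right), \tag{2.29b} $$
where $\nu = 0$ in the R sector, $\nu = 1/2$ in the NS sector. The modes in (2.29) obey the following
anti-commutation relations
$$ \big\{\psi^i_{n+\theta_i}, \psi^{\dagger j}_{m+\theta_j}\big\} = \big\{\bar\psi^i_{n-\theta_i}, \bar\psi^{\dagger j}_{m-\theta_j}\big\} = \delta^{ij}\delta_{n,m} \quad \forall\, n, m \ge 0+\nu. \tag{2.30} $$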
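Let us concentrate on the NS sector ($\nu = 1/2$). The oscillators $\psi^i_{n+\theta_i}$ and $\bar\psi^i_{n-\theta_i}$ are annihilation operators, while $\psi^{\dagger i}_{n+\theta_i}$ and $\bar\psi^{\dagger i}_{n-\theta_i}$ are creation operators with respect to the fermionic twisted
NS vacuum $|\Theta\rangle_{NS}$, i.e.
$$ \psi^i_{n+\theta_i}|\Theta\rangle_{NS} = \bar\psi^i_{n-\theta_i}|\Theta\rangle_{NS} = 0 \quad \forall\, n \ge \frac{1}{2}. \tag{2.31} $$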
Notice that this definition of creation/destruction operators is natural only for $0 \le \theta_i < \frac12$. In this
range, in fact, the oscillator $\bar\psi^{\dagger i}_{\frac12-\theta_i}$ is a true creation operator since it increases the energy by the
positive amount $(\frac12 - \theta_i)$. If, instead, $\frac12 < \theta_i < 1$, the same oscillator lowers the energy of
the state it acts on. Thus, in this case the roles of the NS vacuum $|\Theta\rangle_{NS}$ and of $\bar\psi^{\dagger i}_{\frac12-\theta_i}|\Theta\rangle_{NS}$ are
exchanged and the latter state becomes the true vacuum of the theory, since it has lower energy.
This will be relevant in the discussion of the GSO projection, see Section 4.
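The exchange of the two states can be made explicit with a few lines of arithmetic. The sketch below is only an illustration: it assumes the normal-ordering constant $\frac12\theta_i^2$ of (2.32) for the twisted NS vacuum of a single complex fermion, adds the oscillator energy $\frac12 - \theta_i$ of the lowest barred creation mode, and uses hypothetical function names.

```python
import math

def ns_weights(theta):
    """Weights of the twisted NS vacuum and of its lowest barred excitation.

    h_vacuum  = theta^2 / 2                       (normal-ordering constant)
    h_excited = h_vacuum + (1/2 - theta) = (1 - theta)^2 / 2
    """
    h_vacuum = 0.5 * theta ** 2
    h_excited = h_vacuum + (0.5 - theta)
    return h_vacuum, h_excited

def true_ground_state(theta):
    """Name of the lower-energy state for a given twist 0 <= theta < 1."""
    h_vacuum, h_excited = ns_weights(theta)
    return "twisted vacuum" if h_vacuum <= h_excited else "first excited state"

for theta in (0.1, 0.25, 0.5, 0.75, 0.9):
    h_vacuum, h_excited = ns_weights(theta)
    print(f"theta={theta:4.2f}  h_vac={h_vacuum:.5f}  h_exc={h_excited:.5f}  -> {true_ground_state(theta)}")
```

The two weights cross at $\theta_i = \frac12$, and the excited-state weight is $\frac12(1-\theta_i)^2$, i.e. the vacuum weight with $\theta_i \to 1 - \theta_i$.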
From the action (2.24), one can easily derive the fermionic contribution to the Virasoro generators, and in particular one finds
$$ L_0^{(\psi)} = \sum_{i=1}^d\left[\sum_{n=1/2}^\infty\Big((n+\theta_i)\,\psi^{\dagger i}_{n+\theta_i}\psi^i_{n+\theta_i} + (n-\theta_i)\,\bar\psi^{\dagger i}_{n-\theta_i}\bar\psi^i_{n-\theta_i}\Big) + \frac{1}{2}\theta_i^2\right] = N^{(\psi)} + c^{(\psi)}, \tag{2.32} $$
where again we have distinguished between the number operator $N^{(\psi)}$ that counts the fermionic
modes and the c-number $c^{(\psi)}$ arising from the normal ordering with respect to $|\Theta\rangle_{NS}$. In analogy
with our discussion of the bosonic sector, we deduce that this twisted vacuum can be related to the
SL(2, R) invariant vacuum $|0\rangle_{NS}$ through the action of d fermionic twist fields $s_{\theta_i}(z)$ of conformal
dimensions
$$ h_{s_i} = \frac{1}{2}\theta_i^2 \tag{2.33} $$
as follows
$$ |\Theta\rangle_{NS} = \lim_{z\to 0}\prod_{i=1}^d s_{\theta_i}(z)\,|0\rangle_{NS}. \tag{2.34} $$
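To obtain the conjugate vacuum we use instead fermionic twist fields $\bar s_{\theta_i}$ of conformal dimensions $h_{\bar s_i} = h_{s_i}$ acting at infinity, namely
$$ {}_{NS}\langle\Theta| = \lim_{z\to\infty}{}_{NS}\langle 0|\prod_{i=1}^d \bar s_{\theta_i}(z)\, z^{2h_{s_i}}. \tag{2.35} $$
The corresponding OPEs are
$$ \bar s_{\theta_i}(z)\,s_{\theta_j}(w) \sim \frac{\delta_{ij}}{(z-w)^{\theta_i^2}}, \qquad \psi^i(z)\,\bar\psi^j(w) \sim \frac{2\alpha'\,\delta^{ij}}{z-w}, \tag{2.36} $$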
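which is the fermionic counterpart of (2.19a). Let us now consider some fermionic correlation
functions. Using the mode expansions (2.29), it is easy to prove that
$$ {}_{NS}\langle\Theta|\,\psi^i(z)\,\bar\psi^j(w)\,|\Theta\rangle_{NS} = \frac{2\alpha'\,\delta^{ij}}{z-w}\left(\frac{w}{z}\right)^{-\theta_i} \tag{2.37} $$
and then deduce for any i
$$ \frac{\langle\bar s_{\theta_i}(z_1)\,\psi^i(z_2)\,\bar\psi^i(z_3)\,s_{\theta_i}(z_4)\rangle}{\langle\bar s_{\theta_i}(z_1)\,s_{\theta_i}(z_4)\rangle\,\langle\psi^i(z_2)\,\bar\psi^i(z_3)\rangle} = \omega^{-\theta_i}, \tag{2.38} $$
where $\omega$ is the anharmonic ratio (2.22). Other useful correlators are those involving the first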
excited states
|ti NS = 1i
2 +i
|NS
2 i
|NS ,
(2.39)
1
and ht = ht = (i 1)2
i
i
2
ij
ti (z)ti (w)
+1)2
(z w)(i
i
(z)
ij ti (w)
sj (w)
,
(z w)i
2
ti (z)tj (w)
(2.40)
ij
(z w)(i 1)
i
(z)
ij ti (w)
sj (w)
.
(z w)i
2
,
(2.41)
(2.42a)
(2.42b)
The fermionic correlation function can be alternatively derived thanks to the bosonization
equivalence
i = eiHi ,
FH
S F = eii
i
(2.43)
= i .
F
(2.44)
(2.45)
= (i 2 ) .
(2.46)
anti-symmetric tensor, we can partially fix this ambiguity by requiring that, in the complex frame,
it is of type (1, 1). For instance in the heterotic context, it is natural to use the B-field and fix the
vielbein $\mathcal{E}$ so that
$$ \mathcal{B} = {}^t(\mathcal{E}^{-1})\,B\,\mathcal{E}^{-1} = \begin{pmatrix} 0 & i b \\ -i\bar b & 0\end{pmatrix}, \qquad b^\dagger = b. \tag{3.2} $$
This form is invariant under the U(d) subgroup of SO(2d) acting block-diagonally on the complex frame: $Z \to UZ$, $\bar Z \to \bar U\bar Z$, and can be imposed by means of SO(2d)/U(d) transformations.
The residual U(d) invariance can be used, for instance, to put the complex vielbein $\mathcal{E}$ in the form
$$ \mathcal{E} = \frac{1}{\sqrt2}\begin{pmatrix} V & 0 \\ 0 & V\end{pmatrix}\begin{pmatrix} 1 & U \\ 1 & \bar U\end{pmatrix} \tag{3.3} $$
with V real.
In Type I theories, it is more natural to ask the property (3.2) for $\mathcal{F}_\sigma$. It might be impossible
to choose a complex structure so that all $\mathcal{F}_\sigma$'s are (1, 1) forms, in which case supersymmetry is
broken [65]. We will return to this point in Section 5, when we compute the v.e.v. of the auxiliary
fields D and F. Now let us just notice that, if all $\mathcal{F}_\sigma$'s are (1, 1) forms, the reflection matrices
$\mathcal{R}_\sigma$ are block diagonal in the complex basis
$$ \mathcal{R}_\sigma = \begin{pmatrix} r_\sigma & 0 \\ 0 & \bar r_\sigma\end{pmatrix}. \tag{3.4} $$
There are several different ways to organize the compactification moduli which we denote
generically by m. When we are interested in holomorphicity properties, then it is convenient to
use the elements of the matrix U in Eq. (3.3), which are directly related to the complex structure.
In fact, the mixed tensor $i\,dz^i \otimes \frac{\partial}{\partial z^i} - i\,d\bar z^i\otimes\frac{\partial}{\partial\bar z^i}$ depends only on U, when written in the (original)
real basis (in the Type I case, the Kähler structure arises from the complexification of V (3.3) with
the RR 2-form). Alternatively, when one deals with non-holomorphic terms, it is more natural
to associate the moduli m directly to G. To write the vertex operator associated to a generic
modulus, let us recall that the closed string coordinates are given by
$$ x^\mu(z,\bar z) = \frac{1}{2}\big[X_L^\mu(z) + X_R^\mu(\bar z)\big] \quad\text{and}\quad x^M(z,\bar z) = \frac{1}{2}\big[X_L^M(z) + X_R^M(\bar z)\big], \tag{3.5} $$
where the index $\mu$ labels the uncompact directions, and the subscripts L and R denote, respectively, the left and right moving parts. For example one has, for any $z \in \mathbb{C}$,
$$ X_L^M(z) = q_L^M - i\,2\alpha'\,p_L^M\log z + i\sqrt{2\alpha'}\sum_{n=1}^\infty\left(\frac{a^M_{Ln}}{n}\,z^{-n} - \frac{a^{\dagger M}_{Ln}}{n}\,z^{n}\right). \tag{3.6} $$
The fermionic string coordinates admit a similar left/right decomposition. The mode expansion
for $X^\mu$ and $\psi^\mu$ is formally identical to that of $X^M$ and $\psi^M$, the only difference being that the
former are chosen to be dimensionful. The massless closed string excitations of the NSNS
sector that represent fluctuations of the metric G or of the B field along the compact directions$^4$
are described by the usual vertex operators $V_L^M(z)\,V_R^N(\bar z)$, where (in the 0-superghost picture) we
have
$$ V_L^M(z) = \big[\partial X_L^M(z) + i\,(k_L\cdot\psi_L)\,\psi_L^M(z)\big]\,e^{i k_L\cdot X_L(z)}, \tag{3.7a} $$
$$ V_R^N(\bar z) = \big[\bar\partial X_R^N(\bar z) + i\,(k_R\cdot\psi_R)\,\psi_R^N(\bar z)\big]\,e^{i k_R\cdot X_R(\bar z)}. \tag{3.7b} $$
$^4$ As already mentioned, in a complete orientifold compactification the B field is not dynamical; however we will
formally consider it on the same footing as G, in order to derive the dependence of the effective action on the possible
discrete values B can have [50].
In these expressions $k_L$ and $k_R$ denote the left and right momenta of the emitted state while
the symbol $\cdot$ is a shortcut for the scalar product with metric $\eta$. In general, when $k_L \ne k_R$,
a more careful definition of the vertex operator is necessary to ensure the bosonic character of
the operators $V_{L,R}$ [66]
W MN (z, z ) = ei (kL +kR )pL VLM (z)ei (kL +kR )pR VRN (z).
(3.8)
Actually we are not interested in Kaluza–Klein modes and we take $k_L$ and $k_R$ to be aligned
entirely along the uncompact directions. However, we perform a slight off-shell extension of the
closed string vertices, by formally taking, along the uncompact directions, $k_L^2 = k_R^2 = 0$ with
$k_L \ne k_R$. In this way we can have $(k_L + k_R)^2 \ne 0$, without spoiling the conformal properties of
the left (or right) part of the vertex.
The insertion of the operator (3.8) inside a string correlation function induces a variation of
$G_{MN}$ and of $B_{MN}$, associated respectively to the symmetric and anti-symmetric parts in the
indices M and N.$^5$ Thus, the variation due to a change in a generic modulus m is produced by
the following vertex operator
$$ W_m(z,\bar z) = \frac{1}{4\pi\alpha'}\,\frac{\partial (G-B)_{MN}}{\partial m}\,W^{MN}(z,\bar z). \tag{3.9} $$
Since in toroidal compactifications the vertex (3.9) represents a truly marginal deformation for
any value of m, we can schematically write
$$ \int d^2z\,\big\langle W_m(z,\bar z)\cdots\big\rangle = \frac{\partial}{\partial m}\big\langle\cdots\big\rangle, \tag{3.10} $$
where $\cdots$ stands for any sequence of string vertex operators. The partial derivative with respect to
m is taken by keeping fixed at arbitrary values all other moduli, which are indeed described by independent vertices. As is intuitively natural, two different vertices $W_m$ and $W_{m'}$ are independent
if the corresponding states are orthogonal, i.e. $\langle m|m'\rangle = 0$. For instance, the four-dimensional
dilaton $\phi_4$ is clearly independent of the moduli (3.9) specifying the compact space, since it involves only string coordinates along the Minkowski directions. This means that the differential
equations we derive from (3.10) are computed by keeping $\phi_4$ fixed. As shown in [21], the dependence on the four-dimensional dilaton can be derived in the same way by inserting in (3.10)
the appropriate vertex $W_{\phi_4}$.
$^5$ This can be seen by using (2.1) and taking the derivative $\partial/\partial m$ of the Euclidean weight $e^{S}$ present in the path integral;
notice that the normalization of (3.9) depends also on our convention (3.5).
Let us now introduce a stack of D-branes wrapped on $T^{2d}$. On their world-volume we may
introduce a background field F whose components $F_{MN}$ are quantized as
$$ F_{MN} = \frac{1}{2\pi}\,\frac{p_{MN}}{l_M\, l_N}, \tag{3.11} $$
where $p_{MN}$ is the standard Chern class and $l_M$ is the wrapping number of the D-brane around
the cycle $dX^M$. As discussed in Section 2, the open strings connecting two such D-branes are described in terms of twisted bosonic and fermionic fields, whose monodromy matrix $R = R_\pi^{-1} R_0$
is defined in terms of the boundary reflection matrices $R_\sigma$ given in (2.4). These conformal fields
and their correlation functions are described in the orthonormal complex basis introduced in
Eqs. (2.9) and (2.10), so that all relevant information is encoded entirely in the d phases $\theta_i$. Of
course these twists, as well as the complex vielbein $\mathcal{E}$, depend on the $4d^2$ parameters contained in
G and B. In the next sections we will compute mixed amplitudes with insertions of closed string
vertex operators $V_m$ inside correlators of twisted open strings, which, as indicated in (3.10), account for the derivatives with respect to a NSNS modulus m. For the physical interpretation
of the results it will be crucial to know how the twists $\theta_i$ depend on m. In particular it will be
important to know the derivatives of $\theta_i$ with respect to m. As shown in detail in Appendix A,
these are given by
(G B)
i
1
R 1
EG1
2i
=
=
R
[R R0 ]E 1
m
m
2
m
ii
ii
1
1 1 (G + B) 1
1
E R R 0 G
E
.
(3.12)
2
m
ii
It is worth noticing the appearance in this formula of the same expression that plays the role of
the polarization in the vertex operator (3.9). Eq. (3.12) applies to a generic toroidal configuration
with any value of G and B, and to generic (i.e. non-commuting) magnetic fluxes F on the
wrapped D-branes. To make contact with the set-up that is usually considered in the literature,
and as an illustration, we now consider the simple case of D-branes on factorized tori with
diagonal fluxes, which are T-dual to a system of intersecting D-branes at angles.
3.1. Factorized torus with commuting fluxes
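Let us consider a model in which the internal torus $T^6$ is metrically factorized as $T^2_{(1)} \times T^2_{(2)} \times T^2_{(3)}$. Let us also assume that the background NSNS field $B$ and the gauge fields $F$ respect this factorization. On each torus $T^2$ the metric and the $B$-field are parameterized by the Kähler modulus $T = T_1 + iT_2$ and the complex structure modulus $U = U_1 + iU_2$ as
$$G = \frac{T_2}{U_2}\begin{pmatrix} 1 & U_1 \\ U_1 & |U|^2 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & -T_1 \\ T_1 & 0 \end{pmatrix}. \tag{3.13}$$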
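This parameterization with $T$ and $U$ is very convenient to discuss the effects of simple T-duality transformations. Indeed, a T-duality along the $x = x^1$ axis amounts just to the exchange $T \leftrightarrow U$, while a T-duality along $y = x^2$ corresponds to $T \leftrightarrow -1/U$. On each torus the magnetic fluxes are of the form
$$2\pi\alpha' F = \begin{pmatrix} 0 & f \\ -f & 0 \end{pmatrix}, \tag{3.14}$$
where $f$ is real and quantized according to Eq. (3.11). We can use the complex vielbein
$$E = \sqrt{\frac{T_2}{2U_2}}\begin{pmatrix} 1 & U \\ 1 & \bar U \end{pmatrix} \quad\text{and}\quad E^{-1} = \frac{i}{\sqrt{2T_2U_2}}\begin{pmatrix} \bar U & -U \\ -1 & 1 \end{pmatrix} \tag{3.15}$$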
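to put the metric in the form (2.10). In the resulting complex basis $Z = EX$ it is straightforward to compute the reflection matrices $\mathcal R_\sigma = E R_\sigma E^{-1}$ by specializing their definition (2.4) to the present case. One finds
$$\mathcal R_\sigma = \begin{pmatrix} -\dfrac{\bar T - f_\sigma}{T - f_\sigma} & 0 \\ 0 & -\dfrac{T - f_\sigma}{\bar T - f_\sigma} \end{pmatrix}, \tag{3.16}$$
so that the monodromy matrix $\mathcal R = \mathcal R_\pi^{-1}\mathcal R_0$ takes the form $\mathrm{diag}(e^{2\pi i\theta}, e^{-2\pi i\theta})$ with
$$e^{2\pi i\theta} = \frac{T - f_\pi}{\bar T - f_\pi}\,\frac{\bar T - f_0}{T - f_0}. \tag{3.17}$$
These formulas will be useful in later sections to make contact with some existing results in the
literature.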
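As a quick numerical illustration of the factorized case, the following sketch evaluates the twist on a single $T^2$. It assumes that the monodromy phase has the form $e^{2\pi i\theta} = \frac{T - f_\pi}{\bar T - f_\pi}\,\frac{\bar T - f_0}{T - f_0}$ and that $\theta$ is taken in $[0, 1)$; the function name `twist` and the sample values are purely illustrative.

```python
import cmath
import math


def twist(T: complex, f_pi: float, f_0: float) -> float:
    """Twist theta in [0, 1), assuming exp(2*pi*i*theta) is the ratio quoted above.

    T is the Kaehler modulus (Im T > 0); f_pi and f_0 are the fluxes at the
    two string endpoints. This is an illustrative helper, not code from the paper.
    """
    Tbar = T.conjugate()
    # Product of two unit-modulus factors.
    phase = (T - f_pi) / (Tbar - f_pi) * (Tbar - f_0) / (T - f_0)
    theta = (cmath.phase(phase) / (2.0 * math.pi)) % 1.0
    # Rounding can push a tiny negative angle up to exactly 1.0.
    return 0.0 if theta >= 1.0 else theta


if __name__ == "__main__":
    print(twist(1j, 1.0, 0.0))          # square torus, one unit of relative flux
    print(twist(0.3 + 2.0j, 0.5, 0.5))  # equal fluxes: untwisted string
```

For equal fluxes the two factors cancel and the string is untwisted, as expected for open strings ending on the same stack.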
4. Low-energy spectrum on D-branes with fluxes
In this section we recall the main features of the open string low-energy spectrum for systems
of D9-branes with general magnetic fluxes. In a system with two or more stacks of D9-branes,
there are two classes of open strings: those that start and end on the same set of D9-branes, and
those which connect D9-branes with different magnetic fields. The first type of open strings give
rise to untwisted states transforming in the adjoint representation of the gauge group living on
the D9s under consideration. In what follows we will focus on the second type of open strings
related to twisted states transforming in the bi-fundamental representation. In the NS sector,
the complete Hamiltonian for this twisted open string is
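$$H^{NS} = L_0^{(x)} + L_0^{(Z)} + L_0^{(\Psi)} - \frac12, \quad\text{with}\quad L_0^{(x)} = \alpha'\, p^\mu p_\mu + \sum_{n=1}^{\infty} a_n^\dagger\cdot a_n + \sum_{r=1/2}^{\infty} r\,\psi_r^\dagger\cdot\psi_r. \tag{4.1}$$
By using Eqs. (2.15) and (2.32) for $L_0^{(Z)}$ and $L_0^{(\Psi)}$, we can express the mass-shell condition for the NS sector as follows
$$\left( L_0^{(x)} + N^{(Z)} + N^{(\Psi)} - \frac12 + \frac12\sum_{i=1}^{3}\theta_i \right)|\phi_{NS}\rangle = 0. \tag{4.2}$$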
Finally, in order to define the physical spectrum, we should specify the GSO projection. In the
NS sector, the GSO projection on open strings stretched between two D-branes is defined to
remove the vacuum and to select only those states with an odd number of fermionic oscillators
acting on it. The opposite choice would describe an open string stretched between a D-brane
and an anti-D-brane. It follows from the observation made just after Eq. (2.31) that we can now
interpolate continuously between these two situations. In fact, when one of the angles $\theta_i$ is bigger than 1/2, the usual GSO projection with respect to $|\Omega_{NS}\rangle$ selects the vacuum (i.e. $\bar\Psi^{\dagger\,i}_{\frac12-\theta_i}|\Omega_{NS}\rangle$) as well as all states with an even number of fermionic oscillators acting on it.6 Thus we have two possibilities: we can limit the range of the angles to [0, 1/2] and specify in each case whether we take the brane/brane or the brane/anti-brane GSO; otherwise we keep the interval $0 \le \theta_i < 1$, but we stick always to the same GSO. Here we will use this second option. Then the first low-lying states are
$$\begin{aligned}
&\text{vector:} && \psi^{\dagger\,\mu}_{\frac12}\,|k;\Omega_{NS}\rangle, && 2\alpha' M^2 = \sum_{j=1}^{3}\theta_j, && (4.3a)\\
&\text{3 scalars:} && \Psi^{\dagger\,i}_{\frac12+\theta_i}\,|k;\Omega_{NS}\rangle, && 2\alpha' M_i^2 = \sum_{j\neq i}\theta_j + 3\theta_i, && (4.3b)\\
&\text{3 scalars:} && \bar\Psi^{\dagger\,i}_{\frac12-\theta_i}\,|k;\Omega_{NS}\rangle, && 2\alpha' M_i^2 = \sum_{j\neq i}\theta_j - \theta_i, && (4.3c)
\end{aligned}$$
where $|k;\Omega_{NS}\rangle$ is the twisted vacuum with four-dimensional momentum $k_\mu$. With our convention, it is clear that the vector (4.3a) and the three scalars (4.3b) never contribute to the low-energy spectrum except when all $\theta_i$'s are small. In fact they can survive the field theory limit $\alpha' \to 0$ only if all $\theta_i$ go to zero as $\alpha'$ does. On the contrary, some of the scalars (4.3c) may remain in the effective theory also for non-zero twists: for particular values of the $\theta_i$'s they are massless, but in general they are massive. To appreciate better this point, let us write the twists as follows7

6 The case $\theta_i = \frac12$ is special and requires a separate treatment due to the appearance of zero modes.
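$$\theta_i = \theta_i^{(0)} + 2\alpha'\,\epsilon_i, \tag{4.4}$$
where $\theta_i^{(0)}$ and $\epsilon_i$ are quantities which are kept fixed in the limit $\alpha' \to 0$. In other words, $\theta_i^{(0)}$ is the field theory value of the $i$th twist, while $\epsilon_i$, which has dimensions of a (mass)$^2$, is its subleading string correction. Inserting (4.4) in the mass formula (4.2), we find
$$M_i^2 = \frac{1}{2\alpha'}\left( \sum_{j\neq i}\theta_j^{(0)} - \theta_i^{(0)} \right) + \sum_{j\neq i}\epsilon_j - \epsilon_i. \tag{4.5}$$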
Therefore, by suitably choosing the $\theta_i^{(0)}$'s we can cancel the term in brackets and obtain, in the limit $\alpha' \to 0$, a finite mass for some of the states (4.3c).8 Thus, the spectrum is in general non-supersymmetric, but it is known that the presence of a non-trivial mass (4.5) breaks supersymmetry spontaneously. In the field theory limit this breaking appears simply as a Fayet–Iliopoulos term due to the presence of non-trivial v.e.v.'s of the auxiliary field $D$ in the $U(1)$ gauge superfields [68–70]. This observation will be important for our future calculations and we
will give a direct stringy proof of this statement in Section 5.
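The mass formulas above are easy to tabulate. The sketch below assumes the reading $2\alpha' M^2 = \sum_j\theta_j$ for the vector, $\sum_{j\neq i}\theta_j + 3\theta_i$ for the scalars (4.3b) and $\sum_{j\neq i}\theta_j - \theta_i$ for the scalars (4.3c), together with the split $\theta_i = \theta_i^{(0)} + 2\alpha'\epsilon_i$; the function names are hypothetical and only meant to illustrate the scaling argument.

```python
def low_lying_masses(thetas):
    """Return 2*alpha'*M^2 for the vector (4.3a) and the scalars (4.3b), (4.3c)."""
    total = sum(thetas)
    vector = total
    scalars_b = [total + 2.0 * t for t in thetas]  # sum_{j != i} theta_j + 3 theta_i
    scalars_c = [total - 2.0 * t for t in thetas]  # sum_{j != i} theta_j - theta_i
    return vector, scalars_b, scalars_c


def field_theory_mass2(theta0, eps, i, alpha_prime):
    """M_i^2 of the scalar (4.3c) for theta_j = theta0_j + 2*alpha'*eps_j."""
    bracket = sum(theta0) - 2.0 * theta0[i]        # leading, O(1/alpha') piece
    return bracket / (2.0 * alpha_prime) + sum(eps) - 2.0 * eps[i]


if __name__ == "__main__":
    # theta_1 = theta_2 + theta_3: the first scalar of type (4.3c) is massless.
    print(low_lying_masses((0.5, 0.3, 0.2)))
    # With the bracket cancelled, the mass stays finite as alpha' -> 0.
    for a in (1e-2, 1e-4, 1e-6):
        print(a, field_theory_mass2((0.5, 0.25, 0.25), (0.0, 0.1, 0.2), 0, a))
```

When the leading bracket vanishes, the result no longer depends on $\alpha'$ and is fixed entirely by the $\epsilon_i$, which is the mechanism described in the text.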
In view of these considerations, from now on we will focus on the scalars (4.3c), which we denote by $\phi^i$. Recalling our discussion of Section 2 and adopting the notation presented there, we can see that the vertex operator for the emission of $\phi^i$ with momentum $k$ is (in the $(-1)$-superghost picture)
$$V_{\phi^i}(z) = \phi^i(k)\,\prod_{j=1}^{3}\left[ S_{\theta^F_{j(i)}}(z)\,\sigma_{\theta_j}(z) \right] e^{-\varphi(z)}\, e^{ik\cdot X(z)}, \tag{4.6}$$
7 We recall that a behavior like (4.4) is typical in the instanton sector of non-commutative gauge theories realized with open strings in a non-trivial $B$ background. In fact, some of the instanton moduli correspond to twisted open strings for which the subleading corrections $\epsilon_i$ are related to the dimensionful non-commutativity parameter [67]. Moreover, a scaling behavior like (4.4) has been considered also in the field theory analysis of intersecting brane models [22].
8 Using (4.4) in the mass formulas (4.3a) and (4.3b), we may find a finite non-zero mass for the vector (4.3a) and the
(0)
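where $\varphi$ is the chiral boson of the superghost bosonization formulae, and $\sigma_{\theta_j}$ and $S_{\theta^F_{j(i)}}$ are the bosonic and fermionic twist fields. The labels of the latter are
$$\theta^F_{j(i)} = \begin{cases} \theta_j & \text{for } j \neq i, \\ \theta_j - 1 & \text{for } j = i, \end{cases} \tag{4.7}$$
which, according to Eq. (2.43), correspond to take
$$S_{\theta^F_{j(i)}}(z) = \begin{cases} s_{\theta_j}(z) & \text{for } j \neq i, \\ t_{\theta_j}(z) & \text{for } j = i. \end{cases} \tag{4.8}$$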
One can easily check that the vertex (4.6) has conformal dimension 1 if the mass-shell condition
(4.3c) is satisfied.
The complex conjugate scalars $\bar\phi^i$ are associated to twisted open strings with the opposite orientation as compared to those considered so far, and thus their corresponding vertex operators (again in the $(-1)$-superghost picture) are
V i (z) = i (k)
3
S F (z)j (z) e(z) eikX(z) .
j =1
(4.9)
j (i)
Finally, we remark that the polarizations $\phi^i$ and $\bar\phi^i$ of the vertices (4.6) and (4.9) contain the appropriate Chan–Paton factors for the bi-fundamental representations of the gauge group, and have dimensions of (length)$^{-1}$ in units of $2\pi\alpha'$.
Let us now consider the twisted R sector. For generic values of the twists $\theta_i$, only the four fermionic coordinates along the uncompact directions have zero modes, and thus the vacuum will carry a spinor representation of the four-dimensional Lorentz group SO(1, 3). Furthermore, in the R sector the GSO projection selects a definite chirality (say positive) for such a spinor, which therefore can be denoted by $|\alpha, k; \Omega_R\rangle$, with $\alpha$ being a chiral spinor index. The complete Hamiltonian $H^R$ of the R sector is given by the obvious generalization of (4.1) in which the $\psi$'s have integer moding and the twisted fermions are as in (2.29) with $\nu = 0$. As a consequence, there is a cancellation between the bosonic and fermionic c-number terms due to normal ordering, so that $c_R = 0$, and the mass-shell condition for any state $|\phi_R\rangle$ is
$$L_0^R\,|\phi_R\rangle = 0. \tag{4.10}$$
The vertex operator for the massless fermion built on this vacuum is, in the $(-\frac12)$-superghost picture,
$$V_\chi(z) = \chi^\alpha(k)\, S_\alpha(z)\,\prod_{j=1}^{3}\left[ s_{\theta_j-\frac12}(z)\,\sigma_{\theta_j}(z) \right] e^{-\frac12\varphi(z)}\, e^{ik\cdot X(z)}, \tag{4.11}$$
where $S_\alpha$ is the chiral spin-field of SO(1, 3) and the polarization $\chi^\alpha$ has dimensions of (length)$^{-3/2}$ in units of $2\pi\alpha'$. One can easily check that this vertex operator has conformal dimension 1 if $k^2 = 0$.
When one of the twist parameters is zero, one of the internal complex fermions $\Psi^i$ ceases to be twisted and two extra fermionic real zero-modes appear. In this case the vacuum becomes doubly degenerate and one finds two massless fermions in four dimensions. When all twists are vanishing, all internal fermions have zero-modes and, upon compactification, one finds four massless fermions in the resulting four-dimensional theory.
In summary, the low-energy spectrum of open strings stretched between two stacks of D9-branes consists of one chiral massless fermion and a number of scalars that are generically massive (or tachyonic). For specific values of the fluxes and hence of the twists, one or more scalars may become massless and supersymmetric configurations may be realized. This situation can be conveniently represented in terms of a tetrahedron in the twist parameter space [3], as shown in Fig. 1. This tetrahedron represents supersymmetric configurations and separates an inner region, where the scalars are all massive, from an outer region, where the scalars become tachyonic. Faces, edges and vertices of the tetrahedron correspond to N = 1, N = 2 and N = 4 configurations, respectively. Notice that in our conventions, where the twists $\theta_i$ are taken in the range [0, 1), the vertices A, B and C, and the face (ABC) are in fact not part of the moduli space. Of course one could change conventions and choose a different parameterization without changing the physical conclusions. We will briefly return to this point in Section 5.
5. Moduli dependence of the Kähler metric
In this section we compute the dependence on the closed string moduli of the Kähler metric for the chiral matter in the effective action of magnetized D9-branes. Exploiting T-duality, this system can be used also for brane-worlds involving branes at angles with arbitrary open string fluxes. Our analysis generalizes previous results in the literature since we obtain an expression for the Kähler metric that is valid not only for commuting and supersymmetric fluxes on factorized tori, but also for arbitrary non-commuting and non-supersymmetric configurations on generic tori.
Let us then consider the chiral fields (4.3c) arising from $\theta$-twisted open strings connecting two (stacks of) D9-branes with generic non-commuting open string fluxes. The moduli dependence of the Kähler metric can be extracted from a 3-point function on a disk involving two open twisted matter fields $\phi^i$ and $\bar\phi^i$, and one closed string parameter $m$, namely
Am i i =
1
V i Wm V i .
2
(5.1)
The normalization in this amplitude can be obtained by using the results of Appendix A of
Ref. [71],9 but, as we shall see, it can also be fixed independently by using the value of the
tree-level mass given in (4.3c).
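In order to extract the Kähler metric from $\mathcal A_{m\bar\phi^i\phi^i}$, a few steps must be performed. First of all, to write the results in terms of (properly normalized) field theory quantities, the open string vertices should be transformed from the canonically normalized string theory basis to the field theory basis according to
$$V_{\phi^i} \to (K_{\bar\phi^i\phi^i})^{-1/2}\, V_{\phi^i}, \qquad V_{\bar\phi^i} \to (K_{\bar\phi^i\phi^i})^{-1/2}\, V_{\bar\phi^i}, \tag{5.2}$$
where $K_{\bar\phi^i\phi^i}$ is the Kähler metric for the $i$th matter multiplet. Second, a factor of $i$ must be introduced to transform the string scattering amplitude into the corresponding term in the effective action. The string/field theory dictionary then reads
$$\mathcal A_{m\bar\phi^i\phi^i} = i\,(K_{\bar\phi^i\phi^i})^{-1}\,\frac{\partial}{\partial m}\,\mathcal L^{(2)}, \tag{5.3}$$
where $\mathcal L^{(2)}$ is the quadratic part of the Lagrangian of the twisted scalars with the field theory normalization. In our conventions, with a mostly plus metric, this Lagrangian reads
$$\mathcal L^{(2)} = -K_{\bar\phi^i\phi^i}\left( \partial_\mu\bar\phi^i\,\partial^\mu\phi^i + M_i^2\,\bar\phi^i\phi^i \right). \tag{5.4}$$
Thus, from (5.3) and going to momentum space, one finds
$$\begin{aligned}
\mathcal A_{m\bar\phi^i\phi^i} &= i\,(K_{\bar\phi^i\phi^i})^{-1}\,\frac{\partial}{\partial m}\left[ \left( k_1\cdot k_2 - M_i^2 \right) K_{\bar\phi^i\phi^i} \right] \bar\phi^i(k_1)\,\phi^i(k_2) \\
&= i\left[ -\frac{\partial M_i^2}{\partial m} + \left( k_1\cdot k_2 - M_i^2 \right)\frac{\partial \ln K_{\bar\phi^i\phi^i}}{\partial m} \right] \bar\phi^i(k_1)\,\phi^i(k_2),
\end{aligned} \tag{5.5}$$
where in the second line we have explicitly taken into account the dependence of the mass $M_i^2$ on the open string twists and hence on the closed string moduli.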
The correlator (5.1) is a mixed open/closed string amplitude which can be computed after
writing the closed string vertex operator Wm in terms of the propagating (twisted) open string.
This is done by using the boundary conditions on the disk discussed in Section 2, which imply
in particular
VLM (z) = V M (z),
(5.6)
VL (z) = V (z),
VR (z) = V (z)
(5.7)
along the uncompact ones. Thus, the amplitude (5.1), including the cocycle introduced in
Eq. (3.8), can be written as
ei kL kR
(G B) R0
V i V M V N V i .
Am i i =
(5.8)
2
m
8
MN
9 In particular it is sufficient to use Eq. (A.16) of that paper with the caveat that the normalization of the twisted scalar
Am i i =
E
E
A ,
(G B) R0
a
b (i)
m
MN
(5.9)
Aj(i) j k
Aj (i) j k
0
(5.10)
where
Aj (i) =
ei kL kR
8 2
V i V j V j V i ,
(5.11)
and $\bar{\mathcal A}_{j(i)}$ is given by the same expression (5.11) with $V^j$ and $V^{\bar j}$ exchanged.
The string correlator $\mathcal A_{j(i)}$ has to be computed in the orthonormal basis, where one can use the CFT results summarized in Section 2. It is important to notice that although $\mathcal A_{j(i)}$ depends only on the open string twists $\theta_i$, the full amplitude $\mathcal A_{m\bar\phi^i\phi^i}$ contains additional dependencies on the various closed string moduli through the reflection matrix $R_0$ and the inverse vielbein $E^{-1}$. In summary, the computation of the string amplitude $\mathcal A_{m\bar\phi^i\phi^i}$ elegantly separates into two pieces: the correlator $\mathcal A_{j(i)}$ and a prefactor that carries the information on the specific closed string modulus inserted and the boundary conditions. Let us start computing the first piece.
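5.1. The string correlator
Here we derive the four point function $\mathcal A_{j(i)}$ defined in (5.11). For the chiral matter fields $\phi^i$ and $\bar\phi^i$ we take the vertex operators (4.6) and (4.9) for which the mass-shell condition is
$$k_1^2 = k_2^2 = \frac{1}{\alpha'}\left\{ \frac12 - \frac12\sum_{j=1}^{3}\left[ \big(\theta^F_{j(i)}\big)^2 + \theta_j(1-\theta_j) \right] \right\} = -\frac{1}{2\alpha'}\sum_{j=1}^{3}\eta^F_{j(i)}\,\theta_j = -M_i^2, \tag{5.12}$$
where
$$\eta^F_{j(i)} = \begin{cases} -1, & i = j, \\ +1, & i \neq j. \end{cases} \tag{5.13}$$
For the closed string modulus, we use the vertices (3.7) with the identifications (5.6). Thus, the amplitude (5.11) can be written as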
i (k1 ) i (k2 )
dx1 dx2 d 2 z i kL kR
e
W(x1 x2 )1 (z z )2
4
dVCKG
j
j
Abos (x1 , z, z , x2 |j ) 2 kL kR Aferm (x1 , z, z , x2 |S F ) ,
Aj (i) =
j (i)
(5.14)
where Abos and Aferm are the correlators given in Eqs. (2.21) and (2.44), respectively, and W is
defined by
W eik1 X(x1 ) eikL X(z) eikR X(z) eik2 X(x2 )
3
j =1
j (x1 )j (x2 ) S F (x1 )S F (x2 )
j (i)
j (i)
= (x1 x2 )1 (t+Mi ) (1 ) s
2
(5.15)
(x1 z)(z x2 )
,
(x1 z )(z x2 )
|| = 1,
(5.16)
(5.17)
with
kL2 = kR2 = 0,
s + t + u = 2Mi2 .
(5.18)
In the following we keep the open strings on-shell, but we take the closed string off-shell. If
also the closed string were on-shell, we would have u = t and s = 0. In our off-shell extension,
instead, we retain the relation u = t but keep s non-vanishing, i.e. we take s = 2(t + Mi2 ).
Finally, in (5.14) the open string punctures x1 and x2 are integrated on the real axis, while
the closed string variable z is integrated on the upper-half complex plane, modulo the Sl(2, R)
projective invariance which is fixed by the Conformal Killing Group volume dVCKG . Using this
fact, one can show that
dx1 dx2 d 2 z
(x1 x2 )2 (z z )2 = (1 )2 d
dVCKG
(5.19)
F
1 j (1 ) s j j (i) .
(5.20)
Notice that the original integral over z takes into account all possible orderings of the closed
string insertion along the open string boundary. In the -variable this translates into a closed
integral C along the unit circle || = 1 clockwise oriented.11 The integrand in (5.20) has a branch
cut along the positive real axis and thus the contour C must be deformed in order to circumvent
the cut singularity. So we have to perform the integration just below the cut for [0, 1] and
then subtract the contribution from above the cut for e2i [0, 1]. Using the definition of the
11 To see that the unit circle C is clockwise oriented we can consider the definition of in Eq. (5.16) and take the limits
x1 and x2 0, so that z /z. Since z H+ , we easily see that e2i with 0 < , and thus C is
covered clockwise.
Aj (i) =
i i (k1 ) i (k2 ) F ij
j (i) e
sin (j + s/2)
4
( s + 1) (1 j s/2)
,
(1 j + s/2)
(5.21)
(5.23)
(5.25)
Then, after simple manipulations one sees that Eq. (5.10) can be rewritten as
1
A(i) = G 1 R1 1 H(i) .
2
(5.26)
(G B)R0 E 1t A(i)
Am i i = tr t E 1
m
1 t 1
1
1
= tr E
(G B)(R R0 )E H(i) G
2
m
3
(G B)
1
EG1
(R R0 )E 1
h.c. hj (i) .
=
2
m
jj
(5.27)
j =1
At this point the crucial observation is that the term multiplying $h_{j(i)}$ in the above expression can be written as a total derivative with respect to $m$. This fact follows from the non-trivial identity (3.12), whose proof is presented in Appendix A. Using this identity in (5.27), we find
$$\mathcal A_{m\bar\phi^i\phi^i} = 2\pi i\,\sum_{j=1}^{3} h_{j(i)}\,\frac{\partial\theta_j}{\partial m}. \tag{5.28}$$
Then, inserting the relation s = 2k1 k2 2Mi2 in the explicit expression of hj (i) given in (5.24),
we get
1
1
F
2
2hj (i) = j (i)
k1 k2 Mi j i (k1 ) i (k2 )
2 2
F
F /2
j (i)
j (i)
2 d
2E j (1 j )
=
i (k1 ) i (k2 ), (5.29)
+ k 1 k 2 Mi
ln e
2
dj
(j )
where in the last step we have used the definition (5.23) for j . From the analysis of the spectrum
we know the value of the tree-level mass $M_i^2 = \frac{1}{2\alpha'}\sum_j \eta^F_{j(i)}\theta_j$, which allows us to interpret the first term in the equation above as $-\partial M_i^2/\partial\theta_j$. Notice that this is a way to fix unambiguously the overall normalization of the string amplitude and thus also the power of the Kähler metric below.
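At this point we can use Eq. (5.28) to write the amplitude $\mathcal A_{m\bar\phi^i\phi^i}$ in the form of Eq. (5.5) and read the explicit form of $K$:
$$K_{\bar\phi^i\phi^i}(\theta) = e^{2\gamma_E\,\alpha' M_i^2}\,\prod_{j=1}^{3}\left( \frac{\Gamma(1-\theta_j)}{\Gamma(\theta_j)} \right)^{-\eta^F_{j(i)}/2}. \tag{5.30}$$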
Formula (5.30) is the main result of this paper and, as we discuss below, it generalizes previous results in the literature. It displays the full moduli dependence of the Kähler metric of the chiral matter coming from the $\theta$-twisted open strings and holds for an arbitrary brane setup in presence of generic non-commuting fluxes. Notice that the result (5.30) holds independently of whether supersymmetry is preserved or broken, i.e. it is valid even when $M_i \neq 0$. Remarkably, the Kähler metric is always determined by a simple function of the twists $\theta_i$. It is worth stressing that in this derivation it is useful to keep $M_i \neq 0$ in order to fix the overall normalization of the string amplitude, including the sign, and hence to determine in the end the exact power in (5.30).
Notice that Eq. (5.30) is exact in $\alpha'$. The field-theory result is obtained by taking the limit $\alpha' \to 0$ with $M_i^2$ fixed. In this limit the exponent in (5.30) vanishes, so the exponential factor goes to 1, and the Kähler metric entering the effective action is
$$K_{\bar\phi^i\phi^i} = \lim_{\alpha'\to 0} K_{\bar\phi^i\phi^i}(\theta) = \left[ \frac{\Gamma\big(1-\theta_i^{(0)}\big)}{\Gamma\big(\theta_i^{(0)}\big)}\,\prod_{j\neq i}\frac{\Gamma\big(\theta_j^{(0)}\big)}{\Gamma\big(1-\theta_j^{(0)}\big)} \right]^{1/2}, \tag{5.31}$$
where $\theta_j^{(0)} = \lim_{\alpha'\to 0}\theta_j$ as in (4.4), and the signs $\eta^F_{j(i)}$ for the scalar $\phi^i$ have been made explicit.
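To get a feeling for the numbers, the following sketch evaluates the metric for given twists. It assumes the form $K = e^{2\gamma_E\alpha' M_i^2}\prod_j\left[\Gamma(1-\theta_j)/\Gamma(\theta_j)\right]^{-\eta_j/2}$ with $\alpha' M_i^2 = \frac12\sum_j\eta_j\theta_j$ and signs $\eta_j = -1$ for $j = i$, $+1$ otherwise; the sign of the exponent is the convention read off from the field-theory limit above, and all function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def eta(i: int, j: int) -> int:
    """Sign eta_{j(i)}: -1 for j == i, +1 otherwise (assumed convention)."""
    return -1 if i == j else 1


def alpha_m2(thetas, i: int) -> float:
    """alpha' * M_i^2 = (1/2) * sum_j eta_{j(i)} * theta_j."""
    return 0.5 * sum(eta(i, j) * t for j, t in enumerate(thetas))


def kahler_metric(thetas, i: int) -> float:
    """Kaehler metric of the i-th twisted scalar for twists strictly inside (0, 1)."""
    log_k = 2.0 * EULER_GAMMA * alpha_m2(thetas, i)
    for j, t in enumerate(thetas):
        # log of [Gamma(1 - t) / Gamma(t)] raised to the power -eta/2
        log_k += -0.5 * eta(i, j) * (math.lgamma(1.0 - t) - math.lgamma(t))
    return math.exp(log_k)


if __name__ == "__main__":
    susy = (0.5, 0.25, 0.25)      # theta_1 = theta_2 + theta_3, so M_1 = 0
    print(alpha_m2(susy, 0), kahler_metric(susy, 0))
    broken = (0.4, 0.25, 0.25)    # small positive mass for the same scalar
    print(alpha_m2(broken, 0), kahler_metric(broken, 0))
```

At the supersymmetric point chosen here the exponential prefactor is 1 and the metric reduces to a ratio of Gamma functions, matching the field-theory expression.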
Some comments are in order at this point. First we notice that if we start from a non-supersymmetric set of $\theta_i$'s, the only way to decouple one of the twisted scalars from the string scale is to suppose that the $\theta_i^{(0)}$'s satisfy a supersymmetric constraint, as is clear from the mass formula (4.5). This means that one is considering a particular point in Fig. 1 which is at a stringy distance from a given supersymmetric configuration, in such a way that one scalar can survive in the field theory limit $\alpha' \to 0$ with a finite mass. As we will discuss in the next section this breaking can be interpreted at the field theory level as coming from a non-vanishing v.e.v. of a Fayet–Iliopoulos term.
Notice that in our derivation we have chosen a set of conventions for which the N = 4 supersymmetric point included in our $\theta$-space is the vertex O of the tetrahedron in Fig. 1. However, our results hold for any other choice. For instance, we could repeat the above analysis with different conventions and consider a field theory where the starting point is another N = 4 vertex of the tetrahedron in Fig. 1. In this case, the scalar becoming massless on the outer wall (ABC) would now enter the low energy effective spectrum and its Kähler metric would be
$$\prod_{j=1}^{3}\left( \frac{\Gamma\big(1-\theta_j^{(0)}\big)}{\Gamma\big(\theta_j^{(0)}\big)} \right)^{1/2}. \tag{5.32}$$
This is the scalar that is usually considered in the literature. However, we point out that the exponent in Eq. (5.32) has a different sign as compared to previous findings, but it agrees with the result of Ref. [54]. In the Heterotic computations [49] the Kähler metric for the scalar fields is derived from a four point amplitude on the sphere. One may wonder why in models with open strings it is possible to derive this result from a three point function and, conversely, what role a four point function would have in this context. However, at the CFT level, the insertion of a closed string on a disk is equivalent to the insertion of two open vertices. Thus the Koba–Nielsen integrals are those also considered in Ref. [49]. Of course, the spacetime interpretation and
kinematics are those of a three point function. Thus, to get a meaningful result, it is important
to give a prescription to continue the string amplitude off-shell, as we do, at least in the field
theory limit. It would be very interesting to check this off-shell prescription by computing disk
diagrams with the insertion of two moduli vertices and see whether the results are consistent
with Eq. (5.30). This is a challenging computation and some preliminary results were presented
in [21] for the factorized and commuting case. The authors of [21] suggested that the differential
equations derived from the three point function should actually be modified to agree with the
results coming from higher point amplitudes. Here we seem to have no room for modifications
of this type. We present a check of this in Appendix B; there we focus on the untwisted scalars where it is possible to compare the result with the Born–Infeld action and we find complete agreement. So we believe that our off-shell prescription is able to capture the full NSNS moduli dependence of the metric for all scalar fields.
$$2\pi i\,\frac{\partial\theta}{\partial T} = \frac{f_\pi - f_0}{(T - f_\pi)(T - f_0)}. \tag{5.33}$$
Using Eqs. (3.13)–(3.15) it is not difficult to check that the general relation (3.12) correctly reproduces Eq. (5.33) (see Appendix A.1 for further details).
As is well known, a T-duality along the $y$ direction of the torus $T^2$ corresponds to the exchange $T \leftrightarrow -1/U$, and the magnetized branes of the type IIB theory become branes of type IIA intersecting at angles. A careful analysis of the boundary conditions (2.5) reveals that under this T-duality the reflection matrices (3.16) transform into
$$\mathcal R'_\sigma = \begin{pmatrix} \dfrac{1+\bar U f_\sigma}{1+U f_\sigma} & 0 \\ 0 & \dfrac{1+U f_\sigma}{1+\bar U f_\sigma} \end{pmatrix}. \tag{5.34}$$
Clearly $\mathcal R'_0$ and $\mathcal R'_\pi$ commute with each other. These T-dual reflection matrices depend on the complex structure $U$ and the quantized magnetic fluxes $f_\sigma$, but are independent of the Kähler modulus $T$, in contrast to the original matrices $\mathcal R_\sigma$ of Eq. (3.16). The monodromy matrix $\mathcal R' = (\mathcal R'_\pi)^{-1}\mathcal R'_0$ of the T-dual theory is of the form $\mathcal R' = \mathrm{diag}(e^{2\pi i\theta}, e^{-2\pi i\theta})$ with
$$e^{2\pi i\theta} = \frac{1+U f_\pi}{1+\bar U f_\pi}\,\frac{1+\bar U f_0}{1+U f_0}, \tag{5.35}$$
which is the direct T-dual transform of Eq. (3.17). In this case the twist $\theta$ represents the intersecting angle between the two D-branes to which the open string is attached. If we take one of the branes to lie on the $x$ axis, i.e. if we set $f_0 = 0$, then Eq. (5.35) can be simply rewritten as
$$\tan(\pi\theta) = \frac{U_2\, p}{q + U_1\, p}, \tag{5.36}$$
where the quantization condition $f_\pi = p/q$ has been used. This is the usual relation of the angle between two D-branes with the complex structure moduli of the two-dimensional torus $T^2$ in which they intersect, a relation that can be easily understood and derived also in geometrical terms. From Eq. (5.36) it follows that
$$2\pi i\,\frac{\partial\theta}{\partial U} = \frac{p}{q + U p}, \tag{5.37}$$
which again agrees with the general result (3.12). Notice that indeed Eqs. (5.33) and (5.37) are related by the T-duality map $T \to -1/U$. More generally, under a T-duality transformation
$X^M \to T^M{}_N X^N$, it can be shown that the flux matrices transform as [72]
$$R'_\sigma(m')^M{}_N = R_\sigma\big(m(m')\big)^M{}_P\; T^P{}_N, \tag{5.38}$$
where $m$ and $m'$ are the T-dual moduli and the $d \times d$ matrix $T^M{}_N$ satisfies $T_{MN} = T_{NM}$ and $T^2 = 1$. The dependence on this matrix $T$ cancels out in the monodromy matrix $R$ and therefore open string twists in T-dual theories are simply related by replacing $m \to m'$. The case of the T-duality in the $y$ direction of the two-torus $T^2$ discussed above is just an explicit example of this more
general statement.
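The T-duality statements of this subsection can be checked numerically on a single torus. The sketch below assumes the two phase formulas $e^{2\pi i\theta} = \frac{T-f_\pi}{\bar T-f_\pi}\frac{\bar T-f_0}{T-f_0}$ (magnetized side) and $e^{2\pi i\theta} = \frac{1+Uf_\pi}{1+\bar Uf_\pi}\frac{1+\bar Uf_0}{1+Uf_0}$ (intersecting side), and the angle relation $\tan(\pi\theta) = U_2 p/(q+U_1 p)$ for $f_0 = 0$, $f_\pi = p/q$; the function names and sample moduli are hypothetical.

```python
import cmath
import math


def phase_magnetized(T: complex, f_pi: float, f_0: float) -> complex:
    """exp(2*pi*i*theta) on the magnetized (type IIB) side, as assumed above."""
    Tb = T.conjugate()
    return (T - f_pi) / (Tb - f_pi) * (Tb - f_0) / (T - f_0)


def phase_intersecting(U: complex, f_pi: float, f_0: float) -> complex:
    """exp(2*pi*i*theta) on the intersecting (type IIA) side, as assumed above."""
    Ub = U.conjugate()
    return (1 + U * f_pi) / (1 + Ub * f_pi) * (1 + Ub * f_0) / (1 + U * f_0)


def angle_from_wrapping(U: complex, p: int, q: int) -> float:
    """theta from tan(pi*theta) = U_2 p / (q + U_1 p), for a brane pair with f_0 = 0."""
    return math.atan2(U.imag * p, q + U.real * p) / math.pi


if __name__ == "__main__":
    U = 0.3 + 1.2j
    T = -1 / U  # T-duality along y maps T to -1/U
    print(phase_magnetized(T, 0.5, -0.25), phase_intersecting(U, 0.5, -0.25))
    theta = cmath.phase(phase_intersecting(U, 1 / 2, 0.0)) / (2 * math.pi)
    print(theta, angle_from_wrapping(U, 1, 2))
```

The two phases agree once $T$ is replaced by $-1/U$, and for $f_0 = 0$ the twist coincides with the geometric angle of a brane with wrapping numbers $(q, p)$.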
5.4. Supersymmetry breaking by D- and F -terms
In presence of generic fluxes (or angles) supersymmetry may be broken by D- and F -terms.
Here we would like to analyze explicitly from a string theory point of view these mechanisms
starting from the one produced by D-terms.
Let us then compute the v.e.v. of the auxiliary fields $D$ of the gauge vector multiplet for our system of magnetized branes with generic fluxes. Since the chiral matter arises from open strings stretched between two (stacks of) D9-branes, we should consider both the $D$ field of the gauge multiplet for the branes at $\sigma = 0$ and the $D$ field on the branes at $\sigma = \pi$, and then focus on their respective U(1) parts which are the only ones that can get a v.e.v. In particular we should compute from string diagrams the difference
$$\langle D_\pi\rangle - \langle D_0\rangle \tag{5.39}$$
and show that, as expected, it corresponds to a mass for the twisted chiral matter. Just like we did for the Kähler metric, we will actually compute the derivative of the above quantity with respect to a closed string modulus $m$, rather than the v.e.v.'s themselves. More precisely we consider a disk amplitude between a vertex operator $V_D$ for the auxiliary field $D$ and a closed string vertex operator $W_m$ for the modulus $m$, and read from it the v.e.v. of the $D$ fields according to
$$\mathcal A_{mD} \equiv \langle W_m V_D\rangle_\pi - \langle W_m V_D\rangle_0 = i\,\frac{\partial}{\partial m}\Big( \langle D_\pi\rangle - \langle D_0\rangle \Big), \tag{5.40}$$
where the subscripts 0 and $\pi$ on the string correlators indicate that the appropriate boundary conditions for the branes at $\sigma = 0$ and $\sigma = \pi$ should be enforced. Auxiliary fields are realized in string theory in terms of non-BRST invariant operators in the 0-superghost picture (see, for example, Ref. [73] for details and Ref. [74] for some recent applications in mixed open/closed string amplitudes) given by
1
VD (z) = (i)MN : M (z) N (z):,
(5.41)
2
where $\omega^{(i)}$ is the imaginary part of the Kähler form of the internal torus. The label $(i)$ specifies along which N = 1 supersymmetry, out of the starting N = 4, the auxiliary field under consideration is aligned.12 In the complex basis (2.9) we have
$${}^tE^{-1}\,\omega^{(i)}\,E^{-1} = \begin{pmatrix} 0 & \eta^F_{j(i)}\,\delta_{j\bar k} \\ -\eta^F_{j(i)}\,\delta_{\bar j k} & 0 \end{pmatrix}, \tag{5.42}$$
where $\eta^F_{j(i)}$ are the signs introduced in Eq. (5.13). In writing $\omega^{(i)}$ in this form we use the fact that in the orthonormal basis the metric of the torus is of the form (2.10) and rearrange rows and columns in order to ensure that the twists $\theta_i$ are all positive.
12 Here we use the same notation already adopted to distinguish the three faces of the tetrahedron of Fig. 1, which
correspond to three different N = 1 supersymmetries. The fourth supersymmetry associated to the outer face of the
tetrahedron is out of the present discussion, but, as we have already seen, it could be incorporated without any problem
by simply changing our conventions.
Since the vertex $V_D$ is in the 0-superghost picture, we need to take the closed string vertex $W_m$ in the $(-1, -1)$ picture, where it is given by Eq. (3.9) with
1
1
VLM (z) = eL (z) LM (z) and VRM (z) = eR (z) RM (z).
(5.43)
2
2
As already mentioned, the amplitude $\mathcal A_{mD}$ receives contributions from insertions in the disks at $\sigma = 0$ and $\sigma = \pi$ with boundary conditions parameterized by the reflection matrix $R_\sigma$, so that the identifications of the left and right moving parts of the closed string with the propagating (untwisted) open string are
VLM (z) = V M (z)
1
AmD =
(G B)(R R0 )
(i)P Q
16
m
MN
dx d 2 z (z) M
e
(5.44)
(5.45)
Notice that the full dependence on world-sheet positions cancels in the integrand in agreement with the Sl(2, R) invariance. That only the $U(1)$ part of $V_D$ contributes to $\mathcal A_{mD}$ is clear since this amplitude is proportional to the trace of the Chan–Paton factor carried by the $D$-vertex. Finally, using (5.42) to rewrite the polarization $\omega^{(i)}$ in the orthonormal basis and exploiting the non-trivial identity (3.12), one finds
$$\mathcal A_{mD} = \frac{1}{8\pi\alpha'}\sum_{j=1}^{3}\left\{ \left[ E\,G^{-1}\,\frac{\partial(G-B)}{\partial m}\,(R_\pi - R_0)\,E^{-1} \right]_{jj} - \text{h.c.} \right\}\eta^F_{j(i)} = \frac{i}{2\alpha'}\sum_{j=1}^{3}\eta^F_{j(i)}\,\frac{\partial\theta_j}{\partial m} = i\,\frac{\partial M_i^2}{\partial m}. \tag{5.46}$$
Comparing with Eq. (5.40), we see that indeed $M_i^2 = \langle D_\pi\rangle - \langle D_0\rangle$, thus proving that the twisted scalars $\phi^i$ become massive when the $D$ fields acquire a v.e.v. This calculation shows also in a very explicit way that the subleading terms $\epsilon_i$ in the open string twists, defined in (4.4), which are responsible for the scalar mass, have the interpretation of Fayet–Iliopoulos parameters in the effective low-energy theory.
In a similar way one can compute also the F-terms, i.e. the v.e.v. of the auxiliary fields $F^i$ and $\bar F^i$ of the adjoint chiral multiplets of the untwisted sector. Their corresponding vertex operators are of the form
1 i
1 i
VF i (z) = MN
(5.47)
: M (z) N (z): and VF i (z) = MN
: M (z) N (z):,
2
2
0
0
j k 0
t 1 i 1
t 1 i 1
.
and
E E
E
0 i j jk
0
0
(5.48)
Notice that unlike the polarization (5.42) of the $D$ vertex operator, these polarizations have non-vanishing entries in the diagonal blocks when they are expressed in the orthonormal frame.
The v.e.v. of $F^i$ and $\bar F^i$ can be obtained from the string amplitudes $\mathcal A_{mF^i}$ and $\mathcal A_{m\bar F^i}$, which have the same form as (5.45) but with $\omega^{(i)}$ replaced by the polarizations appearing in (5.47). Due to the structure of these polarizations, we immediately see that in the interesting case where fluxes are of type (1, 1) and $\partial_m(G-B)$ is block off-diagonal, there is no F-term since the trace in (5.45) vanishes. In this way we see that fluxes of type (1, 1) do not give rise to any F-term and hence do not break supersymmetry. On the contrary, fluxes of the type (2, 0), which correspond to a $\partial_m(G-B)$ with non-vanishing entries also in the diagonal blocks, do produce a non-vanishing F-term amplitude and hence induce a non-vanishing v.e.v. for these auxiliary fields. From the structure of the vertex operators (5.47) it is easy to realize that in the field theory limit $F^i$ and $\bar F^i$ do not couple to the chiral fields of the twisted sector, and thus the presence of a non-vanishing F-term does not break supersymmetry there. However, supersymmetry will be broken by these F-terms in other sectors, for example in the bulk.
6. Relation with the Yukawa couplings
The Yukawa couplings among the fields arising from intersecting or magnetized brane worlds admit a nice stringy description [18], which represents actually one of the strong points of such constructions. We focus on the couplings between chiral fermions and scalars all arising from twisted strings. In this stringy description, the couplings appearing in the Yukawa terms of the effective action have the form
$$Y_{IJK} = \mathcal A_{IJK}\,\mathcal W_{IJK}, \tag{6.1}$$
where $I, J, K$ are generic indices denoting the various scalars and fermions, which we will specify better in the cases we are actually concerned with. Here $\mathcal W_{IJK}$ denotes the classical contributions, which in the case of intersecting branes [18–21] are given by world-sheet instantons bordered by the intersecting branes.13 In this context, since the replica families of fields arise from multiple intersections of the branes, the different areas of the minimal world-sheet connecting different intersections provide naturally an exponential hierarchy of couplings, see Fig. 2(a).
The (string) quantum contributions AI J K to the couplings are instead provided by the correlator of the twisted emission vertices of a scalar V I (from the NS sector of a twisted string) and
of two fermions VJ and VK , from the R sector of two other twisted strings, see Fig. 2(b)
dx1 dx2 dx3
AI J K = V I VJ VK
(6.2)
V I (x1 )VJ (x2 )VK (x3 ) .
dVCKG
A three-point CFT correlator is determined from conformal invariance up to a constant; since the
vertices have conformal dimension 1, this structure constant coincides directly with the string
amplitude
AI J K
V I (x1 )VJ (x2 )VK (x3 ) = 3
.
a,b=1 (xa xb )
(6.3)
13 The world-sheet instanton contributions have obviously a counterpart in the magnetized brane models, which is
discussed for instance in [22].
(a)
29
(b)
Fig. 2. (a) Classical contributions WI J K to the Yukawa couplings in intersecting D-brane models. (b) Quantum contributions AI J K are given by string correlators.
Indeed, the dependence on the world-sheet positions $x_a$ cancels in the amplitude (6.2) against the Jacobian that arises from gauge-fixing the SL(2, R) invariance.
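The cancellation can be made explicit. The sketch below assumes the standard form of the SL(2, R) conformal Killing volume for three fixed boundary punctures; this explicit form is not quoted in the text, and its overall sign depends on the ordering convention for the $x_a$.

```latex
% Assumed (standard) gauge-fixed measure for three boundary punctures
\begin{align}
dV_{\rm CKG} &= \frac{dx_1\, dx_2\, dx_3}{(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)} , \\
% Insert the three-point correlator (6.3) into the amplitude (6.2)
\int \frac{dx_1\, dx_2\, dx_3}{dV_{\rm CKG}}\,
  \frac{A_{IJK}}{(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)}
 &= (x_1 - x_2)(x_1 - x_3)(x_2 - x_3)\,
  \frac{A_{IJK}}{(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)}
  = A_{IJK} .
\end{align}
```

With all three positions fixed there is no integration left, so the amplitude reduces to the structure constant, as stated.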
We consider N = 1 configurations, in which supersymmetry may be broken, as we have just seen, by the presence of D-terms. In N = 1 theories, the Yukawa couplings are encoded in the superpotential. Strictly speaking, our effective action is an N = 1 supergravity and, besides the twisted matter multiplets $\Phi^I$, we have matter multiplets originating from the closed string sector, including the moduli scalars $m$. As we did for the Kähler potential, though, we presently regard the moduli $m$ as fixed and expand the superpotential in the open string multiplets $\Phi^I$. The cubic level of this expansion,
$$W = W_{IJK}(m)\, \Phi^I \Phi^J \Phi^K , \tag{6.4}$$
displays the holomorphic couplings $W_{IJK}$ when the moduli are written in the appropriate complex basis. These couplings govern the Yukawa terms involving one scalar and two fermions from these multiplets. They do not, however, directly represent the physical Yukawa couplings $Y_{IJK}$, because in the N = 1 Lagrangian the chiral multiplet fields have non-canonical kinetic terms involving the Kähler metric $K_{I\bar J}(m)$. To read off the physical couplings$^{14}$ we have to rescale the fields, $\Phi^I \to (K_{I\bar I})^{-1/2}\, \Phi^I$, see the discussion before (5.2), getting
$$Y_{IJK} = [K_{I\bar I}\, K_{J\bar J}\, K_{K\bar K}]^{-1/2}\, W_{IJK} . \tag{6.5}$$
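The rescaling can be traced in a schematic Lagrangian. The sketch below assumes a diagonal Kähler metric, written $K_I \equiv K_{I\bar I}$, and suppresses signs, index placement, combinatorial factors and the overall supergravity factor $e^{K/2}$, none of which are fixed by the discussion above; the hatted fields are notation introduced only for this illustration.

```latex
% Schematic N=1 terms with non-canonical kinetic normalization (no sum on I, J)
\begin{align}
\mathcal{L} &\supset K_I\, \partial_\mu \bar\phi^{I} \partial^\mu \phi^{I}
   + K_J\, \bar\chi^{J} i \partial\!\!\!/\, \chi^{J}
   + \left( W_{IJK}\, \phi^I \chi^J \chi^K + \text{h.c.} \right) , \\
% Canonically normalized (hatted) fields
\hat\phi^I &= K_I^{1/2}\, \phi^I , \qquad \hat\chi^J = K_J^{1/2}\, \chi^J , \\
% The cubic coupling in terms of the hatted fields
W_{IJK}\, \phi^I \chi^J \chi^K
  &= \left[ K_I K_J K_K \right]^{-1/2} W_{IJK}\; \hat\phi^I \hat\chi^J \hat\chi^K
  \equiv Y_{IJK}\; \hat\phi^I \hat\chi^J \hat\chi^K .
\end{align}
```

Each field contributes one factor of $K^{-1/2}$, which is the origin of the exponent in (6.5).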
For N = 1 effective theories realized in heterotic string compactifications, a powerful non-renormalization theorem [56] asserts that the superpotential $W$ receives no perturbative corrections. It is likely that the same non-renormalization property also holds in the brane-world context. If this is the case, we should identify the holomorphic couplings of the superpotential (6.4) with the classical world-sheet instanton contributions appearing in (6.1),
$$W_{IJK}\big|_{\rm superpotential} = W_{IJK}\big|_{\rm world\text{-}sheet\ instantons} , \tag{6.6}$$
14 We assume here that the Kähler metric is diagonal in the space of the $\Phi^I$, which is indeed the case for the twisted matter we consider.
since these contributions depend non-perturbatively on $\alpha'$: $W_{IJK} \sim e^{-S/\alpha'}$. On the other hand, the string amplitude $A_{IJK}$ with three twisted vertices can certainly be expanded perturbatively in $\alpha'$ and so cannot contribute to the form of the superpotential. It then follows from Eqs. (6.1) and (6.5) that $A_{IJK}$ should factorize in terms of the Kähler metrics of the fields involved, namely
$$A_{IJK} = [K_{I\bar I}\, K_{J\bar J}\, K_{K\bar K}]^{-1/2} . \tag{6.7}$$
This remarkable statement should be checked against the direct computation of the string correlator $A_{IJK}$. Let us analyze this problem and, to begin with, let us recall which chiral multiplets we consider and set up a convenient notation.
As we discussed in Section 4, an open string stretching between two different D-branes is characterized by the eigenvalues $\theta_i$ ($i = 1, 2, 3$) of its monodromy $R(\theta)$. It contains in its NS spectrum three different scalars $\phi^i_\theta$ which can be retained in the effective theory also for non-trivial values of the twists. These are the states of Eq. (4.3c); their mass is given in Eqs. (4.3c)–(4.5), and the corresponding emission vertices in Eq. (4.6). For $\sum_{j \neq i} \theta_j - \theta_i = 0$, the scalar $\phi^i_\theta$ is massless and sits in a chiral N = 1 multiplet with the massless chiral fermion from the R sector. The latter is present for any value of the $\theta$'s, and its emission vertex was written in Eq. (4.11).
In a given model of magnetized or intersecting D-branes, there are various types of branes, and many open string sectors associated with the various pairs of different D-branes. These sectors are distinguished by the corresponding monodromy $R(\theta)$ and its eigenvalues, the twists $\theta_j$. We can thus label$^{15}$ the chiral multiplets as $\Phi^i_\theta$. The corresponding Kähler metric is evidently diagonal in the space of different open string sectors, and also with respect to the type $i$ of scalars under consideration. We denote it by $K_{i\bar i}(\theta)$. The string amplitude computing the quantum part of the Yukawa couplings among three such chiral multiplets $\Phi^i_\theta$, $\Phi^j_\nu$ and $\Phi^k_\omega$ is
$$A_{ijk}(\theta, \nu, \omega) = \langle V_{\phi^i_\theta}\, V_{\chi_\nu}\, V_{\chi_\omega} \rangle \tag{6.8}$$
and it is encoded in the conformal correlator of the vertices as indicated in Eq. (6.3). We restrict ourselves to the coupling between multiplets of the same type $i = j = k$, say for instance $i = 1$. We can then simplify the notation further, writing $K(\theta)$ for the Kähler metric $K_{1\bar 1}(\theta)$ and $A(\theta, \nu, \omega)$ for the quantum Yukawa amplitude $A_{111}(\theta, \nu, \omega)$.
6.1. The factorized case
Let us consider first the situation in which the torus is factorized and the fluxes (or the angles) for all the branes involved in the amplitude respect the factorization, so that the reflection matrices, and hence the monodromy matrices $R$ for the various open strings, commute. In this case there is a single complex basis of bosonic and fermionic world-sheet fields, $Z^i$ and $\Psi^i$, in which all the monodromies act diagonally as specified by their eigenvalues $\theta_i$, $\nu_i$ and $\omega_i$. We can then directly substitute into the amplitude $A(\theta, \nu, \omega)$ the expressions of the vertices given in Section 4.
15 That is, the index $I$ is a shorthand for $(\theta, i)$ in this case: the twist $\theta$ singles out the open string sector, and the index $i = 1, 2, 3$ the scalar component of Eq. (4.3c) we refer to.
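Both the NS vertex $V_{\phi^1_\theta}$ given in Eq. (4.6) and the R vertices $V_{\chi_\nu}$ and $V_{\chi_\omega}$ (given by Eq. (4.11) with, obviously, the $\theta$'s replaced by $\nu$'s or $\omega$'s) contain$^{16}$ products of bosonic twist fields $\sigma_{\theta_i}$ and of fermionic ones.
From the fermionic twist fields we get the correlators
$$\langle S_{\theta_1 - 1}(x_1)\, S_{\nu_1 - \frac12}(x_2)\, S_{\omega_1 - \frac12}(x_3) \rangle\; \langle S_{\theta_2}(x_1)\, S_{\nu_2 - \frac12}(x_2)\, S_{\omega_2 - \frac12}(x_3) \rangle\; \langle S_{\theta_3}(x_1)\, S_{\nu_3 - \frac12}(x_2)\, S_{\omega_3 - \frac12}(x_3) \rangle . \tag{6.9}$$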
Using, for instance, the bosonized formalism introduced in Eq. (2.43), it is immediate to see that the non-vanishing of these correlators requires
$$\theta_1 + \nu_1 + \omega_1 = 2 , \qquad \theta_2 + \nu_2 + \omega_2 = 1 , \qquad \theta_3 + \nu_3 + \omega_3 = 1 . \tag{6.10}$$
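These selection rules can be read off from charge conservation. The sketch below assumes that each fermionic twist field is bosonized as $S_a = e^{i a H}$, with an independent free boson $H$ for each complex plane; this is our reading of the formalism of Eq. (2.43), not a quotation of it. The symbols $\theta$, $\nu$, $\omega$ denote the twists of the three open strings.

```latex
% Assumed bosonization: S_a = e^{i a H}; a correlator of exponentials of a
% free boson vanishes unless the charges add up to zero
\begin{align}
\langle e^{i a_1 H}(x_1)\, e^{i a_2 H}(x_2)\, e^{i a_3 H}(x_3) \rangle \neq 0
  \quad &\Longrightarrow \quad a_1 + a_2 + a_3 = 0 , \\
% First plane: the NS vertex carries the excited twist S_{\theta_1 - 1}
(\theta_1 - 1) + (\nu_1 - \tfrac12) + (\omega_1 - \tfrac12) = 0
  \quad &\Longrightarrow \quad \theta_1 + \nu_1 + \omega_1 = 2 , \\
% Planes i = 2, 3: the NS vertex carries the unexcited twist S_{\theta_i}
\theta_i + (\nu_i - \tfrac12) + (\omega_i - \tfrac12) = 0
  \quad &\Longrightarrow \quad \theta_i + \nu_i + \omega_i = 1 .
\end{align}
```

The asymmetry between the first plane and the other two comes entirely from the excited fermionic twist in the scalar vertex.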
Subtracting the first of these equations from the sum of the others yields a relation between the masses of the scalar components $\phi^1_\theta$, $\phi^1_\nu$, $\phi^1_\omega$ of the three multiplets involved in the interaction: using Eq. (4.3c) we indeed find
$$M^2(\theta) + M^2(\nu) + M^2(\omega) = 0 . \tag{6.11}$$
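The step can be spelled out as follows. The sketch assumes that the mass of the scalar $\phi^1$ in a sector with twists $\theta_i$ is proportional to $\theta_2 + \theta_3 - \theta_1$, in line with the massless condition quoted earlier; the positive constant $c$ stands for the normalization in Eq. (4.3c), which is not reproduced in this section.

```latex
% Assumed form of the scalar mass; c > 0 is a placeholder normalization
\begin{align}
M^2(\theta) &= c\, (\theta_2 + \theta_3 - \theta_1) , \\
M^2(\theta) + M^2(\nu) + M^2(\omega)
  &= c \left[ (\theta_2 + \nu_2 + \omega_2) + (\theta_3 + \nu_3 + \omega_3)
      - (\theta_1 + \nu_1 + \omega_1) \right] \\
  &= c\, (1 + 1 - 2) = 0 .
\end{align}
```

Since the three masses sum to zero, they either all vanish or at least one of them is negative.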
This relation is obviously satisfied when all three open strings are in supersymmetric configurations, i.e. when $\theta_2 + \theta_3 - \theta_1 = 0$ and similarly for the $\nu$'s and the $\omega$'s, so that all scalar masses vanish. If we allow for non-zero masses as explained in (4.4)–(4.5), then this relation implies that at least one of the three multiplets has a tachyonic scalar component. In fact, this might be a desirable feature: following the idea of the Higgs as a tachyon [17], the Yukawa couplings correctly represent the 3-point functions among the Higgs field and the SM fermions.
The correlator of our vertices also involves the following correlator of the bosonic twist fields, which is in fact the most crucial and non-trivial ingredient:
$$\langle \sigma_{\theta_1}(x_1)\, \sigma_{\nu_1}(x_2)\, \sigma_{\omega_1}(x_3) \rangle\; \langle \sigma_{\theta_2}(x_1)\, \sigma_{\nu_2}(x_2)\, \sigma_{\omega_2}(x_3) \rangle\; \langle \sigma_{\theta_3}(x_1)\, \sigma_{\nu_3}(x_2)\, \sigma_{\omega_3}(x_3) \rangle . \tag{6.12}$$
As for the fermionic twist fields, we get the product of three independent correlators, pertaining to the CFTs of the three bosons $Z^i$, because of the factorized situation we are considering.
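For the CFT of a complex boson, the correlator of three twist fields can be obtained from the factorization of the 4-twist correlator and it turns out [63,75–78] to be given by
$$\langle \sigma_\theta(x_1)\, \sigma_\nu(x_2)\, \sigma_\omega(x_3) \rangle = \prod_{i>j} (x_i - x_j)^{h - 2(h_i + h_j)} \times \begin{cases} \left[ \dfrac{\Gamma(1-\theta)\, \Gamma(1-\nu)\, \Gamma(1-\omega)}{\Gamma(\theta)\, \Gamma(\nu)\, \Gamma(\omega)} \right]^{1/4} , & \theta + \nu + \omega = 1 , \\[2ex] \left[ \dfrac{\Gamma(\theta)\, \Gamma(\nu)\, \Gamma(\omega)}{\Gamma(1-\theta)\, \Gamma(1-\nu)\, \Gamma(1-\omega)} \right]^{1/4} , & \theta + \nu + \omega = 2 , \end{cases} \tag{6.13}$$
where $h_i$ is the conformal dimension of the twist field inserted at $x_i$, given in Eq. (2.16), and $h = \sum_i h_i$. Inserting this result in the product of bosonic twist correlators (6.12), and taking into account the relations (6.10) among the twist angles, we can finally write the fusion coefficient of the three vertices $V_{\phi^1_\theta}$, $V_{\chi_\nu}$ and $V_{\chi_\omega}$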
16 They also contain superghost terms and $e^{ik\cdot X}$ terms, but it is easy to see that their correlators do not modify the fusion coefficients, apart from imposing the obvious momentum conservation.
Fig. 3. The reflection matrices $R_0$ and $R_\pi$ and the monodromy matrices $R = R_\pi^{-1} R_0$ for the three twisted open strings. The matrices pertaining to the different strings are labeled by the angles $\theta$, $\nu$, $\omega$, which determine the eigenvalues of the monodromy according to Eq. (2.9).
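$$A(\theta, \nu, \omega) = \left[ \frac{\Gamma(\theta_1)}{\Gamma(1-\theta_1)} \prod_{i \neq 1} \frac{\Gamma(1-\theta_i)}{\Gamma(\theta_i)}\; \frac{\Gamma(\nu_1)}{\Gamma(1-\nu_1)} \prod_{i \neq 1} \frac{\Gamma(1-\nu_i)}{\Gamma(\nu_i)}\; \frac{\Gamma(\omega_1)}{\Gamma(1-\omega_1)} \prod_{i \neq 1} \frac{\Gamma(1-\omega_i)}{\Gamma(\omega_i)} \right]^{1/4} = \left[ K(\theta)\, K(\nu)\, K(\omega) \right]^{-1/2} , \tag{6.14}$$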
where in the last step we used the explicit expression of the Kähler metrics for the chiral multiplets, given in Eq. (5.30) (see also Eq. (5.31) in the field theory limit). Notice that the exponential terms in $K(\theta) K(\nu) K(\omega)$ reduce to 1 using the relation (6.11) between the masses. Thus, the explicit computation of the quantum part of the stringy Yukawa couplings in the case of factorized twisted chiral multiplets agrees with the expectation (6.7) inferred from the non-renormalization property of the superpotential. Of course, the same property can also be derived by working with the symmetric scalar, following the reasoning for the Kähler metric before Eq. (5.32), again finding agreement.
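The bookkeeping behind this result can be sketched plane by plane. The block below assumes that the fermionic twist correlators contribute no non-trivial constant and that the Gamma-function part of the Kähler metric has the form written in the last line, which is what the factorization requires; Eq. (5.30) itself is not reproduced here. The shorthand $C_i$ and the label $\alpha$ are introduced only for this illustration.

```latex
% By (6.10), plane 1 has total twist 2 and planes 2, 3 have total twist 1,
% so (6.13) gives the constants
\begin{align}
C_1 &= \left[ \frac{\Gamma(\theta_1)\, \Gamma(\nu_1)\, \Gamma(\omega_1)}
   {\Gamma(1-\theta_1)\, \Gamma(1-\nu_1)\, \Gamma(1-\omega_1)} \right]^{1/4} , \qquad
C_{i} = \left[ \frac{\Gamma(1-\theta_i)\, \Gamma(1-\nu_i)\, \Gamma(1-\omega_i)}
   {\Gamma(\theta_i)\, \Gamma(\nu_i)\, \Gamma(\omega_i)} \right]^{1/4}
   \quad (i = 2, 3) , \\
% Regroup the product by open string sector instead of by plane
A = C_1 C_2 C_3 &= \prod_{\alpha \in \{\theta, \nu, \omega\}}
   \left[ \frac{\Gamma(\alpha_1)}{\Gamma(1-\alpha_1)}
   \prod_{i=2,3} \frac{\Gamma(1-\alpha_i)}{\Gamma(\alpha_i)} \right]^{1/4} , \\
% Each factor is K(\alpha)^{-1/2} provided that (assumed form)
K(\alpha) &\propto \left[ \frac{\Gamma(1-\alpha_1)}{\Gamma(\alpha_1)}
   \prod_{i=2,3} \frac{\Gamma(\alpha_i)}{\Gamma(1-\alpha_i)} \right]^{1/2} .
\end{align}
```

The regrouping is possible only because every Gamma function depends on a single twist, which is special to the factorized case.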
6.2. The oblique case and non-Abelian twist fields
Let us now suppose that the monodromy matrices pertaining to the three open strings we are considering do not commute with each other. This is what happens for generic quantized fluxes on each stack of branes (i.e., for oblique fluxes) on a generic torus. Indeed, the reflection matrices $R_\sigma$ for strings with their endpoints $\sigma = 0, \pi$ attached to D-branes with fluxes $F_\sigma$, given by Eq. (2.4), have no reason to commute among themselves, except in the factorized case considered above. Hence, the various monodromy matrices $R = R_\pi^{-1} R_0$ do not commute either. Each monodromy can still be diagonalized as in Eq. (2.9), and its eigenvalues depend on a set of angles $\theta_i$. However, the monodromies $R(\theta)$, $R(\nu)$ and $R(\omega)$ of the three open strings involved in the Yukawa amplitude cannot be simultaneously diagonalized.$^{17}$
17 They are not completely independent, though, since $R(\theta)\, R(\nu)\, R(\omega) = 1$, see Fig. 3.
The bosonic and fermionic twist fields occurring in a vertex must be such that their OPE with the bosonic (respectively fermionic) fields imposes the condition
$$X^M(e^{2\pi i} z) = R(\theta)^M{}_N\, X^N(z) \tag{6.15}$$
for the six bosonic coordinates $X^M$ along the torus (respectively, the analogous relation for the $\Psi^M$ fields), as indicated in Eq. (2.6). These fields can therefore be defined only within the CFT describing the six bosonic (respectively fermionic) directions along $T^6$, and not within its factorization into the CFTs of three complex bosons (respectively fermions). We use the notation $\sigma_{R(\theta)}$ for such bosonic twist fields and $S_{R(\theta)}$ for the fermionic ones.
According to Eqs. (6.8) and (6.3), the quantum Yukawa amplitude$^{18}$ $A(\theta, \nu, \omega)$ coincides with the fusion coefficient in the correlator of the three emission vertices $V_{\phi^1_\theta}$, $V_{\chi_\nu}$ and $V_{\chi_\omega}$. It is thus determined by the product of the three-point correlators of the bosonic and fermionic twist fields appearing in these vertices,
$$\langle S_{R(\theta)}(x_1)\, S^{\rm Ramond}_{R(\nu)}(x_2)\, S^{\rm Ramond}_{R(\omega)}(x_3) \rangle\; \langle \sigma_{R(\theta)}(x_1)\, \sigma_{R(\nu)}(x_2)\, \sigma_{R(\omega)}(x_3) \rangle . \tag{6.16}$$
Here we denoted by $S_{R(\theta)}$ the excited$^{19}$ fermionic twist field which enters the emission vertex $V_{\phi^1_\theta}$, and by $S^{\rm Ramond}_{R(\nu)}$, $S^{\rm Ramond}_{R(\omega)}$ the fermionic twists in the Ramond sector, which implement an extra minus sign in the monodromy with respect to the NS ones. If the three monodromy matrices $R(\theta)$, $R(\nu)$ and $R(\omega)$ commute with each other, we can simultaneously diagonalize the twist operators $\sigma_{R(\theta)}$, $\sigma_{R(\nu)}$ and $\sigma_{R(\omega)}$, and thus the last correlator in (6.16) can be factorized into a product of three correlators as in (6.12). Notice that this may also happen for a non-diagonal background. However, in the most general situation the three monodromy matrices do not commute and the structure of the bosonic twist correlator is more involved.
On the other hand, the non-renormalization property of the superpotential $W$ still suggests that the amplitude $A(\theta, \nu, \omega)$ should be given in terms of the Kähler metrics for the three chiral multiplets, as in Eq. (6.7):
$$A(\theta, \nu, \omega) = \left[ K(\theta)\, K(\nu)\, K(\omega) \right]^{-1/2} . \tag{6.17}$$
We have shown in this paper that the expression of the Kähler metric is always given by Eq. (5.30), independently of whether we are in an Abelian or in an oblique situation, and depends just on the monodromy eigenvalues. So, in fact, $A(\theta, \nu, \omega)$ should really depend just on the angles $\theta_i$, $\nu_i$ and $\omega_i$.
As a consequence, we are led to conjecture that the non-Abelian twist field correlator in Eq. (6.16), which in principle depends on the entire monodromy matrices $R(\theta)$, $R(\nu)$ and $R(\omega)$, has in fact the same expression as a correlator of Abelian twist fields characterized by the monodromy eigenvalues $\theta_i$, $\nu_i$, $\omega_i$. Proving (or disproving) this conjecture is a very interesting challenge in CFT.
18 The notation here is slightly misleading. The angles $\theta$, $\nu$, $\omega$ just label the type of vertices in the amplitude. There is no a priori reason, from the CFT point of view, for the amplitude to be a function of these angles only. We would, in general, expect it to depend on the complete monodromy matrices $R(\theta)$, $R(\nu)$, $R(\omega)$.
19 Of course, we can diagonalize one of the monodromy matrices, say $R(\theta)$, by choosing its complex eigenvector basis $Z$. The corresponding open string sector is then described exactly as in Section 2. The twist fields are factorized: $\sigma_{R(\theta)} = \sigma_{\theta_1} \sigma_{\theta_2} \sigma_{\theta_3}$, and similarly for the fermionic ones. The vertex $V_{\phi^1_\theta}$ has the expression of Eq. (4.6), and the excited fermionic twist $S_{R(\theta)}$ it contains is just $S_{\theta_1 - 1} S_{\theta_2} S_{\theta_3}$.
Acknowledgements
We thank Laurent Gallot for collaboration at the beginning of this project. We also thank Bobby Acharya, Massimo Bianchi, Giulio Bonelli, Gianguido Dall'Agata, Dario Duò, Marialuisa Frau, Wolfgang Lerche, Igor Pesando, Claudio Scrucca, Marco Serone, Gary Shiu, Stephan Stieberger and Angel Uranga for useful discussions and comments. This work is partially supported by the European Community's Human Potential Programme under contract MRTN-CT-2004-005104 (in which A.L. is associated to Torino University and M.Be. to Padova University) and by the Italian MIUR under contract PRIN-2003023852. M.Be. is also supported by a MIUR fellowship within the program "Incentivazione alla mobilità degli studiosi italiani e stranieri residenti all'estero".
Appendix A. Dependence of the open string twists on the closed moduli
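Let $m$ be a closed string modulus which is a generic function of the NSNS fields $G$ and $B$. Writing $\mathcal{R} = E R E^{-1}$ for the diagonalized monodromy, from Eq. (2.9) we have
$$2\pi i\, \frac{\partial \theta_i}{\partial m} = \left( \frac{\partial \mathcal{R}}{\partial m}\, \mathcal{R}^{-1} \right)_{ii} = \left( E\, \frac{\partial R}{\partial m}\, R^{-1} E^{-1} + \left[ \frac{\partial E}{\partial m} E^{-1} , \mathcal{R} \right] \mathcal{R}^{-1} \right)_{ii} = \left( E\, \frac{\partial R}{\partial m}\, R^{-1} E^{-1} \right)_{ii} , \tag{A.1}$$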
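where in the last step we used the fact that the commutator of an arbitrary matrix with a diagonal matrix, such as $\mathcal{R} = E R E^{-1}$, has no entries on the diagonal. Since $R = R_\pi^{-1} R_0$, we get with simple manipulations
$$2\pi i\, \frac{\partial \theta_i}{\partial m} = \left( E \left[ \frac{\partial R_\pi^{-1}}{\partial m}\, R_\pi - R\, \frac{\partial R_0^{-1}}{\partial m}\, R_0\, R^{-1} \right] E^{-1} \right)_{ii} . \tag{A.2}$$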
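We can now take advantage of the following property (which will be needed again in later stages of the computation):
$$\left( E R A E^{-1} \right)_{ii} = \left( E A R E^{-1} \right)_{ii} , \tag{A.3}$$
which holds for any matrix $A$, since $E R E^{-1} = \mathcal{R}$ is diagonal, to get
$$2\pi i\, \frac{\partial \theta_i}{\partial m} = \left( E \left[ \frac{\partial R_\pi^{-1}}{\partial m}\, R_\pi - \frac{\partial R_0^{-1}}{\partial m}\, R_0 \right] E^{-1} \right)_{ii} . \tag{A.4}$$
From the expression (2.4) of the reflection matrices $R_\sigma$ ($\sigma = 0, \pi$) it follows that
$$\frac{\partial R_\sigma^{-1}}{\partial m}\, R_\sigma = \frac12 \left( 1 + R_\sigma^{-1} \right) G^{-1} \left[ - \frac{\partial (G + B)}{\partial m} + \frac{\partial (G - B)}{\partial m}\, R_\sigma \right] , \tag{A.5}$$
where we used
$$(G + B + F_\sigma)^{-1} = \frac12 \left( 1 + R_\sigma^{-1} \right) G^{-1} . \tag{A.6}$$
Substituting the expression (A.5) into Eq. (A.4) we get four contributions. Combining the two terms proportional to $\partial (G + B)$ we find
$$- \frac12 \left( E \left[ R_\pi^{-1} - R_0^{-1} \right] G^{-1}\, \frac{\partial (G + B)}{\partial m}\, E^{-1} \right)_{ii} . \tag{A.7}$$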
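Again, by using Eq. (A.6), the two terms proportional to $\partial (G - B)$ can be written as
$$\begin{aligned} & \frac12 \left( E \left[ \left( 1 + R_\pi^{-1} \right) G^{-1}\, \frac{\partial (G - B)}{\partial m}\, R_0 R_0^{-1} R_\pi - \left( 1 + R_0^{-1} \right) G^{-1}\, \frac{\partial (G - B)}{\partial m}\, R_0 \right] E^{-1} \right)_{ii} \\ &\quad = \frac12 \left( E \left[ R_0^{-1} R_\pi - 1 \right] G^{-1}\, \frac{\partial (G - B)}{\partial m}\, R_0\, E^{-1} \right)_{ii} \\ &\quad = \frac12 \left( E\, G^{-1}\, \frac{\partial (G - B)}{\partial m} \left[ R_\pi - R_0 \right] E^{-1} \right)_{ii} , \end{aligned} \tag{A.8}$$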
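where we used the property (A.3) several times to move the matrix $R^{-1}$. The last step is to write Eq. (A.7) in terms of the transposed matrix. This can be done by using the properties (2.10) of the vielbeins, ${}^t E\, G = G\, E^{-1}$, and of the reflection matrix (2.8). In this way we find that Eq. (A.7) can be written in terms of the anti-holomorphic elements of the same matrix appearing in the last step of Eq. (A.8):
$$2\pi i\, \frac{\partial \theta_i}{\partial m} = \frac12 \left( E\, G^{-1}\, \frac{\partial (G - B)}{\partial m} \left[ R_\pi - R_0 \right] E^{-1} \right)_{ii} - \frac12 \left( E\, G^{-1}\, \frac{\partial (G - B)}{\partial m} \left[ R_\pi - R_0 \right] E^{-1} \right)_{\bar i \bar i} . \tag{A.9}$$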
If m is a real modulus, then the second term in the equation above is just the complex conjugate
of the first one.
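Property (A.3) and the vanishing of the commutator term in (A.1) both follow from one observation, sketched here. The only assumption is the notation: $\mathcal{R} = E R E^{-1}$ denotes the diagonalized monodromy with diagonal entries $r_i$, and $M$ is an arbitrary matrix.

```latex
% Diagonalized monodromy: \mathcal{R} = E R E^{-1} = diag(r_1, r_2, ...)
\begin{align}
\left( E R A E^{-1} \right)_{ii}
  &= \left( \mathcal{R}\, E A E^{-1} \right)_{ii} = r_i \left( E A E^{-1} \right)_{ii} , \\
\left( E A R E^{-1} \right)_{ii}
  &= \left( E A E^{-1}\, \mathcal{R} \right)_{ii} = \left( E A E^{-1} \right)_{ii}\, r_i , \\
% Same argument for the commutator in (A.1)
\left( [M, \mathcal{R}]\, \mathcal{R}^{-1} \right)_{ii}
  &= \left( M_{ii}\, r_i - r_i\, M_{ii} \right) r_i^{-1} = 0 .
\end{align}
```

The same reasoning applies with $R$ replaced by $R^{-1}$, which is the form used to move $R^{-1}$ in (A.8).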
A.1. The two-dimensional case
Let us check the general expression for the dependence of the open string twists on the closed string moduli in the simple case of a 2-dimensional torus. From the results of Section 3 we know that the twists depend only on $T$:
$$2\pi i\, \frac{\partial \theta}{\partial U} = 0 , \qquad 2\pi i\, \frac{\partial \theta}{\partial T} = \frac{f_\pi - f_0}{(T - f_\pi)(T - f_0)} . \tag{A.10}$$
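Let us retrieve the same result from Eq. (A.9). Since in the present case the reflection matrices $R_\sigma$ in the complex basis are diagonal, it is convenient to rewrite Eq. (A.9) by suitably inserting the identity written as ${}^t E\, {}^t E^{-1}$ or $E E^{-1}$, getting
$$2\pi i\, \frac{\partial \theta}{\partial m} = \frac12 \left( {}^t E^{-1}\, \frac{\partial (G - B)}{\partial m}\, E^{-1} \right)_{21} (R_\pi - R_0)_{11} - \frac12 \left( {}^t E^{-1}\, \frac{\partial (G - B)}{\partial m}\, E^{-1} \right)_{12} (R_\pi - R_0)_{22} . \tag{A.11}$$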
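By using (3.13) and (3.15), we find for $m = U$
$${}^t E^{-1}\, \frac{\partial (G - B)}{\partial U}\, E^{-1} = \frac{i}{U_2} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} . \tag{A.12}$$
Since this matrix has no (21) component, we get immediately from Eq. (A.11) that $\partial \theta / \partial U = 0$, in agreement with Eq. (A.10). For $m = T$, we get instead
$${}^t E^{-1}\, \frac{\partial (G - B)}{\partial T}\, E^{-1} = \frac{i}{T_2} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} . \tag{A.13}$$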
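This matrix has a non-vanishing (21) component and so contributes to the first term of Eq. (A.11), and we get
$$2\pi i\, \frac{\partial \theta}{\partial T} = \frac{i}{2 T_2}\, (R_\pi - R_0)_{11} = \frac{i}{2 T_2} \left[ \frac{\bar T - f_\pi}{T - f_\pi} - \frac{\bar T - f_0}{T - f_0} \right] = \frac{f_\pi - f_0}{(T - f_\pi)(T - f_0)} , \tag{A.14}$$
in perfect agreement with Eq. (A.10).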
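The last equality is a short piece of algebra, worked out below. The sketch assumes the decomposition $T = T_1 + i T_2$ with real $T_1$, $T_2$, and real fluxes $f_0$, $f_\pi$.

```latex
\begin{align}
\frac{\bar T - f_\pi}{T - f_\pi} - \frac{\bar T - f_0}{T - f_0}
  &= \frac{(\bar T - f_\pi)(T - f_0) - (\bar T - f_0)(T - f_\pi)}{(T - f_\pi)(T - f_0)} \\
  &= \frac{(f_\pi - f_0)(\bar T - T)}{(T - f_\pi)(T - f_0)}
   = \frac{-2 i\, T_2\, (f_\pi - f_0)}{(T - f_\pi)(T - f_0)} , \\
% Multiply by the prefactor i / (2 T_2)
\frac{i}{2 T_2} \cdot \frac{-2 i\, T_2\, (f_\pi - f_0)}{(T - f_\pi)(T - f_0)}
  &= \frac{f_\pi - f_0}{(T - f_\pi)(T - f_0)} .
\end{align}
```

The factor $T_2$ drops out, so the result is holomorphic in $T$, as (A.10) requires.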
Appendix B. The metric for the untwisted matter
In this appendix we apply the same technique described in Section 5 to the case of open strings starting and ending on the same stack of D9-branes. In particular, we show that it is possible to determine completely the metric of the untwisted scalars from a 3-point function involving two scalars and one closed string modulus. Of course, this result can be read directly by compactifying the Born–Infeld action of a D9-brane to four dimensions,
$$S_{\rm untw} = \frac{N}{2} \int d^4 x\; e^{-\phi_4}\, \frac{\sqrt{\det(G - \mathcal{F})}}{(\det G)^{1/4}}\; D_\mu A_M\, D^\mu A_N\; G^{MN}_{\rm open} , \tag{B.1}$$
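where $N = T_9\, (2\pi\alpha')^{3/2}$, with $T_9$ being the D9-brane tension, and the four-dimensional dilaton $\phi_4$ is related to the ten-dimensional one by $e^{\phi_4} = e^{\phi_{10}}\, (\det G)^{-1/4}\, (2\pi\alpha')^{3/2}$ (recall that in our conventions the internal metric $G_{MN}$ has dimensions of (length)$^2$ and that the fields $A_M$ are dimensionless). Notice the appearance in (B.1) of the open string metric [79]
$$G^{MN}_{\rm open} = \left( \frac{1}{G - \mathcal{F}} \right)^{(MN)} = \frac12 \left( G^{-1} \right)^{MN} + \frac14 \left( R_0\, G^{-1} \right)^{MN} + \frac14 \left( {}^t (R_0\, G^{-1}) \right)^{MN} . \tag{B.2}$$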
Usually the action $S_{\rm untw}$ is written in terms of a rescaled string coupling [79]; here, instead, we prefer to keep the dependence on the four-dimensional dilaton $\phi_4$, because the insertion of the closed string modulus $m$ gives a differential equation where $\phi_4$ is kept constant, as seen in Section 3.
The dimensionless fields AM in (B.1) simply represent the internal components of the gauge
field and so the corresponding vertex operator (in the zero picture) can be read from the boundary
terms of the actions (2.1) and (2.24), namely
1 M 1 M
i
i
M
M
VM = M (k) x + x k + k + eikX ,
(B.3)
2
2
2
2
where M = AM / 2 . On the other hand, the vertex for a closed string modulus m is given by
Eq. (3.9), where in the $(-1, -1)$ picture the operator $W^{MN}(z, \bar z)$ is the product of the left and right vertices of Eq. (5.43).
Now we can proceed as in Section 5 and compute the amplitude
ei kL kR
dx1 dx2 d 2 z
V
(x
)W
(z,
z
)V
(x
)
,
Auntw =
(B.4)
1
m
2
M
N
dVCKG
8 2
where in the normalization we have included also the cocycle factor of the closed string vertex.
The boundary conditions for this diagram are
(x) = (x),
M
+
(x) = M (x),
(B.5)
(B.6)
for the closed string ones. Then by following the same steps as in Section 5, we obtain
Auntw =
ei kL kR
dx1 dx2 d 2 z
A
(k
)A
(k
)
Y
(G
B)R
M
1
N
2
0
m
dVCKG
32 3
PQ
M
1
x (x1 )x N (x2 ) P (z) Q (z)
(z z )
k1 k2
(x1 x2 )1 (1 + R0 )M I (1 + R0 )N J
2
I
P
Q
J
(x1 ) (z) (z) (x2 ) ,
+
(B.7)
where
Y eik1 X(x1 ) eikL X(z) eikR X(z) eik2 X(x2 ) = s/2 (1 ) s
(B.8)
in terms of the anharmonic ratio defined in Eq. (5.16). To proceed, we use the following basic
correlators
M
x (x1 )x N (x2 ) = 2 GMN
open ln(x1 x2 ) and
2 GMN
M (x1 ) N (x2 ) =
(x1 x2 )
(B.9)
(G B)R0
AM (k1 )AN (k2 )
d s/2 (1 ) s2
Auntw =
8
m
PQ
s P M t P M QN t QN 1
PQ
G
G
GMN
G
(1
s)
+
+ R0
+ R0
open
4
s QM t QM P N t P N
G
(B.10)
G + R0 (1 ) .
+ R0
4
In this expression, the last two terms (proportional to s) are one the transpose of the other,
as one can see with the change of variable 1/; so we can keep just the first term and
symmetrize the result in the indices M and N . In this way we get
i
Auntw =
(G
B)R
A
(k
)A
(k
)
sin( s/2)
M 1
N 2
0
4
m
PQ
PQ
(1 s)B(1 s/2, s 1)GMN
open G
+
s
P (M QN)
QN)
B( s/2, s) GP (M + t R0
G
.
+ t R0
2
(B.11)
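Now we expand this result in $s$ and focus on the (leading) term with two derivatives, which captures the kinetic term for the scalar fields $A_M$, that is
$$A_{\rm untw} = \frac{i}{2}\, k_1 \cdot k_2\; A_M(k_1)\, A_N(k_2) \left[ \frac14\, {\rm tr} \left( \frac{\partial (G - B)}{\partial m}\, R_0\, G^{-1} \right) G^{MN}_{\rm open} - \left( (G - \mathcal{F})^{-1}\, \frac{\partial (G - B)}{\partial m}\, R_0\, (G + \mathcal{F})^{-1} \right)^{(MN)} \right] , \tag{B.12}$$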
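where we have used ${}^t R_0 = G R_0^{-1} G^{-1}$ and the identity (A.6) to rewrite the last term. The term proportional to $G^{MN}_{\rm open}$ yields the same total derivative found in Section 3 of [21], namely
$$\frac14\, {\rm tr} \left( \frac{\partial (G - B)}{\partial m}\, R_0\, G^{-1} \right) = \frac14\, {\rm tr} \left( \frac{\partial (G - \mathcal{F})}{\partial m} \left[ 2\, (G - \mathcal{F})^{-1} - G^{-1} \right] \right) = \partial_m \ln K , \tag{B.13}$$
where$^{20}$
$$K = N\, e^{-\phi_4}\, \frac{\sqrt{\det(G - \mathcal{F})}}{(\det G)^{1/4}} . \tag{B.14}$$
The second term in (B.12) yields exactly the open string metric (B.2); indeed
$$\left( (G - \mathcal{F})^{-1}\, \frac{\partial (G - B)}{\partial m}\, R_0\, (G + \mathcal{F})^{-1} \right)^{(MN)} = - \frac{\partial}{\partial m} \left( \frac{1}{G - \mathcal{F}} \right)^{(MN)} = - \frac{\partial}{\partial m}\, G^{MN}_{\rm open} .$$
Thus, we can write our result as
$$A_{\rm untw} = i\, K^{-1}\, \frac{\partial}{\partial m} \left[ \frac12\, K\, G^{MN}_{\rm open}\; k_1 \cdot k_2\; A_M(k_1)\, A_N(k_2) \right] . \tag{B.15}$$
In complete analogy with what we did in Eq. (5.5) for the twisted scalars, we identify the square bracket of Eq. (B.15) with the (momentum space) Lagrangian of the untwisted fields and thus reconstruct the action (B.1), in perfect agreement with the Born–Infeld result.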
References
[1] C. Bachas, hep-th/9503030.
[2] M. Berkooz, M.R. Douglas, R.G. Leigh, Nucl. Phys. B 480 (1996) 265, hep-th/9606139.
[3] R. Rabadán, Nucl. Phys. B 620 (2002) 152, hep-th/0107036.
[4] R. Blumenhagen, L. Goerlich, B. Körs, JHEP 0001 (2000) 040, hep-th/9912204.
[5] R. Blumenhagen, L. Goerlich, B. Körs, D. Lüst, JHEP 0010 (2000) 006, hep-th/0007024.
[6] C. Angelantonj, I. Antoniadis, E. Dudas, A. Sagnotti, Phys. Lett. B 489 (2000) 223, hep-th/0007090.
[7] G. Aldazabal, S. Franco, L.E. Ibáñez, R. Rabadán, A.M. Uranga, JHEP 0102 (2001) 047, hep-ph/0011132.
[8] A.M. Uranga, Class. Quantum Grav. 20 (2003) S373, hep-th/0301032.
[9] E. Kiritsis, Fortschr. Phys. 52 (2004) 200, hep-th/0310001.
[10] D. Lüst, Class. Quantum Grav. 21 (2004) S1399, hep-th/0401156.
[11] C. Kokorelis, hep-th/0410134.
20 Here we have also included the dependence on the dilaton $\phi_4$, which could be fixed by computing a 3-point amplitude with the dilaton vertex, as discussed in detail in Ref. [21]. We have also included the appropriate dimensional prefactor $N = T_9\, (2\pi\alpha')^{3/2}$ to make $K$ dimensionless.
[68] M. Cvetič, G. Shiu, A.M. Uranga, Phys. Rev. Lett. 87 (2001) 201801, hep-th/0107143.
[69] M. Cvetič, G. Shiu, A.M. Uranga, Nucl. Phys. B 615 (2001) 3, hep-th/0107166.
[70] D. Cremades, L.E. Ibáñez, F. Marchesano, JHEP 0207 (2002) 009, hep-th/0201205.
[71] P. Di Vecchia, L. Magnea, A. Lerda, R. Russo, R. Marotta, Nucl. Phys. B 469 (1996) 235, hep-th/9601143.
[72] H. Ooguri, Y. Oz, Z. Yin, Nucl. Phys. B 477 (1996) 407, hep-th/9606112.
[73] M. Dine, I. Ichinose, N. Seiberg, Nucl. Phys. B 293 (1987) 253.
[74] M. Billò, M. Frau, F. Lonegro, A. Lerda, JHEP 0505 (2005) 047, hep-th/0502084.
[75] T.T. Burwick, R.K. Kaiser, H.F. Müller, Nucl. Phys. B 355 (1991) 689.
[76] J. Erler, D. Jungnickel, M. Spalinski, S. Stieberger, Nucl. Phys. B 397 (1993) 379, hep-th/9207049.
[77] S. Stieberger, D. Jungnickel, J. Lauer, M. Spalinski, Mod. Phys. Lett. A 7 (1992) 3059, hep-th/9204037.
[78] S. Stieberger, Phys. Lett. B 300 (1993) 347, hep-th/9211027.
[79] N. Seiberg, E. Witten, JHEP 9909 (1999) 032, hep-th/9908142.
Abstract
We study the measurement of the atmospheric neutrino oscillation parameters, $\theta_{23}$ and $\Delta m^2_{23}$, at the $\nu_\mu$ disappearance channel at three conventional beam facilities, the SPL, T2K-phase I and NO$\nu$A. These two parameters have been shown to be of crucial importance in the measurement of two of the unknowns of the PMNS mixing matrix, $\theta_{13}$ and the leptonic CP-violating phase $\delta$. In our analysis, the effect of the two discrete ambiguities, sign($\Delta m^2_{23}$) and sign($\tan 2\theta_{23}$), is explicitly taken into account. We also analyse the $\nu_\mu$ disappearance channel at the neutrino factory, and combine it with the "golden" $\nu_e \to \nu_\mu$ and "silver" $\nu_e \to \nu_\tau$ appearance channels to study its impact on the measurement of $\theta_{13}$ and $\delta$. Finally, we present the sensitivity of the four facilities to different observables: $\theta_{13}$, $\delta$, maximal $\theta_{23}$, the sign of the atmospheric mass difference, $s_{\rm atm}$, and the $\theta_{23}$-octant, $s_{\rm oct}$.
© 2006 Elsevier B.V. All rights reserved.
PACS: 14.60.Pq; 14.60.Lm
1. Introduction
The results of atmospheric, solar, accelerator and reactor neutrino experiments [1] show that flavour mixing occurs not only in the hadronic sector, as has long been known, but in the leptonic sector as well. The experimental results point to two very distinct mass differences,$^1$ $\Delta m^2_{\rm sol} \approx 8.2 \times 10^{-5}$ eV$^2$ and $|\Delta m^2_{\rm atm}| \approx 2.5 \times 10^{-3}$ eV$^2$. Only two out of the four parameters of the three-family leptonic mixing matrix $U_{\rm PMNS}$ [4] are known: $\theta_{12} \approx 34^\circ$ and $\theta_{23} \approx 45^\circ$. The other two parameters, $\theta_{13}$ and $\delta$, are still unknown: for the mixing angle $\theta_{13}$, direct searches at reactors [5] and three-family global analyses of the experimental data [6] give the upper bound $\theta_{13} \le 11.5^\circ$, whereas for the leptonic CP-violating phase $\delta$ we have no information whatsoever. Two additional discrete unknowns are the sign of the atmospheric mass difference and the $\theta_{23}$-octant (if $\theta_{23} \neq 45^\circ$).
The full understanding of the leptonic mixing matrix constitutes, together with the discrimination of the Dirac/Majorana character of neutrinos and with the measurement of their absolute mass scale, the main neutrino-physics goal for the next decade. In the recent past, most of the experimental breakthroughs in neutrino physics have been achieved by exploiting the so-called "disappearance channels": by observing a deficit in the neutrinos that reach the detector with respect to those expected to be emitted from the source, a positive and eventually unambiguous signal of neutrino oscillations has been established. The SK detector has gathered indirect evidence of $\nu_\mu \to \nu_\tau$ conversion of atmospheric neutrinos. However, no direct observation of $\nu_\tau$'s is possible at this detector. To observe $\nu_\mu \to \nu_\tau$ conversion directly, two detectors are under construction at the Gran Sasso laboratory [7,8]. The SNO experiment [9] has shown that a fraction of the $\nu_e$'s emitted by the Sun core reach the Earth converted into $\nu_\mu$'s and $\nu_\tau$'s (and not into unobservable sterile neutrinos). However, the SNO detector is not able to distinguish between $\nu_\mu$'s and $\nu_\tau$'s and thus to measure the subleading oscillations $\nu_e \to \nu_\mu, \nu_\tau$ directly. For this reason, new-generation experiments have been proposed to look for the fleeting and intimately related parameters $\theta_{13}$ and $\delta$ through the more promising "appearance channels" such as $\nu_\mu \to \nu_e$ or $\nu_e \to \nu_\mu$ (the "golden channel") and $\nu_e \to \nu_\tau$ (the "silver channel"). However, strong correlations between $\theta_{13}$ and $\delta$ [10] and the presence of parametric degeneracies in the ($\theta_{13}$, $\delta$) parameter space [11–14] make the simultaneous measurement of the two variables extremely difficult. A further problem arises from our present imprecise knowledge of atmospheric parameters, whose uncertainties are far too large to be neglected when looking for such tiny signals as those expected in appearance experiments [15]. Most of the proposed solutions to these problems suggest combining different experiments and facilities, such as super-beams (of which T2K [16] is the first approved one), $\beta$-beams [17] or the neutrino factory [18,19].
Clearly, the $\nu_\mu$ disappearance channel is the best place to reduce further the uncertainties on atmospheric parameters. It must be recalled, however, that this channel is afflicted by a four-fold degeneracy [20]: the sign of the atmospheric mass difference and the $\theta_{23}$-octant are indeed poorly measured through this channel.$^2$ It has been proposed to address this problem using atmospheric neutrinos at T2K-II [21,22] or at a magnetized iron calorimeter [23,24].
The study of the $\nu_\mu$ disappearance channel at several proposed facilities is the main goal of this paper (partially presented in Ref. [25]). How the combination of appearance and disappearance channels at different facilities can be used to soften the degeneracy problem in the measurement of $\theta_{13}$ and $\delta$ is also analysed. We first study in detail the $\nu_\mu$ disappearance channel at three proposed super-beam facilities, T2K-I [16] (that is under construction at the Jaeri site), NO$\nu$A [26] (that passed first scrutiny and will rely upon the existing NuMI facility at FermiLab) and the SPL [27]. We will show how energy resolution is crucial to reduce significantly
1 A third mass difference, $\Delta m^2_{\rm LSND} \sim 1$ eV$^2$, suggested by the LSND experiment [2], has not been confirmed yet [3] and will not be considered in this paper.
2 This is ultimately why we have to deal with these ambiguities when looking at appearance signals.
atmospheric parameter uncertainties and how our present ignorance of $\theta_{13}$ and $\delta$ affects the results in the disappearance channel.
We then investigate the $\nu_\mu$ disappearance channel at the neutrino factory. This channel, although already considered in the literature (see Refs. [28,29], for example), has never gained the center of the stage on its own, being overshadowed by appearance channels such as the "golden" $\nu_e \to \nu_\mu$ and the "silver" [30] $\nu_e \to \nu_\tau$ transitions. This channel is able to reduce atmospheric parameter uncertainties to an unprecedented level and, through its interplay with appearance channels, to solve many of the discrete ambiguities for $\theta_{13} \ge 3^\circ$–$4^\circ$. In this range of $\theta_{13}$, the combination of appearance and disappearance channels is rather effective and $\theta_{13}$ and $\delta$ can also be measured unambiguously. Unfortunately, appearance and disappearance signals are optimized with different baselines: degeneracies are solved most effectively with an L = 3000 km baseline in the appearance channel and with an L = 7000 km baseline in the disappearance channel.$^3$
Finally, we present a comparison of the sensitivity of the different facilities to $\theta_{13}$, to the sign of the atmospheric mass difference, to the $\theta_{23}$-octant and to maximal $\theta_{23}$. Their CP-discovery potential [20] is also shown. As expected, the neutrino factory outperforms the considered super-beams in every aspect, with the notable exception of the CP-discovery potential, which seems to be larger for T2K-II. The neutrino factory is in this case limited by the accidental flow of degeneracies towards $|\delta| = 0^\circ$ or $180^\circ$ for very small values of $\theta_{13}$, see Refs. [33,34].
The paper is organized as follows: in Section 2 we briefly introduce the four facilities and the neutrino–nucleon cross-section; in Section 3 we compare the potential of the three conventional beams in the measurement of the atmospheric parameters $\theta_{23}$ and $\Delta m^2_{23}$; in Section 4 we study the potential of the neutrino factory in the measurement of the atmospheric parameters and combine the $\nu_\mu$ disappearance channel with golden and silver appearance channels; in Section 5 we show the sensitivity of the considered facilities to several unknowns, combining appearance and disappearance channels; in Section 6 we finally draw our conclusions.
2. The experimental setup
In this section we briefly describe the four facilities that we will use in the following, and we recall the neutrino–nucleon cross-section used throughout the paper.
2.1. The T2K beam
A super-beam is a conventional neutrino beam with a proton intensity higher than that of existing (or under construction) beams such as K2K [35], NuMI [36] and the CNGS [37]. With respect to the $\beta$-beam [17] and the neutrino factory [18], neutrino beams of a new design, it has the advantage of a well-known technology. On the other hand, the flux composition (with $\nu_\mu$ as the main component for a $\pi^+$ focusing, plus a small but unavoidable admixture of $\bar\nu_\mu$, $\nu_e$ and $\bar\nu_e$) limits its sensitivity to $\nu_\mu \to \nu_e$ oscillations.
The T2K-I facility [16] has been approved and it is under construction. It consists of a conventional beam with 0.75 MWatt power 50 GeV protons produced at the J-Parc site in Tokai
Fig. 1. Left: T2K-I fluxes at the Kamioka location (295 km baseline) [16]; Right: NO$\nu$A fluxes at the far location (812 km baseline) [38].
has a threefold advantage: first, it allows neutrinos produced by protons whose energy is fixed by the requirement that they may be used for different purposes to have a $\nu_\mu \to \nu_e$ oscillation probability peaked at L = 295 km, where the detector is located; second, the off-axis beam is narrower than the on-axis one, thus improving the matching of the L/E ratio with the first peak of the oscillation; third, it allows a reduction of the beam-driven background. Notice that the off-axis angle has not yet been fixed, so that the L/E ratio can be adjusted to slight variations in the measured value of $\Delta m^2_{23}$. The technical design of the tunnel is such that the off-axis angle can be varied between $2^\circ$ and $3^\circ$. The T2K fluxes for a $2^\circ$ off-axis angle as they are expected at the Kamiokande site are shown in Fig. 1 (left). Following the Letter of Intent [16], we have considered four 200 MeV bins with $E_{min}$ = 400 MeV. The muon and electron identification efficiencies per bin are presented in Fig. 21. The average neutrino energy is $\langle E_\nu \rangle$ = 0.75 GeV with 5 years of $\pi^+$ running, only.
The dominant background is made of pions from neutral currents. The detector is extremely
a narrow beam to match the L/E oscillation peak ratio. A reduced version of the detector will
be placed at the Fermilab site to extrapolate the expected rates and their energy spectrum in the far detector in the absence of oscillation. The NO$\nu$A fluxes for a $0.72^\circ$ off-axis angle as they are expected at the far site are shown in Fig. 1 (right).
In the following, we consider a totally active 30 kton detector made of liquid scintillator and PVC [40] with $6.5 \times 10^{20}$ pot/y (no proton driver is considered [39]). The average neutrino energy is $\langle E_\nu \rangle$ = 2.22 GeV, with 5 years of $\pi^+$ run only [26]. Events are grouped in three 650 MeV bins with $E_{min}$ = 1000 MeV, with a constant muon identification efficiency $\epsilon_\mu = 0.9$ [41] and a constant electron identification efficiency $\epsilon_e = 0.24$ [40].
The dominant background for the $\nu_\mu$ disappearance channel is $\nu_x$ neutral currents, with a rejection factor at the level of $5 \times 10^{-3}$ (for the appearance channel the background typically comes from electron neutrinos in the beam and from neutral current events faking electron neutrinos, with a rejection factor of approximately $2 \times 10^{-3}$).
The main sources of systematic error are the incomplete knowledge of the neutrino flux and the extrapolation of the flux and of the backgrounds from the near to the far detector. The near detector will be used to improve the present knowledge of low energy neutrino cross-sections and to complement the results of the MINER$\nu$A experiment [42] (operating before the bulk of NO$\nu$A data is collected). We have introduced a pessimistic 5% global systematic error.
2.3. The SPL
The SPL is a proposal for a high intensity conventional beam with a 4 MW 2.2 GeV proton driver located at CERN, Ref. [27]. The neutrino fluxes were first computed in a full simulation of the beamline in Ref. [43], assuming a decay tunnel length of 60 m. The corresponding fluxes are shown in Fig. 2 (left). However, the original beam design was conceived as the first stage of a neutrino factory, and it was not optimized as a facility to look for $\nu_\mu \to \nu_e$ oscillations on its own. Such an optimization has been presented in Ref. [44], and the modified fluxes are shown in Fig. 2 (right). We will consider both the old and new neutrino fluxes to study the potential of the disappearance channel at this facility to measure the atmospheric parameters $\theta_{23}$ and $\Delta m^2_{23}$.
Fig. 2. Left: Standard SPL super-beam fluxes at the Fréjus location (130 km baseline) [43]; Right: Optimized SPL super-beam fluxes at the Fréjus location (same baseline) [44].
Fig. 3. Left: 50 GeV neutrino factory fluxes at 3000 km [10]; Right: the $\nu N$ and $\bar\nu N$ cross-sections on water [53].
The detector backgrounds, muon identification efficiency and systematics for the golden and silver channels at this specific facility have been studied in Ref. [50] (the magnetized iron detector, MID) and Ref. [51] (the emulsion cloud chamber, ECC). It has been noticed that the MID simulation must be updated due to the growing evidence for the relevance of the low energy bins, which were sacrificed at first to reduce backgrounds in Ref. [10]. To take advantage of the low energy bins, in this paper we will consider for the disappearance channel 4 GeV bins when the MID is located at the L = 3000 km baseline and 5 GeV bins when the MID is located at the L = 7000 km baseline, instead of the 10 GeV bins considered in the appearance channels. The disappearance signal is quite different from the golden and silver appearance signals. In particular, the cuts needed to reduce the background do not have to be as strong as for the appearance channels. A reasonable constant muon identification efficiency $\epsilon_\mu = 0.7$ has been applied whenever the disappearance channel is considered throughout the paper. A global 2% systematic error has been included.
2.5. The neutrino cross-section
An important source of systematic error is our present poor knowledge of the $\nu N$ and $\bar\nu N$ cross-sections for energies below 1 GeV [52]: either there are very few data (the case of neutrinos) or there are no data at all (the case of antineutrinos). On top of that, the few available data have generally not been taken on the target used in the experiments (either water, iron, lead or plastics), and the extrapolation from different nuclei is complicated by nuclear effects that can play an important role at the considered energies.
We use for each detector a different cross-section following Ref. [53]. For example, we show in Fig. 3 (right) the cross-sections on water used for the water Cerenkov detectors throughout the paper. Notice the difference between the $\nu_e N$ and $\bar\nu_e N$ cross-sections: the former, being an interaction between the $\nu_e$ and a neutron inside the oxygen nucleus, is affected by nuclear effects and thus shows a threshold energy. The latter is mainly a $\bar\nu_e$ interaction with the protons of the two hydrogens, which are approximately free. This effect, although less pronounced, is visible also for $\nu_\mu$ and for $\bar\nu_\mu$. This feature is quite relevant for neutrinos and antineutrinos with energies of hundreds of MeV, a region where different cross-sections can easily differ by a factor of 2. Be aware that there are other nuclear effects (see Ref. [54] and references therein), not yet included in any of the available calculations, that could play an important role at the cross-section threshold energy.
In the rest of the paper, we will make use of cross-sections on iron, lead and scintillator for the MID, the ECC and the NO$\nu$A detectors, respectively.
3. The measurement of $\theta_{23}$ and $\Delta m^2_{23}$ at SPL, T2K and NO$\nu$A
3.1. Correlations and degeneracies in disappearance
A conventional beam with neutrinos of moderate energy can perform an independent measurement of the atmospheric parameters $\theta_{23}$ and $\Delta m^2_{23}$ via the $\nu_\mu$ disappearance channel: it is expected that this kind of facility will reduce the error on the atmospheric mass difference to less than 10% in a few years for $\Delta m^2_{23} \simeq 2.2 \times 10^{-3}$ eV$^2$ [55]. The expected error on the atmospheric angle depends on the value of $\theta_{23}$ itself, the smallest error being obtained for large but non-maximal mixing (as has been shown in Ref. [56]). It is interesting to study in detail the parameter correlations and degeneracies that affect this measurement and that can induce large errors. We recall the vacuum oscillation probability expanded to second order in the small parameters $\theta_{13}$ and $(\Delta_{12} L/E)$ [57]:
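$$
\begin{aligned}
P(\nu_\mu \to \nu_\mu) = 1 &- \left[ \sin^2 2\theta_{23} - s^2_{23} \sin^2 2\theta_{13} \cos 2\theta_{23} \right] \sin^2\left(\frac{\Delta_{23} L}{2}\right) \\
&- \left(\frac{\Delta_{12} L}{2}\right) \left[ s^2_{12} \sin^2 2\theta_{23} + \tilde J s^2_{23} \cos\delta \right] \sin(\Delta_{23} L) \\
&- \left(\frac{\Delta_{12} L}{2}\right)^2 \left[ c^4_{23} \sin^2 2\theta_{12} + s^2_{12} \sin^2 2\theta_{23} \cos(\Delta_{23} L) \right],
\end{aligned}
\tag{1}
$$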
where $\tilde J = \cos\theta_{13} \sin 2\theta_{12} \sin 2\theta_{13} \sin 2\theta_{23}$ and $\Delta_{23} = \Delta m^2_{23}/2E$, $\Delta_{12} = \Delta m^2_{12}/2E$. The first term in the first parenthesis is the dominant one and is symmetric under $\theta_{23} \to \pi/2 - \theta_{23}$. This is indeed the source of our present ignorance of the $\theta_{23}$-octant [13], parametrized by $s_{oct} = {\rm sign}(\tan 2\theta_{23})$. This symmetry is lifted by the other terms, which also introduce a mild CP-conserving $\delta$-dependence, albeit through subleading effects that are very difficult to isolate.
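The octant symmetry of the leading term, and its lifting by $\theta_{13}$ and the solar splitting, can be checked numerically. The following is a minimal sketch and not the authors' code: it evaluates the exact three-family vacuum probability (rather than the second-order expansion) in the standard parametrization of the mixing matrix, which is assumed here. The function names and argument layout are hypothetical; the default solar parameters are the best-fit values quoted in the text.

```python
import cmath
import math


def pmns(theta12, theta13, theta23, delta):
    """Mixing matrix in the standard parametrization (assumed); angles in radians."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    ep = cmath.exp(1j * delta)    # e^{+i delta}
    em = cmath.exp(-1j * delta)   # e^{-i delta}
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ]


def vacuum_probability(alpha, beta, L_km, E_gev,
                       theta12_deg=33.0, theta13_deg=0.0,
                       theta23_deg=45.0, delta_deg=0.0,
                       dm2_12=8.2e-5, dm2_23=2.5e-3):
    """Exact vacuum P(nu_alpha -> nu_beta); flavour indices 0=e, 1=mu, 2=tau.

    Mass splittings are in eV^2, with dm2_12 = m2^2 - m1^2 and
    dm2_23 = m3^2 - m2^2 (the convention adopted in the text).
    """
    U = pmns(*(math.radians(x) for x in
               (theta12_deg, theta13_deg, theta23_deg, delta_deg)))
    masses2 = [0.0, dm2_12, dm2_12 + dm2_23]
    # Phase m_i^2 L / (2E) = 2 * 1.267 * m_i^2[eV^2] * L[km] / E[GeV]
    amplitude = sum(
        U[alpha][i].conjugate() * U[beta][i]
        * cmath.exp(-2j * 1.267 * masses2[i] * L_km / E_gev)
        for i in range(3)
    )
    return abs(amplitude) ** 2


if __name__ == "__main__":
    # Leading term only (theta13 = 0, no solar splitting): octant-symmetric.
    for theta23 in (41.5, 48.5):
        print(theta23, vacuum_probability(1, 1, 295.0, 0.6,
                                          theta23_deg=theta23, dm2_12=0.0))
    # Switching on theta13 and the solar splitting lifts the symmetry.
    for theta23 in (41.5, 48.5):
        print(theta23, vacuum_probability(1, 1, 295.0, 0.6,
                                          theta23_deg=theta23, theta13_deg=8.0))
```

With $\theta_{13} = 0$ and the solar splitting switched off, the two octant-mirrored angles give the same disappearance probability; with $\theta_{13} = 8^\circ$ they no longer do.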
Considering that the sign of the atmospheric mass difference, $s_{atm} = {\rm sign}(\Delta m^2_{23})$, is unknown at present, for any experimental input $(\bar\theta_{23}, \Delta m^2_{atm})$ we must solve two systems of equations:
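$$
N^{\pm}_{\mu\mu}\left(\bar\theta_{23}, \Delta m^2_{atm}; \bar s_{atm}\right) = N^{\pm}_{\mu\mu}\left(\theta_{23}, |\Delta m^2_{23}|; s_{atm} = \bar s_{atm}\right), \tag{2}
$$
$$
N^{\pm}_{\mu\mu}\left(\bar\theta_{23}, \Delta m^2_{atm}; \bar s_{atm}\right) = N^{\pm}_{\mu\mu}\left(\theta_{23}, |\Delta m^2_{23}|; s_{atm} = -\bar s_{atm}\right), \tag{3}
$$
where $\bar s_{atm}$ is the physical mass hierarchy.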
For non-maximal $\bar\theta_{23}$ four different solutions are found: for $|\Delta m^2_{23}| \simeq \Delta m^2_{atm}$ we get two solutions from Eq. (2), i.e. the input value $\theta_{23} = \bar\theta_{23}$ and $\theta_{23} \simeq \pi/2 - \bar\theta_{23}$, the second solution being not exactly at $\theta_{23} = \pi/2 - \bar\theta_{23}$ due to the small $\theta_{23}$-octant asymmetry; and two more solutions from Eq. (3) at a different value of $|\Delta m^2_{23}|$ [20]. In Eq. (1) we can see that, changing the sign of $\Delta m^2_{23}$, the second term becomes positive: a change that must be compensated with an
$\theta_{12} = 33^\circ$. For maximal atmospheric mixing, $\bar\theta_{23} = 45^\circ$, Fig. 4 (left), two solutions are found at 90% CL when both choices of $s_{atm}$ are considered. On the other hand, using a non-maximal input atmospheric angle $\bar\theta_{23} = 41.5^\circ$ ($\sin^2 \bar\theta_{23} = 0.44$) [6], four degenerate solutions are found, Fig. 4 (right). In general, we must therefore speak of a two-fold or four-fold degeneracy in the disappearance channel, as was pointed out in Ref. [20].
Notice how the disappearance sign clones appear at a value of $|\Delta m^2_{23}|$ higher than the input value, as expected from Eq. (1). The shift in the vertical axis is a function of $\theta_{13}$ and $\delta$, which in this particular case have been kept fixed to $\theta_{13} = 0^\circ = \delta$. The degeneracy can be softened or solved by using detectors with large baselines that can exploit matter effects, as will be shown in Section 4. Notice, finally, that the uncertainty on the atmospheric parameters can be enhanced once we take into account that $\theta_{13}$ and $\delta$ are unknown [56]. This will be studied in Section 3.4.
3.2. A matter of conventions
It is useful to open here a short parenthesis to address a problem that arose recently concerning the physical meaning of the variable used to fit the atmospheric mass difference, $\Delta m^2_{atm}$. Notice, first of all, that the experimentally measured solar mass difference $\Delta m^2_{sol}$ can be unambiguously identified with the three-family parameter $\Delta m^2_{12} = m_2^2 - m_1^2$. This is not true for the experimentally measured atmospheric mass difference $\Delta m^2_{atm}$. Since the subleading solar effects are, at present, barely seen in atmospheric neutrino experiments, we can define in different ways the three-family parameter to be used in the fits: for example, using $\Delta m^2_{23} = m_3^2 - m_2^2$ (the choice adopted throughout this paper), or $\Delta m^2_{13} = m_3^2 - m_1^2$, or even $\Delta m^2 = (\Delta m^2_{23} + \Delta m^2_{13})/2$ [58], we will in general get the same results using present data. When future experiments aiming to measure the atmospheric mass difference at the level of $10^{-4}$ eV$^2$ are running, however, different choices of the fitting parameter will give different results.
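The size of the mismatch between the three conventions follows from simple arithmetic. The sketch below (the function name is hypothetical) uses the input value $\Delta m^2_{23} = 2.5 \times 10^{-3}$ eV$^2$ and the solar best-fit value $8.2 \times 10^{-5}$ eV$^2$ quoted in this paper, and shows that the three candidate variables differ by $O(\Delta m^2_{12})$, i.e. by an amount comparable to the $10^{-4}$ eV$^2$ precision goal.

```python
def atmospheric_splitting_conventions(dm2_23, dm2_12):
    """Return the three candidate 'atmospheric' variables in eV^2.

    dm2_23 = m3^2 - m2^2 and dm2_12 = m2^2 - m1^2; the identity
    dm2_13 = dm2_23 + dm2_12 holds for either sign of dm2_23.
    """
    dm2_13 = dm2_23 + dm2_12                 # m3^2 - m1^2
    dm2_avg = 0.5 * (dm2_23 + dm2_13)        # average variable of Ref. [58]
    return {"dm2_23": dm2_23, "dm2_13": dm2_13, "dm2_avg": dm2_avg}


if __name__ == "__main__":
    values = atmospheric_splitting_conventions(dm2_23=2.5e-3, dm2_12=8.2e-5)
    for name, value in values.items():
        print(f"{name} = {value:.4e} eV^2")
    # The spread between conventions is of the order of the solar splitting.
    print(f"spread = {values['dm2_13'] - values['dm2_23']:.1e} eV^2")
```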
This can be observed in Fig. 5, where the three choices introduced above are compared. We plot in the three panels the 90% CL contours resulting from a fit to the experimental data corresponding to the input value $\Delta m^2_{atm} = 2.5 \times 10^{-3}$ eV$^2$, in normal hierarchy, but fitted in turn in $\Delta m^2_{23}$ (left panel), $\Delta m^2_{13}$ (middle panel) and $\Delta m^2$ (right panel). As can be seen in the figure, the contour corresponding to the normal hierarchy, $s_{atm} = \bar s_{atm}$, is always located around the input value (in each plot the input value corresponds to a different fitting variable, though). On the other hand, the contour obtained for the inverted hierarchy is located above, below and on top of the input value, respectively, depending on the fitting variable. This is a clear consequence of the fact that the difference between each of the possible choices is $O(\Delta m^2_{12})$, and it is reflected in a different form of Eq. (1).
Fig. 5. Different choices of the three-family atmospheric mass difference; left: $\Delta m^2_{23}$; middle: $\Delta m^2_{13}$; right: $\Delta m^2$ [58].
Up to here, it is perfectly clear what happens whenever we use a certain three-family variable to fit the results that at present are given under the label $\Delta m^2_{atm}$. A philosophical discussion, however, arose around the physical meaning of the different choices reported above. The idea is that the physically meaningful quantities to be measured in oscillation experiments are the oscillation frequencies. For three-family mixing, three frequencies can be defined, the shortest being the solar oscillation frequency (as we said, unambiguously related to the mass difference $\Delta m^2_{12}$). In normal hierarchy, the middle frequency is related to $\Delta m^2_{23}$ and the longest one to $\Delta m^2_{13}$. In inverted hierarchy these two frequencies are interchanged and the middle frequency will be related to $\Delta m^2_{31}$ and not to $\Delta m^2_{32}$. For this reason, it has been suggested that plots in different hierarchies should be presented using different variables, so as to maintain the ordering of the oscillation frequencies. For example, if we choose to identify $\Delta m^2_{atm}$ with $\Delta m^2_{23}$ in normal hierarchy (i.e., with the middle frequency), we should identify it with $\Delta m^2_{31}$ in inverted hierarchy (i.e., again with the middle frequency). As a consequence, we could not plot contours for the two hierarchies on the same figure (the vertical axis corresponds to different variables depending on the choice of $s_{atm}$). This is a drawback of giving the oscillation frequencies a deeper physical meaning than the mass differences. A new variable to be identified with $\Delta m^2_{atm}$ has been introduced in Ref. [58]: $\Delta m^2 = (\Delta m^2_{23} + \Delta m^2_{13})/2$. When changing hierarchy, this variable just flips its sign (since $\Delta m^2_{23} \to -\Delta m^2_{13}$ and vice versa). It is thus possible to present both hierarchies in the same figure with $|\Delta m^2|$ on the vertical axis. The oscillation frequency related to this variable is neither the longest nor the middle one, but it maintains its role of next-to-longest or a-bit-longer-than-the-middle-one in both hierarchies.$^4$
It seems to us that the physical meaning of a frequency is not deeper than that of a mass difference, and that it is perfectly acceptable to use any variable to fit the experimental data when considering a full three-family analysis. What is really important is to be consistent with the adopted choice, in particular when adding appearance and disappearance data, something that we will do in Section 3.4. In the rest of the paper we adopt $\Delta m^2_{23}$ as the fitting variable.
3.3. The importance of energy resolution
It is extremely important that an experiment whose goal is to improve significantly the present
uncertainties on the atmospheric parameters may be able to use energy dependence. A counting
experiment is certainly limited, as it has been shown in Refs. [20,59].
In Fig. 6 we present a comparison of the disappearance channel at T2K-phase I (left panels) and NO$\nu$A (right panels). Both maximal mixing, $\theta_{23} = 45^\circ$ (top panels), and non-maximal mixing, $\theta_{23} = 41.5^\circ$ (bottom panels), are used as input values. In each plot we present the 90% CL contours in the $(\theta_{23}, \Delta m^2_{23})$ plane for different energy bins (only the bins corresponding to neutrinos with energy just below and above the peak energy are reported for T2K, all bins for NO$\nu$A). Again, solar parameters are kept fixed to their present best fit values, $\Delta m^2_{12} = 8.2 \times 10^{-5}$ eV$^2$, $\theta_{12} = 33^\circ$, and the unknowns are fixed to $\theta_{13} = \delta = 0^\circ$. Both for T2K-phase I and NO$\nu$A, 5 years of $\pi^+$ running are considered. Notice that the contours corresponding to neutrinos with an energy below and above the oscillation peak have a different shape. As a consequence, the combination of different bins significantly increases the $\theta_{23}$ resolution of the experiment with respect to a counting experiment. It can be seen that T2K-I is able to measure $\Delta m^2_{23}$ with a precision better than $10^{-4}$ eV$^2$ for $\Delta m^2_{23} = 2.5 \times 10^{-3}$ eV$^2$. However, it is not able to exclude maximal mixing at 90% CL for $\theta_{23} = 41.5^\circ$. The same happens for NO$\nu$A, see Fig. 6 (right).
Fig. 6. Binning at T2K-I (left) and NO$\nu$A (right); top: $\theta_{23} = 45^\circ$; bottom: $\theta_{23} = 41.5^\circ$.
4 It should be stressed that even fitting in inverted hierarchy using $\Delta m^2$ as the fitting variable, the allowed region is shifted with respect to that corresponding to the normal hierarchy. This can be observed in Fig. 5 (right), and it implies that $|\Delta m^2|_{NH} \neq |\Delta m^2|_{IH}$.
In Fig. 7 we present a comparison of the disappearance channel at the SPL using the neutrino fluxes of Ref. [43], Fig. 2 (left panels), or the neutrino fluxes of Ref. [44], Fig. 2 (right panels), again for both maximal mixing, $\theta_{23} = 45^\circ$ (top panels), and non-maximal mixing, $\theta_{23} = 41.5^\circ$ (bottom panels). The 90% CL contours corresponding to both SPL energy bins are drawn. In this case, both neutrino and antineutrino fluxes are produced, by $\pi^+$ and $\pi^-$ decays, with 2 and 8 years of data taking, respectively. Notice, however, that since the neutrino and antineutrino average energies are extremely similar in this setup, the neutrino and antineutrino contours are almost superimposed. As a consequence, experimental information from the neutrino and the antineutrino fluxes is not complementary and we observe just an increase in the statistics. For this reason, although in this setup a 1 Mton detector is considered, the resolution in $\theta_{23}$ is not astonishing. A big improvement with respect to the results presented for this facility in Refs. [15,20] comes from the spectral information: in the new analysis two energy bins are considered for both setups. Thanks to this, maximal mixing can be excluded at 90% CL for $\theta_{23} = 41.5^\circ$. As a final comment, notice that the new neutrino fluxes of Ref. [44] do not improve the $\theta_{23}$ resolution with respect to the old fluxes from Ref. [43]. This is because the new fluxes have been optimized to look for the $\nu_\mu \to \nu_e$ appearance signal and not for the $\nu_\mu$ disappearance one. In particular, the average neutrino and antineutrino energies are identical (see Fig. 2) and thus the small complementarity of the two fluxes is reduced. In the rest of the paper we will consider the SPL with the old fluxes only.
Fig. 7. Binning at the standard (left) and optimized SPL (right); top: $\theta_{23} = 45^\circ$; bottom: $\theta_{23} = 41.5^\circ$.
It is remarkable how a relatively small experiment such as T2K-I, with only 5 years of data taking, achieves a resolution in $\Delta m^2_{23}$ and $\theta_{23}$ not much worse than that of the SPL facility, with a much larger 1 Mton water Cerenkov and 10 years of data taking in both pion polarities (compare the left panels of Figs. 6 and 7). One of the reasons is that the SPL has both neutrino and antineutrino beams with an average energy corresponding to the oscillation peak for the L = 130 km baseline. As a consequence, information coming from the two beams just adds statistically but is not complementary (in the absence of matter effects). It would be better to run the SPL with neutrinos again, but at a different energy (not at the oscillation peak), after the first two years with $\pi^+$.
3.4. The impact of $\theta_{13}$ and $\delta$
Up to this moment we have kept the unknown parameters $\theta_{13}$ and $\delta$ as external fixed quantities, $\theta_{13} = \delta = 0^\circ$, following what we have done for the solar parameters $\Delta m^2_{12}$ and $\theta_{12}$. However, the two sets of parameters should be treated differently. Indeed, we do have a good measurement of the solar parameters, and it has been shown in the literature that the impact of solar parameter uncertainties on the measurement of atmospheric parameters is negligible [15]. The main effect is the shift in the atmospheric mass difference fitting variable, as discussed in Section 3.2. This is not the case for the unknown parameters $(\theta_{13}, \delta)$. Since both parameters are unknown, we must fit the atmospheric parameters introducing them as free variables to be reconstructed at the same time as $(\theta_{23}, \Delta m^2_{23})$. A recent comprehensive three-family analysis of present solar, atmospheric and reactor data can be found, for example, in Ref. [6].
In Fig. 8 we show the effect of a varying $\theta_{13}$ (but fixed $\delta$) on the previous fits at T2K-I for $\theta_{23} = 45^\circ$ (left panel) and $\theta_{23} = 41.5^\circ$ (right panel) and three values of the atmospheric mass difference, $\Delta m^2_{23} = (2.2, 2.5, 2.8) \times 10^{-3}$ eV$^2$. Both choices of $s_{atm}$ are shown on the same plot. The input values of the unknowns are $\theta_{13} = 0^\circ$, $\delta = 0^\circ$. In the fit, $\theta_{13}$ is free to vary in the range $\theta_{13} \in [0^\circ, 10^\circ]$. Notice that the input values can be fitted with increasing values of $\theta_{13}$ if $\theta_{23}$ is also increased, resulting in a shift of the 90% CL contours to the right. This is a consequence of the $\theta_{23}$-asymmetric second term in the first parenthesis of Eq. (1).
Such a naive treatment of the experimental data is, however, not correct. The considered super-beam facilities have indeed been proposed to look for the $\nu_\mu \to \nu_e$ appearance channel, for which the oscillation probability in vacuum, expanded to second order in the small parameters $\theta_{13}$ and $(\Delta_{12} L/E)$ [10,57], is:
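$$
\begin{aligned}
P^{\pm}(\nu_\mu \to \nu_e) = {} & s^2_{23} \sin^2(2\theta_{13}) \sin^2\left(\frac{\Delta_{23} L}{2}\right) \\
& + \sin(2\theta_{12}) \sin(2\theta_{23}) \sin(2\theta_{13}) \cos\left(\pm\delta + \frac{\Delta_{23} L}{2}\right) \sin\left(\frac{\Delta_{12} L}{2}\right) \sin\left(\frac{\Delta_{23} L}{2}\right) \\
& + c^2_{23} \sin^2(2\theta_{12}) \sin^2\left(\frac{\Delta_{12} L}{2}\right),
\end{aligned}
\tag{4}
$$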
where $\pm$ refers to neutrinos and antineutrinos, respectively. It is clear that, to properly take into account the effect of $(\theta_{13}, \delta)$, we should combine (whenever possible) appearance and disappearance signals at a given facility.
The results of a four-parameter fit in $(\theta_{23}, \Delta m^2_{23}, \theta_{13}, \delta)$ projected onto the $(\theta_{23}, \Delta m^2_{23})$ plane, obtained by combining the disappearance and appearance signals at the T2K-I, NO$\nu$A and SPL facilities, are presented in Figs. 9-11, respectively. In all figures two choices of the input values $\bar\theta_{23}$ and $\bar\theta_{13}$ are shown, $\bar\theta_{23} = 45^\circ, 41.5^\circ$ (left and right panels) and $\bar\theta_{13} = 0^\circ, 8^\circ$ (top and bottom panels), whereas $\bar\delta = 0^\circ$. Solid lines represent the result of a fit with variable $\theta_{13}$ and $\delta$. As a reference, we also present the results of a fit with $\theta_{13} = \bar\theta_{13}$ and $\delta = \bar\delta$ (dotted lines). It can be observed that in most cases the combination of disappearance with appearance signals reduces the spread in $\theta_{23}$ that was observed in Fig. 8. Only for large $\theta_{13}$ can we still see some effect.
Finally, the effect of a non-vanishing $\bar\delta$ can be easily understood: the CP-conserving $\delta$-dependent term in the second row of Eq. (1) is positive for $|\bar\delta| \in [0^\circ, 90^\circ]$, negative for $|\bar\delta|$
CL at all three facilities for $\bar\theta_{23} = 41.5^\circ$ (only the SPL, with its gigantic 1 Mton water Cerenkov, is able to exclude maximal mixing for $\bar\theta_{23} = 41.5^\circ$ for a vanishing $\bar\theta_{13}$). Notice, however, that while the disappearance octant degeneracy appears to be solved when combining disappearance and appearance signals for $\bar\theta_{13} = 8^\circ$ when $\theta_{13}$ is treated as a fixed parameter, it is indeed restored if $\theta_{13}$ is left to vary freely in the presently allowed range.
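Fig. 9. Appearance and disappearance at T2K-I. Left: $\theta_{23} = 45^\circ$; right: $\theta_{23} = 41.5^\circ$; top: $\theta_{13} = 0^\circ$; bottom: $\theta_{13} = 8^\circ$.
Fig. 10. Appearance and disappearance at NO$\nu$A. Left: $\theta_{23} = 45^\circ$; right: $\theta_{23} = 41.5^\circ$; top: $\theta_{13} = 0^\circ$; bottom: $\theta_{13} = 8^\circ$.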
be carefully studied and it will not be discussed here. We will devote this section to a detailed
study of the combination of golden, silver and disappearance channels at the neutrino factory.
Two possible baselines are studied, L = 3000 km and L = 7000 km, i.e. the optimal distance
to look for a CP-violating signal [10] and the optimal distance to exploit matter effects through
the disappearance channel, respectively.
In Figs. 12 and 13 we first present the 90% contours of the on-peak and above-peak disappearance channel energy bins for two values of $\theta_{23}$, $\theta_{23} = 41.5^\circ, 45^\circ$ (left and right panels, respectively), and two values of $\theta_{13}$, $\theta_{13} = 0^\circ, 8^\circ$ (top and bottom panels, respectively). The medium baseline results are shown in Fig. 12, the long baseline results in Fig. 13. Both neutrino and antineutrino bins are presented. Again, solar parameters are kept fixed to their present best fit values, $\Delta m^2_{12} = 8.2 \times 10^{-5}$ eV$^2$, $\theta_{12} = 33^\circ$. The input value for $\delta$ is $\delta = 0^\circ$.
Fig. 11. Appearance and disappearance at the standard SPL. Left: $\theta_{23} = 45^\circ$; right: $\theta_{23} = 41.5^\circ$; top: $\theta_{13} = 0^\circ$; bottom: $\theta_{13} = 8^\circ$.
Notice, first of all, that the resolution in $\theta_{23}$ is extremely good and that maximal mixing can be easily excluded for $\theta_{23} = 41.5^\circ$. In particular, at the L = 7000 km baseline a 4% error on $\theta_{23}$ for $\theta_{23} = 41.5^\circ$ is found. This was expected, since the statistics is much higher than at the super-beam experiments studied in Section 3 (see Tables 1-5). Moreover, a new feature arises for $\theta_{13} = 8^\circ$ at both baselines. When a rather large non-vanishing $\theta_{13}$ is switched on, matter effects become extremely important and introduce a strong $\theta_{23}$-asymmetry in Eq. (1). The asymmetry can be clearly seen in the bottom panels of Figs. 12 and 13, and it is crucial in solving the disappearance octant degeneracy, see Section 3.1.
We point out that the on-peak bin shows a circular shape centered around the input value, distinct from the upward(downward)-curved shapes of the contours corresponding to the above(below)-peak bins of Figs. 6 and 7. This is because the neutrino factory flux is not centered around the peak energy for the chosen baselines: the peak energy for a L = 3000 km baseline would be $E_\nu \simeq 6$ GeV and for a L = 7000 km baseline $E_\nu \simeq 14$ GeV. As a consequence, we have no energy bins below the peak energy, but only on-peak or above-peak bins. This is a major flaw of the present neutrino factory design, which could perhaps be solved with an improved detector capable of taking advantage of the low energy bins.
Fig. 12. Binning at the L = 3000 km neutrino factory. Left: $\theta_{23} = 45^\circ$; right: $\theta_{23} = 41.5^\circ$; top: $\theta_{13} = 0^\circ$; bottom: $\theta_{13} = 8^\circ$.
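The quoted peak energies can be cross-checked with the vacuum condition for the first oscillation maximum, $1.267\, \Delta m^2_{23} L / E = \pi/2$ (with $\Delta m^2$ in eV$^2$, $L$ in km and $E$ in GeV). The sketch below assumes vacuum oscillations and the input value $\Delta m^2_{23} = 2.5 \times 10^{-3}$ eV$^2$ used in this paper; the function name is hypothetical.

```python
import math


def first_peak_energy_gev(baseline_km, dm2_ev2=2.5e-3):
    """Energy (GeV) of the first vacuum oscillation maximum at a given baseline.

    Solves 1.267 * dm2 * L / E = pi / 2 for E; matter effects are neglected.
    """
    return 1.267 * dm2_ev2 * baseline_km / (math.pi / 2.0)


if __name__ == "__main__":
    for baseline in (295.0, 3000.0, 7000.0):
        print(f"L = {baseline:6.0f} km -> E_peak = {first_peak_energy_gev(baseline):.2f} GeV")
```

For the two neutrino factory baselines this reproduces the values of about 6 GeV and 14 GeV given in the text.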
The solving of the octant-degeneracy in the disappearance channel (for large $\theta_{13}$) is of great importance. It is useful to recall that, when looking for a CP-violating signal in appearance channels such as $\nu_e \to \nu_\mu$, $\nu_\mu \to \nu_e$ and $\nu_e \to \nu_\tau$, we must deal in general with an eightfold-degeneracy [14] that originates from three sources: the discrete ambiguities parametrized by $s_{atm}$ [12] and $s_{oct}$ [13] and the intrinsic ambiguity due to the trigonometric nature of $P_{e\mu}$, $P_{\mu e}$ and $P_{e\tau}$ in the two unknowns $\theta_{13}$ and $\delta$ [11] (see Eq. (4) as an example). As a consequence, for generic values of the input pair $(\bar\theta_{13}, \bar\delta)$, we get eight different solutions in the $(\theta_{13}, \delta)$ plane. Different ways to solve the eightfold ambiguity have been proposed in the literature, such as combining experiments with different L/E [11], or the golden channel at the neutrino factory with $\nu_\mu \to \nu_e$ at a super-beam [33], or the golden and silver channels at the neutrino factory [30,51], or even combining different super-beams [60,61] or a super-beam and a $\beta$-beam [59,62] (see also Refs. [63-67]). In Ref. [68] it was shown that the combination of the neutrino factory with an SPL-like super-beam was able to solve the eightfold-degeneracy and to give a single allowed region in the $(\theta_{13}, \delta)$ plane at the price of three specialized detectors, a 40 kton magnetized iron calorimeter and a 4 kton emulsion cloud chamber (to look for $\nu_e \to \nu_\mu$ and $\nu_e \to \nu_\tau$ oscillations, respectively) and
Fig. 13. Binning at the L = 7000 km neutrino factory. Left: $\theta_{23} = 45^\circ$; right: $\theta_{23} = 41.5^\circ$; top: $\theta_{13} = 0^\circ$; bottom: $\theta_{13} = 8^\circ$.
factory combining both golden and silver appearance signal with the disappearance signal,
Fig. 14. Appearance and disappearance in the $(\theta_{23}, \Delta m^2_{23})$ plane at the L = 3000 km neutrino factory. Left: $\theta_{23} = 45^\circ$; right: $\theta_{23} = 41.5^\circ$; top: $\theta_{13} = 0^\circ$; bottom: $\theta_{13} = 8^\circ$.
and L = 7000 km will be analysed in Section 5.2. All the contours have been computed adding
the appearance and disappearance channels available at a given facility and, if not differently
stated, all the parameters not involved in the fits are fixed to their best fit values as quoted in the
introduction.
5.1. Sensitivities at T2K-I and NO$\nu$A
In Fig. 17 we compare the $\theta_{13}$-sensitivity (left plot) and the sensitivity to maximal $\theta_{23}$ (right plot) at T2K-I (solid lines) and NO$\nu$A (dashed lines).
Fig. 15. Appearance and disappearance in the $(\theta_{23}, \Delta m^2_{23})$ plane at the L = 7000 km neutrino factory. Left: $\theta_{23} = 45^\circ$; right: $\theta_{23} = 41.5^\circ$; top: $\theta_{13} = 0^\circ$; bottom: $\theta_{13} = 8^\circ$.
$\theta_{13}$-sensitivity
The $\theta_{13}$-sensitivity is defined as the one-parameter $3\sigma$ excluded region as a function of $\delta$ in case of absence of signal. The contours presented in the following figures represent the excluded values of $\theta_{13}$ for a given facility taking into account all possible choices of $s_{atm}$ and $s_{oct}$. For both facilities the $\theta_{13}$-sensitivity is rather poor, with a typical excluded region ranging over $[\sin^2 \theta_{13}]_{min} = [2, 8] \times 10^{-3}$. Sensitivity is slightly better for negative than for positive $\delta$, with a maximal sensitivity at $\delta = -90^\circ$ ($-135^\circ$) for T2K (NO$\nu$A). We have noticed that, while the T2K-I sensitivity is basically unaffected by the $s_{atm}$, $s_{oct}$ choice, the NO$\nu$A sensitivity is appreciably diminished for $\delta \in [-90^\circ, 90^\circ]$ when choosing a wrong value for the sign of $\Delta m^2_{23}$ or the $\theta_{23}$ octant.
Fig. 16. Appearance and disappearance in the $(\theta_{13}, \delta)$ plane at the neutrino factory. Left: L = 3000 km baseline; right: L = 7000 km baseline; top: $\delta = 45^\circ$; bottom: $\delta = 90^\circ$. The mixing angle takes the following values: $\theta_{13} = 2^\circ$, $5^\circ$ and $8^\circ$.
To increase both the $\theta_{13}$-sensitivity and the CP-discovery potential it has been proposed to add a proton driver to the NO$\nu$A design, so as to increase the neutrino flux by a factor of four to six [26,40].
Sensitivity to maximal $\theta_{23}$
The potential to exclude maximal $\theta_{23}$ has been computed in the following way: for a given $\Delta m^2_{23}$, we look for the largest value of $\theta_{23}$ for which the two-parameter $3\sigma$ contours do not touch $\theta_{23} = 45^\circ$. We have considered both octants of $\theta_{23}$ and we have found an approximately symmetric behavior of the sensitivity; we therefore display our results in the $\sin^2 \theta_{23}$ variable up to the maximal mixing value of 0.5.
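The scan described above can be illustrated with a toy model. The sketch below is not the analysis used in the paper: the $\chi^2$ is a hypothetical Gaussian measurement of $\sin^2 2\theta_{23}$ with an assumed error `sigma`, whereas the actual fits use binned event rates and include all degeneracies. Only the scan logic and the $3\sigma$ threshold for two fitted parameters, $\Delta\chi^2 = 11.83$, are meant to carry over.

```python
import math

# Delta chi^2 corresponding to 3 sigma (99.73% CL) for two fitted parameters.
DELTA_CHI2_3SIGMA_2DOF = 11.83


def toy_chi2(theta23_test_deg, theta23_true_deg, sigma=0.01):
    """Hypothetical chi^2: Gaussian measurement of sin^2(2 theta23) with error sigma."""
    def observable(theta_deg):
        return math.sin(math.radians(2.0 * theta_deg)) ** 2
    return ((observable(theta23_test_deg) - observable(theta23_true_deg)) / sigma) ** 2


def largest_theta23_excluding_maximal(chi2=toy_chi2, step_deg=0.1):
    """Scan the first octant downward from 45 degrees.

    Returns the largest input theta23 (degrees) for which theta23 = 45 degrees
    lies outside the 3 sigma two-parameter contour, or None if none is found.
    """
    n_steps = int(round(45.0 / step_deg))
    for k in range(n_steps, -1, -1):
        theta_true = k * step_deg
        if chi2(45.0, theta_true) > DELTA_CHI2_3SIGMA_2DOF:
            return theta_true
    return None


if __name__ == "__main__":
    theta = largest_theta23_excluding_maximal()
    print(f"toy result: theta23 = {theta:.1f} deg, "
          f"sin^2(theta23) = {math.sin(math.radians(theta)) ** 2:.3f}")
```

In a real analysis `toy_chi2` would be replaced by the full fit at fixed $\Delta m^2_{23}$, and the scan repeated for each value of the mass difference and for the second octant.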
Fig. 17. Left: $3\sigma$ $\theta_{13}$-sensitivity; right: $3\sigma$ sensitivity to maximal $\theta_{23}$. Solid lines refer to T2K-I; dashed lines to NO$\nu$A.
Fig. 18. Left: $3\sigma$ $\theta_{13}$-sensitivity; right: $3\sigma$ CP-discovery potential. Solid lines refer to the L = 3000 km neutrino factory; dashed lines to the L = 7000 km neutrino factory; dotted lines to T2K-II; dot-dashed lines to the (standard) SPL.
As we can see in Fig. 17 (right), the sensitivity for the two experiments is essentially the same and strongly decreases for low values of $\Delta m^2_{23}$. At the best fit point $\Delta m^2_{23} = 2.5 \times 10^{-3}$ eV$^2$, deviations of $\sin^2 \theta_{23}$ from maximal mixing as small as 14% could be established at both facilities. Notice that these curves have been computed for a fixed $\theta_{13} = 0^\circ$ (and $s_{atm} = +1$), so that matter effects in the disappearance probabilities and those from the appearance channel $\nu_\mu \to \nu_e$ are completely negligible. We have also checked that for $\theta_{13}$ close to the current bound none of the shown results changes drastically, and that everything is basically independent of the mass hierarchy.
5.2. Sensitivities at T2K-II, SPL and neutrino factories
$\theta_{13}$-sensitivity and CP-discovery potential
In Fig. 18 (left) we present the $\theta_{13}$-sensitivity (computed as explained in the previous section) for T2K-II (dotted line), the SPL (dot-dashed line), the neutrino factory at L = 3000 km (solid line) and at L = 7000 km (dashed line). As we can see, the NF at 3000 km shows the best sensitivity to $\theta_{13}$ in the whole range of $\delta$ (with the best value of $\sin^2 \theta_{13} \simeq 2 \times 10^{-5}$ for $\delta$ around $30^\circ$), except for a small region around $\delta = 90^\circ$ where T2K-II assures a slightly better performance (at the level of $8 \times 10^{-5}$). For the neutrino factory at L = 7000 km, the loss in sensitivity is to be ascribed to the sign degeneracy. For example, we have found that for $s_{atm} = \bar s_{atm}$ $\theta_{13}$ is excluded at $3\sigma$ down to $\sin^2 \theta_{13} \simeq 7 \times 10^{-5}$ ($\theta_{13} \simeq 0.5^\circ$) if $\delta = 90^\circ$. On the other hand, for the same value of $\delta$ (but for a wrong choice of $s_{atm}$) we can only exclude $\theta_{13}$ down to $\sin^2 \theta_{13} \simeq 4 \times 10^{-4}$ ($\theta_{13} \simeq 1.1^\circ$). As for the SPL $\theta_{13}$-sensitivity, the $3\sigma$ excluded region varies in the range $\sin^2 \theta_{13} \in [4 \times 10^{-4}, 1.5 \times 10^{-3}]$ for different values of $\delta$, with no big differences coming from different choices of $s_{atm}$ and $s_{oct}$. T2K-II is significantly better than the SPL but significantly worse than the NF at 3000 km, apart from $\delta = 90^\circ$ (where it slightly improves on the NF limit).
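The sensitivities are quoted both in $\sin^2 \theta_{13}$ and in degrees. A small conversion helper makes the correspondence explicit; the function names are hypothetical and the inputs are the values quoted in the text.

```python
import math


def sin2_from_degrees(theta_deg):
    """sin^2(theta) for an angle given in degrees."""
    return math.sin(math.radians(theta_deg)) ** 2


def degrees_from_sin2(sin2_theta):
    """Angle in degrees (first quadrant) for a given sin^2(theta)."""
    return math.degrees(math.asin(math.sqrt(sin2_theta)))


if __name__ == "__main__":
    # Exclusion limits quoted for the L = 7000 km neutrino factory.
    for sin2 in (7e-5, 4e-4):
        print(f"sin^2(theta13) = {sin2:.0e} -> theta13 = {degrees_from_sin2(sin2):.2f} deg")
    # Non-maximal atmospheric input used throughout the paper.
    print(f"theta23 = 41.5 deg -> sin^2(theta23) = {sin2_from_degrees(41.5):.3f}")
```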
The CP-violation discovery potential has been computed as in Refs. [15,20]: at a fixed $\bar\theta_{13}$, we look for the smallest (largest) value of $|\bar\delta|$ for which the two-parameter $3\sigma$ contours of any of the degenerate solutions do not touch $\delta = 0^\circ$ nor $|\delta| = 180^\circ$. Notice that, although the input $\bar\theta_{13}$ value is fixed, the degeneracies can also touch $\delta = 0^\circ, 180^\circ$ at $\theta_{13} \neq \bar\theta_{13}$. The outcome of this procedure is finally plotted, representing the region in the $(\theta_{13}, \delta)$ parameter space for which a CP-violating signal is observed at $3\sigma$. For each facility we show one single curve obtained taking into account the impact of all the degeneracies.
First of all, notice that no CP-discovery potential has been considered for the NF at 7000 km: due to matter effects and the choice of the baseline (close to the magic baseline, see Ref. [32]), the sensitivity to $\delta$ vanishes. The best sensitivity to the CP-violating phase is reached by the T2K-II experiment: a CP-violating signal can be observed at $3\sigma$ for $|\delta| \in [27^\circ, 155^\circ]$ down to $\sin^2\theta_{13} \simeq 2 \times 10^{-4}$ ($\theta_{13} \simeq 0.8^\circ$). A good sensitivity can also be reached at the SPL: CP violation is clearly observed at $3\sigma$ for $|\delta| \in [45^\circ, 135^\circ]$ down to $\sin^2\theta_{13} \simeq 9 \times 10^{-4}$ ($\theta_{13} \simeq 1.7^\circ$), as shown in Ref. [20]. The NF at 3000 km is not as good as one would expect from its $\theta_{13}$-sensitivity contours, having a CP-discovery potential very similar to that of the SPL; the loss in sensitivity with respect to T2K-II is mainly due to the presence of the sign degeneracy. We have observed that the NF starts to be insensitive to leptonic CP violation just when, for $\theta_{13} \sim 2^\circ$, sign degeneracies close to $\theta_{13} = 0$ appear, which no longer allow the CP-conserving case to be excluded. This is a consequence of the fact that, for neutrinos with $E_\nu \sim 30$ GeV, the effective $\theta_{13}$ in matter [10,69] is extremely small. As shown in Ref. [34], when $\theta_{13}$ vanishes, the sign degeneracies flow to $\delta = 0^\circ$ or $\delta = 180^\circ$, as can be seen in Figs. 2 and 3 (left) of Ref. [34]. At the SPL and T2K-II, for which the vacuum parameter $\theta_{13}$ is the relevant parameter, this happens at a much smaller value, $\theta_{13} \sim 0.5^\circ$. Therefore, at these experiments the CP-discovery potential is statistics-dominated.
Sensitivity to the mass hierarchy
In addition to the previous sensitivities, one can ask for the sensitivity to the sign of the atmospheric mass difference. We compute the smallest values of $\theta_{13}$ for which the sign of $\Delta m^2_{23}$ can be measured in the ($\sin^2\theta_{13}$, $\delta$) plane. The measurement of $s_{atm}$ requires matter effects to be sizable. For this reason neither T2K-II nor the SPL has the capability to perform such a measurement. In Fig. 19 we present the results for the neutrino factory only and for both possibilities of the true $s_{atm} = \pm 1$. As expected, the NF at 3000 km (solid and dotted lines for normal and inverted hierarchy, respectively) exhibits the worst sensitivity to sign($\Delta m^2_{23}$). For $s_{atm} = +1$ the best sensitivity is reached in a quite large region around $\delta = 90^\circ$ for which
7 This is not the case of Fig. 11 in Ref. [59], where the excluded region in $\delta$ at fixed $\theta_{13}$ in the absence of a CP-violating signal at 90% CL is presented. In practice, in that figure we compare $N(\theta_{13}, \delta)$ with $N(\theta_{13}, 0^\circ)$, thus obtaining a one-parameter sensitivity plot in $\delta$ only.
Fig. 19. $3\sigma$ sensitivity to sign($\Delta m^2_{23}$). Solid lines refer to the L = 3000 km neutrino factory with normal hierarchy;
dashed lines to the L = 7000 km neutrino factory with normal hierarchy; dotted lines to the L = 3000 km neutrino
factory with inverted hierarchy; dot-dashed lines to the L = 7000 km neutrino factory with inverted hierarchy.
[sin2 13 ]min = 3.7 104 (1.1 ) whereas for 90 [sin2 13 ]min = 4 103 (3.6 ). The situation is completely reversed for satm = 1 due to the fact that in matter a flip in the sign of
$\Delta m^2_{23}$ corresponds to an exchange of the neutrino and antineutrino oscillation probabilities. At least one order of magnitude in sensitivity can be gained at the NF at 7000 km, because around L = 7000 km the effect of $\delta$ in the $\nu_e \to \nu_\mu$ oscillation probability vanishes, which helps to measure the sign of the atmospheric mass difference. For the normal hierarchy, a maximal sensitivity is achieved for $0^\circ < \delta < 90^\circ$ at the level of $[\sin^2\theta_{13}]_{min} = 4.4 \times 10^{-5}$ ($0.4^\circ$), whereas the largest value, $[\sin^2\theta_{13}]_{min} = 1.7 \times 10^{-4}$ ($0.8^\circ$), is found around $\delta = 100^\circ$. For the inverted hierarchy the sensitivity varies in the range $\sin^2\theta_{13} \in [1, 3] \times 10^{-4}$ ($0.6^\circ$, $1^\circ$).
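The angles quoted in parentheses next to each value of $\sin^2\theta_{13}$ follow from $\theta_{13} = \arcsin\sqrt{\sin^2\theta_{13}}$. As a quick cross-check, the short Python sketch below (not part of the original analysis; the function and variable names are ours) recomputes the angles for the 7000 km neutrino factory values quoted above. The quoted angles are rounded, so agreement is only expected to within about $0.1^\circ$.

```python
import math


def theta13_degrees(sin2_theta13: float) -> float:
    """Convert sin^2(theta13) into theta13 expressed in degrees."""
    return math.degrees(math.asin(math.sqrt(sin2_theta13)))


# (sin^2 theta13, quoted angle in degrees) pairs taken from the text above
QUOTED_PAIRS = [(4.4e-5, 0.4), (1.7e-4, 0.8), (1.0e-4, 0.6), (3.0e-4, 1.0)]

for sin2, quoted_deg in QUOTED_PAIRS:
    computed = theta13_degrees(sin2)
    print(f"sin^2(theta13) = {sin2:.1e} -> {computed:.2f} deg (quoted: {quoted_deg} deg)")
```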
Sensitivity to maximal $\theta_{23}$ and the octant-discovery potential
Finally, in Fig. 20 we present the sensitivity to maximal $\theta_{23}$ (plot on the left) and the octant-discovery potential (plot on the right). The curves have been computed for $\theta_{13} = 0$.
With respect to the sensitivity to maximal $\theta_{23}$, we observe that the NF at 3000 km (solid line) is not as good as one might expect, since it measures far away from the oscillation maximum. At the value of $\Delta m^2_{23}$ for which the sensitivity is maximal, deviations of $\sin^2\theta_{23}$ from maximal mixing as small as 10% could be established. Similar behaviour is expected at the SPL (dot-dashed line), except for small $\Delta m^2_{23}$, where it outperforms the NF at 3000 km. A big improvement will however be achieved at a NF at 7000 km (dashed line) and at T2K-II (dotted line); both experiments have energies and baselines (as well as the off-axis angle for T2K-II) chosen to match the first oscillation peak (in vacuum and matter respectively); the sensitivities reached are at the level of $\sin^2\theta_{23} \in [0.45, 0.48]$, almost independently of the value of $\Delta m^2_{23}$, which means that deviations from maximal mixing of the order of 4% could be established.
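To make the statement about the oscillation maximum concrete, the following Python sketch evaluates where the first maximum of the two-flavour vacuum survival probability, $P_{\mu\mu} \simeq 1 - \sin^2 2\theta_{23} \sin^2(1.267\,\Delta m^2_{23} L/E)$ (with $\Delta m^2$ in eV$^2$, $L$ in km and $E$ in GeV), sits for the baselines discussed here. This is only an illustration under stated assumptions: matter effects and $\theta_{13}$ are neglected, $\Delta m^2_{23} = 2.5 \times 10^{-3}$ eV$^2$ is taken as a representative value, and the relative deviation from maximal mixing is defined by us as $|0.5 - \sin^2\theta_{23}|/0.5$.

```python
import math


def first_maximum_energy_gev(delta_m2_ev2: float, baseline_km: float) -> float:
    """Energy (GeV) at which 1.267 * dm^2 * L / E = pi/2 (first vacuum oscillation maximum)."""
    return 2.0 * 1.267 * delta_m2_ev2 * baseline_km / math.pi


def deviation_from_maximal(sin2_theta23: float) -> float:
    """Relative deviation of sin^2(theta23) from the maximal-mixing value 0.5 (our definition)."""
    return abs(0.5 - sin2_theta23) / 0.5


DELTA_M2 = 2.5e-3  # eV^2, representative value (assumption)

for name, baseline in [("T2K (295 km)", 295.0), ("NF (3000 km)", 3000.0), ("NF (7000 km)", 7000.0)]:
    energy = first_maximum_energy_gev(DELTA_M2, baseline)
    print(f"{name}: first vacuum maximum at E = {energy:.2f} GeV")

for sin2 in (0.45, 0.48):
    print(f"sin^2(theta23) = {sin2}: deviation = {100 * deviation_from_maximal(sin2):.0f}%")
```

In this approximation the first maximum for the 295 km baseline lies near 0.6 GeV, while for 3000 km it lies near 6 GeV, well below the bulk of a multi-ten-GeV neutrino factory spectrum.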
Notice that, although this sensitivity is rather good, in general it is very difficult to determine the octant in which the atmospheric angle lies. As we can see from Eq. (1), it is quite difficult to break the $\theta_{23} \to \pi/2 - \theta_{23}$ symmetry induced by the leading term in the transition probability; the subleading terms that could help in lifting this degeneracy are very difficult to isolate. However, for values of $\theta_{13}$ different from zero, we can take full advantage of matter effects in the disappearance of muon neutrinos, as we have already seen in Section 4 (Figs. 12–13). Obviously the
Fig. 20. Left: $3\sigma$ sensitivity to maximal $\theta_{23}$; right: $3\sigma$ sensitivity to the $\theta_{23}$-octant. Solid lines refer to the L = 3000 km
neutrino factory; dashed lines to the L = 7000 km neutrino factory; dotted lines to T2K-II; dot-dashed lines to the
(standard) SPL.
matter effects at the SPL and T2K can never be sizable enough to solve the octant ambiguity; on the other hand, the neutrino factory shows a (limited) capability to solve it, irrespective of the baseline and the value of $\delta$. To illustrate this point, we fixed $\theta_{23} = 41.5^\circ$ and $\Delta m^2_{23} = 2.5 \times 10^{-3}$ eV$^2$ and, for any value of $\delta$, we computed the minimum value of $\theta_{13}$ for which the octant ambiguity is solved. As we can see in the right plot of Fig. 20, a $3\sigma$ octant discovery is possible for $\sin^2\theta_{13} > 0.01$ ($6^\circ$).
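The origin of the octant ambiguity can be illustrated numerically. The leading disappearance term depends on $\theta_{23}$ only through $\sin^2 2\theta_{23}$, which is invariant under $\theta_{23} \to 90^\circ - \theta_{23}$, whereas $\sin^2\theta_{23}$ is not. The sketch below (a simple illustration with helper names of our own choosing) uses the input value $\theta_{23} = 41.5^\circ$ adopted above.

```python
import math


def sin2_2theta(theta_deg: float) -> float:
    """sin^2(2*theta) for an angle given in degrees."""
    return math.sin(2.0 * math.radians(theta_deg)) ** 2


def sin2_theta(theta_deg: float) -> float:
    """sin^2(theta) for an angle given in degrees."""
    return math.sin(math.radians(theta_deg)) ** 2


theta23 = 41.5            # input value used in the text (degrees)
mirror = 90.0 - theta23   # octant-degenerate partner

print(f"sin^2(2 theta23): {sin2_2theta(theta23):.5f} vs {sin2_2theta(mirror):.5f}")
print(f"sin^2(theta23):   {sin2_theta(theta23):.4f} vs {sin2_theta(mirror):.4f}")
```

The first line prints two identical numbers (the leading term cannot tell the two octants apart), while the second shows the two solutions lying on opposite sides of 0.5.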
6. Conclusions
In this paper, we studied the measurement of the atmospheric neutrino oscillation parameters, $\theta_{23}$ and $\Delta m^2_{23}$, using the $\nu_\mu$ disappearance channel at three conventional beam facilities, the SPL, T2K-phase I and NO$\nu$A, and at the neutrino factory. The precision on these two parameters will be of crucial importance in the measurement of two of the unknowns of the PMNS mixing matrix, $\theta_{13}$ and the leptonic CP-violating phase $\delta$.
It has been shown that counting experiments cannot significantly reduce the uncertainties on $\theta_{23}$ and $\Delta m^2_{23}$. Hence, we have considered detectors with (modest but non-vanishing) sensitivity to the energy of the final leptons produced via neutrino interactions. For T2K-I and NO$\nu$A we have indeed found that, independently of the input value of $\theta_{23}$, the errors on $\theta_{23}$ can be significantly reduced. Different energy bins give different allowed regions in the ($\theta_{23}$, $\Delta m^2_{23}$)-plane and their combination eventually helps in reducing the uncertainties around the physical point. The two experiments only run with $\pi^+$, and thus are not sensitive to the CP-violating phase $\delta$.
The SPL, with a 1 Mton detector and 2 + 8 years of running time with $\pi^+$ and $\pi^-$, respectively, could in principle greatly improve these results. However, having both neutrinos and antineutrinos at this setup does not help, since they are produced with roughly the same energy. Moreover, the baseline is too short for matter effects to make a difference. As a consequence, the information coming from the two fluxes is not complementary and simply adds statistically. Remember, however, that this is not the case in the appearance mode, $\nu_\mu \to \nu_e$, for which this setup was designed. In this paper, we have considered two energy bins at the SPL: this already makes a big difference in the precision on $\theta_{23}$, as can be seen by comparing with our results of Refs. [20,59] and [15], where only a counting experiment was considered for this facility.
We have then studied the impact of $\theta_{13}$ and $\delta$ on the measurement of $\theta_{23}$ and $\Delta m^2_{23}$. Since both parameters are unknown at present, their values should be reconstructed at the same time as the atmospheric parameters. We have shown that the main effect of considering $\theta_{13}$ as a free variable is a shift of the contours toward larger values of $\theta_{23}$ (with respect to the input point), while a free $\delta$ in the whole $[-180^\circ, 180^\circ]$ range does not produce any significant distortion of the allowed regions. These effects can be strongly softened if, in addition to the disappearance channels, we introduce in the analysis the appearance $\nu_\mu \to \nu_e$ channels, which the super-beam facilities have been proposed to look for. Two important conclusions can then be drawn: if $\theta_{13}$ is kept fixed, the combination of appearance and disappearance channels solves the disappearance octant degeneracy for $\theta_{13}$ large enough; on the other hand, the octant degeneracy is not lifted if $\theta_{13}$ is free to vary in the current allowed range, for almost any value of $\theta_{13}$.
The situation is quite different at the neutrino factory. We have pointed out that the $\nu_\mu$ disappearance channel is a very powerful tool to reduce the uncertainties on the atmospheric parameters to an unprecedented level and, in combination with appearance channels, to solve many of the discrete ambiguities affecting the measurement of the PMNS matrix elements. The main feature emerging from such an analysis is that, for $\theta_{13} \gtrsim 3^\circ$–$4^\circ$, the synergy between disappearance and appearance channels (the better-known golden and silver channels) greatly helps in solving the octant ambiguity, at both the L = 3000 km and L = 7000 km baselines, as a result of the strong matter effects along the neutrino path and the huge statistics at hand. This remains true even if we leave $\theta_{13}$ as a free parameter, which has been shown not to be the case for the other facilities. Equally remarkably, in the same range of $\theta_{13}$, the sign clones are solved and are not present in the fits, independently of the baseline and of the input values of $\theta_{23}$ and $\Delta m^2_{23}$. This allows a precise measurement of $\theta_{13}$ at both facilities while the CP phase $\delta$, as is already well known, can only be measured at L = 3000 km. For values of $\theta_{13} \lesssim 3^\circ$, some remnant of the clones still remains and the measurement of the two unknowns cannot be performed with high precision.
Finally, we have presented a comparison, at the different facilities, of the $3\sigma$ sensitivity to $\theta_{13}$, to maximal $\theta_{23}$, to the mass hierarchy and to the $\theta_{23}$-octant, and of their CP-discovery potential. T2K-I and NO$\nu$A, running with neutrinos only, are not able to measure $\delta$. On the other hand, they have very similar performances in the $\theta_{13}$-sensitivity, with an excluded region ranging from $[\sin^2\theta_{13}]_{min} = 2 \times 10^{-3}$ to $8 \times 10^{-3}$, and in the sensitivity to maximal $\theta_{23}$, with the largest $\theta_{23}$ that can be distinguished from $\theta_{23} = 45^\circ$ ranging from $[\sin^2\theta_{23}]_{max} = 0.45$ to $0.40$ for $\Delta m^2_{23} \in [2.0, 3.0] \times 10^{-3}$ eV$^2$.
The neutrino factory outperforms the considered super-beams for both baselines in the $\theta_{13}$-sensitivity and in the sensitivity to the sign of the atmospheric mass difference. The longest baseline is as effective as T2K-II in the sensitivity to maximal $\theta_{23}$. The excluded region in $\theta_{13}$ at the 3000 km neutrino factory ranges from $[\sin^2\theta_{13}]_{min} = 2 \times 10^{-5}$ to $2 \times 10^{-4}$ (for particular values of $\delta$). The sign of the atmospheric mass difference can be measured at the 7000 km (3000 km) neutrino factory for $\sin^2\theta_{13}$ as small as $10^{-4}$ ($10^{-3}$). Finally, the $\theta_{23}$-octant can be identified at both facilities for $\sin^2\theta_{13}$ as small as $10^{-2}$ ($\theta_{13} \simeq 6^\circ$). Notably, the CP-discovery potential seems to be larger for T2K-II than for the neutrino factory at L = 3000 km (no CP-discovery potential is expected at the magic baseline L = 7000 km). This is explained as follows: as a general rule, for small values of $\theta_{13}$ the degeneracies flow toward $\delta = 0^\circ$ and $|\delta| = 180^\circ$ (see Refs. [33,34]), thus mimicking a non-CP-violating phase. Due to a parametric conspiracy between the chosen energy and baseline and the matter effects, at the neutrino factory the typical value of $\theta_{13}$ for which this happens is much larger than at the SPL and T2K. Therefore, although from the statistical point of view the neutrino factory would certainly outperform both the SPL and T2K-II, in practice for small values of $\theta_{13}$ a CP-violating phase will be difficult to distinguish from a non-CP-violating one, if $s_{atm}$ and $s_{oct}$ are not measured previously.
Acknowledgements
We would like to thank E. Couce, B. Gavela, J. Gomez-Cadenas, P. Hernandez, P. Huber,
O. Mena, P. Migliozzi, T. Schwetz and W. Winter for useful discussions. The authors acknowledge the financial support of MCYT through project FPA2003-04597 and of the European Union
through the networking activity BENE. E.F. acknowledges financial support from the UAM.
Appendix A
In this appendix we recall the number of events per bin that are expected in the $\nu_\mu$ ($\bar\nu_\mu$) disappearance channel at the SPL, T2K-I, NO$\nu$A and the neutrino factory experiments. The expected T2K-I muon and electron identification efficiencies are presented as well.
In all tables, two different values of $\theta_{13}$ and $\delta$ are considered, as well as two choices of sign($\Delta m^2_{23}$).
In Table 1 we show the expected event rates per bin at the SPL after a 2 years run with $\pi^+$ and an 8 years run with $\pi^-$, using the fluxes of Ref. [43] and a 440 kton detector at L = 130 km.
In Table 2 we show the expected event rates for the two lowest bins at T2K-I after a 5 years run with $\pi^+$, for a 22.5 kton detector at L = 295 km.
Table 1
Disappearance event rates for a 2 + 8 years $\pi^+$ + $\pi^-$ run at the standard SPL, for different values of $\theta_{13}$, $\delta$ and of the sign of the atmospheric mass difference, $s_{atm}$
| | No Osc. | $\theta_{13} = 0$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 0^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 90^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) |
| $E \in [0, 250]$ MeV, $N^-$ | 2784 | 186 / 158 | 198 / 158 | 192 / 162 |
| $E \in [250, 600]$ MeV, $N^-$ | 21461 | 2401 / 2591 | 2423 / 2682 | 2455 / 2642 |
| $E \in [0, 250]$ MeV, $N^+$ | 6310 | 830 / 740 | 859 / 737 | 841 / 753 |
| $E \in [250, 600]$ MeV, $N^+$ | 19157 | 1708 / 1858 | 1734 / 1939 | 1757 / 1906 |
Table 2
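Disappearance event rates for a 5 years $\pi^+$ run at T2K-I, for different values of $\theta_{13}$, $\delta$ and of the sign of the atmospheric mass difference, $s_{atm}$
| No Osc. | $\theta_{13} = 0$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 0^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 90^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) |
| 752 | 52 / 45 | 56 / 46 | 54 / 47 |
| 2218 | 122 / 140 | 126 / 149 | 128 / 145 |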
Table 3
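Disappearance event rates for a 5 years $\pi^+$ run at NO$\nu$A, for different values of $\theta_{13}$, $\delta$ and of the sign of the atmospheric mass difference, $s_{atm}$
| No Osc. | $\theta_{13} = 0$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 0^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 90^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) |
| 1217 | 120 / 105 | 125 / 102 | 116 / 107 |
| 8635 | 819 / 920 | 800 / 949 | 847 / 907 |
| 7545 | 1909 / 2021 | 1886 / 2048 | 1938 / 2003 |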
Table 4
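Disappearance event rates for a 5 + 5 years run at the L = 3000 km neutrino factory, for different values of $\theta_{13}$, $\delta$ and of the sign of the atmospheric mass difference, $s_{atm}$
| | No Osc. | $\theta_{13} = 0$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 0^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 90^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) |
| $E \in [4, 8]$ GeV; $N$ | | | | |
| | 6546 | 407 / 418 | 449 / 430 | 441 / 426 |
| | 26110 | 6608 / 6914 | 6727 / 7034 | 6776 / 7004 |
| $E \in [4, 8]$ GeV; $N$ | | | | |
| | 3014 | 187 / 192 | 186 / 210 | 188 / 210 |
| | 12201 | 3092 / 3232 | 3123 / 3328 | 3136 / 3305 |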
Table 5
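Disappearance event rates for a 5 + 5 years run at the L = 7000 km neutrino factory, for different values of $\theta_{13}$, $\delta$ and of the sign of the atmospheric mass difference, $s_{atm}$
| No Osc. | $\theta_{13} = 0$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 0^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) | $\theta_{13} = 8^\circ$; $\delta = 90^\circ$ ($s_{atm} = +$ / $s_{atm} = -$) |
| 11129 | 639 / 551 | 438 / 515 | 413 / 519 |
| 27181 | 2114 / 2374 | 2582 / 2565 | 2556 / 2578 |
| 5225 | 299 / 259 | 269 / 206 | 268 / 216 |
| 12799 | 997 / 1116 | 1083 / 1334 | 1077 / 1345 |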
Fig. 21. Muon (solid) and electron (dashed) identification efficiency at T2K-I as a function of the neutrino energy.
In Table 3 we show the expected event rates per bin at NO$\nu$A after a 5 years run with $\pi^+$, for a 30 kton detector at L = 812 km.
In Tables 4 and 5 we show the expected event rates per bin after a 5 years run with $\mu^-$ and a 5 years run with $\mu^+$ at the neutrino factory, for a 40 kton detector at L = 3000 km and L = 7000 km, respectively.
Finally, in Fig. 21 we present the muon (solid) and electron (dashed) identification efficiencies at the T2K-I experiment, as found in Refs. [16,70], respectively.
References
[1] Y. Fukuda, et al., Super-Kamiokande Collaboration, Phys. Rev. Lett. 81 (1998) 1562, hep-ex/9807003;
M. Ambrosio, et al., MACRO Collaboration, Phys. Lett. B 517 (2001) 59, hep-ex/0106049;
M.H. Ahn, et al., K2K Collaboration, Phys. Rev. Lett. 90 (2003) 041801, hep-ex/0212007;
B.T. Cleveland, et al., Astrophys. J. 496 (1998) 505;
J.N. Abdurashitov, et al., SAGE Collaboration, Phys. Rev. C 60 (1999) 055801, astro-ph/9907113;
W. Hampel, et al., GALLEX Collaboration, Phys. Lett. B 447 (1999) 127;
S. Fukuda, et al., Super-Kamiokande Collaboration, Phys. Rev. Lett. 86 (2001) 5651, hep-ex/0103032;
Q.R. Ahmad, et al., SNO Collaboration, Phys. Rev. Lett. 87 (2001) 071301, nucl-ex/0106015;
K. Eguchi, et al., KamLAND Collaboration, Phys. Rev. Lett. 90 (2003) 021802, hep-ex/0212021.
[2] C. Athanassopoulos, et al., LSND Collaboration, Phys. Rev. Lett. 81 (1998) 1774, nucl-ex/9709006;
A. Aguilar, et al., LSND Collaboration, Phys. Rev. D 64 (2001) 112007, hep-ex/0104049.
[3] I. Stancu, et al., MiniBooNE Collaboration, FERMILAB-TM-2207.
[4] B. Pontecorvo, Sov. Phys. JETP 6 (1957) 429, Zh. Eksp. Teor. Fiz. 33 (1957) 549;
Z. Maki, M. Nakagawa, S. Sakata, Prog. Theor. Phys. 28 (1962) 870;
B. Pontecorvo, Sov. Phys. JETP 26 (1968) 984, Zh. Eksp. Teor. Fiz. 53 (1967) 1717;
V.N. Gribov, B. Pontecorvo, Phys. Lett. B 28 (1969) 493.
[5] M. Apollonio, et al., CHOOZ Collaboration, Phys. Lett. B 466 (1999) 415, hep-ex/9907037;
M. Apollonio, et al., CHOOZ Collaboration, Eur. Phys. J. C 27 (2003) 331, hep-ex/0301017.
[6] G.L. Fogli, E. Lisi, A. Marrone, A. Palazzo, hep-ph/0506083;
G.L. Fogli, E. Lisi, A. Marrone, A. Palazzo, A.M. Rotunno, hep-ph/0506307.
[7] H. Pessard, OPERA Collaboration, hep-ex/0504033;
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38]
[39]
[40]
[41]
[42]
[43]
[44]
[45]
[46]
[47]
[48]
[49]
[50]
[51]
[52]
[53]
[54]
[55]
[56]
[57]
[58]
[59]
[60]
[61]
[62]
[63]
[64]
[65]
[66]
[67]
[68]
[69]
[70]
Abstract
Family symmetry could explain the large mixing of the atmospheric neutrinos. The same symmetry could explain why flavor-changing neutral current processes in supersymmetric standard models are so strongly suppressed. It may also be able to explain why the proton is so stable. We investigate these questions in a supersymmetric, renormalizable extension of the standard model, which possesses a family symmetry based on the binary dihedral group $Q_6$. We find that the amplitude for $\mu \to e + \gamma$ enjoys a suppression factor proportional to $|(V_{MNS})_{e3}| \simeq m_e/(\sqrt{2} m_\mu) \simeq 3.4 \times 10^{-3}$, and that $B(p \to K^0 \mu^+)/B(p \to K^0 e^+) \simeq |(V_{MNS})_{e3}|^2 \simeq 10^{-5}$, where $V_{MNS}$ is the neutrino mixing matrix.
2006 Elsevier B.V. All rights reserved.
PACS: 12.60.Jv; 11.30.Hv; 12.15.Ff; 14.60.Pq; 02.20.Df
1. Introduction
The remarkable success of the standard model (SM) suggests that we have a highly nontrivial
part of a more fundamental theory for elementary particle physics. In spite of this success, the
SM suffers from various problems. One of them is that due to the quadratic divergence of the
Higgs mass its natural scale cannot exceed O(1) TeV [1]. Therefore, in order to extend the SM
in a natural way, the quadratic divergence has to be canceled [1,2]. As is well known,
low energy supersymmetry (SUSY) is introduced to protect the Higgs mass from the quadratic
divergence [3,4]. Unfortunately, SUSY is broken, and therefore its breaking should be soft in order to
maintain the very nature of low energy SUSY, whatever its origin is [3,4].
* Corresponding author.
Section 4 we consider the B and L violating operators that are allowed by the family symmetry
and then calculate the dominant proton decay modes as functions of the superpartner masses. We
conclude in Section 5.
2. The model
2.1. Group theory of Q2N
The binary dihedral group Q2N (N = 2, 3, . . .) is a finite subgroup of SU(2).1
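Its defining matrices are given by [27,33]
$$R_{2N} = \begin{pmatrix} \cos\frac{2\pi}{2N} & \sin\frac{2\pi}{2N} \\ -\sin\frac{2\pi}{2N} & \cos\frac{2\pi}{2N} \end{pmatrix}, \qquad P_Q = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \tag{1}$$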
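As an illustration of the group structure, the Python sketch below closes two generators under matrix multiplication and counts the resulting elements. It assumes that $R_{2N}$ is a rotation by $2\pi/2N$ and that $P_Q = \mathrm{diag}(i, -i)$; the sign convention of the off-diagonal entries of $R_{2N}$ is our assumption, but it does not affect the order of the group, which should be $4N$ (12 for $Q_6$). All function names are ours.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def fingerprint(m, digits=9):
    """Rounded (real, imag) parts, so that numerically equal matrices compare equal."""
    return tuple(
        (round(complex(z).real, digits) + 0.0, round(complex(z).imag, digits) + 0.0)
        for row in m
        for z in row
    )


def binary_dihedral_elements(n):
    """Close {R_2N, P_Q} under multiplication and return all group elements."""
    angle = 2.0 * math.pi / (2 * n)  # assumed rotation angle of R_2N
    r = ((math.cos(angle), math.sin(angle)), (-math.sin(angle), math.cos(angle)))
    p = ((1j, 0j), (0j, -1j))
    identity = ((1.0, 0.0), (0.0, 1.0))
    elements = {fingerprint(identity): identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for generator in (r, p):
            product = matmul(g, generator)
            tag = fingerprint(product)
            if tag not in elements:
                elements[tag] = product
                frontier.append(product)
    return list(elements.values())


print("order of Q6:", len(binary_dihedral_elements(3)))
```

For $N = 2$ the same routine returns the eight elements of the quaternion group, the smallest member of the family.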
for N = 2, 4, 6, . . . ,
for N = 3, 5, 7, . . . ,
(3)
where $1_{+,0}$ is the true singlet of $Q_{2N}$, and only $1_{-,1}$ and $1_{-,3}$ are complex irreps. The $N - 1$ different two-dimensional irreps are denoted by
$$2_\ell, \qquad \ell = 1, \ldots, N - 1. \tag{4}$$
$2_\ell$ with odd $\ell$ is a pseudo-real representation, while $2_\ell$ with even $\ell$ is a real representation. It is straightforward to calculate the Clebsch–Gordan coefficients for tensor products of irreps. The following multiplication rules in $Q_6$ in particular are used to construct the model [27]
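$$\begin{array}{ccccccc} 2_2 \times 2_2 & = & 1_{+,0} & + & 1_{+,2} & + & 2_2 \\ \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} & = & (a_1 b_1 + a_2 b_2) & & (a_1 b_2 - a_2 b_1) & & \begin{pmatrix} -a_1 b_1 + a_2 b_2 \\ a_1 b_2 + a_2 b_1 \end{pmatrix}, \end{array} \tag{5}$$
$$\begin{array}{ccccccc} 2_1 \times 2_2 & = & 1_{-,3} & + & 1_{-,1} & + & 2_1 \\ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \times \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} & = & (x_1 a_2 + x_2 a_1) & & (x_1 a_1 - x_2 a_2) & & \begin{pmatrix} x_1 a_1 + x_2 a_2 \\ x_1 a_2 - x_2 a_1 \end{pmatrix}. \end{array} \tag{6}$$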
Table 1
$Q_6 \times Z_4 \times R$ assignment of the chiral supermultiplets, where $R$ is the R parity. This is an alternative assignment to the one given in [27]. The Abelian $Z_4$ is also an alternative to the Abelian discrete R symmetry $Z_{12R}$ [27], for which one needs more singlets to construct a desired Higgs sector. The anomalies $Q_6[SU(2)_L]^2$, $Q_6[SU(3)_C]^2$, $Z_4[SU(2)_L]^2$ and $Z_4[SU(3)_C]^2$ can be canceled by the Green–Schwarz mechanism if, for instance, $k_2 = k_3$ is satisfied [40], where $k_2$ and $k_3$ are the Kac–Moody levels for $SU(2)_L$ and $SU(3)_C$, respectively
| | $Q$ | $Q_3$ | $U^c, D^c$ | $U_3^c, D_3^c$ | $L$ | $L_3$ | $E^c, N^c$ | $E_3^c$ | $N_3^c$ | $H^u, H^d$ | $H_3^u, H_3^d$ | $S$ | $T$ | $Y$ |
| $Q_6$ | $2_1$ | $1_{+,2}$ | $2_2$ | $1_{-,1}$ | $2_2$ | $1_{+,0}$ | $2_2$ | $1_{+,0}$ | $1_{-,3}$ | $2_2$ | $1_{-,1}$ | $2_1$ | $2_2$ | $1_{+,2}$ |
| $Z_4$ | 3 | 3 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 2 | 2 | 0 |
| $R$ | $-$ | $-$ | $-$ | $-$ | $-$ | $-$ | $-$ | $-$ | $-$ | $+$ | $+$ | $+$ | $+$ | $+$ |
leptons and Higgs bosons, respectively. Similarly, $SU(2)_L$ singlet supermultiplets for quarks, charged leptons and neutrinos are denoted by $U^c$, $U_3^c$, $D^c$, $D_3^c$, $E^c$, $E_3^c$ and $N^c$, $N_3^c$. $S$, $T$ and $Y$ are $SU(3)_C \times SU(2)_L \times U(1)_Y$ singlet Higgs supermultiplets. As an alternative to the Abelian discrete R symmetry $Z_{12R}$ of [27], for which one needs more singlets to construct a desired Higgs sector, we introduce an Abelian $Z_4$ symmetry. The $Z_4$ constrains the Higgs sector, whereas it does not constrain anything in the Yukawa sector. When discussing proton decay in Section 4, we will assume neither $Z_4$ nor $Z_{12R}$, because the symmetry of the Higgs sector may have a stronger model dependence. We then write down the most general, renormalizable, $Q_6 \times Z_4 \times R$ invariant superpotential $W$:
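$$W = W_Q + W_L + W_H, \tag{7}$$
where
$$W_Q = \sum_{I,i,j,k=1,2,3} \left( Y^{uI}_{ij} Q_i U^c_j H^u_I + Y^{dI}_{ij} Q_i D^c_j H^d_I \right), \tag{8}$$
$$W_L = \sum_{I,i,j,k=1,2,3} \left( Y^{eI}_{ij} L_i E^c_j H^d_I + Y^{\nu I}_{ij} L_i N^c_j H^u_I \right) + \sum_{i=1,2} \frac{m_N}{2} (N^c_i)^2 + \frac{\lambda_N}{2} (N^c_3)^2 Y, \tag{9}$$
$$W_H = m_T (T_1^2 + T_2^2) + m_Y Y^2 + \lambda_S (S_1^2 + S_2^2) Y + \lambda_1 (H_1^u S_2 + H_2^u S_1) H_3^d + \lambda_2 (H_1^d S_2 + H_2^d S_1) H_3^u + \lambda_3 \left[ (H_1^u H_1^d - H_2^u H_2^d) T_1 + (H_1^u H_2^d + H_2^u H_1^d) T_2 \right]. \tag{10}$$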
Yce
e1
Y = 0
Ybe
Yb
Ybu(d)
0
Yu3(d3) = Ycu(d)
0
u(d)
0
Yce
0
u(d)
Yc
0
0
Ybe
0 ,
0
0
0 ,
0
Yu2(d2) = 0
u(d)
Yb
0
0
0
(9)
(10)
u(d)
Yb
0 ,
0
(11)
Yau(d)
0
Ye2 = Yce
0
Yce
0
Ybe
0
Ybe ,
0
Ye3 = 0,
(12)
Yc
1
Y = 0
Yb
0
Yc
0
0
0,
0
0 0 0
Y3 = 0 0 0 .
0 0 Ya
0
0,
0
Yc
0
Y2 = Yc
0
Yb
(13)
All the parameters appearing above are real, because we assume a spontaneous CP violation to
occur. We will shortly come back to this issue.
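The $Z_4$ charges of Table 1 can be cross-checked against the superpotential: every term of $W_Q$, $W_L$ and $W_H$ should carry total charge 0 mod 4. The Python sketch below performs this bookkeeping. The dictionary keys and the list of terms are our own transcription of Table 1 and of Eqs. (8)–(10), family indices are suppressed because the $Z_4$ charge is the same for the doublet components, and only the $Z_4$ part of the symmetry is tested (not $Q_6$ invariance or R parity).

```python
# Z4 charges as read off from Table 1 (our transcription; field names are labels only)
Z4_CHARGE = {
    "Q": 3, "Q3": 3, "Uc": 0, "Dc": 0, "U3c": 0, "D3c": 0,
    "L": 1, "L3": 1, "Ec": 2, "Nc": 2, "E3c": 2, "N3c": 2,
    "Hu": 1, "Hd": 1, "H3u": 1, "H3d": 1, "S": 2, "T": 2, "Y": 0,
}

# Field content of the terms appearing in W_Q, W_L and W_H
SUPERPOTENTIAL_TERMS = [
    ("Q", "Uc", "Hu"), ("Q", "Dc", "Hd"), ("Q3", "U3c", "H3u"), ("Q3", "D3c", "H3d"),
    ("L", "Ec", "Hd"), ("L", "Nc", "Hu"), ("L3", "E3c", "H3d"), ("L3", "N3c", "H3u"),
    ("Nc", "Nc"), ("N3c", "N3c", "Y"),
    ("T", "T"), ("Y", "Y"), ("S", "S", "Y"),
    ("Hu", "S", "H3d"), ("Hd", "S", "H3u"), ("Hu", "Hd", "T"),
]


def z4_charge(term):
    """Total Z4 charge of a product of fields, modulo 4."""
    return sum(Z4_CHARGE[field] for field in term) % 4


violations = [term for term in SUPERPOTENTIAL_TERMS if z4_charge(term) != 0]
print("Z4-violating terms:", violations)
```

An empty list of violations is a necessary but not sufficient condition for a term to be allowed; the $Q_6$ multiplication rules must be imposed in addition.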
The $Z_4$ charges of the fields are so chosen that the most general $Q_6 \times Z_4 \times R$ invariant Higgs superpotential (10) has an accidental symmetry:
$$H_1^{u,d} \leftrightarrow H_2^{u,d}, \qquad S_1 \leftrightarrow S_2, \qquad T_1 \to -T_1, \tag{14}$$
where $H_3^{u,d}$, $T_2$ and $Y$ do not transform. This symmetry ensures the stability of the VEV structure
1 u i u
e ,
H1u = H2u = vD
2
u,d
u,d
1
H3 = v3u,d ei3 ,
2
Y = v Y ei(
Y /2)
1 d i d
H1d = H2d = vD
e ,
2
S /2)
T2 = v T ei(
T1 = 0,
T /2)
,
(15)
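$$\mathcal{L}^H_{SSB} = B_T (\tilde T_1^2 + \tilde T_2^2) + B_Y \tilde Y^2 + \lambda_S A_S (\tilde S_1^2 + \tilde S_2^2) \tilde Y + \lambda_1 A_1 (\tilde H_1^u \tilde S_2 + \tilde H_2^u \tilde S_1) \tilde H_3^d + \lambda_2 A_2 (\tilde H_1^d \tilde S_2 + \tilde H_2^d \tilde S_1) \tilde H_3^u + \lambda_3 A_3 \left[ (\tilde H_1^u \tilde H_1^d - \tilde H_2^u \tilde H_2^d) \tilde T_1 + (\tilde H_1^u \tilde H_2^d + \tilde H_2^u \tilde H_1^d) \tilde T_2 \right] + \mathrm{h.c.} \tag{16}$$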
(The fields with a tilde are the bosonic components of the corresponding supermultiplets.) As in the case of the supersymmetric part, the $B$'s and $A$'s are assumed to be real. We have investigated the minimization conditions for the phase angles given in (15), and found that a nontrivial solution of the conditions can exist. What the mass spectrum for this nontrivial solution looks like is still an open problem. A complete analysis of this problem, which is similar to that of [41], would go beyond the scope of the present paper, and we will publish it elsewhere. To proceed, we simply assume here that there exists a nontrivial CP-violating set of VEVs.
2.3. Fermion mass matrices and diagonalization
We assume that VEVs take the form (15), from which we obtain the fermion mass matrices.
2.3.1. Quark sector
The quark mass matrices are given by
u u i u
0
2Yc v3 e 3
1 u u i u
u
m = 2Yc v3 e 3
0
2
u
u ei
u ei u
Ybu vD
Ybu vD
u ei
Ybu vD
u ei u ,
Ybu vD
u u i u
3
2Ya v3 e
(17)
md =
1
2Y d v d ei3d
c 3
2
d
d ei d
Yb vD
d d i d
2Yc v3 e 3
0
de
Ybd vD
i d
d i
Ybd vD
e
d ei d .
Ybd vD
d d i d
2Ya v3 e 3
(18)
In the present case, the unitary matrices that rotate quarks have the following form: $U_{uL(R)} = R_{L(R)} P^u_{L(R)} O^u_{L(R)}$, where the $O$'s are orthogonal matrices, and
1 1 0
1 1 0
1
1
RL =
(19)
RR = 1 1
1 1 0 ,
0 ,
2
2
0 0
0
0
2
2
1
0
0
,
PLu = 0 exp(i2 u )
(20)
0
u
0
0
exp(i )
u
exp(i2 ) 0
0
exp i u ,
PRu =
(21)
0
1
0
3
u
0
0 exp(i )
u = 3u u ,
(22)
u,d
PL,R
0
qu /yu 0
u = PLu RLT mu RR PRu = mt qu /yu
m
(23)
0
bu ,
2
0
bu
yu
which can be then diagonalized as3
mu 0
u ORu = 0 mc 0 ,
OLuT m
0
0 mt
(24)
(25)
bu = 0.04443,
yu = 0.99732,
qd = 0.005091,
bd = 0.02570,
bd = 0.77606,
bu = 0.09338,
yd = 0.7940,
we obtain
mu /mt = 1.10 105 ,
3 The form of the mass matrix is known as the next-neighbor interaction form [42,43].
(26)
0.9739 to 0.9751
0.221to0.227
0.0029 to 0.0045
exp
V
= 0.221 to 0.227 0.9730 to 0.9744
0.039 to 0.044 ,
(27)
CKM
0.0048 to 0.014
sin 2(1 ) = 0.736 0.049.
0.037 to 0.043
0.9990 to 0.9992
(28)
(29)
where the values in the parentheses are the theoretical values obtained from (27) for $m_t = 174$ GeV and $m_b = 2.9$ GeV. We thus see that the model can reproduce the experimentally measured parameters well. Because of the family symmetry, the CKM parameters and the quark masses are related. In Fig. 1 we plot the predicted area in the $\sin 2\phi_1$–$\phi_3$ plane. We see from
Fig. 1 that the model requires
3 68 .
(30)
Another quantity to be compared may be $|V_{td}/V_{ts}|$, whose experimental value has recently been
obtained from the observation of $b \to d + \gamma$ in B decays [47]:
Model:
Exp.:
+0.038
|Vtd /Vts | = 0.200+0.026
0.025 (exp.)0.029 (theo.).
(31)
Fig. 1. Predicted area in the $\sin 2\phi_1$–$\phi_3$ plane. The vertical and horizontal lines correspond to the experimental values,
$\sin 2\phi_1(\beta) = 0.726 \pm 0.037$ and $\phi_3 = (60 \pm 14)^\circ$ [44,46].
We may conclude that 9 independent parameters of the model can well describe 10 physical
observables.
The mass matrices given in the present model can be analytically diagonalized [48], and the
approximate formula for VCKM implies that
md
mu i2.5
mu
md i2.5
Vus 0.794
e
,
Vcd
+ 0.794
e
,
ms
mc
mc
ms
ms i2.5
mc i1.25
e
e
9.63
,
Vcb 0.81
(32)
mb
mt
which should be compared with the Fritzsch formulas [43].
Finally, we give the unitary matrices that rotate the quarks for the choice of the parameters
given in (26):
u
u
u
UuL(R) = RL(R)
PL(R)
OL(R)
,
d
d
d
UdL(R) = RL(R)
PL(R)
OL(R)
,
u
d
where RL,R
and RL,R
are given in (19) and (20), and
0.9987
0.0514 2 105
OLu 0.0513 0.9977 0.0442 ,
0.0023 0.0442
0.9990
(33)
(34)
0.9834
0.1814 0.0050
OLd 0.1813 0.9833 0.0162 ,
0.0078 0.0150 0.9999
(35)
Using these orthogonal matrices and the matrices defined in (19)(22), Eq. (33) gives the unitary
matrices in the explicit form:
0.706
0.0363
1.42 105
UuL = 0.706 0.0363 1.42 105
0
0
0
0.0363
0.705
0.0313
u
+ e2i
(36)
0.0363
0.705
0.0313 ,
u
u
u
i
i
i
0.00229e
0.0442e
0.999e
0.695
0.128
0.00352
UdL = 0.695 0.128 0.00352
0
0
0.128
d
+ e2i
0.128
d
0.00783ei
0.695
0.695
d
0.0150ei
0.0115
0.0115 ,
d
1.00ei
(37)
0.706
0.0364
6.72 106
u
u
+ e2i +i3
0.706
0.0364
6.72 106 ,
u
u
u
i
i
0.00482e
0.0932e
0.996ei
0.675
0.209
7.35 105
d
d
+ e2i +i3
0.675
0.209
7.35 105 .
d
d
d
0.230ei
0.741ei
0.631ei
(38)
(39)
The unitary matrices above will be used when discussing the SUSY flavor problem in Section 3
and proton decay in Section 4.
2.3.2. Lepton sector
The mass matrices in the lepton sectors are:
Y Y
0
c
c
1
u i u
vD
Yc
0
m = Yc
e ,
u
u
2
Yb Yb
2Ya tan u ei(3 )
u . We start with the mass matrix of the charged leptons me :
where tan u = v3u /vD
me 0
UeL
me UeR = 0 m 0 .
0
0 m
e (1 + 2 ) (1/ 2)(1 e2 e2 2 ) 1/ 2
UeL = R e (1 2 )
(1/ 2)(1 e2 + e2 2 )
1/ 2 ,
2e
2e 2
1 e2
e2 (1 2 )
1 2 /2
i d
UeR = e2 (1 2 /2)
1
0
e ,
e2
0 1 0
R = 1 0 0,
0 0 1
(40)
(41)
(42)
(43)
(44)
1 2 /2
(45)
0 1/ 2 1/ 2
0 1/ 2 1/ 2 ,
1
0
0
which is the origin of the maximal mixing of the atmospheric neutrinos.
As for the neutrino masses, we assume that a see-saw mechanism [49] takes place. As
we can see from (9), the mass matrix for the right-handed neutrinos is diagonal: mN =
((1) mN , (1) mN , N v Y exp(iY /2)), ( = 0 or 1), where mN and N v Y are real and positive by assumption. Therefore, the Majorana mass matrix for the left-handed neutrinos becomes
T
M = m m1
N m
0
22 4
2(2 )2
u
R,
= (1) ei2 R 0
(46)
2(2 )2
0
22 4
0
2(4 )2 + (3 )2 exp i23
where R is given in (45), and
23 = 2 3u u + Y /2,
1
1
u
u
2 = Yc vD
/ mN ,
4 = Yb vD
/ mN ,
2
2
(47)
1
3 = Ya v3u / YN v Y .
2
(48)
The factor (1) ei2 has no effect, and so we ignore it in the following discussions.
Noticing that the s in (46) are real numbers, we recall that M can be diagonalized as [36]
0
0
m1
UT M U = 0
(49)
m2
0 ,
0
0
m3
where c12 = cos 12 , and s12 = sin 12 , and
0
s12 ei( i1 )/2 c12 ei( i2 )/2
U = R
0
0
1 ,
i(
+i
)/2
i(
+i
)/2
1
2
c12 e
s12 e
0
23 = 1 + 2 ,
m3 sin = m2 sin 2 = m1 sin 1 ,
m22
m223
1
2
tan2
for |r|
1.
(50)
(51)
(52)
(53)
is consistent with the experimental constraint | m221 | < | m223 | in the present model. Note that
Eq. (51) is satisfied for
23 = 1 + 2
(54)
and (50), respectively. The product UeL U can be brought by an appropriate phase transformation
Fig. 2. The effective Majorana mass mee as a function of sin with sin2 12 = 0.3 and m221 = 6.9 105 eV2 . The
dashed, solid and dot-dashed lines stand for m223 = 1.4, 2.3 and 3.0 103 eV2 , respectively. The m221 dependence
is very small.
to a popular form, which in the present model approximately assumes the form
s12
s13 ei
c12
1
s12 / 2 c12 / 2 1/ 2
0
0
ei
0
0
0
ei
(55)
with
me
s13 e =
= 3.4 . . . 103 ,
2m
sin 2 = sin(1 2 ),
CP ,
sin 2 = sin(1 ),
(56)
(57)
where 1 , 2 and are defined in (51).4 There are seven independent parameters to describe
12 parameters (3 + 3 = 6 masses, three angles and three phases) of the lepton sector. Therefore,
the effective Majorana mass $m_{ee}$ in neutrinoless double $\beta$ decay, for instance, can be predicted.
In Fig. 2 we plot mee as a function of sin for sin2 12 = 0.3, m221 = 6.9 105 eV2 and
m223 = 1.4, 2.3, 3.0 103 eV2 [51]. As we see from Fig. 2, the prediction is consistent with
recent experiments [52,53].
We will use the unitary matrices given in (43), (44) and (50) in Sections 3 and 4, and will see that the small parameter $s_{13} = \epsilon_e = m_e/(\sqrt{2} m_\mu) \simeq 0.0034$ appears as a suppression factor in FCNCs as well as in some of the proton decay modes.
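The size of this suppression factor is easy to verify numerically. The Python sketch below evaluates $m_e/(\sqrt{2} m_\mu)$ and its square; the lepton masses are standard reference values inserted by us (they are not quoted in the text), and the variable names are ours.

```python
import math

# Charged-lepton masses in MeV (standard reference values, inserted here as an assumption)
M_ELECTRON = 0.51099895
M_MUON = 105.6583755

s13 = M_ELECTRON / (math.sqrt(2.0) * M_MUON)

print(f"s13 = m_e / (sqrt(2) m_mu) = {s13:.5f}")
print(f"s13^2 = {s13 ** 2:.2e}")
print(f"theta13 = {math.degrees(math.asin(s13)):.2f} deg")
```

The result reproduces $s_{13} \simeq 0.0034$, and its square is of order $10^{-5}$, the size of the ratio of proton decay branching fractions quoted in the abstract.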
4 Unfortunately, this value of $s_{13}$ is too small to be measured [50].
0
0
aL
q,
2(q,)LL = m2q,
m
0 ,
aL
0
0
aRa
0
2aRR = m2q,
m
0
0
aRa
0
q,
0
bL
0
0 (a = u, d, e),
a
bR
(58)
where $m_{\tilde q,\tilde\ell}$ denote the average squark and slepton masses, respectively, and $(a_{L(R)}, b_{L(R)})$ are dimensionless free real parameters of O(1). Further, since the trilinear interactions (A terms) are also $Q_6$ invariant, the left–right mass matrices assume the form
2
aLR ij = Aaij ma ij (a = u, d, e),
m
(59)
where the $A^a_i$'s are free parameters of dimension one, and the fermion mass matrices $m^a$ are given in (17),
(18) and (40). Here we assume that the $A^a_i$'s are of the same order as the gaugino masses. They are
real, because we impose CP invariance at the Lagrangian level.
We work in the super-CKM basis and calculate
2(q,)LL UaL
m
aLL = UaL
2aRR(LR) UaR
m
and aRR(LR) = UaR(L)
(60)
to parameterize FCNCs and CP violations coming from the SSB sector, where the unitary matrices $U$ are given in (33), (43) and (44). In doing so, one observes that something interesting happens: a phase alignment. This is because the only source of CP phases is the VEVs (15). To see the phase alignment, we first observe that the matrices $R_{L,R}$ and the phase rotation matrices $P^{u,d}_{L,R}$, given in (19), (20) and (21), commute with the scalar soft mass matrices $\tilde m^2_{\tilde q LL}$ and $\tilde m^2_{u,dRR}$. This implies that $\Delta^{u,d}_{LL,RR}$ are real, where $\Delta^e_{LL,RR}$ is trivially real as we see from (43) and (44). As for the left–right soft mass squared (59), we find that $(P^a_L)^\dagger (R^a_L)^T \tilde m^2_{aLR} R^a_R P^a_R$ is a real matrix for all $a = u, d, e$. Consequently, no CP-violating processes induced by the SSB terms are possible in this model, satisfying the most stringent experimental constraints coming from the EDMs of the neutron and the electron [10].
In [10], experimental bounds on the dimensionless quantities
$$\delta^a_{LL,RR,LR} = \Delta^a_{LL,RR,LR}/m^2_{\tilde q,\tilde\ell} \quad (a = u, d, e) \tag{61}$$
are given. The theoretical values of the $\delta$'s for the present model are calculated below, where
$$\Delta a_L^{q,\ell} = a_L^{q,\ell} - b_L^{q,\ell}, \qquad \Delta a_R^{a} = a_R^{a} - b_R^{a}$$
are introduced.
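Why every off-diagonal entry is proportional to the splitting $a - b$ can be seen from a short numerical sketch. For a soft mass matrix of the form diag(a, a, b) and a real orthogonal rotation $U$, one has $U^T\,\mathrm{diag}(a,a,b)\,U = a\,\mathbf{1} + (b-a)\,U^T e_3 e_3^T U$, so the $(i,j)$ element with $i \neq j$ equals $-(a-b)\,U_{3i}U_{3j}$. The mixing angles below are hypothetical and chosen only for illustration; they are not the matrices of (33), (43) or (44).

```python
import math


def rotation(i, j, theta):
    """3x3 rotation by angle theta in the (i, j) plane."""
    m = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    c, s = math.cos(theta), math.sin(theta)
    m[i][i], m[j][j] = c, c
    m[i][j], m[j][i] = s, -s
    return m


def matmul(a, b):
    """Plain 3x3 matrix product."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rotate_soft_mass(a_par, b_par, u):
    """Return U^T diag(a, a, b) U for a real orthogonal U."""
    diag = [[a_par, 0.0, 0.0], [0.0, a_par, 0.0], [0.0, 0.0, b_par]]
    return matmul(transpose(u), matmul(diag, u))


# Hypothetical mixing angles, used only to build some orthogonal matrix.
U_EXAMPLE = matmul(rotation(0, 1, 0.2),
                   matmul(rotation(0, 2, 0.05), rotation(1, 2, 0.7)))

if __name__ == "__main__":
    a_par, b_par = 1.0, 0.8
    delta = rotate_soft_mass(a_par, b_par, U_EXAMPLE)
    predicted = -(a_par - b_par) * U_EXAMPLE[2][0] * U_EXAMPLE[2][1]
    print(delta[0][1], predicted)
```

In this simplified real case the off-diagonal entries vanish when $a = b$, which is the sense in which the degeneracy of the first two generations controls the flavor-changing insertions.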
Leptonic sector (LL and RR):
e
e
4.9 103 aL ,
12 LL = 21
LL
A ai =
Aai
mq,
(a = u, d, e),
(62)
e
e
13 LL = 31
1.7 105 aL ,
LL
e
e
23 LL = 32
8.4 108 aL ,
LL
e
e
12 RR = 21
8.4 108 aRe ,
RR
e
e
13 RR = 31 RR 5.9 102 aRe ,
e
e
23 RR = 32
1.4 106 aRe .
RR
(63)
e
100 GeV
12 LR 5.1 106 A ec A eb
,
m
e
100 GeV
,
21 LR 2.5 108 A ec A eb
m
e
100 GeV
7 e
e
,
13 LR 3.1 10 Ab Ab
m
e
100 GeV
,
31 LR 1.1 103 A ec A eb
m
e
100 GeV
,
23 LR 1.5 109 A eb A eb
m
e
100 GeV
.
32 LR 2.5 108 A ec A eb
m
(64)
(65)
u
u
500 GeV
6
u
u
u
12 LR 21
,
A
7.4
10
A
+
A
a
c
b
b
LR
mq
u
u
500 GeV
4
u
2 u
u
u
u
500 GeV
3
u
3 u
u
,
31 LR 10 1.7 Aa Ab 3.3 10 Ab Ac
mq
u
500GeV
23 LR 102 1.5 A ua A ub + 1.3 102 A ub + 3.6 105 A uc
,
mq
u
500 GeV
32 LR 102 3.2 A ua A ub 6.4 103 A ub 1.7 105 A uc
.
mq
Down quark sector (LL and RR):
d
d
q
12 LL = 21
1.2 104 aL ,
LL
d
d
q
13 LL = 31
7.8 103 aL ,
LL
d
d
q
23 LL = 32 LL 1.5 102 aL ,
d
d
12 RR = 21
1.7 101 aRd ,
RR
d
d
13 RR = 31 RR 1.4 101 aRd ,
d
d
23 RR = 32
4.7 101 aRd .
RR
(66)
(67)
d
500 GeV
11 LR 106 6.6 A da A db A db + 2A dc
,
mq
d
500 GeV
22 LR 105 4.1 A da A db + 11A db + 1.5A dc
,
mq
d
500 GeV
3
d
3 d
d
4 d
(68)
In Tables 2 and 3, theoretical values of certain $\delta$'s calculated above and their experimental bounds are summarized. We see from the tables that to satisfy the experimental constraints, the SSB parameters $\Delta a_R^d$, $\Delta a_L^{\ell}$, $\tilde A^e_b$ and $\tilde A^e_c$ of the present Q6 model should satisfy
$\Delta a_R^d < 10^{-1}$,
(69)
Table 2
Experimental bounds on $\delta$'s and their theoretical values in Q6 model, where the parameter $\tilde m_{\tilde q}$ denotes $m_{\tilde q}/500$ GeV, and $\Delta a_{L,R}$ and $\tilde A$ are given in (62)
Exp. bound
Q6 model
4.0 102 m
q
2.8 103 m
q
d
(LL)1.2 104 aL , (RR)1.7 101 aR
q
d
4.5 103 aL aR
4.4 103 m
q
2 105 (A da A db A db + A dc )m
1
q
9.8 102 m
q
1.8 102 m
q
d
(LL)7.8 103 aL , (RR)1.4 101 aR
q
d
3.4 102 aL aR
3.3 102 m
q
2 105 (A da A db + A db A dc )m
1
q
1.0 101 m
q
1.7 102 m
q
u
(LL)1.0 104 aL , (RR)4.5 104 aR
q
u
2.1 104 aL aR
3.1 102 m
q
7 105 (A ua A ub A ub + A uc )m
1
q
d )
|(23
LL,RR |
8.2m
2q
d
(LL)1.5 102 aL , (RR)4.7 101 aR
d )
|(23
LR |
1.6 102 m
2q
d )2
| Re(12
LL,RR |
d ) ( d )
| Re(12
LL 12 RR |
d )2 |
| Re(12
LR
d )2
| Re(13
LL,RR |
d ) ( d )
| Re(13
LL 13 RR |
d )2 |
| Re(13
LR
u
| Re(12 )2LL,RR |
u ) ( u )
| Re(12
LL 12 RR |
u )2 |
| Re(12
LR
Table 3
Experimental bounds on $\delta$'s and the theoretical values in Q6 model, where the parameter $\tilde m_{\tilde\ell}$ denotes $m_{\tilde\ell}/100$ GeV and $\Delta a_{L,R}$ and $\tilde A$ are given in (62)
e ) |
|(12
LL
e )
|(12
RR |
e )
|(12
LR |
e
|(13 )LL |
e )
|(13
RR |
e
|(13 )LR |
e ) |
|(23
LL
e )
|(23
RR |
e )
|(23
LR |
e
e ) |
|(23 )LL (13
LL
e )
e )
|(23
(
RR 13 RR |
e
e )
|(23 )LL (13
RR |
e )
e ) |
|(23
(
RR 13 LL
Exp. bound
Q6 model
4.0 105 m
2
4.9 103 aL
9 104 m
2
8.4 107 m
2
2
2
2 10 m
3 101 m
2
2
1.7 10 m
2
2 102 m
2
3 101 m
2
1 102 m
2
1 104 m
2
9 104 m
2
2 105 m
2
2 105 m
2
e
8.4 108 aR
8.4 108 aL
e
1.4 106 aR
)2
1.4 1012 ( aL
e )2
8.4 108 ( aR
a e
5.0 109 aL
R
a e
2.4 1011 aL
R
while the other SSB parameters are allowed to be of O(1). So, one can fairly say that Q6 symmetry can soften the SUSY flavor problem. Note also that the degree of degeneracy of the left-handed squark masses $\Delta a_L^q$ does not need to be very accurate. We find that because of the constraint on $\Delta a_R^d$ given in (69), $\Delta a_L^q < 10^{+1}$ is sufficient to satisfy all the constraints. This has an important consequence for proton decay as we will see in the next section.
4. Proton decay
In this section we consider proton decay, which is a process reflecting the flavor structure of a
model.
4.1. Q6 invariant baryon and lepton number violating operators
In supersymmetric models, the baryon and/or lepton number violating operators of lower
dimensions are [28]:
(1) dimension-four R parity violating operators, and
(2) dimension-five baryon and lepton number violating operators.
Both operators are controlled by the flavor structure of a model. As for the dimension-four R
parity violating operators, Q6 flavor symmetry in the model considered in the previous sections
allows only lepton number violating operators:
LLE c :
LQD c :
c
113 LI LI E3c + 311 L3 LI EIc + 333 L3 L3 E3c + fI J K LI LJ EK
,
2
c
1
c
132 LI i I J Q3 DJ + 123 LI I J QJ D3 ,
H d NH u :
NNN:
fI J K NI NJ NK ,
(70)
and those in which LI is replaced by HId . Note that the baryon number violating operator
U c D c D c is forbidden by Q6 symmetry. Therefore, the dimension-four operators in the present
model cannot mediate proton decay.
We next look at dimension-five operators. The baryon number violating dimension-five operators (which are allowed by R parity and $B-L$ symmetry) can be written as [28,29,54,55]
$$W_5 = \frac{1}{M}\sum_{i,j,k,l=1}^{3}\left[\frac{1}{2}\,C_L^{ijkl}\,Q_i Q_j Q_k L_l + C_R^{ijkl}\,U^c_i E^c_j U^c_k D^c_l\right], \tag{71}$$
where the first and the second terms are called the LLLL and RRRR operators, respectively. In grand unified theories (GUTs), effective dimension-five operators can be generated by integrating out colored Higgs multiplets [28,29], and therefore the size of the coefficients of the operators strongly depends on the Yukawa matrices. For the minimal SUSY SU(5) GUT [55], one obtains
$$M = M_{H_C}, \qquad C_L^{ijkl} = C_R^{ijkl} = Y_U^{ij}\,Y_D^{kl} \quad \text{at } M_{GUT}, \tag{72}$$
where $M_{H_C}$ is the colored Higgs mass of the order of the GUT scale, and $Y_U$, $Y_D$ are Yukawa coupling matrices appearing in the superpotential
$$W_Y = \frac{1}{4}\,Y_U^{ij}\,10_i\,10_j\,H + \sqrt{2}\,Y_D^{ij}\,10_i\,\bar 5_j\,\bar H. \tag{73}$$
Unfortunately, the minimal SUSY SU(5) GUT should be excluded by the decay mode $p \to K^+\bar\nu$ if the gauge coupling unification should be strictly satisfied [30].$^5$
If we do not assume any GUTs, the baryon number violating operators could be generated by some unknown Planck scale physics. In this case, the mass parameter $M$ in (71) is given by
$$M = M_{PL} = 2 \times 10^{18}\ \text{GeV}, \tag{74}$$
while the coefficients $C_{L,R}^{ijkl}$ remain undetermined. So, the operators are supplied with a suppression factor $1/M_{PL}$, which should be compared with $1/M_{H_C} \sim 1/10^{16}$ GeV$^{-1}$ in the GUT case. However, this suppression is not sufficient to keep the proton stable, unless the coefficients $C$ are smaller than $O(10^{-7})$. An efficient tool to suppress or to forbid proton decay is symmetry. Many authors [24,32,56] have proposed models in which both the gross structure of the baryon and lepton number violating operators and the structure of the Yukawa couplings are fixed by a single flavor symmetry and its breaking. If this is realized, the flavor symmetry can be tested by proton decay, too. In the models of [24,32,56], a certain set of flavon fields is needed to form invariants, and the flavor symmetry is assumed to be broken at a superhigh energy scale. In contrast to these models, Q6 flavor symmetry is broken (spontaneously and at most softly) only at a low energy scale which is comparable with the SUSY breaking scale. So, it is natural to assume that Q6 flavor symmetry is intact at the Planck scale, too; we do not have to introduce flavon fields. Moreover, Q6 is non-Abelian, which can indeed reduce the number of independent coefficients drastically, as we will see now. We find that the relevant superpotential containing the baryon number violating Q6 invariant dimension-five operators generated at the Planck scale can be written as
$$W_5^{Q_6} = \frac{1}{M_{PL}}\sum_{I,J=1,2}\Big[\,C_L\,Q_I Q_I Q_3 L_3 + C_R^{(1)}\,E^c_I\,(i\sigma^2)_{IJ}\,U^c_J U^c_3 D^c_3 + C_R^{(2)}\,E^c_I\,(i\sigma^2)_{IJ}\,U^c_1 U^c_2 D^c_J\,\Big], \tag{75}$$
where the superfields in (75) are in the flavor eigenstates. To obtain Eq. (75), we have not assumed any symmetries such as R-parity and $B-L$ except for Q6 symmetry, where the Q6 assignment is given in Table 1.
4.2. Gross structure of the dimension-five operators and the lowest order approximation
As we can see from (75), Q6 allows only three independent coefficients, $C_L$, $C_R^{(1)}$ and $C_R^{(2)}$.
We will see moreover that the first term, the LLLL operator, gives the most dominant contribution
to proton decay, while the RRRR operators can be neglected in the lowest order approximation.
Consequently, the relative size of all the partial decay rates is fixed in this approximation, once
the SSB sector is fixed. To begin with, we recall two basic facts.
(1) Since the operators in the first two terms in $W_5^{Q_6}$ contain quark fields of the third generation in the flavor eigenstate, small mixing parameters appear when fields are rewritten in terms of the mass eigenstates, that is,
f
3 = V3I Im ,
I = 1, 2,
(76)
5 However, rapid proton decay can be avoided for a certain choice of squark and slepton mixing matrices even in the
minimal SUSY SU(5) GUT. See [31].
Fig. 3. One-loop diagrams contributing to the effective four-Fermi Lagrangian (78). We consider only gluino ($\tilde g$) and wino ($\tilde w$) dressings in the lowest order approximation.
where the subscripts f and m denote the flavor and mass eigenstates, respectively. The mixing parameters $V_{3I}$ will be multiplied with the coefficients $C$ in the decay amplitudes, so that the condition on the $C$'s can be relaxed if the $V_{3I}$ are small (see (36)-(39)). Note that within the framework of the MSSM, there is no such suppression.
(2) In most models, the universality of the SSB parameters at the GUT or Planck scale is assumed in order to suppress the SUSY contributions to FCNCs. This assumption has the important consequence that the gluino contributions to proton decay are negligibly small if all the squark masses are degenerate [57]. (In Fig. 3 we show a gaugino dressing diagram which contributes to an effective four-fermion operator.) However, in our case, we do not assume the universality. We have seen in the previous section that the degeneracy of the squark masses of the first two generations is almost exact due to Q6 symmetry. Moreover, to suppress the SUSY contributions to FCNCs, the degeneracy between the first two and the third generations need not be accurate (see Eq. (69) and the discussions below). This means that the gluino dressing diagrams may not be negligible [58] in our case; we will have to investigate them.
We now argue that the first term, the LLLL operator, is the dominant one. It is known that for the RRRR operators, the dominant contributions come from the gluino dressing diagrams. From the superpotential (75), we first observe that the second term in (75), the first RRRR operator, contains two quark fields of the third generation, implying that two small mixing parameters will be multiplied when going to the mass eigenstates. Further, the third term, the second RRRR term, contains only quark fields of the first two generations. That is, the gluino dressing contributions vanish because the squark masses of the first two generations are almost degenerate thanks to Q6 symmetry. Thus, the LLLL operator is the only one which should be considered in the lowest order approximation, as we will do in the following discussions.
Note that the LLLL operator contains two third generation fields $Q_3$ and $L_3$, but the (3, 1) element of the mixing matrix $U_{eL}$ (see (43)) is equal to one, so that it does not act as a suppression factor. The dominant diagrams for the LLLL operator are those with gluino dressing. The zino and photino dressing diagrams have the same structure as the gluino ones, but they are negligibly small because the corresponding gauge couplings are small. So we ignore them in our calculations. We have calculated the higgsino dressing contributions and found that they can be neglected, too, if $\tan\beta < 10$. So we assume this to simplify our calculations. We may further approximate that the squark masses are diagonal in the super-CKM basis. To see this, we recall that the squark mass squared in the super-CKM basis can be written as
a
LL a
LR
, a = u, d, e
m2a = (mfa )2 +
(77)
aT
aT
LR RR
where $m^f_a$ is the diagonal fermion mass matrix of the flavor $a$, and the $\Delta$'s are given in (60). As we can see from (64), (66) and (68), the nondiagonal elements are sufficiently small.
The effective four-Fermi Lagrangian Leff can be obtained from the diagrams shown in Fig. 3,
where as argued we consider only gluino and wino dressings. We find6
1
Leff =
CLL (udue)1M1l ud M uel
2
(4) MP L M=d,s
l=e,
1MNl
CLL (udd)
M N l
ud
d ,
(78)
M,N=d,s
l=e,,
g
w
CLL (udue)1M1l = C LL (udue)1M1l + C LL
(udue)1M1l ,
g
w
CLL (udd)1MNl = C LL (udd)1MNl + C LL
(udd)1MNl ,
(79)
where C g and C w stand for the contributions coming from the gluino and wino dressing diagrams, respectively. These are explicitly calculated to be
q q
43 I 1 I M 3N 3l g q q
g
UuL UdL UdL UeL F aL , bL F g aL , aL ,
C LL (udd)1MNl = 4
3
I =1,2
q q
q
w
1MNl
I 1 I M 3N 3l
UuL
= 42
UdL UdL UeL F w aL , aL + F w bL , bL
CLL (udd)
I =1,2
31 I M I N 3l
UdL UdL UeL
UuL
w q q
q
F aL , bL + F w aL , bL
,
(80)
q q
43 I 1 I M 31 3l g q q
g
C LL (udue)1M1l = 4
UuL UdL UuL UeL F aL , bL F g aL , aL ,
3
I =1,2
q q
q
w
1M1l
I 1 I M 31 3l
UuL
= 42
UdL UuL UeL F w aL , aL + F w bL , bL
C LL (udue)
I =1,2
q q
q
I 1 3M I 1 3l
,
UdL UuL UeL F w aL , bL + F w aL , bL
UuL
(81)
,
F aL , bL =
mg xg1 xg3 xg1 1
xg3 1
xw1 ln xw1 yw3 ln yw3
1
1
w q
F aL , bL =
,
mw xw1 yw3 xw1 1
yw3 1
with
xg,w1 =
m2q
m2g,
w
q
aL ,
q
xg,w3 =
q
m2q
m2g,
w
q
bL ,
yw3 =
m2
2
mw
bL
.
(82)
m2q
m2g
(83)
Then we calculate the decay amplitudes as a function of xg3 for given values of rg Q and yw3 ,
while for simplicity we assume the GUT relation between the wino and gluino mass7
mw = 0.27mg .
(84)
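The cancellation of the gluino dressing for degenerate squarks can be illustrated numerically. The sketch below assumes the standard two-sfermion dressing function $F(x,y) = \frac{1}{M}\,\frac{1}{x-y}\left[\frac{x\ln x}{x-1} - \frac{y\ln y}{y-1}\right]$ with $x, y$ the squared sfermion masses in units of the squared gaugino mass $M^2$, which is the form that the surviving wino expression in (82) suggests. The function names are illustrative, and the gluino combination is modelled as $F(x_1,x_3) - F(x_1,x_1)$, following the structure of (80).

```python
import math


def _g(t):
    """g(t) = t ln t / (t - 1); the removable singularity at t = 1 is Taylor-expanded."""
    if abs(t - 1.0) < 1e-6:
        return 1.0 + 0.5 * (t - 1.0)
    return t * math.log(t) / (t - 1.0)


def _g_prime(t):
    """Derivative of g, used as the degenerate limit of the dressing function."""
    if abs(t - 1.0) < 1e-4:
        return 0.5 - (t - 1.0) / 3.0
    return (t - 1.0 - math.log(t)) / (t - 1.0) ** 2


def dressing(x, y, m_gaugino=1.0):
    """Assumed dressing function F(x, y) = (g(x) - g(y)) / ((x - y) * M)."""
    if abs(x - y) < 1e-9 * max(x, y, 1.0):
        return _g_prime(0.5 * (x + y)) / m_gaugino
    return (_g(x) - _g(y)) / (x - y) / m_gaugino


def gluino_combination(x1, x3, m_gluino=1.0):
    """F(x1, x3) - F(x1, x1): vanishes when the squarks are degenerate."""
    return dressing(x1, x3, m_gluino) - dressing(x1, x1, m_gluino)


def x_wino_from_gluino(x_g, ratio=0.27):
    """Convert m_sq^2/m_gluino^2 to m_sq^2/m_wino^2 using m_wino = ratio * m_gluino."""
    return x_g / ratio ** 2


if __name__ == "__main__":
    print(gluino_combination(2.0, 2.0))   # degenerate squarks
    print(gluino_combination(2.0, 0.5))   # split third generation
```

The degenerate call returns exactly zero, while a split spectrum leaves a finite remainder, which is why the gluino dressing has to be kept once universality is dropped.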
The unitary matrices U(u,d,e)L in CLL are explicitly given in (36), (37) and (43), where the
individual phases u,d and 3u,d in the quark sector are not fixed (see (26)). However, it is found
from (36) and (37) that the phase dependence of the combinations appearing in CLL , that is,
I 1 U I (1,2) and U I 1
I1
UuL
dL
(u,d)L U(u,d)L , is small, and moreover the absolute size of the suppression factor
3I
is independent of the phases. Therefore, in the following calculations we choose
U(u,d)L
u = 3u = 0,
3d d = 1.25
(85)
Fig. 4. The ratio C g (udd)/C w (udd) is plotted as a function of xg3 for rg Q = (10, 1, 0.1, 0.01), which expresses
the degree of the degeneracy of the squark masses, where xg3 is defined in (82). C g s stand for the gluino contributions
and C w s for the wino ones, respectively. The figures in the left column correspond to yw3 = 1, and those in the right
column to yw3 = 10. The corresponding amplitudes are responsible for the anti-neutrino modes.
M)
(p M) =
(86)
(4)2 MP L
32m3P f2
with
Amp p K + = 6 CLL (udd)121l + 2 CLL (udd)112l ,
(87)
112l
w
Fig. 6. Ratio of CLL (udd)112l
Q6 to CLL (udd)MSSM as a function of xg3 , where rg Q = 10 is assumed for
g
112l is the largest coefficient for p K + , and the deCLL (udd)112l
Q , where xg3 is defined in (82). CLL (udd)
6
w (udd)112l .
generacy of the squark masses ( Q = 0) is assumed for CLL
MSSM
Amp p + = 25 CLL (udd)111l ,
Amp p K 0 el+ = 1 CLL (udue)121l ,
Amp p 0 el+ = 5 CLL (udue)111l ,
(88)
(89)
(90)
mP
(F D) = 0.70,
mB
2 = 1 +
mP
(3F + D) = 1.6,
3mB
3 = 1 +
mP
(F + D) = 2.0,
mB
1
5 = (1 + F + D) = 1.6,
2
4 = 2 +
6 =
2mP
F = 2.7,
mB
2mP
D = 0.4,
3mB
(91)
and
D = 0.81,
F = 0.44,
mP = 938 MeV,
p = 0.003 GeV3 ,
mB = 1150 MeV,
f = 131 MeV,
mK = 495 MeV,
Here, $D$, $F$ stand for the coupling constants for the interaction between baryons and mesons, $\beta_p$ for the hadronic matrix element, $f_\pi$ for the pion decay constant, $m_P$ for the proton mass, $m_B$ for the averaged baryon mass, and $m_K$ and $m_\pi$ for the kaon and pion mass, respectively. Further, $A$ is the renormalization group (RG) enhancement factor for the coefficient $C_L$ [29,55], which in our case becomes
$$A = 10.5 \ \text{(for the MSSM)}, \qquad 12 \ \text{(for Q6 model)}. \tag{93}$$
(Since the Q6 model contains four more Higgs doublets than the MSSM, the enhancement factor is slightly larger.)
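The six numerical factors quoted in (91) can be reproduced from the listed inputs $D = 0.81$, $F = 0.44$, $m_P = 938$ MeV and $m_B = 1150$ MeV. In the sketch below the integer keys 1 to 6 follow the subscripts that survive in (91); the symbol carrying those subscripts, the relative signs and the $1/\sqrt{2}$ prefactor are inferred from the requirement that the quoted values come out, so they should be read as assumptions.

```python
import math

# Inputs as listed after (91).
D, F = 0.81, 0.44
M_P, M_B = 938.0, 1150.0  # MeV


def chiral_factors(d=D, f=F, m_p=M_P, m_b=M_B):
    """Return the six factors of (91), keyed by their subscript (assumed forms)."""
    r = m_p / m_b
    return {
        1: 1.0 + r * (f - d),
        2: 1.0 + r * (3.0 * f + d) / 3.0,
        3: 1.0 + r * (f + d),
        4: 2.0 + 2.0 * r * f,
        5: (1.0 + f + d) / math.sqrt(2.0),
        6: 2.0 * r * d / 3.0,
    }


# Values quoted in the text, for comparison.
QUOTED = {1: 0.70, 2: 1.6, 3: 2.0, 4: 2.7, 5: 1.6, 6: 0.4}

if __name__ == "__main__":
    for key, value in chiral_factors().items():
        print(key, round(value, 3), QUOTED[key])
```

All six values agree with the quoted numbers to the precision given in the text.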
As the first task, we would like to compare the decay rates in the MSSM and the Q6 model. By the MSSM decay rates we mean the decay rates which are obtained from the LLLL operators under the assumption of degenerate squark masses (no gluino dressing). We also assume that all the coefficients of the operators are equal to the single constant $C_L^{MSSM}$, and that all the fields appearing in the operators are in the mass eigenstates (no mixing matrix). In Fig. 7, we plot for each decay mode the experimental bound (thick solid line), the partial lifetime calculated in the Q6 model (solid line) and that in the MSSM (dotted line), where we assume
$$C_L = C_L^{MSSM} = 1, \qquad m_{\tilde q} = 1\ \text{TeV}. \tag{94}$$
The lifetime in the Q6 model is calculated from the sum of the gluino and wino contributions for $r_g\Delta_Q = 0.01, 0.1, 1, 10$, which corresponds to the four solid lines. We first see from the figures that if $r_g\Delta_Q$ varies from 0.01 to 10, the lifetime can change by about an order of magnitude. We also see that the decay rates in the Q6 model are much more suppressed than those in the MSSM. Quantitatively, we find that the experimental bounds can be satisfied if the coefficient for the LLLL operator satisfies
$$C_L < 10^{-(4\text{-}5)}, \tag{95}$$
while $C_L^{MSSM} < 10^{-(6\text{-}7)}$ should be satisfied for the case of the MSSM. That is, Q6 flavor symmetry can suppress proton decay by four orders of magnitude (which can also be seen from the figures).
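The step from the two coefficient bounds to "four orders of magnitude" follows from the decay rate being quadratic in the coefficient of the dimension-five operator. A minimal sketch, assuming $\Gamma \propto C_L^2$ with everything else held fixed and taking the midpoints $10^{-4.5}$ and $10^{-6.5}$ of the quoted ranges as representative values:

```python
import math


def orders_of_suppression(c_bound_q6, c_bound_mssm):
    """Orders of magnitude by which the rate is suppressed at equal coefficient.

    Assumes Gamma is proportional to C_L**2, so a bound that is weaker by a
    factor k in C_L corresponds to a rate that is smaller by k**2.
    """
    return 2.0 * math.log10(c_bound_q6 / c_bound_mssm)


if __name__ == "__main__":
    # Midpoints of the ranges in (95); an assumption made for illustration.
    print(orders_of_suppression(10 ** -4.5, 10 ** -6.5))
```

A bound on $C_L$ that is two orders of magnitude weaker therefore corresponds to a rate, or inverse lifetime, that is four orders of magnitude smaller.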
Next we would like to compare our results with those obtained in the minimal SUSY SU(5)
GUT. In our lowest order approximation, only CL is an independent coefficient, implying that
the ratio of partial decay widths is independent of CL . That is, in Q6 model, the relative size
of the partial decay rates is fixed (once the SSB sector is fixed). First we recall the case of the
minimal SUSY SU(5) GUT [32,55]. The superpotential for the baryon number violating effective
dimension-five operators in this case is given by
CLMSSM
SU(5)
W5
1
yu ydl V 2l (Q1 Q1 )(Q2 Ll ) + yc ydl V 1l (Q2 Q2 )(Q1 Ll )
2MHC
(96)
MSSM = 1, m = 1 TeV.
Fig. 7. Partial lifetime of the proton for each decay mode as a function of xg3 with CL = CL
g
The experimental bound (thick solid line), the partial lifetime calculated in Q6 model (solid line) and that in the MSSM
(dotted line) are plotted. The lifetime in Q6 model is calculated from the sum of the gluino and wino contributions for
rg Q = 0.01, 0.1, 1, 10, which corresponds to four solid lines.
in the approximation that the third generation is dropped, where yu,c , ydl are the diagonal Yukawa
couplings of the corresponding quarks, l = 1, 2 is generation index (i.e., d1 = d and d2 = s),
and V is the CKM matrix. In writing (96), a nontrivial assumption is made; the up quark Yukawa
matrix is diagonal over the whole range of energies. Therefore, the superpotential (96) is not a
unique prediction of the model. Under the assumption of the degeneracy of the squark masses,
only the wino dressing diagrams contribute to the decay, and one finds [32,55]
2
2
p K + = m2P m2K yc ys sin2 C 3 ,
2
2
p + = m2P m2 2yc ys sin2 C tan C 25 ,
2
2
p K 0 el+ = m2P m2K yu ydl V 2l cos C 1 ,
2
2
p 0 e+ = m2 m2 yu ydl V sin C 5 Q ,
l
2l
(97)
(98)
(99)
(100)
where we have dropped the corresponding loop functions (82), and the common factor is given
by
=
Ap 2 cos C
MHC
2
1
32m3P f2
(101)
From these partial decay widths, we obtain the relative decay widths in the minimal SU(5) GUT
[32,55]:
2
3
B(p K + ) (m2P m2K )2
=
(102)
2,
B(p + )
(m2P m2 )2 2 25 tan C
B(p K 0 el+ ) (m2P m2K )2 1 cos C 2
(103)
2,
=
B(p 0 el+ )
(m2P m2 )2 5 sin C
yu 1 cos C 2
B(p K 0 + )
=
(104)
6 104 ,
B(p K + )
yc 3 sin2 C
2
yu cos2 C
B(p 0 + )
=
(105)
3 104 .
B(p + )
2 2yc sin2 C
The corresponding results for Q6 model are found to be
32 2
2 UdL
B(p K + ) (m2P m2K )2
1.25,
=
2
31
2
2
1,
B(p + )
(mP m )
25 UdL
32
2
1 UdL
31
B(p K 0 el+ ) (m2P m2K )2 5 UdL
0.5,
31 sin
+ =
2
0
2
2
U
5 103 ,
2
C
1
B(p el )
(mP m )
uL
1
2
2
3l 2
U
(107)
31
5 UuL
1
3l 2 0.2,
31 sin
UuL
C 2 UeL
eL
2.4 104 ,
32
UdL
3l 2 0.5,
B(p 0 el+ ) 1 3l 2 1
31
U
2
= UeL
UeL
uL
5.1 102 ,
B(p + )
2
U 31
B(p K 0 el+ )
=
B(p K + )
(106)
(108)
(109)
dL
where, as in the case of the minimal SUSY GUT, we have suppressed the loop functions. The upper (lower) numbers on the right-hand side correspond to the wino (gluino) contributions. Note that in contrast to the case of the minimal SUSY GUT the mixing parameters explicitly appear, reflecting the flavor structure of the present Q6 model. The most remarkable difference is in the charged lepton modes. As we see from (108) and (109), they are proportional to $|U_{eL}^{3l}|^2$, where the unitary matrix $U_{eL}$ (which rotates the left-handed charged leptons) is explicitly given in (43). We find
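$$\frac{B(p \to K^0\mu^+)}{B(p \to K^0 e^+)} = \frac{B(p \to \pi^0\mu^+)}{B(p \to \pi^0 e^+)} = \frac{|U_{eL}^{32}|^2}{|U_{eL}^{31}|^2} = \left(\frac{m_e}{m_\mu}\right)^2 \simeq 2.37 \times 10^{-5}. \tag{110}$$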
The same ratio in the case of the minimal SUSY GUT becomes $B(p \to K^0\mu^+) \sim 10^3\, B(p \to K^0 e^+)$ if we use the superpotential (96). In Figs. 8 and 9 we plot four different combinations of ratios as a function of $x_{g3}$ for various values of $r_g\Delta_Q$ in the Q6 model. The figures on the left (right) side correspond to $y_{w3} = 1\,(10)$. From the figures, we find the hierarchical structure of the decay modes:
$$B(p \to K^0 e^+) < B(p \to \pi^0 e^+) < B(p \to \pi^+\bar\nu) < B(p \to K^+\bar\nu). \tag{111}$$
In the case of the minimal SUSY GUT we obtain instead
$$B(p \to \pi^0 e^+) < B(p \to K^0 e^+) \ll B(p \to \pi^+\bar\nu) < B(p \to K^+\bar\nu). \tag{112}$$
Fig. 8. Ratio of partial decay rates for each decay mode with rg Q = (10, 1, 0.1, 0.01). Figs. in the left column are for
yw3 = 1, and the right column for yw3 = 10.
Note, however, that although $B(p \to \pi^0 e^+) < B(p \to K^+\bar\nu)$, they are basically of the same order in the Q6 model. That is, the Q6 model predicts that once the decay mode $K^+\bar\nu$ is experimentally observed, the decay mode $\pi^0 e^+$ is likely to be observed too, in sharp contrast to the case of the minimal SUSY SU(5) GUT.
5. Conclusion
Flavor symmetry can play important roles in supersymmetric models [61]. We have investigated the SUSY contributions to FCNCs and to proton decay in a supersymmetric extension of the SM based on a binary dihedral family group Q6. We have seen that the discrete low energy flavor symmetry Q6 can be an alternative to the universality assumption of the soft supersymmetry breaking parameters. That is, the existence of a hidden sector, in which supersymmetry is assumed to be broken in a flavor blind manner, is not indispensable in this model. Therefore, a variety of supersymmetry breaking mechanisms may become phenomenologically viable.
It has turned out from our analysis of FCNCs that the degeneracy requirement on the soft scalar masses of the left-handed squarks in this model can be significantly relaxed. This has a considerable effect on the gluino-mediated one-loop amplitudes for proton decay. In most of the previous calculations, the degeneracy was assumed, so that the gluino contributions cancel each other, which implies that the wino contributions are the dominant ones. We have found that nondegenerate squarks can change the decay rate in the charged lepton modes by an order of magnitude. We have also found that in the lowest order approximation there is only one independent coefficient for the dimension-five operators that lead to proton decay.
Fig. 9. Ratio of partial decay rates for each decay mode with rg Q = (10, 1, 0.1, 0.01). Figures in the left column are
for yw3 = 1, and the right column for yw3 = 10.
Consequently, the relative partial decay rates in this approximation are fixed, reflecting the flavor
structure dictated by Q6 .
Our main finding is that Q6 flavor symmetry acts in such a way that the smallness of $(V_{MNS})_{e3}$, the suppression of $\mu \to e + \gamma$, and the smallness of $B(p \to K^0(\pi^0) + \mu^+)/B(p \to K^0(\pi^0) + e^+)$ have the same origin: the electron is much lighter than the muon and the tau. They in fact vanish in the $m_e \to 0$ limit.
Acknowledgements
We would like to thank J. Hisano, H. Nakano, T. Singai and K. Tobe for useful discussions.
E.I. is supported by Research Fellowship of the Japan Society for the Promotion of Science
(JSPS) for Young Scientists (No. 16-07971). This work is supported by the Grants-in-Aid for
Scientific Research from the Japan Society for the Promotion of Science (# 13135210).
References
[1] G. 't Hooft, Naturalness, chiral symmetry, and spontaneous chiral symmetry breaking, in: Recent Developments in Gauge Theories, Cargèse, 1979.
[2] M. Veltman, Acta Phys. Pol. B 12 (1981) 437.
[3] S.P. Martin, hep-ph/9709356.
[4] D.J.H. Chung, et al., Phys. Rep. 407 (2005) 1.
[5] S. Dimopoulos, D. Sutter, Nucl. Phys. B 452 (1995) 496.
[6] L. Hall, V.A. Kostelecky, S. Raby, Nucl. Phys. B 267 (1986) 415;
F. Gabbiani, A. Masiero, Phys. Lett. B 209 (1988) 289.
M. Gell-Mann, P. Ramond, R. Slansky, in: P. van Nieuwenhuizen, D.Z. Freedman (Eds.), Supergravity, North-Holland, Amsterdam, 1979;
R.N. Mohapatra, G. Senjanovic, Phys. Rev. Lett. 44 (1980) 912.
[50] H. Minakata, H. Sugiyama, O. Yasuda, K. Inoue, F. Suekane, Phys. Rev. D 68 (2003) 033017;
H. Minakata, H. Sugiyama, O. Yasuda, K. Inoue, F. Suekane, Phys. Rev. D 70 (2004) 059901, Erratum;
O. Yasuda, hep-ph/0309333;
H. Minakata, Nucl. Phys. B (Proc. Suppl.) 137 (2004) 74;
P. Huber, M. Lindner, M. Rolinec, T. Schwetz, W. Winter, Phys. Rev. D 70 (2004) 073014;
H. Sugiyama, O. Yasuda, F. Suekane, G.A. Horton-Smith, hep-ph/0409109.
[51] M. Maltoni, T. Schwetz, M.A. Tórtola, J.W.F. Valle, New J. Phys. 6 (2004) 122.
[52] H.V. Klapdor-Kleingrothaus, et al., Eur. Phys. J. A 12 (2001) 147;
C.E. Aalseth, et al., Phys. At. Nucl. 63 (2000) 1268;
H.V. Klapdor-Kleingrothaus, A. Dietz, H.L. Harney, I.V. Krivosheina, Mod. Phys. Lett. A 16 (2001) 2409.
[53] D.N. Spergel, et al., Astrophys. J. Suppl. 148 (2003) 175.
[54] S. Dimopoulos, S. Raby, F. Wilczek, Phys. Lett. B 112 (1982) 133;
P. Nath, A.H. Chamseddine, R. Arnowitt, Phys. Rev. D 32 (1985) 2348.
[55] J. Hisano, H. Murayama, T. Yanagida, Nucl. Phys. B 402 (1993) 46.
[56] V. Ben-Hamo, Y. Nir, Phys. Lett. B 339 (1994) 77;
M. Kakizaki, M. Yamaguchi, JHEP 0206 (2002) 032;
R. Harnik, D.T. Larson, H. Murayama, M. Thormeier, Nucl. Phys. B 706 (2005) 372.
[57] V.M. Belyaev, M.I. Vysotsky, Phys. Lett. B 127 (1983) 215;
J. Ellis, S. Hagelin, D.V. Nanopoulos, K. Tamvakis, Phys. Lett. B 124 (1983) 484.
[58] S. Chadha, G.D. Coughlan, M. Daniel, G.G. Ross, Phys. Lett. B 149 (1984) 477;
J. Milutinovic, P.B. Pal, G. Senjanovic, Phys. Lett. B 140 (1984) 324;
M. McDonald, C.E. Vayonakis, Phys. Lett. B 163 (1985) 148.
[59] T. Goto, T. Nihei, J. Arafune, Phys. Rev. D 52 (1995) 505.
[60] T. Goto, T. Nihei, Phys. Rev. D 59 (1999) 115009.
[61] E. Bilgin, B. Patt, D. Tucker-Smith, F. Wilczek, hep-ph/0509075;
B. Patt, D. Tucker-Smith, F. Wilczek, hep-ph/0509295.
Received 14 September 2005; received in revised form 11 January 2006; accepted 8 March 2006
Available online 27 March 2006
Abstract
We demonstrate the stability under subsequent-to-leading logarithm corrections of the quartic scalar-field coupling constant $\lambda$ and the running Higgs boson mass obtained from the (initially massless) effective potential for radiatively broken electroweak symmetry in the single-Higgs-doublet Standard Model. Such subsequent-to-leading logarithm contributions are systematically extracted from the renormalization group equation considered beyond one-loop order. We show $\lambda$ to be the dominant coupling constant of the effective potential for the radiatively broken case of electroweak symmetry. We demonstrate the stability of $\lambda$ and the running Higgs boson mass through five orders of successively subleading logarithmic corrections to the scalar-field-theory projection of the effective potential for which all coupling constants except the dominant coupling constant $\lambda$ are disregarded. We present a full next-to-leading logarithm potential in the three dominant Standard Model coupling constants (t-quark-Yukawa, $\alpha_s$, and $\lambda$) from these coupling constants' contribution to two-loop $\beta$- and $\gamma$-functions. Finally, we demonstrate the manifest order-by-order stability of the physical Higgs boson mass in the 220-231 GeV range. In particular, we obtain a 231 GeV physical Higgs boson mass inclusive of the t-quark-Yukawa and $\alpha_s$ coupling constants to next-to-leading logarithm order, and inclusive of the smaller SU(2) × U(1) gauge coupling constants to leading logarithm order.
© 2006 Elsevier B.V. All rights reserved.
* Corresponding author.
1. Introduction
The motivation for radiative electroweak symmetry breaking, as first considered by Coleman and Weinberg [1], draws its roots all the way back to the first formulation of the hierarchy problem [2] for embedding an SU(2) × U(1) electroweak gauge theory within a large grand unified theory. As Sher points out in his review of radiative symmetry breaking [3], any scalar field mass term within the SU(2) × U(1) Lagrangian would necessarily be expected to have a magnitude sensitive to GUT-level mass scales via higher order processes involving the embedding theory. However, indirect empirical bounds [4] do not accommodate a Higgs boson mass appreciably larger than the electroweak vacuum expectation value $v = 246$ GeV. Such a scalar field mass in conventional spontaneous symmetry breaking could occur only if the scalar-field mass term in the electroweak Lagrangian were exceedingly finely tuned to cancel off GUT-level scales from successive perturbative corrections arising within the embedding theory. A more natural approach would be to assume that the embedding theory includes some symmetry (e.g., conformal invariance) that serves to protect the scalar-field mass term from such GUT-scale corrections.
Radiative symmetry breaking assumes this protective symmetry is exact, so that no scalar-field mass term appears in the electroweak Lagrangian. In the absence of such a mass term, Coleman and Weinberg [1] found the one-loop electroweak effective potential to be of the form
$$V_{\rm eff}^{(1L)} = \frac{\lambda\phi^4}{4} + \phi^4\left[\frac{9g_2^4 + 6g'^2g_2^2 + 3g'^4 + 192\lambda^2 - 48g_Y^4}{1024\pi^2}\,\log\frac{\phi^2}{\mu^2} + K\right], \tag{1.1}$$
where $\lambda$ is the quartic scalar-field interaction coupling constant, $g_2$ and $g'$ are the SU(2) × U(1) gauge interaction coupling constants, $g_Y$ is the dominant electroweak Yukawa coupling constant, $\mu$ is an arbitrary renormalization scale necessarily occurring as a by-product of the removal of infinities from 1-loop graphs, and $K$ is the finite coefficient of $\phi^4$ after such $O(\phi^4)$ infinities have been subtracted.
At the time of Coleman and Weinberg's paper, there was no evidence for a massive t-quark; the magnitude of $g_Y$ could be reasonably assumed to be comparable to the b-quark Yukawa coupling constant $g_b^2 = 2(m_b/v)^2 \cong 8 \times 10^{-4}$. Any such comparable $g_Y^2$ would be ignorable relative to the corresponding electroweak gauge coupling constants $g_2^2 = e^2/\sin^2\theta_w \cong 0.44$, $g'^2 = e^2/\cos^2\theta_w \cong 0.13$. Thus the gauge coupling constant terms were anticipated to dominate the opposite-sign Yukawa coupling constant term in the coefficient of the logarithm in (1.1).
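The relative size of the terms in the logarithm's coefficient can be checked directly. The sketch below assumes $\alpha = 1/128$, $\sin^2\theta_w = 0.223$, $m_b = 4.8$ GeV and $m_t = 175$ GeV; none of these inputs is stated in the text, and they are chosen because they reproduce the quoted coupling values.

```python
import math

V = 246.0            # GeV, electroweak vev quoted in the text
ALPHA = 1.0 / 128.0  # assumed electromagnetic coupling near the weak scale
SIN2_W = 0.223       # assumed weak mixing angle


def yukawa_squared(mass, v=V):
    """g_f^2 = 2 (m_f / v)^2."""
    return 2.0 * (mass / v) ** 2


def gauge_couplings_squared(alpha=ALPHA, sin2_w=SIN2_W):
    """Return (g_2^2, g'^2) from e^2 = 4 pi alpha."""
    e2 = 4.0 * math.pi * alpha
    return e2 / sin2_w, e2 / (1.0 - sin2_w)


def gauge_term(g2_sq, gp_sq):
    """9 g_2^4 + 6 g'^2 g_2^2 + 3 g'^4, the gauge part of the log coefficient."""
    return 9.0 * g2_sq ** 2 + 6.0 * gp_sq * g2_sq + 3.0 * gp_sq ** 2


def yukawa_term(gy_sq):
    """48 g_Y^4, the opposite-sign Yukawa part of the log coefficient."""
    return 48.0 * gy_sq ** 2


if __name__ == "__main__":
    g2_sq, gp_sq = gauge_couplings_squared()
    print("g2^2, g'^2:", round(g2_sq, 3), round(gp_sq, 3))
    print("gauge term:", gauge_term(g2_sq, gp_sq))
    print("b-like Yukawa term:", yukawa_term(yukawa_squared(4.8)))
    print("top Yukawa term:", yukawa_term(yukawa_squared(175.0)))
```

With a b-like Yukawa coupling the Yukawa term is several orders of magnitude below the gauge term, whereas with the physical top mass it exceeds the gauge term by more than a factor of twenty.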
Coleman and Weinberg chose to determine the finite counterterm $K$ by insisting that $\mu$ be the scale at which $d^4V_{\rm eff}^{(1L)}/d\phi^4$ coincides with $d^4V_{\rm tree}/d\phi^4 = 6\lambda$. Ignoring $g_Y$ they found that
$$K = -\frac{25}{6}\,\frac{9g_2^4 + 6g'^2g_2^2 + 3g'^4 + 192\lambda^2}{1024\pi^2}. \tag{1.2}$$
Since d 4 Veff
/d 4 is infinite at = 0, Coleman and Weinberg necessarily had to choose a non(1L)
zero value of to serve as the scale of Veff . By identifying this scale with the electroweak
vacuum expectation value v, i.e. by identifying the arbitrary parameter in Eq. (1.1) with the
(1L)
electroweak minimum v, one finds from the requirement dVeff /d|v = 0 that
11 4
3g2 + 2g 2 g22 + g 4 + 642 ,
2
256
(1.3)
106
where all coupling constants in (1.3) are evaluated at the vev scale [i.e., = (v)]. Although
(1.3) is formally a quadratic equation in , Coleman and Weinberg noted the existence of a small
11
4
2 2
4
solution [
= 256
2 [3g2 + 2g g2 + g ]] for which radiative symmetry breaking would be
perturbatively viable.
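As a rough numerical illustration of Eq. (1.3), the sketch below solves the quadratic for $\lambda$ using the gauge couplings quoted above ($g_2^2 \cong 0.44$, $g'^2 \cong 0.13$). The function and variable names are illustrative choices, not notation from Ref. [1]; the point is only that one root is small and nearly equal to the gauge-only estimate, while the other is of order unity.

```python
import math

# Gauge couplings quoted in the text (approximate values).
g2_sq, gp_sq = 0.44, 0.13

def lambda_roots(g2_sq, gp_sq):
    """Both roots of Eq. (1.3), rewritten as a*lam**2 - lam + c = 0."""
    pref = 11.0 / (256.0 * math.pi**2)
    a = 64.0 * pref
    c = pref * (3 * g2_sq**2 + 2 * gp_sq * g2_sq + gp_sq**2)
    disc = math.sqrt(1.0 - 4.0 * a * c)
    return (1.0 - disc) / (2.0 * a), (1.0 + disc) / (2.0 * a)

lam_small, lam_large = lambda_roots(g2_sq, gp_sq)

# Gauge-only estimate obtained by dropping the 64*lam**2 term in Eq. (1.3).
lam_gauge_only = 11.0 / (256.0 * math.pi**2) * (
    3 * g2_sq**2 + 2 * gp_sq * g2_sq + gp_sq**2
)

print(f"small root  : {lam_small:.5f}")
print(f"gauge-only  : {lam_gauge_only:.5f}")
print(f"large root  : {lam_large:.3f}")
```

The small root is the perturbatively viable solution referred to above; the large root lies well outside the perturbative regime.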
This approach, however, fails upon incorporation of the physical t-quark's Yukawa couplant:
$g_Y^2 = g_t^2 = 2(m_t/v)^2 \cong 1.0$ in Eq. (1.1). Then the opposite-sign Yukawa coupling-constant term
is seen to dominate the gauge coupling-constant terms, in which case equations (1.2) and (1.3)
get replaced by relations
$$K \cong -\frac{25}{128\pi^2}\left(4\lambda^2 - g_t^4\right), \tag{1.4}$$
$$\lambda = \frac{11}{16\pi^2}\left(4\lambda^2 - g_t^4\right). \tag{1.5}$$
As noted in Ref. [5], there is no solution to (1.5) for $\lambda$ sufficiently small to be perturbatively
viable. However, this does not mean that radiative electroweak symmetry breaking is impossible for empirical electroweak coupling constants. Rather, it means that the one-loop effective
potential (1.1) is an inappropriate choice for radiative electroweak symmetry breaking; leading-logarithm two-loop electroweak potential terms are comparable to or larger than one-loop terms [3].
In Refs. [5,6] it is argued that a potential based upon the summation of its leading-logarithm contributions leads not only to a Higgs boson mass within indirect empirical bounds [6], but also
to a substantially reduced value for $\lambda$ that may be sufficiently small for perturbative stability
of the physics extracted from the effective potential under its subsequent-to-leading-logarithm
corrections.
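The absence of a small-$\lambda$ solution can be checked directly. The sketch below solves Eq. (1.5) as a quadratic in $\lambda$, assuming $g_t^2 = 1.0$ as quoted above; names are illustrative. One root is negative and the other is of order a few, so neither is a small positive coupling.

```python
import math

g_t_sq = 1.0  # g_t^2 ~ 1.0, as quoted in the text

def lambda_roots_yukawa(g_t_sq):
    """Roots of Eq. (1.5): lam = p*(4*lam**2 - g_t**4) with p = 11/(16*pi**2)."""
    p = 11.0 / (16.0 * math.pi**2)
    # Rearranged: 4*p*lam**2 - lam - p*g_t**4 = 0
    disc = math.sqrt(1.0 + 16.0 * p**2 * g_t_sq**2)
    return (1.0 - disc) / (8.0 * p), (1.0 + disc) / (8.0 * p)

lam_neg, lam_pos = lambda_roots_yukawa(g_t_sq)
print(f"negative root: {lam_neg:.4f}")
print(f"positive root: {lam_pos:.3f}")
```

The only positive root is far too large for the one-loop potential (1.1) to be trusted, which is the failure described in the paragraph above.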
In the present paper we address this issue by showing reasonable stability of the predictions of Refs. [5,6] under subsequent-to-leading-logarithm corrections obtained via known
renormalization-group functions for the electroweak couplings. On a more general level, the
present paper demonstrates how to formulate radiative symmetry breaking for effective potentials subject to a large destabilizing Yukawa couplant. This is of value even if our specific choice
of electroweak potential, which is considered in the present paper to arise from a single Higgs
doublet, is incorrect. We have seen how a large Yukawa couplant necessarily eliminates any small-$\lambda$
(i.e. $\lambda \sim g_2^4$) solution. The formulation of predictive results from radiative symmetry breaking
when $\lambda$ is not fortuitously small, as in Ref. [1], but rather is the dominant couplant in the effective potential, is important both for electroweak symmetry breaking and for cosmological
applications [7].
To develop these ideas further, we first review in Section 2 the leading logarithm results of
Refs. [5,6]. These results, incorporating the largest Standard Model couplants
$$x = \frac{g_t^2(v)}{4\pi^2},\qquad y = \frac{\lambda(v)}{4\pi^2},\qquad z = \frac{\alpha_s(v)}{\pi}, \tag{1.6}$$
$$r = \frac{g_2^2(v)}{4\pi^2}\quad\text{and}\quad s = \frac{(g'(v))^2}{4\pi^2}, \tag{1.7}$$
are indicative of a Higgs boson mass of 218 GeV, in conjunction with the dominance of the
scalar self-interaction couplant $y$ over all other Standard Model couplants.
In Section 3, we consider the scalar field theory projection (SFTP) of the effective potential,
the approximation in which all Standard Model couplants except $y$ are ignored. Such a theory has
renormalization group functions $\beta_y(y)$ and $\gamma(y)$ that are known to 5-loop order. We are therefore
able to construct successive approximations to the full SFTP involving the summation of leading
and four successively subleading logarithms in the full effective potential series.
Since the SFTP is not scale-free (by virtue of gauge-sector interactions, its vacuum expectation
value $v$ is constrained to satisfy $M_W = g_2 v/2$, or alternatively, $v = G_F^{-1/2}\,2^{-1/4}$), we are
able to calculate the running Higgs boson mass $[V''(v)]^{1/2}$ for each successive order. We find,
surprisingly, that the leading logarithm SFTP reproduces quite closely the full leading logarithm
results of Section 2, despite the deliberate omission of other Standard Model couplants {x, z, r, s}
from the SFTP potential.
In Section 4, we develop an explicit set of successive subleading-logarithm approximations
to the full SFTP potential. We then demonstrate for this potential that the running Higgs boson
mass $[V''(v)]^{1/2}$ ($\cong 226$ GeV) and the scalar field self-interaction couplant $y(v)$ ($\cong 0.054$) are
remarkably stable as the order of these approximations increases.
In Section 5, we demonstrate how known two-loop renormalization group functions involving
the dominant Standard Model couplants {x, y, z} can be utilized to obtain leading and next-to-leading logarithm corrections to the SFTP resulting from turning on the t-quark Yukawa
interaction. Thus, we are able to include the next-to-leading logarithm corrections to the leading-logarithm potential of Refs. [5,6].
In Section 6 we find such corrections alter the running Higgs boson mass obtained from
the SFTP potential by only a few GeV, depending upon the order of the SFTP employed. In
particular, we find the fully NLL extension of the LL effective potential expressed in terms of
dominant SM couplants x, y, z [5,6] leads to a running Higgs boson mass of 228 GeV and a
corresponding couplant $y(v) = 0.0531$.
Finally, in Section 7, we discuss further aspects of subleading logarithm corrections to the
effective potential series. We first consider augmentation of the results of Section 6 by the
leading-logarithm contributions of the relatively small electroweak gauge couplants r and s to
the effective potential. Upon incorporation of these contributions, the running Higgs boson mass
is found to be quite stable at 224 GeV over successive orders of SFTP scaffolding. The couplant
$y(v) \cong 0.054$ is similarly shown to exhibit stability over four SFTP orders. The fully NLL results in {x, y, z} described in the previous paragraph are altered slightly upon incorporation of
LL corrections in {r, s} to a Higgs boson mass of 231 GeV, with $y(v) = 0.054$. We also show
that the physical Higgs boson mass differs by less than 0.3 GeV from the mass obtained from
the effective potential taken to NLL order in contributions from $g_t$ and $\alpha_s$, and to LL order
in contributions from the SU(2) × U(1) gauge coupling constants. We discuss the phenomenological differences anticipated between radiatively and conventionally broken electroweak
symmetry. We find surprising lowest-order agreement between both approaches for a number
of processes [$W^+W^- \to (ZZ, W^+W^-)$; $H \to (ZZ, W^+W^-)$], though we do find an enhancement of
two-Higgs scattering processes like $W^+W^- \to HH$ for the radiative case over the conventional
symmetry breaking case. We conclude with a summary of the methodology employed toward
obtaining order-by-order stability in the physics extracted from radiative electroweak symmetry
breaking.
2. Review of leading logarithm results
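In Refs. [5,6], the summation of leading-logarithm contributions to the effective potential of
the Standard Model is expressed in terms of its dominant three couplants
$$x = \frac{g_t^2(v)}{4\pi^2} = 0.0253,\qquad y = \frac{\lambda(v)}{4\pi^2},\qquad z = \frac{\alpha_s(v)}{\pi} = 0.0329, \tag{2.1}$$
where $v = \langle\phi\rangle$, the vacuum expectation value. The leading logarithm contribution is of the series
form
$$\pi^2\phi^4 S_{LL} = \pi^2\phi^4\sum_{n=0}^{\infty}x^n\sum_{k=0}^{\infty}y^k\sum_{\ell=0}^{\infty}z^\ell\,C_{n,k,\ell}\,L^{n+k+\ell-1},\qquad L \equiv \log\left(\phi^2/v^2\right),\qquad C_{0,1,0} = 1. \tag{2.2}$$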
This effective potential is shown in Ref. [6] to include explicitly the one-loop contributions
$C_{0,2,0} = 3$, $C_{2,0,0} = -3/4$ [all other $C_{i,j,k}$ with $i + j + k = 2$ are equal to zero] derived from
Feynman graphs in Ref. [1]. All remaining coefficients $C_{n,k,\ell}$ in (2.2) are determined by the
renormalization group (RG) equation, in which all RG functions are evaluated to one-loop order:
$$\left[(-2 - 2\gamma)\frac{\partial}{\partial L} + \beta_x\frac{\partial}{\partial x} + \beta_y\frac{\partial}{\partial y} + \beta_z\frac{\partial}{\partial z} - 4\gamma\right]S = 0. \tag{2.3}$$
Only one-loop RG functions enter, because $S_{LL}$ is determined in full by those contributions
to the differential operator on the left-hand side of (2.3) that either lower the degree of $L$ by one
or raise the aggregate power of couplants by one [5,6]. The leading logarithm effective potential
may be expressed as a power series in the logarithm $L$ (footnote 1):
$$V_{\rm eff} = \pi^2\phi^4\left(S_{LL} + K\right) = \pi^2\phi^4\left(A + BL + CL^2 + DL^3 + EL^4 + \cdots\right), \tag{2.4}$$
where the constant $K$ includes all finite $\phi^4$ counterterms remaining after divergent contributions
from $\phi^4$ graphs degree-2 and higher in couplant powers are removed. Thus the purely $\phi^4$ coefficient $A$ is equal to $y + K$, and coefficients {B, C, D, E} are explicitly obtained in Refs. [5,6]
via Eq. (2.3) as degree {2, 3, 4, 5} polynomials in the couplants x, y, z. The unknown couplant
$y(v)$ and finite counterterm $\pi^2K\phi^4$ in $V_{\rm eff}$ are numerically determined by the simultaneous application of Coleman and Weinberg's renormalization conditions [1], which impose a minimum
at $\phi = v$ and which define the quartic scalar-field interaction couplant at $\phi = v$ (footnote 2):
$$V'_{\rm eff}(v) = 0 \;\Rightarrow\; K = -\frac{B}{2} - y, \tag{2.5}$$
$$V^{(4)}_{\rm eff}(v) = \frac{d^4}{d\phi^4}\left(\frac{\lambda\phi^4}{4}\right) = 24\pi^2y \;\Rightarrow\; y = \frac{11}{3}B + \frac{35}{3}C + 20D + 16E. \tag{2.6}$$
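The numerical weights in conditions (2.5) and (2.6), and in the mass formula (2.9) used later, follow from differentiating $\phi^4L^n$ with $L = \log(\phi^2/v^2)$. A compact way to obtain them, sketched below under the assumption that the generating identity $\phi^4 e^{tL} = \phi^{4+2t}v^{-2t}$ is used, is to expand $\prod_j (4 - j + 2t)$ in powers of $t$; the helper names are illustrative.

```python
from math import factorial

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative_weights(order, power=4):
    """Weights w_n with d^order/dphi^order [phi^power * L^n] = w_n * v^(power-order) at phi = v.

    Uses phi^power * exp(t*L) = phi^(power + 2t) * v^(-2t); the coefficient of
    t^n / n! in prod_j (power - j + 2t) is the weight multiplying L^n.
    """
    gen = [1]
    for j in range(order):
        gen = poly_mul(gen, [power - j, 2])
    return [factorial(n) * c for n, c in enumerate(gen)]

w1 = derivative_weights(1)   # first derivative:  4A + 2B = 0
w2 = derivative_weights(2)   # second derivative: 12A + 14B + 8C
w4 = derivative_weights(4)   # fourth derivative: 24A + 100B + 280C + 480D + 384E
print(w1, w2, w4)
```

With $A = y + K$, the first set of weights gives $K = -B/2 - y$; inserting this into the fourth-derivative weights and equating to $24y$ gives $88B + 280C + 480D + 384E = 24y$, i.e. Eq. (2.6). The second-derivative weights with $A = -B/2$ give $8(B + C)$, the combination appearing in the running Higgs mass.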
Footnote 1: We also note that the RG equation (2.3) can be applied directly to the expression Eq. (2.4) as in [8].
Footnote 2: For the one-loop potential of Coleman and Weinberg (in which C = D = E = 0), these conditions imply that $K = -25B/6$.

In terms of the dominant three couplants (2.1), one finds from Eqs. (2.5) and (2.6) that
$$K = -y + \frac{3x^2}{8} - \frac{3y^2}{2}, \tag{2.7}$$
$$\begin{aligned}
y ={}& 11y^2 - \frac{11}{4}x^2 + 105y^3 + \frac{105}{4}xy^2 - \frac{105}{4}x^2y + \frac{35}{2}x^2z - \frac{105}{32}x^3 \\
&+ 540y^4 + 270xy^3 - 30xy^2z + 60x^2yz - \frac{1125}{8}x^2y^2 - \frac{115}{2}x^2z^2 + \frac{75}{4}x^3z - \frac{225}{4}x^3y + \frac{495}{64}x^4 \\
&+ 1296y^5 + 972xy^4 - 144xy^3z + \frac{45}{2}xy^2z^2 - 69x^2yz^2 - 270x^2y^3 + \frac{531}{4}x^2y^2z + \frac{345}{4}x^2z^3 \\
&- \frac{603}{16}x^3z^2 + \frac{207}{2}x^3yz - \frac{8343}{32}x^3y^2 - \frac{459}{32}x^4z + \frac{135}{32}x^4y + \frac{837}{64}x^5,
\end{aligned} \tag{2.8}$$
where $x = x(v) = 0.0253$ and $z = z(v) = 0.0329$ in the above two equations. Thus Eq. (2.8) is a
fifth order polynomial equation for $y$ ($\equiv y(v)$), with solutions $y = \{0.0538, -0.278, -0.00143\}$.
Since $y$ must be a positive-definite couplant, only the first of these solutions is viable ($y(v) =
0.053829$), in which case the finite counterterm $K$ is found from Eq. (2.7) to be $-0.057935$.
The running Higgs boson mass [9], which is defined from the second derivative of the effective
potential at its minimum $v$,
$$m_H^2 = V''_{\rm eff}(v) = 8\pi^2v^2(B + C), \tag{2.9}$$
is found from the physical values (2.1) for $x(v)$ and $z(v)$ and the numerical solution to Eq. (2.8)
$y(v) = 0.0538$ to be $m_H = 216$ GeV.
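The leading-logarithm numbers of this section can be reproduced with a short script. The sketch below is not the procedure of Refs. [5,6] verbatim: it assumes the one-loop RG functions $\gamma = 3x/4$, $\beta_x = \frac{9}{4}x^2 - 4xz$, $\beta_y = 6y^2 + 3xy - \frac{3}{2}x^2$, $\beta_z = -\frac{7}{2}z^2$ (signs as reconstructed from the surrounding equations), takes $v = 246.2$ GeV, and generates {B, C, D, E} from the order-by-order consequence of Eq. (2.3), $2n\,P_n = \sum_i \beta_i\,\partial P_{n-1}/\partial g_i - 4\gamma P_{n-1}$, with $P_0 = y$. All helper names are illustrative.

```python
import math
from fractions import Fraction as Fr

# Polynomials in the couplants (x, y, z) are dicts {(i, j, k): coefficient}.
def add(*polys):
    out = {}
    for p in polys:
        for mono, c in p.items():
            out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for (a, b, c), u in p.items():
        for (d, e, f), w in q.items():
            mono = (a + d, b + e, c + f)
            out[mono] = out.get(mono, 0) + u * w
    return {m: c for m, c in out.items() if c != 0}

def scale(p, s):
    return {m: s * c for m, c in p.items()}

def diff(p, var):
    """Partial derivative with respect to couplant number var (0=x, 1=y, 2=z)."""
    out = {}
    for mono, c in p.items():
        if mono[var] > 0:
            lowered = list(mono)
            lowered[var] -= 1
            out[tuple(lowered)] = c * mono[var]
    return out

def evaluate(p, x, y, z):
    return sum(float(c) * x**i * y**j * z**k for (i, j, k), c in p.items())

# One-loop RG functions (assumed signs; see lead-in).
GAMMA = {(1, 0, 0): Fr(3, 4)}
BETAS = [
    {(2, 0, 0): Fr(9, 4), (1, 0, 1): Fr(-4)},                    # beta_x
    {(0, 2, 0): Fr(6), (1, 1, 0): Fr(3), (2, 0, 0): Fr(-3, 2)},  # beta_y
    {(0, 0, 2): Fr(-7, 2)},                                      # beta_z
]

def next_coefficient(prev, n):
    """Coefficient of L^n from the coefficient of L^(n-1), at leading-log order."""
    terms = [mul(beta, diff(prev, var)) for var, beta in enumerate(BETAS)]
    terms.append(scale(mul(GAMMA, prev), -4))
    return scale(add(*terms), Fr(1, 2 * n))

A = {(0, 1, 0): Fr(1)}
B = next_coefficient(A, 1)
C = next_coefficient(B, 2)
D = next_coefficient(C, 3)
E = next_coefficient(D, 4)

X, Z, V_EW = 0.0253, 0.0329, 246.2   # x(v), z(v), and v in GeV

def condition(y):
    """Eq. (2.6) written as f(y) = 0."""
    b, c, d, e = (evaluate(p, X, y, Z) for p in (B, C, D, E))
    return 11 * b / 3 + 35 * c / 3 + 20 * d + 16 * e - y

lo, hi = 0.01, 0.1                    # bracket for the positive root
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if condition(lo) * condition(mid) <= 0:
        hi = mid
    else:
        lo = mid
y_v = 0.5 * (lo + hi)

b_v, c_v = evaluate(B, X, y_v, Z), evaluate(C, X, y_v, Z)
K = -b_v / 2 - y_v                                      # Eq. (2.5)
m_H = math.sqrt(8 * math.pi**2 * V_EW**2 * (b_v + c_v))  # Eq. (2.9)
print(f"y(v) = {y_v:.5f}, K = {K:.5f}, m_H = {m_H:.1f} GeV")
```

Exact rational arithmetic is used for the polynomial coefficients so that they can be compared term by term with the polynomial in Eq. (2.8); only the final root-finding step is done in floating point.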
When one augments the dominant couplants (2.1) with the next-largest couplants in the Standard Model, the electroweak gauge couplants
$$r = \frac{[g_2(v)]^2}{4\pi^2} = 0.0109,\qquad s = \frac{[g'(v)]^2}{4\pi^2} = 0.00324, \tag{2.10}$$
one finds from the additional r- and s-dependent contributions to {B, C, D, E}, as listed in the
erratum to Ref. [6], that the numerical solution to Eq. (2.6) for $y(v)$ is altered slightly from
0.05383 to 0.05448. Correspondingly, we find from Eq. (2.5) that the finite counterterm $K$ is
also altered from $-0.0579$ to
$$K = -y + \frac{3x^2}{8} - \frac{3y^2}{2} - \frac{3s^2}{128} - \frac{9r^2}{128} - \frac{3rs}{64} = -0.058703. \tag{2.11}$$
Most importantly, however, the running Higgs boson mass (2.9) is elevated only minimally
($m_H = 218$ GeV) from its 216 GeV value when electroweak gauge couplants are omitted. Thus,
the leading logarithm effective potential (2.2) results appear to be stable under the incorporation of the leading additional subdominant electroweak couplants, i.e., the electroweak gauge
couplants themselves.
The question that remains, however, is whether the value determined for $y(v)$ ($= 0.0545$)
is sufficiently small for these leading-logarithm predictions to be reasonably stable under
subsequent-to-leading-logarithm corrections to the scalar field potential. The couplant $y(v)$ is
clearly still the dominant electroweak couplant in the radiative symmetry-breaking scenario, as
is evident by comparison to couplants x and z [Eq. (2.1)]. Indeed, such a value for $y$ ($= \lambda/4\pi^2$)
would correspond to having a Higgs boson mass of 510 GeV in a conventional symmetry breaking scenario. Nevertheless, the two-loop terms in known electroweak RG functions are still seen
to be substantially smaller than the one-loop terms when $y(v) = 0.0545$ [5], suggesting that corrections to $y$ and $m_H$ from subsequent-to-leading logarithms may be controllable in radiatively
broken electroweak symmetry.
(SM) couplants except for the dominant scalar-field self-interaction couplant $y$. Such an approach
is analogous to the usual SM procedure for processes (such as R(s)) involving both strong and
electroweak perturbative corrections: one first evaluates QCD corrections in isolation, since $\alpha_s$
is the dominant coupling constant, prior to introducing SM corrections from subdominant electroweak gauge coupling constants.
The summation-of-leading-logarithms SFTP for radiative electroweak symmetry breaking is
known to all orders in the couplant $y$, and is given in closed form as the $x = 0$ limit of Eq. (6.1)
in Ref. [6]:
$$V^{LL}_{SFTP} = \pi^2\phi^4\left[\frac{y}{1 - 3yL} + K\right]. \tag{3.1}$$
The constant $K$ is the residual coefficient of $\phi^4$ after infinities from multiloop $\phi^4$ graphs have
been subtracted. Thus $K$ is inclusive of all finite counterterms degree-2 and higher in $y$. Curiously, one finds by applying the condition (2.6) to the series expansion of (3.1),
$$y = 11y^2 + 105y^3 + 540y^4 + 1296y^5, \tag{3.2}$$
that a solution $y = 0.054135$ exists quite close to the one quoted in the previous section ($y =
0.0545$) when non-zero physical values are included for the subdominant electroweak couplants
x, z, r and s. Similarly, one finds from the condition (2.5) that $K = -y - 3y^2/2 = -0.05853$,
in close agreement with the aggregate counterterm coefficient (2.11) when the same subdominant electroweak couplants are included. Although we are approximating all subdominant electroweak couplants to be zero in the SFTP potential, this potential is not scale-free; a physical
vev-scale $v = 2^{-1/4}G_F^{-1/2}$ still arises from the SM gauge sector. We require the vacuum expectation
value of the SFTP potential to be at $v = 246.2$ GeV, and then find from Eq. (2.9) that
$m_H = 221.2$ GeV [$B = 3y^2$, $C = 9y^3$, $y = 0.054135$], only a small departure from the 218 GeV
result [6] obtained when the subdominant couplants x, z, r, s are no longer omitted, but turned
on from zero to their physical values (0.0253, 0.0329, 0.0109, 0.00324, respectively). These results demonstrate that the SFTP approximation is a surprisingly good one for leading-logarithm
radiative electroweak symmetry breaking, i.e. that $y$ is truly the driving couplant for obtaining the
leading-logarithm results summarized in the previous section.
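The leading-logarithm SFTP numbers quoted above follow from three one-line formulas. The sketch below solves Eq. (3.2) for its positive root by bisection and then evaluates $K = -y - 3y^2/2$ from (2.5) and $m_H$ from (2.9) with $B = 3y^2$, $C = 9y^3$ and $v = 246.2$ GeV; the bracketing interval and function names are illustrative choices.

```python
import math

V_EW = 246.2  # electroweak vev in GeV, as used in the text

def condition(y):
    """Eq. (3.2) written as f(y) = 0."""
    return 11 * y**2 + 105 * y**3 + 540 * y**4 + 1296 * y**5 - y

def bisect(f, lo, hi, steps=80):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

y_ll = bisect(condition, 0.01, 0.1)
K_ll = -y_ll - 1.5 * y_ll**2                                   # Eq. (2.5) with B = 3y^2
m_H = math.sqrt(8 * math.pi**2 * V_EW**2 * (3 * y_ll**2 + 9 * y_ll**3))  # Eq. (2.9)
print(f"y = {y_ll:.6f}, K = {K_ll:.6f}, m_H = {m_H:.1f} GeV")
```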
To carry the SFTP approximation (in the absence of an explicit scalar-field mass term) past
leading logarithms, we first note that the SFTP all-orders potential takes the form of a perturbative field theoretic series ($y = \lambda/4\pi^2$, $L = \log(\phi^2/\mu^2)$)
$$V_{SFTP} = \pi^2\phi^4 S_{SFTP},\qquad S_{SFTP} = y + \sum_{n=1}^{\infty}\sum_{m=0}^{n}T_{n,m}\,y^{n+1}L^m. \tag{3.3}$$
The constant $K$ of Eq. (3.1) is then the sum of the purely-$\phi^4$ terms of this series,
$$K = \sum_{n=1}^{\infty}y^{n+1}T_{n,0}. \tag{3.4}$$
In other words, the finite counterterm coefficient $K$ in Eq. (3.1) is inclusive of all terms in the
full SFTP potential that can contribute to it.
The invariance of $V_{SFTP}$ under changes in the renormalization scale $\mu$ implies that $S_{SFTP}$
satisfies the RGE
$$\left[(-2 - 2\gamma)\frac{\partial}{\partial L} + \beta_y\frac{\partial}{\partial y} - 4\gamma\right]S = 0, \tag{3.5}$$
where RG functions $\beta_y$ and $\gamma$ have been calculated to 5-loop order for global O(N) symmetric
scalar field theory [10]. The SM RG functions in the SFTP of the single Higgs effective potential
are just the N = 4 case of this theory:
$$\beta_y = 6y^2 - \frac{39}{2}y^3 + 187.855y^4 - 2698.27y^5 + 47974.7y^6 + \cdots, \tag{3.6}$$
$$\gamma = \frac{3}{8}y^2 - \frac{9}{16}y^3 + \frac{585}{128}y^4 - 49.8345y^5 + \cdots. \tag{3.7}$$
The series $S_{SFTP}$ in the full potential (3.3) may be rewritten in terms of sums of leading ($S_0$) and
successively subleading ($S_1, S_2, \ldots$) logarithms:
$$S_{SFTP} = \sum_{k=0}^{\infty}y^{k+1}S_k(yL),\qquad S_k(u) \equiv \sum_{n=k}^{\infty}T_{n,n-k}\,u^{n-k}. \tag{3.8}$$
Given $u = yL$, we employ the methods of Ref. [11] to obtain successive differential equations
for $S_k(u)$, first by substituting Eq. (3.8) into the RG equation (3.5), and then by organizing the
RG equation in powers of $y$:
$$O(y^2):\quad 2(1-3u)\frac{dS_0}{du} - 6S_0 = 0,\qquad S_0(0) = T_{0,0} = 1; \tag{3.9}$$
$$O(y^3):\quad 2(1-3u)\frac{dS_1}{du} - 12S_1 = -21S_0 - \frac{39}{2}u\frac{dS_0}{du},\qquad S_1(0) = T_{1,0}; \tag{3.10}$$
$$O(y^4):\quad 2(1-3u)\frac{dS_2}{du} - 18S_2 = 190.105S_0 - \frac{3}{4}\frac{dS_0}{du} + 187.855u\frac{dS_0}{du} - \frac{81}{2}S_1 - \frac{39}{2}u\frac{dS_1}{du},\qquad S_2(0) = T_{2,0}; \tag{3.11}$$
$$O(y^5):\quad 2(1-3u)\frac{dS_3}{du} - 24S_3 = -2716.55S_0 + \frac{9}{8}\frac{dS_0}{du} - 2698.27u\frac{dS_0}{du} + 377.959S_1 - \frac{3}{4}\frac{dS_1}{du} + 187.855u\frac{dS_1}{du} - 60S_2 - \frac{39}{2}u\frac{dS_2}{du},\qquad S_3(0) = T_{3,0}; \tag{3.12}$$
$$O(y^6):\quad 2(1-3u)\frac{dS_4}{du} - 30S_4 = 48174.1S_0 - \frac{585}{64}\frac{dS_0}{du} + 47974.7u\frac{dS_0}{du} - 5414.81S_1 + \frac{9}{8}\frac{dS_1}{du} - 2698.27u\frac{dS_1}{du} + 565.814S_2 - \frac{3}{4}\frac{dS_2}{du} + 187.855u\frac{dS_2}{du} - \frac{159}{2}S_3 - \frac{39}{2}u\frac{dS_3}{du},\qquad S_4(0) = T_{4,0}. \tag{3.13}$$
Equations for summations $S_k$ with $k > 4$ require the knowledge of 6-loop-and-higher terms in
the RG functions (3.6) and (3.7).
Initial conditions for all but $S_0$ are dependent on finite coefficients $T_{k,0}$ ($k > 0$) of $\phi^4$ after infinities are subtracted. Such coefficients will be determined via successive applications of
Eq. (2.5). The solution to Eq. (3.9) is
$$S_0(u) = \frac{1}{1 - 3u}, \tag{3.14}$$
recovering ($u = yL$) the leading logarithm sum within the potential (3.1). Similarly one finds
explicit solutions to Eqs. (3.10)–(3.13) to be
$$S_1(u) = \frac{T_{1,0} - \frac{3}{4}u + \frac{13}{4}\log(1 - 3u)}{(1 - 3u)^2}, \tag{3.15}$$
(1 3u)3 S2 (u)
3
9u2
13
+ u T1,0 + 59.8023 + (1 3u) log(1 3u)
= T2,0 +
4
4
16
2
169
13
195
T1,0
log(1 3u) +
log(1 3u) ,
+
2
16
16
(3.16)
(1 3u)4 S3 (u)
9
3
= T3,0 + T2,0 1047.88 + 120.917T1,0 u +
T1,0 + 1190.26 u2 + 28.6798u3
4
4
117 2
39
351
39
T1,0 + 401.512 T1,0 u +
u log(1 3u)
+ 101.754 + T2,0
4
16
8
16
2 2197
3
11661 507
507
+
(3.17)
u+
T1,0 log (1 3u) +
log (1 3u) ,
128
64
16
64
(1 3u)5 S4 (u)
3
= T4,0 + 19636.1 T3,0 + 182.032 T2,0 2089.57 T1,0 u
4
9
+ 44052.4 + 2409.76 T1,0 + T2,0 u2 + (28.6798T1,0 + 42536.2)u3
4
117
T1,0 + 7806.13 u2
+ 1085.32u4 + 93.2095u3 +
8
117
T2,0 u + 202.290T1,0 1465.37
+ 8938.02 + 1199.66 T1,0
16
1521 2
65
1521
u +
T1,0 + 1991.04 u
+ 13T3,0 T2,0 log (1 3u) +
2
64
64
2
507
7943
T1,0 +
T2,0 log (1 3u)
+ 1174.97
32
8
3 28561
4
195533 2197
6591
u
+
T1,0 log (1 3u) +
log (1 3u) .
+
256
384
16
256
(3.18)
Obtaining $S_k(u)$ with $k > 4$ requires knowledge past 5-loop order terms in the RG functions (3.6)
and (3.7), as noted above. We see, however, that if $\mu$ is chosen to equal the electroweak vev scale
$v = 246.2$ GeV, the all-orders SFTP potential (3.3) may be expressed in the form [$L = \log(\phi^2/v^2)$
as before; $y = y(v)$]
$$V_{SFTP} = \pi^2\phi^4\,y\left[S_0(yL) + yS_1(yL) + y^2S_2(yL) + y^3S_3(yL) + y^4S_4(yL) + \cdots\right], \tag{3.19}$$
where $S_k(yL)$ is given by the final expression of Eq. (3.8).
4. Successive approximations to the full SFTP potential
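The all-orders SFTP potential, expressed as the double summation (3.3) with $L \to L(\mu = v)$,
may be approached by successive summations of subleading logarithms [LL ≡ leading-log,
$N^kLL$ ≡ (next-to-)$^k$-leading log] contributing to the complete perturbative series (3.16):
$$V_{LL} = \pi^2\phi^4\,y\left[\sum_{n=0}^{\infty}T_{n,n}(yL)^n + yT_{1,0} + y^2T_{2,0} + y^3T_{3,0} + y^4T_{4,0} + \cdots\right] = \pi^2\phi^4\left[yS_0(yL) + K\right], \tag{4.1}$$
$$V_{NLL} = \pi^2\phi^4\,y\left[\sum_{n=0}^{\infty}T_{n,n}(yL)^n + y\sum_{n=1}^{\infty}T_{n,n-1}(yL)^{n-1} + y^2T_{2,0} + y^3T_{3,0} + y^4T_{4,0} + \cdots\right], \tag{4.2}$$
$$V_{N^2LL} = \pi^2\phi^4\,y\left[\sum_{n=0}^{\infty}T_{n,n}(yL)^n + y\sum_{n=1}^{\infty}T_{n,n-1}(yL)^{n-1} + y^2\sum_{n=2}^{\infty}T_{n,n-2}(yL)^{n-2} + y^3T_{3,0} + y^4T_{4,0} + \cdots\right], \tag{4.3}$$
and, in general,
$$V_{N^kLL} = \pi^2\phi^4\left[\sum_{p=0}^{k}y^{p+1}S_p(yL) + K - \sum_{q=2}^{k+1}y^qT_{q-1,0}\right]. \tag{4.4}$$
Note from Eq. (3.4) for $K$ that the expression (3.19) for the full SFTP potential is just the $k \to \infty$
limit of Eq. (4.4):
$$\lim_{k\to\infty}V_{N^kLL} = \pi^2\phi^4\left[\sum_{p=0}^{\infty}y^{p+1}S_p(yL) + K - \sum_{q=2}^{\infty}y^qT_{q-1,0}\right] = V_{SFTP}. \tag{4.5}$$
Indeed, we have chosen to express $K$ by Eq. (3.4) in order to assure the consistency of the limit
(4.5) with $V_{SFTP}$.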
We have already seen in the previous section that the series expansion of Eqs. (4.1) and (3.14)
$$V_{LL} = \pi^2\phi^4\left[y + K + 3y^2L + 9y^3L^2 + 27y^4L^3 + 81y^5L^4 + \cdots\right] \tag{4.6}$$
yields the results $y = 0.054135$, $K = -0.058531$, and a running Higgs mass $[V''_{LL}(v)]^{1/2} =
221.2$ GeV. The corresponding NLL series expansion of Eq. (4.2), as obtained from series expansions of summations (3.14) and (3.15), is given by
$$V_{NLL} = \pi^2\phi^4\left[y + K + B_{NLL}L + C_{NLL}L^2 + D_{NLL}L^3 + E_{NLL}L^4 + \cdots\right], \tag{4.7}$$
where the NLL numerical value for $y$ is obtained via the series coefficients
$$B_{NLL}(y_{NLL}) = 3y_{NLL}^2 + 3y_{NLL}^3(4T_{1,0} - 7)/2, \tag{4.8}$$
$$C_{NLL}(y_{NLL}) = 9y_{NLL}^3 + (27T_{1,0} - 621/8)\,y_{NLL}^4, \tag{4.9}$$
$$D_{NLL}(y_{NLL}) = 27y_{NLL}^4 + 9(-89 + 24T_{1,0})\,y_{NLL}^5/2, \tag{4.10}$$
$$E_{NLL}(y_{NLL}) = 81y_{NLL}^5 + 27(240T_{1,0} - 1049)\,y_{NLL}^6/16. \tag{4.11}$$
Substitution of Eq. (4.8) into the condition (2.5) that ensures minimization of the NLL potential
at $\phi = v$ yields the following equation for $T_{1,0}$ in terms of $K$ and the NLL value for $y$:
$$T_{1,0} = -\left[4K - 21y_{NLL}^3 + 6y_{NLL}^2 + 4y_{NLL}\right]/12y_{NLL}^3. \tag{4.12}$$
Similarly, one can substitute Eqs. (4.7)–(4.11) into Eq. (2.6) to find
$$y_{NLL} = \frac{11}{3}B_{NLL} + \frac{35}{3}C_{NLL} + 20D_{NLL} + 16E_{NLL}. \tag{4.13}$$
Given the already-determined numerical value for the aggregate $\phi^4$ counterterm coefficient
$K = -0.058531$, we see that Eq. (4.12) can be substituted into each series term (4.8)–(4.11)
within Eq. (4.13) to yield a sixth order polynomial equation for $y_{NLL}$. One finds only one real
positive solution to this equation, $y_{NLL} = 0.053812$ (we discard as spurious a negative real solution $y = -0.176$ as well as complex solutions). This solution exhibits a controllably small
deviation from $y_{LL} = 0.054135$ obtained from the LL approximation to the SFTP potential in
the previous section. We then find from Eq. (4.12) a numerical value for $T_{1,0} = 2.5521$, in which
case $B_{NLL}$ (4.8) and $C_{NLL}$ (4.9) have numerical values 0.0094373 and 0.0013294, respectively.
Upon substitution of these NLL numerical values into Eq. (2.9) for the running Higgs boson
mass, we find that $(V''_{NLL}(v))^{1/2} = 227$ GeV, a controllable departure from the 221.2 GeV LL
result of the previous section.
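The NLL determination can be sketched numerically as follows. The coefficients of $S_1(u) = \sum_n b_n u^n$ are generated from the power-series form of Eq. (3.10), $2(n+1)b_{n+1} = (6n + 12)b_n - 3^n(21 + \tfrac{39}{2}n)$ with $b_0 = T_{1,0}$ (the signs in (3.10) are taken as reconstructed here, which is an assumption of this sketch), and Eqs. (4.12) and (4.13) are then solved by bisection with $K = -0.058531$ and $v = 246.2$ GeV. Function names and the bracketing interval are illustrative.

```python
import math

K = -0.058531   # aggregate counterterm fixed at LL order (value quoted in the text)
V_EW = 246.2    # electroweak vev in GeV

def t10_from_minimum(y):
    """Eq. (4.12): T_{1,0} implied by the minimization condition (2.5)."""
    return -(4 * K - 21 * y**3 + 6 * y**2 + 4 * y) / (12 * y**3)

def nll_coefficients(y, t10, n_max=4):
    """[B, C, D, E] for y*S0(yL) + y^2*S1(yL), using the series form of Eq. (3.10)."""
    b = [t10]  # b[n] multiplies u^n in S1(u)
    for n in range(n_max):
        b.append(((6 * n + 12) * b[n] - 3**n * (21 + 19.5 * n)) / (2 * (n + 1)))
    # S0 contributes 3^n y^(n+1) L^n; S1 contributes b[n] y^(n+2) L^n.
    return [3**n * y**(n + 1) + b[n] * y**(n + 2) for n in range(1, n_max + 1)]

def condition(y):
    """Eq. (4.13) written as f(y) = 0, with T_{1,0} eliminated via Eq. (4.12)."""
    B, C, D, E = nll_coefficients(y, t10_from_minimum(y))
    return 11 * B / 3 + 35 * C / 3 + 20 * D + 16 * E - y

lo, hi = 0.04, 0.07   # bracket around the positive root
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if condition(lo) * condition(mid) <= 0:
        hi = mid
    else:
        lo = mid
y_nll = 0.5 * (lo + hi)

t10 = t10_from_minimum(y_nll)
B_nll, C_nll, _, _ = nll_coefficients(y_nll, t10)
m_H = math.sqrt(8 * math.pi**2 * V_EW**2 * (B_nll + C_nll))   # Eq. (2.9)
print(f"y_NLL = {y_nll:.6f}, T10 = {t10:.3f}, m_H = {m_H:.1f} GeV")
```

Because $T_{1,0}$ is obtained by dividing a small difference by $12y^3$, its value is quite sensitive to the last digits of $K$ and $y$, whereas $y_{NLL}$ and $m_H$ are not.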
This entire procedure can be repeated for $N^2LL$, $N^3LL$ and $N^4LL$ versions of the SFTP
potential. For the $N^2LL$ case, one finds from Eq. (4.3) that
$$A_{N^2LL}(y_{N^2LL}) = y_{N^2LL} + K, \tag{4.14}$$
$$B_{N^2LL}(y_{N^2LL}) = B_{NLL}(y_{N^2LL}) + y_{N^2LL}^4\left[9T_{2,0} - \frac{81}{4}T_{1,0} + 93.9273\right], \tag{4.15}$$
$$C_{N^2LL}(y_{N^2LL}) = C_{NLL}(y_{N^2LL}) + y_{N^2LL}^5\left[54T_{2,0} - \frac{423}{2}T_{1,0} + 1001.16\right], \tag{4.16}$$
$$D_{N^2LL}(y_{N^2LL}) = D_{NLL}(y_{N^2LL}) + y_{N^2LL}^6\left[270T_{2,0} - \frac{5661}{4}T_{1,0} + 6872.92\right], \tag{4.17}$$
$$E_{N^2LL}(y_{N^2LL}) = E_{NLL}(y_{N^2LL}) + y_{N^2LL}^7\left[1215T_{2,0} - \frac{61641}{8}T_{1,0} + 38397.6\right]. \tag{4.18}$$
The minimization condition (2.5), $K = -y - B_{N^2LL}/2$, then implies that
$$18y^4T_{2,0} = -4(y + K) - 6y^2 - y^3(12T_{1,0} - 21) + y^4\left(\frac{81}{2}T_{1,0} - 187.855\right), \tag{4.19}$$
where all $y$'s appearing in Eq. (4.19) are understood to be $y_{N^2LL}$. Since $\phi^4$-counterterm coefficients $K$ and $T_{1,0}$ have already been numerically determined to be $-0.058531$ and 2.5521, and since
$$y_{N^2LL} = \frac{11}{3}B_{N^2LL}(y_{N^2LL}) + \frac{35}{3}C_{N^2LL}(y_{N^2LL}) + 20D_{N^2LL}(y_{N^2LL}) + 16E_{N^2LL}(y_{N^2LL}) \tag{4.20}$$
by application of Eq. (2.6) to the $N^2LL$ potential (4.3), we see that Eqs. (4.19) and (4.20) represent two equations in two unknowns: $y_{N^2LL}$ and $T_{2,0}$. The smallest (and only viable) positive real
solution for $y_{N^2LL}$ is 0.05392, in which case $T_{2,0} = -8.1770$. We then substitute these values,
as well as the prior determination of $T_{1,0} = 2.5521$, into $B$ and $C$ within Eq. (2.9) to find that
the $N^2LL$ running Higgs boson mass is $(V''_{N^2LL}(v))^{1/2} = 225.0$ GeV. Thus it is clear that $y_{N^2LL}$
and the $N^2LL$ Higgs boson mass are controllable departures from their LL counterparts.
For completeness, we list below the corresponding results for the $N^3LL$ and $N^4LL$ versions of
the SFTP potential, as obtained from series solutions to Eqs. (3.12) and (3.13):
BN 3 LL = BN 2 LL + y 5 (12 T3,0 30 T2,0 + 186.730 T1,0 1352.65),
(4.21)
3231
T2,0 + 2641.53T1,0 17523.9 ,
CN 3 LL = CN 2 LL + y 6 90T3,0
(4.22)
8
13257
7
3
T2,0 + 22690.0T1,0 143.417 10 , (4.23)
DN 3 LL = DN 2 LL + y 540T3,0
4
342387
8
3
3
EN 3 LL = EN 2 LL + y 2835T3,0
T2,0 + 152.644 10 937.659 10 ,
16
(4.24)
159
6
BN 4 LL BN 3 LL = y 15T4,0
T3,0 + 279.532T2,0 2696.44T1,0 + 24032.2 ,
4
(4.25)
2619
CN 4 LL CN 3 LL = y 7 135T4,0
T3,0 + 4933.79T2,0 + 44780.1T1,0
4
+ 360.414 103 ,
(4.26)
25443
8
DN 4 LL DN 3 LL = y 945T4,0
T3,0 + 50885.5T2,0 446.879 103 T1,0
4
4
+ 337.923 10 ,
(4.27)
94959
EN 4 LL EN 3 LL = y 9 5670T4,0
T3,0 + 400.145 103 T2,0 345.173 104 T1,0
2
+ 250.285 105 .
(4.28)
At the N 3 LL level, the condition K = yN 3 LL BN 3 LL (yN 3 LL )/2 implies that
81
24y 5 T3,0 = 4(y + K) + 6y 2 + y 3 (12T1,0 21) + y 4 T1,0 + 187.855
2
+ y 5 (60T2,0 373.459T1,0 + 2705.30),
(4.29)
Table 1
Results for the SFTP potential taken to $N^4LL$ order, as discussed in Section 4. $T_{k,0}$ ($k \geq 1$) is the coefficient of the finite
$\pi^2\phi^4y^{k+1}$ counterterm contributing to the SFTP potential. The final column is in GeV units.

k    y(v)       T_{k,0}    [V''(v)]^{1/2}
0    0.05414    1          221.2
1    0.05381    2.552      227.0
2    0.05392    -8.117     224.8
3    0.05385    83.21      226.3
4    0.05391    -1142      225.0
(4.30)
V (2) = M 2 2 ,
(4.31)
2
and as noted in [1,3], this term generates an imaginary part in the effective potential for small
values of at one-loop order. However, near the electroweak scale = v [the scale at which
the effective potential has a minimum (2.5), renormalization conditions are applied (2.6), and
the Higgs mass is defined (2.9)], this quadratic term can be treated as a perturbation. Expanding the one-loop results of [3] to leading order in = M/v results in the following perturbation
to the effective potential
1
1
(2)
(4.32)
breaking (CSB) regime M 2 = v 2 , and thus (4.32) truly represents a perturbation to the radiative
scenario from a CSB mass term. Combining (4.32) with the SFTP leading-logarithm results
allows us to examine the numerical effect of the perturbation on the leading-logarithm (k = 0)
results of Table 1. For = 0.1 (i.e., M 25 GeV), y decreases slightly to y = 0.05392 and mH
increases marginally to mH = 223.4 GeV. Even for = 0.2 (M 50 GeV), which results in
y = 0.05328 and mH = 229.8 GeV, the effect
mH 10 GeV on the Higgs mass is still far
less than the CSB expectation
mH M 50 GeV. We thus conclude that the LL radiativelybroken SFTP scenario is not destabilized by a perturbation from a CSB mass term in the effective
potential.
5. Turning on the Yukawa sector
The subdominant electroweak couplants x and z (2.1) provide the largest alteration to the
SFTP potential. If $x = 0$ (i.e., if there is no Yukawa coupling), then there is no way for z, the QCD
quark-gluon coupling, to enter the potential. Diagrammatically, z enters the potential beginning
at two-loop order as a virtual gluon exchange within a t-quark loop.
If we augment the SFTP potential with contributions from the subdominant electroweak couplants x and z, the potential one obtains [in the absence of (u, d, s, c, b) Yukawa couplings and
electroweak gauge couplants] is just
$$V_{xyz} = \pi^2\phi^4\sum_{n=0}^{\infty}x^n\sum_{k=0}^{\infty}y^k\sum_{\ell=0}^{\infty}z^\ell\sum_{p=0}^{n+k+\ell-1}L^p\,D_{n,k,\ell,p} = \pi^2\phi^4S, \tag{5.1}$$
where $D_{0,0,0,0} = D_{1,0,0,0} = D_{0,0,1,0} = 0$, $D_{0,1,0,0} = 1$. The leading logarithm (LL) portion of $S$
is comprised of those terms in Eq. (5.1) for which $p = n + k + \ell - 1$. Thus coefficients $C_{n,k,\ell}$ in
Eq. (2.2) are just coefficients $D_{n,k,\ell,n+k+\ell-1}$ in Eq. (5.1). Similarly the next-to-leading logarithm
(NLL) portion of $S$ is comprised of those terms in Eq. (5.1) for which $p = n + k + \ell - 2$. The
series $S$ may be expressed as a power series in $L \equiv \log(\phi^2/v^2)$ of the form (2.4):
$$S = (y + K) + BL + CL^2 + DL^3 + EL^4 + \cdots, \tag{5.2}$$
where the leading-logarithm parts $\{B_{LL}, C_{LL}, D_{LL}, E_{LL}\}$ of the coefficients, as obtained from the RGE (2.3) with one-loop RG functions, are
given by Eqs. (7.2)–(7.5) of Ref. [6]. As in the SFTP case, the constant $K$ in (5.2) corresponds
to the aggregate contribution of purely-$\phi^4$ finite terms contributing to the potential (5.1) after
infinities are removed:
$$K = \sum_{p=2}^{\infty}\sum_{n=0}^{p}\sum_{k=0}^{p-n}x^ny^kz^{p-n-k}\,D_{n,k,p-n-k,0}. \tag{5.3}$$
Thus the SFTP counterterm coefficients $T_{k,0}$ in the previous section correspond to coefficients
$D_{0,k+1,0,0}$ in Eq. (5.3). However, Eq. (5.3) now includes terms not present in the SFTP version
of $K$ given by Eq. (3.4); Eq. (3.4) is just the sum of the $x = z = 0$ subset of terms contributing to Eq. (5.3). Recall from Section 2 that $K$ was found numerically via Eqs. (2.5) and (2.6)
to be equal to $-0.057935$ [5] from the leading logarithm version of the potential (5.1). This is
only a small departure from the value $K = -0.058531$ obtained from the SFTP potential (3.1).
This difference reflects the presence of additional finite $\phi^4$ terms. For example, the $\pi^2\phi^4y^2T_{1,0}$
($= \pi^2\phi^4y^2D_{0,2,0,0}$) term associated with the post-subtraction contribution of the 1-loop divergent graph of Fig. 1 is now augmented by a $\pi^2\phi^4x^2D_{2,0,0,0}$ ($\equiv \pi^2\phi^4x^2U$) term associated with
the post-subtraction contribution of the 1-loop divergent graph of Fig. 2.
Fig. 1. The divergent 1-loop graph leading to the finite $\pi^2T_{1,0}\,y^2\phi^4$ counterterm ($T_{1,0} = D_{0,2,0,0}$).
Fig. 2. The divergent 1-loop graph leading to the finite $\pi^2U\,x^2\phi^4$ counterterm ($U = D_{2,0,0,0}$).
As in the SFTP case, we approximate the full potential (5.1) via a series of successive approximations:
p pn
VLL = 2 4
(5.4)
x n y k zpnk Lp1 Dn,k,pnk,p1 + K ,
p=1 n=0 k=0
VN LL =
2 4
+
p pn
n k pnk
x y z
p1
Dn,k,pnk,p1
n k pnk
x y z
p2
Dn,k,pnk,p2
+K
2
2n
n k 2nk
x y z
Dn,k,2nk,0
(5.5)
n=0 k=0
VN q LL =
2 4
p pn
q
n k pnk
x y z
pm1
+K
q+1 p pn
n k pnk
x y z
Dn,k,pnk,pm1
Dn,k,pnk,0
(q 1),
(5.6)
where the final expression in Eq. (5.5) is just the two finite 1-loop 4 post-subtraction terms
associated with Figs. 1 and 2:
2
2n
n=0 k=0
(5.7)
Comparison of Eq. (5.6) to Eq. (5.1) through use of the definition (5.3) for $K$ shows that the full
potential $V_{xyz}$ is just the $q \to \infty$ limit of $V_{N^qLL}$: $\lim_{q\to\infty}V_{N^qLL} = V_{xyz}$, analogous to Eq. (4.5)
for the SFTP case.
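We now use the two-loop RG functions to obtain the NLL contributions to the series $S$ in
Eq. (5.1). To do this, we break these $\overline{\rm MS}$ functions up into their known one-loop (1L) and two-loop (2L) components [12]:
$$\gamma = \gamma_{1L} + \gamma_{2L},\qquad \gamma_{1L} = \frac{3x}{4},\qquad \gamma_{2L} = \frac{3y^2}{8} - \frac{27x^2}{64} + \frac{5xz}{4}; \tag{5.8}$$
$$\beta_x = \beta_x^{1L} + \beta_x^{2L},\qquad \beta_x^{1L} = \frac{9}{4}x^2 - 4xz,\qquad \beta_x^{2L} = -\frac{3}{2}x^3 - \frac{3}{2}x^2y + \frac{3}{4}xy^2 + \frac{9}{2}x^2z - \frac{27}{2}xz^2; \tag{5.9}$$
$$\beta_y = \beta_y^{1L} + \beta_y^{2L},\qquad \beta_y^{1L} = 6y^2 + 3yx - \frac{3}{2}x^2,\qquad \beta_y^{2L} = -\frac{39}{2}y^3 - 9xy^2 - \frac{3}{16}x^2y + \frac{15}{8}x^3 + 5xyz - 2x^2z; \tag{5.10}$$
$$\beta_z = \beta_z^{1L} + \beta_z^{2L},\qquad \beta_z^{1L} = -\frac{7}{2}z^2,\qquad \beta_z^{2L} = -\frac{13}{4}z^3 - \frac{xz^2}{4}. \tag{5.11}$$
We also break up the series coefficients in Eq. (5.2) into their LL and NLL contributions,
$$A \equiv y + K = y + T_{1,0}\,y^2 + U\,x^2 + \cdots, \tag{5.12}$$
$$B = B_{LL} + \Delta B_{NLL} + \cdots, \tag{5.13}$$
$$C = C_{LL} + \Delta C_{NLL} + \cdots, \tag{5.14}$$
$$D = D_{LL} + \Delta D_{NLL} + \cdots, \tag{5.15}$$
$$E = E_{LL} + \Delta E_{NLL} + \cdots. \tag{5.16}$$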
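If we substitute the series (5.2) into the RGE (2.3) using the RG functions (5.8)–(5.11), we find
that the cancellation of $O(L^0)$ terms on the left-hand side of (2.3) implies that
$$2B_{LL} = \beta_y^{1L} - 4\gamma_{1L}\,y, \tag{5.17}$$
$$2\Delta B_{NLL} = -2\gamma_{1L}B_{LL} + 2\beta_x^{1L}U\,x + 2\beta_y^{1L}T_{1,0}\,y + \beta_y^{2L} - 4\gamma_{1L}\left(T_{1,0}\,y^2 + U\,x^2\right) - 4\gamma_{2L}\,y, \tag{5.18}$$
so that
$$B_{LL} = 3y^2 - \frac{3x^2}{4}, \tag{5.19}$$
$$\Delta B_{NLL} = \left(6T_{1,0} - \frac{21}{2}\right)y^3 + \left(\frac{3T_{1,0}}{2} - \frac{27}{4}\right)xy^2 - (1 + 4U)\,x^2z + \left(\frac{3}{2} + \frac{3U}{4}\right)x^3 + \left(\frac{3}{4} - \frac{3T_{1,0}}{2}\right)x^2y. \tag{5.20}$$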
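The $O(L)$ terms in the RGE (2.3) vanish provided
$$4CL = -4\gamma CL + \beta_x\frac{\partial B}{\partial x}L + \beta_y\frac{\partial B}{\partial y}L + \beta_z\frac{\partial B}{\partial z}L - 4\gamma BL, \tag{5.21}$$
hence that
$$4C_{LL} = \beta_x^{1L}\frac{\partial B_{LL}}{\partial x} + \beta_y^{1L}\frac{\partial B_{LL}}{\partial y} + \beta_z^{1L}\frac{\partial B_{LL}}{\partial z} - 4\gamma_{1L}B_{LL}, \tag{5.22}$$
and that
$$\begin{aligned}
4\Delta C_{NLL} ={}& -4\gamma_{1L}C_{LL} + \beta_x^{1L}\frac{\partial\Delta B_{NLL}}{\partial x} + \beta_x^{2L}\frac{\partial B_{LL}}{\partial x} + \beta_y^{1L}\frac{\partial\Delta B_{NLL}}{\partial y} + \beta_y^{2L}\frac{\partial B_{LL}}{\partial y} \\
&+ \beta_z^{1L}\frac{\partial\Delta B_{NLL}}{\partial z} + \beta_z^{2L}\frac{\partial B_{LL}}{\partial z} - 4\gamma_{1L}\Delta B_{NLL} - 4\gamma_{2L}B_{LL}.
\end{aligned} \tag{5.23}$$
Given our solution (5.19) for $B_{LL}$ and the 1L RG functions, one recovers the result [5,6]
$$C_{LL} = 9y^3 + \frac{9}{4}xy^2 - \frac{9}{4}x^2y + \frac{3}{2}x^2z - \frac{9}{32}x^3. \tag{5.24}$$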
Given our solution (5.20) for
BN LL and the 1L and 2L RG functions in Eqs. (5.8)(5.11), we
find that
621
27T1,0 225
3T1,0 21
4
3
+ 27T1,0 y +
xy +
+
xy 2 z
CN LL =
8
2
4
2
2
127 2 2
9 2
225T1,0 27 2 2
23U
x yz +
+
x y +
+
x z
+ 3T1,0
2
32
8
2
16
9T1,0 4
27 15U 3
45T1,0 351 3
405 45U
x z+
+
x y+
+
+
x .
+
4
4
16
32
256
64
16
(5.25)
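The $O(L^2)$ terms in the RGE (2.3) vanish provided
$$6DL^2 = -6\gamma DL^2 + \beta_x\frac{\partial C}{\partial x}L^2 + \beta_y\frac{\partial C}{\partial y}L^2 + \beta_z\frac{\partial C}{\partial z}L^2 - 4\gamma CL^2, \tag{5.26}$$
$$6D_{LL} = \beta_x^{1L}\frac{\partial C_{LL}}{\partial x} + \beta_y^{1L}\frac{\partial C_{LL}}{\partial y} + \beta_z^{1L}\frac{\partial C_{LL}}{\partial z} - 4\gamma_{1L}C_{LL}, \tag{5.27}$$
$$\begin{aligned}
6\Delta D_{NLL} ={}& -6\gamma_{1L}D_{LL} + \beta_x^{1L}\frac{\partial\Delta C_{NLL}}{\partial x} + \beta_x^{2L}\frac{\partial C_{LL}}{\partial x} + \beta_y^{1L}\frac{\partial\Delta C_{NLL}}{\partial y} + \beta_y^{2L}\frac{\partial C_{LL}}{\partial y} \\
&+ \beta_z^{1L}\frac{\partial\Delta C_{NLL}}{\partial z} + \beta_z^{2L}\frac{\partial C_{LL}}{\partial z} - 4\gamma_{1L}\Delta C_{NLL} - 4\gamma_{2L}C_{LL}.
\end{aligned} \tag{5.28}$$
If one substitutes $C_{LL}$ [Eq. (5.24)] into (5.27), one recovers Eq. (7.4) of Ref. [6]:
$$D_{LL} = 27y^4 + \frac{27}{2}xy^3 - \frac{3}{2}xy^2z + 3x^2yz - \frac{225}{32}x^2y^2 - \frac{23}{8}x^2z^2 + \frac{15}{16}x^3z - \frac{45}{16}x^3y + \frac{99}{256}x^4. \tag{5.29}$$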
DN LL :
801
11547
147
+ 108T1,0 y 5 +
+ 81T1,0 xy 4 + 12T1,0 +
xy 3 z
DN LL =
2
32
2
15T1,0 291
23T1,0 75 2 2
xy 2 z2 +
+
x yz
+
8
16
4
4
45 45T1,0 2 3
177T1,0 33 2 2
877 115U 2 3
x y +
x y z+
x z
+
32
2
16
8
32
4
3125 201U 3 2
69T1,0 615 3
+
x z +
x yz
+
128
16
8
16
9T1,0 4
19323 2781T1,0 3 2
1023 135U
x y +
x z
+
256
128
128
32
4
81T1,0 5
3825 45T1,0 4
1035 45U
+
x y+
+
+
x .
+
(5.30)
256
128
512
64
64
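Finally, to cancel $O(L^3)$ terms in the RGE (2.3), we see that terms degree-5 in couplants cancel
provided
$$8E_{LL} = \beta_x^{1L}\frac{\partial D_{LL}}{\partial x} + \beta_y^{1L}\frac{\partial D_{LL}}{\partial y} + \beta_z^{1L}\frac{\partial D_{LL}}{\partial z} - 4\gamma_{1L}D_{LL}, \tag{5.31}$$
and that terms degree-6 in couplants cancel provided
$$\begin{aligned}
8\Delta E_{NLL} ={}& -8\gamma_{1L}E_{LL} + \beta_x^{1L}\frac{\partial\Delta D_{NLL}}{\partial x} + \beta_x^{2L}\frac{\partial D_{LL}}{\partial x} + \beta_y^{1L}\frac{\partial\Delta D_{NLL}}{\partial y} + \beta_y^{2L}\frac{\partial D_{LL}}{\partial y} \\
&+ \beta_z^{1L}\frac{\partial\Delta D_{NLL}}{\partial z} + \beta_z^{2L}\frac{\partial D_{LL}}{\partial z} - 4\gamma_{1L}\Delta D_{NLL} - 4\gamma_{2L}D_{LL}.
\end{aligned} \tag{5.32}$$
Upon substitution of Eq. (5.29) into Eq. (5.31), we recover Eq. (7.5) of Ref. [6]:
$$\begin{aligned}
E_{LL} ={}& 81y^5 + \frac{243}{4}xy^4 - 9xy^3z + \frac{45}{32}xy^2z^2 - \frac{69}{16}x^2yz^2 - \frac{135}{8}x^2y^3 + \frac{531}{64}x^2y^2z \\
&+ \frac{345}{64}x^2z^3 - \frac{603}{256}x^3z^2 + \frac{207}{32}x^3yz - \frac{8343}{512}x^3y^2 - \frac{459}{512}x^4z + \frac{135}{512}x^4y + \frac{837}{1024}x^5.
\end{aligned} \tag{5.33}$$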
Substitution of Eqs. (5.29) and (5.30) into Eq. (5.32) yields the corresponding N LL contribution
to the series (5.2):
55539T1,0 1081377 2 4
2187T1,0 29133
+
y x +
yx 5
EN LL =
4096
8192
256
2048
105T1,0 111633 4 2
125793
1035U
5
+ 405T1,0 y x +
+
+
x z
+
64
64
16
4096
38613 2 4
1215T1,0 195939 4 2
4255U
y x +
+
x z
+
32
1024
64
512
207T1,0 17703 5
75315 3 3
315U
4509U
+
x z+
x z
+
64
32
2048
128
1024
ELL = 81y 5 +
28323
4581 855T1,0 3 2
6
+ 405T1,0 y +
+
y x z
+
16
64
32
3231 135T1,0 4
31455T1,0 227529 3 3
+
y x +
y xz
+
256
512
8
2
28197 7191T1,0 2 3
3807 225T1,0 3 2
+
y xz +
+
y x z
+
32
16
128
128
3603 165T1,0 2 3
14847 1215T1,0 2 2 2
y x z +
y xz
+
512
64
128
64
35145 621T1,0
7323 2643T1,0
+
yx 4 z +
yx 3 z2
+
512
256
64
128
1269T1,0 6
93 345T1,0
208629 1485U
yx 2 z3 +
+
+
x . (5.34)
+ +
2
32
32768
2048
1024
One could continue this procedure indefinitely to obtain $O(L^5)$, $O(L^6)$, etc. NLL contributions to the series (5.2); thus, one can in principle obtain the entire NLL contribution to the full potential $V_{xyz}$ (5.1). However, we have already seen that the conditions (2.5) and (2.6) are sensitive only up to $O(L^4)$ terms in the potential series S.
6. Yukawa sector corrections to SFTP results
Since the scalar field couplant $y$ $(= \lambda/4\pi^2)$ is the dominant couplant of the SM, with the t-quark Yukawa couplant $x$ $(= g_t^2/4\pi^2)$ and the QCD couplant $z$ $(= g_3^2/4\pi^2 = \alpha_s/\pi)$ characterizing the less dominant Yukawa sector of the effective potential [z does not enter the effective potential unless x is non-zero], one test of order-by-order stability of the effective potential is to augment the SFTP of Section 4 with LL contributions from the Yukawa sector. These are given explicitly by Eqs. (5.19), (5.24), (5.27) and (5.33) of the previous section. Thus, we assume here that
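$$\begin{pmatrix} B \\ C \\ D \\ E \end{pmatrix}_{yN^kLL + \{x,z\}LL} = \begin{pmatrix} B \\ C \\ D \\ E \end{pmatrix}_{\{x,y,z\}LL} + \begin{pmatrix} B \\ C \\ D \\ E \end{pmatrix}_{N^kLL\ \mathrm{SFTP}}. \tag{6.1}$$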
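For example, the augmentation of the NLL SFTP with LL contributions from the Yukawa sector leads to the following series coefficients in Eq. (2.6):
$$B(y, T_{1,0}) = 3y^2 - \frac{3x^2}{4} + \left(6T_{1,0} - \frac{21}{2}\right)y^3, \tag{6.2}$$
$$C(y, T_{1,0}) = 9y^3 + \frac{9}{4}xy^2 - \frac{9}{4}x^2y + \frac{3}{2}x^2z - \frac{9}{32}x^3 + \left(27T_{1,0} - \frac{621}{8}\right)y^4, \tag{6.3}$$
$$D(y, T_{1,0}) = 27y^4 + \frac{27}{2}xy^3 - \frac{3}{2}xy^2z + 3x^2yz - \frac{225}{32}x^2y^2 - \frac{23}{8}x^2z^2 + \frac{15}{16}x^3z - \frac{45}{16}x^3y + \frac{99}{256}x^4 + \left(108T_{1,0} - \frac{801}{2}\right)y^5, \tag{6.4}$$
$$E(y, T_{1,0}) = 81y^5 + \frac{243xy^4}{4} - 9xy^3z + \frac{45xy^2z^2}{32} - \frac{69x^2yz^2}{16} - \frac{135x^2y^3}{8} + \frac{531x^2y^2z}{64} + \frac{345x^2z^3}{64} - \frac{603x^3z^2}{256} + \frac{207x^3yz}{32} - \frac{8343x^3y^2}{512} - \frac{459x^4z}{512} + \frac{135x^4y}{512} + \frac{837x^5}{1024} + \left(405T_{1,0} - \frac{28323}{16}\right)y^6. \tag{6.5}$$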
The conditions (2.5) and (2.6), in conjunction with the prior determination of $K = -0.057935$ (see Section 2), provide two equations in the two unknowns $T_{1,0}$ and $y$ (or equivalently $T_{1,0}y^3$ and $y$):
$$T_{1,0}\,y^3 = -\frac{(y + K)}{3} - \frac{y^2}{2} + \frac{x^2}{8} + \frac{7y^3}{4}, \tag{6.6}$$
$$y = \frac{11}{3}B(y, T_{1,0}) + \frac{35}{3}C(y, T_{1,0}) + 20D(y, T_{1,0}) + 16E(y, T_{1,0}). \tag{6.7}$$
Utilizing the values $x(v) = 0.0253$, $z(v) = 0.0329$, $K = -0.05793$, one finds that $y(v) = 0.05351$ and hence that $T_{1,0} = 2.5533$. Note that these values are only small departures from the NLL SFTP, consistent with y being the dominant SM couplant. One then finds from Eq. (2.9) that $[V''(v)]^{1/2} = 222$ GeV, a 5 GeV decrease from the NLL SFTP value of Table 1.
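The two minimization relations used here are simple enough to evaluate directly. The sketch below assumes the LL form $B = 3y^2 - 3x^2/4$ together with the minimization condition $K = -B/2 - y$, and then solves Eq. (6.6) for $T_{1,0}$; the function names are ours, and the inputs are the rounded values quoted in the text, so the result for $T_{1,0}$ agrees with the quoted 2.5533 only to about two significant figures.

```python
# Physical couplants at the scale v, as quoted in the text.
x = 0.0253          # t-quark Yukawa couplant
y_LL = 0.05383      # LL value of y(v), the k = 0 entry of Table 2
y_NLL = 0.05351     # value of y(v) quoted after Eq. (6.7)

def b_ll(y, x):
    """LL coefficient B = 3y^2 - 3x^2/4."""
    return 3 * y**2 - 0.75 * x**2

def k_from_minimum(y, x):
    """Minimization condition K = -B/2 - y, applied at LL order."""
    return -b_ll(y, x) / 2 - y

def t10_from_minimum(y, x, k):
    """Solve Eq. (6.6) for T_{1,0}."""
    return (-(y + k) / 3 - y**2 / 2 + x**2 / 8 + 7 * y**3 / 4) / y**3

K = k_from_minimum(y_LL, x)
T10 = t10_from_minimum(y_NLL, x, K)
print(f"K = {K:.6f}, T_1,0 = {T10:.3f}")
```

Because $y + K$ is a small difference of nearly equal numbers, $T_{1,0}$ is quite sensitive to the last quoted digit of $y(v)$.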
One can, of course, continue this procedure to k = {2, 3, 4} levels in Eq. (6.1). For example, if k = 2, the finite counterterm coefficient $T_{2,0}$ is obtainable from the minimization condition $K = -B/2 - y$, where from Eq. (4.15),
$$B(y, T_{1,0}, T_{2,0}) = 3y^2 - \frac{3x^2}{4} + \left(6T_{1,0} - \frac{21}{2}\right)y^3 - 9y^4\,\frac{495T_{1,0} - 220T_{2,0} - 2296}{220}. \tag{6.8}$$
One can similarly utilize Eqs. (4.16), (4.17) and (4.18) for $C(y, T_{1,0}, T_{2,0})$, $D(y, T_{1,0}, T_{2,0})$, and $E(y, T_{1,0}, T_{2,0})$, respectively, in order to express the fourth derivative condition (2.6) as
$$y = \frac{11}{3}B(y, T_{1,0}, T_{2,0}) + \frac{35}{3}C(y, T_{1,0}, T_{2,0}) + 20D(y, T_{1,0}, T_{2,0}) + 16E(y, T_{1,0}, T_{2,0}). \tag{6.9}$$
Given the prior determination of $T_{1,0} = 2.5533$, $K = -0.05793$, and given the physical couplant values $x(v) = 0.0253$, $z(v) = 0.0329$, Eqs. (6.8) and (6.9) represent two equations in the two unknowns $y$, $T_{2,0}$. The smallest positive real solution for y is $y = 0.05362$, in which case $T_{2,0} = -8.1744$ and, from Eq. (2.9), $V''(v) = (219.5\ \mathrm{GeV})^2$, results almost identical to the SFTP results of Section 4 [$T_{2,0} = -8.1770$, $y_{N^2LL} = 0.05392$, $V''(v) = (225\ \mathrm{GeV})^2$].
Table 2
Perturbative stability of results inclusive of $N^kLL$ contributions to the effective potential from the dominant couplant $y(v) = \lambda(v)/4\pi^2$, augmented by LL contributions from the t-quark Yukawa couplant $x(v) = g_t^2(v)/4\pi^2$ and $z(v) = \alpha_s(v)/\pi$

k    y(v)       T_{k,0}     [V''(v)]^{1/2} (GeV)
0    0.05383    1           215.8
1    0.05351    2.5533      221.7
2    0.05362    -8.1744     219.5
3    0.05355    83.190      221.3
4    0.05338    982.21      223.6
One can continue this procedure through two subsequent orders of the SFTP. The results one obtains are tabulated in Table 2. These results exhibit stability about $y = 0.054$, $[V''(v)]^{1/2} \cong 221$ GeV through four subleading orders in y.
We conclude this section by developing a fully N LL set of predictions in the couplants
{x, y, z}. To begin, we note that the additional terms involving x and z contributing to B to
NLL order are
BN LL =
BSFTP +
x B,
3
27 3T1,0
3
3
3x
xy 2 + x 3 x 2 z + xy 2 + U
x B = +
4x 2 z .
4
2
2
4
4
(6.10)
(6.11)
CN LL =
CSFTP +
x C,
(6.12)
DN LL =
DSFTP +
x D,
(6.13)
EN LL =
ESFTP +
x E.
(6.14)
By subtracting from Eqs. (5.25), (5.30), and (5.34) the explicitly N LL SFTP contributions to
Eqs. (4.9), (4.10) and (4.11), the N LL contributions to C, D and E arising entirely from the
Yukawa sector are given by
621
4
,
x C =
CN LL y 27T1,0
(6.15)
8
89 + 24T1,0
,
x D =
DN LL 9y 5
(6.16)
2
240T1,0 1049
x E =
EN LL 27y 6
(6.17)
.
16
Now, if one uses only N LL expressions for {B, C, D, E}SFTP in Eqs. (6.11)(6.14), one finds
that
BLL +
BN LL
x B
B
BLL
C C +
CN LL
C CLL
C
+ x = LL
(6.18)
+
,
=
DLL
DLL +
DN LL
x D
D
D
ELL
ELL +
EN LL
x E
E SFTP
E
thereby recovering the series coefficients within Eq. (5.2) for the potential VN LL (5.5), as calculated in the remainder of Section 5.
Applying the minimization condition (2.5) to these series coefficients leads to an explicit
expression for the unknown finite counterterm coefficient U associated with the finite part chosen
from Fig. 2:
U = y 3 (24T1,0 42) + xy 2 (6T1,0 27) + x 2 y(6T1,0 + 3) + 6x 3 4x 2 z
+ 12y 2 3x 2 + 8(y + K) x 2 (16z + 3x) .
(6.19)
We already know that $K = -0.057935$ from the LL calculation of Refs. [5,6]. We also have
the NLL results that T1,0 = 2.5533, as obtained prior to turning on N LL Yukawa sector
contributions
x {B, C, D, E}. Given physical values (2.1) for x and z, we see that Eq. (6.19)
is a relation expressing the finite counterterm coefficient U (= D2,0,0,0 ) as a function of the
CN LL (5.26),
DN LL (5.31) and
EN LL (5.34). We then find from substituting Eq. (6.18)
into the condition (2.6) that y is the solution of a degree-six polynomial equation. The only
viable solution to this equation (i.e., the smallest real positive root) is y(v) = 0.05311, in
which case we see from Eq. (6.19) that U = 17.306. Substituting these numerical values into
BLL (5.20),
BN LL (5.21), CLL (5.25) and
CN LL (5.26), the running Higgs boson mass to
NLL order in the three dominant electroweak couplants {x, y, z} is found from Eq. (2.9) to be $[V''_{NLL}(v)]^{1/2} = 227.8$ GeV. This is to be compared with the 216 GeV LL result for these same three couplants; similarly the NLL value y = 0.0531 is a controllable departure from the LL result y = 0.0538 [5,6] discussed in Section 2. Thus, the fully NLL extension of the LL results for $V_{xyz}$ presented in Refs. [5,6] leads to a very modest decrease in y(v) and a 5% increase in the running Higgs boson mass, indicative of order-by-order stability of $V_{xyz}$.
7. Discussion
7.1. Turning on the electroweak gauge couplants
As noted in Section 2, the leading-logarithm contributions of the electroweak gauge coupling
constants g2 (v) and g (v) to {B, C, D, and E} are listed in the Erratum to Ref. [6], where they
are denoted as
ew B,
ew C,
ew D and
ew E. The aggregate 4 counterterm K is altered as
well, as indicated in Eq. (2.11). To incorporate these additional (algebraically lengthy) corrections into the potential of the previous section, we modify the LL expression for B to include
electroweak gauge coupling constant contributions
3x 2
+
ew B,
(7.1)
4
3
9
3
ew B = s 2 + r 2 + rs.
(7.2)
64
64
32
We find from condition (2.5) for the LL analysis inclusive of
ew C,
ew D and
ew E that
$K = -0.058703$ [6]. One can then repeat the analysis of the previous section to include the LL electroweak couplings and Yukawa couplings, in conjunction with the $N^kLL$ scalar field theory projection of the effective potential. The results of this analysis are listed in Table 3.
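The shift in K caused by the gauge couplants can be reproduced with a few lines. This is a sketch under stated assumptions: the electroweak addition to B is read off Eq. (7.2) as $(3s^2 + 9r^2 + 6rs)/64$, the minimization condition is taken as $K = -B/2 - y$, and y is the rounded k = 0 entry of Table 3; the function name is ours.

```python
# Couplants at the scale v, as quoted in the text.
x, r, s = 0.0253, 0.0109, 0.00324
y0 = 0.05448   # k = 0 entry of Table 3

def b_ll_with_gauge(y, x, r, s):
    """LL coefficient B including the electroweak gauge contribution."""
    ew = (3 * s**2 + 9 * r**2 + 6 * r * s) / 64   # Eq. (7.2)
    return 3 * y**2 - 0.75 * x**2 + ew            # Eq. (7.1)

K_ew = -b_ll_with_gauge(y0, x, r, s) / 2 - y0
print(f"K = {K_ew:.6f}")
```

The gauge term is of order $10^{-5}$, which is why K moves only in the fourth significant figure relative to the value without r and s.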
As is evident from the table, both y(v) and the running Higgs boson mass are very stable as higher-order contributions from y alone are incorporated. Moreover, the $O(y^{k+1}\phi^4)$ finite counterterm coefficients $T_{k,0}$ are found to be virtually the same as these coefficients in Table 1, for which the subleading SM couplants {x, z, r, s} are assumed to be zero.
The calculation to N LL order in {x, y, z} in the previous section can also be supplemented
with LL contributions from the gauge couplants {r, s}, since r(v) = 0.0109, s(v) = 0.00324 are
BLL = 3y 2
Table 3
$yN^kLL + \{x, z, r, s\}LL$; i.e. results for the kth subleading order of the SFTP augmented by leading logarithm contributions from the Yukawa couplant x, the QCD couplant z, and the SU(2) × U(1) gauge couplants r and s

k    y(v)       T_{k,0}     [V''(v)]^{1/2} (GeV)
0    0.05448    1           218.3
1    0.05415    2.5603      224.4
2    0.05426    -8.1773     222.1
3    0.05419    83.195      224.0
4    0.05406    1191.8      225.5
substantially smaller than $x(v) = 0.0253$, $z(v) = 0.0329$, $y(v) \cong 0.054$. Given the prior determinations of $K = -0.058703$ and $T_{1,0} = 2.5603$ [Table 3], one finds that U = 17.857, only a small departure from the value (17.31) obtained in the last section, and that y(v) = 0.05374, $[V''(v)]^{1/2} = 230.7$ GeV.
7.2. The next order physical Higgs mass
Unlike the case of conventional spontaneously broken symmetry (CSB), the Lagrangian for radiatively broken symmetry (RSB) does not contain a primitive $\phi^2$ term. This means that an approach involving the next-order calculation of momentum-independent contributions from the Higgs boson self-energy is not viable, as counterterm subtractions of infinity cannot occur. Instead, the next-order Higgs boson inverse propagator mass term, as in Ref. [13], must remain the next-order expression $V''_{\rm eff}(v)$. The kinetic term for the inverse propagator can be worked out by analogy to the RG-analysis of the kinetic term extracted by Politzer [14] for the massless gauge boson propagator. This kinetic term may be written as
2
p
(p, ) = 1 + C(x, y, z, r, s) log 2 p 2
(7.3)
where
+ x
+ + s
2 (x, y, z, r, s) (p, ) = 0.
(7.4)
x
s
Our sign for the anomalous dimension differs from Politzer's because our RG equation (2.3) is
consistent with a negative sign
d
=
d
(7.5)
d
in the chain rule decomposition of d
F (x, y, z, r, s; ; ). In Ref. [15]s seminal calculation,
Politzer calculated the analog coefficient to C [Eq. (7.3)] in the massless gauge boson propagator,
in order to determine the corresponding anomalous dimension A . In our case is known.
Lowest-order application of Eq. (7.4) onto Eq. (7.3) yields
2
3x() 9r() 3s()
+O y
C(x, y, z, r, s) = =
(7.6)
.
4
16
16
Thus the full next-order inverse propagator for the Higgs field, with chosen to be equal to the
vev v is given by
2
p
3x(v) 9r(v) 3s(v)
log 2 p 2 Veff
p2 , v = 1
(7.7)
(v).
4
16
16
v
The zero of Eq. (7.7) corresponds to the physical Higgs boson mass $m_H$, i.e. the Higgs propagator pole, at which the inverse propagator (7.7) vanishes for $p^2 = m_H^2$.3 One finds this value to be reduced by only 0.2 to 0.3 GeV from values of $[V''_{\rm eff}(v)]^{1/2}$ between 220 and 231 GeV for the physically known values $x(v) = 0.0253$, $r(v) = 0.0109$, $s(v) = 0.00324$ of the running Yukawa and SU(2) × U(1) gauge couplants. Thus the next-order physical Higgs boson mass is effectively the same thing as our next order determination of $[V''_{\rm eff}(v)]^{1/2}$, largely because of the near proximity of $[V''_{\rm eff}(v)]^{1/2}$ with the magnitude of the vev itself (v = 246.2 GeV).
3 As has been borne out in other contexts [16] we assume this propagator pole to be gauge independent.
$$\phi^\dagger\phi = w^+w^- + \frac{\phi_3^2 + z^2}{2} \equiv \frac{\phi_1^2 + \phi_2^2 + \phi_3^2 + \phi_4^2}{2} = \frac{\phi^2}{2} \tag{7.9}$$
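$H \to ZZ$:
$$T(H \to zz) \equiv \left.\frac{\partial^3 V_{CSB}}{\partial H\,\partial z^2}\right|_{H=\phi_1=\phi_2=z=0} = 2\lambda v = \frac{m_H^2}{v}, \tag{7.11}$$
$H \to W^+W^-$:
$$T(H \to w^+w^-) \equiv \left.\left[\frac{\partial^3 V_{CSB}}{\partial H\,\partial\phi_1^2} + \frac{\partial^3 V_{CSB}}{\partial H\,\partial\phi_2^2}\right]\right|_{H=\phi_1=\phi_2=z=0} = 4\lambda v = \frac{2m_H^2}{v}, \tag{7.12}$$
$W^+W^- \to ZZ$:
$$T(w^+w^- \to zz) \equiv \left.\frac{\partial^4 V_{CSB}}{\partial w^+\,\partial w^-\,\partial z^2}\right|_{w^\pm=z=H=0} = 2\lambda = \frac{m_H^2}{v^2}, \tag{7.13}$$
$W^+W^- \to W^+W^-$:
$$T(w^+w^- \to w^+w^-) \equiv \left.\frac{\partial^4 V_{CSB}}{\partial (w^+)^2\,\partial (w^-)^2}\right|_{w^\pm=z=H=0} = 4\lambda = \frac{2m_H^2}{v^2}, \tag{7.14}$$
$W^+W^- \to HH$:
$$T(w^+w^- \to HH) \equiv \left.\frac{\partial^4 V_{CSB}}{\partial w^+\,\partial w^-\,\partial H^2}\right|_{w^\pm=z=H=0} = 2\lambda = \frac{m_H^2}{v^2}, \tag{7.15}$$
$HH \to HH$:
$$T(HH \to HH) \equiv \left.\frac{\partial^4 V_{CSB}}{\partial H^4}\right|_{w^\pm=z=H=0} = 6\lambda = \frac{3m_H^2}{v^2}. \tag{7.16}$$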
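$$V_{RSB} = \pi^2(\phi^2)^2\left(A + BL + CL^2 + DL^3 + EL^4 + \cdots\right) \tag{7.17}$$
where
$$L = \log\frac{\phi^2}{v^2} = \log\frac{2\phi^\dagger\phi}{v^2}, \tag{7.18}$$
and where minimization requires that $A = -B/2$. In addition, we have from Eq. (2.9) that to lowest order
$$m_H^2 = \left.\frac{\partial^2 V_{RSB}}{\partial H^2}\right|_{\phi_1=\phi_2=H=\phi_4=0} = 8\pi^2v^2(B + C). \tag{7.19}$$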
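The corresponding lowest order Higgs/Goldstone amplitudes for the RSB case to those presented in Eqs. (7.11)-(7.16) are

$H \to ZZ$:
$$T(H \to zz) \equiv \left.\frac{\partial^3 V_{RSB}}{\partial H\,\partial z^2}\right|_{H=\phi_1=\phi_2=z=0} = 4\pi^2v(2A + 3B + 2C) = 8\pi^2v(B + C) = \frac{m_H^2}{v}, \tag{7.20}$$
$H \to W^+W^-$:
$$T(H \to w^+w^-) \equiv \left.\left[\frac{\partial^3 V_{RSB}}{\partial H\,\partial\phi_1^2} + \frac{\partial^3 V_{RSB}}{\partial H\,\partial\phi_2^2}\right]\right|_{H=\phi_1=\phi_2=z=0} = 16\pi^2v(B + C) = \frac{2m_H^2}{v}, \tag{7.21}$$
$W^+W^- \to ZZ$:
$$T(w^+w^- \to zz) \equiv \left.\frac{\partial^4 V_{RSB}}{\partial w^+\,\partial w^-\,\partial z^2}\right|_{w^\pm=z=H=0} = 8\pi^2\left(A + \frac{3B}{2} + C\right) = 8\pi^2(B + C) = \frac{m_H^2}{v^2}, \tag{7.22}$$
$W^+W^- \to W^+W^-$:
$$T(w^+w^- \to w^+w^-) \equiv \left.\frac{\partial^4 V_{RSB}}{\partial (w^+)^2\,\partial (w^-)^2}\right|_{w^\pm=z=H=0} = 8\pi^2(2A + 3B + 2C) = 16\pi^2(B + C) = \frac{2m_H^2}{v^2}, \tag{7.23}$$
$W^+W^- \to HH$:
$$T(w^+w^- \to HH) \equiv \left.\frac{\partial^4 V_{RSB}}{\partial w^+\,\partial w^-\,\partial H^2}\right|_{w^\pm=z=H=0} = \pi^2(8A + 28B + 56C + 48D) = 24\pi^2(B + C) + \pi^2(32C + 48D) = \frac{3m_H^2}{v^2} + \pi^2(32C + 48D), \tag{7.24}$$
$HH \to HH$:
$$T(HH \to HH) \equiv \left.\frac{\partial^4 V_{RSB}}{\partial H^4}\right|_{w^\pm=z=H=0} = 24\pi^2\left(A + \frac{25}{6}B + \frac{35}{3}C + 20D + 16E\right) = 24\pi^2\left(\frac{11B}{3} + \frac{35C}{3} + 20D + 16E\right) = 24\pi^2 y = 6\lambda_{RSB}. \tag{7.25}$$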
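The numerical weights multiplying A, B, C, D, E in Eqs. (7.19) and (7.25) follow from repeated differentiation of $\phi^4 L^k$ with $L = \log(\phi^2/v^2)$. The sketch below checks only these two pure-Higgs derivatives (not the mixed Goldstone derivatives), drops the overall factor $\pi^2$, and uses a helper name of our own choosing.

```python
from fractions import Fraction

def derivative_weights(order, max_power=4):
    """Weights w_k such that d^order/dphi^order [phi^4 * sum_k c_k L^k] at phi = v
    equals v^(4 - order) * sum_k w_k c_k, with L = log(phi^2 / v^2)."""
    weights = []
    for k0 in range(max_power + 1):
        # terms maps (n, k) -> coefficient of phi^n * L^k
        terms = {(4, k0): Fraction(1)}
        for _ in range(order):
            new = {}
            for (n, k), c in terms.items():
                # d/dphi (phi^n L^k) = n phi^(n-1) L^k + 2k phi^(n-1) L^(k-1)
                if n != 0:
                    new[(n - 1, k)] = new.get((n - 1, k), 0) + c * n
                if k > 0:
                    new[(n - 1, k - 1)] = new.get((n - 1, k - 1), 0) + c * 2 * k
            terms = new
        # At phi = v one has L = 0, so only the k = 0 terms survive.
        weights.append(sum(c for (n, k), c in terms.items() if k == 0))
    return weights

second = derivative_weights(2)   # weights of A, B, C, D, E in the second derivative
fourth = derivative_weights(4)   # weights of A, B, C, D, E in the fourth derivative
print(second, fourth)
```

Inserting $A = -B/2$ into the second-derivative weights gives $8(B + C)$, and dividing the fourth-derivative weights by 24 gives the coefficients 1, 25/6, 35/3, 20, 16 that appear in Eq. (7.25).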
The results (7.20)-(7.25) all utilized $A = -B/2$ [Eq. (2.5)] as well as Eq. (7.19) for the RSB Higgs mass. Note that RSB results from Eqs. (7.20)-(7.23) are in complete agreement with corresponding leading order CSB results from Eqs. (7.11)-(7.14). The result (7.25) makes additional use of Eq. (2.6) for the quartic couplant y. Since $\lambda_{RSB} \cong 5\lambda_{CSB}$, we see that lowest-order Higgs-Higgs scattering will be enhanced by a factor of 25 in the RSB scenario. The RSB amplitude $W^+W^- \to HH$ is enhanced by more than a factor of 3 from the CSB amplitude as well. If one substitutes the leading-logarithm SFTP coefficients $B = 3y^2$, $C = 9y^3$, $D = 27y^4$ with $y = \lambda_{RSB}/4\pi^2 = 0.0541$ [see the discussion at the beginning of Section 3] into Eqs. (7.19) and (7.24), then $m_H = 221$ GeV, and the final line of (7.24) is numerically equal to 2.98. By contrast, the corresponding CSB amplitude is found from Eq. (7.15) when $m_H = 221$ GeV to have a numerical value of 0.807. Thus the RSB amplitude is seen to be enhanced by a factor of 2.98/0.807 = 3.7 relative to the CSB amplitude [19].
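The numbers in this comparison can be reproduced directly from the quoted inputs. The sketch below assumes the lowest-order mass formula $m_H^2 = 8\pi^2v^2(B + C)$ of Eq. (7.19), the RSB amplitude $24\pi^2(B + C) + \pi^2(32C + 48D)$ of Eq. (7.24), and the CSB amplitude $m_H^2/v^2$ of Eq. (7.15); variable names are ours.

```python
import math

v = 246.2      # vacuum expectation value in GeV
y = 0.0541     # leading-logarithm SFTP value of the scalar couplant

# Leading-logarithm SFTP series coefficients.
B, C, D = 3 * y**2, 9 * y**3, 27 * y**4

# Lowest-order running Higgs mass, Eq. (7.19).
m_H = v * math.sqrt(8 * math.pi**2 * (B + C))

# RSB amplitude for W+ W- -> H H, final line of Eq. (7.24).
rsb = 24 * math.pi**2 * (B + C) + math.pi**2 * (32 * C + 48 * D)

# CSB amplitude for the same process at the same Higgs mass, Eq. (7.15).
csb = (m_H / v) ** 2

print(f"m_H = {m_H:.1f} GeV, RSB = {rsb:.3f}, CSB = {csb:.3f}, ratio = {rsb / csb:.2f}")
```

The enhancement comes entirely from the $\pi^2(32C + 48D)$ term, since the first term of Eq. (7.24) equals three times the CSB amplitude by Eq. (7.19).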
These leading-logarithm SFTP results are corroborated by the leading-logarithm effective potential inclusive of the subdominant Standard Model couplants $x(v) = g_t^2/4\pi^2$, $z(v) = \alpha_s/\pi$, $r(v) = g_2^2/4\pi^2$, $s(v) = g'^2/4\pi^2$. The RSB Higgs boson mass is now 218 GeV (Table 3). The coefficients C and D are respectively found to be 0.00152 and 0.000256, and the numerical value of the final line of Eq. (7.24) is 2.96. For CSB with the same Higgs mass (218 GeV), the numerical value of the same amplitude (7.15) is 0.785. Thus, the Higgs/Goldstone sector scattering amplitude associated with $W_L^+W_L^- \to HH$ is enhanced in RSB over CSB by a factor of 2.96/0.785 = 3.8, in close agreement with the leading logarithm SFTP. Identical enhancements characterize the Higgs/Goldstone sector amplitudes $ZZ \to HH$. However, as noted above, all other amplitudes considered are found to be the same to lowest order in CSB and RSB scenarios. In particular, the Higgs widths $H \to ZZ$, $H \to W^+W^-$ do not differ to lowest order in CSB and RSB scenarios [19].
Suppose a Higgs boson of mass 220 GeV were found in near-future collider experiments, as predicted by radiatively-broken symmetry. This mass in and of itself would not be a confirmation of the radiative mechanism, as the Higgs boson mass is not determined by CSB and could fortuitously have the same empirical value as in radiatively-broken electroweak symmetry breaking. However, if the scattering processes ($W_L^+W_L^- \to HH$), ($ZZ \to HH$), ($HW \to HW$) were found to be enhanced from CSB expectations by an order of magnitude, such an enhancement in conjunction with the 220 GeV Higgs boson mass would be a strong signal that electroweak symmetry is broken radiatively rather than conventionally.
7.4. Methodology
The salient result presented in this paper for radiative electroweak symmetry breaking is the manifest controllability of corrections to the scalar-field self-coupling $y(v)$ $(= \lambda(v)/4\pi^2)$ and the running Higgs boson mass $[V''(v)]^{1/2}$ when contributions of sequentially subleading logarithms are incorporated into the effective potential's perturbative series. Such contributions are obtained from higher-than-one-loop terms in the renormalization-group equation. Indeed, it was noted by Coleman and Weinberg [1] that the one-loop effective potential they obtained diagrammatically could have also been obtained directly via the renormalization group (Callan-Symanzik) equation. This is demonstrated explicitly in Ref. [5].
However, the full set of leading logarithm contributions [as opposed to just the one-loop logarithm term] to the radiatively broken effective potential expressed in terms of the Standard Model's largest couplants {x, y, z} can also be obtained from one-loop RG functions [5]. In Sections 5 and 6 of the present paper, particularly as summarized in Eq. (6.9), these results are extended to next-to-leading logarithm order through use of the {x, y, z}-sensitive portions of the Standard Model's two-loop RG functions, an advance in its own right in the formulation of radiative electroweak symmetry breaking.
We are aware of the unconventional methodology of the paper in establishing the values of RG-inaccessible finite $\phi^4$ counterterms. The successive approximations to the full effective potential series delineated by Eqs. (5.4)-(5.6) for the dominant Standard Model couplants {x, y, z}, or alternatively the successive approximations (4.1)-(4.4) to the full SFTP potential, are imposed upon us because the aggregate $\phi^4$ coefficient K is itself O(|y|) in magnitude by virtue of the minimization condition (2.5). A conventional perturbative approach, in which K might be identified as a next-order $O(y^2)$ coefficient, is simply not feasible. Coleman and Weinberg [1] were able to escape this conundrum only by assuming (as appropriate to the time of their paper) that "wrong sign" Yukawa couplant contributions to the effective potential were small, enabling y (or $\lambda$) to be an $O(g_2^4)$ quantity. Such is not the case for the true Standard Model, as pointed out in Section 1. Thus the successive approximations in which K is defined to be a sum of $\phi^4$ counterterms [Eqs. (5.3) and (3.4)] appear to be the only field-theoretically consistent way to accommodate an O(|y|) magnitude for K.
Of course, K itself and its constituent $\phi^4$ coefficients in Eq. (5.3) are necessarily determined via a sequential procedure of consistently applied renormalization conditions, for which we have chosen repeated applications of Eqs. (2.5) and (2.6). On the face of things, this procedure does not appear to be equivalent to order-by-order perturbative subtraction schemes, such as $\overline{\rm MS}$. For such approaches to be viable, successive finite counterterms must be next order. Since K, the first of these, is itself O(|y|) in magnitude, any perturbative approach identifying K with an $O(y^2)$ coefficient is inappropriate. By contrast, the approach delineated in Eqs. (5.4)-(5.6) and Eqs. (4.1)-(4.4) converges on the full effective potential series via summation of successively subleading-logarithm contributions generated by their lead terms, namely successively higher-order purely-$\phi^4$ terms within the potential [e.g. $T_{k,0}\,y^{k+1}$ is the $\phi^4$ coefficient that serves as the lead term of the summation $y^{k+1}S_k(yL)$ in Eq. (3.8)]. All such $\phi^4$ coefficients, however, are included in the aggregate coefficient K common to every approximation [e.g. Eq. (3.4) and Eqs. (4.1)-(4.4)].
Since K is indeed an O(|y|) quantity by virtue of the opposite-sign t-quark contribution swamping the $O(g_2^4)$ contribution to the first leading logarithm of the effective potential, such successive approximations are the only way we have been able to find to formulate a large-couplant version of radiative symmetry breaking. Indeed, the approach we have developed constrains the effective potential
(1) to maintain a minimum at the physical electroweak vacuum expectation value $v = G_F^{-1/2}\,2^{-1/4} = 246.2$ GeV,
(2) to satisfy a renormalization group equation [Eq. (2.3) and, for the SFTP case, Eq. (3.5)] whose perturbative $\beta$- and $\gamma$-functions are calculated via $\overline{\rm MS}$,
(3) to maintain a consistent definition (2.6) of the scalar-field self-interaction couplant y at the $\mu = v$ scale, and
(4) to generate predictions for the scalar field self-coupling $\lambda$ (or $y = \lambda/4\pi^2$) and the running Higgs boson mass $[V''(v)]^{1/2}$ that are both reasonable in magnitude [$y \cong 0.05$ is sufficiently small for the RG functions (3.6) and (3.7) to decrease term-by-term in magnitude] and stable under subsequent subleading-logarithm corrections.
Point (2) above is of particular importance: our approach is rendered consistent with $\overline{\rm MS}$ by construction.
We reiterate that the discovery of a Higgs boson mass in the 220 GeV region, as indicated in Section 6, is not in itself a definitive confirmation of radiative electroweak symmetry breaking. Such a discovery would clearly point to radiative symmetry breaking only if accompanied by evidence for an anomalously large scalar field self-interaction coupling constant $\lambda$, i.e., a $\lambda$ five to six times larger than the conventional symmetry breaking prediction $\lambda = M_H^2/2v^2$. We have argued that such enhancements would manifest themselves in Higgs-Higgs scattering and, to a lesser extent, in scattering processes such as $W^+W^- \to HH$, $ZZ \to HH$. However, no such enhancements are evident in lowest order expressions for the Higgs width or in processes such as $WW \to ZZ$, $WW \to WW$ with Higgs/Goldstone sector analogs.
Acknowledgements
We are grateful for discussions with V.A. Miransky and M. Sher, and for support from the
Natural Sciences and Engineering Research Council of Canada.
Abstract
In no-scale supergravity global symmetries protect local supersymmetry and a zero value for the cosmological constant. We consider the breakdown of these symmetries and present a minimal SUGRA model
motivated by the multiple point principle, in which the total vacuum energy density is naturally tiny. In order to reproduce the observed value of the cosmological constant and preserve gauge coupling unification,
1. Introduction
The origin of a tiny energy density spread all over the Universe (the cosmological constant $\Lambda$), which is responsible for its accelerated expansion, is one of the most challenging problems nowadays. A fit to the recent data shows that $\Lambda \sim 10^{-123} M_{Pl}^4 \sim 10^{-55} M_Z^4$ [1]. At the same time the presence of a gluon condensate in the vacuum is expected to contribute an energy density of order $\Lambda_{QCD}^4 \simeq 10^{-74} M_{Pl}^4$. On the other hand if we believe in the Standard Model (SM) then a much
such a way that our MPP scenario is fulfilled without any extra fine-tuning. In the next section we
specify the no-scale SUGRA models, consider the breakdown of local supersymmetry in these
models and discuss the connection with MPP.
The simplest model, in which the implementation of our MPP scenario does not require any extra fine-tuning, is constructed in Section 3. In Section 4 we estimate the value of the cosmological constant in MPP inspired SUGRA models. The realization of our MPP scenario in models based on enlarged gauge symmetry groups like $[SU(3) \times SU(2) \times U(1)]^3$ is considered in Section 5. Section 6 is reserved for our conclusions and outlook.
2. No-scale supergravity
The full (N = 1) SUGRA Lagrangian [4,5] is specified in terms of an analytic gauge kinetic function $f_a(\phi_M)$ and a real gauge-invariant Kähler function $G(\phi_M, \bar\phi_M)$, which depend on the chiral superfields $\phi_M$. The function $f_a(\phi_M)$ determines the kinetic terms for the fields in the vector supermultiplets and the gauge coupling constants ${\rm Re}\, f_a(\phi_M) = 1/g_a^2$, where the index a designates different gauge groups. The Kähler function is a combination of two functions
$$G(\phi_M, \bar\phi_M) = K(\phi_M, \bar\phi_M) + \ln|W(\phi_M)|^2, \tag{1}$$
where $K(\phi_M, \bar\phi_M)$ is the Kähler potential whose second derivatives define the kinetic terms for the fields in the chiral supermultiplets. $W(\phi_M)$ is the complete analytic superpotential of the considered SUSY model. Here we shall use standard supergravity mass units: $M_{Pl}/\sqrt{8\pi} = 1$.
The SUGRA scalar potential can be presented as a sum of F- and D-terms, $V_{SUGRA}(\phi_M, \bar\phi_M) = V_F(\phi_M, \bar\phi_M) + V_D(\phi_M, \bar\phi_M)$, where the F- and D-parts are given by [4-6]
$$V_F(\phi_M, \bar\phi_M) = e^G\Big(\sum_{M,\bar N} G_M G^{M\bar N} G_{\bar N} - 3\Big), \qquad V_D(\phi_M, \bar\phi_M) = \frac{1}{2}\sum_a (D^a)^2, \qquad D^a = g_a\sum_{i,j} G_i T^a_{ij}\phi_j,$$
$$G_M \equiv \partial_M G \equiv \frac{\partial G}{\partial\phi_M}, \qquad G_{\bar M} \equiv \partial_{\bar M} G \equiv \frac{\partial G}{\partial\bar\phi_M}. \tag{2}$$
In Eq. (2) ga is the gauge coupling constant associated with the generator T a of the gauge
$$F^M = e^{G/2}\,G^{M\bar P}\,G_{\bar P} \tag{3}$$
is non-vanishing, then local SUSY is spontaneously broken. At the same time a massless fermion with spin 1/2, the goldstino, which is a combination of the fermionic partners of the hidden sector fields giving rise to the breaking of SUGRA, is swallowed up by the gravitino, which thereby becomes massive, $m_{3/2} = e^{G/2}$. This phenomenon is called the super-Higgs effect [7].
Usually the vacuum energy density at the minimum of the SUGRA scalar potential (2) is negative. To show this, let us suppose that the Kähler function has a stationary point, where all derivatives $G_M = 0$. Then it is easy to check that this point is also an extremum of the SUGRA scalar potential. In the vicinity of this point local supersymmetry remains intact while the energy density is $-3e^G$. It implies that the vacuum energy density must be less than or equal to this value. Therefore, in general, an enormous fine-tuning must be imposed in order to keep the total vacuum energy density in SUGRA models around the observed value of the cosmological constant [8].
Because the smallness of the parameters in a physical theory may be related to an almost
exact symmetry, it is interesting to investigate what kind of symmetries could protect the cosmological constant in N = 1 supergravity. It was discovered a long time ago that invariance
with respect to SU(1, 1) symmetry transformations results in a tree-level scalar potential which
vanishes identically along some directions [4,9,10]. In other words the corresponding scalar potential (2) possesses an infinite set of degenerate minima with zero vacuum energy density. The
SU(1, 1) structure of the N = 1 SUGRA Lagrangian can have its roots in supergravity theories
with extended supersymmetry (N = 4 or N = 8) [4].
The group SU(1, 1) contains subgroups of imaginary translations and dilatations [10,11]. The invariance of the Kähler function under the imaginary translations of the hidden sector superfields
$$z_i \to z_i + i\beta_i, \tag{4}$$
implies that the Kähler potential depends only on $z_i + \bar z_i$, while the superpotential is given by [12]
$$W(z_i, \phi_\sigma) = \exp\Big(\sum_{i=1}^{m} a_i z_i\Big)\,\tilde W(\phi_\sigma), \tag{5}$$
where the $a_i$ are real. Here we assume that the hidden sector involves m singlet superfields while the observable sector comprises chiral multiplets $\phi_\sigma$. Since $G(\phi_M, \bar\phi_M)$ is evidently invariant under the Kähler transformations [13]
$$K(\phi_M, \bar\phi_M) \to K(\phi_M, \bar\phi_M) - g(\phi_M) - g^*(\bar\phi_M), \qquad W(\phi_M) \to e^{g(\phi_M)}\,W(\phi_M),$$
the most general Kähler function can be written as
$$G(\phi_M, \bar\phi_M) = K(z_i + \bar z_i, \phi_\sigma, \bar\phi_\sigma) + \ln|W(\phi_\sigma)|^2, \tag{6}$$
where $W(\phi_\sigma) = \tilde W(\phi_\sigma)$.
The dilatation invariance constrains the Kähler potential and superpotential further. Suppose that hidden and observable superfields transform differently
$$z_i \to \alpha^2 z_i, \qquad \phi_\sigma \to \alpha\,\phi_\sigma. \tag{7}$$
Then the structure of the superpotential $W(\phi_\sigma)$ in phenomenologically acceptable SUGRA models is determined by the symmetry transformations (4) and (7). Indeed, because the superpotential in these models contains trilinear terms, which induce masses of quarks and leptons, all terms involving n chiral superfields with $n \neq 3$ are forbidden by the dilatation invariance. If there is only one field T in the hidden sector, then the Kähler function is fixed uniquely by the gauge and global symmetries of the model
$$K(T + \bar T, \phi_\sigma, \bar\phi_\sigma) = -3\ln(T + \bar T) + \sum_\sigma C_\sigma\frac{|\phi_\sigma|^2}{(T + \bar T)}, \qquad W(\phi_\sigma) = \sum_{\sigma,\beta,\gamma}\frac{1}{6}\,Y_{\sigma\beta\gamma}\,\phi_\sigma\phi_\beta\phi_\gamma, \tag{8}$$
where $C_\sigma$ and $Y_{\sigma\beta\gamma}$ are constants. Here we restrict our consideration to the lowest order terms $|\phi_\sigma|^2$ in the expansion of the Kähler potential in terms of observable superfields. The contribution of higher order terms to the SUGRA scalar potential is suppressed by inverse powers of $M_{Pl}$ and can be safely ignored.
For the particular choice of the symmetry transformations (7) the part of the SUGRA scalar potential which is induced by the Kähler function of the hidden sector vanishes [10], i.e.,
$$V_{hid} = e^G\left(G_T\,G^{T\bar T}\,G_{\bar T} - 3\right) = 0.$$
Then the full scalar potential takes the form
$$V = \frac{1}{3}\,e^{2K/3}\sum_\sigma\left|\frac{\partial W(\phi_\sigma)}{\partial\phi_\sigma}\right|^2 + \frac{1}{2}\sum_a (D^a)^2, \tag{9}$$
where the observable superfields are rescaled as = C3 . The potential (9) leads to a
supersymmetric particle spectrum at low energies. Owing to the particular form of the Khler
potential (8), it is positive definite. Its minimum is reached at the points for which W( ) =
D a = 0. As a consequence the vacuum energy density goes to zero near global minima of the
scalar potential (9). Thus imaginary translations (4) and dilatations (7) protect a zero value for
the cosmological constant in supergravity [10].2
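The vanishing of the hidden-sector contribution can be checked in one line. The sketch below assumes the no-scale form $K = -3\ln(T + \bar T)$ for the hidden-sector part of the Kähler potential and a superpotential that does not depend on T, so that the derivatives of G with respect to T reduce to those of K; the subscript notation for derivatives is ours.

```latex
G_T = G_{\bar T} = -\frac{3}{T+\bar T}, \qquad
G_{T\bar T} = \frac{3}{(T+\bar T)^2}
\;\Rightarrow\;
G^{T\bar T} = \frac{(T+\bar T)^2}{3},
\qquad
G_T\, G^{T\bar T}\, G_{\bar T}
  = \frac{9}{(T+\bar T)^2}\cdot\frac{(T+\bar T)^2}{3} = 3 .
```

Under these assumptions the bracket multiplying $e^G$ is $3 - 3 = 0$ for every value of T, which is the sense in which the hidden-sector potential is flat.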
The invariance of the Kähler function with respect to symmetry transformations (4) and (7) also prevents the breaking of local supersymmetry. In order to illustrate this, let us consider an SU(5) SUSY model with one field $\Phi$ in the adjoint representation and with one singlet field S. As before, the structure of the Kähler function is completely fixed by the global symmetries (4) and (7), which result in a Kähler potential and superpotential of the form given by Eq. (8). The superpotential of the considered model is further constrained by the SU(5) gauge symmetry:
W (S, ) = S 3 + Tr 3 + S Tr 2 .
(10)
3
In the general case the minimum of the scalar potential, which is induced by the superpotential
(10), is attained when S = = 0 and does not lead to the breakdown of local supersymmetry
or of gauge symmetry. But if = 40 3 /(32 ) there is a vacuum configuration
1 0 0
0
0
0 1 0
0
0
0
= 0 0 1
0
0 ,
15 0 0 0 3/2
0
0 0 0
0
3/2
4 15
S0 ,
S = S0 ,
(11)
0 =
3
which breaks SU(5) down to SU(3) SU(2) U (1). However, along the valley (11), the superpotential and all auxiliary fields Fi vanish preserving local supersymmetry and the zero value of
the vacuum energy density.
2 In [14] a symmetry that forbids a cosmological constant in six- and ten-dimensional theories is discussed.
In order to get a vacuum where local supersymmetry is broken, one should violate dilatation
invariance in the superpotential. Eliminating the singlet field from the considered SU(5) model
and introducing a mass term for the adjoint representation, we get the superpotential
W () = MX Tr 2 + Tr 3 .
(12)
The scalar potential of the resulting model is given by Eq. (9). It has a few degenerate vacua with
vanishing vacuum energy density. For example, in the scalar potential there exists a minimum
where
= 0 and another vacuum, which has a configuration similar to Eq. (11) but with
4 15
0 = 3 MX . In the first vacuum the SU(5) symmetry and local supersymmetry remain intact,
while in the second one the auxiliary field FT acquires a vacuum expectation value and a nonzero gravitino mass is generated:
|W ()|
) ,
=
m
|FT |
(T
+
T
3/2
(T + T )1/2
MX3
40
|W ()|
m3/2 =
(13)
=
9 2 (T + T )3/2
(T + T )3/2
although the vacuum expectation value of T is undetermined at tree level, since the hidden sector scalar potential is flat. As a result, local supersymmetry and gauge symmetry are broken in
the second vacuum. Nevertheless the invariance of the low energy effective Lagrangian of the
observable sector under the transformations of global supersymmetry is preserved (see Eq. (9)).
When MX goes to zero the dilatation invariance, as well as SU(5) symmetry and local supersymmetry in the second vacuum, are restored.
This simple SU(5) model with the superpotential (12) illustrates how the degenerate vacua
required for the application of MPP to supergravity are naturally realized in no-scale supergravity. In the second vacuum local supersymmetry is broken, as is supposed to be the case in the
physical vacuum in which we live. It is usually supposed that local supersymmetry breaking induces SUSY breaking terms. However there are no such terms in this no-scale SUGRA model
and global supersymmetry is unbroken in both vacua.
3. Minimal MPP inspired SUGRA model
The no-scale SUGRA model with the superpotential (12) is not viable from the phenomenological point of view, due to the absence of global supersymmetry breaking in the observable
sector for all vacua. This raises the question of whether it is possible to construct a phenomenologically acceptable model based on broken global symmetries (4) and (7), which realizes
our MPP scenario without extra fine-tuning. We need to generate soft SUSY breaking terms that
break global supersymmetry in the observable sector of the physical vacuum. These soft terms are
generally characterized by the gravitino mass scale, which must then be of order the electroweak
scale. This required small value of the gravitino mass m3/2 of course constitutes the gauge hierarchy problem, whose solution was the original motivation for no-scale models with a flat hidden
sector scalar potential.3 In this paper we concentrate on the hierarchy problem associated with
the tiny value of the cosmological constant and do not explicitly address the solution of the gauge
3 An enormous mass hierarchy ($m_{3/2} \ll M_{Pl}$) can appear due to a non-perturbative source of local supersymmetry breaking [15].
hierarchy problem. We shall simply assume there is a weak breaking of the dilatation invariance
of the hidden sector superpotential characterized by a hierarchically small parameter $\kappa$.
In fact, we take the hidden sector to include two superfields, T and z, that transform differently
under dilatations
$$T \to \alpha^2 T, \qquad z \to \alpha z, \qquad \phi_\sigma \to \alpha\,\phi_\sigma, \tag{14}$$
and imaginary translations
$$T \to T + i\beta, \qquad z \to z, \qquad \phi_\sigma \to \phi_\sigma. \tag{15}$$
In Eqs. (14), (15) $\phi_\sigma$ represent the observable superfields. The hidden sector superfield z transforms similarly to $\phi_\sigma$ under the global symmetry transformations (14), (15). It plays a role
analogous to the SU(5) adjoint field $\Phi$ in Eq. (12) and appears in the full superpotential of
the model
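$$W(z,\phi_\sigma) = \kappa\left(z^3 + \mu_0 z^2 + \sum_{n=4}^{\infty} c_n z^n\right) + \sum_{\sigma,\rho,\tau}\frac{1}{6}\,Y_{\sigma\rho\tau}\,\phi_\sigma\phi_\rho\phi_\tau. \tag{16}$$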
The bilinear mass term for the superfield z and the higher order terms $c_n z^n$ in the superpotential (16) spoil the dilatation invariance. But, as we noticed in Section 2, such a breakdown of the
symmetry protecting the cosmological constant may preserve a zero value of the vacuum energy
density in all global minima of the scalar potential of the model, if the structure of the Kähler
potential remains intact. It may also give rise to the spontaneous breakdown of local supersymmetry in the physical vacuum. Furthermore we require a locally supersymmetric vacuum with
zero cosmological constant in our MPP scenario. We note that the conditions for the existence
of such a vacuum are that the superpotential W for the hidden sector and its derivatives should
vanish$^4$ at the corresponding minimum of the scalar potential
$$\langle W(z)\rangle = \left\langle\frac{\partial W(z)}{\partial z}\right\rangle = 0. \tag{17}$$
So we restrict our considerations to breakdowns of dilatation invariance which result in a global
minimum of the SUGRA scalar potential at z = 0, because it represents a vacuum where local
supersymmetry remains intact. According to Eq. (9) there is no global minimum at z = 0, if
the superpotential (16) contains a term proportional to z or terms which are inversely proportional to a power of z. Terms involving negative powers of the superfields are not present in the
superpotentials of the simplest SUSY models like the minimal supersymmetric standard model
(MSSM) and the next to minimal supersymmetric standard model. A term proportional to z can
be forbidden by a gauge symmetry of the hidden sector, if z transforms non-trivially under the
corresponding gauge transformations, as in the case of our toy SU(5) model (12).
Because the dilatation invariance is broken explicitly, one may expect the appearance of bilinear and higher order terms in the superpotential of the observable sector. Some of them are
potentially dangerous. For instance, the inclusion of the bilinear terms leads to the so-called $\mu$-problem in the simplest SUSY models. Actually in the MSSM, the SM gauge symmetry
allows only one bilinear term $\mu\,(H_1 \epsilon H_2)$, where $H_1$ and $H_2$ are Higgs doublets. From dimensional
considerations it is obvious that the corresponding mass parameter $\mu$ should be of the order of the
Planck scale, because this is the only scale characterizing SUGRA theories. At the same time the
correct pattern of electroweak symmetry breaking requires $\mu$ to be in the TeV range. In order to
avoid a new hierarchy problem, the dilatation invariance should not be spoilt in the part of the
superpotential (16) that includes observable superfields.
4 The vanishing of W implies that the last term in the expression for $V_F(\phi_M, \bar\phi_M)$ (see Eq. (2)), which led to the
negative energy density, vanishes. Taking into account that the Kähler metric of the hidden sector is positive definite, one
can prove that the absolute minimum of the scalar potential (2) is achieved when the derivative of W vanishes [3].
For completeness we have to specify the Kähler potential in our MPP inspired SUGRA model.
It is fixed as follows
$$K(\phi_M,\bar\phi_M) = -3\ln\Big[T+\bar T-|z|^2-\sum_\sigma \zeta_\sigma|\phi_\sigma|^2\Big] + \sum_{\sigma,\rho}\Big(\frac{\eta_{\sigma\rho}}{2}\,\phi_\sigma\phi_\rho + \text{h.c.}\Big) + \sum_\sigma \xi_\sigma|\phi_\sigma|^2, \tag{18}$$
where $\zeta_\sigma$, $\eta_{\sigma\rho}$, $\xi_\sigma$ are some constants. The kinetic terms of the scalar fields, which are induced
by the first term on the right-hand side of Eq. (18), are invariant under the isometric transformations of the non-compact SU(N, 1) group [16], where N is the number of chiral superfields
in the model. This symmetry can be derived from extended ($\mathcal{N} \ge 5$) supergravity theories [17].
The Yukawa interactions in the superpotential (16) and D-terms in the scalar potential break
SU(N, 1) symmetry explicitly, in such a way that only invariance under the dilatations and imaginary translations can be realized in phenomenologically viable N = 1 SUGRA models. Exactly
this type of SUGRA model was discussed in Section 2. The Kähler potential (8) can be easily
reproduced, if one expands the first term in Eq. (18) in powers of $|z|^2/(T+\bar T)$ and $|\phi_\sigma|^2/(T+\bar T)$. Thus, in the
limit when $\eta_{\sigma\rho}$, $\xi_\sigma$ and $\kappa$ go to zero, the invariance under the symmetry transformations (14),
(15) is restored, protecting supersymmetry and a zero value of the cosmological constant.
In Section 2 we demonstrated that the violation of dilatation invariance does not necessarily
cause the breaking of global supersymmetry at low energies. This is the reason why we include
extra terms in the Kähler potential of our SUGRA model. We allow the breakdown of the dilatation
invariance in the Kähler potential of the observable sector only. The part of $K(\phi_M, \bar\phi_M)$
involving hidden sector superfields is responsible for the cancellation of the negative contribution to the total vacuum energy density coming from the term $-3e^{G}$ in the scalar potential (2).
Therefore any variations in the Kähler potential of the hidden sector may spoil the vanishing of
the vacuum energy density in global minima. For example, if the factor in front of the logarithm
in Eq. (18) is greater than $-3$ then the SUGRA scalar potential is not positive definite and the
total energy density tends to be huge and negative.
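The role of the coefficient of the logarithm can be made explicit in a one-field sketch. It assumes that Eq. (2), which is not reproduced in this section, has the standard $\mathcal{N}=1$ supergravity form (in units $M_{Pl}=1$), that the superpotential does not depend on T, and it replaces the factor 3 by a hypothetical parameter n:

```latex
\begin{align}
K &= -n\ln(T+\bar T), \qquad
K_T = -\frac{n}{T+\bar T}, \qquad
K_{T\bar T} = \frac{n}{(T+\bar T)^2}, \\
V &= e^{K}\left[K^{T\bar T}\,|K_T W|^2 - 3|W|^2\right]
   = \frac{(n-3)\,|W|^2}{(T+\bar T)^{n}} .
\end{align}
```

For n = 3 the two contributions cancel identically, which is the flat no-scale potential. For n < 3, i.e. a coefficient of the logarithm greater than $-3$, the potential is negative wherever $W \neq 0$.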
In order to avoid cumbersome calculations, we introduce the simplest set of terms breaking the
dilatation invariance in the Kähler potential. All the terms are bilinear with respect to observable
superfields and do not depend on the hidden sector fields. Higher order terms are irrelevant for
our study, since their contribution to the low energy effective potential is suppressed by inverse
powers of $M_{Pl}$. Additional terms which are proportional to $|\phi_\sigma|^2$ normally appear in minimal
SUGRA models [18-20]. The other terms $\eta_{\sigma\rho}\,\phi_\sigma\phi_\rho$ introduced in the Kähler potential (18) give
rise to effective $\mu$ terms after the spontaneous breakdown of local supersymmetry, solving the
$\mu$ problem [21].
In the limit when $\eta_{\sigma\rho}$ and $\xi_\sigma$ vanish while $\zeta_\sigma \to 1$, we return to the SUGRA scalar
potential of the form (9). In this case, the scalar potential of the hidden sector becomes
$$V(T,z) = \frac{1}{3\,(T+\bar T-|z|^2)^2}\left|\frac{\partial W(z)}{\partial z}\right|^2. \tag{19}$$
The minima of the scalar potential (19) are attained at the stationary points of the hidden sector
superpotential. In the simplest case when $c_n = 0$, the superpotential (16) has two extremum points
at $z = 0$ and $z = -\frac{2}{3}\mu_0$. At these points the scalar potential (19) achieves its absolute minimal
value, i.e., zero. In the first vacuum where $z = -\frac{2}{3}\mu_0$, local supersymmetry is broken and the
gravitino gets a non-zero mass
$$m_{3/2} = \left\langle\frac{W(z)}{(T+\bar T-|z|^2)^{3/2}}\right\rangle = \frac{4\kappa\mu_0^3}{27\left(T+\bar T-\frac{4\mu_0^2}{9}\right)^{3/2}}. \tag{20}$$
In the second minimum, the vacuum expectation value of the superfield z and the superpotential
of the hidden sector vanish. Therefore the conditions (17) are fulfilled automatically and local
supersymmetry remains intact. If the high order terms $c_n z^n$ are present in Eq. (16), the scalar
potential of the hidden sector may have many degenerate vacua with vanishing vacuum energy
density, where the gravitino may remain massless or gain a non-zero mass.
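The two vacua and the gravitino mass of Eq. (20) follow from a short calculation. The sketch below assumes $c_n = 0$ and writes the small overall parameter of the hidden sector superpotential as $\kappa$ and its mass scale as $\mu_0$ (labels adopted here for the corresponding symbols of Eq. (16)):

```latex
\begin{align}
W(z) &= \kappa\,(z^3 + \mu_0 z^2), &
\frac{\partial W}{\partial z} &= \kappa\, z\,(3z + 2\mu_0) = 0
\;\Rightarrow\; z = 0 \ \text{or}\ z = -\tfrac{2}{3}\mu_0, \\
W(0) &= 0, &
W\!\left(-\tfrac{2}{3}\mu_0\right) &= \kappa\,\mu_0^3\left(-\tfrac{8}{27} + \tfrac{4}{9}\right)
 = \tfrac{4}{27}\,\kappa\,\mu_0^3 .
\end{align}
```

Both points give a vanishing potential in Eq. (19) because $\partial W/\partial z = 0$ there, but only the second one has $W \neq 0$ and hence a non-zero gravitino mass; with $|z|^2 = 4\mu_0^2/9$ this reproduces Eq. (20).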
The main disadvantage of the scenario considered above is related to the degeneracy between bosons and fermions in the observable sector, which is preserved in the limit $\eta_{\sigma\rho}, \xi_\sigma \to 0$
despite the breakdown of local supersymmetry. In the general case, when both $\eta_{\sigma\rho}$ and $\xi_\sigma$ have
non-zero values, the situation changes dramatically. Since, by construction, the dilatation invariance is only broken in the part of the Kähler potential (18) containing observable superfields, it
does not affect the scalar potential of the hidden sector which is still described by Eq. (19). As a
result our MPP scenario is realized without any extra fine-tuning.
Nevertheless the shape of the effective scalar potential of the observable sector, in the vacua
where the super-Higgs effect takes place, alters significantly. The structure of the soft SUSY
breaking terms in the considered model, which is discussed in Appendix A, allows us to write
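the effective potential of the observable sector (A.1) in a compact form$^5$
$$V_{eff}(y_\sigma, y^*_\sigma) \simeq \sum_\sigma\left|\frac{\partial W_{eff}(y_\rho)}{\partial y_\sigma} + m_\sigma\, y^*_\sigma\right|^2 + \frac{1}{2}\sum_a (D^a)^2. \tag{21}$$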
Although global supersymmetry is softly broken, the effective potential of the scalar fields (21) is
still positive definite and vanishes near its global minima. It follows that the spontaneous breakdown of electroweak symmetry cannot be naturally arranged in our model, because normally it
results in negative vacuum energy density, i.e., the minimum of the scalar potential with broken
$SU(2)_W \times U(1)_Y$ symmetry ought to be deeper than the vacuum where gauge invariance is preserved and the doublet Higgs fields vanish ($\langle H_1\rangle = \langle H_2\rangle = 0$). Moreover in the simplest MPP
inspired SUGRA model discussed above, the mechanism for the stabilization of the vacuum expectation value of the hidden sector field T remains unclear. As a result the gravitino mass (see
Eq. (20)) and the supersymmetry breaking scale are not fixed in the physical vacuum.
However all these problems cannot be addressed in the framework of the simplest MPP
inspired SUGRA model. In order to get a self-consistent solution, one has to include all perturbative and non-perturbative corrections to the considered SUGRA Lagrangian, which should
depend on the structure of the underlying theory. If we take into account the evolution of the soft
scalar masses, then their renormalization group flow might provide a radiative mechanism for
5 This form of the scalar potential can be established in a straightforward way in the limit when all go to zero
electroweak symmetry breaking [22]. We hope that an underlying renormalizable or even finite
theory can be found, which sheds light on the origin of the terms that spoil the global symmetry
protecting the cosmological constant in our SUGRA model. It should also ensure the stabilization of the vacuum expectation values of the hidden sector fields and the supersymmetry breaking
scale.
4. Cosmological constant in MPP inspired SUGRA models
We now assume that a phenomenologically viable MPP inspired SUGRA model of the type
just discussed can be constructed. That is to say, we assume the existence of a phase with electroweak gauge symmetry breaking induced by soft SUSY breaking terms degenerate with a
second phase, in which the low-energy limit of the considered theory is described by a pure
supersymmetric model in flat Minkowski space. Non-perturbative effects in the observable sector may lead to supersymmetry breakdown in the second vacuum state (for recent reviews see
[23,24]). Then in compliance with our MPP philosophy, we require the degeneracy of the vacua
after all non-perturbative effects are included.
The non-perturbative effects in simple SUSY models, like the minimal supersymmetric standard model (MSSM), are extremely weak. Our strategy is to estimate these effects in the second
vacuum and thereby estimate the energy density in the second (almost supersymmetric) phase.
This value of the cosmological constant can then be interpreted as the physical value in our phase,
by virtue of MPP.
If supersymmetry breaking takes place in the second vacuum, it is caused by the strong interactions. Indeed, even in the pure MSSM, the beta function of the strong gauge coupling constant
$\alpha_3$ exhibits asymptotically free behaviour ($b_3 = -3$).$^6$ As a consequence $\alpha_3(Q)$ increases in
the infrared region and one can expect that the role of non-perturbative effects is enhanced.
Since in the minimal SUGRA model the kinetic functions essentially do not depend on the hidden superfields ($f_a(z_m) \simeq \text{const}$), the values of the gauge couplings at the high energy scale
and their running down to the scale $M_S \sim m_{3/2}$ are the same in both vacua. Below the scale
$M_S$ all superparticles in the physical vacuum decouple and the corresponding beta function
changes ($\tilde b_3 = -7$). Using the value of $\alpha_3^{(1)}(M_Z) \approx 0.118 \pm 0.003$ and the matching condition
$\alpha_3^{(2)}(M_S) = \alpha_3^{(1)}(M_S)$, one finds the strong coupling in the second vacuum
$$\frac{1}{\alpha_3^{(2)}(M_S)} = \frac{1}{\alpha_3^{(1)}(M_Z)} - \frac{\tilde b_3}{4\pi}\ln\frac{M_S^2}{M_Z^2}. \tag{22}$$
In Eq. (22) $\alpha_3^{(1)}$ and $\alpha_3^{(2)}$ are the values of the strong gauge couplings in the physical and second
minima of the SUGRA scalar potential.
At the scale $\Lambda_{SQCD}$, where the supersymmetric QCD interactions become strong in the second
vacuum
$$\Lambda_{SQCD} = M_S\exp\left[\frac{2\pi}{b_3\,\alpha_3^{(2)}(M_S)}\right], \tag{23}$$
the supersymmetry may be broken dynamically due to non-perturbative effects. If instantons
generate a repulsive superpotential [23,25,26] which lifts and stabilizes the vacuum valleys in the
scalar potential, then a generalized O'Raifeartaigh mechanism gives rise to a non-zero positive
value for the cosmological constant
$$\Lambda \simeq \Lambda_{SQCD}^4. \tag{24}$$
6 The gauge couplings obey the renormalization group equations $\frac{d\log\alpha_i(Q)}{d\log Q^2} = \frac{b_i\,\alpha_i(Q)}{4\pi}$, where $\alpha_i(Q) = g_i^2(Q)/(4\pi)$.
Fig. 1. The value of $\log[\Lambda_{SQCD}/M_{Pl}]$ versus $\log M_S$. The thin and thick solid lines correspond to the pure MSSM
and the MSSM with an additional pair of $5 + \bar 5$ multiplets, respectively. The dashed and dash-dotted lines represent the
uncertainty in $\alpha_3(M_Z)$. The upper dashed and dash-dotted lines are obtained for $\alpha_3(M_Z) = 0.124$, while the lower ones
correspond to $\alpha_3(M_Z) = 0.112$. The horizontal line represents the observed value of $\Lambda^{1/4}$. The SUSY breaking scale
$M_S$ is measured in GeV.
In Fig. 1 the dependence of $\Lambda_{SQCD}$ on the SUSY breaking scale $M_S$ is examined. Because
$\tilde b_3 < b_3$ the QCD gauge coupling below $M_S$ is larger in the physical minimum than in the second
one. Therefore the value of $\Lambda_{SQCD}$ is much lower than the QCD scale in the Standard Model and
diminishes with increasing $M_S$. When the supersymmetry breaking scale in our vacuum is of the
order of 1 TeV, we obtain $\Lambda_{SQCD} = 10^{-26} M_{Pl} \simeq 100$ eV. This results in an enormous suppression
of the total vacuum energy density ($\simeq 10^{-104} M_{Pl}^4$) compared to, say, an electroweak scale
contribution in our vacuum $v^4 \simeq 10^{-62} M_{Pl}^4$. From the rough estimate of the energy density (24),
it can be easily seen that the measured value of the cosmological constant is reproduced when
$\Lambda_{SQCD} = 10^{-31} M_{Pl} \simeq 10^{-3}$ eV. The appropriate values of $\Lambda_{SQCD}$ can therefore only be obtained for $M_S = 10^3$-$10^4$ TeV. However the consequent large splitting within SUSY multiplets
would spoil gauge coupling unification in the MSSM and reintroduce the hierarchy problem,
which would make the stabilization of the electroweak scale rather problematic.
A model consistent with electroweak symmetry breaking and cosmological observations can
be constructed, if the MSSM particle content is supplemented by an additional pair of $5 + \bar 5$
multiplets. These new bosons and fermions would not affect gauge coupling unification, because
they form complete representations of SU(5) (see for example [27]). In the physical vacuum
these extra particles would gain masses around the supersymmetry breaking scale. The corresponding mass terms in the superpotential are generated after the spontaneous breaking of local
supersymmetry, due to the presence of the bilinear terms $[\eta\,(5\cdot\bar 5) + \text{h.c.}]$ in the Kähler potential
of the observable sector [21]. Near the second minimum of the SUGRA scalar potential the new
particles would be massless, since $m_{3/2} = 0$. Therefore they give a considerable contribution
to the $\beta$ functions ($b_3 = -2$), reducing $\Lambda_{SQCD}$ further. In this case the observed value of the
cosmological constant can be reproduced even for $M_S \simeq 1$ TeV (see Fig. 1).
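The estimates quoted above can be reproduced with a few lines of code. The following is a minimal one-loop sketch of Eqs. (22) and (23), not the numerical analysis behind Fig. 1: the value of the Z mass and the function names are assumptions made here, threshold effects are ignored, and the coefficients are $\tilde b_3 = -7$ below $M_S$ in the physical vacuum and $b_3 = -3$ (pure MSSM) or $b_3 = -2$ (with the extra $5 + \bar 5$ pair) in the second vacuum.

```python
import math

M_Z_GEV = 91.19  # Z boson mass in GeV (assumed input, not given in the text)


def inv_alpha3_at_ms(m_s_gev, alpha3_mz=0.118, b3_tilde=-7.0):
    """Eq. (22): 1/alpha_3 at M_S from one-loop running in the physical vacuum."""
    log_ratio = math.log(m_s_gev**2 / M_Z_GEV**2)
    return 1.0 / alpha3_mz - b3_tilde / (4.0 * math.pi) * log_ratio


def lambda_sqcd_gev(m_s_gev, b3=-3.0, alpha3_mz=0.118):
    """Eq. (23): scale at which the one-loop coupling of the second vacuum blows up."""
    alpha3_ms = 1.0 / inv_alpha3_at_ms(m_s_gev, alpha3_mz)
    return m_s_gev * math.exp(2.0 * math.pi / (b3 * alpha3_ms))


for label, b3 in (("pure MSSM", -3.0), ("MSSM + 5 + 5bar", -2.0)):
    lam_ev = lambda_sqcd_gev(1000.0, b3) * 1e9  # convert GeV to eV
    print(f"{label}: Lambda_SQCD ~ {lam_ev:.1e} eV for M_S = 1 TeV")
```

With these inputs the pure MSSM case gives a scale of several tens of eV for $M_S = 1$ TeV, and the value $10^{-3}$ eV is only crossed for $M_S$ between $10^3$ and $10^4$ TeV, whereas $b_3 = -2$ already gives a scale just below $10^{-3}$ eV at $M_S = 1$ TeV.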
Unfortunately achieving dynamical SUSY breaking at the scale $\Lambda_{SQCD}$ is actually not at all
easy. The situation is different depending on whether the number of flavours $N_f$ is larger or
smaller than the number of colours $N_c$. In the MSSM and its simplest extensions, where $N_c = 3$
and $N_f = 6$, the generated superpotential has a polynomial form [24,28]. The absolute minimum
of the SUSY scalar potential is then reached when all the superfields, including their F- and
D-terms, acquire zero vacuum expectation values preserving supersymmetry in the second vacuum. This result throws some doubt on our estimates of the value of the cosmological constant,
which are based on Eq. (24).
But the above disappointing facts concerning dynamical SUSY breaking were revealed in the
framework of pure supersymmetric QCD, where all Yukawa couplings were supposed to be small
or even absent. At the same time the t-quark Yukawa coupling in the MSSM is of the same order
of magnitude as the strong gauge coupling at the electroweak scale. Therefore it might change
the results of the SUSY breaking studies drastically leading, for example, to the formation of a
quark condensate that breaks supersymmetry.
5. Implementation of the MPP in the models with extended gauge symmetry
The dynamical breakdown of supersymmetry in the observable sector of the second vacuum
can be more easily achieved in models with an extended strong interaction gauge sector. Here
we restrict our consideration to the class of models based on SU(N ) gauge symmetry groups.
Since the extension of the gauge sector of the SM is already a very strong assumption, we prefer
to limit the particle content of the model as much as possible. In particular it is worth combining
the spontaneous breakdowns of the enlarged gauge symmetry and local supersymmetry, as takes
place in our toy SU(5) model (12), rather than introducing two separate sectors for this purpose. It
seems that the simplest gauge extension of the MSSM, for which the dynamical supersymmetry
breaking occurs at low energies independently of the values of the Yukawa couplings, should
include at least three SU(3) gauge groups. If the quarks of each generation are coupled to the
gauge bosons of just their own distinct SU(3), then the criterion $N_c > N_f$ is satisfied. We here
consider a model with an $[SU(3)]^3$ gauge symmetry, as in the family replicated gauge group or
anti-grand unification model [2,29]. As in this model, we take the three $[SU(3)]^3$
gauge coupling constants to be equal and we denote their value at the scale Q by $g_{33}(Q)$.
In the physical vacuum the $[SU(3)]^3$ gauge symmetry must be broken down to $SU(3)_C$, which
is associated with the SM strong interactions. This can be simply arranged if the considered
theory includes multiplets in the bi-fundamental representation, which transform as a triplet with
respect to one SU(3) and as an anti-triplet under another SU(3) symmetry. In the models based on
$SU(3)_a \times SU(3)_b \times SU(3)_c$ there can be six bi-fundamental representations: $\Phi_{ab}$, $\Phi_{ac}$, $\Phi_{ba}$, $\Phi_{bc}$,
$\Phi_{ca}$, $\Phi_{cb}$, where the indices a, b and c correspond to the three different SU(3) gauge groups and
the corresponding quark generations. If the superfields $\Phi_{ij}$ acquire vacuum expectation values
$$\langle\Phi_{ij}\rangle = \Phi_0\begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix}, \qquad i, j = a, b, c, \tag{25}$$
then below the energy scale $\Phi_0$ the $[SU(3)]^3$ gauge group reduces to the diagonal subgroup
corresponding to the usual QCD $SU(3)_C$ symmetry. It follows that the QCD gauge coupling
constant $g_3(Q)$ is then related to the $[SU(3)]^3$ gauge coupling constant at the scale $\Phi_0$:
$$g_{33}^{(1)}(\Phi_0 + \epsilon) = \sqrt{3}\,g_3^{(1)}(\Phi_0 - \epsilon). \tag{26}$$
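The factor $\sqrt{3}$ in Eq. (26) can be read off from the gauge kinetic terms. As a sketch, assume the three SU(3) gauge fields are normalised so that the common coupling $g_{33}$ multiplies their kinetic terms, and that below the symmetry breaking scale only the diagonal combination (written here as $A_\mu$, a label not used in the original) remains massless:

```latex
\begin{align}
\mathcal{L}_{\rm kin} &= -\sum_{i=a,b,c}\frac{1}{4 g_{33}^2}\,F^{(i)}_{\mu\nu}F^{(i)\,\mu\nu}
\;\xrightarrow{\;A^{(a)}_\mu = A^{(b)}_\mu = A^{(c)}_\mu = A_\mu\;}\;
-\frac{3}{4 g_{33}^2}\,F_{\mu\nu}F^{\mu\nu}
\equiv -\frac{1}{4 g_3^2}\,F_{\mu\nu}F^{\mu\nu}, \\
\frac{1}{g_3^2} &= \frac{3}{g_{33}^2}
\quad\Longleftrightarrow\quad
g_{33} = \sqrt{3}\; g_3 .
\end{align}
```

In terms of $\alpha = g^2/(4\pi)$ this is the statement that the inverse couplings of the three factors add, $1/\alpha_3 = 3/\alpha_{33}$, at the matching scale.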
The desired pattern of $[SU(3)]^3$ gauge symmetry breaking can be obtained in the no-scale
SUGRA model with superpotential
$$W = \mu_X\big[\mathrm{Tr}(\Phi_{ab}\Phi_{ba}) + \mathrm{Tr}(\Phi_{ac}\Phi_{ca}) + \mathrm{Tr}(\Phi_{bc}\Phi_{cb})\big] + k\big[\mathrm{Tr}(\Phi_{ab}\Phi_{bc}\Phi_{ca}) + \mathrm{Tr}(\Phi_{ba}\Phi_{ac}\Phi_{cb})\big] + W(\phi_\sigma), \tag{27}$$
and Kähler potential
$$K(\phi_M, \bar\phi_M) = -3\ln\Big[T + \bar T - \sum_{i,j}|\Phi_{ij}|^2\Big] + K_\phi, \tag{28}$$
the scalar components of the observable superfields gain a universal mass which coincides
with the gravitino mass $m_{3/2} = \frac{3\mu_X^3}{k^2\,(T+\bar T-18(\mu_X/k)^2)^{3/2}}$ (see Eq. (A.4) where x ). For simplicity, we take k of order unity in the following.
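The gravitino mass quoted above can be checked along the direction of Eq. (25). In this sketch the mass parameter of the bilinear terms in Eq. (27) is written as $\mu_X$ and the common vacuum expectation value as $\Phi_0$ (labels assumed here for symbols lost in the text), the three bilinear traces are taken to share the overall factor $\mu_X$ and the two trilinear traces the factor k, and only $\Phi_0$ is varied:

```latex
\begin{align}
W\big|_{\Phi_{ij} = \Phi_0 \mathbb{1}} &= 9\,\mu_X \Phi_0^2 + 6\,k\,\Phi_0^3, \qquad
\frac{\partial W}{\partial \Phi_0} = 18\,\Phi_0\,(\mu_X + k\,\Phi_0) = 0
\;\Rightarrow\; \Phi_0 = 0 \ \text{or}\ \Phi_0 = -\frac{\mu_X}{k}, \\
W\Big|_{\Phi_0 = -\mu_X/k} &= \frac{9\mu_X^3}{k^2} - \frac{6\mu_X^3}{k^2} = \frac{3\mu_X^3}{k^2}, \qquad
\sum_{i,j}|\Phi_{ij}|^2 = 6\cdot 3\,\Phi_0^2 = 18\left(\frac{\mu_X}{k}\right)^2 .
\end{align}
```

Inserting these two results into $m_{3/2} = |W|/(T+\bar T-\sum_{i,j}|\Phi_{ij}|^2)^{3/2}$ gives the numerator $3\mu_X^3/k^2$ and the shift $18(\mu_X/k)^2$ that appear in the expression for the gravitino mass.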
In the second vacuum ($\Phi_0 = 0$) supersymmetry and gauge symmetries are left unbroken. The
gauge couplings of each $SU(3)_i$ grow with decreasing energy scale developing a Landau pole
much below $\mu_X$. At low energies, where the $SU(3)_i$ gauge interactions become very strong ($E \sim
\Lambda_{SQCD}$), non-perturbative effects induce a sizable instanton contribution $W_{inst}$ to the effective
superpotential (see [25]) that takes the form
$$W = W_{inst} + h_t\,(H_u \epsilon Q)\,t^c + h_b\,(H_d \epsilon Q)\,b^c + h_\tau\,(H_d \epsilon L)\,\tau^c, \qquad W_{inst} \propto \frac{\Lambda_{SQCD}^7}{\epsilon^{\alpha\beta}\,(Q_\alpha t^c)(Q_\beta b^c)}. \tag{30}$$
For simplicity we only keep superfields belonging to the third generation in the superpotential
(30), together with the Higgs doublets $H_u$ and $H_d$. In Eq. (30) $\alpha$ and $\beta$ are SU(2) indices labeling the components of the SU(2) doublet $Q_\alpha$, whereas $\epsilon^{\alpha\beta}$ is the completely antisymmetric
tensor. The non-perturbative superpotential $W_{inst}$ gives rise to supersymmetry breaking. Indeed,
in a vacuum where supersymmetry is preserved, all the auxiliary fields $F_i$ have to be zero. The
vanishing of $F_{H_u}$ implies that the vacuum expectation value of either Q or $t^c$ is zero. At the
same time with the superpotential (30), $W_{inst}$ as well as $F_Q$ and $F_{t^c}$ are singular when $Q = 0$ or
$t^c = 0$. Therefore it is not consistent to assume that supersymmetry is preserved in the vacuum,
but non-perturbative instanton effects must break the supersymmetry and give rise to a non-zero
vacuum energy density $\sim \Lambda_{SQCD}^4$.
So far the gauge kinetic function in the considered model has not been specified. In contrast with the simplest MPP inspired models, a constant gauge kinetic function in this particular
gauge extension of the SM does not allow us to reproduce the observed value of the cosmological constant. In realistic scenarios the supersymmetry breaking scale in the physical vacuum has
to be above a few hundred GeV, restricting the permitted range of $\mu_X$ from below. Assuming
that T gets a vacuum expectation value around unity (i.e. $\langle T\rangle \simeq M_{Pl}$), the scale of $[SU(3)]^3$ symmetry breaking ought to be higher than $10^{13}$ GeV but should not exceed $M_{Pl}$. In order to get
a phenomenologically acceptable value for the vacuum energy density in the second minimum,
which according to MPP coincides with the cosmological constant in our vacuum, we require
$\Lambda_{SQCD} \sim 10^{-3}$ eV. Hence the SU(3) gauge couplings $g_{33}^{(2)}$ at the scale $\mu_X$ are required to be in
the vicinity of 0.4 in the second vacuum (see Fig. 2). However, the value of the SU(3) gauge couplings in the physical vacuum $g_{33}^{(1)}$ just above the scale $\mu_X$ is considerably larger than $g_{33}^{(2)}(\mu_X)$,
as one can see from Fig. 3.
Fig. 2. The value of the vacuum energy density as a function of the overall $[SU(3)]^3$ gauge coupling at the scale $\mu_X$
in the second vacuum. The dash-dotted and thick curves represent the dependence of the energy density on $g_{33}^{(2)}(\mu_X)$
for $\mu_X = M_{Pl}$ and $\mu_X = 10^{13}$ GeV, respectively. The horizontal solid line corresponds to the observed value of the
cosmological constant $\Lambda$.
Fig. 3. The dependence of the overall $[SU(3)]^3$ gauge coupling $g_{33}(\mu_X)$ on the scale $\mu_X$. The upper solid curve represents
$g_{33}^{(1)}(\mu_X)$ and is obtained by the extrapolation of $\alpha_3(M_Z)$ up to the scale $\mu_X$ in the physical vacuum. The lower
thick line represents the values of $g_{33}^{(2)}(\mu_X)$ that allow us to fit the vacuum energy density in the second vacuum to
its phenomenologically acceptable value $10^{-123} M_{Pl}^4$. The scale of the $[SU(3)]^3$ symmetry breaking $\mu_X$ is given
in GeV.
Thus, in order to obtain an appropriate value of $\Lambda_{SQCD}$, the $SU(3)_i$ gauge couplings in the
second vacuum have to be two or three times smaller than in the physical one. This can be
achieved if the gauge kinetic function depends quite strongly on the vacuum expectation values
of the bi-fundamental multiplets $\Phi_{ij}$. The simplest gauge kinetic function for the gauge group
$SU(3)_a$, which is invariant under gauge symmetry transformations, imaginary translations and
dilatations, can be written as
$$f_a(\phi_M) = f_a^0 + \sum_{i,j} f^a_{ij}\,\frac{|\Phi_{ij}|^2}{(T + \bar T)}. \tag{31}$$
When we take $f_a^0 \simeq 6.28$, i.e., $(g_{33}^{(2)}(M_{Pl}))^2 = 1/f_a^0 \simeq 0.16$, the gauge couplings of $[SU(3)]^3$
blow up near the scale $\Lambda_{SQCD} \sim 10^{-3}$ eV, inducing a suitable value of the vacuum energy density.
In the physical vacuum the gauge couplings $g_{33}^{(1)}(M_{Pl})$ differ from $g_{33}^{(2)}(M_{Pl})$, because the
bi-fundamental multiplets acquire non-zero vacuum expectation values. If the second term in
Eq. (31) takes the value $(-4.9)$ in the physical vacuum, the measured value of $\alpha_3^{(1)}(M_Z)$ is reproduced
using Eq. (26). This value can be obtained with all the parameters $f_a^0$ and $f^a_{ij}$ of the
same order of magnitude, provided that $\Phi_0 \sim M_{Pl}$.
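The numbers in this paragraph can be checked with a rough one-loop estimate. The sketch below rests on assumptions that are not spelled out in the text: each SU(3) factor is taken to have $N_c = 3$ and $N_f = 2$ light flavours in the second vacuum, the one-loop coefficient is taken as $b = N_f - 3N_c$ (a normalisation that reproduces $b_3 = -3$ for $N_f = 6$), the Landau pole formula has the same form as Eq. (23), and the reduced Planck mass is used for $M_{Pl}$. All function and variable names are hypothetical.

```python
import math

M_PL_EV = 2.4e27  # assumed: reduced Planck mass expressed in eV


def lambda_sqcd_ev(g33, scale_ev, n_c=3, n_f=2):
    """One-loop Landau pole below `scale_ev` for an SU(n_c) factor with n_f flavours."""
    alpha = g33**2 / (4.0 * math.pi)
    b = n_f - 3 * n_c  # assumed normalisation: N_f = 6 gives b_3 = -3
    return scale_ev * math.exp(2.0 * math.pi / (b * alpha))


f_a0 = 6.28                          # constant term of Eq. (31)
g33_second = 1.0 / math.sqrt(f_a0)   # coupling in the second vacuum at M_Pl
lam = lambda_sqcd_ev(g33_second, M_PL_EV)

g33_physical = 0.85                  # value read off Fig. 3 for the physical vacuum
second_term = 1.0 / g33_physical**2 - f_a0

print(f"g33 in the second vacuum: {g33_second:.2f}")
print(f"Lambda_SQCD ~ {lam:.1e} eV")
print(f"second term of Eq. (31) in the physical vacuum: {second_term:.1f}")
```

Under these assumptions a coupling of about 0.4 at the Planck scale places the strong coupling scale within a factor of a few of $10^{-3}$ eV, and the shift needed to reach $g_{33}^{(1)}(M_{Pl}) \simeq 0.85$ is close to $-4.9$, in line with the values quoted in the text.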
In the case when $\Phi_0 \sim M_{Pl}$ the gauge symmetry, global and local supersymmetry are all broken just below the Planck scale in the physical vacuum. As can be seen from Fig. 3, the $[SU(3)]^3$
gauge couplings then take the value $g_{33}^{(1)}(M_{Pl}) \simeq 0.85$. This is consistent with the critical value
of the gauge coupling constant obtained from lattice calculations [30], for which three phases of
the regularized SU(3) gauge theory coexist, i.e., for which the corresponding vacuum states have
the same energy density in agreement with our MPP philosophy. Similar results were obtained
for the $[SU(2)]^3$ and $[U(1)]^3$ gauge couplings in the family replicated gauge group model [2,29],
using the measured values of $\alpha_2(M_Z)$ and $\alpha_1(M_Z)$ as inputs. We note that a phenomenologically
successful structure for the quark and lepton mass matrices can be naturally generated from the
chiral gauge charges in the family replicated gauge group model [31].
6. Summary and concluding remarks
In supergravity the cosmological constant problem can be alleviated by imposing an extra
global symmetry. In particular the invariance under imaginary translations and dilatations, which
are subgroups of SU(N, 1), leads to the vanishing of the vacuum energy density in the no-scale
SUGRA models. At the same time these symmetries, which naturally arise in theories with
extended supersymmetry ($\mathcal{N} \ge 5$), preserve local supersymmetry which must however be broken in any phenomenologically acceptable theory. We have argued that the breakdown of these
global symmetries protecting the cosmological constant does not necessarily result in a non-zero
vacuum energy density. In particular, violation of dilatation invariance in the superpotential of
no-scale models may give rise to the spontaneous breakdown of local supersymmetry, and still
preserve a zero value for the energy density in the vacua of these models.
All global minima of the SUGRA scalar potential (2) in the no-scale models, where the invariance with respect to dilatations is spoiled in the superpotential, are degenerate. Normally the
set of global minima in the considered class of models includes vacua with broken and unbroken
local supersymmetry. In the vacua where local supersymmetry remains intact, the gravitino mass
goes to zero and the conditions (17) are fulfilled automatically. According to our MPP scenario
the SUGRA scalar potential must possess at least two degenerate vacua in which $m_{3/2} = 0$ and
$m_{3/2} \neq 0$, respectively. In one of them, where $m_{3/2}$ has a non-zero value, local supersymmetry
is broken in the hidden sector at the high energy scale ($\sim 10^{10}$-$10^{12}$ GeV), inducing a set of soft
SUSY breaking terms for the observable fields. In the other vacuum ($m_{3/2} = 0$) the low energy
limit of the considered theory is described by a pure supersymmetric model in flat Minkowski
space. The energy density and all auxiliary fields $F^M$ of the hidden sector vanish in this second
vacuum preserving supersymmetry.
Although the breakdown of dilatation invariance in the superpotential of no-scale SUGRA
models ensures the degeneracy of vacua, where $m_{3/2} = 0$ and $m_{3/2} \neq 0$ respectively, the particle
spectrum remains supersymmetric at low energies in all vacua. Thereby none of these vacua can
be the physical one. Nevertheless a minimal SUGRA model has been constructed, where our
MPP scenario is realized without any extra fine-tuning. It is based on broken SU(N, 1) symmetry. The hidden sector of the minimal MPP inspired SUGRA model contains two superfields T
and z, which transform differently under imaginary translations and dilatations. We allowed the
breakdown of dilatation invariance in the superpotential of the hidden sector and in the part of
the Kähler potential which contains the observable superfields. The SU(N, 1) structure of the
Kähler potential of the hidden sector guarantees the vanishing of the cosmological constant in
all the global minima of the scalar potential in the model. Owing to the breakdown of dilatation
invariance in the hidden sector superpotential, a set of degenerate vacua with broken and unbroken local supersymmetry emerges. Meanwhile we maintain dilatation invariance in the observable
sector superpotential, preventing the appearance of bilinear and high order terms involving observable superfields in the rest of the superpotential and thereby eliminating the $\mu$-problem.
Finally, due to a suitable breakdown of dilatation invariance in the Kähler potential of the observable sector, effective $\mu$-terms and a set of soft SUSY breaking terms are generated in the
vacua where local supersymmetry is spontaneously broken.
In spite of the vanishing of the vacuum energy density in all global minima of the tree level
scalar potential of the MPP inspired SUGRA models, the value of the cosmological constant may
differ from zero. This occurs if non-perturbative effects in the observable sector give rise to the
breakdown of supersymmetry in the second vacuum (phase). Our MPP philosophy then requires
that the phase in which local supersymmetry is broken in the hidden sector has the same energy
density as a phase where supersymmetry breakdown takes place in the observable sector. If the
gauge couplings at high energies are identical in both vacua, the value of the energy density in
the second vacuum can be estimated relatively easily. It is positive definite and determined by the
scale where the SU(3)C gauge interactions become strong. The numerical analysis carried out
in the framework of the pure MSSM has revealed that the corresponding scale is naturally low
($\Lambda_{SQCD} \simeq 10^{-25} M_{Pl}$) for a reasonable choice of the supersymmetry breaking scale, $M_S \sim 1$ TeV,
in the first (physical) vacuum. Moreover the introduction of an extra pair of $5 + \bar 5$ multiplets
reduces this scale further, so that the energy density of the second phase approaches the
observed value of the cosmological constant even when $M_S \simeq 1$ TeV. The crucial idea is then
to use MPP to transfer the energy density or cosmological constant from this second vacuum
into all other vacua, especially into the physical one in which we live. In such a way we have
suggested an explanation of why the observed value of the cosmological constant is positive and
takes on the tiny value it has. The MPP scenario with additional $5 + \bar 5$ multiplets of matter and
supersymmetry breaking scale in the TeV range can be tested at the LHC or ILC.
The trouble with the MPP prediction for the value of the cosmological constant is that it is not
clear if the required dynamical supersymmetry breaking actually takes place in the framework
of the simplest SUSY extensions of the SM, which describe the observable sector of SUGRA
models at low energies. On the other hand, the dynamical breakdown of supersymmetry can be
attained in SUSY models with an extended gauge sector for the strong interactions similar to
that in the family replicated gauge group model [2,29,31]. But, in order to obtain the appropriate
value of the cosmological constant in this case, the gauge couplings in the first and second vacua
should differ considerably. Therefore one has to admit a dependence of the gauge kinetic function
on the chiral superfields, which are responsible for the breaking of the enlarged gauge symmetry
down to $SU(3)_C \times SU(2)_W \times U(1)_Y$. Then, if local supersymmetry and the extended gauge
symmetry are broken near the Planck scale, the gauge couplings in the second vacuum can be
smaller than in the physical one by a factor of 2, which allows us to reproduce the observed value
of the cosmological constant. In the first vacuum where we live the SM is valid up to the Planck
scale. It has recently been pointed out that the enormous hierarchy between the electroweak and
Planck scales might also be explained by MPP [29,32] in the SM.
Although MPP provides an attractive explanation for the smallness and sign of the cosmological constant in (N = 1) supergravity, we have not been able to present a fully self-consistent
model. The no-scale models discussed above possess one defect. Namely, the mechanism for the
stabilization of the vacuum expectation value of the hidden sector field T and the SUSY breaking
scale remains unclear.
Acknowledgements
The authors are grateful to O. Kancheli, S. King and D. Sutherland for fruitful discussions.
R.N. would like to acknowledge support from the PPARC grant PPA/G/S/2003/00096. R.N.
was also partly supported by a Grant of the President of Russia for young scientists (MK-3702.2004.2). The work of C.F. was supported by PPARC and the Niels Bohr Institute Fund.
C.F. would like to acknowledge the hospitality of the Niels Bohr Institute, while part of this
work was done.
Appendix A
Here we discuss the structure of the soft SUSY breaking terms which appear in the physical
vacuum in a low energy effective Lagrangian of the MPP inspired SUGRA model with superpotential (16) and Kähler potential given by Eq. (18). In order to compute the effective scalar
potential, one has to substitute vacuum expectation values for $T$ and $z$ as well as for their auxiliary fields (3), taking into account that only $F^T$ acquires a non-zero vacuum expectation value.
Then one expands the full SUGRA scalar potential (1) in powers of observable fields, taking the
flat limit [18] where $M_{Pl} \to \infty$ but $m_{3/2}$ is kept fixed. In the considered limit hidden sector superfields are decoupled from the low-energy theory. The only signal they produce is a set of terms
that break the global supersymmetry of the low-energy effective Lagrangian of the observable
sector in a soft way [19,33], i.e., without inducing quadratic divergences. All non-renormalizable
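terms vanish in the flat limit since they are suppressed by inverse powers of $M_{Pl}$. Thus one is left
with a global SUSY scalar potential $V_{SUSY}$ plus a set of soft SUSY breaking terms $V_{soft}$, i.e.,
$$
\begin{aligned}
V_{eff}(y_\alpha, y_\alpha^*) &= V_{SUSY} + V_{soft}, \\
V_{SUSY} &= \sum_\alpha \left| \frac{\partial W_{eff}(y_\beta)}{\partial y_\alpha} \right|^2 + \frac{1}{2} \sum_a D_a^2, \\
V_{soft} &= \sum_\alpha m_\alpha^2 |y_\alpha|^2 + \left[ \frac{1}{2} \sum_{\alpha,\beta} B_{\alpha\beta}\, \mu_{\alpha\beta}\, y_\alpha y_\beta + \frac{1}{6} \sum_{\alpha,\beta,\gamma} A_{\alpha\beta\gamma}\, h_{\alpha\beta\gamma}\, y_\alpha y_\beta y_\gamma + \text{h.c.} \right].
\end{aligned}
\tag{A.1}
$$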
x =
C = 1 +
.
y = C ,
x
3
(A.2)
When $M_{Pl} \to \infty$ the effective superpotential, which describes the interactions of observable
superfields at low energies, only contains bilinear and trilinear terms
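$$
W_{eff} = \sum_{\alpha,\beta} \frac{\mu_{\alpha\beta}}{2}\, y_\alpha y_\beta + \sum_{\alpha,\beta,\gamma} \frac{h_{\alpha\beta\gamma}}{6}\, y_\alpha y_\beta y_\gamma, \qquad
\mu_{\alpha\beta} = m_{3/2}\, \eta_{\alpha\beta}\, (C_\alpha C_\beta)^{-1}, \qquad
h_{\alpha\beta\gamma} = \frac{Y_{\alpha\beta\gamma}\, (C_\alpha C_\beta C_\gamma)^{-1}}{\langle (T + \bar{T} - |z|^2)^{3/2} \rangle}.
\tag{A.3}
$$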
The complete set of soft SUSY breaking terms involves: gaugino masses $M_a$, masses of scalar
components of observable superfields $m_\alpha$, trilinear $A_{\alpha\beta\gamma}$ and bilinear $B_{\alpha\beta}$ scalar couplings associated with Yukawa couplings and $\mu$-terms in the superpotential [34]. Three types of soft SUSY
breaking parameters $m_\alpha^2$, $A_{\alpha\beta\gamma}$ and $B_{\alpha\beta}$ appear in the scalar potential (A.1). In the vacua, where
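local SUSY is broken and the gravitino gains a non-zero mass $m_{3/2}$, these parameters are given
by⁷
$$
m_\alpha = m_{3/2}\, \frac{x_\alpha}{(1 + x_\alpha)}, \qquad
B_{\alpha\beta} = m_\alpha + m_\beta, \qquad
A_{\alpha\beta\gamma} = m_\alpha + m_\beta + m_\gamma.
\tag{A.4}
$$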
The structure of the soft SUSY breaking terms given above permits us to rewrite the effective
scalar potential (A.1) in a more compact form (21). It is worth emphasizing that the expressions
for the soft SUSY breaking parameters obtained above would not change if the hidden sector
of our model had many superfields $z_i$. The soft scalar masses $m_\alpha$ in the low energy effective
Lagrangian maintain the splitting between bosons and fermions within one supermultiplet. According to Eq. (A.4), the masses of the superpartners of the ordinary quarks and leptons are set
by the parameter $\xi_\alpha$ and the vacuum expectation value of the superpotential of the hidden sector
(or $\kappa$), which spoil the dilatation invariance. In other words the qualitative pattern of the sparticle
spectrum in the considered SUGRA model depends on the extent to which the symmetry protecting the cosmological constant is broken. Assuming that $\xi_\alpha$, $\zeta_\alpha$, $\mu_0$ and $\langle T \rangle$ are all of order unity,
the phenomenologically acceptable value of the supersymmetry breaking scale $M_S \sim 1$ TeV can
only be obtained for extremely small values of $\kappa \simeq 10^{-15}$.
Explicit expressions for the gaugino masses are not included in Eq. (A.4) because their values
are determined by the gauge kinetic functions $f_a(T, z)$, which have not been specified yet. A canonical choice for the kinetic function in minimal supergravity, $f_a(T, z) = \text{const}$, corresponds to
$M_a = 0$. In order to avoid a conflict with chargino and gluino searches at present and former
colliders, we need gaugino masses in the few hundred GeV range. Therefore we assume a mild
dependence of $f_a(T, z)$ on the hidden sector fields, which is strong enough to induce appreciable gaugino masses but weak enough to ensure that the gauge couplings in the physical and
supersymmetric vacua do not differ significantly.
References
[1] A.G. Riess, et al., Astron. J. 116 (1998) 1009;
S. Perlmutter, et al., Astrophys. J. 517 (1999) 565;
C. Bennett, et al., Astrophys. J. Suppl. 148 (2003) 1;
D. Spergel, et al., Astrophys. J. Suppl. 148 (2003) 175.
[2] D.L. Bennett, H.B. Nielsen, Int. J. Mod. Phys. A 9 (1994) 5155;
D.L. Bennett, C.D. Froggatt, H.B. Nielsen, in: Proceedings of the 27th International Conference on High Energy
Physics, Glasgow, Scotland, 1994, p. 557;
D.L. Bennett, C.D. Froggatt, H.B. Nielsen, in: D. Klabucar, I. Picek, D. Tadic (Eds.), Perspectives in Particle
Physics 94, World Scientific, Singapore, 1995, p. 255, hep-ph/9504294.
[3] C. Froggatt, L. Laperashvili, R. Nevzorov, H.B. Nielsen, Phys. At. Nucl. 67 (2004) 582.
[4] A.B. Lahanas, D.V. Nanopoulos, Phys. Rep. 145 (1987) 1.
[5] H.P. Nilles, Phys. Rep. 110 (1984) 1.
[6] E. Cremmer, S. Ferrara, L. Girardello, A. Van Proeyen, Phys. Lett. B 116 (1982) 231;
E. Cremmer, S. Ferrara, L. Girardello, A. Van Proeyen, Nucl. Phys. B 212 (1983) 413.
[7] S. Deser, B. Zumino, Phys. Rev. Lett. 38 (1977) 1433;
E. Cremmer, B. Julia, J. Scherk, P. van Nieuwenhuizen, S. Ferrara, L. Girardello, Phys. Lett. B 79 (1978) 231;
E. Cremmer, B. Julia, J. Scherk, P. van Nieuwenhuizen, S. Ferrara, L. Girardello, Nucl. Phys. B 147 (1979) 105.
7 In the most general case a complete set of expressions for the soft SUSY breaking parameters can be found in [35,36].
Universidad Carlos III de Madrid, Avda. de la Universidad, 30, 28911 Leganés, Spain
Received 2 November 2005; received in revised form 26 January 2006; accepted 14 February 2006
Available online 20 March 2006
Abstract
We study the phase diagram of Q-state Potts models, for $Q = 4\cos^2(\pi/p)$ a Beraha number ($p > 2$ integer), in the complex-temperature plane. The models are defined on $L \times N$ strips of the square or triangular
lattice, with boundary conditions on the Potts spins that are periodic in the longitudinal ($N$) direction and
free or fixed in the transverse ($L$) direction. The relevant partition functions can then be computed as sums
over partition functions of an $A_{p-1}$ type RSOS model, thus making contact with the theory of quantum
groups. We compute the accumulation sets, as $N \to \infty$, of partition function zeros for $p = 4, 5, 6$, and
$L = 2, 3, 4$ and study selected features for $p > 6$ and/or $L > 4$. This information enables us to formulate
several conjectures about the thermodynamic limit, $L \to \infty$, of these accumulation sets. The resulting phase
diagrams are quite different from those of the generic case (irrational p). For free transverse boundary conditions, the partition function zeros are found to be dense in large parts of the complex plane, even for the
Ising model ($p = 4$). We show how this feature is modified by taking fixed transverse boundary conditions.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Potts model; RSOS model; Beraha number; Limiting curve; Quantum groups
* Corresponding author.
1. Introduction
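The Q-state Potts model [1,2] can be defined for general Q by using the Fortuin–Kasteleyn
(FK) representation [3,4]. The partition function $Z_G(Q; v)$ is a polynomial in the variables $Q$
and $v$. This latter variable is related to the Potts model coupling constant $J$ as
$$v = e^{J} - 1. \tag{1.1}$$
The temperature parameter $x$ and the parameter $p$ used below are defined by
$$x = \frac{v}{\sqrt{Q}}, \tag{1.2}$$
$$Q = 4\cos^2\left(\frac{\pi}{p}\right). \tag{1.3}$$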
For generic¹ values of Q, the main features of the phase diagram of the Potts model in the real
$(Q, x)$-plane have been known for many years [2,5]. It contains in particular a curve $x_{FM}(Q) > 0$
of ferromagnetic phase transitions which are second-order in the range $0 < Q \le 4$, the thermal
operator being relevant. The analytic continuation of the curve $x_{FM}(Q)$ into the antiferromagnetic
regime yields a second critical curve $x_{BK}(Q) < 0$ with $0 < Q < 4$ along which the thermal
operator is irrelevant. Therefore, for a fixed value of Q, the critical point $x_{BK}(Q)$ acts as the
renormalization group (RG) attractor of a finite range of x values: this is the Berker–Kadanoff
(BK) phase [6,7].
The generic phase diagram is shown in Fig. 1. Since the infinite-temperature limit ($x = 0$)
and the zero-temperature ferromagnet ($|x| = \infty$) are of course RG attractive, consistency of the
phase diagram requires that the BK phase be separated from these by a pair of RG repulsive
curves $x_\pm(Q) < 0$. The curve $x_+(Q)$ is expected to correspond to the antiferromagnetic (AF)
phase transition of the model [8].
The above scenario thus essentially relies on the RG attractive nature of the curve $x_{BK}(Q)$, and
since this can be derived [5] from very general Coulomb gas considerations, the whole picture
should hold for any two-dimensional lattice. But it remains of course of great interest to compute
the exact functional forms of the curves $x_{FM}(Q)$, $x_{BK}(Q)$, and $x_\pm(Q)$, and the corresponding
free energies, for specific lattices.
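The square-lattice Potts model is the best understood case. Here, Baxter [2,9] has found the
exact free energy along several curves $x = x_c(Q)$:
$$
x_c(Q) =
\begin{cases}
+1 & \text{(FM)}, \\
-\dfrac{2}{\sqrt{Q}} + \sqrt{\dfrac{4 - Q}{Q}} & \text{(AF)}, \\
-1 & \text{(BK)}, \\
-\dfrac{2}{\sqrt{Q}} - \sqrt{\dfrac{4 - Q}{Q}} & \text{(AF)},
\end{cases}
\tag{1.4}
$$
where $x_c = 1$ and $x_c = -1$ can be identified respectively with $x_{FM}(Q)$ and $x_{BK}(Q)$. The curves
$x_\pm = -2/\sqrt{Q} \pm \sqrt{(4 - Q)/Q}$ are mutually dual (and hence equivalent) curves of AF phase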
1 More precisely, a generic value of Q corresponds to an irrational value of the parameter p defined in Eq. (1.3).
This point will be made more precise in Section 2 below.
Fig. 1. Generic phase diagram for the two-dimensional Potts model in the $(Q, v)$-plane. The solid black curve in the
ferromagnetic ($v > 0$) region shows the standard ferromagnetic phase transition curve $v_{FM}(Q)$, and the blue dashed
curve is its analytic continuation $v_{BK}(Q)$ into the antiferromagnetic region. This latter curve acts as an RG attractor
for the Berker–Kadanoff phase (the orange hatched region). This is separated from the limit of infinite temperature
(red dashed curve) by the antiferromagnetic phase-transition curve $v_+(Q)$ (solid black curve in the $v < 0$ region), and
from the $v \to -\infty$ limit by its counterpart $v_-(Q)$ (dot-dashed blue curve). The red horizontal dotted curve represents
the zero-temperature antiferromagnet ($v = -1$). The pink vertical lines show the Beraha numbers $Q = 4\cos^2(\pi/p)$
($p = 2, 3, \ldots$): the phase diagram on these lines is different from the generic one shown here and forms the object of the
present article. Note that the exact functional forms of the curves $v_{FM}(Q)$, $v_{BK}(Q)$, and $v_\pm(Q)$ are lattice-dependent;
the figure shows their explicit forms for the square-lattice model. (For interpretation of the references to colour in this
figure legend, the reader is referred to the web version of this article.)
transitions, which are again second-order in the range $0 < Q \le 4$. These curves also form the
boundaries of the x-values controlled by the BK fixed point [7], as outlined above. Note that the
four points $x_c(Q)$ in Eq. (1.4) correspond to the points where the circles
$$|x| = 1, \tag{1.5a}$$
$$\left| x + \frac{2}{\sqrt{Q}} \right| = \sqrt{\frac{4 - Q}{Q}} \tag{1.5b}$$
cross the real x-axis. These two circles intersect at the points
$$x = -e^{\pm i\pi/p}, \tag{1.6}$$
which will be shown below to play a particular role in the phase diagram (see Conjecture 4.1.1).
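The geometry just described can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and it assumes the standard square-lattice duality acts as x → 1/x in this normalization. It evaluates the four real points of Eq. (1.4) and verifies that the points of Eq. (1.6) lie on both circles (1.5a) and (1.5b).

```python
import cmath
import math

def square_lattice_curves(Q):
    """Four real values x_c(Q) of Eq. (1.4), for 0 < Q <= 4 (hypothetical helper)."""
    centre = -2.0 / math.sqrt(Q)        # centre of the circle (1.5b)
    radius = math.sqrt((4.0 - Q) / Q)   # radius of the circle (1.5b)
    return {"FM": 1.0, "AF+": centre + radius, "BK": -1.0, "AF-": centre - radius}

def circle_intersections(p):
    """The two points x = -exp(+-i*pi/p) of Eq. (1.6)."""
    return [-cmath.exp(1j * math.pi / p), -cmath.exp(-1j * math.pi / p)]

for p in (4, 5, 6):
    Q = 4.0 * math.cos(math.pi / p) ** 2
    curves = square_lattice_curves(Q)
    # Assumed duality x -> 1/x: the two AF curves are exchanged, so their product is 1.
    assert math.isclose(curves["AF+"] * curves["AF-"], 1.0)
    for x in circle_intersections(p):
        assert math.isclose(abs(x), 1.0)                          # on circle (1.5a)
        assert math.isclose(abs(x + 2.0 / math.sqrt(Q)),
                            math.sqrt((4.0 - Q) / Q))             # on circle (1.5b)
```

The product check reflects the statement that the two AF curves are mutually dual; the FM and BK points x = ±1 are each self-dual under the same map.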
In the case of a triangular lattice, Baxter and collaborators [10–12] have found the free energy
of the Potts model along the curves
$$\sqrt{Q}\, x^3 + 3x^2 = 1, \tag{1.7a}$$
$$x = -\frac{1}{\sqrt{Q}}. \tag{1.7b}$$
The upper branch of Eq. (1.7a) is identified with the ferromagnetic critical curve $x_{FM}(Q)$. We
have numerical evidence that the middle and lower branches correspond respectively to $x_{BK}(Q)$
and $x_-(Q)$, the lower boundary of the BK phase. The position of $x_+(Q)$, the upper branch of
the BK phase, is at present unknown [13] (but see Ref. [14] for the $Q \to 0$ limit). Along the
line (1.7b) the Potts model reduces to a coloring problem, and the partition function is here
known as the chromatic polynomial. The line (1.7b) belongs to the RG basin of the BK phase for
$0 < Q < 2 + \sqrt{3}$ [15].
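The three branches of the triangular-lattice curve can be located numerically. The sketch below is illustrative only and assumes the cubic reads √Q x³ + 3x² = 1, equivalently v³ + 3v² = Q in the variable v = x√Q. The function name and the bracketing intervals are choices made here, not the authors'. For 0 < Q < 4 the cubic in v changes sign on each of the intervals (0, 2), (−2, 0) and (−3, −2), so plain bisection suffices.

```python
import math

def triangular_branches(Q, tol=1e-14):
    """Real roots x of sqrt(Q)*x**3 + 3*x**2 = 1 for 0 < Q < 4.

    Returned in the order (upper, middle, lower) branch.
    """
    def f(v):
        # Same curve written in the variable v = x * sqrt(Q).
        return v ** 3 + 3.0 * v ** 2 - Q

    def bisect(lo, hi):
        flo = f(lo)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            fmid = f(mid)
            if (fmid > 0) == (flo > 0):
                lo, flo = mid, fmid   # root lies in the upper half
            else:
                hi = mid              # root lies in the lower half
        return 0.5 * (lo + hi)

    brackets = [(0.0, 2.0), (-2.0, 0.0), (-3.0, -2.0)]
    return [bisect(lo, hi) / math.sqrt(Q) for lo, hi in brackets]

upper, middle, lower = triangular_branches(2.0)
print(upper, middle, lower)
```

For Q = 2 the cubic factorizes as (v + 1)(v² + 2v − 2), so the middle root is v = −1, i.e. x = −1/√2, which is the value of the line x = −1/√Q at Q = 2.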
The critical properties (still with Q taking generic values) for these two lattices are to a large
extent universal. This is not so surprising, since the critical exponents can largely be obtained by
Coulomb gas techniques (although the antiferromagnetic transition still reserves some challenges
[8]). Thus, there is numerical evidence that the exponents along the curves $x_{FM}(Q)$, $x_{BK}(Q)$ and
$x_-(Q)$ coincide, whereas the evidence for the curve $x_+(Q)$ is non-conclusive [14]. On the other
hand, on the less-studied triangular lattice we cannot yet exclude the possible existence of other
curves of second-order phase transitions that have no counterpart on the square lattice.
But in general we can only expect universality to hold when the Boltzmann weights in the FK
representation are non-negative (i.e., for $Q \ge 0$, $v \ge 0$), or when the parameter p takes generic
(i.e., irrational) values. The present paper aims at studying the situation when p takes non-generic
values; for simplicity we limit ourselves to the case of integer $p > 2$. The number of spin states
is then equal to a so-called Beraha number $B_p$
$$Q = B_p = 4\cos^2\left(\frac{\pi}{p}\right), \qquad p = 3, 4, 5, \ldots. \tag{1.8}$$
The special physics at rational values of p is intimately linked to the representation theory of the
quantum group $U_q(SU(2))$, the commutant of the Temperley–Lieb algebra, when the deformation parameter $q$ is a root of unity. As we shall review in Section 2 below, the quantum group
symmetry of the Potts model at rational p implies that many eigenvalues of the transfer matrix
in the FK representation have zero amplitude or cancel in pairs because of opposite amplitudes;
these eigenvalues therefore become spurious and do not contribute to the partition function [6,7].
Remarkably, for p integer and x inside the BK phase, even the leading eigenvalue acquires zero amplitude. Moreover, all the eigenvalues which scale like the leading one in the
thermodynamic limit vanish from the partition function, and so, even the bulk free energy
f (p; x) is modified [8]. In other words, f (p; x) experiences a singularity whenever p passes
through an integer value. This means in particular that for p integer the critical behavior can
either disappear, or be modified, or new critical points (and other non-critical fixed points) can
emerge.
For the sake of clarity, we discuss the simplest example of this phenomenon. Consider, on the
square lattice, on one hand the $Q \to 2$ state model (i.e., with Q tending to 2 through irrational
values of p) and on the other the $Q = 2$ Ising model (i.e., with fixed integer $p = 4$). For the
former case, the generic phase diagram and the associated RG flows are shown in the top part of
Fig. 2. The three critical points $x_{FM}$ and $x_\pm$ have central charge $c = 1/2$, while the fourth one
$x_{BK}$ has $c = -25/2$. For the latter case, new non-critical fixed points appear (by applying the
duality and $Z_2$ gauge symmetries to the one at $x = 0$), and the RG flows become as shown in the
bottom part of Fig. 2. One now has $c = 1/2$ for all four critical fixed points. (We shall treat the
Ising model in more detail in Section 7.1 below.)
Fig. 2. Phase diagram and RG flows for the $Q \to 2$ state model (top) and the $Q = 2$ Ising model (bottom), on the real
x-axis. Filled (respectively, empty) circles correspond to critical (respectively, non-critical) fixed points.
By contrast to the universality brought out for generic Q, the phase diagram and critical
behavior for integer p are likely to have lattice-dependent features. Let us give a couple of examples of this non-universality. The zero-temperature triangular-lattice Ising antiferromagnet,
$(Q, v) = (2, -1)$, is critical and becomes in the scaling limit a free Gaussian field with central
charge $c = 1$ [16–18], whereas the corresponding square-lattice model is non-critical, its partition
function being trivially $Z = 2$. While this observation does not in itself imply non-universality,
since the critical temperature is expected to be lattice dependent (as is the value of $x_{FM}(Q)$), the
point to be noticed is that for no value of $v$ does the $Q = 2$ square-lattice model exhibit $c = 1$
critical behavior. In the same vein, the square-lattice Potts model with $(Q, v) = (3, -1)$ is equivalent to a critical six-vertex model (at $\Delta = 1/2$) [19,20], with again $c = 1$ in the scaling limit,
whereas now the corresponding triangular-lattice model is trivial ($Z = 3$). Now, the triangular-lattice model does in fact exhibit $c = 1$ behavior elsewhere (for $x = x_-$), but the compactification
radius is different from that of the square-lattice theory and accordingly the critical exponents
differ. Finally, $(Q, v) = (4, -1)$ is a critical $c = 2$ theory on the triangular lattice [21,22], but is
non-critical on the square lattice [23].
Because of the eigenvalue cancellation scenario sketched above, the FK representation is not
well suited² for studying the Potts model at integer p. Fortunately, for $Q = B_p$ there exists
another representation of the Potts model, in terms of an RSOS model of the $A_{p-1}$ type [24],
in which the cancellation phenomenon is explicitly built-in, in the sense that for generic values
of x all the RSOS eigenvalues contribute to the partition function. On the square lattice, the
RSOS model has been studied in great detail [24–27] at the point $x = x_{FM} = 1$, where the model
happens to be homogeneous. Only very recently has the case of general real $x \neq 1$ (where the
RSOS model is staggered, i.e., its Boltzmann weights are sublattice dependent) attracted some
attention [8], and no previous investigation of other lattices (such as the triangular lattice included
in the present study) appears to exist.
2 We here tacitly assume that the study relies on a transfer matrix formulation. This is indeed so in most approaches
that we know of, whether they be analytical or numerical. An exception would be numerical simulations of the Monte
Carlo type, but in the most interesting parts of the phase diagram this approach would probably not be possible anyway,
due to the presence of negative Boltzmann weights.
The very existence of the RSOS representation has profound links [27,28] to the representation theory of the quantum group $U_q(SU(2))$ where the deformation parameter $q$ defined by
$$\sqrt{Q} = q + q^{-1} = \sqrt{B_p}, \qquad q = \exp(i\pi/p), \tag{1.9}$$
is a root of unity. To ensure the quantum group invariance one needs to impose periodic boundary
conditions along the transfer direction. Further, to ensure the exact equivalence between Potts and
RSOS model partition functions the transverse boundary conditions must be non-periodic.³ For
definiteness we shall therefore study square- or triangular-lattice strips of size $L \times N$ spins, with
periodic boundary conditions in the $N$-direction. The boundary conditions in the $L$-direction are
initially taken as free, but we shall later consider fixed transverse boundary conditions as well.
For simplicity we shall henceforth refer to these boundary conditions as free cyclic and fixed
cyclic.⁴
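As a quick numerical illustration of the relation between the Beraha numbers and the deformation parameter, the sketch below (hypothetical function names, standard library only) checks that q = exp(iπ/p) satisfies q^{2p} = 1 and that (q + q^{-1})² reproduces B_p = 4cos²(π/p).

```python
import cmath
import math

def beraha(p):
    """Beraha number B_p = 4 cos^2(pi / p)."""
    return 4.0 * math.cos(math.pi / p) ** 2

def deformation_parameter(p):
    """q = exp(i pi / p), a 2p-th root of unity."""
    return cmath.exp(1j * math.pi / p)

for p in range(3, 9):
    q = deformation_parameter(p)
    loop_weight = q + 1 / q                      # equals 2 cos(pi / p), i.e. sqrt(Q)
    assert abs(loop_weight.imag) < 1e-12
    assert math.isclose(loop_weight.real ** 2, beraha(p))
    assert abs(q ** (2 * p) - 1) < 1e-12         # root of unity
    print(p, beraha(p))
```

The first few values are B_3 = 1, B_4 = 2 (Ising) and B_6 = 3, up to floating-point rounding.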
Using the RSOS representation we here study the phase diagram of the Potts model at $Q = B_p$
through the loci of partition function zeros in the complex x-plane. According to the Beraha–Kahane–Weiss
theorem [30], when $N \to \infty$, the accumulation points of these zeros form either
isolated limiting points (when the amplitude of the dominant eigenvalue vanishes) or continuous
limiting curves $B_L$ (when two or more dominant eigenvalues become equimodular); we refer to
Ref. [32] for further details. In the RSOS representation only the latter scenario is possible, since
all amplitudes are strictly positive.⁵ As usual in such studies, branches of $B_L$ that traverse the
real x-axis for finite L, or pinch it asymptotically in the thermodynamic limit $L \to \infty$, signal
the existence of a phase transition. Moreover, the finite-size effects and the impact angles [33]
give information about the nature of the transition.
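The equimodularity mechanism can be seen in a deliberately simple toy model, which is not one of the transfer matrices studied in this paper. Assume a partition function Z_N(x) = (1 + x)^N + (1 − x)^N, with two eigenvalues λ± = 1 ± x that both have amplitude 1. Its zeros solve ((1 + x)/(1 − x))^N = −1, so they can be written in closed form, and all of them lie on the locus |λ+| = |λ−|, here the imaginary axis.

```python
import cmath
import math

def toy_partition_function(x, N):
    """Toy Z_N(x) = lambda_+^N + lambda_-^N with lambda_pm = 1 +- x (an assumption)."""
    return (1 + x) ** N + (1 - x) ** N

def toy_zeros(N):
    """All finite zeros of the toy Z_N(x), from ((1 + x)/(1 - x))**N = -1."""
    zeros = []
    for k in range(N):
        w = cmath.exp(1j * math.pi * (2 * k + 1) / N)   # an N-th root of -1
        if abs(w + 1) < 1e-12:                           # w = -1 would put the zero at infinity
            continue
        zeros.append((w - 1) / (w + 1))
    return zeros

for N in (8, 32, 128):
    zeros = toy_zeros(N)
    # Relative mismatch between |1 + x| and |1 - x| at the zeros (should be tiny).
    worst = max(abs(abs(1 + z) - abs(1 - z)) / abs(1 + z) for z in zeros)
    print(N, len(zeros), worst)
```

As N grows the zeros fill the equimodular line more and more densely, which is the behavior the theorem describes for the limiting curves.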
The limiting curves $B_L$ constitute the boundaries between the different phases of the model.
Moreover, in the present set-up, each phase can be characterized topologically by the value of
the conserved quantum group spin $S_z$, whose precise definition will be recalled in Section 2
below. (A similar characterization of phases of the chromatic polynomial was recently exploited
in Ref. [34], but in the FK representation.) One may think of $S_z$ as a kind of quantum order
parameter. A naive entropic reasoning would seem to imply that for any real x the ground state
(free energy) has $S_z = 0$, since the corresponding sector of the transfer matrix has the largest
dimension. It is a most remarkable fact that large portions of the phase diagram turn out to have
$S_z \neq 0$.
We have computed the limiting curves $B_L$ in the complex x-plane completely for $p = 4, 5, 6$,
and $L = 2, 3, 4$ for both lattices. Moreover, we have studied selected features thereof
for $p > 6$ and/or $L > 4$. This enables us to formulate several conjectures about the topology
of $B_L$ which are presumably valid for any L, and therefore provide information about the
thermodynamic limit $L \to \infty$. The resulting knowledge is a starting point for gaining a better
understanding of the fixed point structure and renormalization group flows in these Potts models.
3 There are however some intriguing relationships between modified partition functions with fully periodic boundary
conditions [29]. We believe that the RSOS model with such boundary conditions merits a study similar to the one
presented here, independently of its relation to the Potts model.
4 It is convenient to introduce the notation $L_F \times N_P$ (respectively, $L_X \times N_P$) for a strip of size $L \times N$ spins with free
(respectively, fixed) cyclic boundary conditions.
5 Sokal [31, Section 3] has given a slight generalization of the Beraha–Kahane–Weiss theorem. In particular, when
there are two or more equimodular dominant eigenvalues, the set of accumulation points of the partition-function zeros
may include isolated limiting points when all the eigenvalues vanish simultaneously. See Section 3.1.1 for an example of
this possibility.
Our work has been motivated in particular by the following open issues:
(1) As outlined above, the eigenvalue cancellation phenomenon arising from the quantum group
symmetry at integer p modifies the bulk free energy in the Berker–Kadanoff phase. For the
Ising model we have seen that this changes the RG nature (from attractive to repulsive) of
the point $x_{BK}$ as well as its critical exponents (from $c = -25/2$ to $c = 1/2$). But for general
integer p it is not clear whether $x_{BK}$ will remain a phase transition point, and assuming this
to be the case what would be its properties.
(2) The chromatic line $x = -1/\sqrt{Q}$ does not appear to play any particular role in the generic
phase diagram of the square-lattice model. By contrast, it is an integrable line [11,12] for
the generic triangular-lattice model. Qua its role as the zero-temperature antiferromagnet
one could however expect the chromatic line to lead to particular (and possibly critical)
behavior in the RSOS model. Even when critical behavior exists in the generic case (e.g.,
on the triangular lattice) the nature of the transition may change when going to the case of
integer p (e.g., from $c = -25/2$ for the $Q \to 2$ model to $c = 1$ for the zero-temperature
Ising antiferromagnet).
(3) Some features in the antiferromagnetic region might possibly exhibit an extreme dependence
on the boundary conditions, in line with what is known, e.g., for the six-vertex model. It is
thus of interest to study both free and fixed boundary conditions. To give but one example
of what may be expected, we have discovered, rather surprisingly, that with free cyclic
boundary conditions the partition function zeros are actually dense in substantial parts of the
complex plane: this is true even for the simplest case of the square-lattice Ising model.
(4) A recent numerical study [8] of the effective central charge of the RSOS model with periodic
boundary conditions, as a function of x, has revealed the presence of new critical points
inside the BK phase. In particular, strong evidence was given for a physical realization of the
integrable flow [35] from parafermion to minimal models. The question arises what would
be the location of these new points in the phase diagram.
(5) In the generic case, the spin $S_z$ of the ground state may be driven to arbitrarily large values
upon approaching the point $(Q, x) = (4, -1)$ from within the BK phase [7,34]. Is a similar
mechanism at play for integer p?
The paper is organized as follows. In Section 2 we introduce the RSOS models and describe
their precise relationship to the Potts model, largely following Refs. [24,27,28]. We then present,
in Section 3, the limiting curves found for the square-lattice model with free cyclic boundary
conditions, leading to the formulation of several conjectures in Section 4. Sections 5, 6 repeat
this programme for the triangular-lattice model. In Section 7 we discuss the results for free cyclic
boundary conditions, with special emphasis on the thermodynamic limit, and motivate the need to
study also fixed cyclic boundary conditions. This is then done in Sections 8, 9. Finally, Section 10
is devoted to our conclusions. Appendix A gives some technical details on the dimensions of the
transfer matrices used.
2. RSOS representation of the Potts model
The partition function of the two-dimensional Potts model can be written in several equivalent ways, though sometimes with different domains of validity of the relevant parameters
(notably Q). The interplay between these different representations is at the heart of the phenomena we wish to study.
The spin representation for Q integer is well-known. Its low-temperature expansion gives
the FK representation [3,4] discussed in the introduction, where Q is now an arbitrary complex
number. The (interior and exterior) boundaries of the FK clusters, which live on the medial lattice,
yield the equivalent loop representation with weight $Q^{1/2}$ per loop.
An oriented loop representation is obtained by independently assigning an orientation to each
loop, with weight $q$ (respectively, $q^{-1}$) for counterclockwise (respectively, clockwise) loops, cf.
Eq. (1.9). In this representation one can define the spin $S_z$ along the transfer direction (with
parallel/antiparallel loops contributing $\pm 1/2$) which acts as a conserved quantum number. Note
that $S_z = j$ means that there are at least $j$ non-contractible loops, i.e., loops that wind around
the periodic ($N$) direction of the lattice. The weights $q^{\pm 1}$ can be further redistributed locally, as
a factor of $q^{\alpha/2\pi}$ for a counterclockwise turn through an angle $\alpha$ [2]. While this redistribution
correctly weights contractible loops, the non-contractible loops are given weight 2, but this can
be corrected by twisting the model, i.e., by inserting the operator $q^{S_z}$ into the trace that defines
the partition function.
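The local redistribution of the loop weight can be made concrete with a small sketch. It assumes the local factor is q^{α/2π} for a turn through a signed angle α (counterclockwise positive), with q = exp(iπ/p); the function name and the sample turning sequences are illustrative only. A contractible loop turns through a total of ±2π and so collects q^{±1}, while a loop winding around the periodic direction has zero total turning and collects 1 for either orientation.

```python
import cmath
import math

def oriented_loop_weight(p, turning_angles):
    """Product of local factors q**(alpha / 2 pi) over the turns of one oriented loop."""
    q = cmath.exp(1j * math.pi / p)
    weight = 1.0 + 0j
    for alpha in turning_angles:
        weight *= q ** (alpha / (2 * math.pi))
    return weight

p = 5
left_turn, right_turn = math.pi / 2, -math.pi / 2

# Contractible loop: four left turns (counterclockwise) or four right turns (clockwise).
contractible = (oriented_loop_weight(p, [left_turn] * 4)
                + oriented_loop_weight(p, [right_turn] * 4))
# Non-contractible loop: the turns cancel, whatever the orientation.
winding = (oriented_loop_weight(p, [left_turn, right_turn])
           + oriented_loop_weight(p, [right_turn, left_turn]))

print(contractible.real, 2 * math.cos(math.pi / p))   # both equal sqrt(Q)
print(winding.real)                                   # weight 2, hence the need for the twist
```

Summing over the two orientations reproduces the weight Q^{1/2} = q + q^{-1} for contractible loops and the uncorrected weight 2 for non-contractible ones.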
A partial resummation over the oriented-loop splittings at vertices which are compatible with
a given orientation of the edges incident to that vertex now gives a six-vertex model representation [36]. Each edge of the medial lattice then carries an arrow, and these arrows are conserved
at the vertices: the net arrow flux defines $S_z$ as before. The six-vertex model again needs twisting
by the operator $q^{S_z}$ to ensure the correct weighing in the $S_z \neq 0$ sectors. The Hamiltonian of the
corresponding spin chain can be extracted by taking the anisotropic limit, and is useful for studying the model with the Bethe Ansatz technique [2]. The fact that this Hamiltonian commutes
with the generators of the quantum group $U_q(SU(2))$ links up with the nice results of Saleur and
coworkers [6,7,27,28].
Finally, the RSOS representation [24,27,28] emerges from a certain simplification of the above
representations when $q = \exp(i\pi/p)$ is a root of unity (see below).
All these formulations of the Potts model can be conveniently studied through the corresponding transfer matrix spectra: these give access to the limiting curves $B_L$, correlation functions,
critical exponents, etc.
In the FK representation the transfer matrix $T^{(2)}_{FK}(L)$ is written in a basis of connectivities
(set partitions) between two time slices of the lattice (see Ref. [34] for details), and the transfer
matrix propagates just one of the time slices. Each independent connection between the two
slices is called a bridge; the number of bridges $j$ is a semi-conserved quantum number in the
sense that it cannot increase upon action of the transfer matrix. The bridges serve to correctly
weight the clusters that are non-contractible with respect to the cyclic boundary conditions.⁶
This is accomplished by writing the partition function as
$$Z_{FK} = \langle f | T^{(2)}_{FK}(L)^N | i \rangle = \sum_{i \ge 1} \alpha_i \lambda_i^N \tag{2.1}$$
for suitable initial and final vectors $|i\rangle$ and $\langle f|$. The vector $|i\rangle$ identifies the two time slices,
while $\langle f|$ imposes the periodic boundary conditions (it reglues the time slices) and weighs the
resulting non-contractible clusters. Note that these vectors conspire to multiply the contribution
of each eigenvalue $\lambda_i$ by an amplitude $\alpha_i = \alpha_i(Q)$: this amplitude may vanish for certain values
of Q.
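How an eigenvalue can drop out of a sandwich of this form is easiest to see on a toy example. The 2×2 matrix, its entries and the boundary vectors below are hypothetical and unrelated to the actual FK transfer matrix. They only illustrate that the amplitude of an eigenvalue is the product of the overlaps of its eigenvector with the final and initial vectors, and that this product can vanish.

```python
def mat_vec(T, v):
    """Apply the matrix T (list of rows) to the vector v."""
    return [sum(T[r][c] * v[c] for c in range(len(v))) for r in range(len(T))]

def sandwich(f, T, i, N):
    """Compute <f| T^N |i> by applying T to |i> N times."""
    v = list(i)
    for _ in range(N):
        v = mat_vec(T, v)
    return sum(fk * vk for fk, vk in zip(f, v))

a, b = 3.0, 1.0                       # hypothetical toy entries
T = [[a, b], [b, a]]
# (eigenvalue, unnormalised eigenvector) pairs of the symmetric toy matrix
eigen = [(a + b, [1.0, 1.0]), (a - b, [1.0, -1.0])]

def amplitudes(f, i):
    """Pairs (amplitude, eigenvalue) such that <f|T^N|i> = sum amplitude * eigenvalue**N."""
    out = []
    for lam, v in eigen:
        norm2 = sum(x * x for x in v)
        overlap_f = sum(fk * vk for fk, vk in zip(f, v))
        overlap_i = sum(vk * ik for vk, ik in zip(v, i))
        out.append((overlap_f * overlap_i / norm2, lam))
    return out

N = 5
for f, i in (([1.0, 1.0], [1.0, 0.0]), ([1.0, 0.0], [1.0, 0.0])):
    amps = amplitudes(f, i)
    assert abs(sandwich(f, T, i, N) - sum(al * lam ** N for al, lam in amps)) < 1e-9
    print(f, amps)
```

With the final vector (1, 1) the eigenvalue a − b has zero amplitude and does not contribute for any N; with the final vector (1, 0) both eigenvalues contribute.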
6 In particular, the restriction of $T^{(2)}_{FK}(L)$ to the zero-bridge sector is just the usual transfer matrix $T_{FK}$ in the FK
representation, i.e., the matrix used in Ref. [32] to study the case of fully free boundary conditions.
On the other hand, in the six-vertex representation the transfer matrix is written in the purely
local basis of arrows, whence the partition function can be obtained as a trace (which however
has to be twisted by inserting q Sz as described above). But even without the twist the eigenvalues
are still associated with non-trivial amplitudes, as we now review.
Let us consider first a generic value of $q$, i.e., an irrational value of $p$. The $U_q(SU(2))$ symmetry of the spin chain Hamiltonian implies that one can classify eigenvalues according to their
value $j$ of $S_z$, and consider only highest weights of spin $j$. Define now $K_{1,2j+1}(p, L; x)$ as the
generating function of the highest weights of spin $j$, for given values of $p$, $L$ and $x$. The partition function of the untwisted six-vertex model with the spin $S$ (not $S_z$) fixed to $j$ is therefore
$(2j+1) K_{1,2j+1}(p, L; x)$. Imposing the twist, the corresponding contribution to the partition
function of the Potts model becomes $S_j(p) K_{1,2j+1}(p, L; x)$, where the $q$-deformed number
$S_j(p) \equiv (2j+1)_q$ is defined as follows
$$S_j(p) = \frac{\sin(\pi(2j+1)/p)}{\sin(\pi/p)}.$$ (2.2)
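Summing these contributions over $j$, the partition function of the Potts model on the strip reads
$$Z_{L_F \times N_P}(Q; x) = Q^{NL/2} \sum_{j=0}^{L} S_j(p)\, K_{1,2j+1}(p, L; x).$$ (2.3)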
Note that the summation is for $0 \le j \le L$, as the maximum number of bridges is equal to the
strip width $L$.
For p rational, Eq. (2.3) is still correct, but can be considerably simplified. In the context of
this paper we only consider the simplest case of p integer. Indeed, note that using Eq. (2.2), we
obtain that, for any integer n,
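$$S_{(n+1)p-1-j}(p) = -S_j(p),$$ (2.4a)
$$S_{np+j}(p) = S_j(p).$$ (2.4b)
Therefore, Eq. (2.3) can be rewritten as
$$Z_{L_F \times N_P}(Q; x) = Q^{NL/2} \sum_{j=0}^{\lfloor (p-2)/2 \rfloor} S_j(p)\, \chi_{1,2j+1}(p, L; x),$$ (2.5)
where
$$\chi_{1,2j+1}(p, L; x) = \sum_{n \ge 0} \left[ K_{1,2(np+j)+1}(p, L; x) - K_{1,2((n+1)p-1-j)+1}(p, L; x) \right].$$ (2.6)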
For convenience in writing Eq. (2.6) we have defined $K_{1,2j+1}(p, L; x) \equiv 0$ for $j > L$. Note that
the summation in Eq. (2.5) is now for $0 \le j \le \lfloor (p-2)/2 \rfloor$. Furthermore, $\chi_{1,2j+1}(p, L; x)$ is a lot
simpler than it seems. Indeed, when $p$ is integer, the representations of $U_q(SU(2))$ mix different
values of $j$ related precisely by the transformations $j \to j + np$ and $j \to (n+1)p - 1 - j$ [cf.
Eq. (2.4)]. Therefore, a lot of eigenvalues cancel each other in Eq. (2.6). This is exactly why the
transfer matrix in the FK representation contains spurious eigenvalues, and is not adapted to the
case of $p$ integer.
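The behaviour of the amplitudes under the two transformations of $j$ quoted above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are illustrative, and the signs tested (a minus sign under $j \to (n+1)p-1-j$, a plus sign under $j \to j+np$) are the ones that follow directly from the sine in Eq. (2.2).

```python
import math

def S(j, p):
    """q-deformed number S_j(p) = sin(pi (2j+1)/p) / sin(pi/p), as in Eq. (2.2)."""
    return math.sin(math.pi * (2 * j + 1) / p) / math.sin(math.pi / p)

def check_symmetries(p, n_max=3, tol=1e-9):
    """True if S flips sign under j -> (n+1)p-1-j and is invariant under j -> j+np."""
    for n in range(n_max + 1):
        for j in range(p):
            if abs(S((n + 1) * p - 1 - j, p) + S(j, p)) > tol:
                return False
            if abs(S(n * p + j, p) - S(j, p)) > tol:
                return False
    return True

def amplitudes(p):
    """Amplitudes S_j(p) for the sectors 0 < 2j+1 < p (j = 0, ..., floor((p-2)/2))."""
    return [S(j, p) for j in range((p - 2) // 2 + 1)]
```

For instance, `amplitudes(6)` gives the weights of the three sectors that appear for the three-state Potts model, and `amplitudes(5)` gives 1 and the golden ratio $\sqrt{B_5}$.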
The representation adapted to the case of $p$ integer is the so-called RSOS representation. It
can be proved that $\chi_{1,2j+1}$ is the partition function of an RSOS model of the $A_{p-1}$ type [24]
with given boundary conditions [27] (see below). In this model, heights $h_i = 1, 2, \ldots, p-1$ are
defined on the union of vertices and dual vertices of the original Potts spin lattice. Neighboring
heights are restricted to differ by $\pm 1$ (whence the name RSOS = restricted solid-on-solid). The
boundary conditions on the heights are still periodic in the longitudinal direction, but fixed in
the transverse direction. More precisely, the cyclic strip $L_F \times N_P$ has precisely two exterior dual
vertices, whose heights are fixed to $1$ and $2j+1$, respectively. It is convenient to draw the lattice
of heights as in Figs. 3 and 4 (showing respectively a square and a triangular-lattice strip of width
$L = 2$), i.e., with $N$ exterior vertices above the upper rim, and $N$ exterior vertices below the lower
rim of the strip: all these exterior vertices close to a given rim are then meant to be identified.
For a given lattice of spins, the weights of the RSOS model are most easily defined by building
up the height lattice face by face, using a transfer matrix. The transfer matrix adding one face at
position $i$ is denoted $H_i = x I_i + e_i$ (respectively, $V_i = I_i + x e_i$) if it propagates a height $h_i \to h'_i$
standing on a direct (respectively, a dual) vertex, where $I_i = \delta(h_i, h'_i)$ is the identity operator,
and $e_i$ is the Temperley-Lieb generator in the RSOS representation [24]:
$$e_i = \delta(h_{i-1}, h_{i+1}) \, \frac{[\sin(\pi h_i/p) \sin(\pi h'_i/p)]^{1/2}}{\sin(\pi h_{i-1}/p)}.$$ (2.7)
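A quick consistency check on the RSOS generator is that it satisfies the Temperley-Lieb relation $e_i^2 = \sqrt{Q}\, e_i$ with $\sqrt{Q} = 2\cos(\pi/p)$. The sketch below assumes the standard RSOS weights $[S_{h_i} S_{h'_i}]^{1/2}/S_{h_{i-1}}$ with $S_h = \sin(\pi h/p)$; the helper names are hypothetical and only the action on a single middle height is modelled.

```python
import numpy as np

def S(h, p):
    """Height weight S_h = sin(pi h / p) (assumed standard RSOS normalisation)."""
    return np.sin(np.pi * h / p)

def tl_block(a, p):
    """Matrix of e_i acting on the middle height h_i when both neighbours equal a.

    Returns the list of allowed middle heights and the matrix
    E[h, h'] = sqrt(S_h S_h') / S_a on that list.
    """
    hs = [h for h in (a - 1, a + 1) if 1 <= h <= p - 1]
    E = np.array([[np.sqrt(S(h, p) * S(hp, p)) / S(a, p) for hp in hs] for h in hs])
    return hs, E

def satisfies_tl(p):
    """True if e^2 = 2 cos(pi/p) e holds for every neighbour height a = 1, ..., p-1."""
    sqrt_q = 2 * np.cos(np.pi / p)
    for a in range(1, p):
        _, E = tl_block(a, p)
        if not np.allclose(E @ E, sqrt_q * E):
            return False
    return True
```

The relation holds because the allowed middle heights satisfy $S_{a-1} + S_{a+1} = 2\cos(\pi/p)\, S_a$, with $S_0 = S_p = 0$ taking care of the boundary heights.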
Fig. 3. RSOS lattice (solid thick lines) and label convention for the basis in the height space for a square-lattice of width
L = 2 (dashed thinner lines). The thick black arrow shows the transfer direction (to the right).
Fig. 4. RSOS lattice (solid thick lines) and label convention for the basis in the height space for a triangular-lattice of
width L = 2 (dashed thinner lines). The thick black arrow shows the transfer direction (to the right).
Note that all the amplitudes $S_j(p)$ entering in Eq. (2.5) are strictly positive. Therefore, for
a generic value of the temperature $x$, all the eigenvalues associated with $\chi_{1,2j+1}(p, L; x)$ for
$0 < 2j+1 < p$ contribute to the partition function.$^7$ This is the very reason why we use the
RSOS representation. Recall that there are analogous results in conformal field theory [37]. In
fact, for $x$ equal to $x_{\rm FM}(Q)$ and in the continuum limit, $K_{1,2j+1}$ corresponds to the generating
function of a generic representation of the conformal symmetry with Kac-table indices $r = 1$
and $s = 2j+1$, whereas $\chi_{1,2j+1}$ corresponds to the generating function (character) of a minimal
model. Thus, Eq. (2.6) corresponds to the Rocha-Caridi equation [38], which consists of taking
into account the null states. One could say that the FK representation does not identify all the
states differing by null states, whereas the RSOS representation does. Therefore, the dimension
of the transfer matrix is smaller in the RSOS representation than in the FK representation.
The computation of the partition functions $\chi_{1,2j+1}(p, L; x)$ can be done in terms of transfer
matrices $T_{1,2j+1}$, denoted in the following simply by $T_{2j+1}$. In particular, for a strip of size
$L \times N$, we have that
$$\chi_{1,2j+1}(p, L; x) = {\rm tr}\, T_{2j+1}(p, L; x)^N.$$ (2.8)
Note that this is a completely standard untwisted trace. The transfer matrix $T_{2j+1}(L; x)$ acts
on the space spanned by the vectors $|h_0, h_1, \ldots, h_{2L}\rangle$, where the boundary heights $h_0 = 1$ and
$h_{2L} = 2j+1$ are fixed. The dimensionality of this space is discussed in Appendix A. For any
fixed $h_0$ and $h_{2L}$, this dimensionality grows asymptotically like $Q^L$.
Remarks. (1) Our numerical work is based on an automated construction of $T_{2j+1}$. To validate
our computer algorithm, we have verified that Eq. (2.5) is indeed satisfied. More precisely, given
$Q = B_p$, and for fixed $L$ and $N$, we have verified that
$$Z_{N_P \times L_F}(Q; v) = Q^{NL/2} \sum_{0 < 2j+1 < p} S_j(p)\, \chi_{1,2j+1}(p, L; x),$$ (2.9)
where $Z_{N_P \times L_F}(Q; v)$ is the partition function of the $Q$-state Potts model on a strip of size
$N_P \times L_F$ with cylindrical boundary conditions, as computed in Refs. [39,40]. We have made
this check for $p = 4, 5, 6$ and for several values of $L$ and $N$.
7 For exceptional values of $x$ there may still be cancellations between eigenvalues with opposite sign. However, the
pair of eigenvalues that cancel must now necessarily belong to the same sector $\chi_{1,2j+1}$.
(2) For $p = 3$ the RSOS model trivializes. Only the $\chi_{1,1}$ sector exists, and $T_1$ is one-dimensional for all $L$. Eq. (2.5) gives simply
$$Z_{L_F \times N_P}(Q = 1; x) = (1 + x)^E,$$ (2.10)
where $E$ is the number of lattice edges (faces on the height lattice). It is not possible to treat the
bond percolation problem in the RSOS context, since this requires taking $Q \to 1$ as a limit
rather than sitting directly at $Q = 1$. Hence, the right representation for studying bond percolation is
the FK representation.
3. Square-lattice Potts model with free cyclic boundary conditions
3.1. Ising model (p = 4)
The partition function for a strip of size $L_F \times N_P$ is given in the RSOS representation as
$$Z_{L_F \times N_P}(Q = 2; x) = 2^{NL/2} \left[ \chi_{1,1}(x) + \chi_{1,3}(x) \right],$$ (3.1)
where $\chi_{1,2j+1}(x) = {\rm tr}\, T_{2j+1}(p = 4, L; x)^N$. The dimensionality of the transfer matrices can be
obtained from the general formulae derived in Appendix A:
$$\dim T_k(p = 4, L) = 2^{L-1}, \quad k = 1, 3.$$ (3.2)
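The dimension of each sector is simply the number of allowed height sequences $h_0, \ldots, h_{2L}$ with $h_0 = 1$, $h_{2L} = k$, neighbouring heights differing by $\pm 1$, and $1 \le h \le p-1$. The following sketch (function name illustrative, not taken from Appendix A) counts these sequences by dynamic programming and can be compared with the dimensions quoted in the text.

```python
def dim_T(p, L, k):
    """Count height sequences h_0..h_{2L} with h_0 = 1, h_{2L} = k,
    |h_{i+1} - h_i| = 1 and 1 <= h_i <= p - 1 (walks on the A_{p-1} diagram)."""
    counts = {1: 1}                      # number of walks ending at each height
    for _ in range(2 * L):
        nxt = {}
        for h, c in counts.items():
            for g in (h - 1, h + 1):
                if 1 <= g <= p - 1:
                    nxt[g] = nxt.get(g, 0) + c
        counts = nxt
    return counts.get(k, 0)
```

For $p = 4$ the odd positions are forced to height 2 and each of the $L-1$ interior even positions can be 1 or 3, which is the origin of the $2^{L-1}$ count.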
We have computed the limiting curves $\mathcal{B}_L$ for $L = 2, 3, 4$. These curves are displayed in
Fig. 5(a)-(c). In Fig. 5(d), we show simultaneously all three curves for comparison. In addition,
we have computed the partition-function zeros for finite strips of dimensions $L_F \times (\rho L)_P$ for
aspect ratios $\rho = 10, 20, 30$. These zeros are also displayed in Fig. 5(a)-(c).$^8$ For $5 \le L \le 8$,
we have only computed selected features of the corresponding limiting curves (e.g., the phase
diagram for real $x$).
3.1.1. L = 2
This strip is displayed in Fig. 3. Let us denote the basis in the height space as $|h_1, h_2, h_3, h_4, h_5\rangle$,
where the order is given as in Fig. 3.
The transfer matrix $T_1$ is two-dimensional: in the basis $\{|1,2,1,2,1\rangle, |1,2,3,2,1\rangle\}$, it takes
the form
$$T_1(p = 4, L = 2) = \frac{1}{\sqrt{2}} \begin{pmatrix} Y_{2,0} & Y_{2,1} \\ Y_{2,3} & Y_{2,2} \end{pmatrix},$$ (3.3)
8 After the completion of this work, we learned that Chang and Shrock had obtained the limiting curves for L = 2 [41,
Fig. 20] and L = 3 [42, Fig. 7]. The eigenvalues and amplitudes for L = 2 had been previously published by Shrock [43,
Section 6.13].
Fig. 5. Limiting curves for the square-lattice RSOS model with $p = 4$ and several widths: $L = 2$ (a), $L = 3$ (b), and
$L = 4$ (c). For each width $L$, we also show the partition-function zeros for finite strips of dimensions $L_F \times (10L)_P$
(black), $L_F \times (20L)_P$ (red), and $L_F \times (30L)_P$ (brown). Figure (d) shows all these limiting curves together: $L = 2$
(black), $L = 3$ (red), $L = 4$ (green). The solid squares show the values where Baxter found the free energy. A separate symbol
in (a) marks the position of the isolated limiting point that was found. In the regions displayed in gray (respectively, white) the
dominant eigenvalue comes from the sector $\chi_{1,3}$ (respectively, $\chi_{1,1}$). The dark gray circles correspond to (1.5). (For
interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Here we have used the shorthand notation
$$Y_{L,k} = x^k (x + \sqrt{2})^{2L-1-k}, \quad k = 0, \ldots, 2L-1.$$ (3.4)
(3.6)
The dominant sector on the real $x$-axis is always $\chi_{1,1}$, except at $x = -\sqrt{2}$ and $x = -1/\sqrt{2}$;
at these points the dominant eigenvalues coming from each sector $\chi_{1,k}$ become equimodular. On
the regions with null intersection with the real $x$-axis, the dominant eigenvalue comes from the
sector $\chi_{1,3}$.
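The $L = 2$ transfer matrices are small enough to be rebuilt face by face. The sketch below is not the authors' automated construction: it assumes the standard RSOS weights $[S_{h_i} S_{h'_i}]^{1/2}/S_{h_{i-1}}$ with $S_h = \sin(\pi h/p)$ for the Temperley-Lieb generator, the operator order $T = H_2 H_4 V_3$ (the dual face acts first), and a lexicographic ordering of the basis states; all function names are illustrative. Under these assumptions it reproduces a matrix of the form $(1/\sqrt{2})\,[[Y_{2,0}, Y_{2,1}], [Y_{2,3}, Y_{2,2}]]$ with $Y_{2,k} = x^k (x+\sqrt{2})^{3-k}$ for $p = 4$, and it exhibits the equimodularity of the two sectors at $x = -1/\sqrt{2}$.

```python
import numpy as np
from itertools import product

def S(h, p):
    return np.sin(np.pi * h / p)

def basis(p, k):
    """Height states (h1,...,h5) with h1 = 1, h5 = k, steps of +-1, 1 <= h <= p-1."""
    out = []
    for mid in product(range(1, p), repeat=3):
        hs = (1,) + mid + (k,)
        if all(abs(a - b) == 1 for a, b in zip(hs, hs[1:])):
            out.append(hs)
    return out

def face(states, p, i, x, direct):
    """H_i = x I + e_i (direct vertex) or V_i = I + x e_i (dual vertex),
    acting on the height stored at tuple index i."""
    n = len(states)
    M = np.zeros((n, n), dtype=complex)
    for a, s in enumerate(states):
        for b, t in enumerate(states):
            if s[:i] != t[:i] or s[i + 1:] != t[i + 1:]:
                continue                      # only the height at index i may change
            ident = 1.0 if s[i] == t[i] else 0.0
            e = 0.0
            if s[i - 1] == s[i + 1]:          # delta(h_{i-1}, h_{i+1})
                e = np.sqrt(S(s[i], p) * S(t[i], p)) / S(s[i - 1], p)
            M[a, b] = x * ident + e if direct else ident + x * e
    return M

def transfer_matrix(p, k, x):
    """T_k(p, L=2; x) for the square lattice, assuming the order H_2 H_4 V_3."""
    st = basis(p, k)
    H2 = face(st, p, 1, x, True)
    H4 = face(st, p, 3, x, True)
    V3 = face(st, p, 2, x, False)
    return H2 @ H4 @ V3

def leading_modulus(p, k, x):
    """Largest eigenvalue modulus of T_k(p, L=2; x)."""
    return max(abs(np.linalg.eigvals(transfer_matrix(p, k, x))))
```

With these helpers, `leading_modulus(4, 1, x)` and `leading_modulus(4, 3, x)` can be scanned along the real axis to locate the points where the two sectors exchange dominance.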
3.1.2. $L \ge 3$
For $3 \le L \le 8$, we find two phase-transition points on the real $x$-axis:
$$x_{c,1} = -\frac{1}{\sqrt{2}} \approx -0.7071067812,$$ (3.7a)
$$x_{c,2} = -\sqrt{2} \approx -1.4142135624.$$ (3.7b)
Both points are actually multiple points (except $x_{c,2}$ for $L = 3$). There is an additional pair of
complex conjugate multiple points at $x = -e^{\pm i\pi/4}$.
For $x > x_{c,1}$, the dominant eigenvalue always belongs to the sector $\chi_{1,1}$. For $x < x_{c,1}$, this
property is true only for even $L = 4, 6, 8$; for odd $L = 3, 5, 7$, the dominant eigenvalue for
$x < x_{c,1}$ belongs to the $\chi_{1,3}$ sector.
3.2. Q = B5 model (p = 5)
The partition function for a strip of size $L_F \times N_P$ is given in the RSOS representation as
$$Z_{L_F \times N_P}(Q = B_5; x) = B_5^{NL/2} \left[ \chi_{1,1}(x) + \sqrt{B_5}\, \chi_{1,3}(x) \right],$$ (3.8)
where $\chi_{1,2j+1}(x) = {\rm tr}\, T_{2j+1}(p = 5, L; x)^N$.
We have computed the limiting curves $\mathcal{B}_L$ for $L = 2, 3, 4$. These curves are displayed in
Fig. 6(a)-(c). In Fig. 6(d), we show all three curves for comparison. For $L = 5, 6$, we have only
computed selected features of the corresponding limiting curves.
3.2.1. L = 2
The transfer matrix $T_1$ is two-dimensional: in the basis $\{|1,2,1,2,1\rangle, |1,2,3,2,1\rangle\}$, it takes
the form
$$T_1(p = 5, L = 2) = \begin{pmatrix} \sqrt{B_5^*}\, X_3^3 & (B_5^*)^{1/4}\, x X_3^2 \\ (B_5^*)^{1/4}\, x^3 & x^2 (1 + x) \end{pmatrix},$$ (3.9)
where we have used the shorthand notation
$$X_3 = x + \sqrt{B_5}, \qquad X_3^* = x + \sqrt{B_5^*},$$ (3.10)
and
$$B_5 = \frac{3 + \sqrt{5}}{2}, \qquad B_5^* = \frac{3 - \sqrt{5}}{2}.$$ (3.11)
9 Throughout this paper a point on $\mathcal{B}_L$ of order $\ge 4$ is referred to as a multiple point.
Fig. 6. Limiting curves for the square-lattice RSOS model with $p = 5$ and several widths: $L = 2$ (a), $L = 3$ (b), and $L = 4$
(c). Figure (d) shows all these curves together: $L = 2$ (black), $L = 3$ (red), $L = 4$ (green). The solid squares show the
values where Baxter found the free energy. In the regions displayed in light gray (respectively, white) the dominant
eigenvalue comes from the sector $\chi_{1,3}$ (respectively, $\chi_{1,1}$). The dark gray circles correspond to (1.5). (For interpretation
of the references to colour in this figure legend, the reader is referred to the web version of this article.)
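The transfer matrix $T_3$ is three-dimensional. In the basis $\{|1,2,1,2,3\rangle, |1,2,3,4,3\rangle, |1,2,3,2,3\rangle\}$,
it takes the form
$$T_3(p = 5, L = 2) = \begin{pmatrix} \sqrt{B_5^*}\, x X_3^2 & 0 & (B_5^*)^{1/4}\, x^2 X_3 \\ \sqrt{B_5^*}\, x^2 & x X_3^* & (B_5^*)^{1/4}\, x (1 + x) \\ (B_5^*)^{1/4}\, x^2 (1 + x) & (B_5^*)^{1/4}\, x & x (1 + x)^2 \end{pmatrix}.$$ (3.12)
$$x_c = -\frac{\sqrt{B_5}}{2} = -\frac{1 + \sqrt{5}}{4} \approx -0.8090169944.$$ (3.13)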
We have also found that the limiting curve contains a horizontal line between $x = x_{\rm BK} = -1$ and
$x \approx -1.3843760945$. The latter point is a T point, and the former one, a multiple point. There is
an additional pair of complex conjugate multiple points at
$$x = -e^{\pm i\pi/5} = -\frac{1 + \sqrt{5}}{4} \mp \frac{i}{2} (5 B_5^*)^{1/4} \approx -0.8090169944 \mp 0.5877852523\, i.$$ (3.14)
We have found two additional pairs of complex conjugate T points at x 1.5613823329
0.3695426938i, and x 0.9270509831 0.3749352940i. The dominant sectors on the real
x-axis are
$\chi_{1,1}$ for $x \in (-\infty, -1.3843760945) \cup (-0.8090169944, \infty)$,
$\chi_{1,3}$ for $x \in (-1.3843760945, -0.8090169944)$.
3.2.2. $L \ge 3$
For $L = 3$ there are two real phase-transition points at
$x_{c,1} \approx -2.1862990086$, (3.15a)
$x_{c,2} \approx -0.9176152641$. (3.15b)
The limiting curve contains a horizontal line between two real T points $x \approx -1.2066212246$
and $x \approx -0.9713270390$. There are nine additional pairs of complex conjugate T points. The
dominant sectors on the real $x$-axis are
$\chi_{1,1}$ for $x \in (-2.1862990086, -1.2066212246) \cup (-0.9176152641, \infty)$,
$\chi_{1,3}$ for $x \in (-\infty, -2.1862990086) \cup (-1.2066212246, -0.9176152641)$.
For $L = 4$, the real transition points are located at
$x_{c,1} \approx -1.3829734471$, (3.16a)
$x_{c,2} \approx -0.9475070976$. (3.16b)
We have found that the curve B4 contains a horizontal line between two real T points: x
1.1982787848 and x 0.9776507663. Two points belonging to such line are actually mul-
(3.17a)
$x_{c,2} \approx -1.2097913730$, (3.17b)
$x_{c,3} \approx -1.1717714277$, (3.17c)
$x_{c,4} \approx -0.9616402644$. (3.17d)
(3.18a)
$x_{c,2} \approx -1.2712112920$, (3.18b)
$x_{c,3} \approx -1.1323753929$, (3.18c)
$x_{c,4} \approx -1.1052066740$, (3.18d)
$x_{c,5} \approx -0.9700021428$. (3.18e)
The limiting curve contains a horizontal line between two real T points: $x \approx -1.0877465961$ and
$x \approx -0.9792223546$. This line contains the multiple point $x \approx -1.0781213888$. The dominant
sectors on the real $x$-axis are
$\chi_{1,1}$ for $x \in (-\infty, x_{c,1}) \cup (x_{c,2}, -1.0877465961) \cup (-1.0781213888, \infty)$,
$\chi_{1,3}$ for $x \in (x_{c,1}, x_{c,2}) \cup (-1.0877465961, -1.0781213888)$.
In all cases $2 \le L \le 6$, there is a pair of complex conjugate multiple points at $x = -e^{\pm i\pi/5} \approx
-0.8090169944 \mp 0.5877852523\, i$.
3.3. Three-state Potts model (p = 6)
The partition function for a strip of size $L_F \times N_P$ is given in the RSOS representation as
$$Z_{L_F \times N_P}(Q = 3; x) = 3^{NL/2} \left[ \chi_{1,1}(x) + 2 \chi_{1,3}(x) + \chi_{1,5}(x) \right],$$ (3.19)
where $\chi_{1,2j+1}(x) = {\rm tr}\, T_{2j+1}(p = 6, L; x)^N$.
Fig. 7. Limiting curves for the square-lattice RSOS model with $p = 6$ and several widths: $L = 2$ (a), $L = 3$ (b), and
$L = 4$ (c). Figure (d) shows all these curves together: $L = 2$ (black), $L = 3$ (red), $L = 4$ (green). The solid squares
show the values where Baxter found the free energy. In the regions displayed in light gray (respectively, white) the
dominant eigenvalue comes from the sector $\chi_{1,3}$ (respectively, $\chi_{1,1}$). In the regions displayed in a darker gray the
dominant eigenvalue comes from the sector $\chi_{1,5}$. The dark gray circles correspond to (1.5). (For interpretation of the
references to colour in this figure legend, the reader is referred to the web version of this article.)
We have computed the limiting curves $\mathcal{B}_L$ for $L = 2, 3, 4$. These curves are displayed in
Fig. 7(a)-(c).$^{10}$
In Fig. 7(d), we show all three curves for comparison. For $L = 5, 6, 7$ we have only computed
selected features of the corresponding limiting curves.
3.3.1. L = 2
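The transfer matrix $T_5$ is one-dimensional, as there is a single basis vector $\{|1,2,3,4,5\rangle\}$. The
matrix is given by
$$T_5(p = 6, L = 2) = x^2.$$ (3.20)
$$T_1(p = 6, L = 2) = \frac{1}{\sqrt{3}} \begin{pmatrix} X_1^3 & \sqrt{2}\, x X_1^2 \\ \sqrt{2}\, x^3 & x^2 X_2 \end{pmatrix},$$ (3.21)
where we have used the shorthand notation
$$X_1 = x + \sqrt{3}, \qquad X_2 = 2x + \sqrt{3}.$$ (3.22)
$$T_3(p = 6, L = 2) = \frac{1}{2\sqrt{3}} \begin{pmatrix} 2 x X_1^2 & 0 & 2\sqrt{2}\, x^2 X_1 \\ \sqrt{6}\, x^2 & \sqrt{3}\, x X_2 & \sqrt{3}\, x X_2 \\ \sqrt{2}\, x^2 X_2 & 3x & x X_2^2 \end{pmatrix}.$$ (3.23)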
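For real $x$, there are two phase-transition points
$$x_{c,1} = -\sqrt{3} = x_- \approx -1.7320508076,$$ (3.24a)
$$x_{c,2} = -\frac{\sqrt{3}}{2} \approx -0.8660254038.$$ (3.24b)
There are three multiple points at $x = -\sqrt{3}/2$, and $x = -\sqrt{3}/2 \pm i/2 = -e^{\pm i\pi/6}$. The dominant sectors
on the real $x$-axis are
$\chi_{1,1}$ for $x \in (-\infty, -\sqrt{3}) \cup (-\sqrt{3}/2, \infty)$,
$\chi_{1,5}$ for $x \in (-\sqrt{3}, -\sqrt{3}/2)$.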
On the regions with null intersection with the real x-axis, the dominant eigenvalue comes from
the sector 1,3 .
10 After the completion of this work, we learned that the limiting curves for the smallest widths had been already
obtained by Chang and Shrock: namely, $L = 2$ [42, Fig. 22], and $L = 3$ [42, Fig. 8]. Please note that in the latter case,
they used the variable $u = 1/(v + 1) = 1/(x\sqrt{Q} + 1)$, instead of our variable $x$.
3.3.2. $L \ge 3$
For $L = 3$, there are three real phase-transition points
$x_{c,1} \approx -1.9904900679$, (3.25a)
$x_{c,2} = -\sqrt{3} = x_- \approx -1.7320508076$, (3.25b)
$x_{c,3} = -\sqrt{3}/2 \approx -0.8660254038$. (3.25c)
The limiting curve contains a small horizontal segment running from $x \approx -1.0539518478$ to $x =
x_{\rm BK} = -1$. On this line, the two dominant equimodular eigenvalues come from the sector $\chi_{1,5}$.
We have found 15 T points (one real point and seven pairs of complex conjugate T points).
The real point is x = 1. The phase structure is vastly more complicated than that for L = 2. In
particular, it contains three non-connected pieces, and four bulb-like regions. On the real x-axis,
the dominant eigenvalue comes from
$x_{c,1} = -\sqrt{3} = x_- \approx -1.7320508076$, (3.26a)
$x_{c,2} \approx -1.3678583305$, (3.26b)
$x_{c,3} \approx -1.2237725061$, (3.26c)
$x_{c,4} = -\sqrt{3}/2 \approx -0.8660254038$. (3.26d)
This is the strip with smallest width for which a (complex conjugate) pair of endpoints appears:
$x \approx -0.9951436066 \pm 0.00444309186\, i$. These points are very close to the transition point $x_{\rm BK} =
-1$. We have found 36 pairs of conjugate T points. We have also found three multiple points at
$x = -\sqrt{3}$, and $x = -\sqrt{3}/2 \pm i/2$. The dominant sectors on the real $x$-axis are
$\chi_{1,1}$ for $x \in (-\infty, x_{c,3}) \cup (-\sqrt{3}/2, \infty)$,
$\chi_{1,5}$ for $x \in (x_{c,3}, -\sqrt{3}/2)$.
For $L = 5$, there are six real phase-transition points
$x_{c,1} \approx -2.3018586529$, (3.27a)
$x_{c,2} = -\sqrt{3} = x_- \approx -1.7320508076$, (3.27b)
$x_{c,3} \approx -1.4373407728$, (3.27c)
$x_{c,4} \approx -1.3412360954$, (3.27d)
$x_{c,5} \approx -1.2613579653$, (3.27e)
$x_{c,6} = -\sqrt{3}/2 \approx -0.8660254038$. (3.27f)
We have also found a horizontal line running between the T points $x \approx -1.0226306002$ and
$x \approx -0.9984031794$. The dominant sectors on the real $x$-axis are
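$x_{c,1} = -\sqrt{3} = x_- \approx -1.7320508076$, (3.28a)
$x_{c,2} \approx -1.2852299467$, (3.28b)
$x_{c,3} \approx -1.2238569234$, (3.28c)
$x_{c,4} \approx -1.1271443188$, (3.28d)
$x_{c,5} \approx -1.0085262838$, (3.28e)
$x_{c,6} = -\sqrt{3}/2 \approx -0.8660254038$. (3.28f)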
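$$T_1 = \frac{1}{2} \begin{pmatrix} (x + 2)^3 & \sqrt{3}\, x (x + 2)^2 \\ \sqrt{3}\, x^3 & x^2 (2 + 3x) \end{pmatrix},$$ (3.29a)
$$T_3 = \frac{1}{6} \begin{pmatrix} 3x (x + 2)^2 & 0 & 3\sqrt{3}\, x^2 (x + 2) \\ 2\sqrt{6}\, x^2 & 2x (3x + 4) & 2\sqrt{2}\, x (3x + 2) \\ \sqrt{3}\, x^2 (3x + 2) & 4\sqrt{2}\, x & x (3x + 2)^2 \end{pmatrix},$$ (3.29b)
$$T_5 = x^2.$$ (3.29c)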
For real $x$, we find a multiple point at $x = -1$, where all eigenvalues become equimodular
with $|\lambda_i| = 1$. The dominant sector on the real $x$-axis is always $\chi_{1,1}$.
Fig. 8. Limiting curves for the square-lattice RSOS model with $p = \infty$ ($Q = 4$) and several widths: $L = 2$ (a), $L = 3$
(b), and L = 4 (c). Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The solid
squares 2 show the values where Baxter found the free energy. In the regions displayed in light gray (respectively, white)
the dominant eigenvalue comes from the sector 1,3 (respectively, 1,1 ). In the regions displayed in a darker gray the
dominant eigenvalue comes from the sector 1,5 . (For interpretation of the references to colour in this figure legend, the
reader is referred to the web version of this article.)
3.4.2. $L \ge 3$
For $L = 3$ there are two real phase-transition points: $x = -1$ (which is a multiple point), and
$x_c \approx -1.6424647621$. We have found ten pairs of complex conjugate T points and a pair of
complex conjugate endpoints. The dominant sectors on the real $x$-axis are $\chi_{1,3}$ for $x < -1$, and
$\chi_{1,1}$ for $x > -1$. The sector $\chi_{1,5}$ is only dominant in two complex conjugate regions off the real
$x$-axis, and the sector $\chi_{1,7}$ is never dominant.
For $L = 4$ we only find a single real phase-transition point at $x = -1$. We have also found
32 pairs of complex conjugate T points and two pairs of complex conjugate endpoints. The dominant sector on the real $x$-axis is always $\chi_{1,1}$. There are also two complex conjugate regions where
the dominant eigenvalue comes from the sector $\chi_{1,5}$, and the sectors $\chi_{1,7}$ and $\chi_{1,9}$ are never
dominant in the complex $x$-plane.
For $L = 5$ we find four real phase-transition points at
$x_{c,1} = -1.9465787472$, (3.30a)
$x_{c,2} = -1.5202407889$, (3.30b)
$x_{c,3} = -1.3257163278$, (3.30c)
$x_{c,4} = -1$. (3.30d)
The dominant sectors are $\chi_{1,3}$ for $x \in (-\infty, x_{c,1}) \cup (x_{c,2}, -1)$; and $\chi_{1,1}$ in the region $x \in
(x_{c,1}, x_{c,2}) \cup (-1, \infty)$.
For $L = 4$ we only find a single real phase-transition point at $x = -1$. The dominant sector
on the real $x$-axis is always $\chi_{1,1}$.
In all cases $3 \le L \le 5$, the point $x = -1$ is a multiple point where all the eigenvalues are
equimodular with $|\lambda_i| = 1$.
4. Common features of the square-lattice limiting curves with free cyclic boundary
conditions
From the numerical data discussed in Sections 3.1-3.3, we can make the following conjecture
that states that certain points in the complex $x$-plane belong to the limiting curve $\mathcal{B}_L$:
Conjecture 4.1. For the square-lattice $Q$-state Potts model with $Q = B_p$ and widths $L \ge 2$:
1. The points $x = -e^{\pm i\pi/p}$ belong to the limiting curve. At these points, all the eigenvalues are
equimodular with $|\lambda_i| = 1$.$^{11}$ Thus, they are in general multiple points.
2. For even $p$, the point $x = -\sqrt{Q}/2$ always belongs to the limiting curve $\mathcal{B}_L$.$^{12}$ Furthermore,
if $p = 4, 6$, then the point $x = -\sqrt{Q}$ also belongs to $\mathcal{B}_L$.
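The special points named in Conjecture 4.1 can be tabulated directly from $p$. The sketch below assumes that they are $x = -e^{\pm i\pi/p}$, $x = -\sqrt{Q}/2$ and $x = -\sqrt{Q}$ with $Q = B_p = 4\cos^2(\pi/p)$; the overall minus signs are an assumption of this sketch, and the function and key names are illustrative.

```python
import cmath
import math

def beraha(p):
    """Beraha number B_p = 4 cos^2(pi/p)."""
    return 4 * math.cos(math.pi / p) ** 2

def special_points(p):
    """Points conjectured to lie on the limiting curve for Q = B_p.

    'multiple': the pair -exp(+-i pi/p);
    'half':     -sqrt(Q)/2, listed for even p only;
    'full':     -sqrt(Q), listed for p = 4 and p = 6 only.
    """
    q = beraha(p)
    pts = {"multiple": (-cmath.exp(1j * math.pi / p), -cmath.exp(-1j * math.pi / p))}
    if p % 2 == 0:
        pts["half"] = -math.sqrt(q) / 2
    if p in (4, 6):
        pts["full"] = -math.sqrt(q)
    return pts
```

The absolute values returned for $p = 4, 5, 6$ coincide with the decimal values quoted in Section 3 (0.7071067812, 1.4142135624, 0.8090169944, 0.5877852523, 0.8660254038 and 1.7320508076).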
The phase structure for the models considered above shows certain regularities on the real
$x$-axis (which contains the physical regime of the model). In particular, we conclude
Conjecture 4.2. For the square-lattice $Q$-state Potts model with $Q = B_p$ and widths $L \ge 2$:
1. The relevant eigenvalue on the physical line $v \in [-1, \infty)$ comes from the sector $\chi_{1,1}$.
2. For even $L$, the leading eigenvalue for real $x$ always comes from the sector $\chi_{1,1}$, except
perhaps in an interval contained in $[-\sqrt{Q}, -\sqrt{Q}/2]$.
3. For odd $L$, the leading eigenvalue for real $x$ comes from the sector $\chi_{1,3}$ for all $x < x_0 \le -\sqrt{Q}$,
and from the sector $\chi_{1,1}$ for all $x \ge -\sqrt{Q}/2$.
11 This property has been explicitly checked for all the widths reported in this paper.
12 This property has been verified for $p = 8, 10$ and $2 \le L \le 6$.
In the limiting case $p = \infty$ the RSOS construction simplifies. Namely, the quantum group
$U_q(SU(2))$ reduces to the classical $U(SU(2))$ (i.e., $q \to 1$), and its representations no longer
couple different $K_{1,2j+1}$, cf. Eq. (2.4). Accordingly we have simply $K_{1,2j+1} = \chi_{1,2j+1}$. When
increasing $p$ along the line $x_{\rm BK}(Q)$, the sector $K_{1,2j+1}$ which dominates for irrational $p$ will
have higher and higher spin $j$ [7]; this is even true throughout the Berker-Kadanoff phase.$^{13}$ One
would therefore expect that the $p = \infty$ RSOS model will have a dominant sector $\chi_{1,2j+1}$ with $j$
becoming larger and larger as one approaches $x_{\rm BK}(Q = 4) = -1$.
This argument should however be handled with care. Indeed, for $p \to \infty$ the BK phase contracts to a point, $(Q, v) = (4, -2)$, and this point turns out to be a very singular limit of the Potts
model. In particular, one has $x_{\rm BK} = x_\pm$ for $Q = 4$, and very different results indeed are obtained
depending on whether one approaches $(Q, v) = (4, -2)$ along the AF or the BK curves (1.4).
This is visible, for instance, on the level of the central charge, with $c \to 2$ in the former and
$c \to -\infty$ in the latter case. To wit, taking $x \to -1$ after having fixed $p = \infty$ in the RSOS model
is yet another limiting prescription, which may lead to different results.
The phase diagrams for $Q = 4$ ($p \to \infty$) do agree with the above general Conjectures 4.1
and 4.2. In particular, when $p \to \infty$, the multiple points $-e^{\pm i\pi/p} \to -1 = x_{\rm BK}$ (Conjecture 4.1.1)
and this coincides with the point $-\sqrt{Q}/2$ (Conjecture 4.1.2). On the other hand,
the sector $\chi_{1,1}$ is the dominant one on the physical line $v \in [-1, \infty)$ (Conjecture 4.2.1), and we
observe a parity effect on the unphysical regime $v \in (-\infty, -1)$. For even $L$, the only dominant
sector is $\chi_{1,1}$ in agreement with Conjecture 4.2.2 (although there is no interval inside $[-2, -1]$
where $\chi_{1,3}$ becomes dominant). For odd $L$, Conjecture 4.2.3 also holds with $x_0 = -\sqrt{Q} = -2$
(at least for $L = 3, 5$). For $L = 2, 3, 4$, we find that in addition to the sectors $\chi_{1,1}$ and $\chi_{1,3}$, only
the sector $\chi_{1,5}$ becomes relevant in some regions in the complex $x$-plane.
4.1. Asymptotic behavior for $|x| \to \infty$
Figs. 5-8 show a rather uncommon scenario: the limiting curves contain outward branches. As
a matter of fact, these branches extend to infinity (i.e., they are unbounded$^{14}$), in sharp contrast
with the bounded limiting curves obtained using free longitudinal boundary conditions [39,40].
It is important to remark that this phenomenon also holds in the limit $p \to \infty$, as shown in Fig. 8.
As $|x| \to \infty$ these branches converge to rays with definite slopes. More precisely, our numerical data suggest the following conjecture$^{15}$:
Conjecture 4.3. For any value of $p$, the limiting curve $\mathcal{B}_L$ for a square-lattice strip has exactly
$2L$ outward branches. As $|x| \to \infty$, these branches are asymptotically rays with
$$\arg x \equiv \theta_n(L) = \pi \left( \frac{n}{L} - \frac{1}{2L} \right), \quad n = 1, 2, \ldots, 2L.$$ (4.1)
By inspection of Figs. 5-8, it is also clear that the only two sectors that are relevant in this
regime are $\chi_{1,1}$ and $\chi_{1,3}$. In particular, the dominant eigenvalue belongs to the $\chi_{1,1}$ sector for
large positive real $x$, and each time we cross one of these outward branches, the dominant eigenvalue switches the sector it comes from. In particular, we conjecture that
Conjecture 4.4. The dominant eigenvalue for a square-lattice strip of width $L$ in the large $|x|$
regime comes from the sector $\chi_{1,1}$ in the asymptotic regions
$$\arg x \in \left( \theta_{2n}(L), \theta_{2n+1}(L) \right), \quad n = 1, 2, \ldots, L,$$ (4.2)
where $\theta_{2L+1}(L)$ is understood as $\theta_1(L) + 2\pi$. In the other asymptotic regions the dominant eigenvalue comes from the sector $\chi_{1,3}$.
In particular, this means that for large positive $x$ the dominant sector is always $\chi_{1,1}$. However,
for large negative $x$ the dominant eigenvalue comes from $\chi_{1,1}$ if $L$ is even, and from $\chi_{1,3}$ if $L$ is
odd. Thus, this conjecture is compatible with Conjecture 4.2.
13 See Ref. [34] for numerical evidence along the chromatic line $x = -1/\sqrt{Q}$ which intersects the BK phase up to
$p = 12$ [15].
14 An unbounded branch is one which does not have a finite endpoint.
15 Chang and Shrock [42] observed for $L = 3$ that if we plot the limiting curve in the variable $u = 1/(x\sqrt{Q} + 1)$, then
the point $u = 0$ is approached at specific angles $\arg u$ consistent with our Conjecture 4.3.
An empirical explanation of this fact comes from the computation of the asymptotic expansion
for large $|x|$ of the leading eigenvalues in each sector. It turns out that there is a unique leading
eigenvalue in each sector $\chi_{1,1}$ and $\chi_{1,3}$ when $|x| \to \infty$. As there is a unique eigenvalue in this
regime, we can obtain it by the power method [44]. Our numerical results suggest the following
conjecture
Conjecture 4.5. Let $\lambda_{\star,1}(L)$ (respectively, $\lambda_{\star,3}(L)$) be the leading eigenvalue of the sector $\chi_{1,1}$
(respectively, $\chi_{1,3}$) in the regime $|x| \to \infty$. Then
$$\lambda_{\star,1}(L) = Q^{(L-1)/2} x^{2L-1} \left[ 1 + \sum_{k=1}^{\infty} \frac{a_k(L)}{Q^{k/2}} x^{-k} \right],$$ (4.3a)
$$\lambda_{\star,1}(L) - \lambda_{\star,3}(L) = \sqrt{Q}\, x^{L-1} + 3(L-1) x^{L-2} + O(x^{L-3}).$$ (4.3b)
Furthermore, we have that
$$a_1(L) = 2L - 1, \quad L \ge 2,$$ (4.4a)
$$a_2(L) = 2L^2 - 3L + 1, \quad L \ge 3.$$ (4.4b)
The first coefficients $a_k(L)$ are displayed in Table 1; the patterns displayed in (4.4) are easily
verified. The coefficients $a_k(L)$ also depend on $p$ for $k \ge 3$.
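The two closed forms for $a_1(L)$ and $a_2(L)$ can be checked against the tabulated coefficients in a few lines. The sketch below uses the $p = 4$ values of $a_1$ and $a_2$ as read from Table 1 (widths $L = 2, \ldots, 7$); the dictionary name is illustrative, and the formulas assumed are $a_1(L) = 2L - 1$ and $a_2(L) = 2L^2 - 3L + 1$.

```python
def a1(L):
    """Conjectured first coefficient, valid for L >= 2."""
    return 2 * L - 1

def a2(L):
    """Conjectured second coefficient, valid for L >= 3."""
    return 2 * L * L - 3 * L + 1

# (a1, a2) for p = 4 and L = 2..7, transcribed from Table 1.
TABLE_P4 = {2: (3, 4), 3: (5, 10), 4: (7, 21), 5: (9, 36), 6: (11, 55), 7: (13, 78)}

def formulas_match(table):
    """True if a1 matches for all L >= 2 and a2 matches for all L >= 3."""
    ok_a1 = all(a1(L) == row[0] for L, row in table.items())
    ok_a2 = all(a2(L) == row[1] for L, row in table.items() if L >= 3)
    return ok_a1 and ok_a2
```

The restriction $L \ge 3$ for $a_2$ matters: at $L = 2$ the tabulated value is 4, whereas the quadratic would give 3.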
Indeed, the above conjecture explains easily the observed pattern for the leading sector when
$x$ is real. But it also explains the observed pattern for all the outward branches. These branches
are defined by the equimodularity of the two leading eigenvalues
$$|\lambda_{\star,1}| = |\lambda_{\star,3}| = \left| \lambda_{\star,1} - \sqrt{Q}\, x^{L-1} + O(x^{L-2}) \right|.$$ (4.5)
This implies that
$${\rm Re} \left[ \lambda_{\star,1}\, \bar{x}^{L-1} \right] = 0,$$ (4.6)
where $\bar{x}$ is the complex conjugate of $x$. Then, if $x = |x| e^{i\theta}$, the above equation reduces to
$$\cos(L\theta) = 0 \;\Rightarrow\; \theta_n = \frac{\pi}{2L} (2n - 1), \quad n = 1, 2, \ldots, 2L,$$ (4.7)
in agreement with Eq. (4.1).
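The asymptotic angles and the resulting sector pattern are easy to tabulate. The sketch below assumes $\theta_n(L) = \pi(2n-1)/(2L)$ and, as a leading-order reading of Conjecture 4.5, that the sign of $\cos(L \arg x)$ decides which of the two sectors dominates at large $|x|$ (positive for $\chi_{1,1}$); the function names and the string labels are illustrative.

```python
import math

def theta(n, L):
    """Asymptotic angle of the n-th outward branch, n = 1, ..., 2L."""
    return math.pi * (2 * n - 1) / (2 * L)

def dominant_sector(arg_x, L):
    """Leading-order dominant sector at large |x| for a given argument of x.

    Assumes |lambda_1|^2 - |lambda_3|^2 is proportional to cos(L * arg x)
    at leading order, so the sign of the cosine selects the sector.
    """
    return "chi_11" if math.cos(L * arg_x) > 0 else "chi_13"
```

With this rule the positive real axis is always in the $\chi_{1,1}$ region, while the negative real axis is in the $\chi_{1,1}$ region for even $L$ and in the $\chi_{1,3}$ region for odd $L$, as stated in the text.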
Table 1
First $L$ coefficients $a_k$ for the leading eigenvalue $\lambda_{\star,1}(L)$ coming from the sector $\chi_{1,1}$ for a square-lattice strip of width $L$

p         L   a1   a2             a3               a4                a5                 a6                 a7
4         2   3    4
          3   5    10             13
          4   7    21             37               48
          5   9    36             86               143               186
          6   11   55             167              352               564                739
          7   13   78             288              742               1444               2256               2973
5         2   3    3 + √B5
          3   5    10             10 + 3√B5
          4   7    21             35 + 2√B5        35 + 13√B5
          5   9    36             84 + 2√B5        126 + 17√B5       128 + 60√B5
          6   11   55             165 + 2√B5       330 + 22√B5       464 + 102√B5       479 + 277√B5
6         2   3    5
          3   5    10             16
          4   7    21             39               61
          5   9    36             88               160               250
          6   11   55             169              374               670                1050
          7   13   78             290              769               1605               2838               4470
∞         2   3    6
          3   5    10             19
          4   7    21             41               70
          5   9    36             90               177               318
          6   11   55             171              396               780                1395
Remark. The existence of unbounded outward branches for the limiting curve of the Potts model
with cyclic boundary conditions is already present for the simplest case $L = 1$. Here, the strip is
just the cyclic graph of $n$ vertices $C_n$. Its partition function is given exactly by
$$Z_{C_n}(Q, v) = (Q + v)^n + (Q - 1) v^n.$$ (4.8)
Furthermore, the limiting curve is the line ${\rm Re}\, x = -\sqrt{Q}/2$, which, as $|x| \to \infty$, has slopes given
by $\pm\pi/2$, in agreement with Conjecture 4.3.
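The cyclic-graph formula can be confirmed by brute force for small integer $Q$. The sketch below is illustrative only: it sums the Potts weight $\prod_{\langle ij \rangle} (1 + v\, \delta_{\sigma_i \sigma_j})$ over all spin configurations of $C_n$ and compares with $(Q+v)^n + (Q-1)v^n$. It also checks that the two eigenvalues $Q + v$ and $v$ are equimodular exactly on the line ${\rm Re}\, v = -Q/2$, which is the line ${\rm Re}\, x = -\sqrt{Q}/2$ in the variable $x = v/\sqrt{Q}$.

```python
from itertools import product

def z_cycle_bruteforce(Q, v, n):
    """Potts partition function on the cycle C_n by direct spin summation."""
    total = 0.0
    for spins in product(range(Q), repeat=n):
        weight = 1.0
        for i in range(n):
            if spins[i] == spins[(i + 1) % n]:   # edge between i and i+1 (cyclic)
                weight *= 1 + v
        total += weight
    return total

def z_cycle_formula(Q, v, n):
    """Closed form (Q + v)^n + (Q - 1) v^n."""
    return (Q + v) ** n + (Q - 1) * v ** n

def equimodular(Q, v, tol=1e-12):
    """True if the two eigenvalues Q + v and v have the same modulus."""
    return abs(abs(Q + v) - abs(v)) < tol
```

The closed form follows from the $Q \times Q$ single-bond transfer matrix (all-ones matrix plus $v$ times the identity), which has the eigenvalue $Q + v$ once and the eigenvalue $v$ with multiplicity $Q - 1$.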
4.2. Other asymptotic behaviors
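For the Ising case ($p = 4$) the points $x = -\sqrt{2}$ and $x = -1/\sqrt{2}$ are in general multiple points
and we observe a pattern similar to the one observed for $|x| \to \infty$.
For $x = -1/\sqrt{2}$, we find that, if we write $x = -1/\sqrt{2} + \epsilon$ with $|\epsilon| \ll 1$, within each sector
there is only one leading eigenvalue $\lambda_{\star,j}(L) \sim O(1)$. More precisely, for $L \ge 3$,
$$\lambda_{\star,1}(L) = 2^{-L/2} + O(\epsilon^3),$$ (4.9a)
$$\lambda_{\star,1}(L) - \lambda_{\star,3}(L) = 2 \epsilon^L + O(\epsilon^{L+1}).$$ (4.9b)
Again, the equimodularity condition when $|\epsilon| \to 0$ implies that ${\rm Re}(\epsilon^L) = 0$, whence $\arg \epsilon = \theta_n$
with $\theta_n$ given by Eq. (4.7).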
,1 (L) = 2(L1)/2 + O 2
,1 (L),
(4.10a)
2 L1 ,
L even,
(1)
(2)
,1 (L) +
,1 (L) =
(4.10b)
L
2(L 1) , L odd,
(1)
(2)
,3 (L) = 2(L1)/2 + O 2
,3 (L),
(4.10c)
L
2(L 1) , L even,
(1)
(2)
,3 (L) +
,3 (L) =
(4.10d)
L odd,
2 L1 ,
(1)L
(1)
(1)
,1 (L) +
,3 (L) = L1 + O L .
(4.10e)
2
The equimodularity condition implies that
Re L1 = 0 cos (L 2) = 0.
(4.11)
n =
(4.12)
(2n 1), n = 1, . . . , 2(L 2).
2(L 2)
5. Triangular-lattice Potts model with free cyclic boundary conditions
5.1. Ising model (p = 4)
For this model we know [16-18] the exact transition temperature for the antiferromagnetic
model $v_{c,\rm AF} = -1 = v_{c,\rm BK}$. The partition function is given by a formula similar to that of the
square lattice, and the dimensionality of $T_j(2, L)$ is the same as for the square lattice. In what
follows we give the different matrices in the same bases as for the square lattice.
We have computed the limiting curves $\mathcal{B}_L$ for $L = 2, 3, 4$. These curves are displayed in
Fig. 9(a)-(c).$^{16}$ In Fig. 9(d), we show all three curves for comparison.
5.1.1. L = 2
This strip is drawn in Fig. 4. The transfer matrices are
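$$T_1 = \frac{1}{2} \begin{pmatrix} 2x^4 + 5\sqrt{2}\, x^3 + 12 x^2 + 8\sqrt{2}\, x + 4 & x (2x^3 + 5\sqrt{2}\, x^2 + 8x + 2\sqrt{2}) \\ x^2 (2x^2 + 3\sqrt{2}\, x + 2) & x^2 (2x^2 + 3\sqrt{2}\, x + 2) \end{pmatrix},$$ (5.1a)
$$T_3 = \frac{x}{2} \begin{pmatrix} 2x^3 + 5\sqrt{2}\, x^2 + 8x + 2\sqrt{2} & 2x^3 + 5\sqrt{2}\, x^2 + 8x + 2\sqrt{2} \\ x (2x^2 + 3\sqrt{2}\, x + 2) & 2x^3 + 3\sqrt{2}\, x^2 + 6x + 2\sqrt{2} \end{pmatrix}.$$ (5.1b)
For real $x$, there is a single phase-transition point at
$$x_c = -1/\sqrt{2} \approx -0.7071067812.$$ (5.2)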
16 After the completion of this work, we learned that Chang and Shrock had obtained the limiting curve for L = 2 [41,
Fig. 18].
Fig. 9. Limiting curves for the triangular-lattice RSOS model with $p = 4$ and several widths: $L = 2$ (a), $L = 3$ (b), and
$L = 4$ (c). For each width $L$, we also show the partition-function zeros for finite strips of dimensions $L_F \times (10L)_P$
(black), $L_F \times (20L)_P$ (red), and $L_F \times (30L)_P$ (brown). Figure (d) shows all these limiting curves together:
$L = 2$ (black), $L = 3$ (red), $L = 4$ (green). The solid squares show the values where Baxter found the free energy.
A separate symbol in (a) marks the position of the isolated limiting point that was found. In the regions displayed in gray (respectively, white) the dominant eigenvalue comes from the sector $\chi_{1,3}$ (respectively, $\chi_{1,1}$). The gray ellipse corresponds to
$({\rm Re}\, x + 1/\sqrt{2})^2 + 3 ({\rm Im}\, x)^2 = 3/2$. This curve goes through the points $x = -e^{\pm i\pi/4}$. (For interpretation of the references
to colour in this figure legend, the reader is referred to the web version of this article.)
$${\rm Re}\, x = -1/\sqrt{2}$$ (5.3)
For $L = 3, 4$ we have found that (a) The line ${\rm Re}\, x = -1/\sqrt{2}$ belongs to the limiting curve;
(b) $\mathcal{B}_L$ is symmetric under reflection with respect to that line; (c) $\mathcal{B}_L$ contains a pair of multiple
points at $x = -e^{\pm i\pi/4}$; and (d) The dominant sector on the real $x$-axis is $\chi_{1,1}$ for $x > -1/\sqrt{2}$,
and $\chi_{1,3}$ for $x < -1/\sqrt{2}$.
1/4
x 2 (1 + 3x + 3 B5 x 2 )
x 2 B5 (1 + 3 B4
x + x 2 )
1/4
B5 + 4x + x 2 X4
B5 X3
B5 (1 + x 5B5 + x 2 X4
)
T3 = x
B5 x
X3
(B5
)1/4 (1 + x)
,
1/4
2
1/4
2
3
1 + 3x + 3x + B5 x
(B5 ) x(1 + 3x + B5 x ) (B5 ) (1 + x)
(5.4b)
where we have defined the shorthand notations
X4
= 1 + 3 B5
+ X3
,
X5
= 1 + 4 B5
+ X3
.
(5.5a)
(5.5b)
(5.6a)
$x_{c,2} \approx -0.5908569607$. (5.6b)
In fact both points are T points and the whole interval $[x_{c,1}, x_{c,2}]$ belongs to the limiting curve $\mathcal{B}_2$.
Finally, there are two complex conjugate multiple points at $x = -e^{\pm i\pi/5}$, as for the square-lattice
case. The dominant sector on the real $x$-axis is $\chi_{1,1}$ for $x > x_{c,1}$, and $\chi_{1,3}$ for $x < x_{c,1}$.
Fig. 10. Limiting curves for the RSOS model with p = 5 and several widths: L = 2 (a), L = 3 (b), and L = 4 (c).
Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The solid squares 2 show the
values where Baxter found the free energy. In the regions displayed in light gray (respectively, white) the dominant
eigenvalue comes from the sector 1,3 (respectively, 1,1 ). (For interpretation of the references to colour in this figure
legend, the reader is referred to the web version of this article.)
5.2.2. L ≥ 3
For L = 3, there are two real phase-transition points at
$x_{c,1} \approx -1.0976251052,$
(5.7a)
$x_{c,2} \approx -0.6376476917.$
(5.7b)
(5.8a)
$x_{c,2} \approx -0.9708876996,$
(5.8b)
$x_{c,3} \approx -0.6102005246.$
(5.8c)
The points $x_{c,2}$ and $x_{c,3}$ are T points, and they define a line belonging to the limiting curve.
This line contains two multiple points at $x \approx -0.6319374252$ and $x \approx -0.7685805289$. We
have found two additional pairs of complex conjugate endpoints at x 0.9270404586
0.3749352143i, and x = ei/5 . In addition, there are 22 pairs of complex conjugate T points.
The dominant sectors on the real $x$-axis are
$\chi_{1,1}$ for $x \in (x_{c,1}, x_{c,2}) \cup (x_{c,3}, \infty)$,
$\chi_{1,3}$ for $x \in (-\infty, x_{c,1}) \cup (x_{c,2}, x_{c,3})$.
For L = 5, we have found five real phase-transition points at
$x_{c,1} \approx -1.0945337809,$
(5.9a)
$x_{c,2} \approx -1.0615208835,$
(5.9b)
$x_{c,3} \approx -0.8629689747,$
(5.9c)
$x_{c,4} \approx -0.6393693994,$
(5.9d)
$x_{c,5} \approx -0.6362471039.$
(5.9e)
The result is $x_{c,1} \approx -0.6221939194 < -1/\sqrt{B_5}$. The sector $\chi_{1,1}$ dominates for all $x > x_{c,1}$; and for $x \le x_{c,1}$, the sector $\chi_{1,3}$ dominates.
5.3. Three-state Potts model (p = 6)
For this model we also know that there is a first-order phase transition in the antiferromagnetic
regime at [40,45]
$x_{c,\mathrm{AF}}(q = 3) = -0.563512(14).$
(5.10)
Fig. 11. Limiting curves for the triangular-lattice RSOS model with p = 6 and several widths: L = 2 (a), L = 3 (b), and L = 4 (c). Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The solid squares ($\blacksquare$) show the values where Baxter found the free energy. In the regions displayed in light gray (respectively, white) the dominant eigenvalue comes from the sector $\chi_{1,3}$ (respectively, $\chi_{1,1}$). In the regions displayed in a darker gray the dominant eigenvalue comes from the sector $\chi_{1,5}$. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
We have computed the limiting curves $B_L$ for L = 2, 3, 4. These curves are displayed in Fig. 11(a)–(c).17 In Fig. 11(d), we show all three curves for comparison.
17 After the completion of this work, we learned that Chang and Shrock had obtained the limiting curve for L = 2 [41, Fig. 19].
5.3.1. L = 2
The transfer matrices are
1 x 4 + 2 3x 3 + 6x 2 + 4 3x + 3 x 2(x 3 + 2 3x 2 + 4x + 3 )
,
T1 =
(5.11a)
x 2 2(x 2 + 3x + 1)
x 2 (2x 2 + 2 3x + 1)
2
2X1
2(2x 3 + 4 3x 2 + 7x + 3 ),
2(x 3 + 2 3x 2 + 4x + 3 )
x
,
T3 =
2x
X2
X2
2
x 2(2x 2 + 2 3x + 1)
X2
4x 3 + 4 3x 2 + 6x + 3
(5.11b)
T5 = x 2 .
xc,1 = 2/ 3 1.1547005384,
xc,2 = 1/ 3 0.5773502692.
(5.11c)
(5.12a)
(5.12b)
points at
x = ei/6 = 3/2
i/2.
The dominant sectors
onthe real x-axis are: 1,1 for
xc,1 = 2/ 3 1.1547005384,
xc,2 0.9712924104,
xc,3 = 1/ 3 0.5773502692.
(5.13a)
(5.13b)
(5.13c)
The latter one is actually a multiple point. We have found two pairs of complex conjugate endpoints at x 0.3495004588 0.6911735024i, and x 0.2862942369 0.8514701201i.
There are 16 pairs of complex conjugate T points. The dominant sectors on the real $x$-axis are $\chi_{1,1}$ for $x > -1/\sqrt{3}$, $\chi_{1,3}$ for $x < -2/\sqrt{3}$, and $\chi_{1,5}$ for $x \in (-2/\sqrt{3}, -1/\sqrt{3})$.
For L = 4, there are five real phase-transition points at
$x_{c,1} = -2/\sqrt{3} \approx -1.1547005384,$
(5.14a)
$x_{c,2} \approx -1.0219801955,$
(5.14b)
$x_{c,3} \approx -1.0041094453,$
(5.14c)
$x_{c,4} \approx -0.7664034488,$
(5.14d)
$x_{c,5} = -1/\sqrt{3} \approx -0.5773502692.$
(5.14e)
The points xc,3 and xc,4 are T points, while xc,5 is a multiple point. We have found a pair of
complex conjugate endpoints at x 0.3857232364 0.6652216322i. In addition, there are 14
pairs of complex conjugate T points. The dominant sectors on the real x-axis are
$\chi_{1,1}$ for $x > x_{c,4}$,
$\chi_{1,3}$ for $x < -2/\sqrt{3}$ and $x \in (x_{c,2}, x_{c,3})$,
$\chi_{1,5}$ for $x \in (-2/\sqrt{3}, x_{c,2}) \cup (x_{c,3}, x_{c,4})$.
xc,1 = 2/ 3 1.1547005384,
xc,2 0.9326923327,
(5.15a)
(5.15b)
xc,3 0.7350208125,
(5.15c)
xc,4 0.6186679617,
xc,5 = 1/ 3 0.5773502692.
(5.15d)
(5.15e)
xc,1 = 2/ 3 1.1547005384,
xc,2 1.0504774228,
xc,3 = 1/ 3 0.5773502692.
(5.16a)
(5.16b)
(5.16c)
We have also found a small horizontal line belonging to the limiting curve B6 and bounded by
the T points
xc,4 0.7688389273,
xc,5 0.7646464215.
(5.17a)
(5.17b)
1 X8 (2x 3 + 3x 2 + 6x + 4)
3xX8 X7
T1 =
,
3x 2 X7
x 2 X6
2
3xX8 (2x 2 + 3x + 2)
3xX8 X6
2 6xX8
2
1
T3 =
2 6x
2x(4 + 3x)
2 2x(2 + 3x) ,
2
6
3x X6
2 2x(3x + 2)
xX9
T5 = x 2 ,
(5.18a)
(5.18b)
(5.18c)
Fig. 12. Limiting curves for the triangular-lattice RSOS model with p = ∞ (Q = 4) and several widths: L = 2 (a), L = 3 (b), and L = 4 (c). Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The solid squares ($\blacksquare$) show the values where Baxter found the free energy. In the regions displayed in light gray (respectively, white) the dominant eigenvalue comes from the sector $\chi_{1,3}$ (respectively, $\chi_{1,1}$). In regions displayed in a darker gray the dominant eigenvalue comes from the sector $\chi_{1,5}$. In (c), an even darker gray marks the regions with a dominant eigenvalue coming from the sector $\chi_{1,7}$. The gray circle corresponds to (7.8). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
(5.19a)
X7 = 2x + 3x + 2,
(5.19b)
X8 = x + 2,
(5.19c)
(5.19d)
2. The relevant eigenvalue belongs to the sector $\chi_{1,3}$ for all real $x < -2/\sqrt{Q}$.
The above conjectures also apply to the limiting case $p \to \infty$ (i.e., Q = 4). As for the square-lattice case, the multiple points $-e^{\pm i\pi/p} \to -1$ as $p \to \infty$ (Conjecture 6.1.1), in agreement with the fact that $x = -1$ is a multiple point for Q = 4. Furthermore, this is also in agreement with Conjecture 6.1.2, as in this limit, $-2/\sqrt{Q} = -1$. The dominant sectors for $p \to \infty$ also agree with Conjecture 6.2: on the physical line $v \in [-1, \infty)$ the dominant sector is $\chi_{1,1}$, and for $x < -1$, the dominant sector is $\chi_{1,3}$. More precisely, we can state the following conjecture based on the empirical observations reported above:
Conjecture 6.3. For the triangular-lattice 4-state Potts model defined on a semi-infinite strip of width $L \ge 2$, there exists some $x_c(L) > -1$ such that $\chi_{1,1}$ is dominant for $x > x_c(L)$, $\chi_{1,2L-1}$ is dominant for $-1 < x < x_c(L)$, and $\chi_{1,3}$ is dominant for $x < -1$.
6.1. Asymptotic behavior for $|x| \to \infty$
Figs. 9–12 show a similar scenario to the one discussed in Section 4: there are several unbounded outward branches with a clear asymptotic behavior for large |x|. Again, this scenario also holds in the limit $p \to \infty$ (see Fig. 12). However, there are quantitative differences with the scenario found for the square lattice. We should modify Conjecture 4.5 as follows:
Conjecture 6.4. Let
,1 (L) (respectively,
,3 (L)) be the leading eigenvalue of the sector 1,1
(respectively, 1,3 ) in the regime |x| . Then
bk (L) k
L1 3L2
,1 (L) = Q
(6.1a)
x
x
1+
,
Qk/2
k=1
,1 (L)
,3 (L) = 2L1 Qx L1 + (L 1)2L1 x L2 + O x L3 .
(6.1b)
Furthermore, we have that
$b_1(L) = 3L - 2, \quad L \ge 2,$
(6.2a)
$b_2(L) = \tfrac{9}{2} L^2 - \tfrac{15}{2} L + 3, \quad L \ge 2,$
(6.2b)
18 This property has been verified for p = 6 and $2 \le L \le 7$, and for p = 8, 10 and $2 \le L \le 5$.
19 For p = 5, we find that $L_0 = 5$. For L = 2, 4, the relevant eigenvalue belongs to the sector $\chi_{1,3}$ on a small portion of the antiferromagnetic physical line $v \in [-1, v_0]$.
Table 2
First min(2L 1, 7) coefficients bk for the leading eigenvalue
,1 (L) coming from the sector 1,1 for a triangular-lattice
strip of width L
p
b1
b2
b3
2
3
4
5
6
7
8
4
7
10
13
16
19
22
6
21
45
78
120
171
231
2
3
4
5
6
4
7
10
13
16
6
21
45
78
120
6
35
120
286
560
969
1540
4 + 2 B5
35
120
286
560
2
3
4
5
6
7
4
7
10
13
16
19
6
21
45
78
120
171
2
3
4
5
6
4
7
10
13
16
6
21
45
78
120
b4
b5
b6
b7
37
212
717
1822
3878
7317
31
264
1305
4392
11658
26370
244
1793
8146
27349
74927
184
1919
11940
51389
172304
35 + 2 B5
210 + 2 B5
715 + 2 B5
1820 + 2 B5
21 + 10 B5
252 + 12 B5
1287 + 18 B5
4368 + 24 B5
210 + 34 B5
1716 + 77 B5
8008 + 138 B5
122 + 64 B5
1718 + 203 B5
11442 + 500 B5
8
35
120
286
560
969
39
214
719
1824
3880
41
276
1323
4416
11688
278
1870
8284
27566
252
2126
12444
52394
10
35
120
286
560
41
216
721
1826
51
288
1341
4440
312
1947
8422
324
2337
12952
$b_3(L) = \tfrac{9}{2} L^3 - \tfrac{27}{2} L^2 + 13L - 4, \quad L \ge 3.$
(6.2c)
The first coefficients $b_k(L)$ are displayed in Table 2; the patterns displayed in (6.2) are easily verified. The coefficients $b_k(L)$ also depend on p for $k \ge 4$.
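The check mentioned above can be sketched in a few lines. The snippet below is only an illustration: the function names are hypothetical, and the dictionaries encode our reading of the first block of Table 2 (widths L = 2, ..., 8), which is an assumption about how the flattened table columns line up.

```python
from fractions import Fraction

def b1(L):
    """Pattern (6.2a): b_1(L) = 3L - 2."""
    return 3 * L - 2

def b2(L):
    """Pattern (6.2b): b_2(L) = (9/2) L^2 - (15/2) L + 3."""
    return Fraction(9, 2) * L**2 - Fraction(15, 2) * L + 3

def b3(L):
    """Pattern (6.2c): b_3(L) = (9/2) L^3 - (27/2) L^2 + 13 L - 4, stated for L >= 3."""
    return Fraction(9, 2) * L**3 - Fraction(27, 2) * L**2 + 13 * L - 4

# Assumed reading of the first block of Table 2 (key = strip width L).
TABLE_B1 = {2: 4, 3: 7, 4: 10, 5: 13, 6: 16, 7: 19, 8: 22}
TABLE_B2 = {2: 6, 3: 21, 4: 45, 5: 78, 6: 120, 7: 171, 8: 231}
TABLE_B3 = {3: 35, 4: 120, 5: 286, 6: 560, 7: 969, 8: 1540}  # L = 2 excluded

def check_patterns():
    """Return True if every tabulated coefficient matches its polynomial."""
    ok1 = all(b1(L) == value for L, value in TABLE_B1.items())
    ok2 = all(b2(L) == value for L, value in TABLE_B2.items())
    ok3 = all(b3(L) == value for L, value in TABLE_B3.items())
    return ok1 and ok2 and ok3

if __name__ == "__main__":
    print("patterns (6.2) consistent with Table 2:", check_patterns())
```

Exact rational arithmetic is used so that the half-integer coefficients do not introduce rounding issues.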
Conjecture 6.4 explains the number of outward branches in the triangular-lattice case, as
well as the observed pattern for the outward branches. Again, these branches are defined by
the equimodularity of the two leading eigenvalues
|
,1 | = |
,3 | =
,1 const x L1 + O x L2 .
(6.3)
(6.4)
(2n 1),
2(2L 1)
n = 1, 2, . . . , 2(2L 1).
(6.5)
Thus, we get the same asymptotic behavior as for the square lattice with the replacement $L \to 2L - 1$.
$D_1 = D(0, 1) \setminus D(-\sqrt{2}, 1),$
(7.1a)
$D_2 = D(0, 1) \cap D(-\sqrt{2}, 1),$
(7.1b)
$D_3 = D(-\sqrt{2}, 1) \setminus D(0, 1),$
(7.1c)
$D_4 = \mathbb{C} \setminus \bigl[ D(0, 1) \cup D(-\sqrt{2}, 1) \bigr].$
(7.1d)
The $L \times N$ strips with even N are bipartite, whence the Ising model possesses the exact gauge symmetry $J \to -J$ (change the sign of the spins on the even sublattice). Since the limit $N \to \infty$ can be taken through even N only, the limiting curves $B_L$ should be gauge invariant. In terms of x the gauge transformation reads
$x \to -\dfrac{x}{1 + x\sqrt{2}}.$
(7.2)
Note that it exchanges $D_2 \leftrightarrow D_4$, while leaving $D_1$ and $D_3$ invariant. In particular, the structures of $B_L$ around $x = -1/\sqrt{2}$ and $|x| = \infty$ discussed in Section 4 are equivalent.
On the other hand, the duality transformation $x \to 1/x$ is not a symmetry of $B_L$: this is due to the fact that the boundary conditions prevent the lattice from being selfdual. Note that the duality exchanges $D_1 \leftrightarrow D_4$ and $D_2 \leftrightarrow D_3$. But whilst there are many branches of $B_L$ in $D_4$, there are none in $D_1$.
The Ising model being very simple, we do however expect the fixed point structure on the real x-axis to satisfy duality. Combining the gauge and duality transformations one can connect all critical fixed points:
$x_{\rm FM} \xleftrightarrow{\ \text{gauge}\ } x_+ \xleftrightarrow{\ \text{duality}\ } x_- \xleftrightarrow{\ \text{gauge}\ } x_{\rm BK},$
(7.3)
and the first and the last points in the series are selfdual. In the same way, all the non-critical (trivial) fixed points are connected:
$x = 0 \xleftrightarrow{\ \text{duality}\ } |x| = \infty \xleftrightarrow{\ \text{gauge}\ } x = -1/\sqrt{2} \xleftrightarrow{\ \text{duality}\ } x = -\sqrt{2},$
(7.4)
and the first and the last points in the series are gauge invariant.
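The two chains of fixed points can be checked numerically. The sketch below assumes that the gauge map is $x \to -x/(1 + x\sqrt{2})$ (which follows from $J \to -J$ with $x = (e^J - 1)/\sqrt{2}$), that $D(a, r)$ denotes the open disc of radius r centred at a, and that the ferromagnetic critical point sits at the selfdual value $x_{\rm FM} = 1$. All function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def gauge(x):
    """Ising gauge map J -> -J in the variable x (assumed form of Eq. (7.2))."""
    return -x / (1 + SQRT2 * x)

def duality(x):
    """Duality map x -> 1/x."""
    return 1 / x

def domain(x):
    """Classify x into D1..D4 of Eq. (7.1), with D(a, r) taken as an open disc."""
    in_unit = abs(x) < 1
    in_shifted = abs(x + SQRT2) < 1
    if in_unit and not in_shifted:
        return "D1"
    if in_unit and in_shifted:
        return "D2"
    if in_shifted:
        return "D3"
    return "D4"

def critical_chain(x_fm=1.0):
    """Apply gauge, duality, gauge starting from x_FM, as in the chain (7.3)."""
    a = gauge(x_fm)
    b = duality(a)
    c = gauge(b)
    return [x_fm, a, b, c]

if __name__ == "__main__":
    print("critical chain:", critical_chain())
    for sample in (0.5, -0.7, -1.5, -0.5 + 0.3j):
        print(sample, domain(sample), "-> gauge:", domain(gauge(sample)),
              "| duality:", domain(duality(sample)))
```

Under these assumptions the chain starting at $x_{\rm FM} = 1$ visits $1 - \sqrt{2}$ and $-1 - \sqrt{2}$ and ends at $-1$, and the sample points illustrate the domain exchanges stated after Eq. (7.2).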
The reason that we discuss these well-known facts in detail is that the square-lattice Ising model is really the simplest example of how taking p rational (here, in fact, integer) profoundly modifies and enriches the fixed/critical point structure of the Potts model, as compared to the generic case of p irrational. Taking the limit $p \to 4$ through irrational values we would have had three equivalent c = 1/2 critical points, RG repulsive in x, situated at $x_{\rm FM}$ and $x_\pm$; one $c = -25/2$ critical point, RG attractive in x, situated at $x_{\rm BK}$; and two non-critical (trivial) fixed points, RG attractive in x, situated at x = 0 and $|x| = \infty$. This makes up for a phase diagram on the real x-axis which is consistent in terms of renormalization group flows (see the top part of Fig. 2).
Conversely, sitting directly at p = 4 replaces this structure by the four repulsive c = 1/2 critical points (7.3) and the four attractive non-critical fixed points (7.4). This again gives a consistent scenario, in which notably the BK phase has disappeared (see the bottom part of Fig. 2). In other cases than the Ising model (p > 4 integer) we could expect the emergence of even more new (as compared to the case of irrational p) fixed points (critical or non-critical), which will in general be inequivalent (due in particular to the absence of the Ising gauge symmetry).
Going back to the case of complex x we can now conjecture:
Conjecture 7.1. Let D1 be the domain defined in Eq. (7.1a). Then
The points x such that
ZLF NP (Q = 2; x) = 0
(square lattice)
(7.5)
(7.6)
Then
The points x such that
ZLF NP (Q = 2; x) = 0
(triangular lattice)
(7.7)
a Beraha number, the thermal operator is repulsive at $x_{\rm BK}(Q)$ (and not attractive as it would have been in the BK phase for irrational p), whereas it remains repulsive at $x_-(Q)$ and $x_+(Q)$. Therefore, there must at the very least be one attractive fixed point in each of the two intervals mentioned, in order for a consistent phase diagram to emerge. Indeed, for p even, there are two new fixed points, one of them being conjectured as $-\sqrt{Q}/2$ for all even p, and the other being equal to $-\sqrt{Q}$ only for p = 4 and p = 6. But our results for finite L are in favor of an even more complicated structure, involving more new fixed points. The structure of the phase diagram for p odd is further complicated by the emergence of segments of the real x-axis belonging to $B_L$. It is however uncertain whether these segments will stay of finite length in the $L \to \infty$ limit.
In the models with p = 5, 6, and on both the square and triangular lattices, we have found strong numerical evidence to conjecture that the partition-function zeros are dense in the whole complex x-plane with the exception of the interior of some domain. The shape of this domain depends on both p and the lattice structure; and unlike in the Ising case (p = 4), we do not have enough evidence to conjecture its algebraic expression [cf. Conjectures 7.1 and 7.2]. For the square lattice and fixed p, the limiting curves $B_L$ seem to approach (from the outside) the circles (1.5), especially in the ferromagnetic regime $\mathrm{Re}\,x \ge 0$. For the triangular lattice and $p = \infty$, the limiting curves in Fig. 12 seem to approach the circle
$(\mathrm{Re}\,x + 1/4)^2 + (\mathrm{Im}\,x)^2 = (3/4)^2,$
(7.8)
QV E/21 x E Z(Q;
Q/v) = Z(Q; v),
(7.9)
where E (respectively, V) is the total number of lattice edges (respectively, direct sites). Note that $V = LN$, and that $E = 2V - N$ (respectively, $E = 3V - 2N$) for the square (respectively, triangular) lattice. We now claim that this object with fixed and equal values for S can again be expressed in terms of $K_{1,2j+1}$, for a generic p. The precise relation reads
x Z(Q;
Q/v) S
V E/2 E
ZLX NP (Q; v) Q
+ =S
=Q
LN/2
L
j =0
(7.10)
which should be compared with Eq. (2.3). We henceforth refer to ZLX NP (Q; v) as the partition
function of the Potts model with fixed cyclic boundary conditions (even though it would be more
precise to say that it is actually the two exterior dual spins that get fixed). The amplitudes read
Sj (p)
1
j
j (p) =
(7.11)
+ (1) 1
.
Q
Q
(p2)/2
(7.12)
j =0
which should be compared with Eq. (2.5). For p odd, there does not appear to exist an expansion
of ZLX NP in terms of 1,2j +1 .
Note in particular that 1 (p) = 0 for any p. This has the consequence of eliminating the 1,3
sector from the partition function, and, as we now shall see, modify the |x| 1 behavior of the
phase diagram.
Fig. 13. Limiting curves for the square-lattice RSOS model with p = 4 and several widths: L = 2 (a), L = 3 (b), and L = 4 (c) when only the sector $\chi_{1,1}$ is taken into account. Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The solid squares ($\blacksquare$) show the values where Baxter found the free energy. The dark gray circles correspond to (1.5). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 14. Limiting curves for the square-lattice RSOS model with p = 6 and several widths: L = 2 (a), L = 3 (b), and L = 4 (c) when only the sectors $\chi_{1,1}$ and $\chi_{1,5}$ are taken into account. In the regions displayed in dark gray (respectively, white) the dominant eigenvalue comes from the sector $\chi_{1,5}$ (respectively, $\chi_{1,1}$). Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The solid squares ($\blacksquare$) show the values where Baxter found the free energy. The dark gray circles correspond to (1.5). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Before presenting the results for fixed cyclic boundary conditions in detail we wish to explain this similarity. We proceed in two stages. First we present an argument why the limiting curves corresponding to just the sector $\chi_{1,1}$ almost coincide with those for fully free boundary conditions. Second, we take into account the effect of adding other sectors $\chi_{1,2j+1}$.
Fig. 15. Limiting curves for the triangular-lattice RSOS model with p = 4 and several widths: L = 2 (a), L = 3 (b), and L = 4 (c) when only the sector $\chi_{1,1}$ is taken into account. Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The gray ellipse corresponds to $(\mathrm{Re}\,x + 1/\sqrt{2})^2 + 3(\mathrm{Im}\,x)^2 = 3/2$. This curve goes through the points $x = -e^{\pm i\pi/4}$. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Let TFK be the transfer matrix in the FK representation with zero bridges (cf. footnote 6), and
let i be its eigenvalues.20 Then one has, with cyclic boundary conditions
20 We label the by letting be the eigenvalue which dominates for x real and positive, and using lexicographic
i
0
ordering [32] for the remaining eigenvalues.
Fig. 16. Limiting curves for the triangular-lattice RSOS model with p = 6 and several widths: L = 2 (a), L = 3 (b), and L = 4 (c) when only the sectors $\chi_{1,1}$ and $\chi_{1,5}$ are taken into account. In the regions displayed in dark gray (respectively, white) the dominant eigenvalue comes from the sector $\chi_{1,5}$ (respectively, $\chi_{1,1}$). Figure (d) shows all these curves together: L = 2 (black), L = 3 (red), L = 4 (green). The solid squares ($\blacksquare$) show the values where Baxter found the free energy. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
K1,1 = tr TN
FK =
N
i .
(8.1)
Due to the coupling of K1,2j +1 , given by Eq. (2.6), the eigenvalues of T1 (i.e., the transfer matrix
that generates 1,1 , cf. Eq. (2.8)) form only a subset of the eigenvalues of TFK . More precisely,
1,1 =
i N
i ,
(8.2)
where i = 0 or 1 are independent of x. Note that when L < p 1, Eq. (2.6) gives simply
1,1 = K1,1 , and so in that case all i = 1.
Meanwhile, the partition function of the Potts model with fully free boundary conditions is
given by [32]
i N
Zfree = f |TN
(8.3)
FK |i =
i ,
i1
where the amplitudes i are due to the free longitudinal boundary conditions. Note that some of
the i could vanish identically, and indeed many of them do vanish. For example, in the case of
the square lattice, the vectors |i and f | are symmetric under a reflection with respect to the axis
of the strip, whence only the i corresponding to eigenvectors which are symmetric under this
reflection will contribute to Zfree .
For x > 0 real and positive, it follows from simple probabilistic arguments that the dominant
eigenvalue 0 will reside in the zero-bridge sector K1,1 and is not canceled by eigenvalues coming from other sectors. Therefore 0 = 1. On the other hand, the PerronFrobenius theorem and
the structure of the vectors |i and f | implies that 0 > 0. We conclude that the dominant term
in the expansions of 1,1 and Zfree are proportional. By analytic continuation the same conclusion holds true in some domain in the complex x-plane containing the positive real half-axis.
Moving away from that half-axis, a first level crossing will take place when 0 crosses another
eigenvalue i . If none of the functions i and i are identically zero, the corresponding branch
of the limiting curve BL coincides in the two cases. Further away from the positive real half-axis
other level crossings may take place, and the limiting curves remain identical until a level crossing between j and k takes place in which either j = 0 and j = 0, or conversely j = 0 and
j = 0. When L < p 1 the only possibility is the former one, since all i = 1.
If we now compare the limiting curves of $Z_{\rm free}$ and $Z_{\rm RSOS}$, the latter being defined as some linear combination of $\chi_{1,2j+1}$ (containing $\chi_{1,1}$), the above argument will be invalidated if the first level crossing in $Z_{\rm RSOS}$ when moving away from the positive half-axis involves an eigenvalue from $\chi_{1,2j+1}$ with j > 0.
With free cyclic boundary conditions, $Z_{\rm RSOS}$ contains $\chi_{1,3}$. The first level crossing involves eigenvalues from $\chi_{1,1}$ and $\chi_{1,3}$ (cf. the observed unbounded branches) and is situated very close [cf. Eqs. (4.7) and (6.5) with n = 1] to the positive real half-axis. Accordingly, the limiting curves $B_L$ do not at all resemble those with fully free boundary conditions. On the other hand, when $\chi_{1,3}$ is excluded (i.e., in the case of fixed cyclic boundary conditions) the first level crossing is between two different eigenvalues from the $\chi_{1,1}$ sector (see Figs. 13–16).
8.1. Ising model (p = 4)
We have studied the limiting curves given by the sector $\chi_{1,1}$ in the square-lattice Ising case. The results are displayed in Fig. 13. It is clear that there are no outward branches, as there is a unique dominant eigenvalue in the region $|x| \gg 1$. Indeed, this agrees with the expected non-critical phase. These curves are very similar to those obtained using the Fortuin–Kasteleyn representation for a square-lattice strip with free boundary conditions [39]. In particular, for even L = 2, 4 we find that these curves do in fact coincide. However, for L = 3 we find disagreements, but only in the region $\mathrm{Re}\,v < -1$. Namely, the complex conjugate closed regions defined by
the multiple points $x = -e^{\pm i\pi/4}$ and $x = -\sqrt{2}$ (see Fig. 13(b)) are replaced by two complex conjugate arcs emerging from $x = -e^{\pm i\pi/4}$. These arcs bifurcate at two complex conjugate T points.
For L = 2 we find two pairs of complex conjugate endpoints at x 0.5558929703
0.1923469388i,
and x 0.5558929703 1.6065605012i. There is a double endpoint at x =
2.
For L = 3 we also find two pairs of complex conjugate endpoints at x 0.5054436896
at x 1.4346151869 0.9530458628i,
0.6206108204i, x 1/ 2 0.4918781633i, and x 1/ 2 0.9374415716i. The limiting
curve contains two complex conjugate vertical lines determined by the latter two pairs of endpoints, and a horizontal line determinedby the two real endpoints. Wehave found three pairs of
complexconjugate T points at x 1 2 0.5353475100i, x 1 2 0.7246267519i, and
x 1 2 0.8539546894i. Finally, there is a multiple point at x 0.9681295813.
For L = 4,we again find a horizontal real line bounded by two real endpoints at x = 2,
andx = 1/ 2, and a pair of complex
conjugate vertical lines bounded by the endpoints x
1 2 1.0514178378i, and x 1 2 0.3816638845i. We have found and additional pair
of endpoints at x 0.5890850526 0.4519358255i.
There arefive pairs of T points; two of
2.
These
are
x 1 2 0.4336035301i, and x
them
are
located
on
the
line
Re
x
=
1
features and exotic phase transitions. Despite these complications, we venture to summarize our essential findings, by regrouping them in the same way as in the list of open issues presented in the introduction:
(1) The points $x_{\rm FM}(Q)$ and $x_-(Q)$ (and for the square lattice also its dual $x_+(Q)$), that act as phase transition points in the generic phase diagram, should play a similar role for integer p. This can be verified from the figures, in which it is more-or-less obvious that the corresponding red solid squares will be traversed, or pinched, by branches of $B_L$ in the $L \to \infty$ limit. What is maybe more surprising is that $x_{\rm BK}(Q)$ also has a similar property, despite the profoundly changed physics inside the BK phase. Indeed, in most cases, $x_{\rm BK}(Q)$ is either exactly on or very close to a traversing branch of $B_L$. It remains an open question to characterize exactly the nature of the corresponding phase transition.
Conjecture 6.1.2 involves the points $x = -2/\sqrt{Q}$ for p integer and $x = -1/\sqrt{Q}$ for Q integer. Thus, both lattices exhibit a phase transition on the chromatic line $x = -1/\sqrt{Q}$ or its dual, but only for integer Q. It is tempting to speculate that the chromatic line and its dual might play symmetric roles upon imposing fully periodic boundary conditions, but that remains to be investigated.
(3) We have found that with free cyclic boundary conditions, partition-function zeros are dense in a substantial region of the phase diagram, including the region $|x| \gg 1$. See in particular Conjectures 4.3, 4.4 for the square lattice and Conjecture 6.4 for the triangular lattice. For the Ising model (Q = 2), the finite-size data is conclusive enough to make a precise guess as to the extent of that region, cf. Conjectures 7.1 and 7.2. We have argued (in Section 7.4) and observed explicitly (in Sections 8, 9) that this feature is completely modified by changing to fixed cyclic boundary conditions. Another example of the paramount role of the boundary conditions has been provided with the argument of Section 8 that when restricting to the sector $\chi_{1,1}$ one sees essentially the physics of free longitudinal boundary conditions.
(4) It is an interesting exercise to compare the limiting curves found here with the numerically evaluated effective central charge shown in Figs. 23–25 of Ref. [8]. In particular, for p = 5 it does not seem far-fetched that the two new phase transitions identified in Fig. 23 of that paper might be located exactly at $x = -1/\sqrt{Q} \approx -0.618$ and $x = -\sqrt{Q}/2 \approx -0.809$. These points (for the former point, actually its dual, but we remind the reader that the transverse boundary conditions in Ref. [8] are periodic) are among the special points discussed in item 2 above.
(5) We have provided some evidence that on the triangular lattice for Q = 4 (i.e., $p = \infty$) phases with arbitrarily high j will exist close to the point $x = -1$. For the square lattice we have only found phases with $j \le 5$. This should be compared with the arbitrarily high values of $S_z$ taken when approaching $(Q, x) = (4, -1)$ from within the BK phase in the generic case [7,34].
It would be interesting to extend the study to fully periodic (toroidal) boundary conditions.
This would presumably diminish the importance of finite-size corrections, but note that the possibility of the non-trivial clusters having a more complicated topology makes the link to the
quantum group more subtle.
Another line of investigation would be to study the Potts model for a generic value of Q, i.e., to transpose what we did for the $\chi_{1,2j+1}$ to the $K_{1,2j+1}$. Indeed, studies for v given in the complex Q-plane have already been made, for example in Ref. [34] for $v = -1$, but to our knowledge no study exists for Q given in the complex v-plane. Note that the results are very different in these two cases. For example, with L fixed and finite, the Beraha numbers $Q = B_p$ are limiting points in the complex Q-plane for fixed $v = -1$ (and presumably everywhere in the Berker–Kadanoff phase), but $v = -1$ is not a limiting point in the complex v-plane for fixed $Q = B_p$ (p > 4). This is just one example that different limits may not commute and the very concept of a thermodynamic limit for antiferromagnetic models has to be manipulated with great care.
Acknowledgements
We thank Hubert Saleur for useful comments on the first stage of this work, Robert Shrock for correspondence, and Alan Sokal for discussions on closely related projects. J.S. thanks the members of the LPTMS, where part of this work was done, for their warm hospitality. This research has been partially supported by US National Science Foundation grants PHY-0116590 and PHY-0424082, and by MEC (Spain) grants MTM2004-01728 and FIS2004-03767.
Appendix A. Dimension of the transfer matrix
The dimension of the transfer matrices $T_k(p, L)$ can be obtained in closed form. First note that for given p, k = 2j + 1, and L, the dimension of the transfer matrix $T_k(p, L)$,
$d_k^{(p)}(L) \equiv \dim T_k(p, L),$
(A.1)
equals the number of random walks (with up and down steps) of length 2L steps that start at height 1 and end at height k. These random walks have to evolve between a ceiling 1 and a roof $m = p - 1$.
Let us now proceed in steps. For k = 1 and $m = \infty$ we have just the Catalan numbers. Thus, if z is the fugacity of a single step, then the ordinary generating function (o.g.f.) is
$f(z) = \dfrac{1 - \sqrt{1 - 4z^2}}{2z^2} = 1 + \sum_{L=1}^{\infty} C_L z^{2L}.$
(A.2)
We now keep k = 1, and we introduce the roof m. A walk is either empty or consists of two independent parts. The first part is between the very first step (necessarily up) and the first down step that hits the ceiling (i.e., 1); the second part is the rest of the walk (which may be empty). For instance, if p = 4 (m = 3) and L = 3, a possible walk can be 1 → 2 → 3 → 2 → 1 → 2 → 1. The first part of this walk is 1 → 2 → 3 → 2 → 1, while the second part of the walk is 1 → 2 → 1. If we take away the first and last steps of the first part (i.e., we are left with 2 → 3 → 2), we have a walk with $m \to m - 1$ (as this is equivalent to 1 → 2 → 1). Thus, the o.g.f. f(m, z) satisfies the equation
$f(m, z) = 1 + z^2 f(m - 1, z) f(m, z),$
(A.3)
f (m, z) =
(A.4a)
(A.4b)
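The recursion (A.3) can be iterated directly as a power series in $u = z^2$. The sketch below does this under two assumptions that are ours rather than the text's: the base case f(1, z) = 1 (when the roof coincides with the ceiling only the empty walk exists), and the rewriting of (A.3) as f(m, z) = 1/(1 - z^2 f(m - 1, z)). Function names are illustrative.

```python
def series_inverse(a, order):
    """Coefficients of 1/a(u) up to u**order, assuming a[0] == 1."""
    inv = [1] + [0] * order
    for n in range(1, order + 1):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv

def walk_series(m, order):
    """Coefficients in u = z**2 of f(m, z), obtained by iterating Eq. (A.3).

    Entry L of the result counts walks of 2L steps from height 1 back to
    height 1 that stay between the ceiling 1 and the roof m.
    """
    f = [1] + [0] * order                      # assumed base case: f(1, z) = 1
    for _ in range(m - 1):
        a = [1] + [-c for c in f[:order]]      # series of 1 - u * f(m - 1, z)
        f = series_inverse(a, order)
    return f

if __name__ == "__main__":
    print("m = 3 :", walk_series(3, 6))   # compare with the k = 1, p = 4 case
    print("m = 8 :", walk_series(8, 6))   # roof too high to matter at this order
```

For m = 3 the coefficients double at each order, and once the roof is higher than any walk of the given length can reach, the Catalan numbers of Eq. (A.2) are recovered.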
Finally, let us consider the general case with k > 1. In this case, the walk cannot be empty, and the first step is necessarily up. There are two classes of walks. In the first one, the walk never hits the ceiling 1 again. For instance, if p = 4, L = 3, and k = 3, a walk belonging to this class is given by 1 → 2 → 3 → 2 → 3 → 2 → 3. So it consists of one step and a walk with a raised ceiling (i.e., 2 → 3 → 2 → 3 → 2 → 3 is equivalent to 1 → 2 → 1 → 2 → 1 → 2 with roof m = 2). In the second class, the walk does hit the ceiling somewhere for the first time, so we can split the walk into two independent parts as in the preceding paragraph. Thus, the o.g.f. satisfies the equation
$f(m, k, z) = z f(m - 1, k - 1, z) + z^2 f(m - 1, z) f(m, k, z),$
(A.5)
(A.6a)
(A.6b)
(p)
dk (L)
(p)
dk (L)z2L .
(A.7)
L=0
$f(3, 1, z) = \dfrac{1 - z^2}{1 - 2z^2} = 1 + \sum_{L=1}^{\infty} 2^{L-1} z^{2L},$
(A.8a)
$f(3, 3, z) = \dfrac{z^2}{1 - 2z^2} = \sum_{L=1}^{\infty} 2^{L-1} z^{2L}.$
(A.8b)
For the other cases, we can get closed formulas for the generating functions, and obtain the result
(p)
np+j (L) (n+1)p1j (L) ,
dk (L) =
(A.9)
n0
where k = 2j + 1 and we have defined j (L) 0 for j > L. The j (L) are given by
2L
2L
2j + 1
2L
=
j (L) =
.
Lj
Lj 1
L+j +1 Lj
(A.10)
This result can also be obtained by another method, which consists of calculating the number
j (L) of states of highest weight with spin S = Sz = j for the vertex model and taking into
account the coupling of $U_q(SU(2))$ between different j for p integer [28]. Yet another method consists in relating $d_k^{(p)}$ to the number of paths on the Dynkin diagram $A_{p-1}$ going from 1 to 2j + 1 and using the eigenvectors of the adjacency matrix [27].
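As a cross-check of the closed form, one can count the bounded walks by brute force and compare with the alternating sum of Eq. (A.9). The sketch below reads (A.9) as a sum over n ≥ 0 of the difference between the quantity of Eq. (A.10) at index np + j and at index (n + 1)p − 1 − j, and reads (A.10) as a difference of two binomial coefficients; both readings, and all function names, are our assumptions.

```python
from math import comb

def unrestricted(j, L):
    """Assumed reading of Eq. (A.10): C(2L, L - j) - C(2L, L - j - 1), zero for j > L."""
    if j < 0 or j > L:
        return 0
    lower = comb(2 * L, L - j - 1) if L - j - 1 >= 0 else 0
    return comb(2 * L, L - j) - lower

def dim_formula(p, k, L):
    """Assumed reading of Eq. (A.9) for k = 2j + 1 <= p - 1."""
    j = (k - 1) // 2
    total, n = 0, 0
    while n * p + j <= L:          # all later terms vanish
        total += unrestricted(n * p + j, L) - unrestricted((n + 1) * p - 1 - j, L)
        n += 1
    return total

def dim_by_counting(p, k, L):
    """Count walks of 2L steps from height 1 to height k confined to 1..p-1."""
    m = p - 1
    counts = [0] * (m + 2)         # indices 0 and m + 1 act as absorbing walls
    counts[1] = 1
    for _ in range(2 * L):
        new = [0] * (m + 2)
        for h in range(1, m + 1):
            new[h] = counts[h - 1] + counts[h + 1]
        counts = new
    return counts[k]

if __name__ == "__main__":
    for p in (4, 5, 6):
        for k in range(1, p, 2):
            row = [dim_by_counting(p, k, L) for L in range(1, 7)]
            print(f"p = {p}, k = {k}:", row)
```

The counting routine is a plain dynamic programme over heights, so it is independent of the reflection argument behind the closed form and serves as a sanity check on it.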
The j (L) can also be interpreted as the dimension of the transfer matrix in the FK representation with j bridges, i.e., for a generic (irrational) value of p. In that context, Eq. (A.9) represents
the reduction of the dimension that takes place at p integer when going from the FK to the RSOS representation (with spin j), and thus is completely analogous to Eq. (2.6) for the generating functions.
On the chromatic
line x = 1/ Q, 2j (L) is replaced by a smaller dimension j (L), because
the operator V = Vi is a projector (V = V ) that projects out nearest-neighbor connectivities
(i.e. the action of V on states with nearest neighbours connected gives zero). We do not know of
any explicit expression for j (L), but it verifies the following recursion relation [47]
0 (L + 1) = 1 (L),
(A.11a)
(A.11b)
with the convention that j (L) = 0 for j < 0 and the conditions j (L) = 0 for j > L,
L (L) = 1, and 0 (1) = 0. In particular, it can be shown that 0 (L) = ML1 , where ML1
is a Motzkin number and corresponds to the number of non-crossing non-nearest neighbor partitions of {1, . . . , L} (i.e., it is the dimension ofthe cluster transfer matrix in the case of free
longitudinal boundary conditions and x = 1/ Q ). Note that in the RSOS representation (in
the case of p integer), we cannot reduce the dimension of the T2j +1 , since although V is a projector in the RSOS representation too the states which are projected out are linear combinations
of the basis states (corresponding to a given configuration of the heights), and not simply basis
states as in the case of the FK representation. But because of Eq. (2.6), the number dj (L) of
non-null eigenvalues of T2j +1 is given by Eq. (A.9) with j (L) replaced by j (L). In particular,
L2 and d (L) = 2L1 .
for Q = 3 we find using
3
the recursion relation that d1 (L) = d5 (L) = 2
Indeed, for x = 1/ Q, the three-state Potts model is equivalent to a homogeneous six-vertex
model with all the weights equal to 1 [19] (note that this six-vertex model is different from the
one we considered before).
References
[1] F.Y. Wu, Rev. Mod. Phys. 54 (1982) 235;
F.Y. Wu, Rev. Mod. Phys. 55 (1983) 315, Erratum.
[2] R.J. Baxter, Exactly Solved Models in Statistical Mechanics, Academic Press, London, 1982.
[3] P.W. Kasteleyn, C.M. Fortuin, J. Phys. Soc. Jpn. 26 (Suppl.) (1969) 11.
[4] C.M. Fortuin, P.W. Kasteleyn, Physica 57 (1972) 536.
[5] B. Nienhuis, in: C. Domb, J.L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 11, Academic,
London, 1987.
[6] H. Saleur, Commun. Math. Phys. 132 (1990) 657.
[7] H. Saleur, Nucl. Phys. B 360 (1991) 219.
[8] J.L. Jacobsen, H. Saleur, The antiferromagnetic transition for the square-lattice Potts model, Nucl. Phys. B 743
(2006) 207, the next article in this issue, cond-mat/0512056.
[9] R.J. Baxter, Proc. R. Soc. London A 383 (1982) 43.
[10] R.J. Baxter, H.N.V. Temperley, S.E. Ashley, Proc. R. Soc. London A 358 (1978) 535.
[11] R.J. Baxter, J. Phys. A 19 (1986) 2821.
[12] R.J. Baxter, J. Phys. A 20 (1987) 5241.
[13] J.L. Jacobsen, J. Salas, A.D. Sokal, Phase diagram and renormalization-group flow for the square-lattice and
triangular-lattice Potts models, in preparation.
[14] J.L. Jacobsen, J. Salas, A.D. Sokal, J. Stat. Phys. 119 (2005) 1153, cond-mat/0401026.
[15] J.L. Jacobsen, J. Salas, A.D. Sokal, J. Stat. Phys. 112 (2003) 921, cond-mat/0204587.
[16] J. Stephenson, J. Math. Phys. 5 (1964) 1009.
[17] H.W.J. Blöte, H.J. Hilhorst, J. Phys. A 15 (1982) L631.
[18] B. Nienhuis, H.J. Hilhorst, H.W.J. Blöte, J. Phys. A 17 (1984) 3559.
[19] Lenard, cited in E.H. Lieb, Phys. Rev. 162 (1967) 162, at pp. 169–170.
[20] R.J. Baxter, J. Math. Phys. 11 (1970) 3116.
[21] R.J. Baxter, J. Math. Phys. 11 (1970) 784.
[22] J. Kondev, J. de Gier, B. Nienhuis, J. Phys. A 29 (1996) 6489.
[23] S.J. Ferreira, A.D. Sokal, Phys. Rev. B 51 (1995) 6727, hep-lat/9405015.
[24] V. Pasquier, J. Phys. A 20 (1987) L1229.
[25] G.E. Andrews, R.J. Baxter, P.J. Forrester, J. Stat. Phys. 35 (1984) 193.
[26] D. Huse, Phys. Rev. B 30 (1984) 3908.
[27] H. Saleur, M. Bauer, Nucl. Phys. B 320 (1989) 591.
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38]
[39]
[40]
[41]
[42]
[43]
[44]
[45]
[46]
[47]
Abstract
We solve in this paper the problem of the antiferromagnetic transition for the Q-state Potts model (defined
geometrically for Q generic using the loop/cluster expansion) on the square lattice. This solution is based
on the detailed analysis of the Bethe ansatz equations (which involve staggered source terms of the type
real and anti-string) and on extensive numerical diagonalization of transfer matrices. It involves subtle
distinctions between the loop/cluster version of the model, and the associated RSOS and (twisted) vertex
models. The essential result is that the twisted vertex model on the transition line has a continuum limit
described by two bosons, one which is compact and twisted, and the other which is not, with a total central
charge $c = 2 - 6/t$, for $\sqrt{Q} = 2\cos(\pi/t)$. The non-compact boson contributes a continuum component to the
spectrum of critical exponents. For Q generic, these properties are shared by the Potts model. For Q a
Beraha number, i.e., $Q = 4\cos^2(\pi/n)$ with n integer, and in particular Q integer, the continuum limit is given
by a truncation of the two boson theory, and coincides essentially with the critical point of parafermions
$Z_{n-2}$.
Moreover, the vertex model, and, for Q generic, the Potts model, exhibit a first-order critical point on the
transition line: that is, the antiferromagnetic critical point is not only a point where correlations decay algebraically, but is also the locus of level crossings where the derivatives of the free energy are discontinuous.
In that sense, the thermal exponent of the Potts model is generically equal to $\nu = 1/2$. Things are however
profoundly different for Q a Beraha number. In this case, the antiferromagnetic transition is second order,
with the thermal exponent determined by the dimension of the $\psi_1$ parafermion, $\nu = (t-2)/2$. As one enters the
adjacent Berker–Kadanoff phase, the model flows, for t odd, to a minimal model of CFT with central
charge $c = 1 - \frac{6}{(t-1)t}$, while for t even it becomes massive. This provides a physical realization of a flow
conjectured long ago by Fateev and Zamolodchikov in the context of $Z_N$ integrable perturbations.
Finally, though the bulk of the paper concentrates on the square-lattice model, we present arguments and
numerical evidence that the antiferromagnetic transition occurs as well on other two-dimensional lattices.
2006 Elsevier B.V. All rights reserved.
1. Introduction
It is well known how to generalize the definition of the Q-state Potts model to Q real by turning to a loop/cluster formulation, and the related six-vertex model (see, e.g., [1–3]). On the square
lattice, the ferromagnetic Potts model thus defined exhibits reasonably well understood critical
properties. In many respects, it can be considered as the lattice equivalent of the Dotsenko–Fateev
twisted bosonic theory, and thus, has served as a benchmark for our understanding of
numerous issues in conformal field theory and integrable systems.
The antiferromagnetic case is much less understood. This is not too surprising, as antiferromagnetic models often exhibit frustration, and thus more complex phase diagrams. While the
physical interest of this model per se is probably limited to a few special values of Q, it turns
out that the difficulties encountered in its study are very similar to those encountered in the study
of models with supergroup symmetries. These models are believed to play a crucial role in the
description of critical points in non-interacting disordered systems.
Strong similarities with the sl(2/1) spin chain relevant to the spin quantum Hall effect [4]
in particular motivated us to tackle the long overdue continuation of a first work on the topic
by one of us in 1991 [2]. In this paper, we are finally able to present a conjecture, supported
by a long list of arguments and partial calculations, for the conformal field theory describing
the critical antiferromagnetic Potts model on the square lattice for $Q \in [0, 4]$. In a nutshell, this
theory involves two bosons: a compact boson not unlike the one appearing in the description
of the ferromagnetic critical point, plus a non-compact boson. To our knowledge, this is the
second time only that a non-compact boson is encountered in a statistical mechanics model with
a finite number of degrees of freedom per site, the first time being the super-Goldstone phases
of [5]. The appearance of such bosons is generic in the study of conformal field theories based
on supergroups.
Another major peculiarity of the antiferromagnetic transition (and of the neighboring
Berker–Kadanoff phase) is the profound role played by the quantum group symmetry, and
the related differences between the Potts model and the vertex model for Q generic, as well as
the difference between the case Q generic and Q a Beraha number [i.e., $Q = 4\cos^2(\pi/n)$ with n
integer] for the Potts model.
Finally, the last peculiarity of the antiferromagnetic transition is that it is, for Q generic,
a first-order critical point, that is a point where correlations are algebraic but at the same time the
derivatives of the free energy are discontinuous. This feature was initially observed in Ref. [6]
for the particular case of the $Q \to 0$ limit.
Obviously this quick summary of our conclusions requires much elaboration, and careful
distinction between incarnations of the model which are usually considered equivalent. For
easy reference, we review in Section 2 key aspects of the various representations of the Potts
model. We also comment on the importance of boundary conditions and on the definition of the
transfer matrices in the various representations. Further background material is given in the main
text, and details can also be found in [2].
The rest of the paper is organized as follows. We shall start in Section 3 by the simplest
problem, which is the properties of the (mostly untwisted) vertex model associated with the
Potts model on the antiferromagnetic critical line.
In Section 4, we proceed to discuss the properties of the twisted vertex model, as well as those
of the loop model and thus the Q-state Potts model for Q generic.
In Section 5, we discuss the related incarnation of RSOS models for Q a Beraha number. We
show how their continuum limit is given by a $Z_k$ parafermion theory, and use this in particular
to complete the identification of the continuum limit of the antiferromagnetic Potts model in
Section 6.
Another subtlety in this problem is the parity of the number of sites (Potts spins) in the transfer
matrix. While most of the paper discusses the case N even, interesting things also happen at N
odd, making the picture richer (the non-compact boson acquires a twist). This is discussed in
Section 7.
Section 8 goes back to physics, with the discussion of the magnetic exponent and the extremely delicate case of the thermal exponent. We discuss there also the issue of the first-order
critical point, and the flow from parafermionic to minimal theories à la Fateev–Zamolodchikov.
Section 9 contains conclusions as well as prospects for (much) future work.
Appendix A meanwhile discusses the limit $Q \to 0$ further, in particular how the R matrix for
blocks of four vertices in the staggered vertex model case reduces to the osp(2/2) R matrix in
this limit.
2. Representations of the Potts model
The Q-state Potts model is defined initially in terms of integer-valued spins $\sigma_i = 1, 2, \ldots, Q$
living at the lattice vertices i, and interacting through nearest neighbor coupling, with the partition function
$$Z = \sum_{\{\sigma_i\}} \prod_{\langle ij \rangle} e^{K \delta(\sigma_i, \sigma_j)}. \qquad (1)$$
Here K is the coupling constant and $\langle ij \rangle$ denotes the lattice edges. We shall refer to this as the
Potts spin representation.
The low-temperature expansion of Eq. (1) gives the cluster representation [7,8] for which
$$Z = \sum_{E \subseteq \langle ij \rangle} Q^{c(E)} \left(e^K - 1\right)^{|E|}, \qquad (2)$$
where c(E) is the number of clusters (connected components) in the edge subset E. The equivalence between the Potts spin representation and the cluster representation holds true with any
boundary conditions. Note that now one can decide to give the variable Q any arbitrary real
value, hence extending the definition of the Potts model to Q non-integer.
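The equivalence between the spin sum (1) and the cluster sum (2) can be checked by brute force on a very small graph. The sketch below is illustrative only: the function names and the four-site test graph are assumptions made here, not part of the paper, and the enumeration is exponential in the number of sites and edges.

```python
import itertools
import math


def potts_spin_Z(n_sites, edges, Q, K):
    """Eq. (1): sum over all spin configurations (Q must be a positive integer)."""
    total = 0.0
    for spins in itertools.product(range(Q), repeat=n_sites):
        aligned = sum(1 for i, j in edges if spins[i] == spins[j])
        total += math.exp(K * aligned)
    return total


def count_clusters(n_sites, subset):
    """Number of connected components c(E) of the graph (sites, subset)."""
    parent = list(range(n_sites))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j in subset:
        parent[find(i)] = find(j)
    return len({find(a) for a in range(n_sites)})


def cluster_Z(n_sites, edges, Q, K):
    """Eq. (2): sum over edge subsets E; Q may be any positive real number."""
    v = math.exp(K) - 1.0
    total = 0.0
    for r in range(len(edges) + 1):
        for subset in itertools.combinations(edges, r):
            total += Q ** count_clusters(n_sites, subset) * v ** r
    return total


# Hypothetical test graph: a single square plaquette with free boundaries.
SQUARE = [(0, 1), (1, 3), (3, 2), (2, 0)]
```

For integer Q the two functions agree up to rounding, for either sign of K; `cluster_Z` additionally accepts non-integer Q, which is the extension discussed in the text.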
Instead of the clusters, one may count the number of loops l(E) of the cluster boundaries
which live on the medial, or surrounding, lattice (here another square lattice oriented diagonally
with respect to the initial one). Using the Euler relation one obtains the loop representation [11]
$$Z = Q^{S/2} \sum_{E \subseteq \langle ij \rangle} Q^{l(E)/2 + \eta(E)} \left(\frac{e^K - 1}{Q^{1/2}}\right)^{|E|}, \qquad (3)$$
where S is the number of lattice vertices. The function $\eta(E) = 0$, except on a torus if there is a
cluster in E that wraps around both periodic directions [9]; in the latter case $\eta(E) = 1$. With this
slight topological subtlety, one obtains a loop representation of the Potts model for Q arbitrary,
totally equivalent to the cluster representation. Formulas (2) and (3) define the model we want to
study, and for which we want to determine the phase boundaries, and critical properties.
A key feature of the Q-state Potts model for Q arbitrary is that it has a non-local definition.
While this is not a problem for applications (e.g., the study of percolation, trees, forests, etc.),
it makes the analytical and numerical studies, or the construction of associated conformal field
theories, particularly tricky. Let us see this more explicitly. We will be mostly interested in the
case of the square lattice, which we will often consider as being built up by a transfer matrix
acting along the time (t) direction (of length M lattice spacings), the perpendicular direction
being referred to as the space (x) direction (of width N lattice spacings). We then have S =
NM.
We consider for definiteness the case in which the transfer matrix propagates along¹ the spin
lattice (i.e., diagonally with respect to the medial lattice), and we let the width of the x-direction
be N Potts spins.²
The basis states for the transfer matrix in the Potts spin representation, valid for Q integer,
are of course N variables $\sigma_i = 1, 2, \ldots, Q$, whence $\dim T = Q^N$. In the cluster representation,
basis states are set partitions of N elements respecting planarity, whence $\dim T = C_N$ with
$C_N = \frac{(2N)!}{N!(N+1)!}$ being Catalan numbers. For $N \gg 1$, we have $C_N \sim 4^N$. Excited sectors are constructed
by restricting to set partitions with $\geq N_0$ components, of which exactly $N_0$ are marked. By
definition, the marked components cannot join among themselves under the action by T.
In the loop representation, basis states are complete pairings of 2N points respecting planarity,
with again $\dim T = C_N$. Excited sectors are constructed by replacing $N_0$ of the pairings with
exactly $2N_0$ marked loop segments that cannot close among themselves.
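As a small numerical illustration of the dimension counting above, the following sketch compares the spin-representation dimension $Q^N$ with the Catalan number $C_N$ and looks at the growth rate of $C_N$. The function names are chosen here for illustration and do not come from the paper.

```python
from math import comb, factorial


def catalan(n):
    """Catalan number C_n = (2n)! / (n! (n+1)!)."""
    return comb(2 * n, n) // (n + 1)


def transfer_matrix_dims(N, Q):
    """Ground-sector dimensions quoted in the text (Q a positive integer here)."""
    return {"spin": Q ** N, "cluster_or_loop": catalan(N)}


def catalan_growth_ratio(n):
    """C_{n+1} / C_n = 2(2n+1)/(n+2), which tends to 4 for large n."""
    return catalan(n + 1) / catalan(n)
```

For example, `transfer_matrix_dims(4, 3)` gives 81 spin states against 14 planar set partitions, and `catalan_growth_ratio(200)` is already close to 4, in line with $C_N \sim 4^N$.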
The crucial point is now that the partition functions (2) or (3) cannot be written as traces of the
transfer matrix raised to the power M because of the non-locality. This is of course expected, the
cluster or loop transfer matrices having different dimension (and thus different sets of eigenvalues) than the Potts spin transfer matrix for Q integer. Instead, the partition functions are written
via more complicated functionals of the cluster or loop transfer matrices, allowing in particular
for the disappearance of many eigenvalues (corresponding to non-local information) for Q integer. For instance, with boundary conditions which are free in the x-direction and periodic in the
t-direction, the loop/cluster partition function is given by $Z_M = \langle v_f | T^M | v_i \rangle$ for suitable initial
and final vectors (and in fact with a modified transfer matrix defined in terms of connectivities
among two time-slices) [10]. The end result is that
$$Z_M = \sum_{N_0=0}^{N} (2N_0 + 1)_q \sum_k \left(\Lambda_k^{(N_0)}\right)^M, \qquad (4)$$
where $\{\Lambda_k^{(N_0)}\}$ is the set of eigenvalues of the foregoing transfer matrix, corresponding to the
sector with $N_0$ marked components defined above. The amplitudes $(2N_0 + 1)_q$ are q-deformed
numbers and are derived from quantum group considerations [3], while we have set $\sqrt{Q} = q + q^{-1}$.
1 In some cases it may be advantageous to study instead the case where the transfer matrix T propagates diagonally
with respect to the spin lattice (i.e., axially with respect to the medial lattice).
2 In all subsequent figures of numerical data, the finite-size estimates of free energies f , central charges c, and physical
scaling dimensions $x = h + \bar{h}$ are labeled by the width N in terms of Potts spins. When the transfer direction is diagonal
with respect to the spin lattice, we use the label N , meaning that the projected width is 2N lattice spacings in terms
of the spin lattice.
The essential point is that the amplitudes in (4) can vanish when Q is integer. In fact, setting
$q = \exp(i\pi/t)$, they can vanish whenever t is rational. A remarkable feature of the Potts model
around the antiferromagnetic transition is that the disappearing eigenvalues can be the leading
ones, giving rise to strong singularities in the thermodynamic properties. In contrast, in the ferromagnetic region, the disappearing eigenvalues correspond only to excited states, and lead merely
to the disappearance of some non-local observables from the spectrum of excitations.
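The vanishing of amplitudes can be made concrete with a short sketch. It assumes the standard quantum-group convention $(n)_q = (q^n - q^{-n})/(q - q^{-1})$ for q-deformed numbers, which the text does not spell out, together with $q = \exp(i\pi/t)$; the function names are illustrative.

```python
import cmath
import math


def q_number(n, t):
    """(n)_q = (q^n - q^-n)/(q - q^-1) with q = exp(i*pi/t), assuming t > 1.

    For this q the result is real and equals sin(n*pi/t) / sin(pi/t).
    """
    q = cmath.exp(1j * math.pi / t)
    return ((q ** n - q ** (-n)) / (q - q ** (-1))).real


def vanishing_sectors(t, N, tol=1e-9):
    """Sectors N0 in 0..N whose amplitude (2*N0 + 1)_q vanishes numerically."""
    return [N0 for N0 in range(N + 1) if abs(q_number(2 * N0 + 1, t)) < tol]
```

With t = 3 (so that $\sqrt{Q} = 2\cos(\pi/3) = 1$), `vanishing_sectors(3, 4)` picks out the sectors where $2N_0 + 1$ is a multiple of 3, whereas for a generic irrational-looking value such as t = 3.7 no sector in that range is removed.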
To summarize, the lesson so far is that for q a root of unity, not all eigenvalues of the cluster
or loop transfer matrix do contribute to the Potts partition function. The same would of course
be true in the doubly free, or doubly periodic case.
It would now seem crucial to remove the non-local aspect of the weights in the loop/cluster
representation. To this end, we again consider the case of free boundary conditions in the
x-direction and periodic in the t-direction. For all loops which are homotopic to a point, assigning independently an orientation to each loop and giving a weight $q^{\alpha/2\pi}$ to a loop that turns
through an angle $\alpha$ to the left then reproduces [11] the full loop weight of $Q^{1/2}$ upon setting
$Q^{1/2} = q + q^{-1}$. As for the non-contractible loops, they would acquire instead a weight 2 in this
formulation, but this can be corrected by assigning the weight q (respectively $q^{-1}$) to any oriented loop that passes through the periodic boundary condition. Summing over the oriented loop
splittings of vertices which are compatible with given edge orientations finally gives a representation in terms of the six-vertex model [11]. This model again lives on the medial lattice. The
Hamiltonian of the corresponding (XXZ) spin chain can be extracted by taking the anisotropic
limit, and is useful for studying the model with the Bethe ansatz technique. Note that the vertex
model is in general staggered, as the weight assigned to a given vertex depends on whether this
vertex stands on a direct or a dual edge of the original spin lattice. Also, in the case of boundary conditions which are free in the x-direction and periodic in the t-direction, the vertex model
has special weights for the vertices at the surface, corresponding to boundary fields in the XXZ
formulation. We shall refer to it as a twisted vertex model. An untwisted version would be
obtained by setting the boundary fields to zero.
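The orientation argument of this paragraph can be summarized as a short worked sum. The symbol $\alpha_{\rm tot}$ for the total turning angle of a closed oriented loop is a label introduced here for illustration; the weights themselves are those quoted in the text.

```latex
% alpha_tot: total turning angle of a closed oriented loop (label assumed here)
\begin{align*}
\text{contractible loop:}\quad
  & \alpha_{\rm tot} = \pm 2\pi
  \;\Rightarrow\; q^{+1} + q^{-1} = Q^{1/2}, \\
\text{non-contractible loop:}\quad
  & \alpha_{\rm tot} = 0
  \;\Rightarrow\; 1 + 1 = 2, \\
\text{non-contractible loop with seam factor } q^{\pm 1}:\quad
  & q^{+1} + q^{-1} = Q^{1/2}.
\end{align*}
```

The first line is the sum over the two orientations of a loop homotopic to a point, the second shows why a loop winding around the periodic direction would get weight 2 without correction, and the third shows how the extra factor at the periodic boundary restores $Q^{1/2}$.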
Unlike for the cluster or loop transfer matrix, the transfer matrix of the vertex model is a local
object.³ Basis states are orientations of 2N arrows, with an equal number of ups and downs,
whence $\dim T = (N + 1) C_N$. Excited sectors are constructed by having instead N + k ups and
N − k downs. Periodic boundary conditions in the x-direction are imposed by giving a special
weight (twist) to the first arrow in each row. But again, the correspondence with the Potts model
implies that the object of interest is not the trace of the transfer matrix raised to the power M,
but some functional of it. In the simplest case of free boundary conditions in the x-direction and
periodic in the t-direction, adjusting the weight of non-contractible loops leads to the expression
$Z_M = \mathrm{tr}\, q^{S^z} T^M$, where $S^z$ is the conserved spin (arrow flux) along the transfer direction.
The commutation of the vertex transfer matrix with the quantum group $SU(2)_q$ leads to a more
compact expression of $Z_M$ in terms of q-dimensions, with the same features of disappearing
eigenvalues when q is a root of unity. So we get the same lesson that for q a root of unity, not
all eigenvalues of the twisted vertex model transfer matrix do contribute to the Potts partition
function. The same would of course be true in the doubly free, or doubly periodic case.
3 Note however that the presence of a seam in the twisted version remains a non-local feature.
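The dimension count for the vertex model is a binomial identity and can be checked directly. The sketch below is illustrative (function names assumed here): it counts arrow configurations with N + k ups among 2N arrows and compares the k = 0 sector with $(N + 1) C_N$.

```python
from math import comb


def catalan(n):
    """Catalan number C_n = (2n)! / (n! (n+1)!)."""
    return comb(2 * n, n) // (n + 1)


def vertex_sector_dim(N, k):
    """Number of orientations of 2N arrows with N + k ups and N - k downs."""
    if abs(k) > N:
        return 0
    return comb(2 * N, N + k)


def all_sectors_total(N):
    """Sum over all sectors k = -N..N; equals the full 2^(2N) arrow states."""
    return sum(vertex_sector_dim(N, k) for k in range(-N, N + 1))
```

The k = 0 sector is larger than the cluster or loop ground sector by exactly the factor N + 1, which is one way to see that the vertex transfer matrix carries more eigenvalues than the Potts partition function uses.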
Obviously the phenomenon of disappearing eigenvalues requires great care in handling the
results from transfer matrix analysis.
To complete this introduction we must now discuss briefly the doubly periodic case. The
correspondence between the models is then more complicated. Consider for instance the loop
representation and its vertex model reformulation. The weight $Q^{1/2}$ for non-contractible loops
can only be reproduced, after orientation, by giving a special weight that depends on the winding
of the loop in both periodic directions; this weight, in turn, transforms into a complex combination of weighted sectors for the vertex model obtained after summing over loop splittings! On
top of this, the sector with no non-contractible loops must also be singled out because of the
extra $Q^{\eta(E)}$ factor in (3)! This is discussed in detail in the literature [9], and will be addressed
below. For now, we need to consider only what happens in the vertex model when the non-contractible loops wind in only one direction. We have already seen what to do when they wind
in the t-direction: their weight must be adjusted from 2 to $Q^{1/2}$ by introducing a modified trace
term. Similarly, when they wind in the x-direction, one must introduce a twist term assigning
the weight q (respectively $q^{-1}$) to any oriented loop passing through the x-periodic boundary
condition. This leads, in the XXZ limit, to a modified coupling between the first and the last
$SU(2)_q$ spins, and defines what we call the twisted vertex model or spin chain transfer matrix.
This is mostly an intermediate object for us: while we consider the Potts partition function or the
untwisted vertex model partition function in this paper, we do not enter the discussion of what
would be the proper definition of the twisted vertex model partition function. Going beyond
the simple case of loops which are non-contractible in the x-direction is a very technical topic
[12] (best handled in the framework of Temperley–Lieb and periodic Temperley–Lieb algebras)
which we will address when necessary in the following. Suffice it to say that for generic q, the
eigenvalues of the cluster or loop transfer matrix are a subset of the eigenvalues of the vertex
model transfer matrix for various twist sectors.
3. Bethe ansatz analysis of the vertex model transfer matrix
The antiferromagnetic critical line for the square-lattice Potts model (defined, for Q generic,
using the loop/cluster formulation) was identified by Baxter [13] as
$$\left(e^{K_1} + 1\right)\left(e^{K_2} + 1\right) = 4 - Q, \qquad (5)$$
where $K_1$, $K_2$ are the usual horizontal and vertical couplings. It is as always convenient to introduce the variables
$$x \equiv \frac{e^K - 1}{\sqrt{Q}} \qquad (6)$$
and set
$$x_1 = \frac{\sin u}{\sin(\gamma - u)}, \qquad (7)$$
where $\sqrt{Q} = 2\cos\gamma$, with $\gamma \in [0, \pi/2]$, and u is a spectral parameter. The antiferromagnetic line
then corresponds to
$$\frac{1}{x_2} = -\frac{\cos u}{\cos(\gamma - u)} = \frac{\sin v}{\sin(\gamma - v)}, \qquad v = u + \frac{\pi}{2}. \qquad (8)$$
The Potts model on the line (5) is never selfdual. Selfduality would correspond instead to $x_1 = 1/x_2$,
that is u = v.
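A quick numerical check of the parametrization is sketched below, assuming the reading $\sqrt{Q} = 2\cos\gamma$, $x = (e^K - 1)/\sqrt{Q}$, $x_1 = \sin u/\sin(\gamma - u)$ and $1/x_2 = \sin v/\sin(\gamma - v)$ with $v = u + \pi/2$. The function names are illustrative and not from the paper.

```python
import math


def couplings_on_af_line(gamma, u):
    """Return (sqrt(Q), x1, x2) for spectral parameter u, with v = u + pi/2."""
    sqrt_q = 2.0 * math.cos(gamma)
    v = u + math.pi / 2.0
    x1 = math.sin(u) / math.sin(gamma - u)
    x2 = math.sin(gamma - v) / math.sin(v)
    return sqrt_q, x1, x2


def af_line_residual(gamma, u):
    """(e^K1 + 1)(e^K2 + 1) - (4 - Q); vanishes on the antiferromagnetic line."""
    sqrt_q, x1, x2 = couplings_on_af_line(gamma, u)
    e_k1 = 1.0 + sqrt_q * x1
    e_k2 = 1.0 + sqrt_q * x2
    return (e_k1 + 1.0) * (e_k2 + 1.0) - (4.0 - sqrt_q ** 2)


def isotropic_u(gamma):
    """Spectral parameter at which x1 = x2 (one solution of the condition)."""
    return gamma / 2.0 + math.pi / 4.0
```

The residual vanishes identically in u, because $1 + \cos\gamma\, x_1 = \sin\gamma \cos u / \sin(\gamma - u)$ and $1 + \cos\gamma\, x_2 = \sin\gamma \sin(\gamma - u)/\cos u$ multiply to $\sin^2\gamma = (4 - Q)/4$.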
The Potts model is generally related to a six-vertex model with quantum group symmetry
obtained from the loop formulation by putting arrows on the medial lattice (see Section 2, and
Ref. [2] for further review), and using the spectral parameters u, v to define the vertex R matrix
on the two sublattices. In general, this vertex model is not solvable.
The peculiarity of the antiferromagnetic line is that the vertex model (and hence the Potts
model, at least in principle) becomes solvable because, when $v = u + \pi/2$, it can be considered
as a particular case of models exhibiting Z invariance [14]. In general, such models are obtained by associating a spectral parameter $h_i$ with every horizontal line (with label i), a spectral
parameter $v_j$ with every vertical line, such that the R matrix at the vertex ij is $R(h_i - v_j)$. On
the antiferromagnetic line, the $v_j$ are alternatingly 0, $\pi/2$, while the $h_i$ are alternatingly u, $u + \pi/2$.
Thanks to the special shift $\pi/2$, and the fact that u can be identified with $u \pm \pi$, there are indeed
only two types of vertices for the resulting vertex model, with interactions given by R(u) and
$R(u + \pi/2)$, respectively (see Appendix A for more details). The isotropic case ($K_1 = K_2$, $x_1 = x_2$)
would correspond to $u = \gamma/2 + \pi/4$.
While vertex models with imaginary heterogeneities of the spectral parameter have been studied in some detail [15], the situation of a real heterogeneity does not seem to have attracted much
attention. We will now see how the vertex model exhibits truly remarkable properties.
We consider a transfer matrix propagating in the vertical direction. The basic equations can be
obtained using standard techniques; they appeared initially in the pioneering paper by Baxter
[14].⁴
3.1. Ground state energy, degeneracies
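We will use modern quantum inverse scattering notations, following [15]. We introduce the
kernel
$$\Phi(\lambda) = i \ln \frac{\sinh\frac{1}{2}(\lambda + 2i\gamma)}{\sinh\frac{1}{2}(\lambda - 2i\gamma)} \qquad (9)$$
$$p_1(\lambda) = i \ln \frac{\sinh\frac{1}{2}(\lambda - i\gamma)}{\sinh\frac{1}{2}(\lambda + i\gamma)} \qquad (10)$$
and
$$\tilde{p}_1(\lambda) = i \ln \frac{\cosh\frac{1}{2}(\lambda - i\gamma)}{\cosh\frac{1}{2}(\lambda + i\gamma)}. \qquad (11)$$
Recall that the parameter $\gamma$ is related to Q by $\sqrt{Q} = 2\cos\gamma$, with $\gamma \in [0, \pi/2]$. The Bethe equations read then symbolically, for a system of 2N vertical lines (this corresponds to N Potts spins),
$$\left(e^{i p_1(\lambda)}\right)^N \left(e^{i \tilde{p}_1(\lambda)}\right)^N = e^{i\phi} \prod_{\lambda'} e^{i\Phi(\lambda - \lambda')}. \qquad (12)$$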
4 Note that some misprints have cropped up in this reference: in Eq. (4.40) there should have been a t 2 instead of t 2
on the right-hand side, as confirmed by the analysis of the analytic Bethe ansatz, Eq. (5.4) of that same paper.
$$e^{i p_1(\lambda + 2iu)}\, e^{i \tilde{p}_1(\lambda + 2iu)}. \qquad (13)$$
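Rather than the vertex model transfer matrix, we will consider the Hamiltonian, obtained in the
strongly anisotropic limit $u \to 0$. Its eigenvalues read
$$E = c \sum_{\lambda} \frac{d}{d\lambda}\left[p_1(\lambda) + \tilde{p}_1(\lambda)\right]. \qquad (14)$$
Since
$$\frac{d}{d\lambda}\left[p_1(\lambda) + \tilde{p}_1(\lambda)\right] = -\frac{2\sin 2\gamma}{\cosh 2\lambda - \cos 2\gamma} \qquad (15)$$
we can conjecture that the ground state is obtained (at least when the phase $\phi = 0$) by filling the
lines $\mathrm{Im}\,\lambda = \pi/2$ and $\mathrm{Im}\,\lambda = -\pi/2$. The source terms are of a new type then, which we will denote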
(16)
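The other kernels to consider are $\Phi_{11} \equiv \Phi$ and $\Phi_{1,-1}(\lambda) \equiv \Phi(\lambda - i\pi)$, with Fourier transforms
$$\Phi_{11} = 2\pi \frac{\sinh(\pi - 2\gamma)k}{\sinh \pi k}, \qquad \Phi_{1,-1} = -2\pi \frac{\sinh 2\gamma k}{\sinh \pi k}. \qquad (17)$$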
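The Fourier transforms of the two kernels can be checked numerically. The sketch below assumes, as is common in Bethe ansatz work but not stated explicitly here, that the transform is taken of the derivative of the kernel with the convention $\int d\lambda\, e^{ik\lambda}\, \Phi'(\lambda)$, and that $\Phi(\lambda) = i \ln[\sinh\frac{1}{2}(\lambda + 2i\gamma)/\sinh\frac{1}{2}(\lambda - 2i\gamma)]$ with $0 < \gamma < \pi/2$. All function names are illustrative.

```python
import math


def dphi_11(lam, gamma):
    """Derivative of Phi(lam): sin(2g) / (cosh(lam) - cos(2g))."""
    return math.sin(2 * gamma) / (math.cosh(lam) - math.cos(2 * gamma))


def dphi_1m1(lam, gamma):
    """Derivative of Phi(lam - i*pi): -sin(2g) / (cosh(lam) + cos(2g))."""
    return -math.sin(2 * gamma) / (math.cosh(lam) + math.cos(2 * gamma))


def fourier(f, k, gamma, cutoff=40.0, step=0.01):
    """Midpoint-rule estimate of the integral of cos(k*lam) * f(lam) over the line.

    Both integrands are even in lam, so the sine part of exp(i*k*lam) drops out.
    """
    n = int(round(2 * cutoff / step))
    total = 0.0
    for j in range(n):
        lam = -cutoff + (j + 0.5) * step
        total += math.cos(k * lam) * f(lam, gamma)
    return total * step


def closed_form_11(k, gamma):
    return 2 * math.pi * math.sinh((math.pi - 2 * gamma) * k) / math.sinh(math.pi * k)


def closed_form_1m1(k, gamma):
    return -2 * math.pi * math.sinh(2 * gamma * k) / math.sinh(math.pi * k)
```

Under these assumptions the numerical integrals match the closed forms to high accuracy for generic k and $\gamma$. The sum of the two closed forms equals $2\pi \sinh((\pi - 4\gamma)k/2)/\sinh(\pi k/2)$, the bare kernel seen by excitations that are identical on the two lines.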
The minus sign in the source term leads to a ground state where rapidities are a decreasing
function of the Bethe integers, and we get, in the continuum version
2 1 + 1h = (q1 + q1 ) 11 1 1,1 1 ,
h
= (q1 + q1 ) 1,1 1 1,1 1 .
2 1 + 1
(18)
11 = 2
Solving these equations gives the densities in the ground state. It is most useful to go one step
further and write physical equations where on the right-hand side appear holes in the ground
state distributions, that is densities of excitations over the physical vacuum:
h
2 1 + 1h = s() + 11 1h + 1,1 1
,
h
h
h
.
2 1 + 1 = s() + 1,1 1 + 1,1 1
(19)
The function s =
first follow,
gs
gs
1
1 = 1 = 2 s(). The ground state energy per vertex is then
E gs
= c
N
sinh( k)
.
sinh(k/2) cosh( 2 )k/2
(20)
state densities
(21)
It is interesting to compare this result with Baxter's expressions for the free energy [13].
Using analyticity techniques, he finds that the free energy per site can be written as f =
f (K1 ) + f (K2 ) f (u) + f (u + /2) where the function f (denoted by in his work) is
given in Eqs. (30a)–(30d) of [13]. To compare with our Hamiltonian we need to let u approach 0
from below, and take a derivative, so we must consider the expression
sinh2 ( k) sinh(2uk)
d
d
f (u) + f (/2 u) =
dk
du
du
k sinh(k) sinh( 2 )k
sinh( k) sinh( )k sinh(2uk)
+ dk
k sinh(ki) sinh( 2 )k
sinh( k)
= 2 dk
(22)
.
sinh(k/2) cosh( 2 )k/2
This agrees with our ground state energy, given in (21), and constitutes an interesting check of
our equations.
3.2. Excitations
The physical Bethe equations immediately show that there are two branches of massless
excitations (at least) obtained by making holes in the ground state on the lines $\mathrm{Im}\,\lambda = \pi/2$
and $\mathrm{Im}\,\lambda = -\pi/2$. This is twice the usual result for the XXZ chain, and leads one to expect that the
central charge will be equal to two, a result we will confirm below. The rapidity of the excitations
The kernels describe a complex scattering theory, which we do not yet entirely understand.
A crucial fact is that these kernels diverge at small k, a very unusual feature, that is also encountered in the studies of models based on superalgebras [4]. We will argue below that it implies
the existence of a continuum component of the spectrum. One can however observe some drastic
simplifications for symmetric excitations, i.e., excitations which are identical for the two lines
$\mathrm{Im}\,\lambda = \pi/2$ and $\mathrm{Im}\,\lambda = -\pi/2$. For these, the effective kernel is obtained as $\Phi = \Phi_{11} + \Phi_{1,-1}$,
$$\frac{\sinh(\pi - 4\gamma)k/2}{2 \sinh \gamma k \, \cosh(\pi - 2\gamma)k/2}. \qquad (24)$$
This is the soliton–soliton kernel for the sine-Gordon model regime with $\beta^2/8\pi = 2\gamma/\pi$. This observation has a simple origin. Consider again the Bethe equations: it is clear that there is a particular
type of solutions obtained when the real parts of the roots on $\mathrm{Im}\,\lambda = \pi/2$ and $\mathrm{Im}\,\lambda = -\pi/2$ coincide.
The equations for these real parts are then of the form
sinh( 2i )
sinh( i + i/2) N
= ei
(25)
sinh( + i i/2)
sinh( + 2i )
2 sin 2
2( 2 )
.
cosh 2 + cos 2
(26)
Comparing with the results for the XXZ chain (with relativistically invariant normalization)
H =
y y
z
x
ix i+1
+ i i+1 cos iz i+1
2 sin
(27)
we see from standard works on this topic [16] that this spectrum will lead to twice the central
charge and critical exponents of the XXZ chain (27) with parameter = 2 . On the other
hand it is known that the conformal field theory limit of this XXZ chain is a free boson with
2
2
coupling constant g = 2
(in agreement with the value 8 = obtained before for the scattering
matrix). Notice in particular that the limit Q 0 corresponds to = 0, i.e., the XXX chain. In
other words, there has to be a whole class of excitations in our system for which the gaps will
be the same as the gaps of this XXZ chain (27), up to a factor two (this should hold with or
without twist angle ). We call this the XXZ subset of our spectrum. It is tempting to speculate
that it corresponds exactly to the configurations of roots which are invariant under the extra $Z_2$
symmetry $\lambda \to \lambda + i\pi$. Of course, there have to be many more excitations, since doubling the
spectrum of the XXZ chain creates many gaps in the conformal towers. These excitations should
appear with an extra degeneracy of two, since they correspond to configurations of roots which
are not symmetric.
3.3. A quick analysis of finite-size effects
We can make further progress on this question by analyzing in more detail the finite-size
spectrum from the Bethe ansatz. For this, we go back to the physical Bethe equations involving
holes in the distributions of strings and antistrings. It is interesting to determine the dressed
charges of the excitations. Normally, this is an easy task. Write the bare equations as $\rho + \rho^h = p + K \star \rho$.
It follows that $\rho + \rho^h = Z p + (1 - Z) \rho^h$, where $Z = (1 - K)^{-1}$. Standard formulae
then exist to express the gaps in terms of the matrix Z [17]:
h + h = nt Z 2 n + d t Z 2 d,
(28)
where n is the two-component column vector $(n_1, n_2)^t$, and $n_i$ the change in the number of solutions of the ith type. Similarly, d is the two-component column vector $(d_1, d_2)^t$, and $d_i$ the number of solutions of the ith type backscattered
from the left to the right of the Fermi sea. In (28), t denotes transposition.
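The step from the bare equations to the dressed form quoted above is a one-line inversion. It is spelled out below as a sketch, treating the convolution with K as a linear operator and writing $\rho$, $\rho^h$ for the densities of roots and holes (symbols assumed here, since the extraction lost them).

```latex
\begin{align*}
\rho + \rho^h &= p + K \star \rho \\
(1 - K) \star \rho &= p - \rho^h \\
\rho &= Z\,(p - \rho^h), \qquad Z = (1 - K)^{-1} \\
\rho + \rho^h &= Z\,p + (1 - Z)\,\rho^h .
\end{align*}
```

In Fourier space the convolution becomes a product, so Z is just the matrix inverse of $1 - K(k)$ at each k. This is why a divergence of the kernels at $k \to 0$ makes the entries of Z at k = 0 ill-defined.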
Here however we encounter a major snag: the matrix Z has all its elements infinite, as the
kernels diverge when $k \to 0$. We have taken the most naive approach possible, and assumed that
the standard formulas still made sense in this case, after maybe some regularization. It turns out
that such regularization is easy: just keep k small but non-zero in the matrix elements. One can
then calculate the inverse of Z, and let $k \to 0$ in the end. One finds
1
1/2 1/2
1
=
Z lim
(29)
k0 Z(k)
1/2 1/2
and thus
=
+
(n1 + n2 )2 + 0 (n1 n2 )2 +
(d1 + d2 )2 + (d1 d2 )2 .
2
8
(30)
We recover here the XXZ subset (see later) for d1 = d2 . On top of this however, we have a
continuum of soft modes corresponding to n1 = n2 . This is similar to observations in [18]
and [4] where the soft modes were interpreted as arising from a non-compact boson. We therefore
believe that the continuum limit of the untwisted vertex model is described by a compact boson
(called 1 in what follows) with g = 1t plus a non-compact boson (called 2 ) giving a continuum
of soft modes on top of the gaps for the first boson.
While the action of the twist angle on 1 (that is, the XXZ subset of the spectrum) is well
understood, the action on 2 seems to be more subtle, and cannot be captured by our simple
analysis. We shall see that even the parity of N affects 2 in a non-trivial way.
3.4. Some remarks on extra symmetries
The Bethe equations expressed in terms of roots at $\mathrm{Im}\,\lambda = \pm\pi/2$ look like equations for a rank
two algebra, and suggest the existence of two U(1) conserved quantities, that is, an extra quantity
on top of the spin $S^z$. One can get an idea of what this extra quantity might be by considering
the limit $Q \to 0$. We observed in [18] that in this limit, the Bethe equations and the energy of
the model coincide with those of the integrable osp(2/2) spin chain in the fundamental representation. One can in fact go further (see Appendix A), and prove that the osp(2/2) integrable
R matrix can be obtained by considering a block made up of four elementary vertices of the
six-vertex model. In this case, the extra U(1) charge is the B quantum number of the osp(2/2)
algebra; this is defined initially for a pair of neighboring spins $|S_{2i-1} S_{2i}\rangle$ by $B = 0$ for $|{+}{+}\rangle$ or
$|{-}{-}\rangle$, and $B = \pm 1/2$ for $|{+}{-}\rangle \pm i|{-}{+}\rangle$ (see Fig. 1), and then extended to the whole system by
summing over i. It is very interesting to study what becomes of this extra symmetry on the whole
antiferromagnetic line, and the relation with the $osp_q(2/2)$ integrable R matrix; we will report
about this elsewhere. In any case, the quantity $n_1 - n_2$ plays the role of an extra U(1) charge
along the whole antiferromagnetic line.
On top of this charge, the antiferromagnetic line exhibits an extra Z2 symmetry, as already
mentioned in [2]. This can be seen most clearly by considering again the block of spins in Fig. 1
and writing its R matrix algebraically as (for notations see [2])
C = X1
(32)
X3
2
2
where the fact that X( ) = X( ) is crucial. On the other hand, one has [2]
commutes with R,
2
2
X
(33)
= P1 P0 ,
2
Fig. 1. A block of four vertices can be considered as a supervertex for a model with 2 2 = 4 degrees of freedom per
edge.
where $P_1$, $P_0$ are the projectors on the (quantum) spin one and spin zero representations in the product of
two spin 1/2 representations, so $C^2 = 1$.
We can now discuss briefly the phase diagram of the Potts model in terms of the symmetries
of the associated vertex model. The selfdual case $x_1 = 1/x_2$ gives rise to two kinds of physical
behaviors, which correspond in the isotropic case to two selfdual lines, $x_1 = x_2 = 1$ and
$x_1 = x_2 = -1$. In the selfdual case, the symmetry of the associated vertex model is $U(1) \times Z_2$. The
U(1) is just the conservation of the spin $S^z$, while the $Z_2$ is the symmetry of translation by
one lattice site. The latter symmetry, in the continuum limit, becomes another U(1) [19], so the
symmetry of the continuum limit on the selfdual lines is $U(1) \times U(1)$: this corresponds to chiral
symmetries for left and right fermions, or to independent shifts of the left and right components of the free
bosonic continuum limits. Going away from the selfdual lines breaks the $Z_2$. If one starts from
the physical critical line ($x_1 = x_2 = 1$), the model becomes massive. If one starts from the non-physical critical line ($x_1 = x_2 = -1$), one enters the Berker-Kadanoff phase [2] where the $Z_2$
symmetry breaking operator is irrelevant, and the full U(1) is actually restored in the continuum
limit.
The antiferromagnetic critical line marks the boundary of the Berker-Kadanoff phase. As just
discussed, on this line an extra $U(1) \times Z_2$ symmetry appears, and it is reasonable to expect that in
the continuum limit, this also gets enhanced into the full $U(1) \times U(1)$, corresponding to another
free boson. This is in agreement with the Bethe ansatz calculation of the central charge. However,
when Q = 0 it has been argued in [18] that this boson is non-compact. In fact, this feature must hold
along the full antiferromagnetic line, as follows from the finite-size corrections to the
gaps.
3.5. The spectrum of the untwisted vertex model for N even
One subtlety is that results for N odd and N even turn out in fact to be profoundly different.
We do not understand this fully, but notice this can be expected, as the foregoing analysis applied
mostly to N even, which allowed one to treat the two lines of roots with imaginary parts $\pm\pi/2$. The case of N odd will
be considered later. In what follows, we parametrize $\gamma = \pi/t$, with $t \in [2, \infty]$.
To start, we consider the XXZ subset some more. Magnetic gaps read
=2
g(S z )2 (S z )2
=
,
4
t
(34)
where $S'^z$ is the spin of the equivalent XXZ chain, that is, the chain for which the Bethe
equations are (25). Since the antiferromagnetic system has 2N spins, whereas the XXZ chain
described by (25) is for N spins only, we must rescale the spin according to $S'^z = S^z/2$, where $S^z$
is the spin in our initial system, i.e., $S^z = n_1 + n_2$ in (30). Therefore
$$\Delta = \frac{(S^z)^2}{4t} \qquad (35)$$
where $S^z$ (the spin in our initial system) is an integer since we have an even number of spins.
One can see here again why parity effects are likely to occur. If N is even, since we must
also have an even number of upturned spins to make up pairs of Bethe roots in the folding of
the equations, the result strictly speaking will only apply to S z even, including the ground state.
Numerical study shows that in fact it holds for S z odd as well. If N is odd on the other hand,
since we must also have an even number of upturned spins to make up pairs of Bethe roots in the
folding of the equations, the result strictly speaking will only apply to S z odd. Numerically,
we have observed indeed that for S z even, including the ground state S z = 0, results for N odd
obey a different pattern. See Section 7 below.
The electric charges in the $S^z = 0$ sector of the equivalent XXZ chain are integer ($e = d_1 = d_2$),
and give rise to the following gaps: $\Delta = \frac{e^2}{4g} = \frac{t e^2}{4}$. Combining all the excitations, we thus
get exponents from a single free boson, $\Delta_{em} = \frac{1}{4}\left(\frac{m}{\sqrt{t}} - e\sqrt{t}\right)^2$, with e and m integers.⁵
Meanwhile, the non-compact boson adds one unit to the central charge, and decorates these
gaps by soft modes, which would manifest themselves in a numerical study as very large degeneracies [4]. One can write this in a compact form by considering the generating function of
levels
1 em em
2
Zconj =
(36)
q c/24 q c/24 = d (q q)
1/24
q
q
e,m
=
,
for the periodic vertex model, with $q = e^{-2\pi M/N}$ the modular parameter for a system of size $N \times M$ and
$\eta = q^{1/24} \prod_{n=1}^{\infty} (1 - q^n)$ Dedekind's eta function.
This means that in the sector of vanishing magnetization, one should observe a continuum
spectrum above the ground state. In the sector with magnetization unity, one should observe a
continuous spectrum over the basic magnetic gap, etc. Numerical study by direct diagonalization
of the transfer matrices (possible for $N \leq 12$) is compatible with this result, though the sizes
accessible are too small to provide a complete confirmation. In [4], where the Bethe equations
themselves were studied for a similar problem, it was found that sizes as large as $N \sim 10^4$ were
necessary.
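As a small numerical illustration of the gaps discussed in this section, the sketch below evaluates the magnetic gap $(S^z)^2/(4t)$ of Eq. (35) and the electric gap $e^2/(4g)$ at $g = 1/t$, using exact rational arithmetic. The function names are illustrative and not taken from the paper; the value $g = 1/t$ is the one read off from the text and is treated here as an assumption.

```python
from fractions import Fraction


def magnetic_gap(spin_z, t):
    """Magnetic gap (S^z)^2 / (4t); S^z is the spin of the initial 2N-spin system."""
    return Fraction(spin_z) ** 2 / (4 * Fraction(t))


def electric_gap(e, t):
    """Electric gap e^2 / (4g), evaluated at the assumed coupling g = 1/t."""
    return Fraction(t) * Fraction(e) ** 2 / 4


if __name__ == "__main__":
    t = 5
    for spin_z in range(4):
        print("S^z =", spin_z, "magnetic gap =", magnetic_gap(spin_z, t))
    for e in range(3):
        print("e =", e, "electric gap =", electric_gap(e, t))
```

Exact fractions are used so that coincidences between levels (for instance between a magnetic and an electric gap at integer t) can be tested with equality rather than a floating-point tolerance.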
⁵ In general, we will denote by $\Delta$ the (rescaled) gaps with respect to the central charge c = 2, and reserve the notation
h for the (rescaled) gaps with respect to the true ground state of the antiferromagnetic Potts model.
4. The spectrum of the twisted vertex model and the Potts model transfer matrices on
the antiferromagnetic line
The usual strategy is, once the continuum limit of the untwisted vertex model is identified, to
evaluate the effect of the twisting and other ingredients involved in the correspondence with the
loop/cluster formulation, and gain in this way knowledge of the conformal properties of the Potts
model itself. The process in the antiferromagnetic region is however more involved than in the
ferromagnetic region. To start, we discuss the Berker-Kadanoff phase in more detail.
4.1. The Berker-Kadanoff phase
Recall first that the untwisted vertex model in the whole Berker-Kadanoff (BK) phase is
described in the continuum limit, like on the non-physical selfdual critical line, by a single free
boson with $g = 1/t$. It is important to notice that this free boson is in fact exactly the same as the
one describing the XXZ subset on the antiferromagnetic critical line.
The first step in going from the vertex model to the loop/cluster representation of the Potts
model is to introduce a twist in the vertex model. We have explained that this twist has a convenient geometrical interpretation if one decomposes the vertex configurations in loops: without
it, loops which are non-contractible around the cylinder get a weight 2, instead of the required $\sqrt{Q}$
for the loop representation of the Potts model. Suppose we want to give to non-contractible loops
a more general weight, $w = 2\cos\pi\varphi$. We introduce a seam or twist in the vertex model and the
associated conformal weight should give the effective central charge
$$c = 1 - \frac{6\varphi^2}{g} = 1 - 6t\varphi^2. \qquad (37)$$
Since non-contractible loops come in pairs, we have defined $\varphi$ modulo an integer: these are
the allowed values of the electric charge in the Coulomb gas. The Q-state Potts model itself
corresponds to $e_0 = 1/t$, leading, in principle, to the central charge $c = 1 - 6/t$.
This latter value is definitely the central charge of the twisted vertex model. But it is not the
central charge of the loop/cluster Potts model!
The point is that the eigenvalues of the loop or cluster model are in general (as discussed
in Section 2) a subset of the eigenvalues of the vertex model, essentially those associated with
highest weights only. The determination of this subset is based on algebraic considerations,
and follows some simple rules [3]. For instance, eigenvalues of the vertex model transfer matrix
in the sector of spin $S^z = 1$ and with no twist are also exactly observed in the sector $S^z = 0$ and
with a twist $\varphi = 1/t$ (i.e., giving to non-contractible loops the weight $\sqrt{Q}$, that is $\varphi = e_0$). Such a
descendent eigenvalue is thrown away in the loop or cluster formulation.
In the BK phase, the dimension associated with electric charge $e_0 = 1/t$, since it is the same as
the dimension coming from the sector with no electric charge (twist) and $S^z = 1$, and equal to
$\Delta_{e_0,0} = \frac{1}{4t}$, is thrown away in the truncation from vertex to loop (or cluster). This means that the
central charges of the twisted vertex model and the loop/cluster model are different in this phase.
For the vertex model one has $c = 1 - 6/t$, while the central charge of the loop or cluster model
follows from the next choice, $e_0 = 1 - 1/t$, leading instead to central charge $c = 1 - 6(t-1)^2/t$, all
this for t generic: this is illustrated in Fig. 2.
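The two central charges quoted for the BK phase both follow from the same Coulomb gas expression, $c = 1 - 6 t e_0^2$ (that is, $c = 1 - 6e_0^2/g$ at $g = 1/t$), evaluated at two different electric charges. The following sketch, with illustrative function names that are not from the paper, makes the comparison explicit under that assumption.

```python
from fractions import Fraction


def central_charge_bk(e0, t):
    """c = 1 - 6 t e0^2, i.e. c = 1 - 6 e0^2 / g at the assumed coupling g = 1/t."""
    return 1 - 6 * Fraction(t) * Fraction(e0) ** 2


def c_twisted_vertex(t):
    """Twisted vertex model: electric charge e0 = 1/t."""
    return central_charge_bk(1 / Fraction(t), t)


def c_loop(t):
    """Loop/cluster model (t generic): next allowed charge e0 = 1 - 1/t."""
    return central_charge_bk(1 - 1 / Fraction(t), t)


if __name__ == "__main__":
    for t in (2, 3, 4, 5, 6):
        print("t =", t, "vertex c =", c_twisted_vertex(t), "loop c =", c_loop(t))
```

For t = 2 (Q = 0) the loop value reduces to c = -2, while the twisted vertex value is c = 1 - 6/t in every case.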
We have performed careful numerical checks of this phenomenon, with some results presented in Figs. 3, 4. A technical note here: one can essentially write two transfer matrices, one
propagating along the axial direction of the original Potts lattice (and thus diagonally for the
medial lattice, which is the lattice where the loops or the vertex spins live), and one propagating
diagonally for the Potts lattice (and thus axially for the loops and vertex spins). Results will not
(except for parity effects) depend on the geometry in the thermodynamic limit, but might converge better in finite size with one choice or the other. Note that the first formulation is better
suited for using the Temperley-Lieb algebra and quantum group symmetries. It is the one we use
most. In all figure captions we will use the names diagonal and axial in reference to the medial
lattice (see also footnotes 1, 2).
Fig. 2. A schematic representation of the behavior of some levels (in finite size) of the vertex model in the BK phase,
for t generic. Levels which are at the head of an arrow are thrown away in going from the vertex to the loop or cluster
formulation. The level $\Delta = \frac{1}{4t}$ is thus always discarded, leaving generically the next level $\Delta = (t-1)^2/4t$ to determine
the central charge $c = 1 - 6(t-1)^2/t$ for the loop or cluster model for t generic.
Fig. 3. Numerical estimates of the central charge for the twisted vertex model on the non-physical selfdual line. The
continuous curve is the expected value, $c = 1 - 6/t$. Diagonal geometry.
Notice that for t an integer (and only then), the eigenvalue associated with $e_0 = 1 - 1/t$ is also
thrown away because it is a descendent from the sector with no twist and spin $S^z = t - 1$, as
illustrated in Fig. 5. This is the mechanism that, in the periodic case, produces the singularities
of the BK phase at the Beraha numbers $Q = 4\cos^2(\pi/t)$, with t integer. It is the equivalent of the
slightly simpler mechanism studied in [2] in the case of open boundary conditions. In fact, many
more levels are thrown away, so many that even the free energy per unit site of the (twisted)
vertex and loop or cluster models are different at the Beraha values in the BK phase. The difference of these free energies vanishes as the antiferromagnetic line is approached. Right on the
antiferromagnetic line, the free energy does not show any singularity while one crosses a Beraha
number (see Section 8, in particular Figs. 21, 22).
Fig. 4. Numerical estimates of the central charge for the loop model on the non-physical selfdual line. The continuous
curve is the expected value, $c = 1 - 6(t-1)^2/t$. Axial geometry.
Fig. 5. A schematic representation of the behavior of some levels (in finite size) of the vertex model in the BK phase, for
t an integer. Now the two levels indicated on the right are both discarded. Many more levels are in fact discarded, leading
to the singularities at the Beraha numbers within the BK phase.
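For orientation, the Beraha numbers $Q = 4\cos^2(\pi/t)$ at integer t can be tabulated directly. The sketch below is only an illustration of that formula; the function name is ours.

```python
import math


def beraha_number(t):
    """Beraha number Q = 4 cos^2(pi / t) for integer t >= 2."""
    return 4 * math.cos(math.pi / t) ** 2


if __name__ == "__main__":
    for t in range(2, 9):
        print("t =", t, "Q =", round(beraha_number(t), 6))
```

The familiar integer values Q = 0, 1, 2, 3 appear at t = 2, 3, 4, 6, and Q tends to 4 as t grows.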
A note of caution on the case t integer is in order. Of course, the spectrum of the loop/cluster
transfer matrix does not show any particular behavior when one crosses these values. What happens however is that the partition function of the model on a torus, say, would be expressed
by weighing these eigenvalues with complicated non-integer degeneracy factors, and some of
these factors can vanish when t is integer, a feature we refer to as a level being truncated, or
discarded. Hence what one calls an exponent of the loop/vertex model has to be defined with
great care!
4.2. Central charge on the antiferromagnetic critical line
We can now carry out a similar analysis for the model on the antiferromagnetic critical line.
To give non-contractible loops a weight $w = 2\cos\pi\varphi$, we introduce a seam, and the effective
central charge becomes
$$c = 2 - 6t\varphi^2 \qquad (38)$$
since now $g = 1/t$ as in Section 3.2. For the Q-state Potts model itself we have $\varphi = e_0 = 1/t$,
leading to a central charge $c = 2 - 6/t$.
Remarkably, this is not only the central charge of the twisted vertex model, but also the central
charge of the loop/cluster Potts model on this line, as was established earlier [2], and can be
checked numerically to great accuracy. Some results are presented in Figs. 6, 7. For Fig. 6 the
transfer matrix propagates along the diagonal of the medial lattice, for Fig. 7 along the medial
lattice itself. Similar results are obtained for other geometries.
Fig. 6. Numerical estimates of the central charge for the twisted vertex model on the antiferromagnetic critical line. The
continuous curve is the expected value, $c = 2 - 6/t$. Diagonal geometry.
Fig. 7. Numerical estimates of the central charge for the loop model on the antiferromagnetic critical line. The continuous
curve is the expected value, $c = 2 - 6/t$. Axial geometry.
Fig. 8. First scaling levels, normalized as free energy densities, of the cluster model with t = 5 and size N = 6 (diagonal
geometry). The abscissa is the temperature parameter $x = Q^{-1/2}(e^K - 1)$. The vertical line shows the bulk antiferromagnetic transition temperature. The ground state on one side of the transition becomes a high excitation on the other
side, and vice versa.
This result is remarkable, because we have seen that the central charge of the twisted vertex
model comes entirely from the XXZ sector, and that this sector is described by the same free
boson as in the Berker-Kadanoff phase. Hence one should have expected the level corresponding
to (38) to be truncated when going from the vertex to the loop/cluster formulation, and the central
charge of the Potts model to be given by another value. What happens is the following.
Observe that the XXZ subset of the spectrum (in the scaling limit) on the antiferromagnetic
critical line coincides with the spectrum (in the scaling limit) in the BK phase, with or without
twist angle (and up to an additional contribution of one to the central charge). Since, because
of the doubling effect discussed previously, the vertex model on the antiferromagnetic critical
line has more scaling gaps than in the BK phase, it means that some of the levels which were
at finite distance from the ground state in the BK phase have to cross and become scaling levels
on the antiferromagnetic critical line. (A similar crossing phenomenon is clearly visible on the
finite-size levels of the loop/cluster model, see Fig. 8.) One of these levels happens to merge
(asymptotically) with the descendent level, so after truncation one of them remains, to give the
central charge $c = 2 - 6/t$ still. This is illustrated schematically in Fig. 9.
This is confirmed numerically: we find that the watermelon operator has $h_2 = 0$, that is, the largest
eigenvalue of the loop transfer matrix in the sector with zero legs, and weight $\sqrt{Q}$ for non-contractible loops, is asymptotically degenerate⁶ with the largest eigenvalue in the sector with
two legs, and weight 2 for non-contractible loops. Since the latter eigenvalue is also observed
(as a descendent eigenvalue) in the vertex model in the sector with $S^z = 0$ and twist angle
$\varphi = 1/t$ (giving to non-contractible loops the weight $\sqrt{Q}$), it follows that the latter sector has a
ground state degenerate at least twice. This is the hallmark of a first-order phase transition, and
we conclude that the entire antiferromagnetic critical line is in fact a line of first-order critical
points for the twisted vertex model. This is also true of the loop model, even though it has fewer
⁶ More precisely, the logarithm of their ratio vanishes faster than 1/N.
Fig. 9. The levels corresponding to Fig. 2, but on the antiferromagnetic line. The ground state in the sector S z = 0, = nt
has now double degeneracy, so that one of the two levels remains after the descendent levels are discarded, leading to the
central charge $c = 2 - 6/t$, for any t. The additional level comes down from the very excited part of the spectrum as one
moves within the BK phase towards the antiferromagnetic critical line.
eigenvalues, because of the result that $h_2 = 0$.⁷ Finally, it is also true for the untwisted vertex
model (since its ground state is in fact infinitely degenerate due to the continuous component).
4.3. Spectrum of the loop/cluster model transfer matrix with no marked loops
More generally, the weight w of non-contractible loops is compatible with charges $e = \frac{1}{t} + n$,
with n an integer. The choice $e = e_0 - 1 = \frac{1}{t} - 1$ leads to a conformal weight (defined over
the ground state with $c = 2 - 6/t$)
$$h = \frac{(t-1)^2 - 1}{4t} = \frac{t-2}{4}. \qquad (39)$$
This is well observed numerically as seen in Fig. 10. We believe that there is a continuum spectrum on top of this gap, all of which is truncated when one goes from the Potts transfer matrix to
the Potts partition function when t is integer.
One can also identify numerically a level very well matched by the formula $h = \frac{t-3}{t-2}$. This has
no natural explanation within the XXZ subset, and will be discussed in detail below.
We will argue later that the spectrum is in fact continuous above h = t2
2 . This is all for t
generic of course. For Beraha numbers, the continuous component disappears entirely, as well as
other exponents.
h=
4.4. Spectrum of the twisted vertex model transfer matrix in the sector S z = 0
Exponents are shown in Fig. 11. Observe how one sees considerably more levels than in
the loop/cluster model case (see Fig. 10). We will argue later that here, the spectrum is in fact
continuous above the ground state h = 0.
4.5. Watermelon operators
These operators describe the properties of L marked lines in the loop model (or, for even
L > 2, equivalently of L/2 marked clusters in the cluster model). As usual, their conformal
weight is obtained by considering the ground state of the sector with $S^z = L/2$ (L even), so, with
respect to the ground state of the theory with central charge $c = 2 - 6/t$, we find
$$h_L = \frac{(L/2)^2 - 1}{4t}. \qquad (40)$$
⁷ The existence of a first-order critical point in the loop/cluster formulation was first pointed out for the $Q \to 0$ limit in
Ref. [6].
Fig. 10. Spectrum of the loop/cluster Potts model transfer matrix in the sector with no marked loops/clusters. Two
exponents are clearly identified (solid lines) as well as descendent levels (dotted lines). Axial geometry.
Fig. 11. Spectrum of twisted vertex model transfer matrix in the sector $S^z = 0$. Axial geometry. Note the considerable set
of additional levels when compared to Fig. 10. The particular levels identified with solid lines in Fig. 10 are also present
here.
Fig. 12. Watermelon exponents on the antiferromagnetic critical line, in the loop model with L = 2, 4, 6, 8 marked lines.
Axial geometry.
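The watermelon exponents of Eq. (40), $h_L = [(L/2)^2 - 1]/(4t)$, are simple enough to tabulate. The sketch below does so for the values L = 2, 4, 6, 8 shown in Fig. 12; the function name is ours, and the formula is taken as read from the text.

```python
from fractions import Fraction


def watermelon_exponent(num_legs, t):
    """h_L = ((L/2)^2 - 1) / (4t) for an even number L of marked lines."""
    if num_legs % 2 != 0:
        raise ValueError("this formula is stated for L even only")
    half = Fraction(num_legs, 2)
    return (half ** 2 - 1) / (4 * Fraction(t))


if __name__ == "__main__":
    t = 5
    for num_legs in (2, 4, 6, 8):
        print("L =", num_legs, "h_L =", watermelon_exponent(num_legs, t))
```

The L = 2 value vanishes for every t, which is the degeneracy $h_2 = 0$ invoked in the first-order argument of Section 4.2.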
N K
N
e 1
eK 1
+
e
e
1
+
2i1
2i
Q1/2
Q1/2
i=1
(41)
i=1
i=j
(hi , hi )
(Shi Shi )1/2
Shi1
(42)
where $S_h = \sin(\pi h s/r)$.
In the transfer matrix study of the RSOS models, the basis states are collections of 2N heights
$h_1, h_2, \ldots, h_{2N}$ belonging alternately to the direct and dual lattice, and subject to the constraint
$|h_i - h_{i+1}| = 1$. To have periodic boundary conditions in the x-direction we impose $h_{2N+i} = h_i$.
One can show that $\dim T \sim [2\cos(\pi/r)]^{2N} = Q^N$ when $N \gg 1$.
⁸ I.e., they do not even possess a twist.
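The growth of the number of RSOS basis states can be checked by direct counting. The sketch below counts periodic height sequences $h_1, \ldots, h_{2N}$ with $|h_i - h_{i+1}| = 1$, assuming (as is standard for RSOS models, though not restated in this passage) that heights take the values 1, ..., r-1. The count is then the number of closed walks of length 2N on a chain of r-1 nodes, whose growth rate is set by the largest adjacency eigenvalue $2\cos(\pi/r)$. Function names are illustrative.

```python
import math


def count_rsos_states(r, n):
    """Number of periodic sequences h_1..h_{2n}, heights in 1..r-1, |h_i - h_{i+1}| = 1."""
    heights = range(1, r)
    total = 0
    for start in heights:
        # walks[h] = number of admissible walks from `start` ending at height h
        walks = {h: 0 for h in heights}
        walks[start] = 1
        for _ in range(2 * n):
            new_walks = {h: 0 for h in heights}
            for h, weight in walks.items():
                for neighbour in (h - 1, h + 1):
                    if neighbour in new_walks:
                        new_walks[neighbour] += weight
            walks = new_walks
        total += walks[start]  # periodicity: the walk must close on itself
    return total


def growth_rate(r):
    """Expected growth per unit of N: [2 cos(pi / r)]^2."""
    return (2 * math.cos(math.pi / r)) ** 2


if __name__ == "__main__":
    r = 5
    for n in (2, 5, 10, 11):
        print("N =", n, "dim =", count_rsos_states(r, n))
    print("ratio:", count_rsos_states(r, 11) / count_rsos_states(r, 10))
    print("expected:", growth_rate(r))
```

The ratio of successive dimensions converges quickly to $[2\cos(\pi/r)]^2$, in line with the asymptotic statement above.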
Of course, the correspondence between the partition function of the RSOS model, the partition
function of the Potts model, and the spectrum of the loop/cluster transfer matrix or twisted vertex
model transfer matrix is rich and intricate. For instance, the partition function of the RSOS model
with doubly periodic boundary conditions on a torus would read [20]
$$Z = Q^{S/2} \sum_{E \subseteq \langle ij \rangle} \left(\frac{e^K - 1}{Q^{1/2}}\right)^{|E|} Q^{[l(E) - l_0(E)]/2} \sum_{a=1}^{r-1} \left(2\cos\frac{\pi a}{r}\right)^{l_0(E)}, \qquad (43)$$
where the number of non-contractible loops $l_0(E)$ has been singled out. This formula indicates
that, in the vertex model, $r - 1$ sectors with different values of the twist are now necessary.
In this paper we shall restrict the study to t integer (i.e., r = t and s = 1), and use the identification of the associated conformal field theory to strengthen our general results (on the critical
ferromagnetic line for instance, the CFT for the RSOS models would be the minimal theory with
central charge $c = 1 - 6/(t(t-1))$).
The critical antiferromagnetic variety does not intersect the selfdual manifold, and thus corresponds to a staggered RSOS model. We are not aware of any previous study of such a model.
To proceed, let us define the four regimes
(i) $0 < u < \gamma$,
(ii) $\gamma < u < \frac{\pi}{2}$,
(iii) $\frac{\pi}{2} < u < \frac{\pi}{2} + \gamma$,
(iv) $\frac{\pi}{2} + \gamma < u < \pi$. (44)
The Potts model on the antiferromagnetic critical line involves two kinds of vertices, with spectral parameters differing by $\pi/2$. Hence, either they lie in regimes (i), (iii) or in regimes (ii),
(iv).
We can go from the Potts model to the RSOS model algebraically by using a different representation of the Temperley-Lieb algebra (that is, quantum group restricting the vertex model
representation). A homogeneous spectral parameter u (such as would be obtained on the selfdual lines) would lead to a homogeneous RSOS model of the type studied by Andrews, Baxter
and Forrester [21]. To dispel confusion, we will reserve the name ABF for homogeneous models,
and denote the restricted version of the vertex model on the antiferromagnetic line (as well as
elsewhere in the phase diagram) as the RSOS model (in general, staggered). Where, in the ABF
phase diagram, these models stand is an interesting question. For u in regimes (i) or (iii), the
ABF model is at the transition point between its regime III and regime IV (capital letters refer
to the regimes originally defined in [21]), while for u in regimes (ii), (iv), the ABF model stands
at the transition point between its regime I and regime II. Based on this observation, we see
that we ought to expect that the antiferromagnetic Potts model, for a given value of Q, can be in
two different universality classes, depending on the anisotropy. The isotropic case is what we are
interested in here: K1 = K2 corresponds to the SDP (self-dual Potts) models in regimes (ii), (iv)
and thus the ABF models at the transition between regime I and II. In fact, only the regime (ii)
corresponds to a physical ABF model, and thus was studied in [21].⁹
Fig. 13. Same as Fig. 8 but for the RSOS model. Note the absence of levels responsible for the BK phase. Some levels
are identical with those of the cluster model, including the ground state on the high-x side of the transition. This ground
state does not participate in any level crossings.
Now remarkably, the central charge we have found for the isotropic Potts model on the antiferromagnetic critical line coincides with the one of the associated ABF model at the transition
between regime I and regime II for t an integer, $t \equiv k + 2$. It is therefore tempting to conjecture
that the RSOS version of the antiferromagnetic critical line for $k = t - 2$ an integer is in the
universality class of the ABF models at the transition between regime I and regime II, and thus,
following [21,22], is a theory of $Z_k$ parafermions. Let us discuss further the evidence for this.
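The coincidence of central charges can be verified in one line: with $t = k + 2$, the value $c = 2 - 6/t$ found on the antiferromagnetic critical line equals the well-known central charge $c = 2(k-1)/(k+2)$ of the $Z_k$ parafermionic theories. A minimal check, with illustrative function names:

```python
from fractions import Fraction


def c_antiferromagnetic(t):
    """Central charge c = 2 - 6/t on the antiferromagnetic critical line."""
    return 2 - Fraction(6, t)


def c_parafermion(k):
    """Central charge 2(k - 1)/(k + 2) of the Z_k parafermionic theory."""
    return Fraction(2 * (k - 1), k + 2)


if __name__ == "__main__":
    for k in range(1, 8):
        print("k =", k, "t =", k + 2, "c =", c_antiferromagnetic(k + 2))
```

For k = 2 (t = 4, Q = 2) this gives c = 1/2, and for k = 4 (t = 6, Q = 3) it gives c = 1.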
An immediate objection to this claim might be that the ABF models are not at a first-order
critical point. Indeed, they are not, and the point is that when one goes from the twisted vertex
model to the RSOS models, the eigenvalues from the untwisted $S^z = 1$ sector are discarded
(L = 2 watermelon), as well as the descendent eigenvalues in the $S^z = 0$ sector with twist $\varphi = 1/t$.
Therefore, the second $h = h_2 = 0$ exponent disappears, and there is no reason to expect a first-order phase transition any longer. This is illustrated in Fig. 13.
Note that in the ABF language, the extra $Z_2$ symmetry which appears on the antiferromagnetic critical line corresponds to the chiral symmetry of the $Z_k$ models discussed in [22], that
exchanges clockwise and counterclockwise.
The ABF models at the transition between regime I and regime II have parafermionic exponents, of the general form [23]
$$h = \frac{l(l+2)}{4(k+2)} - \frac{m^2}{4k}, \qquad l = 0, 1, \ldots, k; \quad -l \leq m \leq l; \quad l - m = 0 \bmod 2 \qquad (45)$$
and
$$h = \frac{(k-l)(k-l+2)}{4(k+2)} - \frac{(k-m)^2}{4k}, \qquad l = 0, 1, \ldots, k; \quad l \leq m \leq 2k - l - 2; \quad l - m = 0 \bmod 2. \qquad (46)$$
All these exponents (and only these) are indeed observed numerically in the RSOS model on the
antiferromagnetic critical line: see Fig. 14.
Fig. 14. Exponents for the RSOS model on the antiferromagnetic critical line, and comparison with the spectrum of the
$Z_k$ model. Solid lines show the exponents (45), while dashed lines represent descendents. Diagonal geometry.
⁹ Note that we also predict from this discussion the existence of more critical regimes for the selfdual Potts model itself.
Since only the isotropic case was usually considered, attention focused on the cases $x_1 = x_2 = 1$ and $x_1 = x_2 = -1$,
corresponding respectively to $u = \gamma/2$ and $u = \pi/2 + \gamma/2$, points which lie in regimes (i) and (iii) respectively. So the
excursions of the selfdual manifold away from the isotropic points, in regimes (ii) and (iv) respectively, remain to
be studied. We suspect that the corresponding universality classes will be closely related to the ones of the critical
antiferromagnetic (hence non-selfdual) Potts model.
On the other hand, by the quantum group construction [3], the eigenvalues of the RSOS
models are obtained by combining sectors of the vertex model with vanishing spin and twists
$\varphi = \frac{l+1}{k+2}$ (recall, $t \equiv k + 2$), l integer, $0 \leq l \leq k$. The associated dimension in the XXZ part
of the spectrum is obtained by setting $e = \frac{l+1}{k+2}$, and coincides with the one from the sector with
vanishing twist and $S^z = l + 1$; as such, this eigenvalue disappears from the loop model in the BK
case. On the antiferromagnetic line however, this value is still observed: like in the case l = 0,
there is double degeneracy in the vertex model, so the loop model still has this exponent. Observe
now that this exponent (with respect to $c = 2 - 6/t$) agrees with formula (45) for the same l and
m = 0.
For a given l, the parafermionic tower gives weights (45). We recognize there the value coming
from the XXZ subset, from which the quantity $\frac{m^2}{4k}$ has been subtracted. In particular, in the sector
with twist $\varphi = \frac{l+1}{k+2}$, the lowest lying excitation should not be given by m = 0 (unless l = 0) but
by the dimension of the order parameter in the $Z_k$ theory, which corresponds to l = m and reads
$$h_l = \frac{l(k-l)}{2k(k+2)}. \qquad (47)$$
Replacing l by the value of $\varphi$ gives the result
$$h = \frac{(t\varphi - 1)(t - 1 - t\varphi)}{2t(t-2)}. \qquad (48)$$
Fig. 15. Effective central charge for t = 6 in the sector with $S^z = 0$ and twist $\varphi$. For $\varphi \leq 1/t$, the XXZ gap determines the effective central charge $c_{\rm eff} = c - 24h = 2 - 6t\varphi^2$ (blue line), while beyond this value, the effective central charge is determined by a different gap, whose origin will be related to a non-compact boson later,
$c_{\rm eff} = c - 24h = -1 + \frac{3t}{t-2}(2\varphi - 1)^2$ (cyan line). Axial geometry. (For interpretation of the references to color in
this figure legend, the reader is referred to the web version of this article.)
On top of this, it should also contain other eigenvalues which have double degeneracy, (48) being the
lowest one for $\varphi = \frac{l+1}{k+2}$. Only for l = 0, i.e., $\varphi = \frac{1}{k+2}$, will the two coincide. This pattern should
presumably extend to t generic.
Numerical study for the leading exponent in the sector of twist $\varphi$ as a function of $\varphi$ is illustrated in Fig. 15. One sees that for $|\varphi| < 1/t$, the XXZ formula holds, while beyond this, it has to
be replaced by the formula (48) with $t \equiv k + 2$, and t now arbitrary, as expected.
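The two branches of the effective central charge shown in Fig. 15 can be compared numerically. The sketch below assumes that the second branch is obtained by inserting $l = t\varphi - 1$ and $k = t - 2$ into the order-parameter dimension $l(k-l)/(2k(k+2))$ of Eq. (47), and measuring it from $c = 2 - 6/t$; this substitution is our reading of the text, and the function names are illustrative. It requires t > 2.

```python
from fractions import Fraction


def c_eff_xxz(phi, t):
    """XXZ branch: c_eff = 2 - 6 t phi^2."""
    return 2 - 6 * Fraction(t) * Fraction(phi) ** 2


def c_eff_order_parameter(phi, t):
    """Second branch: c_eff = (2 - 6/t) - 24 h, h from Eq. (47) with l = t*phi - 1, k = t - 2."""
    t = Fraction(t)
    phi = Fraction(phi)
    l = t * phi - 1
    k = t - 2
    h = l * (k - l) / (2 * k * (k + 2))
    return 2 - 6 / t - 24 * h


if __name__ == "__main__":
    t = 6
    for phi in (Fraction(1, 6), Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
        print("phi =", phi, c_eff_xxz(phi, t), c_eff_order_parameter(phi, t))
```

Under these assumptions the two branches meet at $\varphi = 1/t$, and the second one simplifies to $-1 + \frac{3t}{t-2}(2\varphi - 1)^2$, which lies above the XXZ branch for $1/t < \varphi \leq 1/2$.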
Exponents with l m l should also be present, but we have had difficulties identifying
them numerically in general. An exponent which we have identified however corresponds to (46)
with l = 0 and m = 2: it is the exponent of the first (most relevant) parafermionic chiral current,
with
$$h = \frac{k-1}{k}. \qquad (49)$$
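The special cases quoted in this section can be checked against the general parafermionic formulas, taken here as $h = \frac{l(l+2)}{4(k+2)} - \frac{m^2}{4k}$ for (45) and $h = \frac{(k-l)(k-l+2)}{4(k+2)} - \frac{(k-m)^2}{4k}$ for (46). The sketch below, with illustrative function names, verifies three statements: the case l = m of (45) gives the order-parameter dimension $l(k-l)/(2k(k+2))$, the case l = 0, m = 2 of (46) gives $(k-1)/k$, and the case m = 0 of (45) reproduces the XXZ value $[(l+1)^2 - 1]/(4t)$ with $t = k + 2$.

```python
from fractions import Fraction


def h_first_family(l, m, k):
    """Eq. (45): l(l+2)/(4(k+2)) - m^2/(4k)."""
    return Fraction(l * (l + 2), 4 * (k + 2)) - Fraction(m * m, 4 * k)


def h_second_family(l, m, k):
    """Eq. (46): (k-l)(k-l+2)/(4(k+2)) - (k-m)^2/(4k)."""
    return Fraction((k - l) * (k - l + 2), 4 * (k + 2)) - Fraction((k - m) ** 2, 4 * k)


if __name__ == "__main__":
    k = 4
    for l in range(k + 1):
        print("l = m =", l, "h =", h_first_family(l, l, k))
    print("current:", h_second_family(0, 2, k))
```

With k = 4 the chiral current has h = 3/4, and the order-parameter dimensions are symmetric under l -> k - l.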
While we have argued that the spectrum has a continuous component for the vertex model
and the loop/cluster model, there is of course no such component in $Z_k$ models, and therefore all
the corresponding exponents must disappear in the RSOS partition function. This is in particular
the case for the continuous branch in the sector l = 0, which starts at $h = \frac{t-2}{t} = \frac{k}{k+2}$.
6. A free field representation for the antiferromagnetic Potts model
To put all these elements together in the construction of a fully consistent conformal field
theory for the cluster/loop Potts model does not seem straightforward, and is plagued with the
usual difficulties, in particular logarithmic features; such a fully consistent construction is not
available even in the much simpler case of the ferromagnetic critical line. Like in the latter case,
some progress can be made however by using a free field representation.
This free field representation is inspired by two features. First, we have deduced the presence
of a compact boson and a non-compact boson from the Bethe ansatz analysis. Second, the value
of the central charge at the Beraha numbers t = k + 2 coincides (as was observed in [2]) with
the central charge for $Z_k$ models, and for the latter, a free field representation based on a pair of
bosons, one compact and one non-compact, is well known [24] (see also [25,26]).
We thus start, following [24], by introducing a pair of bosonic fields $\phi_1$ and $\phi_2$, with propagators
$$\langle \phi_1(z)\phi_1(w) \rangle = -2\ln(z-w), \qquad \langle \phi_2(z)\phi_2(w) \rangle = 2\ln(z-w) \qquad (50)$$
and a stress tensor
$$T = -\frac{1}{4}(\partial\phi_1)^2 + \frac{1}{4}(\partial\phi_2)^2 + i\alpha_0\,\partial^2\phi_1. \qquad (51)$$
With a charge at infinity $\alpha_0 = \frac{1}{2\sqrt{t}}$ for the first boson $\phi_1$, the central charge is $c = 2 - 6/t$.
There are various ways to introduce screening operators in this theory. The first choice, which
was used in [24], uses both bosons $\phi_1$, $\phi_2$, and leads to the three currents
J1 = 2 exp[2i0 1 ]
(52)
together with
i
1
J = exp
t1
t 22 .
2
2
(53)
These screening operators, having conformal dimension one, commute with the Virasoro algebra.
It turns out that they also commute with the parafermionic currents
t
1
i
2 ,
1 + i2 exp
=
2
t 2
t 2
t
i
1
2 .
1 i2 exp
=
(54)
2
t 2
t 2
The vertex operators
l
m
2
Vlm = exp i 1 +
2 t
2 t 2
(55)
then play a special role: the action of $J_\pm$ on $V_{lm}$ is only well defined for l integer and $l - m$ even.
$V_{lm}$ is annihilated by the corresponding charges $Q_\pm$ iff $-l \leq m \leq l$. Similarly, if we consider
the action of powers of $Q_1$, it is well defined only in the case $Q_1^{l+1}$ acting on $V_{lm}$, unless t is
rational; we will suppose it is not the case here. Then $Q_1^{l+1} V_{lm} = 0$. Finally, one can get outside
the range $-l \leq m \leq l$ by acting with the parafermionic fields, which are also annihilated by $Q_\pm$
and $Q_1$.
The case l = 0 leaves fields obtained by successive action of the parafermionic fields on the
identity, the lowest of which has dimension $h = \frac{t-3}{t-2}$ and is twice degenerate.
We call the fields obtained in this way the first Coulomb gas contribution. In this sector, $\phi_2$
presumably does not contribute a continuous part, as the corresponding charges are constrained
by the screening operators.
Note that we can also consider twisting differently the boson $\phi_1$ with $\alpha_0 = \frac{t-1}{2\sqrt{t}}$. Introducing
the usual screening charges for a one boson theory, $\alpha_+\alpha_- = -1$, $\alpha_+ + \alpha_- = 2\alpha_0$, the corresponding screening currents read
$$J_\pm = \exp[i\alpha_\pm\phi_1]. \qquad (56)$$
The usual Felder construction [27] gives the allowed vertex operators
$$\Phi_{rs} = \exp\left[i\left(\frac{1-r}{2}\alpha_+ + \frac{1-s}{2}\alpha_-\right)\phi_1\right]. \qquad (57)$$
The central charge and the dimensions are not those expected: $c = 2 - \frac{6(t-1)^2}{t}$ (the additional
contribution of one coming from the boson $\phi_2$) and $h = \frac{(tr-s)^2 - (t-1)^2}{4t}$. But this theory is non-unitary, and admits in particular the negative-dimension operator with r = 0, s = 1, leading to
an effective central charge $c = 2 - 6/t$ which coincides with the central charge of the loop/cluster
Potts model. The exponents with respect to this effective central charge are
$$h = \frac{(tr-s)^2 - 1}{4t}. \qquad (58)$$
234
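The step from the bare central charge to the effective one can be checked with exact rational arithmetic. The sketch below assumes the readings $c = 2 - 6(t-1)^2/t$ and $h_{rs} = [(tr-s)^2 - (t-1)^2]/(4t)$ of the formulas above; the function names are illustrative only and do not come from the paper.

```python
from fractions import Fraction


def central_charge(t):
    """Bare central charge c = 2 - 6 (t-1)^2 / t (assumed reading of the text)."""
    return 2 - 6 * (t - 1) ** 2 / t


def weight(r, s, t):
    """Bare conformal weight h_rs = ((t r - s)^2 - (t-1)^2) / (4 t)."""
    return ((t * r - s) ** 2 - (t - 1) ** 2) / (4 * t)


def effective_central_charge(t):
    """c_eff = c - 24 h_min, with h_min taken at r = 0, s = 1."""
    return central_charge(t) - 24 * weight(0, 1, t)


def effective_weight(r, s, t):
    """Weight measured from the negative-dimension operator r = 0, s = 1."""
    return weight(r, s, t) - weight(0, 1, t)


if __name__ == "__main__":
    t = Fraction(7, 2)  # an arbitrary non-integer value of t
    print(effective_central_charge(t), 2 - 6 / t)
    print(effective_weight(2, 1, t), ((2 * t - 1) ** 2 - 1) / (4 * t))
```

Each printed pair agrees, which is the content of the statement that the exponents relative to $c = 2 - 6/t$ take the form (58).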
Fig. 16. The central charge of the untwisted vertex model on the antiferromagnetic critical line for $N$ odd takes the value $c_{\rm eff} = \frac{1}{2}$ in contrast with the value $c_{\rm eff} = 2$ for $N$ even. Diagonal geometry.
$$ h_L = \frac{(L/2)^2 - 1}{4t} + \frac{1}{16} \qquad (60) $$
showing that the contribution from the boson $\phi_1$ is as in the $N$ even case, and thus it must be the non-compact boson $\phi_2$ that is twisted, i.e., sees antiperiodic boundary conditions.
Notice that if the field $\phi_2$ is twisted in the $N$ odd sector, the continuous part of the spectrum must disappear, as the twisted sectors of a compact and a non-compact boson are identical. The spectrum for $N$ odd thus ought to be considerably simpler than for $N$ even, a fact well in agreement with numerical studies.
We now turn to the loop/cluster Potts model. The effective central charge is found numerically (see Fig. 17) to be given by a field of dimension
$$ h_D = \frac{t-3}{16}. \qquad (61) $$
To interpret the presence of the weight (61) in general, we observe that it can be written
$$ h_{D_0} = \frac{1}{16} + \frac{t}{4}\left(\frac{1}{2} - \frac{1}{t}\right)^2 - \frac{1}{4t}. \qquad (62) $$
Fig. 17. Effective central charge for the loop model on the antiferromagnetic critical line in the sector $N$ odd (diagonal geometry). The solid line shows the expected result, $c_{\rm eff} = c - 24 h_D = \left(2 - \frac{6}{t}\right) - 24\,\frac{t-3}{16}$.
As argued earlier, the $\frac{1}{16}$ contribution comes from an antiperiodic sector for the non-compact boson $\phi_2$. The remaining contribution can then be interpreted as the weight for an electric charge $e_0 = \frac{1}{2} - \frac{1}{t}$ in the $\phi_1$ theory. Recall that in the $N$ even sector we had charges $e_0 = \frac{1}{t}, 1 - \frac{1}{t}, \ldots$ instead. We do not know the origin of this additional $\frac{1}{2}$ contribution to the charge (see footnote 12).
Effective watermelon exponents in the twisted vertex model (and in the loop model) are then given by measuring the gaps with respect to the odd-$N$ ground state. This yields
$$ h_L^{\rm eff} = h_L - h_{D_0} = \frac{(L/2)^2 - 1}{4t} + \frac{4 - t}{16}. \qquad (63) $$
$$ \frac{k - 2 + (k - 2s)^2}{16(k+2)}, \qquad 0 \le s \le \frac{k}{2}. \qquad (64) $$
The largest of these weights is obtained for s = 0, and coincides with (61). On the other hand, it
is well known that correlation functions in the disorder sectors have to be defined on a two-sheet
Riemann sphere, implying in particular the presence of descendents on half-integer levels. We
expect this feature to extend to t generic, and this is indeed well verified by the numerics for the
first excitations in the odd-N sector, both for the loop model and for the twisted vertex model.
We spare the reader the corresponding figures.
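The coincidence between the $s = 0$ weight in (64) and the weight (61) can be checked exactly. This is a minimal sketch, assuming the readings $h_D = (t-3)/16$ for (61), $\frac{1}{16} + \frac{t}{4}(\frac{1}{2} - \frac{1}{t})^2 - \frac{1}{4t}$ for (62), and $[k - 2 + (k-2s)^2]/[16(k+2)]$ with $t = k + 2$ for (64); the helper names are hypothetical.

```python
from fractions import Fraction


def disorder_weight(k, s):
    """Weight read from Eq. (64): (k - 2 + (k - 2s)^2) / (16 (k + 2))."""
    return Fraction(k - 2 + (k - 2 * s) ** 2, 16 * (k + 2))


def h_D(t):
    """Weight read from Eq. (61): (t - 3) / 16."""
    return (Fraction(t) - 3) / 16


def h_D_decomposed(t):
    """Decomposition read from Eq. (62): twisted-boson 1/16 plus charge e0 = 1/2 - 1/t."""
    t = Fraction(t)
    return Fraction(1, 16) + t / 4 * (Fraction(1, 2) - 1 / t) ** 2 - 1 / (4 * t)


if __name__ == "__main__":
    for k in range(2, 9):
        weights = [disorder_weight(k, s) for s in range(k // 2 + 1)]
        # The s = 0 weight is the largest and equals (t - 3)/16 with t = k + 2.
        print(k, weights[0] == max(weights), weights[0] == h_D(k + 2))
    print(h_D_decomposed(Fraction(7, 2)) == h_D(Fraction(7, 2)))
```

The last line shows that the decomposition also holds for non-integer $t$, as expected if (62) is valid for $t$ generic.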
12 Since in the numerical study, the twist for the vertex model is exactly = 1 , it means that for N odd, this twist
t
translates into an electric charge e0 for the 1 boson, and a Z2 twist for the 2 boson. Moreover, the present value of the
charge would correspond, in the usual case, to giving to an oriented non-contractible loop, a weight iei/t . This says
something about the relation between the microscopic arrow degrees of freedom and the bosons 1 and 2 .
Fig. 18. Effective watermelon exponents, in the sector N odd, for the loop model along the antiferromagnetic critical
line. Diagonal geometry.
$$ \frac{t}{16} - \frac{(t-1)^2}{4t}. \qquad (65) $$
The dimension in the untwisted boson theory is thus $h_H^{BK} = \frac{t}{16}$. The dimension on the antiferromagnetic line, conjectured first in [2], is obtained by adding a contribution from the non-compact boson, so
$$ h_H^{AF} = \frac{t}{16} - \frac{1}{4t} - \frac{t-2}{16} = \frac{1}{8} - \frac{1}{4t}. \qquad (66) $$
This value is easy to understand following the discussion after formula (48): the magnetic exponent corresponds indeed to a twist of $\frac{1}{2}$ (a value which kills non-contractible loops, in relation with the term having (E) = 1 in (3)), for which the XXZ value is never valid as $t \ge 2$ always, while the correct value deduced from the parafermionic exponents gives $c_{\rm eff} = -1$, in agreement with the conjectured value of $h_H^{AF}$.
The results (65), (66) for the magnetic exponent are verified numerically in Figs. 19, 20.
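As a consistency check on the magnetic exponent, the sketch below assumes Eq. (66) reads $h_H^{AF} = \frac{t}{16} - \frac{1}{4t} - \frac{t-2}{16} = \frac{1}{8} - \frac{1}{4t}$ and that the loop-model central charge is $c = 2 - 6/t$. It verifies that the effective central charge in the magnetic sector is $-1$ and that the Ising case ($Q = 2$, i.e. $t = 4$) gives the familiar weight $1/16$. Function names are illustrative.

```python
from fractions import Fraction


def c_loop(t):
    """Central charge of the loop/cluster model, c = 2 - 6/t."""
    return 2 - 6 / Fraction(t)


def h_af(t):
    """Magnetic weight on the antiferromagnetic line (assumed reading of Eq. (66))."""
    t = Fraction(t)
    return t / 16 - 1 / (4 * t) - (t - 2) / 16


if __name__ == "__main__":
    for t in (Fraction(5, 2), 3, 4, 5, 6):
        # closed form 1/8 - 1/(4t), and c_eff = c - 24 h
        print(t, h_af(t), c_loop(t) - 24 * h_af(t))
```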
Next, we consider the thermal exponent. An increase of K moves the model into a massive phase, while a decrease moves it into the Berker-Kadanoff (BK) phase. Numerical evidence is that the BK and massive phases are not related analytically in any way, and that the largest eigenvalue in the massive phase becomes the smallest (or maybe one of the smallest, in the sense that it scales with the smallest) eigenvalue in the BK phase. As a result, the free energy per unit area for the Potts model at Q generic and the twisted vertex model exhibits a discontinuity of the first derivative at the antiferromagnetic critical point, a characteristic of first-order phase transitions. Nevertheless, we have amply argued that these models right at the critical point have algebraically decaying correlations. We are thus in a case of a first-order critical point with an exponent equal to 1/2 all along the line for the twisted vertex model, and, generically, for the loop/cluster Potts model.
Fig. 19. Magnetic exponent of the loop model along the non-physical selfdual line. Axial geometry. The solid line represents the result (65).
Fig. 20. Magnetic exponent of the loop model along the critical antiferromagnetic line. Axial geometry. The solid line represents the result (66).
On the other hand, these results do not apply to the loop/cluster Potts model in the case of Q a Beraha number, that is $t = k + 2$ integer. For such values of Q, the free energy does not exhibit a kink any longer, the eigenvalues crossing the ground state at the antiferromagnetic critical point being cut off from the partition function within the Berker-Kadanoff phase. The same is true for the RSOS version of the model. These different features are illustrated in Fig. 21.
While the existence of the first-order critical point can be pretty much argued analytically for the twisted vertex model and the Potts model, it is natural to expect a similar transition in the untwisted vertex model, since for the vertex model, the free energy per unit area should behave normally and thus be independent of twists. This is confirmed by numerical study, as illustrated in Fig. 22.
Fig. 21. The dashed (respectively solid) curves represent the free energy of the RSOS model (respectively the loop model) at t = 5 for various widths N. The two free energies coincide perfectly (and hence, also with that of the twisted vertex model) for $x = Q^{-1/2}(e^K - 1)$ large enough (whence the dashed curves are hidden by the solid ones). However, for smaller x, the free energy of the loop model experiences a sharp singularity with a discontinuity of the derivative. As N increases, this singularity moves towards the bulk antiferromagnetic transition temperature, here shown as a vertical line. Within the BK phase, the free energies of the RSOS and the loop models are different, with a jump whose magnitude increases as one goes deeper within this phase. Diagonal geometry.
Fig. 22. The dotted (respectively full) curves represent the free energy of the untwisted (respectively the twisted) vertex model at t = 5 for various widths N. The dashed curves are for the RSOS model, same as in Fig. 21. Only the twisted vertex model exhibits singular behavior for finite N. However, the free energies of the twisted and untwisted vertex model behave similarly in the thermodynamic limit, with a first-order critical point on the antiferromagnetic critical line. Diagonal geometry.
Fig. 23. Parafermion to minimal model flow in the RSOS model with t = 5. The parameter $x = Q^{-1/2}(e^K - 1)$. The vertical lines show the non-physical selfdual line ($x = -1$) and the antiferromagnetic critical point ($x \approx -0.51$). The horizontal lines are at c = 4/5 and c = 7/10, the central charges of the relevant parafermion and minimal models. The data indicate the existence of a new repulsive fixed point at $x \approx -0.83$ in the parafermion model universality class. The minimal model fixed point is situated somewhere between this and the antiferromagnetic point, and its attractive nature leads to a plateau in c in the range $-0.83 < x < -0.51$. The data further reveal a c = 0 fixed point, presumably repulsive, at $x \approx -0.93$. If true, consistency of the RG flow diagram would necessitate another attractive fixed point; the data are indeed consistent with such a point at $x \approx -0.87$, again in the minimal model universality class. The interpretation of the data in the range $-1 < x < -0.93$ is somewhat uncertain. Right at $x = -1$, the data are again well-behaved and converge to $c = 0.385 \pm 0.001$. We have no explanation for this value at present. Diagonal geometry.
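The quoted positions of the antiferromagnetic critical point can be cross-checked numerically. The sketch below rests on two relations that are not spelled out in this section and are therefore assumptions here: the parametrization $\sqrt{Q} = 2\cos(\pi/t)$, and Baxter's square-lattice antiferromagnetic critical condition $(e^K + 1)^2 = 4 - Q$ (Ref. [13]), taken on the branch $e^K = -1 + \sqrt{4 - Q}$. The function names are hypothetical.

```python
import math


def potts_Q(t):
    """Assumed parametrization: sqrt(Q) = 2 cos(pi / t)."""
    return 4.0 * math.cos(math.pi / t) ** 2


def x_antiferro(t):
    """x = Q^(-1/2) (e^K - 1) evaluated at e^K = -1 + sqrt(4 - Q) (assumed critical condition)."""
    Q = potts_Q(t)
    exp_K = -1.0 + math.sqrt(4.0 - Q)
    return (exp_K - 1.0) / math.sqrt(Q)


if __name__ == "__main__":
    for t in (5, 6, 7):
        print(t, round(x_antiferro(t), 4))
```

Under these assumptions, t = 5 and t = 6 give values that round to the antiferromagnetic positions quoted in the captions of Figs. 23 and 24.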
At the Beraha numbers, and for the loop/cluster Potts model or the RSOS model, the thermal operator does not have dimension $\frac{1}{2}$ any more. Rather, the dimension becomes $h = \frac{k-1}{k} = \frac{t-3}{t-2}$, the dimension of the parafermion operator. It is easiest to find out what happens for the RSOS version of the model, which has the same largest eigenvalue as the loop/cluster model. In this case, recall that the antiferromagnetic critical point is in the universality class of the $Z_k$ theory, and the flow in the Berker-Kadanoff phase coincides with the flow predicted by Fateev and Zamolodchikov [29] who studied perturbed parafermionic theories with the following Hamiltonian:
H = Hk + ei/k + ei/k
(67)
( positive by convention). These authors argued that for = the theory flows, for k odd, to
the minimal model with central charge $c = 1 - \frac{6}{(k+1)(k+2)}$ while it remains massive for $k$ even.
Numerical evidence for these flows is presented in Figs. 23-25, where we also give phenomenological interpretations of other relevant features in the RSOS model phase diagrams.
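The central charges quoted in the figure captions follow from two formulas: the minimal-model value $c = 1 - 6/[(k+1)(k+2)]$ given above, and the standard $Z_k$ parafermion value $c = 2(k-1)/(k+2)$, which is not written out in this section and is assumed here. A short sketch with illustrative function names:

```python
from fractions import Fraction


def c_parafermion(k):
    """Standard Z_k parafermion central charge, 2 (k - 1) / (k + 2) (assumed, not quoted in the text)."""
    return Fraction(2 * (k - 1), k + 2)


def c_minimal(k):
    """Minimal-model central charge reached by the flow for k odd."""
    return 1 - Fraction(6, (k + 1) * (k + 2))


if __name__ == "__main__":
    for k in (3, 4, 5):  # t = k + 2 = 5, 6, 7
        print(f"t={k + 2}: parafermion c={c_parafermion(k)}, minimal c={c_minimal(k)}")
```

For k = 4 only the parafermion value is relevant, since the text states that the theory remains massive for k even.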
Note meanwhile that if $t$ is rational, we can still define an RSOS model, but its ground state energy should be the same as the one of the vertex model, that is, exhibit the first-order critical point, in sharp contrast with what happens for $t$ integer. This is illustrated in Fig. 26 for $t = \frac{5}{2}$.
Finally the question remains of how to describe the first-order critical point within the
Coulomb gas. We have not found the complete answer to this question, but a probable scenario
is that we are dealing with the perturbation described by Fateev [30],
H = H0 + cos(1 1 ) cosh(2 2 ),
(68)
Fig. 24. RSOS model phase diagram for t = 6. The physics is different, and simpler, than in the case t = 5 shown in Fig. 23. The c = 1 parafermion fixed point at the antiferromagnetic transition, $x \approx -0.58$, is still repulsive, but the nearest attractive fixed point is now the non-critical c = 0 point at $x \approx -0.87$. Right at $x = -1$ the data converge slowly towards $c = 1.6 \pm 0.1$, an unexplained value. The model is expected to be massive everywhere, except at the two vertical lines. Diagonal geometry.
Fig. 25. RSOS model phase diagram for t = 7. The physics is similar to the case of t = 5 shown in Fig. 23. In particular, we observe again a parafermion ($c = \frac{8}{7}$) to minimal model ($c = \frac{6}{7}$) flow. The plateau at $c = \frac{6}{7}$ develops more slowly in N, in the form of a shoulder. Analyzing the behavior at the left edge of that plateau would call for larger system sizes than those presented here. Diagonal geometry.
where H0 is the Hamiltonian for a pair of free bosons 1 and 2 as in Section 6, and
12 22 = 4.
(69)
Indeed, consider first the case where the boson 1 is twisted so that the central charge is c =
2 6t , and set
12
8
the perturbation to be h =
12 22
8
Fig. 26. The free energy of the RSOS model for $t = \frac{5}{2}$ also develops a first-order singularity in the thermodynamic limit. Diagonal geometry.
one with dimension unity. Meanwhile, the theory is integrable, and admits non-local conserved
currents, one of them being
4i
2
J = 1 exp
(70)
2
with dimension h = 1
2
22
=1
1
t2
t3
t2 ,
(71)
A last observation we can make is that, since the parafermionic theories for $t$ integer flow to minimal theories in the BK phase, and since the latter are derived from a Coulomb gas with coupling $g = \frac{t-1}{t}$, it is most likely that excitations for Q generic around the false ground state (i.e., the state that crosses from the high K phase, and that would still be the ground state of the Potts model for Q a Beraha number) are described by this Coulomb gas, at least in the vicinity of the antiferromagnetic transition. Excitations around the true ground state meanwhile are described by a Coulomb gas with $g = \frac{1}{t}$, so both manifolds are in fact present within the BK phase!
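The two couplings can be compared through the standard one-boson Coulomb gas relation between the coupling $g$ and the central charge. That relation is not quoted in this section, so it is an assumption of the following worked evaluation:

```latex
% Assumption: standard one-boson Coulomb gas central charge as a function of g
c(g) = 1 - \frac{6(1-g)^2}{g}

% False ground state, g = (t-1)/t: the minimal-model value with t = k + 2
c\!\left(\tfrac{t-1}{t}\right)
  = 1 - \frac{6/t^2}{(t-1)/t}
  = 1 - \frac{6}{t(t-1)}
  = 1 - \frac{6}{(k+1)(k+2)}

% True ground state, g = 1/t
c\!\left(\tfrac{1}{t}\right)
  = 1 - \frac{6(t-1)^2/t^2}{1/t}
  = 1 - \frac{6(t-1)^2}{t}
```

The first value reproduces the minimal-model central charge reached by the parafermion flow for $t = k + 2$, and the second differs from the bare value $c = 2 - 6(t-1)^2/t$ quoted earlier only by the unit contribution of the second boson.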
9. Conclusion
The least one can say is that the properties of the antiferromagnetic Potts model are extremely
complex, even for the simple case of the square lattice. There are definitely loose ends in our
study, the most important one being probably our lack of understanding of the emergence of a
non-compact bosonic degree of freedom. One might understand it at a very qualitative level by considering the block interactions as in Fig. 1. Summing the arrows on either of the pairs of incoming or outgoing legs defines dual height variables which are described by the boson $\phi_1$. The states for which the two arrows have opposite directions do not change the height $\phi_1$, and draw loops on the lattice, which can intersect at vertices. These loops bear some resemblance to the loops of the Goldstone phases in O(n) models, for which it is known that the continuum limit is essentially a collection of non-compact bosons and symplectic fermions. It is thus not so far-fetched to expect that the loops in the antiferromagnetic Potts model are described by a non-compact boson, which would mean roughly that they are Brownian at large scales. It would of course be most interesting to elaborate this picture further.
A more straightforward direction of study would be to consider, when $t$ is rational, the RSOS versions on the antiferromagnetic critical line. Since when $t = k + 2$ is an integer, they are in the universality class of $Z_k$ parafermions, identical with the $SU(2)_k/U(1)$ coset model, it is tempting to speculate that they might be related to SU(2) with a rational level. This will be discussed elsewhere.
The appearance of a line of first-order critical points is intriguing. Such points are not entirely unheard of. In early works, they were exhibited for instance in the one-dimensional q-state classical clock model with a topological term [31]. Another related case is the point H = 0 in the high-temperature loop version of the low-temperature Ising model (i.e., the O(n = 1) dense loop model [32]). Recall indeed that the low-temperature Ising model is a massive theory as far as the local spin degrees of freedom are concerned, but nevertheless presents interesting conformal properties when one reformulates it as a high-temperature loop expansion. These properties are the same as the ones of the dense O(n = 1) loop model, and are described by a conformal field theory with vanishing central charge, and, more importantly, a field of vanishing dimension: the spin operator. Turning on a magnetic field leads to a first-order singularity of the free energy, that is, the model is also at a first-order transition point where the two manifolds of possible spontaneous magnetization $m = \pm m_{sp}$ intersect. On the other hand, it is not clear how to see the difference between positive and negative magnetic fields within the loop formulation, since the field is conjugate to the number of loop end points, and that number is necessarily even. Therefore, it is not clear how to actually observe, for instance, the first-order singularity of the free energy.
Interestingly, first-order critical points have also been independently proposed in a series of papers by Pruisken and collaborators revisiting the quantum Hall effect transition. These authors have argued for instance [33] that in the large-N $CP^N$ model, there are bulk massless degrees of freedom at the first-order phase transition point. If true, these massless degrees of freedom could maybe be understood in the Q-state Potts model incarnation as geometrical degrees of freedom, probably non-local in terms of the original Potts spins.
Fig. 27. Central charge for the cluster model on the triangular lattice. The transfer direction is perpendicular to a line of N Potts spins. All fits are for N which have the same remainder modulo 3 (otherwise deviations from the expected result $c = 2 - \frac{6}{t}$ will appear close to Q = 0 and Q = 4).
Finally, the reader might wonder whether the antiferromagnetic transition discussed here is particular to the square lattice. We believe on the contrary that it is common to any two-dimensional lattice. Indeed, it is well established that the properties along the ferromagnetic transition line are universal, for example by invoking the standard one-boson Coulomb gas construction. The existence of another line of transitions for which the thermal operator is irrelevant (which we have called, for the square lattice, the non-physical selfdual line) follows by analytic continuation in the Coulomb gas coupling constant, and leads immediately to the existence of a Berker-Kadanoff phase, in the region where $e^K < 1$. To isolate this phase from the trivial fixed points at zero and infinite temperature, we need (at least) on either side a further line of repulsive fixed points. These lines must necessarily be the loci of level crossings, and it is thus not far-fetched to expect that the rest of the physics of the first-order critical points (at generic Q) will follow. Moreover, the singularities at the Beraha numbers follow from the fact that the vertex model transfer matrix can be built in terms of Temperley-Lieb generators and that this commutes with the generators of the quantum group, two very universal ingredients.
To test this heuristic argument, let us consider the case of the triangular lattice. Setting $v = e^K - 1$, the Potts model is integrable when $v^3 + 3v^2 = Q$ [34]. When $Q \in [0, 4]$ this describes three branches. The upper one is the ferromagnetic transition, of course, and by analytic continuation the middle one should then govern the Berker-Kadanoff phase. The remaining lower branch is thus a natural candidate for one of the two antiferromagnetic transition lines. Numerics for the central charge of the cluster model along the lower branch are shown in Fig. 27 and are indeed well described by $c = 2 - \frac{6}{t}$. The same results of course hold by duality for the model on the hexagonal lattice.
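The three branches of the integrable curve $v^3 + 3v^2 = Q$ can be written in closed form. Substituting $v = y - 1$ turns the cubic into $y^3 - 3y = Q - 2$, and $y = 2\cos\theta$ then gives $2\cos 3\theta = Q - 2$. The sketch below uses this trigonometric solution; the function name and the ordering convention (upper, middle, lower) are choices made here for illustration.

```python
import math


def triangular_branches(Q):
    """Return the three real roots of v^3 + 3 v^2 = Q for 0 <= Q <= 4, sorted from upper to lower."""
    if not 0.0 <= Q <= 4.0:
        raise ValueError("three real branches exist only for 0 <= Q <= 4")
    # v = y - 1 and y = 2 cos(theta) reduce the cubic to cos(3 theta) = (Q - 2) / 2.
    theta = math.acos((Q - 2.0) / 2.0) / 3.0
    roots = [-1.0 + 2.0 * math.cos(theta + 2.0 * math.pi * j / 3.0) for j in range(3)]
    return sorted(roots, reverse=True)


if __name__ == "__main__":
    for Q in (0.5, 2.0, 3.5):
        upper, middle, lower = triangular_branches(Q)
        print(Q, round(upper, 4), round(middle, 4), round(lower, 4))
```

For Q = 2 the cubic factorizes as $(v + 1)(v^2 + 2v - 2) = 0$, so the branches are $v = \sqrt{3} - 1$, $-1$ and $-1 - \sqrt{3}$, which gives a quick sanity check of the formula.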
Acknowledgements
J.L. Jacobsen thanks J.-F. Richard, J. Salas, and A. Sokal for related collaborations, and SPhT
for their warm hospitality.
Appendix A. Block spins and the Q = 0 case
We use for instance the osp(2/2) R matrix given in [35]. In the latter reference, a more general
expression is given for the q-deformed case. We take the limit q 1. At the same time we scale
the spectral parameter x = ei v , q = ei , with 0. We then obtain the expression
2+v
4v
PN .
P0 2
R = P1 +
2v
(v 2)2
(A.1)
The R matrix acts in the tensor product of two fundamental, four-dimensional representations.
The basis states in the latter are the bosonic states |1, |4 and the fermionic states |2, |3. The
projectors are as follows:
(i) one-dimensional blocks:
P1 = 1,
P0 = 0,
P1 = 0,
P0 = 1,
2 0 0
1
0
2 2
P0 =
0 2 2
4
2 0 0
1
1
2
1
1
PN =
4 1 1
1 1
P1 =
1
2
1 1
,
1 1
PN = 0
(A.2)
(A.3)
|1 |3, |3 |1 ,
|3 |4, |4 |3
2
2
0
1
0
,
P1 =
0
4 0
2
2
1 1
1 1
1 1
1
(A.4)
0
2
2
0
0
2
2
0
2
0
,
0
2
(A.5)
(A.6)
Consider now a different problem where spins 1/2 interact with the well-known six-vertex
model R matrix with anisotropy parameter . For the value u of the spectral parameter the
Boltzmann weights are represented in Fig. 28, with
w1 = w2 = 1,
w5 =
w 3 = w4 =
sin
eiu ,
sin( + u)
sin u
,
sin( + u)
w6 =
1 sin iu
e ,
sin( + u)
(A.7)
where is a parameter at our disposal (a gauge parameter), and for = 1 the R matrix commutes
with su(2)q , with q = ei .
We now consider the following situation. Take a square lattice, and with every horizontal
line associate the parameters h, h + 2 in alternance. With the vertical lines, associate similarly
the parameters v, v + 2 in alternance (see Fig. 29). The weights at the vertices are given by
the foregoing six-vertex formula with u being the difference of the horizontal parameter and the
vertical parameter for the two intersecting lines. Since the Boltzmann weights are invariant under
a shift of u by , we end up with two kinds of vertices, those for which u = h v and those for
which u = h v 2 .
We now consider interaction blocks where a pair of horizontal lines meets a pair of vertical
lines. We use these blocks to define a new interaction matrix, where each link can carry four
states, made out of the two spins 1/2 of the block. In the limit q i ( 2 ), the interaction
becomes trivial at fixed u, but if we let at the same time u approach zero as
=
+ ,
2
u = w
(A.8)
Fig. 29. Staggered vertex model, with alternating parameters on horizontal and vertical lines.
we claim that we obtain, after inserting appropriate gauge factors, the osp(2/2) R matrix, with
the following correspondences
|++ |1,
|+ + i|+ |2,
|+ i|+ |3,
| |4
(A.9)
|4
+
|4 |1
(v 2)2
(v 2)2
2v
|2 |3 + |3 |2 .
+
2
(v 2)
(A.10)
This leads among others to the two vertices represented in Fig. 30. These vertices are obtained
in turn as a sum over configurations of the spins 1/2. For the first, one has the weight
2
2
sin u
w2
cos u
sin( + u)
cos( + u)
(w + 1)2
and for the second
sin2
sin2 ( + u)
e2iu
sin2 sin2 u
sin2 ( + u) cos2 ( + u)
e2iu 1
w2
1 + 2w
=
2
(w + 1)
(1 + w)2
S + = q + + + q .
z
(A.11)
Fig. 30. Two examples of the equivalence between R matrix elements. Diagrams on the left are osp vertices together
with their Boltzmann weight, and they are obtained from the corresponding microscopic configurations on the right.
z
z
S + = q + + + q
(S )2
,
q+q 1
(A.12)
References
[1] P.P. Martin, Potts Models and Related Problems in Statistical Mechanics, World Scientific, Singapore, 1991.
[2] H. Saleur, Nucl. Phys. B 360 (1991) 219.
[3] V. Pasquier, H. Saleur, Nucl. Phys. B 330 (1990) 523.
[4] F. Essler, H. Frahm, H. Saleur, Nucl. Phys. B 712 (2005) 513, cond-mat/0501197.
[5] J.L. Jacobsen, N. Read, H. Saleur, Phys. Rev. Lett. 90 (2003) 090601, cond-mat/0205033.
[6] J.L. Jacobsen, J. Salas, A.D. Sokal, J. Stat. Phys. 119 (2005) 1153-1281, cond-mat/0401026.
[7] P.W. Kasteleyn, C.M. Fortuin, J. Phys. Soc. Jpn. Suppl. 26 (1969) 11.
[8] C.M. Fortuin, P.W. Kasteleyn, Physica 57 (1972) 536.
[9] P. Di Francesco, H. Saleur, J.B. Zuber, J. Stat. Phys. 49 (1987) 57.
[10] J.L. Jacobsen, J. Salas, cond-mat/0407444, J. Stat. Phys., in press.
[11] R.J. Baxter, S.B. Kelland, F.Y. Wu, J. Phys. A 9 (1976) 397.
[12] N. Read, H. Saleur, Nucl. Phys. B 613 (2001) 409, hep-th/0106124.
[13] R.J. Baxter, Proc. R. Soc. London 383 (1982) 43.
[14] R.J. Baxter, Stud. Appl. Math. 50 (1971) 51.
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
Abstract
We recently proposed a new approach to the Casimir effect based on classical ray optics (the optical approximation). In this paper we show how to use it to calculate the local observables of the field theory. In particular, we study the energy-momentum tensor and the Casimir pressure. We work three examples in detail: parallel plates, the Casimir pendulum and a sphere opposite a plate. We also show how to calculate thermal corrections, proving that the high temperature classical limit is indeed valid for any smooth geometry.
2006 Elsevier B.V. All rights reserved.
PACS: 03.65.Sq; 03.70.+k; 42.25.Gy
1. Introduction
The Casimir effect [1-5] is a manifestation of the quantum fluctuations of a quantum field at a macroscopic level. Experiments on Casimir forces are precise tests of one of the less intuitive predictions of field theory. For a theoretician, predicting the outcome of these experiments is a worthy challenge. Hence it seems somewhat astonishing that an exact solution exists only for the case of infinite, parallel plates [1]. Other formal solutions for geometries not made of distinct rigid bodies free to move (like the wedge, the interior of a sphere or of a rectangular box [3]) are irrelevant for an experimental setup. Moreover, in many such solutions divergences have been discarded in a way that leaves the result unrelated to practical materials and configurations [6].
* Corresponding author.
Interesting theoretical developments include the method developed in [7] where the solution for
infinite periodic geometries is obtained as a series expansion in the corrugation, and the numerical
Monte Carlo analysis in [8].
Amongst the various effects that an experimentalist must take into account to interpret the data (e.g. finite conductivity, temperature and roughness corrections), probably the most challenging, interesting and full of connections with other branches of physics and mathematics is the dependence of the force on the geometry of the bodies. Calculating the Casimir force for perfectly reflecting bodies in the end reduces to finding the density of states (DOS) of the Schrödinger Hamiltonian for the equivalent billiard problem, including the oscillatory ripple on the averaged DOS. This is an incredibly difficult problem in spectral theory that still challenges mathematicians and physicists today [9] and in essence is not solved beyond the semiclassical approximation.
In this context we have introduced in Refs. [10,11] a method based on classical optics which
has several virtues: accuracy, uniform validity when a symmetry is born, straightforward extension to higher spin fields, to non-zero temperatures, to include finite reflectivity and, the main
topic of this paper, it can provide an approximation to local observables.
This paper is structured as follows: in Section 2 we show how to cast the energy-momentum tensor into a sum over optical path contributions and how to regulate and analyze the divergences, ubiquitous in Casimir energy calculations. Section 3 is dedicated to the analysis of the three examples already studied in [11] with pedagogical intent. We study parallel plates, the Casimir torsion pendulum and a sphere opposite a plate. In Section 4 we show how to calculate the same local observables and the free energy for a thermal state and we prove (within the limits of our approximation) the classical limit theorem [4,13], which states that at high $T$, Casimir forces become independent of $\hbar$ and proportional to $T$. As far as we know this is the first time this assertion can be generalized to geometries other than parallel plates. We also study the example of parallel plates (finding the known results) and of a sphere opposite a plate at non-zero temperature. We find evidence, again within the framework of the optical approximation, that the low $T$ behavior of the Casimir force is a difficult problem, qualitatively different from the $T = 0$ and high temperature cases.
2. Local observables
Local properties of the quantum vacuum induced by the presence of boundaries are of broad interest in quantum field theory [14]. For example, gravity couples locally to the energy-momentum tensor. Vacuum polarization induces local charge densities near boundaries, provided the symmetries of the theory allow it. Also, local densities are free from some of the cutoff dependencies that plague many other Casimir effects. Any local observable that can be expressed in terms of the Green's function can be estimated using the optical approach. In this section we study the energy, momentum and stress densities for a scalar field.
Some local observables are not unambiguously defined [15]. For example, the charge density (in a theory with a conserved charge) is unambiguously defined while the energy density, in general, is not (while its integral over the volume, the total energy, is). In this paper we use the Noether definition of the energy-momentum tensor; similar results would be obtained with other interesting definitions.
L
g L,
( )
(2.2)
1
(2.3)
m2 2
2
from which we identify the energy density $T_{00}$, the momentum density $T_{0i}$, and the stress tensor $T_{ij}$. The definition of these quadratic operators involves divergences that we will regulate by point splitting. We hence replace quadratic operators like $\phi(x)^2$ by $\lim_{x' \to x} \phi(x')\phi(x)$. The energy density operator, for example, is
1
1
1 2
m
(x
,
t)
(x,
t)
+
(x
,
t)(x,
t)
+
(x
,
t)(x,
t)
(2.5)
(2.6)
(2.7)
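We also use the definition $E(k) = \sqrt{k^2 + m^2}$, and $E_j = \sqrt{k_j^2 + m^2}$ so that the eigenvalue equation reads
$$ -\nabla^2 \psi_j = k_j^2\, \psi_j. \qquad (2.8) $$
We now introduce the propagator $G(x', x, k)$, defined as in Ref. [11] to be the Green's function of the problem (2.7) or (2.8):
$$ \left(-\nabla'^2 - k^2\right) G(x', x, k) = \delta^N(x' - x), \qquad G(x', x, k) = 0 \ \text{for}\ x' \ \text{or}\ x \in S, \qquad (2.9) $$
$$ G(x', x, k) = \sum_n \frac{\psi_n(x')\,\psi_n(x)}{k_n^2 - k^2 - i\epsilon}. \qquad (2.10) $$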
In Ref. [10] we have developed an approximation for the propagator G(x , x, k) in terms of
optical paths (closed, in the limit x x). The derivation can be found in Ref. [11], the general
result valid for N spatial dimensions being
Gopt (x , x, k) =
r
(1)nr
(1)
(
r r )1/2 k N/21 H N (k
r ),
2N/2+1 N/21
2 1
Gr (x , x, k),
(2.11)
where H is a Hankel function, r labels the paths from x to x , nr is the number of reflections of
the path r,
r (x , x) is its length and r (x , x) is the enlargement factor familiar from classical
optics,
r (x , x) =
dx
.
dAx
(2.12)
$\Delta_r(x', x)$ is the ratio between the angular opening of a pencil of rays at the point $x$ and the area spanned at the final point $x'$ following the path $r$. For $N = 3$ we have
$$ G_r(x', x, k) = (-1)^{n_r}\, \frac{\Delta_r^{1/2}(x', x)}{4\pi}\, e^{i k \ell_r(x', x)}. \qquad (2.13) $$
With this explicit form for the propagator $G$, we now have to rewrite the elements of the quadratic operator $T_{\mu\nu}$ as functions of $G$ and its derivatives. It is useful to pass from the point-splitting to a frequency cutoff by inserting the latter in the normal modes decomposition (2.6) as
kj /
1/2
dk ek/ 2k k 2 kj2 .
(2.14)
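The $N = 3$ form of the optical propagator can be sanity-checked on its simplest term, the direct path with no reflections. The sketch below assumes that for free propagation the enlargement factor is $\Delta = 1/\ell^2$ (the area spanned by a pencil of rays grows as $\ell^2$ times its solid angle), so that the term reduces to $e^{ik\ell}/(4\pi\ell)$, and it verifies numerically that this function solves the homogeneous Helmholtz equation away from the source. The function names and the finite-difference step are illustrative choices.

```python
import cmath
import math


def free_propagator(r, k):
    """Zero-reflection term for N = 3 with Delta^(1/2) = 1/r (assumption): exp(i k r) / (4 pi r)."""
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)


def helmholtz_residual(r, k, step=1e-4):
    """Finite-difference value of (-Laplacian - k^2) G at radius r > 0.

    For a spherically symmetric function f(r), the Laplacian is (1/r) d^2(r f)/dr^2.
    """
    def u(s):
        return s * free_propagator(s, k)

    second_derivative = (u(r + step) - 2.0 * u(r) + u(r - step)) / step ** 2
    return -second_derivative / r - k ** 2 * free_propagator(r, k)


if __name__ == "__main__":
    for r, k in ((0.7, 1.0), (1.3, 2.0), (2.5, 0.5)):
        print(r, k, abs(helmholtz_residual(r, k)))
```

The residuals are at the level of the finite-difference error, consistent with the direct-path term being the free Green's function for $x' \neq x$.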
The limit
x
x can then be exchanged with the dk integral and we get for the energy density,
dk e
k/ 1
E(k)(x, k) +
dk ek/
k
j (x, k).
2E(k)
(2.15)
2k
Im G(x, x, k),
(2.16)
1
1
j(x, k) = lim
, x, k) =
Im G(x
Im G(x,
x, k).
2
x x
(2.17)
E is obtained by integrating T00 over the whole volume between the bodies:
E=
d x
D
dk e
k/ 1
E(k)(x, k) +
dk ek/
k
2E(k)
d S j(x, k).
(2.18)
We have turned the integral over the divergence of j into a surface integral using Gausss theorem. In the case of Dirichlet or Neumann boundary conditions, since d S n we have (here
and jn = n j)
n n
jn (x, k) =
1
Im n G(x, x, k) = 0,
xS
(2.19)
and the surface integral term disappears. It should be noted that the vanishing of the $\vec{j}$ contribution to the total energy relies on the continuity of the propagator for $x', x \in D$. In some approximations, including the optical one, this continuity is lost. Hence spurious surface terms arise on the boundary of certain domains $D' \subset D$. This region is what in wave optics is called the penumbra region. Diffractive contributions are also not negligible in this region and they cancel the discontinuities in $G$, hence eliminating the surface terms (see footnote 1). The surface terms in the energy are hence of the same order as the diffractive contributions which define the error in our approximation.
j could also be eliminated from T00 by changing the energymomentum
The divergence
tensor according to
T = T + ,
(2.20)
with
1
= (g g ).
(2.21)
2
The total energy E and momentum are not affected by this redefinition however the new tensor
T is not symmetric.
It can be seen that the stress tensor Tij is normal on the surface S (for both Dirichlet and
Neumann BC) so locally the force on the surface is given by the pressure alone
d F
= nP = 0|Tn,n |0.
dS
The operator Tnn regulated by point splitting is
1
2 2
Tn,n (x, t) = lim
0
n
n,
n 0
n
2
x x
1
1
2
2 2
= lim
+
)
0
n
n
2 0
2
x x
(2.22)
(2.23)
where is shorthand for (x , t). The second term in brackets is zero when averaged over an
eigenstate of the number operator |{nj }, by virtue of the equations of motion. For Dirichlet BC
1 As an example see Kirchhoff's treatment of the diffraction from a hole in Ref. [19].
0|Tn,n |0 = lim
(2.24)
P (x) = lim
(2.25)
This expression can be rewritten, in terms of the propagator G, regulated by a frequency cutoff $\Lambda$ as we did for $T_{00}$,
\[
P(x) = \lim_{x'\to x}\,\partial_{n'}\partial_n \int_0^\infty dk\, e^{-k/\Lambda}\,\frac{k}{2\pi E(k)}\,\mathrm{Im}\,G(x',x,k). \tag{2.26}
\]
In this regulated expression we can exchange the derivatives, limit and integral safely. Below we discuss what the divergences are when $\Lambda\to\infty$ and how to interpret and dispose of them.
All the above expressions are exact. Once the propagator G is known, we can calculate the energy-momentum tensor components from them. However, as discussed above, in the interesting cases it is difficult to find an exact expression for G and some approximations must be used. For smooth impenetrable bodies we use the optical approximation to the propagator developed in Refs. [10,11] and recalled in Eq. (2.11). This gives G as a series of optical paths and hence the pressure P as a sum of contributions due to optical paths which reflect over the smooth, metallic surfaces2
\[
P \simeq \sum_r P_r, \tag{2.27}
\]
\[
P_r = (-1)^{n_r}\lim_{x'\to x}\,\partial_{n'}\partial_n \int_0^\infty dk\, e^{-k/\Lambda}\,\frac{k}{2\pi E(k)}\,\frac{\Delta_r^{1/2}(x',x)}{4\pi}\,\sin\big(k\,\ell_r(x',x)\big). \tag{2.28}
\]
An important feature of the optical approximation is that all divergences are isolated in the low reflection terms whose classical path length can vanish as $x', x \to S$. In practice only the zeroth and first reflection are potentially divergent. Before performing the integral in k and taking $\Lambda\to\infty$ we then have to put aside the divergent zero and one reflection terms $P_0$ and $P_1$ for a moment (in the next section we will show how their contributions are to be interpreted).
For the remaining families of paths (that we will denote as $r \in R$) the integral over k can be done and the limit $\Lambda\to\infty$ taken safely. The result is finite and reads
\[
P(x) = \sum_{r\in R}\,\lim_{x'\to x}\,\partial_{n'}\partial_n\,(-1)^{n_r}\,\frac{\Delta_r^{1/2}(x',x)}{8\pi^2\,\ell_r(x',x)}. \tag{2.29}
\]
2 If the conducting surfaces are rough and the average height h of the roughness is much smaller than the important wavelengths $\sim a$, then the surfaces can be considered, for what we are concerned, smooth [12]. The corrections to the Casimir energy in the optical approximation are still not known but they must be very small, $O(h^2)$, since $\langle h\rangle = 0$, the average being on regions of size $\sim a$.
We can further simplify this expression. For simplicity let us call z the normal direction. Notice that for any sufficiently smooth function $f(z',z)$ vanishing for either $z'$ or $z$ on the surface $z = 0$
\[
\partial_{z'}\partial_z f(z',z)\Big|_{z'=z=0} = \frac{1}{2}\,\partial_z^2 f(z,z)\Big|_{z=0}. \tag{2.30}
\]
The proof is trivial: consider that the lowest order term in the expansion of $f(z',z)$ near $z', z = 0$ is $\propto z'z$. The propagator $G(x',x,k)$ satisfies all these properties and hence we can use this result to get rid of the limit $x'\to x$ and assume $x' = x$ from the beginning. We can therefore rewrite Eq. (2.29) as
\[
P(x) = \sum_{r\in R} (-1)^{n_r}\,\partial_z^2\,\frac{\Delta_r^{1/2}(x,x)}{16\pi^2\,\ell_r(x,x)}. \tag{2.31}
\]
Eq. (2.31) is one of the main results of this paper. In Ref. [11] we reduced the computation
of Casimir energy to a volume integral. The force is then found by taking the derivative with
respect to the distance between the bodies. Calculating the pressure instead gives the force by
means of just a double integral of a local function. The problem is then computationally lighter
and sometimes (as we will see in the examples) can even lead to analytic results.
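The identity in Eq. (2.30), which is what allows the coincident-point form of the pressure to replace the point-split one, can be checked numerically. The following is only a sketch under our own assumptions: the test function `f` is a hypothetical example, chosen because it vanishes when either argument is zero, and the derivatives are approximated by central finite differences with an arbitrary step `h`.

```python
import math

def f(zp, z):
    # Hypothetical smooth test function, vanishing when zp = 0 or z = 0.
    return math.sin(zp) * math.sinh(z) * math.exp(0.3 * zp - 0.2 * z)

def mixed_derivative_at_origin(func, h=1e-3):
    # Central-difference estimate of d/dz' d/dz func(z', z) at z' = z = 0.
    return (func(h, h) - func(h, -h) - func(-h, h) + func(-h, -h)) / (4 * h * h)

def half_second_derivative_on_diagonal(func, h=1e-3):
    # Central-difference estimate of (1/2) d^2/dz^2 func(z, z) at z = 0.
    g = lambda z: func(z, z)
    return 0.5 * (g(h) - 2.0 * g(0.0) + g(-h)) / (h * h)

lhs = mixed_derivative_at_origin(f)
rhs = half_second_derivative_on_diagonal(f)
print(lhs, rhs)  # the two estimates should agree up to O(h^2)
```

For this particular test function both sides equal 1 analytically, so the printed values also show the size of the discretization error.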
Essentially the problem has been reduced to finding the lengths and enlargement factors associated with the optical paths for points close to the boundary. In the case of the pressure
(Eq. (2.29) or (2.31)) it is necessary to know their derivatives in the direction transverse to the
surfaces. We will see that this problem can be easily tackled numerically when it cannot be solved
analytically.
2.2. Regulate and eliminate divergences
As in the energy calculations [11], the only divergences occurring in the pressure come from paths whose lengths are $\ell \lesssim 1/\Lambda$, where $\Lambda$ is the plasma frequency of the material.3 There are only two such families of paths: the zero and one reflection paths. In this section we show that these divergent contributions are independent of the distances between the bodies. This fact is easily understood: in order for a path to have arbitrarily small length all of its points must be on the same body. So in order to study these terms we need only consider a single, isolated body (and a massless field). We are also careful in maintaining the double derivative $\partial^2_{z',z}$ since we are calculating the terms $P_0$ and $P_1$ separately.
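For $r = 0$, the zero reflection term, introducing an exponential cutoff on the material reflection coefficient we obtain
\[
P_0 = \int_0^\infty dk\, e^{-k/\Lambda}\,\frac{k}{2\pi E(k)}\,\lim_{x'\to x\in S}\,\partial_{z'}\partial_z\,\frac{\sin k|z'-z|}{4\pi|z'-z|} = \frac{\Lambda^4}{4\pi^2}. \tag{2.32}
\]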
3 It is well known that the forces between rigid bodies remain finite and do not depend on the characteristics of the
material in the perfect metal limit. On the contrary, stresses on the isolated bodies are strongly dependent on the dielectric
constants and in general diverge in the perfect metal limit [17]. However, in the case of finite dielectric constants the
calculation of the Casimir force is possible only in the parallel plates case [18]: a description of the interplay of finite
conductivity effects and geometry dependence, even within the optical approximation, is still lacking. So in this paper
we limit ourselves to the case of infinite conductivity (which is however a very good approximation for the experiments)
and we neglect the infinite self-stresses induced by this idealization.
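For $r = 1$, the one reflection term, the same steps give
\[
P_1 = -\int_0^\infty dk\, e^{-k/\Lambda}\,\frac{k}{2\pi E(k)}\,\lim_{x'\to x\in S}\,\partial_{z'}\partial_z\,\frac{\sin k|z'+z|}{4\pi|z'+z|} = \frac{\Lambda^4}{4\pi^2}. \tag{2.33}
\]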
Notice that these two terms are equal, so we could have substituted $\partial^2_{z',z} \to \frac{1}{2}\partial_z^2$ for their sum, after having properly regulated the divergence.
This positive, cutoff dependent pressure, $P \equiv P_0 + P_1$, must be dynamically balanced locally
by a pressure generated by the material, lest it collapse. Moreover the total force obtained by
integrating this quantity over the (closed) surface S of the whole body gives zero. However, if
the space around the body were inhomogeneous, as in the presence of a gravitational field, a finite
term survives the surface integration, giving rise to a vacuum Archimedes effect in which the
pressure on one side is, due to gravitational effects, larger than on the other side, so the body feels
a net force. We have analyzed this effect in detail in Ref. [20] and called it Casimir buoyancy.
Finally note that another important element of this class of quadratic operators is the Feynman propagator. In studying a field theory in a cavity or in between impenetrable bodies (for
example, hadrons as bags, photons in cavities or Bose-Einstein condensates in traps), we can
consider expanding the Feynman propagator in a series of classical optical paths reflecting off
the boundaries. The first term, related to the direct path is the familiar free propagator, the others
give the finite volume corrections.
3. Examples
In this section we calculate the Casimir force from the pressure, using the formalism developed in the previous section, for three examples that were already addressed in Ref. [11] using
the energy method.
3.1. Parallel plates
The parallel plates calculation is a classic example, whose result is well known and constitutes the basis of the widely used proximity force approximation (PFA) [16]. We use this standard
example to establish the rank among contributions to the total pressure and show the similarity
and differences with the energy method [11].
We calculate the force acting on the lower plate, denoted by d or down, by calculating the pressure on its surface. We discard the zero and 1d (one reflection on the lower plate itself) reflection terms. The first term to be considered is the path that bounces once on the upper plate (u or up), 1u. For parallel plates $\Delta = 1/\ell^2$ and we have
\[
P(x) = \sum_{r\ge 1u} (-1)^{n_r}\,\partial_z^2\,\frac{1}{16\pi^2\,\ell_r^2(x,x)}. \tag{3.1}
\]
The length $\ell_r(x,x)$ for the paths that bounce an even number of times is a constant in z and hence the derivatives vanish: they do not contribute to the pressure. This seemingly innocuous observation simplifies the calculations considerably and it is a test for any other geometry which reduces to parallel plates in some limit: in this limit the even reflections contributions must vanish. Generically their contributions are small. This parallels the role of the odd reflection paths in the energy method [11].
Fig. 1. Odd reflection paths that contribute to the Casimir force between the two plates in the pressure calculations with
the optical approximation. The points $x'$ and $x$ will eventually be taken coincident and lying on the lower plate.
Fig. 1 shows the odd reflection paths labelled with our conventions. For the path 1u we have
\[
P_{1u}(x) = -\lim_{z\to 0}\,\partial_z^2\,\frac{1}{16\pi^2(2a-2z)^2} = -\frac{3}{32\pi^2 a^4}. \tag{3.2}
\]
The next path to be considered is the path that bounces 3 times, first on d, then on u and again on d, $dud \equiv 3u$ (3 stands for 3 reflections and u for the plate where the middle reflection occurs), which gives a contribution
\[
P_{3u}(x) = -\lim_{z\to 0}\,\partial_z^2\,\frac{1}{16\pi^2(2a+2z)^2} = -\frac{3}{32\pi^2 a^4}. \tag{3.3}
\]
The two contributions Eqs. (3.2) and (3.3) are equal. The reason is easily uncovered. One can recover Eq. (3.3) from Eq. (3.2) sending $z \to -z$ but for the purpose of taking the second derivative at $z = 0$ this is irrelevant. In the same fashion $P_{3d} = P_{5d}$, $P_{5u} = P_{7u}$, etc., and hence we find
\[
P(x) = -2\left(\frac{3}{32\pi^2 a^4} + \frac{3}{32\pi^2 (2a)^4} + \frac{3}{32\pi^2 (3a)^4} + \cdots\right) = -\frac{3}{16\pi^2 a^4}\,\frac{\pi^4}{90}, \tag{3.4}
\]
which is the well-known result. Notice also that the rate of convergence is the same as in the calculation making use of the Casimir energy in Ref. [11] (the nth term contributes $1/n^4$ of the first term, in this case $1u + 3u$). These observations, which allow us to determine the rank of the contributions, are fundamental, and they apply as well to the other examples in this section.
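The rank of the contributions in Eq. (3.4) can be made concrete with a few lines of arithmetic. The sketch below assumes units $\hbar = c = 1$ and reads each pair of odd-reflection paths as contributing $-2\cdot 3/(32\pi^2 (na)^4)$; the function names are ours and purely illustrative.

```python
import math

def pair_contribution(n, a):
    # Contribution of the n-th pair of odd-reflection paths (1u+3u for n = 1,
    # 3d+5d for n = 2, ...), i.e. 2 * (-3 / (32 pi^2 (n a)^4)), with hbar = c = 1.
    return -2.0 * 3.0 / (32.0 * math.pi ** 2 * (n * a) ** 4)

def pressure_partial_sum(a, n_pairs):
    # Sum of the first n_pairs pairs of paths.
    return sum(pair_contribution(n, a) for n in range(1, n_pairs + 1))

def pressure_closed_form(a):
    # -(3 / (16 pi^2 a^4)) * (pi^4 / 90), the right-hand side of Eq. (3.4).
    return -3.0 / (16.0 * math.pi ** 2 * a ** 4) * math.pi ** 4 / 90.0

a = 1.0
exact = pressure_closed_form(a)
for n_pairs in (1, 2, 3, 10):
    partial = pressure_partial_sum(a, n_pairs)
    print(n_pairs, partial / exact)  # fraction of the full result captured so far
```

The first pair alone already accounts for a fraction $1/\zeta(4)$ of the total, and the second pair is smaller than the first by exactly 1/16.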
3.2. The Casimir torsion pendulum
In this section we study a geometry already considered in Ref. [11]: a plate inclined at an angle $\theta$ above another infinite plate. We have called this configuration a Casimir torsion pendulum because the Casimir force will generate a torque which can be experimentally measured. The configuration is analogous to the parallel plates case but the upper plate must be considered tilted at an angle $\theta$ from the horizontal. The length of the upper plate must be taken finite; we denote it by w, while the length of the lower plate can be infinite, which we choose for simplicity. There is only one substantial difference with the parallel plates case: the even reflection paths do contribute in the pendulum, since their length varies as we move the final points $x', x$.
We calculate the force exerted on the lower, infinite plate for simplicity. We then obtain the energy E, by integrating over the distance along the normal to the lower plate and from this we
The lower plate is taken infinite, the upper plate width is w, and the height of the midpoint of the upper plate above the lower plate is a. We will choose as the origin of the coordinates one point on the intersection line between the lower plate and the line obtained by prolonging the upper plate. This defines a fictitious wedge of opening angle $\theta$. We call x the horizontal and z the vertical coordinate, the third direction, along which one has translational symmetry, being y.
Since the surfaces are locally flat we have $\Delta = 1/\ell^2$ as in the case of the parallel plates, and again the odd reflections are exactly as in the case of the parallel plates. However now the even reflections contribute (the notation is the same as in the parallel plates case; in the even reflections 2u means the first reflection is on the upper plate, etc.):
T =
(3.6)
where we have grouped the terms with the symbolic notation $P_{a+b} = P_a + P_b$ when $P_a = P_b$. It is useful to recapitulate what we have learned about the rank of these contributions: $P_{1u+3u}$ dominates, $P_{3d+5d}$ is smaller by $\sim 1/16$, $P_{5u+7u}$ is smaller by $\sim 1/81$, etc. The even reflections are generically much smaller than the odd reflections, and vanish as $\theta \to 0$.
The first term in (3.6) is
\[
P_{1u+3u} = -2\,\frac{1}{16\pi^2}\,\partial_z^2\,\frac{1}{\ell_1^2(z,x)}, \tag{3.7}
\]
with $\ell_1 = 2(x\sin\theta - z/\cos\theta)$, and an overall factor of 2 takes into account the identity $P_{1u} = P_{3u}$. Taking the derivative and then setting $z = 0$ we find
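\[
P_{1u+3u} = -\frac{3}{16\pi^2}\,\frac{1}{x^4\sin^4\theta\cos^2\theta}, \tag{3.8}
\]
and integrating from $x_m = (a/\sin\theta - w/2)/\cos\theta$ to $x_M = (a/\sin\theta + w/2)/\cos\theta$ we find the force per unit length in the y direction
\[
F_{1u+3u} = -\frac{\cos\theta}{16\pi^2\sin^4\theta}\left[\frac{1}{(a/\sin\theta - w/2)^3} - \frac{1}{(a/\sin\theta + w/2)^3}\right]. \tag{3.9}
\]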
Since term by term $F = -\partial E/\partial a$ we find the first term in the optical expansion of the Casimir energy E (the arbitrary constant is chosen so that $E \to 0$ when $a \to \infty$) as
E1u+3u =
aw cos4
(3.10)
2 2 (4a 2 w 2 sin2 )
2 (4a 2 w 2 sin2 )
(3.11)
Analogously we can calculate the contribution to the pressure P of the two reflections paths 2u and 2d. Again the contributions of the two paths are identical and the result simplifies to
\[
P_{2u+2d} = 2\,\frac{1}{8\pi^2}\,\frac{1}{2}\,\partial_z^2\,\frac{1}{\ell_2^2(z,x)}, \tag{3.12}
\]
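and using $\ell_2 = 2\sqrt{x^2+z^2}\,\sin\theta$ we find
\[
P_{2u+2d} = -\frac{1}{16\pi^2\sin^2\theta}\,\frac{1}{x^4}, \tag{3.13}
\]
which integrated from $x_m = (a/\sin\theta - w/2)/\cos\theta$ to $x_M = (a/\sin\theta + w/2)/\cos\theta$ gives the force along the z axis due to these paths:
\[
F_{2u+2d} = -\frac{\cos^3\theta}{48\pi^2\sin^2\theta}\left[\frac{1}{(a/\sin\theta - w/2)^3} - \frac{1}{(a/\sin\theta + w/2)^3}\right]. \tag{3.14}
\]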
(3.15)
Notice that this expression vanishes when $\theta \to 0$, as it should since for parallel plates all the contributions of even reflections paths vanish.
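The two-reflection force can be cross-checked by integrating the local pressure numerically. The sketch below is illustrative only: it assumes $\hbar = c = 1$ and takes Eqs. (3.13) and (3.14) in the form $P_{2u+2d} = -1/(16\pi^2\sin^2\theta\,x^4)$ and $F_{2u+2d} = -\cos^3\theta/(48\pi^2\sin^2\theta)\,[\ldots]$, which is our reading of the formulas with the angle $\theta$ and the signs restored; the parameter values are arbitrary.

```python
import math

def pressure_2u2d(x, theta):
    # Local pressure from the two-reflection paths, Eq. (3.13) as read here.
    return -1.0 / (16.0 * math.pi ** 2 * math.sin(theta) ** 2 * x ** 4)

def integration_limits(a, w, theta):
    # x_m and x_M on the lower plate, as quoted in the text.
    s = a / math.sin(theta)
    return (s - w / 2.0) / math.cos(theta), (s + w / 2.0) / math.cos(theta)

def force_2u2d_closed(a, w, theta):
    # Closed form of Eq. (3.14) as read here (force per unit length along y).
    s = a / math.sin(theta)
    bracket = 1.0 / (s - w / 2.0) ** 3 - 1.0 / (s + w / 2.0) ** 3
    return -math.cos(theta) ** 3 / (48.0 * math.pi ** 2 * math.sin(theta) ** 2) * bracket

def force_2u2d_numeric(a, w, theta, n=2000):
    # Composite Simpson rule (n even) for the integral of the pressure over x.
    x_min, x_max = integration_limits(a, w, theta)
    h = (x_max - x_min) / n
    total = pressure_2u2d(x_min, theta) + pressure_2u2d(x_max, theta)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * pressure_2u2d(x_min + i * h, theta)
    return total * h / 3.0

for theta in (0.3, 0.1, 0.03, 0.01):
    # The force shrinks as the plates become parallel.
    print(theta, force_2u2d_closed(1.0, 0.5, theta))
```

For small angles the closed form behaves like $-w\sin^2\theta/(16\pi^2a^4)$, which makes the vanishing at $\theta \to 0$ explicit.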
The next term in the series is F3d+5d , whose calculation is performed in the same fashion.
The result is:
3 cos5 2
1
1
F3d+5d =
(3.16)
,
16 2 sin4 2 (a/ sin w/2)3 (a/ sin + w/2)3
5w 3 48a 2 w 2
3w
1
+
+ .
(3.17)
16 2 16a 4
32a 6
We can also present the term given by the 4 reflections paths,
cos3 2
1
1
F4u+4d =
.
16 2 4a 4
(3.18)
The terms independent of $\theta$ can be seen to reconstruct the parallel limit case $F = -(1 + 1/16 + 1/81 + \cdots)\,3/16\pi^2 a^4$.
Term by term, this series for the force reproduces the series in Ref. [11]. The series for the
energy and the torque agree as well. The results of the pressure method then coincide with those
of the energy method (as for all the examples analyzed in this paper). In Ref. [11] we discussed
at some length the predictions of the optical method for the Casimir torsion pendulum. We will
not repeat them here, referring the reader to that paper for further details.
3.3. Sphere and plane
The sphere facing a plane is an important example for several reasons: it has been analyzed
theoretically with various exact or approximate numerical techniques [8,21]; it is an experimentally relevant configuration; the exact solution is unknown and probably will escape analytical
methods for a long time to come. We have already calculated the optical approximation to the
Casimir energy in Ref. [11] up to 5 reflections. In this paper we study this problem for mainly
pedagogical purposes, leaving a more accurate and complete numerical analysis for the future.
We believe it is worth studying this example because, contrary to the previous two examples,
the enlargement factor plays an important role and moreover we will reanalyze this example
with finite temperature in Section 4.2.1.
We calculate the pressure (and by integrating, the force) exerted on the plate by the sphere which, of course, equals the force exerted by the plate on the sphere. We start from the qualitative observation that the rank of the contributions is the same as in the parallel plates case in the limit $a/R \to 0$. In all the examples we have analyzed this rank is preserved for any value of a/R. Moreover the ratios of the contributions to the force $F_{3+5}(a,R)/F_{1+3}(a,R)$, $F_4(a,R)/F_2(a,R)$, etc., decrease quickly as a/R increases, we believe due to the growing importance of the enlargement factor.
In this paper we calculate analytically the 1s term (here s stands for sphere and p for plate) and by using the relation $P_{1s+3s} \equiv P_{1s} + P_{3s} = 2P_{1s}$ proved in Section 2 (the notation is the same as in that section) we are able to include the 3s term as well.
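Using the expressions for the length and enlargement factor for the 1s path obtained in Ref. [11] we get (here $\rho$ is the radial coordinate on the plate and z the height above it)
\[
P_{1s+3s} = -2\,\frac{1}{16\pi^2}\,\partial_z^2\,\frac{\Delta_{1s}^{1/2}}{\ell_{1s}}
= -\frac{R}{32\pi^2}\,\partial_z^2\left[\big((a+R-z)^2+\rho^2\big)^{-1/2}\left(\sqrt{(a+R-z)^2+\rho^2}-R\right)^{-2}\right]_{z=0}. \tag{3.19}
\]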
The final expression for the pressure $P_{1s+3s}$ obtained after the derivatives are taken is rather long; however, the contribution to the force on the plate, $F_{1s+3s}$ (obtained by integration of $P_{1s+3s}$ over the infinite plate), is quite simple:
\[
F_{1s+3s} = 2\pi\int_0^\infty d\rho\,\rho\,P_{1s+3s} = -\frac{\hbar c R}{8\pi a^3}. \tag{3.20}
\]
This is the largest of the contributions and increasing a/R improves the convergence of the series due to the presence of the enlargement factor, so the asymptotic behavior at large a/R predicted by the optical approximation is that given by this formula, i.e. $F \sim R/a^3$ or $E \sim R/a^2$. This asymptotic law is in accordance with the numerics of Ref. [10] and the predictions of other semiclassical methods [21]. However, Eq. (3.20) is in disagreement with the Casimir-Polder law [24] which predicts $E \sim R^3/a^4$ for $a \gg R$. This is no great surprise, since our method is not valid for $a/R \gg 1$, the semiclassical reflections being corrected and eventually overshadowed by diffractive contributions [22,23].
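The step from the local pressure (3.19) to the force (3.20) can be checked numerically. The sketch below rests on several assumptions of ours: units $\hbar = c = 1$, the reading $\Delta_{1s}^{1/2}/\ell_{1s} = R/[4d(d-R)^2]$ with $d = \sqrt{(a+R-z)^2+\rho^2}$ the distance from the point to the centre of the sphere, and an ad hoc change of variable to map the infinite plate onto a finite interval. None of the function names come from the paper.

```python
import math

def pressure_1s3s(rho, a, R):
    # P_{1s+3s} = -(R / 32 pi^2) d^2/dz^2 [ 1 / (d (d - R)^2) ] at z = 0,
    # with d = sqrt((a + R - z)^2 + rho^2); hbar = c = 1.
    h = a + R
    d = math.hypot(h, rho)
    # First and second derivatives of g(d) = 1 / (d (d - R)^2) with respect to d.
    g1 = -1.0 / (d ** 2 * (d - R) ** 2) - 2.0 / (d * (d - R) ** 3)
    g2 = (2.0 / (d ** 3 * (d - R) ** 2) + 4.0 / (d ** 2 * (d - R) ** 3)
          + 6.0 / (d * (d - R) ** 4))
    # Chain rule: (dd/dz)^2 = h^2 / d^2 and d^2d/dz^2 = rho^2 / d^3 at z = 0.
    d2g_dz2 = g2 * (h / d) ** 2 + g1 * rho ** 2 / d ** 3
    return -R / (32.0 * math.pi ** 2) * d2g_dz2

def force_1s3s_numeric(a, R, n=20000):
    # Midpoint rule in u, with rho = R * u / (1 - u) mapping [0, 1) onto [0, inf).
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        rho = R * u / (1.0 - u)
        jac = R / (1.0 - u) ** 2
        total += 2.0 * math.pi * rho * pressure_1s3s(rho, a, R) * jac
    return total / n

def force_1s3s_closed(a, R):
    # Right-hand side of Eq. (3.20) with hbar = c = 1.
    return -R / (8.0 * math.pi * a ** 3)

for a in (1.0, 0.3, 0.1):
    # Ratio of the numerical integral to the closed form.
    print(a, force_1s3s_numeric(a, 1.0) / force_1s3s_closed(a, 1.0))
```

The agreement follows from the fact that the $\rho$ integral of $1/[d(d-R)^2]$ is elementary and equals $1/(a-z)$, whose second derivative at $z = 0$ is $2/a^3$.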
We have calculated the contribution of the two reflections paths analytically as well. The
calculation is more involved than the one reflection term but a big simplification occurs if one
notices that, for the purpose of taking the second derivative with respect to z at z = 0, one can
leave the reflection point on the sphere fixed. We could not prove a similar result for any other
reflection. It is certainly not true for odd reflections but one can conjecture it to be true for even
reflections. In this paper we have not calculated the 4 reflection terms and hence we could not
check this conjecture for more than 2 reflections.
And finally, we have calculated the 3p (or sps) and hence obtained the 5p, or pspsp, paths contribution $P_{3p+5p}$; in the parallel plates limit $P_{3p+5p}$ should account for $\simeq 1/16$ of the total force. This contribution, unlike the previous ones, must be calculated partly numerically, mainly
because finding the reflection point on the sphere requires the (unique) solution of a transcendental equation. This task is achieved much more quickly by a numerical algorithm than by patching
together the several branches of the analytic solution.
The total pressure is plotted in Fig. 2 while the various contributions (keeping in mind that $P_{1s+3s}$ and $P_{3p+5p}$ are negative and $P_{2+2}$ is mainly positive) are shown in Fig. 3. Fig. 2 reveals some interesting features of the pressure in this geometry: the total pressure decays very quickly with the distance as $P \sim \rho^{-\alpha}$: the exponent $\alpha$ seems to depend upon the distance a/R, but for $a/R$ of order 0.1 a good fit is obtained with $\alpha = 6$, in accordance with the asymptotic expansion of the 1 + 3 reflection term Eq. (3.19); by decreasing the distance between the sphere and the plate, the pressure becomes more and more concentrated near the tip, giving us reasons to trust our approximation and supporting the use of the PFA as a first approximation in the limit $a/R \to 0$. Fig. 3 shows the relative importance of the contributions due to the different paths. As expected the contribution to the total pressure decreases quite fast by increasing the number of reflections. In Fig. 4 one can also see that the sign of the pressure is not determined simply by the number of reflections of the underlying optical path, as for the contribution to the energy density.
Fig. 2. (Colour online.) The magnitude of the total pressure up to reflection 5p in units of $\hbar c/R^4$ as a function of the radial coordinate on the plate, $\rho/R$. Upward, or red to blue: a/R = 1, a/R = 0.1 and a/R = 0.01.
Fig. 3. (Colour online.) Contributions to the pressure in units of $\hbar c/R^4$ as a function of $\rho/R$, for fixed a/R = 0.1. Downward, or red to blue, we have $P_{1s+3s}$, $P_{3p+5p}$ and $P_{2+2}$. Although unnoticeable in this figure, the curve $P_{2+2}$ changes sign at around $\rho/R \simeq 0.4$ (see Fig. 4 for a similar situation).
By integrating the pressure over the whole plate we obtain the force F. It is useful to factor out the most divergent term of the force, as predicted by the PFA, so we define the quantity $f(a/R)$ as
\[
F(a) = -\frac{\pi^3 R}{720\,a^3}\,f(a/R). \tag{3.21}
\]
Since we include only a finite number of reflections it is convenient to factor out the constant $\zeta(4)/(1 + 1/16)$ such that f is normalized with $f(0) = 1$. The function $f(a/R)$, calculated including paths 1s, 3s, 2, 3p and 5p, is plotted in Fig. 5. When $a/R \to 0$, f is fitted by
\[
f(a/R) = 1 - 0.10\,a/R + O\big((a/R)^2\big). \tag{3.22}
\]
(3.23)
Fig. 4. Contribution of the two reflection path(s) to the pressure in units of $\hbar c/R^4$ as a function of $\rho/R$, for fixed a/R = 0.01. The pressure becomes negative, showing that the sign of the pressure is not determined by the number of reflections only.
Fig. 5. The ratio between the optical force up to the 5p reflection and the most divergent term in the PFA, as defined by
Eq. (3.21).
By neglecting the 5s + 7p reflection paths (which in the parallel plates case contribute 2% of the total force) we can only assert that the functions f in (3.22) and (3.23) represent the optical approximation with an error of 2%. When plotted on the whole range of a/R where the optical approximation is to be trusted the pressure and energy method curves never differ by more than 2%. However there is no such bound on the subleading term which, on the contrary, depends on the higher reflections contributions which have not been included in these calculations.4 With the terms calculated at this point, we cannot make a precise statement about the subleading term. We can however safely say that the subleading term a/R coefficient is quite small and our method disagrees with the PFA prediction 0.5a/R. The sphere opposite plate is such an experimentally relevant geometry that further, more accurate studies need to be performed to compare with experimental data.
In conclusion, the lessons to be learned from this example are two: (1) The calculations with
the pressure method are even quicker and simpler than the energy method and sometimes can
give analytic results for non-trivial geometries and (2) the subleading terms must be compared
only between calculations performed with the same accuracy.5
4. Casimir thermodynamics
As measurements of Casimir forces increase in accuracy they become sensitive to thermal effects. The natural scale for Casimir thermodynamics is a distance, $\tilde\beta = \hbar c/\pi T$, which at room temperature is about 2.5 microns. (To avoid confusion with the wave number k, we set Boltzmann's constant equal to unity and measure temperature in units of energy. We continue to keep $\hbar$ and c explicit.) So, assuming the corrections are of $O((a/\tilde\beta)^\alpha)$, depending on the value of $\alpha$ thermal effects might be expected between the 10% (for $\alpha = 1$) and 0.3% (for $\alpha = 4$, the standard parallel plates result) level for Casimir force measurements on the micron scale. In open geometries, like the sphere and plane, even longer distance scales are probed by Casimir effects, and this gives rise to interesting changes in the temperature dependence of the Casimir free energy in comparison with the case of parallel plates [3]. The optical approximation is well suited for discussion of thermodynamics since the thermodynamic observables, like the Casimir energy, can be expressed in terms of the propagator. Here we consider again a non-interacting, scalar field outside rigid bodies on which it obeys Dirichlet boundary conditions.
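As a quick numerical illustration of the thermal scale quoted above, the sketch below evaluates $\hbar c/(\pi k_B T)$ at room temperature. The factor of $\pi$ is our reading of the definition (it is the choice that reproduces the quoted value of about 2.5 microns), the constants are the CODATA 2018 values, and the separation used in the last two lines is a hypothetical example, not a number from the text.

```python
import math

HBAR_C_EV_M = 197.3269804e-9      # hbar * c in eV * m (CODATA 2018 value)
K_B_EV_PER_K = 8.617333262e-5     # Boltzmann constant in eV / K

def thermal_length_m(temperature_kelvin):
    # Thermal length hbar c / (pi k_B T); the factor of pi is our reading of the text.
    return HBAR_C_EV_M / (math.pi * K_B_EV_PER_K * temperature_kelvin)

def naive_thermal_correction(a_m, temperature_kelvin, alpha):
    # Order-of-magnitude estimate (a / thermal length)^alpha, with no prefactor.
    return (a_m / thermal_length_m(temperature_kelvin)) ** alpha

print(thermal_length_m(300.0) * 1e6)               # thermal length in microns
print(naive_thermal_correction(0.5e-6, 300.0, 1))  # hypothetical separation of 0.5 micron
print(naive_thermal_correction(0.5e-6, 300.0, 4))
```

The estimate carries no numerical prefactor, so it only indicates how strongly the size of the correction depends on the exponent.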
Before entering into a technical discussion of temperature effects, it is useful to anticipate one
of our central results which follows from qualitative observations alone. As $T \to 0$ the temperature effects probe ever longer distances. Even at room temperature the natural thermal scale is an
order of magnitude larger than the separation between the surfaces in present experiments (see
Ref. [5]). Since long paths contribute little to the Casimir force, we can be confident that thermal
effects vanish quickly at low temperature. However, the leading T -dependence at small T comes
from regions beyond the range of validity of the optical (or any other) approximation, so we are
4 For example, consider that including only the 1s, 3p and 2 reflections would have given a subleading term $-0.16\,a/R$ instead of $-0.10\,a/R$ in Eq. (3.22). The subleading term then changes by about 50% by adding the 3s + 5p reflection terms, which contribute only up to 8% of the total.
5 A.S. would like to thank M. Schaden and S. Fulling for conversations on this point during the workshop "Semiclassical approximations to vacuum energy" held at Texas A&M, College Station, TX, January 2005. The concerns about the errors to be associated with the optical, semiclassical or proximity force approximation are still open to debate and are strictly connected to one of the most challenging open problems in spectral theory, i.e. how to go beyond the semiclassical approximation to the density of states of a positive Hermitian operator.
unable to say definitively how they vanish for geometries where no exact solution is possible (i.e.
other than infinite parallel plates).
This section is organized as follows: First we discuss the free energy and check our methods
on the parallel plates geometry; then we discuss the temperature dependence of the pressure,
which we apply to the sphere and plate case. Finally we discuss the difficulties associated with
the $T \to 0$ limit.
4.1. Free energy
The free energy is all one needs to calculate both thermodynamic corrections to the Casimir
force and Casimir contributions to thermodynamic properties like the specific heat and pressure.
However like the Casimir energy, Casimir contributions to the specific heat, pressure, etc., are
cutoff dependent and cannot be defined (or measured) independent of the materials which make
up the full system. So we confine ourselves here to the thermal corrections to the Casimir force.
The problem of parallel plates has been addressed before and our results agree with those [3].
4.1.1. Derivation
We start from the expression of the free energy for the scalar field as a sum over modes
\[
F_{\rm tot} = -\frac{1}{\beta}\sum_n \ln\frac{e^{-\frac{1}{2}\beta\hbar\omega_n}}{1-e^{-\beta(\hbar\omega_n-\mu)}}
= \frac{1}{\beta}\sum_n \ln\left(1-e^{-\beta(\hbar\omega_n-\mu)}\right) + \sum_n \frac{1}{2}\hbar\omega_n
\equiv F + E, \tag{4.1}
\]
where $\mu$ is the chemical potential, and the last term is the Casimir energy, or the free energy at zero temperature, since $F = 0$ for $T = 0$. The Casimir energy E, being independent of the temperature, does not contribute to the thermodynamic properties of the system. It however does contribute to the pressures and forces between two bodies. The force between two bodies, say a and b, is obtained by taking the gradient of the free energy with respect to the relative distance $\vec r_{ab}$
\[
\vec f_{ab} = -\vec\nabla_{ab} F_{\rm tot}. \tag{4.2}
\]
At $T = 0$ we recover the familiar result $\vec f = -\vec\nabla E$.
Next we turn the sum over modes into a sum over optical paths. Following the same steps that led from Eq. (2.4) to Eq. (2.15) we obtain
\[
F = \frac{1}{\beta}\int d^3x \int_0^\infty dk\,\rho(x,k)\,\ln\left(1-e^{-\beta(\hbar\omega(k)-\mu)}\right), \tag{4.3}
\]
where $\rho(x,k)$ is given by Eq. (2.16). By specializing to a massless field in 3 dimensions with zero chemical potential (to mimic the photon field), and substituting the optical approximation for the propagator Eq. (2.13), we obtain the sum over paths
\[
F \simeq \sum_r F_r = \sum_{r=0}^{\infty} (-1)^r\,\frac{1}{2\pi^2\beta}\int_{D_r} d^3x\,\Delta_r^{1/2}\int_0^\infty dk\,k\,\sin(k\ell_r)\,\ln\left(1-e^{-\beta\hbar c k}\right). \tag{4.4}
\]
Here the term $F_0$, the direct path, gives the usual free energy for scalar black body radiation. Using the values for the direct path, we have $\Delta_0 = 1/\ell_0^2$ and $\ell_0 = |x'-x| \to 0$ when taking $x' \to x$. We get the familiar textbook expression
\[
F_0 = \frac{V}{\beta}\int_0^\infty dk\,\frac{k^2}{2\pi^2}\,\ln\left(1-e^{-\beta\hbar c k}\right) = -\frac{\pi^2}{90}\,\frac{V T^4}{(\hbar c)^3}. \tag{4.5}
\]
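Performing the integral over k in Eq. (4.4) we obtain
\[
F = -\frac{\hbar c}{2\pi^2\tilde\beta^3}\sum_r (-1)^r\int_{D_r} d^3x\,\Delta_r^{1/2}\,\frac{1}{2\tilde\ell_r^3}\left(-2+\tilde\ell_r\coth\tilde\ell_r+\tilde\ell_r^2\,\mathrm{csch}^2\tilde\ell_r\right), \tag{4.6}
\]
where $\tilde\ell_r = \ell_r\,\pi T/\hbar c = \ell_r/\tilde\beta$ measures the path length relative to the thermal length scale.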
Eq. (4.6) is the fundamental result of this section and gives a simple, approximate description of thermal Casimir effects for geometries where diffraction is not too important. There are no divergences in any of the $F_r$, ultraviolet or otherwise, even for the direct path (as we saw in Eq. (4.5)) and the first reflection path. All the ultraviolet divergences are contained in the Casimir energy E. Indeed, by expanding the integrand of Eq. (4.6) at short distances, i.e. $\tilde\ell_r \ll 1$, we obtain
\[
\Delta_r^{1/2}\,\frac{1}{2\tilde\ell_r^3}\left(-2+\tilde\ell_r\coth\tilde\ell_r+\tilde\ell_r^2\,\mathrm{csch}^2\tilde\ell_r\right) \simeq \Delta_r^{1/2}\left(\frac{1}{45}\,\tilde\ell_r - \frac{4}{945}\,\tilde\ell_r^3 + \cdots\right). \tag{4.7}
\]
Only the 1-reflection path length can go to zero to generate a divergence. For this contribution $\Delta_r$ diverges like $1/\ell_r^2$ as $\ell_r \to 0$; however, this is compensated by the $\tilde\ell_r$ term in (4.7), so the expression is finite and hence integrable.
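The short- and long-path behaviour of the thermal integrand can be checked directly. The sketch below assumes the dimensionless kernel $(-2 + x\coth x + x^2\,\mathrm{csch}^2 x)/(2x^3)$ with $x$ the path length in units of the thermal length, which is our reading of the integrand of Eq. (4.6) with the factor $\Delta_r^{1/2}$ stripped off; the function names are ours.

```python
import math

def thermal_kernel(x):
    # (1 / (2 x^3)) * (-2 + x coth x + x^2 csch^2 x), for x > 0.
    coth = 1.0 / math.tanh(x)
    csch2 = 1.0 / math.sinh(x) ** 2
    return (-2.0 + x * coth + x * x * csch2) / (2.0 * x ** 3)

def short_path_series(x):
    # First two terms of the small-x expansion quoted in Eq. (4.7).
    return x / 45.0 - 4.0 * x ** 3 / 945.0

def long_path_limit(x):
    # Large-x form, dropping exponentially small terms.
    return (x - 2.0) / (2.0 * x ** 3)

for x in (0.05, 0.2, 0.5):
    print(x, thermal_kernel(x), short_path_series(x))
for x in (10.0, 30.0):
    print(x, thermal_kernel(x), long_path_limit(x))
```

The kernel vanishes linearly at short paths, which is what compensates the $1/\ell_r$ growth of $\Delta_r^{1/2}$ for the one-reflection path, and it decays like $1/(2x^2)$ for long paths.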
To check for infrared divergences notice that at large distances, $\tilde\ell_r \gg 1$, the integrand of (4.6) goes like $\Delta_r^{1/2}/\tilde\ell_r^2$. For an infinite flat plate $\Delta_r \sim 1/z^2$, where z is the normal coordinate to the plate, and the integral is hence $\sim\int dz/z^3$ at large z. For finite plates the domain of integration is finite and for curved plates the enlargement factor falls even faster than $1/\ell^2$, and the integral remains convergent.
Since the integral converges in both the infrared and ultraviolet, it is safe to estimate the important regions of integration by naive dimensional analysis. This leads to the conclusion that the paths that dominate the temperature dependence of the Casimir force have lengths of order the thermal length $\tilde\beta$. High temperature implies short paths. Very low temperatures are sensitive to very long paths. Long paths involve either paths experiencing many reflections, which are sensitive to the actual dynamics at and inside the metallic surface, or paths making long excursions in an open geometry, which are sensitive to diffraction. Either way, low temperatures will present a challenge.
4.1.2. Parallel plates
We know that in the limit of infinite, parallel plates the optical approximation to the propagator becomes exact. Hence our method gives another way to calculate the free energy of this
configuration of conductors. It is convenient to study this example to check against known results
and to prepare the way for a study of the $T \to 0$ limit.
We recall that for this configuration the expression for the enlargement factor is $\Delta = 1/\ell^2$ and the lengths are given by $\ell_{2n} = 2na$ (where a is the distance between the plates) and $\ell_{2n+1,u} = 2(a-z) + 2na$, $\ell_{2n+1,d} = 2z + 2na$; the notation is the same as in Section 3.1 and should at this point be familiar to the reader.
As in the zero temperature case it is useful to consider even and odd reflection contributions separately and, as for the zero temperature case, the sum over odd reflections turns into an integral over z from 0 to $\infty$,
\[
F_{\rm odd} = \sum_{n=0}^{\infty}\left(F_{2n+1,d} + F_{2n+1,u}\right)
= \frac{\hbar c}{2\pi^2\tilde\beta^3}\,S\int_0^\infty dx\,\frac{1}{2x^4}\left(-2 + x\coth x + x^2\,\mathrm{csch}^2 x\right), \tag{4.8}
\]
where $x = 2z/\tilde\beta$ and S is the area of the plate. The definite integral can be easily performed numerically and its value is $\nu = 0.06089\ldots$,
\[
F_{\rm odd} = 2\nu\,\frac{\hbar c}{4\pi^2\tilde\beta^3}\,S = \frac{\pi\nu\,T^3}{2(\hbar c)^2}\,S, \tag{4.9}
\]
which is independent of the separation, a, and therefore does not contribute to the force.
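The value $\nu = 0.06089\ldots$ is easy to reproduce. The sketch below assumes the integrand of Eq. (4.8) in the form $(-2 + x\coth x + x^2\,\mathrm{csch}^2 x)/(2x^4)$, uses a plain Simpson rule with an analytic tail, and switches to a series near $x = 0$ to avoid cancellation errors. The comparison with $\zeta(3)/(2\pi^2)$ is our own cross-check and is not a statement taken from the text.

```python
import math

def integrand(x):
    # (1 / (2 x^4)) * (-2 + x coth x + x^2 csch^2 x), the integrand of Eq. (4.8).
    if x < 1e-2:
        # Series form near x = 0, used to avoid loss of precision by cancellation.
        return 1.0 / 45.0 - 4.0 * x * x / 945.0
    coth = 1.0 / math.tanh(x)
    csch2 = 1.0 / math.sinh(x) ** 2
    return (-2.0 + x * coth + x * x * csch2) / (2.0 * x ** 4)

def nu(x_max=40.0, n=40000):
    # Composite Simpson rule (n even) on [0, x_max] plus the analytic tail of
    # (x - 2) / (2 x^4), which is what the integrand reduces to at large x.
    h = x_max / n
    total = integrand(0.0) + integrand(x_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    tail = 1.0 / (4.0 * x_max ** 2) - 1.0 / (3.0 * x_max ** 3)
    return total * h / 3.0 + tail

def zeta3(terms=100000):
    # Apery's constant by direct summation, with an integral estimate of the remainder.
    return sum(1.0 / k ** 3 for k in range(1, terms + 1)) + 1.0 / (2.0 * terms ** 2)

print(nu())                              # numerical value of the integral
print(zeta3() / (2.0 * math.pi ** 2))    # cross-check value
```

Because the result does not depend on the separation a, this term only shifts the free energy and drops out of the force, as stated above.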
Let us turn now to the even reflection paths. They have constant length 2na, so the volume integral simply yields the volume between the surfaces $V = Sa$. We already calculated the zero-reflection term $F_0$ in Eq. (4.5). The remaining even reflection contributions (2, 4, 6, ... reflections) $F_{{\rm even},\,r\ge 2}$ can be written as an infinite sum
\[
F_{{\rm even},\,r\ge 2} = -2\,\frac{\hbar c}{2\pi^2\tilde\beta^4}\,Sa\sum_{n=1}^{\infty}\frac{1}{2x_n^4}\left(-2 + x_n\coth x_n + x_n^2\,\mathrm{csch}^2 x_n\right), \tag{4.10}
\]
where $x_n = 2na/\tilde\beta \equiv n\tau$ (this defines the dimensionless temperature $\tau$) and we have introduced an overall factor of two to take into account the multiplicity of the paths. Thus the total free energy for parallel plates is the sum of $F_0$ (Eq. (4.5)) and the results of Eqs. (4.9) and (4.10),
\[
F = -\frac{\pi^2}{90}\,\frac{V T^4}{(\hbar c)^3} + \frac{\pi\nu\,T^3}{2(\hbar c)^2}\,S - \frac{\hbar c}{\pi^2\tilde\beta^4}\,Sa\sum_{n=1}^{\infty}\frac{1}{2x_n^4}\left(-2 + x_n\coth x_n + x_n^2\,\mathrm{csch}^2 x_n\right). \tag{4.11}
\]
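It is not possible to rewrite F in a closed form, but the sum is easy to compute numerically and the high and low temperature expansions are easy to obtain analytically. At high temperatures (and fixed a) $\tau \to \infty$, and the summand g(n) in Eq. (4.11) falls rapidly enough with n,
\[
g(n) = \frac{1}{2(\tau n)^4}\left(-2 + (\tau n)\coth(\tau n) + (\tau n)^2\,\mathrm{csch}^2(\tau n)\right)
= \frac{1}{2(\tau n)^4}\left[-2 + \tau n\right] + O\!\left(e^{-\tau n}\right), \tag{4.12}
\]
that the limit may be taken under the summation, with the result,
\[
F_{{\rm even},\,r\ge 2} \simeq -\frac{\hbar c}{\pi^2\tilde\beta^4}\,Sa\sum_{n=1}^{\infty}\left(\frac{1}{2n^3\tau^3} - \frac{1}{\tau^4 n^4}\right)
= -\frac{\zeta(3)}{16\pi a^2}\,ST + \frac{\pi^2\hbar c}{1440\,a^3}\,S. \tag{4.13}
\]
Notice that the second term cancels the even paths contribution to the Casimir energy. Hence the final expression for the high T expansion of the free energy is particularly simple,
\[
F_{\rm tot} = F + E = -\frac{\pi^2}{90(\hbar c)^3}\,V T^4 + \frac{\pi\nu}{2(\hbar c)^2}\,S T^3 - \frac{\zeta(3)}{16\pi a^2}\,S T + O\!\left(e^{-Ta/\hbar c}\right). \tag{4.14}
\]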
The first term is the usual black body contribution to the bulk free energy. It does not contribute to the force. The second term is also independent of $a$ and does not give rise to any force. The third term instead gives the thermal Casimir force. Notice that $\hbar c$ has disappeared from this expression. Called the classical limit, this high temperature behavior has been noted before and some early results are even due to Einstein (in [26, p. 2]; see also [13]). In the next section, after the thermal corrections to the pressure are calculated, we show how to extend this result to other geometries.
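As a rough numerical cross-check of the high temperature expansion, the following Python sketch sums the summand of Eq. (4.12) directly and compares it with the power-law part that is kept in Eq. (4.13). The function names, the cutoff `n_max` and the switch-over point `x > 20` are illustrative choices and are not taken from the text.

```python
import math

def g(x):
    """Summand of Eq. (4.12), written as a function of x = tau * n."""
    if x > 20.0:
        # coth x -> 1 and csch^2 x -> 0 up to terms of order exp(-2x);
        # this branch also avoids overflow in sinh for very large x.
        return (-2.0 + x) / (2.0 * x**4)
    coth = 1.0 / math.tanh(x)
    csch2 = 1.0 / math.sinh(x) ** 2
    return (-2.0 + x * coth + x * x * csch2) / (2.0 * x**4)

def even_sum(tau, n_max=20000):
    """Direct partial sum of g(tau * n) for n = 1..n_max."""
    return sum(g(tau * n) for n in range(1, n_max + 1))

def power_law_sum(tau, n_max=20000):
    """Same partial sum with g replaced by 1/(2 x^3) - 1/x^4."""
    return sum(1.0 / (2.0 * (tau * n) ** 3) - 1.0 / (tau * n) ** 4
               for n in range(1, n_max + 1))

tau = 10.0  # "high temperature": tau much larger than 1
exact = even_sum(tau)
approx = power_law_sum(tau)
rel_diff = abs(exact - approx) / abs(approx)
print(exact, approx, rel_diff)
```

For large `tau` the two sums should differ only by exponentially small terms, which is the statement that the limit may be taken under the summation.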
Note some interesting features of the $T\to\infty$ limit: First, the sum over paths converges like the sum of $(1/n)^3$ as indicated by the appearance of $\zeta(3)$. While slower than the $T = 0$ convergence, it is still rapid enough to obtain a good approximation from low reflections. Second, note that the $T\to\infty$ problem in 3-dimensions corresponds exactly to a $T = 0$ problem in 2-dimensions. This is an example of the familiar dimensional reduction expected as $T\to\infty$. We can give a short proof of this result. Let us first write:
\[ F = -\frac{1}{\beta}\log Z \tag{4.15} \]
where $Z$ is the partition function. We need to evaluate $Z$ to the lowest order in $\beta$ when $\beta\to0$. The thermal scalar field theory can be written as a free theory on the cylinder $R^3\times[0,\beta]$. For $\beta\to0$ the dynamics along the thermal coordinate is frozen in the ground state, with energy $E_0 = 0$, where $\phi$ does not depend on the thermal coordinate. The partition function $Z$ is now $Z = Z_3 + O(e^{-\beta E_1})$ where $E_1$ is the energy of the first excited state, $E_1\sim1/\beta^2$, and $Z_3$ is the partition function of the remaining three-dimensional problem in $R^3$. If the conductors' geometry is symmetric along one spatial coordinate, say $x$ (in the parallel plates problem we have two of these directions, $x$ and $y$), this can now be interpreted as a Euclideanized time variable extending from 0 to $L_x/c$. So we will write $Z_3 = Z_{2+1} = e^{-\frac{1}{\hbar}E_2L_x/c}$ where $E_2$ is the Casimir energy of the 2-dimensional problem of two lines of length $L_y$, a distance $a$ apart. The free energy $F$ is then:
\[ F = -\frac{1}{\beta}\log Z \simeq -\frac{1}{\beta}\log Z_{2+1} = T\,\frac{1}{\hbar}\frac{L_x}{c}\,E_2 = -T L_xL_y\,\frac{\zeta(3)}{16\pi a^2}. \tag{4.16} \]
Since $S = L_xL_y$ this is exactly the $a$-dependent term in Eq. (4.14). If the geometry is not translationally invariant then we can only say from Eq. (4.16) that the free energy is linear in $T$ (since $Z_{2+1}$ is independent of $\beta$). Later, by using the optical approximation we will find an explicit analytic expression valid also for non-symmetric, smooth geometries.
For low temperatures, $\tau\to0$, the terms in the $n$-sum in Eq. (4.11) differ very little from each other so we can use the Euler-Maclaurin formula [25],
\[ \sum_{n=1}^{\infty} g(n) = \int_0^{\infty} dx\,g(x) - \frac{1}{2}g(0) - \frac{1}{12}g'(0) + \cdots = \frac{\nu}{\tau} - \frac{1}{90} + O(\tau). \tag{4.17} \]
Substituting into Eq. (4.11) we find that the first term in Eq. (4.17) cancels the sum over odd
reflections (the second term in Eq. (4.11)) and that the second term in Eq. (4.17) combines with
F0 to give a very simple result,
\[ F_{\rm tot} = E - \frac{(V - Sa)\,\pi^2T^4}{90(\hbar c)^3} \tag{4.18} \]
at low temperatures. This has a simple physical interpretation: the typical thermal excitations of the field at low temperature have very long wavelengths, so it is energetically unfavorable for them to live between the two plates. As a result the only modification of the $T = 0$ result is
to exclude from the standard black body free energy the contribution from the volume between
the plates. One could imagine measuring this effect as a diminished heat capacity for a stack of
conducting plates inside a cavity.
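The low temperature statement can be checked numerically in the same spirit. The sketch below compares a direct sum of the summand of Eq. (4.11) with the first two Euler-Maclaurin terms of Eq. (4.17). It assumes that the constant $\nu$ can be identified with $\zeta(3)/(2\pi^2)$, which reproduces the quoted value 0.06089...; this identification, the names and the cutoff `n_max` are assumptions made for illustration.

```python
import math

ZETA3 = 1.2020569031595943           # zeta(3)
NU = ZETA3 / (2.0 * math.pi**2)      # assumed closed form of nu = 0.06089...

def g(x):
    """Summand of Eq. (4.11) as a function of x = tau * n (x > 0)."""
    if x > 20.0:
        # Exponentially accurate large-x form, avoids overflow in sinh.
        return (-2.0 + x) / (2.0 * x**4)
    coth = 1.0 / math.tanh(x)
    csch2 = 1.0 / math.sinh(x) ** 2
    return (-2.0 + x * coth + x * x * csch2) / (2.0 * x**4)

def direct_sum(tau, n_max=200000):
    """Partial sum of g(tau * n) for n = 1..n_max."""
    return sum(g(tau * n) for n in range(1, n_max + 1))

tau = 0.5                                  # a moderately low temperature
direct = direct_sum(tau)
euler_maclaurin = NU / tau - 1.0 / 90.0    # integral term and -g(0)/2
print(direct, euler_maclaurin)
```

The term `-1/90` comes from $g(0) = 1/45$; it is the piece that removes the volume $Sa$ from the black body term in Eq. (4.18).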
The low temperature result, Eq. (4.18), is deceptively simple. Its simplicity obscures an underlying problem with the $T\to0$ limit. We postpone further discussion until we have explored the temperature dependence of the pressure. Suffice it to say for the moment that Eq. (4.18) probably does not apply to realistic conductors with finite absorption, surface roughness, and other non-ideal characteristics.
4.2. Temperature dependence of the pressure
In this section we will obtain the temperature dependence of the pressure within our approximation and apply it to a preliminary study of the sphere and plate case. To begin, we calculate
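the thermal average of an operator $O$ quadratic in the real scalar field $\phi$. The average of a generic operator $O$ is given by the trace over a complete set of eigenstates $|\psi_\alpha\rangle$ of the Hamiltonian weighted by a Boltzmann factor:
\[ \langle O\rangle_T = \sum_\alpha e^{-\beta E_\alpha}\langle\psi_\alpha|O|\psi_\alpha\rangle. \tag{4.19} \]
\[ \sum_j O_j\,\langle 2n_j + 1\rangle_T = \sum_j O_j\,\frac{1 + e^{-\beta E_j}}{1 - e^{-\beta E_j}} \tag{4.20} \]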
where $\langle\;\rangle_T$ denotes the thermal average, $j$ labels the normal modes $\psi_j$ (cf. Section 2), $n_j$ is the occupation number of the mode $j$ and $E_j$ its energy. The quantities $O_j$ are read from the decomposition of the diagonal part of the operator $O$ written as $O_{\rm diag} = \sum_j O_j\,(a_j^\dagger a_j + a_j a_j^\dagger)$ where $a_j$ is the annihilation operator of the mode $j$.
The $O_j$ for the pressure can be read easily from the analysis in Section 2:
\[ P_j = \lim_{x'\to x\in S}\frac{1}{4E_j}\,\partial_{n'}\partial_n\,\psi_j(x')\psi_j(x). \tag{4.21} \]
1
1 + eEj
n n j (x )j (x)
4Ej
x x
1 eEj
P (x S) = lim
= lim
x x
n n
dk ek/
= Im
dk e
k/
k
1 + eE(k)
Im G(x , x, k)
2E(k)
1 eE(k)
k
1 + eE(k)
1 2
G(x, x, k)
2E(k) 2 n
1 eE(k)
(4.22)
(1)nr n2
rR
1/2
r
1
/
)
,
coth(
r
16 2
(4.23)
where it is understood that the zeroth and first reflection terms, which contribute to the pressure
on each surface individually, but not to the force between surfaces, have been dropped.
Before applying this to the sphere and plate problem, let us again look at the limiting behavior as $T\to\infty$ and $T\to0$, and draw some conclusions independent of the detailed geometry. First consider $T\to\infty$. The shortest paths in the sum in Eq. (4.23) are of order $a$, the intersurface separation. (Remember that the optical approximation is accurate as long as the important paths are short compared to $R$, a typical radius of curvature of the surfaces.) At high $T$ we can take the $\beta\to0$ limit under the sum over reflections since the resulting sum still converges. Therefore low reflections dominate, and we can see, retrospectively, that the high temperature approximation applies when $\beta/a\to0$. So as $T\to\infty$,
1/2
1
1
r /
nr 2 r
P=
(1) n
.
+O
e
16 2
(4.24)
This limit has been called (it has been previously found for the parallel plates case) the classical limit [3,13,26], since the final expression for high temperatures, reinserting $\hbar$ and $c$,
P
1/2
r
(1)nr n2
T
16
r
(4.25)
is independent of $\hbar$ and $c$ apart from exponentially small terms. This expression amounts to neglecting the 1 in the expression $\langle 2n_j + 1\rangle_T$, corresponding to normal ordering or neglecting the contribution of the vacuum state.
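The thermal weight in Eq. (4.20) is simple enough to tabulate. The following Python sketch evaluates $(1+e^{-E/T})/(1-e^{-E/T})$ for a single mode and compares it with the leading high temperature behavior $2T/E$, which is the origin of the linear dependence on $T$ in the classical limit. Units with $k_B = 1$, the function names and the sample temperatures are assumptions made for illustration.

```python
import math

def thermal_factor(energy, temperature):
    """<2 n_j + 1>_T for one bosonic mode, in units with k_B = 1."""
    boltz = math.exp(-energy / temperature)
    return (1.0 + boltz) / (1.0 - boltz)

def classical_factor(energy, temperature):
    """Leading high temperature behavior of the same quantity, 2 T / E."""
    return 2.0 * temperature / energy

energy = 1.0
for temperature in (0.1, 1.0, 10.0, 100.0):
    exact = thermal_factor(energy, temperature)
    print(temperature, exact, classical_factor(energy, temperature))
```

At low temperature the factor tends to 1 (the vacuum contribution), while for $T \gg E$ it approaches $2T/E$.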
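At low temperatures, $\beta\to\infty$, it is not possible to interchange the limit with the sum. The relevant quantity is $\frac{1}{\tilde\beta}\coth(\ell_r/\tilde\beta)$, which goes like
\[ \frac{1}{\tilde\beta}\coth\frac{\ell_r}{\tilde\beta} = \frac{1}{\ell_r} + \frac{\ell_r}{3\tilde\beta^2} - \frac{\ell_r^3}{45\tilde\beta^4} + O\!\left(\frac{\ell_r^5}{\tilde\beta^6}\right) \tag{4.26} \]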
as $\beta\to\infty$. The first term yields the familiar $T = 0$ expression. The others would give divergent contributions because of the factors of $\ell_r$ in the numerators (even after the inclusion of the enlargement factor $\Delta_r$). Of course the sum over reflections of the difference, $\frac{1}{\tilde\beta}\coth(\ell_r/\tilde\beta)$
must have $\tilde\beta\ll R$ and $R\gg a$ in order to obtain reliable results from the optical approximation. Fortunately this is a region of experimental interest: present experiments use, for example, $a\approx0.5\ \mu$m, $R\approx100\ \mu$m, and at room temperature, $\tilde\beta\approx2.5\ \mu$m. In this regime the optical approximation should give a good description of the thermal corrections to the force between perfectly reflective, perfectly smooth conductors.
The expression for the pressure is given by Eq. (4.23), the enlargement factors and lengths are
the same as in the T = 0 case. By applying Eq. (4.23) to the 1s + 3s paths we find the results in
Fig. 6. Notice that at high temperatures increasing the temperature essentially scales the whole
plot proportionally to T . The force is then linearly dependent on the temperature (this is the
classical limit already discussed in Section 4.2). More details are given in the caption of Fig. 6.
\[ F = -\frac{\hbar c\,\pi^3 R}{720a^3}\,f(a/R, \tilde\beta/R). \tag{4.27} \]
Fig. 6. (Colour online.) The dependence of the 1s + 3s contribution to the pressure P1s+3s for the sphere and the
plate in units of h c/R 4 for various temperatures. Two effects must be noticed. The top 3 curves (in blue) show the
high-temperature region where the pressure is proportional to T (notice the logarithmic scale). The two lower curves (in
orange and red) show the low-temperature region when increasing the temperature changes the asymptotic behavior of
while for small the behavior reduces to the zero-temperature limit.
P for large (i.e. )
8a
8a
Unfortunately there is no such simple closed expression for higher reflection terms (nor for this first term at arbitrary $T$). However, if one believes that the rank of contributions is similar to the parallel plates case one should feel safe to say that this truncation captures the optical approximation to within $\zeta(3) - 1 \simeq 20\%$. Hence our statements are at least qualitatively correct.
This expression for the force gives a prediction for the function $f$, defined in Eq. (4.27). At this level of accuracy (1s + 3s reflection) and for $a/\tilde\beta\gg1$, apart from exponentially small terms in the temperature expansion we have
\[ f_{1s+3s} \simeq \frac{90}{\pi^4}\,\frac{a}{R}\,\frac{R}{\tilde\beta} \tag{4.29} \]
which grows linearly in $a/R$ and is (interestingly enough) independent of $R$. This is evident in Fig. 7 for the curves with $\tilde\beta/R = 1/8, 1/16$. For higher $\tilde\beta/R$ the linear growth starts at higher values of $a$ not shown in Fig. 7. Moreover the exponential accuracy manifests itself in the sudden change
temperature the function $f(a/R)$ will deviate from its zero-temperature behavior at $a\sim\hbar c/T$. The deviation will be in the upward direction, increasing the attractive force between the bodies. Eventually, for sufficiently large distances, the high temperature behavior given by Eq. (4.25) (or (4.29) for the sphere-plane problem) will be recovered.
4.3. Thermal corrections at low temperatures
The preceding examples have made it clear that in the language of the optical approximation, thermal corrections at low temperature arise from very long paths, $\ell_r\sim\beta$. This can be seen from the general form of the free energy, Eq. (4.6), or in the attempt to take the $\beta\to\infty$ limit under the summation in Eq. (4.23), which fails because of the expansion, Eq. (4.26). Here we examine this non-uniformity more carefully in general and in particular for the parallel plate case, where all the expressions are available. We then attempt to draw some conclusions about the magnitude of corrections at low temperature and the possibility of calculating them reliably in a model that idealizes the behavior of materials.
We return to Eq. (4.22), which gives the exact expression for the pressure, and separate out
the thermal contribution,
P (T ) P (0) P = Im
dk
0
1 2
e hck
,
n n G(x , x, k)2
2
1 e hck
(4.30)
1
P = Im
.
dk n2 n G(x , x, k)em hck
(4.31)
m=1 0
Each term in the sum is a Laplace transform of the Green's function. Clearly, as $\beta\to\infty$ the frequencies that dominate this integral are $k\sim1/\beta\sim T$.
What are the low frequency contributions to $G(x', x, k)$? In the ideal case of infinite, perfectly conducting, parallel plates, there is a gap in the spectrum at low $k$: $k\ge\pi/a$. However in realistic situations the plates are finite and/or curved, the geometry is open, and there is no gap in the spectrum. The low-$k$ part of the spectrum is sensitive to the global geometry, including edges and curvature, and to the low frequency properties of the material. If the conditions are close to the ideal, the contributions to $\delta P$ from small $k$ may be small. However as $T\to0$, they dominate. We conclude that the $T\to0$ behavior of $\delta P$ cannot be calculated for realistic situations.
The optical approximation does not take account of diffraction, and cannot accurately describe the $T\to0$ limit. Nevertheless it is interesting to see how it fails, since this sheds light on the problem in general. Substituting the optical expansion for the Green's function (replacing $\partial^2_{n'n}$ by $\frac{1}{2}\partial_z^2$ and setting $\hbar = c = 1$) we find
1 2
1/2
dk r sin(k
r )emk
8 2 z
P =
m=1 r1
m=1 r1
1 2 1/2
r
z r
.
2
2
8
m 2 +
2r
(4.32)
The problems with $T\to0$ are quite apparent: as $\beta\to\infty$ all paths become important.
Next we specialize to parallel plates where
r = (2ar
2z). The derivative can be carried out
explicitly. For simplicity we focus on m = 1 (P =
m=1 Pm ),
P1 =
2 12(ar)2 2
,
2
(4(ar)2 + 2 )3
(4.33)
r=1
2 2 3 2 r 2 2
.
4
( 2 r 2 + 2 )3
(4.34)
r=1
1
3
1
2
P1 = 2
coth
csch
.
2a
2a
4 8a 3
(4.35)
The second term in brackets is exponentially small as $\beta\to\infty$. If we ignore it, restore the $m$-dependence, and sum over $m$, we obtain
P =
2
90 4
(4.36)
2 (1 + 2 X 2 / 2 )2 2 (1 + 2 X 2 / 2 )3
where the omitted terms are higher Euler-Maclaurin contributions that are unimportant as $\beta\to\infty$ (i.e. $T\to0$).
If the upper limit on the sum, $X$, is taken to $\infty$, only the first term, 1/2, survives and gives the expected result. The question is: How large must $X$ be before the limiting behavior sets in?
Dropping the third term in Eq. (4.37), which is subdominant, we can rewrite $\delta P_1$ as
2
X
1
2
1 1
P1 = 2 4
(4.38)
+
f
(
X)
.
=
2 (1 + X 2 2 / 2 )2
2 4 2
The function $f(z)$ is negative definite and has a minimum at $z = 1/(2\sqrt3)\approx0.29$ where it takes the value $-3^{3/2}/32\approx-0.16$. So in order for the result Eq. (4.36) to be valid we must include $X\gg X_c = \pi/(\tau\sqrt3)$ terms in the sum. For example, in a typical experimental situation we have $a = 0.5\ \mu$m and $T = 300$ K so $\beta = 8\ \mu$m, $\tilde\beta = 8/\pi\ \mu$m and $X_c = 8/\sqrt3 = 4.6$. In this case it is necessary to go to $X\sim20$ before the contribution of the $f$ term is smaller than 1/2. This means paths with 40 reflections and path lengths of order 20 $\mu$m. With 40 chances to sample the surface dynamics of the material and paths of 20 $\mu$m available to wander away from the parallel plate regime, the idealizations behind the standard parallel plates calculation must be called into question.
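The numbers quoted in this estimate are easy to reproduce. The sketch below evaluates the thermal length $\beta = \hbar c/(k_BT)$ at room temperature and the resulting critical number of terms. It assumes that $X_c$ can be written as $\beta/(2a\sqrt3)$, which reproduces the quoted $8/\sqrt3\approx4.6$ when $\beta$ is rounded to 8 $\mu$m, and that the reduced length is $\beta/\pi$; both are assumptions, as are all names in the code.

```python
import math

HBAR_C_EV_UM = 0.1973269804     # hbar * c in eV * micrometre
K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV / K

def thermal_length_um(temperature_kelvin):
    """beta = hbar c / (k_B T), returned in micrometres."""
    return HBAR_C_EV_UM / (K_B_EV_PER_K * temperature_kelvin)

a = 0.5                                   # plate separation in micrometres
beta = thermal_length_um(300.0)           # close to the 8 um used in the text
beta_tilde = beta / math.pi               # assumed reduced length beta / pi
x_c = beta / (2.0 * a * math.sqrt(3.0))   # assumed form of X_c

z_min = 1.0 / (2.0 * math.sqrt(3.0))      # location of the minimum of f
f_min = -3.0**1.5 / 32.0                  # value of f at its minimum
print(beta, beta_tilde, x_c, z_min, f_min)
```

With the unrounded value of $\beta$ the critical number of terms comes out slightly below the 4.6 quoted above, which does not change the conclusion.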
It must be said however that in modern experiments the temperature corrections are at most of the order of a few percent at $a\sim1\ \mu$m and vanish when $a\to0$. Nonetheless we want to point out that there is a conceptual difference between formulations based on the infinite parallel plates approximation, extended to curved geometries by means of the PFA, and a derivation (like ours) in which the curvature is inserted ab initio. The thermal and curvature scales interplay in a way that the usual derivations [3,4] could not possibly capture, giving rise to different power law corrections in $a/\beta$. It is worth reminding the reader that the usual numerical estimates of thermal corrections are based on the infinite parallel plates power law $(a/\beta)^4$. A smaller power like $(a/\beta)^2$ would give a much bigger upper bound.
To summarize: temperature corrections are small at small $T$, but the existing methods of calculating them, including both our optical approximation and the traditional parallel plates idealization, cannot be trusted to give a reliable estimate of the $T$-dependence at small $T$.
5. Conclusions
In this paper we have shown how to adapt the optical approximation to the study of local
observables. We have illustrated the method by studying the pressure, but the method applies as
well to other components of the stress tensor, to charge densities, or any quantity that can be written in terms of the single particle Green's function. The advantage of the optical approximation is
to extend the study of these local observables to novel geometries. In particular we developed an
expression for the Casimir pressure on the bodies and applied our main result Eq. (2.31) to the
study of three important examples: parallel plates, the Casimir pendulum and a sphere opposite
a plate.
We have also shown how to calculate, within this approximation scheme, thermodynamic quantities and thermal corrections to the pressure in the general case, and applied our results to the example of parallel plates (retrieving the known results) and to the case of a sphere opposite a plate. Along the way we have given a proof of the classical limit of the Casimir force for any geometry (within our approximation), i.e. the fact that Casimir forces at high temperatures are proportional to the temperature and independent of $\hbar$, a fact that previously was known only for parallel plates.
Finally, we argued that all known methods of computing the temperature dependence of the Casimir effect are suspect as $T\to0$.
Acknowledgements
We would like to thank S. Fulling and M. Schaden for comments. A.S. would like to thank
M.V. Berry for useful discussions. R.L.J. would like to thank the Rockefeller Foundation for
a residency at the Bellagio Study and Conference Center on Lake Como, Italy, where much
of this work was performed. This work is also supported in part by funds provided by the US
Department of Energy (D.O.E.) under cooperative research agreement DE-FC02-94ER40818.
References
[1] H.B.G. Casimir, Proc. K. Ned. Akad. Wet. 51 (1948) 793.
[2] E.M. Lifshitz, Sov. Phys. JETP 2 (1956) 73;
I.E. Dzyaloshinskii, E.M. Lifshitz, L.P. Pitaevskii, Sov. Phys. Usp. 4 (1961) 152;
L.D. Landau, E.M. Lifshitz, Electrodynamics of Continuous Media, Pergamon, Oxford, 1960.
[3] V.M. Mostepanenko, N.N. Trunov, The Casimir Effect and Its Applications, Oxford Univ. Press, 1997.
[4] R.S. Decca, E. Fischbach, G.L. Klimchitskaya, D.E. Krause, D.L. Lopez, V.M. Mostepanenko, Phys. Rev. D 68
(2003) 116003, hep-ph/0310157.
[5] S.K. Lamoreaux, Phys. Rev. Lett. 78 (1997) 5;
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
Received 9 May 2005; received in revised form 16 February 2006; accepted 27 February 2006
Available online 20 March 2006
Abstract
The generic structure of 4-point functions of fields residing in indecomposable representations of arbitrary rank is given. The algorithm used is described and we present all results for Jordan-rank r = 2 and r = 3, where we make use of permutation symmetry and use a graphical representation for the results. A number of remaining degrees of freedom which can show up in the correlator are discussed in detail. Finally we present the results for two-logarithmic fields for arbitrary Jordan-rank.
© 2006 Elsevier B.V. All rights reserved.
Logarithmic conformal field theories are a generalization of conformal field theories (CFTs)
in the sense that CFTs are LCFTs of Jordan-rank one. Since the works of Belavin, Polyakov and
Zamolodchikov in 1984 [6] a powerful machinery of tools, algorithms and definitions has been
developed, which nowadays is indispensable for analyzing conformal field theories. These definitions and techniques include characters, null vectors, operator product expansions (OPEs) and
correlation functions, to name only a few. With the rise of LCFTs the demand for porting and
generalizing these tools to LCFTs became an important endeavor. Today, porting of definitions
and techniques from CFTs to LCFTs is almost finished, cf. [7,8] and references therein. Nevertheless there exist still some areas which are not well-understood, such as modular properties of
characters and partition functions.
In the course of the paper we want to discuss the generic form of four-point correlation functions, which is fixed by global conformal invariance, in the case of LCFTs. The solution for this problem in the case of CFTs is well known, but in the case of LCFTs only incomplete results exist so far. An example where four-point correlators play a role is given by Abelian sandpile models, which can be described by a $c = -2$ LCFT, e.g. [9,10].
In the case of ordinary conformal field theory (CFT) it is known that every correlation function has to fulfill the so-called global conformal Ward identities (GCWI) as a consequence of invariance under global conformal transformations:
\[ L_q\langle\Psi_{(h_1)}(z_1)\ldots\Psi_{(h_n)}(z_n)\rangle = 0,\qquad q = -1, 0, 1, \tag{1} \]
where $\Psi_{(h)}(z)$ is a primary field and hence
\[ L_q\langle\Psi_{(h_1)}(z_1)\ldots\Psi_{(h_n)}(z_n)\rangle = \sum_{i=1}^{n} z_i^q\left[z_i\partial_i + (q+1)h_i\right]\langle\Psi_{(h_1)}(z_1)\ldots\Psi_{(h_n)}(z_n)\rangle. \tag{2} \]
When considering logarithmic conformal field theories, primary fields appear together with so-called logarithmic partner fields which, in the simplest case, form indecomposable representations in the form of Jordan cells. Then, these equations have to be slightly altered, cf. [11], by adding an additional term to the GCWI, leading to the generalized global conformal Ward identities:
\[ \sum_{i=1}^{n} z_i^q\left[z_i\partial_i + (q+1)(h_i + \delta_i)\right]\langle\Psi_{(h_1,k_1)}(z_1)\ldots\Psi_{(h_n,k_n)}(z_n)\rangle = 0, \tag{3} \]
where $\Psi_{(h_i,k_i)}(z_i)$ denotes a logarithmic field of Jordan-level $k_i$, or a primary field in case $k_i = 0$. The operator $\delta_i$ acts on these logarithmic fields by reducing the Jordan-level of the field by 1, or by annihilating the field in case it is a primary one: $\delta_i\Psi_{(h_i,k_i)} = \Psi_{(h_i,k_i-1)}$ for $k_i\ge1$ and $\delta_i\Psi_{(h_i,k_i=0)} = 0$ otherwise (field being a primary). Note that in the above equation the additional operator $\delta_i$ drops out for $q = -1$, meaning that the LCFT version exactly matches the CFT version for this value of $q$. The additional operator $\delta_i$ makes it much harder to find the generic form of the correlators than in the CFT case, because it renders the differential equations inhomogeneous, i.e., the solution will depend on solutions of lower Jordan-level.
If we consider the states corresponding to the fields $\Psi_{(h_i,k_i)}$, the action of $\delta_i$ leads to the following property for $L_0$:
\[ L_0|h;k\rangle = h|h;k\rangle + |h;k-1\rangle, \tag{4} \]
where additionally
\[ |h;-k\rangle = 0\qquad\forall k > 0 \tag{5} \]
holds. This shows that the fields $\Psi_{(h_i,k_i)}$ indeed correspond to Jordan cells with respect to $L_0$. The representation of an LCFT with the largest Jordan cell defines the rank $r$ of the LCFT, i.e., $k_i < r$.
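The Jordan cell structure of Eqs. (4) and (5) can be made explicit with a small matrix. The Python sketch below builds $L_0$ on the basis $|h;0\rangle,\ldots,|h;r-1\rangle$ and checks that its non-diagonal part is nilpotent of order exactly $r$. The basis ordering, the function names and the sample weight are assumptions made for illustration.

```python
def l0_matrix(h, rank):
    """Matrix of L0 on the basis |h;0>, ..., |h;rank-1>, following Eq. (4).

    Column k holds the image of |h;k>, i.e. h|h;k> + |h;k-1>.
    """
    m = [[0.0] * rank for _ in range(rank)]
    for k in range(rank):
        m[k][k] = h              # diagonal part: h |h;k>
        if k > 0:
            m[k - 1][k] = 1.0    # off-diagonal part: |h;k-1>
    return m

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotent_power(h, rank, power):
    """(L0 - h)^power on the same basis."""
    l0 = l0_matrix(h, rank)
    n_mat = [[l0[i][j] - (h if i == j else 0.0) for j in range(rank)]
             for i in range(rank)]
    result = [[1.0 if i == j else 0.0 for j in range(rank)]
              for i in range(rank)]
    for _ in range(power):
        result = matmul(result, n_mat)
    return result

rank = 3
second = nilpotent_power(-0.125, rank, rank - 1)   # still non-zero
third = nilpotent_power(-0.125, rank, rank)        # vanishes identically
print(second, third)
```

That $(L_0 - h)^{r-1}$ is non-zero while $(L_0 - h)^r$ vanishes is the matrix form of the statement that the largest Jordan cell has size $r$.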
The representation space is, as usual, spanned by the states $|h,k\rangle$ defined by the field-state isomorphism $|h,k\rangle := \lim_{z\to0}\Psi_{(h,k)}(z)|0\rangle$. All these states are typically assumed to be quasi-primary in the sense that $L_n|h,k\rangle = 0$ $\forall n > 0$ and for all $k$. Thus, they almost behave as highest-weight states, up to the non-diagonal action of $L_0$. This is not true in general, because states corresponding to logarithmic partner fields may fail to be quasi-primary, i.e., $L_1|h,k>0\rangle\neq0$. However, under certain
assumptions, this does not affect the form of correlation functions. Furthermore, from the results
for 1-, 2- and 3-point functions we can expect the vacuum representation to have the maximal
Jordan-rank. No counter-examples are known up to now and thus we assume that the Jordan-rank
is the same for all representations without loss of generality. The latter is justified as follows: in
case some smaller Jordan-rank representation does show up, we can extend this representation
by adding additional fields which we set to zero. In essence, this simply means that the general results remain valid with some of the structure constants set to zero. For further details on
the precise assumptions in the case of non-quasi-primary fields and on the maximal rank of the
vacuum representation see [7].
While there are generic methods to determine 2- and 3-point correlation functions, e.g. see [7,12-15] and the particularly elegant approach in [16], no such method exists, to our knowledge, for 4-point correlation functions. However, in [14,17] a solution for the case of 4-point functions involving a level two null vector field is given. Any $n > 3$-point correlator of chiral fields can be reduced to 2- and 3-point functions. This is still possible for the full (non-chiral) theory. However, in order to determine the correct normalisation constants of the 2- and 3-point functions in this case, it is necessary to analyse the 4-point correlation functions. Therefore one can compute all observable quantities of a CFT, at least in principle, if one knows all 2-, 3- and 4-point functions. Thus, this work attempts to close the remaining gap by providing the prescription to fix the generic form of 4-point correlators in the case of arbitrary rank Jordan-cells in LCFT.
While the generic form of 2- and 3-point functions is fixed up to structure constants, the generic form of 4-point functions can be fixed only up to functions $F_{i_1i_2i_3i_4}(x)$ of the globally conformally invariant crossing ratios $x$. As in the case of ordinary conformal field theory these structure
functions can be computed if additional local symmetries, i.e., null vectors, exist. Indeed, such
null vectors can exist in the logarithmic case [18], but the resulting differential equations are
more difficult to solve because they are inhomogeneous in general [11].
In this paper we describe how the most general ansatz can be constructed and how the emerging constants can be calculated in order to find a valid ansatz for Eq. (3). Most of the constants
can be fixed with the help of the global conformal Ward identities, but we will also encounter
cases where some degrees of freedom are left. A necessary condition for these additional degrees
of freedom is that all four fields in the four-point function are of logarithmic origin. The number
of degrees of freedom very much depends on the form of the correlator. Furthermore we find that
we have to identify some of the structure functions $F_{i_1i_2i_3i_4}$ that are part of the correlator.
We then will use the discussed methods to determine all correlators for Jordan-rank r = 2 and
r = 3. The results are given in a graphical representation and also we make use of permutation
symmetries in order to keep the terms as short as possible. In the last section we consider the
special case that only two of the four fields are logarithmic and we show how the resulting
equations can be solved in this case for arbitrary Jordan-rank r.
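\[ \langle k_1k_2k_3k_4\rangle =: \prod_{i<j} z_{ij}^{\mu_{ij}}\,F_{k_1k_2k_3k_4}\bigl(\log(z_{12}),\ldots,\log(z_{34}),x\bigr), \tag{6} \]
\[ \sum_{j\neq i}\mu_{ij} = -2h_i, \tag{7} \]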
it will contain all the logarithms. The factor $\prod_{i<j} z_{ij}^{\mu_{ij}}$ exists to counter the $h_i$ terms on the left-hand side of Eq. (3) and therefore we can without loss of generality set all conformal weights to zero, $h_i = 0$. Note that the full correlator of course depends on the conformal weights. The point here is that the global symmetries are not sufficient to fix the complete correlator, but they are strong enough to fix the generic form and this form has no dependence on $h_i$, except through the prefactor involving the exponents $\mu_{ij}$. Therefore, we can simplify the resulting formulas by omitting the trivial direct dependency on the conformal weights. If we set all conformal weights to zero then (3) becomes
\[ \sum_{i=1}^{n} z_i^q\left[z_i\partial_i + (q+1)\delta_i\right]\langle k_1k_2k_3k_4\rangle = 0, \tag{8} \]
where we write $k_i$ instead of the much longer form $\Psi_{(h_i,k_i)}(z_i)$. The remaining two equations for $q = 0, 1$ have a $\delta_i$ term acting on the correlator and thus lowering the sum of the Jordan-levels by one. Because the expressions are calculated recursively we can assume the predecessors $\delta_i\langle k_1k_2k_3k_4\rangle$ to be known. This leads to the final form
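\[ O_0\langle k_1k_2k_3k_4\rangle := \sum_{i=1}^{4} z_i\partial_i\langle k_1k_2k_3k_4\rangle = -\sum_{i=1}^{4}\delta_i\langle k_1k_2k_3k_4\rangle, \tag{9} \]
\[ O_1\langle k_1k_2k_3k_4\rangle := \sum_{i=1}^{4} z_i^2\partial_i\langle k_1k_2k_3k_4\rangle = -2\sum_{i=1}^{4} z_i\delta_i\langle k_1k_2k_3k_4\rangle, \tag{10} \]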
where the correlators depend on the differences $z_{ij}$ only. Though looking simple for given predecessors $\delta_i\langle k_1k_2k_3k_4\rangle$ at first glance, it is not easy to find an ansatz for the correlator at all. Moreover we will learn that in some cases the result is not unique. We sometimes use the sloppy term "integrating the predecessors $\delta_i\langle k_1k_2k_3k_4\rangle$" as shorthand for finding an ansatz that fulfills the above equations.
The starting point for the recursion is given by
\[ \langle k_1k_2k_3k_4\rangle = F_0(x)\quad\text{for}\quad\sum_i k_i = r - 1, \tag{11} \]
respectively,
\[ \langle k_1k_2k_3k_4\rangle = 0\quad\text{for}\quad\sum_i k_i < r - 1, \tag{12} \]
where $x$ is the anharmonic ratio of the four points, $x = \frac{z_{12}z_{34}}{z_{14}z_{32}}$. In essence this means that a correlation function with total Jordan-level $K := \sum_i k_i = r - 1$ behaves like a correlation function in ordinary conformal field theory, i.e., it depends on one function of the globally conformally invariant anharmonic ratio.
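The role of the logarithms can be illustrated numerically. The Python sketch below applies the three generators $\sum_i z_i^{q+1}\partial_i$, $q = -1, 0, 1$, by central differences and checks that the anharmonic ratio $x$ is annihilated by all of them, while $\log(z_{12})$ picks up the inhomogeneous terms 1 and $z_1 + z_2$ for $q = 0$ and $q = 1$. For instance, for $r = 2$ this is what makes $-2\log(z_{12})F_0(x)$ a particular solution of Eqs. (9) and (10) for $\langle1100\rangle$. The sample points, step size and function names are assumptions made for illustration.

```python
import math

def cross_ratio(z):
    """Anharmonic ratio x = z12 z34 / (z14 z32)."""
    z1, z2, z3, z4 = z
    return (z1 - z2) * (z3 - z4) / ((z1 - z4) * (z3 - z2))

def log_z12(z):
    """log|z12|; the absolute value keeps the real test points admissible."""
    return math.log(abs(z[0] - z[1]))

def apply_generator(f, z, q, eps=1e-6):
    """Approximate sum_i z_i^(q+1) df/dz_i by central differences."""
    total = 0.0
    for i in range(4):
        up = list(z)
        up[i] += eps
        down = list(z)
        down[i] -= eps
        total += z[i] ** (q + 1) * (f(up) - f(down)) / (2.0 * eps)
    return total

z = (0.3, 1.7, 2.9, 4.1)   # arbitrary distinct real sample points
for q in (-1, 0, 1):
    print(q, apply_generator(cross_ratio, z, q), apply_generator(log_z12, z, q))
```

Up to discretisation error the first column of results vanishes, and the second column is 0, 1 and $z_1 + z_2$ respectively.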
The reason for these initial conditions comes from the fact that the only non-vanishing one-point function in LCFT is the one of the highest level logarithmic partner of the identity, $\Psi_{(h=0,k=r-1)}$. Evaluating a correlation function amounts to contracting the inserted fields, in all possible ways, down to a one-point function. Therefore, it is only natural to expect that the total Jordan-level $K$ of a non-vanishing correlator must at least be equal to $r - 1$. Furthermore, since the cluster decomposition property should hold, the initial conditions must also hold for arbitrarily factorized correlators, e.g., $\langle k_1k_2k_3k_4\rangle\sim\langle k_1k_2|0\rangle\langle0|k_3k_4\rangle$ in case that $z_1, z_2$ are well separated from $z_3, z_4$. However, some care has to be taken about the correct insertion of the identity channel, which formally can be thought of to be of the form $|0\rangle\langle0| = \sum_{k=0}^{r-1}|h=0;k\rangle\langle h=0;r-1-k|$. It is easy to see that the cluster decomposition with the above identity channel implies (12) and that precisely one term of this identity channel survives yielding (11), where we made use of the results for two-point functions in [7].
In the beginning we mentioned that ki > 0 represents a logarithmic partner field, while ki = 0
is a primary field. We can subdivide the class of primary fields into two subclasses, the so-called
proper primary-fields and the pre-logarithmic fields. This difference between the subclasses becomes apparent if one considers the operator product expansion (OPE). In contrast to the OPE of two proper primary-fields, the OPE of two pre-logarithmic fields shows an additional term of logarithmic behavior, cf. [19].
In the following we consider proper primary-fields only and use the term synonymously with primary field. Restricting to proper primary-fields is for simplicity only. It is possible to include pre-logarithmic fields into the theory, by making changes to the initial conditions (11), (12). For instance in the well-known $c = -2$ example the initial conditions for Jordan-rank $r = 2$ would be
= 0,
(13)
= 0,
(14)
= F0 (x),
(15)
where stands for a proper primary and denotes a twist field. Note that the same could be
formally achieved by assigning rational values $k_i$ to pre-logarithmic fields, e.g., in this example assigning a value of $k_i = \frac{1}{2}$ to the twist fields and using (11), (12) would lead to the same initial conditions. A more precise analysis of this and how to assign correct values for the $k_i$ can be found in [7]. Apart from the initial conditions we also need a slight adaptation of the connection rules we are going to explain in Section 2.4.3. More comments can be found in the conclusions.
2.2. Naming conventions and assumptions
The dependence of $F$ on the anharmonic ratio is suppressed in the following. Further note that we do not write out the dependence on the Jordan-rank $r$, e.g., $\langle1000\rangle = F_0$ (for $r = 2$) as well as $\langle1100\rangle = \cdots = \langle2000\rangle = F_0$ for $r = 3$.
Also, as already introduced above, we will denote correlators only with the shorthand
\[ \langle k_1k_2k_3k_4\rangle := \langle\Psi_{(h_1,k_1)}(z_1)\Psi_{(h_2,k_2)}(z_2)\Psi_{(h_3,k_3)}(z_3)\Psi_{(h_4,k_4)}(z_4)\rangle \]
throughout the paper. Let us stress one important point here. Later on, we will talk about certain symmetries such correlators will obey. For example, we already stated that in the $r = 2$ case, $F_0 = \langle1000\rangle = \langle0100\rangle = \langle0010\rangle = \langle0001\rangle$. Such permutations are meant to act solely on the Jordan-levels $k_i$, not on the order of the fields, which definitely would involve non-trivial monodromies. All we mean by such symmetries is that we may distribute the individual Jordan-levels within the (indecomposable) representations appearing in a correlator in different ways. Thus, to write it out once in full detail, a permutation $\sigma\in S_4$ acts as
\[ \sigma\langle k_1k_2k_3k_4\rangle = \langle k_{\sigma(1)}k_{\sigma(2)}k_{\sigma(3)}k_{\sigma(4)}\rangle = \langle\Psi_{(h_1,k_{\sigma(1)})}(z_1)\Psi_{(h_2,k_{\sigma(2)})}(z_2)\Psi_{(h_3,k_{\sigma(3)})}(z_3)\Psi_{(h_4,k_{\sigma(4)})}(z_4)\rangle. \]
There exist further symmetries, such as for rank $r = 3$, where we have $F_0 = \langle2000\rangle = \cdots = \langle0002\rangle = \langle1100\rangle = \langle1010\rangle = \cdots = \langle0011\rangle$, which again only refer to different distributions among the Jordan-levels $k_i$. The same conventions hold when we speak of symmetries on the structure functions $F_{k_1k_2k_3k_4}$. Thus, the order of the representations involved is always assumed to be fixed once and for all.
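The correlators that share the initial condition $F_0$ can be listed mechanically. The following Python sketch enumerates all distributions $(k_1,k_2,k_3,k_4)$ with $0\le k_i<r$ and total Jordan-level $r - 1$, as required by Eq. (11). The function name is an illustrative choice.

```python
from itertools import product

def initial_level_distributions(rank):
    """All (k1, k2, k3, k4) with 0 <= k_i < rank and k1+k2+k3+k4 = rank - 1."""
    return [ks for ks in product(range(rank), repeat=4)
            if sum(ks) == rank - 1]

for rank in (2, 3):
    levels = initial_level_distributions(rank)
    print(rank, len(levels), levels)
```

For $r = 2$ this gives the four correlators $\langle1000\rangle,\ldots,\langle0001\rangle$, and for $r = 3$ the ten correlators consisting of the permutations of $\langle2000\rangle$ and of $\langle1100\rangle$.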
Of course, due to the inhomogeneous nature of the global conformal Ward identities for $K = k_1 + k_2 + k_3 + k_4 > r - 1$, the correlation functions will possess logarithms. As a consequence, the correlation functions will depend on the individual Jordan-levels $k_i$, as for example $\langle k,(r-k),0,0\rangle\sim\log(z_{12})$ and $\langle0,k,0,(r-k)\rangle\sim\log(z_{24})$. However, the remaining structure functions will have a higher symmetry and thus will not depend so much on the individual Jordan-levels $k_i$, in a similar way as the structure constants of the 2- and 3-point functions, which indeed only depend on the total Jordan-levels. In essence, this means that global conformal invariance suffices to fix the form of the 4-point functions up to structure functions, which might be regarded as some kind of reduced matrix elements, largely independent of the details of the individual $k_i$. The dependency on how a total Jordan-level $K$ is distributed among the individual $k_i$ is encoded to a high degree in the coefficients of the structure functions, which are polynomials in the $\log(z_{ij})$.
Motivated by the general structure of 2- and 3-point functions and by the results found in the
case of a LCFT of Jordan-rank $r = 2$ we assume that the general form of the solution for (6) is
of the form
$\langle k_1 k_2 k_3 k_4 \rangle = \sum_{L=0}^{K_r} \; \sum_{\{k'_1 k'_2 k'_3 k'_4 \,:\, \sum_i k'_i = L\}} p^{(L)}_{k'_1 k'_2 k'_3 k'_4}(\{l_{ij}\}) \, F_{k_1-k'_1,\,k_2-k'_2,\,k_3-k'_3,\,k_4-k'_4}(x)$,   (16)
where $l_{ij} := \log(z_{ij})$, $K := \sum_i k_i$, $K_r := K - r + 1$. The $p^{(L)}_{k'_1 k'_2 k'_3 k'_4}$ denote polynomials in the
$l_{ij}$ of total degree $L$. The remaining reduced structure functions $F_{k_1-k'_1,\,k_2-k'_2,\,k_3-k'_3,\,k_4-k'_4}$ are
assumed to depend only on the crossing ratio $x$ in order to be invariant under global conformal
transformations. Of course, we put $F_{j_1 j_2 j_3 j_4} \equiv 0$ whenever one $j_i < 0$, to make the above ansatz
always well defined. In the following, certain symmetries of the $F$ with respect to the $k_i$ will
appear, although a complete analysis of these symmetries is beyond the scope of this paper.
However, our results will still yield a valid solution if the maximal possible symmetry, i.e. the
structure functions depend only on the total Jordan-level, is assumed. Then, the above ansatz
would simplify to
$\langle k_1 k_2 k_3 k_4 \rangle = \sum_{L=0}^{K_r} \; \sum_{\{k'_1 k'_2 k'_3 k'_4 \,:\, \sum_i k'_i = L\}} p^{(L)}_{k'_1 k'_2 k'_3 k'_4}(\{l_{ij}\}) \, F_{(K_r - L)}(x)$,   (17)
but we will use the more general ansatz (16) throughout the paper. The only symmetry we will
always use is the one for $F_0$ which sets the initial condition for our recursions (11) and (12)
and which can easily be proven by taking into account the two facts that the 3-point structure
constants depend only on the total Jordan-level and that the leading OPE coefficients (i.e. the
coefficient for a field of Jordan-level $k_1 + k_2$ in the OPE of two fields at Jordan-levels $k_1$ and $k_2$,
respectively, whenever $k_1 + k_2 \le r - 1$) do not depend on the Jordan-levels at all [7]. Furthermore,
it is also clear that any 4-point function with only two logarithmic fields will have the symmetry
$\langle k k' 0 0 \rangle = \langle k' k 0 0 \rangle$ or $\langle 0 k 0 k' \rangle = \langle 0 k' 0 k \rangle$, etc., simply because such a 4-point function can only
have a logarithmic dependency in $z_{12}$ or $z_{24}$, etc., and because the OPE of two logarithmic fields
$\Psi_{(h;k)}$ and $\Psi_{(h';k')}$ is symmetric in the conformal weights $h$, $h'$, and separately in the Jordan-levels
$k$, $k'$, see [7]. However, clearly $\langle k k' 0 0 \rangle \ne \langle 0 k 0 k' \rangle$. The interesting question is whether there is a
symmetry for the structure functions,
$F_{kk'00}(x) \overset{?}{=} F_{k'k00}(x) \overset{?}{=} F_{0k0k'}(x) \overset{?}{=} F_{0k'0k}(x)$.
The highest logarithmic powers that appear in the solution are always the factors associated
with the function $F_0$. The degree in $l_{ij}$, also called logarithmic degree for short, is given by
$\deg\left( l_{12}^{a_1} l_{13}^{a_2} \cdots l_{34}^{a_6} \right) = \sum_i a_i \le K - r + 1 =: l^{max}$.   (18)
As discussed above, there are cases where we will find that some of the functions $F_{j_1 j_2 j_3 j_4}$ can or
have to be identified with each other, e.g., we will find that $F_{2100} \equiv F_{1200}$ for $r = 3$ in the sense
explained above. After identification we will always use the $F$-term whose index represents the
lowest number. For example we write $\langle 2100 \rangle = F_{1200}(x) + \dots$ instead of using $F_{2100}(x)$.
In many places we decided to use a graphical representation instead of writing long expressions of logarithms. The idea for this stems from [2] where it was chosen in order to give a better
understanding of the contractions that can appear. Reading the diagrams is straightforward: the
points stand for the four vertices and each $l_{ij}$ is represented by a line between the vertices $i$ and $j$.
Permutation operators $P$ are used to further reduce the length of the expressions, for instance
$l_{12}^2 l_{23} l_{34} - l_{23}^2 l_{12} l_{14} = (1 - P_{(13)}) \, [\text{graph with a double line between 1 and 2 and single lines between 2 and 3 and between 3 and 4}]$.   (19)
When reading such expressions, one should carefully check what the permutation operators
act on: whether they act only on the polynomial in the $l_{ij}$, or also on the Jordan-levels of the structure
functions $F$. The former case typically means that a symmetry for the $F$ has been used, while the
latter case does not assume such a symmetry. From Section 3 on we will always use the graphical
representation to present the results.
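The action of a permutation operator on a graph can be made concrete with a small sketch. It assumes, as an illustration only, that a graph is stored as a dictionary mapping an edge (i, j) with i < j to the power of $l_{ij}$; the helper names are hypothetical and not taken from the paper. The example reproduces the identity (19).

```python
from collections import Counter

def permute_edge(edge, perm):
    """Relabel the edge (i, j) by a vertex permutation and restore i < j."""
    i, j = edge
    return tuple(sorted((perm.get(i, i), perm.get(j, j))))

def permute_monomial(monomial, perm):
    """Apply a vertex permutation to a monomial stored as {edge: exponent}."""
    return {permute_edge(e, perm): p for e, p in monomial.items()}

def key(monomial):
    """Hashable canonical form of a monomial."""
    return tuple(sorted(monomial.items()))

# l12^2 * l23 * l34: double line 1-2, single lines 2-3 and 3-4
graph = {(1, 2): 2, (2, 3): 1, (3, 4): 1}
P13 = {1: 3, 3: 1}  # transposition of vertices 1 and 3

# (1 - P_(13)) acting on the graph, stored as {monomial: coefficient}
result = Counter()
result[key(graph)] += 1
result[key(permute_monomial(graph, P13))] -= 1

# left-hand side of (19): l12^2 l23 l34 - l23^2 l12 l14
expected = Counter({
    key({(1, 2): 2, (2, 3): 1, (3, 4): 1}): 1,
    key({(2, 3): 2, (1, 2): 1, (1, 4): 1}): -1,
})
print(result == expected)
```

In this representation the permutation acts only on the polynomial in the $l_{ij}$; whether it should also act on the Jordan-levels of an accompanying $F$ has to be tracked separately, as stressed above.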
2.3. Properties of $O_0$, $O_1$
Both operators $O_q$ are linear, nilpotent, act as derivatives on the function space and are invariant under any permutation $p \in S_4$. The function space we consider is the space of polynomials
in the logarithmic functions $l_{ij} := \log|z_i - z_j|$, called $F_{log} := \mathbb{C}[l_{12}, l_{13}, l_{14}, l_{23}, l_{24}, l_{34}]$.
For $q = 0, 1$ the operators $O_q$ have a simple behavior when acting on $F_{log}$:
$O_0 : \; l_{i_1 j_1} \cdots l_{i_n j_n} \mapsto \sum_{k=1}^{n} l_{i_1 j_1} \cdots l_{i_{k-1} j_{k-1}} \, l_{i_{k+1} j_{k+1}} \cdots l_{i_n j_n}$,   (20)
$O_1 : \; l_{i_1 j_1} \cdots l_{i_n j_n} \mapsto \sum_{k=1}^{n} (z_{i_k} + z_{j_k}) \, l_{i_1 j_1} \cdots l_{i_{k-1} j_{k-1}} \, l_{i_{k+1} j_{k+1}} \cdots l_{i_n j_n}$,   (21)
meaning that we can replace the term by a sum, where each $l_{ij}$ is replaced by either 1 (for $q = 0$)
or by $z_i + z_j$ (for $q = 1$). Thus acting with $O_q$ on any term obviously reduces the logarithmic
degree by one and by that proves (18).
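The derivative rule just described can be sketched in a few lines. The sketch assumes only what is stated above, namely that $O_0$ replaces one factor $l_{ij}$ by 1 and $O_1$ replaces it by $z_i + z_j$; a polynomial is stored as a dictionary from exponent tuples (ordered as $l_{12}, l_{13}, l_{14}, l_{23}, l_{24}, l_{34}$) to coefficients, and $O_1 f$ is returned as the coefficient polynomial of each $z_v$. All names are illustrative.

```python
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

def d_edge(poly, edge):
    """Partial derivative with respect to l_edge."""
    idx = EDGES.index(edge)
    out = {}
    for exps, c in poly.items():
        if exps[idx] > 0:
            lowered = exps[:idx] + (exps[idx] - 1,) + exps[idx + 1:]
            out[lowered] = out.get(lowered, 0) + c * exps[idx]
    return out

def add_into(acc, poly):
    """Accumulate poly into acc in place."""
    for m, c in poly.items():
        acc[m] = acc.get(m, 0) + c
    return acc

def O0(poly):
    """O_0: every l_ij is replaced by 1, one factor at a time."""
    acc = {}
    for e in EDGES:
        add_into(acc, d_edge(poly, e))
    return {m: c for m, c in acc.items() if c != 0}

def O1(poly):
    """O_1 as {vertex v: polynomial multiplying z_v}, since l_ij -> z_i + z_j."""
    parts = {}
    for v in (1, 2, 3, 4):
        acc = {}
        for e in EDGES:
            if v in e:
                add_into(acc, d_edge(poly, e))
        parts[v] = {m: c for m, c in acc.items() if c != 0}
    return parts

def degree(poly):
    """Logarithmic degree: the largest total exponent."""
    return max((sum(m) for m in poly), default=0)

f = {(2, 0, 0, 1, 0, 0): 1}           # l12^2 * l23
print(O0(f))                          # 2 l12 l23 + l12^2
print(degree(f), degree(O0(f)))       # the degree drops by one
print(O1(f))
```

Splitting $O_1$ by vertex is a convenience of this sketch: it treats the $z_v$ as independent of the $l_{ij}$, in the spirit of regarding the $l_{ij}$ as formal variables.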
An obvious question is whether the map $O_q : f \mapsto O_q f$ is injective: are there any non-trivial
$f \in F_{log}$ with $O_0 f = 0$ and $O_1 f = 0$? If we restrict ourselves to the function space $F_{log}$ then
we find that we can exactly determine the kernel of the operator $O := (O_0, O_1)$.
As will be shown in Section 2.6 below, the kernel is given as follows:
$\ker_{F_{log,g}} O = \left\{ \sum_{i=0}^{g} a_i K_1^{i} K_2^{g-i} : a_k \in \mathbb{R} \right\}$,   (22)
$K_1 := l_{12} + l_{34} - l_{13} - l_{24}$,   (23)
$K_2 := l_{12} + l_{34} - l_{14} - l_{23}$,   (24)
where $F_{log,g} := \{ f \in F_{log} \mid \deg f = g \}$ denotes the space of functions with logarithmic degree $g$,
such that $F_{log} = \bigoplus_g F_{log,g}$.
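As a plausibility check of the kernel statement, the following sketch verifies that the products $K_1^i K_2^{g-i}$ are annihilated by both operators for small $g$. It assumes the derivative rule of this subsection and the expressions for $K_1$, $K_2$ given in Section 2.6; it uses that $O_1 f = \sum_v z_v D_v f$ with $D_v$ the sum of $\partial / \partial l_e$ over edges $e$ touching vertex $v$, and $O_0 = \frac{1}{2} \sum_v D_v$. The function names are illustrative.

```python
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

def lin(coeffs):
    """Linear polynomial from {edge: coefficient}."""
    poly = {}
    for edge, c in coeffs.items():
        exps = [0] * 6
        exps[EDGES.index(edge)] = 1
        poly[tuple(exps)] = c
    return poly

def mul(p, q):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            m = tuple(x + y for x, y in zip(a, b))
            out[m] = out.get(m, 0) + ca * cb
    return {m: c for m, c in out.items() if c != 0}

def power(p, n):
    out = {(0,) * 6: 1}
    for _ in range(n):
        out = mul(out, p)
    return out

def D(poly, vertex):
    """Coefficient of z_vertex in O_1 poly."""
    out = {}
    for idx, e in enumerate(EDGES):
        if vertex not in e:
            continue
        for exps, c in poly.items():
            if exps[idx]:
                lowered = exps[:idx] + (exps[idx] - 1,) + exps[idx + 1:]
                out[lowered] = out.get(lowered, 0) + c * exps[idx]
    return {m: c for m, c in out.items() if c != 0}

def in_kernel(poly):
    """True if O_1 poly = 0; O_0 poly = 0 then follows, since O_0 = (1/2) sum_v D_v."""
    return all(D(poly, v) == {} for v in (1, 2, 3, 4))

K1 = lin({(1, 2): 1, (3, 4): 1, (1, 3): -1, (2, 4): -1})
K2 = lin({(1, 2): 1, (3, 4): 1, (1, 4): -1, (2, 3): -1})

for g in range(1, 4):
    print(g, all(in_kernel(mul(power(K1, i), power(K2, g - i))) for i in range(g + 1)))
print(in_kernel(lin({(1, 2): 1})))    # a single l12 is not in the kernel
```

The check only confirms that the listed elements lie in the kernel; that there are no others is the content of the argument in Section 2.6.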
2.4. An ansatz for the equations
As mentioned before we want to recursively solve Eqs. (9) and (10). Since the number of
terms quickly becomes huge and calculation tedious we make use of computer algebra software
for performing the calculations. In the next subsection we explain in more detail what we mean by
recursion. After this we show that the two equations can be reduced to a set of simpler equations
and in Section 2.4.3 we present the algorithm we used for creating an ansatz.
2.4.1. Recursion
With recursion we mean the following: we start with the initial conditions as given in (11)
which corresponds to logarithmic degree l = 0. Then we calculate all necessary correlators which
contain exactly one more logarithmic field or one field whose Jordan-level is increased exactly
by one. In short this means that we determine all correlators of logarithmic degree l = 1. The
following diagram describes which correlators need to be calculated in order to determine the
The effort for calculation can be reduced, since many of the correlators are related to others
by simple permutations, e.g., $\langle 2100 \rangle = P_{23} \langle 2010 \rangle$. Note, however, that this simple relationship
via a permutation only concerns the structural form of the correlator, not its dependency on the
conformal weights. All we mean by the relation is that the structural forms
$\langle 2100 \rangle = F_{2100}(x) - 2 l_{12} F_0(x) \;\longleftrightarrow\; F_{2010}(x) - 2 l_{13} F_0(x) = \langle 2010 \rangle$   (25)
can be mapped onto each other by the permutation $P_{23}$, which acts on the Jordan-levels $k_i$ and
on the indices of the $l_{ij}$, but not on the $z_{ij}$ and the conformal weights $h_i$. In this respect, it is
often useful to consider the $l_{ij}$ formally independent of the $z_{ij}$.
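The order in which correlators are visited by the recursion can be illustrated with a short sketch. It assumes that correlators are grouped by their maximal logarithmic degree $K - r + 1$ from (18) and that one representative per permutation class suffices; the function name is hypothetical.

```python
from itertools import combinations_with_replacement

def correlators_by_degree(r):
    """Jordan-level tuples with 0 <= k_i <= r-1, one per permutation class,
    grouped by the maximal logarithmic degree K - r + 1."""
    by_degree = {}
    for levels in combinations_with_replacement(range(r), 4):
        ks = tuple(sorted(levels, reverse=True))
        deg = sum(ks) - (r - 1)
        if deg >= 0:          # below K = r - 1 there is nothing to compute
            by_degree.setdefault(deg, []).append(ks)
    return {d: sorted(v, reverse=True) for d, v in sorted(by_degree.items())}

for r in (2, 3):
    print("r =", r)
    for deg, reps in correlators_by_degree(r).items():
        print("  degree", deg, ":", reps)
```

For $r = 2$ this lists one representative per degree, from $\langle 1000 \rangle$ up to $\langle 1111 \rangle$; for $r = 3$ the highest degree reached is 6, for $\langle 2222 \rangle$.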
2.4.2. Breaking down into a set of equations
The operators $O_0$, $O_1$ in Eqs. (9), (10) are linear, they act as derivatives on the correlators
and they are invariant under any permutation $P \in S_4$ of the indices. These properties of the
operators $O_0$, $O_1$ are useful to break down the two Eqs. (9), (10) into a set of equations and this
is what we will do in the following.
The correlators in the aforementioned equations can be replaced by formula (16). Instead of writing polynomials such as $p^{(K_r - L)}_{k_1 k_2 k_3 k_4}$ we write $(\,)$ for short. In this somewhat sketchy
notation Eqs. (9) and (10) can now be written as
$O_q \left[ F_{k_1 k_2 k_3 k_4} + (\,)^u F_{k_1-1,k_2,k_3,k_4} + (\,)^u F_{k_1,k_2-1,k_3,k_4} + \dots + (\,)^u F_{r-1,0,1,0} + (\,)^u F_{r-1,0,0,1} + (\,)^u F_0 \right]$
$= (\,) F_{k_1-1,k_2,k_3,k_4} + (\,) F_{k_1,k_2-1,k_3,k_4} + \dots + (\,) F_{r-1,0,1,0} + (\,) F_{r-1,0,0,1} + (\,) F_0$,   (26)
where $q = 0, 1$ and where we marked the brackets $(\,)$ on the left-hand side of the equation
with a small $u$ in order to point out that these are the unknown terms we want to determine. We
point out again that all terms on the right-hand side are known because we want to recursively
solve the equations. As usual $r$ is the Jordan-rank of the theory.
In the following we add an index to the bracket terms in order to keep in mind where the
respective term stems from, e.g., we write $(\,)^u_{k_1-1,k_2,k_3,k_4}$ for the first $(\,)^u$ term in the previous
equation. Using this notation and knowing that $O_q$ operates as a derivative and that $O_q F = 0$,
we find that the problem reduces to integrating the following set of equations
(27)
(28)
Note that the first equation (27) and its solution are well known,
$(\,)^u_{k_1,k_2,k_3,k_4} = F_{k_1,k_2,k_3,k_4}(x)$.   (29)
Essentially, the terms for an ansatz of logarithmic degree l are given by a sum over all admissible
graphs subject to the following rules:
(1)
(2)
(3)
(4)
(5)
Let us have a look at a simple example. We consider a theory of Jordan-rank $r = 3$ and are
interested in the structure of the correlator $\langle 2110 \rangle$ for the highest possible logarithmic degree,
i.e., the $F_0$ term. The corresponding graph for $\langle 2110 \rangle$ is
[graph].
Altogether we have four legs at our disposal, but we also have to fix two of them, leaving us with
two free legs. If we want to know which terms can appear for logarithmic degree $l = 2$, then we
have to create all 2-contractions according to the above rules. This results in the following six
different graphs:
[six graphs].
All these graphs are the result of applying the aforementioned rules to $\langle 2110 \rangle$. However, we did
not draw the free legs in the above graphs as we only replace connections between vertices with
$l_{ij}$ and thus do not need these free legs for generating the ansatz.
Furthermore it should be pointed out that there are only two truly independent graphs
in the sense that they are not a mere permutation of other graphs. The first three graphs and the
remaining three graphs form two equivalence classes induced by permutations of $S_4$.
Using the algorithm results in a maximum of $\binom{l+5}{l}$ terms that can appear. Combinatorial
restrictions which we will discuss in the following can reduce this number, for instance $\langle 2211 \rangle$
for $l = 3$ does not contain an $l_{34}^3$ term.
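The upper bound $\binom{l+5}{l}$ is simply the number of monomials of degree $l$ in the six variables $l_{ij}$. A minimal sketch that enumerates them and compares with the binomial coefficient (the contraction rules, which remove some of these monomials, are not modelled here):

```python
from itertools import combinations_with_replacement
from math import comb

EDGES = ["l12", "l13", "l14", "l23", "l24", "l34"]

def monomials(degree):
    """All monomials of the given total degree in the six l_ij."""
    return list(combinations_with_replacement(EDGES, degree))

for l in range(6):
    print(l, len(monomials(l)), comb(l + 5, l))
```

For $l = 3$ the list contains the monomial ("l34", "l34", "l34"), which is the term excluded for $\langle 2211 \rangle$ by the combinatorial restriction mentioned above.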
2.5. Restrictions
The analysis of the results we found shows that several restrictions reduce the number of
different terms that may appear in the end result.
The first restriction naturally appears during the integration process. In some cases our method
for recursively constructing higher correlators fails. It is not possible to repair this failure in
a sensible manner by adding further terms to the ansatz, but a simple identification of different
functions F immediately fixes the problem.
This behavior is a general property of the theory for $r \ge 3$, as we will see in Section 5. For
now it is sufficient to note that $F_{k_1-1,k_2,0,0} = F_{k_1,k_2-1,0,0}$, e.g., for $r = 3$ we get $F_{2100} = F_{1200}$
plus five more identifications by virtue of permutations.
The second restriction we encountered is the so-called discrete symmetry of the correlators,
which limits the dimension of the kernel. By discrete symmetry we mean that a correlator which
contains at least two fields of the same Jordan-level should be invariant under any transposition
that exchanges the Jordan-levels of these fields, for instance
$P_{(12)} \langle k \, k \, k_3 k_4 \rangle = \langle k \, k \, k_3 k_4 \rangle$.   (30)
At this point we point out again that we have formally set, without loss of generality, all conformal weights $h_i$ to zero and wrote $k$ instead of $\Psi_{(h,k)}(z)$. In the next subsection we will discuss
in more detail to what extent the above mentioned invariance limits the dimension of the kernel
and show that in some cases no kernel term can show up at all.
The dimension of the kernel that finally shows up in the results is often smaller than the one we
would expect for the given logarithmic degree and given discrete symmetry. The difference will
show up especially if the logarithmic degree is close to the maximum degree $l^{max} = K - r + 1$.
The reason for this difference is that the ansatz does not allow all terms of $l_{ij}$ of a given degree
to show up. For instance the correlator $\langle 2211 \rangle$ forbids the existence of terms of the type $l_{34}^3$ and
by that limits the dimension of the kernel of degree 3. We also refer to this as combinatorial
restriction, because the restriction depends on the form of the correlator, e.g., the term $l_{34}^3$ is not
forbidden in $\langle 2221 \rangle$.
2.6. Additional constants
As we have seen in Section 2.3 the kernel of the operator $O$ is non-trivial. That means that
the results may come with additional constants. In order to understand the meaning of these
constants in the context of conformal field theory we rewrite the two basis terms $K_1$, $K_2$, of which
every element of the kernel consists, as follows:
$K_1 := l_{12} + l_{34} - l_{13} - l_{24} = \log\left| \frac{z_{12} z_{34}}{z_{13} z_{24}} \right| = \log|x| - \log|1 - x|$,   (31)
$K_2 := l_{12} + l_{34} - l_{14} - l_{23} = \log\left| \frac{z_{12} z_{34}}{z_{14} z_{23}} \right| = \log|x|$,   (32)
where $x = \frac{z_{12} z_{34}}{z_{14} z_{32}}$ is the anharmonic ratio of the four points. The anharmonic ratio $x$ and its five
possible involutions $\frac{1}{x}$, $1 - x$, $1 - \frac{1}{x}$, $\frac{1}{1-x}$, and $\frac{x}{x-1}$
result in four linearly independent functions.
If we take the logarithm of the absolute value of these four functions, then we are left with only
two independent solutions, namely $\log|x|$ and $\log|1 - x|$. The choice of the basis has no influence
on the results and our choice of the basis $K_1$, $K_2$ is given as above.
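The two identities (31) and (32) can be checked numerically at an arbitrary configuration of four points. The sketch below assumes $z_{ij} = z_i - z_j$ and $l_{ij} = \log|z_i - z_j|$; the sample points are arbitrary and the function name is illustrative.

```python
import math

def kernel_basis(z1, z2, z3, z4):
    """Return K1, K2 and the anharmonic ratio x for four complex points."""
    def l(a, b):
        return math.log(abs(a - b))
    K1 = l(z1, z2) + l(z3, z4) - l(z1, z3) - l(z2, z4)
    K2 = l(z1, z2) + l(z3, z4) - l(z1, z4) - l(z2, z3)
    x = (z1 - z2) * (z3 - z4) / ((z1 - z4) * (z3 - z2))
    return K1, K2, x

K1, K2, x = kernel_basis(0.3 + 1.1j, -2.0 + 0.4j, 1.7 - 0.6j, 0.1 - 2.3j)
print(math.isclose(K1, math.log(abs(x)) - math.log(abs(1 - x)), abs_tol=1e-12))
print(math.isclose(K2, math.log(abs(x)), abs_tol=1e-12))
```

The first identity rests on $z_{12} z_{34} + z_{14} z_{23} = z_{13} z_{24}$, which gives $|1 - x| = |z_{13} z_{24}| / |z_{14} z_{23}|$.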
We can turn around the argument and ask for all functions of the anharmonic ratio x, i.e.,
globally conformally invariant functions, which have the additional property to be strictly polynomial in the lij . These functions are all in the kernel of the operator O. On the other hand, there
can be no other functions in the kernel if we restrict ourselves to polynomials of the lij , since
every member of the kernel must be invariant under global conformal transformations and thus
be a function of x. This proves the statement in Section 2.3. However, we note that this yields
only an upper bound on the size of the kernel. We will see that the size may be reduced due to
further symmetries.
Eq. (22) gives us the maximal dimension of the kernel for logarithmic degree $l$,
$K^{(l)} := \left\{ \sum_{i=0}^{l} a_i K_1^{i} K_2^{l-i} : a_k \in \mathbb{R} \right\}$,   (33)
$d^{max}(l) = l + 1$.   (34)
Up to a few exceptions we will notice that the full kernel never shows up in any equation. These
restrictions on the kernel are caused by the discrete symmetry and combinatorial constraints.
Examples for combinatorial constraints are shown in the next two sections.
It is worth noting that a non-trivial kernel can show up in a correlator only if there is no
primary field in the correlator present. This is obvious, since both kernel elements K1 , K2 refer
to all four vertices z1 , z2 , z3 , z4 .
$K^{(l)}_{S_2 \times S_2} \equiv K^{(l)}_{S_2}$,   (37)
$d^{max}_{S_2 \times S_2}(l) = \left\lfloor \frac{l}{2} \right\rfloor + 1$.   (38)
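The count in (38) can be reproduced with a few lines. The sketch assumes that the transposition $P_{(12)}$ exchanges $K_1$ and $K_2$, which follows from their definitions in (31) and (32), so that the $S_2$-invariant kernel of degree $l$ is spanned by the orbit sums $K_1^i K_2^{l-i} + K_1^{l-i} K_2^i$. The function name is illustrative.

```python
def s2_invariant_basis(l):
    """Exponent pairs (i, l-i) with i <= l-i, one per orbit of K1 <-> K2."""
    return sorted({tuple(sorted((i, l - i))) for i in range(l + 1)})

for l in range(1, 7):
    print(l, s2_invariant_basis(l), l // 2 + 1)
```

For $l = 1$ the single basis element is $K_1 + K_2$, and for $l = 4$ there are three, in line with the dimensions quoted in the text.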
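For the $S_3$ symmetry we note that an $S_3$ invariance extends to $S_4$ invariance. This is because
$S_3$ invariance in particular means $P_{(12)}$ invariance which, as explained above, also means $P_{(34)}$
invariance. By this we immediately obtain full $S_4$ invariance:
$K^{(l)}_{S_3} \equiv K^{(l)}_{S_4}$.   (39)
$K^{(4)}_{S_4} := \left( K^{(2)}_{S_4} \right)^2$,
$K^{(5)}_{S_4} := K^{(2)}_{S_4} K^{(3)}_{S_4}$,
$K^{(6,(2,2,2))}_{S_4} := \left( K^{(2)}_{S_4} \right)^3$,
$K^{(6,(3,3))}_{S_4} := \left( K^{(3)}_{S_4} \right)^2$,
$K^{(7)}_{S_4} := \left( K^{(2)}_{S_4} \right)^2 K^{(3)}_{S_4}$.   (40)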
These results are unique, up to constants and linear combinations. Of course any combination of
the form $(K^{(2)}_{S_4})^i (K^{(3)}_{S_4})^j$ leads to a kernel of logarithmic degree $2i + 3j$ and we believe that the
kernel space is not larger than this, though it is not important since we consider kernels up to
logarithmic degree $l = 6$ only in the further course of this paper.
We expect the dimension of the S4 invariant kernel to be the number of possible partitions of
the degree in the numbers 2 and 3, e.g., 6 = 2 + 2 + 2 as well 6 = 3 + 3. This in turn means
that every integer 6 can be represented in two different ways, leading to the following number of
degrees of freedom that could appear at most for logarithmic degree l:
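$d^{max}_{S_4}(l) = \begin{cases} \left\lfloor \frac{l}{6} \right\rfloor, & l = 6k + 1, \; k \in \mathbb{N}_0, \\ \left\lfloor \frac{l}{6} \right\rfloor + 1, & \text{else.} \end{cases}$   (41)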
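The closed form (41) can be compared with a direct count of the ways to write $l$ as a sum of 2s and 3s. This is a minimal sketch; the function names are illustrative.

```python
def count_2_3_partitions(l):
    """Number of pairs (i, j) of non-negative integers with 2*i + 3*j == l."""
    return sum(1 for j in range(l // 3 + 1) if (l - 3 * j) % 2 == 0)

def d_max_S4(l):
    """Closed form (41)."""
    return l // 6 if l % 6 == 1 else l // 6 + 1

for l in range(1, 13):
    print(l, count_2_3_partitions(l), d_max_S4(l))
```

The two columns agree; for example $l = 6$ gives two partitions ($2 + 2 + 2$ and $3 + 3$) while $l = 7$ gives only one ($2 + 2 + 3$).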
2.6.2. Discrete symmetry and non-invariant F
In the previous subsection we analyzed the structure of the kernel under symmetry groups and
found that we have to consider S2 and S4 symmetry groups only. This holds if the function F...
itself is invariant under permutations.
Things get more complicated if $F_a$ is mapped to $F_b$ by the permutation. For example we
know that the $S_2$ invariant kernel for one $F$ ($F_1$ for short) is one-dimensional, namely
$K^{(1)}_{S_2} = \{ a (K_1 + K_2) : a \in \mathbb{R} \}$. But if we have two $F$ which are related by the permutation, e.g., $F_{1022}$
and $F_{0122}$, then the dimension of the kernel for $F_{1022}$ becomes larger. The kernel would be
$(c K_1 + c' K_2) F_{1022} + (c' K_1 + c K_2) F_{0122}$,
or $P_{S_2}(K^{(2)} F_{1022})$ with $P_{S_2} = 1 + P_{(12)}$ for short.
The kernel dimension therefore depends not only on the symmetry group and the logarithmic
degree, but also on the size of the equivalence class of functions $F$ which are involved. The size
of the equivalence class is denoted by $F_n$ and the results for the logarithmic degrees $l = 1, 2, \dots, 5$
are listed in Appendix A.
The simple rule that $S_2$ corresponds to $S_2 \times S_2$, respectively that $S_3$ corresponds to $S_4$, does
not hold for $n > 1$, therefore we have to discuss all four symmetry groups in Appendix A. The
dimension of the kernel decreases with increasing size of the symmetry group and increases
with increasing size of the equivalence class. It is interesting, though not surprising, that the full
kernel $K^{(l)}$ is recovered if the size of the equivalence class $|F|$ equals the size of the symmetry
group $|S|$.
3. Results for Jordan-rank r = 2
In this section we present and discuss the results for a logarithmic conformal field theory with
Jordan-rank $r = 2$. We have used the algorithm described in Section 2.4.3 to obtain these results
and, though they are known, e.g. [11], we can write them in a more appealing form. We will also discuss
the appearance of an additional degree of freedom, which shows up for $\langle 1111 \rangle$.
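We start by simply writing down the first three expressions that our algorithm provides:
$\langle 1000 \rangle = F_0$,
$\langle 1100 \rangle = P_{S_2} \left\{ \tfrac{1}{2} F_{1100} + [\dots] F_0 \right\}$,   (42)
$\langle 1110 \rangle = P_{S_3} \left\{ \tfrac{1}{6} F_{1110} + [\dots] F_{0110} + [\dots] F_0 \right\}$   (43)
with $P_X = \sum_{x \in X} P_{(x)}$, where $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here. Writing the results this way makes the discrete symmetry manifest, that is
$S_2$ for $\langle 1100 \rangle$ and $S_3$ invariance for $\langle 1110 \rangle$.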
For the correlator $\langle 1111 \rangle$ we have a logarithmic partner field at every vertex, which means that
we can expect to get a non-trivial kernel for the first time. The result without kernel is given by
$\langle 1111 \rangle = P_{S_4} \left\{ \tfrac{1}{24} F_{1111} + [\dots] F_{0111} + [\dots] F_{0011} + [\dots] F_0 \right\}$,   (44)
where $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here;
the contribution to the kernel is
$\mathrm{Ker}\langle 1111 \rangle = P_{S_4} K^{(2)}_{S_2} F_{0011}$.   (45)
That we get a two-dimensional kernel for $F_{0011}$ is not surprising, since there are 6 functions $F$
belonging to the equivalence class of $F_{0011}$ and the resulting $K^{(2)}_{S_2}$ can be read off the table for
logarithmic degree 2 in Appendix A.
The inverse question is more interesting, namely we are interested in understanding why no
other kernel term shows up at all. For logarithmic degree $l = 1$ the equivalence class of $F_{0111}$ has size
four and thus there is no kernel term showing up. According to (41) the $S_4$ invariant kernel of
logarithmic degree $l = 3$ should be one-dimensional. We can immediately understand why this
kernel term does not show up, by looking at the graphical representation:
$K^{(3)}_{S_4} = P_{S_4} \{ [\text{sum of degree-3 graphs, not recoverable here}] \}$.   (46)
This shows us that terms of the form $l_{12}^3$ appear, which is impossible for a Jordan-rank $r = 2$
theory. Though three free legs are available, the three-fold connection between vertices $i$ and $j$
is forbidden for $r = 2$. Of course LCFTs of higher Jordan-rank $r > 2$ are allowed to include such
terms, but similar combinatorial restrictions will show up for $r = 3$ as well.
$\langle 2000 \rangle = F_0$.   (47)
[Eqs. (48)-(50): not recoverable.]
$\langle 1120 \rangle = P_{S_2} \left\{ \tfrac{1}{2} F_{1120} + [\dots] F_{0120} + [\dots] F_{1110} + [\dots] F_0 \right\}$,   (51)
where as before $P_X = \sum_{x \in X} P_x$ and $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here.
The above correlators do not have an additional degree of freedom because they contain at
least one primary field. The simplest correlator with no primaries is
$\langle 1111 \rangle = P_{S_4} \left\{ \tfrac{1}{24} F_{1111} + [\dots] F_{0111} + [\dots] F_0 \right\}$.   (52)
This is the first correlator for $r = 3$ which has a non-trivial kernel, namely
$\mathrm{Ker}\langle 1111 \rangle = c_1 K^{(2)}_{S_4} F_0$.   (53)
The restriction that the expression needs to be invariant under S4 permutations is very strong and
forbids any kernel terms of degree one to show up.
The remaining correlators containing at least one primary field are
$\langle 2210 \rangle = P_{S_2} \left\{ \tfrac{1}{2} F_{2210} + [\dots] F_{2200} + [\dots] F_{1210} + [\dots] F_{1200} + [\dots] F_{1110} + [\dots] F_{0120} + [\dots] F_0 \right\}$   (54)
and
$\langle 2220 \rangle = P_{S_3} \left\{ \tfrac{1}{6} F_{2220} + [\dots] F_{1220} + [\dots] F_{1120} + [\dots] F_{1110} + [\dots] F_{0220} + [\dots] F_{0120} + [\dots] F_0 \right\}$,   (55)
where $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here.
Finally there are, up to permutations, four correlators without primary field and at least one
field being of Jordan-level 2.
$\langle 1112 \rangle = P_{S_3} \left\{ \tfrac{1}{6} F_{1112} + [\dots] F_{1111} + [\dots] F_{0112} + [\dots] F_{1110} + [\dots] F_{0012} + [\dots] F_{0111} + \dots \right\}$,
where $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here and the trailing terms of the expression are missing.
$\mathrm{Ker}\langle 1112 \rangle = P_{S_3} \left\{ c_1 (2K_2 - K_1) F_{0112} + c_2 K^{(2)}_{S_4} F_{1110} + \left[ c_3 K_1^2 + c_4 (K_2^2 - K_1 K_2) \right] F_{0111} + \left[ c_5 K_1^2 + c_6 (K_2^2 - K_1 K_2) \right] F_{1002} \right\}$.   (57)
The set $\{F_{0112}, F_{1012}, F_{1102}\}$ allows a one-dimensional kernel, namely $2K_2 - K_1$. Note that this
kernel is not $S_3$ invariant.
For the self-invariant terms $F_{1110}$ and $F_0$ we remarked in Section 2.6.1 that $S_3$ invariance
implies $S_4$ invariance. The dimension of the $S_4$-invariant kernel is $n^{max}_{S_4} = 1$ for $d = 2, 3$. The
corresponding kernel term for $F_{1110}$ shows up, but combinatorial restrictions forbid the same for
$F_0$. The only kernel of degree 3 that would have been possible is $K^{(3)}_{S_4}$, but this one includes $l_{ij}^3$
terms for all $1 \le i < j \le 4$, which is not compatible with the contraction rules as described in
Section 2.4.3: $\langle 2111 \rangle$ cannot contain $l_{34}^3$ terms, cf. (46).
For the 3-element sets $\{F_{0111}, F_{1011}, F_{1101}\}$ and $\{F_{0012}, F_{0102}, F_{1002}\}$ we know from
the kernel analysis in Appendix A that there is a two-dimensional kernel.
It should be noted that $\langle 1211 \rangle$ is generated by applying $P_{(12)}$ to $\langle 2111 \rangle$. The same holds for
the additional terms of the kernel. This means that the degrees of freedom we have for $\langle 2111 \rangle$
are not available for the permutations of this correlator, e.g., $\langle 1211 \rangle$.
The correlator $\langle 2211 \rangle$ comes with a high number of additional degrees of freedom, some of
which are restricted by combinatorial constraints. The correlator without kernel terms has the
form
$\langle 2211 \rangle = P_{S_2 \times S_2} \left\{ \tfrac{1}{4} F_{2211} + [\dots] F_{2201} + [\dots] F_{1211} + [\dots] F_{2200} + [\dots] F_{1111} + [\dots] F_{1201} + [\dots] F_{0211} + [\dots] F_{1200} + [\dots] F_{0102} + [\dots] F_{1101} + [\dots] F_{0111} + [\dots] F_0 \right\}$,   (58)
where $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here, and
where $P_{S_2 \times S_2} = 1 + P_{(12)} + P_{(34)} + P_{(12)(34)}$. The kernel of $\langle 2211 \rangle$ has a dimension of 18:
$\mathrm{Ker}\langle 2211 \rangle = P_{S_2 \times S_2} \left\{ K^{(1,d=1)}_{S_2} F_{2201} + K^{(1,d=1)}_{S_2} F_{1211} + K^{(2,d=3)} \dots + K^{(2,d=2)}_{S_2} \dots \right\}$,   (59)
where we used a somewhat condensed notation and left out almost all constants. If multiple
$F$ show up in brackets, one has to supply the necessary constants, for instance,
$K^{(2,d=2)}_{S_2} (F_{2200} + F_{0211} + F_{1111})$ stands for $2 \cdot 3 = 6$ degrees of freedom. For better orientation
we added a small $d =$ index to the common kernel terms which denotes their dimension.
For the sets $\{F_{2201}, F_{2210}\}$ and $\{F_{1211}, F_{2111}\}$ we get the expected one-dimensional kernel
$K^{(1)}_{S_2}$. Also the results for logarithmic degree 2 are not surprising.
Things are more complicated for higher logarithmic degrees. For $d = 3$ we would have been
expecting a two-dimensional kernel $K^{(3)}_{S_2}$ for $F_{1101}$, $F_{1200}$, $F_{0111}$ and a full 4-dimensional kernel
$K^{(3)}$ for $F_{0102}$. But here we have to take into account the combinatorial restrictions again. Of the
basis elements of $K^{(3)}_{S_2}$, $K^{(3,a)}_{S_2} \equiv K_1^3 + K_2^3$ contains $l_{12}^3, l_{13}^3, \dots, l_{34}^3$ contributions and
$K^{(3,b)}_{S_2} \equiv K_1^2 K_2 + K_1 K_2^2$ comes with $l_{12}^3$, $l_{34}^3$ terms. Thus both basis elements are not allowed due to the
$l_{34}^3$ term, but a linear combination is, namely $(K_1 - K_2)^2 (K_1 + K_2)$, which contains no $l_{34}^2$ term.
For the set $\{F_{0102} \equiv F_{0201}, F_{1002} \equiv F_{2001}, F_{0120} \equiv F_{0210}, F_{1020} \equiv F_{2010}\}$ the four-dimensional
kernel reduces by the combinatorial constraint to a three-dimensional one.
The reasoning for $d = 4$ goes along the same line. We expect from (38) a three-dimensional
kernel space, but we also have two restrictions. No $l_{ij}^4$ term may show up, not even $l_{12}^4$, because
of the $S_2 \times S_2$ invariance. And no $l_{34}^3$ term is allowed. These two restrictions limit the kernel to
$K = K_1 K_2 (K_1 - K_2)^2$, leaving us with a one-dimensional kernel.
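The combinatorial statements about powers of $l_{34}$ and $l_{12}$ can be verified by expanding the kernel polynomials explicitly. The sketch below assumes the definitions of $K_1$, $K_2$ from (31) and (32) and stores polynomials as dictionaries from exponent tuples to coefficients; all helper names are illustrative.

```python
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

def lin(coeffs):
    """Linear polynomial from {edge: coefficient}."""
    poly = {}
    for edge, c in coeffs.items():
        exps = [0] * 6
        exps[EDGES.index(edge)] = 1
        poly[tuple(exps)] = c
    return poly

def add(p, q, sign=1):
    """p + sign * q, dropping vanishing coefficients."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            m = tuple(x + y for x, y in zip(a, b))
            out[m] = out.get(m, 0) + ca * cb
    return {m: c for m, c in out.items() if c != 0}

def max_power(poly, edge):
    """Highest power of l_edge occurring in poly."""
    idx = EDGES.index(edge)
    return max((m[idx] for m in poly), default=0)

K1 = lin({(1, 2): 1, (3, 4): 1, (1, 3): -1, (2, 4): -1})
K2 = lin({(1, 2): 1, (3, 4): 1, (1, 4): -1, (2, 3): -1})
diff, total = add(K1, K2, -1), add(K1, K2)

basis_a = add(mul(K1, mul(K1, K1)), mul(K2, mul(K2, K2)))   # K1^3 + K2^3
basis_b = add(mul(mul(K1, K1), K2), mul(K1, mul(K2, K2)))   # K1^2 K2 + K1 K2^2
allowed3 = mul(mul(diff, diff), total)                      # (K1 - K2)^2 (K1 + K2)
allowed4 = mul(mul(K1, K2), mul(diff, diff))                # K1 K2 (K1 - K2)^2

print(max_power(basis_a, (3, 4)), max_power(basis_b, (3, 4)))
print(max_power(allowed3, (3, 4)))
print(max_power(allowed4, (3, 4)), max(max_power(allowed4, e) for e in EDGES))
```

Since $K_1 - K_2$ does not contain $l_{12}$ or $l_{34}$ at all, the allowed combinations automatically have lower powers of these two variables than the individual basis elements.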
$\langle 2221 \rangle = P_{S_3} \left\{ \tfrac{1}{6} F_{2221} + [\dots] F_{2220} + [\dots] F_{1221} + [\dots] F_{0221} + [\dots] F_{1220} + [\dots] F_{1121} + [\dots] F_{1111} + [\dots] F_{1120} + [\dots] F_{0220} + [\dots] F_{0121} + [\dots] F_{1110} + [\dots] F_{0012} + [\dots] F_{0120} + [\dots] F_0 \right\}$,   (60)
where $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here.
Interestingly the dimension of the kernel for $\langle 2221 \rangle$ is also 18 and by that not larger than the
kernel for $\langle 2211 \rangle$. Naively one would expect that the kernel dimension increases with growing
Jordan-level $K := \sum_i k_i$. On the other hand the larger symmetry group ($S_3$ instead of $S_2 \times S_2$)
reduces the kernel size, which can even lead to a smaller kernel, as we will see in the case of
$\langle 2222 \rangle$.
$\mathrm{Ker}\langle 2221 \rangle = P_{S_3} \Big\{ (2K_2 - K_1) F_{1221} \;[d=1]$
$+ \left[ c K_1^2 + c' (K_2^2 - K_1 K_2) \right] \left( F_{1220} \;[d=2] + F_{2111} \;[d=2] + F_{0221} \;[d=2] \right)$
$+ K^{(3)} F_{0121} \;[d=4] + K^{(3)}_{S_4} F_{1111} \;[d=1]$
$+ \left[ c K_2 (K_1 - K_2)(K_1 - 2K_2) + c' K_1^2 (K_1 - 2K_2) \right] F_{0220} \;[d=2]$
$+ \left[ c K_2 (K_1 - K_2)(K_1 - 2K_2) + c' K_1^2 (K_1 - 2K_2) \right] F_{2110} \;[d=2]$
$+ K_1^2 K_2 (K_1 - K_2) F_{0111} \;[d=1] + K_1^2 K_2 (K_1 - K_2) F_{0120} \;[d=1]$
$+ K_1^2 K_2 (K_1 - K_2) F_{1002} \;[d=1] \Big\}$.   (61)
There is not much surprise for most results. For logarithmic degree 1 and 2 we get for the
sets containing three $F$ a $d = 1$ respectively a $d = 2$ kernel. For degree 3 we have the self-invariant term $F_{1111}$ with an $S_4$ symmetry and two $d = 2$ kernels for $\{F_{0220}, F_{2020}, F_{2200}\}$ and
$\{F_{2110}, F_{1210}, F_{1120}\}$. There is also a set containing six $F$, which results in a full four-dimensional
$K^{(3)}$ kernel.
As expected combinatorial constraints show up for the first time for degree four, because of the
last vertex having one leg only and thus disallowing any $l_{i4}^4$ ($i = 1, 2, 3$) term. For this degree
a $d = 3$ kernel would actually have been possible, but eliminating all $l_{i4}^4$ terms means that the kernel has
to be reduced to a one-dimensional kernel each. Also note that the only possible kernel term
of degree 5 would have been $K^{(5)}_{S_4}$, which does not show up because of the same combinatorial
restriction.
$\langle 2222 \rangle = P_{S_4} \left\{ \tfrac{1}{24} F_{2222} + [\dots] F_{1222} + [\dots] F_{0222} + [\dots] F_{1122} + [\dots] F_{1112} + [\dots] F_{0122} + [\dots] F_{1111} + [\dots] F_{0022} + [\dots] F_{0112} + [\dots] F_{0111} + [\dots] F_0 \right\}$,   (62)
where $[\dots]$ stands for graph polynomials with permutation prefactors that are not recoverable here.
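We saw that the transition from $\langle 2211 \rangle$ to $\langle 2221 \rangle$ did not increase the dimension of the kernel,
mainly because of the increase of the discrete symmetry group from $S_2 \times S_2$ to $S_3$. The transition
to $\langle 2222 \rangle$ enlarges the symmetry group from $S_3$ to $S_4$ and by that even reduces the dimension of
the kernel to 13.
$\mathrm{Ker}\langle 2222 \rangle = P_{S_4} \Big\{ K^{(2)}_{S_2} F_{1122} \;[d=2]$
$+ \left[ c K_2 (K_1 - K_2)(K_1 - 2K_2) + c' K_1^2 (K_1 - 2K_2) \right] F_{0122} \;[d=2]$
$+ K^{(3)}_{S_4} F_{1112} \;[d=1] + K^{(4)}_{S_2} F_{0022} \;[d=3] + K^{(4)}_{S_4} F_{1111} \;[d=1]$
$+ \left[ c K_1^4 + c' K_2^2 (K_1 - K_2)^2 + c'' \left( K_1^3 K_2 - 2 K_1 K_2^3 + K_2^4 \right) \right] F_{0112} \;[d=3]$
$+ K_1 K_2 (K_1 - K_2)^2 (K_1 + K_2) F_{0012} \;[d=1] \Big\}$.   (63)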
There is no kernel of logarithmic degree one because at most we have four $F$ in a set and the $S_4$
symmetry then is forbidden according to Appendix A. Additional combinatorial constraints start
with logarithmic degree 5: no $l_{ij}^5$ for $1 \le i < j \le 4$ is allowed to show up.
For $F_{0012}$ plus the five permutations there would have been a $d = 3$ kernel, but the given linear
combination is the only one which eliminates all $l_{ij}^5$ terms. For $\{F_{0111}, F_{1011}, F_{1101}, F_{1110}\}$ only
(5,d=1)
(64)
As described in Section 2.5, it is possible to identify some of the appearing $F$-terms with each other. In this case it is easy to find the identifications that stem from the integration process by inserting the above ansatz into Eq. (10). This leads to
\[ O_1 \langle k_1 k_2 0 0\rangle = -2 z_1 \langle k_1 - 1, k_2, 0, 0\rangle - 2 z_2 \langle k_1, k_2 - 1, 0, 0\rangle, \tag{65} \]
and considering only the terms of lowest order in $\{l_{ij}\}$ we get
\[ (z_1 + z_2)\left(c_1 F_{k_1-1,k_2,0,0} + c_2 F_{k_1,k_2-1,0,0}\right) + O(l_{12}) = -2 z_1 F_{k_1-1,k_2,0,0} - 2 z_2 F_{k_1,k_2-1,0,0} + O(l_{12}). \tag{66} \]
We immediately see that these equations do not have a solution. As before, we can circumvent the problem by reducing the complexity of the equations, which can be accomplished by identifying some of the functions $F$. Here we can solve Eq. (66) by using the following identifications:
\[ F_{k_1-1,k_2,0,0} \equiv F_{k_1,k_2-1,0,0}. \tag{67} \]
This is in perfect agreement with the results presented so far for $r = 3$. Because of the above identifications we are left with only one function $F$ for each logarithmic degree of $\langle k_1 k_2 0 0\rangle$. Using (9) yields, after a short calculation, the full result for a correlator of Jordan-rank $r$ with two primary fields:
\[ \langle k_1 k_2 0 0\rangle = \sum_{n=0}^{k_1+k_2-(r-1)} \frac{(-2)^n}{n!}\, l_{12}^{\,n}\, F_{k_1+k_2-(r-1)-n,\;r-1,\;0,\;0}. \tag{68} \]
As a consistency check we can compare the above result with the one presented in [7] and [15], respectively. For the two-point correlation function the first paper gives the following result:
\[ \langle \Psi_{k_1}(z_1)\Psi_{k_2}(z_2)\rangle = \sum_{\ell=0}^{k_1+k_2} \frac{(-2)^{\ell}}{\ell!}\, l_{12}^{\,\ell}\, D_{(h_1=0,h_2=0,k_1+k_2-\ell)}, \tag{69} \]
where we slightly adapted the notation and have set the conformal weights $h_1$, $h_2$ to zero. The $D_{(\ldots)}$ are called structure constants and have the property that $D_{(h,h;k)} = 0$ for $k < r - 1$. In other words, the index $\ell$ in (69) effectively runs from 0 to $k_1 + k_2 - (r - 1)$, and thus (68) and (69) are of identical structure. This means that the polynomial dependence on the logarithms $l_{ij}$ is exactly the same and that precisely the same number of free structure constants $D_{(\ldots)}$ or structure functions $F(x)$ are needed.
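The term structure of Eq. (68) can be enumerated with a few lines of Python. This is only an illustrative sketch: the function name `two_log_field_terms` is hypothetical, and the coefficient is read as $(-2)^n/n!$, which is an assumption about a sign that is hard to recover from the printed formula.

```python
from fractions import Fraction
from math import factorial


def two_log_field_terms(k1, k2, r):
    """List the terms of Eq. (68) for the correlator <k1 k2 0 0> at Jordan-rank r.

    Each entry is (n, coefficient, F_index): the power of l_12, the assumed
    coefficient (-2)^n / n!, and the index of the structure function F.
    """
    top = k1 + k2 - (r - 1)  # highest power of l_12 that can appear
    terms = []
    for n in range(top + 1):  # the range is empty when top < 0
        coefficient = Fraction((-2) ** n, factorial(n))
        f_index = (top - n, r - 1, 0, 0)
        terms.append((n, coefficient, f_index))
    return terms


if __name__ == "__main__":
    for n, coefficient, f_index in two_log_field_terms(2, 2, 3):
        print(f"l12^{n}: {coefficient} * F{f_index}")
```

The number of returned terms equals the number of independent structure functions needed, which is the quantity compared between (68) and (69).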
6. Summary and discussion
In this paper we analyzed the influence of the global conformal symmetries, in the form of the global conformal Ward identities, on 4-point correlation functions in arbitrary logarithmic conformal field theories. While it is not possible to determine the correlators completely (this does not even work in the ordinary CFT case), it is possible to fix their generic structure.
The presented algorithm can be used to calculate the generic structure of 4-point correlators. Within this paper we restricted ourselves to combinations of proper primary and logarithmic fields, but we did mention how to adjust the algorithm in order to extend it to prelogarithmic fields.
We explicitly gave the results for all correlators of Jordan-rank $r = 2, 3$, up to permutations. In some of the results we found additional constants, which were identified as elements of the kernel of $O$. Furthermore, we discussed various restrictions which limit the number of terms that can appear in an ansatz or which lead to fewer degrees of freedom in the kernel. We also found that integration sometimes requires some of the functions $F$ to be identified with each other.
There are almost no computations of LCFT 4-pt functions in the literature. We can only compare the structure of our results with some of the functions given in [20]. However, Gaberdiel and Kausch compute, in the specific example of the $c = -2$ LCFT, the full non-chiral 4-pt functions. They can do this specifically for 4-pt functions which involve at least one or two twist fields. These latter fields are pre-logarithmic fields, and therefore not proper primary. Such functions are not covered by the explicit results in our work, although it is, in principle, a straightforward task to generalize our initial conditions to be applicable to such cases as well. The only 4-pt function we can compare directly with is the one of four logarithmic partner fields of the identity. The form of this function in [20] clearly agrees with our generic form for $\langle 1111\rangle$ in the $r = 2$ case. Also, their 4-pt functions of type 00 involving two twist fields and two proper primaries agree in their form with what we expect our ansatz would yield with the appropriate initial conditions: the structure of such 4-pt functions is equivalent to the one of type $\langle 1000\rangle$, again for $r = 2$.
Finally we gave explicit results for the case of exactly two logarithmic fields for arbitrary
Jordan-rank r. Studying this very simple case showed us why we need to identify some of the
functions F with each other. Also we did a consistency check of the result and showed that
Eq. (68) is equivalent to the one presented in [7].
The comparison can be extended to three-point correlators. For instance, we can consider the terms of logarithmic degree $l = 2$ of the correlator $\langle 2110\rangle$ in a Jordan-rank $r = 3$ theory, cf. Eq. (51):
\[ \langle 2110\rangle\big|_{l=2} = \left( -\tfrac{1}{2}\left(l_{12}^2 + l_{13}^2 + l_{23}^2\right) + 3 l_{12} l_{13} + l_{12} l_{23} + l_{13} l_{23} \right) F_0. \tag{70} \]
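As a comparison we evaluate formula (3.11) in [7] and get for $l = 2$ the same result,
\[ \langle 211\rangle\big|_{l=2} = \left( -\tfrac{1}{2}\left(l_{12}^2 + l_{13}^2 + l_{23}^2\right) + 3 l_{12} l_{13} + l_{12} l_{23} + l_{13} l_{23} \right) C_{(h_1,h_2,h_3;k=0)}, \tag{71} \]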
except that $F_0$ has to be replaced by the structure constant $C_{(h_1,h_2,h_3;k=0)}$. We note once more that we suppressed any direct but trivial dependence on the conformal weights, so we should actually compare with $C_{(0,0,0;k=0)}$. However, our results are, up to the omitted prefactor $\prod_{i<j} z_{ij}^{\mu_{ij}}$, valid independently of the values of the conformal weights $h_i$. For the other correlators, such as $\langle 2210\rangle$, we also confirmed that the results match if we restrict ourselves to the highest logarithmic degree, which corresponds to $l_{\max} = k_1 + k_2 + k_3 - r + 1$. As we will see in the following, it is interesting to study the case where $l < l_{\max}$. We use $\langle 2110\rangle$ as an example again, but this time we consider the term of $l = 1$ only:
\[ \langle 2110\rangle\big|_{l=1} = F_{1020}\,(l_{13} - l_{12} - l_{23}) + F_{1110}\,(l_{23} - l_{12} - l_{13}) + F_{1200}\,(l_{12} - l_{13} - l_{23}). \tag{72} \]
We remind the reader that the above result includes the usual identifications such as $F_{2100} \equiv F_{1200}$. The structure of the formula in [7] makes it obvious that for $l = 1$ only one structure constant shows up, and thus the corresponding term is
\[ \langle 211\rangle\big|_{l=1} = -(l_{12} + l_{13} + l_{23})\, C_{(h_1=0,h_2=0,h_3=0;k=1)}, \tag{73} \]
where we again set the conformal weights to zero and slightly adjusted the notation. Although this looks different at first glance, we can achieve the same form of the result if we demand that the following extended identifications hold too, namely
\[ F_{1200} \equiv F_{1020} \equiv F_{1110}. \tag{74} \]
This means that we not only regain the $F_0$ terms, but that we can reclaim all information, provided that we make all necessary identifications. By "necessary" we mean that we have to identify all $F_{\ldots}$ terms of the same logarithmic degree.
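The coefficients of the $l = 2$ polynomial multiplying $F_0$ in Eq. (70) can be cross-checked against the special conformal Ward identity. The sketch below is not the algorithm of the paper. It assumes vanishing conformal weights, an operator of the form $O_1 = \sum_i (z_i^2 \partial_i + 2 z_i \delta_i)$ in which $\delta_i$ lowers the Jordan level of field $i$ by one, and $l_{ij} = \log z_{ij}$, so that $\sum_k z_k^2 \partial_k l_{ij} = z_i + z_j$. The degree-one inputs `C111`, `C201`, `C210` are derived under the same assumption and are not quoted from the text.

```python
from fractions import Fraction as Fr

# Polynomials in (l12, l13, l23) are dicts mapping exponent triples to coefficients.


def deriv(poly, var):
    """Partial derivative with respect to variable number var (0, 1 or 2)."""
    out = {}
    for exps, coeff in poly.items():
        if exps[var] > 0:
            lowered = list(exps)
            lowered[var] -= 1
            key = tuple(lowered)
            out[key] = out.get(key, 0) + coeff * exps[var]
    return out


def combine(*terms):
    """Sum of (factor, polynomial) pairs with vanishing coefficients dropped."""
    out = {}
    for factor, poly in terms:
        for key, coeff in poly.items():
            out[key] = out.get(key, 0) + factor * coeff
    return {k: c for k, c in out.items() if c != 0}


L12, L13, L23 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Assumed degree-one parts (coefficient of F_0) of <111>, <201> and <210>.
C111 = {L12: Fr(-1), L13: Fr(-1), L23: Fr(-1)}
C201 = {L13: Fr(-2)}
C210 = {L12: Fr(-2)}


def quadratic(square_coeff):
    """l = 2 polynomial of <211> with a common coefficient for the squares."""
    return {(2, 0, 0): square_coeff, (0, 2, 0): square_coeff,
            (0, 0, 2): square_coeff,
            (1, 1, 0): Fr(3), (1, 0, 1): Fr(1), (0, 1, 1): Fr(1)}


def ward_residuals(q):
    """Coefficients of z1, z2, z3 in the assumed identity; all should be empty."""
    d12, d13, d23 = deriv(q, 0), deriv(q, 1), deriv(q, 2)
    return (combine((1, d12), (1, d13), (2, C111)),   # terms multiplying z1
            combine((1, d12), (1, d23), (2, C201)),   # terms multiplying z2
            combine((1, d13), (1, d23), (2, C210)))   # terms multiplying z3


if __name__ == "__main__":
    print(ward_residuals(quadratic(Fr(-1, 2))))
```

Under these assumptions the residuals vanish only when the squared logarithms carry the coefficient $-1/2$, while the mixed terms carry 3, 1 and 1.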
We already encountered one situation where we had to identify several functions F with
each other: the initial conditions (11) where we identified F0 F0 by virtue of the cluster
decomposition argument.
This raises the question of whether such massive identifications of functions $F$ are necessary or useful in the context of some physical theory, and of which conditions could force us to massively reduce the number of functions $F$. It is clear that the special case where all conformal weights $h_i$ are equal to each other has an additional symmetry, since we can freely exchange the fields. In this case, we definitely expect that a large number of such identifications should take place.
Furthermore, one can quickly check that the given solutions remain valid after identifying remaining free structure functions, because any such function can be chosen arbitrarily as long as no further constraints such as local conformal symmetry are invoked. Due to the recursive dependence of the solutions for total Jordan-level $K$ on the ones for level $K' < K$, identifications are consistent only if restricted to functions $F_{k_1 k_2 k_3 k_4}$, $F_{k_1' k_2' k_3' k_4'}$ with $k_1 + k_2 + k_3 + k_4 = k_1' + k_2' + k_3' + k_4'$. However, a more detailed analysis of which identifications should be present in the general case, i.e., for arbitrary values of the conformal weights $h_i$, will be left to future work.
Of course, when all four fields in the 4-point function are logarithmic, we cannot expect that
the resulting polynomials in the lij can be matched with the ones of 2- and 3-point functions. But
one might attempt to make the following comparison.
The structure functions $F_{k_1 k_2 k_3 k_4}(x)$ are ultimately composed out of (a suitable generalization of) conformal blocks which depend on the internal propagator in the 4-point function. Crossing symmetry of the 4-point function implies that the structure functions possess, for each asymptotic region $|x| < 1$, $|1 - x| < 1$, or $1/|x| < 1$, expansions of the schematic form
\[ F_{(h_1,k_1)(h_2,k_2)(h_3,k_3)(h_4,k_4)}(x) \sim \sum_{(h,k)} C^{(h,k)}_{(h_i,k_i)(h_j,k_j)}\, C_{(h,k)(h_l,k_l)(h_m,k_m)} + \ldots \tag{75} \]
for all permutations $\{i, j, l, m\}$ of $\{1, 2, 3, 4\}$, which must all be expansions of the same analytic functions. These expansions involve the 3-point structure constants as well as the OPE structure constants. In the logarithmic case, these structure constants are matrix valued with coefficients in $\mathbb{C}[\{l_{ij}\}]$. In the notation used in this paper, $C_{(h_1,k_1)(h_2,k_2)(h_3,k_3)} = \langle k_1 k_2 k_3\rangle$, where on both sides all terms of the form $z_{ij}^{\mu_{ij}}$ depending in the canonical way on the conformal weights are omitted. In the $r$-dimensional Jordan-cell space, this defines matrices $(C_{k_1})_{k_2 k_3}$ labeled by the first Jordan-level and with indices given by the second and third Jordan-level. In the same way, the propagator defines a matrix $(D)_{k_1 k_2} = \langle k_1 k_2\rangle$. The OPE structure constants are then given by the matrix product
\[ (C_{k_1})_{k_2}^{\;k_3} = (C_{k_1})_{k_2 k}\,(D^{-1})^{k k_3}, \tag{76} \]
involving the inverse propagator. Now, one can compute the leading orders of the different expansions of the 4-point structure functions, which will yield different polynomials in the $l_{ij}$ with coefficients given by rational functions of the 2- and 3-point structure constants $D_{(h,h;p)}$ and $C_{(h_i,h_j,h;q)}$. Two observations can now be made:
Firstly, the three expansions for the s-, t- and u-channel, i.e., for $|x| < 1$, $|1 - x| < 1$ and $1/|x| < 1$, all differ. They lead to different polynomials. It is easy to check in simple examples that certain monomials in the $l_{ij}$ may appear only in one of the expansions. This always happens for 4-point functions of the form $\langle k_1 k_2 k_3 k_4\rangle$ with all $k_i > 0$ but not all $k_i$ equal.
Secondly, the polynomials in lij with coefficients given by the structure functions Fk1 k2 k3 k4 (x)
cannot be matched to any of the three expansions. On the contrary, the 4-point functions will
involve all the different monomials in the lij and in particular all the ones which do not appear in all the expansions, but in only one of them. It is therefore much more difficult to
match the 4-point structure functions to expressions in the 3- and 2-point structure constants
or to suggest further identifications as they can easily be read off in the case of 4-point functions of type k1 k2 00 or k1 k2 k3 0. In fact, it is not straightforward how the three different
expansions should be combined for a comparison of coefficients in case all four fields are logarithmic. A further complication is given by the freedom to change the polynomials in the lij
by elements in the kernel of the operator O or, equivalently, by a redefinition of the structure
function coefficients. But we believe that it would be very interesting to investigate the consequences of crossing symmetry for the structure functions of LCFT 4-point functions, because
this might yield severe restrictions on the number of functions which have to be determined
by other means, for example local conformal invariance. This is an important task for future
work in order to greatly ease the full computation of 4-point correlation functions in LCFT of
rank r > 2.
Acknowledgements
The research of Michael Flohr is supported by the European Union network HPRN-CT-2002-00325 and the research of Michael Flohr and Marco Krohn is partially supported by the string theory network (SPP No. 1096), Fl 259/2-2, of the Deutsche Forschungsgemeinschaft.
M. Krohn would like to thank Sebastian Uhlmann and Robert Wimmer for helpful discussions.
The following tables contain the kernel terms that can show up for a logarithmic degree from
one to five. In addition to the logarithmic degree the kernel depends on the number of functions
F that are involved and also on the discrete symmetry. As we are considering four point functions
only we are left with four different symmetry groups.
The format of the entries is the same as in Section 2.6 with the small addition of the dimension
d of the kernel. It is interesting how similar the entries for the different logarithmic degrees are,
the only exception being the entry for the pair (F3 , S3 ) respectively (F12 , S4 ). Also note that each
column contains the full kernel, namely if and only if |F | = |S|, where S denotes the symmetry
group and |S| its cardinality.
means that these entries have to be identical as shown in (37), (39).
Log. deg 1
S2
S22
S3
F1
(1),d=1
KS
2
K (1),d=2
(1),d=1
KS
2
(1),d=1
KS
2
S4
K (1),d=2
()|d=1
K (1),d=2
0
(1),d=1
KS
()|d=1
K (1),d=2
Log. deg 2
S2
S22
S3
S4
F1
(2),d=2
KS
2
K (2),d=3
(2),d=2
KS
2
(2),d=2
KS
2
(2),d=1
KS
4
F2
F3
F4
F6
F12
F24
() = 2K2 K1 |d=1 .
(2),d=1
KS
K (2),d=3
()|d=2
F6
K (2),d=3
KS
F12
()|d=2
F24
K (2),d=3
S4
F2
F3
F4
(2),d=1
KS
4
(2),d=2
2
S2
S22
S3
F1
(3),d=2
KS
2
K (3),d=4
(3),d=2
KS
2
(3),d=2
KS
2
(3),d=1
KS
4
(3),d=1
KS
K (3),d=4
()|d=2
F6
K (3),d=4
KS
F12
()|d=2
F24
K (3),d=4
S4
F2
F3
F4
(3),d=1
KS
4
(3),d=2
2
S2
S22
S3
F1
(4),d=3
KS
2
K (4),d=5
(4),d=3
KS
2
(4),d=3
KS
2
(4),d=1
KS
4
(4),d=1
KS
K (4),d=5
()|d=3
F6
K (4),d=5
KS
F12
()|d=3
F24
K (4),d=5
F2
F3
F4
(4),d=1
KS
4
(4),d=5
2
Log. deg 5
S2
S22
S3
F1
(5),d=3
KS
2
K (5),d=6
(5),d=3
KS
2
(5),d=3
KS
2
(5),d=1
KS
4
S4
(5),d=1
KS
K (5),d=6
()|d=3
F6
K (5),d=6
KS
F12
()|d=3
F24
K (5),d=6
F2
F3
F4
(5),d=1
KS
4
(5),d=3
2
() = c(2K13 K22 + 8K12 K23 11K1 K24 + 5K25 ) + c (K14 K2 + 4K13 K22 6K12 K23 + 4K1 K24 )
+ c (8K15 + 1K14 K2 10K12 K23 + 20K1 K24 20K25 )|d=3 .
References
[1] V. Gurarie, Logarithmic operators in conformal field theory, Nucl. Phys. B 410 (1993) 535–549, hep-th/9303160.
[2] M.A.I. Flohr, Bits and pieces in logarithmic conformal field theory, Int. J. Mod. Phys. A 18 (2003) 4497–4592, hep-th/0111228.
[3] M.R. Gaberdiel, An algebraic approach to logarithmic conformal field theory, Int. J. Mod. Phys. A 18 (2003) 4593–4638, hep-th/0111260.
[4] J. Cardy, SLE for theoretical physicists, cond-mat/0503313, Ann. Phys., in press.
[5] J. Rasmussen, Note on SLE and logarithmic CFT, math-ph/0408011.
[6] A.A. Belavin, A.M. Polyakov, A.B. Zamolodchikov, Infinite conformal symmetry in two-dimensional quantum field theory, Nucl. Phys. B 241 (1984) 333–380.
[7] M.A.I. Flohr, Operator product expansion in logarithmic conformal field theory, Nucl. Phys. B 634 (2002) 511–545, hep-th/0107242.
[8] A. Nichols, SU(2)_k logarithmic conformal field theories, Ph.D. thesis, Theoretical Physics, University of Oxford, 2002.
[9] G. Piroux, P. Ruelle, Logarithmic scaling for height variables in the Abelian sandpile model, Phys. Lett. B 607 (2005) 188–196, cond-mat/0410253.
[10] S. Mahieu, P. Ruelle, Scaling fields in the two-dimensional Abelian sandpile model, Phys. Rev. E 64 (2001) 066130, hep-th/0107150.
[11] M.A.I. Flohr, Null vectors in logarithmic conformal field theory, JHEP Proc. Sect., PRHEP-tmr2000/044, hep-th/0009137.
[12] A.M. Ghezelbash, V. Karimipour, Global conformal invariance in D dimensions and logarithmic correlation functions, Phys. Lett. B 402 (1997) 282–289, hep-th/9704082.
[13] M. Khorrami, A. Aghamohammadi, M.R. Rahimi Tabar, Logarithmic conformal field theories with continuous weights, Phys. Lett. B 419 (1998) 179–185, hep-th/9711155.
[14] S. Moghimi-Araghi, S. Rouhani, M. Saadat, Logarithmic conformal field theory through nilpotent conformal dimensions, Nucl. Phys. B 599 (2001) 531, hep-th/0008165.
[15] M.R. Rahimi Tabar, A. Aghamohammadi, M. Khorrami, The logarithmic conformal field theories, Nucl. Phys. B 497 (1997) 555–566, hep-th/9610168.
[16] J. Nagi, Logarithmic primary fields in conformal and superconformal field theory, hep-th/0504009.
[17] S. Moghimi-Araghi, S. Rouhani, M. Saadat, Use of nilpotent weights in logarithmic conformal field theories, Int. J. Mod. Phys. A 18 (2003) 4747–4770, hep-th/0201099.
[18] M.A.I. Flohr, Singular vectors in logarithmic conformal field theories, Nucl. Phys. B 514 (1998) 523–552, hep-th/9707090.
[19] I.I. Kogan, A. Lewis, Origin of logarithmic operators in conformal field theories, Nucl. Phys. B 509 (1998) 687–704, hep-th/9705240.
[20] M.R. Gaberdiel, H.G. Kausch, A local logarithmic conformal field theory, Nucl. Phys. B 538 (1999) 631–658.
Abstract
We consider Hermite and Laguerre $\beta$-ensembles of large $N \times N$ random matrices. For all $\beta$ even, corrections to the limiting global density are obtained, and the limiting density at the soft edge is evaluated. We use the saddle point method on multidimensional integral representations of the density which are based on special realizations of the generalized (multivariate) classical orthogonal polynomials. The corrections to the bulk density are oscillatory terms that depend on $\beta$. At the edges, the density can be expressed as a multiple integral of the Kontsevich type which constitutes a $\beta$-deformation of the Airy function. This allows us to obtain the main contribution to the soft edge density when the spectral parameter tends to $\pm\infty$.
2006 Elsevier B.V. All rights reserved.
MSC: 15A52; 41A60; 33D52
PACS: 02.50.Cw; 02.30.Mv; 02.30.Gp
Keywords: Random matrices; Asymptotic analysis; Calogero–Moser–Sutherland models
1. Introduction
We deal with two families of $N \times N$ random matrices: the Hermite and Laguerre $\beta$-ensembles (for a review see [9]). These ensembles possess an eigenvalue joint probability density function (p.d.f.) of the form
\[ P_{N,\beta}(x) = \frac{1}{Z_N}\, e^{-\beta W(x)}, \qquad x = (x_1, \ldots, x_N) \in I^N, \tag{1} \]
where $\beta$ is real and positive. The supports $I$ of the eigenvalues in the Hermite and Laguerre cases are respectively $(-\infty, \infty)$ and $(0, \infty)$. The ensembles' names come from the fact that their p.d.f. generalize the weight functions related to the Hermite and Laguerre polynomials; that is,
\[ W(x) = \begin{cases} \dfrac{1}{2}\sum_{i=1}^{N} x_i^2 - \sum_{1\le i<j\le N} \ln|x_i - x_j|, & \text{Hermite},\\[2mm] \dfrac{1}{2}\sum_{i=1}^{N} x_i - \dfrac{a}{2}\sum_{i=1}^{N} \ln|x_i| - \sum_{1\le i<j\le N} \ln|x_i - x_j|, & \text{Laguerre}, \end{cases} \tag{2} \]
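where $a$ is a real and non-negative parameter. The normalization constants can be computed with the help of the Selberg integrals:
\[ Z_N = \begin{cases} G_{\beta,N} := g_{\beta,N} \displaystyle\prod_{j=2}^{N} \frac{\Gamma(1+j\beta/2)}{\Gamma(1+\beta/2)}, & \text{Hermite},\\[3mm] W_{a,\beta,N} := w_{a,\beta,N} \displaystyle\prod_{j=1}^{N} \frac{\Gamma(1+j\beta/2)\,\Gamma(1+(a+j-1)\beta/2)}{\Gamma(1+\beta/2)}, & \text{Laguerre}, \end{cases} \tag{3} \]
where $g_{\beta,N} = (2\pi)^{N/2} \beta^{-N(1/2+\beta(N-1)/4)}$ and $w_{a,\beta,N} = (2/\beta)^{N(a\beta/2+1+\beta(N-1)/2)}$.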
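As a quick sanity check of the Hermite normalization in Eq. (3), the sketch below evaluates $\ln G_{\beta,N}$ and compares it with integrals that can be done by hand. It reads the constant as $g_{\beta,N} = (2\pi)^{N/2}\beta^{-N(1/2+\beta(N-1)/4)}$ with the product running over $j = 2, \ldots, N$, which is an assumption about the printed formula. The function name `log_hermite_norm` is hypothetical.

```python
import math


def log_hermite_norm(beta, n):
    """Natural logarithm of G_{beta,N}, the Hermite normalization of Eq. (3)."""
    log_g = 0.5 * n * math.log(2.0 * math.pi) \
        - n * (0.5 + beta * (n - 1) / 4.0) * math.log(beta)
    log_product = sum(
        math.lgamma(1.0 + j * beta / 2.0) - math.lgamma(1.0 + beta / 2.0)
        for j in range(2, n + 1)
    )
    return log_g + log_product


if __name__ == "__main__":
    # N = 1: a single Gaussian integral, sqrt(2*pi/beta).
    print(math.exp(log_hermite_norm(3.0, 1)), math.sqrt(2.0 * math.pi / 3.0))
    # N = 2, beta = 2: the integral of exp(-x^2 - y^2) (x - y)^2 equals pi.
    print(math.exp(log_hermite_norm(2.0, 2)), math.pi)
```

Working with `math.lgamma` rather than `math.gamma` keeps the evaluation stable for large $N$, where the Gamma functions themselves overflow.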
For special values of the Dyson index $\beta$, we recover classical random matrix ensembles (see e.g. [9,19]). Indeed, the $\beta = 1$, 2, and 4 Hermite ensembles are respectively equivalent to the Gaussian orthogonal, unitary, and symplectic ensembles. The Laguerre ensembles are similarly related to the real, complex and quaternionic Wishart matrices. Recently, Dumitriu and Edelman [3] have constructed explicit random matrices associated to the Hermite and Laguerre p.d.f. given in Eq. (1). A generic random $N \times N$ matrix belonging to the Hermite $\beta$-ensemble can be written as a tridiagonal symmetric matrix:
\[
H_\beta = \frac{1}{\sqrt{\beta}}
\begin{pmatrix}
N[0,1] & \tilde\chi_{(N-1)\beta} & & & \\
\tilde\chi_{(N-1)\beta} & N[0,1] & \tilde\chi_{(N-2)\beta} & & \\
 & \tilde\chi_{(N-2)\beta} & N[0,1] & \tilde\chi_{(N-3)\beta} & \\
 & & \ddots & \ddots & \ddots \\
 & & & \tilde\chi_{\beta} & N[0,1]
\end{pmatrix}.
\]
This means that the $N$ diagonal elements and the $N - 1$ subdiagonal elements are mutually independent; the diagonal elements are normally distributed (with mean zero and variance 1) while the off-diagonal elements have a chi distribution. Recall that the densities associated to $N[\mu, \sigma]$ and $\tilde\chi_k$ are respectively $(2\pi\sigma^2)^{-1/2} e^{-(x-\mu)^2/(2\sigma^2)}$ and $2x^{k-1}e^{-x^2}/\Gamma(k/2)$, where in the latter case $x > 0$. Any $N \times N$ matrix $L_\beta$ of the Laguerre $\beta$-ensemble also has a tridiagonal form: $L_\beta = B_\beta^{T} B_\beta$, for some $N \times N$ matrix
1
B =
(N1)
(P 1)
..
.
(N2)
..
.
(P N+1)
a=P N +1
2
.
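The Hermite tridiagonal model described above is easy to sample with the standard library alone. The following sketch assumes an overall prefactor $1/\sqrt{\beta}$ and chi variables with density $2x^{k-1}e^{-x^2}/\Gamma(k/2)$, which can be drawn as the square root of a Gamma($k/2$, 1) variate. Both the prefactor and the helper names are assumptions made for illustration, not details confirmed by the text.

```python
import math
import random


def sample_chi_tilde(k, rng):
    """Draw from the density 2 x^(k-1) exp(-x^2) / Gamma(k/2) on x > 0.

    If Y ~ Gamma(shape=k/2, scale=1), then sqrt(Y) has exactly this density.
    """
    return math.sqrt(rng.gammavariate(k / 2.0, 1.0))


def sample_hermite_tridiagonal(n, beta, rng):
    """Return an n x n symmetric tridiagonal matrix as a list of rows."""
    scale = 1.0 / math.sqrt(beta)  # assumed overall prefactor
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = scale * rng.gauss(0.0, 1.0)  # N[0, 1] diagonal entries
    for i in range(n - 1):
        # chi index decreases from (n-1)*beta down to beta along the off-diagonal
        off = scale * sample_chi_tilde((n - 1 - i) * beta, rng)
        h[i][i + 1] = off
        h[i + 1][i] = off
    return h


if __name__ == "__main__":
    matrix = sample_hermite_tridiagonal(4, 2.0, random.Random(1))
    for row in matrix:
        print(["%.3f" % entry for entry in row])
```

Only $2N - 1$ random numbers are needed per sample, compared with of order $N^2$ for a full Gaussian matrix, which is the practical appeal of the tridiagonal construction.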
In this article, we compute the density for large but finite random matrices of the Hermite and Laguerre $\beta$-ensembles. The density, or the marginal eigenvalue probability density, is defined as follows:
N
N, (x) :=
(4)
PN, (x1 , . . . , xN ) dx1 dxN .
ZN
IN
The quantity $N^{-1}\rho_{N,\beta}(x)\,dx$ represents the probability to have an eigenvalue in the interval $[x, x + dx]$. The density has two simple physical interpretations.
First, we remark that the Hermite p.d.f. is equivalent to the Boltzmann factor of a log-potential Coulomb gas with particles of charge unity confined to the interval $(-\sqrt{2N}, \sqrt{2N})$ with neutralizing background charge density $(\sqrt{2N}/\pi)\sqrt{1 - x^2/2N}$. From this point of view, $Z_N$ (divided by $N!$) is simply the canonical partition function at inverse temperature $\beta$ and $\rho_{N,\beta}(x)\,dx$ gives the number of charges present in the interval $[x, x + dx]$. This analogy allows one to predict the global density:
\[ \lim_{N\to\infty} \sqrt{\frac{2}{N}}\, \rho_{N,\beta}\big(\sqrt{2N}\,x\big) = \rho_W(x) := \begin{cases} \dfrac{2}{\pi}\sqrt{1 - x^2}, & -1 < x < 1,\\[2mm] 0, & |x| \ge 1. \end{cases} \tag{5} \]
This result is known as the Wigner semicircle law. For a finite matrix, we expect that the scaled density is of order one in the interval $(-\sqrt{2N}, \sqrt{2N})$, the bulk region of the mechanical problem, while it decreases rapidly around $\pm\sqrt{2N}$, called the soft edges. A similar log-gas construction is possible for the Laguerre case. One expects the $c = 1$ Marčenko–Pastur law [18]:
\[ \lim_{N\to\infty} 4\,\rho_{N,\beta}(4Nx) = \rho_{MP}(x) := \begin{cases} \dfrac{2}{\pi}\sqrt{\dfrac{1}{x} - 1}, & 0 < x < 1,\\[2mm] 0, & x \ge 1. \end{cases} \tag{6} \]
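A short numerical check that the limiting Laguerre density is normalized to one may be helpful. The sketch reads Eq. (6) as $\rho_{MP}(x) = (2/\pi)\sqrt{1/x - 1}$ on $(0, 1)$, which is an assumption about the printed formula, and uses the substitution $x = \sin^2 t$ to remove the integrable singularity at the hard edge. The function names are hypothetical.

```python
import math


def rho_mp(x):
    """Assumed form of the c = 1 Marcenko-Pastur density rescaled to (0, 1)."""
    if 0.0 < x < 1.0:
        return (2.0 / math.pi) * math.sqrt(1.0 / x - 1.0)
    return 0.0


def mp_total_mass(steps=2000):
    """Integrate rho_mp over (0, 1) with the midpoint rule in t, x = sin(t)^2."""
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        x = math.sin(t) ** 2
        jacobian = 2.0 * math.sin(t) * math.cos(t)  # dx/dt
        total += rho_mp(x) * jacobian * h
    return total


if __name__ == "__main__":
    print(mp_total_mass())
```

After the substitution the integrand becomes $(4/\pi)\cos^2 t$, which is smooth, so a modest number of midpoint steps is sufficient.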
We see that, in the Laguerre case, the bulk is $(0, 4N)$ while the soft edge is the point $4N$. The origin is referred to as the hard edge of the support because the eigenvalues are constrained to be positive. The predictions given in Eqs. (5) and (6) have been confirmed in [1,7]. The asymptotic analysis used in these references constitutes the starting point for the study of the higher-order expansions to be undertaken in the present work.
Second, there is a deep connection between the $\beta$-ensembles and some integrable quantum mechanical $N$-body problems on the line, known as the Calogero–Moser–Sutherland (CMS) models (a good reference is [21]). The Hermite p.d.f. is in fact the ground state wave function squared of the (rational) $A_{N-1}$ CMS model, whose Hamiltonian is
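\[ H^{(H)} = -\sum_{i=1}^{N} \frac{\partial^2}{\partial x_i^2} + \frac{\beta^2}{4}\sum_{i=1}^{N} x_i^2 + \frac{\beta(\beta-2)}{2}\sum_{1\le i<j\le N} \frac{1}{(x_i - x_j)^2}, \]
for $x_j \in (-\infty, \infty)$. The Laguerre p.d.f. is the ground state squared of the Hamiltonian of the $B_N$ CMS model, which can be expressed as follows: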
H
(L)
N
N
2
1
1
2
= 2
2xi 2 +
+
a(a 2) + xi
xi
4
xi
xi
i=1
i=1
x i + xj
+ ( 2)
,
(xi xj )2
1i<j N
where $x_j \in (0, \infty)$. It has been shown in [1] (see also [23]) that the eigenfunctions of the conjugated Schrödinger operators $e^{\beta W/2} H^{(H)} e^{-\beta W/2}$ and $e^{\beta W/2} H^{(L)} e^{-\beta W/2}$ are respectively the generalized (or multivariate) Hermite and Laguerre polynomials, previously introduced by Lassalle in [16,17]. In the context of CMS models, the global density can be seen as the ground state expectation value of the density operator $\hat\rho(x)$
The relation between the CMS models and the generalized classical orthogonal polynomials furnishes, when $\beta$ is an even integer, new integral representations of the global density that are perfectly suited for asymptotic analysis. Let us be more explicit. The definition of the density given in (4) contains $N$ integrals; considering $N$ large does not simplify the calculation. On the other hand, it has been noticed in [1,7] that the density is a particular Hermite (or Laguerre) polynomial, characterized by a partition $\lambda = ((N-1)^\beta)$ and evaluated at $x_1 = \cdots = x_\beta = x$ (see below). Using the work of Kaneko [14] and Yan [22], one can then show that the density is proportional to the following $\beta$-dimensional integral:
\[ R_{N,\beta}(x) := \int_C du_1\, e^{-Nf(u_1,x)} \cdots \int_C du_\beta\, e^{-Nf(u_\beta,x)} \prod_{1\le j<k\le\beta} |u_j - u_k|^{4/\beta}, \tag{7} \]
a () m (x1 , . . . , xN )
(triangularity),
<
(eigenfunction)
for some eigenvalue (). In the last equations, < means that kj =1 i kj =1 i for all k
()
Dk :=
N
xik
i=1
2
1
2
k
k
+
x
x
.
i
j
xi
xj
xi2 1i<j N xi xj
The generalized Hermite polynomials, denoted by $H_\lambda(x_1, \ldots, x_N; \alpha)$, are the only symmetric polynomials obeying
()
b (, N)J() (x1 , . . . , xN ),
H (x1 , . . . , xN ; ) = J (x1 , . . . , xN ) +
||=||2n
n=1,2,...,|/2|
()
D0 2E1 H (x1 , . . . , xN ; ) = 2||H (x1 , . . . , xN ; ),
where
Ek :=
N
i=1
xik
.
xi
2
,
where $H^{(H)}$ is the CMS Hamiltonian defined in Section 1. One can show that
\[ H_\lambda(x_1, \ldots, x_N; \alpha) = \exp\left(-\frac{1}{4} D_0^{(\alpha)}\right) J_\lambda^{(\alpha)}. \tag{8} \]
Similarly to the Hermite case, the generalized Laguerre polynomials, written L (x1 , . . . ,
xN ; ), are the unique symmetric polynomials satisfying
()
c (, , N)J() (x1 , . . . , xN ),
L (x1 , . . . , xN ; ) = J (x1 , . . . , xN ) +
||=||n
n=1,2,...,||
()
D1 E1 + ( + 1)E0 L (x1 , . . . , xN ; ) = ||L (x1 , . . . , xN ; ).
2
,
a 1
.
2
The following formula furnishes a way to compute the generalized Laguerre polynomials:
L (x1 , . . . , xN ; ) = exp D1() ( + 1)E0 J() (x1 , . . . , xN ).
(9)
When $\beta$ is an even integer, the density in the Hermite and Laguerre ensembles can be respectively written as a particular Hermite and Laguerre polynomial; explicitly,
G
2
N G,N1 ex /2 H ((N1) ) (x1 , . . . , x ; /2)|x1 ==x =x ,
Hermite,
,N
N, (x) =
N Wa,,N1 x a/2 ex/2 L a1+2/ (x , . . . , x ; /2)|
x1 ==x =x , Laguerre,
Wa,,N
((N1) ) 1
(10)
where we have used the convention $(n^k) = (n, \ldots, n)$. Eqs. (8) and (9), together with the fact that $J_\lambda^{(\alpha)}(x_1, \ldots, x_k) = x_1^n \cdots x_k^n$ when $\lambda = (n^k)$, readily imply the following exact expressions of the density:
N, (x) = N
G,N 1 x 2 /2
e
G,N
(N
1)/2
n=0
(1)n (/2) n N1
x1 xN1
D0
4n n!
(11)
x1 ==x =x
+ (a 2/)E0 x1 x
,
D1
n!
x1 ==x =x
N, (x) = N
(12)
n=0
$e^{-Nf(u,x)}$ is analytic everywhere in the finite complex $u$-plane, except possibly at a pole depending on $x$, and that $C$ is the real interval $(-\infty, \infty)$.
The method of steepest descent requires that the integrand of (7) be analytic. This means in particular that the absolute values must be removed. Such an operation is realized in the following lemma; it is possible when the integration line $C$ of the variable $u_j$ is deformed into an appropriate complex path $C_j$. Acceptable contours are given in Fig. 1. Other appropriate contours are obtained by reflecting the picture with respect to the real axis. Note that the dashed lines stand for (movable) branch cuts. We stress that the contour of $u_j$ starts at $-\infty$ and ends at the complex variable $u_{j-1}$. Only the path of $u_1$ (the last variable to be integrated) ends on the real axis.
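Lemma 1. Let $\{C_j\}$ be a set of non-intersecting contours such that $C_1$ is a simple contour going from $-\infty$ to $\infty$ and such that $C_j$ goes from $-\infty$ to $u_{j-1}$ for all $j = 2, \ldots, n$ (see Fig. 1). Then
\[ \int_{-\infty}^{\infty} du_1 \cdots \int_{-\infty}^{\infty} du_n \prod_{i=1}^{n} e^{-Nf(u_i,x)} \prod_{1\le j<k\le n} |u_j - u_k|^{4/\beta} = n! \int_{C_1} du_1 \cdots \int_{C_n} du_n \prod_{i=1}^{n} e^{-Nf(u_i,x)} \prod_{1\le j<k\le n} (u_j - u_k)^{4/\beta}, \]
where $-\pi < \arg u_j \le \pi$ and where $\arg (u_i - u_j)^{4/\beta} = 0$ when $u_i, u_j \in \mathbb{R}$ but $u_i > u_j$.¹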
Proof. Using the invariance of the left-hand side of the previous equation under any permutation,
we immediately see that it is equivalent to the following ordered integrals:
n!
u1
du1 e
Nf (u1 ,x)
un1
du2 e
Nf (u2 ,x)
(uj uk )4/ .
1j <kn
These integrals only contain analytic functions, so we can use Cauchy's theorem. This implies that the contour of $u_1$ can be deformed into any simple curve starting at $-\infty$ and stopping at $\infty$. For the remaining variables, any non-intersecting contour of integration in the complex plane which starts at $-\infty$, does not cross the branch cuts coming from the multivaluedness of the integrand, and complies with the ordering of the variables can be chosen. □
1 When n = and
/ N in Lemma 1, other integral representations in which all variables go from to are
possible [2].
We are now in a position to analyse (7) when $N$ is large. Recall that the basic idea of the steepest descent method is to choose a path for which the decrease of $f(u, x)$ is maximal. In particular, this means that the contour must pass through the saddle points. In the bulk case, $f$ has two simple saddle points $u_\pm$; that is,
\[ \left.\frac{\partial f(u,x)}{\partial u}\right|_{u=u_\pm} = 0, \qquad \left.\frac{\partial^2 f(u,x)}{\partial u^2}\right|_{u=u_\pm} = R\,e^{i\phi_\pm}, \quad R > 0. \tag{13} \]
The directions of steepest descent at these points, denoted $\theta_\pm$, are such that $\cos(2\theta_\pm + \phi_\pm) = 1$ and $\sin(2\theta_\pm + \phi_\pm) = 0$, so
=
(mod ),
2
< .
(14)
Proposition 2. Let f (u, x) be a function that satisfies Eqs. (13) and (14). Let also f =
f (u , x). Suppose moreover that the saddle points are such that
(u ) <
(u+ ). Then,
2 1
RN, (x) =
(/2, )2 (u+ u )
/2
NR
1
eN (f+ +f )/2 ei(1)(+ + ) rN, (x) + O
,
N
where
rN, (x) = 1 + 2
/2
k=1
k
j =1
(1 + 2j/)
(1 + 2(j k)/)
2
ei2k (+ + )/
(u+ u )4k
2 /
(N R)2k
2 /
!
"
cos ikN (f+ f ) + k(+ )(3 2/)
and
n, :=
du1
n/2
2n(n1)/
n
j =2
dun
n
i=1
eui
|uj uk |4/
1j <kn
(1 + 2j/)
.
(1 + 2/)
(15)
Proof. We first apply Lemma 1 to the expression (7). Then, the contours Cj are deformed into
steepest descent contours Sj = Sj Sj+ passing through the saddle points u . Close to these
points, the contours are parametrized as follows:
uj = u + tj ei ,
< arg uj ,
on Sj ,
where the angles of steepest descent are given (14) and where tj (, ) for some > 0.
Moreover, we impose ti > tj for i < j in order to guarantee
(ui ) >
(uj ) for i < j . Setting
NR
tj
yj =
2
we obtain
yi (, ) and
RN, (x) = !
n=0
du1
Sj+
dun+1
Sj
dun
Sj+
du
n
j =1
Sj
Nf (uj ,x)
(uj uk )
4/
1j <k
In terms of the new variables introduced above, the right-hand side of the previous equation
becomes
yn1
y1
n
2
1
Sn
dy1
dy2
dyn
eyj 1 + O
!
N
n=0
j =1
(yp yq )4/
1p<qn
yk2
k=n+1
Sn = (u+ u )4n(n)/
2
NR
y1
dyn+2
dyn+1
1
1+O
N
where
yn+1
dy
"
(yr ys )4/ ,
n+1r<s
3/212n(n)/
Sn n, n, + O
1
N
n
1
=
(/2, )2 S/2 rN, (x) + O
,
/2
N
n=0
where
/2
1 /2+k, /2k,
rN, (x) = 1 +
/2, /2,
/2
k=1
S/2+k + S/2k
1
+O
.
S/2
N
/2 + k
(16)
Note that the order of the analytic corrections (i.e., those coming from the expansion of $f(u_j, x)$ as a polynomial in $y_j$ of degree higher than 2) is now $1/N$ rather than $1/\sqrt{N}$. This can be explained by using the theory of the generalized Hermite polynomials [1]: only symmetric polynomials $p(y_1, \ldots, y_n)$ of even degree may have a non-zero contribution to
dy1
dyn
n
eyi
i=1
1j <kn
In our case, the $O(1/\sqrt{N})$ terms are symmetric polynomials in $y_j$ of degree one and three, so they do not contribute to the expression of $R_{N,\beta}(x)$.
Note also that $(S_{\beta/2+k} + S_{\beta/2-k})/S_{\beta/2}$ is of order $1/N^{2k^2/\beta}$. Hence, the combinatorial terms with $k > \sqrt{\beta/2}$ are smaller than the analytic corrections of order $1/N$; thus we must truncate Eq. (16) as follows:
rN, (x) = 1 +
/2
k=1
/2 + k
1 /2+k, /2k, S/2+k + S/2k
.
/2, /2,
S/2
/2
1 /2+k, /2k,
=
2 /
2k
/2 + k /2
/2, /2,
(1 + 2(j k)/)
2
j =1
and the proof is complete.
The explicit link between the Hermite density and the integral of the type (7) has been obtained in [1]. It reads:
1 G,N 1
2
(2N )N/2+ eN x RN, (x)
N, ( 2Nx) =
(17)
2 G,N ,
if
1
ln(iu + x).
(18)
N
Up to additive terms of order $1/N$, the latter function has two saddle points
\[ u_\pm = \frac{1}{2}\left(ix \pm \sqrt{1 - x^2}\right). \]
We see that $\Re(u_+) > \Re(u_-)$ only if $-1 < x < 1$. According to the notation used in Eqs. (13) and (14), we have
1 N 1
N 1
2
2
ln 2 + x i
arccos x x 1 x
f =
2
N
N
f (u, x) = 2u2 + ln(iu + x)
and
\[ R\,e^{i\phi_\pm} = 8\sqrt{1 - x^2}\; e^{\pm i \arcsin x}, \]
where we have made use of
\[ \arcsin x = -i \ln\left(ix + \sqrt{1 - x^2}\right) = \pi/2 - \arccos x. \]
Note that the inverse trigonometric functions are defined on their principal branch; that is, $\arcsin x : [-1, 1] \to [-\pi/2, \pi/2]$ and $\arccos x : [-1, 1] \to [\pi, 0]$. Thus the angles of steepest descent are
\[ \theta_\pm = \mp\frac{1}{2} \arcsin x. \]
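The saddle-point data quoted above can be verified numerically. The sketch below assumes that, once the $O(1/N)$ terms are dropped, the exponent reads $f(u, x) = 2u^2 - \ln(iu + x)$ and that the integrand is $e^{-Nf}$. Both are readings of partly illegible formulas and should be treated as assumptions. It checks that $u_\pm = \frac{1}{2}(ix \pm \sqrt{1-x^2})$ are stationary points, that $f''(u_\pm) = 8\sqrt{1-x^2}\,e^{\pm i \arcsin x}$, and that along $\theta_\pm = \mp\frac{1}{2}\arcsin x$ the quadratic term is real and positive.

```python
import cmath
import math


def f_prime(u, x):
    """First derivative of the assumed f(u, x) = 2 u^2 - ln(i u + x)."""
    return 4.0 * u - 1j / (1j * u + x)


def f_second(u, x):
    """Second derivative of the assumed f(u, x)."""
    return 4.0 - 1.0 / (1j * u + x) ** 2


def saddle_points(x):
    """Return (u_plus, u_minus) for -1 < x < 1."""
    s = math.sqrt(1.0 - x * x)
    return 0.5 * (1j * x + s), 0.5 * (1j * x - s)


def steepest_descent_angles(x):
    """Return (theta_plus, theta_minus) = (-asin(x)/2, +asin(x)/2)."""
    return -0.5 * math.asin(x), 0.5 * math.asin(x)


if __name__ == "__main__":
    x = 0.3
    for u, theta in zip(saddle_points(x), steepest_descent_angles(x)):
        curvature = f_second(u, x) * cmath.exp(2j * theta)
        print(abs(f_prime(u, x)), curvature)
```

A real, positive value of $f''(u_\pm)e^{2i\theta_\pm}$ means that $e^{-Nf}$ decays like a Gaussian along the chosen direction, which is what the rescaling to the variables $y_j$ relies on.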
Stirling's approximation,
\[ \Gamma(y + z) = \sqrt{2\pi}\, e^{-z} z^{y+z-1/2} \left(1 + O\!\left(\frac{1}{z}\right)\right) \quad \text{when } z \to \infty, \]
immediately implies
G,N 1 2N/21/2 N/2 eN/2
1
=
.
1+O
G,N
N
N N/2+1/2
The substitution of the above results in Proposition 2 provides the sought asymptotic corrections to the global density.
Corollary 3. Let $-1 < x < 1$ and let $P_W(x)$ denote the (cumulative) probability distribution associated to the semicircle law given in Eq. (5); i.e.,
\[ P_W(x) = \int_{-1}^{x} \rho_W(t)\,dt = 1 + \frac{x}{2}\rho_W(x) - \frac{1}{\pi}\arccos x. \tag{19} \]
Then
2
1
N, ( 2N x) = W (x)rN, (x) + O
,
N
N
where
rN, (x) = 1 + 2
/2
(1)k
2
( 3 W (x)3 N )2k /
!
k
(1 + 2j/)
cos 2kNPW (x) + k(x, )
(1 + 2(j k)/)
k=1
j =1
(20)
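The closed form of the cumulative semicircle distribution in Eq. (19) can be compared with a direct numerical integration. This sketch assumes that the semicircle law of Eq. (5) is normalised on [-1, 1], i.e. that it equals (2/pi) sqrt(1 - x^2); the function names are ours.

```python
import math

def rho_w(x):
    """Semicircle density on [-1, 1] (assumed normalisation of Eq. (5))."""
    return 2.0 / math.pi * math.sqrt(max(0.0, 1.0 - x * x))

def p_w_closed(x):
    """Closed form 1 + (x/2) rho_W(x) - arccos(x)/pi."""
    return 1.0 + 0.5 * x * rho_w(x) - math.acos(x) / math.pi

def p_w_numeric(x, steps=20000):
    """Composite midpoint rule for the integral of rho_W over [-1, x]."""
    h = (x + 1.0) / steps
    return h * sum(rho_w(-1.0 + (i + 0.5) * h) for i in range(steps))

for x in (-0.5, 0.0, 0.4, 0.95):
    print(x, p_w_closed(x), p_w_numeric(x))
```

The two columns should agree to several digits; the midpoint rule is used only because it needs no external library.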
Fig. 2. Comparison of the exact density (11), shown as a solid line, and the asymptotic density (21), shown as a dashed
line, in the Hermite β-ensemble for N = 7 and β = 6.
2
1
1
N, ( 2Nx) = W (x) + O
+O
N
N
N 8/
2 (1 + 2/)
1
(21)
Up to a factor of order 1/N , the dominant oscillatory terms in the Gaussian unitary and symplectic ensembles are thus
3 2 2 cos 2NPW (x) ,
= 2,
2
W (x) N
N, ( 2Nx) W (x) =
1
N
cos 2NPW (x) + 12 arcsin x , = 4,
(x)1/2 N 1/2
W
respectively. A direct computation shows that the non-oscillatory $O(1/N)$ term is exactly zero when $\beta = 2$. This implies that our result reproduces the Gaussian global densities previously obtained in [11,13], for the unitary case, and in [13], for the symplectic case.²
The asymptotic expansion of the density for $\beta = 6$ is numerically compared with the exact one in Fig. 2. This picture shows that, even for small $N$, Eq. (20) furnishes a qualitatively good approximation of the density in the bulk. Fig. 3 illustrates the behavior of $\rho_{N,\beta}(\sqrt{2N}\,x)$ when the Dyson index $\beta$ varies.
² Note however the presence of a misprint in [13] for the symplectic case.
Fig. 3. Asymptotic density (21) in the Hermite β-ensemble for N = 8 and β = 2, 6, 10.
3.2. Laguerre case
The method used to evaluate the asymptotic behavior of the Laguerre density is almost the same
as the one used in the Hermite case. All relevant differences originate from the correct contour that we must choose in Eq. (7) [7]: $C$ starts at the point $u = 1$, turns around zero in the
counterclockwise direction, and comes back to $u = 1$. In the following paragraphs, we briefly
obtain the Laguerre version of Lemma 1 and Proposition 2. We suppose that $e^{Nf(u,x)}$ is analytic
everywhere, except maybe on the interval $[0, 1]$.
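Lemma 4. Let $\{C_j\}$ be a set of non-intersecting counterclockwise contours around the origin, all starting at $u_j = 1$, such that $0 \le \arg(u_n) \le \cdots \le \arg(u_1) \le 2\pi$. Then
$$\oint_C du_1 \cdots \oint_C du_n \prod_{i=1}^{n} e^{Nf(u_i,x)} \prod_{1\le j<k\le n} |u_j-u_k|^{4/\beta} = n!\,(-1)^{n(n-1)/\beta} \int_{C_1} du_1 \cdots \int_{C_n} du_n \prod_{i=1}^{n} e^{N\tilde f(u_i,x)} \prod_{1\le j<k\le n} (u_j-u_k)^{4/\beta},$$
where
$$\tilde f(u,x) = f(u,x) - \frac{2(n-1)}{\beta N}\ln u.$$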
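Proof. We first set $|u_j| = 1$, i.e., $u_j = e^{i\theta_j}$. $C$ is such that $\theta_j$ goes from $0$ to $2\pi$. The integral is completely symmetric so that we have $n!$ possible arrangements of the type $0 \le \theta_{i_1} \le \cdots \le \theta_{i_n} \le 2\pi$. We choose $0 \le \theta_n \le \cdots \le \theta_1 \le 2\pi$. In that case,
$$\prod_{1\le j<k\le n} |u_j-u_k|^{4/\beta} = \prod_{1\le j<k\le n}\left(2\sin\frac{\theta_j-\theta_k}{2}\right)^{4/\beta} = (-1)^{n(n-1)/\beta}\prod_{i=1}^{n} u_i^{-2(n-1)/\beta}\prod_{1\le j<k\le n}(u_j-u_k)^{4/\beta}.$$
Consequently,
$$\oint_C du_1 \cdots \oint_C du_n \prod_{i=1}^{n} e^{Nf(u_i,x)} \prod_{1\le j<k\le n} |u_j-u_k|^{4/\beta} = n!\,(-1)^{n(n-1)/\beta} \int_{\substack{|u_i|=1\\ 0\le\arg(u_n)\le\cdots\le\arg(u_1)\le 2\pi}} du_1\cdots du_n \prod_{i=1}^{n} e^{N\tilde f(u_i,x)} \prod_{1\le j<k\le n} (u_j-u_k)^{4/\beta}.$$
The integrand is analytic everywhere but possibly on the segment $[0, 1]$. Therefore, we can apply
Cauchy's theorem and deform the paths on the unit circle into any counterclockwise contours $C_i$
around zero and starting at $u_i = 1$ as long as the ordering $0 \le \arg(u_n) \le \cdots \le \arg(u_1) \le 2\pi$ is
satisfied. □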
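The step of the proof that removes the absolute values rests on an elementary identity on the unit circle: for u_j = exp(i theta_j) one has |u_j - u_k|^2 = (2 sin((theta_j - theta_k)/2))^2 = -(u_j - u_k)^2 / (u_j u_k). A minimal numerical illustration follows (the function name is ours; the fractional power 2/beta and the choice of branch are not addressed here).

```python
import cmath
import math

def unit_circle_identity(theta_j, theta_k):
    """Return the three expressions that should coincide on the unit circle."""
    u_j = cmath.exp(1j * theta_j)
    u_k = cmath.exp(1j * theta_k)
    modulus_squared = abs(u_j - u_k) ** 2
    sine_form = (2 * math.sin((theta_j - theta_k) / 2)) ** 2
    analytic_form = -(u_j - u_k) ** 2 / (u_j * u_k)
    return modulus_squared, sine_form, analytic_form

for theta_j, theta_k in ((2.0, 0.5), (5.5, 1.2), (3.1, 3.0)):
    print(unit_circle_identity(theta_j, theta_k))
```

Taking the product over all pairs j < k and raising to the power 2/beta gives the factor (-1)^{n(n-1)/beta} together with the extra power of each u_i, which is absorbed in the logarithmic term of the modified exponent.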
Proposition 5. Let $\tilde f(u, x) = f(u, x) - (2 - 2/\beta)N^{-1} \ln u$, where $f(u, x)$ is the function appearing in the definition of $R_{N,\beta}(x)$, given in Eq. (7), and satisfying Eqs. (13) and (14). Let also
$\tilde f_\pm = \tilde f(u_\pm, x)$. Suppose moreover that the saddle points are such that $0 \le \arg(u_+) < \arg(u_-) \le 2\pi$. Then,
2 1
2
RN, (x) =
(/2, ) (u+ u )
/2
NR
$
%
/2
k
Proof. We first use Lemma 4. The remaining steps are similar to those of Proposition 2.
In Ref. [7], the density of the Laguerre β-ensemble has been written in terms of generalized
hypergeometric functions [14]:
N, (x) = N
There exist integral representations of the generalized hypergeometric functions [22]. For our
purpose, the appropriate integral formula can be found in Chapter 11 of [9]; one easily shows
that
()
B; A + 1 + (n 1)/; t1 , . . . , tn t ==t =x
1F 1
n
1
#
#
n
i2nB
1
1
=
du1
dun
exuj uB1
(1 uj )A+B
j
Mn (A, B, 1/) 2i
2i
C
2/
|uk ul |
j =1
1k<ln
n
j =1
(1 + A + B C + j C) (1 + j C)
.
(1 + A C + j C) (1 + B C + j C) (1 + C)
This implies that the Laguerre density in the bulk can be recast in an integral of the form (7):
N, (4Nx) =
(22)
provided that, in Eq. (7), C is a counterclockwise closed path around the origin and starting at
u = 1, and
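$$f(u, x) = 4xu - \ln u + \ln(1 - u) + \frac{1}{N}\left(a - 2 + \frac{2}{\beta}\right)\ln(1 - u). \qquad (23)$$
Neglecting factors of order $1/N$, we see that this function has two simple saddle points,
$$u_\pm = \frac{1}{2}\left(1 \pm i\sqrt{\frac{1}{x} - 1}\right),$$
which satisfy $\arg(u_+) < \arg(u_-)$ only if $0 < x < 1$. This implies
$$\left.\frac{\partial^2}{\partial u^2} f(u, x)\right|_{u=u_\pm} = R e^{i\phi_\pm}, \qquad R = 16x^2\sqrt{\frac{1}{x} - 1}, \qquad \phi_\pm = \mp\frac{\pi}{2};$$
hence, the directions of steepest descent are $\theta_+ = 3\pi/4$ and $\theta_- = \pi/4$. We also have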
1
a + 4/ 4
a
f = 2x
ln 2 x 2i x
1 arccos x i arccos x.
N
x
N
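The Laguerre saddle points and the directions of steepest descent can be verified in the same spirit. The sketch below assumes the leading-order function f(u, x) = 4xu - ln u + ln(1 - u), i.e. Eq. (23) without its 1/N term, and uses the convention that the steepest descent angle is (pi - phi)/2 when the second derivative is R exp(i phi); the names are ours.

```python
import cmath
import math

def f_prime(u, x):
    """First u-derivative of the assumed f(u, x) = 4xu - ln u + ln(1 - u)."""
    return 4 * x - 1 / u - 1 / (1 - u)

def f_second(u, x):
    """Second u-derivative of the same f."""
    return 1 / u ** 2 - 1 / (1 - u) ** 2

def saddle_points(x):
    """Return (u_plus, u_minus) = (1 +/- i sqrt(1/x - 1)) / 2 for 0 < x < 1."""
    s = math.sqrt(1 / x - 1)
    return 0.5 * (1 + 1j * s), 0.5 * (1 - 1j * s)

def descent_angle(second_derivative):
    """Angle theta with 2*theta + phi = pi, where phi is the phase of f''."""
    return (math.pi - cmath.phase(second_derivative)) / 2

x = 0.35
for u in saddle_points(x):
    fpp = f_second(u, x)
    print(abs(f_prime(u, x)), abs(fpp), descent_angle(fpp))
```

For any 0 < x < 1 the modulus of the second derivative should equal 16 x^2 sqrt(1/x - 1) and the two angles should be 3 pi/4 and pi/4.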
The Stirling approximation readily gives
1+a
Wa+2,,N1
(1 + /2)
1
=
N a/2
1+O
.
Wa,,N
2
(1 + a/2) (1 + (a + 1)/2)
N
j =0
/2 M (a + /2 1, N 1, 2/)
1a
(1 + a/2) (1 + (1 + a)/2)
1
1
=
.
1+O
2
N
(1 + /2)N a+2
By substituting the above equations in Proposition 5, we obtain the following result.
(24)
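Corollary 6. Let $0 < x < 1$ and let $P_{MP}(x)$ denote the (cumulative) probability distribution
associated to the density $\rho_{MP}$ defined in Eq. (6); i.e.,
$$P_{MP}(x) = \int_0^x \rho_{MP}(t)\,dt = 1 + x\rho_{MP}(x) - \frac{2}{\pi}\arccos\sqrt{x}. \qquad (25)$$
Then
$$4\rho_{N,\beta}(4Nx) = \rho_{MP}(x)\, r_{N,\beta}(x) + O\!\left(\frac{1}{N}\right),$$
where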
rN, (x) = 1 + 2
/2
k=1
(1)k
(2 3 x 2 MP (x)3 N )2k
2 /
!
(1 + 2j/)
2
(x, ) = 1
2a arccos x.
2
k
(26)
(27)
cos
2NP
(x)
+
(x,
)
.
MP
(2 3 x 2 N )2/ MP (x)6/1
Hence, neglecting the possible factor O(1/N), the first oscillatory corrections to the global density in the complex and quaternionic Wishart ensembles are
1
x ,
(x)
MP
= 2,
4N, (4N x) MP (x) =
1/2
1/2
1/2
2 xMP (x) N
= 4.
Fig. 4. Comparison of the exact density (12), shown as a solid line, and the asymptotic density (27), shown as a dashed
line, in the Laguerre β-ensemble for N = 4, β = 6, and a = 0, 1.
Contrary to the Gaussian unitary ensemble, the complex Wishart ensemble
has a non-null (when $a \ne 0$) correction of order $1/N$ which is not oscillatory [11]. Nevertheless,
our oscillatory term is the same as the one given in the latter reference. The $\beta = 4$ case has been
recently studied in [10]; the dominant correction is purely oscillatory and is equal to the term
given above.
Fig. 4 provides a numerical comparison between the asymptotic and the exact expressions of
the density. Clearly, the asymptotic approximation is better for $a = 0$. Note that the oscillations
shift to the right when $a$ increases. In Fig. 5, we illustrate the effect of the variation of $\beta$ on
$4\rho_{N,\beta}(4Nx)$ for fixed $N$ and $a$.
4. Density at the soft-edge
We have seen that the steepest descent method can be applied to the scaled densities only if
$-1 < x < 1$ (Hermite case) or $0 < x < 1$ (Laguerre case). Indeed, when the spectral parameter $x$
is outside these intervals, the contours of integration cannot be deformed into the steepest descent
ones without transgressing the appropriate ordering of the variables of integration. A change
of scaling is mandatory. The appropriate changes of variable at the soft edges have been obtained in [6]: the scaled densities
$\rho_{N,\beta}(\sqrt{2N}\,x)$ (Hermite) and $\rho_{N,\beta}(4Nx)$ (Laguerre) should be
replaced by $\rho_{N,\beta}(\sqrt{2N} + x/\sqrt{2N^{1/3}})$ (Hermite) and $\rho_{N,\beta}(4N + 2(2N)^{1/3}x)$ (Laguerre).
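Technically, these new scalings make the two simple saddle points coalesce and become a
double saddle point (or saddle point of order two). Then, the multiple Gaussian integrals are
replaced by multiple Airy integrals, or integrals of the Kontsevich type [15]:
$$K_{n,\beta}(x) := \frac{1}{(2\pi i)^n}\int_{-i\infty}^{i\infty} dv_1 \cdots \int_{-i\infty}^{i\infty} dv_n \prod_{j=1}^{n} e^{v_j^3/3 - xv_j} \prod_{1\le k<l\le n} |v_k - v_l|^{4/\beta}. \qquad (28)$$
We recall that the Airy function of a real variable $x$ can be defined as follows:
$$\mathrm{Ai}(x) = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} e^{v^3/3 - xv}\,dv, \qquad (29)$$
and as a consequence
$$\mathrm{Ai}''(x) = x\,\mathrm{Ai}(x).$$
One readily verifies that
$$K_{1,\beta}(x) = \mathrm{Ai}(x), \qquad K_{2,2}(x) = 2\left(\mathrm{Ai}'(x)^2 - x\,\mathrm{Ai}(x)^2\right).$$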
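The two special cases above lend themselves to a small numerical illustration. The sketch below evaluates Ai and its derivative from the standard Maclaurin series (our choice of method, adequate only for moderate |x|, and not taken from the paper), checks the Airy equation by finite differences, and evaluates the closed form of K_{2,2}. Differentiating that closed form with the help of the Airy equation gives -2 Ai(x)^2, which the code also checks.

```python
import math

C1 = 1.0 / (3 ** (2 / 3) * math.gamma(2 / 3))  # Ai(0)
C2 = 1.0 / (3 ** (1 / 3) * math.gamma(1 / 3))  # -Ai'(0)

def airy_ai(x, terms=40):
    """Return (Ai(x), Ai'(x)) from the Maclaurin series; for moderate |x| only."""
    a, b = 1.0, 1.0  # coefficients of x^(3k) and of x^(3k+1)
    f = fp = g = gp = 0.0
    for k in range(terms):
        f += a * x ** (3 * k)
        g += b * x ** (3 * k + 1)
        if k > 0:
            fp += a * 3 * k * x ** (3 * k - 1)
        gp += b * (3 * k + 1) * x ** (3 * k)
        a /= (3 * k + 2) * (3 * k + 3)
        b /= (3 * k + 3) * (3 * k + 4)
    return C1 * f - C2 * g, C1 * fp - C2 * gp

def k22(x):
    """Closed form K_{2,2}(x) = 2 (Ai'(x)^2 - x Ai(x)^2)."""
    ai, aip = airy_ai(x)
    return 2.0 * (aip ** 2 - x * ai ** 2)

def central_difference(func, x, h=1e-4):
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-2.0, 0.0, 0.5, 2.0):
    ai, aip = airy_ai(x)
    second = central_difference(lambda t: airy_ai(t)[1], x)
    print(x, ai, second - x * ai, central_difference(k22, x) + 2 * ai ** 2)
```

The last two printed columns should be close to zero, up to the finite-difference error.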
It is worth mentioning that the function $K_{n,2}(x)$ has previously been studied in the context of the
Gaussian unitary ensemble [8]. In particular, it has been shown that³
i+j 2
d
Kn,2 (x) = n! det
Ai(x)
.
dx i+j 2
i,j =1,...,n
4.1. Hermite case
The next lemma generalizes a basic fact of the Airy function of a complex variable $z$; that is,
the contour in Eq. (29) can be deformed so that
$$\mathrm{Ai}(z) = \frac{1}{2\pi i}\int_{A_0} e^{v^3/3 - zv}\,dv, \qquad (30)$$
where $A_0$ is a simple path going from $\infty e^{ia}$ to $\infty e^{ib}$, where $-\pi/2 < a < -\pi/6$ and $\pi/6 <
b < \pi/2$ (see Fig. 6).
Lemma 7. Let $A_0$ be the contour described above. Let also $\{V_j\}$ denote a set of non-intersecting
paths such that $V_1 = A_0$ and such that, for all $j \in \{2, \dots, n\}$, $V_j$ follows $A_0$ but stops at $v_{j-1}$.
Then
$$K_{n,\beta}(x) = \frac{n!}{(2\pi i)^n}\int_{V_1} dv_1 \cdots \int_{V_n} dv_n \prod_{j=1}^{n} e^{v_j^3/3 - xv_j} \prod_{1\le k<l\le n} (v_k - v_l)^{4/\beta}, \qquad (31)$$
where $-\pi < \arg v_j \le \pi$ and where $\arg(v_i - v_j)^{4/\beta} = 0$ when both $\Re v_i = 0 = \Re v_j$ and
$\Im v_i > \Im v_j$.
3 Note however that the term n! is missing in [8].
Proof. We essentially proceed as in Lemma 1. Firstly, we write Eq. (28) as ordered integrals
along the imaginary axis and remove the absolute values. Secondly, the analyticity of the integrand and the property
$ 3
%
< < ,
6
2
or
5
7
< <
6
6
(32)
are used to deform the contour of $v_1$ into $A_0$. We finally complete the proof by exploiting the
ordering of the variables, the choice of principal branch for $(v_i - v_j)^{4/\beta}$, and the Cauchy theorem. □
Proposition 8. The integral RN, defined by Eqs. (7) and (18) satisfies
RN, 1 +
x
2N 2/3
=4
N/2+N 1/3 x
e
1
K
(x)
+
O
.
,
2
2N N 2/3
N 1/3
x
2N 2/3
du1
= !
C1
du
2/3 )
i=1
(uj uk )4/ .
1j <k
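$$\left.\frac{\partial}{\partial u} g(u, x)\right|_{u=u_0} = 0 = \left.\frac{\partial^2}{\partial u^2} g(u, x)\right|_{u=u_0}, \qquad \left.\frac{\partial^3}{\partial u^3} g(u, x)\right|_{u=u_0} = R e^{i\phi_0} = 16 e^{-i\pi/2}.$$
The directions for which the decrease of $f$ is maximum are determined by both conditions
$\cos(3\theta_0 + \phi_0) = -1$ and $\sin(3\theta_0 + \phi_0) = 0$. Thus, the angles of steepest descent are $\theta_0 =
-5\pi/6, -\pi/6, \pi/2$. We choose the two former and make the following change of variables:
$$v_j = 2iN^{1/3}(u_j - u_0).$$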
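The coalescence of the two saddle points can be made concrete with a few lines of code. The sketch assumes that the N-independent part of the exponent at the edge is g(u) = -2u^2 + ln(iu + 1) with double saddle point u_0 = i/2; both are inferred from the bulk expressions and are not quoted from the text. It also shows that the change of variables, which rotates directions by pi/2, sends the two chosen angles to -pi/3 and pi/3, the asymptotic directions of the Airy contour.

```python
import cmath
import math

U0 = 0.5j  # assumed double saddle point

def g1(u):
    """g'(u) for the assumed g(u) = -2u^2 + ln(iu + 1)."""
    return -4 * u + 1j / (1j * u + 1)

def g2(u):
    """g''(u)."""
    return -4 + 1 / (1j * u + 1) ** 2

def g3(u):
    """g'''(u)."""
    return -2j / (1j * u + 1) ** 3

def wrap(theta):
    """Map an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))

def descent_angles(phi):
    """Sorted solutions of cos(3*theta + phi) = -1 in (-pi, pi]."""
    return sorted(wrap((math.pi - phi + 2 * math.pi * m) / 3) for m in range(3))

angles = descent_angles(cmath.phase(g3(U0)))
print(abs(g1(U0)), abs(g2(U0)), g3(U0))
print([a / math.pi for a in angles])                          # -5/6, -1/6, 1/2
print([wrap(a + math.pi / 2) / math.pi for a in angles[:2]])  # -1/3, 1/3
```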
This implies that
Nf uj , 1 +
x
2N 2/3
1 3
1
N
1/3
= N ln 2 + N x + ln 2 + vj xvj + O
.
2
3
N 1/3
Let $\{V_j\}$ denote the set of ordered and non-intersecting contours of steepest descent: $V_1$ starts at
$\infty e^{-i\pi/3}$, passes through the origin and stops at $\infty e^{i\pi/3}$; $V_j$ follows $V_1$ but stops at $v_{j-1}$, where
$j = 2, \dots, \beta$. We have proved that
RN,
x
1+
2N 2/3
1/3
= 4!
eN/2+N x
(N
2 +2) N 2/3 i3
+O
dv1
V1
dv
evi /3xvi
i=1
(vj vk )4/
1j <k
1
.
N 1/3
Let us go back to Eq. (17) and make the change of scaling $x \to 1 + x/(2N^{2/3})$:
x
N,
2N +
2N 1/3
x
1 G,N 1
N/2+ N N 1/3 x x 2 /(4N 1/3 )
(2N )
e
e
e
RN, 1 +
.
=
2 G,N ,
2N 2/3
Proposition 8 and some manipulations directly imply the following.
Corollary 9. The density in the Hermite β-ensemble evaluated at the soft edge is proportional to
an integral of the Kontsevich type:
x
1
N,
2N +
2N 1/3
2N 1/3
/2
1 4
(1 + /2)
1
=
.
K, (x) + O
2
N 1/3
(1 + 2/)1 (1 + 2j/)
j =2
$$e^{i2\pi/3}\,\mathrm{Ai}\!\left(e^{i2\pi/3} z\right) + \mathrm{Ai}(z) + e^{-i2\pi/3}\,\mathrm{Ai}\!\left(e^{-i2\pi/3} z\right) = 0.$$
n+1
n!
(2i)n
dv1
V 1
V n
dvn
n
j =1
evj /3xvj
(vk vl )4/ ,
(33)
1k<ln
where < arg vj and where arg(vi vj )4/ = 0 when both vi = 0 = vj and
vi >
vj .
Proof. By virtue of Eq. (32), it is possible to join $A_1$ and $A_{-1}$. Thus, Cauchy's theorem implies
that $\bar V_1$ can be replaced by $A_0$. The constraint on the ordering of the variables $\{v_2, \dots, v_n\}$ is then
considered and the equivalence between Eqs. (31) and (33) follows. □
Proposition 11. The integral RN, defined by Eqs. (7) and (23) satisfies
2N e(2N )1/3 x
1
x
e
K
(x)
+
O
=
(2i)
.
RN, 1 +
,
(2N )2/3
2a+4/3 N 2/3
N 1/3
Proof. We essentially follow the proof of Proposition 8. By virtue of Lemma 4, we have that
x
RN, 1 +
(2N )2/3
#
#
2/3
eN f (ui ,1+x/(2N ) )
(uj uk )4/ ,
= ! du1 du
C1
where
i=1
1j <k
x
(2N)2/3
1
1
2
1 2
= g(u) +
xu +
a2+
ln(1 u) +
2 ln(u),
N
N
(2N )2/3
f u, 1 +
for
$$g(u) = 4u + \ln(1 - u) - \ln(u).$$
The latter function possesses a double saddle point $u_0 = 1/2$. One can check that
$$\left. d^3 g(u)/du^3 \right|_{u=u_0} = 32 e^{i\pi},$$
so the steepest descent angles are $\theta_0 = 0, 2\pi/3, 4\pi/3$. The contour of $u_1$ is chosen such that:
(1) it approaches $u_0$ by following the real axis in the negative direction; (2) it leaves the saddle
point with an angle $2\pi/3$; (3) it turns around the origin in the positive direction; (4) it comes back
to $u_0$ with an argument of $4\pi/3$; (5) it finally leaves this point and reaches the point $u = 1$
by following the real axis. The ordering of the variables around the origin implies moreover that
$u_i$ follows $u_1$ but stops at $u_{i-1}$. When $N$ is large, step (3) is irrelevant and the steepest descent
contour breaks into two disjoint paths, namely (1)–(2) and (4)–(5). Now we set
$$v_j = 2(2N)^{1/3}(u_0 - u_j),$$
which means that
N f uj , 1 +
x
(2N )2/3
4
1
1 3
1/3
.
= 2N + (2N) x + 4 a ln 2 + vj x vj + O
3
N 1/3
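The statements about the double saddle point of g(u) = 4u + ln(1 - u) - ln(u) are easy to confirm numerically. In this sketch (helper names are ours) we also apply the rotation by pi induced by the change of variables v_j = 2(2N)^{1/3}(u_0 - u_j), which sends the angles 2 pi/3 and 4 pi/3 to -pi/3 and pi/3, as in the Hermite case.

```python
import cmath
import math

U0 = 0.5  # double saddle point quoted in the text

def g1(u):
    """g'(u) for g(u) = 4u + ln(1 - u) - ln(u)."""
    return 4 - 1 / (1 - u) - 1 / u

def g2(u):
    """g''(u)."""
    return -1 / (1 - u) ** 2 + 1 / u ** 2

def g3(u):
    """g'''(u)."""
    return -2 / (1 - u) ** 3 - 2 / u ** 3

def descent_angles(phi):
    """Solutions of cos(3*theta + phi) = -1 in [0, 2*pi), sorted."""
    two_pi = 2 * math.pi
    return sorted(((math.pi - phi + two_pi * m) / 3) % two_pi for m in range(3))

def wrap(theta):
    """Map an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))

angles = descent_angles(cmath.phase(g3(U0)))
print(g1(U0), g2(U0), g3(U0))                             # 0, 0, -32
print([a / math.pi for a in angles])                      # 0, 2/3, 4/3
print([wrap(a + math.pi) / math.pi for a in angles[1:]])  # -1/3, 1/3
```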
When $N \to \infty$, the contours of the variables $\{v_j\}$ that give the major contribution to the integral,
denoted by $\{\bar V_j\}$, behave as follows: $\bar V_1$ is the union of the path that begins at $-\infty$, passes close to
the origin and ends at $\infty e^{-i\pi/3}$ together with the path that starts at $\infty e^{i\pi/3}$, goes near the origin
3
e2N e(2N ) x
evi /3x vi
(vj vk )4/
= ! a+4/3 2/3 d v1 d v
2
N
V 1
i=1
1
.
N 1/3
Lemma 10 finally provides the sought for result.
1j <k
+O
We apply the last proposition to the scaled expression of the Laguerre density given in (22):
N, 4N + 2(2N )1/3 x
1/3
N Wa+2,,N1 (4N )a/2 e2N e(2N )
x
RN, 1 +
=
.
(2i) Wa,,N M (a + 2/ 1, N 1, 2/)
(2N )2/3
Minor manipulations and use of Stirling's approximation give the sought limiting soft edge density, which is identical to that obtained in Corollary 9 for the Hermite β-ensemble.
Corollary 12. The density in the Laguerre β-ensemble evaluated at the soft edge is proportional
to an integral of the Kontsevich type:
2(2N)1/3 N, 4N + 2(2N )1/3 x
1 4 /2
(1 + /2)
1
=
.
K, (x) + O
2
N 1/3
(1 + 2/)1 (1 + 2j/)
j =2
!
vj3 /3xvj
K, (x) =
dv
dv
e
(vj vk )4/ ,
1
(2i)
V1
i=1
1j <k
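where $\{V_j\}$ is such that $V_1$ goes from $\infty e^{-i\theta}$ to $\infty e^{i\theta}$, passing through the point $\sqrt{x}$, and such
that $V_j$ goes from $\infty e^{-i\theta}$ to $v_{j-1}$ for all $j = 2, \dots, \beta$ and $\pi/6 < \theta < \pi/2$. We now set
$$w_j = x^{1/4}\left(v_j - x^{1/2}\right) e^{-i\pi/2}.$$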
Thus,
! e2x /3
K, (x) =
(2) x 3/41/2
3/2
dw1
W1
dw
3/4
i=1
(wj wk )4/ ,
1j <k
2
3/4
1 e2x /3
K, (z) =
dw
dw
ewj +O(x )
1
3/41/2
(2) x
i=1
|wj wk |4/ .
1j <k
Note that the term $O(x^{-3/4})$ is odd in $w_j$. As explained in the proof of Proposition 17, this
implies that the actual correction to the integral is of order $x^{-3/2}$. We finally obtain the desired
expression by comparing the last equation with Eq. (15). □
Applying Proposition 13, the next result gives the behavior of the density when the spectral
parameter leaves the bulk.
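Corollary 14. Let $\rho(x)$ denote the density evaluated at the soft edge:
$$\rho(x) = \begin{cases} \lim_{N\to\infty} \dfrac{1}{\sqrt{2N^{1/3}}}\,\rho_{N,\beta}\!\left(\sqrt{2N} + \dfrac{x}{\sqrt{2N^{1/3}}}\right), & \text{Hermite},\\[2mm] \lim_{N\to\infty} 2(2N)^{1/3}\,\rho_{N,\beta}\!\left(4N + 2(2N)^{1/3}x\right), & \text{Laguerre}. \end{cases}$$
Then, as $x \to \infty$,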
(x) =
2 3/2
1
1 (1 + /2) e 3 x
+
O
.
2 (4)/2 x 3/41/2
x 3/4+1
When the density is evaluated at points inside the bulk but close to the edge, we should observe
both decrease and oscillation (see Figs. 2–5). This is confirmed in the next paragraphs.
Proposition 15. Let $x = -|x|$. When $|x|$ is large,
(/2, )2
1
|x|kx, + O 5/2 ,
K, (x) =
/2
x
where
/2
k
(1 + 2j/)
kx, =1 + 2
2 /
2 /
6k
3k
(1 + 2(j k)/)
2
|x|
k=1
j =1
4k 3/2
2
cos
|x| k 1
.
3
2
(1)k
i
i
dv1
dv
j =1
e|x|
3/2 f (v
j)
|vk vl |4/ ,
1k<l
where
$$f(v) = \frac{1}{3}v^3 + v.$$
The function $f$ possesses two simple saddle points, namely, $v_\pm = e^{\pm i\pi/2}$. We have $f_\pm =
f(v_\pm) = \pm 2i/3$ and $f''(v_\pm) = 2e^{\pm i\pi/2}$; whence the angles of steepest descent are $\theta_\pm = \pi/2 \mp
\pi/4$. The remainder of the proof is a straightforward application of Proposition 2. □
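The saddle-point data used in this proof can be reproduced in a few lines. The sketch below only assumes the convention that, when the second derivative at a simple saddle point is R exp(i phi), the steepest descent angle theta solves 2 theta + phi = pi; the function names are ours.

```python
import cmath
import math

def f(v):
    """f(v) = v^3/3 + v."""
    return v ** 3 / 3 + v

def f1(v):
    """f'(v)."""
    return v * v + 1

def f2(v):
    """f''(v)."""
    return 2 * v

def descent_angle(v):
    """Angle theta with 2*theta + phase(f''(v)) = pi."""
    return (math.pi - cmath.phase(f2(v))) / 2

for v in (1j, -1j):
    # Expected: f'(v) = 0, f(v) = +/- 2i/3, angles pi/4 and 3*pi/4.
    print(v, abs(f1(v)), f(v), descent_angle(v) / math.pi)
```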
Corollary 16. Let $\rho(x)$ be the quantity defined in Corollary 14. When $x \to -\infty$, we have
(1 + /2)
4 3/2
2
(x) = |x| 6/1 3/1/2 cos |x|
1
3
2
2
|x|
1
1
+O
+O
.
|x|5/2
|x|6/1/2
The previous result can be obtained directly from the asymptotic density in the bulk of the Hermite (or Laguerre) β-ensemble. Indeed, the change of variable $x \to 1 - |x|/(2N^{2/3})$ in Eq. (20)
and the expansion of this expression for $N^{1/3} \gg |x| \gg 1$ recovers Corollary 16. However, it is
impossible to derive Corollary 14 from Eq. (20) by such an expansion. Note finally that Corollaries 14–16 imply that the density at the soft edge of the Laguerre β-ensemble is independent
of $a$ when both $N$ and $|x|$ are large.
6. Concluding remarks
The aim of the article was to determine the large-$N$ asymptotic expansion of the density in
the Hermite and Laguerre β-ensembles when $\beta \in 2\mathbb{N}$.
We have shown that the first correction to the global density is purely oscillatory when $\beta > 2$
and is of order $N^{-2/\beta}$. In the Hermite ensemble of $N \times N$ random matrices, the density contains
$N$ peaks; the greater $\beta$ is, the higher the oscillations are. The influence of the Dyson parameter on the oscillations is the same in the Laguerre ensemble. However, the density in the latter
ensemble contains $N - 1$ summits and a (delta) divergence at the origin.
These results agree with the large-β asymptotic analysis realized recently in [4]. More precisely, it has been proved that for $\beta \to \infty$, the density in the bulk of the Hermite ensemble can
be written as a sum of $N$ Gaussian distributions centered at the zeros of a Hermite polynomial
of degree $N$ (and similarly for the Laguerre case). These conclusions are, of course, coherent
with the log-gas analogy presented in Section 1. Note that no constraints on $\beta$ are imposed in
[4]. Consequently, we may surmise that our asymptotic formulas (21) and (27) are valid for any
real $\beta$ greater than 2, though the general method to prove this is still missing.
We have also shown that the densities of the Hermite and Laguerre ensembles are both proportional to a Kontsevich-like integral $K_{\beta,\beta}(x)$ when evaluated about the edges of the spectrum.
Although the exact densities of the Hermite and Laguerre ensembles are quite different, the asymptotic analysis of $K_{\beta,\beta}(x)$ has revealed that they approach the same function in the soft edge
scaling, thus verifying the expected universality. The Kontsevich-like integral itself is a special
function generalizing the Airy integral and, as such, is worthy of independent study.
Acknowledgements
The work of P.J.F. has been supported by the Australian Research Council. P.D. is grateful to
the Natural Sciences and Engineering Research Council of Canada for a postdoctoral fellowship.
References
[1] T.H. Baker, P.J. Forrester, The Calogero–Sutherland model and generalized classical polynomials, Commun. Math. Phys. 188 (1997) 175–216.
[2] Vl.S. Dotsenko, V.A. Fateev, Four-point correlation functions and the operator algebra in 2D conformal invariant theories with central charge C ≤ 1, Nucl. Phys. B 251 (1985) 691–734.
[3] I. Dumitriu, A. Edelman, Matrix models for beta ensembles, J. Math. Phys. 43 (2002) 5830–5847.
[4] I. Dumitriu, A. Edelman, Eigenvalues of Hermite and Laguerre ensembles: Large beta asymptotics, Ann. Inst. H. Poincaré Probab. Statist. 41 (2005) 1083–1099.
[5] I. Dumitriu, A. Edelman, MOPS: Multivariate orthogonal polynomials (symbolically), math-ph/0409066.
[6] P.J. Forrester, The spectrum edge of random matrix ensembles, Nucl. Phys. B 402 (1993) 709–728.
[7] P.J. Forrester, Exact results and universal asymptotics in the Laguerre random matrix ensemble, J. Math. Phys. 35 (1994) 2539–2551.
[8] P.J. Forrester, N.S. Witte, Application of the τ-function theory of Painlevé equations to random matrices: PIV, PII and the GUE, Commun. Math. Phys. 219 (2001) 357–398.
[9] P.J. Forrester, Log Gases and Random Matrices, book in preparation, http://www.ms.unimelb.edu.au/~matpjf/matpjf.html.
[10] P.J. Forrester, N.E. Frankel, T.M. Garoni, Asymptotic form of the density profile for Gaussian and Laguerre random matrix ensembles with orthogonal and symplectic symmetry, math-ph/0508031.
[11] T.M. Garoni, P.J. Forrester, N.E. Frankel, Asymptotic corrections to the density of the GUE and LUE, J. Math. Phys. 46 (2005) 103301.
[12] K. Johansson, On fluctuations of eigenvalues of random Hermitian matrices, Duke Math. J. 91 (1998) 151–204.
[13] F. Kalisch, D. Braak, Exact density of states for finite Gaussian random matrix ensembles via supersymmetry, J. Phys. A 35 (2002) 9957–9969.
[14] J. Kaneko, Selberg integrals and hypergeometric functions associated with Jack polynomials, SIAM J. Math. Anal. 24 (1993) 1086–1110.
[15] M. Kontsevich, Intersection theory on the moduli space of curves and the matrix Airy function, Commun. Math. Phys. 147 (1992) 1–23.
[16] M. Lassalle, Polynômes de Laguerre généralisés, C. R. Acad. Sci. Paris Sér. I Math. 312 (1991) 725–728.
[17] M. Lassalle, Polynômes de Hermite généralisés, C. R. Acad. Sci. Paris Sér. I Math. 313 (1991) 579–582.
[18] V.A. Marčenko, L.A. Pastur, Distribution of eigenvalues for some sets of random matrices, Math. USSR-Sb. 1 (1967) 457–483.
[19] M.L. Mehta, Random Matrices, Academic Press, San Diego, 1991.
[20] F.W.J. Olver, Asymptotics and Special Functions, Academic Press, San Diego, 1974.
[21] M.A. Olshanetsky, A.M. Perelomov, Quantum integrable systems related to Lie algebras, Phys. Rep. 94 (1983) 313–404.
[22] Z. Yan, A class of generalized hypergeometric functions in several variables, Canad. J. Math. 44 (1992) 1317–1338.
[23] J.F. van Diejen, Confluent hypergeometric orthogonal polynomials related to the rational quantum Calogero system with harmonic confinement, Commun. Math. Phys. 188 (1997) 467–497.
[24] R. Wong, Asymptotic Approximation of Integrals, Academic Press, San Diego, 1989.
Abstract
Motivated by similarities between quantum Hall systems à la Susskind and aspects of topological string
theory on conifold as well as results obtained in [E.H. Saidi, Topological SL(2) gauge theory on conifold and
noncommutative geometry, hep-th/0601020], we study the dynamics of D-string fluids running in deformed
conifold in the presence of a strong and constant RR background B-field. We first introduce the basis of the D-string
system in the fluid approximation and then derive the holomorphic noncommutative gauge invariant field action
describing its dynamics in conifold. This study may also be viewed as embedding the Susskind description of the
Laughlin liquid in type IIB string theory. FQH systems on the real manifolds $\mathbb{R} \times S^2$ and $S^3$ are shown to
be recovered by restricting conifold to its Lagrangian sub-manifolds. Aspects of the quantum behaviour of the
string fluid are discussed.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Quantum Hall fluids; D string in conifold; Topological gauge theory; Noncommutative complex geometry
1. Introduction
Since Susskind's proposal on fractional quantum Hall (FQH) fluids in Laughlin state as systems
described by 2 + 1 noncommutative CS gauge theory [1], there has been a great interest in
building new solutions extending this idea [2–6]. Motivated by: (a) results concerning attractor
mechanism on flux compactification [7,8], in particular the link with noncommutative geometry,
and (b) the study of [9] dealing with topological noncommutative gauge theory on conifold, we
develop in this paper a new extension of Susskind's proposal for FQH fluids to higher dimensions.
Our extension deals with modeling the dynamics of a fluid of D strings running in conifold
and in the presence of a strong and constant RR background B-field. The extended system lives
in complex three (real six) dimensions and is related to the usual FQH system with point-like
particles by the following correspondence:
(1) The role of the usual FQH particles moving in a real Riemann surface M with coordinates $z$ and $\bar z$ is played by D strings moving on a K3 surface with some complex holomorphic
coordinates $u$ and $v$ to be specified later. In this picture, FQH particles may then be viewed as
D0 branes coming from D1 strings wrapped on $S^1$.
(2) The complex coordinates $z_a(t)$ and $\bar z_a(t)$ parameterizing the dynamics of the $N$ fractional
quantum Hall particles are then mapped to $u_a = u_a(\xi)$ and $v_a = v_a(\xi)$ with $\xi = t + i\sigma$ being the
string world sheet complex coordinate.
(3) The local coordinates $(t, z, \bar z)$ parameterize a real three dimension space; say the space
$\mathbb{R}^{1,2}$. The local variables $(\xi, u, v)$ parameterize a complex three dimension space, which is just
the conifold $T^*S^3$ realized as $T^*S^1$ fibered on $T^*S^2$. The $\mathbb{R}^{1,2}$ geometry used in the Susskind description appears then as a special real three dimension slice of conifold.
(4) The role of the magnetic field B is now played by a constant and strong RR background
field B of type IIB string. Like in the FQH system, the B field is supposed normal to the K3 surface and
strong enough so that one can neglect other possible interactions.
From this naive and rapid presentation of the higher-dimensional extended FQH system, to
which we refer here below as a D-string fluid (DSF for short), one notes some specific properties, among which are the three following. First, Susskind's proposal may be recovered from DSF by
taking appropriate parameter limits of the DSF moduli space to be described later. Second, the real
geometry of the FQH system is contained in conifold; the present study may then be thought of as
embedding Susskind's field theoretical model for the Laughlin state with filling factor $\nu = 1/k$ into type
IIB superstring theory on conifold. This property offers one more argument for embedding FQH
systems in supersymmetric theories; other arguments have been discussed in [10,11]. Finally,
in the DSF model, the complex holomorphy property plays a basic role; reality is recovered by restricting conifold to its half dimension Lagrangian sub-manifold. This involution has the effect
of projecting DSF into the usual FQH system, opening the way for links between real 3D physics
and type II superstrings on Calabi–Yau threefolds.
The presentation of this paper is as follows: In Section 2, we introduce the basis of the fluid
approximation of D-strings running in conifold. To build this system, we use special properties
of the K3 complex surface and conifold geometry. We also take advantage of the Susskind model for
the Laughlin liquid, which we use as a reference to make comparisons and physical interpretations.
In Section 3, we study the classical dynamics of the interaction between D strings and the RR
magnetic background field. We suppose that B is strong enough so that one can neglect string
kinetic energy and mutual energy interactions between the D strings. We also suppose that the
number of D strings per volume unit is high and uniform. We then use the fluid approximation
to derive the effective field theory extending the Susskind model. In this section, we also study
some special limits such as the real projection. In Section 4, we discuss quantum aspects of the D-string fluid, in particular the holomorphic property, and in Section 5 we give our conclusion and
outlook.
x, y, z, w C,
(2.1)
(2.2)
and where $\mu$ is a complex constant. In these relations, we have four complex holomorphic variables namely $x$, $y$, $z$ and $w$; but not all of them are free. They are subject to the two constraint
relations (2.1)–(2.2) reducing the degrees of freedom down to two. Note in passing that by setting $y = \bar x$ and $w = -\bar z$, the above relations reduce to
$$|x|^2 + |z|^2 = \mathrm{Re}\,\mu, \qquad (x, z) \equiv e^{i\theta}(x, z), \qquad (2.3)$$
so they define a real two sphere $S^2$ embedded in the complex space $\mathbb{C}^2$ parameterized by $(x, z)$.
This is an interesting property valid not only for $T^*S^2$, but also for the conifold $T^*S^3$. This crucial
property will be used to recover the hermitian models on real three dimension space; it deals
with the derivation of the Lagrangian sub-manifold from the mother manifold $T^*S^3$. As we will see
progressively, this feature is present everywhere in this paper. We will then keep it in
mind and point it out only when needed to make comments.
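The reduction to the real slice can be illustrated numerically. The sketch below rests on an assumption that is not visible in the text above: that the holomorphic constraint (2.1) has the form xy - zw = mu. Under this assumption, the substitution y = conj(x), w = -conj(z) turns the left-hand side into |x|^2 + |z|^2, which is real and invariant under a common phase rotation of (x, z). All names are ours.

```python
import cmath

def constraint_lhs(x, y, z, w):
    """Left-hand side of the assumed holomorphic constraint x*y - z*w = mu."""
    return x * y - z * w

def real_slice(x, z):
    """Point of the real slice y = conj(x), w = -conj(z)."""
    return x, x.conjugate(), z, -z.conjugate()

x, z = 0.6 + 0.3j, -0.2 + 0.7j
value = constraint_lhs(*real_slice(x, z))
print(value, abs(x) ** 2 + abs(z) ** 2)

# A common phase rotation (x, z) -> exp(i*theta) (x, z) leaves the value unchanged.
phase = cmath.exp(0.9j)
print(constraint_lhs(*real_slice(phase * x, phase * z)))
```

On this slice the constraint can only be satisfied when the right-hand side is real and positive, which fixes the radius of the two sphere.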
To implement string dynamics, we should add the time variable $t$ and the string variable $\sigma$ parameterizing the one-dimensional D string geometry. If we were dealing with a point-like particle
moving on this complex surface, the variables would be given by the 1d fields
$$x = x(t), \quad y = y(t), \quad z = z(t), \quad w = w(t). \qquad (2.4)$$
For the case of a D string with world sheet variable $\xi = t + i\sigma$ moving on $T^*\mathbb{P}^1$, the D string
variables are then given by the 2d fields
$$x = x(\xi), \quad y = y(\xi), \quad z = z(\xi), \quad w = w(\xi), \qquad (2.5)$$
with $|\sigma| \le l$ and obviously the constraint equations (2.1)–(2.2). In the limit $l \to 0$, the above 2d
fields reduce to the previous one-dimensional variables. Since the K3 surface as considered here is
a projective algebraic surface using complex holomorphic variables, it is natural to make the two
following hypotheses:
(i) Field holomorphy. We suppose that the above D-string field variables equations (2.5) have
no dependence; that is holomorphic functions in ,
= 0,
( ) =
n n , = x, y, z, w,
(2.6)
nZ
d
1
where n = 2i
( ) are string modes. This hypothesis means that the D string we are
n+1
dealing with is either a one handed mover closed D-string, say a left mover closed string, or
an open D-string with free ends. To fix the ideas, we consider here below closed D-strings and
think about = exp( + i ) with 0 = l 2l. Holomorphy hypothesis selects one sector;
it requires that the variables parameterizing the D-strings are complex holomorphic and same
for the field action SDSF = SDSF [x, y, z, w] that describe their dynamics. Usual hermiticity is
recovered by restricting conifold to its Lagrangian sub-manifold obtained by setting = , y = x
and w = z .
(ii) Induced gauge symmetry. For later use it is interesting to treat on equal footing the string
world sheet variable and those parameterizing K3. This may be done by thinking about the
projective transformations (2.2) also as those one gets by performing the change
,
(2.7)
with a nonzero complex parameter. In other words, the string variables obey the following
x( ) = x( ),
z( ) = z( ),
1
1
y( ) = y( ),
w( ) = w( ),
(2.8)
(2.9)
Note that Eq. (2.9) describes in fact an infinite set of constraint relations since for each value of
$\lambda \in \mathbb{C}^*$, the D-string fields should obey (2.9). This feature has a nice geometric interpretation.
The string dynamics involves five complex holomorphic variables namely $(\xi, x, y, z, w)$ and the
two algebraic constraint equations (2.1)–(2.2). Therefore these variables parameterize a complex
three dimension projective hypersurface embedded in $\mathbb{C}^5$, which is nothing else than the
deformed conifold geometry $T^*S^3$ with the realization
T S3 T S1 T S2.
(2.10)
In this fibration, $T^*S^2$ is the base sub-manifold and the fiber $T^*S^1$ describes the D-string world
sheet.
To summarize, the variables describing the motion of a D-string in conifold are given by
Eqs. (2.7)(2.9). For a system of N D-strings moving in conifold, we have then
xa ( )ya ( ) za ( )wa ( ) = ,
a = 1, . . . , N,
(2.11)
where for each value of the index a, we have also Eqs. (2.7)(2.8). Having fixed the variables, we
turn now to describe the fluid approximation of D-strings and implement the constant and strong
background RR B-field.
2.2. Fluid approximation
For later analysis, it is convenient to use the usual SL(2) isometry of the conifold to put the
above relations into a condensed form. Setting
X i = x( ), z( ) ,
(2.12)
Yi = y( ), w( ) ,
transforming as isodoublets under SL(2) isometry, the coordinates of a given D string moving in
conifold is given by the holomorphic field doublets
X i = X i ( ),
Yi = Yi ( ),
i = 1, 2,
(2.13)
(2.14)
ij Xai ( )Ya ( ) = ,
a = 1, . . . , N,
(2.15)
(2.16)
Xa ( ), 1 a N X i (, x, y),
i
Ya ( ), 1 a N Y i (, x, y),
(2.17)
together with Eqs. (2.16) replaced by
ij X i Y j = ,
X i = X i (, x, y),
Y i = Y i (, x, y),
1
1
Y i , x, y = Y i (, x, y).
(2.18)
(2.19)
Yi = yi Ci ,
(2.20)
i and C
where C+
i are gauge fields constrained as
i
i
x i Ci yi C+
+ Ci C+
= 0,
(2.21)
scaling as the inverse of length and describing fluctuations around the static positions x i and yi .
From SL(2) representation theory, one may also split the fields X i and Y i using holomorphic
vielbein gauge fields
X i (, x, y) = x i E+ +
ij yj A++ ,
Yi (, x, y) = yi E+
ij x j A ,
(2.22)
i and
where E should be as E = (1 + A ). Like for X i and Y i , the gauge fields C+
Ci as well as E and A are homogeneous holomorphic functions subject to the projective
transformations C (, x, 1 y) = C (, x, y) and
1
E+ , x, y = E+ (, x, y),
1
A++ , x, y = 2 A++ (, x, y),
1
E+ , x, y = E+ (, x, y),
1
A , x, y = 2 A (, x, y).
(2.23)
(2.24)
An equivalent relation using A and A may be also written down. As far as the constraint
equations (2.21), (2.24) are concerned, there is more than one way to deal with them. One way is
to solve it perturbatively as E (1 + A ) with A = A0 and then substitute A0 =
i A++ A . Another way is to solve Eq. (2.24) exactly as
E+ = K 1 + A++ A ,
1
1 + A++ A ,
E+ =
(2.25)
K
where K is an arbitrary nonzero function. In both cases one loses field linearity, which we
would like to keep. We will then keep the gauge field constraint equations as they are and give
the results involving all these components using the Lagrange method. Notice that, from a physical
i or equivalently A
view, the gauge fields C
= A (, x, y) and A = A (, x, y) describe
gauge fluctuations around the static solution
X i = xi ,
Yi = y i ,
x i yi = ,
(2.26)
i and Y =
preserving conifold volume 3-form. Expressing the field X i and Yi as X i = x i + C+
i
i
i
i
yi Ci , we have C+ = x A+ + y A++ and Ci = yi A+ + xi A . Notice also that,
as general coordinate transformations, the splitting (2.22) may be also defined as holomorphic
diffeomorphisms X i = Lv x i and Yi = Lv yi where the vector field Lv is given by
Lv = V++ D + V D++ + V0 D0 + V0 0 ,
(2.27)
with gauge component fields Vpq , p, q = +, and where the dimensionless derivatives generating the GL(2) group are given by
1 i
i
+
y
x
,
0 =
(2.28)
2
x i
y i
or naively as 0 =
,
(x i yi )
,
D = y
x i
i
and
D++ = x
,
y i
i
i
D0 = x
y
.
x i
y i
i
(2.29)
In these equations, we have two charge operators; the operator 0 generates the Abelian scaling
factor with the property
[0 , D ] = 2D ,
[0 , D0 ] = 0,
(2.30)
and $D_0 = [D_{++}, D_{--}]$ generates the Abelian Cartan-Weyl GL(1) subgroup of SL(2). Notice
moreover that, inverting the decomposition (2.22), we can write the vielbein fields as follows
1
i
,
yi X i = 1 + yi C+
1
i
A++ =
ij x i X j = xi C+
,
E+ =
1
Yi x i = 1 x i Ci ,
1
A =
ij yi Yj = y i Ci .
E+ =
(2.31)
As one sees, the gauge fluctuations $E_{\pm\mp}$ and $A_{\pm\pm}$ are dimensionless; this suggests that
they should appear as gauge fields covariantizing dimensionless linear differential operators.
These are just the $D_{0,\pm\pm}$ operators given above. At the static point, Eq. (2.26), we also see
that $E_{+-} = E_{-+} = 1$ and $A_{\pm\pm} = 0$ ($C_\pm^i = 0$). With these tools we are now in a position to address
the building of the effective field action of the D-string fluid model in the conifold.
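The commutation relation $D_0 = [D_{++}, D_{--}]$ quoted above can be checked mechanically on polynomials in the conifold coordinates. The following is a minimal Python sketch, not taken from the paper. It assumes the index pairing $D_{++} = x^i \partial/\partial y^i$, $D_{--} = y^i \partial/\partial x^i$ and $D_0 = x^i \partial/\partial x^i - y^i \partial/\partial y^i$ with $i = 1, 2$ (the index placement in Eq. (2.29) is not fully legible here), and the helper names `shift`, `D_pp`, `D_mm`, `D_0` and `commutator` are illustrative only.

```python
from collections import defaultdict

# Variable order: (x1, x2, y1, y2). A polynomial is a dict mapping an
# exponent tuple, e.g. (2, 0, 0, 1) for x1^2 * y2, to an integer coefficient.
X1, X2, Y1, Y2 = range(4)


def clean(poly):
    """Drop monomials whose coefficient is zero."""
    return {m: c for m, c in poly.items() if c != 0}


def add(p, q, sign=1):
    """Return p + sign * q."""
    out = defaultdict(int)
    for m, c in p.items():
        out[m] += c
    for m, c in q.items():
        out[m] += sign * c
    return clean(out)


def shift(poly, src, dst):
    """Apply the first-order operator  v_dst * d/d(v_src)  to a polynomial."""
    out = defaultdict(int)
    for m, c in poly.items():
        if m[src] == 0:
            continue  # the derivative of this monomial vanishes
        e = list(m)
        coeff = c * e[src]
        e[src] -= 1
        e[dst] += 1
        out[tuple(e)] += coeff
    return clean(out)


def D_pp(p):  # assumed form: x^i d/dy^i
    return add(shift(p, Y1, X1), shift(p, Y2, X2))


def D_mm(p):  # assumed form: y^i d/dx^i
    return add(shift(p, X1, Y1), shift(p, X2, Y2))


def D_0(p):  # assumed form: x^i d/dx^i - y^i d/dy^i
    x_part = add(shift(p, X1, X1), shift(p, X2, X2))
    y_part = add(shift(p, Y1, Y1), shift(p, Y2, Y2))
    return add(x_part, y_part, sign=-1)


def commutator(A, B, p):
    """Return [A, B] p = A(B(p)) - B(A(p))."""
    return add(A(B(p)), B(A(p)), sign=-1)


# Arbitrary test polynomial f = 3 x1^2 y2 - 2 x2 y1 y2 + 5 y1^3
f = {(2, 0, 0, 1): 3, (0, 1, 1, 1): -2, (0, 0, 3, 0): 5}

closes = commutator(D_pp, D_mm, f) == D_0(f)                     # [D++, D--] = D0
raises = commutator(D_0, D_pp, f) == add({}, D_pp(f), sign=2)    # [D0, D++] = 2 D++
print(closes, raises)
```

Under these assumptions both flags come out true for any polynomial, and $D_0$ simply counts the number of $x$ factors minus the number of $y$ factors in each monomial, which is the charge assignment used for $X^i$ and $Y_i$ in the text.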
3. Field action
i , C ] describing the dynamics
To get the gauge invariant effective field action SDSF = SDSF [C
0
of the D string fluid in the conifold, we borrow ideas from Susskind method used for FQH
liquid of point like particles. We first give the classical field action Sclas [X, Y ] describing the
interaction between a given D string {X( ), Y ( )} moving in the RR background field B. Then
we consider the fluid approximation using the field variables {X (, x, y), Y(, x, y)} instead of
the coordinates {Xa ( ), Ya ( ), 1 a N}. In this limit we suppose that density (, x, y) is
large and uniform; i.e., (, x, y) = 0 . Finally, we derive the effective gauge field action once
by using the D-string field variables X and Y; i.e., S = SDSF [X , Y, ] and an other time by using
i and C describing the fluctuations around the static positions.
gauge fields C
0
1
2
N
a=1
T S1
j
X i ( )
Xa ( )
j
Bij Ya ( ) a
Yai ( )
,
(3.2)
iB
2
N
Yia
a=1
T S1
Xai
Yia
Xai
.
(3.3)
This field action SN [X, Y ] exhibits three special and remarkable features; first it is holomorphic
follows by setting
and the corresponding hermitian SNreal [X, X]
= = t,
B = B.
Yia = Xai ,
(3.4)
As such we have
=
SNreal [X, X]
i Re B
2
dt
N
Xai
a=1
dXai
dt
Xai
d(Xai )
.
dt
(3.5)
The second feature of $S_N[X, Y]$ deals with the hypersurface equation (2.16). Since $Y_{ia} X_a^i = \mu$
is a constraint equation on the dynamical field variables, it can be implemented in the action by
T S1
B
+
2
a=1
T S1
N
a ( ) Yia ( )Xai ( ) .
(3.6)
a=1
The difference between SN [X, Y ] of Eq. (3.3) and the above SN [X, Y, ] is that in the second
j
description the field variables Xai ( ) and Ya ( ) are unconstrained. Conifold target hypersurface
is obtained by minimizing SN [X, Y, ] with respect to ,
SN [X, Y, ]
= Yia ( )Xai ( ) = 0.
a
(3.7)
L
(X i / )
of the
and X i
(3.8)
with d
d = 0. This is a crucial point as far as we are thinking about conifold as given
by the fibration T S 1 T S 2 . Now, using the fluid approximation mapping the system
j
{Xai ( ), Ya ( ), a ( ); 1 a N } into the 3D holomorphic fields X i = X i (, x, y), Y j =
j
Y (, x, y) and = (, x, y), we can put Eq. (3.6) as a complex 3D holomorphic field action
S2 [X , Y, ] =
(3.9)
d 3 v L2 (X , Y, ),
T S3
with
L2 (X , Y, ) =
=
and 0 =
d 3v =
ln
iB
Yi 0 X i X i 0 Yi i Yi X i ,
2
(3.10)
d dx i dyi
,
x i yi = .
(3.11)
For more details on the specific properties of this complex volume see [12]; for the moment let us
push forward this description using the realization of the conifold as a fibration of $T^*S^1$ over $T^*S^2$. In this view, notice
we obtain
L2 (C , ) =
i
B
i
i
i
,
C+
0 Ci + i yi C+
Ci x i Ci C+
Ci 0 C+
2i
(3.12)
d
i + x i C ). Doing the same thing for the
(yi C+
where we have dropped out the total derivatives d
i
i
i
i
splitting X = x E+ + y A++ and Yi = yi E+ xi A and substituting these relations back
into Eq. (3.10), we get
2 B
2 B
(E+ 0 E+ E+ 0 E+ ) +
(A 0 A++ A++ 0 A )
2
2
B
(E+ E+ A++ A 1),
+
(3.13)
2
invariant under the projective symmetry, with a Lagrange gauge field parameter carrying the
conifold constraint hypersurface. By using the $D_0$ charge operator, the transformations (2.23)
can also be stated as $D_0 E_{\pm\mp} = 0$, $D_0 A_{\pm\pm} = \pm 2 A_{\pm\pm}$; they follow as well from the identities
$D_0 X^i = X^i$ and $D_0 Y_i = -Y_i$. Note that by substituting $E_{+-} = 1 + A_{+-}$ and $E_{-+} = 1 - A_{+-}$,
one sees that the term $(E_{-+} \partial_0 E_{+-} - E_{+-} \partial_0 E_{-+})$ reduces to a total derivative $\partial_0 (2A_{+-})$ and
so can be ignored in such a realization.
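The total-derivative statement above only uses the Leibniz rule, so it can be illustrated with exact integer polynomial arithmetic. The sketch below is an illustration under stated assumptions, not the paper's computation: the ordinary derivative $d/dt$ stands in for $\partial_0$, the field $A_{+-}$ is replaced by an arbitrary polynomial `A`, and the helper names `p_add`, `p_mul`, `p_diff` and `trim` are hypothetical.

```python
def p_add(p, q, sign=1):
    """Return p + sign * q for coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]


def p_mul(p, q):
    """Return the product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def p_diff(p):
    """Return dp/dt; this derivation stands in for the operator written as d_0."""
    return [i * c for i, c in enumerate(p)][1:] or [0]


def trim(p):
    """Remove trailing zero coefficients so that polynomials compare cleanly."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


# Arbitrary stand-in for the fluctuation A_{+-}:  A(t) = 2 t - 3 t^2 + t^3
A = [0, 2, -3, 1]
E_pm = p_add([1], A)           # E_{+-} = 1 + A
E_mp = p_add([1], A, sign=-1)  # E_{-+} = 1 - A

# Kinetic combination  E_{-+} d(E_{+-}) - E_{+-} d(E_{-+})
kinetic = p_add(p_mul(E_mp, p_diff(E_pm)),
                p_mul(E_pm, p_diff(E_mp)), sign=-1)

# Claimed result: the total derivative d(2 A)
total = [2 * c for c in p_diff(A)]

print(trim(kinetic) == trim(total))
```

The cubic and higher terms generated by the two products cancel pairwise, leaving exactly twice the derivative of `A`. The same cancellation holds for any derivation, including a logarithmic derivative.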
=
L2 [E, A, ]
T S3
and using the fact that this number is a constant, one gets a constraint equation on the Jacobian
2 X ,Y )
J (x, y) = | 2((x,y)
| of the general transformation
x X = X (x, y),
y Y = Y(x, y).
(3.15)
Eq. (3.14) requires that J (x, y) = 1. Let us give some details on this calculation. Since the density
is uniform, we should have
d 3 V = 0
d 3 v.
0
(3.16)
T S3
T S3
Using the explicit expressions of the conifold holomorphic volume 3-form which we write first
d
2
3
2
as d 3 V = d
d S and second d v = d s. Then expanding the K3 holomorphic 2-form
d 2 S = (dX i dYi ), we get after some straightforward algebra
d 2 S = X i , Yi + d 2 s + X i , Yi 0 dx l dxl + X i , Yi 0+ dy l dyl .
(3.17)
In this relation $d^2 s = (dx^i \wedge dy_i)$ and $\{f, g\}_{p,q}$ stand for the Poisson brackets defined as
$$\{f, g\}_{+-} = (D_{++} f)(D_{--} g) - (D_{--} f)(D_{++} g), \tag{3.18}$$
with $D_{\pm\pm,0}$ generating the SL(2, C) isometry, Eqs. (2.29). Volume preserving diffeomorphisms then require the following constraint equations to hold
i
X , Yi + = D++ X i (D Yi ) D X i (D++ Yi ) = ,
(3.19)
and
$$\{X^i, Y_i\}_{0-} = (D_0 X^i)(D_{--} Y_i) - (D_{--} X^i)(D_0 Y_i) = 0,$$
$$\{X^i, Y_i\}_{0+} = (D_0 X^i)(D_{++} Y_i) - (D_{++} X^i)(D_0 Y_i) = 0. \tag{3.20}$$
A careful inspection shows that the last two conditions are not really constraint equations. The
point is that, because of the identities
$$D_0 X^i = X^i, \qquad D_0 Y_i = -Y_i, \tag{3.21}$$
we have
$$\{X^i, Y_i\}_{0-} = D_{--}(X^i Y_i), \qquad \{X^i, Y_i\}_{0+} = D_{++}(X^i Y_i). \tag{3.22}$$
These brackets vanish identically because of the identity $X^i Y_i = \mu$ = constant. Therefore the volume transformation (3.17) becomes $d^2 S = \{X^i, Y_i\}_{+-}\, d^2 s$ and so we are left
with a single constraint relation, namely Eq. (3.19), which can be implemented in the field
action (3.10) with the help of a Lagrange gauge field $C_0$. To that purpose, note that by setting
$J_{\pm\pm} = C_0 Y_i D_{\pm\pm} X^i$, one can check that we have
d 3 v C0 X i , Yi + = d 3 v Yi C0 , X i +
(3.23)
where we have dropped out the boundary term d 3 v [D J++ + D++ J ]. Implementing this
identity in the field action as usual, we get the following holomorphic functional
SDSF [X , Y, C0 ] =
(3.24)
d 3 v LDSF (X , Y, C0 ),
T S3
with
LDSF [X , Y, C0 ] =
B
iB
Yi 0 X i X i 0 Yi +
Yi X i
2
2
B
Yi C0 , X i + X i C0 , Yi + .
(3.25)
Using the previous splitting of the D string fields X i and Yi , we can express this field action in terms of the gauge fields either as SDSF = SDSF [Ci , C0 , ] or equivalently as SDSF =
i and Y = y C .
SDSF [E, A, C0 , ]. Let us do this calculation for the splitting X i = x i +C+
i
i
i
In this case the density constraint equation {X i , Yi }+ = reads in terms of the Ci gauge fields
as follows
i
i
i
x , Ci + C+
(3.26)
, yi + + i C+
, Ci + = 0.
i F , {F, y }
This relation can be put into a more interesting way by setting {x i , F }+ = +
i + =
i
i =
i
i F with the remarkable properties + i = yi D++ (x D ) = D++ D and i +
i
x D (yi D++ ) = D D++ . Putting these relations back into (3.26), we obtain
k i
i
i
i k
Ci i C+
i +
C+ k Ci k C+
+ Ci = 0,
+
(3.27)
k G) ( k F )( G),
or equivalently by introducing Poisson bracket {F, G}PB (+k F )(
+k
i
i
i
+ Ci i C+ i C+ , Ci PB = 0.
(3.28)
Note also that $\{F, G\}_{PB}$ is just $\{F, G\}_{+-}$. As we see, this is a typical equation of motion of
noncommutative gauge theory; it can then be thought of as following from the minimization of a gauge invariant
field action $S_{DSF}[C_\pm, C_0]$ with gauge fields $C_\pm^i$ and $C_0$. In this view, we have
i
SDSF [C , C0 ]
i
i
= +
Ci i C+
i C+
, Ci PB = 0,
(3.29)
C0
from which we can determine SDSF [C , C0 ] taking into account equation (3.12). Setting
i
iB
, C0 , =
d 3 v LDSF [C , C0 , ],
SDSF C
2
T S3
we have
iB i
i
C+ 0 Ci Ci 0 C+
2
i
B
i
i
2C0 C+
, Ci PB
2 C0 + Ci C0 i C+
2
B i
i
.
yi C+ Ci x i Ci C+
+
(3.30)
2
This holomorphic Lagrangian density may be put into a more convenient form by performing an
integration by parts and dropping the total derivatives. Replacing
LDFS [C , C0 , ] =
iB
2i i
i
i
+
C0 + Ci + C0 i C+ + C0 C+ , Ci PB
2
3
i
iB
2i
i
i
+
Ci + C0 Ci 0 C+ Ci C+ , C0 PB ,
2
3
(3.31)
(3.32)
the previous field action reduces to noncommutative Chern-Simons gauge theory in real three
dimensions. In this case $(\mathrm{Re}\, B)(\mathrm{Re}\, \mu)$ should be equal to the Kac-Moody level $k$.
4. Holomorphy and quantum corrections
Though natural from the classical point of view, the correspondence between FQH systems and fluids of
D-strings in the conifold described above is no longer obvious at the quantum level. In the
D-string fluid proposal, the classical free degrees of freedom of the holomorphic sector
iB
SN [X, Y ] =
2
N
a=1
T S1
Xai
i Yia
Yia
Xa
,
(4.1)
iB
2
T S1
N
Yia
a=1
Y
Xai
Xai ia ,
(4.2)
may couple quantum mechanically unless this is forbidden by underlying symmetries. Typical
examples of such powerful symmetries are conformal invariance, supersymmetry and their extensions.
In this section, we make general comments on quantum effects in the D-string system and discuss how
supersymmetry can help to overcome the difficulties. The role of supersymmetry
can be motivated from several points of view, starting from the complex Kähler geometry of $T^*S^3$ and ending with topological aspects of 2d fields on the conifold. To fix ideas,
we recall the standard parallel between field holomorphy in conifold geometry and chirality in the 2d N = 2 supersymmetric nonlinear sigma model, captured by the usual supersymmetric
derivatives $D_{\pm 1/2}$. Using this parallel, we shall show that the holomorphic Lagrangian density
i
L(X, Y ) = B N
a=1 Yia (Xa / ) of the D-string fluid can be thought of as following from the
chiral superspace Lagrangian of the N = 2 supersymmetric sigma model in large B field
Lchiral [] =
(4.3)
d 2 W(),
SM
where refers
to generic chiral superfields and SM to chiral superspace. In this relation,
W() (B N
a=1 a1 a2 ) is chiral the superpotential. Substituting the chiral superfields ia
by their -expansions; i.e.,
ia Yia + + +1/2 1/2 Fia ,
i = 1, 2,
(4.4)
where we have dropped out fermions and where Fia are auxiliary fields to be specified in a
moment; then integrating with respect to the Grassman variables 1/2 , gives the following field
component product B( N
a=1 Yia Fia ). By taking the auxiliary fields Fia as
j
Xa
j
,
Fia =
(4.5)
ij Xa +
ij
where
ij is the usual spinor metric and the conifold complex parameters, one discovers, up to
a constant, the above holomorphic Lagrangian density.
Before going ahead, it should also be noted that the comments given below are certainly not final answers, but just an attempt to approach aspects of the quantum behaviour of the D-string
fluid in the conifold. The discussion presented below relies on the path integral method for quantization.
A more natural way, however, may be to extend the matrix model approach of Susskind and
Polychronakos (SP) for FQH droplets. Recall that the SP method uses canonical quantization. We
will give a brief comment on this method at the end of this section. More details may
be found in [15].
This discussion is organized as follows. In Section 4.1, we explore the consequences of quantum effects on the conifold geometry and derive the constraint equation for the quantum consistency of
the holomorphy property. Using the path integral quantization method, we show that holomorphy persists as long as quantum fluctuations are restricted to complex deformations of the conifold. Kähler deformations destroy this behaviour, since holomorphic and antiholomorphic
modes get coupled. In Section 4.2, we study the embedding of the D-string model in a supersymmetric theory, and more particularly in its chiral sectors. The latter seems to be the appropriate
theory governing the quantum fluctuations of the D-string fluid in the conifold. As a first step in
checking this statement, we start by describing the field theoretic derivation of the holomorphy hypothesis considered in Section 2. Then we give a correspondence with the 2d N = 2 supersymmetric
nonlinear sigma model with the conifold as target space, in the presence of a background magnetic
field B. We end this section by discussing the statistics of the D-string system, which requires a
filling fraction $\nu = 1/k$ with even integer Kac-Moody levels $k$.
4.1. Quantum effects and conifold deformations
One way (see footnote 1) to study the quantum effects on the holomorphy feature of the D-string fluid model is
to proceed as follows. First, think about the D-string fluid model as a classical field theory based
on the conifold geometry $xy - zw = \mu$. This means that the complex threefold, with its complex
modulus $\mu$, can be thought of as a classical geometry. Quantum mechanically, the above fields
are subject to fluctuations and so the complex parameter $\mu$ gets corrections induced by quantum
effects. To get an idea of the nature of these quantum corrections, we consider fluctuations of
the D-strings around the classical field configurations x, y, z, w. These field fluctuations can be
written as
+ F ,
= x, y, z, w,
(4.6)
with the generic fields = ( ) is as in Eq. (2.6) and F describing the perturbations around
the classical field . Notice that these fluctuations are involved in the computation of the partition
function Z[j ] of the model
i
Z[j ] =
(4.7)
S[] + j ,
D exp
h
where D stays for the usual field path integral measure. As it is known, this quantity generates the Green functions of the quantum model with j being the usual external source. Notice also
that the F deformations should a priori depend on both the string fields and their complex
Footnote 1: Another attempt to approach the fluid of D-strings in the conifold, using a generalization of the matrix model method
based on canonical quantization, has been developed in [15]. There, as a first step in dealing with the problem, one
focuses on the study of quantum droplets for the conifold sub-varieties $S^3$ and $S^2$.
F = F (, ).
(4.8)
By implementing the fluctuations (4.6) in the D-string fluid model, one discovers that the classical geometry $xy - zw = \mu$ we started with now gets deformed as follows
xy zw = + F ,
(4.9)
(4.10)
Like for Eq. (4.8), one sees that F depends in general on both the fields x, y, z, w and their
complex conjugate x
F = F (, ),
= x, y, z, w.
(4.11)
(4.12)
and so classical holomorphy is preserved quantum mechanically. This is the condition for the quantum decoupling of holomorphic and antiholomorphic degrees of freedom. This property has a
geometric interpretation in terms of deformations of the conifold structure: only complex
deformations of the holomorphic volume are allowed in a consistent quantum mechanics. It is also interesting to note that Eq. (4.12) is a strong condition; its solution requires
a strong symmetry which the D-string fluid model apparently does not exhibit manifestly, at least
not as things have been formulated so far. Note moreover that, as far as quantum holomorphy is
concerned, to our knowledge only supersymmetry has the power to deal with target
space holomorphy. There, quantum corrections are controlled by the so-called nonrenormalization theorem.
The next question is how the string fluid model could be related to the 2d N = 2 supersymmetric
nonlinear sigma model with the conifold as target space. Thinking about the D-string model as the
bosonic part of a supersymmetric theory does not exactly answer the question, since there are
Kähler deformations induced by quantum effects that destroy the classical holomorphy property.
To overcome this difficulty one should then associate the action of the D-string model with
chiral superpotentials
W = W(),
(4.13)
holomorphic gauge theory, suggest that this model can be embedded in an N = 2 supersymmetric theory, from which one can get information about quantum corrections. In this view,
the holomorphy property is interpreted as the target space manifestation of the chirality feature of 2d
N = 2 supersymmetric sigma models with the conifold as target space. A closely related idea is used in
building topological string theory by twisting the 2d N = 2 superconformal algebra [17] and in the
correspondence with type II superstrings on Calabi-Yau threefolds [18]. In our case, we have
the following correspondence
d . . . d 2 . . . ,
d . . . d 2 . . . ,
2
d . . . d 2 d 2 . . . ,
(4.14)
with the $\theta_{\pm 1/2}$'s and $\bar\theta_{\pm 1/2}$'s the usual Grassmann variables. Similar things may be also written
down for / and supersymmetric derivatives. Before that, let us start by deriving rigorously
the holomorphy hypothesis of Section 2 by using a field theoretical method; then come back to
the correspondence between target space holomorphy and 2d N = 2 supersymmetric chirality.
4.2.1. Holomorphy property and boundary QFT2
Holomorphy is one of the basic ingredients we have used in deriving the D-string model
developed in this paper. It has been imposed in order to complete the conifold realization $T^*S^3$
as a fibration of $T^*S^1$ over the base $T^*S^2$. In this study, we first give the field theoretic derivation
of this holomorphy hypothesis; it appears as the solution of a constraint equation required by
boundary field theory in two dimensions. Then we derive the field action (3.5); its connection
with supersymmetric models is considered in the next subsubsection.
Since the model we are studying involves complex fields, it is natural
to start from the following bosonic QFT2 field action
= d 2 G + ,
S[, ]
(4.15)
M
where M is a real surface parameterized by the local complex coordinates (, ). The fields
= (, ) form a set of complex 2d scalar fields parameterizing some target Khler manifold
To make contact with conifold geometry and the fluid of N
with metric G = G (, ).
strings, we think about these field variables as
(, ) = Xai (, ),
i = 1, 2, a = 1, . . . , N,
(4.16)
with $X_a^i$ an SU(2) doublet as in Eq. (2.12); to fix ideas, the field doublets $Y_{ia}$ are set to
$\bar{X}_{ai}$. Once the idea is exhibited, the field $\bar{X}_{ai}$ will be promoted to $Y_{ia}$. In this case, the Kähler
metric $G$ may be split as
= ab [g(ij ) + B
ij ],
G (, )
(4.17)
where the SU(2) triplet g(ij ) is a function on the target space field coordinates; i.e., g(ij ) =
and where
ij is the usual antisymmetric SU(2) invariant tensor. In the special case
g(ij ) (, ),
where B is field independent and strong enough so that we can neglect the term g(ij ) , the metric
G reduces essentially to Bab
ij ; and so one is left with the following approximated field
action
N
j
2
i
d B
ij
+ X a Xa ,
S[X, X]
(4.18)
a=1
B
+
2
a=1
d+ d +
B
2
X ia Xai
a=1
N
d 2
N
( + X ia )Xai + X ia + Xai ,
(4.19)
a=1
where the summation over SU(2) indices is understood. By integrating the two first terms of
decomposes as
above relation, one sees that the field action S[X, X]
S S boundary + S bulk
with two factors for
(4.20)
S boundary
= S bound
as given below
N
N
B
B
bound
i
i
=
d
(+ Xia )Xa +
d
S
Xia Xa ,
2
2
a=1
M+
(4.21)
a=1
where M stand for the oriented boundaries of the Riemann surface M and
S
bulk
d
M
B
i
i
N
B
a=1
Equating (4.18) and (4.21), one gets the holomorphy condition of the field variables
i
Xa (, )
= 0,
= 0.
Xia (, )
M
M
(4.22)
(4.23)
X ia (, ) = Xia
( ) + Xia
( ).
(4.24)
They tell us that on the boundary $\partial M$ of the Riemann surface, we have two heterotic free field
theories: a holomorphic sector with field variables
Xai ( ),
Xia
( ),
(4.25)
( ) = Y ( ), and an antiholomorphic
which, for convenience and avoiding confusion we set Xia
ia
one with
Xia
(4.26)
( ) = Xai ( ) ,
Xai ( ) = Yia ( ) ,
d 2
N
B
a=1
Xai ( + X ia ) +
N
N
B
B
i
i
,
Yia + Xa
d
(Yia ) Xa
2
2
a=1
B
Xia + Xai
2
M+
(4.27)
a=1
are in one-to-one correspondence with the usual three blocks of 2d N = 2 supersymmetric nonlinear sigma
models
+
+
2
2
2
d d d K i , i +
d 2 d 2 W(i )
SN =2 , =
SM
+ .
d 2 d 2 W
i
SM
(4.28)
SM+
In this relation, the symbol SM stands for the usual two-dimensional superspace with supercoordinates ( , 1/2 , 1/2 ) and SM stand for the two associated chiral superspaces. The
i s (respectively i+ ) are chiral (respectively antichiral) superfields living on SM (respectively SM+ ), K(, + ) is the Khler superpotential and W() the usual complex chiral superpotential. Like for the holomorphic functions f = f ( ) living on M and satisfying the
holomorphy property
f
= 0,
(4.29)
we have for chiral superfields ( , 1/2 ) living on SM , the following chirality property,
D 1/2 = 0.
(4.30)
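The holomorphy property of Eq. (4.29), which the text parallels with chirality, can be illustrated numerically. The following Python sketch is an illustration only and is not part of the paper: it estimates the Wirtinger derivative $\partial f/\partial \bar z = \frac{1}{2}(\partial_x + i \partial_y) f$ by central differences, and compares a holomorphic function with one that mixes $z$ and $\bar z$. The function names `dbar`, `holomorphic` and `mixed`, the sample point and the step size are all assumptions made for the example.

```python
def dbar(f, z, h=1e-6):
    """Central-difference estimate of the Wirtinger derivative df/d(conj z)."""
    dfdx = (f(z + h) - f(z - h)) / (2 * h)            # derivative along Re z
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # derivative along Im z
    return 0.5 * (dfdx + 1j * dfdy)


def holomorphic(z):
    """Depends on z only, so df/d(conj z) should vanish."""
    return z ** 3 + 2 * z


def mixed(z):
    """|z|^2 = z * conj(z) couples both sectors; df/d(conj z) equals z."""
    return z * z.conjugate()


z0 = 0.7 + 0.4j
print(abs(dbar(holomorphic, z0)))   # close to zero, up to discretization error
print(abs(dbar(mixed, z0) - z0))    # also close to zero: the derivative is z0
```

A function of $z$ alone passes the test, while $|z|^2$ fails it with a derivative equal to $z$. This is the elementary version of the statement that terms mixing holomorphic and antiholomorphic fields spoil the decoupling of the two sectors.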
By comparison of the two actions, one sees that the bulk term $S^{bulk}$ of the QFT2, Eq. (4.27),
is associated with the Kähler term of the supersymmetric sigma model
bulk
S [QFT2 ]
(4.31)
d 2 d 2 d 2 K i , i+ ,
SM
a=1
SM
where we have set Fai = ( Xai ) and by putting after setting Fia = ( (Xai )), we also have
N
B
i
+ .
d
(Yia ) Fa
d 2 d 2 W
(4.33)
i
2
M+
a=1
SM+
Now, considering two chiral superfields 1 = 1 ( , 1/2 ) and 2 = 2 ( , 1/2 ) with expansions
1 = Y1 + +1/2 1/2 + 1/2 +1/2 + +1/2 1/2 F1 ,
2 = Y2 + +1/2 1/2 + 1/2 +1/2 +1/2 1/2 F2 ,
(4.34)
with Yi and Fi being the bosonic complex fields, we can build the superpotential associated with
the boundary QFT2 . We have
N
N
B
B
2
(4.35)
d
a1 a2 =
Ya1 Fa2 Ya2 Fa1 ,
2
2
SM
a=1
a=1
which can also be written in a covariant form as $\frac{B}{2} \sum_{a=1}^{N} Y_{ia} F_a^i$.
At the end of this section, we note that it would be interesting to push further the similarity between the fluid of D-strings and the usual FQH systems. As a next step, it is important to
build the ground state $|0\rangle$ of the quantized D-string model, which may be done by extending the
matrix model approach of Susskind and Polychronakos. Recall in passing that the fundamental
wave function of the standard FQH system on the plane with filling fraction $\nu = \frac{1}{k}$ is described by the
Laughlin wave
L (x1 , . . . , xN )
N
(xa xb )k eB
N
2
a=1 |xa |
(4.36)
a<b=1
This wave function, conjectured long ago by Laughlin, has recently been
rederived rigorously in [16] using the matrix model method. Notice that under a permutation of
particles, the wave function behaves as
$$\Psi_L(x_1, \ldots, x_a, \ldots, x_b, \ldots, x_N) = (-)^k\, \Psi_L(x_1, \ldots, x_b, \ldots, x_a, \ldots, x_N). \tag{4.37}$$
The symmetry property of this function requires that $k$ be a positive odd integer for a system
of fermions and an even integer for bosons.
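The exchange property (4.37) can be checked numerically on a small configuration. The sketch below is a hedged illustration, not the paper's construction: it evaluates a Laughlin-type function $\prod_{a<b} (x_a - x_b)^k$ times a Gaussian factor at arbitrary complex positions and compares the value before and after swapping two particles. The Gaussian coefficient `gauss` is an assumed placeholder, since the exponent of Eq. (4.36) is not legible here, and it drops out of the ratio. The names `laughlin` and `exchange_ratio` are hypothetical.

```python
import math


def laughlin(xs, k, gauss=0.25):
    """Laughlin-type wave function at complex positions xs.

    gauss is an assumed Gaussian coefficient; it does not affect the
    exchange sign because the Gaussian factor is permutation invariant.
    """
    amp = 1 + 0j
    for a in range(len(xs)):
        for b in range(a + 1, len(xs)):
            amp *= (xs[a] - xs[b]) ** k
    return amp * math.exp(-gauss * sum(abs(x) ** 2 for x in xs))


def exchange_ratio(xs, a, b, k):
    """Return Psi(swapped) / Psi(original) when particles a and b are exchanged."""
    swapped = list(xs)
    swapped[a], swapped[b] = swapped[b], swapped[a]
    return laughlin(swapped, k) / laughlin(xs, k)


# Arbitrary sample positions in the complex plane
positions = [0.3 + 0.1j, -0.5 + 0.8j, 1.1 - 0.4j, -0.2 - 0.9j]

for k in (1, 2, 3):
    ratio = exchange_ratio(positions, 0, 2, k)
    print(k, round(ratio.real, 9), round(abs(ratio.imag), 9))
```

The ratio is real and equal to $(-1)^k$ up to rounding: odd $k$ gives an antisymmetric (fermionic) function and even $k$ a symmetric (bosonic) one, in line with the statement following Eq. (4.37).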
5. Conclusion and outlook
In this paper, we have proposed a gauge field theoretical model for a classical fluid
of D-strings moving in the conifold and commented on its quantum behaviour. The field action $S_{DSF}$ of this classical conifold model, in the presence of a strong and constant RR background
LDSF = i
(5.1)
where is a Lagrange gauge field capturing the conifold hypersurface. By setting {F, G}+ =
(D++ F )(D G) (D F )(D++ G) and using general properties of the Poisson bracket, in
particular antisymmetry and Jacobi identity as well as the property
C0 Yi , X i + = Yi C0 , X i + C0 Yi X i + (D++ J + D J++ ),
(5.2)
with J = (C0 Yi D X i ), the above holomorphic Lagrangian density LDSF can be also put
into a gauge covariant way as follows
LDSF = i
BRR
BRR
Yi D0 X i X i D0 Yi +
Yi X i ,
2
2
(5.3)
with D0 X i = 0 X i + i{C0 , X i }+ . The presence of the Poisson bracket {C0 , }+ in the gauge
covariant derivative D0 is a signal of noncommutative gauge theory in the same spirit as in
Susskind description of Laughlin fluid. The basic difference is that instead of a U (1) gauge
group, we have here a holomorphic C gauge symmetry acting on scalar field as = {, }+
and C0 = 0 + i{C0 , }+ with being the gauge parameter. Moreover, thinking about the
D-string field variables as
i
X i = x i + C+
,
Yi = yi Ci ,
(5.4)
Lreal
FQH =
Re(BRR )
i Xi 0 X i X i 0 Xi 2C0 D++ Xi D X i
2 Re()
Re(BRR )
+
2C0 D Xi D++ X i + Xi X i Re ,
2 Re
(5.5)
where X i = X i (, x, x),
Xi = (X i ), C0 = C0 , 0 = 0 and
D++ = x i
,
x i
D = x i
,
x i
D0 = [D++ , D ].
(5.6)
This analysis may also be viewed as a link between, on the one hand, topological strings on the conifold
and, on the other hand, noncommutative Chern-Simons gauge theory as well as FQH systems in
real three dimensions. It would be interesting to deepen this relation, which may be used to approach the attractor mechanism in flux compactifications by borrowing FQH ideas. To that purpose,
one should first identify the matrix model regularization of the continuous field theory developed
in this paper. This may be done by extending the results of [13,14] obtained in the framework
of fractional quantum Hall droplets. An attempt using matrix field variables valued in GL(N, C)
representations is under study in [15]; progress in this direction will be reported elsewhere.
Acknowledgement
This research work is supported by the program Protars III D12/25, CNRST.
References
[1] L. Susskind, The quantum Hall fluid and non-commutative Chern-Simons theory, hep-th/0101029.
[2] S. Hellerman, L. Susskind, Realizing the quantum Hall system in string theory, hep-th/0107200.
[3] D. Karabali, Electromagnetic interactions of higher-dimensional quantum Hall droplets, Nucl. Phys. B 726 (2005) 407-420, hep-th/0507027.
[4] A.E. Rhalami, E.H. Saidi, NC effective gauge model for multilayer FQH states, hep-th/0208144.
[5] A.E. Rhalami, E.M. Sahraoui, E.H. Saidi, NC branes and hierarchies in quantum Hall fluids, JHEP 0205 (2002) 004, hep-th/0108096.
[6] S.C. Zhang, Quantum Hall effect in higher dimensions, Talk given at the Conference on Higher-Dimensional Quantum Hall Effect, Chern-Simons Theory and Non-Commutative Geometry in Condensed Matter Physics and Field Theory, AS-ICTP, Trieste, 1-4 March 2005.
[7] H. Ooguri, A. Strominger, C. Vafa, Black hole attractors and the topological string, Phys. Rev. D 70 (2004) 106007, hep-th/0405146.
[8] H. Ooguri, C. Vafa, E. Verlinde, Hartle-Hawking wave-function for flux compactifications, hep-th/0502211.
[9] E.H. Saidi, Topological SL(2) gauge theory on conifold and noncommutative geometry, hep-th/0601020.
[10] J. Gates Jr., A. Jellal, E.H. Saidi, M. Schreiber, Supersymmetric embedding of the quantum Hall matrix model, JHEP 0411 (2004) 075, hep-th/0410070.
[11] K. Hasebe, Supersymmetric quantum Hall effect on fuzzy supersphere, Phys. Rev. Lett. 94 (2005) 206802, hep-th/0411137.
[12] S. Gukov, K. Saraikin, C. Vafa, A stringy wave function for an S^3 cosmology, hep-th/0505204.
[13] B. Morariu, A.P. Polychronakos, Fractional quantum Hall effect on the two-sphere: A matrix model proposal, Phys. Rev. D 72 (2005) 125002, hep-th/0510034.
[14] E.H. Saidi, Topological matrix model proposal for Laughlin wave and cousin state, Lab/UFR-HEP0517/GNPHE/0519/VACBT/0519.
[15] R. Ahl Laamara, L.B. Drissi, E.H. Saidi, D-string fluid in conifold, II: Matrix model for D-droplets, in preparation.
[16] S. Hellerman, M. Van Raamsdonk, Quantum Hall physics = noncommutative field theory, JHEP 0110 (2001) 039, hep-th/0103179.
[17] E.H. Saidi, M. Zakkari, Superconformal geometry from the Grassmann and harmonic analycities I: The N = 2 superconformal case, Int. J. Mod. Phys. A 6 (1991) 3151-3173;
E.H. Saidi, M. Zakkari, Superconformal geometry from the Grassmann and harmonic analycities II: The N = 4 SU(2) conformal case, Int. J. Mod. Phys. A 6 (1991) 3175-3200.
[18] M. Marino, Chern-Simons theory and topological strings, Rev. Mod. Phys. 77 (2005) 675-720, hep-th/0406005.