
Tyche Manual

Salvatore Cardamone

Popelier Group
The University of Manchester

Contents

1 Introduction
2 Theoretical Considerations
2.1 Frame of Reference
2.2 Dynamics
2.2.1 Kinetic Energy
2.2.2 Potential Energy
2.2.3 Solution
2.3 Coordinate Transformations
2.3.1 Internal Coordinates
2.3.2 Redundant Internal Coordinates
2.3.3 Atomic Local Frame
3 Implementation
3.1 Metropolis Monte Carlo
3.2 Normal Mode Calculation
3.3 Atomic Local Frame
3.4 Dynamical Evolution
3.5 Similarity Metric
3.6 Biased Equipartition
4 User Guide
4.1 Compilation Instructions
4.2 Requisite ab initio Calculations
4.3 Parameter File
4.4 Output

1 Introduction

Tyche is a molecular conformational sampling tool which attempts to replicate ab initio dynamics whilst circumventing the computational overheads typically associated with such methods. The author has attempted to forsake all associations with empirical force fields and crude parameterisations, which are necessarily highly system-specific and possess non-physical functional forms. Instead, it is hoped that Tyche permits the sampling of (discontinuous, stochastic) dynamical trajectories along the true energetic surface of any molecular system for which a single point ab initio calculation is feasible.

A set of $N_{\mathrm{seed}}$ molecular conformations is provided together with the first and second order spatial derivatives of their energies, which allows for a local approximation to the potential energy surface (PES) about these points. These local reconstructions subsequently permit the local exploration of molecular conformational space. By sampling about these points with a Metropolis Monte Carlo scheme, we explore the low energy regions of conformational space more thoroughly than the high energy regions of the PES. Introducing some nomenclature, our method may be considered as a piecewise reconstruction of the PES (PR-PES), upon which dynamics may be performed.

Dynamics are accounted for by expressing the molecular system in terms of its normal
coordinates of motion, and evolving these harmonically. This then replicates the internal vibrations of a molecule, i.e. those degrees of freedom which would be perturbed in
reality.

Tyche, the goddess of fortune and chance, is chosen as a namesake for the software in
accordance with the stochastic nature of the underlying methodology.

2 Theoretical Considerations

2.1 Frame of Reference

The complete elucidation of the state of a system is attained by specifying the following
criteria:
1. The three Cartesian components of the centre of mass.
2. The three Euler angles of a rotating Cartesian frame relative to a global frame, the
axes of which coincide with the three principal axes of inertia of the static system.
3. The Cartesian coordinates of each atom in the molecule relative to this rotating
coordinate system.
Condition (1) is immediately satisfied by allowing the origin of the reference frame to
coincide with the molecular centre of mass. Condition (2) then follows on by allowing
the global reference frame to be fixed to the rotating molecular frame of reference as defined. This subsequently leaves 3N degrees of freedom corresponding to those outlined
in condition (3).

However, a Cartesian basis neglects the fact that the system is invariant with respect to rigid translations and rotations (under the assumption of no external influences or fields). The $3N$, or $n_{\mathrm{tot}}$, degrees of freedom which form the Cartesian basis are therefore readily reduced to $3N - 6$, a quantity which we suggestively denote by $n_{\mathrm{vib}}$. The state vectors which inhabit this $n_{\mathrm{vib}}$-dimensional basis are referred to as internal coordinates.

Let $\mathbf{x}_\alpha$ denote the state vector of the $\alpha$th atom of a system, and $\bar{\mathbf{x}}_\alpha$ the corresponding equilibrium state vector of the atom. Displacements from equilibrium are subsequently given by $\Delta\mathbf{x}_\alpha = \mathbf{x}_\alpha - \bar{\mathbf{x}}_\alpha$. Constraining the origin of the reference frame to coincide with the centre of mass of the system, we require

$$\sum_{\alpha=1}^{N} m_\alpha \mathbf{x}_\alpha = 0$$

which retains validity when q = 0, since


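$$\sum_{\alpha=1}^{N} m_\alpha \Delta\mathbf{x}_\alpha = \sum_{\alpha=1}^{N} m_\alpha \left(\mathbf{x}_\alpha - \bar{\mathbf{x}}_\alpha\right) = \sum_{\alpha=1}^{N} m_\alpha \mathbf{x}_\alpha - \sum_{\alpha=1}^{N} m_\alpha \bar{\mathbf{x}}_\alpha = 0$$

which leads to the satisfaction of condition (1).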
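The centre of mass condition can be illustrated with a minimal sketch. The function names, the water-like geometry and the atomic masses below are illustrative assumptions and are not taken from Tyche itself; the sketch only shows that translating every atom by the centre of mass makes the mass-weighted sum of positions vanish.

```python
def centre_of_mass(masses, coords):
    """Return the mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * r[axis] for m, r in zip(masses, coords)) / total
        for axis in range(3)
    )


def shift_to_centre_of_mass(masses, coords):
    """Translate all atoms so that sum_alpha m_alpha x_alpha = 0."""
    com = centre_of_mass(masses, coords)
    return [tuple(r[axis] - com[axis] for axis in range(3)) for r in coords]


# Hypothetical water-like geometry (angstrom) and masses (amu).
masses = [15.999, 1.008, 1.008]
coords = [(0.000, 0.000, 0.117), (0.000, 0.757, -0.467), (0.000, -0.757, -0.467)]

shifted = shift_to_centre_of_mass(masses, coords)

# Mass-weighted sum of the shifted positions, one value per Cartesian axis.
residual = [sum(m * r[axis] for m, r in zip(masses, shifted)) for axis in range(3)]
```

Each component of `residual` is zero to within floating point rounding, which is the discrete statement of condition (1).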

2.2 Dynamics

We represent a molecular conformation by the $3N$-dimensional state vector, $\mathbf{x}$, the components of which correspond to the Cartesian degrees of freedom of each constituent atom. In order that we may obtain a (not necessarily continuous) dynamical trajectory, $\mathbf{x}(t)$, we require solutions to some equation of motion. For our purposes, we shall find it convenient to evaluate the Lagrangian, $L(\mathbf{x}, \dot{\mathbf{x}}) = T(\dot{\mathbf{x}}) - U(\mathbf{x})$, where $T(\dot{\mathbf{x}})$ and $U(\mathbf{x})$ are the kinetic and potential energies of a conformation, respectively.

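Once the Lagrangian has been obtained, the desired conformations are those $\mathbf{x}$ which satisfy the Euler-Lagrange equation

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{x}}} - \frac{\partial L}{\partial \mathbf{x}} = 0 \tag{2.1}$$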

In what follows, we shall find it convenient to evaluate the various energetic components about some reference state, which we denote $\bar{\mathbf{x}}$. Thus, we introduce the difference coordinate, $\Delta\mathbf{x} = \mathbf{x} - \bar{\mathbf{x}}$. We may undertake this without loss of generality, since $\bar{\mathbf{x}}$ is given as a parameter and possesses no time dependence.

2.2.1 Kinetic Energy

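The (classical) kinetic energy of a system is given by the familiar

$$T(\dot{\mathbf{x}}) = \frac{1}{2}\sum_{\alpha=1}^{N} m_\alpha \dot{\mathbf{x}}_\alpha^{\top} \dot{\mathbf{x}}_\alpha, \qquad \mathbf{x}_\alpha = (x_\alpha, y_\alpha, z_\alpha) \tag{2.2}$$

We simplify the above by introducing a set of generalised, mass-weighted coordinates, $\mathbf{q}_\alpha = \Delta\mathbf{x}_\alpha \sqrt{m_\alpha}$, which permits revision of (2.2) to

$$T(\dot{\mathbf{q}}) = \frac{1}{2}\sum_{\alpha=1}^{N} \dot{\mathbf{q}}_\alpha^{\top} \dot{\mathbf{q}}_\alpha \tag{2.3}$$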
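As a quick numerical check of the step from (2.2) to (2.3), the sketch below evaluates the kinetic energy once with explicit masses and once with mass-weighted velocities. The masses and velocities are arbitrary illustrative numbers in arbitrary units, and the function names are hypothetical.

```python
import math


def kinetic_cartesian(masses, velocities):
    """T = 1/2 sum_alpha m_alpha (v_alpha . v_alpha), as in (2.2)."""
    return 0.5 * sum(m * sum(c * c for c in v) for m, v in zip(masses, velocities))


def mass_weight(masses, vectors):
    """Scale each atomic vector by the square root of its atom's mass."""
    return [tuple(math.sqrt(m) * c for c in v) for m, v in zip(masses, vectors)]


def kinetic_mass_weighted(q_dot):
    """T = 1/2 sum_alpha (qdot_alpha . qdot_alpha), as in (2.3)."""
    return 0.5 * sum(sum(c * c for c in v) for v in q_dot)


# Two atoms with arbitrary masses and velocities (illustrative only).
masses = [12.0, 1.0]
velocities = [(0.1, 0.0, -0.2), (1.0, 0.5, 0.0)]

t_cart = kinetic_cartesian(masses, velocities)
t_mw = kinetic_mass_weighted(mass_weight(masses, velocities))
```

Both routes give the same energy because the reference state carries no time dependence, so the velocity of the difference coordinate equals the Cartesian velocity.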

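Transformation to this set of generalised coordinates has no effect on (2.1), since it is simply equivalent to a global scaling. The terms of (2.1) neatly split into derivatives with respect to $\mathbf{q}$ and $\dot{\mathbf{q}}$, and since the Lagrangian is additive, we see that

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}} \equiv \frac{d}{dt}\frac{\partial T}{\partial \dot{\mathbf{q}}}$$

thus,

$$\frac{d}{dt}\frac{\partial T}{\partial \dot{\mathbf{q}}} = \frac{d}{dt}\frac{\partial}{\partial \dot{\mathbf{q}}}\left[\frac{1}{2}\sum_{\alpha=1}^{N} \dot{\mathbf{q}}_\alpha^{\top}\dot{\mathbf{q}}_\alpha\right] = \frac{d}{dt}\sum_{\alpha=1}^{N}\dot{\mathbf{q}}_\alpha = \sum_{\alpha=1}^{N}\ddot{\mathbf{q}}_\alpha \tag{2.4}$$

We shall later find it convenient to replace the vector derivatives with derivatives with respect to the individual components of the vector, i.e.

$$\frac{d}{dt}\frac{\partial T}{\partial \dot{q}_k} = \ddot{q}_k \tag{2.5}$$

which has been sufficiently manipulated for the moment.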


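2.2.2 Potential Energy

Rather than evaluating all pairwise interactions through some functional form of the potential energy of a molecular conformation, we instead consider the Taylor series of the potential energy at the conformational state $\mathbf{x}$ about the aforementioned reference state $\bar{\mathbf{x}}$

$$U(\mathbf{x}) = U(\bar{\mathbf{x}}) + \sum_{i=1}^{3N} (x_i - \bar{x}_i)\left.\frac{\partial U}{\partial x_i}\right|_{\bar{\mathbf{x}}} + \frac{1}{2}\sum_{i,j=1}^{3N} (x_i - \bar{x}_i)(x_j - \bar{x}_j)\left.\frac{\partial^2 U}{\partial x_i \partial x_j}\right|_{\bar{\mathbf{x}}} + \cdots \tag{2.6}$$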

At this point, we are free to arbitrarily set the reference state energy, $U(\bar{\mathbf{x}}) = 0$, and subsequently measure all other terms relative to it. Typical derivations also continue under the assumption that $\bar{\mathbf{x}}$ occupies an energetic minimum, and so the first derivative term is also set equal to zero, thus removing the first order term from (2.6). However, we continue without such a constraint and permit $\bar{\mathbf{x}}$ to occupy any point on the molecular potential energy function, thus generalising any subsequent results. We do, however, truncate (2.6) at second order, which obviously presents a limitation if the potential energy function is severely anharmonic, in which case the quadratic approximation to the underlying function is erroneous.
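The effect of truncating at second order about a point that is not a minimum can be seen in a one-dimensional sketch. A Morse potential stands in for the true surface here; the potential, its parameters and the chosen reference point are assumptions made purely for illustration and have nothing to do with Tyche's input.

```python
import math

# Hypothetical Morse parameters: well depth, width and minimum position.
D, A, X0 = 1.0, 1.5, 1.0


def morse(x):
    e = math.exp(-A * (x - X0))
    return D * (1.0 - e) ** 2


def morse_d1(x):
    """Analytical first derivative of the Morse potential."""
    e = math.exp(-A * (x - X0))
    return 2.0 * D * A * e * (1.0 - e)


def morse_d2(x):
    """Analytical second derivative of the Morse potential."""
    e = math.exp(-A * (x - X0))
    return 2.0 * D * A * A * e * (2.0 * e - 1.0)


def quadratic_model(x, x_ref):
    """Second order Taylor expansion about x_ref, keeping the first order term."""
    dx = x - x_ref
    return morse(x_ref) + morse_d1(x_ref) * dx + 0.5 * morse_d2(x_ref) * dx * dx


x_ref = 1.2  # deliberately not the minimum, so the gradient term is non-zero
err_near = abs(quadratic_model(1.25, x_ref) - morse(1.25))
err_far = abs(quadratic_model(2.5, x_ref) - morse(2.5))
```

Close to the reference point the quadratic model is accurate, whereas far along the anharmonic tail it is not, which is why a collection of local expansions is needed rather than a single one.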

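We proceed by transformation to the set of mass-weighted difference coordinates used in the kinetic energy expression such that construction of the Lagrangian will later be trivial. This requires a simple substitution, and revision of (2.6) yields

$$U(\mathbf{q}) = \sum_{i=1}^{3N} \frac{q_i}{\sqrt{m_\alpha}}\left.\frac{\partial U}{\partial q_i}\right|_{\bar{\mathbf{q}}} + \frac{1}{2}\sum_{i,j=1}^{3N} \frac{q_i q_j}{\sqrt{m_\alpha m_\beta}}\left.\frac{\partial^2 U}{\partial q_i \partial q_j}\right|_{\bar{\mathbf{q}}}, \qquad \alpha = i - i \ (\mathrm{mod}\ 3), \quad \beta = j - j \ (\mathrm{mod}\ 3) \tag{2.7}$$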

We note that transformation of the derivative terms to this set of generalised coordinates requires no special treatment. The transformation is purely scaling in nature, which has no effect on the topology of the underlying function, thus the derivatives remain the same. Additionally, we observe that the derivative terms are evaluated at a fixed point, which permits their treatment as constants, which we may amalgamate with the inverse square-root mass-weightings to give

$$U(\mathbf{q}) = \sum_{i=1}^{3N} J_i q_i + \frac{1}{2}\sum_{i,j=1}^{3N} H_{ij} q_i q_j \tag{2.8}$$

where

$$J_i = \frac{1}{\sqrt{m_\alpha}}\left.\frac{\partial U}{\partial q_i}\right|_{\bar{\mathbf{q}}}, \qquad H_{ij} = \frac{1}{\sqrt{m_\alpha m_\beta}}\left.\frac{\partial^2 U}{\partial q_i \partial q_j}\right|_{\bar{\mathbf{q}}} \tag{2.9}$$

and $\alpha$, $\beta$ obey the same indexing rule as that given in (2.7). We have invoked the suggestive notation of $J$ and $H$ to indicate that these quantities represent the elements of the Jacobian and Hessian of $U$, respectively.
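A minimal sketch of how (2.8) might be evaluated in practice is given below. It assumes that the ab initio calculation supplies a Cartesian gradient and Hessian, that the atom owning coordinate `i` is `i // 3` (zero-based), and that all names and numbers are illustrative; none of this is taken from the Tyche source.

```python
import math


def mass_weight_derivatives(masses, grad, hess):
    """Scale a Cartesian gradient and Hessian by inverse square-root masses."""
    n = len(grad)
    # Assumption: coordinate i belongs to atom i // 3 (zero-based indexing).
    w = [1.0 / math.sqrt(masses[i // 3]) for i in range(n)]
    J = [w[i] * grad[i] for i in range(n)]
    H = [[w[i] * w[j] * hess[i][j] for j in range(n)] for i in range(n)]
    return J, H


def quadratic_energy(J, H, q):
    """U(q) = sum_i J_i q_i + 1/2 sum_ij H_ij q_i q_j, as in (2.8)."""
    n = len(q)
    linear = sum(J[i] * q[i] for i in range(n))
    quadratic = 0.5 * sum(H[i][j] * q[i] * q[j] for i in range(n) for j in range(n))
    return linear + quadratic


# One atom, three degrees of freedom, arbitrary illustrative numbers.
masses = [4.0]
grad = [0.2, 0.0, -0.1]
hess = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]]
dx = [0.1, -0.2, 0.05]

J, H = mass_weight_derivatives(masses, grad, hess)
q = [math.sqrt(masses[i // 3]) * dx[i] for i in range(3)]  # mass-weighted displacement
u_mass_weighted = quadratic_energy(J, H, q)
u_cartesian = quadratic_energy(grad, hess, dx)
```

The two energies agree, since the mass factors absorbed into `J` and `H` cancel those carried by `q`.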

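Given the form of the potential energy, we are free to evaluate its contribution to (2.1), where we see, as for the kinetic energy

$$\frac{\partial L}{\partial \mathbf{q}} \equiv -\frac{\partial U}{\partial \mathbf{q}}$$

Thus,

$$\frac{\partial U}{\partial \mathbf{q}} = \frac{\partial}{\partial \mathbf{q}}\left[\sum_{i=1}^{3N} J_i q_i + \frac{1}{2}\sum_{i,j=1}^{3N} H_{ij} q_i q_j\right] \tag{2.10}$$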

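Manipulating derivatives with respect to vectors is needlessly cumbersome when the basis is orthogonal. Instead, we give an expression for the $k$th component of $\partial U / \partial \mathbf{q}$, and take advantage of the resultant Kronecker deltas

$$\begin{aligned}
\frac{\partial U}{\partial q_k} &= \frac{\partial}{\partial q_k}\left[\sum_{i=1}^{3N} J_i q_i + \frac{1}{2}\sum_{i,j=1}^{3N} H_{ij} q_i q_j\right] \\
&= \sum_{i=1}^{3N} J_i \frac{\partial q_i}{\partial q_k} + \frac{1}{2}\sum_{i,j=1}^{3N} H_{ij} \frac{\partial}{\partial q_k}\left(q_i q_j\right) \\
&= \sum_{i=1}^{3N} J_i \delta_{ik} + \frac{1}{2}\sum_{i,j=1}^{3N} H_{ij}\left[q_i \frac{\partial q_j}{\partial q_k} + q_j \frac{\partial q_i}{\partial q_k}\right] \\
&= \sum_{i=1}^{3N} J_i \delta_{ik} + \frac{1}{2}\sum_{i,j=1}^{3N} H_{ij}\left[q_i \delta_{jk} + q_j \delta_{ik}\right]
\end{aligned}$$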

Owing to the Kronecker deltas, each part of the double summation in the final term collapses to a single summation over the index $i$. The two resulting sums are equal owing to the symmetry of the Hessian ($H_{ij} = H_{ji}$), which cancels the factor of one half. Finalising,

$$\frac{\partial U}{\partial q_k} = J_k + \sum_{i=1}^{3N} H_{ik} q_i \tag{2.11}$$

2.2.3 Solution

Given (2.5) and (2.11), we are now in a position to evaluate (2.1) and determine the conformations which satisfy it. Combining all components

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}} - \frac{\partial L}{\partial \mathbf{q}} \equiv \frac{d}{dt}\frac{\partial T}{\partial \dot{\mathbf{q}}} + \frac{\partial U}{\partial \mathbf{q}} = 0$$

Therefore, those $\mathbf{q}(t)$ which satisfy

$$\ddot{q}_k + J_k + \sum_{i=1}^{3N} H_{ik} q_i = 0, \qquad k = 1, \ldots, 3N \tag{2.12}$$

represent realistic molecular conformations. Rewriting in a more suggestive manner

$$\frac{d^2 q_k}{dt^2} + \sum_{i=1}^{3N} H_{ik} q_i = -J_k, \qquad k = 1, \ldots, 3N \tag{2.13}$$

we see that finding those conformations is simply a matter of solving a set of $3N$ coupled inhomogeneous second order differential equations. To do this, we proceed by initially considering the underlying homogeneous differential equation

$$\frac{d^2 q_k}{dt^2} + \sum_{i=1}^{3N} H_{ik} q_i = 0 \tag{2.14}$$

which has general harmonic solutions of the form

$$q_k = A_k \cos(\omega t) + B_k \sin(\omega t)$$

This can be modified to a single sinusoid with a phase factor for ease of later manipulation

$$q_k = A_k \cos(\omega t + \phi) \tag{2.15}$$

Now, to account for the inhomogeneity present within (2.13), we append an as yet undetermined function, $\psi_k$, to our solution, and subsequently use this in (2.13) to evaluate the form of $\psi_k$. Thus

$$\begin{aligned}
\frac{d^2}{dt^2}\left[A_k \cos(\omega t + \phi) + \psi_k\right] + \sum_{i=1}^{3N}\left[H_{ik} A_i \cos(\omega t + \phi) + H_{ik}\psi_i\right] &= -J_k \\
-\omega\frac{d}{dt} A_k \sin(\omega t + \phi) + \sum_{i=1}^{3N}\left[H_{ik} A_i \cos(\omega t + \phi) + H_{ik}\psi_i\right] &= -J_k \\
-\omega^2 A_k \cos(\omega t + \phi) + \sum_{i=1}^{3N}\left[H_{ik} A_i \cos(\omega t + \phi) + H_{ik}\psi_i\right] &= -J_k
\end{aligned}$$

We here point out that in going from the first to the second step, we have assumed that $\psi_k$ possesses no explicit time dependence, thus permitting its temporal derivative to vanish. We shall return to this assumption once the functional form has been elucidated. If we constrain

$$\sum_{i=1}^{3N} H_{ik}\psi_i = -J_k$$

then we recover a homogeneous differential equation

$$-\omega^2 A_k \cos(\omega t + \phi) + \sum_{i=1}^{3N} H_{ik} A_i \cos(\omega t + \phi) = 0 \tag{2.16}$$

and our constraint condition yields the previously unknown functional form of the term we added to the general solution (2.15)

$$\psi_k = -\frac{J_k}{\sum_{i=1}^{3N} H_{ik}}$$

Recalling our assumption that the temporal derivative of $\psi_k$ was equal to zero, we check this by dimensional analysis. $J_k$ has units of force, whilst $H_{ik}$ has units of force per unit length. Thus, we see that $\psi_k$ must have units of length, and thus possesses no explicit time dependence. The general solution to our inhomogeneous differential equation is therefore given by

$$q_k = A_k \cos(\omega t + \phi) - \frac{J_k}{\sum_{i=1}^{3N} H_{ik}} \tag{2.17}$$

which is the equation of motion for a driven harmonic oscillator. Note that this is dimensionally consistent. Thus, we may infer that when the first derivative of the potential
energy function pertaining to a molecular conformation does not vanish, the resultant
equations of motion drive the system to the region of the potential energy function where
the first derivative does vanish. Crudely put, the system falls down the potential energy well to its equilibrium conformation.
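The qualitative statement above can be checked numerically for a single degree of freedom, where the equation of motion that follows from $L = T - U$ reduces to $\ddot{q} + Hq = -J$ and the oscillation is centred on $-J/H$. The sketch below is a hypothetical one-dimensional illustration with arbitrary parameter values; it is not a description of how Tyche propagates a multi-dimensional system.

```python
import math


def displaced_oscillator(t, amplitude, phase, J, H):
    """q(t) = A cos(omega t + phi) - J / H, with omega^2 = H (one dimension)."""
    omega = math.sqrt(H)
    return amplitude * math.cos(omega * t + phase) - J / H


def equation_residual(t, amplitude, phase, J, H, h=1e-4):
    """Evaluate q'' + H q + J using a central finite difference for q''."""
    def q(s):
        return displaced_oscillator(s, amplitude, phase, J, H)

    q_ddot = (q(t + h) - 2.0 * q(t) + q(t - h)) / (h * h)
    return q_ddot + H * q(t) + J


# Arbitrary illustrative values: non-zero gradient J and positive curvature H.
J, H = 0.3, 2.0
amplitude, phase = 0.5, 0.1

residuals = [equation_residual(t, amplitude, phase, J, H) for t in (0.0, 0.7, 2.3)]
centre = -J / H  # point about which the coordinate oscillates
```

The residuals vanish to within finite difference error, and `centre` is the stationary point of the local quadratic model, i.e. the bottom of the well towards which the gradient pushes the system.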

We are now left with a system of $3N$ homogeneous differential equations in the $3N$ unknowns $A_k$, $k = 1, \ldots, 3N$. Solving for the $A_k$ requires manipulation of (2.16), which we now undertake. Firstly, we divide through by the harmonic term $\cos(\omega t + \phi)$ to yield

$$-\omega^2 A_k + \sum_{i=1}^{3N} H_{ik} A_i = 0$$

This places an immediate constraint on our solutions, in that we are not permitted to allow $\cos(\omega t + \phi) = 0$, which is realised when $\omega t + \phi = (2n + 1)\pi/2$, $n \in \mathbb{Z}$. Continuing with our derivation, we rearrange the above to read

$$\sum_{i=1}^{3N} A_i\left[H_{ik} - \omega^2 \delta_{ik}\right] = 0 \tag{2.18}$$

which is an eigensystem, for which there exist $3N$ values of $\omega$ giving rise to non-trivial solutions for the $q_k(t)$, i.e. those for which $A_k \neq 0$.
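To make the eigensystem concrete, the following sketch solves it in closed form for a hypothetical two-dimensional mass-weighted Hessian. The matrix entries are arbitrary, and the closed-form 2x2 solver is used only to keep the example self-contained; a real $3N$-dimensional problem would call a numerical eigensolver.

```python
import math


def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues, in ascending order, of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius


def mode_frequencies(a, b, c):
    """omega = sqrt(eigenvalue); assumes a positive definite Hessian."""
    return [math.sqrt(lam) for lam in symmetric_2x2_eigenvalues(a, b, c)]


# Hypothetical mass-weighted Hessian H = [[2, -1], [-1, 2]].
H = [[2.0, -1.0], [-1.0, 2.0]]
omega_low, omega_high = mode_frequencies(H[0][0], H[0][1], H[1][1])

# The in-phase vector A = (1, 1) belongs to the lower frequency; check (2.18).
A = (1.0, 1.0)
residual = [
    sum(A[i] * (H[i][k] - omega_low ** 2 * (1.0 if i == k else 0.0)) for i in range(2))
    for k in range(2)
]
```

Here the squared frequencies are the eigenvalues 1 and 3, and substituting the in-phase eigenvector into the left-hand side of (2.18) gives zero for both values of $k$.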

2.3 Coordinate Transformations

2.3.1 Internal Coordinates

2.3.2 Redundant Internal Coordinates

In the previous section, we considered the minimal set of $n_{\mathrm{vib}}$ vibrational degrees of freedom. These correspond to collective atomic motions relative to one another. However, we may alternatively isolate each vibrational degree of freedom corresponding to more physically comprehensible motions. This necessitates a set of $n_{\mathrm{red}}$ degrees of freedom, termed redundant internal coordinates, where, by definition, $n_{\mathrm{red}} \geq n_{\mathrm{vib}}$. Whilst a set of redundant internal coordinates is by no means unique, and is defined by any basis which satisfies this inequality, we shall use the term to represent that popularised by Wilson.

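The first degrees of freedom we consider in this basis correspond to bond lengths between atoms which are physically connected, and are given by

$$q^{\mathrm{bond}}_{\alpha\beta} = \sqrt{\mathbf{r}'_{\alpha\beta} \cdot \mathbf{r}'_{\alpha\beta}}, \qquad \alpha, \beta \ \text{bonded} \tag{2.19}$$

where $\alpha$, $\beta$ correspond to distinct atoms, and $\mathbf{r}'_{\alpha\beta} = \mathbf{r}'_\alpha - \mathbf{r}'_\beta$ are termed bond vectors. We shall find it convenient later on to invoke the convention of denoting absolute bond vectors with primes, whilst normalised bond vectors are unprimed, i.e. $\mathbf{r}_{\alpha\beta} = \mathbf{r}'_{\alpha\beta} / |\mathbf{r}'_{\alpha\beta}|$.

The criterion as to whether $\alpha$ and $\beta$ are bonded is somewhat arbitrary, but within Tyche, we utilise

$$\alpha, \beta = \begin{cases} \text{bonded} & \text{if } |\mathbf{r}'_{\alpha\beta}| \leq k\left(r^{\mathrm{vdW}}_\alpha + r^{\mathrm{vdW}}_\beta\right) \\ \text{not bonded} & \text{otherwise} \end{cases}$$

where $r^{\mathrm{vdW}}$ is the van der Waals radius of an atom, and $k$ is an adjustable stretch factor. This is parameterised as 1.20 to conform with the Jmol visualisation software.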
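The bonding criterion might be sketched as follows. The function name and the small table of Bondi van der Waals radii are illustrative assumptions; the radii actually used by Tyche are not stated in this manual.

```python
import math

# Bondi van der Waals radii in angstrom; an illustrative subset, not Tyche's table.
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}
STRETCH_FACTOR = 1.20  # the value of k quoted in the text


def is_bonded(elem_a, pos_a, elem_b, pos_b, k=STRETCH_FACTOR, radii=VDW_RADII):
    """Apply |r_ab| <= k (r_vdW_a + r_vdW_b) to a pair of atoms."""
    separation = math.dist(pos_a, pos_b)
    return separation <= k * (radii[elem_a] + radii[elem_b])


# A carbon and a hydrogen at a typical C-H distance, and then far apart.
close_pair = is_bonded("C", (0.0, 0.0, 0.0), "H", (1.09, 0.0, 0.0))
distant_pair = is_bonded("C", (0.0, 0.0, 0.0), "H", (5.0, 0.0, 0.0))
```

With these radii the carbon-hydrogen cutoff is 1.20 x (1.70 + 1.20) = 3.48 angstrom, so the first pair is flagged as bonded and the second is not.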

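The next redundant internal coordinates we consider correspond to the angles subtended by two bond vectors which share a common atom, i.e.

$$q^{\mathrm{angle}}_{\alpha\beta\gamma} = \arccos\left(\frac{\mathbf{r}'_{\alpha\beta} \cdot \mathbf{r}'_{\beta\gamma}}{|\mathbf{r}'_{\alpha\beta}||\mathbf{r}'_{\beta\gamma}|}\right) = \arccos\left(\mathbf{r}_{\alpha\beta} \cdot \mathbf{r}_{\beta\gamma}\right), \qquad (\alpha, \beta) \wedge (\beta, \gamma) \ \text{bonded} \tag{2.20}$$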

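The final redundant internal coordinates we require are those which report on the dihedral degrees of freedom as defined by the torsional angle subtended by two bond vectors, $\mathbf{r}'_{\alpha\beta}$ and $\mathbf{r}'_{\gamma\delta}$, such that $\mathbf{r}'_{\beta\gamma} \in \{q^{\mathrm{bond}}\}$. Demonstrating the mathematical form for these torsional degrees of freedom is facilitated by reference to a schematic (QQ:Figure).

Here, we see that the dihedral angle, $q^{\mathrm{dihedral}}_{\alpha\beta\gamma\delta}$, is calculated by considering the vectors, $\mathbf{e}'_1$ and $\mathbf{e}'_2$, perpendicular to the planes spanned by $(\mathbf{r}'_{\alpha\beta}, \mathbf{r}'_{\beta\gamma})$ and $(\mathbf{r}'_{\beta\gamma}, \mathbf{r}'_{\gamma\delta})$, respectively. In other words

$$\mathbf{e}'_1 = \mathbf{r}'_{\alpha\beta} \times \mathbf{r}'_{\beta\gamma}, \qquad \mathbf{e}'_2 = \mathbf{r}'_{\beta\gamma} \times \mathbf{r}'_{\gamma\delta}$$

The normalised forms for these vectors, $\mathbf{e}_1$ and $\mathbf{e}_2$, are given by the definition of the absolute value of the cross product, $|\mathbf{a} \times \mathbf{b}| = |\mathbf{a}||\mathbf{b}| \sin\theta$, such that

$$\mathbf{e}_1 = \frac{\mathbf{e}'_1}{|\mathbf{e}'_1|} = \frac{\mathbf{r}'_{\alpha\beta} \times \mathbf{r}'_{\beta\gamma}}{|\mathbf{r}'_{\alpha\beta}||\mathbf{r}'_{\beta\gamma}| \sin q^{\mathrm{angle}}_{\alpha\beta\gamma}}, \qquad \mathbf{e}_2 = \frac{\mathbf{e}'_2}{|\mathbf{e}'_2|} = \frac{\mathbf{r}'_{\beta\gamma} \times \mathbf{r}'_{\gamma\delta}}{|\mathbf{r}'_{\beta\gamma}||\mathbf{r}'_{\gamma\delta}| \sin q^{\mathrm{angle}}_{\beta\gamma\delta}}$$

and we find that the cosine of the dihedral is given by the scalar product of these two quantities, i.e.

$$\begin{aligned}
q^{\mathrm{dihedral}}_{\alpha\beta\gamma\delta} &= \arccos\left[\frac{\left(\mathbf{r}'_{\alpha\beta} \times \mathbf{r}'_{\beta\gamma}\right) \cdot \left(\mathbf{r}'_{\beta\gamma} \times \mathbf{r}'_{\gamma\delta}\right)}{|\mathbf{r}'_{\alpha\beta}||\mathbf{r}'_{\beta\gamma}|^2|\mathbf{r}'_{\gamma\delta}| \sin q^{\mathrm{angle}}_{\alpha\beta\gamma} \sin q^{\mathrm{angle}}_{\beta\gamma\delta}}\right] \\
&= \arccos\left[\frac{\left(\mathbf{r}_{\alpha\beta} \times \mathbf{r}_{\beta\gamma}\right) \cdot \left(\mathbf{r}_{\beta\gamma} \times \mathbf{r}_{\gamma\delta}\right)}{\sin q^{\mathrm{angle}}_{\alpha\beta\gamma} \sin q^{\mathrm{angle}}_{\beta\gamma\delta}}\right]
\end{aligned} \tag{2.21}$$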
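The three kinds of redundant internal coordinate can be computed from Cartesian positions with a few vector helpers. The sketch below assumes the bond vector convention $\mathbf{r}_{\alpha\beta} = \mathbf{r}_\alpha - \mathbf{r}_\beta$ and uses a hypothetical four-atom geometry; the function names are illustrative and the routine returns an unsigned dihedral in $[0, \pi]$.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)


def clamped_acos(value):
    """Guard arccos against rounding slightly outside [-1, 1]."""
    return math.acos(max(-1.0, min(1.0, value)))


def bond_length(xa, xb):
    r = sub(xa, xb)
    return math.sqrt(dot(r, r))


def bond_angle(xa, xb, xc):
    """Angle between r_ab and r_bc (the supplement of the usual bond angle)."""
    return clamped_acos(dot(unit(sub(xa, xb)), unit(sub(xb, xc))))


def dihedral(xa, xb, xc, xd):
    """Angle between the normals of the (r_ab, r_bc) and (r_bc, r_cd) planes."""
    r_ab, r_bc, r_cd = sub(xa, xb), sub(xb, xc), sub(xc, xd)
    # Normalising each cross product is equivalent to dividing by |a||b| sin(angle).
    e1 = unit(cross(r_ab, r_bc))
    e2 = unit(cross(r_bc, r_cd))
    return clamped_acos(dot(e1, e2))


# Hypothetical geometry with mutually perpendicular bonds.
p1, p2, p3, p4 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
length_12 = bond_length(p1, p2)
angle_123 = bond_angle(p1, p2, p3)
torsion_1234 = dihedral(p1, p2, p3, p4)
```

For this geometry the bond length is 1 and both the angle and the dihedral are $\pi/2$. The normalisation fails for collinear atoms, where the cross product vanishes, so a production routine would need to treat that case separately.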

Given the elements of $\mathbf{q}$, we now require the transformation array, $\mathbf{B}$, from the Cartesian state vector, $\mathbf{x}$, i.e. we wish to satisfy

$$\mathbf{q} = \mathbf{B}\mathbf{x} \tag{2.22}$$

Unfortunately, we cannot use the trick from the previous section, whereby we have the analytical form for some vectors in this basis, and subsequently evaluate the rest by some orthonormalisation procedure, since the redundant internal coordinates do not form a mutually orthonormal basis. As such, we resort to calculating the elements of $\mathbf{B}$ by considering it as the Jacobian of the transformation

$$B_{ij} = \frac{\partial q_i}{\partial x_j}$$

such that any small displacement in the Cartesian basis translates into a small displacement in the redundant internal coordinate basis. This then necessitates the partial derivatives of the expressions (2.19), (2.20) and (2.21) with respect to all Cartesian degrees of freedom.

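We begin by considering the elements $\partial q^{\mathrm{bond}}_{\alpha\beta} / \partial x_i$

$$\frac{\partial}{\partial x_i}\sqrt{\mathbf{r}'_{\alpha\beta} \cdot \mathbf{r}'_{\alpha\beta}} = \frac{1}{2|\mathbf{r}'_{\alpha\beta}|}\frac{\partial}{\partial x_i}\left(\mathbf{r}'_{\alpha\beta} \cdot \mathbf{r}'_{\alpha\beta}\right) = \frac{r'_i\,\sigma(x_i)}{|\mathbf{r}'_{\alpha\beta}|}$$

$$\frac{\partial q^{\mathrm{bond}}_{\alpha\beta}}{\partial x_i} = r_i\,\sigma(x_i), \qquad (i = 1, 2, 3) \tag{2.23}$$

where we have introduced the sign factor

$$\sigma(x_i) = \begin{cases} 1 & \text{if } x_i \in \mathbf{r}_\alpha \\ -1 & \text{if } x_i \in \mathbf{r}_\beta \\ 0 & \text{otherwise} \end{cases} \tag{2.24}$$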
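The bond-length derivatives can be checked against central finite differences. In the sketch below, the row layout (three entries for the first atom followed by three for the second) and the test geometry are illustrative assumptions rather than Tyche's storage convention.

```python
import math


def bond_b_row(xa, xb):
    """Analytical derivatives of |xa - xb| with respect to (xa, xb).

    The entries are the components of the unit bond vector, with sign +1 for
    the coordinates of the first atom and -1 for those of the second.
    """
    length = math.dist(xa, xb)
    unit = [(a - b) / length for a, b in zip(xa, xb)]
    return unit + [-u for u in unit]


def finite_difference_row(xa, xb, h=1e-6):
    """Central finite difference estimate of the same six derivatives."""
    coords = list(xa) + list(xb)
    row = []
    for i in range(6):
        plus, minus = coords[:], coords[:]
        plus[i] += h
        minus[i] -= h
        d_plus = math.dist(plus[:3], plus[3:])
        d_minus = math.dist(minus[:3], minus[3:])
        row.append((d_plus - d_minus) / (2.0 * h))
    return row


# Hypothetical pair of atoms separated by a distance of exactly 3.
xa, xb = (0.0, 0.0, 0.0), (1.0, 2.0, 2.0)
analytic = bond_b_row(xa, xb)
numeric = finite_difference_row(xa, xb)
max_error = max(abs(a - n) for a, n in zip(analytic, numeric))
```

The analytical row is $(-1/3, -2/3, -2/3, 1/3, 2/3, 2/3)$, and it agrees with the numerical estimate to within finite difference error.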

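Next are the elements $\partial q^{\mathrm{angle}}_{\alpha\beta\gamma} / \partial x_i$, the derivation of which is somewhat more involved. Consider the full form of the expression

$$\frac{\partial q^{\mathrm{angle}}_{\alpha\beta\gamma}}{\partial x_i} = \frac{\partial}{\partial x_i}\arccos(\mathbf{u} \cdot \mathbf{v}) = \frac{\partial}{\partial f}\arccos(f)\,\frac{\partial}{\partial x_i}\left(\mathbf{u} \cdot \mathbf{v}\right), \qquad (f = \mathbf{u} \cdot \mathbf{v})$$

which follows from application of the chain rule. The derivative of $\arccos(f)$ is

$$\frac{\partial}{\partial f}\arccos(f) = -\frac{1}{\sqrt{1 - f^2}} = -\frac{1}{\sqrt{1 - (\mathbf{u} \cdot \mathbf{v})^2}} = -\frac{1}{\sin\left(q^{\mathrm{angle}}_{\alpha\beta\gamma}\right)}$$

The derivative of the scalar product is subsequently given by the product rule

$$\frac{\partial}{\partial x_i}\left(\mathbf{u} \cdot \mathbf{v}\right) = \mathbf{v} \cdot \frac{\partial \mathbf{u}}{\partial x_i} + \mathbf{u} \cdot \frac{\partial \mathbf{v}}{\partial x_i} = v_i u_i\,\sigma_u(x_i) + u_i v_i\,\sigma_v(x_i)$$

which permits us to write

$$\frac{\partial q^{\mathrm{angle}}_{\alpha\beta\gamma}}{\partial x_i} = -\frac{u_i v_i}{\sin\left(q^{\mathrm{angle}}_{\alpha\beta\gamma}\right)}\left[\sigma_u(x_i) + \sigma_v(x_i)\right] \tag{2.25}$$

2.3.3 Atomic Local Frame

3 Implementation

3.1 Metropolis Monte Carlo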

Given a conformational state vector $\mathbf{q}$, we wish to evaluate the probability of transition to some alternative state, $\mathbf{q}'$, which we denote by $P(\mathbf{q} \to \mathbf{q}')$. In terms of an algorithmic implementation, we may take this probability to be the product of two individual terms. The first of these corresponds to the probability of proposing the actual transition from $\mathbf{q}$ to $\mathbf{q}'$, which we denote by $G(\mathbf{q} \to \mathbf{q}')$. The second is the probability of accepting transition to the proposed state, which we denote by $A(\mathbf{q} \to \mathbf{q}')$. Thus,

$$P(\mathbf{q} \to \mathbf{q}') = G(\mathbf{q} \to \mathbf{q}')\,A(\mathbf{q} \to \mathbf{q}') \tag{3.1}$$

In order that we may properly invoke the Metropolis scheme, we require that our implementation satisfies the detailed balance condition

$$P(\mathbf{q})\,P(\mathbf{q} \to \mathbf{q}') = P(\mathbf{q}')\,P(\mathbf{q}' \to \mathbf{q}) \tag{3.2}$$

proposed by Kolmogorov to ensure that for any closed sampling pathway (i.e. one which
begins and ends with the same state), there is no preferred direction of sampling. In
other words, the probability of traversing the closed pathway in one direction is no more
favourable than in the opposite direction. Rearranging to a more useful form
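$$\frac{P(\mathbf{q} \to \mathbf{q}')}{P(\mathbf{q}' \to \mathbf{q})} = \frac{P(\mathbf{q}')}{P(\mathbf{q})}$$

and revising (3.1) for the backward transition probability,

$$P(\mathbf{q}' \to \mathbf{q}) = G(\mathbf{q}' \to \mathbf{q})\,A(\mathbf{q}' \to \mathbf{q})$$

we may form a combination of the two to yield an expression for the acceptance probabilities

$$\frac{P(\mathbf{q} \to \mathbf{q}')}{P(\mathbf{q}' \to \mathbf{q})} = \frac{G(\mathbf{q} \to \mathbf{q}')\,A(\mathbf{q} \to \mathbf{q}')}{G(\mathbf{q}' \to \mathbf{q})\,A(\mathbf{q}' \to \mathbf{q})} = \frac{P(\mathbf{q}')}{P(\mathbf{q})}$$

$$\frac{A(\mathbf{q} \to \mathbf{q}')}{A(\mathbf{q}' \to \mathbf{q})} = \frac{P(\mathbf{q}')\,G(\mathbf{q}' \to \mathbf{q})}{P(\mathbf{q})\,G(\mathbf{q} \to \mathbf{q}')} \tag{3.3}$$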

We immediately remove some underlying complexity from the above by constraining the various $G(\mathbf{q} \to \mathbf{q}')$ to unity, which may be interpreted as all conformations being equally accessible from all others, with no preference for proposing a move to one state over any other. Thus, (3.3) simplifies to

$$\frac{A(\mathbf{q} \to \mathbf{q}')}{A(\mathbf{q}' \to \mathbf{q})} = \frac{P(\mathbf{q}')}{P(\mathbf{q})} \tag{3.4}$$

The probability of a state is given by a simple Boltzmann weighting, and so has closed form

$$\frac{P(\mathbf{q}')}{P(\mathbf{q})} = \frac{\frac{1}{Z}\exp\left[-\beta E(\mathbf{q}')\right]}{\frac{1}{Z}\exp\left[-\beta E(\mathbf{q})\right]} = \exp\left[\beta \Delta E_{qq'}\right]$$

where $Z$ is the partition function, $\beta$ is the thermodynamic beta and $\Delta E_{qq'} = E(\mathbf{q}) - E(\mathbf{q}')$. We note that the final form makes no mention of the partition function, and so its calculation is thankfully not required. To satisfy this ratio of probabilities, we utilise the Metropolis choice of the acceptance probability

$$A(\mathbf{q} \to \mathbf{q}') = \min\left(1, \frac{P(\mathbf{q}')}{P(\mathbf{q})}\right) \tag{3.5}$$

To implement this scheme, we define the acceptance array, $A_{ij}$, whose elements correspond to (3.5) for the $N_{\mathrm{seed}}$ conformations which are available for sampling during the course of our PR-PES dynamics. Upon the stochastic choice of a proposed conformation for sampling about, a random number is selected from the uniform distribution, $X \sim U(0, 1)$. If the corresponding element of $A_{ij} \geq X$, then the proposed conformation is accepted. The equality ensures that if $A_{ij}$ is assigned its maximum possible value, 1, then it is accepted unconditionally. Otherwise, an alternative proposed conformation is selected and the process reiterated until this condition is satisfied.
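The acceptance array and the selection loop described above might be sketched as follows. The function names, the seed energies and the value of $\beta$ are hypothetical; the loop follows the description in the text, re-proposing until a conformation is accepted, and is not claimed to reproduce Tyche's actual code.

```python
import math
import random


def acceptance_matrix(energies, beta):
    """A[i][j] = min(1, exp(-beta * (E_j - E_i))), following (3.5)."""
    n = len(energies)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            delta = energies[j] - energies[i]
            # Downhill or level moves are always accepted; this also avoids overflow.
            row.append(1.0 if delta <= 0.0 else math.exp(-beta * delta))
        matrix.append(row)
    return matrix


def next_seed(current, acceptance, rng):
    """Propose seeds uniformly until one satisfies A[current][proposal] >= X."""
    n = len(acceptance)
    while True:
        proposal = rng.randrange(n)
        if acceptance[current][proposal] >= rng.random():
            return proposal


# Two hypothetical seed conformations with energies in units where beta = 1.
energies = [0.0, 1.0]
A = acceptance_matrix(energies, beta=1.0)

rng = random.Random(0)  # seeded so that the walk is reproducible
state = 0
visited = []
for _ in range(100):
    state = next_seed(state, A, rng)
    visited.append(state)
```

Here `A[0][1]` equals $e^{-1}$ and `A[1][0]` equals 1, so the pair satisfies the ratio required by (3.4). Because `rng.random()` never returns 1, an element equal to 1 is always accepted, as the text requires.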

3.2 Normal Mode Calculation

Our primary aim here is to establish the six invariant degrees of freedom of our conformational state, $\mathbf{q}_n$. To do this, we require the $n_{\mathrm{vib}} \times n_{\mathrm{tot}}$ transformation matrix, $\mathbf{D}$, which satisfies

$$\mathbf{s} = \mathbf{D}\mathbf{q} \tag{3.6}$$

Of course, $\mathbf{s}$ is then given as an $n_{\mathrm{vib}}$-dimensional state vector expressed in a normal coordinate basis in the Eckart frame of reference, i.e. where the global translational and rotational degrees of freedom remain invariant.

In a Cartesian basis, the three vectors corresponding to the invariant translational degrees of freedom are given trivially by

$$D_{i1} = \sqrt{m_a}, \quad i = 3a - 2; \qquad D_{i2} = \sqrt{m_a}, \quad i = 3a - 1; \qquad D_{i3} = \sqrt{m_a}, \quad i = 3a$$

where $a = [1, n_{\mathrm{vib}}] \cap \mathbb{N}$.
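A minimal sketch of the three translational vectors is given below. It assumes the usual convention in which the index $a$ runs over atoms, with all other elements of each vector equal to zero, and it uses zero-based indexing; the masses are illustrative and the function name is hypothetical.

```python
import math


def translation_vectors(masses):
    """Build the three mass-weighted translation vectors of length 3N.

    Vector `axis` holds sqrt(m_a) in the slot of coordinate `axis` of atom a
    (zero-based index 3a + axis) and zero everywhere else.
    """
    n = 3 * len(masses)
    vectors = []
    for axis in range(3):
        vec = [0.0] * n
        for a, m in enumerate(masses):
            vec[3 * a + axis] = math.sqrt(m)
        vectors.append(vec)
    return vectors


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Hypothetical three-atom system with masses 16, 1 and 1.
masses = [16.0, 1.0, 1.0]
d1, d2, d3 = translation_vectors(masses)

overlap_12 = dot(d1, d2)   # different axes never share a non-zero slot
norm_squared = dot(d1, d1)  # equals the total mass of the system
```

The three vectors are mutually orthogonal and each has a squared norm equal to the total mass, so dividing by the square root of the total mass yields an orthonormal set.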

3.3 Atomic Local Frame

3.4 Dynamical Evolution

3.5 Similarity Metric

To ensure that the resultant samples are as conformationally distinct as possible, Tyche implements a filtering metric. In this, once a sample has been produced, it is compared to all previously outputted samples. Some criterion must subsequently be satisfied to permit output to the sample set. This strategy is somewhat embryonic, and so the metric currently implemented is perhaps crude. For our purposes, however, we find that it ensures the sample set is sufficiently diverse.

The metric, $M$, implemented is a sum of absolute coordinate differences (a Manhattan-type distance), in that over the degrees of freedom of the $n$th state vector $\mathbf{q}^n$, we evaluate the following for the $m$th previously outputted sample

$$M_{nm} = \sum_{i=1}^{3N} \left|q_i^n - q_i^m\right|$$

By defining some reference metric, $\bar{M}$, we introduce the condition that if $M_{nm} \geq \bar{M} \ \forall \ m \in 1, \ldots, n - 1$, the proposed sample, $\mathbf{q}^n$, is considered sufficiently distinct and permitted to be outputted. Otherwise, the proposed sample is rejected.

The only parameterisation required for this methodology is the definition of a value of $\bar{M}$. Somewhat arbitrarily, we designate it by specifying that on average, each degree of freedom must possess a difference of 0.1 Å.
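The filtering step might be sketched as follows. The function names and sample values are hypothetical, and the reference value is taken as 0.1 per degree of freedom multiplied by the number of degrees of freedom, which is one reading of the "on average" prescription above.

```python
def metric(q_new, q_old):
    """Sum of absolute coordinate differences between two state vectors."""
    return sum(abs(a - b) for a, b in zip(q_new, q_old))


def reference_metric(n_dof, per_dof=0.1):
    """Threshold corresponding to an average difference of `per_dof` per coordinate."""
    return n_dof * per_dof


def is_distinct(q_new, accepted, threshold):
    """True if q_new differs from every previously accepted sample by >= threshold."""
    return all(metric(q_new, q_old) >= threshold for q_old in accepted)


def filter_samples(samples, threshold):
    """Keep each sample only if it is distinct from all samples kept so far."""
    accepted = []
    for q in samples:
        if is_distinct(q, accepted, threshold):
            accepted.append(q)
    return accepted


# Three hypothetical samples with three degrees of freedom each.
samples = [
    [0.0, 0.0, 0.0],
    [0.05, 0.05, 0.05],  # total difference 0.15 from the first: too similar
    [0.3, 0.2, 0.0],     # total difference 0.5 from the first: distinct
]
kept = filter_samples(samples, reference_metric(n_dof=3))
```

The first sample is always kept since there is nothing to compare it against; the second is rejected and the third is retained. The cost of each check grows linearly with the number of samples already kept.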

3.6 Biased Equipartition

4 User Guide

4.1 Compilation Instructions

4.2 Requisite ab initio Calculations

4.3 Parameter File

4.4 Output
