Sergei Grudinin
Inria/CNRS Grenoble, France
email : sergei.grudinin@inria.fr
1 Theory
1.1 Introduction
1.2 NMA theory
1.3 RTB projection method
1.4 The NOLB method
1.5 Potential function
2 Program options
2.1 Usage
2.2 Main options
2.3 Experimental options
2.4 Currently suppressed options
3 Examples
3.1 Analysis of basic molecular motions
3.2 Comparison with linear NMA
3.3 Another example with a bigger system that contains heteroatoms
3.4 Finding the best transition between two protein conformations
3.5 Analysis of an MD trajectory or NMR ensemble
3.6 Clustering of MD trajectory frames
3.7 Analysis of protein docking poses
3.8 Structural ensembles
3.9 Structural transitions
3.10 Nonlinear structural transitions
3.11 Analysis of modes
4 Related methods
4.1 GUI
4.2 Morphing paths
4.3 SAXS-guided structure optimization
Index
1. Theory
1.1 Introduction
Normal mode analysis (NMA) is an old and well established technique [1] that has recently found
many new applications in the field of structural biology and structural bioinformatics [2]. NMA
uses a quadratic approximation of the potential energy, and thus it produces linear deformations
of the initial structure, which are accurate only for small-amplitude motions. When NMA is applied
to a protein, larger amplitudes can, for example, destroy the secondary structure and break
interatomic bonds. An obvious way to circumvent this problem is to take smaller amplitude
steps and iteratively recompute and diagonalize the Hessian matrix at the updated positions.
This procedure would indeed produce a more realistic deformation of the initial structure thanks
to the non-linearity of the obtained deformation. However, such an approach requires multiple
diagonalization steps, which may be computationally expensive for some applications. Thus,
multiple attempts have been made to introduce non-linear deformations without the need for multiple
diagonalizations. Here we present a new scheme for nonlinear normal mode analysis in the
Cartesian coordinates that extrapolates a motion computed from instantaneous linear and angular
velocities to large amplitudes. The scheme can be considered as a natural evolution of the widely
used rotations-translations of blocks (RTB) method [3, 4].
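The difference between extrapolating along an instantaneous velocity and extrapolating along a finite rotation can be illustrated for a single atom of a rigid block. The Python sketch below is an illustration only and not the NOLB implementation; the atom position `x0`, the angular velocity `omega`, and the helper names are hypothetical.

```python
import numpy as np

def skew(w):
    """Cross-product matrix [w]_x, so that skew(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def linear_step(x, omega, amplitude):
    """Linear extrapolation along the instantaneous velocity omega x x."""
    return x + amplitude * np.cross(omega, x)

def rotation_step(x, omega, amplitude):
    """Finite rotation by the angle amplitude*|omega| about omega (Rodrigues' formula)."""
    angle = amplitude * np.linalg.norm(omega)
    k = skew(omega / np.linalg.norm(omega))
    rot = np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)
    return rot @ x

x0 = np.array([1.0, 0.0, 0.0])     # atom position relative to the rotation centre
omega = np.array([0.0, 0.0, 1.0])  # instantaneous angular velocity (hypothetical)

for a in (0.1, 0.5, 1.0):
    lin = np.linalg.norm(linear_step(x0, omega, a))
    rot = np.linalg.norm(rotation_step(x0, omega, a))
    print(f"amplitude {a}: |linear| = {lin:.3f}, |rotated| = {rot:.3f}")
```

In this toy case the linear step moves the atom away from the rotation centre as the amplitude grows, whereas the finite rotation keeps its distance to the centre fixed, which is why interatomic distances inside a rigid block survive large amplitudes.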
where $M$ is the diagonal mass matrix, and $K$ is the Hessian matrix (or stiffness matrix) of the
potential energy $V$ evaluated at the equilibrium position $q_0$. We then compute the square matrix
of eigenvectors $L$ and the diagonal matrix of eigenvalues $\Lambda$ of the mass-weighted stiffness matrix
$K_w = M^{-1/2} K M^{-1/2}$,
\[
K_w = L \Lambda L^T. \tag{1.2}
\]
Let $\eta \in \mathbb{R}^{3N}$ be the projection of $q$ into the eigenspace of $K_w$ and let us call $(\lambda_i)_{i=1 \ldots 3N}$ the diagonal
values in $\Lambda$. Then, left multiplication of eq. 1.1 by $L^T M^{1/2}$ gives the following system of uncoupled
equations,
\[
\begin{cases}
\eta = L^T M^{1/2} q \\
\ddot{\eta}_i + \lambda_i \eta_i = 0, \quad i = 1 \ldots 3N
\end{cases}
\tag{1.3}
\]
which can be solved by the classical ordinary differential equation (ODE) theory.
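As a small numerical illustration of eqs. (1.2) and (1.3), the following Python sketch diagonalizes the mass-weighted stiffness matrix of a toy system with NumPy. The two-particle spring system, the spring constant, and the function name `normal_modes` are assumptions made for this example only.

```python
import numpy as np

def normal_modes(K, masses):
    """Diagonalize K_w = M^{-1/2} K M^{-1/2}.

    K is a symmetric (d x d) stiffness matrix and `masses` holds one mass per
    coordinate (the diagonal of M). Returns (eigenvalues, eigenvectors L)."""
    inv_sqrt_m = 1.0 / np.sqrt(masses)
    Kw = K * np.outer(inv_sqrt_m, inv_sqrt_m)  # valid because M is diagonal
    lam, L = np.linalg.eigh(Kw)                # ascending eigenvalues, orthonormal columns
    return lam, L

# Toy system: two particles on a line joined by a single spring
k_spring = 2.0
K = k_spring * np.array([[1.0, -1.0],
                         [-1.0, 1.0]])
masses = np.array([1.0, 4.0])

lam, L = normal_modes(K, masses)

q = np.array([0.3, -0.1])          # an arbitrary displacement
eta = L.T @ (np.sqrt(masses) * q)  # eta = L^T M^{1/2} q, as in eq. (1.3)
print("eigenvalues:", lam)
print("modal coordinates:", eta)
```

For this toy system one eigenvalue vanishes (the free translation) and the other equals k(1/m1 + 1/m2); each modal coordinate then oscillates independently with the frequency given by the square root of its eigenvalue.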
Figure 1.1: (A) All-atom representation of a protein. (B) RTB representation of the same protein.
(C) Each rigid block has six degrees of freedom.
The main idea of the RTB projection method is to approximate the system as n rigid blocks.
Figure 1.1.A-B schematically shows the all-atom and the RTB representations of the same system.
The transition from the RTB coordinate system with 6n DOFs to the all-atom coordinate system
with 3N DOFs is performed by an orthogonal projection matrix $P \in \mathbb{R}^{3N \times 6n}$. The conservation laws
of the linear momentum and the angular momentum of a rigid block consisting of $N_b$ atoms written
in mass-weighted coordinates are
\[
\begin{cases}
\sqrt{M_b}\,\dot{\tilde{q}} = \sum_{k=1}^{N_b} \sqrt{m_k}\,\dot{q}_k & \text{for a translation} \\
I^{1/2}\,\dot{\tilde{q}} = \sum_{k=1}^{N_b} \sqrt{m_k}\,(q_k \times \dot{q}_k) & \text{for a rotation}
\end{cases}
\tag{1.4}
\]
where $M_b$ is the total mass of the rigid block, $I$ is the rigid block's inertia tensor, $\tilde{q}$ is the block's
displacement, $m_k$ is the mass of the kth atom of the block, and $q_k$ is the displacement of the kth
atom of the block. The elements constituting $P^T$, the matrix projecting an all-atom motion $q$ into a
motion of rigid block $\tilde{q}$, are then obtained by differentiating (1.4) with respect to $\dot{q}_k$ [5]. This leads
to translation $P_t$ and rotation $P_r$ matrices of size $3N_b \times 3$ each, computed for each of the rigid blocks
and written through their $3 \times 3$ square components indexed by $k$,
\[
\begin{cases}
P_t^k = \sqrt{\dfrac{m_k}{M_b}}\,E_3 & \text{for a translation} \\
P_r^k = -\sqrt{m_k}\,I^{-1/2}\,[r_k - r_{COM}]_\times & \text{for a rotation}
\end{cases}
\tag{1.5}
\]
where $k$ is one of $N_b$ atom indices, $r_k$ is the position of the corresponding atom in the block, and
$r_{COM}$ the position of the block's center of mass (COM). The rigid block's displacement $(\delta, \theta)$
6-vector is then obtained by summing up the displacements in the RTB coordinate frame,
\[
\begin{cases}
\delta = \sum_{k=1}^{N_b} (P_t^k)^T q_k & \text{for a translation} \\
\theta = \sum_{k=1}^{N_b} (P_r^k)^T q_k & \text{for a rotation}
\end{cases}
\tag{1.6}
\]
Having written these equations, we can write the projection matrix $P$ as a diagonal block matrix,
\[
P = \begin{pmatrix}
P_t^1 \; P_r^1 & & \\
 & \ddots & \\
 & & P_t^n \; P_r^n
\end{pmatrix}. \tag{1.7}
\]
The normal modes are then computed by the diagonalization of the RTB-projected mass-weighted
stiffness matrix,
\[
P^T K_w P = \tilde{L} \tilde{\Lambda} \tilde{L}^T, \tag{1.8}
\]
where $\tilde{L}$ is the matrix composed of the RTB normal modes with the corresponding diagonal
eigenvalue matrix $\tilde{\Lambda}$. This equation can be rewritten as
\[
K_w = (P\tilde{L}) \tilde{\Lambda} (P\tilde{L})^T. \tag{1.9}
\]
The all-atom normal modes $L_w$ (in mass-weighted coordinates) are then obtained as a projection of
the RTB normal modes $\tilde{L}$ according to
\[
L_w = P \tilde{L}. \tag{1.10}
\]
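The construction of the projection matrix and the projected diagonalization of eqs. (1.8) and (1.10) can be sketched in Python as follows. This is a minimal illustration and not the NOLB code: the random coordinates, masses, block assignment, and the stand-in stiffness matrix are all hypothetical. The rotational block is assembled here in the column convention $-\sqrt{m_k}\,[r_k - r_{COM}]_\times I^{-1/2}$, which is an assumption of this sketch chosen so that the orthonormality $P^T P = E$ can be checked numerically.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def block_projection(coords, masses):
    """(3*Nb x 6) mass-weighted projection of one rigid block.
    Columns 0-2 describe translations, columns 3-5 rotations."""
    total_mass = masses.sum()
    com = (masses[:, None] * coords).sum(axis=0) / total_mass
    rel = coords - com
    # Inertia tensor I = sum_k m_k (|r_k|^2 E - r_k r_k^T)
    inertia = sum(m * (r @ r * np.eye(3) - np.outer(r, r))
                  for m, r in zip(masses, rel))
    w, v = np.linalg.eigh(inertia)
    inv_sqrt_inertia = v @ np.diag(1.0 / np.sqrt(w)) @ v.T
    P = np.zeros((3 * len(masses), 6))
    for k, (m, r) in enumerate(zip(masses, rel)):
        P[3 * k:3 * k + 3, 0:3] = np.sqrt(m / total_mass) * np.eye(3)
        P[3 * k:3 * k + 3, 3:6] = -np.sqrt(m) * skew(r) @ inv_sqrt_inertia
    return P

def rtb_matrix(coords, masses, block_ids):
    """Assemble the block-diagonal P of size (3N x 6n)."""
    block_ids = np.asarray(block_ids)
    labels = sorted(set(block_ids.tolist()))
    P = np.zeros((3 * len(masses), 6 * len(labels)))
    for b, label in enumerate(labels):
        idx = np.flatnonzero(block_ids == label)
        rows = (3 * idx[:, None] + np.arange(3)).ravel()
        P[rows, 6 * b:6 * b + 6] = block_projection(coords[idx], masses[idx])
    return P

rng = np.random.default_rng(0)
coords = rng.normal(size=(8, 3))          # hypothetical atom positions
masses = rng.uniform(1.0, 16.0, size=8)   # hypothetical atom masses
block_ids = [0, 0, 0, 0, 1, 1, 1, 1]      # two rigid blocks of four atoms

P = rtb_matrix(coords, masses, block_ids)
A = rng.normal(size=(24, 24))
Kw = A @ A.T                              # stand-in for a mass-weighted stiffness matrix
lam_rtb, L_rtb = np.linalg.eigh(P.T @ Kw @ P)  # eq. (1.8)
Lw = P @ L_rtb                                 # eq. (1.10)
print(P.shape, Lw.shape)
```

The projected problem has only 6n = 12 unknowns instead of 3N = 24, and the recovered all-atom modes stay orthonormal because the columns of P are orthonormal.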
where $d_{ij}$ is the distance between the ith and the jth atoms, $d_{ij}^0$ is the reference distance between
these atoms, as found in the original structure, $\gamma$ is the stiffness constant, and $R_c$ is a cutoff distance,
typically between 5 Å and 15 Å. The stiffness matrix corresponding to this potential function is
composed of the following blocks [2, 7, 8],
\[
\begin{cases}
H_{ij} = -\dfrac{\gamma}{(d_{ij}^0)^2}
\begin{pmatrix}
(x_{ij}^0)^2 & x_{ij}^0 y_{ij}^0 & x_{ij}^0 z_{ij}^0 \\
y_{ij}^0 x_{ij}^0 & (y_{ij}^0)^2 & y_{ij}^0 z_{ij}^0 \\
z_{ij}^0 x_{ij}^0 & z_{ij}^0 y_{ij}^0 & (z_{ij}^0)^2
\end{pmatrix}, & i \neq j \\
H_{ii} = -\sum_{j \neq i} H_{ij}
\end{cases}
\tag{1.16}
\]
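The block structure of eq. (1.16) can be assembled directly from the reference coordinates. The following Python sketch assumes the usual harmonic elastic network in which every atom pair closer than the cutoff is joined by a spring of stiffness γ; the random coordinates, the parameter values, and the function name `anm_hessian` are hypothetical and only serve the illustration.

```python
import numpy as np

def anm_hessian(coords, gamma=1.0, cutoff=10.0):
    """Dense (3N x 3N) stiffness matrix built from the blocks of eq. (1.16)."""
    n = len(coords)
    H = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue  # pairs beyond the cutoff do not interact
            block = -gamma / dist2 * np.outer(d, d)   # off-diagonal block H_ij
            H[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            H[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            H[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block  # H_ii = -sum_j H_ij
            H[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return H

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(20, 3))  # hypothetical positions, in Å
H = anm_hessian(coords, gamma=1.0, cutoff=8.0)

eigenvalues = np.linalg.eigvalsh(H)
n_zero = int((np.abs(eigenvalues) < 1e-8).sum())
print("matrix size:", H.shape, "zero modes:", n_zero)
```

A production code would store this matrix in a sparse format; the dense version is used here only to keep the sketch short. For a well-connected, non-collinear set of atoms the matrix is positive semi-definite with six zero eigenvalues that correspond to the rigid-body motions.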
2.1 Usage
Typing the ’--help’ or ’-h’ flag produces the brief and more detailed description of the program
options,
NOLB --help
*******************************************************************
*-----------------------------------------------------------------*
*-----------NOLB : a Non-Linear Rigid Block NMA method------------*
*----------Authors: Alexandre Hoffmann & Sergei Grudinin----------*
*-Copyright (c): Nano-D team, Inria/CNRS Grenoble, France, 2018.--*
*--------------- e-mail: sergei.grudinin@inria.fr ----------------*
*---- http://team.inria.fr/nano-d/software/nolb-normal-modes/ ----*
*-----------------------------------------------------------------*
*******************************************************************
USAGE:
NOLB <pdb filename> <pdb reference file> [-o <output name>] [-n <number
of modes>] [--mode <unsigned integer>] ... [-c <cutoff distance>]
[-s <number of output frames>] [-a <maximum amplitude>] [-p
<potential function>] [-m] [--linear] [--trajectory] [--hetatm]
[--zdock <Zdock transform filename>] [--hex <Hex transform
filename>] [--frames <max number of input trajectory frames>]
[--covScaling <scaling of covalent interactions>] [-dcd <dcd file
name>] [--clust] [--coords <input traj file>] [--weights <input
weights file>] [-r <maximum rmsd>] [--dist <RMSD distribution>]
[--rand] [--nSteps <number of minimization steps>] [-t
<minimization tolerance>] [--blocks <Rigid blocks ids filename>]
[--analyze] [--collectivity] [--nlin] [--nIter <number of
additional Hessian matrix computations>] [-h] [--version] [--log]
Where:
<pdb filename>
(required) Input PDB file, can be a PDB multi-model trajectory file
-m, --minimize
Minimize obtained structures, off by default.
--linear
Additionally compute the linear modes. Off by default.
--trajectory
Save the output trajectory at equidistant points. Off by default.
--hetatm
Read heteroatoms from the input file. False by default.
--clust
Cluster trajectory frames, off by default.
--rand
Random seed for decoy generation. Off by default.
--analyze
Analyze the computed modes. Off by default.
--collectivity
Prints modes collectivity. Off by default.
--nlin
Deterministic nonlinear transitions if the reference structure is
given, off by default.
-h, --help
Displays usage information and exits.
--version
Displays version information and exits.
--log
Displays ChangeLog information and exits.
NOLB ChangeLog:
Version 0.1 from Feb 2016:
Initial release.
Version 0.2 from March 2016:
Added initial support of PDB trajectories.
Version 0.3 from Feb 2017:
Added reading of hetero atoms. Adapted for SAMSON GUI. Multiple tests performed.
Compiled on Linux, Win32 and MacOS.
Version 0.4 from May 2017:
Added automatic tests for convergence. More of noninear PCA support.
Version 0.5 from June 2017:
Added output of linear modes.
Version 0.6 from July 2017:
Added generation of NMA decoys.
Version 0.7 from December 2017:
Added possibility to manually specify rigid blocks.
Version 0.8 from February 2018:
Added support for DCD trajectories. Initial support of MD rapid clustering. Fixed support
of insertion codes when aligning the sequences.
Version 0.9 from March 2018:
Added basic analysis of the motions.
Version 1.0 from April 2018:
2.2 Main options
--linear
Flag to additionally compute the linear modes. Off by default. If computed, these are stored
separately.
--hetatm
Flag to read heteroatoms from the input file. False by default. Should be used for lipids, small
molecules, etc.
--trajectory
Flag to save the output trajectory at equidistant points. Off by default. If this option is disabled,
then the trajectory is written according to the natural harmonic motion of the oscillator.
--rand
Flag to reinitialize the random seed for decoy generation, off by default.
-m, --minimize
Flag to minimize the obtained structures, off by default.
Name of the trajectory file in the DCD format. If it is specified, the first PDB argument must
correspond to the topology of the DCD file.
--clust
This flag controls the clustering of the input trajectories. The user also needs to define the
corresponding RMSD clustering threshold. Off by default.
--analyze
This flag turns on the analysis of the computed normal modes. Currently, only the collectivities are
computed. Off by default.
--collectivity
This flag turns on outputting the collectivities of the modes. Off by default.
--nlin
This flag turns on a deterministic nonlinear transition computation if the reference structure is given.
No additional diagonalizations are required. Off by default.
--nIter
This flag specifies the number of additional computations of the Hessian matrix if the reference
structure is given, 0 by default. If enabled, the recommended value is 5. Will only take effect if the
--nlin flag is on.
-h, --help
Flag to display usage information and exit.
--version
Flag to display version information and exit.
--log
Flag to display the ChangeLog information and exit.
Figure 3.1: Comparison of linear (A, C, E) and non-linear (B, D, F) normal modes computed for a
coiled coil protein (pdb code 2ch7). Three types of motions are shown, bending (A, B), stretching
(C, D), and twisting (E, F). Several snapshots at different deformation amplitudes are superposed on
each other. These are colored according to the values of the overall deformation, as measured by
the RMSD. The colorbars show the RMSD with respect to the initial position. The arrows follow
the trajectories of individual atoms.
3.1 Analysis of basic molecular motions
This will produce the following output (assuming the program is started from a terminal),
*******************************************************************
*-----------------------------------------------------------------*
*-----------NOLB : a Non-Linear Rigid Block NMA method------------*
*----------Authors: Alexandre Hoffmann & Sergei Grudinin----------*
*-Copyright (c): Nano-D team, Inria/CNRS Grenoble, France, 2017.--*
*--------------- e-mail: sergei.grudinin@inria.fr ----------------*
*---- http://team.inria.fr/nano-d/software/nolb-normal-modes/ ----*
*-----------------------------------------------------------------*
*******************************************************************
=====================Reading input PDB file======================
Input PDB file................................................... : 2ch7.pdb
Number of chains read............................................ : 3
Number of atoms read............................................. : 4630
Found............................................................ : 4641 covalent bonds
................................................................. : 6264 angles
======================Constructing Hessian=======================
Number of disconnected atoms..................................... : 0
Number of interactions........................................... : 631042
Memory required for the Hessian matrix........................... : 129.991 Mb
============================Doing NMA============================
Memory required for the RTB projection matrix.................... : 1.27167 Mb
Reduced Hessian size : .......................................... : 3702 x 3702
Number of computed modes......................................... : 10
Null-space size.................................................. : 1
Lowest mode frequency............................................ : 0.000535409
Highest mode frequency........................................... : 0.00412211
Frequencies :
[ 1]................................................. : 0.000535409
[ 2]................................................. : 0.000542863
[ 3]................................................. : 0.00135327
[ 4]................................................. : 0.00153196
[ 5]................................................. : 0.00167521
[ 6]................................................. : 0.00261288
[ 7]................................................. : 0.00278308
[ 8]................................................. : 0.00347724
[ 9]................................................. : 0.003539
[10]................................................. : 0.00412211
====================Writing output PDB files=====================
Using output mask................................................ : 2ch7
Using output mask for nonlinear modes............................ : 2ch7_nlb_
=================================================================
=========================== Timing : ============================
=================================================================
Parsing arguments................................................ : 0.001036 s
Reading input PDB file........................................... : 0.015923 s
Constructing Hessian............................................. : 0.549772 s
Doing NMA........................................................ : 0.797914 s
Writing output PDB files......................................... : 0.775731 s
.................................................................
Total time : .................................................... : 2.14038 s
=================================================================
The first section lists the structural parameters: the number of atoms and chains, and the number of
bonds and angles in the structure. The last two can be used during energy relaxation, which is
currently disabled. Water molecules are ignored and not listed; heteroatoms can be read with the
additional flag ’--hetatm’:
=====================Reading input PDB file======================
Input PDB file................................................... : 2ch7.pdb
Number of chains read............................................ : 3
Number of atoms read............................................. : 4630
Found............................................................ : 4641 covalent bonds
................................................................. : 6264 angles
The second section lists the parameters of the Hessian matrix. These are the number of interacting
atom pairs, which can be controlled with the ’--cutoff’ flag, the number of disconnected atoms,
which are kept static, and the total memory required for the Hessian:
======================Constructing Hessian=======================
Number of disconnected atoms..................................... : 0
Number of interactions........................................... : 631042
Memory required for the Hessian matrix........................... : 129.991 Mb
The next section lists the parameters of the RTB basis and the diagonalization procedure. It provides
the size of the projection matrix, the size of the RTB Hessian, the size of the detected null
space, the number of modes, and the corresponding frequencies. The frequencies are the square roots
of the corresponding eigenvalues. The null space is usually not detected fully; it should have a size
of 5 or 6 for non-symmetric systems, and it is skipped in the further analysis:
============================Doing NMA============================
Memory required for the RTB projection matrix.................... : 1.27167 Mb
Reduced Hessian size : .......................................... : 3702 x 3702
Number of computed modes......................................... : 10
Null-space size.................................................. : 1
Lowest mode frequency............................................ : 0.000535409
Highest mode frequency........................................... : 0.00412211
Frequencies :
[ 1]................................................. : 0.000535409
[ 2]................................................. : 0.000542863
[ 3]................................................. : 0.00135327
[ 4]................................................. : 0.00153196
[ 5]................................................. : 0.00167521
[ 6]................................................. : 0.00261288
[ 7]................................................. : 0.00278308
[ 8]................................................. : 0.00347724
[ 9]................................................. : 0.003539
[10]................................................. : 0.00412211
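Since the printed frequencies are the square roots of the eigenvalues, the eigenvalues can be recovered from the program output by squaring. The short Python sketch below parses the 'Frequencies' lines in the format shown above; the regular expression and the function name `parse_frequencies` are assumptions of this example and are not part of NOLB.

```python
import re

# Matches lines such as "[ 1]........ : 0.000535409"
FREQ_LINE = re.compile(r"^\[\s*(\d+)\]\.+\s*:\s*([0-9.eE+-]+)\s*$")

def parse_frequencies(text):
    """Return {mode id: frequency} from the 'Frequencies :' part of the output."""
    freqs = {}
    for line in text.splitlines():
        match = FREQ_LINE.match(line.strip())
        if match:
            freqs[int(match.group(1))] = float(match.group(2))
    return freqs

sample = """\
Frequencies :
[ 1]................................................. : 0.000535409
[ 2]................................................. : 0.000542863
[10]................................................. : 0.00412211
"""

freqs = parse_frequencies(sample)
eigenvalues = {mode: f ** 2 for mode, f in freqs.items()}
for mode in sorted(freqs):
    print(f"mode {mode}: frequency {freqs[mode]:.6g}, eigenvalue {eigenvalues[mode]:.6g}")
```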
The output PDB trajectory files have the following names: ’2ch7_nlb_1.pdb’, ’2ch7_nlb_2.pdb’, etc.
The number in the name refers to the normal mode id, starting from 1. The output mask can be
changed with the ’-o newMask’ flag. Finally, the last section lists the timings of the program, split
into individual contributions:
=========================== Timing : ============================
=================================================================
Parsing arguments................................................ : 0.001036 s
Reading input PDB file........................................... : 0.015923 s
Constructing Hessian............................................. : 0.549772 s
Doing NMA........................................................ : 0.797914 s
Writing output PDB files......................................... : 0.775731 s
.................................................................
Total time : .................................................... : 2.14038 s
=================================================================
Here, the most time-consuming operation is typically the diagonalization step. Writing trajectories
can also take a long time, depending on the number of requested modes and frames in the trajectories.
Figure 3.1 shows some of the computed modes.
The output differs only in the part that lists the output files,
====================Writing output PDB files=====================
Using output mask................................................ : 2ch7
Using output mask for linear modes............................... : 2ch7_linear_
Using output mask for nonlinear modes............................ : 2ch7_nlb_
Now, we have also computed the linear NMA trajectories stored in ’2ch7_linear_1.pdb’, ’2ch7_linear_2.pdb’,
etc. Figure 3.1 shows the visual difference between the computed linear and non-linear modes.
This will produce the following output, which we will cut for brevity,
=====================Reading input PDB file======================
Input PDB file................................................... : 5b5e.pdb
Number of chains read............................................ : 39
Number of atoms read............................................. : 40908
Found............................................................ : 42093 covalent bonds
................................................................. : 57363 angles
======================Constructing Hessian=======================
Number of disconnected atoms..................................... : 0
Number of interactions........................................... : 6424422
Memory required for the Hessian matrix........................... : 1323.39 Mb
============================Doing NMA============================
Lowest mode frequency............................................ : 0.00156711
Highest mode frequency........................................... : 0.00784686
Frequencies :
[ 1]................................................. : 0.00156711
[ 2]................................................. : 0.00175741
=========================== Timing : ============================
=================================================================
Parsing arguments................................................ : 0.000678 s
Reading input PDB file........................................... : 0.099499 s
Constructing Hessian............................................. : 7.44606 s
Doing NMA........................................................ : 29.754 s
Writing output PDB files......................................... : 28.5004 s
.................................................................
Please note that the PDB output now takes a significant part of the total execution time. This system
contains about 10,000 of small molecules, mostly lipids. In order to include these into the analysis,
we repeat the computations with the ’--hetatm’ flag,
Example 4 This example is identical to the previous one, with all the heteroatoms (excluding
water) included in the NMA calculations,
NOLB 5b5e.pdb -n 50 --hetatm
Please note the changed Hessian size and the spectrum of the normal modes. Visual inspection
of the first modes demonstrates that these are still the same. However, their order might have
changed because of the changed spectrum.
Then, it outputs the trajectory from the starting conformation to the goal using the optimal combination
of the first 4 modes. It also prints the reduction in the overall RMSD for each mode,
====================Writing output PDB files=====================
Using output mask................................................ : 1s6p
Using output mask for nonlinear modes............................ : 1s6p_nlb_
RMSD reduction : ................................................ :
1s6p.pdb 3.621983 3.169015 2.894955 2.235346 2.196610
Initial RMSD..................................................... : 3.62198
Final RMSD....................................................... : 2.19661
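The 'RMSD reduction' line can be converted into relative improvements with a few lines of Python. The sketch below assumes that the first number is the initial RMSD and that each following number is the RMSD remaining after one more mode has been added, which agrees with the initial and final values printed above; the function name `rmsd_reduction` is hypothetical.

```python
def rmsd_reduction(line):
    """Parse '<name> r0 r1 ... rn' and return (values, fractional reductions).

    The k-th reduction is 1 - r_k / r_0, the fraction of the initial RMSD
    removed once the first k modes are used."""
    values = [float(token) for token in line.split()[1:]]
    initial = values[0]
    return values, [1.0 - r / initial for r in values[1:]]

values, reduction = rmsd_reduction(
    "1s6p.pdb 3.621983 3.169015 2.894955 2.235346 2.196610")

for n_modes, fraction in enumerate(reduction, start=1):
    print(f"first {n_modes} mode(s): {100 * fraction:.1f}% of the initial RMSD removed")
```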
Finally, it outputs the timings. We can see that overall it only takes about two seconds,
=========================== Timing : ============================
=================================================================
Parsing arguments................................................ : 0.000248 s
Reading input PDB file........................................... : 0.015575 s
Reading input PDB file........................................... : 0.015943 s
Aligning input PDB file.......................................... : 0.031428 s
Superposeing input PDB file...................................... : 0.000262 s
Constructing Hessian............................................. : 0.948881 s
Doing NMA........................................................ : 1.16697 s
Writing output PDB files......................................... : 0.221236 s
.................................................................
Total time : .................................................... : 2.40055 s
=================================================================
The obtained result can be visualized with popular software packages, for example, as
pymol 1s6p_nlb.pdb 2hmi.pdb
You may see that the trajectory analysis is very fast; the only noticeable time here is actually spent on the
output of the computed motions.
Example 7 In the following example, we will analyze an MD trajectory of a coiled-coil domain
of a chemoreceptor saved in the DCD format. The length of the simulation was about 1 µs, and the
frames were saved every 1 ns. To do so, we type
NOLB traj.pdb --dcd traj.dcd
Again, the trajectory analysis is very fast, and the principal components of the 1 µs simulation have
been computed in just a few seconds.
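For readers unfamiliar with principal component analysis of a trajectory, the following Python sketch shows an ordinary linear PCA of superposed frames with NumPy. It is not the algorithm implemented in NOLB; the synthetic trajectory, the array layout, and the function name `trajectory_pca` are assumptions made for this illustration.

```python
import numpy as np

def trajectory_pca(frames, n_components=3):
    """Linear PCA of a trajectory.

    frames: array of shape (n_frames, n_atoms, 3), assumed already superposed.
    Returns (mean structure, variances, components), where each component is
    a unit vector of length 3 * n_atoms."""
    n_frames = frames.shape[0]
    X = frames.reshape(n_frames, -1)
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = S ** 2 / (n_frames - 1)
    return mean.reshape(-1, 3), variances[:n_components], Vt[:n_components]

# Synthetic trajectory: one collective motion plus small noise
rng = np.random.default_rng(1)
n_frames, n_atoms = 200, 10
base = rng.normal(size=(n_atoms, 3))
direction = rng.normal(size=3 * n_atoms)
direction /= np.linalg.norm(direction)
amplitudes = 2.0 * np.sin(np.linspace(0.0, 6.0 * np.pi, n_frames))
frames = (base.ravel() + np.outer(amplitudes, direction)
          + 0.01 * rng.normal(size=(n_frames, 3 * n_atoms))).reshape(n_frames, n_atoms, 3)

mean, variances, components = trajectory_pca(frames)
print("variances:", variances)
print("overlap of PC1 with the true motion:", abs(components[0] @ direction))
```

In this synthetic case the first principal component recovers the single collective motion that was used to generate the frames, and the remaining components only carry noise.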
3.6 Clustering of MD trajectory frames
In the next example, we will provide a fast clustering method based on the RapidRMSD library
[10]. First, we will project the MD trajectory on the principal components, and then we will
use a fast method to compute RMSDs between the flexible conformations, if these are obtained
with a few collective motions. We will use a lysozyme MD trajectory (1960 atoms) with 10,000
frames.
Example 8 To cluster the frames of an MD trajectory with an RMSD threshold of 0.5 Å, type the
following,
NOLB lyz.pdb -dcd lyz.dcd --clust --rmsd 0.5
You may see once again that the trajectory analysis, including the subsequent clustering of an MD
trajectory, is very fast.
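To make the role of the RMSD threshold concrete, the Python sketch below implements a naive greedy (leader) clustering: each frame joins the first cluster whose representative lies within the threshold, and otherwise starts a new cluster. This is a deliberately simple stand-in and not the RapidRMSD-based algorithm used by NOLB; it assumes that the frames are already superposed, and the synthetic data and function names are hypothetical.

```python
import numpy as np

def rmsd(a, b):
    """RMSD between two (N, 3) coordinate sets without any superposition."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

def leader_clustering(frames, threshold):
    """Greedy clustering: returns (labels, cluster representatives)."""
    representatives = []
    labels = []
    for frame in frames:
        for cluster_id, rep in enumerate(representatives):
            if rmsd(frame, rep) <= threshold:
                labels.append(cluster_id)
                break
        else:
            # No representative is close enough: open a new cluster
            representatives.append(frame)
            labels.append(len(representatives) - 1)
    return labels, representatives

# Synthetic frames: two conformations 5 Å apart, each with small noise
rng = np.random.default_rng(3)
base = rng.normal(size=(50, 3))
shifted = base + np.array([5.0, 0.0, 0.0])
frames = [base + 0.05 * rng.normal(size=base.shape) for _ in range(10)]
frames += [shifted + 0.05 * rng.normal(size=base.shape) for _ in range(10)]

labels, representatives = leader_clustering(frames, threshold=0.5)
print("number of clusters:", len(representatives))
print("labels:", labels)
```

The cost of this naive version grows with the number of frames times the number of clusters times the number of atoms, which is the kind of cost that a fast RMSD evaluation in a low-dimensional space of collective motions is meant to avoid.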
Example 9 To visualize the principal components of the computed docking poses type
We should mention that this is just an example of how to use the NOLB method for a simple PCA
analysis, which probably cannot be used for practical applications.
Figure 3.2: Illustration of 100 decoys of the coiled-coil protein 2ch7 with an RMSD of 10 Å
from the starting structure.
3.8 Structural ensembles
Example 10 In the first example we will generate 100 decoys of a coiled-coil system at 10
Å from the starting structure.
NOLB 2ch7.pdb -s 100 --rmsd 10
This creates a ’2ch7_nlb_decoys.pdb’ pdb trajectory file with 100 models with the following
output,
===================Writing output Decoy files====================
Number of files.................................................. : 100
Output RMSD...................................................... : 10 A
Using outputfilename............................................. : 2ch7_nlb_decoys.pdb
=================================================================
=========================== Timing : ============================
=================================================================
Parsing arguments................................................ : 0.000178 s
Reading input PDB file........................................... : 0.006173 s
Constructing Hessian............................................. : 0.0752 s
Doing NMA........................................................ : 0.605818 s
Writing output Decoy files....................................... : 0.848708 s
.................................................................
Total time : .................................................... : 1.53608 s
=================================================================
Example 11 In the second example we will generate 100 decoys of the same system with
RMSD distributed linearly from 0 to 10 Å compared to the starting structure. We will also
minimize the obtained decoys.
NOLB 2ch7.pdb -s 100 --rmsd 10 --dist 1 -m
This creates a ’2ch7_nlb_decoys.pdb’ pdb trajectory file with 100 models with the following
output,
===================Writing output Decoy files====================
Number of files.................................................. : 100
Output RMSD...................................................... : 10 A
Using outputfilename............................................. : 2ch7_nlb_decoys.pdb
Current energy................................................... : 0.272494
N minimization steps............................................. : 0
Current Gradient norm............................................ : 0.0210788
Displacement .................................................... : 0.00167775
RMSD after minimization.......................................... : 2.45655
---
Current energy................................................... : 6.27796
N minimization steps............................................. : 1
Current Gradient norm............................................ : 0.0738622
Displacement .................................................... : 0.0135281
RMSD after minimization.......................................... : 7.13781
=================================================================
=========================== Timing : ============================
=================================================================
Parsing arguments................................................ : 0.000224 s
Reading input PDB file........................................... : 0.009928 s
Constructing Hessian............................................. : 0.069928 s
Doing NMA........................................................ : 0.603946 s
Writing output Decoy files....................................... : 2.1154 s
.................................................................
Total time : .................................................... : 2.79943 s
=================================================================
Figure 3.3: Transitions between the unbound (u) and bound (b) states of proteins from the Protein
Docking Benchmark v5 [12]. The top x-axis shows Cα RMSD between the two states. The bottom
x-axis lists the corresponding PDB codes of the complexes. The left plot shows the receptors, and
the right plot shows the ligands, as labelled by the authors of the benchmark. Only structures with
u-b RMSD ≥ 2 Å are shown. The y-axis shows the relative u-b transition that can be predicted
using the optimal linear combination of some number of normal modes. Results for the range
between 1 and 20 are shown in different colors (see the colorbar at the right).
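The 'optimal linear combination' of normal modes mentioned in the caption can be illustrated in the linear approximation, where it reduces to a least-squares projection of the displacement between the two states onto the modes. The Python sketch below assumes superposed structures, unit masses, and orthonormal modes; it does not reproduce the nonlinear transitions of NOLB, and all data and names in it are hypothetical.

```python
import numpy as np

def rmsd_after_modes(start, goal, modes):
    """RMSD left after the best linear combination of the first k modes.

    start, goal: (N, 3) superposed coordinates.
    modes: (3N, m) matrix with orthonormal columns.
    Returns an array of length m + 1, for k = 0 ... m."""
    diff = (goal - start).ravel()
    n_atoms = start.shape[0]
    coeffs = modes.T @ diff  # least-squares amplitudes for orthonormal modes
    residual_sq = diff @ diff - np.cumsum(coeffs ** 2)
    residual_sq = np.concatenate(([diff @ diff], np.clip(residual_sq, 0.0, None)))
    return np.sqrt(residual_sq / n_atoms)

rng = np.random.default_rng(2)
n_atoms = 12
start = rng.normal(size=(n_atoms, 3))
modes, _ = np.linalg.qr(rng.normal(size=(3 * n_atoms, 5)))  # 5 orthonormal modes
# Goal structure reachable with the first three modes only
goal = start + (modes[:, :3] @ np.array([2.0, -1.0, 0.5])).reshape(n_atoms, 3)

curve = rmsd_after_modes(start, goal, modes)
relative = 1.0 - curve / curve[0]  # relative transition, as on the y-axis
print("remaining RMSD:", np.round(curve, 4))
print("relative transition:", np.round(relative, 4))
```

The relative transition grows monotonically with the number of modes and reaches 1 once the displacement is fully contained in the span of the modes used.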
Example 12 In the current example we will produce a transition between the unbound (u) and
the bound (b) forms of the ligand from the 3l89 complex.
NOLB 3L89_l_u.pdb 3L89_l_b.pdb
This creates a ’3L89_l_u_nlb.pdb’ pdb trajectory file for the best transition between the states, and
also the following output
=====================Reading input PDB file======================
Input PDB file................................................... : 3L89_l_u.pdb
Number of chains read............................................ : 1
Number of atoms read............................................. : 1018
=====================Reading input PDB file======================
Input PDB file................................................... : 3L89_l_b.pdb
Number of chains read............................................ : 1
Number of atoms read............................................. : 987
=====================Aligning input PDB file=====================
Total number of gaps ............................................ : 0
Alignement....................................................... :
CEEPPTFEAMELIGKPKPYYEIGERVDYKCKKGYFYIPPLATHTICDRNHTWLPVSDDAC
CEEPPTFEAMELIGKPKPYYEIGERVDYKCKKGYFYIPPLATHTICDRNHTWLPVSDDAC
************************************************************
YRETCPYIRDPLNGQAVPANGTYEFGYQMHFICNEGYYLIGEEILYCELKGSVAIWSGKP
YRETCPYIRDPLNGQAVPANGTYEFGYQMHFICNEGYYLIGEEILYCELKGSVAIWSGKP
************************************************************
PICEKV
PICEKV
******
Please note that we have found a 73% reduction in RMSD when the best combination of the
first ten modes is used. This analysis can be performed on the whole benchmark, for example with
the following bash script:
for f in b5/*_r_u.pdb; do NOLB $f ${f:0:${#f}-5}b.pdb -n 20 >> results-R-20modes.txt; done
for f in b5/*_l_u.pdb; do NOLB $f ${f:0:${#f}-5}b.pdb -n 20 >> results-L-20modes.txt; done
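The expansion `${f:0:${#f}-5}b.pdb` in the loops above drops the last five characters of the unbound file name ("u.pdb") and appends "b.pdb". The following short sketch mirrors that substitution in Python; the helper name `bound_name` is hypothetical, and the example paths only follow the naming pattern used in the loops.

```python
def bound_name(unbound_path):
    """Mimic the bash expansion ${f:0:${#f}-5}b.pdb."""
    # Drop the trailing "u.pdb" (five characters) and append "b.pdb".
    return unbound_path[:-5] + "b.pdb"


# Example paths following the b5/*_r_u.pdb and b5/*_l_u.pdb patterns.
pairs = [(f, bound_name(f)) for f in ["b5/3L89_r_u.pdb", "b5/3L89_l_u.pdb"]]
for unbound, bound in pairs:
    print(unbound, "->", bound)
```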
The results can then be presented as a plot (similar to the one in Figure 3.3) using the following
Python post-processing script:
python benchmark.py results-R-20modes.txt
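# benchmark.py: plot the relative RMSD reduction per mode from NOLB output
import collections
import sys

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

inputFile = sys.argv[-1]
with open(inputFile) as f:
    lines = f.readlines()

# --- Parse the concatenated NOLB outputs ---
inputfiles = []
rmsds = []
frequencies = []
globalData = {}
startRMSD = False
for line in lines:
    words = line.split()
    if len(words) == 0:
        continue
    # A row of asterisks closes the record of one NOLB run.
    if words[0] == "*******************************************************************":
        if len(rmsds):
            globalData[rmsds[0]] = [inputfiles, rmsds]
        inputfiles = []
        frequencies = []
        rmsds = []
    if words[0] == "Input":
        inputfiles.append(words[4].split('/')[-1])
    if words[0][0] == "[":
        frequencies.append(float(words[-1]))
    # The line that follows "RMSD reduction" holds the RMSD values.
    if startRMSD:
        startRMSD = False
        for w in words:
            rmsds.append(float(w))
    if words[0] == "RMSD":
        startRMSD = True

# --- Collect the per-mode relative RMSD reductions ---
globalData = collections.OrderedDict(sorted(globalData.items()))
cutoff = 2.0  # lower bound on the initial u-b RMSD, in Å
nModes = 20
Quality = np.zeros(nModes)
QualityAbs = np.zeros(nModes)
lenQ = 0
columns = []
labels = []
rows = []
expectedLength = len(list(globalData.values())[0][1])
print("expectedlength = ", expectedLength)
for dat in globalData:
    inputfiles, rmsds = globalData[dat]
    initRMSD = rmsds[0]
    if initRMSD < cutoff or initRMSD > 5:
        continue
    lenQ += 1
    # Assumption: the first four characters of the file name are the PDB code.
    columns.append(inputfiles[0][:4] if inputfiles else '')
    labels.append('%1.1f' % initRMSD)
    for idx, r in enumerate(rmsds[1:nModes+1]):
        Quality[idx] += (initRMSD - r) / initRMSD
        QualityAbs[idx] += (initRMSD - r)
    # Pad short records so that every structure contributes to every row.
    while len(rmsds) < expectedLength:
        rmsds.append(rmsds[-1])
    previousRmsd = initRMSD  # was undefined in the original listing
    for idx, r in enumerate(rmsds[1:nModes+1]):
        reduction = previousRmsd - r
        previousRmsd = r
        relativeReduction = reduction / initRMSD
        if len(rows) < idx + 1:
            rows.append([])
        rows[idx].append(relativeReduction)
print(columns)
print(labels)

# --- Stacked bar plot, one color per mode ---
# The plot set-up below (index, bar_width, y_offset, axes) is not part of the
# original listing and is an assumed reconstruction.
fig, ax1 = plt.subplots()  # bottom x-axis: PDB codes
index = np.arange(len(columns))
bar_width = 0.8
y_offset = np.zeros(len(columns))
lenRows = len(rows)
colormap = plt.cm.tab20b  # replaces Vega20b, removed from newer matplotlib
colors = colormap(np.linspace(0, 1, lenRows))
for idx, row in enumerate(rows):
    ax1.bar(index, row, bar_width, bottom=y_offset, color=colors[idx])
    y_offset = y_offset + row
ax1.set_xticks(index)
ax1.set_xticklabels(columns, rotation=90)
scalarmappable = ScalarMappable(norm=Normalize(1, nModes), cmap=colormap)
scalarmappable.set_array([])
plt.colorbar(scalarmappable, ax=ax1, ticks=range(1, nModes+1))
# The top x-axis is created after the colorbar so that both axes share one position.
ax2 = ax1.twiny()  # top x-axis: initial RMSD
ax2.set_xticks(index)
ax2.set_xticklabels(labels, rotation=90)
ax2.set_xlim(ax1.get_xlim())
plt.show()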
Example 13 In this example we produce a nonlinear transition between the unbound (u) and
the bound (b) forms of the ligand from the 3l89 complex.
NOLB 3L89_l_u.pdb 3L89_l_b.pdb -n 7 --nlin
This creates a PDB trajectory file ’3L89_l_u_nlb.pdb’ for the best nonlinear transition between the
states, and prints the following output (cut for brevity):
=========================RMSD statistics=========================
RMSD reduction : ................................................ :
4.445432 2.258740 2.177358 1.470716 1.412096 1.411807 1.337768 1.337624
Amplitudes :
454.891 -76.1758 197.929 -39.3717 -7.06865 83.5584 3.02367
RMSD reduction : ................................................ :
4.041845 2.112556 2.041760 1.442365 1.394135 1.393912 1.333585 1.333479
RMSD reduction : ................................................ :
3.680333 1.982142 1.920188 1.412448 1.372653 1.372480 1.323201 1.323135
...
...
RMSD reduction : ................................................ :
1.192290 1.189762 1.189452 1.189097 1.189088 1.189087 1.188989 1.188918
RMSD reduction : ................................................ :
1.192304 1.190237 1.189976 1.189690 1.189683 1.189683 1.189598 1.189536
Initial RMSD..................................................... : 4.44543
Final RMSD....................................................... : 1.18954
Reduction in RMSD by............................................. : 0.732414
The values in each row of the RMSD reduction are analytical estimates for the linear
case, whereas the first value of each row (the first column) corresponds to the nonlinear case. We see that the final nonlinear
RMSD to the target is 1.19 Å, whereas the final linear RMSD to the target is 1.34 Å.
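As a quick arithmetic check, the reported "Reduction in RMSD by" value can be reproduced from the initial and final RMSD lines of the log, and compared with the linear estimate taken from the last value of the first row. This is a minimal sketch; the helper name is hypothetical and the numbers are copied from the output above.

```python
def relative_reduction(initial_rmsd, final_rmsd):
    """Fraction of the initial RMSD removed by the transition."""
    return (initial_rmsd - final_rmsd) / initial_rmsd


# Nonlinear case: "Initial RMSD" and "Final RMSD" lines of the log.
nonlinear = relative_reduction(4.44543, 1.18954)
# Linear case: first and last values of the first "RMSD reduction" row.
linear = relative_reduction(4.445432, 1.337624)

print(f"nonlinear: {nonlinear:.4f}, linear: {linear:.4f}")
```

The nonlinear value agrees with the 0.732414 reported in the log up to the rounding of the inputs, and the linear estimate is lower, at roughly 0.70.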
where the sum runs over all the N atoms in the molecule, and
with the normalization factor α taken such that $\sum_i q_{i,n}^2 = 1$. $N\kappa$ gives an effective number of
nonzero eigenvector components $q_{i,n}^2$. Thus, $\kappa$ is confined to the interval $[N^{-1}, 1]$. If $\kappa = 1$, then
the corresponding mode is maximally collective and all its amplitudes $q_{i,n}^2$ are identical, which
happens for rigid-body motion, for example. In the limit of extremely local motion, where a mode
affects a single atom only, $\kappa$ is minimal and equal to $1/N$.
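The two limiting cases can be checked numerically. The sketch below assumes that the collectivity takes the form given by Brüschweiler [13], κ = (1/N) exp(−∑_i q²_i ln q²_i), which is consistent with the properties described above; the function name and the toy amplitudes are hypothetical, and this is not the NOLB source code.

```python
import numpy as np


def collectivity(squared_amplitudes):
    """Collectivity of one mode from its per-atom squared amplitudes.

    Assumed form: kappa = exp(-sum(q2 * ln(q2))) / N, with sum(q2) = 1.
    """
    q2 = np.asarray(squared_amplitudes, dtype=float)
    q2 = q2 / q2.sum()                   # normalization, the role of alpha
    nonzero = q2[q2 > 0]                 # 0 * ln(0) is taken as 0
    entropy = -np.sum(nonzero * np.log(nonzero))
    return float(np.exp(entropy) / q2.size)


n_atoms = 4
uniform = collectivity(np.ones(n_atoms))           # identical amplitudes
localized = collectivity([1.0, 0.0, 0.0, 0.0])     # a single moving atom
print(uniform, localized)
```

With identical amplitudes the value reaches the upper bound of 1, and with a single moving atom it drops to the lower bound 1/N.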
Example 14 In the following example we will compute collectivities for the first twenty modes
of the coiled-coil protein 2ch7.
NOLB 2ch7.pdb --analyze -n 20 -s 0
============================Doing NMA============================
Number of computed modes......................................... : 20
=================================================================
NMA collectivities :
[ 1].......................................... : 0.682681
[ 2].......................................... : 0.682069
[ 3].......................................... : 0.638731
[ 4].......................................... : 0.670347
[ 5].......................................... : 0.723305
[ 6].......................................... : 0.740871
[ 7].......................................... : 0.746425
[ 8].......................................... : 0.647954
[ 9].......................................... : 0.741014
[10].......................................... : 0.764615
[11].......................................... : 0.760854
[12].......................................... : 0.611267
[13].......................................... : 0.754941
[14].......................................... : 0.718023
[15].......................................... : 0.594634
[16].......................................... : 0.711807
[17].......................................... : 0.5525
[18].......................................... : 0.760569
[19].......................................... : 0.781436
[20].......................................... : 0.716035
=================================================================
We can see that modes 10, 11, 13, 18, and 19 are the most collective, and modes 15 and 17 are
the least collective.
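The ranking can also be extracted automatically from the printed table. The following is a minimal sketch, assuming the log lines keep the `[ n]... : value` layout shown above; the regular expression and variable names are illustrative only.

```python
import re

# Collectivity lines copied from the output above.
log = """
[ 1].......................................... : 0.682681
[ 2].......................................... : 0.682069
[ 3].......................................... : 0.638731
[ 4].......................................... : 0.670347
[ 5].......................................... : 0.723305
[ 6].......................................... : 0.740871
[ 7].......................................... : 0.746425
[ 8].......................................... : 0.647954
[ 9].......................................... : 0.741014
[10].......................................... : 0.764615
[11].......................................... : 0.760854
[12].......................................... : 0.611267
[13].......................................... : 0.754941
[14].......................................... : 0.718023
[15].......................................... : 0.594634
[16].......................................... : 0.711807
[17].......................................... : 0.5525
[18].......................................... : 0.760569
[19].......................................... : 0.781436
[20].......................................... : 0.716035
"""

# Capture the mode index in brackets and the value after the colon.
pattern = r"\[\s*(\d+)\]\.+\s*:\s*(\d+\.\d+)"
collectivities = {int(m): float(v) for m, v in re.findall(pattern, log)}

# Sort the mode indices from the most to the least collective.
ranked = sorted(collectivities, key=collectivities.get, reverse=True)
print("most collective:", ranked[:5])
print("least collective:", ranked[-2:])
```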
4. Related methods
4.1 GUI
A graphical user interface created for the SAMSON software platform is available at
https://www.samson-connect.net. Figure 4.1 shows the current version of this GUI.
Figure 4.1: NOLB graphical user interface for the SAMSON software platform.
Bibliography
[1] E Bright Wilson, J C Decius, and Paul C Cross. Molecular Vibrations: The Theory of Infrared
and Raman Spectra. McGraw-Hill, 1955 (cited on page 3).
[2] Ivet Bahar et al. “Normal Mode Analysis of Biomolecular Structures: Functional Mechanisms
of Membrane Proteins”. In: Chem. Rev. 110.3 (Mar. 2010), pages 1463–1497 (cited on
pages 3, 6).
[3] Philippe Durand, Georges Trinquier, and Yves-Henri Sanejouand. “A New Approach for
Determining Low-Frequency Normal Modes in Macromolecules”. In: Biopolymers 34.6
(June 1994), pages 759–771 (cited on page 3).
[4] Florence Tama et al. “Building-Block Approach for Determining Low-Frequency Normal
Modes of Macromolecules”. In: Proteins: Struct., Funct., Bioinf. 41.1 (2000), pages 1–7
(cited on page 3).
[5] Timothy R Lezon et al. “Elastic Network Models for Biomolecular Dynamics: Theory and
Application to Membrane Proteins and Viruses”. In: Handbook on Biological Networks.
World Scientific Pub Co Pte Lt, Dec. 2009, pages 129–158 (cited on page 4).
[6] Alexandre Hoffmann and Sergei Grudinin. “NOLB: Nonlinear Rigid Block Normal-Mode
Analysis Method”. In: J. Chem. Theory Comput. 13.5 (2017), pages 2123–2134 (cited on
page 5).
[7] Pemra Doruker, Ali Rana Atilgan, and Ivet Bahar. “Dynamics of Proteins Predicted by
Molecular Dynamics Simulations and Analytical Approaches: Application to α -Amylase
Inhibitor”. In: Proteins: Struct., Funct., Bioinf. 40.3 (2000), pages 512–524 (cited on page 6).
[8] A R Atilgan et al. “Anisotropy of Fluctuation Dynamics of Proteins with an Elastic Network
Model”. In: Biophys. J. 80.1 (Jan. 2001), pages 505–515 (cited on page 6).
[9] Svetlana Artemova, Sergei Grudinin, and Stephane Redon. “A Comparison of Neighbor
Search Algorithms for Large Rigid Molecules”. In: J. Comput. Chem. 32.13 (2011),
pages 2865–2877 (cited on page 6).
[10] Emilie Neveu et al. “RapidRMSD : Rapid determination of RMSDs corresponding to motions
of flexible molecules”. In: Bioinformatics (2018) (cited on page 21).
[11] Sergei Grudinin and Alexandre Hoffmann. “Critical assessment of macromolecular conformational
transitions computed via linear and nonlinear normal mode analysis”. Unpublished
(cited on page 24).
[12] Thom Vreven et al. “Updates to the Integrated Protein-Protein Interaction Benchmarks:
Docking Benchmark Version 5 and Affinity Benchmark Version 2.” In: J. Mol. Biol. 427.19
(Sept. 2015), pages 3031–3041 (cited on page 24).
[13] Rafael Brüschweiler. “Collective protein dynamics and nuclear spin relaxation”. In: The
Journal of chemical physics 102.8 (1995), pages 3396–3403 (cited on page 28).