On Developing Coarse-Grained Models For Biomolecular Simulation: A Review

Cite this: Phys. Chem. Chem. Phys., 2012, 14, 12423–12430
PERSPECTIVE
Published on 08 June 2012.
www.rsc.org/pccp
Sereina Riniker,a Jane R. Allisonab and Wilfred F. van Gunsteren*a
Received 23rd March 2012, Accepted 4th May 2012
DOI: 10.1039/c2cp40934h
So-called coarse-grained models are a popular type of model for accessing long time scales in
simulations of biomolecular processes. Such models are coarse-grained with respect to atomic
models. But any modelling of processes or substances involves coarse-graining, i.e. the elimination
of non-essential degrees of freedom and interactions from a more fine-grained level of modelling.
The basic ingredients of developing coarse-grained models based on the properties of fine-grained
models are reviewed, together with the conditions that must be satisfied in order to preserve the
correct physical mechanisms in the coarse-graining process. This overview should help the reader
to determine how realistic a coarse-grained model of a biomolecular system is, i.e. whether it
reflects the underlying physical mechanisms or merely provides a set of pretty pictures of the
process or substances of interest.
1 Introduction
Ever since the experimental exploration of substances and
chemical processes, the development of theoretical models that
describe particular processes has been pursued. The creation of
differential calculus a few centuries ago allowed for a concise
mathematical formulation of models and analytical solutions for
the most simple ones, such as the ideal, non-interacting gas model
for substances in the gas phase, the lattice of harmonic oscillators
as a model for a substance in the solid phase, or the Ising model
for interacting spins. Since the advent of computers in the past
century, the use of computational models allowed theoretical
models of vastly greater complexity, thereby expanding their range
of applicability. It also offered the possibility to increase the
accuracy of models through a more faithful representation of
the underlying physical mechanisms. This development led to
highly accurate, predictive models that are standardly used in
technical sciences such as aeronautic or civil engineering. Although
theoretical and computational chemistry has seen a rapid
expansion during the past few decades, the accuracy and
applicability of its models are still rather limited. This is due
to a variety of factors: (i) the degrees of freedom governing
chemical processes are electronic, nuclear, atomic and molecular;
(ii) the interactions between these particles are governed by
quantum mechanics, i.e. the Dirac or Schrödinger equations
or under particular conditions by classical equations of motion;
(iii) at non-zero temperatures, the behaviour of the particles is in
addition governed by statistical mechanics, i.e. Bose–Einstein,
Fermi–Dirac, or Boltzmann ensembles of configurations are to
be considered; (iv) the Coulomb interaction is spatially rather
long-ranged; (v) the time scale of different chemical processes
may easily span 15 orders of magnitude; and (vi) the energy or
free energy changes of chemical processes can be very small
compared to the total energy of the interacting particles. These
factors complicate the formulation of models in chemistry,
which often resembles an art rather than a science.

a Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, 8093 Zurich, Switzerland. E-mail: wfvgn@igc.phys.chem.ethz.ch
b Centre for Theoretical Chemistry and Physics, Institute of Natural Sciences, Massey University Albany, New Zealand
This journal is © the Owner Societies 2012

Each model involves a choice of the essential degrees of
freedom, of the interactions governing the motion along these
degrees of freedom, of a method to generate configurations of the
degrees of freedom, and of the way in which the interaction with
the outside world is represented (ref. 1). The challenge of making
chemical models is to strike an appropriate balance between
accuracy and computational cost while representing the process
of interest with an in essence physically correct mechanism.
In chemistry, different levels of modelling, i.e. involving
different types of degrees of freedom, can be chosen (Table 1).
At the second most fine-grained level, one considers nuclei and
electrons, as done in quantum chemistry. If one is not interested
in breaking or forming chemical bonds or excited states of
molecules, for example, one may eliminate the electronic degrees
of freedom from the model and only consider atoms. In other
words, the fine-grained model is coarse-grained by elimination of
electronic degrees of freedom. This coarse-graining procedure
can be applied between any two levels of modelling and thus
any model in chemistry can be viewed as a coarse-grained
model with respect to the eliminated degrees of freedom. However,
the term coarse-grained modelling has predominantly been used to
indicate models in which the particles that constitute the degrees of
freedom of the model represent more than one non-hydrogen atom.
Table 1 Characteristic sizes of particles at different levels of resolution of modelling, scaling of the computational effort as a function of
the number of nucleons (Nn), electrons (Ne), atoms (Na) or beads (Nb),
and the reduction of the number of degrees of freedom or interactions
Ndf and the reduction of computational effort that can be achieved by
coarse-graining to the next level of modelling

Level  Particles               Size of bead/nm   Scaling of comp. effort   CG reduction of Ndf   CG reduction of effort
I      Nucleons + electrons    10^(…)            Nn^(≥3)                   10–100                >10^3
II     Nuclei + electrons      10^-6–10^(…)      Ne^(≥3)                   10–100                >10^3
III    Atoms                   0.03–0.3          Na^1                      2–5                   2–25
IV     Supra-atomic beads      0.5–1.0           Nb^1                      2–10                  2–100
V      Supra-molecular beads   0.5–1.0           Nb^1

If these atoms belong to one molecule, such a model is a supra-atomic, or molecular, coarse-grained model. If the particles that
constitute the degrees of freedom represent more than one molecule, such a model is a supra-molecular coarse-grained model.
The purpose of the present article is to review the nature and
implications of the different choices involved in coarse-graining,
with the aim of aiding the sensible development of coarse-grained
models of biomolecular systems and supporting the evaluation of
the efficiency and suitability of existing models for application to
problems of interest. We restrict the most detailed discussions to
coarse-graining between the atomic, molecular and supra-molecular
levels of modelling, levels III–V in Table 1. This means modelling at
the classical statistical-mechanical level. Because of the great variety
of models and applications in the literature, we only classify and
mention the different choices to be made when formulating,
developing and testing a model, with references, and do not review
their applications. Readers with further interest in the area are
referred to other reviews on the subject (ref. 2–6). We hope that our
classification will help the reader to find his or her way in the jungle
of models and to choose a combination of features and techniques
that suits his or her purpose best.
2 Choice of coarse-grained level of modelling
In science, one may distinguish very many levels of modelling.
Starting from the most fine-grained level of quarks, one may
coarse-grain up to the level of galaxies (ref. 7). In chemistry, the most
used levels are the following (Table 1):
(I) nucleons and electrons
(II) nuclei and electrons
(III) atoms
(IV) supra-atomic beads, e.g. united atoms
(V) supra-molecular beads
The interactions governing the motion of the particles of the
different levels are:
(I) strong interaction, Coulomb and Pauli principle
(II) Coulomb and Pauli principle
(III) Coulomb, van der Waals, repulsive and bonded terms
(IV) Coulomb, van der Waals, repulsive and bonded terms
(V) Coulomb, van der Waals, repulsive terms
The interactions of levels I and II are governed by quantum
mechanics, while the interactions of levels III–V are governed by
classical statistical mechanics. The number of degrees of freedom,
particles or interaction sites will determine, together with the
applicable equations of motion (quantum- or classical-mechanical),
the computational effort required, and thus the reduction of the
latter that can be reached by coarse-graining (Table 1).
Coarse-graining from level II to level III has different
characteristics and problematic issues than coarse-graining from
level III to level IV or V, because of the limited compatibility of
quantum and classical mechanical concepts. Therefore, below we
only consider coarse-graining in the realm of classical mechanics,
i.e. between levels III, IV and V.
3 Choice of degrees of freedom to be eliminated
Coarse-graining implies eliminating degrees of freedom. This
leads inevitably to a decrease of the applicability of the model.
For example, when coarse-graining from level II (nuclei and
electrons) to level III (atoms), relaxation of electronically
excited states of molecules is not covered by the model any
more. Generally, coarse-graining leads to a loss of accuracy of
the model, although for particular properties and types of
models this need not be the case. For example, the properties
of liquid water at ambient temperature are more accurately
described by the SPC model (ref. 8), a level III model, than by level II
ab initio models based on density-functional theory (ref. 9), due to
the limited accuracy of the functionals used. In general, the
choice of degrees of freedom to be eliminated depends on the
property and phase of the substance of interest.
The conditions that must be fulfilled by degrees of freedom in
order that they may be eliminated in a physically correct manner in
the coarse-graining process, such that a computationally efficient
and yet accurate coarse-grained model is obtained, are:
(1) they must be non-essential for the process or property of
interest.
(2) they must be large in number or computationally intensive, so that the computational gain is substantial enough
to offset the loss in accuracy.
(3) the interactions governing these degrees of freedom to
be eliminated should be largely decoupled from the interactions governing the other degrees of freedom of the system
which are to be maintained. This means that the frequency
components of the motion along the degrees of freedom to be
eliminated must be well separated from the other frequencies
occurring in the system, and that the coupling between the two
types of motion is weak (ref. 10).
(4) their elimination should allow a simple, efficient representation of the interaction governing the other, remaining degrees of
freedom.
We discuss here two examples of coarse-graining between
levels III and IV: the use of so-called united atoms (ref. 11) and of
bond-length constraints (ref. 10). By treating the aliphatic CH, CH2
and CH3 groups as united atoms, the number of atomic
interaction sites is substantially reduced, up to almost a factor
of 10 fewer pairwise non-bonded interactions for lipids, at the
cost of losing the dipolar interactions of the CH moieties and the
van der Waals interactions of the H atoms. The intra-moiety
motions of these CHn groups are largely decoupled from the
motions of the other atoms and the torsional interactions involving
these H atoms can be incorporated into the corresponding interactions for the torsional angle that does not involve an aliphatic
H atom. If the positions of these H atoms are needed, e.g. when
calculating quantities such as nuclear Overhauser effects (NOEs),
residual dipolar couplings (RDCs) or S² order parameters
measurable by NMR, the H atom positions can be easily
recovered based on the positions of the carbon atom and its
non-hydrogen covalently bound neighbours (ref. 12). Thus, all four
conditions for appropriate coarse-graining are largely met in
this case.
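The size of this reduction can be checked with a small counting exercise. The sketch below is only an illustration: the chain length of 16 carbons, the restriction to a purely aliphatic chain, and the function names are assumptions made here, not values from the article, and a real lipid also carries head-group atoms, exclusions and cut-offs that change the exact count.

```python
def count_sites(n_carbons: int, united_atom: bool) -> int:
    """Interaction sites of a linear alkane with n_carbons carbon atoms."""
    if united_atom:
        # Each CH2 or CH3 group collapses into a single united-atom site.
        return n_carbons
    # All-atom: every carbon, two H per carbon, plus one extra H at each chain end.
    return n_carbons + 2 * n_carbons + 2


def intermolecular_pairs(sites_a: int, sites_b: int) -> int:
    """Site-site pairs between two different molecules (no cut-off applied)."""
    return sites_a * sites_b


if __name__ == "__main__":
    n_c = 16  # hypothetical tail length, chosen only for illustration
    all_atom = count_sites(n_c, united_atom=False)
    united = count_sites(n_c, united_atom=True)
    ratio = intermolecular_pairs(all_atom, all_atom) / intermolecular_pairs(united, united)
    print(f"all-atom sites: {all_atom}, united-atom sites: {united}")
    print(f"reduction in pair interactions: {ratio:.1f}x")
```

For this hypothetical chain the count goes from 50 to 16 sites per molecule, so the number of site pairs between two chains drops by a factor of about 9.8, in line with the "almost a factor of 10" quoted for lipids.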
The other example of coarse-graining is the use of geometric
constraints for small molecules without intra-molecular torsional-angle degrees of freedom, such as the solvents water, methanol or
chloroform, or bond-length constraints in macromolecules (ref. 10). The
latter are standardly used in biomolecular simulations, because
they satisfy conditions 1 to 4 (ref. 10) and allow, through the
use of SHAKE (ref. 13), LINCS (ref. 14) or other similar techniques to
maintain such constraints, a gain of about a factor of four in
computational efficiency.
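To make the idea of a bond-length constraint concrete, the following is a minimal sketch of a SHAKE-style iteration for a single bond. It is not the implementation of ref. 13; the function name, the tolerance and the example coordinates are assumptions for illustration. After an unconstrained step, both atoms are displaced along the old bond vector, weighted by their inverse masses, until the bond length is restored.

```python
import numpy as np


def shake_bond(r_old_i, r_old_j, r_new_i, r_new_j, m_i, m_j, d,
               tol=1e-10, max_iter=100):
    """Reset the i-j distance to d after an unconstrained step (single bond only)."""
    r_old_ij = np.asarray(r_old_i, dtype=float) - np.asarray(r_old_j, dtype=float)
    r_i = np.array(r_new_i, dtype=float)
    r_j = np.array(r_new_j, dtype=float)
    for _ in range(max_iter):
        r_ij = r_i - r_j
        diff = d * d - r_ij @ r_ij           # deviation of the squared bond length
        if abs(diff) < tol * d * d:
            return r_i, r_j
        # First-order estimate of the displacement along the old bond direction.
        g = diff / (2.0 * (1.0 / m_i + 1.0 / m_j) * (r_old_ij @ r_ij))
        r_i += (g / m_i) * r_old_ij          # lighter atoms move further
        r_j -= (g / m_j) * r_old_ij
    raise RuntimeError("SHAKE did not converge")


if __name__ == "__main__":
    # Hypothetical O-H-like bond of 0.1 nm; masses in atomic mass units.
    r_i, r_j = shake_bond([0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                          [0.0, 0.005, 0.0], [0.108, -0.003, 0.002],
                          m_i=16.0, m_j=1.0, d=0.1)
    print(np.linalg.norm(r_i - r_j))
```

Because the two corrections are equal and opposite once weighted by mass, the centre of mass of the pair is left unchanged. Molecules with several coupled constraints require cycling over all constraints until each is satisfied.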
An example of coarse-graining that does not satisfy conditions 3
and 4 is the use of an implicit solvent model, i.e. the attempt to
mimic the effect of the solvent by a function that is only dependent
on the solute coordinates. If the solvent is water, this leads to
severe distortions of the energy surface of the solute. Although the
motions of a large solute may cover time scales ranging from
femtoseconds to milliseconds and the relaxation times of water
molecules are of the order of picoseconds, their motions on
picosecond to nanosecond time scales are not decoupled, and thus
condition 3 is not satisfied for particular processes. In explicit
solvents (orange particles in the left panel of Fig. 1), the non-polar
particles (blue) aggregate, and the electrostatic interaction between
ions (red and green) is reduced, leading to dissolution. So-called
hydrophobic or non-polar particles do like water, but their
interaction with water is less strong than the interaction of water
with itself, leading to water excluding the hydrophobes and their
subsequent aggregation. Ions with opposite charges do like water
more than each other, which leads to water surrounding the ions
and dissolution of ion pairs. The hydrophobic effect, the
apparent attraction between non-polar molecules or repulsion
between ions in aqueous solution due to the stronger interaction
between the water molecules or between water molecules and
ions, cannot be properly modelled in terms of solute and ion
coordinates only (right panel Fig. 1), because the effective
interaction between solute atoms and their entropy is a
complex function of the distribution of solvent coordinates.
Thus, condition 4 is difficult to meet (ref. 15).
Coarse-graining from level III to IV for biomolecules is a
challenge because of the heterogeneity of biomolecules. The
scale invariance that lies at the heart of the renormalisation
group approach to coarse-graining of largely homogeneous
polymers is not observed for biopolymers, which are
composed of many different, complex structural units that
are connected through different types of interactions. In the
coarse-graining process, the basic geometry and the balance
between the various interactions must be preserved in order to
avoid losing characteristic features of these molecules (ref. 15). In
addition, entropy plays a non-negligible role in biomolecular
processes, which means that the loss of entropy in the process
of coarse-graining must be compensated for by a loss in energy
in order to maintain the relevant free energy differences.
Finally, the reduction of the computational effort between
levels III and IV is rather modest compared to that between
other levels (Table 1). These considerations lead us to the
conclusion that coarse-graining from level III to level IV does
not pay off for biomolecules such as proteins, DNA, RNA and
sugars. Only a limited decrease in the number of interaction
sites is reached at the cost of losing the essential characteristics
of such molecules in terms of intra-molecular interactions,
interactions with the solvent and entropy. Only lipids, which
have relatively long homogeneous aliphatic tails, may be able
to retain the dominant characteristics of an amphiphilic
molecule with a particular geometry and flexibility upon
coarse-graining from level III to level IV. Due to the abundance
of lipids in membranes, the reduction in computational cost
may offset the loss of accuracy.
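The entropy–energy compensation mentioned in this paragraph can be written out explicitly. The relation below is a sketch under the assumption of a constant-volume, constant-temperature process; the symbol $\delta S$ for the entropy lost on coarse-graining is introduced here for illustration and does not appear in the article.

```latex
% Free energy difference of a process at the fine-grained (FG) level
\Delta A_{\mathrm{FG}} = \Delta U_{\mathrm{FG}} - T\,\Delta S_{\mathrm{FG}}

% Coarse-graining removes degrees of freedom, so the entropy change is reduced:
\Delta S_{\mathrm{CG}} = \Delta S_{\mathrm{FG}} - \delta S

% Requiring \Delta A_{\mathrm{CG}} = \Delta A_{\mathrm{FG}} then fixes the energy change:
\Delta U_{\mathrm{CG}} = \Delta U_{\mathrm{FG}} - T\,\delta S
```

Substituting the last two lines into $\Delta A_{\mathrm{CG}} = \Delta U_{\mathrm{CG}} - T\,\Delta S_{\mathrm{CG}}$ recovers $\Delta A_{\mathrm{FG}}$, which is the sense in which a loss of entropy has to be matched by a loss of energy.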
Since the inclusion of solvent degrees of freedom is essential
to properly represent the hydrophobic effect and because the
calculation of the solvent–solvent interactions in a simulation
of a biomolecule such as a protein or a fragment of DNA in
aqueous solution dominates the computational effort, coarse-graining of the solvent degrees of freedom holds much promise
to reduce the computational costs, in particular when more
than one solvent molecule is subsumed into a supra-molecular
bead. In the case of water, such coarse-graining from level III
to level V should retain the thermodynamic and dielectric
screening properties and hydrogen-bonding capacity of water
as much as possible, and a proper ratio between entropy and
energy (ref. 16). This is not the case if a water bead is modelled as a
Lennard-Jones particle without charge (ref. 17–19). Coarse-graining
of solvent degrees of freedom in a biomolecular simulation has
a good chance of meeting conditions 1 to 4, depending on how
the coarse-grained interaction is modelled.
Fig. 1 Illustration of the hydrophobic effect and the adverse effect of
eliminating solvent degrees of freedom in the process of coarse-graining. The solvent is shown in orange, the hydrophobes in blue,
and the ions in red (positively charged) and green (negatively charged).
4 Identification of properties or processes of interest
Since coarse-graining involves a loss of detail and possibly a
simplification of the interaction between particles, it is in
general not possible to reproduce all measurable properties
of a substance as the model becomes more coarse-grained.
This raises the question of which properties are of interest, and
thus are to be represented as well as possible and which
properties are to be relinquished. In addition, coarse-graining
may also restrict the applicability of the model to a particular
phase or region of the phase diagram of a substance, as
investigated in ref. 20. For example, models for liquid water
that are commonly used in level III biomolecular simulations
represent the thermodynamic, dielectric, structural and hydrogen-bonding properties of water molecules in the liquid phase
at ambient temperatures and pressures very well, even better
than level II water models, but are bad models for the
gas phase or for the liquid phase at high temperatures
(and pressures) of water. The more coarse-grained a model
is, the more restricted it is to the state point at which it was
parametrised.
In biomolecular systems, the following condensed phase
properties are of interest:
(1) molecular structure of the solute or structure of the
solvent;
(2) thermodynamic properties such as density, heat of
vaporisation, excess free energy or surface tension that carry
volume and energetic information. Other thermodynamic
quantities that characterise a response to a change in thermodynamic state point, such as the heat capacity, isothermal
compressibility or thermal expansion coefficient, are of
secondary importance;
(3) dielectric properties, in particular the static dielectric
permittivity that governs the screening of Coulomb interactions by the solvent;
(4) dynamic properties such as diffusion, viscosity, and
various molecular or dielectric relaxation times. These are of
less importance, because the equilibrium statistical-mechanical
ensemble averages of non-dynamic quantities generally do
not depend on the values of the dynamic ones, and because
most biomolecular processes are thermodynamically, not
dynamically driven.
Models for solvents at level III should at least reproduce the
structure of the liquid or liquid mixtures, the thermodynamic
properties such as the heat of vaporisation, excess free energy
and density, and the static dielectric permittivity. Models at
level IV, even when parametrised against a particular radial
distribution function derived from experiment or fine-grained
simulation, may not realistically mimic the structure of the
liquid, e.g. in terms of directional features such as hydrogen
bonding. For supra-molecular models of solvents (level V), a
comparison of the heat of vaporisation and the excess free
energy with experimental data is not straightforward, as is
explained in Section 8.4. Therefore, solvents at level V should
at least reproduce the thermodynamic properties surface
tension and density, and the static dielectric permittivity.
5 Extending the context in which the model can be used: multi-graining
Generally, a model developed for a particular level of
modelling is only used at the same level of modelling. For
example, models for small molecular compounds in the liquid
phase are used to study the properties of mixtures of such
compounds. However, this may limit the accessible time and
length scales in biomolecular simulations. Because of the
heterogeneity of biomolecular systems in terms of their relaxation time scales and the different types of interactions present,
it is of interest to combine different levels of modelling in one
simulation or system.
The combination of different levels of modelling or resolution, i.e. multi-graining, can take different forms.
(1) The simulation switches between the two levels of
modelling in time: multi-graining in time. This can be done
in two ways:
(a) the simulation is performed at the coarse-grained level
and particular configurations are later mapped back to the
fine-grained level (ref. 21–29);
(b) a coupling parameter λ is introduced that defines a path
between the fine-grained (e.g. λ = 0) and the coarse-grained
(e.g. λ = 1) representation of the particles, which allows
a smooth switching between different levels of modelling in
e.g. a Hamiltonian replica-exchange simulation (ref. 30–32).
(2) The system contains a mixture of fine-grained and
coarse-grained particles: multi-graining in space. This can be
done in two ways:
(a) the space occupied by the system is divided into a fine-grained and a coarse-grained region with a small buffer region
in which the particles change from one resolution to the other.
The resolution of the particles thus depends on their position
in space (ref. 33–38);
(b) the particles of the system are either fine-grained or
coarse-grained and can freely mix. The resolution of the
particles is thus fixed (ref. 39–45).
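Option 1(b) can be illustrated with a short sketch of a λ-dependent energy and the corresponding exchange test between two replicas. The linear interpolation between the two levels, the function names and the numbers in the usage example are assumptions made here; the cited methods may define the path differently.

```python
import math


def mixed_energy(lam: float, e_fg: float, e_cg: float) -> float:
    """Energy on an assumed linear path: lam = 0 is fine-grained, lam = 1 coarse-grained."""
    return (1.0 - lam) * e_fg + lam * e_cg


def exchange_probability(lam_a, lam_b, energies_a, energies_b, kT):
    """Metropolis probability of swapping the configurations of replicas a and b.

    energies_x is the pair (E_FG, E_CG) evaluated for the configuration
    currently held by replica x.
    """
    before = mixed_energy(lam_a, *energies_a) + mixed_energy(lam_b, *energies_b)
    after = mixed_energy(lam_a, *energies_b) + mixed_energy(lam_b, *energies_a)
    delta = (after - before) / kT
    return 1.0 if delta <= 0.0 else math.exp(-delta)


if __name__ == "__main__":
    # Hypothetical energies (kJ/mol) for two replicas at the ends of the path.
    p = exchange_probability(0.0, 1.0, (-10.0, -5.0), (-8.0, -9.0), kT=2.0)
    print(f"acceptance probability: {p:.4f}")
```

In practice a ladder of intermediate λ values is used so that neighbouring replicas overlap sufficiently in energy for exchanges to be accepted at a useful rate.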
The references given above are for multi-graining approaches
between level III and level IV or V. However, multi-graining of
type 2(b) is for example also applied between level II and level
III in so-called hybrid QM/MM simulations in which the
electrons are treated quantum-mechanically (level II) and
the nuclei and surrounding atoms classically (level III). As
discussed before, multi-graining of type 2(b) with a biomolecule
as solute at level III and the solvent at level V seems to be
the most promising type of multi-graining with respect to the
trade-off between accuracy and speed.
6 Trade-off between levels of modelling
As mentioned before, the process of coarse-graining may
reduce the usefulness of the model in different ways.
(1) The range of thermodynamic state points at which it
may be applied is generally reduced.
(2) The transferability of model parameters between similar
but not identical moieties or compounds is usually reduced.
(3) The accuracy of various properties may be reduced.
(4) The physical basis for a particular property or process
may be changed, leading to an unphysical mechanism of the
process in the coarse-grained model.
(5) The reduction of entropy and energy in the system may
lead to an unphysical balance between these quantities in the
coarse-grained model.
The combined loss of usefulness on these five counts must be
made up for by a much increased computational efficiency of
the coarse-grained model.
7 Quantities and data used for parameter calibration and model testing
Any model contains parameters which have to be chosen in
one way or another. One may choose values of experimentally
observable quantities for such parameter calibration or of
quantities that can be calculated from computer simulations
of a particular system at another, generally more fine-grained
level of modelling. Such a calibration may involve direct
matching between the value of a model parameter and its
corresponding value taken from experiment or from a fine-grained model. For example, one may take the O–H bond
length and H–O–H bond angle in a level III model for liquid
water from its gas-phase geometry as obtained from a level II
quantum-mechanical calculation or from its solid-state geometry
as obtained from diffraction experiments. A model parameter
can also be calibrated in a less direct manner by fitting another
property of the model to its fine-grained or experimental counterpart. For example, the partial atomic charges and the repulsive
Lennard-Jones parameter of the SPC water model were
chosen such that the experimental heat of vaporisation and
density of liquid water at ambient temperature and pressure
were reproduced (ref. 46).
Calibration of model parameters against values of quantities
obtained from computation or simulation of another, more fine-grained model, as is done when applying force-matching (ref. 47),
relative entropy (ref. 48), reverse Monte Carlo (ref. 49), or iterative Boltzmann
inversion (ref. 50) techniques, is a risky strategy, because it relies on the
accuracy of the more fine-grained model, which is for molecular
systems generally not yet at the level of accuracy of experimental
thermodynamic or dielectric screening data. Projection of
particular quantities from a more ne-grained level to a more
coarse-grained level also involves assumptions about the
coupling between the motion along the eliminated and
preserved degrees of freedom, as discussed in Section 3.
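As an illustration of one of the techniques named above, the sketch below shows the potential update at the core of iterative Boltzmann inversion: the pair potential is corrected by kT ln(g_current/g_target) at every distance until the coarse-grained radial distribution function matches the target. This is a minimal sketch, not the implementation of ref. 50; the function names, the floor value `eps` and the toy data are assumptions, and the simulation step that produces `g_current` from the potential is deliberately left out.

```python
import numpy as np


def boltzmann_inversion(g_target, kT, eps=1e-12):
    """Initial guess: potential of mean force V0(r) = -kT ln g_target(r)."""
    return -kT * np.log(np.maximum(g_target, eps))


def ibi_update(v_current, g_current, g_target, kT, eps=1e-12):
    """One update step: V_new(r) = V(r) + kT ln(g_current(r) / g_target(r))."""
    v_new = np.array(v_current, dtype=float)
    # Only correct bins where both distributions are sampled.
    mask = (g_current > eps) & (g_target > eps)
    v_new[mask] += kT * np.log(g_current[mask] / g_target[mask])
    return v_new


if __name__ == "__main__":
    kT = 2.5  # kJ/mol, roughly ambient temperature
    g_target = np.array([0.0, 0.5, 1.5, 1.0])    # toy target distribution
    g_current = np.array([0.0, 1.0, 1.5, 1.0])   # toy result of a CG simulation
    v0 = boltzmann_inversion(g_target, kT)
    v1 = ibi_update(v0, g_current, g_target, kT)
    print(v1 - v0)  # positive where the CG model over-populates a distance
```

The update makes the dependence on the fine-grained model explicit: whatever inaccuracy is present in the target distribution is transferred directly into the coarse-grained potential.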
On the other hand, not all data considered to be experimental
data are really measured, and thus suitable for parameter
calibration. One should distinguish between observed, primary
experimental data such as intensities of a diffracted beam in a
scattering experiment or absorption intensities in a spectroscopic experiment, and derived, secondary experimental data
that are obtained from observed data by a particular procedure
which generally involves model assumptions and approximations that may introduce inaccuracy (ref. 51). Examples of derived
experimental data are X-ray structures determined from
X-ray diffraction intensities or NMR model structures derived
from NMR observables such as NOE intensities, RDCs and
³J-couplings. Use of derived experimental data in model
parameter calibration is a highly risky strategy due to the
uncertainty inherent to such data.
Another useful distinction is between atomic, molecular
and supra-molecular versus bulk or system properties. For
example, the atomic partial charge could also be calibrated to
reproduce a particular molecular dipole moment rather than
to reproduce the static dielectric permittivity of the bulk
liquid. This raises the question, discussed in Section 4, of which
properties should get the largest weight in the calibration: the
supra-molecular structural properties or the thermodynamic (system)
properties? These questions do not have a universally correct answer;
rather, the choice will depend on the intended application(s)
of the model.
Finally, we note that any model parameter is an effective
parameter, i.e. its value has only limited significance per se, but
obtains significance in connection with the values of other
parameters and the overall properties of the model that were
the target of the calibration. For example, the bond angle of a
water molecule in the gas phase is not tetrahedral and it is
probably also not tetrahedral in the liquid phase. Yet, many
models for liquid water use a tetrahedral geometry in order to
facilitate the reproduction of the dielectric screening properties
of water, which is considered to be more important than the
detailed geometry of a water molecule.
As far as testing a model is concerned, the same considerations apply as in regard to parameter calibration. However, in
testing one tries to break the model, while during parameter
calibration one tries to optimise a particular set of model
properties. Of course, properties that were used for parameter
calibration have only limited value as a test of the model.
8 Procedure to develop a supra-molecular model
In this section we sketch a procedure that can be followed
when designing and developing a supra-molecular (SM) model.
This procedure was followed in the development of supra-molecular coarse-grained models for water (ref. 16), dimethyl sulfoxide,
chloroform and methanol (ref. 52), which may serve as solvents for
biomolecules.
8.1 Choice of degrees of freedom
The first choice is how many molecules Nmol are to be
subsumed into one supra-molecular bead. For liquid water,
which has a tetrahedral structure, the value Nmol = 5 appeared
to be most appropriate considering the properties of water
clusters of varying size in atomic-level (AL) simulations of the
liquid (ref. 16). For methanol, we chose Nmol = 4, and for DMSO
and CHCl3 Nmol = 2 due to the larger volume of these
molecules and to avoid overly large SM beads with an eye to
mixing atoms with SM beads.
The second choice is how many interaction sites or particles
are to be used to represent a SM bead. For polar molecules, at
least two sites or particles are needed to mimic an electric
dipole using charges. We decided to give both interaction sites
a mass, i.e. to treat them as particles, in order to be able to
control the dielectric relaxation of a two-particle SM bead.
8.2 Functional form of the interaction between interaction sites or particles of a bead
According to condition 4 of Section 3, the form of the
interaction at the coarse-grained level should be a simple
and efficient representation of the interaction at the fine-grained level after removing the interaction governing the
eliminated degrees of freedom.
At the atomic level of modelling, the basic non-bonded
interactions are of Coulombic and van der Waals type, which
means an r⁻¹ distance dependence between charges and an r⁻⁶
distance dependence between atoms (induced dipole–dipole).
Since these interactions are relatively long-ranged, integrating
over the atoms that are eliminated in the coarse-graining
process will not change the distance dependence of these
interaction types. Therefore, the r⁻¹ and r⁻⁶ dependences were
kept at the coarse-grained level. For the repulsion between the
particles of different beads an r⁻¹² term was chosen for reasons
of convenience and because the precise form of this repulsion
cannot be determined. The interaction that keeps the two
particles of a bead together was modelled as simply as possible
using a quartic attractive functional form.
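The functional forms named in this subsection can be collected in a few lines of code. This is a sketch under stated assumptions rather than the published model: the unit system (kJ/mol, nm, elementary charges), the parameter names, and in particular the one-sided form chosen for the quartic tether are assumptions made here, since the text only specifies that the intra-bead term is quartic and attractive. The relative permittivity `eps_cs` in the Coulomb term is the quantity discussed in Section 8.4.

```python
COULOMB_CONST = 138.935458  # 1/(4 pi eps0) in kJ mol^-1 nm e^-2


def coulomb(q_i: float, q_j: float, r: float, eps_cs: float = 1.0) -> float:
    """r^-1 Coulomb term between two charged sites, screened by eps_cs."""
    return COULOMB_CONST * q_i * q_j / (eps_cs * r)


def lennard_jones(c12: float, c6: float, r: float) -> float:
    """r^-12 repulsion and r^-6 dispersion between particles of different beads."""
    return c12 / r**12 - c6 / r**6


def quartic_tether(r: float, r0: float, k: float) -> float:
    """Assumed one-sided quartic term pulling the two particles of a bead together.

    Zero for r <= r0, growing as (r - r0)^4 beyond it; the exact published
    form may differ.
    """
    stretch = r - r0
    return 0.5 * k * stretch**4 if stretch > 0.0 else 0.0


if __name__ == "__main__":
    # Hypothetical parameters chosen only to exercise the functions.
    print(coulomb(0.5, -0.5, 0.6, eps_cs=2.5))
    print(lennard_jones(c12=0.015625, c6=1.0, r=0.5))
    print(quartic_tether(0.25, r0=0.2, k=2.0e6))
```

Writing the terms separately makes it easy to see which parameters are available for calibration: the charges and `eps_cs` for the dielectric response, `c12` and `c6` for density and cohesion, and `r0` and `k` for the size and stiffness of the dipole within a bead.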
8.3 Choice and calibration of the model parameters
The values of the model parameters can be chosen in a simple
manner, e.g. by summing the mass of the AL atoms to yield
the mass of a SM bead, or adjusted to obtain particular
features of the model.
An analysis of an AL simulation in terms of clusters of size Nmol
molecules may serve as a basis for finding appropriate values for the
SM parameters. For example, the value of Nmol = 5 for water
resulted from selecting reasonably sized spherical clusters (ref. 16). Given a
value of Nmol, the cluster size offers an estimate for the SM
Lennard-Jones parameter σLJ, while the inter-cluster AL energy
offers an estimate for the SM Lennard-Jones parameter εLJ. These
and other interaction parameters can then be varied to reproduce
experimentally determined values for thermodynamic, dielectric
and structural properties for the substance of interest. If an
insufficient number of experimental values is available, values of
relevant quantities that can be computed from AL simulations,
either for the bulk or for clusters, can be used for parameter
calibration too, as long as the quantities are reasonably well
modelled at the AL level. For example, one may use experimental
values for the density, the static dielectric permittivity and the
surface tension of a liquid and in addition the ratio between van der
Waals and electrostatic interactions obtained from AL simulations.
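Such a calibration can be framed as minimising a penalty over the target properties. The sketch below is one possible formulation, not the procedure of ref. 16: the function name, the use of squared relative deviations and the two trial parameter sets are assumptions, and the trial values are invented placeholders rather than simulation results. The target values are approximate experimental numbers for liquid water near 298 K.

```python
def calibration_penalty(model, target, weights=None):
    """Weighted sum of squared relative deviations from the target values."""
    if weights is None:
        weights = {name: 1.0 for name in target}
    total = 0.0
    for name, reference in target.items():
        relative_deviation = (model[name] - reference) / reference
        total += weights[name] * relative_deviation ** 2
    return total


# Approximate experimental values for liquid water near 298 K.
target = {
    "density_kg_m3": 997.0,
    "static_permittivity": 78.4,
    "surface_tension_mN_m": 72.0,
}

# Hypothetical outcomes of two trial SM parameter sets (placeholders only).
trial_a = {"density_kg_m3": 1010.0, "static_permittivity": 70.0,
           "surface_tension_mN_m": 60.0}
trial_b = {"density_kg_m3": 995.0, "static_permittivity": 75.0,
           "surface_tension_mN_m": 68.0}

trials = {"a": trial_a, "b": trial_b}
best = min(trials, key=lambda name: calibration_penalty(trials[name], target))
print("Trial set with the lowest penalty:", best)
```

A quantity taken from AL simulations, such as the ratio between van der Waals and electrostatic interactions, would enter as one more key in `target`, possibly with a smaller weight.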
8.4 Technical issues
When coarse-graining from level III to level V, a few technical
issues emerge that are generally not present in AL models.
(1) Atomic biomolecular force fields generally use a relative
dielectric constant $\varepsilon_\mathrm{cs}$ of 1 in the Coulomb interaction, because
there is vacuum between the atoms and the polarisability of atoms
is neglected. The SM beads should represent the polarisability or
dielectric screening capability of $N_\mathrm{mol}$ molecules. This is accounted
for by using values of $\varepsilon_\mathrm{cs} > 1$ in the direct Coulomb interaction.$^{16}$
(2) When comparing the pressure calculated for the SM
beads with the desired experimental value, one should account
for the fact that this pressure will be $N_\mathrm{mol}$ times smaller than in
an AL simulation. This is done by using a scaling factor $S_\mathrm{SM} = N_\mathrm{mol}$.$^{16}$
(3) For thermodynamic quantities such as the heat of
vaporisation, the excess free energy of a liquid or the free
energy of solvation, which are defined by a difference of an
energy or free energy between the gas phase and the liquid
phase, a meaningful comparison of values calculated with a
SM model ($N_\mathrm{mol} > 1$) and experimental ones is not possible,
because it would require a reliable calculation of the (free)
energy of cluster decomposition in the gas phase.$^{16}$
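The pressure scaling of item (2) can be illustrated with the kinetic (ideal-gas) term alone, which is proportional to the number density of particles. This is only a sketch of where a factor $N_\mathrm{mol}$ can arise and not the derivation given in ref. 16; the virial contribution is left out, and the function names and the rounded number density of liquid water are assumptions of the example.

```python
K_B = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def ideal_pressure(number_density, temperature):
    """Kinetic (ideal-gas) pressure rho * k_B * T in kJ mol^-1 nm^-3."""
    return number_density * K_B * temperature


def scaled_sm_pressure(p_sm, n_mol):
    """Apply the scaling factor S_SM = N_mol before comparing with experiment."""
    return n_mol * p_sm


n_mol = 5
rho_molecules = 33.3                 # approx. number density of liquid water, nm^-3
rho_beads = rho_molecules / n_mol    # one bead replaces n_mol molecules

p_al = ideal_pressure(rho_molecules, 298.0)
p_sm = ideal_pressure(rho_beads, 298.0)
print("Ratio of AL to SM kinetic pressure:", p_al / p_sm)
print("Scaled SM kinetic pressure:", scaled_sm_pressure(p_sm, n_mol))
```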
9 Multi-graining in space
The challenge of developing a model that spans two different
resolution levels of modelling is to find a physically correct
balance between the three types of interactions present in such
a system: interactions between fine-grained particles (FG-FG),
interactions between fine-grained and coarse-grained particles
(FG-CG) and interactions between coarse-grained particles
(CG-CG). In addition, when trying to combine a model of
level II with one of level III, IV or V, a quantum-mechanical
description is to be combined with a classical-mechanical one,
which raises the question of how to match a probability
description with a trajectory one.
Finding a correct balance between FG-FG, FG-CG and
CG-CG interactions leading to a physically sound distribution
of energy and entropy over the fine-grained and coarse-grained
subsystems is not an easy task. The difference in
particle size, number of interaction sites and strength of the
interaction at the fine-grained and coarse-grained level as
reflected by the difference in the respective model parameter
values must be accounted for. This can be done by adapting
the parameters for FG-CG interactions such that particular
properties of mixtures of fine-grained and coarse-grained
particles, e.g. their homogeneity in the case of pure liquids,
are reproduced. Ideally, standard combination rules for
obtaining parameters for non-covalent interactions between
unlike particles can be applied to obtain the FG-CG interaction
parameters which reproduce the target properties, but if
this is not possible, one could achieve it in a simple manner by
introducing a scaling factor that relates FG-CG parameters to
CG-CG ones.$^{41}$ For interactions of a Lennard-Jones functional
form, the parameters $\varepsilon_\mathrm{LJ}$ and $\sigma_\mathrm{LJ}$ can be scaled, while the
Coulomb interaction can be scaled by varying the relative
dielectric constant $\varepsilon_\mathrm{cs}$ used in the model.
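The two routes to FG-CG Lennard-Jones parameters can be sketched as follows. The Lorentz-Berthelot rule is used here merely as one familiar example of a standard combination rule; the text does not specify which rule is meant. The function names, scaling factors and parameter values are all hypothetical.

```python
import math


def lorentz_berthelot(eps_i, sigma_i, eps_j, sigma_j):
    """Combination rule: geometric mean for epsilon, arithmetic mean for sigma."""
    return math.sqrt(eps_i * eps_j), 0.5 * (sigma_i + sigma_j)


def scaled_from_cg(eps_cgcg, sigma_cgcg, s_eps=1.0, s_sigma=1.0):
    """Alternative route: derive FG-CG parameters by scaling the CG-CG ones."""
    return s_eps * eps_cgcg, s_sigma * sigma_cgcg


# Hypothetical FG and CG parameters (epsilon in kJ mol^-1, sigma in nm).
eps_fg, sigma_fg = 1.0, 0.3
eps_cg, sigma_cg = 4.0, 0.5

print("Combination rule:", lorentz_berthelot(eps_fg, sigma_fg, eps_cg, sigma_cg))
print("Scaled CG-CG values:", scaled_from_cg(eps_cg, sigma_cg, s_eps=0.6, s_sigma=0.9))
```

In a calibration, `s_eps` and `s_sigma` would be varied until a target property of the mixed system, such as the homogeneity of a pure liquid, is reproduced; the Coulomb part would be tuned analogously through $\varepsilon_\mathrm{cs}$.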
If the energetic and entropic balance between the fine-grained
and coarse-grained models in a hybrid multi-grained
system is physically incorrect, the properties of the system will
become sensitive to the ratio of the fine-grained versus coarse-grained
degrees of freedom or interaction sites and incorrect
configurational ensembles will be obtained. In other words,
the fine-grained and coarse-grained models must be shown to
be thermodynamically consistent, which is a condition that is
generally not met for models used to simulate biomolecular
systems.
10 Multi-graining in time
As was discussed in Section 5, multi-graining, or the combination
of different levels of resolution of modelling, can also be executed
without mixing fine-grained and coarse-grained particles in one
system as is done in hybrid multi-grained models. One can apply
multi-graining in time either by back-mapping of coarse-grained
configurations to the fine-grained level or as a function of a
coupling parameter $\lambda$. In these cases, fine-grained particles do not
interact with coarse-grained particles and thus there is no need to
define and calibrate FG-CG interaction parameters. The price
paid for this reduction of parametrisation complexity is that for
all types of atoms, molecules or particles in the system, FG-FG
and CG-CG interactions have to be defined and model parameters
have to be calibrated. In other words, the challenge
of finding an appropriate balance between the energetic and
entropic contributions of the FG-FG, FG-CG and CG-CG
interactions is shifted to the time domain or $\lambda$-coupling domain.
Fig. 2 Challenges for theory and computation in biomolecular science.
At every switch between the grain levels a possible thermodynamic
incompatibility between the interactions at the two levels can
emerge. If, for example, at the fine-grained level a protein fold is
stable due to strong intra-protein interactions, while at the
coarse-grained level the protein-solvent interactions are dominant,
switching between the grain levels in time will induce a folding/
unfolding process that has no physical meaning, but merely reflects
the thermodynamic incompatibility of the fine-grained and
coarse-grained models used for the system.
11 Discussion
Due to the rapid increase and wide availability of computing
power, the development of computational models with the aim of
describing the behaviour of complex biomolecular systems has
become popular. Yet, the development of a model at a particular
level of resolution, be it sub-atomic, atomic, molecular or
supramolecular, that properly reflects the physical mechanism that
determines the biomolecular behaviour or process that is to be
investigated is by no means a trivial task. It was the purpose of this
article to review the ingredients needed to apply coarse-graining in
a sensible manner for the development of biomolecular models.
One may choose a set of degrees of freedom and interaction
sites, postulate a particular functional form of the interaction or
energy as a function of the degrees of freedom, choose or calibrate
the interaction parameters in one way or another and then
simulate the motion of the system along the chosen degrees of
freedom. Depending on the quality of the choices made and the
physical correctness of the inevitable approximations involved,
such a simulation may produce at best the essential mechanism
and features of the process of interest, or at worst a set of
snapshots or pictures that, whilst pretty, do not reflect reality.
In order to avoid the latter case, of which ample examples
can be found in the literature, one should be aware of the four
conditions that should be satisfied in the process of coarse-graining.
These were formulated in Section 3. In particular,
conditions 3 and 4 tend to be violated when formulating a
coarse-grained model without analysing, at the fine-grained
level, the coupling between the interactions and motions of the
eliminated degrees of freedom and the remaining degrees of
freedom. If this coupling is not weak, the choice of an
appropriate functional form and parameters for the interaction
at the coarse-grained level may become cumbersome.
When combining models of different resolution in one
system or simulation, their thermodynamic properties, in
particular their energetics, must be compatible in order to
obtain meaningful simulation results. Whether this goal can be
reached will depend on the features of the process and system
of interest. In Fig. 2, we have sketched this challenge of theory
and computation in the biomolecular sciences and the steps
that can be taken to meet this challenge. Unfortunately, it will
often be difficult to develop a reliable computational model
that produces more than just a series of pretty pictures.
Acknowledgements
This work was financially supported by the National Center of
Competence in Research (NCCR) in Structural Biology and
by grant number 200020-137827 of the Swiss National Science
Foundation, and by grant number 228076 of the European
Research Council, which is gratefully acknowledged. We
thank Alan Mark for the idea of illustrating the hydrophobic
effect in Fig. 1.
References
1 W. F. van Gunsteren, D. Bakowies, R. Baron, I. Chandrasekhar, M. Christen, X. Daura, P. Gee, D. P. Geerke, A. Glättli, P. H. Hünenberger, M. A. Kastenholz, C. Oostenbrink, M. Schenk, D. Trzesniak, N. F. A. van der Vegt and H. B. Yu, Angew. Chem., Int. Ed., 2006, 45, 4064.
2 G. S. Ayton, W. G. Noid and G. A. Voth, Curr. Opin. Struct. Biol., 2007, 17, 192.
3 P. Sherwood, B. R. Brooks and M. S. P. Sansom, Curr. Opin. Struct. Biol., 2008, 18, 630.
4 C. Peter and K. Kremer, Soft Matter, 2009, 5, 4357.
5 M. Cascella and M. Dal Peraro, Chimia, 2009, 63, 14.
6 S. C. L. Kamerlin, D. Vicatos, A. Dryga and A. Warshel, Annu. Rev. Phys. Chem., 2011, 62, 41.
7 H. J. C. Berendsen, Simulating the Physical World: Hierarchical Modeling from Quantum Mechanics to Fluid Dynamics, Cambridge University Press, Cambridge, UK, 2007.
8 A. Glättli, X. Daura and W. F. van Gunsteren, J. Chem. Phys., 2002, 116, 9811.
9 B. Guillot, J. Mol. Liq., 2002, 101, 219.
10 W. F. van Gunsteren and H. J. C. Berendsen, Mol. Phys., 1977, 34, 1311.
11 M. Levitt and S. Lifson, J. Mol. Biol., 1969, 46, 269.
12 W. F. van Gunsteren, S. R. Billeter, A. A. Eising, P. H. Hünenberger, P. Krüger, A. E. Mark, W. R. P. Scott and I. G. Tironi, Biomolecular Simulation: The GROMOS96 Manual and User Guide, vdf Hochschulverlag AG an der ETH Zürich, and BIOMOS b.v. Zürich, Groningen, 1996.
13 J.-P. Ryckaert, G. Ciccotti and H. J. C. Berendsen, J. Comput. Phys., 1977, 23, 327.
14 B. Hess, H. Bekker, H. J. C. Berendsen and J. G. E. M. Fraaije, J. Comput. Chem., 1997, 18, 1463.
15 M. Müller, K. Katsov and M. Schick, Phys. Rep., 2006, 434, 113.
16 S. Riniker and W. F. van Gunsteren, J. Chem. Phys., 2011, 134, 084110.
17 S. J. Marrink, A. H. de Vries and A. E. Mark, J. Phys. Chem. B, 2004, 108, 750.
18 S. J. Marrink, H. J. Risselada, S. Yefimov, D. P. Tieleman and A. H. de Vries, J. Phys. Chem. B, 2007, 111, 7812.
19 W. Shinoda, R. Devane and M. L. Klein, Mol. Simul., 2007, 33, 27.
20 M. E. Johnson, T. Head-Gordon and A. A. Louis, J. Chem. Phys., 2007, 126, 144509.
21 W. Tschöp, K. Kremer, O. Hahn, J. Batoulis and T. Bürger, Acta Polym., 1998, 49, 75.
22 G. Milano and F. Müller-Plathe, J. Phys. Chem. B, 2005, 109, 18609.
23 S. Izvekov and G. A. Voth, J. Phys. Chem. B, 2005, 109, 2469.
24 B. Hess, S. Leon, N. F. A. van der Vegt and K. Kremer, Soft Matter, 2006, 2, 409.
25 A. Y. Shih, P. L. Freddolino, S. G. Sligar and K. Schulten, Nano Lett., 2007, 7, 1692.
26 A. P. Heath, L. E. Kavraki and C. Clementi, Proteins: Struct., Funct., Bioinf., 2007, 68, 646.
27 T. Carpenter, P. J. Bond, S. Khalid and M. S. P. Sansom, Biophys. J., 2008, 95, 3790.
28 A. J. Rzepiela, L. V. Schäfer, N. Goga, H. J. Risselada, A. H. de Vries and S. J. Marrink, J. Comput. Chem., 2001, 31, 1333.
29 A. Samiotakis, D. Homouz and M. S. Cheung, J. Chem. Phys., 2010, 132, 175101.
30 M. Christen and W. F. van Gunsteren, J. Chem. Phys., 2006, 124, 154106.
31 E. Lyman, F. M. Ytreberg and D. M. Zuckerman, Phys. Rev. Lett., 2006, 96, 28105.
32 P. Liu and G. A. Voth, J. Chem. Phys., 2007, 126, 045106.
33 M. Praprotnik, L. Delle Site and K. Kremer, J. Chem. Phys., 2005, 123, 224106.
34 B. Ensing, S. O. Nielsen, P. B. Moore, M. L. Klein and M. Parrinello, J. Chem. Theory Comput., 2007, 3, 1100.
35 A. Heyden and D. G. Truhlar, J. Chem. Theory Comput., 2008, 4, 217.
36 J. H. Park and A. Heyden, Mol. Simul., 2009, 35, 962.
37 S. Poblete, M. Praprotnik, K. Kremer and L. Delle Site, J. Chem. Phys., 2010, 132, 114101.
38 C. Junghans and S. Poblete, Comput. Phys. Commun., 2010, 181, 1449.
39 M. Neri, C. Anselmi, M. Cascella, A. Maritan and P. Carloni, Phys. Rev. Lett., 2005, 95, 218102.
40 Q. Shi, S. Izvekov and G. A. Voth, J. Phys. Chem. B, 2006, 110, 15045.
41 J. Michel, M. Orsi and J. W. Essex, J. Phys. Chem. B, 2008, 112, 657.
42 M. Masella, D. Borgis and P. Cuniasse, J. Comput. Chem., 2008, 29, 1707.
43 A. J. Rzepiela, M. Louhivuori, C. Peter and S. J. Marrink, Phys. Chem. Chem. Phys., 2011, 13, 10437.
44 S. Riniker and W. F. van Gunsteren, J. Chem. Phys., 2012, submitted.
45 S. Riniker, A. P. Eichenberger and W. F. van Gunsteren, Eur. Biophys. J., 2012, submitted.
46 H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren and J. Hermans, in Intermolecular Forces, ed. B. Pullmann, Reidel, Dordrecht, 1981, pp. 331-342.
47 S. Izvekov and G. A. Voth, J. Chem. Phys., 2005, 123, 134105.
48 M. S. Schell, J. Chem. Phys., 2008, 129, 144108.
49 A. P. Lyubartsev and A. Laaksonen, Phys. Rev. E, 1995, 52, 3730.
50 D. Reith, M. Puetz and F. Mueller-Plathe, J. Comput. Chem., 2003, 24, 1624.
51 W. F. van Gunsteren, J. Dolenc and A. E. Mark, Curr. Opin. Struct. Biol., 2008, 18, 149.
52 J. R. Allison, S. Riniker and W. F. van Gunsteren, J. Chem. Phys., 2012, 136, 054505.