
review article

Published online: 17 December 2009 | doi: 10.1038/nchem.473

Designing artificial enzymes by intuition and computation
Vikas Nanda¹* and Ronald L. Koder²

The rational design of artificial enzymes, either by applying physico–chemical intuition of protein structure and function or with the aid of computational methods, is a promising area of research with the potential to tremendously impact medicine, industrial chemistry and energy production. Designed proteins also provide a powerful platform for dissecting enzyme mechanisms of natural systems. Artificial enzymes have come a long way from simple α-helical peptide catalysts to proteins that facilitate multistep chemical reactions designed by state-of-the-art computational methods. Looking forward, we examine strategies employed by natural enzymes that could be used to improve the speed and selectivity of artificial catalysts.

In the nineteen-fifties and sixties, the advent of the semiconductor transistor and the integrated circuit transformed digital computers from powerful curiosities into pragmatic, cost-effective tools. Along with advances in numerical methods, computers revolutionized the design and construction of aircraft, allowing engineers to simulate complex, nonlinear systems that integrated aerodynamics, propulsion, control and so on, thereby pushing aircraft technology well beyond what was possible with previous analytical models. Today, a Boeing 747 is an incredibly complex machine with over 6,000,000 parts. As such, computers have become indispensable in the aerospace industry. Although much smaller in size, the mechanistic complexity of enzymes and challenges associated with their design (Box 1) suggest that they are as sophisticated as passenger airliners, and it is expected that computational methods in chemistry and biology will promote a similar revolution in the design of artificial catalysts.

The promise of constructing enzymes that are capable of efficiently catalysing virtually any chemical reaction is a tremendous motivator for researchers in the protein-design field. Enzymes catalyse difficult chemical reactions in mild, aqueous environments, often with a speed and specificity unrivalled by synthetic catalysts. Designing an enzyme from scratch is also the most rigorous way of testing our understanding of how natural enzymes function. Several recent designs have been stripped-down or rebuilt versions of natural enzymes, providing powerful tools for dissecting molecular contributions to enzyme structure and reactivity.

Enzyme design is inextricably linked with the design of protein structure. Advances in protein design are often rapidly followed by attempts to apply new technologies to artificial enzymes. Therefore, this is as much a review of protein fold design as of catalyst design. However, it should be noted that complex protein topologies are not a prerequisite for catalysis. Proline alone can catalyse a remarkable array of reactions, including aldolase-like formations of carbon–carbon bonds through enamine intermediates with high yields and substantial product enantiomeric excess. Other processes including asymmetric epoxidations and acylations are achievable using short peptides. The impressive catalytic properties of proline and small peptides have been extensively reviewed previously¹,² and are not covered here. Few designed enzymes have achieved the catalytic utility of such small peptides, and much remains to be done before designer enzymes find practical applications. However, the remarkable selectivity, rate enhancements and product specificity of natural enzymes under aqueous conditions warrant more work in developing powerful molecular design technologies.

The complexities of enzyme design can be quite daunting. Examining high-resolution structures of natural enzyme–substrate complexes reveals that the conformations of active-site amino acids are poised to facilitate catalysis. Second-shell interactions tune the reactivity of the active site through networks of direct interactions with primary ligands and long-range electrostatic forces. Designing such molecules from scratch presents a host of computational challenges (see Box 1). Accurate modelling of important forces in the active site requires quantum mechanical (QM) calculations. Unfortunately, it is not feasible to perform QM calculations on molecules the size of even the smallest enzymes. Design also requires the rapid evaluation of a large number of candidate structure/sequence combinations and QM calculations are very demanding on computational resources. Effective treatment of water molecules and their interactions with active-site residues and reactants can also significantly increase the complexity of calculations. Along with high-resolution design of the active-site residues, candidate enzymes must maintain overall structural integrity, and in some cases, incorporate large-scale protein motions that may support catalysis. Integrating all of these factors into a design is a formidable challenge.

It is reasonable to ask what one gains through such sophisticated computation. After all, a number of novel folds and catalytically active proteins have been built without such tools. In this review, we survey several designed, artificial enzymes that have been developed with varying degrees of computational involvement. These include de novo enzymes, where both the protein topology and the active site are built from scratch, and active-site design, where surfaces and cavities on existing proteins are repurposed for catalysis. We build on previous reviews of this field (for example, see ref. 3) by including a deeper discussion of computational challenges associated with enzyme design. Additionally, we look to the future of design, such as introducing multiple substrates, protein motion and allostery into artificial enzymes, and expanding protein design principles to a broader class of structured, catalytically active polymers.

The helix and the enzyme

A fundamental paradigm in biochemistry is the link between a protein’s function and its three-dimensional fold, which in turn

¹Robert Wood Johnson Medical School - UMDNJ Biochemistry, Center for Advanced Biotechnology and Medicine, 679 Hoes Lane West, Piscataway, New Jersey 08854, USA; ²City College of New York Physics, 160 Convent Avenue, New York, New York 10031, USA. *e-mail: nanda@cabm.rutgers.edu

nature chemistry | VOL 2 | JANUARY 2010 | www.nature.com/naturechemistry 15


© 2010 Macmillan Publishers Limited. All rights reserved.
depends on its amino acid sequence. Getting from sequence to structure to function is the common goal unifying all work in the field of protein design. Some of the earliest model proteins were built with the α-helix as the fundamental unit of structure. A seven-residue repeating sequence, ●○○●○○○, where the first and fourth amino acids are nonpolar (●) and the rest polar (○), will characteristically form multimeric, left-handed coiled-coil assemblies of amphipathic α-helices; nonpolar side-chains associate between helical elements to form a hydrophobic core⁴⁻⁸. Specific homo- and heterooligomers of α-helices can be achieved by rational design of core packing interactions and surface electrostatics⁹⁻¹¹. The power of this simple idea was demonstrated in a clever design where the superhelical twist was inverted by changing the motif to an eleven-residue repeat, thus maintaining a continuous hydrophobic core in a right-handed coiled-coil¹². This was an example of a true de novo design with no known natural counterpart at the time it was constructed.

Box 1 | Anatomy of an enzyme.

Hen egg white lysozyme (HEWL) was the first enzyme atomic structure to be solved by X-ray crystallography in 1965¹¹⁰. The three-dimensional structure highlights many of the physical characteristics of enzymes that make them unusually challenging proteins to design. HEWL functions in antibacterial defence and cleaves glycosidic linkages found in bacterial cell walls. The active site consists of two amino acids, a glutamic acid, which functions as a general acid/base, and an aspartic acid nucleophile. These are placed at the bottom of a deep substrate cleft, which confers specificity and poises the substrate over the active site. Many small-molecule catalysts function better in organic solvents, where the low bulk dielectric enhances electrostatic interactions. The cleft mimics this by isolating catalytic groups from bulk water, strengthening local electrostatic interactions. Accurate modelling of catalytic residue conformations and local electrostatics is key in designing effective artificial enzymes. Quantum mechanics methods have been useful in moving this area of design forward.

The protein fold must be sufficiently stable to form this cleft and preorganize active-site residues, which is one reason why enzymes are much larger than synthetic catalysts. The computational design of proteins with partially buried polar active sites is especially challenging. The protein fold must be able to absorb the energetic cost of desolvating polar active-site groups and stabilizing electrostatic interactions that favour catalysis.

Box figure labels: substrate cleft, water, active site, protein fold.

The majority of early attempts to construct enzymes used an amphipathic α-helix as the principal structural component. ‘Helichrome’ was designed to function as a hydroxylase, using four α-helices to form a hydrophobic substrate binding pocket over one face of an iron porphyrin to which the helices were covalently tethered¹³. Helichrome could convert aniline to p-aminophenol in the presence of NADPH with a kcat of 0.02 min⁻¹ and Km of 5.0 mM. (For most simple reactions, Km describes the affinity of the enzyme for the substrate and kcat reflects the rate of product formation under conditions where the enzyme is saturated by substrate.)

In work by Barbier and Brack, it was found that regular copolymers of leucine and lysine could hydrolyse polyribonucleotides¹⁴,¹⁵. Poly(Lys–Leu) and poly(Leu–Lys–Lys–Leu) had superior hydrolysis rates to random copolymers or those incorporating both D,L-Leu and D,L-Lys, demonstrating that either β-sheet structure (formed by an alternating polar/nonpolar pattern¹⁶) or α-helical structure had a significant role in improving activity. The regular structure provided a cationic surface to support binding of negatively charged substrates such as nucleic acids. Repulsive electrostatic interactions between adjacent lysine side-chains served to depress the pKa of the amino group, making it a better catalytic base.

These features were exploited in the design of ‘oxaldie’, a fifteen-residue α-helical peptide comprising mostly leucines and lysines that catalysed decarboxylation of oxaloacetate to pyruvate through an imine intermediate¹⁷ (Fig. 1). Possible amine donors in the covalent intermediate were lysine side-chains and the backbone amino terminus. The activity of the peptide was concentration dependent, suggesting that a helical multimer was the active form. Inserting a helix-breaking proline into the centre of the peptide significantly reduced activity, affirming the relationship between structure and activity. Rate enhancements of 10³ to 10⁴ over free amines were observed.

Figure 1 | From sequence to structure to function. A, A repeating seven-residue pattern of nonpolar (filled circles) and polar (open circles) residues will create a hydrophobic surface on one face of the helix. These surfaces can drive associations of helices, forming a hydrophobic core. B, The oxaldie enzymatic peptides¹⁷ make dual use of this repeating pattern, creating one hydrophobic, leucine-rich face and one cationic, lysine-rich face. Spatial clustering of lysines lowers their pKa and provides a surface for the negatively charged oxaloacetate substrate to bind. L = leucine (grey), K = lysine (red), A = alanine (white). Panel labels: heptad positions a–g, hydrophobic face, cationic face, Oxaldie-1.

Oxaldie was designed on the premise that the α-helix would raise the basicity of the active site, whether it was the amino terminus through interactions with the helix macrodipole, or the amino groups of lysine through electrostatic interactions between side chains. This made it possible to rationally improve activity by targeting structure. In one strategy, lysines were placed on the solvent-exposed face of avian pancreatic peptide, a 36-amino-acid protein with a helix packing against an extended polyproline chain. The resulting oxaldie-3 was monomeric, unlike its predecessor, eliminating the need for high peptide concentrations to elicit catalysis¹⁸. Modest improvements in kcat and Km were also achieved. In oxaldie-4, a bovine pancreatic peptide scaffold with intrinsic disulfides further stabilized the fold, resulting in even better kinetic parameters and activity at higher temperatures¹⁹. The same scaffold was adapted to make a miniature esterase²⁰.

The self-replicating peptides are an interesting case of α-helical enzymes where the catalyst does not form covalent intermediates with reactants²¹,²². The enzyme is a 32-residue amphipathic helix with a hydrophobic Leu/Val-rich face, which binds two 16-residue peptides that are N- and C-terminal halves of the same sequence. Binding reduces the entropic costs of chemical ligation between a C-terminal thioester leaving group and an N-terminal cysteine. The product also serves as a template, resulting in progressive amplification of catalytic activity. Although computational methods have not been applied in the design of these systems, they could potentially be used to improve catalytic efficiency, by developing sequences that balance the competing processes of template formation and product inhibition.
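The trade-off between template formation and product inhibition in the self-replicating peptides can be illustrated with a toy kinetic model. The sketch below is not taken from the cited studies: it assumes two fragments A and B that ligate to give template T, either slowly in solution (`k_bg`) or on a free template (`k_cat`), and it assumes that product inhibition arises from templates pairing into an inactive dimer with association constant `k_dimer`. All function names, rate constants and units are hypothetical and chosen only for illustration.

```python
import math


def free_template(total, k_dimer):
    """Free template when T + T <=> T2 with association constant k_dimer.

    Solves the mass balance total = free + 2 * k_dimer * free**2 (assumed model).
    """
    if k_dimer == 0.0:
        return total
    return (-1.0 + math.sqrt(1.0 + 8.0 * k_dimer * total)) / (4.0 * k_dimer)


def simulate(a0, b0, t0, k_bg, k_cat, k_dimer, dt=0.01, steps=20000):
    """Forward-Euler integration of a toy template-directed ligation A + B -> T.

    Returns the total template concentration after every time step.
    """
    a, b, t = a0, b0, t0
    trace = [t]
    for _ in range(steps):
        # Background ligation plus ligation catalysed by free (unpaired) template.
        rate = (k_bg + k_cat * free_template(t, k_dimer)) * a * b
        delta = min(rate * dt, a, b)  # never consume more fragment than remains
        a -= delta
        b -= delta
        t += delta
        trace.append(t)
    return trace


def time_to_reach(trace, level, dt=0.01):
    """First time at which the template concentration reaches `level`, or None."""
    for i, value in enumerate(trace):
        if value >= level:
            return i * dt
    return None


if __name__ == "__main__":
    # Compare no product inhibition with weak and strong template dimerization.
    for k_dimer in (0.0, 10.0, 100.0):
        trace = simulate(1.0, 1.0, 0.01, 1e-3, 1.0, k_dimer)
        print(k_dimer, time_to_reach(trace, 0.5))
```

In this simplified picture, seeding the reaction with template shortens the time to half conversion, whereas stronger template dimerization lengthens it. This is the kind of balance that a sequence-design calculation would need to tune.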
The α-helix continues to be a useful tool in the rational design of enzymes without the need for sophisticated computation. The polar/nonpolar sequence periodicity of the helix serves as a simple mechanism for promoting multimer associations, and for spatially clustering active-site residues. β-Hairpins and small sheets have not been used extensively as a platform for developing small peptide catalysts, given the increased difficulty of designing folded β-structures, which are stabilized by interactions well-separated in the primary sequence. Significant progress in β-sheet design may soon change this²³⁻²⁵. There are limitations to the chemical complexity one can achieve on the face of a single helix. As such, a number of enzyme designs have made use of more complex protein topologies, combining multiple helical elements.

Enzyme models with tertiary structure

Valuable progress has been made in the design of catalytic proteins where active-site and second-shell residues are donated by multiple structural elements. This significantly enhances the potential for chemical diversity, as well as providing space for binding sites to improve affinity and specificity. One productive scaffold is the helix–loop–helix motif, where a short turn connects two amphipathic α-helices, which then dimerize into a four-helix bundle²⁶. Because the two helices are part of the same peptide, they can host a greater diversity of chemical groups. The Baltzer lab has used this motif to target a number of reactions involving model RNA-like substrates (Fig. 2). Functionalizing one face of a helix–loop–helix at four sites with a zinc-triazacyclononane amino acid resulted in a peptide capable of catalysing transesterification of 2-hydroxypropyl-p-nitrophenyl phosphate 380 times faster under saturating conditions²⁷. The same scaffold has been used to hydrolyse phosphodiester bonds using histidine as a general acid/base instead of metals²⁸,²⁹.

Figure 2 | Helical designs with tertiary structure. Using short connecting loops, multi-helical structures provide the potential for greater chemical diversity at the active site, a binding surface for defining substrate specificity and a core for controlling the microenvironment of key catalytic residues. Two examples are the HN1 ribonuclease²⁹ with an active site of two histidines and four arginines (model structure), and α3W (ref. 31), a three-helix bundle that tunes the reduction potential of a tryptophan radical in the bundle core. Panel labels: HN1, α3W.

A three-helix design (helix–loop–helix–loop–helix) with a single tryptophan or tyrosine in the hydrophobic core was designed to measure how the protein environment modulates the stability of amino acid radicals, as inferred from electrochemical measurements. Amino acid radicals are important in a number of fundamental biochemical reactions including the evolution of molecular oxygen from water by photosystem II and the biosynthesis of DNA from ribonucleotides by ribonucleotide reductase. The low dielectric environment inside the protein was shown³⁰ to significantly raise the reduction potential of both Tyr and Trp. The structure of α3W also uncovered a cation–π interaction between the Trp and an Arg, which modelling suggested would also raise its reduction potential³¹. Although the α3Y/W peptides have not yet been developed into a catalyst, they showcase an important function of protein design: to provide minimal systems to help us understand the complex molecular forces involved in enzyme function.

Binary pattern design of an artificial oxygen-transporter

As discussed above for helichrome, oxaldie and others, simple folds such as helical bundles can be designed non-computationally with a high probability of success merely by placing polar amino acid side chains at solvent-exposed positions, and nonpolar amino acids at core positions, a process termed binary patterning⁶. Such design experiments typically result in molten globules that exhibit a stable secondary structure but lack a unique three-dimensional structure, most likely because the randomly selected core residues lack ‘knobs and holes’ type intercalation, allowing these stable elements of secondary structure to move independently³²⁻³⁴. Binary patterning coupled with explicitly designed polar interactions buried within the protein core can fix these elements of secondary structure, lifting these proteins into unique three-dimensional conformations, either using internal hydrogen bonding or metal–ligand interactions³⁴,³⁵.

This design approach has been exploited in an artificial oxygen transport protein, HP-7 (ref. 36). In this protein one of the two ligand histidines of a bis-histidine haem binding site in a binary patterned four-helix bundle scaffold was destabilized by the addition of core glutamic acid residues on the same helix. This intentional violation of the rules of binary patterning results in the stabilization of an alternative conformation wherein the ligand histidine detaches and the core glutamate residues rotate out into solution, opening a haem iron coordination site for oxygen binding (Fig. 3). This mechanism is similar to that observed for other hexacoordinate haemoglobins such as cytoglobin, neuroglobin or leghaemoglobin³⁷, although the structure and sequence of HP-7 are unrelated to any natural oxygen carrier.

Figure 3 | Stepwise oxygen binding to a binary-patterned four-helix bundle. In the HP-7 maquette³⁶, haem binding requires rotation of α-helices to present histidines (green) in the proper geometry. This forces the unfavourable burial of glutamates (red triangles) inside the nonpolar protein core. Release of one of the axial histidines allows the glutamates to interact with solvent and provides an open coordination site on the haem for oxygen binding. Figure reproduced with permission from ref. 36, © 2009 NPG. Panel labels: O₂, haem.

Measurement of the kinetic and thermodynamic constants for oxygen and carbon monoxide binding by HP-7 demonstrates that it performs equivalently to natural globins, with the unprecedented
exception that HP-7 binds molecular oxygen more tightly than carbon monoxide. This work not only demonstrates that an intuition-led approach like the modified binary-patterning method described here can lead to sophisticated biological function, it presages protein design’s ability to exceed the capabilities of the proteins found in nature.

Computational design of a de novo enzyme

Designing effective enzymes requires the ability to design structure. This is an area where computational methods have proved most useful. Fully automated designs of sequences to achieve target folds have been demonstrated multiple times over the past decade, from small proteins such as a zinc-finger-like structure that folds without metal³⁸ to Top7 (ref. 39), a novel mixed α/β fold with no natural counterpart. Although the fully automated design of an enzyme from the ground up has yet to be accomplished, recent work on computationally designed metalloenzymes has made significant advances in this area.

Computational design had an appreciable role in the DF series of proteins (due ferro, Italian for ‘two iron’). These were inspired by a class of natural oxygen activating enzymes with dinuclear iron clusters⁴⁰. A retrostructural analysis of metalloenzymes such as the R2 subunit of ribonucleotide reductase and methane monooxygenase revealed a simple underlying D₂ symmetry of the protein backbone, an antiparallel four-helix bundle, which reflected the symmetry of the active-site metals and substrates⁴¹ (Fig. 4). This made it possible to generate a scaffold de novo using an ideal α-helix and symmetry operations⁴². Key metal ligands (called keystone interactions) and second-shell interactions were placed on the scaffold. The protein was designed using a combination of manual and automated modelling. One of the computational tools used was ROC (repacking of cores), a program that optimizes a design by evaluating structure/sequence candidates on van der Waals energies of side-chain packing⁴³. The final model was a four-helix bundle composed of a homodimer of helix–loop–helix chains, much like those of Baltzer²⁶. The atomic structure of the design was solved by X-ray crystallography in the presence of zinc. Comparison of the structure and the model demonstrated that many of the design elements were successfully implemented. DF1 provided a clear link between sequence, structure and function, making it a rich platform for understanding how this class of proteins functions.

Figure 4 | Retrostructural analysis and design of a dinuclear metalloprotein. The DF series of designed metalloenzymes were built from structural analysis of dinuclear metal sites in proteins such as ribonucleotide reductase⁴¹. The two metals, two histidines and four glutamates that formed the active site could be described by two half-sites related by a C₂ symmetry axis. The same symmetry was found locally in the natural metalloenzymes and was used in the de novo design of a helix–loop–helix dimer, DF-1. Panel labels: ribonucleotide reductase R2, metal site symmetry (Glu, His, Glu), DF-1.

The structure of DF1 suggested that any potential catalytic activity was hampered by the presence of two leucines at the metal site. These were mutated to alanine in DF2, providing a substrate channel capable of binding small aromatic molecules⁴⁴. On adding 4-aminophenol (4-AP), it was shown that DF2 could catalyse two-electron oxidation to benzoquinone monoimine through a diferric intermediate in the presence of oxygen; multiple turnovers and thousandfold rate enhancements over the uncatalysed reaction were observed⁴⁵. Using an electron-rich phenol such as 4-AP provided a facile substrate for establishing structural features necessary for binding and catalysis. Additional optimization through combinatorial libraries or directed evolution could be used to develop more powerful metalloenzymes.

The DF series has also proved to be a useful tool in understanding how natural metalloenzymes may fold and function. One important feature of natural enzymes recapitulated by DF is preorganization. Preorganization of active-site residues improves catalytic activity by reducing the entropic cost of forming the enzyme–substrate complex. It also allows the enzyme to dictate the configuration of the complex, which may be important for targeting transition states and high-energy intermediates. In the absence of metal, the fold was highly similar and key metal binding residues were constrained close to the active site⁴⁶. Another feature that distinguishes enzymes from small-molecule catalysts is their ability to couple reactivity to protein motion, such as allosteric conformational changes that toggle an enzyme between active and inactive states. In high-resolution structures of DF1 bound to manganese, two coordination environments were found in the asymmetric unit: one where a solvent molecule bridges the two metals, and a second where two solvent molecules are bound trans to protein ligands⁴⁷,⁴⁸. This was coupled with a shift between the two helix–loop–helix motifs, suggesting that a large-scale sliding motion could tune active-site reactivity.

The DF series has also helped further general computational methods in enzyme design and simulation. In work to create a single chain protein, DFsc, it was found that correctly modelling the turn residues between helices could significantly improve design stability. This led to an in-depth characterization of helix–turn–helix structures in the Protein Data Bank (PDB), identifying key turn motifs that correlated to specific geometries of helix–helix pairs. A de novo computational simulation of charge-pair interactions on the surface of DF was used to design sequences that formed a 2A:2B heterotetramer between four separate helices⁴⁹. It was even possible to build an A:B:2C heterotetramer from three peptides using surface electrostatics⁵⁰. This will facilitate combinatorial approaches to develop better enzymes by mixing libraries of peptides and screening for activity. High-resolution active-site design has been initiated with DF1, subjecting a chemically simplified metal site to Car–Parrinello molecular dynamics simulations⁵¹. Snapshots from these simulations correspond well to high-resolution structures of DF1. Simulation methods could be used to generate structures of transition states and high-energy intermediates in catalysis. These could then be included as constraints in the redesign of DF1 for novel catalytic activity.

The retrostructural analysis strategy has also been applied to the computational design of a β-sheet metalloprotein and a four-helix bundle that binds arrays of non-natural porphyrin cofactors⁵²,⁵³. Key features such as focusing on keystone interactions, binding-site preorganization by second-shell ligands, and mirroring of protein fold and active-site symmetry have been implemented in these systems. Integration of computation and chemical intuition in the de novo design of proteins will further our understanding of the basic relationship between sequence, structure and reactivity.

Computational de novo active-site design

Although the concurrent design of structure and catalysis promises to broadly expand the scope of artificial enzymes, this area is still in its infancy. Even state-of-the-art designs such as DF still depend heavily on biological motifs and helical topologies. Thus, a number of labs have pursued the more tractable target of designing novel enzymatic functionality into existing protein scaffolds. Two major challenges in de novo active-site design are (1) identifying optimal locations on a protein to introduce the active-site residues
and (2) modelling the active site with sufficient accuracy to enable appropriate reactivity. These issues represent a long-standing conflict in protein design between speed and accuracy. Approximations in energy calculations and coarse-grained sampling methods may speed simulations, and allow broader sampling of sequence/structure combinations. However, the trade-off with accuracy may reduce the ability to discriminate between successful and unsuccessful designs.

Software such as METAL SEARCH⁵⁴⁻⁵⁶ and DEZYMER⁵⁷ successfully introduced metal-binding sites into novel locations on protein surfaces. Starting with a high-resolution structure of the target protein, sites in the backbone were selected based on their capacity to host metal-binding residues presenting thiol, carboxylate or imidazole ligands. These were then modelled onto the structure in various rotameric configurations to determine whether the coordination geometry of the metal was satisfied.

Despite the comparatively large energy of metal–ligand bond formation relative to other intramolecular forces in proteins, accurate design of metalloproteins presents a number of challenges, particularly with regard to specificity. For example, DEZYMER was used to convert the substrate binding region of maltose binding protein (MBP) into a zinc-binding site. MBP has two large domains connected by a flexible hinge region, allowing a cavity between the two domains to open and close. Maltose binds to this cavity and stabilizes the closed state (Fig. 5). Marvin and Hellinga systematically modelled groups of amino acids in the domain interface as potential zinc ligands, until a set of four amino acids was identified, two from each domain, which could potentially bind zinc when MBP was in the closed conformation⁵⁸. The best design bound zinc with micromolar affinity. Subsequent biophysical and structural characterization revealed an unanticipated mode of zinc binding that involved the open state⁵⁹. Two histidines of the four designed amino acid substitutions were shown to bind the zinc. A nearby aspartate carboxylate side-chain was found within four ångströms of the metal, too far to form direct, strong ligand–metal bonds. However, mutation of this Asp to Ala abolished affinity for zinc, indicating some role in binding. Thus a key challenge in metalloprotein design is to understand the role of second-shell and potentially weak first-shell interactions in affinity and specificity. The large-scale conformational transitions in MBP between open and closed states highlight another challenge: that of negative design. By optimizing side-chain metal–ligand geometries, programs seek to maximize the stability of a target state, usually using a crystal structure as a starting point. Negative design seeks to account for competing, off-target conformations and either explicitly or implicitly destabilize them. This is extremely difficult without a detailed knowledge of the structure of conformational microstates. Optimization of sequence over an ensemble of target and off-target protein conformations also exacerbates the already formidable number of states to be sampled. This is currently a very active area in protein design.

Figure 5 | Design versus structure of a Zn²⁺ binding MBP. a, Using the computer program DEZYMER⁵⁷, four substitutions were designed into the closed form of the maltose binding protein (MBP), which conferred affinity for zinc. Mutations (in yellow) were in the maltose binding site (white spheres). b, Crystal structures of the MBP variant demonstrated binding was in fact to a conformation closer to the open form, only involving half of the residues and an additional aspartate ligand (green). A = alanine, R = arginine, Y = tyrosine, W = tryptophan, E = glutamate, D = aspartate, H = histidine. Residue labels in the panels: E155, E340, D65, H63, H66.

… transition state with PNPA. To computationally model this reaction, a composite side chain composed of the histidine covalently linked to PNPA was introduced and conformationally sampled around accessible bond rotations. Amino acids adjacent to the site of the His-PNPA were allowed to mutate to alanine to facilitate substrate binding and recognition. The conformations of His-PNPA and surrounding side chains were optimized using Dead End Elimination, a powerful algorithm that significantly reduces the combinatorial complexity of multi-site optimizations by eliminating pairwise states that are provably incompatible with the global energy minimum⁶³. The top two scoring candidates, PZD1 and PZD2, were synthesized. PZD2 demonstrated significant rate enhancements over the uncatalysed reaction and saturation kinetics with increasing substrate concentration.

Although the thioredoxin-derived metalloenzymes and PZD2 had kinetic parameters well below those of natural enzymes and even catalytic antibodies, they were key first steps in developing computational methods for enzyme design. Important advances in the efficiency of computational methods such as Dead End Elimination are proving crucial in making these design problems tractable⁶⁴. Although initial extension of these computational methods to the design of a triose phosphate isomerase turned out to be unsuccessful, many important ideas put forth in these studies were incorporated into the recent, successful design of chemically ambitious artificial enzymes.

In parallel with computational advances, active-site designs continue to progress using rational, intuition-based strategies. Esterase
A series of His3Fe sites were introduced to thioredoxin in vari- activity was introduced to human carbonic anhydrase (HCAIII)
ous environments classified as grooves, shallow pockets and a deep through both protein and substrate engineering 65. The affinity of
pocket, allowing the effect of the protein microenvironment on HCAIII for benzenesulfonamide-containing molecules was used to
‘nascent’ enzymatic activity to be studied60,61. Superoxide dismutase model a substrate such that the scissile bond was positioned within
activity, the conversion of superoxide radical into molecular oxygen a cleft in the protein. Grafting a His dyad from previous de  novo
and hydrogen peroxide, correlated with local electrostatic interac- helix–loop–helix designs resulted in an HCAIII variant with
tions, where a net positive charge at the binding site was hypoth- enhanced esterase activity over wild-type.
esized to attract the O2– species. Further structural characterization
of these designs could be very informative, as the previous MBP The ROSETTA enzymes
example emphasizes. Where possible, an atomic-resolution struc- ROSETTA is a suite of computational tools developed in the labo-
ture of a de novo functional protein is an important step in under- ratory of David Baker for protein-structure prediction, protein–
standing how design elements correlate to mechanism. protein complex prediction and protein design. One of the key
A similar approach was used to build a non-metal proto-enzyme innovations in ROSETTA is its use of high-resolution structures
site into a thioredoxin scaffold that catalysed the hydrolysis of in the PDB as a ‘parts list’. Small fragments around ten residues in
paranitrophenol acetate (PNPA) into PNP and acetate62. This was length are assembled into larger molecules, drastically reducing the
accomplished by a histidine nucleophile that formed a high-energy conformational degrees of freedom to be sampled. This approach

rests on the assumption that structure and stability information is implicitly encoded locally within each fragment34. The global stability of a design is evaluated based on a scoring potential that combines physics-based energy terms such as van der Waals packing and knowledge-based energy terms derived from statistical analysis of amino acid interactions in the PDB66. Recently, ROSETTA was used to develop artificial enzymes that catalysed a retro-aldol reaction (Fig. 6) and a Kemp elimination67,68. These designs were impressive in the extent to which the relationship between structure and reactivity was modelled and characterized.

Figure 6 | Assembly line for the ROSETTA enzymes. a, Sketch of a reaction motif (motif III is shown, with Lys, His, Asp and a hydrophobic pocket), outlining key intermediates, general acid/base ligands and strategies for modulating catalytic residue pKas. b, QM calculations are used to optimize the geometry of a transition-state model (theozyme) including truncated active-site residues. c, Elaboration of catalytic residue rotamers creates an ensemble of active sites (10^18 states). d, These sites are matched to complementary surfaces on a family of target protein scaffolds (jelly roll and TIM barrel are shown). e, Promising designs are synthesized and characterized for activity. Figure reproduced from ref. 111, © 2008 NPG.

In the retro-aldolase, the goal was to break a carbon–carbon bond in a non-natural substrate, 4-hydroxy-4-(6-methoxy-2-naphthyl)-2-butanone69. The intended reaction was significantly more complex than previous designs, involving multiple transition states and intermediates to cleave the substrate and regenerate the active site. The reaction proceeded through an imine intermediate involving a lysine as a Schiff base, similar to the oxaldie decarboxylation of oxaloacetate. In oxaldie, the lysine nucleophile was stabilized through electrostatic interactions with other charged side chains or the helix macrodipole. The same mechanism was used in the first of four reaction motifs attempted, by placing a second lysine in the vicinity of the first. The other three used a hydrophobic pocket to lower the pKa of the lysine. A general acid/base was included to trigger cleavage of the carbon–carbon bond. Each motif used a different base: I, a Lys/Asp dyad; II, tyrosine; III, a His/Asp dyad; and IV, a water molecule. Attempting designs based on several reaction motifs not only increases the chance of a successful outcome, but also demonstrates how design can be used to test various hypotheses for catalysis.
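
To make the pruning idea behind Dead End Elimination (used above to optimize the His-PNPA designs) more concrete, the following minimal Python sketch applies the simplest single-rotamer elimination criterion to a toy energy table. It is only an illustration under stated assumptions: the function names, the data layout and all energy values are hypothetical and are not taken from ref. 63 or from any design package, and the pairwise variants of the criterion are not shown. A rotamer r at a position is discarded when some alternative t at the same position has a worst-case energy that is still lower than the best-case energy of r, so r cannot belong to the global minimum.

```python
from itertools import product

# Toy energy tables (hypothetical values, for illustration only).
# SELF_E[i][r]         : self energy of rotamer r at position i
# PAIR_E[(i, j)][r][s] : interaction of rotamer r at i with rotamer s at j (i < j)
SELF_E = [[0.0, 1.0, 10.0], [0.0, 2.0, 9.0], [1.0, 0.0, 8.0]]
PAIR_E = {
    (0, 1): [[-1.0, 0.5, 0.0], [0.2, -0.5, 0.0], [0.0, 0.0, 0.0]],
    (0, 2): [[0.3, -0.2, 0.0], [-0.8, 0.1, 0.0], [0.0, 0.0, 0.0]],
    (1, 2): [[0.0, -0.4, 0.0], [0.6, -0.9, 0.0], [0.0, 0.0, 0.0]],
}


def pair(pair_e, i, r, j, s):
    """Look up the pair energy regardless of the order of the two positions."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def dee_prune(self_e, pair_e):
    """Repeat the single-rotamer elimination test until nothing more is removed."""
    n = len(self_e)
    alive = [list(range(len(e))) for e in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]

            def best(r):   # lowest energy rotamer r could possibly contribute
                return self_e[i][r] + sum(
                    min(pair(pair_e, i, r, j, s) for s in alive[j]) for j in others)

            def worst(t):  # highest energy rotamer t could possibly contribute
                return self_e[i][t] + sum(
                    max(pair(pair_e, i, t, j, s) for s in alive[j]) for j in others)

            dead = [r for r in alive[i]
                    if any(best(r) > worst(t) for t in alive[i] if t != r)]
            if dead:
                alive[i] = [r for r in alive[i] if r not in dead]
                changed = True
    return alive


def total_energy(assign, self_e, pair_e):
    """Energy of one complete rotamer assignment."""
    n = len(assign)
    e = sum(self_e[i][assign[i]] for i in range(n))
    e += sum(pair_e[(i, j)][assign[i]][assign[j]]
             for i in range(n) for j in range(i + 1, n))
    return e


def brute_force_gmec(self_e, pair_e):
    """Exhaustive search, feasible only for tiny problems; used as a check."""
    choices = [range(len(e)) for e in self_e]
    return min(product(*choices), key=lambda a: total_energy(a, self_e, pair_e))


survivors = dee_prune(SELF_E, PAIR_E)
gmec = brute_force_gmec(SELF_E, PAIR_E)
print("surviving rotamers per position:", survivors)
print("global minimum by exhaustive search:", gmec)
```

In this toy case the exhaustive search has to score every combination, whereas the elimination test only compares bounds position by position; the rotamers that survive pruning still contain the global minimum found by exhaustive search.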

As previously discussed, two challenges in computational enzyme design are accurate modelling of active-site interactions and sufficient sampling of candidate backbone templates on protein scaffolds. To meet the first challenge, quantum mechanics calculations were performed on a minimal chemical representation of the protein–substrate complex70. To simplify active-site construction, a composite transition state was made that combined optimal geometries of the carbinolamine alcohol and the bond-breaking state. The remainder of the active-site side chains were then built and sampled for rotameric degrees of freedom. After all active-site and second-shell residues were modelled, a total of anywhere from 10^13 to 10^18 potential sequence/structure combinations were generated, depending on the reaction motif considered.

At this point in the design, the researchers had built a huge ensemble of disembodied active sites. The next challenge was to compare these active sites to a set of potential scaffolds in seventy protein structures. These structures represented a diverse set of protein folds including triose phosphate isomerase (TIM) barrels, jelly rolls, β-propellers, lipocalins and periplasmic binding proteins. To model 10^18 active sites onto all possible combinations of backbone positions would be computationally intractable. An algorithm called RosettaMatch made use of geometric hashing to reduce the problem to one that scaled linearly with the number of states71,72. (For a very readable review of geometric hashing, we recommend ref. 73.) Still, this was an immense calculation, and other resources such as distributed computing over thousands of personal computers volunteered through the Rosetta@Home project made this possible. After the first stage of matching, the number of designs was around 180,000. This was further reduced by computational redesign of adjacent amino acids surrounding the active site, in order to maintain structural integrity of the overall protein fold.

After an extensive computational vetting process, searching through four reaction motifs, a billion billion active-site configurations and a diverse set of scaffolds, seventy proteins were synthesized. Each design contained anywhere from eight to twenty mutations relative to the wild-type protein. Impressively, nearly half showed some aldolase activity. Interestingly, the successful enzymes were not equally distributed over mechanism or fold. Only motifs III and IV, which used histidine or water as the general acid/base, were successful, suggesting the other mechanisms were either chemically unfeasible or difficult to design. Similarly, only the jelly-roll and TIM barrel were productive, indicating these folds may have intrinsic geometric properties favouring the design of catalytic sites (the capacity for a fold to accommodate multiple sequences is referred to as designability, and quantifying this parameter has been attempted for simple systems)74,75. Rate enhancements of up to 10^4 were achieved. Structures of designs solved by X-ray crystallography showed the atomic accuracy of the computational models, an essential feature for furthering rational design of these enzymes. Designs were verified by mutagenesis of active-site residues to ablate catalysis. Careful analysis verified saturation kinetics. Although these designs fall short of retro-aldolase activity seen in catalytic antibodies69,76, they remove a number of constraints on the design of artificial enzymes, opening the possibility for reactivity on a broad spectrum of protein scaffolds.

This approach can also be combined with in vitro evolution to improve catalytic efficiency, as was shown with a set of Kemp elimination enzymes also designed with the ROSETTA platform68. The computational approach to designing Kemp elimination enzymes was very similar to the retro-aldolase. Kemp elimination is a model reaction for proton abstraction from a carbon. Successful designs were again found in TIM-barrel folds, with rate enhancements of up to 10^5, saturation kinetics in certain cases and greater than seven turnovers. In vitro evolution of one of the designs, KE07, by cycles of mutagenesis and activity screening resulted in a 200-fold improvement in kcat/Km. Residues involved in catalysis were unchanged; rather, most mutations were found in amino acids adjacent to the active site. High-resolution crystal structures of KE07 were superimposable on the computational model, allowing the researchers to develop specific hypotheses regarding the mechanism of improved enzymatic function by evolution. This highlights the potential advantage of combining rational design with directed evolution.

As with retro-aldolase designs, KE07 only demonstrates kinetic parameters equivalent to or below those of catalytic antibodies77, or even non-specific reaction rates with ‘off the shelf’ proteins such as bovine and human serum albumins78. The real success of the ROSETTA enzymes is not in the computational design of catalysts of practical value, a goal that has still to be achieved by any group. Rather, they are among the earliest demonstrations of the ability to harness the three-dimensional protein fold to stabilize energetically unfavourable active sites79, for example by locating a general base in a hydrophobic microenvironment. This is an important component of enzyme design.

A two-pronged approach combining laboratory evolution with structure-based computation may increase the likelihood of designing synthetic enzymes with activities approaching those of natural counterparts. One strategy is to reduce the number of sequences to be experimentally characterized. This has been approached in multiple ways: identifying positions in a protein that can tolerate mutations80, identifying protein domain boundaries in order to generate libraries of permuted chimeric species81 and predicting amino acid substitutions that preserve function while maximizing chemical diversity of sampled sequences82. The Tidor laboratory took an inverse approach, using computational sequence optimization to enhance antibody binding after affinity maturation against two targets (lysozyme and human epidermal growth factor receptor)83.

Moving enzyme design out of the active site
KE07 models a potentially powerful strategy of combining computational design and laboratory evolution towards novel enzymes. Below, we highlight other areas where rational methods may introduce new functionality that would be difficult using laboratory evolution alone.

The enzymes designed so far have each focused on the creation of a transition-state-stabilizing active site as the basis for their catalytic function. Thus, they necessarily have the same limitations in scope as have been noted for catalytic antibodies elicited with transition-state analogues76, in particular their modest catalytic rate accelerations that pale in comparison to the values of up to 10^23 observed in natural enzymes84. This is at least in part due to the fact that catalysts that bind to transition states too tightly suffer from product inhibition, setting a limit on the maximal rate acceleration possible using this method alone85. For rationally designed enzymes to surpass these limitations, additional strategies, many derived from natural enzyme mechanism, must be employed86. Although these strategies incur an additional level of complexity in rational design, they are also responsible for much of the extra catalytic power observed in protein catalysts. Some of these further mechanistic enhancements may offer an entryway into catalytic complexity that is not only impossible in catalytic antibodies, but also not accessible to directed evolution approaches. Furthermore, it seems likely that many of these strategies are themselves open to implementation using chemical intuition.

Multiple substrates. Perhaps the simplest mechanism by which enzymes accelerate chemical reactions is by using binding energy to compensate for the entropy loss incurred in bimolecular reactions87. Merely holding two substrates in proximity to each other at the correct orientation for a productive reaction can engender a 10^8-fold rate acceleration in the absence of any transition-state-stabilizing protein–substrate interactions. Such a strategy proved successful in the self-replicating peptide system21. This, coupled with transition-state stabilization, could elevate rate enhancements to levels similar to those observed in natural enzymes. Designing a single active site
that will bind, orient and activate two substrates simultaneously is, however, a more difficult problem. It seems likely that simpler bisubstrate mechanisms, either an ordered mechanism with a covalent intermediate, or a single protein with two active sites coupled by the channelling of an intermediate, would be more accessible to current design technology3.

Timing in multiple substrate reactions. An important facet of the mechanism of many of nature’s more complicated catalysts is the temporal control of substrate binding and transfer events. One merely has to look at the exquisite engineering apparent in the multiple electron and proton transfers in photosynthesis, respiration and nitrogen fixation to see the advantages conferred by the ability to control the timing and energetics of the intermediates in multiple-substrate reactions88. As these reaction mechanisms would require the simultaneous optimization of several different catalytic events at several different sites, it is unlikely that such would arise from laboratory timescale evolution. A more likely situation is one where an initial enzymatic scaffold is designed explicitly and then the kinetics and thermodynamics of the intermediates are further optimized using directed evolution.

Small-scale enzyme motions. The analysis of hydrogen tunnelling in enzymes has established unequivocally that protein dynamics can play a large part in rates of enzymatic catalysis89. Computational analysis of enzyme–substrate complexes using molecular mechanics has estimated the contribution of the dynamics of ‘near attack complexes’ to catalytic function in the absence of atomic tunnelling to be as high as 10^3 (ref. 90). However, these nanosecond timescale motions are an intrinsic property of all biopolymers. It is not clear to what extent choreographed fast dynamics promotes catalysis in different natural enzymes. A recent report of the dynamic analysis of each intermediate state in the catalytic mechanism of dihydrofolate reductase91 demonstrated that in each kinetic intermediate, the protein accessed only the conformations present in the current state and the states immediately before and after. These findings indicate that for at least some enzymes such dynamics are considerably choreographed.

Because the origins and even the consequences of these motions are not well understood, this contribution to catalysis will be difficult to reproduce. As the optimization brought about by directed evolution techniques is often a result of mutations distal to the active site that may impact protein motions, it seems that this method is currently the best approach for increasing catalytic efficiency with small-scale dynamics. Designed enzymes offer a uniquely adaptable scaffold on which to test our ideas about such dynamic motions, and artificial proteins will no doubt prove to be central in the development of our understanding of enzyme dynamics.

Large-scale enzyme motions. Multi-ångström, millisecond timescale protein motions are a critical component of many natural enzyme mechanisms92. One well-characterized example is the Ω loop governing the function of adenylate kinase93,94, where this loop has an open conformation, competent for substrate binding and product release, and a closed conformation, competent for catalysis (see Fig. 7). This hinge-opening motion on a ‘lid’ over the active site is a common motif in protein function, particularly the TIM-barrel family of enzymes95. There are even larger motions, involving entire protein domains, used by complex multidomain enzymes: the ‘escapement’ mechanisms that have been well characterized in the cytochromes bc1 and the flavocytochrome P450 BM3 (ref. 96) are simple large-scale motions. Intuitive approaches can be envisaged for incorporating large-scale motions into a protein design, such as the introduction of flexible loops. Such features are unlikely to result directly from laboratory timescale evolution approaches without some degree of initial design.

Figure 7 | Millisecond timescale motions in adenylate kinase. Adenylate kinase maintains intracellular concentrations of adenylate nucleotides by converting an ATP and an AMP into two ADP molecules. Binding and catalysis require a significant conformational rearrangement93,94. The two domains of adenylate kinase are shown in red and grey: left, the unbound form; right, bound to a dinucleotide analogue shown in blue.

The helical rotation discussed above in the mechanism of the artificial neuroglobin HP-7 is exemplary of the types of problems de novo enzyme design can target relative to catalytic antibody formation or directed evolution. Antibodies are relatively inflexible and evolutionary methods are unlikely to generate multiple conformations simultaneously. As such conformational switching is an essential component of the function of many protein catalysts, these motions are best incorporated at the design level.

Allostery. Cooperative phenomena are fundamental in many biological functions such as oxygen transport and metabolic and transcriptional regulation97. The ability to incorporate allosteric regulation into artificial enzymes will enable the creation of medicinally useful in vivo catalysts that can be regulated either by metabolites or using exogenous small-molecule effectors. Such behaviour is in most cases a more complicated version of the large-scale conformational switching described above, only in this case the large-scale protein motion is actuated by the binding of an effector molecule.

Keeping energetic intermediates from the cellular environment. Anyone who has observed a bioinorganic chemist at work in a dry glovebox can appreciate the fragility of metalloenzyme active sites. Such enzymes employ reactive intermediates that must be screened from water, oxygen and reactive species in the cellular environment such as glutathione and superoxide. In fact, the failure of many initial attempts at metalloenzyme design can be attributed to the desire to make these model proteins as small as possible98. Larger proteins have sufficiently sized hydrophobic cores that they can completely encapsulate these intermediates in a non-reactive environment, screening them from solution and lengthening their lifetimes.

Catalytic foldamers
Despite the intensive focus on protein enzymes, the first catalytic biomolecules may have been based on RNA rather than amino acids99. Nucleotides can carry out metal-assisted reactions, act as general acid–bases and function to orient substrates and isolate them from solvent, much like the most sophisticated catalytic proteins100. Catalytic activity has also been found in certain bacterial carbohydrates101. Evidently, proteins do not have a monopoly on biological catalysis. Although the successful design of a new protein is a rewarding experience in itself, often real advances are in our understanding of the basic molecular forces that guide structure and function. Given that catalysis is not limited to proteins, it is important for us to ask whether insights gained from current de novo proteins are idiosyncratic to proteins, or whether we are
learning broader lessons about molecular design. This is the goal of ‘foldamer’ research, the development of novel polymeric systems that fold into unique, three-dimensional structures102,103.

Towards designing a functional foldamer, it is important to remember the lessons of oxaldie and other catalytic peptides: a firm grasp of molecular structure is an important prerequisite to introducing function. Computational studies of foldamers are still in their infancy. Several groups have used simulation approaches to study the conformational space available to peptidomimetics incorporating d-amino acids104,105 or β-amino acids106,107, as well as completely non-biological foldamers such as m-phenylene ethylenes (mPE)108. These studies establish non-natural scaffolds that can then be functionalized to facilitate reactivity. The mPEs are promising owing to their capacity to form helical cavities that can be functionalized with catalytic residues and serve to isolate substrate from solvent109. This is an exciting new direction for molecular design.

Although the Wright brothers did not use a computer to build the first airplane, the aerospace industry has since greatly benefitted from sophisticated simulation and design software. Clearly, both the chemist and the computer will have important roles in the future of artificial enzymes. Design by chemical intuition tests our understanding of basic rules, that is, binary patterning, molecular forces, metal coordination and catalytic mechanisms. These rules are then codified to allow the computational design of increasingly complex systems. Designer enzymes are a useful test of our understanding of how proteins fold and function. They also provide a rational path towards inexpensive, non-toxic catalysts that perform novel chemistry.

References
1. Davie, E. A. C., Mennen, S. M., Xu, Y. & Miller, S. J. Asymmetric catalysis mediated by synthetic peptides. Chem. Rev. 107, 5759–5812 (2007).
2. List, B. Proline-catalyzed asymmetric reactions. Tetrahedron 58, 5573–5590 (2002).
3. Koder, R. L. & Dutton, P. L. Intelligent design: the de novo engineering of proteins with specified functions. Dalton Trans. 25, 3045–3051 (2006).
4. Lim, V. in Protein Folding, 28th Conference of the German Biochemical Society (ed. Jaenicke, R.) 149–166 (Elsevier, 1979).
5. Crick, F. H. C. The packing of alpha-helices: simple coiled coils. Acta Crystallogr. 6, 689–697 (1953).
6. Kamtekar, S., Schiffer, J. M., Xiong, H., Babik, J. M. & Hecht, M. H. Protein design by binary patterning of polar and nonpolar amino acids. Science 262, 1680–1685 (1993).
7. Lau, S. Y. M., Taneja, A. K. & Hodges, R. S. Synthesis of a model protein of defined secondary and quaternary structure. J. Biol. Chem. 259, 13253–13261 (1984).
8. DeGrado, W. F. & Lear, J. D. Induction of peptide conformation at apolar/water interfaces. 1. a study with model peptides of defined hydrophobic periodicity. J. Am. Chem. Soc. 107, 7684–7689 (1985).
9. Bryson, J. W. et al. Protein design: a hierarchic approach. Science 270, 935–941 (1995).
10. Handel, T. M., Williams, S. A. & DeGrado, W. F. Metal ion-dependent modulation of the dynamics of a designed protein. Science 261, 879–885 (1993).
11. Lovejoy, B. et al. Crystal structure of a synthetic triple-stranded alpha-helical bundle. Science 259, 1288–1293 (1993).
12. Harbury, P. B., Plecs, J. J., Tidor, B., Alber, T. & Kim, P. S. High-resolution protein design with backbone freedom. Science 282, 1462–1467 (1998).
13. Sasaki, T. & Kaiser, E. T. Helichrome: synthesis and enzymatic activity of a designed hemeprotein. J. Am. Chem. Soc. 111, 380–381 (1989).
14. Barbier, B. & Brack, A. Basic polypeptides accelerate the hydrolysis of ribonucleic acids. J. Am. Chem. Soc. 110, 6880–6882 (1988).
15. Barbier, B. & Brack, A. Conformation-controlled hydrolysis of polyribonucleotides by sequential basic polypeptides. J. Am. Chem. Soc. 114, 3511–3515 (1992).
16. Brack, A. & Spach, G. Multiconformational synthetic polypeptides. J. Am. Chem. Soc. 103, 6319–6323 (1981).
17. Johnsson, K., Allemann, R. K., Widmer, H. & Benner, S. A. Synthesis, structure and activity of artificial, rationally designed catalytic polypeptides. Nature 365, 530–532 (1993).
18. Taylor, S. E., Rutherford, T. J. & Allemann, R. K. Design, synthesis and characterisation of a peptide with oxaloacetate decarboxylase activity. Bioorg. Med. Chem. Lett. 11, 2631–2635 (2001).
19. Taylor, S. E., Rutherford, T. J. & Allemann, R. K. Design of a folded, conformationally stable oxaloacetate decarboxylase. J. Chem. Soc. Perkin Trans. 2, 751–755 (2002).
20. Nicoll, A. J. & Allemann, R. K. Nucleophilic and general acid catalysis at physiological pH by a designed miniature esterase. Org. Biomol. Chem. 2, 2175–2180 (2004).
21. Lee, D. H., Severin, K., Yokobayashi, Y. & Ghadiri, M. R. Emergence of symbiosis in peptide self-replication through a hypercyclic network. Nature 390, 591–594 (1997).
22. Saghatelian, A., Yokobayashi, Y., Soltani, K. & Ghadiri, M. R. A chiroselective peptide replicator. Nature 409, 797–801 (2001).
23. Butterfield, S. M., Cooper, W. J. & Waters, M. L. Minimalist protein design: a beta-hairpin peptide that binds ssDNA. J. Am. Chem. Soc. 127, 24–25 (2005).
24. Butterfield, S. M., Goodman, C. M., Rotello, V. M. & Waters, M. L. A peptide flavoprotein mimic: flavin recognition and redox potential modulation in water by a designed beta hairpin. Angew. Chem. Int. Ed. 43, 724–727 (2004).
25. Hughes, R. M. & Waters, M. L. Model systems for beta-hairpins and beta-sheets. Curr. Opin. Struct. Biol. 16, 514–524 (2006).
26. Olofsson, S. & Baltzer, L. Structure and dynamics of a designed helix-loop-helix dimer in dilute aqueous trifluoroethanol solution. A strategy for NMR spectroscopic structure determination of molten globules in the rational design of native-like proteins. Fold. Des. 1, 347–356 (1996).
27. Rossi, P., Tecilla, P., Baltzer, L. & Scrimin, P. De novo metallonucleases based on helix-loop-helix motifs. Chem. Eur. J. 10, 4163–4170 (2004).
28. Razkin, J., Lindgren, J., Nilsson, H. & Baltzer, L. Enhanced complexity and catalytic efficiency in the hydrolysis of phosphate diesters by rationally designed helix-loop-helix motifs. ChemBioChem 9, 1975–1984 (2008).
29. Razkin, J., Nilsson, H. & Baltzer, L. Catalysis of the cleavage of uridine 3’-2,2,2-trichloroethylphosphate by a designed helix-loop-helix motif peptide. J. Am. Chem. Soc. 129, 14752–14758 (2007).
30. Tommos, C., Skalicky, J. J., Pilloud, D. L., Wand, A. J. & Dutton, P. L. De novo proteins as models of radical enzymes. Biochemistry 38, 9495–9507 (1999).
31. Dai, Q. H. et al. Structure of a de novo designed protein model of radical enzymes. J. Am. Chem. Soc. 124, 10952–10953 (2002).
32. Gibney, B. R., Rabanal, F., Skalicky, J. J., Wand, A. J. & Dutton, P. L. Design of a unique protein scaffold for maquettes. J. Am. Chem. Soc. 119, 2323–2324 (1997).
33. Gibney, B. R., Rabanal, F., Skalicky, J. J., Wand, A. J. & Dutton, P. L. Iterative protein redesign. J. Am. Chem. Soc. 121, 4952–4960 (1999).
34. Simons, K. T., Kooperberg, C., Huang, E. & Baker, D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and bayesian scoring functions. J. Mol. Biol. 268, 209–225 (1997).
35. Koder, R. L. et al. Native-like structure in designed four helix bundles driven by buried polar interactions. J. Am. Chem. Soc. 128, 14450–14451 (2006).
36. Koder, R. L. et al. Design and engineering of an O2 transport protein. Nature 458, 305–309 (2009).
37. Kundu, S., Trent, J. T. & Hargrove, M. S. Plants, humans and hemoglobins. Trends Plant Sci. 8, 387–393 (2003).
38. Dahiyat, B. I. & Mayo, S. L. De novo protein design: fully automated sequence selection. Science 278, 82–87 (1997).
39. Kuhlman, B. et al. Design of a novel globular protein fold with atomic-level accuracy. Science 302, 1364–1368 (2003).
40. Wallar, B. J. & Lipscomb, J. D. Dioxygen activation by enzymes containing binuclear non-heme iron clusters. Chem. Rev. 96, 2625–2658 (1996).
41. Lombardi, A. et al. Retrostructural analysis of metalloproteins: application to the design of a minimal model for diiron proteins. Proc. Natl Acad. Sci. USA 97, 6298–6305 (2000).
42. Summa, C. M., Lombardi, A., Lewis, M. & DeGrado, W. F. Tertiary templates for the design of diiron proteins. Curr. Opin. Struct. Biol. 9, 500–508 (1999).
43. Lazar, G. A., Desjarlais, J. R. & Handel, T. M. De novo design of the hydrophobic core of ubiquitin. Protein Sci. 6, 1167–1178 (1997).
44. Di Costanzo, L. et al. Toward the de novo design of a catalytically active helix bundle: a substrate-accessible carboxylate-bridged dinuclear metal center. J. Am. Chem. Soc. 123, 12749–12757 (2001).
45. Kaplan, J. & DeGrado, W. F. De novo design of catalytic proteins. Proc. Natl Acad. Sci. USA 101, 11566–11570 (2004).
46. Maglio, O., Nastri, F., Pavone, V., Lombardi, A. & DeGrado, W. F. Preorganization of molecular binding sites in designed diiron proteins. Proc. Natl Acad. Sci. USA 100, 3772–3777 (2003).
47. Geremia, S. et al. Response of a designed metalloprotein to changes in metal ion coordination, exogenous ligands, and active site volume determined by X-ray crystallography. J. Am. Chem. Soc. 127, 17266–17276 (2005).
48. DeGrado, W. F. et al. Sliding helix and change of coordination geometry in a model di-MnII protein. Angew. Chem. Int. Ed. 42, 417–420 (2003).
49. Summa, C. M., Rosenblatt, M. M., Hong, J. K., Lear, J. D. & DeGrado, W. F. Computational de novo design, and characterization of an A2B2 diiron protein. J. Mol. Biol. 321, 923–938 (2002).

50. Marsh, E. N. & DeGrado, W. F. Noncovalent self-assembly of a heterotetrameric diiron protein. Proc. Natl Acad. Sci. USA 99, 5150–5154 (2002).
51. Papoian, G. A., DeGrado, W. F. & Klein, M. L. Probing the configurational space of a metalloprotein core: an ab initio molecular dynamics study of Duo Ferro 1 binuclear Zn cofactor. J. Am. Chem. Soc. 125, 560–569 (2003).
52. Cochran, F. V. et al. Computational de novo design and characterization of a four-helix bundle protein that selectively binds a nonbiological cofactor. J. Am. Chem. Soc. 127, 1346–1347 (2005).
53. Nanda, V. et al. De novo design of a redox-active minimal rubredoxin mimic. J. Am. Chem. Soc. 127, 5804–5805 (2005).
54. Clarke, N. D. & Yuan, S. M. Metal search: a computer program that helps design tetrahedral metal-binding sites. Proteins 23, 256–263 (1995).
55. Klemba, M., Gardner, K. H., Marino, S., Clarke, N. D. & Regan, L. Novel metal-binding proteins by design. Nature Struct. Biol. 2, 368–373 (1995).
56. Regan, L. & Clarke, N. D. A tetrahedral zinc(II)-binding site introduced into a designed protein. Biochemistry 29, 10878–10883 (1990).
57. Hellinga, H. W. & Richards, F. M. Construction of new ligand binding sites in proteins of known structure. I. Computer-aided modeling of sites with pre-defined geometry. J. Mol. Biol. 222, 763–785 (1991).
58. Marvin, J. S. & Hellinga, H. W. Conversion of a maltose receptor into a zinc biosensor by computational design. Proc. Natl Acad. Sci. USA 98, 4955–4960 (2001).
59. Telmer, P. G. & Shilton, B. H. Structural studies of an engineered zinc biosensor reveal an unanticipated mode of zinc binding. J. Mol. Biol. 354, 829–840 (2005).
60. Benson, D. E., Wisz, M. S. & Hellinga, H. W. Rational design of nascent metalloenzymes. Proc. Natl Acad. Sci. USA 97, 6292–6297 (2000).
61. Pinto, A. L., Hellinga, H. W. & Caradonna, J. P. Construction of a catalytically active iron superoxide dismutase by rational protein design. Proc. Natl Acad. Sci. USA 94, 5562–5567 (1997).
62. Bolon, D. N. & Mayo, S. L. Enzyme-like proteins by computational design. Proc. Natl Acad. Sci. USA 98, 14274–14279 (2001).
63. Desmet, J., Demaeyer, M., Hazes, B. & Lasters, I. The dead-end elimination theorem and its use in protein side-chain positioning. Nature 356, 539–542 (1992).
64. Looger, L. L. & Hellinga, H. W. Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. J. Mol. Biol. 307, 429–445 (2001).
65. Host, G. E., Razkin, J., Baltzer, L. & Jonsson, B. H. Combined enzyme and substrate design: grafting of a cooperative two-histidine catalytic motif into a protein targeted at the scissile bond in a designed ester substrate. ChemBioChem 8, 1570–1576 (2007).
66. Kuhlman, B. & Baker, D. Native protein sequences are close to optimal for their structures. Proc. Natl Acad. Sci. USA 97, 10383–10388 (2000).
67. Jiang, L. et al. De novo computational design of retro-aldol enzymes. Science 319, 1387–1391 (2008).
68. Rothlisberger, D. et al. Kemp elimination catalysts by computational enzyme design. Nature 453, 190–195 (2008).
69. Tanaka, F., Fuller, R., Shim, H., Lerner, R. A. & Barbas, C. F. Evolution of aldolase antibodies in vitro: correlation of catalytic activity and reaction-based selection. J. Mol. Biol. 335, 1007–1018 (2004).
70. Tantillo, D. J., Chen, J. & Houk, K. N. Theozymes and compuzymes: theoretical models for biological catalysis. Curr. Opin. Chem. Biol. 2, 743–750 (1998).
71. Zanghellini, A. et al. New algorithms and an in silico benchmark for computational enzyme design. Protein Sci. 15, 2785–2794 (2006).
72. Lamdan, Y., Schwartz, J. T. & Wolfson, H. J. Affine invariant model-based object recognition. IEEE Trans. Robot. Automat. 6, 578–589 (1990).
73. Wolfson, H. J. & Rigoutsos, I. Geometric hashing: an overview. IEEE Comp. Sci. Eng. 4, 10–21 (1997).
74. Li, H., Helling, R., Tang, C. & Wingreen, N. Emergence of preferred structures in a simple model of protein folding. Science 273, 666–669 (1996).
75. Li, H., Tang, C. & Wingreen, N. S. Are protein folds atypical? Proc. Natl Acad. Sci. USA 95, 4987–4990 (1998).
76. Hilvert, D. Critical analysis of antibody catalysis. Annu. Rev. Biochem. 69, 751–792 (2000).
77. Thorn, S. N., Daniels, R. G., Auditor, M. T. & Hilvert, D. Large rate accelerations in antibody catalysis by strategic use of haptenic charge. Nature 373, 228–230 (1995).
78. Hollfelder, F., Kirby, A. J. & Tawfik, D. S. Off-the-shelf proteins that rival tailor-made antibodies as catalysts. Nature 383, 60–62 (1996).
79. Warshel, A. Electrostatic origin of the catalytic power of enzymes and the role of preorganized active sites. J. Biol. Chem. 273, 27035–27038 (1998).
80. Voigt, C. A., Mayo, S. L., Arnold, F. H. & Wang, Z. G. Computational method to reduce the search space for directed protein evolution. Proc. Natl Acad. Sci. USA 98, 3778–3783 (2001).
81. Voigt, C. A., Martinez, C., Wang, Z. G., Mayo, S. L. & Arnold, F. H. Protein building blocks preserved by recombination. Nature Struct. Biol. 9, 553–558 (2002).
82. Treynor, T. P., Vizcarra, C. L., Nedelcu, D. & Mayo, S. L. Computationally designed libraries of fluorescent proteins evaluated by preservation and diversity of function. Proc. Natl Acad. Sci. USA 104, 48–53 (2007).
83. Lippow, S. M., Wittrup, K. D. & Tidor, B. Computational design of antibody-affinity improvement beyond in vivo maturation. Nature Biotechnol. 25, 1171–1176 (2007).
84. Radzicka, A. & Wolfenden, R. A proficient enzyme. Science 267, 90–93 (1995).
85. Lienhard, G. E. Enzymatic catalysis and transition-state theory. Science 180, 149–154 (1973).
86. Kraut, D. A., Carroll, K. S. & Herschlag, D. Challenges in enzyme mechanism and energetics. Annu. Rev. Biochem. 72, 517–571 (2003).
87. Jencks, W. P. Catalysis in Chemistry and Enzymology (McGraw-Hill, 1969).
88. Noy, D., Moser, C. C. & Dutton, P. L. Design and engineering of photosynthetic light-harvesting and electron transfer using length, time, and energy scales. Biochim. Biophys. Acta 1757, 90–105 (2006).
89. Nagel, Z. D. & Klinman, J. P. Tunneling and dynamics in enzymatic hydride transfer. Chem. Rev. 106, 3095–3118 (2006).
90. Bruice, T. C. Computational approaches: Reaction trajectories, structures, and atomic motions. Enzyme reactions and proficiency. Chem. Rev. 106, 3119–3139 (2006).
91. Boehr, D. D., McElheny, D., Dyson, H. J. & Wright, P. E. The dynamic energy landscape of dihydrofolate reductase catalysis. Science 313, 1638–1642 (2006).
92. Gerstein, M., Lesk, A. M. & Chothia, C. Structural mechanisms for domain movements in proteins. Biochemistry 33, 6739–6749 (1994).
93. Henzler-Wildman, K. A. et al. A hierarchy of timescales in protein dynamics is linked to enzyme catalysis. Nature 450, 913–916 (2007).
94. Henzler-Wildman, K. A. et al. Intrinsic motions along an enzymatic reaction trajectory. Nature 450, 838–844 (2007).
95. Xiang, J. Y., Jung, J. Y. & Sampson, N. S. Entropy effects on protein hinges: The reaction catalyzed by triosephosphate isomerase. Biochemistry 43, 11436–11445 (2004).
96. Munro, A. W. et al. P450BM3: the very model of a modern flavocytochrome. Trends Biochem. Sci. 27, 250–257 (2002).
97. Wyman, J. & Gill, S. J. Binding and Linkage (University Science Books, 1990).
98. Anderson, J. L. R., Koder, R. L., Moser, C. C. & Dutton, P. L. Controlling complexity and water penetration in functional de novo protein design. Biochem. Soc. Trans. 36, 1106–1111 (2008).
99. Joyce, G. F. The antiquity of RNA-based evolution. Nature 418, 214–221 (2002).
100. Doudna, J. A. & Lorsch, J. R. Ribozyme catalysis: not different, just worse. Nature Struct. Mol. Biol. 12, 395–402 (2005).
101. Lee, S. & Jung, S. Cyclosophoraose as a catalytic carbohydrate for methanolysis. Carbohydr. Res. 339, 461–468 (2004).
102. Hill, D. J., Mio, M. J., Prince, R. B., Hughes, T. S. & Moore, J. S. A field guide to foldamers. Chem. Rev. 101, 3893–4011 (2001).
103. Goodman, C. M., Choi, S., Shandler, S. & DeGrado, W. F. Foldamers as versatile frameworks for the design and evolution of function. Nature Chem. Biol. 3, 252–262 (2007).
104. Nanda, V. & DeGrado, W. F. Computational design of heterochiral peptides against a helical target. J. Am. Chem. Soc. 128, 809–816 (2006).
105. Nanda, V. & DeGrado, W. F. Simulated evolution of emergent chiral structures in polyalanine. J. Am. Chem. Soc. 126, 14459–14467 (2004).
106. Baldauf, C., Gunther, R. & Hoffmann, H.-J. Helix formation in α,γ- and β,γ-hybrid peptides: Theoretical insights into mimicry of α- and β-peptides. J. Org. Chem. 71, 1200–1208 (2006).
107. Sandvoss, L. M. & Carlson, H. A. Conformational behavior of β-proline oligomers. J. Am. Chem. Soc. 125, 15855–15862 (2003).
108. Lee, O.-S. & Saven, J. G. Simulation studies of a helical m-phenylene ethynylene foldamer. J. Phys. Chem. B 108, 11988–11994 (2004).
109. Smaldone, R. A. & Moore, J. S. Reactive sieving with foldamers: inspiration from nature and directions for the future. Chem. Eur. J. 14, 2650–2657 (2008).
110. Blake, C. C. et al. Structure of hen egg-white lysozyme. A three-dimensional Fourier synthesis at 2 Å resolution. Nature 206, 757–761 (1965).
111. Nanda, V. Do-it-yourself enzymes. Nature Chem. Biol. 4, 273–275 (2008).

Acknowledgements
VN acknowledges support from the NIH Director’s New Innovator Award Program, 1-DP2-OD006478-01 and the NSF BMAT program DMR-0907273. RLK acknowledges support from the following grants: MCB-0920448 from the NSF, MCB-5G12 RR03060 toward support for the NMR facilities at the City College of New York, P41 GM-66354 to the New York Structural Biology Center and infrastructure support from NIH 5G12 RR03060 from the National Center for Research Resources.

Author information
The authors declare no competing financial interests.
