
Opinion

Two personal perspectives on a key issue in contemporary 3D QSAR


Robert D. Clark1 and Ulf Norinder2
Chemists working with small molecules are under enormous pressure to be able to reliably predict how biological systems in particular and the environment in general will respond to the deployment of the corresponding compounds as medicines, cosmetics, or in other manufactured goods. To be specific and robust, any such prediction must be based on an implicit or explicit mathematical model of how chemical structure relates to biological activity, i.e., on some postulated quantitative structure-activity relationship (QSAR). Such models are necessarily limited in how broadly they can be applied. Their applicability domain depends on the structural diversity of the data set used, but also on the descriptors used to characterize how that structural variation relates to the activity in question. In principle, descriptors based on the molecular interaction fields produced by atoms distributed in three-dimensional (3D) space should be the most general of all, but finding suitable conformations and alignments is a challenge. One way to obtain these is by taking the structure of the macromolecular target into account as well, as is done in scoring ligand/receptor complexes for virtual screening. Unfortunately, the available docking tools are generally not up to the task. Here, we share some personal observations and opinions on two possible ways to address this shortcoming: implicitly, by iterative rescoring of docked poses obtained using derived 3D QSARs; and explicitly, by evaluating ligand interaction fields with respect to target atoms rather than against generalized probe atoms. © 2011 John Wiley & Sons, Ltd.

How to cite this article:

WIREs Comput Mol Sci 2012, 2: 108-113. doi: 10.1002/wcms.69

1 Biochemical Infometrics, St. Louis, MO, USA
2 AstraZeneca Research and Development, Södertälje, Sweden
Present address: Simulations Plus, Inc., Lancaster, CA, USA
Correspondence to: bob@simulations-plus.com

INTRODUCTION

The art of constructing mathematical models that capture the relationship between structure and activity is sometimes referred to as quantitative structure-activity relationship (QSAR), but to be useful the models produced must go beyond simply fitting the data; they must generate testable hypotheses about those relationships that are at least modestly extensible. In particular, they must robustly predict the activity of compounds that have yet to be made or tested. Since macromolecules see potential ligands as three-dimensional electronic surfaces, small molecules that are quite different in their atomic composition and connectivity can compete for the same binding site. Conversely, compounds that are very similar in two-dimensional (2D) structure can interact with potential receptors in very different ways; some terpenoids that differ only in chirality, for example, have quite distinct odors. Three-dimensional (3D) QSAR seeks to characterize those interactions in terms of interaction field potentials evaluated at (more or less) specific points distributed around each molecule, with the potential at each such point representing one descriptor. Given that the interaction fields for all potential ligands differ only locally, such methods hold out the promise, in principle, of having very broad applicability domains.1,2 How much are such QSAR methods used in practice throughout academia and industry? A search


Volume 2, January/February 2012

WIREs Computational Molecular Science


in SciFinder carried out on February 17, 2011, using the search term "3D QSAR" in journals and reviews retrieved 1912 hits, the majority of which are associated with academic research. In fact, only 275 of the 1912 were investigations not published by academics. This result raises the question: How much are 3D QSAR methods used within the pharmaceutical industry? Not much has been published by those directly involved, but that does not necessarily mean that these methods are not being used. Nonetheless, our sense is that 3D QSAR methods, in general, are not frequently used in the pharmaceutical industry for QSAR investigations, a suspicion supported by a small-scale survey of colleagues in some of the larger pharmaceutical companies. There are probably several reasons for this lack of use, but two of them, in particular, stand out.

One is the so-called alignment problem. Although interaction field intensities are very general descriptors, the molecules they represent have to be put into a common conformational and positional context before their full value can be realized. Hence, the ability to align molecules meaningfully in 3D space is what limits the applicability domain of such 3D QSARs. Despite some valiant efforts to automate the process, identifying sets of conformers and superimposition rules that yield robust and meaningful 3D QSARs remains a challenging and largely manual process. Given the productivity pressures felt throughout the pharmaceutical industry today, it is not surprising that relatively few researchers are willing to invest their time and energy in such projects.

The second is that 3D QSAR has traditionally been carried out as a ligand-based technique, limited in applicability to closely related structures. Receptor-based QSAR techniques (note a), in contrast, make use of interaction scores for poses of ligands docked into target receptor structures derived from X-ray or nuclear magnetic resonance spectroscopic analysis, usually of complexes with other ligands. Such techniques make use of information that is not directly available from ligands alone, such as steric exclusion zones and which parts of a bound ligand are exposed to solvent. When used as part of a virtual screening program, docking is a form of 3D QSAR because it attempts to quantitatively relate biological activity to molecular structure. But most available docking programs have focused on the docking process and getting pose prediction right, i.e., obtaining ligand poses close to those seen in reference X-ray structures for the complexed ligand. Scoring ligand affinity correctly takes more time; however, a rigorous scoring function that consistently distinguishes true binding modes from other possible binding modes need not consider interactions of the separated ligand and receptor with solvent because those are the same for all poses of a given ligand. Hence, it is somewhat misguided to cast docking as being intrinsically more general than ligand-based techniques when there is clear evidence that docking generally fails to correctly rank ligands by affinity.3,4

These two problems can be addressed simultaneously by using docking tools to generate the Cartesian and conformational frames of reference (the alignments) needed to carry out molecular field analyses. A second, complementary QSAR regime in which interactions with the receptor are explicitly incorporated into the analysis is arguably more ambitious, in that the intended applicability domain generally comprises all drug-like or lead-like molecular structures.

What follows are some personal reflections on the current state of QSAR in these areas as seen through the prism of our combined half-century of scientific experience and personal interests. It is not intended to be an exhaustive review and the papers cited are not necessarily seminal, but they have been chosen to illustrate what we see as the current state of the art.
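The field descriptors described earlier in this section (one potential value per lattice point) can be illustrated with a minimal Python sketch. It is not the CoMFA implementation: the atom tuple layout, the probe parameters, the 30 kcal/mol truncation, and the function names `lattice` and `field_descriptors` are all assumptions chosen for illustration.

```python
import math
from itertools import product

def lattice(lo, hi, step):
    """Regular cubic grid of points spanning [lo, hi] on each axis."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return list(product(axis, axis, axis))

def field_descriptors(atoms, points, probe_charge=1.0, probe_radius=1.7,
                      probe_eps=0.1, cutoff=30.0):
    """Steric (Lennard-Jones 6-12) and electrostatic (Coulomb) potentials.

    atoms: hypothetical records (x, y, z, partial_charge, radius, well_depth).
    Returns one steric value per lattice point followed by one
    electrostatic value per lattice point; each value is one descriptor.
    """
    steric, electro = [], []
    for p in points:
        e_st = e_el = 0.0
        for (x, y, z, q, radius, eps) in atoms:
            r = max(math.dist(p, (x, y, z)), 1e-3)   # avoid division by zero
            rmin = radius + probe_radius
            well = math.sqrt(eps * probe_eps)
            e_st += well * ((rmin / r) ** 12 - 2.0 * (rmin / r) ** 6)
            e_el += 332.0 * q * probe_charge / r      # approx. kcal/mol for e and angstrom
        # Truncate large values so points inside atoms do not dominate.
        steric.append(min(e_st, cutoff))
        electro.append(max(-cutoff, min(e_el, cutoff)))
    return steric + electro
```

Because every aligned ligand is evaluated on the same lattice, the resulting vectors are directly comparable from molecule to molecule, which is why the choice of alignment matters so much for the models discussed below.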

USING DOCKING TO GENERATE ALIGNMENTS FOR 3D QSAR


The potential functions used to characterize molecular fields vary in form from program to program, with at least two sets of potentials (field types) usually being calculated. The two most widely used methods, comparative molecular field analysis (CoMFA)5 and comparative molecular similarity indices analysis (CoMSIA),6 are often applied in parallel, though each can be used alone as well. GRID/GOLPE and related programs constitute a third, closely related class of molecular field analysis (MFA) techniques that estimate partial free energies of interaction using the GRID force field, then apply a variable selection program to focus in on the most informative regions.7 The following discussion will generally be cast in terms of CoMFA and CoMSIA because those are the techniques with which the authors have the most experience, but the observations involved are likely to apply equally well to any MFA technique. The exact geometry and placement of the sampling grid against which interaction fields for the ligand ensemble are evaluated can be important for predictive performance,8 though making a point of choosing the ensemble alignment with the best internal predictivity (q²) is usually counterproductive.9,10 The ligand conformations chosen and how those


wires.wiley.com/wcms

conformers are aligned with respect to each other are generally much more critical to making prospective predictions, which is what really matters. Originally, ligand structures were generated by manually modifying or elaborating a template structure. Such structural changes had to be made in some reasonably consistent way, but exactly how was incompletely specified, if at all. This procedure ensured that scaffolds overlapped, which meant that field differences associated with shared substructures were kept small. Partial least squares (PLS) maps differences in activity onto differences in field intensity, so this alignment served to focus the QSAR on areas of structural variation. Historically, there have been two different ways to obtain the template conformation for the chosen stereotypical ligand. One was to use an X-ray structure of it bound to the macromolecular target of interest. The other was to find its global energy minimum. More recently, flexible docking has been used as a more general and less constraining alternative to X-ray crystallography. The gratifying result of the few comparisons that have been made between MFA models based on globally minimized template conformations and those obtained by flexible docking is that docked templates are generally superior.11,12 Unfortunately, many data sets are too structurally diverse to be generated from one or a few template structures. Even when they can be, the best choice of template, if one exists, is not always obvious. This dilemma has led some to take the next logical step by docking each ligand into the target binding site separately, then constructing an MFA model using the best-scoring pose for each. Somewhat surprisingly, fully independent docking usually works less well than does alignment to a docked template,13,14 even though the latter method discounts the extra information embodied in the structure of the protein.
The difference in performance is much larger than that seen between minimized and docked templates, between atom-based and feature-based alignments, between CoMFA and CoMSIA models, or between different docking programs.15 This effect doubtless reflects the general fuzziness of the fully docked overlay vis-à-vis that of an alignment based on substructure, where the signal in field differences is more coherent and is spread across fewer lattice points. This apparent inferiority of docking as an alignment tool is more than a little counterintuitive, given that some seminal MFA studies used alignments based on X-ray crystal structures of the individual inhibitors.7 The poor performance of alignments based on individually docked ligands could, of course, simply reflect inaccuracies in the docked configurations produced by the available tools. Impugning the performance of docking programs does not, however, explain why different docking programs give statistically similar results. The problem with alignment by docking is more likely to lie in the dubious assumption that the binding site is rigid, an assumption that is implicit in all studies carried out to date.16 That this is, in fact, the case is suggested by studies in which ligands were docked individually and the conformations produced were then aligned to a docked template on the basis of substructure or pharmacophoric overlap, procedures which yield better models than alignment by docking alone.14 Though it may not be obvious, such realignment is equivalent to allowing the binding site to reorient slightly around each of the overlaid ligands.17 Perhaps being able to support the generation of good MFA models should replace reproduction of poses from X-ray crystal structures as the primary criterion for validating docking and scoring programs. Short of that, an alignment based on docking may serve as a good first approximation upon which to construct a 3D QSAR, which can then be used to generate a refined alignment by rescoring the highest ranked docked poses. A quick proof-of-principle application of this concept to the estrogen receptor (ER) suggests that this is indeed the case (Robert D. Clark, unpublished). Surflex-Dock18 was used to dock a set of 33 agonists taken from Wolohan and Reichert19 into the ER agonist target from the Directory of Useful Decoys.20 The 25 ligands yielding useful poses included sterols and stilbestrols, as well as hydroxyphenyl pyrazole and furan derivatives. An initial CoMFA model based on the best-scoring Surflex poses had a cross-validated standard error (SECV) of 0.889 and a rather discouraging q² of 0.243.
Rescoring the top 20 poses for each ligand to identify the one with the highest predicted activity yielded a new CoMFA model that had a reduced SECV (0.868) and an increased q² (0.415). Another iteration of refinement yielded a 3D QSAR with SECV and q² values (0.699 and 0.639, respectively) comparable to those found with classical substructure-based alignments. It is key that the choice of poses at each iteration was based on maximizing predicted activity, not on maximizing predictive statistics; the improvement in those statistics was an incidental effect. Interestingly, the changes in alignment involved corresponded to small configurational adjustments for most ligands. A few ligands, however, flipped in ways that would not have been predicted based on substructure (data not shown). This is an incomplete description of a single case; it remains to be seen whether the approach is


generally applicable. That said, if the assumptions underlying 3D QSAR and docking are valid and at least approximately fulfilled, it should usually work well. Moreover, the improvement observed in this case suggests that docking and MFA can, under the right conditions, be quite synergistic.
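The iterative rescoring loop described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually used in the unpublished ER study: a one-component PLS regression stands in for a full CoMFA model, SECV is taken as sqrt(PRESS / (n - c - 1)), and the function names and the `poses` data layout are hypothetical.

```python
import math

def pls1_fit(X, y):
    """One-component PLS on mean-centred data; returns a predict(x) function."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]  # X'y
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]   # scores
    b = sum(t[i] * yc[i] for i in range(n)) / (sum(v * v for v in t) or 1.0)

    def predict(x):
        return y_mean + b * sum((x[j] - x_mean[j]) * w[j] for j in range(m))
    return predict

def loo_stats(X, y, n_components=1):
    """Leave-one-out q2 and cross-validated standard error (SECV)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - model(X[i])) ** 2
    y_mean = sum(y) / n
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss, math.sqrt(press / (n - n_components - 1))

def refine_alignment(poses, y, max_iter=10):
    """poses[i]: candidate descriptor vectors for ligand i, best docking score first.

    Start from the top-scoring pose of each ligand, fit a model, then pick
    for each ligand the pose with the highest predicted activity; repeat
    until the selection stops changing.
    """
    chosen = [0] * len(poses)
    for _ in range(max_iter):
        X = [poses[i][chosen[i]] for i in range(len(poses))]
        model = pls1_fit(X, y)
        new = [max(range(len(p)), key=lambda k: model(p[k])) for p in poses]
        if new == chosen:
            break
        chosen = new
    return chosen
```

Note that pose selection maximizes predicted activity only; `loo_stats` is computed separately, afterwards, to monitor whether q² and SECV improve as a side effect.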

EXPLICIT INCLUSION OF THE RECEPTOR


To many researchers within the computational chemistry community, the phrase "receptor-based QSAR" is limited to cases in which ligand conformations and/or alignments have been obtained by X-ray diffraction or docking. The QSAR in such cases is subsequently derived from the relation between the ligands alone, with no further explicit utilization of the receptor. The alternative is to make explicit use of receptor information in constructing a QSAR model based on the localized interactions between receptor and ligand. There are relatively few investigations reported in the literature satisfying this definition; most that do utilize the comparative binding energy (COMBINE) approach21 to derive receptor-based QSAR models.

Why are there so few methods and publications in the field of receptor-based QSAR despite the huge increase of available X-ray structures of good quality in recent years? Many crystal structures with different ligands have been made publicly available during the last decade, especially in the kinase and protease fields. Are there special difficulties associated with receptor-based QSAR investigations as opposed to ligand-based studies? Certainly, the combination of target and ligand flexibility complicates things, as does the tendency to overinterpret the accuracy and precision of published crystal structures. Both factors affect the reliability of docked pose predictions, but their potential to negatively affect complex scoring is even greater. The disconnect between interaction functions that work well for docking and those that work well for subsequent scoring of complex stability has been widely noted, with the best results often being obtained using one type of function for docking and some other type for scoring the docked complexes obtained.3 Each scoring function contains a number of terms that capture different aspects of protein-ligand interaction. These scoring functions have typically been derived from data sets with well-defined binding affinities. However, the use of these more generic scoring functions may not be the most effective approach for deriving predictive target-based (Q)SAR models.22 Tailor-made scoring functions customized for each particular project may provide a better basis for deriving models with broader applicability domains, though the simple summation of interaction terms may still be too crude a protocol to model the underlying SAR quantitatively.

The further dissection of the protein-ligand interactions that is carried out in the COMBINE method holds out considerable promise for realizing a significant improvement in performance by characterizing the interactions in more detail and identifying those which are most important: not all interactions contribute equally to binding affinity or measured target activity. Only a relatively small number of COMBINE studies have been published so far,23 and the predictive performance of the models produced has been uneven. The question that comes to mind is this: How many attempts have been made in the broader QSAR community to derive COMBINE models that were unsuccessful and have gone unreported? Several different approaches have been used in COMBINE studies, including variations in docking technique and complex minimization as well as how new ligands are aligned to cocrystallized ones in order to generate the ligand-receptor complex used to predict activity. Most COMBINE methods published so far used the assisted model building with energy refinement (AMBER) force field to evaluate ligand-receptor interactions, but the work that has been done with different force fields revealed only minor differences with respect to model performance. Predictive target-based QSAR models have some attractive features:

• The compact representation (usually only electrostatic and steric terms) of the interactions makes the results relatively easy to interpret, though aggregating the various interactions of each amino acid residue is sometimes a confounding factor.
• Visual (graphical) display of the important QSAR coefficients provides insight into the importance as well as the location of these interactions.
• Depending on the docking scheme employed, i.e., with or without minimization, fairly large virtual libraries can be screened readily and potentially interesting new chemical entities can be identified for synthesis or acquisition.
• In principle, a target-based QSAR model should have the potential to enrich for active compounds as well as or better than a generic scoring function.
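The last point, enrichment for active compounds, is usually quantified with an enrichment factor. The sketch below shows one common way to compute it; the function name, the default 10% cut, and the convention that higher scores mean higher predicted activity are assumptions made for illustration, and the source does not prescribe a particular metric.

```python
def enrichment_factor(scores, is_active, fraction=0.1):
    """Enrichment factor for the top-scoring fraction of a screened library.

    EF = (actives in top fraction / size of top fraction)
         / (actives overall / library size)
    Higher scores are assumed to mean higher predicted activity.
    """
    n = len(scores)
    n_top = max(1, int(round(n * fraction)))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(1 for i in ranked[:n_top] if is_active[i])
    total_hits = sum(1 for a in is_active if a)
    if total_hits == 0:
        raise ValueError("no actives in the library")
    return (hits_top / n_top) / (total_hits / n)
```

Computing this value once with the target-based QSAR predictions and once with a generic docking score over the same library gives a direct, if coarse, comparison of the two as screening tools; a value of 1.0 corresponds to random selection.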


However, target-based QSAR models are not without their shortcomings and pitfalls:

• The residue-wise granularity of the descriptors is sometimes too coarse to support a detailed analysis. This problem can be addressed by using all-atom-based residue descriptors or shifting to descriptors that differentiate between side-chain and backbone interactions.
• As with all QSAR models, COMBINE maps differences in activity onto differences in descriptors, with the result that the only descriptors that contribute to the derived model are those that exhibit significant variation across the training set. As a result, data sets containing analogues that share a common central substructure will generally lack sensitivity in the corresponding region, attributing little importance to it even though its presence may be critical to obtaining high activity. This should be less of a problem for target-based models than for purely ligand-based 3D QSAR models in which shared substructures are used for alignment (see above). Using docking and/or minimization to construct a target-based model should result in more positional variation in core placement, which should, in turn, add informative variation to the corresponding interaction terms and increase the contribution of critical core interaction terms to the final COMBINE model.
• The structure of the ligands in the training set can limit the predictive ability of the model in other, more subtle ways as well, so large structural excursions from the applicability domain (see above) should be clearly indicated to the researcher.

Our own experience (Ulf Norinder, unpublished) with trying to derive COMBINE-type models has been quite variable, and it has not been possible to develop predictive models for many proprietary projects. One potential reason is the flexibility issue.16 Multireceptor modeling has recently been introduced to help address the conformational flexibility issues of using single ligand-receptor complexes, which may improve success rates. Another way to approach the problem is to include molecular dynamics (MD) simulations in the COMBINE analysis protocols. As far as we are aware, there are no published investigations presently available that use this approach. However, investigations performed on a handful of public and proprietary data sets indicate that the added complexity and increased computational time that result from incorporating receptor and ligand flexibility do not pay off in terms of improved predictive performance (Ulf Norinder, unpublished). Rather, these experiments suggest that if a good COMBINE-like model can be derived by applying MD, then a much simpler approach that makes use of molecular mechanics minimization will work as well. Conversely, if adding molecular mechanics minimization does not produce a predictive COMBINE model, then adding MD calculations is unlikely to do so.

In view of the present limitations of target-based QSAR modeling, the treatment of solvation effects on ligand binding will need to be further improved and more accurate and comprehensive treatments of ligand-receptor interactions will need to be developed before the technique can contribute effectively to drug discovery research.
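The residue-wise decomposition behind COMBINE-type descriptors, and the way invariant terms drop out of the model, can be sketched as follows. This is a simplified illustration rather than the published COMBINE protocol: the tuple layouts, the Lennard-Jones and Coulomb forms, the variance threshold, and the function names are assumptions, and no minimization or solvation treatment is included.

```python
import math
from collections import defaultdict

def residue_interactions(ligand_atoms, receptor_atoms):
    """Sum ligand-receptor van der Waals and Coulomb terms per residue.

    ligand atom:   (xyz, charge, half_rmin, eps)          (hypothetical layout)
    receptor atom: (residue_id, xyz, charge, half_rmin, eps)
    """
    vdw, ele = defaultdict(float), defaultdict(float)
    for (res, r_xyz, r_q, r_rad, r_eps) in receptor_atoms:
        for (l_xyz, l_q, l_rad, l_eps) in ligand_atoms:
            d = max(math.dist(r_xyz, l_xyz), 1e-3)
            s = (r_rad + l_rad) / d
            vdw[res] += math.sqrt(r_eps * l_eps) * (s ** 12 - 2.0 * s ** 6)
            ele[res] += 332.0 * r_q * l_q / d
    return dict(vdw), dict(ele)

def descriptor_matrix(complexes, residues, min_variance=1e-6):
    """Rows are complexes; columns are per-residue vdW, then electrostatic, terms.

    Columns whose variance across the training set falls below
    min_variance are dropped, since they cannot contribute to a model.
    """
    rows = []
    for ligand, receptor in complexes:
        vdw, ele = residue_interactions(ligand, receptor)
        rows.append([vdw.get(r, 0.0) for r in residues] +
                    [ele.get(r, 0.0) for r in residues])
    names = [f"vdw:{r}" for r in residues] + [f"ele:{r}" for r in residues]
    keep = []
    for j in range(len(names)):
        col = [row[j] for row in rows]
        mean = sum(col) / len(col)
        if sum((v - mean) ** 2 for v in col) / len(col) >= min_variance:
            keep.append(j)
    return [names[j] for j in keep], [[row[j] for j in keep] for row in rows]
```

The variance filter makes the second pitfall above concrete: an interaction that is identical in every training complex, however important for binding, is removed before regression and so receives no weight in the model.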

CONCLUSION
Even when married to receptor-based techniques in the ways considered here, 3D QSAR molecular descriptors based solely on electrostatic and steric interaction fields will often prove too coarse to be of great value in late-stage drug discovery. On the other hand, they may also be too specific for the multiobjective differentiation needed as a program approaches clinical trials, where a detailed understanding of properties such as lipophilicity, hydrogen bonding, and aromatic characteristics is key to addressing problems with selectivity, pharmacokinetic tractability, and toxicity as well as activity. Until recently, QSAR practitioners focused primarily on maximizing the ability of the models they produce to account for the variations in activity seen within the limited data set from which they were derived. Moving forward, we need to focus more on expanding their prospective usefulness. To do that, we need to fundamentally reconsider how and why we align molecules for 3D QSAR. We can do so by looking for ways to make productive use of docking techniques and the ever-increasing supply of structural information about the biological targets whose behavior we seek to affect.

NOTES
a. We prefer "receptor-based" or "target-based" to the more commonly used "structure-based" because ligands have structure, too.


ACKNOWLEDGMENTS
Docking and CoMFA software was kindly provided to RDC by Dr. Ajay Jain (UCSF), BioPharmics LLC, and Tripos International. The authors would also like to thank the anonymous reviewers for their insights and suggestions.

REFERENCES
1. Netzeva TI, Worth AP, Aldenberg T, Benigni R, Cronin MTD, Gramatica P, Jaworska JS, Kahn S, Klopman G, Marchant CA, et al. Current status of methods for defining the applicability domain of (quantitative) structure-activity relationships. The report and recommendations of ECVAM Workshop 52. ATLA 2005, 33:155-173.
2. Tetko IV, Bruneau P, Mewes H-W, Rohrer DC, Poda GI. Can we estimate the accuracy of ADME-Tox predictions? Drug Discov Today 2006, 11:700-707.
3. Warren GL, Andrews CW, Capelli A-M, Clarke B, LaLonde J, Lambert MH, Lindvall M, Nevins N, Semus SF, Senger S, et al. A critical assessment of docking programs and scoring functions. J Med Chem 2006, 49:5912-5931.
4. Enyedy IJ, Egan WJ. Can we use docking and scoring for hit-to-lead optimization? J Comput Aided Mol Des 2008, 22:161-168.
5. Cramer RD III, Patterson DE, Bunce JD. Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J Am Chem Soc 1988, 110:5959-5967.
6. Klebe G, Abraham U, Mietzner T. Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. J Med Chem 1994, 37:4130-4146.
7. Watson KA, Mitchell EP, Johnson LN, Cruciani G, Son JC, Bichard CJF, Fleet GWJ, Oikonomakos NG, Kontou M, Zographos SE. Glucose analogue inhibitors of glycogen phosphorylase: from crystallographic analysis to drug prediction using GRID force field and GOLPE variable selection. Acta Cryst 1995, D51:458-472.
8. Cho SJ, Tropsha A. Cross-validated R2-guided region selection for comparative molecular field analysis: a simple method to achieve consistent results. J Med Chem 1995, 1060-1066.
9. Norinder U. Single and domain mode variable selection in 3D QSAR applications. J Chemometrics 1996, 10:95-105.
10. Wang R, Gao Y, Liu L, Lai L. All-orientation search and all-placement search in comparative molecular field analysis. J Mol Model 1998, 4:276-283.
11. Prasanna S, Daga PR, Xie A, Doerksen RJ. Glycogen synthase kinase-3 inhibition by 3-anilino-4-phenylmaleimides: insights from 3D-QSAR and docking. J Comput Aided Mol Des 2009, 23:113-127.
12. Murugesan V, Prabhakar YS, Katti SB. CoMFA and CoMSIA studies on thiazolidin-4-one as anti-HIV-1 agents. J Mol Graph Model 2009, 27:735-743.
13. Patel PD, Patel MR, Kaushik-Basu N, Talele TT. 3D QSAR and molecular docking studies of benzimidazole derivatives as hepatitis C virus NS5B polymerase inhibitors. J Chem Inf Model 2008, 48:42-55.
14. Sheng C, Zhang W, Ji H, Zhang M, Song Y, Xu H, Zhu J, Miao Z, Jiang Q, Yao J, et al. Structure-based optimization of azole antifungal agents by CoMFA, CoMSIA, and molecular docking. J Med Chem 2006, 49:2512-2525.
15. Guido RVC, Oliva G, Montanari CA, Andricopulo AD. Structural basis for selective inhibition of trypanosomatid glyceraldehyde-3-phosphate dehydrogenase: molecular docking and 3D QSAR studies. J Chem Inf Model 2008, 48:918-929.
16. Cozzini P, Kellogg GE, Spyrakis F, Abraham DJ, Costantino G, Emerson A, Fanelli F, Gohlke H, Kuhn LA, Morris GM. Target flexibility: an emerging consideration in drug discovery and design. J Med Chem 2008, 51:6237-6255.
17. Clark RD. A ligand's-eye view of protein binding. J Comput Aided Mol Des 2008, 22:507-521.
18. Jain AN. Surflex-Dock 2.1: robust performance from ligand energetic modeling, ring flexibility, and knowledge-based search. J Comput Aided Mol Des 2007, 21:281-306.
19. Wolohan P, Reichert DE. CoMFA and docking study of novel estrogen receptor subtype selective ligands. J Comput Aided Mol Des 2003, 17:313-328.
20. Huang N, Shoichet BK, Irwin JJ. Benchmarking sets for molecular docking. J Med Chem 2006, 49:6789-6801.
21. Ortiz AR, Pisabarro MT, Gago F, Wade RC. Prediction of drug binding affinities by comparative binding energy analysis. J Med Chem 1995, 38:2681-2691.
22. Enyedy IJ, Egan WJ. Can we use docking and scoring for hit-to-lead optimization? J Comput Aided Mol Des 2008, 22:161-168.
23. Lushington GH, Guo J-X, Wang JL. Whither combine? New opportunities for receptor-based QSAR. Curr Med Chem 2007, 14:1863-1877.
