
PROTEIN DOCKING

By Iffat Fatima Roll No. 08 M.Phil Biochemistry

What is Docking?

Docking is the process of starting with a set of coordinates for two distinct molecules and generating a model of the bound complex.

Docking two molecules means constructing the coordinates of the bound state.
The bound state is called the complex.
The coordinates of the independent molecules are required as input.
The molecules move towards each other and bind (dock).
The aim is to predict their docked configuration, not to describe their motion.

Docking shows

Given two biological molecules, determine:

Whether the two molecules interact
If so, which orientation maximizes the interaction while minimizing the total energy of the complex
Goal: to be able to search a database of molecular structures and retrieve all molecules that can interact with the query structure

Why is docking important?

It is highly relevant in cell biology, where function is accomplished by proteins interacting with one another and with other molecular components.
It is the key to rational drug design: the results of docking can be used to find inhibitors for specific target proteins and thus to design new drugs.
It is gaining importance as the number of proteins with known structures increases.

Example: HIV-1 Protease

Active Site (Aspartyl groups)

HIV-1 Protease

Why Is Docking Difficult?

Both molecules are flexible and may alter each other's structure as they interact:
Hundreds to thousands of degrees of freedom (DOF)
The total number of possible conformations is astronomical

Types of Docking
Protein-Protein Docking
Both molecules are usually considered rigid
6 degrees of freedom
First apply steric constraints to limit the search space, then examine the energetics of possible binding conformations

Protein-Ligand Docking
Flexible ligand, rigid receptor
The search space is much larger
Either reduce the flexible ligand to rigid fragments connected by one or several hinges, or search the conformational space using Monte Carlo methods or molecular dynamics
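The 6 degrees of freedom of rigid-body docking are three rotations and three translations. The following is a minimal sketch of applying such a pose to a set of atom coordinates. The function names and the x-then-y-then-z rotation order are assumptions made for illustration, not the convention of any particular docking program.

```python
import math

def rotation_matrix(rx, ry, rz):
    """Rotation about x, then y, then z (angles in radians): R = Rz * Ry * Rx."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_pose(coords, pose):
    """Apply a rigid-body pose (rx, ry, rz, tx, ty, tz) to a list of (x, y, z) points."""
    rx, ry, rz, tx, ty, tz = pose
    rot = rotation_matrix(rx, ry, rz)
    shift = (tx, ty, tz)
    moved = []
    for point in coords:
        # Rotate first, then translate.
        moved.append(tuple(
            sum(rot[row][col] * point[col] for col in range(3)) + shift[row]
            for row in range(3)
        ))
    return moved

# Hypothetical three-atom ligand, rotated 90 degrees about z and shifted.
ligand = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
print(apply_pose(ligand, (0.0, 0.0, math.pi / 2, 1.0, 2.0, 3.0)))
```

A rigid-body search only has to explore these six numbers, which is why it is so much cheaper than flexible docking.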

Techniques Involved in Docking

Surface representation, which efficiently represents the docking surface and identifies the regions of interest (cavities and protrusions):
Connolly surface
Lenhoff technique
Kuntz et al. clustered spheres
Alpha shapes

Surface matching, which matches surfaces to optimize a binding score:
Geometric hashing

Surface Representation
Each atomic sphere is given the van der Waals radius of the atom.
Rolling a probe sphere over the van der Waals surface yields the Connolly surface, made up of contact patches and solvent re-entrant patches.

Lenhoff technique
Computes a complementary surface for the receptor instead of the Connolly surface, i.e. computes possible positions for the atom centers of the ligand.

Kuntz et al. Clustered-Spheres


Uses clustered spheres to identify cavities on the receptor and protrusions on the ligand
Computes a sphere for every pair of surface points, i and j, with the sphere center on the normal from point i
Regions where many spheres overlap are either cavities (on the receptor) or protrusions (on the ligand)
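The sphere construction can be worked out in a few lines. If the center lies on the unit normal n at point i, at c = p_i + r n, and the sphere must also pass through point j, then r = |d|^2 / (2 n.d) with d = p_j - p_i. The sketch below assumes unit outward normals are already available. Keeping only the smallest sphere per point i is a choice made here so that no other sampled point falls inside the sphere. The function names are hypothetical.

```python
def sphere_radius(p_i, n_i, p_j):
    """Radius of the sphere tangent at p_i (center along unit normal n_i) that passes through p_j.

    Returns None when p_j lies on or behind the tangent plane at p_i.
    """
    d = [b - a for a, b in zip(p_i, p_j)]
    along_normal = sum(n * x for n, x in zip(n_i, d))
    if along_normal <= 1e-12:
        return None
    return sum(x * x for x in d) / (2.0 * along_normal)

def smallest_sphere(i, points, normals):
    """Smallest sphere for surface point i over all partner points j; returns (center, radius) or None."""
    best = None
    for j, p_j in enumerate(points):
        if j == i:
            continue
        r = sphere_radius(points[i], normals[i], p_j)
        if r is not None and (best is None or r < best):
            best = r
    if best is None:
        return None
    center = tuple(p + best * n for p, n in zip(points[i], normals[i]))
    return center, best

# Hypothetical surface samples on two facing walls of a cavity.
points = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0), (0.0, 0.0, 6.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (0.0, 0.0, -1.0)]
print(smallest_sphere(0, points, normals))
```

Spheres generated this way can then be clustered by overlap to locate cavities or protrusions.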

Alpha Shapes

Formalizes the idea of shape
In 2D, an edge between two points is alpha-exposed if there exists a circle of radius alpha such that the two points lie on the boundary of the circle and the circle contains no other points from the point set
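The 2D definition translates directly into a test. Two circles of radius alpha pass through a pair of points, provided the points are at most 2 alpha apart, and the edge is alpha-exposed if at least one of them is empty. This is a brute-force sketch for illustration, with hypothetical function names. Practical alpha-shape codes derive the same result from a Delaunay triangulation.

```python
import math

def circle_centers(p, q, alpha):
    """Centers of the circles of radius alpha passing through both p and q (empty list if none exist)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or d > 2.0 * alpha:
        return []
    mx, my = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    h = math.sqrt(max(alpha * alpha - (d / 2.0) ** 2, 0.0))
    ux, uy = -dy / d, dx / d  # unit vector perpendicular to the edge
    return [(mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)]

def is_alpha_exposed(p, q, points, alpha, eps=1e-9):
    """True if some circle of radius alpha through p and q has no other point strictly inside."""
    others = [pt for pt in points if pt != p and pt != q]
    for cx, cy in circle_centers(p, q, alpha):
        if all(math.hypot(x - cx, y - cy) >= alpha - eps for x, y in others):
            return True
    return False

# Corners of a unit square: the sides are exposed for alpha = 1, the diagonals are not.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(is_alpha_exposed(square[0], square[1], square, 1.0))
print(is_alpha_exposed(square[0], square[2], square, 1.0))
```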

Surface matching

Find the transformation (rotation + translation) that will maximize the number of matching surface points from the receptor and the ligand

First satisfy steric constraints


Find the best fit of the receptor and ligand using only geometrical constraints

Then use energy calculations to refine the docking

Select the fit that has the minimum energy

Geometric Hashing
Building the Hash Table:
For each triplet of points from the ligand, generate a unique system of reference
Store the position and orientation of all remaining points in this coordinate system in the hash table

Searching in the Hash Table:
For each triplet of points from the receptor, generate a unique system of reference
Compute the coordinates of each remaining receptor point in this system and find the corresponding hash table bin; for every entry there, vote for the basis
Determine the entries that received more votes than a threshold; each such entry corresponds to a potential match
For each potential match, recover the transformation T that gives the best least-squares match between all corresponding triplets
Transform the features of the model according to the recovered transformation T and verify it
If the verification fails, choose a different receptor triplet and repeat the search
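The table-building and voting steps can be sketched as follows. This is an illustrative toy, not the implementation of any docking program. It matches a small point set against a rotated and shifted copy of itself, whereas real docking matches complementary surface features. The bin size `cell` and all function names are assumptions, and the least-squares recovery of T and the verification step are left out.

```python
import math
from collections import defaultdict
from itertools import permutations

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def frame(p1, p2, p3):
    """Reference frame of an ordered triplet: origin p1 plus orthonormal axes (None if collinear)."""
    x = sub(p2, p1)
    n = cross(x, sub(p3, p1))
    if dot(x, x) < 1e-12 or dot(n, n) < 1e-12:
        return None
    x, z = unit(x), unit(n)
    return p1, (x, cross(z, x), z)

def local_key(point, origin, axes, cell):
    """Quantized coordinates of a point expressed in the triplet's frame."""
    v = sub(point, origin)
    return tuple(round(dot(v, axis) / cell) for axis in axes)

def build_table(model, cell=0.5):
    """For every ordered triplet (basis), store all remaining points under their local coordinates."""
    table = defaultdict(list)
    for basis in permutations(range(len(model)), 3):
        ref = frame(model[basis[0]], model[basis[1]], model[basis[2]])
        if ref is None:
            continue
        for k, p in enumerate(model):
            if k not in basis:
                table[local_key(p, ref[0], ref[1], cell)].append(basis)
    return table

def vote(table, scene, scene_basis, cell=0.5):
    """Count, per stored basis, how many scene points land in a bin holding that basis."""
    votes = defaultdict(int)
    ref = frame(scene[scene_basis[0]], scene[scene_basis[1]], scene[scene_basis[2]])
    if ref is None:
        return votes
    for k, p in enumerate(scene):
        if k not in scene_basis:
            for basis in table.get(local_key(p, ref[0], ref[1], cell), []):
                votes[basis] += 1
    return votes

# Hypothetical feature points; the scene is the model rotated 90 degrees about z and shifted.
model = [(0, 0, 0), (2, 0, 0), (0, 3, 0), (1, 1, 4), (3, 2, 1)]
scene = [(-y + 5, x - 1, z + 2) for x, y, z in model]
votes = vote(build_table(model), scene, (0, 1, 2))
print(max(votes, key=votes.get), max(votes.values()))
```

Because the local coordinates do not change under rotation and translation, the scene basis votes for the matching model basis without searching over transformations.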

Docking Programs
The programs are:
DOCK (I. D. Kuntz, UCSF)
AutoDock (Arthur Olson, The Scripps Research Institute)
RosettaDock (Baker, Univ. of Washington; Gray, Johns Hopkins Univ.)
FlexX
GOLD
Hammerhead
FLOG

DOCK
DOCK works in 5 steps:
Step 1: Start with the crystal coordinates of the target receptor
Step 2: Generate the molecular surface of the receptor
Step 3: Generate spheres to fill the active site of the receptor; the spheres become potential locations for ligand atoms
Step 4: Matching: sphere centers are matched to the ligand atoms to determine possible orientations for the ligand
Step 5: Scoring: find the top-scoring orientation

DOCK workflow (flowchart):
Receptor structure: X-ray crystal, NMR, or homology model
Binding site
Molecular surface of the binding site
Spheres describing the shape of the binding site and favorable locations of potential ligand atoms
Ligands: 3D structure, atomic charges, potentials, labeling
Matching heavy atoms of ligands to centers of spheres to generate thousands of binding orientations
Scoring orientations: 1. Energy scoring (vdW and electrostatic) 2. Contact scoring (shape complementarity) 3. Chemical scoring 4. Solvation terms; filters
Binding mode analysis for lead optimization: binding orientations and scores for each ligand
Virtual screening for MTS/HTS and library design: ligands in the order of their best scores
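Energy scoring (vdW and electrostatic) can be illustrated with a pairwise Lennard-Jones plus Coulomb sum over receptor-ligand atom pairs. This is a generic sketch, not the DOCK scoring function. The single `epsilon` and `sigma` for all atom pairs, the constant `dielectric`, and the `cutoff` are simplifying assumptions, and the atom records are hypothetical (x, y, z, charge) tuples.

```python
import math

COULOMB = 332.06  # kcal*Angstrom/(mol*e^2), converts q1*q2/r to kcal/mol

def pair_energy(r, epsilon, sigma, qi, qj, dielectric=4.0):
    """Lennard-Jones 12-6 term plus Coulomb term for one atom pair at distance r (Angstrom)."""
    sr6 = (sigma / r) ** 6
    vdw = 4.0 * epsilon * (sr6 * sr6 - sr6)
    elec = COULOMB * qi * qj / (dielectric * r)
    return vdw + elec

def energy_score(receptor, ligand, epsilon=0.1, sigma=3.5, dielectric=4.0, cutoff=10.0):
    """Sum pair energies over all receptor-ligand atom pairs within the cutoff; lower is better."""
    total = 0.0
    for rx, ry, rz, rq in receptor:
        for lx, ly, lz, lq in ligand:
            r = math.sqrt((rx - lx) ** 2 + (ry - ly) ** 2 + (rz - lz) ** 2)
            if 0.0 < r <= cutoff:
                total += pair_energy(r, epsilon, sigma, rq, lq, dielectric)
    return total

# Hypothetical two-atom receptor and one-atom ligand.
receptor = [(0.0, 0.0, 0.0, -0.5), (4.0, 0.0, 0.0, 0.2)]
ligand = [(2.0, 3.5, 0.0, 0.4)]
print(energy_score(receptor, ligand))
```

Scoring each candidate orientation this way and sorting by the total gives the ranking used to pick the top-scoring orientation.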

Other Programs
AutoDock
AutoDock was designed to dock flexible ligands into receptor binding sites.
Its strongest feature is the range of powerful optimization algorithms available.

RosettaDock
It models physical forces and creates a very large number of decoys.
It uses degeneracy after clustering as a final criterion in decoy selection.

A Protein-Protein Docking Algorithm (Gray & Baker)

Goal: to predict protein-protein complexes from the coordinates of the unbound monomer components.
Two steps: a low-resolution Monte Carlo search and a final optimization using Monte Carlo minimization.
Up to 10^5 independent simulations produce decoys that are ranked using an energy function.
The top-ranking decoys are clustered for output.
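The idea of a Monte Carlo search over rigid-body poses can be illustrated with a plain Metropolis loop. This is a generic sketch and not the Gray and Baker protocol. `toy_energy` is a hypothetical stand-in for a real scoring function, and the step size, temperature `kT`, and number of steps are arbitrary choices.

```python
import math
import random

def metropolis_search(energy, start, step, n_steps, kT=1.0, rng=None):
    """Random-walk search over a pose vector with Metropolis acceptance; returns the best pose seen."""
    rng = rng or random.Random(0)
    current = list(start)
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    for _ in range(n_steps):
        # Perturb every degree of freedom by a small Gaussian move.
        trial = [x + rng.gauss(0.0, step) for x in current]
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / kT):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best

def toy_energy(pose):
    """Hypothetical energy with a single minimum at an arbitrary target pose."""
    target = (1.0, -2.0, 0.5, 0.0, 0.0, 0.0)
    return sum((p - t) ** 2 for p, t in zip(pose, target))

# Each independent run yields one decoy; many runs give a set of decoys to rank by energy.
decoys = [metropolis_search(toy_energy, [0.0] * 6, 0.2, 500, rng=random.Random(seed))
          for seed in range(5)]
decoys.sort(key=lambda decoy: decoy[1])
print(decoys[0][1])
```

Running many independent searches from different seeds mirrors the decoy-generation step; the low-energy decoys would then be clustered.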

Algorithms of the 3 docking methods

Method (Investigator) | Step 1: Rigid body search | Step 2: Rescoring, ranking, filtering, and refinement
ClusPro (Camacho and Vajda) | Fast Fourier Transform (FFT) correlation approach using ZDOCK or DOT | Re-scoring with empirical potentials and clustering
Gray and Baker | Monte Carlo search using simplified protein geometry and scoring function | Iterative repacking of side chains and rigid-body docking repeated until convergence. Final selection by clustering.
ZDOCK (Weng) | FFT correlation with shape complementarity, electrostatics, and desolvation | Clustering of conformations to avoid redundancies
RDOCK (Weng) | FFT correlation with shape complementarity | Re-scoring with empirical potentials

How do current protein docking programs work?

Table I. Major differences between enzyme-inhibitor and antibody-antigen complexes

Property | Enzyme-inhibitor complexes | Antibody-antigen complexes
Interface area, ΔASA | 1400 Å² < ΔASA < 2000 Å² | Possibly < 1400 Å²
Interface connectedness | Single patch | Frequently multiple patches
Interface shape | Convex-concave | Mostly planar
Binding free energy ΔG, kcal/mol | -17.5 < ΔG < -13.0 | -13.0 < ΔG < -6.5
% Nonpolar residues in interface | 61% nonpolar (can reach 71%) | 51% nonpolar (can be as low as 44%)
Desolvation free energy | Negative (favorable) | Positive (unfavorable)
Conformational change | Generally moderate | Can be substantial; loop and/or hinge motion
Crystallographic water positions | Around perimeter of interface | Within the interface

Type | Conformational change | Interface ΔASA (a) | Hydrophobicity | Docking outcome | Example
I | Small (rigid interface) | Standard (b) | Strong; the convex-concave interface provides good shape complementarity | Successful, unless key side chains are in wrong conformations | Trypsinogen and trypsin inhibitor (1cgi): KD = 0.2 pM, ΔASA = 1950 Å², and ΔGdes = -18.3 kcal/mol. Most complexes of enzymes with their protein inhibitors are in this category
II | Small | ΔASA > 2000 Å² | Unimportant | Successful | Ribonuclease A and ribonuclease inhibitor (1dfj): ΔASA = 2580 Å², ΔGdes = 18.6 kcal/mol, ΔEelec = -63.9 kcal/mol, KD = 0.15 nM
III | Moderate, but larger than for Type I | Standard | Variable, but generally weak. Charge-charge interactions can be strong | Unpredictable; can be very difficult, even with known hypervariable regions of the antibody | HyHEL-5 Fab with lysozyme (1mlc): KD = 126 μM, ΔASA = 1390 Å², ΔGdes = -3.84 kcal/mol, ΔEelec = -21.4 kcal/mol. Most antibody-antigen complexes are in this category
IV | Restricted to side chains | ΔASA < 1400 Å² | Weak; mostly polar and charge-charge interactions | Hits are found, but are generally lost in scoring and ranking | Ras and Ras interacting domain (1lfd): KD = 2 μM, ΔASA = 1130 Å², and ΔGdes = 3.6 kcal/mol. A number of weak complexes are in this category
V | Substantial backbone change, Cα RMSD (c) > 2 Å | ΔASA > 2000 Å² | Generally moderate | Rigid body methods seem to always fail for these complexes | Cyclin A and cyclin-dependent kinase 2 (1fin): KD = 47.6 nM, ΔASA = 3390 Å², and ΔGdes = 4.7 kcal/mol

(a) ASA: accessible surface area. (b) Standard interface: 1400 Å² < ΔASA < 2000 Å². (c) Cα RMSD: α-carbon root mean square deviation.

A Protein-Protein Docking Algorithm (Gray & Baker)

Our goal is to predict protein-protein complexes from the coordinates of the unbound monomer components. The method is divided into two steps: a low-resolution Monte Carlo search and a final optimization using Monte Carlo minimization. Up to 10^5 independent simulations are carried out, and the resulting decoys are ranked using an energy function. The top-ranking decoys are clustered to select the final predictions.

GOLD (Genetic Optimisation for Ligand Docking)

Performs automated docking with full acyclic ligand flexibility, partial cyclic ligand flexibility, and partial protein flexibility in and around the active site.

Protocol

THANKS
IFFAT FATIMA ROLL No : 08
