Protein Docking

By Iffat Fatima Roll No. 08 M.Phil Biochemistry
What is Docking?
Docking is the process of starting with a set of coordinates for two distinct molecules and generating a model of the bound complex.
Docking two molecules means constructing the coordinates of the bound state. The bound state is called the complex. The coordinates of the independent molecules are required as input. In reality the molecules move towards each other and bind (dock), but the aim is to predict their docked configuration, not to describe their motion.
Docking shows
Given two biological molecules, determine:
Whether the two molecules interact.
If so, the orientation that maximizes the interaction while minimizing the total energy of the complex.
Goal: to be able to search a database of molecular structures and retrieve all molecules that can interact with the query structure.
Why is docking important?
It is of extreme relevance in cellular biology, where function is accomplished by proteins interacting with themselves and with other molecular components.
It is the key to rational drug design: the results of docking can be used to find inhibitors for specific target proteins and thus to design new drugs.
It is gaining importance as the number of proteins whose structure is known increases.
Example: HIV-1 Protease
Active site (aspartyl groups)
HIV-1 Protease
Why Is Docking Difficult?
Both molecules are flexible and may alter each other's structure as they interact:
Hundreds to thousands of degrees of freedom (DOF).
The total number of possible conformations is astronomical.
Types of Docking
Protein-Protein Docking
Both molecules are usually considered rigid.
6 degrees of freedom (3 translations and 3 rotations).
First apply steric constraints to limit the search space, and then examine the energetics of possible binding conformations.
Protein-Ligand Docking
Flexible ligand, rigid receptor.
The search space is much larger.
Either reduce the flexible ligand to rigid fragments connected by one or several hinges, or search the conformational space using Monte Carlo methods or molecular dynamics.
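To give a rough sense of why ligand flexibility enlarges the search, the sketch below counts grid poses for a rigid-body search over 6 degrees of freedom and then multiplies in one extra degree of freedom per rotatable bond. The function name `pose_count` and all step counts are illustrative assumptions, not values taken from any docking program.

```python
def pose_count(translation_steps, rotation_steps, n_torsions=0, torsion_steps=1):
    """Count grid poses: 3 translational and 3 rotational degrees of freedom,
    plus one extra degree of freedom per rotatable bond (torsion)."""
    rigid = translation_steps ** 3 * rotation_steps ** 3
    return rigid * torsion_steps ** n_torsions


# Hypothetical sampling: 20 steps per translation axis, 12 per rotation axis.
rigid_poses = pose_count(20, 12)
# Same grid, but the ligand has 8 rotatable bonds sampled at 6 angles each.
flexible_poses = pose_count(20, 12, n_torsions=8, torsion_steps=6)
print(rigid_poses, flexible_poses)
```

Under these assumed numbers the flexible search is larger than the rigid one by a factor of 6 to the power 8, which is why flexible docking relies on fragment-based or stochastic search rather than exhaustive enumeration.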
Techniques Involved in Docking
Surface representation, which efficiently represents the docking surface and identifies the regions of interest (cavities and protrusions):
Connolly surface
Lenhoff technique
Kuntz et al. clustered spheres
Alpha shapes
Surface matching, which matches surfaces to optimize a binding score:
Geometric hashing
Surface Representation
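Each atomic sphere is given the van der Waals radius of the atom.
Rolling a probe sphere over the van der Waals surface produces the Connolly surface (also called the solvent-excluded surface), made up of contact patches where the probe touches a single atom and reentrant patches where it bridges several atoms.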
Lenhoff technique
Computes a complementary surface for the receptor instead of the Connolly surface, i.e. computes possible positions for the atom centers of the ligand.
Kuntz et al. Clustered-Spheres

Uses clustered spheres to identify cavities on the receptor and protrusions on the ligand.
A sphere is computed for every pair of surface points, i and j, with the sphere center on the normal from point i.
Regions where many spheres overlap are either cavities (on the receptor) or protrusions (on the ligand).
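The geometry of one such sphere can be worked out directly. If the center is c = p_i + r n_i and the sphere must also touch p_j, then |c - p_j| = r gives r = |d|^2 / (2 d.n_i) with d = p_j - p_i. The sketch below implements only this single-sphere step; the function name `sphere_through` is an assumption made for illustration, and the clustering of overlapping spheres is not shown.

```python
import math


def sphere_through(p_i, n_i, p_j):
    """Return (center, radius) of the sphere that touches surface points p_i
    and p_j and has its center on the line through p_i along normal n_i.
    Returns None when p_j does not lie on the side the normal points to."""
    norm = math.sqrt(sum(c * c for c in n_i))
    n = [c / norm for c in n_i]                  # unit normal at p_i
    d = [b - a for a, b in zip(p_i, p_j)]        # vector from p_i to p_j
    d_dot_n = sum(a * b for a, b in zip(d, n))
    if d_dot_n <= 1e-12:
        return None                              # no finite sphere on this side
    radius = sum(c * c for c in d) / (2.0 * d_dot_n)
    center = [a + radius * c for a, c in zip(p_i, n)]
    return center, radius


# Example: point i at the origin with normal +z, point j at (2, 0, 2).
center, radius = sphere_through((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (2.0, 0.0, 2.0))
```

Both surface points lie at distance `radius` from `center`, which is the defining property of the sphere described above.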
Alpha Shapes

Formalizes the idea of shape.
In 2D, an edge between two points is alpha-exposed if there exists a circle of radius alpha such that the two points lie on the circle and the circle contains no other points from the point set.
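The 2D definition translates into a small test: build the two circles of radius alpha that pass through both points and check whether either one is empty of other points. The following is a minimal sketch under that reading; the function name `is_alpha_exposed` and the small numerical tolerance are assumptions.

```python
import math


def is_alpha_exposed(p, q, points, alpha):
    """True if some circle of radius alpha passes through p and q and
    contains no other point of the set in its interior."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > 2 * alpha:
        return False                      # no circle of radius alpha fits both points
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    # Distance from the midpoint of pq to either candidate circle center.
    h = math.sqrt(max(0.0, alpha * alpha - (dist / 2) ** 2))
    ux, uy = -dy / dist, dx / dist        # unit vector perpendicular to pq
    for cx, cy in ((mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)):
        others = [pt for pt in points if pt != p and pt != q]
        if all(math.hypot(x - cx, y - cy) >= alpha - 1e-9 for x, y in others):
            return True
    return False


# Unit square with one interior point.
pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
side_exposed = is_alpha_exposed(pts[0], pts[1], pts, alpha=0.6)
diagonal_exposed = is_alpha_exposed(pts[0], pts[2], pts, alpha=0.6)
```

With alpha = 0.6 the side of the square is exposed because an empty circle can sit outside the square, while the diagonal is not, since its endpoints are farther apart than 2 alpha.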
Surface matching
Find the transformation (rotation + translation) that will maximize the number of matching surface points from the receptor and the ligand
First satisfy steric constraints:
Find the best fit of the receptor and ligand using only geometrical constraints.
Then use energy calculations to refine the docking:
Select the fit that has the minimum energy.
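The geometric stage can be pictured as scoring candidate transformations by how many ligand surface points land on receptor surface points. The sketch below is a deliberately simplified illustration: rotations are restricted to the z axis, candidate poses are enumerated exhaustively, and the names `matched_points`, `best_pose` and the tolerance `tol` are assumptions. The energy refinement stage is not shown.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z axis by angle (radians)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def matched_points(receptor_pts, ligand_pts, angle, shift, tol=0.5):
    """Count ligand surface points that land within tol of some receptor
    surface point after rotating about z and translating by shift."""
    count = 0
    for p in ligand_pts:
        x, y, z = rotate_z(p, angle)
        moved = (x + shift[0], y + shift[1], z + shift[2])
        if any(math.dist(moved, r) <= tol for r in receptor_pts):
            count += 1
    return count


def best_pose(receptor_pts, ligand_pts, angles, shifts):
    """Score every (angle, shift) pair and return (count, angle, shift) of the best."""
    candidates = ((matched_points(receptor_pts, ligand_pts, a, s), a, s)
                  for a in angles for s in shifts)
    return max(candidates, key=lambda t: t[0])


# Toy data: the receptor points are the ligand points rotated by 90 degrees
# about z and shifted by (5, 0, 0).
ligand_pts = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]
receptor_pts = [(5.0, 1.0, 0.0), (4.0, 0.0, 0.0), (5.0, -1.0, 0.0)]
angles = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
shifts = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
count, angle, shift = best_pose(receptor_pts, ligand_pts, angles, shifts)
```

A real search would cover all three rotation axes and would reject poses that violate steric constraints before counting matches.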
Geometric Hashing
Building the Hash Table:
For each triplet of points from the ligand, generate a unique system of reference.
Store the position and orientation of all remaining points in this coordinate system in the hash table.
Searching in the Hash Table

For each triplet of points from the receptor, generate a unique system of reference.
Compute the coordinates of each remaining receptor point in this system and find the corresponding hash table bin. For every entry there, vote for the basis.
Determine the entries that received more than a threshold number of votes; each such entry corresponds to a potential match.
For each potential match, recover the transformation T that gives the best least-squares match between all corresponding triplets.
Transform the features of the model according to the recovered transformation T and verify it. If the verification fails, choose a different receptor triplet and repeat the search.
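The table-building and voting steps can be sketched in a few functions. This is a minimal illustration under several assumptions: only point positions are stored (not orientations), coordinates are quantized with a fixed `bin_size`, the toy "receptor" is simply the ligand point set after a rigid motion rather than a complementary surface, and the least-squares recovery of T and the verification step are omitted. All function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import permutations


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)


def frame(p0, p1, p2):
    """Orthonormal reference system for an ordered triplet, or None if collinear."""
    e1 = sub(p1, p0)
    normal = cross(e1, sub(p2, p0))
    if dot(normal, normal) < 1e-12:
        return None
    e1, e3 = unit(e1), unit(normal)
    return p0, (e1, cross(e3, e1), e3)


def key(point, fr, bin_size):
    """Quantized coordinates of point expressed in the reference system fr."""
    origin, axes = fr
    v = sub(point, origin)
    return tuple(round(dot(v, e) / bin_size) for e in axes)


def build_table(points, bin_size=1.0):
    """Hash every remaining point under every non-collinear ordered triplet."""
    table = defaultdict(list)
    for basis in permutations(range(len(points)), 3):
        fr = frame(*(points[i] for i in basis))
        if fr is None:
            continue
        for k, p in enumerate(points):
            if k not in basis:
                table[key(p, fr, bin_size)].append(basis)
    return table


def vote(table, points, basis, bin_size=1.0):
    """Tally votes for stored bases using one triplet of the query points."""
    votes = defaultdict(int)
    fr = frame(*(points[i] for i in basis))
    if fr is None:
        return votes
    for k, p in enumerate(points):
        if k not in basis:
            for stored in table.get(key(p, fr, bin_size), ()):
                votes[stored] += 1
    return votes


# Toy example: rotate the ligand about z by 0.7 rad and translate it.
ligand = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 3, 1), (-1, 2, 4), (3, -2, 2)]
c, s = math.cos(0.7), math.sin(0.7)
receptor = [(c * x - s * y + 5, s * x + c * y - 3, z + 2) for x, y, z in ligand]
table = build_table(ligand)
votes = vote(table, receptor, (0, 1, 2))
```

Because the coordinates are expressed relative to the triplet, they are unchanged by the rigid motion, so the stored basis built from the same three points collects a vote from every remaining point.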
Docking Programs
The programs are:
DOCK (I. D. Kuntz, UCSF)
AutoDock (Arthur Olson, The Scripps Research Institute)
RosettaDock (Baker, Univ. of Washington; Gray, Johns Hopkins Univ.)
FlexX
GOLD
Hammerhead
FLOG
DOCK
DOCK works in 5 steps:
Step 1: Start with the crystal coordinates of the target receptor.
Step 2: Generate a molecular surface for the receptor.
Step 3: Generate spheres to fill the active site of the receptor. The spheres become potential locations for ligand atoms.
Step 4: Matching. Sphere centers are matched to the ligand atoms to determine possible orientations for the ligand.
Step 5: Scoring. Find the top-scoring orientation.
Receptor structure: X-ray crystal, NMR, or homology model
Binding site
Molecular surface of binding site
Spheres describing the shape of the binding site and favorable locations of potential ligand atoms
Ligands: 3D structure, atomic charges, potentials, labeling
Matching heavy atoms of ligands to centers of spheres to generate thousands of binding orientations
Scoring orientations: 1. Energy scoring (vdW and electrostatic) 2. Contact scoring (shape complementarity) 3. Chemical scoring 4. Solvation terms
Filters
Binding mode analysis for lead optimization: binding orientations and scores for each ligand
Virtual screening for MTS/HTS and library design: ligands in the order of their best scores
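As an illustration of the energy scoring option (vdW and electrostatic), the sketch below sums a Lennard-Jones 12-6 term and a Coulomb term over all receptor-ligand atom pairs. This is the generic textbook form, not DOCK's actual scoring function: the single `epsilon` and `sigma` for every pair, the constant dielectric, and the function names are simplifying assumptions.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2); converts q1*q2/r to kcal/mol


def pair_energy(r, epsilon, sigma, q1, q2, dielectric=1.0):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair at distance r (Angstrom)."""
    sr6 = (sigma / r) ** 6
    vdw = 4.0 * epsilon * (sr6 * sr6 - sr6)
    elec = COULOMB_K * q1 * q2 / (dielectric * r)
    return vdw + elec


def energy_score(receptor_atoms, ligand_atoms, epsilon=0.1, sigma=3.4):
    """Sum pair energies over all receptor-ligand atom pairs.
    Each atom is (x, y, z, charge); one epsilon and sigma is used for every pair."""
    total = 0.0
    for rx, ry, rz, rq in receptor_atoms:
        for lx, ly, lz, lq in ligand_atoms:
            r = math.dist((rx, ry, rz), (lx, ly, lz))
            total += pair_energy(r, epsilon, sigma, rq, lq)
    return total


# Two orientations of a one-atom "ligand" near a one-atom "receptor".
receptor_atoms = [(0.0, 0.0, 0.0, -0.5)]
attractive = energy_score(receptor_atoms, [(4.0, 0.0, 0.0, 0.5)])
repulsive = energy_score(receptor_atoms, [(4.0, 0.0, 0.0, -0.5)])
```

Lower (more negative) scores indicate more favorable orientations, so ranking orientations means sorting by this value in ascending order.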
Other Programs
AutoDock
AutoDock was designed to dock flexible ligands into receptor binding sites.
The strongest feature of AutoDock is the range of powerful optimization algorithms available.
RosettaDock
It models physical forces and creates a very large number of decoys.
It uses degeneracy after clustering as a final criterion in decoy selection.
A Protein-Protein Docking Algorithm (Gray & Baker)
Goal: to predict protein-protein complexes from the coordinates of the unbound monomer components.
Two steps: a low-resolution Monte Carlo search and a final optimization using Monte Carlo minimization.
Up to 10^5 independent simulations produce decoys that are ranked using an energy function.
The top-ranking decoys are clustered for output.
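The core of a Monte Carlo search is the Metropolis acceptance rule: downhill moves are always accepted and uphill moves are accepted with probability exp(-dE/kT). The sketch below applies that rule to a toy two-parameter energy. It is not the Gray and Baker protocol: `toy_energy`, the step size, kT and the number of steps are all hypothetical, and the pose is a plain list of numbers rather than a rigid-body orientation.

```python
import math
import random


def metropolis_search(energy, start, step=0.5, kT=1.0, n_steps=2000, seed=0):
    """Random-walk search over a pose vector using the Metropolis criterion.
    Returns the lowest-energy pose visited and its energy."""
    rng = random.Random(seed)
    pose, e = list(start), energy(start)
    best_pose, best_e = list(pose), e
    for _ in range(n_steps):
        trial = [x + rng.uniform(-step, step) for x in pose]
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            pose, e = trial, e_trial
            if e < best_e:
                best_pose, best_e = list(pose), e
    return best_pose, best_e


def toy_energy(pose):
    """Hypothetical smooth energy with its minimum at pose (1, -2)."""
    x, y = pose
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


# Independent simulations with different seeds play the role of decoys,
# which are then ranked by energy.
decoys = [metropolis_search(toy_energy, [4.0, 4.0], seed=s) for s in range(5)]
ranked = sorted(decoys, key=lambda d: d[1])
```

Running many independent simulations and ranking the results mirrors, in miniature, the decoy generation and ranking described above; clustering of the top decoys is not shown.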
Algorithms of the 3 docking methods

Method (Investigator) | Step 1: Rigid-body search | Step 2: Rescoring, ranking, filtering, and refinement
ClusPro (Camacho and Vajda) | Fast Fourier Transform (FFT) correlation approach using ZDOCK or DOT | Re-scoring with empirical potentials and clustering
Gray and Baker | Monte Carlo search using simplified protein geometry and scoring function | Iterative repacking of side chains and rigid-body docking repeated until convergence. Final selection by clustering.
ZDOCK (Weng) | FFT correlation with shape complementarity, electrostatics, and desolvation | Clustering of conformations to avoid redundancies
RDOCK (Weng) | FFT correlation with shape complementarity | Re-scoring with empirical potentials
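The quantity that FFT correlation methods evaluate is a grid overlap score for every translation of the ligand relative to the receptor. The sketch below computes that same score by direct summation instead of an FFT, on a 2D grid for brevity; an FFT yields the scores for all shifts at once but is not used here. The grid weights (`CORE`, `SURFACE`) and function names are illustrative assumptions.

```python
def correlation_score(receptor, ligand, dx, dy):
    """Sum of receptor*ligand over grid cells with the ligand shifted by (dx, dy).
    Ligand cells that fall outside the receptor grid contribute nothing."""
    rows, cols = len(receptor), len(receptor[0])
    score = 0
    for i, row in enumerate(ligand):
        for j, value in enumerate(row):
            x, y = i + dx, j + dy
            if 0 <= x < rows and 0 <= y < cols:
                score += receptor[x][y] * value
    return score


def best_shift(receptor, ligand):
    """Scan every non-negative shift and return (score, dx, dy) of the best one."""
    rows, cols = len(receptor), len(receptor[0])
    return max((correlation_score(receptor, ligand, dx, dy), dx, dy)
               for dx in range(rows) for dy in range(cols))


CORE, SURFACE = -15, 1          # illustrative weights: penalize core overlap
receptor_grid = [
    [CORE] * 5,                 # receptor interior
    [SURFACE] * 5,              # receptor surface layer
    [0] * 5, [0] * 5, [0] * 5,  # empty space
]
ligand_grid = [[1, 1], [1, 1]]  # a 2x2 ligand
score, dx, dy = best_shift(receptor_grid, ligand_grid)
```

The best shift places the ligand against the surface layer without entering the core: overlapping surface cells add to the score, while any overlap with the interior is heavily penalized.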
How do current protein docking programs work?
Table I. Major differences between enzyme-inhibitor and antibody-antigen complexes

Property | Enzyme-inhibitor complexes | Antibody-antigen complexes
Interface area ΔASA | 1400 Å² < ΔASA < 2000 Å² | Possibly < 1400 Å²
Interface connectedness | Single patch | Frequently multiple patches
Interface shape | Convex-concave | Mostly planar
Binding free energy ΔG (kcal/mol) | -17.5 < ΔG < -13.0 | -13.0 < ΔG < -6.5
% Nonpolar residues in interface | 61% nonpolar (can reach 71%) | 51% nonpolar (can be as low as 44%)
Desolvation free energy | Negative (favorable) | Positive (unfavorable)
Conformational change | Generally moderate | Can be substantial; loop and/or hinge motion
Crystallographic water positions | Around perimeter of interface | Within the interface
Type I
Conformational change: Small (rigid interface)
Interface ΔASA (a): Standard (b)
Hydrophobicity: Strong; the convex-concave interface provides good shape complementarity
Docking outcome: Successful, unless key side chains are in wrong conformations
Example: Trypsinogen and trypsin inhibitor (1cgi): KD = 0.2 pM, ΔASA = 1950 Å², and ΔGdes = -18.3 kcal/mol. Most complexes of enzymes with their protein inhibitors are in this category.

Type II
Conformational change: Small
Interface ΔASA (a): ΔASA > 2000 Å²
Hydrophobicity: Unimportant
Docking outcome: Successful
Example: Ribonuclease A and ribonuclease inhibitor (1dfj): ΔASA = 2580 Å², ΔGdes = 18.6 kcal/mol, ΔEelec = -63.9 kcal/mol, KD = 0.15 nM.

Type III
Conformational change: Moderate, but larger than for Type I
Interface ΔASA (a): Standard (b)
Hydrophobicity: Variable, but generally weak. Charge-charge interactions can be strong.
Docking outcome: Unpredictable; can be very difficult, even with known hypervariable regions of the antibody
Example: HyHEL-5 Fab with lysozyme (1mlc): KD = 126M, ΔASA = 1390 Å², ΔGdes = -3.84 kcal/mol, ΔEelec = -21.4 kcal/mol. Most antibody-antigen complexes are in this category.

Type IV
Conformational change: Restricted to side chains
Interface ΔASA (a): ΔASA < 1400 Å²
Hydrophobicity: Weak; mostly polar and charge-charge interactions
Docking outcome: Hits are found, but are generally lost in scoring and ranking
Example: Ras and Ras interacting domain (1lfd): KD = 2M, ΔASA = 1130 Å², and ΔGdes = 3.6 kcal/mol. A number of weak complexes are in this category.

Type V
Conformational change: Substantial backbone change, Cα RMSD (c) > 2 Å
Interface ΔASA (a): ΔASA > 2000 Å²
Hydrophobicity: Generally moderate
Docking outcome: Rigid-body methods seem to always fail for these complexes
Example: Cyclin A and cyclin-dependent kinase 2 (1fin): KD = 47.6 nM, ΔASA = 3390 Å², and ΔGdes = 4.7 kcal/mol.

(a) ASA: accessible surface area. (b) Standard interface: 1400 Å² < ΔASA < 2000 Å². (c) Cα RMSD: α-carbon root mean square deviation.
A Protein-Protein Docking Algorithm (Gray & Baker)
The goal is to predict protein-protein complexes from the coordinates of the unbound monomer components. The method is divided into two steps: a low-resolution Monte Carlo search and a final optimization using Monte Carlo minimization. Up to 10^5 independent simulations are carried out, and the resulting decoys are ranked using an energy function. The top-ranking decoys are clustered to select the final predictions.
GOLD (Genetic Optimisation for Ligand Docking)
Performs automated docking with full acyclic ligand flexibility, partial cyclic ligand flexibility, and partial protein flexibility in and around the active site.
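As the name suggests, the search is driven by a genetic algorithm. The sketch below shows the general idea on a vector of ligand torsion angles: a population is evolved by selection, crossover and mutation, with the best individual carried over each generation. It does not reproduce GOLD's actual operators, encoding or scoring function; `toy_fitness`, the population size, and the mutation width are hypothetical, and angle wrap-around is ignored.

```python
import random


def genetic_search(fitness, n_genes, pop_size=30, generations=40, seed=0):
    """Minimal genetic algorithm over torsion angles in degrees.
    Keeps the best individual each generation (elitism), picks parents by
    tournament, and applies one-point crossover and one-gene mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-180, 180) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        children = [pop[0][:]]                       # elitism: carry over the best
        while len(children) < pop_size:
            mum = max(rng.sample(pop, 3), key=fitness)
            dad = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = mum[:cut] + dad[cut:]
            k = rng.randrange(n_genes)
            child[k] += rng.gauss(0, 15)             # mutate one torsion
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(torsions):
    """Hypothetical score: highest when every torsion equals 60 degrees."""
    return -sum((t - 60.0) ** 2 for t in torsions)


best, history = genetic_search(toy_fitness, n_genes=4)
```

Because the best individual is always copied into the next generation, the best fitness recorded in `history` never decreases from one generation to the next.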
Protocol
THANKS
IFFAT FATIMA ROLL No : 08
