CATH Database

Presented By Shri Vaishnavi & Pinky
CATH DATABASE
HIERARCHICAL DOMAIN CLASSIFICATION OF PROTEIN STRUCTURES
CATH
CLASS: Secondary structure packing within the protein structure: alpha-helices, beta-sheets, and alpha-beta. The alpha-beta class includes both alpha/beta and alpha+beta structures.
Architecture
Distinguishes structures that share a class but differ in the overall arrangement of their secondary structures. Groupings can be rather broad, as they describe general features of protein-fold shape. Ex: the TIM barrel, or the number of layers in a sandwich (Orengo C.A. et al., 1997).
Topology
The number, arrangement and connectivity of secondary structure elements are the same. Within a topology level, structures share a fold but may differ in function. Ex: the globin fold or the immunoglobulin fold.
Homology
Structures are grouped by high structural similarity and similar function; they may have evolved from a common ancestor. Ex: among the non-bundle globin-like folds, the erythrocruorins, colicins, phycocyanins and domain 1 of diphtheria toxin all have the same CAT number (1.10.340), but are differentiated by their H numbers 10, 20, 30 and 40, respectively.
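The C.A.T.H numbering above can be read as a dotted four-level code. The following is a minimal sketch, assuming the H numbers are simply appended to the CAT number from the example (for instance 1.10.340.10); the class and function names are illustrative and not part of any CATH software.

```python
from typing import NamedTuple


class CathNumber(NamedTuple):
    """Four levels of a CATH code (field names are illustrative)."""
    class_: int
    architecture: int
    topology: int
    homology: int

    @property
    def cat(self) -> str:
        # The "CAT number" is the code truncated at the topology level.
        return f"{self.class_}.{self.architecture}.{self.topology}"


def parse_cath_number(code: str) -> CathNumber:
    """Split a dotted code such as '1.10.340.10' into its four levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {code!r}")
    return CathNumber(*(int(p) for p in parts))


# Assumption: full codes built from the CAT number and H numbers in the text.
examples = {
    "erythrocruorin": "1.10.340.10",
    "colicin": "1.10.340.20",
    "phycocyanin": "1.10.340.30",
    "diphtheria toxin domain 1": "1.10.340.40",
}

parsed = {name: parse_cath_number(code) for name, code in examples.items()}
same_cat = len({p.cat for p in parsed.values()}) == 1
h_numbers = sorted(p.homology for p in parsed.values())
print(same_cat, h_numbers)
```

Comparing codes level by level like this shows at which depth of the hierarchy two domains diverge.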
Sequence family
Members have sequence identities >35%.
They are presumed to have extremely similar structures and functions; they may be slightly different examples of the same protein from different species belonging to the same sequence superfamily. Sequence families form the S level of SOLID.
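As an illustration of the >35% criterion, here is a minimal sketch that computes percent identity for two already-aligned sequences. It assumes gaps are written as "-" and that identity is measured over positions where neither sequence has a gap; CATH's own identity calculation may use a different denominator, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap (an assumption)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def same_sequence_family(aligned_a: str, aligned_b: str,
                         threshold: float = 35.0) -> bool:
    """Apply the >35% identity cut-off quoted for sequence families."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Toy sequences, not real proteins.
print(percent_identity("ACDE", "ACDF"))
print(same_sequence_family("ACDEFGHIKL", "ACDYYYYYYY"))
```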
FLOW CHART OF CATH DATABASE
CATH SERVER PROTOCOL (FRANCES PEARL ET AL.,2005)

Input Structure to Server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/CathServer.pl)
Generate derived data from the PDB coordinate files

Identify more remote homologues
Set the E-value threshold using validated structural homologues

If a match is found, the matched superfamilies are structurally compared with the query structure using the SSAP structure alignment program
Any unmatched query structure is scanned against a library of representative structures from each close sequence family in CATH
The top 10 matches are displayed
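The thresholding and "top 10" steps of the protocol can be sketched as a simple filter-and-rank. This is a toy illustration, not the server's code: the `Hit` record, the superfamily labels and the E-values are all made up, and ranking by E-value is an assumption (the server may well rank by structural score).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    superfamily: str  # hypothetical label, not a real CATH code
    e_value: float


def top_matches(hits, e_value_threshold, limit=10):
    """Keep hits at or below the threshold and return the best `limit`."""
    passing = [h for h in hits if h.e_value <= e_value_threshold]
    # Smaller E-values indicate more significant matches.
    return sorted(passing, key=lambda h: h.e_value)[:limit]


# Fifteen made-up hits with E-values 1e-0 ... 1e-14.
hits = [Hit(f"sf_{i:02d}", float(f"1e-{i}")) for i in range(15)]
best = top_matches(hits, e_value_threshold=1e-3)
for hit in best:
    print(hit.superfamily, hit.e_value)
```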
DICTIONARY OF HOMOLOGOUS SUPERFAMILIES (DHS) ( J.E.BRAY ET AL.,2000)
Database of validated multiple structural alignments annotated with consensus functional information for evolutionary protein families.
A powerful resource to validate, examine and visualize key structural and functional features of each homologous superfamily.
Also provides a tool for examining sequence-structure relationships for proteins within each fold group
GENERATION OF DATA FOR THE DHS

Generation of structure comparison data using SSAP
Comparisons provide a complete data set for analyzing analogues and homologues, and for checking for incorrect classifications
Automatic validation of structural relatives(DHS-VALID)
The DHS-VALID program is used to automatically check all the pairwise sequence and structure comparison data generated for each fold group and homologous superfamily in CATH.
Generation of multiple structural alignment using CORA
CORA: Conserved Residue Attributes
Uses the pairwise structural comparison data from SSAP to determine the initial set of proteins to be aligned
Identifies conserved characteristics and expresses them as a 3D structural profile
Profiles encapsulate the core
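To give a feel for what "identifying conserved characteristics" across an alignment means, the sketch below scores each column of a toy multiple alignment by the frequency of its most common residue. This is a deliberately simplified, sequence-only stand-in and not the CORA algorithm, which works on structural attributes; the function name and data are hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """Return (most common residue, fraction of rows) for each column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    profile = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]  # ignore gaps
        if not residues:
            profile.append(("-", 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Fraction is taken over all rows, so gaps lower conservation.
        profile.append((residue, count / len(column)))
    return profile


# Toy alignment of four made-up sequences.
toy_alignment = ["GKVLA", "GRVLA", "GKV-A", "GKILA"]
profile = column_conservation(toy_alignment)
fully_conserved = [i for i, (_, frac) in enumerate(profile) if frac == 1.0]
print(profile)
print(fully_conserved)
```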
Annotation of structural alignments
GENE3D (DANIEL W.A. BUCHAN ET AL.,2002)
It is focused on providing structural annotation for protein sequences without structural representatives. The protein sequences have also been clustered into whole-chain families to aid functional prediction.
The structural annotation is generated using HMM models based on the CATH domain families
Applications:
Annotate hypothetical proteins and genes (Corin Yeats et al.,2006)

Examine the functions of homologous superfamilies that are multiply expanded within genomes or sets of genomes.
APPLICATIONS OF CATH DATABASE
CATH database was used as a guide to select proteins from a wide variety of protein families (Jonathan G. Lees et al.,2006)
To capture evolutionary divergence (Lesley H. Greene et al.,2007)

For identifying remote homologs (J.E.Bray et al.,2000)
The organization of proteins by global structural similarity helps improve prediction algorithms based on fold recognition
Allow the distribution of common motifs to be explored more easily
Gives insights into which combinations of motifs generate stable protein architectures
Allows newly determined structures to be easily examined for recognizable folds (CA Orengo et al.,1997)
Crossword (grid not reproduced; answers visible in the grid: GENE3D, SSAP)
1. Boundary assignment by inheriting from other chain
2. Predicts hypothetical proteins
3. Database of validated multiple structural alignments
4. Scores used for identifying matches
REFERENCES

CA Orengo et al., 1997. CATH: a hierarchic classification of protein domain structures
J.E. Bray et al., 2000. The CATH Dictionary of Homologous Superfamilies (DHS): a consensus approach for identifying distant structural homologues
CA Orengo et al., 1999. The CATH Database provides insights into protein structure/function relationships
Lesley H. Greene et al., 2007. The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution
Frances Pearl et al., 2005. The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis
Daniel W.A. Buchan et al., 2002. Gene3D: Structural Assignment for Whole Genes and Genomes Using the CATH Domain Structure Database
Corin Yeats et al., 2006. Gene3D: modelling protein structure, function and evolution
