• Initial questions:
– What is the interactome?
– What scientific problem does it address?
– How is the interactome studied?
1.- Information search
Selected bibliography
2.- Processing the information
What is the interactome?
What scientific problem does it address?
How is it studied?
• Interactome:
– systematic identification of protein interactions within an organism
• Experimental procedures:
– Y2H system (yeast two-hybrid)
– Detection of complexes by mass spectrometry
• Technical problems:
– Incomplete coverage
– Detection of the interactions
• Technical challenges:
– Generation of clone sets
– Mating method for the yeast strains
• Full interactome network:
– The complete collection of all
physical protein–protein interactions
that can take place within a cell.
• Requires
– Construction of comprehensive sets
of protein–protein interactions
– Creation of genome-scale resource
collections of open reading frames
(ORFeomes) cloned so as to
facilitate protein
– Capture all expressed isoforms
(splice variants and
polymorphisms).
• Generation of comprehensive
network maps, generally depicted as
nodes (e.g. proteins, RNAs, DNA
binding sites or metabolites) linked by
edges corresponding to molecular
interactions (e.g. protein–protein
interactions, enzymatic reactions,
DNA–protein, etc.).
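The node-and-edge description above can be sketched in a few lines of Python. This is only an illustrative representation, assuming undirected protein–protein links; the function name `build_network` and the protein names are hypothetical and not taken from the slides.

```python
from collections import defaultdict

def build_network(edges):
    """Return an adjacency map {node: set of neighbours} for an undirected network."""
    adjacency = defaultdict(set)
    for a, b in edges:
        # Each edge is an interaction; store it in both directions (undirected).
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)

# Hypothetical proteins, for illustration only.
edges = [("P1", "P2"), ("P1", "P3"), ("P3", "P4")]
network = build_network(edges)
```

An adjacency map of this kind is a convenient starting point for network measures such as degree or path length.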
What is Y2H?
How is it done?
How is it interpreted?
Yeast two-hybrid system
• The Y2H system was originally described by Fields and Song in 1989 (Nature 340, 245–246).
High-throughput two-hybrid screens are subsaturating. Venn diagrams showing overlap among independent high-throughput two-hybrid screens (a) for Drosophila proteins or (b) for human proteins. In each case the data represent the entire two-hybrid dataset rather than just the set judged to be high confidence in each study. Numbers indicate unique interactions based on gene locus (i.e. detection of an interaction between protein A and two splice variants of protein B would be counted as one interaction). Data for Drosophila were obtained from Giot et al. [13], Stanyon et al. [14], and Formstecher et al. [15], and compiled to remove redundancy in the Drosophila Interactions Database [23]. Human data were obtained from Rual et al. [16] and Stelzl et al. [17].
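The counting rule in the caption (interactions are unique per gene locus, so two splice variants of B interacting with A count once) can be sketched as follows. The isoform names, the `isoform_to_locus` mapping and both screens are invented for illustration and are not data from the cited studies.

```python
def locus_pairs(interactions, isoform_to_locus):
    """Collapse isoform-level interactions to unordered gene-locus pairs."""
    return {
        frozenset((isoform_to_locus[a], isoform_to_locus[b]))
        for a, b in interactions
    }

# Hypothetical isoforms and screens.
isoform_to_locus = {"A-1": "A", "B-1": "B", "B-2": "B", "C-1": "C"}
screen_1 = [("A-1", "B-1"), ("A-1", "B-2"), ("B-1", "C-1")]
screen_2 = [("B-2", "A-1"), ("A-1", "C-1")]

pairs_1 = locus_pairs(screen_1, isoform_to_locus)
pairs_2 = locus_pairs(screen_2, isoform_to_locus)

shared = pairs_1 & pairs_2   # overlap region of the Venn diagram
only_1 = pairs_1 - pairs_2   # found only in screen 1
only_2 = pairs_2 - pairs_1   # found only in screen 2
```

Using `frozenset` makes the pair unordered, so A–B and B–A are treated as the same interaction.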
Yeast two-hybrid system
Y2H
• Quality of the data in high-throughput datasets (Y2H, AP-MS).
– The low overlap between the results of the two genome-wide yeast two-hybrid projects by Uetz et al. (2000) and Ito et al. (2000) and, similarly, between the two high-throughput AP-MS approaches analysing the yeast proteome (Gavin et al. 2002; Ho et al. 2002) has raised concerns about “noisiness” and false-negative or false-positive results.
Yeast two-hybrid system
HT-Y2H
Stringent Y2H screening strategy.
Through the use of multiple single-copy reporter genes, low-copy plasmids for expression of bait and prey in yeast, and retesting of all positives, Y2H achieves increased stringency and yields reproducible interactions.
TECHNICAL FALSE-POSITIVES IN
Y2H
• Biological false-positives
– The interaction can be confirmed by multiple, different methods, but the two proteins are
never present in the same cell or subcellular compartment at the same time.
– These false-positives are nearly impossible to unequivocally identify using interaction assays
alone.
• Technical false-positives
– In HT-Y2H experiments the technical false-positive rate is substantially reduced by
• incorporating multiple reporter genes to measure transcription activation
• employing different DNA sequences for binding by DB in the promoters of the reporter genes
• using low copy number vectors
• retesting interacting pairs in fresh yeast
• Auto-activators
– The DB-X construct activates reporter gene expression in the absence of any AD-Y.
• Strong auto-activators can be removed directly, before any AD-Y is added.
• Additional auto-activators arise when the bait acquires mutations during propagation.
• These latent auto-activators are much harder to identify: the presence of AD-Y gives the appearance of an interaction when in fact the DB-X construct alone auto-activates the Y2H reporter genes, irrespective of which AD-Y is present.
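As a rough illustration of two of the filters listed above (discarding baits that activate reporters on their own, and keeping only pairs that score again on retest), the following sketch filters a list of candidate pairs. The function name, the input structures and the example pairs are all hypothetical; a real pipeline would derive them from reporter readouts.

```python
def filter_candidates(candidates, autoactivating_baits, retest_positive):
    """Keep (bait, prey) pairs that pass the auto-activator and retest filters."""
    kept = []
    for bait, prey in candidates:
        if bait in autoactivating_baits:
            # DB-X alone activates the reporters: the signal is not prey-dependent.
            continue
        if (bait, prey) not in retest_positive:
            # The pair did not score again when retested in fresh yeast.
            continue
        kept.append((bait, prey))
    return kept

# Hypothetical screen output.
candidates = [("DB-X1", "AD-Y1"), ("DB-X2", "AD-Y1"), ("DB-X1", "AD-Y2")]
autoactivating_baits = {"DB-X2"}
retest_positive = {("DB-X1", "AD-Y1"), ("DB-X2", "AD-Y1")}

confirmed = filter_candidates(candidates, autoactivating_baits, retest_positive)
```

Latent auto-activators would only be caught here if the bait-alone control is repeated after propagation, which this sketch assumes has been done when building `autoactivating_baits`.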
AP-MS
Examples of Y2H studies
• Bacteria
– Helicobacter pylori
• Eukaryotes
– S. cerevisiae
• 5,600 interactions involving 69% of the proteins
– Drosophila melanogaster
• 24,000 interactions involving 54% of the genes
– Plasmodium falciparum
– Caenorhabditis elegans
• 5,400 interactions involving 12% of the proteins
– Homo sapiens
• 10,000 cloned ORFs
IDENTIFICATION OF INTERACTING PROTEINS
BY MASS SPECTROMETRY
• Two basic strategies:
– direct (purification of a stable complex and elucidation of the components of the complex by
mass spectrometry)
• Faces the difficult task of achieving sufficient purification of target complexes without loss of
components and with minimal contamination.
– co-AP (purification of a complex by virtue of an affinity tag placed on one of its components,
then elucidation of the components of the complex by mass spectrometry).
• Because many complexes involve very transient interactions and/or individual components are not readily detectable owing to low expression, AP–MS will underestimate the extent of complex co-membership.
• A systematic analysis suggests that a majority of novel and shared components are
likely to be biologically relevant (108), which means that AP–MS is a reliable method
for identifying novel components of complexes.
COMPLEMENTARITY OF AP–MS AND Y2H
Y2H AP-MS
The most basic characteristic of a node in a network is its degree k, which is defined as the number of links it has to other nodes.
In protein interaction networks, links are usually undirected. In other complex networks, for example gene regulatory networks, links can be directional; here the degree of a node is divided into incoming degree, comprising the links that point towards that node, and outgoing degree, denoting links pointing away from it.
• An elementary measure to characterize a network’s topology is the degree distribution P(k), obtained by counting the number of nodes having the same degree, N(k), and dividing by the total number of nodes, N.
• P(k) gives the probability of a node having exactly the degree k.
• P(k) can be used to classify networks.
[Figure: example network with nodes of degree k = 1, k = 2 and k = 4, for which P(1) = 0.33, P(2) = 0.50, P(3) = 0.00, P(4) = 0.17]
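A minimal sketch of the degree distribution P(k) = N(k) / N for an undirected network given as an edge list. The six-node network below is hypothetical; it was chosen only so that its degrees (two nodes with k = 1, three with k = 2, one with k = 4) reproduce the P(k) values quoted on the slide, and it is not necessarily the network drawn in the original figure.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: P(k)} for an undirected edge list, with P(k) = N(k) / N."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n_nodes = len(degree)              # N: nodes that appear in at least one edge
    counts = Counter(degree.values())  # N(k): number of nodes with degree k
    return {k: counts.get(k, 0) / n_nodes for k in range(1, max(counts) + 1)}

# Hypothetical network: hub H with four links, plus two extra links.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B"), ("C", "E")]
p = degree_distribution(edges)
```

Nodes without any interaction do not appear in an edge list, so this sketch ignores P(0).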
Protein interaction networks
• A Poisson distribution of P(k) values is indicative of random networks.
• Protein interaction networks instead show a power-law (scale-free) distribution of P(k). This means that the large majority of nodes have only one or very few links, while a small but significant number of nodes, the so-called “hubs”, are connected to many other nodes.
– Power-law topology contributes to robustness against random perturbations.
– Knockouts of genes encoding hubs are approximately three times more likely to confer lethality than those of non-hubs.
– Furthermore, the dynamics of interactions mediated by hub proteins point to a modular organization of the yeast proteome.
Protein interaction networks
• Another feature used to describe and classify network architecture, and the relative position of particular nodes in the network, is the path length:
– The number of steps needed to get from one node to another.
• The shortest path and the mean path length are measures of the diameter of a network.
• Scale-free networks have ultra-short mean path lengths and therefore have so-called “small-world” properties, a characteristic of random networks.
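Path lengths in an unweighted network can be obtained with a breadth-first search. The sketch below computes the mean shortest path length and, as an assumption not spelled out on the slide, takes the diameter to be the longest shortest path between any two nodes. It also assumes a connected, undirected network; the four-node chain is a hypothetical example.

```python
from collections import deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search: number of steps from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def mean_path_length_and_diameter(adjacency):
    """Mean and maximum shortest path length over all pairs of distinct nodes."""
    lengths = []
    for source in adjacency:
        dist = shortest_path_lengths(adjacency, source)
        lengths.extend(d for target, d in dist.items() if target != source)
    return sum(lengths) / len(lengths), max(lengths)

# Hypothetical chain A - B - C - D.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
mean_len, diameter = mean_path_length_and_diameter(adjacency)
```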
Protein interaction networks
• Biological and other complex networks reveal a high degree of clustering, which is not found in random networks but rather is an attribute of regular networks.
• Clustering coefficient C_i:
– Number of links existing between the neighbours of a node i divided by the maximum number of links possible between these neighbours:
C_i = 2 n_i / (k_i (k_i - 1))
where n_i is the number of links connecting the k_i neighbours of node i.
– A high clustering coefficient means that, if for example a node A is connected to B and C,
there is a high probability that B has a direct link to C or, in other words, A, B and C form a
triangle.
• A high Clustering coefficient (a high density of triangles) indicates a “community
structure” or a modular organization, which is another general property of complex
networks.
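The clustering coefficient defined above can be computed directly from an adjacency map. This is a small sketch assuming an undirected network stored as `{node: set of neighbours}`; nodes with fewer than two neighbours are given a coefficient of 0 here, which is a convention chosen for the example. The network is hypothetical.

```python
def clustering_coefficient(adjacency, node):
    """C_i = 2 * n_i / (k_i * (k_i - 1)) for one node of an undirected network."""
    neighbours = adjacency[node] - {node}
    k = len(neighbours)
    if k < 2:
        return 0.0  # no pair of neighbours exists, so no triangle is possible
    # Count links among the neighbours; each link is seen from both ends.
    links = sum(
        1
        for a in neighbours
        for b in adjacency[a]
        if b in neighbours and b != a
    ) // 2
    return 2 * links / (k * (k - 1))

# Hypothetical network: A is linked to B, C and D; B and C are also linked.
adjacency = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
c_a = clustering_coefficient(adjacency, "A")
c_b = clustering_coefficient(adjacency, "B")
```

Here A, B and C form the only triangle, so one of the three possible links among A's neighbours exists.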
• In biological networks, functional annotation of these separable subgraphs supports
the view that these structures reflect the modularity of cellular functions.
– Rationale for annotation:
• Guilt by association
• Majority rule
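One simple reading of "guilt by association" with a majority rule is to assign an unannotated protein the function most common among its annotated interaction partners. The sketch below follows that reading; the function name, protein names and annotations are hypothetical, and ties are broken arbitrarily.

```python
from collections import Counter

def predict_function(adjacency, annotations, protein):
    """Return the most frequent annotation among a protein's annotated neighbours."""
    votes = Counter(
        annotations[neighbour]
        for neighbour in adjacency[protein]
        if neighbour in annotations
    )
    if not votes:
        return None  # no annotated partner, so no prediction is made
    # most_common(1) gives the top label; ties are resolved arbitrarily.
    return votes.most_common(1)[0][0]

# Hypothetical network and annotations.
adjacency = {"X": {"P1", "P2", "P3"}, "Z": set()}
annotations = {"P1": "transport", "P2": "transport", "P3": "signalling"}
prediction = predict_function(adjacency, annotations, "X")
```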
Protein interaction network decomposition reveals functional modules and motifs.
a Graphical representation of a nonredundant set of yeast interactome data compiled from the GRID and DIP databases results in a highly complex network. b As an example, k-core decomposition is shown, a method based on the recursive removal of the least connected nodes from the yeast interactome network. Depicted is the 8-core, a subgraph in which every node is connected by at least eight edges. Colouring indicates functional categorization according to GO annotation.
Joachim F. Uhrig. Protein interaction networks in plants. Planta (2006) 224: 771–781
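The recursive removal described in the caption can be sketched as follows: nodes with fewer than k links are removed repeatedly until every remaining node has at least k links within the remaining subgraph. The sketch assumes a symmetric adjacency map in which every neighbour is also a key; the small network is hypothetical and unrelated to the yeast data in the figure.

```python
def k_core(adjacency, k):
    """Return the k-core of an undirected network as an adjacency map."""
    core = {node: set(nbs) - {node} for node, nbs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        # Nodes that currently have fewer than k links inside the subgraph.
        for node in [n for n, nbs in core.items() if len(nbs) < k]:
            for neighbour in core.pop(node):
                if neighbour in core:
                    core[neighbour].discard(node)
            changed = True
    return core

# Hypothetical network: triangle A-B-C with a chain A-D-E attached.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
two_core = k_core(adjacency, 2)
```

Removing E lowers the degree of D below 2, so D is removed in the next pass; only the triangle survives as the 2-core.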
Comparative interactomics
• The interaction map of one species is useful for predicting the interactions in another.
Towards a proteome-scale map of the human protein–protein interaction network. Rual et al. 2005. Nature 437, 1173-1178
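One way to picture this kind of cross-species prediction is to map each interacting pair in a source species onto the corresponding proteins of a target species through an ortholog table. The sketch below assumes such a table is available; the function name `predict_by_orthology`, the protein names and the mapping are hypothetical, and predictions of this kind would still need experimental confirmation.

```python
def predict_by_orthology(source_interactions, orthologs):
    """Transfer interactions to another species through an ortholog mapping."""
    predicted = set()
    for a, b in source_interactions:
        # A pair is transferred only if both partners have at least one ortholog.
        for a_target in orthologs.get(a, ()):
            for b_target in orthologs.get(b, ()):
                predicted.add(frozenset((a_target, b_target)))
    return predicted

# Hypothetical source-species interactions and ortholog table.
source_interactions = [("Y1", "Y2"), ("Y2", "Y3")]
orthologs = {"Y1": {"H1"}, "Y2": {"H2a", "H2b"}}  # Y3 has no known ortholog

predicted = predict_by_orthology(source_interactions, orthologs)
```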