
Genome Evolution in Yeast

Gilles Fischer

27th January 2009 | European Course on


INTRODUCTION:

Comparative genomics

Yeasts as model organisms

GENOME EVOLUTION:

DNA duplications

Chromosome dynamics

Nucleotide composition
A brief introduction to the field of Comparative Genomics

Comparing genomes is a very old idea

DNA carries the genetic information: Avery et al. (1944) and Hershey-Chase (1952)

Vendrely and Vendrely (1950):

"There is no doubt that the systematic study of the absolute deoxyribonucleic acid
content of the nucleus, across many animal species, can provide interesting
suggestions regarding the problem of evolution"

Jacques Monod:

"Everything that is true for the colon bacillus (E. coli) is true for the elephant"
A brief introduction to the field of Comparative Genomics

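[Figure: genomes ranked from identical through divergent to different, along an axis of time or quantity of evolutionary changes. Two approaches are placed on this axis: looking for differences and looking for similarities.]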


NEED FOR ADEQUATELY RELATED ORGANISMS


A brief introduction to the field of Comparative Genomics

Bio-informatics
Genome sequences => looking for differences, looking for similarities => rules governing genome evolution

Experimental Biology
Genetic screens, functional genomics => molecular mechanisms, mechanistic hypotheses
SMALL GENOMES
AND
EXPERIMENTALLY TRACTABLE

A brief introduction to the field of Yeast Genomics

Organisms with small genomes, phylogenetically related and experimentally tractable =


YEASTS

Eukaryotic micro-organisms classified in the kingdom Fungi


About 1,500 species currently described (only an estimated 1% of all yeast species)
Yeasts are unicellular, typically measuring 3-4 µm in diameter (up to over 40 µm)

Saccharomyces cerevisiae has been used in baking and in fermenting alcoholic beverages for thousands of years
Other yeast species, such as Candida albicans, are opportunistic human pathogens
Yeasts have recently been used to generate electricity in microbial fuel cells and to produce ethanol for the biofuel industry.

Yeasts are found in both divisions Ascomycota and Basidiomycota


The budding yeasts ("true yeasts") are classified in the Saccharomycotina subphylum
[Figure: The Tree of Eukaryotes (Keeling et al., 2005), with the Saccharomycotina indicated]
A brief introduction to the field of Yeast Genomics

The first eukaryotic genome sequence: the genome of S. cerevisiae
André Goffeau
8 years, 120 labs, 641 people
"Life with 6000 genes", Science (1996)

Saccharomycotina species on the phylogenetic tree:
Saccharomyces paradoxus
Saccharomyces mikatae
Saccharomyces cerevisiae
Saccharomyces kudriavzevii
Saccharomyces bayanus
Saccharomyces pastorianus
Saccharomyces exiguus
Saccharomyces servazzii
Saccharomyces castellii
Candida glabrata
Vanderwaltozyma polyspora
Zygosaccharomyces rouxii
Lachancea thermotolerans
Lachancea waltii
Lachancea kluyveri
Kluyveromyces lactis
Kluyveromyces marxianus
Eremothecium gossypii
Saccharomycodes ludwigii
Brettanomyces bruxellensis
Pichia angusta
Candida lusitaniae
Debaryomyces hansenii
Pichia stipitis
Pichia sorbitophila
Candida guilliermondii
Candida tropicalis
Candida parapsilosis
Lodderomyces elongisporus
Candida albicans
Candida dubliniensis
Arxula adeninivorans
Yarrowia lipolytica

Outside the Saccharomycotina: Schizosaccharomyces pombe
A brief introduction to the field of Yeast Genomics

Evolutionary events mapped onto the same tree of Saccharomycotina species:
- Whole Genome Duplication
- Gain of megasatellites
- Gain of the HO gene
- Gain of mating type cassettes and small centromeres
- Frequent tandem duplications
- Extensive loss of transposable elements and spliceosomal introns
A brief introduction to the field of Yeast Genomics
Genome annotation

Species                     # chr  size (Mb)  # genes  # tRNA  # introns
Saccharomyces cerevisiae      16     12.1      5769     274      287
Candida glabrata              13     12.3      5204     207      131
Zygosaccharomyces rouxii       7      9.8      4998     272      167
Lachancea kluyveri (*)         8     11.3      5308     258      322
Lachancea thermotolerans       8     10.4      5104     231      286
Kluyveromyces lactis           6     10.7      5084     162      175
Debaryomyces hansenii          7     12.1      6273     200      475
Yarrowia lipolytica            6     20.5      6434     510     1070

(*) WashU seq center, M. Johnston

A brief introduction to the field of Yeast Genomics
Evolutionary scale

Amino acid identity (%) with the reference species of each group:

Yeasts
Saccharomyces cerevisiae    100 *
Candida glabrata             65
Zygosaccharomyces rouxii      -
Lachancea kluyveri            -
Lachancea thermotolerans      -
Kluyveromyces lactis         60
Debaryomyces hansenii        51
Yarrowia lipolytica          48

Chordates
Homo sapiens                100 *
Mus musculus                 90
Takifugu rubripes, Tetraodon nigroviridis   70
Ciona intestinalis           50

Time labels on the figure: 100 MYr, 100 - 300 MYr and 300 - 1000 MYr (yeasts); 100 MYr, 450 MYr and 550 MYr (chordates)

Berbee and Taylor, 2006; James et al., 2006
* Dujon et al. and Jaillon et al., Nature, 2004
A brief introduction to the field of Yeast Genomics
Genome redundancy

[Figure: mean family size (axis from 1.10 to 1.40) for Saccharomyces cerevisiae, Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri (WashU seq center, M. Johnston), Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii and Yarrowia lipolytica, with the WGD marked on the tree]

- important level of redundancy (in all eukaryotic phyla)
- gene order changes (differential loss of duplicates, translocation breakpoints)
- several mechanisms of duplication

Wolfe and Shields, 1997
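
The "mean family size" used as a redundancy measure can be illustrated with a short sketch. It assumes, which the slide does not state, that every gene belongs to exactly one family, that a gene without paralogues forms a family of size 1, and that the mean is the number of genes divided by the number of families. The function name and the toy gene and family names are hypothetical.

```python
from collections import Counter


def mean_family_size(gene_to_family):
    """Return (number of genes) / (number of families).

    gene_to_family maps each gene name to a family identifier;
    a gene without paralogues is the only member of its family.
    """
    if not gene_to_family:
        raise ValueError("no genes given")
    # Count how many genes fall into each family.
    family_sizes = Counter(gene_to_family.values())
    return len(gene_to_family) / len(family_sizes)


# Toy genome (hypothetical names): 5 genes in 4 families.
toy = {
    "geneA1": "famA",
    "geneA2": "famA",
    "geneB": "famB",
    "geneC": "famC",
    "geneD": "famD",
}
print(mean_family_size(toy))
```

With this definition a genome without any duplicated gene has a mean family size of exactly 1, so values such as 1.10 to 1.40 would reflect the extra copies carried by multigene families.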


Yeast Genomes

- Small, compact and specialized:
    - small intergenic sequences
    - few transposable elements
    - few introns
    - limited RNA interference

- Large evolutionary scale

- High level of genome redundancy

- Numerous evolutionary novelties in all clades

- High number of sequenced genomes

===> good model organisms to study genome evolution


Genome evolution: DNA duplications

Most eukaryotic genomes contain a high proportion of duplicated genes

                    S. c.  A. t.  C. e.  D. m.  H. s.
Duplicated genes     43%    65%    49%    40%    50%

Fates of the two copies after duplication:
- Pseudogenization: loss of function (most frequent fate)
- Neofunctionalization: gain of a new function
- Conservation: gene dosage increase, genetic robustness
- Degeneration-Complementation: specialization of the 2 copies

===> Strong evolutionary potential


Genome evolution: DNA duplications

Adaptive value of DNA duplications:

Adaptation to sulfate-limited conditions in chemostats for 200 generations:

Segmental duplications (SDs) detected by comparative genomic hybridization (CGH)

SDs contain between 1 and 22 genes

No extended homology at the junctions (microhomologies only)

Gresham et al., PLoS Genet 2008


Genome evolution: DNA duplications

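A duplication assay:

Wild type: RPL20B (chr XV) and RPL20A (chr XIII) ==> WT growth rate

rpl20A deletion: RPL20B (chr XV) only ==> slow growth

After 3 days on YPD at 30 °C (and so on): ??? ==> WT growth rate, with two copies of RPL20B (chr XV) and the rpl20A deletion (chr XIII)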
Genome evolution: DNA duplications
A duplication assay: molecular characterization of segmental duplications

[Figure: characterization by karyotype, hybridization with an RPL20B probe, comparative genomic hybridization, molecular combing, and PCR and sequencing. Labels on the figure: chr XV, RPL20B, 143 kb, direct tandem. Junction sequence: AACCTAGAGCTT(GTT)14GTGGATTGTTT]

Despite the selection of a single gene duplication event, only large segmental duplications were recovered
Genome evolution: DNA duplications
Molecular mechanisms:

                   rate of SDs              type of SDs         breakpoint sequences (%)
strain             (/cell/division)         intra-   inter-     LTRs   microhomologies /
                   (relative to WT)         chrom.   chrom.            microsatellites

WT                 10^-7 (1)                  42       6         48       52

REPLICATION
pol32              0 (<0.07)                   -       -          -        -
clb5               7 x 10^-5 (730)            66       3         62       38
CPT                3 x 10^-5 (320)            22       0         54       56

DSB REPAIR
rad52              3 x 10^-7 (3)              70       1          0      100
rad52 rad1 dnl4    8 x 10^-8 (0.8)            15       0          0      100

LTRs: about 300 bp. Microhomologies: 2 to 11 bp. Microsatellites: poly A/T or trinucleotide repeats.

[Figure: replication timing profile (time in min, 0 to 40; Raghuraman et al., Science, 2001) showing late-replicating regions, tRNAs, LTRs and microsatellites: a connection with replication?]

Koszul et al., EMBO J., 2004
Replication-based mechanisms

                   rate of SDs              type of SDs         breakpoint sequences (%)
strain             (/cell/division)         intra-   inter-     LTRs   microhomologies /
                                            chrom.   chrom.            microsatellites

WT                 10^-7 (1)                  42       6         48       52
clb5               7 x 10^-5 (730)            66       3         62       38
pol32              0 (<0.07)                   -       -          -        -

clb5:
- Clb5 defect in the firing of late replication origins (Schwob et al., 1993)
- S phase lasts twice as long (Epstein et al., 1992)
- Rad9-dependent activation of the replication checkpoint, indicative of DNA damage (Gibson et al., 2004)
- RPL20B lies in a Clb5-dependent region (CDR; McCune et al., 2008)

replication perturbations strongly induce SD formation

pol32:
- Pol32 is required for initiating the BIR reaction (Lydeard et al., 2007)

Bloom and Cross, 2007
Nick McElhinny, Cell 2008

SDs are generated through replication-based mechanisms


Replication-based mechanisms

                   rate of SDs              type of SDs         breakpoint sequences (%)
strain             (/cell/division)         intra-   inter-     LTRs   microhomologies /
                                            chrom.   chrom.            microsatellites

WT                 10^-7 (1)                  42       6         48       52
CPT                3 x 10^-5 (320)            22       0         54       56

CPT (camptothecin, a Top1 inhibitor) => broken forks promote SD formation

Broken forks as precursor lesions leading to SDs


The DSB repair pathways

[Diagram:
- NHEJ (Dnl4): no homology, simple religation
- Resection, followed by either:
    - HR (Rad52), more than 30 bp of homology: SSA, BIR (Pol32), SDSA and DSBR (Rad51)
    - MMEJ (Rad1): microhomologies (5-12 bp)]
Two different replication-based mechanisms

                   rate of SDs              type of SDs         breakpoint sequences (%)
strain             (/cell/division)         intra-   inter-     LTRs   microhomologies /
                                            chrom.   chrom.            microsatellites

WT                 10^-7 (1)                  42       6         48       52
rad52              3 x 10^-7 (3)              70       1          0      100

Without Rad52, SDs with LTRs at the breakpoints disappear, while SDs with microhomologies or microsatellites remain:
- LTR breakpoints: HR-dependent
- microhomology / microsatellite breakpoints: HR-independent

=> HR-mediated SDs result from Rad51-independent BIR

=> Non HR-mediated SDs result from ?


The DSB repair pathways

[Diagram: the DSB repair pathways with the Dnl4, Rad52 and Rad1 branches crossed out (X)]
MMIR: microhomology/microsatellite-induced replication

                   rate of SDs              type of SDs         breakpoint sequences (%)
strain             (/cell/division)         intra-   inter-     LTRs   microhomologies /
                                            chrom.   chrom.            microsatellites

WT                 10^-7 (1)                  42       6         48       52
rad52              3 x 10^-7 (3)              70       1          0      100
rad52 rad1 dnl4    8 x 10^-8 (0.8)            15       0          0      100

HR requires Rad52
MMEJ requires Rad1
NHEJ requires Dnl4

SDs are still being formed in the absence of all known DSB repair pathways
=> existence of a new DSB repair pathway?

Sequences found at breakpoints:
- microhomologies between 2 and 11 bp
- poly(A/T) tracts, 13 to 23 nt long
- trinucleotide repeats, (GTT) x 3 to 20

Extremely high density of microhomologies and microsatellites in the genome, often intragenic

Formation of chimeric genes at breakpoints (in 13 out of 26 junctions)

The DSB repair pathways

[Diagram: the DSB repair pathways with the Dnl4, Rad52 and Rad1 branches crossed out (X)]

A new pathway?
MMIR: Microhomology/Microsatellite-Induced Replication
- independent of all known DSB repair pathways (HR, NHEJ, MMEJ)
- dependent on Pol32
- replication template switching between microhomologies and microsatellites
Genome evolution: DNA duplications
Conclusions

SDs are spontaneously generated at high frequency: 10^-7 SD/cell/division for the RPL20B locus

SDs arise from two alternative replication-based mechanisms: BIR and MMIR

MMIR represents a new mechanism, different from known DSB repair pathways (HR, NHEJ):
- occurs between microhomologies (2 to 11 nt) and microsatellites (poly A/T, trinucleotide repeats)
- independent of Rad52
- requires Pol32

MMIR induces the formation of chimeric genes at the rearrangement junctions


Genome evolution: DNA duplications
In human, FoSTeS/MMBIR:

Hastings et al., Nature Reviews Genetics, 2009

Complex structural variations:
- Lissencephaly (Nagamani et al., J Med Genet 2009)
- Miller-Dieker syndrome
- Charcot-Marie-Tooth disease (Lupski and Chance, 2005)
- Pelizaeus-Merzbacher disease (Lee et al., Cell 2007)
- XLMR syndrome (Bauters et al., Genome Res 2008)
- SDs and CNVs (Kim et al., Genome Res 2008)
Genome evolution: Chromosome Dynamics

- Duplications: high evolutionary potential (creation of new genes, adaptation, specialization, ...)

- Translocations, inversions, deletions: very low evolutionary potential? (loss of genes, deregulation of gene expression, modification of sub-nuclear architecture, ...)

[Diagram: comparing Species 1 and Species 2 gives the number (#) of translocations, inversions, duplications and deletions, from which rates of rearrangements are derived]
Genome evolution: Chromosome Dynamics

[Figure: phylogenetic tree of Saccharomyces cerevisiae, Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri, Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii and Yarrowia lipolytica, with an inset for the sensu stricto species S. cerevisiae, S. cariocanus, S. paradoxus, S. mikatae, S. kudriavzevii and S. bayanus]

Saccharomyces sensu stricto complex:
- monophyletic group
- very closely related species
- hybrids viable but sterile
- 16 chromosomes
Genome evolution: Chromosome Dynamics

Number of translocations relative to S. cerevisiae:
S. cariocanus (4)
S. paradoxus (0)
S. mikatae (2)
S. kudriavzevii (0)
S. bayanus (4)

only few translocations:
- low reorganization
- recombination between repeated sequences
- no chromosomal speciation
- variable rate of rearrangements?

[Figure panels: S. cerevisiae, S. paradoxus, S. kudriavzevii, S. mikatae, S. cariocanus, S. bayanus]

Fischer et al., Nature 2000


Genome evolution: Chromosome Dynamics

[Figure: S. cerevisiae chr VIII compared with the chromosomes of S. bayanus, C. glabrata, K. lactis, D. hansenii and Y. lipolytica]

S. bayanus      98%
C. glabrata     88%
K. lactis       77%
D. hansenii     11%
Y. lipolytica    5%


Fischer

F. Brunet

Fischer et al. , PLoS Genet 2006
Genome evolution: Chromosome Dynamics at genome scale:

S. cerevisiae vs C. glabrata: "UNSTABLE" GENOMES
- mean amino acid identity: 65%
- comprehensive reshuffling
- 509 translocations, 104 inversions
- no homologous chromosomes

L. kluyveri vs L. thermotolerans: "STABLE" GENOMES
- mean amino acid identity: 58%
- moderate reshuffling
- 91 translocations, 22 inversions
- large chromosomal segments (up to 670 kb)
Genome evolution: Chromosome Dynamics

Quantitative estimation of the relative genome stability: GOC (gene order conservation)

For two orthologues that are neighbors in species 1, are they also neighbors in species 2? If yes: +1. If no: 0. (Example drawn with 5 orthologues in each species.)

GOC = (# neighboring orthologues) / (total # orthologues)

- GOL: Gene Order Loss = 1 - GOC

- Rate of rearrangements = GOL / phylogenetic distance, compared with the mean rate

Rocha, Trends Genet, 2003
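
The GOC and GOL definitions above can be sketched in a few lines. This is a minimal illustration, not the procedure of the cited papers. It assumes that each genome is given as a list of chromosomes, each an ordered list of orthologue identifiers shared by both species, that two orthologues count as neighbors when they are immediately adjacent on the same chromosome in either orientation, and that the denominator is the total number of orthologues, as written on the slide. Function names and the toy genomes are hypothetical.

```python
def adjacent_pairs(genome):
    """Return the set of unordered pairs of adjacent orthologues."""
    pairs = set()
    for chromosome in genome:
        for left, right in zip(chromosome, chromosome[1:]):
            pairs.add(frozenset((left, right)))
    return pairs


def goc(genome1, genome2):
    """Gene order conservation: conserved adjacencies / total orthologues."""
    total = sum(len(chromosome) for chromosome in genome1)
    if total == 0:
        raise ValueError("genome1 contains no orthologues")
    conserved = adjacent_pairs(genome1) & adjacent_pairs(genome2)
    return len(conserved) / total


def gol(genome1, genome2):
    """Gene order loss, defined on the slide as 1 - GOC."""
    return 1.0 - goc(genome1, genome2)


# Toy example with 5 orthologues: species 2 carries a translocation
# that separates c from d and inverts the d-e segment.
species1 = [["a", "b", "c", "d", "e"]]
species2 = [["a", "b", "c"], ["e", "d"]]
print(goc(species1, species2), gol(species1, species2))
```

In the toy example the adjacencies a-b, b-c and d-e are conserved and c-d is lost, so 3 of the 5 orthologues score +1. Dividing GOL by a phylogenetic distance, as on the slide, would then turn this loss into a rate.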


Genome evolution: Chromosome Dynamics

[Figure: rearrangement rate for each branch of the tree of Saccharomyces cerevisiae, Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri (WashU seq center, M. Johnston), Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii and Yarrowia lipolytica, with the WGD marked; branch values range from 0.0 to 2.7. Alongside, a species instability scale positions D. hansenii, S. cerevisiae, C. glabrata, Z. rouxii, K. lactis, L. kluyveri and L. thermotolerans]

Fischer et al., PLoS Genet 2006
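Genome evolution: Chromosome Dynamics

[Figure: summary tree of the sensu stricto species (S. cerevisiae, S. bayanus), Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri (WashU seq center, M. Johnston), Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii and Y. lipolytica. Labels on the tree: massive, moderate and low differential gene loss; unstable genome (next to Candida glabrata); stable genomes (next to Lachancea thermotolerans and Kluyveromyces lactis); TGA expansion (next to Debaryomyces hansenii); no synteny (next to Y. lipolytica)]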
Genome evolution: Chromosome Dynamics
Conclusions

High level of chromosome plasticity

Hundreds of translocations and inversions

Gene order is not very constrained

Highly variable rates of chromosome rearrangements between lineages but also within a given lineage

Is there a selective advantage associated with these rearrangements? Are they accumulated by genetic drift?

- usually considered as deleterious
- few examples of an adaptive role of rearrangements: proliferation of cancer cells (O'Neil and Look, 2007), growth advantage of translocated yeast cells (Colson et al., 2004), adaptive gene loss (Domergue, 2005)

Does the creation of genetic novelties require chromosome plasticity?


Genome evolution: Nucleotide composition

Species                     GC%
Saccharomyces cerevisiae    38.3
Candida glabrata            38.8
Zygosaccharomyces rouxii    39.1
Lachancea kluyveri          41.5
Lachancea thermotolerans    47.3
Kluyveromyces lactis        38.8
Eremothecium gossypii       52.0
Debaryomyces hansenii       36.3
Yarrowia lipolytica         49.0

The Génolevures Consortium, Genome Res., 2009

Base substitution mutations:
- C->T transitions: cytosine deamination (Kreutzer and Essigmann, PNAS, 1998)
- G->T transversions: 8-oxo-guanine (Shibutani et al., Nature, 1991)
=> Global AT-enrichment

Biased Gene Conversion (BGC):
- AT->GC mutations; not in yeast? (Duret and Galtier, Annu Rev Genomics Hum Genet, 2009; Marsolier-Kergoat and Yeramian, Genetics, 2009)
=> Global GC-enrichment
[Figure: GC% (axis from 20 to 80) along chromosomes A to H of Lachancea thermotolerans (47.3%) and chromosomes A to G of Zygosaccharomyces rouxii (39.1%), position axis from 1 to 10 Mb]

[Figure: GC% (axis from 20 to 80) along chromosomes A to H of Lachancea kluyveri, position axis from 1 to 11 Mb: 41.5% for the genome, 52.9% in the region labelled C-left (1 Mb)]
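
A GC% profile along a chromosome, as plotted in these figures, is typically obtained with a sliding window. The sketch below is an assumption about how such a profile could be computed, not the procedure used for the slides: the function names, the window size and the step are hypothetical, and a short synthetic sequence stands in for a chromosome.

```python
def gc_percent(seq):
    """Return the percentage of G and C bases in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def sliding_gc(seq, window, step):
    """Return (window start, GC%) for each full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Synthetic 40 bp sequence: an AT-only half followed by a GC-only half.
toy_chromosome = "AT" * 10 + "GC" * 10
for start, gc in sliding_gc(toy_chromosome, window=20, step=10):
    print(start, gc)
```

On a real chromosome the window would be on the order of kilobases. A compositionally distinct region such as C-left would then appear as a sustained shift of the profile rather than as a single outlying window.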
Genome evolution: Nucleotide composition

DNA: global GC increase
GC% in C-left:              46.1   54.2   46.8
GC% out of C-left:          37.4   42.0   36.5

RNA: strong bias in codon usage
Codon position:             1st    2nd    3rd
GC% in C-left:              53.3   41.0   68.3
GC% out of C-left:          46.4   37.0   42.7

Protein: bias in protein composition
Amino acid:                 A     G     P     R     I     N     K     F
GC% in synonymous codons:   84    84    84    72    11    16    16    16
Relative use in C-left:     1.3   1.2   1.1   1.2   0.7   0.8   0.9   0.9

Payen et al., Genome Res., 2009
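
The GC% at the 1st, 2nd and 3rd codon positions reported above can be computed directly from coding sequences. The following is a minimal sketch under stated assumptions: the input is a single in-frame coding sequence whose length is a multiple of 3, the function name is hypothetical, and the short sequence is invented for illustration, not taken from L. kluyveri.

```python
def gc_by_codon_position(cds):
    """Return GC% at the 1st, 2nd and 3rd codon positions of an in-frame CDS."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    counts = [0, 0, 0]
    for index, base in enumerate(cds):
        if base in "GC":
            # index % 3 is 0, 1 or 2 for the 1st, 2nd or 3rd codon position.
            counts[index % 3] += 1
    n_codons = len(cds) // 3
    return tuple(100.0 * count / n_codons for count in counts)


# Invented example: codons ATG GCC AAG TAA.
print(gc_by_codon_position("ATGGCCAAGTAA"))
```

Because most synonymous changes fall on the 3rd position, a shift in GC content that is strongest at that position, as in the C-left values of the table, is the signature expected of a codon usage bias.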
Genome evolution: Nucleotide composition
Phylogeny:

Alignments of universally conserved proteins:
- 17 families (6688 residues) outside C-left
- 19 families (4631 residues) in C-left

[Figure: phylogenetic trees of S. cerevisiae, C. glabrata, Z. rouxii, L. kluyveri, L. waltii, L. thermotolerans, K. lactis and E. gossypii; node support values between 96 and 100; scale bar 0.05]

C-left has the same phylogenetic origin as the rest of the genome

Payen et al., Genome Res., 2009


Genome evolution: Nucleotide composition
Synteny:

[Figure: synteny map with the labels LAKL_C (670 kb), LAWA_S33, LAWA_S27, LAWA_S56, LAWA_S55, LATH_F, LATH_G, LATH_C, LATH_E and LATH_A]

C-left shares a common ancestral origin with the genomes of L. waltii (LAWA) and L. thermotolerans (LATH)
Genome evolution: Nucleotide composition
Replication:

- Design of custom microarrays (Agilent 2 x 105k): 200 bp fragments

- Time course analysis of copy number variation during S-phase (G1, S, G2): G1 DNA labelled with Cy3, S DNA labelled with Cy5

[Figure: profiles for ChrA, ChrB, ChrC and ChrD]
Genome evolution: Nucleotide composition
Conclusions

L. kluyveri offers a unique opportunity to understand the mechanisms of evolution of genome nucleotide composition

The C-left region:
- shows a global GC increase (codon usage bias and protein composition bias)
- harbors a normal gene density
- has a phylogenetic origin consistent with the rest of the genome
- presents a very high level of synteny conservation with sister species genomes
- encompasses the MAT locus but has lost the silent cassettes HMR and HML
- is devoid of transposable elements (203 insertions in the rest of the genome)
- harbors the same compositional bias in all 11 L. kluyveri strains tested

The replication program is modified (more origins and delayed firing)

=> a cause or a consequence of the unusual GC composition?

Meiotic recombination and BGC?


Thank you
- Unité de Génétique Moléculaire des Levures, Institut Pasteur
Celia Payen
Romain Koszul

- Unité de Génomique des Microorganismes, équipe Biologie des Génomes
Nicolas Agier
Guénola Drillon

- Génolevures consortium: Jean-Luc Souciet, Univ. Louis Pasteur, Strasbourg

- Centre National de Séquençage, Evry: Jean Weissenbach, Patrick Wincker

- Génopole Pasteur-Ile de France: Christiane Bouchier, Lionel Frangeul

- Plateforme Puces ADN, Génopole Pasteur: Odile Sismeiro, Jean-Yves Coppée
