Andre Faure & Petra Schwalie, Paul Flicek Lab, Vertebrate Genomics, EMBL-EBI, 9 March 2010
RESOURCES
http://www.bioconductor.org http://seqanswers.com data
WORKFLOW
Raw data (ENCODE CTCF, H3K36me3, Input in K562 & HepG2)
Quality check & align (not discussed here)
(1) Peak-calling
(2) Genomic context
Read profile plots
(3) Motif analysis (de novo & scanning)
(4) Differential enrichment
[Figure: CTCF motif sequence logo]
(1) PEAK-CALLING
chipseq, GenomicRanges (Bioconductor)
estimating fragment length
extending reads
islands of enrichment
modeling the background (e.g. Poisson, negative binomial)
calling peaks (manual, MACS, SWEMBL)
genomic overlaps: comparison of peak-calling results
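The steps listed above can be illustrated with a small, self-contained sketch. It is written in plain Python rather than with the chipseq or GenomicRanges packages, and it is a toy version of the idea, not the method used by MACS or SWEMBL. The function names, the read representation (5' position plus strand), the fixed fragment length, the use of mean coverage as the Poisson rate, and the p-value cutoff are all assumptions made for illustration; a real analysis would estimate the background from the input control.

```python
import math

def extend_reads(reads, fragment_length):
    """Extend each (five_prime_pos, strand) read to the fragment length.

    Returns half-open (start, end) intervals. Minus-strand reads are
    extended towards lower coordinates.
    """
    fragments = []
    for pos, strand in reads:
        if strand == "+":
            fragments.append((pos, pos + fragment_length))
        else:
            fragments.append((max(0, pos - fragment_length + 1), pos + 1))
    return fragments

def coverage(fragments, chrom_length):
    """Per-base count of overlapping fragments."""
    cov = [0] * chrom_length
    for start, end in fragments:
        for i in range(start, min(end, chrom_length)):
            cov[i] += 1
    return cov

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)

def call_islands(cov, lam, p_cutoff=1e-3):
    """Find runs of bases whose depth is unlikely under Poisson(lam)."""
    # Smallest depth whose upper-tail probability is at or below the cutoff.
    threshold = 1
    while poisson_upper_tail(threshold, lam) > p_cutoff:
        threshold += 1
    islands, start = [], None
    for i, depth in enumerate(cov):
        if depth >= threshold and start is None:
            start = i
        elif depth < threshold and start is not None:
            islands.append((start, i))
            start = None
    if start is not None:
        islands.append((start, len(cov)))
    return threshold, islands

# Toy data (hypothetical): a cluster of reads plus two isolated reads.
reads = [(50, "+"), (52, "+"), (54, "+"), (75, "-"), (73, "-"), (71, "-"),
         (10, "+"), (150, "-")]
cov = coverage(extend_reads(reads, fragment_length=20), chrom_length=200)
background_rate = sum(cov) / len(cov)  # crude genome-wide mean as Poisson rate
threshold, islands = call_islands(cov, background_rate)
print(threshold, islands)
```

Using the mean coverage as the background rate keeps the example short; local rates or a negative binomial model, as mentioned on the slide, would handle overdispersed backgrounds better.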
(2) GENOMIC CONTEXT
biomaRt, GenomicRanges (Bioconductor)
obtaining annotation (Ensembl)
overlaps with annotation (e.g. promoters)
enrichment of peaks in genomic areas (e.g. promoters) (not discussed here)
functional term enrichment (not discussed here) (e.g. GREAT, McLean et al., Nat Biotechnol)
average profile plots on genomic feature/peak summit
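As a rough illustration of the "overlaps with annotation" step, the sketch below assigns peaks to promoters in plain Python. It does not use biomaRt or GenomicRanges; the gene table, the peak list, and the promoter definition (1000 bp upstream and 200 bp downstream of the TSS) are hypothetical values chosen for the example, and in practice the annotation would come from Ensembl.

```python
def promoter_window(tss, strand, upstream=1000, downstream=200):
    """Half-open promoter interval around a TSS, oriented by strand."""
    if strand == "+":
        return (max(0, tss - upstream), tss + downstream)
    return (max(0, tss - downstream + 1), tss + upstream + 1)

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def peaks_in_promoters(peaks, genes, upstream=1000, downstream=200):
    """For each peak, list the genes whose promoter it overlaps."""
    result = []
    for chrom, start, end in peaks:
        hit_genes = []
        for name, (gene_chrom, tss, strand) in genes.items():
            if gene_chrom != chrom:
                continue
            p_start, p_end = promoter_window(tss, strand, upstream, downstream)
            if overlaps(start, end, p_start, p_end):
                hit_genes.append(name)
        result.append(hit_genes)
    return result

# Hypothetical annotation: gene name -> (chromosome, TSS, strand).
genes = {"geneA": ("chr1", 5000, "+"), "geneB": ("chr1", 20000, "-")}
# Hypothetical peaks: (chromosome, start, end), half-open.
peaks = [("chr1", 4900, 5100), ("chr1", 12000, 12300),
         ("chr1", 20900, 21100), ("chr2", 4900, 5100)]

hits = peaks_in_promoters(peaks, genes)
fraction_in_promoters = sum(1 for h in hits if h) / len(peaks)
print(hits, fraction_in_promoters)
```

The nested loop is fine for a toy example; interval trees or sorted sweeps, which is roughly what range-overlap libraries provide, are the usual choice for genome-scale peak sets.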
(3) MOTIF ANALYSIS
(Bioconductor)
obtaining the peak sequences
de novo motif discovery
motif scanning: motifs per peak?
motif enrichment vs. background (not discussed here)
refining the PWM for a given factor
motif profile plot (distribution of motif around peak summit)
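Motif scanning with a position weight matrix (PWM) can be sketched in a few lines of plain Python. This is a generic log-odds scan, not the specific tool used in the practical. The aligned sites, the peak sequence, the uniform background of 0.25 per base, the pseudocount, and the score threshold are all made-up values for illustration, and the toy motif is not the CTCF motif.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM from equal-length aligned binding sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score(pwm, kmer):
    """Sum of per-position log-odds for one k-mer."""
    return sum(column[base] for column, base in zip(pwm, kmer))

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan(pwm, seq, threshold):
    """Return (position, strand, score) for every window at or above threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        kmer = seq[i:i + width]
        for strand, candidate in (("+", kmer), ("-", reverse_complement(kmer))):
            s = score(pwm, candidate)
            if s >= threshold:
                hits.append((i, strand, round(s, 2)))
    return hits

# Hypothetical aligned sites, e.g. from a de novo motif discovery run.
sites = ["CCACT", "CCACT", "CCTCT", "CCACA"]
pwm = build_pwm(sites)

# Hypothetical peak sequence with one site on each strand.
peak_sequence = "TTGACCACTGGAAGTGGTT"
motif_hits = scan(pwm, peak_sequence, threshold=5.0)
print(len(motif_hits), motif_hits)  # number of motifs in this peak, then details
```

Counting hits per peak answers the "motifs per peak?" question on the slide, and collecting hit positions relative to each peak summit gives the data for a motif profile plot.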
(4) DIFFERENTIAL ENRICHMENT
(Bioconductor)
defining regions of interest (ROI)
obtaining counts per region of interest (replicates & conditions)
estimating library sizes
estimating variation of counts per ROI
calling differentially modified regions (negative binomial distribution)
overview of significantly modified regions
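The two estimation steps above (library sizes and per-ROI variation) can be sketched as follows. This is an assumption-laden illustration in plain Python: it uses a median-of-ratios size factor and a method-of-moments dispersion based on the negative binomial relation var = mu + alpha * mu^2, which may differ from what the Bioconductor package used in the practical does, and it stops short of the significance test itself. The sample names and counts are invented.

```python
import math
from statistics import mean, median, variance

def size_factors(counts):
    """Median-of-ratios library size factors.

    counts maps sample name -> list of read counts, one per ROI.
    ROIs with a zero count in any sample are skipped.
    """
    samples = list(counts)
    n_roi = len(counts[samples[0]])
    log_geo_means = []
    for i in range(n_roi):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(mean([math.log(v) for v in values]))
        else:
            log_geo_means.append(None)
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][i]) - g
                      for i, g in enumerate(log_geo_means) if g is not None]
        factors[s] = math.exp(median(log_ratios))
    return factors

def dispersion(normalized_counts):
    """Method-of-moments dispersion alpha from var = mu + alpha * mu**2."""
    mu = mean(normalized_counts)
    if mu == 0:
        return 0.0
    return max(0.0, (variance(normalized_counts) - mu) / mu ** 2)

# Hypothetical counts for four ROIs in two replicates; the second
# replicate was sequenced exactly twice as deeply as the first.
counts = {
    "rep1": [100, 200, 50, 400],
    "rep2": [200, 400, 100, 800],
}
factors = size_factors(counts)
normalized = {s: [c / factors[s] for c in counts[s]] for s in counts}

# Per-ROI dispersion across replicates after normalization.
roi_dispersions = [dispersion([normalized[s][i] for s in counts])
                   for i in range(4)]
print(factors, roi_dispersions)
```

With only two replicates the per-ROI dispersion estimate is very noisy, which is why count-based packages typically share information across regions before testing.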
http://www.ebi.ac.uk/~schwalie/chipseqprac_0311/chipseq_practical.pdf
[Figure: peak analysis overview. Peak-calling (MACS, SWEMBL), motif discovery, motif profile, motifs per peak]
Wednesday, 9 March 2011
ChIP-seq: advantages and challenges of a maturing technology (Park, Nat Rev Genet 2009)
Computation for ChIP-seq and RNA-seq studies (Pepke et al., Nat Methods 2009)
Design and analysis of ChIP-seq experiments for DNA-binding proteins (Kharchenko et al., Nat Biotechnol 2008)
Q&A: ChIP-seq technologies and the study of gene regulation (Liu et al., BMC Biol 2010)
Evaluation of algorithm performance in ChIP-seq peak detection (Wilbanks & Facciotti, PLoS ONE 2010)
A practical comparison of methods for detecting transcription factor binding sites in ChIP-seq experiments (Laajala et al., BMC Genomics 2009)