http://www.ncbi.nlm.nih.gov/geo/
Microarrays in general: spots, hybridization, scan
Gene Expression Omnibus (GEO): Gene Expression and Molecular Abundance Data Repository
A public repository for archiving and distributing gene expression data submitted by the scientific community. Data are MIAME compliant.
Convenient for deposition of gene expression data, as required by funding agencies and journals. Curated, online resource for gene expression data browsing, query, analysis and retrieval.
GEO Architecture
GEO has four kinds of data records:
Platform (GPL): the technology used and the features detected.
Sample (GSM): preparation and description of the sample.
Series (GSE): defines a set of samples and how they are related.
DataSets (GDS): sample data collections assembled by GEO staff.
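The four record types can be told apart by their accession prefix. A minimal sketch, assuming an accession is one of the prefixes above followed by digits; the function name and the example accession numbers are illustrative and not part of GEO:

```python
import re

# Accession prefixes for the four GEO record types listed above.
RECORD_TYPES = {
    "GPL": "Platform",
    "GSM": "Sample",
    "GSE": "Series",
    "GDS": "DataSet",
}

# Assumption: an accession is a known prefix followed by one or more digits.
_ACCESSION_RE = re.compile(r"^(GPL|GSM|GSE|GDS)(\d+)$")


def classify_accession(accession: str) -> str:
    """Return the record type name for a GEO accession string."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised GEO accession: {accession!r}")
    return RECORD_TYPES[match.group(1)]


if __name__ == "__main__":
    # Made-up accession numbers, used only to exercise each prefix.
    for acc in ("GPL1", "GSM2", "GSE3", "GDS4"):
        print(acc, "->", classify_accession(acc))
```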
GEO Architecture (diagram)
GPL: Platform descriptions. Submitted by manufacturer*.
GSM: Raw/processed spot intensities from a single slide/chip. Submitted by experimentalists.
GSE: Grouping of slide/chip data from a single experiment. Submitted by experimentalists.
GDS: Grouping of experiments. Curated by NCBI.
Simple interface to: show status, find documentation, query data, browse data, submit data.
Selecting the total public data or Repository Browser links on the GEO home page takes you to the Repository Browser, which lists:
the number of each type of submitted file, both public and unreleased
the total number of each technology type under Platforms
the total number of each Sample type
All GEO submissions need to be associated with a platform file. A platform file describes the features on a given platform, which are required to understand the data. A platform file must be submitted if one is not already present in GEO. Commercial array platform files are submitted to GEO by the manufacturer.
Accession: GEO ID
Title: brief description of platform
Samples: number of samples in GEO associated with platform ID
Select Find Platform
Select company
Select distribution
Select species
Enter title keyword
Start the platform search
Select the accession for the U133 Plus 2.0 array
Scroll down to find data table information
Data is submitted to GEO as a Series, which represents the experiment design. Selecting Browse>Series brings up a list sorted by release date. Selecting a Series ID brings up the Series file summary.
Format controls how information is displayed:
HTML
SOFT (Simple Omnibus Format in Text)
MINiML (MIAME Notation in Markup Language)
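SOFT is a line-based text format, which makes it easy to inspect with a few lines of code. A minimal sketch, assuming the usual SOFT convention that lines starting with `^` open a record and lines starting with `!` carry `key = value` attributes; the sample text and its accessions are made up for illustration, and data table rows are ignored:

```python
# Made-up SOFT fragment for illustration; not a real GEO record.
SAMPLE_SOFT = """\
^SERIES = GSE0000
!Series_title = Example title
!Series_sample_id = GSM0001
!Series_sample_id = GSM0002
"""


def parse_soft_attributes(text: str) -> dict:
    """Group '!' attribute lines under the '^' record line that precedes them.

    Returns {(record_kind, record_name): {attribute: [values, ...]}}.
    Repeated attributes are kept as a list in order of appearance.
    """
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("^"):
            kind, _, name = line[1:].partition("=")
            current = records.setdefault((kind.strip(), name.strip()), {})
        elif line.startswith("!") and current is not None:
            key, _, value = line[1:].partition("=")
            current.setdefault(key.strip(), []).append(value.strip())
    return records


if __name__ == "__main__":
    for record, attributes in parse_soft_attributes(SAMPLE_SOFT).items():
        print(record, attributes)
```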
Amount controls how much information is displayed:
Brief
Quick
Full
Data
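The Format and Amount choices can also be expressed as URL parameters. A sketch under stated assumptions: the `query/acc.cgi` path and the `acc`, `form`, and `view` parameter names and values are taken from my understanding of the GEO accession display page and are not stated in this document, so check them against the current GEO documentation before relying on them. The code only builds the URL and makes no network request:

```python
from urllib.parse import urlencode

# Assumption: accession display endpoint under the GEO site URL given above.
ACC_CGI = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# Assumed mapping from the Format choices to the 'form' parameter.
FORMATS = {"HTML": "html", "SOFT": "text", "MINiML": "xml"}

# Assumed mapping from the Amount choices to the 'view' parameter.
AMOUNTS = {"Brief": "brief", "Quick": "quick", "Full": "full", "Data": "data"}


def accession_url(accession: str, fmt: str = "HTML", amount: str = "Quick") -> str:
    """Build a display URL for one accession with the chosen format and amount."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    if amount not in AMOUNTS:
        raise ValueError(f"unknown amount: {amount!r}")
    query = urlencode({"acc": accession, "form": FORMATS[fmt], "view": AMOUNTS[amount]})
    return f"{ACC_CGI}?{query}"


if __name__ == "__main__":
    # GSE0000 is a placeholder accession, not a real Series.
    print(accession_url("GSE0000", fmt="SOFT", amount="Brief"))
```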
Series record view: Platform, Series, total data rows and file size, supplementary raw data file.
A common way to access GEO data is through accessions cited in papers. Online journals include hyperlinks to the GEO accession page. Alternatively, enter the accession into the Query>GEO accession text box on the GEO home page.
One option for displaying PubMed search results is GEO DataSet links. When present, the results page is actually from Entrez GEO DataSets.
Advanced Searches
GEO data can be queried as:
Datasets: experiment-centric view using Entrez GEO DataSets
Gene profiles: gene-centric view using Entrez GEO Profiles
Selecting either takes you to a similar Entrez introduction page.
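The same two views can, in principle, be queried programmatically through the NCBI E-utilities. A hedged sketch: E-utilities are not covered in this document, and the endpoint and the database names `gds` (GEO DataSets) and `geoprofiles` (GEO Profiles) are assumptions based on my understanding of Entrez. The code only constructs the search URL and does not contact NCBI:

```python
from urllib.parse import urlencode

# Assumption: the E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed Entrez database names for the two query views described above.
DATABASES = {"datasets": "gds", "profiles": "geoprofiles"}


def geo_search_url(view: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for an experiment-centric or gene-centric query."""
    if view not in DATABASES:
        raise ValueError(f"view must be one of {sorted(DATABASES)}")
    query = urlencode({"db": DATABASES[view], "term": term, "retmax": retmax})
    return f"{ESEARCH}?{query}"


if __name__ == "__main__":
    # The search terms are arbitrary examples.
    print(geo_search_url("datasets", "breast cancer"))
    print(geo_search_url("profiles", "BRCA1"))
```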
Total results: number of DataSets, number of Platforms, number of Series.
Start a GEO DataSets search with the Query>DataSets text box. This brings up an Entrez GEO DataSets results form.
Each result shows: Platform, reference Series, supplementary files, number of Samples with a truncated list, cluster image.
Select the DataSet ID or click on the cluster image to go to the DataSet record.
The DataSet record provides sample and analysis information and data retrieval. Selecting analysis takes you to the data clustering interface. Selecting the cluster image takes you to the clustering page.
GEO Gene Profiles use gene IDs from Platform files to show the expression of a gene across DataSets. Entering a gene ID into the Query>Gene profiles text box takes you to the Entrez results page.
GEO BLAST
On the GEO BLAST page, enter sequences in FASTA format or GenBank accessions, or select sequence files on local disks, for blastn comparisons. These are compared to GenBank sequences listed in Platform files associated with GEO DataSets. From the BLAST result page, select the E button to the right of an alignment to show GEO Gene Profiles for that sequence in GEO DataSets.
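Sequences pasted into a BLAST form are usually prepared as FASTA text first. A minimal sketch of a nucleotide FASTA formatter, assuming the input is restricted to A, C, G, T and N (a simplification, since other IUPAC ambiguity codes are rejected); the function name, the record name, and the 60-column line width are illustrative choices:

```python
def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format one nucleotide sequence as a FASTA record.

    Whitespace is removed, letters are upper-cased, and the sequence is
    split into lines of at most 'width' characters.
    """
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    unexpected = set(cleaned) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 'probe1' and the sequence are made-up example values.
    print(to_fasta("probe1", "acgt acgt\nacgtn", width=8), end="")
```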