Identifying species
BLAST can help you identify the species a sequence comes from, or find homologous
species. This is useful, for example, when you are working with a DNA sequence
from an unknown species.
Locating domains
When working with a protein sequence, you can submit it to BLAST to locate known
domains within the sequence of interest.
Establishing phylogeny
Using the results received through BLAST, you can create a phylogenetic tree on the
BLAST web page. Phylogenies based on BLAST alone are less reliable than
purpose-built computational phylogenetic methods, so they should only be relied upon
for "first pass" phylogenetic analyses.
DNA mapping
When working with a known species and looking to sequence a gene at an unknown
location, BLAST can compare the chromosomal position of the sequence of interest with
relevant sequences in the database(s).
Comparison
When working with genes, BLAST can locate common genes in two related species and
can be used to map annotations from one organism to another.
FASTA
FASTA is a DNA and protein sequence alignment software package first described
(as FASTP) by David J. Lipman and William R. Pearson in 1985.[1] Its legacy is the
FASTA format, which is now ubiquitous in bioinformatics.
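
To make the FASTA format concrete, here is a minimal sketch of a reader for it, assuming the common layout in which each record starts with a ">" header line followed by one or more sequence lines. The function name parse_fasta and the example records are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {record identifier: sequence} from FASTA-formatted lines."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: take the first word after '>' as the identifier.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence line: a sequence may be wrapped over several lines.
            records[current_id] += line
    return records


# Hypothetical two-record example (one protein, one DNA sequence).
example = """>seq1 hypothetical protein
MKTAYIAK
QRQISFVK
>seq2
ACGTACGT
"""
records = parse_fasta(example.splitlines())
print(records)
```

The sketch keeps only the first word of each header as the identifier and silently overwrites duplicate identifiers; a production reader would likely handle both cases more carefully.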
TYPES OF FASTA
Protein
Protein-protein FASTA
Protein-protein Smith-Waterman (ssearch)
Global protein-protein (Needleman-Wunsch) (ggsearch)
Global/local protein-protein (glsearch)
Protein-protein with unordered peptides (fasts)
Protein-protein with mixed peptide sequences (fastf)
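
The global (Needleman-Wunsch) mode listed above can be illustrated with a short dynamic-programming sketch. This is not the ggsearch implementation: it assumes a simple scoring scheme (match +1, mismatch -1, linear gap penalty -1) instead of a substitution matrix with affine gaps, and the function name is hypothetical.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Score of the best end-to-end alignment of a and b (Needleman-Wunsch)."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
```

A local (Smith-Waterman) variant differs mainly in clamping every cell at zero and reporting the maximum cell instead of the final one.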
MULTIPLE SEQUENCE ALIGNMENTS
Definition:
A Multiple Sequence Alignment (MSA) is a sequence alignment of three or more
biological sequences, generally protein, DNA, or RNA. In many cases, the input set
of query sequences is assumed to have an evolutionary relationship by which the
sequences share a lineage and are descended from a common ancestor.
WHY WE DO MULTIPLE SEQUENCE ALIGNMENTS
Multiple nucleotide or amino acid sequence alignments are usually performed for one
of the following purposes:
Characterizing protein families by identifying shared regions of homology
Determining the consensus sequence of several aligned sequences
Helping to predict the secondary and tertiary structures of new sequences
Providing a preliminary step in molecular evolution analysis that uses phylogenetic
methods to construct phylogenetic trees
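
The consensus sequence mentioned in the list above can be sketched as a simple majority vote over each alignment column. This is one common convention, not the only one: the function name, the "-" gap character, and the example sequences are assumptions made for illustration.

```python
from collections import Counter
from typing import List


def consensus(aligned: List[str], gap: str = "-") -> str:
    """Most frequent non-gap character in each column of an alignment."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(ch for ch in column if ch != gap)
        # A column made only of gaps stays a gap in the consensus.
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


# Hypothetical three-sequence alignment.
alignment = ["ACG-T", "ACGAT", "TCGAT"]
print(consensus(alignment))
```

Ties within a column are resolved by whichever residue appears first; real tools often report an ambiguity code or apply a frequency threshold instead.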
Shady box tool.
1. GENSCAN is a program to identify complete gene structures in genomic DNA.
2. It can be used to predict the location of genes and their exon-intron
boundaries in genomic sequences from a variety of organisms.
Glimmer
Glimmer (Gene Locator and Interpolated Markov ModelER) is a system for finding
genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses.
Glimmer is the system of choice for genome annotation efforts on a wide range of
bacterial, archaeal, and viral species due to its high accuracy.
GeneID
PAM
Point Accepted Mutation (Dayhoff et al.)
1 PAM = PAM1 = an average change of 1% of all amino acid positions
After 100 PAMs of evolution, not every residue will have changed:
some residues may have mutated several times
some residues may have returned to their original state
some residues may not have changed at all
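
A small calculation can show why 100 PAMs does not mean 100% change. The sketch below deliberately uses a simplified model in which each PAM step replaces 1% of positions with a residue drawn uniformly from the other 19 amino acids. Real PAM matrices use the empirical mutation probabilities of Dayhoff et al., so the numbers here are illustrative only, and the function names are hypothetical.

```python
def fraction_matching_original(n_pams: int, rate: float = 0.01,
                               alphabet: int = 20) -> float:
    """Expected fraction of positions showing their original residue
    after n_pams steps of the simplified uniform model."""
    p_same = 1.0
    for _ in range(n_pams):
        # A matching position survives with probability (1 - rate); a
        # non-matching one mutates back with probability rate / (alphabet - 1).
        p_same = p_same * (1.0 - rate) + (1.0 - p_same) * rate / (alphabet - 1)
    return p_same


def fraction_never_mutated(n_pams: int, rate: float = 0.01) -> float:
    """Expected fraction of positions that were never hit by a mutation."""
    return (1.0 - rate) ** n_pams


matching = fraction_matching_original(100)
untouched = fraction_never_mutated(100)
print(f"matching original after 100 PAMs: {matching:.3f}")
print(f"never mutated after 100 PAMs:     {untouched:.3f}")
```

Under these assumptions, more than a third of positions still show their original residue after 100 PAMs. The gap between the two printed values corresponds to residues that mutated and later returned to their original state.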
BLOSUM
Blocks Substitution Matrix
Scores are derived from observations of the frequencies of substitutions in blocks
of local alignments in related proteins
The matrix name indicates evolutionary distance
BLOSUMx was created using sequences sharing no more than x% identity
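
The way substitution frequencies become scores can be sketched with the log-odds calculation used for BLOSUM-style matrices: each score compares the observed frequency of a residue pair with the frequency expected by chance, expressed here in half-bit units and rounded. The two-letter alphabet and the pair counts below are made up for illustration; real matrices are built from counts over the 20 amino acids in the BLOCKS database.

```python
import math
from typing import Dict, Tuple

Pair = Tuple[str, str]


def log_odds_scores(pair_counts: Dict[Pair, int]) -> Dict[Pair, int]:
    """Half-bit log-odds scores from counts of unordered residue pairs.

    Every count must be positive, since log2(0) is undefined.
    """
    total = sum(pair_counts.values())
    observed = {pair: n / total for pair, n in pair_counts.items()}

    # Background frequency of each residue, derived from the pair frequencies.
    background: Dict[str, float] = {}
    for (a, b), freq in observed.items():
        if a == b:
            background[a] = background.get(a, 0.0) + freq
        else:
            background[a] = background.get(a, 0.0) + freq / 2
            background[b] = background.get(b, 0.0) + freq / 2

    scores: Dict[Pair, int] = {}
    for (a, b), freq in observed.items():
        # Chance expectation: p_a * p_a for identical residues, 2 * p_a * p_b otherwise.
        expected = background[a] * background[b] * (1 if a == b else 2)
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


# Hypothetical counts of aligned pairs in a toy two-letter alphabet.
toy_counts = {("A", "A"): 60, ("A", "B"): 20, ("B", "B"): 20}
scores = log_odds_scores(toy_counts)
print(scores)
```

Positive scores mark pairs seen more often than chance predicts, and negative scores mark pairs seen less often.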
TOOLS FOR MICROARRAY