Name:
Instructor's Name:
Course:
Date:
Computational Genomics
encoded and expressed from the genetic complement of living organisms. A genome is the complete
record of all the DNA that determines the identity of a biological organism. The inventory usually
Biological systems are complex and interact in varying networks (Deonier et al., 2005).
from the molecular level to higher levels in the hierarchy. The anatomy, evolutionary history,
biochemistry, nature, physiology and structure of organisms define the types of problems that
Zoologists do not give a complete impression of life, since their work concentrates mainly on
mammals and other vertebrates. Organisms range from bacteria to multicellular plants and
animals, and they employ various strategies for extracting energy from the environment.
Some reduce dissolved sulphate to produce H2S and, eventually, pyrite (Cristianini, 2006). Others
rely on processes such as photosynthesis and aerobic respiration to obtain energy. Some organisms
live at temperatures near the boiling point of water, while others live at very low temperatures
(near the freezing point). Some organisms, such as lithotrophic bacteria, live in rocks beneath the
surface of the earth. The analysis of RNA sequences indicates that there are three major domains of
living organisms: eubacteria, archaea and eukaryotes. Eubacteria include M.
Surname 2
colombiense and Bacillus subtilis, while eukaryotes are organisms with structured
chromosomes, such as fungi and humans. Archaea are prokaryotes, many of which live in extreme
environments. The three domains fall into two major categories: prokaryotes (eubacteria and
archaea) and eukaryotes. Prokaryotes do not have a true nucleus, and their DNA is not as structured as
eukaryotic chromosomes.
The abundance of bacteria and the wide range of environments they can inhabit make them the most
successful form of life on earth. Numerous unicellular forms known as protists exist among the
eukaryotes. Most of them are marine organisms. Ultrastructural and molecular data show that
different types of protists differ from each other more than plants differ from
animals. Nevertheless, they are grouped together in a kingdom called Protista. Fungi, plants and animals are
the major multicellular groups. M. colombiense is an anaerobic bacterium that is found in the
lower intestine of warm-blooded animals (Kislyuk et al., 2010). It belongs to the genus
Mycobacterium. Most strains of M. colombiense are harmless, but some strains can cause food
poisoning in their hosts. The harmless strains are part of the normal flora of the gut. They can
benefit their hosts by producing vitamins such as K2 and by preventing colonization of the
intestine by pathogenic bacteria. M. colombiense does not retain the crystal violet stain.
The GS De Novo Assembler constructs assemblies of reads from one or more sequencing runs, using read
flowgrams as input (SSSM, 2011). It lets the user create assembly projects, add reads to and remove
reads from a project, specify parameters, run the assembly algorithms on the project data and view
the output that the assembly computations produce. The application is accessed through a Graphical
User Interface (GUI). Input data come from one or more regions of one or more runs of interest.
External file formats such as FASTA and FASTQ are used to import additional read data. During the
assembly process, the software identifies pairwise overlaps between the reads and resolves
branching structures between the contigs. It also builds multiple alignments of overlapping read
sequences and divides them, and it can perform extra steps when paired data are available. The GUI
provides tools for viewing the contig sequences as well as the multiple reads that form each
contig. There is also an interactive view of the flowgrams of the
reads. The GS De Novo Assembler enables users to create, modify and run assemblies in the form of
projects (SSSM, 2011). This functionality is provided through both the GUI and the command line
interface (CLI). A project may be set up to assemble all reads at once. Alternatively, it can
be run incrementally, so that additional reads are added to an existing assembly (Lemay
et al., 2006). The results appear as output files whether the GUI or the CLI is used. The GUI also
provides a graphical view of many of the assembly results, irrespective of whether the GUI or
the CLI performed the assembly. The Assembler application uses a folder on the file system to hold
the project information. For example, whether the GUI or the new assembly and related commands are
used, the computation of the assembly of the reads must still be carried out.
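The read import and pairwise-overlap steps described above can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the function names, the three example reads and the minimum overlap length are hypothetical, and the exact suffix-to-prefix matching shown here is a simplification, not the algorithm the GS De Novo Assembler uses (which works from flowgrams and must tolerate sequencing errors).

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect FASTA records into a {read_name: sequence} dictionary."""
    parts: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The read name is the first word after the '>' marker.
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def suffix_prefix_overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that exactly equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def pairwise_overlaps(reads: Dict[str, str], min_len: int = 3) -> List[Tuple[str, str, int]]:
    """Compare every ordered pair of distinct reads and keep the overlapping ones."""
    hits = []
    for name_a, seq_a in reads.items():
        for name_b, seq_b in reads.items():
            if name_a == name_b:
                continue
            k = suffix_prefix_overlap(seq_a, seq_b, min_len)
            if k:
                hits.append((name_a, name_b, k))
    return hits


# Hypothetical reads, invented for illustration only.
example = """>read1
ACGTACGGA
>read2
CGGATTGCA
>read3
TGCAAAC
""".splitlines()

reads = parse_fasta(example)
overlaps = pairwise_overlaps(reads)
for a, b, k in overlaps:
    print(f"{a} -> {b}: {k} bases")
```

In this toy input, read1 overlaps read2 and read2 overlaps read3, so the three reads could be chained into one contig. Comparing every pair of reads is quadratic in the number of reads, which is acceptable for a sketch but not for a real sequencing run.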
The graphical user interface application, gsAssembler, can be used to perform or view assemblies
(Kislyuk et al., 2010). It can open existing assembly projects, add reads to and remove reads
from assembly projects, modify assembly input and output parameters, supply the
information required to start an assembly computation and show its progress. The GS De Novo
Assembler launches when its desktop icon is double-clicked. Alternatively, it can be launched with
the gsAssembler command. Once it launches, the user can open an existing project or create a new one.
There are seven main buttons on the toolbar along the main window of the GS De Novo Assembler.
The Exit button closes the GS De Novo Assembler application. The New button creates
a new assembly project. The Open button opens an existing assembly project. The
Start button begins the assembly computation for the project that is open. The
Stop button halts an assembly computation that is in progress. The About button
shows the GS De Novo Assembler splash screen, while the Help button opens the GS De Novo
Click the New Project button in the right toolbar, or click the New Assembly Project text button
that appears in the Quick Start column. A dialogue box appears in which the name of the new project
and its location can be specified (Deonier et al., 2005). In the Name text field, type the name
of the new project, for example M. colombiense. Type the full path in the Location text field.
Alternatively, click the Open Project button at the right of the text field, then use the Select
Project Location window to navigate to the directory where the new project is to be created. When
the name and location are updated, the Full Path field changes accordingly. Once satisfied with
both the name and location, click the OK button. The project can be saved by clicking the Yes
button when prompted to save the data.
The New Project dialogue contains a drop-down menu titled Sequence Type. It
indicates the type of DNA library that has been sequenced, such as genomic. The choice of
sequence type determines which parameters the user is allowed to modify in the project. Once a
project has been created, its sequence type cannot be changed (Cristianini et al., 2006). To open
an existing project, click the Open Project button in the right toolbar or click the Open an
Assembly Project text button located in the Quick Start column. A dialogue is displayed in which
one can select the name and specify the directory location of the project to be opened. By
default, 454 assembly projects are displayed. One may choose to display all files by selecting
All Files from the Files of Type drop-down menu.
When a project is opened, the Overview tab is displayed and shows the project summary. The
mouse can be used to select other tabs. Some tabs remain inactive until the type of
information they require is available for the project. Information related to the overall
project is shown in the Project Summary Information area of the Overview tab. Before an assembly
computation can be performed, at least one Read Data file must be added to the project (Lemay et al.,
2006). Errors in input parameters can also cause a message such as "Not ready for Analysis" to
appear in the top right corner of the application window. Once the errors have been corrected and
at least one Read Data file has been added to the project, all options will have valid values and
the message will change to "Ready for Analysis". The Start button will become active.
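The readiness rule described above can be summarised in a short Python sketch. The class, its fields and the example file and parameter names are hypothetical and are not the application's actual data model. Only the two status messages come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AssemblyProject:
    """Hypothetical stand-in for an assembly project (illustration only)."""
    name: str
    read_data_files: List[str] = field(default_factory=list)
    # Maps a parameter name to a description of what is wrong with it.
    parameter_errors: Dict[str, str] = field(default_factory=dict)

    def status(self) -> str:
        # Ready only when at least one Read Data file is present
        # and no input parameter is in error.
        if self.read_data_files and not self.parameter_errors:
            return "Ready for Analysis"
        return "Not ready for Analysis"

    def start_enabled(self) -> bool:
        # The Start button becomes active only in the ready state.
        return self.status() == "Ready for Analysis"


project = AssemblyProject("M. colombiense")
print(project.status())                    # no Read Data file added yet

project.read_data_files.append("example_reads.fastq")   # hypothetical file name
print(project.status(), project.start_enabled())

project.parameter_errors["minimum_overlap"] = "must be positive"  # hypothetical parameter
print(project.status(), project.start_enabled())
```

The point of the sketch is that both conditions must hold at the same time: adding a Read Data file does not make the project ready while a parameter error remains, and correcting the parameters does not help while the project has no reads.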
Works Cited
Deonier, R., S. Tavare, and M. Waterman. Computational Genome Analysis. 1st ed. Los Angeles:
http://www.tok.ro/toksite/downloads/Bioinformatika/Konyvek/konyvek%20bioinfo%20fejezetekkel%20+%20bioinfo%20konyvek/Computational%20Genome
Kislyuk, Andrey O., et al. "A computational genomics pipeline for prokaryotic sequencing