
Surname 1

Name:

Instructor's Name:

Course:

Date:

Computational Genomics

Computational genomics refers to the interpretation and understanding of the information that is
encoded in, and expressed from, the genetic complement of living organisms. A genome is the
complete inventory of all the DNA that determines the identity of an organism. The field spans a
wide range of scales, from molecules to large populations of organisms. Biological systems are
complex and interact in varying networks (Deonier et al., 2005). Computational genomics
therefore addresses biological phenomena at different levels of complexity, from the molecular
level to higher levels in the hierarchy. The anatomy, evolutionary history, biochemistry,
physiology, structure and general nature of organisms define the types of problems that require
solutions. A number of medical and evolutionary reasons underscore the importance of
understanding human biology.

Zoology alone does not give a correct impression of life, since it concentrates mainly on
mammals and other vertebrates. Organisms range from bacteria to multicellular plants and
animals, and they employ various strategies for extracting energy from the environment. Some
reduce dissolved sulphate to produce H2S and eventually pyrite (Cristianini and Hahn, 2006).
Others rely on processes such as photosynthesis and aerobic respiration. Some organisms live at
high temperatures close to the boiling point of water, while others live at very low temperatures
close to the freezing point. Some organisms, such as lithotrophic bacteria, live in rocks beneath
the surface of the earth. The analysis of RNA sequences indicates that there are three major
domains of living organisms: eubacteria, Archaea and eukaryotes. Eubacteria include organisms
such as M. colombiense and Bacillus subtilis, while eukaryotes are organisms with structured
chromosomes, such as fungi and humans. Archaea are microorganisms, many of which live in
extreme environments. The three domains fall into two major categories: prokaryotes
(eubacteria and Archaea) and eukaryotes. Prokaryotes do not have a true nucleus, and their DNA
is not as structured as eukaryotic chromosomes.

The abundance of bacteria and the wide range of environments they can inhabit make them the
most successful form of life on earth. Numerous unicellular forms known as protists exist among
the eukaryotes, and most of them are marine organisms. Ultrastructural and molecular data show
that different types of protists may differ from each other more than plants differ from animals.
Nevertheless, they are grouped together in a kingdom called Protista. Fungi, plants and animals
are the major multicellular groups. M. colombiense is an anaerobic bacterium that is found in the
lower intestine of warm-blooded animals (Kislyuk et al., 2010). In its binomial name,
colombiense is the species epithet and M. abbreviates the genus. Most strains of M. colombiense
are harmless, but some strains can cause food poisoning in their hosts. The harmless strains are
part of the normal flora of the gut. They can benefit their hosts by producing vitamins such as K2,
and they also prevent colonization of the intestine by pathogenic bacteria. M. colombiense does
not retain crystal violet stain.

The GS De Novo Assembler software is used to assemble the DNA sequence reads obtained from
a living organism. It constructs assemblies of reads from one or more sequencing Runs, using the
read flowgrams as input (SSSM, 2011). It allows the user to create assembly projects, add reads
to and remove reads from a project, and specify parameters. In addition, the user can run the
assembly algorithms on the project data and view the output that the assembly computations
produce. The application is accessed through a Graphical User Interface (GUI). Input data come
from one or more regions of one or several Runs of interest. Additional read data can be imported
from external file formats such as FASTA and FASTQ. During the assembly process the software
identifies pairwise overlaps between the reads, resolves branching structures between the
contigs, and generates consensus basecalls for the contigs.
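The first assembly step named above, finding pairwise overlaps between reads, can be illustrated with a minimal sketch. It is a toy under stated assumptions and not the algorithm the GS De Novo Assembler actually uses: overlaps are exact suffix-to-prefix matches, the reads arrive as FASTA text, and the read names, sequences and the `min_len` threshold are all hypothetical.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the read name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def overlap_length(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def pairwise_overlaps(reads, min_len=3):
    """Map (left_name, right_name) to the overlap length for each ordered pair."""
    overlaps = {}
    for left_name, left_seq in reads.items():
        for right_name, right_seq in reads.items():
            if left_name == right_name:
                continue
            size = overlap_length(left_seq, right_seq, min_len)
            if size:
                overlaps[(left_name, right_name)] = size
    return overlaps


# Hypothetical toy reads; real input would come from read data files.
TOY_FASTA = """>read1
ATGCGTAC
>read2
GTACCTGA
>read3
CTGATTAG
"""

reads = dict(read_fasta(StringIO(TOY_FASTA)))
overlaps = pairwise_overlaps(reads)
print(overlaps)
```

In this toy input read1 overlaps read2 and read2 overlaps read3, so the three reads tile into a single contig. Comparing every pair of reads is quadratic in the number of reads, so the sketch only suits very small inputs.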

The software also constructs and divides multiple alignments of reads, and it can perform extra
steps when paired data are available. The assembler produces multiple alignments of overlapping
read sequences. The GUI provides tools for viewing the contig sequences as well as the multiple
reads that form each contig, and it offers an interactive view of the flowgrams of the reads. The
GS De Novo Assembler enables users to create, modify and run assemblies in the form of
projects (SSSM, 2011). This functionality is provided by both the GUI and the command line
interface (CLI). A project may be set up to assemble all reads at once. Alternatively, it can be
run incrementally, with additional reads added to an existing assembly (Lemay and Hwang,
2006). The results are then written as output files, whether the GUI or the CLI is used. The GUI
provides a graphical view of many of the assembly results, irrespective of whether the project
was assembled with the GUI or the CLI.
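How a consensus basecall can be derived from a multiple alignment of overlapping reads is sketched below. The sketch assumes a simple majority vote per alignment column, which is an illustrative simplification and not the flowgram-based method of the GS De Novo Assembler. The alignment rows and the `-` padding character are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads, gap="-"):
    """Return a majority-vote consensus for equal-length, gap-padded reads."""
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(row) != width for row in aligned_reads):
        raise ValueError("all aligned rows must have the same length")
    calls = []
    for column in zip(*aligned_reads):
        bases = [base for base in column if base != gap]
        if not bases:
            continue  # no read covers this column
        # Pick the most frequent base; ties go to the base seen first.
        calls.append(Counter(bases).most_common(1)[0][0])
    return "".join(calls)


# Hypothetical alignment: '-' marks positions that a read does not cover.
ALIGNMENT = [
    "ATGCGTAC--------",
    "----GTACCTGA----",
    "--------CTGATTAG",
    "--GCGTTCCT------",
]
print(consensus(ALIGNMENT))
```

The fourth row disagrees with two other reads at one position, and the vote resolves that column in favour of the majority base.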

The assembler application uses a folder on the file system to hold the project information for an
assembly. Whether a project is set up through the GUI or through the newAssembly and related
commands, the computation of the assembly of reads must still be carried out as a separate step.

The graphical user interface application, gsAssembler, can be used to perform or view
assemblies (Kislyuk et al., 2010). The graphical interface can be used to open existing assembly
projects, carry out the computation of an assembly, view the results of a completed assembly, and
add reads to or remove reads from assembly projects. It can also be used to modify assembly
input and output parameters and to view any information required to start an assembly
computation. The GS De Novo Assembler is launched by double-clicking its desktop icon. It
can alternatively be launched with the gsAssembler command. Once it has launched, the user can
open an existing project or create a new one.

There are seven main buttons on the toolbar of the main window of the GS De Novo Assembler.
The Exit button closes the GS De Novo Assembler application. The New button creates a new
assembly project. The Open button opens an existing assembly project. The Start button begins
the assembly computation for the project that is open. The Stop button halts an assembly
computation that is in progress. The About button shows the GS De Novo Assembler splash
screen, while the Help button opens the GS De Novo Assembler section of the software manual.

Creating a new project: M. colombiense

Click the New Project button in the right toolbar, or click the New Assembly Project text button
that appears in the Quick Start column. A dialogue box is displayed in which the name of the new
project and its location can be specified (Deonier et al., 2005). In the Name text field, type the
name of the new project, for example M. colombiense. Type the full path in the Location text
field. Alternatively, click the Open Project button at the right of the text field and use the Select
Project Location window to navigate to the directory where the new project is to be created.
After the name and location have been updated, the Full Path field changes accordingly. Once
satisfied with both the name and the location, click the OK button. The project can then be saved
by clicking the Yes button when prompted.
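The relationship between the Name, Location and Full Path fields can be sketched as a simple path join. This assumes that the full path is the location directory followed by the project name, which the text implies but does not state. The function name, the example directory and the validation rule are hypothetical.

```python
from pathlib import PurePosixPath


def full_project_path(location, name):
    """Join a location directory and a project name into one project path."""
    cleaned = name.strip()
    if not cleaned or "/" in cleaned:
        raise ValueError("project name must be non-empty and contain no '/'")
    return PurePosixPath(location) / cleaned


# Hypothetical directory and project name.
print(full_project_path("/data/assemblies", "M_colombiense"))
```

An underscore replaces the space in the project name here only to keep the example path easy to type on a command line.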

The New Project dialogue contains a drop-down menu titled Sequence Type. It indicates the type
of DNA library that has been sequenced, for example genomic. The choice of sequence type
determines which parameters the user is allowed to modify in the project. Once a project has
been created, its sequence type cannot be changed (Cristianini and Hahn, 2006). To open an
existing project, click the Open Project button in the right toolbar, or click the Open an
Assembly Project text button located in the Quick Start column. A dialogue is displayed in
which one can select the name and specify the directory location of the project to be opened.
Only 454 assembly projects are displayed by default. One may choose to display all files by
selecting All Files from the Files of Type drop-down menu.

The project summary is viewed on the Overview tab, which is the tab that is displayed first.
The mouse can be used to select other tabs. Some tabs may remain inactive until the type of
information they require is available for the project. Information related to the overall project is
shown in the Project Summary Information area of the Overview tab. Before an assembly
computation can be performed, at least one Read Data file must be added to the project (Lemay
and Hwang, 2006). Until then, the message Not ready for Analysis appears in the top right corner
of the application window. Errors in input parameters can cause the same message to appear.
Once the errors have been corrected and at least one Read Data file has been added to the
project, all options will have valid values and the message will change to Ready for Analysis.
The Start button will then become active.
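The readiness rule described above can be summarised in a short sketch. The class and its attribute names are hypothetical and are not part of the GS De Novo Assembler software. Only the two status messages and the two conditions (at least one Read Data file, no parameter errors) come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class AssemblyProject:
    """Hypothetical stand-in for the project state behind the status message."""

    name: str
    read_data_files: list = field(default_factory=list)
    parameter_errors: list = field(default_factory=list)

    def status(self):
        # Ready only with at least one read data file and no parameter errors.
        if self.read_data_files and not self.parameter_errors:
            return "Ready for Analysis"
        return "Not ready for Analysis"

    def start_enabled(self):
        """The Start button is active only when the project is ready."""
        return self.status() == "Ready for Analysis"


project = AssemblyProject("M_colombiense")
print(project.status())
project.read_data_files.append("reads.sff")  # hypothetical file name
print(project.status())
```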

Works Cited

Cristianini, Nello, and Matthew W. Hahn. Introduction to Computational Genomics: A Case
Studies Approach. Cambridge University Press, 2006.

Deonier, R., S. Tavare, and M. Waterman. Computational Genome Analysis. 1st ed., University
of Southern California, 2005,
http://www.tok.ro/toksite/downloads/Bioinformatika/Konyvek/konyvek%20bioinfo%20fejezetekkel%20+%20bioinfo%20konyvek/Computational%20Genome%20Analysis.pdf.
Accessed 16 Sept. 2014.

Kislyuk, Andrey O., et al. "A Computational Genomics Pipeline for Prokaryotic Sequencing
Projects." Bioinformatics 26.15 (2010): 1819-1826.

Lemay, Danielle G., and Daniel H. Hwang. "Genome-Wide Identification of Peroxisome
Proliferator Response Elements Using Integrated Computational Genomics." Journal of
Lipid Research 47.7 (2006): 1583-1587.

454 Sequencing System Software Manual: GS De Novo Assembler, GS Reference Mapper, SFF
Tools. Version 2.6, 2011.
