BCH 323 NOTES (2013)

INFORMATION FLOW
THE GENOME
In modern molecular biology and genetics, the genome is the entirety of an organism's
hereditary information. It is encoded either in DNA or, for many types of viruses, in RNA.
The genome includes both the genes and the non-coding sequences of the DNA/RNA.
Genome composition describes the makeup of a haploid genome, including genome size and the
proportions of non-repetitive and repetitive DNA. By comparing the compositions of different
genomes, scientists can better understand the evolutionary history of a given genome.

Genome size is the total number of DNA base pairs in one copy of a haploid genome. Genome
size is positively correlated with morphological complexity among prokaryotes
and lower eukaryotes; however, from mollusks upward through the higher eukaryotes, this
correlation no longer holds. This phenomenon also indicates the strong influence
that repetitive DNA has on genome size.

Since genomes are very complex, one research strategy is to reduce the number of genes in a
genome to the bare minimum and still have the organism in question survive. There is
experimental work being done on minimal genomes for single cell organisms as well as
minimal genomes for multi-cellular organisms (see Developmental biology). The work is
both in vivo and in silico.
A gene is the basic physical and functional unit of heredity. Genes, which are made up of
DNA, act as instructions to make molecules called proteins. In humans, genes vary in size
from a few hundred DNA bases to more than 2 million bases. The Human Genome Project
has estimated that humans have between 20,000 and 25,000 genes.

Figure: Genome Organization


Chromosomes and Genome Organization

Diploid organisms have one copy of each chromosome from each parent.
Each chromosome is an uninterrupted length of DNA carrying many genes.
A genome therefore consists of the entire set of chromosomes.
In most eukaryotic genes, in contrast to typical bacterial genes, the coding sequences (exons) are
interrupted by noncoding DNA (introns). A gene must have exons, start signals, stop
signals and regulatory control elements. The average gene has 7-10 exons spread over 10-16 kb of
DNA.

A prokaryotic chromosome consists of a single molecule of DNA in the form of a closed
loop. The chromosome is described as circular.

A prokaryotic gene may be divided into 3 sequences with respect to its transcription:

A sequence called the promoter upstream of the start
The RNA-coding sequence itself
A terminator specifying where transcription will stop

NB!!! A prokaryotic cell has only one chromosome.

A eukaryotic chromosome is linear, not circular, in other words it has two ends, like a
sausage. Each chromosome contains one molecule of DNA for the first half or so of
interphase, then the DNA replicates, and the two DNA molecules remain together (as sister-
chromatids) in the same chromosome for the rest of interphase. This does not happen in
prokaryotic cells.

NB!!! Eukaryotic cells have more than one chromosome.


A further difference: prokaryotic chromosomes consist only of a naked DNA molecule, but
eukaryotic chromosomes also contain many molecules of proteins (mostly histones). The
DNA is wound around these proteins.

Eukaryotes generally have many more genes and these genes are spread across multiple
chromosomes.

Prokaryotes have fewer genes and these genes are all located on one chromosome. Groups of
genes producing proteins with related functions are often organized into operons in
prokaryotes but not in eukaryotes.

Eukaryotes also have mRNA that must have its introns excised and the mRNA transported
out of the nucleus to the ribosomes. The greater complexity of the eukaryote genome means
that a greater variety and complexity of control mechanisms is necessary.

Prokaryotes have one type of RNA polymerase for all types of RNA

The mRNA is not modified.

Prokaryotic transcription:

Transcription factors bind to specific DNA sequences upstream of the start of operons, or
sets of related genes.
Transcribed mRNA is directly translated by ribosomes.
In prokaryotes, transcription of a gene and translation of the resulting mRNA occur
simultaneously.

NB!!! The existence of introns in prokaryotes is extremely rare.

Figure: Prokaryotic chromosomes vs Eukaryotic chromosomes


The DNA for a given gene in eukaryotes is organized into exons and introns.

In order to remove the introns from the pre-mRNA, the pre-mRNA is spliced at splice
junctions found at the extreme ends of each and every intron. In pre-mRNA to mRNA
splicing it is critical to make sure that splicing is extremely accurate. If splicing is off by one
nucleotide the entire coding will be messed up because all of the codons downstream of the
mistake will be out of the correct reading frame (they will be out of phase).
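
The effect of an inaccurate splice on the reading frame can be sketched in a few lines of Python. This is a minimal illustration only: the exon and intron sequences are hypothetical toy strings, and the helper names `codons` and `splice` are assumptions made for this example.

```python
def codons(seq):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def splice(pre, start, end):
    """Remove pre[start:end] (the intron) and join the flanking sequence."""
    return pre[:start] + pre[end:]


# Hypothetical toy pre-mRNA: exon 1 + intron + exon 2.
exon1 = "AUGGCU"
intron = "GUAAGUCAG"
exon2 = "GAAUUCUAA"
pre_mrna = exon1 + intron + exon2

intron_start = len(exon1)
intron_end = intron_start + len(intron)

# Accurate splicing removes exactly the intron.
correct = splice(pre_mrna, intron_start, intron_end)

# Off by one: the cut ends one nucleotide too late and removes the first base of exon 2.
off_by_one = splice(pre_mrna, intron_start, intron_end + 1)

print(codons(correct))     # ['AUG', 'GCU', 'GAA', 'UUC', 'UAA']
print(codons(off_by_one))  # ['AUG', 'GCU', 'AAU', 'UCU']
```

Every codon downstream of the faulty junction differs from the correctly spliced message, and in this toy case the stop codon UAA is no longer read in frame.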

Eukaryote genes are not grouped in operons. Each eukaryote gene is transcribed separately,
with separate transcriptional controls on each gene.

Eukaryotic mRNA is modified through RNA splicing.

Eukaryotic mRNA is generally monogenic (monocistronic); it codes for only one polypeptide.
From genome to transcriptome
The transcriptome is constructed by transcription, in which individual genes are copied into RNA molecules.
It is never synthesized de novo.
It is highly significant, since it contains the coding RNA and hence determines the
biochemical capacity of the cell.
By analyzing the transcriptome, researchers can determine when and where each gene is
turned on and off in the cells.
The Proteome

It contains all the proteins translated from the transcriptome at any moment.
It is much more complex than the transcriptome or the genome.
It varies in differing circumstances due to different patterns of gene expression and
different patterns of protein modification.
PROTEOMICS: proteins represent the actual functional molecules in the cell, so when
mutation occurs in the DNA, proteins are ultimately affected.

Prokaryotic genome typically exists in the form of a circular chromosome located in the
cytoplasm, but in eukaryotes the genetic material is housed in the nucleus and tightly
packaged into linear chromosomes.

Each of Earth's species has its own distinctive genome. Your genome is different from that of
every other person on Earth, and in fact it is different from that of every person who has ever lived.
THE DNA
Deoxyribonucleic acid (DNA) is a molecule that encodes the genetic instructions used in the
development and functioning of all known living organisms and many viruses.

Along with RNA and proteins, DNA is one of the 3 major macromolecules essential for all
known forms of life.
Most DNA molecules are double-stranded helices, consisting of two long biopolymers of
simpler units called nucleotides. Each nucleotide contains one of 4 nucleobases:
Guanine - a purine
Adenine - a purine
Thymine - a pyrimidine, and
Cytosine - a pyrimidine
recorded using the letters G, A, T, and C, as well as a backbone made of
alternating sugars (deoxyribose) and phosphate groups (related to phosphoric acid), with the
nucleobases (G, A, T, C) attached to the sugars.
DNA is well-suited for biological information storage, since the DNA backbone is
resistant to cleavage and the double-stranded structure provides the molecule with a
built-in duplicate of the encoded information.

Figure: DNA Structure


Although nucleotides derive their names from the nitrogenous bases they contain, they owe
much of their structure and bonding capabilities to their deoxyribose molecule. The central
portion of this molecule contains five carbon atoms arranged in the shape of a ring, and each
carbon in the ring is referred to by a number followed by the prime symbol ('). The two
strands of DNA run in opposite directions to each other and are therefore anti-parallel, one
backbone being 3' (three prime) and the other 5' (five prime). This refers to the direction
the 3rd and 5th carbon on the sugar molecule is facing.

When nucleotides join together in a series, they form a structure known as a polynucleotide.
Attached to each sugar is one of 4 types of molecules called nucleobases (informally, bases).
It is the sequence of these 4 nucleobases along the backbone that encodes genetic
information. This information is read using the genetic code, which specifies the sequence of
the amino acids within proteins. The code is read by copying stretches of DNA into the
related nucleic acid RNA in a process called transcription.
Within cells, DNA is organized into long structures called chromosomes. During cell
division these chromosomes are duplicated in the process of DNA replication, providing each
cell its own complete set of chromosomes. DNA is found in nearly all living cells. However,
its exact location within a cell depends on whether that cell possesses a special membrane-
bound organelle called a nucleus.
Eukaryotic organisms (animals, plants, fungi, and protists) store most of their DNA
inside the cell nucleus and some of their DNA in organelles, such
as mitochondria or chloroplasts (plastids). In contrast, prokaryotes (bacteria and archaea)
store their DNA only in the cytoplasm. This is because prokaryotes lack the nucleus.

Within the chromosomes, chromatin proteins such as histones compact and organize DNA.
These compact structures guide the interactions between DNA and other proteins, helping
control which parts of the DNA are transcribed.

Biological functions

DNA usually occurs as linear chromosomes in eukaryotes, and circular chromosomes in
prokaryotes. The set of chromosomes in a cell makes up its genome; the human genome has
approximately 3 billion base pairs of DNA arranged into 46 chromosomes. The information
carried by DNA is held in the sequence of pieces of DNA called genes. Transmission of
genetic information in genes is achieved via complementary base pairing.
For example, in transcription, when a cell uses the information in a gene, the DNA sequence
is copied into a complementary RNA sequence through the attraction between the DNA and
the correct RNA nucleotides. Usually, this RNA copy is then used to make a matching
protein sequence in a process called translation, which depends on the same interaction
between RNA nucleotides.
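
As a rough illustration of the complementary copying described above, the following Python sketch builds the RNA that pairs with a DNA template strand. The template sequence and the function name `transcribe` are hypothetical, chosen only for this example; real transcription involves RNA polymerase and regulatory signals that are not modelled here.

```python
# Pairing used when copying DNA into RNA: A pairs with U, T with A, G with C, C with G.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (written 5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


template = "TACCGATTT"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
print(rna)  # AUGGCUAAA
```

Because the template is written 3' to 5', the RNA comes out already in the 5' to 3' orientation and no reversal is needed.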
Alternatively, a cell may simply copy its genetic information in a process called DNA
replication. The details of these functions are covered elsewhere; here we focus on the
interactions between DNA and other molecules that mediate the function of the genome.
DNA CONFORMATIONS

DNA exists in many possible conformations (6 morphological forms), which include the A, B and
Z forms. The conformation that DNA adopts depends on 6 conditions:

Hydration level
DNA sequence
Amount and direction of super coiling
Chemical modifications of bases
Type and concentration of metal ions
Presence of polyamines in solution

1. Geometry of A-DNA conformation

The A form is a right-handed double helix fairly similar to the more common and
well-known B-DNA form, but with a shorter, more compact helical structure
Overall shape of the A form is short and wide (about 11 base pairs per turn)
It appears likely that it occurs only in dehydrated samples of DNA, such as those used
in crystallographic experiments, and possibly is also assumed by DNA-RNA hybrid
helices and by regions of double-stranded RNA.

Figure: A-DNA Conformation


2. Geometry of Z-DNA conformation

The Z form is a left-handed helix
Elongated and narrow (about 12 base pairs per turn)
Z-DNA is the skinniest form, with only one groove, and is stabilized by high salt
concentration (see the structure below)

Figure: Z-DNA conformation

3. Geometry of B-DNA conformation

DNA naturally occurs in the B form (about 10 base pairs per 360° turn)
Most stable configuration and predominant structure
Helix is right-handed
Exists under high hydration levels (92% H2O) and no unusual base sequence
Overall shape is long and narrow (Diameter = 1.9 nm)
Twin helical strands of polynucleotides form the DNA backbone (5' to 3' and 3' to 5')
Specificity of base pairing (A=T and G≡C)
Sugar is β-D-deoxyribose
Major (2.2 nm) and Minor (1.2 nm) grooves
Figure: B-DNA conformation

NB!!! The B and Z forms normally occur only in DNA, whereas the A form normally occurs in RNA

Figure: Different types of DNA conformations assembled. Identify them.


(From Left to Right: A-DNA, B-DNA, and Z-DNA form)
Figure: Comparison between A, B, and Z form of DNA.

PROPERTIES OF DNA MOLECULE

DNA was first identified and isolated by Friedrich Miescher and the double helix structure of
DNA was first discovered by James Watson and Francis Crick. The structure of DNA of all
species comprises two helical chains each coiled round the same axis, and each with a pitch
of 34 ångströms (3.4 nanometres) and a radius of 10 ångströms (1.0 nanometres).[5]
According to another study, when measured in a particular solution, the DNA chain measured
22 to 26 ångströms wide (2.2 to 2.6 nm), and one nucleotide unit measured 3.3 Å (0.33 nm)
long. The largest human chromosome, chromosome number 1, consists of approximately 220
million base pairs and is 85 mm long.
In living organisms DNA does not usually exist as a single molecule, but instead as a pair of
molecules that are held tightly together. These two long strands entwine like vines, in the
shape of a double helix. A nucleobase linked to a sugar is called a nucleoside.
When phosphoric acid is esterified to the sugar portion of a nucleoside, a nucleotide is
formed (see structure below); nucleotides are phosphoric esters of nucleosides. In other
words, a nucleoside is a compound that consists of a base and a sugar, covalently linked
together. It differs from a nucleotide by lacking a phosphate group in its structure.
In medicine, several nucleoside analogues are used as anticancer and antiviral agents. The
viral polymerase incorporates these compounds with non-canonical bases. These compounds are
activated in the cells by being converted into nucleotides; they are administered as
nucleosides because charged nucleotides cannot easily cross cell membranes.
A polymer comprising multiple linked nucleotides (as in DNA) is called a polynucleotide.
The backbone of the DNA strand is made from alternating phosphate and sugar residues. The
sugar in DNA is 2-deoxyribose, which is a pentose (five-carbon) sugar. Deoxyribose, or
more precisely 2-deoxyribose, is a monosaccharide with idealized formula H-(C=O)-(CH2)-
(CHOH)3-H. Its name indicates that it is a deoxy sugar, meaning that it is derived from
the sugar ribose by loss of an oxygen atom. Since the pentose sugars arabinose and ribose
only differ by the stereochemistry at C2', 2-deoxyribose and 2-deoxyarabinose are equivalent,
although the latter term is rarely used because ribose, not arabinose, is the precursor to
deoxyribose.
The sugars are joined together by phosphate groups that form phosphodiester bonds
between the third and fifth carbon atoms of adjacent sugar rings. These asymmetric bonds
mean a strand of DNA has a direction. The asymmetric ends of DNA strands, the 5' (five prime)
and 3' (three prime) ends, have a terminal phosphate group and a terminal hydroxyl group,
respectively. One major difference between DNA and RNA is the sugar, with the
2-deoxyribose in DNA being replaced by the alternative pentose sugar ribose in RNA.

Figure: Five prime and three prime ends of DNA. Take note of where the next nucleotide
attaches to this molecule.
NB!!! The DNA double helix is stabilized primarily by the following:
1. Base-stacking interactions among aromatic nucleobases.
2. Hydrogen bonds linking the two bases of each pair
3. Charge to charge interactions

In the aqueous environment of the cell, the conjugated bonds of nucleotide bases align
perpendicular to the axis of the DNA molecule, minimizing their interaction with the
solvation shell and therefore, the Gibbs free energy (G).
How do these factors stabilize DNA double helix?

The double helix is stabilized because nitrogenous bases are only able to match up (pair) with
certain other nucleotides on the opposing strand. Pairing is determined by the molecular
shape of the bases and their ability to form hydrogen bonds.

Hydrogen bonds link the paired bases. Although each bond is weak energy-wise, together they are
able to stabilize the helix because of the large numbers present in a DNA molecule.

Stacking interactions between bases are individually weak, but the large number of
these interactions helps to stabilize the overall structure of the helix.

Hydrophobic effects stabilize DNA: burying the bases in the interior of the helix increases its
stability.

Charge-charge interactions refer to the electrostatic (ion-ion) repulsion between the negatively
charged phosphates, which is potentially destabilizing; however, the presence of Mg2+ and cationic
proteins with abundant arginine and lysine residues stabilizes the double helix.

NB!!! The significance of DNA and the factors stabilizing it is that it contains the genetic
instructions used in the development and functioning of all known living organisms and some
viruses. The main role of DNA molecules is the long-term storage of information. The DNA
segments that carry this genetic information are called genes.

Additional Forces help the DNA Double Helix to be a stable structure. The negatively
charged phosphate group is free to interact with positively charged atoms in electrostatic
forces. Hydrophobic forces stabilize DNA double helix by burying the bases in the interior of
the helix increasing its stability.
The stability of the DNA double helix depends on a fine balance of interactions including
hydrogen bonds between bases, hydrogen bonds between bases and surrounding water
molecules, and base-stacking interactions between adjacent bases.
1. Nucleobases Classification

The word base does not refer to an alkaline compound. It refers to a one- or two-ringed
nitrogenous compound. The nucleobases are classified into two types: the Purines (A and
G), being fused five- and six-membered heterocyclic compounds, and the Pyrimidines, the
six-membered rings (C and T).

Nucleobases are nitrogen-containing biological compounds found within DNA, RNA,
nucleotides, and nucleosides. They are also termed nitrogenous bases or simply bases.
Their ability to form base pairs and to stack upon one another leads directly to the
helical structure of DNA and RNA.

A purine is a heterocyclic aromatic organic compound. It consists of a pyrimidine ring
fused to an imidazole ring. Purines, including substituted purines and their tautomers, are the
most widely occurring nitrogen-containing heterocycles in nature. The most significant ones are
Adenine and Guanine, two of the nitrogenous bases found in nucleic acids (DNA & RNA).

Other notable types are xanthine, hypoxanthine, theobromine, caffeine, uric acid and
isoguanine.

Purines and pyrimidines make up the two groups of nitrogenous bases, including the two
groups of nucleotide bases (What are they???). Two of the four deoxyribonucleotides and
two of the four ribonucleotides, the respective building-blocks of DNA and RNA, are
purines.

A pyrimidine is a heterocyclic compound whose basic structure is a six-membered ring,
like that of benzene, with nitrogen atoms positioned at 1 and 3. This molecule is aromatic
and planar. Pyrimidine is isomeric with two other forms of diazine. Cytosine (C), Uracil (U),
and Thymine (T) are all examples of pyrimidines, each with different chemical groups.
In nucleotides, the pyrimidine is attached at its 1st nitrogen to the sugar of a
ribonucleotide (which has a hydroxy group at carbon-2 of the sugar) or a deoxyribonucleotide
(which has a hydrogen atom at C-2).

A fifth nucleobase, the pyrimidine Uracil (U), usually takes the place of thymine in RNA and
differs from thymine by lacking a methyl group on its ring. Thymine is also found to a
small extent in some forms of RNA. Uracil is not usually found in DNA, occurring only as a
breakdown product of cytosine.

The numbering of nucleobases is unprimed unlike the numbering of sugars.


Figure: The purines and pyrimidines and their attachments. Note the sugar on Uracil is
Ribose.

Figure: Purines and Pyrimidines, and their structures.


Base stacking

Base stacking interactions in DNA and RNA are due to dispersion attraction, short-range
exchange repulsion, and electrostatic interactions, which also contribute to stability. Again,
GC stacking interactions with adjacent bases tend to be more favorable. (Note, however, that
a GC stacking interaction with the next base pair is geometrically different from a CG
interaction.) Base stacking effects are especially important in the secondary structure and
tertiary structure of RNA; for example, RNA stem-loop structures are stabilized by base
stacking in the loop region.

Base modifications and DNA packaging

The expression of genes is influenced by how the DNA is packaged in chromosomes, in a
structure called chromatin. Base modifications can be involved in packaging, with regions
that have low or no gene expression usually containing high levels of methylation of cytosine
bases. DNA packaging and its influence on gene expression can also occur by covalent
modifications of the histone protein core around which DNA is wrapped in the chromatin
structure or else by remodeling carried out by chromatin remodeling complexes. There is,
further, crosstalk between DNA methylation and histone modification, so they can
coordinately affect chromatin and gene expression.

For example, cytosine methylation produces 5-methylcytosine, which is important for
X-chromosome inactivation. The average level of methylation varies between organisms: the
worm Caenorhabditis elegans lacks cytosine methylation, while vertebrates have higher
levels, with up to 1% of their DNA containing 5-methylcytosine. Despite the importance of
5-methylcytosine, it can deaminate to leave a thymine base, so methylated cytosines are
particularly prone to mutations. Other base modifications include adenine methylation in
bacteria, the presence of 5-hydroxymethylcytosine in the brain, and the glycosylation of
uracil to produce the "J-base" in kinetoplastids.

Functions of Nucleobases

Aside from the crucial roles of Purines (adenine and guanine) in DNA and RNA, Purines are
also significant components in a number of other important biomolecules, such as ATP, GTP,
cyclic AMP, NADH, and coenzyme A. Purine itself has not been found in nature, but it can
be produced by organic synthesis. Purines may also function directly as neurotransmitters,
acting upon purinergic receptors; for example, adenosine activates adenosine receptors. They
also form part of coenzymes in oxidation-reduction reactions (NAD/FAD).

Sources of Nucleobases

Purines are found in high concentration in meat and meat products, especially internal organs
such as liver and kidney. In general, plant-based diets are low in purines. Examples of high-
purine sources include: sweetbreads, anchovies, sardines, liver, beef kidneys, brains, meat
extracts (e.g., Oxo, Bovril), herring, mackerel, scallops, game meats, beer (from the yeast)
and gravy.
A moderate amount of purine is also contained in beef, pork, poultry, other fish and seafood,
asparagus, cauliflower, spinach, mushrooms, green peas, lentils, dried peas, beans, oatmeal,
wheat bran, wheat germ, and hawthorn.

Higher levels of meat and seafood consumption are associated with an increased risk of gout,
whereas a higher level of consumption of dairy products is associated with a decreased risk.
Moderate intake of purine-rich vegetables or protein is not associated with an increased risk
of gout (Refer BCH 313).

Chemical properties of Nucleobases

Pyrimidine has properties similar to those of pyridine. One similarity is that as the number of
nitrogen atoms in the ring increases, the ring pi electrons become less energetic and, as a
result, electrophilic aromatic substitution gets more difficult while nucleophilic aromatic
substitution gets easier. One example is the displacement of the amino group in
2-aminopyrimidine by chlorine and its reverse reaction. Reduction in resonance stabilization of
pyrimidines leads to addition and ring cleavage reactions rather than substitutions. An
example of this is the Dimroth rearrangement. Pyrimidines are less basic than pyridines, and
N-alkylation and N-oxidation are more difficult in pyrimidines as well.

2. Grooves

Twin helical strands form the DNA backbone. Another double helix may be found tracing the
spaces, or grooves, between the strands. These voids are adjacent to the base pairs and may
provide a binding site. As the strands are not symmetrically located with respect to each
other, the grooves are unequally sized. One groove, the major groove, is 22 Å wide and the
other, the minor groove, is 12 Å wide.

The narrowness of the minor groove means that the edges of the bases are more accessible in
the major groove. As a result, proteins like transcription factors that can bind to specific
sequences in double-stranded DNA usually make contacts to the sides of the bases exposed in
the major groove. This situation varies in unusual conformations of DNA within the cell, but
the major and minor grooves are always named to reflect the differences in size that would be
seen if the DNA is twisted back into the ordinary B form.

NB: 1 Ångström (Å) = 1.0 x 10^-10 metres


Figure: Major and Minor Grooves. Minor groove is a binding site (for the dye Hoechst
33258). Major groove is said to be rich in chemical information (genes).

3. Base Pairing Complementarity

In a DNA double helix, each type of nucleobase on one strand bonds with just one type of
nucleobase on the other strand. This is called complementary base pairing.
Complementarity is a property shared between two nucleic acid sequences, such that when
they are aligned antiparallel to each other, the nucleotide bases at each position will be
complementary. Two bases are complementary if they form Watson-Crick base pairs. The
degree of complementarity between two nucleic acid strands may vary, from total
complementarity to none. Here, purines form hydrogen bonds to pyrimidines, with adenine
bonding only to thymine in two hydrogen bonds, and cytosine bonding only to guanine in
three hydrogen bonds. This arrangement of two nucleotides binding together across the
double helix is called a base pair.
As hydrogen bonds are not covalent, they can be broken and rejoined relatively easily. The
two strands of DNA in a double helix can therefore be pulled apart like a zipper, either by a
mechanical force or high temperature. As a result of this complementarity, all the information
in the double-stranded sequence of a DNA helix is duplicated on each strand, which is vital in
DNA replication. Indeed, this reversible and specific interaction between complementary
base pairs is critical for all the functions of DNA in living organisms.
Figure: Base pairing complementarity - a GC base pair with three hydrogen bonds.

Figure: An AT base pair with two hydrogen bonds. Non-covalent hydrogen
bonds between the pairs are shown as dashed lines.

The two types of base pairs form different numbers of hydrogen bonds, AT forming two
hydrogen bonds, and GC forming three hydrogen bonds (see the figures above). DNA with
high GC-content is more stable than DNA with low GC-content.
As noted above, most DNA molecules are actually two polymer strands, bound together in a
helical fashion by noncovalent bonds; this double stranded structure (dsDNA) is maintained
largely by the intrastrand base stacking interactions, which are strongest for G,C stacks. The
two strands can come apart, a process known as melting, to form two single-stranded DNA
(ssDNA) molecules. Melting occurs at high temperature, low salt and high pH
(low pH also melts DNA, but since DNA is unstable due to acid depurination, low pH is
rarely used).
The stability of the dsDNA form depends not only on the GC-content (% G,C basepairs) but
also on sequence (since stacking is sequence specific) and also length (longer molecules are
more stable). The stability can be measured in various ways; a common way is the "melting
temperature", which is the temperature at which 50% of the ds molecules are converted to ss
molecules; melting temperature is dependent on ionic strength and the concentration of DNA.
As a result, it is both the percentage of GC base pairs and the overall length of a DNA double
helix that determines the strength of the association between the two strands of DNA. Long
DNA helices with a high GC-content have stronger-interacting strands, while short helices
with high AT content have weaker-interacting strands. In biology, parts of the DNA double
helix that need to separate easily, such as the TATAAT Pribnow box in some promoters, tend
to have a high AT content, making the strands easier to pull apart.
In the laboratory, the strength of this interaction can be measured by finding the temperature
necessary to break the hydrogen bonds, the melting temperature (also called Tm value). When
all the base pairs in a DNA double helix melt, the strands separate and exist in solution as two
entirely independent molecules. These single-stranded DNA molecules (ssDNA) have no
single common shape, but some conformations are more stable than others.
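
To make the link between base composition and melting temperature concrete, here is a small Python sketch. It uses the Wallace rule, Tm ≈ 2 °C x (A+T) + 4 °C x (G+C), which is a common rule of thumb for short oligonucleotides and is not taken from these notes; it ignores the ionic strength, DNA concentration and sequence-specific stacking effects mentioned above. The two sequences are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C for a short oligonucleotide (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two hypothetical 14-base sequences of equal length but opposite composition.
at_rich = "TATAATTATAATTA"
gc_rich = "GCGCCGGCGCGGCC"

for name, seq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    print(name, "GC fraction:", gc_content(seq), "approx Tm:", wallace_tm(seq))
```

With equal length, the GC-rich sequence gets the higher estimate (56 against 28), which matches the qualitative statement that strands with high GC-content are harder to separate.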

Chargaff's Rules
For DNA, adenine (A) bases complement thymine (T) bases and vice versa; guanine (G)
bases complement cytosine (C) bases and vice versa. With RNA, it is the same except that
uracil is present in place of thymine, and therefore adenine (A) bases complement uracil (U)
bases.

Since there is only one complementary base for each of the bases found in DNA and in RNA,
one can reconstruct a complementary strand for any single strand. All C bases in one strand
will pair with G bases in the complementary strand, etc. In a DNA double helix, the two
strands of DNA are complementary; this plays an important role in DNA replication, as each
strand can act as a template for the construction of the other.

For example, the complementary strand of the DNA sequence

5' A G T C A T G 3'

Is

3' T C A G T A C 5'

NB!!! Note that the latter is often written as the reverse complement with the 5' end on the left
and the 3' end on the right, as below:

5' C A T G A C T 3'

A sequence that is equal to its reverse complement is said to be a palindromic sequence.
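
The complement, reverse complement and palindrome test described above can be sketched in Python as follows. The function names are assumptions made for this illustration, and GAATTC is used only as an example of a sequence that equals its own reverse complement.

```python
# Translation table for Watson-Crick pairing in DNA: A<->T and G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Complementary strand, read in the opposite direction (3'->5') to the input."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand rewritten 5'->3'."""
    return complement(seq)[::-1]


def is_palindromic(seq):
    """True if the sequence is equal to its own reverse complement."""
    return seq == reverse_complement(seq)


seq = "AGTCATG"                   # the 5'->3' example from the text
print(complement(seq))            # TCAGTAC (3'->5')
print(reverse_complement(seq))    # CATGACT (5'->3')
print(is_palindromic(seq))        # False
print(is_palindromic("GAATTC"))   # True
```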

NB!!! Chargaff's rules state that DNA from any cell of any organism should have a 1:1
ratio (base pair rule) of pyrimidine and purine bases and, more specifically, that the
amount of guanine is equal to that of cytosine and the amount of adenine is equal to that of thymine.
This pattern is found in both strands of the DNA. The rules were discovered by the Austrian chemist
Erwin Chargaff.

First parity rule


The first rule holds that a double-stranded DNA molecule globally has percentage base
pair equality: %A = %T and %G = %C. The rigorous validation of the rule constitutes the
basis of Watson-Crick pairs in the DNA double helix.
Second parity rule

The second rule holds that both %A ~ %T and %G ~ %C are valid for each of the two DNA
strands. This describes only a global feature of the base composition in a single DNA strand.

Why does Chargaff's rule not work in RNA?

RNA is found as a single stranded molecule. Chargaff's rule states that DNA helices contain
equal molar ratios of A to T and G to C. This is because DNA is found as a double stranded
helix in which A and T and G and C bases pair complementarily. RNA only forms local
helices meaning that it doesn't necessarily contain equal ratios.
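
A short Python sketch can illustrate why base pairing gives %A = %T and %G = %C in a duplex, while a lone strand of arbitrary sequence need not show these equalities. The strand used here is a hypothetical toy sequence and the helper names are assumptions for this example.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def base_percentages(seq):
    """Percentage of each base (A, C, G, T) in the sequence."""
    counts = Counter(seq)
    return {base: 100 * counts[base] / len(seq) for base in "ACGT"}


strand = "AAGAGCATAA"                        # hypothetical single strand
partner = strand.translate(COMPLEMENT)[::-1]  # its base-paired partner strand
duplex = strand + partner                    # all bases of the double helix

single = base_percentages(strand)
double = base_percentages(duplex)

print("single strand:", single)  # A and T differ, G and C differ
print("duplex:       ", double)  # A equals T, G equals C
```

In the duplex every A on one strand is matched by a T on the other, and every G by a C, so the equalities follow directly from pairing; the toy single strand is 60% A but only 10% T.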

4. Sense and Antisense (Exons and Introns)

A DNA sequence is called "sense" if its sequence is the same as that of a messenger RNA
copy that is translated into protein. It always runs in the 5' to 3' direction. The sequence on
the opposite strand is called the "antisense" sequence. It always runs in the 3' to 5' direction
of the DNA.
Both sense and antisense sequences can exist on different parts of the same strand of DNA
(i.e. both strands contain both sense and antisense sequences). In both prokaryotes and
eukaryotes, antisense RNA sequences are produced, but the functions of these RNAs are not
entirely clear. One proposal is that antisense RNAs are involved in regulating gene
expression through RNA-RNA base pairing.

A few DNA sequences in prokaryotes and eukaryotes, and more in plasmids and viruses, blur
the distinction between sense and antisense strands by having overlapping genes. In these
cases, some DNA sequences do double duty, encoding one protein when read along one
strand, and a second protein when read in the opposite direction along the other strand. In
bacteria, this overlap may be involved in the regulation of gene transcription, while in
viruses, overlapping genes increase the amount of information that can be encoded within the
small viral genome.

An intron is any nucleotide sequence within a gene that is removed by RNA splicing
while the final mature RNA product of a gene is being generated. The term intron refers
to both the DNA sequence within a gene and the corresponding sequence in RNA transcripts.
Sequences that are joined together in the final mature RNA after RNA splicing are exons. An
exon is any nucleotide sequence encoded by a gene that remains present within the final
mature RNA product of that gene after introns have been removed by RNA splicing.

The term exon refers to both the DNA sequence within a gene and to the corresponding
sequence in RNA transcripts. In RNA splicing, introns are removed and exons are covalently
joined to one another as part of generating the mature messenger RNA or noncoding RNA
product of a gene.

Introns are found in the genes of most organisms and many viruses, and can be located in a
wide range of genes, including those that generate proteins, ribosomal RNA (rRNA), and
transfer RNA (tRNA). When proteins are generated from intron-containing genes, RNA
splicing takes place as part of the RNA processing pathway that follows transcription and
precedes translation.
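
The removal of introns and joining of exons described above can be illustrated with a toy
model. In this sketch the transcript, the intron coordinates and the function name are all
hypothetical; real splicing depends on sequence signals and cellular machinery that are not
modelled here.

```python
# Toy model of RNA splicing: cut out the introns and join the exons.
# Coordinates are 0-based and end-exclusive; all values are made up for illustration.
def splice(pre_rna, introns):
    """Return the mature RNA left after removing the given intron intervals."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_rna[position:start])  # exon before this intron
        position = end                         # skip over the intron
    exons.append(pre_rna[position:])           # final exon
    return "".join(exons)


pre_rna = "AUGGCC" + "GUAAGUCCAG" + "UUUUAA"   # exon 1 + intron + exon 2
print(splice(pre_rna, [(6, 16)]))              # AUGGCCUUUUAA
```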

Figure: An Intron (and exons) location on a gene.

Figure: An Exon (and introns) located on a gene.

5. Supercoiling

DNA can be twisted like a rope in a process called DNA supercoiling. With DNA in its
"relaxed" state, a strand usually circles the axis of the double helix once every 10.4 base
pairs, but if the DNA is twisted the strands become more tightly or more loosely wound.

If the DNA is twisted in the direction of the helix, this is positive supercoiling, and the bases
are held more tightly together. If they are twisted in the opposite direction, this is negative
supercoiling, and the bases come apart more easily.

In nature, most DNA has slight negative supercoiling that is introduced by enzymes called
topoisomerases. These enzymes are also needed to relieve the twisting stresses introduced
into DNA strands during processes such as transcription and DNA replication.
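
As a rough worked example of the 10.4 base pairs per turn figure, the sketch below counts the
helical turns in a relaxed molecule and shows how adding or removing turns changes the number
of base pairs per turn. The 5,200 bp molecule and the 25-turn change are hypothetical values,
and the model is a simplification: it puts all of the extra twisting into the helix itself
and ignores coiling of the helix axis.

```python
# Simplified illustration of over- and under-winding.
# 10.4 bp/turn is the relaxed value quoted in the notes; everything else is assumed.
BP_PER_TURN_RELAXED = 10.4


def relaxed_turns(n_bp):
    """Number of helical turns in a relaxed duplex of n_bp base pairs."""
    return n_bp / BP_PER_TURN_RELAXED


def bp_per_turn(n_bp, turns):
    """Average base pairs per helical turn for a given number of turns."""
    return n_bp / turns


n_bp = 5200                                   # hypothetical molecule
turns0 = relaxed_turns(n_bp)                  # about 500 turns when relaxed
overwound = bp_per_turn(n_bp, turns0 + 25)    # twisted with the helix
underwound = bp_per_turn(n_bp, turns0 - 25)   # twisted against the helix

print(round(turns0), round(overwound, 2), round(underwound, 2))
```

Over-winding gives fewer base pairs per turn than 10.4 (a tighter helix), and under-winding
gives more (a looser helix).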

Supercoiling in prokaryotes

DNA supercoiling refers to the over- or under-winding of a DNA strand. Supercoiling is
important in a number of biological processes, such as compacting DNA. Prokaryotic cells do
not contain nuclei or other membrane-bound organelles, yet their genome must still be
packaged to fit inside the cell.

The nucleoid is simply the area of a prokaryotic cell in which the chromosomal DNA is
located. This arrangement is not as simple as it sounds, however, especially considering that
the E. coli chromosome, if stretched out, is several orders of magnitude longer than the cell
itself. The question to ask is: how do bacteria, which lack a nucleus, organize and pack
their genome into the cell? The answer lies in supercoiling.

Supercoiling is important in the following:

DNA packing (packaging)
Replication
Repair
Recombination
Transposition and transcription

A rubber band can be used to demonstrate supercoiling. Imagine twisting a rubber band so
that it forms tiny coils. Now twist it even further, so that the original coils fold over one
another and form a condensed ball. When this type of twisting happens to a bacterial genome,
it is known as supercoiling (see the figure below).

Figure: Rubber band used to demonstrate supercoiling.

By comparison, eukaryotes wrap their DNA around proteins called histones to help package it
into smaller spaces, but most prokaryotes do not have histones. Thus, one way prokaryotes
compress their DNA into smaller spaces is through supercoiling.

Figure: A supercoiled chromosome of E. coli.

Genomes can be negatively supercoiled, meaning that the DNA is twisted in the opposite
direction of the double helix, or positively supercoiled, meaning that the DNA is twisted in
the same direction as the double helix. Most bacterial genomes are negatively supercoiled
during normal growth (see the structure below).

There are a number of proteins involved in supercoiling. These proteins act together to fold
and condense prokaryotic DNA.

In particular, one protein called HU, which is the most abundant protein in the nucleoid,
works with an enzyme called topoisomerase I to bind DNA and introduce sharp bends in the
chromosome, generating the tension necessary for negative supercoiling. Recent studies have
also shown that other proteins, including integration host factor (IHF), can bind to specific
sequences within the genome and introduce additional bends.

The folded DNA is then organized into a variety of conformations that are supercoiled and
wound around tetramers of the HU protein, much like eukaryotic chromosomes are wrapped
around histones.

Once the prokaryotic genome has been condensed, DNA topoisomerase I, DNA gyrase, and
other proteins help maintain the supercoils. One of these maintenance proteins, H-NS, plays
an active role in transcription by modulating the expression of the genes involved in the
response to environmental stimuli. Another maintenance protein, factor for inversion
stimulation (FIS), is abundant during exponential growth and regulates the expression of
more than 231 genes, including DNA topoisomerase I.

Figure: The mechanism of topoisomerases.

Without topoisomerases, DNA cannot replicate normally. Inhibitors of topoisomerases have
therefore been used as anti-cancer drugs to stop the proliferation of malignant cells.
However, these inhibitors may also stop the division of normal cells. Cells that need to
divide continuously (e.g., hair follicle cells) are the most affected, which explains a
noticeable side effect: hair loss.

NB!!! What function does supercoiling serve the cell? Supercoiling compacts the DNA.
Negative supercoiling helps to unwind the DNA duplex for replication and
transcription.

RNA Structure and Function

Ribonucleic acid (RNA) is a long polymer of nucleotides found in the nucleus but mainly in
the cytoplasm of a cell. Like DNA, RNA molecules are polynucleotides with a sugar-phosphate
backbone and four kinds of bases. The nucleotides are joined by phosphodiester bonds, just
as they are in DNA.

The main differences between RNA and DNA are:

RNA molecules are usually single-stranded.
The sugar in RNA is ribose (as opposed to deoxyribose) and has an OH at the 2' C position,
highlighted in red in the figure below (DNA sugars have H at that position).
Thymine in DNA is replaced by uracil in RNA. T has a methyl (-CH3) group in place of the H
atom shown in red in U.

Figure: A typical RNA backbone. The picture shows an ATP molecule (adenosine triphosphate)
about to be incorporated into an RNA chain with the release of a diphosphate.

The presence of a hydroxyl group at the 2' position of the ribose sugar makes RNA different
from DNA and makes RNA helices adopt A-form geometry rather than the B-form most commonly
observed in DNA.

The hydroxyl group at the 2' position also means that, in the flexible regions of an RNA
molecule, it can chemically attack the adjacent phosphodiester bond and cleave the backbone.

Figure: The difference between the sugars found in DNA and RNA. What are the
differences?

The basic structure of RNA can be outlined as a ribose sugar, whose carbons are numbered 1'
through 5', with:

a base attached to the 1' position
a hydroxyl group at the 2' position
a phosphate attached to the 3' position of one ribose and the 5' position of the next

RNA transmits genetic information from DNA to the cytoplasm and is involved with the
synthesis of proteins that control chemical processes in the cell.

Figure: Phosphodiester linkage in RNA


RNA molecules do not have a regular helical structure like DNA. Instead, they can form
complicated three-dimensional structures in which the strand loops back on itself and forms
intra-strand base pairs (for example, a hairpin-shaped structure) between self-complementary
regions along the chain. See the structure below.

Figure: An RNA structure

There are three classes of RNA molecules:

1. messenger RNA
(mRNA), which acts as a template for protein synthesis and has the same sequence of
bases (read from the 5' to the 3' end) as the DNA strand that has the gene sequence, with
U in place of T. mRNAs can range from ~300 nucleotides to ~7000 nucleotides, depending on
the size and the number of proteins that they code for.
Cellular organisms use mRNA to convey genetic information that directs synthesis of
specific proteins, while many viruses encode their genetic information using an RNA
genome (look out for structure).

2. transfer RNA
(tRNA), one for each triplet codon that codes for a specific amino acid (the building
blocks of proteins). tRNA molecules are covalently attached to the corresponding amino
acid at one end, and at the other end they have a triplet sequence (called the anticodon)
that is complementary to the triplet codon on the mRNA. All tRNA molecules are in the
range ~70-90 nucleotides. They have a molecular weight of ~25,000 and a sedimentation
coefficient of ~4 Svedberg (S) units.

All tRNA molecules have very similar secondary structures in which the single-stranded
chain is folded in a 'clover-leaf' structure that has three hairpins and an acceptor
stem where the amino acid is covalently attached. The acceptor stem contains the 3' end of
the chain, which always terminates in the sequence 5'-CCA-3'.

Figure: transfer RNA molecules

3. ribosomal RNA
(rRNA), which makes up an integral part of the ribosome, the protein synthesis machinery
in the cell.
The ribosome is a large complex (~20 nm in diameter, 70S sedimentation coefficient for
bacterial ribosomes) made of two subunits: a large subunit (~50S) and a small subunit
(~30S). The large subunit is in turn made of two ribosomal RNAs (5S and 23S) and about 34
proteins, whereas the small subunit has one ribosomal RNA (16S) and ~21 proteins. The 23S
rRNA is ~3000 nucleotides long, and the 16S rRNA is ~1500 nucleotides long.

Figure: ribosomal RNA structure
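
The sequence relationships described for mRNA and tRNA above can be sketched as follows: the
mRNA carries the sense-strand sequence with U in place of T, and an anticodon is the reverse
complement of its codon. The function names and the example gene fragment are hypothetical,
and the sketch assumes strict Watson-Crick pairing with unmodified bases.

```python
# Minimal sketch: sense DNA -> mRNA, and codon -> anticodon.
# Assumes strict Watson-Crick pairing; names and sequences are illustrative only.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def transcribe(sense_dna):
    """Return the mRNA sequence: same as the sense strand, with U replacing T."""
    return sense_dna.upper().replace("T", "U")


def anticodon(codon):
    """Return the anticodon that pairs with an mRNA codon, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon.upper()))


mrna = transcribe("ATGGCATAA")
print(mrna)              # AUGGCAUAA
print(anticodon("AUG"))  # CAU
```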


Hyperchromic Effect of DNA

Because of its unique chemistry and structure, DNA shows very peculiar electronic
properties. A complete understanding of these properties from an atomic and molecular point
of view is crucial to shed light on several fundamental biophysical and biochemical processes
occurring in living organisms, to improve DNA sequencing, to detect somatic mutations, and
to enhance emerging therapeutic strategies.

Hyperchromicity is the increase of absorbance (optical density) of a material. The most
famous example is the hyperchromicity of DNA that occurs when the DNA duplex is
denatured. The UV absorption increases when the two DNA strands are separated, whether by
heat, by addition of a denaturant, or by increasing the pH. The opposite, a decrease of
absorbance, is called hypochromicity.

Among others, the so-called hyperchromic effect, which refers to the experimentally
observed change in absorbance after DNA denaturation, is a very important property of
DNA. This effect is routinely used to record melting curves by monitoring the change in
absorbance when the DNA undergoes a conformational transition from double-stranded to
single-stranded structure in response to a chemical or physical perturbation.

Heat denaturation of DNA, also called melting, causes the double helix to unwind and form
single-stranded DNA. This happens when DNA in solution is heated above its melting
temperature (usually more than 80 °C). The bases become unstacked and can thus absorb more
light. In their native state, the bases of DNA absorb light in the 260 nm wavelength region.
When the bases become unstacked, the wavelength of maximum absorbance does not change, but
the amount absorbed increases by about 37%. A double-stranded DNA dissociating into single
strands produces a sharp, cooperative transition.

Figure: The denaturing of DNA.

Hyperchromicity can be used to track the condition of DNA as temperature changes. The
transition/melting temperature (Tm) is the temperature at which the absorbance of UV light
is halfway between its minimum and maximum values, i.e. where 50% of the DNA is denatured.

The hyperchromic effect is the striking increase in absorbance of DNA upon denaturation.
The two strands of DNA are held together mainly by stacking interactions, hydrogen bonds and
the hydrophobic effect between the complementary bases.
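
The definition of Tm as the half-way point of the absorbance change can be turned into a
small calculation. The sketch below reads Tm off a melting curve by linear interpolation;
the temperature and absorbance values are invented for illustration, the function names are
assumptions, and the curve is assumed to rise steadily with temperature.

```python
# Minimal sketch: estimate Tm from a melting curve by linear interpolation.
# The data points below are hypothetical, not measurements.
def fraction_denatured(absorbance, a_min, a_max):
    """Fraction of DNA denatured, assuming absorbance scales linearly with it."""
    return (absorbance - a_min) / (a_max - a_min)


def melting_temperature(temps, absorbances):
    """Temperature at which absorbance is halfway between its minimum and maximum."""
    a_min, a_max = min(absorbances), max(absorbances)
    half = (a_min + a_max) / 2.0
    points = list(zip(temps, absorbances))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            # interpolate between the two points that bracket the half-way value
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("curve never crosses the half-way absorbance")


temps = [60, 70, 80, 85, 90, 95]              # degrees C
a260 = [1.00, 1.00, 1.05, 1.20, 1.35, 1.37]   # relative absorbance at 260 nm
print(round(melting_temperature(temps, a260), 1))  # 84.5
```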

Summary

Hydrogen bonding between the bases limits the resonance of the aromatic rings, so the
absorbance of the sample is limited as well.
When the DNA double helix is treated with denaturing agents, the interactions holding the
double-helical structure together are disrupted.
The double helix then separates into two single strands, which adopt a random-coil
conformation.
The base-base interactions are then reduced, which increases the UV absorbance of the DNA
solution, because many bases are in free form and do not form hydrogen bonds with
complementary bases.
As a result, the absorbance of single-stranded DNA is about 37% higher than that of
double-stranded DNA at the same concentration.

The measurement of absorption of light is important in monitoring the melting and
annealing of DNA.

Figure: Nucleic acid melting curve showing hyperchromicity as a function of temperature.

Some important concepts in Hyperchromic Effect

At Tm, the DNA is half denatured and half double stranded.

By lowering the temperature below the Tm, the denatured DNA strands would anneal
back into a double stranded DNA.

When the temperature is above the Tm, the DNA is denatured.

Because melting occurs over a narrow temperature range, monitoring the absorbance of the
DNA at various temperatures indicates the melting temperature (Tm). By finding the
temperature at which DNA melts and anneals, scientists are able to separate DNA strands and
anneal them with other DNA strands. This is important in creating hybrid DNAs, which consist
of two DNA strands from different sources. Since DNA strands can only anneal if they are
similar, the formation of hybrid DNAs can indicate similarities between the genomes of
different organisms.

The Melting Temperature of DNA

The stacking of the bases in the native conformation of DNA contributes the largest part of
the stabilization energy. Energy must be added to a sample of DNA to break the hydrogen
bonds and to disrupt the stacking interactions. This is usually carried out by heating the
DNA in solution (refer to the hyperchromic effect above).

Denaturation of DNA involves the disruption of hydrogen bonds between base pairs and of the
hydrophobic interactions between stacked nucleobases. Following this, the double helix
unwinds to form two single strands. Renaturation is the reverse process.

Nucleation is the rate limiting step in renaturation.

Figure: Strand Separation or melting.


High Tm indicates high G/C content. Careful monitoring of Tm gives an estimate of base
composition. When the DNA is heated very slowly, the A=T regions (2 H-bonds) denature first,
followed by the G≡C regions (3 H-bonds); see the diagrams below.

How can DNA denaturation be monitored? Bases absorb light in the 260 nm wavelength
region. On denaturation the wavelength of maximum absorption does not change, but the amount
of light absorbed increases. This effect is called hyperchromicity.

Factors affecting Tm of DNA:

Concentration of DNA
Concentration of ions in the solution, most notably Mg2+ and K+
DNA sequence
Length of DNA

Melting map: Finding rare mutations for common disease

A DNA melting map shows the temperature at which short segments of the human genome melt.
DNA carrying mutations melts at a lower temperature than normal DNA, so a melting map can be
used to identify genes that have rare mutations.

Natural process of strand separation

DNA replication
When stacking interactions and hydrogen bonding are partially disrupted, it is easier for
the protein to create locally unwound regions on DNA in A/T rich region.
Transcription

Heat causes the DNA helix to melt and separate into two strands (hyperchromic effect).
The melting of DNA can be monitored experimentally by observing the absorption of
ultraviolet light. The melting temperature of DNA increases as the G-C content increases.
DNA that consists entirely of A/T base pairs melts at about 70 °C, and DNA that has only G/C
base pairs melts at over 100 °C.
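
The dependence of Tm on G-C content can be sketched numerically. The example below computes
the G-C percentage of a sequence and applies a linear empirical relation of the Marmur-Doty
type, Tm = 69.3 + 0.41 x (%GC). The coefficients are an assumption brought in for this
sketch, not values from these notes; they apply only to long DNA at one particular salt
concentration, so the result is a rough estimate.

```python
# Rough sketch: G-C content and a linear Tm estimate.
# The intercept and slope are assumed empirical values, not taken from the notes.
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm_celsius(gc_pct, intercept=69.3, slope=0.41):
    """Linear estimate of Tm (degrees C) from the G-C percentage."""
    return intercept + slope * gc_pct


for name, seq in [("AT-only", "ATATATATAT"), ("mixed", "ATGCATGCGC"), ("GC-only", "GCGCGCGCGC")]:
    pct = gc_percent(seq)
    print(name, pct, round(estimate_tm_celsius(pct), 1))
```

With these assumed coefficients the A/T-only and G/C-only extremes land near 70 °C and above
100 °C, consistent with the values quoted above.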

Uses of DNA in technology (further reading)

1. Genetic engineering

Methods have been developed to purify DNA from organisms, such as phenol-chloroform
extraction, and to manipulate it in the laboratory, such as restriction digests and the
polymerase chain reaction. Modern biology and biochemistry make intensive use of these
techniques in recombinant DNA technology.

Recombinant DNA is a man-made DNA sequence that has been assembled from other DNA
sequences. It can be introduced into organisms in the form of plasmids or, in the
appropriate format, by using a viral vector. The genetically modified organisms produced can
be used to make products such as recombinant proteins, used in medical research, or be
grown in agriculture.

2. Forensics

Forensic scientists can use DNA in blood, semen, skin, saliva or hair found at a crime scene
to identify a matching DNA of an individual, such as a perpetrator. This process is formally
termed DNA profiling, but may also be called "genetic fingerprinting". In DNA profiling, the
lengths of variable sections of repetitive DNA, such as short tandem repeats and
minisatellites, are compared between people. This method is usually an extremely reliable
technique for identifying a matching DNA. However, identification can be complicated if the
scene is contaminated with DNA from several people.

The development of forensic science, and the ability to now obtain genetic matching on
minute samples of blood, skin, saliva or hair has led to a re-examination of a number of cases.
Evidence can now be uncovered that was not scientifically possible at the time of the original
examination. Combined with the removal of the double jeopardy law in some places, this can
allow cases to be reopened where previous trials have failed to produce sufficient evidence to
convince a jury. People charged with serious crimes may be required to provide a sample of
DNA for matching purposes. The most obvious defence to DNA matches obtained
forensically is to claim that cross-contamination of evidence has taken place. This has
resulted in meticulous, strict handling procedures in new cases of serious crime. DNA
profiling is also used to identify victims of mass casualty incidents. As well as positively
identifying bodies or body parts in serious accidents, DNA profiling is being successfully
used to identify individual victims in mass war graves by matching them to family members.

3. Bioinformatics

Bioinformatics involves the manipulation, searching, and data mining of biological data, and
this includes DNA sequence data. The development of techniques to store and search DNA
sequences has led to widely applied advances in computer science, especially string
searching algorithms, machine learning and database theory.

String searching or matching algorithms, which find an occurrence of a sequence of letters
inside a larger sequence of letters, were developed to search for specific sequences of
nucleotides. The DNA sequence may be aligned with other DNA sequences to identify
homologous sequences and locate the specific mutations that make them distinct. These
techniques, especially multiple sequence alignment, are used in studying phylogenetic
relationships and protein function.
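
The simplest form of the string searching described above can be sketched with Python's
built-in `str.find`. The example locates every occurrence of a short pattern in a sequence
and lists the positions at which two already-aligned sequences differ. The function names
and sequences are made up for illustration; real alignment tools use far more elaborate
algorithms.

```python
# Minimal sketch: exact pattern search and mismatch positions in aligned sequences.
# Names and sequences are illustrative assumptions.
def find_all(sequence, pattern):
    """Return 0-based start positions of every match, including overlapping ones."""
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        start = sequence.find(pattern, start + 1)
    return positions


def mismatches(seq_a, seq_b):
    """Positions at which two equal-length, already aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


print(find_all("ATGATGCATGA", "ATG"))     # [0, 3, 7]
print(mismatches("AGTCATG", "AGTTATG"))   # [3]
```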

Regions of DNA sequence that have the characteristic patterns associated with protein- or
RNA-coding genes can be identified by gene finding algorithms, which allow researchers to
predict the presence of particular gene products and their possible functions in an organism
even before they have been isolated experimentally.

4. DNA nanotechnology

DNA nanotechnology uses the unique molecular recognition properties of DNA and other
nucleic acids to create self-assembling branched DNA complexes with useful properties.
DNA is thus used as a structural material rather than as a carrier of biological information.
This has led to the creation of two-dimensional periodic lattices (both tile-based as well as
using the "DNA origami" method) as well as three-dimensional structures in the shapes of
polyhedra. Nanomechanical devices and algorithmic self-assembly have also been
demonstrated, and these DNA structures have been used to template the arrangement of other
molecules such as gold nanoparticles and streptavidin proteins.

5. Information storage

In a paper published in Nature in January 2013, scientists from the European Bioinformatics
Institute and Agilent Technologies proposed a mechanism to use DNA's ability to code
information as a means of digital data storage. The group was able to encode 739 kilobytes of
data into DNA code, synthesize the actual DNA, then sequence the DNA and decode the
information back to its original form, with a reported 100% accuracy. The encoded
information consisted of text files and audio files. A prior experiment was published in
August 2012. It was conducted by researchers at Harvard University, where the text of a
54,000-word book was encoded in DNA.
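
To give a feel for how digital data can be written as a DNA sequence, the sketch below maps
every two bits of a byte to one base and then decodes the sequence back. This simple 2-bit
mapping is an assumption made for illustration; it is not the encoding used in the studies
described above, and it ignores the synthesis and sequencing errors a practical scheme must
handle.

```python
# Illustrative 2-bits-per-base encoding of bytes as DNA (not the published scheme).
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode(data: bytes) -> str:
    """Encode bytes as a DNA string, four bases per byte, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(dna: str) -> bytes:
    """Decode a DNA string produced by encode() back into bytes."""
    if len(dna) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


dna = encode(b"BCH")
print(dna)          # CAAGCAATCAGA
print(decode(dna))  # b'BCH'
```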

6. History and anthropology

Because DNA collects mutations over time, which are then inherited, it contains historical
information, and, by comparing DNA sequences, geneticists can infer the evolutionary
history of organisms, their phylogeny. This field of phylogenetics is a powerful tool in
evolutionary biology. If DNA sequences within a species are compared, population
geneticists can learn the history of particular populations. This can be used in studies
ranging from ecological genetics to anthropology. For example, DNA evidence is being used to
try to identify the Ten Lost Tribes of Israel.

DNA has also been used to look at modern family relationships, such as establishing family
relationships between the descendants of Sally Hemings and Thomas Jefferson. This usage is
closely related to the use of DNA in criminal investigations detailed above. Indeed, some
criminal investigations have been solved when DNA from crime scenes has matched relatives
of the guilty individual.
