
Human Genome Project

Unwinding the helical mystery By IFFAT FATIMA

The Human Genome


The human genome is large and complex. Stretched end to end, the DNA in a single human cell spans about 6 feet and contains an estimated 30,000 to 40,000 genes. The DNA is organized into a haploid set of 22 autosomes and one sex chromosome.

Human Genome

Each chromosome contains many genes, the basic physical and functional units of heredity.

Genes are specific sequences of bases that encode instructions on how to make proteins.
Proteins perform most life functions and even make up the majority of cellular structures. Proteins are large, complex molecules made up of smaller subunits called amino acids. A protein folds into a specific three-dimensional structure that defines its particular function in the cell.
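To make the idea of a gene "encoding instructions" concrete, here is a minimal sketch of how a base sequence is read three bases (one codon) at a time and mapped to amino acids. The codon table is deliberately partial (only the codons used in the example plus the three stop codons), and the input sequence is a made-up illustration, not a real gene.

```python
# Partial codon table (coding-strand DNA, so T appears instead of U).
# Only a handful of the 64 codons are listed; this is an illustrative subset.
CODON_TABLE = {
    "ATG": "Met", "TTT": "Phe", "GGC": "Gly", "GCA": "Ala", "AAA": "Lys",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}


def translate(dna):
    """Read the sequence codon by codon and return the amino acids in order."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon, "???")  # "???" = not in our subset
        if amino_acid == "Stop":
            break  # a stop codon ends the protein
        protein.append(amino_acid)
    return protein


# Hypothetical 18-base sequence: five codons followed by a stop codon.
peptide = translate("ATGTTTGGCGCAAAATAA")
print("-".join(peptide))
```

A real gene would need the full 64-codon table and, in humans, removal of introns before translation.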

Human Genome Project


The project set out to:
identify all of the approximately 30,000 genes in human DNA,
determine the sequences of the 3 billion chemical base pairs that make up human DNA,
store this information in databases,
improve tools for data analysis.

The Human Genome Project (HGP)


An international research program designed to:
1. Construct detailed genetic and physical maps of the human genome.
2. Determine the complete nucleotide sequence of human DNA.
3. Localize the estimated 50,000 to 100,000 genes within the human genome.
4. Perform similar analyses on the genomes of other organisms used as model systems in research.
5. Produce a resource of detailed information about the structure, organization, and function of human DNA, information that constitutes the basic set of inherited instructions for the development and functioning of a human being.

Human Genome Project-Milestones:


1990: Project initiated as a joint effort of the U.S. Department of Energy and the National Institutes of Health
June 2000: Completion of a working draft of the entire human genome
February 2001: Analyses of the working draft are published
April 2003: HGP sequencing is completed and the project is declared finished, two years ahead of schedule

Whose DNA is being sequenced?


Samples of blood (from female donors) and sperm (from male donors) were collected from a large number of people. Celera Genomics collected samples from individuals who were Hispanic, Asian, Caucasian, and African-American. The donor identities were protected.

Stages of Human Genome Project

The project had three stages:


1. Genetic (or linkage) mapping
2. Physical mapping
3. DNA sequencing

Step 1 - Map Creation


The first step towards sequencing the genome is creating maps.

Genetic Linkage Maps


A linkage map (genetic map) gives the location of several thousand genetic markers on each chromosome.
A genetic marker is a gene or other identifiable DNA sequence.
Recombination frequencies are used to determine the order and relative distances between genetic markers.
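As a rough sketch of how recombination frequencies give order and distance: the frequency is the fraction of offspring that are recombinant for a pair of markers, and for short distances 1% recombination corresponds to about 1 centimorgan (cM). With three markers, the pair with the largest distance sits at the two ends and the remaining marker lies between them. The marker names and offspring counts below are hypothetical.

```python
def map_distance_cm(recombinants, total):
    """Recombination frequency expressed in centimorgans (1% ~ 1 cM)."""
    return 100 * recombinants / total


def order_markers(distances):
    """Order three markers given their pairwise distances.

    distances maps a pair of marker names to a distance in cM.
    The pair that is farthest apart forms the two ends of the map.
    """
    outer = max(distances, key=distances.get)
    markers = {m for pair in distances for m in pair}
    (middle,) = markers - set(outer)
    return (outer[0], middle, outer[1])


# Hypothetical counts of recombinant offspring out of 1000 scored.
distances = {
    ("A", "B"): map_distance_cm(90, 1000),
    ("B", "C"): map_distance_cm(60, 1000),
    ("A", "C"): map_distance_cm(140, 1000),
}
print(order_markers(distances), distances)
```

In this made-up data the A-C distance is slightly less than the sum of A-B and B-C, which is what double crossovers between distant markers tend to produce.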

Cytogenetic map: chromosome bands; genes located by FISH
Linkage mapping: genetic markers
Physical mapping: overlapping fragments
DNA sequencing

A CLOSER LOOK AT THE HUMAN GENOME


Chromosome   No. of genes   Size (million bp)   Percentage determined
1            3000           240                 90%
2            2500           240                 95%
3            1900           200                 95%
4            1600           190                 95%
5            1700           180                 95%
6            1900           170                 95%
7            1800           150                 95%
8            1400           140                 95%
9            1400           130                 85%
10           1400           130                 95%
11           2000           130                 95%
12           1600           130                 95%
13           800            110                 80%
14           1200           100                 80%
15           1200           100                 80%
16           1300           90                  85%
17           1600           80                  95%
18           600            70                  95%
19           1700           60                  85%
20           900            60                  90%
21           400            40                  70%
22           800            40                  70%
X            1400           150                 95%
Y            200            50                  50%
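The per-chromosome figures above can be totalled as a quick sanity check against the genome-wide numbers quoted elsewhere in these slides (roughly 3 billion base pairs and 30,000 to 40,000 genes). The sketch below uses only the gene counts and sizes from the table; the gene-density calculation is an added illustration, not something the slides report.

```python
# (chromosome, number of genes, size in million bp), as listed in the table.
CHROMOSOMES = [
    ("1", 3000, 240), ("2", 2500, 240), ("3", 1900, 200), ("4", 1600, 190),
    ("5", 1700, 180), ("6", 1900, 170), ("7", 1800, 150), ("8", 1400, 140),
    ("9", 1400, 130), ("10", 1400, 130), ("11", 2000, 130), ("12", 1600, 130),
    ("13", 800, 110), ("14", 1200, 100), ("15", 1200, 100), ("16", 1300, 90),
    ("17", 1600, 80), ("18", 600, 70), ("19", 1700, 60), ("20", 900, 60),
    ("21", 400, 40), ("22", 800, 40), ("X", 1400, 150), ("Y", 200, 50),
]

total_genes = sum(genes for _, genes, _ in CHROMOSOMES)
total_size_mbp = sum(size for _, _, size in CHROMOSOMES)

# Genes per million bp for each chromosome.
density = {name: genes / size for name, genes, size in CHROMOSOMES}
most_dense = max(density, key=density.get)
least_dense = min(density, key=density.get)

print(f"Total genes: {total_genes}")
print(f"Total size: {total_size_mbp} million bp")
print(f"Most gene-dense: chromosome {most_dense}")
print(f"Least gene-dense: chromosome {least_dense}")
```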

1. Cut the DNA into overlapping fragments short enough for sequencing.
2. Clone the fragments in plasmid or phage vectors.
3. Sequence each fragment.
4. Order the sequences into one overall sequence with computer software.
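The final step, ordering overlapping fragments by computer, can be illustrated with a toy greedy assembler: repeatedly find the two fragments with the longest suffix-to-prefix overlap and merge them. This is a simplified sketch under ideal assumptions (no sequencing errors, no repeats, all fragments from the same strand); it is not the software used by the project, and the fragments are invented.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise, always taking the longest overlap first."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left that overlaps; return separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


# Hypothetical overlapping reads taken from one short sequence.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))
```

Real genomes contain long repeats, which is exactly where this greedy strategy breaks down and why the maps built in the earlier stages matter.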

A complete haploid set of human chromosomes consists of 3.2 billion base pairs

Some Techniques Used in the Genome Project


Restriction Fragment Length Polymorphisms (RFLPs): A restriction enzyme is specific to a certain base sequence and will cut DNA at all such sites to produce a number of "restriction fragments" ... reveal a unique pattern ("fingerprint").
Automated DNA Sequencing: The technique makes use of at least four different fluorescent dyes, one specific to each of adenine, thymine, guanine, and cytosine. Restriction fragments are tagged with dye.
Polymerase Chain Reaction (PCR)

Restriction Fragment Length Polymorphism (RFLP)


RFLPs are polymorphic sequence variations that result in DNA fragments of different sizes following restriction digestion.

[Figure: comparison of Person A and Person B]
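A small sketch of why a sequence variation changes fragment sizes: if one person carries a single-base change inside a recognition site, the enzyme no longer cuts there, and the digest yields fewer, longer fragments. The example uses the EcoRI recognition sequence GAATTC (cut after the first G); the two sequences are invented for illustration.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut seq at every occurrence of site and return the fragment lengths.

    cut_offset is the position of the cut within the site
    (EcoRI cuts between G and AATTC, so the offset is 1).
    """
    lengths = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        lengths.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    lengths.append(len(seq) - start)
    return lengths


# Hypothetical sequences: Person B has one base changed in the second site.
person_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
person_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"

print("Person A fragments:", digest(person_a))
print("Person B fragments:", digest(person_b))
```

On a gel, the two digests would show different band patterns, which is the polymorphism being detected.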

Variable Number of Tandem Repeat (VNTR)


VNTRs are polymorphic DNA sequences made up of tandemly repeated units whose number varies between individuals. Each repeat unit contains ~11 to 60 bp.

[Figure: comparison of Person A, Person B, and Person C]

Short Tandem Repeat (STR)


STRs are polymorphisms based on differences in the lengths of DNA tracts composed of tandemly repeated di-, tri-, or tetranucleotides, usually repeated ~5 to 30 times. The most commonly encountered STR is the dinucleotide repeat CA. STRs occur frequently in the human genome; CA repeats are estimated to occur every 30 to 60 kb.

[Figure: comparison of Person A, Person B, and Person C]
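Both VNTRs and STRs come down to counting how many times a unit is repeated in tandem. A minimal sketch for a CA repeat follows; the flanking sequences and repeat counts are hypothetical, and the same function works for longer units by changing the `unit` argument.

```python
import re


def longest_repeat_run(seq, unit="CA"):
    """Return the largest number of back-to-back copies of unit in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles: same flanking sequence, different repeat counts.
person_a = "GGT" + "CA" * 7 + "TTG"
person_b = "GGT" + "CA" * 12 + "TTG"

count_a = longest_repeat_run(person_a)
count_b = longest_repeat_run(person_b)
print(count_a, count_b)
print("Length difference in bp:", len(person_b) - len(person_a))
```

The length difference between alleles is what makes the polymorphism visible when the region is amplified and sized.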

Single-nucleotide Polymorphism (SNP)


SNPs are polymorphic sequences that differ by a single base at a particular site. In the human genome, SNPs occur relatively frequently, roughly every 500 to 1000 bp, and are distributed in a relatively uniform fashion.

[Figure: comparison of Person A and Person B]
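Finding SNPs between two already-aligned sequences amounts to comparing them position by position. The sketch below assumes equal-length, pre-aligned sequences (no insertions or deletions); the two short sequences are made up.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_in_a, base_in_b) for every mismatch.

    Positions are 0-based. Both sequences must already be aligned
    and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Hypothetical aligned sequences from two people.
person_a = "ATGCCGTA"
person_b = "ATGCTGTA"
print(find_snps(person_a, person_b))
```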

Different methods and Uses


Method                                        Usage
DNA sequencing                                Physical map of DNA can be identified with the highest resolution.
Use of probes                                 To identify RFLPs and SNPs.
Radiation hybrid mapping                      To fragment the genome into large pieces and locate markers and genes.
Fluorescence in situ hybridization (FISH)     To localize a gene on chromosomes.
Sequence-tagged site (STS) mapping            Applicable to any part of a DNA sequence.
Expressed sequence tag (EST) mapping          A variety of STS mapping in which expressed genes are located.
PCR                                           To amplify gene fragments.

About the human genome


Only 1.5% codes for proteins, rRNA, and tRNA.

The rest:
Regulatory sequences and introns: 24%
Pseudogenes (nonfunctioning genes): 15%
Repetitive DNA: 59%

Repetitive DNA
44% transposable elements (jumping genes)
Transposons: cut and paste
Retrotransposons (most of these): copied to RNA, reverse-transcribed (RT) to DNA, and pasted (e.g., LINE-1, or L1)

15% large-segment and simple sequence DNA; the small ones are STRs (short tandem repeats), often found in centromeres and telomeres

Repetitive DNA that includes transposable elements and related sequences (44%), of which:
  L1 sequences (17%)
  Alu elements (10%)
Repetitive DNA unrelated to transposable elements (15%)
Introns and regulatory sequences (24%)
Unique noncoding DNA (15%)
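To put the percentages in absolute terms, the sketch below converts each category into base pairs, assuming the 3.2 billion bp haploid genome size and the 1.5% coding fraction quoted earlier in these slides. The category labels are shortened for readability; the listed figures add up to 99.5%, presumably because of rounding.

```python
GENOME_SIZE_BP = 3_200_000_000  # haploid genome size quoted in the slides

# Percentages as given in the slides.
COMPOSITION = {
    "coding (proteins, rRNA, tRNA)": 1.5,
    "introns and regulatory sequences": 24,
    "unique noncoding DNA": 15,
    "repetitive DNA, transposable elements and related": 44,
    "repetitive DNA unrelated to transposable elements": 15,
}

total_percent = sum(COMPOSITION.values())
base_pairs = {
    name: round(GENOME_SIZE_BP * pct / 100) for name, pct in COMPOSITION.items()
}

for name, bp in base_pairs.items():
    print(f"{name}: {bp / 1e6:.0f} million bp")
print(f"Total accounted for: {total_percent}%")
```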

By the Numbers
The human genome contains 3 billion chemical nucleotide bases (A, C, T, and G). The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene being dystrophin at 2.4 million bases.

The total number of genes is estimated at around 30,000, much lower than previous estimates of 80,000 to 140,000.
Almost all (99.9%) nucleotide bases are exactly the same in all people. The functions are unknown for over 50% of discovered genes.
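A little arithmetic shows what these numbers imply. Using only the figures quoted above (3 billion bases, 99.9% identity, about 30,000 genes averaging 3000 bases), the sketch estimates how many bases differ between two people and what share of the genome the genes occupy. The results are rough, back-of-the-envelope values.

```python
GENOME_BASES = 3_000_000_000
GENE_COUNT = 30_000
AVERAGE_GENE_BASES = 3_000

# 99.9% identical means 0.1% (1 in 1000) of bases differ.
differing_bases = GENOME_BASES // 1000

# Total bases covered by genes, using the average gene size.
gene_bases = GENE_COUNT * AVERAGE_GENE_BASES
gene_percent = 100 * gene_bases / GENOME_BASES

print(f"Bases that differ between two people: about {differing_bases:,}")
print(f"Bases within genes: about {gene_bases:,} ({gene_percent:.0f}% of the genome)")
```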

How It's Arranged


The human genome's gene-dense "urban centers" are predominantly composed of the DNA building blocks G and C. In contrast, the gene-poor "deserts" are rich in the DNA building blocks A and T. GC- and AT-rich regions usually can be seen through a microscope as light and dark bands on chromosomes. Genes appear to be concentrated in random areas along the genome, with vast expanses of noncoding DNA between. Stretches of up to 30,000 C and G bases repeating over and over often occur adjacent to gene-rich areas, forming a barrier between the genes and the "junk DNA." These CpG islands are believed to help regulate gene activity. Chromosome 1 has the most genes (2968), and the Y chromosome has the fewest (231).
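The contrast between GC-rich and AT-rich regions can be measured by computing GC content in fixed-size windows along a sequence. This is a minimal sketch; the window size and the toy sequence are arbitrary choices for illustration.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window):
    """GC content of consecutive, non-overlapping windows of a given size."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
toy_sequence = "GCGCGCGC" + "ATATATAT"
print(windowed_gc(toy_sequence, 8))
```

Plotting such values along a chromosome is one way to visualize the gene-dense, GC-rich regions against the AT-rich "deserts".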

Less than 2% of the genome codes for proteins. Repeated sequences that do not code for proteins ("junk DNA") make up at least 50% of the human genome. Repetitive sequences are thought to have no direct functions, but they shed light on chromosome structure and dynamics. Over time, these repeats reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling existing genes. The human genome has a much greater portion (50%) of repeat sequences than the mustard weed (11%), the worm (7%), and the fly (3%).

THANKS
By IFFAT FATIMA; Roll No: 08, M.Phil. Biochemistry
