The document discusses challenges in analyzing the genetic information contained within soil microbiomes using metagenomics. It describes a project called METACONTROL that examined disease-suppressive soils. Key challenges include obtaining high-quality soil DNA, generating large metagenomic libraries, and developing efficient screening methods due to the complexity of soil microbial communities. The project provided insights into issues that have hindered metagenomic exploration of soils.
the METACONTROL project

Jan Dirk van Elsas 1, Rodrigo Costa 1, Janet Jansson 2,3, Sara Sjöling 4, Mark Bailey 5, Renaud Nalin 6, Timothy M. Vogel 7 and Leo van Overbeek 8

1 Department of Microbial Ecology, Centre for Ecological and Evolutionary Studies, University of Groningen, Kerklaan 30, 9750AA Haren, The Netherlands
2 Department of Microbiology, Swedish University of Agricultural Sciences, Genetics Center, 750 07 Uppsala, Sweden
3 Ecology Department, Earth Sciences Division, Lawrence Berkeley National Laboratory, MS 70A-3317, One Cyclotron Road, Berkeley, CA 94720, USA
4 School of Life Sciences, Södertörn University College, 141 89 Huddinge, Sweden
5 Molecular Microbial Ecology Group, Centre for Ecology & Hydrology, Mansfield Road, Oxford OX1 3SR, UK
6 LibraGen SA, 3 Rue des Satellites 31400, Toulouse, France
7 Environmental Microbial Genomics, Laboratoire AMPERE, Ecole Centrale de Lyon, Université de Lyon, 36 Avenue Guy de Collonge, 69134 Ecully, France
8 Plant Research International, PO Box 16, Wageningen, The Netherlands

Soil teems with microbial genetic information that can be exploited for biotechnological innovation. Because only a fraction of the soil microbiota is cultivable, our ability to unlock this genetic complement has been hampered. Recently developed molecular tools, which make it possible to utilize genomic DNA from soil, can bypass cultivation and provide information on the collective soil metagenome with the aim to explore genes that encode functions of key interest to biotechnology. The metagenome of disease-suppressive soils is of particular interest given the expected prevalence of antibiotic biosynthetic clusters. However, owing to the complexity of soil microbial communities, deciphering this key genetic information is challenging.
Here, we examine crucial issues and challenges that so far have hindered the metagenomic exploration of soil by drawing on experience from a trans-European project on disease-suppressive soils denoted METACONTROL.

Corresponding author: van Elsas, J.D. (j.d.van.elsas@rug.nl).
Review. Trends in Biotechnology Vol. 26 No. 11. 0167-7799/$, see front matter. 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibtech.2008.07.004. Available online 4 September 2008.

Glossary

Bacterial artificial chromosome (BAC): an artificially constructed vector for medium-sized segments of DNA (up to 300 kb in length), which are then incorporated into a host cell (usually Escherichia coli). BACs serve as cloning vectors in metagenomics.
Cloning vector: a small DNA vehicle that can accommodate a foreign (cloned) DNA fragment. Plasmids, cosmids, fosmids and BACs are examples of cloning vectors.
Cosmid: hybrid plasmid constructed by the insertion of cos sequences, which are DNA sequences of the bacteriophage Lambda.
Diversity: statistically, it describes the richness as well as distribution of relative abundance of a species compared to the different species found in a sample. Microbial diversity in soil is likely to be the highest found in any ecosystem on Earth.
Fluorescence-activated cell sorting (FACS): sorting of cells that have been tagged with a fluorescent dye in a flow cytometer.
Fosmid: a hybrid vector consisting of an F-factor cosmid (circular DNA) that is capable of containing larger pieces of DNA, i.e. up to 50 kb (average 35 kb) compared to only 10 kb in a plasmid.
Host: in cloning procedures, hosts are organisms that serve as the recipient of the cloning vectors that each carry a unique copy of foreign DNA directly extracted from the environment or from another organism. The most common metagenomic library host is the bacterium E. coli.
Metagenomics: the study of the collective genomes recovered from environmental samples without prior cultivation. It enables the investigation of genome information on organisms that are not easily cultured in the laboratory. It is therefore a means of systematically investigating, classifying and manipulating the entire genetic material isolated from environmental samples.
Plasmids: independent, circular and self-replicating DNA molecules that carry only a few genes. Plasmids are autonomous molecules and exist in cells as extrachromosomal genomes, although some plasmids can be inserted into a bacterial chromosome.
Pyrosequencing: a novel high-throughput nucleotide sequencing method [23] that is based on multiple parallel extensions from target DNA molecules coupled to on-line sensitive reads.
Shotgun sequencing: technique in which DNA is broken up randomly into small segments, which are then sequenced using the chain termination or pyrosequencing method to obtain reads. Multiple overlapping reads for the target DNA are obtained and overlapping ends of different reads are assembled by software packages into contiguous sequences denoted contigs.
Stable isotope probing (SIP): the use of stable isotopes, such as ¹³C, as markers to identify organisms that are actively involved in transforming the ¹³C-labelled material (substrate).
Substrate-induced gene expression screening (SIGEX): screening strategy developed for metagenomic libraries in which the induction of gene expression is used as the criterion for rapid identification and isolation of clones.
Suppressive soil: soil that is able to suppress the development of plant diseases by pathogens that are present.

Figure 1. Metagenomic exploration of Wildekamp (NL) soil (W; [6,26]). In total, about 16 000 clones were functionally screened for growth inhibition of Rhizoctonia solani AG3 and Bacillus subtilis. The clones were also genetically screened for homology with genes involved in polyketide biosynthesis (PKS1 type). (a) Outline of metagenomics approach [26]. (b) Functional and molecular screening of (W) soil metagenomic library. Functional screening was performed using arrayed Escherichia coli clones on plates that were overgrown by the target organism R. solani AG3. Inhibition of fungal growth in the vicinity of the E. coli colonies, e.g. slipstream inhibition as indicated in the figure, was recorded. Genetic screening was performed using an array of E. coli clones, in which duplicated clones overlapped to enable a rapid identification of hybridization-positive clones. Shown here is hybridization with the PKS1 probe [26].

Introduction

Soil is the habitat on Earth that harbours the largest microbial diversity (see Glossary) per unit mass or volume [1,2]. Because microorganisms represent the least explored and richest source of novel bioactive compounds [3], soil offers promising perspectives for the search of novel functions of biotechnological interest. Traditional microbiological approaches already revealed that soils commonly harbour a broad array of antibiosis-related functions [4–6], some of which have been associated with the suppression of plant pathogens [6,7]. In addition, some soil microorganisms are naturally resistant to a broad range of antibiotics [8] and the collective antibiotic resistance genes present in soil have been referred to as the antibiotic resistome [9]. The soil microbiota also contains a large pool of genes that encode enzymes involved in either biosynthetic or biodegradation processes, including the degradation of xenobiotics [10,11].

Only a minor fraction of the soil microbiota can be cultivated using existing cultivation approaches, although this number is constantly increasing owing to ongoing development of techniques to support the growth of recalcitrant species [12].
Nevertheless, soil microbiologists have increasingly relied on molecular approaches that derive their strength from the direct exploration of soil-extracted DNA [1,13,14] using a soil metagenomics approach (Figure 1a).

Soil metagenomics and its challenges

Soil metagenomics is defined as the study and exploration of the collective genomes present in a particular soil sample [15]. To enable such exploration, construction of soil DNA-based metagenomic libraries in a cloning vector (see Glossary) is often used. When applied to bulk soil, the approach mainly addresses genes of the microbial community, i.e. the soil microbiome. Although metagenomics already has been successful in unlocking several novel genes and functions from the soil microbiome, the underlying work can be tedious and seemingly inefficient [16–22]. This is a result of technical constraints in: (i) producing adequate amounts of high-quality soil DNA, which is dependent on soil characteristics; (ii) generating metagenomic libraries of sufficient size in suitable metagenomic hosts; and (iii) screening procedures, which are not always efficient.

Two different metagenomic approaches to soil are possible, i.e. either unselective or targeted metagenomics. Unselective soil metagenomics constitutes a gene fishing expedition, because no a priori selection (e.g. via prior growth-based selection of the communities under study, or following PCR amplification of a given gene or DNA region of interest) takes place before the metagenome is obtained and analysed. Given the typical distributions of microbial species in most soils (Box 1), hit rates of target genes are affected and can be low if targets occur in non-dominant species. In this context, direct shotgun sequencing of the soil DNA pool is becoming increasingly popular [23]. This approach bypasses the cloning step and instead relies on random high-throughput sequencing of soil DNA, or a target fraction thereof (Box 2).
By contrast, in a targeted soil metagenomic approach, the isolated pool of DNA is deliberately biased to enhance hit rates, for instance through (i) pretreatment of the microbial community, (ii) prefractionation of the DNA on the basis of G+C% [24], or (iii) the amplification of target regions by PCR before metagenomic analysis. Pretreatment can consist of, for example, a pre-enrichment of target microbial groups by adding particular growth substrates, such as chitin [25], or a selection of particular cells by cell-sorting procedures [26].

Box 1. Soil microbial diversity and library coverage

As most soils support an extensive microbial diversity with potentially thousands of bacterial and archaeal species [1,52], sampling the entire diversity in a soil sample would require an enormous effort, which currently seems impracticable. Two key parameters are crucial for the distribution of microbial species: the richness (i.e. the number of species) and their evenness (their relative abundance). Often, the microbial species distributions in soil are characterized by the strong dominance of relatively few species (tens to hundreds) next to the presence of large numbers of rare species, which form a characteristic tail in rank-abundance curves [2]. This typical species distribution in soil makes the full recovery of the extant soil microbiota difficult, although it is theoretically possible. Therefore, most soil metagenomic libraries will mainly reflect the genetic make-up of the most abundant species in the soil. However, in the light of practical limits to library sizes, these libraries might still contain only fragmented genetic information for even this pool of species.

Any successful soil metagenomics undertaking will thus have to be guided by prior knowledge of the prevalence and distribution of target genes within the microbiota in the soil(s) to be examined. Ideally, gathering such information should precede construction and screening of the library. Alternatively, prior information might point the way towards the introduction of a deliberate bias in the experimental approach, such as favouring target organisms by substrate inclusions. In the METACONTROL project, we assumed a high occurrence of polyketide biosynthetic operons in disease-suppressive soils based on assumptions about the role of antagonism in suppression, which is supported by the known high prevalence of the pyrrolnitrin biosynthetic operon [6]. The achievable coverage of metagenomic libraries, and the success in discovering novel biological functions, will further depend on the choice of the vector and host strain in which the libraries are maintained.

Box 2. Environmental shotgun sequencing: harnessing a novel powerful tool

Environmental shotgun sequencing (ESS), the high-throughput shotgun sequencing of environmental DNA, is a promising approach to characterize microbial genomes present in an environmental sample. Despite its disadvantage of being highly resource demanding, it enables acquisition of an unprecedented wealth of information on sequences and genomes in natural ecosystems [54]. Recently developed high-throughput sequencing technologies, such as the (second generation) genome sequencer GS-FLX, which uses pyrosequencing (454 Life Sciences, Roche), as well as the Solexa sequencing platform (Solexa Ltd, Cambridge, UK), now allow for a rapid generation of large amounts of sequence information. For instance, at greatly improved accuracy and throughput, up to 100 Mb of sequence information can be generated in less than 8 h running time (GS-FLX system). The procedure is highly unselective and the high amount of sequences of common housekeeping genes that are generated could overload databases. The outstanding challenge associated with most ESS projects is not the high-throughput sequencing per se, but instead the assembly and sorting of the sequences and the annotation of eventual contigs. For this purpose, new bioinformatics tools for the analysis of ESS databases are constantly being developed and refined [55–60].

As a result of its untargeted nature, ESS enables a comprehensive analysis of the genomic composition of an environmental sample, in that the genes of the most dominant organisms are represented to a greater extent than those of rare organisms. Moreover, ESS can aid genome reconstruction. This potential has been demonstrated in the assembly of one nearly complete and three partially assembled microbial genomes from a simple community found in an acid mine drainage [61]. However, genomic reconstructions from most soil samples will require enormous sequencing power as well as increased bioinformatic input [55–60], which raises serious issues of time/money investment [52]. The collection of large amounts of sequence data of soils might nevertheless be useful if comparative studies on the prevalence of proteins are performed. An alternative approach, ESS of the soil metamobilome, the collective genetic information encoded in plasmid or phage genomes, might help in facilitating biotechnological applications of soil metagenomics, because the metamobilome is expected to contain a large number of relevant traits [62]. In addition, the microbial consortia associated with microfauna could also be analysed and explored with ESS [63].

Metagenomic libraries

The metagenomic libraries will thus be derived either from unbiased DNA in the unselective approach or from a specific fraction of the microbiota or the DNA pool in the targeted method. Irrespective of the approach, the DNA library should ideally encompass all genes of interest that are targeted in the soil microbial community.
However, such coverage is seldom achieved owing to the difficulty of obtaining a representative sample of the soil microbiota, which would also include rarer microbial species [14] (Box 1). Various approaches, based on functional or sequence analysis, are commonly used to analyse the obtained DNA pool and their characteristics are discussed below.

Library screening

A great challenge in soil metagenomics is the screening of libraries to locate genes of interest. Two approaches are possible: functional and molecular screenings [14] (Figure 1b). Functional screening depends on the successful expression of target gene(s) in the metagenomic host, whereas molecular screening is based on the detection of regions in the DNA that are recognizable via hybridization or PCR approaches [14]. An overview of studies that successfully screened DNA libraries [27–33] can be found in Table 1. Because functional screens of soil-generated libraries can be associated with low hit rates, depending on the abundance and distribution of target genes in the soil microbiota, molecular screens, in particular those that can be performed in a high-throughput setup, have become more widely employed [26,34]. Recently, DNA microarray-based screening (termed metagenomic profiling) has developed into a highly promising novel approach because it enables the efficient screening of a library using (library-derived) microarrays based on functional or other genes [35]. Another advanced screening method, substrate-induced gene expression screening (SIGEX), exploits the induction of gene expression [36]. In this method, a reporter gene (e.g. the gene encoding green fluorescent protein, GFP) enables the sorting of clones in which the desired gene has been induced, via fluorescence-activated cell sorting (FACS) of GFP-positive cells. In this way, the identification of clones that carry a desired metabolic activity, for example a particular step in a biodegradative pathway, can be facilitated [36].
In one study, multiple clones induced by benzoate and naphthalene could be isolated from a groundwater metagenomic library consisting of over 150 000 clones [36]. Finally, strategies that are based on growth selection can offer excellent opportunities to enhance screening efficiencies.

In the following sections, we examine the power and pitfalls of the metagenomics-based exploration of agricultural soils suppressive to phytopathogens using our experience from the European-Union-sponsored project 'Soil metagenomics to identify novel mechanisms of antagonism and antifungal activity for the improved control of phytopathogens' (METACONTROL; QLK3-CT-2002-02068) [5,14,25,26,34,37–40]. In particular, we address the technical and intellectual challenges that need to be overcome to optimize the potential profit from this novel approach.

The METACONTROL project: aims, results and technological challenges

Aims

It has long been known that particular agricultural soils, known as suppressive soils, can restrict the activity of plant-pathogenic microorganisms. The suppression is often biotic and key mechanisms include the production of antibiotics. A great part of this antagonistic activity might reside in the as-yet-uncultured fraction of the soil microbiota [7,26] and hence unlocking this antagonistic potential via a metagenomics approach is important. The METACONTROL project, which started in 2002, is unprecedented in its kind. It united seven European laboratories [5,14,25,26,34,37–40], mainly from academia but also including the small enterprise LibraGen (Toulouse, France), with the main aim of exploring the biotechnological potential of phytopathogen-suppressive soils. The underlying assumption was that the microbiota of these suppressive soils would serve as rich reservoirs of anti-phytopathogen loci, such as those involved in the production of antibiotics of the polyketide class and chitinase biosynthesis.
The metagenomic exploitation of the aforementioned traits was thus the principal focus of the METACONTROL endeavour. During the course of the project, four suppressive soils and one control soil were identified (indicated by capital letter with the suppressed species) (Table 2): one in the Netherlands (W, Rhizoctonia solani AG3), one in Sweden (U, Plasmodiophora brassicae), one in France (C, Fusarium) and one in the UK (Wy, Fusarium). The control soil (France) was denoted M. Metagenomic libraries were constructed for these soils and screened for the occurrence of antibiotic functions [5,14,25,26,34,37–39]. In addition, the project partners developed a range of methodologies that facilitated the exploration of the suppressive soil libraries [21,25,34,37]. In the following, we will discuss the technology and the relevant crucial choices that had to be made before each analytical step (Figure 1) in respect of: (i) the optimal soil DNA extraction methodology, (ii) the possibility to bias the soil community or DNA, (iii) the suitability of the metagenomic vector/host system for the target objectives, (iv) the optimal procedure for screening, and (v) the final analysis.

Table 1. Selected soil metagenomics studies and their relevance for biotechnology and ecology

Soil type/condition/location: Agricultural soil, Madison, US
Host/vector system: Escherichia coli/BAC
Screening type/target: Genetic/16S rRNA genes; functional/various traits
Finding(s): Retrieval of clones displaying heterologous antibacterial, lipase, amylase, nuclease and hemolytic activities
Features: Analysis of 16S rRNA gene diversity revealed representatives of low GC Gram-positive bacteria, Acidobacterium, Cytophagales, and Proteobacteria in one of the BAC libraries
Relevance: One of the inaugural studies on soil metagenomics, it highlighted the potential of the approach in shedding light on the functional capabilities of the soil biota
Refs: [27]

Soil type/condition/location: Not indicated
Host/vector system: E. coli/cosmid
Screening type/target: Functional/antibacterial activity
Finding(s): A clone (CSL12) was found that produces a series of long-chain N-acyl-L-tyrosine antibiotics
Features: About 700 000 cosmid clones were screened, 65 of which were antagonistic to Bacillus subtilis, the bacterial strain used in the assays
Relevance: First report of a long-chain N-acyl amino acid biosynthesis gene
Refs: [28]

Soil type/condition/location: Uncultivated soil, New England, US
Host/vector system: E. coli/BAC
Screening type/target: Functional/antibacterial activity
Finding(s): The 27 kb clone mg 1.1 was found to express many small molecules related to and including indirubin
Features: Retrieval of one anti-bacterial clone per 60 Mb of soil-derived DNA. Library composed of DNA fragments of 5–120 kb
Relevance: Indirubin is an antileukaemic drug
Refs: [29]

Soil type/condition/location: Soil type not indicated, Ithaca, US
Host/vector system: E. coli/cosmid
Screening type/target: Functional/natural products
Finding(s): Discovery of a 6.7 kb violacein gene cluster, which is different from previously reported ones, present in a blue-coloured clone
Features: Identification of coloured clones in the cosmid library was used as a screen criterion to search for natural product-producing clones
Relevance: Violacein is active against Gram-positive bacteria and induces apoptosis in fibroblast cells
Refs: [30]

Soil type/condition/location: Agricultural soil, Madison, US
Host/vector system: E. coli/BAC
Screening type/target: Functional/colour production
Finding(s): Isolation and characterization of the triaryl cation antibiotics turbomycin A and B
Features: An example of heterologous gene expression of previously uncharacterized organic molecules
Relevance: Both turbomycin A and B display broad-spectrum activity against Gram-negative and Gram-positive bacteria
Refs: [31]

Soil type/condition/location: Arable soil, La Côte-Saint-André, France
Host/vector system: E. coli and Streptomyces lividans/cosmid shuttle vector
Screening type/target: Genetic/type I polyketide synthase (PKS I) and 16S rRNA genes
Finding(s): Discovery of new PKS genes in at least eight clones
Features: An analysis of 16S rRNA gene-based bacterial diversity present in the E. coli cosmid library was carried out
Relevance: 16S rRNA gene sequences analysed derived primarily from organisms not described at that time. A range of naturally active compounds was shown to be heterologously expressed
Refs: [39]

Soil type/condition/location: Uncultivated soil, New England, US
Host/vector system: E. coli, S. lividans and Pseudomonas putida/BAC shuttle vectors
Screening type/target: Functional/antibacterial and antifungal activity
Finding(s): The three hosts displayed different heterologous gene expression capabilities for known antibiotic clusters
Features: A method is described that enables the transfer of a metagenomic library from E. coli to S. lividans and P. putida
Relevance: Highlights the importance of increasing the host range to enhance the success of metagenomics endeavours
Refs: [53]

Soil type/condition/location: Uncultivated soil, Madison, US
Host/vector system: E. coli/BAC
Screening type/target: Functional/antibiotic resistance
Finding(s): Nine clones expressing resistance to aminoglycoside antibiotics and one expressing tetracycline resistance were found
Features: Heterologous gene expression is required for the detection of aminoglycoside-resistant clones
Relevance: Suggests that the diversity of antibiotic resistance genes within soil bacteria is greater than previously estimated
Refs: [32]

Soil type/condition/location: Forest rhizosphere soils
Host/vector system: E. coli/fosmid
Screening type/target: Functional/fungal antagonism
Finding(s): Identification of eight open reading frames (ORFs) conferring antifungal activity to the E. coli host. These ORFs were found to be similar to genes encoding type II family polyketide synthases
Features: One positive antifungal hit in a total of 113 700 tested fosmid clones. Use of the yeast Saccharomyces cerevisiae as model fungus to test for antagonism
Relevance: Perhaps the first study to target antifungal activity in the metagenome of rhizosphere soils (pine tree rhizospheres)
Refs: [42]

Soil type/condition/location: Pasture soil, Montrond, France
Host/vector system: E. coli/fosmid
Screening type/target: Functional/production and degradation of N-acylhomoserine lactones (NAHL)
Finding(s): Identification of a gene, qlcA, encoding a NAHL-degrading lactonase. The fosmid insert p2H8, containing the qlcA gene, was found to harbour nine ORFs resembling genes of Acidobacteria
Features: The qlcA gene was found to effectively quench quorum-sensing mediated pathogenic functions when expressed in Pectobacterium carotovorum
Relevance: Knowledge of the diversity of quorum-quenching lactonase extended
Refs: [33]

Results

Soil DNA extraction and preparation

For a robust library preparation, soil DNA that accurately represents the diversity of the soil microbiota is required in a quantity that enables efficient cloning [26,37]. In addition, the DNA itself needs to be of sufficient quality with regard to chemical purity, lack of degradation and length of DNA fragments to be suitable for cloning into a vector [37]. A required minimal size of roughly 40–100 kb will increase the chance that the entire target pathways, e.g. those involved in the biosynthesis of polyketide antibiotics, remain intact [34,40]. The METACONTROL consortium collectively tested and applied advanced methodology that enabled them to obtain pure high molecular weight (HMW) DNA from all the soils under study [14,26,34,37]. The most efficient approach was found to consist of a physical extraction of microbial cells from soil followed by gentle DNA extraction and purification using pulsed-field gel electrophoresis (PFGE) (Figure 1) [26,37]. Cushion (Percoll and/or Nycodenz) pre-separation of bacterial cells from soil [26,37] was found to be optimal for subsequent isolation of the HMW DNA. Moreover, the microbial growth status in the soil was an important determinant of soil DNA quality, and the latter could be boosted by incubation with growth substrates [37]. Typically, this approach produced HMW DNA, often with size >60–100 kb [26,37]. We also found that high amounts of cells, i.e. minimally ca. 10¹¹, were needed to yield sufficient DNA for efficient library construction [25]. Because soils typically contain in the order of 10⁸ to 10¹⁰ cells per g, this finding helped set the rule for the efficient construction of the soil metagenomic libraries.
Deliberate bias and preselection

In the METACONTROL project, mainly unselective approaches were used to explore the four selected naturally disease-suppressive soils for their anti-phytopathogen functions [14,25,26,34,37]. In doing so, we disregarded the possibility of positive growth selection because the chosen target genes/operons do not a priori imply a growth advantage in the metagenomics host, even in the presence of the target phytopathogens [26,34]. However, some targeted approaches were developed to assess specific members of the microbial community; for example, the soil was pre-treated with chitin to enhance the levels of chitinase producers [21,25]. A strong focus was also placed on the pre-selection of metabolically active bacterial cell fractions with the help of FACS analysis. It was hoped that this introduced bias would increase the prevalence of organisms carrying the aforementioned target genes, in light of their presumed function in the suppressive soil [25]. Metabolically active cells could indeed be successfully sorted from the soil [26,41]. However, as a result of the limited flow rate of the cell sorters used, throughput was limited (10⁶ cells/hour), which resulted in insufficient biomass for library construction [26]. To meet the high sorting demands to establish a library, more sophisticated cell sorters with higher flow rates are needed.

Metagenomic library production

The clone libraries obtained from METACONTROL for the four disease-suppressive soils and the control soil consisted of 6000 to 60 000 clones and were constructed in Escherichia coli (Table 2). Both large insert size vectors, such as bacterial artificial chromosomes (BACs), which enable the cloning of inserts up to 200 kb, and fosmids, which enable insertion of 35–45 kb fragments, were used. BAC vectors allow for the coverage of complex large operons and also facilitate the analysis of a gene/operon within its original genomic context.
By contrast, fosmids are only able to accommodate smaller inserts and thereby only enable the cloning of smaller operons. Using a fosmid vector system, such as the Epicentre ccFos system, allows for the positive selection of vectors that have acquired inserts [21,26,34,37,38]. Three of the final libraries (Table 2) were based on fosmid vectors, the reason being the ease of obtaining appropriately sized libraries within reasonable time. One library, for the M soil, was successfully constructed in a BAC vector [34,39]. The applied BAC vector also contained a replicon compatible with a Streptomyces host, enabling shuttling between E. coli and Streptomyces metagenomic hosts and consequently enhancing the probability of heterologous gene expression within the clones obtained in this library [34,39].

Table 2. Soil metagenomic libraries constructed under the METACONTROL project [26,34,39] and their characteristics (a)

Soil W
  Description: Wildekamp grassland, suppressive to Rhizoctonia solani AG3
  Library vector/clone number: Fosmid/16 000
  Type of screening: Functional: antagonism against R. solani AG3 and Bacillus subtilis. Genetic: screening with soil-generated PKS1 probe
  Number of positive clones: 7
  Remarks and Refs: Combined functional and genetic (PKS1) screening resulted in seven positive clones. Five of those were confirmed as polyketide synthase PKS1-positive clones. Three clones were completely sequenced, and one insert showed high similarity with sequences from Acidobacterium sp. [26]

Soil Wy
  Description: Wytham grassland and Fusarium-suppressive agricultural soil
  Library vector/clone number: Fosmid/100 000. Average insert size 35.6 kb
  Type of screening: Functional: antagonism against Fusarium sp. indicator strains (agar plate-based dual-culture assay)
  Number of positive clones: 13 (grassland); 2 (agricultural soil)
  Remarks and Refs: Grassland most effective source of functional clones and greatest diversity. Agricultural bulk soil revealed low diversity and limited functional traits. End-sequencing and subcloning of cosmids resulted in mostly unidentified ORFs. Efficacy of clones lower than control strains isolated from same source. Each clone genetically distinct

Soil C
  Description: Chateaurenard Fusarium-suppressive soil
  Library vector/clone number: Fosmid/51 000 (genetic screening); BAC/60 000 (functional screening)
  Type of screening: Genetic screening; functional screening
  Number of positive clones: 22
  Remarks and Refs: Combination of functional and genetic screening. Functional screening used several targets: Fusarium spore generation and hyphal production, Aspergillus nidulans growth, Hebeloma cylindrosporum hyphal generation. Genetic screening included sequencing of PKS-positive clones [34,39]

Soil U
  Description: Uppsala Plasmodiophora brassicae-suppressive soil
  Library vector/clone number: Fosmid/8000
  Type of screening: Functional: antagonism against Pythium ultimum. Genetic: chitinase genes and 16S rRNA gene
  Number of positive clones: 4
  Remarks and Refs: Selection of Streptomyces mutomycini, Kitasatospora, Lentzea, Oerskovia revealed by fingerprinting. S. mutomycini and Streptomyces clavifer prevalent in the library. Chitinase genes found in soil, in the library and in isolates; a cluster that prevailed in soil was not found in the library, whereas a library cluster was not found in the soil

Soil M
  Description: Montrond (control) soil
  Library vector/clone number: Fosmid/60 000
  Type of screening: Genetic and functional screening
  Number of positive clones: 39
  Remarks and Refs: Thirty-nine novel PKS1-positive clones, most with supernatants showing antimicrobial activity, were found

(a) PKS1: polyketide synthesis operon for type-I polyketides.

Review Trends in Biotechnology Vol.26 No.11 596

Metagenomic library screening

Soil metagenomic libraries might contain only a few clones that actually carry a gene/operon of interest (Box 1), and the expression of those genes in the host strain might be constrained as well. In the METACONTROL project, both functional and molecular screens were used to uncover potential antagonistic functions with varying success rates [21,26,34,37,38], but, overall, low numbers of phytopathogen-suppressive clones were found in the four libraries [26,34,37] (Table 2).
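To make the scale of these numbers concrete, the following minimal Python sketch computes the fraction of positive clones per library from the clone counts and positive-clone counts listed in Table 2. It is an illustration only, not part of the METACONTROL analysis. Two simplifications are assumptions of this sketch: the grassland and agricultural positives for the Wy library are summed, and only the fosmid library is used for the C soil. The counts in Table 2 also mix functional and genetic screens, so the rates are rough indications.

```python
# Clone counts and positive-clone counts taken from Table 2.
# Assumptions of this sketch: Wy positives are summed (13 + 2), and
# for soil C only the fosmid library (51 000 clones) is considered.
LIBRARIES = {
    "W": (16_000, 7),
    "Wy": (100_000, 15),
    "C": (51_000, 22),
    "U": (8_000, 4),
    "M": (60_000, 39),
}


def hit_rate_percent(clones: int, positives: int) -> float:
    """Return the percentage of library clones scored as positive."""
    return 100.0 * positives / clones


rates = {soil: hit_rate_percent(c, p) for soil, (c, p) in LIBRARIES.items()}
mean_rate = sum(rates.values()) / len(rates)

for soil, rate in rates.items():
    print(f"{soil}: {rate:.3f}% positive clones")
print(f"mean over libraries: {mean_rate:.3f}%")
```

Under these assumptions every library stays at or below roughly 0.07% positives, and the unweighted mean over the five libraries falls below 0.05%.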
Functional screening

Functional screenings of the libraries were performed in so-called dual-culture assays, in which the target phytopathogenic organisms are allowed to grow out over the metagenomic library clones arrayed on Petri dishes and are scored for irregularities or inhibition of growth [26,34,39]. This experimental setup led to the detection of few positive clones (range 1–48), amounting to, on average, <0.05% of positives for all libraries (Table 2). Such low numbers can be attributed either to a rare occurrence of target genes/operons in the vectors used, or to the responsible cellular machineries being significantly larger than the vector inserts, as found for polyketide production loci. Other factors that might impede the detection of function of the target genes could be a low expression rate from the DNA inserts or unsuitable expression conditions on the plates used in these assays. For instance, an appropriate quorum-sensing system, a cellular communication mechanism commonly found in bacteria, might have been lacking in the E. coli host used. Indeed, shuttling from E. coli to Streptomyces as the metagenomic host facilitated the expression of an antibiotic (amphotericin) production locus [34,39], as indicated by a high activity against target fungi. Considering the potential expression bottlenecks, the generally low number of functionally positive clones did not come as a surprise. For instance, a screening of forest soil libraries for antifungal traits also yielded only one positive signal among the 113 700 fosmid clones [42] (Table 1). Hence, substantial methodological improvements are required to boost the percentages of positive hits, and we will present possible strategies below.

Molecular screening

The libraries obtained in the METACONTROL project were also screened using molecular tools, such as hybridization and PCR amplification methods (Table 2).
It was hypothesized that the success in detecting novel operons, such as those involved in polyketide biosynthesis, would depend on the application of deliberate degeneracy in the probes and primers used [34,39]. The rationale is that, in this way, the molecular screening is not restricted to previously known exact sequence information and will enable the broadest possible range of meaningful positive hits within the metagenomic library. The approach facilitated the identification of target genes that are sufficiently similar to query sequences, and thus constituted a rapid crude search step before a more resource-demanding and thorough investigation. Using the total soil community DNA as the target, we PCR-amplified the genes of the polyketide biosynthesis operon (PKS1) with degenerate primers. The resulting amplicons were used for the generation of probes that would thus hybridize with PKS sequences that abounded in the soil [26,34,38] (Table 2). This hybridization approach yielded seven positive clones in one (W) soil library that contained genes that were probably involved in PKS biosynthesis, which was subsequently confirmed by end-sequencing [26]. In the case of the M soil, the roughly 60 000 clones from the soil metagenomic library were divided into pools, which were used as templates for PKS-based PCR screenings. This enabled the detection of >100 positive pools (0.22% hit rate [34]). The amplicons produced were then sequenced to check for redundancies within the library and with known PKS sequences. The redundancy level was low (2%) and 39 unique PKS sequences were found from the pools, all representing promising novel PKS biosynthesis operons (Figure 2a).
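The idea of deliberate degeneracy in primers can be illustrated with a short Python sketch. It expands a degenerate primer, written with the standard IUPAC nucleotide ambiguity codes, into the set of exact sequences it represents and reports the degeneracy (the number of distinct sequences). The primer shown is hypothetical and chosen only for illustration; it is not one of the PKS1 primers used in the project, and the function names are likewise inventions of this sketch.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct exact sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every exact sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer for illustration only (not a published PKS primer).
EXAMPLE_PRIMER = "ACNGGRTAYTC"

print(EXAMPLE_PRIMER, "encodes", degeneracy(EXAMPLE_PRIMER), "sequences")
for sequence in expand(EXAMPLE_PRIMER):
    print(sequence)
```

A higher degeneracy widens the range of sequence variants that can be matched, at the cost of a lower effective concentration of each exact primer, which is one reason why such screens serve as a crude first search step.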
The positive clones in the pools were then identified using colony hybridization with relevant probes, after which these were tested, following shuttling into Streptomyces, in assays for antagonistic activity against the indicator organisms Bacillus subtilis 1A72, Staphylococcus aureus 21, Enterococcus faecalis 40, Escherichia coli 9, Pseudomonas aeruginosa 39, Fusarium oxysporum LNPV, Aspergillus fumigatus Gasp 4707 and Neurospora crassa HK. Among the positive clones, 56% showed antimicrobial activity against at least B. subtilis, 13% against S. aureus and 4% against E. faecalis, and <1% showed partial inhibition of growth of N. crassa mycelium (provided by LibraGen; Figure 2b). These molecular screening procedures thus enabled rapid access to novel PKS sequences (Figure 2a) typical for the soil community that can be used for production [26].

Technological challenges

What are the remaining challenges in exploring suppressive soils that have become apparent from the METACONTROL experience? Without any doubt, there is a strong need to improve the efficiencies at each step in soil metagenomics, from the application of a deliberate bias to favour target organisms and genes/operons in the starting material, to the improvement of cloning and screening procedures aimed at increasing the throughput and efficiency of metagenomics.

Deliberate bias in sampled communities

We learned in METACONTROL that deliberate manipulation of the sample communities from soil offers unique possibilities to enhance metagenomics hit rates. For instance, growth selection can be applied as outlined before. Here, an intelligent selection of growth conditions will guide the bias.
Also, FACS can be applied not only to sort the metabolically active cell fractions, but also to obtain particular fractions of the community, for instance the high-G+C% Gram-positive bacteria, in which antibiotic production loci are abundantly present, following staining with specific fluorescent probes.

Another promising strategy is offered by the use of stable isotopes. Stable isotope probing (SIP) introduces 13C-labelled substrates, for example methane, into soil communities. Methanotrophs then take up the 13C and incorporate it into their cellular DNA. The resulting heavy DNA can be separated from 12C-DNA by ultracentrifugation and sequenced, thus identifying the organisms that captured the substrate [43,44]. This approach was coupled to soil metagenomics studies [45] and resulted in the identification of a complete methane monooxygenase operon, enabling insight into the enzyme and its function in soil. Also, a dominant phenol-degrading bacterium, identified as a Thauera sp., was found in a wastewater bioreactor [46]. Moreover, by tracking the fate of 13C-labelled CO2 fixed by higher plants in soil, key data on plant-responsive microorganisms, which often produce antimicrobials as secondary metabolites, can be obtained [47]. The wide application of SIP using other organic substrates bears great potential in future explorative metagenomic studies in which organisms with particular ecological roles are the targets.

Figure 2. Polyketide synthesis (PKS type 1) positive clones found in the C soil library. (a) Phylogenetic analysis of the sequenced KS (ketosynthase) domains (one particular gene function involved in the biochemical pathway towards polyketides). The tree reconstruction was computed (OD/Neighbour Joining method) for 43 metagenomic sequences, also including 38 sequences from public databases, and corresponding to 204 informative sites. KS domains from public databases: black; metagenomic KS domains: red; PKS/non-ribosomal peptide synthesis (NRPS) hybrids: blue. Scale bars (0.1) indicate genetic distance (10%) (also in (b)). Numbers at junctions indicate bootstrap values (scale 1–100). Higher values, generally above 50, indicate robust clustering. (b) Unrooted tree of KS domains and biological activity. Clones active against only Bacillus subtilis are in red, against Bacillus subtilis and Staphylococcus aureus in blue, against Bacillus subtilis, Staphylococcus aureus and Enterococcus faecalis in green and against Bacillus subtilis, Staphylococcus aureus and Neurospora crassa in grey. Non-marked clones did not display activity against any tested organism. Abbreviation: KSLib, ketosynthase sequences of the LibraGen metagenomic library.

Searching for better metagenomic library hosts

Working with E. coli as the metagenomics host has clear advantages with respect to the relative ease of the laboratory work and the great experience gained with it over many years. However, it can be limiting with regard to the screening of phenotypes from the soil metagenome, as E. coli is not a typical soil organism. The main restriction arises from the fact that some intrinsic promoters and associated factors required for the expression of the inserts might be poorly recognized in this host. Moreover, essential post-translational processing and/or transport functions of the target genes might be missing. Rondon et al. [48] showed that only 30% of Bacillus traits could be expressed in E. coli, which indicates that E. coli is at best a suboptimal host for the heterologous expression of genes from many non-enteric soil bacteria. Furthermore, the host can also be sensitive to toxic compounds that are produced by inserts after their transcription and translation, and these host clones might consequently disappear from the library.
Obviously, these constraints are true for any host strain selected for metagenomics, and Box 3 details ongoing efforts in developing alternative hosts. Our work also supports the contention that working with several hosts instead of just E. coli will often enable the hit rates to be boosted.

The metagenomic library vector

The METACONTROL experience confirmed that critical evaluation of the vector system to be used in soil metagenomics is required [21]. Three types of vectors, i.e. small-, medium- and large-insert size vectors, are available. The small-insert size vectors, which basically only permit screening for single gene-encoded functions, can be used in shotgun sequencing approaches and enable construction of libraries from mechanically sheared DNA. Such an approach was successfully employed for the detection of small open reading frames (ORFs) derived from uncultured prokaryotes from sediment [49]. By contrast, fosmid and BAC vectors enable incorporation of large fragments and even intact operons within their genomic context, which might provide a better handle on gene expression. However, the fact that pure HMW soil DNA is required in sufficient amounts for efficient cloning into BAC vectors often makes this class of vectors unsuitable for routine cloning efforts, such as those required for high-throughput setups. The selection of vector clones that received inserts is also a crucial step, and this could be further improved with the development of vector systems that facilitate the detection of any inserts, such as a conditional host-killing system.

Outlook: the way ahead

Much like in other soil metagenomics projects, several interesting novel biological functions have been uncovered in the course of the METACONTROL project. These include the (partial) biosynthetic machinery for the production of particular polyketide antibiotics, e.g. a leinamycin-like antibiotic, as well as for several other novel polyketides [26,34,37–39] (Table 2).
However, at the same time several intrinsic difficulties were found to be associated with the approach [14,50,51], and this resulted in low hit rates even for the suppressive soils that were presumably enriched in the target functions. The project actually confirmed the currently accepted rule-of-thumb in soil metagenomics that the search for non-housekeeping functions can be compared to looking for a needle in a haystack, unless preselection is used to introduce a bias that increases the number of positive hits. The conditions imposed by the sample habitat (e.g. the typical rank-abundance distribution of microbial communities in soil; Box 1) dictate this outcome and require the development of creative tricks and tools to overcome these limitations.

Box 3. The quest for the most suitable metagenomics host/vector system

To overcome the limitations associated with the use of Escherichia coli as a host for soil metagenomics, non-E. coli hosts have been, or are in the process of being, developed on the premise that such hosts will enable a greater number of directly obtained genes or operons to be expressed. Improvements here will enhance the success of functional screens. Non-E. coli hosts include soil bacteria belonging to the genera Sphingomonas, Burkholderia, Bacillus, Acidobacterium and/or Verrucomicrobium [64]. Other alternative hosts, such as Streptomyces, Rhizobium and Pseudomonas spp. [38,53,65] and mutants of Lysobacter enzymogenes and Pseudomonas fluorescens mutated in their antagonistic activities [26], have also been used. Shuttle vectors that allow efficient traffic between these hosts and E. coli have also become available. For example, a Rhizobium leguminosarum host was used to test expression of metagenomic clones, and alcohol dehydrogenase as well as tryptophan biosynthetic genes were found in this host, but not in E. coli [65,66]. R. leguminosarum might enable the expression of a broader range of genes from typical soil bacteria than E. coli owing to its larger complement of relevant sigma factors. Pseudomonas spp., which possess signalling circuits for secondary metabolite production, might also serve as adequate library hosts for antagonistic functions and/or attributes promoting plant growth. For example, heterologous expression of genes involved in 2,4-diacetylphloroglucinol biosynthesis was found in Pseudomonas putida, but not in E. coli [53]. Finally, vectors that can shuttle between E. coli and eukaryotic hosts [67] will facilitate the study of eukaryotic genes in a relevant genomic background. The selection of a specific library host will depend on the aim of the particular metagenomic study. However, it has become clear that the preparation of libraries in shuttle vectors, which allow for gene expression in different or multiple hosts, will further extend the potential of metagenomics.

What are these tricks and tools? Improvement of the efficiency of each step in soil metagenomics protocols, from DNA preparation to screening/selection of positive clones, remains key to progress in the mining of the soil microbiota. This will include further fine-tuning of DNA extraction, that is, extraction of DNA from community members or metagenome fractions that most probably contain the target genetic information. For instance, total plasmid DNA (i.e. the metamobilome) can be targeted when novel biodegradation, metal or antibiotic resistance genes that are frequently found on plasmids are sought. Also, extraction can be tuned to 13C-DNA incorporated by microorganisms that utilized particular 13C-labelled substrates or to specific fractions of the chromosomal DNA pool, e.g. high GC content DNA. Other improvements rely on the introduction of novel positive selection for desired traits, either based on growth or on overcoming resistance, as well as improvements in subsequent screens, such as high-throughput formats of increased accuracy.
Moreover, the use of alternative hosts, with or without shuttle vectors, is expected to yield important progress. Further, combination of screening (gene expression) data with high-throughput sequencing should be fostered. From the METACONTROL experience [26] it has become apparent that adequate deciphering of metagenomic sequence information is paramount here.

Another tool that will support explorative metagenomics is based on the provision of prior knowledge on the incidence, abundance and expression of target genes in varying habitats. Global-scale gene mapping (GGM; derived from the concept of environmental gene tagging) enables the description of habitats in terms of their gene abundance and/or expression [52]. GGM should compare microbial gene pools across soils and might thus provide a global perspective on their prevalence. For instance, PKS1-type polyketide biosynthetic operons were more prevalent in a soil metagenome than in whale carcass, acid mine drainage or (Sargasso) sea metagenomes [52]. GGM could thus enable the prediction of metagenomic hit rates. However, before we reach this objective, crucial questions need to be answered (Box 4).

We conclude that major advances in the biotechnological exploration of soil are expected in the next decade. Guidance by GGM will be essential for progress. The expanded soil metagenomics approaches will enable scientists to (i) mine soil for genes and pathways of interest to biotechnological applications (Tables 1, 2); (ii) decipher identity and function of hitherto uncultured microorganisms; and (iii) provide an overall characterization of soil with regard to function and diversity. However, soil metagenomics will continue to depend on labour-intensive tasks and will thus remain resource-demanding [20].
On the positive side, the ever-increasing throughput of sequencing technologies will aid the quick assessment of the prevalence of target genes, shedding more light on the soil genetic reservoir and its potential for biotechnological application.

Acknowledgements

This work was supported by funding received under the EU METACONTROL project (QLK3-CT-2002-02068). R.C. received support from the Soil Biotechnology Foundation (Groningen).

References

1 Gans, J. et al. (2005) Computational improvements reveal great bacterial diversity and high metal toxicity in soil. Science 309, 1387–1390
2 Van Elsas, J.D. et al., eds (2006) Modern Soil Microbiology II, CRC Press
3 Demain, A.L. (2000) Small bugs, big business: the economic power of the microbe. Biotechnol. Adv. 18, 499–514
4 Garbeva, P. et al. (2006) Effect of above-ground plant species on soil microbial community structure and its impact on suppression of Rhizoctonia solani. Environ. Microbiol. 8, 233–246
5 Adesina, M.F. et al. (2007) Screening of bacterial isolates from various European soils for in vitro antagonistic activity towards Rhizoctonia solani and Fusarium oxysporum: site-dependent composition and diversity revealed. Soil Biol. Biochem. 39, 2818–2828
6 Garbeva, P. et al. (2004) Quantitative detection and diversity of the pyrrolnitrin biosynthetic locus in soil under different treatments. Soil Biol. Biochem. 36, 1453–1463
7 Steinberg, C. et al. (2006) Soil suppressiveness to plant diseases. In Modern Soil Microbiology II (van Elsas, J.D. et al., eds), pp. 455–478, CRC Press
8 Demanèche, S. et al. (2008) Antibiotic-resistant soil bacteria in transgenic plant fields. Proc. Natl. Acad. Sci. U. S. A. 105, 3957–3962
9 D'Costa, V.M. et al. (2007) Expanding the soil antibiotic resistome: exploring environmental diversity. Curr. Opin. Microbiol. 10, 481–489
10 Galvão, T.C. et al. (2005) Exploring the microbial biodegradation and biotransformation gene pool. Trends Biotechnol. 23, 497–506
11 Boubakri, H. et al.
(2006) Development of metagenomic DNA shuffling for the construction of a xenobiotic gene. Gene 375, 87–94
12 Janssen, P.H. et al. (2002) Improved culturability of soil bacteria and isolation in pure culture of novel members of the divisions Acidobacteria, Actinobacteria, Proteobacteria, and Verrucomicrobia. Appl. Environ. Microbiol. 68, 2391–2396
13 Kowalchuk, G.A. et al. (2004) Molecular Microbial Ecology Manual (2nd edn), Kluwer Academic Publishers
14 Lefevre, F. et al. (2008) Drugs from hidden bugs: their discovery via untapped resources. Res. Microbiol. 159, 153–161
15 Handelsman, J. et al. (1998) Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products. Chem. Biol. 5, R245–R249
16 Ward, N. (2006) New directions and interactions in metagenomics research. FEMS Microbiol. Ecol. 55, 331–338
17 Handelsman, J. (2004) Metagenomics: application of genomics to uncultured microorganisms. Microbiol. Mol. Biol. Rev. 68, 669–684
18 Daniel, R. (2005) The metagenomics of soil. Nat. Rev. Microbiol. 3, 470–478
19 Deutschbauer, A.M. et al. (2006) Genomics for environmental microbiology. Curr. Opin. Biotechnol. 17, 229–235
20 Kowalchuk, G.A. et al. (2007) Finding the needles in the metagenome haystack. Microb. Ecol. 53, 475–485
21 Sjöling, S. et al. (2006) Soil metagenomics: exploring and exploiting the soil. In Modern Soil Microbiology II (van Elsas, J.D. et al., eds), pp. 409–434, CRC Press
22 Leveau, J.H.J. (2007) The magic and menace of metagenomics: prospects for the study of plant growth-promoting rhizobacteria. Eur. J. Plant Pathol. 119, 279–300
23 Edwards, R.A. et al. (2006) Using pyrosequencing to shed light on deep mine microbial ecology. BMC Genomics 7, 57–70

Box 4. Outstanding questions

The efficiency of metagenomic explorations of soil will be advanced if answers can be found to the following questions:
(i) Is the occurrence and distribution of target genetic information, such as the particular polyketide biosynthesis operons targeted under METACONTROL, in different soils predictable? If the genes that confer the target functions are indeed found more frequently in particular biomes than in others (for instance in temperate grassland vs tropical rainforest vs disturbed sites), this information can be exploited to guide metagenomics approaches aimed at discovering and developing novel antibiotics.
(ii) Are the target traits endemic in the resident microflora for a given biome type? If so, such information can guide us in screening approaches applied to a broad range of soils.
(iii) Are the target functions fixed within the microbial chromosomes or present on accessory elements within the mobile (horizontal) gene pool? Such information will guide us to sampling either the chromosomal (metagenome) or mobile (metamobilome) gene pool.
(iv) Can the target traits be used to design markers to be used as tools that would provide an overall perspective of the global soil genetic reservoir? How can this knowledge help biotechnologists to enhance their chances to successfully encounter novel genes?

24 Apajalahti, J.H. et al. (1998) Effective recovery of bacterial DNA and percent-guanine-plus-cytosine-based analysis of community structure in the gastrointestinal tract of broiler chickens. Appl. Environ. Microbiol. 64, 4084–4088
25 Hjort, K. et al. (2007) Community structure of actively growing bacterial populations in plant pathogen suppressive soil. Microb. Ecol. 53, 399–413
26 Van Elsas, J.D. et al. A novel protocol for the metagenomic analysis of suppressive soil. J. Microbiol. Methods (in press)
27 Rondon, M.R. et al.
(2000) Cloning the soil metagenome: a strategy for accessing the genetic and functional diversity of uncultured microorganisms. Appl. Environ. Microbiol. 66, 2541–2547
28 Brady, S.F. and Clardy, J. (2000) Long-chain N-acyl amino acid antibiotics isolated from heterologously expressed environmental DNA. J. Am. Chem. Soc. 122, 12903–12904
29 MacNeil, I.A. et al. (2001) Expression and isolation of antimicrobial small molecules from soil DNA libraries. J. Mol. Microbiol. Biotechnol. 3, 301–308
30 Brady, S.F. et al. (2001) Cloning and heterologous expression of a natural product biosynthetic gene cluster from eDNA. Org. Lett. 3, 1981–1984
31 Gillespie, D.E. et al. (2002) Isolation of antibiotics turbomycin A and B from a metagenomic library of soil microbial DNA. Appl. Environ. Microbiol. 68, 4301–4306
32 Riesenfeld, C.S. et al. (2004) Uncultured soil bacteria are a reservoir of new antibiotic resistance genes. Environ. Microbiol. 6, 981–989
33 Riaz, K. et al. (2008) A metagenomic analysis of soil bacteria extends the diversity of quorum-quenching lactonases. Environ. Microbiol. 10, 560–570
34 Ginolhac, A. et al. (2004) Phylogenetic analysis of polyketide synthase I domains from soil metagenomic libraries allows selection of promising clones. Appl. Environ. Microbiol. 70, 5522–5527
35 Sebat, J.L. et al. (2003) Metagenomic profiling: microarray analysis of an environmental genomic library. Appl. Environ. Microbiol. 69, 4927–4934
36 Uchiyama, T. et al. (2005) Substrate-induced gene expression screening of environmental metagenome libraries for isolation of catabolic genes. Nat. Biotechnol. 23, 88–93
37 Bertrand, H. et al. (2005) High molecular weight DNA recovery from soils prerequisite for biotechnological metagenomic library construction. J. Microbiol. Methods 62, 1–11
38 Nalin, R. et al. (2004) LibraGen. Method for the expression of unknown environmental DNA into adapted host cells. Patent CA2492966
39 Courtois, S. et al.
(2003) Recombinant environmental libraries provide access to microbial diversity for drug discovery from natural products. Appl. Environ. Microbiol. 69, 49–55
40 Ginolhac, A. et al. (2005) Type I PKS may have evolved through horizontal gene transfer. J. Mol. Evol. 60, 716–725
41 Whiteley, A.S. et al. (2003) Analysis of the functional diversity within water-stressed soil communities by flow cytometric analysis and CTC+ cell sorting. J. Microbiol. Methods 54, 257–267
42 Chung, E.J. et al. (2008) Forest soil metagenome gene cluster involved in antifungal activity expression in Escherichia coli. Appl. Environ. Microbiol. 74, 723–730
43 Radajewski, S. et al. (2000) Stable isotope probing as a tool in microbial ecology. Nature 403, 646–649
44 Radajewski, S. et al. (2003) Stable-isotope probing of nucleic acids: a window to the function of uncultured microorganisms. Curr. Opin. Biotechnol. 14, 296–302
45 Dumont, M.G. et al. (2006) Identification of a complete methane monooxygenase operon from soil by combining stable isotope probing and metagenomic analysis. Environ. Microbiol. 8, 1240–1250
46 Manefield, M. et al. (2002) RNA stable isotope probing: a novel means of linking microbial community function to phylogeny. Appl. Environ. Microbiol. 68, 5367–5373
47 Ostle, N. et al. (2003) Active microbial RNA turnover in a grassland soil estimated using a 13CO2 spike. Soil Biol. Biochem. 35, 877–885
48 Rondon, M.R. et al. (1999) Toward functional genomics in bacteria: analysis of gene expression in Escherichia coli from a bacterial artificial chromosome library of Bacillus cereus. Proc. Natl. Acad. Sci. U. S. A. 96, 6451–6455
49 Wilkinson, D.E. et al. (2002) Efficient molecular cloning of environmental DNA from geothermal sediments. Biotechnol. Lett. 24, 155–161
50 Lorenz, P. and Eck, J. (2005) Metagenomics and industrial applications. Nat. Rev. Microbiol. 3, 510–516
51 Langer, M. et al. (2006) Metagenomics: an inexhaustible access to nature's diversity. Biotechnol. J. 1, 815–821
52 Tringe, S.G. et al.
(2005) Comparative metagenomics of microbial communities. Science 308, 554–557
53 Martinez, A. et al. (2004) Genetically modified bacterial strains and novel bacterial artificial chromosome shuttle vectors for constructing environmental libraries and detecting heterologous natural products in multiple expression hosts. Appl. Environ. Microbiol. 70, 2452–2463
54 Venter, J.C. et al. (2004) Environmental genome shotgun sequencing of the Sargasso Sea. Science 304, 66–74
55 Johnson, P.L.F. and Slatkin, M. (2006) Inference of population genetic parameters in metagenomics: a clean look at messy data. Genome Res. 16, 1320–1327
56 Noguchi, H. et al. (2006) MetaGene: prokaryotic gene finding from environmental genome shotgun sequences. Nucleic Acids Res. 34, 5623–5630
57 McHardy, A.C. and Rigoutsos, I. (2007) What's in the mix: phylogenetic classification of metagenome sequence samples. Curr. Opin. Microbiol. 10, 499–503
58 Raes, J. et al. (2007) Get the most out of your metagenome: computational analysis of environmental sequence data. Curr. Opin. Microbiol. 10, 490–498
59 Zhu, Y. et al. (2007) Deciphering RNA structural diversity and systematic phylogeny from microbial metagenome. Nucleic Acids Res. 35, 2283–2294
60 Schloss, P.D. and Handelsman, J. (2008) A statistical toolbox for metagenomics: assessing functional diversity in microbial communities. BMC Bioinformatics 9, 34
61 Tyson, G.W. et al. (2004) Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428, 37–43
62 Frost, L.S. et al. (2005) Mobile genetic elements: the agents of open source evolution. Nat. Rev. Microbiol. 3, 722–732
63 Woyke, T. et al. (2006) Symbiosis insights through metagenomic analysis of a microbial consortium. Nature 443, 950–955
64 Eyers, L. et al. (2004) Environmental genomics: exploring the unmined richness of microbes to degrade xenobiotics. Appl. Microbiol. Biotechnol. 66, 123–130
65 Wexler, M. et al.
(2005) A wide host-range metagenomic library from a waste water treatment plant yields a novel alcohol/aldehyde dehydrogenase. Environ. Microbiol. 7, 1917–1926
66 Li, Y. et al. (2005) Screening a wide host-range, waste-water metagenomic library in tryptophan auxotrophs of Rhizobium leguminosarum and of Escherichia coli reveals different classes of cloned trp genes. Environ. Microbiol. 7, 1927–1936
67 Al-Hasani, K. et al. (2003) Development of a novel bacterial artificial chromosome cloning system for functional studies. Plasmid 49, 184–187