1MSU-DOE Plant Research Laboratory, 2Department of Microbiology, 3NSF Center for Microbial Ecology, Michigan State University, East Lansing, MI 48824, USA.


The identification and classification of bacteria are of crucial importance in environmental, industrial, medical and agricultural microbiology and microbial ecology. A number of different phenotypic and genotypic methods are presently being employed for microbial identification and classification (see Fig. 1 and Louws et al. 1996). Each of these methods permits a certain level of phylogenetic classification, from the genus, species, subspecies, biovar to the strain specific level (Fig. 1). Moreover, each method has its advantages and disadvantages, with regard to ease of application, reproducibility, requirement for equipment and level of resolution (Akkermans et al. 1995).

Generally, DNA-based methods are emerging as the more reliable, simple and inexpensive ways to identify and classify microbes. In fact, the assignment of genera/species has traditionally been based on DNA-DNA hybridization methods (Wayne et al. 1987) and modern phylogeny is increasingly based on 16S rRNA sequence analysis (Woese 1987; Stackebrandt and Goebel 1994). Here, we describe a method referred to as rep-PCR genomic fingerprinting, a DNA amplification based technique, which has been found to be extremely reliable, reproducible, rapid and highly discriminatory (Versalovic et al. 1994; Louws et al. 1996).

Rep-PCR genomic fingerprinting makes use of DNA primers complementary to naturally occurring, highly conserved, repetitive DNA sequences, present in multiple copies in the genomes of most Gram-negative and several Gram-positive bacteria (Lupski and Weinstock 1992). Three families of repetitive sequences have been identified, including the 35-40 bp repetitive extragenic palindromic (REP) sequence, the 124-127 bp enterobacterial repetitive intergenic consensus (ERIC) sequence, and the 154 bp BOX element (Versalovic et al. 1994). These sequences appear to be located in distinct, intergenic positions around the genome. The repetitive elements may be present in both orientations, and oligonucleotide primers have been designed to prime DNA synthesis outward from the inverted repeats in REP and ERIC, and from the boxA subunit of BOX, in the polymerase chain reaction (PCR) (Versalovic et al. 1994). The use of these primers and PCR leads to the selective amplification of distinct genomic regions located between REP, ERIC or BOX elements. The corresponding protocols are referred to as REP-PCR, ERIC-PCR and BOX-PCR genomic fingerprinting, respectively, and rep-PCR genomic fingerprinting collectively (Versalovic et al. 1991; 1994). The amplified fragments can be resolved in a gel matrix, yielding a profile referred to as a rep-PCR genomic fingerprint (Versalovic et al. 1994; see Fig. 2). These fingerprints resemble "bar code" patterns analogous to UPC codes used in grocery stores (Lupski 1993). The rep-PCR genomic fingerprints generated from bacterial isolates permit differentiation to the species, subspecies and strain level.
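
The principle of amplifying the regions that lie between repetitive elements can be illustrated with a small in silico sketch. This is a toy model and not part of the protocol. The element positions, the strand labels, the `max_len` cut-off and the function name `predicted_fragments` are all hypothetical. We assume that a product forms only between two neighbouring elements whose primers extend toward each other, and only if the distance is short enough to be amplified.

```python
from typing import List, Tuple

# Each repeat element is (position_bp, strand); all values are hypothetical.
Repeat = Tuple[int, str]

def predicted_fragments(repeats: List[Repeat], max_len: int = 5000) -> List[int]:
    """Return sizes of inter-repeat regions flanked by converging primer sites.

    Simplifying assumption: a primer annealed at a '+' element extends toward
    higher coordinates and one at a '-' element extends toward lower
    coordinates, so a product forms only between a '+' element and the next
    '-' element downstream, provided the distance does not exceed max_len.
    """
    ordered = sorted(repeats)
    sizes = []
    for (pos_a, strand_a), (pos_b, strand_b) in zip(ordered, ordered[1:]):
        if strand_a == "+" and strand_b == "-" and pos_b - pos_a <= max_len:
            sizes.append(pos_b - pos_a)
    return sorted(sizes)

# Toy genome with six repeat elements (positions in bp).
toy_genome = [(1000, "+"), (2200, "-"), (9000, "+"),
              (9700, "-"), (20000, "+"), (40000, "-")]
fingerprint = predicted_fragments(toy_genome)
print(fingerprint)
```

In this toy genome only two pairs of elements are close enough and correctly oriented, so the predicted "fingerprint" contains two bands. A strain with a different number or arrangement of elements would yield a different set of band sizes, which is the basis of the discriminatory power described above.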

Rep-PCR genomic fingerprinting protocols have been developed in collaboration with the group led by Dr. J.R. Lupski at Baylor College of Medicine (Houston, Texas) and have been applied successfully in many medical, agricultural, industrial and environmental studies of microbial diversity (Versalovic et al. 1994). In addition to studying diversity, rep-PCR genomic fingerprinting has become a valuable tool for the identification and classification of bacteria, and for molecular epidemiological studies of human and plant pathogens (van Belkum et al. 1994; Louws et al. 1996 and references therein; Versalovic et al. 1997).

This chapter also describes the application of computer assisted analysis of rep-PCR generated genomic fingerprints for the identification and classification of microbes using cluster analysis algorithms. Cluster analysis is the art of finding groups in data, and bacterial classification and taxonomy are principal applications of this methodology (see Fig. 2B). We will describe the generation of raw data, the comparison of fingerprints, and the different algorithms used to find groupings in the data, and to identify specific strains in a database using their genomic fingerprints.
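
As a minimal sketch of what such a cluster analysis involves, the example below compares band-based fingerprints with the Dice coefficient and groups them by average linkage (UPGMA). It is an illustration under stated assumptions and not the software used in this chapter: the strain names and band sizes are invented, bands are assumed to have been binned already so that identical numbers mean "same band", and the function names `dice` and `upgma` are our own.

```python
from itertools import combinations

def dice(a: set, b: set) -> float:
    """Dice coefficient: 2 * shared bands / total number of bands in both lanes."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def upgma(profiles: dict) -> list:
    """Average-linkage (UPGMA) clustering on Dice similarities.

    Returns the merge steps as tuples of
    (members_of_cluster_1, members_of_cluster_2, average_similarity).
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def avg_sim(c1, c2):
        # Mean similarity over all pairs of members from the two clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(dice(profiles[x], profiles[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the highest average similarity.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        merges.append((c1, c2, round(avg_sim(c1, c2), 3)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical band sizes (bp) for three isolates.
profiles = {
    "strain_A": {300, 550, 700, 1200, 2000},
    "strain_B": {300, 550, 700, 1200, 2500},
    "strain_C": {400, 900, 1500, 2000},
}
merges = upgma(profiles)
for step in merges:
    print(step)
```

With these invented profiles, strain_A and strain_B share four of their five bands and are joined first, and strain_C joins the pair at a much lower similarity. The order and level of the merges is what a dendrogram would display.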


In this second section we will provide an overview of the different methodologies and protocols used to implement rep-PCR genomic fingerprinting. One distinct advantage of the rep-PCR genomic fingerprinting method is that the primers used work in a variety of Gram-negative and Gram-positive bacteria (see Versalovic et al. 1991; 1994; Louws et al. 1996). This means that no previous knowledge of the genomic structure or nature of indigenous repeated sequences is necessary. It also bypasses the need to identify suitable arbitrary primers by trial and error, which is inherent in the RAPD protocol (see Welsh and McClelland 1990).

Sample preparation is simple and rapid and genomic fingerprints can be obtained from a variety of different templates (see next section). Many samples can be prepared in a short time. PCR amplification requires 5-7 hours. Electrophoresis on agarose gels can be performed in eight hours, but 18 hours is preferred for better resolution of the complex fingerprints on long (24 cm) gels. Therefore rep-PCR fingerprinting, including pattern analysis by eye or using a computer, can be performed in two days. In this section, we will primarily focus on examples involving the analysis of plant-associated and other soil bacteria. For a discussion of medical applications, consult reviews by Versalovic et al. (1994, 1997).

Correct pattern imaging, and visual interpretation or conversion to computer-processable data, will be described in the third section. Important parameters that will be discussed include the choice of size marker standards for multiple gel comparison and database construction, the determination of proximity coefficients, and the use of appropriate clustering methods for phylogenetic analysis.
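
To indicate why size marker standards matter for comparing gels, the sketch below converts a migration distance into an estimated fragment size by interpolating between the two neighbouring marker bands. It is a simplified illustration and not the normalization procedure of any particular software package. It assumes that log10(size) is roughly linear in migration distance between adjacent marker bands, and the marker values and the function name `estimate_size` are hypothetical.

```python
import math

def estimate_size(distance_mm, marker):
    """Interpolate fragment size (bp) from migration distance (mm).

    marker: list of (distance_mm, size_bp) pairs for the size standard run on
    the same gel. Assumes log10(size) is approximately linear in distance
    between two neighbouring marker bands.
    """
    pts = sorted(marker)
    for (d1, s1), (d2, s2) in zip(pts, pts[1:]):
        if d1 <= distance_mm <= d2:
            frac = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the range covered by the marker")

# Hypothetical marker lane: (migration distance in mm, fragment size in bp).
marker = [(20.0, 10000), (60.0, 1000), (100.0, 100)]
size = estimate_size(40.0, marker)
print(round(size))
```

Because each gel carries its own marker lanes, band positions from different gels can be expressed on a common size scale before fingerprints are compared or entered into a database.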

Template preparation for rep-PCR genomic fingerprinting.

Several methods of template preparation can be used for rep-PCR mediated genomic fingerprinting. The method used depends on the nature of the microbes to be analyzed, their receptiveness to lysing (releasing DNA), size of pools to be analyzed, level of resolution desired and time available. Rep-PCR genomic fingerprints have been obtained from purified DNA, whole cells from pure liquid cultures or cultures from plates, as well as directly from extracts of plant lesions or nodules (Versalovic et al. 1994; 1997; Louws et al. 1994, 1995, 1996; Nick and Lindstrom 1994; Schneider and de Bruijn 1996; Vera Cruz et al. 1996). Here we will focus on the whole cell and purified DNA based methods, and not discuss the plant tissue related approaches. For a detailed description of the latter, see Schneider and de Bruijn (1996) and Louws et al. (1996).

- Whole cells from pure liquid cultures.

Whole cells obtained from a liquid culture can be used directly in rep-PCR amplification reactions. Generally, washing the cells improves the quality of the rep-PCR reactions. Using this method, rep-PCR genomic fingerprints have been generated from, for example, A. caulinodans and R. meliloti (Schneider and de Bruijn 1996).

1. Take 3 ml of a liquid culture (OD600 0.65 - 0.95).

2. Spin down the cells.

3. Wash the cell pellet with 1 M NaCl. Repeat the washing several times for cultures producing a lot of polysaccharides.

4. Resuspend the cell pellet in 100 µl double distilled water. Store aliquots at -20°C (optional).

5. Use 1-2 µl of template per PCR reaction (see below).

- Whole cells from single colonies on plates.

Whole cells obtained from single colonies on plates can also be used directly in the rep-PCR reaction. Using this method, rep-PCR genomic fingerprints have been obtained from Rhizobium sp., Clavibacter michiganensis subsp., E. coli, various xanthomonads and pseudomonads, as well as from a large collection of unidentified 3-CBA degraders and subsurface microbes (de Bruijn 1992; Louws et al. 1994, 1995; Judd et al. 1993; Zlatkin et al. 1996; Schneider and de Bruijn 1996; M.H. Schultz, J.L.W. Rademaker and F.J. de Bruijn unpublished results).

1. Remove a small portion of a well-defined single colony directly off a fresh plate using a 1 µl disposable inoculation loop (Simport L200-1).

2. Insert the loop into 25 µl PCR mix and whisk to resuspend the cells.

Comments: Plates up to 4 and even 12 weeks old have been used successfully (Schneider and de Bruijn 1996; M.H. Schultz, J.L.W. Rademaker and F.J. de Bruijn unpublished results). Relatively few cells yield enough DNA for a rep-PCR reaction; in fact, using too many cells results in a background smear.

- Whole cells after alkaline lysis.

Whole cells of microbial species that are difficult to lyse and do not easily release DNA during the PCR cycles may be pretreated by the alkaline lysis method.

1. Take 10 µl from a cell suspension (10^3 - 10^7 bacteria) or a portion of one colony.

2. Add 100 µl of 0.05 M NaOH and incubate at 95°C for 15 min.

3. Centrifuge for 2 min. at 14,000 rpm.

4. Use 1 µl of the supernatant per rep-PCR reaction.
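
The volumes in the steps above imply a rough estimate of how much template enters each reaction, which the following sketch works through. It is back-of-the-envelope arithmetic under stated assumptions: the 10^3 - 10^7 bacteria are taken to be the number present in the 10 µl sample, lysis is assumed to be complete, no DNA is assumed to be lost to the pellet, and volumes are treated as additive. The function name `lysis_template` is hypothetical.

```python
def lysis_template(cells_in_sample, sample_ul=10.0, naoh_ul=100.0,
                   naoh_molar=0.05, template_ul=1.0):
    """Return (final NaOH molarity, cell equivalents per PCR reaction).

    Assumes complete lysis, no loss of DNA to the pellet, and additive volumes.
    """
    total_ul = sample_ul + naoh_ul
    # NaOH is diluted by the sample volume.
    final_naoh = naoh_molar * naoh_ul / total_ul
    # Each microlitre of lysate carries an equal share of the lysed cells.
    cells_per_reaction = cells_in_sample * template_ul / total_ul
    return final_naoh, cells_per_reaction

for cells in (1e3, 1e7):
    naoh, eq = lysis_template(cells)
    print(f"{cells:.0e} cells -> {naoh:.4f} M NaOH, "
          f"~{eq:.0f} cell equivalents per reaction")
```

Under these assumptions, 1 µl of supernatant corresponds to roughly 1/110 of the cells in the sample, which is in line with the observation that relatively few cells are sufficient for a rep-PCR reaction.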

- Purified genomic DNA.

The procedure described is based on Pitcher et al. (1989), as modified by Dr. Luc Vauterin (personal communication). This method extracts DNA from cells grown on solid media (agar plates) rather than in liquid media. Best results are obtained using young and fresh cells.

Materials and reagents
