collection containing many distantly related genera rep-PCR is an excellent tool to assess diversity. An interesting place to obtain a broad overview of phylogeny, including references, is the "Tree of Life" website of the University of Arizona (Maddison and Maddison, http://phylogeny.arizona.edu/tree/phylogeny.html).

- Database generation, microbial identification methods and available software.

Specific data sets in microbial classification or taxonomy can be used to order the units of these groups into reference collections. These organized reference collections of fingerprints, or libraries, can be extremely useful for taxonomic and identification purposes. Several computer programs are currently available to facilitate the storage and analysis of rep-PCR generated fingerprints. GelCompar (Applied Maths, Kortrijk, Belgium; Vauterin and Vauterin 1992) is a state-of-the-art package for comparative analysis of electrophoresis patterns. It consists of extensive basic software, which allows normalization of fingerprints according to intra-gel size standards. Additional comprehensive modules are available for hierarchical cluster analysis, principal component analysis, library management & identification, and comparative quantification & polymorphism analysis. Five band-based similarity coefficients (Jaccard, Dice, an area-sensitive coefficient, a fuzzy logic coefficient and Jeffrey's x coefficient) are provided, together with the curve-based product-moment correlation coefficient, which is calculated from the whole densitometric curves. The unweighted pair group method with arithmetic averages (UPGMA), Ward's and Neighbour Joining algorithms can be used to perform the cluster analyses. The GelCompar program has been successfully applied to rep-PCR genomic fingerprinting analysis (de Bruijn et al. 1996a; 1996b; Louws et al. 1996; Schneider and de Bruijn 1996; Zlatkin et al. 1996).

Other available programs include the Whole Band Analyser (BioImage, Ann Arbor, MI, USA) for a Sun station and other image analysis software for Windows and Macintosh. Moreover, Dendron (Solltech, Oakdale, IA, USA) is a computer-assisted gel analysis system operating on Macintosh. This package allows the generation of databases and hierarchical cluster analysis. Four similarity coefficients, based upon band position alone or upon band position and intensity, are available, and subsequently either the weighted pair group method (WPGM) or a proprietary method can be used to generate dendrograms (Snelling et al. 1996). RFLPscan (Scanalytics, Waltham, MA, USA) is the new generation of the AMBIS system, allowing the creation of databases and the calculation of one band-based similarity coefficient between lanes. As pointed out in the introduction to this section, we will focus on the GelCompar system as applied to rep-PCR genomic fingerprinting pattern analysis.
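
To make the difference between band-based and curve-based coefficients concrete, the following minimal Python sketch computes the Jaccard and Dice coefficients from two sets of band positions and the product-moment (Pearson) correlation from two densitometric curves. It is an illustration only, not GelCompar code: the function names are hypothetical, and the bands are assumed to have been assigned to shared integer position classes beforehand.

```python
import math

def jaccard(bands_a, bands_b):
    """Shared bands divided by bands present in at least one lane."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def dice(bands_a, bands_b):
    """Twice the shared bands divided by the total number of bands in both lanes."""
    a, b = set(bands_a), set(bands_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0

def product_moment(curve_x, curve_y):
    """Pearson correlation between two densitometric curves of equal length."""
    if len(curve_x) != len(curve_y):
        raise ValueError("curves must have the same number of points")
    n = len(curve_x)
    mean_x = sum(curve_x) / n
    mean_y = sum(curve_y) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(curve_x, curve_y))
    var_x = sum((x - mean_x) ** 2 for x in curve_x)
    var_y = sum((y - mean_y) ** 2 for y in curve_y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat curve carries no pattern information (assumed convention)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical band position classes for two lanes
lane_a = [1, 2, 3, 4]
lane_b = [3, 4, 5, 6]
print(jaccard(lane_a, lane_b), dice(lane_a, lane_b))
```

The band-based coefficients depend on which bands were assigned, whereas the correlation uses every point of the curve and therefore needs no band assignment at all.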

- Image capture and standardization of rep-PCR genomic fingerprint patterns for computer-assisted analysis.

When multiple gels are to be compared with each other, the best results are obtained when all experimental parameters are standardized as much as possible. This is especially important when large databases are to be generated or when data generated by different laboratories need to be compared. The standardized conditions should include sample preparation and processing, similar growth conditions, the same DNA isolation methods and the same rep-PCR conditions. Moreover, the use of standardized electrophoresis conditions and size markers is essential. Lastly, image capturing should be standardized. A video camera based on a charge-coupled device (CCD) can be used, and the digitized images of ethidium bromide stained gels can be saved directly as TIFF files. Alternatively, photographs can be taken using conventional cameras, such as the KODAK MP4 land camera, and the resulting photographs can be scanned with normal flatbed scanners or laser scanners and saved as TIFF files. We will describe our method below.

Experimental protocol

1. Place the photograph (see Fig. 3A) in the top right-hand corner of the HP Scanjet 3c, with the gel lanes parallel to the edge of the scanner, and start the Deskscan II version 2.2 software.

2. Use the image type defined as "Sharp B. and W. Photo", with "256 grays" and a "normal" sharpening. The path type has a resolution of 75 halftones and 100 dpi.

3. Activate "Preview" and frame the image within the borders of the picture, using only informative bands.

4. Activate "Zoom" and correct the frame when necessary; then activate the [ sign for automatic contrast and brightness correction.

5. Activate the "Final" function and save the image as a TIFF file in the subdirectory of GelCompar called "images" or "gels.ima".

- Analysis of rep-PCR generated fingerprint patterns using GelCompar.

The gel images are further processed by defining and naming patterns (lanes) in the Conversion step (see Fig. 3B), followed by normalization and background subtraction of the fingerprints (see Fig. 3C). Subsequently, databases can be generated, which allow editing, searching for, and selecting lists of fingerprints. Fingerprints can be grouped by numerical comparison or identified by comparison to lists, or to more sophisticated specific libraries.

Materials

1 GelCompar version 4.0 program (Applied Maths, Kortrijk, Belgium)

2 A PC-compatible computer with an 80486 DX CPU or better and at least 8 MB RAM, running Windows 3.1 or 95.

Comments: GelCompar requires a basic understanding of Windows 3.1 or 95 and some understanding of the MS-DOS directory system, file naming and file management. We run GelCompar on a Pentium PC (100 MHz, 16 MB RAM), which allows rapid processing of hundreds of fingerprints.

Analytical protocol

Conversion of images into a GelCompar format.

In this part the whole gel image is converted to separate lanes or gel tracks (see Fig. 3B) and specific information is added to these lanes.

1. Click on "Convert" in the start up menu and then choose "File" and "Load Image".

2. Choose a TIFF-file from the default subdirectory "images" or "gels.ima" and check "Negative" for light bands on a dark background, such as ethidium bromide stained gels.

3. Click on "Edit" and "Settings" and customize the 'Track scanning settings'. For rep-PCR we typically use a 'Track resolution' (the number of points each densitometric track will consist of) of 400, a 'Curve smoothing' of 0, a 'Spline thickness' (the number of points on either side of the center of the spline, horizontally, that are averaged for a more stable profile) of 2, and a 'Gelstrip thickness' (the number of points on either side of the center of the spline, horizontally, that are taken for the image of a gelstrip) of 5. We also activate the 'Whole gel' option under 'Rescaling', and set the 'Number of tracks' according to the number of samples on the gel and the 'Number of nodes' according to the curvature of the lanes; we normally apply three nodes. Either 'Track search algorithm' I or II can be applied. 'Track resolution', 'Curve smoothing' and 'Spline thickness', and preferably also 'Gelstrip thickness', should be kept fixed within a database.

4. Fit the limits of the green frame snugly around the gel.

5. Assign tracks with the help of either one of the automated track search routines defined in the settings, use "Add group" or assign tracks manually.

6. Position the splines on the tracks using the left mouse button and the "Shift" key. Apply more nodes using "Page up", when necessary, to follow the lane distortion.

7. Apply the "Edit" and "Zoom box" commands to check or edit results.

8. When satisfied with the results, save the track positions by choosing "Track" and "Save track positions". Subsequently activate "Scan". Do not close the window; leave it by minimizing it, and enter descriptive information for each lane.

9. For Marker lanes check "Reference" to ensure recognition of the internal standards by GelCompar and finish by saving the gel file in the default "gels.raw" subdirectory.
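
The role of 'Spline thickness' and 'Track resolution' in the conversion step can be illustrated with a short Python sketch. It assumes a perfectly straight, vertical lane (no nodes or spline curvature) and an image held as a list of pixel rows; the function names and the tiny test image are hypothetical, and GelCompar's own track scanning may differ in detail.

```python
def resample(values, n_points):
    """Linearly interpolate a list of values onto n_points evenly spaced points."""
    if n_points < 2 or len(values) < 2:
        raise ValueError("need at least two input values and two output points")
    step = (len(values) - 1) / (n_points - 1)
    out = []
    for i in range(n_points):
        pos = i * step
        left = min(int(pos), len(values) - 2)
        frac = pos - left
        out.append(values[left] * (1 - frac) + values[left + 1] * frac)
    return out

def track_profile(image, center_col, spline_thickness=2, resolution=400):
    """Densitometric profile of one straight lane.

    For every pixel row, the pixels from center_col - spline_thickness to
    center_col + spline_thickness are averaged; the resulting column of
    values is then resampled to `resolution` points.
    """
    width = len(image[0])
    lo = max(0, center_col - spline_thickness)
    hi = min(width - 1, center_col + spline_thickness)
    raw = []
    for row in image:
        window = row[lo:hi + 1]
        raw.append(sum(window) / len(window))
    return resample(raw, resolution)

# Hypothetical 5-row image, 7 pixels wide, with intensity increasing down the lane
image = [[r * 10] * 7 for r in range(5)]
profile = track_profile(image, center_col=3, spline_thickness=2, resolution=9)
print(profile)
```

Because every track is resampled to the same number of points, profiles from gels of different pixel dimensions become directly comparable, which is why these settings should stay fixed within a database.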

Normalization of fingerprints.

In this part the lanes are normalized to an absolute database standard (Fig. 3A to 3B), using the reference tracks, and the background of the geltracks is subtracted (see Fig. 3C, 6 and 7).

1. Click on "Normalize" in the start up menu and subsequently activate "Edit" and "Settings" to customize Normalization and Background Subtraction. For rep-PCR fingerprint analysis, we typically use a 'Resolution' of 400 pt, a 'Smoothing' of 3 pt, a 'Rolling Disk' background subtraction with an 'Intensity Setting' of 12 and enable both "load 2D-gelstrips" and "save2D-gelstrips".

2. Choose "File" and "Load" and a file from the "gels.raw" directory.

3. For the first gel in this database, choose "Edit" and "Reference positions", move the cursor to a well defined, single and reproducible band in a typical reference track and choose "Peak" and "Add". Assign all well defined and reproducible marker bands in the standard which are useful and finish by choosing "OK". When the standard reference is already chosen, proceed to step 5.

4. To apply this standard reference to the database choose "Edit", "Settings", "Use current standard reference" and "OK".

5. Choose "Associate" and "By pattern recognition". Check all associated bands, and make necessary improvements by selecting a standard reference band with the left mouse button and assigning the corresponding new marker band with the right mouse button.

When the marker lanes are very different from the absolute database standard, visible in the left part of the window, it is necessary to press "Control" and / or "Shift" to associate corresponding bands.

6. When content with the association choose "Alignment" and "Align associated peaks".

7. Check the normalized pattern again.

8. When content with the normalization choose "File" and "Save"; the background is then subtracted and the normalized profiles are saved in the appropriate "gels.int" subdirectory.
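
The idea behind aligning associated peaks can be sketched in Python as a piecewise-linear stretching of a track so that its marker bands land on the reference positions of the database standard. This is a simplified illustration under stated assumptions: the function names are hypothetical, the marker positions are assumed to apply directly to the track being normalized, and GelCompar's actual alignment (including interpolation between neighbouring reference lanes) is not reproduced.

```python
def map_position(x, src_points, dst_points):
    """Piecewise-linear map sending src_points[i] to dst_points[i].

    Positions outside the outermost points are extrapolated along the
    first or last segment.
    """
    if len(src_points) < 2 or len(src_points) != len(dst_points):
        raise ValueError("need at least two matching marker positions")
    k = 0
    while k < len(src_points) - 2 and x > src_points[k + 1]:
        k += 1
    x0, x1 = src_points[k], src_points[k + 1]
    y0, y1 = dst_points[k], dst_points[k + 1]
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

def sample(profile, pos):
    """Linear interpolation of the profile at a fractional position (clamped to the ends)."""
    pos = min(max(pos, 0.0), len(profile) - 1)
    left = min(int(pos), len(profile) - 2)
    frac = pos - left
    return profile[left] * (1 - frac) + profile[left + 1] * frac

def normalize_track(profile, observed_markers, reference_markers):
    """Resample a track so that observed marker bands fall on the reference positions."""
    return [
        sample(profile, map_position(j, reference_markers, observed_markers))
        for j in range(len(profile))
    ]

# Hypothetical 11-point track with marker bands observed at points 1, 4 and 8;
# the database standard expects them at points 2, 6 and 10.
track = [0, 100, 0, 0, 100, 0, 0, 0, 100, 0, 0]
aligned = normalize_track(track, observed_markers=[1, 4, 8], reference_markers=[2, 6, 10])
print(aligned)
```

For each point of the normalized track the sketch looks up where that point lies in the original track, so bands between two markers are shifted proportionally to their distance from each marker.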

Analysis.

The analysis starts with loading or creating a list. A list is a series of lanes from one or more gels from the same database to be compared with each other. Gels can be reconstructed (see Fig. 6 and 7) or composite gels can be assembled using a list. Dendrograms can be generated using several methods and libraries can be built.



* Creating a List.

1. Click on "Analyze" in the start up menu and double click a gel name in the analysis window.

2. Select all desired lanes with the right mouse button or by applying the "Search" and "Topic" functions. Lists can be saved by choosing "List" and "Save".

* Assigning bands.

To assign the bands in a lane, the program features a helpful automated gel searching option.

1. Double click the gel name of your choice; when the window has opened choose "Gel", "Bands" and "Auto Search". With "System", "Band settings" the sensitivity of the band search can be adjusted by varying the minimal height and minimal area a band must have to be found.

2. Check the assigned bands, and select more or fewer bands by moving the cursor to the band in question. The shape of the densitometric peak can be adjusted by repositioning the pink squares.

3. Select a band with the left mouse button and click the right mouse button to assign it.

Comment: Clicking on the lane number above a lane reveals the number of bands in this lane in the bottom of the window.

4. Alternatively changes can be made by selecting a lane and choosing "Bands" and "Edit".

5. When content with the assigned bands choose "OK" or "exit" and "OK" and the adjusted peak assignment is saved.
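
How a minimal height and a minimal area act as sensitivity thresholds in an automated band search can be illustrated with the following Python sketch. It is a hypothetical, simplified peak finder operating on a background-subtracted profile, not the algorithm implemented in GelCompar; the function name, the example profile and the threshold values are assumptions.

```python
def find_bands(profile, min_height, min_area):
    """Return local maxima that pass both the height and the area threshold.

    A point is a peak if it is higher than its left neighbour and at least
    as high as its right neighbour. The peak area is the sum of the profile
    between the neighbouring valleys on either side.
    """
    bands = []
    for i in range(1, len(profile) - 1):
        is_peak = profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
        if not is_peak or profile[i] < min_height:
            continue
        # Walk downhill to the left valley
        left = i
        while left > 0 and profile[left - 1] < profile[left]:
            left -= 1
        # Walk downhill (or along a flat stretch) to the right valley
        right = i
        while right < len(profile) - 1 and profile[right + 1] <= profile[right]:
            right += 1
        area = sum(profile[left:right + 1])
        if area >= min_area:
            bands.append({"position": i, "height": profile[i], "area": area})
    return bands

# Hypothetical background-subtracted profile with one weak and two strong peaks
profile = [0, 1, 0, 0, 4, 9, 4, 0, 0, 2, 6, 2, 0]
for band in find_bands(profile, min_height=5, min_area=10):
    print(band)
```

Lowering either threshold makes the search more sensitive, at the cost of assigning noise peaks as bands, which is why the automatically assigned bands should always be checked by eye.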

* Combining or superimposing gels.

Rep-PCR genomic fingerprint patterns of different gels can be combined linearly. Gels can be linked to allow the analysis of associated genotypic fingerprints. This may lead to a higher contrast in the cluster analysis of the strains, as described by de Bruijn et al. (1996).

1. Click on "Analyze" in the start up menu then "Database" and "New combined gels", "File" and "Add new component".

2. Choose gel with correct descriptive information and double click on the name.

3. When necessary, replace unwanted gel tracks with the desired lanes using "File" and "Change link": type the number of the desired lane in "Entry number" and double click the name of the gel of choice. Record all changed links, since this information will be inaccessible after the combined gel has been saved.

4. Click "Add new component" to add more fingerprints.

5. Repeat 3 and 4, until all your fingerprints are combined.

6. Choose "File" and "Save combined gel" or "Save superimposed gel" to save the gel of choice.

Comment: Check the combined gel using cluster analysis or other analysis tools before you close the 'Combined Gel' window. This way you are able to save an improved version of the combined gel without going through the whole routine again. When satisfied the 'Combined Gel' window can be closed.
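
Linear combination of fingerprints can be pictured as joining, for each strain, the normalized profiles obtained on different gels end to end. The Python sketch below illustrates this under the assumption that each gel is represented as a mapping from strain name to profile; the function name, the gel names and the data layout are hypothetical and do not reflect GelCompar's file format.

```python
def combine_linear(gels, strains):
    """Concatenate the profiles of each strain across several gels, in gel order."""
    combined = {}
    for strain in strains:
        profile = []
        for gel in gels:
            if strain not in gel:
                raise KeyError(f"strain {strain!r} is missing from one of the gels")
            profile.extend(gel[strain])
        combined[strain] = profile
    return combined

# Hypothetical, very short profiles of two strains on two different gels
gel_one = {"strain1": [1, 2], "strain2": [3, 4]}
gel_two = {"strain1": [5], "strain2": [6]}
print(combine_linear([gel_one, gel_two], ["strain1", "strain2"]))
```

The combined profiles can then be compared with the same similarity coefficients as single fingerprints, provided every strain contributes its components in the same order.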

* Hierarchical cluster analysis using GelCompar.

Fingerprints can be grouped by similarity based on bands or whole densitometric curves.

1. When necessary assign bands as described above.

2. Create a list as described above.

3. Choose band based (comparison clustering bands) or curve based (comparison clustering correlation) comparison. For rep-PCR genomic fingerprints we prefer the product-moment correlation coefficient (comparison clustering correlation) (see Fig. 5, 6, 7).

4. Choose similarity coefficient.

5. Choose clustering algorithm, UPGMA (see Fig. 5, 6, 7), Ward's or Neighbour Joining.

6. Evaluate dendrograms.

Comment: Similarity coefficients are relative values, not absolute percentages, but for convenience they are given as values between 0 and 100 (see Fig. 5C, 6, 7).
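
To show what the UPGMA algorithm does with a matrix of similarity values, the following Python sketch repeatedly joins the two most similar clusters and averages their similarities to all other clusters, weighted by cluster size. It is a minimal illustration with a hypothetical function name and an invented 4 x 4 similarity matrix, not the GelCompar implementation.

```python
def upgma(labels, sim_matrix):
    """Cluster by UPGMA; return the merges as (left subtree, right subtree, similarity)."""
    n = len(labels)
    trees = {i: labels[i] for i in range(n)}   # cluster id -> nested tuple of labels
    sizes = {i: 1 for i in range(n)}           # cluster id -> number of lanes
    sim = {(i, j): sim_matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    merges = []
    while len(trees) > 1:
        a, b = max(sim, key=sim.get)           # most similar pair of clusters
        merges.append((trees[a], trees[b], sim.pop((a, b))))
        for c in list(trees):
            if c in (a, b):
                continue
            s_ac = sim.pop((min(a, c), max(a, c)))
            s_bc = sim.pop((min(b, c), max(b, c)))
            # Size-weighted mean = unweighted average over all original lanes
            sim[(c, next_id)] = (sizes[a] * s_ac + sizes[b] * s_bc) / (sizes[a] + sizes[b])
        trees[next_id] = (trees[a], trees[b])
        sizes[next_id] = sizes[a] + sizes[b]
        for old in (a, b):
            del trees[old], sizes[old]
        next_id += 1
    return merges

# Invented similarity values (0-100) for four lanes
labels = ["A", "B", "C", "D"]
matrix = [
    [100, 90, 40, 30],
    [90, 100, 50, 20],
    [40, 50, 100, 80],
    [30, 20, 80, 100],
]
for left, right, level in upgma(labels, matrix):
    print(left, right, level)
```

In this example A and B join at 90, C and D at 80, and the two pairs join at 35, which is the mean of the four between-group similarities (40, 30, 50 and 20).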

* Principal Components Analysis.

Non-hierarchical methods such as Principal Components Analysis (PCA) are useful alternatives to the hierarchical clustering methods described above. For an example of this analysis see Fig. 8.

1. Create a list as described above.

2. Choose "Comparison", "PCA" or click on the icon.

3. A three-dimensional representation of the taxonomic units, shown as clouds of dots in a spatial arrangement, now appears. Samples in the PCA of GelCompar are labelled by the names assigned to the tracks in "Conversion". In this way a unique label is automatically assigned to each group.
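
The principle of PCA on densitometric profiles can be sketched in plain Python as follows: the profiles are centered, the leading principal axes are found by power iteration with deflation, and each sample is projected onto those axes to obtain its coordinates. This is an illustrative sketch only; the function names, the iteration count, the seeded random start vector and the toy profiles are assumptions, and GelCompar's PCA module may compute and scale its axes differently.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pca_scores(profiles, n_components=3, iterations=200, seed=0):
    """Coordinates of each profile on the first n_components principal axes."""
    n, d = len(profiles), len(profiles[0])
    means = [sum(p[j] for p in profiles) / n for j in range(d)]
    centered = [[p[j] - means[j] for j in range(d)] for p in profiles]
    residual = [row[:] for row in centered]
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    components = []
    for _comp in range(n_components):
        vec = [rng.random() for _j in range(d)]
        start_norm = math.sqrt(dot(vec, vec))
        vec = [x / start_norm for x in vec]
        for _it in range(iterations):
            # One power-iteration step: multiply by X^T X and renormalize
            proj = [dot(row, vec) for row in residual]
            new = [sum(proj[i] * residual[i][j] for i in range(n)) for j in range(d)]
            norm = math.sqrt(dot(new, new))
            if norm == 0.0:
                break  # no variance left along any direction
            vec = [x / norm for x in new]
        components.append(vec)
        # Deflation: remove the variance explained by this axis
        proj = [dot(row, vec) for row in residual]
        residual = [[row[j] - s * vec[j] for j in range(d)]
                    for row, s in zip(residual, proj)]
    return [[dot(row, comp) for comp in components] for row in centered]

# Toy profiles: two samples with signal at the first point, two at the last point
profiles = [[10, 0, 0], [9, 1, 0], [0, 0, 10], [1, 0, 9]]
for label, coords in zip(["s1", "s2", "s3", "s4"], pca_scores(profiles, n_components=2)):
    print(label, coords)
```

Samples with similar profiles receive similar coordinates, so the two groups in the toy data end up on opposite sides of the first axis; the sign of each axis is arbitrary.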
