Published online 16 October 2007
Published in J Environ Qual 36:1661-1669 (2007)
DOI: 10.2134/jeq2006.0555
© 2007 American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
TECHNICAL REPORTS
Surface Water Quality
Assessment of the 16S-23S rDNA Intergenic Spacer Region in Enterococcus spp. for Microbial Source Tracking
J. W. Dickerson, Jr. (a,*), J. B. Crozier (b), C. Hagedorn (a), and A. Hassall (a)
(a) Dep. of Crop and Soil Environmental Sciences, 330 Smyth Hall, Virginia Polytechnic Inst. and State Univ., Blacksburg, VA 24061
(b) Dep. of Biology, Roanoke College, Salem, VA 24153
* Corresponding author (chagedor{at}vt.edu).
Received for publication December 21, 2006.
ABSTRACT
A new library-based microbial source tracking (MST) approach intended for initial application in the coastal waters of Virginia was evaluated. Host-origin isolates of Enterococcus spp. were collected from beaches and the surrounding tidewater region of Virginia and used to construct a library based on the pattern of DNA band lengths produced by the amplification of the 16S-23S rDNA intergenic spacer (IGS) region, and subsequent digestion with MboI. Initial results from small host-origin libraries (64 and 200 total isolates) with discriminant analysis (DA) and logistic regression (LR) yielded high average rates of correct classification (ARCC) for a four-source classification split (birds, dogs, sewage, and wildlife), with ARCCs ranging from 83 to 100%. However, the poor results obtained when classification was attempted on a non-library validation set (VS, ARCCs of 47 and 48%, respectively, using DA and LR) demonstrated that a library of 200 isolates was insufficient to adequately represent the diversity of the enterococci in the sampled region. An increase in the library size to 1029 total isolates was accompanied by a reduction in the ARCC of the library to 42.7% with DA and 45.7% with LR, plus similarly poor results obtained from the VS. The low correct classification rates generated by the larger known-source library were unsuitable for field application. Many reported MST methods have been based on results obtained using small host-origin libraries without external validation. Our results indicate that such an approach can be very misleading, and that larger libraries and external validation are essential for the confirmation of preliminary results.
Abbreviations: ARCC, average rate of correct classification; DA, discriminant analysis; IGS, intergenic spacer; LR, logistic regression; MST, microbial source tracking; RCC, rate of correct classification; VS, validation set.
INTRODUCTION
The advent of microbial source tracking (MST) over the last decade has provided watershed managers with a means to discriminate between fecal sources polluting surface waters. The underlying premise of MST is that certain enteric bacterial strains are uniquely adapted to, and thus reside exclusively in, the gastrointestinal tract of a single host organism or a group of closely related host organisms. Differences in the phenotypes or genotypes of these strains can be used to determine the relative contributions of animal sources to the fecal pollution in a water body. Two major classes of MST methods are currently being developed and utilized in surface waters across the world (Sinton et al., 1998; Scott et al., 2002; Simpson et al., 2002; Pond et al., 2004; Blanch et al., 2006).
The earliest and most commonly applied genotypic and phenotypic methods involve the construction of a host-origin database, or library, of isolates from known fecal sources. The library provides a collection of possible fingerprint patterns that can be compared directly with the fingerprints of isolates of unknown origin. The most commonly used phenotypic methods have employed differences in antibiotic resistance patterns (Wiggins, 1996; Hagedorn et al., 1999; Harwood et al., 2000, 2003; Graves et al., 2002; Whitlock et al., 2002) or the ability to utilize varying nutrient sources (Hagedorn et al., 2003; Harwood et al., 2003; Ahmed et al., 2005) of indicator organisms to determine fecal origins. An even greater variety of genotypic methods has been reported in the MST literature, including ribotyping (Parveen et al., 1999; Carson et al., 2001; Hartel et al., 2002; Carson et al., 2003; Scott et al., 2003), pulsed-field gel electrophoresis (PFGE) (Simmons et al., 2002; Samadpour et al., 2005), microarrays (Indest et al., 2005), and repetitive sequence polymerase chain reaction (rep-PCR) (Dombek et al., 2000; Carson et al., 2003; Seurinck et al., 2003; Johnson et al., 2004).
Soon after the development of library-based MST methods, researchers began looking for organisms, or sequences within the genome of organisms, that were consistently exclusive to pollution from a particular fecal source. These approaches are known as library-independent methods. In addition to using source-specific markers found in some recognized indicator organisms (Scott et al., 2005; USEPA, 2005), researchers have frequently expanded the search into non-indicator fecal organisms such as Bifidobacterium spp. (Rhodes and Kator, 1999), Bacteroides (Bernhard and Field, 2000; Field et al., 2003; Simpson et al., 2003), F-specific DNA and RNA coliphages (Hsu et al., 1995; Cole et al., 2003; Long et al., 2005; Sundram et al., 2006), methanogens (Ufnar et al., 2006), and human- or livestock-specific enteric viruses such as enterovirus (Noble and Fuhrman, 2001; Fong et al., 2005), adenovirus (Jiang et al., 2001; Maluquer de Motes et al., 2004; Fong et al., 2005), and teschovirus (Jimenez-Clavero et al., 2003).
A major drawback of library-based methods to date has been observable geographical limitations (Hartel et al., 2002; Scott et al., 2003), although similar spatial restrictions have been seen in at least one library-independent method as well (Hamilton et al., 2006). An additional disadvantage of the current status of library-independent methods is that only a limited number of methods are capable of consistently quantifying the contributing proportions of fecal inputs in polluted waters (Noble et al., 2003; Field et al., 2003), as most of these methods presently serve primarily as a presence/absence test of human, and a limited number of non-human, sources. Although human fecal contamination presents the greatest risk to public health (Sinton et al., 1993), additional information on other potential sources is often useful in attempts to lower indicator bacteria concentrations to within USEPA levels of acceptable risk (USEPA, 1986).
A few recent studies have reported success using E. coli 16S-23S ribosomal DNA (rDNA) intergenic spacer (IGS) regions to discriminate between humans, cows, and chickens (Buchan et al., 2001), and to a lesser extent E. coli from sewage, horses, cows, gulls, and dogs (Seurinck et al., 2003). The absence of selection pressures in the IGS region, as opposed to the highly conserved nature of the bordering rDNA, has proven useful as a target site for the molecular subtyping of a variety of pathogenic bacteria (Guertler and Stanisich, 1996; Graham et al., 1996; Riffard et al., 1998; Chun et al., 1999; Stubbs et al., 1999), providing a simpler (in both equipment needed and level of training required) and more cost-effective assay than more traditional MST methods such as PFGE (Bedendo and Pignatari, 2000) or ribotyping (Carson et al., 2003). The rDNA operon is frequently present in multiple copies, and its arrangement in bacteria is almost always 16S-IGS-23S-IGS-5S. The amplification of 16S-23S rDNA IGS regions within a bacterial genus such as Enterococcus can be performed using primers that recognize the highly conserved sequences found in the flanking regions of 16S and 23S rDNA. The Enterococcus genome contains as many as six rDNA operons (Sechi and Daneo-Moore, 1993), allowing for the amplification and digestion of multiple amplicons. In MST, this has the potential to increase the diversity of banding patterns produced among strains in the search for banding or fingerprint patterns unique to strains from a specific host organism.
The objective of this study was to develop a method of detecting and quantifying source-specific enterococci from birds (ducks, geese, and gulls), dogs, sewage (presumed human), and wildlife (deer and raccoons). The completion of successful laboratory testing would allow for the application of a new library-based MST method in the coastal regions of Virginia. Enterococci were selected as the target organisms due to their abundance in the fecal matter of warm-blooded animals (Devriese et al., 1987) and usage as fecal indicators in weekly monitoring procedures in the marine and coastal waters of Virginia (VDH, unpublished data, 2004). As library size requirements likely vary between methods and watersheds, the use of a non-library collection of known-source isolates, or validation set (VS), was employed for external validation to better assess the number of isolates required to represent enterococcal diversity in the target watershed. This study addresses the method-specific sampling and performance criteria described by Stoeckel and Harwood (2007).
MATERIALS AND METHODS
Collection of Fecal Samples
Fecal samples from known animal sources were collected at public beaches, dog parks, and nature parks within the Tidewater region of southeastern Virginia from February to September of 2005, as described in Dickerson et al. (2007). Sewage influent samples from the twelve treatment plants within the region were provided on four separate occasions during this time period by the Hampton Roads Sanitation District. None of the wastewater treatment plants contained combined sewers, so sewage samples should have contained composite samples of enterococci of almost exclusively human origin. Both fresh and dried fecal samples were collected from animals in each category (birds, dogs, and wildlife), except sewage, based on opportunity at the time samples were collected. Gulls identified were predominantly Ring-billed (Larus delawarensis) and Herring (Larus argentatus), as well as an occasional Laughing gull (Larus atricilla). The geese and ducks from which scat was obtained were identified as Snow goose (Chen caerulescens), Canada goose (Branta canadensis), and Mallard (Anas platyrhynchos). Dog (Canis familiaris) fecal samples were collected from local beaches and from several dog parks in the area. Wildlife (deer [Odocoileus virginianus] and raccoon [Procyon lotor]) scat was collected in Chickahominy Wildlife Management Area, Waller Mill Park, Newport News Park, and Pocahontas and York River State Parks in Eastern Virginia.
Isolation of Enterococci
A portion of each fecal or untreated sewage sample was diluted into tubes of sterile distilled deionized (DDI) water and spread on m-Enterococcus agar (Baltimore Biologics Laboratory, BBL). After a 48-h incubation at 35°C (APHA, 1998), no more than 4 randomly selected red to burgundy colonies from each non-sewage source, and no more than 12 from each sewage source, were picked from each plate using sterile toothpicks. All isolates were inoculated into Enterococcosel Broth (BBL) in a 96-well microtiter plate for confirmation as enterococci (black color after incubation). All confirmed enterococcal isolates were regrown on TSA agar for use in PCR. The numbers of fecal samples collected and isolates selected from each sample, and the period over which fecal samples were obtained, were similar to other reports where PCR was used for MST (Carson et al., 2003; Hamilton et al., 2006).
Polymerase Chain Reaction
Polymerase chain reaction was used to amplify Enterococcus IGS regions located between the 16S and 23S rDNA regions. Based on sequences in the GenBank database, primers were designed manually to anneal to highly conserved downstream 16S rDNA and upstream 23S rDNA sequences in virtually all enterococci such that entire IGS regions could be amplified from each isolate (primers produced by Invitrogen Corporation). Approximately 1.0 µL of a pure culture of cells was diluted into 300.0 µL of sterile DDI water to serve as a template for PCR. PCR was performed using PuReTaq Ready-To-Go PCR beads (Amersham Biosciences), in 22.0 µL of sterile DDI water, 1.0 µL of 16S primer (5'-GCCTAAGGTGGGATAGATGA-3', novel to this study), 1.0 µL of 23S primer (5'-CCCGTCCTTCATCGGCTCCTA-3', novel to this study), and 1.0 µL of diluted cell culture. Primers were used at a final concentration of approximately 0.2 µmol L–1. The PCR was initiated by incubating the reaction mixture at 95°C for 6 min to lyse the cells, followed by 35 cycles of 1 min each at 94, 57, and 72°C. The final elongation step was completed at 72°C for 7 min, followed by a 4°C hold of all reaction mixtures. All PCR experiments contained a positive control (E. faecalis) to assess method reproducibility and stability (numbers of bands and length of each).
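The reaction setup above can be checked with a little arithmetic. The following is a minimal sketch, not part of the published protocol: the primer sequences, volumes, and final primer concentration are taken from the text, while the helper name `gc_fraction`, the variable names, and the assumption that the PCR bead contributes negligible volume are illustrative only.

```python
# Primer sequences as reported in the text (5' to 3').
PRIMER_16S = "GCCTAAGGTGGGATAGATGA"
PRIMER_23S = "CCCGTCCTTCATCGGCTCCTA"


def gc_fraction(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Liquid volumes per reaction (µL) as listed in the text.
# Assumption: the dried PCR bead adds no appreciable volume.
volumes_ul = {
    "sterile_ddi_water": 22.0,
    "primer_16S": 1.0,
    "primer_23S": 1.0,
    "diluted_cells": 1.0,
}
reaction_volume_ul = sum(volumes_ul.values())

# Final primer concentration reported in the text (µmol/L).
FINAL_PRIMER_UMOL_PER_L = 0.2

# Implied concentration of the primer working stock, from C1 * V1 = C2 * V2.
primer_stock_umol_per_l = (
    FINAL_PRIMER_UMOL_PER_L * reaction_volume_ul / volumes_ul["primer_16S"]
)

for name, seq in (("16S", PRIMER_16S), ("23S", PRIMER_23S)):
    print(f"{name} primer: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
print(f"reaction volume: {reaction_volume_ul} µL")
print(f"implied primer stock: about {primer_stock_umol_per_l:.1f} µmol/L")
```

Under these assumptions the stated volumes and the stated final primer concentration are mutually consistent with a primer working stock of roughly 5 µmol L–1, a value the text itself does not report.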
Restriction Digests
Restriction enzyme digests consisted of 10 µL of restriction digest mix (34 µL of 10X Buffer C, 17 µL of BSA, 17 µL of spermidine [100 mmol L–1], 93.5 µL of sterile DDI water, and 8.5 µL of MboI restriction enzyme [Promega; 5'-^GATC-3', 3'-CTAG^-5']) combined with 10 µL of PCR product in a centrifuge tube, centrifuged briefly, and incubated at 37°C for 3.5 h.
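To make the digestion step concrete, the sketch below simulates an MboI digest of a linear amplicon in software and totals the digest mix described above. It is an illustration under stated assumptions, not the authors' procedure: the function `mboi_fragments` and the toy sequence are hypothetical, and the sketch assumes complete digestion. Because GATC is palindromic and MboI cuts immediately 5' of the G, scanning one strand is sufficient to locate every cut.

```python
def mboi_fragments(amplicon):
    """Return fragment lengths (bp) from a complete MboI digest of a linear amplicon.

    MboI recognizes GATC and cuts immediately before the G (5'-^GATC-3').
    """
    seq = amplicon.upper()
    cuts = [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]
    # A site at position 0 would cut before the first base and release nothing.
    bounds = [0] + [c for c in cuts if c > 0] + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy amplicon with two GATC sites (not a real IGS sequence).
toy = "AAAA" + "GATC" + "TTTTTT" + "GATC" + "CC"
print(mboi_fragments(toy))

# Digest mix components (µL) as listed in the text.
digest_mix_ul = {
    "10X Buffer C": 34.0,
    "BSA": 17.0,
    "spermidine (100 mmol/L)": 17.0,
    "sterile DDI water": 93.5,
    "MboI": 8.5,
}
total_mix_ul = sum(digest_mix_ul.values())
MIX_PER_DIGEST_UL = 10.0  # from the text: 10 µL of mix per 10 µL of PCR product
digests_per_batch = int(total_mix_ul // MIX_PER_DIGEST_UL)
print(f"{total_mix_ul} µL of mix, enough for {digests_per_batch} digests")
```

The listed volumes sum to 170 µL, so one batch of mix covers 17 digests before any allowance for pipetting loss.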
Gel Electrophoresis
Restriction enzyme digests were mixed with loading dye and loaded on a 3% horizontal agarose gel (Agarose Low Melting, Fisher Scientific), with several 100 bp ladders and the positive control, to detect polymorphisms among isolates. All gels were run in 1X TAE (10 mmol L–1 Tris, 5 mmol L–1 acetate, 0.1 mmol L–1 EDTA, pH 7.4 (Promega)) for 80 min at 100 V with standard gels (10 by 15 cm). Gels were stained for 3 h in a solution of 2X SYBR Green I (Cambrex Bio Science Rockland, Inc), and photographed on a UV mini-transilluminator with a Polaroid DS34 camera. All photographs were digitally scanned in Gel-Pro 3.1 using a HP Scanjet 6300C.
Statistical Analysis of Polymerase Chain Reaction Profiles for Source Prediction
Each digest, when visualized, exhibited between 4 and 14 total bands. Band lengths were quantified using Gel-Pro Software and converted to binary data based on 100 base-pair length categories ranging from <100 to >1000 bp in length. Analyses were conducted using both discriminant analysis (DA) and logistic regression (LR) in SAS-JMP statistical software (version 5.0.1; SAS Institute, 2003), comparing isolates from within the library (all isolates were left in the library) and the VS against the model constructed using patterns from isolates in the host-origin library. The classification table generated in SAS-JMP was used to calculate the rate of correct classification (RCC) of each group of isolates. The ARCC for the library or VS of isolates was calculated by dividing the sum of the number of isolates correctly classified across all four categories by the total number of isolates classified, similar to the estimate of correct classification (ECC) used in Albert et al. (2003).
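The encoding and scoring described above can be sketched in a few lines. This is a hedged illustration rather than the Gel-Pro or SAS-JMP workflow: the function names are hypothetical, the exact bin edges are an assumption (eleven categories, with bands of 1000 bp or longer pooled in the last one), and the classification table contains invented counts used only to show the arithmetic.

```python
# Assumed categories: <100, 100-199, ..., 900-999, and >=1000 bp.
N_BINS = 11


def encode_bands(band_lengths_bp):
    """Convert band lengths (bp) into a presence/absence pattern over 100-bp bins."""
    pattern = [0] * N_BINS
    for bp in band_lengths_bp:
        pattern[min(int(bp) // 100, N_BINS - 1)] = 1
    return tuple(pattern)


def rcc(table, source):
    """Rate of correct classification for one source category."""
    row = table[source]
    return row.get(source, 0) / sum(row.values())


def arcc(table):
    """Correctly classified isolates across all categories over all isolates classified."""
    correct = sum(row.get(source, 0) for source, row in table.items())
    total = sum(sum(row.values()) for row in table.values())
    return correct / total


# Two bands falling in the same 100-bp category collapse into a single 1.
print(encode_bands([32, 250, 260, 1055]))

# Hypothetical classification table: table[true_source][predicted_source] = count.
toy_table = {
    "birds": {"birds": 6, "dogs": 2, "sewage": 1, "wildlife": 1},
    "dogs": {"dogs": 8, "birds": 1, "wildlife": 1},
    "sewage": {"sewage": 9, "dogs": 1},
    "wildlife": {"wildlife": 5, "birds": 3, "dogs": 2},
}
for source in toy_table:
    print(source, f"RCC = {rcc(toy_table, source):.2f}")
print(f"ARCC = {arcc(toy_table):.2f}")
```

Note that the ARCC as defined in the text pools isolates across categories, so with unequal category sizes it can differ from the simple mean of the four RCCs.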
Creation of Known-Source Library (Host-Origin Database)
An initial library of 64 isolates was created using 16 isolates from each source category, obtained from two sewage, four dog, five bird, and five wildlife fecal samples collected in February 2005. The library was increased in size to 200 isolates (50 per category) using isolates collected (May 2005) from eight sewage, 14 dog, 15 bird, and 15 wildlife fecal samples. The final library contained a total of 1029 isolates (201 birds, 353 sewage, 266 dogs, 209 wildlife), from 52 bird, 42 sewage, 70 dog, and 54 wildlife fecal/sewage samples, with additions made from June-September 2005 collections. Clonal isolates within the library were not removed for initial tests. In addition to isolates collected for library construction, 100 additional isolates (12 birds, 48 sewage, and 20 each of dogs and wildlife), collected in May 2005 from four birds, six sewage, seven dogs, and seven wildlife fecal/sewage samples, were held out of the known-source library for use as a VS. No isolates obtained for the VS were collected from the same fecal/sewage samples used for library construction.
Following completion of the 1029-isolate library, a one-time random sampling of 201 isolates was selected (with the JMP software) from each of the three largest source categories (dogs, sewage, and wildlife) to generate a balanced known-source library of 804 isolates (201 isolates per source category). An additional library (323 isolates) and VS (62 isolates), containing only unique banding patterns, were later generated from the larger 1029-isolate library and 100-isolate VS through the detection and removal of all clonal isolates within and across source categories.
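The two library modifications described above, balancing and clone removal, can be sketched as follows. This is an illustrative sketch and not the JMP procedure used in the study: the function names, the fixed seed, and the integer placeholders standing in for isolates are hypothetical. In particular, the text does not say how a pattern shared by several categories was handled, so keeping only the first occurrence is an assumption.

```python
import random


def balance(library, per_category, seed=0):
    """Randomly subsample each source category down to per_category isolates."""
    rng = random.Random(seed)
    balanced = {}
    for source, isolates in library.items():
        if len(isolates) > per_category:
            balanced[source] = rng.sample(isolates, per_category)
        else:
            balanced[source] = list(isolates)
    return balanced


def remove_clones(library):
    """Keep one isolate per banding pattern, within and across categories.

    Assumption: the first occurrence of a pattern is kept and later copies
    are dropped, whichever category they belong to.
    """
    seen = set()
    unique = {source: [] for source in library}
    for source, patterns in library.items():
        for pattern in patterns:
            if pattern not in seen:
                seen.add(pattern)
                unique[source].append(pattern)
    return unique


# Category sizes from the text; integers stand in for real isolates.
sizes = {"birds": 201, "sewage": 353, "dogs": 266, "wildlife": 209}
library = {source: list(range(n)) for source, n in sizes.items()}
balanced = balance(library, per_category=201)
print({source: len(isolates) for source, isolates in balanced.items()})

# Hypothetical two-bin patterns to show clone removal.
toy = {"birds": [(1, 0), (1, 0), (0, 1)], "dogs": [(0, 1), (1, 1)]}
print(remove_clones(toy))
```

A different rule for shared patterns, such as discarding every isolate whose pattern appears in more than one category, would give a smaller library and could change classification rates.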
RESULTS
Polymerase chain reaction and subsequent restriction digest products were obtained for almost all isolates tested, yielding bands ranging from approximately 32 to 1055 bp in size. An initial test library was constructed using 64 total isolates (16 per source category) to select for the restriction endonuclease providing the most discriminating source-specific fingerprint patterns. This library produced an average rate of correct classification (ARCC) of 100% using MboI for the desired four-source split, with both DA (Table 1) and LR (Table 2), suggesting there was potential for this approach. A VS was not constructed for this early stage of library development.
Table 1. Classification table displaying the percentages (and number) of isolates classified using discriminant analysis for a 64-isolate library.
Table 2. Classification table displaying the percentages (and number) of isolates classified using logistic regression for a 64-isolate library.
Based on the success of the 64-isolate library (Tables 1 and 2), additional fecal samples were collected and enterococcal isolates fingerprinted, increasing the library size to 200 isolates (50 per source category). With the increased library size, the ARCC decreased from 100 to 83% with DA (Table 3), with the highest RCC of 88% in both the dogs and wildlife categories. Using LR (Table 4), a higher ARCC of 88% was achieved, with both dogs and sewage yielding RCCs of 94%. Although these classification rates would be considered acceptable and promising for most library-based MST methods based on previous studies (Harwood et al., 2000; Whitlock et al., 2002; Choi et al., 2003; VanOmmeren and Alm, 2006), the use of a VS of isolates was implemented to provide an additional means of assessing library capabilities. The VS isolates were correctly classified at considerably lower rates than those composing the library, with DA identifying only 47 of the 100 isolates correctly (ARCC = 47%, Table 3) and LR placing only 48 of the 100 isolates into the correct source category (ARCC = 48%, Table 4). The ineffectiveness of the known-source library in classifying non-library isolates suggested the library was of insufficient size and not representative of the strain diversity of enterococci in the coastal region of Virginia.
Table 3. Classification table displaying the percentages (and number) of isolates classified using discriminant analysis for a 200-isolate library.
Table 4. Classification table displaying the percentages (and number) of isolates classified using logistic regression for a 200-isolate library.
A final increase in the size of the library brought the total number of isolates to 1029 (201 birds, 266 dogs, 353 sewage, and 209 wildlife isolates). With DA (Table 5), the final ARCC for the classification of isolates composing the library (internal ARCC) was 42.7%, well below the classification rates of any libraries applied in field studies (Hagedorn et al., 1999; Harwood et al., 2000; Graves et al., 2002; Choi et al., 2003; Carroll et al., 2005). No source category in the 1029-isolate library produced an RCC greater than 70% with DA. The internal ARCC using LR (Table 6) was 45.7%, only slightly higher than the results generated by DA, with the wildlife category producing the highest RCC of 64.1%. Only 47 of the 100 VS isolates were correctly classified by the 1029-isolate library using DA (ARCC = 47%, Table 5), and only 53 of the 100 VS isolates (ARCC = 53%, Table 6) were correctly classified into their respective host categories with LR.
Table 5. Classification table displaying the percentages (and number) of isolates classified using discriminant analysis for a 1029-isolate library.
Table 6. Classification table displaying the percentages (and number) of isolates classified using logistic regression for a 1029-isolate library.
Only a slight overall improvement was seen in classification rates when library source categories were balanced (201 isolates per category) by randomly selecting a subset from each of the three largest source categories (Tables 7 and 8). With DA (Table 7), the internal ARCC for the library increased slightly, from 42.7 to 43.3%, by balancing the source categories. However, the VS of isolates showed a slight decrease, as the ARCC declined from 45.7 to 44%. Library isolates from individual source categories showed minor changes, with only the smallest original category (birds) displaying an RCC increase (+5.5 percentage points). In the VS, changes within individual categories were also minor, with the only major change (>5.0 percentage points) being a 25 percentage point decrease in the RCC of bird isolates.
Table 7. Classification table displaying the percentages (and number) of isolates classified with discriminant analysis for a source-category balanced 804-isolate library.
Table 8. Classification table displaying the percentages (and number) of isolates classified with logistic regression for a source-category balanced 804-isolate (201 per category) library.
When a classification model was generated using LR for the balanced source categories (Table 8), the ARCC for isolates within the library displayed a slight decrease, dropping from 45.7 to 45.1%. The VS isolates showed a larger decrease, as ten fewer isolates were correctly classified, decreasing the ARCC for the VS from 53 to 43%. Within the individual categories, birds and wildlife (the smallest categories in the original unbalanced library) yielded increases in RCC of 24.8 and 4.6 percentage points, respectively, once the source categories were equal in size. Conversely, the RCCs of the dogs and sewage categories decreased by 9.2 and 14.1 percentage points, respectively. The decrease in the ARCC for the VS of isolates was the largest average change seen. The RCCs of VS isolates from the two largest categories in the original library decreased by 10.4 (sewage) and 50 (dogs) percentage points.
The removal of clones from the 1029-isolate library reduced the library size to 323 unique isolates, a reduction of 68.6% (Tables 9 and 10). Although the RCCs increased in some source categories and declined in others with both statistical algorithms for the library with clones removed (compare Tables 5 and 9 for DA, Tables 6 and 10 for LR), there was little effect on the overall ARCCs and the values remained considerably lower than those reported for other non-clonal libraries of comparable size (Dombek et al., 2000; Guan et al., 2002; Seurinck et al., 2003; Lasalde et al., 2005; Duran et al., 2006; Vantarakis et al., 2006). For example, the ARCCs (with DA) for the clonal library (Table 5) and the non-clonal library (Table 9) were 43.9 and 53.6%, respectively. The VS ARCCs (with DA) for the clonal library (Table 5) and the non-clonal library (Table 9) were 49.9 and 52.5%, respectively.
Table 9. Classification table displaying the percentages (and number) of isolates classified with discriminant analysis for a 323-isolate library (clones removed from the 1029-isolate library, Table 5).
Table 10. Classification table displaying the percentages (and number) of isolates classified with logistic regression for a 323-isolate library (clones removed from the 1029-isolate library, Table 6).
DISCUSSION
A successful MST method requires the testing and/or fingerprinting of large numbers of fecal isolates within a geographic region to assess and account for host-strain diversity (Dickerson et al., 2007). The results of this study indicate that the validation of an MST method using isolates from a small number of sources tends to falsely inflate method effectiveness, which can result in improper and premature applications in field trials. The rDNA IGS method, both simpler to perform and less expensive than most other molecular MST methods, worked very well during the initial stages of testing (Tables 1 and 2), producing correct classification rates of 100% with both DA and LR. This initial success prompted the continued addition of known-source isolates, working toward the construction of a host-origin library of adequate size to undergo field evaluations in the coastal region of Virginia. For the library of 200 isolates (Tables 3 and 4), the ARCC remained reasonably high and provided sufficient discrimination between source categories using both statistical methods. However, the inability of the library to correctly classify isolates from the VS at a level comparable to those within the library indicated that the diversity of fecal isolates was inadequately represented. As the library continued to increase in size (to 1029 isolates), the method began to fail (Tables 5 and 6), with classification rates falling to levels unsuitable for source discrimination, and was thus deemed unsuccessful.
For this study, classification models were generated by both DA and LR, two parametric methods, for the identification of isolates both within and outside the known-source library. While DA has been widely applied in library-based source tracking methods (Wiggins, 1996; Hagedorn et al., 1999; Harwood et al., 2000; Graves et al., 2002), LR has, to date, remained unutilized in the field of MST. The most frequently implemented classification method for biomedical applications (Dreiseitl and Ohno-Machado, 2002), LR requires fewer assumptions than DA, which assumes normally distributed independent variables with equal variances within groups. Logistic regression has been shown to classify unknowns more effectively than DA under conditions of non-normality, such as those using binary explanatory variables (Press and Wilson, 1978).
While methods of internal classification are commonly used in library-based MST, the practicality of a method lies exclusively in the ability to correctly classify isolates of unknown origin, such as those from water samples. One of the major recommendations to emerge from the Southern California Coastal Water Research Project (SCCWRP) and United States Geological Survey (USGS) sponsored method comparison (MC) studies (Stewart et al., 2003; Stoeckel et al., 2004) was the usefulness of a VS of non-library isolates, as opposed to internal validations, to assess the effectiveness of a known-source library. Commonplace in the medical (Terrin et al., 2003) or statistical (Press and Wilson, 1978) fields, the external verification of library effectiveness has been used in very few MST studies (Moore et al., 2005). However, modifications of the VSs used in each of the MC studies are necessary to correct potential flaws that were present in each. In the SCCWRP study, isolates were obtained from the same set of fecal samples used to construct the known-source library. The underperformance of most MST methods in this study, even with the use of a seemingly favorable VS, may have been the result of a greater level of fecal diversity than previously expected in a single fecal sample, or inadequate reproducibility of many MST methods (Stewart et al., 2003). The USGS sponsored MC study (Stoeckel et al., 2004) used a VS of isolates collected 9 mo after those used for library construction. The overall inadequate performance of the methods involved resurrected previous concerns of temporal instability of strains within host organisms (Jenkins et al., 2003).
Therefore, in this study, once the library size was increased to 200 isolates, comparable to numbers frequently reported in several publications (Dombek et al., 2000; Guan et al., 2002; Seurinck et al., 2003; Lasalde et al., 2005; Duran et al., 2006; Vantarakis et al., 2006), a VS of isolates (collected simultaneously, but from different fecal samples) was fingerprinted to serve as an additional means of predicting the classification ability of the known-source library. In library-based MST methods, small libraries typically produce high ARCCs solely due to the random placement of isolates into defined categories (Whitlock et al., 2002). Poor classification rates for non-library isolates, and marked differences between library isolates and those not a part of the library, are indicative of a library that is too small and does not contain enough isolates to represent strain diversity in a watershed. Once the library size was increased to over 1000 isolates, classification rates decreased sharply, yielding correct classification rates below 50% and unsuitable for field applications.
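The gap between internal and external classification rates can be reproduced on purely synthetic data. The sketch below is not the DA or LR analysis used in this study: it swaps in a simple nearest-pattern classifier (smallest Hamming distance), and all names, the seed, and the random patterns are hypothetical. Source labels are assigned independently of the banding patterns, so the data carry no real source signal, and the eleven-bin pattern length is an assumption.

```python
import random

SOURCES = ["birds", "dogs", "sewage", "wildlife"]
N_BINS = 11  # assumed number of 100-bp band-length categories


def to_bits(value):
    """Expand an integer into a presence/absence pattern of N_BINS bits."""
    return tuple((value >> k) & 1 for k in range(N_BINS))


def hamming(a, b):
    """Number of positions at which two patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(pattern, library):
    """Assign the source of the library isolate with the closest pattern."""
    nearest = min(library, key=lambda entry: hamming(pattern, entry[0]))
    return nearest[1]


def accuracy(isolates, library):
    """Fraction of isolates assigned to their labelled source."""
    hits = sum(classify(pattern, library) == source for pattern, source in isolates)
    return hits / len(isolates)


rng = random.Random(2007)
# 164 distinct random patterns; labels are cycled and unrelated to the patterns.
values = rng.sample(range(2 ** N_BINS), 164)
labelled = [(to_bits(v), SOURCES[i % 4]) for i, v in enumerate(values)]
library, validation_set = labelled[:64], labelled[64:]

internal = accuracy(library, library)          # isolates left in the library
external = accuracy(validation_set, library)   # isolates held out of the library
print(f"internal rate: {internal:.2f}")
print(f"external rate: {external:.2f}")
```

Because every library isolate matches itself exactly, the internal rate is 100% by construction even though the labels are meaningless, while the held-out isolates can be expected to classify at roughly chance level for four categories. DA and LR do not memorize the library in the same way, but the sketch shows why an internal rate alone says little about performance on isolates of unknown origin.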
No attempt was made during method development to distinguish between species of Enterococcus. Although the potential existed for greater source discrimination if analyses were limited to one or a few Enterococcus species, and such a restriction may have resulted in a workable method, speciating isolates would increase both the time and the cost of the method, making it less desirable for application in the field. In addition, as the population and proportions of specific enterococcal species vary between organisms (Lauková and Juris, 1997; Wheeler et al., 2002), limiting analyses to specific species could unfairly bias relative fecal contributions or potentially eliminate the ability to detect animal sources not carrying the species selected.
The differences seen in the ARCC between libraries containing balanced and unbalanced source categories were small regardless of whether DA or LR was used to generate classification models (Tables 5–8). The ARCC for the VS decreased using both algorithms, possibly due to a loss in overall library representativeness from a decrease in the total number of isolates. The major difference resulting from a balanced library could only be seen within individual source categories. The effects of balancing a library on the RCC for a category were minor for both the library and VS isolates when using DA. However, using LR, larger changes were observed in both library and VS isolates when the library was balanced, as an inverse relationship was seen between the number of isolates lost from the original unbalanced library and the change in RCC for a given source category. Thus, decreases in the RCC were seen for the two categories (dogs and sewage) losing a significant portion of their total isolates, while the two smallest categories, forfeiting only eight (wildlife) or zero (birds) isolates, showed an increase (or no decrease) in the category RCC. For the libraries generated in this study, there was also little to be gained by removing clones. This would indicate that an inadequate clonal library cannot be substantially improved by removing clones; the non-clonal library will still be inadequate.
The conclusions of this study stress the danger of relying on a small number of isolates/fecal sources when assessing the effectiveness of both library-based and, possibly, library-independent MST methods. Levels of strain diversity are undoubtedly method-dependent; however, host strain variability is probably greater than previously assumed, especially in a non-conserved DNA region such as IGS, and small libraries are likely not capable of reporting results with a high level of confidence. The SCCWRP MC study concluded that libraries of 300 were generally not successful at identifying fecal pollution in blind water samples, even when libraries were generated from the same fecal material used to construct the blinds (Griffith et al., 2003; Harwood et al., 2003; Myoda et al., 2003). Further research is needed into the diversity of genotypes and phenotypes of fecal bacteria both within a single host organism and within a host population (Stoeckel and Harwood, 2007). Based on the changes observed in source category RCCs in the modified libraries, additional research is needed into the robustness of commonly used classification algorithms when library source categories are unequally represented, as well as the effects of clone removal on the classification ability of a known-source library. The results suggest that a VS of isolates is a necessary tool for assessing the size requirements of a known-source library. This method was unsuccessful using the restriction enzyme MboI; however, one or more other restriction enzymes may provide greater source discrimination in future tests, even when applied to the same amplicon.
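As an illustration of the clone removal mentioned above, the following minimal sketch keeps one isolate per distinct band pattern within each source category. The working definition of a clone used here (isolates from the same source category with exactly the same set of band lengths) is an assumption made for the example and may differ from the criterion applied in this study; the sketch also ignores any tolerance in band-length measurement. The function name and data layout are hypothetical.

```python
def remove_clones(isolates):
    """Keep the first isolate for each (source, band pattern) combination.

    `isolates` is a list of (source, band_lengths) tuples, where band_lengths
    is an iterable of fragment sizes in base pairs (hypothetical layout).
    Assumption: two isolates are clones if they share a source category and
    an identical set of band lengths, regardless of the order listed.
    """
    seen = set()
    unique = []
    for source, bands in isolates:
        key = (source, tuple(sorted(bands)))
        if key not in seen:
            seen.add(key)
            unique.append((source, bands))
    return unique
```

Identical patterns from different source categories are deliberately retained, because removing them would discard exactly the cross-source overlap that lowers classification rates.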
ACKNOWLEDGMENTS
This research was funded by the Virginia Dep. of Health, Div. of Zoonotic and Environmental Epidemiology and an internal Roanoke College grant. We thank the Hampton Roads Sanitation District and the Virginia Dep. of Conservation and Recreation for assistance in the collection of sewage and fecal samples. Special thanks to Roanoke College students Tiffany Simpson, Margaret Mauney, and Marina Salama for the screening of multiple restriction enzymes and the initial testing of the method.
REFERENCES
- Ahmed, W., R. Neller, and M. Katouli. 2005. Host species-specific metabolic fingerprint database for enterococci and Escherichia coli and its application to identify sources of fecal contamination in surface waters. Appl. Environ. Microbiol. 71:4461–4468.
- Albert, J.M., J. Munakata-Marr, L. Tenorio, and R.L. Siegrist. 2003. Statistical evaluation of bacterial source tracking data obtained by rep-PCR DNA fingerprinting of Escherichia coli. Environ. Sci. Technol. 37:4554–4560.
- APHA. 1998. American Public Health Association, American Water Works Association, Water Pollution Control Federation. Standard methods for the examination of water and wastewater. 20th ed. American Public Health Assoc., Washington, DC.
- Bedendo, J., and A.C.C. Pignatari. 2000. Typing of Enterococcus faecium by polymerase chain reaction and pulsed field gel electrophoresis. Braz. J. Med. Biol. Res. 33:1269–1274.
- Bernhard, A.E., and K.G. Field. 2000. A PCR assay to discriminate human and ruminant feces on the basis of host differences in Bacteroides-Prevotella genes encoding 16S rRNA. Appl. Environ. Microbiol. 66:4571–4574.
- Blanch, A.R., L. Belanche-Munoz, X. Bonjoch, J. Ebdon, C. Gantzer, F. Lucena, J. Ottoson, C. Kourtis, A. Iverson, I. Kuhn, L. Moce, M. Muniesa, J. Schwartzbrod, S. Skraber, G.T. Papageorgiou, H. Taylor, J. Wallis, and J. Jofre. 2006. Integrated analysis of established and novel microbial and chemical methods for microbial source tracking. Appl. Environ. Microbiol. 72:5915–5926.
- Buchan, A., M. Alber, and R.E. Hodson. 2001. Strain-specific differentiation of environmental Escherichia coli isolates via denaturing gradient gel electrophoresis (DGGE) analysis of the 16S–23S intergenic spacer region. FEMS Microbiol. Ecol. 35:313–321.
- Carroll, S., M. Hargreaves, and A. Goonetilleke. 2005. Sourcing faecal pollution from onsite wastewater treatment systems in surface waters using antibiotic resistance analysis. J. Appl. Microbiol. 99:471–482.
- Carson, C.A., B.L. Shear, M.R. Ellersieck, and A. Asfaw. 2001. Identification of fecal Escherichia coli from humans and animals by ribotyping. Appl. Environ. Microbiol. 67:1503–1507.
- Carson, C.A., B.L. Shear, M.R. Ellersieck, and J.D. Schnell. 2003. Comparison of ribotyping and repetitive extragenic palindromic-PCR for identification of fecal Escherichia coli from humans and animals. Appl. Environ. Microbiol. 69:1836–1839.
- Choi, S., W. Chu, J. Brown, S.J. Becker, V.J. Harwood, and S.C. Jiang. 2003. Application of Enterococci antibiotic resistance patterns for contamination source identification at Huntington Beach, California. Mar. Pollut. Bull. 46:748–755.
- Chun, J., A. Huq, and R.R. Colwell. 1999. Analysis of 16S–23S rRNA intergenic spacer regions of Vibrio cholerae and Vibrio mimicus. Appl. Environ. Microbiol. 65:2202–2208.
- Cole, D., S.C. Long, and M.D. Sobsey. 2003. Evaluation of F+RNA and DNA coliphages as source-specific indicators of fecal contamination in surface waters. Appl. Environ. Microbiol. 69:6507–6514.
- Devriese, L.A., A. van de Kerckhove, R. Kilpper-Baelz, and K.H. Schleifer. 1987. Characterization and identification of Enterococcus species isolated from the intestines of animals. Int. J. Syst. Bacteriol. 37:257–259.
- Dickerson, J.W., Jr., C. Hagedorn, and A. Hassall. 2007. Detection and remediation of human-origin pollution at two public beaches in Virginia using multiple source tracking methods. Water Res. (in press).
- Dombek, P.E., L.K. Johnson, S.T. Zimmerley, and M.J. Sadowsky. 2000. Use of repetitive DNA sequences and the PCR to differentiate Escherichia coli isolates from human and animal sources. Appl. Environ. Microbiol. 66:2572–2577.
- Dreiseitl, S., and L. Ohno-Machado. 2002. Logistic regression and artificial neural network classification models: A methodology review. J. Biomed. Inform. 35:352–359.
- Duran, M., B.Z. Haznedaroglu, and D.H. Zitomer. 2006. Microbial source tracking using host specific FAME profiles of fecal coliforms. Water Res. 40:67–74.
- Field, K.G., A.E. Bernhard, and T.J. Brodeur. 2003. Molecular approaches to microbiological monitoring: Fecal source detection. Environ. Monit. Assess. 81:313–326.
- Fong, T., D.W. Griffin, and E.K. Lipp. 2005. Molecular assays for targeting human and bovine enteric viruses in coastal waters and their application for library-independent source-tracking. Appl. Environ. Microbiol. 71:2070–2078.
- Graham, T., E.J. Golsteyn-Thomas, V.P. Gannon, and J.E. Thomas. 1996. Genus- and species-specific detection of Listeria monocytogenes using polymerase chain reaction assays targeting the 16S/23S intergenic spacer region of the rRNA operon. Can. J. Microbiol. 42:1155–1162.
- Graves, A.K., C. Hagedorn, A. Teetor, M. Mahal, A.M. Booth, and R.B. Reneau, Jr. 2002. Determining sources of fecal pollution in water for a rural Virginia watershed. J. Environ. Qual. 31:1300–1308.
- Griffith, J.F., S.B. Weisberg, and C.D. McGee. 2003. Evaluation of microbial source tracking methods using mixed fecal sources in aqueous test samples. J. Water Health 1:141–151.
- Guan, S., R. Xu, S. Chen, J. Odumeru, and C. Gyles. 2002. Development of a procedure for discriminating among Escherichia coli isolates from animal and human sources. Appl. Environ. Microbiol. 68:2690–2698.
- Guertler, V., and V.A. Stanisich. 1996. New approaches to typing and identification of bacteria using the 16S–23S rDNA spacer region. Microbiol. 145:3–16.
- Hagedorn, C., J.B. Crozier, K.A. Mentz, A.M. Booth, A.K. Graves, N.J. Nelson, and R.B. Reneau, Jr. 2003. Carbon source utilization profiles as a method to identify sources of fecal pollution in water. J. Appl. Microbiol. 94:792–799.
- Hagedorn, C., S.L. Robinson, J.R. Filtz, S.M. Grubbs, T.A. Angier, and R.B. Reneau, Jr. 1999. Determining sources of fecal pollution in a rural Virginia watershed with antibiotic resistance patterns in fecal streptococci. Appl. Environ. Microbiol. 65:5522–5531.
- Hamilton, M.J., T. Yan, and M.J. Sadowsky. 2006. Development of goose- and duck-specific DNA markers to determine sources of Escherichia coli in waterways. Appl. Environ. Microbiol. 72:4012–4019.
- Hartel, P.G., J.D. Summer, J.L. Hill, V. Collins, J.A. Entry, and W.I. Segars. 2002. Geographic variability of Escherichia coli ribotypes from animals in Idaho and Georgia. J. Environ. Qual. 31:1273–1278.
- Harwood, V.J., J. Whitlock, and V.H. Withington. 2000. Classification of the antibiotic resistance patterns of indicator bacteria by discriminant analysis: Use in predicting the source of fecal contamination in subtropical Florida waters. Appl. Environ. Microbiol. 66:3698–3704.
- Harwood, V.J., B. Wiggins, C. Hagedorn, R.D. Ellender, J. Gooch, J. Kern, M. Samadpour, A.C.H. Chapman, and B.J. Robinson. 2003. Phenotypic library-based microbial source tracking methods: Efficacy in the California collaborative study. J. Water Health 1:153–166.
- Hsu, F., Y.S.C. Shieh, J. van Duin, M.J. Beekwilder, and M.D. Sobsey. 1995. Genotyping male-specific RNA coliphages by hybridization with oligonucleotide probes. Appl. Environ. Microbiol. 61:3960–3966.
- Indest, K.J., K. Betts, and J.S. Furey. 2005. Application of oligonucleotide microarrays for bacterial source tracking of environmental Enterococcus sp. isolates. Int. J. Environ. Res. Public Health 2:175–185.
- Jenkins, M.B., P.G. Hartel, T.J. Olexa, and J.A. Stuedemann. 2003. Putative temporal variability of Escherichia coli ribotypes from yearling steers. J. Environ. Qual. 32:305–309.
- Jiang, S., R. Noble, and W. Chu. 2001. Human adenoviruses and coliphages in urban runoff-impacted coastal waters of southern California. Appl. Environ. Microbiol. 67:179–184.
- Jimenez-Clavero, M.A., C. Fernandez, J.A. Ortiz, J. Pro, G. Carbonell, J.V. Tarazona, N. Roblas, and V. Ley. 2003. Teschoviruses as indicators of porcine fecal contamination of surface water. Appl. Environ. Microbiol. 69:6311–6315.
- Johnson, L.K., M.B. Brown, E.A. Carruthers, J.A. Ferguson, P.E. Dombek, and M.J. Sadowsky. 2004. Sample size, library composition, and genotypic diversity among natural populations of Escherichia coli from different animals influence accuracy of determining sources of fecal pollution. Appl. Environ. Microbiol. 70:4478–4485.
- Lasalde, C., R. Rodriguez, and G.A. Toranzos. 2005. Statistical analyses: Possible reasons for unreliability of source tracking efforts. Appl. Environ. Microbiol. 71:4690–4695.
- Lauková, A., and P. Juris. 1997. Distribution and characterization of Enterococcus species in municipal sewages. Microbios 89:73–80.
- Long, S.C., S.S. El-Khoury, S.J.G. Oudejans, M.D. Sobsey, and J. Vinje. 2005. Assessment of sources of diversity of male-specific coliphages for source tracking. Environ. Eng. Sci. 22:367–377.
- Maluquer de Motes, C., P. Clemente-Casares, A. Hundesa, M. Martin, and R. Girones. 2004. Detection of bovine and porcine adenoviruses for tracing the source of fecal contamination. Appl. Environ. Microbiol. 70:1448–1454.
- Moore, D.F., V.J. Harwood, D.M. Ferguson, J. Lukasik, P. Hannah, M. Geitrich, and M. Brownell. 2005. Evaluation of antibiotic resistance analysis and ribotyping for identification of faecal pollution sources in urban watershed. J. Appl. Microbiol. 99:618–628.
- Myoda, S.P., C.A. Carson, J.J. Fuhrmann, B. Hahn, P.G. Hartel, R.L. Kuntz, C.H. Nakatsu, M.J. Sadowsky, M. Samadpour, and H. Yampara-Iquise. 2003. Comparing genotypic bacterial source tracking methods that require a host origin database. J. Water Health 1:167–180.
- Noble, R.T., S.M. Allen, A.D. Blackwood, W. Chu, S.C. Jiang, G.L. Lovelace, M.D. Sobsey, J.R. Stewart, and D.A. Wait. 2003. Use of viral pathogens and indicators to differentiate between human and non-human fecal contamination in a microbial source tracking comparison study. J. Water Health 1:195–207.
- Noble, R.T., and J.A. Fuhrman. 2001. Enteroviruses detected in the coastal waters of Santa Monica Bay, California: Low correlation to bacterial indicators. Hydrobiologia 460:175–184.
- Parveen, S., K.M. Portier, K. Robinson, L. Edmiston, and M.L. Tamplin. 1999. Discriminant analysis of ribotype profiles of Escherichia coli for differentiating human and nonhuman sources of fecal pollution. Appl. Environ. Microbiol. 65:3142–3147.
- Pond, K.R., R. Rangdale, W.G. Meijer, J. Brandao, L. Falcão, A. Rince, B. Masterson, J. Greaves, A. Gawler, E. McDonnell, A. Cronin, and S. Pedley. 2004. Workshop report: Developing pollution source tracking for recreational and shellfish waters. Environ. Forensics 5:237–247.
- Press, S.J., and S. Wilson. 1978. Choosing between logistic regression and discriminant analysis. J. Am. Stat. Assoc. 73:699–705.
- Rhodes, M.W., and H. Kator. 1999. Sorbitol-fermenting bifidobacteria as indicators of diffuse human fecal pollution in estuarine watersheds. J. Appl. Microbiol. 87:528–535.
- Riffard, S., F. Lo Presti, P. Normand, P. Forey, M. Reyrolle, J. Etienne, and F. Vandenesch. 1998. Species identification of Legionella via intergenic 16S–23S ribosomal spacer PCR analysis. Int. J. Syst. Bacteriol. 48:723–730.
- Samadpour, M., M.C. Roberts, C. Kitts, W. Mulugeta, and D. Alfi. 2005. The use of ribotyping and antibiotic resistance patterns for identification of host sources of Escherichia coli strains. Lett. Appl. Microbiol. 40:63–68.
- SAS Institute. 2003. SAS version 5.0.1. SAS Inst., Cary, NC.
- Scott, T.M., T.M. Jenkins, J. Lukasik, and J.B. Rose. 2005. Potential use of a host-associated molecular marker in Enterococcus faecium as an index of human fecal pollution. Environ. Sci. Technol. 39:283–287.
- Scott, T.M., S. Parveen, K.M. Portier, J.B. Rose, M.L. Tamplin, S.R. Farrah, A. Koo, and J. Lukasik. 2003. Geographical variation in ribotype profiles of Escherichia coli isolated from humans, swine, poultry, beef, and dairy cattle in Florida. Appl. Environ. Microbiol. 69:1089–1092.
- Scott, T.M., J.B. Rose, T.M. Jenkins, S.R. Farrah, and J. Lukasik. 2002. Microbial source tracking: Current methodology and future directions. Appl. Environ. Microbiol. 68:5796–5803.
- Sechi, L.A., and L. Daneo-Moore. 1993. Characterization of intergenic spacers in two rrn operons of Enterococcus hirae ATCC 9790. J. Bacteriol. 175:3213–3219.
- Seurinck, S., W. Verstraete, and S.D. Siciliano. 2003. Use of 16S–23S rRNA intergenic spacer region PCR and repetitive extragenic palindromic PCR analyses of Escherichia coli isolates to identify nonpoint fecal sources. Appl. Environ. Microbiol. 69:4942–4950.
- Simmons, G.E., Jr., D.F. Waye, S. Herbein, S. Myers, and E. Walker. 2002. Estimating nonpoint source fecal coliform sources using DNA profile analysis. p. 143–168. In T. Younos (ed.) Advances in water monitoring research. Water Resources Publications, Denver, CO.
- Simpson, J.M., J.W. Santo Domingo, and D.J. Reasoner. 2002. Microbial source tracking: State of the science. Environ. Sci. Technol. 36:5279–5288.
- Simpson, J.M., J.W. Santo Domingo, and D.J. Reasoner. 2003. Assessment of equine fecal contamination: The search for alternative bacterial source-tracking targets. FEMS Microbiol. Ecol. 47:65–75.
- Sinton, L.W., A.M. Donnison, and C.M. Hastie. 1993. Faecal streptococci as faecal pollution indicators: A review. Part II: Sanitary significance, survival, and use. N. Z. J. Mar. Freshwater Res. 27:117–137.
- Sinton, L.W., R.K. Finlay, and D.J. Hannah. 1998. Distinguishing human from animal faecal contamination in water: A review. N. Z. J. Mar. Freshwater Res. 32:323–348.
- Stewart, J.R., R.D. Ellender, J.A. Gooch, S. Jiang, S.P. Myoda, and S.B. Weisberg. 2003. Recommendations for microbial source tracking: Lessons learned from a methods comparison study. J. Water Health 1:225–231.
- Stoeckel, D.M., and V.J. Harwood. 2007. Performance, design, and analysis in microbial source tracking studies. Appl. Environ. Microbiol. (in press).
- Stoeckel, D.M., M.V. Mathes, K.E. Hyer, C. Hagedorn, H. Kator, J. Lukasik, T.L. O'Brien, T.W. Fenger, M. Samadpour, K.M. Strickler, and B.A. Wiggins. 2004. Comparison of seven protocols to identify fecal contamination sources using Escherichia coli. Environ. Sci. Technol. 38:6109–6117.
- Stubbs, S.L.J., J.S. Brazier, G.L. O'Neill, and B.I. Duerden. 1999. PCR targeted to the 16S–23S rRNA gene intergenic spacer region of Clostridium difficile and construction of a library consisting of 116 different PCR ribotypes. J. Clin. Microbiol. 37:461–463.
- Sundram, A., N. Jumanlal, and M.M. Ehlers. 2006. Genotyping of F-RNA coliphages isolated from wastewater and river samples. Water SA 32:65–70.
- Terrin, N., C.H. Schmid, J.L. Griffith, R.B. D'Agostino, Sr., and H.P. Selker. 2003. External validity of predictive models: A comparison of logistic regression, classification trees, and neural networks. J. Clin. Epidemiol. 56:721–729.
- Ufnar, J.A., S.Y. Wang, J.M. Christiansen, H. Yampara-Iquise, C.A. Carson, and R.D. Ellender. 2006. Detection of the nifH gene of Methanobrevibacter smithii: A potential tool to identify sewage pollution in recreational waters. J. Appl. Microbiol. 101:44–52.
- USEPA. 1986. Ambient water quality criteria for bacteria: 1986. EPA-440/5-84-002. USEPA, Office of Water, Washington, DC.
- USEPA. 2005. Microbial source tracking guide document. USEPA, Office of Research and Development, Washington, DC.
- VanOmmeren, L., and E.W. Alm. 2006. Development and application of rapid antibiotic resistance analysis for microbial source tracking in the Black River watershed, Michigan. Lake Reservoir Manage. 22:240–244.
- Vantarakis, A., D. Venieri, G. Komninou, and M. Papapetropoulou. 2006. Differentiation of faecal Escherichia coli from humans and animals by multiple antibiotic resistance analysis. Lett. Appl. Microbiol. 42:71–77.
- Wheeler, A.L., P.G. Hartel, D.G. Godfrey, J.L. Hill, and W.I. Segars. 2002. Potential of Enterococcus faecalis as a human fecal indicator for microbial source tracking. J. Environ. Qual. 31:1286–1293.
- Whitlock, J.E., D.T. Jones, and V.J. Harwood. 2002. Identification of the sources of fecal coliforms in an urban watershed using antibiotic resistance analysis. Water Res. 36:4273–4282.
- Wiggins, B.A. 1996. Discriminant analysis of antibiotic resistance patterns in fecal streptococci, a method to differentiate human and animal sources of fecal pollution in natural waters. Appl. Environ. Microbiol. 62:3997–4002.