Published online 16 October 2007
Published in J Environ Qual 36:1661-1669 (2007)
DOI: 10.2134/jeq2006.0555
© 2007 American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA

TECHNICAL REPORTS

Surface Water Quality

Assessment of the 16S-23S rDNA Intergenic Spacer Region in Enterococcus spp. for Microbial Source Tracking

J. W. Dickerson, Jr.(a,*), J. B. Crozier(b), C. Hagedorn(a), and A. Hassall(a)

a Dep. of Crop and Soil Environmental Sciences, 330 Smyth Hall, Virginia Polytechnic Inst. and State Univ., Blacksburg, VA 24061
b Dep. of Biology, Roanoke College, Salem, VA 24153

* Corresponding author (chagedor{at}vt.edu).

Received for publication December 21, 2006.

    ABSTRACT
A new library-based microbial source tracking (MST) approach intended for initial application in the coastal waters of Virginia was evaluated. Host-origin isolates of Enterococcus spp. were collected from beaches and the surrounding tidewater region of Virginia and used to construct a library based on the pattern of DNA band lengths produced by the amplification of the 16S-23S rDNA intergenic spacer (IGS) region, and subsequent digestion with MboI. Initial results from small host-origin libraries (64 and 200 total isolates) with discriminant analysis (DA) and logistic regression (LR) yielded high average rates of correct classification (ARCC) for a four-source classification split (birds, dogs, sewage, and wildlife), with ARCCs ranging from 83 to 100%. However, the poor results obtained when classification was attempted on a non-library validation set (VS, ARCCs of 47 and 48%, respectively, using DA and LR) demonstrated that a library of 200 isolates was insufficient to adequately represent the diversity of the enterococci in the sampled region. An increase in the library size to 1029 total isolates was accompanied by a reduction in the ARCC of the library to 42.7% with DA and 45.7% with LR, plus similarly poor results obtained from the VS. The low correct classification rates generated by the larger known-source library were unsuitable for field application. Many reported MST methods have been based on results obtained using small host-origin libraries without external validation. Our results indicate that such an approach can be very misleading, and that larger libraries and external validation are essential for the confirmation of preliminary results.

Abbreviations: ARCC, average rate of correct classification • DA, discriminant analysis • IGS, intergenic spacer • LR, logistic regression • MST, microbial source tracking • RCC, rate of correct classification • VS, validation set


    INTRODUCTION
The advent of microbial source tracking (MST) over the last decade has provided watershed managers with a means to discriminate between fecal sources polluting surface waters. The underlying premise of MST is that certain enteric bacterial strains are uniquely adapted to, and thus reside exclusively in, the gastrointestinal tract of a single host organism or a group of closely related host organisms. Differences in the phenotypes or genotypes of these strains can be used to determine the relative contributions of animal sources to the fecal pollution in a water body. Two major classes of MST methods are currently being developed and utilized in surface waters across the world (Sinton et al., 1998; Scott et al., 2002; Simpson et al., 2002; Pond et al., 2004; Blanch et al., 2006).

The earliest and most commonly applied genotypic and phenotypic methods involve the construction of a host-origin database, or library, of isolates from known fecal sources. The library provides a collection of possible ‘fingerprint’ patterns that can be compared directly with the fingerprints of isolates of unknown origin. The most commonly used phenotypic methods have employed differences in antibiotic resistance patterns (Wiggins, 1996; Hagedorn et al., 1999; Harwood et al., 2000, 2003; Graves et al., 2002; Whitlock et al., 2002) or the ability to utilize varying nutrient sources (Hagedorn et al., 2003; Harwood et al., 2003; Ahmed et al., 2005) of indicator organisms to determine fecal origins. An even greater variety of genotypic methods has been reported in the MST literature, including ribotyping (Parveen et al., 1999; Carson et al., 2001; Hartel et al., 2002; Carson et al., 2003; Scott et al., 2003), pulsed-field gel electrophoresis (PFGE) (Simmons et al., 2002; Samadpour et al., 2005), microarrays (Indest et al., 2005), and repetitive sequence polymerase chain reaction (rep-PCR) (Dombek et al., 2000; Carson et al., 2003; Seurinck et al., 2003; Johnson et al., 2004).

Soon after the development of library-based MST methods, researchers began looking for organisms, or sequences within the genome of organisms, that were consistently exclusive to pollution from a particular fecal source. These approaches are known as library-independent methods. In addition to using source-specific markers found in some recognized indicator organisms (Scott et al., 2005; USEPA, 2005), researchers have frequently expanded the search into non-indicator fecal organisms such as Bifidobacterium spp. (Rhodes and Kator, 1999), Bacteroides (Bernhard and Field, 2000; Field et al., 2003; Simpson et al., 2003), F-specific DNA and RNA coliphages (Hsu et al., 1995; Cole et al., 2003; Long et al., 2005; Sundram et al., 2006), methanogens (Ufnar et al., 2006), and human- or livestock-specific enteric viruses such as enterovirus (Noble and Fuhrman, 2001; Fong et al., 2005), adenovirus (Jiang et al., 2001; Maluquer de Motes et al., 2004; Fong et al., 2005), and teschovirus (Jimenez-Clavero et al., 2003).

A major drawback of library-based methods to date has been observable geographical limitations (Hartel et al., 2002; Scott et al., 2003), although similar spatial restrictions have been seen in at least one library-independent method as well (Hamilton et al., 2006). An additional disadvantage of the current status of library-independent methods is that only a limited number of methods are capable of consistently quantifying the contributing proportions of fecal inputs in polluted waters (Noble et al., 2003; Field et al., 2003), as most of these methods presently serve primarily as a presence/absence test of human, and a limited number of non-human, sources. Although human fecal contamination presents the greatest risk to public health (Sinton et al., 1993), additional information on other potential sources is often useful in attempts to lower indicator bacteria concentrations to within USEPA levels of acceptable risk (USEPA, 1986).

A few recent studies have reported success using E. coli 16S-23S ribosomal DNA (rDNA) intergenic spacer (IGS) regions to discriminate between humans, cows, and chickens (Buchan et al., 2001), and to a lesser extent E. coli from sewage, horses, cows, gulls, and dogs (Seurinck et al., 2003). Because selection pressures are absent in the IGS region, in contrast to the highly conserved bordering rDNA, the region has proven useful as a target site for the molecular subtyping of a variety of pathogenic bacteria (Guertler and Stanisich, 1996; Graham et al., 1996; Riffard et al., 1998; Chun et al., 1999; Stubbs et al., 1999), providing a simpler (in both equipment needed and level of training required) and more cost-effective assay than more traditional MST methods such as PFGE (Bedendo and Pignatari, 2000) or ribotyping (Carson et al., 2003). The rDNA operon is frequently present in multiple copies, and its arrangement in bacteria is almost always 16S-IGS-23S-IGS-5S. The amplification of 16S-23S rDNA IGS regions within a bacterial genus such as Enterococcus can be performed using primers that recognize the highly conserved sequences found in the flanking regions of 16S and 23S rDNA. The Enterococcus genome contains as many as six rDNA operons (Sechi and Daneo-Moore, 1993), allowing for the amplification and digestion of multiple amplicons; in MST, this has the potential to increase the diversity of banding patterns produced among strains in the search for banding or fingerprint patterns unique to strains from a specific host organism.

The objective of this study was to develop a method of detecting and quantifying source-specific enterococci from birds (ducks, geese, and gulls), dogs, sewage (presumed human), and wildlife (deer and raccoons). The completion of successful laboratory testing would allow for the application of a new library-based MST method in the coastal regions of Virginia. Enterococci were selected as the target organisms due to their abundance in the fecal matter of warm-blooded animals (Devriese et al., 1987) and usage as fecal indicators in weekly monitoring procedures in the marine and coastal waters of Virginia (VDH, unpublished data, 2004). As library size requirements likely vary between methods and watersheds, the use of a non-library collection of known-source isolates, or validation set (VS), was employed for external validation to better assess the number of isolates required to represent enterococcal diversity in the target watershed. This study addresses the method-specific sampling and performance criteria described by Stoeckel and Harwood (2007).


    Materials and Methods
Collection of Fecal Samples
Fecal samples from known animal sources were collected at public beaches, dog parks, and nature parks within the Tidewater region of southeastern Virginia from February to September of 2005, as described in Dickerson et al. (2007). Sewage influent samples from the twelve treatment plants within the region were provided on four separate occasions during this time period by the Hampton Roads Sanitation District. None of the wastewater treatment plants contained combined sewers, so sewage samples should have contained composite samples of enterococci of almost exclusively human origin. Both fresh and dried fecal samples were collected from animals in each category (birds, dogs, and wildlife), except sewage, based on opportunity at the time samples were collected. Gulls identified were predominantly Ring-billed (Larus delawarensis) and Herring (Larus argentatus) gulls, as well as an occasional Laughing gull (Larus atricilla). The geese and ducks from which scat was obtained were identified as Snow goose (Chen caerulescens), Canada goose (Branta canadensis), and Mallard (Anas platyrhynchos). Dog (Canis familiaris) fecal samples were collected from local beaches and from several dog parks in the area. Wildlife (deer [Odocoileus virginianus] and raccoon [Procyon lotor]) scat was collected in Chickahominy Wildlife Management Area, Waller Mill Park, Newport News Park, and Pocahontas and York River State Parks in eastern Virginia.

Isolation of Enterococci
A portion of each fecal or untreated sewage sample was diluted into tubes of sterile distilled deionized (DDI) water and spread on m-Enterococcus agar (Baltimore Biologics Laboratory, BBL). After a 48-h incubation at 35°C (APHA, 1998), no more than 4 randomly selected red to burgundy colonies from each non-sewage source, and no more than 12 from each sewage source, were picked from each plate using sterile toothpicks. All isolates were inoculated into Enterococcosel Broth (BBL) in a 96-well microtiter plate for confirmation as enterococci (black color after incubation). All confirmed enterococcal isolates were regrown on TSA for use in PCR. The numbers of fecal samples collected and isolates selected from each sample, and the period over which fecal samples were obtained, were similar to other reports where PCR was used for MST (Carson et al., 2003; Hamilton et al., 2006).

Polymerase Chain Reaction
Polymerase chain reaction was used to amplify Enterococcus IGS regions located between the 16S and 23S rDNA regions. Based on sequences in the GenBank database, primers were designed manually that would anneal to highly conserved downstream 16S rDNA and upstream 23S rDNA sequences in virtually all enterococci such that entire IGS regions could be amplified from each isolate (primers produced by Invitrogen Corporation). Approximately 1.0 µL of a pure culture of cells was diluted into 300.0 µL of sterile DDI water to serve as a template for PCR. PCR was performed using PuReTaq Ready-To-Go PCR beads (Amersham Biosciences), in 22.0 µL of sterile DDI water, 1.0 µL of 16S primer (5'-GCCTAAGGTGGGATAGATGA-3', novel to this study), 1.0 µL of 23S primer (5'-CCCGTCCTTCATCGGCTCCTA-3', novel to this study), and 1.0 µL of diluted cell culture. Primers were used at a final concentration of approximately 0.2 µmol L–1. The PCR was initiated by incubating the reaction mixture at 95°C for 6 min to lyse the cells, followed by 35 1-min cycles of 94, 57, and 72°C. The final elongation step was completed at 72°C for 7 min, followed by a 4°C hold of all reaction mixtures. All PCR experiments contained a positive control (E. faecalis) to assess method reproducibility and stability (numbers of bands and length of each).
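The reaction assembly described above can be checked with simple volume arithmetic. The sketch below sums the listed liquid components and back-calculates the primer stock concentration implied by the stated final concentration, using C1V1 = C2V2. The stock concentration is an inference, not a value reported in the text, and the function and variable names are illustrative only.

```python
# Volumes (µL) of each liquid component added to one PCR bead, as listed in the text.
components_ul = {
    "sterile DDI water": 22.0,
    "16S primer": 1.0,
    "23S primer": 1.0,
    "diluted cell culture": 1.0,
}


def total_volume(components):
    """Return the summed reaction volume in µL."""
    return sum(components.values())


def implied_stock_conc(final_conc, total_ul, added_ul):
    """Solve C1 * V1 = C2 * V2 for the stock concentration C1."""
    return final_conc * total_ul / added_ul


reaction_ul = total_volume(components_ul)

# Assumption: the ~0.2 µmol/L final concentration applies to each primer separately.
primer_stock = implied_stock_conc(0.2, reaction_ul, components_ul["16S primer"])

print(f"Reaction volume: {reaction_ul:.1f} µL")
print(f"Implied primer stock: {primer_stock:.1f} µmol/L")
```

Under these assumptions, the components sum to a 25 µL reaction, and a 1 µL primer addition would need to come from a stock of roughly 5 µmol/L.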

Restriction Digests
Restriction enzyme digests consisted of 10 µL of restriction digest mix (34 µL of 10X Buffer C, 17 µL BSA, 17 µL spermidine [100 mmol L–1], 93.5 µL sterile DDI water, and 8.5 µL MboI restriction enzyme [Promega; 5'-^GATC-3', 3'-CTAG^-5']) combined with 10 µL of PCR product in a centrifuge tube, centrifuged briefly, and incubated at 37°C for 3.5 h.
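The batch recipe for the digest mix can be scaled to a single digest as a consistency check. The following is a minimal sketch that assumes the water volume is given in µL like the other components; the dictionary keys and variable names are illustrative.

```python
# Restriction digest master mix components (µL), as listed in the text.
# Assumption: the sterile DDI water volume (93.5) is in µL.
master_mix_ul = {
    "10X Buffer C": 34.0,
    "BSA": 17.0,
    "spermidine (100 mmol/L)": 17.0,
    "sterile DDI water": 93.5,
    "MboI": 8.5,
}

MIX_PER_DIGEST_UL = 10.0  # mix combined with 10 µL of PCR product per digest

total_mix_ul = sum(master_mix_ul.values())
n_digests = int(total_mix_ul // MIX_PER_DIGEST_UL)

# Share of each component in a single 10 µL aliquot of mix.
per_digest_ul = {
    name: vol * MIX_PER_DIGEST_UL / total_mix_ul
    for name, vol in master_mix_ul.items()
}

print(f"Total mix: {total_mix_ul:.1f} µL, enough for {n_digests} digests")
for name, vol in per_digest_ul.items():
    print(f"  {name}: {vol:.2f} µL per digest")
```

On these numbers the batch totals 170 µL, or 17 digests. Each digest then receives 2 µL of 10X buffer in a 20 µL final volume, which is consistent with a 1X working concentration.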

Gel Electrophoresis
Restriction enzyme digests were mixed with loading dye and loaded on a 3% horizontal agarose gel (Agarose Low Melting, Fisher Scientific), with several 100 bp ladders and the positive control, to detect polymorphisms among isolates. All gels were run in 1X TAE (10 mmol L–1 Tris, 5 mmol L–1 acetate, 0.1 mmol L–1 EDTA, pH 7.4 [Promega]) for 80 min at 100 V with standard gels (10 by 15 cm). Gels were stained for 3 h in a solution of 2X SYBR Green I (Cambrex Bio Science Rockland, Inc.), and photographed on a UV mini-transilluminator with a Polaroid DS34 camera. All photographs were digitally scanned in Gel-Pro 3.1 using an HP Scanjet 6300C.
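The bands seen on a gel reflect where MboI cuts each amplicon. As an illustration only, the sketch below predicts fragment lengths for a complete digest of a linear sequence at the GATC recognition site. The input sequence is a made-up placeholder, not an Enterococcus IGS sequence, and the function name is hypothetical; lengths are counted along one strand and ignore the 4-nt overhang.

```python
MBOI_SITE = "GATC"  # MboI cuts immediately 5' of the G (^GATC)


def mboi_fragments(amplicon):
    """Return fragment lengths (bp) from a complete MboI digest of a linear amplicon."""
    seq = amplicon.upper()
    cuts = []
    pos = seq.find(MBOI_SITE)
    while pos != -1:
        if pos > 0:  # a site at the very start produces no extra fragment
            cuts.append(pos)
        pos = seq.find(MBOI_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 20-bp sequence with two GATC sites (for illustration only).
toy_amplicon = "AAAAGATCTTTTTTGATCCC"
print(mboi_fragments(toy_amplicon))  # [4, 10, 6]
```

Because an isolate may carry several rDNA operons, its observed pattern would be the combined set of fragments from all amplified IGS copies.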

Statistical Analysis of Polymerase Chain Reaction Profiles for Source Prediction
Each digest, when visualized, exhibited between 4 and 14 total bands. Band lengths were quantified using Gel-Pro Software and converted to binary data based on 100 base-pair length categories ranging from <100 to >1000 bp in length. Analyses were conducted using both discriminant analysis (DA) and logistic regression (LR) in SAS-JMP statistical software (version 5.0.1; SAS Institute, 2003), comparing isolates from within the library (all isolates were left in the library) and the VS against the model constructed using patterns from isolates in the host-origin library. The classification table generated in SAS-JMP was used to calculate the rate of correct classification (RCC) of each group of isolates, with the ARCC for the library or VS of isolates calculated by dividing the sum of the number of isolates correctly classified across all four categories by the total number of isolates classified, similar to the estimate of correct classification (ECC) used in Albert et al. (2003).
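The binary coding and the RCC/ARCC calculations described above can be sketched as follows. The exact category boundaries are not given in the text, so the sketch assumes eleven categories (<100, 100 to 199, ..., 900 to 999, and 1000 bp or more). The classification counts in the example table are invented for illustration and are not data from this study.

```python
def band_category(length_bp):
    """Map a band length to a category index: 0 is <100 bp, 10 is >=1000 bp."""
    return min(int(length_bp) // 100, 10)


def fingerprint(band_lengths, n_categories=11):
    """Binary presence/absence vector over the assumed 100-bp length categories."""
    bits = [0] * n_categories
    for length in band_lengths:
        bits[band_category(length)] = 1
    return bits


def rcc_and_arcc(table):
    """Compute per-source RCC and overall ARCC.

    table[true_source][predicted_source] holds the number of isolates.
    """
    rcc = {}
    correct = total = 0
    for source, row in table.items():
        n = sum(row.values())
        hits = row.get(source, 0)
        rcc[source] = hits / n
        correct += hits
        total += n
    return rcc, correct / total


# Hypothetical classification table (counts are illustrative only).
example_table = {
    "birds": {"birds": 8, "dogs": 2},
    "dogs": {"birds": 1, "dogs": 7, "sewage": 1, "wildlife": 1},
    "sewage": {"dogs": 5, "sewage": 15},
    "wildlife": {"birds": 4, "wildlife": 6},
}

print(fingerprint([32, 150, 180, 1055]))
rcc, arcc = rcc_and_arcc(example_table)
print(rcc, arcc)
```

Note that the ARCC as defined here pools isolates across categories, so it is weighted by category size and need not equal the simple mean of the four RCCs.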

Creation of Known-Source Library (Host-Origin Database)
An initial library of 64 isolates was created using 16 isolates from each source category, obtained from two sewage, four dog, five bird, and five wildlife fecal samples collected in February 2005. The library was increased in size to 200 isolates (50 per category) using isolates collected (May 2005) from eight sewage, 14 dog, 15 bird, and 15 wildlife fecal samples. The final library contained a total of 1029 isolates (201 birds, 353 sewage, 266 dogs, 209 wildlife), from 52 bird, 42 sewage, 70 dog, and 54 wildlife fecal/sewage samples, with additions made from June-September 2005 collections. Clonal isolates within the library were not removed for initial tests. In addition to isolates collected for library construction, 100 additional isolates (12 birds, 48 sewage, and 20 each of dogs and wildlife), collected in May 2005 from four birds, six sewage, seven dogs, and seven wildlife fecal/sewage samples, were held out of the known-source library for use as a VS. No isolates obtained for the VS were collected from the same fecal/sewage samples used for library construction.

Following completion of the 1029-isolate library, a one-time random sample of 201 isolates was selected (with the JMP software) from each of the three largest source categories (dogs, sewage, and wildlife) to generate a balanced known-source library of 804 isolates (201 isolates per source category). An additional library (323 isolates) and VS (62 isolates), containing only unique banding patterns, were later generated from the larger 1029-isolate library and 100-isolate VS through the detection and removal of all clonal isolates within and across source categories.
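The two library modifications described above (balancing source categories and removing clones) can be sketched in a few lines. This is not the JMP procedure used in the study. The isolate records, the random patterns, and the rule of keeping the first isolate seen for each banding pattern are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def balance_library(library, per_category, seed=0):
    """Randomly keep at most `per_category` isolates from each source category."""
    rng = random.Random(seed)
    by_source = defaultdict(list)
    for isolate in library:
        by_source[isolate["source"]].append(isolate)
    balanced = []
    for source in sorted(by_source):
        group = by_source[source]
        balanced.extend(rng.sample(group, min(per_category, len(group))))
    return balanced


def remove_clones(library):
    """Keep one isolate per distinct banding pattern, across all source categories."""
    seen = set()
    unique = []
    for isolate in library:
        pattern = tuple(isolate["pattern"])
        if pattern not in seen:
            seen.add(pattern)
            unique.append(isolate)
    return unique


# Category sizes from the 1029-isolate library; patterns are random placeholders.
counts = {"birds": 201, "dogs": 266, "sewage": 353, "wildlife": 209}
rng = random.Random(1)
library = [
    {"source": source, "pattern": [rng.randint(0, 1) for _ in range(11)]}
    for source, n in counts.items()
    for _ in range(n)
]

balanced = balance_library(library, per_category=201)
print(len(library), len(balanced))
```

With the category sizes reported for the full library, balancing to 201 isolates per category yields the 804-isolate library described in the text.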


    Results
Polymerase chain reaction and subsequent restriction digest products were obtained for almost all isolates tested, yielding bands ranging from approximately 32 to 1055 bp in size. An initial test library was constructed using 64 total isolates (16 per source category) to select for the restriction endonuclease providing the most discriminating source-specific fingerprint patterns. This library produced an average rate of correct classification (ARCC) of 100% using MboI for the desired four-source split, with both DA (Table 1) and LR (Table 2), suggesting there was potential for this approach. A VS was not constructed for this early stage of library development.



 
Table 1. Classification table displaying the percentages (and number) of isolates classified using discriminant analysis for a 64-isolate library.

 


 
Table 2. Classification table displaying the percentages (and number) of isolates classified using logistic regression for a 64-isolate library.

 
Based on the success of the 64-isolate library (Tables 1 and 2), additional fecal samples were collected and enterococci isolates fingerprinted, increasing the library size to 200 isolates (50 per source category). With the increased library size, the ARCC decreased from 100 to 83% with DA (Table 3), with the highest RCC of 88% in both the dogs and wildlife categories. Using LR (Table 4), a higher ARCC of 88% was achieved, with both dogs and sewage yielding RCCs of 94%. Although these classification rates would be considered acceptable and promising for most library-based MST methods based on previous studies (Harwood et al., 2000; Whitlock et al., 2002; Choi et al., 2003; VanOmmeren and Alm, 2006), the use of a VS of isolates was implemented to provide an additional means of assessing library capabilities. The VS isolates were correctly classified at considerably lower rates than those composing the library, with DA identifying only 47 of 100 isolates correctly (ARCC = 47%, Table 3) and LR placing only 48 of the 100 isolates into the correct source category (ARCC = 48%, Table 4). The ineffectiveness of the known-source library in classifying non-library isolates suggested the library was of insufficient size and not representative of the strain diversity of enterococci in the coastal region of Virginia.



 
Table 3. Classification table displaying the percentages (and number) of isolates classified using discriminant analysis for a 200-isolate library.

 


 
Table 4. Classification table displaying the percentages (and number) of isolates classified using logistic regression for a 200-isolate library.

 
A final increase in the size of the library brought the total number of isolates to 1029 (201 birds, 266 dogs, 353 sewage, and 209 wildlife isolates). With DA (Table 5), the final ARCC for the classification of isolates composing the library (internal ARCC) was 42.7%, well below the classification rates of any libraries applied in field studies (Hagedorn et al., 1999; Harwood et al., 2000; Graves et al., 2002; Choi et al., 2003; Carroll et al., 2005). No source category in the 1029-isolate library produced an RCC greater than 70% with DA. The internal ARCC using LR (Table 6) was 45.7%, only slightly higher than the results generated by DA, with the wildlife category producing the highest RCC of 64.1%. Only 47 of the 100 VS isolates were correctly classified by the 1029-isolate library using DA (ARCC = 47%, Table 5), and only 53 of 100 VS isolates (ARCC = 53%, Table 6) were correctly classified into respective host categories with LR.
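One way to put ARCC values in context is to compare them with naive baselines that depend only on category sizes. The sketch below computes two such baselines from the library and VS compositions reported in this paper. These baselines are offered as an interpretive aid and were not part of the original analysis; the function names are illustrative.

```python
# Category sizes reported for the 1029-isolate library and the 100-isolate VS.
library_counts = {"birds": 201, "dogs": 266, "sewage": 353, "wildlife": 209}
vs_counts = {"birds": 12, "dogs": 20, "sewage": 48, "wildlife": 20}


def majority_baseline(counts):
    """ARCC obtained by assigning every isolate to the largest category."""
    return max(counts.values()) / sum(counts.values())


def uniform_guess_baseline(counts):
    """Expected ARCC when each isolate is assigned to a category uniformly at random."""
    return 1 / len(counts)


for name, counts in (("library", library_counts), ("VS", vs_counts)):
    print(
        f"{name}: majority = {majority_baseline(counts):.1%}, "
        f"uniform guess = {uniform_guess_baseline(counts):.1%}"
    )
```

For a four-category split, uniform guessing gives an expected ARCC of 25%. Assigning every isolate to sewage, the largest category, would score about 34% on the library and 48% on the VS, because sewage makes up 48 of the 100 VS isolates.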



 
Table 5. Classification table displaying the percentages (and number) of isolates classified using discriminant analysis for a 1029-isolate library.

 


 
Table 6. Classification table displaying the percentages (and number) of isolates classified using logistic regression for a 1029-isolate library.

 
Only a slight overall improvement was seen in classification rates when library source categories were balanced (201 isolates per category) by randomly selecting a subset from each of the three largest source categories (Tables 7 and 8). With DA (Table 7), the internal ARCC for the library increased slightly, from 42.7 to 43.3%, by balancing the source categories. However, the VS of isolates showed a slight decrease, as the ARCC declined from 45.7 to 44%. Library isolates from individual source categories showed minor changes, with only the smallest original category (birds) displaying an RCC increase (+5.5%). In the VS, changes within individual categories were also minor, with the only major change (>5.0 percentage points) being a 25 percentage point decrease in the RCC of bird isolates.



 
Table 7. Classification table displaying the percentages (and number) of isolates classified with discriminant analysis for a source-category balanced 804-isolate library.

 


 
Table 8. Classification table displaying the percentages (and number) of isolates classified with logistic regression for a source-category balanced 804-isolate (201 per category) library.

 
When a classification model was generated using LR for the balanced source categories (Table 8), the ARCC for isolates within the library displayed a slight decrease, dropping from 45.7 to 45.1%. The VS isolates showed a larger decrease, as ten fewer isolates were correctly classified, decreasing the ARCC for the VS from 53 to 43%. Among the individual categories, birds and wildlife, the smallest categories in the original unbalanced library, yielded increases in the RCC of 24.8 and 4.6 percentage points, respectively, once the source categories were equal in size. Conversely, the RCCs of the dogs and sewage categories decreased by 9.2 and 14.1 percentage points, respectively. The decrease in the ARCC for the VS of isolates was the largest average change seen. The RCCs of validation set isolates from the two largest categories in the original library decreased by 10.4 (sewage) and 50 (dogs) percentage points.

The removal of clones from the 1029-isolate library reduced the library size to 323 unique isolates, a reduction of 68.6% (Tables 9 and 10). Although the ARCCs increased in some source categories and declined in others with both statistical algorithms for the library with clones removed (compare Tables 5 and 9 for DA, Tables 6 and 10 for LR), there was little effect on the overall ARCCs and the values remained considerably lower than those reported for other non-clonal libraries of comparable size (Dombek et al., 2000; Guan et al., 2002; Seurinck et al., 2003; Lasalde et al., 2005; Duran et al., 2006; Vantarakis et al., 2006). For example, the ARCCs (with DA) for the clonal library (Table 5) and the non-clonal library (Table 9) were 43.9 and 53.6%, respectively. The VS ARCCs (with DA) for the clonal library (Table 5) and the non-clonal library (Table 9) were 49.9 and 52.5%, respectively.
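The stated reduction in library size follows directly from the isolate counts before and after clone removal. The same calculation is shown for the VS, using the 100-isolate and 62-isolate figures given in the Materials and Methods; the VS percentage is derived here and is not stated in the text.

```latex
\[
\text{Library reduction} = \frac{1029 - 323}{1029} = \frac{706}{1029} \approx 0.686 = 68.6\%
\]
\[
\text{VS reduction} = \frac{100 - 62}{100} = 0.38 = 38\%
\]
```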



 
Table 9. Classification table displaying the percentages (and number) of isolates classified with discriminant analysis for a 323-isolate library (clones removed from the 1029-isolate library, Table 5).

 


 
Table 10. Classification table displaying the percentages (and number) of isolates classified with logistic regression for a 323-isolate library (clones removed from the 1029-isolate library, Table 6).

 

    Discussion
A successful MST method requires the testing and/or fingerprinting of large numbers of fecal isolates within a geographic region to assess and account for host-strain diversity (Dickerson et al., 2007). The results of this study indicate that the validation of an MST method using isolates from a small number of sources tends to falsely inflate method effectiveness, which can result in improper and premature applications in field trials. The rDNA IGS method, both simpler to perform and less expensive than most other molecular MST methods, worked very well during the initial stages of testing (Tables 1 and 2), producing correct classification rates of 100% with both DA and LR. This initial success prompted the continued addition of known-source isolates, working toward the construction of a host-origin library of adequate size to undergo field evaluations in the coastal region of Virginia. For the library of 200 isolates (Tables 3 and 4), the ARCC remained reasonably high and provided sufficient discrimination between source categories using both statistical methods. However, the inability of the library to correctly classify isolates from the VS at a level comparable to those within the library indicated that the diversity of fecal isolates was inadequately represented. As the library continued to increase in size (to 1029 isolates), the method began to fail (Tables 5 and 6): classification rates fell to levels unsuitable for source discrimination, and the method was thus deemed unsuccessful.

For this study, classification models were generated by both DA and LR, two parametric methods, for the identification of isolates both within and outside the known-source library. While DA has been widely applied in library-based source tracking methods (Wiggins, 1996; Hagedorn et al., 1999; Harwood et al., 2000; Graves et al., 2002), LR has, to date, remained unutilized in the field of MST. The most frequently implemented classification method for biomedical applications (Dreiseitl and Ohno-Machado, 2002), LR requires fewer assumptions than DA; it does not assume a normal distribution or equal within-group variances among the independent variables. The use of LR has been shown to more effectively classify unknowns, as compared to DA, under conditions of non-normality, such as those using binary explanatory variables (Press and Wilson, 1978).

While methods of internal classification are commonly used in library-based MST, the practicality of a method lies exclusively in the ability to correctly classify isolates of unknown origin, such as those from water samples. One of the major recommendations to emerge from the Southern California Coastal Water Research Project (SCCWRP) and United States Geological Survey (USGS) sponsored method comparison (MC) studies (Stewart et al., 2003; Stoeckel et al., 2004) was the usefulness of the VS of non-library isolates, as opposed to internal validations, to assess the effectiveness of a known-source library. Commonplace in the medical (Terrin et al., 2003) or statistical (Press and Wilson, 1978) fields, the external verification of library effectiveness has been used in very few MST studies (Moore et al., 2005). However, modifications of the VSs used in each of the MC studies are necessary to correct potential flaws that were present in each. In the SCCWRP study, isolates were obtained from the same set of fecal samples used to construct the known-source library. The underperformance of most MST methods in this study, even with the use of a seemingly favorable VS, may have been the result of a greater level of fecal diversity than previously expected in a single fecal sample, or inadequate reproducibility of many MST methods (Stewart et al., 2003). The USGS sponsored MC study (Stoeckel et al., 2004) used a VS of isolates collected 9 mo after those used for library construction. The overall inadequate performance of the methods involved resurrected previous concerns of temporal instability of strains within host organisms (Jenkins et al., 2003).
Therefore, in this study, once the library size was increased to 200 isolates, comparable to numbers frequently reported in several publications (Dombek et al., 2000; Guan et al., 2002; Seurinck et al., 2003; Lasalde et al., 2005; Duran et al., 2006; Vantarakis et al., 2006), a VS of isolates (collected simultaneously, but from different fecal samples) was fingerprinted to serve as an additional means of predicting the classification ability of the known-source library. In library-based MST methods, small libraries typically produce high ARCCs solely due to the random placement of isolates into defined categories (Whitlock et al., 2002). The poor classification rates and marked differences between library isolates and those not a part of the library are indicative of a library that is too small and does not contain enough isolates to represent strain diversity in a watershed. Once the library size was increased to over 1000 isolates, classification rates decreased sharply, yielding correct classification rates below 50% and unsuitable for field applications.
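The point that small libraries can produce high internal ARCCs without any real source signal can be illustrated with a toy simulation. The sketch below is not the DA or LR analysis used in this study. It substitutes a simple lookup classifier (most common source per fingerprint), draws fingerprints and sources independently at random, and assumes 256 possible fingerprints. All names and parameters are hypothetical.

```python
import random
from collections import Counter, defaultdict

SOURCES = ["birds", "dogs", "sewage", "wildlife"]
N_PATTERNS = 256  # assumed number of distinct fingerprints in the toy population


def random_isolates(n, rng):
    """Isolates whose fingerprint is unrelated to their source (no real signal)."""
    return [(rng.randrange(N_PATTERNS), rng.choice(SOURCES)) for _ in range(n)]


def fit_lookup(library):
    """Map each fingerprint to the most common source carrying it in the library."""
    votes = defaultdict(Counter)
    for pattern, source in library:
        votes[pattern][source] += 1
    return {p: c.most_common(1)[0][0] for p, c in votes.items()}


def arcc(model, isolates, default=SOURCES[0]):
    """Fraction of isolates assigned to their true source."""
    hits = sum(model.get(pattern, default) == source for pattern, source in isolates)
    return hits / len(isolates)


rng = random.Random(42)
validation = random_isolates(1000, rng)  # external set, never used for fitting

results = {}
for size in (64, 200, 1029):
    library = random_isolates(size, rng)
    model = fit_lookup(library)
    results[size] = (arcc(model, library), arcc(model, validation))
    internal, external = results[size]
    print(f"library={size:5d}  internal ARCC={internal:.2f}  external ARCC={external:.2f}")
```

In this toy setting, the internal ARCC is high for the smallest library because most fingerprints occur only once and are trivially assigned to their own source. It falls as the library grows and fingerprints begin to be shared across sources. The external ARCC stays near the 25% expected by chance at every library size, which is the kind of gap an independent VS is meant to expose.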

No attempt was made during method development to distinguish between species of Enterococcus. Although the potential existed for greater source discrimination if analyses were limited to one or a few Enterococcus species, the requirements of speciating isolates would serve to increase both time constraints and method costs, making any finding less desirable for application in the field, but nevertheless may have resulted in a workable method. In addition, as the population and proportions of specific enterococci species vary between organisms (Lauková and Juris, 1997; Wheeler et al., 2002), limiting analyses to specific species could unfairly bias relative fecal contributions or potentially eliminate the ability to detect animal sources not carrying the species selected.

The differences seen in the ARCC between libraries containing balanced and unbalanced source categories were small regardless of whether DA or LR was used to generate classification models (Tables 5–8). The ARCC for the VS decreased using both algorithms, possibly due to a loss in overall library representativeness from a decrease in the total number of isolates. The major difference resulting from a balanced library could only be seen within individual source categories. The effects of balancing a library on the RCC for a category were minor for both the library and VS isolates when using DA. However, using LR, larger changes were observed in both library and VS isolates when the library was balanced, as an inverse relationship was seen between the number of isolates lost from the original unbalanced library and the change in RCC for a given source category. Thus, decreases in the RCC were seen for the two categories (dogs and sewage) that lost a significant portion of their isolates, while the RCC increased (or failed to decrease) for the two smallest categories, which forfeited only seven (wildlife) or zero (birds) isolates. For the libraries generated in this study, there was also little to be gained by removing clones. This would indicate that an inadequate clonal library cannot be substantially improved by removing clones; the non-clonal library will still be inadequate.
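
The balancing step and the rates discussed above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used in this study: balancing is assumed to mean random subsampling of every source category down to the size of the smallest one, the ARCC is assumed to be the unweighted mean of the per-category RCCs, and the function names (`balance_library`, `rcc_and_arcc`) and the toy inputs are hypothetical.

```python
import random
from collections import defaultdict


def balance_library(isolates, seed=0):
    """Subsample each source category to the size of the smallest category.

    `isolates` is a list of (source, fingerprint) pairs. Random subsampling
    is an assumption; the smallest category loses no isolates.
    """
    by_source = defaultdict(list)
    for source, fingerprint in isolates:
        by_source[source].append(fingerprint)
    target = min(len(fps) for fps in by_source.values())
    rng = random.Random(seed)
    balanced = []
    for source, fps in by_source.items():
        for fingerprint in rng.sample(fps, target):
            balanced.append((source, fingerprint))
    return balanced


def rcc_and_arcc(true_sources, predicted_sources):
    """Return per-category RCC and the ARCC (assumed: mean of the RCCs)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for true, predicted in zip(true_sources, predicted_sources):
        totals[true] += 1
        hits[true] += (true == predicted)
    rcc = {source: hits[source] / totals[source] for source in totals}
    arcc = sum(rcc.values()) / len(rcc)
    return rcc, arcc


# Toy labels only; these are not data from the study.
true_labels = ["birds", "birds", "dogs", "dogs", "dogs", "dogs"]
predicted = ["birds", "dogs", "dogs", "dogs", "dogs", "birds"]
rcc, arcc = rcc_and_arcc(true_labels, predicted)
print(rcc, arcc)
```

Averaging the per-category rates gives each source equal weight regardless of how many isolates it contributes, which is why a change confined to one or two categories can leave the ARCC nearly unchanged while the individual RCCs shift.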

The conclusions of this study stress the danger of using a small number of isolates or fecal sources to assess the effectiveness of both library-based and, possibly, library-independent MST methods. Levels of strain diversity are undoubtedly method-dependent; however, host strain variability is probably greater than previously assumed, especially in a non-conserved DNA region such as the IGS, and small libraries are unlikely to be capable of reporting results with a high level of confidence. The SCCWRP MC study concluded that libraries of ~300 were generally not successful at identifying fecal pollution in blind water samples, even when libraries were generated from the same fecal material used to construct the blinds (Griffith et al., 2003; Harwood et al., 2003; Myoda et al., 2003). Further research is needed into the diversity of genotypes and phenotypes of fecal bacteria both within a single host organism and within a host population (Stoeckel and Harwood, 2007). Based on the changes observed in source category RCCs in the modified libraries, additional research is needed into the robustness of commonly used classification algorithms when library source categories are unequally represented, as well as into the effects of clone removal on the classification ability of a known-source library. The results suggest that a VS of isolates is a necessary tool for assessing the size requirements of a known-source library. The method was unsuccessful using the restriction enzyme MboI; however, one or more other restriction enzymes may provide greater source discrimination in future tests, even when applied to the same amplicon.


    ACKNOWLEDGMENTS
 
This research was funded by the Virginia Dep. of Health, Div. of Zoonotic and Environmental Epidemiology and an internal Roanoke College grant. We thank the Hampton Roads Sanitation District and the Virginia Dep. of Conservation and Recreation for assistance in the collection of sewage and fecal samples. Special thanks to Roanoke College students Tiffany Simpson, Margaret Mauney, and Marina Salama for the screening of multiple restriction enzymes and the initial testing of the method.


    NOTES
 
All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher.


    REFERENCES
 




This article has been cited by other articles:


J. Lu, J. W. Santo Domingo, R. Lamendella, T. Edge, and S. Hill
Phylogenetic Diversity and Molecular Detection of Bacteria in Gull Feces
Appl. Envir. Microbiol., July 1, 2008; 74(13): 3969 - 3976.



