|
|
||||||||
a Bureau of Water Assessment and Management, New York State Department of Environmental Conservation, 625 Broadway, 4th Floor, Albany, NY 12233-3502
b Department of Natural Resources, Fernow Hall, Cornell University, Ithaca, NY 14853
c School of Civil and Environmental Engineering, Hollister Hall, Cornell University, Ithaca, NY 14853-3501
* Corresponding author (plbishop{at}gw.dec.state.ny.us)
Received for publication May 18, 2004.
| ABSTRACT |
|---|
|
|
|---|
Abbreviations: ANCOVA, analysis of covariance BMP, best management practice CI95, 95% confidence interval MDTE, minimum detectable treatment effect post-BMP, post-treatment period pre-BMP, pre-treatment period PP, particulate phosphorus STP, soil test phosphorus TDP, total dissolved phosphorus WAP, Watershed Agricultural Program WFP, Whole Farm Plan
| INTRODUCTION |
|---|
|
|
|---|
The effectiveness of agricultural BMPs can be demonstrated through scientific evaluation and monitoring, although investigations may be hampered by the limitations of stream water sampling methodology, variability in weather, and spatial heterogeneity of the landscape. Often, evaluation of BMPs has focused either on the sub-field or large-basin scale. On the sub-field (plot) scale, monitoring studies have demonstrated that agricultural BMPs are effective at decreasing sediment and P losses from fields (e.g., Murray, 2001; Robillard and Walter, 1984; Udawatta et al., 2002), but relating these results to improvements in receiving water quality can be difficult. Basin-wide studies have relied on large-scale environmental models (e.g., Benaman, 2002; Boesch et al., 2001; Gitau et al., 2003), or on direct sampling of stream water quality (e.g., Inamdar et al., 2002; Longabucco and Rafferty, 1998; Meals, 2001; Udawatta et al., 2002). In the case of P, results have shown that several years may pass before the effects of BMPs translate into measurable improvements in water quality (Boesch et al., 2001; Wang et al., 2002) due to both the earlier accumulation of P in soils and stream sediments, and to the slow rate of landscape P transport processes. Moreover, interpretation of data can be complicated by catastrophic loading events (Meals, 2001), unexpected changes in cropping practices (Boesch et al., 2001), and natural inter-annual variability (Longabucco and Rafferty, 1998).
Only a few studies (e.g., Galeone, 1999; Shirmohammadi et al., 1997) have monitored BMP treatment effects on a small watershed scale. The small watershed approach is advantageous because it evaluates an area that is large enough to capture landscape P transport processes and dilution effects (Gburek et al., 2000), yet small enough to focus on nutrient loading from a single farm and the management practices adopted at that scale.
As annual variation and long-term trends in weather and nutrient export occur naturally in undisturbed watersheds (e.g., Esterby, 1996; Moog and Whiting, 2002; Murtaugh, 2000), these effects must be separable from BMP effects when performing an evaluation of treatments. The paired watershed experimental design, initially developed to evaluate the effect of silvicultural practices on runoff quantity and quality (Reinhart, 1967; Wilm, 1949), has been a successful method of controlling for environmental variability. It is based on a quantifiable relationship between the water quality at two sites that remains valid until a major change is made at one site, after which time a new relationship will exist (USEPA, 1993). In a comparison of monitoring designs, paired watersheds were considered most appropriate for documenting BMP effects within a relatively short time period (Spooner et al., 1985). This design has been used to quantify changes in water quality at scales ranging from sub-field (e.g., Clausen et al., 1996) to multiple-land use, multiple-county areas (e.g., Galeone, 1999; Meals, 2001). The USEPA established guidelines for applying analysis of covariance (ANCOVA) to paired watershed data (USEPA, 1993, 1997a, 1997b), recommending the use of a single matched covariate or predictor variable, such as total nutrient loads in runoff from a control watershed, to account for nontreatment climate effects (USEPA, 1993). Hewlett and Peinaar (1973), as well as Loftis et al. (2001), discuss incorporating multiple covariates into ANCOVA models for paired watershed data analysis, but few researchers have actually done so.
The purpose of this study was to quantify reductions in P loading attributable to an extensive program of BMPs implemented on a working dairy farm in the watershed of a New York City water supply reservoir. The BMPs were developed through the WFP process of the WAP. A paired watershed study design was employed, with control for environmental variability provided by load and flow data from a nearby nonfarm, forested watershed, as well as by flow data from the farm watershed. Stream flow and water quality were continuously monitored at the outflow of each watershed for a two-year pre-treatment period and a four-year post-treatment period, using an automated, event-based sampling protocol. Focus was placed on determining P reductions during runoff events because many of the BMPs were designed to function when runoff occurred. Also, events were important contributors of P, delivering, on average, 84% of PP and 57% of TDP total annual loads from the farm site (Bishop et al., 2003).
A multivariate ANCOVA methodology was developed to provide a means to quantify BMP treatment effects in the context of an entire upland farm watershed. Using the available matched load data set, the analysis demonstrated the improvement in model accuracy achievable by employing multiple covariates to describe natural environmental variability. We also provide results from models utilizing only covariates from the farm watershed that illustrate the advantages of paired watersheds over the single watershed "before and after" monitoring design when evaluating treatment effects. Statistical formulas were developed and used to estimate the analytical power and MDTE associated with the paired watershed analysis.
| MATERIALS AND METHODS |
|---|
|
|
|---|
|
The 160-ha treatment watershed (Fig. 2a) is occupied by a third-generation dairy farm that maintains approximately 80 milking cows and 35 heifers. The farm is typical of upland dairy operations of the Catskills region in that the barn is located in the valley bottom, close to a central stream. Land use on the watershed is 53% deciduous forest, 25% improved pasture and hay, 7% maize (Zea mays L.) rotation, 13% unimproved pasture, and 2% impermeable areas. Deciduous forest and unimproved pasture largely dominate the upper slopes of the watershed, while crop fields and improved pasture tend to be located on the lower slopes. Impermeable surfaces such as barnyards, roads, and farm buildings are mostly near the stream. During the grazing season (early May to late October) cows frequently cross the stream and saturated areas to reach pasture.
|
The control watershed (Fig. 2b), located 6.4 km east of the farm, covers 86 ha and is comprised of mostly forest and old fields. A nonfarm control was selected because no significant changes in the watershed were expected during the study period. In contrast, working farms may modify operational practices or go out of business altogether over the course of a long-term study, and cannot be relied on to provide the consistent control necessary for describing natural environmental variability. The nonfarm control watershed contains several seasonal residences and one septic system, but has no recent history of manure application and no significant anthropogenic P inputs, aside from atmospheric deposition. Land use is 78% deciduous forest, 22% shrub and grasses, and <1% impermeable areas. Soil samples collected throughout the watershed exhibited low to moderate STP (mean of 3.3 mg kg1 Morgan's STP, SD of 2.2 mg kg1) (New York State Water Resources Institute, 2002). No significant changes in land use were observed during the study period, although the modification of drainage pipes and ditches associated with an unpaved roadway may have modified the effective watershed boundary slightly.
Runoff production on the two watersheds is generally controlled by mechanisms of saturation excess overland flow (Walter et al., 2000), although infiltration excess runoff was produced on roadways, cow paths, and other compacted areas (Hively, 2004; Walter et al., 2003). Events generally reflect two dominant hydrological seasons: dry (June through September) and wet (October through May). Stream water P concentrations in the dry period tend to be high, but P losses are small because of the modest event flow volumes that result from short-duration summer rainstorms under dry soil conditions. During the wet period, evapotranspiration is reduced and dilution effects are increased, so that stream water P concentrations tend to be much lower, but associated P losses much higher due to larger event flow volumes.
Best Management Practice Treatments
Following two years of pre-treatment (pre-BMP) stream monitoring (June 1993 through May 1995), during which farm operation remained essentially constant, an extensive program of BMPs was implemented on the farm watershed. A site-specific WFP developed by a WAP planning team and the collaborating farm family identified target areas for control of nutrients, sediments, and pathogens. The resulting array of BMPs (Table 1) included physical changes to farm infrastructure as well as organizational changes to farm management, and affected both P source areas and P transport processes across much of the farm landscape. Stream monitoring was suspended during the BMP implementation period (June 1995 through October 1996), and then resumed for the post-treatment (post-BMP) period (November 1996 through October 2000).
|
|
Sampling and Chemical Analysis
An automated stream monitoring station was established at the outflow of each watershed (Fig. 2a, 2b). It consisted of a heated shelter housing a refrigerated automatic sampler (American Sigma Model 900; Hach Corporation, Loveland, CO) and a data logger (Handar Model 555, Vaisala Corporation, Sunnyvale, CA), and two parallel pipes containing sensing equipment and sampling lines, through which all stream flow was channeled. A 43-cm-diameter pipe provided accurate measurements of the low to moderate discharges that occur most of the year, and a 2.1-m-diameter culvert pipe effectively handled high flows. Covered with soil and stone to form a dike, the pipes remained ice-free, thereby increasing accuracy during freezing weather. Flow (m3 s1) was recorded every 10 min by the data logger, using input from combination levelvelocity electromagnetic sensors (Model 251DC flowmeters; Marsh-McBirney, Frederick, MD) located in the pipes. Stream flow rating curves were developed for each site through manual stream gauging and were updated annually. Heated (Model 50202; Young Instruments, Traverse City, MI) and unheated (Model 111; Rainwise, Bar Harbor, ME) rain gauges were installed at the sites to differentiate between rain and snowfall precipitation.
Water samples were collected at least weekly during baseflow periods, and more frequently during runoff events. Event-based sample collection was triggered by onset of precipitation, in the case of rainfall events, and a rise in stream stage of at least 0.03 m. Frequency of sample collection was directly related to the rate of stream rise and fall, up to a maximum rate of six samples per hour. The number of samples collected per event typically ranged from three to more than 20, depending on event magnitude and duration, which ranged from several hours to several days. All events that occurred during the six-year study period were sampled.
Stream water samples were analyzed by the New York State Department of Health for total phosphorus (TP) and TDP. Particulate P was computed as the difference between TP and TDP. Sample concentrations of TP (unfiltered sample) and TDP (vacuumed through a pre-washed 0.45-µm membrane filter) were determined using sulfuric acidammonium persulfate digestion followed by manual colorimetric assay employing the ascorbic acidmolybdate complex method (USEPA, 1979).
For the purpose of calculating loads, an event was considered to start with the beginning of stream rise and was deemed over when concentrations of P returned to approximately pre-event levels, or when another event began. Event P loads were calculated as the product of 10-min flow volumes (m3) and either actual or estimated 10-min P concentrations, summed over the duration of each event. Estimated 10-min P concentrations were derived from actual sample concentrations through linear interpolation.
During the study period, 269 runoff events were observed and sampled. Thirty-nine of these events were removed from the analysis because they were unmatched (event at farm but not at nonfarm: n = 33; or vice-versa: n = 6). Unmatched events occurred mainly when event size was small, so analysis of seasonal loading trends for the farm watershed was not affected by their removal. One additional event (28 Jan. 1994) was deleted because of a suspected laboratory error. The resulting dataset includes 74 events in the pre-BMP period and 156 events in the post-BMP period (Table 2). Data were grouped (Table 2) to reflect seasonal variation in both land application of manure (considered a primary source of P on the farm) following BMP implementation (Fig. 3) and hydrologic runoff processes (wet vs. dry periods). Approximate begin and end dates for the dry period and timing of manure spreading were used to define seasonal date ranges (Table 2).
|
. The natural log transformation was employed to normalize the distribution of event magnitude, as is common practice (Cohn et al., 1989; USEPA, 1997b). The transformation allowed a straightforward evaluation of P load reductions based on pre- vs. post-BMP changes in the y intercept of the log-linear regression model. It also yielded a simple and reasonable mathematical representation of interactions among the covariates.
The complete multivariate regression model, which was applied to both the seasonal and full-year datasets, can be written as:
![]() | [1] |
i is the residual error, assumed to be independently distributed. For simplicity in the following text, the interaction term ki[ln(Pnfi) m] is denoted as k·ln Pnf.
Interpretation of Model Terms
For each seasonal and full-year dataset, the complete multivariate model was fit and nonsignificant terms subsequently dropped. Because the interaction term k·ln Pnf was never significant, calculation of treatment effects was based on the multivariate model:
![]() | [2] |
Differences in pre- vs. post-BMP event P loading were based on a one-sided t test on the f coefficient in Eq. [2] using a 5% significance level. Results were obtained using both STATISTICA Version 6 (StatSoft, 2001) and S-PLUS 6 (Insightful Corporation, 2001) statistical packages, for general linear models and for analysis of covariance. Separate analysis with the two packages provided confirmation that errors were not made in the statistical analyses or the corresponding data manipulation.
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
|
|
![]() | [3] |
This analysis found no significant (p > 0.05) pre- vs. post-treatment differences in the matched watershed flow volume relationship for any season, or for the full year. Overall, adoption of BMPs did not appear to affect farm runoff volume. Similarly, no significant pre- vs. post-treatment differences were found for the farm event peak flow ln(peak Qf) or farm average event flow rate ln(QRf) covariates.
Average flow ratios (Qf/Qnf) in spring were similar to the ratio of watershed areas (1.9), while summer flow ratios were about twice as large (Table 3). This variation is probably attributable to the two basins responding similarly to precipitation under saturated conditions in spring, and somewhat dissimilarly under dry conditions in summer when the larger farm watershed sustains a greater baseflow and the production of runoff from impermeable areas on the farm landscape is comparatively large.
Stream water P concentrations at the farm were typically three to ten times higher than at the control site during events, reflecting the more intensive land use and presence of livestock. Annual flow-weighted mean PP concentrations over the six study years ranged from 152 to 294 µg L1 at the farm and from 24 to 86 µg L1 at the control site (Bishop et al., 2003). Similarly, TDP annual event mean concentrations had a range of 53 to 172 µg L1 at the farm compared with a range of 12 to 15 µg L1 at the control site.
Performance of the Analytical Models
The univariate matched load model:
![]() | [4] |
|
|
![]() | [5] |
|
![]() | [6] |
![]() | [7] |
![]() | [8] |
)] reflecting a constant adjustment, a treatment effect, and the model error. However, b was significantly different from 1 in the full year and in all seasons except the spring for the TDP models, and in the winter and spring for the PP models, indicating a relationship among the variables more complex than in Eq. [7] and [8], and justifying the use of unconstrained estimators of b and c. When the farm event peak flow ln(peak Qf) and the farm average event flow rate ln(QRf) were incorporated as additional explanatory covariates in Eq. [2] they contributed substantially to the PP analyses, increasing adjusted R2 by 5 to 13% for the seasonal analyses and by 11% for the full year (Table 4). These flow terms provided little improvement for TDP, however, increasing adjusted R2 by just 1 to 2% for all analyses. Evidently, TDP loading was effectively described using only ln(Pnf) and ln(Qf/Qnf), while for PP the farm flow terms were useful on-site descriptors of event characteristics affecting sediment transport and PP loading processes. This makes sense from a hydrologic standpoint because PP loading depends on surface detachment and transport processes that are likely to behave somewhat differently at the farm and nonfarm watersheds, whereas TDP loading depends more directly on the volume of runoff produced during each event. In vegetated areas of both watersheds the plant canopy decreases the erosive impact of rain on soils and serves to retard sediment loss. Areas unique to the farm watershed, such as tilled fields, unprotected stream banks, barnyards, farm roads, and cattle lanes, can exhibit elevated sediment loss, particularly as event intensity increases.
For the multivariate PP analyses, ln(peak Qf) was highly significant (p < 0.02) for the full year and for all seasons. Although ln(QRf) was sometimes not significant, the two flow variables appeared to work together, and the largest reductions in residual error were obtained when both were included. Used together in the multivariate model, the farm event flow terms compensated for apparent differences in farm and nonfarm hydrologic response, thus helping to resolve treatment effects.
Best Management Practice Treatment Effects
A best ANCOVA model was selected for TDP and PP for each season from the univariate (Eq. [4]), bivariate (Eq. [5]), and multivariate (Eq. [2]) matched load models, based on the residual variance and in view of the significance of the coefficients. The multivariate model was always best except for the winter TDP analysis when the bivariate was best. Best model regression results and the estimated magnitude of post-treatment P load reductions, along with 95% (z = 1.96) confidence intervals (CI95) for the estimates computed from the coefficient and standard error of the treatment indicator variable k, are given in Table 5, for each season and the full-year dataset.
Multivariate analyses of the full-year datasets (Table 5) indicated that TDP event loads were reduced by 43% in the post-BMP period (CI95: 3649%), while PP event loads were reduced by 29% (CI95: 1541%). Significant (p < 0.05) TDP load reductions were evident in all seasons except the fall. For PP, load reductions were significant only in the winter and summer seasons. Effects of seasonal differences in BMPs and hydrology on results (Table 5) are discussed below.
Winter (15 December13 April)
The TDP and PP event loads were reduced by 43% (CI95: 3650%) and 32% (CI95: 1545%), respectively, in the post-BMP period. It is likely that winter load reductions were largely attributable to storage of manure and minimal spreading from mid-December to mid-April (Fig. 3). The residual variances for winter were the smallest, perhaps because soil conditions during this time period (frozen, snow-covered, or saturated) created a more uniform runoff source area than in other seasons.
Spring (14 April14 June)
The TDP event loads were reduced by 36% in the post-BMP period (CI95: 1452%), while PP event loads showed a nonsignificant (p > 0.05) reduction of 18% (CI95: 31 to 48%). Although springtime is also characterized by saturated watershed conditions, the residual variances for this period were higher than for the winter, perhaps due to variability introduced by P loss from manure-spread fields and to increased sediment availability resulting from spring tillage. Manure was heavily surface-applied in the spring months (Fig. 3) to empty the storage, with some being incorporated into the soil during tillage.
Summer (15 June30 September)
The TDP and PP event loads were reduced by 52% (CI95: 3664%) and 39% (CI95: 262%), respectively, in the post-BMP period. In the dry summertime, upper watershed slopes did not usually saturate, and P loads were produced mainly from near-stream, impermeable, and slope-break source areas. Summer P load reductions may be attributable to exclusion of cows from the stream corridor, relocation of the silage storage bag away from the stream bank, implementation of rotational grazing, improved pasture management, and somewhat reduced manure spreading during summer months (Fig. 3). The residual variances for the summer were the largest of any season, most likely due to the highly variable nature of summer rainstorms and soil moisture conditions. Although the multivariate model (Eq. [2]) was chosen as the best model for the summer PP analysis to maintain consistency with the considered model options, the coefficients for ln(Pnf), ln(Qf/Qnf), and ln(QRf) were not significant (p > 0.05) in the summer. If the nonsignificant terms were sequentially removed, the model reduced to a single covariate, farm event peak flow ln(peak Qf), reflecting the association of PP loading with sediment transport processes. During summer periods, increased sediment availability from near-stream sources and greater minimum baseflow at the farm appeared to create conditions that were not well matched by the smaller, more vegetated control site. Thus, while the bivariate model ([Eq. 5]) did have significant results, models containing ln(peak Qf) provided better resolution (Table 4).
Fall (1 October14 December)
Significant event P load reductions were not observed during the fall season (Table 5). Increased fall manure spreading in the post-BMP period (Fig. 3) when the farmer emptied the manure storage lagoon in preparation for the winter may have offset any P reductions attributable to other BMPs implemented on the farm.
Other changes in farm practices occurring in the post-BMP period may have counteracted the effect of BMPs to some degree. These included a gradual increase in herd size of about 30% and intensified use by cows of a streamside loafing yard that created a concentrated P-loading source area not far upstream of the monitoring station (Hively, 2004). In addition, none of the BMPs (Table 1) altered either the amount of P imported onto the farm as feed or fertilizer or the amount exported as products. Therefore, as the mass balance of P on the farm did not change appreciably during the study period, presumably any reductions observed in stream losses of P resulted from more of it being retained on the farm. This outcome has the potential of accelerating net accumulation of P in the farm soils and eventually raising STP levels to the point of saturation of soil P binding capacity. Studies indicate this saturation point represents a threshold of STP above which TDP concentrations in runoff can increase sharply (e.g., Beauchemin and Simard, 1999; McDowell and Sharpley, 2001), an effect that could lead to increased loading of dissolved P from the farm in the future.
Beginning in 2001, a second round of BMPs was implemented on the farm that not only corrected the concentrated P source area, but addressed an identified P imbalance resulting from more P being imported, largely as purchased feeds, than was exported in the form of milk, meat, or crops. The farm watershed P mass balance was improved with institution of a precision feeding program, which lowered imports of dietary P by an average of 25% and reduced excretion of P in manure by 33% (Cerosaletti et al., 2004). Reductions of this magnitude in the amount of manurial P applied to the farm soils are expected to slow the rate of soil P accumulation and continue to reduce losses of P in runoff waters.
Our study was somewhat unique in its characterization of the changes in water quality from a single farm, so findings from other BMP effectiveness studies that monitored larger watersheds may not be directly comparable. One study of interest (Brannan et al., 2000), however, demonstrated reductions of 35% in PP loading and 4% in TDP loading in a 10-year evaluation of animal waste practices (including manure storage, spreading schedules, and stream fencing) implemented in a 331-ha Virginia watershed containing two dairy farms. In the same study, the authors reported PP load reductions of 70%, but TDP load increases of 117% in a nearby 462-ha agricultural watershed composed mostly of cropland that received BMPs including nutrient management plans based on nitrogen needs, and field erosion control practices. Application of manure at rates based on nitrogen needs of crops, which typically results in overfertilization of P, and conversion of organic P to inorganic P in the manure storage were given as factors that could explain the ineffectiveness of the program in reducing TDP loads. The BMPs evaluated in our study produced overall PP reductions comparable with those Brannan et al. (2000) reported for the first watershed and about half of that observed in the second watershed, but were much more successful in reducing TDP loading. Findings of Brannan et al. (2000) may constitute evidence of the eventual P saturation of soil and subsequent release of dissolved P in runoff that is postulated to occur when conservation and nutrient management practices are implemented in the absence of efforts to improve whole-farm mass balance of P.
Single Watershed Models
Although the study was designed to use paired watersheds, with event loads from the nonfarm watershed ln(Pnf) serving as the primary covariate to explain natural variation in farm event loads ln(Pf), we were interested in determining what BMP effects might have been observable if only the farm watershed had been monitored, as in a simple "before and after" study, and just its flow variables were available as covariates. Farm event flow volume ln(Qf) was first tested as the primary covariate in the univariate single watershed model:
![]() | [9] |
When compared with the univariate single watershed model (Eq. [9]), a bivariate single watershed model using farm event peak flow ln(peak Qf) and average farm event flow rate ln(QRf) together:
![]() | [10] |
While both of these single watershed models explained a larger portion of observed variability (Table 4) than was accounted for by the standard univariate matched load model (Eq. [4]), they were not able to demonstrate significant (p < 0.05) reductions in event PP loads, and never outperformed the multivariate models. These results are consistent with the conclusion of Spooner et al. (1985) that, without a very long monitoring period, before and after study designs are unlikely to document statistically significant BMP effects at the watershed level.
Analytical Power and Minimum Detectable Treatment Effect
Model precision is an important issue because it affects both statistical power and minimum detectable treatment effects (Cooke et al., 1995; Galeone, 1999; Loftis et al., 2001). From a research perspective, a monitoring study should be designed so that it has a high probability, or power, of detecting statistically significant changes in water quality. Power is a function of the magnitude of change in the water quality variable being investigated, the variance of the residual errors (the complement of model precision), and the required significance level. Statistical power for a one-sided t test, based on pre- and post-treatment sample size and model precision, can be approximated (Bickel and Doksum, 1977) by:
![]() | [11] |
(f) is the power equal to the probability of correctly rejecting the null hypothesis of no treatment effect, as a function of f > 0;
is the required Type I error;
is the standard normal cumulative distribution; z
=
[1
], the critical value for a normal distribution; f is the coefficient of k in (Eq. [2]), describing the magnitude of P load reductions; and
f =
1/2, standard error of the coefficient f of the k term in Eq. [2], where
2e is the residual variance and (XTX)1 is the hat matrix from the regression analysis.
An estimate of
f appears in the regression output for each season and the full-year dataset, making the computation in Eq. [11] relatively simple if attention is restricted to the set of observed covariates. This approach neglects any uncertainty associated with estimators of the residual variance
2e, generally unimportant for larger sample sizes (Bickel and Doksum, 1977).
A comparison of power for the univariate (Eq. [4]), bivariate (Eq. [5]), and multivariate (Eq. [2]) matched load models is presented graphically in Fig. 6a
(TDP) and 6b (PP) for the seasons with the smallest (winter) and largest (summer) residual variances. The faster the power curves rise toward one, the greater the probability that the test will correctly detect a specified water quality improvement. Residual variances
2e and the estimated probabilities (power) of detecting a 20% treatment effect (f = 0.22) are listed in Table 6. It is clear that the incorporation of the flow ratio covariate ln(Qf/Qnf) into the bivariate model substantially increased the power of the ANCOVA for both TDP and PP (Fig. 6a, 6b); adding the farm flow terms ln(peak Qf) and ln(QRf) into the multivariate model further increased the power of the PP analyses. Gains in power were greatest for the summer season when incorporation of the flow ratio term produced the greatest reduction in residual variance (Table 6). Power was greatest for the full-year and winter analyses due to the larger sample sizes (Table 2) and to the smaller residual variance for the winter season model. Power was lower for PP analyses than for TDP analyses, due to the higher degree of variability associated with the PP data.
|
|
) and fixed power (
), the smallest detectable treatment effect can be calculated using:
![]() | [12] |
is the critical value for a Type I error probability (
), zß is the critical value for Type II error probability (ß), and
f =
1/2 is the standard error of the estimator of k in Eq. [2].
For example, if
= 5% and ß = 20% (power = 80%):
![]() |
If the sample size is small (<25), then Eq. [12] should be modified to account for the limited degrees of freedom (Bickel and Doksum, 1977).
Estimates of MDTE calculated using power = 80% and a desired confidence level of
= 5% (Table 6) demonstrate that the addition of a second covariate ln(Qf/Qnf) to the bivariate model (Eq. [5]) substantially reduced the minimum detection limit for both TDP and PP when compared with the univariate matched load model (Eq. [4]). Without this innovation sample sizes were insufficient to separate the data into seasonal categories and still detect treatment effects. Incorporating ln(peak Qf) and ln(QRf) as additional covariates in the multivariate model (Eq. [2]) resulted in further improvements in model power and reductions in MDTE, particularly for the PP analyses.
Estimating Average Watershed Loading Rates
Six-year average farm watershed P loading rates were estimated using observed covariate data along with coefficients from the best seasonal ANCOVA models (Table 5) to predict event loads as if BMPs had been in place the entire time or had not been implemented at all. In particular, two "filled-in" event datasets were produced: one combined two years of observed with four years of modeled loads to yield six years without BMPs (no-BMP); the other combined four years of observed with two years of modeled loads to yield six years with BMPs (BMP). The treatment index variable k was set equal to one (Eq. [2]) to calculate BMP values for the pre-treatment period, and was set equal to zero to calculate no-BMP values for the post-treatment period.
A quasi-maximum likelihood error estimator (Ferguson, 1986; Cohn et al., 1989; Cohn 1995; Millard, 1996):
![]() | [13] |
This method can also be used to estimate a BMP treatment effect (Hively and Stedinger, 2003), although confidence intervals will be wider than in Table 5 due to the small number of yearly sums and to the large degree of inter-annual variation. It would have utility, however, if the interaction term k·ln Pnf in Eq. [2] were found to be significant, in which case the influence of BMPs would vary with event magnitude and the simple treatment index variable k would not represent the entire treatment effect. In the present study the interaction term was never significant; accordingly, the regression model results presented in Table 5 provide a more accurate and direct estimate of treatment effects.
| CONCLUSIONS |
|---|
|
|
|---|
Multivariate ANCOVA was an effective technique for evaluating paired watershed event load data. Covariates explained 80 to 90% of observed variability in annual and seasonal TDP and PP event loads. The multivariate model improved on previous methods of paired watershed data analysis (USEPA, 1993) by including several covariates to control for environmental variability. The nonfarm watershed proved to be a satisfactory control for farm TDP loads as long as the ratio of watershed event flow volumes was also incorporated as a covariate. This term accounted for "unmatched" differences in event flow volumes resulting from unbalanced event precipitation at the two sites, and was most important in the summer season when precipitation events were more spatially variable. For PP, incorporation of two additional covariates, farm event peak flow and farm event flow rate, resulted in further and substantial improvement in model accuracy by accounting for inherent between-watershed differences in sediment availability and hydrologic response. The high degree of unexplained variability apparent in the PP study data suggests that future studies of PP loading should consider increasing intra-event sampling frequency and incorporating a precipitation intensity covariate into the ANCOVA to better account for particulate delivery and transport processes.
Separation of the data into seasons before analysis allowed a more precise evaluation of changes in P event loads and their relation to specific BMPs when compared with the full-year analysis, despite the smaller sample size. The improved precision was particularly important for analysis of the winter season, which accounted for 40 to 50% of annual P event loading.
Overall, the use of multiple covariates, including parameters such as flow, precipitation, or temperature, may substantially improve the power of paired watershed ANCOVA models and reduce the number of samples necessary for detection of changes. Cost savings could be realized by employing multiple predictor variables to improve the precision of the analysis, rather than by increasing sample size. Using a nonfarm site as a control can provide a valuable alternative to the use of a second farm in long-term paired watershed studies, as it can be difficult to find a control farm that maintains consistent operation and management throughout the study period, especially in areas like the New York City water supply basin where 85% of the dairy farms are participating in the WFP program. Use of a stable control site to represent environmental variation along with other covariates from the treatment watershed was critical for successfully determining the seasonal effects of BMPs on farm P loading.
| ACKNOWLEDGMENTS |
|---|
| REFERENCES |
|---|
|
|
|---|