Department of Geological Sciences, Case Western Reserve University, 10900 Euclid Ave., Cleveland, OH 44106-7216
* Corresponding author (dbm3{at}po.cwru.edu)
Received for publication August 12, 2000.
| ABSTRACT |
|---|
Abbreviations: CRP, Conservation Reserve Program; NO2+3, nitrate plus nitrite; SRP, soluble reactive phosphorus; TP, total phosphorus; TSS, total suspended solids
| INTRODUCTION |
|---|
Loads of these species are subject to controlled and uncontrolled influences. Recent efforts to decrease loads and soil loss have included the Conservation Reserve Program (CRP), in which cropland deemed most vulnerable to soil loss is left fallow; and conservation tillage, in which the soil is planted without tillage, or with light mulch tillage. Less-deliberate factors include application of fertilizer and animal waste. Weather is uncontrolled.
Research reported in this issue indicates links between agricultural practices and loads in the Maumee and Sandusky River watersheds. Phosphorus loads appear to be related to fertilizer, manure application, and conservation tillage (Richards and Baker, 2002; Calhoun et al., 2002b). Decreasing sediment loads are attributed to conservation tillage and conservation reserve (Richards and Baker, 2002; Forster and Rausch, 2002; Matisoff et al., 2002). Nitrate loads are related to tile drainage (Calhoun et al., 2002a), and could reflect changes in fertilizer and manure use (Richards and Baker, 2002), themselves related to a shift from corn (Zea mays L.) to soybean [Glycine max (L.) Merr.] and wheat (Triticum aestivum L.) (Forster and Rausch, 2002).
While these coincidences suggest causal links, other possible explanatory variables exist, notably those related to weather and hydrology. The goal of this investigation was to identify the variables exhibiting the greatest explanatory power for statistically describing load variations in the 1976–1995 period. Identification of these variables would further several ends. It would provide a more comprehensive context in which to assess more focused findings. For example, where relationships of loads to agricultural changes are suggested, the results of this study should indicate whether variations in hydrology or weather might instead explain observed load variations. Identifying the best explanatory climatic variables is itself of interest, pointing to salient processes affecting loads, and leading to knowledge that might be exploited in efforts to limit nutrient and sediment loads. A further benefit of the study is the quantification of the sensitivity of load variations to changes in the explanatory variables; these results are employed in the companion paper to this article (Moog and Whiting, 2002).
Identification of the salient variables was achieved through statistical modeling, building on the example of Jordan et al. (1997), who presented a general linear model relating nitrate, phosphorus, particulates, carbon, and silica loads from 27 watersheds feeding Chesapeake Bay to base flow fraction, percentage of cropland, and physiographic province. Seeking a more comprehensive survey, we instead employed stepwise regression with forward selection, in which candidate variables were tested one at a time, permitting efficient testing of a larger number of variables, including those that are highly colinear. The objective was not the output of the model but rather its form (i.e., which variables were selected to compose it). The main limitations to the study lay in the scope of the test variables, which were limited by data availability and the need to average, and by limiting regression to linear relationships (except in the case of stream discharge).
The goal of this study was to identify the variables exhibiting the greatest explanatory power for statistically describing load variations in the 1976–1995 period in the Maumee and Sandusky watersheds. It was achieved by using stepwise regression with forward selection, building a model by successively adding a term using the most significant remaining variable.
| METHODS |
|---|
All statistical analysis was based on monthly values of the variables. For most climate and streamflow variables, these were averages of the daily values for each month. Also included for some variables were the standard deviation and maximum of the daily values for each month.
Climate data included daily precipitation, temperature (low, high, and mean), snowfall depth, and snow depth. The data were obtained from the Midwestern Climate Information System (MICIS) of the Midwestern Climate Center, a cooperative program of the National Weather Service's National Climatic Data Center and the Illinois State Water Survey. Average climatic conditions across each basin were derived from the stations shown in Fig. 1. Records of a particular variable (e.g., precipitation) at a particular station were accepted if they covered at least 90% of the days in the study period. Missing values that remained were estimated with the value at the nearest operating station for each missing date. In the Sandusky basin, four stations met the 90% completeness criterion in each of the six climate categories, except for snowfall and snow depth at one station. In the Maumee basin, 16 stations met the criterion in one or more categories. Of these, precipitation records were accepted at all 16, temperature at 12, snowfall at 12, and snow depth at 11. In both basins, the stations provided an even spatial distribution. For each day, a single value of each climate variable in each basin was derived by averaging over the stations in the basin, using the inverse distance-squared method (Smith, 1993, p. 3.20), with grids having sides of 0.1 degrees latitude and longitude.
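The inverse distance-squared averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `idw_squared` and `basin_mean` are hypothetical, distances are taken directly in degrees of latitude and longitude (an assumed simplification), and the caller is assumed to supply the 0.1-degree grid points that fall inside the basin.

```python
def idw_squared(stations, target, eps=1e-12):
    """Inverse distance-squared estimate at one grid point.

    stations: list of ((lat, lon), value) for one day and one variable.
    target:   (lat, lon) of the grid point.
    Distances are in degrees, an assumed flat-map simplification.
    """
    num = 0.0
    den = 0.0
    for (lat, lon), value in stations:
        d2 = (lat - target[0]) ** 2 + (lon - target[1]) ** 2
        if d2 < eps:
            # Grid point coincides with a station: use its value directly.
            return value
        weight = 1.0 / d2
        num += weight * value
        den += weight
    return num / den


def basin_mean(stations, grid_points):
    """Average the interpolated values over the grid points inside a basin."""
    estimates = [idw_squared(stations, point) for point in grid_points]
    return sum(estimates) / len(estimates)
```

A grid point midway between two stations receives the plain mean of their values, and closer stations dominate elsewhere, which is the behavior the weighting is meant to provide.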
Daily mean streamflow rates were obtained from the National Water Information System (NWIS), using the United States Geological Survey (USGS) streamflow gauges at Tindall Bridge near Fremont, Ohio on the Sandusky River (Station 04198000, drainage area 3240 km²); and at Waterville, Ohio on the Maumee River (Station 04193500, drainage area 16390 km²). These sites are near the furthest downstream points that are free from lake backwater. Temporal averaging was identical to that for climate variables (i.e., monthly averages of daily means). At both stations, the daily streamflow records were complete for the study period: water years 1976 to 1995 (1 Oct. 1975 to 30 Sept. 1995).
Loads were provided by the Water Quality Laboratory at Heidelberg College, which sampled concentrations of NO2+3, SRP, TSS, and TP near the USGS gauges. Samples were taken from one to four times daily, at an average rate of 38 times per month. Concentrations were multiplied by the instantaneous flow rate at the time of the sample to obtain instantaneous loads, which were numerically integrated over each month. These monthly loads were adjusted for time gaps in sampling and instantaneous flow artifacts by multiplying them by the ratio of the USGS monthly discharge to the observed monthly discharge, which was obtained by integrating the instantaneous discharges analogously to the instantaneous loads. The concentration and load records were complete except for the period October 1978 to September 1981 in the Maumee River.
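The load calculation above can be illustrated with a short sketch. It assumes trapezoidal integration (the text says only "numerically integrated"), and the names `trapezoid` and `monthly_load` and the units are hypothetical choices made for the example.

```python
def trapezoid(times, values):
    """Trapezoidal integral of values over times (same length, times increasing)."""
    total = 0.0
    for i in range(1, len(times)):
        total += (times[i] - times[i - 1]) * (values[i] + values[i - 1]) / 2.0
    return total


def monthly_load(times_s, conc_mg_per_l, flow_m3_per_s, usgs_volume_m3):
    """Monthly load in grams, adjusted to the USGS monthly discharge volume.

    times_s:        sample times in seconds within the month
    conc_mg_per_l:  sampled concentrations (mg/L)
    flow_m3_per_s:  instantaneous flow at each sample time (m^3/s)
    usgs_volume_m3: monthly discharge volume from the USGS daily record
    """
    # mg/L times m^3/s equals g/s, so no unit factor is needed here.
    instantaneous_load = [c * q for c, q in zip(conc_mg_per_l, flow_m3_per_s)]
    raw_load_g = trapezoid(times_s, instantaneous_load)
    # Discharge "observed" by the sampling, integrated the same way as the load.
    observed_volume_m3 = trapezoid(times_s, flow_m3_per_s)
    # Scale by the ratio of USGS to observed discharge to correct for sampling gaps.
    return raw_load_g * usgs_volume_m3 / observed_volume_m3
```

When sampling covers only part of the month, the ratio exceeds one and scales the integrated load up in proportion to the flow that was missed.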
Agricultural data were available by county as annual values from 1979 to 1995. Basinwide values were calculated as the sum of the values from counties wholly or mostly within each watershed. Nitrogen and phosphorus fertilizer delivered to dealers and livestock data were taken from the Ohio Agricultural Statistics and Ohio Department of Agriculture Annual Reports. Poultry data were not available, owing to the limited number of producers and privacy regulations. Nitrogen and phosphorus content in animal waste was calculated from head of livestock using the formulae in the Ohio Livestock Manure and Wastewater Management Guide (The Ohio State University, 1992). Acreage in conservation tillage and the Conservation Reserve Program came from the Core 4 program of the Conservation Technology Information Center in Lafayette, Indiana. Since agricultural variables were available only as annual values, each month in a calendar year was assigned the same value. In the case of fertilizer or animal waste, actual values vary from month to month, but when compared only with the same month(s) from other years, the variable may be well-characterized by its annual value as long as the pattern of variation within each year is consistent. The difference between the calendar year and the agricultural year was handled by including variables representing both prior and current years, so that the model procedure could select the appropriate year for that month or season.
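A small sketch of how annual agricultural values might be spread over months, with both current-year and prior-year versions available to the model. The function name and data layout are hypothetical and used only for illustration.

```python
def annual_to_monthly(annual, years):
    """Map each (year, month) to (current-year value, prior-year value).

    annual: dict of calendar year -> annual value (e.g., fertilizer deliveries)
    years:  iterable of calendar years to expand
    A missing year yields None, so the caller can drop that month.
    """
    table = {}
    for year in years:
        for month in range(1, 13):
            table[(year, month)] = (annual.get(year), annual.get(year - 1))
    return table
```

Offering both columns as candidates lets the selection procedure decide, month by month or season by season, which year's value relates more closely to the load.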
Model Development
The variables with the greatest explanatory power for loads were found by building a statistical model that selected from the set of test variables, using stepwise regression with forward selection. Time was not a variable; rather, model construction was repeated for subsets of the study period, such as particular months.
Load–Discharge Fit
Preliminary investigation led to modeling streamflow discharge separately from the other explanatory variables. It became apparent that streamflow rates dominated explanation of the variance of the loads. This result is no surprise, since flow rates are closely related to stream runoff and transport ability, and were used to calculate loads from the concentration data. Accuracy in modeling the load–discharge relationship is of primary importance, as relatively small errors in the fit may be comparable with the variance in load due to other explanatory variables. Separating discharge permitted a clearer focus on the secondary variables. This separation was accomplished by adopting the residuals of the load–discharge model as the response variable. These residuals are analogous to the "flow-adjusted concentrations" introduced by Hirsch et al. (1982).
The relationship between load and discharge differed by species, as shown by the example log–log scatterplots for NO2+3 and SRP in the Maumee watershed (Fig. 2). For SRP, TSS, and TP, the plots are reasonably linear and of constant variance, so that they may be modeled as:

$$\ln L = a_0 + a_1 \ln Q + \varepsilon \qquad [1]$$

where L is the monthly load, Q is the monthly mean discharge, a0 and a1 are regression constants, and ε is random error about the regression line.
|
![]() | [2] |
Closer investigation showed that the load–discharge relationships, Eq. [1] and [2], were not consistent throughout the year, but tended to vary seasonally. In order to maximize accuracy, separate load–discharge equations were developed for each combination of month, species, and watershed. Individual months exhibited the same functional dependencies as those shown in Fig. 2.
In the subsequent text, the "load–discharge residuals" (the logs of the measured monthly loads minus those predicted by Eq. [1] or [2]) are used as the response variables.
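For the species whose log–log plots are linear, the residuals could be computed along the following lines. This is a sketch under that linearity assumption only; the function name `fit_load_discharge` is hypothetical, and in practice it would be called once for each combination of month, species, and watershed.

```python
import math


def fit_load_discharge(discharge, load):
    """Fit ln(load) = a0 + a1 * ln(discharge) by least squares.

    Returns (a0, a1, residuals), where each residual is the log of the
    measured load minus the log of the load predicted from discharge.
    """
    log_q = [math.log(q) for q in discharge]
    log_l = [math.log(v) for v in load]
    n = len(log_q)
    mean_q = sum(log_q) / n
    mean_l = sum(log_l) / n
    sxx = sum((q - mean_q) ** 2 for q in log_q)
    sxy = sum((q - mean_q) * (v - mean_l) for q, v in zip(log_q, log_l))
    a1 = sxy / sxx
    a0 = mean_l - a1 * mean_q
    residuals = [v - (a0 + a1 * q) for q, v in zip(log_q, log_l)]
    return a0, a1, residuals
```

A positive residual marks a month whose load was higher than its discharge alone would predict, which is the quantity the later regression tries to explain.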
Stepwise Linear Regression Model
The load–discharge residuals, which are log-transformed loads, were related linearly to the model variables. Those variables measuring streamflow, precipitation, or rainfall were first log-transformed because they tended to exhibit lognormal distributions. All other variables were left untransformed, as most of these could take on zero values. The resulting model equation was:
$$\ln L_R = b_{10} + \sum_{i=1}^{N_1} b_{1i}\,x_{1i} + \sum_{i=1}^{N_2} b_{2i}\,\ln x_{2i} + \varepsilon \qquad [3]$$

where ln(LR) is the load–discharge residual; ε is random error, as in Eq. [1]; N is the number of explanatory variables; and b are regression constants. The subscript 1 or 2 refers to the untransformed or log-transformed variables, respectively. The log-transformed terms represent the hydrologic transporting medium and thus drive the load to zero as they themselves approach zero. Each untransformed variable acts instead as a modification; as it approaches zero, its influence vanishes and the load remains finite. The coefficients b (including the constant term b10) in Eq. [3] were found through stepwise, univariate linear regression with forward selection. First, each of the model variables was tested separately as the sole explanatory variable (x11 or x21), with the coefficient b11 or b21 derived by linear least-squares regression of ln(LR) on x1. The model variable explaining the greatest degree of variance in ln(LR) (i.e., having the smallest sum of squared residuals after linear regression) was then accepted as the "leading term" of the model. The process was repeated by testing each remaining variable as x2, then x3, and so on, until reaching a stopping criterion.
In forming a stopping criterion, parsimony was valued because an important objective of the model building was to identify variables that bear meaningful correlations to loads. Liberal inclusion of explanatory variables could explain a high degree of variance, but at the cost of including numerous variables whose contribution is primarily one of random chance. Thus, rather than apply a model-oriented stopping criterion (e.g., the Akaike Information Criterion, described by Venables and Ripley, 1997), we required that the p value of the added term in explaining the marginal variance (that left after including the previous x) be less than 0.01. This criterion was more parsimonious than the Akaike Information Criterion.
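One reading of this procedure is sketched below: at each step the current residuals are regressed on every remaining candidate, the candidate leaving the smallest sum of squared residuals is kept if its p value is below 0.01, and the residuals are updated. The function names are hypothetical, the degrees of freedom are fixed at n - 2 for every step (an assumption made for the sketch), and the t-distribution p value is obtained by numerical integration only to keep the example free of external libraries. Log-transformed variables are assumed to be transformed by the caller.

```python
import math


def fit_line(x, y):
    """Least-squares line y = b0 + b1*x; returns (b0, b1, sse, sxx)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse, sxx


def t_two_sided_p(t, df, steps=4000):
    """Two-sided p value for Student's t, by Simpson integration of the density."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    c /= math.sqrt(df * math.pi)

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def forward_select(response, candidates, p_max=0.01):
    """Forward selection on residuals; returns [(name, slope, p), ...] in order."""
    resid = list(response)
    remaining = dict(candidates)
    chosen = []
    df = len(resid) - 2  # assumed constant across steps in this sketch
    while remaining:
        fits = {name: fit_line(x, resid) for name, x in remaining.items()}
        # Candidate explaining the most remaining variance (smallest SSE).
        best = min(fits, key=lambda name: fits[name][2])
        b0, b1, sse, sxx = fits[best]
        x = remaining.pop(best)
        if sse > 0:
            p = t_two_sided_p(b1 / math.sqrt(sse / df / sxx), df)
        else:
            p = 0.0  # perfect fit
        if p >= p_max:
            break  # stopping criterion: added term not significant
        chosen.append((best, b1, p))
        resid = [r - b0 - b1 * xi for r, xi in zip(resid, x)]
    return chosen
```

Because each step works on what earlier terms left unexplained, the reported slope and p value of a later term are marginal quantities, not those of a simultaneous multiple regression.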
Stepwise univariate regression with forward selection offered several advantages over multiple regression. By avoiding problems with colinearity due to closely correlated explanatory variables, it allowed us to test numerous alternative formulations of several basic phenomena. For example, the candidate variables included both the maximum and standard deviation of daily precipitation, which were very closely related. If they had been included simultaneously in a multiple regression, they would have split the explained variance and would have appeared less significant than would either one alone.
Another advantage to stepwise regression is that the secondary terms may reveal relationships that would be obscured if earlier terms to which they are correlated were not first removed. For example, during March and April in the Maumee watershed, the standard deviation of precipitation was positively correlated to TSS load–discharge residuals, but with the effects of maximum daily rainfall removed, the correlation became negative. That is, though more variable daily rainfall was related to greater loads, this could be ascribed to the tendency of months with more variable daily rainfall to exhibit greater maxima. Once the effect of maximum rainfall was accounted for and removed, variable daily rainfall showed instead a negative association with load.
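The sign reversal described above can be reproduced with a toy example. The numbers below are invented purely for illustration and are not the study's data; the helper names are hypothetical.

```python
import math


def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    cx, cy = center(x), center(y)
    sxy = sum(a * b for a, b in zip(cx, cy))
    return sxy / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))


def residuals(x, y):
    """Residuals of the least-squares line of y on x."""
    cx, cy = center(x), center(y)
    slope = sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)
    return [b - slope * a for a, b in zip(cx, cy)]


# Made-up values: variability tracks the maximum, plus a small independent part.
max_rain = [1.0, 2.0, 3.0, 4.0]
extra = [0.1, -0.1, -0.1, 0.1]
sd_rain = [0.5 * m + e for m, e in zip(max_rain, extra)]
# The response rises with the maximum but falls with the independent part.
response = [m - 2.0 * e for m, e in zip(max_rain, extra)]

r_before = pearson(sd_rain, response)                        # positive
r_after = pearson(sd_rain, residuals(max_rain, response))    # negative
```

Before the maximum is removed, variability looks positively related to the response only because it rides along with the maximum; afterward, its own negative association shows through.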
Explanatory Variables
Numerous candidate explanatory variables for loads were formed from the data on climate, streamflow, and agriculture. Choices were limited by data availability and the need to simplify through averaging. The goal of selection was not only to include salient variables in load exports, but also to attain broader insight by testing a comprehensive set of factors.
Tables 1 and 2 indicate the variables that were employed in the final model to explain the variation in mean monthly loads of NO2+3, SRP, TSS, and TP. The variables in Table 2 were formed from those in Table 1. They were selected for their relations to soil conditions and the interaction of soil with precipitation. For example, erosion and leaching may be impeded by interception by a snowpack, storage of water above ground, and shielding of soil from raindrop impacts. Snow cover might be correlated to soil conditions, such as freezing.
The modeling was split into two time periods for each watershed because agricultural data were available only for 1979–1995, while flow and climate data were available for the entire 1976–1995 study period. The first time period employed only climatic explanatory variables, covering water years 1976–1978 and 1982–1995 for the Maumee basin and 1976–1995 for the Sandusky basin. The second time period included both climatic and agricultural data, covering water years 1982–1995 for the Maumee basin and 1980–1995 for the Sandusky basin. Both time periods were modeled for each month or season. The resulting model from the second time period was taken as the solution to Eq. [3] if any agricultural variables proved significant. Otherwise, the model using only the longer-term climate variables was adopted.
Equation [3] was initially solved separately for each month (e.g., using sets of all January values), and the results showed some consistency within certain groups of months, with clear contrasts among others. Because this consistency was frequently obscured by variance on the monthly time scale, the model was made more robust by grouping the monthly values into five "seasons": January to February, March to April, May to July, August to October, and November to December. Selection was based on consistency among the months within the seasons (based on examination of monthly box plots of load, discharge, and all model variables) as well as consistency in the explanatory variables chosen by the regressions based on individual months. Streamflows and loads in August, September, and October were very similar and lower than in other months (Fig. 3). January and February were distinguished by snowfall and snow cover (Fig. 4), as well as the significance of snow-related variables in the monthly models. March and April tended to have high discharges, snowmelt, and loads. May, June, and July were lower in discharge and load, though still high in rainfall, and corresponded to the period of tillage, planting, fertilizer application, and crop growth.
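The five-season grouping can be expressed as a simple lookup. The names `SEASONS`, `MONTH_TO_SEASON`, and `group_by_season`, and the record layout, are hypothetical; only the month groupings come from the text.

```python
# Month numbers belonging to each of the five "seasons" named in the text.
SEASONS = {
    "Jan-Feb": (1, 2),
    "Mar-Apr": (3, 4),
    "May-Jul": (5, 6, 7),
    "Aug-Oct": (8, 9, 10),
    "Nov-Dec": (11, 12),
}

MONTH_TO_SEASON = {
    month: name for name, months in SEASONS.items() for month in months
}


def group_by_season(records):
    """Group (year, month, value) records into the five seasons."""
    groups = {name: [] for name in SEASONS}
    for year, month, value in records:
        groups[MONTH_TO_SEASON[month]].append((year, month, value))
    return groups
```

Pooling months this way gives each regression two or three times as many monthly values as a single-month model.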
| RESULTS AND DISCUSSION |
|---|
A strict interpretation of the results must be limited to the specific set of variables that were tested. For example, lacking data on fertilizer applications, we employed fertilizer deliveries to dealers in the watershed; these may be only roughly correlated to actual use, and it must be understood that we were not testing fertilizer application as an explanatory variable, but merely deliveries. Additionally, all variables were subject to error and limited in temporal and spatial precision, either by data availability or by the need to simplify through averaging. The test variables were a limited subset of all conceivable variables.
More general insights into factors affecting exported loads may be attained by speculation, which relaxes the strict interpretation by recognizing that the selected variables represent broader classes of related phenomena. For example, a strong correlation to fertilizer deliveries would suggest that fertilizer use may be a significant factor in nutrient exports. A consistent correlation of loads to related variables such as rain, precipitation, streamflow, and maximum precipitation might be understood as a result of phenomena linked to "wet weather". Compared with the numerical results, such speculation is less certain but also potentially more useful in guiding subsequent research. The major caveat is that one must be especially cautious in generalizing a lack of a relationship. For example, while in actuality there may be an association between levels of fertilizer application and export of nutrients, the test variables may simply be insufficient to reflect it. Random errors and imprecision in the data may obscure relationships. On the other hand, they do not invalidate the correlations that are discovered.
Tables 3 through 6, and a summary in Table 7, list components of the statistical models for prediction of monthly values of the load–discharge residuals of NO2+3, SRP, TSS, and TP, respectively. They do not include the regression coefficients, but are designed to convey the ability of the selected model variables to explain the variance in the load–discharge residuals. The signs and significance of these terms are "marginal"; that is, they are based on ability to explain variance that remained after previous terms were applied. In particular, the correlation of discharge to load was removed in all cases.
Nitrate plus Nitrite (NO2+3)
Of the four species, results for NO2+3 (Tables 3 and 7) exhibited the most consistency between watersheds and greatest significance in the model variables. Notable in Table 3 is the negative correlation of antecedent precipitation and streamflow (virtually interchangeable) to the load–discharge residuals. The relationship was very significant in both watersheds from November through April. Figure 5 depicts the correlation of streamflow in the previous 12 months to the load–discharge residuals in each month. The two watersheds appear very similar.
Another significant correlation was observed for the snow cover fraction in January–February. In both basins, snow cover exhibited a negative relationship to NO2+3 load–discharge residuals as a secondary explanatory variable. This suggests a reduction in leaching owing to storage of precipitation above ground in a snowpack. The decrease in NO2+3 load–discharge residuals associated with extensive snow cover in January–February could mean that loads were simply being deferred, but March–April correlations between load–discharge residuals and snow cover in the previous three months are weak.
March–April snow cover in the Sandusky basin had a positive correlation to load–discharge residuals, in contrast to January–February. In March–April, extensive snow cover may have been more indicative of melting than of surface storage; snow cover fraction and mean snowmelt had a correlation coefficient greater than 0.94 in both watersheds in March–April, whereas it was less than 0.59 in January–February.
Few agricultural variables appear in Table 3. This is not proof that they were unimportant in explaining NO2+3 loads; the data may simply have been inadequate to reveal their influence. The shorter time period for which agricultural data were available contributed to their lack of statistical significance.
One relationship that appears only once in Table 3, yet held for most of the year in the Maumee basin, is a negative correlation of NO2+3 load–discharge residuals to current-year nitrogen fertilizer deliveries. Figure 6 shows that years with deliveries above 8 × 10⁷ kg were associated with lower load–discharge residuals. While surprising, it suggests that farmers may have cut back on fertilizer application following poor growing years, but not enough to offset the increase in storage during the previous years. Indeed, there was a strong correlation (r² = 0.57, p = 0.0005) between current-year nitrogen deliveries and streamflow in the 12 months preceding April of the same year (Fig. 7), an approximate date for decisions about most nitrogen fertilizer application. However, correlation between nitrogen deliveries and preceding streamflow is absent in the Sandusky watershed.
Soluble Reactive Phosphorus (SRP)
For SRP, the Sandusky watershed showed similar but stronger correlations to agricultural variables than did the Maumee watershed, leading to many more agricultural terms in the final models, as indicated in Tables 4 and 7. Though CRP enrollment and conservation tillage each appear in only one season, and phosphorus from manure in only four, in fact each of these variables was well-correlated to load–discharge residuals throughout the year in both watersheds. The CRP and conservation tillage exhibited strong negative correlations to load–discharge residuals, and phosphorus from manure was positively correlated, as seen in Fig. 8, which indicates current-year correlations and p values; the previous-year data were similar. In the Sandusky watershed, current-year phosphorus fertilizer deliveries showed strong positive correlations in February and March, the primary months for SRP load.
The positive correlation of phosphorus fertilizer deliveries to the load–discharge residuals was opposite to that of nitrogen fertilizer, perhaps owing to the much greater persistence of phosphorus in the soil (Stevenson, 1986), which would have prevented the annual cycle hypothesized earlier for NO2+3. The increase of phosphorus fertilizer deliveries following wet years was much more modest than that of nitrogen fertilizer, probably for the same reason.
Total Suspended Solids (TSS)
The most significant correlations in Tables 5 and 7 for TSS indicate that preceding wet conditions were associated with decreased loads during cold seasons. These correlations covered January through April in the Maumee watershed and November through February in the Sandusky watershed. They were similar to those observed for NO2+3, even though NO2+3 is not transported via soil particles. Total suspended solids appeared sensitive to more recent wet conditions than did NO2+3, as 3-month spans had greater explanatory power than 12-month spans. A washout effect could explain the relationship if the sediment was largely supply-limited.
As with NO2+3, snow cover appeared to decrease sediment loss through interception of precipitation by the snowpack and, in this case, shielding soil from rainfall impacts. Snow depth and previous snowfall were both negatively related to TSS load–discharge residuals in the Sandusky watershed for January through April.
Total suspended solids did show some differences from NO2+3. Streamflow in the previous year was negatively associated with TSS load–discharge residuals in the Sandusky watershed for May–July, when NO2+3 exhibited no significant relationships. Maximum rainfall was related to increased TSS loads in January through May (also December for the Maumee River), as shown in Fig. 9, whereas no such relationship was found for NO2+3 and SRP. This implies that intense precipitation events in the winter and spring affected sediment transport more than dissolved-constituent transport. It is a principle of erosion that soil loss is sensitive to rainfall intensity, as reflected in the broadly used Universal Soil Loss Equation (e.g., Dunne and Leopold, 1978).
Total Phosphorus (TP)
Because phosphorus adsorbs to soil particles, one might have expected the models and correlations for TP (Tables 6 and 7) to look much like those for TSS (Tables 5 and 7), and the relationship to antecedent precipitation–streamflow did appear similar. The main difference was that, for TP, antecedent precipitation was not as well correlated to load–discharge residuals in January–February in the Maumee watershed.
| CONCLUSIONS |
|---|
The strongest, most prevalent relationship was the negative correlation of antecedent precipitation and streamflow to the load–discharge residuals, primarily for NO2+3, but also for TSS loads and TP. Causes of decreased losses following wet years may have included erosion and loss of nutrients stored in the soil through crop uptake and/or leaching.
Snow cover appeared to play a role in decreasing or deferring NO2+3 and TSS loads in January through February. In March–April, the same relationship was observed for TSS, but was reversed for NO2+3, perhaps owing to large snow melts, which may have transported loads deferred by extensive snow cover earlier in the winter. Maximum monthly rainfall was associated with increasing TSS loads in the winter and spring, consistent with this scenario.
Agricultural variables were significant in explaining SRP load–discharge residuals, particularly phosphorus from manure. While agricultural variables were not prevalent in the models for other species, owing in large measure to their shorter time period and relative lack of precision, some consistent correlations were discovered. Acreage undergoing conservation tillage was negatively correlated to load–discharge residuals throughout the year for SRP, TP, and TSS. The CRP enrollment was negatively related to SRP and TSS in summer months. Phosphorus from manure was positively related to SRP and TP loads.
The relationship of fertilizer deliveries to load–discharge residuals was more complex. In the Maumee watershed, nitrogen fertilizer deliveries showed a surprising negative correlation to NO2+3 loads, but a positive correlation to streamflows in the preceding 12 months, suggesting that farmers responded to drier years with lesser fertilizer application, though not enough to offset increased soil storage. Phosphorus fertilizer showed significant, positive correlations to SRP load–discharge residuals in the two basins.
The four different species exhibited significant differences, greater than those between the watersheds. The NO2+3 loads were well-explained by climatic variables, notably preceding streamflow–precipitation and snow cover, with summer months related only to current streamflow. The SRP loads, by contrast, revealed significant correlations primarily to agricultural variables. The TSS loads appeared similar to NO2+3, except they were negatively correlated to CRP enrollment and conservation tillage. The TP correlations generally did not meet the p value criterion for inclusion in the model.
| ACKNOWLEDGMENTS |
|---|
| REFERENCES |
|---|