Published online 1 March 2007
Published in J Environ Qual 36:508-520 (2007)
DOI: 10.2134/jeq2005.0426
© 2007 American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
TECHNICAL REPORTS

Organic Compounds in the Environment

Spatiotemporal Nonattainment Assessment of Surface Water Tetrachloroethylene in New Jersey

Yasuyuki Akita (a), Gail Carter (b), and Marc L. Serre (a,*)

a Dep. of Environmental Science & Engineering, School of Public Health, Univ. of North Carolina-Chapel Hill, Rosenau Hall CB# 7431 Chapel Hill, NC 27599-7431
b New Jersey Dep. of Environmental Protection, Division of Science, Research, and Technology, P.O. Box 409, Trenton, NJ 08625-0409

* Corresponding author (marc_serre{at}unc.edu)

Received for publication November 11, 2005.

    ABSTRACT
Tetrachloroethylene (PCE) is one of the most frequently detected volatile organic compounds (VOCs) in water systems across the USA. In New Jersey, the Department of Environmental Protection (NJDEP) monitors surface water quality at several sites throughout the state. However, due to budget and scientific limitations, the sampling data are insufficient to assess all river reaches in New Jersey. To address this problem, the objective of this study is to utilize a framework for the space/time estimation of PCE throughout all river reaches in New Jersey over the 1999 through 2003 time period and to track how this concentration evolves over time. We use the Bayesian maximum entropy (BME) mapping method to take into account the composite spatiotemporal variability of PCE, and we produce maps providing a stochastic description of the distribution of PCE at all times throughout the river network. In addition, we conduct a nonattainment assessment analysis by applying a criterion based on the estimated probability density function that allows us to identify the river miles that are highly likely in nonattainment of the standard, those that are highly likely in attainment of the standard, and the remainder, which are labeled as nonassessed. Using this criterion, we investigate how the river miles contaminated by PCE vary over space and time, and we identify watershed management areas (WMAs) with contamination problems. Finally, a cross-validation comparison with a purely spatial analysis demonstrates that the space/time framework leads to better estimation and a reduction in the number of nonassessed miles.

Abbreviations: BME, Bayesian maximum entropy • NJDEP, New Jersey Department of Environmental Protection • PCE, tetrachloroethylene • PDF, probability density function • S/TRF, space/time random field • WMA, watershed management area


    INTRODUCTION
Volatile organic compounds (VOCs) are widely used in many industries as solvents, paints, refrigerants, and gasoline components. The widespread use of VOCs results in substantial releases into the environment. Many of these VOCs enter water bodies and are identified as serious waterborne contaminants. Tetrachloroethylene (PCE) is one of the most frequently detected VOCs in both surface and ground water in the USA (Ram et al., 1990; Squillace et al., 1999; Moran et al., 2002). According to the Agency for Toxic Substances and Disease Registry (1997), PCE was detected in 38% of 9232 surface water monitoring sites in the USA. Grady and Casey (2001) conducted a study in the northeast and mid-Atlantic regions of the USA and reported that 4.4% of the randomly selected community water systems contained PCE. In California, about 10% of the sampled drinking water sources were found to contain PCE (Williams et al., 2004). In addition, PCE was detected in storm water samples, with some samples having a concentration exceeding the drinking water standard (Makepeace et al., 1995; Line et al., 1996). These results suggest that there are many point and nonpoint sources of PCE across the USA.

Tetrachloroethylene is a nonflammable, colorless liquid at room temperature with a sharp, sweet odor. It is commonly used worldwide as an industrial degreaser and dry cleaning solvent. Global emissions of PCE during 1990 were 366000 metric tons, with the USA accounting for about a third of these emissions at 127000 metric tons (McCulloch and Midgley, 1996; McCulloch et al., 1999). Exposure to PCE has adverse effects on human health. Some animal studies show that oral exposure causes harmful effects to the liver (National Toxicology Program, 1977; Buben and O'Flaherty, 1985; Hayes et al., 1986). Tetrachloroethylene is also classified as a carcinogen. The Department of Health and Human Services has determined that PCE is reasonably anticipated to be a human carcinogen (National Toxicology Program, 2004). Similarly, the International Agency for Research on Cancer (IARC) has classified it in Group 2A, compounds probably carcinogenic to humans. Williams et al. (2004) reported that the cancer risk for PCE was the highest among the 12 most frequently detected VOCs in drinking water in California. Therefore, assessing the PCE concentration and identifying contaminated areas play a vital role in assessing the quality of our waters and protecting public health.

In the state of New Jersey, the New Jersey Department of Environmental Protection (NJDEP) has set the surface water quality standard for PCE at 0.388 µg L–1, which is more stringent than the U.S. Environmental Protection Agency (USEPA) maximum contaminant level (MCL) of 0.005 mg L–1 (i.e., 5 µg L–1). A question of interest for the NJDEP is the extent to which its surface water meets the quality standard for PCE. For that purpose, the NJDEP monitors surface water quality at several sites throughout the state; however, due to budget constraints and limitations of existing space/time estimation methods, the monitoring data provide information on only a sparse spatial subset of the river reaches, and many river reaches are not assessed. Furthermore, the water quality displays a high variability over time, and since the data are only collected a few times a year, there is a need to assess the temporal evolution of water quality between sampling events.

The objective of this study is to address this need by developing and implementing a framework for the space/time estimation of PCE throughout all river reaches in New Jersey over the 1999 through 2003 time period, and tracking how this concentration evolves over time. Spatiotemporal geostatistics (Christakos, 1990, 2000; Nielsen and Wendroth, 2003) provide a powerful mathematical construct to model processes distributed across space and time. In this work, we use the Bayesian maximum entropy (BME) method of modern spatiotemporal geostatistics (Christakos, 1990, 2000; Christakos and Li, 1998; Serre and Christakos, 1999), and the corresponding BMElib numerical library (Serre et al., 1998; Serre, 1999; Christakos et al., 2002). This approach provides a conceptual framework that takes into account the composite spatiotemporal variability of water quality parameters along rivers (Serre et al., 2004), and rigorously processes all available monitoring data distributed unevenly over space and time.


    MATERIALS AND METHODS
Study Area and Tetrachloroethylene Monitoring Data
The study area of this work is the state of New Jersey. New Jersey is divided into 20 watershed management areas (WMAs), and each WMA has a unique identification number. The river network, WMAs, and PCE monitoring stations operating in the 1999 through 2003 time period are shown in Fig. 1.


Fig. 1. Study area: tetrachloroethylene (PCE) monitoring stations and watershed management areas (WMAs) in New Jersey.

 
The dataset used in this study was obtained from two sources: the U.S. Geological Survey (USGS) National Water Information System (NWIS) and the USEPA STOrage and RETrieval (STORET) database. Data downloaded from NWISWeb include values with a less-than sign, such as "<0.01," which indicates that the actual value is known to be less than the value shown, and values with a letter "E," such as "E.01," which indicates an estimated value. In this study, data with a less-than sign are set to 50% of the value following the "<" sign, and data with a letter "E" are used as actual values. The dataset downloaded from NWISWeb contains 313 records, and that from STORET contains 56 records. The two datasets were combined, resulting in 369 measured values collected at 171 monitoring stations between 1999 and 2003. The samples were collected at various times in the 1999 through 2003 period, so the sample times were not synchronized across stations.

In Fig. 2, we show the yearly aggregated measured values, i.e., the arithmetic average of all values taken at one station over 1 yr, for 1999 and 2002. It should be noted that each panel shows data that were collected at various times in the given year; however, these figures allow for an exploratory analysis indicating the main features of the distribution of monitoring data over space and time. We can see from the figure that the PCE monitoring data exhibit a high spatial variability, with higher concentration values observed mainly in the central and northeastern parts of the state. Furthermore, the data values change substantially from year to year, indicating a high temporal variability as well.
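The treatment of censored ("<") and estimated ("E") values described above can be sketched in a few lines of Python. This is only an illustration of the stated rule; the function name `parse_nwis_value` is hypothetical, and the input strings are assumed to look like the examples quoted in the text.

```python
def parse_nwis_value(raw: str) -> float:
    """Convert a reported value string to a number (illustrative sketch).

    "<0.01" -> half of the value after the "<" sign (0.005), as in the study.
    "E.01"  -> the estimated value itself (0.01), used as an actual value.
    Anything else is read as a plain number.
    """
    text = raw.strip()
    if text.startswith("<"):
        # Censored value: take 50% of the value shown after the "<" sign.
        return 0.5 * float(text[1:])
    if text.upper().startswith("E"):
        # Estimated value: used as reported.
        return float(text[1:])
    return float(text)


for raw in ["<0.01", "E.01", "0.42"]:
    print(raw, "->", parse_nwis_value(raw))
```

Substituting half of the reported limit is a simple convention; it keeps every record in the dataset at the cost of treating censored values as if they were known.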


Fig. 2. Yearly aggregated tetrachloroethylene (PCE) concentration (µg L-1) in (a) 1999 and (b) 2002.

 
While Fig. 2 was created by aggregating the data over 1 yr for the sake of the exploratory data analysis, we will consider each datum with its actual time of measurement for the remainder of this analysis. In Fig. 3(a) we show the histogram of the raw data. This histogram indicates that the distribution of the data is highly skewed, with several values close to zero, and a few large values. In Fig. 3(b) we show the histogram of the log-transformed data. The statistical moments of these distributions are summarized in Table 1, where one can see that the log transformation leads to a reduction of the coefficient of skewness. As a result we will use the log-transform of the data in the following analysis.
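The effect of the log transformation on skewness can be illustrated with a short sketch. The data below are synthetic and are not the monitoring data of this study; the natural logarithm and the population (biased) form of the skewness coefficient are assumptions, since the text does not specify either.

```python
import math
import random


def skewness(values):
    """Coefficient of skewness: third central moment divided by variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic right-skewed concentrations (many small values, a few large ones).
random.seed(0)
raw = [random.lognormvariate(-2.0, 1.0) for _ in range(369)]
log_values = [math.log(v) for v in raw]

print("skewness of raw values:", skewness(raw))
print("skewness of log values:", skewness(log_values))
```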


Fig. 3. Histogram showing the distribution of (a) raw tetrachloroethylene (PCE) monitoring data and (b) that of the log-transformed data.

 

Table 1. Basic statistics of the monitoring tetrachloroethylene (PCE) data (raw, µg L–1) and its log transform.

 
Bayesian Maximum Entropy Estimation Framework for Space/Time Mapping Analysis
In this study we use the BME method of modern spatiotemporal geostatistics to estimate the concentration of PCE across space and time. The concentration field is modeled in terms of the space/time random field (S/TRF), X(p), where p = (s,t) is the space and time coordinate. The S/TRF models the distribution of concentration of PCE across space and time in terms of a collection of plausible field realizations χ(p). In the context of space/time estimation, we write the S/TRF as a collection x_map of random variables representing the concentration at specific space/time points,

$x_{map} = (x_1, \ldots, x_v)$ [1]

where x_i (i = 1,...,v) are random variables at space/time locations p_i, and the vector x_map is simply the collection of random variables (x_1,...,x_v). A realization of the S/TRF at these points is given by the vector

$\chi_{map} = (\chi_1, \ldots, \chi_v)$ [2]

Using the viewpoint of Eq. [1] and [2], the uncertainty characterizing the S/TRF at the mapping points p_map is modeled in terms of the probability corresponding to the different plausible realizations, χ_map, i.e.,

$\mathrm{Prob}[\chi_{map} \le x_{map} \le \chi_{map} + d\chi_{map}] = f(\chi_{map}, p_{map})\, d\chi_{map}$ [3]

where f(χ_map, p_map) is the probability density function (PDF) of the S/TRF.

The BME method provides a rigorous framework to process the total knowledge base K characterizing the concentration of PCE in the surface water of New Jersey for the 1999 through 2003 period, and yields a stochastic estimate of the concentration at any unmonitored space/time point along the river network. The total knowledge base K is divided into the general knowledge base G and the site-specific knowledge base S. The general knowledge base G depicts global characteristics of the S/TRF, such as its mean trend m_x(p) = E[X(p)], characterizing consistent trends in the distribution of PCE over space and time, and the covariance c_x(p,p') = E[(X(p) – m_x(p))(X(p') – m_x(p'))], describing space/time correlations and concentration dependencies between the two space/time points p = (s,t) and p' = (s',t'). Here E[ ] is the expectation operator, which is fully defined in terms of the PDF of the S/TRF X(p). The site-specific knowledge base S refers to the specific PCE monitoring data available over the space/time mapping domain of interest. In this study we treat the measured concentration values as hard data, i.e., we assume that the measurement errors were small enough to be neglected.

While the reader is referred to Christakos et al. (2002) for a complete description of the numerical implementation of the BME method and its mathematical formulation, we briefly describe here the three main stages of the BME analysis. At the prior stage, the general knowledge base G is examined to obtain the prior PDF describing the S/TRF for PCE (Eq. [3]). At the integration stage, the prior PDF is updated using Bayesian conditionalization on the site-specific knowledge base S, leading to a posterior PDF describing the PCE concentration at any space/time estimation point along the river network. This posterior PDF integrates the total knowledge base K = G ∪ S and provides a full stochastic description of the PCE concentration to be estimated. Finally, at the interpretive stage, we use the posterior PDF obtained at estimation points to derive temporal plots and maps describing the distribution of the estimated concentration and its associated uncertainty across space and time. In our work, we use the BMElib package to implement the BME method, we select the expected value of the posterior PDF (the so-called BME mean estimate) as the estimator for PCE, and we use the posterior variance as a measure of the associated estimation uncertainty.
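The prior, integration, and interpretive stages can be illustrated with a minimal sketch for the special case used here: general knowledge limited to a mean and a covariance, and hard data only. In that case the posterior is Gaussian, and its mean and variance follow from standard Gaussian conditioning (the simple kriging equations). The sketch below uses NumPy rather than BMElib; the separable exponential covariance, its parameter values, and all function names are hypothetical choices for illustration, not the model fitted in this study.

```python
import numpy as np


def cov_model(r, tau, sill=1.0, a_r=30.0, a_t=1800.0):
    """Hypothetical separable exponential space/time covariance (illustration only)."""
    return sill * np.exp(-3.0 * r / a_r) * np.exp(-3.0 * tau / a_t)


def pairwise_cov(p, q):
    """Covariance matrix between point sets p and q; each row is (x_km, y_km, t_days)."""
    p, q = np.atleast_2d(p), np.atleast_2d(q)
    r = np.linalg.norm(p[:, None, :2] - q[None, :, :2], axis=2)   # spatial lags
    tau = np.abs(p[:, None, 2] - q[None, :, 2])                   # temporal lags
    return cov_model(r, tau)


def posterior_at(p_k, p_data, residuals):
    """Posterior mean and variance of a zero-mean residual field at point p_k."""
    c_dd = pairwise_cov(p_data, p_data)          # prior stage: data-to-data covariance
    c_kd = pairwise_cov(p_data, p_k)[:, 0]       # prior stage: data-to-estimate covariance
    weights = np.linalg.solve(c_dd, c_kd)        # integration stage: condition on hard data
    mean = weights @ residuals                   # interpretive stage: posterior mean
    var = cov_model(0.0, 0.0) - weights @ c_kd   # interpretive stage: posterior variance
    return float(mean), float(var)


# Three hypothetical hard data points taken at different places and times.
p_data = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 30.0], [0.0, 20.0, 200.0]])
residuals = np.array([0.5, -0.2, 0.1])
mean, var = posterior_at(np.array([5.0, 5.0, 60.0]), p_data, residuals)
print("posterior mean:", mean, "posterior variance:", var)
```

Because the data enter through their space/time lags to the estimation point, observations taken on other days still contribute, with weights that decay with both distance and time difference.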


    RESULTS
Mean Trend and Covariance of Tetrachloroethylene in New Jersey
Let Y(p) be the S/TRF representing the distribution over space and time of the log-transformed surface water PCE concentration, and let m_Y(p) = E[Y(p)] be its mean trend, so that we have

$Y(p) = m_Y(p) + X(p)$ [4]

where X(p) is the so-called residual field summarizing all the space/time variability and uncertainties associated with the log-transformed PCE concentration in the environment. The mean trend model that we used in this work is the additive space/time model m_Y(p) = m_Y,s(s) + m_Y,t(t), where m_Y,s(s) is a purely spatial component and m_Y,t(t) is a purely temporal component. The additive form of this mean trend model was chosen to represent a state-wide trend in the time evolution of PCE across New Jersey, consistent with the presumed state-wide effects of environmental regulation enforcement by state agencies, and it was found to fit the data well. The spatial and temporal components of m_Y(p) were obtained by exponential smoothing of the time-averaged and spatially averaged data, respectively, and they are shown in Fig. 4. As can be seen from this figure, an apparent trend exists both spatially and temporally. The spatial component of the mean exhibits higher PCE contamination in the northeastern region of the state as well as, to a lesser extent, in a region located in the southwest. The temporal mean trend component shows an increase in PCE concentration in the 4-yr period from January 1999 through January 2003, followed by a decrease during the subsequent 6 mo. In addition, there is a seasonal trend showing an increase in PCE during the summers of 2001 and 2002. This summer increase is not as marked before 2001, and the time series is not long enough to see whether it continues after 2002.


Fig. 4. (a) Spatial and (b) temporal components of the mean trend model for the log-transformed tetrachloroethylene (PCE) field.

 
By removing the mean trend from the log-transformed data, we obtain the log-transformed mean-trend removed residual data for the S/TRF X(p). As expected, this residual field was found to be homogeneous/stationary, as the mean trend model properly removed any nonhomogeneous/nonstationary component of the process. As a result, the space/time covariance c_x(p,p') of X(p) can be expressed in terms of the spatial distance r = ||s – s'|| and time difference τ = |t – t'| for the two space/time points p = (s,t) and p' = (s',t'), so that we may write c_x(p,p') = c_x(r = ||s – s'||, τ = |t – t'|). In practice, this is done by finding pairs (p,p') of measurement events for which the sampling locations s and s' are separated by a distance of r and the sampling times t and t' are separated by a time lag of τ. Using the log-transformed mean-trend removed dataset, we obtained estimated values of the covariance for different classes of spatial lags r and temporal lags τ, using a numerical algorithm that we developed to handle data unevenly distributed over space and time. These experimental values, shown with dots in Fig. 5, were then used to fit the following nonseparable space/time covariance model

Formula 5[5]
where c_1 = 1.0 σ_x², a_r1 and a_r2 are spatial ranges with values of 29.5 km and 24.6 km, respectively, a_τ1 and a_τ2 are temporal ranges with values of 1700 d and 2000 d, respectively, and σ_x² = 0.8832 (log-µg L–1)². The covariance model (Eq. [5]) is shown with a line in Fig. 5. It should be noted that this covariance model fits the experimental covariance data not only for c_x(r = 0, τ) or c_x(r, τ = 0) (top row of Fig. 5), but also when both r > 0 and τ > 0 (middle and bottom rows of Fig. 5). Thus, this model characterizes the variability of PCE in its composite space/time domain, rather than being restricted to a purely spatial description of PCE variability (i.e., restricted to studying c_x(r, τ = 0)), as is usually the case for a classical mapping analysis. The composite space/time aspect of the analysis is especially important for our mapping estimation because the monitoring data are not collected at synchronized times along the river, so that in most cases the space/time distance between monitoring data points is such that both r > 0 and τ > 0. In such cases a purely spatial analysis has little power, as will be demonstrated later when we compare the composite space/time analysis with a purely spatial approach.
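The estimation of experimental covariance values by space/time lag class, described above, can be sketched as follows. This is not the authors' algorithm, which is not detailed in the text; it is a minimal NumPy version that assumes a zero-mean residual, Euclidean distances, and half-open lag classes. The function name and the class edges are hypothetical.

```python
import numpy as np


def experimental_covariance(coords, times, residuals, r_edges, t_edges):
    """Average product of residual pairs falling in each (r, tau) lag class.

    coords: (n, 2) locations in km; times: (n,) in days;
    residuals: (n,) mean-trend-removed log values (assumed zero mean).
    Returns an array of shape (len(r_edges) - 1, len(t_edges) - 1);
    classes that contain no pair are left as NaN.
    """
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    z = np.asarray(residuals, dtype=float)
    i, j = np.triu_indices(len(z), k=1)                 # every pair once
    r = np.linalg.norm(coords[i] - coords[j], axis=1)   # spatial lag of each pair
    tau = np.abs(times[i] - times[j])                   # temporal lag of each pair
    products = z[i] * z[j]
    cov = np.full((len(r_edges) - 1, len(t_edges) - 1), np.nan)
    for a in range(len(r_edges) - 1):
        for b in range(len(t_edges) - 1):
            in_class = ((r >= r_edges[a]) & (r < r_edges[a + 1]) &
                        (tau >= t_edges[b]) & (tau < t_edges[b + 1]))
            if in_class.any():
                cov[a, b] = products[in_class].mean()
    return cov


# Tiny hypothetical example: three measurements at unsynchronized times.
cov = experimental_covariance(
    coords=[[0.0, 0.0], [1.0, 0.0], [0.0, 10.0]],
    times=[0.0, 0.0, 5.0],
    residuals=[1.0, 2.0, -1.0],
    r_edges=[0.0, 5.0, 20.0],
    t_edges=[0.0, 1.0, 10.0],
)
print(cov)
```

Pairs with both r > 0 and τ > 0 fall into the off-axis classes, which is what allows unsynchronized samples to inform the composite space/time model.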


Fig. 5. Covariance of the mean trend removed log-transformed tetrachloroethylene (PCE) field.

 
The covariance model of Eq. [5] provides valuable information about the variability of PCE along the rivers of New Jersey. The Gaussian component indicates that PCE at a fixed location exhibits a rather smooth variability over time. Furthermore, both components have long temporal ranges, a_τ1 = 1700 d and a_τ2 = 2000 d, respectively. This means that it takes roughly 5 yr for a significant variation of PCE concentration to occur at a given river location. Since PCE volatilizes rapidly from water at the water-air interface, the long-lasting behavior of PCE in the water must be explained by its presence in the bulk phase of the water, or in the ground water connected to the river. Indeed, because PCE is denser than water and only slightly soluble in water, the fraction that is not immediately volatilized may be expected to sink to the bottom of the river (Doust and Huang, 1992), and possibly be transported into ground water by leaching through fissures (Chilton et al., 1990). Hence, our finding that PCE exhibits smooth temporal variability over long temporal ranges indicates that its distribution along the river could be influenced by contributions from the river bottom and from ground water contaminated by previous releases. Turning to the spatial part of the covariance model, we note that it has two exponential components, with spatial ranges a_r1 = 29.5 km and a_r2 = 24.6 km, respectively. Based on our findings, we believe that the spatial variability is a function of the spatial scale of previously contaminated ground water aquifers and of the dispersion processes along the bottom of the river. In terms of monitoring, the high spatial variability of PCE means that, at a given time, the concentration measured at a sampling location can only be used to assess the water quality within roughly 25 to 30 km, and that river reaches beyond that distance are uncorrelated with that water sample.

Time Series Estimation of Tetrachloroethylene at Selected Stations
Using the BME estimation method we are able to process the general knowledge (mean trend and covariance models shown in Fig. 4 and 5, respectively) and the site-specific knowledge (monitoring data, see Fig. 2), and to obtain estimates of the concentration of PCE at any space/time point of interest. To investigate how the concentration changes over time, we show in Fig. 6 the BME space/time estimate of the expected concentration value and the associated 68% confidence interval at monitoring stations 39, 135, and 167. At Station 39, the BME estimate and 68% confidence interval never exceed the surface water quality standard of 0.388 µg L–1. On the other hand, at Station 167, these values almost always exceed the standard before January 2002, and then drop below the standard in 2003. At Station 135, these values mostly stay below the standard throughout the time period.


Fig. 6. (a) Map showing Stations 39, 135, and 167, and temporal plots of the Bayesian maximum entropy (BME) space/time estimate of surface water tetrachloroethylene (PCE) at (b) Station 39, (c) Station 135, and (d) Station 167.

 
Spatial Distribution of Tetrachloroethylene over the State of New Jersey at Selected Times
To investigate how the estimated concentration of PCE varies over space at a specific time, we show in Fig. 7 maps of the BME mean estimate of concentration calculated for 15 Apr. 2002. Figure 7(a) shows the map of estimated concentration over the whole state. While this map does not show details (such as the river network), it is nonetheless a useful tool to identify specific areas of the state where the surface water concentration of PCE might be a concern. One such area is Watershed Management Area 05 (WMA05), whose boundaries are outlined in the northeastern part of the state. Figure 7(b) zooms in on WMA05 and shows in more detail the spatial distribution of surface water concentration along the river network. The river network in WMA05 includes a main stem running north to south in the middle of the watershed, as well as the coastline defining the southern boundary of that watershed. As can be seen from Fig. 7(b), on 15 Apr. 2002, many miles of the streams in WMA05 had a concentration estimated to be in excess of the state standard of 0.388 µg L–1. In fact, one can quantify the number of river miles in nonattainment for different confidence levels, as explained in the next section.


Fig. 7. Maps showing the Bayesian maximum entropy (BME) estimate of surface water tetrachloroethylene (PCE) concentration on 15 Apr. 2002 (a) throughout the entire state of New Jersey and (b) over an area restricted to WMA05.

 
Figure 7 shows maps of the BME estimate of concentration for 15 Apr. 2002, a date typical of the study period, but maps (not shown here) were obtained for every 15 d in the 1999 through 2003 period. Each map is obtained from a composite space/time analysis that accounts for all the space/time data, i.e., not only the data collected on the estimation date, but also data from the preceding and following days. This feature of space/time analysis is critical for mapping the surface water concentration in the state of New Jersey because the monitoring data are not collected at synchronized times along the river. For example, in the case of the map in Fig. 7, due to logistic and time constraints of the personnel collecting water samples, only a few data points were collected on 15 Apr. 2002. A purely spatial approach would only consider these scarce data. In contrast, the space/time analysis presented in this work considers the additional data collected in the days preceding and following 15 Apr. 2002, thereby greatly increasing the dataset under consideration. This leads to a substantial improvement of mapping accuracy, as will be presented in a subsequent section.

Areas and River Miles in Excess of the Water Quality Standard
Next we are interested in quantifying the number of river miles for which the estimated concentration of PCE exceeded the water quality standard. The fraction of river miles that exceeds the New Jersey surface water quality standard on any given day is obtained by using estimation points distributed at equidistant intervals along the river network, and calculating the fraction of these points that have a concentration in excess of the quality standard. Table 2 shows the fraction of river miles that exceeded the quality standard on 5 Feb. 2000, 11 Mar. 2001, 15 Apr. 2002, and 20 May 2003. The results shown in Table 2 indicate that this fraction reached a maximum around 15 Apr. 2002. On that date, the fraction of river miles with a BME mean estimate exceeding the quality standard was 1.50%. If the upper bound of the 68% confidence interval is adopted as the basis for assessing nonattainment of the standard, the fraction reaches about 9%. If the upper bound of the 95% confidence interval is adopted, the fraction reaches about 70%.
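The calculation of the fraction of river miles above the standard can be sketched as a simple count over equidistant estimation points. The values below are hypothetical and are not results from this study; the function name is also an assumption. Only the standard of 0.388 µg L–1 comes from the text.

```python
STANDARD = 0.388  # New Jersey surface water quality standard for PCE, in µg/L


def fraction_exceeding(estimates, standard=STANDARD):
    """Fraction of equidistant estimation points whose value exceeds the standard."""
    if not estimates:
        raise ValueError("need at least one estimation point")
    return sum(1 for value in estimates if value > standard) / len(estimates)


# Hypothetical values at five equidistant points along a river (µg/L).
bme_mean = [0.05, 0.10, 0.20, 0.45, 0.08]   # posterior mean estimates
upper_68 = [0.09, 0.25, 0.41, 0.90, 0.15]   # upper bounds of 68% intervals
upper_95 = [0.20, 0.60, 0.85, 1.70, 0.35]   # upper bounds of 95% intervals

for label, values in [("BME mean", bme_mean),
                      ("68% upper bound", upper_68),
                      ("95% upper bound", upper_95)]:
    print(f"{label}: {100 * fraction_exceeding(values):.0f}% of river miles exceed the standard")
```

Because the points are equidistant, the fraction of points is a direct proxy for the fraction of river miles; the more conservative the estimate used, the larger the fraction flagged.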


Table 2. Fraction of river miles that does not attain the water quality standard for tetrachloroethylene (PCE) on 5 Feb. 2000, 11 Mar. 2001, 15 Apr. 2002, and 20 May 2003. BME = Bayesian maximum entropy.

 
A Criterion-Based Assessment of the River Miles in Nonattainment of the Standard
In the analysis shown above, we used three different estimates (the BME mean estimate, and the upper bound of the 68 and 95% confidence intervals) to calculate the fraction of river miles exceeding the standard. We now define a formal space/time criterion to assess nonattainment of the water quality standard at different confidence levels. Using the BME posterior PDF we can immediately calculate for each estimation point p = (s,t) the probability Prob[PCE(p) > 0.388 µg L–1] that the PCE concentration at p exceeds the water quality standard of 0.388 µg L–1. This value is equal to the probability of nonattainment of water quality standard at any space/time point p, i.e.,

$\mathrm{Prob}[\text{Nonattainment}, p] = \mathrm{Prob}[PCE(p) > 0.388\ \mu\text{g L}^{-1}]$ [6]

Conversely, we can calculate the probability of attainment of the water quality standard as Prob[Attainment, p] = 1 – Prob[Nonattainment, p] = Prob[PCE(p) ≤ 0.388 µg L–1]. Then, using these probabilities, we classify the conventional attainment or nonattainment status of each estimation point as follows:

More likely than not in attainment: Prob[Nonattainment, p] < 50%
More likely than not in nonattainment: Prob[Nonattainment, p] > 50%

The conventional classification above is useful to determine whether the standard for PCE has been attained at any space/time point p. However, that classification provides little information about the associated uncertainty. There are several methods to identify the contaminated sections based on the uncertainty in the estimated values. Saisana et al. (2004) reviewed four popular classification criteria based on uncertainties in the estimated values to delineate the contaminated area with respect to the standard. In this study, we use the methodology introduced by Garcia and Froidevaux (1997), but modify their probability thresholds so as to use a more stringent criterion. This criterion classifies any space/time point as either highly likely to be in attainment of the standard, or highly likely to be in nonattainment of the standard, or nonassessed, as follows:

Highly likely in attainment: Prob[Attainment, p] > 90%, or equivalently Prob[Nonattainment, p] < 10%
Highly likely in nonattainment: Prob[Nonattainment, p] > 90%
Nonassessed: 10% ≤ Prob[Nonattainment, p] ≤ 90%

This criterion should be used in conjunction with the conventional criterion, rather than in place of it. While the conventional criterion provides a useful environmental indicator that can be used to track the state's ability to meet its standard at any space/time point, this criterion provides additional information about the confidence in meeting that standard, which is critical for the protection of the susceptible population and the environment. For example, we envision that recreational water activities for susceptible population should be restricted to areas classified as highly likely in attainment, rather than more likely than not in attainment. On the other hand, areas highly likely in nonattainment should be targeted for contamination control and remediation. Finally, nonassessed river miles should be the focus of increased sampling.
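Both criteria can be illustrated with a short sketch. It assumes that the posterior PDF of the natural-log concentration at an estimation point is Gaussian with a given mean and variance, which is an assumption made here for illustration (the BME posterior need not be Gaussian in general). The function names are hypothetical, and the assignment of a probability of exactly 50% to attainment is also an assumption, since the text leaves that boundary case undefined.

```python
import math

STANDARD = 0.388  # water quality standard for PCE, in µg/L


def prob_nonattainment(log_mean, log_var, standard=STANDARD):
    """Prob[PCE(p) > standard], assuming a Gaussian posterior for ln(PCE)."""
    z = (math.log(standard) - log_mean) / math.sqrt(log_var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))   # upper tail of the standard normal


def conventional_status(p_non):
    """More likely than not in attainment or nonattainment (50% threshold)."""
    return "nonattainment" if p_non > 0.5 else "attainment"


def stringent_status(p_non):
    """Three-way classification with the 10% and 90% thresholds."""
    if p_non > 0.9:
        return "highly likely in nonattainment"
    if p_non < 0.1:
        return "highly likely in attainment"
    return "nonassessed"


# Hypothetical posterior parameters (mean, variance of ln concentration) at three points.
for log_mean, log_var in [(-3.0, 0.25), (-1.0, 0.50), (0.5, 0.25)]:
    p_non = prob_nonattainment(log_mean, log_var)
    print(f"Prob[Nonattainment] = {p_non:.3f}: "
          f"{conventional_status(p_non)}, {stringent_status(p_non)}")
```

A point can be "more likely than not in attainment" under the first criterion and still be "nonassessed" under the second, which is the additional information the stringent criterion is meant to provide.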

The two space/time criteria defined above provide a rigorous yet practical procedure for the state of New Jersey to monitor the evolution over time of the fraction of river miles in attainment and nonattainment of the water quality standard for PCE. Figure 8 shows the temporal evolution of the fraction of river miles calculated to be in the various attainment statuses of the second criterion from 1999 through 2003. Figure 8 also shows the fraction of river miles that are more likely than not to be in nonattainment. Several interesting findings arise from examining Fig. 8. First, while the number of river miles more likely than not to be in nonattainment seems to have increased from January 1999 through July 2002 and decreased thereafter, the number of river miles highly likely to be in nonattainment started to decrease as early as May 2000. Second, the number of nonassessed river miles increased markedly from July 1999 through July 2002, before declining thereafter. Finally, it should be noted that only 75% of the river miles in New Jersey were highly likely to be in attainment in July 2002, but this fraction increased to as much as 95% of all river miles by the end of 2003.


Fig. 8. Time series of the fraction of river miles in New Jersey that are highly likely in nonattainment, nonassessed, and highly likely in attainment from 1999 to 2003.

 
The state of New Jersey is divided into 20 WMAs and each area has a unique identification number. To investigate which part of the state is contaminated, we show in Fig. 9 the contribution of each WMA to the fraction of river miles assessed as highly likely in nonattainment in New Jersey. From this figure we can see that the river miles assessed as highly likely in nonattainment during 1999 through 2003 are located in only seven WMAs (WMA01, WMA04, WMA05, WMA07, WMA10, WMA18, and WMA20). These contaminated WMAs are located in the central and northeastern part of the state, which is consistent with the results shown in Fig. 7 and data shown in Fig. 2. Overall, Fig. 9 is useful for state scientists to keep track of where the contamination is coming from. For example, it is clear from Fig. 9 that while only WMA05 and WMA10 contributed all river miles in nonattainment in the early 1999 through 2001 period, these two watersheds were clean of nonattainment in the later 2002 through 2003 period, at which point other WMAs started to be the source of contamination (indicating a shift in pollution source and a new area for intervention for state scientists and regulators).


Fig. 9. Contribution of watershed management area (WMA) to river miles assessed as highly likely in nonattainment.

 
Model Comparison: Space/Time Analysis versus Purely Spatial Analysis
As explained earlier, the variance of the BME posterior PDF, or error variance, provides a measure of the mapping uncertainty associated with the estimated concentration of PCE, i.e., the lower the variance, the more accurate is the map. We show in Fig. 10(a) the map of the error variance associated with the BME estimate obtained from the space/time analysis on 5 Feb. 2000. For comparison purposes, a purely spatial analysis was conducted on 5 Feb. 2000 (i.e., only processing the data collected on 5 Feb. 2000 ± 15 d), and we show the corresponding map of error variance in Fig. 10(b) (estimates not shown). By comparing Fig. 10(a) and 10(b) we see that the error variance of the space/time analysis is substantially lower than that of a purely spatial analysis. In the purely spatial analysis, the error variance is high in the areas where no sampling data was available on 5 Feb. 2000 ± 15 d. On the other hand, in the space/time analysis, areas with low error variances are extended to cover the whole study area because this analysis processes not only the data available on 5 Feb. 2000 ± 15 d, but also the space/time data available for all the days before and after that time period.


Fig. 10. Map of Bayesian maximum entropy (BME) error variance on 5 Feb. 2000 obtained from (a) the space/time analysis and (b) the purely spatial analysis.

 
Another procedure that can be used to compare the space/time estimation approach presented in this work with the purely spatial approach is to perform a cross-validation analysis. We removed each monitoring datum in turn from the dataset, and re-estimated that value using only the remaining data. This leads to a cross-validation estimate, which can be compared with the observed datum that was removed. By repeating this procedure for each datum and subtracting the cross-validation estimate from the observed value, the estimation error is obtained, from which measures of estimation accuracy such as the mean error, the mean absolute error, and the mean square error can be calculated. These measures of estimation accuracy are shown in Table 3 for the purely spatial estimation approach (second column) and the space/time estimation approach (third column). The last column of Table 3 gives the improvement in mapping accuracy of the space/time approach over the purely spatial approach. For instance, the space/time analysis leads to a 56% reduction in mean square error when compared with the purely spatial analysis. Hence, Table 3 demonstrates that the composite space/time analysis leads to estimates that are substantially more accurate than estimates obtained from a purely spatial estimation. Another question is whether there is a substantial difference in the estimated value between these two approaches. To examine this question, we provide in Fig. 11 the temporal evolution of the fraction of river miles found to be in the various nonattainment statuses using the purely spatial analysis. Two very important differences appear when comparing Fig. 11 (purely spatial analysis) with Fig. 8 (space/time analysis). First, there is a drastic difference between methods in the number of river miles estimated to be in nonattainment of the water quality standard.
For instance, in the 1999 through 2000 period the purely spatial approach estimates that the fraction of river miles in nonattainment is less than 0.006%, whereas the space/time analysis estimates that over 0.29% of all river miles are highly likely to be in nonattainment, a difference of more than an order of magnitude. Hence, a purely spatial analysis of PCE concentration in the streams of New Jersey would lead to a very misleading assessment of nonattainment of its water quality standard, whereas a space/time analysis would lead to a very different (and more accurate) assessment. The second noticeable difference between Fig. 8 and Fig. 11 is that when the purely spatial approach does pick up a small increase in the number of river miles in nonattainment (in 2001 through 2003), this is accompanied by a very large fraction of nonassessed river miles. In fact, the fraction of New Jersey river miles that cannot be assessed with the purely spatial approach peaks at almost 99% in 2002, whereas with the space/time approach the number of nonassessed river miles never exceeds 25% of all New Jersey river miles. These results have important implications for the State of New Jersey, as they show that substantially more river miles can be assessed for surface water PCE concentration when the space/time approach is used.
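
The leave-one-out procedure and the accuracy measures reported in Table 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the estimator is a simple inverse-distance-weighting stand-in rather than BME, and the function names and sample data are hypothetical.

```python
import math


def idw_estimate(target, neighbors, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, value) tuples.

    Stand-in estimator for illustration only; the study uses BME, not IDW.
    """
    num = den = 0.0
    for x, y, value in neighbors:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # a collocated neighbor determines the estimate
        w = d ** -power
        num += w * value
        den += w
    return num / den


def loo_errors(data, estimator):
    """Remove each datum in turn, re-estimate it, and return observed - estimate."""
    errors = []
    for i, (x, y, observed) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        errors.append(observed - estimator((x, y), rest))
    return errors


def accuracy_measures(errors):
    """Mean error, mean absolute error, and mean square error."""
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sum(e * e for e in errors) / n,
    }


def percent_change(reference, candidate):
    """Percent change from `reference` to `candidate` (negative means a reduction)."""
    return 100.0 * (candidate - reference) / reference


# Hypothetical monitoring data (x, y, concentration); not values from the study.
data = [(0.0, 0.0, 0.10), (1.0, 0.0, 0.30), (0.0, 1.0, 0.20), (1.0, 1.0, 0.50)]
measures = accuracy_measures(loo_errors(data, idw_estimate))
```

Running `loo_errors` once with a purely spatial estimator and once with a space/time estimator, then applying `percent_change` to each pair of measures, would reproduce the layout of Table 3.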


Table 3. Mean error, mean absolute error, and mean square error of cross validation obtained from the purely spatial approach (second column) and the space/time approach (third column). The last column is the % change from column 2 to column 3.

 

Fig. 11. Time series of the fraction of river miles in New Jersey found to be in the various nonattainment statuses using the purely spatial analysis.

 

    DISCUSSION
The Clean Water Act (CWA) requires states to identify water bodies not meeting water quality standards, and to establish total maximum daily loads (TMDLs) that will bring these waters into compliance. Environmental Protection Agency guidelines require a water body to be listed as impaired if more than 10% of the monitoring data exceed the standard. However, the number of samples is usually limited by budget constraints. Therefore, observed data generally do not represent the true concentration distribution, and an assessment framework that compensates for this limitation is needed. Statistical inference is one way to overcome this limitation. Smith et al. (2001) suggested statistical approaches based on a binomial hypothesis test and a Bayesian binomial framework to take into account the effect of the sample size. Gibbons (2003) extended this statistical approach and provided a method based on the confidence interval for the 90th percentile of the estimated distribution. These approaches are easy to carry out and more statistically sound than simply counting the number of samples exceeding the quality standard. However, these methods do not work where no monitoring data exist, such as along unsampled river reaches.
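
To make the contrast between raw counting and a binomial test concrete, the Python sketch below compares the two listing rules for a small sample. It is a generic one-sided exact binomial test and should not be read as the exact formulation of Smith et al. (2001): the choice of null hypothesis (a true exceedance rate of 10%), the 5% significance level, and the function names are assumptions made for this illustration.

```python
from math import comb


def binomial_upper_tail(n, k, p):
    """P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


def raw_score_listing(n, k, threshold=0.10):
    """List as impaired if more than 10% of the n samples exceed the standard."""
    return k / n > threshold


def binomial_listing(n, k, threshold=0.10, alpha=0.05):
    """List as impaired only if k exceedances would be unlikely (probability
    below the assumed alpha) when the true exceedance rate equals the threshold."""
    return binomial_upper_tail(n, k, threshold) < alpha


# Hypothetical sample: 10 measurements, 2 of which exceed the standard.
n_samples, n_exceed = 10, 2
listed_raw = raw_score_listing(n_samples, n_exceed)
listed_binomial = binomial_listing(n_samples, n_exceed)
```

With only 10 samples, 2 exceedances trigger a listing under the raw 10% rule but not under the binomial test, because 2 or more exceedances occur by chance about 26% of the time when the true rate is exactly 10%. This is the sample size effect that the statistical approaches are designed to account for.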

Water quality assessment along unsampled river reaches is one of the alternatives to statistical inference. In general, water quality assessment models can be divided into two categories corresponding to mechanistic approaches and statistical approaches, respectively. A mechanistic model depends on the current scientific understanding of the physical, chemical, biological, or geological processes in the water system. The Better Assessment Science Integrating Point and Nonpoint Sources (BASINS) system is an example of a water quality assessment model which rests on a mechanistic modeling approach (USEPA, 2001). However, relying solely on a mechanistic model is generally unrealistic. The National Research Council reported that mechanistic models tend to be mathematically complex and result in costly and time consuming analyses, and that mechanistic processes are so complicated that it is virtually impossible to conceptually understand or model the whole surface water system. Therefore, as an alternative they recommended using a statistical approach to make the most out of the sparse monitoring samples and evaluate all river reaches (National Research Council, 2001; Reckhow, 2003). To take into account data sparseness and basin characteristics, the spatially referenced regressions of contaminant transport on watershed attributes (SPARROW) model was introduced (Smith et al., 1997; Alexander et al., 2004). This model uses statistical regression equations to incorporate basin attributes. The Bayesian network model (Borsuk et al., 2003, 2004; Stow et al., 2003) also combines the statistical approach with mechanistic characteristics of the river basin. The advantage of these methods over our approach is that they incorporate hydrographic characteristics of the physical processes into their models. 
For instance, we adopt a Euclidean metric instead of the distance along the river (river metric) to take into account the effect of storm water runoff and ground water contamination, which tend to connect points directly across land. Hence, this might lead to a bias of the estimation since autocorrelation of surface water contaminants in river streams might be better estimated by using the river metric. In addition, unlike these studies, it is impossible for our approach to quantitatively identify the source of contamination. On the other hand, model simplicity is one of the advantages of our modeling approach, since our method uses only available monitoring data and does not require any additional input. By contrast, mechanistic models, as well as the SPARROW and Bayesian network models, need additional inputs other than monitoring data for estimation, and it would be difficult and costly to obtain all these additional inputs for all the river miles of New Jersey. Furthermore, our approach fully accounts for the space/time variability of water quality in terms of the space/time covariance function, which is a considerable advantage over existing mechanistic and statistical inference models, and is the key step in being able to estimate water quality along unsampled river miles where additional inputs are not available.


    SUMMARY AND CONCLUSIONS
Tetrachloroethylene is one of the most frequently detected VOCs in water systems across the USA. In New Jersey, the concentration of PCE exceeded its quality standard (0.388 µg L–1) at several monitoring stations. Therefore, assessing the PCE concentration along all river streams is essential to maintain good water quality and protect public health. However, due to budget and scientific limitations, it is practically impossible to monitor this contaminant at all times along all stream reaches in the state. To address this issue, we have utilized an analysis framework that provides a rigorous model for the composite space/time variability of the water quality data along the river network, and yields maps describing the distribution of PCE across space and time. As demonstrated by our results, our space/time estimation provides an assessment of PCE in the streams of New Jersey that is substantially more accurate than the classical approach relying on a purely spatial analysis of the data. This has important implications for the State of New Jersey, as it shows that the space/time framework presented in this work will allow the state to considerably increase the number of river miles that can be assessed for surface water PCE concentration. Using a probabilistic criterion to assess attainment of the surface water quality standard, the number of river miles highly likely in nonattainment of the standard (i.e., for which Prob[PCE > 0.388 µg L–1] > 90%) increased from 1999 to 2000 to reach about 0.45% of all river miles in New Jersey, and then steadily decreased thereafter. In addition, only seven watershed management areas (WMA01, WMA04, WMA05, WMA07, WMA10, WMA18, and WMA20) contributed to nonattainment. Finally, there was a large increase of nonassessed river miles in 2001 through 2002, reaching as much as 25% of all river miles in the state.
In the future, the space/time analysis presented here will help optimize the space/time collection of water samples so as to minimize the number of nonassessed river miles.

Our results demonstrate that it is crucial to rigorously account for the composite space/time variability of monitoring data when assessing the surface water concentration of PCE along the rivers of New Jersey. However, while our work addresses this issue adequately, it would be worthwhile for future research to consider other important aspects of the space/time water quality assessment of PCE. First, future research should consider the impact of PCE analytical and sampling measurement errors. This could be a natural extension of the present work using the BME framework to model measurement errors in terms of probabilistic soft data. For example, the NWISWeb data reported with a less-than sign could be modeled as soft data of interval type between zero and the detection limit, rather than being set to half the detection limit. Likewise, data coded with a letter "E," indicating estimated values, could be considered probabilistically provided that information about the reliability of these estimated values is available. Second, future research on PCE water quality assessment should consider the use of secondary data. Several studies reported co-occurrence of PCE and trichloroethylene (TCE) (Lopes and Bender, 1998; Squillace et al., 1999; Grady and Casey, 2001). In addition, it is reported that PCE can degrade to TCE (Vogel and McCarty, 1985). Therefore, the assessment of PCE might be improved by studying the association between PCE and TCE, and using that association to incorporate secondary TCE data in the estimation of PCE. Finally, as mentioned above, using the distance along the river (river metric) might better estimate the autocorrelation of surface water contaminants in river streams. Therefore, we recommend that a third topic for future research be the development of a framework to estimate the space/time autocorrelation of surface water contaminants using a river metric.


    ACKNOWLEDGMENTS
 
This work was supported by the New Jersey Dep. of Environmental Protection (contracts SR03-046 and SR04-062) and grants from the National Inst. of Environmental Health Sciences (grants no. 5 P42 ES05948 and P30ES10126).


    REFERENCES