Domain Editor-in-Chief: Detlev Helmig; Institute of Alpine and Arctic Research, University of Colorado Boulder, US
Guest Editor: Brian Lamb; Washington State University, US

1. Introduction

Unconventional oil and gas exploration (UOG), i.e. the exploration of oil and gas from shales using horizontal drilling, slickwater mixtures, and hydraulic fracturing, is a relatively new source of air pollutants to the lower atmosphere. The rapid development of UOG in several United States shale areas has not only increased domestic oil and gas production (Energy Information Administration, 2015), but has also caused an increase in methane and non-methane hydrocarbon (NMHC) emissions to the atmosphere (Petron et al., 2012; Gilman et al., 2013; Karion et al., 2013; LaFranchi et al., 2013; Miller et al., 2013; Peischl et al., 2013; Swarthout et al., 2013; Edwards et al., 2014; Helmig et al., 2014; Macey et al., 2014; Petron et al., 2014; Thompson et al., 2014; Zavala-Araiza et al., 2014; Karion et al., 2015; Lan et al., 2015; Lavoie et al., 2015; Peischl et al., 2015; Smith et al., 2015; Townsend-Small et al., 2015; Yacovitch et al., 2015; Yuan et al., 2015; Zavala-Araiza et al., 2015a; Zavala-Araiza et al., 2015b; Zimmerle et al., 2015; Allen, 2016; Eisele et al., 2016; Franco et al., 2016; Helmig et al., 2016; Kort et al., 2016; Lyon et al., 2016; Marrero et al., 2016; Olaguer et al., 2016; Prenni et al., 2016; Schade and Roest, 2016; Turner et al., 2016).

Methane, NMHCs, and NOx emissions occur from various oil and gas exploration related activities: well pad and drill rig establishment, drilling, well bore preparation, hydraulic fracturing and flowback water returning to the surface, well completion, production, and transport, handling and processing of product. UOG exploration occurs at hundreds to thousands of small to medium (well) pad sites throughout a shale area, creating a landscape dotted with pads (Jones et al., 2015). The environmental impacts of UOG exploration are recognized (Jackson et al., 2014), and an ongoing focus has been on air emissions (Allen, 2014; Field et al., 2014; Moore et al., 2014; Allen, 2016), which can be attributed to a variety of individual sources. Some of these sources are temporary, such as hydrocarbons outgassing from flowback water (Allen et al., 2013), some are intermittent, such as emissions from valves (Allen et al., 2015) or flares combusting casinghead gas at oil production sites, and some are near permanent, such as fugitive emissions from compressor engines or storage tanks (Warneke et al., 2014; Brantley et al., 2015; Johnson et al., 2015; Lyon et al., 2016). Some of these sources may dominantly emit methane because natural gas is the handled product (Johnson et al., 2015), while others may be dominated by NMHCs because the product handled is oil or condensate (Brantley et al., 2015).

From a local to regional air quality perspective, emissions of NMHCs and NOx are of most interest since (i) some of the emitted hydrocarbons, such as benzene, are air toxics, and (ii) they contribute to regional photochemical ozone formation (Kemball-Cook et al., 2010; Edwards et al., 2014; Pacsi et al., 2015). Both of these aspects are more relevant in oil producing shale areas, such as the Eagle Ford shale in Texas, as compared to mostly gas producing shale areas, such as the Marcellus shale. This is because in the latter, the dominant hydrocarbon emissions consist of methane and ethane (Vinciguerra et al., 2015), which have low atmospheric reactivity, while a suite of NMHCs is emitted in the former areas (Helmig et al., 2014; Schade and Roest, 2016). In addition, flaring in oil producing shale areas (Elvidge et al., 2016) is a significant source of NOx (AACOG Natural Resources Department, 2014), but flaring is rare in gas producing shale areas.

In this work, we focus on NMHC and NOx emissions sources in the Eagle Ford shale in south central Texas (

There are ongoing discussions about both the economic consequences of the considerable amount of gas flaring in this shale area (Tedesco, 2014; Daly, 2016; Hiller, 2016b; Hiller, 2017), as well as the public health effects of associated emissions (Hiller, 2016a). As flaring is a source of atmospherically more reactive compounds produced by incomplete combustion, such as formaldehyde, acetaldehyde, and ethene (Strosher, 2000; Al-Fadhli et al., 2012; Knighton et al., 2012; Torres et al., 2012; Wood et al., 2012; Pikelnaya et al., 2013), as well as NOx due to high flame temperatures, its emissions might have a significant effect on local ozone formation (Olaguer, 2012). We therefore analyzed local air quality data from a new, continuous air monitoring station in Karnes City, Texas, in the middle of the Eagle Ford shale, established in December 2014. The monitor is centrally placed in that gas production (gas-to-oil ratio, GOR, >6000) is prevalent south of Karnes City, while oil production (GOR < 2000) is prevalent to its north, with a “wet gas” window in between (Energy Information Administration, 2014). We focus on the first year of measurements, 2015, during which UOG activities in Karnes county produced 154.9 million mcf (thousand cubic feet = 28.3 m3) natural gas, 97.2 million bbl (one barrel = 115.6 L) of oil, 23 million bbl of condensate, and 157.3 million mcf of casinghead gas. The latter is often flared at well sites, while condensate and oil are typically temporarily stored in onsite tanks.

Our main objective for this study was to assess the first year of data from the monitor and develop a source apportionment of contributing emitters. Since the monitor location was central, and urban, we hypothesized that observed NMHCs and NOx would stem from a combination of sources, including car traffic and UOG exploration. We then combined the acquired source knowledge to develop insight into the historic development of air toxics at this rural site, with a focus on benzene due to its important role in public health (World Health Organization, 2010; EPA, 2015; Propper et al., 2015a; Strum and Scheffe, 2016), even at ambient concentrations (Bolden et al., 2015a).

2. Methods

2.1. Site and data acquisition

We used publicly available monitoring station data from TCEQ’s Karnes county location (Figure 1), EPA Site Number 482551070, Continuous Air Monitoring System (CAMS) station 1070, activated in December 2014. The site was located in Karnes City, Texas, 28.885 degrees N and 97.902 degrees W, 131.4 m above sea level. Figure 1 shows a map of Karnes county and Karnes City highlighting the major traffic axes, as well as several potential oil and gas industry hydrocarbon sources discussed later (see also Sullivan, 2016).

Figure 1 

A south Texas county map of the Eagle Ford shale area (light brown). The monitoring site’s location in Karnes City is indicated by a red dot. The inset shows the regional road network surrounding the site in Karnes City, including the locations of oil & gas midstream facilities (orange triangles) and a fiberglass reinforced plastic production facility (blue square). Major roads are depicted as thick gray lines. Thin black lines mark county borders, and thin grey lines in the inset show residential roads. A close-up map of the monitor location including local traffic counts and a wind-rose are given in supplementary Figure S-1. DOI:

Before early 2017, when the monitor was relocated, it consisted of two air conditioned trailers, one housing an automated gas chromatograph for hydrocarbon measurements, the other a NO/NOx and an H2S analyzer, and standard meteorological instrumentation. Data were obtained in summer 2016 through TCEQ’s online portal at Data coverage of VOC, NOx and H2S was generally above 90% during each month of 2015, except during June 2015 for VOCs (52%) (Sullivan, 2016).

VOC measurements at the Karnes City site are performed in the same manner as at TCEQ’s Floresville monitoring site (Figure 1; Schade and Roest, 2016) at the northern edge of the shale area. TCEQ’s Standard Operating Procedure for VOC precursor analysis (TCEQ, 2005) is available through their Field Operations Division. As summarized previously (Schade and Roest, 2016), an automated GC dual FID system is used to hourly process a 40 min, 600 mL volume gas sample trapped onto a cooled adsorbent, rapidly thermally desorbed onto two chromatographic columns separating short-chained and long-chained NMHCs, respectively. Here, we reiterate two features of this widely deployed NMHCs analysis system in Texas: (i) the use of a Nafion™ dryer to remove water vapor from the sample stream, and (ii) the lack of ozone removal from the air sample before preconcentration. First, Nafion dryers can cause positive alkene artifacts, particularly for isobutene (Plass-Dülmer et al., 2002), wherefore TCEQ reports “NA” for this compound. In addition, propene memory effects can occur in Nafion dryers after elevated ambient abundances were encountered (pers. comm. with Carol Meyer, Orsat LLC., Pasadena, TX). Second, alkene losses can occur due to ozone reactions during cryogenic or solid sorbent sampling with subsequent thermal desorption (Helmig, 1997); as not all ozone may be lost in the current sampling setup, we cannot exclude that alkenes highly reactive toward ozone were at least partially lost from the sample before analysis. Since losses of the simplest alkenes ethene and propene are typically small (Plass-Dülmer et al., 2002), similar to our prior analysis (Schade and Roest, 2016) we decided to retain these alkene data, but will present only results for the least reactive alkenes ethene and propene, as well as the unreactive alkyne acetylene.

Continuous measurements of NO/NOx and H2S were carried out using Teledyne instruments; a model 200 chemiluminescence analyzer was used to measure NO and the sum of NO and NO2 (NOx) with a standard Molybdenum converter, and a model 101E UV fluorescence analyzer was used to measure H2S. Both instruments’ performance was checked regularly using 5-point calibrations as well as zero and span checks following procedures outlined in Standard Operating Procedure (SOP) DC-009-SYS: LEADS Site and Data Processing Management, Appendix A, Revision 0 (TCEQ, 2017). As specified by the manufacturer, Teledyne-API, both the NO/NOx analyzer and the H2S analyzer have a short-term lower detection limit and measurement precision (2×RMS noise level) of 0.4 ppb. Both instruments require stable environmental temperatures to minimize electronic output drift.

2.2. Data handling

All quality-assured data were downloaded in units of ppb (volume mixing ratios, VMR). Data were arranged into a continuous timeline, and combined with the hourly H2S, NOx and meteorological averages, replacing missing data or periods of instrument issues or calibration times with “NA”. R language software (R Core Team, 2017) was used for data analysis.

Both the H2S and NOx timelines showed a significant number of quality-assured values reporting negative VMRs, which is unphysical. It occurs as a result of the instrument’s electronic signal dropping below the average signal acquired during the latest accepted zero check. Upon closer inspection of these periods it was found that in nearly all NOx instrument cases output had suddenly dropped after a zero/span check, and was then varying diurnally as before, only to return to its previous, above-zero readings after another zero/span check. Such periods occurred only until mid-September 2015. Since this behavior is consistent with an offset due to measuring an insufficiently “clean” zero gas (e.g. due to a leak in the zero gas line), fixed in September 2015, we decided to apply a “reverse offset” to each of those periods, instead of removing or accepting-as-is all negative value data, which would be skewing the VMR distribution. An example of the raw data and the “reverse offset” replacement data is shown in supplementary Figure S-2. In all cases, the applied offset was 2–6 ppb NOx. In total, 39% of all non-NA NOx data were replaced by this procedure.

In the case of the H2S instrument, fewer and smaller such sudden changes (offsets) were encountered, but instead there were numerous readings around zero – in other words, around the instrument’s detection limit. When sudden changes related to zero or span checks were encountered, we applied the “reverse offset” (0.3–0.4 ppb); in all other cases, we replaced small negative readings with random positive data distributed between zero and half the instrument’s zero noise level given by the manufacturer (0.1 ppb). In total, 30% of all non-NA H2S data were replaced in these ways.

As a result of these correction procedures, the H2S and NOx statistics (Sullivan, 2016) were slightly altered. The annual means and 95 percentiles increased from 3.1 to 4.9 ppb, and from 12.9 to 14.2 ppb for NOx, whereas for H2S they increased from 0.2 to 0.3 ppb, and from 1.0 to 1.1 ppb, respectively. While this has no effect with respect to the legal interpretation of air quality at the Karnes county monitor, the modified data sets are physically consistent and can be used in the desired non-negative matrix factorization (NMF) analysis.

The VOC data showed no similar inconsistencies based on its acquisition. However, we found that a small step change occurred in the acetylene data between the first and second halves of the year. Apparently, acetylene abundances after a longer GC instrument service period had increased, and we speculate this change can be attributed to an improved microtrap collection efficiency. C2-hydrocarbons are prone to incomplete collection on the instrument’s microtrap and acetylene is known to be the first C2-compound to show such breakthrough (pers. comm. with Carol Meyer, Orsat LLC, 2016). No equivalent step-changes were observed for the other C2-compounds ethylene and ethane, neither for higher NMHCs. Data discussions that include acetylene thus focus on the second half of the year.

2.3. NMF analysis

Non-negative (N), or Positive Matrix Factorization (PMF) is a statistical tool used for pollutant source apportionment in situations where time series of environmental measurements (here: non-methane hydrocarbons, NMHC) exist with little to no prior information as to which and how different emissions sources of the measured compounds contribute to local observations. The mathematical framework for PMF was described by Paatero (Paatero and Tapper, 1994; Paatero, 1997). It decomposes a data matrix into a pre-selected number of factor profiles by resolving the covariance of concentration (here: VMR) enhancements. The factor profiles, or simply factors, then characterize the chemical composition of emissions from various sources, provided the experimenter has selected an appropriate number of factors representing the dominant emission sources. Since source profiles can change as a result of chemistry, deposition, and atmospheric mixing during transport, the determined factors from a locally collected data set do normally not reflect known source profiles in the region exactly. Instead, the experimenter needs to interpret factors by assigning probable emission sources based upon comparisons to known emissions profiles and an analysis of meteorological and other conditions, such as type of environment (space, such as urban vs. forested) and related emission source process activities (time and space).

We used R language package NMF (Gaujoux and Seoighe, 2010) for analysis of the combined NMHC-NOx-H2S data set. As the NMF package was originally created for the Bioinformatics research community, for example as a tool for gene analysis, it was decided to compare its results with that of the more widely used source apportionment tool PMF2 used in environmental research (e.g. Ramadan et al., 2003; Begum et al., 2005; Chan et al., 2008; Dutton et al., 2010; Karnae and John, 2011; Rodenburg and Meng, 2013; Tian et al., 2014; Belis et al., 2015). For a direct comparison we selected the data set of Guha et al. (2015) as it provided a similar set of measurements and NMHC species.

Following the description by Guha et al. (2015) we prepared our data set by removing background levels for each species and normalizing abundances (Guha et al., equation 4). Instead of with a single value, zero values were replaced with random values distributed between zero and one half the species’ limit of detection (LOD). To carry out the NMF using the least-squares optimization algorithm used in PMF2, we defined errors in a similar way as described by equations 5a and 5c in Guha et al. (2015), not including a limit of quantification (LOQ) (equation 5b in Guha et al., 2015) as such was not defined by TCEQ for these data. For the comparison between R package NMF and PMF2 using Guha et al.’s data set, their normalized data and error matrices were used as is. Because the NMF package provides more than one optimization algorithm and seed setting options, the Guha et al. (2015) data set was processed using both random seeds and svd (single value decomposition) seeds, the latter providing for increased computational efficiency. Results are described in the supplementary materials, particularly Figure S-3. Regardless of algorithm and seed used for this analysis, the NMF package provided nearly identical outcomes to those presented by Guha et al. (2015), which lends confidence to the use of R-language based NMF for the analyses presented here, and as compared to the more common PMF tool used by EPA for source apportionment analyses.

3. Results and Discussion

We first put measured NMHC and NOx abundances into context of other, similar measurements in shale areas, including a comparison of hydrocarbon ratios as a function of shale area and time of year (section 3.1). We next describe the results of the NMF source apportionment analysis to determine the dominant factors contributing to the observed composition at the site, then interpret obtained factors based on their composition and the air mass origin (section 3.2). We present case studies that relate the factor profiles to observations at the monitor (section 3.3), and, lastly, calculate factor contributions to benzene abundances (section 3.4) to estimate how renewed oil and gas development NMHC sources have altered hazardous air pollutant trends in this area.

3.1. Abundances and ratios

Similar to our previous work (Schade and Roest, 2016), we show selected NMHC, NOx and H2S annual abundance distributions in form of a boxplot in Figure 2, including typical values observed in US cities over a decade ago (Baker et al., 2008). All saturated NMHCs, such as the n-alkanes typically associated with oil and gas exploration, showed significantly higher median VMRs as compared to the Floresville monitoring site north of the shale area (Schade and Roest, 2016), with extreme mixing ratios over 500 ppb in the case of ethane and propane. Higher alkanes were all strongly correlated with ethane and propane but lower in abundance. Overall, VMRs of nearly all NMHCs in Karnes City in 2015 were substantially higher than at TCEQ’s Floresville site in 2013/14 (Schade and Roest, 2016) located north of the shale area, likely due to the more central location of Karnes City near sources in the shale exploration area (Figure 1). For example, median to third quartile propane abundances were 1.5 times higher, pentane abundances two times higher, and cyclohexane abundances two times higher, while benzene abundances were comparable. Similar to the Floresville site, however, VMR distributions were strongly right-skewed (Figure 2), with the highest VMRs associated with shallower boundary layer heights at night under weak northerly winds. In general, log-normal or near log-normal distributions were observed, and two examples are given in supplementary Figure S-4.

Figure 2 

Boxplot of annual hydrocarbon, NOx and H2S VMRs measured at the Karnes county monitor site during January to December 2015. Horizontal bars are medians, triangles are averages, error bars represent 95% confidence limits (missing lower limits, such as for benzene, are due to values of zero not plotting on the log-scale), and outliers are plotted as small open circles. For comparison, the range of mean VMRs in 28 large cities presented by Baker et al. (2008) (vertical brown bars) are shown, although 2015 VMRs in these urban areas are likely lower (Warneke et al., 2012; McDonald et al., 2013). In addition, the medians observed in nearby Floresville are also shown (horizontal dark green bars). All compounds showed right-skewed distributions and sample skewness (Doane and Seward, 2011) is indicated across the top of the graph. DOI:

Correlations between the saturated NMHC showed both seasonal and directional variability, likely due to varying source strengths and different atmospheric lifetimes. As an example, we show the ratio of n-pentane to propane in Figure 3 as function of season and wind direction: Pentane becomes more abundant relative to propane in summer (Figure 3a). Pentane’s shorter atmospheric lifetime as compared to propane suggest that its atmospheric abundance relative to propane should decrease in summer, but the opposite was the case, meaning increasing evaporative pentane emissions due to higher temperatures in summer were not matched by equivalent propane emissions. This was especially prominent for dominant southeasterly wind directions (Figure 3b), for which air masses travelled comparatively short distances over the shale area (Figure 1). Under these conditions, local impacts from oil and gas exploration emissions in the shale area appear to be minimized, while emissions from vehicle traffic may be maximized as traffic impacts are amplified along the US-181 corridor and the neighboring town of Kenedy (Figure 1, Figure S-1a). Since ethane and propane emissions from car traffic are small as compared to the pentanes (Kawashima et al., 2006; Liu et al., 2014; Pang et al., 2014), an elevated pentane-to-propane ratio may be expected under southeasterly winds in summer.

Figure 3 

Seasonal and directional dependence of hydrocarbon ratios. Seasonal changes to the observed n-pentane to propane ratio at the Karnes county monitor (a), showing higher ratios in summer. When plotted as a function of wind direction (b), southeasterly winds stand out due to significantly higher ratios. DOI:

This feature of the Karnes City site is further illustrated via both pentane VMRs and the isopentane-to-pentane ratio, shown in Figure 4. Pentane, and in extension all alkane VMRs, minimized for southeasterly air trajectories (Figure 4a), while isopentane-to-pentane ratios maximized for those directions (Figure 4b). Urban isopentane-to-pentane ratios, dominated by car traffic emissions, are typically ≥2 (Parrish et al., 1998; Warneke et al., 2007; Roest and Schade, 2017), while the same ratio is closer to 1 in areas affected dominantly by oil and gas exploration sources (Gilman et al., 2013; Petron et al., 2014; Thompson et al., 2014). For air mass trajectories from the SE sector (approx. 120–170 degrees), air mass origins typically lie in the Gulf of Mexico (Roest and Schade, 2017), spend comparatively short periods over land and, traveling perpendicular to the shale axis, short distances over the main shale exploration areas (Figure 1). At the same time, these air masses traverse the city of Kenedy and the US-181 traffic corridor en-route to Karnes City. This explains why the isopentane-to-pentane ratio in these air masses is higher relative to other wind directions (Figure 4b).

Figure 4 

Boxplots of hydrocarbon dependencies on wind direction. Shown are n-pentane VMRs (a) and the isopentane-to-pentane ratio (b). n-pentane mixing ratios were significantly lower for southeasterly wind directions. Higher isopentane-to-pentane ratios indicate stronger contributions from car traffic, which typically emits at a ratio of >2:1, while lower ratios around 1 are indicative of oil and gas exploration related emissions (e.g. Gilman et al., 2013). DOI:

Other ratios evaluated with such air quality data are the NOy-to-acetylene (e.g. Lee et al., 2006) and benzene-to-acetylene ratios (e.g. Fortin et al., 2005). Acetylene, or ethyne, is a tracer of gasoline combustion processes (Pang et al., 2015), and has a relatively long atmospheric lifetime. Benzene in urban areas is largely attributed to tailpipe exhaust and then strongly correlated with acetylene. Improved combustion and exhaust cleaning technologies alongside reformulated gasoline have lowered NMHC emissions substantially in the last few decades (Parrish et al., 2009; Warneke et al., 2012; Pang et al., 2015; Propper et al., 2015b). Benzene emissions in particular have been reduced further due to specific US regulations (Fortin et al., 2005; Harley et al., 2006). In urban areas, NOx is typically associated with tailpipe emissions, and thus remains strongly correlated with acetylene even when measured as NOy downwind (Lee et al., 2006). Based on seasonal emission ratio and lifetime differences, ratios of selected NMHC, and NOx, to acetylene show different values in winter as compared to summer (Lee et al., 2006; Coll et al., 2010). When we evaluated these ratios for the Karnes City data, we noticed a step change in summer 2015 after a prolonged servicing period of the automated hydrocarbon GC (section 2.2). As a consequence, acetylene ratios were only evaluated for the second half of the data (July–December 2015). Figure 5 shows two examples using boxplots of monthly benzene- and isopentane-to-acetylene ratios. Both compounds’ abundances were higher during summer than during winter, and both were substantially larger than measured in and downwind of large urban areas, where traffic sources are dominant. Only a subset of data outside the inter-quartile ranges of all months approached typical urban values. In the case of isopentane, the observed seasonal change is expected based on evaporative sources, which are more prominent during warmer summer periods (Coll et al., 2010; Kota et al., 2014). However, the observed isopentane-to-acetylene ratio at the Karnes City site was 5–10 times larger. Hence, evaporative sources that do not emit acetylene appear to strongly affect alkane abundances at the Karnes City site. In turn, based on similar behavior between benzene (Figure 5a) and isopentane (Figure 5b), evaporative sources likely also affect benzene abundances, but not to the same extent. Major contributions to this evaporative source may come, for instance, from liquid storage tank emissions (Lyon et al., 2016).

Figure 5 

Boxplots of seasonal hydrocarbon ratio changes. Shown are benzene-to-acetylene (a) and isopentane-to-acetylene (b) ratios observed at the Karnes county monitor as compared to several other studies. The solid error bars shown in both graphs are based on results from Ammoura et al. (2014), Coll et al. (2010), and Warneke et al. (2007); the individual data squares are based on results from Pang et al. (2015). DOI:

In conclusion, NMHC VMRs at the Karnes county monitor appear to be strongly affected by oil and gas exploration related sources as well as car traffic sources, however, the latter prominent only under typical SE wind directions. NMHC abundances under these conditions, i.e. SE winds, are substantially lower as compared to more continental air mass origins, which are more strongly affected by emissions from oil and gas exploration in the surrounding shale area. Similar to our prior analysis for the monitor north of the shale area (Schade and Roest, 2016), we find that typical NMHC abundances and ratios (Figure 2, Table S-1) all suggest that, on average, emission sources from oil and gas exploration dominate over traffic-related sources in this area.

3.2. Source apportionment analysis

Following the removal of missing values, data normalization, and error matrix compilation, we ran a series of NMF analyses using the NMF-internal least squares optimization technique in R (R Core Team, 2017), as well as other algorithms offered by the NMF package. Both the complete, annual data set was used, as well as the subset of data after the major instrument service period in July 2015, thus representing the second half of the year. The latter was done based on our observation of acetylene breakthrough, to test whether the lower acetylene abundance levels affected the NMF results. Here, we focus on the least-squares minimization algorithm results using both random and non-negative single value decomposition (nnsvd) seeds (Boutsidis and Gallopoulos, 2008). The former seed method gives slightly different results for each individual NMF execution, requiring multiple runs to establish a representative, robust result. The latter seed method is computationally more efficient and produces very similar results.

We discuss a random seed multi-run result and compare it to the much faster nnsvd run result. We used the nnsvd seed results of 100 runs of randomly selected subsets of the data to estimate an uncertainty of the selected multi-factor decomposition and its factor profiles. Figure 6 shows the factor profile results from a 30-run random vs. nnsvd-seed computation. Six factors were selected to represent the data set after comparing results for both random and nnsvd-seed results based on 3–10 factors. Selection criteria for the number of factors were based both upon internal quality criteria available as part of the R NMF package, such as factor sparseness, silhouette, and residual sum of squares (rss) values, as well as an inspection of changes to the factor profiles as additional factors were computed. Internal, absolute quality criteria such as rss were compared to those for randomized data generated from each data set to evaluate the results more objectively (e.g. Hutchins et al., 2008). The quality criteria comparison suggested that more than five factors did not significantly improve data set variance explanation, and we found that more than six factors resulted in significant overfitting as existing factors were split into new factors without significant additional gain in factor sparseness, silhouette, or explanation of data set variance. Six factors were ultimately preferred over five factors as an additional factor characterized a subset of data strongly affected by dominantly one compound, styrene, further discussed below. The coefficients for all six factors were found to be highly independent, and a pairwise scatter plot compilation is shown in Figure S-5.

Figure 6 

NMF analysis results. Karnes county monitor NMF factor compositions using an optimized random seed (left) versus the nnsvd seed (right). See text for factor interpretations. Svd seed results on the right show slightly lower individual contributions overall due to higher residuals as compared to the optimized random seed method. DOI:

Figure 6 includes our interpretation of the factor origins based on comparisons to known profiles and the examination of wind direction distributions of the six factor coefficients shown in Figure 7. The evaporative and fugitive emissions factor compares well with EPA SPECIATE ( data on permeation-evaporative emissions from gasoline storage tank emissions (e.g. profile 8771); it is dominated by short-chain NHMCs but includes aromatic compounds. Its wind directional distribution is nearly uniform except for SE winds. The light duty vehicle (LDV) traffic emissions factor is dominated by aromatic species alongside ethylene and acetylene, and compares well with SPECIATE data on LDV emissions (e.g. profile 5625); it maximizes for the SE directions (Figure 7), from which most traffic influences are expected due to the US-181 traffic corridor including the nearby City of Kenedy (Figure 1), and shows secondary maxima for directions NE and W likely due to significant traffic on TX-80 and the local, in-town, E-W traffic axis (Figure S-1a). The third factor, named diesel engine emissions, is characterized by long-chain and aromatic NMHCs, particularly trimethylbenzenes, as well as NOx (McCarthy et al., 2013). Although it relatively poorly matched SPECIATE profiles for diesel emissions, its composition and directionality suggest contributions from a diesel truck source maximizing along the in-town E-W traffic axis as well along Texas FM-99 (Figures 1, S-1a). The fourth factor, which also showed a near uniform directional composition, resembled SPECIATE data on heavy duty diesel emissions (e.g. profile 8775), but also two open (external) combustion profiles (4420 and 4721). However, the factor is dominated by NOx, ethane, benzene, ethene and propene, which are known emissions from natural gas flaring (Knighton et al., 2012), and the profile matched recently posted flare emissions profiles (e.g. FLR97 and FLR99) much better. In combination, we interpret this factor to be dominated by emissions from flaring and compressor engines at midstream facilities in the area (Figure 1), which surround Karnes City in all directions except the SE. In contrast, the fifth factor is strongly dominated by wind directions spanning from 240 to 60 degrees, which aligns nearly perfectly with the Eagle Ford shale’s separation between dry gas production to the south and oil production to the north of Karnes City. Based on the factor’s profile, dominated by cyclic NMHCs such as cyclopentane, and a decent match with SPECIATE data for oil storage tank emissions (profile 2472), we named this factor oil field emissions. We acknowledge that this wind sector may also transport urban emissions, which may share characteristics with oil and gas sources, from the San Antonio region towards Karnes county. However, San Antonio is approximately 80 km upwind, and it is likely that any urban emissions that reach Karnes county have been diluted during transport while local emissions from oil and gas sources are much more concentrated. The last factor, dominated by the single compound styrene, was linked to a less than 4 km distant industrial source (fiberglass reinforced plastics, FRP, manufacturer), which emitted an estimated 44.6 tons of styrene in 2014. The directional dependence of the factor profile, maximizing impacts at 160 degrees (Figure 7), matches the direction to the emitting facility from the monitor location (155 degrees). The fact that o-xylene and NOx co-score on this factor suggests that emissions from co-located solvent use and manufacture-related heating processes, respectively, may contribute emissions. The only other dominant emission from this industry, acetone, was not measured at the monitor.

Figure 7 

NMF factor wind direction dependencies. Individual directional dependence of the six NMF factors shown in Figure 6. Each radius in each subplot represents the mean contribution from that direction and for that factor, scaled to a range of [0, 1] for all 36 directions. A unit plot with radius one for all directions is shown on the upper right. DOI:

Hydrogen sulfide, H2S, remained generally below 5 ppb at the monitor. It scored dominantly on the oil field and flaring and compressor engine factors. This is likely due to the fact that H2S, when present, is typically associated with produced oil, not gas, and is separated from the hydrocarbon product at midstream facilities.

Overall, the evaporative and fugitive and flaring and compressor engine factors explained almost half of the data set variance, followed by the light duty traffic and diesel engine factors explaining another third. That car traffic emissions contributed significant amounts of NMHCs to the overall NMHC abundance and variance despite low VMRs under SE wind directions is explained by the fact that SE wind directions are dominant at this location (Figure S-1b). Northerly wind directions are less frequent, leading to overall lower oil field factor contributions despite much higher VMRs for those wind directions. In addition, there is some ambiguity associated with the evaporative and fugitive factor as we cannot quantify which fraction of this source is due to emissions from mobile sources (vehicle tanks) vs. oil and gas exploration related sources, such as large well-pad storage tanks. Although pentanes are expected to be found in the light duty traffic factor, their contributions were negligible compared to the evaporative and fugitive factor. However, the strong contributions of ethane and propane to this factor suggest that the latter source is dominant over vehicle sources since ethane and propane are minor components of commercial gasoline, and did not show independent variation to those NMHCs attributed to gasoline evaporative sources.

Error estimates of the factor profiles and coefficients were based upon (i) 100 NMF implementations using the full data set with random seeding, and (ii) 100 NMF implementations on randomly selected, approximately 40-day long data subsets from the 2nd half of the year using nnsvd seeds. In both cases, while factor profile patterns were reproducible, uncertainties varied strongly between individual compounds and between profiles (Figures S-6 and S-7). The light duty car traffic, industrial, and flaring sources were reproduced more precisely than the other three sources, which showed larger run-to-run variability; and a comparison between all runs and runs that produced significantly lower residuals suggest some amount of aliasing between the latter sources, especially between the oil field and evaporative and fugitive profiles. These profiles showed similar NMHC patterns, were separated primarily via cyclic NMHCs and H2S patterns as created by wind directions from the oil production region of the Eagle Ford, and secondarily via a larger contribution of short-chain (<C5) vs. long-chain (≥C5) NHMCs in the evaporative and fugitive vs. oil field factors, respectively. Both these profiles feature a significant but highly uncertain score for alkenes such as propene. While propene is dominantly associated with the flaring profile, this does suggest other possible sources. We speculate that incomplete combustion processes in flares (Knighton et al., 2012) at individual well-pad sites may contribute to the oil field and evaporative and fugitive sources via co-emission or co-advection not resolved by the NMF procedure.

3.3. Case studies

We discuss three case studies that illustrate the sources identified by the NMF analysis (section 3.2). The first looks at what appears to have been several clear flare emissions plume observations in early August 2015, likely coming from flares operating at a large midstream facility south of Karnes City. The second explores a period of high hydrocarbon and H2S levels during a period of consistent advection from the north, while the third looks at consistent advection from the south.

3.3.1. Flaring emissions plumes

The benzene-to-acetylene ratio has been used in the past as an indicator of decreasing benzene emissions (Fortin et al., 2005; Pang et al., 2015). As traffic related emissions were and are major benzene sources in urban areas, long-term high precision measurements indicate decreasing benzene emissions from traffic due to federal regulations and improved technology (Fortin et al., 2005). Figure 8a shows the observed and expected benzene-to-acetylene ratio at the Karnes City monitor during the second half of 2015. While the correlation is strong, the observed ratio (0.72) was much higher than what would be expected for an urban area (0.2–0.4). In addition, a small subset of values showed a much lower ratio, and we identified a total of 12 days between July and December 2015 showing temporarily low benzene-to-acetylene ratios at benzene VMRs above 0.2 ppb. The data highlighted in orange in Figure 8 were obtained during several days in early August 2015. Figure 9 shows time series of winds and selected NMHCs alongside NOx data for early August before and during the plume impacts. The plumes advecting from the south on DOY (day of year) 215–218 (August 3–6) 2015 were distinguished by higher alkene and acetylene VMRs alongside much smaller amounts of aromatics, such as benzene, and alkanes. NOx variability was not affected during the period prior to changing wind directions, but nevertheless showed distinct increases during the plume impact times, clearly visible on DOY 215, 217 and 218. The alkanes showed markedly higher VMRs for northerly wind directions before DOY 215, dropping under southerly wind directions (Figure 9). As shown in Figure S8, there were several active flares directly south of Karnes City, including at a large midstream facility only 2.1 km south of the air quality monitor (Figure 1). Local wind data (Figure 9) alongside a HYSPLIT back trajectory analysis suggest that emissions from these active flares arrived at the monitor within 1–2 hours, thus providing for observations not strongly affected by atmospheric chemistry. The observed compounds, such as C2 and C3 NMHCs, and benzene alongside NOx (Figure 9), reflect the flaring and compressor engine factor identified by the NMF analysis (Figure 6), with the exception of acetylene. Thus, these observations together with the identified flare operations in nearby upwind areas strongly support the factor identification. Considering that a total of 190 individual flares (approximately one flare per 10 km2) were recognized by the VIIRS instrument in Karnes county in 2015, it makes sense that the identified NMF factor accounts for a substantial fraction of the data set variability. The high scoring of NOx leads to 91% of NOx variability explained by this factor, and some of the remainder explained by the diesel engine factor. In other words, NOx emissions in this area are likely dominated by oil and gas exploration related instead of traffic related sources, and this could explain why NO2 total column abundances in rural shale areas that practice flaring have been increasing, while they have decreased substantially in urban areas (Duncan et al., 2016). Inventories of traffic-related vs. oil and gas exploration related NOx emissions for Karnes county for 2012, approximately 2 (Texas Transportation Institute, 2015) vs. 14 (AACOG Natural Resources Department, 2014) short tons per ozone season day, respectively, support this conclusion.

Figure 8 

Determining benzene from tracer hydrocarbons. Correlations of benzene with acetylene (a), presumably dominated by car traffic, and propane (b), presumably dominated by oil and gas exploration related sources. Benzene is two to three times more abundant than expected, and acetylene correlates surprisingly well with propane (d), while benzene variability is explained to 95% through a multi-linear combination of tracer NMHC species of the dominant NMF factors (Figure 6). The data points highlighted in orange were associated with combustion plumes from local flares as close as 2 km to the south of the monitor in August 2015 (see text; Figure 9). DOI:

Figure 9 

Case study of flaring plumes. Time series of winds and selected NMHCs alongside NOx for a period in early August 2015 (day of year, DOY 213 = 1 August). Note that winds (black is wind direction, left y-axis; dashed blue is wind speed, right y-axis) switched from northerly to southerly on 2 August, considerably reducing saturated NMHCs at night, but leading to several morning spikes of unsaturated NMHCs, benzene and NOx during the following days. DOI:

In summary, strong impacts of flaring emissions at this monitor can be identified via high acetylene-to-benzene ratios (>2.5). These conditions occur frequently, but rarely as unambiguously as observed in early August 2015. Since the flares also emit alkenes, alkene-to-benzene ratios can alternatively be used to diagnose flaring impacts. The derived flaring and compressor engine factor corroborates inventories suggesting that regional NOx emissions are dominated by oil and gas exploration activities.

3.3.2. Oil field hydrocarbon plumes

The impact of oil exploration in the Eagle Ford shale manifested itself at the Karnes City site mostly for northerly and westerly wind directions, with air masses passing over the northern section of the shale extending over 100 km to the northeast, 25 km north, and over 100 km to the southwest from the monitor. Several periods of very high NMHC mixing ratios were identified for northerly air trajectories, and an example period from end April 2015 is shown in Figure 10. During most of this period, weak northerly winds prevailed, particularly at night. During all nights between 27 and 30 April, 2015, elevated NMHC mixing ratio plumes were observed, including unsaturated NMHC, H2S, and NOx (cf. also Figure 9). Lowest mixing ratios were observed at daytime under strong northerly winds on 28 April (DOY 118), and after winds switched back to south-easterlies on 1 May (DOY 121, Figure 10). During the mornings of 28 and 29 April, NOx and alkene mixing ratios were elevated similar to observations in larger urban areas, suggesting increased car traffic emissions during rush-hour. However, the highest mixing ratios occurred during a clear night with a very shallow nighttime boundary layer on 29 to 30 April (NARR data,; last accessed 10 Aug 2017): Weak northerly winds brought a plume rich in NMHCs and NOx to Karnes City before midnight. When winds shifted temporarily to south-easterlies, abundances dropped sharply at first, only to exceed before-midnight values when winds shifted back to northerlies after 2 am. The plume contained more than 3.5 ppmC NMHCs at its maximum, and was followed by a plume of H2S with an hour delay, similar to two nights earlier (Figure 10). As nearly 85% of the carbon in this NMHC plume consisted of ethane and propane, and the plume contained a high amount of alkenes and NOx, numerous active flares detected between 4 and 15 km north and north-east of the monitor that night (Figure S9) probably contributed to the observed mixing ratios. The dominant factor, however, causing such high NMHC abundances in nighttime plumes, are shallow boundary layer depths and slow transport of air masses along trajectories within the shale area, accumulating emissions from numerous sources on the way. HYSPLIT back-trajectories (Draxler and Rolph, 2015) using 12-km resolution NAM data and arriving during the peak plume impact (6 am on 30 April 2015) show that this air mass likely originated in northeast Karnes county the previous evening, first meandering south-southwest, then northwest, then south-southeast into Karnes City (Figure S-9).

Figure 10 

Case study of oil field hydrocarbon plumes. Time series of winds (black is wind direction, left y-axis; dashed blue is wind speed, right y-axis) and selected NMHCs, and H2S alongside NOx for a period in late April 2015 (DOY 117 = 27 April). Northerly winds dominated during this period, triggering several nighttime events of elevated hydrogen sulfide (orange diamonds). Elevated alkene VMRs (red circles) coincided with NOx (brown diamonds) and saturated alkane VMRs during several nights suggesting plumes affected by both flaring and evaporative/fugitive emissions. Benzene (green crosses, ×10 to increase visibility) typically correlated well with NOx, such as late on 27 April and 29/30 April. DOI:

In summary, we found that the highest NMHC VMRs observed at the Karnes City site occurred during nights with shallow atmospheric boundary layers and northerly air mass origins, probably affected by sources in the oil exploration sector of the Eagle Ford shale, including flares that contribute NOx and unsaturated hydrocarbons as a result of incomplete combustion. The co-emissions and co-advection of the latter, and well as of H2S separated at midstream facilities, can explain the composition of the oil field factor (Figure 6). In other words, the comparatively high scores of unsaturated hydrocarbons such as propene on the oil field and evaporative and fugitive factors of the NMF analysis (Figure 6) is likely due to the procedure’s inability to completely separate co-advecting sources.

3.3.3. Summertime south-easterlies

When winds blew steadily from the south to southeast, as dominantly occurring during south Texas summers, NMHC VMRs were consistently lower at the Karnes county monitor as compared to other periods. Figure 11 depicts such a period in July 2015, showing that individual alkene VMRs typically remained below 0.5 ppb, pentane VMRs below 10 ppb, and benzene VMRs below 0.2 ppb. Typical NOx levels remained below 5 ppb during this time, with higher values occurring during the weekday morning rush hour. Higher NMHC VMRs during this period were observed dominantly during the morning rush hour as well. However, on all these weekdays, DOY 197, 198, 201 and 202 (Thursday and Friday, 16 and 17 July, and Monday and Tuesday, 20 and 21 July 2015), air masses also passed over a major midstream facility and its active flare 2 km south of the monitor, discussed in section 3.3.1 (Table 1). Both the benzene-to-ethene and the isopentane-to-pentane ratios on these days dropped as these NMHCs increased, inconsistent with a car traffic source of such elevated ethene VMRs (Figure 11; isopentane-to-pentane ratios only exceeded 1.5 for southeasterly winds). Note, however, that NOx VMR increases varied more between these events, with the lowest NOx increase coinciding with the largest ethene increase on DOY 201. We speculate that this may have been related to a lower flare temperature on 20 July 2015, causing lower NOx but higher alkene production from incomplete combustion in the flare’s flame. On the other hand, NOx abundances, when restricted to southeasterly wind directions between 120 and 170 degrees (Figure 4), did show the expected morning rush hour peak. Hence, while the fundamental contribution of car traffic to NOx emissions manifests under these circumstances, it appears that the NMF procedure was unable to resolve this contribution except in a few cases (Figure S-7).

Figure 11 

Case study of south-easterlies. Time series of winds (black is wind direction, left y-axis; dashed blue is wind speed, right y-axis) and selected NMHCs, and H2S alongside NOx for a period in mid July 2015 (DOY 197 = 16 July). Southerly to south-easterly winds were maintained throughout this period, and NMHC and H2S abundances were significantly lower than shown in Figures 9 and 10. However, several spikes of alkenes on 17, 20 and 21 July were likely affected by local flare emissions (see text and Table 1). DOI:

Table 1

Selected flare occurrences at a midstream facility 2 km south of the Karnes City air quality monitor (Figure S-8)a. DOI:

Date flare size estimate flare temperature

16 July 2015 10 m2 1803 K
17 July 2015 1 m2 1755 K
20 July 2015 very smallb unknown
21 July 2015 4 m2 1810 K
3 August 2015 0.5 m2 2044 K
4 August 2015 very small unknown
6 August 2015 very small unknown

a Data are from NOAA’s Earth Observations Group (EOG) analysis of VIIRS data (Elvidge et al., 2016) (

b too small and thus of insufficient radiance to determine a flare temperature.

In summary, while NMHC VMRs were significantly lower for southerly wind directions, morning rush hour impacts remain difficult to separate from flaring impacts due to a large midstream facility just 2 km south of the air quality monitor. Only south-easterly winds appeared to guarantee cleaner air masses (Figure 4) that revealed common car traffic contributions.

3.4. Benzene abundances and sources

Benzene remains one of the most relevant air toxics nationwide, although its VMRs have dropped consistently in most urban areas in the last two decades (World Health Organization, 2010; EPA, 2015; Propper et al., 2015a; Strum and Scheffe, 2016) ( EPA estimates the non-cancer effects inhalation reference concentration (RfC) at 30 µg m–3 (9 ppb), and the one-in-hundred-thousand cancer risk at 1.3–4.5 µg m–3 (0.4–1.4 ppb) (EPA Integrated Risk Information System (IRIS), 2003). A recent review by Strum and Scheffe (2016) shows that annual average benzene levels remain above the one-in-a-million cancer risk benchmark (0.13 µg m–3) in several large US cities, and Bolden and co-authors argue that benzene as part of the BTEX species may exert significant public health effects at typically observed ambient levels due to their endocrine disruptive activity (Bolden et al., 2015a). Note that while the TCEQ uses a long-term health standard of 1.4 ppb to assess toxicology at air quality monitoring sites (, public health effects may occur at exposures below this threshold.

At the Karnes City site, average annual benzene levels in 2015 were 0.21 ppb (0.11 µg m–3; median: 0.13 ppb), lower than current levels in several large urban areas, but one half to two thirds of 2015 levels in Houston’s petrochemical industry area along the Houston Ship Channel (excluding the historically highest abundance monitor; Table S-2). This is unusual considering the rural location of the Karnes county monitor, the comparatively small amount of vehicle traffic in the area, and the lack of large industrial activities as compared to Houston’s Ship Channel. However, as the NMF analysis suggests, and the case studies illustrate, benzene sources in the area are not dominated by vehicle traffic or industry, but seemingly by shale oil and gas exploration and production activities. Propane VMRs explained 88%, and a combination of tracer compounds for all sources identified by the NMF procedure explained more than 95% of benzene variability (Figure 8c). Using the results of the random seed NMF analysis, we reconstructed mean benzene abundances as a function of sources and time of day, shown in Figure 12. Traffic source impacts maximized in the late afternoon and early evening when VMRs are lowest, while evaporative and fugitive source impacts maximized at night due to shallower boundary layer depths. The flaring and compressor engine source impact is dominant and distributed evenly across the day. On average, the NMF results showed that 18 to 38% of the annual average daily benzene abundance is due to traffic sources, while more than half of benzene is due to oil and gas exploration activities presuming none of the evaporative and fugitive sources are associated with vehicle emissions (section 3.2). In turn, this means that the rapid development of the Eagle Ford shale since 2008 caused benzene abundances to increase by a factor of three to five assuming little to no regional impacts from legacy oil and gas exploration in the area. At the Karnes county monitor, the highest benzene VMRs (3–4 ppb) were associated with two nighttime plumes advecting from the north in January 2015 under similar conditions as described for the case study in section 3.3.2. Similar results were recently published for the Denver-Julesberg Basin in Colorado, identifying high nighttime VMRs of benzene with oil and gas sources (Halliday et al., 2016). Considering that the Karnes county monitor was located relatively far from known sources to its north, much higher VMRs closer to the emitting facility than encountered at the monitor are likely (Schade and Roest, 2016). Since such situations of shallow boundary layer depths combined with significant surface emissions of air toxics occur on a regular basis, residents, livestock, and wildlife living in close proximity to emitting facilities are at elevated risk (McKenzie et al., 2012; Witter et al., 2013; Brown et al., 2014; Brown et al., 2015; Balise et al., 2016; Rasmussen et al., 2016; Chen and Carter, 2017; Tustin et al., 2017).

Figure 12 

Reconstructing source contributions to benzene abundances. Average relative contributions of the NMF-identified emission sources to the mixing ratios of benzene at the Karnes county monitor in 2015. Note that benzene “background” as defined by the NMF procedure was the lowest measured benzene level, aka 0 ppb, and that percentages reflect the calculated factor-specific VMR value divided by the measured VMR. Discrepancies from 100% are due to averaging, showing that the NMF procedure at times over-, at times under-estimates measured benzene VMRs. DOI:

4. Summary and Conclusions

We have analyzed the first year of hourly air pollutant measurements from the center of the Eagle Ford shale area in south central Texas. We showed that non-methane hydrocarbon (NMHC) abundances are unusually high, with selected NMHC-ratios, such as the isopentane-to-pentane ratio, displaying strong impacts from emissions from oil and gas exploration under almost all wind directions. We carried out a source apportionment analysis using non-negative matrix factorization (NMF), and demonstrate that the monitored NMHCs, NOx, and H2S are dominantly attributable to shale activities in the area, rather than car traffic. Oil and gas exploration impacts were lowest under typical southeasterly winds along a local traffic corridor and neighboring town, minimizing oil and gas related source density. Pollutant concentrations were highest under northerly wind directions involving air masses slowly travelling over the northern section of the shale area that produces oil while flaring associated gas. The NMF analysis identified a major source factor resembling recently adopted flare source profiles and large diesel engine emissions, which appear to vastly dominate regional NOx emissions. While this appears surprising, the available county-level NOx inventories for traffic versus oil and gas related sources are in line with this result, suggesting that the latter outweigh the former sources by approximately 7:1. It may therefore not surprise that Duncan et al. (2016) observed increasing NO2 column densities above the rural US shale areas Bakken, Permian, and Eagle Ford from the time before the shale boom to 2015, especially when considering that (associated) gas flaring is common in these oil-producing shale areas, while considerably rarer in gas-producing shale areas. One of the results of increased rural NOx emissions is increased regional photochemical ozone production due to the general NOx limitation in these areas (Pacsi et al., 2015; McDuffie et al., 2016; Roohani et al., 2017).

Overall, the flaring and compressor engine source alongside the evaporative and fugitive source dominated NMHC abundances, the former providing unsaturated, the latter saturated NMHCs. In comparison, aromatic NMHCs were more equally distributed between the identified sources, including traffic. When we investigated benzene in detail due to its role as an important air toxic, it became clear that at this monitor, contrary to at most urban monitor locations in the US, benzene concentrations have likely more than doubled since before the shale boom, even if some of the emissions attributed to the evaporative and fugitive and diesel factors stem from light duty vehicles. Depending on the composition and handling of the oil, condensate, and associated gas produced in shale areas, this finding (of increasing concentrations) is likely reproducible in other shale areas, and could be relevant for trends observed in large urban areas downwind. Considering that the effects of BTEX species on public health may be more widespread than currently considered (Bolden et al., 2015a; Bolden et al., 2015b), more efforts should be made to evaluate BTEX species in air, especially in areas affected by oil and gas exploration, as well as research with respect to the health effects potentially associated with them.

Data Accessibility Statements

A dataset was generated from raw data provided by TCEQ and is available through the Texas Data Repository at R scripts generating the subsets of data analyzed and creating the Figures in this manuscript were assembled. Both are available from the lead author upon request.

The data used in this study are available from TCEQ at no cost online at The TCEQ did not participate in this analysis, nor does it endorse any of this work.

Supplemental Files

The supplementary file for this article can be found as follows: