Comparison of methane emission estimates from multiple measurement techniques at natural gas production pads

This study presents the results of a campaign that estimated methane emissions at 268 gas production facilities in the Fayetteville shale gas play using onsite measurements (261 facilities) and two downwind methods – the dual tracer flux ratio method (Tracer Facility Estimate – TFE, 17 facilities) and the EPA Other Test Method 33a (OTM33A Facility Estimate – OFE, 50 facilities). A study onsite estimate (SOE) for each facility was developed by combining direct measurements and simulation of unmeasured emission sources, using operator activity data and emission data from literature. The SOE spans 0–403 kg/h and simulated methane emissions from liquid unloadings account for 88% of total emissions estimated by the SOE, with 76% (95% CI [51%–92%]) contributed by liquid unloading at two facilities. TFE and SOE show overlapping 95% CI between individual estimates at 15 of 16 (94%) facilities where the measurements were paired, while OFE and SOE show overlapping 95% CI between individual estimates at 28 of 43 (65%) facilities. However, variance-weighted least-squares (VWLS) regressions performed on sets of paired estimates indicate statistically significant differences between methods. The SOE represents a lower bound of emissions at facilities where onsite direct measurements of continuously emitting sources are the primary contributor to the SOE, a sub-selection of facilities which minimizes expected inter-method differences for intermittent pneumatic controllers and the impact of episodically-emitting unloadings. At 9 such facilities, VWLS indicates that TFE estimates systematically higher emissions than SOE (TFE-to-SOE ratio = 1.6, 95% CI [1.2 to 2.1]). At 20 such facilities, VWLS indicates that OFE estimates systematically lower emissions than SOE (OFE-to-SOE ratio of 0.41 [0.26 to 0.90]). Given that SOE at these facilities is a lower limit on emissions, these results indicate that OFE is likely a less accurate method than SOE or TFE for this type of facility.


Introduction
Natural gas production in the United States reached a record high of 79 billion cubic feet per day in 2015 (U.S. Energy Information Administration (EIA), 2016). Natural gas emits less carbon dioxide per unit of energy produced during combustion than coal and oil, offering potential to reduce climate impacts from energy generation.
However, the climatic benefit of natural gas use depends largely on system leakage rates (Alvarez et al., 2012; Allen, 2014; Brandt et al., 2014). Many recent studies have been motivated by the need to better quantify and understand methane emission rates from natural gas systems at the device, facility, and regional scales, as well as across the industry segments including exploration, production, gathering, processing, transmission, and distribution (Lyon et al., 2016; Karion et al., 2015; Brantley et al., 2014; Lyon et al., 2015; Lamb et al., 2015; Alvarez and Fund, 2009; Allen et al., 2013; Roscioli et al., 2015; Mitchell et al., 2015; Subramanian et al., 2015; Allen, Sullivan, et al., 2015; Allen, Pacsi, et al., 2015; Yacovitch et al., 2015; Caulton et al., 2014; Lavoie et al., 2015; Lyon et al., 2015; Subramanian et al., 2015; Zavala-Araiza, ). Two recent studies at the facility scale have suggested methane emissions estimates developed from component-based inventories and direct measurements may estimate lower total emissions than downwind estimates of the same facility if all emission sources are not represented in direct measurements or inventory (Subramanian et al., 2015). At the regional scale, component-based inventories using emissions factors and activity data have often been found to estimate lower emissions relative to regional atmospheric measurement approaches (Allen, 2014; Caulton et al., 2014; Pétron et al., 2014; Karion et al., 2015; Zavala-Araiza, ).
Differences in emission estimates at facility and regional scales have been hypothesized to result from temporal variability arising from certain high emitting sources (Nathan et al., 2015; Zavala-Araiza, ), the inability to identify and perform direct measurements of all sources onsite (Mitchell et al., 2015; Subramanian et al., 2015), the use of inaccurate or unrepresentative emission factors and activity factors (Zavala-Araiza, ), incorrect attribution of emissions among methane sources (Townsend-Small et al., 2015), and/or underestimating the frequency and contribution of large but rare emission sources ("fat-tail" sources) (Caulton et al., 2014; Lavoie et al., 2015; Lyon et al., 2015; Mitchell et al., 2015; Subramanian et al., 2015; Zavala-Araiza, ; Zavala-Araiza, ). This work is part of a multi-method field campaign designed to estimate methane emissions across segments of the natural gas supply chain present in the study area. The study utilized multiple measurement methods to develop estimates of methane emissions at facility and regional scales. The results presented here focus on comparisons of emission estimates made using multiple methods at gas production facilities (well pads) during this study.

Study design
This study was performed in a portion of the Fayetteville Shale, Arkansas, in the mid-continental United States (Supplementary Material, Figure S1). Approximately 79% of 5,652 currently active and producing wells in the study area were drilled after 2008 (AOGC, n.d.). Two co-sponsors of the study (study partners) operate 82% of the active wells in the study area. These study partners supported site access for the field team and provided activity data, including equipment counts and records of episodically-emitting activities during the campaign, that were critical to accurate modeling and interpretation of facility-scale emission estimates. One study partner also provided a measurement team to perform onsite surveys and measurements operating under the supervision and direction of the study coordinator from the research team.
For this study, each well pad is considered a "production facility". Based on information provided by the study partners and corroborated by onsite observation, well pads in the study area have up to twelve wells and typically include one liquid separator and gas meter per well. Each well pad also includes liquid storage tanks for produced water, chemical injection equipment, and supporting piping, and may have temporarily or permanently installed wellhead compressor equipment to augment production. Facilities targeted for measurements were planned by the authors using geospatially-clustered random sampling. Measurements were made in six geographically-dispersed clusters, each containing at least 100 geographically proximate production facilities operated by study partners. Each day of the campaign, measurement teams were directed to a specific cluster by a study coordinator from the research team, who then selected facilities within the cluster to measure based on weather, drive time and other operational considerations. The coordinator deployed onsite and downwind teams to maximize the number of paired measurements from different methods while balancing overall sample size and operational objectives. The study coordinator considered only predominant surface wind conditions and downwind on-site or road access as criteria for site selection within the cluster, without knowledge of facility operations or emissions magnitude or input from industry representatives. Using well count as an indicator of facility size, a two-sample K-S test comparing the distribution of well count per site at the sampled locations to the well count per site for the study area population indicates that the sample is representative of the population at the 99% confidence level (alpha = 0.01).
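The representativeness check described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the well counts are hypothetical, the function names are ours, and the critical value uses the standard large-sample approximation, which is only approximate for discrete data with many ties such as well counts.

```python
import math

def ks_two_sample(a, b):
    """Two-sample K-S statistic D: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    for x in sorted(set(a) | set(b)):
        # Advance each pointer past all observations <= x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def ks_critical(n, m, alpha=0.01):
    """Large-sample critical value for D at significance level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))

# Hypothetical wells-per-pad counts (NOT study data).
sampled = [1, 1, 2, 2, 2, 3, 3, 4, 5, 6]
population = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12]

d = ks_two_sample(sampled, population)
d_crit = ks_critical(len(sampled), len(population), alpha=0.01)
print(f"D = {d:.3f}, critical D = {d_crit:.3f}, reject = {d > d_crit}")
```

When D falls below the critical value, the test does not reject the hypothesis that both sets of well counts come from the same distribution, which is the sense in which the sample is described as representative.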
A complete description of the organization of the study and role of study partners is provided in the Supplementary Material section S1 (S1). In short, contractual agreements ensured that the research team retained independent control of all aspects of the study, while benefiting from the additional knowledge and insight that only owners and operators of these facilities hold, which led to significant advances in confidence and interpretation of results as compared to studies without industry cooperation and site access.

Measurement methods
Three measurement methods were employed:

Onsite measurement
Two teams (one contractor, AECOM, and one study partner's measurement team) performed onsite direct measurement, and both teams followed the same measurement protocol (S2). Both teams were observed by a study coordinator to ensure study protocols were followed. Emission sources were first identified during a comprehensive site survey using a combination of optical gas imaging and handheld laser methane detection. An Onsite Direct Measurement (ODM) was then performed using a high-flow sampler to measure methane emissions from each identified point source, with a few exceptions, as mentioned below. Manufacturer's calibration procedures were strictly followed to eliminate measurement issues suggested in prior studies (Howard, 2015; Alvarez et al., 2016). In this analysis, which compares estimates developed from onsite measurements to those developed by downwind methods, the surveys and measurements performed by the two onsite teams are assumed equivalent (S2).
Bell et al: Comparison of methane emission estimates from multiple measurement techniques at natural gas production pads Art. 79, page 3 of 14
In practice, onsite methods to quantify facility-level emissions are limited by the ability to identify and/or directly measure all primary emission sources. Additionally, onsite measurements do not capture the temporal variability of intermittent sources such as automated liquid unloadings, intermittent bleed pneumatic controllers, or venting from produced water tanks as a result of separator liquid dump events. In this study, vented liquid unloadings (venting of the well bore to atmosphere to remove entrained liquids) were not measured by onsite teams. All pneumatic devices encountered in this study were classified by operators as "intermittent bleed" controllers. Routine venting from intermittent pneumatic controllers was not measured in this study; however, emissions from pneumatic devices other than routine venting (e.g. malfunctioning devices that were continuously emitting gas) were measured and are included in onsite direct measurements, classified by the location of the emissions source. If a source was inaccessible or if measuring presented a safety hazard to the measurement team, the source was considered "observed but not measured"; 23 sources fell into this category. All unmeasured sources (liquid unloadings, routine venting from pneumatics, and "observed but not measured" sources) were simulated to develop facility-level emissions estimates (see study onsite estimate, below, and S3).

Dual tracer flux ratio
One team from Aerodyne Inc. performed the dual tracer release method to measure facility-scale methane emissions. Tracer release is an established measurement technique that estimates emission rate by comparing the downwind plume and mixing ratio of the target analyte (here, methane) to the mixing ratios of one or more tracer gases released at a known rate from 1 or 2 locations within the facility (Lamb et al., 1995; Roscioli et al., 2015). The dual tracer flux ratio method uses a second tracer gas as an internal check. The tracer release method assumes the target analyte and the tracer gases are co-dispersed to normalize atmospheric transport. The method is limited in practice to locations with sufficient downwind access to sample the co-dispersed plumes and requires site access to release the tracer gases. The method also requires significant time (in this study, 30-240 minutes) to set up and run the experiment (Yacovitch et al., 2017). The technique typically cannot estimate component-level emissions at production sites, but when successfully applied the method provides aggregated facility-level estimates from all emitting components within the facility.
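The co-dispersion assumption reduces the emission estimate to a ratio calculation, sketched below. This is an illustrative sketch only: the release rates, plume enhancements, and tracer molar masses are hypothetical values chosen for the example, and the function name is ours. The study's actual plume processing is described in Yacovitch et al., 2017.

```python
M_CH4 = 16.04  # g/mol, molar mass of methane

def tracer_ratio_emission(tracer_release_kg_h, tracer_molar_mass,
                          ch4_enhancement_ppb, tracer_enhancement_ppb):
    """Methane emission rate (kg/h) from one tracer, assuming co-dispersion.

    Enhancements are plume mixing ratios above background (molar units), so
    the molar ratio is converted to a mass ratio using the molar masses.
    """
    if tracer_enhancement_ppb <= 0:
        raise ValueError("tracer plume not observed; no estimate possible")
    molar_ratio = ch4_enhancement_ppb / tracer_enhancement_ppb
    return tracer_release_kg_h * molar_ratio * (M_CH4 / tracer_molar_mass)

# Hypothetical plume: two tracers released at known rates (NOT study data).
# Molar masses 44.01 and 26.04 g/mol are assumed example tracer gases.
est_a = tracer_ratio_emission(1.0, 44.01, 120.0, 40.0)
est_b = tracer_ratio_emission(0.5, 26.04, 120.0, 34.0)

# The second tracer acts as an internal check: the two estimates should agree.
rel_diff = abs(est_a - est_b) / est_a
print(f"tracer A: {est_a:.3f} kg/h, tracer B: {est_b:.3f} kg/h, "
      f"relative difference: {rel_diff:.1%}")
```

A large disagreement between the two tracer-based estimates would suggest that the tracers and the methane source were not co-dispersed for that plume.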
As per Roscioli et al., 2015, 95% confidence limits were developed for each facility using plume-to-plume variability, or method uncertainty in the absence of sufficient plume data. When the tracer method identified a facility with non-detectable methane emissions, a facility-specific lower detection limit was estimated from the mixing ratio of the tracer gases and reported as an upper confidence bound, distinguishing a facility with emission levels below the lower detection limit from a facility at which the tracer plume is not observed, indicating a failure of the method. The tracer method and measurements from this campaign are presented in more detail in Yacovitch et al., 2017.

Other test method 33A (OTM33A)
One team from University of Wyoming performed OTM33A, a downwind emission plume mixing ratio mapping, source characterization and emissions quantification method applicable to characterization of emissions from facilities where emissions originate at a single spatially-localized location (US EPA, n.d.). OTM33A produces facility-level methane emissions estimates using inverse modeling based upon Gaussian dispersion (Turner, 1994) and the measured mixing ratio of methane and meteorological conditions at a stationary location in the emissions plume downwind of the emission location (Brantley et al., 2014; US EPA, n.d.). Like tracer release, OTM33A is limited in practice to locations with sufficient downwind access; however, OTM33A does not require site access to release a tracer gas and requires less time than the tracer method to make a measurement (in this study 45-60 minutes on location for a single measurement, 75-90 minutes for double measurement). Since OTM33A uses a dispersion model (in contrast to the co-dispersion assumption of the tracer method), it is expected to be less accurate than the tracer release technique.
The method is expected to underestimate emissions under poor transport conditions, such as emissions from tall tanks or lofted emissions as may occur from high rate liquid unloadings. The method uncertainty is estimated from previously conducted controlled release experiments as +117%/-46% (95% confidence bounds) of the measured mass emission rate (Robertson et al., 2017).
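To make the inverse-modeling step concrete, the sketch below applies a commonly used point-source Gaussian expression, Q = 2π·σy·σz·U·C, and then attaches the +117%/-46% method uncertainty quoted above. The expression is stated here as an assumption about the general approach; the inputs (peak concentration, wind speed, dispersion parameters) are hypothetical, and the actual OTM33A processing used in the campaign is documented in Robertson et al., 2017.

```python
import math

def point_source_gaussian_kg_h(c_peak_g_m3, wind_m_s, sigma_y_m, sigma_z_m):
    """Emission rate (kg/h) from a point-source Gaussian inversion.

    Assumes Q = 2*pi*sigma_y*sigma_z*U*C_peak (g/s), with C_peak the
    background-subtracted plume-centerline concentration.
    """
    q_g_s = 2.0 * math.pi * sigma_y_m * sigma_z_m * wind_m_s * c_peak_g_m3
    return q_g_s * 3.6  # g/s -> kg/h

def otm33a_bounds(q_kg_h, plus_frac=1.17, minus_frac=0.46):
    """95% bounds from the method uncertainty of +117%/-46%."""
    return q_kg_h * (1.0 - minus_frac), q_kg_h * (1.0 + plus_frac)

# Hypothetical measurement (NOT study data).
q = point_source_gaussian_kg_h(c_peak_g_m3=1.0e-3, wind_m_s=3.0,
                               sigma_y_m=4.0, sigma_z_m=2.5)
q_lo, q_hi = otm33a_bounds(q)
print(f"Q = {q:.3f} kg/h, 95% bounds [{q_lo:.3f}, {q_hi:.3f}] kg/h")
```

Because the estimate scales directly with the assumed dispersion parameters, a plume that lofts above the sampling inlet lowers the measured concentration without a compensating change in the model, which is consistent with the expected underestimation noted above.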
Campaign-specific details of the OTM33A method and measurements are presented in Robertson et al., 2017, but key methodological points are summarized here for convenience. In this study, a survey was performed by the mobile measurement lab prior to attempting the OTM33A technique. The mobile lab measured methane along transects driven downwind of major equipment on the well pad to identify any areas where methane mixing ratios were elevated above the background methane concentration (i.e. locations with methane enhancements). If methane enhancements were identified downwind of one location, the measurement team positioned the mobile lab downwind of a suspected emission point source to attempt an OTM33A measurement. An infrared camera was employed to confirm the position of suspected emission sources. The average distance from source for these measurements was 46 ± 24.0 meters. If multiple, spatially separated emission plumes were detected, indicating multiple emission locations on the facility (a limitation of the OTM33A technique), no facility-level estimate is reported. If methane mixing ratios were not distinguishable from the background levels detected during survey transects, the pad was reported as "zero based on transects" (0BOT). In contrast, if the mobile lab could not be positioned downwind of a detected emission source because of road access or other physical obstructions, or if an attempted measurement did not pass subsequent data quality checks, the measurement attempt failed and is not used in this study.

Facility-level emissions estimates
Each measurement method results in an independent estimate that represents total emissions from the facility. These facility-level emissions estimates are compared in pairs (see Table S2 for facility counts). Emissions from production facilities are not constant and include nontrivial temporal complexities such as episodic venting from intermittent pneumatic controller actuations and automated plunger liquid unloading. Since methods were not always performed simultaneously at each facility, temporally varying operations may create differences between the emission sources captured by each method. This was addressed in the analysis, where possible, by modeling both the emission rate and the likelihood of these episodic emission sources. Additionally, onsite observers' notes and operator information about activities occurring during measurements were checked to ensure site emission sources were similar insofar as the observer was aware.

Study onsite estimate (SOE)
The SOE represents an "as complete as possible" estimate of a facility's emission rate at the time of measurement considering all known emission sources, including episodic emissions from intermittent pneumatic controllers and liquid unloadings which may have occurred while downwind measurements were made. Each facility SOE is composed of ODMs plus engineering estimates of four unmeasured source categories (hereafter referred to as "simulated sources"), when present: (1) well venting during liquid unloading, (2) inaccessible sources which were noted as "observed but not measured", (3) pneumatic devices under routine operation, and (4) uncombusted methane in compressor engine exhaust if present and running. Engineering estimates of emissions from simulated sources were developed by the authors in a Monte-Carlo model utilizing empirical emissions distributions reported by prior studies deemed most relevant to the sources in Fayetteville (S3). 95% confidence intervals (CI) for the SOE are developed from the Monte-Carlo results and include uncertainty in direct measurements and variability of emission rate in simulated sources.
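The structure of such a Monte-Carlo estimate can be sketched as below. This is a simplified illustration under stated assumptions, not the model documented in S3: the measurement uncertainty is treated as Gaussian with an assumed relative standard deviation, each simulated source is resampled from a hypothetical empirical distribution, and all input values are invented for the example.

```python
import random

def simulate_soe(odm_kg_h, odm_rel_sd, simulated_sources, n_draws=10000, seed=0):
    """Monte Carlo facility total: measured sources plus simulated sources.

    odm_kg_h: direct measurements (kg/h).
    odm_rel_sd: assumed relative measurement uncertainty (1 sigma).
    simulated_sources: one list of empirical emission rates (kg/h) per
        unmeasured source; each is resampled with replacement.
    Returns (mean, 2.5th percentile, 97.5th percentile) of the facility total.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        # Perturb each measurement; emissions cannot be negative.
        total = sum(max(0.0, rng.gauss(m, odm_rel_sd * m)) for m in odm_kg_h)
        # Draw one value for each unmeasured source.
        total += sum(rng.choice(dist) for dist in simulated_sources)
        totals.append(total)
    totals.sort()
    lo = totals[round(0.025 * n_draws)]
    hi = totals[round(0.975 * n_draws) - 1]
    return sum(totals) / n_draws, lo, hi

# Hypothetical facility: two ODMs and one simulated pneumatic controller.
pneumatic_rates = [0.0, 0.01, 0.02, 0.05, 0.3]  # kg/h, illustrative only
mean, ci_lo, ci_hi = simulate_soe([0.10, 0.25], 0.1, [pneumatic_rates], seed=1)
print(f"SOE = {mean:.2f} kg/h (95% CI {ci_lo:.2f} to {ci_hi:.2f})")
```

The skewed input distribution for the simulated source produces an asymmetric confidence interval, similar in character to the asymmetric SOE bounds reported in this study.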
Direct measurements of emissions during liquid unloadings were not attempted by the study team due to resource limitations and technical challenges associated with these high-rate episodic sources. Estimates of methane emissions during vented liquid unloading were derived from time-domain models developed in this study and represent average emission rates for a one-hour measurement period (S3). Emission rates during vented liquid unloading were simulated using whole gas emission rates measured during automatic and manually triggered plunger unloadings (29-440 SCFM) and manual no-plunger unloadings (149-1085 SCFM) in the mid-Continental region by Allen, Sullivan, et al., 2015. Methane emission rates were then calculated using well-specific gas composition provided by operators. Operators reported automated plunger lift systems which vent to atmosphere were installed on wells at 40 of 261 sites visited by onsite teams. At these facilities the event frequency (events/year) and average duration (hours/event) as provided by study partners were used to simulate the episodic emission source in the SOE. To improve the degree to which the SOE represents the same time period and facility operating status as the downwind measurements, event counts specific to the day of measurement (as opposed to annual event counts) were provided by the operators for facilities with automated plunger unloadings where the onsite measurement was paired with tracer or OTM33A estimates (sites 770, 957, 1036, 1079, 1202, and 2599). A manual liquid unloading occurred at one facility (site 371) for the duration of measurement (all methods); therefore only the emission rate of the manual liquid unloading was modeled and not its frequency.
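The conversion from whole gas volumetric rates to methane mass rates can be sketched as follows. The conversion factor is taken from the production-rate comparison reported later in this paper (4863 SCFH, or 98.6 kg/h assuming 100% methane); the methane fraction used in the example is a hypothetical value, since well-specific compositions are not reproduced here.

```python
# Methane mass per standard cubic foot, implied by 4863 SCFH = 98.6 kg/h
# at 100% methane (figures taken from this paper).
KG_CH4_PER_SCF = 98.6 / 4863.0

def methane_kg_per_h(whole_gas_scfm, methane_fraction):
    """Methane mass emission rate (kg/h) while venting at whole_gas_scfm."""
    return whole_gas_scfm * 60.0 * methane_fraction * KG_CH4_PER_SCF

# Range of manual no-plunger unloading rates (149-1085 SCFM whole gas),
# with a hypothetical methane volume fraction of 0.95.
low = methane_kg_per_h(149.0, 0.95)
high = methane_kg_per_h(1085.0, 0.95)
print(f"venting rate: {low:.0f} to {high:.0f} kg/h methane")
```

These are rates while venting is occurring. The SOE values represent averages over a one-hour measurement period, so an event shorter than one hour contributes proportionally less.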
Sources which were noted as observed but not measured were simulated using ODMs from the same equipment category performed in this study. Pneumatic devices under routine operation were simulated using data specific to the mid-continental region (Allen, Pacsi, et al., 2015). Uncombusted methane in compressor engine exhaust was simulated utilizing a combination of exhaust stack measurements (made prior to the study campaign on engines of the same make and model) from study partners (included in SI_DataTables) and emission factors from the US EPA Greenhouse Gas Reporting Program (GHGRP) where data on the same engine make and model were unavailable.

Tracer facility estimate (TFE)
A TFE was reported (S4) if the tracer measurement passed all quality control criteria (Yacovitch et al., 2017). A TFE result of zero was reported for 3 facilities where the tracer plume was observed downwind of the facility with no detected downwind methane plume. At these facilities the lower limit of detection (LOD) was calculated and reported as an upper confidence bound of facility methane emissions (Yacovitch et al., 2017).

OTM33A facility estimate (OFE)
An OFE was reported (S5) if a single plume was observed from a facility and measurements passed all quality control criteria (Robertson et al., 2017). At facilities where multiple facility-level OTM33A measurements passed quality control criteria, the measurement average was used in this work and uncertainty estimates were averaged by quadrature (S4).
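One reading of "averaged by quadrature" is sketched below: the repeated measurements are averaged, and their individual uncertainties are combined as a root-sum-square divided by the number of measurements, which is the uncertainty of a mean of independent values. This interpretation, the function name, and the example measurements are assumptions; the procedure actually used is described in the Supplementary Material.

```python
import math

def average_with_quadrature(values, uncertainties):
    """Mean of repeated measurements and the uncertainty of that mean.

    Assumes independent errors: the uncertainty of a sum is the
    root-sum-square of the parts, and dividing by n gives the mean's.
    """
    n = len(values)
    mean = sum(values) / n
    combined = math.sqrt(sum(u * u for u in uncertainties)) / n
    return mean, combined

# Hypothetical pair of OTM33A measurements at one facility (kg/h).
measurements = [1.0, 2.0]
upper = [1.17 * m for m in measurements]  # +117% method uncertainty
lower = [0.46 * m for m in measurements]  # -46% method uncertainty

# Upper and lower bounds are asymmetric, so they are combined separately.
ofe, plus = average_with_quadrature(measurements, upper)
_, minus = average_with_quadrature(measurements, lower)
print(f"OFE = {ofe:.2f} (+{plus:.2f}/-{minus:.2f}) kg/h")
```

Under this reading, the relative uncertainty of an averaged OFE is smaller than the single-measurement +117%/-46%, and it shrinks most when the repeated measurements are of similar magnitude.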

Comparisons of facility-scale emission estimates
Where facility-scale emission estimates made by pairs of different methods are compared, a regression is used to identify whether a bias exists between methods. The coefficient of correlation (Pearson's r) is used to identify the strength of the correlation between any two independent methods. Ordinary least squares regressions emphasize the highest-emission facility, and thus under the conditions of this study, where emission estimates span several orders of magnitude, this approach would bias the slope of the line toward that facility. Therefore, in this analysis, a variance-weighted least squares (VWLS) regression (Neri et al., 1989) (S7) is performed in which a higher weight is applied to sites with lower relative variance than to sites with higher relative variance. As a result, the VWLS regression better fits the body of the data but has a lower coefficient of determination (R²) than ordinary least squares. Note that R² only accounts for variance in the y-axis and can be less than zero as a result of performing the regression without an intercept. An alternative regression (orthogonal least squares) is discussed in S8.
Contemporaneous measurements were performed on the same day (when possible simultaneously or back-to-back) with the exception of six facilities at which measurements were performed on different days (dates reported in SI data tables: Onsite Measurements, OTM33A Measurements, Tracer Measurements). At all six of these facilities, paired measurements are not statistically different when the two paired methods are compared (S9), and thus including or omitting these six facilities does not change the conclusions of this paper.
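The effect of weighting on a no-intercept regression can be illustrated with the sketch below. It uses plain inverse-variance weights on the y-values, which is an assumption made for illustration; the weighting scheme actually used in this study is defined in S7 and Neri et al., 1989. The paired estimates are hypothetical.

```python
def weighted_slope_through_origin(x, y, variances):
    """Slope of y = b*x minimizing sum(w * (y - b*x)^2) with w = 1/variance."""
    w = [1.0 / v for v in variances]
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den

def r_squared(x, y, slope):
    """Unweighted R^2 about the mean of y; can be negative without an intercept."""
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical paired estimates (kg/h): four small sites and one very large one.
soe = [0.1, 0.2, 0.5, 1.0, 50.0]
tfe = [0.16, 0.33, 0.80, 1.7, 40.0]
tfe_sd = [0.03, 0.06, 0.16, 0.4, 20.0]

slope_vwls = weighted_slope_through_origin(soe, tfe, [s * s for s in tfe_sd])
slope_ols = weighted_slope_through_origin(soe, tfe, [1.0] * len(soe))
r2_vwls = r_squared(soe, tfe, slope_vwls)
r2_ols = r_squared(soe, tfe, slope_ols)
print(f"VWLS slope {slope_vwls:.2f} (R2 {r2_vwls:.3f}); "
      f"OLS slope {slope_ols:.2f} (R2 {r2_ols:.3f})")
```

In this example the unweighted slope is set almost entirely by the largest site, while the weighted slope follows the four smaller sites, at the cost of a lower (here slightly negative) R².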

Calculated SOE
Facility-specific SOEs at 261 production sites range from 0 to 403 kg/h (Mean 3.3 kg/h, Median 0.12 kg/h, Cumulative 858 kg/h). Across all 261 facilities with onsite measurements, measured emissions (ODMs) account for only 6% of cumulative estimated emissions and simulated liquid unloadings account for 88% (Figure S10). The cumulative SOE from the 41 facilities with simulated liquid unloading is 776 kg/h, of which simulated liquid unloading contributes 750 kg/h (97%). The cumulative SOE from the 220 facilities without simulated liquid unloadings is 82 kg/h, of which ODMs and simulated emissions from observed, unmeasured sources account for 64 kg/h (78%).

Onsite direct measurements
Onsite teams recorded 322 observed emission sources. An ODM was attempted at 299 of these sources, including 53 zero measurements, 127 measurements below the lower limit of the instrument (S3), and 119 measurements above the lower limit of the instrument. The distribution of ODMs is skewed, with 13 ODMs at or above 1.0 kg/h (4% of measurements) totaling 40.5 kg/h (63% of the 64 kg/h total measured by ODM). Additionally, 3 of these 13 ODMs were incomplete captures, indicating actual emissions from these sources are higher than indicated by the measurement. Eight of the highest 13 ODMs were measurements performed on produced water tanks. Tank emissions in this campaign are likely associated with liquid dumps from separators to the produced water tanks, resulting in short-duration episodic emission events. In some cases the tank emissions could be associated with a malfunction, such as a stuck dump valve, resulting in continuous emissions. The onsite team did not diagnose the root cause of tank emissions in this study; however, it is important to note the potential for temporal variability from these sources when comparing to other methods, since tank emission sources, when present, are often larger than other facility sources and can dominate total emissions from a given facility.

Facilities with vented liquid unloading
The distribution of facility emission rates at the 41 sites with emissions from liquid unloading (1 manual, 40 automated plunger) modeled in the SOE is skewed with 2 facilities which account for 86% of the total SOE emissions (n = 41, Min 0.06 kg/h, Max 403 kg/h, Mean 18.9 kg/h, Median 1.15 kg/h, Figure S10).
The single manual liquid unloading observed by the study team is the largest of all simulated sources in this study at 402 kg/h (95% CI = 167-953 kg/h) and accounts for 47% of the total SOE from all 261 facilities.
For plunger liquid unloadings triggered by automatic controllers, we use annual event counts and annual-average event durations (daily event counts and average durations where SOE is contemporaneous with downwind techniques) provided by study partners to model the likelihood that an event occurred during a 1-hour measurement period (SI 3). Automated plunger unloadings at 40 facilities account for 41% of the cumulative SOE from all 261 facilities. The majority of these emissions (251 kg/h, 72% of cumulative simulated plunger unloading emissions) are contributed by facility 1202. This facility included two wells with plungers installed, each averaging 25 unloading events per day, compared to the other 39 facilities with plunger unloading, which average 5 unloading events per day (SI Tables: PlungerLiquidUnloading).
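The dependence of event likelihood on event frequency can be illustrated with a simple sketch. It assumes unloading events arrive at random, independently of one another (a Poisson process), which is our simplifying assumption and not necessarily the time-domain model of S3; the event duration used below is hypothetical.

```python
import math

def prob_event_in_window(events_per_day, window_h=1.0):
    """Probability of at least one event starting in the window (Poisson)."""
    expected_events = events_per_day / 24.0 * window_h
    return 1.0 - math.exp(-expected_events)

def venting_time_fraction(events_per_day, duration_h):
    """Long-run fraction of time spent venting, capped at 1."""
    return min(1.0, events_per_day * duration_h / 24.0)

# Facility 1202: two wells averaging 25 events per day each.
p_1202 = prob_event_in_window(2 * 25)
# A facility at the 5 events/day average reported for the other 39 sites.
p_typical = prob_event_in_window(5)

# Hypothetical 3-minute event duration (0.05 h), for illustration only.
frac_1202 = venting_time_fraction(2 * 25, 0.05)
print(f"P(event in 1 h): facility 1202 {p_1202:.2f}, typical {p_typical:.2f}; "
      f"venting fraction at 1202: {frac_1202:.3f}")
```

Under these assumptions an unloading event is very likely to fall within any one-hour measurement at facility 1202, but much less likely at a facility with the lower average frequency, which is one reason this facility weighs so heavily in the simulated plunger emissions.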

Facilities without vented liquid unloading
SOEs from 220 facilities without liquid unloading total 81.8 kg/h and exhibit a skewed distribution (Min 0.0 kg/h, Max 12.1 kg/h, Mean 0.37 kg/h, Median 0.09 kg/h) with the 11 largest emitting facilities (5%) contributing 45.6 kg/h (56% of the total SOE for these facilities), consistent with the skewed distributions reported elsewhere (Lavoie et al., 2015; Lyon et al., 2015; Mitchell et al., 2015; Rella et al., 2015). For these 220 facilities, ODMs account for 43%, simulated observed but unmeasured (inaccessible) sources account for 35%, simulated routine pneumatics account for 19%, and simulated exhaust from compressors accounts for 3% of cumulative emissions. This rank order is the same at the 41 facilities with liquid unloading (Figure S10). These results suggest that a majority of emissions could be estimated through direct measurements when large emission sources too difficult to measure were not present and all sources were accessible. However, to ensure accurate facility total emission estimates, large emission sources must be considered.

Comparison of contemporaneous estimates
Comparisons made in this section are presented on a per-facility basis (Figures 1 and 2) and a cumulative basis (Figure 3). In the analysis presented here, production facilities were sub-divided into groups by the largest SOE source category. Additional comparisons and figures are discussed in S7, including a comparison of TFE to OFE. TFE is compared to SOE at 16 paired facilities. Figure 1 separates facilities into three groups: (1) three facilities where the tracer was reported as zero with an upper confidence bound based on the limit of detection, (2) four facilities with active liquid unloading modeled in the SOE, and (3) nine other facilities. Estimates agree to within 95% confidence bounds at 15 of 16 facilities. However, the per-facility VWLS comparison, TFE = 1.7(+0.54/-0.40) * SOE (Figure S11), shows that our SOE model estimates statistically lower emissions relative to TFE. This shows that although estimates at most facilities are statistically similar, a bias where TFE is greater than SOE exists and may impact cumulative estimates across many facilities. In this case, the cumulative emission estimate at these 16 paired facilities by TFE, 831 (+208/-207) kg/h, is 193% of the cumulative SOE, 430 (+557/-247) kg/h (Table 1).
OFE is compared to SOE at 43 paired facilities. Figure 2 separates facilities into three groups: (1) ten facilities reported as 0BOT, (2)

Figure 1: Contemporaneous study onsite estimate (SOE – stacked bars) and tracer facility estimate (TFE – square markers) (n = 16).
Facilities are divided into three groups and rank ordered in each group by total SOE. Three facilities had mean TFE of zero and are shown in the left group, with upper TFE confidence limit at the facility-specific lower limit of detection (LOD) of the method. Facilities with episodic vented liquid unloading are grouped to the right of the figure, the largest of which exhibits an emission rate more than two orders of magnitude larger than the highest emitting facility without an unloading. Panel (a) shows facility-total emissions; 95% confidence limits are indicated by red whiskers for SOE and black whiskers for TFE. TFEs which are statistically similar to SOEs are indicated by green marker fill (15 facilities 1 and 2). In the following we consider these sites separately and perform regressions without these outliers.

Facilities with emissions dominated by vented liquid unloading
Given that emissions from liquid unloading account for 88% of the cumulative SOE for the 261 production facilities with SOE estimates, the modeling of unloading emissions is of particular importance to estimating total emissions from the basin. Overlapping 95% CI indicate the SOE and TFE are statistically similar at the four paired facilities where liquid unloading is included in the SOE. The TFE, 821 (+207/-207) kg/h, estimates higher cumulative emissions (although not statistically different) at these four sites relative to the SOE, 424 (+558/-247) kg/h (Table 1, Figure S12).
At the four paired facilities where liquid unloading is the largest source in the SOE, the OFE, 51 (+38/-15) kg/h, estimates statistically lower cumulative emissions than SOE, 688 (+622/-376) kg/h (Table 2, Figure S12). The difference in cumulative emissions at these four sites (637 kg/h difference) accounts for 99% of the difference between cumulative SOE and cumulative OFE at all sites where the two measurement techniques were paired (n = 43; 642 kg/h difference).
Manual Liquid Unloading – The highest emitting facility, site 371, dominates the difference in cumulative emissions from facilities with paired TFE and SOE, as well as paired OFE and SOE. The emission rate of the manual liquid unloading at this facility was simulated by Monte-Carlo in the SOE using average rates of manually-triggered liquid unloading without plungers measured in the mid-continental region (Allen, Sullivan, et al., 2015), which range from 149-1085 SCFM of whole gas. The SOE, 403 (+550/-236) kg/h, is lower at this facility than the TFE, 810 (+/-207) kg/h; however, the estimates have overlapping 95% CI. The OFE, 33.5 (+36.7/-14.4) kg/h, is statistically lower than both TFE and SOE at this facility. Although the OTM33A measurements at this site passed all quality criteria, the measurement team noted during sampling that the emission plume, when observed with a gas imaging camera, lofted above the measurement location; due to limited road access, positioning further downwind was not feasible. Although limited to a single manual unloading site with contemporaneous measurements by all three techniques, the results indicate that TFE may provide a more robust assessment of the unloading emission magnitude than OFE.
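The overlap criterion used throughout these comparisons can be checked directly from the reported values for site 371. The sketch below is a minimal illustration; the helper names are ours, and the numbers are the estimates and 95% bounds quoted in the text.

```python
def interval(mean, plus, minus):
    """95% confidence interval (low, high) from a mean and asymmetric bounds."""
    return (mean - minus, mean + plus)

def overlaps(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Site 371 estimates in kg/h, as reported in the text.
soe = interval(403.0, 550.0, 236.0)
tfe = interval(810.0, 207.0, 207.0)
ofe = interval(33.5, 36.7, 14.4)

print("SOE vs TFE overlap:", overlaps(soe, tfe))
print("OFE vs SOE overlap:", overlaps(ofe, soe))
print("OFE vs TFE overlap:", overlaps(ofe, tfe))
```

The SOE and TFE intervals overlap, while the upper bound of the OFE lies below the lower bounds of both other estimates, matching the description above.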
To further understand if OFE underestimated the unloading emissions, we also compare the OFE estimate to the average production rate of the well over the study period: 4863 SCFH (98.6 kg/h assuming 100% methane), or approximately three times the OFE estimate. Since wells are typically shut in for some time prior to manual unloading in order to build pressure, and since gas is routed to an atmospheric pressure tank during the unloading instead of to the pressurized gathering line in order to reduce back pressure, gas flow during the unloading is typically higher than the average production rate. This additional point of comparison supports the conclusions that the OFE is likely an underestimate of emissions at this site, and that SOE and TFE are likely more accurate estimates of emissions during this unloading event.
Plunger Liquid Unloading - The episodic characteristic of automated plunger unloadings likely contributes to the differences between the mean emission rates estimated by the SOE relative to the TFE and OFE. The exact timing and duration of automated plunger liquid unloading during the measurement are unknown, but are modeled in the SOE using the number of events recorded by the operator on the day of measurement and reflected in the SOE uncertainty for each facility. The duration of plunger unloading events is short (typically minutes) relative to the time required to perform a tracer measurement (multiple short plumes typically acquired over 1-2 hours). Therefore, tracer plume data at these facilities may include downwind plumes representative of times when the unloading is occurring, and times when the unloading is not occurring. The resulting TFE represents a time average during the measurement period, where the fraction of plumes that represent vented unloading emissions approximates the fraction of time that vented plunger unloading was occurring during the measurement.
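The production-rate cross-check for the manual unloading at site 371 (4863 SCFH, about 98.6 kg/h) can be reproduced with a simple ideal-gas conversion. This section does not state the reference conditions behind "standard" cubic feet; the sketch below assumes 0 °C and 1 atm because that choice reproduces the quoted value, and a 60 °F reference would give a somewhat lower mass rate. The function name is illustrative.

```python
# ASSUMPTION: a "standard" cubic foot is referenced to 0 degC and 1 atm.
MOLAR_VOLUME_L_PER_MOL = 22.414    # ideal gas at 0 degC, 1 atm
LITERS_PER_CUBIC_FOOT = 28.3168
CH4_MOLAR_MASS_G_PER_MOL = 16.043


def scfh_to_kg_per_h(scfh: float, methane_fraction: float = 1.0) -> float:
    """Convert a whole-gas flow in SCFH to a methane mass rate in kg/h."""
    moles_per_h = scfh * LITERS_PER_CUBIC_FOOT / MOLAR_VOLUME_L_PER_MOL
    return moles_per_h * methane_fraction * CH4_MOLAR_MASS_G_PER_MOL / 1000.0


production = scfh_to_kg_per_h(4863)   # average production rate of the well
ofe_site_371 = 33.5                   # kg/h, OFE central estimate from the text

print(round(production, 1))                 # 98.6
print(round(production / ofe_site_371, 1))  # 2.9, i.e. roughly three times the OFE
```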
The TFE and SOE agree within uncertainty bands at all three sites with plunger unloading where the methods were paired (Figure 1), providing some validation of our model of the timing and emission rate for these episodic sources, albeit at a small set of sites.
The OFE and SOE agree within uncertainty estimates at two (sites 1079 and 2599) of three sites where plunger unloading was the largest source in the SOE. OFE is statistically lower than SOE at the third facility (site 1202); however, OFE is not statistically different from cumulative ODMs at this site (Figure 2), suggesting that perhaps no unloading occurred during the OTM33A measurement (20 min measurement period). Two wells at site 1202 had automated plungers installed, each with 25 actuations per day and average durations of 0.43 hour/event (equivalent to venting 45% of the time) and 0.71 hour/event (equivalent to venting 74% of the time), respectively. Given the high frequencies and long durations, it is likely (>80% probability) that at least one of the two wells at this site was unloading for 15 or more minutes of any 20 minute period. Because OTM33A measures from a stationary position and relies on the wind to sweep the plume across the measurement location, the time during which emissions from an unloading event at this site were actually sampled may have been much shorter than the 20 minute measurement period, which could contribute to an underestimate.
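The venting fractions quoted for site 1202 follow directly from the actuation counts and durations, and the likelihood of catching an unloading in a 20 minute window can be probed with a small Monte Carlo sketch. The sketch is not the authors' calculation, which is not described in this section. It assumes evenly spaced events, an independent random phase for each well, and that the criterion is the total time during which at least one well is venting; all function names are illustrative.

```python
import random


def duty_cycle(events_per_day, hours_per_event):
    """Fraction of a 24 h day that a well spends venting."""
    return events_per_day * hours_per_event / 24.0


def prob_union_venting(wells, window_min=20.0, threshold_min=15.0,
                       trials=5000, step_min=0.25, seed=0):
    """Monte Carlo estimate of the probability that, within a window of
    window_min minutes, at least one well is venting for a total of
    threshold_min minutes or more.

    wells: list of (period_min, on_min) pairs; each well is ASSUMED to vent
    for the first on_min minutes of every period_min minutes.
    """
    rng = random.Random(seed)
    steps = int(window_min / step_min)
    hits = 0
    for _ in range(trials):
        # ASSUMPTION: each well's cycle has an independent, uniform phase.
        phases = [rng.uniform(0.0, period) for period, _ in wells]
        vented = 0.0
        for k in range(steps):
            t = (k + 0.5) * step_min  # midpoint of the k-th time step
            if any((t - ph) % period < on
                   for (period, on), ph in zip(wells, phases)):
                vented += step_min
        if vented >= threshold_min:
            hits += 1
    return hits / trials


period = 24 * 60 / 25  # minutes between actuation starts (25 events per day)
wells = [(period, 0.43 * 60), (period, 0.71 * 60)]

print(round(duty_cycle(25, 0.43), 2), round(duty_cycle(25, 0.71), 2))  # 0.45 0.74
p = prob_union_venting(wells)
print(f"estimated probability under these assumptions: {p:.2f}")
```

The estimate depends on the scheduling assumptions, so it should be read as a plausibility check on the stated probability rather than a reproduction of it.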

Facilities with emissions dominated by pneumatic controllers
Emissions from pneumatic controllers were simulated in the SOE using the count of pneumatic devices at the site by classification (low bleed, high bleed, intermittent bleed, or chemical injection pump) provided by study partners and the average rates in the mid-continental region measured by Allen et al. (Allen, Pacsi, et al., 2015). Pneumatic controllers were the largest SOE category at only 2 facilities (sites 1543 and 956) where onsite measurement was paired with TFE and at 19 facilities where onsite measurement was paired with OFE (Figure S13). At these 19 facilities with paired SOE and OFE, cumulative emissions (OFE = 2.9 (+1.9/-0.75) kg/h, SOE = 1.6 (+0.64/-0.43) kg/h) and a VWLS regression of OFE = 2.0(+5.5/-1.6) * SOE indicate that OFE estimates more emissions at these sites on average than represented in the SOE; however, neither the VWLS regression nor the cumulative estimates indicate that the two methods are statistically different for these low-emitting (typically < 1 kg/h/site) facilities. Since only intermittent bleed pneumatic controllers were encountered at production sites in this study and direct measurements of continuously emitting sources represent a smaller fraction of emissions at this subset of sites, we expect that emission estimates may differ across paired methods depending on the number of intermittent pneumatic actuations during the measurement. Due to the limited sample size (n = 2), a similar comparison of SOE and TFE cannot be made.

Facilities with emissions dominated by onsite direct measurements
ODMs and simulated observed, unmeasured sources were the largest categories of the SOE at 9 of 16 facilities where onsite measurement was paired with TFE and 20 of 43 facilities where onsite measurement was paired with OFE (Figure 3). At these facilities, continuous sources (e.g. leaks from flanges and fittings, and malfunctioning pneumatic controllers) dominate the SOE rather than discontinuous sources such as routine operation of intermittent pneumatic devices or vented liquid unloading. At this subset of facilities we expect improved agreement between emission estimates across methods, since temporally varying emissions represent a smaller fraction of facility total emissions. For the 9 facilities with both SOE and TFE, cumulative TFE, 9.4 (+2.6/-2.3) kg/h, is greater than but not statistically different from cumulative SOE, 5.7 (+5.2/-1.2) kg/h. The VWLS regression, TFE = 1.6(+0.51/-0.39) * SOE, nonetheless suggests that the methods are statistically different and that the TFE estimates higher facility-level emissions on average for this subset of facilities, even though the TFE and SOE agree within CI at 8 of the 9 individual facilities.
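To illustrate how a regression slope can differ significantly from 1 even when most individual pairs agree within their CI, the following is a minimal sketch of a zero-intercept, variance-weighted least-squares fit. The study's exact VWLS specification (treatment of uncertainty in the SOE, asymmetric intervals) is not given in this section, so the textbook form below is an assumption; the function name and the data values are hypothetical and are not study data.

```python
import numpy as np


def vwls_through_origin(x, y, y_var):
    """Zero-intercept weighted least squares, y ~= slope * x,
    with each pair weighted by 1 / variance of y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(y_var, dtype=float)
    sxx = np.sum(w * x * x)
    slope = np.sum(w * x * y) / sxx
    slope_se = np.sqrt(1.0 / sxx)  # valid only if y_var are the true variances
    return slope, slope_se


# Hypothetical paired facility estimates in kg/h (NOT study data).
soe = [0.2, 0.5, 1.0, 2.0]
tfe = [0.4, 0.7, 1.7, 3.0]
tfe_var = [0.01, 0.04, 0.09, 0.25]

slope, se = vwls_through_origin(soe, tfe, tfe_var)
print(f"slope = {slope:.2f}, approx. 95% CI = "
      f"[{slope - 1.96 * se:.2f}, {slope + 1.96 * se:.2f}]")
```

Because the fit pools information across all pairs, the interval on the slope can be much tighter than the interval on any single facility, and a slope interval that excludes 1 corresponds to what the text calls a statistically significant difference between methods.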
Cumulative OFE, 13 (+5.3/-2.1) kg/h, estimates lower emissions (though not statistically different) than the cumulative SOE, 19 (+7.7/-3.0) kg/h. The VWLS regression OFE = 0.41(+0.51/-0.17) * SOE for this 20-facility subset indicates that the two methods are statistically different, even though the OFE and SOE estimates agree to within uncertainty at 11 of the 20 paired facilities (55%). At these facilities, it is reasonable to interpret the SOE as a lower bound of emissions, since the SOE is composed primarily

Conclusions
The experimental design of this study provided a unique opportunity to understand the mix of sources at production sites and to compare emission estimates derived from multiple methods. Extrapolations of our findings are limited by sample size and the restriction of the study to a single basin. Since liquid unloadings were not measured in the field campaign, emissions from unloadings were simulated from measurements previously taken in the mid-continental region. Liquid unloadings at 41 of the 261 production facilities contribute 88% of total emissions estimated by the SOE. Further, even among facilities with liquid unloadings, simulated emissions are highly skewed: 76% of the total emissions estimate was contributed by simulated emissions from liquid unloading at only two facilities (one manual and one automated plunger). The dominance of simulated emissions by one source category is indicative of the relative magnitude of these episodic sources for total emissions from well pads in the study region. However, there is substantial uncertainty inherent in modeling liquid unloading emissions utilizing data from prior studies or factors in regional and national emission inventories, even when specific activity data for timing and duration of the unloadings is available.
Figure caption (fragment): Relatively low coefficients of determination (R²) in both (a) and (b) indicate the regressions do not capture the wide variation seen in the data (most paired estimates are within a factor of 10, illustrated by the y = 10x and y = x/10 bounds). DOI: https://doi.org/10.1525/elementa.266.f3 (Bell et al., Art. 79)
SOE uncertainties modeled in this study include the variation within emission distributions assumed for simulated emissions (e.g. unloadings) but do not include unknown uncertainties inherent in the selection or unknown representativeness of the emissions distribution used in the SOE model. For example, we assume that emissions for liquid unloadings in the study area match, in aggregate, measurements taken by Allen, Sullivan, et al. (2015) in the much larger mid-continental region several years earlier. Mean emissions for any one of these sources in the study area could be biased higher or lower than the mean emissions in the assumed distribution. While this is less of an issue when scaling regionally or nationally, with thousands of individual sources, this type of potential bias could impact comparisons made using the small sample size of a single study. Therefore, it is useful to compare methods on two sets of facilities: (a) all paired measurements, including major simulated emissions, and (b) the subset of facilities where the majority of emissions are due to continuously emitting sources, which minimizes the overwhelming impact of episodically-emitting unloadings or pneumatic devices. We consider each comparison in turn.
(a) Considering all paired measurements, TFE and SOE agree within uncertainty estimates at 15 of 16 (94%) production facilities where the methods were paired. These data indicate that on a site-by-site basis, the two methods agree to within uncertainty bounds in most cases. In contrast, there is less agreement between OFE and SOE, with estimates at 28 of 43 (65%) paired production facilities agreeing within uncertainty bounds. These data indicate, even with the smaller sample size, a much stronger site-by-site agreement between bottom-up SOE methods paired with downwind TFE than between the SOE methods paired with downwind OFE.
This difference in agreement is reflected both in the total emissions at paired facilities (TFE = 831 (+208/-207) kg/h vs SOE = 430 (+557/-247) kg/h, and OFE = 67 (+39/-15) kg/h vs SOE = 709 (+634/-377) kg/h) and in the greater scatter of OFE (Pearson's r = 0.913) than TFE (Pearson's r = 0.999) relative to SOE, indicating more disagreement on a site-by-site basis.
(b) Focusing on the subset of 9 paired TFE/SOE facility measurements meeting criterion (b) above, TFE estimates higher emissions than SOE by VWLS (TFE-to-SOE ratio = 1.6, 95% confidence interval [1.2 to 2.1]), indicating that tracer methods estimate statistically higher emissions than those developed by SOE methods. This result is in reasonable agreement with another SOE/TFE paired measurement study, where "the tracer flux method systematically measured somewhat higher methane emission rates than the SOE at lower-emitting sites, while SOE was higher than tracer flux at higher emitting sites". Turning to the OFE/SOE comparison, for the 20 paired facility measurements meeting criterion (b) above, OFE estimates lower emissions than SOE by VWLS (OFE-to-SOE ratio of 0.41 [0.26 to 0.90]). While a definitive explanation of this difference could not be extracted from the data, possible explanations include the presence of multiple emission sources on the wellpad, obstructions located between the emission source and sampling location, emission plumes lofted overhead of the sampling location, or measurements being too close to emission sources. The average estimated distance to source for OFE estimates in this study was 46 ± 24.0 meters, which is towards the lower end of the recommended 20-200 meter range (Brantley et al., 2014). Combined, these two comparisons (a and b) suggest that OFE systematically underestimates emissions for the measured production sites, using as a standard two other independent emissions estimates and noting the limitations of the study measurements and SOE model inputs.
However, OFE underestimates for continuously emitting sources are much smaller than total underestimates, which are dominated by large unloading events.
Selection of the most appropriate method to measure methane emissions from natural gas production sites depends on the goals of the study. Onsite methods can provide identification of leaks at the component level, but may not capture all emissions at a facility. The summation of onsite measurements can therefore establish a lower limit of total site emissions. Although OFE and SOE agree within uncertainties at 65% of individual paired facilities, the VWLS regression shows that OTM33A estimates statistically lower emissions than our SOE. However, OTM33A is a relatively quick and non-invasive method to estimate total site emissions and under appropriate conditions may be able to provide measurements of emission sources otherwise inaccessible to onsite teams. Similarly, although TFE and SOE agree at 94% of individual paired facilities, the VWLS regression suggests that the dual tracer flux ratio method provides a statistically higher estimate of total site emissions than SOE. The tracer flux ratio method does, however, provide a means to measure emissions from high rate sources such as liquid unloadings and larger facilities such as midstream or transmission compression stations. Given that all three methods utilized here are in common use for determining emission rates from natural gas operations, this study indicates that further inter-comparison and characterization of these methods under field campaign and controlled conditions is advisable.

Data Accessibility Statement
Datasets produced in this work are available as online supplementary material. In addition to the author list, we acknowledge the contributions to conception and design, acquisition of data, analysis and interpretation of data by:

Funding information
Funding for this work was provided by RPSEA/NETL contract no. 12122-95/DE-AC26-07NT42677 to the Colorado School of Mines. Cost share for this project was provided by Colorado Energy Research Collaboratory, the National Oceanic and Atmospheric Administration, Southwestern Energy, XTO Energy (a subsidiary of ExxonMobil), Chevron, Statoil and the American Gas Association, many of whom also provided operational data and/or site access. Additional data and/or site access was also provided by CenterPoint, Enable Midstream Partners, Kinder Morgan, and BHP Billiton.