Domain Editor-in-Chief: Detlev Helmig; Institute of Alpine and Arctic Research, University of Colorado Boulder, US
Associate Editor: Allen Goldstein; University of California Berkeley, US


1. Introduction: Definitions and sources of background ozone

Ozone (O3) is a key secondary air pollutant associated with a number of health issues including asthma and premature death (Bell et al., 2004; Lippmann, 1993; Silva et al., 2013; Landrigan et al., 2018). Silva et al. (2013) estimate that ambient O3 causes between 229,000 and 720,000 premature deaths globally each year, including 12,300–52,200 in North America alone. Ozone also adversely impacts growing vegetation, including crops, with an estimated global crop loss of $11–18 billion for the year 2000 (Avnery et al., 2011). Ozone was accordingly designated as a criteria air pollutant by the U.S. Clean Air Act (CAA) in the 1970s. The CAA requires that the U.S. Environmental Protection Agency (EPA) establish primary (to protect public health) and secondary (to protect public welfare) National Ambient Air Quality Standards (NAAQS) for O3.

In the troposphere, O3 is produced by photochemical reactions of nitrogen oxides (NOx) with carbon monoxide (CO), methane (CH4), and volatile organic compounds (VOCs). These O3 precursors are emitted by fossil fuel combustion, agriculture, biomass burning, oil and gas production, and a variety of other industrial processes. Anthropogenic emissions of NOx and some VOCs have decreased in the U.S. over the past several decades, and peak O3 levels have declined in most areas of the U.S. as a result (Cooper et al., 2012; Simon et al., 2015; Strode et al., 2015). At the same time, new evidence has demonstrated adverse health effects at lower O3 levels (US EPA, 2013) and the EPA recently strengthened both the primary and secondary NAAQS (US EPA, 2015). A monitor meets the standard if the 3-year average of the annual 4th highest maximum daily 8-hour average O3 mole fraction (MDA8), called the “ozone design value (ODV)”, is less than or equal to 70 parts per billion (ppb). An additional metric, the “W126 exposure index”, can be used to assess the cumulative seasonal exposure of vegetation to O3.
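The design-value arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not the regulatory procedure: the function names are hypothetical, the restriction to 8-hour windows starting between 0700 and 2300 local standard time follows the convention noted with Table 1, truncation to whole ppb is an assumption, and data-completeness rules and EE exclusions are omitted.

```python
from statistics import mean

NAAQS_PPB = 70  # level of the standard quoted in the text


def mda8(hourly_ppb):
    """Maximum daily 8-hour average (ppb) from hourly values.

    hourly_ppb must hold at least 31 hourly values starting at 00:00 local
    standard time, so that the window starting at 23:00 can run into the
    next day. Only the 17 windows starting 07:00 through 23:00 are used.
    """
    windows = [mean(hourly_ppb[start:start + 8]) for start in range(7, 24)]
    return max(windows)


def fourth_highest(daily_mda8):
    """Annual 4th highest MDA8 from one year of daily values."""
    return sorted(daily_mda8, reverse=True)[3]


def design_value(years_of_mda8):
    """3-year average of the annual 4th highest MDA8.

    Truncation to whole ppb is an assumption of this sketch.
    """
    return int(mean(fourth_highest(year) for year in years_of_mda8))


def meets_standard(years_of_mda8):
    """True if the design value is less than or equal to the standard."""
    return design_value(years_of_mda8) <= NAAQS_PPB
```

With made-up annual 4th highest values of 72, 70, and 69 ppb, the 3-year average is 70.3 ppb, which this sketch truncates to 70 ppb, so the hypothetical monitor would meet the standard.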

Regulation of locally formed O3 is complicated by the fact that O3 also has significant background levels in the troposphere. Observations from remote sites along the west coast of North America show that seasonal mean O3 ranges from 30 to 50 ppb, thus the “background” air that enters the U.S. with the prevailing westerly winds already contains a substantial fraction of the 70 ppb standard. Observations and/or modeling show that, on some days, O3 at a site may be enhanced by noncontrollable O3 sources (NCOS), such as recent stratosphere-to-troposphere transport (STT), long-range transport from non-domestic sources, lightning, or photochemical production from natural NOx and VOC precursor emissions including wildfires initiated by natural or human causes (Jaffe et al., 2004, 2005; Parrish et al., 2010; Ambrose et al., 2011; Wigder et al., 2013a; Langford et al., 2009, 2012). While foreign sources of pollution are theoretically controllable, these are beyond the control of any local jurisdiction, so for this discussion we include these in the NCOS category. In addition, foreign pollution is often mixed in with other types of NCOS (e.g., Cooper et al. 2004b; Ambrose et al., 2011), making it difficult to quantify these sources. The CAA provides several mechanisms, including Section 319b (Exceptional Events Rule (US EPA, 2016a, b)) and Section 179B (international transport), that offer policy solutions to account for high O3 due to these noncontrollable sources (US EPA, 2013). We note that the EPA uses the term “exceptional events (EEs)” to consider days when surface O3 is elevated above the NAAQS by episodic natural sources such as stratospheric intrusions or wildfires that cannot be “reasonably controlled” (EEs can also include episodic emissions of anthropogenic precursors if these were not reasonably controllable and are unlikely to recur at a specific location). 
EE-influenced data can be excluded from the design value calculation if they are identified by the state agency and supported by evidence, which is then evaluated and approved by the EPA. Thus, excluding high O3 caused by exceptional events may allow an area to be designated in attainment of the NAAQS. For areas that would otherwise violate the NAAQS because of international transport, Section 179B provides relief from penalties for failing to attain the NAAQS, but days affected by international transport are included in the calculation of the design value. In this review we focus on NCOS, rather than EEs, to consider more broadly the contributions of both international transport and EEs. Individual NCOS events can increase local surface O3 levels on timescales ranging from hours to days before dissipating to become part of the tropospheric background. They are potentially important throughout the U.S., but the impact appears to be greatest in the western states where wildfires tend to be larger (Jaffe et al., 2013), deep stratospheric intrusions are more frequent (Skerlak et al., 2014), and transport from Asia is more important (Verstraeten et al., 2015).

The frequency of NCOS events, and thus higher background O3, in the western U.S. makes it essential that we understand the sources of that O3, and this requires careful analysis using both observations and models. In this review, we use the term “U.S. background O3 (USB O3)” as O3 formed from NCOS plus anthropogenic sources in countries outside the U.S. (Dolwick et al., 2015). While USB O3 incorporates the influence from NCOS, in our discussion, we focus on NCOS that elevate O3 on a short-term basis (e.g., daily), to values above the seasonal mean USB O3. Although the global CH4 burden reflects both domestic and international emissions, we include its contributions in USB O3, similar to previous work (e.g., Fiore et al., 2014a). Essentially, USB O3 encompasses the contributions from natural and foreign sources of O3 that cannot be controlled by precursor emissions reductions solely within the U.S. Since USB O3 varies daily and is a function of season, meteorology, and elevation, quantification of USB O3 on days that exceed the NAAQS is more relevant to air quality management than seasonal mean estimates. We note that some studies use the term “North American background (NAB) O3”, which is similar to USB O3, but is defined as O3 formed from natural sources plus anthropogenic sources in countries outside the U.S., Canada, and Mexico.

A quantitative understanding of USB O3 is essential for air quality management in general, and for state and local efforts to meet the NAAQS in particular. This is especially true given the recent lowering of the NAAQS O3 levels and the associated increasing relative importance of USB O3 as domestic precursor emissions decrease. Primary tools used by states and the EPA to manage air quality are the State Implementation Plans (SIPs; US EPA, 2015) or Federal Implementation Plans (FIPs). These documents are federally-enforceable plans developed by and/or for states that identify how the state will attain and/or maintain the air quality standards. A key component of each SIP is the maintenance of a network of regulatory O3 monitors that use standardized sampling methodologies, quality assurance, and siting requirements established by the EPA, along with other federal, tribal, state and local agencies. Knowledge of the sources contributing to the ambient levels on the highest O3 days is important because controlling the domestic contribution to O3 production affects the estimates of both the health benefits and the economic costs and benefits associated with achieving the NAAQS (US EPA, 2014c). This knowledge is also important for SIP development because it helps states identify the most effective emission control strategies.

Quantification of USB O3 requires a chemical transport model (CTM) since it cannot be measured directly (e.g., Fiore et al., 2002, 2003; Zhang et al., 2009), but these models must be informed and evaluated using observations. In addition to USB O3, an alternative useful metric for evaluating modeled mole fractions is “baseline” O3, which is the distribution of O3 observations at a rural or remote site that has not been influenced by recent, local emissions (HTAP, 2010). We note that this definition differs from the one adopted by a National Research Council (NRC) report (NRC, 2010), which defined baseline as “the statistically defined lowest abundances of O3 in the air flowing into a country.” We find the HTAP (2010) definition to be a more useful metric, since the lowest mole fractions may be associated with a particular season or transport pathway and therefore not representative of all conditions. Measurements of baseline O3 are expected to be greater than model-estimated USB O3 since the former includes some O3 produced many days earlier by U.S. emissions that have been recirculated regionally or globally. In the following discussion, it is important to keep in mind that baseline O3 is not the same as USB (or NAB) O3, but both can be characterized by a seasonal mean, MDA8, 3-year ODV, and other statistical metrics. Because states develop their SIPs by evaluating O3 response to emissions controls on the highest modeled O3 days, an especially useful metric is the estimate of USB and NCOS O3 on those days.

Natural, international, and domestic sources all contribute to observed surface O3. Figure 1 demonstrates how these sources contribute to the O3 mole fractions that are used in air quality management decisions. Depending on their magnitude, contributions from sources such as stratospheric intrusions or wildfires could be identified as EEs. However, whether specific episodes qualify as EEs depends on the magnitude of the events and on the ability of current data and tools to characterize them. The choice of which NCOS can be removed from the analysis may in turn affect air quality management, including SIPs.

Figure 1 

Conceptual models for O3 sources (a) in the U.S. and (b) at a single location. (a) U.S. O3 sources shown with yellow boxes or arrows represent domestic/controllable sources. Sources shown with blue boxes or arrows represent USB O3/noncontrollable sources. Note that locations for each process are not specific to any one region. The base map shows satellite-observed tropospheric NO2 columns for 2014 from the Ozone Monitoring Instrument (OMI) onboard the NASA Aura satellite (Credit: NASA Goddard’s Scientific Visualization Studio/T. Schindler). NO2 column amounts are relative with red colors showing highest values, followed by yellow then blue. We use the OMI NO2 as a proxy to show local O3 precursor emission sources. (b) The bar chart shows a theoretical example of how both domestic and USB O3 sources combine to produce elevated O3 at a specific location on any given day. Each source varies daily and there are also nonlinear interactions between USB O3 sources and anthropogenic sources that can further add to O3 formation, e.g., forest fires and urban emissions (e.g., Singh et al., 2012). DOI: https://doi.org/10.1525/elementa.309.f1

In this review, we focus mainly on work completed since 2011 and build on earlier studies (NRC, 2010; McDonald-Buller et al., 2011). We address a number of scientific questions:

  1. What methods have been used to identify and quantify background O3 and what are the strengths, weaknesses, and uncertainties of these methods?
  2. What do observations and models tell us about the spatial and temporal pattern, variability, trends, and episodic peaks in baseline and background O3 across the continental U.S.?
  3. What do observations and models tell us about the sources of background O3?
  4. How does USB O3 impact local air quality and how do uncertainties in USB O3 propagate into uncertainties in source attribution?
  5. What strategies can be used to quantify daily, seasonal, and interannual variations in NCOS and what are the strengths and weaknesses of each method?
  6. What strategies are needed to improve our estimates of baseline O3, USB O3, and NCOS and what are our recommendations for future research in this area?

2. Spatial distribution of baseline O3 in the U.S.

Most of the regulatory O3 monitors in the continental U.S. are located in or near major population centers and not sufficiently isolated from upwind sources to provide representative information on the baseline O3 inflow along the U.S. West Coast. One exception is the monitor maintained by the Washington Department of Ecology at Cheeka Peak [CP] on the coast of Washington State. NOAA (National Oceanic and Atmospheric Administration) also has a non-regulatory research monitor with a long-term data record about 850 km to the south of CP at Trinidad Head [THD] in northern California. Both of these monitors are located in the marine boundary layer, but the University of Washington operates another non-regulatory research monitor on a mountaintop site (Mt. Bachelor Observatory [MBO]) in central Oregon about 200 km from the coast. Twenty years of vertical profile data are also available from the NOAA ozonesonde program at Trinidad Head. Figure 2 summarizes these observations for both spring and summer. What is clear from these data is that in the absence of local influences, both median baseline O3 and the frequency of high O3 events increase with altitude (Cooper et al., 2011; Musselman and Korfmacher, 2014). At low elevations, mean spring O3 levels are about 10 ppb higher than summer values, whereas above 1 km, median spring and summer values are comparable, with summer showing a higher frequency of enhanced O3 events. The small difference in median values for the THD sondes and MBO data at the same altitude has been attributed to large-scale dynamical patterns (Zhang and Jaffe, 2017). The positive vertical gradient and local orographic flows also cause the observations at MBO to show lower O3 in the daytime, when air from the surrounding valley is lifted to the summit and higher O3 at night, when the site is exposed to the free troposphere (Weiss-Penzias et al., 2006).

Figure 2 

Vertical profiles of O3 at Trinidad Head, Cheeka Peak, Mt. Bachelor Observatory, and Chews Ridge. Spring (left) and summer (right) vertical profiles (meters above sea level, m asl) as measured by ozonesondes (https://www.esrl.noaa.gov/gmd/) from Trinidad Head, California (2007–2017) and continuous surface observations at Cheeka Peak, Washington, at 500 m asl (2010–2016) (blue symbols), Mt. Bachelor Observatory, central Oregon, at 2763 m asl (2007–2016) (blue symbols), and Chews Ridge Observatory at 1500 m asl on the ridgeline of the Santa Lucia coastal mountain range in southern California (2012–2016) (blue symbols). For the Trinidad Head sonde data, blue lines represent individual sondes; red thin lines represent the 2nd and 98th percentiles, red dashed lines the 10th and 90th percentiles, and thick red lines the 50th percentile. From left to right, the blue symbols for the surface sites represent the 2nd, 10th, 50th, 90th and 98th percentiles of nighttime O3 observations at Mt. Bachelor and Cheeka Peak and nighttime onshore O3 observations at Chews Ridge. Black vertical lines reference the 70 ppb NAAQS. The data for Chews Ridge were provided by Ian Faloona (University of California Davis). The Cheeka Peak data were obtained from the EPA AQS network. The MBO data are from the University of Washington data archive (https://digital.lib.washington.edu/Researchworks). DOI: https://doi.org/10.1525/elementa.309.f2

Altitude also influences the ODV metrics, as can be seen by comparing nearby rural sites at different elevations. Table 1 shows ODVs for pairs of rural monitoring sites in Oregon, Wyoming, and New Hampshire. In each case the higher elevation site (>1000 meters elevation difference) shows an ODV that is enhanced by at least 10 ppb compared to the lower elevation site. This reflects both the higher seasonal median O3 and larger contributions from NCOS. For the Mt. Washington, New Hampshire site, and to a lesser extent the Centennial, Wyoming site, this could also reflect greater transport of domestic O3, given that these sites are downwind of major U.S. source regions (e.g., Huang et al., 2013a). This is not the case for Mt. Bachelor, however, which receives minimal influence from U.S. anthropogenic sources (Ambrose et al., 2011). High O3 levels at remote mountaintop sites such as Mt. Bachelor do not necessarily correspond to high values in more populated, lower elevation areas. Isolated high altitude sites have greater exposure to free tropospheric air that can be diluted as it is transported and mixed into the boundary layer (Wigder et al., 2013a). Furthermore, the O3 lifetime is longer in the lower free troposphere than in near-surface air where it undergoes depositional loss to the surface and where chemical reaction rates may be enhanced in warmer, more humid air masses. The O3 levels measured at mountain sites and nearby populated areas may be similar, however, if the boundary layers are sufficiently deep and well-mixed as is often the case in the Intermountain West (Langford et al., 2017).

Table 1

Comparison of O3 ODVs for adjacent sites with differences in elevations >1000 meters (2013–2015) (a). DOI: https://doi.org/10.1525/elementa.309.t1

State | Site (b) | Coordinates | Elevation (m asl) | O3 design value (ppb) (c)
Oregon | Bend | 44.02°N, 121.26°W | 1135 | 59
Oregon | Mt. Bachelor | 43.98°N, 121.69°W | 2763 | 77
Wyoming | Carbon | 41.78°N, 107.12°W | 2015 | 55
Wyoming | Centennial | 41.36°N, 106.24°W | 3178 | 66
New Hampshire | Camp Dodge | 44.31°N, 71.22°W | 451 | 57
New Hampshire | Mt. Washington | 44.27°N, 71.30°W | 1914 | 67

(a) Data are from the EPA AQS database (https://www.epa.gov/aqs) except for the non-regulatory Mt. Bachelor measurements, which are from the University of Washington data archive (https://digital.lib.washington.edu/Researchworks).

(b) In each state, the lower elevation site is in a small urban or rural location, whereas the elevated site is more remote.

(c) The MDA8s used in the ODV calculations use only data acquired with start hours between 0700 and 2300 local standard time. The ODV is the three-year average of the 4th highest annual MDA8, calculated after approved EE data have been excluded from AQS. For all sites listed here, no EE days were identified or excluded from the ODV calculation. Note that EEs have not been formally evaluated for the Mt. Bachelor data, since it is not a regulatory monitor.
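As a quick arithmetic check of the statement that each elevated site lies more than 1000 m above its paired site and has an ODV at least 10 ppb higher, the Table 1 values can be differenced directly. The sketch below uses only the numbers in Table 1; the variable and function names are ours.

```python
# (state, lower site, elevation m asl, ODV ppb, higher site, elevation m asl, ODV ppb)
PAIRS = [
    ("Oregon", "Bend", 1135, 59, "Mt. Bachelor", 2763, 77),
    ("Wyoming", "Carbon", 2015, 55, "Centennial", 3178, 66),
    ("New Hampshire", "Camp Dodge", 451, 57, "Mt. Washington", 1914, 67),
]


def pair_differences(pairs):
    """Return (state, elevation difference in m, ODV difference in ppb)."""
    return [
        (state, hi_elev - lo_elev, hi_odv - lo_odv)
        for state, _, lo_elev, lo_odv, _, hi_elev, hi_odv in pairs
    ]


for state, d_elev, d_odv in pair_differences(PAIRS):
    print(f"{state}: +{d_elev} m, +{d_odv} ppb")
```

The differences are 1628 m and 18 ppb for Oregon, 1163 m and 11 ppb for Wyoming, and 1463 m and 10 ppb for New Hampshire.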

3. Approaches used to quantify USB and NAB O3

Most estimates of background O3 have been made using regional CTMs such as the CMAQ (Community Multiscale Air Quality Modeling System) (Byun and Schere, 2006) and CAMx (Comprehensive Air Quality Model with Extensions) (Ramboll Environ, 2014) models that are initialized using lateral boundary conditions (BCs) derived from global models. In this section, we summarize the model approaches used to estimate USB O3 and examine their different merits, limitations, and best uses. We note that different methods of employing CTMs may be best suited (scientifically or computationally) to a specific policy or research question. Biases owing to misspecification of emissions, errors in physical processes, choices regarding chemical mechanisms (Knote et al., 2015), model resolution (Lin et al., 2010), and plume dispersion (Rastigejev et al., 2010; Eastham and Jacob, 2017) may propagate into biases in the source attribution. In some cases, as described in more detail later, ad hoc methods for bias-correcting model estimated source attribution have been applied (e.g., Lin et al., 2012a, b; Lapina et al., 2014).

The most common modeling approach for quantifying USB O3 is the “zero-out” method, whereby domestic anthropogenic emissions are set to zero (e.g., Zhang et al., 2014; Fiore et al., 2014a) to provide a direct estimate of the O3 levels that would exist without domestic emissions. Nuances arise when applying the zero-out method to regional models wherein USB O3 is transported into (and potentially out of) the regional modeling domain. For example, the regional boundary conditions used for defining USB O3 may come from a global model run with U.S. anthropogenic emissions set to zero (e.g., Emery et al., 2012), or may be drawn from global model runs without any emissions perturbations (e.g., Lefohn et al., 2014). Huang et al. (2017) found that surface O3 responses in a regional model over North America to changes in USB O3 contribution from East Asia were smaller than those in the global models used to generate the boundary conditions. Zero-out scenarios also change O3 production efficiency within the model domain causing the contributions from different sectors and regions to be non-linearly related. This is particularly obvious in the case of NOx titration, which is removed when local emissions are zeroed, causing O3 increases. This non-linearity can prevent the source contributions from adding up to 100% of the total modeled O3 levels (Wu et al., 2009), which could be a concern when multiple model zero-out simulations from different source regions are combined.
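The non-additivity of zero-out contributions can be illustrated with a toy calculation. The sketch below is not a chemical mechanism: `toy_o3` is a hypothetical square-root response chosen only because it is concave in emissions, and the emission amounts and the 40 ppb background are made-up numbers.

```python
import math


def toy_o3(e_region_a, e_region_b, background=40.0):
    """Toy concave O3 response (ppb) to total precursor emissions.

    Purely illustrative; real O3 chemistry is far more complex.
    """
    return background + 10.0 * math.sqrt(e_region_a + e_region_b)


def zero_out_contributions(e_a, e_b):
    """Attribute O3 by zeroing one source at a time."""
    base = toy_o3(e_a, e_b)
    return {
        "total": base,
        "background": toy_o3(0.0, 0.0),       # all emissions zeroed
        "region_a": base - toy_o3(0.0, e_b),   # region A zeroed
        "region_b": base - toy_o3(e_a, 0.0),   # region B zeroed
    }


result = zero_out_contributions(9.0, 16.0)
parts = result["background"] + result["region_a"] + result["region_b"]
print(result, "sum of parts:", parts)
```

In this toy case the total is 90 ppb, but the background (40 ppb) and the two zero-out contributions (10 and 20 ppb) sum to only 70 ppb, so the attributed parts account for less than 100% of the modeled O3.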

Sensitivity methods can also be used to estimate USB O3 and contributions by source. The most basic implementation of sensitivity modeling is direct perturbation modeling, where emissions from each source or region of interest (or contributions from the stratosphere) are reduced or increased by small amounts (e.g., ±20%; Wu et al., 2009; Galmarini et al., 2017) such that nonlinear O3 responses are not typically triggered in polluted conditions (Cohan et al., 2005). At the extreme limit of perturbation methods (i.e., infinitesimally small perturbations), techniques such as adjoint modeling (Sandu et al., 2005; Zhang et al., 2009) and decoupled direct methods (DDM; Dunker et al., 1981; Hakami et al., 2004) efficiently calculate the local linear sensitivity of USB O3 to numerous source contributions. These methods provide results suited for projecting changes in O3 owing to small emissions perturbations (e.g., <20–50%; Reidmiller et al., 2009; Huang et al., 2017). Second-order correction terms can be applied to sensitivity approaches to estimate O3 contributions caused by larger perturbations (Wu et al., 2009; Wild et al., 2012), or nonlinear changes can be evaluated using path-integral methods (Dunker et al., 2017). While these techniques can track sensitivities within a given model, they depend strongly on the emission inventories applied in that model. It is thus critical to evaluate uncertainties in historic and future source estimates, and how these uncertainties propagate into projections of specific O3 metrics.
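The difference between small-perturbation sensitivities and a full zero-out can be sketched with finite differences. The example below is a toy under stated assumptions: `toy_o3` is a hypothetical concave response (not a CTM), the ±20% step mirrors the perturbation size quoted above, and all numbers are made up.

```python
import math


def toy_o3(scale, emissions=25.0, background=40.0):
    """Toy concave O3 response (ppb) with emissions multiplied by `scale`."""
    return background + 10.0 * math.sqrt(scale * emissions)


def removal_estimates(step=0.2):
    """Estimate the O3 reduction (ppb) from removing the source entirely.

    Returns (first-order estimate, second-order estimate, brute-force value).
    """
    up, base, down = toy_o3(1 + step), toy_o3(1.0), toy_o3(1 - step)
    first = (up - down) / (2 * step)             # dO3/d(scale), central difference
    second = (up - 2 * base + down) / step ** 2  # second derivative w.r.t. scale
    linear = first                               # first-order extrapolation to scale = 0
    quadratic = first - 0.5 * second             # adds the second-order Taylor term
    actual = base - toy_o3(0.0)                  # brute-force zero-out
    return linear, quadratic, actual


linear, quadratic, actual = removal_estimates()
print(f"linear {linear:.1f}, second-order {quadratic:.1f}, zero-out {actual:.1f} ppb")
```

In this toy case the linear extrapolation gives about 25 ppb and the second-order estimate about 31 ppb, against a brute-force zero-out value of 50 ppb. The second-order term moves the estimate in the right direction, but extrapolating a local sensitivity to complete removal of a source remains unreliable when the response is strongly nonlinear.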

Tagging techniques track source contributions in models without perturbing emissions (Cohan and Napelenok, 2011; Grewe et al., 2010). Tagging relies on a set of rules for assigning each molecule of O3 to a particular source. These sources may be defined as specific tropospheric production regions (e.g., Wang et al., 1998; Fiore et al., 2002) or the stratosphere (e.g., Lin et al., 2012a; Zhang et al., 2014). Other tagging approaches use chemical indicators of the factors limiting O3 production (e.g., the ratio of hydrogen peroxide to nitric acid production, or the maximum incremental reactivity of VOC families) to assign O3 to either NOx or VOC sources, such as the CAMx OSAT (Ozone Source Apportionment Technology) and CMAQ ISAM (Integrated Source Apportionment Method) source tagging schemes (Ramboll Environ, 2014; Kwok et al., 2015). Tagging may also be defined through the addition of tracers to track the origin of precursor molecules such as NOx (e.g., Emmons et al., 2012; Pfister et al., 2013) or VOCs (Butler et al., 2011). Other tagging rules include assignment preferentially to anthropogenic precursors (Ramboll Environ, 2014), or tagging of all O3 precursors (NOx, CO, and VOCs) such as in Grewe et al. (2010, 2017) and Guo et al. (2017), which leads to larger estimates of USB O3 than sensitivity studies or tagging only one type of precursor. Ying and Krishnan (2010) developed a scheme that includes tracers for O3 produced from individual species; the treatment of VOC impacts on radical species in this approach may underestimate contributions from reactive VOCs and overestimate those from less reactive VOCs (Kwok et al., 2015). Lefohn et al. (2014) define an Emissions-Influenced Background (EIB) that accounts for the decrease in the lifetime of USB O3 caused by anthropogenic emissions. 
This diversity of tagging approaches can make direct comparisons across such studies challenging, and the differences in source attribution estimates as well as the computational cost of these methods make them less well suited than zero-out simulations for estimating USB O3.

Several studies have compared USB O3 estimates calculated using different methods. In one study, a tagging source apportionment method using CAMx was compared to a zero-out method using CMAQ. The two approaches were found to provide similar estimates of April–October mean NAB O3 in rural areas, but in urban areas CAMx APCA (Anthropogenic Precursor Culpability Assessment) provided lower estimates of background O3 compared to CMAQ zero-out (Dolwick et al., 2015). Other comparisons note that tagging is more appropriate for source attribution than for estimating responses to emissions changes (e.g., Collet et al., 2014). In cases strongly affected by nonlinearities of O3 formation, the choice of source estimation method can lead to considerable differences (Grewe et al., 2010; Stock et al., 2013; Lapina et al., 2014; Emmons et al., 2012).

Parrish et al. (2017a) noted that the running average ODVs for sites in Southern California over the past 4 decades can be fit to a simple exponential decay function. They postulated that the asymptotic value of this fit is the same as USB O3. However, it is difficult to compare this approach with modeling studies that use a more rigorous definition for USB O3. To derive USB O3 from the Parrish et al. (2017a) method, it is necessary to assume that U.S. emissions are asymptotically approaching zero, that emissions and ODVs are directly related, and that USB O3 on ODV days is constant over the analysis time period. Because of these limitations, the “background ODVs” calculated in this manner are probably more representative of current baseline O3, plus some unquantified contribution from U.S. anthropogenic emissions.
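To make the asymptote idea concrete, the following sketch fits a generic decay of the form ODV(t) = y0 + A·exp(−(t − t0)/τ) and reads off y0. It is not the fitting procedure of Parrish et al. (2017a): the exact parameterization is an assumption, the fit is a simple grid search over τ with linear least squares for y0 and A, and the ODV series is synthetic.

```python
import math


def fit_exponential_decay(years, odvs, taus):
    """Fit odv = y0 + A * exp(-(t - t0) / tau), with t0 = years[0].

    For each candidate tau the model is linear in (y0, A), so those two are
    solved by ordinary least squares; the tau with the smallest squared
    error wins. Returns (y0, A, tau).
    """
    t0 = years[0]
    n = len(years)
    y_mean = sum(odvs) / n
    best = None
    for tau in taus:
        x = [math.exp(-(t - t0) / tau) for t in years]
        x_mean = sum(x) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, odvs))
        amplitude = sxy / sxx
        asymptote = y_mean - amplitude * x_mean
        sse = sum((asymptote + amplitude * xi - yi) ** 2 for xi, yi in zip(x, odvs))
        if best is None or sse < best[0]:
            best = (sse, asymptote, amplitude, tau)
    return best[1], best[2], best[3]


# Synthetic, noise-free series (made-up parameters: y0 = 60 ppb, A = 100 ppb, tau = 20 yr).
years = list(range(1980, 2016))
odvs = [60.0 + 100.0 * math.exp(-(t - 1980) / 20) for t in years]
y0, amplitude, tau = fit_exponential_decay(years, odvs, taus=range(5, 41))
print(f"asymptote {y0:.1f} ppb, amplitude {amplitude:.1f} ppb, tau {tau} yr")
```

With real ODV records the recovered asymptote depends on the fitting window and on the assumptions listed above, which is why it is better read as an estimate of baseline O3 than of USB O3.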

4. Spatial and temporal distributions of USB O3

Here we review published work on spatial and temporal distributions of USB O3 from CTMs and summarize consistent and robust patterns. We also identify discrepancies between estimates of USB O3 and, if possible, the causes for these discrepancies. While a clear, quantitative synthesis across the published literature (Tables S1 and S2) is confounded by inconsistencies in the metrics reported and the time periods and regions considered, some robust patterns are evident and several CTMs have been able to capture the major features in the daily and seasonal surface O3 patterns (Fiore et al., 2009; Reidmiller et al., 2009; Schnell et al., 2015).

The McDonald-Buller et al. (2011) review relied heavily on background O3 estimates from the global GEOS-Chem (GC) model available at that time (Zhang et al., 2011). Major methodological advances since McDonald-Buller et al. (2011) include seasonal mean USB and NAB O3 estimates from additional global and regional models (Table S1) and studies quantifying the influence of NCOS on surface O3 distributions (Table S2). A broad set of modeling studies robustly shows that seasonal mean USB and NAB O3 are usually largest at western U.S. high-altitude sites (Table S1), as expected from the general increase in O3 with altitude in the troposphere (e.g., Newchurch et al., 2003; Logan et al., 1999). This spatial pattern was emphasized in the earlier McDonald-Buller et al. (2011) review paper and was based on observations of baseline O3 and published USB and NAB O3 estimates from the GC model.

Individual studies report different O3 metrics and vary in their definitions of peak O3 season, ranging from two to seven months, mostly in spring and summer. Synthesizing across these studies, we find a range of 15–65 ppb (Table S1) for seasonal mean USB O3 (MDA8) over the U.S. The higher end of this range occurs over high-altitude western U.S. sites in spring when Asian pollution and transport from the stratosphere make their largest contributions (20–35 ppb; Table S2) and when the O3 lifetime is longer than in summer (see Table S1). In the eastern U.S. and along the California coast, seasonal mean NAB O3 from the GC model is in the range of 20–40 ppb (Fiore et al., 2014a) and USB O3 is similar for the California coast from CMAQ (Dolwick et al., 2015). Other O3 metrics, such as those relevant for vegetation exposure, like W126, a 3-month integral that heavily weights high O3, differ in their sensitivity to USB O3 (e.g., Lapina et al., 2014, 2016; Huang et al., 2013b). A 3-model average NAB O3 contributed 64–78% of the May–July daytime O3 over the Intermountain West during 2010, but only 9–27% of the W126, which more strongly weights the highest O3 levels (Lapina et al., 2014).
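The reason W126 responds so differently from a seasonal mean can be seen from its weighting function. The sketch below uses the commonly cited sigmoidal weight, w = 1/(1 + 4403·exp(−126·C)) with C in ppm; the function names are ours, and the restriction to daytime hours and the selection of the highest 3-month sum used in the full index are omitted.

```python
import math


def w126_weight(o3_ppm):
    """Sigmoidal W126 weight for an hourly O3 value in ppm."""
    return 1.0 / (1.0 + 4403.0 * math.exp(-126.0 * o3_ppm))


def w126_sum(hourly_ppb):
    """Weighted sum (ppm-hours) over the supplied hourly values in ppb."""
    return sum((c / 1000.0) * w126_weight(c / 1000.0) for c in hourly_ppb)


for ppb in (40, 60, 80, 100):
    print(ppb, "ppb -> weight", round(w126_weight(ppb / 1000.0), 3))

ratio = w126_sum([80]) / w126_sum([40])
print("one hour at 80 ppb vs one hour at 40 ppb:", round(ratio, 1))
```

Under this weighting a single hour at 80 ppb contributes roughly 50 times as much as an hour at 40 ppb, so a source that dominates the seasonal mean at moderate O3 levels can still account for a small share of the W126.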

NCOS (and USB O3) also show significant interannual variability, complicating direct comparisons across studies from different years. The studies in Table S2 summarize individual seasonal mean NCOS estimates, which include up to 25 ppb transported from the stratosphere, up to 10 ppb produced from lightning NOx, and up to a few ppb from wildfires. Estimates for seasonal mean Asian influence are generally below 5 ppb (Table S2). Anthropogenic CH4 is included in the USB O3 estimates in Table S1, and has been estimated to contribute ~5 ppb to U.S. surface O3 (Fiore et al., 2008, 2009). Near the U.S. borders with Canada and Mexico, international pollution transport enhances USB O3 relative to NAB O3 (Wang et al., 2009; Guo et al., 2018). In the southwestern U.S., seasonal mean USB O3 is higher than in other regions during both spring and summer, and NCOS play a more important role on high O3 days (Fiore et al., 2014a; Langford et al., 2017), although stratospheric intrusions occasionally decrease surface O3 in the heavily polluted Los Angeles Basin (Langford et al., 2012).

At some locations, the influence from individual NCOS (Figure 1) leads to day-to-day variability in observed O3 and modeled USB O3. For example, at high-altitude western U.S. sites, USB O3 correlates with simulated total ground-level MDA8 O3, implying that USB O3 drives day-to-day variations in observed O3 (Fiore et al., 2014a; see their Figure 8). Other models consistently find that western USB O3 increases with observed (total) O3 (Lefohn et al., 2014; Huang et al., 2015), although Dolwick et al. (2015) note that the fractional USB O3 contribution is typically less for the highest modeled values. Numerous studies have shown that NCOS can contribute up to 30 ppb to the observed MDA8 at regulatory monitors due to deep stratospheric intrusions, especially at high-altitude sites (e.g., Langford et al., 2009, 2015a; Lin et al., 2012a, 2015a; Knowland et al., 2017) or from wildfires (Jaffe et al., 2004; Singh et al., 2012; Dreessen et al., 2016; Gong et al., 2017). Cross-border transport from Mexico or Canada can also contribute to significant variations in daily MDA8 values (Wang et al., 2009). Modeled USB O3 also shows these daily variations due to NCOS, with modeled USB MDA8 O3 sometimes exceeding 70 ppb (Lin et al., 2012a, b; Zhang et al., 2014). Models will not necessarily capture the O3 maximum on the highest observed days, implying uncertainty in the simulated partitioning of total O3 into USB O3 and other sources (Fiore et al., 2014a). Furthermore, even if a model captures the observations perfectly, it does not necessarily follow that the simulated source attribution is correct.

Figure 3 illustrates that the 4th highest NAB MDA8 value at rural locations in the NOAA GFDL AM3 model is much lower than the observed 4th highest MDA8 over most densely populated U.S. regions, but that NAB O3 contributes to some of the highest observed days in the Intermountain West, Pacific Northwest, and along the U.S.–Canada border. At some high elevation sites, the annual 4th highest NAB MDA8 from AM3, averaged over 2010–2014, exceeds 60 ppb, although we note that AM3 simulations may be biased high by too much transport from the stratosphere (Lin et al., 2012b; Fiore et al., 2014a). Over the eastern U.S., where Figure 3 shows 4th highest NAB MDA8 values below 60 ppb, both AM3 and GEOS-Chem indicate that the highest O3 events are typically fueled by U.S. anthropogenic emissions with little correlation between USB O3 and total simulated O3 (with the possible exception of some sites along the Gulf Coast; Figure 8 of Fiore et al., 2014a).

Figure 3 

Annual 4th highest MDA8 O3 observed and NAB modeled values. Annual 4th highest MDA8 O3 value at all available rural O3 monitoring sites in the U.S. and Canada, averaged over 2010–2014 (top). Annual 4th highest MDA8 NAB value averaged over 2010–2014, from a GFDL-AM3 model simulation with North American anthropogenic emissions zeroed out (bottom). Top figure provided by the Tropospheric Ozone Assessment Report (Schultz et al., 2017). Bottom figure from NAB O3 simulation described in Lin et al., 2017. DOI: https://doi.org/10.1525/elementa.309.f3
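The metric mapped in Figure 3 can be illustrated with a short sketch. The code below is a simplified illustration only: it assumes complete hourly data and omits the regulatory data-handling rules (completeness criteria, permitted window start hours, rounding). The function names are hypothetical and not taken from any cited study.

```python
from statistics import mean


def mda8(hourly):
    """Maximum of all running 8-hour means in a list of consecutive hourly values (ppb)."""
    if len(hourly) < 8:
        raise ValueError("need at least 8 consecutive hourly values")
    return max(mean(hourly[i:i + 8]) for i in range(len(hourly) - 7))


def fourth_highest(daily_mda8):
    """Annual 4th highest value from one year of daily MDA8 values."""
    if len(daily_mda8) < 4:
        raise ValueError("need at least 4 daily values")
    return sorted(daily_mda8, reverse=True)[3]


def multi_year_mean_fourth_highest(mda8_by_year):
    """Average of the annual 4th highest MDA8 over several years (e.g., 2010-2014)."""
    return mean(fourth_highest(values) for values in mda8_by_year.values())
```

Applying the last function to three consecutive years corresponds to the averaging used for the ozone design value; applying it to the five years 2010–2014 corresponds to the quantity averaged in Figure 3.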

A few of the studies in Table S1 compared seasonal mean and daily NAB O3 estimates across 2–4 models and found discrepancies in the magnitude and variability, both spatial and temporal, of NAB O3 estimates for the MDA8 (Fiore et al., 2014a), daytime mole fractions, and the W126 (Lapina et al., 2014) O3 metrics. The AM3 model generally simulates significantly higher seasonal mean values in both spring and summer (up to 20 ppb higher), compared to other models. Fiore et al. (2014a) concluded that differences in model estimates of NAB O3 resulted primarily from different model representations of stratosphere–troposphere exchange, wildfire, and lightning sources (and their subsequent chemistry) as well as isoprene oxidation chemistry in the models. HTAP (2010) and Huang et al. (2017) show that Asian and other intercontinental O3 sources also vary by model. Orbe et al. (2017) show how different convection schemes can have large influences on transport, even when using the same meteorological fields. Dolwick et al. (2015) applied two regional models to compare the zero-out and source apportionment approaches and found similar seasonal mean MDA8 USB O3 estimates (after correcting for biases as large as ±10 ppb in each of the regional models compared to observations). Discrepancies between these USB O3 estimates occurred most strongly in urban areas where anthropogenic emissions can lower background O3 levels due to NOx titration (Dolwick et al., 2015). Consideration of odd oxygen in the tracers used for source apportionment would minimize such discrepancies. Odd oxygen here would be defined as including O3 + NO2 to account for conversion of O3 to NO2 (by NO titration).
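The titration argument can be written out explicitly. The following is a sketch using the conventional definition of odd oxygen as the sum of O3 and NO2; the symbol δ (the amount of O3 titrated) is introduced here for illustration only.

```latex
\begin{align}
\mathrm{NO} + \mathrm{O_3} &\rightarrow \mathrm{NO_2} + \mathrm{O_2} \\
[\mathrm{O_x}] &\equiv [\mathrm{O_3}] + [\mathrm{NO_2}] \\
\Delta[\mathrm{O_3}] = -\delta, \quad \Delta[\mathrm{NO_2}] = +\delta
  \;&\Rightarrow\; \Delta[\mathrm{O_x}] = 0
\end{align}
```

As a purely hypothetical example, if 50 ppb of background O3 enters an urban area and 10 ppb of it is titrated by fresh NO emissions, an O3-only tracer would report 40 ppb of background, whereas an odd oxygen tracer would still report 50 ppb (40 ppb O3 + 10 ppb NO2).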

Uncertainty in estimates of USB O3 can be difficult to consolidate across studies into an overall uncertainty estimate owing to differences in region, season, source apportionment method, and O3 metrics considered in different works. Nevertheless, insight into the range of uncertainties can be gained from several studies that have considered multiple models or approaches in an internally self-consistent manner. While model diversity does not strictly represent the total model uncertainty (which must also consider bias against observations), it is still a useful measure of confidence in USB O3 estimates. For example, the daytime NAB O3 in Lapina et al. (2014) from three different global models showed modest differences over most regions of the U.S., but much more significant differences in NAB O3 for the W126 vegetation index. In this case, the contribution from NAB O3 to W126 can differ by a factor of 2 using different models. In Dolwick et al. (2015), two different regional models and source apportionment methods were used to estimate seasonal MDA8 USB O3. They found that at over 75% of the locations, the differences were less than 2.5 ppb after the base models were bias corrected, although we note that the same global model boundary conditions were used in each regional model. In Fiore et al. (2014a), estimates of MDA8 NAB from two global models differed by 1–10 ppb, depending upon region, season, and altitude. Hogrefe et al. (2018) evaluated surface O3 simulations in a regional model using four sets of boundary conditions from different global models (AM3, MOZART, Hemispheric CMAQ, and GEOS-Chem). The largest differences exceeded 10 ppb for seasonal mean O3 observed at U.S. sites and reached 15 ppb on individual days. For two sets of boundary conditions, observation-model differences were much smaller (typically ±4 ppb). Qualitative synthesis by the authors of all these estimates of model differences and estimates of model biases suggests uncertainties in seasonal mean USB O3 of about ±10 ppb.

Comparisons to observations are essential for assessing the fidelity of models used to quantify USB O3 and NCOS and their spatial and temporal variability and lending confidence to their estimates. In some cases, different models bracket observed O3 abundances (e.g., Fiore et al., 2014a), but in others, such as for ground-level O3 over the southeastern U.S. in summer, systematic model biases exist (e.g., Travis et al., 2016). Travis et al. (2017) found that this pervasive positive summertime bias over the southeast U.S. is restricted to the surface and may reflect shortcomings in model resolution of asymmetric top-down and bottom-up vertical mixing. Systematic biases may also reflect missing (or poorly represented) loss processes (e.g., halogen chemistry (Sherwen et al., 2017) or dry deposition (e.g., Val Martin et al., 2014)). Some of the studies in Table S1 have attempted to bias-correct USB or NAB O3 estimates by simply assuming the bias is entirely due to USB O3 (Lin et al., 2012b) or by assuming that the relative model contributions from individual sources are accurate such that USB O3 is adjusted proportionally to its contribution to total simulated O3 (Dolwick et al., 2015). The former approach assumes a single process causes the error whereas the latter assumes the model is missing a sink that acts on all O3 regardless of the source (or overestimates O3 from all sources equally). Models assimilating tropospheric satellite-based O3 columns or aircraft-based profiles show improved model representation of western U.S. ozonesonde profiles (e.g., Huang et al., 2015) but would require assumptions to partition the adjustment into USB O3 versus O3 produced from U.S. anthropogenic emissions. While models adjusting emissions of O3 precursors based on satellite data assimilation (e.g., Huang et al., 2015) could lead to improved estimates of USB O3, this approach is still subject to errors in model transport and cannot differentiate between natural and anthropogenic sources occurring in the same model grid cell.
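The two bias-correction assumptions described above lead to different adjusted USB O3 values for the same model output. The sketch below contrasts them for a single day or seasonal mean; the function names and the numbers in the usage note are hypothetical, and the published studies may apply additional steps not shown here.

```python
def correct_all_bias_to_usb(obs, total, usb):
    """Assume the whole model bias (total - obs) originates in USB O3."""
    bias = total - obs
    return usb - bias


def correct_proportionally(obs, total, usb):
    """Assume relative source contributions are right; scale USB O3 by obs/total."""
    if total <= 0:
        raise ValueError("simulated total O3 must be positive")
    return usb * obs / total
```

For illustrative values of 60 ppb observed, 66 ppb simulated total, and 44 ppb simulated USB O3, the first assumption gives 38 ppb and the second gives 40 ppb. The gap between the two widens as the bias grows or as the USB O3 fraction shrinks.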

Although a single model may best represent a particular site or day of interest, a multi-model approach may best provide a general characterization of spatial, seasonal, and daily variability in USB O3 until the root sources of individual model biases are clear. Future efforts would benefit from moving beyond abundance-based evaluations and towards process-based evaluation to demonstrate whether models capture the variability in observations attributable to USB O3 and specific NCOS. This type of evaluation will require intensive field campaigns and long-term observations that measure not only O3 but also related meteorological and chemical variables. Locations and times with inter-model differences with major implications for air quality management could guide targeted observations for evaluating process-level representation in the models. Efforts to coordinate multi-model approaches, as has been done for quantifying the influence of foreign anthropogenic emissions on surface O3 under the Task Force on Hemispheric Transport of Air Pollution (HTAP, 2010; Galmarini et al., 2017), would facilitate a more systematic and rigorous assessment of our quantitative understanding of USB O3 as represented across a suite of modeling systems.

Satellite observations enable new global model analyses (via data assimilation) and have made significant contributions to EE analyses (e.g., Fiore et al., 2014b). However, satellite data have not yet been able to retrieve O3 mole fractions in the boundary layer and at the surface. Some satellite analyses have quantified tropospheric column O3, either directly (e.g., Liu et al., 2010) or by difference (Ziemke et al., 2011). However, this situation is likely to change dramatically as several geostationary satellite instruments will be deployed in the next 5 years. This includes the U.S. Tropospheric Emissions: Monitoring Pollution instrument (TEMPO), the Korean Geostationary Environment Monitoring Spectrometer (Bak et al., 2013), and the European Sentinel-4 satellite (Zoogman et al., 2017). By measuring backscattered solar radiation in both the visible and near ultraviolet (290–740 nm) from a geostationary orbit, TEMPO should be able to distinguish boundary layer O3 from that in the free troposphere and stratosphere, and provide hourly data for the continental U.S. on key O3 precursors, such as nitrogen dioxide (NO2) and formaldehyde (HCHO). Specifications for TEMPO call for a precision of 10 ppb for the 0–2 km and free tropospheric O3 measurements. Thus, TEMPO should provide key constraints on modeled O3 that can improve source and EE attribution (Zoogman et al., 2014, 2017). The satellite community has been engaged with regional air quality efforts via programs such as the NASA Air Quality Applied Sciences Team, and this has led to important partnerships between the scientific and regulatory communities (e.g., Fiore et al., 2014b; Witman et al., 2014).

5. Interannual variability and trends in baseline and USB O3

Generalization of individual measurement and model results is complicated by the fact that background O3 exhibits both long-term trends and substantial year-to-year variability. Observed year-to-year variations of surface O3 show large-scale similarity across sites over the Intermountain West (Jaffe, 2011; Lin et al., 2017), indicating that the controlling processes operate across large scales. Both mean O3 and the frequency of high O3 events (>65 ppb) measured at western U.S. rural sites increased in the springs following the strong La Niña winters that occurred in 1998–1999, 2007–2008, and 2010–2011 (Lin et al., 2015a; Xu et al., 2017). Anomalously frequent high-O3 events were also observed at Mt. Bachelor and urban sites downwind in April–May 2012. The enhanced O3 in spring 2012 resulted in 3–6 days with an MDA8 greater than 70 ppb at several rural locations including Great Basin National Park and Lassen Volcanic National Park (Baylon et al., 2016). Using the AM3 model, Lin et al. (2015b) were able to capture the significant interannual variability and identify the cause. The highest MDA8 values at western U.S. rural sites occurred in the springs of 1999, 2011, and 2012, following La Niña patterns. The increased frequency of deep tropopause folds, linked to a cyclical amplification of the polar jet stream, is the key driver of year-to-year variability of springtime high USB O3 events over the western U.S. (Lin et al., 2015b).

Large-scale variations in temperature, pressure, and airflow can also lead to substantial year-to-year variations in O3 production, air mass stagnation, snowpack accumulation, and wildfire severity (Fiore et al., 2015; Mote et al., 2016; Gong et al., 2017; Jaffe and Zhang, 2017; Lin et al., 2017; Shen and Mickley, 2017). Interannual variability of surface O3 in the Intermountain West during summer is found to correlate with wildfire severity (Jaffe, 2011; Jaffe et al., 2008). This correlation may also reflect common underlying correlations with temperature rather than a causal relationship between fire and O3 (Zhang et al., 2014), as supported by a model with constant fire emissions, which captures the observed O3 interannual variability (Lin et al., 2017). While wildfire emissions can enhance summertime monthly mean O3 at individual sites by 2–8 ppb, high temperatures and the associated buildup of O3 produced from regional anthropogenic emissions are also important to elevating observed summertime O3 in the western U.S. (Jaffe and Zhang, 2017) and throughout the rest of the country (Lin et al., 2017).

Information on long-term baseline O3 trends requires rural monitoring sites combined with methods that can select the data that are representative of air masses originating beyond the nation’s borders. While boundary layer O3 observations show more influence from local, continental, or marine sources, observations at high elevation sites (1.5–3.0 km asl) show greater influence from large-scale downward mixing of free tropospheric air, although they can also be influenced by transport of photochemically aged plumes from nearby urban areas or wildfires during summer (e.g., Ambrose et al., 2011). Studies of baseline O3 trends have mainly focused on the limited number of well-positioned monitoring sites along the U.S. borders (Parrish et al., 2012, 2017b; Gratz et al., 2015; Zhang and Jaffe, 2017) and across the Intermountain West during spring due to the great interest in the potential impact of rising Asian emissions on U.S. surface O3 (Jacob et al., 1999).

Cooper et al. (2012) found a tendency towards increasing O3 at high elevation rural sites across the western U.S. in spring and no clear trend in summer over the period 1990–2010, despite stringent precursor emission controls in the U.S. that have decreased O3 in urban areas (e.g., Russell et al., 2012). Extending the analysis to 1988–2014, Lin et al. (2017) found 0.2–0.5 ppb yr–1 increases in median springtime MDA8 O3 measured at 50% of 16 western U.S. high elevation sites, with 25% of the sites showing increases across the entire O3 mole fraction distribution. There is also evidence that O3 increased in the mid-troposphere (500 hPa or ~5.7 km asl) above western North America during April–May at the rate of ~0.3 ppb yr–1 from 1995 to 2014 (Lin et al., 2015b).

Baseline O3 trends on the West Coast of the U.S. have been determined at several of the surface and mountain sites described above, although the data records are relatively short. From 2004 to 2015, mean O3 at Mt. Bachelor (2.8 km asl) has increased significantly: 0.62 ± 0.25 ppb yr–1 in spring, 0.66 ± 0.27 ppb yr–1 in summer, and 0.79 ± 0.34 ppb yr–1 in fall (Zhang and Jaffe, 2017). In the most recent analyses, marine boundary layer O3 has remained unchanged at Cheeka Peak, Washington, and decreased at Trinidad Head in northern California (Parrish et al., 2017b). Figure 4 shows these trends. The decrease of O3 at Trinidad Head may be associated with a shift in transport pattern (as indicated by rapidly warming temperatures), while the spring increase at Mt. Bachelor has been attributed to changes in Asian emissions over the past decade and the summer increase attributed to regional wildfires (Zhang and Jaffe, 2017). The differences at these two sites, separated by a horizontal distance of 850 km, likely reflect the different influences of local processes, interannual meteorological variability, and changing USB O3.

Figure 4 

Interannual variability of baseline O3 at Mt. Bachelor Observatory and Trinidad Head. Nighttime observations of baseline O3 at Mt. Bachelor Observatory (blue) and baseline O3 with daytime onshore wind conditions at Trinidad Head (orange) for the 2nd (triangles), 50th (squares), and 98th percentiles (circles). The range of values for the 98th percentile at Mt. Bachelor over the period 2004–2016 is 64–86 ppb during spring and 61–84 ppb during summer. The range of values for the 98th percentile at Trinidad Head over the period 2005–2016 is 41–58 ppb during spring and 22–41 ppb during summer. DOI: https://doi.org/10.1525/elementa.309.f4

Attribution of baseline O3 trends requires consideration of changes in global emissions, as well as regional climate variability, particularly in short data records. It is well established that O3 formation depends on both temperature (e.g., Weaver et al., 2009) and humidity and changes in these climate variables must be considered when evaluating trends. For example, Bloomer et al. (2010) show that O3 trends in the eastern U.S. between 1989 and 2007 were largely negative, despite temperature trends that were positive, indicating the dominant role played by emission reductions. Observed baseline O3 trends have been compared with trends derived from a variety of global models: (1) CTMs driven with a single year’s meteorology that repeats each year while emissions are allowed to change (Fusco and Logan, 2003; Reidmiller et al., 2009; Wild et al., 2012; Zhang et al., 2008), (2) free-running chemistry-climate models (CCMs) that generate their own weather, but are driven with historical emissions (Cooper et al., 2014; Lamarque et al., 2010; Parrish et al., 2014), and (3) multi-decadal hindcast simulations driven with observed meteorology and historical emissions (Brown-Steiner et al., 2015; Koumoutsaris and Bey, 2012; Lin et al., 2015b; Lin et al., 2014; Lin et al., 2017; Strode et al., 2015; Xing et al., 2015). The O3 trends derived from observations are higher than those from CTMs with constant meteorology, and from free-running CCMs by a factor of two at some sites (e.g., Parrish et al., 2014). These discrepancies may partly reflect the influence of internal climate variability on observed O3 (although we note that the reduced variability in CCMs may also reflect errors in their representation of chemistry and dispersion and from numerical diffusion, similar to CTMs whose meteorology is forced to match observed large-scale weather patterns). As the free-running CCM cannot reproduce the exact meteorological fields for the specific observational period, the model cannot be expected to capture the observed trend exactly (e.g., Lin et al., 2014, 2015a; Barnes et al., 2016). For example, Deser et al. (2012) have shown that summertime surface temperature projections for mid-century in some U.S. regions can vary from <1 up to 5°C for the exact same climate forcing scenario solely because of slight variations in the initial atmospheric state. As trends in O3 are tied to meteorology, and it is unlikely if not impossible that a single climate model simulation would represent the internal variability exactly as manifest in the real atmosphere, CCMs cannot be evaluated in the same manner as CTMs driven by the observed meteorology. Furthermore, meteorologically-driven O3 variability is large over western North America, leading to significant variations in O3 trends between sites (Lin et al., 2015b).

One recent study using hindcast simulations forced with observed meteorology was able to match measured O3 trends at rural western U.S. sites by narrowing the analysis to days when the airflow is predominantly from the North Pacific Ocean in the model (Lin et al., 2017). This study suggests that the common model-observation disagreement in baseline O3 trends at western U.S. sites reflects an excessive offset from regional pollution decreases in the global models owing to their coarse resolution, which cannot fully resolve the observed baseline conditions. This shortcoming can be corrected by filtering model O3 for baseline conditions using regionally emitted tracers in the model, such as CO (Lin et al., 2017).
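One way such a tracer-based filter could be implemented is sketched below. This is not the procedure of Lin et al. (2017): the selection rule (keeping the days with the lowest regional CO tracer), the `keep_fraction` parameter, and the function name are all assumptions made for illustration.

```python
def baseline_days(o3, regional_co, keep_fraction=0.25):
    """Return model O3 on the days with the lowest regional CO tracer.

    o3 and regional_co are equal-length daily series; keep_fraction is a
    hypothetical tuning parameter, not a value from the literature.
    """
    if len(o3) != len(regional_co):
        raise ValueError("series must have the same length")
    n_keep = max(1, int(len(o3) * keep_fraction))
    # Rank days from lowest to highest regional tracer abundance.
    ranked = sorted(range(len(o3)), key=lambda i: regional_co[i])
    # Restore chronological order for the retained days.
    kept = sorted(ranked[:n_keep])
    return [o3[i] for i in kept]
```

Trends computed from the retained days are then less affected by regional pollution decreases that a coarse model grid cell mixes into its simulated baseline.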

A synthesis of available observations from the mid-1990s to the 2000s indicates increases in surface and free tropospheric O3 across East Asia (see Supplementary Note 1 in the SI). Quantifying the effects of increasing Asian precursor emissions on O3 in the U.S., relative to the effects of regional emission controls, has been an active research area in the last decade. Reidmiller et al. (2009) and Wild et al. (2012) used the HTAP simulations to show that regional emission controls over North America are 2–10 times as effective at reducing U.S. surface O3 as the equivalent controls in Asia and Europe. Even so, Lin et al. (2017) demonstrated that the tripling of Asian NOx emissions from 1990 to 2014 contributed 65% of modeled springtime background O3 increases (0.3–0.5 ppb yr–1) over the western U.S., outpacing O3 decreases (<0.1 ppb yr–1) attained via a 50% reduction of U.S. NOx emissions. Increases in global methane contributed about 15% to the trend.

Detailed analyses of baseline O3 trends along the U.S. southern and northern borders are limited in the peer-reviewed literature. Recent analysis by the Tropospheric Ozone Assessment Report (Schultz et al., 2017) of all available rural O3 monitoring sites in the U.S. and Canada has provided some insight. While some O3 data are available for urban sites in Mexico, there are no rural monitoring sites, greatly limiting our ability to understand Mexico’s impact on U.S. baseline O3. However, roughly 3 dozen rural sites are located across southern Canada with trends that are similar to those observed on the U.S. side of the border, based on the annual 4th highest MDA8 O3 value. In general, there appears to be little change in O3 across southern Canada in spring but there is an indication of decreasing O3 in summer, presumably associated with Canadian NOx emission decreases of 34% from 2000 to 2014 (Hoesly et al., 2018). The trend in O3 transported from Mexico to the southern U.S. is not known from observations, but Mexican NOx emissions have gone down by only 3% for 2000–2014 (Hoesly et al., 2018). Further details regarding observed O3 trends across North America are provided in the SI (see Supplementary Note 2).

A number of studies have demonstrated that U.S. emissions and mole fractions of NOx have declined substantially (Simon et al., 2015; Lamsal et al., 2015; Krotkov et al., 2016), but at the same time, there can still be substantial uncertainty in the absolute amounts (Hassler et al., 2016). One analysis suggests that the EPA National Emission Inventory (NEI) significantly over-estimates NOx emissions from mobile and/or industrial sources (Travis et al., 2016). The most recent inventory shows that U.S. anthropogenic NOx emissions decreased by 49% from 2000 to 2014 (Hoesly et al., 2018). It should be noted that fertilized agricultural and soil emissions of NOx may be substantial, and may become more important as industrial emissions decline (Jaeglé et al., 2005; Almaraz et al., 2018). These emissions have higher uncertainties than the industrial emissions.

Peak O3 levels and ODVs have decreased at most monitoring sites in the U.S., with the largest decreases in the eastern U.S. and in California (e.g., Simon et al., 2015). Figure S1 shows trends of the annual 4th highest MDA8 O3 values (based on April–September observations) at all available rural O3 monitoring sites in the U.S. and Canada, for the period 2000–2014. The great majority of sites show decreasing O3 with p-values <0.10. Figure 5 shows O3 trends at high elevation (>1 km altitude) rural sites over the period 2000–2016. The analysis is applied to the 5th, 50th, and 95th percentiles of midday observations (1100–1600 local time) for spring (April–May) and summer (June–July–August) with the goal of assessing O3 trends within air masses that are as regionally representative as possible. During spring only one site shows increasing O3, Mt. Bachelor for the 50th and 95th percentiles (both trends in the range of 0.5–0.6 ppb yr–1). In the case of Mt. Bachelor, only nighttime data are used here to focus on free tropospheric/baseline conditions, and the analysis at this particular site is limited to 2004–2016. Of the remaining western sites, most show no significant springtime trend while any significant trends are negative. In summer, Mt. Bachelor is again the only site with a statistically significant O3 increase at the 50th and 95th percentiles (0.5 and 0.8 ppb yr–1, respectively), likely due to recent increases in regional wildfire influence (Zhang and Jaffe, 2017). Otherwise, sites in the west and east show a clear tendency towards decreasing summertime O3, especially in the upper tail of observations (95th percentile), presumably due to regional emissions controls. These results, limited to observations since 2000, differ from the conclusions of prior studies spanning the much longer periods of 1990–2010 (Cooper et al., 2012) and 1988–2014 (Lin et al., 2017), which showed a general increase of O3 in spring and no consistent trend in summer. While most U.S. rural sites do not show significant springtime O3 decreases since 2000, it appears that regional emission controls have led to widespread decreases in summertime O3 at these sites, especially in the upper tail of observations.
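A percentile-based trend of this kind can be sketched as follows. The code assumes one list of seasonal midday observations per year, uses a linearly interpolated percentile and an ordinary least-squares slope, and omits the significance test; these choices and the function names are illustrative assumptions rather than the exact method behind Figure 5.

```python
def percentile(values, p):
    """Linearly interpolated percentile of values, with 0 <= p <= 100."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def percentile_trend(obs_by_year, p):
    """Trend (ppb per year) in the p-th percentile of each year's observations."""
    years = sorted(obs_by_year)
    series = [percentile(obs_by_year[year], p) for year in years]
    return ols_slope(years, series)
```

Calling `percentile_trend` with p set to 5, 50, and 95 on the same data shows whether a change is confined to the upper tail of the distribution or shared across it.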

Figure 5 

Mean O3 trends for 2000–2016 at rural high elevation sites (>1 km asl). Spring trends (left) and summer trends (right). All sites used daytime data (1100–1600 local time), except for Mt. Bachelor, where we use nighttime data to focus on baseline/free tropospheric air masses. Vector colors indicate the p-value associated with the linear trend at each site. DOI: https://doi.org/10.1525/elementa.309.f5

Models may fail to simulate accurately the responses of O3 to changes in U.S. emissions due to shortcomings in the underlying emission inventories. Several retrospective dynamic model evaluation studies using CMAQ tend to underestimate observed decreases in U.S. O3 over the past decades (Foley et al., 2015; Xing et al., 2015; Zhou et al., 2013). Karamchandani et al. (2017) found that models more accurately simulate trends in observed O3 in southern California when basin-wide VOC emissions were doubled. In contrast, for the eastern U.S., Travis et al. (2016) found that reducing industrial NOx emissions, compared to the NEI, gave results that were more consistent with observations. Thus, emission inventory accuracy is key to model performance and inventories may have biases that vary by region. Inaccuracies in the magnitude of NOx and VOC emissions introduce errors in the modeled sensitivity of O3 to changes in precursor emissions. Wherever possible, O3 sensitivities to precursor emissions should be evaluated directly as other sources of errors (e.g., inaccurate representation of changes in chemical or depositional loss rates) may also contribute to discrepancies between modeled and observed responses. To the extent that models misrepresent the contribution to O3 from domestic sources, they will incorrectly estimate the relative fractions of controllable and background O3.

We examined the change in the annual 4th highest MDA8 for 2000–2017 for 9 urban locations in the U.S. (San Bernardino, Chicago, Atlanta, Boston, Albuquerque, Sacramento, Salt Lake City, Denver, and Reno). In each location, we chose a single monitoring site with one of the highest ODVs in that urban area (Figure S2). From this we find that San Bernardino, Atlanta, Boston, Albuquerque, and Sacramento all show statistically significant downward trends in the 4th highest MDA8, whereas Chicago, Salt Lake City, Denver, and Reno show no significant trend since 2000 (Table S3). Overall, the significant reductions in the urban areas are generally consistent with the rural O3 trends shown in Figure S1. The negative trends in 4th highest MDA8 O3 are linked to significant reductions in emissions of O3 precursors, while at the same time there can be important regional differences in emission trends (e.g., emissions related to oil and gas extraction in some parts of the western U.S.) that can help explain some of the weaker trends. We note that three of the four locations with no significant trend are high elevation sites (Salt Lake City, Denver, and Reno). Trends in O3 at these western sites might also be influenced by increasing wildfire activity. Exclusion of wildfire EEs would impact the trend in ODVs at these sites, if relevant states have submitted the EE documentation and EPA approves. Although we have examined only a single monitor in each urban area, this demonstrates the importance of accurate assessment of the USB O3 contribution for these locations and regional modeling to quantify the controllable sources, as described in Section 6, below.

6. USB O3 influence on regional air quality modeling: A western case study

Regulatory applications (e.g., SIPs) require models to represent O3 sources accurately so that they can be used to examine emission scenarios and demonstrate future attainment of the NAAQS. This section shows one case study to highlight results as used in regulatory model applications. The regulatory treatment includes exclusion of identified exceptional days and focuses on the top 10 observed days. While this case study compares only two models, it provides insights into the relationships between regional model estimates of USB O3 and observations. In particular, this analysis compares how simulated USB O3 and other sources correlate and the implications for model performance as used in regulatory modeling.

The EPA Transport Assessment (US EPA, 2016c) and the Western Air Quality Study (WAQS, 2017) both independently simulated USB O3 at 12-km resolution in Colorado for 2011. This is an ideal case study for USB O3 relevant to state planning because the western states typically have high USB O3 contributions, and because the Northern Colorado Front Range often experiences high O3 levels that exceed the NAAQS. Both modeling systems use global simulations to provide time-varying boundary conditions (EPA: GEOS-Chem; WAQS: MOZARTv4) and quantified USB O3 contribution as the sum of tagged boundary and natural sources of O3 from May 1 to Sept. 29. Further details on both modeling systems are provided in the SI (see Supplementary Note 3). We compare simulations and contributions for two illustrative monitors: Chatfield (AQS 08-035-0004, hereafter CHAT), a regulatory relevant suburban monitor southwest of Denver, and Rocky Mountain National Park (AQS 08-069-0007, hereafter RMNP), a relatively rural high elevation monitor to the northwest.

Figure 6 shows the observed and modeled MDA8 (EPA model only) and the USB O3 contribution (from both models) at CHAT. Figure S3 shows a similar comparison for RMNP. Monthly averaged biases at the CHAT monitor were marginally negative in the EPA simulations (–2.5 ± 0.4 ppb) and marginally positive in the WAQS simulations (4.0 ± 2.8 ppb), and both are consistent with a literature synthesis of model performance (Simon et al., 2012). Figure 6 suggests four distinct segments of performance and simulated contributions at CHAT that are related to NCOS contribution. The simulations start in a USB O3 dominated regime (May 1 to June 7), go through a transition period (June 8 to July 15), and then end with two periods dominated by local contributions (July 16 to Aug 22 and Aug 23 to Sept. 29). During the USB O3 dominated period, the EPA model had stronger correlation (r = 0.74) than the WAQS (r = 0.33), and WAQS had several days where USB O3 was greater than total observed O3. During the transition period, both simulations performed poorly (r = 0.23). During the locally dominated periods, both simulations performed well. Table S4 shows additional correlations for individual model components. In general, there is a negative correlation between USB O3 and local contributions. Similar results were found at RMNP (see Figure S3), where the correlation was typically not as good as at CHAT. Based on this comparison, we find that periods associated with higher background contribution were associated with worse model performance. Thus, the simulations performed better during periods of sustained contribution (USB O3 or local), even better when USB O3 and local contribution were not anti-correlated, and best when local contributions were dominant.

Figure 6 

Observed and modeled MDA8 O3 with USB O3 from EPA model and WAQS for Chatfield. Observed O3 (black lines), EPA model MDA8 O3 (top of dark grey), EPA model USB O3 (top of light grey), and WAQS USB O3 (dashed green lines). For four simulation segments, the values below the axis give (for both models) the mean bias (MB), correlation (r) of total prediction with observations (TOT), correlation of local contribution (LC) with observations, and correlation of USB O3 contribution with observations (USBO). DOI: https://doi.org/10.1525/elementa.309.f6

Regulatory applications focus on high concentration days, so Figure 7 examines the two models’ performance on only the top 10 MDA8 O3 days. The top 10 days were defined by the observed mole fractions. For this analysis, we excluded two days from the observations with suspected significant stratospheric influence (June 7th and 24th), consistent with guidance for regulatory modeling (US EPA, 2014a; see further discussion in Supplementary Note 4 in the SI). Both simulations have a negative mean bias (EPA: –5 ppb; WAQS: –4 ppb). The significance of the bias was evaluated using a t-test. The null hypothesis is that the predicted and observed means are equal or, put another way, that the predictions are on average unbiased. Despite large individual day biases on the top 10 days (range of +11 to –22 ppb), neither model bias was significant (p > 0.05).
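The bias test described above can be sketched as follows. The text does not state which form of the t-test was used, so this example assumes a paired test on the daily model-minus-observation differences. The critical values are the standard two-sided 5% values of Student's t distribution, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 5% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_05 = {7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228, 11: 2.201}


def paired_bias_test(model, obs):
    """Test H0: mean(model - obs) == 0 at the 5% level (two-sided).

    Returns (mean bias, t statistic, significant?). Assumes a paired
    design, which is a choice made for this sketch.
    """
    diffs = [m - o for m, o in zip(model, obs)]
    n = len(diffs)
    if n - 1 not in T_CRIT_05:
        raise ValueError("no tabulated critical value for this sample size")
    bias = mean(diffs)
    std_err = stdev(diffs) / sqrt(n)
    if std_err == 0:
        raise ValueError("all daily differences are identical")
    t_stat = bias / std_err
    return bias, t_stat, abs(t_stat) > T_CRIT_05[n - 1]
```

With ten days whose individual biases span a wide range around a mean of a few ppb, the standard error is large relative to the mean bias, which is why a bias of –5 ppb can fail to reach significance.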

Figure 7 

Observations, predictions, and USBO estimated by EPA and WAQS models for top 10 observed days. Observations (OBS), total predictions (TOT), and U.S. background O3 (USBO) are shown. Mean bias (MB, ppb), mean error (ME, ppb), and the p-value for a t-test comparing model to observations are provided for TOT. Similar values are provided for WAQS USBO where EPA USBO is treated as the reference. Boxes denote the inter-quartile range (IQR), and whiskers extend to the min/max excluding outliers. Outliers are further than 1.5 times the IQR below the 25th percentile or above the 75th percentile. (Two possible stratospheric intrusion days were removed.) DOI: https://doi.org/10.1525/elementa.309.f7

We further compare USB O3 between EPA and WAQS on the “observed top 10 days” to test if the choice of the modeling system produced significantly different contributions from NCOS and U.S. sources. Despite daily differences of up to 14 ppb, the average difference (5 ± 5 ppb) was not significant (p > 0.05). The USB O3 differences were comparable in magnitude to differences in local contribution (–4 ± 8 ppb) that were also not significant.

Our review of EPA and WAQS 2011 modeling for Chatfield highlights similarities between different models, but also confirms the need to improve modeling of background O3. Correlations between observations and contributions at CHAT over the whole period are generally consistent with previous studies (US EPA, 2013; Zhang et al., 2011; Emery et al., 2012) showing that: (1) USBO is a significant fraction of total O3 at the CHAT and RMNP sites; (2) the observed and predicted O3 are most strongly correlated with the local contribution; and (3) boundary conditions are anti-correlated with the local contribution (see Table S4).

Both models perform well for average biases, but model correlation with observations is better when local contributions are dominant and when anti-correlation between local and USB O3 contributions is weak. The boundary conditions derived from global models are dominated by USB O3 in both models, which suggests a need for more research coupling global and regional models. The top 10 observed days are generally when the models perform best, and both models predict total O3 that is consistent with the observations and each other. The finding that the models perform worst when USB O3 and local contribution anti-correlation is strongest, or during transitions from USB O3 to local contribution dominance, highlights the need for more research on USB O3 and provides specific conditions for future studies.

7. Evidence for NCOS from observations and models

Individual NCOS events have long been associated with episodic increases in surface O3, and much of our knowledge about their impacts in the U.S. and Canada has been inferred from routine ground-based measurements coupled with meteorological analyses (Ambrose et al., 2011; Fine et al., 2015; Jaffe and Zhang, 2017; Lefohn et al., 2012; Stauffer et al., 2017; Teakles et al., 2017; Wigder et al., 2013a, b) or with models and satellite retrievals (He et al., 2011; Lin et al., 2012a, b). These studies have been hampered by the sparsity of surface O3 monitors in the western states where the impacts tend to be greatest (Gustin et al., 2015), and by limited free tropospheric measurements by aircraft (Yates et al., 2013), ozonesondes (He et al., 2011), or lidars (Kuang et al., 2012; Langford et al., 2018).

The episodic nature of some NCOS makes it difficult to target these sources with dedicated field studies, but opportunistic measurements have been made during field campaigns with other objectives (Langford et al., 2012; Ott et al., 2016; Sullivan et al., 2015). Long-range transport of O3 and its photochemical precursors from Asia to the western U.S. was a focus of several recent campaigns including the California Research at the Nexus of Air Quality and Climate Change (CalNex) (Neuman et al., 2012; Ryerson et al., 2013) and Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) (Huang et al., 2010; Jacob et al., 2010) missions. The impact of U.S. wildfires on O3 in the West was also investigated during ARCTAS (Singh et al., 2012) and other studies (Jaffe et al., 2008, 2013; Dreessen et al., 2016), and the influence of wildfires, long-range transport, and stratosphere-to-troposphere transport (STT) were foci of the Las Vegas Ozone Study (LVOS) (Langford et al., 2015a).

Most STT in the U.S. occurs through tropopause folds, tongues of upper troposphere/lower stratosphere (UT/LS) air extruded beneath the jet stream circulating around mid-latitude cyclones. These occur most frequently in winter when Rossby wave activity is at a maximum in the Northern Hemisphere, but the potential impact on surface O3 is greater in late spring through early summer, when there is more O3 in the lower stratosphere and deeper mixed layers can more easily entrain O3 that reaches the lower troposphere (Langford et al., 2017). Descending stratospheric intrusions can also merge with biomass burning plumes (Brioude et al., 2007) or transported pollution (Cooper et al., 2004a, b; Lin et al., 2012b) and carry additional O3 from these sources downward to the surface. Most tropopause folds are dissipated in the free troposphere and the transported O3 becomes part of the free tropospheric background. Deep tropopause folds sometimes create localized spikes in surface O3 (Langford et al., 2009), but they more frequently lead to smaller increases (<20 ppb) that can affect larger areas over several days (Lin et al., 2012a). They can also indirectly increase surface O3 by fomenting the spread of wildfires due to their low humidity (Langford et al., 2015b). Several studies (e.g., Skerlak et al., 2014) have shown that the west coast of North America is one of the preferred regions for deep tropopause folds and there is growing evidence that the integrated contributions of frequent intrusions and co-mingled Asian pollution contribute to the springtime maximum in background O3 in the southwestern U.S. and Intermountain West. STT events have also been implicated in exceedances of the O3 NAAQS in the western U.S. (Langford et al., 2009; Langford et al., 2015a).

The contributions of STT to surface O3 are not easily simulated using regional CTMs, which have traditionally included only the troposphere with no internal stratospheric processes. Regional simulations that use a global model to provide the lateral boundary conditions have shown qualitative success at simulating STT timing and location, but typically with significant under- (Emery et al., 2012; Zhang et al., 2014) or over-estimations (He et al., 2011). Under-estimations have often been attributed to poor horizontal resolution. Emery et al. (2012) showed several case studies where 12-km horizontal resolution was capable of reproducing transport to the surface. Inadequate vertical resolution and mixing is also a problem; for example, He et al. (2011) suggested that over-estimations of STT by the Environment Canada AURAMS CTM during the 2007 Border Air-Quality and Meteorology Study (BAQS-Met) were caused by the model having limited vertical resolution near the tropopause. For the under-estimations, Zhang et al. (2014) proposed a statistical correction to improve USB O3 estimates. For the top boundary conditions in troposphere-only models, Xing et al. (2016) developed a seasonally and spatially varying potential vorticity (PV)-based function to characterize O3 in the upper troposphere that improved springtime performance by the WRF-CMAQ model, but degraded it in fall. One outstanding challenge for model assessments of STT is how to treat O3 that was originally produced in the troposphere, transported to the stratosphere, and then transported back to the troposphere, as part of a stratospheric intrusion. Zhang et al. (2014) show that different definitions for stratospheric O3 can lead to a factor of 2 difference in the amount of O3 identified as “stratospheric”. While this does not change the total modeled O3, it could lead to significant discrepancies in source contributions identified by different models.

Stratospheric intrusions can be identified in high-resolution reanalysis data (Knowland et al., 2017), and some global models have been successful in reproducing the surface contributions of STT. Simulations by GFDL-AM3 (Lin et al., 2012a, b), RAQMS (Pierce et al., 2003), and FLEXPART (Brioude et al., 2007) agreed well with lidar and in situ measurements made during the Las Vegas Ozone Study (LVOS) (Langford et al., 2017). He et al. (2011) also found good agreement between FLEXPART and surface and ozonesonde measurements made during several STT events in the BAQS-Met campaign. GFDL-AM3 estimated that deep STT can episodically increase surface O3 by 20–40 ppb on days when observed MDA8 O3 exceeds 70 ppb at western U.S. high elevation sites. GEOS-Chem can identify STT influence at the surface at high elevation sites but typically underestimates the contribution (Zhang et al., 2014).

Biomass burning can produce significant amounts of O3, and wildfires are a growing concern (US GCRP, 2016). In the western U.S., forest management and climatic factors (e.g., drought and pine bark beetle infestations) have resulted in extensive tree mortality (Raffa et al., 2008), a significant increase in wildfire activity (Dennison et al., 2014), and deteriorating air quality in some areas (McClure and Jaffe, 2018). Agricultural burning is commonplace in the central and eastern U.S. (McCarty et al., 2007; Liu et al., 2016), but these fires are, in principle, controllable so are not considered NCOS. The chemistry in fire plumes is complex and highly variable, and does not always generate O3. In a review of more than 100 studies on wildfire smoke, Jaffe and Wigder (2012) found that O3 production generally increases for up to 5 days downwind, but with a very wide range in reported ΔO3/ΔCO enhancement ratios. While the majority of smoke plumes show some degree of O3 enhancement, many studies have found no O3 production or even O3 loss. This reflects the large variability in NOx and VOC emissions, plume heights, and downwind meteorology (Briggs et al., 2016; Baylon et al., 2015). Because wildfire emissions have high VOC/NOx ratios (Akagi et al., 2011), O3 production can increase when plumes pass over NOx-rich urban areas (Singh et al., 2012; Gong et al., 2017).

Modeling O3 production in wildfire plumes with Eulerian models is complicated by variable emissions, sub-grid processes, complex chemistry, uncertainties in emission magnitudes and injection heights, and the poorly characterized radiation fields in and around smoke plumes. Chemical transport models often over-predict the amount of O3 produced near the fire (Jaffe et al., 2013; Zhang et al., 2014; Lu et al., 2016), although the simulated bias is strongly case dependent. For example, Baker et al. (2016) used CMAQ to model the O3 produced from two wildfires that burned in 2011 and found frequent over-predictions of up to 60 ppb in hourly mole fractions. This may be mainly due to the presence of oxygenated VOCs in fire emissions, especially acetaldehyde (Akagi et al., 2011), which result in rapid sequestration of NOx into PAN (Briggs et al., 2016; Müller et al., 2016). Herron-Thorpe et al. (2014) evaluated MDA8 O3 at numerous sites in the Pacific Northwest for the summers of 2007 and 2008 and found that the AIRPACT-3 modeling system had a slight negative bias of 4.6 ppb with a mean error of 8.9 ppb over the two summers with significant fire emissions, but the authors also identified some large over-predictions for individual events. In summary, estimating wildfire O3 production from Eulerian models is challenging, due to numerous factors, and these models need careful evaluation with observations.

Alvarado et al. (2015) developed a Lagrangian plume model to examine both O3 and secondary aerosol formation from one prescribed fire in California. These results supported a critical role for rapid in-plume chemistry and NOx sequestration (as PAN) to explain O3 formation rates. A similar box model approach was successfully used by Müller et al. (2016). Both the Lagrangian and box model approaches avoid the problems of grid resolution, which is a major challenge for modeling fire plumes with 3D Eulerian models. Using a statistical model, combined with surface particulate matter (PM) and satellite data from the NOAA Hazard Mapping System, Gong et al. (2017) showed that wildfire impacts on MDA8 O3 at 7 urban sites in the western U.S. range from negative values up to 33 ppb, including on days that had MDA8 values over 70 ppb. Plume models and statistical methods may provide useful estimates of O3 production in fire plumes, but these approaches need further evaluation.

8. Methods to quantify the impact of NCOS on regulatory monitors as relevant to policy

The CAA recognizes that states and tribes should not be held responsible for sources of air pollution over which they have no control and provides several relief mechanisms to address NCOS. These include the Exceptional Events (EE) Rule (US EPA, 2016b) and CAA 179B provisions related to international transport (US EPA, 2016a). The effective implementation of these mechanisms depends on the ability to quantify the amount of O3 from NCOS. Here we review several methods and assess the strengths and weaknesses of each approach.

The EPA has not yet published guidance on EE STT demonstrations; however, the EPA has approved EE demonstrations submitted by the state of Wyoming (WYDEQ, 2012; US EPA, 2014b) and other states (https://www.epa.gov/air-quality-analysis/treatment-air-quality-data-influenced-exceptional-events). These demonstrations can include measurements and model simulations showing layers of stratospheric air (characterized by elevated O3 and by very low humidity and CO), increased potential vorticity (Xing et al., 2016), and transport into the boundary layer. These analyses provided qualitative demonstrations of substantial contribution from a stratospheric intrusion event but do not provide quantitative estimates of the contribution to O3. While model simulations can provide quantitative estimates of stratospheric contributions, models sometimes fail to simulate accurately the observed surface O3 during intrusion events and thus do not provide reliable quantitative estimates. Langford et al. (2015a, 2017) have shown that O3 lidar measurements can be useful for directly observing layers of stratospheric air that descend deep into the troposphere and reach the surface boundary layer. Quantitative attribution of the stratospheric contribution can be improved if these observations are supplemented by surface measurements of O3, CO, and PM2.5 to help determine if the descending UT/LS air has mixed with international transport or wildfire plumes.

The EPA has published guidance on EE for wildfires (US EPA, 2016b) that describes three levels (or tiers) of technical analyses required to support an EE demonstration for a high O3 day. All tiers include a narrative that demonstrates a clear causal relationship between the wildfire and an O3 exceedance. When a fire is close to a site where monitored O3 is typically low, Tier 1 uses trajectory analyses (e.g., HYSPLIT) and satellite imagery to show that the fire plume impacted the monitor. For Tier 2, fire emissions divided by distance from the monitor (Q/D) must be greater than 100 tons/day/km. Tier 2 additionally requires evidence that smoke from the fire impacted the monitor, such as monitoring data, satellite imagery, or photographs. For all other cases, a Tier 3 demonstration requires additional evidence that supports the clear causal relationship between the wildfire and the O3 exceedance. Typically, this includes an estimate of the wildfire contribution using matching day analyses, statistical regression models, or photochemical models, as described in more detail in US EPA (2016b). We note that the Q/D method, described in the EPA guidance, is based on previous methods for primary pollutants, and at present, there has been very little evaluation of the Q/D method with respect to O3 produced from wildfires. A number of states have successfully demonstrated EEs for O3 due to wildfire emissions, as described on the EPA website (https://www.epa.gov/air-quality-analysis/treatment-air-quality-data-influenced-exceptional-events).
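The Q/D screen is simple arithmetic, and a short sketch makes the units explicit. The threshold below is the value quoted above. The function names are hypothetical, and the example does not specify which pollutants are summed into Q, since that is defined in the EPA guidance rather than here. Passing the screen is only one part of a Tier 2 demonstration, which also requires evidence that smoke reached the monitor.

```python
# Threshold quoted in the text, in tons/day/km.
Q_OVER_D_THRESHOLD = 100.0


def q_over_d(emissions_tons_per_day, distance_km):
    """Fire emissions (tons/day) divided by fire-to-monitor distance (km)."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return emissions_tons_per_day / distance_km


def exceeds_qd_screen(emissions_tons_per_day, distance_km):
    """True if Q/D is greater than the threshold quoted in the text."""
    return q_over_d(emissions_tons_per_day, distance_km) > Q_OVER_D_THRESHOLD
```

For example, a fire emitting 6000 tons/day at 50 km from a monitor gives Q/D = 120 tons/day/km and exceeds the screen, while the same fire at 100 km gives 60 tons/day/km and does not. The emission and distance values here are illustrative only.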

Because of the difficulty of using Eulerian models to estimate wildfire O3, EPA guidance also recommends use of a statistical approach. Statistical relationships have been developed to estimate O3 as a function of a variety of meteorological indicators (e.g., Camalier et al., 2007). Depending on the location and meteorological data available, this method typically explains between 50 and 80% of the observed daily variability. Several studies have applied this method to estimate the O3 contribution due to wildfires (CARB, 2011; Jaffe et al., 2013; Gong et al., 2017). In this approach, the statistical model is used to estimate the usual O3 mole fraction for the observed meteorological conditions and the difference between the observation and the predicted, called the residual, is considered the additional O3 due to some unusual source. While this approach cannot identify the cause for the additional O3, it can give an indication of the magnitude of unusual contributions, if the residual is sufficiently large. Both the EPA guidance and Gong et al. (2017) discuss this method in more detail.
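The residual approach can be illustrated with a deliberately simplified stand-in: an ordinary least-squares line with a single meteorological predictor (here, daily maximum temperature), fitted on days without unusual sources and then applied to a day of interest. The cited studies use richer statistical models with many predictors. The single predictor and the function names are assumptions made for this sketch.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def predicted_o3(temp, slope, intercept):
    """O3 (ppb) expected for the observed meteorology on a typical day."""
    return slope * temp + intercept


def residual_o3(temp, observed_o3, slope, intercept):
    """Observed minus predicted O3: the candidate 'additional' O3 (ppb)."""
    return observed_o3 - predicted_o3(temp, slope, intercept)
```

In use, the line would be fitted on typical days only, and the residual on a suspected smoke day would then be compared with the spread of residuals on the typical days. As noted above, a large residual indicates the magnitude of an unusual contribution but does not by itself identify its cause.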

9. Conclusions and recommendations

The O3 NAAQS has been strengthened several times since 1979 and most recently set at 70 ppb in 2015. With each downward step, the relative importance of background O3 increases, as does the role of USB O3 in air quality policy. Contributors to USB O3, also called noncontrollable O3 sources (NCOS), include natural precursor emissions (e.g., wildfires), long-range transport (e.g., from Asia, Canada, Mexico, or other countries), and stratospheric intrusions. When the standard is strengthened, daily variations in NCOS become more important and contribute to an increased frequency of MDA8 levels above the O3 NAAQS. Model-calculated USB O3 is greatest in March through June, with monthly mean MDA8 mole fractions at higher elevations in the west of up to 50 ppb and annual 4th highest MDA8 values exceeding 60 ppb at some locations. Lower elevation cities nationwide have monthly mean USB O3 of 20–40 ppb during the O3 season. Daily variations, particularly in spring and early summer, can be due to stratospheric intrusions mixed with Asian pollution, which can contribute to observed MDA8 values over 70 ppb. Elevated levels of O3 or its precursors are also found in fire plumes, in some cases contributing to observed MDA8 O3 values in excess of 70 ppb, particularly if fire plumes interact with NOx-rich urban emissions.

While USB O3 cannot be measured directly, baseline O3 can, but suitably positioned observational stations are limited in number. Along the West Coast, baseline O3 has increased since 2004 at the Mt. Bachelor Observatory in Oregon (2800 m asl), while surface/marine boundary layer O3 at Trinidad Head in northern California has decreased and O3 at Cheeka Peak, Washington (500 m asl), is largely unchanged. However, we note that the marine boundary layer sites are less relevant to air quality beyond their immediate coastal surroundings. In contrast, the Mt. Bachelor site is more representative of the free tropospheric inflow to western North America, but the data record is relatively short. So, while there is a significant positive O3 trend at this site, both meteorological variability and changes in USB O3 are likely involved. In comparison, O3 trends from most rural and urban sites in the U.S. show a consistent downward trend in the annual 4th highest MDA8 values since 2000, indicating the importance of regional emission reductions. The exceptions to this pattern are Chicago, Salt Lake City, Denver, and Reno, where trends in the annual 4th highest MDA8 at the most polluted monitors have not changed significantly since the year 2000.

Multiple methods have been used to estimate USB O3, and, at times, significant differences can arise. These estimates of USB O3 rarely include uncertainty. The lack of consistent reporting of model performance metrics hinders a quantitative uncertainty estimate. Uncertainty in USB O3 is estimated from many factors including differences between model results, model biases against observations, and interannual variations and trends. Baseline O3 can vary significantly between years. At Trinidad Head in the marine boundary layer, spring (April–May) observed mean O3 ranges from 32–48 ppb, based on data from 2004–2016. At Mt. Bachelor (2.8 km asl), the range in spring mean O3 is 45–59 ppb over the same time period. For summer (June–August), the ranges are 18–29 ppb at the surface and 42–55 ppb at 2.8 km asl. Thus model simulations of USB O3 must demonstrate the ability to capture these significant interannual variations with no significant bias. If systematic model biases are present, these must be explored so as to understand the underlying cause.

Given these limitations, our best estimate of the current uncertainty in the seasonal mean USB O3 for typical years is ±10 ppb, which arises from model uncertainty, as discussed in Section 4. However, in some years, seasonal mean baseline O3 is more than 5 ppb higher or lower than average (Figure 4) as a result of climate variability (e.g., El Niño), wildfire extent, and possibly other factors. Thus, for any given year, our predictive capability of USB O3 could have an uncertainty greater than ±10 ppb, which arises from the modeling uncertainty compounded by the additional interannual variability. Uncertainty for shorter time periods can be higher (e.g., Figure 6) and accurate estimates of USB O3 are especially important for MDA8 O3 on days that exceed the NAAQS. In the case of potential EE determinations (e.g., due to fires or stratospheric intrusions), this level of uncertainty can have policy implications. In the case of SIP or NAAQS analyses, enhanced NCOS contributions that remain in the ODV (i.e., not excluded through the process defined in the Exceptional Events Rule) can directly impact the level of estimated controls required (US EPA, 2013). We note that some level of NCOS is always present as part of the mean USB O3. Methods used to estimate USB O3 and NCOS include both CTMs, as well as empirical approaches, and the difference between these methods is not well characterized. This is particularly true for wildfires that can occur at spatial scales smaller than those typically resolved by CTMs. In such cases, Lagrangian and statistical models can be used, but their application in such situations is still in its infancy.

The effort to quantify USB O3 to date has lacked coordination and dedicated resources, as was noted in previous reports (NRC, 2010; McDonald-Buller, 2011; Cooper et al., 2015). With a lower O3 NAAQS, local, state, and regional air quality planning organizations will increasingly need improved methods to quantify USB O3 and NCOS with smaller uncertainties. To reduce these uncertainties, we have identified a series of research needs (in approximate order of importance):

  1. An improved observation network is needed to better understand baseline O3, USB O3, and NCOS. While the U.S. has an extensive network of regulatory surface O3 monitors, co-located measurements of key species (e.g., CO, NOx, VOCs, PM2.5, and speciated PM) that could be used to identify influences from stratospheric, foreign, natural, and/or biomass burning sources are made at only a few locations. In addition, most of the existing O3 monitors are located near population centers because of regulatory requirements and limited funding, leaving much of the interior western U.S. under-sampled. A new generation of low-cost sensors could facilitate routine observations of O3 and other key tracers at more surface monitoring sites (with careful validation), and an augmented baseline network with remote or high mountain locations and frequent vertical profiling (e.g., ozonesondes, lidar) (Langford et al., 2018) would improve identification of stratospheric, foreign, natural, and/or biomass burning sources. Key locations for enhanced observations are elevated locations and/or vertical profiles along the West Coast, in the Intermountain West, and along the U.S.-Mexico border.
  2. Improved quantification of USB O3 and the key processes controlling its distribution could be accelerated by one or more large-scale field experiments. Ideally, an experiment of this type would be conducted shortly after the TEMPO satellite instrument (Zoogman et al., 2014) becomes operational to provide large-scale, spatially and temporally continuous measurements across North America that can be directly linked to USB O3 estimates. The experiments should also include a suite of baseline sites (expanded from the targeted network above), near-continuous vertical profiles of O3 and precursor species, high mountain measurements, aircraft measurements, and multiple models operating over different seasons, including when USB O3 is expected to be highest and during O3 exceedances. Consideration should also be given to examining USB O3 over multiple years to account for interannual variations. Past experience has shown that the success of large-scale field experiments requires a community-wide effort with observational and modelling assets drawn from multiple federal, state, and university institutions (e.g., CalNex and INTEX-B).
  3. The ability of CTMs to quantify USB O3 accurately and consistently across different temporal and spatial scales should continue to be improved to more effectively support policy and scientific applications. In general, CTMs have greatly improved our understanding of the sources of USB O3. Continued progress will require process-oriented evaluations that include other key tracers wherever possible, with more attention paid to uncertainty and sensitivity analysis. Future modeling studies should report a consistent set of metrics including, at minimum, seasonal mean USB O3, the USB O3 on the observed annual 4th highest day and top 10 days (at the same time as the O3 maximum), and distributions of USB O3 binned by observed O3 (e.g., at least for the ranges below 60 ppb, 60–70 ppb, and above 70 ppb), as well as standard model performance metrics identified in recent reviews (Simon et al., 2012). Model studies should also report evaluation metrics specific to the intended use (e.g., fire or stratospheric intrusion evaluations, if those results are reported). At their core, models rely on emission inventories. Particularly as larger industrial emissions are reduced, smaller source categories become more important. The role of deposition and chemical sinks in shaping O3 distributions, including the USB O3 component, has received far less attention than the role of sources. We recommend that coordinated modeling efforts include diagnostics to allow exploration of inter-model differences in sinks as well as sources of USB O3. In urban areas, USB O3 estimates may differ strongly across models of different spatial resolutions due to NOx titration. Additional work is needed to test whether consideration of odd oxygen (defined as O3 + NO2) reconciles such discrepancies. For hemispheric or global models that provide boundary conditions, it is necessary to archive four-dimensional fields of all key tracers at 3-hour resolution at the regional model boundaries. Further, tracers or diagnostics are required that can distinguish between different types of NCOS at the boundaries. For detailed model inter-comparisons, full four-dimensional fields of O3, VOC, CO, and NOx, and key reaction products such as nitric acid, organic nitrates, total oxidized nitrogen species, and peroxides should be archived across the model domain. Comparison between models should focus on process-level analyses and model sensitivities, considering not only O3, but also related species. Intercomparisons of model source apportionment estimates can be difficult to interpret because of differences in the approaches used to implement source attribution techniques. Instead, process-level intercomparisons should include sensitivity experiments, such as simulations with zero anthropogenic emissions, to assess differences in model estimates of natural and background O3. Simulations with zero anthropogenic emissions will also provide improved estimates of background O3 in urban areas where local NOx emissions can titrate O3. A better understanding of model uncertainty will require comparisons with baseline observations, targeted intensive campaigns, and coordinated model inter-comparisons.
  4. Better methods for quantifying the impact of wildfires on O3 (and PM) should be developed, tested, and compared. Wildfires can drive exceedances of both O3 and PM, but the formation and dispersion of these pollutants in fire plumes are poorly understood. Future progress will require more detailed observations such as those currently planned for several large-scale process-oriented studies (e.g., FIREX [https://esrl.noaa.gov/csd/projects/firex/whitepaper.pdf], FIRECHEM [https://espo.nasa.gov/FIREChem_White_Paper], and WECAN [https://www.eol.ucar.edu/field_projects/we-can]). The field experiments will require measurements upwind and downwind of wildfires to develop a detailed understanding of chemical processing, establish plume to plume variability, and improve smoke plume simulations by air quality models. Wildfire chemical processes simulated by Eulerian and Lagrangian models should be compared to statistical models to evaluate the efficacy of the three approaches.
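The reporting metrics proposed in recommendation 3 (USB O3 on the observed annual 4th highest day and top 10 days, and USB O3 binned by observed O3) can be extracted from paired daily series with a few lines of code. The following is a minimal sketch for a single monitor and year. The function names and bin labels are hypothetical, and the bin edges follow the example ranges given in the recommendation.

```python
def bin_label(observed_ppb):
    """Assign an observed MDA8 value to one of the example ranges."""
    if observed_ppb < 60:
        return "<60"
    if observed_ppb <= 70:
        return "60-70"
    return ">70"


def usb_by_observed_bin(observed, usb):
    """Group modeled USB O3 values by the bin of the paired observation."""
    bins = {"<60": [], "60-70": [], ">70": []}
    for o, u in zip(observed, usb):
        bins[bin_label(o)].append(u)
    return bins


def usb_on_top_days(observed, usb, n=10):
    """USB O3 on the n highest observed days, highest observation first."""
    if n > len(observed):
        raise ValueError("fewer days available than requested")
    ranked = sorted(range(len(observed)), key=lambda i: observed[i],
                    reverse=True)
    return [usb[i] for i in ranked[:n]]


def usb_on_nth_highest_day(observed, usb, n=4):
    """USB O3 on the nth highest observed day (n=4 for the regulatory metric)."""
    return usb_on_top_days(observed, usb, n)[-1]
```

Ranking by the observations rather than by the model output keeps the metrics tied to the days that matter for the design value, which matches how the top 10 days were defined in the Colorado case study.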

Over the past decade, much progress has been made in our efforts to understand aspects of the USB O3 problem (e.g., episodic stratospheric sources, interannual variability, wildfire contributions), but these efforts have lacked coordination. While our understanding of USB O3 and the available tools have advanced, the uncertainties remain large and many of the conclusions and recommendations made here are similar to those made in the McDonald-Buller et al. review (2011). For a topic of such importance to air quality management and regional stakeholders, a more focused approach is needed. The strengthening of the O3 standard and the increased importance of EE demonstrations heighten the need for the scientific, regulatory, and stakeholder communities to make substantial progress in improving the observations and tools to understand USB O3.

Data Accessibility Statement

Mt. Bachelor Observatory data are available at the University of Washington data repository (https://digital.lib.washington.edu/researchworks/browse?type=subject&value=Mt.+Bachelor+Observatory). The Tropospheric Ozone Assessment Report Global surface O3 datasets are available at https://doi.pangaea.de/10.1594/PANGAEA.876108.

Supplemental files

The supplemental files for this article can be found as follows: