Download the data

The full NO2 dataset can be downloaded here.

The full PM2.5 dataset can be downloaded here.

The data files will be updated weekly. Because data is subject to change with additional QA/QC, users should download a new data file each time they work with the data rather than reuse a previously downloaded file. Timestamps are in GMT and mark the end of each averaging hour (hour ending). Missing data is indicated with the value -999. Breathe London data is licensed under the Open Government Licence.
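As a rough illustration of the two conventions above (the -999 missing-data value and hour-ending GMT timestamps), the sketch below parses a small CSV extract. The column names `date_utc` and `no2_ugm3`, the timestamp format and the sample rows are assumptions made for this example; the real files may be laid out differently.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

MISSING = -999.0  # missing-data sentinel stated on this page


def parse_row(row):
    """Convert one CSV row into (hour_start, hour_end, value or None)."""
    # Timestamps are GMT and mark the END of the averaging hour.
    hour_end = datetime.strptime(row["date_utc"], "%Y-%m-%d %H:%M:%S")
    hour_end = hour_end.replace(tzinfo=timezone.utc)
    hour_start = hour_end - timedelta(hours=1)
    value = float(row["no2_ugm3"])
    return hour_start, hour_end, None if value == MISSING else value


# Hypothetical three-row extract (column names are assumptions).
sample = io.StringIO(
    "date_utc,no2_ugm3\n"
    "2019-06-01 01:00:00,31.5\n"
    "2019-06-01 02:00:00,-999\n"
    "2019-06-01 03:00:00,28.0\n"
)
rows = [parse_row(r) for r in csv.DictReader(sample)]

# Exclude missing values before computing any statistic.
valid = [value for _, _, value in rows if value is not None]
mean_no2 = sum(valid) / len(valid)
```

Treating -999 as a number would drag any average far below zero, so the sentinel is converted to `None` before any arithmetic.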

The analysed mobile data shown on the Breathe London map can be downloaded here.

You can further explore and make customised visualizations with these datasets, as well as access the underlying mobile monitoring data (at 1Hz measurements), on the Air Quality Data Commons platform.

Breathe London’s air pollution data API is now available for public use. If you are interested in developing your own third-party app and/or service please contact the project team on hello@breathelondon.edf.org with details of your request. Further information and instructions can be found via the API documentation.

You can identify local pollution sources by exploring source apportionment results from the CERC ADMS model:

NOx standard emissions scenario can be downloaded here.
NOx ULEZ emissions scenario can be downloaded here.

NOx source apportionment at schools can be downloaded here. (Data is provisional)

Background

Image via Flickr - stignygaard

The Breathe London project uses cutting-edge air pollution sensors and models to test new ways of understanding air quality in Greater London. Data from the Breathe London project is intended to provide “hyperlocal” insights about the varying levels of pollution across the city, from street to street and neighbourhood to neighbourhood.

Operating from late 2018 through mid-2020, Breathe London aims to provide data and analysis to support policy making, policy evaluation and increased citizen engagement. The project advances existing monitoring methods, tests new lower-cost sensors and evaluates new ways to visualise and present data. Air pollution measurements from the project can also be used to further validate and improve the models currently used to assess and forecast air quality in London.
The technical team is composed of Air Monitors Ltd., Cambridge Environmental Research Consultants, Environmental Defense Fund Europe, Google Earth Outreach, National Physical Laboratory, University of Cambridge and King’s College London.

Stationary monitoring

Air quality monitor outside Madame Tussauds

The Breathe London stationary network is made up of 100 AQMesh pods, each containing a collection of small sensor-based air quality monitors that offer near real-time localised air quality information. They measure nitrogen dioxide (NO₂) and nitric oxide (NO) using electrochemical sensors; particulate matter (PM) in various size cuts (PM2.5 and PM10 are reported) using a light-scattering optical particle counter; and carbon dioxide (CO₂) using a non-dispersive infrared absorbance sensor. In some locations, the pods also measure ozone (O₃) using electrochemical sensors. The pods measure temperature, humidity and air pressure so that readings can be corrected for environmental conditions. Each sensor pod is set up to collect data continuously in 10-second intervals and create an average every 1-15 minutes, synchronised to the top of each hour. Data presented on the Breathe London website is shown as hourly averages, with a small lag from real time before it appears online.
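The step from sub-hourly pod averages to the hourly values shown on the website can be sketched as follows. This is an illustration only: it assumes each sub-hourly average is timestamped at the end of its interval and that hourly values follow the hour-ending convention used in the downloadable files. The function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hour_ending(ts):
    """Return the hour-ending label for an interval that ends at ts."""
    floor = ts.replace(minute=0, second=0, microsecond=0)
    # An interval ending exactly on the hour closes that hour;
    # anything later belongs to the next hour-ending label.
    return floor if ts == floor else floor + timedelta(hours=1)


def hourly_means(readings):
    """Average (timestamp, value) pairs into hour-ending bins."""
    bins = defaultdict(list)
    for ts, value in readings:
        bins[hour_ending(ts)].append(value)
    return {label: sum(vals) / len(vals) for label, vals in sorted(bins.items())}


# Hypothetical 15-minute averages from one pod.
readings = [
    (datetime(2019, 6, 1, 10, 15), 20.0),
    (datetime(2019, 6, 1, 10, 30), 22.0),
    (datetime(2019, 6, 1, 10, 45), 24.0),
    (datetime(2019, 6, 1, 11, 0), 26.0),
    (datetime(2019, 6, 1, 11, 15), 30.0),
]
hourly = hourly_means(readings)
```

Here the four intervals ending between 10:15 and 11:00 form the hour ending 11:00, and the 11:15 interval starts the hour ending 12:00.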

Site selection: The 100 pods are located across Greater London. Locations are identified based on a number of criteria developed in consultation with the Greater London Authority. These criteria include:

  • Coverage in all 32 London boroughs plus the City of London.
  • Filling gaps in the existing network of government air quality monitors.
  • Prioritising “sensitive” locations, such as primary schools and medical facilities.
  • Supporting assessments of the impact of new policies designed to reduce air pollution, such as the Ultra-Low Emission Zone (ULEZ), the Expanded ULEZ and the Low-Emission Bus Zones (LEBZ).
  • Distribution across a mix of traffic levels and varying distances from major roads and intersections, parks, residential areas, high-traffic streets and other commercial areas.
  • Reserving three of the pods (termed “gold pods”) for long-term performance evaluation using periodic co-location studies alongside reference instruments.

Stationary data verification and quality assurance

These devices are not intended to provide equivalent accuracy to conventional (i.e., reference) monitoring methods, but rather to provide information across a wide area in many locations at a much lower cost. As such, these devices are not “calibrated” in the normal manner with known standard materials. Their accuracy is defined by periodic co-location with conventional monitors and comparison with each other. Various Quality Assurance/Quality Control (QA/QC) checks are carried out from factory to first installation and throughout the deployment. Data is evaluated in ‘stages’ with each stage adding one or more quality assurance steps to the previous stage.

Stage 0 (Factory Settings)

Data at this stage is in the form that Air Monitors receives from the pod manufacturer, Environmental Instruments Ltd (EI Ltd), after application of factory QA/QC. Prior to shipping to the customer, each individual gas or PM sensor is characterised by EI Ltd in terms of sensitivity and offset. This data is unique to each sensor and is used by the AQMesh processing algorithm to apply corrections for cross-gas interference and environmental conditions.

When the individual sensors are combined into a “pod”, they are subjected to a minimum co-location period of seven days at the factory in Stratford-upon-Avon, where they are compared with a range of reference-grade monitors. This process determines a scaling factor, or default slope and offset, for each sensor, which in effect “calibrates” them against the reference instruments. Each data point generated by the pods at this stage is accompanied by a timestamp and a single status code, determined by EI Ltd.

Stage 1 (Empirically Verified)

Concentrations of certain pollutants at the pod’s final field location may be quite different from those experienced during factory co-location. In order to ensure that the pods are correctly calibrated for the range of environmental conditions present at their field location, Stage 0 data is then adjusted with scaling factors determined through one of three methods:

  • Pre-deployment reference site co-location: Prior to initial field placement, about one-third of the project pods were co-located with a reference monitor in Greater London for approximately 3-7 days. After this period, linear regressions were performed to determine slopes and offsets between the reference site and pod data.
  • “Gold” pod co-location: “Gold” pods are standard AQMesh pods which have been co-located at one or more reference monitoring locations, providing traceable evidence of the gold pod’s performance. After the pod has been characterised, it is moved adjacent to a “candidate” pod located in the network for a period of approximately 7-14 days. After this period, a linear regression is performed to determine the slope and offset between candidate and gold pod. These scaling factors are then applied to the pre-scaled data if the slope and offset are statistically different from 1 and 0, respectively (at the 95% confidence level). If the differences do not meet these thresholds, Stage 0 data becomes Stage 1 data without further adjustment.
  • Experimental network-based calibration method: To maximize the number of sites for which we can publish preliminary data, we are also trialing the use of an experimental calibration method being developed by the Cambridge group, which aims to scale the entire network without the need for gold pods or reference co-locations for each pod. The separation of local sources immediately adjacent to a pod site from the non-local background pollutant levels, which are often consistent over substantial distances (10s to 50s of kilometres), allows scaling of pods across the network. The experimental approach involves selecting periods when non-local pollutant levels are likely to be relatively stable over the study area to determine relative pod sensitivities. The entire network is then scaled relative to an AQMesh pod co-located with a reference monitoring instrument. The scale separation methodology has been previously demonstrated by Heimann et al. (2015) and Popoola et al. (2018). Results using this method will continue to be evaluated throughout the project by comparisons with calibration factors (slopes and offsets) derived from direct co-locations.
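The gold pod decision rule described above can be sketched as a small regression routine. This is a minimal illustration under stated assumptions, not the project's implementation: it regresses gold pod values on candidate pod values with ordinary least squares, uses 1.96 as a large-sample approximation to the 95% critical value (a t-distribution value would be more appropriate for short series), and applies the scaling when either the slope or the offset differs significantly. All function names and data values are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares y = slope * x + offset, with standard errors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    offset = my - slope * mx
    resid_ss = sum((yi - (slope * xi + offset)) ** 2 for xi, yi in zip(x, y))
    sigma2 = resid_ss / (n - 2)  # residual variance
    se_slope = math.sqrt(sigma2 / sxx)
    se_offset = math.sqrt(sigma2 * (1 / n + mx ** 2 / sxx))
    return slope, offset, se_slope, se_offset


def stage1_scale(candidate, gold, z=1.96):
    """Scale candidate data only if slope or offset differs from 1 or 0."""
    slope, offset, se_slope, se_offset = fit_line(candidate, gold)
    # Assumption: a significant difference in EITHER parameter triggers scaling.
    differs = abs(slope - 1) > z * se_slope or abs(offset) > z * se_offset
    if not differs:
        return list(candidate)  # Stage 0 data passes through unchanged
    return [slope * c + offset for c in candidate]


# Hypothetical co-location data.
candidate = [10.0, 20.0, 30.0, 40.0, 50.0]
gold = [15.1, 26.9, 39.1, 50.9, 63.0]          # candidate reads low
gold_similar = [10.5, 19.4, 30.6, 39.5, 50.0]  # candidate agrees within noise

scaled = stage1_scale(candidate, gold)
unchanged = stage1_scale(candidate, gold_similar)
```

In the first case the fitted slope is far enough from 1 that the candidate data is rescaled; in the second the fit is statistically indistinguishable from a 1:1 line, so the data is left as it is.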

All three methods in this stage rely on comparisons of pod data to reference site data. Since empirical scaling factors may be determined prior to subsequent data ratification of the reference network, there may be errors in the reference data that subsequently necessitate correction of sensor scaling factors. This is considered in Stage 4. Any other measurement artefacts during a field co-location would be reviewed once the ratified reference data were available.

A small number of sites in hospital locations are published unscaled; these are labelled as ‘Hospital Pod’ in the downloadable ‘site metadata’ files and are currently undergoing calibration using the experimental network-based calibration method.

Photo source: Air Monitors Ltd

Stage 2 (Manual QA/QC)

Air Monitors’ technical staff conduct a manual quality review of the data each week and, as needed, flag any suspect data for review in later stages. Data flagged through this process is withheld from initial publication, though it may be released at a later date if determined to be valid.

Stage 3 (Automated QA/QC)

In Stage 3, scaled data from Stage 2 are automatically reviewed against high and low limits. Additional flags are added at this point if data exceeds any pre-set concentration limits or if the PM mass fractions are not credible (e.g. PM2.5 > PM10).
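A check of this kind might look like the sketch below. The concentration limits, field names and flag names are hypothetical placeholders chosen for the example; the project's actual limits are not given on this page.

```python
# Hypothetical (low, high) limits in µg/m³; the real pre-set limits are not published here.
LIMITS = {
    "no2": (0.0, 500.0),
    "pm25": (0.0, 300.0),
    "pm10": (0.0, 500.0),
}


def stage3_flags(record):
    """Return a list of automated QA/QC flags for one hourly record."""
    flags = []
    for pollutant, (low, high) in LIMITS.items():
        value = record.get(pollutant)
        if value is None:
            continue  # nothing to check for a missing value
        if value < low:
            flags.append(f"{pollutant}_below_limit")
        elif value > high:
            flags.append(f"{pollutant}_above_limit")
    # PM2.5 is a subset of PM10, so PM2.5 > PM10 is not physically credible.
    pm25, pm10 = record.get("pm25"), record.get("pm10")
    if pm25 is not None and pm10 is not None and pm25 > pm10:
        flags.append("pm_fraction_not_credible")
    return flags


clean = stage3_flags({"no2": 40.0, "pm25": 12.0, "pm10": 20.0})
suspect = stage3_flags({"no2": 650.0, "pm25": 30.0, "pm10": 20.0})
```

The second record is flagged twice: once for NO₂ above the assumed upper limit and once because PM2.5 exceeds PM10.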

Stage 4 (Special Issues)

This stage captures specific issues encountered during the deployment that the project team intends to address before finalising data, but not before initial publication. Examples include accounting for changes made to provisional reference network data during ratification and correcting PM data for the effects of relative humidity.

Stationary data considerations

Provisional data publication:

In order to display information in near-real time, data shown on the project website prior to completion of the project is provisional and subject to change as it undergoes additional quality assurance checks.

NO₂: The provisional NO₂ data is verified through gold pod co-location or initial reference site co-location. For sites where neither of these scaling factors is available, the data shown has been adjusted using the network-based calibration method if the associated goodness-of-fit parameter is sufficiently high.

For NO₂ data scaled using the ‘gold pod co-location’ method, initial estimates of uncertainty at the EU limit value range from ±10% to ±20% for most pods. A minority of pods have estimated uncertainties of up to ±40%. Uncertainty is generally lower at higher pollutant concentrations. The project team will be reviewing and updating these uncertainty estimates over the course of the project. The range of uncertainties may narrow as the data undergoes additional QA/QC.

Uncertainty in data scaled using the network-based method is estimated to be ± 25% at this stage of the project. Improved uncertainty estimates will be available as the project progresses.

For comparison, final ratified data from reference instruments in the London Air Quality Network have an estimated uncertainty for NO₂ measurements of ±10% at the EU limit value; the current uncertainty requirement for reference or equivalent monitors is ±15%.

PM2.5: The provisional PM2.5 data is produced exclusively using the network-based calibration method, as this method can more effectively disaggregate the contribution of background PM2.5 levels across Greater London. The network-based method yields results broadly consistent with those from co-locations using ‘gold pods’ (agreement of medians during reference co-location periods within 10-30%). A comparison of scaled AQMesh measurements using the network-based method (after filtering periods of high humidity or fog) and co-located reference measurements showed median-normalized root mean square errors of 17-36%.
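One plausible reading of the median-normalized root mean square error quoted above is the RMSE between pod and reference values divided by the median reference value. The sketch below follows that reading; the exact definition used by the project team is not stated here, so the normalisation choice, function name and sample values are assumptions.

```python
import math
import statistics


def median_normalised_rmse(pod, reference):
    """RMSE of pod vs reference, divided by the median reference value."""
    # Keep only hours where both instruments reported a value.
    pairs = [(p, r) for p, r in zip(pod, reference) if p is not None and r is not None]
    rmse = math.sqrt(sum((p - r) ** 2 for p, r in pairs) / len(pairs))
    return rmse / statistics.median(r for _, r in pairs)


# Hypothetical co-located hourly PM2.5 values in µg/m³.
pod = [12.0, 9.0, None, 15.0, 11.0]
reference = [10.0, 10.0, 11.0, 12.0, 12.0]

nrmse = median_normalised_rmse(pod, reference)
```

For these sample values the result is roughly 0.18, i.e. an error of about 18% of the median reference concentration.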

Currently, the network-based scaling method produces results relative to a selected gold pod. Analysis is ongoing to evaluate the sensitivity of the network-based scaling factors to different gold pods. Because measurements from reference instruments used to calibrate gold pods are provisional, the reported Breathe London measurements are also provisional and subject to revision when reference measurements are ratified (early to mid-2020).

Preliminary evaluation of AQMesh PM2.5 measurements against reference monitors suggests the possibility that high relative humidity and/or fog can lead to AQMesh measurements that are spuriously elevated. The project team is developing a method to correct for the effect of humidity/fog on measurements based on recently published work (Crilley et al. 2018). We anticipate including this correction in future data releases.

The AQMesh uses an optical particle counter (OPC) to estimate particulate matter mass concentrations, and only particles larger than ~300 nm are counted. In an environment with fresh emissions of small (nano)particles, the portion of particle mass from particles smaller than ~300 nm would not be detected by an OPC. The effect of this undercounting is greatest when the instrument is placed close to roadways and other combustion sources.

Data download:

The full NO2 dataset can be downloaded here.

The full PM2.5 dataset can be downloaded here.

The data files will be updated weekly. Because data is subject to change with additional QA/QC, it is recommended that users download a new data file each time they wish to work with the data, rather than using a previously downloaded file. Breathe London data is licensed under the Open Government Licence.

You can further explore and make customised visualizations with these datasets on the Air Quality Data Commons platform.

Data platform:

The Breathe London data platform provides the data and visualisations for the Breathe London website. The platform is based on the Google Cloud, which enables user-friendly performance when querying these large datasets to provide graphs and visualisations, and ensures the replicability and scalability of the platform to other cities around the world. The platform is open-source and is capable of ingesting data automatically from AQMesh pods and also other monitor networks such as the London Air Quality Network and Defra’s Automatic Urban and Rural Network. The platform stores Stage 0 data and calibration factors separately, and supports the QA/QC process by allowing the technical team to modify calibrations and redact suspect data. Third-party platforms and apps can connect to the platform through standardized APIs. The Breathe London platform is developed and maintained by Cambridge Environmental Research Consultants.

Mobile monitoring

Image via Nick Martin, NPL

Two Google Street View cars, equipped with reference-type air quality monitors, measured air pollution over approximately 600 eight-hour shifts between autumn 2018 and autumn 2019. The cars used fast-response, research-grade instruments to precisely measure pollution concentrations approximately every 1-10 seconds on a variety of London roadways. Pollutants measured include black carbon (BC), CO₂, NO, NO₂, O₃ and PM2.5 (and other PM size ranges). The National Physical Laboratory was responsible for regular checks of instrument performance and periodic calibrations.

The cars collected data from early morning to late evening, Monday to Friday, providing a representative view of on-road air pollution during these hours. The mobile monitoring routes were sampled at different times of day, days of the week and times of year, with a target of at least approximately 15 passes of each route over the course of the study (however, traffic congestion and other factors affected this number).

The initial sampling plan included full coverage of the Ultra-Low Emission Zone (ULEZ) and targeted driving routes in select “polygons” outside of the ULEZ. The project team selected these routes based on predicted high and low NO₂ concentrations, using CERC’s ADMS-Urban 2012 model for NO₂ (summarized at the postcode district level), as well as randomly selected areas of high and low Index of Multiple Deprivation score (see Index of Multiple Deprivation 2015).

At the end of the project, air pollution maps will reveal the spatial variations in air pollution at the level of city blocks in the polygons where repeated sampling targets were achieved. The project team periodically reviewed the progress toward data collection targets and made changes to the sampling areas as needed to achieve project goals and objectives.

Mobile monitoring data considerations

Credit - Julie-Anne Hogbin, EDF Europe

Mobile data on the Breathe London map is presented as median concentrations, representing the expected on-road value during weekday daytime hours over the monitoring period between August 2018 and October 2019. We required a minimum of 10 visits to a road location in order to report a median value. Not all roads were sampled the same number of times, at the same times of day or on the same days. Different weather conditions, which influence regional background concentrations at the time of sampling, can introduce uncertainty in our estimates of median concentration for each segment. For NO₂ we estimate that 95% of segments are subject to uncertainty of less than ±50%. Uncertainty is lower when we have more data: for segments with 30 drives, uncertainty decreases to less than ±30%. We anticipate that the uncertainty will decrease with ongoing efforts to refine analytical methods.
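The reporting rule above (a median per road segment, shown only where there were at least 10 visits) can be sketched as follows. The sketch assumes each visit has already been reduced to a single concentration value per segment; the segment identifiers, function name and values are hypothetical.

```python
import statistics
from collections import defaultdict

MIN_VISITS = 10  # minimum visits required to report a median, as stated above


def segment_medians(visits, min_visits=MIN_VISITS):
    """visits: iterable of (segment_id, concentration), one entry per visit."""
    by_segment = defaultdict(list)
    for segment_id, value in visits:
        by_segment[segment_id].append(value)
    # Segments with too few visits are left out rather than reported.
    return {
        segment_id: statistics.median(values)
        for segment_id, values in by_segment.items()
        if len(values) >= min_visits
    }


# Hypothetical data: segment "A" has 11 visits, segment "B" only 9.
visits = [("A", float(v)) for v in range(20, 31)] + [("B", 40.0)] * 9
medians = segment_medians(visits)
```

Only segment "A" appears in the result; segment "B" falls one visit short of the threshold and is not reported.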

Particulate matter instruments on the mobile platform used optical scattering to estimate particle mass, a technique that is not sensitive to very small particles (< ~200 nm). In on-road and similar environments with fresh emissions of small (nano)particles, the portion of particle mass from particles smaller than ~200 nm would not be captured by these instruments. Instruments that quantify nanoparticles (such as the Naneos Partector) can provide an indication of the potential effect of this undercounting.

Data download:

The analysed mobile data shown on the Breathe London map can be downloaded here. The underlying data (at 1Hz measurements) can be accessed on the Air Quality Data Commons platform.

Identifying local pollution sources

Objective

The CERC ADMS model was used to assess the maximum potential impact of the Ultra Low Emission Zone (ULEZ) compliance criteria on total NOx concentrations at Breathe London AQMesh, London Air Quality Network (LAQN) and Air Quality England (AQE) sites in Central London during the period 1st April 2019 to 12th December 2019.

ADMS Model Setup

  • Modelled hourly concentrations from 1st April 2019 to 12th December 2019
  • Modelled 294 receptor locations across London. 107 were LAQN monitor locations, 43 were AQE locations and 144 were AQMesh sensor locations (past and present). 178 receptor locations were kerbside/roadside and 50 were inside the ULEZ.
  • Run with ‘standard’ and ‘ULEZ’ emissions:
    • ‘Standard’ emissions use the LAEI 2013 dataset, interpolated to 2018, which has 67% compliance with the ULEZ criteria.
    • ‘ULEZ’ emissions are the standard emissions modified to be 100% compliant with ULEZ criteria
  • Specific diurnal profiles were applied to each vehicle category
  • Pollutants were apportioned into 12 Non-Traffic and 6 Traffic Components

Sensitive receptors

  • The ADMS-Urban model used emissions of NOx taken from the London Atmospheric Emissions Inventory (LAEI) published by the GLA.
  • This work used ‘LAEI 2013’, which was published in 2016, has a base year of 2013 and includes projections for 2020. It used annual average values for 2019, obtained by interpolating between the base year values and the projections for 2020.
  • Modelled annual concentrations (µg/m³) at 1,890 nurseries and primary schools (sensitive receptor locations) across Greater London for 27 different pollution sources.
  • Sensitive receptors were modelled at 1m above ground.
  • Rather than modelling pollution directly on or above buildings, new locations were created by selecting the nearest road section within 100m of the original location.
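The interpolation between the LAEI 2013 base year and the 2020 projections can be illustrated with a short calculation. The sketch assumes simple linear interpolation, which the text does not confirm, and the emission values are invented for the example.

```python
def interpolate_emission(base_value, projected_value, target_year,
                         base_year=2013, projection_year=2020):
    """Linearly interpolate an emission value between two inventory years."""
    fraction = (target_year - base_year) / (projection_year - base_year)
    return base_value + fraction * (projected_value - base_value)


# Hypothetical NOx emissions for one source (arbitrary units).
base_2013 = 100.0
projected_2020 = 65.0

estimate_2019 = interpolate_emission(base_2013, projected_2020, 2019)
estimate_2018 = interpolate_emission(base_2013, projected_2020, 2018)
```

With these sample values, 2019 sits six-sevenths of the way from 2013 to 2020, giving 70.0, and 2018 sits five-sevenths of the way, giving 75.0.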

‘ULEZ emissions’ assumptions

  • All non-compliant vehicles were removed, i.e. 100% ULEZ compliance was assumed. Results therefore reflect an assessment of the maximum, rather than realistic, impact of the ULEZ
  • Total traffic volume in each main category (e.g. cars, LGVs, HGVs, buses) was assumed to be unchanged by the ULEZ, and the relative distribution of the remaining compliant vehicle sub-categories within the categories was also assumed to be unchanged
  • In the standard emissions dataset, more diesel vehicles were non-compliant than petrol cars; therefore, although the total number of cars remains unchanged in the ULEZ emissions, there are more petrol cars and fewer diesel cars in the ULEZ emissions than in the standard emissions
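The fleet adjustment described in these assumptions can be sketched for a single main vehicle category. This is an illustration under stated assumptions rather than the CERC method: non-compliant sub-categories are set to zero and the compliant ones are scaled up in proportion so that the category total is unchanged. The sub-category names and vehicle counts are hypothetical.

```python
def ulez_fleet(fleet):
    """fleet: {sub_category: (count, is_compliant)} for ONE main category.

    Removes non-compliant vehicles and rescales compliant sub-categories
    so the category total and their relative shares are preserved.
    """
    total = sum(count for count, _ in fleet.values())
    compliant_total = sum(count for count, ok in fleet.values() if ok)
    scale = total / compliant_total
    return {
        name: (count * scale if ok else 0.0)
        for name, (count, ok) in fleet.items()
    }


# Hypothetical car fleet in the standard emissions scenario.
standard_cars = {
    "petrol_compliant": (500.0, True),
    "petrol_non_compliant": (100.0, False),
    "diesel_compliant": (200.0, True),
    "diesel_non_compliant": (200.0, False),
}
ulez_cars = ulez_fleet(standard_cars)
```

In this example the car total stays at 1,000, while petrol cars rise from 600 to about 714 and diesel cars fall from 400 to about 286, which is the shift towards petrol described in the last bullet.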

Complete technical documentation will be available at the end of the project. For additional information, contact hello@breathelondon.edf.org.