Sources of air quality data – API and custom reports
Air quality information is not an easy piece of content to deliver to your audience.
Whether you are a product owner building an app for allergy sufferers or a content director at a weather news channel, the sheer variety of sources may confuse you. This article will guide you through the essentials.
Types of air pollution
Particulate matter (PM) consists of various particles suspended in the air, including dust, water, and organic chemicals. It also varies in size: some PM is visible to the naked eye (such as soot and smog), while other, more dangerous particles are incredibly small and are known as fine and ultrafine particles. PM10 refers to particles with a diameter of up to 10 microns, PM2.5 up to 2.5 microns, and PM1 up to 1 micron. Long-term exposure to high levels of PM can cause asthma, and has also been identified as a cause of lung conditions and cancer.
The other form of air pollution is gaseous. Common gas pollutants are nitrogen dioxide (NO2), ozone (O3) and sulphur dioxide (SO2). NO2 is released into the atmosphere via many man-made processes, such as power stations and vehicles. Prolonged exposure to NO2 can result in respiratory infections in children and difficulty breathing in adults. O3 is the result of chemical reactions between sunlight and gases in the atmosphere that have been emitted from cars and other sources. Too much exposure to O3 can cause difficulty breathing and aggravate lung diseases such as asthma and emphysema. Another gaseous pollutant, SO2, is produced by burning fuels that contain sulphur, and prolonged exposure can also cause changes in lung function and increase the risk of getting bronchitis, among other illnesses.
It is therefore becoming increasingly important to both citizens and governments to monitor air quality, specifically the levels of PM and gaseous pollutants in busy urban areas. Air quality data and monitoring systems are a key part of that.
Sulfurous smog, also known as London smog, is caused by a high concentration of sulphur oxides (SOx) in the air. This is caused by the burning of sulphur-based fuels such as coal. This occurs primarily in regions where there is relatively more burning of sulfur compounds to produce heat and energy. The smog is usually exacerbated by the presence of suspended PM in the air and high levels of humidity.
Photochemical smog – also known as LA-type smog – is caused by nitrogen oxides (NOx) and hydrocarbon vapors, primarily resulting from automobile emissions. These then react photochemically with the sunlight to form ozone (O3), nitrogen dioxide (NO2) and hydrocarbons. It occurs mainly in regions with high levels of automobile emissions and high levels of sunlight.
Top 3 types of air quality data available on the market
1. Air quality monitoring reference stations:
These are essentially mains powered enclosures which contain analyzers capable of reference-grade measurements. They are extremely large structures which provide air quality monitoring from a fixed point. If data from several reference stations is combined, it is possible to model air quality for larger urban areas.
Reference stations are often used in conjunction with enforcing the legal limits of pollutants. For example, every European Union member state participates in the European air quality database, known as AirBase, which contains monitoring data covering EU member states, EEA member countries and some EEA collaborating countries. These countries report their measurements on a daily basis for a set of pollutants, and compile reports outlining overall air quality from a representative set of stations. This reporting is required by law.
There are several types of reference stations. The main method of measuring PM is by using filter-based gravimetric samplers, which European countries are required to use as specified in the EU First Air Quality Directive. There are three types of samplers: super-high-volume, high-volume and low-volume. Each of these has a sampling inlet capable of measuring PM10. The amount of PM10 is then determined gravimetrically: air containing PM is drawn through a filter, and the weight of the blank filter is subtracted from the weight of the filter containing PM to calculate the collected mass. These samplers can also be configured to measure PM2.5 by adjusting the filter. The low-volume sampler is the most commonly used, as the super-high-volume sampler is not suitable for the ambient air environment.
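The gravimetric calculation described above can be sketched in a few lines. This is a minimal illustration, not a reference method: the function name, the flow rate and the filter weights are assumptions chosen for the example, and the division by sampled air volume is the usual way to turn a collected mass into a concentration.

```python
def pm_concentration_ug_m3(blank_filter_mg, loaded_filter_mg,
                           flow_rate_m3_per_h, sampling_hours):
    """Return the mean PM concentration in micrograms per cubic metre.

    All argument names are illustrative; they are not taken from any standard.
    """
    if flow_rate_m3_per_h <= 0 or sampling_hours <= 0:
        raise ValueError("flow rate and sampling time must be positive")

    # Mass of PM collected = loaded filter weight minus blank filter weight.
    collected_mass_mg = loaded_filter_mg - blank_filter_mg
    if collected_mass_mg < 0:
        raise ValueError("loaded filter cannot weigh less than the blank filter")

    # Volume of air drawn through the filter during the sampling period.
    sampled_volume_m3 = flow_rate_m3_per_h * sampling_hours

    # Convert mg to micrograms, then divide by the sampled volume.
    return collected_mass_mg * 1000.0 / sampled_volume_m3


# Assumed example values: a 24-hour sample at 2.3 m3/h that collects 1.38 mg.
concentration = pm_concentration_ug_m3(150.00, 151.38, 2.3, 24)
print(f"PM10: {concentration:.1f} ug/m3")
```

With these assumed numbers the filter gains 1.38 mg over 55.2 cubic metres of air, which works out to about 25 micrograms per cubic metre.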
Another popular type of analyzer is the tapered element oscillating microbalance (TEOM). This is widely used for providing public information and measures PM by analyzing the frequency of oscillations of a glass tube.
Reference stations are extremely accurate. Their measurements consist of a multi-annual time series of air quality measurement data and statistics for a number of air pollutants. It is possible to obtain air quality data from a number of points across many countries around the world. When all of this is combined, the meta-information can give details on monitoring networks around the world, and on how measurements are taken at various stations.
While these stations are extremely accurate, they are also extremely expensive. Each station requires significant financial investment for its various components, which include an enclosure, AC and heating systems, power supply, telephone connection and calibration systems, to name just a few. On top of this, each station requires regular service and maintenance. All of these elements mean that reference stations are also quite complicated and costly to set up. The initial investment for a single measurement point can range from 5,000 to 30,000 EUR depending on the type and complexity, and this does not include yearly maintenance, electricity and other service fees, which can add several thousand EUR per year at the low end. In addition to being expensive, reference stations only provide air quality data from a fixed point. Due to their size and complexity, reference stations cannot be placed in densely populated locations where the air is most polluted. Although it is possible to model air quality for these urban areas by using data from several stations, the result is neither as accurate nor as close to real time as data from a denser network of sensors would be.
Depending on the country, the data can also be delivered with a delay of several hours or more, meaning that it can often be out-of-date by the time it is published as air quality can change incredibly quickly.
2. Low cost sensors
Low-cost sensors (LCS) are used to collect real-time, high-resolution spatial and temporal air quality data. These can be PM sensors, gas sensors, or a combination of both. LCS often incorporate dense networks of many sensors, and therefore continuous monitoring can help identify spots of a town or city where air pollution is particularly high. This plays an important role in helping local authorities or governments identify mitigation measures, especially with regard to traffic management or combustion policies.
Although in the past LCS were not comprehensive enough to create reliable air quality networks for urban areas and cities, this has changed due to recent advancements in technology.
A popular form of PM sensor, and therefore one that is often used in portable LCS, is the optical analyser. These measure PM concentrations using the interaction between particles and infrared or laser light. Laser light scattering is used to classify PM by number and size, and a single analyser can measure a range of sizes simultaneously, such as PM10, PM2.5 and PM1. These sensors can be used in a number of locations to build up a network of knowledge of PM distribution by space and time, but they are also used in reference stations to provide quick results.
Gas sensors often utilize electrochemical processes to generate a current in the presence of targeted gases. Gas molecules undergo a reaction at an electrode onto a receptor. The interaction between the gas molecules and the surface of the receptor can reveal information about density, mass and temperature. These sensors can measure common air pollutants, such as NO2, NO, O3, SO2, CO and H2S.
Currently, it is fairly easy to obtain PM and gas sensors from a range of manufacturers, such as Airly, Honeywell, Alphasense, Plantower or Sensirion. Different sensor systems use different principles to measure the concentrations of atmospheric pollutants.
The advantages of LCS are that they are, as the name suggests, relatively cheap, especially when compared to reference stations. This means they are particularly suited to urban air quality analysis, as many sensors can be placed around a town or city in order to provide a high level, detailed overview of air pollution. Another advantage is their small, portable size. This makes them easy to mount and install around urban areas. Technological advances have allowed such inexpensive and portable sensors to measure many PM sizes simultaneously and many gas compounds.
Readings from LCS are currently not considered for regulatory purposes in Europe, due to the strict restrictions on data quality. These sensors depend on assumptions about the characteristics of PM and gases, and these characteristics can vary depending on location due to temperature, humidity and time of day. This means that LCS are not fully accurate unless they are calibrated to reference stations and local conditions.
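To make the idea of calibrating against a reference station concrete, here is a minimal sketch that fits a straight line between paired sensor and reference readings using ordinary least squares. It is an illustration only: the function names and sample values are assumptions, and real calibrations usually also correct for factors such as humidity and temperature, which this sketch ignores.

```python
def fit_linear_calibration(sensor_readings, reference_readings):
    """Fit reference = slope * sensor + intercept by ordinary least squares."""
    n = len(sensor_readings)
    if n != len(reference_readings) or n < 2:
        raise ValueError("need at least two paired readings")

    mean_x = sum(sensor_readings) / n
    mean_y = sum(reference_readings) / n

    # Sum of squared deviations of the sensor readings.
    sxx = sum((x - mean_x) ** 2 for x in sensor_readings)
    if sxx == 0:
        raise ValueError("sensor readings must not all be identical")

    # Sum of cross-deviations between sensor and reference readings.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sensor_readings, reference_readings))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def calibrate(raw_value, slope, intercept):
    """Apply the fitted correction to a raw low-cost sensor reading."""
    return slope * raw_value + intercept


# Assumed co-location data: the low-cost sensor over-reads at high levels.
sensor = [10.0, 20.0, 30.0, 40.0]
reference = [8.0, 14.0, 20.0, 26.0]

slope, intercept = fit_linear_calibration(sensor, reference)
corrected = calibrate(50.0, slope, intercept)
```

In practice the paired readings would come from a period when the sensor is installed next to a reference station, and the fit would be repeated as local conditions change.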
Finally, the reliability of LCS ultimately depends on the quality of materials the manufacturer uses and the principles applied to the measuring of compounds.
At Airly we do both, among only a few other suppliers.
3. Satellites
It is also possible to analyze PM and gaseous pollutants using satellite imagery. Satellites can measure the concentration of various particles in the Earth’s atmosphere by analyzing how much light reaches the surface of the Earth and how much is reflected. These measurements are compared with ozone concentrations and visibility. This is known as aerosol optical depth measuring. Satellites also have the capability of measuring PM2.5 near the ground.
Various space agencies have recently launched satellite instruments that are designed to measure air pollution from space. South Korea for example has the Geostationary Environment Monitoring Spectrometer (GEMS) which makes hourly measurements of pollutants in Asia. NASA has also scheduled a satellite, Tropospheric Emissions: Monitoring of Pollution (TEMPO), to join GEMS in 2022, and the ESA is expecting to launch its Sentinel-4 satellite in 2023. The ESA’s Sentinel satellites form a part of the EU’s Copernicus program, which is aimed at providing accurate and accessible information to improve environmental management and mitigate the effects of climate change.
There are obvious advantages to using satellites for air quality data. Firstly, they can potentially be very accurate. Their coverage is reliable and global, and there is around a decade’s worth of calibrated satellite measurements that is freely available. This means that using satellite data often proves to be cost effective. They are also capable of measuring aerosols that have lofted above the Earth’s surface, which ground sensors would not be able to detect.
There are, however, significant disadvantages to using satellite data to learn about air quality at the street level on a daily basis. A major criticism of satellite-based methods for air quality measuring is that they can only provide reliable estimations of aerosols and PM2.5 concentrations when there are no clouds. Satellites are also not able to detect, on their own, how high above the ground air pollution is. It is difficult to model the vertical distribution of dust, and satellites are therefore often not very accurate in estimating the concentrations of PM in the air humans actually breathe. Finally, the availability of data from satellites depends on their trajectory and the resolution needed; you may have to wait several hours for high-resolution data and only be able to download it once a day.
4. Other sources
There are also other, lesser-used methods of measuring air quality, such as interpolation and microsensors. Spatial interpolation is a method that incorporates information on the geographic position of sample data points. The procedure involves predicting unknown values using known ones at neighboring locations. However, this method has shortcomings. For example, monitoring sites are typically located to detect emissions (the lowest concentrations of pollutants in a given neighbourhood). These locations, however, are not very close to many of the sources of emissions, so these sources are not modeled and their impact is neglected. In Poland, for example, you have data on all low emission sources, but you don’t know what is being burned in them and when, so any modelling technique is most often wrong, even by an order of magnitude. Therefore, in many cases interpolation fails to describe the spatial variability of air pollution. If you have a whole model constructed for one big city and only one official station to calibrate the model output, it is like trying to fit a line through a single given point: there is an infinite number of ways to do that. To make interpolation work, it is also necessary to have very accurate information about land topography and traffic.
Personal microsensors can also be used to measure air quality. These are similar to low-cost sensors, and information from these sensors can be used in conjunction with other data sources to paint an overall picture of air quality in a certain area. However, there are no standardized measures for the placement of these sensors, so readings can come from a range of heights, both indoors and outdoors. Calibration is therefore difficult, and the readings cannot be used to inform a visual map of air pollution. The lack of reliability of both interpolation and microsensors hinders their use for professional purposes, and they are therefore not recommended.
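As a rough illustration of predicting an unknown value from known values at neighboring locations, the sketch below uses inverse distance weighting (IDW). IDW is only one of several interpolation techniques and is chosen here for brevity; the article does not say which one any provider uses. The station coordinates (in kilometres on a flat grid) and readings are assumed values.

```python
import math


def idw_estimate(target, stations, power=2.0):
    """Estimate a value at `target` from (x, y, value) station tuples.

    Nearby stations get a larger weight: weight = 1 / distance ** power.
    """
    numerator = 0.0
    denominator = 0.0
    for x, y, value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0:
            # The target sits exactly on a station, so use its reading.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    if denominator == 0:
        raise ValueError("at least one station is required")
    return numerator / denominator


# Two assumed stations 4 km apart, reporting 20 and 40 ug/m3.
stations = [(0.0, 0.0, 20.0), (4.0, 0.0, 40.0)]

midpoint = idw_estimate((2.0, 0.0), stations)      # equal weights
near_first = idw_estimate((1.0, 0.0), stations)    # dominated by station 1
```

The sketch also shows the weakness discussed above: the estimate depends only on distance to the stations, so an emission source sitting between them leaves no trace in the result.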
Formats used to deliver air quality data
When you want to use air quality data, you can typically have it delivered in one of the three most recognised formats:
- .csv files with semi-raw (typically averaged to an hour) data
- API – from which you request data programmatically
- .pdf as custom reports
CSV is the format used widely especially by reference station data owners. The best example here is the EEA, which allows users to download CSV files from member states. Data in the form of CSV is as raw and specific as can be, so can be used in all possible research cases. While it’s free, it typically requires a data analyst to extract information of business value.
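As an example of the kind of processing a CSV download needs before it has business value, the sketch below turns hourly readings into daily means. The column names (`timestamp`, `pm10`) and the sample rows are assumptions made for illustration; real files from the EEA or other data owners use their own layouts.

```python
import csv
import io
from collections import defaultdict

# Assumed sample of hourly data; the last row has a missing reading.
SAMPLE_CSV = """timestamp,pm10
2021-03-01T00:00,30
2021-03-01T01:00,50
2021-03-02T00:00,20
2021-03-02T01:00,
"""


def daily_means(csv_text, value_column="pm10"):
    """Average hourly readings per calendar day, skipping empty values."""
    totals = defaultdict(lambda: [0.0, 0])  # day -> [sum, count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(value_column) or "").strip()
        if not raw:
            continue  # missing measurement for this hour
        day = row["timestamp"][:10]  # keep the YYYY-MM-DD part
        totals[day][0] += float(raw)
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


means = daily_means(SAMPLE_CSV)
for day, value in sorted(means.items()):
    print(day, value)
```

Handling gaps, as with the empty reading in the sample, is typically where most of the analyst's effort goes.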
API is the most professional tool for any type of data delivery. It offers the most flexible means of analysing data and getting to the exact point you are looking for. While it’s undeniably the best source, it is also expensive to set up and maintain, as you need a developer in order to make full use of it. An important obstacle here is the variety of API formats used by data holders. As a result, on top of the primary APIs (usually offering data from one single measurement network for one country), there are commercially available master APIs such as Airly’s, which integrate them into one unified source ready for a programmer to use.
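The problem of differing API formats can be illustrated with a small adapter layer that maps each source's response onto one common record shape. Everything in this sketch is hypothetical: the two payload layouts, the source names and the unified field names are invented for the example and do not describe Airly's API or any real provider.

```python
def from_network_a(payload):
    """Convert a record in hypothetical flat format A to the unified shape."""
    return {
        "station_id": str(payload["station"]),
        "pollutant": "PM2.5",
        "value_ug_m3": float(payload["pm25"]),
        "measured_at": payload["time"],
    }


def from_network_b(payload):
    """Convert a record in hypothetical nested format B to the unified shape."""
    measurement = payload["measurement"]
    return {
        "station_id": str(payload["id"]),
        "pollutant": measurement["param"],
        "value_ug_m3": float(measurement["val"]),
        "measured_at": payload["ts"],
    }


# One adapter per primary API; adding a source means adding one entry here.
ADAPTERS = {"network_a": from_network_a, "network_b": from_network_b}


def unify(source, payload):
    """Return the payload from `source` as a unified record."""
    try:
        adapter = ADAPTERS[source]
    except KeyError:
        raise ValueError(f"no adapter registered for source {source!r}") from None
    return adapter(payload)


records = [
    unify("network_a", {"station": "A-1", "pm25": 12.5,
                        "time": "2021-03-01T10:00:00Z"}),
    unify("network_b", {"id": 7,
                        "measurement": {"param": "PM2.5", "val": "9.0"},
                        "ts": "2021-03-01T10:00:00Z"}),
]
```

Code consuming `records` only needs to know the unified field names, which is the convenience a master API provides without the maintenance burden.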
3. Custom reports
For the direct business use, air quality data can also be delivered as a ready-to-use report. This form combines the functionality range of API with a non-tech interface.
At the moment, we don’t know of any other company offering such a service, therefore we recommend visiting Airly’s product page in order to learn more about custom reports.
Where can you find air quality data
Choosing the right air quality data source depends on several factors that you should consider:
- Level of detail
- Level of reliability
- Forecast availability
Different providers have different coverage. Some of them offer worldwide data, while others specialize in particular regions. When deciding on a data supplier, you need to consider how detailed and reliable the data you want to present must be. Generalized air quality data can be accessed for free, although its source is often satellite-based measurement. More detailed and reliable data from reference stations can also be obtained for free, but it is typically raw and requires a certain budget to adapt it for business purposes.
Here are a few providers you can consider:
- Sources: reference stations and own network of manually verified sensors calibrated with nearby reference stations
- Formats: CSV, PDF, API, web and mobile apps for end users
- Forecast: yes
- Historical data: since 2016 – full in csv, 24h in API
- Sources: governmental monitoring stations, satellite and interpolation
- Formats: API, web and mobile apps for end users
- Forecast: yes
- Historical data: yes, several months to several years depending on location
Air Quality Index
- Sources: reference stations, third-party low cost sensor networks
- Formats: web app
- Forecast: yes, 7 days
- Historical data: available for the past 7 years, from 2014, downloadable as csv
- Sources: reference stations, own network of sensors, satellite
- Formats: web and mobile apps
- Forecast: yes
- Historical data: yes