Machine Learning-Based Estimation of Surface NO2 Concentrations over China: A Comparative Analysis of Geostationary (GEMS) and Polar-Orbiting (TROPOMI) Satellite Data

Ma Yijin; Yi Wang; Jun Wang; Tao Minghui; Jhoon Kim; Chenyang Wu; Shanshan Zhang

doi:10.3390/rs18040614

Journal article

Open access

Peer reviewed

Remote sensing (Basel, Switzerland), Vol.18(4), 614

02/15/2026

Files and links (1)

URL: https://doi.org/10.3390/rs18040614

Published (Version of record) Open Access

Abstract

What are the main findings?
The CatBoost model performed best, with GEMS data yielding higher accuracy (R² = 0.842) than TROPOMI data (R² = 0.765).
GEMS's high temporal resolution provided a much larger training dataset, which was the key factor in its superior model performance.

What are the implications of the main findings?
Geostationary satellite data (such as GEMS) offer a critical advantage for high-resolution air quality monitoring via machine learning because of their frequent sampling.
GEMS enables the reconstruction of detailed diurnal pollution patterns and near-real-time tracking of emission events, providing valuable insights for dynamic air quality management.

High-accuracy spatiotemporal monitoring of surface nitrogen dioxide (NO2) concentrations is essential for air quality management. This study evaluates machine learning-based estimates of near-surface NO2 concentrations using data from the geostationary GEMS instrument and the polar-orbiting TROPOMI over China in 2022. Four tree-based models (Random Forest, XGBoost, CatBoost, and LightGBM) were trained by integrating satellite vertical-column densities with multi-source meteorological and ancillary data. Results show that CatBoost achieved the highest accuracy, with an R² of 0.842 for GEMS and 0.765 for TROPOMI, alongside the lowest RMSE and MAE. Models trained on GEMS data consistently outperformed TROPOMI-based models across all metrics. This advantage is primarily attributed to the substantially larger training sample size enabled by GEMS's high temporal resolution, as confirmed through a controlled experiment with consistent sample sizes, which isolated the effect of data volume. Spatially, GEMS estimates captured sharper concentration gradients and localized emission hotspots, while TROPOMI produced smoother fields. Temporally, only GEMS allowed the reconstruction of detailed diurnal patterns and near-real-time pollution episode tracking. This study confirms the significant added value of geostationary satellite data for high-frequency air quality monitoring and analysis when combined with machine learning.
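The abstract reports R2, RMSE, and MAE and mentions a controlled experiment with consistent sample sizes. The following is a minimal illustrative sketch of those two ideas, not the authors' code. It assumes R2 means the coefficient of determination (the paper may define it differently, for example as a squared correlation), and the helper names `regression_metrics` and `match_sample_size` are hypothetical.

```python
import math
import random


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    Assumption: R2 is the coefficient of determination, 1 - SS_res / SS_tot.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return r2, rmse, mae


def match_sample_size(larger, smaller, seed=0):
    """Randomly subsample `larger` (without replacement) to len(smaller).

    Hypothetical helper: one simple way to equalize training-set sizes so
    that a comparison between two data sources is not driven by data volume.
    """
    rng = random.Random(seed)
    return rng.sample(larger, len(smaller))


# Toy values for illustration only; these are not data from the study.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = regression_metrics(obs, pred)

big = list(range(100))      # stands in for the larger sample set
small = list(range(10))     # stands in for the smaller sample set
matched = match_sample_size(big, small)
```

With equal sample sizes, any remaining gap in the metrics between two models can no longer be explained by the number of training samples alone, which is the reasoning behind a controlled experiment of this kind.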

Air Quality

Machine Learning

Remote Sensing

Accuracy

Air monitoring

Artificial intelligence

Comparative analysis

Concentration gradient

Datasets

Diurnal

Estimates

Learning algorithms

Models

Neural networks

Nitrogen dioxide

Pollutants

Premature mortality

Quality management

Real time

Reconstruction

Regression analysis

Sensors

Statistical analysis

Statistical methods

Synchronous satellites

Temporal resolution

Tracking

Details

Title: Machine Learning-Based Estimation of Surface NO2 Concentrations over China: A Comparative Analysis of Geostationary (GEMS) and Polar-Orbiting (TROPOMI) Satellite Data
Creators: Ma Yijin
Yi Wang - China University of Geosciences
Jun Wang - University of Iowa
Tao Minghui
Jhoon Kim - Yonsei University
Chenyang Wu - Huazhong University of Science and Technology
Shanshan Zhang - China University of Geosciences
Resource Type: Journal article
Publication Details: Remote sensing (Basel, Switzerland), Vol.18(4), 614
DOI: 10.3390/rs18040614
ISSN: 2072-4292
eISSN: 2072-4292
Publisher: MDPI AG
Grant note: National Natural Science Foundation of China: 42201409
This research was funded by the National Natural Science Foundation of China (Grant No. 42201409).
Language: English
Date published: 02/15/2026
Academic Unit: Electrical and Computer Engineering; Civil and Environmental Engineering; Iowa Technology Institute; Physics and Astronomy; Chemical and Biochemical Engineering
Record Identifier: 9985140873502771
