close
close

Deriving PM2.5 from satellite observations with spatiotemporally weighted tree-based algorithms: enhancing modeling accuracy and interpretability

Deriving PM2.5 from satellite observations with spatiotemporally weighted tree-based algorithms: enhancing modeling accuracy and interpretability

  • Li, C. et al. Reversal of trends in global fine particulate matter air pollution. Nat. Commun. 14, 5349 (2023).

    Article 
    CAS 

    Google Scholar 

  • Xu, F. et al. The challenge of population aging for mitigating deaths from PM2.5 air pollution in China. Nat. Commun. 14, 5222 (2023).

    Article 
    CAS 

    Google Scholar 

  • Geng, G. et al. Drivers of PM2.5 air pollution deaths in China 2002–2017. Nat. Geosci. 14, 645–650 (2021).

    Article 
    CAS 

    Google Scholar 

  • Zhang, Q. et al. Transboundary health impacts of transported global air pollution and international trade. Nature 543, 705–709 (2017).

    Article 
    CAS 

    Google Scholar 

  • Zhang, Y. et al. Satellite remote sensing of atmospheric particulate matter mass concentration: advances, challenges, and perspectives. Fundamental Res. 1, 240–258 (2021).

    Article 
    CAS 

    Google Scholar 

  • Bai, K. et al. Global synthesis of two decades of research on improving PM2.5 estimation models from remote sensing and data science perspectives. Earth Sci. Rev. 241, 104461 (2023).

    Article 
    CAS 

    Google Scholar 

  • Jin, C., Yuan, Q., Li, T., Wang, Y. & Zhang, L. An optimized semi-empirical physical approach for satellite-based PM2.5 retrieval: embedding machine learning to simulate complex physical parameters. Geosci. Model Dev. 16, 4137–4154 (2023).

    Article 

    Google Scholar 

  • Li, T., Yang, Q., Wang, Y. & Wu, J. Joint estimation of PM2.5 and O3 over China using a knowledge-informed neural network. Geosci. Front. 14, 101499 (2023).

  • Yan, X., Zang, Z., Luo, N., Jiang, Y. & Li, Z. New interpretable deep learning model to monitor real-time PM2.5 concentrations from satellite data. Environ. Int. 144, 106060 (2020).

    Article 
    CAS 

    Google Scholar 

  • Bai, K. et al. LGHAP: the long-term gap-free high-resolution air pollutant concentration dataset, derived via tensor-flow-based multimodal data fusion. Earth Syst. Sci. Data 14, 907–927 (2022).

    Article 

    Google Scholar 

  • Geng, G. et al. Tracking air pollution in China: near real-time PM2.5 retrievals from multisource data fusion. Environ. Sci. Technol. 55, 12106–12115 (2021).

    Article 
    CAS 

    Google Scholar 

  • Wei, J. et al. Estimating 1-km-resolution PM2.5 concentrations across China using the space-time random forest approach. Remote Sens. Environ. 231, 111221 (2019).

    Article 

    Google Scholar 

  • Li, T., Shen, H., Zeng, C., Yuan, Q. & Zhang, L. Point-surface fusion of station measurements and satellite observations for mapping PM2.5 distribution in China: methods and assessment. Atmos. Environ. 152, 477–489 (2017).

    Article 
    CAS 

    Google Scholar 

  • Ma, Z., Hu, X., Huang, L., Bi, J. & Liu, Y. Estimating ground-level PM2.5 in China using satellite remote sensing. Environ. Sci. Technol. 48, 7436–7444 (2014).

    Article 
    CAS 

    Google Scholar 

  • Hoff, R. M. & Christopher, S. A. Remote sensing of particulate pollution from space: have we reached the promised land? J. Air Waste Manag. Assoc. 59, 645–675 (2009).

    Article 
    CAS 

    Google Scholar 

  • Martin, R. V. Satellite remote sensing of surface air quality. Atmos. Environ. 42, 7823–7843 (2008).

    Article 
    CAS 

    Google Scholar 

  • Ma, Z. et al. A review of statistical methods used for developing large-scale and long-term PM2.5 models from satellite data. Remote Sens. Environ. 269, 112827 (2022).

    Article 

    Google Scholar 

  • Pichler, M. & Hartig, F. Machine learning and deep learning—A review for ecologists. Methods Ecol. Evol. 14, 994–1016 (2023).

    Article 

    Google Scholar 

  • Zhao, C. et al. Estimating the daily PM2.5 concentration in the Beijing-Tianjin-Hebei region using a random forest model with a 0.01°×0.01° spatial resolution. Environ. Int. 134, 105297 (2020).

    Article 
    CAS 

    Google Scholar 

  • Brokamp, C., Jandarov, R., Hossain, M. & Ryan, P. Predicting daily urban fine particulate matter concentrations using a random forest model. Environ. Sci. Technol. 52, 4173–4179 (2018).

    Article 
    CAS 

    Google Scholar 

  • Wongnakae, P., Chitchum, P., Sripramong, R. & Phosri, A. Application of satellite remote sensing data and random forest approach to estimate ground-level PM2.5 concentration in Northern region of Thailand. Environ. Sci. Pollut. R. 30, 88905–88917 (2023).

    Article 
    CAS 

    Google Scholar 

  • Choi, H., Park, S., Kang, Y., Im, J. & Song, S. Retrieval of hourly PM2.5 using top-of-atmosphere reflectance from geostationary ocean color imagers I and II. Environ. Pollut. 323, 121169 (2023).

    Article 
    CAS 

    Google Scholar 

  • Yang, Q., Yuan, Q. & Li, T. Ultrahigh-resolution PM2.5 estimation from top-of-atmosphere reflectance with machine learning: theories, methods, and applications. Environ. Pollut. 306, 119347 (2022).

    Article 
    CAS 

    Google Scholar 

  • Wang, Y., Yuan, Q., Li, T., Tan, S. & Zhang, L. Full-coverage spatiotemporal mapping of ambient PM2.5 and PM10 over China from Sentinel-5P and assimilated datasets: considering the precursors and chemical compositions. Sci. Total Environ. 793, 148535 (2021).

    Article 
    CAS 

    Google Scholar 

  • Just, A. C. et al. Advancing methodologies for applying machine learning and evaluating spatiotemporal models of fine particulate matter (PM2.5) using satellite data over large regions. Atmos. Environ. 239, 117649 (2020).

    Article 
    CAS 

    Google Scholar 

  • Zamani Joharestani, M., Cao, C., Ni, X., Bashir, B. & Talebiesfandarani, S. PM2.5 prediction based on random forest, XGBoost, and deep learning using multisource remote sensing data. Atmosphere 10, 373 (2019).

  • Chen, Z.-Y. et al. Extreme gradient boosting model to estimate PM2.5 concentrations with missing-filled satellite data in China. Atmos. Environ. 202, 180–189 (2019).

    Article 
    CAS 

    Google Scholar 

  • Hu, X. et al. Estimating PM2.5 concentrations in the conterminous United States using the random forest approach. Environ. Sci. Technol. 51, 6936–6944 (2017).

    Article 
    CAS 

    Google Scholar 

  • Su, Z., Lin, L., Chen, Y. & Hu, H. Understanding the distribution and drivers of PM2.5 concentrations in the Yangtze River Delta from 2015 to 2020 using Random Forest Regression. Environ. Monit. Assess. 194, 284 (2022).

    Article 

    Google Scholar 

  • Yang, Q., Yuan, Q., Yue, L. & Li, T. Investigation of the spatially varying relationships of PM2.5 with meteorology, topography, and emissions over China in 2015 by using modified geographically weighted regression. Environ. Pollut. 262, 114257 (2020).

    Article 
    CAS 

    Google Scholar 

  • Tai, A. P. K., Mickley, L. J. & Jacob, D. J. Correlations between fine particulate matter (PM2.5) and meteorological variables in the United States: Implications for the sensitivity of PM2.5 to climate change. Atmos. Environ. 44, 3976–3984 (2010).

    Article 
    CAS 

    Google Scholar 

  • Fang, X., Zou, B., Liu, X., Sternberg, T. & Zhai, L. Satellite-based ground PM2.5 estimation using timely structure adaptive modeling. Remote Sens. Environ. 186, 152–163 (2016).

    Article 

    Google Scholar 

  • Li, T., Shen, H., Yuan, Q. & Zhang, L. A locally weighted neural network constrained by global training for remote sensing estimation of PM2.5. IEEE Trans. Geosci. Remote Sens. 60, 1–13 (2022).

    Google Scholar 

  • Wei, J. et al. Reconstructing 1-km-resolution high-quality PM2.5 data records from 2000 to 2018 in China: spatiotemporal variations and policy implications. Remote Sens. Environ. 252, 112136 (2021).

    Article 

    Google Scholar 

  • Li, T., Shen, H., Yuan, Q., Zhang, X. & Zhang, L. Estimating ground-level PM2.5 by fusing satellite and station observations: a geo-intelligent deep learning approach. Geophys. Res. Lett. 44, 11,985–911,993 (2017).

    Article 
    CAS 

    Google Scholar 

  • Wei, J. et al. Ground-level NO2 surveillance from space across China for high resolution using interpretable spatiotemporally weighted artificial intelligence. Environ. Sci. Technol. 56, 9988–9998 (2022).

    Article 
    CAS 

    Google Scholar 

  • Wei, J. et al. First close insight into global daily gapless 1 km PM2.5 pollution, variability, and health impact. Nat. Commun. 14, 8349 (2023).

    Article 
    CAS 

    Google Scholar 

  • Fotheringham, A. S., Charlton, M. E. & Brunsdon, C. Geographically weighted regression: a natural evolution of the expansion method for spatial data analysis. Environ. Plann. A 30, 1905–1927 (1998).

    Article 

    Google Scholar 

  • Huang, B., Wu, B. & Barry, M. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. Int. J. Geogr. Inf. Sci. 24, 383–401 (2010).

    Article 

    Google Scholar 

  • Georganos, S. et al. Geographical random forests: a spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling. Geocarto Int. 36, 121–136 (2021).

    Article 

    Google Scholar 

  • Santos, F., Graw, V. & Bonilla, S. A geographically weighted random forest approach for evaluate forest change drivers in the Northern Ecuadorian Amazon. PLoS ONE 14, e0226224 (2019).

    Article 
    CAS 

    Google Scholar 

  • Su, Z. et al. Modeling the effects of drivers on PM2.5 in the Yangtze River Delta with geographically weighted Random Forest. Remote Sens. 15, 3826 (2023).

  • Ye, M. et al. Estimation of the soil arsenic concentration using a geographically weighted XGBoost model based on hyperspectral data. Sci. Total Environ. 858, 159798 (2023).

    Article 
    CAS 

    Google Scholar 

  • Wang, Y., Yuan, Q., Zhu, L. & Zhang, L. Spatiotemporal estimation of hourly 2-km ground-level ozone over China based on Himawari-8 using a self-adaptive geospatially local model. Geosci. Front. 13, 101286 (2022).

    Article 
    CAS 

    Google Scholar 

  • Fan, Z., Zhan, Q., Yang, C., Liu, H. & Bilal, M. Estimating PM2.5 concentrations using spatially local Xgboost based on full-covered SARA AOD at the urban scale. Remote Sens. 12, 3368 (2020).

  • Fotheringham, A. S., Yang, W. & Kang, W. Multiscale geographically weighted regression (MGWR). Ann. Am. Assoc. Geogr. 107, 1247–1265 (2017).

    Google Scholar 

  • Yin, S., Li, T., Cheng, X. & Wu, J. Remote sensing estimation of surface PM2.5 concentrations using a deep learning model improved by data augmentation and a particle size constraint. Atmos. Environ. 287, 119282 (2022).

    Article 
    CAS 

    Google Scholar 

  • Xiao, Q. et al. Separating emission and meteorological contributions to long-term PM2.5 trends over eastern China during 2000–2018. Atmos. Chem. Phys. 21, 9475–9496 (2021).

    Article 
    CAS 

    Google Scholar 

  • Yang, Q. et al. The relationships between PM2.5 and aerosol optical depth (AOD) in mainland China: About and behind the spatio-temporal variations. Environ. Pollut. 248, 526–535 (2019).

    Article 
    CAS 

    Google Scholar 

  • Chen, Z. et al. Influence of meteorological conditions on PM2.5 concentrations across China: a review of methodology and mechanism. Environ. Int. 139, 105558 (2020).

    Article 
    CAS 

    Google Scholar 

  • Xin, J. et al. The observation-based relationships between PM2.5 and AOD over China. J. Geophys. Res. Atmos. 121, 10,701–710,716 (2016).

    Article 
    CAS 

    Google Scholar 

  • Liu, J. et al. A mixed geographically and temporally weighted regression: exploring spatial-temporal variations from global and local perspectives. Entropy 19, 53 (2017).

  • He, Q. & Huang, B. Satellite-based high-resolution PM2.5 estimation over the Beijing-Tianjin-Hebei region of China using an improved geographically and temporally weighted regression model. Environ. Pollut. 236, 1027–1037 (2018).

    Article 
    CAS 

    Google Scholar 

  • Li, Z., Fotheringham, A. S., Li, W. & Oshan, T. Fast geographically weighted regression (FastGWR): a scalable algorithm to investigate spatial process heterogeneity in millions of observations. Int. J. Geogr. Inf. Sci. 33, 155–175 (2019).

    Article 

    Google Scholar 

  • Xue, T. et al. Spatiotemporal continuous estimates of PM2.5 concentrations in China, 2000–2016: a machine learning method with inputs from satellites, chemical transport model, and ground observations. Environ. Int. 123, 345–357 (2019).

    Article 
    CAS 

    Google Scholar 

  • Lyapustin, A., Wang, Y., Korkin, S. & Huang, D. MODIS collection 6 MAIAC algorithm. Atmos. Meas. Tech. 11, 5741–5765 (2018).

    Article 

    Google Scholar 

  • Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146, 1999–2049 (2020).

    Article 

    Google Scholar 

  • Chelani, A. B. Estimating PM2.5 concentration from satellite derived aerosol optical depth and meteorological variables using a combination model. Atmos. Pollut. Res. 10, 847–857 (2019).

    Article 
    CAS 

    Google Scholar 

  • Inness, A. et al. The CAMS reanalysis of atmospheric composition. Atmos. Chem. Phys. 19, 3515–3556 (2019).

    Article 
    CAS 

    Google Scholar 

  • Li, T., Shen, H., Zeng, C. & Yuan, Q. A validation approach considering the uneven distribution of ground stations for satellite-based PM2.5 estimation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 13, 1312–1321 (2020).

    Article 

    Google Scholar 

  • Gregorutti, B., Michel, B. & Saint-Pierre, P. Correlation and variable importance in random forests. Stat. Comput. 27, 659–678 (2017).

    Article 

    Google Scholar 

  • Li, T., Shen, H., Yuan, Q. & Zhang, L. Geographically and temporally weighted neural networks for satellite-based mapping of ground-level PM2.5. ISPRS J. Photogramm. 167, 178–188 (2020).

    Article 

    Google Scholar 

  • Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

    Google Scholar