Bryan Zhu *, Nicholas Lui *, Jeremy Irvin *, Jimmy Le, Sahil Tadwalkar, Chenghao Wang, Zutao Ouyang, Frankie Y. Liu, Andrew Y. Ng, Robert B. Jackson
In support of a new initiative to build a global database of methane-emitting infrastructure called the MEthane Tracking Emissions Reference (METER) database, we developed METER-ML, a multi-sensor Earth observation dataset containing georeferenced images in the U.S. labeled for the presence or absence of six types of methane source facilities.
Reducing methane emissions is essential for mitigating global warming. To attribute methane emissions to their sources, a comprehensive dataset of methane source infrastructure is necessary. Recent advancements with deep learning on remotely sensed imagery have the potential to identify the locations and characteristics of methane sources, but there is a substantial lack of publicly available data to enable machine learning researchers and practitioners to build automated mapping approaches.
We developed METER-ML to allow the machine learning community to experiment with multi-view/multi-modal modeling approaches to automatically identify sources of methane emissions in remotely sensed imagery.
METER-ML consists of 86,599 georeferenced NAIP, Sentinel-1, and Sentinel-2 images in the U.S. labeled for the presence or absence of methane source facilities including concentrated animal feeding operations (CAFOs), coal mines, landfills, natural gas processing plants (Proc Plants), oil refineries and petroleum terminals (R&Ts), and wastewater treatment plants (WWTPs).
We collected locations of methane-emitting infrastructure in the U.S. from a variety of public datasets. We also included images that capture none of the six methane-emitting facility types (Negatives). All locations were paired with three publicly available remotely sensed image sources: aerial imagery from the USDA National Agriculture Imagery Program (NAIP) and satellite imagery captured by Sentinel-1 (S1) and Sentinel-2 (S2). From both NAIP and S2, we included the three visible (RGB) bands and the single near-infrared (NIR) band. From S2, we also included the single coastal aerosol (CA) band, the four red-edge (RE1-4) bands, the single water vapor (WV) band, the single cirrus (C) band, and the two shortwave infrared (SWIR1-2) bands. From S1, we included the two V-transmit (VH and VV) bands. Each image captures a 720 m x 720 m footprint. Imagery was processed and downloaded using the Descartes Labs platform.
Two Stanford University postdoctoral researchers with expertise in methane emissions and related infrastructure individually reviewed the 1,533 examples that compose the held-out validation and test sets. Their consensus was used as the final label in these sets.
Category | Train | Valid | Test | Total |
---|---|---|---|---|
CAFOs | 24957 | 47 | 92 | 25096 |
Landfills | 4085 | 46 | 111 | 4242 |
Coal Mines | 1776 | 40 | 72 | 1888 |
Proc Plants | 1900 | 38 | 107 | 2045 |
R&Ts | 4012 | 59 | 108 | 4179 |
WWTPs | 14519 | 46 | 129 | 14694 |
Negatives | 34195 | 249 | 426 | 34870 |
Total | 85066 | 515 | 1018 | 86599 |
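The split counts above can be checked with a few lines of Python. The sketch below is not part of the METER-ML release; the variable and function names are hypothetical, and the numbers are copied from the table. It confirms that each row's Train, Valid, and Test counts add up to its Total, and it reports how far the summed category counts exceed the number of images in each split. A positive surplus is what one would expect if some images carry more than one category label, though that reading is an inference from the table rather than something stated here.

```python
# Split counts copied from the table above: (train, valid, test, total).
CATEGORY_COUNTS = {
    "CAFOs": (24957, 47, 92, 25096),
    "Landfills": (4085, 46, 111, 4242),
    "Coal Mines": (1776, 40, 72, 1888),
    "Proc Plants": (1900, 38, 107, 2045),
    "R&Ts": (4012, 59, 108, 4179),
    "WWTPs": (14519, 46, 129, 14694),
    "Negatives": (34195, 249, 426, 34870),
}

# The "Total" row: number of images in each split and overall.
IMAGE_COUNTS = (85066, 515, 1018, 86599)
SPLIT_NAMES = ("train", "valid", "test", "total")


def row_is_consistent(row):
    """Return True when train + valid + test equals the row's stated total."""
    train, valid, test, total = row
    return train + valid + test == total


def label_surplus():
    """Per split: summed category counts minus the number of images."""
    surplus = {}
    for index, name in enumerate(SPLIT_NAMES):
        label_sum = sum(row[index] for row in CATEGORY_COUNTS.values())
        surplus[name] = label_sum - IMAGE_COUNTS[index]
    return surplus


if __name__ == "__main__":
    for category, row in CATEGORY_COUNTS.items():
        print(category, "consistent:", row_is_consistent(row))
    print("Total row consistent:", row_is_consistent(IMAGE_COUNTS))
    print("Label surplus per split:", label_surplus())
```
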
Product | Bands | Image Size | Resolution |
---|---|---|---|
NAIP | RGB & NIR | 720x720 | 1m |
Sentinel-2 | RGB & NIR | 72x72 | 10m |
Sentinel-2 | RE1-4 & SWIR1-2 | 36x36 | 20m |
Sentinel-2 | CA & WV & C | 12x12 | 60m |
Sentinel-1 | VH & VV | 72x72 | 10m |
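As a quick illustration of how the image sizes and resolutions above relate to the 720 m footprint, the following sketch multiplies each product's side length in pixels by its resolution in meters. It also tallies the number of bands per product from the band names listed earlier. The names `PRODUCTS`, `footprint_m`, and `bands_per_product` are illustrative and do not come from the dataset's own tooling.

```python
# Rows copied from the table above:
# (product, band group, number of bands, side length in pixels, resolution in m).
# Band counts follow the band names: RGB & NIR = 4, RE1-4 & SWIR1-2 = 6,
# CA & WV & C = 3, VH & VV = 2.
PRODUCTS = [
    ("NAIP", "RGB & NIR", 4, 720, 1),
    ("Sentinel-2", "RGB & NIR", 4, 72, 10),
    ("Sentinel-2", "RE1-4 & SWIR1-2", 6, 36, 20),
    ("Sentinel-2", "CA & WV & C", 3, 12, 60),
    ("Sentinel-1", "VH & VV", 2, 72, 10),
]


def footprint_m(side_pixels, resolution_m):
    """Ground distance covered by one side of a square image, in meters."""
    return side_pixels * resolution_m


def bands_per_product():
    """Total number of bands available for each image product."""
    totals = {}
    for product, _group, n_bands, _pixels, _resolution in PRODUCTS:
        totals[product] = totals.get(product, 0) + n_bands
    return totals


if __name__ == "__main__":
    for product, group, _n, pixels, resolution in PRODUCTS:
        side = footprint_m(pixels, resolution)
        print(f"{product} ({group}): {side} m x {side} m")
    print("Bands per product:", bands_per_product())
```

Every row works out to the same ground footprint, so the different image sizes reflect only the differing resolutions of the band groups.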
Category | AUPRC | AUROC | Precision | Recall | F1 |
---|---|---|---|---|---|
CAFOs | 0.915 | 0.989 | 0.822 | 0.902 | 0.860 |
Landfills | 0.259 | 0.754 | 0.246 | 0.523 | 0.334 |
Coal Mines | 0.470 | 0.905 | 0.558 | 0.403 | 0.468 |
Proc Plants | 0.350 | 0.787 | 0.336 | 0.477 | 0.394 |
R&Ts | 0.821 | 0.956 | 0.752 | 0.787 | 0.769 |
WWTPs | 0.534 | 0.836 | 0.633 | 0.477 | 0.544 |
Overall | 0.558 | 0.871 | 0.558 | 0.595 | 0.562 |
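The F1 column can be reproduced from the Precision and Recall columns using the standard harmonic-mean definition. The sketch below does this with values copied from the table. It also computes an unweighted mean over the six categories; the Overall row appears to match that mean, but this is an observation from the numbers, not a description of how the authors computed it. The helper names are hypothetical.

```python
# (precision, recall, reported F1) copied from the table above.
METRICS = {
    "CAFOs": (0.822, 0.902, 0.860),
    "Landfills": (0.246, 0.523, 0.334),
    "Coal Mines": (0.558, 0.403, 0.468),
    "Proc Plants": (0.336, 0.477, 0.394),
    "R&Ts": (0.752, 0.787, 0.769),
    "WWTPs": (0.633, 0.477, 0.544),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean across categories."""
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    for category, (precision, recall, reported) in METRICS.items():
        computed = f1_score(precision, recall)
        print(f"{category}: computed F1 {computed:.3f}, reported {reported:.3f}")
    macro_f1 = macro_average(f1 for _p, _r, f1 in METRICS.values())
    print(f"Unweighted mean of reported F1: {macro_f1:.4f}")
```

Small differences in the third decimal place are expected, because the table's precision and recall values are themselves rounded.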
We experimented with a variety of models with a DenseNet-121 backbone that take as input different combinations of image products, spectral bands, image footprints, and spatial resolutions. We found that a model using NAIP with all four bands achieves the highest overall performance across the tested image product and spectral band combinations, followed closely by a joint NAIP, Sentinel-2, and Sentinel-1 model. We also found that the highest spatial resolution and largest footprint lead to the best overall performance, although performance can depend on the methane source category.
For our baseline model, we selected the best-performing setting for each methane source category. The baseline model achieved high performance in identifying concentrated animal feeding operations and oil refineries and petroleum terminals, suggesting the potential to map them at scale. Substantial gaps remain both in reaching high performance on the other methane source categories and in further improving performance on the high-performing ones, so METER-ML is a challenging benchmark for testing new infrastructure identification approaches.
If you have questions about our work, contact us at bwzhu@cs.stanford.edu, niclui@stanford.edu, and jirvin16@cs.stanford.edu.