What is METER-ML?

In support of a new initiative to build a global database of methane emitting infrastructure called the MEthane Tracking Emissions Reference (METER) database, we developed METER-ML, a multi-sensor Earth observation dataset containing georeferenced images in the U.S. labeled for the presence or absence of six categories of methane source facilities.

Read The Paper (Zhu & Lui & Irvin et al.)

Download the METER-ML Dataset

Why did we develop METER-ML?

Reducing methane emissions is essential for mitigating global warming. To attribute methane emissions to their sources, a comprehensive dataset of methane source infrastructure is necessary. Recent advancements with deep learning on remotely sensed imagery have the potential to identify the locations and characteristics of methane sources, but there is a substantial lack of publicly available data to enable machine learning researchers and practitioners to build automated mapping approaches.

We developed METER-ML to allow the machine learning community to experiment with multi-view/multi-modal modeling approaches to automatically identify sources of methane emissions in remotely sensed imagery.

How did we collect and label METER-ML?

METER-ML consists of 86,599 georeferenced NAIP, Sentinel-1, and Sentinel-2 images in the U.S. labeled for the presence or absence of methane source facilities including concentrated animal feeding operations (CAFOs), coal mines, landfills, natural gas processing plants (Proc Plants), oil refineries and petroleum terminals (R&Ts), and wastewater treatment plants (WWTPs).

Images in METER-ML

We collected locations of methane emitting infrastructure in the U.S. from a variety of public datasets. We additionally included images which capture none of the six methane emitting facilities (Negatives). All of the locations were paired with three publicly available remotely sensed image sources, namely aerial imagery from the USDA National Agriculture Imagery Program (NAIP) as well as satellite imagery captured by Sentinel-1 (S1) and Sentinel-2 (S2). From NAIP and S2 we included the three visible (RGB) bands and the single near-infrared (NIR) band. From S2 we also included the single coastal aerosol (CA) band, the four red-edge (RE1-4) bands, the single water vapor (WV) band, the single cirrus (C) band, and the two shortwave infrared (SWIR1-2) bands. From S1 we included the two V-transmit bands (VH and VV). Each image captures a 720m x 720m footprint. Imagery was processed and downloaded using the Descartes Labs platform.

Expert-labeled Validation and Test Sets

Two Stanford University postdoctoral researchers with expertise in methane emissions and related infrastructure individually reviewed 1,533 examples to compose the held-out validation and test sets. Their consensus was used as the final label in these sets.

Table 1. Counts of each category in METER-ML.

Category	Train	Valid	Test	Total
CAFOs	24957	47	92	25096
Landfills	4085	46	111	4242
Coal Mines	1776	40	72	1888
Proc Plants	1900	38	107	2045
R&Ts	4012	59	108	4179
WWTPs	14519	46	129	14694
Negatives	34195	249	426	34870
Total	85066	515	1018	86599
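The per-category rows in Table 1 add up to more than the Total row in every split, which is consistent with some images carrying more than one category label. A short Python sketch of that check, using the counts copied from the table (the variable names are illustrative and not part of any dataset tooling):

```python
# Counts copied from Table 1: (train, valid, test) per category.
COUNTS = {
    "CAFOs": (24957, 47, 92),
    "Landfills": (4085, 46, 111),
    "Coal Mines": (1776, 40, 72),
    "Proc Plants": (1900, 38, 107),
    "R&Ts": (4012, 59, 108),
    "WWTPs": (14519, 46, 129),
    "Negatives": (34195, 249, 426),
}

# Number of images per split, copied from the Total row of Table 1.
TOTAL_IMAGES = (85066, 515, 1018)

# Sum the category rows column by column (train, valid, test).
category_sums = tuple(sum(column) for column in zip(*COUNTS.values()))

# Label assignments in excess of one per image, per split.
extra_labels = tuple(s - t for s, t in zip(category_sums, TOTAL_IMAGES))

print("category sums:", category_sums)
print("extra labels: ", extra_labels)
print("all images:   ", sum(TOTAL_IMAGES))
```

The excess is small relative to each split, so most images appear under a single category.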

Table 2. Summary of the remotely sensed image products and bands included in METER-ML.

Product	Bands	Image Size	Resolution
NAIP	RGB & NIR	720x720	1m
Sentinel-2	RGB & NIR	72x72	10m
Sentinel-2	RE1-4 & SWIR1-2	36x36	20m
Sentinel-2	CA & WV & C	12x12	60m
Sentinel-1	VH & VV	72x72	10m
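The image sizes in Table 2 follow from dividing the 720m footprint by each product's ground resolution. A minimal Python sketch of that arithmetic (the dictionary keys and the function name are illustrative labels, not identifiers used by the dataset):

```python
# Edge length of the square footprint in metres, as stated in the text.
FOOTPRINT_M = 720

# Ground resolution in metres per pixel for each band group (from Table 2).
RESOLUTION_M = {
    "NAIP RGB & NIR": 1,
    "Sentinel-2 RGB & NIR": 10,
    "Sentinel-2 RE1-4 & SWIR1-2": 20,
    "Sentinel-2 CA & WV & C": 60,
    "Sentinel-1 VH & VV": 10,
}


def image_size_px(resolution_m, footprint_m=FOOTPRINT_M):
    """Return the edge length in pixels of a square image covering the footprint."""
    if footprint_m % resolution_m != 0:
        raise ValueError("footprint is not a whole number of pixels")
    return footprint_m // resolution_m


sizes = {name: image_size_px(res) for name, res in RESOLUTION_M.items()}
for name, px in sizes.items():
    print(f"{name}: {px}x{px}")
```

Because every band group covers the same ground area, the lower-resolution groups would need to be resampled to a common grid before they could be stacked with the higher-resolution ones.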

Table 3. Per-class and overall (macro-average) test metrics of our baseline model.

Category	AUPRC	AUROC	Precision	Recall	F1
CAFOs	0.915	0.989	0.822	0.902	0.860
Landfills	0.259	0.754	0.246	0.523	0.334
Coal Mines	0.470	0.905	0.558	0.403	0.468
Proc Plants	0.350	0.787	0.336	0.477	0.394
R&Ts	0.821	0.956	0.752	0.787	0.769
WWTPs	0.534	0.836	0.633	0.477	0.544
Overall	0.558	0.871	0.558	0.595	0.562
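The Overall row in Table 3 is an unweighted (macro) average of the six per-class rows, and each F1 value is the harmonic mean of that row's precision and recall. The Python sketch below recomputes both from the table values. The function names are illustrative, and because the table entries are rounded to three decimals, the recomputed numbers should only be expected to match to about that precision.

```python
# Per-class test metrics copied from Table 3:
# (AUPRC, AUROC, precision, recall, F1)
METRICS = {
    "CAFOs": (0.915, 0.989, 0.822, 0.902, 0.860),
    "Landfills": (0.259, 0.754, 0.246, 0.523, 0.334),
    "Coal Mines": (0.470, 0.905, 0.558, 0.403, 0.468),
    "Proc Plants": (0.350, 0.787, 0.336, 0.477, 0.394),
    "R&Ts": (0.821, 0.956, 0.752, 0.787, 0.769),
    "WWTPs": (0.534, 0.836, 0.633, 0.477, 0.544),
}


def macro_average(rows):
    """Unweighted mean of each metric column across classes."""
    columns = list(zip(*rows))
    return [sum(column) / len(column) for column in columns]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


overall = macro_average(METRICS.values())
print("macro averages:", [f"{value:.4f}" for value in overall])

for name, (_, _, precision, recall, f1) in METRICS.items():
    print(f"{name}: table F1={f1:.3f}, recomputed={f1_score(precision, recall):.4f}")
```

Macro-averaging weights each category equally, so the rarer categories such as coal mines count as much toward the Overall row as CAFOs do.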

How well does our baseline model do?

We experimented with a variety of models with a DenseNet-121 backbone that take as input different combinations of image products, spectral bands, image footprints, and spatial resolutions. We found that a model which leverages NAIP with all four bands achieves the highest overall performance across the tested image product and spectral band combinations, followed closely by a joint NAIP, Sentinel-2, and Sentinel-1 model. We also found that the highest spatial resolution and the largest footprint lead to the best overall performance, although performance can depend on the methane source category.

We selected the best performing setting for each methane source category in our baseline model. The baseline model achieved high performance in identifying concentrated animal feeding operations and oil refineries and petroleum terminals, suggesting the potential to map them at scale. There is still a large gap to achieving high performance on each of the other methane source categories, and room to further improve performance on the high performing categories, so METER-ML is a challenging benchmark for testing new infrastructure identification approaches.
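Choosing a setting per category can be sketched as an argmax over a table of scores. The Python example below is only an illustration under stated assumptions: the setting names, the use of validation AUPRC as the selection metric, and all of the numbers are hypothetical placeholders rather than results or procedures reported for METER-ML.

```python
# HYPOTHETICAL validation AUPRC per input setting and category.
# Placeholder numbers for illustration only, not results from the paper.
VALID_AUPRC = {
    "NAIP (RGB+NIR)": {"CAFOs": 0.90, "Landfills": 0.25},
    "NAIP+S2+S1": {"CAFOs": 0.88, "Landfills": 0.30},
}


def best_setting_per_category(scores):
    """Map each category to the setting with the highest score for it."""
    best = {}  # category -> (setting, score)
    for setting, per_category in scores.items():
        for category, value in per_category.items():
            if category not in best or value > best[category][1]:
                best[category] = (setting, value)
    return {category: setting for category, (setting, _) in best.items()}


chosen = best_setting_per_category(VALID_AUPRC)
for category, setting in chosen.items():
    print(f"{category}: {setting}")
```

Selecting on a split other than the test set keeps the reported test metrics from being biased by the selection itself.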

To learn more, read our publication presented at the IJCAI-ECAI 2022 Workshop on Complex Data Challenges in Earth Observation.

If you have questions about our work, contact us at:

`bwzhu@cs.stanford.edu` and `niclui@stanford.edu` and `jirvin16@cs.stanford.edu`

METER-ML: A Multi-sensor Earth Observation Benchmark for Automated Methane Source Mapping
