This dataset is named "Kimberlina CO2 Leakage". It was generated by the U.S. Department of Energy in 2017 [1]. The training set contains 15000 images, and the test set contains 4430 images. In the context of seismic Full Waveform Inversion (FWI), the seismic data, referred to as "data", is the input, and the velocity map, referred to as "label", is the output.

Data shape: 9*1251*101
Label shape: 401*141

For more details, please refer to [1][2].

To use this dataset, please cite these papers:
[1] Jordan, P. D., and J. L. Wagoner. Characterizing Construction of Existing Wells to a CO2 Storage Target: The Kimberlina Site, California.
[2] Yang, Yuxin, et al. "Making Invisible Visible: Data-Driven Seismic Inversion with Physics-Informed Data Augmentation." arXiv preprint arXiv:2106.11892 (2021).
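
As a quick sanity check on the data and label shapes given in this description, the following minimal sketch verifies the shapes of one sample and estimates storage size. The function names (check_pair, pair_bytes) are hypothetical, and the 4 bytes per value figure assumes float32 storage, which this description does not state. The meaning of each axis is also not specified here, so the code treats the shapes as plain tuples.

```python
from math import prod

# Shapes and split sizes as listed in the dataset description.
DATA_SHAPE = (9, 1251, 101)
LABEL_SHAPE = (401, 141)
N_TRAIN, N_TEST = 15000, 4430


def check_pair(data_shape, label_shape):
    """Return True when one sample's shapes match the documented ones."""
    return tuple(data_shape) == DATA_SHAPE and tuple(label_shape) == LABEL_SHAPE


def pair_bytes(bytes_per_value=4):
    """Bytes for one (data, label) pair.

    The default of 4 bytes per value assumes float32 (an assumption,
    not something the dataset description confirms).
    """
    return (prod(DATA_SHAPE) + prod(LABEL_SHAPE)) * bytes_per_value


if __name__ == "__main__":
    # With array libraries, pass e.g. data.shape and label.shape here.
    print("shapes ok:", check_pair((9, 1251, 101), (401, 141)))
    print("train set, GB (float32 assumed):", pair_bytes() * N_TRAIN / 1e9)
    print("test set, GB (float32 assumed):", pair_bytes() * N_TEST / 1e9)
```

Under the float32 assumption, the full training set comes to roughly 72 GB, so loading it lazily or in batches may be preferable to reading it into memory at once.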