GitHub - DeepWave-KAUST/DiffVMB-pub: Official reproducible material for Shallow-to-deep velocity model building via diffusion models

Shallow-to-deep velocity model building via diffusion models

Shijun Cheng, Randy Harsuko, Tariq Alkhalifah

DeepWave Consortium, King Abdullah University of Science and Technology (KAUST)

Corresponding author: Shijun Cheng (sjcheng.academic@gmail.com)

Project structure

This repository is organized as follows:

📂 diffvmb_part1: code for Part I of the manuscript;
- 📂 code: Python library containing model, diffusion, and dataset utilities;
- 📄 train.py: training script for Part I;
- 📄 sample.py: sampling/inference script for Part I;
📂 diffvmb_part2: code for Part II of the manuscript;
- 📂 code: Python library containing model, diffusion, and dataset utilities;
- 📄 train.py: training script for Part II;
- 📄 sample.py: sampling/inference script for Part II;
📂 dataset: empty folder, to be filled with the downloaded dataset (see Supplementary files);
📂 trained_model: empty folder, to be filled with the downloaded model weights (see Supplementary files);
📂 logo: folder containing the project logo.

Supplementary files

The training/test datasets and pre-trained model weights for both parts of the manuscript are publicly available on Zenodo:

DOI: 10.5281/zenodo.19790506

Dataset (`dataset.zip`)

Download and extract dataset.zip into the dataset/ folder. After extraction, the structure is:

dataset/
├── part1/
│   ├── train/          # Training data for Part I (NPZ format)
│   └── test/           # Test data for Part I (MAT format)
└── part2/
    ├── train/          # Training data for Part II (NPZ format)
    └── test/           # Test data for Part II (MAT format)

Training data (.npz): each file contains two arrays, vp (P-wave velocity model) and either ref (Part I) or mig (Part II), representing 2-D cross-sections extracted from industrial 3-D velocity models.
Test data (.mat): benchmark velocity models including in-distribution models (SEAM Arid, SEG/EAGE, Overthrust for Part I; Syn for Part II) and an out-of-distribution model (Marmousi) to assess generalization.
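As a quick sanity check after extraction, the array names in a training file can be inspected without loading the data. The sketch below relies only on the fact that an .npz file is a zip archive holding one `<name>.npy` member per array. The helper names `npz_array_names` and `missing_keys` are hypothetical and not part of this repository; the expected keys are taken from the description above.

```python
import zipfile
from pathlib import Path

# Expected array names per part, as described above (vp plus ref or mig).
EXPECTED_KEYS = {"part1": {"vp", "ref"}, "part2": {"vp", "mig"}}


def npz_array_names(path):
    """Return the array names stored in an .npz file without loading the data."""
    with zipfile.ZipFile(path) as archive:
        # Each array is stored as an archive member called "<name>.npy".
        return {
            Path(member).stem
            for member in archive.namelist()
            if member.endswith(".npy")
        }


def missing_keys(path, part):
    """Return the expected array names that are absent from the file."""
    return EXPECTED_KEYS[part] - npz_array_names(path)


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for npz_file in sorted(Path("dataset/part1/train").glob("*.npz")):
        absent = missing_keys(npz_file, "part1")
        if absent:
            print(f"{npz_file}: missing {sorted(absent)}")
```

To read the arrays themselves, `numpy.load(path)["vp"]` is the usual route once the names have been confirmed.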

Part I uses an idealized reflectivity model computed from the true velocity as the structural constraint.
Part II introduces two more realistic constraints: a migration-derived structural image obtained by RTM with a smooth background velocity, and the background velocity model itself as an additional low-wavenumber constraint.

Trained models (`trained_model.zip`)

Download and extract trained_model.zip into the trained_model/ folder. After extraction, the structure is:

trained_model/
├── model_part1.pt      # Pre-trained model for Part I
└── model_part2.pt      # Pre-trained model for Part II
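Before running the scripts, it can help to confirm that both archives were extracted into the expected places. The following is a minimal sketch that checks only the paths listed in the two directory trees above. The function name `missing_paths` is hypothetical, and the script assumes it is run from the repository root.

```python
from pathlib import Path

# Paths taken from the dataset/ and trained_model/ trees shown above.
EXPECTED_PATHS = [
    "dataset/part1/train",
    "dataset/part1/test",
    "dataset/part2/train",
    "dataset/part2/test",
    "trained_model/model_part1.pt",
    "trained_model/model_part2.pt",
]


def missing_paths(repo_root="."):
    """Return the expected files and folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing supplementary files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All supplementary files found.")
```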

Both models are trained using a depth-progressive conditional diffusion framework built upon the IDDPM architecture, extended with custom multi-condition inputs including shallow velocity context, depth positional encoding, well-log constraints, and structural constraints.

Getting started 👾 🤖

To ensure reproducibility of the results, we suggest using the environment.yml file when creating an environment. Simply run:

./install_env.sh

This will take some time. If you see the word Done! on your terminal at the end, you are ready to go. Activate the environment by typing:

conda activate diffvmb

After that, you can install the package:

pip install .

or in developer mode:

pip install -e .

Running code 📄

Once the supplementary files have been downloaded and the environment has been installed, you can run the training and sampling scripts for each part independently.

Part I

To train the Part I model from scratch:

cd diffvmb_part1

python train.py

To reproduce the Part I results using the provided trained model and test data:

cd diffvmb_part1

python sample.py

Part II

To train the Part II model from scratch:

cd diffvmb_part2

python train.py

To reproduce the Part II results using the provided trained model and test data:

cd diffvmb_part2

python sample.py

Note: when running sampling with DDIM, set --timestep_respacing ddim10 (or another ddim{N} value) on the command line to control the number of denoising steps.
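To illustrate what a ddim{N} value selects, the sketch below picks N timesteps with a fixed integer stride out of the full diffusion schedule, which is how the IDDPM reference code handles this option to the best of our understanding. The function name `ddim_timesteps` is hypothetical, the total of 1000 diffusion steps is an assumption, and the adapted code in this repository may differ in detail.

```python
def ddim_timesteps(num_timesteps, respacing):
    """Return the retained timestep indices for a "ddimN" respacing string.

    Searches for the smallest integer stride that yields exactly N steps
    starting from timestep 0.
    """
    if not respacing.startswith("ddim"):
        raise ValueError(f"expected a 'ddimN' string, got {respacing!r}")
    desired = int(respacing[len("ddim"):])
    for stride in range(1, num_timesteps):
        steps = range(0, num_timesteps, stride)
        if len(steps) == desired:
            return list(steps)
    raise ValueError(
        f"cannot create exactly {desired} steps with an integer stride"
    )


if __name__ == "__main__":
    # Assumed schedule length of 1000; check the training configuration
    # for the value actually used.
    print(ddim_timesteps(1000, "ddim10"))
```

Fewer steps make sampling faster at a possible cost in quality, so it is worth comparing a few N values on the test models.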

Disclaimer: All experiments have been carried out on an Intel(R) Xeon(R) CPU @ 2.10GHz equipped with a single NVIDIA A100 GPU. Different environment configurations may be required for different combinations of workstation and GPU. If your GPU cannot accommodate large batch sizes, please reduce the batch_size argument in the corresponding training or sampling script.

Acknowledgements

This implementation is motivated by the paper Improved Denoising Diffusion Probabilistic Models, and the code is adapted from the authors' repository. We are grateful for their open-source contribution.

Cite us

Cheng et al. (2026) Shallow-to-deep velocity model building via diffusion models.
