Best Practices for Machine Learning Papers in Geophysics
A Community Checklist for Imaging, Data Processing, and Earth-System Discovery
Machine learning in geophysics spans three broad scientific goals:
Subsurface Imaging & Monitoring: often simulation-trained; includes tomography, velocity model building, inversion surrogates, DAS imaging, and time-lapse structural monitoring.
Data Processing & Catalog Building: includes earthquake detection, phase picking, classification, foundation models for seismic representation learning, and multimodal catalog creation.
Earth-System Discovery & Scientific Insight: includes ML for discovering new processes from multimodal data (remote sensing, seismology, hydrology, climate, geotechnical signals).
Across all of these domains, the baseline for rigor is the same.
This checklist summarizes what high-quality ML geophysics papers must report to ensure scientific validity, reproducibility, and utility to the broader community.
1. Scientific Motivation & Positioning
1.1 Problem framing
Clearly define the geophysical problem and why ML is appropriate for it.
Explain the scientific or operational limitation of existing methods.
Specify whether the task targets imaging, processing, or discovery, and define the success criteria.
1.2 Literature integration (required)
Situate the work in the context of recent ML literature.
Specify what is new beyond applying a known architecture.
Compare against both traditional geophysical methods and state-of-the-art ML models.
2. Data, Simulations, & Preprocessing
2.1 Dataset definition
Provide a complete description of all datasets: dimensions, sampling, components, labels, SNR, metadata, and known biases.
State explicitly whether data are synthetic, field, or mixed, and justify how they represent the intended geophysical setting.
For simulation-based studies: Provide details of the numerical solver, governing physics, domain size, material properties, and boundary conditions.
For simulation-based studies: Quantify compute resources required to generate synthetic datasets (CPU/GPU hours, memory).
2.2 Preprocessing transparency
Describe every step: filtering, normalization, windowing, detrending, resampling, spectrogram parameters, etc.
Include examples of raw vs. processed data to illustrate the transformations.
Provide preprocessing scripts in a public repository.
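To illustrate the level of preprocessing transparency intended in 2.2, a minimal sketch of a documented pipeline is shown below. The sampling rate, filter band, window length, and normalization choice are placeholders, not recommendations; report whatever your study actually uses.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, detrend

def preprocess_trace(trace, fs=100.0, band=(1.0, 20.0), win_s=60.0):
    """Detrend, bandpass, window, and normalize one waveform.

    The parameters (100 Hz sampling, 1-20 Hz band, 60 s windows) are
    illustrative placeholders only.
    """
    x = detrend(trace, type="linear")                  # remove linear trend
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    x = sosfiltfilt(sos, x)                            # zero-phase bandpass
    n = int(win_s * fs)
    windows = [x[i:i + n] for i in range(0, len(x) - n + 1, n)]
    # per-window amplitude normalization (one common choice among many)
    return [w / (np.max(np.abs(w)) + 1e-12) for w in windows]
```

Releasing a script of this kind alongside before/after examples makes each transformation auditable.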
2.3 Data realism & diversity
Assess representativeness relative to field conditions (noise, geometry, distance, heterogeneity).
For simulations: Include variability in sources, noise, sensor layouts, and Earth structure.
For simulations: Document limits of simulation generalization.
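One hedged way to document simulation diversity is to draw the variable factors from explicit, logged distributions rather than fixing them by hand. The sketch below is a minimal example; all ranges and parameter names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the campaign is reproducible

def sample_simulation_config():
    """Draw one randomized simulation configuration; every range is illustrative."""
    return {
        "source_depth_km": rng.uniform(1.0, 15.0),
        "source_x_km": rng.uniform(0.0, 50.0),
        "noise_snr_db": rng.uniform(0.0, 30.0),
        "n_receivers": int(rng.integers(24, 97)),
        "velocity_perturbation_pct": rng.normal(0.0, 5.0),  # heterogeneity strength
    }

configs = [sample_simulation_config() for _ in range(1000)]
# log every config alongside the generated data so readers can audit coverage
```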
3. Model Architecture & Training
3.1 Architecture clarity
Provide full model diagrams or tables of layers (include input/output shapes).
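A layer table can often be generated directly from the model. The sketch below assumes a PyTorch model and the third-party torchinfo package; the stand-in architecture and input shape are placeholders.

```python
import torch.nn as nn
from torchinfo import summary  # third-party package; assumed available

# stand-in model purely for illustration
model = nn.Sequential(
    nn.Conv1d(3, 16, kernel_size=7, padding=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(16, 2),
)

# prints per-layer output shapes and parameter counts for the paper's table
summary(model, input_size=(1, 3, 6000))
```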
4. Evaluation & Generalization
4.1 Evaluation metrics
Provide full metric distributions, confidence intervals, and class confusion matrices.
4.2 Generalization tests (required for ML papers)
Test performance under changes in noise levels, source locations, acquisition geometry, and structural heterogeneity, under temporal drift, and on out-of-distribution examples (see the sketch after this list).
For catalog-building studies: test on field datasets from regions not used in training.
For imaging studies: test on unseen geologic structures, realistic noise, and perturbed sensor arrays.
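As one example of such a test, a noise-level sweep can be scripted as below. This is a minimal sketch assuming a PyTorch classifier over waveform windows; the SNR grid, the synthetic white noise, and the accuracy metric are placeholders for whatever the study actually evaluates.

```python
import numpy as np
import torch

def add_noise(x, snr_db, rng):
    """Add white noise at a target SNR (dB); field noise may be more appropriate."""
    p_signal = np.mean(x ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(p_noise), size=x.shape)

def accuracy_at_snr(model, waveforms, labels, snr_db, rng):
    """Classification accuracy after degrading every waveform to the given SNR."""
    noisy = np.stack([add_noise(w, snr_db, rng) for w in waveforms])
    with torch.no_grad():
        logits = model(torch.as_tensor(noisy, dtype=torch.float32))
    return (logits.argmax(dim=1).numpy() == labels).mean()

# report the full curve, not a single operating point, e.g.:
# rng = np.random.default_rng(0)
# for snr in [30, 20, 10, 5, 0]:
#     print(snr, accuracy_at_snr(model, waveforms, labels, snr, rng))
```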
4.3 Real-data realism
For synthetic-trained models, demonstrate at least one bridge to field data (transfer learning, adaptation, failure analysis, or an explanation of why the model is not yet field-ready).
5. Computational Cost & Operational Feasibility
This section responds to the growing emphasis on operational feasibility.
5.1 Simulation cost
Report CPU/GPU hours, number of simulations, mesh size, and memory needed.
5.2 Training cost
List the number of GPUs, GPU hours, and wall time; energy or carbon cost is optional but encouraged.
Provide model parameter count and checkpoint size.
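Both numbers take only a few lines to compute. The sketch below assumes a PyTorch model and a saved checkpoint; the file path is hypothetical.

```python
import os
import torch

def report_model_cost(model, ckpt_path):
    """Print parameter count and on-disk checkpoint size (path is a placeholder)."""
    n_params = sum(p.numel() for p in model.parameters())
    size_mb = os.path.getsize(ckpt_path) / 1e6
    print(f"parameters: {n_params:,}")
    print(f"checkpoint size: {size_mb:.1f} MB")

# torch.save(model.state_dict(), "model.ckpt"); report_model_cost(model, "model.ckpt")
```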
5.3 Inference cost
Report inference time per trace, per station-day, or per imaging experiment.
State memory footprint of the model during inference.
Provide runtime benchmarks on common hardware (CPU-only and GPU).
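A minimal timing harness along the lines below (PyTorch assumed; the batch shape, warm-up count, and repetition count are placeholders) makes per-trace numbers reproducible on both CPU and GPU.

```python
import time
import torch

def benchmark(model, example_batch, device, n_repeats=100):
    """Return mean inference time per trace in milliseconds on the given device."""
    model = model.to(device).eval()
    batch = example_batch.to(device)
    with torch.no_grad():
        for _ in range(10):                      # warm-up iterations
            model(batch)
        if device.type == "cuda":
            torch.cuda.synchronize()             # wait for queued GPU kernels
        t0 = time.perf_counter()
        for _ in range(n_repeats):
            model(batch)
        if device.type == "cuda":
            torch.cuda.synchronize()
    elapsed = time.perf_counter() - t0
    return 1000.0 * elapsed / (n_repeats * batch.shape[0])

# per_trace_ms = benchmark(model, torch.randn(32, 3, 6000), torch.device("cpu"))
```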
5.4 Operational readiness
Describe whether the model can run in real time, on edge devices, or at cloud scale.
Provide a Docker/Singularity environment or cloud examples if relevant.
6. Physical Consistency & Interpretability
6.1 Physical priors and constraints
Discuss whether the model respects physical constraints (e.g., causality, monotonicity, wave propagation).
For imaging/inversion: evaluate whether predictions obey known geophysical bounds.
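One simple diagnostic is to report how often predictions violate known physical bounds. The sketch below uses a hypothetical P-wave velocity range; substitute limits justified for your setting.

```python
import numpy as np

def bound_violation_rate(predicted_vp, vp_min=1.5, vp_max=8.5):
    """Fraction of predicted P-wave velocities (km/s) outside a plausible range.

    The 1.5-8.5 km/s bounds are illustrative placeholders only.
    """
    v = np.asarray(predicted_vp)
    return float(np.mean((v < vp_min) | (v > vp_max)))

# report alongside standard metrics, e.g. bound_violation_rate(model_output)
```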
6.2 Interpretability
Provide interpretable diagnostics (e.g., Integrated Gradients, feature importance, attribution maps).
Reveal what signal components the model uses.
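As a hedged sketch of one such diagnostic, the example below applies Integrated Gradients from the Captum library, assuming a PyTorch classifier over waveform windows; the zero baseline and input shape are illustrative choices, not prescriptions.

```python
import torch
from captum.attr import IntegratedGradients  # third-party interpretability library

def attribute_waveform(model, waveform_batch, target_class):
    """Return per-sample attributions showing which parts of the input drive the prediction."""
    model.eval()
    ig = IntegratedGradients(model)
    baseline = torch.zeros_like(waveform_batch)  # zero-signal baseline; other baselines are possible
    attributions = ig.attribute(waveform_batch, baselines=baseline, target=target_class)
    return attributions  # same shape as the input; plot alongside the raw traces

# attr = attribute_waveform(model, torch.randn(8, 3, 6000), target_class=1)
```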
7. Reproducibility & Open Science
7.1 Open code (required for modern ML)
Release full training, preprocessing, and inference code in a public repository.
Provide tested scripts for reproducing all figures.
Include environment files (requirements.txt or Conda YAML).
7.2 Open data
Release datasets or provide clear instructions for accessing them.
If data are proprietary, provide synthetic analogs for reproducibility.
7.3 Model checkpoints
Release trained weights and configuration files.
Provide sample inference notebooks.
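The released checkpoint should load with a few unambiguous lines. A minimal sketch follows; the module, model class, file name, and input shape are all hypothetical placeholders for the released artifacts.

```python
import torch

from my_project.models import MyNet  # hypothetical module from the released code

model = MyNet()
state = torch.load("model.ckpt", map_location="cpu")  # "model.ckpt" is a placeholder
model.load_state_dict(state)
model.eval()

with torch.no_grad():
    prediction = model(torch.randn(1, 3, 6000))  # replace with a real sample trace
```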
7.4 Documentation
Provide a complete README including: project structure, dataset description, training commands, evaluation commands, and expected outputs.
8. Writing for Dual Audiences (Geophysics + ML)
Define ML concepts clearly for geophysicists (e.g., flow matching, diffusion, transformers).
Define geophysical concepts clearly for ML readers (e.g., P/S ratios, acquisition geometry).
Include intuitive figures explaining the workflow.
Provide geophysical interpretation of the ML results, not just numerical metrics.
Summary: What Makes an ML Paper Publishable in Geophysics?
A strong ML geophysics paper:
Advances a scientifically meaningful question
Uses physically realistic data and diverse tests
Provides rigorous baselines and ablations
Documents compute cost, training, and inference
Demonstrates generalization beyond the training set
Releases reproducible code and data
Speaks clearly to both geophysicists and ML practitioners
This checklist reflects modern expectations shaped by foundation-model research, multimodal geophysical ML, and open-science principles, and is appropriate for imaging, catalog-building, and Earth-system discovery studies alike.