MOJO-LBM-Tutorial


Figures: lid-driven cavity (LDC) velocity magnitude at Re=100 | u benchmark results | v benchmark results

A basic implementation of the 2D D2Q9 LBM in Mojo using only the Standard Library, written as a learning exercise. If you are interested in simulation (and are coming from Python), this is a great exercise as it covers:

1. Learning correct typing and parameterization in Mojo
   a. Supports any floating-point DType (mainly fp32 or fp64)
2. GPU kernels and TileTensor layouts
3. How to call Python modules in Mojo
   a. Passing buffers into NumPy arrays with unsafe pointers
   b. Using PyVista for visualisation
4. Creating custom structs and functions to reduce repeated code (e.g. Vector, ContextTileTensor)
5. Basic origin tracking
6. Mojo packaging

Timeline

2026/06/05 Implemented TiledLayouts for LBM
2026/06/04 Implemented first variation that uses thread reordering
2026/05/20 LBM working with mid-grid bounce-back and moving wall BC. Row major. Base example.

LBM

The Lattice Boltzmann Method (LBM) is a fluid simulation technique based on the Boltzmann equation that maps well onto GPU-style parallel compute. It is an explicit time-stepping algorithm (so there is no system of equations to solve) and is performed on a structured grid. The single relaxation time (SRT) model implemented here targets effectively incompressible flow, so the Mach number should stay below about 0.3.

Its simplicity allows one to capture fluid motion in a single tight kernel of roughly 50 lines.
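As a rough illustration of how the SRT parameters relate to the Re=100 cavity case, the sketch below uses the standard lattice-unit relations: viscosity nu = cs^2 (tau - 1/2) with cs^2 = 1/3, Re = U L / nu, and Ma = U / cs. The function name `srt_parameters` and its arguments are hypothetical and not taken from this repository. It is written in plain Python rather than Mojo so it can be checked quickly.

```python
import math

CS2 = 1.0 / 3.0  # lattice speed of sound squared for D2Q9 (lattice units)


def srt_parameters(reynolds, lid_velocity, n_cells):
    """Return (nu, tau, mach) in lattice units.

    Assumption: the characteristic length is the cavity width in cells and
    the characteristic velocity is the lid velocity.
    """
    nu = lid_velocity * n_cells / reynolds   # from Re = U * L / nu
    tau = nu / CS2 + 0.5                     # from nu = cs^2 * (tau - 1/2)
    mach = lid_velocity / math.sqrt(CS2)     # Ma = U / cs
    return nu, tau, mach


if __name__ == "__main__":
    nu, tau, mach = srt_parameters(reynolds=100.0, lid_velocity=0.1, n_cells=128)
    print(f"nu={nu:.4f} tau={tau:.4f} Ma={mach:.4f}")
```

With a lid velocity of 0.1 on a 128-cell cavity this gives a relaxation time of about 0.884 and a Mach number of about 0.17, which stays inside the low-Mach range the SRT model expects.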

Steps

1. Stream populations and apply BCs (I use a pull approach here)
2. Compute density and velocity from the streamed, post-BC populations
3. Compute the collision step
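The three steps above can be sketched on a tiny grid in plain Python. This is a reference sketch, not the Mojo GPU kernel from this repository: the names `init_at_rest`, `moments`, and `step` are hypothetical, the wall density in the moving-lid correction is assumed to be 1, and all four sides are treated as mid-grid bounce-back walls with the lid on top.

```python
# D2Q9 lattice: discrete velocities, weights, and index of each opposite direction.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4
OPP = [0, 3, 4, 1, 2, 7, 8, 5, 6]


def init_at_rest(nx, ny):
    """Populations for rho = 1, u = 0 (equilibrium equals the weights)."""
    return [[list(W) for _ in range(nx)] for _ in range(ny)]


def moments(pops):
    """Density and velocity of one node from its 9 populations."""
    rho = sum(pops)
    ux = sum(p * cx for p, (cx, _) in zip(pops, C)) / rho
    uy = sum(p * cy for p, (_, cy) in zip(pops, C)) / rho
    return rho, ux, uy


def step(f, nx, ny, tau, lid_u):
    """One time step; f[y][x] holds the 9 post-collision populations of a node."""
    new = [[None] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            # 1. Pull streaming with mid-grid bounce-back at the walls.
            fin = [0.0] * 9
            for i, (cx, cy) in enumerate(C):
                xs, ys = x - cx, y - cy
                if 0 <= xs < nx and 0 <= ys < ny:
                    fin[i] = f[ys][xs][i]          # pull from the upstream neighbour
                else:
                    fin[i] = f[y][x][OPP[i]]       # reflect the outgoing population
                    if ys >= ny:                   # source lies beyond the moving lid
                        fin[i] += 6.0 * W[i] * cx * lid_u  # assumes wall density 1
            # 2. Density and velocity from the streamed, post-BC populations.
            rho, ux, uy = moments(fin)
            # 3. SRT (BGK) collision towards the second-order equilibrium.
            usq = ux * ux + uy * uy
            out = []
            for fi, (cx, cy), w in zip(fin, C, W):
                cu = cx * ux + cy * uy
                feq = w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq)
                out.append(fi - (fi - feq) / tau)
            new[y][x] = out
    return new


if __name__ == "__main__":
    f = init_at_rest(8, 8)
    for _ in range(20):
        f = step(f, 8, 8, tau=0.8, lid_u=0.1)
    print(moments(f[7][4]))  # node in the row just below the lid
```

Fusing all three steps into one loop body mirrors the single-kernel structure described above: each node reads its neighbours' old populations and writes only its own new ones, so nodes can be processed independently.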

Custom Structs

Vector

Stack-allocated vector with value semantics (i.e. it implements the ImplicitlyCopyable trait and so behaves like a number) and support for the standard ops (+ - * /) with the same vector type or with scalars. It also supports sum and prod reductions over its own elements and a dot product with another vector. An InlineArray stores the data inside the vector.

Currently not SIMD-optimized for large vectors (uses simple for loops).
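To make the intended behaviour concrete, here is a minimal Python analogue of such a vector. It is only a sketch of the semantics described above (elementwise ops with a vector or a scalar, plus sum, prod, and dot); the class layout and method names are assumptions and do not reproduce the Mojo struct in this repository. Immutability of the backing tuple stands in for value semantics.

```python
import math
import operator


class Vector:
    """Fixed-size, immutable vector; a hypothetical Python analogue."""

    def __init__(self, *values):
        self._data = tuple(float(v) for v in values)

    def _apply(self, other, op):
        # Elementwise with another Vector of the same size, or broadcast a scalar.
        if isinstance(other, Vector):
            if len(other._data) != len(self._data):
                raise ValueError("vector sizes differ")
            return Vector(*(op(a, b) for a, b in zip(self._data, other._data)))
        return Vector(*(op(a, other) for a in self._data))

    def __add__(self, other):
        return self._apply(other, operator.add)

    def __sub__(self, other):
        return self._apply(other, operator.sub)

    def __mul__(self, other):
        return self._apply(other, operator.mul)

    def __truediv__(self, other):
        return self._apply(other, operator.truediv)

    def sum(self):
        return sum(self._data)

    def prod(self):
        return math.prod(self._data)

    def dot(self, other):
        return (self * other).sum()

    def __eq__(self, other):
        return isinstance(other, Vector) and self._data == other._data

    def __repr__(self):
        return f"Vector{self._data}"


if __name__ == "__main__":
    a = Vector(1, 2, 3)
    b = Vector(4, 5, 6)
    print(a + b, a * 2, a.dot(b))
```

Every operation returns a new vector and never mutates its operands, which is the property that lets the type be passed around like a plain number.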

ContextTileTensor

Simple struct that manages the host and device buffers together and keeps the two in sync. It uses the familiar .cpu() and .gpu() getters to access the tensor as a CPU or GPU tile tensor respectively. A copy between the two buffers occurs only when the requested side differs from the one returned by the previous call.

    a = ContextTileTensor(ctx, layout)

    cpu_tensor = a.cpu()  # No copy: this is the initial call
    # Some CPU work here
    # ...
    gpu_tensor = a.gpu()  # Copies from the host buffer (CPU) to the device buffer (GPU)
    # Some GPU work here
    # ...
    gpu_tensor2 = a.gpu()  # No copy: the previous call was also gpu()

    cpu_tensor = a.cpu()  # Copies from the device buffer (GPU) back to the host buffer (CPU)
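The copy-on-switch bookkeeping can be modelled in a few lines of Python. This is a hypothetical stand-in, not the Mojo struct: the class name `SyncedBuffer`, the `copies` counter, and the use of plain lists in place of host and device buffers are all assumptions made for illustration. It also assumes that the very first call never copies, whichever side is requested.

```python
class SyncedBuffer:
    """Hypothetical model of the sync logic; lists play the role of buffers."""

    def __init__(self, size):
        self._host = [0.0] * size
        self._device = [0.0] * size
        self._last = None   # side returned by the previous call: None, "cpu" or "gpu"
        self.copies = 0     # number of host/device transfers performed so far

    def cpu(self):
        # Copy back only if the device side was the one handed out last.
        if self._last == "gpu":
            self._host[:] = self._device
            self.copies += 1
        self._last = "cpu"
        return self._host

    def gpu(self):
        # Copy over only if the host side was the one handed out last.
        if self._last == "cpu":
            self._device[:] = self._host
            self.copies += 1
        self._last = "gpu"
        return self._device


if __name__ == "__main__":
    a = SyncedBuffer(4)
    a.cpu()[0] = 1.5      # initial call, no copy
    a.gpu()               # host -> device
    a.gpu()               # same side again, no copy
    a.cpu()               # device -> host
    print(a.copies)
```

Tracking only the side handed out last keeps the interface to two getters, at the cost of a possibly unnecessary copy when a side was read but not modified.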

ToDo

Create function to set BC - Moving and No Slip
Create LBM kernel with mid grid bounceback

Optimisation Tasks

[] Use benchmarking to measure speed-ups from each optimisation
[] Add SIMD optimisation
[] Add layout analysis
[] Swizzling analysis

Other

[] Implement 3D lattice models
[] Implement Custom Floating Point
[] Equilibrium Conditions

Reflection

2026/05/12
- Awkward slicing syntax
- The type system can be annoying
- Int and Scalar[DType.int32] type mismatches in GPU kernels
- Lack of clarity about what can be passed to the GPU
- Very barebones, so you basically have to build everything from scratch
- Maybe too low level for now to incentivise a switch from CUDA or Python DSLs
2026/05/14
- Optional is weird and doesn't make sense
- Bool does not implement `is`, so `foo is False` does not work
2026/05/19
- While they are building some awesome stuff, the QA and the use of language features in more realistic contexts can be a bit lacking
- For a Python user, because Mojo targets systems (i.e. "low-level") programming, there is a significant gap between using the std builtins and using Python functions. This might be unavoidable.
