Skip to content

JohnCSu/mojo_lbm_tutorial

Repository files navigation

MOJO-LBM-Tutorial

LDC Vel Magnitude Re=100 u Benchmark Results v Benchmark Results

A basic implementation for 2D D2Q9 LBM for Mojo using only the Standard Library as a learning exercise. If you are interested in simulation (and coming from python) this is a great exercise as it:

  1. Learning Correct Typing and Parameterization in Mojo a. Supports any DType Floating point (mainly fp32 or fp64)
  2. GPU kernels and TileTensor Layouts
  3. How to call Python Modules in Mojo: a. Passing buffers into Numpy arrays with Unsafe Pointers b. Using Pyvista for Visualisation
  4. Creating Custom structs and functions to reduce repeated code (e.g. vector, contextTileTensor)
  5. Basic Origin tracking
  6. Mojo Packaging

Timeline

2026/06/05 Implemented TiledLayouts for LBM 2026/06/04 Implemeted First Variation that uses thread reording 2026/05/20 LBM working with mid-gridbounce bounceback and moving wall BC. Row Major. Base Example

LBM

Lattice Boltzmann Method (LBM) is a fluid simulation based on the Boltzmann Equation and specifically made for GPU like compute. It is an explicit time stepping algorithim (so no solving systems of equations) and performed on a structured grid. The Single relaxation time (SRT) model implemented is designed for incompressible flow (Mach number less than 0.3)

Its simplicity allows one to capture fluid motion in a single tight kernel ~ 50 lines.

Steps

  1. Stream Populations And Apply BC (I use a pulled approach here)
  2. Calculate Post BC and streamed velocity and density
  3. Compute Collision Step

Custom Structs

Vector

Stack allocated vector with value semantics (i.e. ImplicitelyCopyable Trait and so behaves like a number) and support for standard ops (+-*/) with same vector type or scalars. Also support sum, prod with oneself and dot product with another vector. An InlineArray stores the data inside the vector.

Currently Not Simd optimized for large vector (uses simple for loops)

ContextTileTensor

Simple Struct that manages the host and device buffer together and keeps the 2 buffers in sync. Uses familiar .cpu() and .gpu() getters to call the Tensor as a cpu or gpu tile tensor respecitively. Buffer copies between the 2 buffers only occur when we call different buffers in a row.

    a = ContextTileTensor(ctx,layout)

    cpu_tensor = a.cpu() # No Copy as initial call
    # Some CPU Work Here
    # ...
    gpu_tensor = a.gpu() # Copy is performed from Host Buffer (CPU) to Device Buffer (GPU)
    # Some Gpu Work Here...
    gpu_tensor2 = a.gpu() # No Copy as last call was the same GPU
      
    cpu_tensor = a.cpu() # Copy is perfomed from GPU to CPU
    

ToDo

  • Create function to set BC - Moving and No Slip
  • Create LBM kernel with mid grid bounceback

Optimisation Tasks

  • [] Use Benchmarking to determine speed ups and optimisations
  • [] Add Simd optimisation
  • [] Add Layout Analysis
  • [] Swizzling analysis

Other

  • [] Implement 3D lattices models
  • [] Implement Custom Floating Point
  • [] Equilibrium Conditions

Reflection

  • 2026/05/12

    • Awkward slicing syntax
    • Type System can be annoying
    • Int and Scalar[Dtype.int32] for Gpu kernels type mismatching
    • Lack of clarity what can be passed to GPU
    • Very Barebones so have to basically build everything from scratch
    • Maybe to low level for now to incentivise a switch from CUDA or Python DSLs
  • 2026_05/14

    • Optional is weird and doesnt make sense
    • Bool dont have is implemented so foo is False does not work
  • 2026_05_19

    • While theyare building some awesome stuff, the QA and actual usage of the language features in more realistic context can be a bit lacking
    • A python User, because Mojo is targeted for systems (i.e. "low level") programming design, theres a significant gap between using std builtins and Python functions. Might be unavoidable.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages