FarmGPU/clustermax-storage

SILO Quick Bench

Storage and GPU benchmarking toolkit for FarmGPU infrastructure. Runs CrystalDiskMark-style disk benchmarks via fio and measures PyTorch cold import times — useful for validating NVMe/RAID performance and shared filesystem latency on GPU nodes.

What's Included

File                          Description
bench.yml                     Ansible playbook that orchestrates everything over SSH
files/bench_fio_cdm.py        CrystalDiskMark-style fio benchmark (seq/random, read/write)
files/bench_torch_import.py   Cold "import torch" timing across multiple subprocess runs
files/gen_summary.py          Generates a shareable plain-text report from JSON results

Quick Start

Run via Ansible (recommended)

Against inventory hosts:

ansible-playbook bench.yml -l serrano01-1062

Against an ad-hoc host (e.g. RunPod container):

ansible-playbook bench.yml -i "root@38.80.152.148," \
  -e ansible_port=30866 -e ansible_user=root -e fio_test_path=/workspace

With custom options:

ansible-playbook bench.yml -l serrano \
  -e fio_file_size=10g -e fio_runtime=30 -e fio_test_path=/mnt/raid

Run scripts directly

Disk benchmark (requires fio):

python3 files/bench_fio_cdm.py --path /mnt/raid --size 10g --runtime 10

Torch import benchmark (requires torch):

python3 files/bench_torch_import.py
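
The idea behind a cold-import measurement can be sketched as follows. This is a minimal illustration, not the contents of bench_torch_import.py: the function name time_cold_import and its defaults are assumptions, and "cold" here only means a fresh interpreter per run (the OS page cache is not dropped, so later runs may read from memory). The measured time also includes interpreter startup.

```python
import statistics
import subprocess
import sys
import time


def time_cold_import(module="json", runs=3):
    """Return wall-clock seconds for `import <module>` in fresh interpreters.

    Hypothetical helper for illustration; pass module="torch" on a host
    where PyTorch is installed.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # A new subprocess starts with an empty sys.modules, so every run
        # pays the full import cost instead of hitting the module cache.
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    results = time_cold_import("json")
    print(f"median {statistics.median(results):.3f}s over {len(results)} runs")
```

Reporting the median rather than the mean keeps a single slow first run (for example, one that reads from a cold shared filesystem) from dominating the figure; listing the individual timings alongside it shows that first-run penalty explicitly.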

Generate a text summary from JSON logs:

python3 files/gen_summary.py torch_import.json fio_bench.json summary.txt
python3 files/gen_summary.py --fio-only fio_bench.json summary.txt
python3 files/gen_summary.py --torch-only torch_import.json summary.txt

Playbook Variables

Variable        Default   Description
fio_test_path   /tmp      Directory to run fio benchmarks in
fio_file_size   10g       Test file size (use 10g+ for accurate enterprise results)
fio_runtime     10        Seconds per fio test
torch_runs      10        Number of torch import timing runs

FIO Test Matrix

Test             Block Size   Queue Depth   Threads   Pattern
SEQ1M Q8T1       1M           8             1         Sequential
SEQ1M Q1T1       1M           1             1         Sequential
SEQ128K Q32T1    128K         32            1         Sequential
RND128K Q128T8   128K         128           8         Random
RND4K Q32T16     4K           32            16        Random
RND4K Q1T1       4K           1             1         Random
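
To make the naming scheme in the matrix above concrete, here is a hedged sketch that turns a test label into a fio command line. The helper fio_args is hypothetical, and the choice of --ioengine=libaio, --direct=1, and --group_reporting is an assumption about a typical setup; the actual flags used by bench_fio_cdm.py may differ.

```python
import re

# Label format from the matrix: <SEQ|RND><block size> Q<queue depth>T<threads>
LABEL_RE = re.compile(r"^(SEQ|RND)(\d+[KM]) Q(\d+)T(\d+)$")


def fio_args(label, mode, path="/tmp", size="10g", runtime=10):
    """Build a fio argument list for one matrix row (illustrative only)."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised test label: {label!r}")
    if mode not in ("read", "write"):
        raise ValueError("mode must be 'read' or 'write'")
    pattern, block_size, queue_depth, threads = match.groups()
    # fio uses read/write for sequential I/O and randread/randwrite for random.
    rw = mode if pattern == "SEQ" else "rand" + mode
    return [
        "fio",
        f"--name={label.replace(' ', '_')}",
        f"--directory={path}",
        f"--size={size}",
        f"--bs={block_size.lower()}",
        f"--iodepth={queue_depth}",
        f"--numjobs={threads}",
        f"--rw={rw}",
        f"--runtime={runtime}",
        "--time_based",
        "--ioengine=libaio",   # assumption: Linux async I/O engine
        "--direct=1",          # assumption: bypass the page cache
        "--group_reporting",
        "--output-format=json",
    ]


if __name__ == "__main__":
    print(" ".join(fio_args("RND4K Q32T16", "read", path="/mnt/raid")))
```

The resulting list can be passed to subprocess.run on a host with fio installed. Queue depth maps to --iodepth and thread count to --numjobs, so a row such as RND4K Q32T16 keeps up to 32 I/Os in flight in each of 16 jobs.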

Output

Results are saved as timestamped JSON files in /tmp (or the playbook's results/ directory when run via Ansible). Each JSON log includes full system info, raw fio output, and computed summaries.
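
As an illustration of how a computed summary can be derived from raw fio output, the sketch below totals bandwidth and IOPS across jobs. It assumes the standard layout of fio's --output-format=json (a top-level "jobs" list with per-direction "bw_bytes" and "iops" fields); how the toolkit nests that output inside its own JSON logs is not specified here, and summarize_fio is a hypothetical name. The sample values are made up for demonstration.

```python
def summarize_fio(raw, direction="read"):
    """Sum bandwidth and IOPS over all jobs in a fio JSON result dict."""
    jobs = raw["jobs"]
    total_bytes_per_s = sum(job[direction]["bw_bytes"] for job in jobs)
    total_iops = sum(job[direction]["iops"] for job in jobs)
    return {
        "MB/s": round(total_bytes_per_s / 1e6, 1),  # decimal megabytes
        "IOPS": round(total_iops),
    }


if __name__ == "__main__":
    # Made-up numbers shaped like fio JSON output, for illustration only.
    sample = {
        "jobs": [
            {"read": {"bw_bytes": 1_500_000_000, "iops": 1000.4}},
            {"read": {"bw_bytes": 500_000_000, "iops": 999.6}},
        ]
    }
    print(summarize_fio(sample))
```

When fio runs with --group_reporting, the "jobs" list usually holds a single aggregated entry, in which case the sums reduce to that entry's values.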

The text summary from gen_summary.py includes bandwidth charts, IOPS, and latency tables — suitable for pasting into Slack or docs.
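
A plain-text bandwidth chart of the kind described can be produced with a few lines of standard-library Python. This is only a sketch of the general technique: the function bandwidth_chart, the "#" bar style, and the input figures are assumptions, not the format gen_summary.py actually emits.

```python
def bandwidth_chart(results, width=40):
    """Render {label: MB/s} as left-aligned text bars scaled to the peak."""
    if not results:
        return ""
    peak = max(results.values())
    label_width = max(len(label) for label in results)
    lines = []
    for label, mbps in results.items():
        # Scale each bar relative to the fastest test; guard against a zero peak.
        bar_len = round(mbps / peak * width) if peak > 0 else 0
        lines.append(f"{label:<{label_width}}  {'#' * bar_len} {mbps:.0f} MB/s")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only, not measured results.
    print(bandwidth_chart({"SEQ1M Q8T1": 2000.0, "RND4K Q1T1": 50.0}))
```

Using only ASCII characters keeps the chart intact when pasted into chat tools or monospaced documents.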

Requirements

  • fio — disk benchmark (apt install fio)
  • Python 3.6+ — standard library only for fio bench and summary generator
  • PyTorch — only needed for the torch import benchmark (skipped gracefully if absent)
  • Ansible — only needed if using the playbook for remote execution

About

A simple benchmark for PyTorch import time and disk bandwidth on containers, VMs, or bare metal.
