Storage and GPU benchmarking toolkit for FarmGPU infrastructure. Runs CrystalDiskMark-style disk benchmarks via fio and measures PyTorch cold import times — useful for validating NVMe/RAID performance and shared filesystem latency on GPU nodes.

| File | Description |
|---|---|
| `bench.yml` | Ansible playbook that orchestrates everything over SSH |
| `files/bench_fio_cdm.py` | CrystalDiskMark-style fio benchmark (seq/random, read/write) |
| `files/bench_torch_import.py` | Cold `import torch` timing across multiple subprocess runs |
| `files/gen_summary.py` | Generates a shareable plain-text report from JSON results |

Against inventory hosts:

    ansible-playbook bench.yml -l serrano01-1062

Against an ad-hoc host (e.g. RunPod container):

    ansible-playbook bench.yml -i "root@38.80.152.148," \
      -e ansible_port=30866 -e ansible_user=root -e fio_test_path=/workspace

With custom options:

    ansible-playbook bench.yml -l serrano \
      -e fio_file_size=10g -e fio_runtime=30 -e fio_test_path=/mnt/raid

Disk benchmark (requires fio):

    python3 files/bench_fio_cdm.py --path /mnt/raid --size 10g --runtime 10

Torch import benchmark (requires torch):

    python3 files/bench_torch_import.py

Generate a text summary from JSON logs:

    python3 files/gen_summary.py torch_import.json fio_bench.json summary.txt
    python3 files/gen_summary.py --fio-only fio_bench.json summary.txt
    python3 files/gen_summary.py --torch-only torch_import.json summary.txt

| Variable | Default | Description |
|---|---|---|
| `fio_test_path` | `/tmp` | Directory to run fio benchmarks in |
| `fio_file_size` | `10g` | Test file size (use 10g+ for accurate enterprise results) |
| `fio_runtime` | `10` | Seconds per fio test |
| `torch_runs` | `10` | Number of torch import timing runs |

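
To illustrate what `torch_runs` controls, here is a minimal sketch of timing an import across several fresh subprocesses. It is not the code in `files/bench_torch_import.py`; the function name `time_cold_import` and its parameters are hypothetical, and the measured time includes interpreter startup.

```python
import subprocess
import sys
import time


def time_cold_import(module="torch", runs=10):
    """Time `import <module>` in a fresh interpreter, once per run.

    Hypothetical helper for illustration; the real script may measure
    differently (for example, timing inside the child process).
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # A new subprocess guarantees the module is not already in sys.modules.
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling `time_cold_import("torch", runs=10)` returns one wall-clock duration per run. The first run is usually the slowest because later runs benefit from the OS page cache, so on a shared filesystem the first value is often the most informative.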
| Test | Block Size | Queue Depth | Threads | Pattern |
|---|---|---|---|---|
| SEQ1M Q8T1 | 1M | 8 | 1 | Sequential |
| SEQ1M Q1T1 | 1M | 1 | 1 | Sequential |
| SEQ128K Q32T1 | 128K | 32 | 1 | Sequential |
| RND128K Q128T8 | 128K | 128 | 8 | Random |
| RND4K Q32T16 | 4K | 32 | 16 | Random |
| RND4K Q1T1 | 4K | 1 | 1 | Random |
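
Each row of the test matrix corresponds to a set of fio options. The sketch below shows one plausible mapping from a row to an fio command line. The fio flags are real, but the choice of `--ioengine=libaio`, `--direct=1`, the test file name `fio_bench.tmp`, and the helper `fio_command` are assumptions, not taken from `files/bench_fio_cdm.py`.

```python
# Test matrix from the table above: (name, block size, queue depth, threads, pattern).
TESTS = [
    ("SEQ1M Q8T1", "1M", 8, 1, "seq"),
    ("SEQ1M Q1T1", "1M", 1, 1, "seq"),
    ("SEQ128K Q32T1", "128K", 32, 1, "seq"),
    ("RND128K Q128T8", "128K", 128, 8, "rand"),
    ("RND4K Q32T16", "4K", 32, 16, "rand"),
    ("RND4K Q1T1", "4K", 1, 1, "rand"),
]


def fio_command(test, mode, path="/tmp", size="10g", runtime=10):
    """Build an fio argument list for one matrix row (hypothetical helper)."""
    name, bs, iodepth, numjobs, pattern = test
    if mode not in ("read", "write"):
        raise ValueError("mode must be 'read' or 'write'")
    # fio uses read/write for sequential and randread/randwrite for random I/O.
    rw = mode if pattern == "seq" else "rand" + mode
    return [
        "fio",
        f"--name={name.replace(' ', '_')}",
        f"--filename={path}/fio_bench.tmp",  # assumed test file name
        f"--rw={rw}",
        f"--bs={bs}",
        f"--iodepth={iodepth}",   # queue depth (Q)
        f"--numjobs={numjobs}",   # threads (T)
        f"--size={size}",
        f"--runtime={runtime}",
        "--time_based",
        "--direct=1",             # assumed: bypass the page cache
        "--ioengine=libaio",      # assumed: Linux async I/O engine
        "--group_reporting",
        "--output-format=json",
    ]
```

For example, `fio_command(TESTS[4], "read", path="/mnt/raid")` yields a 4K random-read job with queue depth 32 and 16 jobs.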
Results are saved as timestamped JSON files in /tmp (or the playbook's results/ directory when run via Ansible). Each JSON log includes full system info, raw fio output, and computed summaries.
The text summary from gen_summary.py includes bandwidth charts, IOPS, and latency tables — suitable for pasting into Slack or docs.
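
As a rough illustration of how bandwidth, IOPS, and latency figures can be derived from raw fio output, the sketch below reads fio's own JSON format (`--output-format=json`). The field names `bw_bytes`, `iops`, and `lat_ns` are those emitted by fio 3.x; the layout of this toolkit's JSON logs and the helper name `summarize_fio` are assumptions, so adapt the key paths to the actual files.

```python
import json


def summarize_fio(raw_json):
    """Extract per-job bandwidth, IOPS, and mean latency from fio JSON text.

    Hypothetical helper; assumes fio 3.x field names.
    """
    data = json.loads(raw_json)
    rows = []
    for job in data["jobs"]:
        for direction in ("read", "write"):
            stats = job[direction]
            if stats["io_bytes"] == 0:
                continue  # this direction was not exercised by the job
            rows.append({
                "job": job["jobname"],
                "direction": direction,
                "mb_per_s": stats["bw_bytes"] / 1e6,           # bytes/s to MB/s
                "iops": stats["iops"],
                "mean_lat_us": stats["lat_ns"]["mean"] / 1000,  # ns to µs
            })
    return rows
```

Each returned row holds one job and direction, which is enough to build a plain-text table similar in spirit to the summary described above.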
- fio: disk benchmark (`apt install fio`)
- Python 3.6+: standard library only for the fio benchmark and summary generator
- PyTorch: only needed for the torch import benchmark (skipped gracefully if absent)
- Ansible: only needed if using the playbook for remote execution