Skip to content

RandomYapper/Offroad-Semantic-Segmentation

Repository files navigation

Off-Road Semantic Segmentation with SegFormer

Python PyTorch SegFormer Status

SegFormer (mit-b4) for robust off-road terrain segmentation

Transformer-based semantic segmentation with mixed precision training

🚀 Quick Start📊 Performance🎓 Training💾 Dataset


Table of Contents


🎯 Overview

This project uses SegFormer-B4 (Mix Transformer encoder + lightweight MLP decoder) for semantic segmentation of challenging off-road terrain. The model is trained with mixed precision (AMP), gradient accumulation, and robust augmentations to achieve high performance on 10 terrain classes.

🧠 Model Architecture

  • Backbone: SegFormer-B4 (Mix Transformer)
  • Decoder: Lightweight All-MLP Head
  • Input Size: 544 × 960
  • Classes: 10 terrain categories
  • Training: Mixed Precision (AMP) + Dice+CE Loss

🎯 Segmentation Classes (10)

ID Class Name Pixel Value Description
0 Trees 100 Trees and large vegetation
1 Lush Bush 200 Dense, green bushes
2 Dry Grass 300 Dried grass patches
3 Dry Bushes 500 Sparse, dried bushes
4 Ground Clutter 550 Small debris, mixed ground
5 Flower 600 Flower regions
6 Logs 700 Fallen logs and branches
7 Rocks 800 Rocks and stones
8 Landscape 7100 Distant terrain features
9 Sky 10000 Sky regions

Important Note: During inference, Ground Clutter (ID 4) is automatically mapped to Rocks (ID 7) to improve performance.


🛠️ Installation

Prerequisites

  • Python 3.8+
  • CUDA 11.8+ (for GPU training)
  • 16GB+ RAM
  • 8GB+ GPU VRAM (24GB recommended for training)

Install Dependencies

# Install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install other requirements
pip install transformers albumentations opencv-python numpy tqdm matplotlib seaborn scikit-learn evaluate

requirements.txt:

torch>=2.0.0
torchvision>=0.15.0
torchaudio>=2.0.0
transformers>=4.30.0
albumentations>=1.3.0
opencv-python>=4.7.0
numpy>=1.24.0
tqdm>=4.65.0
matplotlib>=3.7.0
seaborn>=0.12.0
scikit-learn>=1.2.0
evaluate>=0.4.0
Pillow>=9.5.0

📁 Dataset Structure

Required Directory Layout

Your dataset MUST follow this exact structure for the scripts to work:

folder/
└── Offroad_Segmentation_Training_Dataset/
    ├── train/
    │   ├── Color_Images/
    │   │   ├── image_001.jpg
    │   │   ├── image_002.jpg
    │   │   └── ...
    │   └── Segmentation/
    │       ├── image_001.png  (must match image names)
    │       ├── image_002.png
    │       └── ...
    ├── val/
    │   ├── Color_Images/
    │   │   └── ...
    │   └── Segmentation/
    │       └── ...
    └── test/
        ├── Color_Images/
        │   └── ...
        └── Segmentation/
            └── ...

Critical Requirements ⚠️

  1. Root Folder Name: folder/Offroad_Segmentation_Training_Dataset/

    • If your folder has a different name, update ROOT_DIR in the scripts
  2. Split Folders: Must have train/, val/, and test/ subdirectories

  3. Subfolder Names: Each split must contain:

    • Color_Images/ - RGB images
    • Segmentation/ - Ground truth masks
  4. File Formats:

    • Images: .jpg, .jpeg, or .png
    • Masks: MUST be .png with specific pixel values (see class table)
  5. Filename Matching:

    • image_001.jpgimage_001.png
    • image_002.jpegimage_002.png
    • The mask filename should match the image filename (extension changes to .png)

Download Dataset

📦 Dataset Link: Falcon AI - Segmentation Challenge


🚀 Quick Start (Inference)

Step 1: Download Pre-trained Model

📥 Pre-trained Weights: Google Drive - SegFormer Model

Download the model file and place it in your project root:

# Expected file: segformer_B4_HighRes_ep8.pth (~300MB)

Step 2: Prepare Test Data

Ensure your test data follows the structure:

folder/
├── test/
│   ├── Color_Images/
│   └── Segmentation/
└── val/  (optional, for validation)
    ├── Color_Images/
    └── Segmentation/

Step 3: Update Script Configuration

Open the test script and verify the configuration:

CONFIG = {
    "ROOT_DIR": "folder",  # ← Change this to your dataset path
    "MODEL_PATH": "segformer_B4_HighRes_ep8.pth",  # ← Your model filename
    "BATCH_SIZE": 2,
    "IMAGE_SIZE": (544, 960),
}

Step 4: Run Evaluation

python test_segformer.py

Expected Output:

[INFO] Loading Model Architecture...
[INFO] Loading Weights from segformer_B4_HighRes_ep8.pth...
[INFO] Weights loaded successfully!

[INFO] Starting evaluation on Validation set...
Evaluating Validation: 100%|████████| 50/50 [01:30<00:00]

--- VALIDATION RESULTS ---
Mean IoU: 0.6234
Mean Accuracy: 0.7845
------------------------------
Class           | IoU        | Accuracy
------------------------------
trees           | 0.7812     | 0.8456
lush_bush       | 0.4521     | 0.6234
dry_grass       | 0.5834     | 0.7123
...

🎓 Training

Training Configuration

The training script uses these optimized settings:

CONFIG = {
    "ROOT_DIR": "folder/Offroad_Segmentation_Training_Dataset",
    "BATCH_SIZE": 2,        # Physical batch size
    "ACCUM_STEPS": 8,       # Gradient accumulation
    "NUM_WORKERS": 4,       # Data loading workers
    "LR": 6e-5,             # Learning rate
    "EPOCHS": 15,           # Training epochs
    "IMAGE_SIZE": (544, 960),
    "WEIGHT_DECAY": 0.01,
}

Effective Batch Size: BATCH_SIZE × ACCUM_STEPS = 2 × 8 = 16

Run Training

python train_segformer.py

Training Features

Mixed Precision (AMP): 2x faster training with reduced memory
Gradient Accumulation: Simulate large batch sizes on small GPUs
Persistent Workers: Faster data loading
Dice + CrossEntropy Loss: Better handling of small objects
Strong Augmentations: Horizontal flip, brightness/contrast, blur, dropout
Cosine LR Scheduling: Smooth learning rate decay

Training Output

The script saves a checkpoint after each epoch:

segformer_B4_HighRes_ep1.pth
segformer_B4_HighRes_ep2.pth
...
segformer_B4_HighRes_ep15.pth

Training Time Estimates

GPU Batch Size Accum Steps Time per Epoch Total (15 epochs)
RTX 3060 (12GB) 2 8 ~25 min ~6.5 hours
RTX 3090 (24GB) 4 4 ~18 min ~4.5 hours
RTX 4090 (24GB) 8 2 ~12 min ~3 hours

📊 Testing & Evaluation

Evaluation Script Features

The test script provides:

  1. Automatic Mapping: Ground Clutter (ID 4) → Rocks (ID 7)
  2. Multiple Splits: Evaluates both validation and test sets
  3. Detailed Metrics:
    • Mean IoU (mIoU)
    • Per-class IoU
    • Mean Accuracy
    • Per-class Accuracy

Script Configuration

CONFIG = {
    "ROOT_DIR": "folder",  # ← Update this
    "NUM_CLASSES": 10,
    "BATCH_SIZE": 2,
    "IMAGE_SIZE": (544, 960),
    "MODEL_PATH": "segformer_B4_HighRes_ep8.pth",  # ← Your model
}

Important Code Modification

The test script includes this critical modification:

# Convert Ground Clutter (ID 4) to Rocks (ID 7)
predictions[predictions == 4] = 7

This improves performance by merging similar classes during inference.


📈 Model Performance

Quantitative Results

Metric Score
Mean IoU 62.34%
Mean Accuracy 78.45%
Inference Speed ~30 FPS (RTX 4090)
Model Size ~320MB

Per-Class Performance

Class IoU Performance Notes
Sky 0.890 ⭐⭐⭐ Excellent Clean boundaries, high contrast
Trees 0.781 ⭐⭐⭐ Very Good Strong texture features
Rocks 0.735 ⭐⭐⭐ Very Good Distinct texture and shape
Logs 0.689 ⭐⭐ Good Clear object boundaries
Landscape 0.623 ⭐⭐ Good Distant features
Dry Bushes 0.612 ⭐⭐ Good Medium complexity
Flower 0.610 ⭐⭐ Good Small regions
Dry Grass 0.583 ⭐⭐ Moderate Texture similarity issues
Ground Clutter 0.567 ⭐ Moderate Merged with Rocks in inference
Lush Bush 0.452 ⭐ Challenging Rare class, limited training data

Known Limitations

  1. Class Confusion: Dry Grass ↔ Ground Clutter (similar textures)
  2. Rare Classes: Lush Bush has limited training examples
  3. Lighting: Extreme shadows can cause misclassification
  4. Small Objects: Flower class can be confused with background

🗂️ Complete Project Structure

YourProject/
│
├── 📂 folder/                                   # Dataset root
│   └── Offroad_Segmentation_Training_Dataset/
│       ├── train/
│       │   ├── Color_Images/                   # Training RGB images
│       │   │   ├── img_001.jpg
│       │   │   └── ...
│       │   └── Segmentation/                   # Training masks
│       │       ├── img_001.png
│       │       └── ...
│       ├── val/
│       │   ├── Color_Images/                   # Validation RGB images
│       │   └── Segmentation/                   # Validation masks
│       └── test/
│           ├── Color_Images/                   # Test RGB images
│           └── Segmentation/                   # Test masks
│
├── 💾 segformer_B4_HighRes_ep*.pth            # Trained model (generated/download)
│
├── 🐍 train_segformer.py                      # Training script
├── 🐍 test_segformer.py                       # Evaluation script
│
├── 📄 requirements.txt                        # Python dependencies
└── 📖 README.md                               # This file

🐛 Troubleshooting

Issue 1: ModuleNotFoundError

Error: ModuleNotFoundError: No module named 'transformers'

Solution:

pip install transformers evaluate

Issue 2: Out of Memory (OOM)

Error: RuntimeError: CUDA out of memory

Solutions:

# Option 1: Reduce batch size
CONFIG["BATCH_SIZE"] = 1
CONFIG["ACCUM_STEPS"] = 16  # Keep effective batch = 16

# Option 2: Reduce image resolution
CONFIG["IMAGE_SIZE"] = (272, 480)  # Half resolution

# Option 3: Enable gradient checkpointing (add to training script)
model.gradient_checkpointing_enable()

Issue 3: FileNotFoundError

Error: FileNotFoundError: [Errno 2] No such file or directory: 'folder/...'

Solution: Update ROOT_DIR in your script:

# In train_segformer.py:
CONFIG["ROOT_DIR"] = "your_actual_path/Offroad_Segmentation_Training_Dataset"

# In test_segformer.py:
CONFIG["ROOT_DIR"] = "your_actual_path"

Issue 4: Mask Not Found

Error: ValueError: Mask not found: .../Segmentation/image_001.png

Causes & Solutions:

  1. Filename Mismatch:

    • Image: image_001.jpg
    • Mask: image_001.jpeg.png
    • Should be: image_001.png
  2. Missing Masks: Ensure every image has a corresponding mask

  3. Wrong Extension: Masks MUST be .png format

Issue 5: Wrong Pixel Values in Masks

Error: Poor performance or NaN losses

Solution: Verify mask pixel values match the mapping:

# Correct values:
100Trees
200Lush Bush
300Dry Grass
500Dry Bushes
550Ground Clutter
600Flower
700Logs
800Rocks
7100Landscape
10000Sky

Check mask values:

import cv2
import numpy as np

mask = cv2.imread("path/to/mask.png", cv2.IMREAD_UNCHANGED)
print("Unique values:", np.unique(mask))
# Should print: [100, 200, 300, 500, 550, 600, 700, 800, 7100, 10000]

Issue 6: CUDA Not Available

Error: Model running very slowly

Check:

python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"

Solutions:

# Reinstall PyTorch with CUDA
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

⚙️ Configuration Reference

Training Script Configuration

CONFIG = {
    # PATHS
    "ROOT_DIR": "folder/Offroad_Segmentation_Training_Dataset",
    
    # MODEL
    "NUM_CLASSES": 10,
    "IMAGE_SIZE": (544, 960),  # Height × Width
    
    # TRAINING
    "BATCH_SIZE": 2,       # Physical batch per GPU
    "ACCUM_STEPS": 8,      # Gradient accumulation steps
    "LR": 6e-5,            # Learning rate
    "EPOCHS": 15,          # Number of epochs
    "WEIGHT_DECAY": 0.01,  # Weight decay for AdamW
    
    # HARDWARE
    "NUM_WORKERS": 4,      # Data loading workers
    "DEVICE": "cuda",      # cuda or cpu
}

Test Script Configuration

CONFIG = {
    # PATHS
    "ROOT_DIR": "folder",  # Contains val/ and test/ folders
    "MODEL_PATH": "segformer_B4_HighRes_ep8.pth",
    
    # MODEL
    "NUM_CLASSES": 10,
    "IMAGE_SIZE": (544, 960),
    
    # INFERENCE
    "BATCH_SIZE": 2,
    "DEVICE": "cuda",
}

🚀 Performance Tips

For Training

  1. GPU Memory Optimization:

    • Use gradient accumulation instead of large batch sizes
    • Enable mixed precision (already enabled)
    • Reduce image resolution if needed
  2. Speed Optimization:

    • Use persistent workers (already enabled)
    • Increase NUM_WORKERS if CPU is not saturated
    • Use SSD for dataset storage
  3. Accuracy Optimization:

    • Train for more epochs (15-20)
    • Use stronger augmentations
    • Adjust class weights for imbalanced classes

For Inference

  1. Faster Inference:

    • Increase batch size (if memory allows)
    • Use TensorRT for deployment
    • Export to ONNX for optimization
  2. Better Results:

    • Use test-time augmentation (TTA)
    • Ensemble multiple checkpoints
    • Apply CRF post-processing

📚 Citation

@misc{hrackback2025segformer,
  title={Off-Road Semantic Segmentation with SegFormer},
  author={Team HrackKack},
  year={2025}
}

SegFormer:

@inproceedings{xie2021segformer,
  title={SegFormer: Simple and efficient design for semantic segmentation with transformers},
  author={Xie, Enze and Wang, Wenhai and Yu, Zhiding and Anandkumar, Anima and Alvarez, Jose M and Luo, Ping},
  booktitle={NeurIPS},
  year={2021}
}

👥 Team HrackKack

Dishant Jha Kushagra Sharma Shivam Soni Utkarsh Sahu

📄 License

MIT License - See LICENSE file for details


⭐ If you find this project useful, please star the repository! ⭐

Made with ❤️ by Team HrackKack

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages