RAM+

A minimal, inference-only Python package for RAM++ (Recognize Anything Plus Model), an open-set image tagger that recognizes its built-in categories and generalizes zero-shot to categories it was not trained on.

Based on the original recognize-anything by Xinyu Huang et al.

What is RAM++?

RAM++ is a vision-language model that generates semantic tags for images. It covers 4,585 common categories out of the box and generalizes to open-set categories it has never seen during training — significantly outperforming CLIP on tag recognition tasks.

This package strips out training, finetuning, and demo code — leaving a clean API for running inference.

Installation

pip install git+https://github.com/aka-vm/ram.git

With uv:

uv pip install git+https://github.com/aka-vm/ram.git

From source:

git clone https://github.com/aka-vm/ram
cd ram
uv sync          # or: pip install -e .

Model Download

The model (~850 MB) is hosted on HuggingFace and can be downloaded automatically:

from ram_plus import download_model

model_path = download_model()          # saves to ~/.cache/ram_plus/
model_path = download_model("./models")  # or a custom directory

Or manually from HuggingFace.
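
When running offline or in CI, it can help to check for an existing checkpoint before triggering the ~850 MB download. The sketch below uses only the standard library. `find_checkpoint` is a hypothetical helper, not part of `ram_plus`, and it assumes the cached file keeps the name `ram_plus_swin_large_14m.pth` shown in the Usage section and sits directly inside the cache directory; the package may lay files out differently.

```python
from pathlib import Path
from typing import Optional

# Assumptions: the default cache directory and checkpoint file name as they
# appear elsewhere in this README. Neither is confirmed by the package itself.
DEFAULT_CACHE_DIR = Path.home() / ".cache" / "ram_plus"
CHECKPOINT_NAME = "ram_plus_swin_large_14m.pth"


def find_checkpoint(cache_dir: Path = DEFAULT_CACHE_DIR,
                    filename: str = CHECKPOINT_NAME) -> Optional[Path]:
    """Return the checkpoint path if a non-empty file exists there, else None."""
    candidate = Path(cache_dir).expanduser() / filename
    if candidate.is_file() and candidate.stat().st_size > 0:
        return candidate
    return None
```

If `find_checkpoint()` returns a path, it can be passed as `model_path` to `RamTagGenerator`; otherwise fall back to `download_model()`.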

Usage

import cv2
from ram_plus import RamTagGenerator

# Auto-downloads model if model_path is not provided
generator = RamTagGenerator(device="cuda")

# Or point to a local checkpoint
generator = RamTagGenerator(model_path="./models/ram_plus_swin_large_14m.pth", device="cuda")

# Run on a single image (numpy HWC BGR, as returned by cv2)
image = cv2.imread("photo.jpg")
tags = generator(image)
print(tags)  # ['dog', 'grass', 'outdoors', ...]

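# Batch inference (image_paths is any list of image file paths; these names are placeholders)
image_paths = ["photo1.jpg", "photo2.jpg"]
images = [cv2.imread(p) for p in image_paths]
batch_tags = generator(images)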

# Sort tags by confidence
generator = RamTagGenerator(device="cuda", sort_tags=True)

# Pass a pre-normalized torch.Tensor directly (NCHW, float32, ImageNet-normalized).
# tensor_batch must be prepared by the caller.
tags = generator(tensor_batch)
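
For a large collection of images, loading everything into one list before calling the generator may exhaust memory. A minimal sketch of fixed-size batching follows. `chunked` is a hypothetical helper, not part of `ram_plus`, and the batch size of 2 is arbitrary.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Example with placeholder file names. The commented lines show where the
# README's own calls (cv2.imread and generator) would go.
paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
batches = list(chunked(paths, 2))
# for batch in batches:
#     images = [cv2.imread(p) for p in batch]
#     batch_tags = generator(images)
```

A suitable batch size depends on available GPU memory and image resolution, so it is worth tuning per machine.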

Input Formats

| Input | Treated as |
| --- | --- |
| `np.ndarray` (HWC BGR uint8) | Single image |
| `List[np.ndarray]` | Batch |
| `torch.Tensor` (NCHW float32, ImageNet-normalized) | Pre-processed batch |
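
The tensor input path leaves preprocessing to the caller. The sketch below shows the conversion the table implies, from an HWC BGR uint8 image to a CHW float32 ImageNet-normalized array, using NumPy only. `to_normalized_chw` and `to_batch` are hypothetical helpers, not part of `ram_plus`. The mean and standard deviation are the standard ImageNet values. Resizing to the model's input resolution is deliberately omitted because this README does not state it.

```python
import numpy as np

# Standard ImageNet channel statistics, in RGB order.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_normalized_chw(image_bgr: np.ndarray) -> np.ndarray:
    """Convert an HWC BGR uint8 image to a CHW float32 ImageNet-normalized array."""
    if image_bgr.ndim != 3 or image_bgr.shape[2] != 3:
        raise ValueError("expected an HWC image with 3 channels")
    if image_bgr.dtype != np.uint8:
        raise ValueError("expected uint8 pixel values")
    # Reverse the channel axis (BGR -> RGB) and scale to [0, 1].
    rgb = image_bgr[:, :, ::-1].astype(np.float32) / 255.0
    # Broadcast the per-channel statistics over the last (channel) axis.
    normalized = (rgb - IMAGENET_MEAN) / IMAGENET_STD
    # Move channels first: HWC -> CHW.
    return np.ascontiguousarray(normalized.transpose(2, 0, 1))


def to_batch(images_bgr: list) -> np.ndarray:
    """Stack same-sized images into an NCHW float32 batch."""
    return np.stack([to_normalized_chw(img) for img in images_bgr], axis=0)
```

The resulting array can be wrapped with `torch.from_numpy(batch)` before it is passed to the generator. All images in a batch must share the same height and width, so resize them first.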

Acknowledgements

Original model and training: Xinyu Huang et al.
Paper: Open-Set Image Tagging with Multi-Grained Text Supervision (RAM++), which builds on Recognize Anything: A Strong Image Tagging Model (RAM)
