A lightweight fuzzy-matching intent parser built on rapidfuzz.
Finds the closest matching intent by comparing the utterance against all training sentences using configurable fuzzy similarity strategies. Handles spelling errors, word-order variation, contractions, and natural phrasing that exact-match parsers would miss. Best suited for small-to-medium intent sets (dozens to hundreds of training sentences per intent).
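
The closest-match idea described above can be sketched with the standard library alone. This is not nebulento's code: `difflib.SequenceMatcher` stands in for a rapidfuzz scorer, and the `TRAINING` data, `similarity`, and `closest_intent` names are hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical training data; intent names and sentences are illustrative only.
TRAINING = {
    "hello": ["hello", "hi", "how are you"],
    "weather": ["what is the weather", "weather forecast"],
}

def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (stdlib stand-in for a rapidfuzz scorer)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def closest_intent(utterance: str, training: dict) -> tuple:
    """Return (intent_name, best_sentence, score) for the highest-scoring sentence."""
    best = ("", "", 0.0)
    for name, sentences in training.items():
        for sentence in sentences:
            score = similarity(utterance, sentence)
            if score > best[2]:
                best = (name, sentence, score)
    return best

# A misspelled utterance still lands on the nearest training sentence.
name, sentence, score = closest_intent("whats the wether", TRAINING)
print(name, sentence, round(score, 2))
```

Because every training sentence is scored, the cost grows linearly with the number of sentences, which is consistent with the small-to-medium sizing advice above.
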
pip install nebulento

For the OVOS pipeline plugin:

pip install "nebulento[ovos]"

from nebulento import IntentContainer, MatchStrategy
container = IntentContainer(fuzzy_strategy=MatchStrategy.TOKEN_SET_RATIO)
container.add_intent("hello", ["hello", "hi", "how are you", "what's up"])
container.add_intent("buy", ["buy {item}", "purchase {item}", "get {item} for me"])
container.add_entity("item", ["milk", "cheese"])
container.calc_intent("hello")
# {'name': 'hello', 'conf': 1.0, 'entities': {}, 'best_match': 'hello',
# 'utterance': 'hello', 'utterance_consumed': 'hello', 'utterance_remainder': '',
# 'match_strategy': 'TOKEN_SET_RATIO'}
container.calc_intent("buy milk")
# {'name': 'buy', 'conf': 0.719, 'entities': {'item': ['milk']},
# 'best_match': 'buy {item}', ...}

| Syntax | Meaning |
|---|---|
| `(one\|of\|these)` | Alternation: expands to one variant per combination |
| `[optional]` | Optional word or phrase |
| `{entity}` | Capture group, matched against registered entity samples |

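
To make "one variant per combination" concrete, here is a minimal sketch of template expansion. It is an illustration under assumptions, not nebulento's implementation: the `expand` function is hypothetical, nested groups are not handled, and `{entity}` slots are left untouched.

```python
import itertools
import re

# Matches "(a|b|c)" alternations or "[optional]" groups; nesting is not handled.
_GROUP = re.compile(r"\(([^()]*)\)|\[([^\[\]]*)\]")

def expand(template: str) -> list:
    """Expand alternation and optional groups into plain sentences."""
    parts = []  # each entry is the list of choices for one segment
    pos = 0
    for m in _GROUP.finditer(template):
        parts.append([template[pos:m.start()]])   # literal text before the group
        if m.group(1) is not None:
            parts.append(m.group(1).split("|"))    # (a|b): one choice per option
        else:
            parts.append([m.group(2), ""])         # [x]: present or absent
        pos = m.end()
    parts.append([template[pos:]])                 # trailing literal text
    # Cartesian product over segments, then normalise whitespace.
    variants = (" ".join("".join(combo).split())
                for combo in itertools.product(*parts))
    return list(dict.fromkeys(variants))           # de-duplicate, keep order

for variant in expand("(buy|purchase) {item} [please]"):
    print(variant)
```

Two alternatives combined with one optional word yield four sentences, so templates with many groups multiply quickly.
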
Choose a strategy via IntentContainer(fuzzy_strategy=MatchStrategy.X).
| Strategy | Best for | FP risk |
|---|---|---|
| `DAMERAU_LEVENSHTEIN_SIMILARITY` | Spelling errors, zero false positives | Low (default) |
| `TOKEN_SET_RATIO` | Natural phrasing, word-order variation | High |
| `SIMPLE_RATIO` | General use, balanced recall/precision | Medium |
| `TOKEN_SORT_RATIO` | Same words, different order | Medium |
| `PARTIAL_RATIO` | Substring presence; avoid for intent gating | Very high |

See docs/strategies.md for the full comparison table and benchmark rows.
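
The trade-offs in the table can be illustrated with simplified, stdlib-only versions of three scorers. These are approximations written for this example (the function names mirror the strategy names, but the bodies use `difflib` rather than rapidfuzz, so exact scores will differ from nebulento's).

```python
from difflib import SequenceMatcher

def simple_ratio(a: str, b: str) -> float:
    """Plain character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def token_sort_ratio(a: str, b: str) -> float:
    """Sort the words of both strings before comparing, so order is ignored."""
    return simple_ratio(" ".join(sorted(a.split())), " ".join(sorted(b.split())))

def token_set_ratio(a: str, b: str) -> float:
    """Compare the shared words against each string's full word set."""
    ta, tb = set(a.split()), set(b.split())
    shared = " ".join(sorted(ta & tb))
    full_a = (shared + " " + " ".join(sorted(ta - tb))).strip()
    full_b = (shared + " " + " ".join(sorted(tb - ta))).strip()
    return max(simple_ratio(shared, full_a),
               simple_ratio(shared, full_b),
               simple_ratio(full_a, full_b))

reordered = ("turn on the lights", "the lights turn on")
superset = ("play music", "play music in the kitchen now")

print(simple_ratio(*reordered), token_sort_ratio(*reordered))
print(simple_ratio(*superset), token_set_ratio(*superset))
```

In this sketch the token-set scorer returns a perfect score whenever one utterance's words are a subset of the other's, which is one plausible reason a set-based strategy carries a higher false-positive risk than a character-level one.
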
Nebulento ships as an OVOS pipeline plugin (ovos-nebulento-pipeline-plugin).
{
  "intents": {
    "pipeline": [
      "ovos-nebulento-pipeline-plugin"
    ]
  }
}

Configure the fuzzy strategy and confidence thresholds:

{
  "intents": {
    "nebulento": {
      "strategy": "TOKEN_SET_RATIO",
      "conf_high": 0.95,
      "conf_med": 0.80,
      "conf_low": 0.50
    }
  }
}

Entry point: nebulento.opm:NebulentoPipeline

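
As a rough illustration of how the three thresholds could partition match confidences, the sketch below maps a score to a tier. The `confidence_tier` function, the tier names, and the inclusive `>=` comparison are assumptions made for this example; the plugin's actual behaviour may differ.

```python
# Thresholds copied from the example configuration above.
CONFIG = {"conf_high": 0.95, "conf_med": 0.80, "conf_low": 0.50}

def confidence_tier(conf: float, config: dict = CONFIG):
    """Map a match confidence to a tier name, or None when below conf_low.

    Assumption: thresholds are inclusive lower bounds.
    """
    if conf >= config["conf_high"]:
        return "high"
    if conf >= config["conf_med"]:
        return "medium"
    if conf >= config["conf_low"]:
        return "low"
    return None

# Confidences taken from the calc_intent examples earlier in this README.
print(confidence_tier(1.0))
print(confidence_tier(0.719))
```

Under these assumptions the `buy milk` example (confidence 0.719) would only qualify for the low tier, so raising or lowering `conf_med` directly changes which matches are accepted early.
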
| Page | Description |
|---|---|
| Quickstart | 5-minute guide: intents, entities, strategies |
| Intent API | Full IntentContainer and DomainIntentContainer reference |
| Match Strategies | All 9 strategies with benchmark data and decision table |
| Template Syntax | `(a\|b)`, `[opt]`, `{slot}`, `:0` padatious syntax, expansion rules |
| Entity Extraction | Registration, confidence boost, result fields |
| Normalisation | Apostrophes, whitespace, case handling |
| Domain Matching | DomainIntentContainer two-stage matching |
| OVOS Pipeline Plugin | Bus events, confidence tiers, comparison with Padatious |
| Configuration | All config keys with types, defaults, and effect |
| Benchmark | Full accuracy results across all strategies |
| Troubleshooting | False positives, low recall, entity issues, lru_cache gotchas |
268 test cases: 244 natural human utterances across 22 intents, 24 deliberate no-match cases.

| Engine | Accuracy | Precision | Recall | F1 | False positives | Median time |
|---|---|---|---|---|---|---|
| padaos (regex) | 25.4% | 100% | 18.0% | 0.306 | 0 / 24 | 0.07 ms |
| padatious (neural) | 53.4% | 96.9% | 50.4% | 0.663 | 4 / 24 | 1.1 ms |
| nebulento token-set-ratio | 50.4% | 88.3% | 52.5% | 0.658 | 17 / 24 | 6.3 ms |
| nebulento damerau-levenshtein | 38.8% | 100% | 32.8% | 0.494 | 0 / 24 | 6.8 ms |

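
The F1 column is the harmonic mean of precision and recall, $F_1 = 2PR / (P + R)$. The short sketch below recomputes it from the rounded percentages in the table; the `f1_score` helper is written for this example and is not part of the benchmark script. Because the inputs are rounded, a recomputed value can differ from the table in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall taken from the table rows above.
rows = {
    "padatious (neural)": (0.969, 0.504),
    "nebulento token-set-ratio": (0.883, 0.525),
    "nebulento damerau-levenshtein": (1.0, 0.328),
}
for engine, (p, r) in rows.items():
    print(engine, round(f1_score(p, r), 3))
```
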
python benchmark/compare.py

Originally an experimental research project by TigreGoticoLda, polished and donated to OpenVoiceOS as part of the NLnet NGI0 Commons Fund under grant agreement No 101135429.

Apache 2.0
