feat: grep_rag implementation by mBerasategui-ehu · Pull Request #428 · Lamb-Project/lamb

mBerasategui-ehu · 2026-06-23T19:16:00Z

Grep RAG — Implementation

Overview

Grep RAG is a complementary search layer that works alongside an embedding-based RAG processor. A nano LLM (the configured small-fast-model) drives iterative grep/egrep/ripgrep-style searches across all files in the assistant's knowledge bases (KBs). All searches run in memory via Python's re module; nothing is shelled out to external binaries.

While embeddings find semantically similar chunks, grep finds exact keyword matches, including terms the embedding model might miss, synonyms the nano model discovers across iterations, and content in rarely accessed sections of the KB.

Two Operating Modes

| Mode | Behavior | Companion RAG |
|------|----------|---------------|
| `hybrid` | Run grep AND simple_rag in parallel, merge results | Always `simple_rag` (hardcoded: grep handles precision; simple semantic coverage suffices) |
| `primary` | Grep runs first. If zero matches after max tries → fall back to user-selected RAG | User picks from KB-based RAGs (`simple_rag`, `context_aware_rag`, `hierarchical_rag`) |

Backend: `grep_rag.py`

Architecture

rag_processor(messages, assistant, request)
  │
  ├─ _parse_config()           → Read grep_mode, max_tries, etc. from metadata JSON
  ├─ _extract_user_question()  → Get last user message from conversation
  ├─ _resolve_kb_documents()   → Fetch KB file list + content via KB Server API
  │
  ├─ MODE: primary
  │   ├─ _run_grep_search()    → Iterative nano LLM + regex loop
  │   ├─ matches found?        → _build_grep_response()
  │   └─ no matches?           → _run_fallback_rag(fallback_rag_name)
  │
  └─ MODE: hybrid
      ├─ _run_grep_search()    → Async grep loop
      ├─ _run_fallback_rag("simple_rag") → Parallel embedding RAG (always simple_rag)
      └─ _merge_contexts()     → Combine both result sets
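
The branching in the diagram above can be sketched roughly as follows. This is an illustration only, not the code in `grep_rag.py`: `run_grep_search`, `run_fallback_rag`, and `dispatch` are hypothetical stand-ins that return plain strings, and the real helpers take more arguments.

```python
import asyncio


# Hypothetical stand-ins for the real helpers; each returns a context string.
async def run_grep_search(question: str) -> str:
    return "grep: match" if "error" in question else ""


async def run_fallback_rag(name: str, question: str) -> str:
    return f"{name}: chunk"


async def dispatch(question: str, mode: str, fallback_rag: str = "simple_rag") -> str:
    if mode == "primary":
        # Grep first; consult the user-selected RAG only when grep finds nothing.
        grep_context = await run_grep_search(question)
        if grep_context:
            return grep_context
        return await run_fallback_rag(fallback_rag, question)

    # Hybrid: run both concurrently; the companion is always simple_rag.
    grep_context, rag_context = await asyncio.gather(
        run_grep_search(question),
        run_fallback_rag("simple_rag", question),
    )
    # RAG chunks come first, grep results after.
    return "\n".join(part for part in (rag_context, grep_context) if part)
```

Using `asyncio.gather` here is one way to get the "in parallel" behavior of hybrid mode; whether the PR uses it is not stated in the description.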

Key Functions

`_resolve_kb_documents(assistant)`

- Calls `GET /collections/{id}/files` on the KB server for each attached KB
- For each file entry, tries to fetch content from the following sources, in order:
  - `processing_stats.output_files.markdown_url` (converted markdown)
  - `file_url` (original file, may be binary)
  - `processing_stats.markdown_preview` (first ~2000 chars, last resort)
- Filters out binary content via the `_is_text_content()` heuristic
- Returns a list of `{file_path, original_filename, content, metadata, collection_id}` entries
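
A minimal sketch of the source-priority and text-filtering steps listed above. The field names come from the description; the function names `candidate_sources` and `looks_like_text` are hypothetical, and the heuristic shown (reject NUL bytes and mostly non-printable content) is an assumption, since the PR does not spell out what `_is_text_content()` checks.

```python
def candidate_sources(file_entry: dict) -> list[tuple[str, str]]:
    """Return (kind, value) pairs in the priority order listed above."""
    stats = file_entry.get("processing_stats") or {}
    output_files = stats.get("output_files") or {}
    candidates = [
        ("url", output_files.get("markdown_url")),   # converted markdown
        ("url", file_entry.get("file_url")),          # original file, may be binary
        ("inline", stats.get("markdown_preview")),    # truncated preview, last resort
    ]
    return [(kind, value) for kind, value in candidates if value]


def looks_like_text(content: str, max_bad_ratio: float = 0.05) -> bool:
    """Assumed heuristic: reject NUL bytes and mostly non-printable content."""
    if not content or "\x00" in content:
        return False
    bad = sum(1 for ch in content if not (ch.isprintable() or ch in "\n\r\t"))
    return bad / len(content) <= max_bad_ratio
```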

`_fetch_text_content(url, headers)`

Handles three URL types:

- `localhost:*` URLs: the Docker container can't reach them, so it tries the local filesystem first, then rewrites the host to `kb:9090` (the KB server's internal Docker hostname)
- Relative URLs (`/static/...`): prefixed with the KB server base URL
- Absolute URLs: fetched directly via HTTP
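
The URL handling above might look roughly like this. It is a sketch under assumptions: `resolve_fetch_url` and `KB_BASE_URL` are hypothetical names, the base URL scheme is assumed to be plain HTTP, and the local-filesystem attempt that precedes the rewrite is left out.

```python
from urllib.parse import urlsplit, urlunsplit

KB_BASE_URL = "http://kb:9090"  # assumed internal base URL of the KB server


def resolve_fetch_url(url: str, kb_base_url: str = KB_BASE_URL) -> str:
    parts = urlsplit(url)
    if not parts.scheme and not parts.netloc:
        # Relative URL such as /static/...: prefix with the KB server base URL.
        return kb_base_url.rstrip("/") + "/" + url.lstrip("/")
    if parts.hostname == "localhost":
        # Unreachable from inside the container: swap in the internal host,
        # keeping path, query, and fragment unchanged.
        base = urlsplit(kb_base_url)
        return urlunsplit((base.scheme, base.netloc, parts.path, parts.query, parts.fragment))
    return url  # absolute URL: fetch as-is
```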

`_run_grep_search(user_question, documents, ...)`

The nano model search loop:

For each attempt (1..grep_max_tries):
  1. _ask_nano_model()         → LLM proposes TOOL + PATTERN + FLAGS + REASON
  2. Skip duplicate (tool, pattern) combos
  3. _search_across_documents() → Python re.search across all KB file content
  4. Accumulate matches
  5. If nano says DONE → break

Fallback when no nano model configured: single grep with user-question keyword extraction.
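
The loop can be sketched as below, with the nano model replaced by a plain callable so the example runs without an LLM. All names (`run_grep_loop`, `propose`, the tuple shapes) are hypothetical. The sketch also assumes that a skipped duplicate still consumes one try, and that DONE is signaled in place of a proposal. Neither detail is confirmed by the description.

```python
import re
from typing import Callable, Optional

# A proposal is (tool, pattern); None means the model answered DONE.
Proposal = Optional[tuple[str, str]]


def run_grep_loop(
    documents: dict[str, str],
    propose: Callable[[list[tuple[str, str, int]]], Proposal],
    max_tries: int = 5,
) -> list[tuple[str, int, str]]:
    seen: set[tuple[str, str]] = set()
    history: list[tuple[str, str, int]] = []   # (tool, pattern, match_count)
    matches: list[tuple[str, int, str]] = []   # (file_path, line_number, line)
    for _ in range(max_tries):
        proposal = propose(history)
        if proposal is None:        # model is satisfied
            break
        if proposal in seen:        # duplicate (tool, pattern): skip the search
            continue
        seen.add(proposal)
        tool, pattern = proposal
        try:
            regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
        except re.error:
            history.append((tool, pattern, 0))  # let the model see the miss
            continue
        found = [
            (path, number, line)
            for path, text in documents.items()
            for number, line in enumerate(text.splitlines(), start=1)
            if regex.search(line)
        ]
        history.append((tool, pattern, len(found)))
        matches.extend(found)
    return matches
```

Passing the history back into `propose` mirrors the user prompt, which shows the model its previous tries and match counts.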

`_search_across_documents(documents, pattern, tool, context_lines)`

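- Compiles the regex with `re.IGNORECASE | re.MULTILINE` (`re.IGNORECASE` mirrors `grep -i`; `re.MULTILINE` lets `^` and `$` match at line boundaries)
- All three tools (grep, egrep, ripgrep) use Python `re` internally, with no shelling out
- An invalid regex is caught and returns `[]`, so the nano model can retry with a different pattern
- Context lines are extracted around each match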
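
A minimal sketch of the per-document search, assuming matching is done line by line (so a pattern cannot span lines). The function name `search_document` and the result keys are hypothetical, not the PR's actual data shape.

```python
import re


def search_document(text: str, pattern: str, context_lines: int = 3) -> list[dict]:
    try:
        regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
    except re.error:
        return []  # invalid pattern: report no matches so the caller can retry
    lines = text.splitlines()
    results = []
    for index, line in enumerate(lines):
        if regex.search(line):
            # Clamp the context window to the bounds of the document.
            start = max(0, index - context_lines)
            end = min(len(lines), index + context_lines + 1)
            results.append({
                "line_number": index + 1,   # 1-based, like grep -n
                "start_line": start + 1,
                "end_line": end,
                "context": "\n".join(lines[start:end]),
            })
    return results
```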

`_deduplicate_matches(matches, context_lines)`

- Groups matches by `file_path`
- Merges overlapping context ranges within each file
- Keeps matches from different files separate
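
The range merging can be illustrated with a standard interval-merge pass. This is a sketch, assuming matches arrive as `(file_path, line_number)` pairs; `merge_match_ranges` is a hypothetical name, and the upper bound of each range is not clamped because file lengths are not known here.

```python
from collections import defaultdict


def merge_match_ranges(
    matches: list[tuple[str, int]], context_lines: int
) -> dict[str, list[tuple[int, int]]]:
    # Expand every match into an inclusive (start, end) line range, per file.
    by_file = defaultdict(list)
    for file_path, line_number in matches:
        start = max(1, line_number - context_lines)
        by_file[file_path].append((start, line_number + context_lines))

    merged = {}
    for file_path, ranges in by_file.items():
        ranges.sort()
        combined = [ranges[0]]
        for start, end in ranges[1:]:
            last_start, last_end = combined[-1]
            if start <= last_end:   # ranges share at least one line
                combined[-1] = (last_start, max(last_end, end))
            else:
                combined.append((start, end))
        merged[file_path] = combined
    return merged
```

Ranges are merged only within a file, so identical line numbers in two different files never collapse into one block.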

`_build_grep_response(matches, documents, max_total_chars)`

Formats grep results as markdown blocks with source headers:

### {source_label} ({source_url})
{context_lines}

Truncates if total exceeds grep_max_total_chars.
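
One way to produce that layout and enforce the character budget is sketched below. The input shape (a list of dicts with `source_label`, `source_url`, and `context`) and the hard cut in the middle of a section are assumptions; the description does not say where the real function truncates.

```python
def build_grep_response(blocks: list[dict], max_total_chars: int = 8000) -> str:
    parts = []
    used = 0
    for block in blocks:
        section = (
            f"### {block['source_label']} ({block['source_url']})\n"
            f"{block['context']}\n\n"
        )
        remaining = max_total_chars - used
        if remaining <= 0:
            break
        if len(section) > remaining:
            parts.append(section[:remaining])  # hard cut at the budget
            break
        parts.append(section)
        used += len(section)
    return "".join(parts)
```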

`_merge_contexts(grep_response, rag_response, max_total_chars)`

In hybrid mode, concatenates RAG chunks first, then appends grep results under:

---
## Exact Keyword Matches

Deduplicates sources by URL.
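
A rough sketch of the merge, assuming each response is a dict with a `context` string and a `sources` list of dicts carrying a `url` key. That shape, the helper name, and applying `max_total_chars` to the combined text are all assumptions made for illustration.

```python
GREP_HEADER = "---\n## Exact Keyword Matches"


def merge_contexts(rag: dict, grep: dict, max_total_chars: int = 8000) -> dict:
    parts = []
    if rag.get("context"):
        parts.append(rag["context"])                        # RAG chunks first
    if grep.get("context"):
        parts.append(f"{GREP_HEADER}\n\n{grep['context']}")  # grep results after

    # Keep the first occurrence of each source URL, preserving order.
    sources, seen_urls = [], set()
    for source in rag.get("sources", []) + grep.get("sources", []):
        url = source.get("url")
        if url in seen_urls:
            continue
        seen_urls.add(url)
        sources.append(source)

    return {"context": "\n\n".join(parts)[:max_total_chars], "sources": sources}
```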

Nano Model Prompts

System prompt: Instructs the nano model to respond with structured TOOL:, PATTERN:, FLAGS:, REASON: lines, or DONE: when satisfied. Provides regex tips and tool selection guidance.

User prompt: Includes the user's question and a formatted search history showing previous tries, match counts, and content previews.
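
Parsing the structured reply could look like the sketch below. The field labels come from the system prompt described above; the function name, the returned dict shape, and the rule that DONE wins over any other field in the same reply are assumptions.

```python
import re
from typing import Optional

FIELD_RE = re.compile(r"^(TOOL|PATTERN|FLAGS|REASON|DONE):[ \t]*(.*)$", re.MULTILINE)
ALLOWED_TOOLS = {"grep", "egrep", "ripgrep"}


def parse_nano_reply(reply: str) -> Optional[dict]:
    fields = {key: value.strip() for key, value in FIELD_RE.findall(reply)}
    if "DONE" in fields:
        return {"done": True, "reason": fields["DONE"]}
    tool = fields.get("TOOL", "").lower()
    pattern = fields.get("PATTERN", "")
    if tool not in ALLOWED_TOOLS or not pattern:
        return None  # unstructured reply: ignore it
    return {
        "done": False,
        "tool": tool,
        "pattern": pattern,
        "flags": fields.get("FLAGS", ""),
        "reason": fields.get("REASON", ""),
    }
```

Returning `None` for anything that lacks a valid TOOL and PATTERN corresponds to the "only structured responses parsed; invalid ignored" behavior listed under Edge Cases.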

Configuration (assistant metadata)

{
    "rag_processor": "grep_rag",
    "grep_mode": "hybrid",
    "grep_fallback_rag": "simple_rag",
    "grep_max_tries": 5,
    "grep_context_lines": 3,
    "grep_max_total_chars": 8000
}

| Field | Default | Description |
|-------|---------|-------------|
| `grep_mode` | `"hybrid"` | `"hybrid"` or `"primary"` |
| `grep_fallback_rag` | `"simple_rag"` | Used only in primary mode; hybrid always uses `simple_rag` |
| `grep_max_tries` | 5 | Max search iterations (1 LLM call each) |
| `grep_context_lines` | 3 | Lines of context before/after each match |
| `grep_max_total_chars` | 8000 | Max total chars sent to main LLM |
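
Reading these fields with their defaults might be sketched as follows. The defaults are taken from the table above and the numeric ranges from the frontend form (1-10, 1-10, 1000-32000). Clamping to those ranges on the backend is an assumption, as is the helper name `parse_grep_config`.

```python
import json

DEFAULTS = {
    "grep_mode": "hybrid",
    "grep_fallback_rag": "simple_rag",
    "grep_max_tries": 5,
    "grep_context_lines": 3,
    "grep_max_total_chars": 8000,
}
# Ranges mirror the frontend inputs; enforcing them server-side is assumed.
LIMITS = {
    "grep_max_tries": (1, 10),
    "grep_context_lines": (1, 10),
    "grep_max_total_chars": (1000, 32000),
}
KB_BASED_RAGS = {"simple_rag", "context_aware_rag", "hierarchical_rag"}


def parse_grep_config(metadata_json: str) -> dict:
    try:
        raw = json.loads(metadata_json) if metadata_json else {}
    except json.JSONDecodeError:
        raw = {}
    if not isinstance(raw, dict):
        raw = {}

    config = dict(DEFAULTS)
    if raw.get("grep_mode") in ("hybrid", "primary"):
        config["grep_mode"] = raw["grep_mode"]
    if raw.get("grep_fallback_rag") in KB_BASED_RAGS:
        config["grep_fallback_rag"] = raw["grep_fallback_rag"]
    for key, (low, high) in LIMITS.items():
        try:
            config[key] = min(high, max(low, int(raw.get(key, DEFAULTS[key]))))
        except (TypeError, ValueError):
            pass  # keep the default when the value is not a number
    return config
```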

Plugin Auto-Discovery

No changes to main.py needed. The existing load_plugins('rag') system auto-discovers .py files in backend/lamb/completions/rag/. The function rag_processor() is detected as async and awaited accordingly.
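
For readers unfamiliar with the pattern, a loader can support both sync and async processors by awaiting only when the call returns an awaitable. The sketch below shows that general technique with the standard library; it does not reproduce the project's `load_plugins` code, and `call_processor` is a hypothetical name.

```python
import asyncio
import inspect


async def call_processor(processor, *args, **kwargs):
    """Call a plugin entry point, awaiting the result only if it is awaitable."""
    result = processor(*args, **kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result
```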

Edge Cases

| Case | Behavior |
|------|----------|
| No KBs attached | Returns error, falls back to embedding RAG |
| No small-fast-model | Single grep with keyword extraction |
| Nano returns garbage | Only structured responses parsed; invalid ignored |
| Max tries exhausted (primary) | Falls back to configured `grep_fallback_rag` |
| Max tries exhausted (hybrid) | Returns found matches + RAG results |
| No matches at all | Primary → fallback RAG; Hybrid → RAG only |
| Invalid regex from nano | `re.error` caught → nano retries |
| Duplicate pattern | Tracked (tool, pattern) set → skipped |
| Results > max chars | Truncated |
| Binary content | Filtered by `_is_text_content()` |
| localhost URLs in Docker | Rewritten to `kb:9090` |
| File deleted from KB | Handled gracefully |

Frontend

Files Changed

| File | Changes |
|------|---------|
| `src/lib/utils/ragProcessorHelpers.js` | Added `GREP: ['grep_rag']` type, `isGrepRag()`, `isGrepBasedRag()` helpers |
| `src/lib/stores/assistantConfigStore.js` | Added `grep_rag` to fallback capabilities |
| `src/lib/components/assistants/logic/assistantFormState.svelte.js` | 5 grep state fields + populate/clear logic + `isGrepRag` import |
| `src/lib/components/assistants/logic/assistantFormSubmit.js` | Serializes grep config into metadata; sends `RAG_collections` for grep_rag |
| `src/lib/components/assistants/logic/assistantFormFetchers.js` | KB fetch also triggered for `grep_rag` |
| `src/lib/components/assistants/components/RagOptionsPanel.svelte` | Full grep config UI + conditional fallback selector |
| `src/lib/components/assistants/components/ConfigurationPanel.svelte` | Passes grep props through |
| `src/lib/components/assistants/AssistantForm.svelte` | Wires grep state; KB fetch on grep_rag select; import handler for grep_rag |
| `src/routes/assistants/+page.svelte` | Detail view shows grep config + KBs; hides Top K for single_file_rag/rubric_rag |
| `src/lib/components/AssistantsList.svelte` | Table expansion shows KB info for grep_rag |
| `src/lib/locales/{en,es,ca,eu}.json` | 13 new i18n keys under `assistants.form.grepRag.*` |

RAG Processor Classification (`ragProcessorHelpers.js`)

export const RAG_TYPES = Object.freeze({
    KB_BASED:   ['simple_rag', 'context_aware_rag', 'hierarchical_rag'],
    SINGLE_FILE: ['single_file_rag'],
    RUBRIC:      ['rubric_rag'],
    GREP:        ['grep_rag'],        // ← NEW
    NONE:        ['no_rag']
});

export function isGrepRag(processor)   { return RAG_TYPES.GREP.includes(processor); }
export function isGrepBasedRag(processor) { return isGrepRag(processor); }

Form State (`assistantFormState.svelte.js`)

Five new reactive fields added to the form state object:

grepMode:          'hybrid',       // 'hybrid' | 'primary'
grepFallbackRag:   'simple_rag',   // Used only in primary mode
grepMaxTries:      5,              // 1-10
grepContextLines:  3,              // 1-10
grepMaxTotalChars: 8000,           // 1000-32000

- `populateFormFields()`: reads grep metadata when editing an existing assistant
- `clearRagDependentState()`: resets grep fields to defaults when switching away from `grep_rag`

Form Submission (`assistantFormSubmit.js`)

When isGrepRag() returns true:

- Serializes all 5 grep fields into `metadataObj`
- Sends `RAG_collections` (KBs are shared with the companion RAG)

KB Fetching (`assistantFormFetchers.js`)

Guard updated so KB fetching also triggers for grep_rag:

if (!isKbBasedRag(form.selectedRagProcessor) && !isGrepRag(form.selectedRagProcessor)) return;

Configuration UI (`RagOptionsPanel.svelte`)

When grep_rag is selected, shows:

| Field | Control | Visibility |
|-------|---------|------------|
| Mode | Dropdown: Hybrid / Primary | Always |
| Fallback RAG | Dropdown (KB-based RAGs only) | Only in Primary mode |
| Max Search Tries | Number input (1–10) | Always |
| Context Lines | Number input (1–10) | Always |
| Max Result Characters | Number input (1000–32000, step 1000) | Always |

Additionally, the KB selector and RAG Top K are shown, since both are shared with the companion RAG.

Detail View (`+page.svelte`)

In the read-only detail view:

- Knowledge bases are displayed for `grep_rag` (same as `simple_rag`)
- RAG Top K is shown (the companion RAG uses it)
- A grep configuration section displays mode, fallback RAG, max tries, context lines, and max chars

Knowledge Bases Integration

When grep_rag is selected:

- KB fetching is triggered (same flow as `simple_rag`)
- KB selections are stored in `RAG_collections` (same DB field)
- The detail view renders KB names

i18n Strings

13 new keys added in all 4 locale files:

| Key | English |
|-----|---------|
| `assistants.form.grepRag.sectionTitle` | "Grep RAG Configuration" |
| `assistants.form.grepRag.mode.label` | "Mode" |
| `assistants.form.grepRag.mode.description` | "How grep interacts with embedding-based RAG" |
| `assistants.form.grepRag.mode.hybrid` | "Hybrid (grep + RAG in parallel)" |
| `assistants.form.grepRag.mode.primary` | "Primary (grep first, RAG fallback)" |
| `assistants.form.grepRag.fallbackRag.label` | "Fallback RAG" |
| `assistants.form.grepRag.fallbackRag.description` | "Embedding RAG to fall back to when grep finds no matches" |
| `assistants.form.grepRag.maxTries.label` | "Max Search Tries" |
| `assistants.form.grepRag.maxTries.description` | "Maximum number of search iterations (1-10)" |
| `assistants.form.grepRag.contextLines.label` | "Context Lines" |
| `assistants.form.grepRag.contextLines.description` | "Lines of context before/after each match (1-10)" |
| `assistants.form.grepRag.maxTotalChars.label` | "Max Result Characters" |
| `assistants.form.grepRag.maxTotalChars.description` | "Max total characters of grep results sent to main LLM (1000-32000)" |

feat: grep_rag implementation

fb99b39

mBerasategui-ehu wants to merge 1 commit into Lamb-Project:dev from mBerasategui-ehu:grep_RAG
