Validator: Answer relevance custom LLM judge #109
📝 Walkthrough
Adds an answer-relevance LLM validator and replaces per-tenant topic_relevance storage with a generalized tenant-scoped LLM prompt config model, CRUD, API routes, DB migrations, guardrails wiring, schema updates, tests, and related docs.
Changes
Answer Relevance Custom LLM Validator with Multi-Tenant Prompt Management
Sequence Diagram (high-level prompt resolution & CRUD interaction)
```mermaid
sequenceDiagram
    participant Client
    participant GuardrailsAPI
    participant Resolver as _resolve_validator_configs
    participant CRUD as llm_prompt_config_crud
    participant DB
    Client->>GuardrailsAPI: run guardrails request (includes custom_prompt_id or config id)
    GuardrailsAPI->>Resolver: resolve validator configs
    alt custom_prompt_id present
        Resolver->>CRUD: get(custom_prompt_id, org_id, project_id)
        CRUD->>DB: SELECT WHERE org_id/project_id AND id
        DB-->>CRUD: LLMPromptConfig
        CRUD-->>Resolver: LLMPromptConfig.llm_prompt
        Resolver->>GuardrailsAPI: guard data (prompt_template) and validator configured
    else no stored prompt
        Resolver->>GuardrailsAPI: use inline prompt_template or defaults
    end
    GuardrailsAPI->>Client: validation result
```
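The resolution flow in the diagram can be sketched in plain Python. This is a minimal illustration, not the PR's code: `StoredPrompt`, `PromptNotFound`, `get_prompt`, `resolve_prompt_template`, the in-memory `store`, and `DEFAULT_PROMPT` are all hypothetical stand-ins for the real model, CRUD layer, and default template.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumed default template; the real default lives in the validator code.
DEFAULT_PROMPT = "Query: {query}\nAnswer: {answer}\nAnswer only YES or NO."


@dataclass(frozen=True)
class StoredPrompt:
    """Hypothetical stand-in for an LLMPromptConfig row."""
    id: int
    organization_id: int
    project_id: int
    llm_prompt: str


class PromptNotFound(LookupError):
    """Raised when no prompt matches the id within the caller's tenant."""


def get_prompt(store: Iterable[StoredPrompt], prompt_id: int,
               organization_id: int, project_id: int) -> StoredPrompt:
    # Tenant-scoped lookup: the id alone is never enough to read a prompt.
    for row in store:
        if (row.id, row.organization_id, row.project_id) == (
            prompt_id, organization_id, project_id
        ):
            return row
    raise PromptNotFound(prompt_id)


def resolve_prompt_template(store: Iterable[StoredPrompt], organization_id: int,
                            project_id: int,
                            custom_prompt_id: Optional[int] = None,
                            inline_template: Optional[str] = None) -> str:
    # Branch 1 of the diagram: a stored prompt id is present, so fetch it.
    if custom_prompt_id is not None:
        return get_prompt(store, custom_prompt_id, organization_id, project_id).llm_prompt
    # Branch 2: no stored prompt, so fall back to the inline template or the default.
    return inline_template or DEFAULT_PROMPT
```

Note that this sketch follows the diagram literally, so a stored prompt wins whenever `custom_prompt_id` is set; whether an inline template should take precedence instead is one of the open questions raised in the review below.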
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 4
🧹 Nitpick comments (3)
backend/app/alembic/versions/008_add_answer_relevance_prompt.py (1)
35-49: ⚡ Quick win
Add a composite index for the tenant-scoped list pattern.
Line 35-49 only adds single-column indexes. For queries filtered by `organization_id` + `project_id` and ordered by `created_at, id`, a composite index will scale better.
Suggested migration change
```diff
     op.create_index(
         "idx_answer_relevance_prompt_is_active",
         "answer_relevance_prompt",
         ["is_active"],
     )
+    op.create_index(
+        "idx_answer_relevance_prompt_tenant_created_id",
+        "answer_relevance_prompt",
+        ["organization_id", "project_id", "created_at", "id"],
+    )
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/alembic/versions/008_add_answer_relevance_prompt.py` around lines 35 - 49, The migration currently only creates single-column indexes via op.create_index for "idx_answer_relevance_prompt_org", "idx_answer_relevance_prompt_project", and "idx_answer_relevance_prompt_is_active" on the answer_relevance_prompt table; add a composite index for the tenant-scoped list pattern to support queries filtered by organization_id + project_id and ordered by created_at, id by creating a new composite index (e.g. name it "idx_answer_relevance_prompt_org_project_created_at_id") on columns ["organization_id","project_id","created_at","id"]; also ensure the corresponding downgrade drops that composite index (and keep or remove the single-column org/project indexes as desired) so the migration is reversible.

backend/app/api/routes/guardrails.py (1)
133-142: ⚡ Quick win
Avoid DB lookup when `prompt_template` is already provided inline.
Currently, `custom_prompt_id` triggers a fetch unconditionally. Guarding on missing `prompt_template` would reduce unnecessary I/O and avoid overriding explicit runtime templates.
Proposed patch
```diff
 elif isinstance(validator, AnswerRelevanceCustomLLMSafetyValidatorConfig):
-    if validator.custom_prompt_id is not None:
+    if (
+        validator.custom_prompt_id is not None
+        and not validator.prompt_template
+    ):
         prompt_config = answer_relevance_prompt_crud.get(
             session=session,
             id=validator.custom_prompt_id,
             organization_id=payload.organization_id,
             project_id=payload.project_id,
         )
         validator.prompt_template = prompt_config.prompt_template
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/api/routes/guardrails.py` around lines 133 - 142, The code unconditionally looks up a DB prompt when validator.custom_prompt_id is set and then overwrites validator.prompt_template; change the logic in the AnswerRelevanceCustomLLMSafetyValidatorConfig branch to only call answer_relevance_prompt_crud.get (using session, payload.organization_id, payload.project_id) when validator.custom_prompt_id is present AND validator.prompt_template is missing/empty, so any inline-provided validator.prompt_template is preserved and unnecessary I/O is avoided; after the conditional fetch assign validator.prompt_template only from the retrieved prompt_config.

backend/app/tests/validators/test_answer_relevance_custom_llm.py (1)
113-155: 💤 Low value
Optional: add non-dict JSON / non-string-field cases.
Consider adding tests for inputs like `validator._validate("123")`, `validator._validate("null")`, and `{"query": 1, "answer": "x"}` so the parsing edge cases (raised on the validator file) stay covered going forward.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/tests/validators/test_answer_relevance_custom_llm.py` around lines 113 - 155, Add tests to cover non-dict JSON and non-string field cases so parsing edge cases remain covered: extend backend/app/tests/validators/test_answer_relevance_custom_llm.py with new test functions that call validator._validate on JSON primitives (e.g., "123", "null") and on a JSON object with non-string field types (e.g., {"query": 1, "answer": "x"}), and assert they return FailResult (using isinstance(result, FailResult)) and include appropriate error messages where relevant; reference validator._validate and existing test patterns (e.g., test_fails_with_non_json_input, test_fails_with_missing_query_key) to mirror structure and assertions.
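To see why the composite index matters for the tenant-scoped list query, the effect can be reproduced with the standard-library `sqlite3` module. This is only a sketch under assumptions: SQLite stands in for the project's real database, the table is reduced to the columns named in the nitpick, and `explain_tenant_list` is a hypothetical helper. The real migration would use Alembic's `op.create_index` as shown in the suggested change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE answer_relevance_prompt (
        id INTEGER PRIMARY KEY,
        organization_id INTEGER NOT NULL,
        project_id INTEGER NOT NULL,
        created_at TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1
    );
    -- Composite index matching the filter columns first, then the sort columns.
    CREATE INDEX idx_answer_relevance_prompt_tenant_created_id
        ON answer_relevance_prompt (organization_id, project_id, created_at, id);
    """
)


def explain_tenant_list(connection: sqlite3.Connection) -> list:
    """Return the query-plan detail strings for the tenant-scoped list query."""
    rows = connection.execute(
        """
        EXPLAIN QUERY PLAN
        SELECT id FROM answer_relevance_prompt
        WHERE organization_id = ? AND project_id = ?
        ORDER BY created_at, id
        """,
        (1, 1),
    ).fetchall()
    # The human-readable plan step is the fourth column of each row.
    return [row[3] for row in rows]


plan = explain_tenant_list(conn)
print("\n".join(plan))
```

The printed plan should name the composite index, which indicates that both the tenant filter and the ordering can be served from it. Planner behaviour differs between database engines, so the same check is worth repeating against the production database with its own `EXPLAIN`.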
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/api/docs/answer_relevance_prompts/create_prompt.md`:
- Around line 19-25: The two fenced code blocks in create_prompt.md (the blocks
beginning with the lines "Query: {query} ... Answer only YES or NO." and the one
starting "You are evaluating a maternal health assistant.") need explicit
language identifiers to satisfy markdownlint MD040; update both opening fences
from ``` to ```text so each block reads ```text and leave the block contents
unchanged.
In `@backend/app/api/docs/guardrails/run_guardrails.md`:
- Line 11: Update the docs for the answer_relevance_custom_llm operation to
explicitly state the precedence and mutual-exclusivity behavior when both
custom_prompt_id and prompt_template are provided: specify whether they are
mutually exclusive (reject requests containing both) or define a deterministic
precedence rule (e.g., "custom_prompt_id takes precedence over prompt_template
if both are set"), and show a short example of the accepted input JSON
{"query":"...", "answer":"..."} with the chosen behavior. Ensure the text
mentions the parameter names custom_prompt_id and prompt_template and that
OPENAI_API_KEY is required.
In `@backend/app/core/validators/answer_relevance_custom_llm.py`:
- Around line 44-57: In _validate (in answer_relevance_custom_llm.py) guard
against non-dict JSON and non-string fields by first verifying the result of
json.loads(value) is a dict and returning FailResult if not, then extract query
and answer and ensure both are instances of str before calling .strip(); if
either is missing or not a string (or empty after strip) return FailResult with
the existing error messages. This prevents AttributeError from .get/.strip on
non-dict or non-str values while preserving the current
ValidationResult/FailResult flow.
In `@backend/app/core/validators/README.md`:
- Around line 519-525: The fenced code block containing the prompt that starts
with "Query: {query}" and ends with "Answer only YES or NO." should be annotated
with a language to satisfy markdownlint MD040; update the opening fence from ```
to ```text for that block (the block that contains the lines "Query: {query}"
and "Answer: {answer}") so the README.md stays lint-clean and consistent with
other fenced blocks.
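The hardening requested for `_validate` can be sketched as a standalone parsing helper. This is an illustration under assumptions: `parse_query_answer` and its tuple return shape are hypothetical, and the real validator returns `FailResult` objects rather than error strings. The two messages mirror the ones quoted elsewhere in this review.

```python
import json
from typing import Optional, Tuple

INVALID_SHAPE = "Input must be a JSON string with 'query' and 'answer' fields."
EMPTY_FIELDS = "Both 'query' and 'answer' fields must be non-empty."


def parse_query_answer(value) -> Tuple[Optional[Tuple[str, str]], Optional[str]]:
    """Return ((query, answer), None) on success or (None, error_message) on failure."""
    try:
        data = json.loads(value)
    except (ValueError, TypeError):
        # json.JSONDecodeError is a ValueError subclass; TypeError covers non-str input.
        return None, INVALID_SHAPE

    # Valid JSON is not necessarily an object: "123" and "null" both parse fine.
    if not isinstance(data, dict):
        return None, INVALID_SHAPE

    query, answer = data.get("query", ""), data.get("answer", "")
    # Check types before calling .strip(), which only exists on strings.
    if not isinstance(query, str) or not isinstance(answer, str):
        return None, INVALID_SHAPE

    if not query.strip() or not answer.strip():
        return None, EMPTY_FIELDS
    return (query, answer), None
```

For example, `parse_query_answer("null")` and `parse_query_answer('{"query": 1, "answer": "x"}')` both return the shape error instead of raising `AttributeError`.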
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 04d6d867-db71-48df-ab0c-a3c4b234f1bc
📒 Files selected for processing (24)
- backend/app/alembic/versions/008_add_answer_relevance_prompt.py
- backend/app/api/API_USAGE.md
- backend/app/api/docs/answer_relevance_prompts/create_prompt.md
- backend/app/api/docs/answer_relevance_prompts/delete_prompt.md
- backend/app/api/docs/answer_relevance_prompts/get_prompt.md
- backend/app/api/docs/answer_relevance_prompts/list_prompts.md
- backend/app/api/docs/answer_relevance_prompts/update_prompt.md
- backend/app/api/docs/guardrails/run_guardrails.md
- backend/app/api/main.py
- backend/app/api/routes/answer_relevance_prompts.py
- backend/app/api/routes/guardrails.py
- backend/app/core/enum.py
- backend/app/core/validators/README.md
- backend/app/core/validators/answer_relevance_custom_llm.py
- backend/app/core/validators/config/answer_relevance_custom_llm_safety_validator_config.py
- backend/app/crud/answer_relevance_prompt.py
- backend/app/models/config/answer_relevance_prompt.py
- backend/app/schemas/answer_relevance_prompt.py
- backend/app/schemas/guardrail_config.py
- backend/app/tests/test_answer_relevance_prompts_api.py
- backend/app/tests/test_answer_relevance_prompts_api_integration.py
- backend/app/tests/test_llm_validators.py
- backend/app/tests/test_validate_with_guard.py
- backend/app/tests/validators/test_answer_relevance_custom_llm.py
Actionable comments posted: 4
🧹 Nitpick comments (1)
backend/app/models/config/llm_prompt_config.py (1)
82-90: ⚡ Quick win
Align unique-constraint name with migration.
Model uses `uq_validator_prompt_config`, while migration creates `uq_llm_prompt_config`. Keep one canonical name to avoid schema drift/noisy future migrations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/config/llm_prompt_config.py` around lines 82 - 90, The UniqueConstraint declared in __table_args__ currently uses name="uq_validator_prompt_config" but the migration expects "uq_llm_prompt_config"; update the constraint name in the model (the UniqueConstraint in __table_args__ that references "organization_id", "project_id", "validator_name", "prompt_schema_version", "llm_prompt") to match the migration by changing the name to "uq_llm_prompt_config" so the ORM schema and migrations remain in sync.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/alembic/versions/008_added_llm_validator_prompt.py`:
- Around line 40-48: The migration adds a non-nullable column validator_name to
llm_prompt with a server_default "topic_relevance" for backfill but never
removes that default; update the Alembic migration so after backfilling existing
rows you immediately alter the column to drop the server_default (e.g. use
op.alter_column on "llm_prompt", "validator_name" to remove the server_default
and keep nullable=False) so future inserts without an explicit validator_name
are not implicitly classified as "topic_relevance".
In `@backend/app/api/docs/llm_prompt_configs/create_config.md`:
- Around line 29-32: Add a language identifier (e.g., text) to the two fenced
code blocks that contain the policy/example strings so markdownlint MD040 is
satisfied: the block starting with "This assistant only answers questions about
maternal health and pregnancy care." and the block starting with "You are
evaluating a maternal health assistant." should be changed from ``` to ```text;
update both occurrences so the fenced code blocks explicitly declare the
language.
In `@backend/app/api/routes/guardrails.py`:
- Around line 123-140: When loading stored prompts via
llm_prompt_config_crud.get for both the TopicRelevance path (assigning
validator.configuration and prompt_schema_version) and the
AnswerRelevanceCustomLLMSafetyValidatorConfig path (assigning
validator.prompt_template), validate that the returned
prompt_config.validator_name matches the expected validator type before
assignment; if it does not match, raise a 400 error (HTTPException) rejecting
the request. Concretely, after retrieving prompt_config from
llm_prompt_config_crud.get, compare prompt_config.validator_name against the
expected identifier for the current validator (e.g., the class or type name
associated with validator or AnswerRelevanceCustomLLMSafetyValidatorConfig) and
only set validator.configuration / validator.prompt_schema_version /
validator.prompt_template when they match; otherwise return a 400 with a clear
message about mismatched validator_name.
In `@backend/app/schemas/llm_prompt_config.py`:
- Around line 53-58: LLMPromptConfigUpdate can bypass placeholder checks, so add
the same validation used for creation: implement a model validator named
validate_answer_relevance_placeholders on LLMPromptConfigUpdate (or
alternatively call the same validation routine from the CRUD update() in
backend/app/crud/llm_prompt_config.py before committing) that enforces the
presence of {query} and {answer} when validator_name == AnswerRelevanceCustomLLM
and rejects/raises on missing placeholders; ensure the validator references
LLMPromptConfigUpdate.llm_prompt and validator_name and mirrors the logic used
by the create model to prevent PATCH requests from persisting invalid prompts.
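The `validator_name` check requested for guardrails.py can be sketched as a small guard that runs right after the stored prompt is fetched. All names here are hypothetical: `PromptConfig`, `ValidatorMismatchError`, and `ensure_validator_matches` are illustrative stand-ins, and in the real route the error would presumably be raised as an `HTTPException` with status 400.

```python
from dataclasses import dataclass


class ValidatorMismatchError(ValueError):
    """Hypothetical error; the route would translate this into an HTTP 400."""


@dataclass(frozen=True)
class PromptConfig:
    """Hypothetical stand-in for a stored LLM prompt config row."""
    id: int
    validator_name: str
    llm_prompt: str


def ensure_validator_matches(prompt_config: PromptConfig, expected_validator: str) -> PromptConfig:
    # Refuse to apply a prompt that was stored for a different validator type.
    if prompt_config.validator_name != expected_validator:
        raise ValidatorMismatchError(
            f"Prompt config {prompt_config.id} belongs to validator "
            f"'{prompt_config.validator_name}', not '{expected_validator}'."
        )
    return prompt_config
```

Returning the config on success lets the caller chain the check, for example `ensure_validator_matches(cfg, "answer_relevance_custom_llm").llm_prompt`, so the template is only read once the type has been confirmed.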
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e034228f-8243-4ccc-9a58-32da70be2437
📒 Files selected for processing (26)
- backend/app/alembic/versions/008_added_llm_validator_prompt.py
- backend/app/api/docs/llm_prompt_configs/create_config.md
- backend/app/api/docs/llm_prompt_configs/delete_config.md
- backend/app/api/docs/llm_prompt_configs/get_config.md
- backend/app/api/docs/llm_prompt_configs/list_configs.md
- backend/app/api/docs/llm_prompt_configs/update_config.md
- backend/app/api/docs/topic_relevance_configs/create_config.md
- backend/app/api/docs/topic_relevance_configs/delete_config.md
- backend/app/api/docs/topic_relevance_configs/get_config.md
- backend/app/api/docs/topic_relevance_configs/list_configs.md
- backend/app/api/docs/topic_relevance_configs/update_config.md
- backend/app/api/main.py
- backend/app/api/routes/guardrails.py
- backend/app/api/routes/llm_prompt_configs.py
- backend/app/api/routes/topic_relevance_configs.py
- backend/app/core/enum.py
- backend/app/crud/llm_prompt_config.py
- backend/app/crud/topic_relevance.py
- backend/app/models/config/llm_prompt_config.py
- backend/app/schemas/llm_prompt_config.py
- backend/app/schemas/topic_relevance.py
- backend/app/tests/test_llm_prompt_configs_api.py
- backend/app/tests/test_llm_prompt_configs_api_integration.py
- backend/app/tests/test_topic_relevance_configs_api.py
- backend/app/tests/test_topic_relevance_configs_api_integration.py
- backend/app/tests/test_validate_with_guard.py
💤 Files with no reviewable changes (10)
- backend/app/api/docs/topic_relevance_configs/delete_config.md
- backend/app/api/docs/topic_relevance_configs/list_configs.md
- backend/app/api/docs/topic_relevance_configs/create_config.md
- backend/app/tests/test_topic_relevance_configs_api.py
- backend/app/api/docs/topic_relevance_configs/update_config.md
- backend/app/crud/topic_relevance.py
- backend/app/api/routes/topic_relevance_configs.py
- backend/app/schemas/topic_relevance.py
- backend/app/tests/test_topic_relevance_configs_api_integration.py
- backend/app/api/docs/topic_relevance_configs/get_config.md
✅ Files skipped from review due to trivial changes (4)
- backend/app/api/docs/llm_prompt_configs/list_configs.md
- backend/app/api/docs/llm_prompt_configs/update_config.md
- backend/app/api/docs/llm_prompt_configs/get_config.md
- backend/app/api/docs/llm_prompt_configs/delete_config.md
Actionable comments posted: 2
♻️ Duplicate comments (2)
backend/app/api/docs/guardrails/run_guardrails.md (1)
11-11: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Clarify behavior when both `custom_prompt_id` and `prompt_template` are sent.
Line 11 is still ambiguous about precedence vs mutual exclusivity. Please explicitly document whether requests with both fields are rejected or which one wins.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/api/docs/guardrails/run_guardrails.md` at line 11, Update the answer_relevance_custom_llm documentation to explicitly state the resolution when both custom_prompt_id and prompt_template are provided: specify whether the request is rejected with a 4xx validation error or which field takes precedence (e.g., "custom_prompt_id takes precedence; prompt_template will be ignored"), and include expected server behavior and error message text for invalid combinations; reference the operation name answer_relevance_custom_llm and the two fields custom_prompt_id and prompt_template so callers know to either omit one or expect the documented precedence.

backend/app/core/validators/answer_relevance_custom_llm.py (1)
45-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Harden JSON payload type checks before `.get()`/`.strip()`.
This path still raises runtime exceptions for valid JSON that is not an object, or for non-string `query`/`answer` values.
Defensive parsing patch
```diff
 try:
     data = json.loads(value)
-    query = data.get("query", "")
-    answer = data.get("answer", "")
 except (json.JSONDecodeError, TypeError):
     return FailResult(
         error_message="Input must be a JSON string with 'query' and 'answer' fields."
     )
+if not isinstance(data, dict):
+    return FailResult(
+        error_message="Input must be a JSON string with 'query' and 'answer' fields."
+    )
+
+query = data.get("query", "")
+answer = data.get("answer", "")
+if not isinstance(query, str) or not isinstance(answer, str):
+    return FailResult(
+        error_message="Input must be a JSON string with 'query' and 'answer' fields."
+    )
 if not query.strip() or not answer.strip():
     return FailResult(
         error_message="Both 'query' and 'answer' fields must be non-empty."
     )
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/core/validators/answer_relevance_custom_llm.py` around lines 45 - 57, The current validator calls json.loads(value) and then uses data.get(...).strip(), which raises if the parsed JSON is not an object or if query/answer are non-strings; update the parsing in the validator in answer_relevance_custom_llm.py to: ensure data is a dict (isinstance(data, dict)) after json.loads, extract raw_query = data.get("query") and raw_answer = data.get("answer") without calling .strip() immediately, validate that both raw_query and raw_answer are instances of str and non-empty after strip (or return FailResult via the same error_message), and if types are wrong return a FailResult explaining that query and answer must be string fields; keep using the existing FailResult symbol and the same error messages but avoid any .get()/.strip() calls on non-dict/non-string values.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/api/routes/guardrails.py`:
- Around line 106-119: The code in _resolve_validator_configs mutates a single
`data` variable for all validators causing AnswerRelevance to overwrite
plain-text input for subsequent validators; update the function to avoid global
mutation by either (preferred) validating and rejecting mixed validator sets
(when payload.validators contains AnswerRelevance plus any non-AnswerRelevance)
early, or by computing per-validator data locally (e.g., inside the loop use a
local variable like `validator_data` based on validator.type) so AnswerRelevance
gets JSON-encoded {"query": input, "answer": output} only for itself; ensure
checks reference GuardrailRequest.validators and the AnswerRelevance identifier
and return the correct string for each validator.
In `@backend/app/crud/llm_prompt_config.py`:
- Around line 97-107: Guard the llm_prompt value's type before doing placeholder
membership checks: when handling the PATCH in the block that checks
obj.validator_name == LLMValidatorName.AnswerRelevanceCustomLLM and "llm_prompt"
in update_data, ensure update_data["llm_prompt"] is a str (not None or other
types) before computing missing = [p for p in ("{query}", "{answer}") if p not
in new_prompt]; if it's missing or not a string, raise the existing
HTTPException(422, ...) with an appropriate message so a TypeError cannot be
raised during membership checks on non-string values.
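The placeholder check for PATCH updates, including the type guard, can be sketched as a pure function that both the create schema and the CRUD update path could share. This is a sketch under assumptions: `PromptValidationError` and `validate_answer_relevance_prompt` are hypothetical names, and the real code raises `HTTPException(422, ...)` instead.

```python
REQUIRED_PLACEHOLDERS = ("{query}", "{answer}")


class PromptValidationError(ValueError):
    """Hypothetical error; the API layer would map it to an HTTP 422."""


def validate_answer_relevance_prompt(llm_prompt) -> str:
    # Guard the type first: `"{query}" in None` would raise TypeError (a 500).
    if not isinstance(llm_prompt, str):
        raise PromptValidationError(
            "llm_prompt must be a string containing both {query} and {answer}"
        )
    missing = [p for p in REQUIRED_PLACEHOLDERS if p not in llm_prompt]
    if missing:
        raise PromptValidationError(
            f"llm_prompt must contain the placeholders: {', '.join(missing)}"
        )
    return llm_prompt
```

Keeping the rule in one function means a PATCH with `{"llm_prompt": null}` and a POST with a template missing `{answer}` fail the same way, which is the consistency both review comments ask for.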
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bf5cdceb-28d3-404d-a4a7-06adf9f2c1b2
📒 Files selected for processing (19)
- backend/README.md
- backend/app/alembic/versions/008_added_llm_validator_prompt.py
- backend/app/alembic/versions/009_add_output_text_to_request_log.py
- backend/app/api/API_USAGE.md
- backend/app/api/docs/guardrails/run_guardrails.md
- backend/app/api/routes/guardrails.py
- backend/app/core/validators/README.md
- backend/app/core/validators/answer_relevance_custom_llm.py
- backend/app/core/validators/gender_assumption_bias.py
- backend/app/core/validators/lexical_slur.py
- backend/app/core/validators/pii_remover.py
- backend/app/core/validators/topic_relevance.py
- backend/app/crud/llm_prompt_config.py
- backend/app/crud/request_log.py
- backend/app/models/logging/request_log.py
- backend/app/schemas/guardrail_config.py
- backend/app/tests/test_llm_prompt_configs_api_integration.py
- backend/app/tests/test_llm_validators.py
- backend/app/tests/test_validate_with_guard.py
💤 Files with no reviewable changes (1)
- backend/app/tests/test_llm_validators.py
✅ Files skipped from review due to trivial changes (4)
- backend/README.md
- backend/app/core/validators/topic_relevance.py
- backend/app/core/validators/pii_remover.py
- backend/app/core/validators/README.md
```python
def _resolve_validator_configs(payload: GuardrailRequest, session: Session) -> str:
    """
    Resolves config-backed references for all validators in-place before guard execution:
    - BanList: fetches banned_words from the stored BanList when not provided inline.
    - TopicRelevance: fetches configuration and prompt_schema_version from stored config.
    - AnswerRelevance: fetches custom prompt template from stored config; returns
      JSON-encoded {"query": input, "answer": output} as the guard data.

    Returns the data string to pass to guard.validate().
    """
    # Input guardrails validate payload.input; output guardrails validate payload.output.
    # AnswerRelevance is the exception: it needs both, encoded as JSON.
    data = payload.output if payload.output is not None else payload.input
    for validator in payload.validators:
```
Avoid mutating shared guard input for mixed validator runs.
data is global for the whole guard execution, but Line 148 rewrites it to answer-relevance JSON whenever that validator appears. That causes other validators in the same request to receive JSON instead of plain text.
A safe fix is to reject mixed use (answer relevance + non-answer-relevance validators) at config-resolution time, or split execution paths so each validator gets the expected input format.
Suggested guardrail-time protection
```diff
 def _resolve_validator_configs(payload: GuardrailRequest, session: Session) -> str:
@@
-    data = payload.output if payload.output is not None else payload.input
+    data = payload.output if payload.output is not None else payload.input
+    has_answer_relevance = any(
+        isinstance(v, AnswerRelevanceCustomLLMSafetyValidatorConfig)
+        for v in payload.validators
+    )
+    if has_answer_relevance and len(payload.validators) > 1:
+        raise HTTPException(
+            400,
+            "answer_relevance_custom_llm cannot be combined with other validators in the same run.",
+        )
```
Also applies to: 147-149, 167-167
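The other option mentioned in this comment, giving each validator its own input instead of rejecting mixed runs, can be sketched as follows. This only shows how the per-validator data could be computed; `data_for_validator` and `build_validator_inputs` are hypothetical helpers, and whether the guard execution path can actually accept a different input per validator is an assumption that would need checking against the Guardrails library.

```python
import json
from typing import Dict, Iterable, Optional

ANSWER_RELEVANCE = "answer_relevance_custom_llm"


def data_for_validator(validator_type: str, input_text: str,
                       output_text: Optional[str]) -> str:
    # Only answer relevance needs both sides, wrapped in a JSON envelope.
    if validator_type == ANSWER_RELEVANCE:
        return json.dumps({"query": input_text, "answer": output_text or ""})
    # Every other validator keeps the plain text it already expects.
    return output_text if output_text is not None else input_text


def build_validator_inputs(validator_types: Iterable[str], input_text: str,
                           output_text: Optional[str] = None) -> Dict[str, str]:
    """Map each validator type to the data string it should receive."""
    return {v: data_for_validator(v, input_text, output_text) for v in validator_types}
```

With this shape, a request combining `pii_remover` and `answer_relevance_custom_llm` would hand plain text to the former and the JSON envelope to the latter, rather than sharing one mutated `data` string.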
```python
if (
    "llm_prompt" in update_data
    and obj.validator_name == LLMValidatorName.AnswerRelevanceCustomLLM
):
    new_prompt = update_data["llm_prompt"]
    missing = [p for p in ("{query}", "{answer}") if p not in new_prompt]
    if missing:
        raise HTTPException(
            422,
            f"llm_prompt must contain the placeholders: {', '.join(missing)}",
        )
```
Guard llm_prompt type before placeholder membership checks.
If a PATCH sends {"llm_prompt": null}, Line 102 executes membership checks on None and can throw TypeError (500) instead of returning a validation 422.
Suggested fix
if (
"llm_prompt" in update_data
and obj.validator_name == LLMValidatorName.AnswerRelevanceCustomLLM
):
new_prompt = update_data["llm_prompt"]
+ if not isinstance(new_prompt, str):
+ raise HTTPException(
+ 422,
+ "llm_prompt must be a string containing both {query} and {answer}",
+ )
missing = [p for p in ("{query}", "{answer}") if p not in new_prompt]
if missing:
raise HTTPException(
422,
f"llm_prompt must contain the placeholders: {', '.join(missing)}",
)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/crud/llm_prompt_config.py` around lines 97 - 107, Guard the
llm_prompt value's type before doing placeholder membership checks: when
handling the PATCH in the block that checks obj.validator_name ==
LLMValidatorName.AnswerRelevanceCustomLLM and "llm_prompt" in update_data,
ensure update_data["llm_prompt"] is a str (not None or other types) before
computing missing = [p for p in ("{query}", "{answer}") if p not in new_prompt];
if it's missing or not a string, raise the existing HTTPException(422, ...) with
an appropriate message so a TypeError cannot be raised during membership checks
on non-string values.
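The guard can be exercised in isolation. The sketch below is framework-free: `check_llm_prompt` is a hypothetical helper that raises `ValueError` where the real CRUD code would raise `HTTPException(422, ...)`.

```python
REQUIRED_PLACEHOLDERS = ("{query}", "{answer}")


def check_llm_prompt(new_prompt: object) -> str:
    """Return the prompt if it is a string containing every required placeholder."""
    # Type guard first: `"{query}" in None` would raise TypeError.
    if not isinstance(new_prompt, str):
        raise ValueError(
            "llm_prompt must be a string containing both {query} and {answer}"
        )
    missing = [p for p in REQUIRED_PLACEHOLDERS if p not in new_prompt]
    if missing:
        raise ValueError(
            f"llm_prompt must contain the placeholders: {', '.join(missing)}"
        )
    return new_prompt
```

With this ordering, a `null` prompt and a prompt missing a placeholder both fail as validation errors instead of one of them surfacing as a 500.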
| "configuration", | ||
| name="uq_topic_relevance_config_org_project_prompt", | ||
| "llm_prompt", | ||
| name="uq_validator_prompt_config", |
There was a problem hiding this comment.
The unique constraint name here (uq_validator_prompt_config) is different from the one in the migration (uq_llm_prompt_config).
| def _resolve_validator_configs(payload: GuardrailRequest, session: Session) -> None: | ||
| def _resolve_validator_configs(payload: GuardrailRequest, session: Session) -> str: |
There was a problem hiding this comment.
It seems `data` is shared across all validators.
In _resolve_validator_configs, we pass one string to guard.validate(data). When answer_relevance_custom_llm is included, it replaces data with a JSON envelope:
{"query": payload.input, "answer": payload.output or ""}
Guardrails then sends that same JSON string to every validator in the chain.
For example, if a caller asks to scrub PII and check answer relevance, pii_remover ends up running on the JSON envelope instead of the actual answer. It redacts the phone number, but the returned safe_text is now a JSON-wrapped string rather than the cleaned answer.
{ "input": "What causes fever?", "output": "Call Dr. Smith at 555-1234. Infections cause fever.", "validators": [ {"type": "pii_remover"}, {"type": "answer_relevance_custom_llm"} ] }
That breaks downstream code that expects plain text.
Can you test this using answer relevance together with PII and check how the output and the logs look?
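A quick way to see the effect without running the service is sketched below. `scrub` is a toy regex stand-in for pii_remover (an assumption for illustration, not the real validator), applied once to the plain answer and once to the JSON envelope.

```python
import json
import re

# Toy pattern for NNN-NNNN phone numbers; the real pii_remover is more involved.
PHONE = re.compile(r"\b\d{3}-\d{4}\b")


def scrub(text: str) -> str:
    """Hypothetical stand-in for pii_remover: redact phone numbers."""
    return PHONE.sub("<PHONE>", text)


answer = "Call Dr. Smith at 555-1234. Infections cause fever."
envelope = json.dumps({"query": "What causes fever?", "answer": answer})

plain = scrub(answer)      # what a caller expects as safe_text
wrapped = scrub(envelope)  # what the shared `data` produces instead

print(plain)
print(wrapped)
```

The redaction still happens in both cases, but `wrapped` is a JSON document whose "answer" field holds the cleaned text, so any consumer that treats safe_text as plain text would need to unwrap it first.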
| """ | ||
| # Input guardrails validate payload.input; output guardrails validate payload.output. | ||
| # AnswerRelevance is the exception: it needs both, encoded as JSON. | ||
| data = payload.output if payload.output is not None else payload.input |
There was a problem hiding this comment.
Before this PR, data was always payload.input. The new output field on GuardrailRequest was added to support answer_relevance_custom_llm, but the data resolution applies
to every validator unconditionally. Any caller who passes output for any reason will silently have their guards run against the output text instead of the input.
Example of the silent break
A caller updates their client to pass both input and output (perhaps as a forward-compatible change, or to log both via request_log):
{
"input": "Tell me a joke",
"output": "Why did the chicken cross the road?",
"validators": [
{"type": "pii_remover"},
{"type": "lexical_slur"}
]
}
Expected: input guardrails (PII, slur) run on "Tell me a joke".
Actual: they run on "Why did the chicken cross the road?".
No error, no warning — just silently wrong validation. The request_log.request_text still stores input, so debugging from logs is misleading: logs say "Tell me a joke"
was validated, but it wasn't.
There was a problem hiding this comment.
The backend service calls Kaapi-guardrails in two separate calls, one for the input validators and one for the output validators. For the input validators, things will work as expected. For the output validators, however, only the output should be considered, right? For each validator, we specify whether it's for input or output. If a request has both input and output, the assumption is that we will only check the output. The input must have already been evaluated by then.
| class AnswerRelevanceCustomLLMSafetyValidatorConfig(BaseValidatorConfig): | ||
| type: Literal["answer_relevance_custom_llm"] | ||
| llm_callable: str = "gpt-4o-mini" |
There was a problem hiding this comment.
Can we put the default model in config, so that it is easy to update if this model gets deprecated?
There was a problem hiding this comment.
What do you mean by config here?
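One possible reading of "config", sketched under assumptions: the default model name lives in a central settings object and can be overridden from the environment, so the validator config class reads it instead of hard-coding "gpt-4o-mini". The names `Settings`, `load_settings`, and `ANSWER_RELEVANCE_DEFAULT_MODEL` are hypothetical and not taken from the repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

FALLBACK_MODEL = "gpt-4o-mini"


@dataclass(frozen=True)
class Settings:
    # Hypothetical setting; the single place to change when a model is deprecated.
    answer_relevance_default_model: str = FALLBACK_MODEL


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        answer_relevance_default_model=env.get(
            "ANSWER_RELEVANCE_DEFAULT_MODEL", FALLBACK_MODEL
        )
    )
```

The validator config's `llm_callable` default could then be taken from `load_settings().answer_relevance_default_model`, leaving the per-request override untouched.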
Summary
Target issue is #120
Explain the motivation for making this change. What existing problem does the pull request solve?
`answer_relevance_custom_llm` evaluates whether an LLM's answer is relevant to a user query, using an LLM as judge (YES/NO).
(/guardrails/answer_relevance_prompts): full CRUD endpoints (multi-tenant, X-API-KEY auth) for NGOs to store, version, and manage domain-specific evaluation prompts. Prompts are validated at write time to enforce both {query} and {answer} placeholders. Reference a stored prompt at runtime via custom_prompt_id in the validator config.
Added files:
app/models/config/answer_relevance_prompt.py — SQLModel table answer_relevance_prompt scoped to organization_id + project_id
app/crud/answer_relevance_prompt.py — standard CRUD
Checklist
Before submitting a pull request, please ensure that you mark these tasks.
`fastapi run --reload app/main.py` or `docker compose up` in the repository root and test.
Notes
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit
New Features
API Changes
Documentation