S2S: Speech-to-Speech with File Search in a Single API Call#643
S2S: Speech-to-Speech with File Search in a Single API Call#643Prajna1999 wants to merge 33 commits into
Conversation
…t and eliminate code duplication
…ove callback logic in ChainContext
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a Speech-to-Speech (STS) with RAG workflow: new POST /llm/sts endpoint, request/enums models, chain block builders, detected-language propagation through chain execution, TTS parameter substitution, provider tweaks, tests, debug script, and comprehensive documentation. Changes wire the router into the API. ChangesSpeech-to-Speech (STS) Flow
Sequence Diagram(s)sequenceDiagram
participant Client
participant API as "/llm/sts"
participant Chain as "LLM Chain\nOrchestrator"
participant STT as "STT Provider\n(Sarvam/Google)"
participant RAG as "RAG/LLM\nProvider (OpenAI)"
participant TTS as "TTS Provider\n(Sarvam/Google)"
participant Callback as "Webhook\n(Callbacks)"
Client->>API: POST /llm/sts (audio, languages, models, callback)
activate API
API->>Chain: create_chain_request (STT → RAG → TTS)
deactivate API
activate Chain
Chain->>STT: execute STT block (audio, language=auto/specified)
activate STT
STT-->>Chain: TextOutput (transcript, language_code)
deactivate STT
Chain->>Callback: STT Intermediate callback (transcript, latency)
Chain->>RAG: execute RAG block (query, knowledge_bases, language_code)
activate RAG
RAG-->>Chain: TextOutput (LLM response)
deactivate RAG
Chain->>Callback: LLM Intermediate callback (response, latency)
Chain->>TTS: execute TTS block (response, language={{detected}} → replaced)
activate TTS
TTS-->>Chain: AudioOutput (audio, latency)
deactivate TTS
deactivate Chain
Chain->>Callback: TTS Final callback (audio, usage, latency)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/crud/llm.py (1)
46-55:⚠️ Potential issue | 🔴 CriticalFix invalid Python signature ordering in
create_llm_call.Line 51 introduces a defaulted parameter (
chain_id) before required keyword-only parameters (project_id,organization_id,resolved_config,original_provider), which is a syntax error (non-default argument follows default argument).🛠️ Proposed fix
def create_llm_call( session: Session, *, request: LLMCallRequest, job_id: UUID, - chain_id: UUID | None = None, project_id: int, organization_id: int, resolved_config: ConfigBlob, original_provider: str, + chain_id: UUID | None = None, ) -> LlmCall:#!/bin/bash # Verify the file parses after parameter reordering (read-only check) python - <<'PY' import ast, pathlib p = pathlib.Path("backend/app/crud/llm.py") try: ast.parse(p.read_text()) print("OK: backend/app/crud/llm.py parses successfully") except SyntaxError as e: print(f"SyntaxError: {e}") PY🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/crud/llm.py` around lines 46 - 55, The function signature for create_llm_call has a defaulted parameter (chain_id) placed before required keyword-only params, causing a syntax error; fix it by reordering parameters so all required keyword-only arguments (project_id, organization_id, resolved_config, original_provider) appear before any parameters with defaults, or alternatively give defaults to those subsequent parameters—update the signature in the create_llm_call definition accordingly so Python accepts it (keep references to create_llm_call, chain_id, project_id, organization_id, resolved_config, original_provider).
🧹 Nitpick comments (12)
backend/app/services/llm/chain/types.py (1)
15-15: Consider tighteningmetadatatype.Using
dict[str, Any] | Noneinstead ofdict | Nonewould make downstream usage and static checks safer.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/chain/types.py` at line 15, Replace the broad "metadata: dict | None = None" annotation with a tightened generic type "dict[str, Any] | None" in types.py and add the required "Any" import from typing; update the annotation on the metadata field (in the dataclass/type that declares metadata) to use dict[str, Any] | None so static checkers and downstream code get precise typing. Ensure you import Any (e.g., from typing import Any) at the top of the file.backend/app/tests/services/llm/test_chain_executor.py (2)
161-173: Consider using theresultvariable or removing assignment.The
resultvariable is assigned but not used. While the test validates callback behavior, adding an assertion onresultwould make the test more explicit, or you can use_to indicate intentionally unused.♻️ Proposed fix
- result = executor.run() + _ = executor.run() # Result validated via callback mock mock_callback.assert_called_once()Or add an assertion:
result = executor.run() mock_callback.assert_called_once() + assert result["success"] is False🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/test_chain_executor.py` around lines 161 - 173, In test_run_failure_sends_callback, the local variable result is assigned from executor.run() but never used; either add an assertion that inspects result (e.g., assert expected failure structure or status) to make the test explicit, or rename the variable to _ to indicate it is intentionally unused—update the test_run_failure_sends_callback method where executor.run() is called to implement one of these fixes.
192-215: Consider removing or usingmock_job_crud.The
mock_job_crudis patched but never asserted. If it's only needed to prevent real DB calls, this is acceptable. However, you could either add an assertion or use the_prefix to indicate it's intentionally unused.♻️ Proposed fix
- ) as mock_job_crud, + ) as _mock_job_crud, # Prevents real DB calls🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/test_chain_executor.py` around lines 192 - 215, The patch for JobCrud in test_setup_updates_job_and_chain_status creates mock_job_crud but never uses or asserts it; either add an assertion that it was invoked (e.g., ensure JobCrud was constructed or specific CRUD method called after executor.run) referencing mock_job_crud, or rename the patch target variable to _mock_job_crud to indicate it is intentionally unused; update the test around executor.run and the with(...) patch tuple to implement one of these fixes.backend/app/api/routes/llm_speech.py (1)
67-80: Move HTTPException import to top-level.Importing
HTTPExceptioninside conditional blocks works but is unconventional. Since it's already used in FastAPI routes throughout the codebase, moving it to the top-level imports improves readability and consistency.♻️ Proposed fix
Add to the top-level imports (line 5):
-from fastapi import APIRouter, Depends +from fastapi import APIRouter, Depends, HTTPExceptionThen remove the inline imports at lines 67 and 75:
if request.input_language and request.input_language != "auto": if request.input_language not in LANGUAGE_CODES: - from fastapi import HTTPException - raise HTTPException( status_code=400, detail=f"Unsupported input language: {request.input_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}", ) if request.output_language and request.output_language not in LANGUAGE_CODES: - from fastapi import HTTPException - raise HTTPException( status_code=400, detail=f"Unsupported output language: {request.output_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}", )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/routes/llm_speech.py` around lines 67 - 80, The HTTPException is being imported inside conditional blocks in llm_speech route; move the import to the module top-level alongside other FastAPI imports and remove the inline imports in the validation branches that raise HTTPException (the blocks checking request.input_language and request.output_language against LANGUAGE_CODES). Locate the inline "from fastapi import HTTPException" lines near the checks and delete them, and add a single top-level "from fastapi import HTTPException" import so the raises in those branches use the top-level symbol.backend/app/tests/services/llm/test_jobs.py (1)
1358-1366: Remove unused variable or add assertion.The
resultvariable is assigned but never used, as flagged by static analysis. Either remove the assignment or add an assertion to validate the expected behavior.♻️ Proposed fix - Option 1: Remove assignment
- result = self._execute_chain_job(chain_request_data) + self._execute_chain_job(chain_request_data)♻️ Proposed fix - Option 2: Add assertion
result = self._execute_chain_job(chain_request_data) mock_update_status.assert_called_once() _, kwargs = mock_update_status.call_args assert kwargs["chain_id"] == chain_id assert kwargs["status"].value == "failed" + assert result["success"] is False🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/test_jobs.py` around lines 1358 - 1366, The test assigns result = self._execute_chain_job(chain_request_data) but never uses it; either remove the unused assignment or add an assertion about the execution outcome. Locate the call to self._execute_chain_job in the test (near the mock_update_status assertions) and either delete the "result =" prefix so the call stands alone, or add an assertion referencing result (e.g., assert result is not None or assert result.status == ...) that expresses the expected behavior while keeping the existing mock_update_status assertions intact.backend/app/api/docs/llm/speech_to_speech.md (1)
7-15: Add language specifiers to fenced code blocks.The code blocks on lines 7-9 and 13-15 are missing language specifiers. While these are simple text displays, adding a language identifier (e.g.,
textorplaintext) satisfies linting rules and improves accessibility.📝 Proposed fix
-``` +```text POST /llm/sts```diff -``` +```text Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@backend/app/api/docs/llm/speech_to_speech.mdaround lines 7 - 15, Add
language specifiers to the two fenced code blocks that currently contain "POST
/llm/sts" and "Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS →
Voice Output": update the opening backticks of each code fence to include a
language identifier such astextorplaintext(e.g., changetotext)
so linting rules and accessibility are satisfied while leaving the block
contents unchanged.</details> </blockquote></details> <details> <summary>backend/app/services/llm/chain/chain.py (2)</summary><blockquote> `3-3`: **Import `Callable` from `collections.abc` instead of `typing`.** For Python 3.9+, importing `Callable` from `collections.abc` is preferred. <details> <summary>Proposed fix</summary> ```diff -from typing import Any, Callable +from collections.abc import Callable +from typing import Any ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/chain/chain.py` at line 3, The import currently brings Callable from typing; update the import in this module so Callable is imported from collections.abc instead (replace the typing import for Callable with "from collections.abc import Callable") to follow Python 3.9+ best practices; ensure any other uses of Callable in this file (the import line at the top that includes "from typing import Any, Callable") still keep Any imported from typing or move Any as needed. ``` </details> --- `121-135`: **Consider exposing block index via a property instead of accessing private `_index`.** The callback accesses `block._index` directly. While this works since it's internal code, exposing a read-only property would be cleaner. <details> <summary>Proposed change in ChainBlock class</summary> ```diff class ChainBlock: """A single block in the chain. Only responsible for executing itself.""" + + `@property` + def index(self) -> int: + """Block index in the chain.""" + return self._index ``` Then in `LLMChain.execute`: ```diff - on_block_completed(block._index, result) + on_block_completed(block.index, result) ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/chain/chain.py` around lines 121 - 135, Add a read-only property on ChainBlock (e.g., def index(self) -> int: return self._index) and update LLMChain.execute to use that property instead of accessing the private attribute directly: replace block._index with block.index in the on_block_completed call and the error log (and anywhere else in LLMChain.execute that reads the private field) so callers use the public read-only accessor while preserving current behavior. ``` </details> </blockquote></details> <details> <summary>backend/test_sts_debug.py (3)</summary><blockquote> `145-147`: **Remove unused variable assignment.** The `func` variable is assigned but never used. If this is intentional verification that the attribute exists, consider using `hasattr()` or simply accessing without assignment. <details> <summary>Proposed fix</summary> ```diff - func = getattr(module, "execute_chain_job") - print(f"✅ Dynamic import successful (same as Celery)") + # Verify attribute exists (same as Celery dynamic import) + _ = module.execute_chain_job + print("✅ Dynamic import successful (same as Celery)") ``` </details> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` around lines 145 - 147, The assignment to func using getattr(module, "execute_chain_job") is unused; either remove the assignment and just access/verify the attribute (e.g., use hasattr(module, "execute_chain_job") or assert getattr(module, "execute_chain_job") is not None) or use func after assignment. Update the code around importlib.import_module("app.services.llm.jobs") and the getattr call so you don't leave an unused variable named func—choose removal of the assignment or replace it with a presence check (hasattr or assert) to express the intent. ``` </details> --- `79-82`: **Remove extraneous f-string prefixes from strings without placeholders.** Static analysis correctly identifies multiple f-strings that contain no interpolation. These should be regular strings. <details> <summary>Example fixes</summary> ```diff - print(f"✅ Chain job created successfully!") - print(f" Job ID: {job_id}") - print(f" Check your Celery worker logs for task execution") + print("✅ Chain job created successfully!") + print(f" Job ID: {job_id}") + print(" Check your Celery worker logs for task execution") ``` ```diff - print(f"✅ Celery workers are running:") + print("✅ Celery workers are running:") ``` Similar changes apply to lines 116, 139, 147, and 170-191. </details> Also applies to: 105-105, 116-116, 139-139, 147-147, 170-191 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` around lines 79 - 82, Several print statements use f-strings with no interpolation; change them to normal string literals to satisfy static analysis (remove the leading f from prints that contain no `{}` placeholders). Locate the print calls around the function that prints and returns job_id (the three lines printing "✅ Chain job created successfully!", " Job ID: {job_id}" is fine to keep as an f-string but others without placeholders must drop the f) and also update the other print lines flagged (around lines indicated: 105, 116, 139, 147, and the block 170-191) by removing the unnecessary f prefix so only prints that actually interpolate job_id or other variables use f-strings. ``` </details> --- `14-14`: **Add return type hints to functions.** Per coding guidelines, all functions should have type hints for parameters and return values. These debug functions are missing return type annotations. <details> <summary>Proposed fixes</summary> ```diff -def test_chain_job_creation(): +def test_chain_job_creation() -> str | None: ``` ```diff -def check_celery_connection(): +def check_celery_connection() -> None: ``` ```diff -def check_function_import(): +def check_function_import() -> None: ``` </details> Also applies to: 91-91, 130-130 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` at line 14, Add explicit return type annotations to the test functions (e.g., test_chain_job_creation and the other top-level test_* functions missing annotations) by adding -> None to their definitions and annotate any parameters if present; update each def statement (such as def test_chain_job_creation():) to include the return type (def test_chain_job_creation() -> None:) so all test functions have proper type hints. ``` </details> </blockquote></details> <details> <summary>backend/app/models/llm/request.py (1)</summary><blockquote> `15-34`: **Consider using `enum.StrEnum` for string enums (Python 3.11+).** Per coding guidelines, this codebase uses Python 3.11+. `StrEnum` is cleaner than inheriting from both `str` and `Enum`. <details> <summary>Proposed fix</summary> ```diff +from enum import StrEnum -from enum import Enum ... -class STTModel(str, Enum): +class STTModel(StrEnum): """Supported STT models for speech-to-speech.""" GEMINI_PRO = "gemini-2.5-pro" SARVAM = "saaras:v3" -class TTSModel(str, Enum): +class TTSModel(StrEnum): """Supported TTS models for speech-to-speech.""" GEMINI_PRO = "gemini-2.5-pro-preview-tts" SARVAM = "bulbul:v3" -class LLMModel(str, Enum): +class LLMModel(StrEnum): """Supported LLM models for RAG in speech-to-speech.""" GPT4O = "gpt-4o" GPT4O_MINI = "gpt-4o-mini" -class ChainStatus(str, Enum): +class ChainStatus(StrEnum): """Status of an LLM chain execution.""" PENDING = "pending" ... ``` </details> Also applies to: 634-640 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 15 - 34, The enums STTModel, TTSModel, and LLMModel currently inherit from str and Enum; replace them to inherit from enum.StrEnum (import StrEnum from enum) so they become true string enums per Python 3.11+; update the class bases for STTModel, TTSModel, and LLMModel and ensure the import is added (and replicate the same change for the other occurrence noted at lines 634-640). ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@backend/app/api/routes/llm_chain.py:
- Around line 39-41: The llm_chain endpoint function lacks a return type hint;
update the function signature for llm_chain(_current_user: AuthContextDep,
_session: SessionDep, request: LLMChainRequest) to include the correct return
type (e.g., -> LLMChainResponse or the actual response type used by this route),
and add any missing import for that response type at the top of the file; ensure
the declared return type matches what the function actually returns.In
@backend/app/api/routes/llm_speech_examples.md:
- Around line 316-323: The JSON example in the llm_speech_examples.md fenced
block contains inline // comments which make it invalid JSON; update the example
by either removing the comments from the JSON object (remove // Auto-detect and
// Odia, Bengali, Punjabi, etc. next to "input_language" and "output_language")
or change the fence fromjson tojsonc so the comments are allowed—edit
the fenced example containing "query", "knowledge_base_ids", "input_language",
and "output_language" accordingly.- Around line 5-7: Several fenced code blocks in llm_speech_examples.md (for
example the block showing "POST /llm/sts") lack language specifiers; update each
such fence to include the appropriate language tag (e.g., changetohttp
for HTTP request examples like the "POST /llm/sts" block and to ```text for
plain text snippets) so the markdown linter (MD040) is satisfied and readability
is improved. Locate the fences that contain HTTP request examples (e.g., the
block with "POST /llm/sts" and the other similar request examples) and the
plain-text response or sample blocks, and add the correct language identifiers
to those triple-backtick fences.In
@backend/app/api/routes/llm_speech.py:
- Around line 36-40: The function speech_to_speech is missing a return type
annotation; update its signature to include an explicit return type that matches
the actual return value used in the implementation (for example, if it returns a
FastAPI Response or StreamingResponse annotate with that type, or annotate with
dict/Any if it returns JSON-like data). Modify the def speech_to_speech(...)
signature to add the correct return type hint while keeping existing parameter
types AuthContextDep, SessionDep, and SpeechToSpeechRequest unchanged.In
@backend/app/models/llm/response.py:
- Around line 83-88: Update the docstring for the IntermediateChainResponse
class by correcting the typo "Flattend" to "Flattened" in the descriptive text
so the docstring reads "Flattened structure matching LLMCallResponse keys for
consistency"; locate the docstring in the IntermediateChainResponse definition
and make the single-word correction.In
@backend/app/services/llm/chain/utils.py:
- Around line 94-96: The assignment to params["language_code"] uses a redundant
conditional; replace the conditional expression with a direct assignment so
params["language_code"] is set to language_code (i.e., remove the ternary that
checks for "unknown"), locating the change at the line where
params["language_code"] and language_code are used in this module.- Around line 149-184: The build_tts_block function currently hardcodes "en-IN"
for Sarvam provider instead of using the passed language_code; update the Sarvam
branch in build_tts_block so params["target_language_code"] = language_code (use
the function arg), keeping the existing params["speaker"] and
params["output_audio_codec"] assignments intact; verify model_configs and the
provider detection logic for "sarvamai-native" remain unchanged so the correct
branch still runs for TTSModel.SARVAM.In
@backend/app/tests/services/llm/test_chain.py:
- Around line 33-219: The tests use untyped, manually-constructed fixtures
(context, text_response, audio_response, make_config) and unannotated test
functions; replace these with the project's factory-pattern fixtures (use the
existing chain/context/llm factories instead of manual construction inside
context, text_response, audio_response, and make_config) and add explicit type
hints to each fixture and test function signature and return value (e.g.,
annotate context: ChainContext, text_response/audio_response: LLMCallResponse,
make_config: LLMCallConfig, and test functions' parameters/return types),
importing any needed types from the production modules; update tests referencing
ChainBlock, LLMChain, result_to_query, BlockResult, QueryParams to accept the
typed fixtures so the tests remain functionally identical but follow the factory
- typing guidelines.
Outside diff comments:
In@backend/app/crud/llm.py:
- Around line 46-55: The function signature for create_llm_call has a defaulted
parameter (chain_id) placed before required keyword-only params, causing a
syntax error; fix it by reordering parameters so all required keyword-only
arguments (project_id, organization_id, resolved_config, original_provider)
appear before any parameters with defaults, or alternatively give defaults to
those subsequent parameters—update the signature in the create_llm_call
definition accordingly so Python accepts it (keep references to create_llm_call,
chain_id, project_id, organization_id, resolved_config, original_provider).
Nitpick comments:
In@backend/app/api/docs/llm/speech_to_speech.md:
- Around line 7-15: Add language specifiers to the two fenced code blocks that
currently contain "POST /llm/sts" and "Voice Input → STT (auto language) → RAG
(Knowledge Base) → TTS → Voice Output": update the opening backticks of each
code fence to include a language identifier such astextorplaintext(e.g.,
changetotext) so linting rules and accessibility are satisfied while
leaving the block contents unchanged.In
@backend/app/api/routes/llm_speech.py:
- Around line 67-80: The HTTPException is being imported inside conditional
blocks in llm_speech route; move the import to the module top-level alongside
other FastAPI imports and remove the inline imports in the validation branches
that raise HTTPException (the blocks checking request.input_language and
request.output_language against LANGUAGE_CODES). Locate the inline "from fastapi
import HTTPException" lines near the checks and delete them, and add a single
top-level "from fastapi import HTTPException" import so the raises in those
branches use the top-level symbol.In
@backend/app/models/llm/request.py:
- Around line 15-34: The enums STTModel, TTSModel, and LLMModel currently
inherit from str and Enum; replace them to inherit from enum.StrEnum (import
StrEnum from enum) so they become true string enums per Python 3.11+; update the
class bases for STTModel, TTSModel, and LLMModel and ensure the import is added
(and replicate the same change for the other occurrence noted at lines 634-640).In
@backend/app/services/llm/chain/chain.py:
- Line 3: The import currently brings Callable from typing; update the import in
this module so Callable is imported from collections.abc instead (replace the
typing import for Callable with "from collections.abc import Callable") to
follow Python 3.9+ best practices; ensure any other uses of Callable in this
file (the import line at the top that includes "from typing import Any,
Callable") still keep Any imported from typing or move Any as needed.- Around line 121-135: Add a read-only property on ChainBlock (e.g., def
index(self) -> int: return self._index) and update LLMChain.execute to use that
property instead of accessing the private attribute directly: replace
block._index with block.index in the on_block_completed call and the error log
(and anywhere else in LLMChain.execute that reads the private field) so callers
use the public read-only accessor while preserving current behavior.In
@backend/app/services/llm/chain/types.py:
- Line 15: Replace the broad "metadata: dict | None = None" annotation with a
tightened generic type "dict[str, Any] | None" in types.py and add the required
"Any" import from typing; update the annotation on the metadata field (in the
dataclass/type that declares metadata) to use dict[str, Any] | None so static
checkers and downstream code get precise typing. Ensure you import Any (e.g.,
from typing import Any) at the top of the file.In
@backend/app/tests/services/llm/test_chain_executor.py:
- Around line 161-173: In test_run_failure_sends_callback, the local variable
result is assigned from executor.run() but never used; either add an assertion
that inspects result (e.g., assert expected failure structure or status) to make
the test explicit, or rename the variable to _ to indicate it is intentionally
unused—update the test_run_failure_sends_callback method where executor.run() is
called to implement one of these fixes.- Around line 192-215: The patch for JobCrud in
test_setup_updates_job_and_chain_status creates mock_job_crud but never uses or
asserts it; either add an assertion that it was invoked (e.g., ensure JobCrud
was constructed or specific CRUD method called after executor.run) referencing
mock_job_crud, or rename the patch target variable to _mock_job_crud to indicate
it is intentionally unused; update the test around executor.run and the
with(...) patch tuple to implement one of these fixes.In
@backend/app/tests/services/llm/test_jobs.py:
- Around line 1358-1366: The test assigns result =
self._execute_chain_job(chain_request_data) but never uses it; either remove the
unused assignment or add an assertion about the execution outcome. Locate the
call to self._execute_chain_job in the test (near the mock_update_status
assertions) and either delete the "result =" prefix so the call stands alone, or
add an assertion referencing result (e.g., assert result is not None or assert
result.status == ...) that expresses the expected behavior while keeping the
existing mock_update_status assertions intact.In
@backend/test_sts_debug.py:
- Around line 145-147: The assignment to func using getattr(module,
"execute_chain_job") is unused; either remove the assignment and just
access/verify the attribute (e.g., use hasattr(module, "execute_chain_job") or
assert getattr(module, "execute_chain_job") is not None) or use func after
assignment. Update the code around
importlib.import_module("app.services.llm.jobs") and the getattr call so you
don't leave an unused variable named func—choose removal of the assignment or
replace it with a presence check (hasattr or assert) to express the intent.- Around line 79-82: Several print statements use f-strings with no
interpolation; change them to normal string literals to satisfy static analysis
(remove the leading f from prints that contain no{}placeholders). Locate the
print calls around the function that prints and returns job_id (the three lines
printing "✅ Chain job created successfully!", " Job ID: {job_id}" is fine to
keep as an f-string but others without placeholders must drop the f) and also
update the other print lines flagged (around lines indicated: 105, 116, 139,
147, and the block 170-191) by removing the unnecessary f prefix so only prints
that actually interpolate job_id or other variables use f-strings.- Line 14: Add explicit return type annotations to the test functions (e.g.,
test_chain_job_creation and the other top-level test_* functions missing
annotations) by adding -> None to their definitions and annotate any parameters
if present; update each def statement (such as def test_chain_job_creation():)
to include the return type (def test_chain_job_creation() -> None:) so all test
functions have proper type hints.</details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: defaults **Review profile**: CHILL **Plan**: Pro **Run ID**: `10106e27-d2cf-4baa-b7a8-1b10af5b048b` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between e28079c70ea98dac4ed4886541565eac1bf3f6b5 and a530796e7639ecc7f6fe12a92b25a711925ca21c. </details> <details> <summary>📒 Files selected for processing (25)</summary> * `backend/app/alembic/versions/048_create_llm_chain_table.py` * `backend/app/api/docs/llm/llm_chain.md` * `backend/app/api/docs/llm/speech_to_speech.md` * `backend/app/api/main.py` * `backend/app/api/routes/llm_chain.py` * `backend/app/api/routes/llm_speech.py` * `backend/app/api/routes/llm_speech_examples.md` * `backend/app/crud/llm.py` * `backend/app/crud/llm_chain.py` * `backend/app/models/__init__.py` * `backend/app/models/job.py` * `backend/app/models/llm/__init__.py` * `backend/app/models/llm/request.py` * `backend/app/models/llm/response.py` * `backend/app/services/llm/chain/__init__.py` * `backend/app/services/llm/chain/chain.py` * `backend/app/services/llm/chain/executor.py` * `backend/app/services/llm/chain/types.py` * `backend/app/services/llm/chain/utils.py` * `backend/app/services/llm/jobs.py` * `backend/app/tests/crud/test_llm_chain.py` * `backend/app/tests/services/llm/test_chain.py` * `backend/app/tests/services/llm/test_chain_executor.py` * `backend/app/tests/services/llm/test_jobs.py` * `backend/test_sts_debug.py` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
| def llm_chain( | ||
| _current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest | ||
| ): |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add return type hint to function signature.
The function is missing a return type hint. As per coding guidelines, all function parameters and return values should have type hints.
📝 Proposed fix
def llm_chain(
_current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest
-):
+) -> APIResponse[Message]:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def llm_chain( | |
| _current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest | |
| ): | |
| def llm_chain( | |
| _current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest | |
| ) -> APIResponse[Message]: |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_chain.py` around lines 39 - 41, The llm_chain
endpoint function lacks a return type hint; update the function signature for
llm_chain(_current_user: AuthContextDep, _session: SessionDep, request:
LLMChainRequest) to include the correct return type (e.g., -> LLMChainResponse
or the actual response type used by this route), and add any missing import for
that response type at the top of the file; ensure the declared return type
matches what the function actually returns.
| ``` | ||
| POST /llm/sts | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced code blocks.
These fences are missing language specifiers (MD040). Please label them (e.g., http, text) to satisfy lint and improve readability.
Also applies to: 124-126, 167-169, 211-214
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 5-5: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_speech_examples.md` around lines 5 - 7, Several
fenced code blocks in llm_speech_examples.md (for example the block showing
"POST /llm/sts") lack language specifiers; update each such fence to include the
appropriate language tag (e.g., change ``` to ```http for HTTP request examples
like the "POST /llm/sts" block and to ```text for plain text snippets) so the
markdown linter (MD040) is satisfied and readability is improved. Locate the
fences that contain HTTP request examples (e.g., the block with "POST /llm/sts"
and the other similar request examples) and the plain-text response or sample
blocks, and add the correct language identifiers to those triple-backtick
fences.
| { | ||
| "query": {...}, | ||
| "knowledge_base_ids": ["kb_123"], | ||
| "input_language": "auto", // Auto-detect | ||
| "output_language": "odia", // Odia, Bengali, Punjabi, etc. | ||
| "callback_url": "..." | ||
| } | ||
| ``` |
There was a problem hiding this comment.
json example is not valid JSON.
Lines 319-320 include // comments inside a json block. Either remove inline comments or change the fence to jsonc.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_speech_examples.md` around lines 316 - 323, The
JSON example in the llm_speech_examples.md fenced block contains inline //
comments which make it invalid JSON; update the example by either removing the
comments from the JSON object (remove // Auto-detect and // Odia, Bengali,
Punjabi, etc. next to "input_language" and "output_language") or change the
fence from ```json to ```jsonc so the comments are allowed—edit the fenced
example containing "query", "knowledge_base_ids", "input_language", and
"output_language" accordingly.
| def speech_to_speech( | ||
| _current_user: AuthContextDep, | ||
| _session: SessionDep, | ||
| request: SpeechToSpeechRequest, | ||
| ): |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add return type hint to function signature.
The function is missing a return type hint. As per coding guidelines, all function parameters and return values should have type hints.
📝 Proposed fix
def speech_to_speech(
_current_user: AuthContextDep,
_session: SessionDep,
request: SpeechToSpeechRequest,
-):
+) -> APIResponse[Message]:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def speech_to_speech( | |
| _current_user: AuthContextDep, | |
| _session: SessionDep, | |
| request: SpeechToSpeechRequest, | |
| ): | |
| def speech_to_speech( | |
| _current_user: AuthContextDep, | |
| _session: SessionDep, | |
| request: SpeechToSpeechRequest, | |
| ) -> APIResponse[Message]: |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_speech.py` around lines 36 - 40, The function
speech_to_speech is missing a return type annotation; update its signature to
include an explicit return type that matches the actual return value used in the
implementation (for example, if it returns a FastAPI Response or
StreamingResponse annotate with that type, or annotate with dict/Any if it
returns JSON-like data). Modify the def speech_to_speech(...) signature to add
the correct return type hint while keeping existing parameter types
AuthContextDep, SessionDep, and SpeechToSpeechRequest unchanged.
| params["language_code"] = ( | ||
| language_code if language_code != "unknown" else "unknown" | ||
| ) |
There was a problem hiding this comment.
Redundant conditional expression.
The expression language_code if language_code != "unknown" else "unknown" always evaluates to language_code regardless of the condition.
Proposed fix
if provider == "sarvamai-native":
# Use "unknown" for automatic language detection, or specific BCP-47 code
- params["language_code"] = (
- language_code if language_code != "unknown" else "unknown"
- )
+ params["language_code"] = language_code
params["mode"] = "transcribe"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/services/llm/chain/utils.py` around lines 94 - 96, The assignment
to params["language_code"] uses a redundant conditional; replace the conditional
expression with a direct assignment so params["language_code"] is set to
language_code (i.e., remove the ternary that checks for "unknown"), locating the
change at the line where params["language_code"] and language_code are used in
this module.
| @pytest.fixture | ||
| def context(): | ||
| return ChainContext( | ||
| job_id=uuid4(), | ||
| chain_id=uuid4(), | ||
| project_id=1, | ||
| organization_id=1, | ||
| callback_url="https://example.com/callback", | ||
| total_blocks=3, | ||
| intermediate_callback_flags=[True, True, False], | ||
| ) | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def text_response(): | ||
| return LLMCallResponse( | ||
| response=LLMResponse( | ||
| provider_response_id="resp-1", | ||
| conversation_id=None, | ||
| model="gpt-4", | ||
| provider="openai", | ||
| output=TextOutput(content=ResponseTextContent(value="Hello world")), | ||
| ), | ||
| usage=Usage(input_tokens=10, output_tokens=20, total_tokens=30), | ||
| provider_raw_response=None, | ||
| ) | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def audio_response(): | ||
| return LLMCallResponse( | ||
| response=LLMResponse( | ||
| provider_response_id="resp-2", | ||
| conversation_id=None, | ||
| model="gemini", | ||
| provider="google", | ||
| output=AudioOutput( | ||
| content=AudioContent( | ||
| format="base64", | ||
| value="audio-data-base64", | ||
| mime_type="audio/wav", | ||
| ) | ||
| ), | ||
| ), | ||
| usage=Usage(input_tokens=5, output_tokens=15, total_tokens=20), | ||
| provider_raw_response=None, | ||
| ) | ||
|
|
||
|
|
||
| def make_config(): | ||
| return LLMCallConfig( | ||
| blob=ConfigBlob( | ||
| completion=NativeCompletionConfig( | ||
| provider="openai-native", | ||
| type="text", | ||
| params={"model": "gpt-4"}, | ||
| ) | ||
| ) | ||
| ) | ||
|
|
||
|
|
||
| class TestResultToQuery: | ||
| def test_text_output_to_query(self, text_response): | ||
| result = BlockResult(response=text_response, usage=text_response.usage) | ||
|
|
||
| query = result_to_query(result) | ||
|
|
||
| assert isinstance(query.input, TextInput) | ||
| assert query.input.content.value == "Hello world" | ||
|
|
||
| def test_audio_output_to_query(self, audio_response): | ||
| result = BlockResult(response=audio_response, usage=audio_response.usage) | ||
|
|
||
| query = result_to_query(result) | ||
|
|
||
| assert isinstance(query.input, AudioInput) | ||
| assert query.input.content.value == "audio-data-base64" | ||
|
|
||
| def test_unsupported_output_type_raises(self): | ||
| mock_response = MagicMock() | ||
| mock_response.response.output.type = "unknown" | ||
| mock_response.response.output.__class__ = type("Unknown", (), {}) | ||
| result = BlockResult(response=mock_response, usage=MagicMock()) | ||
|
|
||
| with pytest.raises(ValueError, match="Cannot chain output type"): | ||
| result_to_query(result) | ||
|
|
||
|
|
||
| class TestChainBlock: | ||
| def test_execute_single_block(self, context, text_response): | ||
| query = QueryParams(input="test input") | ||
| config = make_config() | ||
| block = ChainBlock(config=config, index=0, context=context) | ||
|
|
||
| with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute: | ||
| mock_execute.return_value = BlockResult( | ||
| response=text_response, usage=text_response.usage | ||
| ) | ||
|
|
||
| result = block.execute(query) | ||
|
|
||
| assert result.success | ||
| mock_execute.assert_called_once() | ||
|
|
||
| def test_execute_returns_failure(self, context): | ||
| query = QueryParams(input="test input") | ||
| config = make_config() | ||
| block = ChainBlock(config=config, index=0, context=context) | ||
|
|
||
| with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute: | ||
| mock_execute.return_value = BlockResult(error="Provider error") | ||
|
|
||
| result = block.execute(query) | ||
|
|
||
| assert not result.success | ||
| assert result.error == "Provider error" | ||
| mock_execute.assert_called_once() | ||
|
|
||
|
|
||
| class TestLLMChain: | ||
| def test_execute_empty_chain(self, context): | ||
| chain = LLMChain([], context) | ||
| query = QueryParams(input="test") | ||
|
|
||
| result = chain.execute(query) | ||
|
|
||
| assert not result.success | ||
| assert result.error == "Chain has no blocks" | ||
|
|
||
| def test_execute_single_block_chain(self, context, text_response): | ||
| config = make_config() | ||
| block = ChainBlock(config=config, index=0, context=context) | ||
| chain = LLMChain([block], context) | ||
|
|
||
| with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute: | ||
| mock_execute.return_value = BlockResult( | ||
| response=text_response, usage=text_response.usage | ||
| ) | ||
|
|
||
| result = chain.execute(QueryParams(input="hello")) | ||
|
|
||
| assert result.success | ||
| mock_execute.assert_called_once() | ||
|
|
||
| def test_execute_multi_block_chain(self, context, text_response): | ||
| config = make_config() | ||
| blocks = [ChainBlock(config=config, index=i, context=context) for i in range(3)] | ||
| chain = LLMChain(blocks, context) | ||
|
|
||
| with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute: | ||
| mock_execute.return_value = BlockResult( | ||
| response=text_response, usage=text_response.usage | ||
| ) | ||
|
|
||
| result = chain.execute(QueryParams(input="hello")) | ||
|
|
||
| assert result.success | ||
| assert mock_execute.call_count == 3 | ||
|
|
||
| def test_execute_stops_on_failure(self, context, text_response): | ||
| config = make_config() | ||
| blocks = [ChainBlock(config=config, index=i, context=context) for i in range(3)] | ||
| chain = LLMChain(blocks, context) | ||
|
|
||
| with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute: | ||
| mock_execute.return_value = BlockResult(error="Provider error") | ||
|
|
||
| result = chain.execute(QueryParams(input="hello")) | ||
|
|
||
| assert not result.success | ||
| assert result.error == "Provider error" | ||
| mock_execute.assert_called_once() | ||
|
|
||
| def test_execute_calls_on_block_completed(self, context, text_response): | ||
| config = make_config() | ||
| blocks = [ChainBlock(config=config, index=i, context=context) for i in range(2)] | ||
| chain = LLMChain(blocks, context) | ||
| callback = MagicMock() | ||
|
|
||
| with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute: | ||
| mock_execute.return_value = BlockResult( | ||
| response=text_response, usage=text_response.usage | ||
| ) | ||
|
|
||
| chain.execute(QueryParams(input="hello"), on_block_completed=callback) | ||
|
|
||
| assert callback.call_count == 2 |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Adopt typed factory fixtures in this test module.
Across Lines 33-219, fixture/test helpers are untyped and setup objects are manually assembled. Please switch to the project’s factory-pattern fixtures and add explicit type hints for fixture/test function signatures and returns.
As per coding guidelines backend/app/tests/**/*.py: “Use factory pattern for test fixtures in backend/app/tests/” and **/*.py: “Always add type hints to all function parameters and return values in Python code”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/tests/services/llm/test_chain.py` around lines 33 - 219, The
tests use untyped, manually-constructed fixtures (context, text_response,
audio_response, make_config) and unannotated test functions; replace these with
the project's factory-pattern fixtures (use the existing chain/context/llm
factories instead of manual construction inside context, text_response,
audio_response, and make_config) and add explicit type hints to each fixture and
test function signature and return value (e.g., annotate context: ChainContext,
text_response/audio_response: LLMCallResponse, make_config: LLMCallConfig, and
test functions' parameters/return types), importing any needed types from the
production modules; update tests referencing ChainBlock, LLMChain,
result_to_query, BlockResult, QueryParams to accept the typed fixtures so the
tests remain functionally identical but follow the factory + typing guidelines.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
9e4f58e to
c1807df
Compare
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (7)
backend/test_sts_debug.py (5)
91-91: Add return type hint to function signature.📝 Proposed fix
-def check_celery_connection(): +def check_celery_connection() -> None: """Check if Celery is running and can receive tasks."""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` at line 91, The function check_celery_connection currently lacks a return type hint; update its signature to include the appropriate return type (e.g., -> bool or -> None) consistent with its implementation and callers so type checkers know what it returns; locate the function named check_celery_connection and add the correct return annotation to the def line and adjust any typing imports if needed.
130-130: Add return type hint to function signature.📝 Proposed fix
-def check_function_import(): +def check_function_import() -> None: """Verify execute_chain_job can be imported."""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` at line 130, The function check_function_import currently lacks a return type hint; update its signature to include an explicit return type (e.g., -> bool or -> None as appropriate) so callers and type checkers know what to expect; locate the def check_function_import() declaration and add the correct return annotation consistent with its implementation and tests.
79-82: Remove extraneousfprefixes from strings without placeholders.Multiple
fprefix or use regular strings.Affected lines: 79, 81, 105, 116, 139, 147, 170-182, 190-191.
📝 Example fix
- print(f"✅ Chain job created successfully!") + print("✅ Chain job created successfully!") print(f" Job ID: {job_id}") - print(f" Check your Celery worker logs for task execution") + print(" Check your Celery worker logs for task execution")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` around lines 79 - 82, Several print statements use f-strings with no placeholders (e.g., print(f"✅ Chain job created successfully!"), print(f" Check your Celery worker logs for task execution"), and other prints around job_id, status, and debug outputs). Remove the unnecessary 'f' prefix on these string literals (or add actual placeholders if intended) so they become regular strings (e.g., print("✅ Chain job created successfully!")) and leave prints that do interpolate variables (like job_id) as f-strings; update all occurrences flagged (the prints around job_id, status messages, and the blocks mentioned) to eliminate F541 warnings while keeping semantics unchanged.
143-148: Remove unused variable and simplify dynamic import check.Line 146 assigns to
funcbut never uses it. Also,getattrwith a constant attribute can be replaced with direct attribute access. However, since this tests dynamic import behavior (same as Celery), consider usinghasattrinstead if you only need to verify importability.📝 Proposed fix
# Try dynamic import (same way Celery does it) import importlib module = importlib.import_module("app.services.llm.jobs") - func = getattr(module, "execute_chain_job") + _ = module.execute_chain_job # Verify attribute exists print(f"✅ Dynamic import successful (same as Celery)")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` around lines 143 - 148, Remove the unused variable and simplify the dynamic import check by replacing the unused getattr pattern: import the module via importlib.import_module("app.services.llm.jobs") and verify the presence of the constant attribute "execute_chain_job" with hasattr(module, "execute_chain_job") (instead of assigning func = getattr(...)), then keep the existing print confirmation; ensure importlib, module, and the "execute_chain_job" symbol are referenced as described.
14-14: Add return type hint to function signature.Per coding guidelines, all functions should have type hints for parameters and return values.
📝 Proposed fix
-def test_chain_job_creation(): +def test_chain_job_creation() -> str | None: """Test if chain job can be created and queued."""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/test_sts_debug.py` at line 14, The test function test_chain_job_creation lacks a return type annotation; update its signature to include an explicit return type (e.g., change def test_chain_job_creation(): to def test_chain_job_creation() -> None:) to comply with the project's type-hinting guidelines; ensure any parameters (if added later) also include type hints and run the test suite/type checker after the change.backend/app/models/llm/request.py (1)
14-34: Consider usingStrEnuminstead of(str, Enum)for Python 3.11+.Per coding guidelines, the codebase uses Python 3.11+. The modern
StrEnumfrom theenummodule is the idiomatic replacement for(str, Enum)inheritance, providing cleaner code and better type inference.📝 Proposed fix
-from enum import Enum +from enum import Enum, StrEnum # Speech-to-Speech Model Enums -class STTModel(str, Enum): +class STTModel(StrEnum): """Supported STT models for speech-to-speech.""" GEMINI_PRO = "gemini-2.5-pro" SARVAM = "saaras:v3" -class TTSModel(str, Enum): +class TTSModel(StrEnum): """Supported TTS models for speech-to-speech.""" GEMINI_PRO = "gemini-2.5-pro-preview-tts" SARVAM = "bulbul:v3" -class LLMModel(str, Enum): +class LLMModel(StrEnum): """Supported LLM models for RAG in speech-to-speech.""" GPT4O = "gpt-4o" GPT4O_MINI = "gpt-4o-mini"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 14 - 34, Replace the three Enum classes to use the modern StrEnum: import StrEnum from enum (instead of using (str, Enum)) and change class bases for STTModel, TTSModel, and LLMModel to inherit from StrEnum; keep the same member names/values unchanged so downstream code and type inference benefit from StrEnum behavior.backend/app/api/routes/llm_speech_examples.md (1)
1-7: Documentation file location violates coding guidelines.This file is located at
backend/app/api/routes/llm_speech_examples.md, but per coding guidelines, endpoint documentation should be stored inbackend/app/api/docs/<domain>/<action>.md. Consider moving this file tobackend/app/api/docs/llm/speech_to_speech_examples.mdto maintain consistency with the codebase structure.As per coding guidelines: "Store endpoint documentation files in
backend/app/api/docs/<domain>/<action>.md"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/routes/llm_speech_examples.md` around lines 1 - 7, The documentation file backend/app/api/routes/llm_speech_examples.md is in the wrong location per guidelines; move it into the docs tree and rename to follow domain/action semantics (e.g., backend/app/api/docs/llm/speech_to_speech_examples.md), update any references or imports that point to the old path (search for llm_speech_examples.md or routes/llm_speech_examples.md), and ensure the new file path and name match other docs naming conventions so tooling and CI pick it up.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/docs/llm/speech_to_speech.md`:
- Around line 183-212: The TTS callback example under "Callback 3: TTS Output
(Final)" contains the wrong model version; update the JSON field "model":
"bulbul:v1" to "model": "bulbul:v3" so it matches the documented default TTS
model used elsewhere (e.g., the default referenced at "bulbul:v3"); ensure the
change is made only in the example JSON "response" object where "provider":
"sarvamai-native" and "provider_response_id": "tts_def456".
- Around line 122-151: The STT callback example under "Callback 1: STT Output
(Intermediate)" uses an incorrect model name; update the JSON "model" field in
that example (the key "model" inside the "response" object) from "saarika:v1" to
the documented default "saaras:v3" so it matches the default STT model elsewhere
in the docs.
- Around line 7-15: The Markdown fenced code blocks in speech_to_speech.md (the
POST /llm/sts endpoint block and the Flow diagram block) lack language
identifiers; update the first fenced block to use the "http" language identifier
and the Flow fenced block to use the "text" identifier so they pass MD040
linting (refer to the fenced blocks containing "POST /llm/sts" and "Voice Input
→ STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output").
In `@backend/app/api/routes/llm_speech_examples.md`:
- Around line 86-121: Update the STT callback JSON example so its "provider" and
"model" match the documented defaults: replace "provider": "google-native" and
"model": "gemini-2.5-pro" in the response block with "provider":
"sarvamai-native" and "model": "saaras:v3" (or alternatively add a short note in
the example clarifying it intentionally uses non-default selection); modify only
the values for the "provider" and "model" fields inside the "response" object to
keep the example consistent with the "Defaults Used" section.
- Around line 336-342: The README example uses "tts_model": "bulbul-v3" which
conflicts with the TTSModel enum in request.py that defines "bulbul:v3"; update
the example in llm_speech_examples.md (the JSON block showing
stt_model/llm_model/tts_model) to use the colon format "bulbul:v3" so it matches
the TTSModel enum, or alternatively change the TTSModel entry in request.py to
"bulbul-v3" if you prefer hyphen formatting—ensure both the example and the
TTSModel constant (named TTSModel) use the exact same string.
In `@backend/app/api/routes/llm_speech.py`:
- Around line 104-115: The code mutates request.request_metadata in-place by
assigning metadata = request.request_metadata and then calling
metadata.update(...); instead create a new dict (e.g., metadata =
dict(request.request_metadata or {} ) or use request.request_metadata.copy()) so
you can safely update it with the STS keys ("speech_to_speech",
"input_language", "output_language", "stt_model", "llm_model", "tts_model")
without altering the original request.request_metadata; update the metadata
variable and leave request.request_metadata immutable.
- Around line 65-80: The two inline imports of HTTPException inside the
validation block should be moved to the module level with the other FastAPI
imports: add "from fastapi import HTTPException" at the top of the file and
remove the inline imports inside the conditional checks that validate
request.input_language and request.output_language against LANGUAGE_CODES;
ensure the behavior of the blocks using request.input_language,
request.output_language, and LANGUAGE_CODES remains unchanged.
---
Nitpick comments:
In `@backend/app/api/routes/llm_speech_examples.md`:
- Around line 1-7: The documentation file
backend/app/api/routes/llm_speech_examples.md is in the wrong location per
guidelines; move it into the docs tree and rename to follow domain/action
semantics (e.g., backend/app/api/docs/llm/speech_to_speech_examples.md), update
any references or imports that point to the old path (search for
llm_speech_examples.md or routes/llm_speech_examples.md), and ensure the new
file path and name match other docs naming conventions so tooling and CI pick it
up.
In `@backend/app/models/llm/request.py`:
- Around line 14-34: Replace the three Enum classes to use the modern StrEnum:
import StrEnum from enum (instead of using (str, Enum)) and change class bases
for STTModel, TTSModel, and LLMModel to inherit from StrEnum; keep the same
member names/values unchanged so downstream code and type inference benefit from
StrEnum behavior.
In `@backend/test_sts_debug.py`:
- Line 91: The function check_celery_connection currently lacks a return type
hint; update its signature to include the appropriate return type (e.g., -> bool
or -> None) consistent with its implementation and callers so type checkers know
what it returns; locate the function named check_celery_connection and add the
correct return annotation to the def line and adjust any typing imports if
needed.
- Line 130: The function check_function_import currently lacks a return type
hint; update its signature to include an explicit return type (e.g., -> bool or
-> None as appropriate) so callers and type checkers know what to expect; locate
the def check_function_import() declaration and add the correct return
annotation consistent with its implementation and tests.
- Around line 79-82: Several print statements use f-strings with no placeholders
(e.g., print(f"✅ Chain job created successfully!"), print(f" Check your Celery
worker logs for task execution"), and other prints around job_id, status, and
debug outputs). Remove the unnecessary 'f' prefix on these string literals (or
add actual placeholders if intended) so they become regular strings (e.g.,
print("✅ Chain job created successfully!")) and leave prints that do interpolate
variables (like job_id) as f-strings; update all occurrences flagged (the prints
around job_id, status messages, and the blocks mentioned) to eliminate F541
warnings while keeping semantics unchanged.
- Around line 143-148: Remove the unused variable and simplify the dynamic
import check by replacing the unused getattr pattern: import the module via
importlib.import_module("app.services.llm.jobs") and verify the presence of the
constant attribute "execute_chain_job" with hasattr(module, "execute_chain_job")
(instead of assigning func = getattr(...)), then keep the existing print
confirmation; ensure importlib, module, and the "execute_chain_job" symbol are
referenced as described.
- Line 14: The test function test_chain_job_creation lacks a return type
annotation; update its signature to include an explicit return type (e.g.,
change def test_chain_job_creation(): to def test_chain_job_creation() -> None:)
to comply with the project's type-hinting guidelines; ensure any parameters (if
added later) also include type hints and run the test suite/type checker after
the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2ff6a166-6d3b-4157-a6b4-342e4bcb0255
📒 Files selected for processing (7)
backend/app/api/docs/llm/speech_to_speech.mdbackend/app/api/main.pybackend/app/api/routes/llm_speech.pybackend/app/api/routes/llm_speech_examples.mdbackend/app/models/llm/request.pybackend/app/services/llm/chain/utils.pybackend/test_sts_debug.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/api/main.py
| ``` | ||
| POST /llm/sts | ||
| ``` | ||
|
|
||
| ## Flow | ||
|
|
||
| ``` | ||
| Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced code blocks.
The fenced code blocks at lines 7-9 and 13-15 are missing language specifiers per MD040. Use http for the endpoint block and text for the flow diagram.
📝 Proposed fix
## Endpoint
-```
+```http
POST /llm/stsFlow
- +text
Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ``` | |
| POST /llm/sts | |
| ``` | |
| ## Flow | |
| ``` | |
| Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output | |
| ``` | |
| ## Endpoint | |
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 7-7: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 13-13: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/docs/llm/speech_to_speech.md` around lines 7 - 15, The
Markdown fenced code blocks in speech_to_speech.md (the POST /llm/sts endpoint
block and the Flow diagram block) lack language identifiers; update the first
fenced block to use the "http" language identifier and the Flow fenced block to
use the "text" identifier so they pass MD040 linting (refer to the fenced blocks
containing "POST /llm/sts" and "Voice Input → STT (auto language) → RAG
(Knowledge Base) → TTS → Voice Output").
| ### Callback 1: STT Output (Intermediate) | ||
| ```json | ||
| { | ||
| "success": true, | ||
| "data": { | ||
| "block_index": 1, | ||
| "total_blocks": 3, | ||
| "response": { | ||
| "provider_response_id": "stt_xyz789", | ||
| "provider": "sarvamai-native", | ||
| "model": "saarika:v1", | ||
| "output": { | ||
| "type": "text", | ||
| "content": { | ||
| "value": "नमस्ते, मुझे अपने अकाउंट के बारे में जानकारी चाहिए" | ||
| } | ||
| } | ||
| }, | ||
| "usage": { | ||
| "input_tokens": 0, | ||
| "output_tokens": 12, | ||
| "total_tokens": 12 | ||
| } | ||
| }, | ||
| "metadata": { | ||
| "speech_to_speech": true, | ||
| "input_language": "hi-IN" | ||
| } | ||
| } | ||
| ``` |
There was a problem hiding this comment.
STT callback example shows incorrect model name.
Line 132 shows "model": "saarika:v1" but the documented default STT model is "saaras:v3" (line 55). Consider updating the example to use the correct model name for consistency.
📝 Proposed fix
"response": {
"provider_response_id": "stt_xyz789",
"provider": "sarvamai-native",
- "model": "saarika:v1",
+ "model": "saaras:v3",
"output": {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ### Callback 1: STT Output (Intermediate) | |
| ```json | |
| { | |
| "success": true, | |
| "data": { | |
| "block_index": 1, | |
| "total_blocks": 3, | |
| "response": { | |
| "provider_response_id": "stt_xyz789", | |
| "provider": "sarvamai-native", | |
| "model": "saarika:v1", | |
| "output": { | |
| "type": "text", | |
| "content": { | |
| "value": "नमस्ते, मुझे अपने अकाउंट के बारे में जानकारी चाहिए" | |
| } | |
| } | |
| }, | |
| "usage": { | |
| "input_tokens": 0, | |
| "output_tokens": 12, | |
| "total_tokens": 12 | |
| } | |
| }, | |
| "metadata": { | |
| "speech_to_speech": true, | |
| "input_language": "hi-IN" | |
| } | |
| } | |
| ``` | |
| ### Callback 1: STT Output (Intermediate) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/docs/llm/speech_to_speech.md` around lines 122 - 151, The STT
callback example under "Callback 1: STT Output (Intermediate)" uses an incorrect
model name; update the JSON "model" field in that example (the key "model"
inside the "response" object) from "saarika:v1" to the documented default
"saaras:v3" so it matches the default STT model elsewhere in the docs.
| ### Callback 3: TTS Output (Final) | ||
| ```json | ||
| { | ||
| "success": true, | ||
| "data": { | ||
| "response": { | ||
| "provider_response_id": "tts_def456", | ||
| "provider": "sarvamai-native", | ||
| "model": "bulbul:v1", | ||
| "output": { | ||
| "type": "audio", | ||
| "content": { | ||
| "format": "base64", | ||
| "value": "base64_encoded_audio_output", | ||
| "mime_type": "audio/ogg" | ||
| } | ||
| } | ||
| }, | ||
| "usage": { | ||
| "input_tokens": 15, | ||
| "output_tokens": 0, | ||
| "total_tokens": 15 | ||
| } | ||
| }, | ||
| "metadata": { | ||
| "speech_to_speech": true, | ||
| "output_language": "hi-IN" | ||
| } | ||
| } | ||
| ``` |
There was a problem hiding this comment.
TTS callback example shows incorrect model version.
Line 191 shows "model": "bulbul:v1" but the documented default TTS model is "bulbul:v3" (line 65). Update the example for consistency.
📝 Proposed fix
"response": {
"provider_response_id": "tts_def456",
"provider": "sarvamai-native",
- "model": "bulbul:v1",
+ "model": "bulbul:v3",
"output": {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ### Callback 3: TTS Output (Final) | |
| ```json | |
| { | |
| "success": true, | |
| "data": { | |
| "response": { | |
| "provider_response_id": "tts_def456", | |
| "provider": "sarvamai-native", | |
| "model": "bulbul:v1", | |
| "output": { | |
| "type": "audio", | |
| "content": { | |
| "format": "base64", | |
| "value": "base64_encoded_audio_output", | |
| "mime_type": "audio/ogg" | |
| } | |
| } | |
| }, | |
| "usage": { | |
| "input_tokens": 15, | |
| "output_tokens": 0, | |
| "total_tokens": 15 | |
| } | |
| }, | |
| "metadata": { | |
| "speech_to_speech": true, | |
| "output_language": "hi-IN" | |
| } | |
| } | |
| ``` | |
| ### Callback 3: TTS Output (Final) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/docs/llm/speech_to_speech.md` around lines 183 - 212, The TTS
callback example under "Callback 3: TTS Output (Final)" contains the wrong model
version; update the JSON field "model": "bulbul:v1" to "model": "bulbul:v3" so
it matches the documented default TTS model used elsewhere (e.g., the default
referenced at "bulbul:v3"); ensure the change is made only in the example JSON
"response" object where "provider": "sarvamai-native" and
"provider_response_id": "tts_def456".
| ```json | ||
| { | ||
| "success": true, | ||
| "data": { | ||
| "block_index": 1, | ||
| "total_blocks": 3, | ||
| "response": { | ||
| "provider_response_id": "stt_xyz789", | ||
| "provider": "google-native", | ||
| "model": "gemini-2.5-pro", | ||
| "output": { | ||
| "type": "text", | ||
| "content": { | ||
| "value": "मेरा अकाउंट बैलेंस क्या है?" | ||
| } | ||
| } | ||
| }, | ||
| "usage": { | ||
| "input_tokens": 0, | ||
| "output_tokens": 8, | ||
| "total_tokens": 8 | ||
| } | ||
| }, | ||
| "metadata": { | ||
| "speech_to_speech": true, | ||
| "input_language": "hi-IN", | ||
| "output_language": "en-IN", | ||
| "stt_model": "gemini-2.5-pro", | ||
| "llm_model": "gpt-4o", | ||
| "tts_model": "bulbul-v3", | ||
| "user_id": "user_123", | ||
| "session_id": "session_456", | ||
| "source": "whatsapp" | ||
| } | ||
| } | ||
| ``` |
There was a problem hiding this comment.
STT callback example shows different provider than documented defaults.
The example shows "provider": "google-native" and "model": "gemini-2.5-pro" at lines 94-95, but the "Defaults Used" section at lines 31-35 states the default STT is "Sarvam Saaras V3". Consider updating the example to use "provider": "sarvamai-native" and "model": "saaras:v3" to match the documented defaults, or clarify that this example uses non-default model selection.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_speech_examples.md` around lines 86 - 121, Update
the STT callback JSON example so its "provider" and "model" match the documented
defaults: replace "provider": "google-native" and "model": "gemini-2.5-pro" in
the response block with "provider": "sarvamai-native" and "model": "saaras:v3"
(or alternatively add a short note in the example clarifying it intentionally
uses non-default selection); modify only the values for the "provider" and
"model" fields inside the "response" object to keep the example consistent with
the "Defaults Used" section.
| ```json | ||
| { | ||
| "stt_model": "saaras:v3", | ||
| "llm_model": "gpt-4o", | ||
| "tts_model": "bulbul-v3" | ||
| } | ||
| ``` |
There was a problem hiding this comment.
Inconsistent TTS model identifier format.
Line 340 shows "tts_model": "bulbul-v3" with a hyphen, but the TTSModel enum in request.py defines it as "bulbul:v3" with a colon. This inconsistency could cause errors if users copy the example directly.
📝 Proposed fix
{
"stt_model": "saaras:v3",
"llm_model": "gpt-4o",
- "tts_model": "bulbul-v3"
+ "tts_model": "bulbul:v3"
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_speech_examples.md` around lines 336 - 342, The
README example uses "tts_model": "bulbul-v3" which conflicts with the TTSModel
enum in request.py that defines "bulbul:v3"; update the example in
llm_speech_examples.md (the JSON block showing stt_model/llm_model/tts_model) to
use the colon format "bulbul:v3" so it matches the TTSModel enum, or
alternatively change the TTSModel entry in request.py to "bulbul-v3" if you
prefer hyphen formatting—ensure both the example and the TTSModel constant
(named TTSModel) use the exact same string.
| if request.input_language and request.input_language != "auto": | ||
| if request.input_language not in LANGUAGE_CODES: | ||
| from fastapi import HTTPException | ||
|
|
||
| raise HTTPException( | ||
| status_code=400, | ||
| detail=f"Unsupported input language: {request.input_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}", | ||
| ) | ||
|
|
||
| if request.output_language and request.output_language not in LANGUAGE_CODES: | ||
| from fastapi import HTTPException | ||
|
|
||
| raise HTTPException( | ||
| status_code=400, | ||
| detail=f"Unsupported output language: {request.output_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}", | ||
| ) |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Move HTTPException import to module level.
HTTPException is imported twice inside the function body (lines 67 and 75). Move this import to the top of the file with other FastAPI imports for better performance and maintainability.
📝 Proposed fix
-from fastapi import APIRouter, Depends
+from fastapi import APIRouter, Depends, HTTPExceptionThen remove the inline imports at lines 67-68 and 75-76:
if request.input_language and request.input_language != "auto":
if request.input_language not in LANGUAGE_CODES:
- from fastapi import HTTPException
-
raise HTTPException(
status_code=400,
detail=f"Unsupported input language: {request.input_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
)
if request.output_language and request.output_language not in LANGUAGE_CODES:
- from fastapi import HTTPException
-
raise HTTPException(
status_code=400,
detail=f"Unsupported output language: {request.output_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_speech.py` around lines 65 - 80, The two inline
imports of HTTPException inside the validation block should be moved to the
module level with the other FastAPI imports: add "from fastapi import
HTTPException" at the top of the file and remove the inline imports inside the
conditional checks that validate request.input_language and
request.output_language against LANGUAGE_CODES; ensure the behavior of the
blocks using request.input_language, request.output_language, and LANGUAGE_CODES
remains unchanged.
| # Add metadata to track STS-specific info | ||
| metadata = request.request_metadata or {} | ||
| metadata.update( | ||
| { | ||
| "speech_to_speech": True, | ||
| "input_language": input_lang_code, | ||
| "output_language": output_lang_code, | ||
| "stt_model": request.stt_model.value, | ||
| "llm_model": request.llm_model.value, | ||
| "tts_model": request.tts_model.value, | ||
| } | ||
| ) |
There was a problem hiding this comment.
Avoid mutating the request's metadata dictionary in-place.
Line 105 assigns request.request_metadata directly to metadata, then line 106 mutates it with .update(). If request.request_metadata is not None, this modifies the original dictionary in-place, which could cause unintended side effects.
📝 Proposed fix
# Add metadata to track STS-specific info
- metadata = request.request_metadata or {}
+ metadata = dict(request.request_metadata) if request.request_metadata else {}
metadata.update(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/llm_speech.py` around lines 104 - 115, The code
mutates request.request_metadata in-place by assigning metadata =
request.request_metadata and then calling metadata.update(...); instead create a
new dict (e.g., metadata = dict(request.request_metadata or {} ) or use
request.request_metadata.copy()) so you can safely update it with the STS keys
("speech_to_speech", "input_language", "output_language", "stt_model",
"llm_model", "tts_model") without altering the original
request.request_metadata; update the metadata variable and leave
request.request_metadata immutable.
|
|
||
|
|
||
| @router.post( | ||
| "/llm/sts", |
There was a problem hiding this comment.
convert into llm/chain/sts
|
|
||
| **Note:** Sarvam STT uses automatic language detection. No need to specify input language. | ||
|
|
||
| ### LLM (RAG) |
There was a problem hiding this comment.
we support more than just these two models for RAG
| EOF | ||
| ``` | ||
|
|
||
| **Note:** `stt_model`, `llm_model`, and `tts_model` are optional and will use defaults if not specified. |
There was a problem hiding this comment.
add "specifying" before "stt_model",etc , etc
| """Supported LLM models for RAG in speech-to-speech.""" | ||
|
|
||
| GPT4O = "gpt-4o" | ||
| GPT4O_MINI = "gpt-4o-mini" |
There was a problem hiding this comment.
why are we supporting oonly these two models
| print("\n" + "=" * 80) | ||
| print("STEP 1: Creating test chain request") | ||
| print("=" * 80) |
| """Debug script for STS endpoint and chain job execution.""" | ||
|
|
||
| import logging | ||
| import sys | ||
| from sqlmodel import Session | ||
|
|
||
| # Setup logging | ||
| logging.basicConfig( | ||
| level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s" |
There was a problem hiding this comment.
do we need this file?
| api_keys, | ||
| assistants, | ||
| collections, | ||
| config, |
There was a problem hiding this comment.
Imports should be sorted alphabetically
|
|
||
| import logging | ||
|
|
||
| from fastapi import APIRouter, Depends, HTTPException |
There was a problem hiding this comment.
HTTPException is not needed
| response_model=APIResponse[Message], | ||
| dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], | ||
| ) | ||
| def speech_to_speech( |
There was a problem hiding this comment.
type annotation missing
| ) | ||
| def speech_to_speech( | ||
| _current_user: AuthContextDep, | ||
| _session: SessionDep, |
There was a problem hiding this comment.
should _session start with underscore if later we pass to different function in line 129
| ) | ||
|
|
||
| # Optional language config (BCP-47 codes) | ||
| input_language: str | None = Field( |
There was a problem hiding this comment.
should this be str | None or str as default is "auto"
| and request.input_language not in SUPPORTED_LANGUAGE_CODES | ||
| ): | ||
| return APIResponse.failure_response( | ||
| error=f"Unsupported input language code: {request.input_language}. Supported: {', '.join(sorted(SUPPORTED_LANGUAGE_CODES))}", |
There was a problem hiding this comment.
do we need it sorted ? why not keep the sorted already
| config=LLMCallConfig( | ||
| blob=ConfigBlob( | ||
| completion=KaapiCompletionConfig( | ||
| provider="openai", |
There was a problem hiding this comment.
this is hardcoded here, but we need to write code that is extensible for other provider and models and abstract it
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/services/llm/providers/sai.py (1)
191-213:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winCritical:
_execute_ttscalls the Sarvam TTS API twice — first call is wasted andspeech_sample_rate=16000is never appliedThe method makes two
text_to_speech.convertcalls inside the sametryblock:
- Line 191–198: First call with
speech_sample_rate=16000(and explicitNoneforspeaker/output_audio_codec). Result assigned tosarvam_response.- Line 213: Second call via the
tts_kwargsdict. Result overwritessarvam_response.The first call's result is silently discarded. Downstream code uses the second call's response, which:
- Does not include
speech_sample_rate=16000(the SDK default is 22050 Hz, so audio will be at 22050 Hz instead of the intended 16000 Hz).- Doubles API cost and round-trip latency on every TTS execution.
- Passes explicit
Nonefor optional params in the first call, which may behave differently from omitting them entirely on some SDK/API versions.The existing tests in
test_sai.pydo not assert onspeech_sample_rate, so this regression is invisible to the test suite.🐛 Proposed fix — remove the first call and fold `speech_sample_rate` into `tts_kwargs`
try: - # Call SarvamAI TTS with all mapped parameters - sarvam_response = self.client.text_to_speech.convert( - text=parsed_text, - target_language_code=target_language_code, - model=model, - speaker=speaker, - speech_sample_rate=16000, - output_audio_codec=output_audio_codec, - ) - # Build kwargs for API call, only including non-None parameters tts_kwargs = { "text": parsed_text, "target_language_code": target_language_code, "model": model, + "speech_sample_rate": 16000, } if speaker: tts_kwargs["speaker"] = speaker if output_audio_codec: tts_kwargs["output_audio_codec"] = output_audio_codec - # Call SarvamAI TTS with mapped parameters sarvam_response = self.client.text_to_speech.convert(**tts_kwargs)Also update
test_sai.pyto assertcall_args.kwargs["speech_sample_rate"] == 16000so this doesn't regress again.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/services/llm/providers/sai.py` around lines 191 - 213, In _execute_tts remove the initial direct call to self.client.text_to_speech.convert (the one assigning sarvam_response before tts_kwargs) so the API is not invoked twice; instead add "speech_sample_rate": 16000 into the tts_kwargs dict (alongside "text", "target_language_code", "model" and optional "speaker"/"output_audio_codec") and perform a single self.client.text_to_speech.convert(**tts_kwargs) call, ensuring sarvam_response comes from that call; also update the test_sai.py test to assert call_args.kwargs["speech_sample_rate"] == 16000 to prevent regression.
♻️ Duplicate comments (1)
backend/app/api/main.py (1)
10-16:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDuplicate
llm_stsimport still present.
llm_stsappears at both Line 10 and Line 16 in the same import tuple. One of the two must be removed to fix the Ruff F811 lint error.🐛 Proposed fix
from app.api.routes import ( api_keys, assistants, collections, config, doc_transformation_job, documents, - llm_sts, auth, login, languages, llm, llm_chain, llm_sts, organization, ... )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/api/main.py` around lines 10 - 16, The import tuple contains a duplicated symbol llm_sts; remove the second occurrence of llm_sts from the import list so the tuple only imports llm_sts once (ensure the import list remains properly comma-separated and no other duplicate names remain).
🧹 Nitpick comments (1)
backend/app/models/llm/request.py (1)
22-41: ⚡ Quick winUse
StrEnuminstead of(str, Enum)for Python 3.11+ idiomAll three new enums inherit
(str, Enum), which Ruff flags (UP042). Since the codebase targets Python 3.11+, preferStrEnum:♻️ Proposed refactor
-from enum import Enum +from enum import Enum, StrEnum -class STTModel(str, Enum): +class STTModel(StrEnum): ... -class TTSModel(str, Enum): +class TTSModel(StrEnum): ... -class LLMModel(str, Enum): +class LLMModel(StrEnum): ...🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/llm/request.py` around lines 22 - 41, Replace the three enums STTModel, TTSModel, and LLMModel to inherit from enum.StrEnum instead of (str, Enum): add the import for StrEnum from enum, update class definitions to "class STTModel(StrEnum):", "class TTSModel(StrEnum):", and "class LLMModel(StrEnum):", and run the linter to ensure Ruff UP042 is resolved.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/models/llm/request.py`:
- Around line 902-918: Add a return type hint to validate_languages (->
"SpeechToSpeechRequest") and prevent "auto" from passing through for
output_language by adding the same guard used for input_language: only
normalize/output when output_language is set and not equal to "auto" (use the
function validate_languages and the SpeechToSpeechRequest type name as
references), then return self typed as "SpeechToSpeechRequest".
---
Outside diff comments:
In `@backend/app/services/llm/providers/sai.py`:
- Around line 191-213: In _execute_tts remove the initial direct call to
self.client.text_to_speech.convert (the one assigning sarvam_response before
tts_kwargs) so the API is not invoked twice; instead add "speech_sample_rate":
16000 into the tts_kwargs dict (alongside "text", "target_language_code",
"model" and optional "speaker"/"output_audio_codec") and perform a single
self.client.text_to_speech.convert(**tts_kwargs) call, ensuring sarvam_response
comes from that call; also update the test_sai.py test to assert
call_args.kwargs["speech_sample_rate"] == 16000 to prevent regression.
---
Duplicate comments:
In `@backend/app/api/main.py`:
- Around line 10-16: The import tuple contains a duplicated symbol llm_sts;
remove the second occurrence of llm_sts from the import list so the tuple only
imports llm_sts once (ensure the import list remains properly comma-separated
and no other duplicate names remain).
---
Nitpick comments:
In `@backend/app/models/llm/request.py`:
- Around line 22-41: Replace the three enums STTModel, TTSModel, and LLMModel to
inherit from enum.StrEnum instead of (str, Enum): add the import for StrEnum
from enum, update class definitions to "class STTModel(StrEnum):", "class
TTSModel(StrEnum):", and "class LLMModel(StrEnum):", and run the linter to
ensure Ruff UP042 is resolved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d22f2a7e-e65c-4c71-bda0-eb2859cd76b1
📒 Files selected for processing (4)
backend/app/api/main.pybackend/app/models/llm/request.pybackend/app/services/llm/jobs.pybackend/app/services/llm/providers/sai.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/services/llm/jobs.py
| @model_validator(mode="after") | ||
| def validate_languages(self): | ||
| """Normalize BCP-47 language codes to standard format (e.g., 'hi-in' -> 'hi-IN').""" | ||
| # Normalize input_language | ||
| if self.input_language and self.input_language != "auto": | ||
| # Normalize BCP-47: lowercase language, uppercase region (e.g., "hi-IN") | ||
| parts = self.input_language.split("-") | ||
| if len(parts) == 2: | ||
| self.input_language = f"{parts[0].lower()}-{parts[1].upper()}" | ||
|
|
||
| # Normalize output_language | ||
| if self.output_language: | ||
| parts = self.output_language.split("-") | ||
| if len(parts) == 2: | ||
| self.output_language = f"{parts[0].lower()}-{parts[1].upper()}" | ||
|
|
||
| return self |
There was a problem hiding this comment.
Missing return type hint and missing "auto" guard on output_language
Two issues in validate_languages:
-
Missing return type — coding guidelines require type hints on all function parameters and return values. Add
-> "SpeechToSpeechRequest":. -
output_language="auto"passes validation silently. The field description explicitly states "except 'auto'" but there is no enforcement."auto"has only one--split part so the normalization block is skipped, and"auto"propagates downstream as the TTStarget_language_code, which the Sarvam API will reject at runtime.
🛡️ Proposed fix
- def validate_languages(self):
+ def validate_languages(self) -> "SpeechToSpeechRequest":
"""Normalize BCP-47 language codes to standard format (e.g., 'hi-in' -> 'hi-IN')."""
# Normalize input_language
if self.input_language and self.input_language != "auto":
parts = self.input_language.split("-")
if len(parts) == 2:
self.input_language = f"{parts[0].lower()}-{parts[1].upper()}"
# Normalize output_language
if self.output_language:
+ if self.output_language == "auto":
+ raise ValueError(
+ "'auto' is not a valid output_language. Please specify a BCP-47 code (e.g., 'hi-IN')."
+ )
parts = self.output_language.split("-")
if len(parts) == 2:
self.output_language = f"{parts[0].lower()}-{parts[1].upper()}"
return selfAs per coding guidelines: "Always add type hints to all function parameters and return values in Python code."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/models/llm/request.py` around lines 902 - 918, Add a return type
hint to validate_languages (-> "SpeechToSpeechRequest") and prevent "auto" from
passing through for output_language by adding the same guard used for
input_language: only normalize/output when output_language is set and not equal
to "auto" (use the function validate_languages and the SpeechToSpeechRequest
type name as references), then return self typed as "SpeechToSpeechRequest".
| ) | ||
|
|
||
| try: | ||
| # Call SarvamAI TTS with all mapped parameters |
| """Debug script for STS endpoint and chain job execution.""" | ||
|
|
||
| import logging | ||
| import sys | ||
| from sqlmodel import Session | ||
|
|
||
| # Setup logging | ||
| logging.basicConfig( | ||
| level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s" |
| class TestSpeechToSpeechEndpoint: | ||
| """The three real shapes users send + the common error paths.""" | ||
|
|
||
| def test_minimal_request_uses_all_defaults( |
There was a problem hiding this comment.
add test covers the language-detection flow and not just auto detect, so maybe with language hindi
| # ---------- Spec → LLMCallConfig ---------- | ||
| # The chain executor (services/llm/jobs.py::execute_llm_call) already branches | ||
| # on LLMCallConfig.is_stored_config and resolves stored configs via | ||
| # resolve_config_blob. So we just construct the right LLMCallConfig shape. |
There was a problem hiding this comment.
remove these comments if it isnt value add
| ) | ||
|
|
||
|
|
||
| def _resolve_languages(request: SpeechToSpeechRequest) -> tuple[str, str]: |
There was a problem hiding this comment.
Can you test this scenario please to be double sure
Right now, if the caller leaves input_language="auto" (the default), we expect TTS to reply in the language detected by STT. The plumbing for that already exists (sai.py → chain.py → ChainContext.detected_language), and jobs.py will swap in the detected language for TTS — but only when the param is exactly "{{detected}}".
The issue is that the new route never sets that marker. _resolve_languages currently returns ("auto", "auto"), so TTS gets called with target_language_code="auto". Sarvam rejects that, and the chain fails.
The current tests don’t catch this because they all mock start_chain_job and never exercise the mapper/provider path.
Simplest fix: in _resolve_languages, return "{{detected}}" instead of "auto" for the output language when the input is auto and the output isn’t explicitly set. That lets the existing substitution logic in jobs.py work as intended.
Please also add a test that runs the chain with a mocked STT response like language_code="hi-IN" and verifies the Sarvam TTS client is called with target_language_code="hi-IN".
| print("\n" + "=" * 80) | ||
| print("STEP 1: Creating test chain request") | ||
| print("=" * 80) |
| # ---------- Per-block resolution ---------- | ||
|
|
||
|
|
||
| def _resolve_stt_block( |
There was a problem hiding this comment.
new helper and a new route that both build a ChainBlock around a KaapiCompletionConfigare creating the same object shape, but live in different files and don’t share the implementation. chain/utils.py:81-91 and llm_sts.py:146-154
Let’s move this into a shared utility and reuse it instead of duplicating the setup.
| f"[execute_llm_call] Using detected language for TTS: {detected_language} | job_id={job_id}" | ||
| ) | ||
| else: | ||
| # Fallback to English if no language was detected |
There was a problem hiding this comment.
for fallback can you define a constant like DEFAULT_TTS_FALLBACK_LANG
| # then merges | ||
| def _merge_stt(user: STTLLMParams | None, input_lang: str) -> STTLLMParams: | ||
| base = STTLLMParams(model=DEFAULT_STT_MODEL, input_language=input_lang) | ||
| if user is None: |
There was a problem hiding this comment.
why using user variable name in correspondence to STTLLMParams ?
|
|
||
| # Provider hints. Optional — KaapiCompletionConfig auto-defaults to | ||
| # "google" for stt/tts when omitted. | ||
| stt_provider: Literal["google", "sarvamai", "elevenlabs"] | None = None |
There was a problem hiding this comment.
Consider defaulting to "google" instead of None, since KaapiCompletionConfig already falls back to google for STT/TTS, making it explicit here removes the implicit fallback and keeps the contract clear at the schema level.. this same goes for rag provider and tts provider as well.
| @model_validator(mode="after") | ||
| def validate_languages(self): | ||
| """Normalize BCP-47 language codes to standard format (e.g., 'hi-in' -> 'hi-IN').""" | ||
| # Normalize input_language |
There was a problem hiding this comment.
are we validate against a SUPPORTED_LANGUAGE_CODES allowlist here instead of just normalizing format?
since this function is more of like normalization function instead of having validation .. if u already doing validation somewhere then rename this function to nromalize_langauge

Summary
Target issue is #642
Explain the motivation for making this change. What existing problem does the pull request solve?
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Notes
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit
New Features
Documentation
Tests