TextSQL converts natural language questions into SQL queries using a LangGraph-based iterative planning workflow. It breaks down complex questions into adaptive steps — retrieving relevant schema via TableRAG, generating and executing SQL (including intermediate queries when needed), and automatically recovering from errors — to produce accurate answers with key insights.
```
┌─────────────────────────────────────────────────────────────────┐
│                              init                               │
│   Load config, TableRAG index, ERD graph, business knowledge    │
└───────────────────────────────┬─────────────────────────────────┘
                                │
        ┌───────────────┬───────┼───────────┬───────────┐
        │               │       │           │           │
        ▼               ▼       ▼           ▼           │
   ┌─────────┐    ┌─────────┐ ┌─────────┐ ┌───────────┐ │
   │TableRAG │    │Thinking │ │ SQLPair │ │Knowledge  │ │
   │- Column │    │ Pattern │ │- SQL    │ │- GraphRAG │ │
   │- Cell   │    │- FAISS  │ │  pairs  │ │- pgvector │ │
   │- ERD    │    │         │ │         │ │           │ │
   └────┬────┘    └────┬────┘ └────┬────┘ └─────┬─────┘ │
        │              │           │            │       │
        └──────────────┴───────────┴────────────┘       │
                                │ (fan-in)              │
                                ▼                       │
┌─────────────────────────────────────────────────────────────────┐
│                             expand                              │
│  Combine all retrieve results, multi-path candidate generation  │
└───────────────────────────────┬─────────────────────────────────┘
                                │
                                ▼
generate → execute ─┬─ success → analyze → END
                    │
                    └─ failure → plan (error recovery)
                                  │
                                  ├─ re-retrieve
                                  ├─ re-generate
                                  └─ end
```
| Node | Description |
|---|---|
| init | Load database config, build/load all RAG indexes, preload ERD cache |
| tablerag | FAISS + BM25 hybrid retrieval for columns, cells, and ERD relationships |
| sqlpair | Retrieve similar (question, SQL) pairs from history for few-shot reference |
| knowledge | Query business rules and domain knowledge via GraphRAG / pgvector |
| pattern | Match thinking patterns (query decomposition templates) via local FAISS |
| expand | Merge all retrieval results, generate multi-path SQL candidates |
| generate | Generate SQL using combined context from all retrieval sources |
| execute | Run SQL against database; classify errors for recovery |
| plan | Error recovery router — re-retrieve, re-generate, or abort |
| analyze | Produce natural language answer with key insights |
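The shared state these nodes read and write can be sketched as a TypedDict. The field names below are illustrative; the project's actual state schema may differ:

```python
from typing import Optional, TypedDict

class TextSQLState(TypedDict, total=False):
    """Illustrative workflow state; real keys may differ from the project's."""
    user_input: str                   # natural language question
    schema_context: str               # columns/cells/ERD from tablerag
    few_shot_pairs: list[tuple]       # similar (question, SQL) pairs
    knowledge: str                    # business rules from the knowledge node
    sql: Optional[str]                # SQL produced by generate
    execution_result: Optional[list]  # rows returned by execute
    error: Optional[str]              # error class consumed by the plan router
    answer: Optional[str]             # final NL answer from analyze

# Nodes receive the state, update a few keys, and return it
state: TextSQLState = {"user_input": "How many users are there?", "error": None}
```

Because `total=False`, each node only needs to populate the keys it owns.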
LangGraph Workflow — The entire pipeline is modeled as a compiled state graph with conditional edges. The plan node acts as a central router, enabling iterative loops (retrieve → generate → execute → replan) that adapt to query complexity instead of following a fixed sequence.
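LangGraph conditional edges dispatch on the return value of a plain routing function. A minimal sketch of the plan router's decision logic (the error class names here are illustrative, not the project's actual taxonomy):

```python
def route_after_execute(state: dict) -> str:
    """Route from plan after execute; error classes are illustrative."""
    error = state.get("error")
    if error is None:
        return "analyze"          # success -> summarize the result
    if error in ("missing_table", "unknown_column"):
        return "re_retrieve"      # schema context was insufficient
    if error in ("syntax_error", "type_mismatch"):
        return "re_generate"      # the SQL itself was wrong
    return "end"                  # unrecoverable -> abort
```

In LangGraph this function would be registered with `add_conditional_edges`, mapping each returned label to the next node in the graph.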
TableRAG — Rather than dumping full database schemas into the LLM context, TableRAG builds FAISS + BM25 hybrid indexes over column metadata and sampled cell values. At query time it retrieves only the relevant columns, example values, and ERD relationships, dramatically reducing token usage while improving SQL accuracy on large schemas.
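One common way to fuse dense (FAISS) and sparse (BM25) rankings is reciprocal rank fusion; whether TableRAG uses RRF or another fusion scheme is not stated here, so treat this as an illustrative sketch:

```python
def reciprocal_rank_fusion(dense_ranked, sparse_ranked, k=60):
    """Fuse two ranked ID lists (dense = FAISS, sparse = BM25) via RRF."""
    scores = {}
    for ranked in (dense_ranked, sparse_ranked):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Higher fused score first; items found by both retrievers float up
    return sorted(scores, key=scores.get, reverse=True)

dense = ["orders.amount", "orders.id", "users.name"]     # semantic matches
sparse = ["users.name", "orders.amount", "users.email"]  # keyword matches
fused = reciprocal_rank_fusion(dense, sparse)
# Columns present in both rankings ("orders.amount", "users.name") come first
```

The constant `k` damps the influence of top ranks so that agreement between retrievers, not any single high rank, dominates the fused order.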
GraphRAG (LightRAG) — Provides two knowledge layers: (1) ERD GraphRAG constructs an entity-relationship graph from database schema — extracting entities via LLM, discovering relations through FK metadata, naming conventions, and LLM inference, then validating with SQL joins — for accurate multi-table JOIN generation; (2) Knowledge GraphRAG indexes business rules and domain knowledge for contextual SQL generation.
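The naming-convention pass of ERD relation discovery can be sketched as a simple heuristic; this is illustrative only, since the real pipeline also reads FK metadata, asks the LLM, and validates candidate joins with SQL:

```python
def infer_fk_by_naming(tables):
    """Guess FK edges from `<table>_id` column names (naming-convention pass)."""
    edges = []
    for table, columns in tables.items():
        for col in columns:
            if col.endswith("_id"):
                ref = col[:-3]                      # "user_id" -> "user"
                for candidate in (ref, ref + "s"):  # try singular and plural
                    if candidate in tables and candidate != table:
                        edges.append((table, col, candidate, "id"))
                        break
    return edges

schema = {"users": ["id", "name"], "orders": ["id", "user_id", "amount"]}
# infer_fk_by_naming(schema) -> [("orders", "user_id", "users", "id")]
```

Edges discovered this way are only candidates; validating them with an actual join is what keeps false positives out of the ERD graph.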
SQLPair & Iterative Execution — Retrieves similar (question, SQL) pairs from query history as few-shot examples. For complex questions, the planner can issue intermediate SQL queries to discover enum values or verify table relationships. Each (SQL, result) pair feeds back into the plan node for iterative refinement.
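Assembling the retrieved pairs into a few-shot prompt might look like the following; the template is illustrative, not the project's actual prompt:

```python
def build_fewshot_prompt(question, pairs, schema_context):
    """Build a SQL-generation prompt from retrieved (question, SQL) pairs.

    Illustrative template; the project's real prompt format may differ.
    """
    examples = "\n\n".join(f"Q: {q}\nSQL: {sql}" for q, sql in pairs)
    return (
        f"Schema:\n{schema_context}\n\n"
        f"Examples:\n{examples}\n\n"
        f"Q: {question}\nSQL:"
    )

prompt = build_fewshot_prompt(
    "How many users are there?",
    [("How many orders were placed?", "SELECT COUNT(*) FROM orders")],
    "users(id, name)",
)
```

Ending the prompt with a bare `SQL:` cue steers the model to complete with a query rather than prose.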
```bash
source .venv/bin/activate
uv pip install langgraph langchain-openai langchain-community pymysql pydantic python-dotenv pandas faiss-cpu
```

```bash
cp .env.example .env                                  # Set OPENAI_API_KEY
cp config/database.example.yaml config/database.yaml  # Set MySQL connection
```

```bash
python cli/query.py "How many users are in the database?"
python cli/query.py "question" --output sql   # SQL only, no execution
python cli/chat.py                            # Interactive chatbot
```

Evaluated on the BIRD dev set with execution accuracy (EX).
| Model | EX (%) |
|---|---|
| gpt-4o | 79 |
The retrieve node supports multiple strategies for fetching database context:
| Mode | Description | Requires Index | Token Cost |
|---|---|---|---|
| `tableschema` | Load all table schemas into context | No | High |
| `tablerag` | FAISS + BM25 hybrid vector retrieval | Yes | Low |
| `tableragplus` | TableRAG + LLM column expansion | Yes | Low |
Use `tableragplus` for best accuracy. Build the index first:
```bash
python cli/tablerag_index.py
```

```
[init]     Load config, TableRAG index, business knowledge  0.8s
[plan]     Decide: retrieve schema for "member count"       1.2s
[retrieve] Found 2 tables, 3 columns via TableRAG           0.3s
[generate] SELECT COUNT(*) FROM member_silver               2.1s
[execute]  Success: 1 row returned                          0.03s
[analyze]  The database contains 107,105 members in total   1.5s

Total: 5.93s | Tokens: 8,420
```
```python
from langchain_openai import ChatOpenAI
from src.workflow import create_workflow_v3

llm = ChatOpenAI(model="gpt-4o", temperature=0)
app = create_workflow_v3(llm=llm)
result = app.invoke({'user_input': 'How many users are there?'})
print(result['analyze'].answer)
```

| Tool | Description |
|---|---|
| `cli/query.py` | Single-question queries with workflow and output mode options |
| `cli/chat.py` | Interactive chatbot with conversation history |
| `cli/evaluate.py` | Batch evaluation against test datasets |
| `cli/tablerag_index.py` | Build TableRAG vector indexes for column/cell retrieval |
| `cli/tablerag_retrieve.py` | Query TableRAG indexes |
| `cli/graphrag_index.py` | Build ERD knowledge graphs from database schema |
| `cli/graphrag_retrieve.py` | Query ERD relationships and entities |
| `cli/generate_captions.py` | Generate and cache table business descriptions |
```bash
pytest tests/           # All tests
pytest tests/unit/ -v   # Unit tests only
pytest tests/ -x        # Stop on first failure
```

MIT