feat(ai-proxy): add Snowflake integration with PAT authentication #1573
christophebrun-forest wants to merge 1 commit into main
Exposes 3 MCP tools backed by Snowflake REST API v2 (Cortex Search, Cortex Analyst, read-only SQL execution), authenticated via Programmatic Access Tokens. The execute-query tool enforces a defense-in-depth read-only SQL guard on top of Snowflake role privileges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 new issue
Coverage Impact ⬆️ Merging this pull request will increase total coverage on Modified Files with Diff Coverage (6).
import { AIBadRequestError, AIToolUnprocessableError, McpConnectionError } from '../../errors';

const READ_ONLY_LEADING_KEYWORD_RE = /^\s*(select|show|describe|desc|explain)\b/i;
🟡 Medium snowflake/utils.ts:5
The tool description in `execute-query.ts` states that `WITH` statements are allowed for CTE queries, but `assertReadOnlySql` rejects them. `READ_ONLY_LEADING_KEYWORD_RE` (line 5) does not include `with` as a valid leading keyword, and `FORBIDDEN_KEYWORD_RE` (line 7) explicitly lists `with` as forbidden. As a result, valid read-only CTE queries such as `WITH cte AS (SELECT 1) SELECT * FROM cte` are rejected with "Only read-only statements (SELECT, SHOW, DESCRIBE, EXPLAIN) are allowed." even though they are documented as supported. Note that the suggestion below only extends the leading-keyword pattern; `with` would also need to be removed from `FORBIDDEN_KEYWORD_RE` for such queries to pass.
-const READ_ONLY_LEADING_KEYWORD_RE = /^\s*(select|show|describe|desc|explain)\b/i;
+const READ_ONLY_LEADING_KEYWORD_RE = /^\s*(select|show|describe|desc|explain|with)\b/i;
Evidence trail:
- packages/ai-proxy/src/integrations/snowflake/tools/execute-query.ts lines 17-18: description says `WITH` is allowed.
- packages/ai-proxy/src/integrations/snowflake/utils.ts line 5: `READ_ONLY_LEADING_KEYWORD_RE = /^\s*(select|show|describe|desc|explain)\b/i`, no `with`.
- utils.ts line 7: `FORBIDDEN_KEYWORD_RE` includes `with` in the alternation.
- utils.ts lines 55-59: `assertReadOnlySql` checks against `READ_ONLY_LEADING_KEYWORD_RE` and throws if no match.
- utils.ts lines 61-66: then checks `FORBIDDEN_KEYWORD_RE` and throws if a match is found.
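To make the two-part nature of the fix concrete, here is a minimal sketch of a guard that accepts CTE queries. The name `isReadOnly` and the exact list of mutating keywords are assumptions for illustration; the real `FORBIDDEN_KEYWORD_RE` in `utils.ts` is not shown in this conversation and may contain other entries.

```typescript
// Leading keyword pattern, extended with `with` as suggested in the review.
const READ_ONLY_LEADING_KEYWORD_RE = /^\s*(select|show|describe|desc|explain|with)\b/i;

// Assumed subset of mutating keywords. `with` is deliberately absent here,
// otherwise every CTE would still be rejected by the second check.
const FORBIDDEN_KEYWORD_RE =
  /\b(insert|update|delete|merge|drop|create|alter|truncate|grant|revoke|call|copy|put)\b/i;

// Hypothetical helper: expects SQL that has already been normalized
// (comments removed, string literals neutralized).
function isReadOnly(normalizedSql: string): boolean {
  return (
    READ_ONLY_LEADING_KEYWORD_RE.test(normalizedSql) &&
    !FORBIDDEN_KEYWORD_RE.test(normalizedSql)
  );
}
```

Keeping the forbidden-keyword check after the leading-keyword check matters once `WITH` is allowed, because a statement can start with a CTE and still end in a mutating clause.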
function normalizeSql(statement: string): string {
  return statement
    .replace(/\/\*[\s\S]*?\*\//g, '')
    .replace(/--[^\n]*/g, '')
    .replace(/'(?:''|[^'])*'/g, "''")
    .replace(/"(?:""|[^"])*"/g, '""')
    .trim();
}
🟠 High snowflake/utils.ts:31
`normalizeSql` strips comments before it neutralizes string literals, so a comment marker inside a string literal is treated as the start of a real comment, and an attacker can hide forbidden keywords behind it. For example, `SELECT '--' DELETE FROM users` becomes `SELECT '` once the line-comment regex removes `--' DELETE FROM users`, so the keyword checks never see `DELETE` while the original query is what gets sent for execution. Consider normalizing string literals first (replacing them with placeholders before any comment removal), or using a proper SQL parser instead of regex.
-function normalizeSql(statement: string): string {
- return statement
- .replace(/\/\*[\s\S]*?\*\//g, '')
- .replace(/--[^\n]*/g, '')
- .replace(/'(?:''|[^'])*'/g, "''")
- .replace(/"(?:""|[^"])*"/g, '""')
- .trim();
-}
Evidence trail:
- packages/ai-proxy/src/integrations/snowflake/utils.ts lines 31-37 (`normalizeSql` function, REVIEWED_COMMIT): comment removal regexes at lines 33-34 run before string literal replacement at lines 35-36.
- Lines 5-8: `READ_ONLY_LEADING_KEYWORD_RE` and `FORBIDDEN_KEYWORD_RE` patterns.
- Lines 39-62: `assertReadOnlySql` uses `normalizeSql` output for security validation.
- Manual trace of `SELECT '--' DELETE FROM users` through the four regex steps confirms it normalizes to `SELECT '`, bypassing all keyword checks.
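One way to avoid the ordering problem, short of a full parser, is to match strings and comments in a single pass so that whichever token starts first wins. The sketch below is an illustration only, not the fix adopted in the PR; `normalizeSqlSinglePass` is a hypothetical name. Simply swapping the order (strings first) would have the mirror-image weakness, since an apostrophe inside a comment could then open a fake string.

```typescript
// One alternation: at each position the regex engine takes the first token that
// starts there, so a quote opening before `--` swallows the marker, and a
// comment opening before a quote swallows the quote.
const SQL_TOKEN_RE = /'(?:''|[^'])*'|"(?:""|[^"])*"|\/\*[\s\S]*?\*\/|--[^\n]*/g;

function normalizeSqlSinglePass(statement: string): string {
  return statement
    .replace(SQL_TOKEN_RE, token => {
      if (token.startsWith("'")) return "''"; // string literal: keep an empty placeholder
      if (token.startsWith('"')) return '""'; // quoted identifier: keep an empty placeholder
      return ' '; // comment: a space keeps neighbouring keywords from merging
    })
    .trim();
}
```

With this version, `SELECT '--' DELETE FROM users` normalizes to `SELECT '' DELETE FROM users`, so the keyword check still sees `DELETE`. The sketch does not cover Snowflake's dollar-quoted (`$$`) strings or backslash escapes inside single-quoted strings, which is one reason a real SQL parser remains the safer option.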

Summary
Adds a new Forest integration in `@forestadmin/ai-proxy` that exposes 3 MCP tools backed by Snowflake's REST API v2, authenticated via Programmatic Access Tokens (PAT).

Tools shipped:

- `snowflake_cortex_search`: semantic search via a Cortex Search service (caller passes `database`/`schema`/`service` as arguments).
- `snowflake_cortex_analyst`: natural-language analytical Q&A backed by a semantic model file (`@stage/model.yaml`) or a semantic view (XOR validated at runtime).
- `snowflake_execute_query`: synchronous SQL execution restricted to read-only statements.

Authentication: PAT bearer token (`Authorization: Bearer <pat>` + `X-Snowflake-Authorization-Token-Type: PROGRAMMATIC_ACCESS_TOKEN`). The token is passed per-request via the integration config; there is no shared OAuth flow.

Config shape:
```ts
{
accountIdentifier: string; // e.g. "myorg-myaccount" or account locator
programmaticAccessToken: string;
defaultWarehouse?: string;
defaultDatabase?: string;
defaultSchema?: string;
defaultRole?: string;
}
```
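As an illustration of how this config could map onto a request, the sketch below builds the URL, headers, and body for a statement submission. The two authentication headers come from the description above; the host pattern, the `/api/v2/statements` path, and the body field names follow Snowflake's public SQL API as I understand it and are assumptions about what the PR actually sends. `buildStatementRequest` and `StatementRequest` are hypothetical names.

```typescript
interface SnowflakeConfig {
  accountIdentifier: string;
  programmaticAccessToken: string;
  defaultWarehouse?: string;
  defaultDatabase?: string;
  defaultSchema?: string;
  defaultRole?: string;
}

interface StatementRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: pure function, performs no network call.
function buildStatementRequest(config: SnowflakeConfig, statement: string): StatementRequest {
  const body: Record<string, string> = { statement };
  // Optional defaults are only sent when configured (field names assumed).
  if (config.defaultWarehouse) body.warehouse = config.defaultWarehouse;
  if (config.defaultDatabase) body.database = config.defaultDatabase;
  if (config.defaultSchema) body.schema = config.defaultSchema;
  if (config.defaultRole) body.role = config.defaultRole;

  return {
    // Assumed host pattern for an account identifier such as "myorg-myaccount".
    url: `https://${config.accountIdentifier}.snowflakecomputing.com/api/v2/statements`,
    headers: {
      Authorization: `Bearer ${config.programmaticAccessToken}`,
      'X-Snowflake-Authorization-Token-Type': 'PROGRAMMATIC_ACCESS_TOKEN',
      'Content-Type': 'application/json',
      Accept: 'application/json',
    },
    body: JSON.stringify(body),
  };
}
```

Keeping request construction separate from the HTTP call makes the header and default handling easy to unit test without a Snowflake account.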
Read-only SQL guard (defense-in-depth):
Test plan
🤖 Generated with Claude Code
Note
Add Snowflake integration with PAT authentication to ai-proxy
- Adds `SnowflakeConfig` and three LangChain tools (`snowflake_cortex_search`, `snowflake_cortex_analyst`, `snowflake_execute_query`) in packages/ai-proxy/src/integrations/snowflake/, using Programmatic Access Token (PAT) authentication.
- `ForestIntegrationClient` now handles the `'Snowflake'` integration name in both `loadTools` and `checkConnection`, routing to the new tools and a connectivity validator.
- `assertReadOnlySql` enforces that only `SELECT`/`SHOW`/`DESCRIBE`/`EXPLAIN` statements are executed, rejecting multi-statement and mutating SQL before any network call.
- `validateSnowflakeConfig` checks connectivity by issuing a `SELECT 1` to the Snowflake statements API, throwing `McpConnectionError` on failure with HTTP status and server message.
- `execute-query` SQL filtering relies on comment/literal stripping and keyword matching, which may reject or permit edge-case SQL inputs.

📊 Macroscope summarized 6c73291. 6 files reviewed, 2 issues evaluated, 0 issues filtered, 2 comments posted
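The connectivity check described in the summary could look roughly like the sketch below. It is not the PR's `validateSnowflakeConfig`: the function names, the injected `HttpPost` type, the error message format, and the assumption that Snowflake error bodies carry a `message` field are all illustrative. `McpConnectionError` is declared locally as a stand-in for the class exported from `'../../errors'`, whose constructor may differ.

```typescript
// Local stand-in for the real error class (constructor shape assumed).
class McpConnectionError extends Error {}

// Minimal structural types so the sketch does not depend on a specific fetch implementation.
type HttpResponse = { ok: boolean; status: number; text(): Promise<string> };
type HttpPost = (
  url: string,
  init: { method: 'POST'; headers: Record<string, string>; body: string },
) => Promise<HttpResponse>;

// Builds a message containing the HTTP status and the server message when available.
function describeFailure(status: number, rawBody: string): string {
  let message = rawBody;
  try {
    const parsed = JSON.parse(rawBody);
    if (parsed && typeof parsed.message === 'string') message = parsed.message;
  } catch {
    // Body was not JSON: fall back to the raw text.
  }
  return `Snowflake connection failed (HTTP ${status}): ${message}`;
}

// Issues a trivial SELECT 1 and throws if the API does not answer with a success status.
async function checkSnowflakeConnection(
  post: HttpPost,
  url: string,
  headers: Record<string, string>,
): Promise<void> {
  const response = await post(url, {
    method: 'POST',
    headers,
    body: JSON.stringify({ statement: 'SELECT 1' }),
  });
  if (!response.ok) {
    throw new McpConnectionError(describeFailure(response.status, await response.text()));
  }
}
```

Injecting the HTTP function keeps the check testable with a fake transport, for example one that returns `{ ok: false, status: 401, ... }` to exercise the failure path.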