
cloud service query: support OAuth login without Query API key provisioning#248

Merged
sdairs merged 5 commits into main from issue-247-oauth-service-query
Jun 15, 2026

Conversation

@sdairs sdairs commented Jun 11, 2026

Collaborator

Closes #247

What

cloud service query now works when authenticated via OAuth (cloud auth login): the CLI sends the user's bearer token directly to the Query API, skipping Query API key lookup and auto-provisioning entirely (provisioning needs write access an OAuth token doesn't have). The existing API-key flow is unchanged. Requires control-plane PR ClickHouse/control-plane#34114 (deployed to staging; prod pending).

Library (clickhouse-cloud-api)

  • run_query refactored onto a shared request builder; new run_query_bearer method uses the client's own bearer token (Error::AuthMismatch on Basic-auth clients). Existing run_query signature unchanged.
  • The Query API host is derived per environment from the client's base URL instead of hard-coding prod: api.[control-plane.]<domain> → queries.<domain>. CLICKHOUSE_CLOUD_QUERY_HOST still overrides; with_query_host() pins it programmatically (used by tests).
  • auth-provider: custom marks a custom Query API key, so it is only sent on the Basic path.
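
The host derivation described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the function names `derive_query_host` and `resolve_query_host` and their signatures are assumptions, and the sketch works on bare host names rather than full base URLs.

```rust
/// Hypothetical sketch: derive the Query API host from a management API host.
/// Returns None when the host does not start with `api.`.
fn derive_query_host(api_host: &str) -> Option<String> {
    // Only hosts of the form `api.<rest>` can be derived.
    let rest = api_host.strip_prefix("api.")?;
    // Query hosts do not carry the `control-plane.` label, so drop it if present.
    let domain = rest.strip_prefix("control-plane.").unwrap_or(rest);
    if domain.is_empty() {
        return None;
    }
    Some(format!("queries.{domain}"))
}

/// Hypothetical sketch: an explicit override wins over derivation.
fn resolve_query_host(override_host: Option<&str>, api_host: &str) -> Option<String> {
    match override_host {
        Some(host) if !host.is_empty() => Some(host.to_string()),
        _ => derive_query_host(api_host),
    }
}

fn main() {
    // CLICKHOUSE_CLOUD_QUERY_HOST is the override named in the PR description.
    let env_override = std::env::var("CLICKHOUSE_CLOUD_QUERY_HOST").ok();
    for api_host in [
        "api.control-plane.clickhouse-staging.com",
        "api.control-plane.clickhouse-dev.com",
        "api.mycorp.example.com",
    ] {
        let query_host = resolve_query_host(env_override.as_deref(), api_host);
        println!("{api_host} -> {query_host:?}");
    }
}
```

Returning `Option` keeps the "not derivable" case explicit, so a caller can fall back to an override or report a configuration error instead of guessing a host.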

CLI

  • service_query branches on is_bearer_auth(): OAuth sends the bearer token with no provisioning and writes nothing to credentials.json; --no-auto-enable is a documented no-op under OAuth.
  • Fixed staging/dev hosts in KNOWN_CONFIGS: api.clickhouse-staging.com / api.clickhouse-dev.com are NXDOMAIN; the real hosts are api.control-plane.clickhouse-{staging,dev}.com (verified via live DNS). Staging/dev OAuth login could never have matched an auth config before this.
  • README documents both auth modes.
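
As a rough illustration of the branch described above, the sketch below models the decision as a pure function. All type and function names (`Credentials`, `QueryPlan`, `plan_query`) are hypothetical, the `match` stands in for the `is_bearer_auth()` check, and the behaviour when `--no-auto-enable` is set and no key is stored is an assumption rather than something the PR states.

```rust
/// Hypothetical model of how the CLI is authenticated.
#[derive(Debug, PartialEq)]
enum Credentials {
    /// Logged in via `cloud auth login`.
    OAuth { access_token: String },
    /// Management API key (details omitted in this sketch).
    ApiKey,
}

/// Hypothetical outcome of the auth decision for `cloud service query`.
#[derive(Debug, PartialEq)]
enum QueryPlan {
    /// OAuth: send the bearer token directly; provision and store nothing.
    Bearer { token: String },
    /// API key: reuse a stored Query API key over Basic auth.
    StoredKey { key_id: String },
    /// API key, nothing stored: provision a Query API key first.
    ProvisionKey,
    /// ASSUMPTION: with --no-auto-enable and no stored key, the CLI gives up.
    MissingKey,
}

fn plan_query(creds: &Credentials, stored_key_id: Option<&str>, no_auto_enable: bool) -> QueryPlan {
    match creds {
        // Under OAuth, stored keys and --no-auto-enable are both ignored.
        Credentials::OAuth { access_token } => QueryPlan::Bearer { token: access_token.clone() },
        Credentials::ApiKey => match stored_key_id {
            Some(id) => QueryPlan::StoredKey { key_id: id.to_string() },
            None if no_auto_enable => QueryPlan::MissingKey,
            None => QueryPlan::ProvisionKey,
        },
    }
}

fn describe(plan: &QueryPlan) -> String {
    match plan {
        QueryPlan::Bearer { token } => format!("bearer auth, token of {} chars", token.len()),
        QueryPlan::StoredKey { key_id } => format!("basic auth with stored key {key_id}"),
        QueryPlan::ProvisionKey => "provision a Query API key, then basic auth".to_string(),
        QueryPlan::MissingKey => "no key available and auto-enable is off".to_string(),
    }
}

fn main() {
    let oauth = Credentials::OAuth { access_token: "example-token".to_string() };
    println!("{}", describe(&plan_query(&oauth, None, true)));
    println!("{}", describe(&plan_query(&Credentials::ApiKey, Some("key-1"), false)));
    println!("{}", describe(&plan_query(&Credentials::ApiKey, None, false)));
}
```

Keeping the decision separate from the HTTP calls makes the "zero provisioning calls under OAuth" property easy to assert in a unit test.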

Design finding (live-verified on staging)

The JWT-authenticated Query API path is SQL-console style: it authenticates the user's own identity and never consults the service's query-endpoint configuration — queries succeed against a service whose query-endpoint GET returns NOT_FOUND, and SQL permissions follow the user's console role (the backend forces readonly only for the mcp/librechat strategies). Docs and error handling reflect that; an earlier "endpoint must be enabled by an admin" error mapping was removed as based on a wrong premise.

Tests

  • Library: wiremock request-shape tests for both run_query variants (auth header, body shape, headers, error mapping); unit tests for query-host derivation across prod/staging/dev; run_query_bearer added to NON_OPENAPI_CLIENT_METHODS.
  • CLI: subprocess wiremock tests asserting the bearer header, zero provisioning calls and no key file written under OAuth, and the stored-key Basic path.
  • Live: OAuth query verified end-to-end on staging (incl. no-endpoint case and host derivation); API-key auto-provisioning + stored-key reuse verified on prod. Prod OAuth verification pending the prod deploy of the control-plane change.

🤖 Generated with Claude Code


Note

Medium Risk
Changes authentication and query routing for a core cloud workflow; API signature adds wake_service to run_query, but behavior is covered by wiremock and CLI integration tests.

Overview
cloud service query now works with OAuth by sending the user's bearer token to the Query API (SQL-console style), with no per-service key provisioning or credentials.json writes. The API-key path is unchanged: stored or auto-provisioned Query API keys still use Basic auth.

In clickhouse-cloud-api, query execution is unified behind a shared builder: new run_query_bearer, run_query gains a wake_service flag, Query hosts are derived from the management API base URL (with with_query_host / env override), and HTTP 206 responses map to ServiceIdle / ServiceStopped.

The CLI retries idle services with wake-service: true after a stderr notice and fails stopped services with a service start hint. Staging/dev OAuth host mappings are corrected to api.control-plane.clickhouse-{staging,dev}.com. README documents both Query API auth modes.

Reviewed by Cursor Bugbot for commit ae8acad.

sdairs and others added 3 commits June 10, 2026 21:46
When authenticated via OAuth (cloud auth login), `cloud service query` now
sends the user's bearer token directly to the Query API endpoint instead of
looking up or auto-provisioning a per-service Query API key — provisioning
requires write access an OAuth token doesn't have. The API-key flow is
unchanged. `--no-auto-enable` is a documented no-op under OAuth, and a 404
from the query host maps to a clear "endpoint not enabled" error pointing
at `cloud service query-endpoint create`.

Library: `run_query` is refactored onto a shared request builder and a new
`run_query_bearer` method uses the client's own bearer token (AuthMismatch
on Basic-auth clients). The Query API host is now derived per environment
from the client's base URL (`api.<domain>` → `queries.<domain>`) instead of
hard-coding the prod host; `CLICKHOUSE_CLOUD_QUERY_HOST` still overrides,
and `with_query_host` pins it programmatically (used by tests). The
`auth-provider: custom` header marks a custom Query API key, so it is only
sent on the Basic path.

Tests: wiremock request-shape coverage for both run_query variants in the
library, unit tests for query-host derivation, and CLI subprocess tests
asserting the bearer header, the absence of provisioning calls/stored keys
under OAuth, and the stored-key Basic path.

Closes #247 (code ready; blocked on server-side bearer support at the query
endpoint for live verification).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The staging and dev entries in KNOWN_CONFIGS pointed at hosts that don't
exist (api.clickhouse-staging.com / api.clickhouse-dev.com are NXDOMAIN);
the real management API hosts carry a control-plane label:
api.control-plane.clickhouse-staging.com and
api.control-plane.clickhouse-dev.com. OAuth login against staging/dev could
never have matched an auth config.

Query-host derivation learns the same shape: the query hosts do NOT carry
the control-plane label (queries.clickhouse-staging.com exists,
queries.control-plane.clickhouse-staging.com doesn't), so derivation now
strips an optional control-plane. prefix after api. — verified against
live DNS for prod, staging, and dev.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…avior

Live testing against staging (after PR 34114 deployed) confirmed the
JWT-authenticated Query API path is SQL-console style: it authenticates the
user's own identity and never consults the service's query-endpoint
configuration (verified: queries succeed against a service whose
query-endpoint GET returns NOT_FOUND). Drop the 404 → "ask an admin to run
query-endpoint create" error mapping, which was based on a wrong premise,
and fix the help text, README, and run_query_bearer docs that claimed the
endpoint must be pre-enabled.

Also soften "read-only SQL" to "permissions follow your console role": the
backend forces readonly only for the mcp/librechat strategies, not
clickhousectl.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@sdairs sdairs requested a review from iskakaushik as a code owner June 11, 2026 21:18
@sdairs sdairs had a problem deploying to cloud-integration June 11, 2026 21:18 — with GitHub Actions Failure
…t hosts

- derive_query_host now keeps a non-default port from the base URL
  (api.mycorp.example.com:8443 → queries.mycorp.example.com:8443), with
  unit coverage for both custom and default ports.
- Tests still referencing the dead api.clickhouse-staging.com host now use
  api.control-plane.clickhouse-staging.com (token serialization, URL
  normalization, lib_base_url). The plain-api-prefix derivation test keeps
  the old shape deliberately.
- assert_success moved to the shared helpers in cli_request_shape_test.rs
  and reused by the dotenv and shell-env-precedence tests.
- Restore trailing newline at end of client.rs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@sdairs sdairs temporarily deployed to cloud-integration June 11, 2026 21:46 — with GitHub Actions Inactive
The SQL-console-style query endpoint the OAuth path uses does not wake
an idled service on its own: it answers 206 {"data":"Confirm wake
service"} and expects the query to be resent with a `wake-service:
true` header once the user confirms (this is what the SQL console does
after prompting). The CLI previously streamed that 206 body straight
to stdout as if it were the query result.

The library now maps the query host's 206 service-state protocol to
typed errors (ServiceIdle / ServiceStopped) and run_query{,_bearer}
gain a wake_service flag that sends the wake confirmation header. The
CLI retries once with the flag set after printing a notice to stderr —
matching the API-key path, which the query host wakes automatically —
and turns ServiceStopped into a hint to run `cloud service start`
(stopped services are never woken by the Query API).
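
The retry flow described in this commit message might look something like the sketch below. It is an illustration under stated assumptions, not the library's code: `QueryError`, `classify`, and `query_with_wake` are hypothetical names, the HTTP call is replaced by a closure that receives the wake flag, and treating any 206 without the "Confirm wake service" body as a stopped service is an assumption, since the exact stopped-service response is not shown here.

```rust
/// Hypothetical typed errors for the query host's 206 service-state protocol.
#[derive(Debug, PartialEq)]
enum QueryError {
    ServiceIdle,
    ServiceStopped,
    Other(String),
}

/// Map a status/body pair to a query result or a typed error.
fn classify(status: u16, body: &str) -> Result<String, QueryError> {
    match status {
        200 => Ok(body.to_string()),
        // Idle service: the host asks for confirmation before waking it.
        206 if body.contains("Confirm wake service") => Err(QueryError::ServiceIdle),
        // ASSUMPTION: any other 206 means the service is stopped.
        206 => Err(QueryError::ServiceStopped),
        other => Err(QueryError::Other(format!("unexpected HTTP status {other}"))),
    }
}

/// Run a query, retrying once with the wake flag set if the service is idle.
/// `send(wake)` stands in for the HTTP request; `wake == true` corresponds to
/// sending the `wake-service: true` header.
fn query_with_wake<F>(mut send: F) -> Result<String, QueryError>
where
    F: FnMut(bool) -> (u16, String),
{
    let (status, body) = send(false);
    match classify(status, &body) {
        Err(QueryError::ServiceIdle) => {
            // Notices go to stderr so stdout carries only query results.
            eprintln!("Service is idle; waking it and retrying the query");
            let (status, body) = send(true);
            classify(status, &body)
        }
        other => other,
    }
}

fn main() {
    // Simulated host: answers 206 until the wake flag is sent.
    let result = query_with_wake(|wake| {
        if wake {
            (200, "1\n".to_string())
        } else {
            (206, r#"{"data":"Confirm wake service"}"#.to_string())
        }
    });
    match result {
        Ok(rows) => print!("{rows}"),
        Err(QueryError::ServiceStopped) => {
            eprintln!("Service is stopped; run `cloud service start` first")
        }
        Err(err) => eprintln!("query failed: {err:?}"),
    }
}
```

Retrying only once means that a service which still reports idle after the wake request surfaces as `ServiceIdle` rather than looping.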

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@sdairs sdairs temporarily deployed to cloud-integration June 11, 2026 22:37 — with GitHub Actions Inactive
@sdairs sdairs merged commit c53f19c into main Jun 15, 2026
4 checks passed
@sdairs sdairs deleted the issue-247-oauth-service-query branch June 15, 2026 20:03

Development

Successfully merging this pull request may close these issues.

cloud service query should work with OAuth login (read-only), without auto-provisioning a Query API key

2 participants