QUERY_SPEC.md — substrate-neutral query specification (v0.1 + v0.2 roadmap) #138

@rdhyee

Context

query-spec.qmd (v0.1, landed in the previous session; currently untracked on main — needs a commit/PR) defines a substrate-neutral query contract for iSamples. It specifies canonical dimensions, a filter grammar, and bindings to three query substrates: DuckDB-WASM (browser), Python (pandas/DuckDB), and Solr (legacy Central API).

The goal: a single query expression should be runnable against any of the three substrates and return equivalent results, modulo substrate capability differences. This lets the web Explorer, Python Explorer, and legacy tooling share one mental model.
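As a rough illustration of what "one expression, several substrates" could look like, the sketch below renders a small filter tree to both a parameterized SQL `WHERE` fragment (usable from DuckDB in the browser or in Python) and a Solr filter-query string. Everything here is an assumption for illustration only: the `In`/`And` node types, the `to_sql`/`to_solr` helpers, and the `material`/`source` field names and values are hypothetical and are not taken from query-spec.qmd.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class In:
    """Hypothetical filter node: `field` must equal one of `values`."""
    field: str
    values: Sequence[str]


@dataclass(frozen=True)
class And:
    """Hypothetical filter node: every clause must hold."""
    clauses: Sequence[Union["In", "And"]]


def to_sql(expr) -> tuple[str, list[str]]:
    """Render a WHERE fragment with `?` placeholders plus its parameters."""
    if isinstance(expr, In):
        marks = ", ".join("?" for _ in expr.values)
        return f'"{expr.field}" IN ({marks})', list(expr.values)
    parts, params = [], []
    for clause in expr.clauses:
        sql, clause_params = to_sql(clause)
        parts.append(f"({sql})")
        params.extend(clause_params)
    return " AND ".join(parts), params


def to_solr(expr) -> str:
    """Render the same tree as a Solr filter-query string."""
    if isinstance(expr, In):
        # Escape backslashes and double quotes inside quoted phrases.
        quoted = " OR ".join(
            '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"'
            for v in expr.values
        )
        return f"{expr.field}:({quoted})"
    return " AND ".join(f"({to_solr(c)})" for c in expr.clauses)


# Hypothetical dimensions and values, for illustration only.
query = And([In("material", ["rock", "sediment"]), In("source", ["SESAR"])])
sql_where, sql_params = to_sql(query)
solr_fq = to_solr(query)
```

Keeping values out of the SQL string (placeholders plus a parameter list) is one way to let the browser and Python bindings share the same rendered fragment; whether the spec should mandate that is left open here.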

Status

  • v0.1 drafted and (to-be-)committed as query-spec.qmd
  • §7 of v0.1 enumerates 7 open questions for v0.2 — listed below

v0.2 open questions (from §7)

  • Specimen filter in web Explorer — how does specimen type (MaterialSampleRecord subtype) cross-filter work in DuckDB-WASM substrate?
  • Time in lite parquet: samples_map_lite.parquet does not include the result_time granularity needed for time-range filters; upgrade path?
  • FTS field coverage — which fields does DuckDB FTS index (see issue #84, "Explore DuckDB FTS extension for full-text search in Explorer"); what's the contract for text search across substrates?
  • Cross-filter cache shape — how are intermediate filter results cached; is cache shape part of the contract or substrate-private?
  • Confidence thresholds — vocabulary confidence scores: cutoff semantics across substrates
  • H3 tier breakpoints — zoom-to-tier mapping: is this in the query spec or the rendering layer? (Related: examples/issues/4)
  • Thumbnail provenance — thumbnail_url as a query field: sidecar schema vs wide parquet column (related: sidecar rollout issue)
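To make the H3 tier question above concrete: if the zoom-to-tier mapping were placed in the query spec rather than the rendering layer, it might amount to a small lookup table like the sketch below. The breakpoints, the tier resolutions, and the `h3_tier_for_zoom` name are all hypothetical placeholders, not values used by the Explorer.

```python
from bisect import bisect_right

# Hypothetical breakpoints: the zoom level at which each finer tier begins.
ZOOM_BREAKPOINTS = [5, 9]
# Hypothetical H3 resolutions, coarse to fine (one more entry than breakpoints).
H3_TIERS = [4, 6, 8]


def h3_tier_for_zoom(zoom: float) -> int:
    """Return the H3 resolution assumed for a given map zoom level."""
    # bisect_right counts how many breakpoints are <= zoom,
    # which is exactly the index of the tier to use.
    return H3_TIERS[bisect_right(ZOOM_BREAKPOINTS, zoom)]
```

With these placeholder numbers, zoom 3 maps to resolution 4, zoom 5 to resolution 6, and zoom 12 to resolution 8. A table this small could be shared by all substrates, which is one argument for specifying it; the counter-argument is that it only matters to clients that render maps.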

Acceptance for v0.2

  • All 7 questions have a decision (resolved, deferred-with-rationale, or spun out to separate issues)
  • Bindings section shows at least one non-trivial query runnable across all three substrates with matching output
  • Linked from the site's data-access landing page
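One possible reading of "matching output" in the criteria above is that each substrate returns the same set of record identifiers, ignoring order. The sketch below checks that under this assumption; the `missing_by_substrate` helper, the substrate labels, and the sample identifiers are hypothetical, and the spec may well define equivalence differently (for example, including counts or field values).

```python
def missing_by_substrate(results: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each substrate to the ids it lacks relative to the union of all results.

    Substrates that returned every id are omitted, so an empty dict means
    all substrates agree (order-insensitively).
    """
    union = set().union(*results.values())
    gaps = {}
    for name, ids in results.items():
        missing = union - set(ids)
        if missing:
            gaps[name] = missing
    return gaps


# Hypothetical identifiers returned by each binding for the same query.
sample = {
    "duckdb_wasm": ["id-1", "id-2", "id-3"],
    "python": ["id-3", "id-1", "id-2"],
    "solr": ["id-1", "id-2"],
}
gaps = missing_by_substrate(sample)
```

Here `gaps` reports that only the Solr result is missing `id-3`. Any such gap would then need to be classified as either a bug or a documented capability difference.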
