Concepts

This page walks through the building blocks of QueryInsights one at a time. If you have read the Architecture overview, this fills in what each piece actually does.

Poller

The poller captures queries from databases that expose their workload as statistics tables. It is intentionally a thin data collector: it does not parse, normalize, or store anything. It reads, filters, samples, and forwards.

  • PostgreSQL — reads pg_stat_statements, joining pg_database to resolve each query's database name. Because pg_stat_statements is cluster-wide, the poller can fall back to the maintenance postgres database if the DSN's own database doesn't have the extension installed.
  • CockroachDB — reads crdb_internal.statement_statistics, skipping internal application names and DDL statements.

Each configured instance gets its own goroutine with an independent sampling rate and poll interval. Sampling is probabilistic — every captured query passes a rand() < sampling_rate check — so you can dial down volume on a busy instance. Internal and utility statements (system catalogs, SET, SHOW, transaction control, DDL, and known admin tools) are filtered out before anything is forwarded.

Captured queries are sent to the engine via the SubmitQuery gRPC call, tagged with the engine name, the source instance, and the originating database.

Engine

The engine is the centralized brain. It is a gRPC server that owns all processing and all writes to the metastore. For every query it receives it:

  1. Parses the SQL into a structured analysis (query type, tables, columns, joins, clause presence, subquery count).
  2. Normalizes it into a fingerprint by replacing literals with placeholders.
  3. Upserts the normalized pattern, deduplicating by (normalized_query, engine), and records a raw-query sample.
  4. Recommends — for a new pattern with resolvable tables, it kicks off a background per-query recommendation.

The engine also runs the workload-insight ticker, which periodically produces holistic, cross-pattern advice per engine/instance (see Recommendations below).

Internally the engine keeps a map of recommendation services keyed by engine name (trino, cockroachdb, postgresql), each wired to the matching metadata provider, and selects the right one based on the query's engine label.

Parsing and normalization

Parsing turns raw SQL into a structured Analysis, and normalization turns it into a stable fingerprint. The engine routes by engine type:

  • Trino — the official Trino SQL grammar compiled with ANTLR; a visitor walks the parse tree to extract metadata, and a token-stream normalizer replaces literals.
  • CockroachDB — a yacc-style lexer and structural analyzer.
  • PostgreSQL / others — a regex-based normalizer that strips comments, replaces string/numeric/dollar-quoted literals, and collapses whitespace.

Normalization is what makes the product tractable: a thousand variants of the same statement collapse into one pattern, so you optimize shapes, not noise.

-- Original
SELECT name FROM users WHERE id = 42 AND status = 'active'

-- Normalized
SELECT name FROM users WHERE id = ? AND status = ?

Metadata providers (schema/DDL)

Recommendations are only useful if they are grounded in your real schema. The SchemaProvider interface fetches the DDL for the tables a query touches:

EngineHow it fetches DDL
TrinoSHOW CREATE TABLE via the Trino client
CockroachDBSHOW CREATE TABLE over the PostgreSQL wire protocol
PostgreSQLReconstructed from information_schema, enriched with pg_stats

Providers are multi-instance: the engine keeps one provider per configured instance and connects lazily to the specific database a query came from. Table names are validated before being interpolated into any metadata query.

LLM provider

The LLM provider is the abstraction over the model backend that writes the actual recommendations. It exposes two operations — one for per-query rewrites and one for workload insights — behind a single interface, so support for a new backend is a matter of implementing that interface.

Two backends ship today:

  • OpenAI — chat completions using OpenAI's Structured Outputs, so the model is forced to return a strict { summary, details } JSON envelope.
  • Anthropic — Claude models.

Which backend runs is decided from the configured model name:

  • If the model name contains claude and an Anthropic API key is set → Anthropic.
  • If the model name contains claude but only an OpenAI key is set → the OpenAI-compatible client (this is the expected path when routing Claude through an OpenAI-compatible gateway such as LiteLLM or Portkey).
  • Otherwise → OpenAI.

A custom base URL lets you point either client at a gateway instead of the vendor's public API. Generation uses a low temperature and a hard token cap so output stays focused and per-call cost is bounded. If no API key is configured, the engine simply starts without recommendation features — ingestion and the dashboard still work.

See Configuration for the exact variables.

Recommendations: two scopes, one feed

QueryInsights produces two deliberately non-overlapping kinds of advice, surfaced together in the dashboard with scope and kind chips:

Query rewrite (scope=query)Workload schema (scope=workload)
SubjectOne normalized query patternAn entire engine/instance workload
TriggerAutomatically, on first sight of a new patternThe engine's periodic ticker, or an on-demand dashboard request
Question"How can this query be written better?""Given all queries hitting this engine, what schema changes help the whole workload?"
OutputRewrites, predicate ordering, join shape — no schema adviceIndexes, covering/storing clauses, layout — no per-query rewrites

Keeping the two scopes non-overlapping means the advice stays complementary instead of contradictory.

Idempotency and safety

Because recommendation generation calls an LLM, the engine takes care to never do it twice for the same pattern:

  • An is_recommendation_generated flag is checked before any LLM call.
  • SELECT ... FOR UPDATE inside a transaction serializes concurrent workers racing on the same pattern.
  • In-flight claims and per-table caps prevent a burst of new patterns from spawning unbounded concurrent LLM calls.
  • Failed attempts are counted and permanently skipped after a configurable limit.

These controls make ingestion safe to run continuously against a live, high-volume workload.