
Feature/multi query stores dashboard overview #311

Merged
erikdarlingdata merged 9 commits into erikdarlingdata:dev from rferraton:feature/multi-query-stores-overview
May 5, 2026

@rferraton
Contributor

Multi-Query Store Overview Dashboard

Summary

Adds a new Query Store Overview dashboard that provides a consolidated, cross-database view of all Query Store-enabled databases on a SQL Server instance. This gives users a single pane of glass to compare workload metrics across databases before drilling down into individual database Query Store details.

Features

Overview Dashboard Layout

  • Donut Chart — Shows Query Store state distribution across all databases (Read/Write, Read-Only, Off) with interactive tooltips and a popup listing databases by state
  • Consolidated Time Slicer — Reuses the existing TimeRangeSlicerControl with data aggregated across all databases, allowing users to select a time range that filters all dashboard panels
  • Wait Stats by Database — Stacked bar chart showing hourly wait totals per database (not per wait category), colored using a unified palette shared with the metric cards. Includes a dashed average reference line with smart unit formatting (ms/sec, s/sec, min/sec)
  • Metric Bar Cards (Total & Average) — Two rows of 7 cards each (CPU, Duration, Executions, Reads, Writes, Physical Reads, Memory) showing horizontal stacked bars per database with a unified color map based on top-N databases by CPU

Drill-Down

  • Right-click any database bar in any card → "Drill Down to DB Query Store" opens the single-database Query Store tab directly (no connection dialog)
  • The current time range selection from the overview slicer is automatically passed to the drilled-down Query Store tab via SetInitialTimeRange()

Data Pipeline

  • QueryStoreOverviewService fetches states, time slices, metrics, and wait stats across all user databases in parallel using SemaphoreSlim (DOP=8)
  • FetchAllWaitStatsWithErrorsAsync captures per-database errors instead of silently swallowing them, surfacing failures to the user in a dialog window
  • Active database list is cached on initial load to avoid re-querying on every time slicer change
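The throttled fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `FetchAllAsync` and the `fetchOne` delegate are hypothetical names, and the per-database work is simulated with a delay instead of a real SQL query.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs fetchOne for every database, at most maxDop at a time,
// and records per-database failures instead of dropping them.
static async Task<(List<TResult> Results, List<(string Db, string Error)> Errors)> FetchAllAsync<TResult>(
    IReadOnlyList<string> databases,
    Func<string, CancellationToken, Task<TResult>> fetchOne,
    int maxDop = 8,
    CancellationToken ct = default)
{
    var results = new ConcurrentBag<TResult>();
    var errors = new ConcurrentBag<(string Db, string Error)>();
    using var gate = new SemaphoreSlim(maxDop);

    var tasks = databases.Select(async db =>
    {
        // Waiting happens outside the try block so a cancelled wait never releases the gate.
        await gate.WaitAsync(ct);
        try
        {
            results.Add(await fetchOne(db, ct));
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            errors.Add((db, ex.Message));
        }
        finally
        {
            gate.Release();
        }
    });

    await Task.WhenAll(tasks);
    return (results.ToList(), errors.ToList());
}

// Simulated workload: one database fails, the others succeed.
var (rows, failures) = await FetchAllAsync(
    new[] { "Sales", "Hr", "Broken" },
    async (db, token) =>
    {
        await Task.Delay(10, token);
        if (db == "Broken") throw new InvalidOperationException("permission denied");
        return $"{db}: ok";
    });

Console.WriteLine($"{rows.Count} succeeded, {failures.Count} failed"); // 2 succeeded, 1 failed
```

Collecting errors next to results lets the caller decide how to surface them, for example in the dialog window mentioned above.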

UX

  • Progress bar (indeterminate) at the top of the dashboard during data loading — always reserves its 3px height to prevent layout hopping
  • supportsWaitStats flag (SQL 2017+ / Azure) is checked before attempting wait stats queries, with a descriptive fallback message when unsupported
  • Null-safe SQL (ISNULL, reader.IsDBNull guards) and reader.FieldCount checks to handle databases returning varying result shapes
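A null-safe read of the kind described in the last bullet might look like the sketch below. It uses an in-memory `DataTable` reader so that it runs without a SQL Server connection; `ReadDouble` and the column names are hypothetical, and the fallback of `0` for a missing or NULL value is an assumption.

```csharp
using System;
using System.Data;

// In-memory stand-in for a per-database result set (column names are illustrative).
var table = new DataTable();
table.Columns.Add("database_name", typeof(string));
table.Columns.Add("total_cpu_ms", typeof(double));
table.Rows.Add("Sales", 1250.0);
table.Rows.Add("Empty", DBNull.Value);

// Hypothetical guard: a column that is absent from this result shape, or NULL, reads as zero.
static double ReadDouble(IDataRecord r, int ordinal) =>
    ordinal < r.FieldCount && !r.IsDBNull(ordinal) ? Convert.ToDouble(r.GetValue(ordinal)) : 0.0;

using var reader = table.CreateDataReader();
while (reader.Read())
{
    // Ordinal 2 does not exist in this result set, so the FieldCount check returns the fallback.
    Console.WriteLine($"{reader.GetString(0)}: cpu={ReadDouble(reader, 1)}, reads={ReadDouble(reader, 2)}");
}
```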

New Files

| File | Purpose |
| --- | --- |
| QueryStoreOverviewControl.axaml | Dashboard layout (donut + slicer + wait stats + 2×7 bar card grids) |
| QueryStoreOverviewControl.axaml.cs | Canvas-based charting, parallel data orchestration, drill-down events |

Modified Files

| File | Change |
| --- | --- |
| QueryStoreOverviewService.cs | New service with FetchAllStatesAsync, FetchAllMetricsAsync, FetchAllTimeSlicesAsync, FetchAllWaitStatsAsync, FetchAllWaitStatsWithErrorsAsync |
| QueryStoreOverviewModels.cs | New models: DatabaseQueryStoreState, DatabaseMetrics, DatabaseTimeSlice, DatabaseWaitAmountTimeSlice, DatabaseWaitCategoryTimeSlice |
| QuerySessionControl.axaml.cs | "QS Overview" button handler, DrillDownRequested wiring, OpenQueryStoreForDatabaseAsync now accepts optional initialStartUtc/initialEndUtc |
| QueryStoreGridControl.axaml.cs | Added SetInitialTimeRange() for pre-setting the slicer range on drill-down |

Technical Notes

  • Top-N databases (topN=4 by default) get distinct palette colors; remaining databases are aggregated as "Others" in grey
  • All charting is custom Canvas-based (no third-party chart library)
  • Color palette: #2EAEF1 (blue), #F2994A (orange), #27AE60 (green), #9B51E0 (purple), #EB5757 (red), #F2C94C (yellow), #56CCF2 (light blue), #BB6BD9 (violet)
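The top-N coloring rule can be sketched as below. The palette values are the ones listed above; `BuildColorMap`, `BucketFor`, and the exact grey used for "Others" are assumptions made for illustration, since the description only says "grey".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Palette as listed in the PR description.
string[] palette = { "#2EAEF1", "#F2994A", "#27AE60", "#9B51E0", "#EB5757", "#F2C94C", "#56CCF2", "#BB6BD9" };
const string OthersLabel = "Others";
const string OthersColor = "#808080"; // assumption: the description only says "grey"

// Hypothetical helper: rank databases by total CPU and give the top N a palette color.
Dictionary<string, string> BuildColorMap(IEnumerable<(string Db, double Cpu)> metrics, int topN = 4)
{
    var map = metrics
        .OrderByDescending(m => m.Cpu)
        .Take(topN)
        .Select((m, i) => (m.Db, Color: palette[i % palette.Length]))
        .ToDictionary(x => x.Db, x => x.Color);
    map[OthersLabel] = OthersColor;
    return map;
}

// Any database outside the top N falls into the shared "Others" bucket.
string BucketFor(string db, Dictionary<string, string> map) =>
    db != OthersLabel && map.ContainsKey(db) ? db : OthersLabel;

var colors = BuildColorMap(new[] { ("A", 50.0), ("B", 90.0), ("C", 10.0) }, topN: 2);
Console.WriteLine(colors["B"]);            // #2EAEF1 (highest CPU gets the first color)
Console.WriteLine(BucketFor("C", colors)); // Others
```

Building the map once from the CPU ranking and reusing it for every card and for the wait chart is what keeps a database's color consistent across panels.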

Which component(s) does this affect?

  • Desktop App (PlanViewer.App)
  • Core Library (PlanViewer.Core)
  • CLI Tool (PlanViewer.Cli)
  • SSMS Extension (PlanViewer.Ssms)
  • Tests
  • Documentation

How was this tested?

[Screenshots: 2026-05-05_03h06_39, 2026-05-05_08h52_36]

Describe the testing you've done. Include:

  • Plan files tested: not applicable (this change has no impact on plan files)
  • Platforms tested: Windows

Checklist

  • I have read the contributing guide
  • My code builds with zero warnings (dotnet build -c Debug)
  • All tests pass (dotnet test)
  • I have not introduced any hardcoded credentials or server names

rferraton added 7 commits May 3, 2026 10:37
…ary of what was created:

New Files:
1. QueryStoreOverviewModels.cs — Models for:
• QueryStoreState enum (Off, ReadOnly, ReadWrite)
• DatabaseQueryStoreState — state per database
• DatabaseMetrics — aggregated metrics (total + avg) per database
• DatabaseTimeSlice — time slice data tagged by database
• DatabaseWaitCategoryTimeSlice — wait stats tagged by database
2. QueryStoreOverviewService.cs — Parallel data fetching with:
• SemaphoreSlim throttling (default DOP=8)
• ConcurrentBag<T> for thread-safe result collection
• Methods: FetchAllStatesAsync(string, int, CancellationToken), FetchAllMetricsAsync(string, List<string>, DateTime, DateTime, int, CancellationToken), FetchAllTimeSlicesAsync(string, List<string>, int, int, CancellationToken), FetchAllWaitStatsAsync(string, List<string>, DateTime, DateTime, int, CancellationToken)
3. QueryStoreOverviewControl.axaml — Layout with 3 rows:
• Row 1: Donut chart + consolidated time slicer + consolidated wait stats ribbon
• Row 2: 7 bar chart cards (Total metrics)
• Row 3: 7 bar chart cards (Avg metrics)
4. QueryStoreOverviewControl.axaml.cs — Code-behind with:
• Donut chart (RW=light blue, RO=dark blue, OFF=grey, center shows active/total)
• Consolidated time slicer (30-day, 24h default selection)
• Consolidated wait stats ribbon (sum across databases)
• Top-N bar cards with consistent database colors, adaptive font color, tooltips, and right-click "Drill Down to DB Query Store" context menu
Modified Files:
5. QuerySessionControl.axaml — Added "QS Overview" button
6. QuerySessionControl.axaml.cs — Added QueryStoreOverview_Click(object?, RoutedEventArgs) handler that opens the overview tab and wires drill-down to open single-DB Query Store tabs
…ownEventArgs containing Database, StartUtc, and EndUtc. The session control calls grid.SetInitialTimeRange() before the grid auto-fetches, so the drilled-down Query Store tab starts with the same time range selected in the overview.

2.  Progress bar: Added an indeterminate ProgressBar at the top of the overview. It shows during LoadAsync() (all 3 phases) and during RefreshMetricsAndWaitStatsAsync(CancellationToken) (when the slicer range changes), and hides when complete via try/finally.
@rferraton
Contributor Author

Arf... sorry for the conflicts! I may have made a mistake when branching, but it's hard to know where, or what to do about them!

Resolves conflicts after dev's CRLF->LF normalization (erikdarlingdata#307) and the
QueryStoreGridControl GroupBy default change (erikdarlingdata#305, erikdarlingdata#306). The only real
conflict was leftover blank lines in QueryStoreGridControl.axaml.cs where
erikdarlingdata#306 collapsed whitespace after removing ExpandRowRecursive.
Owner

@erikdarlingdata left a comment


Code Review

Overview

Adds a cross-database Query Store overview dashboard (donut + slicer + wait-stats ribbon + 14 metric bar cards) with right-click drill-down into the per-DB Query Store tab carrying the time range. ~1,524 lines, 7 files. Compiles clean against dev after the merge.

What's good

  • Parallelism is well-bounded: SemaphoreSlim(maxDop=8) + ConcurrentBag for fan-out across N user databases. Cancellation propagates through to OpenAsync/ExecuteReaderAsync.
  • All SQL is parameterized (SqlParameter for @start, @end, @daysBack); database names come from sys.databases, not user input; per-DB connection strings are built via SqlConnectionStringBuilder. Clean on the security side.
  • FetchAllWaitStatsWithErrorsAsync is the right pattern — captures per-DB errors with when (ex is not OperationCanceledException) instead of swallowing them.
  • _supportsWaitStats gating + descriptive fallback for SQL 2016/older.
  • Drill-down API on QueryStoreGridControl.SetInitialTimeRange is small and explicit.

Bugs / correctness

  1. Dead code: FetchDatabaseWaitStatsAsync (QueryStoreOverviewService.cs:332-373) and DatabaseWaitCategoryTimeSlice (QueryStoreOverviewModels.cs:66-73) have no callers. Only FetchDatabaseWaitAmountAsync is used. Delete both.

  2. Errors swallowed silently: QueryStoreOverviewService.cs:109 and :143 use bare catch { /* skip */ }, which will eat OperationCanceledException too and defeat cancellation. Apply the same when (ex is not OperationCanceledException) pattern used in FetchAllWaitStatsWithErrorsAsync.

  3. Permission errors look like "Off": FetchAllStatesAsync:37-40 maps any per-DB exception to QueryStoreState.Off, so a missing VIEW DATABASE STATE grant looks identical to QS being disabled in the donut. At minimum log/surface; ideally add an Unknown/Error state.

  4. _cts leak: LoadAsync:95-96 and OnSlicerRangeChanged:424-425 cancel the old token source but never Dispose() it. Add _cts?.Dispose() between cancel and reassign, and dispose on control detach.

  5. Race in DrawWaitStatsChart — line 477 var topDbs = _dbColorMap.Keys.ToList(); depends on DrawBarCards having populated _dbColorMap first. The normal call order does this, but SizeChanged (constructor:84-88) calls DrawWaitStatsChart independently — if a resize fires between LoadAsync Phase 1 and Phase 3, every wait bar gets bucketed into Others. Defer the wait chart, or compute topDbs from _metrics directly.

  6. Misnamed field: WaitRatio on DatabaseWaitAmountTimeSlice (and DatabaseWaitCategoryTimeSlice) is wait time in hours (SUM(total_query_wait_time_ms) / 3,600,000), not a ratio. Rename to WaitHours / WaitAmountHours to match the SQL and WaitRatioFormatter semantics elsewhere.
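To make item 2 concrete, the sketch below contrasts a bare catch with the filtered catch. It is only an illustration: `QueryAsync`, `SwallowingAsync`, and `FilteredAsync` are hypothetical stand-ins, and the "query" is a delay whose only failure mode is cancellation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for a per-database query; it only waits, so cancellation is the sole failure mode.
static async Task QueryAsync(CancellationToken ct) => await Task.Delay(Timeout.Infinite, ct);

// Bare catch: the cancellation is swallowed and the caller cannot tell the work was aborted.
static async Task<bool> SwallowingAsync(CancellationToken ct)
{
    try { await QueryAsync(ct); return true; }
    catch { return false; }
}

// Filtered catch: other failures are skipped, OperationCanceledException keeps propagating.
static async Task<bool> FilteredAsync(CancellationToken ct)
{
    try { await QueryAsync(ct); return true; }
    catch (Exception ex) when (ex is not OperationCanceledException) { return false; }
}

using var cts = new CancellationTokenSource();
cts.Cancel();

bool swallowed = !await SwallowingAsync(cts.Token);

bool propagated = false;
try { await FilteredAsync(cts.Token); }
catch (OperationCanceledException) { propagated = true; }

// Prints: bare catch swallowed cancel: True; filter propagated cancel: True
Console.WriteLine($"bare catch swallowed cancel: {swallowed}; filter propagated cancel: {propagated}");
```

`Task.Delay` throws `TaskCanceledException`, which derives from `OperationCanceledException`, so the filter lets it through.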

Style / conventions

  • Tab vs space inconsistency: QueryStoreOverviewService.cs:375-411 (FetchDatabaseWaitAmountAsync) is tab-indented while every other method is 4-space. Same in QueryStoreOverviewModels.cs:82 (WaitRatio on DatabaseWaitAmountTimeSlice). Will look broken in any editor without tab-width=4.
  • Trailing blank lines in QueryStoreOverviewModels.cs:85-87.
  • _dbColorMap is declared mid-file (line 630) instead of with the other private fields (lines 31-39).
  • if (db != "Others") on line 742 — extract private const string OthersLabel = "Others" and reuse.
  • Color.Parse("#E4E6EB") is repeated, but the same color is already available as DynamicResource ForegroundBrush in XAML; resolve it once or use FindResource.

Performance (minor)

  • topDbs.Contains(item.Db) inside the hour loop (:511) and per-metric loop (:699) — List<string>.Contains is O(n). Use a HashSet<string> since the list is small but checked many times per redraw.
  • SizeChanged redraws everything; rapid resizing will thrash. A small debounce would help — not blocking.
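The HashSet suggestion amounts to building the lookup once per redraw and reusing it inside the loops. A minimal sketch, with made-up database names and wait values:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var topDbs = new List<string> { "Sales", "Hr", "Ops", "Web" };

// Built once per redraw; each Contains call inside the loops is then O(1) on average.
var topSet = new HashSet<string>(topDbs);

// Illustrative rows only: (database, wait hours) for one hour bucket.
var rows = new[]
{
    (Db: "Sales", Hours: 1.5),
    (Db: "Tmp", Hours: 0.25),
    (Db: "Hr", Hours: 0.75),
    (Db: "Old", Hours: 0.25),
};

// Databases outside the top set are summed into a single "Others" entry.
var totals = rows
    .GroupBy(r => topSet.Contains(r.Db) ? r.Db : "Others")
    .ToDictionary(g => g.Key, g => g.Sum(r => r.Hours));

foreach (var kv in totals)
    Console.WriteLine($"{kv.Key}: {kv.Value}");
```

The default comparer is used here so that matching behaves like the original `List<string>.Contains`; whether database names should be compared case-insensitively is a separate question.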

Tests

No tests added for ~1,500 lines of new code (PR checklist has Tests unchecked). QueryStoreOverviewService is a static class with discrete query methods — at minimum, table-driven tests against a real DB would catch column-order regressions in the SQL.

SQL notes

  • WHERE database_id > 4 correctly excludes system DBs.
  • READ UNCOMMITTED on QS DMVs is fine and conventional.
  • bucket_hour = DATEADD(HOUR, DATEDIFF(HOUR, 0, rsi.start_time), 0) truncates to hour-of-epoch in server local time (because DATEDIFF(HOUR, 0, ...) is naïve). The result is then SpecifyKind(..., Utc) on the C# side, which mislabels it on a non-UTC server. Either compute UTC-truncation server-side or stop forcing Utc kind when the SQL hasn't promised it. Worth verifying against a non-UTC server before merge.
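If the bucket really does come back in server local time, as the last note suspects, the difference between relabeling and converting looks like this. The fixed UTC+02:00 server offset and the sample timestamp are assumptions chosen for illustration only.

```csharp
using System;

// Assumption for illustration: the server runs at a fixed UTC+02:00 offset.
var serverZone = TimeZoneInfo.CreateCustomTimeZone("Server", TimeSpan.FromHours(2), "Server", "Server");

// What the SQL bucket would look like if the hour were truncated in server local time.
var bucketLocal = new DateTime(2026, 5, 5, 10, 0, 0, DateTimeKind.Unspecified);

// Relabeling keeps the clock reading and silently shifts the instant by the server offset.
var relabeled = DateTime.SpecifyKind(bucketLocal, DateTimeKind.Utc);

// Converting uses the zone to recover the actual UTC instant.
var converted = TimeZoneInfo.ConvertTimeToUtc(bucketLocal, serverZone);

// Prints: relabeled hour: 10, converted hour: 8
Console.WriteLine($"relabeled hour: {relabeled.Hour}, converted hour: {converted.Hour}");
```

On a UTC server the two values coincide, which is why the issue only shows up when testing against a non-UTC server.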

Verdict

Functionally solid, security-clean, well-bounded async. Blockers are small: the dead code, the silent catches eating cancellation, and the bucket-hour timezone question. The rest is polish (indentation, naming, dispose). Recommend addressing #1–6 before merge; style/perf items can be follow-ups.

@rferraton
Contributor Author

  1. Dead code removed — Deleted FetchDatabaseWaitStatsAsync and DatabaseWaitCategoryTimeSlice (no callers).
  2. Bare catch blocks fixed — All 4 parallel fetch methods now use when (ex is not OperationCanceledException) or when (!ct.IsCancellationRequested) so cancellation propagates correctly.
  3. Permission errors surfaced — Added QueryStoreState.Error enum value and ErrorMessage property to DatabaseQueryStoreState. The donut now shows a red "Error" segment, and clicking it lists databases with their error messages.
  4. _cts leak fixed — Added _cts?.Dispose() before every reassignment in LoadAsync() and OnSlicerRangeChanged(object?, TimeRangeChangedEventArgs), plus a DetachedFromVisualTree handler that cancels and disposes on control teardown.
  5. SizeChanged race fixed — DrawWaitStatsChart() now returns early if _dbColorMap is empty, preventing all bars from being bucketed into "Others" when a resize fires before DrawBarCards() has run.
  6. Misnamed field renamed — WaitRatio → WaitAmountHours on DatabaseWaitAmountTimeSlice and all references in the service and control.
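The cancel, dispose, reassign sequence from item 4 can be sketched as below. This is not the control's code: `current`, `Restart`, and `Teardown` are hypothetical names, with `Teardown` standing in for the DetachedFromVisualTree handler.

```csharp
using System;
using System.Threading;

CancellationTokenSource? current = null;

// Hypothetical helper mirroring the fix: cancel and dispose the old source before replacing it.
CancellationToken Restart()
{
    current?.Cancel();
    current?.Dispose();
    current = new CancellationTokenSource();
    return current.Token;
}

// Teardown path: cancel whatever is still running and release the source.
void Teardown()
{
    current?.Cancel();
    current?.Dispose();
    current = null;
}

var first = Restart();
var second = Restart(); // cancels and disposes the source behind `first`

// Prints: first cancelled: True, second cancelled: False
Console.WriteLine($"first cancelled: {first.IsCancellationRequested}, second cancelled: {second.IsCancellationRequested}");

Teardown();
```

Cancelling before disposing matters: work still holding the old token observes the cancellation, and only then is the source released.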

@erikdarlingdata erikdarlingdata merged commit c90bff7 into erikdarlingdata:dev May 5, 2026
2 checks passed
@rferraton
Contributor Author

@erikdarlingdata : thanks for the merge from dev :-)
