
Tests | Flakiness improvements to XEventsTracingTest #4262

Open
edwardneal wants to merge 1 commit into dotnet:main from edwardneal:tests/xevent-reliability

Conversation

@edwardneal
Contributor

Description

This makes some reliability improvements to XEventsTracingTest. The tests fail intermittently, and most of those failures are the result of the tests' SQL statements being killed to resolve deadlocks.

I wouldn't normally expect the original SQL statements (sp_help and SELECT @@VERSION) to encounter that, but sp_help returns the list of objects in the current database; perhaps if objects are being created by another CI run, this becomes an issue.

To break the dependency on server state, I've replaced both method calls with a call to a new SP which just runs SELECT 1, and a SQL statement which runs SELECT 1 directly.
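As a rough T-SQL sketch of that replacement pair: the procedure name `dbo.usp_SelectOne` is hypothetical, since the description does not name the new SP, and the real definition may differ.

```sql
-- Hypothetical name: the PR description does not state what the new SP is called.
CREATE PROCEDURE dbo.usp_SelectOne
AS
BEGIN
    -- Returns a constant, so the result does not depend on which objects
    -- exist in the current database (unlike sp_help).
    SELECT 1;
END;
GO  -- batch separator understood by sqlcmd and SSMS, not a T-SQL statement

-- Stored-procedure path (stands in for the former sp_help call).
EXEC dbo.usp_SelectOne;

-- Ad-hoc statement path (stands in for the former SELECT @@VERSION).
SELECT 1;
```

Both paths still issue a stored-procedure call and a plain statement, so the two kinds of execution the test traced before remain covered.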

During investigation it also became clear that an activity ID is recorded (and the test is capable of passing) even when a deadlock or other SQL error occurs. I've made this explicit via a new test case. Technically this means that we could simply broaden the error handling to swallow all SqlException errors when executing the command and the test would continue to pass. I've not done this because I'm a little more concerned that we're encountering deadlocks on comparatively simple statements, and don't want to mask any underlying issue.
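The description does not say how the new test case provokes its error. Purely as an illustration, and as an assumption rather than the statement the PR actually uses, a T-SQL batch like this raises an error that the client sees as a SqlException:

```sql
-- Illustrative only: not taken from the PR.
-- THROW always raises at severity 16; SqlClient reports errors of
-- severity 11 and above to the caller as a SqlException.
THROW 50000, 'Deliberate failure for the activity ID test.', 1;
```

A deterministic error of this kind exercises the failure path without relying on a real deadlock occurring.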

Besides this, there are a few QoL improvements:

  • One test method is now refactored into three coherent test cases. This also makes the reflection work to retrieve the test case name unnecessary.
  • Added XML documentation which describes the tests (and the reason why they're marked as flaky).
  • A new FlushResultSet helper was added in an earlier PR, and we now use it.

Issues

Contributes to #3453.

Testing

One new test case. All three XEvents tests pass, but I don't think I can easily reproduce the same kind of load. Could someone run CI against this PR multiple times at peak load please?

* Refactor one test method into three distinct test cases.
* Add reasons for the tests being marked as flaky.
* Switch call to 'sp_help' to a new, simpler SP.
* Switch execution of 'SELECT @@Version' to a simpler 'SELECT 1' statement.
* Use new FlushResultSet helper.
* Simplify XEvent session name generation.
* Add test case to verify that an activity ID is recorded in the extended event when the SQL statement throws an error.
@edwardneal edwardneal requested a review from a team as a code owner May 7, 2026 04:59
@github-project-automation github-project-automation Bot moved this to To triage in SqlClient Board May 7, 2026
@cheenamalhotra cheenamalhotra moved this from To triage to In review in SqlClient Board May 7, 2026
@cheenamalhotra cheenamalhotra added the Area\Tests Issues that are targeted to tests or test projects label May 7, 2026
@cheenamalhotra cheenamalhotra added this to the 7.1.0-preview2 milestone May 7, 2026
@cheenamalhotra
Member

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).


Labels

Area\Tests Issues that are targeted to tests or test projects

Projects

Status: In review

4 participants