Skip to content

Python: Re-enable temporarily skipped Python integration tests (merge-queue xdist runner crash) #6777

Description

@giles17

Summary

While unblocking #6753 (a routine version bump), the Python integration test jobs in the merge queue were crashing the pytest-xdist runner — every worker reported node down: Not properly terminated and the whole pytest session collapsed (reported as e.g. "20 failed"). This is an environmental/flaky failure unrelated to the version-bump diff.

To unblock the merge queue, a set of integration tests was temporarily skipped. This issue tracks everything that was skipped so it can be re-enabled and properly fixed.

Background

  • The affected jobs run on a large hosted runner; pytest -n logical spawns ~20 xdist workers (gw0gw19) against live Azure/OpenAI services.
  • Reducing workers to -n 4 did not help — the runner still spawned 20 workers and collapsed identically, so worker count is not the controlling factor.
  • The crash signature is a whole-session collapse, not individual assertion failures.

Example failing runs:

What was skipped

1. Azure OpenAI integration tests (entire integration set)

Skipped by changing the per-file skip_if_azure_openai_integration_tests_disabled guard from a conditional pytest.mark.skipif(...) to an unconditional pytest.mark.skip(...). Only the integration tests are skipped; the unit tests in these files still run.

packages/openai/tests/openai/test_openai_chat_client_azure.py

  • test_integration_web_search
  • test_integration_client_file_search
  • test_integration_client_file_search_streaming
  • test_integration_client_agent_hosted_mcp_tool
  • test_integration_client_agent_hosted_code_interpreter_tool
  • test_integration_client_agent_existing_session
  • test_azure_openai_chat_client_tool_rich_content_image
  • (all @pytest.mark.integration tests in this file)

packages/openai/tests/openai/test_openai_chat_completion_client_azure.py

  • test_azure_openai_chat_completion_client_response
  • test_azure_openai_chat_completion_client_response_tools
  • test_azure_openai_chat_completion_client_streaming
  • test_azure_openai_chat_completion_client_streaming_tools
  • test_azure_openai_chat_completion_client_agent_basic_run
  • test_azure_openai_chat_completion_client_agent_basic_run_streaming
  • test_azure_openai_chat_completion_client_agent_session_persistence
  • test_azure_openai_chat_completion_client_agent_existing_session
  • test_azure_chat_completion_client_agent_level_tool_persistence

packages/openai/tests/openai/test_openai_embedding_client_azure.py

  • test_azure_openai_get_embeddings
  • test_azure_openai_get_embeddings_multiple
  • test_azure_openai_get_embeddings_with_dimensions

2. Azure Functions / Durable Task integration tests (5 specific tests)

Skipped individually with @pytest.mark.skip(...) because these were the concrete failing/flaky tests:

  • packages/azurefunctions/tests/integration_tests/test_02_multi_agent.py::TestSampleMultiAgent::test_weather_agentrequests.exceptions.ReadTimeout (read timeout=30)
  • packages/azurefunctions/tests/integration_tests/test_11_workflow_parallel.py::TestWorkflowParallel::test_parallel_workflow_end_to_endworker 'gw2' crashed (likely triggers the cascade)
  • packages/durabletask/tests/integration_tests/test_02_dt_multi_agent.py::TestMultiAgent::test_weather_agent_with_toolAssertionError: assert 0 > 0 (empty AgentResponse.text)
  • packages/durabletask/tests/integration_tests/test_02_dt_multi_agent.py::TestMultiAgent::test_math_agent_with_toolAssertionError: assert 0 > 0 (empty AgentResponse.text; same flakiness as test_weather_agent_with_tool)
  • packages/durabletask/tests/integration_tests/test_06_dt_multi_agent_orchestration_conditionals.py::TestMultiAgentOrchestrationConditionals::test_conditional_branchingTimeoutError: Timed-out waiting for the orchestration to complete

How to re-enable

  • Azure OpenAI: restore the original pytest.mark.skipif(...) env-based guard in each of the three files.
  • Functions / Durable Task: remove the four @pytest.mark.skip(...) decorators.

Follow-up / root cause to investigate

  • Why does the Azure OpenAI integration job collapse the runner under high xdist parallelism (resource exhaustion / OOM / connection limits against live services)?
  • Whether the parallel-workflow Functions test (test_parallel_workflow_end_to_end) is the trigger for the worker crash cascade.
  • Whether these live-service integration tests should gate the merge queue at all, vs. running post-merge / on a schedule.

Related PR: #6753

Metadata

Metadata

Assignees

Labels

pythonUsage: [Issues, PRs], Target: Python

Type

No type
No fields configured for issues without a type.

Projects

Status
No status

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions