feat: accuracy issuer inherits perf concurrency in online mode (#357) #379

Open
arekay-nv wants to merge 1 commit into main from arekay/cherry_pick_accuracy_configs
@arekay-nv

Collaborator

When the performance phase runs the CONCURRENCY load pattern (online), the accuracy phase now mirrors that same fixed concurrency instead of always bursting at MAX_THROUGHPUT, so evaluation exercises the endpoint the same way as the performance run.

All other patterns are unchanged: POISSON and offline MAX_THROUGHPUT perf phases keep the accuracy phase at MAX_THROUGHPUT, since inheriting POISSON would silently rate-limit evaluation to the perf QPS (no accuracy QPS-budgeting yet). The gate is purely load_pattern.type == CONCURRENCY, which the schema already constrains to online mode.
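The gate described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the `LoadPatternType` enum, the `LoadPattern` dataclass, its `target_qps` field, and the `accuracy_load_pattern` helper are all hypothetical names chosen for the example; only the pattern names and `target_concurrency` come from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LoadPatternType(Enum):
    # Pattern names taken from the PR description; the enum itself is hypothetical.
    MAX_THROUGHPUT = "max_throughput"
    POISSON = "poisson"
    CONCURRENCY = "concurrency"


@dataclass(frozen=True)
class LoadPattern:
    # Hypothetical stand-in for the project's load-pattern config object.
    type: LoadPatternType
    target_concurrency: Optional[int] = None
    target_qps: Optional[float] = None


def accuracy_load_pattern(perf: LoadPattern) -> LoadPattern:
    """Choose the accuracy-phase pattern from the performance-phase pattern."""
    # The gate is purely the pattern type: only CONCURRENCY is inherited.
    if perf.type is LoadPatternType.CONCURRENCY:
        return LoadPattern(
            LoadPatternType.CONCURRENCY,
            target_concurrency=perf.target_concurrency,
        )
    # POISSON and MAX_THROUGHPUT perf phases both leave accuracy at
    # MAX_THROUGHPUT, so evaluation is never rate-limited to the perf QPS.
    return LoadPattern(LoadPatternType.MAX_THROUGHPUT)
```

Under these assumptions, a perf pattern of `LoadPattern(LoadPatternType.CONCURRENCY, target_concurrency=8)` yields an accuracy pattern with the same concurrency, while a POISSON perf pattern yields plain MAX_THROUGHPUT with no QPS carried over.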

Also logs the accuracy issuer's chosen load mode (pattern + target_concurrency) per accuracy dataset. Adds unit tests for the concurrency-inheritance, POISSON-stays-max-throughput, offline-stays-max-throughput, and logging cases.
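A per-dataset log line of the kind described might look like the sketch below. The logger name, the message format, and the `log_accuracy_load_mode` helper are assumptions made for illustration; the PR states only that the pattern and target_concurrency are logged for each accuracy dataset.

```python
import logging
from typing import Optional

# Hypothetical logger name; the project's real logger is not shown in the PR text.
logger = logging.getLogger("accuracy_issuer_sketch")


def log_accuracy_load_mode(
    dataset_name: str,
    pattern: str,
    target_concurrency: Optional[int] = None,
) -> str:
    """Log the chosen accuracy load mode for one dataset and return the text."""
    # Assumed message format: one line per accuracy dataset.
    message = "accuracy dataset %s: load pattern=%s, target_concurrency=%s"
    logger.info(message, dataset_name, pattern, target_concurrency)
    # Returning the rendered text makes the helper easy to assert on in tests.
    return message % (dataset_name, pattern, target_concurrency)
```

A unit test for the logging case could capture the record with `unittest.TestCase.assertLogs` and check that the pattern name and concurrency value appear in the output.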

What does this PR do?

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@arekay-nv arekay-nv requested a review from a team June 27, 2026 02:18
@github-actions

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@github-actions github-actions Bot requested a review from nvzhihanj June 27, 2026 02:19

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the benchmark execution logic so that the accuracy phase mirrors the fixed concurrency of the performance phase when a CONCURRENCY load pattern is used, while continuing to default to MAX_THROUGHPUT for other patterns (such as POISSON). It also adds logging for the accuracy issuer's load mode and includes comprehensive unit tests to verify these behaviors. There are no review comments, and I have no additional feedback to provide.

