test: add comprehensive test coverage for Passmark framework #46

Open
chemicoholic21 wants to merge 1 commit into bug0inc:main from chemicoholic21:agent/tiger-amber-t7uc
@chemicoholic21

Addresses Issue #2 by expanding test coverage across all priority areas:

PRIORITY 1 — Placeholder Resolution (43 tests)

  • {{run.*}} placeholder generation and validation
  • {{global.*}} placeholder persistence across executionIds
  • {{data.*}} placeholder resolution from Redis
  • {{email.*}} lazy resolution with provider integration
  • Malformed placeholder handling and error cases
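
For reviewers, the behavior these tests pin down can be pictured with a minimal sketch. It assumes placeholders have the shape `{{namespace.key}}` and that malformed or unknown ones raise an error. The names `resolvePlaceholders` and `PlaceholderContext` and the error messages are hypothetical, not Passmark's actual API, and `{{email.*}}` is left out because its lazy, provider-backed resolution would need async calls.

```typescript
// Hypothetical sketch; names and error messages are assumptions, not Passmark's API.
type Namespace = "run" | "global" | "data";
type PlaceholderContext = Record<Namespace, Map<string, string>>;

function resolvePlaceholders(text: string, ctx: PlaceholderContext): string {
  // Match every {{...}} span, then validate its contents as "namespace.key".
  return text.replace(/\{\{([^{}]*)\}\}/g, (match: string, inner: string) => {
    const parsed = /^(\w+)\.(\w+)$/.exec(inner.trim());
    if (!parsed) {
      throw new Error(`Malformed placeholder: ${match}`);
    }
    const ns = parsed[1];
    const key = parsed[2];
    // Own-property check so names like "toString" are not treated as namespaces.
    if (!Object.prototype.hasOwnProperty.call(ctx, ns)) {
      throw new Error(`Unknown placeholder namespace: ${match}`);
    }
    const value = ctx[ns as Namespace].get(key);
    if (value === undefined) {
      throw new Error(`Unresolved placeholder: ${match}`);
    }
    return value;
  });
}
```

Throwing on malformed input, rather than leaving the raw `{{...}}` text in place, is one possible design; the PR only states that malformed placeholders and error cases are covered.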

PRIORITY 2 — Caching Logic (31 tests)

  • Cache HIT: uses cached actions without AI calls
  • Cache MISS: AI execution and Redis storage
  • bypassCache flag at step and runSteps level
  • Playwright retry detection bypasses cache
  • Auto-healing on cached action failures
  • Cache key scoping by userFlow + description
  • All cached action types (click, fill, hover, etc.)
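
The HIT/MISS/bypass behavior listed above can be sketched roughly as follows. This is an illustration under stated assumptions: the key format, the `getAction` helper, and the `CachedAction` shape are hypothetical, the store is a plain `Map<string, string>` standing in for Redis, and everything is synchronous for brevity, whereas the real code path presumably is async.

```typescript
// Hypothetical sketch of cache HIT / MISS / bypass; not Passmark's implementation.
type CachedAction = { type: "click" | "fill" | "hover"; selector: string; value?: string };

// Assumed key format: scoping by userFlow + description, as the PR describes.
function cacheKey(userFlow: string, description: string): string {
  return `${userFlow}::${description}`;
}

function getAction(
  store: Map<string, string>,
  userFlow: string,
  description: string,
  generate: () => CachedAction, // stands in for the AI call
  options: { bypassCache?: boolean } = {}
): { action: CachedAction; fromCache: boolean } {
  const key = cacheKey(userFlow, description);
  if (!options.bypassCache) {
    const raw = store.get(key);
    if (raw !== undefined) {
      // Cache HIT: reuse the stored action, no AI call.
      return { action: JSON.parse(raw) as CachedAction, fromCache: true };
    }
  }
  // Cache MISS (or bypass): call the generator and store the result.
  const action = generate();
  store.set(key, JSON.stringify(action));
  return { action, fromCache: false };
}
```

In a test, counting how often `generate` runs is a simple way to assert that a HIT made no AI call.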

PRIORITY 3 — Assertion Consensus (20 tests)

  • Both models agree TRUE → assertion passes
  • Both models agree FALSE → assertion fails
  • Models disagree → arbiter makes final call
  • Arbiter agreement with Claude or Gemini
  • API error handling with retry logic
  • Effort level affects thinking mode
  • Custom images support
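
The consensus rule exercised by these tests reduces to a small decision function. The sketch below assumes each model returns a plain boolean and that the arbiter is consulted only on disagreement; `decideAssertion` and its return shape are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch of two-model consensus with an arbiter tie-break.
interface ConsensusResult {
  passed: boolean;
  usedArbiter: boolean;
}

function decideAssertion(
  firstVerdict: boolean,
  secondVerdict: boolean,
  arbiter: () => boolean // only invoked when the two models disagree
): ConsensusResult {
  if (firstVerdict === secondVerdict) {
    // Agreement (both TRUE or both FALSE) settles the assertion directly.
    return { passed: firstVerdict, usedArbiter: false };
  }
  // Disagreement: the arbiter makes the final call.
  return { passed: arbiter(), usedArbiter: true };
}
```

Passing the arbiter as a callback keeps it lazy, so a test can assert it was never called when the models agree.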

PRIORITY 4 — configure() and Model Slots (45 tests)

  • Model override for all 7 model slots
  • configure() merges settings across calls
  • Gateway switching (vercel, openrouter, cloudflare, none)
  • resolveAI() precedence: step > call > global > default
  • CUA model lock validation
  • Clear error messages for misconfigurations
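
The precedence order `step > call > global > default` can be illustrated with a short sketch. The real `resolveAI()` signature is not shown in this PR, so `resolveAISketch`, `AISettings`, and the two fields used here are assumptions made for illustration only.

```typescript
// Hypothetical sketch of "step > call > global > default" precedence.
type Gateway = "vercel" | "openrouter" | "cloudflare" | "none";

interface AISettings {
  model: string;
  gateway: Gateway;
}
type AIOverrides = Partial<AISettings>;

function resolveAISketch(
  defaults: AISettings,
  globalConfig: AIOverrides = {},
  callConfig: AIOverrides = {},
  stepConfig: AIOverrides = {}
): AISettings {
  // For each field, the most specific level that defines a value wins.
  return {
    model: stepConfig.model ?? callConfig.model ?? globalConfig.model ?? defaults.model,
    gateway: stepConfig.gateway ?? callConfig.gateway ?? globalConfig.gateway ?? defaults.gateway,
  };
}
```

Resolving field by field means a step can override only the model while still inheriting the gateway from a broader level.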

PRIORITY 5 — runSteps() Failure Modes (25 tests)

  • Step failures throw descriptive StepExecutionError
  • waitUntil timeout handling
  • Empty steps array gracefully handled
  • undefined/null page handling
  • Script step validation and failures
  • CUA mode error handling
  • Assertion skipping on step failure
  • Data extraction failures
  • Redis unavailability fallback
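
Two of the failure modes above, descriptive step errors and an empty steps array, can be sketched as follows. The PR names `StepExecutionError` but does not show its fields, so the constructor arguments, the message format, and the synchronous `runStepsSketch` helper are all assumptions.

```typescript
// Hypothetical sketch; the real StepExecutionError and runSteps() may differ.
interface Step {
  description: string;
  run: () => void;
}

class StepExecutionError extends Error {
  readonly stepIndex: number;
  readonly description: string;
  readonly original: unknown;

  constructor(stepIndex: number, description: string, original: unknown) {
    const reason = original instanceof Error ? original.message : String(original);
    super(`Step ${stepIndex + 1} ("${description}") failed: ${reason}`);
    this.name = "StepExecutionError";
    this.stepIndex = stepIndex;
    this.description = description;
    this.original = original;
    // Keep instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Runs steps in order and returns how many completed; an empty array is a no-op.
function runStepsSketch(steps: Step[]): number {
  let completed = 0;
  steps.forEach((step, index) => {
    try {
      step.run();
    } catch (err) {
      // Wrap the failure so the caller sees which step broke and why.
      throw new StepExecutionError(index, step.description, err);
    }
    completed++;
  });
  return completed;
}
```

Carrying the step index and description on the error lets a test assert on structured fields instead of parsing the message.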

Test Infrastructure

  • All tests use proper vi.mock() patterns
  • In-memory Redis mock (Map<string, string>)
  • AI SDK mocks prevent real API calls
  • Playwright page stubs for automation
  • Comprehensive edge case coverage

All 298 tests passing with full isolation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@chemicoholic21 changed the title from "Add comprehensive test coverage for Passmark framework" to "test: add comprehensive test coverage for Passmark framework" on May 1, 2026