test: add comprehensive test coverage for Passmark framework by chemicoholic21 · Pull Request #46 · bug0inc/passmark

chemicoholic21 · 2026-05-01T21:12:43Z

Addresses Issue #2 by expanding test coverage across all priority areas:

PRIORITY 1 — Placeholder Resolution (43 tests)

- {{run.*}} placeholder generation and validation
- {{global.*}} placeholder persistence across executionIds
- {{data.*}} placeholder resolution from Redis
- {{email.*}} lazy resolution with provider integration
- Malformed placeholder handling and error cases
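
For reviewers unfamiliar with the placeholder syntax, the behavior these tests target can be sketched roughly as below. This is an illustrative sketch only, not Passmark's resolver: `resolvePlaceholders`, `ScopeValues`, and the error messages are hypothetical names, and the real `{{data.*}}` and `{{email.*}}` scopes are resolved from Redis and an email provider rather than from an in-memory object.

```typescript
// Hypothetical sketch: names and error messages are assumptions, not Passmark's API.
type Scope = "run" | "global" | "data" | "email";
type ScopeValues = Record<Scope, Record<string, string>>;

// Matches {{ ... }} and captures the trimmed inner text.
const PLACEHOLDER = /\{\{\s*([^{}]*?)\s*\}\}/g;

function resolvePlaceholders(template: string, values: ScopeValues): string {
  return template.replace(PLACEHOLDER, (match: string, inner: string) => {
    // A well-formed placeholder is "<scope>.<key>" with a known scope.
    const parts = /^(run|global|data|email)\.([A-Za-z0-9_]+)$/.exec(inner);
    if (!parts) {
      throw new Error(`Malformed placeholder: ${match}`);
    }
    const scope = parts[1] as Scope;
    const key = parts[2];
    const value = values[scope][key];
    if (value === undefined) {
      throw new Error(`Unresolved placeholder: ${match}`);
    }
    return value;
  });
}
```

Throwing on malformed or unknown placeholders, rather than leaving them in the output, is one possible design; it keeps a typo such as `{{rum.email}}` from silently reaching the page under test.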

PRIORITY 2 — Caching Logic (31 tests)

- Cache HIT: uses cached actions without AI calls
- Cache MISS: AI execution and Redis storage
- bypassCache flag at step and runSteps level
- Playwright retry detection bypasses cache
- Auto-healing on cached action failures
- Cache key scoping by userFlow + description
- All cached action types (click, fill, hover, etc.)
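
The hit, miss, and bypass paths listed above can be pictured with a small sketch. Everything here is assumed for illustration: `ActionStore`, `cacheKey`, and `resolveAction` are hypothetical names, the key format is a guess at what "scoping by userFlow + description" might look like, and retry detection and auto-healing are left out.

```typescript
// Hypothetical sketch of cache HIT / MISS / bypass; not Passmark's implementation.
type CachedAction = { type: "click" | "fill" | "hover"; selector: string; value?: string };

interface ActionStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

// Scope the key by both userFlow and description so identical step text
// in different flows does not share a cache entry.
function cacheKey(userFlow: string, description: string): string {
  return JSON.stringify([userFlow, description]);
}

async function resolveAction(
  store: ActionStore,
  userFlow: string,
  description: string,
  planWithAI: () => Promise<CachedAction>,
  bypassCache: boolean = false,
): Promise<{ action: CachedAction; source: "cache" | "ai" }> {
  const key = cacheKey(userFlow, description);
  if (!bypassCache) {
    const hit = await store.get(key);
    if (hit !== null) {
      // Cache HIT: reuse the stored action without calling the model.
      return { action: JSON.parse(hit) as CachedAction, source: "cache" };
    }
  }
  // Cache MISS or bypass: ask the model, then store the result for next time.
  const action = await planWithAI();
  await store.set(key, JSON.stringify(action));
  return { action, source: "ai" };
}
```

In this sketch a bypassed call still writes its fresh result back to the store; whether the real framework does the same is not stated in this PR.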

PRIORITY 3 — Assertion Consensus (20 tests)

- Both models agree TRUE → assertion passes
- Both models agree FALSE → assertion fails
- Models disagree → arbiter makes final call
- Arbiter agreement with Claude or Gemini
- API error handling with retry logic
- Effort level affects thinking mode
- Custom images support
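
The agree/disagree rule in the first three items reduces to a small decision function. The sketch below is a simplified model of that rule under stated assumptions: `decideAssertion` is a hypothetical name, verdicts are plain booleans, and the real flow's model calls, retries, and effort levels are omitted.

```typescript
// Hypothetical sketch of the consensus rule; not Passmark's implementation.
type ConsensusResult = { passed: boolean; arbiterUsed: boolean };

function decideAssertion(
  firstVerdict: boolean,
  secondVerdict: boolean,
  arbiter: () => boolean,
): ConsensusResult {
  // When both models agree, their shared verdict stands and the arbiter is never consulted.
  if (firstVerdict === secondVerdict) {
    return { passed: firstVerdict, arbiterUsed: false };
  }
  // On disagreement, the arbiter's verdict is final.
  return { passed: arbiter(), arbiterUsed: true };
}
```

Passing the arbiter as a callback keeps it lazy, which is what lets a test verify that agreement never triggers a third model call.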

PRIORITY 4 — configure() and Model Slots (45 tests)

- Model override for all 7 model slots
- configure() merges settings across calls
- Gateway switching (vercel, openrouter, cloudflare, none)
- resolveAI() precedence: step > call > global > default
- CUA model lock validation
- Clear error messages for misconfigurations
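
The precedence order step > call > global > default can be illustrated with nullish coalescing. This is a sketch under assumptions: `SketchAIOptions` and `resolveWithPrecedence` are hypothetical stand-ins, only two fields are modeled, and the actual `resolveAI()` signature is not shown in this PR.

```typescript
// Hypothetical sketch of option precedence; not the real resolveAI().
type Gateway = "vercel" | "openrouter" | "cloudflare" | "none";

interface SketchAIOptions {
  model?: string;
  gateway?: Gateway;
}

function resolveWithPrecedence(
  defaults: Required<SketchAIOptions>,
  globalCfg: SketchAIOptions,
  callCfg: SketchAIOptions,
  stepCfg: SketchAIOptions,
): Required<SketchAIOptions> {
  // Each field is resolved independently: the most specific level that sets it wins.
  return {
    model: stepCfg.model ?? callCfg.model ?? globalCfg.model ?? defaults.model,
    gateway: stepCfg.gateway ?? callCfg.gateway ?? globalCfg.gateway ?? defaults.gateway,
  };
}
```

Resolving per field means a step can override the model while still inheriting the gateway set at call or global level.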

PRIORITY 5 — runSteps() Failure Modes (25 tests)

- Step failures throw descriptive StepExecutionError
- waitUntil timeout handling
- Empty steps array gracefully handled
- undefined/null page handling
- Script step validation and failures
- CUA mode error handling
- Assertion skipping on step failure
- Data extraction failures
- Redis unavailability fallback
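
To make "descriptive StepExecutionError" and "assertion skipping on step failure" concrete, here is one way such behavior could look. The error's fields, the message format, and `runStepsSketch` are all assumptions made for illustration; this PR names `StepExecutionError` but does not show its shape.

```typescript
// Hypothetical shape for the error and runner; the real classes may differ.
class StepExecutionError extends Error {
  readonly stepIndex: number;
  readonly description: string;
  readonly original: unknown;

  constructor(stepIndex: number, description: string, original: unknown) {
    const reason = original instanceof Error ? original.message : String(original);
    super(`Step ${stepIndex + 1} ("${description}") failed: ${reason}`);
    this.name = "StepExecutionError";
    this.stepIndex = stepIndex;
    this.description = description;
    this.original = original;
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, StepExecutionError.prototype);
  }
}

type SketchStep = { description: string; run: () => Promise<void> };

// Runs steps in order and returns how many completed.
// The first failure is wrapped and rethrown, so later steps (including assertions) never run.
async function runStepsSketch(steps: SketchStep[]): Promise<number> {
  let completed = 0;
  for (let index = 0; index < steps.length; index += 1) {
    const step = steps[index];
    try {
      await step.run();
    } catch (err) {
      throw new StepExecutionError(index, step.description, err);
    }
    completed += 1;
  }
  return completed;
}
```

An empty array simply completes zero steps, which matches the "gracefully handled" wording above without needing a special case.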

Test Infrastructure

- All tests use proper vi.mock() patterns
- In-memory Redis mock (Map<string, string>)
- AI SDK mocks prevent real API calls
- Playwright page stubs for automation
- Comprehensive edge case coverage
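
An in-memory Redis mock backed by `Map<string, string>` might look like the sketch below. `InMemoryRedis` and its method set are assumptions; the mock in this PR may expose more commands or a different shape. In a Vitest suite, a class like this could be returned from a `vi.mock()` factory in place of the real client module.

```typescript
// Hypothetical in-memory stand-in for a Redis client; covers only get/set/del.
class InMemoryRedis {
  private store = new Map<string, string>();

  // Mirrors Redis GET: returns null when the key is absent.
  async get(key: string): Promise<string | null> {
    const value = this.store.get(key);
    return value === undefined ? null : value;
  }

  // Mirrors Redis SET without options: always replies "OK".
  async set(key: string, value: string): Promise<"OK"> {
    this.store.set(key, value);
    return "OK";
  }

  // Mirrors Redis DEL for a single key: replies with the number of keys removed.
  async del(key: string): Promise<number> {
    return this.store.delete(key) ? 1 : 0;
  }

  // Test-only helper: wipe all state so each test starts isolated.
  clear(): void {
    this.store.clear();
  }
}
```

Calling `clear()` in a `beforeEach` hook is one simple way to get the isolation between tests that this PR claims.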

All 298 tests passing with full isolation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

chemicoholic21 changed the title from ~~Add comprehensive test coverage for Passmark framework~~ to test: add comprehensive test coverage for Passmark framework on May 1, 2026

chemicoholic21 wants to merge 1 commit into bug0inc:main from chemicoholic21:agent/tiger-amber-t7uc
