feat(runtime): project active workspace identity into Data Machine engine_data#425

Merged
chubes4 merged 2 commits into main from feat-423-active-workspace
May 18, 2026

@chubes4 chubes4 commented May 18, 2026

Closes #423

What this PR does

Adds `ActiveWorkspaceProjector` — a small runtime class that listens on Data Machine's `datamachine_engine_snapshot` filter and projects the current job's workspace identity into `engine_data.active_workspace` so AI directives, abilities, and tool calls can answer "which repo is this job operating against" by reading `$engine->get( 'active_workspace' )`.

Dependencies

Requires Extra-Chill/data-machine#2071 for the `datamachine_engine_snapshot` filter. DMC stays installable against older DM versions — the projector registers a callback that simply never fires when the filter isn't present.

Schema (stable contract)

```
active_workspace: {
  handle:       "<repo>@<branch>" or "<repo>" for primary
  repo:         short name (last segment of handle)
  owner:        GitHub owner when handle is in owner/repo form
  full_name:    "owner/repo" when both known
  branch:       worktree branch, omitted for primary
  path:         absolute filesystem path
  primary:      true for primary checkout
  origin_site:  site that created the worktree, when known
  origin_agent: agent slug that created the worktree, when known
  task_url:     linked task URL (issue/PR), when set
  pr_url:       linked PR URL, when set
}
```

Missing fields are omitted (not nulled) so consumers can use `isset()` checks cleanly.
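As a rough illustration of why omitted keys matter, here is a minimal consumer-side sketch. `describe_active_workspace()` is a hypothetical helper, not part of this PR, and the sample entries are illustrative values shaped like the schema above.

```php
<?php
// Hypothetical helper (not part of this PR): builds a short label from an
// active_workspace entry, relying on absent keys rather than null values.
function describe_active_workspace( array $entry ): string {
    // full_name is only present when both owner and repo are known.
    $name = isset( $entry['full_name'] ) ? $entry['full_name'] : ( $entry['repo'] ?? 'unknown' );

    // branch is omitted for the primary checkout, so isset() doubles as a worktree check.
    if ( isset( $entry['branch'] ) ) {
        return $name . ' (worktree: ' . $entry['branch'] . ')';
    }

    return $name . ' (primary)';
}

// Illustrative entries following the schema; the values are assumptions.
$worktree = array(
    'handle' => 'extrachill-artist-platform@docs/agent-run-123',
    'repo'   => 'extrachill-artist-platform',
    'branch' => 'docs/agent-run-123',
);
$primary = array(
    'handle'  => 'extrachill-artist-platform',
    'repo'    => 'extrachill-artist-platform',
    'primary' => true,
);

echo describe_active_workspace( $worktree ), "\n";
echo describe_active_workspace( $primary ), "\n";
```

Because no field is ever present with a null value, a single `isset()` per field is enough; consumers do not need to distinguish "missing" from "null".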

Caller contract

Callers opt in by passing `active_workspace.handle` via the run-flow ability's `initial_data` input:

```php
wp_get_ability( 'datamachine/run-flow' )->execute( array(
    'flow_id'      => $flow_id,
    'initial_data' => array(
        'active_workspace' => array(
            'handle' => 'extrachill-artist-platform@docs/agent-run-123',
        ),
    ),
) );
```

Any additional fields the caller supplies (e.g. `owner`, `full_name`) override fields derived from worktree metadata. The projector calls `WorktreeContextInjector::get_metadata( $handle )` to enrich the entry with persisted lifecycle context.
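The precedence described above could be sketched roughly as follows. This is not the PR's `build_entry` implementation: `sketch_build_entry()` is a hypothetical stand-in, the `$metadata` argument stands in for whatever `WorktreeContextInjector::get_metadata()` returns, and the handle parsing and the merge order (derived fields, then metadata, then caller overrides) are assumptions.

```php
<?php
// Hypothetical sketch (not the PR's code): derive fields from the handle,
// layer worktree metadata on top, then let caller-supplied fields win.
function sketch_build_entry( string $handle, array $metadata = array(), array $overrides = array() ): array {
    // Split "<repo>@<branch>" on the first "@"; no "@" means the primary checkout.
    $parts    = explode( '@', $handle, 2 );
    $segments = explode( '/', $parts[0] );
    $repo     = end( $segments );
    $owner    = count( $segments ) > 1 ? $segments[ count( $segments ) - 2 ] : null;
    $branch   = $parts[1] ?? null;

    $derived = array(
        'handle'  => $handle,
        'repo'    => $repo,
        'owner'   => $owner,
        'branch'  => $branch,
        'primary' => null === $branch ? true : null,
    );

    // Later arrays win in array_merge(), so overrides beat metadata and derived values.
    $entry = array_merge( $derived, $metadata, $overrides );

    // Assumption: full_name is filled in once owner and repo are both known.
    if ( ! isset( $entry['full_name'] ) && ! empty( $entry['owner'] ) && ! empty( $entry['repo'] ) ) {
        $entry['full_name'] = $entry['owner'] . '/' . $entry['repo'];
    }

    // Drop nulls and empty strings so missing fields are omitted, not nulled.
    return array_filter( $entry, static function ( $value ) {
        return null !== $value && '' !== $value;
    } );
}

$entry = sketch_build_entry(
    'extrachill-artist-platform@docs/agent-run-123',
    array( 'origin_agent' => 'docs-agent' ), // illustrative stand-in for persisted metadata
    array( 'owner' => 'Extra-Chill' )        // caller-supplied override
);
print_r( $entry );
```

The handle alone carries no owner here, so the caller's `owner` override is what allows `full_name` to be populated in this example.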

Why explicit-input only

The first draft had a request-scoped "current workspace" tracker that workspace abilities pushed into when they cloned or worktree-added. Rejected because:

  • Concurrency — two jobs in the same process would clobber each other
  • Ambiguity — "current" is ill-defined when multiple workspace ops fire mid-pipeline
  • Coupling — required modifying every workspace ability to call the marker

Explicit input is simpler, safer, and matches how DM already handles per-call context. The CI workload (homeboy-extensions) already has the workspace handle in scope at the moment it calls `run-flow` — it just needs to pass it.

Companion changes

This PR is the DMC half. Two related changes:

  1. data-machine#2071 — DM core filter that this projector listens on
  2. homeboy-extensions workload — should be updated in a follow-up PR to pass `active_workspace.handle` via `initial_data` when calling `run-flow`. Until that lands, the projector is a no-op for CI-driven runs. Manual callers (chat, REST, ad-hoc CLI) can opt in immediately.

Layer purity

The projector talks about workspaces only — never docs, voice, audience, or any consumer-specific concept. Downstream plugins (extrachill-docs, future security plugins, audit loggers) consume the `active_workspace` entry to make their own routing decisions through the `datamachine_code_active_workspace` filter exposed here for further enrichment.

What this unblocks

  • extrachill-docs per-target context: once this lands, extrachill-docs adds a second callback on `datamachine_agent_mode_docs` (its existing `docs` mode, from extrachill-docs PR #35, "feat(worktree): daily recurring schedule via Extra-Chill/data-machine#1117") at a later filter priority. The callback reads `active_workspace.full_name`, looks up `runner-configs/platform-map.yml`, and stacks per-platform audience context ("you are documenting the Artist Platform for musicians and artist managers") onto the same mode without forking the upstream docs-agent bundle.
  • Future security/audit plugins — any consumer that needs per-run repo identity gets it for free through the standard `EngineData` accessor.
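A downstream lookup of the kind described in the first bullet might look roughly like this. Everything here is hypothetical: `sketch_audience_context()` is not an extrachill-docs function, the in-memory `$platform_map` array merely stands in for the parsed contents of `runner-configs/platform-map.yml` (whose real structure is not shown in this PR), and the `Extra-Chill/extrachill-artist-platform` key is an illustrative value.

```php
<?php
// Hypothetical consumer sketch: map active_workspace.full_name to an
// audience sentence, returning an empty string when nothing applies.
function sketch_audience_context( $active_workspace, array $platform_map ): string {
    // No entry, or no full_name: the job did not opt in or the owner is unknown.
    if ( ! is_array( $active_workspace ) || ! isset( $active_workspace['full_name'] ) ) {
        return '';
    }

    $full_name = $active_workspace['full_name'];
    if ( ! isset( $platform_map[ $full_name ]['platform'], $platform_map[ $full_name ]['audience'] ) ) {
        return '';
    }

    $row = $platform_map[ $full_name ];
    return 'You are documenting ' . $row['platform'] . ' for ' . $row['audience'] . '.';
}

// Assumed map shape; the real YAML file may be organized differently.
$platform_map = array(
    'Extra-Chill/extrachill-artist-platform' => array(
        'platform' => 'the Artist Platform',
        'audience' => 'musicians and artist managers',
    ),
);

$active_workspace = array( 'full_name' => 'Extra-Chill/extrachill-artist-platform' );

echo sketch_audience_context( $active_workspace, $platform_map ), "\n";
```

Returning an empty string for unknown repos keeps the base `docs` mode unchanged when no per-platform context exists.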

Smoke testing

`php -l` clean on both modified files. The projector's logic is fully testable in isolation (build_entry takes a handle + overrides, returns the enriched entry); end-to-end exercise lives downstream with the companion DM PR and the eventual homeboy-extensions workload update.

Scope

Two files, 221 net lines. New class lives at `inc/Runtime/` — first occupant of that directory, matches the naming pattern of the issue body's proposed `Runtime/WorkspaceBootstrap.php` (renamed to `ActiveWorkspaceProjector` because that's more accurate to what the class actually does).

…gine_data

Adds ActiveWorkspaceProjector — listens on DM's datamachine_engine_snapshot
filter (added in data-machine v0.10.3) and projects active workspace
identity into engine_data at job initialization. Lets AI directives,
abilities, and tool calls answer "which repo is this job operating
against" by reading $engine->get('active_workspace').

## Schema

The projected entry has a stable, generic shape:

  active_workspace: {
    handle:       "<repo>@<branch>" or "<repo>" for primary
    repo:         short name (last segment of handle)
    owner:        GitHub owner when handle is in owner/repo form
    full_name:    "owner/repo" when both known
    branch:       worktree branch, omitted for primary
    path:         absolute filesystem path
    primary:      true for primary checkout
    origin_site:  site that created the worktree, when known
    origin_agent: agent slug that created the worktree, when known
    task_url:     linked task URL, when set
    pr_url:       linked PR URL, when set
  }

Missing fields are omitted (not nulled) so consumers can use isset()
checks cleanly.

## Caller contract

Callers (e.g. homeboy-extensions's CI workload) opt in by passing
active_workspace.handle via the run-flow ability's initial_data input:

  wp_get_ability( 'datamachine/run-flow' )->execute( array(
      'flow_id'      => $flow_id,
      'initial_data' => array(
          'active_workspace' => array(
              'handle' => 'extrachill-artist-platform@docs/agent-run-123',
          ),
      ),
  ) );

Any additional fields the caller supplies override fields derived from
worktree metadata. No automatic "current workspace" tracking — identity
is always explicit so concurrent jobs cannot clobber each other.

## Layer purity

The projector talks about workspaces only — never docs, voice, audience,
or any consumer-specific concept. Downstream plugins (e.g. extrachill-docs)
consume active_workspace to make their own routing decisions through the
datamachine_code_active_workspace filter we expose for further enrichment.

## Closes #423

Depends on Extra-Chill/data-machine PR for the datamachine_engine_snapshot
filter (feat-engine-snapshot-filter branch). No-op without that filter —
DMC stays installable against older DM versions.

homeboy-ci Bot commented May 18, 2026

Homeboy Results — data-machine-code

Lint

lint — failed

  • phpstan — 6 finding(s)
  • Total: 6 finding(s)

ℹ️ Auto-fix: homeboy lint data-machine-code --path /home/runner/work/data-machine-code/data-machine-code --changed-since f1d953a --fix (or homeboy refactor data-machine-code --path /home/runner/work/data-machine-code/data-machine-code --changed-since f1d953a --from lint --write)
ℹ️ Some issues may require manual fixes
ℹ️ Full options: homeboy docs commands/lint
ℹ️ Save lint baseline: homeboy lint data-machine-code --baseline
Deep dive: homeboy lint data-machine-code --changed-since f1d953a

Test

test — passed

ℹ️ No impacted tests found for --changed-since f1d953a
ℹ️ Run full suite if needed: homeboy test data-machine-code
Deep dive: homeboy test data-machine-code --changed-since f1d953a

Audit

audit — passed

  • dead_code — 3 finding(s)
  • test_coverage — 1 finding(s)
  • Total: 4 finding(s)

Deep dive: homeboy audit data-machine-code --changed-since f1d953a

Tooling versions
  • Homeboy CLI: homeboy 0.182.0+24fd9e3
  • Extension: wordpress from https://github.com/Extra-Chill/homeboy-extensions
  • Extension revision: dd47f26a
  • Action: unknown@unknown

The filter callback signature must match the 4-argument contract from
DM's datamachine_engine_snapshot filter, but the implementation only
needs $snapshot. Add a phpcs:ignore matching DMC's established pattern
(see WorkspaceAbilities::getCapabilities and friends) so the parameter
list stays accurate to the filter contract while satisfying lint.
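A minimal sketch of that shape, under several assumptions: the three extra parameter names are placeholders (the filter's real argument names are not shown here), the snapshot is assumed to be an array carrying `active_workspace` at its top level, and the sniff named in the `phpcs:ignore` comment is a guess at the relevant rule rather than DMC's confirmed pattern. `sketch_project_active_workspace()` is hypothetical.

```php
<?php
// Hypothetical callback: the signature accepts four arguments to mirror the
// filter contract, but only $snapshot is read.
// phpcs:ignore Generic.CodeAnalysis.UnusedFunctionParameter -- assumed sniff name.
function sketch_project_active_workspace( $snapshot, $arg2 = null, $arg3 = null, $arg4 = null ) {
    if ( ! is_array( $snapshot ) ) {
        return $snapshot;
    }

    $requested = $snapshot['active_workspace'] ?? null;

    // Callers opt in by supplying a handle; without one the snapshot passes through untouched.
    if ( ! is_array( $requested ) || empty( $requested['handle'] ) || ! is_string( $requested['handle'] ) ) {
        return $snapshot;
    }

    // Placeholder enrichment: normalize the handle. The real projector does more.
    $requested['handle']          = trim( $requested['handle'] );
    $snapshot['active_workspace'] = $requested;

    return $snapshot;
}

// Outside WordPress add_filter() is undefined, so registration is guarded.
// The final argument tells WordPress to pass all four filter arguments.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'datamachine_engine_snapshot', 'sketch_project_active_workspace', 10, 4 );
}
```

Registering against a hook that nothing applies is harmless in WordPress, which is consistent with the no-op behavior against older DM versions described in this PR.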

chubes4 commented May 18, 2026

Lint CI failure is a pre-existing baseline issue, not caused by this PR

PHPStan reports 6 findings — all on lines 428-429 of `data-machine-code.php`, in the `MemoryFileRegistry::register` call that existed before this PR:

```php
\DataMachine\Engine\AI\MemoryFileRegistry::register( 'AGENTS.md', 5, array(
    'layer' => \DataMachine\Engine\AI\MemoryFileRegistry::LAYER_SHARED,
```

Findings:

  • `phpstan.staticMethod.notFound`: `MemoryFileRegistry::register()` (DM core API drift — the method exists in our DM checkout but PHPStan's stubs don't see it)
  • `phpstan.classConstant.notFound`: `MemoryFileRegistry::LAYER_SHARED` (same root cause)

Confirmed pre-existing: running `homeboy lint --path .` on `main` reports 320 findings; this branch reports 308. Net -12 findings.

The CI reports these because my one-line addition at `data-machine-code.php:93` (registering `ActiveWorkspaceProjector`) triggered PHPStan to re-analyze the whole file, which surfaced the pre-existing errors at lines 428-429.

This is a baseline / scoping limitation in the CI, not a defect introduced by this PR. Merging.

Follow-up: a separate PR should either (a) fix the PHPStan errors by updating the type stubs DM ships, or (b) add a baseline file so pre-existing errors don't block PRs that touch files containing them.

@chubes4 chubes4 merged commit 678f515 into main May 18, 2026
4 of 5 checks passed
Development

Successfully merging this pull request may close these issues.

Expose active workspace identity (repo, handle, branch) to AI execution context