15 changes: 12 additions & 3 deletions .agent-plan.md
Original file line number Diff line number Diff line change
@@ -100,9 +100,18 @@ public early-pLTV stays calendar-only (Option A); difficulty = distortion tiers
now + simulation-level scaling deferred (issue #129). `LTV-Po.2` split into Po.2a
(config plumbing) + Po.2b (recipe + e2e). `LTV-Po.2a` (`resolve_config` +
`Generator` carry `n_customers` / `early_tenure_weeks` / `observation_date`;
lead-scoring byte-identical) opened as **#131**. Next: `LTV-Po.2b`
(b2b_saas_ltv_v1 recipe YAMLs + difficulty_params resolution + e2e round-trip —
**completes M6**).
lead-scoring byte-identical) opened as **#131**. `LTV-Po.2b` (the three
`b2b_saas_ltv_v1` recipe YAMLs — `scheme: lifecycle`, `default_population:
{n_customers: 1500}`, `narrative.yaml` with 4 industries + 3 geographies,
per-tier `difficulty_profiles.yaml`; registry auto-discovers it;
`LifecycleScheme._resolve_difficulty` resolves `difficulty_params` from the
active profile in `build_world` and carries it on `spec.config` so snapshot
distortions fire per tier; e2e `Generator.from_recipe("b2b_saas_ltv_v1").generate()`
round-trip in both modes; the two tracked-gap difficulty guards flipped) opened
as **#PENDING** — **completes LTV-M6**. `early_tenure_weeks` /
`observation_date` stay override-only (carried from Po.2a); narrative declares
≥2 industries/geographies so public `industry`/`region` keep variance (invariant
#6). **Next milestone: LTV-M7** (`LTV-Pp` — scheme-aware validation).
Note: `validate_bundle` is lead-scoring-coupled — scheme-aware validation is
`LTV-Pp`.

52 changes: 29 additions & 23 deletions docs/ltv/roadmap.md
@@ -46,7 +46,7 @@ protocol + registry, with the package physically reorganized into
| `LTV-M3` | Customer population + lifecycle world | `LTV-Ph`, `LTV-Pi` | #113 (Ph) |
| `LTV-M4` | Lifecycle simulation engine | `LTV-Pj`, `LTV-Pk` | #117 (Pj), #118 (Pk) |
| `LTV-M5` | Customer snapshots + pLTV targets (both regimes) | `LTV-Pl`, `LTV-Pm` | #119 (Pl), #120 (Pm) |
| `LTV-M6` | Register LifecycleScheme + recipe + manifest/version | `LTV-Pn.1…4`, `LTV-Po` | #121 (Pn.1), #122 (Pn.2), #124 (Pn.3), #125 (Pn.4a), #126 (Pn.4b), #127 (Pn.4c), #128 (Pn.4d) |
| `LTV-M6` | Register LifecycleScheme + recipe + manifest/version | `LTV-Pn.1…4`, `LTV-Po` | #121 (Pn.1), #122 (Pn.2), #124 (Pn.3), #125 (Pn.4a), #126 (Pn.4b), #127 (Pn.4c), #128 (Pn.4d), #130 (Po.1), #131 (Po.2a), Po.2b |
| `LTV-M7` | Validation + regression-metric calibration | `LTV-Pp` | |
| `LTV-M8` | CLI, notebooks, publish | `LTV-Pq`, `LTV-Pr`, `LTV-Ps` | |

@@ -399,28 +399,34 @@ methods, then public-safety, then the carried orchestrator cleanup:
it). Lead-scoring config resolution is byte-identical (the lifecycle fields
default-match; verified via full-bundle SHA-256 vs `main`, both modes).
- Labels: `type: refactor`, `layer: api`
- [ ] **`LTV-Po.2b`** — `feat(recipes): b2b_saas_ltv_v1 recipe assets + e2e`. The
three recipe YAMLs (`scheme: lifecycle`; `narrative.yaml` with ≥2 industries +
≥2 geographies; `difficulty_profiles.yaml`); register in the recipe registry;
resolve `difficulty_params` from the active profile in `build_world`
(mirroring lead-scoring `_resolve_difficulty`) so snapshot distortions fire
per tier; end-to-end `Generator.from_recipe("b2b_saas_ltv_v1").generate()`
round-trip. Public mode stays calendar-only (Option A, locked).
**Limitation (flagged in Po.2a review):** `early_tenure_weeks` /
`observation_date` are override-only — the `Recipe` schema has no field for
them, so the recipe.yaml CANNOT declare them; Po.2b uses the
`GenerationConfig` defaults (4 weeks; observation_date derived by the
population builder). If the recipe must declare them, extend the `Recipe`
dataclass + `from_dict` + `resolve_config` recipe-defaults read (don't rely
on override).
**Constraint (flagged in Po.1 review):** the recipe `narrative.yaml` MUST
declare ≥2 `icp_industries` and ≥2 `geographies` — Po.1 makes these drive the
public `industry`/`region` columns, so a single-value vocab yields a
zero-variance firmographic feature (student_public invariant #6 violation).
Add a test asserting both columns have ≥2 distinct values in the public
bundle.
- Tests: recipe loads, full round-trip, determinism, all task splits,
public/instructor split, per-tier distortion.
- [x] **`LTV-Po.2b`** — `feat(recipes): b2b_saas_ltv_v1 recipe assets + e2e`.
Created `leadforge/recipes/b2b_saas_ltv_v1/{recipe,narrative,difficulty_profiles}.yaml`
(`scheme: lifecycle`; `default_population: {n_customers: 1500}`; `narrative.yaml`
with 4 `icp_industries` + 3 `geographies`; per-tier difficulty profiles). The
registry auto-discovers it (no manual registration). `LifecycleScheme.build_world`
now resolves `difficulty_params` from the active profile via a new
`_resolve_difficulty` (mirroring lead-scoring, minus `category_latent_correlations`)
and carries it on the returned `spec.config`, so snapshot distortions fire per
tier. End-to-end `Generator.from_recipe("b2b_saas_ltv_v1").generate()` +
`.save()` round-trip verified in both modes; public stays calendar-only
(Option A, locked). **Completes `LTV-M6`.**
- The two existing tracked-gap guards flipped: `test_difficulty_not_yet_differentiating`
→ `test_difficulty_resolves_params_but_world_unchanged` (params now differ per
tier; the *world* stays identical — issue #129 still open); the explicit-param
`test_difficulty_params_thread_into_snapshots` → tier-based
`test_difficulty_tiers_produce_different_task_features` (since `_resolve_difficulty`
always overwrites `difficulty_params` from the profile, an explicitly-passed
one would be clobbered).
- **Limitation (carried from Po.2a):** `early_tenure_weeks` / `observation_date`
remain override-only — the recipe.yaml does NOT declare them; the bundle uses
the `GenerationConfig` defaults (4 weeks; observation_date derived by the
population builder).
- **Constraint satisfied (Po.1 review):** `narrative.yaml` declares ≥2
`icp_industries` and ≥2 `geographies`; `test_public_industry_region_features_have_variance`
asserts both public columns carry ≥2 distinct values (student_public invariant #6).
- Tests: `tests/recipes/test_b2b_saas_ltv_v1.py` (discovery, asset shape, config
resolution, build_world round-trip, narrative-driven firmographics, determinism,
both-mode bundle round-trip, byte-determinism).
- Labels: `type: feature`, `layer: recipes`, `layer: api`
- **Deferred (issue #129):** simulation-level difficulty scaling for the
lifecycle engine — making `advanced` a genuinely harder world (not just
Empty file.
52 changes: 52 additions & 0 deletions leadforge/recipes/b2b_saas_ltv_v1/difficulty_profiles.yaml
@@ -0,0 +1,52 @@
# Difficulty profiles for b2b_saas_ltv_v1
# ---------------------------------------------------------------------------
# Each profile controls the signal/noise characteristics of the generated
# pLTV dataset. Higher difficulty = more realistic noise, missing data, and
# outliers in the customer snapshot features, making the supervised task harder.
#
# Scope (LTV-Po.2b): difficulty is resolved into DifficultyParams and applied as
# SNAPSHOT DISTORTIONS (noise_scale / missing_rate / outlier_rate perturb the
# feature columns; targets are exempt). Simulation-level scaling — letting
# signal_strength / committee_friction / conversion_rate_range shape the
# underlying world itself — is tracked but NOT yet wired (issue #129). Those
# knobs are carried here so the profiles are complete and forward-compatible.

intro:
  description: >
    Clean signal, minimal noise. Suitable for learning pLTV/regression basics
    and verifying that a pipeline runs end to end.
  # Probability that the true mechanism drives the outcome (vs. noise).
  # (Carried for issue #129; not yet consumed.)
  signal_strength: 0.90
  # Scale multiplier applied to additive Gaussian noise in continuous features.
  noise_scale: 0.10
  # Fraction of feature values set to missing (NaN).
  missing_rate: 0.02
  # Fraction of rows perturbed into statistical outliers.
  outlier_rate: 0.01
  # Acceptable churn-positive-rate band. (Carried for issue #129; not yet consumed.)
  conversion_rate_range: [0.10, 0.20]
  # Strength of buying-committee-friction effects. (Carried for issue #129.)
  committee_friction: 0.10

intermediate:
  description: >
    Realistic signal-to-noise ratio. Suitable for portfolio projects,
    courses, and Kaggle-style pLTV competitions.
  signal_strength: 0.70
  noise_scale: 0.30
  missing_rate: 0.08
  outlier_rate: 0.04
  conversion_rate_range: [0.18, 0.30]
  committee_friction: 0.30

advanced:
  description: >
    High noise, realistic outliers, and significant missing data. Suitable for
    ML research and realistic pLTV benchmark construction.
  signal_strength: 0.50
  noise_scale: 0.55
  missing_rate: 0.18
  outlier_rate: 0.08
  conversion_rate_range: [0.25, 0.40]
  committee_friction: 0.55
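A minimal sketch of how the three consumed knobs (`noise_scale`, `missing_rate`, `outlier_rate`) might perturb one continuous feature column. The `Distortion` class, the `distort_column` helper, and the six-sigma outlier magnitude are hypothetical illustrations and are not the leadforge snapshot builders; only the knob names and the `intro` / `advanced` values come from the profiles above.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Distortion:
    """Hypothetical stand-in for the three knobs the snapshot builders consume."""

    noise_scale: float
    missing_rate: float
    outlier_rate: float


def distort_column(values, params, seed):
    """Return a distorted copy of one continuous feature column (illustrative only)."""
    rng = random.Random(seed)
    mean = sum(values) / len(values)
    # Population standard deviation: noise is scaled relative to the column's spread.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    out = []
    for v in values:
        # Missingness: drop the value entirely with probability missing_rate.
        if rng.random() < params.missing_rate:
            out.append(math.nan)
            continue
        # Additive Gaussian noise, scaled by noise_scale.
        v = v + rng.gauss(0.0, params.noise_scale * std)
        # Outliers: push the value far from the mean (magnitude is an assumption).
        if rng.random() < params.outlier_rate:
            v = mean + rng.choice((-1.0, 1.0)) * 6.0 * std
        out.append(v)
    return out


# Knob values copied from the intro and advanced profiles.
INTRO = Distortion(noise_scale=0.10, missing_rate=0.02, outlier_rate=0.01)
ADVANCED = Distortion(noise_scale=0.55, missing_rate=0.18, outlier_rate=0.08)

column = [float(i % 50) for i in range(5000)]
intro_col = distort_column(column, INTRO, seed=7)
advanced_col = distort_column(column, ADVANCED, seed=7)
intro_missing = sum(math.isnan(v) for v in intro_col)
advanced_missing = sum(math.isnan(v) for v in advanced_col)
```

Under these assumptions the target columns would simply never be passed through `distort_column`, which matches the "targets are exempt" note in the header comment.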
101 changes: 101 additions & 0 deletions leadforge/recipes/b2b_saas_ltv_v1/narrative.yaml
@@ -0,0 +1,101 @@
# Narrative defaults for b2b_saas_ltv_v1
# ---------------------------------------------------------------------------
# Baseline "story facts" for the mid-market B2B SaaS subscription-LTV vertical.
# The lifecycle scheme's population builder consumes market.icp_industries and
# market.geographies to drive each customer's firmographics; the remaining
# sub-specs document the world for the dataset card and future mechanisms.
#
# INVARIANT (snapshot-safety / zero-variance): market.icp_industries and
# market.geographies MUST each declare >= 2 values. The public relational
# export surfaces industry/region columns directly, so a single-value vocabulary
# would yield a zero-variance public feature (student_public invariant #6).

company:
  name: "Northwind Revenue Cloud"
  founded_year: 2016
  hq_city: "Denver"
  hq_country: "US"
  stage: "Series C"
  employee_range: [180, 320]

product:
  name: "Northwind Subscriptions"
  category: "Subscription & Revenue Lifecycle Management"
  deployment: "cloud_saas"
  pricing_model: "per_seat_annual"
  acv_range_usd: [12000, 90000]
  contract_terms_months: [12, 24, 36]
  free_trial_available: true
  demo_available: true

market:
  icp_employee_range: [150, 2500]
  icp_industries:
    - saas
    - fintech
    - healthtech
    - ecommerce
  geographies: [US, UK, CA]
  avg_deal_size_usd: 36000
  avg_sales_cycle_days: 40

gtm_motion:
  channels:
    - inbound_marketing
    - sdr_outbound
    - partner_referral
  inbound_share: 0.50
  outbound_share: 0.30
  partner_share: 0.20

personas:
  - role: cfo
    title_variants:
      - "CFO"
      - "Chief Financial Officer"
      - "VP Finance"
      - "Head of Finance"
    decision_authority: economic_buyer
    typical_involvement: late_stage

  - role: revops_manager
    title_variants:
      - "RevOps Manager"
      - "Revenue Operations Manager"
      - "Director of Revenue Operations"
      - "Head of RevOps"
    decision_authority: champion
    typical_involvement: full_cycle

  - role: billing_admin
    title_variants:
      - "Billing Administrator"
      - "Billing Operations Lead"
      - "AR Manager"
      - "Subscriptions Manager"
    decision_authority: end_user
    typical_involvement: full_cycle

  - role: customer_success_lead
    title_variants:
      - "Customer Success Lead"
      - "VP Customer Success"
      - "Head of Customer Success"
      - "Director of Customer Success"
    decision_authority: technical_evaluator
    typical_involvement: post_sale

# Post-sale lifecycle stages (the subscription journey this scheme simulates).
funnel_stages:
  - name: onboarding
    label: "Onboarding"
  - name: activated
    label: "Activated"
  - name: adopted
    label: "Adopted"
  - name: expanded
    label: "Expanded"
  - name: renewed
    label: "Renewed"
  - name: churned
    label: "Churned"
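The zero-variance invariant stated in the header of this file can be sketched as a small check over the parsed narrative. The `check_market_vocab` helper is hypothetical and is not part of leadforge; it assumes the narrative has already been loaded into a plain dict with the `market.icp_industries` and `market.geographies` keys shown above.

```python
def check_market_vocab(narrative: dict, minimum: int = 2) -> list[str]:
    """Return the market vocab keys that would yield a zero-variance public column.

    Hypothetical helper: duplicates are collapsed first, because a list such as
    ["US", "US"] still produces a single-valued column.
    """
    market = narrative.get("market", {})
    failures = []
    for key in ("icp_industries", "geographies"):
        distinct = set(market.get(key) or [])
        if len(distinct) < minimum:
            failures.append(key)
    return failures


# Values copied from the market block above.
narrative = {
    "market": {
        "icp_industries": ["saas", "fintech", "healthtech", "ecommerce"],
        "geographies": ["US", "UK", "CA"],
    }
}
ok = check_market_vocab(narrative)

# A deliberately broken narrative: one industry, one (repeated) geography.
bad = check_market_vocab(
    {"market": {"icp_industries": ["saas"], "geographies": ["US", "US"]}}
)
```

A check like this only guards the declared vocabulary; whether every value actually appears in a generated bundle still depends on the population size and sampling, which is why the PR also asserts variance on the exported public columns.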
31 changes: 31 additions & 0 deletions leadforge/recipes/b2b_saas_ltv_v1/recipe.yaml
@@ -0,0 +1,31 @@
id: b2b_saas_ltv_v1
title: "Mid-market B2B SaaS — Subscription Lifetime Value"
vertical: mid_market_b2b_saas
# Generation scheme this recipe runs (see leadforge.schemes). This recipe runs
# the customer-lifecycle / pLTV scheme rather than lead_scoring.
scheme: lifecycle
description: >
  A mid-market B2B SaaS company selling subscription revenue-lifecycle
  management software to 150–2,500 employee firms across the US, UK, and
  Canada. The simulated world tracks each signed customer through onboarding,
  adoption, expansion, payment health, and churn so that predicted lifetime
  value (pLTV over 90/365/730-day forward windows) and 180-day churn emerge
  from simulated subscription events rather than being sampled directly.
# The canonical pLTV regression target; the bundle also ships the 90d/730d
# windows and a 180-day churn classification task (see the lifecycle scheme).
primary_task: pltv_revenue_365d
supported_modes:
  - student_public
  - research_instructor
supported_difficulty:
  - intro
  - intermediate
  - advanced
default_population:
  # The lifecycle scheme is customer-centric: it samples a customer population
  # (each customer owns one account + one subscription) rather than leads.
  n_customers: 1500
# The longest forward pLTV window, in days (the engine simulates through it so
# every target is fully covered). Lifecycle forward windows are locked to the
# scheme's exported constant (90/365/730); see LifecycleScheme.build_world.
horizon_days: 730
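The relationship between `primary_task`, `horizon_days`, and the locked 90/365/730 forward windows can be sketched as a consistency check. Everything here is an assumption for illustration: `FORWARD_WINDOWS_DAYS` and `check_recipe` are hypothetical names, `horizon_days` is assumed to be a top-level recipe key, and the rule that the horizon must equal the longest window is inferred from the recipe comment rather than confirmed by the scheme code.

```python
import re

# Window lengths quoted in the recipe comments; the constant name is hypothetical.
FORWARD_WINDOWS_DAYS = (90, 365, 730)


def check_recipe(recipe: dict) -> list[str]:
    """Return human-readable problems found in a parsed lifecycle recipe dict."""
    problems = []
    if recipe.get("scheme") != "lifecycle":
        problems.append("scheme must be 'lifecycle'")
    # Assumed rule: the engine simulates through the longest forward window.
    if recipe.get("horizon_days") != max(FORWARD_WINDOWS_DAYS):
        problems.append("horizon_days must equal the longest forward window")
    # The primary task name is expected to embed one of the window lengths.
    match = re.fullmatch(r"pltv_revenue_(\d+)d", str(recipe.get("primary_task", "")))
    if match is None or int(match.group(1)) not in FORWARD_WINDOWS_DAYS:
        problems.append("primary_task must name one of the forward windows")
    return problems


# Fields copied from the recipe above.
recipe = {
    "scheme": "lifecycle",
    "primary_task": "pltv_revenue_365d",
    "default_population": {"n_customers": 1500},
    "horizon_days": 730,
}
problems = check_recipe(recipe)

# Shortening the horizon below the longest window should be flagged.
short = check_recipe({**recipe, "horizon_days": 365})
```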
79 changes: 74 additions & 5 deletions leadforge/schemes/lifecycle/__init__.py
@@ -9,6 +9,7 @@

from __future__ import annotations

import dataclasses
import random
from typing import TYPE_CHECKING, Any

@@ -58,11 +59,14 @@ def build_world(
vocabularies (``market.icp_industries`` / ``market.geographies``); a
``None`` narrative falls back to the built-in procurement-ICP defaults.

Difficulty (tracked, not silent): ``config.difficulty`` does not yet
scale the *simulation* — every tier yields the same world — so harder
tiers differ only in snapshot distortions (resolved from the recipe
profile in ``LTV-Po`` and threaded into the snapshot builders).
Simulation-level difficulty scaling is deferred (issue #129).
Difficulty: ``config.difficulty`` resolves (via :meth:`_resolve_difficulty`,
``LTV-Po.2b``) into :class:`DifficultyParams` read from the recipe's
``difficulty_profiles.yaml``, which the snapshot builders apply as
feature distortions (noise / missingness / outliers; targets exempt).
The resolved params ride on the returned bundle's ``spec.config`` so
:meth:`write_bundle` picks them up. Difficulty does NOT yet scale the
*simulation* — every tier yields the same underlying world — so
simulation-level scaling remains deferred (issue #129).
"""
from leadforge.core.exceptions import InvalidConfigError
from leadforge.core.models import WorldBundle, WorldSpec
@@ -85,6 +89,10 @@ def build_world(
"exports the fixed set). Use the default until that wiring lands."
)

# Resolve difficulty → DifficultyParams (snapshot distortions) and carry
# them on config so the returned spec + write_bundle see them.
config = self._resolve_difficulty(config)

motif_rng = RNGRoot(config.seed).child("lifecycle_motif")
motif_family = _sample_motif_family(motif_rng)

@@ -112,6 +120,67 @@ def build_world(
),
)

    @staticmethod
    def _resolve_difficulty(config: GenerationConfig) -> GenerationConfig:
        """Attach :class:`DifficultyParams` from the active difficulty profile.

        Mirrors :meth:`LeadScoringScheme._resolve_difficulty` (minus the
        lead-scoring-only ``category_latent_correlations``): loads the recipe's
        ``difficulty_profiles.yaml``, reads the profile for
        ``config.difficulty``, and returns ``config`` with the resolved
        :class:`DifficultyParams` attached. The snapshot builders consume
        ``noise_scale`` / ``missing_rate`` / ``outlier_rate``; the remaining
        knobs are carried for forward-compatible simulation-level scaling
        (issue #129).

        Returns ``config`` unchanged when the recipe has no difficulty-profiles
        file (e.g. an ad-hoc config whose recipe lacks one).
        """
        from leadforge.api.recipes import Recipe
        from leadforge.core.models import DifficultyParams
        from leadforge.recipes.registry import load_recipe

        try:
            raw = load_recipe(config.recipe_id)
            recipe = Recipe.from_dict(raw)
            profiles = recipe.load_difficulty_profiles()
        except (FileNotFoundError, KeyError):
            return config
        if not profiles:
            return config

        profile = profiles.get(config.difficulty.value, {})

        # All keys are required: a missing key indicates a malformed profile
        # YAML and should fail loudly rather than silently defaulting.
        required_keys = (
            "signal_strength",
            "noise_scale",
            "missing_rate",
            "outlier_rate",
            "conversion_rate_range",
            "committee_friction",
        )
        missing = [k for k in required_keys if k not in profile]
        if missing:
            from leadforge.core.exceptions import InvalidRecipeError

            raise InvalidRecipeError(
                f"Difficulty profile '{config.difficulty.value}' is missing "
                f"required keys: {missing}"
            )
        cr_range = profile["conversion_rate_range"]
        difficulty_params = DifficultyParams(
            signal_strength=profile["signal_strength"],
            noise_scale=profile["noise_scale"],
            missing_rate=profile["missing_rate"],
            outlier_rate=profile["outlier_rate"],
            conversion_rate_lo=cr_range[0],
            conversion_rate_hi=cr_range[1],
            committee_friction=profile["committee_friction"],
        )
        return dataclasses.replace(config, difficulty_params=difficulty_params)
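
The resolve-then-replace pattern used by `_resolve_difficulty` can be tried in isolation with stand-in types. This is a sketch under stated assumptions: `Params`, `Config`, and `resolve` are hypothetical simplifications of `DifficultyParams`, `GenerationConfig`, and the method above, the profile is passed in as a dict instead of being loaded from the recipe, and `ValueError` stands in for `InvalidRecipeError`.

```python
import dataclasses
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Params:
    """Stand-in for DifficultyParams (reduced to the three consumed knobs)."""

    noise_scale: float
    missing_rate: float
    outlier_rate: float


@dataclass(frozen=True)
class Config:
    """Stand-in for GenerationConfig (assumed here to be an immutable dataclass)."""

    difficulty: str
    difficulty_params: Optional[Params] = None


REQUIRED = ("noise_scale", "missing_rate", "outlier_rate")


def resolve(config: Config, profiles: dict) -> Config:
    """Return a copy of config carrying params from the active profile."""
    if not profiles:
        # No profiles available: leave the config untouched.
        return config
    profile = profiles.get(config.difficulty, {})
    missing = [k for k in REQUIRED if k not in profile]
    if missing:
        # Fail loudly on a malformed profile instead of defaulting silently.
        raise ValueError(f"profile '{config.difficulty}' is missing {missing}")
    params = Params(**{k: profile[k] for k in REQUIRED})
    # dataclasses.replace builds a new instance; the input config is not mutated.
    return dataclasses.replace(config, difficulty_params=params)


PROFILES = {
    "intro": {"noise_scale": 0.10, "missing_rate": 0.02, "outlier_rate": 0.01},
    "advanced": {"noise_scale": 0.55, "missing_rate": 0.18, "outlier_rate": 0.08},
}

base = Config(difficulty="advanced")
resolved = resolve(base, PROFILES)

# An explicitly passed difficulty_params is overwritten by the profile values.
explicit = Config("intro", Params(noise_scale=9.0, missing_rate=0.9, outlier_rate=0.9))
clobbered = resolve(explicit, PROFILES)
```

The `clobbered` case illustrates why the PR replaces the explicit-param test with a tier-based one: because resolution always writes `difficulty_params` from the profile, a caller-supplied value does not survive `build_world`.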

def write_bundle(
self,
bundle: WorldBundle,