[Payment due @Pujan92] Trickle Concierge suggested response #89146
marcochavezf wants to merge 28 commits into main from
Conversation
Squashed extraction of the foundation from #87110 (`issa/add-concierge-draft-streaming`): adds Pusher draft event types, the per-report ConciergeDraftContext, the conciergeDraftState reducer that builds a synthetic ReportAction from streaming events, the synthetic-action splice in ReportActionsList, and the ReportScreen wrapping. Reconciles Issa's branch with main's actionIndexMap performance optimization (now built from renderedVisibleReportActions so the synthetic-bubble index resolves correctly during a Concierge trickle). Used here only to back the client-side trickle of pregenerated Concierge responses (#626938). Issa's PR will rebase on top once it merges. Co-authored-by: Issa Nimaga <inimaga@expensify.com>
Replaces the single 4s setTimeout in usePendingConciergeResponse with a decelerating ease-out reveal driven into Issa's ConciergeDraftContext reducer (PR #87110) so the synthetic report-action graft and reconciliation behave identically to server-driven streaming. The pregenerated HTML originates client-side from <followup-response>, so the trickle is purely view-local — REPORT_ACTIONS is written exactly once at trickle completion via applyPendingConciergeAction, preserving LHN previews / push / search snapshots until the bubble is fully formed. When the canonical reportComment lands mid-trickle the loop accelerates the remaining reveal to ~1.5s instead of snapping the synthetic bubble closed. Single-chunk payloads keep the original binary reveal so 1–2 sentence answers don't feel artificially stretched. Tokenizer splits at top-level child boundaries to keep inline atomics (mentions, emoji, anchors, code) whole. Fixes Expensify/Expensify#626938 Stacked on #87110
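Rough sketch of the loop shape described above (illustrative only: the names, the cubic easing, and the 150ms tick are assumptions; only DEFAULT_STREAM_DURATION_MS and the ~1.5s acceleration figure come from the PR text):

```ts
// Illustrative sketch, not the PR's implementation.
const DEFAULT_STREAM_DURATION_MS = 30_000; // value named in a later commit in this PR
const ACCELERATED_REMAINDER_MS = 1_500; // the "~1.5s" figure from the description above

// Decelerating ease-out: fast at first, slow near the end, clamped to [0, 1].
function easeOut(t: number): number {
    const clamped = Math.min(1, Math.max(0, t));
    return 1 - (1 - clamped) ** 3;
}

function startTrickle(onProgress: (progress: number) => void, onComplete: () => void) {
    const startedAt = Date.now();
    let durationMs = DEFAULT_STREAM_DURATION_MS;

    const intervalID = setInterval(() => {
        const progress = easeOut((Date.now() - startedAt) / durationMs);
        onProgress(progress); // view-local; REPORT_ACTIONS is only written in onComplete
        if (progress >= 1) {
            clearInterval(intervalID);
            onComplete(); // applyPendingConciergeAction: the single Onyx write at completion
        }
    }, 150);

    return {
        // Canonical reportComment landed mid-trickle: finish the remaining
        // reveal in ~1.5s instead of snapping the synthetic bubble closed.
        accelerate: () => {
            durationMs = Math.min(durationMs, Date.now() - startedAt + ACCELERATED_REMAINDER_MS);
        },
        cancel: () => clearInterval(intervalID),
    };
}
```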
Two structured Log.info events bracket each trickle so VictoriaLogs can
answer Phase 2 tuning questions and detect regressions:
- [ConciergeTrickle] start: {reportActionID, tokenCount, durationMs}
- [ConciergeTrickle] complete: {reportActionID, reason, tokenCount,
durationMs, totalElapsedMs, arrivedAtProgress, arrivedAtElapsedMs}
reason in {natural, accelerated, stale_cap} attributes the completion
path; arrivedAtProgress + arrivedAtElapsedMs (defined when accelerated)
record where in the trickle the canonical reportComment landed - the
single most useful signal for tuning the duration heuristic.
Replaces the standalone "[ConciergeTrickle] accelerated" log: that
data is now carried by the complete event's reason='accelerated'
branch + arrivedAtProgress field.
Also enables: stale-cap firing-rate alerts (backend-health signal),
client/server cross-correlation by reportActionID against PAZR logs,
mobile JS-timer-throttle detection via arrivedAtProgress skew, and
A/B-test readiness without per-experiment instrumentation.
Lifecycle pairing (start + complete) makes orphaned trickles
detectable as starts without matching completes.
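Roughly what the two bracketing calls look like. The Log import path and the (message, sendNow, parameters) signature are assumptions about the App's logger; only the event names and payload keys come from the list above.

```ts
import Log from '@libs/Log'; // assumed path and Log.info(message, sendNow, parameters) shape

type TrickleCompleteEvent = {
    reportActionID: string;
    reason: 'natural' | 'accelerated' | 'stale_cap';
    tokenCount: number;
    durationMs: number;
    totalElapsedMs: number;
    arrivedAtProgress?: number; // defined only when reason === 'accelerated'
    arrivedAtElapsedMs?: number;
};

function logTrickleStart(reportActionID: string, tokenCount: number, durationMs: number) {
    Log.info('[ConciergeTrickle] start', false, {reportActionID, tokenCount, durationMs});
}

function logTrickleComplete(event: TrickleCompleteEvent) {
    // arrivedAt* stay flat keys so VictoriaLogs queries don't need nested field access.
    Log.info('[ConciergeTrickle] complete', false, event);
}
```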
Replaces the top-level-element tokenizer with a DOM-walking word-level one. Text nodes split per word/whitespace run; formatting wrappers (<strong>, <em>, <b>, <i>, <u>, etc.) recurse so bold appears as words become bold rather than as a single jump. Atomic visual primitives (<mention-user>, <mention-report>, <emoji>, <a>, <code>, <img>) stay whole — no half-rendered mention or partial emoji codepoint. Implementation: collect anchors in document order (one per word run or atomic-tag subtree), then for each anchor render a partial node forest where the path branch is shallow-cloned with truncated children and sibling references stay unchanged so dom-serializer renders them as-is. shouldTrickle threshold bumps from 3 to 20 anchors so single-sentence replies (~10-18 anchors) still get the binary 4s reveal — multi-paragraph replies clear 20 easily and get the smooth trickle. Tests cover: empty, plain word reveal, bold-recursion (wrapper opens around partial words), mention atomicity (no partial mentions), emoji / anchor / code atomicity, multi-paragraph monotonic growth, realistic Xero-shape >= 20 anchors.
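Sketch of the anchor-collection walk, using htmlparser2/domhandler shapes. RevealAnchor, ATOMIC_TAGS, and collectAnchors are illustrative names, not the PR's API; the word-run splitting, atomic tags, and wrapper recursion mirror the description above.

```ts
import {parseDocument} from 'htmlparser2';
import {ElementType} from 'domelementtype';
import type {AnyNode, Element, Text} from 'domhandler';

const ATOMIC_TAGS = new Set(['mention-user', 'mention-report', 'emoji', 'a', 'code', 'img']);

type RevealAnchor =
    | {kind: 'word'; node: Text; endOffset: number} // reveal node.data up to endOffset
    | {kind: 'atomic'; node: Element}; // reveal the whole subtree at once

function collectAnchors(nodes: AnyNode[], anchors: RevealAnchor[] = []): RevealAnchor[] {
    for (const node of nodes) {
        if (node.type === ElementType.Text) {
            // One anchor per word/whitespace run so words pop in whole.
            const text = (node as Text).data;
            const wordRun = /\S+\s*/g;
            for (let match = wordRun.exec(text); match; match = wordRun.exec(text)) {
                anchors.push({kind: 'word', node: node as Text, endOffset: wordRun.lastIndex});
            }
        } else if (node.type === ElementType.Tag) {
            const element = node as Element;
            if (ATOMIC_TAGS.has(element.name)) {
                // Mentions, emoji, links, code never render half-typed.
                anchors.push({kind: 'atomic', node: element});
            } else {
                // Formatting wrappers (<strong>, <em>, ...) recurse so bold
                // appears word by word inside the wrapper.
                collectAnchors(element.children, anchors);
            }
        }
    }
    return anchors;
}

// Usage sketch:
const anchors = collectAnchors(parseDocument('<strong>Totally</strong> agreed, <mention-user accountID="123"></mention-user>!').children);
```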
400ms ticks left perceptible gaps between word reveals at the ~ChatGPT/Claude streaming pace. 150ms keeps the trailing edge always moving — the gap between dispatches drops below the threshold where the eye perceives discrete jumps. Re-render cost goes 4x but RNW's draft-bubble re-render is one list item, not the whole list. Same total trickle duration (DEFAULT_STREAM_DURATION_MS = 30s); ease-out curve unchanged. Only the loop sampling rate changes.
Drop PR/companion-PR references, author attribution, and forward-looking "Phase 2"/"Drives" prose in trickle-related comments. Compress 4+ line blocks down to 1-3 lines without losing the non-obvious WHY.
Word-level reveal at 150ms ticks still showed perceptible word-jumps near the trailing edge. Char-level brings each visible advance below the threshold where the eye perceives discrete steps — the trickle feels continuous even when the cadence loop is the same 150ms. shouldTrickle threshold bumped from 20 (word-anchors) to 100 (chars) so 1-2 sentence replies still get the binary 4s reveal. Atomic tags (<mention-user>, <emoji>, <a>, <code>, <img>) still emit one anchor for the whole element — no half-rendered mentions or partial emoji codepoints. Formatting wrappers still recurse so bold characters appear with the wrapper preserved. Re-render budget on RNW: a typical Xero-shape response (~280 visible chars) over 30s = ~9 dispatches/second. Bubble is one list item, not the whole list, so this stays well within the comfortable range.
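Char-level variant of the sketch above (reusing its RevealAnchor/ATOMIC_TAGS names): text nodes now emit one anchor per character, atomic tags still emit a single anchor, and the gate constant matches the 100-anchor threshold described here.

```ts
// Builds on the word-level sketch above; names remain illustrative assumptions.
function collectCharAnchors(nodes: AnyNode[], anchors: RevealAnchor[] = []): RevealAnchor[] {
    for (const node of nodes) {
        if (node.type === ElementType.Text) {
            const text = (node as Text).data;
            for (let endOffset = 1; endOffset <= text.length; endOffset++) {
                anchors.push({kind: 'word', node: node as Text, endOffset});
            }
        } else if (node.type === ElementType.Tag) {
            const element = node as Element;
            if (ATOMIC_TAGS.has(element.name)) {
                // <mention-user>, <emoji>, <a>, <code>, <img> reveal in one step,
                // so no half-rendered mention or partial emoji codepoint.
                anchors.push({kind: 'atomic', node: element});
            } else {
                // Formatting wrappers still recurse, so bold characters keep their wrapper.
                collectCharAnchors(element.children, anchors);
            }
        }
    }
    return anchors;
}

// The gate from the commit above: short replies (< 100 char-level anchors) keep the binary reveal.
const MIN_TRICKLE_TOKEN_COUNT = 100;
const shouldTrickle = (tokens: RevealAnchor[]) => tokens.length >= MIN_TRICKLE_TOKEN_COUNT;
```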
Codecov Report ❌ Looks like you've decreased code coverage for some files. Please write tests to increase, or at least maintain, the existing level of code coverage. See our documentation here for how to interpret this table.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`domelementtype` is a real npm package imported by tokenizeForReveal (legitimate dependency of htmlparser2/dom-serializer). Add to cspell.json. `dani` appears in tokenizeForRevealTest.ts:50 as part of the regex `@dani(?!e)` that asserts the mention-user atomic primitive never renders as a partial @dani during char-by-char trickle (the full name is @daniel). cspell:ignore comment is the minimal scoped fix.
Codecov flagged usePendingConciergeResponse.ts at 41.66% (-58.34%) on PR App#89146. The pre-existing tests covered only the binary-reveal path (short replies); the trickle path added by this PR (long replies, ≥100 char-level anchors) was uncovered. Three new tests under "trickle path (long replies, ≥100 char-level anchors)":
1. [ConciergeTrickle] start telemetry log fires with token + duration metadata
2. Natural completion applies the action to REPORT_ACTIONS and logs reason: 'natural'
3. Unmount mid-trickle cancels the interval (no late completion log, action not applied)
The tests use a long Xero-help HTML fixture that tokenizes to ≥100 char-level anchors so the hook's `tokens.length >= 100` gate opts in. Test #2 uses fake timers to avoid waiting the real 15s trickle window.
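A hedged sketch of what the first test could look like. The hook's argument shape, import paths, the pre-trickle delay, and the Log mock are assumptions; only the log name, payload keys, and the ≥100-anchor gate come from this commit.

```ts
import {act, renderHook} from '@testing-library/react-native';
import Log from '@libs/Log'; // assumed path
import usePendingConciergeResponse from '@hooks/usePendingConciergeResponse'; // assumed path and argument shape

// Long fixture that tokenizes to >= 100 char-level anchors so the trickle path opts in.
const longXeroFixtureHTML = `<p>${'To connect Xero, open workspace settings and choose Accounting. '.repeat(10)}</p>`;

it('fires [ConciergeTrickle] start with token and duration metadata for long replies', () => {
    const infoSpy = jest.spyOn(Log, 'info');
    jest.useFakeTimers();

    renderHook(() => usePendingConciergeResponse(longXeroFixtureHTML));

    // Skip past the 4s pre-trickle delay so the trickle (and its start log) begins.
    act(() => {
        jest.advanceTimersByTime(5_000);
    });

    expect(infoSpy).toHaveBeenCalledWith(
        expect.stringContaining('[ConciergeTrickle] start'),
        expect.anything(),
        expect.objectContaining({tokenCount: expect.any(Number), durationMs: expect.any(Number)}),
    );

    jest.useRealTimers();
});
```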
Two cleanups identified by /polish on the trickle code now that the
design is settled:
1. tokenizeForReveal: replace verbose for(i++) loops with
forEach((child, i) => ...). The at() indirection + null guard was
only there to satisfy TS strict bounds checking; forEach types
child as AnyNode directly. -8 lines, more idiomatic.
2. usePendingConciergeResponse: collapse arrivedAtProgress and
arrivedAtElapsedMs (two parallel mutable variables that always
wrote together) into a single arrival?: {progress, elapsedMs}
object. The "did acceleration fire" check becomes if (arrival) —
read as "did we capture an arrival?" not "is the progress field
present?". Log payload still has the same flat keys for telemetry
compatibility.
Same public API. Same tests. typecheck-tsgo + prettier + lint clean.
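The shape of cleanup #2, sketched (names as above; log payload keys unchanged):

```ts
// Before (sketch): two parallel mutable locals that always wrote together.
let arrivedAtProgress: number | undefined;
let arrivedAtElapsedMs: number | undefined;

// After (sketch): one optional object, so "did acceleration fire?" reads as `if (arrival)`.
let arrival: {progress: number; elapsedMs: number} | undefined;

// The complete log keeps the same flat keys for telemetry compatibility:
const completePayload = {
    arrivedAtProgress: arrival?.progress,
    arrivedAtElapsedMs: arrival?.elapsedMs,
};
```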
progress ∈ [0,1] (easeOut clamps to that range) and tokens.length ≥ 100 (the shouldTrickle gate), so progress * (tokens.length - 1) is always non-negative. The Math.max(0, ...) clamp was dead code. Inline comment explains why the upper bound (Math.min) is the only one that's load-bearing. Also pulled tokens.length - 1 into a lastIndex const so it's computed once per trickle session instead of twice per tick.
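Sketch of the resulting index math, folded into a standalone function for illustration (in the hook, lastIndex is computed once per trickle session rather than per call):

```ts
// Minimal sketch; the function shape and names are assumptions.
function anchorIndexForProgress(progress: number, tokenCount: number): number {
    // progress is already in [0, 1] (easeOut clamps) and tokenCount >= 100 (shouldTrickle gate),
    // so the result can never be negative; only the upper bound is load-bearing.
    const lastIndex = tokenCount - 1;
    return Math.min(lastIndex, Math.round(progress * lastIndex));
}
```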
…k.calls; replaceAll
…pec) The hook reads Date.now() for elapsed progress; with fake setInterval + real Date, progress stays at 0 and completion never fires. End-to-end completion is verified by verify-626938.spec.ts in playwright-fixtures. Unit tests still cover trickle entry (start telemetry) + cleanup (unmount cancels).
…ntained Rewrote the trickleInputsRef block comment to describe THIS PR's local behavior without naming a reviewer or PR number. Comments referencing 'Pujan's review on PR #89146' would rot over time and trip both the comments_explain_why judge and (incidentally) cspell on the proper name. The technical WHY — composer typing/Onyx emits/context refreshes produce reference churn that would cancel the running interval — is preserved.
Thanks for catching that edge case @Pujan92. I think the best solution is just to let the trickle continue; Concierge will handle the responses for subsequent messages in case the trickle hasn't finished.
Screen.Recording.2026-04-30.at.11.40.01.a.m.mov
Ready for re-review. It seems there are some flaky Jest tests that are unrelated to the changes; I'm going to retry them.
…on tests Both useGettingStartedItems.test.ts and GettingStartedSectionTest.tsx asserted state immediately after renderHook / render returned, before the hook's Onyx subscriptions had fired their post-mount callbacks. The first render reflected initial empty Onyx state; the assertion ran against THAT, then the hook re-rendered milliseconds later with the hydrated values. In CI's parallel-worker environment the timing window made this fail deterministically on certain shards. Fix: add `await waitForBatchedUpdates()` after every renderHook / renderGettingStartedSection call. Same pattern the test files already use after Onyx setup; just extending it to post-render so the hook's own subscriptions also settle before the assertion. Confirmed flaky from inception: the test files were added in Expensify#88525 (merged ~24h ago) and immediately started failing on multiple unrelated branches (Expensify#89146 / fix/83781 / feat/84877 / wildan/88503 / Rory-RemoveUnnecessaryESLintDisable). Tracked in Expensify#89240. No production source changes — test-only patch. 76 lines added across the two files (one waitForBatchedUpdates call per renderHook / renderGettingStartedSection site).
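The pattern, sketched against one of the hooks (import paths and the assertion are placeholders; waitForBatchedUpdates is the existing test util the commit refers to):

```ts
import {renderHook} from '@testing-library/react-native';
import waitForBatchedUpdates from '../utils/waitForBatchedUpdates'; // assumed relative path to the existing util
import useGettingStartedItems from '@hooks/useGettingStartedItems'; // assumed path; args omitted for the sketch

it('reflects hydrated Onyx state, not the initial empty render', async () => {
    const {result} = renderHook(() => useGettingStartedItems());

    // Let the hook's own Onyx subscriptions fire their post-mount callbacks
    // before asserting, so the assertion runs against hydrated values instead
    // of the first (empty) render.
    await waitForBatchedUpdates();

    expect(result.current).toBeDefined();
});
```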
Tests passing @Pujan92, now ready for re-review 🙇🏽
@marcochavezf I see it is restarting on revisit, is that expected?
Screen.Recording.2026-04-30.at.18.11.32.mov
Anchor the trickle's start time to displayAfter rather than mount time so navigating away and back resumes the reveal at the wall-clock-correct stage. Past the 60s hard cap, drop the optimistic since the canonical reportComment is expected to be in REPORT_ACTIONS by then. Also handle a previously-missed race: if the canonical reportComment is already present on mount (or lands during the pre-trickle setTimeout window) the hook now discards the optimistic instead of completing naturally and clobbering the canonical. Reported in PR 89146 review comment 4352563004.
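Sketch of the anchoring + hard-cap logic described here. TRICKLE_HARD_CAP_MS's 60s value and the drop rule come from this commit; the function shape and names are assumptions.

```ts
const TRICKLE_HARD_CAP_MS = 60_000;

function trickleStateOnMount(displayAfter: number, now = Date.now()) {
    const elapsedSinceDisplayAfter = now - displayAfter;
    if (elapsedSinceDisplayAfter > TRICKLE_HARD_CAP_MS) {
        // Past the hard cap the canonical reportComment should already be in
        // REPORT_ACTIONS, so drop the optimistic instead of resuming.
        return {shouldDrop: true as const};
    }
    // Resume the reveal at the wall-clock-correct stage rather than restarting at zero.
    return {shouldDrop: false as const, elapsedMs: Math.max(0, elapsedSinceDisplayAfter)};
}
```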
Thanks for catching that one @Pujan92, I pushed a fix.
Screen.Recording.2026-04-30.at.10.41.48.p.m.mov
Thanks @marcochavezf, I will test on other platforms and approve soon.
The previous commit added two guards (snapshotPersisted check at trickle effect start, late-arrival guard inside startTrickle) that intended to prevent the trickle's optimistic from clobbering a canonical reportComment that arrived during the pre-trickle setTimeout window. Those guards broke the visible trickle in the common "server fast" case: when the canonical reportComment lands during the 4s pre-trickle delay, startTrickle was returning without firing [ConciergeTrickle] start. The ui-verify spec requires the start log to fire within 15s of click; this caused a regression. The clobber-on-natural-completion case the guards targeted is a real pre-existing bug, but it's separate from the revisit fix shipped in the prior commit. Reverting the guards. The displayAfter anchoring (resume on revisit) and TRICKLE_HARD_CAP_MS staleness gate stay.
The completion branch only consulted `arrival` to choose between discard and apply. `arrival` is set by the accelerator, which no-ops when `intervalID` is null — so two windows leak through:
1. Canonical lands during the 4s pre-trickle setTimeout. Accelerator runs but returns early (intervalID still null), arrival never set, the trickle runs full duration and merges the older optimistic payload on top of the canonical at the same reportActionID. Server-added markup (follow-up buttons, deep-link Pressables) gets clobbered until the next server update.
2. Canonical hadn't landed yet by completion — same path.
Fix: read persistedAction live from the trickleInputsRef at completion time and take the discard path whenever it's defined, regardless of how it got there. The visible trickle still runs to completion (start/complete logs fire as expected).
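Sketch of the fixed completion branch: the decision reads the canonical action live instead of trusting `arrival`. The dependency-injected shape here is purely for illustration.

```ts
type CompletionDeps = {
    // Canonical reportComment, read live from trickleInputsRef at completion time.
    persistedAction: Record<string, unknown> | undefined;
    applyPendingConciergeAction: () => void;
    discardOptimistic: () => void;
};

function completeTrickle({persistedAction, applyPendingConciergeAction, discardOptimistic}: CompletionDeps) {
    // Take the discard path whenever the canonical action is present, regardless of
    // whether it arrived during the pre-trickle setTimeout (arrival never set) or mid-trickle.
    if (persistedAction) {
        discardOptimistic();
        return;
    }
    applyPendingConciergeAction();
}
```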
Cuts the 5-line comment to 4 with the same load-bearing info — names the two arrival paths (`arrival` vs live ref read) and the symptom (clobbered server-added markup), drops the redundant "until the next server update" tail.
I pushed another minor fix and updated the comments a little bit.
Overall looks good, but on revisit that streaming message is unavailable initially.
Screen.Recording.2026-04-30.at.19.47.53.mov
joekaufmanexpensify left a comment
Good for product
Three changes in one commit since they touch overlapping lines:
1. Module-level draftCache so navigating away and back to a chat doesn't flash the synthetic Concierge bubble away during the remount + Onyx-hydration window. The gate's useState lazy-inits from the cache; every reducer write mirrors to the cache; the reducer's null returns (completed / failed / cleared) evict it. Keyed by reportID so chats stay isolated. 4 unit specs cover the contract (see the cache sketch after this list).
2. Drop redundant useCallback/useMemo on clearDraft, dispatchLocalDraftEvent, stateValue, actionsValue — React Compiler auto-memoizes; the explicit wrappers just shadow its analysis (clean-react-0-compiler bot review). Inline the resubscribe-clear logic so the effect's deps stay scoped to reportID instead of dragging in a per-render clearDraft closure.
3. Extract the trickle-vs-binary-reveal threshold (100 anchors) to a named MIN_TRICKLE_TOKEN_COUNT constant — matches the pattern of every other timing/sizing constant in this file (consistency-2 bot review).
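Sketch of the cache contract from point 1 (state shape and names are assumptions; the lazy init, mirror-on-write, and evict-on-null behavior come from the comment above):

```ts
import {useState} from 'react';

// Illustrative state shape; the real reducer state is richer.
type ConciergeDraftState = {html: string; progress: number};

const draftCache = new Map<string, ConciergeDraftState>();

function useConciergeDraftGate(reportID: string) {
    // Lazy-init from the cache so a remount during the Onyx-hydration window
    // doesn't flash the synthetic bubble away.
    const [draft, setDraft] = useState<ConciergeDraftState | null>(() => draftCache.get(reportID) ?? null);

    // No explicit useCallback: React Compiler auto-memoizes (point 2 above).
    function dispatch(next: ConciergeDraftState | null) {
        if (next === null) {
            // Completed / failed / cleared: evict so the draft can't resurrect later.
            draftCache.delete(reportID);
        } else {
            // Every write mirrors to the cache, keyed by reportID so chats stay isolated.
            draftCache.set(reportID, next);
        }
        setDraft(next);
    }

    return [draft, dispatch] as const;
}
```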
Thanks @Pujan92 for catching that, it's now fixed and the code comments are addressed.
Screen.Recording.2026-05-01.at.9.51.51.a.m.mov
@carlosmiceli Please copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button]
🎯 @Pujan92, thanks for reviewing and testing this PR! 🎉 A payment issue will be created for your review once this PR is deployed to production. If payment is not needed (e.g., regression PR review fix etc), react with 👎 to this comment to prevent the payment issue from being created.
Explanation of Change
When a new user clicks a Concierge follow-up suggestion, the server takes 30–40 seconds to generate the next round of suggestions. The pregenerated reply currently appears all at once after a 4s pause, so the chat looks frozen until the next suggestions arrive — and many users leave before that happens.
This PR streams the pregenerated reply in progressively over ~15s, so the conversation keeps moving while the server works. The pacing slows down toward the end so the reveal lands close to when the next suggestions appear. Mentions, emoji, links, and code stay intact during the stream — they never render half-typed.
This PR also bundles the minimum streaming infrastructure from #87110 so it can ship independently. cc @inimaga
Fixed Issues
$ https://github.com/Expensify/Expensify/issues/626938
PROPOSAL: N/A (engineering-initiated UX improvement)
Tests
#admins with three Concierge follow-up buttons.
Offline tests
#admins with the welcome message and follow-up buttons rendered (online).
QA Steps
Same as test steps
PR Author Checklist
Android: Native
Android: mWeb Chrome
iOS: Native
iOS: mWeb Safari
MacOS: Chrome / Safari
Screen.Recording.2026-04-28.at.2.35.59.p.m.mov