Obsidian-friendly output: --output dir + --expand-paths + --filter-path (#151) #155
cboos wants to merge 9 commits into
Three new CLI flags (`--output` for `--all-projects`, `--expand-paths`, `--filter-path`) for projecting Claude Code transcripts into the directory topology Obsidian vaults expect.

Plan covers:

- The empirical finding that `--output` was silently ignored in `--all-projects` mode (closing that gap is part of the scope).
- Helper API: `project_dir_to_real_path` + `project_destination`.
- Three-tier path resolution: cache → JSONL peek → naive last-resort.
- Flag interaction matrix.
- Test plan: unit (test_path_projection.py) + integration (test_obsidian_output.py, Markdown-scoped).

All six initial open questions resolved by the user.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…th (#151)

Three CLI flags for projecting Claude Code transcripts into the directory topology Obsidian-style Markdown vaults expect:

- `--output` is now honoured for `--all-projects` (closing the gap where it was silently ignored). Combined with the suffix heuristic in `utils.output_path_is_file()`, the same flag handles both file and directory destinations.
- `--expand-paths` undoes Claude Code's flat encoding so each project's output lands at its real on-disk path under `<output>/`.
- `--filter-path` selects projects by prefix and (with `--expand-paths`) truncates the prefix from the destination.

## Helpers (utils.py)

`project_dir_to_real_path(project_dir, cached_working_directories=None)` recovers the real path with three-tier resolution:

1. cache hit (`ProjectCache.working_directories`), absolute paths only;
2. peek the first JSONL for a `cwd` field (lightweight `json.loads` loop, bounded to 32 lines, `agent-*` sidechain files skipped);
3. naive last-resort with `--` → `/.` mapping for dotfile dirs.

`project_destination(...)` implements the flag-interaction matrix from work/obsidian-friendly-output.md and returns None for filter-excluded projects.

## process_projects_hierarchy

Now consults `project_destination()` per project; skips filtered-out ones; threads `dest_dir` through `convert_jsonl_to` via a new `output_root` parameter. Index lives at `output_dir/index.{ext}` (or `projects_path/` in legacy mode); `html_file` links are computed relative to that.

## Tests

- test/test_path_projection.py: 26 unit tests across resolution tiers, the disambiguation case, the `agent-*` skip, and every cell of the `project_destination` matrix.
- test/test_obsidian_output.py: 5 integration tests (Markdown-scoped per the user's Q1 resolution) driving `process_projects_hierarchy` end-to-end and asserting the produced directory tree.

`just ci` clean: 1721+ tests, ruff + pyright + ty all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
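The three-tier resolution described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `utils.py`: the names `resolve_real_path`, `peek_cwd`, and `naive_decode` are hypothetical, the absoluteness check is POSIX-only, and taking the first absolute cache entry is an assumption.

```python
import json
from pathlib import Path, PurePosixPath
from typing import Optional, Sequence

MAX_PEEK_LINES = 32  # bound quoted in the commit message


def naive_decode(encoded_name: str) -> str:
    """Tier 3: map "--" to "/." (dotfile dirs), then every remaining "-" to "/"."""
    return encoded_name.replace("--", "/.").replace("-", "/")


def peek_cwd(project_dir: Path) -> Optional[str]:
    """Tier 2: scan the first non-agent JSONL file for an absolute `cwd`."""
    candidates = [
        p for p in sorted(project_dir.glob("*.jsonl"))
        if not p.name.startswith("agent-")  # skip sidechain files
    ]
    if not candidates:
        return None
    with candidates[0].open(encoding="utf-8") as fh:
        for index, line in enumerate(fh):
            if index >= MAX_PEEK_LINES:
                break
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue
            cwd = entry.get("cwd") if isinstance(entry, dict) else None
            if isinstance(cwd, str) and PurePosixPath(cwd).is_absolute():
                return cwd
    return None


def resolve_real_path(
    project_dir: Path,
    cached_working_directories: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical stand-in for project_dir_to_real_path()."""
    # Tier 1: cached working directories, absolute paths only.
    for wd in cached_working_directories or ():
        if PurePosixPath(wd).is_absolute():
            return wd
    # Tier 2: lightweight peek into the transcript itself.
    peeked = peek_cwd(project_dir)
    if peeked is not None:
        return peeked
    # Tier 3: lossy decoding of the directory name.
    return naive_decode(project_dir.name)
```

The ordering matters because only tiers 1 and 2 can distinguish `/home/joe/x/y` from `/home/joe/x-y`; the naive decode always picks the former.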
Two footgun fixes flagged in monk's review of dev/obsidian-friendly-output:
1. Reject relative `--filter-path` when paired with `--expand-paths`.
Path.relative_to raises ValueError for *any* mismatch including
"argument is relative" — so without the guard, a user typing
`--filter-path home/joe` (forgetting the leading `/`) would get
every project silently skipped. Now rejected at click parse time
with click.BadParameter.
2. Add the no-op flag warnings the plan promised but the impl missed:
- --expand-paths/--filter-path without --all-projects (or implicit
--all-projects via no INPUT_PATH) → warn.
- --output unset OR --output is file-suffixed (single-file path
bypasses these flags) → warn.
The existing --tui guard stays as the first branch.
Plus monk's optional doc clarification: tightened the _rel_to_index
helper's comment from "shouldn't happen" to "unreachable under the
documented matrix; kept as a paranoia rail" — empirically verified
by monk's matrix-row walkthrough.
Adds three regression tests in test_obsidian_output.py exercising
the new validations through CliRunner end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
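The first footgun above can be reproduced with the standard library alone. The sketch below assumes POSIX-form paths; `truncate_prefix` and `validate_filter_path` are hypothetical names, and a plain `ValueError` stands in for the `click.BadParameter` that the commit raises at parse time.

```python
from pathlib import PurePosixPath
from typing import Optional


def truncate_prefix(real_path: str, filter_path: str) -> Optional[PurePosixPath]:
    """Return real_path relative to filter_path, or None when it does not match."""
    try:
        return PurePosixPath(real_path).relative_to(filter_path)
    except ValueError:
        # Raised for a genuine mismatch AND for a relative filter_path,
        # so the two cases are indistinguishable at this point.
        return None


def validate_filter_path(filter_path: str, expand_paths: bool) -> None:
    """Reject the ambiguous case up front instead of skipping every project."""
    if expand_paths and not PurePosixPath(filter_path).is_absolute():
        raise ValueError(
            f"--filter-path must be absolute with --expand-paths: {filter_path!r}"
        )
```

Without the up-front check, `truncate_prefix("/home/joe/project/A", "home/joe")` returns `None` exactly like a non-matching project would, which is why every project ended up silently skipped.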
`is_html_stale` / `is_page_stale` resolve `actual_file` against `self.project_path` (the SOURCE dir under ~/.claude/projects/), not the actual output destination. With legacy in-place behaviour the two are identical. With `--output` they diverge — every run re-renders because the cache thinks the source's combined_transcripts.html is the canonical artifact (and it's never written there in `--output` mode). Practical implication: bouncing between several --output dirs always re-renders even when destinations are current. JSONL parsing is still cache-hit; only rendering re-runs. Recorded as a "Follow-up / Open points" section alongside the two follow-ups monk surfaced (archived projects with --output; peek-debug logging). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough

Adds CLI flags `--expand-paths` and `--filter-path`, utilities to recover real project paths and classify output destinations, and routes multi-project exports into per-project output directories with optional filtering and index/link relativization; includes unit and integration tests and a design doc.

Changes

Obsidian-Friendly Output Implementation
Sequence Diagram(s)

sequenceDiagram
participant User as CLI
participant Main as main()
participant Validator as CLI validations
participant Hierarchy as process_projects_hierarchy()
participant Dest as project_destination()
participant Converter as convert_jsonl_to()
participant Index as index generator
User->>Main: invoke with flags (--output, --expand-paths, --filter-path, --all-projects)
Main->>Validator: validate flags (absolute filter_path if expand, tui, output shape)
Validator-->>Main: OK / warnings / reject
Main->>Hierarchy: process_projects_hierarchy(output_dir, expand_paths, filter_path)
Hierarchy->>Dest: compute dest_dir per project
Dest-->>Hierarchy: dest_dir or skip
Hierarchy->>Converter: convert_jsonl_to(input_path, output_root=dest_dir)
Converter-->>Hierarchy: produced file paths
Hierarchy->>Index: emit index.md with links relativized to index_root
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
claude_code_log/converter.py (1)
2636-2652: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Derive the combined filename from format and variant here too.

Both of these sites hard-code `combined_transcripts.html`, but `convert_jsonl_to()` now writes `combined_transcripts{suffix}.{ext}`. In `--format md/json` or non-default variants, the hierarchy preflight will think every project is stale forever, and archived index entries will point at the wrong file name.

Suggested fix

    + from .utils import project_destination, variant_suffix as _variant_suffix
    -
    - from .utils import project_destination
    + variant = _variant_suffix(detail, compact, output_format)
    + combined_name = f"combined_transcripts{variant}.{get_file_extension(output_format)}"
    ...
    - output_path = dest_dir / "combined_transcripts.html"
    + output_path = dest_dir / combined_name
    ...
    - combined_stale = cache_manager.is_page_stale(1, page_size)[0]
    + combined_stale = cache_manager.is_page_stale(1, page_size, variant)[0]
    ...
    - "html_file": f"{archived_rel}/combined_transcripts.html",
    + "html_file": f"{archived_rel}/{combined_name}",

Also applies to: 2971-2979
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@claude_code_log/utils.py`:
- Around line 341-344: The current filter logic uses
project_dir.name.startswith(filter_path) which incorrectly matches siblings;
change it to only accept exact matches or names where filter_path is a prefix
followed by a path-component separator (e.g., a hyphen); specifically, in the
block that references filter_path, project_dir, and output_dir, replace the
startswith check with a condition that returns None unless project_dir.name ==
filter_path or project_dir.name.startswith(filter_path + "-").
- Around line 222-230: The decode logic in convert_project_path_to_claude_dir is
naively replacing "--"→"/." and "-"→"/", which corrupts Windows drive-encoded
names like "E--Workspace-src"; update the decoder to detect a leading drive
letter pattern (e.g., r'^[A-Za-z]--'), extract the drive (letter + ':'), decode
the remainder with the existing "--"→"/." and "-"→"/" rules, and construct a
Windows-aware Path (preserving the drive as its own path component) instead of
embedding the drive into the path body. Also, when building destination paths
from real_path, stop dropping the drive: if real_path.drive is non-empty include
real_path.drive (normalized, e.g., without ':'/slashes) as the first segment
when joining under output_dir rather than using real_path.parts[1:], and change
the flat-name filter check (the startswith() used to match encoded prefixes) to
only match exact prefix boundaries (i.e., match when name == filter or
name.startswith(filter + '-') ) to avoid overmatching sibling prefixes.
In `@test/test_obsidian_output.py`:
- Around line 271-297: Update the
test_warns_when_flags_used_without_all_projects to assert the CLI still exits
successfully after emitting the warning: after invoking claude_code_log.cli.main
via CliRunner (the result variable), add an assertion that result.exit_code == 0
(or "result.exception is None") in addition to checking the warning text; do the
same change for the related test covering lines 299-321 so both tests verify
success as well as the presence of the warning.
In `@work/obsidian-friendly-output.md`:
- Line 3: Update the status header line that currently reads "## Status: Plan —
not started" to reflect that the implementation and tests are included in this
PR (for example, "## Status: Done — implemented and tested" or similar); locate
the header in work/obsidian-friendly-output.md and replace the status text so
the file no longer claims the work is unstarted.
---
Outside diff comments:
In `@claude_code_log/converter.py`:
- Around line 2636-2652: The code currently hard-codes
"combined_transcripts.html" for output_path and cache lookups, which breaks when
convert_jsonl_to() writes files as "combined_transcripts{suffix}.{ext}"; update
the logic that builds output_path (and the parallel block later around the other
occurrence) to derive the filename using the same format/variant/suffix and
extension determination used by convert_jsonl_to() (e.g., compute suffix and ext
from the requested format/variant, then set output_name =
f"combined_transcripts{suffix}.{ext}" and use dest_dir / output_name), and then
use that computed output_path.name when calling cache_manager.is_html_stale(...)
and for any other cache lookups so combined_stale checks point to the real
generated filename.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9ae570a8-ca5d-4e26-b514-6c0526e39764
📒 Files selected for processing (6)
- claude_code_log/cli.py
- claude_code_log/converter.py
- claude_code_log/utils.py
- test/test_obsidian_output.py
- test/test_path_projection.py
- work/obsidian-friendly-output.md
…#151)

User reported empirically:

    --filter-path /home/cboos/Workspace/github/daain (no --expand-paths)
    → 665 projects processed, all skipped, only index landed

Same class of footgun as the relative-filter-with-expand case monk caught: the filter resolves against the encoded flat name (`-home-...`), which an absolute path never matches.

Recorded as a follow-up alongside three related ergonomics items the user surfaced:

- `--filter-path` / `--expand-paths` should imply `--all-projects` (no reason for them not to; nothing else to filter).
- `--expand-paths` for single-session / single-project mode (project one artefact into `<output>/<real-path>/<filename>`).
- `--dry-run` flag: show planned destinations without writing.

Two fixes considered for the absolute-filter case: rejection at parse time (symmetric with monk's existing footgun guard) or auto-implying `--expand-paths` (friendlier; encoded-form filtering is the niche case). Plan leans toward auto-implying.

No code changes in this commit; recording for review/dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
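A small sketch of why the reported run skipped all 665 projects. It assumes the encoding described elsewhere in this PR (`/` → `-`, `.` → `-`) and the plain prefix match that the review quotes from `utils.py`; `encode_flat_name` and `flat_filter_matches` are hypothetical names used only for illustration.

```python
def encode_flat_name(real_path: str) -> str:
    """Approximate Claude Code's flat directory encoding (assumed: "/" and "." become "-")."""
    return real_path.replace("/", "-").replace(".", "-")


def flat_filter_matches(encoded_name: str, filter_path: str) -> bool:
    """Without --expand-paths the filter is compared against the encoded name."""
    return encoded_name.startswith(filter_path)


encoded = encode_flat_name("/home/cboos/Workspace/github/daain")

# An absolute path starts with "/", which never occurs in an encoded name,
# so the filter excludes every project.
absolute_filter_hits = flat_filter_matches(encoded, "/home/cboos/Workspace/github/daain")

# The encoded form of the same prefix does match.
encoded_filter_hits = flat_filter_matches(encoded, "-home-cboos-Workspace-github")
```

Either fix under consideration (rejecting the absolute form, or auto-implying `--expand-paths`) removes the case where the first comparison is attempted at all.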
♻️ Duplicate comments (1)
work/obsidian-friendly-output.md (1)
3-3: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the status header to reflect shipped work.
This still says “Plan — not started,” but this PR includes implementation + tests, so the header is now misleading for future readers.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@work/obsidian-friendly-output.md`:
- Line 3: Update the status header "## Status: Plan — not started" in the
document to reflect that the work is shipped (e.g., "## Status: Shipped" or
similar) so readers see the current state; locate the header string "## Status:
Plan — not started" in the file and replace it with the appropriate shipped
status.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ec3395e3-707e-46ef-a17a-ed2ca72e8b69
📒 Files selected for processing (1)
work/obsidian-friendly-output.md
Only --filter-path can safely imply --all-projects — there's nothing else to filter. --expand-paths can't, because it has independent meaning in single-session / single-project mode (project one artefact under <output>/<real-path>/<filename>). Implying --all-projects from --expand-paths would silently switch the input scope, which is a much bigger surprise than --filter-path could ever be. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CR review on #155 had one MAJOR and several MINOR findings; plus the Windows CI run failed on the integration tests. Both are rooted in the same place (path-shape assumptions), so addressing them together.

## Windows failure / CR #3: cross-platform path handling

`Path(s).is_absolute()` returns False on Windows for POSIX-form strings like `/home/joe/project/A` (no drive letter). So when a Linux-recorded JSONL cwd is processed on Windows, my tier-1 and tier-2 absoluteness guards in `project_dir_to_real_path` rejected it as "non-absolute" and fell through to the naive last-resort. The subsequent `joinpath('\\', 'home', 'joe', ...)` then anchored to the drive root, writing outside `output_dir`, which is what the failing `test_expand_paths_full_tree[windows-3.14]` caught.

Fixed by adding two form-aware helpers in `utils.py`:

- `_path_looks_absolute(s)`: accepts both POSIX (`/foo`) and Windows (`C:\foo`) forms regardless of host OS. Replaces the bare `Path(s).is_absolute()` calls in tier 1 and tier 2.
- `_split_real_path_for_join(s)`: decomposes a real-path string into the parts to join under `output_dir`. POSIX → drop leading `/`. Windows → keep drive letter as a leading dirname segment (colon stripped), so `C:\foo\bar` lands at `<output>/C/foo/bar`.

`project_destination`'s filter-with-expand branch also gained form-aware dispatch: POSIX-form real paths use `PurePosixPath` for the `relative_to` check; Windows-form uses `PureWindowsPath`. Mixing forms returns None (user-error path).

## CR MAJOR #1: variant filename derivation

`process_projects_hierarchy` was hard-coding `combined_transcripts.html` for `output_path` and the index `html_file` entry, but `convert_jsonl_to` writes `combined_transcripts{variant}.{ext}` (e.g. `combined_transcripts.low.compact.md`). With `--format md` or non-default `--detail`/`--compact`, the cache check always saw "stale" and the index linked to the wrong file.

Now derives `combined_name` from the same `variant_suffix(detail, compact, format)` shape and threads it through `output_path`, `is_html_stale`, `is_page_stale` (with the variant arg), and the archived-project index entry.

## CR MINOR #2: filter word boundary

`name.startswith(filter_path)` over-matched siblings: `--filter-path -home-joe` would also pass `-home-joet-...`. Tightened to `name == filter or name.startswith(filter + "-")`.

## CR MINOR #4: exit_code asserts in warning tests

`test_warns_when_flags_used_without_all_projects` and `test_warns_when_expand_paths_with_file_output` only checked warning text. Added `assert result.exit_code == 0` so they verify the warning doesn't escalate to a failure.

## CR MINOR #5: status header

`work/obsidian-friendly-output.md` still said "Status: Plan — not started" but the implementation + tests are in this PR. Updated to "Shipped (impl + tests in this PR; follow-ups recorded below)".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
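As a rough illustration of the form-aware helpers and the tightened filter described in this commit, here is a minimal sketch. It is not the code in `utils.py`: the function bodies are assumptions reconstructed from the commit message, the names drop the leading underscore, and UNC paths are deliberately not handled.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath
from typing import Tuple

# Drive letter, colon, then a separator, e.g. "C:\foo" or "C:/foo".
_WINDOWS_ABSOLUTE = re.compile(r"^[A-Za-z]:[\\/]")


def path_looks_absolute(s: str) -> bool:
    """Accept POSIX and Windows absolute forms regardless of the host OS."""
    return s.startswith("/") or bool(_WINDOWS_ABSOLUTE.match(s))


def split_real_path_for_join(s: str) -> Tuple[str, ...]:
    """Return the segments to join under output_dir, never an anchored part."""
    if _WINDOWS_ABSOLUTE.match(s):
        p = PureWindowsPath(s)
        # Keep the drive letter as a plain directory name, colon stripped.
        return (p.drive.rstrip(":"), *p.parts[1:])
    if s.startswith("/"):
        # Drop the leading "/" so the join cannot reset to the root.
        return PurePosixPath(s).parts[1:]
    return PurePosixPath(s).parts


def flat_name_matches(name: str, filter_path: str) -> bool:
    """Prefix match on the encoded flat name, stopping at a "-" boundary."""
    return name == filter_path or name.startswith(filter_path + "-")
```

Because neither helper consults `Path` for the running platform, the same input string yields the same segments on Linux and Windows, which is the property the failing Windows CI run was missing.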
(Claude) Re: CR's 🟠 MAJOR finding on
With The other four findings (1 MAJOR-adjacent re: Windows drive handling, 3 MINOR) were addressed in the same commit; replies on each thread.
Two more path-handling bugs uncovered by Windows CI on `ced19a8`:
1. `str(WindowsPath("/home/joe"))` returns `"\\home\\joe"` on Windows
(Path stringification uses native separators). `_split_real_path_for_join`
then doesn't recognize the backslash form as POSIX-absolute and falls
through to the relative branch; `output_dir.joinpath("\\home\\joe")`
resets to drive root → destination escapes output_dir.
Fixed by using `real_path.as_posix()` instead of `str(real_path)` in
`project_destination`. `as_posix()` always returns forward slashes
regardless of host OS, so our form-aware detection works.
2. The CLI guard `Path(filter_path).is_absolute()` is host-OS-bound
too. `Path("/home/joe").is_absolute()` returns False on Windows
(no drive letter), so the test_absolute_filter_path_with_expand_is_accepted
test was getting BadParameter on Windows.
Fixed by promoting `_path_looks_absolute` → `path_looks_absolute`
(public) and reusing it in cli.py. Same form-aware logic as the
utils.py internal callers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
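Both bugs in this commit can be reproduced on any host with `pathlib`'s pure path classes, which apply Windows semantics without needing a Windows machine. The snippet below is only a demonstration of the standard-library behaviour involved; the variable names and the `C:/vault/out` directory are illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A POSIX-form cwd (as recorded on Linux) seen through Windows path semantics.
recorded = PureWindowsPath("/home/joe")

native = str(recorded)          # uses native separators: backslashes
portable = recorded.as_posix()  # always forward slashes

# Rooted but drive-less, so Windows semantics do not treat it as absolute.
windows_says_absolute = recorded.is_absolute()
posix_says_absolute = PurePosixPath("/home/joe").is_absolute()

# Joining a rooted segment discards the directory part of the base path,
# which is how the destination escaped output_dir.
escaped = PureWindowsPath("C:/vault/out").joinpath(native)
```

Here `escaped` keeps only the drive of the base path, so the project would be written under the drive root rather than under the chosen output directory.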
User-surfaced points for follow-up work, recorded in the
"Follow-up / Open points" section:
1. `--combined yes/no/only` (or `both/none/only`) — suppress combined
transcripts. Combined + per-session both is dead weight in Obsidian;
the file tree itself is the navigation surface. When suppressed,
the index links directly to `session-{id}.md` files.
2. Markdown index in `--expand-paths` mode should render the directory
hierarchy as a nested bullet list: directories as parent bullets,
sessions as nested children. Renders nicely in Obsidian preview AND
plain Markdown viewers.
3. **CRITICAL**: Markdown renderer omits per-message timestamps —
blocks cross-session narrative / episodic-memory reconstruction.
HTML already has them; needs porting to
`claude_code_log/markdown/renderer.py`. Format proposal in the doc
with concrete before/after examples. Should land BEFORE Obsidian-
friendly output sees serious narrative use; worth its own issue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
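For follow-up item 2, one possible shape of the nested-bullet index is sketched below. Nothing like this exists in the PR yet; `render_nested_index`, the two-space indent, and the link format are all assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List


def render_nested_index(session_paths: Iterable[str]) -> str:
    """Render "/"-separated relative paths as a nested Markdown bullet list."""
    # Build a tree of nested dicts; a leaf (session file) is an empty dict.
    tree: Dict[str, dict] = {}
    for path in sorted(session_paths):
        node = tree
        for part in path.split("/"):
            node = node.setdefault(part, {})

    lines: List[str] = []

    def walk(node: Dict[str, dict], depth: int, prefix: str) -> None:
        for name, child in node.items():
            indent = "  " * depth
            full = f"{prefix}/{name}" if prefix else name
            if child:
                # Directory: parent bullet, children nested one level deeper.
                lines.append(f"{indent}- {name}/")
                walk(child, depth + 1, full)
            else:
                # Session file: link relative to the index location.
                lines.append(f"{indent}- [{name}]({full})")

    walk(tree, 0, "")
    return "\n".join(lines)
```

Plain nested bullets with relative links are ordinary Markdown list syntax, which fits the stated goal of rendering in both Obsidian preview and plain Markdown viewers.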
## Summary

Three CLI flags for projecting Claude Code transcripts into the directory topology Obsidian-style Markdown vaults expect:

    claude-code-log --output ~/Documents/Obsidian/ClaudeProjects \
      --expand-paths --filter-path /home/joe \
      --format md --detail low --compact

Lands sessions at `~/Documents/Obsidian/ClaudeProjects/project/A/<session>.low.md`.

Closes #151.

## What's changed

- `--output` is now honoured for `--all-projects` (closing the silent-ignore gap). With the new suffix heuristic in `utils.output_path_is_file()`, the same flag handles both file destinations (recognised extensions: `.html` / `.md` / `.markdown` / `.json`) and directory destinations (everything else).
- `--expand-paths` undoes Claude Code's flat encoding so each project's output lands at its real on-disk path under `<output>/`.
- `--filter-path` selects projects by prefix:
  - With `--expand-paths`: matches against the resolved real path AND truncates the prefix from the destination.
  - Without `--expand-paths`: matches against the encoded flat dir name (no truncation).

## Path-projection helper

The lossy nature of Claude Code's encoding (`/` → `-`, `.` → `-`; both directions collide for inputs like `-home-joe-x-y` ↔ `/home/joe/x/y` vs `/home/joe/x-y`) made the inverse non-trivial. `utils.project_dir_to_real_path()` uses three-tier resolution:

1. Cache: `ProjectCache.working_directories[0]` (absolute paths only). Authoritative, because Claude Code recorded this `cwd` at session time.
2. JSONL peek: in the first JSONL file that is not `agent-*.jsonl`, scan up to 32 lines for an entry with a `cwd` field. Cheap (one `json.loads` per line, no model validation).
3. Naive last resort: `/`-for-`-` inversion with a `--` → `/.` mapping for dotfile dirs (`-home-joe--git` → `/home/joe/.git`).

The `Path(wd).is_absolute()` filter on tiers 1 and 2 keeps synthetic test-fixture cwds from polluting resolution.

## CLI guards

Two validations to keep silent failures out of `--filter-path` / `--expand-paths`:

- Reject a relative `--filter-path` when paired with `--expand-paths` (without the guard, `Path.relative_to(filter)` would raise `ValueError` for "argument is relative" and silently exclude every project).
- Warn when the flags are no-ops: `--tui` mode, single-file/single-project mode, `--output` unset, `--output` with a recognised file suffix.

## Cache freshness: known caveat

`cache.is_html_stale` / `is_page_stale` resolve `actual_file` against `self.project_path` (the source under `~/.claude/projects/`), not against the actual output destination. Practical implication: bouncing between several `--output` dirs on the same source always re-renders, even when the destination is already up-to-date. JSONL parsing is still cache-hit (~half the time); only rendering re-runs. Output is always correct thanks to the `not output_path.exists()` term in `process_projects_hierarchy`'s `needs_work`. Recorded as a follow-up in `work/obsidian-friendly-output.md`.

## Test plan

- `test/test_path_projection.py`: 26 unit tests covering the three resolution tiers, the disambiguation collision case, the `agent-*` skip, the absoluteness guard, the dotfile-dir naive case from the real corpus, and every cell of the `project_destination` flag matrix.
- `test/test_obsidian_output.py`: 9 integration tests (Markdown-scoped) driving `process_projects_hierarchy` and the CLI end-to-end. Covers the 5 flag-matrix shapes plus 4 CLI guard scenarios.
- `just ci` clean: 1838 collected, ruff + pyright + ty all green.
- `test/test_data/real_projects/` for legacy / `--output` / `--expand-paths` / `--filter-path` / relative-filter-path-rejection / two-runs-different-output shapes.
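The suffix heuristic mentioned under "What's changed" can be pictured as a check along these lines. This is a sketch under stated assumptions, not the body of `utils.output_path_is_file()`: the name `output_looks_like_file` is hypothetical, and exact-case matching of the four listed extensions is an assumption.

```python
from pathlib import PurePath

# Extensions the PR description lists as "recognised"; anything else is a directory.
RECOGNISED_FILE_SUFFIXES = {".html", ".md", ".markdown", ".json"}


def output_looks_like_file(output: str) -> bool:
    """Classify an --output value as a file destination by its suffix alone."""
    return PurePath(output).suffix in RECOGNISED_FILE_SUFFIXES
```

Deciding by suffix means the destination does not have to exist yet, at the cost of treating a directory literally named `notes.md` as a file.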