feat(session-historian): cross-platform session history agent and /ce-sessions skill (#534)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -24,6 +24,7 @@ The primary entry points for engineering work, invoked as slash commands:
| `/ce:work` | Execute work items systematically |
| `/ce:compound` | Document solved problems to compound team knowledge |
| `/ce:compound-refresh` | Refresh stale or drifting learnings and decide whether to keep, update, replace, or archive them |
| `/ce-sessions` | Ask questions about session history across Claude Code, Codex, and Cursor |

### Git Workflow

@@ -151,6 +152,7 @@ Agents are specialized subagents invoked by skills — you typically don't call
| `issue-intelligence-analyst` | Analyze GitHub issues to surface recurring themes and pain patterns |
| `learnings-researcher` | Search institutional learnings for relevant past solutions |
| `repo-research-analyst` | Research repository structure and conventions |
| `session-historian` | Search prior Claude Code, Codex, and Cursor sessions for related investigation context |
| `slack-researcher` | Search Slack for organizational context relevant to the current task |

### Design

@@ -0,0 +1,189 @@
---
name: session-historian
description: "Searches Claude Code, Codex, and Cursor session history for related prior sessions about the same problem or topic. Use to surface investigation context, failed approaches, and learnings from previous sessions that the current session cannot see. Supports time-based queries for conversational use."
model: inherit
---

**Note: The current year is 2026.** Use this when interpreting session timestamps.

You are an expert at extracting institutional knowledge from coding agent session history. Your mission is to find *prior sessions* about the same problem, feature, or topic across Claude Code, Codex, and Cursor, and surface what was learned, tried, and decided -- context that the current session cannot see.

This agent serves two modes of use:
- **Compound enrichment** -- dispatched by `/ce:compound` to add cross-session context to documentation
- **Conversational** -- invoked directly when someone wants to ask about past work, recent activity, or what happened in prior sessions

## Guardrails

These rules apply at all times during extraction and synthesis.

- **Never read entire session files into context.** Session files can be 1-7MB. Always use the extraction scripts below to filter first, then reason over the filtered output.
- **Never extract or reproduce tool call inputs/outputs verbatim.** Summarize what was attempted and what happened.
- **Never include thinking or reasoning block content.** Claude Code thinking blocks are internal reasoning; Codex reasoning blocks are encrypted. Neither is actionable.
- **Never analyze the current session.** Its conversation history is already available to the caller.
- **Never make claims about team dynamics or other people's work.** This is one person's session data.
- **Never write any files.** Return text findings only.
- **Surface technical content, not personal content.** Sessions contain everything — credentials, frustration, half-formed opinions. Use judgment about what belongs in a technical summary and what doesn't.
- **Never substitute other data sources when session files are inaccessible.** If session files cannot be read (permission errors, missing directories), report the limitation and what was attempted. Do not fall back to git history, commit logs, or other sources — that is a different agent's job.
- **Fail fast on access errors.** If the first extraction attempt fails on permissions, report the issue immediately. Do not retry the same operation with different tools or approaches — repeated retries waste tokens without changing the outcome.

## Why this matters

Compound documentation (`/ce:compound`) captures what happened in the current session. But problems often span multiple sessions across different tools -- a developer might investigate in Claude Code, try an approach in Codex, and fix it in a third session. Each session only sees its own conversation. This agent bridges that gap by searching across all session history.

## Time Range

The caller may specify a time range -- either explicitly ("last 3 days", "this past week", "last month") or implicitly through context ("what did I work on recently" implies a few days; "how did this feature evolve" implies the full feature branch lifetime).

Infer the time range from the request and map it to a scan window. **Start narrow** — recent sessions on the same branch are almost always sufficient. Only widen if the narrow scan finds nothing relevant and the request warrants it.

| Signal | Scan window | Codex directory strategy |
|--------|-------------|--------------------------|
| "today", "this morning" | 1 day | Current date dir only |
| "recently", "last few days", "this week", or no time signal (default) | 7 days | Last 7 date dirs |
| "last few weeks", "this month" | 30 days | Last 30 date dirs |
| "last few months", broad feature history | 90 days | Last 90 date dirs |

**Widen only when needed.** If the initial scan finds related sessions, stop there. If it comes up empty and the request suggests a longer history matters (feature evolution, recurring problem), widen to the next tier and scan again. Do not jump straight to 30 or 90 days — step through the tiers one at a time.

**When widening the time window**, re-run both discovery and metadata extraction with the new `<days>` parameter. The discovery script applies `-mtime` filtering, so files outside the original window are never returned. A wider scan requires re-running `discover-sessions.sh` with the larger day count.

**For Codex**, sessions are in date directories. A narrow window means fewer directories to list and fewer files to process.

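The stepwise widening rule can be sketched as a tiny helper. This is illustrative only — the tier list mirrors the table above, but the function names are not part of the shipped scripts.

```python
# Scan-window tiers from the table above, narrowest first (days).
TIERS = [1, 7, 30, 90]


def next_tier(current_days):
    """Return the next-wider scan window, or None if already at the widest.

    Encodes the rule: never jump straight to 30 or 90 days; step through
    the tiers one at a time and stop as soon as a scan finds sessions.
    """
    for days in TIERS:
        if days > current_days:
            return days
    return None
```

A caller would loop: scan at 7 days, and only if nothing relevant turns up, move to `next_tier(7)` and scan again.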
## Session Sources

Search Claude Code, Codex, and Cursor session history. A developer may use any combination of tools on the same project, so findings from all sources are valuable regardless of which harness is currently active.

### Claude Code

Sessions stored at `~/.claude/projects/<encoded-cwd>/<session-id>.jsonl`, where `<encoded-cwd>` replaces `/` with `-` in the working directory path (e.g., `/Users/alice/Code/my-project` becomes `-Users-alice-Code-my-project`). Claude Code retains session history for ~30 days by default. Wider scan tiers (90 days) may find nothing unless the user has extended retention. Codex and Cursor may retain longer.

Key message types:
- `type: "user"` -- Human messages. First user message includes `gitBranch` and `cwd` metadata.
- `type: "assistant"` -- Claude responses. `content` array contains `thinking`, `text`, and `tool_use` blocks.
- Tool results appear as `type: "user"` messages with `content[].type: "tool_result"`.

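A minimal sketch of reading these two pieces of metadata from one line. The JSON line is fabricated to match the shape described above; the values are illustrative, not from a real session.

```python
import json

# Fabricated first-user-message line in the Claude Code JSONL shape.
line = ('{"type": "user", "gitBranch": "feat/auth-fix",'
        ' "cwd": "/Users/alice/Code/my-project",'
        ' "timestamp": "2026-01-15T10:00:00Z"}')

obj = json.loads(line)
branch = obj.get("gitBranch")          # branch correlation signal
# The <encoded-cwd> directory name is the cwd with "/" replaced by "-".
encoded = obj["cwd"].replace("/", "-")
```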
### Codex

Sessions stored at `~/.codex/sessions/YYYY/MM/DD/<session-file>.jsonl`, organized by date. Also check `~/.agents/sessions/YYYY/MM/DD/`, as Codex may migrate to this location.

Unlike Claude Code, Codex sessions are not organized by project directory. Filter by matching the `cwd` field in `session_meta` against the current working directory.

Key message types:
- `session_meta` -- Contains `cwd`, session `id`, `source`, `cli_version`.
- `turn_context` -- Contains `cwd`, `model`, `current_date`.
- `event_msg/user_message` -- User message text.
- `response_item/message` with `role: "assistant"` -- Assistant text in `output_text` blocks.
- `event_msg/exec_command_end` -- Command execution results with exit codes.
- Codex does not store git branch in session metadata. Correlation relies on CWD matching and keyword search.

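The CWD-matching rule can be sketched as follows. The `session_meta` lines are fabricated in the shape described above; the ids and paths are illustrative.

```python
import json

# Fabricated session_meta lines in the Codex JSONL shape.
raw_lines = [
    '{"type": "session_meta", "payload": {"cwd": "/Users/alice/Code/my-project", "id": "s1"}}',
    '{"type": "session_meta", "payload": {"cwd": "/Users/alice/Code/other-repo", "id": "s2"}}',
]

current_cwd = "/Users/alice/Code/my-project"
matching = []
for raw in raw_lines:
    obj = json.loads(raw)
    # Keep only sessions whose recorded cwd matches the current project.
    if obj.get("type") == "session_meta" and obj["payload"].get("cwd") == current_cwd:
        matching.append(obj["payload"]["id"])
```

In practice `extract-metadata.py --cwd-filter` performs this filtering at the script level.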
### Cursor

Agent transcripts stored at `~/.cursor/projects/<encoded-cwd>/agent-transcripts/<session-id>/<session-id>.jsonl`. Same CWD-encoding as Claude Code.

Limitations compared to Claude Code and Codex:
- No timestamps in the JSONL — file modification date is the only time signal.
- No git branch, session ID, or CWD metadata in the data — derived from directory structure.
- No tool results logged — tool calls are captured but not their outcomes (no success/fail signal).
- `[REDACTED]` markers appear where Cursor stripped thinking/reasoning content.

Key message types:
- `role: "user"` -- User messages. Text wrapped in `<user_query>` tags (stripped by extraction scripts).
- `role: "assistant"` -- Assistant responses. Same `content` array structure as Claude Code (`text`, `tool_use` blocks).

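Stripping the `<user_query>` wrapper is a one-line regex, sketched here on a fabricated entry (the question text is illustrative):

```python
import re

# Fabricated Cursor user entry; real transcripts wrap the user's text in
# <user_query> tags, which the extraction scripts strip.
raw = "<user_query>why does the login test flake?</user_query>"
text = re.sub(r"</?user_query[^>]*>", "", raw).strip()
```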
## Extraction Scripts

**Execute scripts by path, not by reading them into context.** Locate the `session-history-scripts/` directory relative to this agent file using the native file-search tool (e.g., Glob), then run scripts directly. Do not use the Read tool to load script content and pass it via `python3 -c`.

Scripts:

- `discover-sessions.sh` -- Discovers session files across all platforms. Handles directory structures, mtime filtering, repo-name matching, and zsh glob safety. Usage: `bash <script-dir>/discover-sessions.sh <repo-name> <days> [--platform claude|codex|cursor]`
- `extract-metadata.py` -- Extracts session metadata. Batch mode: pass file paths as arguments. Pass `--cwd-filter <repo-name>` to filter Codex sessions at the script level. Usage: `bash <script-dir>/discover-sessions.sh <repo-name> <days> | tr '\n' '\0' | xargs -0 python3 <script-dir>/extract-metadata.py --cwd-filter <repo-name>`
- `extract-skeleton.py` -- Extracts the conversation skeleton: user messages, assistant text, and collapsed tool call summaries. Filters out raw tool inputs/outputs, thinking/reasoning blocks, and framework wrapper tags. Usage: `cat <file> | python3 <script-dir>/extract-skeleton.py`
- `extract-errors.py` -- Extracts error signals. Claude Code: tool results with `is_error`. Codex: commands with non-zero exit codes. Cursor: no error extraction possible. Usage: `cat <file> | python3 <script-dir>/extract-errors.py`

Python scripts output a `_meta` line at the end with `files_processed` and `parse_errors` counts. When `parse_errors > 0`, note in the response that extraction was partial.

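Consuming that output means splitting the trailing `_meta` line from the per-session objects. A minimal sketch, using fabricated output lines in the shape the scripts emit:

```python
import json

# Fabricated script output: one metadata object, then the trailing _meta line.
output_lines = [
    '{"platform": "claude", "branch": "feat/auth-fix", "file": "a.jsonl"}',
    '{"_meta": true, "files_processed": 3, "parse_errors": 1}',
]

sessions, meta = [], None
for raw in output_lines:
    obj = json.loads(raw)
    if obj.get("_meta"):
        meta = obj          # processing stats, always the last line
    else:
        sessions.append(obj)  # per-session metadata

partial = meta["parse_errors"] > 0  # if True, note that extraction was partial
```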
## Methodology

### Step 1: Determine scope

**Scope decision.** Two dimensions to resolve before scanning:

- **Project scope**: Default to the current project. Widen to all projects only when the question explicitly asks about other projects.
- **Platform scope**: Default to all platforms (Claude Code, Codex, Cursor). Narrow to a single platform when the question specifies one. If unclear on either dimension, use the default.

Determine the scan window from the Time Range table above.

### Step 2: Discover sessions

**Derive the repo name** using a worktree-safe approach: check `git rev-parse --git-common-dir` first — in a normal checkout it returns `.git` (use `--show-toplevel` to get the repo root), but in a linked worktree it returns the absolute path to the main repo's `.git` directory (use `dirname` on that path to get the repo root). In either case, `basename` the result to get the repo name. Example: `common=$(git rev-parse --git-common-dir 2>/dev/null); if [ "$common" = ".git" ]; then basename "$(git rev-parse --show-toplevel 2>/dev/null)"; else basename "$(dirname "$common")"; fi`. If the repo name was pre-resolved in the dispatch prompt, use that instead.

**Discover session files using the discovery script.** `session-history-scripts/discover-sessions.sh` handles all platform-specific directory structures, mtime filtering, and zsh glob safety. Run it by path (do not read it into context):

```bash
bash <script-dir>/discover-sessions.sh <repo-name> <days>
```

This outputs one file path per line across all platforms. To restrict to a single platform, pass `--platform claude|codex|cursor`. Pass the output to the metadata script with `--cwd-filter` to filter Codex sessions by repo name:

```bash
bash <script-dir>/discover-sessions.sh <repo-name> <days> | tr '\n' '\0' | xargs -0 python3 <script-dir>/extract-metadata.py --cwd-filter <repo-name>
```

If no files are found, return: "No session history found within the requested time range." If the `_meta` line shows `parse_errors > 0`, note that some sessions could not be parsed.

### Step 3: Identify related sessions

Correlate sessions to the current problem using these signals (in priority order):

1. **Same git branch** (Claude Code) -- Sessions on the same branch are almost certainly about the same feature/problem. Strongest signal.
2. **Same CWD** (Codex) -- Sessions in the same working directory are likely the same project.
3. **Related branch names** -- Branches with overlapping keywords (e.g., `feat/auth-fix` and `feat/auth-refactor`).
4. **Keyword matching** -- If the caller provides topic keywords, search session user messages for those terms.

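Signal 3 amounts to a keyword-overlap check on branch names. A sketch, assuming a small stopword list of common branch prefixes (the list is an assumption, not part of the shipped scripts):

```python
import re

# Assumed stopwords: common branch-type prefixes that carry no topic signal.
STOPWORDS = {"feat", "fix", "chore", "refactor", "bug"}


def branch_keywords(branch):
    """Split a branch name like 'feat/auth-fix' into its distinctive keywords."""
    return set(re.split(r"[/\-_]+", branch.lower())) - STOPWORDS


def related(a, b):
    """Two branches are related if they share at least one keyword."""
    return bool(branch_keywords(a) & branch_keywords(b))
```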
**Exclude the current session** -- its conversation history is already available to the caller.

**Drop sessions outside the scan window before selecting.** A session is within the window if it was active during that period — use `last_ts` (session end) when available, fall back to `ts` (session start). A session that started 10 days ago but ended 2 days ago IS within a 7-day window. Discard sessions where both `ts` and `last_ts` fall before the window start. Do not carry forward old sessions just because they exist — a 20-day-old session with no recent activity is irrelevant regardless of how relevant its branch looks.

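The window rule above can be sketched as a predicate over the metadata objects (`ts`, optional `last_ts`). This is illustrative; it assumes offset-style ISO 8601 timestamps — real `Z`-suffixed timestamps may need normalizing before `fromisoformat` on older Pythons.

```python
from datetime import datetime, timedelta, timezone


def within_window(session, window_days, now=None):
    """True if the session's last activity falls inside the scan window.

    Uses last_ts (session end) when available, falling back to ts
    (session start), per the rule above.
    """
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=window_days)
    ts = session.get("last_ts") or session.get("ts")
    return ts is not None and datetime.fromisoformat(ts) >= start
```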
From the remaining sessions, select the most relevant (typically 2-5 total across sources). Prefer sessions that are:
- Strongly correlated (same branch or same CWD)
- Substantive (file size > 30KB suggests meaningful work)

### Step 4: Extract conversation skeleton

For each selected session, run the skeleton extraction script. Pipe the output through `head -200` to cap the skeleton at 200 lines per session. Large sessions (4MB+) can produce 500-700 skeleton lines — the opening turns establish the topic and the final turns show the conclusion, but the middle is often repetitive tool call cycles. 200 lines is enough to understand the narrative arc without flooding context.

If the truncated skeleton doesn't cover the session's conclusion, extract the tail separately: `cat <file> | python3 <script-dir>/extract-skeleton.py | tail -50`.

### Step 5: Extract error signals (selective)

For sessions where investigation dead-ends are likely valuable, run the error extraction script. Use this selectively -- only when understanding what went wrong adds value.

### Step 6: Synthesize findings

Reason over the extracted conversation skeletons and error signals from all sources.

Look for:

- **Investigation journey** -- What approaches were tried? What failed and why? What led to the eventual solution?
- **User corrections** -- Moments where the user redirected the approach. These reveal what NOT to do and why.
- **Decisions and rationale** -- Why one approach was chosen over alternatives.
- **Error patterns** -- Recurring errors across sessions that indicate a systemic issue.
- **Evolution across sessions** -- How understanding of the problem changed from session to session, potentially across different tools.
- **Cross-tool blind spots** -- When findings come from more than one tool, look for things the user might not realize from any single tool alone. This could be complementary work (one tool tackled the schema while another tackled the API), duplicated effort (same approach tried in two tools days apart), or gaps (no tool's sessions touched a component that connects the work). Only mention cross-tool observations when they're genuinely informative — if the sources tell the same story, there's nothing to call out.
- **Staleness** -- Older sessions may reflect conclusions about code that has since changed. When surfacing findings from sessions more than a few days old, consider whether the relevant code or context is likely to have moved on. Caveat older findings when appropriate rather than presenting them with the same confidence as recent ones.

## Output

**If the caller specifies an output format**, use it. The dispatching skill or user knows what structure serves their workflow best. Follow their format instructions and do not add extra sections.

**If no format is specified**, respond in whatever way best answers the question. Include a brief header noting what was searched:

```
**Sessions searched**: [count] ([N] Claude Code, [N] Codex, [N] Cursor) | [date range]
```

## Tool Guidance

- Use shell commands piped through Python for JSONL extraction via the scripts described above.
- Use native file-search (e.g., Glob in Claude Code) to list session files.
- Use native content-search (e.g., Grep in Claude Code) when searching for specific keywords across session files.
@@ -0,0 +1,81 @@
#!/usr/bin/env bash
# Discover session files across Claude Code, Codex, and Cursor.
#
# Usage: discover-sessions.sh <repo-name> <days> [--platform claude|codex|cursor]
#
# Outputs one file path per line. Safe in both bash and zsh (all globs guarded).
# Pass output to extract-metadata.py:
#   bash discover-sessions.sh <repo-name> 7 | tr '\n' '\0' | xargs -0 python3 extract-metadata.py --cwd-filter <repo-name>
#
# Arguments:
#   repo-name   Folder name of the repo (e.g., "my-repo"). Used for directory matching.
#   days        Scan window in days (e.g., 7). Files older than this are skipped.
#   --platform  Restrict to a single platform. Omit to search all.

set -euo pipefail

REPO_NAME="${1:?Usage: discover-sessions.sh <repo-name> <days> [--platform claude|codex|cursor]}"
DAYS="${2:?Usage: discover-sessions.sh <repo-name> <days> [--platform claude|codex|cursor]}"
PLATFORM="all"

# Parse optional --platform flag
shift 2
while [ $# -gt 0 ]; do
  case "$1" in
    --platform) PLATFORM="$2"; shift 2 ;;
    *) shift ;;
  esac
done

# --- Claude Code ---
discover_claude() {
  local base="$HOME/.claude/projects"
  [ -d "$base" ] || return 0

  # Find all project dirs matching repo name
  for dir in "$base"/*"$REPO_NAME"*/; do
    [ -d "$dir" ] || continue
    find "$dir" -maxdepth 1 -name "*.jsonl" -mtime "-${DAYS}" 2>/dev/null
  done
}

# --- Codex ---
discover_codex() {
  for base in "$HOME/.codex/sessions" "$HOME/.agents/sessions"; do
    [ -d "$base" ] || continue

    # Use mtime-based discovery (consistent with Claude/Cursor) so that
    # sessions started before the scan window but still active within it
    # are not missed.
    find "$base" -name "*.jsonl" -mtime "-${DAYS}" 2>/dev/null
  done
}

# --- Cursor ---
discover_cursor() {
  local base="$HOME/.cursor/projects"
  [ -d "$base" ] || return 0

  for dir in "$base"/*"$REPO_NAME"*/; do
    [ -d "$dir" ] || continue
    local transcripts="$dir/agent-transcripts"
    [ -d "$transcripts" ] || continue
    find "$transcripts" -name "*.jsonl" -mtime "-${DAYS}" 2>/dev/null
  done
}

# --- Dispatch ---
case "$PLATFORM" in
  claude) discover_claude ;;
  codex) discover_codex ;;
  cursor) discover_cursor ;;
  all)
    discover_claude
    discover_codex
    discover_cursor
    ;;
  *)
    echo "Unknown platform: $PLATFORM" >&2
    exit 1
    ;;
esac

@@ -0,0 +1,104 @@
#!/usr/bin/env python3
"""Extract error signals from a Claude Code, Codex, or Cursor JSONL session file.

Usage: cat <session.jsonl> | python3 extract-errors.py

Auto-detects platform from the JSONL structure.
Note: Cursor agent transcripts do not log tool results, so no errors can be extracted.
Finds failed tool calls / commands and outputs them with timestamps.
Outputs a _meta line at the end with processing stats.
"""
import sys
import json

stats = {"lines": 0, "parse_errors": 0, "errors_found": 0}


def summarize_error(raw):
    """Extract a short error summary instead of dumping the full payload."""
    text = str(raw).strip()
    # Take the first non-empty line as the error message
    for line in text.split("\n"):
        line = line.strip()
        if line:
            return line[:200]
    return text[:200]


def handle_claude(obj):
    if obj.get("type") == "user":
        content = obj.get("message", {}).get("content", [])
        if isinstance(content, list):
            for block in content:
                if block.get("type") == "tool_result" and block.get("is_error"):
                    ts = obj.get("timestamp", "")[:19]
                    summary = summarize_error(block.get("content", ""))
                    print(f"[{ts}] [error] {summary}")
                    print("---")
                    stats["errors_found"] += 1


def handle_codex(obj):
    if obj.get("type") == "event_msg":
        p = obj.get("payload", {})
        if p.get("type") == "exec_command_end":
            output = p.get("aggregated_output", "")
            stderr = p.get("stderr", "")
            command = p.get("command", [])
            cmd_str = command[-1] if command else ""

            exit_match = None
            if "Process exited with code " in output:
                try:
                    code_str = output.split("Process exited with code ")[1].split("\n")[0]
                    exit_code = int(code_str)
                    if exit_code != 0:
                        exit_match = exit_code
                except (IndexError, ValueError):
                    pass

            if exit_match is not None or stderr:
                ts = obj.get("timestamp", "")[:19]
                error_summary = summarize_error(stderr if stderr else output)
                print(f"[{ts}] [error] exit={exit_match} cmd={cmd_str[:120]}: {error_summary}")
                print("---")
                stats["errors_found"] += 1


# Auto-detect platform from first few lines, then process all
detected = None
buffer = []

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    buffer.append(line)
    stats["lines"] += 1

    if not detected and len(buffer) <= 10:
        try:
            obj = json.loads(line)
            if obj.get("type") in ("user", "assistant"):
                detected = "claude"
            elif obj.get("type") in ("session_meta", "turn_context", "response_item", "event_msg"):
                detected = "codex"
            elif obj.get("role") in ("user", "assistant") and "type" not in obj:
                detected = "cursor"
        except (json.JSONDecodeError, KeyError):
            pass

# Cursor transcripts don't log tool results — no errors to extract
def handle_noop(obj):
    pass


handlers = {"claude": handle_claude, "codex": handle_codex, "cursor": handle_noop}
handler = handlers.get(detected, handle_noop)

for line in buffer:
    try:
        handler(json.loads(line))
    except (json.JSONDecodeError, KeyError):
        stats["parse_errors"] += 1

print(json.dumps({"_meta": True, **stats}))

@@ -0,0 +1,187 @@
#!/usr/bin/env python3
"""Extract session metadata from Claude Code, Codex, and Cursor JSONL files.

Batch mode (preferred — one invocation for all files):
    python3 extract-metadata.py /path/to/dir/*.jsonl
    python3 extract-metadata.py file1.jsonl file2.jsonl file3.jsonl

Single-file mode (stdin):
    head -20 <session.jsonl> | python3 extract-metadata.py

Auto-detects platform from the JSONL structure.
Outputs one JSON object per file, one per line.
Includes a final _meta line with processing stats.
"""
import sys
import json
import os

MAX_LINES = 25  # Only need first ~25 lines for metadata


def try_claude(lines):
    for line in lines:
        try:
            obj = json.loads(line.strip())
            if obj.get("type") == "user" and "gitBranch" in obj:
                return {
                    "platform": "claude",
                    "branch": obj["gitBranch"],
                    "ts": obj.get("timestamp", ""),
                    "session": obj.get("sessionId", ""),
                }
        except (json.JSONDecodeError, KeyError):
            pass
    return None


def try_codex(lines):
    meta = {}
    for line in lines:
        try:
            obj = json.loads(line.strip())
            if obj.get("type") == "session_meta":
                p = obj.get("payload", {})
                meta["platform"] = "codex"
                meta["cwd"] = p.get("cwd", "")
                meta["session"] = p.get("id", "")
                meta["ts"] = p.get("timestamp", obj.get("timestamp", ""))
                meta["source"] = p.get("source", "")
                meta["cli_version"] = p.get("cli_version", "")
            elif obj.get("type") == "turn_context":
                p = obj.get("payload", {})
                meta["model"] = p.get("model", "")
                meta["cwd"] = meta.get("cwd") or p.get("cwd", "")
        except (json.JSONDecodeError, KeyError):
            pass
    return meta if meta else None


def try_cursor(lines):
    """Cursor agent transcripts: role-based entries, no timestamps or metadata fields."""
    for line in lines:
        try:
            obj = json.loads(line.strip())
            # Cursor entries have 'role' at top level but no 'type'
            if obj.get("role") in ("user", "assistant") and "type" not in obj:
                return {"platform": "cursor"}
        except (json.JSONDecodeError, KeyError):
            pass
    return None


def extract_from_lines(lines):
    return try_claude(lines) or try_codex(lines) or try_cursor(lines)


TAIL_BYTES = 16384  # Read last 16KB to find final timestamp past trailing metadata


def get_last_timestamp(filepath, size):
    """Read the tail of a file to find the last message with a timestamp."""
    try:
        with open(filepath, "rb") as f:
            f.seek(max(0, size - TAIL_BYTES))
            tail = f.read().decode("utf-8", errors="ignore")
        lines = tail.strip().split("\n")
        for line in reversed(lines):
            try:
                obj = json.loads(line.strip())
                if "timestamp" in obj:
                    return obj["timestamp"]
            except (json.JSONDecodeError, KeyError):
                pass
    except (OSError, IOError):
        pass
    return None


def process_file(filepath):
    try:
        size = os.path.getsize(filepath)
        with open(filepath, "r") as f:
            lines = []
            for i, line in enumerate(f):
                if i >= MAX_LINES:
                    break
                lines.append(line)
        result = extract_from_lines(lines)
        if result:
            result["file"] = filepath
            result["size"] = size
            if result["platform"] == "cursor":
                # Cursor transcripts have no timestamps in JSONL.
                # Use file modification time as the best available signal.
                # Derive session ID from the parent directory name (UUID).
                mtime = os.path.getmtime(filepath)
                from datetime import datetime, timezone

                result["ts"] = datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()
                result["session"] = os.path.basename(os.path.dirname(filepath))
            else:
                last_ts = get_last_timestamp(filepath, size)
                if last_ts:
                    result["last_ts"] = last_ts
            return result, None
        else:
            return None, filepath
    except (OSError, IOError):
        return None, filepath


# Parse arguments: files and optional --cwd-filter <substring>
files = []
cwd_filter = None
args = sys.argv[1:]
i = 0
while i < len(args):
    if args[i] == "--cwd-filter" and i + 1 < len(args):
        cwd_filter = args[i + 1]
        i += 2
    elif not args[i].startswith("-"):
        files.append(args[i])
        i += 1
    else:
        i += 1

if files:
    # Batch mode: process all files
    processed = 0
    parse_errors = 0
    filtered = 0
    for filepath in files:
        if not filepath.endswith(".jsonl"):
            continue
        result, error = process_file(filepath)
        processed += 1
        if result:
            # Apply CWD filter: skip Codex sessions from other repos
            if cwd_filter and result.get("cwd") and cwd_filter not in result["cwd"]:
                filtered += 1
                continue
            print(json.dumps(result))
        elif error:
            parse_errors += 1

    meta = {"_meta": True, "files_processed": processed, "parse_errors": parse_errors}
    if filtered:
        meta["filtered_by_cwd"] = filtered
    print(json.dumps(meta))
else:
    # No file arguments: either single-file stdin mode or empty xargs invocation.
    # When xargs runs us with no input (e.g., discover found no files), stdin is
    # empty or a TTY — emit a clean zero-file result instead of a false parse error.
    if sys.stdin.isatty():
        lines = []
    else:
        lines = list(sys.stdin)

    if not lines:
        # No input at all — zero-file result (clean exit for empty pipelines)
        print(json.dumps({"_meta": True, "files_processed": 0, "parse_errors": 0}))
    else:
        # Genuine single-file stdin mode (backward compatible)
        result = extract_from_lines(lines)
        if result:
            print(json.dumps(result))
        print(json.dumps({"_meta": True, "files_processed": 1, "parse_errors": 0 if result else 1}))

@@ -0,0 +1,317 @@
#!/usr/bin/env python3
"""Extract the conversation skeleton from a Claude Code, Codex, or Cursor JSONL session file.

Usage: cat <session.jsonl> | python3 extract-skeleton.py

Auto-detects platform (Claude Code, Codex, or Cursor) from the JSONL structure.
Extracts:
- User messages (text only, no tool results)
- Assistant text (no thinking/reasoning blocks)
- Collapsed tool call summaries (consecutive same-tool calls grouped)

Consecutive tool calls of the same type are collapsed:
    3+ Read calls -> "[tools] 3x Read (file1, file2, +1 more) -> all ok"
Codex call/result pairs are deduplicated (only the result with status is kept).
Outputs a _meta line at the end with processing stats.
"""
import sys
import json
import re

stats = {"lines": 0, "parse_errors": 0, "user": 0, "assistant": 0, "tool": 0}

# Claude Code wrapper tags to strip from user message content.
|
||||
# Strip entirely (tag + content): framework noise and raw command output.
|
||||
# Strip tags only (keep content): command-message, command-name, command-args, user_query.
|
||||
_STRIP_BLOCK = re.compile(
|
||||
r"<(?:task-notification|local-command-caveat|local-command-stdout|local-command-stderr|system-reminder)[^>]*>.*?</(?:task-notification|local-command-caveat|local-command-stdout|local-command-stderr|system-reminder)>",
|
||||
re.DOTALL,
|
||||
)
|
||||
_STRIP_TAG = re.compile(
|
||||
r"</?(?:command-message|command-name|command-args|user_query)[^>]*>"
|
||||
)
|
||||
|
||||
|
||||
def clean_text(text):
|
||||
"""Strip framework wrapper tags from message text (Claude and Cursor)."""
|
||||
text = _STRIP_BLOCK.sub("", text)
|
||||
text = _STRIP_TAG.sub("", text)
|
||||
text = re.sub(r"\n{3,}", "\n\n", text).strip()
|
||||
return text
|
||||
|
||||
# Buffer for pending tool entries: [{"ts", "name", "target", "status"}]
pending_tools = []


def flush_tools():
    """Print buffered tool entries, collapsing consecutive same-name groups."""
    if not pending_tools:
        return

    # Group consecutive entries by tool name
    groups = []
    for entry in pending_tools:
        if groups and groups[-1][0]["name"] == entry["name"]:
            groups[-1].append(entry)
        else:
            groups.append([entry])

    for group in groups:
        name = group[0]["name"]
        if len(group) <= 2:
            # Print individually
            for e in group:
                status = f" -> {e['status']}" if e.get("status") else ""
                ts_prefix = f"[{e['ts']}] " if e.get("ts") else ""
                print(f"{ts_prefix}[tool] {name} {e['target']}{status}")
                stats["tool"] += 1
        else:
            # Collapse
            ts = group[0].get("ts", "")
            targets = [e["target"] for e in group if e.get("target")]
            ok = sum(1 for e in group if e.get("status") == "ok")
            err = sum(1 for e in group if e.get("status") and e["status"] != "ok")
            no_status = len(group) - ok - err

            # Show first 2 targets, then "+N more"
            if len(targets) > 2:
                target_str = ", ".join(targets[:2]) + f", +{len(targets) - 2} more"
            elif targets:
                target_str = ", ".join(targets)
            else:
                target_str = ""

            if no_status == len(group):
                status_str = ""
            elif err == 0:
                status_str = " -> all ok"
            else:
                status_str = f" -> {ok} ok, {err} error"

            ts_prefix = f"[{ts}] " if ts else ""
            print(f"{ts_prefix}[tools] {len(group)}x {name} ({target_str}){status_str}")
            stats["tool"] += len(group)

    pending_tools.clear()


def summarize_claude_tool(block):
    """Extract name and target from a Claude Code tool_use block."""
    name = block.get("name", "unknown")
    inp = block.get("input", {})
    target = (
        inp.get("file_path")
        or inp.get("path")
        or inp.get("command", "")[:120]
        or inp.get("pattern", "")
        or inp.get("query", "")[:80]
        or inp.get("prompt", "")[:80]
        or ""
    )
    if isinstance(target, str) and len(target) > 120:
        target = target[:120]
    return name, target


def handle_claude(obj):
    msg_type = obj.get("type")
    ts = obj.get("timestamp", "")[:19]

    if msg_type == "user":
        msg = obj.get("message", {})
        content = msg.get("content", "")

        if isinstance(content, list):
            for block in content:
                if block.get("type") == "tool_result":
                    is_error = block.get("is_error", False)
                    status = "error" if is_error else "ok"
                    tool_use_id = block.get("tool_use_id")
                    matched = False
                    if tool_use_id:
                        for entry in pending_tools:
                            if entry.get("id") == tool_use_id:
                                entry["status"] = status
                                matched = True
                                break
                    if not matched:
                        # Fallback: assign to earliest pending entry without a status
                        for entry in pending_tools:
                            if not entry.get("status"):
                                entry["status"] = status
                                break

            texts = [
                c.get("text", "")
                for c in content
                if c.get("type") == "text" and len(c.get("text", "")) > 10
            ]
            content = " ".join(texts)

        if isinstance(content, str):
            content = clean_text(content)
            if len(content) > 15:
                flush_tools()
                print(f"[{ts}] [user] {content[:800]}")
                print("---")
                stats["user"] += 1

    elif msg_type == "assistant":
        msg = obj.get("message", {})
        content = msg.get("content", [])
        if isinstance(content, list):
            has_text = False
            for block in content:
                if block.get("type") == "text":
                    text = clean_text(block.get("text", ""))
                    if len(text) > 20:
                        if not has_text:
                            flush_tools()
                            has_text = True
                        print(f"[{ts}] [assistant] {text[:800]}")
                        print("---")
                        stats["assistant"] += 1
                elif block.get("type") == "tool_use":
                    name, target = summarize_claude_tool(block)
                    entry = {"ts": ts, "name": name, "target": target}
                    tool_id = block.get("id")
                    if tool_id:
                        entry["id"] = tool_id
                    pending_tools.append(entry)


def handle_codex(obj):
    msg_type = obj.get("type")
    ts = obj.get("timestamp", "")[:19]

    if msg_type == "event_msg":
        p = obj.get("payload", {})
        if p.get("type") == "user_message":
            text = p.get("message", "")
            if isinstance(text, str) and len(text) > 15:
                parts = text.split("</system_instruction>")
                user_text = parts[-1].strip() if parts else text
                if len(user_text) > 15:
                    flush_tools()
                    print(f"[{ts}] [user] {user_text[:800]}")
                    print("---")
                    stats["user"] += 1

        elif p.get("type") == "exec_command_end":
            # This is the deduplicated result — has status info
            command = p.get("command", [])
            cmd_str = command[-1] if command else ""
            output = p.get("aggregated_output", "")

            status = "ok"
            if "Process exited with code " in output:
                try:
                    code = int(output.split("Process exited with code ")[1].split("\n")[0])
                    if code != 0:
                        status = f"error(exit {code})"
                except (IndexError, ValueError):
                    pass

            if cmd_str:
                # Shorten common patterns for readability
                short_cmd = cmd_str[:120]
                pending_tools.append({"ts": ts, "name": "exec", "target": short_cmd, "status": status})

    elif msg_type == "response_item":
        p = obj.get("payload", {})
        if p.get("type") == "message" and p.get("role") == "assistant":
            for block in p.get("content", []):
                if block.get("type") == "output_text" and len(block.get("text", "")) > 20:
                    flush_tools()
                    print(f"[{ts}] [assistant] {block['text'][:800]}")
                    print("---")
                    stats["assistant"] += 1

        # Skip function_call — exec_command_end is the deduplicated version with status


def handle_cursor(obj):
    """Cursor agent transcripts: role-based, no timestamps, same content structure as Claude."""
    role = obj.get("role")
    content = obj.get("message", {}).get("content", [])

    if role == "user":
        texts = []
        for block in (content if isinstance(content, list) else []):
            if block.get("type") == "text":
                texts.append(block.get("text", ""))
        text = clean_text(" ".join(texts))
        if len(text) > 15:
            flush_tools()
            # No timestamps available in Cursor transcripts
            print(f"[user] {text[:800]}")
            print("---")
            stats["user"] += 1

    elif role == "assistant":
        has_text = False
        for block in (content if isinstance(content, list) else []):
            if block.get("type") == "text":
                text = block.get("text", "")
                # Skip [REDACTED] placeholder blocks
                if len(text) > 20 and text.strip() != "[REDACTED]":
                    if not has_text:
                        flush_tools()
                        has_text = True
                    print(f"[assistant] {text[:800]}")
                    print("---")
                    stats["assistant"] += 1
            elif block.get("type") == "tool_use":
                name = block.get("name", "unknown")
                inp = block.get("input", {})
                target = (
                    inp.get("path")
                    or inp.get("file_path")
                    or inp.get("command", "")[:120]
                    or inp.get("pattern", "")
                    or inp.get("glob_pattern", "")
                    or inp.get("target_directory", "")
                    or ""
                )
                if isinstance(target, str) and len(target) > 120:
                    target = target[:120]
                # No status info available — Cursor doesn't log tool results
                pending_tools.append({"ts": "", "name": name, "target": target})


# Auto-detect platform from first few lines, then process all
detected = None
buffer = []

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    buffer.append(line)
    stats["lines"] += 1

    if not detected and len(buffer) <= 10:
        try:
            obj = json.loads(line)
            if obj.get("type") in ("user", "assistant"):
                detected = "claude"
            elif obj.get("type") in ("session_meta", "turn_context", "response_item", "event_msg"):
                detected = "codex"
            elif obj.get("role") in ("user", "assistant") and "type" not in obj:
                detected = "cursor"
        except (json.JSONDecodeError, KeyError):
            pass

handlers = {"claude": handle_claude, "codex": handle_codex, "cursor": handle_cursor}
handler = handlers.get(detected, handle_codex)

for line in buffer:
    try:
        handler(json.loads(line))
    except (json.JSONDecodeError, KeyError):
        stats["parse_errors"] += 1

# Flush any remaining buffered tools
flush_tools()

print(json.dumps({"_meta": True, **stats}))
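The collapsing rule the extractor's docstring describes (runs of 3+ consecutive same-name calls fold into one summary line) can be sketched standalone. This is a simplified illustration with a hypothetical `collapse` helper, omitting the timestamp and status bookkeeping the real script does:

```python
from itertools import groupby

def collapse(calls):
    """calls: list of (tool_name, target) tuples in order.
    Runs of 3+ consecutive same-name calls become one summary line."""
    out = []
    for name, grp in groupby(calls, key=lambda c: c[0]):
        grp = list(grp)
        if len(grp) <= 2:
            # Short runs are printed individually
            out.extend(f"[tool] {n} {t}" for n, t in grp)
        else:
            # Long runs: first two targets, then "+N more"
            targets = [t for _, t in grp][:2]
            extra = len(grp) - len(targets)
            out.append(f"[tools] {len(grp)}x {name} ({', '.join(targets)}, +{extra} more)")
    return out

lines = collapse([("Read", "a.py"), ("Read", "b.py"), ("Read", "c.py"), ("Bash", "ls")])
for line in lines:
    print(line)
```

`itertools.groupby` groups only *consecutive* equal keys, which is exactly the grouping the real `flush_tools` builds by hand with its running-list comparison.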
@@ -47,6 +47,16 @@ Present the user with two options before proceeding, using the platform's blocki

Do NOT pre-select a mode. Do NOT skip this prompt. Wait for the user's choice before proceeding.

**If the user chooses Full**, ask one follow-up question before proceeding. Detect which harness is running (Claude Code, Codex, or Cursor) and ask:

```
Would you also like to search your [harness name] session history
for relevant knowledge to help the Compound process? This adds
time and token usage.
```

If the user says yes, dispatch the Session Historian in Phase 1. If no, skip it. Do not ask this in lightweight mode.

---

### Full Mode
@@ -78,12 +88,17 @@ and codebase findings take priority over these notes.

If no relevant entries are found, proceed to Phase 1 without passing memory context.

### Phase 1: Research

Launch research subagents. Each returns text data to the orchestrator.

**Dispatch order:**
- Launch `Context Analyzer`, `Solution Extractor`, and `Related Docs Finder` in parallel (background)
- Then dispatch `session-historian` in foreground — it reads session files outside the working directory that background agents may not have access to
- The foreground dispatch runs while the background agents work, adding no wall-clock time

<parallel_tasks>

Launch these subagents IN PARALLEL. Each returns text data to the orchestrator.

#### 1. **Context Analyzer**
- Extracts conversation history
- Reads `references/schema.yaml` for enum validation and **track classification**
@@ -151,6 +166,29 @@ Launch these subagents IN PARALLEL. Each returns text data to the orchestrator.

</parallel_tasks>

#### 4. **Session Historian** (foreground, after launching the above — only if the user opted in)
- **Skip entirely** if the user declined session history in the follow-up question
- Dispatched as `compound-engineering:research:session-historian`
- Dispatch in **foreground** — this agent reads session files outside the working directory (`~/.claude/projects/`, `~/.codex/sessions/`, `~/.cursor/projects/`) which background agents may not have access to
- Searches prior Claude Code, Codex, and Cursor sessions for the same project to find related investigation context
- Correlates sessions by repo name across all platforms (matches sessions from main checkouts, worktrees, and Conductor workspaces)
- In the dispatch prompt, pass:
  - A specific description of the problem being documented — not a generic topic, but the concrete issue (error messages, module names, what broke and how it was fixed). This is what the agent filters its findings against.
  - The current git branch and working directory
  - The instruction: "Only surface findings from prior sessions that are directly relevant to this specific problem. Ignore unrelated work from the same sessions or branches."
  - The output format:

    ```
    Structure your response with these sections (omit any with no findings):
    - What was tried before: prior approaches to this specific problem
    - What didn't work: failed attempts at this problem from prior sessions
    - Key decisions: choices made about this problem and their rationale
    - Related context: anything else from prior sessions that directly informs this problem's documentation
    ```

- Omit the `mode` parameter so the user's configured permission settings apply
- Dispatch on the mid-tier model (e.g., `model: "sonnet"` in Claude Code) — the synthesis feeds into compound assembly and doesn't need frontier reasoning
- Returns: structured digest of findings from prior sessions, or "no relevant prior sessions" if none found
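Assembled per the checklist above, a dispatch prompt might read like this (every value below is illustrative, not from a real session):

```
Search prior sessions for context on this problem:
N+1 query storm in BriefSystem#generate, fixed by preloading the
brief_items association. (Hypothetical example problem.)

Branch: feat/brief-performance
Working directory: /Users/dev/projects/example-app

Only surface findings from prior sessions that are directly relevant
to this specific problem. Ignore unrelated work from the same sessions
or branches.

Structure your response with these sections (omit any with no findings):
- What was tried before: prior approaches to this specific problem
- What didn't work: failed attempts at this problem from prior sessions
- Key decisions: choices made about this problem and their rationale
- Related context: anything else from prior sessions that directly
  informs this problem's documentation
```

The concrete error and module names are what let the historian discard unrelated work that happens to share a branch.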
### Phase 2: Assembly & Write

<sequential_tasks>
@@ -172,10 +210,15 @@ The orchestrating agent (main conversation) performs these steps:

When updating an existing doc, preserve its file path and frontmatter structure. Update the solution, code examples, prevention tips, and any stale references. Add a `last_updated: YYYY-MM-DD` field to the frontmatter. Do not change the title unless the problem framing has materially shifted.

3. **Incorporate session history findings** (if available). When the Session Historian returned relevant prior-session context:
   - Fold investigation dead ends and failed approaches into the **What Didn't Work** section (bug track) or **Context** section (knowledge track)
   - Use cross-session patterns to enrich the **Prevention** or **Why This Matters** sections
   - Tag session-sourced content with "(session history)" so its origin is clear to future readers
   - If findings are thin or "no relevant prior sessions," proceed without session context
4. Assemble complete markdown file from the collected pieces, reading `assets/resolution-template.md` for the section structure of new docs
5. Validate YAML frontmatter against `references/schema.yaml`
6. Create directory if needed: `mkdir -p docs/solutions/[category]/`
7. Write the file: either the updated existing doc or the new `docs/solutions/[category]/[filename].md`

When creating a new doc, preserve the section order from `assets/resolution-template.md` unless the user explicitly asks for a different structure.

@@ -392,6 +435,7 @@ Subagent Results:

✓ Context Analyzer: Identified performance_issue in brief_system, category: performance-issues/
✓ Solution Extractor: 3 code fixes, prevention strategies
✓ Related Docs Finder: 2 related issues
✓ Session History: 3 prior sessions on same branch, 2 failed approaches surfaced

Specialized Agent Reviews (Auto-Triggered):
✓ performance-oracle: Validated query optimization approach
plugins/compound-engineering/skills/ce-sessions/SKILL.md (new file, 33 lines)
@@ -0,0 +1,33 @@
---
name: ce-sessions
description: "Search and ask questions about your coding agent session history. Use when asking what you worked on, what was tried before, how a problem was investigated across sessions, what happened recently, or any question about past agent sessions. Also use when the user references prior sessions, previous attempts, or past investigations — even without saying 'sessions' explicitly."
---

# /ce-sessions

Search your session history.

## Usage

```
/ce-sessions [question or topic]
/ce-sessions
```

## Pre-resolved context

**Repo name (pre-resolved):** !`common=$(git rev-parse --git-common-dir 2>/dev/null); if [ "$common" = ".git" ]; then basename "$(git rev-parse --show-toplevel 2>/dev/null)"; else basename "$(dirname "$common")"; fi`

**Git branch (pre-resolved):** !`git rev-parse --abbrev-ref HEAD 2>/dev/null`

If the lines above resolved to plain values (a folder name like `my-repo` and a branch name like `feat/my-branch`), they are ready to pass to the agent. If they still contain backtick command strings or are empty, they did not resolve — omit them from the dispatch and let the agent derive them at runtime.
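The repo-name one-liner works in both regular checkouts and linked worktrees because `git rev-parse --git-common-dir` prints `.git` in a normal checkout but a path into the main checkout from a worktree. A throwaway demo (repo and branch names are illustrative):

```shell
set -e
tmp=$(mktemp -d)
git -C "$tmp" init -q my-repo
cd "$tmp/my-repo"
git -c user.email=dev@example.com -c user.name=dev commit -q --allow-empty -m init

resolve_repo_name() {
    # Same logic as the pre-resolved line above.
    common=$(git rev-parse --git-common-dir 2>/dev/null)
    if [ "$common" = ".git" ]; then
        basename "$(git rev-parse --show-toplevel 2>/dev/null)"
    else
        basename "$(dirname "$common")"
    fi
}

main=$(resolve_repo_name)   # regular checkout: --git-common-dir is ".git"

git worktree add -q "$tmp/wt" -b feat/demo
cd "$tmp/wt"
wt=$(resolve_repo_name)     # worktree: common dir points into the main checkout

echo "$main $wt"
```

Both calls resolve to `my-repo`, which is what lets the historian correlate sessions across main checkouts and worktrees.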
## Execution

If no argument is provided, ask what the user wants to know about their session history. Use the platform's blocking question tool (`AskUserQuestion` in Claude Code, `request_user_input` in Codex, `ask_user` in Gemini). If no question tool is available, ask in plain text and wait for a reply.

Dispatch `compound-engineering:research:session-historian` with the user's question as the task prompt. Omit the `mode` parameter so the user's configured permission settings apply. Include in the dispatch prompt:

- The user's question
- The current working directory
- The repo name and git branch from pre-resolved context (only if they resolved to plain values — do not pass literal command strings)