
---
name: ce:compound-refresh
description: Refresh stale or drifting learnings and pattern docs in docs/solutions/ by reviewing, updating, replacing, or archiving them against the current codebase. Use after refactors, migrations, dependency upgrades, or when a retrieved learning feels outdated or wrong. Also use when reviewing docs/solutions/ for accuracy, when a recently solved problem contradicts an existing learning, or when pattern docs no longer reflect current code.
argument-hint: "[mode:autonomous] [optional: scope hint]"
disable-model-invocation: true
---

Compound Refresh

Maintain the quality of docs/solutions/ over time. This workflow reviews existing learnings against the current codebase, then refreshes any derived pattern docs that depend on them.

Mode Detection

Check if $ARGUMENTS contains mode:autonomous. If present, strip it from arguments (use the remainder as a scope hint) and run in autonomous mode.
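The mode check can be sketched as follows (a minimal illustration; the skill performs this check in natural language, and the function name is invented for the example):

```python
def detect_mode(arguments: str) -> tuple[str, str]:
    """Return (mode, scope_hint) parsed from the raw $ARGUMENTS string.

    If "mode:autonomous" appears among the arguments, strip it and treat
    the remainder as an optional scope hint; otherwise run interactively.
    """
    tokens = arguments.split()
    if "mode:autonomous" in tokens:
        tokens.remove("mode:autonomous")
        return "autonomous", " ".join(tokens)
    return "interactive", arguments.strip()
```

So `detect_mode("mode:autonomous auth")` yields autonomous mode scoped to `auth`, while a bare scope hint leaves the skill interactive.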

| Mode | When | Behavior |
|------|------|----------|
| Interactive (default) | User is present and can answer questions | Ask for decisions on ambiguous cases, confirm actions |
| Autonomous | `mode:autonomous` in arguments | No user interaction. Apply all unambiguous actions (Keep, Update, auto-Archive, Replace with sufficient evidence). Mark ambiguous cases as stale. Generate a summary report at the end. |

Autonomous mode rules

  • Skip all user questions. Never pause for input.
  • Process all docs in scope. No scope narrowing questions — if no scope hint was provided, process everything.
  • Attempt all safe actions: Keep (no-op), Update (fix references), auto-Archive (unambiguous criteria met), Replace (when evidence is sufficient). If a write succeeds, record it as applied. If a write fails (e.g., permission denied), record the action as recommended in the report and continue — do not stop or ask for permissions.
  • Mark as stale when uncertain. If classification is genuinely ambiguous (Update vs Replace vs Archive) or Replace evidence is insufficient, mark as stale with status: stale, stale_reason, and stale_date in the frontmatter. If even the stale-marking write fails, include it as a recommendation.
  • Use conservative confidence. In interactive mode, borderline cases get a user question. In autonomous mode, borderline cases get marked stale. Err toward stale-marking over incorrect action.
  • Always generate a report. The report is the primary deliverable. It has two sections: Applied (actions that were successfully written) and Recommended (actions that could not be written, with full rationale so a human can apply them or run the skill interactively). The report structure is the same regardless of what permissions were granted — the only difference is which section each action lands in.

Interaction Principles

These principles apply to interactive mode only. In autonomous mode, skip all user questions and apply the autonomous mode rules above.

Follow the same interaction style as ce:brainstorm:

  • Ask questions one at a time — use the platform's interactive question tool (e.g. AskUserQuestion in Claude Code, request_user_input in Codex) and stop to wait for the answer before continuing
  • Prefer multiple choice when natural options exist
  • Start with scope and intent, then narrow only when needed
  • Do not ask the user to make decisions before you have evidence
  • Lead with a recommendation and explain it briefly

The goal is not to force the user through a checklist. The goal is to help them make a good maintenance decision with the smallest amount of friction.

Refresh Order

Refresh in this order:

  1. Review the relevant individual learning docs first
  2. Note which learnings stayed valid, were updated, were replaced, or were archived
  3. Then review any pattern docs that depend on those learnings

Why this order:

  • learning docs are the primary evidence
  • pattern docs are derived from one or more learnings
  • stale learnings can make a pattern look more valid than it really is

If the user starts by naming a pattern doc, you may begin there to understand the concern, but inspect the supporting learning docs before changing the pattern.

Maintenance Model

For each candidate artifact, classify it into one of four outcomes:

| Outcome | Meaning | Default action |
|---------|---------|----------------|
| Keep | Still accurate and still useful | No file edit by default; report that it was reviewed and remains trustworthy |
| Update | Core solution is still correct, but references drifted | Apply evidence-backed in-place edits |
| Replace | The old artifact is now misleading, but there is a known better replacement | Create a trustworthy successor or revised pattern, then mark/archive the old artifact as needed |
| Archive | No longer useful or applicable | Move the obsolete artifact to docs/solutions/_archived/ with archive metadata when appropriate |

Core Rules

  1. Evidence informs judgment. The signals below are inputs, not a mechanical scorecard. Use engineering judgment to decide whether the artifact is still trustworthy.
  2. Prefer no-write Keep. Do not update a doc just to leave a review breadcrumb.
  3. Match docs to reality, not the reverse. When current code differs from a learning, update the learning to reflect the current code. The skill's job is doc accuracy, not code review — do not ask the user whether code changes were "intentional" or "a regression." If the code changed, the doc should match. If the user thinks the code is wrong, that is a separate concern outside this workflow.
  4. Be decisive, minimize questions. When evidence is clear (file renamed, class moved, reference broken), apply the update. In interactive mode, only ask the user when the right action is genuinely ambiguous. In autonomous mode, mark ambiguous cases as stale instead of asking. The goal is automated maintenance with human oversight on judgment calls, not a question for every finding.
  5. Avoid low-value churn. Do not edit a doc just to fix a typo, polish wording, or make cosmetic changes that do not materially improve accuracy or usability.
  6. Use Update only for meaningful, evidence-backed drift. Paths, module names, related links, category metadata, code snippets, and clearly stale wording are fair game when fixing them materially improves accuracy.
  7. Use Replace only when there is a real replacement. That means either:
    • the current conversation contains a recently solved, verified replacement fix, or
    • the user has provided enough concrete replacement context to document the successor honestly, or
    • the codebase investigation found the current approach and can document it as the successor, or
    • newer docs, pattern docs, PRs, or issues provide strong successor evidence.
  8. Archive when the code is gone. If the referenced code, controller, or workflow no longer exists in the codebase and no successor can be found, recommend Archive — don't default to Keep just because the general advice is still "sound." A learning about a deleted feature misleads readers into thinking that feature still exists. When in doubt between Keep and Archive, ask the user (in interactive mode) or mark as stale (in autonomous mode). But missing referenced files with no matching code is not a doubt case — it is strong, unambiguous Archive evidence. Auto-archive it.

Scope Selection

Start by discovering learnings and pattern docs under docs/solutions/.

Exclude:

  • README.md
  • docs/solutions/_archived/

Find all .md files under docs/solutions/, excluding README.md files and anything under _archived/.

If $ARGUMENTS is provided, use it to narrow scope before proceeding. Try these matching strategies in order, stopping at the first that produces results:

  1. Directory match — check if the argument matches a subdirectory name under docs/solutions/ (e.g., performance-issues, database-issues)
  2. Frontmatter match — search module, component, or tags fields in learning frontmatter for the argument
  3. Filename match — match against filenames (partial matches are fine)
  4. Content search — search file contents for the argument as a keyword (useful for feature names or feature areas)
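The four strategies above, tried in order with stop-at-first-hit, can be sketched as (the dict shape and function name are illustrative assumptions, not the skill's actual data model):

```python
from pathlib import Path

def narrow_scope(argument: str, docs: list[dict]) -> list[dict]:
    """Try the four matching strategies in order; stop at the first that
    produces results. Each doc dict has keys: path, frontmatter, content.
    """
    arg = argument.lower()

    # 1. Directory match: argument names a subdirectory under docs/solutions/
    hits = [d for d in docs if arg in (p.lower() for p in Path(d["path"]).parts)]
    if hits:
        return hits

    # 2. Frontmatter match: module, component, or tags fields
    def fm_match(d: dict) -> bool:
        fm = d["frontmatter"]
        fields = [fm.get("module", ""), fm.get("component", "")] + list(fm.get("tags", []))
        return any(arg in str(f).lower() for f in fields)
    hits = [d for d in docs if fm_match(d)]
    if hits:
        return hits

    # 3. Filename match (partial matches are fine)
    hits = [d for d in docs if arg in Path(d["path"]).name.lower()]
    if hits:
        return hits

    # 4. Content search: argument as a keyword in the file body
    return [d for d in docs if arg in d["content"].lower()]
```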

If no matches are found, report that and ask the user to clarify. In autonomous mode, report the miss and stop — do not guess at scope.

If no candidate docs are found, report:

No candidate docs found in docs/solutions/.
Run `ce:compound` after solving problems to start building your knowledge base.

Phase 0: Assess and Route

Before asking the user to classify anything:

  1. Discover candidate artifacts
  2. Estimate scope
  3. Choose the lightest interaction path that fits

Route by Scope

| Scope | When to use it | Interaction style |
|-------|----------------|-------------------|
| Focused | 1-2 likely files or user named a specific doc | Investigate directly, then present a recommendation |
| Batch | Up to ~8 mostly independent docs | Investigate first, then present grouped recommendations |
| Broad | 9+ docs, ambiguous, or repo-wide stale-doc sweep | Triage first, then investigate in batches |

Broad Scope Triage

When scope is broad (9+ candidate docs), do a lightweight triage before deep investigation:

  1. Inventory — read frontmatter of all candidate docs, group by module/component/category
  2. Impact clustering — identify areas with the densest clusters of learnings + pattern docs. A cluster of 5 learnings and 2 patterns covering the same module is higher-impact than 5 isolated single-doc areas, because staleness in one doc is likely to affect the others.
  3. Spot-check drift — for each cluster, check whether the primary referenced files still exist. Missing references in a high-impact cluster = strongest signal for where to start.
  4. Recommend a starting area — present the highest-impact cluster with a brief rationale and ask the user to confirm or redirect. In autonomous mode, skip the question and process all clusters in impact order.
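The impact-clustering step can be sketched as a simple group-and-rank (field names are illustrative; a real pass would weight pattern docs more heavily):

```python
from collections import defaultdict

def rank_clusters(docs: list[dict]) -> list[tuple[str, int, int]]:
    """Group candidate docs by module and rank clusters by impact.

    Returns (module, doc_count, missing_reference_count) tuples, highest
    impact first. "Missing" counts docs whose primary referenced file is
    gone -- the spot-check drift signal from step 3.
    """
    clusters = defaultdict(lambda: [0, 0])
    for doc in docs:
        module = doc["frontmatter"].get("module", "uncategorized")
        clusters[module][0] += 1
        if doc.get("primary_reference_missing"):
            clusters[module][1] += 1
    # Densest clusters with the most broken references come first.
    return sorted(
        ((m, n, miss) for m, (n, miss) in clusters.items()),
        key=lambda t: (t[1], t[2]),
        reverse=True,
    )
```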

Example:

Found 24 learnings across 5 areas.

The auth module has 5 learnings and 2 pattern docs that cross-reference
each other — and 3 of those reference files that no longer exist.
I'd start there.

1. Start with auth (recommended)
2. Pick a different area
3. Review everything

Do not ask action-selection questions yet. First gather evidence.

Phase 1: Investigate Candidate Learnings

For each learning in scope, read it, cross-reference its claims against the current codebase, and form a recommendation.

A learning has several dimensions that can independently go stale. Surface-level checks catch the obvious drift, but staleness often hides deeper:

  • References — do the file paths, class names, and modules it mentions still exist or have they moved?
  • Recommended solution — does the fix still match how the code actually works today? A renamed file with a completely different implementation pattern is not just a path update.
  • Code examples — if the learning includes code snippets, do they still reflect the current implementation?
  • Related docs — are cross-referenced learnings and patterns still present and consistent?

Match investigation depth to the learning's specificity — a learning referencing exact file paths and code snippets needs more verification than one describing a general principle.

Drift Classification: Update vs Replace

The critical distinction is whether the drift is cosmetic (references moved but the solution is the same) or substantive (the solution itself changed):

  • Update territory — file paths moved, classes renamed, links broke, metadata drifted, but the core recommended approach is still how the code works. ce:compound-refresh fixes these directly.
  • Replace territory — the recommended solution conflicts with current code, the architectural approach changed, or the pattern is no longer the preferred way. This means a new learning needs to be written. A replacement subagent writes the successor following ce:compound's document format (frontmatter, problem, root cause, solution, prevention), using the investigation evidence already gathered. The orchestrator does not rewrite learnings inline — it delegates to a subagent for context isolation.

The boundary: if you find yourself rewriting the solution section or changing what the learning recommends, stop — that is Replace, not Update.

Judgment Guidelines

Three guidelines that are easy to get wrong:

  1. Contradiction = strong Replace signal. If the learning's recommendation conflicts with current code patterns or a recently verified fix, that is not a minor drift — the learning is actively misleading. Classify as Replace.
  2. Age alone is not a stale signal. A 2-year-old learning that still matches current code is fine. Only use age as a prompt to inspect more carefully.
  3. Check for successors before archiving. Before recommending Replace or Archive, look for newer learnings, pattern docs, PRs, or issues covering the same problem space. If successor evidence exists, prefer Replace over Archive so readers are directed to the newer guidance.

Phase 1.5: Investigate Pattern Docs

After reviewing the underlying learning docs, investigate any relevant pattern docs under docs/solutions/patterns/.

Pattern docs are high-leverage — a stale pattern is more dangerous than a stale individual learning because future work may treat it as broadly applicable guidance. Evaluate whether the generalized rule still holds given the refreshed state of the learnings it depends on.

A pattern doc with no clear supporting learnings is a stale signal — investigate carefully before keeping it unchanged.

Subagent Strategy

Use subagents for context isolation when investigating multiple artifacts — not just because the task sounds complex. Choose the lightest approach that fits:

| Approach | When to use |
|----------|-------------|
| Main thread only | Small scope, short docs |
| Sequential subagents | 1-2 artifacts with many supporting files to read |
| Parallel subagents | 3+ truly independent artifacts with low overlap |
| Batched subagents | Broad sweeps — narrow scope first, then investigate in batches |

When spawning any subagent, include this instruction in its task prompt:

Use dedicated file search and read tools (Glob, Grep, Read) for all investigation. Do NOT use shell commands (ls, find, cat, grep, test, bash) for file operations. This avoids permission prompts and is more reliable.

There are two subagent roles:

  1. Investigation subagents — read-only. They must not edit files, create successors, or archive anything. Each returns: file path, evidence, recommended action, confidence, and open questions. These can run in parallel when artifacts are independent.
  2. Replacement subagents — write a single new learning to replace a stale one. These run one at a time, sequentially (each replacement subagent may need to read significant code, and running multiple in parallel risks context exhaustion). The orchestrator handles all archival and metadata updates after each replacement completes.

The orchestrator merges investigation results, detects contradictions, coordinates replacement subagents, and performs all archival/metadata edits centrally. In interactive mode, it asks the user questions on ambiguous cases. In autonomous mode, it marks ambiguous cases as stale instead. If two artifacts overlap or discuss the same root issue, investigate them together rather than parallelizing.

Phase 2: Classify the Right Maintenance Action

After gathering evidence, assign one recommended action.

Keep

The learning is still accurate and useful. Do not edit the file — report that it was reviewed and remains trustworthy. Only add last_refreshed if you are already making a meaningful update for another reason.

Update

The core solution is still valid but references have drifted (paths, class names, links, code snippets, metadata). Apply the fixes directly.

Replace

Choose Replace when the learning's core guidance is now misleading — the recommended fix changed materially, the root cause or architecture shifted, or the preferred pattern is different.

The user may have invoked the refresh months after the original learning was written. Do not ask them for replacement context they are unlikely to have — use agent intelligence to investigate the codebase and synthesize the replacement.

Evidence assessment:

By the time you identify a Replace candidate, Phase 1 investigation has already gathered significant evidence: the old learning's claims, what the current code actually does, and where the drift occurred. Assess whether this evidence is sufficient to write a trustworthy replacement:

  • Sufficient evidence — you understand both what the old learning recommended AND what the current approach is. The investigation found the current code patterns, the new file locations, the changed architecture. → Proceed to write the replacement (see Phase 4 Replace Flow).
  • Insufficient evidence — the drift is so fundamental that you cannot confidently document the current approach. The entire subsystem was replaced, or the new architecture is too complex to understand from a file scan alone. → Mark as stale in place:
    • Add status: stale, stale_reason: [what you found], stale_date: YYYY-MM-DD to the frontmatter
    • Report what evidence you found and what is missing
    • Recommend the user run ce:compound after their next encounter with that area, when they have fresh problem-solving context
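The stale-marking step above can be sketched as a frontmatter mutation (illustrative only; the real edit goes through the file-editing tools):

```python
import datetime

def mark_stale(frontmatter: dict, reason: str) -> dict:
    """Add the stale-marking fields to a learning's frontmatter.

    Used when Replace evidence is insufficient, or in autonomous mode
    when classification is genuinely ambiguous.
    """
    frontmatter["status"] = "stale"
    frontmatter["stale_reason"] = reason
    frontmatter["stale_date"] = datetime.date.today().isoformat()
    return frontmatter
```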

Archive

Choose Archive when:

  • The code or workflow no longer exists
  • The learning is obsolete and has no modern replacement worth documenting
  • The learning is redundant and no longer useful on its own
  • There is no meaningful successor evidence suggesting it should be replaced instead

Action:

  • Move the file to docs/solutions/_archived/, preserving directory structure when helpful
  • Add:
    • archived_date: YYYY-MM-DD
    • archive_reason: [why it was archived]
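The archive action can be sketched as computing a destination that preserves the category subdirectory, plus the metadata fields (a sketch assuming the standard docs/solutions/ layout; the actual move happens via file tools):

```python
import datetime
from pathlib import Path

def archive(learning_path: str, frontmatter: dict, reason: str) -> tuple[dict, str]:
    """Add archive metadata and compute the destination under
    docs/solutions/_archived/, preserving directory structure."""
    frontmatter["archived_date"] = datetime.date.today().isoformat()
    frontmatter["archive_reason"] = reason
    rel = Path(learning_path).relative_to("docs/solutions")
    return frontmatter, str(Path("docs/solutions/_archived") / rel)
```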

Before archiving: check if the problem domain is still active

When a learning's referenced files are gone, that is strong evidence — but only that the implementation is gone. Before archiving, reason about whether the problem the learning solves is still a concern in the codebase:

  • A learning about session token storage where auth_token.rb is gone — does the application still handle session tokens? If so, the concept persists under a new implementation. That is Replace, not Archive.
  • A learning about a deprecated API endpoint where the entire feature was removed — the problem domain is gone. That is Archive.

Do not search mechanically for keywords from the old learning. Instead, understand what problem the learning addresses, then investigate whether that problem domain still exists in the codebase. The agent understands concepts — use that understanding to look for where the problem lives now, not where the old code used to be.

Auto-archive only when both the implementation AND the problem domain are gone:

  • the referenced code is gone AND the application no longer deals with that problem domain
  • the learning is fully superseded by a clearly better successor
  • the document is plainly redundant and adds no distinct value

If the implementation is gone but the problem domain persists (the app still does auth, still processes payments, still handles migrations), classify as Replace — the problem still matters and the current approach should be documented.

Do not keep a learning just because its general advice is "still sound" — if the specific code it references is gone, the learning misleads readers. But do not archive a learning whose problem domain is still active — that knowledge gap should be filled with a replacement.

If there is a clearly better successor, strongly consider Replace before Archive so the old artifact points readers toward the newer guidance.

Pattern Guidance

Apply the same four outcomes (Keep, Update, Replace, Archive) to pattern docs, but evaluate them as derived guidance rather than incident-level learnings. Key differences:

  • Keep: the underlying learnings still support the generalized rule and examples remain representative
  • Update: the rule holds but examples, links, scope, or supporting references drifted
  • Replace: the generalized rule is now misleading, or the underlying learnings support a different synthesis. Base the replacement on the refreshed learning set — do not invent new rules from guesswork
  • Archive: the pattern is no longer valid, no longer recurring, or fully subsumed by a stronger pattern doc

If "archive" feels too strong but the pattern should no longer be elevated, reduce its prominence in place if the docs structure supports that.

Phase 3: Ask for Decisions

Autonomous mode

Skip this entire phase. Do not ask any questions. Do not present options. Do not wait for input. Proceed directly to Phase 4 and execute all actions based on the classifications from Phase 2:

  • Unambiguous Keep, Update, auto-Archive, and Replace (with sufficient evidence) → execute directly
  • Ambiguous cases → mark as stale
  • Then generate the report (see Output Format)

Interactive mode

Most Updates should be applied directly without asking. Only ask the user when:

  • The right action is genuinely ambiguous (Update vs Replace vs Archive)
  • You are about to Archive a document and the evidence is not unambiguous (see auto-archive criteria in Phase 2). When auto-archive criteria are met, proceed without asking.
  • You are about to create a successor via ce:compound

Do not ask questions about whether code changes were intentional, whether the user wants to fix bugs in the code, or other concerns outside doc maintenance. Stay in your lane — doc accuracy.

Question Style

Always present choices using the platform's interactive question tool (e.g. AskUserQuestion in Claude Code, request_user_input in Codex). If the environment has no interactive prompt tool, present numbered options in plain text and wait for the user's response before proceeding.

Question rules:

  • Ask one question at a time
  • Prefer multiple choice
  • Lead with the recommended option
  • Explain the rationale for the recommendation in one concise sentence
  • Avoid asking the user to choose from actions that are not actually plausible

Focused Scope

For a single artifact, present:

  • file path
  • 2-4 bullets of evidence
  • recommended action

Then ask:

This [learning/pattern] looks like a [Keep/Update/Replace/Archive].

Why: [one-sentence rationale based on the evidence]

What would you like to do?

1. [Recommended action]
2. [Second plausible action]
3. Skip for now

Do not list all four actions unless all four are genuinely plausible.

Batch Scope

For several learnings:

  1. Group obvious Keep cases together
  2. Group obvious Update cases together when the fixes are straightforward
  3. Present Replace cases individually or in very small groups
  4. Present Archive cases individually unless they are strong auto-archive candidates

Ask for confirmation in stages:

  1. Confirm grouped Keep/Update recommendations
  2. Then handle Replace one at a time
  3. Then handle Archive one at a time unless the archive is unambiguous and safe to auto-apply

Broad Scope

If the user asked for a sweeping refresh, keep the interaction incremental:

  1. Narrow scope first
  2. Investigate a manageable batch
  3. Present recommendations
  4. Ask whether to continue to the next batch

Do not front-load the user with a full maintenance queue.

Phase 4: Execute the Chosen Action

Keep Flow

No file edit by default. Summarize why the learning remains trustworthy.

Update Flow

Apply in-place edits only when the solution is still substantively correct.

Examples of valid in-place updates:

  • Rename app/models/auth_token.rb reference to app/models/session_token.rb
  • Update module: AuthToken to module: SessionToken
  • Fix outdated links to related docs
  • Refresh implementation notes after a directory move

Examples that should not be in-place updates:

  • Fixing a typo with no effect on understanding
  • Rewording prose for style alone
  • Small cleanup that does not materially improve accuracy or usability
  • The old fix is now an anti-pattern
  • The system architecture changed enough that the old guidance is misleading
  • The troubleshooting path is materially different

Those cases require Replace, not Update.

Replace Flow

Process Replace candidates one at a time, sequentially. Each replacement is written by a subagent to protect the main context window.

When evidence is sufficient:

  1. Spawn a single subagent to write the replacement learning. Pass it:
    • The old learning's full content
    • A summary of the investigation evidence (what changed, what the current code does, why the old guidance is misleading)
    • The target path and category (same category as the old learning unless the category itself changed)
  2. The subagent writes the new learning following ce:compound's document format: YAML frontmatter (title, category, date, module, component, tags), problem description, root cause, current solution with code examples, and prevention tips. It should use dedicated file search and read tools if it needs additional context beyond what was passed.
  3. After the subagent completes, the orchestrator:
    • Adds superseded_by: [new learning path] to the old learning's frontmatter
    • Moves the old learning to docs/solutions/_archived/
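The orchestrator's bookkeeping after the subagent completes can be sketched as (names illustrative; the archive move itself follows the Archive flow):

```python
def mark_superseded(old_frontmatter: dict, successor_path: str) -> dict:
    """Point the old learning at its replacement before archiving it,
    so readers who land on the archived doc are redirected."""
    old_frontmatter["superseded_by"] = successor_path
    return old_frontmatter
```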

When evidence is insufficient:

  1. Mark the learning as stale in place:
    • Add to frontmatter: status: stale, stale_reason: [what you found], stale_date: YYYY-MM-DD
  2. Report what evidence was found and what is missing
  3. Recommend the user run ce:compound after their next encounter with that area

Archive Flow

Archive only when a learning is clearly obsolete or redundant. Do not archive a document just because it is old.

Output Format

The full report MUST be printed as markdown output. Do not summarize findings internally and then output a one-liner. The report is the deliverable — print every section in full, formatted as readable markdown with headers, tables, and bullet points.

After processing the selected scope, output the following report:

Compound Refresh Summary
========================
Scanned: N learnings

Kept: X
Updated: Y
Replaced: Z
Archived: W
Skipped: V
Marked stale: S

Then for EVERY file processed, list:

  • The file path
  • The classification (Keep/Update/Replace/Archive/Stale)
  • What evidence was found
  • What action was taken (or recommended)

For Keep outcomes, list them under a reviewed-without-edits section so the result is visible without creating git churn.
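Tallying the summary header from per-file classifications can be sketched as (the outcome strings and function name are assumptions for illustration):

```python
from collections import Counter

def render_summary(classifications: dict[str, str]) -> str:
    """Render the summary block from a {file_path: outcome} map.
    Outcomes: keep, update, replace, archive, skip, stale."""
    counts = Counter(classifications.values())
    lines = [
        "Compound Refresh Summary",
        "========================",
        f"Scanned: {len(classifications)} learnings",
        "",
        f"Kept: {counts['keep']}",
        f"Updated: {counts['update']}",
        f"Replaced: {counts['replace']}",
        f"Archived: {counts['archive']}",
        f"Skipped: {counts['skip']}",
        f"Marked stale: {counts['stale']}",
    ]
    return "\n".join(lines)
```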

Autonomous mode output

In autonomous mode, the report is the sole deliverable — there is no user present to ask follow-up questions, so the report must be self-contained and complete. Print the full report. Do not abbreviate, summarize, or skip sections.

Split actions into two sections:

Applied (writes that succeeded):

  • For each Updated file: the file path, what references were fixed, and why
  • For each Replaced file: what the old learning recommended vs what the current code does, and the path to the new successor
  • For each Archived file: the file path and what referenced code/workflow is gone
  • For each Marked stale file: the file path, what evidence was found, and why it was ambiguous

Recommended (actions that could not be written — e.g., permission denied):

  • Same detail as above, but framed as recommendations for a human to apply
  • Include enough context that the user can apply the change manually or re-run the skill interactively

If all writes succeed, the Recommended section is empty. If no writes succeed (e.g., read-only invocation), all actions appear under Recommended — the report becomes a maintenance plan.

Phase 5: Commit Changes

After all actions are executed and the report is generated, handle committing the changes. Skip this phase if no files were modified (all Keep, or all writes failed).

Detect git context

Before offering options, check:

  1. Which branch is currently checked out (main/master vs feature branch)
  2. Whether the working tree has other uncommitted changes beyond what compound-refresh modified
  3. Recent commit messages to match the repo's commit style
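The first two checks can be sketched as a pure classification over the outputs of `git branch --show-current` and `git status --porcelain` (a sketch; the parameter shapes are assumptions):

```python
def classify_git_context(branch: str, dirty_files: list[str],
                         refreshed_files: set[str],
                         default_branch: str = "main") -> str:
    """Classify the commit context. `dirty_files` comes from parsing
    `git status --porcelain`; `refreshed_files` are the paths this skill
    modified. Returns "protected-branch", "feature-dirty", or
    "feature-clean"."""
    if branch in {"main", "master", default_branch}:
        return "protected-branch"
    # Only changes beyond what compound-refresh touched count as "dirty".
    other_changes = [f for f in dirty_files if f not in refreshed_files]
    return "feature-dirty" if other_changes else "feature-clean"
```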

Autonomous mode

Use sensible defaults — no user to ask:

| Context | Default action |
|---------|----------------|
| On main/master | Create a descriptively named branch, commit, attempt to open a PR. If PR creation fails, report the branch name. |
| On a feature branch | Commit as a separate commit on the current branch |
| Git operations fail | Include the recommended git commands in the report and continue |

Stage only the files that compound-refresh modified — not other dirty files in the working tree.

Interactive mode

First, run git branch --show-current to determine the current branch. Then present the correct options based on the result. Stage only compound-refresh files regardless of which option the user picks.

If the current branch is main, master, or the repo's default branch:

Do NOT offer to commit directly to main. Always offer a branch first:

  1. Create a branch, commit, and open a PR (recommended)
  2. Don't commit — I'll handle it

If the current branch is a feature branch, clean working tree:

  1. Commit to this branch as a separate commit (recommended)
  2. Create a separate branch and commit
  3. Don't commit

If the current branch is a feature branch, dirty working tree (other uncommitted changes):

  1. Commit only the compound-refresh changes to this branch (selective staging — other dirty files stay untouched)
  2. Don't commit

Commit message

Write a descriptive commit message that:

  • Summarizes what was refreshed (e.g., "update 3 stale learnings, archive 1 obsolete doc")
  • Follows the repo's existing commit conventions (check recent git log for style)
  • Is succinct — the details are in the changed files themselves

Relationship to ce:compound

  • ce:compound captures a newly solved, verified problem
  • ce:compound-refresh maintains older learnings as the codebase evolves

Use Replace only when the refresh process has enough real evidence to write a trustworthy successor. When evidence is insufficient, mark as stale and recommend ce:compound for when the user next encounters that problem area.