feat(ce-work): reduce token usage by extracting late-sequence references (#540)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:48:21 -07:00
parent 31b0686c2e
commit bb59547a2e
6 changed files with 543 additions and 285 deletions
--- a/plugins/compound-engineering/skills/ce-work-beta/references/shipping-workflow.md
+++ b/plugins/compound-engineering/skills/ce-work-beta/references/shipping-workflow.md
@@ -0,0 +1,136 @@
+# Shipping Workflow
+
+This file contains the shipping workflow (Phase 3-4). Load it only when all Phase 2 tasks are complete and execution transitions to quality check.
+
+## Phase 3: Quality Check
+
+1. **Run Core Quality Checks**
+
+   Always run before submitting:
+
+   ```bash
+   # Run full test suite (use project's test command)
+   # Examples: bin/rails test, npm test, pytest, go test, etc.
+
+   # Run linting (per AGENTS.md)
+   # Use linting-agent before pushing to origin
+   ```
+
+2. **Code Review** (REQUIRED)
+
+   Every change gets reviewed before shipping. The depth scales with the change's risk profile, but review itself is never skipped.
+
+   **Tier 2: Full review (default)** -- REQUIRED unless Tier 1 criteria are explicitly met. Invoke the `ce:review` skill with `mode:autofix` to run specialized reviewer agents, auto-apply safe fixes, and surface residual work as todos. When the plan file path is known, pass it as `plan:<path>`. This is the mandatory default -- proceed to Tier 1 only after confirming every criterion below.
+
+   **Tier 1: Inline self-review** -- A lighter alternative permitted only when **all four** criteria are true. Before choosing Tier 1, explicitly state which criteria apply and why. If any criterion is uncertain, use Tier 2.
+   - Purely additive (new files only, no existing behavior modified)
+   - Single concern (one skill, one component -- not cross-cutting)
+   - Pattern-following (implementation mirrors an existing example with no novel logic)
+   - Plan-faithful (no scope growth, no deferred questions resolved with surprising answers)
+
+3. **Final Validation**
+   - All tasks marked completed
+   - Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
+   - Linting passes
+   - Code follows existing patterns
+   - Figma designs match (if applicable)
+   - No console errors or warnings
+   - If the plan has a `Requirements Trace`, verify each requirement is satisfied by the completed work
+   - If any `Deferred to Implementation` questions were noted, confirm they were resolved during execution
+
+4. **Prepare Operational Validation Plan** (REQUIRED)
+   - Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change.
+   - Include concrete:
+     - Log queries/search terms
+     - Metrics or dashboards to watch
+     - Expected healthy signals
+     - Failure signals and rollback/mitigation trigger
+     - Validation window and owner
+   - If there is truly no production/runtime impact, still include the section with: `No additional operational monitoring required` and a one-line reason.
+
+## Phase 4: Ship It
+
+1. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
+
+   For **any** design changes, new views, or UI modifications, capture and upload screenshots before creating the PR:
+
+   **Step 1: Start dev server** (if not running)
+   ```bash
+   bin/dev  # Run in background
+   ```
+
+   **Step 2: Capture screenshots with agent-browser CLI**
+   ```bash
+   agent-browser open http://localhost:3000/[route]
+   agent-browser snapshot -i
+   agent-browser screenshot output.png
+   ```
+   See the `agent-browser` skill for detailed usage.
+
+   **Step 3: Upload using imgup skill**
+   ```bash
+   skill: imgup
+   # Then upload each screenshot:
+   imgup -h pixhost screenshot.png  # pixhost works without API key
+   # Alternative hosts: catbox, imagebin, beeimg
+   ```
+
+   **What to capture:**
+   - **New screens**: Screenshot of the new UI
+   - **Modified screens**: Before AND after screenshots
+   - **Design implementation**: Screenshot showing Figma design match
+
+2. **Update Plan Status**
+
+   If the input document has YAML frontmatter with a `status` field, update it to `completed`:
+   ```
+   status: active  ->  status: completed
+   ```
+
+3. **Commit and Create Pull Request**
+
+   Load the `git-commit-push-pr` skill to handle committing, pushing, and PR creation. The skill handles convention detection, branch safety, logical commit splitting, adaptive PR descriptions, and attribution badges.
+
+   When providing context for the PR description, include:
+   - The plan's summary and key decisions
+   - Testing notes (tests added/modified, manual testing performed)
+   - Screenshot URLs from step 1 (if applicable)
+   - Figma design link (if applicable)
+   - The Post-Deploy Monitoring & Validation section (see Phase 3 Step 4)
+
+   If the user prefers to commit without creating a PR, load the `git-commit` skill instead.
+
+4. **Notify User**
+   - Summarize what was completed
+   - Link to PR (if one was created)
+   - Note any follow-up work needed
+   - Suggest next steps if applicable
+
+## Quality Checklist
+
+Before creating PR, verify:
+
+- [ ] All clarifying questions asked and answered
+- [ ] All tasks marked completed
+- [ ] Testing addressed -- tests pass AND new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
+- [ ] Linting passes (use linting-agent)
+- [ ] Code follows existing patterns
+- [ ] Figma designs match implementation (if applicable)
+- [ ] Before/after screenshots captured and uploaded (for UI changes)
+- [ ] Commit messages follow conventional format
+- [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
+- [ ] Code review completed (inline self-review or full `ce:review`)
+- [ ] PR description includes summary, testing notes, and screenshots
+- [ ] PR description includes Compound Engineered badge with accurate model and harness
+
+## Code Review Tiers
+
+Every change gets reviewed. The tier determines depth, not whether review happens.
+
+**Tier 2 (full review)** -- REQUIRED default. Invoke `ce:review mode:autofix` with `plan:<path>` when available. Safe fixes are applied automatically; residual work surfaces as todos. Always use this tier unless all four Tier 1 criteria are explicitly confirmed.
+
+**Tier 1 (inline self-review)** -- permitted only when all four are true (state each explicitly before choosing):
+- Purely additive (new files only, no existing behavior modified)
+- Single concern (one skill, one component -- not cross-cutting)
+- Pattern-following (mirrors an existing example, no novel logic)
+- Plan-faithful (no scope growth, no surprising deferred-question resolutions)