feat(ce-work): reduce token usage by extracting late-sequence references (#540)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Trevin Chow
2026-04-09 12:48:21 -07:00
committed by GitHub
parent 31b0686c2e
commit bb59547a2e
6 changed files with 543 additions and 285 deletions

View File

@@ -333,109 +333,9 @@ Determine how to proceed based on what was provided in `<input_document>`.
- Create new tasks if scope expands
- Keep user informed of major milestones
### Phase 3: Quality Check
### Phase 3-4: Quality Check and Ship It
1. **Run Core Quality Checks**
Always run before submitting:
```bash
# Run full test suite (use project's test command)
# Examples: bin/rails test, npm test, pytest, go test, etc.
# Run linting (per AGENTS.md)
# Use linting-agent before pushing to origin
```
2. **Code Review** (REQUIRED)
Every change gets reviewed before shipping. The depth scales with the change's risk profile, but review itself is never skipped.
**Tier 2: Full review (default)** — REQUIRED unless Tier 1 criteria are explicitly met. Invoke the `ce:review` skill with `mode:autofix` to run specialized reviewer agents, auto-apply safe fixes, and surface residual work as todos. When the plan file path is known, pass it as `plan:<path>`. This is the mandatory default — proceed to Tier 1 only after confirming every criterion below.
**Tier 1: Inline self-review** — A lighter alternative permitted only when **all four** criteria are true. Before choosing Tier 1, explicitly state which criteria apply and why. If any criterion is uncertain, use Tier 2.
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component — not cross-cutting)
- Pattern-following (implementation mirrors an existing example with no novel logic)
- Plan-faithful (no scope growth, no deferred questions resolved with surprising answers)
3. **Final Validation**
- All tasks marked completed
- Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- Linting passes
- Code follows existing patterns
- Figma designs match (if applicable)
- No console errors or warnings
- If the plan has a `Requirements Trace`, verify each requirement is satisfied by the completed work
- If any `Deferred to Implementation` questions were noted, confirm they were resolved during execution
4. **Prepare Operational Validation Plan** (REQUIRED)
- Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change.
- Include concrete:
- Log queries/search terms
- Metrics or dashboards to watch
- Expected healthy signals
- Failure signals and rollback/mitigation trigger
- Validation window and owner
- If there is truly no production/runtime impact, still include the section with: `No additional operational monitoring required` and a one-line reason.
### Phase 4: Ship It
1. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
For **any** design changes, new views, or UI modifications, capture and upload screenshots before creating the PR:
**Step 1: Start dev server** (if not running)
```bash
bin/dev # Run in background
```
**Step 2: Capture screenshots with agent-browser CLI**
```bash
agent-browser open http://localhost:3000/[route]
agent-browser snapshot -i
agent-browser screenshot output.png
```
See the `agent-browser` skill for detailed usage.
**Step 3: Upload using imgup skill**
```bash
skill: imgup
# Then upload each screenshot:
imgup -h pixhost screenshot.png # pixhost works without API key
# Alternative hosts: catbox, imagebin, beeimg
```
**What to capture:**
- **New screens**: Screenshot of the new UI
- **Modified screens**: Before AND after screenshots
- **Design implementation**: Screenshot showing Figma design match
2. **Commit and Create Pull Request**
Load the `git-commit-push-pr` skill to handle committing, pushing, and PR creation. The skill handles convention detection, branch safety, logical commit splitting, adaptive PR descriptions, and attribution badges.
When providing context for the PR description, include:
- The plan's summary and key decisions
- Testing notes (tests added/modified, manual testing performed)
- Screenshot URLs from step 1 (if applicable)
- Figma design link (if applicable)
- The Post-Deploy Monitoring & Validation section (see Phase 3 Step 4)
If the user prefers to commit without creating a PR, load the `git-commit` skill instead.
3. **Update Plan Status**
If the input document has YAML frontmatter with a `status` field, update it to `completed`:
```
status: active → status: completed
```
4. **Notify User**
- Summarize what was completed
- Link to PR (if one was created)
- Note any follow-up work needed
- Suggest next steps if applicable
When all Phase 2 tasks are complete and execution transitions to quality check, read `references/shipping-workflow.md` for the full shipping workflow: quality checks, code review, final validation, PR creation, and notification.
---
@@ -478,35 +378,6 @@ When `delegation_active` is true after argument parsing, read `references/codex-
- Don't leave features 80% done
- A finished feature that ships beats a perfect feature that doesn't
## Quality Checklist
Before creating PR, verify:
- [ ] All clarifying questions asked and answered
- [ ] All tasks marked completed
- [ ] Testing addressed -- tests pass AND new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)
- [ ] Before/after screenshots captured and uploaded (for UI changes)
- [ ] Commit messages follow conventional format
- [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
- [ ] Code review completed (inline self-review or full `ce:review`)
- [ ] PR description includes summary, testing notes, and screenshots
- [ ] PR description includes Compound Engineered badge with accurate model and harness
## Code Review Tiers
Every change gets reviewed. The tier determines depth, not whether review happens.
**Tier 2 (full review)** — REQUIRED default. Invoke `ce:review mode:autofix` with `plan:<path>` when available. Safe fixes are applied automatically; residual work surfaces as todos. Always use this tier unless all four Tier 1 criteria are explicitly confirmed.
**Tier 1 (inline self-review)** — permitted only when all four are true (state each explicitly before choosing):
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component — not cross-cutting)
- Pattern-following (mirrors an existing example, no novel logic)
- Plan-faithful (no scope growth, no surprising deferred-question resolutions)
## Common Pitfalls to Avoid
- **Analysis paralysis** - Don't overthink, read the plan and execute

View File

@@ -0,0 +1,136 @@
# Shipping Workflow
This file contains the shipping workflow (Phase 3-4). Load it only when all Phase 2 tasks are complete and execution transitions to quality check.
## Phase 3: Quality Check
1. **Run Core Quality Checks**
Always run before submitting:
```bash
# Run full test suite (use project's test command)
# Examples: bin/rails test, npm test, pytest, go test, etc.
# Run linting (per AGENTS.md)
# Use linting-agent before pushing to origin
```
2. **Code Review** (REQUIRED)
Every change gets reviewed before shipping. The depth scales with the change's risk profile, but review itself is never skipped.
**Tier 2: Full review (default)** -- REQUIRED unless Tier 1 criteria are explicitly met. Invoke the `ce:review` skill with `mode:autofix` to run specialized reviewer agents, auto-apply safe fixes, and surface residual work as todos. When the plan file path is known, pass it as `plan:<path>`. This is the mandatory default -- proceed to Tier 1 only after confirming every criterion below.
**Tier 1: Inline self-review** -- A lighter alternative permitted only when **all four** criteria are true. Before choosing Tier 1, explicitly state which criteria apply and why. If any criterion is uncertain, use Tier 2.
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component -- not cross-cutting)
- Pattern-following (implementation mirrors an existing example with no novel logic)
- Plan-faithful (no scope growth, no deferred questions resolved with surprising answers)
3. **Final Validation**
- All tasks marked completed
- Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- Linting passes
- Code follows existing patterns
- Figma designs match (if applicable)
- No console errors or warnings
- If the plan has a `Requirements Trace`, verify each requirement is satisfied by the completed work
- If any `Deferred to Implementation` questions were noted, confirm they were resolved during execution
4. **Prepare Operational Validation Plan** (REQUIRED)
- Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change.
- Include concrete:
- Log queries/search terms
- Metrics or dashboards to watch
- Expected healthy signals
- Failure signals and rollback/mitigation trigger
- Validation window and owner
- If there is truly no production/runtime impact, still include the section with: `No additional operational monitoring required` and a one-line reason.
## Phase 4: Ship It
1. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
For **any** design changes, new views, or UI modifications, capture and upload screenshots before creating the PR:
**Step 1: Start dev server** (if not running)
```bash
bin/dev # Run in background
```
**Step 2: Capture screenshots with agent-browser CLI**
```bash
agent-browser open http://localhost:3000/[route]
agent-browser snapshot -i
agent-browser screenshot output.png
```
See the `agent-browser` skill for detailed usage.
**Step 3: Upload using imgup skill**
```bash
skill: imgup
# Then upload each screenshot:
imgup -h pixhost screenshot.png # pixhost works without API key
# Alternative hosts: catbox, imagebin, beeimg
```
**What to capture:**
- **New screens**: Screenshot of the new UI
- **Modified screens**: Before AND after screenshots
- **Design implementation**: Screenshot showing Figma design match
2. **Update Plan Status**
If the input document has YAML frontmatter with a `status` field, update it to `completed`:
```
status: active -> status: completed
```
3. **Commit and Create Pull Request**
Load the `git-commit-push-pr` skill to handle committing, pushing, and PR creation. The skill handles convention detection, branch safety, logical commit splitting, adaptive PR descriptions, and attribution badges.
When providing context for the PR description, include:
- The plan's summary and key decisions
- Testing notes (tests added/modified, manual testing performed)
- Screenshot URLs from step 1 (if applicable)
- Figma design link (if applicable)
- The Post-Deploy Monitoring & Validation section (see Phase 3 Step 4)
If the user prefers to commit without creating a PR, load the `git-commit` skill instead.
4. **Notify User**
- Summarize what was completed
- Link to PR (if one was created)
- Note any follow-up work needed
- Suggest next steps if applicable
## Quality Checklist
Before creating PR, verify:
- [ ] All clarifying questions asked and answered
- [ ] All tasks marked completed
- [ ] Testing addressed -- tests pass AND new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)
- [ ] Before/after screenshots captured and uploaded (for UI changes)
- [ ] Commit messages follow conventional format
- [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
- [ ] Code review completed (inline self-review or full `ce:review`)
- [ ] PR description includes summary, testing notes, and screenshots
- [ ] PR description includes Compound Engineered badge with accurate model and harness
## Code Review Tiers
Every change gets reviewed. The tier determines depth, not whether review happens.
**Tier 2 (full review)** -- REQUIRED default. Invoke `ce:review mode:autofix` with `plan:<path>` when available. Safe fixes are applied automatically; residual work surfaces as todos. Always use this tier unless all four Tier 1 criteria are explicitly confirmed.
**Tier 1 (inline self-review)** -- permitted only when all four are true (state each explicitly before choosing):
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component -- not cross-cutting)
- Pattern-following (mirrors an existing example, no novel logic)
- Plan-faithful (no scope growth, no surprising deferred-question resolutions)

View File

@@ -268,109 +268,9 @@ Determine how to proceed based on what was provided in `<input_document>`.
- Create new tasks if scope expands
- Keep user informed of major milestones
### Phase 3: Quality Check
### Phase 3-4: Quality Check and Ship It
1. **Run Core Quality Checks**
Always run before submitting:
```bash
# Run full test suite (use project's test command)
# Examples: bin/rails test, npm test, pytest, go test, etc.
# Run linting (per AGENTS.md)
# Use linting-agent before pushing to origin
```
2. **Code Review** (REQUIRED)
Every change gets reviewed before shipping. The depth scales with the change's risk profile, but review itself is never skipped.
**Tier 2: Full review (default)** — REQUIRED unless Tier 1 criteria are explicitly met. Invoke the `ce:review` skill with `mode:autofix` to run specialized reviewer agents, auto-apply safe fixes, and surface residual work as todos. When the plan file path is known, pass it as `plan:<path>`. This is the mandatory default — proceed to Tier 1 only after confirming every criterion below.
**Tier 1: Inline self-review** — A lighter alternative permitted only when **all four** criteria are true. Before choosing Tier 1, explicitly state which criteria apply and why. If any criterion is uncertain, use Tier 2.
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component — not cross-cutting)
- Pattern-following (implementation mirrors an existing example with no novel logic)
- Plan-faithful (no scope growth, no deferred questions resolved with surprising answers)
3. **Final Validation**
- All tasks marked completed
- Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- Linting passes
- Code follows existing patterns
- Figma designs match (if applicable)
- No console errors or warnings
- If the plan has a `Requirements Trace`, verify each requirement is satisfied by the completed work
- If any `Deferred to Implementation` questions were noted, confirm they were resolved during execution
4. **Prepare Operational Validation Plan** (REQUIRED)
- Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change.
- Include concrete:
- Log queries/search terms
- Metrics or dashboards to watch
- Expected healthy signals
- Failure signals and rollback/mitigation trigger
- Validation window and owner
- If there is truly no production/runtime impact, still include the section with: `No additional operational monitoring required` and a one-line reason.
### Phase 4: Ship It
1. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
For **any** design changes, new views, or UI modifications, capture and upload screenshots before creating the PR:
**Step 1: Start dev server** (if not running)
```bash
bin/dev # Run in background
```
**Step 2: Capture screenshots with agent-browser CLI**
```bash
agent-browser open http://localhost:3000/[route]
agent-browser snapshot -i
agent-browser screenshot output.png
```
See the `agent-browser` skill for detailed usage.
**Step 3: Upload using imgup skill**
```bash
skill: imgup
# Then upload each screenshot:
imgup -h pixhost screenshot.png # pixhost works without API key
# Alternative hosts: catbox, imagebin, beeimg
```
**What to capture:**
- **New screens**: Screenshot of the new UI
- **Modified screens**: Before AND after screenshots
- **Design implementation**: Screenshot showing Figma design match
2. **Commit and Create Pull Request**
Load the `git-commit-push-pr` skill to handle committing, pushing, and PR creation. The skill handles convention detection, branch safety, logical commit splitting, adaptive PR descriptions, and attribution badges.
When providing context for the PR description, include:
- The plan's summary and key decisions
- Testing notes (tests added/modified, manual testing performed)
- Screenshot URLs from step 1 (if applicable)
- Figma design link (if applicable)
- The Post-Deploy Monitoring & Validation section (see Phase 3 Step 4)
If the user prefers to commit without creating a PR, load the `git-commit` skill instead.
3. **Update Plan Status**
If the input document has YAML frontmatter with a `status` field, update it to `completed`:
```
status: active → status: completed
```
4. **Notify User**
- Summarize what was completed
- Link to PR (if one was created)
- Note any follow-up work needed
- Suggest next steps if applicable
When all Phase 2 tasks are complete and execution transitions to quality check, read `references/shipping-workflow.md` for the full shipping workflow: quality checks, code review, final validation, PR creation, and notification.
## Key Principles
@@ -405,35 +305,6 @@ Determine how to proceed based on what was provided in `<input_document>`.
- Don't leave features 80% done
- A finished feature that ships beats a perfect feature that doesn't
## Quality Checklist
Before creating PR, verify:
- [ ] All clarifying questions asked and answered
- [ ] All tasks marked completed
- [ ] Testing addressed -- tests pass AND new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)
- [ ] Before/after screenshots captured and uploaded (for UI changes)
- [ ] Commit messages follow conventional format
- [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
- [ ] Code review completed (inline self-review or full `ce:review`)
- [ ] PR description includes summary, testing notes, and screenshots
- [ ] PR description includes Compound Engineered badge with accurate model and harness
## Code Review Tiers
Every change gets reviewed. The tier determines depth, not whether review happens.
**Tier 2 (full review)** — REQUIRED default. Invoke `ce:review mode:autofix` with `plan:<path>` when available. Safe fixes are applied automatically; residual work surfaces as todos. Always use this tier unless all four Tier 1 criteria are explicitly confirmed.
**Tier 1 (inline self-review)** — permitted only when all four are true (state each explicitly before choosing):
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component — not cross-cutting)
- Pattern-following (mirrors an existing example, no novel logic)
- Plan-faithful (no scope growth, no surprising deferred-question resolutions)
## Common Pitfalls to Avoid
- **Analysis paralysis** - Don't overthink, read the plan and execute

View File

@@ -0,0 +1,136 @@
# Shipping Workflow
This file contains the shipping workflow (Phase 3-4). Load it only when all Phase 2 tasks are complete and execution transitions to quality check.
## Phase 3: Quality Check
1. **Run Core Quality Checks**
Always run before submitting:
```bash
# Run full test suite (use project's test command)
# Examples: bin/rails test, npm test, pytest, go test, etc.
# Run linting (per AGENTS.md)
# Use linting-agent before pushing to origin
```
2. **Code Review** (REQUIRED)
Every change gets reviewed before shipping. The depth scales with the change's risk profile, but review itself is never skipped.
**Tier 2: Full review (default)** -- REQUIRED unless Tier 1 criteria are explicitly met. Invoke the `ce:review` skill with `mode:autofix` to run specialized reviewer agents, auto-apply safe fixes, and surface residual work as todos. When the plan file path is known, pass it as `plan:<path>`. This is the mandatory default -- proceed to Tier 1 only after confirming every criterion below.
**Tier 1: Inline self-review** -- A lighter alternative permitted only when **all four** criteria are true. Before choosing Tier 1, explicitly state which criteria apply and why. If any criterion is uncertain, use Tier 2.
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component -- not cross-cutting)
- Pattern-following (implementation mirrors an existing example with no novel logic)
- Plan-faithful (no scope growth, no deferred questions resolved with surprising answers)
3. **Final Validation**
- All tasks marked completed
- Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- Linting passes
- Code follows existing patterns
- Figma designs match (if applicable)
- No console errors or warnings
- If the plan has a `Requirements Trace`, verify each requirement is satisfied by the completed work
- If any `Deferred to Implementation` questions were noted, confirm they were resolved during execution
4. **Prepare Operational Validation Plan** (REQUIRED)
- Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change.
- Include concrete:
- Log queries/search terms
- Metrics or dashboards to watch
- Expected healthy signals
- Failure signals and rollback/mitigation trigger
- Validation window and owner
- If there is truly no production/runtime impact, still include the section with: `No additional operational monitoring required` and a one-line reason.
## Phase 4: Ship It
1. **Capture and Upload Screenshots for UI Changes** (REQUIRED for any UI work)
For **any** design changes, new views, or UI modifications, capture and upload screenshots before creating the PR:
**Step 1: Start dev server** (if not running)
```bash
bin/dev # Run in background
```
**Step 2: Capture screenshots with agent-browser CLI**
```bash
agent-browser open http://localhost:3000/[route]
agent-browser snapshot -i
agent-browser screenshot output.png
```
See the `agent-browser` skill for detailed usage.
**Step 3: Upload using imgup skill**
```bash
skill: imgup
# Then upload each screenshot:
imgup -h pixhost screenshot.png # pixhost works without API key
# Alternative hosts: catbox, imagebin, beeimg
```
**What to capture:**
- **New screens**: Screenshot of the new UI
- **Modified screens**: Before AND after screenshots
- **Design implementation**: Screenshot showing Figma design match
2. **Update Plan Status**
If the input document has YAML frontmatter with a `status` field, update it to `completed`:
```
status: active -> status: completed
```
3. **Commit and Create Pull Request**
Load the `git-commit-push-pr` skill to handle committing, pushing, and PR creation. The skill handles convention detection, branch safety, logical commit splitting, adaptive PR descriptions, and attribution badges.
When providing context for the PR description, include:
- The plan's summary and key decisions
- Testing notes (tests added/modified, manual testing performed)
- Screenshot URLs from step 1 (if applicable)
- Figma design link (if applicable)
- The Post-Deploy Monitoring & Validation section (see Phase 3 Step 4)
If the user prefers to commit without creating a PR, load the `git-commit` skill instead.
4. **Notify User**
- Summarize what was completed
- Link to PR (if one was created)
- Note any follow-up work needed
- Suggest next steps if applicable
## Quality Checklist
Before creating PR, verify:
- [ ] All clarifying questions asked and answered
- [ ] All tasks marked completed
- [ ] Testing addressed -- tests pass AND new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed)
- [ ] Linting passes (use linting-agent)
- [ ] Code follows existing patterns
- [ ] Figma designs match implementation (if applicable)
- [ ] Before/after screenshots captured and uploaded (for UI changes)
- [ ] Commit messages follow conventional format
- [ ] PR description includes Post-Deploy Monitoring & Validation section (or explicit no-impact rationale)
- [ ] Code review completed (inline self-review or full `ce:review`)
- [ ] PR description includes summary, testing notes, and screenshots
- [ ] PR description includes Compound Engineered badge with accurate model and harness
## Code Review Tiers
Every change gets reviewed. The tier determines depth, not whether review happens.
**Tier 2 (full review)** -- REQUIRED default. Invoke `ce:review mode:autofix` with `plan:<path>` when available. Safe fixes are applied automatically; residual work surfaces as todos. Always use this tier unless all four Tier 1 criteria are explicitly confirmed.
**Tier 1 (inline self-review)** -- permitted only when all four are true (state each explicitly before choosing):
- Purely additive (new files only, no existing behavior modified)
- Single concern (one skill, one component -- not cross-cutting)
- Pattern-following (mirrors an existing example, no novel logic)
- Plan-faithful (no scope growth, no surprising deferred-question resolutions)