feat: add claude-permissions-optimizer skill (#298)
---
title: "Offload data processing to bundled scripts to reduce token consumption"
category: "skill-design"
date: "2026-03-17"
tags:
- token-optimization
- skill-architecture
- bundled-scripts
- data-processing
severity: "high"
component: "plugins/compound-engineering/skills"
---

# Script-First Skill Architecture
When a skill processes large datasets (session transcripts, log files, configuration inventories), having the model do the processing is a token-expensive anti-pattern. Moving data processing into a bundled Node.js script and having the model present the results cuts token usage by 60-75%.
## Origin
Learned while building the `claude-permissions-optimizer` skill, which analyzes Claude Code session transcripts to find safe Bash commands to auto-allow. Initial iterations had the model reading JSONL session files, classifying commands against a 370-line reference doc, and normalizing patterns -- averaging 85-115k tokens per run. After moving all processing into the extraction script, runs dropped to ~40k tokens with equivalent output quality.
## The Anti-Pattern: Model-as-Processor
The default instinct when building a skill that touches data is to have the model read everything into context, parse it, classify it, and reason about it. This works for small inputs but scales terribly:
- Token usage grows linearly with data volume
- Most tokens are spent on mechanical work (parsing JSON, matching patterns, counting frequencies)
- Loading reference docs for classification rules inflates context further
- The model's actual judgment contributes almost nothing to the classification output
## The Pattern: Script Produces, Model Presents
```
skills/<skill-name>/
  SKILL.md              # Instructions: run script, present output
  scripts/
    process.mjs         # Does ALL data processing, outputs JSON
```
1. **Script does all mechanical work.** Reading files, parsing structured formats, applying classification rules (regex, keyword lists), normalizing results, computing counts. Outputs pre-classified JSON to stdout.
2. **SKILL.md instructs presentation only.** Run the script, read the JSON, format it for the user. Explicitly prohibit re-classifying, re-parsing, or loading reference files.
3. **Single source of truth for rules.** Classification logic lives exclusively in the script. The SKILL.md references the script's output categories as given facts but does not define them.
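The division of labor above can be sketched as a minimal classifier script. This is a hypothetical illustration, not the actual optimizer script: `SAFE_PATTERNS`, `classify`, and `tally` are invented names, and the rules are placeholders.

```javascript
// Hypothetical sketch of a script-first classifier (illustrative names;
// not the real claude-permissions-optimizer script).

// Deterministic classification rules live ONLY here, never in SKILL.md.
const SAFE_PATTERNS = [/^git (status|log|diff)\b/, /^ls\b/, /^cat\b/];

// Anything the rules cannot match falls into "unclassified" -- it is
// dropped, never handed back to the model for judgment.
function classify(command) {
  return SAFE_PATTERNS.some((re) => re.test(command)) ? "safe" : "unclassified";
}

// Tally commands into pre-classified buckets with occurrence counts.
function tally(commands) {
  const counts = { safe: {}, unclassified: {} };
  for (const command of commands) {
    const bucket = counts[classify(command)];
    bucket[command] = (bucket[command] ?? 0) + 1;
  }
  return counts;
}

// The real script would parse JSONL from process.argv[2]; the input is
// inlined here so the sketch stays self-contained.
console.log(JSON.stringify(tally(["git status", "git status", "rm -rf /"]), null, 2));
```

The model's only job is to read that JSON from stdout and format it; every mechanical step (parsing, matching, counting) has already happened before the model sees anything.
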
## Token Impact
| Approach | Tokens | Reduction |
|---|---|---|
| Model does everything (read, parse, classify, present) | ~100k | baseline |
| Added "do NOT grep session files" instruction | ~84k | 16% |
| Script classifies; model still loads reference doc | ~38k | 62% |
| Script classifies; model presents only | ~35k | 65% |
The biggest single win was moving classification into the script. The second was removing the instruction to load the reference file -- once the script handles classification, the reference file is maintenance documentation only.
## When to Apply
Apply script-first architecture when a skill meets **any** of these:
- Processes more than ~50 items or reads files larger than a few KB
- Classification rules are deterministic (regex, keyword lists, lookup tables)
- Input data follows a consistent schema (JSONL, CSV, structured logs)
- The skill runs frequently or feeds into further analysis
**Do not apply** when:
- The skill's core value is the model's judgment (code review, architectural analysis)
- Input is unstructured natural language
- The dataset is small enough that processing costs are negligible
## Anti-Patterns to Avoid
- **Instruction-only optimization.** Adding "don't do X" to SKILL.md without providing a script alternative. The model will find other token-expensive paths to the same result.
- **Hybrid classification.** Having the script classify some items and the model classify the rest. This still loads context and reference docs. Go all-in on the script. Items the script can't classify should be dropped as "unclassified," not handed to the model.
- **Dual rule definitions.** Classification rules in both the script AND the SKILL.md. They drift apart, the model may override the script's decisions, and tokens are wasted on re-evaluation. One source of truth.
## Checklist for Skill Authors
- [ ] Can the data processing be expressed as deterministic logic (regex, keyword matching, field checks)?
- [ ] Script is the single owner of all classification rules
- [ ] SKILL.md instructs the model to run the script as its first action
- [ ] SKILL.md does not restate or duplicate the script's classification logic
- [ ] Script output is structured JSON the model can present directly
- [ ] Reference docs exist for maintainers but are never loaded at runtime
- [ ] After building, verify the model is not doing any mechanical parsing or rule-application work
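A SKILL.md that passes this checklist can stay very short. A hypothetical sketch (the path and wording are illustrative, not the shipped skill):

```markdown
## Steps

1. Run `node scripts/process.mjs <transcript-path>` and capture the JSON it prints.
2. Present each category and its counts to the user, taking the script's
   classifications as given facts.
3. Do NOT re-parse session files, re-classify commands, or load reference docs.
```

Note what is absent: no classification rules, no file-format details, no reference material. All of that lives in the script.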
## Related
- [Reduce plugin context token usage](../../plans/2026-02-08-refactor-reduce-plugin-context-token-usage-plan.md) -- established the principle that descriptions are for discovery, detailed content belongs in the body
- [Compound refresh skill improvements](compound-refresh-skill-improvements.md) -- patterns for autonomous skill execution and subagent architecture
- [Beta skills framework](beta-skills-framework.md) -- skill organization and rollout conventions