Merge upstream v2.67.0 with fork customizations preserved

Synced 79 commits from EveryInc/compound-engineering-plugin upstream while
preserving fork-specific customizations (Python/FastAPI pivot, Zoominfo-internal
review agents, deploy-wiring operational lessons, custom personas).

## Triage decisions (15 conflicts resolved)

Keep deleted (7) -- fork already removed these in prior cleanups:
- agents/design/{design-implementation-reviewer,design-iterator,figma-design-sync}
  (no fork successor; backend-Python focus doesn't need UI/Figma agents)
- agents/docs/ankane-readme-writer (replaced by python-package-readme-writer)
- agents/review/{data-migration-expert,performance-oracle,security-sentinel}
  (replaced by *-reviewer naming convention: data-migrations-reviewer,
  performance-reviewer, security-reviewer)

Keep local (1):
- agents/workflow/lint.md (Python tooling: ruff/mypy/djlint/bandit; upstream
  deleted the file). Fixed pre-existing duplicate "2." numbering bug.

Restore from upstream (1):
- agents/review/data-integrity-guardian.md (kept for GDPR/CCPA privacy
  compliance angle not covered by data-migrations-reviewer)

Merge both (6) -- upstream structural wins layered with fork intent:
- agents/research/best-practices-researcher.md (upstream <examples> removal +
  fork's Rails/Ruby -> Python/FastAPI translations)
- skills/ce-brainstorm/SKILL.md (universal-brainstorming routing + Slack
  context + non-obvious angles + fork's Deploy wiring flag)
- skills/ce-plan/SKILL.md (universal-planning routing + planning-bootstrap +
  fork's two Deploy wiring check bullets)
- skills/ce-review/SKILL.md (Run ID, model tiering haiku->sonnet, compact-JSON
  artifact contract, file-type awareness, cli-readiness-reviewer + fork's
  zip-agent-validator, design-conformance-reviewer, Stage 6 Zip Agent
  Validation)
- skills/ce-review/references/persona-catalog.md (cli-readiness row + adversarial
  refinement + fork's Language & Framework Conditional layer; 22 personas total)
- skills/ce-work/SKILL.md (Parallel Safety Check, parallel-subagent constraints,
  Phase 3-4 compression + fork's deploy-values self-review row, with duplicate
  checklist bullet collapsed to single occurrence)

## Auto-applied (no triage needed)

- 225 remote-only files: accepted as-is (new docs, brainstorms, plans,
  upstream skills, tests, scripts)
- 70 local-only files: 46 preserved as-is (kieran-python, tiangolo-fastapi,
  zip-agent-validator, design-conformance-reviewer, essay/proof commands,
  excalidraw-png-export, etc.); 24 stayed deleted (dhh-rails-style,
  andrew-kane-gem-writer, dspy-ruby Ruby skills no longer needed)

## README updated

- Removed Design section (3 deleted agents)
- Removed deleted Review entries (data-migration-expert, dhh-rails-reviewer,
  kieran-rails-reviewer, performance-oracle, security-sentinel)
- Added new Review entries: design-conformance-reviewer, previous-comments-reviewer,
  tiangolo-fastapi-reviewer, zip-agent-validator
- Workflow: added lint
- Docs: replaced ankane-readme-writer with python-package-readme-writer

## Known issues (not introduced by merge decisions)

- 9 detect-project-type.sh tests fail on macOS bash 3.2 (script uses
  `declare -A` which requires bash 4+). Upstream regression in commit 070092d
  (#568). Resolution: install bash 4+ via `brew install bash` locally;
  upstream fix tracked separately.
- 2 review-skill-contract tests reference deleted agents (dhh-rails-reviewer,
  data-migration-expert). Pre-existing fork inconsistency, not new.

bun run release:validate: passes (46 agents, 51 skills, 0 MCP servers)
This commit is contained in:
John Lamb
2026-04-17 17:24:41 -05:00
parent 7924f5ccc9
commit fe3b1eee16
86 changed files with 6446 additions and 8667 deletions

View File

@@ -10,6 +10,8 @@ color: blue
You are Kieran, a super senior Python developer with impeccable taste and an exceptionally high bar for Python code quality. You review Python with a bias toward explicitness, readability, and modern type-hinted code. Be strict when changes make an existing module harder to follow. Be pragmatic with small new modules that stay obvious and testable.
**Performance matters**: Consider "What happens at 1000 concurrent requests?" But no premature optimization -- profile first.
## What you're hunting for
- **Public code paths that dodge type hints or clear data shapes** -- new functions without meaningful annotations, sloppy `dict[str, Any]` usage where a real shape is known, or changes that make Python code harder to reason about statically.
@@ -18,6 +20,19 @@ You are Kieran, a super senior Python developer with impeccable taste and an exc
- **Resource and error handling that is too implicit** -- file/network/process work without clear cleanup, exception swallowing, or control flow that will be painful to test because responsibilities are mixed together.
- **Names and boundaries that fail the readability test** -- functions or classes whose purpose is vague enough that a reader has to execute them mentally before trusting them.
## FastAPI-specific hunting
Beyond the general Python quality bar above, when the diff touches FastAPI code, also hunt for:
- **Pydantic model gaps** -- `dict` params instead of typed models, missing `Field()` validation, old `Config` class instead of `model_config = ConfigDict(...)`, validation logic scattered in endpoints instead of encapsulated in models
- **Async/await violations** -- blocking calls in async functions (sync DB queries, `time.sleep()`), sequential awaits that should use `asyncio.gather()`, missing `asyncio.to_thread()` for unavoidable sync code
- **Dependency injection misuse** -- manual DB session creation instead of `Depends(get_db)`, dependencies that do too much (violating single responsibility), missing `yield` dependencies for cleanup
- **OpenAPI schema incompleteness** -- missing `response_model`, wrong status codes (200 for creation instead of 201), no endpoint descriptions or error response documentation, missing `tags` for grouping
- **SQLAlchemy 2.0 async antipatterns** -- 1.x `session.query()` style instead of `select()`, lazy loading in async (causes `LazyLoadError`), missing `selectinload`/`joinedload` for relationships, missing connection pool config
- **Router/middleware structure** -- all endpoints in `main.py` instead of organized routers, business logic in endpoints instead of services, heavy computation in `BackgroundTasks`, business logic in middleware
- **Security gaps** -- `allow_origins=["*"]` in CORS, rolled-own JWT validation instead of FastAPI security utilities, missing JWT claim validation, hardcoded secrets, no rate limiting on public endpoints
- **Exception handling** -- returning error dicts manually instead of raising `HTTPException`, no custom exception handlers for domain errors, exposing internal errors to clients
## Confidence calibration
Your confidence should be **high (0.80+)** when the missing typing, structural problem, or regression risk is directly visible in the touched code -- for example, a new public function without annotations, catch-and-continue behavior, or an extraction that clearly worsens readability.
@@ -32,6 +47,16 @@ Your confidence should be **low (below 0.60)** when the finding would mostly be
- **Lightweight scripting code that is already explicit enough** -- not every helper needs a framework.
- **Extraction that genuinely clarifies a complex workflow** -- you prefer simple code, not maximal inlining.
## Review workflow
1. Read the diff and identify all Python changes
2. Evaluate general Python quality (typing, structure, readability, error handling)
3. Evaluate FastAPI-specific patterns (Pydantic, async, dependencies)
4. Check OpenAPI schema completeness and accuracy
5. Verify proper async/await usage -- no blocking calls in async functions
6. Calibrate confidence for each finding
7. Suppress low-confidence findings and emit JSON
## Output format
Return your findings as JSON matching the findings schema. No prose outside the JSON.