From 153bea8669d63848f57942e842cd58ed664e7435 Mon Sep 17 00:00:00 2001
From: Trevin Chow <trevin@trevinchow.com>
Date: Mon, 20 Apr 2026 20:44:16 -0700
Subject: [PATCH] fix(ce-resolve-pr-feedback): stop dropping unresolved and
 actionable feedback (#617)

---
 .../workflow/ce-pr-comment-resolver.agent.md  |  5 ++
 .../skills/ce-resolve-pr-feedback/SKILL.md    |  9 ++--
 .../scripts/get-pr-comments                   | 46 ++++++++++++-------
 .../scripts/get-thread-for-comment            |  4 ++
 4 files changed, 43 insertions(+), 21 deletions(-)
diff --git a/plugins/compound-engineering/agents/workflow/ce-pr-comment-resolver.agent.md b/plugins/compound-engineering/agents/workflow/ce-pr-comment-resolver.agent.md
index 0f46dd7..d385c78 100644
--- a/plugins/compound-engineering/agents/workflow/ce-pr-comment-resolver.agent.md
+++ b/plugins/compound-engineering/agents/workflow/ce-pr-comment-resolver.agent.md
@@ -32,6 +32,11 @@ Before touching any code, read the referenced file and classify the feedback:
 3. **Is it still relevant?** Has the code at this location changed since the review?
    - NO -> verdict: `not-addressing`
 
+   **Outdated threads (`isOutdated=true`):** The diff hunk shifted, so the reported line may no longer be where the concern lives. GitHub also exposes `line` as nullable -- outdated and file-level threads often have `line == null`. Start the lookup at whichever location field is available, preferring in order: `line`, `startLine`, `originalLine`, `originalStartLine`. If none resolve to current content matching the reviewer's description, extract an anchor from the comment (a symbol, identifier, or distinctive phrase) and search the **same file** once for it before concluding. Do not search other files. Three outcomes:
+   - Anchor found in the file (here or elsewhere in it) -> re-evaluate at that location using steps 2-4.
+   - Anchor not found and the comment describes concrete in-place code -> verdict: `not-addressing` with evidence ("searched <file> for <anchor>, not present").
+   - Anchor not found and the comment suggests the code was extracted to another file -> verdict: `needs-human`. Do not grep the repo; the reviewer's surrounding context is gone and picking the right new location is a judgment call for the user.
+
 4. **Would fixing improve the code?**
    - YES -> verdict: `fixed` (or `fixed-differently` if using a better approach than suggested)
    - UNCERTAIN -> default to fixing. Agent time is cheap.
diff --git a/plugins/compound-engineering/skills/ce-resolve-pr-feedback/SKILL.md b/plugins/compound-engineering/skills/ce-resolve-pr-feedback/SKILL.md
index 4b7d100..98c8bd6 100644
--- a/plugins/compound-engineering/skills/ce-resolve-pr-feedback/SKILL.md
+++ b/plugins/compound-engineering/skills/ce-resolve-pr-feedback/SKILL.md
@@ -49,7 +49,7 @@ Returns a JSON object with three keys:
 
 | Key | Contents | Has file/line? | Resolvable? |
 |-----|----------|---------------|-------------|
-| `review_threads` | Unresolved, non-outdated inline code review threads | Yes | Yes (GraphQL) |
+| `review_threads` | Unresolved inline code review threads (includes outdated; each carries its `isOutdated` flag so the resolver can account for line drift) | Yes | Yes (GraphQL) |
 | `pr_comments` | Top-level PR conversation comments (excludes PR author) | No | No |
 | `review_bodies` | Review submission bodies with non-empty text (excludes PR author) | No | No |
 
@@ -72,7 +72,7 @@ Before processing, classify each piece of feedback as **new** or **already handl
 
 The distinction is about content, not who posted what. A deferral from a teammate, a previous skill run, or a manual reply all count. Similarly, actionability is about content -- bot feedback that requests a specific code change is actionable; a bot's boilerplate header wrapping those requests is not.
 
-**Silent drop.** Non-actionable items are dropped without narration. Do not announce, list, or count dropped items in conversation, the task list, or the step 10 summary. Known review-bot wrappers (CodeRabbit, Codex, Gemini Code Assist, Copilot) and CI/status bot summaries (Codecov) are already filtered at the script level and will not appear; if a similar wrapper slips through, drop it silently and move on.
+**Silent drop.** Non-actionable items are dropped without narration. Do not announce, list, or count dropped items in conversation, the task list, or the step 10 summary. Review-bot wrappers from CodeRabbit, Codex, Gemini Code Assist, and Copilot (bodies like "Here are some automated review suggestions...") commonly appear here -- recognize them by their boilerplate content, drop silently. Only CI/status bot summaries (Codecov) are pre-filtered at the script level; everything else relies on this content-aware check so bot format changes cannot silently hide actionable findings.
 
 If there are no new items across all feedback types, skip steps 3-9 and go straight to step 10.
 
@@ -147,10 +147,11 @@ Previously-resolved threads (from `cross_invocation.resolved_threads`) participa
 
 Each agent receives:
 - The thread ID
-- The file path and line number
+- The file path and location fields: `line`, `originalLine`, `startLine`, `originalStartLine` (any can be null; outdated and file-level threads often have `line == null` and must fall back to `originalLine`)
 - The full comment text (all comments in the thread)
 - The PR number (for context)
 - The feedback type (`review_thread`)
+- The `isOutdated` flag from the thread node (tells the agent the reported line may have drifted)
 
 **For PR comments and review bodies** (`pr_comments`, `review_bodies`): These lack file/line context. Spawn a `workflow:ce-pr-comment-resolver` agent for each actionable non-clustered item. The agent receives the comment ID, body text, PR number, and feedback type (`pr_comment` or `review_body`). The agent must identify the relevant files from the comment text and the PR diff.
 
@@ -375,7 +376,7 @@ This fetches thread IDs and their first comment IDs (minimal fields, no bodies)
 
 ### 2. Fix, Reply, Resolve
 
-Spawn a single `workflow:ce-pr-comment-resolver` agent for the thread. Then follow the same validate -> commit -> push -> reply -> resolve flow as Full Mode steps 6-8.
+Spawn a single `workflow:ce-pr-comment-resolver` agent for the thread. Pass the same fields full mode does, including `isOutdated` and the location fields (`line`, `originalLine`, `startLine`, `originalStartLine`) -- targeted threads can be outdated too and need the same relocation handling. Then follow the same validate -> commit -> push -> reply -> resolve flow as Full Mode steps 6-8.
 
 ---
 
diff --git a/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-pr-comments b/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-pr-comments
index 5dfab1d..1aa15ac 100755
--- a/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-pr-comments
+++ b/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-pr-comments
@@ -33,17 +33,18 @@ fi
 #     signal: true when both resolved and unresolved threads exist (multi-round review)
 #     resolved_threads: last N resolved threads by recency, for cluster analysis input
 #
-# Bot filtering: the bots listed below fall into two categories:
-#   - AI review bots (coderabbitai, codex, gemini, copilot): put actionable
-#     feedback in inline review comments (review_threads). Their top-level PR
-#     comments and review submission bodies are wrappers/summaries with no
-#     actionable asks on their own.
-#   - CI/status bots (codecov): post coverage, build, or deploy summaries. Never
-#     actionable on their own; any follow-up belongs in human review comments.
-# Filtering them at the source keeps the agent focused on real feedback and
-# avoids narrating "ignoring bot wrapper" on every run. Add new bot logins here
-# as they're observed; prefer exact login match over pattern matching to avoid
-# dropping actionable content from general-purpose bots.
+# Bot filtering: only CI/status bots (codecov, etc.) are filtered at the source.
+# Their output is structurally never actionable -- coverage numbers, build
+# summaries, deploy status -- and that holds regardless of format changes.
+# AI review bots (coderabbitai, codex, gemini, copilot) are NOT filtered here.
+# Historically their top-level comments were assumed to always be wrappers, but
+# that turned out to be wrong: Codex sometimes posts actionable findings as
+# top-level PR comments with no inline thread counterpart. Any source-level
+# heuristic to separate wrapper from actionable for these bots is brittle (one
+# bot format change away from silently dropping feedback). SKILL.md step 2
+# has a content-aware actionability check and Silent Drop rule that handles
+# wrappers correctly, so we trust that layer instead. Add new logins to the CI
+# list only if their output is structurally non-actionable like codecov's.
 gh api graphql -f owner="$OWNER" -f repo="$REPO" -F pr="$PR_NUMBER" -f query='
 query FetchPRFeedback($owner: String!, $repo: String!, $pr: Int!) {
   repository(owner: $owner, name: $repo) {
@@ -57,6 +58,9 @@ query FetchPRFeedback($owner: String!, $repo: String!, $pr: Int!) {
             isOutdated
             path
             line
+            originalLine
+            startLine
+            originalStartLine
             comments(first: 10) {
               nodes {
                 id
@@ -87,11 +91,15 @@ query FetchPRFeedback($owner: String!, $repo: String!, $pr: Int!) {
     }
   }
 }' | jq '.data.repository.pullRequest as $pr |
-  # Bots whose top-level comments and review bodies are wrappers or CI summaries, not actionable.
-  ["coderabbitai", "chatgpt-codex-connector", "gemini-code-assist", "copilot-pull-request-reviewer", "codecov"] as $review_bot_logins |
-  # Unresolved threads (existing behavior, unchanged)
+  # Structurally non-actionable bot output; always dropped.
+  ["codecov"] as $ci_bot_logins |
+  # Unresolved threads. `isOutdated` means the diff hunk around the comment
+  # has shifted since the thread was opened -- not that the reviewer concern
+  # was addressed. Resolution state is the only authoritative signal; outdated
+  # threads are still surfaced (with their isOutdated flag intact) so the
+  # resolver can factor in that the referenced line may have moved.
   [$pr.reviewThreads.edges[]
-    | select(.node.isResolved == false and .node.isOutdated == false)] as $unresolved |
+    | select(.node.isResolved == false)] as $unresolved |
   # Resolved threads for cross-invocation awareness (last 10 by most recent comment)
   [$pr.reviewThreads.edges[]
     | select(.node.isResolved == true)
@@ -103,12 +111,16 @@ query FetchPRFeedback($owner: String!, $repo: String!, $pr: Int!) {
   review_threads: $unresolved,
   pr_comments: [$pr.comments.nodes[]
     | select(.author.login != $pr.author.login)
-    | select(.author.login as $l | $review_bot_logins | index($l) | not)
+    | select(
+        .author.login as $l | $ci_bot_logins | index($l) | not
+      )
     | select(.body | test("^\\s*$") | not)],
   review_bodies: [$pr.reviews.nodes[]
     | select(.body != null and .body != "")
     | select(.author.login != $pr.author.login)
-    | select(.author.login as $l | $review_bot_logins | index($l) | not)],
+    | select(
+        .author.login as $l | $ci_bot_logins | index($l) | not
+      )],
   cross_invocation: {
     signal: (($resolved | length) > 0 and ($unresolved | length) > 0),
     resolved_threads: $resolved
diff --git a/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-thread-for-comment b/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-thread-for-comment
index 56c4564..7f45f3e 100755
--- a/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-thread-for-comment
+++ b/plugins/compound-engineering/skills/ce-resolve-pr-feedback/scripts/get-thread-for-comment
@@ -36,8 +36,12 @@ query($owner: String!, $repo: String!, $pr: Int!) {
         nodes {
           id
           isResolved
+          isOutdated
           path
           line
+          originalLine
+          startLine
+          originalStartLine
           comments(first: 100) {
             nodes {
               id