chore(dx): add error-triage Claude skill for #console-alerts (#3002)
Adds a skill that scans #console-alerts, investigates Sentry and Grafana errors in depth, deduplicates against Linear, and proposes well-structured bug issues. Includes dedup logic to avoid flooding Slack threads.
📝 Walkthrough
Adds a new "error-triage" Claude skill: documentation defining a Slack-driven alert triage workflow (handles empty Grafana messages via Loki/Prometheus queries, Sentry investigations, grouping/prioritization, Linear dedupe/issue drafting) plus three eval cases validating expected behaviors.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
.claude/skills/error-triage/SKILL.md (1)
62-62: Optional wording polish for repeated sentence starts. Line 62 has three consecutive sentences beginning with “If…”. Consider slight rewording for readability; no behavior impact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/skills/error-triage/SKILL.md:
- Around line 16-19: Update the four fenced code blocks that lack language tags
by adding an appropriate language identifier (e.g., text or bash) to each
opening backtick fence; specifically, add a language tag to the blocks beginning
with "Use ToolSearch to find: slack_read_channel", "Use ToolSearch to find
relevant Grafana tools:", "Use ToolSearch to find Sentry tools:", and the block
starting "✅ Investigated via Sentry:" so markdownlint MD040 stops failing—open
each triple-backtick fence for those blocks and append "text" (or a more
specific tag like "bash" if commands are present).
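A quick way to locate the affected fences before editing is sketched below. This is an illustrative checker rather than markdownlint's own implementation: the function name `untagged_fences` and the sample text are hypothetical, and the fence handling is a simplified reading of CommonMark (opening fence of three or more backticks or tildes, closing fence of the same character with no info string).

```python
import re

# Opening or closing fence: up to 3 spaces of indent, 3+ backticks or tildes, optional info string.
FENCE = re.compile(r"^(\s{0,3})(`{3,}|~{3,})(.*)$")


def untagged_fences(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences that carry no language tag."""
    missing: list[int] = []
    open_fence = None  # marker of the block we are currently inside, if any
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(2), match.group(3).strip()
        if open_fence is None:
            # This line opens a block; flag it when the info string is empty.
            open_fence = marker
            if not info:
                missing.append(lineno)
        elif marker[0] == open_fence[0] and len(marker) >= len(open_fence) and not info:
            # Same fence character, at least as long, no info string: closes the block.
            open_fence = None
    return missing


if __name__ == "__main__":
    fence = "`" * 3  # built at runtime so this example does not contain a literal fence
    sample = "\n".join([
        "Intro",
        fence,                      # untagged opening fence (line 2)
        "Use ToolSearch to find: slack_read_channel",
        fence,
        fence + "text",             # tagged opening fence (line 5)
        "already tagged",
        fence,
    ])
    print(untagged_fences(sample))
```

Each reported line number is an opening fence where a tag such as `text` could be appended.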
---
Nitpick comments:
In @.claude/skills/error-triage/SKILL.md:
- Line 62: Line 62 repeats three sentences that all start with "If…", hurting
flow; rewrite them to vary sentence openings while preserving meaning — e.g.,
keep the first as-is, change the second to "When an alert has eyes, note who is
investigating," and the third to "Record thread replies that include a Linear
issue link (e.g., `CON-XXX`) so you can skip duplicate checks." Update the
sentence phrasing in the paragraph containing those sentences to use this or a
similar variation to improve readability without changing behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3e3895bb-a411-4b3f-ac64-a66a50534e2b
📒 Files selected for processing (2)
.claude/skills/error-triage/SKILL.md
.claude/skills/error-triage/evals/evals.json
🧹 Nitpick comments (1)
.claude/skills/error-triage/SKILL.md (1)
62-62: Consider varying sentence structure for readability. Three consecutive sentences begin with "If", which slightly impacts flow. While this is a minor style consideration, you could rephrase for variety if desired.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.claude/skills/error-triage/SKILL.md:
- Line 62: Rewrite the three consecutive sentences that begin with "If" to
improve flow: keep the same facts but vary sentence openings in the line that
mentions reactions and thread replies (the sentence containing the example token
`CON-XXX`), e.g. turn the first into a declarative statement ("A checkmark
reaction indicates it has been handled."), change the second to start with the
reaction description rather than "If" ("An eyes reaction means someone is
investigating — note who."), and rephrase the third to lead with the action
("Record any Linear issue link (e.g., `CON-XXX`) found in a thread reply so you
can skip duplicate checks."); ensure meaning and examples remain unchanged.
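The behavior that sentence describes (checkmark means handled, eyes means someone is investigating, a Linear link in a thread reply means duplicate checks can be skipped) can be sketched as a small classifier. Everything about the data shape here is an assumption for illustration: the message dict with `reactions` and `replies` keys, the emoji names `white_check_mark` and `eyes`, and the reading of `CON-XXX` as `CON-` followed by digits. The skill itself is prose and does not define this function.

```python
import re
from dataclasses import dataclass, field

# Assumption: "CON-XXX" in the skill text stands for CON- followed by digits.
LINEAR_KEY = re.compile(r"\bCON-\d+\b")


@dataclass
class AlertState:
    handled: bool = False                                     # checkmark reaction present
    investigators: list[str] = field(default_factory=list)    # users who added eyes
    linear_issues: list[str] = field(default_factory=list)    # keys found in thread replies


def read_alert_state(message: dict) -> AlertState:
    """Derive triage state from one alert message (hypothetical dict shape)."""
    state = AlertState()
    for reaction in message.get("reactions", []):
        if reaction["name"] == "white_check_mark":
            state.handled = True
        elif reaction["name"] == "eyes":
            # Note who is investigating, as the skill asks.
            state.investigators.extend(reaction.get("users", []))
    for reply in message.get("replies", []):
        # Any Linear key in a reply lets later steps skip the duplicate check.
        state.linear_issues.extend(LINEAR_KEY.findall(reply))
    return state


if __name__ == "__main__":
    alert = {
        "reactions": [{"name": "eyes", "users": ["U123"]}],
        "replies": ["Filed as CON-412", "no link in this reply"],
    }
    print(read_alert_state(alert))
```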
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 59cc8aeb-4d8c-4945-8380-dfe67b767c73
📒 Files selected for processing (1)
.claude/skills/error-triage/SKILL.md
Add language tags to fenced code blocks (markdownlint MD040) and vary sentence structure at line 62 for readability.
🧹 Nitpick comments (2)
.claude/skills/error-triage/SKILL.md (2)
64-67: Clarify precedence between “skip addressed alerts” and “attend every alert.” These two rules can be read as conflicting. Add one explicit rule for checkmarked alerts (e.g., “record in summary only, no new thread reply unless new evidence appears”).
Also applies to: 228-233
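One way to make the suggested precedence unambiguous is to write it as a two-input decision. The sketch below is hypothetical: `plan_action` and the `Action` names are not part of the skill, and it assumes that "attending" an alert means replying in its thread.

```python
from enum import Enum


class Action(Enum):
    SUMMARY_ONLY = "summary_only"        # mention in the triage summary, no thread reply
    REPLY_IN_THREAD = "reply_in_thread"  # attend the alert in its Slack thread


def plan_action(has_checkmark: bool, new_evidence: bool) -> Action:
    """Checkmarked alerts are summary-only unless new evidence has appeared."""
    if has_checkmark and not new_evidence:
        return Action.SUMMARY_ONLY
    # Every other alert is attended, which keeps the "attend every alert" policy intact.
    return Action.REPLY_IN_THREAD


if __name__ == "__main__":
    for checkmark in (True, False):
        for evidence in (True, False):
            print(checkmark, evidence, plan_action(checkmark, evidence).value)
```

Written this way, the checkmark rule is the only exception to attending every alert, and new evidence cancels the exception.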
99-99: Avoid hard-coding a single Loki datasource UID. A fixed UID is brittle across Grafana environments and can break triage when datasources are recreated/renamed. Prefer “discover by name, then fall back to UID if confirmed.”
💡 Suggested doc tweak
-When querying Loki, use the `beenf7rks2e4gd` datasource UID. Filter by `service_name` label and use text filters like `|= '"level":"error"'` for nested JSON. The `detected_level` label is unreliable for error filtering.
+When querying Loki, first identify the correct Loki datasource for the environment (prefer lookup by name), then use its UID for queries. If the workspace is known to use `beenf7rks2e4gd`, use it as a fallback. Filter by `service_name` label and use text filters like `|= '"level":"error"'` for nested JSON. The `detected_level` label is unreliable for error filtering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.claude/skills/error-triage/SKILL.md:
- Around line 64-67: Clarify the precedence conflict between the "Skip:" rule
(the bullet list that starts with "Skip: - Deployment approval requests..."
including the checkmark rule) and the "attend every alert" policy by adding an
explicit sentence that checkmarked alerts are recorded in the summary only and
should not receive a new thread reply unless new evidence appears; update the
same wording where the "attend every alert" directive appears (referenced as
"attend every alert") to point to this exception so readers know the checkmark
rule takes precedence for already-addressed alerts.
- Line 99: The doc currently hard-codes the Loki datasource UID "beenf7rks2e4gd"
which is brittle; update the guidance and any example queries to first resolve
the Grafana datasource by its human-readable name (e.g., "Loki" or the expected
datasource name) and only fall back to using a UID when the name lookup fails
and the UID is confirmed; keep the recommendations to filter by the service_name
label and use text filters like |= '"level":"error"' for nested JSON and
explicitly note that detected_level is unreliable for error filtering so callers
should prefer name-resolution then optional UID fallback.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cc21f3a8-679e-4a78-a7bf-2df42b6335da
📒 Files selected for processing (1)
.claude/skills/error-triage/SKILL.md
Why
The team needs a repeatable way to triage #console-alerts — scanning Slack, investigating errors via Sentry and Grafana, deduplicating against Linear, and filing well-structured issues. This skill automates that workflow.
What
Adds .claude/skills/error-triage/ with a comprehensive triage workflow:
Summary by CodeRabbit