Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion docs/ROADMAP.md
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,8 @@ Create labels once: `priority: high`, `priority: medium`, `priority: low`, `area

- **Natural language rules (#12):** `review_rules_prose: [ "Rule one", "Rule two" ]` in config; injected as "Custom rules (natural language)" bullets into review guidance. Tests: `test_config_deserialize_review_rules_prose_from_yaml`, `build_review_guidance_includes_prose_rules`.
- **Triage skip deletion-only (#29):** `triage_skip_deletion_only: true` in config; when true, deletion-only diffs get `SkipDeletionOnly` and skip expensive review. Default false. Tests: `test_triage_deletion_only_with_skip_true_returns_skip_deletion_only`, config deserialize.
- **LLM parsing (#28):** Repair candidate for diff-style line prefixes (`+` on each line) in `repair_json_candidates`; test `parse_json_with_diff_prefix_artifact`.
- **Dynamic context (#25):** `find_enclosing_boundary_line` in `function_chunker.rs`; `context.rs` expands hunk start to enclosing function/class boundary; asymmetric context (5 before, 1 after).
- **LLM parsing (#28):** Repair candidates in `repair_json_candidates`: diff-style line prefixes (`+`), single-quoted keys/values → double-quoted via `convert_single_quoted_json_to_double`; raw bracket span fallback when valid JSON not found. Tests: `parse_json_with_diff_prefix_artifact`, `parse_json_with_single_quotes`.
- **Secrets (#20):** Built-in secret scanner in `plugins/builtin/secret_scanner.rs`.
- **Verification (#23):** Verification pass and config (verification.*) in pipeline.

Expand Down
106 changes: 106 additions & 0 deletions src/parsing/llm_response.rs
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,10 @@ fn parse_json_format(content: &str, file_path: &Path) -> Vec<core::comment::RawC
.or_else(|| find_json_array(content))
.or_else(|| find_json_object(content));

let json_str = json_str
.or_else(|| find_balanced_bracket_span(content, '[', ']'))
.or_else(|| find_balanced_bracket_span(content, '{', '}'));

if let Some(json_str) = json_str {
for candidate in repair_json_candidates(&json_str) {
let Ok(value) = serde_json::from_str::<serde_json::Value>(&candidate) else {
Expand Down Expand Up @@ -305,6 +309,26 @@ fn find_json_object(content: &str) -> Option<String> {
find_balanced_json(content, '{', '}')
}

/// Find the first balanced span for open/close (e.g. [ and ]) without validating JSON.
/// Used when valid JSON isn't found so we can run repair (e.g. single-quote conversion) and retry.
fn find_balanced_bracket_span(content: &str, open: char, close: char) -> Option<String> {
for (start, _) in content.char_indices().filter(|&(_, ch)| ch == open) {
let mut depth = 0i32;
for (offset, ch) in content[start..].char_indices() {
if ch == open {
depth += 1;
} else if ch == close {
depth -= 1;
if depth == 0 {
let end = start + offset;
return Some(content[start..=end].to_string());
}
}
}
}
None
}

fn find_balanced_json(content: &str, open: char, close: char) -> Option<String> {
for (start, _) in content.char_indices().filter(|&(_, ch)| ch == open) {
let mut depth = 0i32;
Expand Down Expand Up @@ -368,9 +392,80 @@ fn repair_json_candidates(candidate: &str) -> Vec<String> {
candidates.push(without_diff_prefix.to_string());
}

// When LLM outputs single-quoted keys/values (e.g. {'line': 9}), convert to valid JSON (issue #28).
let with_double_quotes = convert_single_quoted_json_to_double(trimmed);
if with_double_quotes != trimmed
&& (with_double_quotes.starts_with('[') || with_double_quotes.starts_with('{'))
{
candidates.push(with_double_quotes);
}

candidates
}

/// Convert single-quoted JSON-like strings to double-quoted so serde_json can parse.
/// Only converts single-quoted regions that are outside any double-quoted string.
fn convert_single_quoted_json_to_double(s: &str) -> String {
let mut out = String::with_capacity(s.len());
let mut chars = s.chars().peekable();
let mut in_double = false;
let mut escape_next = false;

while let Some(c) = chars.next() {
if escape_next {
escape_next = false;
out.push(c);
continue;
}
if in_double {
if c == '\\' {
escape_next = true;
out.push(c);
} else if c == '"' {
in_double = false;
out.push(c);
} else {
out.push(c);
}
continue;
}
if c == '"' {
in_double = true;
out.push(c);
continue;
}
if c == '\'' {
// Start of single-quoted string: emit " and copy until unescaped ', escaping " and \.
out.push('"');
for c in chars.by_ref() {
if c == '\\' {
escape_next = true;
out.push(c);
} else if c == '\'' {
if escape_next {
escape_next = false;
out.push('\'');
} else {
out.push('"');
break;
}
} else if c == '"' {
out.push('\\');
out.push('"');
} else {
out.push(c);
}
Comment on lines +455 to +457
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The convert_single_quoted_json_to_double function incorrectly handles backslash escape sequences in single-quoted strings, causing it to produce invalid JSON for inputs containing escapes like \n.
Severity: HIGH

Suggested Fix

In the inner loop of convert_single_quoted_json_to_double that processes single-quoted strings, ensure the escape_next flag is reset to false after processing the character that follows a backslash. This should be done in the else block and after handling an escaped single quote if the logic is to continue processing.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: src/parsing/llm_response.rs#L455-L457

Potential issue: The function `convert_single_quoted_json_to_double` incorrectly
processes backslash escape sequences within single-quoted strings. When a backslash is
encountered, an `escape_next` flag is set to true. However, this flag is not reset if
the following character is anything other than a single quote. This causes the flag to
persist across subsequent characters in the string. As a result, when the actual closing
single quote is found, it is incorrectly treated as an escaped character, leading to a
malformed output string that is not valid JSON (e.g., it may be missing a closing
double-quote). This will cause JSON parsing to fail for any LLM response containing
single-quoted strings with common escape sequences like `\n` or `\t`.

Did we get this right? 👍 / 👎 to inform future reviews.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inner loop never resets escape_next for non-quote chars

High Severity

In convert_single_quoted_json_to_double, the inner for loop processing single-quoted string contents sets escape_next = true when it encounters \, but only checks/resets escape_next inside the c == '\'' branch. For any non-quote character following a backslash (e.g. \n, \t, \\), escape_next remains true indefinitely. This causes the actual closing ' to be misidentified as an escaped quote, producing corrupted output or an unclosed string. The outer loop and find_balanced_json both correctly handle this by checking escape_next at the top of the loop — the inner loop needs the same pattern.

Fix in Cursor Fix in Web

}
if escape_next {
escape_next = false;
}
continue;
}
out.push(c);
}
out
}

fn extract_structured_items(value: serde_json::Value) -> Vec<serde_json::Value> {
if let Some(items) = value.as_array() {
return items.clone();
Expand Down Expand Up @@ -1243,6 +1338,17 @@ let data = &input;
assert!(comments[0].content.contains("Missing check"));
}

#[test]
fn parse_json_with_single_quotes() {
// LLM sometimes outputs JSON with single-quoted keys/values; repair converts to double quotes (issue #28).
let input = r#"[{'line': 9, 'issue': 'Use of deprecated API'}]"#;
let file_path = PathBuf::from("src/lib.rs");
let comments = parse_llm_response(input, &file_path).unwrap();
assert_eq!(comments.len(), 1);
assert_eq!(comments[0].line_number, 9);
assert!(comments[0].content.contains("deprecated"));
}

// ── Bug: find_json_array uses mismatched brackets ──────────────────
//
// `find_json_array` uses `find('[')` (first) + `rfind(']')` (last).
Expand Down
Loading