fix(im): fix markdown URL rendering issues in post content#206
fix(im): fix markdown URL rendering issues in post content#206sammi-bytedance wants to merge 2 commits intolarksuite:mainfrom
Conversation
📝 WalkthroughWalkthroughConverts optimized markdown into Feishu inline-elements JSON by tokenizing text (markdown links/images vs bare URLs), emits Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Greptile SummaryThis PR fixes a real rendering bug in Feishu's Key changes:
Open concern: Confidence Score: 4/5Safe to merge for the core underscore/ampersand fix, but the fenced-code-block URL extraction bug should be resolved before shipping. One new P1 is found: fenced code blocks containing bare URLs have their URLs incorrectly lifted out and rendered as links while the surrounding backtick fences become orphaned literal text. Three prior P1/P2 items from earlier review rounds (backtick in regex, preceding-char guard, silent encode error) remain open. The PR solves its stated goal correctly for the common case but the code-block regression is a real rendering breakage for a plausible input. shortcuts/im/helpers.go — buildPostElements needs to protect fenced code block regions from URL extraction, mirroring the guard already present in optimizeMarkdownStyle. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["wrapMarkdownAsPost(markdown)\nor resolveMarkdownAsPost(...)"] --> B["optimizeMarkdownStyle(markdown)\n— extracts code blocks → placeholders\n— heading/table fixes\n— reinserts code blocks ← ⚠️ live URLs inside"]
B --> C["buildPostElements(optimized)"]
C --> D{"rePostToken.FindAllStringIndex"}
D -->|"No matches"| E["[{'tag':'md','text':...}]"]
D -->|"token[0]=='[' or '!'"| F["{'tag':'md','text':token}\n(markdown link / image)"]
D -->|"bare URL"| G["TrimRight trailing punctuation\n{'tag':'a','text':url,'href':url}"]
D -->|"text between tokens"| H["{'tag':'md','text':segment}"]
F --> I["Join elements → JSON array"]
G --> I
H --> I
E --> J["Wrap: {zh_cn:{content:[array]}}"]
I --> J
J --> K["Final post JSON sent to Feishu API"]
Reviews (5): Last reviewed commit: "Merge branch 'larksuite:main' into fix/i..." | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/im/helpers.go`:
- Around line 667-670: The comment above the call to wrapBareURLs (the variable
r) is stale: it claims bare URLs are wrapped as <url> autolinks, but
wrapBareURLs actually wraps them as markdown links [display](href); update the
comment to accurately describe the current behavior (that bare URLs are
converted to markdown link syntax with a display text and href) and preserve the
note about skipping URLs already inside angle brackets, markdown link parens, or
quotes; reference wrapBareURLs and r so reviewers can locate and verify the
code.
- Around line 705-713: The guard that protects already-linked URL contexts only
checks for '(' and '"' so autolinks and bracketed link text can be re-wrapped;
update the check around start (where ch := text[start-1]) to also treat '<' and
'[' as early-return contexts: if ch is '<' (autolink like <https://...>) or '['
(URL used as link text like [https://...](...)) then write u into sb, set prev =
end and continue, preserving the existing behavior for '(' and '"' — reference
the variables start, text, ch, sb.WriteString(u), prev and end when making the
change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
d1b21d1 to
b82737b
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
shortcuts/im/helpers.go (1)
637-640: URL regex may capture trailing punctuation.The bare URL pattern
https?://[^\s<>()\[\]"]+will match characters like.,,,!,?,;at the end of a URL. In natural text like"Visit https://example.com, then click", the comma would be captured as part of the URL.This is a common edge case in URL detection. Consider stripping common trailing punctuation after matching:
💡 Optional improvement to trim trailing punctuation
func buildPostElements(text string) string { matches := rePostToken.FindAllStringIndex(text, -1) if len(matches) == 0 { return `[{"tag":"md","text":` + marshalStringNoEscape(text) + `}]` } var elems []string prev := 0 for _, loc := range matches { start, end := loc[0], loc[1] if start > prev { elems = append(elems, `{"tag":"md","text":`+marshalStringNoEscape(text[prev:start])+`}`) } token := text[start:end] if token[0] == '[' || token[0] == '!' { // Markdown link [text](url) or image  — keep in md segment elems = append(elems, `{"tag":"md","text":`+marshalStringNoEscape(token)+`}`) } else { - // Bare URL — use native a tag so Feishu renders it without md parsing - elems = append(elems, `{"tag":"a","text":`+marshalStringNoEscape(token)+`,"href":`+marshalStringNoEscape(token)+`}`) + // Bare URL — use native a tag so Feishu renders it without md parsing + // Trim trailing punctuation that's likely sentence-ending, not part of URL + trimmed := strings.TrimRight(token, ".,;:!?") + if trimmed == "" { + trimmed = token + } + suffix := token[len(trimmed):] + elems = append(elems, `{"tag":"a","text":`+marshalStringNoEscape(trimmed)+`,"href":`+marshalStringNoEscape(trimmed)+`}`) + if suffix != "" { + elems = append(elems, `{"tag":"md","text":`+marshalStringNoEscape(suffix)+`}`) + } } prev = end }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/helpers.go` around lines 637 - 640, The URL regex in rePostToken (`regexp.MustCompile(...https?://[^\s<>()\[\]"]+...)`) can capture trailing punctuation (e.g., .,;:!?), so update the handling in the rePostToken matching logic: either tighten the regex to disallow trailing punctuation characters (e.g., exclude [.,;:!?)] at the end) or keep the current regex but post-process each match from rePostToken to trim common trailing punctuation characters and closing brackets before treating it as a URL; reference the rePostToken variable and its use site(s) to implement the change so matched URLs no longer include trailing punctuation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@shortcuts/im/helpers.go`:
- Around line 637-640: The URL regex in rePostToken
(`regexp.MustCompile(...https?://[^\s<>()\[\]"]+...)`) can capture trailing
punctuation (e.g., .,;:!?), so update the handling in the rePostToken matching
logic: either tighten the regex to disallow trailing punctuation characters
(e.g., exclude [.,;:!?)] at the end) or keep the current regex but post-process
each match from rePostToken to trim common trailing punctuation characters and
closing brackets before treating it as a URL; reference the rePostToken variable
and its use site(s) to implement the change so matched URLs no longer include
trailing punctuation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 533fb65c-2e6d-49bd-8f80-1db6f6565e71
📒 Files selected for processing (2)
shortcuts/im/helpers.goshortcuts/im/helpers_test.go
c3c17de to
03b7884
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
shortcuts/im/helpers.go (1)
637-641: Regex excludes parentheses from bare URLs, which may truncate Wikipedia-style URLs.The URL pattern
[^\s<>()\[\]"]+will not match URLs containing parentheses in the path, such ashttps://en.wikipedia.org/wiki/Foo_(bar)— this would be truncated at the first(. This is a known edge case that's difficult to handle without balanced-parenthesis parsing. If this is an acceptable limitation, consider adding a brief comment noting it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/helpers.go` around lines 637 - 641, The regexp assigned to rePostToken (in the shortcuts/im/helpers.go diff) intentionally excludes parentheses in the bare-URL character class which causes URLs with parentheses (e.g., Wikipedia "Foo_(bar)") to be truncated; update the comment above rePostToken to explicitly call out this limitation and rationale (balanced-parentheses parsing is required to fully support such URLs) so future maintainers know this is a known tradeoff rather than a bug, and optionally mention that fixing it would need a more complex parser instead of a simple regexp.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@shortcuts/im/helpers.go`:
- Around line 637-641: The regexp assigned to rePostToken (in the
shortcuts/im/helpers.go diff) intentionally excludes parentheses in the bare-URL
character class which causes URLs with parentheses (e.g., Wikipedia "Foo_(bar)")
to be truncated; update the comment above rePostToken to explicitly call out
this limitation and rationale (balanced-parentheses parsing is required to fully
support such URLs) so future maintainers know this is a known tradeoff rather
than a bug, and optionally mention that fixing it would need a more complex
parser instead of a simple regexp.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f22bd87a-774f-4da1-a1f9-7ac06d56c800
📒 Files selected for processing (2)
shortcuts/im/helpers.goshortcuts/im/helpers_test.go
Change-Id: I4c3157ef4ffe2152070b1390a8357d7f73e05176
e240327 to
45bae36
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
shortcuts/im/helpers_test.go (1)
354-449: Comprehensive test coverage with one minor gap.The table-driven tests thoroughly cover the key scenarios including bare URLs, markdown links, mixed content, and punctuation handling.
Consider adding a test case for markdown images (
) to ensure they are also kept inmdsegments rather than being promoted toatags:💡 Suggested test case
{ name: "markdown link kept in md segment", input: "[click here](https://example.com/path_with_underscore)", wantSubs: []string{`"tag":"md"`, `[click here](https://example.com/path_with_underscore)`}, }, + { + name: "markdown image kept in md segment", + input: "", + wantSubs: []string{`"tag":"md"`, ``}, + wantNsubs: []string{`"tag":"a"`}, + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/im/helpers_test.go` around lines 354 - 449, Add a table-driven test case to TestBuildPostElements that verifies markdown images (e.g., "") remain in the "md" segment and are not promoted to an "a" tag: update the tests slice in TestBuildPostElements to include a case with input containing an image markdown and set wantSubs to include `"tag":"md"` and the raw image markdown `` and wantNsubs to include `"tag":"a"` (or any anchor-specific substring); this ensures buildPostElements handles image markdown the same way it treats link markdown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@shortcuts/im/helpers_test.go`:
- Around line 354-449: Add a table-driven test case to TestBuildPostElements
that verifies markdown images (e.g., "")
remain in the "md" segment and are not promoted to an "a" tag: update the tests
slice in TestBuildPostElements to include a case with input containing an image
markdown and set wantSubs to include `"tag":"md"` and the raw image markdown
`` and wantNsubs to include `"tag":"a"` (or
any anchor-specific substring); this ensures buildPostElements handles image
markdown the same way it treats link markdown.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1884455c-8727-454a-9500-56e1f9cebf1e
📒 Files selected for processing (2)
shortcuts/im/helpers.goshortcuts/im/helpers_test.go
Summary
Bare
http(s)URLs containing underscores in--markdownmessages were broken by Feishu'smdrenderer, which misinterprets underscores as italic markers. This PR fixes the rendering by emitting bare URLs as native{"tag":"a"}post elements instead of relying on themdrenderer to handle them.Changes
buildPostElementsinshortcuts/im/helpers.goto tokenize markdown text into mixed Feishu post inline elements: bare URLs become{"tag":"a"}, markdown links/images stay in{"tag":"md"}segmentsmd-tag post serialization inwrapMarkdownAsPostandresolveMarkdownAsPostwithbuildPostElementsmarshalStringNoEscapeto serialize post content JSON without HTML-escaping&,<,>, preserving URLs containing&in query paramsbuildPostElementsandmarshalStringNoEscapecovering plain text, bare URLs with underscores/ampersands, markdown links, images, and mixed contentTest Plan
go test ./shortcuts/im/passed--markdownmessage containing bare URL with underscores (OAuth flow URL) — renders as clickable link with correct display text&in query params renders correctlyRelated Issues
N/A
Summary by CodeRabbit
New Features
Bug Fixes
Tests