Skip to content

fix(im): fix markdown URL rendering issues in post content#206

Open
sammi-bytedance wants to merge 2 commits intolarksuite:mainfrom
sammi-bytedance:fix/im-markdown-url-rendering
Open

fix(im): fix markdown URL rendering issues in post content#206
sammi-bytedance wants to merge 2 commits intolarksuite:mainfrom
sammi-bytedance:fix/im-markdown-url-rendering

Conversation

@sammi-bytedance
Copy link
Copy Markdown
Contributor

@sammi-bytedance sammi-bytedance commented Apr 2, 2026

Summary

Bare http(s) URLs containing underscores in --markdown messages were broken by Feishu's md renderer, which misinterprets underscores as italic markers. This PR fixes the rendering by emitting bare URLs as native {"tag":"a"} post elements instead of relying on the md renderer to handle them.

Changes

  • Add buildPostElements in shortcuts/im/helpers.go to tokenize markdown text into mixed Feishu post inline elements: bare URLs become {"tag":"a"}, markdown links/images stay in {"tag":"md"} segments
  • Replace single-md-tag post serialization in wrapMarkdownAsPost and resolveMarkdownAsPost with buildPostElements
  • Add marshalStringNoEscape to serialize post content JSON without HTML-escaping &, <, >, preserving URLs containing & in query params
  • Add unit tests for buildPostElements and marshalStringNoEscape covering plain text, bare URLs with underscores/ampersands, markdown links, images, and mixed content

Test Plan

  • go test ./shortcuts/im/ passed
  • Manual verification: sent --markdown message containing bare URL with underscores (OAuth flow URL) — renders as clickable link with correct display text
  • Manual verification: URL with & in query params renders correctly
  • Manual verification: bold, italic, headings, lists, code blocks, markdown links all render normally

Related Issues

N/A

Summary by CodeRabbit

  • New Features

    • Posts now convert markdown and plain URLs into rich inline elements so links render natively and punctuation is handled correctly.
  • Bug Fixes

    • Prevents incorrect punctuation from being included in detected URLs and preserves markdown link formatting where appropriate.
  • Tests

    • Added targeted tests validating element conversion, punctuation trimming, and JSON string encoding without HTML escaping.

@github-actions github-actions bot added domain/im PR touches the im domain size/M Single-domain feat or fix with limited business impact labels Apr 2, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 2, 2026

📝 Walkthrough

Walkthrough

Converts optimized markdown into Feishu inline-elements JSON by tokenizing text (markdown links/images vs bare URLs), emits md or a elements accordingly, uses a JSON encoder with HTML escaping disabled, and adds unit tests for the new helpers and edge cases.

Changes

Cohort / File(s) Summary
Post element builder & JSON encoding
shortcuts/im/helpers.go
Added rePostToken regex, buildPostElements(text string) string, and marshalStringNoEscape(s string) string. Replaced previous single-md JSON marshaling with direct embedding of the inline-elements array into {"zh_cn":{"content":[ ... ]}} via wrapMarkdownAsPost / resolveMarkdownAsPost.
Unit tests for new helpers
shortcuts/im/helpers_test.go
Replaced broad assertion with targeted tests: added TestMarshalStringNoEscape, table-driven TestBuildPostElements (markdown vs bare URLs, punctuation trimming, multiple/mixed tokens, empty input), and narrowed TestWrapMarkdownAsPost subtests verifying md and a element behaviors.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hop through lines and split each link,

I nudge bare URLs so they brightly blink.
No HTML masks—plain text I keep,
Elements clear, no secrets to sweep.
A tiny cheer from a rabbit, neat.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main change: fixing markdown URL rendering issues in post content for the IM module, which aligns with the core problem being solved.
Description check ✅ Passed The description follows the template structure and includes all required sections: Summary explains the problem and fix clearly, Changes lists the main modifications, Test Plan documents both unit and manual verification steps, and Related Issues is addressed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 2, 2026

Greptile Summary

This PR fixes a real rendering bug in Feishu's md post renderer where underscores inside bare URLs were misinterpreted as italic markers. The fix introduces buildPostElements, which tokenizes optimized markdown into a mixed list of Feishu post inline elements — bare URLs become native {"tag":"a"} elements (bypassing the md renderer entirely) while everything else stays as {"tag":"md"} segments. A marshalStringNoEscape helper is also added to prevent Go's default HTML-escaping of &, <, > from corrupting query-parameter strings.

Key changes:

  • buildPostElements in shortcuts/im/helpers.go — regex-based tokenizer producing a flat JSON array of {"tag":"md"} / {"tag":"a"} elements; both wrapMarkdownAsPost and resolveMarkdownAsPost now delegate to it.
  • marshalStringNoEscape — wraps json.NewEncoder with SetEscapeHTML(false) to preserve literal & in URLs.
  • Comprehensive table-driven tests cover plain text, bare URLs, markdown links, trailing-punctuation trimming, and & preservation.

Open concern: buildPostElements operates on the final text returned by optimizeMarkdownStyle, which reinserts fenced code block content before returning. rePostToken therefore matches bare URLs that appear inside triple-backtick code blocks and promotes them to {"tag":"a"} elements, breaking the code block's visual rendering. The test suite has no case for this scenario. The three items flagged in the previous review thread (backtick in URL regex, preceding-character guard, silently-dropped enc.Encode error) remain open as well.

Confidence Score: 4/5

Safe to merge for the core underscore/ampersand fix, but the fenced-code-block URL extraction bug should be resolved before shipping.

One new P1 is found: fenced code blocks containing bare URLs have their URLs incorrectly lifted out and rendered as links while the surrounding backtick fences become orphaned literal text. Three prior P1/P2 items from earlier review rounds (backtick in regex, preceding-char guard, silent encode error) remain open. The PR solves its stated goal correctly for the common case but the code-block regression is a real rendering breakage for a plausible input.

shortcuts/im/helpers.go — buildPostElements needs to protect fenced code block regions from URL extraction, mirroring the guard already present in optimizeMarkdownStyle.

Important Files Changed

Filename Overview
shortcuts/im/helpers.go Adds buildPostElements to tokenize markdown into Feishu post elements; correctly fixes underscore/ampersand URL rendering but does not protect fenced code block content from URL extraction, which breaks code blocks containing URLs.
shortcuts/im/helpers_test.go Replaces the old single-assertion TestWrapMarkdownAsPost with thorough table-driven tests for buildPostElements and marshalStringNoEscape; no test case for URLs inside fenced code blocks.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["wrapMarkdownAsPost(markdown)\nor resolveMarkdownAsPost(...)"] --> B["optimizeMarkdownStyle(markdown)\n— extracts code blocks → placeholders\n— heading/table fixes\n— reinserts code blocks ← ⚠️ live URLs inside"]
    B --> C["buildPostElements(optimized)"]
    C --> D{"rePostToken.FindAllStringIndex"}
    D -->|"No matches"| E["[{'tag':'md','text':...}]"]
    D -->|"token[0]=='[' or '!'"| F["{'tag':'md','text':token}\n(markdown link / image)"]
    D -->|"bare URL"| G["TrimRight trailing punctuation\n{'tag':'a','text':url,'href':url}"]
    D -->|"text between tokens"| H["{'tag':'md','text':segment}"]
    F --> I["Join elements → JSON array"]
    G --> I
    H --> I
    E --> J["Wrap: {zh_cn:{content:[array]}}"]
    I --> J
    J --> K["Final post JSON sent to Feishu API"]
Loading

Reviews (5): Last reviewed commit: "Merge branch 'larksuite:main' into fix/i..." | Re-trigger Greptile

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/im/helpers.go`:
- Around line 667-670: The comment above the call to wrapBareURLs (the variable
r) is stale: it claims bare URLs are wrapped as <url> autolinks, but
wrapBareURLs actually wraps them as markdown links [display](href); update the
comment to accurately describe the current behavior (that bare URLs are
converted to markdown link syntax with a display text and href) and preserve the
note about skipping URLs already inside angle brackets, markdown link parens, or
quotes; reference wrapBareURLs and r so reviewers can locate and verify the
code.
- Around line 705-713: The guard that protects already-linked URL contexts only
checks for '(' and '"' so autolinks and bracketed link text can be re-wrapped;
update the check around start (where ch := text[start-1]) to also treat '<' and
'[' as early-return contexts: if ch is '<' (autolink like <https://...>) or '['
(URL used as link text like [https://...](...)) then write u into sb, set prev =
end and continue, preserving the existing behavior for '(' and '"' — reference
the variables start, text, ch, sb.WriteString(u), prev and end when making the
change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 424eb2fc-a309-485c-8e52-d6f3166c7714

📥 Commits

Reviewing files that changed from the base of the PR and between 79f43dc and d1b21d1.

📒 Files selected for processing (1)
  • shortcuts/im/helpers.go

@sammi-bytedance sammi-bytedance force-pushed the fix/im-markdown-url-rendering branch from d1b21d1 to b82737b Compare April 2, 2026 07:15
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
shortcuts/im/helpers.go (1)

637-640: URL regex may capture trailing punctuation.

The bare URL pattern https?://[^\s<>()\[\]"]+ will match characters like ., ,, !, ?, ; at the end of a URL. In natural text like "Visit https://example.com, then click", the comma would be captured as part of the URL.

This is a common edge case in URL detection. Consider stripping common trailing punctuation after matching:

💡 Optional improvement to trim trailing punctuation
 func buildPostElements(text string) string {
 	matches := rePostToken.FindAllStringIndex(text, -1)
 	if len(matches) == 0 {
 		return `[{"tag":"md","text":` + marshalStringNoEscape(text) + `}]`
 	}
 	var elems []string
 	prev := 0
 	for _, loc := range matches {
 		start, end := loc[0], loc[1]
 		if start > prev {
 			elems = append(elems, `{"tag":"md","text":`+marshalStringNoEscape(text[prev:start])+`}`)
 		}
 		token := text[start:end]
 		if token[0] == '[' || token[0] == '!' {
 			// Markdown link [text](url) or image ![alt](url) — keep in md segment
 			elems = append(elems, `{"tag":"md","text":`+marshalStringNoEscape(token)+`}`)
 		} else {
-			// Bare URL — use native a tag so Feishu renders it without md parsing
-			elems = append(elems, `{"tag":"a","text":`+marshalStringNoEscape(token)+`,"href":`+marshalStringNoEscape(token)+`}`)
+			// Bare URL — use native a tag so Feishu renders it without md parsing
+			// Trim trailing punctuation that's likely sentence-ending, not part of URL
+			trimmed := strings.TrimRight(token, ".,;:!?")
+			if trimmed == "" {
+				trimmed = token
+			}
+			suffix := token[len(trimmed):]
+			elems = append(elems, `{"tag":"a","text":`+marshalStringNoEscape(trimmed)+`,"href":`+marshalStringNoEscape(trimmed)+`}`)
+			if suffix != "" {
+				elems = append(elems, `{"tag":"md","text":`+marshalStringNoEscape(suffix)+`}`)
+			}
 		}
 		prev = end
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/im/helpers.go` around lines 637 - 640, The URL regex in rePostToken
(`regexp.MustCompile(...https?://[^\s<>()\[\]"]+...)`) can capture trailing
punctuation (e.g., .,;:!?), so update the handling in the rePostToken matching
logic: either tighten the regex to disallow trailing punctuation characters
(e.g., exclude [.,;:!?)] at the end) or keep the current regex but post-process
each match from rePostToken to trim common trailing punctuation characters and
closing brackets before treating it as a URL; reference the rePostToken variable
and its use site(s) to implement the change so matched URLs no longer include
trailing punctuation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@shortcuts/im/helpers.go`:
- Around line 637-640: The URL regex in rePostToken
(`regexp.MustCompile(...https?://[^\s<>()\[\]"]+...)`) can capture trailing
punctuation (e.g., .,;:!?), so update the handling in the rePostToken matching
logic: either tighten the regex to disallow trailing punctuation characters
(e.g., exclude [.,;:!?)] at the end) or keep the current regex but post-process
each match from rePostToken to trim common trailing punctuation characters and
closing brackets before treating it as a URL; reference the rePostToken variable
and its use site(s) to implement the change so matched URLs no longer include
trailing punctuation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 533fb65c-2e6d-49bd-8f80-1db6f6565e71

📥 Commits

Reviewing files that changed from the base of the PR and between d1b21d1 and b82737b.

📒 Files selected for processing (2)
  • shortcuts/im/helpers.go
  • shortcuts/im/helpers_test.go

@sammi-bytedance sammi-bytedance force-pushed the fix/im-markdown-url-rendering branch from c3c17de to 03b7884 Compare April 2, 2026 07:23
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
shortcuts/im/helpers.go (1)

637-641: Regex excludes parentheses from bare URLs, which may truncate Wikipedia-style URLs.

The URL pattern [^\s<>()\[\]"]+ will not match URLs containing parentheses in the path, such as https://en.wikipedia.org/wiki/Foo_(bar) — this would be truncated at the first (. This is a known edge case that's difficult to handle without balanced-parenthesis parsing. If this is an acceptable limitation, consider adding a brief comment noting it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/im/helpers.go` around lines 637 - 641, The regexp assigned to
rePostToken (in the shortcuts/im/helpers.go diff) intentionally excludes
parentheses in the bare-URL character class which causes URLs with parentheses
(e.g., Wikipedia "Foo_(bar)") to be truncated; update the comment above
rePostToken to explicitly call out this limitation and rationale
(balanced-parentheses parsing is required to fully support such URLs) so future
maintainers know this is a known tradeoff rather than a bug, and optionally
mention that fixing it would need a more complex parser instead of a simple
regexp.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@shortcuts/im/helpers.go`:
- Around line 637-641: The regexp assigned to rePostToken (in the
shortcuts/im/helpers.go diff) intentionally excludes parentheses in the bare-URL
character class which causes URLs with parentheses (e.g., Wikipedia "Foo_(bar)")
to be truncated; update the comment above rePostToken to explicitly call out
this limitation and rationale (balanced-parentheses parsing is required to fully
support such URLs) so future maintainers know this is a known tradeoff rather
than a bug, and optionally mention that fixing it would need a more complex
parser instead of a simple regexp.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f22bd87a-774f-4da1-a1f9-7ac06d56c800

📥 Commits

Reviewing files that changed from the base of the PR and between b82737b and 03b7884.

📒 Files selected for processing (2)
  • shortcuts/im/helpers.go
  • shortcuts/im/helpers_test.go

Change-Id: I4c3157ef4ffe2152070b1390a8357d7f73e05176
@sammi-bytedance sammi-bytedance force-pushed the fix/im-markdown-url-rendering branch from e240327 to 45bae36 Compare April 2, 2026 07:32
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
shortcuts/im/helpers_test.go (1)

354-449: Comprehensive test coverage with one minor gap.

The table-driven tests thoroughly cover the key scenarios including bare URLs, markdown links, mixed content, and punctuation handling.

Consider adding a test case for markdown images (![alt](url)) to ensure they are also kept in md segments rather than being promoted to a tags:

💡 Suggested test case
 		{
 			name:     "markdown link kept in md segment",
 			input:    "[click here](https://example.com/path_with_underscore)",
 			wantSubs: []string{`"tag":"md"`, `[click here](https://example.com/path_with_underscore)`},
 		},
+		{
+			name:      "markdown image kept in md segment",
+			input:     "![alt text](https://example.com/image.png)",
+			wantSubs:  []string{`"tag":"md"`, `![alt text](https://example.com/image.png)`},
+			wantNsubs: []string{`"tag":"a"`},
+		},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/im/helpers_test.go` around lines 354 - 449, Add a table-driven test
case to TestBuildPostElements that verifies markdown images (e.g.,
"![alt](https://example.com/img.png)") remain in the "md" segment and are not
promoted to an "a" tag: update the tests slice in TestBuildPostElements to
include a case with input containing an image markdown and set wantSubs to
include `"tag":"md"` and the raw image markdown
`![alt](https://example.com/img.png)` and wantNsubs to include `"tag":"a"` (or
any anchor-specific substring); this ensures buildPostElements handles image
markdown the same way it treats link markdown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@shortcuts/im/helpers_test.go`:
- Around line 354-449: Add a table-driven test case to TestBuildPostElements
that verifies markdown images (e.g., "![alt](https://example.com/img.png)")
remain in the "md" segment and are not promoted to an "a" tag: update the tests
slice in TestBuildPostElements to include a case with input containing an image
markdown and set wantSubs to include `"tag":"md"` and the raw image markdown
`![alt](https://example.com/img.png)` and wantNsubs to include `"tag":"a"` (or
any anchor-specific substring); this ensures buildPostElements handles image
markdown the same way it treats link markdown.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1884455c-8727-454a-9500-56e1f9cebf1e

📥 Commits

Reviewing files that changed from the base of the PR and between 03b7884 and 45bae36.

📒 Files selected for processing (2)
  • shortcuts/im/helpers.go
  • shortcuts/im/helpers_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain/im PR touches the im domain size/M Single-domain feat or fix with limited business impact

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant