feat(docs): normalize <lark-table> to GFM pipe tables in docs +fetch by liuhedev · Pull Request #260 · larksuite/cli

liuhedev · 2026-04-03T15:24:52Z

Summary

docs +fetch outputs markdown containing custom <lark-table> tags that are not renderable in any standard markdown viewer (Obsidian, GitHub, VS Code, etc.). This PR adds automatic conversion to GFM pipe tables.

Changes

shortcuts/doc/lark_table.go: New NormalizeLarkTables() function that converts <lark-table>/<lark-tr>/<lark-td> blocks to standard GFM pipe tables
shortcuts/doc/docs_fetch.go: Calls NormalizeLarkTables() on the markdown output during pretty-print
shortcuts/doc/lark_table_test.go: Unit tests covering basic tables, no-header mode, code-block fallback, and pipe escaping

Behavior

Before	After
`<lark-table><lark-tr><lark-td>Name</lark-td>...`	`\| Name \| Age \|`

Edge cases handled

header-row="false": Skips separator row
Fenced code blocks in cells: Falls back to original <lark-table> (GFM pipe tables cannot contain multi-line code)
Pipe characters: Escaped as \|
Nested <grid>/<column>/<text> tags: Flattened and stripped

Motivation

When using docs +fetch to export Lark documents for local use (e.g., in Obsidian vaults), tables are the most common broken element. This change makes the output immediately usable without post-processing.

Related issue: #259

Summary by CodeRabbit

New Features
- Documentation now converts custom Lark-table markup into GitHub-flavored Markdown tables with proper headers (optional), column alignment, cell normalization (nested content flattened,
  normalized, pipe characters escaped), and preserves original markup if fenced code blocks are present.
Tests
- Added comprehensive unit tests covering header handling, code-block preservation, special-character escaping, uneven columns, and nested-content flattening.

…output When docs +fetch returns markdown containing <lark-table> custom tags, they are not renderable in standard markdown viewers. This change adds automatic conversion of <lark-table> blocks to GFM (GitHub Flavored Markdown) pipe tables during the pretty-print output phase. Features: - Converts <lark-table>/<lark-tr>/<lark-td> to standard | pipe | tables - Handles header-row attribute (defaults to true) - Escapes pipe characters in cell content - Falls back to original <lark-table> when cells contain fenced code blocks - Strips nested <grid>/<column>/<text> tags during conversion Includes unit tests for basic table, no-header, code-block fallback, and pipe escaping scenarios.

coderabbitai · 2026-04-03T15:25:41Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 06a93abf-bbb3-43a7-9442-c7a923961d33

📥 Commits

Reviewing files that changed from the base of the PR and between 139f466 and f370afc.

📒 Files selected for processing (2)

shortcuts/doc/lark_table.go
shortcuts/doc/lark_table_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

shortcuts/doc/lark_table_test.go

📝 Walkthrough

Walkthrough

Adds a new doc.NormalizeLarkTables(md string) string to convert <lark-table> blocks into GitHub-flavored Markdown tables and calls it from the markdown post-processing step in shortcuts/doc/docs_fetch.go before writing output. No other control flow changes.

Changes

Cohort / File(s)	Summary
Lark Table Normalization Logic `shortcuts/doc/lark_table.go`	New exported `NormalizeLarkTables()` that finds `<lark-table>...</lark-table>` blocks, parses rows/cells, flattens `<grid>/<column>`, strips `<text>` wrappers, converts `<br/>` to newlines, escapes `
Lark Table Tests `shortcuts/doc/lark_table_test.go`	New unit tests covering basic conversion, `header-row="false"` behavior, fenced-code preservation, pipe escaping, column padding, grid/column flattening, `<text>` stripping, `<br/>` normalization, and no-op when no `<lark-table>` present.
Document Processing Integration `shortcuts/doc/docs_fetch.go`	Minor integration: post-processes `result["markdown"]` with `NormalizeLarkTables()` (assigns `md = NormalizeLarkTables(md)`) before writing output; no other logic changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibbled at tags of Lark’s little net,
Turned grids and columns to pipes with a fret,
Br and text faded, rows aligned with cheer,
No code-blocks harmed — I left those clear,
Hop! Markdown’s tidy now, ready to vet.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding automatic normalization of custom lark-table markup to GitHub Flavored Markdown pipe tables in the docs fetch command.
Description check	✅ Passed	The description covers all template sections with comprehensive detail: Summary explains the problem, Changes lists all modified files with specifics, Test Plan confirms unit tests are included, and Related Issues links issue `#259`.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-04-03T15:27:28Z

Greptile Summary

This PR adds NormalizeLarkTables() to convert <lark-table> markup to GFM pipe tables during docs +fetch output, with fallback to the original markup when cells contain fenced code blocks.

P1 — headerAttrRE matched against full block content: headerAttrRE.MatchString(block) on line 34 of lark_table.go scans the entire <lark-table>…</lark-table> block, so a cell whose text contains the literal string header-row=\"false\" would incorrectly suppress the header-separator row. The check should be restricted to the opening tag only.

Confidence Score: 4/5

Safe to merge after fixing the header-attribute detection scope bug.

One P1 logic bug: headerAttrRE is checked against the full table block (including all cell content) rather than just the opening tag, meaning a cell containing the literal text header-row="false" would incorrectly suppress the separator row. All other findings from prior review rounds have been addressed or are P2.

shortcuts/doc/lark_table.go line 34 — header attribute detection scope.

Important Files Changed

Filename	Overview
shortcuts/doc/lark_table.go	New file implementing Lark-to-GFM table conversion; one P1 logic bug where `headerAttrRE` is matched against the full table block (including cell content) instead of only the opening tag, which can incorrectly suppress the header-separator row.
shortcuts/doc/docs_fetch.go	One-line integration of `NormalizeLarkTables` into the pretty-print path; change is minimal and correct.
shortcuts/doc/lark_table_test.go	Good coverage of the main scenarios (basic, no-header, code-block fallback, pipe escaping, uneven columns, grid flattening, text tags, br normalization); missing a test for the header-attribute-in-cell-content edge case.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["docs +fetch executes"] --> B["CallMCPTool fetch-doc"]
    B --> C["result markdown string"]
    C --> D["NormalizeLarkTables(md)"]
    D --> E{"larkTableRE\nfind match?"}
    E -- No --> F["return md unchanged"]
    E -- Yes --> G["Extract inner content\nCheck header-row attr on block"]
    G --> H{"Any cell\ncontains ```?"}
    H -- Yes --> I["return original\n<lark-table> block"]
    H -- No --> J["cleanCell each td\n(flattenGrid, stripTextTags,\nstrip lark tags, br→\n)"]
    J --> K["Pad rows to maxCols"]
    K --> L{"headerRow == true?"}
    L -- Yes --> M["Emit header row\n+ separator ---"]
    L -- No --> N["Skip separator"]
    M --> O["Emit data rows"]
    N --> O
    O --> P["Join lines with \n"]
    P --> Q["fmt.Fprintln output"]

_{Reviews (2): Last reviewed commit: "chore(docs): address review feedback for..." | Re-trigger Greptile}

shortcuts/doc/lark_table.go

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

shortcuts/doc/lark_table.go (1)
64-71: Fenced code block detection occurs after cleanCell processes content.

The check for fenced code blocks (strings.Contains(cell, "```")) happens after cells have been processed by cleanCell. Looking at cleanCell, it only strips <lark-*>, <text>, <grid>, and <br> tags, so the triple backticks should survive. However, this sequencing is subtle—if cleanCell is ever modified to strip more content, this check could silently break.

Consider moving the fenced code check to operate on the raw cell content before cleaning, or add a comment clarifying the dependency.
💡 Alternative: check raw content before cleaning
 		for _, rm := range rowMatches {
 			cellMatches := larkTdRE.FindAllStringSubmatch(rm[1], -1)
 			var cells []string
 			for _, cm := range cellMatches {
+				// Check for fenced code blocks in raw content before cleaning
+				if strings.Contains(cm[1], "```") {
+					return block
+				}
 				cells = append(cells, cleanCell(cm[1]))
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/lark_table.go` around lines 64 - 71, The fenced-code detection
currently checks for "```" after cells are processed by cleanCell; to avoid
future brittle behavior move the fenced-code check to operate on the raw cell
content before calling cleanCell (i.e., while iterating rows and the original
cell string from rows), returning block immediately if any raw cell contains
"```", or if you prefer to keep the current order add a clear comment next to
the loop documenting the dependency that cleanCell must not strip backticks and
that the fenced-code check must run on raw input; reference the rows iteration
and cleanCell/cells/block identifiers when making this change.
shortcuts/doc/lark_table_test.go (2)
57-63: Consider adding tests for additional edge cases.

The current tests cover core scenarios well, but the following documented behaviors lack test coverage:

Nested <grid>/<column> flattening

<text> tag stripping

Uneven column counts (row padding)

<br> tag conversion to newlines and back

These could be added in a follow-up to strengthen regression protection.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/lark_table_test.go` around lines 57 - 63, Add unit tests to
shortc uts/doc/lark_table_test.go that exercise missing edge cases for the
NormalizeLarkTables function: add separate Test* cases validating nested
<grid>/<column> flattening (e.g., nested columns collapse to single table),
stripping of <text> tags inside cells, behavior when rows have uneven column
counts (ensure padding or empty cells are produced), and conversion of <br> tags
to newlines and back; implement each test by constructing representative input
strings, calling NormalizeLarkTables, and asserting the expected normalized
output to protect these documented behaviors.
27-35: Consider adding a positive assertion for the data row.

The test verifies the separator row is absent, but doesn't confirm the data row | A | B | is present. Adding this check would ensure the function outputs content correctly in no-header mode.
💡 Suggested addition
 	if strings.Contains(result, "| --- |") {
 		t.Errorf("should not have separator when header-row=false, got:\n%s", result)
 	}
+	if !strings.Contains(result, "| A | B |") {
+		t.Errorf("expected data row, got:\n%s", result)
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/doc/lark_table_test.go` around lines 27 - 35, The test
TestNormalizeLarkTables_NoHeaderRow currently only asserts the absence of the
separator row; add a positive assertion that the converted table contains the
expected data row (e.g., check that NormalizeLarkTables(input) includes the
string "| A | B |"). Update the test to call result :=
NormalizeLarkTables(input) (already present) and add a strings.Contains check
that calls t.Errorf with a clear message if the data row is missing; reference
the test name and NormalizeLarkTables to locate where to add this assertion.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/doc/lark_table.go`:
- Around line 149-152: Remove the unused fmt import and the init() shim that
references fmt.Sprint: delete the fmt import from the file's import block and
remove the init() function (the lines containing func init() { _ = fmt.Sprint })
so there is no unused-import workaround; ensure no other code in this file
depends on fmt before committing.

---

Nitpick comments:
In `@shortcuts/doc/lark_table_test.go`:
- Around line 57-63: Add unit tests to shortc uts/doc/lark_table_test.go that
exercise missing edge cases for the NormalizeLarkTables function: add separate
Test* cases validating nested <grid>/<column> flattening (e.g., nested columns
collapse to single table), stripping of <text> tags inside cells, behavior when
rows have uneven column counts (ensure padding or empty cells are produced), and
conversion of <br> tags to newlines and back; implement each test by
constructing representative input strings, calling NormalizeLarkTables, and
asserting the expected normalized output to protect these documented behaviors.
- Around line 27-35: The test TestNormalizeLarkTables_NoHeaderRow currently only
asserts the absence of the separator row; add a positive assertion that the
converted table contains the expected data row (e.g., check that
NormalizeLarkTables(input) includes the string "| A | B |"). Update the test to
call result := NormalizeLarkTables(input) (already present) and add a
strings.Contains check that calls t.Errorf with a clear message if the data row
is missing; reference the test name and NormalizeLarkTables to locate where to
add this assertion.

In `@shortcuts/doc/lark_table.go`:
- Around line 64-71: The fenced-code detection currently checks for "```" after
cells are processed by cleanCell; to avoid future brittle behavior move the
fenced-code check to operate on the raw cell content before calling cleanCell
(i.e., while iterating rows and the original cell string from rows), returning
block immediately if any raw cell contains "```", or if you prefer to keep the
current order add a clear comment next to the loop documenting the dependency
that cleanCell must not strip backticks and that the fenced-code check must run
on raw input; reference the rows iteration and cleanCell/cells/block identifiers
when making this change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3c3bc9e4-589f-46ab-9fa7-3ba542fc1078

📥 Commits

Reviewing files that changed from the base of the PR and between 0c77c95 and 139f466.

📒 Files selected for processing (3)

shortcuts/doc/docs_fetch.go
shortcuts/doc/lark_table.go
shortcuts/doc/lark_table_test.go

shortcuts/doc/lark_table.go

- Refactor NormalizeLarkTables to check for fenced code blocks on raw content - Remove unused fmt import and init() shim - Make headerAttrRE case-insensitive - Add comprehensive unit tests for edge cases: - Uneven column counts (row padding) - Grid/column layout flattening - Text tag stripping - <br/> tag normalization - Add positive assertion to TestNormalizeLarkTables_NoHeaderRow - Improve docstring coverage for exported functions

CLAassistant · 2026-04-03T15:33:06Z

All committers have signed the CLA.

github-actions bot added domain/ccm PR touches the ccm domain size/M Single-domain feat or fix with limited business impact labels Apr 3, 2026

greptile-apps bot reviewed Apr 3, 2026

View reviewed changes

shortcuts/doc/lark_table.go Outdated Show resolved Hide resolved

shortcuts/doc/lark_table.go Outdated Show resolved Hide resolved

shortcuts/doc/lark_table.go Show resolved Hide resolved

coderabbitai bot reviewed Apr 3, 2026

View reviewed changes

shortcuts/doc/lark_table.go Outdated Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(docs): normalize <lark-table> to GFM pipe tables in docs +fetch#260

feat(docs): normalize <lark-table> to GFM pipe tables in docs +fetch#260
liuhedev wants to merge 2 commits intolarksuite:mainfrom
liuhedev:feat/docs-fetch-normalize-lark-table

liuhedev commented Apr 3, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Apr 3, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

greptile-apps bot commented Apr 3, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

CLAassistant commented Apr 3, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

liuhedev commented Apr 3, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Behavior

Edge cases handled

Motivation

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

greptile-apps bot commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

CLAassistant commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

liuhedev commented Apr 3, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 3, 2026 •

edited

Loading

greptile-apps bot commented Apr 3, 2026 •

edited

Loading

CLAassistant commented Apr 3, 2026 •

edited

Loading