[wip] Add docs-driven E2E coverage for the playground#1260

Draft
whoiskatrin wants to merge 4 commits into main from playground-testing-automation

whoiskatrin (Contributor) commented Apr 4, 2026

Summary

This PR adds a docs-driven browser testing workflow for the playground, using examples/playground/testing.md as the source of truth for what should be covered.

The main goal is to make it easy to keep the playground test suite in sync with the testing guide:

  • editing testing.md regenerates a machine-readable manifest and coverage artifacts
  • uncovered scenarios show up automatically as generated test.fixme() entries
  • CI fails when testing.md and generated artifacts drift

What changed

Docs-driven test generation

Added generation/reporting scripts under examples/playground/scripts/:

  • generate-e2e-from-testing.ts
  • generate-testing-coverage-report.ts
  • prepare-e2e-deps.ts

These produce:

  • examples/playground/e2e/testing.manifest.json
  • examples/playground/e2e/generated/testing.generated.spec.ts
  • examples/playground/e2e/testing.coverage.json
  • examples/playground/e2e/testing.coverage.md
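As a rough illustration of the first generation step, here is a minimal sketch of turning the guide into manifest entries. The layout it assumes ("## Section" headings with one bullet per scenario), the `Scenario` fields, and the id scheme are all hypothetical; the real testing.md format and the output of generate-e2e-from-testing.ts may differ.

```typescript
// Hypothetical scenario shape; the PR's actual manifest entries may differ.
interface Scenario {
  id: string;
  section: string;
  title: string;
}

// Assumed guide layout (not confirmed by the PR):
//   "## Section" headings, each followed by "- Scenario title" bullets.
function parseTestingGuide(markdown: string): Scenario[] {
  const scenarios: Scenario[] = [];
  let section = "";
  let indexInSection = 0;

  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();

    // A level-2 heading starts a new section and resets the counter.
    const heading = /^##\s+(.+)$/.exec(line);
    if (heading) {
      section = (heading[1] ?? "").trim();
      indexInSection = 0;
      continue;
    }

    // Bullets before the first section heading are ignored.
    const bullet = /^[-*]\s+(.+)$/.exec(line);
    if (bullet && section !== "") {
      indexInSection += 1;
      scenarios.push({
        id: `${slugify(section)}-${indexInSection}`,
        section,
        title: (bullet[1] ?? "").trim(),
      });
    }
  }
  return scenarios;
}

// Lowercase, hyphen-separated form of a heading, used as an id prefix.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}
```

Serializing the returned array with `JSON.stringify(scenarios, null, 2)` would give a manifest-like file. Positional ids are only stable while bullets keep their order, so a real implementation might prefer explicit ids written in the guide.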

Playwright setup and manual coverage

Added a dedicated playground Playwright setup under examples/playground/e2e/:

  • playwright.config.ts
  • helpers.ts
  • manual specs for core, navigation/workflow, routing/readonly, supervisor/sql, chat/approval/retry/docs, and schedule/codemode/error coverage
  • manual/coverage.ts to map implemented specs back to scenario ids from testing.md
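The mapping from manual specs back to scenario ids could look roughly like the sketch below. The `CoverageMap` shape, the file names in the usage note, and `summarizeCoverage` are illustrative assumptions, not the contents of manual/coverage.ts.

```typescript
// Hypothetical shape: spec file name -> scenario ids that file claims to cover.
type CoverageMap = Record<string, string[]>;

interface CoverageSummary {
  implemented: string[]; // ids in the guide that some spec covers
  uncovered: string[]; // ids in the guide that no spec covers
  unknown: string[]; // ids a spec claims that the guide does not define
}

function summarizeCoverage(
  manifestIds: string[],
  coverage: CoverageMap
): CoverageSummary {
  const known = new Set(manifestIds);
  const claimed = new Set<string>();

  // Collect every id claimed by any spec file.
  for (const file of Object.keys(coverage)) {
    for (const id of coverage[file] ?? []) {
      claimed.add(id);
    }
  }

  return {
    implemented: manifestIds.filter((id) => claimed.has(id)),
    uncovered: manifestIds.filter((id) => !claimed.has(id)),
    unknown: Array.from(claimed).filter((id) => !known.has(id)),
  };
}
```

For example, `summarizeCoverage(["core-1", "email-1"], { "core.spec.ts": ["core-1"] })` reports `core-1` as implemented and `email-1` as uncovered. Tracking `unknown` ids separately is one way to catch a spec that still references a scenario removed from the guide.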

UI/testability improvements

Added stable test hooks (data-testid) and a few UX/testability improvements across playground demos and shared components, including:

  • connection status
  • log panel
  • demo wrapper / sidebar
  • SQL, routing, readonly, connections
  • chat rooms / supervisor
  • workflow basic demo
  • codemode

This keeps the browser tests resilient and lets the coverage report map cleanly to real scenarios.

CI + package scripts

Added a new workflow:

  • .github/workflows/playground-e2e.yml

Added/updated scripts so the common flows are:

  • npm run sync:playground-testing
  • npm run check:playground-testing-sync
  • npm run test:playground:e2e

How this works in practice

  1. Update examples/playground/testing.md
  2. Run npm run sync:playground-testing
  3. New or changed scenarios show up in the generated manifest/fixme spec/coverage report
  4. Implement the real Playwright coverage in a manual spec
  5. CI verifies generated artifacts stay in sync with the guide
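Steps 3 and 5 above can be sketched as two small pure functions: one renders `test.fixme()` stubs for uncovered scenarios, the other compares an on-disk artifact with a fresh regeneration. The names `renderFixmeSpec` and `isInSync`, the stub format, and the line-ending normalization are assumptions for illustration, not the PR's actual scripts.

```typescript
// Hypothetical minimal scenario shape for stub generation.
interface UncoveredScenario {
  id: string;
  title: string;
}

// Render the text of a generated spec file containing one test.fixme()
// stub per uncovered scenario. The output is a string; writing it to
// disk is left to the caller.
function renderFixmeSpec(uncovered: UncoveredScenario[]): string {
  const header = 'import { test } from "@playwright/test";\n\n';
  if (uncovered.length === 0) {
    // Emit a comment-only body so the file is still valid TypeScript.
    return header + "// No uncovered scenarios.\n";
  }
  const stubs = uncovered.map((s) => {
    // JSON.stringify escapes quotes and backslashes in the title.
    const name = JSON.stringify(`${s.id}: ${s.title}`);
    return `test.fixme(${name}, async () => {});`;
  });
  return header + stubs.join("\n") + "\n";
}

// Drift check: an artifact is in sync when it matches a fresh
// regeneration, ignoring line-ending differences.
function isInSync(onDisk: string, regenerated: string): boolean {
  const normalize = (text: string) => text.replace(/\r\n/g, "\n");
  return normalize(onDisk) === normalize(regenerated);
}
```

A CI step built on this would regenerate the artifacts in memory, compare them with the checked-in files, and exit non-zero on any mismatch.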

Current coverage

Because email automation was intentionally removed from this branch, the current automated coverage is:

  • 100 / 113 scenarios implemented

The remaining 13 uncovered scenarios are the email flows still documented in examples/playground/testing.md.

Testing

I ran:

  • npm run test:e2e -w @cloudflare/agents-playground
  • npm run check:playground-testing-sync

The playground-specific sync check and local playground browser suite are passing.

changeset-bot bot commented Apr 4, 2026

⚠️ No Changeset found

Latest commit: 8158a18

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

whoiskatrin force-pushed the playground-testing-automation branch 2 times, most recently from ac87386 to afe7b5c (April 4, 2026 16:17)
whoiskatrin force-pushed the playground-testing-automation branch from afe7b5c to 5a3a26c (April 4, 2026 16:24)

pkg-pr-new bot commented Apr 4, 2026

agents

npm i https://pkg.pr.new/agents@1260

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1260

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1260

hono-agents

npm i https://pkg.pr.new/hono-agents@1260

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1260

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1260

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1260

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1260

commit: 5a3a26c

whoiskatrin changed the title from "Add docs-driven E2E coverage for the playground" to "[wip] Add docs-driven E2E coverage for the playground" on Apr 4, 2026

…isolation, gate CI on Playwright

- Only generate uncovered stubs in testing.generated.spec.ts
- Gitignore generated artifacts (manifest, coverage, generated spec)
- Extract shared Scenario type to e2e/types.ts
- Add state reset to core, supervisor, and readonly tests
- Fix PORT → DEV_PORT naming in playwright.config.ts
- Comment setWorkflowStepCount .evaluate() workaround
- Use Playwright-idiomatic expect.poll for auto-scroll assertion
- O(n) duplicate detection in coverage report
- Handle empty generated spec gracefully
- Gate main CI on Playwright E2E tests passing
- Add Playwright report artifact upload on failure
- Bump CI timeout to 30min
- Move playground-e2e.yml to nightly/manual only (main CI covers PRs)
- Simplify check:testing-sync (no git diff needed with gitignored files)
- Update README to reflect gitignored generated files
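
The "O(n) duplicate detection" item in the list above refers to a standard single-pass technique. A minimal sketch, assuming the coverage report checks a flat list of scenario ids (the function name and input shape are hypothetical):

```typescript
// Single pass with two sets: `seen` records ids encountered so far,
// `duplicates` records ids encountered more than once. Each id costs
// O(1) expected time, so the whole scan is O(n) rather than the O(n^2)
// of comparing every pair.
function findDuplicateIds(ids: string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();

  for (const id of ids) {
    if (seen.has(id)) {
      duplicates.add(id);
    } else {
      seen.add(id);
    }
  }

  // Sets iterate in insertion order, so duplicates are listed in the
  // order their second occurrence appears.
  return Array.from(duplicates);
}
```

Each duplicated id is reported once, no matter how many times it repeats.
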
Replace ~1,400 lines of hand-written Playwright specs with an AI-driven
test runner that interprets testing.md scenarios directly at runtime.

- Parse testing.md into typed Scenario objects
- Use Workers AI (Llama 4 Scout) to translate natural-language
  actions/expectations into Playwright commands via accessibility snapshots
- Single ai-runner.spec.ts creates one test() per scenario (113 total)
- Auto-skip deployed-only scenarios
- Delete 6 manual spec files, coverage tracker, generation scripts
- Update CI to use existing CLOUDFLARE_API_TOKEN/CLOUDFLARE_ACCOUNT_ID
- Remove sync-check job and check:playground-testing-sync step

Adding a new test = editing testing.md. No Playwright code needed.
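
The "one test() per scenario" and "auto-skip deployed-only" behavior described above can be sketched as a planning step that runs before any tests are registered. The `deployedOnly` flag, the `isDeployed` input, and the `planScenarios` name are assumptions for illustration; the commit does not show how ai-runner.spec.ts marks or detects deployed-only scenarios.

```typescript
// Hypothetical scenario shape; `deployedOnly` is an assumed flag.
interface RunnerScenario {
  id: string;
  title: string;
  deployedOnly: boolean;
}

interface ScenarioPlan {
  scenario: RunnerScenario;
  action: "run" | "skip";
  reason?: string;
}

// Decide, for each scenario, whether the runner should execute or skip it.
// `isDeployed` says whether the suite targets a deployed environment
// rather than a local dev server.
function planScenarios(
  scenarios: RunnerScenario[],
  isDeployed: boolean
): ScenarioPlan[] {
  return scenarios.map((scenario): ScenarioPlan => {
    if (scenario.deployedOnly && !isDeployed) {
      return {
        scenario,
        action: "skip",
        reason: "requires a deployed environment",
      };
    }
    return { scenario, action: "run" };
  });
}

// In a Playwright spec, the plan could drive registration roughly like:
//
//   for (const { scenario, action, reason } of planScenarios(all, false)) {
//     test(`${scenario.id}: ${scenario.title}`, async ({ page }) => {
//       test.skip(action === "skip", reason);
//       // ...interpret the scenario's actions and expectations here...
//     });
//   }
```

Keeping the decision in a pure function means the skip logic can be checked without launching a browser or calling a model.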