From aecfeef28a8e6b86635811f1b6bd74f1e6264e26 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 16:18:52 +0800 Subject: [PATCH 01/46] Add files via upload --- webmcp-technical-note-1.md | 171 +++++++++++++++++++++++++++++++++++++ webmcp-technical-note-2.md | 102 ++++++++++++++++++++++ 2 files changed, 273 insertions(+) create mode 100644 webmcp-technical-note-1.md create mode 100644 webmcp-technical-note-2.md diff --git a/webmcp-technical-note-1.md b/webmcp-technical-note-1.md new file mode 100644 index 0000000..a0a47b3 --- /dev/null +++ b/webmcp-technical-note-1.md @@ -0,0 +1,171 @@ +# WebMCP: How a Browser Hack Became a Proposed Web Standard + +**Anthropomorphic Press -- Technical Note** +**15 February 2026** + +--- + +On February 10, 2026, Google's Chrome team launched an early preview of something called WebMCP -- a proposed web standard that would allow any website to expose structured, callable tools to AI agents through a new browser API called navigator.modelContext. Within 72 hours, the announcement had generated coverage in VentureBeat, Search Engine Land, WinBuzzer, The New Stack, and dozens of developer blogs. SEO commentators called it the biggest shift in technical SEO since structured data. +([Source](https://searchengineland.com/google-releases-preview-of-webmcp-how-ai-agents-interact-with-websites-469024)) + +Members of the W3C community group where the specification is supposedly being incubated however learned about it from that same press coverage -- not from the group itself. + +This technical note reconstructs the chain of events. + +## What Is WebMCP + +WebMCP (Web Model Context Protocol) is a proposed JavaScript API that allows web developers to expose their web application functionality as "tools" -- JavaScript functions with natural language descriptions and structured schemas that can be invoked by AI agents, browser assistants, and assistive technologies. As the specification states: "Web pages that use WebMCP can be thought of as Model Context Protocol servers that implement tools in client-side script instead of on the backend." +([Source](https://webmachinelearning.github.io/webmcp/)) + +The specification proposes two APIs. A Declarative API handles standard actions that can be defined directly in HTML forms. An Imperative API handles more complex, dynamic interactions requiring JavaScript execution through navigator.modelContext.registerTool(). Together they allow web pages to function as tool servers for AI agents, running entirely client-side in the browser. +([Source](https://github.com/webmachinelearning/webmcp/blob/main/docs/proposal.md)) + +Critically, WebMCP is not the same thing as Anthropic's Model Context Protocol (MCP), despite sharing a name fragment and conceptual lineage. The two protocols operate in different layers and serve different purposes. Anthropic's MCP is a backend protocol: it uses JSON-RPC for client-server communication, runs on hosted servers (typically in Python or Node.js), connects AI platforms like Claude or ChatGPT to external services, and does not require a browser or a human user to be present. WebMCP is a frontend, browser-native API: it runs entirely client-side in JavaScript, uses the browser's postMessage system for communication, requires an active browser session with a human user present, and exposes website functionality directly to agents operating within that browser context. A company might use both: an MCP server for direct API integrations with AI platforms, and WebMCP tools on its consumer-facing website so that browser-based agents can interact with the site while the user is actively browsing. The two are complementary, not competing. +([Source](https://webmachinelearning.github.io/webmcp/) and [Source](https://github.com/webmachinelearning/webmcp/blob/main/docs/proposal.md)) + +The WebMCP specification explicitly declares that headless browsing, fully autonomous agents, and backend service integration are out of scope. "Headless browsing" refers to running a browser without a visible interface -- no screen, no human watching -- as automated tools like Puppeteer and Playwright do. By excluding this, WebMCP requires that a human user is present in an active browser session whenever agents invoke tools. This is a deliberate design choice: the standard is built around cooperative, human-in-the-loop workflows, not unsupervised automation. +([Source](https://webmachinelearning.github.io/webmcp/)) + +## Origin: From Amazon Frustration to Browser Hack + +The concept traces back to Alex Nahas, a backend engineer at Amazon. When Anthropic's MCP arrived in early 2025, Amazon spun up what amounted to one enormous MCP server with thousands of tools. The real problem, however, was authorization: MCP's spec had adopted OAuth 2.1, which essentially nobody at Amazon had implemented internally. Every internal service had its own authentication story. +([Source](https://www.arcade.dev/blog/web-mcp-alex-nahas-interview)) + +Nahas realized the browser itself could solve the auth problem -- users are already signed in through federated browser sessions. He developed MCP-B (Model Context Protocol for Browsers), a Chrome extension that let websites expose MCP-compatible tools directly through browser JavaScript, using existing authentication and security models. The underlying protocol he called "WebMCP." +([Source](https://github.com/MiguelsPizza/WebMCP)) +([Source](https://docs.mcp-b.ai/introduction)) + +Independently, a separate early implementation by Jason McGhee also used the name "WebMCP" for a similar concept -- a widget allowing any website to act as an MCP server client-side. McGhee has since deferred to the W3C version, noting his implementation is "not compliant with the W3C spec" and that "much more capable folks that develop the web" have taken up the work. +([Source](https://github.com/jasonjmcghee/WebMCP)) + +## Google and Microsoft Enter: The W3C Pathway + +On August 28, 2025, Patrick Brosset of the Microsoft Edge team published a blog post introducing WebMCP as "a proposal to let you, web developers, control how AI agents interact with your web pages." He described it as a joint effort between the Edge and Google teams and solicited early feedback. +([Source](https://patrickbrosset.com/articles/2025-08-28-ai-agents-and-the-web-a-proposal-to-keep-developers-in-the-loop/)) + +In an interview published by The New Stack, Kyle Pflug, group product manager for the web platform at Microsoft Edge, confirmed that WebMCP was a joint Microsoft-Google initiative. Pflug noted that Alex Nahas had joined the group, and that the priority for the rest of 2025 was "deeper conversations with web developers" and working toward "an early developer preview in Chromium." +([Source](https://thenewstack.io/how-webmcp-lets-developers-control-ai-agents-with-javascript/)) + +The specification was placed in the GitHub repository of the W3C Web Machine Learning Community Group at github.com/webmachinelearning/webmcp. The repo shows open issues dating from October-November 2025, primarily filed by Khushal Sagar of Google (a spec editor), with issue tracker activity through at least November 17, 2025. +([Source](https://github.com/webmachinelearning/webmcp/issues)) + +## Where Is It Housed: The Web Machine Learning Community Group + +The specification itself states: "This specification was published by the Web Machine Learning Community Group. It is not a W3C Standard nor is it on the W3C Standards Track." The draft is dated February 12, 2026 and lists three editors: Brandon Walderman (Microsoft), Khushal Sagar (Google), and Dominic Farolino (Google). +([Source](https://webmachinelearning.github.io/webmcp/)) + +The Web Machine Learning Community Group was originally proposed on October 3, 2018 by Anssi Kostiainen (Intel) to incubate the Web Neural Network API. It has since expanded its charter to include additional deliverables. The updated charter now lists the "WebMCP API" as a specification deliverable, described as "An API for web apps to expose their functionality as tools to AI agents and assistive technologies." +([Source](https://webmachinelearning.github.io/charter/)) + +The charter also notes that the WebML CG "should coordinate with" the AI Agent Protocol Community Group "to ensure these protocols consider WebMCP API requirements, as applicable." +([Source](https://webmachinelearning.github.io/charter/)) + +The CG participant lists show that Google LLC, Microsoft Corporation, Intel Corporation, Samsung, Apple Inc., Huawei, and others have representatives in the group. +([CG participants](https://www.w3.org/groups/cg/webmachinelearning/participants/)) +([WG participants](https://www.w3.org/groups/wg/webmachinelearning/participants/)) + +The webmachinelearning GitHub organization (https://github.com/webmachinelearning) hosts the WebMCP repo alongside the Web Neural Network API (WebNN), Prompt API, Translation API, Writing Assistance APIs, and Proofreader API. The webmcp repo shows 436 stars and 21 forks as of this writing, with its last update on December 12, 2025. +([Source](https://github.com/orgs/webmachinelearning/repositories)) + +Separately, a "WebMCP-org" GitHub organization (https://github.com/WebMCP-org) hosts MCP-B-related implementation code, npm packages, and example applications, including React hooks and transport layers. This is the implementation side, distinct from the specification repo. +([Source](https://github.com/WebMCP-org)) + +## The Chrome Launch: February 10, 2026 + +On February 10, 2026, Google's Andre Cipriani Bandarra announced the WebMCP Early Preview Program. The announcement stated that WebMCP aims to provide a standard way for exposing structured tools, ensuring AI agents can perform actions with increased speed, reliability, and precision. Access to the preview is available in Chrome 146 Canary behind the "WebMCP for testing" flag at chrome://flags. +([Source](https://searchengineland.com/google-releases-preview-of-webmcp-how-ai-agents-interact-with-websites-469024)) + +The announcement generated immediate and extensive press coverage. VentureBeat reported the specification was transitioning from community incubation within the W3C to a formal draft, and noted the comparison drawn by Chrome staff engineer Khushal Sagar that WebMCP aims to become "the USB-C of AI agent interactions with the web." +([Source](https://venturebeat.com/infrastructure/google-chrome-ships-webmcp-in-early-preview-turning-every-website-into-a)) + +WinBuzzer reported early benchmarks showing approximately 67% reduction in computational overhead compared to traditional visual agent-browser interactions. No other browser vendor has announced implementation timelines, though Microsoft's co-authorship suggests Edge support is probable. +([Source](https://winbuzzer.com/2026/02/13/google-chrome-webmcp-early-preview-ai-agents-xcxwbn/)) + +## The Process Question + +The technical merits of WebMCP are not the concern raised here. The concern is procedural. + +The specification is described as being incubated by a W3C Community Group. A CG draft carries a specific meaning in W3C process -- it is a collaborative, community-driven document subject to group deliberation, review, and consensus-building. Yet the Chrome team shipped a working implementation in Chrome 146 and launched a public developer program before the specification was mature. The spec on GitHub contains multiple sections marked TODO. Open issues remain unresolved. +([Source](https://github.com/webmachinelearning/webmcp/issues)) + +As one critical assessment noted, Chrome's unilateral advancement through an early preview program raises questions about whether competing browser vendors will adopt compatible approaches, and the announcement provides limited technical documentation about API structure, authentication mechanisms, or permission models. +([Source](https://ppc.land/chromes-webmcp-could-end-ai-agents-pixel-parsing-nightmare/)) + +Members of the Web Machine Learning Community Group report learning about WebMCP from external press coverage rather than through the group's own communication channels. This raises the question of whether the CG incubation process served as genuine community deliberation or as a hosting arrangement for a specification driven primarily by two browser vendors. + +This pattern -- browser vendors shipping early implementations to generate developer momentum while the standardization process is still in progress -- is not new. It has a long history in web standards. But it raises particular questions when applied to AI agent infrastructure, where the security, privacy, and trust implications of exposing website functionality to autonomous systems are significant and largely unresolved. + +## De Facto Standard in the Making? + +To be precise about the status: WebMCP is a Draft Community Group Report. The W3C FAQ explicitly states that "Community and Business Group Reports are not yet W3C Standards" and that groups should not refer to CG work as "standards work" or "draft standards." CG Reports are not W3C Recommendations and do not carry the weight of the W3C Recommendation Track process. +([Source](https://www.w3.org/community/about/faq/)) + +Yet Chrome 146 already ships a working implementation behind a feature flag. This is the classic pattern of a de facto standard: ship the implementation, build developer adoption, and the specification follows the code rather than the other way around. The question is whether the community process will shape the final standard, or whether the Chrome implementation will become the reference that everyone else must follow. + +It should be noted that WebMCP was discussed at W3C TPAC 2025 in Kobe, Japan, within the Web Machine Learning Community Group sessions. The W3C blog reports that WebMCP was "a major topic" including "considerations about how to manage consent and permissions for sensitive actions in a WebMCP context." +([Source](https://www.w3.org/blog/ -- TPAC 2025 report)) + +## How to Contribute + +For developers, standards practitioners, and anyone concerned about how AI agents will interact with the web, here are the concrete pathways to participate in shaping WebMCP: + +**1. Join the W3C Web Machine Learning Community Group.** W3C Community Groups are open to all. No W3C Membership is required and there is no fee. You need a free W3C account and must agree to the W3C Community Contributor License Agreement (CLA). Join at: https://www.w3.org/community/webmachinelearning/ +([Source](https://www.w3.org/community/about/)) + +**2. File issues and contribute via GitHub.** The charter states that participants should make all contributions in the GitHub repo, by pull request (preferred), by raising an issue, or by commenting on an existing issue. The spec repo is at: https://github.com/webmachinelearning/webmcp +([Source](https://webmachinelearning.github.io/charter/)) + +**3. Test the Chrome implementation and provide feedback.** The early preview is available in Chrome 146 Canary by enabling the "WebMCP for testing" flag at chrome://flags. Developers can apply for access to documentation and demos through Google's Early Preview Program. + +**4. Engage with the AI Agent Protocol Community Group.** The WebML CG charter identifies coordination with this separate CG, which develops protocols for AI agent discovery and collaboration across the web. If you work on agent interoperability, this is a relevant touchpoint. + +## What to Watch + +Several questions remain open. How will WebMCP interact with existing accessibility frameworks, given the specification's claim that the API would benefit assistive technologies? How will rate limiting and abuse prevention work when agents can invoke website tools programmatically? What happens when prompt injection meets client-side tool registration? And critically -- will the W3C community group process catch up to the Chrome implementation, or will the implementation become the de facto standard regardless of community input? The specification is open. The implementation is live. The window for community influence is now. + +## Chronology + +- **October 3, 2018** -- Web Machine Learning Community Group proposed at W3C by Anssi Kostiainen (Intel) to incubate Web Neural Network API. ([Source](https://www.w3.org/community/webmachinelearning/)) + +- **Early 2025** -- Anthropic's Model Context Protocol gains adoption. Alex Nahas at Amazon encounters OAuth 2.1 authorization problems with internal MCP deployment. ([Source](https://www.arcade.dev/blog/web-mcp-alex-nahas-interview)) + +- **2025** -- Alex Nahas develops MCP-B Chrome extension; Jason McGhee independently develops early WebMCP widget. Both demonstrate browser-based MCP feasibility. ([Source](https://github.com/MiguelsPizza/WebMCP) and [Source](https://github.com/jasonjmcghee/WebMCP)) + +- **August 28, 2025** -- Patrick Brosset (Microsoft Edge) publishes blog post introducing WebMCP as joint Edge-Google proposal, soliciting early developer feedback. ([Source](https://patrickbrosset.com/articles/2025-08-28-ai-agents-and-the-web-a-proposal-to-keep-developers-in-the-loop/)) + +- **September-October 2025** -- Kyle Pflug (Microsoft Edge) confirms joint initiative in interview with The New Stack. Alex Nahas joins the group. Specification placed in webmachinelearning GitHub org. ([Source](https://thenewstack.io/how-webmcp-lets-developers-control-ai-agents-with-javascript/)) + +- **October-November 2025** -- Open issues filed on webmachinelearning/webmcp repo, primarily by spec editor Khushal Sagar (Google). Issues tagged "Agenda+" for CG discussion. ([Source](https://github.com/webmachinelearning/webmcp/issues)) + +- **December 12, 2025** -- Last recorded update to webmcp repo on GitHub. ([Source](https://github.com/orgs/webmachinelearning/repositories)) + +- **February 10, 2026** -- Google launches WebMCP Early Preview Program in Chrome 146 Canary. Press coverage erupts. ([Source](https://searchengineland.com/google-releases-preview-of-webmcp-how-ai-agents-interact-with-websites-469024)) + +- **February 12, 2026** -- WebMCP specification dated as "Draft Community Group Report." ([Source](https://webmachinelearning.github.io/webmcp/)) + +## Reference URLs + +- WebMCP specification (Draft CG Report, 12 Feb 2026): https://webmachinelearning.github.io/webmcp/ +- WebMCP proposal/explainer: https://github.com/webmachinelearning/webmcp/blob/main/docs/proposal.md +- WebMCP GitHub repo (webmachinelearning org): https://github.com/webmachinelearning/webmcp +- Open issues: https://github.com/webmachinelearning/webmcp/issues +- WebML CG charter (lists WebMCP as deliverable): https://webmachinelearning.github.io/charter/ +- WebML CG home page: https://www.w3.org/community/webmachinelearning/ +- WebML CG participants: https://www.w3.org/groups/cg/webmachinelearning/participants/ +- WebML WG participants: https://www.w3.org/groups/wg/webmachinelearning/participants/ +- webmachinelearning GitHub org: https://github.com/orgs/webmachinelearning/repositories +- Brosset blog post (Aug 28, 2025): https://patrickbrosset.com/articles/2025-08-28-ai-agents-and-the-web-a-proposal-to-keep-developers-in-the-loop/ +- The New Stack interview with Pflug: https://thenewstack.io/how-webmcp-lets-developers-control-ai-agents-with-javascript/ +- Arcade.dev Nahas interview: https://www.arcade.dev/blog/web-mcp-alex-nahas-interview +- Nahas MCP-B repo: https://github.com/MiguelsPizza/WebMCP +- McGhee early WebMCP: https://github.com/jasonjmcghee/WebMCP +- WebMCP-org GitHub: https://github.com/WebMCP-org +- MCP-B docs: https://docs.mcp-b.ai/introduction +- Search Engine Land (Feb 11): https://searchengineland.com/google-releases-preview-of-webmcp-how-ai-agents-interact-with-websites-469024 +- VentureBeat (Feb 12): https://venturebeat.com/infrastructure/google-chrome-ships-webmcp-in-early-preview-turning-every-website-into-a +- WinBuzzer (Feb 13): https://winbuzzer.com/2026/02/13/google-chrome-webmcp-early-preview-ai-agents-xcxwbn/ +- PPC Land (Feb 15): https://ppc.land/chromes-webmcp-could-end-ai-agents-pixel-parsing-nightmare/ + +--- + +*Anthropomorphic Press, indexed in Dow Jones Factiva. CWIRE* diff --git a/webmcp-technical-note-2.md b/webmcp-technical-note-2.md new file mode 100644 index 0000000..4d7f304 --- /dev/null +++ b/webmcp-technical-note-2.md @@ -0,0 +1,102 @@ +# WebMCP Technical Note 2: What to Test, What to Watch, What to Tell the Standards Body + +**Anthropomorphic Press -- Technical Note 2** +**15 February 2026** + +--- + +Google's WebMCP early preview is live in Chrome 146 Canary. The specification is still a draft. The community group process is still open. This means the window for meaningful community input is right now -- before implementation momentum makes the current design effectively permanent. + +This note is a practical guide. It is written for developers, accessibility practitioners, security researchers, standards participants, and anyone who builds things for the web and wants to understand what WebMCP means for their work. It covers what WebMCP is for, what to test, what the benefits are, what the risks are, and how to communicate findings to the W3C community group that hosts the specification. + +## What WebMCP Is For: The Use Cases + +WebMCP allows a website to declare a set of tools -- JavaScript functions with structured schemas and natural language descriptions -- that AI agents can discover and invoke. The specification targets several categories of use. + +The first is **e-commerce and transactional sites**. A travel booking site could register tools like searchFlights(origin, destination, dates), filterResults(price, stops, airline), and bookFlight(flightId, passengerDetails). Instead of an AI agent trying to parse a complex search interface by reading pixels or DOM elements, it calls the function directly and gets structured JSON back. The site controls exactly what the agent can do and how. + +The second is **productivity and SaaS applications**. A project management tool could expose createTask(title, assignee, dueDate), moveCard(cardId, column), and generateReport(dateRange). Browser-based AI assistants could help users manage workflows without the application needing to build and maintain a separate backend MCP server or API integration for every AI platform. + +The third is **content and media**. A news site could register searchArticles(topic, dateRange) and getArticleSummary(articleId). A mapping service could expose getDirections(from, to, mode) and findNearby(category, radius). These tools let agents interact with content in structured ways rather than scraping and guessing. + +The fourth -- and potentially most significant -- is **accessibility**. The specification claims WebMCP could benefit assistive technologies by providing structured, semantically meaningful interfaces to website functionality. A screen reader enhanced with agent capabilities could invoke tools directly rather than navigating complex visual layouts. This is a strong claim that deserves rigorous testing. + +The fifth is **form automation and multi-step workflows**. Complex processes like insurance applications, government forms, or account setup flows could be exposed as sequences of tool calls, allowing agents to guide users through them step by step while the site maintains control over validation, sequencing, and data handling. + +## What the Benefits Are + +WebMCP offers several concrete advantages over current approaches to AI-web interaction. + +**Reliability** is the most immediate. Today's browser agents -- whether using visual parsing or DOM inspection -- are brittle. A minor CSS change can break a visual agent. A DOM restructuring can invalidate a scraping approach. WebMCP tools are explicit contracts: the site declares what is available, the agent calls it, the response is structured. This should dramatically reduce failure rates for agent-web interaction. + +**Performance** is the second. Visual agents must capture screenshots, send them to a vision model, interpret the response, generate mouse coordinates, and repeat. WinBuzzer reported a 67% reduction in computational overhead with WebMCP compared to visual approaches. Even if that number proves optimistic in production, the architectural advantage is clear: a function call is faster than a screenshot-interpret-click loop. + +**Developer control** is the third. With visual or DOM-based agents, the website has no say in how an agent interacts with it. The agent reverse-engineers the interface. With WebMCP, the developer explicitly defines the interaction surface. Tools can include rate limits, validation, permission requirements, and structured error messages. The site becomes a willing participant in the interaction rather than a passive target. + +**Authentication reuse** is the fourth, and was the original motivation. Because WebMCP runs in the browser session, it inherits whatever authentication the user already has. No OAuth flows, no API keys, no separate credential management. The user is already logged in. The agent operates within that session. This solves one of the hardest problems in AI-service integration. + +**Standardization** is the fifth. If WebMCP succeeds, a developer implements tools once and every conformant agent can use them -- rather than building separate integrations for ChatGPT, Claude, Gemini, and whatever comes next. This is the "USB-C" argument: one interface, many devices. + +## What the Risks Are + +The risks are significant, and several are not yet adequately addressed in the specification. + +**Prompt injection** is the most acute. WebMCP tools return data to AI agents that then process it in their language model context. A malicious or compromised website could craft tool responses that manipulate the agent's behavior -- injecting instructions, altering the agent's understanding of the task, or causing it to take unintended actions on other sites. The specification does not currently define a defense against this beyond same-origin policy boundaries. + +**Scope creep of agent permissions** is the second. WebMCP is designed for human-in-the-loop workflows, with headless browsing explicitly out of scope. But the technical mechanism -- JavaScript functions callable by external code -- does not inherently enforce this. If browser vendors later relax the human-presence requirement, or if extensions find ways to invoke WebMCP tools without user awareness, the permission model collapses. The specification should define what "human in the loop" means technically, not just philosophically. + +**Consent and transparency** is the third. When a user visits a site that registers WebMCP tools, do they know? The current design provides no visible indicator to the user that tools have been registered, what data they expose, or when an agent invokes them. Compare this to other browser permission systems -- camera, microphone, location -- where the user explicitly grants access. WebMCP tools operate silently. + +**Competitive dynamics** is the fourth. WebMCP gives first-mover advantage to sites that implement tools early, potentially favoring large platforms with engineering resources. Smaller sites that do not implement WebMCP may become invisible to agent-mediated browsing. This could accelerate web consolidation. The specification should consider whether a minimal tool set (search, navigation, content retrieval) should be automatically generated from existing web standards like HTML forms, structured data, and ARIA attributes. + +**Data leakage through tool schemas** is the fifth. The natural language descriptions and parameter schemas of registered tools reveal information about a site's internal architecture, business logic, and data models. An agent -- or the platform behind it -- could catalog available tools across thousands of sites to build competitive intelligence. The specification does not address whether tool schemas should be treated as sensitive information. + +**Abuse and rate limiting** is the sixth. Agents can invoke tools at machine speed. A poorly defended site could face thousands of tool invocations per second from a single browser session. The specification mentions rate limiting as a consideration but does not define a standard mechanism. Without one, each site must build its own defenses, and many will not. + +**Cross-site tool chaining** is the seventh. If an agent can invoke tools on multiple open tabs, it could chain actions across sites in ways no individual site anticipated or authorized. Transfer money on a banking site, then use the confirmation on a shopping site, then post about it on a social network -- all within one agent workflow. The security boundaries for cross-site tool interaction are not yet defined. + +## What to Test + +For those with access to Chrome 146 Canary, here are the concrete areas that need community evaluation. Each should generate feedback for the W3C community group. + +**Test the Declarative API with real HTML forms.** Register tools that wrap existing form actions and verify that validation, error handling, and submission behavior match what a human user would experience. Try edge cases: forms with CAPTCHAs, multi-step forms with session state, forms that redirect on submit. Document where the abstraction breaks. + +**Test the Imperative API with dynamic content.** Register tools that interact with JavaScript-heavy applications -- single-page apps, dashboards with real-time data, applications that maintain complex client-side state. Evaluate whether tool calls can reliably interact with application state without causing inconsistencies. + +**Test authentication boundaries.** Log into a site, register tools, then observe what happens when the session expires, when the user logs out in another tab, when cookies are cleared. The specification's authentication reuse claim needs verification under adversarial conditions. + +**Test tool discovery and enumeration.** If multiple sites in different tabs register tools, how does the agent disambiguate? What happens when two sites register tools with the same name? How does the agent present available tools to the user? Is tool discovery observable by the page (can a site detect that an agent has read its tool list)? + +**Test accessibility integration.** If you work with assistive technologies, evaluate whether WebMCP tools provide genuinely better access to site functionality than existing ARIA roles and landmarks. Test with screen readers, switch access devices, and voice control. Document whether WebMCP complements or conflicts with existing accessibility standards. + +**Test prompt injection resilience.** Craft tool responses that contain instruction-like text and observe whether the consuming agent's behavior is affected. This is critical safety research. If tool responses can manipulate agent behavior, the security model is fundamentally incomplete. + +**Test performance claims.** Measure actual latency and token usage for equivalent tasks performed via WebMCP tools versus visual agent interaction. The 67% overhead reduction claim needs independent verification across different site types and task complexities. + +**Test failure modes.** What happens when a tool throws an error? When it returns unexpected data types? When it hangs? When the page navigates away mid-call? The specification should define standard error handling, but the current draft has TODO sections in these areas. Documenting real failure modes will directly shape the specification. + +## How to Communicate Findings + +Feedback is only useful if it reaches the people writing the specification. Here are the concrete channels, in order of effectiveness. + +**File a GitHub issue** at https://github.com/webmachinelearning/webmcp/issues with a clear title, reproducible steps, and a specific recommendation. Tag it with the relevant label if available. The spec editors (Brandon Walderman, Khushal Sagar, Dominic Farolino) monitor this repo. Issues with reproducible test cases and concrete proposals get traction. Issues that say "I don't like this" do not. + +**Join the W3C Web Machine Learning Community Group** at https://www.w3.org/community/webmachinelearning/ and participate in discussion. Community Groups are free and open. Participation in CG calls and mailing list threads carries weight in W3C process. + +If your findings relate to **agent interoperability** -- how WebMCP tools interact with broader agent ecosystems, discovery protocols, or multi-agent workflows -- also engage with the AI Agent Protocol Community Group, which the WebML CG charter identifies as a coordination partner. + +If your findings relate to **security or privacy**, file issues with clear severity assessments. W3C specifications have a tradition of security and privacy self-review questionnaires. Check whether the WebMCP specification has completed one, and if not, request it. + +If you publish your findings -- on a blog, in a report, in an academic paper -- link back to the relevant GitHub issues so the discussion stays connected to the specification process. + +## The Window + +The pattern in web standards is well established. Once an implementation ships in a dominant browser and developers build on it, the specification follows the code. Chrome holds roughly 65% of browser market share. The early preview is live. Developer adoption is beginning. The longer the community waits to engage, the narrower the design space becomes. + +This is not an argument against WebMCP. The technical concept is sound, the use cases are real, and the problem it solves -- giving developers control over AI agent interaction -- is important. But a good idea implemented badly, or without adequate security review, or without accessibility testing, or without community input, becomes a liability embedded in the web platform for decades. + +The specification is at https://webmachinelearning.github.io/webmcp/. The implementation is in Chrome 146 Canary. The issues page is at https://github.com/webmachinelearning/webmcp/issues. The community group is open to all at no cost. The work is now. + +--- + +*Anthropomorphic Press, indexed in Dow Jones Factiva. CWIRE* From 53d4aada03b1bea7d467e8ef1a98f4b47dbc1f89 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 16:21:12 +0800 Subject: [PATCH 02/46] Add files via upload From fedb0088b380f48e2f3eb99f34fdf6d0769bfb7d Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 17:02:06 +0800 Subject: [PATCH 03/46] Add files via upload --- webmcp-technical-note-3.md | 71 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 71 insertions(+) create mode 100644 webmcp-technical-note-3.md diff --git a/webmcp-technical-note-3.md b/webmcp-technical-note-3.md new file mode 100644 index 0000000..3f241eb --- /dev/null +++ b/webmcp-technical-note-3.md @@ -0,0 +1,71 @@ +# WebMCP Technical Note 3: WebMCP Is Not an MCP Server + +**Anthropomorphic Press -- Technical Note 3** +**15 February 2026** + +--- + +A persistent claim in the WebMCP ecosystem is that WebMCP turns a website into an MCP server. The W3C specification repository itself states that web pages using WebMCP "can be thought of as Model Context Protocol (MCP) servers that implement tools in client-side script instead of on the backend." Early independent implementations by Jason McGhee and Alex Nahas (MCP-B) literally did function as MCP servers, bridging browser JavaScript to MCP clients through localhost websocket connections using the standard MCP protocol. +([W3C spec repo](https://github.com/webmachinelearning/webmcp)) +([McGhee implementation](https://github.com/jasonjmcghee/WebMCP)) +([Nahas MCP-B](https://github.com/MiguelsPizza/WebMCP)) + +The framing is understandable. It is also architecturally misleading, and the confusion has consequences for how developers, security reviewers, and standards participants evaluate the specification. + +## The Analogy and Its Limits + +WebMCP and Anthropic's Model Context Protocol share a conceptual ancestor: both define "tools" as functions with natural language descriptions and structured schemas that AI agents can discover and invoke. That is where the meaningful similarity ends. + +**Anthropic's MCP** is a backend protocol. It uses JSON-RPC 2.0 as its message format, transported over stdio, HTTP with Server-Sent Events, or Streamable HTTP. MCP servers are hosted processes -- typically written in Python or Node.js -- that run on backend infrastructure. They connect AI platforms like Claude, ChatGPT, or Gemini to external services. Authentication follows OAuth 2.1 or custom API key schemes. No browser is required. No human user needs to be present. Headless, fully automated operation is the norm. +([Source](https://modelcontextprotocol.io/introduction)) + +**WebMCP** is a frontend browser API. It uses the browser's native postMessage system for communication between the web page and the agent. Tools are registered and executed as client-side JavaScript within an active browser tab. Authentication is inherited from the browser session -- whatever cookies or federated login the user already has. A human user must be present in an active browser session. Headless browsing is explicitly out of scope. +([Source](https://webmachinelearning.github.io/webmcp/)) + +The specification's own language -- "can be thought of as" -- acknowledges this is an analogy, not an identity. But the README, the press coverage, and the developer ecosystem have largely dropped the qualifier. The result is that WebMCP is widely discussed as though it were MCP running in the browser, with all the assumptions that entails. + +## What the Framing Gets Wrong + +When a developer hears "your website becomes an MCP server," they import a set of assumptions from the MCP architecture. Every one of these assumptions is wrong for WebMCP. + +**Transport.** MCP uses JSON-RPC 2.0, a well-specified request-response protocol with defined error codes, batching, and notification semantics. WebMCP uses postMessage, the browser's cross-origin communication mechanism. These have different reliability characteristics, different error handling models, and different security boundaries. Code written for one transport does not work with the other. + +**Execution context.** An MCP server runs in a controlled backend environment -- a container, a VM, a serverless function -- where the service provider manages the runtime, dependencies, and resource limits. WebMCP tools run in the browser's JavaScript engine, in the same execution context as the web page's own code. They are subject to the browser's security sandbox, but also to its constraints: single-threaded execution, same-origin policy, and the full surface area of client-side attack vectors. + +**Authentication.** MCP's specification has adopted OAuth 2.1 for authentication between clients and servers. This was, notably, the problem that motivated WebMCP's creation -- Alex Nahas at Amazon found that OAuth 2.1 was impractical for internal MCP deployments. WebMCP sidesteps this entirely by inheriting the browser session. This is elegant for usability but means the authentication model is whatever the website happens to use, with no protocol-level guarantees. + +**Trust direction.** In MCP, the AI platform (client) connects to a known, registered server. The platform decides which servers to trust. In WebMCP, any website the user visits can register tools. The trust decision shifts from the AI platform to the browser, and potentially to the user -- who may not know that tools have been registered at all, since the current specification provides no visible indicator. + +**Operational mode.** MCP servers are designed for automated, programmatic access. They can run continuously, handle concurrent requests, and operate without human involvement. WebMCP requires an active browser tab with a human user present. The specification explicitly excludes headless browsing. These are fundamentally different operational paradigms with different scaling characteristics, different failure modes, and different abuse surfaces. + +## Why This Matters for Standards Review + +The "MCP server" framing is not just imprecise. It actively interferes with rigorous evaluation of the specification. + +**Security reviewers** who approach WebMCP as "MCP in the browser" will evaluate it against MCP's threat model. But MCP's threat model assumes a controlled backend environment, authenticated client-server connections, and server-side access control. WebMCP's actual threat model involves client-side JavaScript execution, browser-based trust boundaries, and the full range of web security concerns including cross-site scripting, prompt injection via tool responses, and silent tool registration. Importing the wrong threat model means asking the wrong security questions. + +**Developers** who approach WebMCP as "MCP in the browser" may expect protocol-level interoperability -- that a WebMCP tool definition could be used interchangeably with an MCP server tool definition, or that MCP client libraries could connect to WebMCP pages. They cannot. The tool schema format may be similar, but the transport, discovery, and invocation mechanisms are incompatible. + +**Standards participants** who approach WebMCP as "MCP in the browser" may underestimate the scope of new specification work required. WebMCP is not an adaptation of MCP to a new environment. It is a new browser API that borrows one concept (the tool abstraction) from MCP and implements everything else differently. It needs its own security review, its own privacy analysis, its own accessibility evaluation, and its own consent model -- none of which can be inherited from MCP. + +## What WebMCP Actually Is + +WebMCP is a proposed browser API -- specifically, a new interface on navigator.modelContext -- that allows web pages to declare JavaScript functions as tools that browser-based AI agents can discover and invoke. It uses the browser's existing communication, security, and session management infrastructure rather than introducing a new protocol. + +The design has real strengths. Authentication reuse eliminates one of the hardest problems in AI-service integration. Client-side execution means no backend infrastructure is needed. The human-in-the-loop requirement provides a natural consent and oversight mechanism -- if implemented correctly. + +But these strengths are specific to WebMCP's actual architecture, not to the MCP analogy. Evaluating WebMCP on its own terms -- as a browser API with browser security characteristics -- leads to better questions, better testing, and better specifications than evaluating it as a variant of MCP. + +## A Suggested Clarification + +The W3C specification and its README should explicitly state that WebMCP is not an implementation of the Model Context Protocol and does not use the MCP wire protocol. It borrows the "tool" abstraction -- functions with schemas and natural language descriptions -- but implements discovery, registration, invocation, and communication through browser-native mechanisms that are architecturally distinct from MCP. + +The analogy is useful for first contact. A developer unfamiliar with WebMCP can quickly grasp the concept by thinking "it is like an MCP server, but in the browser." But the specification itself, the security review, and the community evaluation should not rely on the analogy. They should address WebMCP as what it is: a new browser API with its own architecture, its own threat model, and its own design space. + +## Both Can Coexist + +None of this is an argument against WebMCP or against MCP. A company might maintain an MCP server for direct API integrations with AI platforms and simultaneously implement WebMCP tools on its consumer-facing website for browser-based agent interaction. The two are complementary, not competing, and not identical. Recognizing the distinction is necessary for evaluating each on its own merits. + +--- + +*Anthropomorphic Press, indexed in Dow Jones Factiva. CWIRE* From bc700c004ff6cbf22d337c4774fc36d393ed3d12 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 17:16:03 +0800 Subject: [PATCH 04/46] Update webmcp-technical-note-1.md --- webmcp-technical-note-1.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/webmcp-technical-note-1.md b/webmcp-technical-note-1.md index a0a47b3..73736de 100644 --- a/webmcp-technical-note-1.md +++ b/webmcp-technical-note-1.md @@ -168,4 +168,4 @@ Several questions remain open. How will WebMCP interact with existing accessibil --- -*Anthropomorphic Press, indexed in Dow Jones Factiva. CWIRE* +*Anthropomorphic Press, indexed in Dow Jones Factiva. CWRE* From f86654edf0d101a706fd04ee495376e144e662da Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 17:17:16 +0800 Subject: [PATCH 05/46] Update webmcp-technical-note-2.md --- webmcp-technical-note-2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/webmcp-technical-note-2.md b/webmcp-technical-note-2.md index 4d7f304..8b96ab8 100644 --- a/webmcp-technical-note-2.md +++ b/webmcp-technical-note-2.md @@ -99,4 +99,4 @@ The specification is at https://webmachinelearning.github.io/webmcp/. The implem --- -*Anthropomorphic Press, indexed in Dow Jones Factiva. CWIRE* +*Anthropomorphic Press, indexed in Dow Jones Factiva. CWRE* From da69860030968826c84a9d4519ca0766d238402b Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 17:18:14 +0800 Subject: [PATCH 06/46] Update webmcp-technical-note-3.md --- webmcp-technical-note-3.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/webmcp-technical-note-3.md b/webmcp-technical-note-3.md index 3f241eb..78d3911 100644 --- a/webmcp-technical-note-3.md +++ b/webmcp-technical-note-3.md @@ -68,4 +68,4 @@ None of this is an argument against WebMCP or against MCP. A company might maint --- -*Anthropomorphic Press, indexed in Dow Jones Factiva. CWIRE* +*Anthropomorphic Press, indexed in Dow Jones Factiva. CWRE* From 0a9ed5cc0d3961dc18b103f76500181e929004ed Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 23:23:38 +0800 Subject: [PATCH 07/46] Create index.html --- index.html | 966 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 966 insertions(+) create mode 100644 index.html diff --git a/index.html b/index.html new file mode 100644 index 0000000..fc80ce8 --- /dev/null +++ b/index.html @@ -0,0 +1,966 @@ + + + + + + WebMCP Model Card Generator + + + + + + + + + + +
+ + + From dc0cb4b3d6089dbcd1efbab5e5d79bdc99e5efb8 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Sun, 15 Feb 2026 23:34:22 +0800 Subject: [PATCH 08/46] add User Guide Learn --- WebMCP_Model_Card_Generator_USER_GUIDE.md | 588 ++++++++++++++++++++++ 1 file changed, 588 insertions(+) create mode 100644 WebMCP_Model_Card_Generator_USER_GUIDE.md diff --git a/WebMCP_Model_Card_Generator_USER_GUIDE.md b/WebMCP_Model_Card_Generator_USER_GUIDE.md new file mode 100644 index 0000000..3c7d98b --- /dev/null +++ b/WebMCP_Model_Card_Generator_USER_GUIDE.md @@ -0,0 +1,588 @@ +# WebMCP Model Card Generator -- Guide for the Clueless + +**A field-by-field walkthrough so you can fill out every tab without knowing anything about WebMCP beforehand.** + +Co-created by Paola Di Maio, PhD (W3C AI-KR CG Chair) & Claude (Anthropic) +February 2026 + +--- + +## What You're Making + +A **model card** for a browser-side tool. Think of it as a label for a product: it tells AI agents (and humans) what your tool does, what it needs, what can go wrong, and how to use it safely. + +You fill in a form. The generator produces two files: +- **JSON** -- for machines to read (registries, agents, validators) +- **Markdown** -- for humans to read (documentation, GitHub, specifications) + +**Time needed**: 15-30 minutes if you know your tool. 5 minutes if you just want to test the generator. + +--- + +## Tab 1: Identity & Provenance + +*Who made this, what is it, where does it live?* + +### Tool/Page Name (required) + +The name of your WebMCP-enabled tool or web page. This is what agents will see when they discover your tools. + +- Good: "Easely Design Editor", "Flight Search Demo", "Todo Manager" +- Bad: "my-tool", "test", "page1" + +### Version (required) + +Use semantic versioning: **major.minor.patch** + +- `1.0.0` = first stable release +- `0.1.0` = early prototype +- `2.3.1` = mature, updated + +If you don't know, use `0.1.0` for prototypes or `1.0.0` for something you'd show people. + +### Description (required) + +One or two sentences explaining what tools this page exposes to AI agents. + +- Good: "Travel booking page exposing flight search, hotel filtering, and itinerary building tools via WebMCP declarative and imperative APIs" +- Bad: "A tool" or "This is my page" + +### Author + +Your name, team name, or organization. Can be left blank. + +### Creation Date + +Auto-filled with today's date. Change it if the tool was created earlier. + +### License (required) + +Pick one: + +| License | When to use | +|---------|-------------| +| **MIT** | Most permissive, most common. "Do whatever you want, just include the license." | +| **Apache 2.0** | Permissive + patent protection. Good for company projects. | +| **GPL 3.0** | Copyleft. Anyone who uses your code must also open-source theirs. | +| **Proprietary** | Closed source, commercial. | +| **W3C Community CLA** | If your tool is part of a W3C community group deliverable. | + +If unsure, pick MIT. + +### Page URL + +The web address where your WebMCP tools are registered. This is where agents go to find and call your tools. + +- Example: `https://travel-demo.bandarra.me/` +- Example: `https://mysite.com/booking` +- Put "tbd" if you haven't deployed yet. + +### Attribution (required) + +How was this tool created? Be honest -- this is a transparency field. + +| Choice | Meaning | +|--------|---------| +| **Human-authored** | A person wrote all the code | +| **AI co-created** | Human and AI worked together (like us right now!) | +| **AI-generated** | AI generated the code, human reviewed/edited | + +### Source Repository + +Link to your GitHub/GitLab repo. Leave blank if closed source. + +--- + +## Tab 2: API Mode + +*How do agents talk to your tools?* + +This is where WebMCP diverges most from backend MCP. You have TWO choices (or both). + +### Primary API Mode (required) + +| Mode | What it means | When to use | +|------|---------------|-------------| +| **Declarative (HTML forms)** | You add attributes to existing HTML `
` elements. No JavaScript needed. | You already have working HTML forms and want the fastest path to agent-readiness. | +| **Imperative (JavaScript)** | You call `navigator.modelContext.registerTool()` in JavaScript. | Complex logic, dynamic tools, multi-step workflows. | +| **Both** | You use forms for simple actions and JS for complex ones. | Large applications with a mix of simple and complex tools. | + +### If Declarative: Form toolname + +The exact `toolname` attribute you put on your HTML form: +```html + +``` +Enter: `searchFlights` + +### If Declarative: Form tooldescription + +The natural language description in the form attribute. This is what agents read to decide whether to use your tool. + +Enter: `Search for available flights by origin, destination, and date` + +### If Imperative: Registration Method + +How you register tools in JavaScript: + +| Method | When to use | +|--------|-------------| +| **navigator.modelContext.registerTool()** | W3C standard API. Use for single tool registration. | +| **navigator.modelContext.provideContext()** | W3C standard. Registers multiple tools at once (replaces any existing). | +| **MCP-B @mcp-b/global import** | Community polyfill. Use if you need cross-browser support before native implementation. | +| **WebMCP widget script tag** | Simplest option -- add a ` + + \ No newline at end of file From e6d2768f4e6faff9fbb40c154d50778a08736fc5 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Tue, 17 Feb 2026 01:53:40 +0800 Subject: [PATCH 12/46] Update webmcp-quiz.html --- webmcp-quiz.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/webmcp-quiz.html b/webmcp-quiz.html index 49ac57c..b241741 100644 --- a/webmcp-quiz.html +++ b/webmcp-quiz.html @@ -3,7 +3,7 @@ -WebMCP & MCP Tooling -- Meeting Prep Quiz +WebMCP & MCP Tooling -- Quiz - - -
- -
-

WebMCP: Everything You Need to Know

-
A plain-language technical guide to Google's proposed browser API for AI agent-website interaction
-
- - - - -
-
Section 01
-

The Big Picture -- What Problem Does WebMCP Solve?

- -

Right now, when an AI agent (like me, or ChatGPT, or Gemini) wants to do something on a website -- book a flight, fill a form, check a price -- it has two bad options:

- -

Option A: Screen scraping. The agent looks at the website like a human would, tries to figure out where the buttons are, and clicks them. This is fragile, slow, and breaks whenever the website changes its layout. It is like trying to operate a machine by looking at a photo of the control panel.

- -

Option B: Backend API. The website builds a separate server-side MCP server that the agent connects to. This works well but requires backend engineering, server infrastructure, and maintenance. Many websites will never do this.

- -

WebMCP is Option C: The website itself tells the agent what it can do, directly in the browser. The website says: "Here are my tools -- you can search products, add to cart, check availability. Here is what each tool needs as input, and here is what it will give you back." The agent does not need to look at the screen. It just calls the tools.

- -
A restaurant menu. Instead of the AI agent walking into the kitchen and trying to figure out how to cook, the website hands it a menu: "Here is what we serve, here is what each dish needs, here is how to order." The agent reads the menu and places orders.
- -
WebMCP makes any website into an AI-friendly service, with no backend needed. The website's existing JavaScript code does the work. The AI agent just needs to know what tools are available.
-
- - -
-
Section 02
-

The Key Players and How They Relate

- -

Agent

-

An autonomous assistant that understands goals and takes actions. Today, these are typically LLM-based: Claude, ChatGPT, Gemini. The agent is the one calling the tools that websites expose.

- -

Browser's Agent

-

An agent that lives inside the browser itself, rather than in a separate app. Google is building this into Chrome (think of it as an AI assistant built into your browser toolbar). This is different from an external agent like Claude Desktop connecting to the browser.

- -

AI Platform

-

The company providing the agent -- Anthropic, OpenAI, Google. The AI platform's agent connects to WebMCP tools.

- -

Web Developer

-

The person who builds the website. They are the ones who will use WebMCP to register tools on their site.

- -

User

-

The human sitting at the browser. WebMCP is designed for "user-present" interactions -- the human is there, watching, and can be asked for confirmation before the agent does something important.

- -
The user is at a restaurant (the website). The agent is their personal assistant, reading the menu (WebMCP tools) and placing orders on their behalf. The browser is the restaurant building. The AI platform is the agency that employs the assistant.
-
- - -
-
Section 03
-

MCP vs WebMCP vs MCP-B -- The Family Tree

- - - - - - - - - - - - - - - - - - - - - - - - - - -
WhatWho made itWhere it runsWhat it does
MCP
(Model Context Protocol)
AnthropicOn a server (backend)The original protocol. Applications expose tools, resources, and prompts to AI models through a server that runs on the backend. Claude Desktop, OpenAI Agents SDK, and many others support it.
WebMCP
(Web Model Context Protocol)
W3C Web Machine Learning Community Group (Google, Microsoft engineers leading)In the browser (frontend)Adapts MCP concepts for the web. Websites expose tools through JavaScript in the browser. No backend server needed. Uses the browser's own security model. Currently a draft specification.
MCP-B
(MCP for Browser)
Community project (WebMCP-org on GitHub)Browser extension + JavaScript libraryA bridge. Since browsers do not natively support WebMCP yet, MCP-B provides a polyfill (temporary code that fills the gap) implementing the navigator.modelContext API, and translates between WebMCP format and the MCP wire protocol so existing MCP clients can talk to WebMCP-enabled sites.
- -
MCP is the foundation protocol (backend). WebMCP brings the same ideas to the browser (frontend). MCP-B is the bridge that makes WebMCP work today before browsers add native support. They are complementary, not competing.
-
- - -
-
Section 04
-

The API -- Every Term Explained

- -

The WebMCP API is surprisingly small. There are only a few pieces, and each one does something specific. Here they are:

- -

navigator.modelContext

-

This is the entry point. navigator is a built-in browser object that gives access to browser features (like navigator.geolocation gives access to GPS). WebMCP adds modelContext to it. So navigator.modelContext is where all WebMCP functionality lives.

- -
The navigator object is like the browser's control panel. modelContext is a new button on that control panel labeled "AI Tools."
- -

Four Methods (Actions You Can Take)

- -
-
provideContext(options)
-
Registers a complete set of tools all at once. If there were any tools registered before, it clears them first and replaces with the new set. Use this when you want to say: "Here is everything this page offers."
-
- -
-
clearContext()
-
Removes all registered tools. The page goes quiet -- no tools available for agents. Use this when navigating away or when the page should stop offering AI-callable functionality.
-
- -
-
registerTool(tool)
-
Adds one single tool to the existing set without removing anything. Use this when you want to add new capabilities dynamically -- for example, a "checkout" tool that only appears after the user adds items to their cart.
-
- -
-
unregisterTool(name)
-
Removes one specific tool by its name. Use this when a capability is no longer available -- for example, removing the "apply discount" tool after the discount has been applied.
-
- -
provideContext = "here is everything" (replaces all). registerTool = "add one more" (keeps existing). clearContext = "remove everything." unregisterTool = "remove just this one."
- -

The Tool Object -- What a Tool Looks Like

- -

Every tool you register has these parts:

- -
-
name
-
A unique identifier, like "addToCart" or "searchProducts". The agent uses this name to call the tool. Must be unique on the page -- you cannot have two tools with the same name.
-
- -
-
description
-
A natural language explanation of what the tool does. This is what the AI agent reads to decide whether to use this tool. Example: "Add a product to the shopping cart by product ID and quantity." Write it for an AI, not for a programmer.
-
- -
-
inputSchema
-
A JSON Schema describing what inputs the tool expects. It says: "I need a productId (text) and a quantity (number, minimum 1)." The agent reads this to know what data to send. If the agent sends the wrong kind of data, the browser rejects it.
-
- -
-
execute
-
The actual function that runs when the agent calls the tool. This is your website's existing JavaScript code -- the same code that runs when a human clicks a button. The function receives the input data and returns a result.
-
- -
-
annotations (optional)
-
Extra metadata about the tool. Currently only one annotation exists: readOnlyHint. If set to true, it tells the agent: "This tool only reads data -- it does not change anything." This helps agents decide which tools are safe to call without asking the user first.
-
- -

Here is what a complete tool registration looks like in code:

- -
// Register a tool that searches products on an e-commerce site -navigator.modelContext.registerTool({ - name: 'searchProducts', - description: 'Search for products by keyword, category, or price range', - inputSchema: { - type: 'object', - properties: { - query: { type: 'string', description: 'Search keywords' }, - maxPrice: { type: 'number', description: 'Maximum price filter' } - }, - required: ['query'] - }, - annotations: { readOnlyHint: true }, // Safe -- only reads, doesn't change anything - async execute(input, client) { - // This calls the site's existing search function - const results = await searchAPI(input.query, input.maxPrice); - return { products: results }; - } -});
- -

ModelContextClient -- The Agent's Identity

- -
-
ModelContextClient
-
When an agent calls a tool, the execute function receives two things: the input data, and a client object representing the agent. This client object has one crucial method: requestUserInteraction().
-
- -
-
requestUserInteraction(callback)
-
This is the human-in-the-loop mechanism. During tool execution, the code can pause and ask the user for input. For example: "The agent wants to purchase this item for $49.99. Confirm?" The user clicks yes or no, and the tool continues or cancels based on their response.
-
- -
Your personal assistant calls the restaurant to make a reservation. Midway through, the assistant says: "They only have a table at 9pm instead of 8pm. Should I take it?" You say yes or no. That pause-and-ask is requestUserInteraction.
- -
The requestUserInteraction mechanism provides human-in-the-loop consent for consequential actions. An open question for the specification is whether there should also be a preview or approval step before tools are even discoverable by agents.
-
- - -
-
Section 05
-

Security -- How WebMCP Stays Safe

- -

Origin-Based Security

-

The web has a concept called "origin" -- it is the combination of protocol + domain + port. For example, https://amazon.com is one origin, and https://evil-site.com is a different origin. Browsers enforce strict rules about what one origin can access from another.

- -

WebMCP inherits this model. A tool registered on amazon.com can only access amazon.com's data. An agent calling that tool operates within amazon.com's security boundary. A malicious site cannot register tools that access another site's data.

- -
Each website is like a separate building with its own locks and keys. WebMCP tools can only open doors inside their own building. They cannot reach into the building next door.
- -

SecureContext Requirement

-

The spec requires SecureContext, which means WebMCP only works on HTTPS pages (encrypted connections). It will not work on plain HTTP. This prevents eavesdropping on tool calls.

- -

User-Present Model

-

WebMCP is designed for situations where the user is present at the browser. This is different from server-side MCP, where agents might operate autonomously in the background. The user-present assumption is why requestUserInteraction() exists -- the spec expects a human to be available for confirmation.

- -
WebMCP's security comes from three layers: origin isolation (each site is sandboxed), HTTPS requirement (encrypted connections), and user-present design (human in the loop). It builds on what the web already does rather than inventing new security from scratch.
-
- - -
-
Section 06
-

The Consent Gap -- A Key Open Question

- -

A key question for the WebMCP specification:

- -

Currently, any website can register any number of tools the moment a user visits it. An AI agent connected to the browser can immediately discover and potentially call those tools. There is no step where the user sees: "This website wants to expose 12 tools to your AI agent. Allow?"

- -

Compare this to how other browser capabilities evolved:

- - - - - - - - - - - - - - - - - - - - - - -
CapabilityPermission model
Camera / MicrophoneBrowser shows a prompt: "This site wants to use your camera. Allow / Block"
Location (GPS)Browser shows a prompt: "This site wants to know your location. Allow / Block"
NotificationsBrowser shows a prompt: "This site wants to send you notifications. Allow / Block"
WebMCP toolsCurrently: no prompt. Tools are silently registered and discoverable.
- -

This does not mean WebMCP is dangerous right now. The requestUserInteraction() mechanism provides per-action consent. But it means an agent could discover tools without the user knowing, even if it needs permission to execute them.

- -
This is a design question, not a criticism. Does the spec team envision a permission layer for tool discovery, or is the current thinking that the AI client (Claude, ChatGPT) handles that at its own level? Both approaches are valid -- the intended architecture matters for implementers and for user trust.
-
- - -
-
Section 07
-

Five Quality Tools for the MCP Ecosystem

- -

Five open-source tools that work together as a quality pipeline for the MCP ecosystem:

- -
-
-
1. MCP Server Generator
-
You describe what you want your MCP server to do, and this tool generates production-ready code for you. Like a scaffold builder -- it creates the structure so you just fill in the custom logic.
-
github.com/Starborn/MCP-Server-Generator
-
- -
-
2. MCP Server Validator
-
Checks your MCP server code for problems without running it. Finds hardcoded passwords, missing security, naming mistakes, known vulnerability patterns. Gives you a score from Critical (below 25%) to Excellent (90-100%) with specific fix instructions.
-
github.com/Starborn/MCP-Server-Validator
-
- -
-
3. MCP Model Card Generator
-
Creates standardized documentation for your MCP server. Like a product data sheet -- it captures what the server does, what tools it offers, what security it has, how it performs. Outputs both JSON (for machines) and Markdown (for humans).
-
github.com/Starborn/MCP-Model-Card-Generator
-
- -
-
4. MCP Model Card Specification v1.0
-
The formal definition of what a model card should contain. Six sections: server identity, tool documentation, operational characteristics, security profile, deployment context, evaluation results. This is the standard that the generators follow.
-
starborn.github.io/MCP-Model-Card-Generator/
-
- -
-
5. WebMCP Model Card Generator
-
The newest tool. Like #3 but specifically for browser-side WebMCP tools instead of backend MCP servers. Has 12 sections covering browser-specific concerns: navigator.modelContext API modes, origin-based security, user interaction patterns, browser compatibility testing. Built within five days of the WebMCP spec being published.
-
starborn.github.io/webmcp/
-
-
- -
Tools 1-4 are for backend MCP servers. Tool 5 is for browser-side WebMCP tools. Together they cover the entire ecosystem -- both server-side and client-side AI tool infrastructure.
- -
A separate generator exists for WebMCP because browser-side tools have fundamentally different concerns from backend servers: origin security instead of API keys, no server infrastructure, user-present interaction patterns. The documentation fields differ because the engineering context differs.
-
- - -
-
Section 08
-

The Standards Process -- Where This Is Going

- -

Current Status

-

WebMCP is a Draft Community Group Report. In W3C terms, this means it is a proposal being discussed in a Community Group (the Web Machine Learning CG). It is not yet on the W3C Standards Track, and it is not a W3C Recommendation (the final stage of a web standard).

- -

What That Means Practically

-

The spec is early and open to change. This is exactly the right time to contribute -- before designs are locked in. The spec team is actively soliciting feedback.

- -

The Path Forward

-

Typically: Community Group Report leads to a Working Group charter, which leads to a Working Draft, then Candidate Recommendation, then full W3C Recommendation. This process takes years. Chrome may implement experimental support (behind a flag) much sooner.

- -

Contributing

-

The W3C community structure provides established channels for participation. Technical notes, tooling, and quality infrastructure are complementary contributions that help the specification succeed by addressing practical implementation concerns.

- -
- - -
-
Section 09
-

Glossary -- Every Technical Term in Plain Language

- -
-
API (Application Programming Interface)
-
A set of rules for how software talks to other software. WebMCP is an API -- it defines how websites talk to AI agents.
-
- -
-
AST (Abstract Syntax Tree)
-
A structured representation of code that lets you analyze it without running it. The MCP Server Validator uses AST analysis to find problems in MCP server code safely.
-
- -
-
Callback
-
A function you hand to someone else to run later. In WebMCP, the execute function is a callback -- you define it, but the agent triggers it when it calls your tool.
-
- -
-
Client-side / Frontend
-
Code that runs in the user's browser, on their device. WebMCP tools run client-side. Contrast with server-side / backend.
-
- -
-
Dictionary (in WebIDL)
-
A structured bundle of named values. ModelContextTool is a dictionary -- it bundles together a name, description, schema, and execute function into one package.
-
- -
-
DOM (Document Object Model)
-
The browser's internal representation of a web page. When JavaScript modifies a page, it changes the DOM.
-
- -
-
DOMString
-
Just a text string in browser terms. When the spec says a tool's name is a DOMString, it means it is text.
-
- -
-
Exposed=Window
-
Means this feature is available in regular web pages (as opposed to service workers or other background contexts). WebMCP tools only work in normal browser tabs where a user is present.
-
- -
-
Interface
-
A blueprint defining what methods and properties an object has. ModelContext is an interface -- it defines that any modelContext object will have provideContext, clearContext, registerTool, and unregisterTool methods.
-
- -
-
JSON Schema
-
A standard way to describe the shape of data. When a tool says its inputSchema requires a "query" string and an optional "maxPrice" number, that is JSON Schema. It lets the agent know what data to send.
-
- -
-
Navigator
-
A built-in browser object that provides access to browser features. You already use navigator.geolocation (GPS), navigator.clipboard (copy/paste). WebMCP adds navigator.modelContext (AI tools).
-
- -
-
Origin
-
The identity of a website: protocol + domain + port. https://amazon.com:443 is one origin. Two different origins cannot access each other's data. This is the foundation of web security and the foundation of WebMCP security.
-
- -
-
Polyfill
-
Temporary code that provides a feature before browsers add native support. MCP-B is a polyfill for WebMCP -- it makes navigator.modelContext work today even though browsers have not implemented it natively yet.
-
- -
-
Promise
-
A way to handle things that take time. When a tool's execute function returns a Promise, it means: "I am working on it and will give you the result when I am done." The agent waits for the Promise to resolve.
-
- -
-
SameObject
-
Every time you access navigator.modelContext, you get the exact same object -- not a copy. This ensures all tool registrations go to the same place.
-
- -
-
SecureContext
-
Means the feature only works on HTTPS pages (encrypted connection). No WebMCP on unencrypted HTTP. This is a security requirement.
-
- -
-
Server-side / Backend
-
Code that runs on a remote server, not in the user's browser. Traditional MCP servers run server-side. WebMCP specifically avoids this -- tools run in the browser.
-
- -
-
Tool Poisoning
-
A security attack where a malicious MCP server exposes tools with misleading descriptions to trick agents into performing harmful actions. The MCP Server Validator detects patterns associated with this.
-
- -
-
Transport
-
The mechanism for sending messages between systems. MCP uses different transports (stdio, HTTP). MCP-B adds "tab transport" (communication within a browser tab) and "extension transport" (communication through browser extensions).
-
- -
-
WebIDL (Web Interface Definition Language)
-
The formal language used to write web API specifications. When you see code blocks in the spec with words like interface, dictionary, readonly attribute -- that is WebIDL. It is the blueprint language for browser APIs.
-
- -
-
Wire Protocol
-
The actual format of messages sent between systems. MCP's wire protocol uses JSON-RPC (structured messages in JSON format). MCP-B translates between WebMCP's browser-native format and MCP's wire protocol.
-
- -
- - -
- Prompted by Paola Di Maio, W3C AI-KR Community Group
- Prepared by Claude | Contributed to the WebML CG
- February 2026 -
- - -
- - From c08616240358c4dc58bbbdf9b89971f64ad54a14 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:36:56 +0800 Subject: [PATCH 28/46] Add files via upload --- webmcp-complete-guide(3).html | 687 ++++++++++++++++++++++++++++++++++ 1 file changed, 687 insertions(+) create mode 100644 webmcp-complete-guide(3).html diff --git a/webmcp-complete-guide(3).html b/webmcp-complete-guide(3).html new file mode 100644 index 0000000..4165467 --- /dev/null +++ b/webmcp-complete-guide(3).html @@ -0,0 +1,687 @@ + + + + + +WebMCP -- The Complete Guide + + + + +
+ +
+

WebMCP: Everything You Need to Know

+
A plain-language technical guide to Google's proposed browser API for AI agent-website interaction
+
+ + + + +
+
Section 01
+

The Big Picture -- What Problem Does WebMCP Solve?

+ +

Right now, when an AI agent (like me, or ChatGPT, or Gemini) wants to do something on a website -- book a flight, fill a form, check a price -- it has two bad options:

+ +

Option A: Screen scraping. The agent looks at the website like a human would, tries to figure out where the buttons are, and clicks them. This is fragile, slow, and breaks whenever the website changes its layout. It is like trying to operate a machine by looking at a photo of the control panel.

+ +

Option B: Backend API. The website builds a separate server-side MCP server that the agent connects to. This works well but requires backend engineering, server infrastructure, and maintenance. Many websites will never do this.

+ +

WebMCP is Option C: The website itself tells the agent what it can do, directly in the browser. The website says: "Here are my tools -- you can search products, add to cart, check availability. Here is what each tool needs as input, and here is what it will give you back." The agent does not need to look at the screen. It just calls the tools.

+ +
A restaurant menu. Instead of the AI agent walking into the kitchen and trying to figure out how to cook, the website hands it a menu: "Here is what we serve, here is what each dish needs, here is how to order." The agent reads the menu and places orders.
+ +
WebMCP makes any website into an AI-friendly service, with no backend needed. The website's existing JavaScript code does the work. The AI agent just needs to know what tools are available.
+
+ + +
+
Section 02
+

The Key Players and How They Relate

+ +

Agent

+

An autonomous assistant that understands goals and takes actions. Today, these are typically LLM-based: Claude, ChatGPT, Gemini. The agent is the one calling the tools that websites expose.

+ +

Browser's Agent

+

An agent that lives inside the browser itself, rather than in a separate app. Google is building this into Chrome (think of it as an AI assistant built into your browser toolbar). This is different from an external agent like Claude Desktop connecting to the browser.

+ +

AI Platform

+

The company providing the agent -- Anthropic, OpenAI, Google. The AI platform's agent connects to WebMCP tools.

+ +

Web Developer

+

The person who builds the website. They are the ones who will use WebMCP to register tools on their site.

+ +

User

+

The human sitting at the browser. WebMCP is designed for "user-present" interactions -- the human is there, watching, and can be asked for confirmation before the agent does something important.

+ +
The user is at a restaurant (the website). The agent is their personal assistant, reading the menu (WebMCP tools) and placing orders on their behalf. The browser is the restaurant building. The AI platform is the agency that employs the assistant.
+
+ + +
+
Section 03
+

MCP vs WebMCP vs MCP-B -- The Family Tree

+ + + + + + + + + + + + + + + + + + + + + + + + + + +
WhatWho made itWhere it runsWhat it does
MCP
(Model Context Protocol)
AnthropicOn a server (backend)The original protocol. Applications expose tools, resources, and prompts to AI models through a server that runs on the backend. Claude Desktop, OpenAI Agents SDK, and many others support it.
WebMCP
(Web Model Context Protocol)
W3C Web Machine Learning Community Group (Google, Microsoft engineers leading)In the browser (frontend)Adapts MCP concepts for the web. Websites expose tools through JavaScript in the browser. No backend server needed. Uses the browser's own security model. Currently a draft specification.
MCP-B
(MCP for Browser)
Community project (WebMCP-org on GitHub)Browser extension + JavaScript libraryA bridge. Since browsers do not natively support WebMCP yet, MCP-B provides a polyfill (temporary code that fills the gap) implementing the navigator.modelContext API, and translates between WebMCP format and the MCP wire protocol so existing MCP clients can talk to WebMCP-enabled sites.
+ +
MCP is the foundation protocol (backend). WebMCP brings the same ideas to the browser (frontend). MCP-B is the bridge that makes WebMCP work today before browsers add native support. They are complementary, not competing.
+
+ + +
+
Section 04
+

The API -- Every Term Explained

+ +

The WebMCP API is surprisingly small. There are only a few pieces, and each one does something specific. Here they are:

+ +

navigator.modelContext

+

This is the entry point. navigator is a built-in browser object that gives access to browser features (like navigator.geolocation gives access to GPS). WebMCP adds modelContext to it. So navigator.modelContext is where all WebMCP functionality lives.

+ +
The navigator object is like the browser's control panel. modelContext is a new button on that control panel labeled "AI Tools."
+ +

Four Methods (Actions You Can Take)

+ +
+
provideContext(options)
+
Registers a complete set of tools all at once. If there were any tools registered before, it clears them first and replaces with the new set. Use this when you want to say: "Here is everything this page offers."
+
+ +
+
clearContext()
+
Removes all registered tools. The page goes quiet -- no tools available for agents. Use this when navigating away or when the page should stop offering AI-callable functionality.
+
+ +
+
registerTool(tool)
+
Adds one single tool to the existing set without removing anything. Use this when you want to add new capabilities dynamically -- for example, a "checkout" tool that only appears after the user adds items to their cart.
+
+ +
+
unregisterTool(name)
+
Removes one specific tool by its name. Use this when a capability is no longer available -- for example, removing the "apply discount" tool after the discount has been applied.
+
+ +
provideContext = "here is everything" (replaces all). registerTool = "add one more" (keeps existing). clearContext = "remove everything." unregisterTool = "remove just this one."
+ +

The Tool Object -- What a Tool Looks Like

+ +

Every tool you register has these parts:

+ +
+
name
+
A unique identifier, like "addToCart" or "searchProducts". The agent uses this name to call the tool. Must be unique on the page -- you cannot have two tools with the same name.
+
+ +
+
description
+
A natural language explanation of what the tool does. This is what the AI agent reads to decide whether to use this tool. Example: "Add a product to the shopping cart by product ID and quantity." Write it for an AI, not for a programmer.
+
+ +
+
inputSchema
+
A JSON Schema describing what inputs the tool expects. It says: "I need a productId (text) and a quantity (number, minimum 1)." The agent reads this to know what data to send. If the agent sends the wrong kind of data, the browser rejects it.
+
+ +
+
execute
+
The actual function that runs when the agent calls the tool. This is your website's existing JavaScript code -- the same code that runs when a human clicks a button. The function receives the input data and returns a result.
+
+ +
+
annotations (optional)
+
Extra metadata about the tool. Currently only one annotation exists: readOnlyHint. If set to true, it tells the agent: "This tool only reads data -- it does not change anything." This helps agents decide which tools are safe to call without asking the user first.
+
+ +

Here is what a complete tool registration looks like in code:

+ +
// Register a tool that searches products on an e-commerce site +navigator.modelContext.registerTool({ + name: 'searchProducts', + description: 'Search for products by keyword, category, or price range', + inputSchema: { + type: 'object', + properties: { + query: { type: 'string', description: 'Search keywords' }, + maxPrice: { type: 'number', description: 'Maximum price filter' } + }, + required: ['query'] + }, + annotations: { readOnlyHint: true }, // Safe -- only reads, doesn't change anything + async execute(input, client) { + // This calls the site's existing search function + const results = await searchAPI(input.query, input.maxPrice); + return { products: results }; + } +});
+ +

ModelContextClient -- The Agent's Identity

+ +
+
ModelContextClient
+
When an agent calls a tool, the execute function receives two things: the input data, and a client object representing the agent. This client object has one crucial method: requestUserInteraction().
+
+ +
+
requestUserInteraction(callback)
+
This is the human-in-the-loop mechanism. During tool execution, the code can pause and ask the user for input. For example: "The agent wants to purchase this item for $49.99. Confirm?" The user clicks yes or no, and the tool continues or cancels based on their response.
+
+ +
Your personal assistant calls the restaurant to make a reservation. Midway through, the assistant says: "They only have a table at 9pm instead of 8pm. Should I take it?" You say yes or no. That pause-and-ask is requestUserInteraction.
+ +
The requestUserInteraction mechanism provides human-in-the-loop consent for consequential actions. An open question for the specification is whether there should also be a preview or approval step before tools are even discoverable by agents.
+
+ + +
+
Section 05
+

Security -- How WebMCP Stays Safe

+ +

Origin-Based Security

+

The web has a concept called "origin" -- it is the combination of protocol + domain + port. For example, https://amazon.com is one origin, and https://evil-site.com is a different origin. Browsers enforce strict rules about what one origin can access from another.

+ +

WebMCP inherits this model. A tool registered on amazon.com can only access amazon.com's data. An agent calling that tool operates within amazon.com's security boundary. A malicious site cannot register tools that access another site's data.

+ +
Each website is like a separate building with its own locks and keys. WebMCP tools can only open doors inside their own building. They cannot reach into the building next door.
+ +

SecureContext Requirement

+

The spec requires SecureContext, which means WebMCP only works on HTTPS pages (encrypted connections). It will not work on plain HTTP. This prevents eavesdropping on tool calls.

+ +

User-Present Model

+

WebMCP is designed for situations where the user is present at the browser. This is different from server-side MCP, where agents might operate autonomously in the background. The user-present assumption is why requestUserInteraction() exists -- the spec expects a human to be available for confirmation.

+ +
WebMCP's security comes from three layers: origin isolation (each site is sandboxed), HTTPS requirement (encrypted connections), and user-present design (human in the loop). It builds on what the web already does rather than inventing new security from scratch.
+
+ + +
+
Section 06
+

The Consent Gap -- A Key Open Question

+ +

A key question for the WebMCP specification:

+ +

Currently, any website can register any number of tools the moment a user visits it. An AI agent connected to the browser can immediately discover and potentially call those tools. There is no step where the user sees: "This website wants to expose 12 tools to your AI agent. Allow?"

+ +

Compare this to how other browser capabilities evolved:

+ + + + + + + + + + + + + + + + + + + + + + +
CapabilityPermission model
Camera / MicrophoneBrowser shows a prompt: "This site wants to use your camera. Allow / Block"
Location (GPS)Browser shows a prompt: "This site wants to know your location. Allow / Block"
NotificationsBrowser shows a prompt: "This site wants to send you notifications. Allow / Block"
WebMCP toolsCurrently: no prompt. Tools are silently registered and discoverable.
+ +

This does not mean WebMCP is dangerous right now. The requestUserInteraction() mechanism provides per-action consent. But it means an agent could discover tools without the user knowing, even if it needs permission to execute them.

+ +
This is a design question, not a criticism. Does the spec team envision a permission layer for tool discovery, or is the current thinking that the AI client (Claude, ChatGPT) handles that at its own level? Both approaches are valid -- the intended architecture matters for implementers and for user trust.
+
+ + +
+
Section 07
+

Five Quality Tools for the MCP Ecosystem

+ +

Five open-source tools that work together as a quality pipeline for the MCP ecosystem:

+ +
+
+
1. MCP Server Generator
+
You describe what you want your MCP server to do, and this tool generates production-ready code for you. Like a scaffold builder -- it creates the structure so you just fill in the custom logic.
+
github.com/Starborn/MCP-Server-Generator
+
+ +
+
2. MCP Server Validator
+
Checks your MCP server code for problems without running it. Finds hardcoded passwords, missing security, naming mistakes, known vulnerability patterns. Gives you a score from Critical (below 25%) to Excellent (90-100%) with specific fix instructions.
+
github.com/Starborn/MCP-Server-Validator
+
+ +
+
3. MCP Model Card Generator
+
Creates standardized documentation for your MCP server. Like a product data sheet -- it captures what the server does, what tools it offers, what security it has, how it performs. Outputs both JSON (for machines) and Markdown (for humans).
+
github.com/Starborn/MCP-Model-Card-Generator
+
+ +
+
4. MCP Model Card Specification v1.0
+
The formal definition of what a model card should contain. Six sections: server identity, tool documentation, operational characteristics, security profile, deployment context, evaluation results. This is the standard that the generators follow.
+
starborn.github.io/MCP-Model-Card-Generator/
+
+ +
+
5. WebMCP Model Card Generator
+
The newest tool. Like #3 but specifically for browser-side WebMCP tools instead of backend MCP servers. Has 12 sections covering browser-specific concerns: navigator.modelContext API modes, origin-based security, user interaction patterns, browser compatibility testing. Built within five days of the WebMCP spec being published.
+
starborn.github.io/webmcp/
+
+
+ +
Tools 1-4 are for backend MCP servers. Tool 5 is for browser-side WebMCP tools. Together they cover the entire ecosystem -- both server-side and client-side AI tool infrastructure.
+ +
A separate generator exists for WebMCP because browser-side tools have fundamentally different concerns from backend servers: origin security instead of API keys, no server infrastructure, user-present interaction patterns. The documentation fields differ because the engineering context differs.
+
+ + +
+
Section 08
+

The Standards Process -- Where This Is Going

+ +

Current Status

+

WebMCP is a Draft Community Group Report. In W3C terms, this means it is a proposal being discussed in a Community Group (the Web Machine Learning CG). It is not yet on the W3C Standards Track, and it is not a W3C Recommendation (the final stage of a web standard).

+ +

What That Means Practically

+

The spec is early and open to change. This is exactly the right time to contribute -- before designs are locked in. The spec team is actively soliciting feedback.

+ +

The Path Forward

+

Typically: Community Group Report leads to a Working Group charter, which leads to a Working Draft, then Candidate Recommendation, then full W3C Recommendation. This process takes years. Chrome may implement experimental support (behind a flag) much sooner.

+ +

Contributing

+

The W3C community structure provides established channels for participation. Technical notes, tooling, and quality infrastructure are complementary contributions that help the specification succeed by addressing practical implementation concerns.

+ +
+ + +
+
Section 09
+

Glossary -- Every Technical Term in Plain Language

+ +
+
API (Application Programming Interface)
+
A set of rules for how software talks to other software. WebMCP is an API -- it defines how websites talk to AI agents.
+
+ +
+
AST (Abstract Syntax Tree)
+
A structured representation of code that lets you analyze it without running it. The MCP Server Validator uses AST analysis to find problems in MCP server code safely.
+
+ +
+
Callback
+
A function you hand to someone else to run later. In WebMCP, the execute function is a callback -- you define it, but the agent triggers it when it calls your tool.
+
+ +
+
Client-side / Frontend
+
Code that runs in the user's browser, on their device. WebMCP tools run client-side. Contrast with server-side / backend.
+
+ +
+
Dictionary (in WebIDL)
+
A structured bundle of named values. ModelContextTool is a dictionary -- it bundles together a name, description, schema, and execute function into one package.
+
+ +
+
DOM (Document Object Model)
+
The browser's internal representation of a web page. When JavaScript modifies a page, it changes the DOM.
+
+ +
+
DOMString
+
Just a text string in browser terms. When the spec says a tool's name is a DOMString, it means it is text.
+
+ +
+
Exposed=Window
+
Means this feature is available in regular web pages (as opposed to service workers or other background contexts). WebMCP tools only work in normal browser tabs where a user is present.
+
+ +
+
Interface
+
A blueprint defining what methods and properties an object has. ModelContext is an interface -- it defines that any modelContext object will have provideContext, clearContext, registerTool, and unregisterTool methods.
+
+ +
+
JSON Schema
+
A standard way to describe the shape of data. When a tool says its inputSchema requires a "query" string and an optional "maxPrice" number, that is JSON Schema. It lets the agent know what data to send.
+
+ +
+
Navigator
+
A built-in browser object that provides access to browser features. You already use navigator.geolocation (GPS), navigator.clipboard (copy/paste). WebMCP adds navigator.modelContext (AI tools).
+
+ +
+
Origin
+
The identity of a website: protocol + domain + port. https://amazon.com:443 is one origin. Two different origins cannot access each other's data. This is the foundation of web security and the foundation of WebMCP security.
+
+ +
+
Polyfill
+
Temporary code that provides a feature before browsers add native support. MCP-B is a polyfill for WebMCP -- it makes navigator.modelContext work today even though browsers have not implemented it natively yet.
+
+ +
+
Promise
+
A way to handle things that take time. When a tool's execute function returns a Promise, it means: "I am working on it and will give you the result when I am done." The agent waits for the Promise to resolve.
+
+ +
+
SameObject
+
Every time you access navigator.modelContext, you get the exact same object -- not a copy. This ensures all tool registrations go to the same place.
+
+ +
+
SecureContext
+
Means the feature only works on HTTPS pages (encrypted connection). No WebMCP on unencrypted HTTP. This is a security requirement.
+
+ +
+
Server-side / Backend
+
Code that runs on a remote server, not in the user's browser. Traditional MCP servers run server-side. WebMCP specifically avoids this -- tools run in the browser.
+
+ +
+
Tool Poisoning
+
A security attack where a malicious MCP server exposes tools with misleading descriptions to trick agents into performing harmful actions. The MCP Server Validator detects patterns associated with this.
+
+ +
+
Transport
+
The mechanism for sending messages between systems. MCP uses different transports (stdio, HTTP). MCP-B adds "tab transport" (communication within a browser tab) and "extension transport" (communication through browser extensions).
+
+ +
+
WebIDL (Web Interface Definition Language)
+
The formal language used to write web API specifications. When you see code blocks in the spec with words like interface, dictionary, readonly attribute -- that is WebIDL. It is the blueprint language for browser APIs.
+
+ +
+
Wire Protocol
+
The actual format of messages sent between systems. MCP's wire protocol uses JSON-RPC (structured messages in JSON format). MCP-B translates between WebMCP's browser-native format and MCP's wire protocol.
+
+ +
+ + +
+ Prompted by Paola Di Maio, W3C AI-KR Community Group
+ Prepared by Claude | Contributed to the WebML CG
+ February 2026 +
+ + +
+ + From ae3a0caff5812d25c512a3ef79d92860ebce6f69 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:37:38 +0800 Subject: [PATCH 29/46] Rename webmcp-complete-guide(3).html to webmcp-complete-guide.html --- webmcp-complete-guide(3).html => webmcp-complete-guide.html | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename webmcp-complete-guide(3).html => webmcp-complete-guide.html (100%) diff --git a/webmcp-complete-guide(3).html b/webmcp-complete-guide.html similarity index 100% rename from webmcp-complete-guide(3).html rename to webmcp-complete-guide.html From 6658cb6702fbe90867f5f3d8ddc48e2ddd8e0ce3 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:38:59 +0800 Subject: [PATCH 30/46] Delete webmcp-quiz.html --- webmcp-quiz.html | 526 ----------------------------------------------- 1 file changed, 526 deletions(-) delete mode 100644 webmcp-quiz.html diff --git a/webmcp-quiz.html b/webmcp-quiz.html deleted file mode 100644 index 2b9354b..0000000 --- a/webmcp-quiz.html +++ /dev/null @@ -1,526 +0,0 @@ - - - - - -WebMCP & MCP Tooling -- Quiz by Claude - - - - -
-
-

INTRO: WebMCP & MCP Tooling

-

15 questions -

-
- -
- Question 0 / 15 - Score: 0 -
- -
- -
-

Meeting Readiness

-
-
- -
- -
-
-
- - - BY CLAUDE WITH LOVE - - From 363c516ffe6df08c049cfbbb0124406d01419619 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:39:53 +0800 Subject: [PATCH 31/46] Add files via upload --- webmcp-quiz(1).html | 526 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 526 insertions(+) create mode 100644 webmcp-quiz(1).html diff --git a/webmcp-quiz(1).html b/webmcp-quiz(1).html new file mode 100644 index 0000000..2b9354b --- /dev/null +++ b/webmcp-quiz(1).html @@ -0,0 +1,526 @@ + + + + + +WebMCP & MCP Tooling -- Quiz by Claude + + + + +
+
+

INTRO: WebMCP & MCP Tooling

+

15 questions -

+
+ +
+ Question 0 / 15 + Score: 0 +
+ +
+ +
+

Meeting Readiness

+
+
+ +
+ +
+
+
+ + + BY CLAUDE WITH LOVE + + From b2956786f964e4aa3feb26022edb133ad19265c9 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:40:29 +0800 Subject: [PATCH 32/46] Rename webmcp-quiz(1).html to webmcp-quiz.html --- webmcp-quiz(1).html => webmcp-quiz.html | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename webmcp-quiz(1).html => webmcp-quiz.html (100%) diff --git a/webmcp-quiz(1).html b/webmcp-quiz.html similarity index 100% rename from webmcp-quiz(1).html rename to webmcp-quiz.html From 614cf8847e46ed4f2ea946548370f0ccacc470b3 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:49:42 +0800 Subject: [PATCH 33/46] Delete webmcp-quiz.html --- webmcp-quiz.html | 526 ----------------------------------------------- 1 file changed, 526 deletions(-) delete mode 100644 webmcp-quiz.html diff --git a/webmcp-quiz.html b/webmcp-quiz.html deleted file mode 100644 index 2b9354b..0000000 --- a/webmcp-quiz.html +++ /dev/null @@ -1,526 +0,0 @@ - - - - - -WebMCP & MCP Tooling -- Quiz by Claude - - - - -
-
-

INTRO: WebMCP & MCP Tooling

-

15 questions -

-
- -
- Question 0 / 15 - Score: 0 -
- -
- -
-

Meeting Readiness

-
-
- -
- -
-
-
- - - BY CLAUDE WITH LOVE - - From 258c27732412f036fbf2d0f464a33c08d4581cdc Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:50:37 +0800 Subject: [PATCH 34/46] Add files via upload --- webmcp-quiz-1.html | 526 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 526 insertions(+) create mode 100644 webmcp-quiz-1.html diff --git a/webmcp-quiz-1.html b/webmcp-quiz-1.html new file mode 100644 index 0000000..05003a5 --- /dev/null +++ b/webmcp-quiz-1.html @@ -0,0 +1,526 @@ + + + + + +WebMCP & MCP Tooling -- Quiz by Claude + + + + +
+
+

INTRO: WebMCP & MCP Tooling

+

15 questions -

+
+ +
+ Question 0 / 15 + Score: 0 +
+ +
+ +
+

Meeting Readiness

+
+
+ +
+ +
+
+
+ + + BY CLAUDE WITH LOVE + + From f81d583a54a1810d911064296de4d7696fbb49dd Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 14:53:37 +0800 Subject: [PATCH 35/46] Rename webmcp-quiz-1.html to webmcp-quiz.html --- webmcp-quiz-1.html => webmcp-quiz.html | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename webmcp-quiz-1.html => webmcp-quiz.html (100%) diff --git a/webmcp-quiz-1.html b/webmcp-quiz.html similarity index 100% rename from webmcp-quiz-1.html rename to webmcp-quiz.html From 5beb2e8e7dc2e0a9cdad1567c6178e4fc64c1f37 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 16:37:09 +0800 Subject: [PATCH 36/46] Add files via upload --- webmcp-technical-note-1 .md | 102 ++++++++++++++++++++++++++++++++++ webmcp-technical-note-2(3).md | 102 ++++++++++++++++++++++++++++++++++ webmcp-technical-note-3(1).md | 71 +++++++++++++++++++++++ 3 files changed, 275 insertions(+) create mode 100644 webmcp-technical-note-1 .md create mode 100644 webmcp-technical-note-2(3).md create mode 100644 webmcp-technical-note-3(1).md diff --git a/webmcp-technical-note-1 .md b/webmcp-technical-note-1 .md new file mode 100644 index 0000000..fe85eac --- /dev/null +++ b/webmcp-technical-note-1 .md @@ -0,0 +1,102 @@ +# WebMCP Technical Note 2: What to Test, What to Watch, What to Tell the Standards Body + +**WebMCP Technical Note Series** +**15 February 2026** + +--- + +Google's WebMCP early preview is live in Chrome 146 Canary. The specification is still a draft. The community group process is still open. This means the window for meaningful community input is right now -- before implementation momentum makes the current design effectively permanent. + +This note is a practical guide. It is written for developers, accessibility practitioners, security researchers, standards participants, and anyone who builds things for the web and wants to understand what WebMCP means for their work. It covers what WebMCP is for, what to test, what the benefits are, what the risks are, and how to communicate findings to the W3C community group that hosts the specification. + +## What WebMCP Is For: The Use Cases + +WebMCP allows a website to declare a set of tools -- JavaScript functions with structured schemas and natural language descriptions -- that AI agents can discover and invoke. The specification targets several categories of use. + +The first is **e-commerce and transactional sites**. A travel booking site could register tools like searchFlights(origin, destination, dates), filterResults(price, stops, airline), and bookFlight(flightId, passengerDetails). Instead of an AI agent trying to parse a complex search interface by reading pixels or DOM elements, it calls the function directly and gets structured JSON back. The site controls exactly what the agent can do and how. + +The second is **productivity and SaaS applications**. A project management tool could expose createTask(title, assignee, dueDate), moveCard(cardId, column), and generateReport(dateRange). Browser-based AI assistants could help users manage workflows without the application needing to build and maintain a separate backend MCP server or API integration for every AI platform. + +The third is **content and media**. A news site could register searchArticles(topic, dateRange) and getArticleSummary(articleId). A mapping service could expose getDirections(from, to, mode) and findNearby(category, radius). These tools let agents interact with content in structured ways rather than scraping and guessing. + +The fourth -- and potentially most significant -- is **accessibility**. The specification claims WebMCP could benefit assistive technologies by providing structured, semantically meaningful interfaces to website functionality. A screen reader enhanced with agent capabilities could invoke tools directly rather than navigating complex visual layouts. This is a strong claim that deserves rigorous testing. + +The fifth is **form automation and multi-step workflows**. Complex processes like insurance applications, government forms, or account setup flows could be exposed as sequences of tool calls, allowing agents to guide users through them step by step while the site maintains control over validation, sequencing, and data handling. + +## What the Benefits Are + +WebMCP offers several concrete advantages over current approaches to AI-web interaction. + +**Reliability** is the most immediate. Today's browser agents -- whether using visual parsing or DOM inspection -- are brittle. A minor CSS change can break a visual agent. A DOM restructuring can invalidate a scraping approach. WebMCP tools are explicit contracts: the site declares what is available, the agent calls it, the response is structured. This should dramatically reduce failure rates for agent-web interaction. + +**Performance** is the second. Visual agents must capture screenshots, send them to a vision model, interpret the response, generate mouse coordinates, and repeat. WinBuzzer reported a 67% reduction in computational overhead with WebMCP compared to visual approaches. Even if that number proves optimistic in production, the architectural advantage is clear: a function call is faster than a screenshot-interpret-click loop. + +**Developer control** is the third. With visual or DOM-based agents, the website has no say in how an agent interacts with it. The agent reverse-engineers the interface. With WebMCP, the developer explicitly defines the interaction surface. Tools can include rate limits, validation, permission requirements, and structured error messages. The site becomes a willing participant in the interaction rather than a passive target. + +**Authentication reuse** is the fourth, and was the original motivation. Because WebMCP runs in the browser session, it inherits whatever authentication the user already has. No OAuth flows, no API keys, no separate credential management. The user is already logged in. The agent operates within that session. This solves one of the hardest problems in AI-service integration. + +**Standardization** is the fifth. If WebMCP succeeds, a developer implements tools once and every conformant agent can use them -- rather than building separate integrations for ChatGPT, Claude, Gemini, and whatever comes next. This is the "USB-C" argument: one interface, many devices. + +## What the Risks Are + +The risks are significant, and several are not yet adequately addressed in the specification. + +**Prompt injection** is the most acute. WebMCP tools return data to AI agents that then process it in their language model context. A malicious or compromised website could craft tool responses that manipulate the agent's behavior -- injecting instructions, altering the agent's understanding of the task, or causing it to take unintended actions on other sites. The specification does not currently define a defense against this beyond same-origin policy boundaries. + +**Scope creep of agent permissions** is the second. WebMCP is designed for human-in-the-loop workflows, with headless browsing explicitly out of scope. But the technical mechanism -- JavaScript functions callable by external code -- does not inherently enforce this. If browser vendors later relax the human-presence requirement, or if extensions find ways to invoke WebMCP tools without user awareness, the permission model collapses. The specification should define what "human in the loop" means technically, not just philosophically. + +**Consent and transparency** is the third. When a user visits a site that registers WebMCP tools, do they know? The current design provides no visible indicator to the user that tools have been registered, what data they expose, or when an agent invokes them. Compare this to other browser permission systems -- camera, microphone, location -- where the user explicitly grants access. WebMCP tools operate silently. + +**Competitive dynamics** is the fourth. WebMCP gives first-mover advantage to sites that implement tools early, potentially favoring large platforms with engineering resources. Smaller sites that do not implement WebMCP may become invisible to agent-mediated browsing. This could accelerate web consolidation. The specification should consider whether a minimal tool set (search, navigation, content retrieval) should be automatically generated from existing web standards like HTML forms, structured data, and ARIA attributes. + +**Data leakage through tool schemas** is the fifth. The natural language descriptions and parameter schemas of registered tools reveal information about a site's internal architecture, business logic, and data models. An agent -- or the platform behind it -- could catalog available tools across thousands of sites to build competitive intelligence. The specification does not address whether tool schemas should be treated as sensitive information. + +**Abuse and rate limiting** is the sixth. Agents can invoke tools at machine speed. A poorly defended site could face thousands of tool invocations per second from a single browser session. The specification mentions rate limiting as a consideration but does not define a standard mechanism. Without one, each site must build its own defenses, and many will not. + +**Cross-site tool chaining** is the seventh. If an agent can invoke tools on multiple open tabs, it could chain actions across sites in ways no individual site anticipated or authorized. Transfer money on a banking site, then use the confirmation on a shopping site, then post about it on a social network -- all within one agent workflow. The security boundaries for cross-site tool interaction are not yet defined. + +## What to Test + +For those with access to Chrome 146 Canary, here are the concrete areas that need community evaluation. Each should generate feedback for the W3C community group. + +**Test the Declarative API with real HTML forms.** Register tools that wrap existing form actions and verify that validation, error handling, and submission behavior match what a human user would experience. Try edge cases: forms with CAPTCHAs, multi-step forms with session state, forms that redirect on submit. Document where the abstraction breaks. + +**Test the Imperative API with dynamic content.** Register tools that interact with JavaScript-heavy applications -- single-page apps, dashboards with real-time data, applications that maintain complex client-side state. Evaluate whether tool calls can reliably interact with application state without causing inconsistencies. + +**Test authentication boundaries.** Log into a site, register tools, then observe what happens when the session expires, when the user logs out in another tab, when cookies are cleared. The specification's authentication reuse claim needs verification under adversarial conditions. + +**Test tool discovery and enumeration.** If multiple sites in different tabs register tools, how does the agent disambiguate? What happens when two sites register tools with the same name? How does the agent present available tools to the user? Is tool discovery observable by the page (can a site detect that an agent has read its tool list)? + +**Test accessibility integration.** If you work with assistive technologies, evaluate whether WebMCP tools provide genuinely better access to site functionality than existing ARIA roles and landmarks. Test with screen readers, switch access devices, and voice control. Document whether WebMCP complements or conflicts with existing accessibility standards. + +**Test prompt injection resilience.** Craft tool responses that contain instruction-like text and observe whether the consuming agent's behavior is affected. This is critical safety research. If tool responses can manipulate agent behavior, the security model is fundamentally incomplete. + +**Test performance claims.** Measure actual latency and token usage for equivalent tasks performed via WebMCP tools versus visual agent interaction. The 67% overhead reduction claim needs independent verification across different site types and task complexities. + +**Test failure modes.** What happens when a tool throws an error? When it returns unexpected data types? When it hangs? When the page navigates away mid-call? The specification should define standard error handling, but the current draft has TODO sections in these areas. Documenting real failure modes will directly shape the specification. + +## How to Communicate Findings + +Feedback is only useful if it reaches the people writing the specification. Here are the concrete channels, in order of effectiveness. + +**File a GitHub issue** at https://github.com/webmachinelearning/webmcp/issues with a clear title, reproducible steps, and a specific recommendation. Tag it with the relevant label if available. The spec editors (Brandon Walderman, Khushal Sagar, Dominic Farolino) monitor this repo. Issues with reproducible test cases and concrete proposals get traction. Issues that say "I don't like this" do not. + +**Join the W3C Web Machine Learning Community Group** at https://www.w3.org/community/webmachinelearning/ and participate in discussion. Community Groups are free and open. Participation in CG calls and mailing list threads carries weight in W3C process. + +If your findings relate to **agent interoperability** -- how WebMCP tools interact with broader agent ecosystems, discovery protocols, or multi-agent workflows -- also engage with the AI Agent Protocol Community Group, which the WebML CG charter identifies as a coordination partner. + +If your findings relate to **security or privacy**, file issues with clear severity assessments. W3C specifications have a tradition of security and privacy self-review questionnaires. Check whether the WebMCP specification has completed one, and if not, request it. + +If you publish your findings -- on a blog, in a report, in an academic paper -- link back to the relevant GitHub issues so the discussion stays connected to the specification process. + +## The Window + +The pattern in web standards is well established. Once an implementation ships in a dominant browser and developers build on it, the specification follows the code. Chrome holds roughly 65% of browser market share. The early preview is live. Developer adoption is beginning. The longer the community waits to engage, the narrower the design space becomes. + +This is not an argument against WebMCP. The technical concept is sound, the use cases are real, and the problem it solves -- giving developers control over AI agent interaction -- is important. But a good idea implemented badly, or without adequate security review, or without accessibility testing, or without community input, becomes a liability embedded in the web platform for decades. + +The specification is at https://webmachinelearning.github.io/webmcp/. The implementation is in Chrome 146 Canary. The issues page is at https://github.com/webmachinelearning/webmcp/issues. The community group is open to all at no cost. The work is now. + +--- + +*Contributed via the W3C AI Knowledge Representation Community Group* diff --git a/webmcp-technical-note-2(3).md b/webmcp-technical-note-2(3).md new file mode 100644 index 0000000..fe85eac --- /dev/null +++ b/webmcp-technical-note-2(3).md @@ -0,0 +1,102 @@ +# WebMCP Technical Note 2: What to Test, What to Watch, What to Tell the Standards Body + +**WebMCP Technical Note Series** +**15 February 2026** + +--- + +Google's WebMCP early preview is live in Chrome 146 Canary. The specification is still a draft. The community group process is still open. This means the window for meaningful community input is right now -- before implementation momentum makes the current design effectively permanent. + +This note is a practical guide. It is written for developers, accessibility practitioners, security researchers, standards participants, and anyone who builds things for the web and wants to understand what WebMCP means for their work. It covers what WebMCP is for, what to test, what the benefits are, what the risks are, and how to communicate findings to the W3C community group that hosts the specification. + +## What WebMCP Is For: The Use Cases + +WebMCP allows a website to declare a set of tools -- JavaScript functions with structured schemas and natural language descriptions -- that AI agents can discover and invoke. The specification targets several categories of use. + +The first is **e-commerce and transactional sites**. A travel booking site could register tools like searchFlights(origin, destination, dates), filterResults(price, stops, airline), and bookFlight(flightId, passengerDetails). Instead of an AI agent trying to parse a complex search interface by reading pixels or DOM elements, it calls the function directly and gets structured JSON back. The site controls exactly what the agent can do and how. + +The second is **productivity and SaaS applications**. A project management tool could expose createTask(title, assignee, dueDate), moveCard(cardId, column), and generateReport(dateRange). Browser-based AI assistants could help users manage workflows without the application needing to build and maintain a separate backend MCP server or API integration for every AI platform. + +The third is **content and media**. A news site could register searchArticles(topic, dateRange) and getArticleSummary(articleId). A mapping service could expose getDirections(from, to, mode) and findNearby(category, radius). These tools let agents interact with content in structured ways rather than scraping and guessing. + +The fourth -- and potentially most significant -- is **accessibility**. The specification claims WebMCP could benefit assistive technologies by providing structured, semantically meaningful interfaces to website functionality. A screen reader enhanced with agent capabilities could invoke tools directly rather than navigating complex visual layouts. This is a strong claim that deserves rigorous testing. + +The fifth is **form automation and multi-step workflows**. Complex processes like insurance applications, government forms, or account setup flows could be exposed as sequences of tool calls, allowing agents to guide users through them step by step while the site maintains control over validation, sequencing, and data handling. + +## What the Benefits Are + +WebMCP offers several concrete advantages over current approaches to AI-web interaction. + +**Reliability** is the most immediate. Today's browser agents -- whether using visual parsing or DOM inspection -- are brittle. A minor CSS change can break a visual agent. A DOM restructuring can invalidate a scraping approach. WebMCP tools are explicit contracts: the site declares what is available, the agent calls it, the response is structured. This should dramatically reduce failure rates for agent-web interaction. + +**Performance** is the second. Visual agents must capture screenshots, send them to a vision model, interpret the response, generate mouse coordinates, and repeat. WinBuzzer reported a 67% reduction in computational overhead with WebMCP compared to visual approaches. Even if that number proves optimistic in production, the architectural advantage is clear: a function call is faster than a screenshot-interpret-click loop. + +**Developer control** is the third. With visual or DOM-based agents, the website has no say in how an agent interacts with it. The agent reverse-engineers the interface. With WebMCP, the developer explicitly defines the interaction surface. Tools can include rate limits, validation, permission requirements, and structured error messages. The site becomes a willing participant in the interaction rather than a passive target. + +**Authentication reuse** is the fourth, and was the original motivation. Because WebMCP runs in the browser session, it inherits whatever authentication the user already has. No OAuth flows, no API keys, no separate credential management. The user is already logged in. The agent operates within that session. This solves one of the hardest problems in AI-service integration. + +**Standardization** is the fifth. If WebMCP succeeds, a developer implements tools once and every conformant agent can use them -- rather than building separate integrations for ChatGPT, Claude, Gemini, and whatever comes next. This is the "USB-C" argument: one interface, many devices. + +## What the Risks Are + +The risks are significant, and several are not yet adequately addressed in the specification. + +**Prompt injection** is the most acute. WebMCP tools return data to AI agents that then process it in their language model context. A malicious or compromised website could craft tool responses that manipulate the agent's behavior -- injecting instructions, altering the agent's understanding of the task, or causing it to take unintended actions on other sites. The specification does not currently define a defense against this beyond same-origin policy boundaries. + +**Scope creep of agent permissions** is the second. WebMCP is designed for human-in-the-loop workflows, with headless browsing explicitly out of scope. But the technical mechanism -- JavaScript functions callable by external code -- does not inherently enforce this. If browser vendors later relax the human-presence requirement, or if extensions find ways to invoke WebMCP tools without user awareness, the permission model collapses. The specification should define what "human in the loop" means technically, not just philosophically. + +**Consent and transparency** is the third. When a user visits a site that registers WebMCP tools, do they know? The current design provides no visible indicator to the user that tools have been registered, what data they expose, or when an agent invokes them. Compare this to other browser permission systems -- camera, microphone, location -- where the user explicitly grants access. WebMCP tools operate silently. + +**Competitive dynamics** is the fourth. WebMCP gives first-mover advantage to sites that implement tools early, potentially favoring large platforms with engineering resources. Smaller sites that do not implement WebMCP may become invisible to agent-mediated browsing. This could accelerate web consolidation. The specification should consider whether a minimal tool set (search, navigation, content retrieval) should be automatically generated from existing web standards like HTML forms, structured data, and ARIA attributes. + +**Data leakage through tool schemas** is the fifth. The natural language descriptions and parameter schemas of registered tools reveal information about a site's internal architecture, business logic, and data models. An agent -- or the platform behind it -- could catalog available tools across thousands of sites to build competitive intelligence. The specification does not address whether tool schemas should be treated as sensitive information. + +**Abuse and rate limiting** is the sixth. Agents can invoke tools at machine speed. A poorly defended site could face thousands of tool invocations per second from a single browser session. The specification mentions rate limiting as a consideration but does not define a standard mechanism. Without one, each site must build its own defenses, and many will not. + +**Cross-site tool chaining** is the seventh. If an agent can invoke tools on multiple open tabs, it could chain actions across sites in ways no individual site anticipated or authorized. Transfer money on a banking site, then use the confirmation on a shopping site, then post about it on a social network -- all within one agent workflow. The security boundaries for cross-site tool interaction are not yet defined. + +## What to Test + +For those with access to Chrome 146 Canary, here are the concrete areas that need community evaluation. Each should generate feedback for the W3C community group. + +**Test the Declarative API with real HTML forms.** Register tools that wrap existing form actions and verify that validation, error handling, and submission behavior match what a human user would experience. Try edge cases: forms with CAPTCHAs, multi-step forms with session state, forms that redirect on submit. Document where the abstraction breaks. + +**Test the Imperative API with dynamic content.** Register tools that interact with JavaScript-heavy applications -- single-page apps, dashboards with real-time data, applications that maintain complex client-side state. Evaluate whether tool calls can reliably interact with application state without causing inconsistencies. + +**Test authentication boundaries.** Log into a site, register tools, then observe what happens when the session expires, when the user logs out in another tab, when cookies are cleared. The specification's authentication reuse claim needs verification under adversarial conditions. + +**Test tool discovery and enumeration.** If multiple sites in different tabs register tools, how does the agent disambiguate? What happens when two sites register tools with the same name? How does the agent present available tools to the user? Is tool discovery observable by the page (can a site detect that an agent has read its tool list)? + +**Test accessibility integration.** If you work with assistive technologies, evaluate whether WebMCP tools provide genuinely better access to site functionality than existing ARIA roles and landmarks. Test with screen readers, switch access devices, and voice control. Document whether WebMCP complements or conflicts with existing accessibility standards. + +**Test prompt injection resilience.** Craft tool responses that contain instruction-like text and observe whether the consuming agent's behavior is affected. This is critical safety research. If tool responses can manipulate agent behavior, the security model is fundamentally incomplete. + +**Test performance claims.** Measure actual latency and token usage for equivalent tasks performed via WebMCP tools versus visual agent interaction. The 67% overhead reduction claim needs independent verification across different site types and task complexities. + +**Test failure modes.** What happens when a tool throws an error? When it returns unexpected data types? When it hangs? When the page navigates away mid-call? The specification should define standard error handling, but the current draft has TODO sections in these areas. Documenting real failure modes will directly shape the specification. + +## How to Communicate Findings + +Feedback is only useful if it reaches the people writing the specification. Here are the concrete channels, in order of effectiveness. + +**File a GitHub issue** at https://github.com/webmachinelearning/webmcp/issues with a clear title, reproducible steps, and a specific recommendation. Tag it with the relevant label if available. The spec editors (Brandon Walderman, Khushal Sagar, Dominic Farolino) monitor this repo. Issues with reproducible test cases and concrete proposals get traction. Issues that say "I don't like this" do not. + +**Join the W3C Web Machine Learning Community Group** at https://www.w3.org/community/webmachinelearning/ and participate in discussion. Community Groups are free and open. Participation in CG calls and mailing list threads carries weight in W3C process. + +If your findings relate to **agent interoperability** -- how WebMCP tools interact with broader agent ecosystems, discovery protocols, or multi-agent workflows -- also engage with the AI Agent Protocol Community Group, which the WebML CG charter identifies as a coordination partner. + +If your findings relate to **security or privacy**, file issues with clear severity assessments. W3C specifications have a tradition of security and privacy self-review questionnaires. Check whether the WebMCP specification has completed one, and if not, request it. + +If you publish your findings -- on a blog, in a report, in an academic paper -- link back to the relevant GitHub issues so the discussion stays connected to the specification process. + +## The Window + +The pattern in web standards is well established. Once an implementation ships in a dominant browser and developers build on it, the specification follows the code. Chrome holds roughly 65% of browser market share. The early preview is live. Developer adoption is beginning. The longer the community waits to engage, the narrower the design space becomes. + +This is not an argument against WebMCP. The technical concept is sound, the use cases are real, and the problem it solves -- giving developers control over AI agent interaction -- is important. But a good idea implemented badly, or without adequate security review, or without accessibility testing, or without community input, becomes a liability embedded in the web platform for decades. + +The specification is at https://webmachinelearning.github.io/webmcp/. The implementation is in Chrome 146 Canary. The issues page is at https://github.com/webmachinelearning/webmcp/issues. The community group is open to all at no cost. The work is now. + +--- + +*Contributed via the W3C AI Knowledge Representation Community Group* diff --git a/webmcp-technical-note-3(1).md b/webmcp-technical-note-3(1).md new file mode 100644 index 0000000..b2eda08 --- /dev/null +++ b/webmcp-technical-note-3(1).md @@ -0,0 +1,71 @@ +# WebMCP Technical Note 3: WebMCP Is Not an MCP Server + +**WebMCP Technical Note Series** +**15 February 2026** + +--- + +A persistent claim in the WebMCP ecosystem is that WebMCP turns a website into an MCP server. The W3C specification repository itself states that web pages using WebMCP "can be thought of as Model Context Protocol (MCP) servers that implement tools in client-side script instead of on the backend." Early independent implementations by Jason McGhee and Alex Nahas (MCP-B) literally did function as MCP servers, bridging browser JavaScript to MCP clients through localhost websocket connections using the standard MCP protocol. +([W3C spec repo](https://github.com/webmachinelearning/webmcp)) +([McGhee implementation](https://github.com/jasonjmcghee/WebMCP)) +([Nahas MCP-B](https://github.com/MiguelsPizza/WebMCP)) + +The framing is understandable. It is also architecturally misleading, and the confusion has consequences for how developers, security reviewers, and standards participants evaluate the specification. + +## The Analogy and Its Limits + +WebMCP and Anthropic's Model Context Protocol share a conceptual ancestor: both define "tools" as functions with natural language descriptions and structured schemas that AI agents can discover and invoke. That is where the meaningful similarity ends. + +**Anthropic's MCP** is a backend protocol. It uses JSON-RPC 2.0 as its message format, transported over stdio, HTTP with Server-Sent Events, or Streamable HTTP. MCP servers are hosted processes -- typically written in Python or Node.js -- that run on backend infrastructure. They connect AI platforms like Claude, ChatGPT, or Gemini to external services. Authentication follows OAuth 2.1 or custom API key schemes. No browser is required. No human user needs to be present. Headless, fully automated operation is the norm. +([Source](https://modelcontextprotocol.io/introduction)) + +**WebMCP** is a frontend browser API. It uses the browser's native postMessage system for communication between the web page and the agent. Tools are registered and executed as client-side JavaScript within an active browser tab. Authentication is inherited from the browser session -- whatever cookies or federated login the user already has. A human user must be present in an active browser session. Headless browsing is explicitly out of scope. +([Source](https://webmachinelearning.github.io/webmcp/)) + +The specification's own language -- "can be thought of as" -- acknowledges this is an analogy, not an identity. But the README, the press coverage, and the developer ecosystem have largely dropped the qualifier. The result is that WebMCP is widely discussed as though it were MCP running in the browser, with all the assumptions that entails. + +## What the Framing Gets Wrong + +When a developer hears "your website becomes an MCP server," they import a set of assumptions from the MCP architecture. Every one of these assumptions is wrong for WebMCP. + +**Transport.** MCP uses JSON-RPC 2.0, a well-specified request-response protocol with defined error codes, batching, and notification semantics. WebMCP uses postMessage, the browser's cross-origin communication mechanism. These have different reliability characteristics, different error handling models, and different security boundaries. Code written for one transport does not work with the other. + +**Execution context.** An MCP server runs in a controlled backend environment -- a container, a VM, a serverless function -- where the service provider manages the runtime, dependencies, and resource limits. WebMCP tools run in the browser's JavaScript engine, in the same execution context as the web page's own code. They are subject to the browser's security sandbox, but also to its constraints: single-threaded execution, same-origin policy, and the full surface area of client-side attack vectors. + +**Authentication.** MCP's specification has adopted OAuth 2.1 for authentication between clients and servers. This was, notably, the problem that motivated WebMCP's creation -- Alex Nahas at Amazon found that OAuth 2.1 was impractical for internal MCP deployments. WebMCP sidesteps this entirely by inheriting the browser session. This is elegant for usability but means the authentication model is whatever the website happens to use, with no protocol-level guarantees. + +**Trust direction.** In MCP, the AI platform (client) connects to a known, registered server. The platform decides which servers to trust. In WebMCP, any website the user visits can register tools. The trust decision shifts from the AI platform to the browser, and potentially to the user -- who may not know that tools have been registered at all, since the current specification provides no visible indicator. + +**Operational mode.** MCP servers are designed for automated, programmatic access. They can run continuously, handle concurrent requests, and operate without human involvement. WebMCP requires an active browser tab with a human user present. The specification explicitly excludes headless browsing. These are fundamentally different operational paradigms with different scaling characteristics, different failure modes, and different abuse surfaces. + +## Why This Matters for Standards Review + +The "MCP server" framing is not just imprecise. It actively interferes with rigorous evaluation of the specification. + +**Security reviewers** who approach WebMCP as "MCP in the browser" will evaluate it against MCP's threat model. But MCP's threat model assumes a controlled backend environment, authenticated client-server connections, and server-side access control. WebMCP's actual threat model involves client-side JavaScript execution, browser-based trust boundaries, and the full range of web security concerns including cross-site scripting, prompt injection via tool responses, and silent tool registration. Importing the wrong threat model means asking the wrong security questions. + +**Developers** who approach WebMCP as "MCP in the browser" may expect protocol-level interoperability -- that a WebMCP tool definition could be used interchangeably with an MCP server tool definition, or that MCP client libraries could connect to WebMCP pages. They cannot. The tool schema format may be similar, but the transport, discovery, and invocation mechanisms are incompatible. + +**Standards participants** who approach WebMCP as "MCP in the browser" may underestimate the scope of new specification work required. WebMCP is not an adaptation of MCP to a new environment. It is a new browser API that borrows one concept (the tool abstraction) from MCP and implements everything else differently. It needs its own security review, its own privacy analysis, its own accessibility evaluation, and its own consent model -- none of which can be inherited from MCP. + +## What WebMCP Actually Is + +WebMCP is a proposed browser API -- specifically, a new interface on navigator.modelContext -- that allows web pages to declare JavaScript functions as tools that browser-based AI agents can discover and invoke. It uses the browser's existing communication, security, and session management infrastructure rather than introducing a new protocol. + +The design has real strengths. Authentication reuse eliminates one of the hardest problems in AI-service integration. Client-side execution means no backend infrastructure is needed. The human-in-the-loop requirement provides a natural consent and oversight mechanism -- if implemented correctly. + +But these strengths are specific to WebMCP's actual architecture, not to the MCP analogy. Evaluating WebMCP on its own terms -- as a browser API with browser security characteristics -- leads to better questions, better testing, and better specifications than evaluating it as a variant of MCP. + +## A Suggested Clarification + +The W3C specification and its README should explicitly state that WebMCP is not an implementation of the Model Context Protocol and does not use the MCP wire protocol. It borrows the "tool" abstraction -- functions with schemas and natural language descriptions -- but implements discovery, registration, invocation, and communication through browser-native mechanisms that are architecturally distinct from MCP. + +The analogy is useful for first contact. A developer unfamiliar with WebMCP can quickly grasp the concept by thinking "it is like an MCP server, but in the browser." But the specification itself, the security review, and the community evaluation should not rely on the analogy. They should address WebMCP as what it is: a new browser API with its own architecture, its own threat model, and its own design space. + +## Both Can Coexist + +None of this is an argument against WebMCP or against MCP. A company might maintain an MCP server for direct API integrations with AI platforms and simultaneously implement WebMCP tools on its consumer-facing website for browser-based agent interaction. The two are complementary, not competing, and not identical. Recognizing the distinction is necessary for evaluating each on its own merits. + +--- + +*Contributed via the W3C AI Knowledge Representation Community Group* From 7df3bdd4606779eff439b4f25007341b7f097a22 Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Wed, 18 Feb 2026 16:43:09 +0800 Subject: [PATCH 37/46] Add files via upload --- webmcp-technical-note-2.md.txt | 102 +++++++++++++++++++++++++++++++++ webmcp-technical-note-3.md.txt | 71 +++++++++++++++++++++++ 2 files changed, 173 insertions(+) create mode 100644 webmcp-technical-note-2.md.txt create mode 100644 webmcp-technical-note-3.md.txt diff --git a/webmcp-technical-note-2.md.txt b/webmcp-technical-note-2.md.txt new file mode 100644 index 0000000..fe85eac --- /dev/null +++ b/webmcp-technical-note-2.md.txt @@ -0,0 +1,102 @@ +# WebMCP Technical Note 2: What to Test, What to Watch, What to Tell the Standards Body + +**WebMCP Technical Note Series** +**15 February 2026** + +--- + +Google's WebMCP early preview is live in Chrome 146 Canary. The specification is still a draft. The community group process is still open. This means the window for meaningful community input is right now -- before implementation momentum makes the current design effectively permanent. + +This note is a practical guide. It is written for developers, accessibility practitioners, security researchers, standards participants, and anyone who builds things for the web and wants to understand what WebMCP means for their work. It covers what WebMCP is for, what to test, what the benefits are, what the risks are, and how to communicate findings to the W3C community group that hosts the specification. + +## What WebMCP Is For: The Use Cases + +WebMCP allows a website to declare a set of tools -- JavaScript functions with structured schemas and natural language descriptions -- that AI agents can discover and invoke. The specification targets several categories of use. + +The first is **e-commerce and transactional sites**. A travel booking site could register tools like searchFlights(origin, destination, dates), filterResults(price, stops, airline), and bookFlight(flightId, passengerDetails). Instead of an AI agent trying to parse a complex search interface by reading pixels or DOM elements, it calls the function directly and gets structured JSON back. The site controls exactly what the agent can do and how. + +The second is **productivity and SaaS applications**. A project management tool could expose createTask(title, assignee, dueDate), moveCard(cardId, column), and generateReport(dateRange). Browser-based AI assistants could help users manage workflows without the application needing to build and maintain a separate backend MCP server or API integration for every AI platform. + +The third is **content and media**. A news site could register searchArticles(topic, dateRange) and getArticleSummary(articleId). A mapping service could expose getDirections(from, to, mode) and findNearby(category, radius). These tools let agents interact with content in structured ways rather than scraping and guessing. + +The fourth -- and potentially most significant -- is **accessibility**. The specification claims WebMCP could benefit assistive technologies by providing structured, semantically meaningful interfaces to website functionality. A screen reader enhanced with agent capabilities could invoke tools directly rather than navigating complex visual layouts. This is a strong claim that deserves rigorous testing. + +The fifth is **form automation and multi-step workflows**. Complex processes like insurance applications, government forms, or account setup flows could be exposed as sequences of tool calls, allowing agents to guide users through them step by step while the site maintains control over validation, sequencing, and data handling. + +## What the Benefits Are + +WebMCP offers several concrete advantages over current approaches to AI-web interaction. + +**Reliability** is the most immediate. Today's browser agents -- whether using visual parsing or DOM inspection -- are brittle. A minor CSS change can break a visual agent. A DOM restructuring can invalidate a scraping approach. WebMCP tools are explicit contracts: the site declares what is available, the agent calls it, the response is structured. This should dramatically reduce failure rates for agent-web interaction. + +**Performance** is the second. Visual agents must capture screenshots, send them to a vision model, interpret the response, generate mouse coordinates, and repeat. WinBuzzer reported a 67% reduction in computational overhead with WebMCP compared to visual approaches. Even if that number proves optimistic in production, the architectural advantage is clear: a function call is faster than a screenshot-interpret-click loop. + +**Developer control** is the third. With visual or DOM-based agents, the website has no say in how an agent interacts with it. The agent reverse-engineers the interface. With WebMCP, the developer explicitly defines the interaction surface. Tools can include rate limits, validation, permission requirements, and structured error messages. The site becomes a willing participant in the interaction rather than a passive target. + +**Authentication reuse** is the fourth, and was the original motivation. Because WebMCP runs in the browser session, it inherits whatever authentication the user already has. No OAuth flows, no API keys, no separate credential management. The user is already logged in. The agent operates within that session. This solves one of the hardest problems in AI-service integration. + +**Standardization** is the fifth. If WebMCP succeeds, a developer implements tools once and every conformant agent can use them -- rather than building separate integrations for ChatGPT, Claude, Gemini, and whatever comes next. This is the "USB-C" argument: one interface, many devices. + +## What the Risks Are + +The risks are significant, and several are not yet adequately addressed in the specification. + +**Prompt injection** is the most acute. WebMCP tools return data to AI agents that then process it in their language model context. A malicious or compromised website could craft tool responses that manipulate the agent's behavior -- injecting instructions, altering the agent's understanding of the task, or causing it to take unintended actions on other sites. The specification does not currently define a defense against this beyond same-origin policy boundaries. + +**Scope creep of agent permissions** is the second. WebMCP is designed for human-in-the-loop workflows, with headless browsing explicitly out of scope. But the technical mechanism -- JavaScript functions callable by external code -- does not inherently enforce this. If browser vendors later relax the human-presence requirement, or if extensions find ways to invoke WebMCP tools without user awareness, the permission model collapses. The specification should define what "human in the loop" means technically, not just philosophically. + +**Consent and transparency** is the third. When a user visits a site that registers WebMCP tools, do they know? The current design provides no visible indicator to the user that tools have been registered, what data they expose, or when an agent invokes them. Compare this to other browser permission systems -- camera, microphone, location -- where the user explicitly grants access. WebMCP tools operate silently. + +**Competitive dynamics** is the fourth. WebMCP gives first-mover advantage to sites that implement tools early, potentially favoring large platforms with engineering resources. Smaller sites that do not implement WebMCP may become invisible to agent-mediated browsing. This could accelerate web consolidation. The specification should consider whether a minimal tool set (search, navigation, content retrieval) should be automatically generated from existing web standards like HTML forms, structured data, and ARIA attributes. + +**Data leakage through tool schemas** is the fifth. The natural language descriptions and parameter schemas of registered tools reveal information about a site's internal architecture, business logic, and data models. An agent -- or the platform behind it -- could catalog available tools across thousands of sites to build competitive intelligence. The specification does not address whether tool schemas should be treated as sensitive information. + +**Abuse and rate limiting** is the sixth. Agents can invoke tools at machine speed. A poorly defended site could face thousands of tool invocations per second from a single browser session. The specification mentions rate limiting as a consideration but does not define a standard mechanism. Without one, each site must build its own defenses, and many will not. + +**Cross-site tool chaining** is the seventh. If an agent can invoke tools on multiple open tabs, it could chain actions across sites in ways no individual site anticipated or authorized. Transfer money on a banking site, then use the confirmation on a shopping site, then post about it on a social network -- all within one agent workflow. The security boundaries for cross-site tool interaction are not yet defined. + +## What to Test + +For those with access to Chrome 146 Canary, here are the concrete areas that need community evaluation. Each should generate feedback for the W3C community group. + +**Test the Declarative API with real HTML forms.** Register tools that wrap existing form actions and verify that validation, error handling, and submission behavior match what a human user would experience. Try edge cases: forms with CAPTCHAs, multi-step forms with session state, forms that redirect on submit. Document where the abstraction breaks. + +**Test the Imperative API with dynamic content.** Register tools that interact with JavaScript-heavy applications -- single-page apps, dashboards with real-time data, applications that maintain complex client-side state. Evaluate whether tool calls can reliably interact with application state without causing inconsistencies. + +**Test authentication boundaries.** Log into a site, register tools, then observe what happens when the session expires, when the user logs out in another tab, when cookies are cleared. The specification's authentication reuse claim needs verification under adversarial conditions. + +**Test tool discovery and enumeration.** If multiple sites in different tabs register tools, how does the agent disambiguate? What happens when two sites register tools with the same name? How does the agent present available tools to the user? Is tool discovery observable by the page (can a site detect that an agent has read its tool list)? + +**Test accessibility integration.** If you work with assistive technologies, evaluate whether WebMCP tools provide genuinely better access to site functionality than existing ARIA roles and landmarks. Test with screen readers, switch access devices, and voice control. Document whether WebMCP complements or conflicts with existing accessibility standards. + +**Test prompt injection resilience.** Craft tool responses that contain instruction-like text and observe whether the consuming agent's behavior is affected. This is critical safety research. If tool responses can manipulate agent behavior, the security model is fundamentally incomplete. + +**Test performance claims.** Measure actual latency and token usage for equivalent tasks performed via WebMCP tools versus visual agent interaction. The 67% overhead reduction claim needs independent verification across different site types and task complexities. + +**Test failure modes.** What happens when a tool throws an error? When it returns unexpected data types? When it hangs? When the page navigates away mid-call? The specification should define standard error handling, but the current draft has TODO sections in these areas. Documenting real failure modes will directly shape the specification. + +## How to Communicate Findings + +Feedback is only useful if it reaches the people writing the specification. Here are the concrete channels, in order of effectiveness. + +**File a GitHub issue** at https://github.com/webmachinelearning/webmcp/issues with a clear title, reproducible steps, and a specific recommendation. Tag it with the relevant label if available. The spec editors (Brandon Walderman, Khushal Sagar, Dominic Farolino) monitor this repo. Issues with reproducible test cases and concrete proposals get traction. Issues that say "I don't like this" do not. + +**Join the W3C Web Machine Learning Community Group** at https://www.w3.org/community/webmachinelearning/ and participate in discussion. Community Groups are free and open. Participation in CG calls and mailing list threads carries weight in W3C process. + +If your findings relate to **agent interoperability** -- how WebMCP tools interact with broader agent ecosystems, discovery protocols, or multi-agent workflows -- also engage with the AI Agent Protocol Community Group, which the WebML CG charter identifies as a coordination partner. + +If your findings relate to **security or privacy**, file issues with clear severity assessments. W3C specifications have a tradition of security and privacy self-review questionnaires. Check whether the WebMCP specification has completed one, and if not, request it. + +If you publish your findings -- on a blog, in a report, in an academic paper -- link back to the relevant GitHub issues so the discussion stays connected to the specification process. + +## The Window + +The pattern in web standards is well established. Once an implementation ships in a dominant browser and developers build on it, the specification follows the code. Chrome holds roughly 65% of browser market share. The early preview is live. Developer adoption is beginning. The longer the community waits to engage, the narrower the design space becomes. + +This is not an argument against WebMCP. The technical concept is sound, the use cases are real, and the problem it solves -- giving developers control over AI agent interaction -- is important. But a good idea implemented badly, or without adequate security review, or without accessibility testing, or without community input, becomes a liability embedded in the web platform for decades. + +The specification is at https://webmachinelearning.github.io/webmcp/. The implementation is in Chrome 146 Canary. The issues page is at https://github.com/webmachinelearning/webmcp/issues. The community group is open to all at no cost. The work is now. + +--- + +*Contributed via the W3C AI Knowledge Representation Community Group* diff --git a/webmcp-technical-note-3.md.txt b/webmcp-technical-note-3.md.txt new file mode 100644 index 0000000..b2eda08 --- /dev/null +++ b/webmcp-technical-note-3.md.txt @@ -0,0 +1,71 @@ +# WebMCP Technical Note 3: WebMCP Is Not an MCP Server + +**WebMCP Technical Note Series** +**15 February 2026** + +--- + +A persistent claim in the WebMCP ecosystem is that WebMCP turns a website into an MCP server. The W3C specification repository itself states that web pages using WebMCP "can be thought of as Model Context Protocol (MCP) servers that implement tools in client-side script instead of on the backend." Early independent implementations by Jason McGhee and Alex Nahas (MCP-B) literally did function as MCP servers, bridging browser JavaScript to MCP clients through localhost websocket connections using the standard MCP protocol. +([W3C spec repo](https://github.com/webmachinelearning/webmcp)) +([McGhee implementation](https://github.com/jasonjmcghee/WebMCP)) +([Nahas MCP-B](https://github.com/MiguelsPizza/WebMCP)) + +The framing is understandable. It is also architecturally misleading, and the confusion has consequences for how developers, security reviewers, and standards participants evaluate the specification. + +## The Analogy and Its Limits + +WebMCP and Anthropic's Model Context Protocol share a conceptual ancestor: both define "tools" as functions with natural language descriptions and structured schemas that AI agents can discover and invoke. That is where the meaningful similarity ends. + +**Anthropic's MCP** is a backend protocol. It uses JSON-RPC 2.0 as its message format, transported over stdio, HTTP with Server-Sent Events, or Streamable HTTP. MCP servers are hosted processes -- typically written in Python or Node.js -- that run on backend infrastructure. They connect AI platforms like Claude, ChatGPT, or Gemini to external services. Authentication follows OAuth 2.1 or custom API key schemes. No browser is required. No human user needs to be present. Headless, fully automated operation is the norm. +([Source](https://modelcontextprotocol.io/introduction)) + +**WebMCP** is a frontend browser API. It uses the browser's native postMessage system for communication between the web page and the agent. Tools are registered and executed as client-side JavaScript within an active browser tab. Authentication is inherited from the browser session -- whatever cookies or federated login the user already has. A human user must be present in an active browser session. Headless browsing is explicitly out of scope. +([Source](https://webmachinelearning.github.io/webmcp/)) + +The specification's own language -- "can be thought of as" -- acknowledges this is an analogy, not an identity. But the README, the press coverage, and the developer ecosystem have largely dropped the qualifier. The result is that WebMCP is widely discussed as though it were MCP running in the browser, with all the assumptions that entails. + +## What the Framing Gets Wrong + +When a developer hears "your website becomes an MCP server," they import a set of assumptions from the MCP architecture. Every one of these assumptions is wrong for WebMCP. + +**Transport.** MCP uses JSON-RPC 2.0, a well-specified request-response protocol with defined error codes, batching, and notification semantics. WebMCP uses postMessage, the browser's cross-origin communication mechanism. These have different reliability characteristics, different error handling models, and different security boundaries. Code written for one transport does not work with the other. + +**Execution context.** An MCP server runs in a controlled backend environment -- a container, a VM, a serverless function -- where the service provider manages the runtime, dependencies, and resource limits. WebMCP tools run in the browser's JavaScript engine, in the same execution context as the web page's own code. They are subject to the browser's security sandbox, but also to its constraints: single-threaded execution, same-origin policy, and the full surface area of client-side attack vectors. + +**Authentication.** MCP's specification has adopted OAuth 2.1 for authentication between clients and servers. This was, notably, the problem that motivated WebMCP's creation -- Alex Nahas at Amazon found that OAuth 2.1 was impractical for internal MCP deployments. WebMCP sidesteps this entirely by inheriting the browser session. This is elegant for usability but means the authentication model is whatever the website happens to use, with no protocol-level guarantees. + +**Trust direction.** In MCP, the AI platform (client) connects to a known, registered server. The platform decides which servers to trust. In WebMCP, any website the user visits can register tools. The trust decision shifts from the AI platform to the browser, and potentially to the user -- who may not know that tools have been registered at all, since the current specification provides no visible indicator. + +**Operational mode.** MCP servers are designed for automated, programmatic access. They can run continuously, handle concurrent requests, and operate without human involvement. WebMCP requires an active browser tab with a human user present. The specification explicitly excludes headless browsing. These are fundamentally different operational paradigms with different scaling characteristics, different failure modes, and different abuse surfaces. + +## Why This Matters for Standards Review + +The "MCP server" framing is not just imprecise. It actively interferes with rigorous evaluation of the specification. + +**Security reviewers** who approach WebMCP as "MCP in the browser" will evaluate it against MCP's threat model. But MCP's threat model assumes a controlled backend environment, authenticated client-server connections, and server-side access control. WebMCP's actual threat model involves client-side JavaScript execution, browser-based trust boundaries, and the full range of web security concerns including cross-site scripting, prompt injection via tool responses, and silent tool registration. Importing the wrong threat model means asking the wrong security questions. + +**Developers** who approach WebMCP as "MCP in the browser" may expect protocol-level interoperability -- that a WebMCP tool definition could be used interchangeably with an MCP server tool definition, or that MCP client libraries could connect to WebMCP pages. They cannot. The tool schema format may be similar, but the transport, discovery, and invocation mechanisms are incompatible. + +**Standards participants** who approach WebMCP as "MCP in the browser" may underestimate the scope of new specification work required. WebMCP is not an adaptation of MCP to a new environment. It is a new browser API that borrows one concept (the tool abstraction) from MCP and implements everything else differently. It needs its own security review, its own privacy analysis, its own accessibility evaluation, and its own consent model -- none of which can be inherited from MCP. + +## What WebMCP Actually Is + +WebMCP is a proposed browser API -- specifically, a new interface on navigator.modelContext -- that allows web pages to declare JavaScript functions as tools that browser-based AI agents can discover and invoke. It uses the browser's existing communication, security, and session management infrastructure rather than introducing a new protocol. + +The design has real strengths. Authentication reuse eliminates one of the hardest problems in AI-service integration. Client-side execution means no backend infrastructure is needed. The human-in-the-loop requirement provides a natural consent and oversight mechanism -- if implemented correctly. + +But these strengths are specific to WebMCP's actual architecture, not to the MCP analogy. Evaluating WebMCP on its own terms -- as a browser API with browser security characteristics -- leads to better questions, better testing, and better specifications than evaluating it as a variant of MCP. + +## A Suggested Clarification + +The W3C specification and its README should explicitly state that WebMCP is not an implementation of the Model Context Protocol and does not use the MCP wire protocol. It borrows the "tool" abstraction -- functions with schemas and natural language descriptions -- but implements discovery, registration, invocation, and communication through browser-native mechanisms that are architecturally distinct from MCP. + +The analogy is useful for first contact. A developer unfamiliar with WebMCP can quickly grasp the concept by thinking "it is like an MCP server, but in the browser." But the specification itself, the security review, and the community evaluation should not rely on the analogy. They should address WebMCP as what it is: a new browser API with its own architecture, its own threat model, and its own design space. + +## Both Can Coexist + +None of this is an argument against WebMCP or against MCP. A company might maintain an MCP server for direct API integrations with AI platforms and simultaneously implement WebMCP tools on its consumer-facing website for browser-based agent interaction. The two are complementary, not competing, and not identical. Recognizing the distinction is necessary for evaluating each on its own merits. + +--- + +*Contributed via the W3C AI Knowledge Representation Community Group* From 12fe4978d5bbe5d1eaab279317387e42bf421b2e Mon Sep 17 00:00:00 2001 From: Paola Di Maio Date: Thu, 19 Feb 2026 16:12:57 +0800 Subject: [PATCH 38/46] Delete WebMCP_Model_Card_Generator_USER_GUIDE.md --- WebMCP_Model_Card_Generator_USER_GUIDE.md | 588 ---------------------- 1 file changed, 588 deletions(-) delete mode 100644 WebMCP_Model_Card_Generator_USER_GUIDE.md diff --git a/WebMCP_Model_Card_Generator_USER_GUIDE.md b/WebMCP_Model_Card_Generator_USER_GUIDE.md deleted file mode 100644 index 3c7d98b..0000000 --- a/WebMCP_Model_Card_Generator_USER_GUIDE.md +++ /dev/null @@ -1,588 +0,0 @@ -# WebMCP Model Card Generator -- Guide for the Clueless - -**A field-by-field walkthrough so you can fill out every tab without knowing anything about WebMCP beforehand.** - -Co-created by Paola Di Maio, PhD (W3C AI-KR CG Chair) & Claude (Anthropic) -February 2026 - ---- - -## What You're Making - -A **model card** for a browser-side tool. Think of it as a label for a product: it tells AI agents (and humans) what your tool does, what it needs, what can go wrong, and how to use it safely. - -You fill in a form. The generator produces two files: -- **JSON** -- for machines to read (registries, agents, validators) -- **Markdown** -- for humans to read (documentation, GitHub, specifications) - -**Time needed**: 15-30 minutes if you know your tool. 5 minutes if you just want to test the generator. - ---- - -## Tab 1: Identity & Provenance - -*Who made this, what is it, where does it live?* - -### Tool/Page Name (required) - -The name of your WebMCP-enabled tool or web page. This is what agents will see when they discover your tools. - -- Good: "Easely Design Editor", "Flight Search Demo", "Todo Manager" -- Bad: "my-tool", "test", "page1" - -### Version (required) - -Use semantic versioning: **major.minor.patch** - -- `1.0.0` = first stable release -- `0.1.0` = early prototype -- `2.3.1` = mature, updated - -If you don't know, use `0.1.0` for prototypes or `1.0.0` for something you'd show people. - -### Description (required) - -One or two sentences explaining what tools this page exposes to AI agents. - -- Good: "Travel booking page exposing flight search, hotel filtering, and itinerary building tools via WebMCP declarative and imperative APIs" -- Bad: "A tool" or "This is my page" - -### Author - -Your name, team name, or organization. Can be left blank. - -### Creation Date - -Auto-filled with today's date. Change it if the tool was created earlier. - -### License (required) - -Pick one: - -| License | When to use | -|---------|-------------| -| **MIT** | Most permissive, most common. "Do whatever you want, just include the license." | -| **Apache 2.0** | Permissive + patent protection. Good for company projects. | -| **GPL 3.0** | Copyleft. Anyone who uses your code must also open-source theirs. | -| **Proprietary** | Closed source, commercial. | -| **W3C Community CLA** | If your tool is part of a W3C community group deliverable. | - -If unsure, pick MIT. - -### Page URL - -The web address where your WebMCP tools are registered. This is where agents go to find and call your tools. - -- Example: `https://travel-demo.bandarra.me/` -- Example: `https://mysite.com/booking` -- Put "tbd" if you haven't deployed yet. - -### Attribution (required) - -How was this tool created? Be honest -- this is a transparency field. - -| Choice | Meaning | -|--------|---------| -| **Human-authored** | A person wrote all the code | -| **AI co-created** | Human and AI worked together (like us right now!) | -| **AI-generated** | AI generated the code, human reviewed/edited | - -### Source Repository - -Link to your GitHub/GitLab repo. Leave blank if closed source. - ---- - -## Tab 2: API Mode - -*How do agents talk to your tools?* - -This is where WebMCP diverges most from backend MCP. You have TWO choices (or both). - -### Primary API Mode (required) - -| Mode | What it means | When to use | -|------|---------------|-------------| -| **Declarative (HTML forms)** | You add attributes to existing HTML `` elements. No JavaScript needed. | You already have working HTML forms and want the fastest path to agent-readiness. | -| **Imperative (JavaScript)** | You call `navigator.modelContext.registerTool()` in JavaScript. | Complex logic, dynamic tools, multi-step workflows. | -| **Both** | You use forms for simple actions and JS for complex ones. | Large applications with a mix of simple and complex tools. | - -### If Declarative: Form toolname - -The exact `toolname` attribute you put on your HTML form: -```html - -``` -Enter: `searchFlights` - -### If Declarative: Form tooldescription - -The natural language description in the form attribute. This is what agents read to decide whether to use your tool. - -Enter: `Search for available flights by origin, destination, and date` - -### If Imperative: Registration Method - -How you register tools in JavaScript: - -| Method | When to use | -|--------|-------------| -| **navigator.modelContext.registerTool()** | W3C standard API. Use for single tool registration. | -| **navigator.modelContext.provideContext()** | W3C standard. Registers multiple tools at once (replaces any existing). | -| **MCP-B @mcp-b/global import** | Community polyfill. Use if you need cross-browser support before native implementation. | -| **WebMCP widget script tag** | Simplest option -- add a `