Today’s AI agents interact with web UIs in one of three ways, all of them fragile.
There is no standard that answers: “What can I do here? Is it safe? How do I know it worked?”
UIAP closes this gap.
Semantic, not Pixel-based
Agents receive a structured PageGraph with roles, states, affordances, and stable identifiers — no screenshot parsing, no fragile CSS selectors.
Action Safety Built-in
Every action declares a risk level. The SDK enforces policy evaluation, confirmation flows, and human handoff before anything irreversible happens.
Success Verification
Actions define expected outcomes: route changes, toasts, dialog closures. Agents verify success instead of hoping it worked.
Transport Agnostic
Works over WebSocket, HTTP/SSE, postMessage, or any custom binding. The protocol defines the contract, not the wire.
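To make the PageGraph idea concrete, here is a minimal TypeScript sketch of what a semantic snapshot might look like and how an agent could query it. The type names, field names, and `invoice.*` identifiers are illustrative assumptions, not definitions from the UIAP specifications.

```typescript
// Hypothetical shapes: names and fields are illustrative, not taken from the UIAP specs.
type RiskLevel = "low" | "medium" | "high";

interface PageElement {
  id: string; // stable identifier that survives markup and styling changes
  role: string; // semantic role, e.g. "button" or "textbox"
  states: Record<string, boolean>; // e.g. { disabled: false }
  affordances: string[]; // what an agent may do with the element
  risk: RiskLevel; // declared risk of acting on it
}

interface PageGraph {
  route: string;
  elements: PageElement[];
}

// Select enabled elements that offer a given affordance.
function findByAffordance(graph: PageGraph, affordance: string): PageElement[] {
  return graph.elements.filter(
    (el) => el.affordances.some((a) => a === affordance) && el.states["disabled"] !== true,
  );
}

const graph: PageGraph = {
  route: "/invoices/42",
  elements: [
    { id: "invoice.amount", role: "textbox", states: { disabled: false }, affordances: ["setValue"], risk: "low" },
    { id: "invoice.save", role: "button", states: { disabled: false }, affordances: ["press"], risk: "medium" },
    { id: "invoice.delete", role: "button", states: { disabled: true }, affordances: ["press"], risk: "high" },
  ],
};

// Only "invoice.save" qualifies: the delete button is disabled.
const pressable = findByAffordance(graph, "press").map((el) => el.id);
console.log(pressable);
```

The agent addresses elements by identifier and affordance rather than by selector or screen position, so a restyled or restructured page would not invalidate the query as long as the identifiers stay stable.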
App integrates the UIAP SDK
A few lines of code or data-uiap-* attributes instrument your UI with semantic meaning.
Session starts, capabilities are exchanged
The agent learns what the app can do: available actions, risk levels, execution modes, success signals.
Agent receives a PageGraph
A structured snapshot of the current UI state — routes, scopes, elements, states, affordances — not raw HTML.
Agent plans an action, SDK checks policy
Before execution, the policy layer decides: allow, confirm, deny, or hand off to the user.
Action executes, app sends feedback
The SDK runs the action (preferring app-native execution over DOM manipulation) and reports the results and any state deltas.
Agent verifies success via signals
Route changed? Toast appeared? Form submitted? The agent confirms the outcome before moving to the next step.
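The policy check, execution, and verification steps above can be sketched as a single function. This is a simplified illustration under stated assumptions: every name here (`PlannedAction`, `decide`, `runStep`, the risk-to-decision table, the blocked list) is hypothetical and is not the UIAP SDK API. Only the four policy outcomes and the kinds of success signal come from the description above.

```typescript
// Hypothetical names throughout: this is not the UIAP SDK API.
type Risk = "low" | "medium" | "high";
type PolicyDecision = "allow" | "confirm" | "deny" | "handoff";

type Signal =
  | { kind: "route"; path: string }
  | { kind: "toast"; text: string }
  | { kind: "dialogClosed"; dialogId: string };

interface PlannedAction {
  actionId: string;
  risk: Risk;
  expected: Signal[]; // success signals the action declares
}

type StepStatus = "verified" | "unverified" | "awaiting-confirmation" | "handed-off" | "denied";

// Example policy (an assumption): blocked actions are denied, otherwise risk picks the decision.
const blocked: readonly string[] = ["account.delete"];
const byRisk: Record<Risk, PolicyDecision> = { low: "allow", medium: "confirm", high: "handoff" };

function decide(action: PlannedAction): PolicyDecision {
  return blocked.some((id) => id === action.actionId) ? "deny" : byRisk[action.risk];
}

function sameSignal(a: Signal, b: Signal): boolean {
  if (a.kind === "route" && b.kind === "route") return a.path === b.path;
  if (a.kind === "toast" && b.kind === "toast") return a.text === b.text;
  if (a.kind === "dialogClosed" && b.kind === "dialogClosed") return a.dialogId === b.dialogId;
  return false;
}

// Every expected signal must be observed; extra observed signals are ignored.
function verify(expected: Signal[], observed: Signal[]): boolean {
  return expected.every((e) => observed.some((o) => sameSignal(e, o)));
}

// Policy check first, then execution, then verification of the reported signals.
function runStep(action: PlannedAction, execute: () => Signal[], userConfirmed: boolean): StepStatus {
  const decision = decide(action);
  if (decision === "deny") return "denied";
  if (decision === "handoff") return "handed-off";
  if (decision === "confirm" && !userConfirmed) return "awaiting-confirmation";
  return verify(action.expected, execute()) ? "verified" : "unverified";
}

const save: PlannedAction = {
  actionId: "invoice.save",
  risk: "medium",
  expected: [
    { kind: "toast", text: "Invoice saved" },
    { kind: "route", path: "/invoices" },
  ],
};

// Stand-in for the app: returns the signals it emitted while running the action.
const appRun = (): Signal[] => [
  { kind: "route", path: "/invoices" },
  { kind: "toast", text: "Invoice saved" },
];

const pending = runStep(save, appRun, false); // medium risk, no confirmation yet
const done = runStep(save, appRun, true); // confirmed, and both signals observed
console.log(pending, done);
```

In this sketch the action never executes unless the policy decision permits it, and a step only counts as verified when every declared signal was actually observed.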
UIAP is a suite of 11 specifications:
| # | Spec | What it defines |
|---|---|---|
| 1 | Core | Message envelope, session lifecycle, errors, versioning |
| 2 | Capability Model | UI roles, states, affordances, actions, risk, signals |
| 3 | Web Profile | DOM, ARIA, PageGraph, iframes, Shadow DOM, routes |
| 4 | Action Runtime | Execution, verification, confirmation, result reporting |
| 5 | Policy Extension | Permissions, risk classes, sensitivity, audit |
| 6 | SDK API | Client-side JavaScript integration API |
| 7 | Workflow Extension | Multi-step flows, skills, recovery |
| 8 | Discovery Mapper | Automatic UI element discovery and classification |
| 9 | Authoring/Manifest | Configuration formats, validation, build/release |
| 10 | Conformance Suite | Test modules, harness model, evaluation rules |
| 11 | HTTP/REST Binding | HTTP+SSE transport binding |
UIAP doesn’t replace existing standards — it fills the gap between them:
| Standard | What it does | What’s missing for agents |
|---|---|---|
| ARIA | Describes roles, states, properties | No business actions, no risk levels, no success signals |
| MCP | Connects tools and knowledge to models | Doesn’t model live UI state or browser interaction |
| Playwright / WebDriver | Automates browser interaction | No semantic understanding, no policy, no verification |
| AG-UI | Agent-to-app event protocol | No canonical UI model, no action catalog, no safety layer |
UIAP is adapter-capable: expose capabilities as MCP tools, mirror events to AG-UI, or delegate execution to WebDriver — the protocol is the contract, not the runtime.
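As a rough illustration of the MCP adapter idea, the sketch below maps an action catalog entry to an MCP-style tool descriptor. The `name`, `description`, and `inputSchema` fields follow the MCP tool definition. Everything on the UIAP side (`UiapAction`, its fields, the `uiap_` name prefix) is an assumption made for this example and is not taken from the specifications.

```typescript
// Hypothetical UIAP-side shape: illustrative only, not taken from the UIAP specs.
interface UiapParam {
  name: string;
  type: "string" | "number" | "boolean";
  required: boolean;
}

interface UiapAction {
  id: string;
  description: string;
  risk: "low" | "medium" | "high";
  params: UiapParam[];
}

// MCP-style tool descriptor: a name, a description, and a JSON Schema for the input.
interface McpTool {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string }>;
    required: string[];
  };
}

function toMcpTool(action: UiapAction): McpTool {
  const properties: Record<string, { type: string }> = {};
  const required: string[] = [];
  for (const param of action.params) {
    properties[param.name] = { type: param.type };
    if (param.required) required.push(param.name);
  }
  return {
    // Dots are replaced to keep the tool name to letters, digits, and underscores
    // (a conservative choice made for this sketch).
    name: "uiap_" + action.id.replace(/\./g, "_"),
    // Surface the declared risk so the model-side caller can see it.
    description: action.description + " (risk: " + action.risk + ")",
    inputSchema: { type: "object", properties, required },
  };
}

const saveInvoice: UiapAction = {
  id: "invoice.save",
  description: "Save the currently open invoice",
  risk: "medium",
  params: [
    { name: "invoiceId", type: "string", required: true },
    { name: "note", type: "string", required: false },
  ],
};

const tool = toMcpTool(saveInvoice);
console.log(JSON.stringify(tool, null, 2));
```

An adapter along these lines would leave policy evaluation and success verification on the UIAP side; the MCP tool is only another way to request the action.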