CUP (Computer Use Protocol)

MCP Tools

Reference for all 9 CUP tools exposed via the MCP server.

Tools overview

CUP's MCP server exposes 9 tools for AI agents to perceive and interact with the UI.

snapshot

Capture the active window's accessibility tree.

snapshot()

Returns the foreground window's UI tree in compact text format with a header containing platform, screen, and app metadata.
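
Because snapshot takes no parameters, it is a convenient first tool to call. The sketch below shows what a raw MCP `tools/call` request for it could look like as a JSON-RPC 2.0 message. The helper name `tool_call` is hypothetical and not part of CUP; most MCP client libraries build this envelope for you.

```python
import json
from itertools import count

# Hypothetical helper, not part of CUP: builds the JSON-RPC 2.0 envelope
# that MCP uses for tool invocations ("tools/call").
_request_ids = count(1)


def tool_call(name, arguments=None):
    """Return an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# snapshot has no parameters, so the arguments object stays empty.
request = tool_call("snapshot")
print(json.dumps(request, indent=2))
```

The same envelope applies to every tool on this page; only `name` and `arguments` change.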

snapshot_app

Capture a specific app's window by title.

snapshot_app(app: string)

| Parameter | Type | Description |
| --- | --- | --- |
| app | string | Case-insensitive substring match against window titles |

Use this when you need to interact with a window that is not in the foreground.
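
To make the matching rule concrete, here is a minimal sketch of a case-insensitive substring match over window titles. It illustrates the documented behavior only and is not CUP's implementation. The function name `match_windows` and the sample titles are made up, and what the server does when several windows match is not covered here.

```python
def match_windows(app, titles):
    """Return the titles that contain `app`, ignoring case."""
    needle = app.casefold()
    return [title for title in titles if needle in title.casefold()]


# Illustrative window titles, not real CUP output.
titles = [
    "Untitled - Notepad",
    "Spotify Premium",
    "Inbox - Mozilla Thunderbird",
]

print(match_windows("NOTEPAD", titles))
print(match_windows("mozilla", titles))
```

A short, distinctive fragment of the title is usually enough, since the match does not need to cover the whole title.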

overview

List all open windows. This is near-instant because it does not walk any accessibility tree.

overview()

Returns a lightweight window list showing app names, PIDs, and bounds. Use this to discover what apps are open before targeting one with snapshot_app.

snapshot_desktop

Capture the desktop surface (icons, widgets, shortcuts).

snapshot_desktop()

Returns the desktop accessibility tree for interacting with desktop items.

find

Search the last captured tree for elements matching criteria.

find(query?, role?, name?, state?)

| Parameter | Type | Description |
| --- | --- | --- |
| query | string? | Freeform semantic query (e.g., "play button") |
| role | string? | Role filter with synonyms |
| name | string? | Fuzzy name match |
| state | string? | Exact state match |

Searches the full tree (including pruned elements) with semantic matching and relevance ranking. Results are sorted by relevance. See Session API > find() for usage examples.
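
Since all four parameters are optional, a client only needs to send the ones it actually uses. The sketch below builds the arguments object for a find call and drops anything left unset. The helper `find_arguments` is hypothetical, and the role and state values shown ("button", "focused") are assumed examples rather than values confirmed by this page.

```python
def find_arguments(query=None, role=None, name=None, state=None):
    """Build the arguments dict for find, omitting unset parameters."""
    candidates = {"query": query, "role": role, "name": name, "state": state}
    return {key: value for key, value in candidates.items() if value is not None}


# A freeform semantic search.
semantic = find_arguments(query="play button")

# A structured search; "button" and "focused" are assumed example values.
structured = find_arguments(role="button", state="focused")

print(semantic)
print(structured)
```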

action

Perform an action on a UI element or send a keyboard shortcut.

action(action, element_id?, value?, direction?, keys?)

| Parameter | Type | Description |
| --- | --- | --- |
| action | string | Action name (click, type, press, etc.) |
| element_id | string? | Target element (e.g., "e14") |
| value | string? | Text for type or setvalue |
| direction | string? | Direction for scroll (up/down/left/right) |
| keys | string? | Key combo for press (e.g., "ctrl+s") |

See Actions Reference for all 15 actions and their parameters.
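
The parameter table suggests which optional argument goes with which action. The sketch below encodes that pairing as client-side checks before a call is sent. These checks are assumptions inferred from the table, not the server's actual validation rules, and `action_arguments` is a hypothetical helper.

```python
DIRECTIONS = {"up", "down", "left", "right"}


def action_arguments(action, element_id=None, value=None, direction=None, keys=None):
    """Build the arguments dict for action, with assumed sanity checks."""
    # Assumed rule: type and setvalue carry their text in `value`.
    if action in ("type", "setvalue") and value is None:
        raise ValueError(f"{action} needs a value")
    # Assumed rule: scroll needs one of the four documented directions.
    if action == "scroll" and direction not in DIRECTIONS:
        raise ValueError("scroll needs direction up, down, left, or right")
    # Assumed rule: press carries its key combo in `keys`.
    if action == "press" and keys is None:
        raise ValueError("press needs keys")
    arguments = {
        "action": action,
        "element_id": element_id,
        "value": value,
        "direction": direction,
        "keys": keys,
    }
    return {key: val for key, val in arguments.items() if val is not None}


click = action_arguments("click", element_id="e14")
save = action_arguments("press", keys="ctrl+s")
print(click)
print(save)
```

Failing early on the client side saves a round trip, but the Actions Reference remains the authority on what each action requires.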

open_app

Open an application by name with fuzzy matching.

open_app(name: string)

| Parameter | Type | Description |
| --- | --- | --- |
| name | string | App name (fuzzy matched against installed apps) |

Waits for the app window to appear before returning.
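
This page does not say how the fuzzy matching works. As a rough illustration of the idea only, the sketch below uses Python's standard `difflib.get_close_matches` as a stand-in technique. It is not CUP's algorithm, and the helper `fuzzy_pick`, the 0.6 cutoff, and the app list are all assumptions.

```python
from difflib import get_close_matches


def fuzzy_pick(name, installed):
    """Return the installed app whose name is closest to `name`, or None."""
    # Compare in lowercase so that casing does not affect the score.
    by_lower = {app.lower(): app for app in installed}
    hits = get_close_matches(name.lower(), list(by_lower), n=1, cutoff=0.6)
    return by_lower[hits[0]] if hits else None


# Illustrative list of installed apps.
installed = ["Firefox", "Visual Studio Code", "Calculator"]

print(fuzzy_pick("calculater", installed))
print(fuzzy_pick("zzzz", installed))
```

Because matching is fuzzy, a misspelled or abbreviated name may still resolve to an app, so it is worth confirming which window actually opened.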

page

Page through clipped content in a scrollable container.

page(element_id, direction?, offset?, limit?)

| Parameter | Type | Description |
| --- | --- | --- |
| element_id | string | Scrollable container element ID (e.g., "e5") |
| direction | string? | Page direction: up, down, left, or right |
| offset | int? | Jump to a specific child index (overrides direction) |
| limit | int? | Override page size (default: match viewport count) |

When a snapshot shows "N more items — page(...) to see", use this tool to retrieve the next batch of hidden children. It serves them from the cached tree and does not scroll the actual UI. Pagination resets after any action or new snapshot.
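
The interplay of direction, offset, and limit can be sketched as a window sliding over a cached list of children. This is a simplified model under stated assumptions, not CUP's implementation: down and right are taken to move forward, up and left to move backward, the step equals the page size, and the start index is clamped to the list. The function `page_window` is hypothetical.

```python
def page_window(children, start, viewport, direction="down", offset=None, limit=None):
    """Return (new_start, page) for a cached child list."""
    # limit overrides the page size; otherwise use the viewport count.
    size = limit if limit is not None else viewport
    if offset is not None:
        # offset jumps to a specific child index and overrides direction.
        new_start = offset
    elif direction in ("down", "right"):
        new_start = start + size
    else:
        new_start = start - size
    # Clamp so the window always starts inside the list (assumed behavior).
    new_start = max(0, min(new_start, max(len(children) - 1, 0)))
    return new_start, children[new_start:new_start + size]


children = [f"item{i}" for i in range(10)]

start, page = page_window(children, start=0, viewport=3)
print(start, page)

start, page = page_window(children, start=start, viewport=3, offset=8, direction="up")
print(start, page)
```

Nothing in this model touches the live UI, which mirrors the point above: paging only changes which cached children are shown.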

screenshot

Capture a screenshot of the screen.

screenshot(region_x?, region_y?, region_w?, region_h?)

| Parameter | Type | Description |
| --- | --- | --- |
| region_x | int? | Left edge of capture region |
| region_y | int? | Top edge of capture region |
| region_w | int? | Width of capture region |
| region_h | int? | Height of capture region |

By default, the full primary monitor is captured. The result is returned as a PNG image.
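
A client typically needs to build the optional region arguments and then decode the returned image. The sketch below assumes the result arrives as a standard MCP image content item (base64 `data` with `mimeType` "image/png") and that a region is given either completely or not at all. Both are assumptions, and `screenshot_arguments` and `png_bytes` are hypothetical helpers.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def screenshot_arguments(x=None, y=None, w=None, h=None):
    """Build screenshot arguments; an empty dict means the default capture."""
    region = {"region_x": x, "region_y": y, "region_w": w, "region_h": h}
    given = {key: value for key, value in region.items() if value is not None}
    # Assumed rule: a partial region is ambiguous, so reject it client-side.
    if given and len(given) != 4:
        raise ValueError("pass all four region values or none")
    return given


def png_bytes(content_item):
    """Decode an image content item and verify the PNG signature."""
    data = base64.b64decode(content_item["data"])
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG")
    return data


# Stand-in for a tool result: only the signature plus filler bytes.
sample = {
    "type": "image",
    "mimeType": "image/png",
    "data": base64.b64encode(PNG_SIGNATURE + b"filler").decode("ascii"),
}

print(screenshot_arguments(0, 0, 800, 600))
print(len(png_bytes(sample)))
```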
