CUP: Computer Use Protocol

Envelope Format

The top-level CUP JSON envelope structure.

Overview

The CUP envelope is the root JSON object returned by snapshot_raw(). It contains metadata about the capture and the UI tree.

Structure

{
  "version": "0.1.0",
  "platform": "windows",
  "timestamp": 1740067200000,
  "screen": {
    "w": 2560,
    "h": 1440,
    "scale": 1.0
  },
  "scope": "foreground",
  "app": {
    "name": "Spotify",
    "pid": 1234
  },
  "tree": [ ... ],
  "windows": [ ... ]
}
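As a rough illustration of how a consumer might read this envelope, the sketch below parses a copy of the example with Python's standard library and converts the millisecond timestamp to a UTC datetime. The `tree` and `windows` arrays are left empty because their contents are elided above, and the variable names are illustrative only, not part of any CUP SDK.

```python
import json
from datetime import datetime, timezone

# Copy of the example envelope; "tree" and "windows" are empty placeholders
# because the original example elides their contents.
raw = """
{
  "version": "0.1.0",
  "platform": "windows",
  "timestamp": 1740067200000,
  "screen": {"w": 2560, "h": 1440, "scale": 1.0},
  "scope": "foreground",
  "app": {"name": "Spotify", "pid": 1234},
  "tree": [],
  "windows": []
}
"""

envelope = json.loads(raw)

# The timestamp is Unix epoch milliseconds, so divide by 1000 for seconds.
captured_at = datetime.fromtimestamp(envelope["timestamp"] / 1000, tz=timezone.utc)

app = envelope["app"]
summary = f'{app["name"]} (pid {app["pid"]}) on {envelope["platform"]}'
print(summary, "captured at", captured_at.isoformat())
```

For the example values, the timestamp corresponds to 2025-02-20T16:00:00 UTC.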

Fields

| Field | Type | Description |
| --- | --- | --- |
| version | string | Schema version (semver) |
| platform | enum | Source platform: windows, macos, linux, web, android, ios |
| timestamp | integer | Unix epoch in milliseconds |
| screen.w | integer | Screen width in pixels |
| screen.h | integer | Screen height in pixels |
| screen.scale | number | Display scale factor (default 1.0) |
| scope | enum | Capture scope: overview, foreground, desktop, full |
| app.name | string | Focused application name |
| app.pid | integer | Process ID |
| app.bundleId | string? | macOS/iOS bundle ID or Android package name |
| tree | array | Root-level UI nodes |
| windows | array | Window list with metadata |
| tools | array? | WebMCP tools (web platform only) |
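The field list above can be turned into a lightweight sanity check. The following is a minimal sketch, not an official validator: `check_envelope` is a hypothetical helper, and it assumes that every field not marked with `?` is required, which the table suggests but does not state outright.

```python
# Enum values taken from the field table.
PLATFORMS = {"windows", "macos", "linux", "web", "android", "ios"}
SCOPES = {"overview", "foreground", "desktop", "full"}

# Assumption: top-level fields without a "?" in the table are required.
REQUIRED_KEYS = ("version", "platform", "timestamp", "screen",
                 "scope", "app", "tree", "windows")


def is_int(value) -> bool:
    """True for integers, excluding bool (a subclass of int in Python)."""
    return isinstance(value, int) and not isinstance(value, bool)


def check_envelope(env: dict) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in env:
            problems.append(f"missing field: {key}")
    if "platform" in env and env["platform"] not in PLATFORMS:
        problems.append(f"invalid platform: {env['platform']!r}")
    if "scope" in env and env["scope"] not in SCOPES:
        problems.append(f"invalid scope: {env['scope']!r}")
    if "timestamp" in env and not is_int(env["timestamp"]):
        problems.append("timestamp must be an integer (epoch milliseconds)")
    screen = env.get("screen", {})
    for dim in ("w", "h"):
        if dim in screen and not is_int(screen[dim]):
            problems.append(f"screen.{dim} must be an integer")
    return problems


def screen_scale(env: dict) -> float:
    """Read screen.scale, falling back to the documented default of 1.0."""
    return env.get("screen", {}).get("scale", 1.0)
```

A check like this only covers the top-level fields listed here; it does not inspect the contents of `tree`, `windows`, or `tools`.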

Window list

The windows array provides metadata about open windows:

{
  "windows": [
    {
      "title": "Spotify",
      "pid": 1234,
      "focused": true,
      "bounds": { "x": 120, "y": 40, "w": 1680, "h": 1020 }
    },
    {
      "title": "Terminal",
      "pid": 5678,
      "focused": false,
      "bounds": { "x": 0, "y": 0, "w": 800, "h": 600 }
    }
  ]
}
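One common use of the window list is locating the focused window. The sketch below uses the example data above; `focused_window` and `center` are hypothetical helpers, and it assumes that `bounds.x` and `bounds.y` give the top-left corner and `bounds.w` and `bounds.h` the size, which the field names imply but this page does not spell out.

```python
from typing import Optional

# The "windows" array from the example above.
windows = [
    {"title": "Spotify", "pid": 1234, "focused": True,
     "bounds": {"x": 120, "y": 40, "w": 1680, "h": 1020}},
    {"title": "Terminal", "pid": 5678, "focused": False,
     "bounds": {"x": 0, "y": 0, "w": 800, "h": 600}},
]


def focused_window(windows: list[dict]) -> Optional[dict]:
    """Return the first window flagged as focused, or None if there is none."""
    return next((w for w in windows if w.get("focused")), None)


def center(bounds: dict) -> tuple[int, int]:
    """Center point of a bounds rectangle (assumes x/y is the top-left corner)."""
    return (bounds["x"] + bounds["w"] // 2, bounds["y"] + bounds["h"] // 2)


win = focused_window(windows)
if win is not None:
    print(win["title"], center(win["bounds"]))
```

For the example data, this selects the Spotify window, whose center under the stated assumption is (960, 550).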