
A Computer is a Firecracker microVM running a full Linux desktop (Xfce). Use it when you need a real GUI environment: browser automation, computer-use AI agents, RPA, or screenshot-driven testing.

For code execution, dev servers, and most AI agent build loops, Sandboxes are cheaper and faster. Choose based on whether your workload needs a rendered desktop.

When to use Computer vs Sandbox

| Use case | Pick |
| --- | --- |
| Run untrusted code, dev server, npm install | Sandbox |
| Computer-use agent, GUI automation | Computer |
| Browser automation that needs a real rendered DOM | Computer |
| AI-agent build loop (code-gen, hot reload, preview) | Sandbox |
| Screenshot-based UI testing | Computer |
| Headless Playwright / Selenium | Sandbox (cheaper) |
| Controlling desktop apps (Figma, Slack, etc.) | Computer |

What is in the box

  • Xfce desktop with Firefox and common Linux apps pre-installed.
  • Terminal: bash, python, standard CLI toolchain.
  • Persistent storage — files survive stop/start cycles via snapshot restore.
  • VNC + WebSocket streaming — embed the desktop live in a browser via MIOSA’s pixel-stream protocol.
  • OSA agent — pre-installed, disabled by default. Activate it for autonomous in-VM AI task execution.

Quick example
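
A minimal sketch of a typical session, using only method names from the action reference on this page (click, type, key, screenshot). How a real computer handle is created is not covered here, so `FakeComputer` is a hypothetical in-memory stand-in that records calls; the address-bar coordinates are also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FakeComputer:
    """Hypothetical stand-in for a running computer handle.

    Only the method names come from the action reference; the bodies
    just record what was called so the sketch runs offline.
    """
    calls: list = field(default_factory=list)

    def screenshot(self) -> bytes:
        self.calls.append(("screenshot",))
        return b"\x89PNG\r\n\x1a\n"  # PNG signature only, not a real image

    def click(self, x: int, y: int) -> None:
        self.calls.append(("click", x, y))

    def type(self, text: str) -> None:
        self.calls.append(("type", text))

    def key(self, key: str) -> None:
        self.calls.append(("key", key))


def search_in_browser(computer, query: str) -> bytes:
    """Click an address bar, type a query, press Return, return a screenshot."""
    computer.click(640, 60)   # assumed address-bar position
    computer.type(query)
    computer.key("Return")
    return computer.screenshot()


computer = FakeComputer()
png = search_in_browser(computer, "miosa docs")
print(computer.calls)
```

Against a real computer, the same `search_in_browser` function would work unchanged as long as the handle exposes these four methods.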

Desktop action reference

All 28 methods available on a running computer:

| Group | Method | Description |
| --- | --- | --- |
| Screen | screenshot() | Capture full desktop as PNG bytes |
| Screen | screenshot_base64() | Same, base64-encoded; ready for LLM vision APIs |
| Click | click(x, y) | Generic click (defaults to left button) |
| Click | left_click(x, y) | Left-button click |
| Click | right_click(x, y) | Right-button click (context menu) |
| Click | double_click(x, y) | Double-click |
| Mouse | move_cursor(x, y) | Move cursor without clicking |
| Mouse | mouse_down(x, y) | Press and hold the mouse button |
| Mouse | mouse_up(x, y) | Release a held mouse button |
| Mouse | drag(from_x, from_y, to_x, to_y) | Click-drag between two coordinates |
| Keyboard | type(text) | Type a string at the current focus |
| Keyboard | key(key) | Send a single key (e.g. "Return", "ctrl+a") |
| Keyboard | hotkey(*keys) | Simultaneous key combo (e.g. "ctrl", "c") |
| Keyboard | key_down(key) | Press and hold a key |
| Keyboard | key_up(key) | Release a held key |
| Scroll | scroll(direction, clicks) | Scroll up/down/left/right |
| Scroll | scroll_up/down/left/right(clicks) | Convenience scroll methods |
| Clipboard | get_clipboard() | Read clipboard text |
| Clipboard | set_clipboard(text) | Write text to clipboard |
| Screen info | get_screen_size() | Desktop resolution {width, height} |
| Screen info | get_cursor_position() | Current cursor {x, y} in normalized coords |
| Windows | windows() | List open windows with IDs, titles, positions |
| Windows | launch(app) | Open an installed app by name |
| Windows | focus_window(id) | Bring a window to the foreground |
| Windows | get/set_window_size(id, ...) | Read or set window dimensions |
| Windows | get/set_window_position(id, ...) | Read or set window position |
| Windows | maximize/minimize/close_window(id) | Change window state |
| Environment | get_desktop_environment() | DE name and version (e.g. xfce4) |
| Environment | set_wallpaper(path) | Set desktop background from a VM file path |
| Environment | get_accessibility_tree() | AT-SPI element tree for structured agent perception |
| Shell | bash(cmd) | Execute a shell command inside the VM |
| Shell | python(code) | Execute a Python snippet inside the VM |
| Shell | write_file(path, content) | Write a file into the VM’s filesystem |
| Shell | read_file(path) | Read a file from the VM’s filesystem |

See Desktop Control for full examples, parameter details, and coordinate system documentation.
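
Because get_cursor_position() reports normalized coordinates while get_screen_size() reports a resolution, it can help to convert between the two, for example when overlaying the cursor on a screenshot. The helper below is a sketch that assumes "normalized" means floats in the range 0.0 to 1.0 relative to the screen size; the Desktop Control page is the authority on the actual coordinate system, and `to_pixels` is a hypothetical name.

```python
def to_pixels(norm: dict, size: dict) -> tuple[int, int]:
    """Map a normalized {x, y} position onto a {width, height} screen.

    Assumes normalized values lie in [0.0, 1.0]; values outside that
    range are clamped so the result is always a valid pixel index.
    """
    nx = min(max(float(norm["x"]), 0.0), 1.0)
    ny = min(max(float(norm["y"]), 0.0), 1.0)
    # The last addressable pixel is width - 1 / height - 1.
    px = round(nx * (size["width"] - 1))
    py = round(ny * (size["height"] - 1))
    return px, py


screen = {"width": 1280, "height": 720}   # shape returned by get_screen_size()
print(to_pixels({"x": 1.0, "y": 1.0}, screen))
```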

Workspace-scoped creation

Computers can be created inside a named workspace so that resources are isolated and billed separately. Pass external_workspace_id to attribute the computer to a tenant in your platform.

external_workspace_id is a free-form string you control — use your own tenant/org identifier. Usage is tracked per workspace in the billing dashboard so you can attribute compute costs to individual customers.
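
As an illustration of attributing a computer to a tenant, the sketch below assembles creation parameters in plain Python. Only the `external_workspace_id` field name comes from this page; the `size` key, the helper name `build_create_params`, and the way the parameters are ultimately sent to the API are assumptions.

```python
VALID_SIZES = {"small", "medium", "large"}  # names from the sizing table on this page


def build_create_params(tenant_id: str, size: str = "small") -> dict:
    """Assemble creation parameters attributed to one tenant.

    `external_workspace_id` is the documented field; the `size` key and
    this helper itself are illustrative assumptions.
    """
    tenant_id = tenant_id.strip()
    if not tenant_id:
        raise ValueError("external_workspace_id must be a non-empty string")
    if size not in VALID_SIZES:
        raise ValueError(f"unknown size {size!r}")
    return {"external_workspace_id": tenant_id, "size": size}


params = build_create_params("org_1234", size="medium")
print(params)
```

Validating the tenant identifier before the request goes out keeps usage from being recorded against an empty or malformed workspace.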

White-label desktops

Computers support per-tenant desktop branding. After starting a computer, push a wallpaper and apply it with two SDK calls. The wallpaper persists across stop/start cycles because it is stored in the VM’s snapshotted filesystem.
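
The two calls are presumably write_file followed by set_wallpaper, both listed in the action reference. The sketch below wraps them in one helper. `RecordingComputer` is a hypothetical stand-in so the snippet runs offline, the target directory is arbitrary, and it is an assumption that write_file accepts raw bytes for image content.

```python
class RecordingComputer:
    """Hypothetical stand-in that records calls instead of driving a VM."""

    def __init__(self):
        self.files = {}
        self.wallpaper = None

    def write_file(self, path, content):
        self.files[path] = content

    def set_wallpaper(self, path):
        # Mirror the documented contract: the path must exist inside the VM.
        if path not in self.files:
            raise FileNotFoundError(path)
        self.wallpaper = path


def apply_branding(computer, tenant_id: str, image: bytes) -> str:
    """Upload a tenant's wallpaper and apply it; returns the VM path used."""
    path = f"/home/user/branding/{tenant_id}.png"  # arbitrary location for this sketch
    computer.write_file(path, image)   # call 1: push the image into the VM
    computer.set_wallpaper(path)       # call 2: set it as the desktop background
    return path


vm = RecordingComputer()
used_path = apply_branding(vm, "acme", b"\x89PNG\r\n\x1a\n")
print(used_path)
```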

How it fits together

Your code talks to the MIOSA API. MIOSA routes actions to envd running inside the VM. envd drives the X11 session and returns screenshots or action confirmations. For embedded views, MIOSA streams the desktop over WebSocket to a browser iframe.

Embed the desktop in a browser

Mint a stream token from your backend, then pass the URL to a browser iframe:
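
A sketch of the second half of that flow, rendering the iframe on the server once your backend already holds a stream URL. The token-minting call itself is not shown because its signature is not given on this page. The URL below is a placeholder, `desktop_iframe` is a hypothetical helper, and requiring https is an assumption.

```python
from html import escape
from urllib.parse import urlsplit


def desktop_iframe(stream_url: str, width: int = 1280, height: int = 720) -> str:
    """Return an <iframe> tag for a stream URL minted by your backend."""
    parts = urlsplit(stream_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("stream URL must be an absolute https:// URL")
    # Escape the URL so query-string characters cannot break out of the attribute.
    src = escape(stream_url, quote=True)
    return f'<iframe src="{src}" width="{width}" height="{height}"></iframe>'


# Placeholder URL; a real one comes from the stream-token call on your backend.
tag = desktop_iframe("https://stream.example.com/desktop?token=abc&view=1")
print(tag)
```

Keeping the token mint on the backend means API credentials never reach the browser; only the short-lived stream URL does.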

See Embedding & Streaming for the full pattern including authentication, iframe policy, and expiry handling.

Lifecycle


create → provisioning → running ⇆ stopped → deleted

| State | Description |
| --- | --- |
| provisioning | Firecracker is booting the rootfs |
| running | Desktop is up, accepting API actions |
| stopped | Snapshotted; file state preserved; restartable |
| deleted | Permanent; not reversible |

Stopping a computer pauses billing (or charges at the stopped rate, depending on your plan). Restarting resumes from the snapshot.
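
Client code that tracks computer state can encode the lifecycle diagram as a transition table. The sketch below follows the arrows exactly as drawn; whether a running computer can be deleted without stopping first is not stated on this page, so that edge is left out as an assumption. `TRANSITIONS` and `can_transition` are hypothetical names.

```python
# Allowed moves, read literally from:
# create → provisioning → running ⇆ stopped → deleted
TRANSITIONS = {
    "provisioning": {"running"},
    "running": {"stopped"},
    "stopped": {"running", "deleted"},
    "deleted": set(),  # terminal: deletion is not reversible
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram has an arrow from current to target."""
    return target in TRANSITIONS.get(current, set())


print(can_transition("stopped", "running"))
print(can_transition("deleted", "running"))
```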

Sizing

| Size | vCPU | RAM | Disk | Typical use |
| --- | --- | --- | --- | --- |
| small | 2 | 4 GB | 20 GB | Single agent, basic browsing |
| medium | 4 | 8 GB | 50 GB | Heavier desktop apps, faster page loads |
| large | 8 | 16 GB | 100 GB | Multi-app, large-context AI workloads |

GPU support for accelerated desktop and GPU-aware in-VM agents is plan-dependent.
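
One way to choose a size programmatically is to pick the smallest tier that meets a workload's minimum requirements. The sketch below hard-codes the figures from the sizing table above; `pick_size` is a hypothetical helper, not part of the SDK, and it ignores GPU needs.

```python
# (name, vCPU, RAM in GB, disk in GB), ordered smallest to largest.
SIZES = [
    ("small", 2, 4, 20),
    ("medium", 4, 8, 50),
    ("large", 8, 16, 100),
]


def pick_size(vcpu: int = 1, ram_gb: int = 1, disk_gb: int = 1) -> str:
    """Return the smallest size that satisfies every stated minimum."""
    for name, cpus, ram, disk in SIZES:
        if cpus >= vcpu and ram >= ram_gb and disk >= disk_gb:
            return name
    raise ValueError("no listed size satisfies these requirements")


print(pick_size(vcpu=2, ram_gb=6))
```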

Next

Desktop Control

Full action reference: screenshot, click, type, key, scroll, drag, hotkey, windows, accessibility tree, and more.

Embedding & Streaming

Embed the live pixel stream in your own UI using short-lived stream tokens.

BYOC / OpenComputers

Bring your own hardware (Mac, Linux, Windows). One command registers your machine as a MIOSA computer.
