A Computer is a Firecracker microVM running a full Linux desktop (Xfce). Use it when you need a real GUI environment: browser automation, computer-use AI agents, RPA, or screenshot-driven testing.
For code execution, dev servers, and most AI agent build loops, Sandboxes are cheaper and faster. Choose based on whether your workload needs a rendered desktop.
When to use Computer vs Sandbox
| Use case | Pick |
|---|---|
| Run untrusted code, dev server, npm install | Sandbox |
| Computer-use agent, GUI automation | Computer |
| Browser automation that needs a real rendered DOM | Computer |
| AI-agent build loop (code-gen, hot reload, preview) | Sandbox |
| Screenshot-based UI testing | Computer |
| Headless Playwright / Selenium | Sandbox (cheaper) |
| Controlling desktop apps (Figma, Slack, etc.) | Computer |
What is in the box
- Xfce desktop with Firefox and common Linux apps pre-installed.
- Terminal: bash, python, and the standard CLI toolchain.
- Persistent storage: files survive stop/start cycles via snapshot restore.
- VNC + WebSocket streaming: embed the desktop live in a browser via MIOSA’s pixel-stream protocol.
- OSA agent: pre-installed, disabled by default. Activate it for autonomous in-VM AI task execution.
Quick example
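A minimal sketch of a first session, assuming you already hold a handle to a running computer. How that handle is obtained (client constructor, authentication, create call) is not shown on this page, so the function takes it as a parameter. Only the method names come from the action reference; the `DesktopComputer` protocol, the `"firefox"` app name, and the function name are illustrative assumptions.

```python
from typing import Protocol


class DesktopComputer(Protocol):
    """Subset of the desktop methods this sketch relies on (hypothetical type)."""

    def launch(self, app: str) -> object: ...
    def hotkey(self, *keys: str) -> object: ...
    def type(self, text: str) -> object: ...
    def key(self, key: str) -> object: ...
    def screenshot(self) -> bytes: ...


def open_page_and_capture(computer: DesktopComputer, url: str) -> bytes:
    """Open Firefox, load `url`, and return a PNG screenshot of the desktop."""
    computer.launch("firefox")    # app name is an assumption
    computer.hotkey("ctrl", "l")  # Firefox shortcut: focus the address bar
    computer.type(url)
    computer.key("Return")
    return computer.screenshot()
```

A real script would likely wait for the browser window to appear and for the page to load before taking the screenshot; those waits are left out to keep the sketch short.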
Desktop action reference
All 28 methods available on a running computer:
| Group | Method | Description |
|---|---|---|
| Screen | screenshot() | Capture full desktop as PNG bytes |
| | screenshot_base64() | Same, base64-encoded; ready for LLM vision APIs |
| Click | click(x, y) | Generic click (defaults to left button) |
| | left_click(x, y) | Left-button click |
| | right_click(x, y) | Right-button click (context menu) |
| | double_click(x, y) | Double-click |
| Mouse | move_cursor(x, y) | Move cursor without clicking |
| | mouse_down(x, y) | Press and hold the mouse button |
| | mouse_up(x, y) | Release a held mouse button |
| | drag(from_x, from_y, to_x, to_y) | Click-drag between two coordinates |
| Keyboard | type(text) | Type a string at the current focus |
| | key(key) | Send a single key (e.g. "Return", "ctrl+a") |
| | hotkey(*keys) | Simultaneous key combo (e.g. "ctrl", "c") |
| | key_down(key) | Press and hold a key |
| | key_up(key) | Release a held key |
| Scroll | scroll(direction, clicks) | Scroll up/down/left/right |
| | scroll_up/down/left/right(clicks) | Convenience scroll methods |
| Clipboard | get_clipboard() | Read clipboard text |
| | set_clipboard(text) | Write text to clipboard |
| Screen info | get_screen_size() | Desktop resolution {width, height} |
| | get_cursor_position() | Current cursor {x, y} in normalized coords |
| Windows | windows() | List open windows with IDs, titles, positions |
| | launch(app) | Open an installed app by name |
| | focus_window(id) | Bring a window to the foreground |
| | get/set_window_size(id, ...) | Read or set window dimensions |
| | get/set_window_position(id, ...) | Read or set window position |
| | maximize/minimize/close_window(id) | Change window state |
| Environment | get_desktop_environment() | DE name and version (e.g. xfce4) |
| | set_wallpaper(path) | Set desktop background from a VM file path |
| | get_accessibility_tree() | AT-SPI element tree for structured agent perception |
| Shell | bash(cmd) | Execute a shell command inside the VM |
| | python(code) | Execute a Python snippet inside the VM |
| | write_file(path, content) | Write a file into the VM’s filesystem |
| | read_file(path) | Read a file from the VM’s filesystem |
See Desktop Control for full examples, parameter details, and coordinate system documentation.
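Computer-use agents usually emit structured actions rather than calling methods directly. The sketch below shows one way to route such an action to a method from the table while refusing anything outside an allowlist. The action shape (`{"action": ..., "args": {...}}`), the `dispatch_action` helper, and the assumption that the SDK accepts the table's parameter names as keyword arguments are all illustrative, not documented behavior.

```python
from typing import Any, Mapping

# Methods the agent may invoke; names are taken from the action reference table.
# Shell-level methods such as bash() are deliberately left out.
ALLOWED_ACTIONS = frozenset({
    "click", "double_click", "right_click", "move_cursor",
    "type", "key", "scroll", "screenshot_base64",
})


def dispatch_action(computer: Any, action: Mapping[str, Any]) -> Any:
    """Call the desktop method named by `action`, passing its args as keywords."""
    name = action["action"]
    if name not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {name!r}")
    method = getattr(computer, name)
    return method(**action.get("args", {}))
```

Keeping the allowlist explicit means a model that hallucinates a method name, or asks for shell access, fails loudly before anything reaches the VM.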
Workspace-scoped creation
Computers can be created inside a named workspace so that resources are isolated and billed separately. Pass external_workspace_id to attribute the computer to a tenant in your platform.
external_workspace_id is a free-form string you control — use your own tenant/org identifier. Usage is tracked per workspace in the billing dashboard so you can attribute compute costs to individual customers.
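As a rough sketch, the parameters for a workspace-scoped create call could be assembled as below. Only `external_workspace_id` and the size names come from this page; the `size` field name, the helper name, and the idea of passing these as a dictionary are assumptions, so check them against the SDK you are using.

```python
VALID_SIZES = ("small", "medium", "large")  # size names from the Sizing table


def computer_create_params(tenant_id: str, size: str = "small") -> dict:
    """Build create parameters that attribute a computer to one tenant."""
    if not tenant_id.strip():
        raise ValueError("tenant_id must be a non-empty string")
    if size not in VALID_SIZES:
        raise ValueError(f"unknown size: {size!r}")
    return {
        "size": size,                       # field name is an assumption
        "external_workspace_id": tenant_id, # your own tenant/org identifier
    }
```

Using the same identifier your platform already uses for the tenant keeps the billing dashboard's per-workspace usage easy to join against your own customer records.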
White-label desktops
Computers support per-tenant desktop branding. After starting a computer, push a wallpaper and apply it with two SDK calls. The wallpaper persists across stop/start cycles because it is stored in the VM’s snapshotted filesystem.
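The two calls are most plausibly `write_file` followed by `set_wallpaper`, both listed in the action reference. A minimal sketch, assuming `write_file` accepts raw image bytes and that the destination directory exists in the VM; the default path and the helper name are hypothetical.

```python
from typing import Any


def apply_wallpaper(
    computer: Any,
    image: bytes,
    vm_path: str = "/home/user/wallpaper.png",  # hypothetical path inside the VM
) -> str:
    """Upload a wallpaper image into the VM and set it as the desktop background."""
    computer.write_file(vm_path, image)  # call 1: push the file into the VM
    computer.set_wallpaper(vm_path)      # call 2: apply it from the VM file path
    return vm_path
```

Because the image lives on the VM's own filesystem, it is captured by the snapshot along with everything else, which is why the branding survives a stop/start cycle.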
How it fits together
Your code talks to the MIOSA API. MIOSA routes actions to envd running inside the VM. envd drives the X11 session and returns screenshots or action confirmations. For embedded views, MIOSA streams the desktop over WebSocket to a browser iframe.
Embed the desktop in a browser
Mint a stream token from your backend, then pass the URL to a browser iframe:
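The call that mints the token is not shown on this page, so the sketch below starts from the stream URL your backend receives and only covers the second half: rendering a safely escaped iframe tag. The helper name, the default dimensions, and the HTTPS check are illustrative choices, not MIOSA requirements.

```python
from html import escape


def stream_iframe(stream_url: str, width: int = 1280, height: int = 800) -> str:
    """Return an <iframe> tag that embeds the given desktop stream URL."""
    if not stream_url.startswith("https://"):
        raise ValueError("stream URL must use https")
    # Escape the URL so query-string characters cannot break out of the attribute.
    src = escape(stream_url, quote=True)
    return f'<iframe src="{src}" width="{width}" height="{height}"></iframe>'
```

Mint the token server-side and render the tag per request; since stream tokens are short-lived, a cached page may end up holding an expired URL.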
See Embedding & Streaming for the full pattern including authentication, iframe policy, and expiry handling.
Lifecycle
create → provisioning → running ⇆ stopped → deleted

| State | Description |
|---|---|
| provisioning | Firecracker is booting the rootfs |
| running | Desktop is up, accepting API actions |
| stopped | Snapshotted; file state preserved; restartable |
| deleted | Permanent; not reversible |
Stopping a computer pauses billing (or charges at the stopped rate, depending on your plan). Restarting resumes from the snapshot.
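The lifecycle can be modeled client-side as a small transition table, which is handy for guarding calls before they reach the API. This sketch encodes only the arrows drawn in the diagram; whether a running computer can be deleted directly is not stated here, so that edge is left out. The helper names are illustrative.

```python
# Allowed state changes, taken literally from the lifecycle diagram.
TRANSITIONS = {
    "provisioning": {"running"},
    "running": {"stopped"},
    "stopped": {"running", "deleted"},
    "deleted": set(),  # permanent: no way back
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram allows moving from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())


def accepts_actions(state: str) -> bool:
    """Only a running computer accepts desktop API actions."""
    return state == "running"
```

For example, `can_transition("stopped", "running")` holds, while `can_transition("deleted", "running")` does not.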
Sizing
| Size | vCPU | RAM | Disk | Typical use |
|---|---|---|---|---|
| small | 2 | 4 GB | 20 GB | Single agent, basic browsing |
| medium | 4 | 8 GB | 50 GB | Heavier desktop apps, faster page loads |
| large | 8 | 16 GB | 100 GB | Multi-app, large-context AI workloads |
GPU support for accelerated desktop and GPU-aware in-VM agents is plan-dependent.
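If you provision computers programmatically, a small helper can pick the smallest size that meets a workload's requirements. The numbers below mirror the sizing table; the `Size` type and the function name are illustrative, and GPU availability is not modeled.

```python
from typing import NamedTuple, Optional


class Size(NamedTuple):
    name: str
    vcpu: int
    ram_gb: int
    disk_gb: int


# Ordered smallest to largest, values copied from the sizing table.
SIZES = [
    Size("small", 2, 4, 20),
    Size("medium", 4, 8, 50),
    Size("large", 8, 16, 100),
]


def smallest_fitting_size(vcpu: int = 1, ram_gb: int = 1, disk_gb: int = 1) -> Optional[str]:
    """Return the name of the smallest size meeting all requirements, or None."""
    for size in SIZES:
        if size.vcpu >= vcpu and size.ram_gb >= ram_gb and size.disk_gb >= disk_gb:
            return size.name
    return None
```

A workload needing 6 GB of RAM resolves to `medium`, and one needing 200 GB of disk resolves to `None`, signalling that no listed size fits.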
Next
- Full action reference: screenshot, click, type, key, scroll, drag, hotkey, windows, accessibility tree, and more.
- Embed the live pixel stream in your own UI using short-lived stream tokens.
- Bring your own hardware (Mac, Linux, Windows). One command registers your machine as a MIOSA computer.