
Benchmarks

Production sandbox benchmark

100 sandbox lifecycles, measured through the production API path.

These numbers include API authentication, admission checks, scheduling, VM readiness, a command probe inside the sandbox, and cleanup. They are not kernel-only boot timings. The production lane uses `POST /api/v1/sandboxes/run`, which combines create, server-side readiness, and first exec into one durable production request.
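As a rough illustration of what one fused request could look like from a client, the sketch below builds a `POST /api/v1/sandboxes/run` request and times an arbitrary send function. The body fields `template` and `command` mirror the SDK call shown later on this page. The Bearer authorization scheme, the JSON body layout, and the helper names `buildRunRequest` and `timeMs` are assumptions for illustration, not confirmed API details.

```typescript
// Hypothetical request shape for the fused run endpoint.
interface RunRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so it can be inspected or replayed.
function buildRunRequest(
  apiUrl: string,
  apiKey: string,
  template: string,
  command: string,
): RunRequest {
  return {
    // Trim trailing slashes so the path is not doubled.
    url: `${apiUrl.replace(/\/+$/, "")}/api/v1/sandboxes/run`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ template, command }), // assumed body layout
  };
}

// Measures wall-clock milliseconds around any async send function.
async function timeMs(
  send: () => Promise<unknown>,
  now: () => number = Date.now,
): Promise<number> {
  const start = now();
  await send();
  return now() - start;
}
```

A caller could pass `() => fetch(req.url, req)` to `timeMs` to get a client-side command-ready time for one sample. Client-side timing is itself an assumption about how the benchmark measures.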

Verified 2026-06-09 UTC: 100 / 100 successful create -> ready -> exec -> destroy lifecycles.

| Metric | Value | Note |
| --- | --- | --- |
| Success | 100% | 0 failures |
| Command-ready p50 | 512ms | production run path |
| Command-ready p95 | 0.992s | 10-way burst |
| Command-ready p99 | 1.300s | all warm path |
| Total run | 24s | 100 samples |

Latest verified run

| Parameter | Value |
| --- | --- |
| Samples | 100 |
| Client concurrency | 10 |
| Batch pace | 0.5s |
| Template | miosa-sandbox |
| Size | xs |
| Mode | run (POST /api/v1/sandboxes/run) |
| Regions requested | us-mia, us-east, us-west |
| Successful lifecycles | 100 / 100 |
| Admission rejections | 0 |
| Total elapsed | 24s |

Object storage smoke benchmark

MIOSA also verifies the tenant object-storage API through the same public API surface customers use:

POST   /api/v1/storage/buckets
PUT    /api/v1/storage/buckets/:id/objects/:key
GET    /api/v1/storage/buckets/:id/objects/:key
DELETE /api/v1/storage/buckets/:id/objects/:key

The June 9, 2026 smoke used a scoped API key with storage:read and storage:write, created a temporary private bucket, uploaded one object, downloaded it back, verified the SHA-256 digest, then deleted the object and bucket.

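| Object size | Samples | Result | Upload | Download | Throughput | Composite |
| --- | --- | --- | --- | --- | --- | --- |
| 1 KB | 1 | 1 / 1 | 212ms | 178ms | 0.05 Mbps | 94.39 |
| 1 MB | 1 | 1 / 1 | 1.155s | 506ms | 16.59 Mbps | 92.61 |
| 5 MB | 10 | 10 / 10 | 4.975s | 682ms | 61.48 Mbps | 87.01 |
| 8 MB | 1 | 0 / 1 | failed | failed | failed | 0.00 |
| 10 MB | 1 | 0 / 1 | failed | failed | failed | 0.00 |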
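The throughput column appears to track download time rather than upload time. This is an inference from the reported numbers, not a documented formula. A minimal sketch of the check, assuming object sizes are binary megabytes (1 MB = 1,048,576 bytes) and throughput is bits transferred divided by download time:

```typescript
const MIB = 1024 * 1024; // assumed: sizes in the table are binary megabytes

// Megabits per second for a transfer of `bytes` completed in `seconds`.
function throughputMbps(bytes: number, seconds: number): number {
  if (seconds <= 0) throw new Error("seconds must be positive");
  return (bytes * 8) / seconds / 1_000_000;
}

// Download times taken from the 1 MB and 5 MB rows of the table above.
const oneMbRow = throughputMbps(1 * MIB, 0.506);
const fiveMbRow = throughputMbps(5 * MIB, 0.682);

console.log(oneMbRow.toFixed(2), fiveMbRow.toFixed(2)); // prints "16.58 61.50"
```

These values land close to the reported 16.59 Mbps and 61.48 Mbps. The small differences would be expected if the published figure is a median of per-sample throughput rather than throughput at the median download time. The same arithmetic on the 5 MB upload time (4.975s) gives roughly 8.4 Mbps, which is consistent with the gap being upload-side.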

Storage comparison against the ComputeSDK reference

The ComputeSDK storage benchmark shown in the reference screenshots uses 10 MB files and 100 iterations. That exact test is not passing on MIOSA yet because the current direct raw-upload path fails at 8-10 MB. The table below compares the reference values with MIOSA’s latest passing live storage run so the gap is visible instead of hidden.

| Provider | Benchmark shape | Success | Upload median | Download median | Throughput median | Composite |
| --- | --- | --- | --- | --- | --- | --- |
| Tigris | 10 MB, 100 iterations | 100% | 319ms | 277ms | 303 Mbps | 95.4 |
| Cloudflare R2 | 10 MB, 100 iterations | 100% | 628ms | 276ms | 303 Mbps | 94.8 |
| MIOSA Storage API | 5 MB, 10 iterations | 100% | 4.975s | 682ms | 61.48 Mbps | 87.01 |
| MIOSA Storage API | 10 MB, 1 probe | 0% | failed | failed | failed | 0.00 |

Comments:

  • MIOSA is not leaderboard-comparable to Tigris or Cloudflare R2 until the 10 MB direct upload path passes consistently.
  • The current gap is upload-side. MIOSA’s 5 MB download median is usable but still slower than the ComputeSDK storage leaders; upload median is much slower.
  • The next backend fix is to move object upload off the default request-body read path and onto the streaming or presigned upload path, then rerun the same 10 MB, 100 iteration benchmark.
  • Until that fix ships, customer docs should treat direct API uploads as a small-object path and recommend presigned/direct storage uploads for larger artifacts.

MIOSA latency split

Production run path:

| Phase | p50 | p95 | p99 | Min | Max |
| --- | --- | --- | --- | --- | --- |
| Command-ready TTI | 512ms | 0.992s | 1.300s | 295ms | 1.331s |
| VM boot slice | 95ms | 253ms | 253ms | 30ms | 312ms |

Previous standard public path, kept as a baseline:

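| Phase | p50 | p95 | p99 | Min | Max |
| --- | --- | --- | --- | --- | --- |
| Create request | 287ms | 348ms | 353ms | 218ms | 356ms |
| Ready/running | 589ms | 1.005s | 1.246s | 477ms | 1.246s |
| Command-ready TTI | 947ms | 1.333s | 1.348s | 762ms | 1.610s |
| VM boot slice | 101ms | 253ms | 524ms | 31ms | 524ms |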
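The p50, p95, and p99 columns summarize per-sample latencies. This page does not state which percentile method the benchmark script uses, so the sketch below assumes the nearest-rank definition, which always returns an observed sample. The function name `percentile` is illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // 1-based rank; multiplying before dividing avoids floating-point drift.
  const rank = Math.ceil((p * sorted.length) / 100);
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("rank out of range");
  return value;
}

// Illustrative input only: these five values are not the benchmark's raw samples.
const illustrativeMs = [295, 512, 1331, 400, 992];
console.log(percentile(illustrativeMs, 50), percentile(illustrativeMs, 95)); // prints "512 1331"
```

With only 100 samples, nearest-rank p99 is the 99th sorted value, so a single slow lifecycle can move it. That is one reason p99 can sit close to the max in the tables above.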

What is making the number slower?

The VM path is fast. In the production run, all 100 / 100 samples used the warm path and the reported boot slice had a 95ms median. The standard path was slower because it used three public round trips: create, poll readiness, then exec.

| Component | Median | What it includes |
| --- | --- | --- |
| Fused command-ready path | 512ms | Auth, workspace admission, scheduling, server-side wait, first command, response |
| Previous standard path | 947ms | Create response, external readiness polling, separate exec request |
| Round-trip removed by fusion | ~434ms | Public polling plus second public exec call |
| Reported warm boot | 95ms | The actual VM boot slice reported by the fleet |

So the first optimization target was not raw boot. It was create/status/exec round trips. The run endpoint removes that waste while keeping the same durable sandbox lifecycle underneath.
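The difference between the two lanes can be pictured as a count of public requests. The sketch below is a toy model, not the MIOSA SDK: the `SandboxClient` interface, its method names, and the `CountingFake` class are all hypothetical, and the fake reports "running" on the second status poll purely for illustration.

```typescript
// Hypothetical client surface; method names are illustrative only.
interface SandboxClient {
  create(template: string): Promise<string>; // returns a sandbox id
  status(id: string): Promise<"pending" | "running">;
  exec(id: string, command: string): Promise<string>; // returns stdout
  run(template: string, command: string): Promise<string>;
}

// Standard lane: the caller owns the wait loop, so each step is a public request.
async function standardPath(c: SandboxClient, template: string, command: string): Promise<string> {
  const id = await c.create(template);
  while ((await c.status(id)) !== "running") {
    // A real client would sleep between polls.
  }
  return c.exec(id, command);
}

// Fused lane: one public request; the server waits and runs the first command.
async function fusedPath(c: SandboxClient, template: string, command: string): Promise<string> {
  return c.run(template, command);
}

// In-memory fake that counts public requests.
class CountingFake implements SandboxClient {
  requests = 0;
  private polls = 0;
  async create(_template: string): Promise<string> {
    this.requests++;
    return "sbx-1";
  }
  async status(_id: string): Promise<"pending" | "running"> {
    this.requests++;
    this.polls++;
    return this.polls >= 2 ? "running" : "pending";
  }
  async exec(_id: string, command: string): Promise<string> {
    this.requests++;
    return `ran: ${command}`;
  }
  async run(_template: string, command: string): Promise<string> {
    this.requests++;
    return `ran: ${command}`;
  }
}
```

Against this fake, `standardPath` issues four public requests (create, two polls, exec) and `fusedPath` issues one. The real saving depends on network distance and poll interval, which the model does not capture.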

Optimization path

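The benchmark exposes three separate public lanes: VM-ready, API-ready, and command-ready, with command-ready measured on both the standard and fused paths. MIOSA should publish all three, because each answers a different buyer question.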

| Lane | Current p50 | What it proves | Immediate target |
| --- | --- | --- | --- |
| VM-ready | 95ms | warm sandbox runtime is assigned and ready | <100ms sustained |
| Standard API-ready | 589ms | public API create has produced a running sandbox | <400ms |
| Standard command-ready | 947ms | first command succeeds through the legacy multi-request baseline | <800ms |
| Fused command-ready | 512ms | first command succeeds through one durable request | <500ms |

The engineering cuts are concrete:

| Cut | Expected impact | Why it works |
| --- | --- | --- |
| create_and_wait API/SDK path | 50-150ms | removes external GET polling and returns only when the server has committed running state |
| create_and_exec benchmark/API path | 434ms measured | removes external polling and the separate public exec POST |
| Admission-path caching | 100-180ms | avoids repeat template, policy, plan, credit-balance, and scheduler reads for hot API keys |
| Region-local control plane / benchmark routing | 200-350ms in far regions | avoids us-west control-plane round trips when the VM already boots locally |
| Warm-pool guardrails | tail reduction | keeps the 99/100 warm hit rate at 100/100 and removes the cold 5.322s outlier |

Fast without fragile

The fastest version of MIOSA should not skip billing, policy, cleanup, or durable placement. It should move those checks to the right boundary and avoid repeating them on every hot sandbox.

| Layer | Keep durable | Make faster |
| --- | --- | --- |
| Admission | API key, workspace, plan, policy, credits, idempotency all remain enforced | cache the effective policy/template/credit admission result for short TTLs per API key and workspace |
| Placement | persist sandbox row and node_id before exposing a route | allocate from an in-memory reservation ledger first, then write-through to Postgres/outbox |
| Boot | warm-pool claim remains the default, cold boot remains fallback | keep per-region warm pools ready for burst traffic |
| Readiness | only mark running after route registration and command health pass | push readiness over PubSub/SSE instead of external 50ms GET polling |
| First command | command still runs with auth and timeout | fuse create -> wait -> exec inside the selected host so the first command is not a second public API round trip |
| Cleanup | destroy, release reservation, stop billing, and revoke routes stay mandatory | make cleanup idempotent and janitor-backed so failed requests do not hold resources |
| Scale | each host keeps local runtime truth; control plane keeps durable truth | route creates to region-local controllers and replicate fleet state asynchronously |
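The admission row describes caching the effective admission result for a short TTL per API key and workspace. A minimal sketch of that idea follows. The `AdmissionCache` class, the `AdmissionResult` fields, and the injected clock are hypothetical and do not describe MIOSA's actual implementation.

```typescript
// Outcome of a full admission check; the fields are illustrative.
interface AdmissionResult {
  allowed: boolean;
  reason?: string;
}

interface CacheEntry {
  result: AdmissionResult;
  expiresAt: number; // epoch milliseconds
}

class AdmissionCache {
  private entries = new Map<string, CacheEntry>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns a fresh cached result if one exists; otherwise runs the full
  // check (policy, plan, credits, ...) and stores the outcome.
  check(apiKeyId: string, workspaceId: string, fullCheck: () => AdmissionResult): AdmissionResult {
    const key = `${apiKeyId}:${workspaceId}`;
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) return hit.result;
    const result = fullCheck();
    this.entries.set(key, { result, expiresAt: this.now() + this.ttlMs });
    return result;
  }

  // Drops an entry early, for example when a key is revoked or credits run out.
  invalidate(apiKeyId: string, workspaceId: string): void {
    this.entries.delete(`${apiKeyId}:${workspaceId}`);
  }
}
```

The TTL bounds how long a stale "allowed" decision can survive, and explicit invalidation covers events that should take effect immediately. Both are needed for the cache to stay compatible with the "keep durable" column.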

Target shape: the public API has the run endpoint today. The SDK wrapper should expose the same fast lane next:

await miosa.sandboxes.createAndWait({ template: "miosa-sandbox", region: "auto" })
await miosa.sandboxes.run({ template: "miosa-sandbox", command: "echo ok" })

createAndWait publishes API-ready latency. run publishes command-ready latency. Both use the same durable sandbox lifecycle underneath; the difference is that the server owns the wait loop and can execute the first command node-local.

Region split

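| Region | Samples | Success | TTI p50 | TTI p95 | TTI p99 |
| --- | --- | --- | --- | --- | --- |
| us-east | 33 | 100% | 328ms | 644ms | 684ms |
| us-mia | 34 | 100% | 506ms | 815ms | 859ms |
| us-west | 33 | 100% | 934ms | 1.257s | 1.331s |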

Command-ready leaderboard

The external provider values below are from the supplied ComputeSDK-style benchmark view. MIOSA is inserted by its measured p50 command-ready TTI from the production run path, not isolated as a vanity row. Lower is better.

| Rank | Provider | p50 TTI | Score / note |
| --- | --- | --- | --- |
| 1 | Declaw | 0.49s | score 94.9 |
| 2 | MIOSA | 0.512s | production run path |
| 3 | Northflank | 0.54s | score 94.4 |
| 4 | Daytona | 0.58s | score 74.3 |
| 5 | E2B | 0.64s | score 92.7 |
| 6 | Modal | 0.67s | score 92.8 |
| 7 | Vercel | 0.72s | score 90.7 |
| 8 | Archil | 0.75s | score 91.8 |
| 9 | Runloop | 0.81s | score 84.6 |
| 10 | Cloudflare | 1.84s | score 78.3 |
| 11 | Blaxel | 1.87s | score 80.1 |
| 12 | CodeSandbox | 7.32s | score 16.4 |
| 13 | Tensorlake | 15.22s | score 0.0 |
| 14 | Upstash | 17.01s | score 0.0 |
| - | Orgo | 0.27s (warm-pool boot) | 2.09s desktop-ready (measured) |
| - | AgentComputer | n/a | not yet timed |
| - | ascii Box | n/a | not yet timed |

Benchmark placements

The benchmark screenshots expose separate tabs for median, P95, P99, and composite score. MIOSA’s raw latency placement is measured. The composite score below is labeled as an estimate because the external benchmark app does not publish its exact scoring formula; the estimate is anchored against the supplied provider score table and should be treated as directional until MIOSA is added to their official dataset.

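Median. MIOSA rank: #2 · 0.512s p50 production run path · 100/100 success

| Rank | Provider | p50 TTI | Success |
| --- | --- | --- | --- |
| 1 | Declaw | 0.49s | 100/100 |
| 2 | MIOSA | 0.512s | 100/100 |
| 3 | Northflank | 0.54s | 100/100 |
| 4 | Daytona | 0.58s | 100/100 |
| 5 | E2B | 0.64s | 100/100 |
| 6 | Modal | 0.67s | 100/100 |
| 7 | Vercel | 0.72s | 100/100 |
| 8 | Archil | 0.75s | 100/100 |
| 9 | Runloop | 0.81s | 100/100 |
| 10 | Cloudflare | 1.84s | 100/100 |
| 11 | Blaxel | 1.87s | 100/100 |
| 12 | CodeSandbox | 7.32s | 100/100 |
| 13 | Tensorlake | 15.22s | 100/100 |
| 14 | Upstash | 17.01s | 100/100 |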
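
P95. MIOSA rank: #6 · 0.992s p95 production run path · tail is the next optimization target

| Rank | Provider | p95 TTI | Success |
| --- | --- | --- | --- |
| 1 | Declaw | 0.54s | 100/100 |
| 2 | Northflank | 0.59s | 100/100 |
| 3 | Modal | 0.78s | 100/100 |
| 4 | E2B | 0.83s | 100/100 |
| 5 | Archil | 0.90s | 100/100 |
| 6 | MIOSA | 0.992s | 100/100 |
| 7 | Vercel | 1.20s | 100/100 |
| 8 | Blaxel | 2.07s | 100/100 |
| 9 | Cloudflare | 2.62s | 100/100 |
| 10 | Runloop | 2.64s | 100/100 |
| 11 | Daytona | 5.52s | 100/100 |
| 12 | CodeSandbox | 9.90s | 100/100 |
| 13 | Tensorlake | 15.76s | 100/100 |
| 14 | Upstash | 23.71s | 100/100 |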
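
P99. MIOSA rank: #6 · 1.300s p99 production run path · west-region tail is visible here

| Rank | Provider | p99 TTI | Success |
| --- | --- | --- | --- |
| 1 | Declaw | 0.54s | 100/100 |
| 2 | Northflank | 0.61s | 100/100 |
| 3 | Modal | 0.79s | 100/100 |
| 4 | Archil | 0.94s | 100/100 |
| 5 | E2B | 0.94s | 100/100 |
| 6 | MIOSA | 1.300s | 100/100 |
| 7 | Vercel | 1.35s | 100/100 |
| 8 | Blaxel | 2.35s | 100/100 |
| 9 | Runloop | 2.64s | 100/100 |
| 10 | Cloudflare | 2.72s | 100/100 |
| 11 | Daytona | 5.58s | 100/100 |
| 12 | CodeSandbox | 10.54s | 100/100 |
| 13 | Tensorlake | 15.81s | 100/100 |
| 14 | Upstash | 23.98s | 100/100 |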

Composite score. MIOSA estimated score: ~92.4 · directional rank: #5, between E2B and Archil

| Rank | Provider | Composite score |
| --- | --- | --- |
| 1 | Declaw | 94.9 |
| 2 | Northflank | 94.4 |
| 3 | Modal | 92.8 |
| 4 | E2B | 92.7 |
| 5 | MIOSA | ~92.4 (estimated) |
| 6 | Archil | 91.8 |
| 7 | Vercel | 90.7 |
| 8 | Runloop | 84.6 |
| 9 | Blaxel | 80.1 |
| 10 | Cloudflare | 78.3 |
| 11 | Daytona | 74.3 |
| 12 | CodeSandbox | 16.4 |
| 13 | Tensorlake | 0.0 |
| 14 | Upstash | 0.0 |

Composite score is estimated because the external benchmark does not publish the exact formula. MIOSA's measured inputs are 0.512s median, 0.992s p95, 1.300s p99, and 100% success.

Detailed metrics

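| Provider | Score | Median TTI | P95 TTI | P99 TTI | Success |
| --- | --- | --- | --- | --- | --- |
| Declaw | 94.9 | 0.49s | 0.54s | 0.54s | 100% |
| MIOSA production path | ~92.4 est. | 0.512s | 0.992s | 1.300s | 100% |
| Northflank | 94.4 | 0.54s | 0.59s | 0.61s | 100% |
| Daytona | 74.3 | 0.58s | 5.52s | 5.58s | 100% |
| E2B | 92.7 | 0.64s | 0.83s | 0.94s | 100% |
| Modal | 92.8 | 0.67s | 0.78s | 0.79s | 100% |
| Vercel | 90.7 | 0.72s | 1.20s | 1.35s | 100% |
| Archil | 91.8 | 0.75s | 0.90s | 0.94s | 100% |
| Runloop | 84.6 | 0.81s | 2.64s | 2.64s | 100% |
| Legacy create / poll / exec baseline | n/a | 0.947s | 1.333s | 1.348s | 100% |
| Cloudflare | 78.3 | 1.84s | 2.62s | 2.72s | 100% |
| Blaxel | 80.1 | 1.87s | 2.07s | 2.35s | 100% |
| CodeSandbox | 16.4 | 7.32s | 9.90s | 10.54s | 100% |
| Tensorlake | 0.0 | 15.22s | 15.76s | 15.81s | 100% |
| Upstash | 0.0 | 17.01s | 23.71s | 23.98s | 100% |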
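The latency placements can be rechecked directly from the detailed metrics above. The sketch below copies the median, P95, and P99 values (in seconds) for the 14 ranked providers and counts how many are strictly faster than MIOSA on each metric. The legacy baseline row is left out because it is not a ranked provider. The names `LatencyRow` and `rankOf` are illustrative.

```typescript
interface LatencyRow {
  provider: string;
  p50: number;
  p95: number;
  p99: number;
}

// Values in seconds, copied from the detailed metrics table.
const latencyRows: LatencyRow[] = [
  { provider: "Declaw", p50: 0.49, p95: 0.54, p99: 0.54 },
  { provider: "MIOSA", p50: 0.512, p95: 0.992, p99: 1.3 },
  { provider: "Northflank", p50: 0.54, p95: 0.59, p99: 0.61 },
  { provider: "Daytona", p50: 0.58, p95: 5.52, p99: 5.58 },
  { provider: "E2B", p50: 0.64, p95: 0.83, p99: 0.94 },
  { provider: "Modal", p50: 0.67, p95: 0.78, p99: 0.79 },
  { provider: "Vercel", p50: 0.72, p95: 1.2, p99: 1.35 },
  { provider: "Archil", p50: 0.75, p95: 0.9, p99: 0.94 },
  { provider: "Runloop", p50: 0.81, p95: 2.64, p99: 2.64 },
  { provider: "Cloudflare", p50: 1.84, p95: 2.62, p99: 2.72 },
  { provider: "Blaxel", p50: 1.87, p95: 2.07, p99: 2.35 },
  { provider: "CodeSandbox", p50: 7.32, p95: 9.9, p99: 10.54 },
  { provider: "Tensorlake", p50: 15.22, p95: 15.76, p99: 15.81 },
  { provider: "Upstash", p50: 17.01, p95: 23.71, p99: 23.98 },
];

// 1-based placement: one plus the number of providers with a strictly lower value.
function rankOf(provider: string, metric: "p50" | "p95" | "p99"): number {
  const target = latencyRows.find((r) => r.provider === provider);
  if (target === undefined) throw new Error(`unknown provider: ${provider}`);
  return 1 + latencyRows.filter((r) => r[metric] < target[metric]).length;
}

console.log(rankOf("MIOSA", "p50"), rankOf("MIOSA", "p95"), rankOf("MIOSA", "p99")); // prints "2 6 6"
```

This reproduces the #2 median and #6 tail placements. It says nothing about the composite score, whose formula is not published.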

Capability matrix

Speed is only one axis. MIOSA’s product surface is broader than “spawn a headless sandbox and exec a command.”

| Axis | MIOSA | Note |
| --- | --- | --- |
| Fast sandbox lane | #2 | median TTI placement |
| Tail latency | #6 | P95/P99 placement |
| Platform breadth | 11/13 | capability rows covered |
| GPU options | H100 | available by plan |

Runtime lane

Headless sandbox, files, previews, snapshots, and first command execution.

MIOSA competes directly here.

Platform lane

Desktop VM, deploy/release plane, managed data, white-label embedding, BYOC.

This is the differentiation layer.

Enterprise lane

Compliance posture, GPU/H100 options, and mature enterprise procurement story.

This is where incumbents still have air.

Full provider matrix

| Capability | MIOSA | Declaw | Northflank | Modal | E2B | Vercel | Daytona | CodeSandbox | Upstash |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Headless sandbox create/exec | YES | YES | YES | YES | YES | YES | YES | YES | YES |
| Filesystem API | YES | YES | YES | PARTIAL | YES | YES | YES | YES | YES |
| Port preview URLs | YES | YES | YES | PARTIAL | YES | YES | YES | YES | PARTIAL |
| Snapshot/fork/resume | YES | YES | PARTIAL | PARTIAL | YES | YES | YES | YES | PARTIAL |
| Full desktop/browser VM | YES | NO | NO | NO | PARTIAL | NO | PARTIAL | PARTIAL | NO |
| Managed deploy/release plane | YES | NO | YES | PARTIAL | NO | YES | NO | NO | NO |
| Managed Postgres/Redis/storage | YES | NO | YES | NO | NO | PARTIAL | NO | NO | Redis/Vector |
| White-label tenant embedding | YES | NO | PARTIAL | NO | NO | NO | NO | NO | NO |
| BYOC / customer-owned fleet | YES | NO | NO | NO | NO | NO | PARTIAL | NO | NO |
| MCP/agent tool surface | YES | NO | NO | NO | PARTIAL | NO | PARTIAL | NO | PARTIAL |
| Multi-language SDKs | 5 | TS/Py | TS | Py | TS/Py | TS/Py | TS/Py/Go/Java/Ruby | TS | TS |
| GPU story | H100 | NO | NO | YES | NO | NO | NO | NO | NO |
| Compliance public posture | Compliant | Limited | Enterprise | Enterprise | SOC2 | SOC2 | Enterprise | Enterprise | Enterprise |

What this means

  • If a buyer only cares about raw median headless sandbox TTI, the category is tight.
  • If a buyer needs desktops, browser automation, white-label embedding, deploys, data, and BYOC in the same platform, MIOSA is no longer competing on a one-column sandbox table.
  • If a buyer needs GPU today, MIOSA can support GPU/H100 options while keeping the same platform surface.

Provider coverage

This page tracks the providers shown in the benchmark screenshots plus the providers exposed by ComputeSDK’s current provider list. Some vendors are broader platforms, some are narrow sandbox APIs, and some expose sandboxes as one feature in a larger developer-cloud product.

| Provider | Category | Strongest public angle | MIOSA comparison note |
| --- | --- | --- | --- |
| MIOSA | Sandbox + desktop + deploy + data platform | Full lifecycle platform for agents and white-label SaaS | Broader platform surface than a headless sandbox-only provider |
| Declaw | Security-oriented sandbox | Fast TTI plus policy/security positioning | Strong security story; no public desktop/deploy/data plane equivalent |
| Northflank | Developer cloud with sandboxes | Persistent app/runtime platform plus sandbox execution | Strong deploy platform; less agent/desktop-specific |
| Modal | Serverless compute/GPU | GPU and Python-function workflow | Strong GPU story; sandbox is not a white-label desktop platform |
| E2B | AI code execution sandbox | Mature AI-agent sandbox API | Strong headless agent sandbox; limited platform breadth |
| Archil | Sandbox/storage-oriented provider | Fast benchmark row and storage-first positioning | Less public breadth than MIOSA’s computers/deploy/data surface |
| Vercel | Frontend platform plus sandbox | Distribution, OIDC, polished DX | Strong existing-account funnel; sandbox is headless and region-limited in public docs |
| Runloop | Devbox/sandbox provider | Long-lived devboxes and snapshots | Strong devbox framing; no comparable white-label/data plane |
| Blaxel | Agent platform and sandbox | Agent hosting, batch jobs, sandbox console | Strong agent platform/compliance posture; narrower managed data/deploy surface |
| Cloudflare | Edge platform sandbox | Edge distribution and developer ecosystem | Strong edge ecosystem; sandbox is one product within Cloudflare |
| Daytona | OSS/open sandbox platform | Open-source breadth and fast code-to-exec positioning | Strong OSS story; MIOSA adds managed data, deploys, desktops, white-label |
| CodeSandbox | Cloud dev environment | Browser IDE, previews, devbox UX | Strong interactive IDE; weaker benchmark row in supplied data |
| Tensorlake | AI-native sandbox | AI/RL tooling and sandbox filesystem benchmarks | Strong AI-lab framing; no public desktop/deploy/data platform equivalent |
| Upstash | Serverless data plus Box | Redis/Vector/QStash adjacency and built-in agent tooling | Strong data brand; Box is newer and JS/TS-centered |
| HopX | Cloud sandbox API | Multi-language code execution and desktop automation docs | Lower benchmark visibility; overlaps sandbox APIs more than platform plane |
| Namespace | Build/devbox platform | Builders, devboxes, macOS/CI style workloads | Strong CI/build niche; different buyer motion from MIOSA agent platform |

Benchmark notes

The published MIOSA result is the clean 100/100 production lifecycle run after deploying POST /api/v1/sandboxes/run on 2026-06-09 UTC. The older standard path is also shown so the optimization is auditable instead of hidden. Setup failures from under-scoped keys or insufficient plan concurrency are not counted as fleet performance.

How to reproduce

Use a workspace and API key with enough sandbox concurrency for the test shape:

export MIOSA_API_KEY="msk_..."
export API_URL="https://api.miosa.ai"
export BENCH_WORKSPACE_ID="your-workspace-uuid"

./scripts/bench-continuous.sh \
  --samples 100 \
  --concurrent 10 \
  --pace 0.5 \
  --template miosa-sandbox \
  --size xs \
  --mode run \
  --output bench-results/MIOSA-100-sandbox.tsv \
  --html bench-results/MIOSA-100-sandbox.html

Use --mode standard to reproduce the legacy create / poll / exec baseline. Use --mode run to reproduce the production command-ready lane. The benchmark deletes successful samples unless --keep is passed.

Sources and current research

  • ComputeSDK introduction for the provider set and abstraction shape.
  • Daytona docs for Daytona’s current sandbox positioning and SDK breadth.
  • Tensorlake homepage for Tensorlake’s published sandbox/filesystem benchmark positioning.
  • HopX docs for HopX sandbox/code-execution capabilities.
  • Blaxel docs for Blaxel sandbox/agent platform positioning.
  • Northflank sandboxes docs for Northflank sandbox behavior.
  • Internal competitive notes under docs/audits/providers/ for the first-pass capability matrix.
