Agent sandbox benchmarks #5
Different sandbox implementations have very different performance characteristics. It would be useful to have a standard benchmark suite that measures:
That way, agents can make informed choices about where to run.
From a security perspective, sandbox benchmarks should also measure isolation quality, not just performance:
Performance matters, but a fast sandbox that leaks credentials is worse than a slow one that doesn't.
For skill execution specifically, the threat model is: untrusted code from the skill ecosystem running with agent privileges. The sandbox needs to prevent:
A benchmark suite could include attack scenarios ("Can this sandbox prevent X?") alongside performance tests. That would help agents choose sandboxes based on their actual threat model, not just speed.
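As a rough illustration of pairing a performance test with an attack scenario, here is a minimal Python sketch. Everything in it is an assumption made for illustration: a "sandbox" is modeled as a plain callable that runs a command, the two runners (`passthrough_runner`, `scrubbed_env_runner`) are hypothetical stand-ins rather than real sandbox implementations, and `BENCH_FAKE_SECRET` is a made-up variable standing in for a credential. A real suite would plug actual sandboxes in behind the same interface and cover many more scenarios.

```python
import os
import subprocess
import sys
import time
from typing import Callable, Dict, List

# Assumption: a "sandbox" is any callable that runs a command and returns the result.
Runner = Callable[[List[str]], subprocess.CompletedProcess]


def passthrough_runner(cmd: List[str]) -> subprocess.CompletedProcess:
    """Hypothetical baseline: no isolation, child inherits the parent's environment."""
    return subprocess.run(cmd, capture_output=True, text=True, timeout=30)


def scrubbed_env_runner(cmd: List[str]) -> subprocess.CompletedProcess:
    """Hypothetical runner that passes only a minimal environment to the child."""
    env = {"PATH": os.environ.get("PATH", "")}
    if "SYSTEMROOT" in os.environ:  # needed to start Python on Windows
        env["SYSTEMROOT"] = os.environ["SYSTEMROOT"]
    return subprocess.run(cmd, capture_output=True, text=True, timeout=30, env=env)


def measure_startup(runner: Runner, repeats: int = 5) -> float:
    """Performance test: median wall-clock seconds to run a trivial command."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        runner([sys.executable, "-c", "pass"])
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


def leaks_env_secret(runner: Runner) -> bool:
    """Attack scenario: can code inside the sandbox read a secret from the parent env?"""
    marker = "bench-secret-123"  # fake credential, used only for this probe
    os.environ["BENCH_FAKE_SECRET"] = marker
    try:
        probe = "import os; print(os.environ.get('BENCH_FAKE_SECRET', ''))"
        result = runner([sys.executable, "-c", probe])
        return marker in result.stdout
    finally:
        del os.environ["BENCH_FAKE_SECRET"]


def run_suite(runners: Dict[str, Runner]) -> Dict[str, dict]:
    """Run every test against every runner and collect one report row per runner."""
    report = {}
    for name, runner in runners.items():
        report[name] = {
            "startup_s": measure_startup(runner),
            "leaks_env_secret": leaks_env_secret(runner),
        }
    return report


if __name__ == "__main__":
    results = run_suite({
        "passthrough": passthrough_runner,
        "scrubbed_env": scrubbed_env_runner,
    })
    for name, row in results.items():
        print(name, row)
```

Reporting the isolation result next to the timing in the same row is the point of the sketch: a runner that starts quickly but returns `leaks_env_secret: True` is visibly a worse choice for untrusted skill code than a slower one that does not.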