What is swarmtest?

swarmtest is a headless game testing framework that spawns swarms of AI-driven agents to stress-test multiplayer game servers over WebSocket connections. It uses Claude to generate behavior trees that simulate realistic player behavior, and it automatically detects crashes, desyncs, latency spikes, protocol errors, invariant violations, and message rate anomalies.

Why swarmtest?

Multiplayer game servers are difficult to test manually. Bugs often emerge only under load, when many concurrent players perform unpredictable actions. swarmtest automates this kind of testing by:

  • Spawning dozens of concurrent agents that connect to your game server via WebSocket
  • Generating diverse test scenarios by having Claude create behavior trees from natural-language descriptions
  • Detecting real issues automatically through six built-in detectors covering crashes, desyncs, latency, protocol errors, invariant violations, and message rate anomalies
  • Recording and replaying behavior trees that triggered bugs for regression testing
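
To make the detector idea concrete, here is a minimal sketch of what a latency detector could look like. It is not swarmtest's actual implementation: the names `Finding`, `LatencySpikeDetector`, and `observe`, and the "multiple of the recent median" rule, are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import Dict, List, Optional


@dataclass
class Finding:
    """Hypothetical record of one detected issue."""
    detector: str
    agent_id: str
    detail: str


@dataclass
class LatencySpikeDetector:
    """Hypothetical detector: flags round trips far above an agent's recent median."""
    window: int = 20       # number of recent samples kept per agent
    factor: float = 5.0    # spike threshold as a multiple of the median
    min_samples: int = 5   # do not judge until a baseline exists
    _samples: Dict[str, List[float]] = field(default_factory=dict)

    def observe(self, agent_id: str, rtt_ms: float) -> Optional[Finding]:
        history = self._samples.setdefault(agent_id, [])
        finding = None
        if len(history) >= self.min_samples:
            baseline = median(history)
            if rtt_ms > baseline * self.factor:
                detail = f"rtt {rtt_ms:.0f} ms vs median {baseline:.0f} ms"
                finding = Finding("latency", agent_id, detail)
        history.append(rtt_ms)
        del history[:-self.window]  # keep only the most recent samples
        return finding
```

Comparing against a per-agent median rather than a fixed threshold keeps the sketch usable for both fast local servers and slower remote ones, although a real detector may well use a different rule.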

Key Concepts

  • Behavior trees – JSON-based test logic, generated by Claude, that drives agent behavior (move, fight, trade, etc.)
  • Game adapters – A pluggable adapter pattern lets swarmtest work with any game that has a WebSocket protocol. Built-in adapters exist for Tipo and PlayerTwo.
  • Detectors – Pluggable modules that monitor agent state and message traffic for anomalies
  • Tree library – Behavior trees that trigger bugs are saved to disk and replayed in future runs as regression tests
  • Headless – No display server needed. Runs in terminals, CI pipelines, and automated environments.
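
As an illustration of how a JSON behavior tree and a game adapter might fit together, the sketch below evaluates a small tree against a stand-in adapter. The node schema (`sequence`, `selector`, `action`), the `perform` method, and the `RecordingAdapter` class are assumptions made for this example; swarmtest's real tree format and adapter interface may differ.

```python
import json

# Hypothetical tree: move, then try to fight and fall back to trading.
TREE_JSON = """
{
  "type": "sequence",
  "children": [
    {"type": "action", "name": "move", "args": {"dx": 1, "dy": 0}},
    {"type": "selector", "children": [
      {"type": "action", "name": "fight", "args": {"target": "nearest"}},
      {"type": "action", "name": "trade", "args": {"item": "potion"}}
    ]}
  ]
}
"""


class RecordingAdapter:
    """Stand-in for a game adapter: records actions instead of sending WebSocket messages."""

    def __init__(self, failing=()):
        self.failing = set(failing)  # action names that should report failure
        self.sent = []

    def perform(self, name, args):
        self.sent.append((name, args))
        return name not in self.failing


def tick(node, adapter):
    """Evaluate one node and return True on success, False on failure."""
    kind = node["type"]
    if kind == "action":
        return adapter.perform(node["name"], node.get("args", {}))
    if kind == "sequence":
        # Runs children in order and stops at the first failure.
        return all(tick(child, adapter) for child in node["children"])
    if kind == "selector":
        # Runs children in order and stops at the first success.
        return any(tick(child, adapter) for child in node["children"])
    raise ValueError(f"unknown node type: {kind}")


if __name__ == "__main__":
    adapter = RecordingAdapter(failing={"fight"})
    print(tick(json.loads(TREE_JSON), adapter), [name for name, _ in adapter.sent])
```

Because the tree only names actions and the adapter decides how to carry them out, the same tree could in principle be replayed against a different game by swapping the adapter.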

How It Works

  1. You point swarmtest at your game server’s WebSocket URL and choose a game adapter
  2. swarmtest spawns N agents with a mix of regression trees, LLM-generated trees, and handwritten trees
  3. Each agent connects, authenticates, and begins executing its behavior tree on a configurable tick interval
  4. Detectors monitor all agents for crashes, desyncs, latency spikes, protocol errors, invariant violations, and message rate anomalies
  5. After the test duration, swarmtest outputs a report summarizing all findings
  6. Trees that triggered meaningful issues are saved to the tree library for future regression runs
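
Steps 2 and 6 can be sketched as follows: regression trees are handed out first, the remaining agents draw from the other trees, and any tree that triggered an issue is written to a library directory. The function names, the "regression first" policy, and the hash-based file naming are assumptions for illustration, not swarmtest's documented behavior.

```python
import hashlib
import json
import random
from pathlib import Path


def assign_trees(n_agents, regression, generated, handwritten, seed=0):
    """Give each regression tree to one agent first, then fill the rest at random."""
    regression = list(regression)
    pool = list(generated) + list(handwritten) or regression
    if not pool:
        raise ValueError("no behavior trees available")
    rng = random.Random(seed)  # seeded so a run can be reproduced
    assigned = regression[:n_agents]
    while len(assigned) < n_agents:
        assigned.append(rng.choice(pool))
    return assigned


def save_tree(library_dir, tree):
    """Store a tree under a content hash so saving the same tree twice yields one file."""
    library = Path(library_dir)
    library.mkdir(parents=True, exist_ok=True)
    canonical = json.dumps(tree, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    path = library / f"{digest}.json"
    path.write_text(canonical, encoding="utf-8")
    return path


def load_library(library_dir):
    """Load every saved tree for use as a regression test in a later run."""
    paths = sorted(Path(library_dir).glob("*.json"))
    return [json.loads(p.read_text(encoding="utf-8")) for p in paths]
```

A later run could then pass `load_library(...)` as the `regression` argument of `assign_trees`, so trees that exposed a bug once are exercised again.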