Varpulis Agent Runtime, an open-source library for real-time behavioral monitoring of AI agents, was published to GitHub in early 2026 under an Apache-2.0 license. The project is installable via npm or pip and targets a specific gap that the documentation describes directly: existing tools either analyze traces after execution ends or validate individual inputs; neither catches behavioral failures as they unfold across sequences of agent actions. Author details were not available at time of publication.
- The runtime detects six documented failure modes in AI agents — including retry storms, circular reasoning, and budget runaway — using a Complex Event Processing engine built on NFA-based pattern matching and Zero-suppressed Decision Diagrams.
- It runs in-process via WebAssembly (JavaScript) or as a native Python extension, with no external infrastructure required and sub-millisecond detection latency claimed by the project documentation.
- Compatible with LangChain, CrewAI, OpenAI Agents SDK, Anthropic’s Model Context Protocol (MCP), and any OpenTelemetry-instrumented agent framework.
- The repository had 38 commits and 2 stars at the time of writing.
What Happened
Varpulis released an open-source runtime library designed to monitor AI agent behavior and fire callbacks when specific failure patterns emerge during execution. The project is publicly available at github.com/varpulis/varpulis-agent-runtime and supports JavaScript and Python environments through separate distribution packages. The release positions itself against a class of agent failures that only manifest over time — not within a single tool call or model response, but across sequences of agent actions.
Why It Matters
The project documentation frames the problem precisely: “Observability tools (LangSmith, Braintrust) analyze traces post-hoc. Static guardrails (Guardrails AI, NeMo) validate individual inputs/outputs. Varpulis detects behavioral patterns as they unfold.” This distinction is operationally significant: a model that alternates between two tools indefinitely, or progressively escalates token consumption, will pass per-call validation while still producing a failed or costly run.
The documentation further states that these failure modes are temporal patterns that “only become visible when you look at sequences of events over time” — a class of problem that neither post-hoc log analysis nor per-step input validation is designed to address.
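To make that distinction concrete, here is a minimal, framework-agnostic sketch of a check that per-call validation cannot express. This is not the Varpulis API; the function and variable names are illustrative. Every call in the sequence below is individually valid, yet the run as a whole is a circular-reasoning failure.

```python
# Illustrative only: a hand-rolled sequence check, not the Varpulis API.
def is_circular(history: list[str], min_cycles: int = 3) -> bool:
    """Flag strict A-B-A-B alternation between two tools.

    Each individual call would pass per-call validation; the failure
    is only visible when you inspect the sequence of calls over time.
    """
    if len(history) < 2 * min_cycles:
        return False
    tail = history[-2 * min_cycles:]
    a, b = tail[0], tail[1]
    if a == b:
        return False
    return tail == [a, b] * min_cycles

calls = ["search", "summarize"] * 4   # every call valid in isolation
per_call_ok = all(c in {"search", "summarize"} for c in calls)
# per_call_ok is True, yet is_circular(calls) flags the run
```

The point of the sketch is that the detector's input is the event history, not any single event, which is precisely the gap the documentation attributes to per-call guardrails.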
Technical Details
The runtime is powered by the Varpulis CEP (Complex Event Processing) engine, which uses NFA-based pattern matching with Kleene closure and Zero-suppressed Decision Diagrams (ZDDs) for what the documentation describes as “efficient combinatorial matching.” ZDDs are a compact representation of sparse Boolean functions, well-suited to enumerating complex event combinations without the state explosion that plagues naive automaton approaches.
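The core mechanism can be illustrated with a small, generic simulation. The sketch below is not the Varpulis engine's internals; it shows the standard set-of-states NFA technique with Kleene closure, where a starred pattern position may repeat or be skipped, applied to a sequence of event types.

```python
# Generic illustration of NFA-style matching with Kleene closure over
# an event-type sequence; not the Varpulis engine's actual internals.
def matches(pattern, events):
    """Anchored match of `events` against `pattern`.

    pattern: list of (symbol, starred) pairs; starred=True means the
    symbol may occur zero or more times (Kleene closure). The NFA is
    simulated as a set of active states (indices into the pattern).
    """
    def closure(states):
        # A starred position may be skipped without consuming an event.
        out = set(states)
        frontier = list(states)
        while frontier:
            s = frontier.pop()
            if s < len(pattern) and pattern[s][1] and s + 1 not in out:
                out.add(s + 1)
                frontier.append(s + 1)
        return out

    states = closure({0})
    for ev in events:
        nxt = set()
        for s in states:
            if s < len(pattern) and pattern[s][0] == ev:
                # Starred symbols loop in place; others advance.
                nxt.add(s if pattern[s][1] else s + 1)
        states = closure(nxt)
    return len(pattern) in states

# An error-spiral-like shape: an error, any number of retries, an error.
error_spiral = [("error", False), ("retry", True), ("error", False)]
```

Tracking a set of states rather than a single state is what keeps matching linear in the event stream; the ZDD layer the documentation mentions addresses the separate problem of enumerating combinations of partial matches compactly.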
Six failure modes are documented. Retry storms detect repeated identical tool calls; circular reasoning flags alternation between tools without forward progress; budget runaway triggers when cumulative token spend exceeds a configured threshold; error spirals catch repeated tool errors followed by reformulation cycles; stuck agents fire when a model produces excessive internal reasoning without output; and token velocity spikes detect abnormally rapid token consumption rates. Developers configure per-pattern thresholds — minimum repetition counts, maximum cost ceilings, time windows — and receive event-driven callbacks when a pattern fires, allowing custom intervention logic without changes to the agent orchestration layer.
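The repository excerpts quoted here do not include the actual API surface, so the following is a hypothetical sketch of the shape the documentation describes: a configured threshold plus an event-driven callback, sitting alongside the agent loop rather than inside it. All class, method, and field names are assumptions.

```python
# Hypothetical sketch of threshold-plus-callback monitoring; the class
# and parameter names are illustrative, not the published Varpulis API.
class BudgetMonitor:
    def __init__(self, max_tokens: int, on_fire):
        self.max_tokens = max_tokens   # configured cost ceiling
        self.spent = 0                 # cumulative token spend
        self.on_fire = on_fire         # event-driven callback
        self.fired = False

    def record(self, event: dict) -> None:
        """Feed one agent event; fire once when spend crosses the ceiling."""
        self.spent += event.get("tokens", 0)
        if not self.fired and self.spent > self.max_tokens:
            self.fired = True
            self.on_fire({"pattern": "budget_runaway", "spent": self.spent})

alerts = []
mon = BudgetMonitor(max_tokens=10_000, on_fire=alerts.append)
for ev in [{"tokens": 4_000}, {"tokens": 5_000}, {"tokens": 3_000}]:
    mon.record(ev)
# alerts now holds one budget_runaway event (12_000 > 10_000)
```

The orchestration loop only has to emit events; the intervention logic lives entirely in the callback, which is the integration property the documentation emphasizes.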
For JavaScript and TypeScript, the package installs with npm install @varpulis/agent-runtime and runs in-process via WebAssembly, with a claimed bundle size of approximately 1 MB. The Python version installs with pip install varpulis-agent-runtime and runs as a native extension. Both variants are described in the documentation as operating at sub-millisecond detection latency with no external infrastructure dependencies; these figures are project claims and have not been independently benchmarked at time of publication.
Who’s Affected
The library is aimed at developers building production agents on LangChain, CrewAI, OpenAI’s Agents SDK, Anthropic’s MCP, or any framework that surfaces OpenTelemetry traces. It is most directly relevant to teams running long-horizon or multi-step agents where unchecked failure loops — particularly budget runaway or error spirals — could produce significant API costs or cascading downstream errors before any human reviewer intervenes.
Because intervention logic is delivered through event-driven callbacks, teams can embed monitoring directly into existing application code without restructuring agent orchestration pipelines. This lowers the integration overhead for teams already using compatible frameworks.
What’s Next
At time of publication the repository had 38 commits, 2 stars, and 0 forks, indicating the project is early-stage. No production adoption data, independent performance benchmarks, or public release roadmap were available in the repository documentation.
An interactive playground that runs entirely in the browser via WebAssembly is accessible through the repository, allowing developers to evaluate pattern detection behavior without a local installation. No versioning schedule or planned framework integrations beyond those already documented were publicly disclosed.