Template Worker Architecture
Architecture details for the template-worker service including VM distribution, lifecycle, and runtime management.
This document covers the internal architecture of the template-worker service, including VM distribution strategies, lifecycle management, and runtime details.
VM Distribution Strategies
A VM distribution strategy defines how guild VMs should be distributed across threads or processes. The template-worker supports multiple strategies:
Thread Pool (Current Default)
- Multiple guilds share threads from a pool
- More efficient resource utilization
- Better scalability for large deployments
- Internally used by the legacy "thread per guild" strategy
Thread Per Guild (Legacy)
- Each guild was originally given its own dedicated thread
- Now internally uses thread pool
- Maintained for compatibility
Process Pooling (Future)
- VMs run in separate processes for maximum isolation
- Requires additional work in the sandwich daemon layer
- Provides the highest level of security and isolation
Note
When a thread panics under either strategy, a per-thread panic hook fires and cleans up all current guild templating handles, so that future dispatches to an affected guild start a fresh guild VM on a different (non-panicking) thread.
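To make the idea of a distribution strategy concrete, here is a minimal sketch of a pool that assigns guilds to a fixed set of workers. The DistributionStrategy trait, the ThreadPool struct, and the least-loaded placement policy are all assumptions made for illustration; the actual template-worker interface is not shown in this document and may differ.

```rust
use std::collections::HashMap;

/// Hypothetical interface: pick the worker that hosts a guild's VM.
trait DistributionStrategy {
    fn worker_for(&mut self, guild_id: u64) -> usize;
}

/// Pool strategy: many guilds share a fixed number of workers.
struct ThreadPool {
    load: Vec<usize>,              // guilds currently assigned to each worker
    assigned: HashMap<u64, usize>, // guild ID -> worker index
}

impl ThreadPool {
    fn new(workers: usize) -> Self {
        assert!(workers > 0, "pool needs at least one worker");
        ThreadPool { load: vec![0; workers], assigned: HashMap::new() }
    }
}

impl DistributionStrategy for ThreadPool {
    fn worker_for(&mut self, guild_id: u64) -> usize {
        if let Some(&worker) = self.assigned.get(&guild_id) {
            return worker; // a guild keeps its VM on the same worker
        }
        // Otherwise place the guild on the least-loaded worker (assumed policy).
        let worker = (0..self.load.len())
            .min_by_key(|&i| self.load[i])
            .unwrap();
        self.load[worker] += 1;
        self.assigned.insert(guild_id, worker);
        worker
    }
}
```

Keeping the guild-to-worker mapping sticky matches the lifecycle described in this document, where a guild's VM persists across template executions.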
VM Lifecycle
Each guild is assigned a dedicated Lua(u) VM that:
- Persists until explicitly destroyed or marked as broken
- Survives multiple template executions within the same guild
- Is cleaned up automatically on:
  - Panic or internal errors
  - Memory limit exceeded
  - Infrastructure requirements
  - Explicit destruction requests
VM State Management
- VMs use WeakLua references to prevent memory leaks
- Lua VM handles should be held for as short a time as possible
- Never clone the Lua VM handle directly (considered a logic bug)
Important
Each Lua VM is owned by a single struct called KhronosRuntime (provided by the khronos runtime). Attempting to clone the Lua VM handle is considered a logic bug. Instead, use a WeakLua (a weak reference to the Lua VM handle) and hold the resulting Lua VM handle for as little time as possible in all uses.
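The ownership rule above can be illustrated with standard-library weak references. This is only a sketch of the pattern: FakeVm, RuntimeOwner, and run_once are hypothetical stand-ins, and std::rc::Weak is used in place of WeakLua so the example has no external dependencies.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical stand-in for a Lua VM.
struct FakeVm {
    executions: RefCell<u32>,
}

/// Hypothetical owner, playing the role the text assigns to KhronosRuntime.
struct RuntimeOwner {
    vm: Rc<FakeVm>, // the only long-lived strong handle
}

impl RuntimeOwner {
    fn new() -> Self {
        RuntimeOwner { vm: Rc::new(FakeVm { executions: RefCell::new(0) }) }
    }

    /// Hand out a weak reference instead of cloning the strong handle.
    fn weak(&self) -> Weak<FakeVm> {
        Rc::downgrade(&self.vm)
    }
}

/// Upgrade only for the duration of one operation.
fn run_once(weak: &Weak<FakeVm>) -> Option<u32> {
    let vm = weak.upgrade()?; // None once the owner has dropped the VM
    *vm.executions.borrow_mut() += 1;
    let count = *vm.executions.borrow();
    Some(count)
} // the temporary strong handle `vm` is dropped here
```

Because callers only ever hold a weak reference between operations, dropping the owner is enough to destroy the VM; no stray clone can keep it alive.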
Khronos Runtime
The template-worker uses the khronos runtime to manage Lua VMs:
- Each Lua VM is owned by a single KhronosRuntime struct
- Provides VM lifecycle management
- Handles VM state and isolation
- Manages resource limits
Runtime Best Practices
- Never clone the Lua VM handle directly
- Use WeakLua (weak reference) for temporary access
- Hold Lua VM handles for minimal time
- Clean up references promptly
Thread Entry Implementation
The ThreadEntry structure provides a generic abstraction for VM distribution strategies.
Implementation Details
A VM distribution strategy is simple in practice. First, do any setup needed (initialize guild state, the Tokio runtime, and the khronos runtime). Then run a loop listening on a request channel for LuaVmAction messages. When an action is received, along with its (optional) associated callback channel, call the handle method of LuaVmAction, which handles the execution layer. To simplify this, the ThreadEntry structure provides a generic implementation of all of this work that any VM distribution strategy can use.
Setup Phase
- Initialize guild state
- Set up Tokio runtime
- Initialize Khronos runtime
Main Loop
- Listen on request channel for LuaVmAction messages
- Handle actions with optional callback channels
- Call handle method of LuaVmAction for execution
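The setup-then-loop shape described above might look roughly like the following. This is a sketch under stated assumptions: the LuaVmAction variants, the handle signature, and spawn_worker are invented for illustration, and standard-library threads and channels stand in for the Tokio-based machinery the service actually uses.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical action type; the real LuaVmAction has different variants.
enum LuaVmAction {
    Eval(String),
    Stop,
}

impl LuaVmAction {
    /// Stand-in for the execution layer: it only reports the script length.
    fn handle(&self) -> Result<usize, String> {
        match self {
            LuaVmAction::Eval(src) if src.is_empty() => Err("empty script".to_string()),
            LuaVmAction::Eval(src) => Ok(src.len()),
            LuaVmAction::Stop => Ok(0),
        }
    }
}

type Callback = Sender<Result<usize, String>>;
type Request = (LuaVmAction, Option<Callback>);

/// Spawn a worker: run setup, then loop on the request channel.
fn spawn_worker() -> (Sender<Request>, thread::JoinHandle<()>) {
    let (tx, rx): (Sender<Request>, Receiver<Request>) = channel();
    let handle = thread::spawn(move || {
        // Setup phase (guild state, Tokio runtime, khronos runtime) would go here.
        for (action, callback) in rx {
            let result = action.handle();
            if let Some(cb) = callback {
                let _ = cb.send(result); // the caller may have stopped waiting
            }
            if matches!(action, LuaVmAction::Stop) {
                break;
            }
        }
    });
    (tx, handle)
}
```

Making the callback channel optional lets callers fire an action without waiting for a result, which is why a failed send on the callback is deliberately ignored here.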
Error Handling
- Per-thread panic hooks clean up guild templating handles
- Future dispatches to a guild start fresh VMs on non-panicking threads
- Automatic recovery from panics
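One way to get per-thread cleanup on panic in Rust is a guard value whose Drop implementation runs during unwinding. The sketch below uses that technique rather than a literal panic hook; the Handles registry, CleanupGuard, and run_worker are hypothetical names, and the real service may implement its hook differently.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical registry mapping guild IDs to the worker hosting their VM.
type Handles = Arc<Mutex<HashMap<u64, usize>>>;

/// Guard owned by a worker thread; its Drop also runs during unwinding.
struct CleanupGuard {
    worker_id: usize,
    handles: Handles,
}

impl Drop for CleanupGuard {
    fn drop(&mut self) {
        if thread::panicking() {
            let id = self.worker_id;
            // Remove every guild handle that pointed at this worker, so the
            // next dispatch for those guilds has to create a fresh VM.
            let mut map = self.handles.lock().unwrap_or_else(|e| e.into_inner());
            map.retain(|_, worker| *worker != id);
        }
    }
}

/// Run a worker to completion and report whether it panicked.
fn run_worker(worker_id: usize, handles: Handles, should_panic: bool) -> thread::Result<()> {
    thread::spawn(move || {
        let _guard = CleanupGuard { worker_id, handles };
        if should_panic {
            panic!("simulated VM failure");
        }
    })
    .join()
}
```

Handles that belong to other workers are left untouched, so a panic on one thread does not disturb guilds hosted elsewhere.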
Resource Limits
The service enforces global limits on all templates:
pub const MAX_TEMPLATE_MEMORY_USAGE: usize = 1024 * 1024 * 20; // 20MB maximum memory
pub const MAX_VM_THREAD_STACK_SIZE: usize = 1024 * 1024 * 20; // 20MB maximum stack
pub const MAX_TEMPLATES_EXECUTION_TIME: std::time::Duration =
    std::time::Duration::from_secs(60 * 10); // 10 minute maximum execution time
pub const MAX_TEMPLATES_RETURN_WAIT_TIME: std::time::Duration =
    std::time::Duration::from_secs(60); // 60 seconds before return is ignored

These limits are designed to:
- Allow complex templates to run successfully
- Prevent resource exhaustion and abuse
- Maintain system stability under load
Note
The above limits are deliberately generous so that complex templates can run, while limiting abuse as far as possible. If we notice abuse, we may reduce the limits, either for you or globally.
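As an illustration of how a return-wait limit of this kind can behave, the sketch below waits a bounded time for a result and ignores anything that arrives later. The Waited enum and wait_for_return are hypothetical, and the document does not show how the service enforces MAX_TEMPLATES_RETURN_WAIT_TIME internally.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Outcome of waiting for a template's return value.
#[derive(Debug, PartialEq)]
enum Waited {
    Returned(String),
    Ignored, // the result did not arrive within the allowed wait
}

/// Run `work` on its own thread and wait at most `max_wait` for its result.
fn wait_for_return<F>(work: F, max_wait: Duration) -> Waited
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = channel();
    thread::spawn(move || {
        let _ = tx.send(work()); // the receiver may already be gone
    });
    match rx.recv_timeout(max_wait) {
        Ok(value) => Waited::Returned(value),
        Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => Waited::Ignored,
    }
}
```

In this sketch the work itself is not cancelled when the wait expires; only its return value is discarded, which mirrors the "return is ignored" wording of the constant's comment.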
Sandboxing
The template-worker implements multiple layers of sandboxing:
Luau Runtime Sandboxing
- Built-in sandboxing capabilities of Luau
- Prevents direct system access
- Limits available standard library functions
Global Table Isolation
- Read-only shared global table for Luau standard library
- Isolated per-template global table with custom __index and __metatable
- Prevents template interference while allowing _G usage
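The lookup behaviour of such a layered global table can be modelled in a few lines. This is a conceptual model only, written in Rust rather than Luau: SharedGlobals and TemplateGlobals are hypothetical types, and hash maps stand in for Luau tables and the __index fallback.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Shared, read-only globals (stand-in for the Luau standard library table).
type SharedGlobals = Rc<HashMap<String, String>>;

/// Per-template global table; lookups fall back to the shared table,
/// mimicking an __index fallback.
struct TemplateGlobals {
    own: HashMap<String, String>,
    shared: SharedGlobals,
}

impl TemplateGlobals {
    fn new(shared: SharedGlobals) -> Self {
        TemplateGlobals { own: HashMap::new(), shared }
    }

    /// Writes always land in the template's own table.
    fn set(&mut self, key: &str, value: &str) {
        self.own.insert(key.to_string(), value.to_string());
    }

    /// Reads check the template's own table first, then the shared one.
    fn get(&self, key: &str) -> Option<&str> {
        self.own
            .get(key)
            .or_else(|| self.shared.get(key))
            .map(String::as_str)
    }
}
```

A template can shadow a shared name for itself, but the shared table is never written to, so other templates keep seeing the original value.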
Plugin Sandboxing
- Plugins are read-only and cannot be monkey-patched
- Controlled API surface through the template context
- Capability-based access control
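Capability-based access control generally means an API call succeeds only if the caller holds a named permission. A minimal sketch follows; the TemplateContext fields, the require method, the kick_member function, and the capability string "discord:kick" are all hypothetical and not taken from the template-worker source.

```rust
use std::collections::HashSet;

/// Hypothetical template context carrying the capabilities granted to a template.
struct TemplateContext {
    capabilities: HashSet<String>,
}

impl TemplateContext {
    fn new(caps: &[&str]) -> Self {
        TemplateContext {
            capabilities: caps.iter().map(|c| c.to_string()).collect(),
        }
    }

    /// Gate an API call on a named capability.
    fn require(&self, cap: &str) -> Result<(), String> {
        if self.capabilities.contains(cap) {
            Ok(())
        } else {
            Err(format!("missing capability: {cap}"))
        }
    }
}

/// Hypothetical plugin function that checks a capability before acting.
fn kick_member(ctx: &TemplateContext, user_id: u64) -> Result<String, String> {
    ctx.require("discord:kick")?;
    Ok(format!("kicked {user_id}"))
}
```

Placing the check at the top of each plugin function keeps the controlled API surface in one place: a template without the capability never reaches the privileged code path.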
Resource Limits
- Memory usage monitoring and enforcement
- Execution time limits
- Stack size restrictions
Event Dispatching
Templates are invoked via events:
- Events are dispatched asynchronously using Luau threads
- Each event receives the event data and template context
- The task library works as expected in event handlers
- See the documentation for available events
Code Structure
The template-worker source code is organized in src/:
- src/api/ - HTTP API handlers and types
- src/events/ - Event type definitions
- VM Management - VM lifecycle, distribution, and cleanup
- Execution Engine - Template execution and event handling
- Sandboxing - Security and isolation enforcement
- Resource Management - Memory, time, and stack limits
Extension Points
To extend the template-worker:
- Add New VM Distribution Strategy
  - Implement the distribution strategy interface
  - Use ThreadEntry for common functionality
  - Register in service initialization
- Add New Resource Limits
  - Define constants in limit configuration
  - Update enforcement logic
  - Add monitoring/metrics
- Extend Sandboxing
  - Modify global table isolation
  - Add new capability checks
  - Update plugin sandboxing rules
- Add New Event Types
  - Define event structure in src/events/
  - Update dispatcher logic
  - Document in templating API reference
Related Documentation
- Template Worker Overview - Service overview. Developer documentation for the template-worker service, the template execution engine that runs user-defined Lua/Luau scripts in a sandboxed environment.
- Internals - Implementation details. Deep dive into the template-worker implementation details, code patterns, and internal architecture.
- HTTP API - API endpoints and types. HTTP API documentation for the template-worker service including endpoints, types, and OpenAPI specification.
- Templating Architecture - Additional architecture details. Architecture and VM distribution strategies for AntiRaid's templating system.