Get system health
const url = 'https://app.everruns.com/api/v1/durable/health';
const options = { method: 'GET' };

try {
  const response = await fetch(url, options);
  const data = await response.json();
  console.log(data);
} catch (error) {
  console.error(error);
}

curl --request GET \
  --url https://app.everruns.com/api/v1/durable/health

Responses

System health
System health response
object
active_workers: Number of workers in the running state, ready to claim tasks.
claimed_tasks: Tasks currently claimed by a worker (gauge).
completed_tasks: Cumulative count of tasks that completed successfully (monotonic counter).
completed_workflows: Cumulative count of workflows that completed successfully (monotonic counter).
current_load: Total tasks currently in flight across all workers.
dlq_size: Size of the dead-letter queue (gauge). High values indicate stuck activities.
event_delivery: Event-delivery backend in use: nats for distributed deployments, in_memory for single-instance. None if the field was omitted by an older server.
failed_tasks: Cumulative count of tasks that failed terminally or were sent to the DLQ (monotonic counter).
failed_workflows: Cumulative count of workflows that ended in failure (monotonic counter).
load_percentage: current_load / total_capacity * 100. 0.0 when no workers are registered.
pending_tasks: Tasks waiting to be claimed (gauge).
pending_workflows: Workflows waiting to be claimed (gauge).
running_workflows: Workflows currently executing (gauge).
started_tasks: Cumulative count of tasks claimed at least once (monotonic counter).
started_workflows: Cumulative count of workflows that started (monotonic counter).
status: Aggregate system status: healthy, degraded, or unhealthy. Derived from worker availability, load, and queue depths.
total_capacity: Sum of max_concurrency across all workers (the upper bound on concurrent task execution).
total_workers: Total number of workers registered (heartbeating in the last window).
workers_accepting: Number of workers currently accepting new task assignments (subset of active_workers; drains/backpressure excluded).
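The load formula given above (current_load / total_capacity * 100, with 0.0 when no workers are registered) can be recomputed on the client as a sanity check. This is a minimal sketch, not the server's implementation: the function name loadPercentage is hypothetical, and treating a total_capacity of 0 as the "no workers registered" case is an assumption.

```javascript
// Hypothetical helper: recompute the load percentage from the two values.
// Assumption: total_capacity is 0 when no workers are registered.
function loadPercentage(currentLoad, totalCapacity) {
  if (totalCapacity === 0) {
    return 0.0;
  }
  return (currentLoad / totalCapacity) * 100;
}

// current_load and total_capacity taken from the example response.
console.log(loadPercentage(7, 32)); // 21.875
console.log(loadPercentage(0, 0)); // 0
```

Guarding the zero-capacity case first avoids a NaN from dividing 0 by 0.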
Example
{
  "active_workers": 4,
  "claimed_tasks": 7,
  "completed_tasks": 12041,
  "completed_workflows": 4128,
  "current_load": 7,
  "dlq_size": 0,
  "event_delivery": "nats",
  "failed_tasks": 34,
  "failed_workflows": 12,
  "load_percentage": 21.875,
  "pending_tasks": 2,
  "pending_workflows": 1,
  "running_workflows": 3,
  "started_tasks": 12082,
  "started_workflows": 4144,
  "status": "healthy",
  "total_capacity": 32,
  "total_workers": 4,
  "workers_accepting": 4
}

Internal server error
Standard error response.
The wire shape is RFC 9457 Problem Details: every error response includes title and status, and may include detail, code, allowed_actions, retry_after_seconds, instance, and type. The content type is rewritten to application/problem+json by problem_json_content_type.
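A client can use the content type and the two always-present members to recognize a problem document before reading the optional ones. The sketch below is illustrative only: both helper names are hypothetical, and the assumption that title is a string and status is a number is drawn from the example response rather than from a stated schema.

```javascript
// Hypothetical helper: does this Content-Type header denote a problem document?
function isProblemJson(contentType) {
  // Media types are case-insensitive and may carry parameters such as charset.
  const mediaType = (contentType || '').split(';')[0].trim().toLowerCase();
  return mediaType === 'application/problem+json';
}

// Hypothetical helper: check the two members every error response includes.
// Assumption: title is a string and status is a number.
function hasRequiredProblemFields(body) {
  return body !== null
    && typeof body === 'object'
    && typeof body.title === 'string'
    && typeof body.status === 'number';
}

console.log(isProblemJson('application/problem+json; charset=utf-8')); // true
console.log(hasRequiredProblemFields({ title: 'Not Found', status: 404 })); // true
```

With fetch, the header value would come from response.headers.get('content-type'), which returns null when the header is absent; the helper treats that as "not a problem document".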
object
allowed_actions: Recovery actions the caller can take next.
Agent-actionable link describing a follow-up the caller can take. Used in two contexts:
- Error recovery: ErrorResponse.allowed_actions carries rels like retry, retry-later, unarchive, get-existing so the agent knows the right next call after a 4xx/429.
- Entity hypermedia: WithUrls<T>.allowed_actions carries state-aware rels like cancel, events, self, update on the entity itself so the agent can follow links instead of reconstructing routes from prose.
The shape is intentionally identical across both contexts; the closed rel vocabulary documented in specs/api-conventions.md distinguishes them.
object
Short, agent-readable hint (e.g. “Shorten ‘name’ to <= 200 chars.”, “Cancel the active turn for this session.”).
href: Absolute (preferred) or relative URL the caller may invoke directly. Always present on entity hypermedia actions (WithUrls<T>.allowed_actions); optional on error-recovery actions (ErrorResponse.allowed_actions) where the matching operation_id is enough and the URI is implicit from the failed call.
method: HTTP method to use against href. Required for entity hypermedia actions; usually omitted on error-recovery actions where the same operation is retried with its original method.
operation_id: OpenAPI operationId the caller should invoke. Lets an MCP client resolve the call without parsing href.
rel: Link relation describing the action. Closed vocabulary documented in specs/api-conventions.md; examples: self, cancel, pause, resume, events, retry, retry-later, unarchive, get-existing, delete, update.
OpenAPI $ref to the request-body schema, when the action takes one
(e.g. #/components/schemas/UpdateSessionRequest). Lets a tool-calling
agent fetch the input shape without scanning the whole spec.
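Because href and method may both be missing on error-recovery actions, a caller needs a fallback to the request that failed. The following is a rough sketch of that fallback under stated assumptions: resolveAction and the failedRequest argument ({ url, method } of the original call, with an absolute url) are hypothetical names, and the property names href, method, and rel are inferred from the descriptions above.

```javascript
// Hypothetical helper: work out which URL and method to call for an action.
// failedRequest is the { url, method } of the call that produced the error.
function resolveAction(action, failedRequest) {
  return {
    // href may be relative, so resolve it against the failed request's URL.
    // When href is absent, the URI is implicit from the failed call.
    url: action.href
      ? new URL(action.href, failedRequest.url).toString()
      : failedRequest.url,
    // When method is absent, retry with the original method.
    method: action.method || failedRequest.method,
  };
}

const failed = {
  url: 'https://app.everruns.com/api/v1/durable/health',
  method: 'GET',
};
console.log(resolveAction({ rel: 'retry' }, failed));
// { url: 'https://app.everruns.com/api/v1/durable/health', method: 'GET' }
```

An MCP-style client that resolves calls by operation ID would skip this step and look the operation up in the OpenAPI document instead.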
code: Stable, machine-readable error code (snake_case).
detail: Human-readable explanation specific to this occurrence.
instance: Request URI for this occurrence.
retry_after_seconds: Seconds the caller should wait before retrying (429 / transient 503).
status: HTTP status code; mirrors the response status line.
title: Short, human-readable summary of the problem (e.g. “Not Found”).
type: RFC 9457 problem type URI. Optional; identifies the problem class.
Example
{
  "allowed_actions": [
    {
      "method": "POST"
    }
  ],
  "code": "session_not_found",
  "detail": "Session session_01933b5a000070008000000000000001 not found in org org_01933b5a000070008000000000000001.",
  "instance": "/v1/sessions/session_01933b5a000070008000000000000001",
  "retry_after_seconds": 30,
  "status": 404,
  "title": "Session not found",
  "type": "https://docs.everruns.com/errors/session_not_found"
}
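One way a client might turn retry_after_seconds into a wait time is sketched below. The helper name retryDelayMs is hypothetical, and honoring the value only for 429 and 503 is an assumption based on the field description; a client could reasonably choose to honor it for any status.

```javascript
// Hypothetical helper: how long to wait before retrying, in milliseconds.
// Returns null when the problem document gives no usable retry guidance.
function retryDelayMs(problem) {
  // Assumption: only 429 and 503 are treated as retryable.
  const retryable = problem.status === 429 || problem.status === 503;
  const seconds = problem.retry_after_seconds;
  if (!retryable || typeof seconds !== 'number' || seconds < 0) {
    return null;
  }
  return seconds * 1000;
}

console.log(retryDelayMs({ status: 429, retry_after_seconds: 30 })); // 30000
console.log(retryDelayMs({ status: 404, retry_after_seconds: 30 })); // null
```

Under this policy the 404 example above would not be retried even though it carries retry_after_seconds.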