External monitors prove your endpoints return 200. They’re blind to whether the work behind them finished — checkout sagas stuck retrying, webhooks dead-lettering, jobs failing halfway. Process Health instruments your processes so you can see completion rate, p95 duration, and exactly which step failed. A run maps to a trace and each step to a span, so it rides the same pipeline as distributed tracing.

Install

npm install @kodo-status/sdk

Instrument a process

Wrap a unit of work with kodo.workflow() and instrument each step with wf.step(). Your control flow is unchanged — step() runs your function, times it, records success/failure, and rethrows on error.
import { Kodo } from "@kodo-status/sdk";

const kodo = new Kodo({ apiKey: process.env.KODO_API_KEY! });

// stripe, inventory, warehouse, db, and order stand in for your own application code.
await kodo.workflow("checkout-fulfillment", async (wf) => {
  await wf.step("charge_card",       () => stripe.charge(order));
  await wf.step("reserve_inventory", () => inventory.reserve(order));
  await wf.step("notify_warehouse",  () => warehouse.notify(order));
  await wf.step("mark_fulfilled",    () => db.markFulfilled(order));
});
That’s it. On completion (success or failure) the run is recorded and appears under Dashboard → Process Health, grouped by workflow name.
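To make the described step() behavior concrete, here is a minimal sketch of a wrapper that runs a function, times it, records the outcome, and rethrows on error. It is an illustration only, not the SDK's implementation: timedStep and StepRecord are hypothetical names, and the in-memory records array stands in for whatever the SDK actually reports.

```typescript
// Hypothetical illustration of "run, time, record, rethrow"; not SDK source.
type StepRecord = {
  name: string;
  ok: boolean;
  durationMs: number;
  error?: string;
};

async function timedStep<T>(
  records: StepRecord[],
  name: string,
  fn: () => Promise<T> | T,
): Promise<T> {
  const start = Date.now();
  try {
    const result = await fn();
    records.push({ name, ok: true, durationMs: Date.now() - start });
    return result;
  } catch (err) {
    records.push({
      name,
      ok: false,
      durationMs: Date.now() - start,
      error: err instanceof Error ? err.message : String(err),
    });
    // Rethrow so the caller's own error handling sees the original error.
    throw err;
  }
}
```

Because the error is rethrown rather than swallowed, any try/catch or retry logic you already have around the wrapped function keeps behaving the same way.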

What you get

  • Completion rate — % of runs that finished without an errored step, per process.
  • p95 duration — how long runs take end-to-end.
  • Failed runs + the failing step — when a run errors, the exact step that broke is surfaced.
Telemetry is fire-and-forget: a failure to report never throws into your process. Kōdo only observes; retries, checkpointing, and durability remain your concern (use it alongside Temporal, BullMQ, Cloudflare Workflows, or plain queues).
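As a rough guide to reading the first two metrics listed above, the sketch below computes a completion rate and a p95 duration from a list of runs. The Run shape and function names are hypothetical, and the nearest-rank percentile method is an assumption; this page does not say which percentile method Kōdo uses.

```typescript
// Hypothetical run summary; field names are illustrative, not the SDK's schema.
type Run = {
  durationMs: number;
  failedStep?: string; // set only when a step errored
};

// Percentage of runs that finished without an errored step.
function completionRate(runs: Run[]): number | undefined {
  if (runs.length === 0) return undefined; // no runs, no meaningful rate
  const ok = runs.filter((r) => r.failedStep === undefined).length;
  return (ok / runs.length) * 100;
}

// p95 using the nearest-rank method (an assumption, see the lead-in).
function p95(durationsMs: number[]): number | undefined {
  if (durationsMs.length === 0) return undefined;
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}
```

With nearest rank, small samples put the p95 at or near the slowest run, so a single slow run can dominate the figure for a low-volume process.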

Zero-code: point Kōdo at a table you already have

If your runs already live in a table — a jobs queue, a saga table, webhook_deliveries — you don’t have to wrap anything. Declare the table in kodo.yaml and run the collector. It reads the table in your own infrastructure (the database connection never leaves your network) and ships only normalized run summaries to Kōdo.
# kodo.yaml
processes:
  - name: webhook-delivery
    source: postgres
    connection: ${DATABASE_URL}   # resolved from your env, locally
    table: webhook_deliveries
    id: id
    status: status
    status_map: { delivered: ok, failed: failed, pending: running }
    started_at: created_at
    completed_at: updated_at       # optional — used for duration
Then run the collector:
kodo collect            # watch + ship continuously
kodo collect --once     # single pass (e.g. from your own cron)
status_map is optional — common values (success/completed/delivered → ok, failed/error → failed, pending/running → running) are recognized by default. The collector tracks a per-source cursor in .kodo-collector-state.json, so each run ships exactly once.
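The default mappings named above, combined with an optional status_map, can be pictured as a lookup like the one below. This is a sketch, not the collector's code: normalizeStatus and DEFAULT_STATUS_MAP are hypothetical names, and three behaviors are assumptions this page does not confirm: overrides taking precedence over defaults, case-insensitive matching, and unrecognized values returning undefined.

```typescript
type Normalized = "ok" | "failed" | "running";

// Defaults as listed in the text above.
const DEFAULT_STATUS_MAP = new Map<string, Normalized>([
  ["success", "ok"],
  ["completed", "ok"],
  ["delivered", "ok"],
  ["failed", "failed"],
  ["error", "failed"],
  ["pending", "running"],
  ["running", "running"],
]);

// Assumption: a status_map entry wins over a default for the same value.
function normalizeStatus(
  raw: string,
  overrides: ReadonlyMap<string, Normalized> = new Map(),
): Normalized | undefined {
  const key = raw.trim().toLowerCase(); // assumption: case-insensitive match
  return overrides.get(key) ?? DEFAULT_STATUS_MAP.get(key);
}
```

Under these assumptions, a table that stores a value outside the defaults, such as queued, would need an explicit status_map entry to be counted.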
The postgres source records completed runs today (completion rate, p95, failures). In-flight / stuck detection for table sources is on the roadmap; use the SDK source for live stuck detection now.

Notes

  • Server-side. kodo.workflow() is for backend processes (workers, jobs, route handlers) where your async work runs.
  • Engine-agnostic. Observe completion across whatever you run — Temporal, Restate, DBOS, BullMQ, Cloudflare Workflows, or raw queues. Kōdo only observes; durability stays with your engine.
  • Rides tracing. Runs are stored as workflow traces (op: "workflow"), so Process Health is available on plans that include distributed tracing.