The two failure channels
α — Why does Aperio have two separate failure mechanisms, and how do you choose between them?
Failure-handling is where most languages quietly accumulate
the largest amount of accidental complexity. Exceptions vs.
sentinels vs. error returns vs. Result<T, E> vs. panics —
many languages have several of these layered, with different
disciplines for when to use which, often in the same codebase.
Aperio carves the space cleanly into two orthogonal channels, with strict rules about which is allowed where:
- The structural channel (↑): a locus’s declared invariant breaks. The runtime constructs a typed event and routes it upward to the parent’s on_failure handler. Recovery primitives (restart, quarantine, bubble, dissolve) decide what to do.
- The value channel (fallible(E)): an individual call can fail with a payload. The caller MUST address the error inline via an or clause before consuming the value.
There is no panic, no assert, no try/catch, no
implicitly-propagating exception system. The two channels
above cover every legitimate failure case; anything else
indicates a category error in the modeling.
The structural channel
A locus has commitments it must hold across its lifetime.
Those commitments are declared in closure blocks:
locus PnLAttribution {
    params { intent_pnl: Decimal = 0.00d; book_pnl: Decimal = 0.00d; }
    closure books_balance {
        self.intent_pnl ~~ self.book_pnl within 0.05d;
        epoch tick;
    }
}
The ~~ operator is approximate equality within tolerance.
The closure says: at each tick, my intent PnL and book PnL
must agree within five cents. The runtime evaluates the
expression at each declared epoch; if it holds, nothing
happens (closures are silent on success). If it doesn’t, the
runtime constructs a typed ClosureViolation event and routes
it to the parent’s on_failure:
locus TradingDesk {
    accept(p: PnLAttribution) { /* ... */ }
    on_failure(p: PnLAttribution, err: Error) {
        match err {
            Error::ClosureViolation(v) -> {
                // v.closure is "books_balance"
                // v.left, v.right are the two values
                // v.tolerance is 0.05d
                // v.diff is left - right
                quarantine(p) for 60s;
            }
            _ -> bubble(err);
        }
    }
}
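As a rough mental model of what the runtime does at each epoch, the following Python sketch evaluates a tolerance check and builds a violation record. It is not Aperio's implementation: check_closure and the ClosureViolation dataclass are hypothetical names, and the field set simply mirrors the fields named in the handler comments above (closure, left, right, tolerance, diff). Treating the tolerance bound as inclusive is an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ClosureViolation:
    # Hypothetical record; fields mirror those named in the handler comments.
    closure: str
    left: Decimal
    right: Decimal
    tolerance: Decimal
    diff: Decimal


def check_closure(name: str, left: Decimal, right: Decimal,
                  tolerance: Decimal) -> Optional[ClosureViolation]:
    """Return None when the closure holds (silent on success),
    otherwise a violation record destined for the parent's handler."""
    diff = left - right
    # Assumption: the bound is inclusive, so |diff| == tolerance still holds.
    if abs(diff) <= tolerance:
        return None
    return ClosureViolation(name, left, right, tolerance, diff)


# Two cents apart: within tolerance, so nothing is produced.
ok = check_closure("books_balance", Decimal("100.02"),
                   Decimal("100.00"), Decimal("0.05"))
# Ten cents apart: outside tolerance, so a violation record is built.
bad = check_closure("books_balance", Decimal("100.10"),
                    Decimal("100.00"), Decimal("0.05"))
```

The point of the sketch is the asymmetry: a passing check yields nothing at all, while a failing one yields a fully populated, typed record rather than a message string.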
The parent’s recovery options:
- Absorb: return from on_failure without calling any recovery primitive. The child’s failure is treated as “noted, not propagating.”
- restart(child): dissolve the child and instantiate a fresh one with the same declared params.
- restart_in_place(child): reset the child to post-birth state while preserving its arena.
- quarantine(child) for d: pause the child but preserve its state, optionally auto-restart after d.
- bubble(err): pass the failure up to this locus’s parent. Recursive propagation.
- dissolve(child): force-dissolve the child.
If a failure bubbles all the way past the runtime root with no handler absorbing, the process exits non-zero with a structured violation report on stderr. That’s the only way the program “crashes” — and it’s a deliberate, structured event, not an unexpected exception.
This is Erlang’s let-it-crash philosophy with one important addition: the parent’s policy is typed and declared. You write the recovery rule next to the locus it applies to, and it can be different for different child types. The runtime enforces the state machine — a child can’t be running and quarantined at the same time, can’t accept while draining, etc.
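The routing rule can be modeled in a few lines of Python. This is only a sketch of the propagation logic, not the runtime: the Locus class, route_failure, and the string-valued errors are hypothetical, every non-bubble outcome (absorb, restart, quarantine, dissolve) is collapsed into ABSORB because each of them stops propagation, and a locus without a declared policy is assumed to bubble.

```python
from typing import Callable, Optional

ABSORB, BUBBLE = "absorb", "bubble"


class Locus:
    def __init__(self, name: str, parent: Optional["Locus"] = None,
                 policy: Optional[Callable[[str, str], str]] = None):
        self.name = name
        self.parent = parent
        # policy(child_name, err) -> ABSORB or BUBBLE.
        # Assumption: a locus with no declared policy bubbles.
        self.policy = policy or (lambda child, err: BUBBLE)


def route_failure(origin: Locus, err: str) -> Optional[str]:
    """Walk up the tower from the failing locus. Return the name of the
    locus that stopped the failure, or None if it passed the root."""
    child, parent = origin, origin.parent
    while parent is not None:
        if parent.policy(child.name, err) == ABSORB:
            return parent.name
        # Bubbling: the parent now plays the role of the failed child.
        child, parent = parent, parent.parent
    return None  # a real runtime would exit non-zero with a report here


root = Locus("root")
desk = Locus(
    "TradingDesk", parent=root,
    policy=lambda child, err: ABSORB if err == "ClosureViolation" else BUBBLE,
)
pnl = Locus("PnLAttribution", parent=desk)

handled_by = route_failure(pnl, "ClosureViolation")
unhandled = route_failure(pnl, "SomethingElse")
```

Here the desk stops closure violations from its child, while any other error travels past the root, which corresponds to the structured process exit described above.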
The value channel
Sometimes a function can fail in a way that’s not a structural event — just “this call didn’t produce a value, here’s why”:
fn parse_player_id(s: String) -> PlayerId fallible(ParseError) {
    if !std::str::can_parse_int(s) {
        fail ParseError { kind: "not_int", input: s };
    }
    return PlayerId { value: std::str::parse_int(s) };
}
A function declared fallible(E) returns either a value of
the success type or a FallibleErr(E) payload. The caller
must address the error — the typechecker rejects a bare
call result:
let id = parse_player_id(input); // ERROR: "error not addressed"
You address it with an or clause, in one of three motions:
let id = parse_player_id(input) or raise; // propagate up
let id = parse_player_id(input) or default_id(); // substitute
let id = parse_player_id(input) or handle(err); // hand off
- or raise: propagate the error one frame up the static call stack. The enclosing function must itself be fallible(E) (with the same payload type or a compatible one) so the error has somewhere to go. This is the value channel’s version of “let it propagate.”
- or <expression>: substitute a fallback value of the success type. err is implicitly bound to the payload inside the fallback expression. The fallback can be a literal (or 0), an expression (or default_id()), or a call (or handle(err)).
- The error’s payload type is fully typed. You don’t need to downcast or pattern-match a generic Error; the fallible(E) declaration says exactly what shape the payload has.
Chains work right-associatively:
let id = parse_player_id(input) or lookup_default() or raise;
Reads as: try parse; on failure, try lookup_default(); on
that failure, propagate up. Each or disposes of one fallible
in turn, reducing the chain toward a non-fallible value.
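To make the right-associative reading concrete, here is a small Python model of the chain. It is a sketch under assumptions, not how Aperio compiles or clauses: Python exceptions stand in for the FallibleErr payload purely to model control flow, or_else and raise_ are hypothetical helpers, and lookup_default is assumed to be fallible and is hard-coded to fail so the last link of the chain is exercised.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class FallibleErr(Exception):
    """Stands in for a fallible(E) failure; payload is the typed E."""
    def __init__(self, payload):
        super().__init__(payload)
        self.payload = payload


def or_else(call: Callable[[], T], fallback: Callable[[object], T]) -> T:
    """Model of `call() or <fallback>`: run the call; on failure, bind the
    payload (the implicit `err`) and evaluate the fallback instead."""
    try:
        return call()
    except FallibleErr as e:
        return fallback(e.payload)


def raise_(err):
    # Model of `or raise`: hand the payload to the enclosing frame.
    raise FallibleErr(err)


def parse_player_id(s: str) -> int:
    if not s.isdecimal():
        raise FallibleErr({"kind": "not_int", "input": s})
    return int(s)


def lookup_default() -> int:
    # Hypothetical fallback that itself fails.
    raise FallibleErr({"kind": "no_default"})


def resolve(s: str) -> int:
    # Grouping: parse_player_id(s) or (lookup_default() or raise)
    return or_else(lambda: parse_player_id(s),
                   lambda err: or_else(lookup_default, raise_))
```

With a parseable input, resolve returns at the first link and the fallbacks never run. With an unparseable one, the payload that finally escapes is the one from lookup_default, since the innermost or is the one that raises.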
The value channel is value-level. It propagates through the
static call stack, not the locus tower. Two functions that
are both declared fallible(ParseError) and call each other
share the same payload type and pass it up the stack until
something addresses it.
Where each channel lives
This is the rule that often surprises people coming from other languages:
fallible(E) may be declared on free functions and on stdlib-synthesized @form(...) methods. It may NOT be declared on user-declared locus methods.
Why the restriction? Because locus methods are
substrate-facing. They participate in the locus’s lifecycle
— bus subscription handlers, mode projections, contract reads.
Failures at this layer are structural events, not
value-level errors. They belong on the closure-violation
channel, where the parent’s on_failure is the policy
handler.
If a locus method needs to expose application-layer failure semantics, it wraps a fallible free function:
fn parse_message(b: Bytes) -> Message fallible(ParseError) { ... }
locus Reader {
    bus { subscribe Input as on_input; }
    fn on_input(b: Bytes) {
        let m = parse_message(b) or default_message();
        // ... handle m
    }
}
The typechecker enforces this. Trying to declare fn ... -> T fallible(E) on a user locus method produces a focused
diagnostic naming the rule.
The reverse direction has a complementary rule: among methods,
only stdlib-synthesized form methods (@form(vec).get,
@form(vec).pop, @form(hashmap).get, @form(hashmap).remove,
@form(ring_buffer).pop) declare fallible(E). These are
application-layer storage substrate, not lifecycle-bearing
loci, so the value channel fits.
Bridging the channels: structural failure from value-error context
The two-channel rule keeps locus methods off the value channel —
but real systems regularly need to cross from one to the other.
A locus method catches a value error in an or clause, decides
the error is unrecoverable, and wants to immediately escalate
into the structural channel so the parent’s on_failure policy
takes over.
Aperio’s primitive for this is inline closure violation: a
locus declares a named structural-failure type as an
assertion-less closure with epoch inline, then any member
function can fire it with the violate statement.
type Query { sql: String; }
type Row { data: String; }
type DbError { kind: String; detail: String; }
topic ExecuteQuery { payload: Query; }
topic QueryResult { payload: Row; }
fn send_query(fd: Int, q: Query) -> Row fallible(DbError) {
    let sent = std::io::tcp::send_bytes(fd, std::bytes::from_string(q.sql));
    if sent < 0 { fail DbError { kind: "send_failed", detail: "connection lost" }; }
    let resp = std::io::tcp::recv_bytes(fd, 4096);
    if len(resp) == 0 { fail DbError { kind: "recv_empty", detail: "peer closed" }; }
    return Row { data: std::str::from_bytes(resp) };
}
locus DbConnection {
    params {
        host: String = "127.0.0.1";
        port: Int = 5432;
        conn_fd: Int = -1;
        last_error: String = "";
    }
    bus { subscribe ExecuteQuery as on_query; publish QueryResult; }
    // Named structural-failure type. No assertion body; the fire
    // IS the violation. The captures clause snapshots state into
    // the ClosureViolation payload at the violate site.
    closure fatal_io {
        captures: last_error;
        epoch inline;
    }
    birth() { self.conn_fd = std::io::tcp::connect(self.host, self.port); }
    dissolve() { if self.conn_fd >= 0 { std::io::tcp::close_fd(self.conn_fd); } }
    // The "error-check function": takes the error type, returns
    // the success type expected at the call site, and chooses
    // recovery (return a value) or escalation (violate).
    fn handle_io(e: DbError) -> Row {
        self.last_error = e.detail;
        if e.kind == "send_failed" || e.kind == "recv_empty" {
            violate fatal_io;            // diverges; no return needed
        }
        return Row { data: "" };         // transient; substitute
    }
    fn on_query(q: Query) {
        let r = send_query(self.conn_fd, q) or self.handle_io(err);
        if !self.draining { QueryResult <- r; }
    }
}
Three primitives are doing the work:
- closure fatal_io { ... epoch inline; }: the vocabulary. A named structural-failure type local to this locus. The captures: clause names locus state to snapshot when fired.
- fn handle_io(e: DbError) -> Row: the policy. A member fn shaped exactly for the or clause: takes the error type, returns the success type. Inside, the body decides between recovery (return a value) and escalation (violate). One function can be reused across every fallible call site on this locus that produces Row from DbError.
- violate fatal_io: the trigger. Statement-level, divergent (typechecker treats as Never, same as fail in fallible fns and bubble in on_failure). At the next cooperative yield, the runtime transitions this locus to drain. At dissolve, the parent receives the typed ClosureViolation with the captured last_error.
The flow when a value error propagates up:
1. send_query(self.conn_fd, q) fails and returns FallibleErr(DbError {...}).
2. The or self.handle_io(err) clause fires: err binds to the DbError; handle_io runs.
3. handle_io writes e.detail to self.last_error, sees the fatal kind, and executes violate fatal_io.
4. The runtime constructs ClosureViolation { locus: "DbConnection", closure: "fatal_io", captures: { last_error: "connection lost" } } and sets the locus’s internal __drain_requested flag. Control diverges: handle_io never returns to its caller.
5. At the next cooperative yield, the runtime begins drain. dissolve() runs, closing the fd.
6. The parent’s on_failure(c, ClosureViolation { ... }) fires with the snapshot, decides policy (restart / quarantine / bubble / absorb).
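The same sequence can be traced in a small Python model. Everything here is illustrative rather than a description of the runtime: a Python exception stands in for the divergence of violate, run_once is a hypothetical stand-in for the scheduler, a Row is reduced to a plain string, and the drain happens immediately instead of at the next cooperative yield.

```python
class Violation(Exception):
    """Models the divergence of `violate`: control never returns to the caller."""
    def __init__(self, locus, closure, captures):
        super().__init__(closure)
        self.locus, self.closure, self.captures = locus, closure, captures


class DbError(Exception):
    def __init__(self, kind, detail):
        super().__init__(kind)
        self.kind, self.detail = kind, detail


class DbConnection:
    def __init__(self, send_query):
        self.send_query = send_query   # injected stand-in for the free fn
        self.last_error = ""
        self.drain_requested = False
        self.dissolved = False

    def violate(self, closure, capture_names):
        # Snapshot the named state, request drain, then diverge.
        captures = {n: getattr(self, n) for n in capture_names}
        self.drain_requested = True
        raise Violation("DbConnection", closure, captures)

    def handle_io(self, e):
        self.last_error = e.detail
        if e.kind in ("send_failed", "recv_empty"):
            self.violate("fatal_io", ["last_error"])
        return ""                      # transient: substitute an empty row

    def on_query(self, q):
        try:
            return self.send_query(q)
        except DbError as err:         # models `or self.handle_io(err)`
            return self.handle_io(err)

    def dissolve(self):
        self.dissolved = True          # the real locus closes its fd here


def run_once(conn, q, on_failure):
    """Deliver one message; on a violation, dissolve the locus and hand
    the snapshot to the parent's policy."""
    try:
        return conn.on_query(q)
    except Violation as v:
        conn.dissolve()
        return on_failure(conn, v)


def failing_send(q):
    raise DbError("send_failed", "connection lost")


seen = []

def parent_policy(child, violation):
    seen.append(violation)
    return "restart"

conn = DbConnection(failing_send)
result = run_once(conn, "SELECT 1", parent_policy)
```

Note that handle_io assigns last_error before violating, which is why the snapshot the parent sees already contains "connection lost".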
Why this composes well
Three roles, three slots, no double duty:
| Slot | Role | Reusable across |
|---|---|---|
| Closure declaration | Vocabulary: named failure type with optional payload schema | The locus type |
| Member fn (error-check) | Policy: decide recovery vs escalation per error kind | Every call site on the locus with same (ErrType, SuccessType) |
| or self.handler(err) at call site | Binding: typechecker-enforced disposition | Every fallible call returning the matching success type |
Compare to the older workaround pattern (a should_exit: Bool
flag, a fatal_error: Bool flag, a while !should_exit { yield; }
loop in run(), a separate diagnostic field, plus a closure to
audit at dissolve): five moving parts doing what one
closure + one violate + one member fn now do.
A note on Never
violate NAME; is divergent. The typechecker treats it
as the Never type: code after a violate is unreachable
within the current function. This is the same shape fail E;
takes inside a fallible function and bubble(err); takes
inside an on_failure handler — three statement forms whose
“return type” is “control doesn’t return through here.”
That’s what makes the error-check function work cleanly:
fn handle_io(e: DbError) -> Row {
    if e.kind == "fatal" {
        violate fatal_io;        // Never; no return required
    }
    return Row { data: "" };     // Row; required on the other branch
}
The branches that violate don’t need a return; the branches
that return must provide a value of the declared type. The
typechecker enforces total coverage exactly as it would for a
function that mixes fail and return.
Why two channels and not one?
Languages built mainly around structural failure (Erlang) make
value-level errors awkward: you end up modeling “couldn’t
parse this int” as a process crash, which is too heavy.
Languages built mainly around value failure (Rust, Go) make
structural errors awkward: invariant violations end up
sprinkled across every call site as Result<T, Error> or
(T, error) returns, which is too granular and loses the
parent-policy-oriented recovery model.
Aperio splits the concern: structural failure routes up the locus tower with typed policy, and value failure routes up the static call stack with required inline disposition. The two never mix implicitly at intermediate frames: a value error crosses into the structural channel only through an explicit violate, and otherwise the only place they meet is the implicit root boundary (where any unhandled error of either kind ends the process).
In practice the rule of thumb is:
| Failure shape | Channel |
|---|---|
| “This invariant I declared broke” | structural (closure → on_failure) |
| “This individual call can fail and the caller should choose” | value (fallible(E)) |
| “Couldn’t parse” / “key not found” / “out of bounds” | value |
| “Books don’t balance” / “k_max exceeded” / “child wedged” | structural |
No panic / assert
Aperio has no panic(msg), no assert(cond), no throw.
“Impossible state” becomes “a closure asserting the state is
possible” — and when it isn’t, the runtime constructs the
typed violation and routes it up. “Bail from this function”
becomes either or raise (value channel) or “make this a
closure on the locus” (structural channel).
This isn’t asceticism. It’s that every legitimate use of
panic falls cleanly into one of the two channels above,
with better typing and better recovery shape than panic
itself provides.
Next
The next chapter, Lifecycle & time, covers how loci come into being, run, and dissolve — the state machine the failure channels operate over.