Anatomy of the runtime#

You can get a long way with tractor by treating it as “trio with nurseries that spawn processes”. But once you start asking “where does my msg actually go?”, “which process is that?” or “who keeps the phonebook?”, it pays to know how the runtime hangs together. This page walks the stack top to bottom.

[Figure: layer cake of app tasks, tractor runtime, IPC, OS process. Caption: The four runtime layers inside every actor process.]

The layer cake#

Every actor process is the same four-layer sandwich:

  • your app: plain trio tasks, nurseries and cancel scopes; nothing special. tractor is a structured concurrency (SC) multi-processing runtime built on trio and the whole pitch is that this layer stays just trio: no callbacks, no futures, no proxy objects.

  • the tractor runtime: a per-process tractor.Actor running the msg loop and RPC task scheduler, plus the user-facing primitives layered on it: tractor.ActorNursery (spawning + supervision), tractor.Portal (calling into a peer) and tractor.Context + tractor.MsgStream (SC-linked cross-actor task pairs and streaming).

  • IPC channels: one tractor.Channel per connected peer, each wrapping a MsgTransport that ships msgspec-typed msgs over TCP or UDS.

  • the OS: one process per actor, started by a swappable spawn backend.

The property that holds it all together: SC composes through the layers. A crash in a leaf actor’s app task unwinds that actor’s trio tree, ships across its IPC channel as a typed Error msg, and unwinds the parent’s trio tree in turn — the “SC-transitive supervision protocol” from the README’s pitch. The whole tree cancels and errors like one big trio program; it just happens to be spread across processes.

One actor, one process, one trio.run()#

A tractor “actor” is not a green thread, nor an object with a mailbox, nor a coroutine: it’s one OS process running one trio.run() whose root task boots the runtime machinery — msg loop, RPC task scheduler, IPC server — all embodied by a single tractor.Actor instance.

You rarely construct an Actor yourself; the runtime makes exactly one per process and you grab it with tractor.current_actor():

import tractor

actor = tractor.current_actor()  # NoRuntime if none running
print(actor.aid.name)   # str name, need not be unique
print(actor.aid.uuid)   # uuid4 str, IS unique
print(actor.aid.pid)    # the OS pid
print(actor.uid)        # legacy (name, uuid) pair

Identity is carried by the Aid msg-struct (see tractor.msg.types): a name/uuid/pid triple exchanged in the very first “mailbox handshake” whenever two actors connect. It’s what the registrar stores and what shows up in logs and proc-titles. The older .uid 2-tuple of (name, uuid) predates Aid and is still pervasive across the codebase; treat it as the legacy spelling of the same identity.
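To make the name/uuid/pid triple and its legacy `.uid` spelling concrete, here is a minimal stdlib-only sketch. `AidSketch` and `new_aid()` are hypothetical stand-ins for illustration; the real `Aid` is a msg-struct in tractor.msg.types and may carry more than this.

```python
import os
from dataclasses import dataclass
from uuid import uuid4


@dataclass(frozen=True)
class AidSketch:
    # hypothetical stand-in for tractor's `Aid`, NOT the real type
    name: str  # human-friendly, need not be unique
    uuid: str  # uuid4 string, unique per actor
    pid: int   # OS process id

    @property
    def uid(self) -> tuple[str, str]:
        # the legacy (name, uuid) pair: same identity, older spelling
        return (self.name, self.uuid)


def new_aid(name: str) -> AidSketch:
    # hypothetical helper: mint an identity for the current process
    return AidSketch(name=name, uuid=str(uuid4()), pid=os.getpid())


a = new_aid('worker')
b = new_aid('worker')
```

Two actors can share a name, so anything that must tell them apart should key on the uuid (or the full `.uid` pair), never on the name alone.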

If this smells like the actor model, sure — but as the README warns, it probably doesn’t look like what you think an actor model looks like, and that’s intentional. Here an “actor” is purely a runtime-unit-of-abstraction: process + trio.run() + IPC machinery.

IPC: channels, transports, addresses#

Two connected actors talk through a tractor.Channel: a duplex, per-peer msg pipe. Each Channel wraps a MsgTransport instance which does the wire work: framing, encode/decode and the socket itself. The encoding is msgpack (via msgspec) and every msg is an instance of one of the runtime’s tagged-union msgspec.Struct types: the Aid handshake, Start/StartAck (RPC init), Started/Yield/Stop/Return (the ctx dialog phases), Error, etc. There is no raw-bytes mode; the msg-spec is the protocol, which is exactly what lets payloads be type-limited per-context (see pld_spec in The Context: a cross-actor task pair).
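The “msg-spec is the protocol” idea can be sketched without any third-party deps. The toy codec below swaps in stdlib dataclasses and JSON for msgspec structs and msgpack, and the single `pld` field plus the `msg_type` tag key are simplifying assumptions, not tractor's real wire layout. What it does show faithfully is the tagged-union shape: every msg on the wire names its own type and anything outside the known set is rejected at decode time.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class Started:
    pld: Any


@dataclass
class Yield:
    pld: Any


@dataclass
class Stop:
    pass


@dataclass
class Return:
    pld: Any


# the closed set of msg types this toy "spec" allows
_MSG_TYPES = {cls.__name__: cls for cls in (Started, Yield, Stop, Return)}


def encode(msg) -> bytes:
    # tag each msg with its type name so the receiver can dispatch
    body = asdict(msg)
    body['msg_type'] = type(msg).__name__
    return json.dumps(body).encode()


def decode(raw: bytes):
    body = json.loads(raw)
    tag = body.pop('msg_type')
    try:
        cls = _MSG_TYPES[tag]
    except KeyError:
        # no raw/untyped escape hatch: unknown tags are an error
        raise ValueError(f'unknown msg type: {tag!r}') from None
    return cls(**body)
```

Narrowing a payload spec per context then amounts to validating `pld` against a tighter type at the same decode step.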

Addresses come in two spellings:

  • unwrapped: the plain-tuple form you pass to user APIs — ('127.0.0.1', 1616) for tcp, or a (<filedir>, <filename>) path-pair for uds;

  • wrapped: the internal TCPAddress/UDSAddress struct types (plus libp2p-style multiaddr helpers over in tractor.discovery).

You only ever need the tuple form; the runtime wraps and unwraps at the boundaries.
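A rough sketch of what wrapping and unwrapping at the boundary might look like. `TCPAddrSketch`, `UDSAddrSketch` and `wrap()` are hypothetical names for illustration; only the tuple shapes and the `proto_key`/`sockpath` attribute names come from this page, and the real TCPAddress/UDSAddress types will differ in detail.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TCPAddrSketch:
    # hypothetical stand-in, not tractor's TCPAddress
    host: str
    port: int
    proto_key = 'tcp'  # plain class attribute, not a dataclass field

    def unwrap(self) -> tuple:
        return (self.host, self.port)


@dataclass(frozen=True)
class UDSAddrSketch:
    # hypothetical stand-in, not tractor's UDSAddress
    filedir: str
    filename: str
    proto_key = 'uds'

    @property
    def sockpath(self) -> Path:
        return Path(self.filedir) / self.filename

    def unwrap(self) -> tuple:
        return (self.filedir, self.filename)


_WRAPPERS = {'tcp': TCPAddrSketch, 'uds': UDSAddrSketch}


def wrap(proto_key: str, addr: tuple):
    # the caller names the proto explicitly in this sketch
    try:
        cls = _WRAPPERS[proto_key]
    except KeyError:
        raise ValueError(f'unknown transport proto: {proto_key!r}') from None
    return cls(*addr)
```

The round trip `wrap(proto, addr).unwrap() == addr` is the property that lets user code stay on plain tuples.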

TCP: the boring default#

The default transport ('tcp') binds each actor’s IPC server to loopback ('127.0.0.1', <random port>) unless told otherwise, and is the only choice when your tree spans hosts. Nothing exotic: trio TCP streams + length-prefixed msgpack framing.
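For readers who haven't met length-prefixed framing before, here is a minimal sketch of the technique on raw bytes. The 4-byte big-endian header is an assumption picked for illustration (the page doesn't state tractor's actual header layout), and the msgpack encoding of each payload is left out entirely.

```python
import struct

# ASSUMPTION: 4-byte big-endian unsigned length header
_HEADER = struct.Struct('>I')


def frame(payload: bytes) -> bytes:
    # prefix the payload with its byte length
    return _HEADER.pack(len(payload)) + payload


def unframe(buf: bytes) -> tuple[list[bytes], bytes]:
    '''
    Split as many complete frames as possible off `buf`;
    return them plus whatever incomplete tail remains.

    '''
    msgs: list[bytes] = []
    while len(buf) >= _HEADER.size:
        (size,) = _HEADER.unpack_from(buf)
        end = _HEADER.size + size
        if len(buf) < end:
            break  # partial frame, wait for more bytes
        msgs.append(buf[_HEADER.size:end])
        buf = buf[end:]
    return msgs, buf
```

The leftover tail matters because a stream socket delivers bytes, not msgs: one read may end halfway through a frame and the receiver has to hold the remainder until the rest shows up.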

UDS: same-host, creds included#

Pass enable_transports=['uds'] and actors instead talk over unix-domain sockets, with socket files placed in the per-user runtime dir ($XDG_RUNTIME_DIR/tractor/ on linux, the platformdirs equivalent elsewhere). Two perks over tcp on a single host:

  • no ports to fight over; addrs are just file paths,

  • the kernel snitches on your peer for free: the listening side reads the connector’s pid (plus uid/gid on linux) straight off the socket via SO_PEERCRED / LOCAL_PEERPID — no extra handshake msgs required B)
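The kernel-provided peer creds from the second perk can be poked at with nothing but the stdlib. This sketch is linux-only and assumes the usual `struct ucred` layout (pid, uid, gid as 32-bit ints); it returns `None` elsewhere and makes no claim about how tractor itself reads the option.

```python
import os
import socket
import struct
import sys

# linux `struct ucred`: pid_t pid; uid_t uid; gid_t gid;
_UCRED = struct.Struct('iII')


def peer_creds(sock: socket.socket):
    '''
    Return the (pid, uid, gid) of the peer on a connected
    unix socket, or `None` where SO_PEERCRED isn't available.

    '''
    if not sys.platform.startswith('linux'):
        return None
    raw = sock.getsockopt(
        socket.SOL_SOCKET,
        socket.SO_PEERCRED,
        _UCRED.size,
    )
    return _UCRED.unpack(raw)


def demo():
    # both ends live in this process, so the "peer" is ourselves
    if not hasattr(socket, 'AF_UNIX'):
        return None
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        return peer_creds(a)
```

On linux `demo()` reports this very process as the peer, which is a handy sanity check before trusting the value for anything like tree inspection.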

Warning

Socket-file lifetime == listening actor lifetime. On listener teardown the runtime os.unlink()s the socket file immediately, so any late connection attempt (say, a sub-actor racing to deregister with a registrar that’s already shutting down) fails with FileNotFoundError. And ofc, UDS is same-host only.

Here’s a full actor tree run entirely over uds:

examples/uds_transport_actor_tree.py#
'''
Demonstrate an actor tree which talks over unix-domain-socket
(UDS) transport instead of the default TCP: pass
`enable_transports=['uds']` when opening the root and every
subactor inherits the preference.

Every channel address is a filesystem socket path (no TCP port
in sight!) and, as a kernel-provided bonus, the peer's pid is
exchanged for free via `SO_PEERCRED`.

'''
import os

import trio
import tractor


async def report_addr() -> str:
    '''
    Return this actor's own accept (bind) addr + pid.

    '''
    actor = tractor.current_actor()
    addr: tuple = actor.accept_addr
    pid: int = os.getpid()
    return f'{actor.name}@{addr} pid={pid}'


async def main() -> None:
    async with tractor.open_nursery(
        enable_transports=['uds'],
    ) as an:
        portal = await an.start_actor(
            'uds_child',
            enable_modules=[__name__],
        )
        # the channel's remote addr is a `UDSAddress`: a
        # filesystem socket path, NOT a (host, port) pair!
        raddr = portal.chan.raddr
        assert raddr.proto_key == 'uds'
        # NOTE, `.sockpath` is the *shared listener* socket file
        # (named for the root registrar) this channel rode in
        # on, NOT a per-child path; the child-specific identity
        # we get for free is the kernel-reported peer pid (via
        # `SO_PEERCRED`).
        print(
            f'portal chan tpt proto: {raddr.proto_key!r}\n'
            f'listener sock file: {raddr.sockpath}\n'
            f'kernel-reported peer pid: {raddr.maybe_pid}\n'
        )
        # ask the child for its OWN distinct bind addr: another
        # socket-file path under the runtime dir.
        print(f'child says: {await portal.run(report_addr)}')
        await portal.cancel_actor()


if __name__ == '__main__':
    trio.run(main)

Picking a transport#

Transport choice is per-actor via the enable_transports kwarg accepted by tractor.open_root_actor() (and proxied through open_nursery() when it implicitly boots the runtime), plus per-child via ActorNursery.start_actor(enable_transports=...). Two rules the runtime enforces today:

  • exactly ONE transport per actor: multi-transport actors are on the roadmap but currently raise RuntimeError;

  • your registry_addrs protos must all be in enable_transports: mismatches fail fast with ValueError instead of (as in darker times) hanging the registrar handshake forever.
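The two rules boil down to a small pre-flight check. The function below is a hypothetical sketch of that logic, not the runtime's code: it takes the registry protos as plain strings (how the runtime derives a proto from an address is not covered here) and only mirrors the exception types named above.

```python
def check_transports(
    enable_transports: list[str],
    registry_protos: list[str],
) -> None:
    # rule 1: multi-transport actors are not supported (yet)
    if len(enable_transports) > 1:
        raise RuntimeError(
            f'only one transport per actor, got {enable_transports!r}'
        )
    # rule 2: every registry addr proto must be an enabled transport
    missing = [
        proto for proto in registry_protos
        if proto not in enable_transports
    ]
    if missing:
        raise ValueError(
            f'registry protos {missing!r} not in '
            f'enable_transports={enable_transports!r}'
        )
```

Failing at startup like this is the whole point: a uds-only actor pointed at a tcp registry addr gets an immediate error instead of a handshake that never completes.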

Spawn backends#

How does an actor actually become a process? Via a swappable spawn backend, selected with the start_method kwarg to tractor.open_root_actor():

'trio' (default)

The home-grown spawner: re-exec the child as python -m tractor._child using trio’s subprocess machinery, then bootstrap it over the first IPC exchange (the parent ships a SpawnSpec msg carrying all init state). Supported on all platforms and the most battle tested choice by far.

'mp_spawn' / 'mp_forkserver'

The stdlib multiprocessing start-methods of the same names (forkserver is posix-only). Mostly interesting for ecosystem compat and start-up-latency tuning.

'subint' (in development, py3.14+)

On the roadmap (not yet selectable via start_method on this release): run each actor as a PEP 734 sub-interpreter (concurrent.interpreters) driven on its own OS thread inside the parent process — interpreter-level shared-nothing isolation with much faster start-up. Yes, this bends the one-actor-one-process rule; the rest of the model is unchanged.

The TRACTOR_SPAWN_METHOD env-var beats any caller-passed start_method, so you can swap backends under an unmodified app:

TRACTOR_SPAWN_METHOD=mp_forkserver python my_app.py

One current limitation worth knowing: debug_mode=True (the crash-to-REPL machinery) is only supported on backends whose child-side runtime is trio-native, e.g. the default 'trio'; see “Native” multi-process debugging for the deats.

The registrar#

Discovery needs a phonebook. Every actor, as part of boot, registers its Aid and bind-addrs with the registrar: an otherwise ordinary actor (a tractor.Registrar, subtype of Actor) that keeps the name -> addrs table for the tree; on graceful exit each actor de-registers itself.

Who is the registrar? Decided at root boot, rendezvous style. tractor.open_root_actor() probes each addr in registry_addrs with a quick connect-ping, then:

  • somebody answered: this root is a plain actor; it registers with the existing registrar and binds random same-proto addrs for its own IPC server;

  • nobody answered: this root becomes the registrar and binds the registry addrs itself.

So single-program trees need zero config — the root quietly self-appoints — while multi-program setups share a registrar by pointing every program at the same registry_addrs. Pass ensure_registry=True to demand that this call create the registry; it raises if the addrs are already served.
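The probe-then-decide dance is easy to model with plain TCP sockets. Everything below is a hypothetical sketch: `probe()` and `boot_root()` are illustration-only names, the real runtime pings over its own IPC layer rather than a bare connect, and `RuntimeError` is an assumed exception type for the `ensure_registry` case.

```python
import socket


def probe(addr: tuple[str, int], timeout: float = 0.5) -> bool:
    # does anybody answer at the registry addr?
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def boot_root(
    registry_addr: tuple[str, int],
    ensure_registry: bool = False,
):
    '''
    Return ('plain', None) if a registrar already answers,
    else bind the addr and return ('registrar', listener).

    '''
    if probe(registry_addr):
        if ensure_registry:
            # ASSUMPTION: exception type chosen for the sketch
            raise RuntimeError(
                f'registry already served at {registry_addr!r}'
            )
        return 'plain', None

    # nobody home: self-appoint and bind the registry addr
    listener = socket.create_server(registry_addr)
    return 'registrar', listener
```

Calling `boot_root()` twice with the same free loopback addr yields a registrar first and a plain root second, which is the single-program versus shared-registrar split in miniature.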

The lookup APIs — tractor.find_actor(), tractor.wait_for_actor(), tractor.query_actor() and tractor.get_registry() — all consult it (after first checking already-connected peers):

examples/service_daemon_discovery.py#
'''
Demonstrate the "service daemon" pattern: a named,
long-lived actor spawned via `ActorNursery.start_actor()`
which any other task can locate through the registrar using
`tractor.find_actor()` / `tractor.wait_for_actor()` - no
spawn-portal required - and RPC into directly.

Teardown is explicit and graceful via `portal.cancel_actor()`
once the clients are done.

'''
import trio
import tractor

_quotes: dict[str, float] = {
    'btcusdt': 66_000.5,
    'ethusdt': 3_500.25,
}


async def get_quote(sym: str) -> float:
    '''
    Look up the "current" quote for a symbol.

    '''
    name: str = tractor.current_actor().name
    print(f'{name}: serving quote for {sym!r}')
    return _quotes[sym]


async def client_task() -> None:
    '''
    Locate the quote service by name and RPC it; note no
    spawn-nursery/portal reference is ever passed in here!

    '''
    # a lookup miss yields `None` (not an error).
    async with tractor.find_actor('no_such_svc') as portal:
        assert portal is None
        print('client: "no_such_svc" is not registered')
    # block until the service shows up in the registry,
    # then call into it through the delivered portal.
    async with tractor.wait_for_actor('quote_svc') as portal:
        quote: float = await portal.run(
            get_quote,
            sym='btcusdt',
        )
        print(f'client: got btcusdt quote {quote}')


async def main() -> None:
    async with tractor.open_nursery() as an:
        portal = await an.start_actor(
            'quote_svc',
            enable_modules=[__name__],
        )
        # run the client in a separate task which discovers
        # the daemon purely by its registered name.
        async with trio.open_nursery() as tn:
            tn.start_soon(client_task)
        # explicit graceful teardown of the daemon.
        print('root: cancelling quote_svc')
        await portal.cancel_actor()
    print('root: service shut down cleanly')


if __name__ == '__main__':
    trio.run(main)

If you bump into “arbiter” in old issues or posts: that’s the legacy name for the same thing, surviving in-code only as the Arbiter = Registrar class alias; all current terminology is “registrar”/“registry”. Fair warning per the README: this is still a very naive discovery sys (no re-election, no gossip protocol… yet) and a registrar is expected to outlive its registrants.

Runtime env vars#

A few env-vars let you re-tune a whole tree without touching app code; each wins over its corresponding kwarg:

| env-var | effect | vs. kwarg |
| --- | --- | --- |
| TRACTOR_LOGLEVEL | crank (or silence) console-log verbosity for every actor in the tree | beats loglevel |
| TRACTOR_SPAWN_METHOD | swap the process spawn backend | beats start_method |
| TRACTOR_ENABLE_STACKSCOPE | install the SIGUSR1 task-tree-dump handler in every actor, even outside debug_mode (see “Native” multi-process debugging) | OR’d with enable_stack_on_sig |

Spotting actors from your shell#

Every sub-actor sets an OS-level proc-title of the form _subactor[<name>@<pid>] (via setproctitle, silently skipped when not installed) so ps/htop/pstree show which actor is which at a glance. The README’s signature incantation — watch a tree build and self-destruct live:

$TERM -e watch -n 0.1 "pstree -a $$" \
    & python examples/nested_actor_tree.py \
    && kill $!

For scripting there are two stable cmdline markers:

pgrep -fa '_subactor\['     # live, titled sub-actors
pgrep -fa 'tractor._child'  # 'trio'-backend children not
                            # yet (re)titled

The title also lands in the kernel comm (truncated to ~15 bytes) which survives into zombie state — that’s what tractor’s own test-harness reapers key off. To be crystal clear about the contract though: you should never need a reaper; if you can create zombie child processes (without using a system signal) it is a bug — please report it!
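If you'd rather script against titles from Python than from pgrep, the format above is simple to build and parse. The helpers here are hypothetical and assume nothing beyond the `_subactor[<name>@<pid>]` shape and the 15-byte comm truncation mentioned on this page.

```python
import re

_TITLE = re.compile(r'^_subactor\[(?P<name>.+)@(?P<pid>\d+)\]$')


def make_title(name: str, pid: int) -> str:
    return f'_subactor[{name}@{pid}]'


def parse_title(title: str):
    # return (name, pid) for a titled sub-actor, else None
    match = _TITLE.match(title)
    if match is None:
        return None
    return match['name'], int(match['pid'])


def comm_of(title: str) -> str:
    # the kernel comm keeps only the first 15 bytes
    return title.encode()[:15].decode(errors='ignore')
```

The truncation means comm alone usually can't recover the full actor name, only the `_subactor[` prefix plus a few leading chars, so match on the prefix there and use the full cmdline when you need the name.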

Logging#

The runtime logs through a thin adapter over stdlib logging that stamps every record with actor + task info. Two calls get you going:

from tractor.log import get_console_log, get_logger

log = get_logger(__name__)  # actor/task-aware sub-logger
get_console_log('info')     # attach console handler @ level

(or just pass loglevel='info' to tractor.open_root_actor() and the console handler comes up with the runtime).

tractor adds custom levels — and matching logger methods — that slot between the stdlib ones so you can dial in which runtime subsystem you want to hear from: .transport() (5), .runtime() (15), .devx() (17), .cancel() (22), plus a PDB (500) level for debugger chatter. E.g. loglevel='cancel' plays the whole cancellation chorus while staying quiet about transport-layer noise. Beyond that tractor isn’t opinionated about how you consume logs: it’s all stdlib logging underneath.
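Since it is all stdlib logging underneath, the way a custom level slots between the builtin ones can be shown with stdlib calls alone. This sketch registers two of the numeric levels listed above by hand; it does not use tractor's adapter, and the logger name is made up for the demo.

```python
import io
import logging

# numeric values taken from the list above
RUNTIME = 15  # between DEBUG (10) and INFO (20)
CANCEL = 22   # between INFO (20) and WARNING (30)
logging.addLevelName(RUNTIME, 'RUNTIME')
logging.addLevelName(CANCEL, 'CANCEL')


def demo() -> list[str]:
    buf = io.StringIO()
    handler = logging.StreamHandler(buf)
    handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

    log = logging.getLogger('levels_sketch')  # hypothetical name
    log.propagate = False
    log.addHandler(handler)
    log.setLevel(CANCEL)  # roughly what loglevel='cancel' implies

    log.log(RUNTIME, 'msg loop tick')        # 15 < 22, dropped
    log.info('plain info')                   # 20 < 22, dropped
    log.log(CANCEL, 'cancelling subactor')   # emitted
    log.warning('something looks off')       # 30 > 22, emitted

    log.removeHandler(handler)
    return buf.getvalue().splitlines()
```

Because levels are just integers with a threshold, picking 'cancel' silences everything numerically below 22 (transport, runtime, devx and plain info included) while still passing warnings and errors.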

Where to next?#

See also