“Native” multi-process debugging#
tractor ships the thing every multiprocessing user has
wished for and quietly assumed was impossible: a multi-process
debugger that just works.
Drop await tractor.pause() — or, with greenback installed,
a plain builtin breakpoint() — anywhere in any actor: the
root, a child, a grandchild, a sync helper function, even an
asyncio task inside an “infected” actor. A full-featured
pdbp REPL opens in that process, with syntax-highlighted
source listings, tab completion and sticky mode, attached to your
one terminal.
Under the hood every REPL entry acquires a tree-global tty mutex
via an IPC request to the root actor, so prompts from concurrent
pauses and crashes never interleave. ctrl-c is shielded while
any REPL is live, so a stray SIGINT can’t vaporize the tree
out from under you. And in debug mode any uncaught error drops
you into a crash REPL first in the failing child, then again at
each parent as the boxed RemoteActorError climbs
the supervision tree.
No remote-pdb sockets, no set_trace() port juggling, no
ptrace attach dance: the debugger semantics you already know,
transparently extended across an entire process tree. Because
tractor is a structured concurrency (SC) runtime, the
debugger composes with supervision instead of fighting it — quit
a REPL and errors keep propagating exactly like trio taught
you, ending in clean, zombie-free teardown.
We’re pretty sure it’s the (first ever?) “native” debugging UX for multi-process Python B)
Enabling debug mode#
Pass debug_mode=True to your runtime entrypoint, either
tractor.open_nursery() (which forwards it to the implicitly
opened root actor) or tractor.open_root_actor() directly:
async with tractor.open_nursery(
    debug_mode=True,  # arm the whole actor tree
) as an:
    ...
This arms the debug machinery tree-wide:
crash handling is enabled in every actor: uncaught errors enter a REPL before they propagate,
the internal tty-lock module is auto-exposed over RPC to every subactor (this is what makes the one-terminal handoff work),
console logging is bumped to include
PDB-level status msgs so you can see REPL acquire/release events as they happen.
You can instead flip it on for just one child, letting its siblings crash-and-burn the normal way:
portal = await an.start_actor(
    'sketchy_worker',
    debug_mode=True,  # OR-ed with the tree-wide flag
)
See examples/debugging/per_actor_debug.py for a runnable
proof of the selective style.
Note
Debug mode requires the child-side runtime to be
trio-native so that the tty-lock IPC dialog works; it’s
currently supported on the 'trio' (default) and
'main_thread_forkserver' spawn backends and raises
RuntimeError for any other start_method.
Your first pause point#
tractor.pause() is the SC-aware, multi-process spelling of
the stdlib’s breakpoint(). In the root actor it looks almost
boring:
import trio
import tractor


async def main():
    async with tractor.open_root_actor(
        debug_mode=True,
    ):
        await trio.sleep(0.1)
        await tractor.pause()
        await trio.sleep(0.1)


if __name__ == '__main__':
    trio.run(main)
Run it and you get a (Pdb+) prompt parked on the pause()
line; type c (continue) and the program finishes normally.
The exact same call works from any subactor, no matter how deep in the tree:
import trio
import tractor


async def breakpoint_forever():
    '''
    Indefinitely re-enter debugger in child actor.
    '''
    while True:
        await trio.sleep(0.1)
        await tractor.pause()


async def main():
    async with tractor.open_nursery(
        debug_mode=True,
        loglevel='cancel',
    ) as n:
        portal = await n.run_in_actor(
            breakpoint_forever,
        )
        await portal.wait_for_result()


if __name__ == '__main__':
    trio.run(main)
Each loop iteration the child actor requests the terminal from
the root over IPC, REPLs you, then releases it on c. Pause
points are re-entrant-safe: repeat calls from the same task are
no-op’d and other local tasks queue politely for the REPL.
When you get bored, type q (quit): the resulting
bdb.BdbQuit is boxed and shipped to the parent like any other
remote error XD — causality is preserved even for your debugging
mistakes.
Crash REPLs: errors climb the tree#
Pause points are only half the story. With debug mode armed, any uncaught error anywhere in the tree triggers what we call crash handling mode:
import trio
import tractor


async def name_error():
    getattr(doggypants)  # noqa (on purpose)


async def main():
    async with tractor.open_nursery(
        debug_mode=True,
    ) as an:
        # TODO: ideally the REPL arrives at this frame in the parent,
        # ABOVE the @api_frame of `Portal.run_in_actor()` (which
        # should eventually not even be a portal method ... XD)
        # await tractor.pause()
        p: tractor.Portal = await an.run_in_actor(name_error)

        # with this style, should raise on this line
        await p.wait_for_result()

        # with this alt style should raise at `open_nursery()`
        # return await p.wait_for_result()


if __name__ == '__main__':
    trio.run(main)
What happens when the child hits that (very intentional)
NameError:
a REPL opens in the crashed child first — you inspect the raising frame, its locals, the works, right inside the failed process,
when you quit, the error is boxed into a RemoteActorError and relayed to the parent,
the parent (here the root) gets its own crash REPL with the rendered remote traceback,
quit again and the nursery tears the tree down — errors keep propagating per SC rules, no zombies left behind.
You debug the failure at every hop of the supervision tree, which for multi-hop trees means you can chase an error from the leaf that raised it all the way up to the root that supervises it.
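To make the boxing step concrete, here is a minimal stdlib-only sketch of the idea. It is not tractor's implementation: `BoxedRemoteError` and `run_in_child()` are hypothetical stand-ins, and a plain function call replaces the real child process and IPC hop.

```python
import traceback


class BoxedRemoteError(Exception):
    """Hypothetical stand-in for a relayed, boxed remote error."""

    def __init__(self, src_actor: str, exc: BaseException):
        self.src_actor = src_actor
        # the original exception type survives the hop
        self.boxed_type = type(exc)
        # render the traceback to text, since a live one can't cross processes
        self.tb_str = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        super().__init__(f"{src_actor} raised {self.boxed_type.__name__}")


def name_error():
    return doggypants  # noqa (undefined on purpose)


def run_in_child(actor_name: str, fn):
    """Pretend `fn` ran in another process: box whatever it raises."""
    try:
        return fn()
    except Exception as exc:
        raise BoxedRemoteError(actor_name, exc) from None


try:
    run_in_child("name_error", name_error)
except BoxedRemoteError as rae:
    caught = rae  # keep a reference; `rae` is unbound after the block
```

The parent never sees the live `NameError`, only the box, yet `caught.boxed_type` and the rendered `caught.tb_str` still tell it what went wrong and where.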
Need to skip REPL entry for certain exceptions? Pass a predicate
via open_root_actor(debug_filter=...); by default
cancellation-only exception (groups) don’t engage the REPL.
One terminal, many actors#
So how do N processes share one tty without garbling it? The root actor owns stdio for the whole tree and guards it with a FIFO mutex; every subactor REPL entry is an IPC lock request to the root. Exactly one actor-task in the entire tree can own the terminal at a time, so prompts never interleave — ever.
Figure: Every REPL entry serializes through the root actor’s tty lock; continue-ing one REPL hands the terminal to the next waiter, FIFO style.
The runtime’s teardown paths cooperate too: a cancelling parent always waits for any live REPL to release before reaping children, so the debugger never gets yanked out from under you mid-keystroke.
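The FIFO hand-off can be sketched in a few lines. This is only an in-process analogy: it assumes `asyncio.Lock` (which wakes waiters in request order) as a stand-in for the root's mutex, whereas the real lock lives in the root actor and is requested over IPC. The names `repl_session` and `demo` are made up for the example.

```python
import asyncio


async def repl_session(name: str, tty_lock: asyncio.Lock, log: list) -> None:
    # every "actor" must own the lock before touching the terminal
    async with tty_lock:
        log.append(f"{name}: acquired tty")
        await asyncio.sleep(0)  # stand-in for an interactive REPL session
        log.append(f"{name}: released tty")


async def demo() -> list:
    tty_lock = asyncio.Lock()  # waiters are woken in FIFO order
    log: list = []
    await asyncio.gather(
        repl_session("bp_forever", tty_lock, log),
        repl_session("name_error", tty_lock, log),
        repl_session("root", tty_lock, log),
    )
    return log


log = asyncio.run(demo())
```

Each session's acquire/release pair stays adjacent in `log`, and sessions complete in the order they asked for the lock, which is the property that keeps prompts from interleaving.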
Here’s the showpiece: one daemon child re-entering
tractor.pause() forever inside a stream, while its sibling
repeatedly raises a NameError:
import tractor
import trio


async def breakpoint_forever():
    "Indefinitely re-enter debugger in child actor."
    try:
        while True:
            yield 'yo'
            await tractor.pause()
    except BaseException:
        tractor.log.get_console_log().exception(
            'Cancelled while trying to enter pause point!'
        )
        raise


async def name_error():
    "Raise a ``NameError``"
    getattr(doggypants)  # noqa


async def main():
    '''
    Test breakpoint in a streaming actor.
    '''
    async with tractor.open_nursery(
        debug_mode=True,
    ) as an:
        p0 = await an.start_actor('bp_forever', enable_modules=[__name__])
        p1 = await an.start_actor('name_error', enable_modules=[__name__])

        # retrieve results
        async with p0.open_stream_from(breakpoint_forever) as stream:

            # triggers the first name error
            try:
                await p1.run(name_error)
            except tractor.RemoteActorError as rae:
                assert rae.boxed_type is NameError

            async for i in stream:
                # a second time try the failing subactor and this time
                # let error propagate up to the parent/nursery.
                await p1.run(name_error)


if __name__ == '__main__':
    trio.run(main)
What you’ll actually see#
Running it looks roughly like this (uids, tracebacks and source listings elided; REPL order can vary with who wins the lock race):
$ python examples/debugging/multi_daemon_subactors.py
Opening a pdb REPL in paused actor: ('bp_forever', '<uuid>')
<highlighted source around the `await tractor.pause()` line>
(Pdb+) c
Opening a pdb REPL in crashed actor: ('name_error', '<uuid>')
<live traceback: NameError: name 'doggypants' is not defined>
(Pdb+) q
Opening a pdb REPL in crashed actor: ('root', '<uuid>')
<boxed RemoteActorError traceback relayed from 'name_error'>
(Pdb+) q
Two (then three) processes, one terminal, zero confusion:
c-ing out of the paused daemon’s REPL releases the tty lock,
which immediately hands the prompt to the crashed sibling; quit
that and the error propagates as a fully-rendered
RemoteActorError to the parent where one final
crash REPL catches it before clean, zombie-free teardown.
For maximum drama run
multi_nested_subactors_error_up_through_nurseries.py (under
examples/debugging/) which pulls the same trick across a
three-deep process tree — the tty lock keeps every prompt
orderly the whole way up.
Post-mortem, on demand#
Crash handling is automatic, but you can also enter a REPL on
a live exception manually with tractor.post_mortem() —
the actor-aware equivalent of pdb.post_mortem() — from inside
any except block in any actor (kwargs: tb= for an
explicit traceback, plus shield= and hide_tb=):
import trio
import tractor


@tractor.context
async def name_error(
    ctx: tractor.Context,
):
    '''
    Raise a `NameError`, catch it and enter `.post_mortem()`, then
    expect the `._rpc._invoke()` crash handler to also engage.
    '''
    try:
        getattr(doggypants)  # noqa (on purpose)
    except NameError:
        await tractor.post_mortem()
        raise


async def main():
    '''
    Test 3 `PdbREPL` entries:
    - one in the child due to manual `.post_mortem()`,
    - another in the child due to runtime RPC crash handling.
    - final one here in parent from the RAE.
    '''
    # XXX NOTE: ideally the REPL arrives at this frame in the parent
    # ONE UP FROM the inner ctx block below!
    async with tractor.open_nursery(
        debug_mode=True,
        # loglevel='cancel',
    ) as an:
        p: tractor.Portal = await an.start_actor(
            'child',
            enable_modules=[__name__],
        )

        # XXX should raise `RemoteActorError[NameError]`
        # AND be the active frame when REPL enters!
        try:
            async with p.open_context(name_error) as (ctx, first):
                assert first
        except tractor.RemoteActorError as rae:
            assert rae.boxed_type is NameError

            # manually handle in root's parent task
            await tractor.post_mortem()
            raise
        else:
            raise RuntimeError('IPC ctx should have remote errored!?')


if __name__ == '__main__':
    trio.run(main)
This example demos three REPL entries from one error:
the child’s manual post_mortem() inside its except,
the runtime’s automatic crash handler in the same child once the error re-raises out of the RPC task,
a manual post_mortem() in the parent on the received RemoteActorError, whose .boxed_type faithfully reports the original NameError.
Pausing from sync code#
No await? No problem. tractor.pause_from_sync() brings
the same tree-aware REPL to plain synchronous functions — handy
when the suspect code is three helpers deep and decidedly not
async.
It’s powered by greenback, which is optional, so you need to:
install it (it ships in tractor’s sync_pause dependency group),
enable it at runtime entry:
async with tractor.open_nursery(
    debug_mode=True,
    maybe_enable_greenback=True,
) as an:
    ...
With that armed, sync code can pause from three different caller
environments: the main trio thread, trio.to_thread bg
threads, and (see the next section) asyncio tasks in infected
actors. The greenback “portal” hops back into the trio loop
to do the lock/REPL dance on your behalf:
# NOTE: excerpt from a larger script; `disable_pdbp_color()` is
# a helper that is not shown here.
import time

import tractor


def sync_pause(
    use_builtin: bool = False,
    error: bool = False,
    hide_tb: bool = True,
    pre_sleep: float|None = None,
):
    if pre_sleep:
        time.sleep(pre_sleep)

    if use_builtin:
        breakpoint(hide_tb=hide_tb)

    else:
        # TODO: maybe for testing some kind of cm style interface
        # where the `._set_trace()` call doesn't happen until block
        # exit?
        # assert get_lock().ctx_in_debug is None
        # assert get_debug_req().repl is None
        tractor.pause_from_sync()
        # assert get_debug_req().repl is None

    if error:
        raise RuntimeError('yoyo sync code error')


@tractor.context
async def start_n_sync_pause(
    ctx: tractor.Context,
):
    actor: tractor.Actor = tractor.current_actor()
    disable_pdbp_color()

    # sync to parent-side task
    await ctx.started()

    print(f'Entering `sync_pause()` in subactor: {actor.uid}\n')
    sync_pause()
    print(f'Exited `sync_pause()` in subactor: {actor.uid}\n')
The full script also exercises the hairier root-actor bg-thread cases (and documents their remaining sharp edges) if you want the deep lore.
The builtin breakpoint() override#
When debug mode boots with greenback available, tractor wires
Python’s PEP 553 hook so the builtin breakpoint() becomes
the actor-aware sync pause, by exporting:
PYTHONBREAKPOINT=tractor.devx.debug._sync_pause_from_builtin
That means third-party and legacy code containing bare
breakpoint() calls debugs correctly inside your actor tree
with zero edits (the override even forwards kwargs like
hide_tb to the underlying pause machinery, as shown in the
excerpt above).
Warning
Without greenback (or with maybe_enable_greenback=False,
the default), debug_mode=True instead blocks the builtin
breakpoint(): sys.breakpointhook is swapped for a
raiser and PYTHONBREAKPOINT=0 is set. A naive
breakpoint() from some random process would clobber the
shared tty, so we’d rather hand you a loud RuntimeError
with install instructions.
Both the hook and the env var are restored to their prior values
on runtime exit — see
examples/debugging/restore_builtin_breakpoint.py for the
proof.
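For readers unfamiliar with PEP 553, the mechanism behind both behaviours is small enough to sketch with the stdlib alone: the builtin `breakpoint()` simply calls whatever `sys.breakpointhook` currently points at, forwarding its arguments. The hook functions below are hypothetical stand-ins, not tractor's own; they only show the override, the "raiser" and the restore-on-exit pattern.

```python
import sys

calls = []


def actor_aware_hook(*args, **kwargs):
    # hypothetical stand-in for a tree-aware sync pause
    calls.append(kwargs)


def blocking_hook(*args, **kwargs):
    # hypothetical "raiser" used when the sync pause machinery is unavailable
    raise RuntimeError("breakpoint() is blocked in this process")


prior_hook = sys.breakpointhook
try:
    # override: the builtin now routes to our hook, kwargs included
    sys.breakpointhook = actor_aware_hook
    breakpoint(hide_tb=True)

    # block: the builtin now fails loudly instead of grabbing the tty
    sys.breakpointhook = blocking_hook
    try:
        breakpoint()
        blocked = False
    except RuntimeError:
        blocked = True
finally:
    # always put back whatever was installed before
    sys.breakpointhook = prior_hook
```

Note that the `PYTHONBREAKPOINT` environment variable is only consulted by the default hook, which is presumably why both the hook and the env var are set, so that any code falling back to the default behaves consistently.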
Breakpoints inside asyncio tasks#
Yes, even “infected asyncio” actors get the goods. Spawn a
child with infect_asyncio=True (trio runs as a guest on
the asyncio loop inside it) and, with debug mode + greenback
armed, every asyncio task started via tractor.to_asyncio
is automatically granted a greenback portal — so a plain builtin
breakpoint() (or tractor.pause_from_sync()) inside an
asyncio.Task joins the same single-terminal, tree-locked REPL
flow:
'''
Examples of using the builtin `breakpoint()` from an `asyncio.Task`
running in a subactor spawned with `infect_asyncio=True`.
'''
import asyncio

import trio
import tractor
from tractor import (
    to_asyncio,
    Portal,
)


async def aio_sleep_forever():
    await asyncio.sleep(float('inf'))


async def bp_then_error(
    chan: to_asyncio.LinkedTaskChannel,
    raise_after_bp: bool = True,
) -> None:
    # sync with `trio`-side (caller) task
    chan.started_nowait('start')

    # NOTE: what happens here inside the hook needs some refinement..
    # => seems like it's still `.debug._set_trace()` but
    #    we set `Lock.local_task_in_debug = 'sync'`, we probably want
    #    some further, at least, meta-data about the task/actor in debug
    #    in terms of making it clear it's `asyncio` mucking about.
    breakpoint()  # asyncio-side

    # short checkpoint / delay
    await asyncio.sleep(0.5)  # asyncio-side

    if raise_after_bp:
        raise ValueError('asyncio side error!')

    # TODO: test case with this so that it gets cancelled?
    else:
        # XXX NOTE: this is required in order to get the SIGINT-ignored
        # hang case documented in the module script section!
        await aio_sleep_forever()


@tractor.context
async def trio_ctx(
    ctx: tractor.Context,
    bp_before_started: bool = False,
):
    # this will block until the ``asyncio`` task sends a "first"
    # message, see first line in above func.
    async with (
        to_asyncio.open_channel_from(
            bp_then_error,
            # raise_after_bp=not bp_before_started,
        ) as (chan, first),
        trio.open_nursery() as tn,
    ):
        assert first == 'start'

        if bp_before_started:
            await tractor.pause()  # trio-side

        await ctx.started(first)  # trio-side

        tn.start_soon(
            to_asyncio.run_task,
            aio_sleep_forever,
        )
        await trio.sleep_forever()


async def main(
    bps_all_over: bool = True,
    # TODO, WHICH OF THESE HAZ BUGZ?
    cancel_from_root: bool = False,
    err_from_root: bool = False,
) -> None:
    async with tractor.open_nursery(
        debug_mode=True,
        maybe_enable_greenback=True,
        # loglevel='devx',
    ) as an:
        ptl: Portal = await an.start_actor(
            'aio_daemon',
            enable_modules=[__name__],
            infect_asyncio=True,
            debug_mode=True,
            # loglevel='cancel',
        )
        async with ptl.open_context(
            trio_ctx,
            bp_before_started=bps_all_over,
        ) as (ctx, first):
            assert first == 'start'

            # pause in parent to ensure no cross-actor
            # locking problems exist!
            await tractor.pause()  # trio-root

            if cancel_from_root:
                await ctx.cancel()

            if err_from_root:
                assert 0
            else:
                await trio.sleep_forever()

        # TODO: case where we cancel from trio-side while asyncio task
        # has debugger lock?
        # await ptl.cancel_actor()


if __name__ == '__main__':
    # works fine B)
    trio.run(main)

    # will hang and ignores SIGINT !!
    # NOTE: you'll need to send a SIGQUIT (via ctl-\) to kill it
    # manually..
    # trio.run(main, True)
Note the interleave: a breakpoint() on the asyncio side,
tractor.pause() on the trio side of the same actor, and
another pause up in the root — all serialized through the one tty
lock with no cross-actor (or cross-event-loop!) clobbering.
One catch: asyncio tasks spawned out-of-band — i.e. not via
tractor.to_asyncio, typically by some third-party aio lib —
have no portal bestowed, so a sync pause from one raises a loud
RuntimeError telling you to greenback.ensure_portal()
first. See the caveats below.
Teardown debugging: the shielded pause#
Cancellation is trio’s bread and butter, which raises an
awkward question: how do you REPL inside an already-cancelled
scope, say while debugging some teardown sequence? A bare
pause() would itself be cancelled at its next checkpoint.
The answer is await tractor.pause(shield=True), which wraps
the lock acquisition and REPL session in a shielded cancel scope
(post_mortem(shield=True) works the same way):
import trio
import tractor


async def cancellable_pause_loop(
    task_status: trio.TaskStatus[trio.CancelScope] = trio.TASK_STATUS_IGNORED
):
    with trio.CancelScope() as cs:
        task_status.started(cs)
        for _ in range(3):
            try:
                # ON first entry, there is no level triggered
                # cancellation yet, so this cp does a parent task
                # ctx-switch so that this scope raises for the NEXT
                # checkpoint we hit.
                await trio.lowlevel.checkpoint()
                await tractor.pause()

                cs.cancel()

                # parent should have called `cs.cancel()` by now
                await trio.lowlevel.checkpoint()

            except trio.Cancelled:
                print('INSIDE SHIELDED PAUSE')
                await tractor.pause(shield=True)
        else:
            # should raise it again, bubbling up to parent
            print('BUBBLING trio.Cancelled to parent task-nursery')
            await trio.lowlevel.checkpoint()


async def pm_on_cancelled():
    async with trio.open_nursery() as tn:
        tn.cancel_scope.cancel()
        try:
            await trio.sleep_forever()
        except trio.Cancelled:
            # should also raise `Cancelled` since
            # we didn't pass `shield=True`.
            try:
                await tractor.post_mortem(hide_tb=False)
            except trio.Cancelled as taskc:
                # should enter just fine, in fact it should
                # be debugging the internals of the previous
                # sin-shield call above Bo
                await tractor.post_mortem(
                    hide_tb=False,
                    shield=True,
                )
                raise taskc
        else:
            raise RuntimeError('Dint cancel as expected!?')


async def cancelled_before_pause(
):
    '''
    Verify that using a shielded pause works despite surrounding
    cancellation called state in the calling task.
    '''
    async with trio.open_nursery() as tn:
        cs: trio.CancelScope = await tn.start(cancellable_pause_loop)
        await trio.sleep(0.1)

    assert cs.cancelled_caught

    await pm_on_cancelled()


async def main():
    async with tractor.open_nursery(
        debug_mode=True,
    ) as n:
        portal: tractor.Portal = await n.run_in_actor(
            cancelled_before_pause,
        )
        await portal.wait_for_result()

        # ensure the same works in the root actor!
        await pm_on_cancelled()


if __name__ == '__main__':
    trio.run(main)
If you forget, tractor has your back: an unshielded
pause() from a cancelled scope fails fast with a hint
suggesting await tractor.pause(shield=True) instead of
silently never REPL-ing.
Go ahead, mash ctrl-c#
While any REPL is live the runtime installs a custom SIGINT
handler tree-wide so that a reflexive ctrl-c (or five) can’t
nuke your debug session:
the actor that owns the REPL ignores the interrupt and simply re-flushes the prompt — keep mashing, it’s fine,
the root actor ignores SIGINT while a still-IPC-connected child holds the tty lock, so the supervisor won’t tear down the tree out from under the debugger,
if the lock state has gone stale (the locking child died or its IPC channel dropped) the root cancels the stale lock scope and restores trio’s default handler, so ctrl-c works again exactly when it should.
The handler is uninstalled and trio’s own SIGINT
semantics restored every time a REPL releases (on continue /
quit).
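The install-then-restore pattern can be illustrated with the stdlib `signal` module alone. This is a single-process sketch under stated assumptions, not the runtime's handler: `shield_sigint()` is a hypothetical helper, and `signal.raise_signal()` simulates the keypress. It must run in the main thread, as Python only allows signal handlers to be set there.

```python
import contextlib
import signal


@contextlib.contextmanager
def shield_sigint(on_interrupt):
    """Swap in a non-fatal SIGINT handler; restore the prior one on exit."""
    prior = signal.signal(
        signal.SIGINT,
        lambda signum, frame: on_interrupt(),
    )
    try:
        yield
    finally:
        signal.signal(signal.SIGINT, prior)


ignored = []
handler_before = signal.getsignal(signal.SIGINT)

with shield_sigint(lambda: ignored.append("re-flush prompt")):
    # simulate a reflexive ctrl-c while the "REPL" is live:
    # no KeyboardInterrupt is raised, the callback runs instead
    signal.raise_signal(signal.SIGINT)

handler_after = signal.getsignal(signal.SIGINT)
```

Once the block exits, the previous handler is back in place, so a later ctrl-c behaves exactly as it did before the session started.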
Live task-tree dumps#
Sometimes there’s no error to catch — the tree is just hung and
you want to know where. For that tractor integrates
stackscope: send a signal, get a full trio task-tree dump
from every actor in the tree.
Enable it any of three ways:
open_root_actor(enable_stack_on_sig=True) (or via open_nursery() which forwards it),
set TRACTOR_ENABLE_STACKSCOPE=1 in the env: it’s inherited through the process tree so every (sub)actor arms the handler at boot,
call tractor.devx.enable_stack_on_sig() directly.
It’s intentionally not gated on debug_mode so you can leave
it armed in plain runs. Then, when the hang strikes, signal the
tree with SIGUSR1.
Tip
No need to hunt down pids — pattern-match the original cmdline
with pkill:
$ pkill --signal SIGUSR1 -f "python example_script.py"
Each actor dumps its entire trio task tree (full nursery
recursion via stackscope.extract()) to its tty and tees it
to /tmp/tractor-stackscope-<pid>.log — so the trace survives
even under captured-stdio harnesses — then relays the signal on
to its children, parent-before-child, until the whole tree has
reported in.
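The signal-then-dump flow can be approximated with the stdlib to get a feel for it. The sketch below is a loose analogue under clear assumptions: it dumps per-thread Python stacks via `sys._current_frames()` rather than a trio task tree, it does not relay the signal to children, and `LOG_PATH` and `dump_stacks()` are made-up names. `SIGUSR1` exists on POSIX platforms only.

```python
import os
import signal
import sys
import tempfile
import traceback

# hypothetical log location for this sketch (not the runtime's path)
LOG_PATH = os.path.join(
    tempfile.gettempdir(),
    f"stack-dump-sketch-{os.getpid()}.log",
)


def dump_stacks(signum, frame):
    """Render every thread's current stack, print it and tee it to a file."""
    parts = []
    for thread_id, top_frame in sys._current_frames().items():
        parts.append(f"--- thread {thread_id} ---\n")
        parts.extend(traceback.format_stack(top_frame))
    report = "".join(parts)
    sys.stderr.write(report)
    with open(LOG_PATH, "w") as log_file:
        log_file.write(report)


prior_handler = signal.signal(signal.SIGUSR1, dump_stacks)
signal.raise_signal(signal.SIGUSR1)  # stand-in for an external `pkill`
signal.signal(signal.SIGUSR1, prior_handler)

with open(LOG_PATH) as log_file:
    dump = log_file.read()
```

Writing the report to a file as well as the terminal is what lets the trace survive when stdio is captured or the tty is otherwise unavailable.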
Try it yourself with the demo script, which deliberately hangs a subactor in a shielded sleep:
'''
Verify we can dump a `stackscope` tree on a hang.
'''
import os
import platform
import signal

import trio
import tractor


@tractor.context
async def start_n_shield_hang(
    ctx: tractor.Context,
):
    # actor: tractor.Actor = tractor.current_actor()

    # sync to parent-side task
    await ctx.started(os.getpid())

    print('Entering shield sleep..')
    with trio.CancelScope(shield=True):
        await trio.sleep_forever()  # in subactor

    # XXX NOTE ^^^ since this shields, we expect
    # the zombie reaper (aka T800) to engage on
    # SIGINT from the user and eventually hard-kill
    # this subprocess!


async def main(
    from_test: bool = False,
) -> None:
    if platform.system() != 'Darwin':
        tpt = 'uds'
    else:
        # XXX, precisely we can't use pytest's tmp-path generation
        # for tests.. apparently because:
        #
        # > The OSError: AF_UNIX path too long in macOS Python occurs
        # > because the path to the Unix domain socket exceeds the
        # > operating system's maximum path length limit (around 104
        #
        # WHICH IS just, wtf hilarious XD
        tpt = 'tcp'

    async with (
        tractor.open_nursery(
            debug_mode=True,
            enable_stack_on_sig=True,
            loglevel='devx',  # XXX REQUIRED log level!
            enable_transports=[tpt],
            # maybe_enable_greenback=True,
            # ^TODO? maybe a "smarter" way todo all this is how
            # `modden` does with a rtv serialized through the osenv?
        ) as an,
    ):
        ptl: tractor.Portal = await an.start_actor(
            'hanger',
            enable_modules=[__name__],
            debug_mode=True,
        )
        async with ptl.open_context(
            start_n_shield_hang,
        ) as (ctx, cpid):
            _, proc, _ = an._children[
                ptl.chan.aid.uid
            ]
            assert cpid == proc.pid

            print(
                'Yo my child hanging..?\n'
                # "i'm a user who wants to see a `stackscope` tree!\n"
            )

            # XXX simulate the wrapping test's "user actions"
            # (i.e. if a human didn't run this manually but wants to
            # know what they should do to reproduce test behaviour)
            if from_test:
                print(
                    f'Sending SIGUSR1 to {cpid!r}!\n'
                )
                os.kill(
                    cpid,
                    signal.SIGUSR1,
                )

                # simulate user cancelling program
                await trio.sleep(0.5)
                os.kill(
                    os.getpid(),
                    signal.SIGINT,
                )
            else:
                # actually let user send the ctl-c
                await trio.sleep_forever()  # in root


if __name__ == '__main__':
    trio.run(main)
(That trio.CancelScope(shield=True) hang also shows off the
zombie reaper: ctrl-c the root and the un-cancellable child
still gets hard-reaped — if you can create a zombie it is a
bug.)
Crash handling for sync and CLI code#
All of the above rides on the actor runtime, but crashes don’t
politely wait for trio.run(). For plain sync code — think
typer/click CLI endpoints, config parsing, anything
pre-runtime — there’s a sync context manager that wraps the same
pdbp post-mortem UX:
from tractor.devx import open_crash_handler


def main():  # any sync code, no runtime required
    with open_crash_handler() as boxed:
        run_my_cli_thing()
By default any BaseException (minus an ignore set
defaulting to KeyboardInterrupt and trio.Cancelled)
enters the REPL then re-raises on exit; pass
raise_on_exit=False to suppress instead and introspect the
boxed.value afterward. The catch/ignore sets and a
repl_fixture are all tweakable.
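The semantics described above (catch, ignore set, REPL entry, optional suppression, boxed value) fit in a short stdlib sketch. It is a hypothetical re-creation for illustration, not the library's code: `crash_handler_sketch()` and its `enter_repl` parameter are invented names, the ignore set holds only `KeyboardInterrupt` to avoid a trio dependency, and the demo injects a recorder instead of opening an interactive `pdb` session.

```python
import contextlib
import pdb
from types import SimpleNamespace


@contextlib.contextmanager
def crash_handler_sketch(
    ignore=(KeyboardInterrupt,),
    raise_on_exit=True,
    # default: open a real post-mortem REPL on the crash traceback
    enter_repl=lambda exc: pdb.post_mortem(exc.__traceback__),
):
    boxed = SimpleNamespace(value=None)
    try:
        yield boxed
    except BaseException as exc:
        if isinstance(exc, tuple(ignore)):
            raise  # ignored types pass straight through, no REPL
        boxed.value = exc
        enter_repl(exc)
        if raise_on_exit:
            raise
        # otherwise fall through: the exception is suppressed


seen = []
with crash_handler_sketch(
    raise_on_exit=False,
    enter_repl=seen.append,  # record instead of blocking on a prompt
) as boxed:
    raise ValueError("bad config")
```

With `raise_on_exit=False` execution continues after the `with` block and the failure is available as `boxed.value` for later inspection.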
For the classic --pdb CLI-flag pattern use the conditional
variant:
from tractor.devx import maybe_open_crash_handler


@app.command()  # a `typer` (or `click`) endpoint
def cmd(pdb: bool = False):
    with maybe_open_crash_handler(pdb=pdb):
        ...
REPL niceties and hooks#
Every REPL in this guide is a pdbp instance (the maintained
fork-and-fix of pdb++) pre-configured by tractor:
pygments syntax highlighting in listings and tracebacks,
tab completion, including an automatic fixup for libedit-compiled CPythons (e.g. uv-distributed pythons),
sticky mode available via the sticky command (off by default),
no long-line truncation (terminal resizes behave),
the (Pdb+) prompt, ll, hidden-frames support and the rest of the pdb++ goodies you may already know.
Internal runtime frames are traceback-hidden so the REPL lands
exactly on your pause()-call or crash frame, never on
tractor guts.
Finally, if your app owns the terminal (TUIs, fullscreen
dashboards) pass repl_fixture=<your ctx mngr> to pause(),
post_mortem() or open_crash_handler(): it’s entered just
before the REPL engages (return False to skip entry entirely)
and exited on release — perfect for suspending and restoring your
screen around a debug session.
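A fixture of that shape might look roughly like the following. Only the enter-before, `False`-to-skip and exit-on-release behaviour comes from the description above; `suspend_tui()`, `engage_repl()` and the event strings are hypothetical, and the tiny driver merely imitates how a caller could honour the protocol.

```python
import contextlib

events = []


@contextlib.contextmanager
def suspend_tui(allow_repl: bool = True):
    """Hypothetical fixture: hide the app's screen around a debug session."""
    events.append("tui suspended")
    try:
        yield allow_repl  # yielding False asks the caller to skip the REPL
    finally:
        events.append("tui restored")


def engage_repl(fixture, repl) -> bool:
    """Hypothetical driver imitating a caller that honours the fixture."""
    with fixture as enter:
        if enter is False:
            return False
        repl()
        return True


entered = engage_repl(suspend_tui(), lambda: events.append("repl session"))
first_run = list(events)

events.clear()
skipped = engage_repl(
    suspend_tui(allow_repl=False),
    lambda: events.append("repl session"),
)
second_run = list(events)
```

In both runs the screen is restored on exit, whether or not the REPL was actually entered.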
Caveats and platform notes#
An honest list of the current rough edges:
Windows: the debugger has no CI coverage on windows at all (the entire test module is skipped there); manual testing has shown it can work, but you’re in uncharted territory — reports welcome!
macOS: supported but with rough edges: special-cased prompt re-flushing for bash-on-darwin, a few tooling tests skipped on CI, and the AF_UNIX ~104-char socket-path limit forces some examples (like the stackscope demo above) to fall back from 'uds' to 'tcp' transport. Wonder if all of it’ll work on OS X? So do we.
CPython 3.14: greenback (via greenlet) doesn’t support 3.14 yet, so pause_from_sync() and the builtin breakpoint() override are effectively 3.13-only for now. The async APIs, pause() and post_mortem(), need no greenback and work everywhere.
out-of-band asyncio tasks: sync pauses from aio tasks not spawned via tractor.to_asyncio raise a RuntimeError (no greenback portal was bestowed); run await greenback.ensure_portal() inside such a task first.
nested-tree ctrl-c edges: SIGINT relay through intermediary parents that aren’t themselves in debug mode still has known rough edges; see #320.
captured stdio: pytest-style output capture can hang a pause(); use a real terminal (or a pty à la pexpect, which is how tractor’s own suite drives every one of these examples).
Where to next?#
See also
The Context: a cross-actor task pair — the SC-linked cross-actor task API that all the crash-propagation semantics above ride on.
tractor.pause(), tractor.post_mortem() and tractor.pause_from_sync() in the API reference.
examples/debugging/: 20-odd runnable scripts, nearly every one exercised by the test suite through a real pty.