Status: Design (tracks #1367 with companion task #1590).
Audience: daemon maintainers, release engineering, anyone landing follow-up
work under the broader async migration (#1751, #1935, #1411, #1412).
Scope: pick the runtime (or non-runtime) that the daemon accept loop will
adopt to serve high concurrent connection counts. This is the “which”
question. The “how” lives in
docs/design/daemon-tokio-async-listener-impl.md (#1935) and the “why”
lives in docs/design/daemon-async-accept-sync-workers.md (the hybrid
accept-plus-sync-workers model). Neither is restated here.
crates/daemon/src/daemon/sections/server_runtime/accept_loop.rs and
connection.rs currently bind std::net::TcpListener and dispatch each
accepted socket onto an std::thread::spawn worker. The thread-per-
connection (TPC) model is the production path today; the bench plan in
docs/design/daemon-tpc-benchmark-plan.md (#1933) measures its
ceiling.
A parallel async listener prototype exists at
crates/daemon/src/daemon/async_session/listener.rs behind the
async = ["dep:tokio", ...] cargo feature
(crates/daemon/Cargo.toml:20). The prototype already uses tokio; the
question is whether that choice is correct, and whether async should
ship at all.
This document resolves three sub-questions:
1.13.0 (Sep 2023). The repository has had no
feature work through 2024-2025; open issues sit unanswered, and the
upstream maintainer has publicly signalled the project is in
long-term maintenance mode at best. Picking async-std today is
picking a soft-frozen runtime.

RUSTSEC-*). The runtime ships fixes within days of
upstream kernel quirks (recent example: io_uring accept_multi
edge cases on Linux 6.x).

For a long-lived system daemon shipping across Linux, macOS, and Windows, runtime maintenance is not negotiable. async-std fails this gate before any technical comparison.
| Capability | tokio | async-std |
|---|---|---|
| TcpListener accept | tokio::net::TcpListener | async_std::net::TcpListener |
| Conversion to blocking std | into_std() + set_nonblocking | into_std() available, less polished |
| Signal handling (SIGTERM, Ctrl-C) | tokio::signal (cross-platform) | requires async-ctrlc third-party |
| Subprocess / pipe | tokio::process (full) | async_std::process (limited) |
| Blocking offload | spawn_blocking (tuned pool) | task::spawn_blocking (less tuned) |
| Time / timeouts | tokio::time::timeout | async_std::future::timeout |
| Filesystem async I/O | tokio::fs | async_std::fs |
| Sync primitives (Semaphore etc.) | tokio::sync::* | async_std::sync::* (thinner) |

Both runtimes can technically do what the daemon needs (bind, accept,
hand off to sync workers). Tokio’s offering is the more complete and
better-instrumented one; the gap matters most in the areas the daemon
actually touches: signal handling and spawn_blocking.
crates/rsync_io/Cargo.toml:22,32 and slated for the async SSH path
per docs/design/async-runtime-ssh-eval.md and #1411) is
tokio-native. async-std would require an async-compat shim.

async feature
(crates/daemon/Cargo.toml:20) and via the embedded SSH facade
(crates/rsync_io/src/ssh/embedded/connect.rs:107). Adding
async-std would violate #1780 even if it were technically attractive.

Verdict: async-std is rejected on maintenance and #1780 grounds before performance enters the discussion. Tokio is the only viable async runtime for this workspace.

The threaded model has real virtues, and they should be named so the decision below is not handwaved:
crates/protocol, crates/engine,
crates/checksums, crates/transfer, crates/core are 100% sync,
blocking, rayon-parallel. Keeping the daemon sync keeps the boundary
clean: no async fn viruses, no .await insertions in hot paths,
no Pin<Box<dyn Future>> show-ups in error types.

catch_unwind(AssertUnwindSafe(|| {
handle_session(...) })) in
connection.rs:184 is byte-for-byte the upstream-equivalent
“fork crash kills only the child” guarantee. Async tasks isolate via
JoinHandle, which is comparable but newer to this codebase.

gdb thread apply all bt Just Works on a TPC
daemon. A tokio worker stuck inside poll_* requires
tokio-console or careful tracing.

async-daemon off keep
tokio out of the binary entirely. Embedded operators (small fleet
rsync mirrors, OpenWRT-style hosts) benefit.

The threaded model fails only at scale. The audit
(docs/audits/daemon-thread-per-connection-scalability.md) and the
benchmark plan (docs/design/daemon-tpc-benchmark-plan.md) bracket
where: somewhere between W1k and W10k concurrent sessions on default
Linux ulimits, depending on glibc stack reservation and whether the
operator has raised RLIMIT_NOFILE and RLIMIT_NPROC.
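
The panic-isolation property described above can be illustrated with the standard library alone. This is a minimal sketch, not the code in connection.rs: spawn_isolated_worker and WorkerOutcome are hypothetical names, and the session closure stands in for the real per-connection handler.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::thread;

/// Outcome of one connection worker, as seen by the accept loop.
/// (Hypothetical type, introduced for this sketch only.)
#[derive(Debug, PartialEq, Eq)]
enum WorkerOutcome {
    Finished,
    Panicked,
}

/// Runs `session` on its own OS thread and contains any panic inside it,
/// so a crashing session cannot take down the accept loop or its siblings.
fn spawn_isolated_worker<F>(session: F) -> thread::JoinHandle<WorkerOutcome>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(move || {
        // AssertUnwindSafe: the closure is consumed here and nothing it
        // touched is reused after a panic.
        match catch_unwind(AssertUnwindSafe(session)) {
            Ok(()) => WorkerOutcome::Finished,
            Err(_) => WorkerOutcome::Panicked,
        }
    })
}
```

Joining the handle returned for a panicking closure yields WorkerOutcome::Panicked rather than propagating the panic, which is the thread-level analogue of a forked child dying alone.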
Tokio is already in the workspace and is the right anchor for the remaining async work:
tokio::sync::Semaphore.

The threaded path’s only structural advantage - no async colouring
inside transfer code - is preserved by the hybrid model from
daemon-async-accept-sync-workers.md: tokio runs the accept layer,
spawn_blocking hands sockets to the existing sync workers, and the
transfer state machine never sees an .await. The choice of tokio
does not commit the project to going async anywhere else.
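
The handoff in the hybrid model only works if the socket is back in blocking mode before a sync worker touches it. The sketch below shows that step with the standard library only; it does not use tokio. It simulates the non-blocking state an async accept leaves behind with set_nonblocking(true), and uses std::thread::spawn where the tokio path would use spawn_blocking. run_sync_worker, hand_off_to_sync_worker, and demo_handoff are hypothetical names for illustration.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Stand-in for the existing sync worker body (hypothetical): reads until
/// EOF with plain blocking I/O and returns the number of bytes seen.
fn run_sync_worker(mut stream: TcpStream) -> io::Result<usize> {
    let mut buf = Vec::new();
    stream.read_to_end(&mut buf)?;
    Ok(buf.len())
}

/// Puts an accepted socket back into blocking mode, then hands it to a
/// worker thread.
fn hand_off_to_sync_worker(
    stream: TcpStream,
) -> io::Result<thread::JoinHandle<io::Result<usize>>> {
    // Sockets owned by an async reactor are non-blocking; without this call
    // a sync worker would see WouldBlock errors from read and write.
    stream.set_nonblocking(false)?;
    Ok(thread::spawn(move || run_sync_worker(stream)))
}

/// Loopback round trip: accept one connection, simulate the reactor's
/// non-blocking mode, hand off, and report how many bytes the worker read.
fn demo_handoff(payload: &[u8]) -> io::Result<usize> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let payload = payload.to_vec();
    let client = thread::spawn(move || -> io::Result<()> {
        let mut s = TcpStream::connect(addr)?;
        s.write_all(&payload)?;
        Ok(()) // dropping `s` closes the socket, so the worker sees EOF
    });
    let (stream, _peer) = listener.accept()?;
    stream.set_nonblocking(true)?; // simulate what an async accept leaves behind
    let worker = hand_off_to_sync_worker(stream)?;
    client.join().expect("client thread panicked")?;
    worker.join().expect("worker thread panicked")
}
```

The worker body contains no async code at all, which is the point of the hybrid split: only the function that performs the handoff needs to know which accept layer produced the socket.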
Adopt tokio with the rt-multi-thread flavour for the daemon
async listener path, behind the existing async = ["dep:tokio", ...]
cargo feature, gated at runtime by an opt-in
use-async-listener = true oc-rsyncd.conf directive.
Rationale, in priority order:
rt-multi-thread, not current_thread. The daemon serves
independent connections that can run on independent reactor cores.
current_thread would serialise accept and limit reactor
parallelism to one core; the only reason to pick it would be a
single-threaded process invariant the daemon does not have. The
multi-thread runtime can be sized to min(available_parallelism(),
8) per the impl doc (#1935), keeping the reactor footprint small
without artificially capping accept throughput.

daemon-async-accept-sync-workers.md, the transfer engine never
becomes async-coloured. Tokio touches only the cheap parts.

--features async-daemon ship a tokio-free binary.
Operators who build with the feature but leave
use-async-listener = false ship a binary that links tokio but
never starts the runtime.

This is consistent with the conclusions in
docs/design/async-runtime-ssh-eval.md (#1411) and
docs/design/async-migration-plan.md (#1594): one runtime, opt-in,
hybrid with the existing sync engine.
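
The sizing rule min(available_parallelism(), 8) mentioned in the rationale can be written as a small pure function. This is a sketch under stated assumptions: the function and constant names are hypothetical, and falling back to one thread when parallelism cannot be queried is a choice made here, not something the impl doc is known to specify.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Upper bound on reactor threads, taken from the sizing rule in the text.
const MAX_REACTOR_THREADS: usize = 8;

/// Pure sizing rule: min(available, 8), never less than one thread.
fn reactor_threads_for(available: usize) -> usize {
    available.clamp(1, MAX_REACTOR_THREADS)
}

/// Sizing for the current host. Falls back to 1 if the platform cannot
/// report its parallelism (an assumption of this sketch).
fn reactor_threads() -> usize {
    let available = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1);
    reactor_threads_for(available)
}
```

In a tokio build the result would typically be passed to tokio::runtime::Builder::new_multi_thread().worker_threads(n); keeping the rule separate from the runtime makes it testable without linking tokio.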
The cost is modest because most of it is already paid:
AsyncDaemonListener at
crates/daemon/src/daemon/async_session/listener.rs:108 binds
tokio::net::TcpListener, drives the accept loop, applies a
tokio::sync::Semaphore for connection limits, and respects a
tokio::sync::broadcast shutdown signal. The async session
scaffold sits at crates/daemon/src/daemon/async_session/session.rs
and shutdown.rs.

crates/daemon/Cargo.toml:45 for tokio = { ..., features = ["net",
"io-util", "sync", "rt", "time"] }. The async feature gate is
live.

spawn_blocking (the pattern proven
under #1751).

What still needs to happen:
crates/daemon/src/daemon/sections/server_runtime/connection.rs
(the run_single_listener_loop and run_dual_stack_loop paths)
so that when use-async-listener = true is set, the daemon starts
a tokio runtime and delegates to AsyncDaemonListener. When the
directive is unset or false, the existing
spawn_connection_worker path runs unchanged.

tokio::net::TcpStream -> std::net::TcpStream via into_std() +
set_nonblocking(false), then call
tokio::task::spawn_blocking(move || run_sync_worker(stream, ...)) and
.await the join handle. The blocking closure is the existing
worker body factored out of connection.rs.

use-async-listener as a boolean directive in
crates/daemon/src/rsyncd_config/sections.rs and validate it in
crates/daemon/src/rsyncd_config/validation.rs. Default false.

["rt-multi-thread", "net", "macros", "signal", "sync", "time"].
rt-multi-thread and signal are not currently in the daemon’s
tokio feature list (crates/daemon/Cargo.toml:45 declares
["net", "io-util", "sync", "rt", "time"]); they get added under
#1935.

The cost is bounded: this is a swap-in at the accept boundary, not a
rewrite. The implementation diff under #1935 is expected to stay
inside crates/daemon and not touch crates/core, crates/engine,
crates/protocol, crates/transfer, or crates/checksums.
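
The use-async-listener directive is described above as a boolean that defaults to false and is validated at load time. The sketch below shows one way that contract could look. It is not the parser in crates/daemon/src/rsyncd_config: the function names, the accepted boolean spellings, and the error type are all assumptions made for illustration.

```rust
/// Parses a boolean directive value. The accepted spellings are an
/// assumption modelled on rsyncd.conf-style booleans, not the project's
/// actual parser.
fn parse_bool(value: &str) -> Option<bool> {
    match value.trim().to_ascii_lowercase().as_str() {
        "true" | "yes" | "1" => Some(true),
        "false" | "no" | "0" => Some(false),
        _ => None,
    }
}

/// Returns the effective `use-async-listener` setting for a config text.
/// A missing directive means `false`; an unparsable value is an error.
fn use_async_listener(config: &str) -> Result<bool, String> {
    let mut enabled = false; // default: the sync accept path
    for line in config.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // skip blank lines and comments
        }
        if let Some((key, value)) = line.split_once('=') {
            if key.trim() == "use-async-listener" {
                enabled = parse_bool(value).ok_or_else(|| {
                    format!("invalid use-async-listener value: {}", value.trim())
                })?;
            }
        }
    }
    Ok(enabled)
}
```

Rejecting an unrecognised value instead of silently falling back to false keeps a typo from leaving an operator on the sync path without noticing.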
The cutover is a separate decision from the implementation. The implementation under #1935 ships the path; the default flip is a follow-up that needs evidence from the bench plan in #1933. The gating signals:
pthread_create is contending with the
protocol handshake for wall-clock time. Below 200 us, the spawn
cost is in the noise next to the rsync handshake itself.

docs/design/daemon-tpc-benchmark-plan.md Section 9. Above this,
the operator cost of TPC stops being acceptable on common 8-16 GiB
hosts.

thread::spawn rather than on accept.

Any one threshold tripped flips the default. None tripped keeps the sync path the default and the async path strictly opt-in.
Below 100 concurrent connections, async stays off regardless. The small-daemon tokio overhead is not worth paying.
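
The decision rule above (any one trigger flips the default, but nothing flips it below 100 concurrent connections) can be sketched as a predicate. Only the 200 us spawn-cost figure and the 100-connection floor come from this document. The struct, its field names, and the choice to treat exactly 200 us as tripped are assumptions. The memory and tail-latency triggers are passed in as already-evaluated flags because their thresholds are defined in the benchmark plan, not here.

```rust
/// Spawn-cost threshold named in the text, in microseconds.
const SPAWN_COST_TRIGGER_US: u64 = 200;
/// Below this concurrency the async path stays off regardless of signals.
const MIN_CONCURRENCY_FOR_ASYNC: usize = 100;

/// Signals from one benchmark run. Field names are hypothetical.
struct BenchSignals {
    concurrent_connections: usize,
    /// Measured thread spawn cost; which statistic to use is not fixed here.
    spawn_cost_us: u64,
    /// Memory trigger, evaluated against the benchmark plan's threshold.
    memory_trigger_tripped: bool,
    /// Tail latency is dominated by thread::spawn rather than accept.
    spawn_bound_tail_latency: bool,
}

/// Returns true if the default should flip to the async listener.
fn should_flip_default(s: &BenchSignals) -> bool {
    if s.concurrent_connections < MIN_CONCURRENCY_FOR_ASYNC {
        return false; // small daemons keep the sync path
    }
    s.spawn_cost_us >= SPAWN_COST_TRIGGER_US
        || s.memory_trigger_tripped
        || s.spawn_bound_tail_latency
}
```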
#1935 owns the implementation. This plan owns the adoption. The two are coordinated but separable.
1. Land the implementation behind the existing feature gate.
   Merge #1935. Default build remains tokio-free; --features
   async-daemon builds the async path. Default behaviour is
   unchanged because use-async-listener defaults to false.
2. Stand up CI coverage for both paths. Add a daemon-async
   CI matrix entry that runs the daemon integration tests under
   --features async-daemon with use-async-listener = true. The
   existing matrix continues to test the sync path. Both must stay
   green on Linux, macOS, and Windows. No default flip until two
   consecutive release cycles pass clean on the async matrix.
3. Run the TPC benchmark plan. Execute
   docs/design/daemon-tpc-benchmark-plan.md (#1933) on dedicated
   hardware once the Section 3 precondition fix (active-counter
   admission gate) lands. Record W100, W1k, W10k results for the
   sync path. Compare against the trigger thresholds in Section 7
   of this document.
4. Flip the default in a separate PR if any trigger fires. The
   PR changes the default of use-async-listener to true and
   updates the daemon operator guide. The cargo feature stays in
   place so distributions that want to ship without tokio can still
   do so via --no-default-features. The sync path is not deleted
   for at least one release after the flip; it remains the fallback
   if a regression is reported.
5. Retire the sync accept path only after one full release
   cycle of the async default with no rollback. Even then, the
   std::thread::spawn worker bodies stay: only the accept loop
   moves. The hybrid model means the actual transfer code is
   untouched throughout this five-step sequence.

crates/core, crates/engine, crates/transfer,
crates/checksums, crates/protocol, or crates/metadata. The
hybrid model exists precisely so that work never has to happen.

docs/design/async-runtime-ssh-eval.md.

async-compat. Rejected outright per #1780.

current_thread runtime flavour. Rejected per Section 5.

docs/design/iouring-daemon-tcp.md and is a tokio internal
implementation choice (tokio-uring), not a runtime choice.

docs/design/daemon-async-accept-sync-workers.md
(the “what runs where” answer).

docs/design/daemon-tokio-async-listener-impl.md
(the “how to build it” answer, #1935).

docs/design/daemon-tpc-benchmark-plan.md
(the “when to flip the default” data source, #1933).

docs/audits/daemon-thread-per-connection-scalability.md
(the “why we need this at all” data source, #1673).

docs/design/async-migration-plan.md (#1594).

docs/design/async-runtime-ssh-eval.md (#1411) and
docs/design/ssh-transport-async-io-eval.md (#1593).