rsync

Parallel-flist deterministic delete (#2251)

Status: Design (task #2251; supersedes #1940 + docs/design/delete-during-strict-order-gate.md) Audience: receiver, generator, engine, cli maintainers Scope: replace today’s batched --delete-during sweep and PR #4245’s opt-in --delete-strict-order gate with a single two-phase model that reproduces upstream rsync 3.4.1’s per-directory delete order byte-for-byte while preserving internal parallelism. No new user-visible flags; no fallback to the old batched sweep.

1. Problem statement

1.1 Today’s divergence

oc-rsync’s receiver runs a single deletion sweep before the transfer pipeline (crates/transfer/src/receiver/transfer.rs:540-544):

if self.config.flags.delete {
    let (ds, exceeded) = self.delete_extraneous_files(&setup.dest_dir, writer)?;
    ...
}

Inside delete_extraneous_files (crates/transfer/src/receiver/directory/deletion.rs:40-186):

Upstream, in contrast, drives deletion per directory inside the generator loop (target/interop/upstream-src/rsync-3.4.1/generator.c):

1.2 Why PR #4245 was the wrong shape

PR #4245 (#1940) added --delete-strict-order as an opt-in flag. Audit of the implementation (crates/engine/src/local_copy/executor/directory/recursive/mod.rs:188-200, plus 94 references across cli, core, engine):

1.3 Mandate

2. Two-phase model

                  +-----------------+      +------------------+
   flist segment  | compute_extras  |---->-| DeletePlan(D)    |
   arrives (#N)   | (rayon worker)  |      +------------------+
                  +-----------------+               |
                                                    v
                  +-----------------+      +------------------+
   flist segment  | compute_extras  |---->-| DeletePlan(D')   |
   arrives (#N+1) | (rayon worker)  |      +------------------+
                  +-----------------+               |
                                                    v
                  +---------------------------------------+
                  | DeletePlanMap (keyed by dir relpath)  |
                  +---------------------------------------+
                                    |
                                    v
                  +---------------------------------------+
                  | DirTraversalCursor (upstream order)   |
                  +---------------------------------------+
                                    |
                                    v
                  +---------------------------------------+
                  | single emitter thread:                |
                  | for each dir in upstream order        |
                  |   await DeletePlan(D)                 |
                  |   for each entry in plan order        |
                  |     unlink, itemize, stat++           |
                  +---------------------------------------+

2.1 Phase 1 - parallel compute_extras

Attach to the existing per-segment receiver hook in crates/transfer/src/receiver/file_list.rs:130-233 (receive_extra_file_lists). For each arriving segment S describing content directory D:

  1. Determine the segment’s content directories. For each, look up the destination directory’s read_dir snapshot (filename, file_type, normalized for macOS NFC parity as today).
  2. extras(D) = readdir(D) - segment_entries(D), intersected with FilterChain::allows_deletion() for the snapshot of the chain that is in effect for that directory (including any .rsync-filter merge files loaded by enter_directory for that subtree).
  3. Sort extras(D) with compare_file_entries (crates/protocol/src/flist/sort.rs:60, our existing port of upstream f_name_cmp, target/interop/upstream-src/rsync-3.4.1/flist.c:3217-3343). The sort uses t_PATH when protocol_version >= 29, matching f_name_cmp exactly. Upstream uses qsort (unstable); we use sort_unstable_by to match.
  4. Reverse the sorted slice. Upstream iterates for (i = dirlist->used; i--; ) (generator.c:320), so plan-order is the reverse of ascending f_name_cmp.
  5. Wrap the result in a DeletePlan and publish it into DeletePlanMap keyed by the directory’s relative path.

This work parallelises naturally on the existing rayon segment-dispatch pool. compute_extras is pure (read-only filesystem stat + immutable flist + immutable filter chain), so workers do not coordinate. The same PARALLEL_STAT_THRESHOLD = 64 knob used elsewhere in the receiver gates whether per-directory scans inside a segment fan out further.

2.2 Phase 2 - single emitter

A single drain task owns the unlink, itemize, and stats sequence. It walks directories in upstream traversal order via DirTraversalCursor (section 4); for each directory D it blocks until DeletePlanMap[D] is ready, then iterates the plan and:

Because every observable side effect happens on a single thread in upstream order, the wall-clock event sequence is bit-identical.

2.3 Invariants the design locks down

The implementation tasks (#2252-#2285) MUST preserve these:

  1. Single emitter for observable effects. Only the drain task calls unlink, emits itemize, mutates DeleteStats, or updates io_error. Workers compute candidates and stop.
  2. Order = upstream f_name_cmp reversed, per directory. Plan order inside a directory is compare_file_entries ascending, reversed - identical to delete_in_dir’s decrementing loop.
  3. Directory order = upstream depth-first traversal. The cursor yields directories in the order do_delete_pass / generate_files() would visit them; see section 4.
  4. Filter chain snapshot per directory. The compute_extras worker uses the chain in effect when that directory’s .rsync-filter merges have been applied, mirroring change_local_filter_dir(fbuf, dlen, F_DEPTH(file)) (generator.c:301). The drain task does not re-evaluate filters.
  5. Workers are pure. No syscalls beyond read_dir/stat; no mutation of shared state beyond inserting into DeletePlanMap.
  6. Plan publication is monotonic. A DeletePlan is published once per directory and never mutated. The drain task may wait on it but never races with a writer.
  7. --delete-delay defers, does not reorder. Plans are still built in phase 1; the emitter writes them to a delay buffer and replays during finalisation, in the same per-directory order.
  8. No fallback. The old delete_extraneous_files batched sweep is deleted, not gated. The --delete-strict-order / --no-delete-strict-order flags are removed from the CLI surface.

3. f_name_cmp semantics

Upstream flist.c:3217-3343 defines a four-state automaton over (dirname, basename):

Our existing port lives in crates/protocol/src/flist/sort.rs:

The design REUSES this port. Phase 1 builds a transient FileEntry per destination-side candidate (using the same S_ISDIR bit upstream checks), then sorts with compare_file_entries. The cited golden tests at crates/protocol/src/flist/sort.rs:472-809 cover the comparator’s edge cases (qsort vs stable, pre-29 vs >=29, dir-vs-file trailing slash); the implementation task adds one more golden file pinning the reverse iteration order matches generator.c:320.

4. Data structures

These types live in a new module crates/engine/src/delete_plan/ (siblings: mod.rs, plan.rs, cursor.rs, emitter.rs).

4.1 DeletePlan

/// Sorted, frozen list of destination entries to delete in one directory.
#[derive(Debug)]
pub struct DeletePlan {
    /// Relative directory path from the destination root.
    pub dir: PathBuf,
    /// Entries to delete, in upstream `delete_in_dir` emission order
    /// (i.e. `compare_file_entries` ascending, reversed).
    pub entries: Vec<PlannedDelete>,
    /// Filter snapshot used to compute this plan (for diagnostics; the
    /// snapshot was already applied before publication).
    pub filter_generation: u64,
}

#[derive(Debug)]
pub struct PlannedDelete {
    pub file_name: OsString,
    pub file_type: FileTypeFlags,
    pub flags: DeleteFlags, // DEL_RECURSE, DEL_NO_UID_WRITE, etc.
}

DeleteFlags is a bitset mirroring upstream rsync.h:291-301 (DEL_NO_UID_WRITE, DEL_RECURSE, DEL_DIR_IS_EMPTY, DEL_FOR_FILE, DEL_FOR_DIR, DEL_MAKE_ROOM). The plan stores only the flags upstream sets in delete_in_dir (DEL_RECURSE, optional DEL_NO_UID_WRITE); DEL_FOR_* and DEL_MAKE_ROOM belong to the in-loop replacement path and are not part of the sweep.

4.2 DeletePlanMap

/// Lock-free map from directory relpath to publish-once `DeletePlan`.
pub struct DeletePlanMap {
    inner: DashMap<PathBuf, Arc<OnceLock<DeletePlan>>>,
}

impl DeletePlanMap {
    /// Worker side. Reserves a slot and returns a publisher handle.
    pub fn reserve(&self, dir: PathBuf) -> PlanPublisher { ... }

    /// Drain side. Blocks until `dir`'s plan is published; returns
    /// `None` if the cursor has been told this directory will never
    /// receive a plan (e.g. it does not exist at the destination).
    pub fn wait(&self, dir: &Path) -> Option<Arc<DeletePlan>> { ... }
}

DashMap is already in the workspace (used in the buffer pool and the hardlink table). The OnceLock inside each slot makes publication single-writer / multi-reader and lock-free on the read path. An explicit sentinel marks “no plan ever” so the drain task does not deadlock when a directory is missing from the destination.

4.3 DirTraversalCursor

/// Yields directories in upstream traversal order for the drain task.
pub struct DirTraversalCursor {
    flist: Arc<SegmentedFileList>,
    next_segment_ix: usize,
    segment_dir_queue: VecDeque<PathBuf>,
}

impl DirTraversalCursor {
    /// Returns the next directory to drain, or `None` when all
    /// segments are exhausted.
    pub fn next_dir(&mut self) -> Option<PathBuf>;
}

Order rules (mirrors generate_files() and do_delete_pass()):

5. Per-timing-mode wiring

All four modes use the same Phase 1 (parallel compute_extras) and the same Phase 2 emitter. The mode only changes WHEN the drain task runs and WHAT it does with each plan.

Mode Phase 1 trigger Phase 2 trigger Per-plan action Upstream reference
--delete-before as each segment lands after EOF on flist, before first xfer unlink immediately generator.c:2263-2264 (do_delete_pass)
--delete-during as each segment lands interleaved per directory with xfers unlink immediately, before children generator.c:1523, 2307
--delete-delay as each segment lands after all xfers complete append to deldelay_buf, replay last generator.c:265, 2408-2409
--delete-after as each segment lands after all xfers complete unlink immediately generator.c:2410-2411 (do_delete_pass)
--delete-excluded as each segment lands (per-mode above) extras include filter-excluded entries exclude.c:1330, 1571, 1648

--delete-excluded is orthogonal to timing: it widens compute_extras so filter-excluded entries are eligible for deletion on the sender side (exclude.c:1571) and on the receiver side (exclude.c:1648). Phase 1 honours delete_excluded when computing the extras set; phase 2 is unchanged.

--delete-missing-args keeps its existing path (receiver/transfer.rs::create_directories / args resolution); it operates on top-level arguments, not on directory contents, and is out of scope for DeletePlanMap.

engine::hardlink::HardlinkTracker and protocol::flist::HardlinkTable describe the source side - which entries in the flist are followers of an earlier leader. They are not consulted when deciding what to delete on the destination: upstream’s delete_in_dir uses flist_find_ignore_dirness (generator.c:333), which looks the destination entry up by name in the sender’s sorted flist regardless of its inode story.

The design therefore does NOT cross-reference the hardlink table when building extras. A destination-side file that happens to be a hardlink target of a kept file but whose name is absent from the segment is still deleted (matching unlink semantics: only the name goes; the inode persists if there are other links). The single exception is the FLAG_MOUNT_DIR check (generator.c:324-329), which suppresses mount-point deletion and is preserved verbatim in the emitter.

--remove-source-files runs on the sender after the receiver acknowledges each file. It does not flow through DeletePlanMap; its ordering is governed by sender-side ACK arrival and is already upstream-identical.

7. Error policy

Upstream’s delete_in_dir and delete_item continue on most errors and abort only on:

The emitter mirrors both. Other errors (ENOENT, EACCES, ENOTEMPTY, etc.) are logged via debug_log!(Del, 1, ...) and the loop advances. This matches today’s delete_extraneous_files error handling (crates/transfer/src/receiver/directory/deletion.rs:178-182) and upstream’s delete.c:165-200 continue-on-failure stance.

DeleteStats is updated only on successful unlink, exactly where upstream increments stats.deleted_files and friends. The varint encoding in crates/protocol/src/stats/delete.rs is unchanged; the NDX_DEL_STATS writer in the generator goodbye phase (protocol >= 31) continues to consume the same struct.

8. Removal plan

The following are deleted as part of the implementation series, NOT deprecated, NOT gated:

8.1 CLI surface

Delete from crates/cli/src/frontend/:

8.2 Config and engine surface

Delete from crates/core/src/client/config/:

Delete from crates/engine/src/local_copy/:

8.3 Batched sweep

Delete from crates/transfer/src/receiver/:

8.4 Documentation

9. Test plan

9.1 Interop event-order parity (gates this entire change)

Add tests/delete_order_interop.rs driven by tools/ci/run_interop.sh:

9.2 Determinism property tests

crates/engine/src/delete_plan/tests.rs:

9.3 Unit tests for the comparator and cursor

9.4 Filter snapshot tests

9.6 Performance benches

benches/delete_throughput.rs (new):

10. Implementation order (tasks #2252-#2285)

The work is sequenced so each step is independently shippable behind the existing pipeline (no half-states observable to the user). Each step has its own task tag inside the #2252-#2285 range.

  1. Step 1 - data structures (#2252-#2257). Land crates/engine/src/delete_plan/ with DeletePlan, PlannedDelete, DeleteFlags, DeletePlanMap, DirTraversalCursor. No callers yet; pure unit tests pinning the f_name_cmp reverse-iteration order and cursor traversal.

  2. Step 2 - emitter and stats wiring (#2258-#2264). Land delete_plan::emitter with the single-threaded drain loop, wired through MsgInfoSender, DeleteStats, --max-delete, and the upstream error policy. Behind a fresh cfg(test) switch so nothing in the production receiver calls it yet.

  3. Step 3 - parallel compute_extras (#2265-#2271). Hook into receive_extra_file_lists (receiver/file_list.rs:130). Build plans and publish to DeletePlanMap. No emitter consumption yet; assert plan content in unit tests.

  4. Step 4 - cut the receiver over (#2272-#2278). Replace receiver/transfer.rs:540-544 and engine/local_copy/executor/directory/recursive/mod.rs:188-200 with the new emitter. Delete delete_extraneous_files and the strict-order field cluster from cli / core / engine. --delete-strict-order becomes an unknown-arg error.

  5. Step 5 - interop, perf, docs (#2279-#2285). Land the interop matrix from section 9.1, the perf bench from section 9.6, and the doc rewrites from section 8.4. Close out the audit references in docs/architecture/delete-during.md and docs/design/delete-during-strict-order-gate.md.

    • DDP-F3 (#2272): SHIPPED. The legacy-batched-delete cargo feature and the pre-DDP-E batched sweep helpers (delete_extraneous_entries_batched, remove_extraneous_path, delete_directory_tree_recursive) are removed from crates/engine. The emitter is now the sole production unlink path for every --delete-* timing mode.

11. Cross-references