Codebase review — 2026-06-09 (five-agent sweep)

Five parallel review agents over: reconciler enumeration correctness, concurrency/main-thread hygiene, deletion-propagation engine logic, performance, and UI progress honesty. Informed by the build 124–127 troubleshooting arc. Findings below are deduplicated and re-prioritized; items found independently by multiple agents are marked [corroborated].

Severity here is calibrated to cairn’s stakes: anything that can trash a photo still alive on the phone is critical regardless of likelihood.


Tier 1 — deletion-safety bugs (fix before next release)

1.1 Incremental metadata writes wipe allResourceFilenames, defeating the build-121 edited-asset protection

PhotoKitPersistentChangeReconciler.swift:1895-1902 (observeAndFilter) and :1845-1852 (recordFullEnumerationMetadata); destructive replace at SwiftDataStores.swift:1277. Severity: critical · certain (mechanism)

Both reconciler metadata paths construct LocalAssetMetadata without allResourceFilenames (defaults []), and record() unconditionally overwrites allResourceFilenamesCSV. Any insert/update event — including the edit event itself — permanently erases the non-primary resource filenames, which are the only source of the pre-edit upload name (IMG_2999.MOV vs FullSizeRender.mov) for the alive-key set (AppDependencies.swift:2210). Pre-anchor-era edited assets can then escape the alive-on-phone filter, get limbo-stamped, and in .trusting mode trash after 14 days while the photo is still on the phone.

Fix: observeAndFilter already has resources in hand at line 1892 — map resources.map(\.originalFilename) into the entry there and in recordFullEnumerationMetadata; and/or make record() preserve a non-empty stored CSV when the incoming list is empty.

1.2 Incremental insert/update path actively manages broad-source assets [corroborated: also via PhotoKitScopeEnumerator]

PhotoKitPersistentChangeReconciler.swift:537-607 (no source filter in .fullLibrary mode), observeAndFilter:1883 and hashAssets(ids:):2006 (both options: nil); PhotoKitScopeEnumerator.swift:79 includes .typeCloudShared in the membership map. Severity: high · likely

The post-125 narrow rule was enforced at hashAllCurrentAssets and discoverUntrackedAssets, but the third active-management entry point — persistent-change insert/update ids — has no source-type/hidden filter. A cloud-shared asset added by the other shared-library member fires an insert → cairn iCloud-downloads, hashes, indexes it (the 125 hang class, arriving incrementally). Worse: once indexed, the other member’s removal fires deletedLocalIdentifiers → stamped confirmed → cairn can trash the user’s Immich copy for a deletion the user never made.

Fix: filter insert/update candidate ids to .typeUserLibrary (e.g. via asset.sourceType inside observeAndFilter, where assets are already in hand); leave deletion-id resolution and the orphan sweep broad. Narrow PhotoKitScopeEnumerator’s source types to match.

1.3 Stale ConfirmedDeleted clocks skip the quarantine window — three converging mechanisms [corroborated]

  • Deferred drains never un-confirm restored checksums (PhotoKitPersistentChangeReconciler.swift:1026-1032 restoration covers only insert/update batches; drains at :1316-1339, :1495-1505 excluded).
  • Limbo-stamp suppression uses exact-second alive-key match while the engine uses ±1s (AppDependencies.swift:2412-2423 vs ReconciliationEngine.swift:362-370) — a 1s-drifted alive asset is candidate-suppressed every sync but still limbo-stamped.
  • runFullEnumeration never un-confirms re-observed checksums (only the incremental path does; host cleanup at AppDependencies.swift:2301-2303 covers only edit-retirement anchors).

Because ConfirmedDeletedStore.union is first-write-wins, the stale clock quietly ages out; a later genuine delete goes straight to deleteCandidates with zero quarantine. Severity: high · likely

Fix: each sync remove confirmedMap.keys ∩ extendedLocal from the store (generalizing the anchor cleanup); apply the ±1s probe when building aliveOnPhoneServerChecksums; fold drained checksums into the restoration path and recentlyObservedChecksums.

1.4 Restore-via-cairn misses Live Photo motion-video checksums → can re-trash the video half it just restored

AppDependencies.swift:4127-4159; expansion at TrashOrchestrator.swift:114-116. Severity: high · likely

Post-restore cleanup derives checksums from planningTrash targets by assetId; motion videos enter via livePhotoVideoId expansion and are restored but never un-confirmed/excluded. Next sync: in Observed, absent locally, not excluded, stale past-quarantine stamp → re-trashed. Broken Live Photo — the outcome expandLivePhotoPairs exists to prevent.

Fix: include checksums for target.livePhotoVideoId ids when computing the cleanup set (or record video checksums in TrashTarget).

1.5 Pending-trash retry queue not pruned on phone-side restoration

AppDependencies.swift:3040-3095 (drainPendingTrashes); prunes exist only at :4192, :4230, :4391. Severity: high · likely

Confirm-trash offline → intent queued → user restores from Recently Deleted → next sync un-confirms the checksum but the queued intent survives and the drain (which runs right after that same sync, :3989) trashes the server copy anyway.

Fix: prune intents containing scan.recentlyObservedChecksums (or at least unconfirmedByRestoration) before draining.


Tier 2 — correctness bugs (high, not directly destructive)

2.1 Incremental loops clobber modificationDate to nil right after hashing

PhotoKitPersistentChangeReconciler.swift:742, :771. certain

hashAssets already persists each success with the real modDate (:2220); the outer loops then call the two-arg hashStore.setmodificationDate: nil → delete-and-reinsert. Every incrementally-hashed asset becomes “stale” to the full-enum resume classifier (:1730-1742) → re-pays iCloud downloads on any future full enumeration (the 125–127 hang class); also defeats filterStaleUpdates’ cheap modDate exit and drainInternal’s alreadyHashed short-circuit.

Fix: delete the redundant outer set calls — the inner per-asset persistence already did the work. One-line fix, highest leverage in this tier.

2.2 Incremental change-token checkpoint loses events that fire during the scan

PhotoKitPersistentChangeReconciler.swift:1034-1043. certain (window); likely (impact)

Token is saved at scan end (after minutes of hashing); events in that window are never seen. The comment claims the opposite. Lost deletions are mostly recovered by the orphan sweep (but NOT in limited-Photos mode, where orphan finds are deliberately unstamped, :939); lost inserts are recovered by the untracked sweep; lost edits are recovered by nothing — cache SHA1 goes permanently stale, so a later delete resolves to the wrong checksum.

Fix: capture currentChangeToken before fetchPersistentChanges and save that (matching runFullEnumeration:1533’s baseline pattern). Double-processing on the next scan is idempotent and safe.

2.3 No reentrancy guard on requestSync [corroborated ×3 — found independently by reconciler, concurrency, and UI agents]

AppDependencies.swift:3852-3899; unguarded call sites: CairnAppIntents.swift:48, CairnApp.swift:237, 274, debug fire. likely

Shortcut/BG-handler/automation invocations concurrent with a manual sync interleave two performLiveReconciliation runs at every suspension point: double token consumption (widening 2.2’s window), cross-reset narration, whichever finishes first flips isSyncing=false so the survivor’s progress callbacks are swallowed (frozen UI for minutes of real work). Also: the cancel-unwind check (:4000, syncStartedAt != nil) can’t distinguish its own sync from a successor’s — a late CancellationError can stomp a fresh sync’s state.

Fix: early guard !model.isSyncing (or a sync-generation UUID compared in the unwind) inside the requestSync closure so all five entry points share one guard.

2.4 BG task handlers report success: true on expiration

CairnApp.swift:230-248, 268-294; requestSync swallows CancellationError at AppDependencies.swift:3990. certain

Expired BG slots log “refresh completed successfully” and complete with success. Diagnostics lie; system scheduler bookkeeping gets false signal. Fix: check Task.isCancelled after requestSync returns in BG handlers.

2.5 serverAssetsTask outlives a failed/cancelled sync and mutates model state afterwards

AppDependencies.swift:1837-1922. certain

Thrown sync never cancels the paginated server fetch (minutes of network/battery); its completion block writes serverChecksumSet / model.library with no isSyncing guard — cancelled sync’s UI visibly mutates afterward. Fix: cancel in defer/catch (or async let); gate the completion writes on isSyncing.

2.6 Orphan-merge rebuild drops recycledExclusionCandidates and aliveOnPhoneCandidateCount

AppDependencies.swift:2547-2554. certain

The ReconciliationOutput(...) rebuild omits both newer fields → the recycled-exclusion review surface vanishes whenever inferred orphans co-occur. Fail-safe direction, but nondeterministic feature suppression. Fix: pass both fields through.


Tier 3 — performance (hit on real workloads)

3.1 indexedCount() full-table scan on every hash-progress tick — O(n²) during initial scan

AppDependencies.swift:143 + SwiftDataStores.swift:931-939. high confidence Progress fires every 25 assets/250ms; each tick hydrates every StoredLocalHashEntry row (200k+ on a 100k library) to count distinct ids — serialized on the same actor as the hash writer. Fix: running counter in the callback (we already have done/initialDone), or propertiesToFetch.

3.2 setImputed per-asset transactions — ~81k sequential saves

AppDependencies.swift:1735-1742 + SwiftDataStores.swift:1053-1076. Largely explains the observed ~8-minute imputation phase (pure DB, no download to hide behind). Fix: batched setImputedBatch with ~500-1000 ids per save.

3.3 SwiftDataLocalAssetMetadataStore.record N+1 with ~108k entries

SwiftDataStores.swift:1262-1291; caller AppDependencies.swift:1627. One SELECT per entry + one giant save. Same N+1 shape in SwiftDataExclusionStore.insert (:695-720) and removeAll(for:) (:1128-1141). Fix: bulk-fetch existing ids once, insert/update in chunks.

3.4 applyEvents fetches the whole server cache per 100-event batch — quadratic bootstrap

SwiftDataStores.swift:1829-1832. Dormant today (incremental server sync ships off), but will dominate bootstrap the moment session-auth lands (open work #2). Fix: predicate-fetch only the batch’s ids.

3.5 saveThumbhashes N+1 over the full matched set, every sync

SwiftDataStores.swift:1563-1577; caller AppDependencies.swift:2724-2733. ~100k SELECTs per sync at steady state for a no-op. Fix: one query for already-populated ids, filter first.

3.6 Per-asset ISO8601DateFormatter allocation in DTO decode

ImmichClient.swift:899-906. 100-200k ICU-backed inits per full fetch. Fix: Date.ISO8601FormatStyle or static formatters.

3.7 refreshLibrarySizeStats hydrates two full stores, twice per sync

AppDependencies.swift:2907-2925 (called :1793, :2063). Fix: compute from the sync prelude’s existing snapshots.

Watch items: thumbnail eviction hydrates every blob (SwiftDataStores.swift:1592-1634 — store a sizeBytes column); visibleFetch enumerated 4× per sync (one pass could fill all four structures); Observed store full-table fetch per union call; journal read 3× per sync end.


Tier 4 — main-thread hygiene (the remaining UI-stutter class)

The DiagnosticLogFlusher fix removed the periodic offender; sync orchestration still funnels heavy work through MainActor:

  • performLiveReconciliation (AppDependencies.swift:2136-2526): enumerates the full visibleFetch three times on main, per-Live-Photo PHAssetResource.assetResources calls (10-30ms each), engine compute + orphan match over 100k-element sets — all on main. high
  • handlePhotoLibraryChange (:952-967): per-asset assetResources on main for every insert/change burst — a 50-photo import ≈ 1.5s stall while foreground. high
  • backfillMetadataIfNeeded (:1264-1306): full-library enumeration
    • per-asset resources on main; near-whole-cache on upgrade installs. (Contrast migrateResourceFilenamesIfNeeded, which detaches correctly.)
  • refreshLibrarySizeStats full-library path (:2875-2919): synchronous fetch + 100k-iteration intersection on main, 3×/sync.

Fix direction (shared): enumerate visibleFetch once on a detached task producing Sendable products; run engine/orphan compute off main; hop back only for model writes.

Verified safe: both nonisolated(unsafe) formatters; SwiftData store actors; CairnAppModel isolation; BG-task registration pattern.


Tier 5 — UI honesty (the “45-minute Syncing…” class)

5.1 Server-fetch progress invisible outside InitialScanScreen

AppDependencies.swift:1825-1829; CairnAppRoot.swift:321-322. certain serverAssetsFetched reaches only InitialScanScreen. Post-initial-scan, performLiveReconciliation blocks at serverAssetsTask.value with syncPhase still .hashing → Status shows frozen “Processing N/N” at 100% for the 8-16 min fetch. coordinator.syncToCache() has no progress callback at all. Fix: surface fetched/expected in StatusScreen + SyncDetailSheet; drive .fetchingServer as a real phase; add per-batch callback to the coordinator.

5.2 DryRunSheet integrity [corroborated ×2]

  • “Safety checks” card hardcodes three green checkmarks (DryRunSheet.swift:302-306) — asserts “Photos access is Full” even when limited. Fabricated assurance on a destructive surface.
  • “Raise threshold to N% and retry” button has an empty action (:358).
  • Terminal state reports “N moved to Trash” unconditionally (:444-450) — including when confirmTrash silently no-ops or throws (failure alert pops over the success sheet).
  • Related: iOS never runs SafetyRails.evaluate (CLI-only); the percent banner is the only iOS mass-delete guard. Defensible (all iOS trashes are user-confirmed) but worth an explicit decision. Fix: wire real check values; outcome-returning onConfirm with a failure terminal state; implement or remove the threshold button.

5.3 Settings “Hash now” drain uncancellable + invisible after Cancel

CairnAppRoot.swift:629 vs :565-570. certain Not tracked in activeSyncTask; Cancel flips isSyncing=false, drain keeps doing unlimited-mode iCloud downloads with all progress suppressed. Fix: route both entry points through one tracked-task helper.

5.4 ETA/progress integrity during initial scan

  • Post-imputation first emit re-baselines from initialHashed: 0 (AppDependencies.swift:2026 + :183-185) → wildly optimistic cumulative rate AND a bogus persisted per-asset rate poisoning next session’s bootstrap ETA. certain logic
  • Imputation reuses syncProgress.hashed as “scanned” → the bar fills to 100% twice with a backward snap (:1541-1550, :2026). Fix: sentinel/separate fields per phase; re-establish baseline from done.

5.5 Smaller UI items

  • Bootstrap renders Color.clear through up to ~2 min of serial network awaits (AppDependencies.swift:718-752) — flip isBootstrapping before the probes.
  • model.appState is dead state; Status “Safety rail tripped” banner can never fire and contains fixture math (StatusScreen.swift:676-686).
  • Failure toast can report a stale lastScanBurstCount and never auto-dismisses (AppDependencies.swift:4046-4049).
  • Export/import: zero in-progress feedback on multi-second operations.
  • requestSync early-bail guards leak syncStartedAt (ELAPSED ticks while idle) (:3914-3932).
  • suppressErrors can resurrect a dismissed alert (CairnAppRoot.swift:416-418).
  • Imputation join key is case-sensitive unlike every other filename comparison (AppDependencies.swift:1516) — safe direction, but evades collision detection.

Lower-priority reconciler notes

  • Asset whose every resource read fails is recorded nowhere → perpetual untracked-sweep rediscovery (:2477-2482, :2209). Defer with a retry reason instead.
  • drainInternal staleness check uses narrow fetch where pruneStaleDeferredEntries deliberately uses broad — hidden-while-queued assets dropped as “deleted” (:1442-1458 vs :1188-1202).
  • untrackedDiscovered counts pre-scope-filter ids → permanently inflated log/Result noise in .selectedAlbums mode (:563-567).

Verified clean (high-confidence)

  • Engine compute() set arithmetic, quarantine boundary, scope filter, ±1s tolerance — densely test-pinned, all correct.
  • Post-125 filter split at hashAllCurrentAssets / discoverUntrackedAssets (narrow) and reconcileCacheAgainstLibrary / prune pass-2 (broad).
  • hashAssets TaskGroup priming/cancel/resume; the build-127 ≥50 threshold is correctly wired through both insert and update batches (cosmetic: counter restarts between the two batches).
  • Crash-consistency: token-after-mutations ordering, first-write-wins union, WAL journal ordering, Live Photo pair expansion.
  • Hashing.swift streaming; ServerAssetSyncCoordinator batching design; sync phase state machine exit paths; SwiftData actor confinement.

Implementation status (2026-06-09 session)

Fixed & committed (each its own commit, all build green + 668 tests pass):

  • Tier 1 — 1.1, 1.2, 1.3 (a/b/c), 1.4, 1.5 — all deletion-safety items.
  • Tier 2 — 2.1, 2.2, 2.3, 2.4, 2.5, 2.6 — all correctness items.
  • Tier 3 — 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7 — all performance items.
  • Tier 4 — all main-thread items: handlePhotoLibraryChange, backfill, refreshLibrarySizeStats (also 3.7), and the big one — performLiveReconciliation alive-scan + engine + orphan match moved off MainActor (3× redundant full-library enumeration collapsed to one). Plus the DiagnosticLogFlusher duplicate-capture clamp and the force-unwrap.
  • Tier 5 — 5.2 (DryRunSheet real checks + dead-button removal), 5.3 (Settings drain cancellable), plus M3 (bootstrap probes off critical path), M6 (failure-toast count + dismiss), M7 (appState marked unwired), L1 (early-bail syncStartedAt leak), and review item 7 (imputation join-key case-fold).

Also fixed (second session):

  • 5.1 / H1 — server-fetch progress. performLiveReconciliation now drives .fetchingServer as a real phase during the post-hash blocking wait (replacing the synthetic timeline entry), and StatusScreen surfaces serverAssetsFetched/Expected with a server-fraction bar + live count instead of a frozen “Processing N/N”. Paginated path only; the incremental coordinator’s per-batch callback stays deferred (moot until session auth — that path 403s and falls back to paginated today).
  • 5.4 / M1SyncProgress.initialHashed made optional; imputation / pre-hash scan leave it nil so the first hash emit captures the baseline from done (stops the optimistic ETA + the poisoned persisted rate).
  • 5.4 / M2 — added SyncProgress.scanned; the pre-hash scan renders as its own “X / Y phone assets scanned” line and the main bar fills once.

Also fixed (third session) — all remaining items closed:

  • Incremental /sync/stream per-batch progress callback (now relevant since session auth is in active use) — wired to serverAssetsFetched so the stream path gets the same live counter as 5.1’s paginated path.
  • M8 (export/import spinner), L2 (suppressErrors alert resurrection), L3 (Retry-now busy state), L4 (.nothingToVerify toast instead of a false “Rescan queued”), L5 (applyHashEvent isSyncing guard + clear-on-cancel).
  • Reconciler notes (a) defer all-resources-failed assets, (b) broaden the drain liveness fetch (hidden-while-queued), (c) count untrackedDiscovered post-scope-filter.

Nothing outstanding from the review.

Suggested fix order

  1. 1.1 (alive-filename wipe — critical, small fix) + 2.1 (modDate clobber — one-line) in one build.
  2. 1.2 source-type filter on incremental inserts (+ scope enumerator).
  3. 1.3/1.4/1.5 as a “quarantine-clock integrity” bundle.
  4. 2.2 token-before-fetch + 2.3 requestSync guard.
  5. Tier 3 perf bundle (3.1/3.2/3.3 are the initial-scan wall-clock).
  6. Tier 4 main-thread bundle (the remaining stutter).
  7. Tier 5 UI honesty bundle (5.1/5.2/5.3 first).