Lessons Learned

Recurring Architectural Patterns in TermOnMac

This article collects patterns that recur across the TermOnMac codebase. They aren’t documented as deliberate principles in any one place; they emerge from reading the relay server, Mac agent, and iOS app side by side. Each pattern is grounded in a specific code reference.

1. Strong Consistency in DO, Cache in KV

The SubscriptionDO file states the principle directly in its header comment:

// relay_server/src/subscription-do.ts
// SubscriptionDO — per-user Durable Object that serializes all subscription writes.
// DO storage is the source of truth for tier and subscription state (strongly consistent).
// KV is written as a cache but not relied upon for decisions.

The same shape recurs in the Room DO for room state. Decisions that need correctness (subscription tier, room ownership) live in DO storage. KV is used for things that tolerate eventual consistency (rate limiters, usage history, profile cache).
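The split can be sketched in a few lines. This is a hypothetical illustration, not the real code: `DurableStore` and `KvCache` are minimal in-memory stand-ins for DO storage and Workers KV, and `setTier`/`decideTier` are invented names.

```typescript
// Stand-ins for DO storage (authoritative) and Workers KV (cache).
interface DurableStore { get(k: string): string | undefined; put(k: string, v: string): void; }
interface KvCache { get(k: string): string | undefined; put(k: string, v: string): void; }

class MapStore implements DurableStore, KvCache {
  private m = new Map<string, string>();
  get(k: string) { return this.m.get(k); }
  put(k: string, v: string) { this.m.set(k, v); }
}

// Writes land in DO storage first (strongly consistent source of truth),
// then in KV as a best-effort cache that may lag or be lost.
function setTier(doStore: DurableStore, kv: KvCache, userId: string, tier: string): void {
  doStore.put("tier", tier);
  kv.put(`tier:${userId}`, tier);
}

// Decisions read only DO storage; KV is never consulted for correctness.
function decideTier(doStore: DurableStore): string {
  return doStore.get("tier") ?? "free";
}
```

The asymmetry is the point: the cache can silently go stale without affecting any decision, because no decision path reads it.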

2. Separate Keys to Avoid Read-Modify-Write Races

The extra quota implementation splits grant from counter:

// relay_server/src/usage-tracking.ts
function extraQuotaKey(userId: string): string {
  return `extra_quota:${userId}`;       // grant — never modified
}

function extraQuotaUsedKey(userId: string): string {
  return `extra_quota_used:${userId}`; // counter — separate key
}

The same approach is used in API key rotation: apikey:{key} and user_apikey:{userId} are separate keys, written/deleted independently rather than as a single record.

When two pieces of data are updated at different rates or by different writers, splitting them into separate KV keys avoids the read-modify-write race: each writer touches only its own key, so neither can clobber the other's update.
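A minimal sketch of how the two keys combine at read time, under the assumption that the grant and counter are small JSON records (the `Map` stands in for KV, and the function names are hypothetical):

```typescript
// In-memory stand-in for KV.
const store = new Map<string, string>();

// The grant is written once and never rewritten afterwards.
function grantExtraQuota(userId: string, amount: number): void {
  store.set(`extra_quota:${userId}`, JSON.stringify({ amount }));
}

// Only the counter key changes; incrementing it never touches the grant,
// so a concurrent grant write cannot be lost to a counter update.
function recordUse(userId: string, n: number): void {
  const key = `extra_quota_used:${userId}`;
  const used = JSON.parse(store.get(key) ?? '{"used":0}').used;
  store.set(key, JSON.stringify({ used: used + n }));
}

// Reads join the two keys.
function remaining(userId: string): number {
  const grant = JSON.parse(store.get(`extra_quota:${userId}`) ?? '{"amount":0}').amount;
  const used = JSON.parse(store.get(`extra_quota_used:${userId}`) ?? '{"used":0}').used;
  return Math.max(0, grant - used);
}
```
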

3. Version Markers in Cached Records

When a cached record might outlive a regenerated authoritative record, embed a version marker:

// relay_server/src/usage-tracking.ts
const usedRecord = { used: ..., grant_created_at: record.created_at };
// ...
if (usedRecord.grant_created_at === grant.created_at) {
  used = usedRecord.used;
}
// else: stale used counter from previous grant — treat as 0

The same pattern in RemoteDevCore/SessionCrypto.swift uses a version prefix in the HKDF salt:

let salt = Data("remotedev-v2".utf8) + Data(sortedNonces.joined().utf8)

v1 and v2 keys never collide. A v1 client cannot accidentally derive the same key as a v2 client even with identical nonces.
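The usage-tracking variant of the check can be sketched as follows; the types and the `effectiveUsed` helper are illustrative, not names from the codebase:

```typescript
interface Grant { amount: number; created_at: number; }
interface UsedRecord { used: number; grant_created_at: number; }

// The cached counter carries the creation timestamp of the grant it counts
// against. A mismatch means the counter survived from an older grant and
// must be ignored, i.e. treated as 0.
function effectiveUsed(grant: Grant, cached: UsedRecord | null): number {
  if (cached && cached.grant_created_at === grant.created_at) {
    return cached.used;
  }
  return 0; // stale counter from a previous grant
}
```
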

4. Eager Cleanup Before Possible Hibernation

When state is about to become invalid, clean up immediately rather than waiting for the natural cleanup path:

// relay_server/src/room.ts (Mac socket replacement)
// Eagerly clean up old socket's storage entries so stale role mappings
// don't confuse hibernation recovery (the old socket's webSocketClose
// may fire after we hibernate, leaving two "mac" role entries).
const oldTags = this.state.getTags(this.macSocket);
if (oldTags.length > 0) {
  await this.state.storage.delete(`role:${oldTags[0]}`);
  await this.state.storage.delete(`user:${oldTags[0]}`);
  await this.state.storage.delete(`ip:${oldTags[0]}`);
  await this.state.storage.delete(`clientRole:${oldTags[0]}`);
}

The cleanup happens here even though webSocketClose would also handle it — because hibernation may occur between the two events, leaving stale role mappings that confuse the next wake.
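Reduced to its essence, the pattern is "delete at replacement time, not at close time." A hypothetical sketch (the `Map` stands in for DO storage, and both function names are invented):

```typescript
// In-memory stand-in for DO storage.
const roomStorage = new Map<string, string>();

function registerSocket(tag: string, role: string): void {
  roomStorage.set(`role:${tag}`, role);
}

// Replacement eagerly removes the old socket's entry. Waiting for the old
// socket's close handler is unsafe: it may fire after hibernation, leaving
// two "mac" role entries for recovery to choose between.
function replaceMacSocket(oldTag: string | null, newTag: string): void {
  if (oldTag !== null) {
    roomStorage.delete(`role:${oldTag}`);
  }
  roomStorage.set(`role:${newTag}`, "mac");
}
```
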

5. Buffered Output Through a Serial Queue

Both the Mac PTY layer and the relay batching layer use the same pattern: a private serial dispatch queue, an in-memory buffer, and two flush triggers (size threshold and timer).

PTY output:

// mac_agent/Sources/MacAgentLib/PTYSession.swift
private let outputQueue = DispatchQueue(label: "pty.output")
// ...
if self.outputBuffer.count >= 32768 {
    self.flushOutput()
} else if self.flushTimer == nil {
    // Schedule 200ms flush timer
}

Relay batching:

// mac_agent/Sources/MacAgentLib/RelayConnection.swift
private let batchQueue = DispatchQueue(label: "relay.batch")
private static let batchFlushInterval: TimeInterval = 0.05  // 50ms
private static let batchMaxSize = 20

Both use serial queues to avoid locks. Both have explicit dual triggers (size + time). Both flush on disconnect.

6. Channel Binding Wherever Identity Could Be Substituted

The challenge HMAC binds to ephemeral keys to prevent relay-level substitution:

// mac_agent/Sources/RemoteDevCore/SessionCrypto.swift
/// Compute challenge-response HMAC with channel binding.
/// When both ephemeral keys are present, they are sorted and appended to the nonce
/// to bind the HMAC to the specific ECDH ephemeral keys, preventing a relay MITM
/// from substituting ephemeral keys while passing TOFU checks on identity keys.
public static func challengeHMAC(nonce: Data, roomSecret: String,
                                 localEphemeralKey: String = "",
                                 peerEphemeralKey: String = "") -> Data {
    var payload = nonce
    if !localEphemeralKey.isEmpty && !peerEphemeralKey.isEmpty {
        let sorted = [localEphemeralKey, peerEphemeralKey].sorted()
        payload.append(Data(sorted.joined().utf8))
    }
    return hmacSHA256(data: payload, key: Data(roomSecret.utf8))
}

The same idea — bind authentication to the specific channel — appears in HKDF salt construction (binding the session key to specific nonces) and in the IPC versioning check (binding requests to a specific protocol version).
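The Swift construction above ports directly to the relay side. The following is a hypothetical TypeScript equivalent (the function name is invented; Node's `crypto.createHmac` replaces CryptoKit). Note that sorting canonicalizes key order, so both peers compute the same payload regardless of which key is "local":

```typescript
import { createHmac } from "crypto";

function challengeHmac(nonce: Buffer, roomSecret: string,
                       localEphemeralKey = "", peerEphemeralKey = ""): Buffer {
  let payload = nonce;
  if (localEphemeralKey !== "" && peerEphemeralKey !== "") {
    // Sort so both sides build an identical payload from the same key pair.
    const sorted = [localEphemeralKey, peerEphemeralKey].sort();
    payload = Buffer.concat([nonce, Buffer.from(sorted.join(""), "utf8")]);
  }
  return createHmac("sha256", Buffer.from(roomSecret, "utf8")).update(payload).digest();
}
```
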

7. Idempotency Caches for Network Retries

Refresh token rotation caches its result for a 5-minute grace period:

// relay_server/src/api-keys.ts
// Cache result so concurrent/retried requests with the same token get identical response.
// TTL matches grace period (5 min).
await kv.put(`rotated:${refreshToken}`, JSON.stringify(result), { expirationTtl: 300 });

Apple notification dedup uses the same pattern with a longer TTL:

// relay_server/src/subscription-do.ts
const UUID_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

if (notificationUUID) {
  const existing = await this.state.storage.get(`uuid:${notificationUUID}`);
  if (existing) {
    return Response.json({ handled: true, detail: "duplicate" });
  }
}

Any operation that has side effects and might be retried gets an idempotency cache keyed on a unique identifier. The TTL is sized to outlive realistic retry windows.

8. Lazy Expiry Instead of Background Jobs

The subscription DO checks expiry on read:

// relay_server/src/subscription-do.ts
// Lazy expiry check — downgrade if subscription has expired
const sub = await this.state.storage.get("subscription") as Record<string, unknown> | null;
if (sub?.expires_date && (sub.expires_date as number) < Date.now() && tier !== "free") {
  tier = "free";
  await this.state.storage.put("tier", "free");
  await this.state.storage.delete("subscription");
}

The room DO checks idle timeout in the alarm callback. The KV TTLs handle automatic cleanup of API keys, refresh tokens, room registrations, and usage records. There are no background sweep jobs anywhere — every cleanup either uses TTLs or runs lazily on the next access.

This is consistent with the serverless model: there’s no persistent process to run a sweep, and TTLs/lazy checks are cheaper than scheduled jobs.
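The lazy-expiry read path reduces to a small sketch; the `Map` stands in for DO storage and `getTier` takes the clock as a parameter for testability (the real code uses `Date.now()`):

```typescript
interface Subscription { expires_date: number; }

// In-memory stand-in for DO storage.
const storage = new Map<string, unknown>();

// Reads perform the downgrade themselves; no background sweep exists.
function getTier(now: number): string {
  let tier = (storage.get("tier") as string) ?? "free";
  const sub = storage.get("subscription") as Subscription | undefined;
  if (sub && sub.expires_date < now && tier !== "free") {
    tier = "free";                 // downgrade on read
    storage.set("tier", "free");
    storage.delete("subscription");
  }
  return tier;
}
```
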

9. State Machine Restoration Through Tagged Sockets

When the Room DO wakes from hibernation, it reconstructs in-memory socket references from durable storage:

// relay_server/src/room.ts
this.state.blockConcurrencyWhile(async () => {
  const sockets = this.state.getWebSockets();
  for (const ws of sockets) {
    const tags = this.state.getTags(ws);
    if (tags.length > 0) {
      const role = await this.state.storage.get<string>(`role:${tags[0]}`);
      // ... restore based on role
    }
  }
});

Every piece of in-memory state that survives hibernation has a corresponding storage representation. The constructor’s job is to read storage and rebuild memory.
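Stripped of the Cloudflare APIs, the restoration step is a fold from tagged sockets plus durable entries back to an in-memory index. A hypothetical stand-in (`TaggedSocket` and `restore` are invented names):

```typescript
interface TaggedSocket { tag: string; }

// Rebuild the in-memory role index purely from surviving sockets and their
// storage entries; sockets with no stored role are ignored.
function restore(sockets: TaggedSocket[], storage: Map<string, string>): Map<string, TaggedSocket> {
  const byRole = new Map<string, TaggedSocket>();
  for (const ws of sockets) {
    const role = storage.get(`role:${ws.tag}`);
    if (role !== undefined) byRole.set(role, ws);
  }
  return byRole;
}
```
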

10. Explicit Protocol Versioning in Multiple Layers

Three independent layers carry version information:

  • Cryptographic: "remotedev-v1" / "remotedev-v2" salt prefix
  • IPC: ipcProtocolVersion: Int = 1 between CLI and helper daemon
  • Server: /server-info endpoint returns PROTOCOL_VERSION for client migration decisions

Each version number is independent of the others. The cryptographic version protects against cross-protocol key reuse; the IPC version protects against CLI/daemon mismatches; the server version coordinates client upgrades.
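At the server layer, the client's migration decision is a simple comparison. A hypothetical sketch of that decision (the function name, return values, and client constant are all invented for illustration):

```typescript
const CLIENT_PROTOCOL_VERSION = 2; // illustrative value

// Decide what to do after fetching the server's PROTOCOL_VERSION.
function migrationAction(serverProtocolVersion: number): "ok" | "upgrade-client" | "legacy-mode" {
  if (serverProtocolVersion === CLIENT_PROTOCOL_VERSION) return "ok";
  if (serverProtocolVersion > CLIENT_PROTOCOL_VERSION) return "upgrade-client";
  return "legacy-mode"; // older server: fall back rather than fail
}
```
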