Surviving Cloudflare DO Hibernation Without Dropping WebSocket Connections
Cloudflare Durable Objects hibernate when there are no active requests. For most DO use cases this is invisible — the next HTTP request wakes the DO and restores state from DO storage. But a WebSocket relay has active connections that must survive hibernation without being dropped.
The Hibernation Problem
With the standard WebSocket API, a DO that holds open connections stays pinned in memory and is billed for the full duration of every connection, even during idle periods. Cloudflare's solution is the WebSocket Hibernation API: the DO can be evicted from memory while its WebSocket connections are held open by the runtime. When a message arrives, the DO is woken and the message is delivered.
The catch: in-memory state (like macSocket and iosSocket) is lost during hibernation. The DO must be able to reconstruct its in-memory state from durable storage on every wake.
Tagging Sockets for Recovery
Each WebSocket is accepted with a UUID tag:
// relay_server/src/room.ts
const socketId = crypto.randomUUID();
this.state.acceptWebSocket(server, [socketId]);
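The tag persists through hibernation. When the DO wakes, state.getWebSockets() returns all sockets still attached to the DO, and state.getTags(ws) returns the tag array for each one. Because every socket is accepted with exactly one tag, tags[0] is its UUID.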
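To make the lookup concrete, here is a minimal sketch of mapping each surviving socket back to its tag. TagSource and socketIdsByWebSocket are hypothetical names; the interface mirrors only the two state methods mentioned above so the snippet can run outside the Workers runtime.

```typescript
// Hypothetical minimal view of DurableObjectState: only the two methods
// used for tag recovery. S stands in for the runtime's WebSocket type.
interface TagSource<S> {
  getWebSockets(): S[];
  getTags(ws: S): string[];
}

// Build a socket -> socketId map from the tags that survived hibernation.
// getTags returns an array; this sketch assumes the UUID is the first
// (and only) tag, as in the acceptWebSocket call above.
function socketIdsByWebSocket<S>(state: TagSource<S>): Map<S, string> {
  const ids = new Map<S, string>();
  for (const ws of state.getWebSockets()) {
    const [socketId] = state.getTags(ws);
    if (socketId !== undefined) ids.set(ws, socketId);
  }
  return ids;
}
```

Sockets without a tag are left out of the map, so callers can treat a missing entry as "unknown socket" rather than guessing.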
Restoring State in the Constructor
The Room constructor runs after every hibernation wake. It uses the socket tags to look up which role each socket had:
constructor(state: DurableObjectState, env: Env) {
  this.state = state;
  this.env = env;
  this.state.blockConcurrencyWhile(async () => {
    const sockets = this.state.getWebSockets();
    console.log(`[room] constructor: restoring ${sockets.length} sockets from hibernation`);
    for (const ws of sockets) {
      const tags = this.state.getTags(ws);
      if (tags.length > 0) {
        const role = await this.state.storage.get<string>(`role:${tags[0]}`);
        const userId = await this.state.storage.get<string>(`user:${tags[0]}`);
        // Skip sockets that are not fully open (e.g. CLOSING after replacement)
        if (ws.readyState !== WebSocket.READY_STATE_OPEN) {
          console.warn(`[room] skipping non-OPEN socket tag=${tags[0]} readyState=${ws.readyState}`);
          continue;
        }
        if (role === "mac") this.macSocket = ws;
        else if (role === "ios") this.iosSocket = ws;
        if (userId) this.userIdBySocket.set(ws, userId);
      }
    }
    // Restore session timing for duration tracking
    const savedStart = await this.state.storage.get<number>("session_start");
    if (savedStart) this.sessionStartTime = savedStart;
    const savedFlush = await this.state.storage.get<number>("last_usage_flush");
    if (savedFlush) this.lastUsageFlush = savedFlush;
  });
}
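blockConcurrencyWhile holds back delivery of every other event to the DO (fetch requests, WebSocket messages, alarms) until the async callback resolves, so no handler runs against half-restored state.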
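The restoration loop can also be pulled out into a function that is testable without the Workers runtime. The following is a sketch under stated assumptions, not the relay's actual code: SocketLike, RestoreState, RestoredSockets, and restoreSockets are hypothetical names, and OPEN_STATE is hard-coded to 1, the numeric value of the OPEN ready state.

```typescript
// Hypothetical stand-ins for the runtime types, kept minimal so the
// sketch runs outside Workers.
const OPEN_STATE = 1;

interface SocketLike { readyState: number; }
interface RestoreState<S extends SocketLike> {
  getWebSockets(): S[];
  getTags(ws: S): string[];
  storage: { get<T>(key: string): Promise<T | undefined> };
}
interface RestoredSockets<S> { mac?: S; ios?: S; userIds: Map<S, string>; }

// Rebuild the role assignments from tags plus stored metadata.
// Unlike the constructor above, readyState is checked before storage
// is read, so non-OPEN sockets cost no lookups.
async function restoreSockets<S extends SocketLike>(
  state: RestoreState<S>,
): Promise<RestoredSockets<S>> {
  const restored: RestoredSockets<S> = { userIds: new Map() };
  for (const ws of state.getWebSockets()) {
    const [socketId] = state.getTags(ws);
    if (socketId === undefined || ws.readyState !== OPEN_STATE) continue;
    const role = await state.storage.get<string>(`role:${socketId}`);
    const userId = await state.storage.get<string>(`user:${socketId}`);
    if (role === "mac") restored.mac = ws;
    else if (role === "ios") restored.ios = ws;
    if (userId) restored.userIds.set(ws, userId);
  }
  return restored;
}
```

Inside the real constructor, a function like this would be awaited within the blockConcurrencyWhile callback and its result copied onto macSocket, iosSocket, and userIdBySocket.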
What Gets Stored Per Socket
Metadata is written to DO storage at two moments. When the WebSocket is first accepted, the fetch handler stores the IP, declared client role (from the X-Client-Role header), and user ID:
await this.state.storage.put(`ip:${socketId}`, clientIP);
if (clientRole) {
  await this.state.storage.put(`clientRole:${socketId}`, clientRole);
}
if (userId) {
  await this.state.storage.put(`user:${socketId}`, userId);
}
The role:${socketId} key is only written later, once a register_room or join_room message confirms the socket’s role:
// inside handleRegister, after the register_room message validates
await this.state.storage.put(`role:${tags[0]}`, "mac" as ClientRole);
All four keys are deleted together when the socket closes or is replaced.
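Since the four keys always travel together, it can help to generate them in one place. The sketch below is illustrative only: socketKeys, deleteSocketMetadata, and the KeyDeleter interface are hypothetical names. It relies on the array form of the storage delete() call, which removes several keys at once and resolves to the number of keys actually deleted.

```typescript
// The four per-socket key prefixes used in this article.
const SOCKET_KEY_PREFIXES = ["ip", "clientRole", "user", "role"] as const;

// Build every storage key that belongs to one socket.
function socketKeys(socketId: string): string[] {
  return SOCKET_KEY_PREFIXES.map((prefix) => `${prefix}:${socketId}`);
}

// Hypothetical minimal view of DO storage: delete() with an array of keys,
// resolving to the number of keys that existed and were removed.
interface KeyDeleter {
  delete(keys: string[]): Promise<number>;
}

// Remove all metadata for a socket in a single storage call.
async function deleteSocketMetadata(
  storage: KeyDeleter,
  socketId: string,
): Promise<number> {
  return storage.delete(socketKeys(socketId));
}
```

Keeping the prefix list in one constant means a fifth per-socket key cannot be added to the write path and forgotten in the cleanup path.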
The Non-OPEN Socket Edge Case
After the DO wakes, some sockets may be in CLOSING state — for example, if the Mac reconnected and the old socket was closed just before hibernation. The constructor skips these:
if (ws.readyState !== WebSocket.READY_STATE_OPEN) {
  console.warn(`[room] skipping non-OPEN socket tag=${tags[0]} readyState=${ws.readyState}`);
  continue;
}
Without this check, a CLOSING socket could be assigned to macSocket, and messages relayed to it would never reach the Mac, with nothing in the logs to show why.
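The same readyState test is useful at send time, not only during recovery. As a small sketch (sendIfOpen, SendableSocket, and WS_OPEN are hypothetical names, and WS_OPEN is hard-coded to 1, the numeric value of the OPEN ready state), a guard can report whether a message actually went out:

```typescript
// Numeric value of the OPEN ready state.
const WS_OPEN = 1;

// Hypothetical minimal socket shape so the sketch runs outside Workers.
interface SendableSocket {
  readyState: number;
  send(message: string): void;
}

// Send only when the socket exists and is OPEN. The boolean result lets
// the caller log the drop or clear a stale reference instead of failing
// without a trace.
function sendIfOpen(ws: SendableSocket | undefined, message: string): boolean {
  if (!ws || ws.readyState !== WS_OPEN) return false;
  ws.send(message);
  return true;
}
```

A caller might use it as: if (!sendIfOpen(this.macSocket, payload)) console.warn("[room] mac socket not open, message dropped").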
Old Socket Cleanup on Replacement
When the Mac reconnects and a new socket replaces the old one, the old socket’s storage entries are deleted immediately:
// Eagerly clean up old socket's storage entries so stale role mappings
// don't confuse hibernation recovery (the old socket's webSocketClose
// may fire after we hibernate, leaving two "mac" role entries).
const oldTags = this.state.getTags(this.macSocket);
if (oldTags.length > 0) {
  await this.state.storage.delete(`role:${oldTags[0]}`);
  await this.state.storage.delete(`user:${oldTags[0]}`);
  await this.state.storage.delete(`ip:${oldTags[0]}`);
  await this.state.storage.delete(`clientRole:${oldTags[0]}`);
}
The comment explains why: if the DO hibernates before webSocketClose fires for the old socket, storage holds two role entries with the value "mac" at the same time, and the constructor could restore the wrong socket as macSocket on the next wake.
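Eager cleanup handles the replacement path. A complementary safeguard, which is not part of the relay code shown here and is offered only as a sketch, is to sweep role entries whose socket ID no longer matches any live tag. sweepStaleRoles and RoleStorage are hypothetical names; the sketch relies on the storage list() call with a prefix option and on the array form of delete().

```typescript
// Hypothetical minimal view of DO storage for this sketch.
interface RoleStorage {
  list<T>(options: { prefix: string }): Promise<Map<string, T>>;
  delete(keys: string[]): Promise<number>;
}

// Delete role:<id> entries whose socket ID is not among the live tags.
// Returns the socket IDs whose role entries were removed.
// Note: delete() accepts a bounded number of keys per call (128 in
// Cloudflare's documentation), which is ample for a two-socket room.
async function sweepStaleRoles(
  storage: RoleStorage,
  liveSocketIds: Set<string>,
): Promise<string[]> {
  const prefix = "role:";
  const entries = await storage.list<string>({ prefix });
  const staleIds = Array.from(entries.keys())
    .map((key) => key.slice(prefix.length))
    .filter((id) => !liveSocketIds.has(id));
  if (staleIds.length > 0) {
    await storage.delete(staleIds.map((id) => `${prefix}${id}`));
  }
  return staleIds;
}
```

The live ID set would come from calling getTags on every socket returned by getWebSockets. This sketch only sweeps the role prefix; the user, ip, and clientRole prefixes would need the same treatment to avoid leaking entries.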
Alarm Rescheduling
The heartbeat alarm is rescheduled at the end of every alarm() invocation:
// Reschedule next heartbeat check
this.state.storage.setAlarm(now + HEARTBEAT_INTERVAL);
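If the DO is hibernating when the alarm comes due, the alarm itself wakes the DO: the constructor runs first, then alarm(). Cloudflare guarantees at-least-once execution for alarms, so the heartbeat chain keeps going regardless of hibernation state.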
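Rescheduling inside alarm() keeps the chain alive once it has started, but something has to schedule the first alarm. The article does not show that step, so the following is only one plausible sketch: ensureHeartbeatAlarm and AlarmStorage are hypothetical names, and the 30-second interval is an assumed value, since HEARTBEAT_INTERVAL is not given. It uses the storage getAlarm() call, which resolves to the pending alarm time or null.

```typescript
// Assumed interval; the article does not state HEARTBEAT_INTERVAL's value.
const HEARTBEAT_INTERVAL_MS = 30000;

// Hypothetical minimal view of the DO storage alarm methods.
interface AlarmStorage {
  getAlarm(): Promise<number | null>;
  setAlarm(scheduledTime: number): Promise<void>;
}

// Schedule a heartbeat alarm only if none is pending, so repeated
// connections do not keep pushing the next check further out.
// Returns the time (ms since epoch) of the alarm that is now pending.
async function ensureHeartbeatAlarm(
  storage: AlarmStorage,
  now: number,
): Promise<number> {
  const pending = await storage.getAlarm();
  if (pending !== null) return pending;
  const next = now + HEARTBEAT_INTERVAL_MS;
  await storage.setAlarm(next);
  return next;
}
```

A natural place to call it would be the fetch handler, right after a WebSocket is accepted, passing Date.now() as the second argument.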