Most web applications start with a simple request-response model.
The browser asks for something, the server responds, and the connection is done.
That model works well for pages, APIs, forms, dashboards, and most CRUD applications. But some features need something different:
- chat messages that appear instantly
- live sports scores
- multiplayer game state
- collaborative editing cursors
- trading prices
- delivery tracking
- real-time notifications
- terminal sessions in the browser
For these features, repeatedly asking the server “anything new?” becomes wasteful and slow.
This is where WebSockets are useful.
In this post, we will look at how WebSockets work, what actually happens during the connection upgrade, how data flows after the connection is open, and what we should check before choosing WebSockets for a system.
The Problem WebSockets Solve
HTTP is naturally request-response.
A client sends a request:
GET /notifications
The server sends a response:
[
  { "id": 1, "message": "Your build finished" }
]
Then the request is complete.
If the client wants fresh data later, it needs another request.
For real-time features, the simplest approach is polling:
Every 5 seconds:
browser -> GET /notifications
server -> latest notifications
Polling is easy to implement, but it has tradeoffs:
- it sends many requests even when nothing changed
- updates are delayed until the next poll interval
- short polling intervals increase server load
- long polling intervals make the product feel less real-time
There is also long polling, where the server keeps the request open until data is available or a timeout happens. Long polling can work, but it still creates a repeated request cycle.
WebSockets use a different model.
The client and server establish one long-lived connection. After that, either side can send messages whenever it has something to say.
What Is a WebSocket?
A WebSocket is a persistent, full-duplex connection between a client and a server.
There are two important parts in that sentence:
- persistent: the connection stays open instead of closing after one response
- full-duplex: both client and server can send messages independently
With normal HTTP APIs, the server usually speaks only after the client asks.
With WebSockets, the server can push data to the client at any time:
Client                    Server
  |                          |
  | ---- open connection --> |
  |                          |
  | <-- new chat message --- |
  |                          |
  | ---- typing event -----> |
  |                          |
  | <-- user joined room --- |
  |                          |
This makes WebSockets a good fit for interactive systems where latency matters and data flows in both directions.
The WebSocket URL
WebSockets use their own URL schemes:
ws://example.com/socket
wss://example.com/socket
The difference is similar to HTTP and HTTPS:
| Scheme | Meaning |
|---|---|
| ws:// | WebSocket over an unencrypted connection |
| wss:// | WebSocket over TLS |
In production, we should almost always use wss://.
If the page is loaded over HTTPS, browsers also expect secure WebSocket connections. Trying to connect from an HTTPS page to ws:// is usually blocked as mixed content.
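One way to avoid mixed-content failures is to derive the socket URL from the page URL instead of hard-coding the scheme. The following is a minimal sketch; `toWebSocketUrl` and the `/socket` path are illustrative names, not part of any browser API.

```javascript
// Hypothetical helper: derive a WebSocket URL from the page URL so that
// HTTPS pages get wss:// and HTTP pages get ws://.
function toWebSocketUrl(pageUrl, path) {
  const url = new URL(path, pageUrl);
  url.protocol = url.protocol === "https:" ? "wss:" : "ws:";
  return url.toString();
}

// In a browser, pageUrl would typically be window.location.href.
console.log(toWebSocketUrl("https://example.com/app", "/socket"));
// prints "wss://example.com/socket"
```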
How the WebSocket Handshake Works
A WebSocket connection starts as an HTTP request.
That is a useful detail because it means WebSockets can use the same ports as normal web traffic:
- port 80 for ws://
- port 443 for wss://
The browser sends an HTTP request that asks the server to upgrade the connection:
GET /socket HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
The key headers are:
| Header | Purpose |
|---|---|
| Upgrade: websocket | Tells the server the client wants to switch protocols |
| Connection: Upgrade | Marks this request as a protocol upgrade |
| Sec-WebSocket-Key | A browser-generated value used by the server to prove it understands WebSockets |
| Sec-WebSocket-Version | The WebSocket protocol version |
If the server accepts, it replies with:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: ...
The 101 Switching Protocols status is the important signal.
After that response, the connection is no longer used like a normal HTTP request-response exchange. It becomes a WebSocket connection.
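To make the key exchange concrete, here is a small sketch of how a server derives Sec-WebSocket-Accept from Sec-WebSocket-Key, following RFC 6455: append a fixed GUID to the key, hash with SHA-1, and base64-encode the result. The function name `computeAcceptValue` is ours, and the snippet uses CommonJS `require` so it runs as a plain Node.js script. Libraries such as ws do this for us, so this is only for understanding.

```javascript
const { createHash } = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Sec-WebSocket-Accept = base64(SHA-1(Sec-WebSocket-Key + GUID))
function computeAcceptValue(secWebSocketKey) {
  return createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

console.log(computeAcceptValue("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=" (the sample value from RFC 6455)
```

The client performs the same calculation and rejects the connection if the values differ, which is how it knows the server really speaks WebSocket rather than echoing headers.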
What Happens After the Upgrade?
Once the WebSocket is open, data moves as WebSocket frames.
At the application level, we usually think in messages:
{
  "type": "chat.message",
  "roomId": "general",
  "text": "Hello everyone"
}
Under the hood, the protocol frames those messages so the receiver can identify message boundaries.
That is different from raw TCP. TCP gives us a byte stream, not application-level messages. WebSocket adds message framing on top of TCP, which is one reason it is convenient in browser applications.
WebSockets can carry:
- text messages
- binary messages
- ping and pong frames
- close frames
Most application code works with text messages, often JSON. Binary messages are useful when sending compact data, files, audio chunks, video chunks, or game state.
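Application code rarely touches frames directly, but looking at the header once makes the framing idea less abstract. The sketch below reads the first bytes of a frame as laid out in RFC 6455; `parseFrameHeader` and its return shape are our own illustrative choices, and it ignores extensions and does not unmask the payload.

```javascript
// Opcode names from the WebSocket protocol (RFC 6455).
const OPCODES = {
  0x0: "continuation",
  0x1: "text",
  0x2: "binary",
  0x8: "close",
  0x9: "ping",
  0xa: "pong",
};

// Parse the header of one frame from a Uint8Array.
// Sketch only: a 64-bit length is converted to a Number for simplicity.
function parseFrameHeader(bytes) {
  const fin = (bytes[0] & 0x80) !== 0;    // final fragment of a message?
  const opcode = bytes[0] & 0x0f;         // frame type
  const masked = (bytes[1] & 0x80) !== 0; // client-to-server frames are masked
  let payloadLength = bytes[1] & 0x7f;
  let payloadOffset = 2;

  if (payloadLength === 126) {
    // The next 2 bytes hold the real length (big-endian).
    payloadLength = (bytes[2] << 8) | bytes[3];
    payloadOffset = 4;
  } else if (payloadLength === 127) {
    // The next 8 bytes hold the real length (big-endian).
    const view = new DataView(bytes.buffer, bytes.byteOffset + 2, 8);
    payloadLength = Number(view.getBigUint64(0));
    payloadOffset = 10;
  }
  if (masked) payloadOffset += 4; // a 4-byte masking key precedes the payload

  const type = OPCODES[opcode] ?? "reserved";
  return { fin, type, masked, payloadLength, payloadOffset };
}

// Unmasked single-frame text message "Hello" (sample bytes from RFC 6455).
const frame = new Uint8Array([0x81, 0x05, 0x48, 0x65, 0x6c, 0x6c, 0x6f]);
console.log(parseFrameHeader(frame));
```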
A Minimal Browser Example
The browser API is small.
// Open a secure WebSocket connection to the server.
const socket = new WebSocket("wss://example.com/socket");

// Once the connection is open, join a room.
socket.addEventListener("open", () => {
  socket.send(JSON.stringify({
    type: "chat.join",
    roomId: "general"
  }));
});

// Each incoming text message is expected to be JSON.
socket.addEventListener("message", (event) => {
  const message = JSON.parse(event.data);
  console.log("Received:", message);
});

socket.addEventListener("close", () => {
  console.log("Socket closed");
});

// The error event carries a generic Event, not a detailed Error object.
socket.addEventListener("error", (event) => {
  console.error("Socket error:", event);
});
The main operations are:
- create a WebSocket
- wait for open
- send messages with socket.send(...)
- receive messages through the message event
- handle close and error
The API looks simple, but production behavior needs more structure.
A Minimal Node.js Server Example
In Node.js, the popular ws package gives us a straightforward WebSocket server.
import { WebSocketServer } from "ws";

const server = new WebSocketServer({ port: 8080 });

server.on("connection", (socket) => {
  // Greet each new client.
  socket.send(JSON.stringify({
    type: "system.connected"
  }));

  socket.on("message", (data) => {
    let message;
    try {
      message = JSON.parse(data.toString());
    } catch {
      return; // ignore payloads that are not valid JSON instead of crashing
    }

    if (message?.type === "chat.message") {
      // Broadcast to every client whose connection is still open.
      server.clients.forEach((client) => {
        if (client.readyState === client.OPEN) {
          client.send(JSON.stringify(message));
        }
      });
    }
  });
});
This example broadcasts every chat message to every connected client.
It is intentionally minimal. In a real system, we would add authentication, room membership, schema validation, rate limiting, error handling, and a way to scale beyond one process.
WebSockets vs HTTP Polling vs Server-Sent Events
WebSockets are not the only way to build live updates.
It helps to compare the options:
| Approach | Direction | Best For |
|---|---|---|
| Polling | Client repeatedly asks server | Simple updates where delay is acceptable |
| Long polling | Client waits on a held request | Real-time-ish updates without WebSocket infrastructure |
| Server-Sent Events | Server pushes to client | One-way live streams such as notifications or logs |
| WebSockets | Client and server both push | Interactive two-way systems |
We should think about the choice this way:
- If updates are rare and a delay is fine, use polling.
- If the server only needs to stream updates to the browser, consider Server-Sent Events.
- If both sides need low-latency communication, use WebSockets.
WebSockets are powerful, but they are not automatically the simplest option.
Common WebSocket Use Cases
Chat
Chat is the classic WebSocket example.
Users need to send messages, receive messages, see typing indicators, and update presence. A persistent connection maps naturally to this interaction.
Collaborative Editing
In collaborative editors, users need to see changes from other users quickly:
- document edits
- cursor movement
- selections
- comments
- user presence
The challenge here is not just transport. The harder part is conflict resolution, ordering, and consistency. WebSockets move messages, but the application still needs a correct collaboration model.
Live Dashboards
Operational dashboards often show changing metrics, job states, logs, queue depth, or alerts.
WebSockets can push changes immediately instead of making the browser poll every few seconds.
Multiplayer Games
Games often require frequent bidirectional messages.
For browser games, WebSockets are commonly used because they work everywhere the browser works. For very latency-sensitive games, WebRTC data channels may be considered in the browser, and custom UDP-based protocols in native clients, but WebSockets are still a practical starting point for many real-time browser games.
Browser Terminals
Web terminals are a strong WebSocket use case.
The browser sends keystrokes to the server, and the server streams terminal output back to the browser. Both directions matter, and the interaction needs low latency.
Authentication and Authorization
Authentication with WebSockets deserves careful design.
The opening handshake is HTTP, so we can use familiar mechanisms:
- cookies
- session IDs
- bearer tokens
- signed short-lived tokens
For browser clients, cookies are often convenient when the WebSocket endpoint lives on the same site as the web application. Tokens are common for API-style clients.
One mistake we should avoid is treating a successful socket connection as permanent authorization.
Authorization should still be checked at the application message level.
For example:
- Can this user join this room?
- Can this user publish to this topic?
- Can this user subscribe to this account’s updates?
- Is the token still valid?
- Was the user removed from the workspace after connecting?
A WebSocket connection can live for minutes or hours. Permissions can change during that time.
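A per-message check could look roughly like the sketch below. Everything here is an assumption made for illustration: the `session` shape (`userId`, `tokenExpiresAt`), the in-memory `roomMembers` map, and the reason strings. A real system would consult its session service or database instead.

```javascript
// Hypothetical in-memory permission store (roomId -> set of user IDs).
const roomMembers = new Map([["general", new Set(["alice", "bob"])]]);

// Decide whether a message from an already-connected user may proceed.
// `session` is an assumed shape: { userId, tokenExpiresAt } in ms since epoch.
function authorizeMessage(session, message, now = Date.now()) {
  // The token may have expired since the handshake.
  if (session.tokenExpiresAt <= now) {
    return { allowed: false, reason: "token_expired" };
  }
  // Room membership may have changed since the user connected.
  if (message.type === "chat.join" || message.type === "chat.message") {
    const members = roomMembers.get(message.roomId);
    if (!members || !members.has(session.userId)) {
      return { allowed: false, reason: "not_a_room_member" };
    }
  }
  return { allowed: true };
}

const session = { userId: "alice", tokenExpiresAt: Date.now() + 60_000 };
console.log(authorizeMessage(session, { type: "chat.message", roomId: "general" }));
```

The important property is that the check runs on every message, so removing a user from a room takes effect even while their socket stays open.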
Message Design
A WebSocket gives us transport. It does not design the application protocol for us.
We should use explicit message types:
{
  "type": "chat.message.created",
  "requestId": "req_123",
  "roomId": "general",
  "payload": {
    "text": "WebSockets make sense here"
  }
}
A few practical rules help:
- include a type field
- validate every incoming message
- keep payloads small
- include IDs for correlation and deduplication
- version the protocol if clients may lag behind servers
- define error messages clearly
Without structure, WebSocket code can turn into a pile of string comparisons and implicit assumptions.
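As a rough illustration of the "validate every incoming message" rule, the sketch below parses and checks a raw message by hand. The accepted types, the 2000-character limit, and the error codes are assumptions for this example; in practice a schema validation library may be a better fit.

```javascript
// Message types this sketch accepts; the names mirror the examples in this post.
const KNOWN_TYPES = new Set(["chat.join", "chat.message.created"]);
const MAX_TEXT_LENGTH = 2000; // assumed limit

// Returns { ok: true, message } or { ok: false, error }.
function parseClientMessage(raw) {
  let message;
  try {
    message = JSON.parse(raw);
  } catch {
    return { ok: false, error: "invalid_json" };
  }
  if (typeof message !== "object" || message === null || Array.isArray(message)) {
    return { ok: false, error: "invalid_shape" };
  }
  if (!KNOWN_TYPES.has(message.type)) {
    return { ok: false, error: "unknown_type" };
  }
  if (typeof message.roomId !== "string" || message.roomId === "") {
    return { ok: false, error: "missing_room_id" };
  }
  if (message.type === "chat.message.created") {
    const text = message.payload?.text;
    if (typeof text !== "string" || text.length === 0 || text.length > MAX_TEXT_LENGTH) {
      return { ok: false, error: "invalid_text" };
    }
  }
  return { ok: true, message };
}

console.log(parseClientMessage('{"type":"chat.join","roomId":"general"}'));
```

Returning a result object instead of throwing keeps the message handler simple: it can send a structured error back to the client and carry on.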
Heartbeats and Dead Connections
One practical issue with WebSockets is detecting dead connections.
A connection can disappear without a clean close event:
- the user closes a laptop
- mobile network changes
- a proxy drops an idle connection
- Wi-Fi disconnects
- a server process crashes
The application should not assume that an open socket is always healthy.
A common solution is heartbeat logic:
Server sends ping
Client replies with pong
If no pong arrives in time, close the connection
The WebSocket protocol has ping and pong frames. Some libraries expose them directly. In browser JavaScript, the browser answers ping frames automatically, but the WebSocket API gives application code no way to send a ping or observe a pong, so applications often implement their own heartbeat message if needed:
{ "type": "ping" }
and:
{ "type": "pong" }
The exact approach depends on the client and server libraries, but the goal is the same: do not keep dead connections forever.
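On the server side, a heartbeat with the ws package could be sketched as follows. It relies on the socket methods `ping()` and `terminate()` and the `pong` event that ws provides; the `isAlive` flag is an ad-hoc property we attach ourselves, and the function names and the 30-second interval are assumptions.

```javascript
// Mark a socket as alive when it connects and whenever a pong arrives.
function trackLiveness(socket) {
  socket.isAlive = true;
  socket.on("pong", () => {
    socket.isAlive = true;
  });
}

// Run once per heartbeat interval: drop sockets that never answered the
// previous ping, then ping the rest.
function sweepDeadConnections(clients) {
  for (const socket of clients) {
    if (socket.isAlive === false) {
      socket.terminate(); // drop immediately, without a closing handshake
      continue;
    }
    socket.isAlive = false; // stays false unless a pong arrives before the next sweep
    socket.ping();
  }
}

// Wiring with a ws server (assumed variable `server`):
//   server.on("connection", trackLiveness);
//   const timer = setInterval(() => sweepDeadConnections(server.clients), 30_000);
//   server.on("close", () => clearInterval(timer));
```

A client that misses one full interval is dropped on the following sweep, so the interval directly controls how long a dead connection can linger.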
Reconnection Strategy
Clients should expect disconnections.
Real networks are messy, especially on mobile devices.
A good WebSocket client usually needs:
- automatic reconnect
- exponential backoff
- jitter to avoid reconnect storms
- a maximum retry delay
- a way to resubscribe after reconnecting
- idempotent messages where possible
For example, after reconnecting, a chat client may need to:
- authenticate again
- rejoin rooms
- fetch missed messages from an HTTP API
- resume live updates
The WebSocket connection should not be the only source of truth. If a user disconnects for 30 seconds, the application needs a way to recover missed state.
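The backoff-with-jitter part of that list is small enough to sketch. The version below uses the "full jitter" variant: the delay is a random value between zero and an exponentially growing ceiling. The function name, the 500 ms base, and the 30-second cap are assumptions; `random` is injectable only to make the function easy to test.

```javascript
// Full-jitter exponential backoff: a random delay in
// [0, min(maxMs, baseMs * 2^attempt)).
function reconnectDelay(attempt, { baseMs = 500, maxMs = 30_000, random = Math.random } = {}) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Hypothetical wiring in a browser client; `connect` is an assumed function
// that creates the WebSocket, resubscribes, and resets `attempt` on success:
//   socket.addEventListener("close", () => {
//     setTimeout(connect, reconnectDelay(attempt++));
//   });

for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`attempt ${attempt}: wait ${reconnectDelay(attempt)} ms`);
}
```

The randomness matters more than the exact curve: without it, every client that lost its connection in the same deploy retries at the same instants.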
Scaling WebSockets
Scaling WebSockets is different from scaling stateless HTTP APIs.
With normal HTTP, a load balancer can send each request to any healthy server because each request is independent.
With WebSockets, a client holds a long-lived connection to one server process.
That creates a few design questions:
- How many concurrent connections can one server handle?
- How do we broadcast a message to users connected to different servers?
- Do we need sticky sessions?
- What happens when a deploy restarts a server?
- How do we drain connections gracefully?
For a small app, one WebSocket server may be enough.
For a larger app, we usually need a shared messaging layer:
Client A -> WebSocket Server 1
Client B -> WebSocket Server 2
Server 1 <-> Redis / NATS / Kafka / Pub/Sub <-> Server 2
If Client A sends a room message to Server 1, but Client B is connected to Server 2, the servers need a shared way to distribute the message.
Redis Pub/Sub, NATS, Kafka, RabbitMQ, cloud pub/sub systems, or a dedicated real-time platform can all fit depending on throughput, ordering, durability, and operational requirements.
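The data flow can be sketched without committing to any of those systems. In the example below, `InMemoryBus` is a stand-in for the shared messaging layer and `GatewayNode` represents one WebSocket server process; both classes and the `room:` channel naming are invented for illustration and are not the API of Redis, NATS, or any other product.

```javascript
// In-memory stand-in for a shared bus such as Redis Pub/Sub or NATS.
class InMemoryBus {
  constructor() {
    this.subscribers = new Map(); // channel -> Set of handlers
  }
  subscribe(channel, handler) {
    if (!this.subscribers.has(channel)) this.subscribers.set(channel, new Set());
    this.subscribers.get(channel).add(handler);
  }
  publish(channel, payload) {
    for (const handler of this.subscribers.get(channel) ?? []) handler(payload);
  }
}

// One WebSocket server process: it knows only its own local sockets.
class GatewayNode {
  constructor(bus) {
    this.bus = bus;
    this.rooms = new Map(); // roomId -> Set of local sockets
  }
  join(roomId, socket) {
    if (!this.rooms.has(roomId)) {
      this.rooms.set(roomId, new Set());
      // First local member: start listening for this room on the bus.
      this.bus.subscribe(`room:${roomId}`, (text) => this.deliverLocally(roomId, text));
    }
    this.rooms.get(roomId).add(socket);
  }
  // A local client sent a room message: publish it instead of sending directly.
  handleClientMessage(roomId, text) {
    this.bus.publish(`room:${roomId}`, text);
  }
  deliverLocally(roomId, text) {
    for (const socket of this.rooms.get(roomId) ?? []) socket.send(text);
  }
}

// Client A is on server 1, client B is on server 2.
const bus = new InMemoryBus();
const server1 = new GatewayNode(bus);
const server2 = new GatewayNode(bus);
const clientB = { send: (text) => console.log("B received:", text) };
server1.join("general", { send: (text) => console.log("A received:", text) });
server2.join("general", clientB);
server1.handleClientMessage("general", "hello from A");
```

Because every node publishes to the bus and delivers only to its own sockets, the sender's node needs no knowledge of where other room members are connected.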
Backpressure
Backpressure is the pushback a system applies when a sender produces data faster than the receiver can process it.
This matters with WebSockets because a slow client can cause memory to grow if the server keeps buffering outgoing messages.
Examples:
- a browser tab is throttled in the background
- a mobile client has a weak connection
- the server broadcasts too many messages
- the client cannot parse or render messages fast enough
A production server should have policies for slow consumers:
- limit message size
- limit queued outgoing bytes
- drop non-critical updates
- disconnect clients that fall too far behind
- compress only when it actually helps
Ignoring backpressure is how “real-time” systems turn into memory leaks under load.
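A slow-consumer policy can start from the socket's `bufferedAmount` property, which both the browser WebSocket API and the ws package expose as the number of bytes queued but not yet sent. The sketch below is one possible policy, not a recommendation: the 1 MB threshold, the `critical` flag, and the close code 4001 (from the 4000-4999 range reserved for private use) are all assumptions.

```javascript
const MAX_BUFFERED_BYTES = 1_000_000; // assumed threshold for queued outgoing bytes

// Returns "sent", "dropped", or "disconnected" so callers can record metrics.
function sendWithBackpressure(socket, data, { critical = false } = {}) {
  if (socket.bufferedAmount <= MAX_BUFFERED_BYTES) {
    socket.send(data);
    return "sent";
  }
  if (!critical) {
    return "dropped"; // skip non-critical updates for a slow consumer
  }
  // A critical message cannot be delivered in time: disconnect and let the
  // client reconnect and recover state through its normal resync path.
  socket.close(4001, "slow consumer"); // 4001 is a hypothetical application code
  return "disconnected";
}
```

Dropping works well for updates that are superseded by the next one, such as cursor positions or prices; anything that must not be lost belongs in the critical path.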
Security Considerations
WebSockets need the same security discipline as HTTP APIs, plus a few extra checks.
Important items include:
- use wss:// in production
- authenticate the handshake
- authorize every meaningful action
- validate all incoming messages
- enforce message size limits
- rate-limit noisy clients
- check the Origin header for browser clients
- avoid leaking secrets in query strings
- close idle or abusive connections
The Origin check is especially easy to miss.
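Browsers include an Origin header in WebSocket handshakes, but the handshake is not restricted by the same-origin policy, so a page on any site can try to open a socket to our server, with the user’s cookies attached unless cookie settings such as SameSite prevent it. If the server accepts cookie-based authentication, checking the origin against an allow-list helps reduce cross-site WebSocket hijacking risks.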
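An allow-list check can be as small as the sketch below. The allowed origin `https://app.example.com` and the function name are placeholders. The sketch rejects a missing Origin header, which suits an endpoint meant only for browsers; non-browser clients may legitimately omit the header, so that choice depends on who connects.

```javascript
// Hypothetical allow-list of origins permitted to open sockets.
const ALLOWED_ORIGINS = new Set(["https://app.example.com"]);

function isAllowedOrigin(originHeader) {
  if (typeof originHeader !== "string") return false; // header missing
  let origin;
  try {
    // Normalize through the URL parser; values like "null" fail to parse.
    origin = new URL(originHeader).origin;
  } catch {
    return false;
  }
  return ALLOWED_ORIGINS.has(origin);
}

// Possible wiring with the ws package, where the connection handler receives
// the HTTP upgrade request as its second argument:
//   server.on("connection", (socket, request) => {
//     if (!isAllowedOrigin(request.headers.origin)) {
//       socket.close(1008, "origin not allowed");
//     }
//   });

console.log(isAllowedOrigin("https://app.example.com")); // prints true
console.log(isAllowedOrigin("https://evil.example"));    // prints false
```

Comparing whole origins through the URL parser avoids substring mistakes, such as accepting https://app.example.com.evil.test because it starts with the right prefix. Rejecting during the handshake, before the upgrade completes, is preferable where the server setup allows it.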
Observability
WebSockets can be harder to debug than normal APIs because there is not a clean request-response record for every interaction.
We should track:
- active connection count
- connection open and close rate
- close codes and reasons
- messages sent and received by type
- message processing latency
- authentication failures
- reconnect rate
- outgoing queue size
- dropped messages
Logs should include connection IDs and user IDs where safe. Metrics should make it obvious when a deploy, dependency outage, or network issue caused reconnect storms.
For important user actions, we should still store durable events or use HTTP APIs where appropriate. A WebSocket message that only exists in memory is easy to lose.
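To show what a few of those measurements look like in code, here is a minimal in-process counter sketch. All names are illustrative, and a real deployment would export the values to whatever metrics system it already uses rather than keep them in plain objects.

```javascript
// Minimal in-process counters for a WebSocket server.
function createSocketMetrics() {
  const metrics = {
    activeConnections: 0,
    closeCodes: new Map(),     // close code -> count
    messagesByType: new Map(), // message type -> count
  };
  const bump = (map, key) => map.set(key, (map.get(key) ?? 0) + 1);

  return {
    metrics,
    onOpen() {
      metrics.activeConnections += 1;
    },
    onClose(code) {
      metrics.activeConnections -= 1;
      bump(metrics.closeCodes, code);
    },
    onMessage(type) {
      bump(metrics.messagesByType, type);
    },
  };
}

// Assumed usage inside the connection handler of a ws server:
//   recorder.onOpen();
//   socket.on("message", (data) => recorder.onMessage(parsed.type));
//   socket.on("close", (code) => recorder.onClose(code));
const recorder = createSocketMetrics();
recorder.onOpen();
recorder.onMessage("chat.message");
recorder.onClose(1000);
console.log(recorder.metrics);
```

Counting close codes separately is what makes a reconnect storm visible: a spike in abnormal closures right after a deploy looks very different from a steady trickle of normal ones.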
A Practical Architecture
A common production architecture looks like this:
Browser
   |
   | wss://
   v
Load Balancer
   |
   v
WebSocket Gateway
   |
   +--> Auth / Session Service
   |
   +--> Redis / NATS / Kafka / Pub/Sub
   |
   +--> Application Services
The WebSocket gateway owns connection state:
- which user is connected
- which rooms or topics the user subscribed to
- what messages should be sent to the client
- when to close unhealthy connections
The rest of the system can publish events without knowing which server currently holds the user’s socket.
This separation keeps business services from becoming tightly coupled to WebSocket connection management.
When We Should Not Use WebSockets
We should avoid WebSockets when:
- updates are infrequent
- one-way server-to-client streaming is enough
- simple polling gives an acceptable user experience
- the team does not need low-latency bidirectional communication
- infrastructure cannot support long-lived connections reliably
- durable delivery matters more than immediate delivery
For example, a billing dashboard that refreshes every minute does not need WebSockets. Polling is simpler and easier to operate.
A live log viewer may be better with Server-Sent Events if the browser only receives data and does not need to send much back.
The best architecture is not the one with the most real-time technology. It is the one that matches the product requirement with the least operational complexity.
Key Takeaways
WebSockets are useful because they turn the browser-server relationship from repeated request-response calls into a long-lived two-way channel.
The core ideas are:
- WebSockets start with an HTTP upgrade handshake.
- After the 101 Switching Protocols response, the connection uses WebSocket frames.
- Either side can send messages while the connection stays open.
- WebSockets are best for low-latency bidirectional features.
- Production systems need authentication, authorization, heartbeats, reconnects, backpressure handling, and observability.
- Scaling WebSockets usually requires a shared messaging layer between server instances.
The practical default is simple: start with polling or Server-Sent Events if they satisfy the requirement. Use WebSockets when the product genuinely needs interactive, bidirectional, low-latency communication.
That is where WebSockets are worth the extra operational work.