Developing WebSocket for Horizontal Scaling: Using Redis as Message Queue

Development and testing of a backend infrastructure for instant messaging applications

francescocartelli · 12 min read · May 4, 2024

Introduction

Building a robust backend infrastructure capable of handling thousands, if not millions, of highly interactive, concurrent connections presents a significant challenge.

This article explores the creation and testing of a backend infrastructure tailored for instant messaging apps. Central to this setup is Redis functioning as a message queue (MQ), enabling communication among WebSocket (WS) server replicas designed for a scalable framework, effectively meeting the demands of a growing user base.

This article presents a practical implementation of the design outlined in the previous article on instant messaging application design, specifically tailored to real-time communication services.

Features and Challenges

The primary objective is to design and implement a centralised backend solution that not only supports instant messaging through WS connections but also addresses the crucial aspect of scalability.

  • Instant Messaging Using WebSocket: The primary objective is to facilitate seamless, real-time communication between users, ensuring that new messages are instantly delivered without requiring manual refreshing. WS is the preferred solution for this scenario. It establishes a persistent connection between the client and server, allowing both parties to send messages at any time.
  • Horizontal Scaling of Stateful Services: Replicating WS servers is crucial for horizontal scaling. However, when replicating stateful services compared to stateless ones, several challenges arise due to the nature of managing and synchronising state communication across multiple instances.
  • Detaching Instances Using Pub/Sub Mechanism: The challenge arises in centralised messaging applications where messages undergo dual journeys, first from the sender to the system and then back to the recipient. This means that replicated chat servers need to communicate with each other, which necessitates an MQ with a publish/subscribe mechanism.

System Design: Centralised Instant Messaging

For the features described earlier, the main components of the system are:

  • Load Balancer: Acting as a single entry point for the backend.
  • WS Server Cluster: Enabling the main feature of the system, connecting clients to the system in real time.
  • Pub/Sub Message Queues: Allowing message exchange between the WS server instances by using a common interface and broker.

The overall design of the system is illustrated and discussed below.

Illustration depicting the high level architecture of the system.

Load Balancer

The load balancer acts as a single entry point for clients to connect to the WS cluster. This component plays a fundamental role in ensuring that clients can reliably connect to the servers, especially in scalable environments. As the cluster dynamically scales by adding or removing server instances, the component adjusts its routing logic accordingly. New server instances are integrated into the pool of available servers, and clients continue to connect to the cluster without disruption.

WebSocket Servers

The WS server is the core of the instant messaging backend system. It establishes persistent connections between clients and servers, enabling real-time interaction with minimal latency.

Since each WS server can only host a certain number of active connections, multiple server instances are replicated across the network to provide elastic support for handling varying levels of load.

Each instance accommodates multiple connections, each exchanging messages bidirectionally between client and server. In such an architecture, when a message is sent from a client, it reaches the sender’s WS server and traverses to the one hosting the recipient connection. To manage this process effectively, an architecture-wide delivery mechanism is necessary.

Schema for the WS server internal modules structure.

Pub/Sub Message Queue

An MQ decouples each replica from the others. Instead of directly contacting the recipient’s environment, each server can push messages to a common bus. This decoupling simplifies the architecture and makes it more modular and scalable.

By using a publish/subscribe mechanism, each server subscribes only to the channels in which it is interested, allowing the sender instance to broadcast a message without needing direct knowledge of the other running instances that require that message.

Since servers can now avoid communicating directly with each other, adding or removing servers from the cluster becomes simpler. New servers can seamlessly join the cluster and begin monitoring pertinent channels without necessitating alterations to the network topology or detailed knowledge of the network.

While subscribing is an inexpensive activity, deciding how to identify and segregate the traffic of each MQ is not trivial (more details in the Message Queue: Small Groups and News-Feed Communities section of Exploring System Design: Instant Messaging Application). In this design, the main idea is to create a channel/topic for each user, whose identity can be derived from the unique username, for example. This option offers numerous advantages, as outlined in the previous article. However, this method faces certain challenges when accommodating larger groups (with thousands of users), such as news-feeds, rather than just direct chats and small groups.

Message flow of a direct message (load balancer is not shown for simplicity). MQs are associated with each user identity, allowing for a simple flow and a single subscription on each connection creation.

Develop the System

In the solution development, the scaling mechanism is omitted, as this article focuses mostly on building a scalable WS server. Nonetheless, the server implementation is specifically designed to operate within a scalable environment.

The project is available in the following repo:

Requirements:

The whole activity only requires knowledge of JavaScript for the custom WS server development and Docker-Compose for the architecture orchestration. In particular, the following components are necessary:

  • NPM and Node.js environment
  • Docker and Docker-Compose
  • Redis Docker Image
  • NGINX Docker Image

Environment Setup and Docker Compose

The main folders for the project components are structured according to the following schema.

├── docker-compose.yml
├── load-balancer
│   ├── dockerfile
│   └── nginx.conf
└── server-ws
    ├── ...
    ├── dockerfile
    ├── package.json
    ├── server.js
    └── services
        ├── identity.js
        └── mq.js

Docker-Compose serves not only as the orchestrator for all architectural components but also as the blueprint for the system. The configuration is depicted below.
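A minimal sketch of such a composition, assuming illustrative service names, a fixed replica count and a REDIS_URL variable (the server-ws name and port 8000 match the NGINX upstream shown later), could look as follows.

version: "3.8"

services:
  load-balancer:
    build: ./load-balancer
    ports:
      - "80:80"
    depends_on:
      - server-ws

  server-ws:
    build: ./server-ws
    deploy:
      replicas: 3   # fixed number of WS server instances, simulating a horizontally scaled backend
    environment:
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis

  redis:
    image: redis:alpine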

As outlined in the design, each component of the system is represented as a service within the container composition. In the provided example, the WS server count is fixed to simulate a horizontally scaled backend with multiple instances.

Now, let’s go into the definition and implementation of each service.

Load Balancer (NGINX)

NGINX is a popular open-source web server, reverse proxy server, load balancer, and HTTP cache. The setup only requires the NGINX configuration file, which in this case follows a standard rule definition for enabling WS connection persistence through the reverse-proxy.

upstream websocket {
    server server-ws:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://websocket;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 3600s;
    }
}

The following Dockerfile downloads the Alpine image and configures the load balancer. After the image fetch, port 80 is exposed, and NGINX is started with the command CMD ["nginx", "-g", "daemon off;"], keeping the process in the foreground.

FROM nginx:stable-alpine

COPY nginx.conf /etc/nginx/conf.d/default.conf

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

MQ (Redis)

Redis is an open-source, in-memory data store that serves as a high-performance and versatile solution for various data storage and manipulation needs.

Additionally, it provides various advanced features, such as built-in replication, persistence options, pub/sub messaging capabilities and support for advanced data structures. These features make it a versatile tool that can be used for caching, session management, real-time analytics, messaging queues and much more.

Redis-Server can be downloaded as command-line software, but in the described system, only its Docker image is used.

Redis Client for WS Server

Each WS server will require a set of Redis-Clients to establish network communication with the Redis-Server. The client definition can be built upon the Redis client Node.js package.

npm install redis

A simple configuration can be defined and exported by means of the following utility module. Each server will use two clients, one for publishing and one for subscribing to the users’ MQs.
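A minimal sketch of this module, assuming the node-redis package and an illustrative REDIS_URL environment variable, is the following.

// services/mq.js — Redis client configuration (sketch; URL and variable name are assumptions)
const { createClient } = require('redis')

// Base client, duplicated into a publisher and a subscriber
const pubClient = createClient({ url: process.env.REDIS_URL || 'redis://redis:6379' })
const subClient = pubClient.duplicate()

// Connect both clients before the server starts accepting connections
Promise.all([pubClient.connect(), subClient.connect()])
  .catch(err => { console.error('Redis connection failed', err); process.exit(1) })

module.exports = { pubClient, subClient }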

In the code, a single Redis client is duplicated into pub and sub clients, then connected and exported. These clients and their purposes are the following:

  • pubClient: Used for injecting messages into all the recipient MQs.
  • subClient: Used for subscribing to the related user MQs.

WebSocket Server (WS)

The main part of this implementation relies on the development of the WS server. The library of choice is ws, which can be downloaded and installed directly with NPM.

npm install ws

The following snippet shows the standard structure of a WS server using the ws Node.js library.
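A minimal sketch of this base structure, assuming the server listens on port 8000 as in the NGINX upstream, is the following.

// server.js — base structure of a ws server (sketch)
const { WebSocketServer } = require('ws')

const wss = new WebSocketServer({ port: 8000 })

wss.on('connection', (ws, req) => {
  // Handle messages received from the client
  ws.on('message', data => {
    // ...
  })

  // Release connection-related resources when the client disconnects
  ws.on('close', () => {
    // ...
  })
})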

The development of the custom logic for the instant messaging back-end using Redis as a pub/sub mechanism can follow two different paths based on a specific assumption:

  • Limiting connections to one per user: Working under this assumption allows for a straightforward implementation. Each new WS connection will subscribe to the user-related MQ. In the same way, when each connection is closed, the server will unsubscribe.
  • Allowing multiple connections per user: Since Redis identifies each topic/channel just by its name, subscribing many times to the same one, apart from being useless, will replicate each message event. For this reason, custom logic is required to limit each subscription to one per user. Similar logic is required for unsubscribing from the user MQ only when all connections belonging to that user are closed.

It is crucial to note that the restriction on user connections only poses an issue when two connections linked to the same user are established within the same WS server. If connections are spread across replicas, this problem does not arise. Nonetheless, given the ideal design of the system, addressing this issue should be inherent.

Both examples are discussed in the next sections.

Managing At Most One Connection Per User

The following examples only outline the transmission requirements, excluding aspects like data validation, persistence, logging, and authentication for simplicity.

Each new connection will undertake these responsibilities (a sketch follows the list):

  • Upon connection initialisation, subscribe to its user-related channel by using the username extracted from the identity header as the channel name. Each time a message appears in the queue, the message is forwarded to the client by invoking the send(message) function within the subscribe callback.
  • Provide a callback mechanism for each message received from the client. The message is then forwarded to the recipient queue by implementing the on('message') callback. The recipient is directly extracted from the original payload, considering the following JSON schema as the minimal set of properties for the message.
{
  content: { type: "string", description: "Message body" },
  recipient: { type: "string", description: "Unique id of the recipient" }
}
  • Provide a callback for the close event. When the connection is closed, the user-related subscription is released by calling unsubscribe in the callback of the on('close') event.
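A minimal sketch covering these three responsibilities, assuming the Identity header used later in the testing section and the mq.js clients defined earlier (validation, persistence, logging and authentication are omitted), is the following.

// server.js — one-connection-per-user variant (sketch; header name and payload details are assumptions)
const { WebSocketServer } = require('ws')
const { pubClient, subClient } = require('./services/mq')

const wss = new WebSocketServer({ port: 8000 })

wss.on('connection', async (ws, req) => {
  // Username extracted from the dummy identity header
  const username = req.headers['identity']

  // Subscribe to the user-related channel and forward every queued message to the client
  await subClient.subscribe(username, message => ws.send(message))

  // Forward each client message to the recipient's queue
  ws.on('message', data => {
    const { content, recipient } = JSON.parse(data)
    pubClient.publish(recipient, JSON.stringify({ sender: username, content }))
  })

  // Release the user-related subscription when the connection closes
  ws.on('close', () => subClient.unsubscribe(username))
})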

By associating each WS connection with a specific MQ subscription, each instance can host only one connection per user. The connections and subscriptions in the system can be schematised, for every server, with the following table (represented in YAML notation).

- wss1:
    userA: wsA
    # ...
    userN: wsN
- wss2:
    userA: wsA
    # ...
    userM: wsM
# ...
- wssN:
    # ...

With the previous implementation, each server implicitly associates each user connection with its own MQ subscription.

Managing Multiple Connections Per User

As previously discussed, due to Redis limitations, safely replicating and closing subscriptions to a channel presents two main issues:

  • Unsubscribe Effect: Calling unsubscribe when one user connection is closed cancels the channel subscription for all the other connections related to that user (e.g. if a user logs out of the app on one of her devices, all her other devices stop receiving messages).
  • Message Multiplication: Every received message will be multiplied by the number of active subscriptions to that channel, leading to unnecessary duplication of messages.

Schema depicting a possible configuration of the system in which a user has three active connections hosted in multiple servers. The image emphasises the broadcasting of the message on the receiving flow.

To address these issues, a custom Connection-Manager is necessary. The main purpose of this component is to detach each connection from the subscription established on the MQ. Its requirements are the following:

  • Ensure Subscription Management: Manage the subscription to the user MQ ensuring it is established only on the first connection initialisation.
  • Group Broadcasting: Allow the capability to forward each message originating from a single user subscription to all active connections owned by that user.
  • Cleanup Mechanism: Manage the unsubscribing from the MQ, ensuring that subscription cleanup occurs only once for each user when all connections related to that user are closed.

By using the custom manager, the WS server implementation is the following.
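A sketch of this implementation, assuming the manager is exported from the services/identity.js module shown in the project structure, is the following.

// server.js — multi-connection variant using the Connection-Manager (sketch; module paths and header name are assumptions)
const { WebSocketServer } = require('ws')
const { pubClient, subClient } = require('./services/mq')
const { createConnectionManager } = require('./services/identity')

const addConnection = createConnectionManager()
const wss = new WebSocketServer({ port: 8000 })

wss.on('connection', async (ws, req) => {
  const username = req.headers['identity']
  const { to, remove, isInit } = addConnection(username, ws)

  // Subscribe only on the first connection of the group
  if (isInit) {
    await subClient.subscribe(username, message =>
      // Broadcast the queued message to every connection owned by this user on this server
      to(conn => conn.send(message))
    )
  }

  // Forward each client message to the recipient's queue
  ws.on('message', data => {
    const { content, recipient } = JSON.parse(data)
    pubClient.publish(recipient, JSON.stringify({ sender: username, content }))
  })

  // Remove the connection; unsubscribe only when the last one of the group is closed
  ws.on('close', () => remove(() => subClient.unsubscribe(username)))
})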

In the code, the call to createConnectionManager returns a single addConnection function. When called, the function adds a new connection to the manager and returns two utilities and a flag related to that connection:

  • to: Invoke the function passed as a parameter on each connection belonging to the specified group. It is useful for sending messages to all connections within a group.
  • remove: When called, release a connection. Once all connections belonging to a group are removed, execute the callback passed as a parameter. This functionality is used for unsubscribing from the MQ when all connections belonging to a group are removed.
  • isInit: Set to true when the connection is the first of the group. This flag can be used to detect the need for a new subscription and limit multiple subscriptions for the same group.

The Connection-Manager implementation is presented below.
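A sketch of such a manager, with illustrative internal names and connection identifiers, is the following.

// services/identity.js — Connection-Manager (sketch; internal details are assumptions)
const createConnectionManager = () => {
  // Two-layer map: user -> (connection id -> ws connection)
  const users = new Map()
  let nextId = 0

  // Factory: register a connection and return the utilities bound to it
  const addConnection = (user, ws) => {
    // isInit is true only for the first connection of the group
    const isInit = !users.has(user)
    if (isInit) users.set(user, new Map())

    const connections = users.get(user)
    const id = nextId++
    connections.set(id, ws)

    // Invoke fn on every connection belonging to this user
    const to = fn => connections.forEach(conn => fn(conn))

    // Release this connection; run onEmpty once the user has no connections left
    const remove = onEmpty => {
      connections.delete(id)
      if (connections.size === 0) {
        users.delete(user)
        onEmpty()
      }
    }

    return { to, remove, isInit }
  }

  return addConnection
}

module.exports = { createConnectionManager }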

The Connection-Manager is implemented as a closure consisting of a two-layer map structure. This structure keeps track of active users and associates each user with their related connections. Adopting maps as collections allows for a fast lookup on resources, providing the highest read-performance for efficient management of connections and users within the system.

The subscription manager is designed using a function factory pattern. Registering a new connection enables the utilisation of the three returned utilities (to, remove and isInit) later from the connection scope.

A possible connection map for the whole cluster can be the following (represented with YAML notation).

- wss1:
    userA:
      connA1: wsA1
      # ...
      connAN: wsAN
    userN:
      connN1: wsN1
- wss2:
    userA:
      connA1: wsA1
    userM:
      connM1: wsM1
# ...
- wssN:
    # ...

The schema depicts a two-layer map structure in each replica: All the servers track their local subset of active connections associated with every connected user.

The following Dockerfile sets up a Node.js environment for the WS server.

FROM node:latest

WORKDIR /app

COPY package.json .
RUN npm install

COPY . .

CMD ["node", "server.js"]

Testing the Solution

The system can be started just by running the docker-compose up command. Once all the containers are up and running, the Docker-Compose console will display the messages for all the listening instances.

For testing the system, no WS client has to be implemented since it is possible to use Postman in WebSocket mode.

To set up a simple test, it is first required to create a new connection with the WS cluster:

  1. Create a new “WebSocket request” and provide the WS endpoint URL.
  2. Set a dummy Identity header that contains the identity (in the described case, the username or id) of the user performing the request.
  3. Press Connect to perform the handshake.

A new connection has been created! A new log in the Docker-Compose console should display the creation of a new entry in the Connection-Manager for one of the available WS servers.

In the displayed log, three users are connected to the system through five different connections: userA has three active connections.

At this point, iterate the process for all the users to be tested and start messaging:

  1. In the predefined WS request just add a body in the message tab.
  2. Provide recipient and content of the message in JSON format.
  3. Press Send to deliver the message.

Postman screenshot depicting a message sent from userC to userA.

By switching the selected request in Postman, the received message can be seen in the recipient request view.

Postman screenshot depicting the message from userC being received by all the userA opened connections.

Perform multiple connections for the same user and continue testing the robustness of the Connection-Manager.

The last test relates to connection closing: press the Disconnect button in each of the created connections. The Docker-Compose console should display the connection removal and, if needed, the subscription deletion from the manager of the related servers.

In the displayed log, all userA connections hosted in server ws-3 are removed. The first disconnection of the client causes only its connection to be closed. After the second one, the subscription of userA to the MQ is definitively deleted since the user is no longer present in the server.

Conclusion

That concludes our activity for now. Thank you for taking the time to engage with this reading.

External Resources

If backend development is not your only interest, the nyla-instant-messaging-app repository contains a fully functional instant messaging web-app implementation. The project exposes a slightly more complex design, offering additional important features, along with a highly interactive client to operate the application.
