How do I set up Redis Cluster for scaling?
Tuesday, 18 March 2025 | REDIS
Redis Cluster is a distributed, scalable, and highly available implementation of Redis. It automatically shards data across multiple Redis nodes, providing fault tolerance and horizontal scalability. This guide will walk you through the process of setting up a Redis Cluster for optimal performance and resilience.
1. Understanding Redis Cluster Architecture
Before diving into the setup, let's understand the core concepts of Redis Cluster:
- Nodes: Each Redis instance in the cluster is called a node. These nodes communicate with each other to maintain the cluster state.
- Data Sharding: Redis Cluster uses hash slots to distribute data across nodes. The keyspace is divided into 16384 hash slots. When a key is added, it is hashed and assigned to one of these slots, and that slot is associated with a specific node.
- Master and Slave Nodes: Each shard typically consists of one master node that handles write operations and one or more replica (slave) nodes that replicate data from the master. Replicas can also serve reads for clients that opt in with the READONLY command. This provides redundancy and fault tolerance.
- Cluster Bus: Nodes communicate with each other over a dedicated channel called the "cluster bus," which carries the gossip, failure-detection, and failover traffic. By default it listens on the client port plus 10000 (for example, 17000 for a node on port 7000), so that port must also be reachable between nodes.
- Gossip Protocol: Nodes use the gossip protocol to exchange information about the cluster state, like node availability and slot assignments. This allows the cluster to automatically detect failures and reconfigure itself.
- Automatic Failover: If a master node fails, one of its replica nodes is automatically promoted to master. This ensures the cluster remains operational even in the event of node failures.
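The slot assignment described under Data Sharding can be sketched in a few lines. Redis Cluster takes the CRC16 of the key modulo 16384, and if the key contains a non-empty {...} hash tag, only the text inside the first tag is hashed. This is a minimal illustration, not the server's code: the function name key_hash_slot is hypothetical, and it assumes Python's binascii.crc_hqx with an initial value of 0 matches the CRC16 variant Redis uses.

```python
import binascii

HASH_SLOTS = 16384  # total number of hash slots in a Redis Cluster


def key_hash_slot(key: bytes) -> int:
    """Return the hash slot a key maps to (illustrative helper)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # Assumption: crc_hqx(data, 0) is the CRC16 variant (poly 0x1021, init 0)
    # that Redis Cluster uses for key hashing.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


if __name__ == "__main__":
    for name in (b"mykey", b"{user1000}.following", b"{user1000}.followers"):
        print(name.decode(), "->", key_hash_slot(name))
```

Keys that share a hash tag, such as the two {user1000} keys, land in the same slot, which is what allows multi-key operations on them. The result for any key can be compared against the server's own answer from cluster keyslot.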
2. Planning Your Redis Cluster
Proper planning is crucial for a successful Redis Cluster deployment. Consider the following factors:
- Number of Nodes: Start with a minimum of three master nodes and at least one replica per master. A cluster with three masters and three replicas provides a reasonable balance between fault tolerance and overhead. For example, 3 masters + 3 slaves = 6 nodes total. More replicas can further enhance read performance and improve failover robustness, at the cost of higher replication bandwidth and storage requirements. Failover requires agreement from a majority of masters, so an even number of masters adds no extra failure tolerance over the next-lower odd number: four masters tolerate one simultaneous master failure, the same as three.
- Hardware Resources: Choose appropriate hardware resources (CPU, RAM, disk) for each node based on your expected data size and workload. Redis is generally memory-bound, so sufficient RAM is crucial. Consider SSDs for faster persistence if you're using AOF or RDB.
- Network Configuration: Ensure good network connectivity between nodes. Low latency is critical for the gossip protocol and data replication.
- Redis Version: Use a stable and recent version of Redis (5.0 or higher recommended, 7.x ideally), as it contains numerous bug fixes and performance improvements related to clustering.
- Persistence: Choose a persistence mechanism (RDB or AOF) based on your durability requirements. Remember that AOF persistence can impact write performance slightly compared to no persistence, so test accordingly.
  - RDB (Redis Database): Periodically snapshots the in-memory data to disk. Offers good recovery, but writes made after the last snapshot are lost if a failure happens between snapshots.
  - AOF (Append-Only File): Logs every write operation. Keeps data loss small, but may be slower than RDB depending on the settings. The appendfsync setting accepts "always", "everysec", or "no"; "everysec" offers a balance.
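The node-count arithmetic behind the Number of Nodes point can be sketched as follows. Promoting a replica needs the agreement of a majority of masters, so the number of masters that can be unreachable at the same moment is the total minus that majority. The function names are hypothetical, and the sketch ignores replicas that have already been promoted.

```python
def majority(masters: int) -> int:
    """Smallest number of masters that forms a majority."""
    return masters // 2 + 1


def tolerated_master_failures(masters: int) -> int:
    """Masters that can be down at once while a majority is still reachable."""
    return masters - majority(masters)


def total_nodes(masters: int, replicas_per_master: int) -> int:
    """Total instances to provision for a given layout."""
    return masters * (1 + replicas_per_master)


if __name__ == "__main__":
    for masters in range(3, 8):
        print(
            f"{masters} masters: majority={majority(masters)}, "
            f"tolerates {tolerated_master_failures(masters)} simultaneous failure(s), "
            f"{total_nodes(masters, 1)} nodes with 1 replica each"
        )
```

Going from three to four masters adds capacity but no extra failure tolerance, while going to five does.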
3. Configuring Redis Nodes
Each Redis node in the cluster requires a configuration file (redis.conf) with the following key settings. Note: You will create a separate configuration file for *each* Redis instance running on your servers. Adjust ports according to the number of Redis instances running on the host.
# Comments must sit on their own lines; Redis does not parse trailing comments.
# Port for this instance (e.g., 7000, 7001, 7002...)
port 7000
# Enable cluster mode
cluster-enabled yes
# Cluster state file, generated and maintained by Redis
cluster-config-file nodes.conf
# Milliseconds a node must be unreachable before other nodes consider it failing.
# Increase this if network issues cause false positives.
cluster-node-timeout 15000
# If 'yes', the cluster stops accepting queries when some hash slot is not served
cluster-require-full-coverage yes
# Minimum number of replicas a master must keep before one of its replicas
# may migrate to an orphaned master. Default is 1. For advanced tweaking.
cluster-migration-barrier 1
# Enable AOF persistence (optional but recommended)
appendonly yes
# fsync the AOF once per second (applies when appendonly is enabled)
appendfsync everysec
# Remove or comment out the bind line to allow external connections (careful!),
# or bind to an internal IP such as 10.0.0.1.
# bind 127.0.0.1
# Protected mode only permits connections from the loopback interface. Disable
# it (careful!) or bind to specific IPs so that remote clients and the other
# nodes can connect; otherwise the cluster will never join.
protected-mode no
Important considerations:
- Replace 7000 with a different port for each instance (e.g., 7001, 7002, 7003, etc.). A common strategy for nodes on the *same machine* is to share one configuration template and parameterize the port with shell scripts or environment variables. Give each instance its own working directory (or a distinct cluster-config-file name) so the generated cluster state files don't collide. If the nodes run on different servers, customize redis.conf individually for each.
- The cluster-config-file setting tells Redis where to store information about the cluster state. This file is automatically managed by Redis. Do *not* edit it directly.
- Adjust the cluster-node-timeout value based on your network latency. A higher value provides more time for the cluster to recover from transient network issues but increases the time it takes to detect permanent failures.
- Adjust the appendfsync value to your requirements; it is a tradeoff between safety and latency.
- Every node must have a unique node ID. Redis generates it automatically on first start, so this normally needs no action as long as the cluster config file is not copied between instances.
- Disable protected mode or bind to a specific IP address that remote nodes/clients can access! Otherwise, other cluster nodes won't be able to connect. Consider binding to internal IPs only (e.g., 10.x.x.x, 172.16.x.x, or 192.168.x.x). If binding to all interfaces (0.0.0.0), strongly consider enabling authentication using the requirepass configuration option for security reasons.
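Parameterizing the port across several instances on one machine can be sketched with a small template. This is a minimal sketch rather than a recommended layout: the directory structure (one folder per port), the nodes-PORT.conf naming, and the function names are assumptions chosen for illustration, and the template only carries a subset of the settings shown above.

```python
import tempfile
from pathlib import Path

# Hypothetical template: a subset of the settings discussed above.
TEMPLATE = """\
port {port}
cluster-enabled yes
cluster-config-file nodes-{port}.conf
cluster-node-timeout 15000
appendonly yes
appendfsync everysec
"""


def render_node_config(port: int) -> str:
    """Fill the template for a single instance."""
    return TEMPLATE.format(port=port)


def write_node_configs(base_dir: Path, ports: range) -> list[Path]:
    """Write one redis.conf per port, each in its own directory."""
    paths = []
    for port in ports:
        node_dir = base_dir / str(port)
        node_dir.mkdir(parents=True, exist_ok=True)
        conf = node_dir / "redis.conf"
        conf.write_text(render_node_config(port))
        paths.append(conf)
    return paths


if __name__ == "__main__":
    # A temporary directory keeps the demo from touching real config paths.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_node_configs(Path(tmp), range(7000, 7006)):
            print(path)
```

Each generated file can then be passed to redis-server from inside its own directory, so the per-node state files stay separate.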
4. Deploying Redis Cluster Nodes
Once the configurations are created, you can deploy the Redis nodes. It's generally best practice to separate the nodes on different physical servers for enhanced fault tolerance. However, for testing and development, you can run multiple instances on a single machine, using different ports.
Here's how to start each Redis node (replace paths and port as needed):
redis-server /path/to/redis.conf
5. Creating the Redis Cluster
Redis provides the redis-cli utility to create and manage clusters. Use the --cluster create command to initialize the cluster. Replace the IP addresses and ports with your actual node addresses.
redis-cli --cluster create 10.0.0.1:7000 10.0.0.1:7001 10.0.0.1:7002 10.0.0.1:7003 10.0.0.1:7004 10.0.0.1:7005 --cluster-replicas 1
Let's break down this command:
- redis-cli --cluster create: This initiates the cluster creation process.
- 10.0.0.1:7000 ... 10.0.0.1:7005: This specifies the list of Redis nodes that will be part of the cluster. Important: list *all* nodes in this command, both future masters and future replicas. redis-cli decides which nodes become masters and which become replicas based on the --cluster-replicas value.
- --cluster-replicas 1: This specifies that each master node should have one replica. Adjust the value accordingly, balancing read scaling requirements against replication costs.
The redis-cli utility will print the proposed layout and ask you to confirm it by typing yes. It automatically assigns hash slots to master nodes and attaches replicas to them. Make sure all specified nodes are running and reachable before you start.
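The relationship between the node list and --cluster-replicas can be checked before running the command. The sketch below builds the same argument list as the example above and rejects layouts that would yield fewer than three masters, which redis-cli refuses to create. The helper name build_create_command is hypothetical; the script only prints the command and does not run it.

```python
import shlex


def build_create_command(nodes: list[str], replicas: int) -> list[str]:
    """Return the redis-cli argument list for creating a cluster."""
    # Each master takes `replicas` of the listed nodes as its replicas.
    masters = len(nodes) // (replicas + 1)
    if masters < 3:
        raise ValueError(
            f"{len(nodes)} nodes with {replicas} replica(s) each gives "
            f"{masters} master(s); at least 3 are required"
        )
    return ["redis-cli", "--cluster", "create", *nodes,
            "--cluster-replicas", str(replicas)]


if __name__ == "__main__":
    nodes = [f"10.0.0.1:{port}" for port in range(7000, 7006)]
    print(shlex.join(build_create_command(nodes, replicas=1)))
```

With six nodes and one replica per master this yields three masters; the same six nodes with two replicas per master would leave only two masters and raise an error.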
6. Verifying the Cluster Configuration
After creating the cluster, verify that it is configured correctly using the following commands:
redis-cli -c -h 10.0.0.1 -p 7000 cluster info
redis-cli -c -h 10.0.0.1 -p 7000 cluster nodes
Replace 10.0.0.1:7000 with the address of any node in your cluster.
- The cluster info command displays general information about the cluster, such as its size and state.
- The cluster nodes command displays detailed information about each node, including its role (master or slave), the slots it serves, and its connection status.
- The -c option enables cluster mode in the client, meaning it follows redirections so that a command runs against the node serving the hash slot of the specified key.
Check the output to ensure all nodes are connected and that hash slots are evenly distributed. Ensure each master has a slave assigned.
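Checking slot distribution by eye gets tedious, so the cluster nodes output can be summarized with a short script. The sketch below counts the slots owned by each master. The sample text is illustrative only: the node IDs are shortened placeholders (real IDs are 40 hex characters), and the function name slots_per_master is hypothetical.

```python
# Illustrative sample in the layout printed by `cluster nodes`.
SAMPLE = """\
id-a 10.0.0.1:7000@17000 myself,master - 0 0 1 connected 0-5460
id-b 10.0.0.1:7001@17001 master - 0 1426238316232 2 connected 5461-10922
id-c 10.0.0.1:7002@17002 master - 0 1426238318243 3 connected 10923-16383
id-d 10.0.0.1:7003@17003 slave id-a 0 1426238317239 1 connected
id-e 10.0.0.1:7004@17004 slave id-b 0 1426238317741 2 connected
id-f 10.0.0.1:7005@17005 slave id-c 0 1426238316232 3 connected
"""


def slots_per_master(cluster_nodes_output: str) -> dict[str, int]:
    """Map each master's address to the number of hash slots it serves."""
    counts = {}
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8 or "master" not in fields[2].split(","):
            continue
        address = fields[1].split("@")[0]
        total = 0
        for token in fields[8:]:
            if token.startswith("["):  # slot being migrated or imported
                continue
            first, _, last = token.partition("-")
            total += int(last or first) - int(first) + 1
        counts[address] = total
    return counts


if __name__ == "__main__":
    counts = slots_per_master(SAMPLE)
    for address, owned in counts.items():
        print(address, owned)
    print("total:", sum(counts.values()))
```

A healthy cluster should report a total of 16384, with roughly equal counts per master.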
7. Testing the Cluster
Test the cluster's functionality by performing read and write operations. Use the redis-cli utility with the -c option (cluster mode enabled). This automatically handles redirections to the correct nodes:
redis-cli -c -h 10.0.0.1 -p 7000 set mykey myvalue
redis-cli -c -h 10.0.0.1 -p 7000 get mykey
If the cluster is functioning correctly, the commands will execute without errors and return the expected results. Try running them against different nodes, including a node that does *not* own your key's hash slot. That node replies with a MOVED redirection naming the correct node, and redis-cli -c follows it automatically.
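To make the redirection step concrete, the sketch below parses the error text a node returns when it does not own a key's slot, which has the form MOVED slot host:port (ASK uses the same layout during slot migration). It is a simplified illustration of what a cluster-aware client does before retrying; the names Redirect and parse_redirect are hypothetical.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str  # "MOVED" or "ASK"
    slot: int
    host: str
    port: int


def parse_redirect(error: str) -> Optional[Redirect]:
    """Return the redirect target, or None if the error is not a redirect."""
    parts = error.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    # rpartition keeps any colons inside the host part intact.
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))


if __name__ == "__main__":
    print(parse_redirect("MOVED 12182 10.0.0.1:7002"))
    print(parse_redirect("ERR unknown command"))
```

A client without cluster support surfaces the MOVED error to the caller instead, which is why the -c option matters when testing by hand.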
8. Scaling Your Redis Cluster
Scaling a Redis Cluster involves adding or removing nodes to accommodate changes in your data volume or workload.
Adding Nodes:
- Start the new Redis node(s) with the appropriate configuration (cluster-enabled yes, a unique port, AOF settings, and so on).
- Use the redis-cli --cluster add-node command to add the new node. It takes the address of the new node first, followed by the address of any *existing* running node in the cluster. Example: redis-cli --cluster add-node 10.0.0.1:7006 10.0.0.1:7000
- The newly added node joins as a master with no hash slots assigned, so it holds no data yet. To turn it into a working master, run redis-cli --cluster reshard 10.0.0.1:7000 and assign slots to it.
- Follow the on-screen prompts of redis-cli --cluster reshard: the number of slots to move, the receiving node ID, the source node IDs, and whether you want to proceed. Resharding moves hash slots, and the keys in them, to redistribute the workload. If you see ERR BUSYKEY, a key being migrated already exists on the receiving node; remove it there manually and retry.
- Optionally, add a replica for the new master if the previous steps only added a master. Use redis-cli --cluster add-node with the --cluster-slave option. Example (with a second new node on port 7007): redis-cli --cluster add-node 10.0.0.1:7007 10.0.0.1:7000 --cluster-slave --cluster-master-id {master_id}. Alternatively, connect to a node that has joined without slots and run cluster replicate {master_id}.
- If the nodes sit in different subnets and you hit errors such as "Error creating the cluster" or keys that don't appear to be migrating, try a larger cluster-node-timeout.
Removing Nodes:
- If you're removing a master node, first migrate all of its hash slots to other nodes using redis-cli --cluster reshard. The receiving node must be an existing master; run the command several times if you want to spread the slots across several masters.
- Remove the node using the redis-cli --cluster del-node command. Example: redis-cli --cluster del-node 10.0.0.1:7000 {node_id}. The address is any node in the cluster, and {node_id} is the ID of the node to remove, as shown in the output of cluster nodes.
- Shut down the removed Redis node.
Important: Always rebalance the cluster after adding or removing nodes to ensure an even distribution of hash slots, for example with redis-cli --cluster rebalance 10.0.0.1:7000. A master that holds a disproportionate share of slots becomes a memory and throughput bottleneck for the whole cluster.
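The arithmetic behind an even slot distribution is simple enough to sketch. Given the current slot count of each master, the function below estimates how many slots each one would hand to a newly added empty master so that all end up with about 16384 / N slots. This is only an illustration of the target numbers to enter at the reshard prompts; redis-cli computes its own plan, and the function name is hypothetical.

```python
HASH_SLOTS = 16384  # total number of hash slots in a Redis Cluster


def slots_to_move(current: dict[str, int]) -> dict[str, int]:
    """Slots each existing master gives up when one empty master is added."""
    masters_after = len(current) + 1
    target = HASH_SLOTS // masters_after  # even share per master
    return {name: max(owned - target, 0) for name, owned in current.items()}


if __name__ == "__main__":
    current = {"10.0.0.1:7000": 5461, "10.0.0.1:7001": 5462, "10.0.0.1:7002": 5461}
    plan = slots_to_move(current)
    for name, count in plan.items():
        print(f"{name} gives {count} slots")
    print("new master receives", sum(plan.values()))
```

For three masters growing to four, each existing master gives up roughly a quarter of its slots and the new master ends up with 4096.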
9. Monitoring Your Redis Cluster
Continuous monitoring is crucial for maintaining the health and performance of your Redis Cluster. Monitor the following metrics:
- Node Status: Ensure all nodes are running and connected to the cluster. Use redis-cli -c -h 10.0.0.1 -p 7000 cluster nodes.
- Slot Distribution: Verify that hash slots are evenly distributed among the masters by checking the slot ranges in the cluster nodes output. Use redis-cli -c -h 10.0.0.1 -p 7000 cluster info for overall slot health, and watch fields such as cluster_slots_assigned, cluster_slots_pfail, and cluster_slots_fail.
- Replication Lag: Monitor the replication lag between master and slave nodes. Significant lag can indicate network or resource issues. Run info replication on the master and compare master_repl_offset with the offset reported for each replica.
- Resource Usage: Track CPU, memory, and network utilization for each node. If a node becomes overly stressed, consider upgrading its resources. Use top or htop, or a system metrics dashboard. For instance, htop -d 10 -s PERCENT_CPU sorts processes by CPU usage, which makes a busy redis-server process easy to spot. Redis INFO also reports memory usage, for example used_memory_rss_human (human-readable format). The maxmemory setting caps how much memory Redis may use before eviction kicks in. Ideally, size each node so that Redis stays below the configured limit, because evicting keys can be slow and CPU-intensive.
- Slow Queries: Identify and address slow queries that can impact performance. Use the slowlog get command in redis-cli for diagnosis. Consider caching more results or optimizing keys and schema as potential solutions.
You can use tools like RedisInsight, Prometheus, Grafana, or Redis Cloud's built-in monitoring capabilities for visualizing and alerting on these metrics.
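As one example of turning these metrics into numbers, the sketch below computes replication lag in bytes from the text that info replication prints on a master. The sample output is illustrative, and the function name replica_offset_lag is hypothetical; a real check would feed in the live command output and alert above a threshold of your choosing.

```python
# Illustrative sample of `info replication` output from a master.
SAMPLE = """\
# Replication
role:master
connected_slaves:1
slave0:ip=10.0.0.1,port=7003,state=online,offset=1400,lag=1
master_repl_offset:1456
"""


def replica_offset_lag(info_replication: str) -> dict[str, int]:
    """Map each replica address to how many bytes it trails the master."""
    master_offset = None
    replica_offsets = {}
    for line in info_replication.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # section headers and blank lines
        if key == "master_repl_offset":
            master_offset = int(value)
        elif key.startswith("slave") and "offset=" in value:
            fields = dict(item.split("=", 1) for item in value.split(","))
            address = f"{fields['ip']}:{fields['port']}"
            replica_offsets[address] = int(fields["offset"])
    if master_offset is None:
        raise ValueError("master_repl_offset not found")
    return {addr: master_offset - off for addr, off in replica_offsets.items()}


if __name__ == "__main__":
    for address, lag in replica_offset_lag(SAMPLE).items():
        print(address, "is", lag, "bytes behind")
```

A lag that keeps growing rather than hovering near zero is the signal worth alerting on.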
10. Best Practices for Redis Cluster
- Choose Keys Wisely: Avoid large or hot keys that can overwhelm a single node. Think carefully about how your application generates key names, because the key determines the hash slot and therefore how evenly data and load are sharded. The object encoding {keyname} command shows the internal encoding used for a key's value.
- Optimize Your Data Model: Design your data model to minimize the number of round trips between your application and Redis. Consider techniques like pipelining and using complex data structures (hashes, sets, lists) to reduce network overhead. Whether to use Redis hashes or JSON values depends on the circumstances, and each approach comes with benefits and drawbacks. Prefer hashes if you will frequently access individual fields, since HGET key field retrieves a field efficiently without loading and parsing a whole JSON document. Use JSON if the data is already in JSON format and you will perform bulk operations with commands such as JSON.GET, JSON.SET, JSON.DEL, and JSON.MGET (provided by the RedisJSON module). RESP3-capable clients can help when replies are large, as with SCAN, HGETALL, or SMEMBERS, while a very small RESP2 client may be enough when the goal is the lowest resource and bandwidth use and Redis is not the bottleneck.
- Connection Pooling: Use connection pooling on the client side to reuse Redis connections and reduce connection overhead. Many client libraries support this, such as Lettuce in Java.
- Regular Backups: Implement a regular backup strategy to protect against data loss. Backups can be taken from the RDB file once a snapshot has completed. For added fault tolerance, also replicate backups to a different region.
Conclusion
Setting up a Redis Cluster provides a robust solution for scaling your Redis deployments and ensuring high availability. By carefully planning your configuration, monitoring performance, and following best practices, you can unlock the full potential of Redis Cluster for your applications. Remember to thoroughly test your setup and adjust your configuration as needed to optimize performance for your specific workload.