Redis Streams
Redis has data types that could be used for events or message sequences, but each comes with tradeoffs. Sorted sets are memory hungry, and clients can't block on them for new messages. They're also not a good choice for time series data since entries can be moved around. Lists don't offer fan-out: a message is delivered to a single client. List entries don't have fixed identifiers. For 1-to-n workloads, there's Pub/Sub, but this is a "fire-and-forget" mechanism. Sometimes we wish to keep history, make range queries, or re-fetch messages after a reconnection. Pub/Sub lacks these properties.
Redis Streams addresses these limitations. The Stream data type can be seen as similar to a log file, except that Stream is an abstraction that's more performant due to logical offsets. It's built using radix trees and listpacks, making it space efficient while also permitting random access by IDs.
Discussion
-
What are some use cases for Redis Streams? Redis Streams is useful for building chat systems, message brokers, queuing systems, event sourcing, etc. Any system that needs to implement unified logging can use Streams. Queuing apps such as Celery and Sidekiq could use Streams. Slack-style chat apps with history can use Streams.
For IoT applications, Streams can run on end devices. This is essentially time-series data that Streams timestamps for sequential ordering. Each IoT device will store data temporarily and asynchronously push these to the cloud via Streams.
While we could use Pub/Sub along with lists and hashes to persist data, Stream is a better data type that's designed to be more performant. Also, if we use Pub/Sub and the Redis server is restarted, all clients have to resubscribe to the channel.
Since Streams supports blocking, clients need not poll for new data. Blocking enables real-time applications, that is, clients can act on new messages as soon as possible.
-
Which are the new commands introduced by Redis Streams? All commands of Redis Streams are documented online. We briefly mention them:
- Adding: XADD is the only command for adding data to a stream. Each entry has a unique ID that enables ordering.
- Reading: XREAD and XRANGE read items in the order determined by the IDs. XREVRANGE returns items in reverse order. XREAD can read from multiple streams and can be called in a blocking manner.
- Deleting: XDEL and XTRIM can remove data from the stream.
- Grouping: XGROUP is for managing consumer groups. XREADGROUP is a special version of XREAD with support for consumer groups. XACK, XCLAIM and XPENDING are other commands associated with consumer groups.
- Information: XINFO shows details of streams and consumer groups. XLEN gives the number of entries in a stream.
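To make the shape of these commands concrete, here is a minimal sketch in Python that only assembles the argument lists a client would send to the server. The helper names (xadd_args, xread_args, xrange_args) are hypothetical and not part of any Redis client library; the argument order follows the documented command syntax, and nothing here talks to a real server.

```python
from typing import Dict, List, Optional


def xadd_args(key: str, fields: Dict[str, str], entry_id: str = "*",
              maxlen: Optional[int] = None) -> List[str]:
    """Build arguments for XADD; "*" asks the server to generate the ID."""
    args = ["XADD", key]
    if maxlen is not None:
        # "~" allows approximate trimming, which is cheaper than exact trimming.
        args += ["MAXLEN", "~", str(maxlen)]
    args.append(entry_id)
    for name, value in fields.items():
        args += [name, value]
    return args


def xread_args(streams: Dict[str, str], count: Optional[int] = None,
               block_ms: Optional[int] = None) -> List[str]:
    """Build arguments for XREAD: all keys first, then one ID per key."""
    args = ["XREAD"]
    if count is not None:
        args += ["COUNT", str(count)]
    if block_ms is not None:
        # The blocking timeout is given in milliseconds.
        args += ["BLOCK", str(block_ms)]
    args.append("STREAMS")
    args += list(streams.keys())
    args += list(streams.values())
    return args


def xrange_args(key: str, start: str = "-", end: str = "+",
                count: Optional[int] = None) -> List[str]:
    """Build arguments for XRANGE; "-" and "+" cover the whole stream."""
    args = ["XRANGE", key, start, end]
    if count is not None:
        args += ["COUNT", str(count)]
    return args
```

For instance, xread_args({"mystream": "$"}, block_ms=5000) describes a blocking read that waits up to five seconds for entries newer than those already in the stream.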
-
What are the main features of Redis Streams? Streams is a first-class citizen of Redis. It benefits from the usual Redis capabilities of persistence, replication and clustering. It's stored in-memory and under a single key.
The main features of Streams are:
- Asynchronous: Producers and consumers need not be simultaneously connected to the stream. Consumers can subscribe to streams (push) or read periodically (pull).
- Blocking: Consumers need not keep polling for new messages.
- Capped Streams: Streams can be truncated, keeping only the N most recent messages.
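- At-Least-Once Delivery: Within a consumer group, a message stays pending until it's acknowledged, so it can be delivered again after a failure. This makes the system robust.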
- Counter: Every pending message has a counter of delivery attempts. We can use this for dead letter queuing.
- Deletion: While events and logs don't usually have deletion as a feature, Streams supports this efficiently. Deletion allows us to address privacy or regulatory concerns.
- Persistent: Unlike Pub/Sub, messages are persistent. Since history is saved, a consumer can look at previous messages.
- Lookback Queries: This helps consumers analyse past data, such as, obtain temperature readings in a particular 10-second window.
- Scale-Out Options: Via consumer groups, we can easily scale out. Consumers can share the load of processing a fast-incoming data stream.
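Two of the features above, capped streams and lookback queries, can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: ToyStream is a hypothetical class, it stores entries in a plain Python list rather than the radix tree Redis uses, and it trims exactly rather than approximately.

```python
import bisect
from typing import List, Tuple

Entry = Tuple[int, int, dict]  # (millisecond timestamp, sequence, fields)


class ToyStream:
    """Toy model of a capped stream; not how Redis stores data internally."""

    def __init__(self, maxlen: int) -> None:
        self.maxlen = maxlen
        self.entries: List[Entry] = []

    def add(self, ms: int, seq: int, fields: dict) -> None:
        """Append an entry, then drop the oldest entries beyond maxlen."""
        if self.entries and (ms, seq) <= self.entries[-1][:2]:
            raise ValueError("IDs must be strictly increasing")
        self.entries.append((ms, seq, fields))
        if len(self.entries) > self.maxlen:
            del self.entries[: len(self.entries) - self.maxlen]

    def window(self, start_ms: int, end_ms: int) -> List[Entry]:
        """Return entries whose timestamp lies in [start_ms, end_ms]."""
        timestamps = [entry[0] for entry in self.entries]
        lo = bisect.bisect_left(timestamps, start_ms)
        hi = bisect.bisect_right(timestamps, end_ms)
        return self.entries[lo:hi]
```

Because entries are ordered by ID, a time window maps to a contiguous slice, which is why range queries over a stream can be answered without scanning every entry.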
-
Could you explain consumer groups in Redis? A consumer group allows consumers of that group to share the task of consuming messages from a stream. Thus, a message in a stream can be consumed by only one consumer in that consumer group. This relieves the burden on a consumer to process all messages.
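The group mechanics covered in this answer (a last-delivered ID shared by the group, a pending list, acknowledgement and claiming) can be sketched with a toy in-memory model. ToyGroup and Pending are hypothetical names, entry IDs are simplified to plain integers, and the model ignores blocking, idle times and persistence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Pending:
    consumer: str        # current owner of the pending entry
    deliveries: int = 1  # how many times the entry has been delivered


@dataclass
class ToyGroup:
    """Toy model of one consumer group reading one stream."""
    entries: List[Tuple[int, str]]  # (id, payload), ordered by id
    last_delivered: int = 0         # shared by all consumers of the group
    pel: Dict[int, Pending] = field(default_factory=dict)

    def read_new(self, consumer: str, count: int = 1) -> List[Tuple[int, str]]:
        """Like XREADGROUP with '>': hand out entries nobody has received yet."""
        batch = [e for e in self.entries if e[0] > self.last_delivered][:count]
        for entry_id, _ in batch:
            self.pel[entry_id] = Pending(consumer)
            self.last_delivered = entry_id
        return batch

    def ack(self, entry_id: int) -> bool:
        """Like XACK: remove the entry from the pending list."""
        return self.pel.pop(entry_id, None) is not None

    def claim(self, entry_id: int, new_consumer: str) -> None:
        """Like XCLAIM: change the owner and count another delivery."""
        pending = self.pel[entry_id]
        pending.consumer = new_consumer
        pending.deliveries += 1
```

In this model each entry goes to exactly one consumer of the group, and an entry that is never acknowledged stays in the pending list where another consumer can take it over. The deliveries counter is what a dead-letter policy could inspect.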
The XGROUP command creates a consumer group. A consumer is added to a group the first time it calls XREADGROUP. A consumer always has to identify itself with a unique consumer name.
A stream can have multiple consumer groups. Each consumer group tracks the ID of the last delivered message. This ID is shared by all consumers of the group. Once a consumer reads a message, its ID is added to a Pending Entries List (PEL). The consumer must acknowledge that it has processed the message, using the XACK command. Once acknowledged, the message is removed from the pending list. Another consumer can claim a pending message using the XCLAIM command and begin processing it. This helps in recovering from failures. However, a consumer can choose to use the NOACK option of XREADGROUP if high reliability is not important.
-
Could you share more details about IDs in Redis Streams? Entries within a stream are ordered using IDs. Each ID has two parts separated by a hyphen: a UNIX millisecond timestamp followed by a sequence number that distinguishes entries added in the same millisecond. Each part is a 64-bit number. For example, 1526919030474-55 is a valid ID.
IDs are autogenerated when the XADD command is called. However, a client can specify its own ID, provided it's greater than all other IDs in the stream.
Incomplete IDs are those in which the second part is omitted. With XRANGE, Redis will fill in a suitable second part for us. With XREAD, the second part is always taken to be 0.
Some IDs are special:
- $: Used with XREAD to block for new messages, ignoring messages already in the stream.
- - and +: Used with XRANGE, to specify the minimum and maximum IDs possible within the stream. For example, the following command will return every entry in the stream: XRANGE mystream - +
- >: Used with XREADGROUP, to get new messages (never delivered to other clients). If this command uses any other ID, it has the effect of returning pending entries of that client.
-
In what technical aspects does Redis Streams differ from other Redis data types? Unlike other Redis blocking commands that specify timeouts in seconds, commands XREAD and XREADGROUP specify timeouts in milliseconds. Another difference is that when blocking on list pop operations, the first client will be served when new data arrives. With Stream XREAD command, every client blocking on the stream will get the new data.
When an aggregate data type is emptied, its key is automatically destroyed. This is not the case with Stream data type. The reason for this is to preserve the state associated with consumer groups. Stream is not deleted even if there are no consumer groups but this behaviour may be changed in future versions.
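The difference in how blocked clients are served can be shown with a toy simulation. Both function names are hypothetical, and the "waiting" queue simply holds the names of blocked clients in the order they started waiting; no real blocking or networking is involved.

```python
from collections import deque
from typing import Deque, Dict, List


def serve_list_push(waiting: Deque[str], item: str) -> Dict[str, List[str]]:
    """Blocking list pop: only the client first in the queue gets the item."""
    if not waiting:
        return {}
    return {waiting.popleft(): [item]}


def serve_stream_add(waiting: Deque[str], item: str) -> Dict[str, List[str]]:
    """Blocking stream read: every waiting client gets the same new entry."""
    served = {client: [item] for client in waiting}
    waiting.clear()
    return served
```

With three waiting clients, the list variant unblocks one of them and leaves two waiting, while the stream variant unblocks all three with the same entry.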
-
How does Redis Streams compare against Kafka? Apache Kafka is a well-known alternative for Redis Streams. In fact, some features of Streams such as consumer groups have been inspired by Kafka.
However, Kafka is said to be difficult to configure and expensive to operate on typical public clouds. Streams is therefore a better option for small, inexpensive apps.
-
Could you share some performance numbers on Redis Streams? In one test on a two-core machine with multiple producers and consumers, messages were generated at 10K per second. With COUNT 10000 given to the XREADGROUP command, every iteration processed 10K messages. It was seen that 99.9% of requests had a latency of less than 2 ms. Real-world performance is expected to be better than this.
When compared against traditional Pub/Sub messaging, Streams gives 100x better throughput. It's able to handle more than 1 million operations per second. If Pub/Sub messages are persisted to network storage, latency is about 5 ms. Streams has less than 1 ms latency.
-
Could you share some developer tips for using Redis Streams? There are dozens of Redis clients in various languages. Many of these have support for Streams.
Use XREAD for 1-to-1 or 1-to-n messaging. Use XRANGE for windowing-based stream processing. Within a consumer group, if a client fails temporarily, it can reread messages from a specific ID. For permanent failures, other clients can claim pending messages.
For real-time streaming analytics, one suggestion is to pair up Redis Streams with Apache Spark. The latter has the feature Structured Streaming that pairs up nicely with Streams. To scale out, multiple Spark jobs can belong to a single consumer group. Since Streams is persistent, even if a Spark job restarts, it won't miss any data since it will start consuming from where it left off.
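As an illustration of windowing with XRANGE, the sketch below computes the bounds of a fixed-size (tumbling) window and builds the matching command arguments. The function names are hypothetical, and the approach assumes autogenerated IDs, so that the first part of each ID is a millisecond timestamp. It relies on incomplete IDs: XRANGE treats a bare start timestamp as sequence 0 and a bare end timestamp as the largest sequence, so adjacent windows don't overlap.

```python
from typing import List, Tuple


def window_bounds(timestamp_ms: int, size_ms: int) -> Tuple[str, str]:
    """Return incomplete start and end IDs of the window holding timestamp_ms."""
    start = (timestamp_ms // size_ms) * size_ms
    end = start + size_ms - 1  # XRANGE bounds are inclusive
    return str(start), str(end)


def xrange_window_args(key: str, timestamp_ms: int, size_ms: int) -> List[str]:
    """Build XRANGE arguments that fetch one whole window of entries."""
    start, end = window_bounds(timestamp_ms, size_ms)
    return ["XRANGE", key, start, end]
```

For a 10-second window, size_ms would be 10000, and each call covers exactly one aligned 10-second slice of the stream.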
Milestones
2017
Salvatore Sanfilippo, creator of Redis, gives a demo of Redis Streams and explains the API. In October, he blogs about it. He explains that the idea occurred much earlier. He tinkered with implementing a generalization of sorted sets and lists but was not happy with the results. When Redis 4.0 came out with support for modules, Timothy Downs created a data type for logging transactions. Sanfilippo used this as an inspiration to create Redis Streams.
2018
Further Reading
- Kumar, Roshan. 2019. "Redis + Structured Streaming—A Perfect Combination to Scale Out Your Continuous Applications." Databricks, on YouTube, May 08. Accessed 2019-10-15.
- Doglio, Fernando. 2019. "Why are we getting Streams in Redis?" LogRocket Blog, January 17. Accessed 2019-10-15.
- Redis. 2019a. "Introduction to Redis Streams." Redis. Accessed 2019-10-15.
- Leach, Brandur. 2017. "Redis Streams and the Unified Log." November 08. Accessed 2019-10-15.
- Leifer, Charles. 2018. "Introduction to Redis streams with Python." Blog, October 08. Accessed 2019-10-15.
- Kumar, Roshan. 2017. "How to use Redis for real-time stream processing." InfoWorld, August 02. Accessed 2019-10-15.
See Also
- Redis Data Types
- Stream Processing
- Event Sourcing
- Apache Kafka
- Publish-Subscribe Pattern
- Producer-Consumer Problem