Redis Streams

Redis Streams is similar to the unified log pattern. Source: Leach 2017.

Redis already has data types that could be used for events or message sequences, but each comes with tradeoffs. Sorted sets are memory hungry, clients can't block waiting for new members, and they're a poor fit for time-series data since entries can be reordered. Lists don't offer fan-out: each message is delivered to a single client, and list entries have no fixed identifiers. For 1-to-n workloads there's Pub/Sub, but it's a fire-and-forget mechanism: sometimes we wish to keep history, make range queries, or re-fetch messages after a reconnection, and Pub/Sub lacks these properties.

Redis Streams addresses these limitations. The Stream data type can be seen as an append-only log, but as an abstraction built on logical offsets it's more performant. Internally it uses radix trees and listpacks, making it space efficient while still permitting random access by ID.
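
As a minimal sketch of this log-like behaviour (the key mystream, the field names and the IDs shown are illustrative), entries are appended with XADD and can be fetched back by ID:

    # Append an entry; Redis returns an auto-generated ID (millisecond timestamp, then a sequence number)
    ip:7000> xadd mystream * sensor-id 1234 temperature 19.8
    1526919030474-0
    # Random access by ID: fetch exactly that entry back
    ip:7000> xrange mystream 1526919030474-0 1526919030474-0
    # Number of entries appended so far
    ip:7000> xlen mystream
    (integer) 1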

Discussion

  • What are some use cases for Redis Streams?
    Redis Streams running within a container on a Raspberry Pi. Source: ApsaraDB 2018.

    Redis Streams is useful for building chat systems, message brokers, queuing systems, event sourcing, and more. Any system that needs unified logging can use Streams. Queuing apps such as Celery and Sidekiq could adopt it, as could Slack-style chat apps that keep message history.

    For IoT applications, Streams can run on end devices. The data is essentially a time series, which Streams timestamps for sequential ordering. Each IoT device stores data temporarily and asynchronously pushes it to the cloud via Streams.

    While we could use Pub/Sub along with lists and hashes to persist data, Stream is a single data type designed for this purpose and built to be more performant. Also, if we use Pub/Sub and the Redis server is restarted, all clients have to resubscribe to the channel.

    Since Streams supports blocking, clients need not poll for new data. Blocking enables real-time applications: clients can act on new messages as soon as they arrive.

  • Which are the new commands introduced by Redis Streams?
    Illustrating the use of some commands of Redis Streams. Source: Huawei Cloud 2019, fig. 2.

    All commands of Redis Streams are documented online. We briefly mention them below; a short session after the list shows a few of them in action:

    • Adding: XADD is the only command for adding data to a stream. Each entry has a unique ID that enables ordering.
    • Reading: XREAD and XRANGE read items in the order determined by the IDs. XREVRANGE returns items in reverse order. XREAD can read from multiple streams and can be called in a blocking manner.
    • Deleting: XDEL and XTRIM can remove data from the stream.
    • Grouping: XGROUP is for managing consumer groups. XREADGROUP is a special version of XREAD with support for consumer groups. XACK, XCLAIM and XPENDING are other commands associated with consumer groups.
    • Information: XINFO shows details of streams and consumer groups. XLEN gives the number of entries in a stream.
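
    The following short session (the key mystream is illustrative and some replies are abbreviated) shows a few of these commands in action:

    # Add an entry, then query the stream
    ip:7000> xadd mystream * field1 value1
    1526919030474-0
    # XLEN: number of entries in the stream
    ip:7000> xlen mystream
    (integer) 1
    # XINFO STREAM: length, first/last entry, internal structure details, etc.
    ip:7000> xinfo stream mystream
    # XDEL: delete a specific entry by ID
    ip:7000> xdel mystream 1526919030474-0
    (integer) 1
    # XTRIM: keep only the 1000 most recent entries
    ip:7000> xtrim mystream maxlen 1000
    (integer) 0
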
  • What are main features of Redis Streams?

    Streams is a first-class citizen of Redis. It benefits from the usual Redis capabilities of persistence, replication and clustering. It's stored in memory under a single key.

    The main features of Streams are:

    • Asynchronous: Producers and consumers need not be simultaneously connected to the stream. Consumers can subscribe to streams (push) or read periodically (pull).
    • Blocking: Consumers need not keep polling for new messages.
    • Capped Streams: Streams can be truncated, keeping only the N most recent messages (see the sketch after this list).
    • At-Least-Once Delivery: Unacknowledged messages stay pending and can be re-fetched or claimed, making the system robust against consumer failures.
    • Counter: Every pending message has a counter of delivery attempts. We can use this for dead letter queuing.
    • Deletion: While events and logs don't usually have deletion as a feature, Streams supports this efficiently. Deletion allows us to address privacy or regulatory concerns.
    • Persistent: Unlike Pub/Sub, messages are persistent. Since history is saved, a consumer can look at previous messages.
    • Lookback Queries: Consumers can analyse past data, such as temperature readings in a particular 10-second window.
    • Scale-Out Options: Via consumer groups, we can easily scale out. Consumers can share the load of processing a fast-incoming data stream.
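
    A few of these features in command form (key names, field names and timestamps are illustrative):

    # Capped stream: keep roughly the 1000 most recent entries while adding (~ allows approximate trimming for efficiency)
    ip:7000> xadd temperatures maxlen ~ 1000 * celsius 21.5
    1526919030474-0
    # Blocking: wait up to 5000 ms for entries newer than those already seen
    ip:7000> xread block 5000 streams temperatures $
    # Lookback query: all readings in a particular 10-second window (IDs begin with millisecond timestamps)
    ip:7000> xrange temperatures 1526919030000 1526919040000
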
  • Could you explain consumer groups in Redis?
    Consumers within a consumer group share the processing load. Source: Kumar 2018, fig. 1.3.

    A consumer group allows consumers of that group to share the task of consuming messages from a stream. Thus, a message in a stream can be consumed by only one consumer in that consumer group. This relieves the burden on a consumer to process all messages.

    Command XGROUP creates a consumer group. A consumer is added to a group the first time it calls XREADGROUP. A consumer always has to identify itself with a unique consumer name.

    A stream can have multiple consumer groups. Each consumer group tracks the ID of the last consumed message, shared by all consumers of the group. Once a message is delivered to a consumer, its ID is added to the group's Pending Entries List (PEL). The consumer must acknowledge that it has processed the message using the XACK command; once acknowledged, the message is removed from the pending list. Another consumer can claim a pending message using the XCLAIM command and begin processing it. This helps in recovering from failures. However, a consumer can choose the NOACK option of XREADGROUP if high reliability is not important.
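
    A sketch of this flow, assuming the stream mystream already exists (group name, consumer names and IDs are illustrative):

    # Create a consumer group that reads only new messages ($)
    ip:7000> xgroup create mystream mygroup $
    OK
    # consumer1 joins the group on its first XREADGROUP call and asks for up to 10 new messages
    ip:7000> xreadgroup group mygroup consumer1 count 10 streams mystream >
    # Acknowledge a processed message so it's removed from the Pending Entries List
    ip:7000> xack mystream mygroup 1526919030474-0
    (integer) 1
    # Inspect messages delivered but not yet acknowledged
    ip:7000> xpending mystream mygroup
    # consumer2 claims a message that has been idle for over 60 seconds, e.g. after consumer1 failed
    ip:7000> xclaim mystream mygroup consumer2 60000 1526919031000-0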

  • Could you share more details about IDs in Redis Streams?

    Entries within a stream are ordered by their IDs. Each ID has two parts separated by a hyphen: a UNIX millisecond timestamp followed by a sequence number that distinguishes entries added within the same millisecond. Each part is a 64-bit number. For example, 1526919030474-55 is a valid ID.

    IDs are autogenerated when XADD is called with * in place of an ID. However, a client can specify its own ID, provided it's greater than all IDs already in the stream.

    An ID is incomplete when the sequence part is omitted. With XRANGE, Redis fills in a suitable sequence part for us. With XREAD, a missing sequence part is taken as 0.

    Some IDs are special (an example follows this list):

    • $: Used with XREAD to block for new messages, ignoring messages already in the stream.
    • - & +: Used with XRANGE to specify the minimum and maximum possible IDs within the stream. For example, the following command returns every entry in the stream: XRANGE mystream - +
    • >: Used with XREADGROUP to get new messages (never delivered to any other consumer). If any other ID is given instead, the command returns that consumer's pending entries after that ID.
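
    For example (key, group and consumer names are illustrative):

    # Every entry in the stream, from the minimum to the maximum possible ID
    ip:7000> xrange mystream - +
    # Block for up to 5000 ms, delivering only messages that arrive after this call
    ip:7000> xread block 5000 streams mystream $
    # Within a consumer group, fetch messages never delivered to any consumer
    ip:7000> xreadgroup group mygroup consumer1 streams mystream >
    # Any other ID instead of > returns this consumer's pending entries after that ID
    ip:7000> xreadgroup group mygroup consumer1 streams mystream 0
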
  • In what technical aspects does Redis Streams differ from other Redis data types?
    Comparing Redis Stream data type with other types. Source: Huawei Cloud 2019, table 2.

    Unlike other Redis blocking commands that specify timeouts in seconds, XREAD and XREADGROUP specify timeouts in milliseconds. Another difference is that when blocking on list pop operations, only the first client is served when new data arrives. With the Stream XREAD command, every client blocking on the stream gets the new data.
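
    To illustrate the difference in timeout units (key names are illustrative):

    # List blocking pop: timeout in seconds; only one blocked client receives the element
    ip:7000> blpop mylist 5
    # Stream blocking read: timeout in milliseconds; every blocked client receives the new entries
    ip:7000> xread block 5000 streams mystream $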

    When an aggregate data type is emptied, its key is automatically destroyed. This is not the case with the Stream data type, so as to preserve the state associated with consumer groups. A stream is not deleted even if it has no consumer groups, but this behaviour may change in future versions.

  • How does Redis Streams compare against Kafka?

    Apache Kafka is a well-known alternative to Redis Streams. In fact, some features of Streams, such as consumer groups, have been inspired by Kafka.

    However, Kafka is said to be difficult to configure and expensive to operate on typical public clouds. Streams is therefore a better option for small, inexpensive apps.

  • Could you share some performance numbers on Redis Streams?

    In one test on a two-core machine with multiple producers and consumers, messages were generated at 10K per second. With COUNT 10000 given to the XREADGROUP command, every iteration processed 10K messages. It was seen that 99.9% of requests had a latency of less than 2 ms. Real-world performance is expected to be better than this.
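
    The batched reads in such a test would look roughly like this (group, consumer and key names are illustrative):

    # Each call fetches up to 10K new messages for this consumer in one round trip
    ip:7000> xreadgroup group mygroup consumer1 count 10000 block 1000 streams mystream >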

    When compared against traditional Pub/Sub messaging, Streams gives 100x better throughput. It's able to handle more than 1 million operations per second. If Pub/Sub messages are persisted to network storage, latency is about 5 ms. Streams has less than 1 ms latency.

  • Could you share some developer tips for using Redis Streams?

    There are dozens of Redis clients in various languages. Many of these have support for Streams.

    Use XREAD for 1-to-1 or 1-to-n messaging. Use XRANGE for windowing-based stream processing. Within a consumer group, if a client fails temporarily, it can reread messages from a specific ID. For permanent failures, other clients can claim pending messages.
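
    For instance, a consumer that failed temporarily could re-read its unacknowledged history from a known ID (names and the ID are illustrative):

    # Re-read this consumer's pending messages starting after the given ID
    ip:7000> xreadgroup group mygroup consumer1 streams mystream 1526919030474-0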

    For real-time streaming analytics, one suggestion is to pair up Redis Streams with Apache Spark. The latter has the feature Structured Streaming that pairs up nicely with Streams. To scale out, multiple Spark jobs can belong to a single consumer group. Since Streams is persistent, even if a Spark job restarts, it won't miss any data since it will start consuming from where it left off.

Milestones

Sep
2017

Salvatore Sanfilippo, creator of Redis, gives a demo of Redis Streams and explains the API. In October, he blogs about it. He explains that the idea occurred much earlier. He tinkered with implementing a generalization of sorted sets and lists but was not happy with the results. When Redis 4.0 came out with support for modules, Timothy Downs created a data type for logging transactions. Sanfilippo used this as an inspiration to create Redis Streams.

May
2018

The first release candidate RC1 of Redis 5.0 is released. It supports the Stream data type.

Jul
2018

Beta version of Redis Enterprise Software (RS) 5.3 is released. It's based on Redis 5.0 RC3, with support for the Stream data type.

Oct
2018

Redis 5.0.0 is released.

May
2019

At the Redis Conference 2019, Dan Pipe-Mazo talks about Atom, a microservices SDK powered by Redis Streams. Microservices interact with one another using Streams.

Sample Code

  • # Source: https://www.alibabacloud.com/blog/redis-streams-redis-5-0s-newest-data-type_593816
    # Accessed: 2019-10-15
     
    # Currently Redis does not support creating empty streams, so we add a special message
    # to create a new stream (channel)
    ip:7000> xadd channel1 * create-channel null
    1528702126345-0
     
    # Use the xadd command to send a message. We can name each message and attach the message source 
    # for business logic processing convenience.
    # We can also send multiple messages at one time as this is helpful for optimizing network overhead.
    ip:7000> xadd channel1 * msg1-tony "Hello everyone."
    1528702503377-0
    ip:7000> xadd channel1 * msg2-tony "I am a big Redis fan." msg3-tony "Hope we can learn from each other.:-)"
    1528702573546-0
     
    # When a user joins a channel for the first time, "$" is given as the special start ID,
    # so the user receives only messages that arrive after this call.
    # On subsequent reads, the user passes the ID returned by the previous call.
    # If there are no new messages, the xread command returns nil
    ip:7000> xread BLOCK 100 STREAMS channel1 $
    1) 1) "channel1"
       2) 1) 1) 1528703048021-0
             2) 1) "msg1-tony"
                2) "Hello everyone."
    ip:7000> xread BLOCK 100 STREAMS channel1 1528703048021-0
    1) 1) "channel1"
       2) 1) 1) 1528703061087-0
             2) 1) "msg2-tony"
                2) "I am a big Redis fan."
                3) "msg3-tony"
                4) "Hope we can learn from each other.:-)"
    ip:7000> xread BLOCK 100 STREAMS channel1 1528703061087-0
    (nil)
     
    # 1528703061087-0 is the ID of the user's last received message
    ip:7000> xrange channel1 1528703061087-0 +
    1) 1) 1528706457462-0
       2) 1) "msg1-andy"
          2) "Nice to meet you guys."
    2) 1) 1528706497200-0
       2) 1) "msg4-tony"
          2) "When will Redis 5.0 GA comes out?"
    3) 1) 1528706601973-0
       2) 1) "msg1-antirez"
          2) "I think it will arrive in the second half of 2018."
     

References

  1. AWS. 2019. "Working with Redis Streams." Amazon ElastiCache, Amazon Web Services. Accessed 2019-10-15.
  2. ApsaraDB. 2018. "Redis Streams – Redis 5.0's Newest Data Type." Alibaba Cloud Community, July 11. Accessed 2019-10-15.
  3. Arora, Ruchita and Aseem Cheema. 2018. "Building a Messaging Application with Redis Streams." DAT353, AWS re:Invent. Accessed 2019-10-15.
  4. Cro, Loris. 2019. "What to Choose for Your Synchronous and Asynchronous Communication Needs—Redis Streams, Redis Pub/Sub, Kafka, etc." Redis Labs, May 03. Accessed 2019-10-15.
  5. Davis, Kyle. 2018. "Redis 5.0 Is Here!" DZone, November 06. Accessed 2019-10-15.
  6. Giamas, Alex. 2018. "Redis 5.0 Released with New Streams Data Type." InfoQ, October 26. Accessed 2019-10-15.
  7. Huawei Cloud. 2019. "New Features of DCS Redis 5.0." Distributed Cache Service, Huawei Cloud, September 11. Accessed 2019-10-15.
  8. Kumar, Roshan. 2018. "How to Build Apps using Redis Streams." Tutorial, Redis Labs. Accessed 2019-10-15.
  9. Kumar, Roshan. 2019. "Redis Streams + Apache Spark Structured Streaming." Redis Labs, June 03. Accessed 2019-10-15.
  10. Leach, Brandur. 2017. "Redis Streams and the Unified Log." November 08. Accessed 2019-10-15.
  11. O'Rourke, Brian P. 2018. "Timeseries in Redis with Streams." Blog, RedisGreen, December 07. Accessed 2019-10-15.
  12. Pipe-Mazo, Dan. 2019. "Atom: The Redis Streams-Powered Microservices SDK." Redis Labs, via SlidePlayer, May 08. Accessed 2019-10-15.
  13. Redis. 2019a. "Introduction to Redis Streams." Redis. Accessed 2019-10-15.
  14. Redis. 2019b. "Redis commands: Streams." Redis. Accessed 2019-10-15.
  15. Redis. 2019c. "An introduction to Redis data types and abstractions." Redis. Accessed 2019-10-15.
  16. Redis Labs. 2018. "Tech Preview: Redis Enterprise 5.3 with Streams is here!" Site, July 30. Accessed 2019-10-15.
  17. Redis Labs Docs. 2019. "Redis Enterprise Software Release Notes 5.3 BETA (July 2018)." Redis Labs, July. Accessed 2019-10-15.
  18. Sanfilippo, Salvatore. 2017a. "Redis Streams Video." YouTube, September 12. Accessed 2019-10-15.
  19. Sanfilippo, Salvatore. 2017b. "Streams: a new general purpose data structure in Redis." antirez News, October. Accessed 2019-10-15.
  20. Sanfilippo, Salvatore. 2019. "Redis 5.0 release notes." Redis 5.0.6, September 25. Accessed 2019-10-15.

Further Reading

  1. Kumar, Roshan. 2019. "Redis + Structured Streaming—A Perfect Combination to Scale Out Your Continuous Applications." Databricks, on YouTube, May 08. Accessed 2019-10-15.
  2. Doglio, Fernando. 2019. "Why are we getting Streams in Redis?" LogRocket Blog, January 17. Accessed 2019-10-15.
  3. Redis. 2019a. "Introduction to Redis Streams." Redis. Accessed 2019-10-15.
  4. Leach, Brandur. 2017. "Redis Streams and the Unified Log." November 08. Accessed 2019-10-15.
  5. Leifer, Charles. 2018. "Introduction to Redis streams with Python." Blog, October 08. Accessed 2019-10-15.
  6. Kumar, Roshan. 2017. "How to use Redis for real-time stream processing." InfoWorld, August 02. Accessed 2019-10-15.
