Dragonfly vs Redis: Single-Binary Performance

Your Redis Is Running on One Core and That’s Hilarious

You’ve got a beefy server. Sixteen cores, 64 GB of RAM, NVMe storage. Your Redis instance is pegging one core at 100% while the other fifteen sit there watching Netflix. That’s the Redis experience, and it’s been the Redis experience for fifteen years.

Enter DragonflyDB, a drop-in Redis-compatible in-memory store that actually wants to use your hardware. It’s written in C++20, designed around a thread-per-core model, and it’ll cheerfully spin up a shard on every core your machine has. One binary, one process, sixty-four cores if you want them.

Is it magic? No. Is it interesting enough to talk about? Absolutely.

What Dragonfly Actually Is

DragonflyDB is a high-performance in-memory data store that speaks the Redis protocol. You point your existing app at it, you don’t change a line of code, and it just works, at least for the common stuff.

Under the hood it’s a completely different animal. The project was started in 2021, went open source in 2022, and has been steadily closing the Redis compatibility gap ever since. As of mid-2026, you’re looking at Dragonfly 1.x releases that support the vast majority of Redis 6.x commands and a solid chunk of Redis 7.x.

The license is Business Source License (BSL) 1.1, meaning production use is absolutely fine, but if you’re AWS trying to sell it as a managed service without contributing anything back, you’ll hit restrictions. After four years from each release date, the code converts to Apache 2.0. For home lab and self-hosted use cases, this is a non-issue. You’re good.

Architecture: Why Redis Is Single-Threaded (And Why That Was Actually Fine Until It Wasn’t)

Redis made a bet in the early 2010s: single-threaded command execution is simpler, avoids lock contention, and is fast enough for most workloads because memory operations are blazing fast. The bet paid off for a decade.

The architecture looks like this:

Single event loop handles all client connections and command execution
I/O threads (added in Redis 6) handle network reads/writes to reduce bottlenecks at the networking layer
Result: you can get maybe 200,000 to 400,000 ops/sec on a well-tuned Redis instance, and that ceiling doesn’t move much no matter how many cores you throw at the machine

Valkey 8.x (the community fork born from the Redis license drama of 2024) adds multi-threaded I/O improvements and some internal concurrency work, but the core command execution model is still fundamentally event-loop-based. It’s faster than Redis in some benchmarks, but it’s iterating on the same foundation.

Dragonfly blows up the foundation entirely.

Dragonfly’s Shared-Nothing Model

Dragonfly uses a shard-per-core architecture. When you start it, it spins up N worker threads, one per logical CPU core by default. Each thread owns a shard: a slice of the keyspace, determined by hashing the key. Client connections are load-balanced across these threads.

Single-key operations: route directly to the owning shard, fully parallel across all keys
Multi-key operations: go through a coordinator shard that fans out sub-operations to the relevant shards and collects results
No global locks on the hot path

This is the “shared-nothing” model that databases like ScyllaDB (which heavily inspired Dragonfly’s architecture) use. For independent single-key traffic it scales close to linearly: add more cores, get more throughput.

The tradeoff is complexity in multi-key operations. MSET, MGET, EVAL (Lua), and transactions that span multiple keys have to coordinate across shards. Dragonfly handles this, but Lua script compatibility isn’t 100%, and complex MULTI/EXEC blocks that touch many keys can have edge cases.
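To make the routing concrete, here is a toy sketch of the idea in Python. It is not Dragonfly's code: the CRC32 hash, the shard count of 8, and the names `shard_for`, `plan_mget`, and `ToyShardedStore` are all assumptions picked for illustration. It shows why a single-key command touches exactly one shard while an MGET has to be split up and stitched back together.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 8  # assumption: pretend we have one shard per core on an 8-core box


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to its owning shard (CRC32 is a stand-in hash, not Dragonfly's)."""
    return zlib.crc32(key.encode()) % num_shards


def plan_mget(keys, num_shards: int = NUM_SHARDS) -> dict:
    """Group keys by owning shard: this is the fan-out a coordinator has to do."""
    plan = defaultdict(list)
    for key in keys:
        plan[shard_for(key, num_shards)].append(key)
    return dict(plan)


class ToyShardedStore:
    """Each shard is a private dict; no shard ever reads another shard's data."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.shards = [dict() for _ in range(num_shards)]

    def set(self, key, value):
        # Single-key write: exactly one shard is involved.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        # Single-key read: exactly one shard is involved.
        return self.shards[shard_for(key, len(self.shards))].get(key)

    def mget(self, keys):
        # Multi-key read: fan out per shard, then reassemble in request order.
        results = {}
        for shard_id, shard_keys in plan_mget(keys, len(self.shards)).items():
            shard = self.shards[shard_id]
            for key in shard_keys:
                results[key] = shard.get(key)
        return [results[key] for key in keys]
```

In a real server each shard would be a thread with its own memory, and the fan-out step is where the coordination cost of multi-key commands comes from.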

Memory: The Dashtable Story

Redis uses a classic hash table internally. You know the one: when it grows, it allocates a bigger table and migrates entries over, so memory can spike during the resize. Dragonfly uses a custom data structure called dashtable, named after the Dash hashing paper (“Dash: Scalable Hashing on Persistent Memory”) that its design builds on.

The dashtable is designed to avoid the “all-at-once” rehashing problem. Resizes happen incrementally, which smooths out memory usage and latency spikes. Dragonfly claims roughly 30% better memory efficiency than Redis for equivalent datasets.

In practice this shows up most with large keyspaces of small-to-medium values, which is exactly what a cache hot path looks like. Your 50 million session keys? Dragonfly will fit more of them in the same RAM.
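If “incremental resize” feels hand-wavy, this toy sketch may help. It is a plain extendible-hashing table, which is the family of designs dashtable belongs to, and emphatically not Dragonfly's actual data structure. The class names, the CRC32 hash, and the segment capacity of 8 are assumptions for illustration. The point it demonstrates: when a segment fills up, only that one segment's entries move, never the whole table.

```python
import zlib


def h(key: str) -> int:
    """Deterministic stand-in hash (an assumption, not Dragonfly's hash)."""
    return zlib.crc32(key.encode())


class Segment:
    def __init__(self, local_depth: int):
        self.local_depth = local_depth  # how many low hash bits this segment owns
        self.items = {}


class SegmentedTable:
    """Toy extendible hashing: a directory of small fixed-capacity segments."""

    def __init__(self, segment_capacity: int = 8):
        self.segment_capacity = segment_capacity
        self.global_depth = 0
        self.directory = [Segment(0)]
        self.max_moved_in_one_split = 0  # bookkeeping for the demo

    def _segment_for(self, key: str) -> Segment:
        return self.directory[h(key) & ((1 << self.global_depth) - 1)]

    def get(self, key: str):
        return self._segment_for(key).items.get(key)

    def set(self, key: str, value) -> None:
        while True:
            seg = self._segment_for(key)
            if key in seg.items or len(seg.items) < self.segment_capacity:
                seg.items[key] = value
                return
            self._split(seg)  # segment is full: split it and retry

    def _split(self, seg: Segment) -> None:
        if seg.local_depth == self.global_depth:
            # Doubling the directory copies pointers only; no entries move.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bit = 1 << seg.local_depth
        low, high = Segment(seg.local_depth + 1), Segment(seg.local_depth + 1)
        for k, v in seg.items.items():
            (high if h(k) & bit else low).items[k] = v
        for i, s in enumerate(self.directory):
            if s is seg:
                self.directory[i] = high if i & bit else low
        self.max_moved_in_one_split = max(self.max_moved_in_one_split, len(seg.items))


table = SegmentedTable(segment_capacity=8)
for i in range(1000):
    table.set(f"session:{i}", i)
print("entries moved in the worst single split:", table.max_moved_in_one_split)
```

A single flat table that doubles would have to migrate every entry it holds each time it grows. Here the worst case for any one resize step is bounded by the segment capacity, which is the property that keeps memory and latency smooth.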

Running It: Docker Compose

Here’s a minimal setup to kick the tires:

services:
  dragonfly:
    image: docker.dragonflydb.io/dragonflydb/dragonfly:latest
    restart: unless-stopped
    ports:
      - "6379:6379"
    volumes:
      - dragonfly_data:/data
    ulimits:
      memlock: -1
    command: >
      --logtostderr
      --cache_mode=false
      --dbfilename=dragonfly.rdb
      --dir=/data
      --hz=100

volumes:
  dragonfly_data:

Drop that in, docker compose up -d, and your existing Redis clients connect on port 6379 like nothing changed. redis-cli ping returns PONG. Life is good.
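If you don't have redis-cli handy, the same smoke test fits in a few lines of standard-library Python. This is a minimal sketch that hand-rolls the wire format (RESP) purely to show there is nothing Dragonfly-specific on the client side. The helper names `encode_command` and `ping` are made up for this example, and the host and port assume the Compose file above.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def ping(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Send PING and report whether the server answered with +PONG."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command("PING"))
        reply = sock.recv(64)
    return reply.startswith(b"+PONG")


# With the container running, call ping() from a REPL or script:
#   ping()  # True if something RESP-speaking answers on localhost:6379
```

For real applications, keep using whatever Redis client library you already have. The point is only that the bytes on the wire are identical whether Redis or Dragonfly is listening.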

A few flags worth knowing:

--cache_mode=true: enables LRU-style eviction. Great for pure cache use cases.
--maxmemory 4gb: cap memory usage (same as Redis maxmemory)
--hz: internal timer frequency, default 100
--proactor_threads: override the worker thread count if you want to limit CPU usage

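The memlock ulimit is important. Dragonfly’s I/O layer is built on io_uring, which on some kernels counts its buffers against the locked-memory limit, and Docker’s default limit is too low for that. Don’t skip it.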

Benchmark Numbers: Let’s Be Honest About What These Mean

Dragonfly’s own benchmarks are impressive. Here’s the rough shape of what the public numbers look like on a 32-core machine using memtier_benchmark:

# Redis 7.4, single instance, pinned to one core
memtier_benchmark -s localhost -p 6379 \
  --threads=4 --clients=50 --test-time=30 \
  --data-size=64 --pipeline=16

# Result: ~210,000 ops/sec
# Latency p99: 4.2ms

# Dragonfly 1.x, same machine, all 32 cores
memtier_benchmark -s localhost -p 6379 \
  --threads=16 --clients=50 --test-time=30 \
  --data-size=64 --pipeline=16

# Result: ~3,800,000 ops/sec
# Latency p99: 1.1ms

That’s not a typo. For embarrassingly parallel workloads, lots of independent GET/SET operations on different keys, Dragonfly is in a different league.

But here’s where I have to pump the brakes: these numbers are for the benchmark scenario. Your app probably doesn’t look like that. If you’re doing a bunch of MULTI/EXEC transactions, SCAN operations over huge keyspaces, or heavy Pub/Sub, the gap narrows considerably. If you’re running a single-threaded app that fires off one Redis command at a time, you won’t see an 18x improvement; you’ll see something more modest.
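It's worth doing the arithmetic on those two results rather than just admiring them. The short calculation below uses only the figures quoted above; the variable names are mine, and treating the Redis run as a “one core” baseline is an assumption taken from the benchmark comment.

```python
# Figures quoted in the benchmark section above.
REDIS_OPS = 210_000          # ops/sec, Redis pinned to one core
DRAGONFLY_OPS = 3_800_000    # ops/sec, Dragonfly on all cores
DRAGONFLY_CORES = 32

speedup = DRAGONFLY_OPS / REDIS_OPS
per_core = DRAGONFLY_OPS / DRAGONFLY_CORES
efficiency = per_core / REDIS_OPS  # 1.0 would be perfect linear scaling

print(f"speedup: {speedup:.1f}x")
print(f"throughput per Dragonfly core: {per_core:,.0f} ops/sec")
print(f"scaling efficiency vs. one Redis core: {efficiency:.0%}")
```

That works out to roughly 18.1x overall, but only about 119,000 ops/sec per core, or around 57% of what perfectly linear scaling from the single-core Redis number would give. The two runs also use different memtier thread counts, so treat the comparison as a ballpark, not a controlled experiment.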

The value proposition is real, but it’s most visible when you’re actually bottlenecking on Redis throughput. A lot of home lab setups aren’t.

Persistence: Not Just a Cache

Redis’s persistence story has always been a bit “pick your poison”: RDB snapshots (fast recovery, possible data loss) or AOF (less data loss, bigger files, slower recovery). Mixing them is possible but annoying.

Dragonfly takes a different approach with a snapshot + WAL (Write-Ahead Log) hybrid for point-in-time recovery:

RDB-compatible snapshots for bulk state
WAL for the delta since last snapshot
Recovery means: load snapshot + replay WAL = minimal data loss without full AOF overhead
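The recovery recipe in that list is the generic snapshot-plus-log pattern, and it can be sketched in a few lines. This is an illustration of the pattern only, not Dragonfly's on-disk format or recovery code. The `recover` function and the `(op, key, value)` log record shape are assumptions invented for the example.

```python
def recover(snapshot: dict, wal: list) -> dict:
    """Rebuild state: start from the snapshot, then replay logged writes in order."""
    state = dict(snapshot)  # copy so the snapshot itself stays untouched
    for op, key, value in wal:
        if op == "SET":
            state[key] = value
        elif op == "DEL":
            state.pop(key, None)
        else:
            raise ValueError(f"unknown log record: {op!r}")
    return state


# Snapshot taken at some point in time...
snapshot = {"session:1": "alice", "session:2": "bob"}

# ...and the writes that arrived after it, in arrival order.
wal = [
    ("SET", "session:3", "carol"),
    ("DEL", "session:1", None),
    ("SET", "session:2", "bob-v2"),
]

print(recover(snapshot, wal))
```

The snapshot carries the bulk of the data and the log only has to cover the window since the last snapshot, which is why replay stays short compared with replaying a full append-only history.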

The snapshots are RDB-compatible, so you can technically load a Redis RDB file into Dragonfly. Useful if you’re migrating.

Replication is supported: a primary/replica setup works similarly to Redis, using RESP-based replication. You won’t get multi-primary cluster mode like Redis Cluster, though.

Compatibility Reality Check

Most Redis 6.x commands work. The common stuff, GET, SET, MGET, MSET, EXPIRE, TTL, SCAN, LPUSH/LPOP, ZADD/ZRANGE, HSET/HGET, Pub/Sub, is solid.

Where you need to test before committing:

Lua scripting: Basic EVAL works. Complex scripts that assume specific execution ordering or use redis.call in loops over many keys can behave differently due to the shard architecture. Test your scripts explicitly.

OBJECT ENCODING: Some introspection commands return different values than Redis. Fine for apps, annoying for monitoring tools that parse this.

Redis Cluster protocol: Dragonfly emulates cluster mode with --cluster_mode=emulated. This makes clients that require cluster mode happy without you actually running multiple nodes. It’s single-node under the hood, so it won’t help with multi-host distribution.

WAIT command: Dragonfly has this but behavior differences exist in replication scenarios.

Modules: Redis Stack modules (RediSearch, RedisJSON, etc.) don’t apply. Dragonfly has some native JSON support via JSON.* commands, but that’s not a replacement for RediSearch.

The Gotcha List

No horizontal sharding across nodes. This is the big one. Redis Cluster splits your keyspace across multiple nodes: if your dataset is 500 GB, you spread it across machines. Dragonfly doesn’t do that yet. You can scale vertically to one massive node, but if you need multi-node horizontal scale, Dragonfly isn’t there yet.

BSL license. Fully fine for self-hosting and production use at your company. Just know it’s not Apache/MIT. Valkey 8.x is BSD-licensed, so if license purity matters to your org’s legal team, there’s your alternative.

Lua compatibility quirks. Already mentioned, but worth repeating. If you have a Lua-heavy Redis setup, do the testing before you cut over production.

Newer Redis 7.x features. Redis Functions, some ACL v2 features, and LMPOP edge cases are the usual suspects. Check the Dragonfly compatibility matrix in their GitHub repo for your specific commands before assuming support.

When Should You Actually Use Dragonfly?

Use Dragonfly when:

You have 16+ cores available and Redis is your bottleneck
You’re running a cache-heavy app with high concurrency: session stores, rate limiting, leaderboards
You want to consolidate: instead of running three Redis instances, run one Dragonfly that uses all your cores
Memory efficiency matters: you’re packing 100 GB of cache data and 30% savings is real money
You’re starting fresh and Redis compatibility is all you need

Stick with Redis (or Valkey) when:

You’re on a small dev box or homelab with 4 cores and Redis is nowhere near the ceiling
You have complex Lua scripts that you can’t easily test
You need Redis Cluster topology across multiple physical nodes today
Your team is deeply familiar with Redis internals and the debugging tools around it
You’re using Redis Stack (RediSearch, RedisTimeSeries, RedisBloom): those don’t translate

Valkey 8.x is the sweet spot for “I want Redis but I’m mildly annoyed at the license drama and I want slightly better multi-threaded I/O.” It’s the community-maintained, BSD-licensed Redis fork that AWS, Google, and others are backing. It’s not as architecturally radical as Dragonfly, but it’s a safer drop-in for existing setups with less to test.

The Bottom Line

Dragonfly is legitimately impressive engineering. The shared-nothing, shard-per-core architecture is the right answer to “how do you make an in-memory store scale on modern hardware,” and the Redis protocol compatibility means you can try it without rewriting your app.

The catch is that the killer advantage, that 18x throughput number, only matters if you’re actually hitting Redis limits. For the average self-hosted app, you’re not. Redis or Valkey on a 4-core VPS will handle Nextcloud, Gitea, your monitoring stack, and your home automation without breaking a sweat.

But if you’ve got a serious workload, a busy Mastodon instance, a high-traffic rate limiter, a session cache serving thousands of concurrent users, and you’re throwing hardware at the problem because Redis can’t keep up? Dragonfly is the move. Drop in the Compose file, point your client at port 6379, run your test suite, and watch your latency graphs improve.

Just test your Lua scripts first. Your 2 AM self will thank you.
