When Kafka, Redis, and Elasticsearch are justified

Clear criteria for when Kafka, Redis, Elasticsearch, and outbox patterns are structurally justified in real-world architectures.

Author: Nikolay Tenev | Published: 22 February 2026 | Last updated: 22 February 2026 | 9 min read | Architecture

Most systems begin life with a transactional database and a web service. That baseline already gives you more than people sometimes admit: durable state, constraints, transactions, indexing, and a fairly sharp model of what "correct" means.

Then the system grows, and new components appear: a broker, a cache, a search cluster, background workers. Sometimes this is structural progress; sometimes it is just surface area.

This article does not argue that "Postgres is enough" or that "modern stacks are overkill". It tries to pin down something more useful: what pressures make Kafka, Redis, Elasticsearch, or an outbox pattern genuinely justified, and what pressures are often mistaken for justification.

When Kafka (or a broker) is actually required

Kafka is not justified by the fact that you have events. Every system has events. Kafka is justified when the shape of the work stops fitting a single transactional core, and you need explicit machinery for fan-out, replay, and isolation between producers and consumers.

Structural signals

Independent consumers with different needs. Multiple downstream processes consume the same stream, but they do not share SLAs, throughput, or failure tolerance.
Backpressure must not affect writes. Downstream slowness or outages cannot be allowed to slow core writes, even indirectly.
Replay is a first-class requirement. Reprocessing historical events is part of normal operations (not a rare emergency).
High-volume fan-out. You are distributing state changes to many independent consumers at meaningful scale.

Concrete scenario: billing fan-out and failure isolation

Imagine a billing platform. An invoice is created. That single action needs to trigger several workflows:

Update internal accounting state.
Send customer notifications.
Feed a real-time analytics pipeline.
Export to a compliance archive.
Notify external partners via webhooks.

Early on, you implement this with a transactional outbox table and a small worker fleet. It works fine: insert invoice, insert outbox rows in the same transaction, poll the outbox, deliver messages, mark rows as processed.

Then the system grows:

Analytics ingestion lags during peak hours.
Webhooks experience spikes of retries when partners are down.
Compliance exports have their own schedule and retention needs.
Email delivery is bursty and sensitive to provider throttling.

At this point, the database is doing more than storing invoices. It is coordinating: fan-out, retry semantics, backpressure, and per-consumer progress. You can keep pushing the outbox, but you'll find yourself recreating broker mechanics: partitioning, consumer offsets, replay tools, dead-letter handling, and operational visibility built around "where did message X go?"

Kafka becomes justified here not because it is fashionable, but because the consumers have become independent systems. You need a substrate that makes that independence explicit and operable.
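To make "independence" concrete, here is a toy in-memory model of the core idea a log-based broker provides: an append-only log plus one read position per consumer group. It is only a sketch of the concept, not Kafka's client API; the `EventLog` class and its method names are invented for illustration.

```python
class EventLog:
    """In-memory stand-in for one partition of a durable, append-only log."""

    def __init__(self):
        self._events = []
        self._next_offset = {}  # consumer group -> next offset to read

    def append(self, event):
        # Producers only append; they never wait for any consumer.
        self._events.append(event)
        return len(self._events) - 1

    def poll(self, group, max_events=100):
        # Return (offset, event) pairs starting at this group's own position.
        start = self._next_offset.get(group, 0)
        return list(enumerate(self._events[start:start + max_events], start=start))

    def commit(self, group, next_offset):
        # Each group records its own progress, independent of the others.
        self._next_offset[group] = next_offset

    def seek(self, group, offset):
        # Replay: rewind one group's position without touching any other group.
        self._next_offset[group] = offset

    def lag(self, group):
        # How many events this group has not yet committed.
        return len(self._events) - self._next_offset.get(group, 0)


log = EventLog()
for i in range(5):
    log.append({"invoice_id": i})

# The email consumer keeps up; analytics falls behind. Neither affects the other.
batch = log.poll("email")
log.commit("email", batch[-1][0] + 1)
slow = log.poll("analytics", max_events=2)
log.commit("analytics", slow[-1][0] + 1)
```

The point of the model is that a slow group only accumulates lag on its own cursor, and a replay is just a `seek` on that cursor. With an outbox table, both behaviors have to be built by hand.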

When Redis is actually needed

Redis is often introduced as a generic performance tool: "the database is slow, let's cache." That is sometimes correct, but "cache" is a broad word. Redis tends to be most justified when it is being used for ephemeral coordination and high-churn state, not as a second home for durable domain data.

Structural signals

Predictable low latency under extreme concurrency. You need fast operations with consistent latency at high request rates.
Ephemeral, high-churn state. Counters, windows, tokens, short-lived sessions, and other state that is naturally time-bound.
Atomic operations that are awkward in SQL. You want simple atomic increments, compare-and-set patterns, or data structures optimized for in-memory operations.
You are not creating a second source of truth. The state is either derived, ephemeral, or explicitly allowed to be eventually consistent.

Concrete scenario: rate limiting and coordination state

Consider an API gateway enforcing rate limits at 50,000 requests per second, with a sliding window: limits per user and per IP, plus a few special cases (burst allowances, premium tiers). The gateway has a tight latency budget: it can't add tens of milliseconds to every request.

If you implement this in Postgres, you will likely run into some combination of:

High write amplification (every request updates counters).
Row-level contention (hot keys under popular accounts or NAT'd IP ranges).
Locks and retries (either explicit or implicit under serialization).
Operational complexity (you end up tuning the database around rate limiting rather than your core workload).

Redis fits naturally here because the state is short-lived, heavily mutated, and needs atomic operations with expiration. You are not pretending this is canonical business data; you are running a coordination algorithm.
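The coordination algorithm itself is small. The sketch below implements a sliding-window log limiter in process, with a plain dictionary of deques standing in for Redis; the class name, parameters, and injectable clock are assumptions made for illustration. In a Redis-backed version, each per-key deque would commonly become a sorted set of timestamps with an expiry, trimmed and counted atomically.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within any `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Evict timestamps that have slid out of the window (the "expiration" part).
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Hypothetical limits: 100 requests per user and 1000 per IP, each per 60 seconds.
per_user = SlidingWindowLimiter(limit=100, window_seconds=60)
per_ip = SlidingWindowLimiter(limit=1000, window_seconds=60)


def admit(user_id, ip):
    # Note: a request rejected by the IP limit still consumes user quota in this sketch.
    return per_user.allow(user_id) and per_ip.allow(ip)
```

Every call mutates state, and every entry expires on its own. That access pattern is exactly what produces hot rows in a relational table and what an in-memory store handles cheaply.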

Now contrast that with caching "product listings" because reads are expensive. If you cannot clearly state what invalidates the cache, who owns the cached value, and what happens under drift, you are introducing an alternative state store with unclear semantics. The short-term latency gain is then likely to be paid back as long-term consistency debt.

When Elasticsearch beats Postgres full-text search

Postgres full-text search is better than many teams expect. If your search needs are modest, it can be entirely sufficient: basic ranking, language dictionaries, reasonable indexes, and straightforward query shapes.

Elasticsearch becomes justified when search is not "a feature" but a subsystem that deserves its own scaling, tuning, and operational model.

Structural signals

Complex filtering and faceting. Many filters, facets, and aggregations in the critical user path.
Relevance tuning and iteration. You expect to change analyzers, scoring, synonyms, boosting, and ranking rules as part of product evolution.
Independent scaling pressure. Search traffic is large enough that it should scale independently of transactional writes.
Search is a primary access path. Users navigate the system through search, not through fixed pages and structured browsing alone.

Concrete scenario: marketplace search as a primary product surface

Consider a marketplace with millions of products and a search experience that is effectively the UI:

Autocomplete and suggestions, across multiple languages.
Filters on category, brand, price range, rating, availability, shipping options.
Sorting by relevance, popularity, and recency.
Faceted navigation with live counts.

You can implement some of this in Postgres, especially early on. But as the product evolves you will likely face:

Hard limits on relevance tuning (you can rank, but not with the richness you want).
Expensive query plans as filters and joins multiply.
Scaling friction (search load competes with transactional workload).
Operational pressure to split read models from write models anyway.

In that shape of system, Elasticsearch is justified because it is designed for search as a core workload: inverted indices, query-time scoring, language-aware analyzers, and efficient aggregation patterns. The justification is not "better search," but "search has become its own system."
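For readers who have not looked inside a search engine, the following toy shows two of the ideas named above: an inverted index (term to document ids) and facet counts over a result set. It is a deliberately naive sketch with no scoring, analyzers, or persistence, and the `TinyIndex` class is invented for illustration; it is not how Elasticsearch is implemented.

```python
from collections import Counter, defaultdict


class TinyIndex:
    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of documents containing it
        self._docs = {}                    # doc id -> structured attributes

    def add(self, doc_id, title, **attrs):
        self._docs[doc_id] = attrs
        # "Analysis" here is just lowercasing and whitespace splitting.
        for term in title.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query, **filters):
        terms = query.lower().split()
        if not terms:
            return []
        # AND semantics: a document must contain every query term.
        ids = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(
            d for d in ids
            if all(self._docs[d].get(k) == v for k, v in filters.items())
        )

    def facet(self, doc_ids, field):
        # Live counts per attribute value across the current result set.
        return Counter(self._docs[d][field] for d in doc_ids if field in self._docs[d])


idx = TinyIndex()
idx.add(1, "Red running shoes", brand="acme", category="shoes")
idx.add(2, "Blue running shoes", brand="zen", category="shoes")
idx.add(3, "Red running jacket", brand="acme", category="jackets")

hits = idx.search("red running")
category_counts = idx.facet(hits, "category")
```

Term lookup and facet counting work directly on the index structure, with no joins. Everything this toy leaves out (relevance scoring, language analysis, sharding) is where the real engineering effort of a search subsystem goes.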

When the outbox pattern becomes dangerous

The transactional outbox is one of the few patterns that genuinely earns its popularity. It gives you an attractive property: you can commit business state and enqueue downstream work in the same transaction, without inventing your own two-phase protocol.

But it has limits. The outbox pattern is a bridge. It can become a trap when it silently turns your database into a broker.

Structural signals

Event volume pushes polling mechanics. You need aggressive polling, careful locking, and partitioning just to keep up.
Many independent consumers. Per-consumer progress tracking, retries, and dead-letter handling multiply.
Replay becomes routine. You need replay and reprocessing as normal operations, not as rare exceptions.
Delivery semantics become demanding. Ordering, retention windows, and consumer offset tooling become part of your life.

Concrete scenario: outbox as a broker in disguise

Suppose you run a financial trading platform. Trades are written at high volume. Each trade triggers: portfolio updates, risk checks, real-time dashboards, audit persistence, and downstream analytics ingestion.

The outbox starts as a clean solution. Then pressure accumulates:

Workers poll frequently to reduce latency, increasing database load.
Hot partitions appear if polling and locking are naive.
Retries and dead-letter handling become a substantial subsystem.
"Reprocess last week's events" becomes a common operational request.

At that point you are not "using Postgres for messaging." You are implementing: partitioning strategies, consumer progress tracking, backpressure, replay tooling, and operational dashboards. Those are the responsibilities of a broker. If the system needs those responsibilities, it is usually better to give them a component built for that role.
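A small sketch shows how quickly this happens. Once several consumers read the same outbox independently, a single `processed` flag no longer works and you need a cursor per consumer. The schema and function names below are hypothetical, and SQLite again stands in for the transactional database.

```python
import sqlite3

SCHEMA = """
CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT NOT NULL
);
CREATE TABLE consumer_cursor (
    consumer TEXT PRIMARY KEY,
    last_id INTEGER NOT NULL DEFAULT 0
);
"""


def fetch_batch(conn, consumer, limit=100):
    # Register the consumer on first use, then read everything past its cursor.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO consumer_cursor (consumer) VALUES (?)", (consumer,)
        )
    return conn.execute(
        """
        SELECT o.id, o.payload
        FROM outbox o
        JOIN consumer_cursor c ON c.consumer = ?
        WHERE o.id > c.last_id
        ORDER BY o.id
        LIMIT ?
        """,
        (consumer, limit),
    ).fetchall()


def ack(conn, consumer, last_id):
    # Per-consumer progress tracking: in effect, a hand-rolled consumer offset.
    with conn:
        conn.execute(
            "UPDATE consumer_cursor SET last_id = ? WHERE consumer = ?",
            (last_id, consumer),
        )


def rewind(conn, consumer, to_id=0):
    # "Reprocess last week's events" becomes a cursor reset: hand-rolled replay.
    ack(conn, consumer, to_id)
```

This sketch still has no retention policy (a row can be deleted only once every cursor has passed it), no dead-letter handling, and no lag dashboard. Each of those is another piece of broker machinery that ends up living in the database.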

The outbox remains excellent when event volume is moderate, consumers are few, and transactional coupling is valuable. It becomes risky when distribution is no longer an implementation detail but a first-class system behavior.

The underlying principle: boundary justification

Each of these tools introduces a boundary:

Kafka introduces asynchronous coordination and durable event streams.
Redis introduces an alternative state store (often ephemeral, but sometimes gradually treated as durable without being designed as one).
Elasticsearch introduces an independent read model optimized for search and aggregations.
Outbox introduces event distribution inside your transactional core.

A boundary can be a feature. It can also be a cost. The question is whether the boundary is justified by real constraints:

Independent scaling. Does this workload need to scale separately from the rest?
Failure isolation. Are you isolating failures, or just moving them to new places?
Latency guarantees. Do you need predictable latency at high concurrency, or do you merely want it?
Ownership boundaries. Are multiple teams or subsystems truly independent, or just organized that way in code?
Operational demands. Do you need replay, retention, and clear consumer progress as daily tools?

If those forces are weak, adding components tends to create coordination work without providing structural benefit. If those forces are strong, distribution is not a style choice; it is the shape the system is being pushed into.

Summary

A transactional database can carry far more than most systems demand. That reality is easy to forget because distributed components feel like "professional architecture". Sometimes they are; sometimes they are just surface area.

Kafka, Redis, and Elasticsearch are excellent tools when their constraints are present. The skill is not knowing how to run them. The skill is recognizing when the pressure to introduce them is real, and when you are paying coordination costs for a future you may never reach.

Kafka is justified by independent consumers, failure isolation, replay needs, and high-volume fan-out.
Redis is justified by predictable low latency under extreme concurrency and ephemeral coordination state.
Elasticsearch is justified when search is a subsystem: complex faceting, relevance tuning, and independent scaling.
Outbox is powerful as a bridge, but becomes risky when it turns your database into a broker in disguise.
The decision is less about tools and more about whether the boundary is structurally justified.