## Picking a non-relational store, untangled
"Should we use NoSQL?" was the question 10 years ago. The honest answer in 2026 is: probably not as your primary, often as your secondary. Modern Postgres handles a lot of what was once "NoSQL territory" (JSON, full-text search, geospatial), and the cases where dedicated non-relational stores earn their place are narrower than the marketing suggests.
### MongoDB
What it's for: document-shaped data with deep nesting, schema flexibility during early product development, applications dominated by document-by-id reads.
Where it shines: developer ergonomics for document-modelled APIs, mature aggregation framework, Atlas managed offering is solid.
Where it doesn't: highly relational data with many joins. Multi-document transactions exist but aren't where Mongo is fastest. The "I need this query, can you build the index" problem at scale.
Use when: the application's data really is document-shaped, not "we'll hammer it into documents".
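To make "document-shaped" concrete, here's a minimal sketch of the insert / read-by-id / aggregate flow, assuming pymongo and a hypothetical `orders` collection; the database name, fields, and values are all illustrative.

```python
# Sketch: document-by-id reads against a hypothetical "orders" collection.
# Assumes pymongo is installed and MongoDB is reachable on localhost.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]

# The whole aggregate lives in one document, so no joins are needed.
orders.insert_one({
    "_id": "order-1001",
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [
        {"sku": "A-1", "qty": 2, "price": 9.99},
        {"sku": "B-7", "qty": 1, "price": 24.50},
    ],
    "status": "paid",
})

# The read path Mongo is built for: fetch the document by its key.
order = orders.find_one({"_id": "order-1001"})
print(order["items"])

# A simple aggregation: revenue per order status.
pipeline = [
    {"$unwind": "$items"},
    {"$group": {
        "_id": "$status",
        "revenue": {"$sum": {"$multiply": ["$items.qty", "$items.price"]}},
    }},
]
for row in orders.aggregate(pipeline):
    print(row)
```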
### DynamoDB
What it's for: AWS-native key-value with predictable single-digit-millisecond latency at any scale, provided you know exactly which queries you'll need.
Where it shines: very high throughput, AWS-integrated workloads, predictable cost at known load patterns, serverless integration (Lambda triggers).
Where it doesn't: "I'm not sure what queries I'll need" workloads. DynamoDB rewards careful access pattern design and punishes ad-hoc query needs.
Use when: AWS-resident, scale matters, access patterns are knowable in advance.
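A sketch of what "access patterns knowable in advance" looks like in practice, assuming boto3 and a hypothetical single-table design with generic `pk` / `sk` keys; the key layout exists to serve exactly one query shape.

```python
# Sketch: a pre-designed DynamoDB access pattern, assuming boto3 and a
# hypothetical table "orders" with partition key "pk" and sort key "sk".
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("orders")

# Write: the key layout is decided up front to serve the query below.
table.put_item(Item={
    "pk": "CUSTOMER#ada",
    "sk": "ORDER#2026-01-15#1001",
    "status": "paid",
    "total": 44,  # ints are fine; use Decimal for non-integer numbers
})

# The one query this design serves: all orders for a customer, ordered by
# the sort key. Any other query shape needs a GSI or a new key design.
resp = table.query(
    KeyConditionExpression=Key("pk").eq("CUSTOMER#ada")
    & Key("sk").begins_with("ORDER#"),
)
for item in resp["Items"]:
    print(item["sk"], item["status"])
```

That inflexibility is the trade: single-digit-millisecond reads at any scale, bought by committing to the query list before the table exists.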
### Cassandra / ScyllaDB
What it's for: write-heavy, scale-out, eventually-consistent workloads.
Where it shines: time-series at massive scale, IoT / telemetry ingest, multi-region active-active.
Where it doesn't: anything where consistency matters more than availability. The data modelling is a discipline; getting it wrong creates expensive performance problems.
Use when: write throughput is the binding constraint, the data is naturally partitionable by a clear key.
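A sketch of that key-design discipline, assuming the DataStax `cassandra-driver` and a hypothetical `telemetry.readings` table in an existing keyspace. The composite partition key is what spreads write load across the cluster, and reads must follow it.

```python
# Sketch: partition-by-device time-series in Cassandra/ScyllaDB.
# Assumes cassandra-driver, a reachable cluster, and an existing
# "telemetry" keyspace; all names here are hypothetical.
from datetime import date, datetime
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect()

# (device_id, day) is the partition key: it bounds partition size and
# spreads writes. ts is the clustering column, ordering rows for scans.
session.execute("""
    CREATE TABLE IF NOT EXISTS telemetry.readings (
        device_id text,
        day       date,
        ts        timestamp,
        value     double,
        PRIMARY KEY ((device_id, day), ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
""")

# Reads must follow the key design: one device, one day, a time range.
rows = session.execute(
    "SELECT ts, value FROM telemetry.readings "
    "WHERE device_id = %s AND day = %s AND ts > %s",
    ("sensor-42", date(2026, 1, 15), datetime(2026, 1, 15)),
)
for row in rows:
    print(row.ts, row.value)
```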
### Elasticsearch / OpenSearch
What it's for: full-text search, log analytics, anything where you need to search "documents" with fuzzy / relevance ranking.
Where it shines: search use cases, log aggregation (the ELK stack), multi-faceted filtering on large datasets.
Where it doesn't: primary database. Elasticsearch is not a system of record. Index everything, but the source of truth lives elsewhere.
Use when: full-text or analytics-style search is a meaningful part of the application.
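A minimal sketch of the relevance-ranked, typo-tolerant search that justifies a separate store, assuming the official `elasticsearch` Python client (8.x) and a hypothetical `articles` index.

```python
# Sketch: fuzzy, relevance-ranked full-text search. Assumes the official
# elasticsearch client (8.x) and a local node; index name is hypothetical.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.index(index="articles", id="1", document={
    "title": "Choosing a datastore",
    "body": "Postgres handles more NoSQL workloads than people assume.",
})
es.indices.refresh(index="articles")  # make the doc searchable immediately

# "match" is an analysed query: tokenised, fuzzy-capable, scored by
# relevance. Note the misspelling still finds the document.
resp = es.search(index="articles", query={
    "match": {"body": {"query": "postgress nosql", "fuzziness": "AUTO"}},
})
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```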
### Redis
What it's for: cache, queue, leaderboard, rate limiter, ephemeral state.
Where it shines: microsecond latency, simple data structures (lists, sets, hashes, sorted sets), pub-sub, streams.
Where it doesn't: durability-critical data. Redis is best as the working-memory layer in front of a durable store.
Use when: you need a cache, queue, or fast-access ephemeral state — which is most production applications.
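One item from that list made concrete: a fixed-window rate limiter, assuming `redis-py` and a local Redis. Key names and limits are illustrative, and the state is deliberately disposable; losing a counter on failover just resets someone's window.

```python
# Sketch: a fixed-window rate limiter. Assumes redis-py and a local Redis;
# the key scheme, limit, and window are illustrative.
import redis

r = redis.Redis(host="localhost", port=6379)

def allow_request(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    """Allow up to `limit` requests per `window_s`-second window per user."""
    key = f"ratelimit:{user_id}"
    count = r.incr(key)          # atomic increment; creates the key at 1
    if count == 1:
        r.expire(key, window_s)  # start the window on the first hit
    return count <= limit

if allow_request("ada"):
    print("handle the request")
else:
    print("429 Too Many Requests")
```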
### The PostgreSQL question
Modern Postgres can replace many "NoSQL" choices:
- JSONB columns for document-shaped data (see the sketch below)
- Full-text search (tsvector / pg_trgm) for moderate-scale text search
- TimescaleDB extension for time-series
- PostGIS for geospatial
- pgvector for vector embeddings (LLM use cases)
For most projects in 2026, "Postgres + Redis" handles 80% of what was historically multi-database NoSQL territory. We default to that and add specialised stores when the data justifies it.
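To illustrate the first two bullets, a hedged sketch assuming `psycopg2` and a hypothetical `events` table: JSONB containment queries plus full-text search, all inside the primary database.

```python
# Sketch: JSONB documents plus full-text search in plain Postgres.
# Assumes psycopg2; the DSN, table, and fields are illustrative.
import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect("dbname=app")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS events (
        id      serial PRIMARY KEY,
        payload jsonb NOT NULL
    )
""")
# A GIN index makes containment queries on the document fast.
cur.execute(
    "CREATE INDEX IF NOT EXISTS events_payload_gin "
    "ON events USING gin (payload)"
)

cur.execute(
    "INSERT INTO events (payload) VALUES (%s)",
    (Json({"type": "signup", "note": "trial user from search ad"}),),
)

# Document-style query: jsonb containment instead of a document store.
cur.execute(
    "SELECT id FROM events WHERE payload @> %s",
    (Json({"type": "signup"}),),
)

# Full-text: tsvector/tsquery over a JSONB field, no separate search cluster.
cur.execute(
    "SELECT id FROM events "
    "WHERE to_tsvector('english', payload->>'note') "
    "      @@ plainto_tsquery('english', %s)",
    ("trial",),
)
print(cur.fetchall())
conn.commit()
```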
### The decision framework
- Default: Postgres + Redis
- Heavy search: add Elasticsearch (or OpenSearch if self-hosting)
- Document-shaped, schema flexibility critical: MongoDB
- AWS-native scale + predictable access patterns: DynamoDB
- Massive write throughput, time-series: Cassandra / ScyllaDB or TimescaleDB on Postgres
- Vector embeddings: pgvector or a dedicated vector DB (Qdrant, Weaviate, Pinecone); see the sketch below
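For the last item, a minimal pgvector sketch, again assuming `psycopg2` and that the extension is installed; the table and the toy 3-dimensional vectors are illustrative (real embeddings run to hundreds of dimensions).

```python
# Sketch: nearest-neighbour search with pgvector. Assumes the extension
# is available and psycopg2 is installed; names and vectors are toy.
import psycopg2

conn = psycopg2.connect("dbname=app")
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
cur.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id        serial PRIMARY KEY,
        body      text,
        embedding vector(3)
    )
""")
cur.execute(
    "INSERT INTO docs (body, embedding) VALUES (%s, %s)",
    ("hello world", "[0.1, 0.2, 0.3]"),
)

# <=> is pgvector's cosine-distance operator; ORDER BY ... LIMIT k gives
# the k nearest neighbours. An hnsw or ivfflat index makes this fast
# (and approximate) at scale.
cur.execute(
    "SELECT id, body FROM docs ORDER BY embedding <=> %s LIMIT 5",
    ("[0.1, 0.2, 0.25]",),
)
print(cur.fetchall())
conn.commit()
```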
### The data layer that survives
- One source of truth per entity. Don't dual-write to two stores at the application layer (see the outbox sketch after this list).
- CDC (change data capture) to move data between stores when needed (e.g. Debezium).
- Async replication for read-heavy or denormalised secondary stores.
- Backup and DR for every store, not just the primary.
- Schema evolution discipline — even "schemaless" stores have an effective schema; document and version it.
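The outbox sketch promised above, assuming `psycopg2` and hypothetical `users` / `outbox` tables: the entity and its event commit atomically in the primary store, and CDC (e.g. Debezium tailing `outbox`) does the fan-out to secondary stores instead of an application-layer dual write.

```python
# Sketch: transactional outbox. The entity row and its event row commit
# in one transaction; a CDC pipeline tails the outbox table and updates
# secondary stores. Assumes psycopg2; table names are hypothetical.
import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect("dbname=app")

def create_user(email: str) -> int:
    with conn:                         # one transaction for both inserts
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO users (email) VALUES (%s) RETURNING id",
                (email,),
            )
            user_id = cur.fetchone()[0]
            # No direct call to Elasticsearch/Redis here: the event
            # commits with the entity, or neither does, and CDC handles
            # the fan-out asynchronously.
            cur.execute(
                "INSERT INTO outbox (topic, payload) VALUES (%s, %s)",
                ("user.created", Json({"id": user_id, "email": email})),
            )
    return user_id
```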
### One pattern we'd warn about
Polyglot persistence as a default. Each new datastore is a new operational footprint, a new backup story, a new failure mode. Add a store when the use case justifies it; don't accumulate them because each looks neat in isolation.
### One pattern that always pays off
Modelling access patterns before picking a NoSQL store. The store's job is to serve specific queries fast. If you can't list those queries, you're not ready to pick.
What's your data layer? And — for the Postgres-everywhere folks — what's the use case where you've genuinely needed to step outside Postgres?