The Complete Guide to PostgreSQL JSONB — Flexible Schemas and Fast Queries
A practical guide to JSON vs JSONB, GIN indexes, operators, and more
When requirements include “store data before the schema is finalized” or “handle structures whose fields change dynamically,” PostgreSQL’s JSONB type is often a strong option.
This article provides a systematic guide to JSONB, from the fundamentals to practical queries, index design, and performance characteristics.
JSON vs JSONB — What Is the Difference?
PostgreSQL provides two types for storing JSON.
json |
jsonb |
|
|---|---|---|
| Internal representation | Stored as the original text | Converted to and stored in a binary format |
| Write cost | Low (no conversion) | Slightly higher (binary conversion required) |
| Read cost | High (parsed every time) | Low (read directly from binary representation) |
| Indexes | Limited support | Supports GIN indexes |
| Duplicate keys | Preserved | Only the last value is retained |
| Key order | Preserved | Not guaranteed |
| Operators | Basic operators only | Rich set (@>, ?, ?|, ?&, etc.) |
In practice, JSONB is the clear choice when reads outnumber writes and the data must be searchable. The json type is mainly useful in unusual cases where key order or duplicate keys must be preserved exactly.
Basic Table Design
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
source TEXT NOT NULL,
payload JSONB NOT NULL
);
This design allows any structure to be stored in the payload column. A common pattern is to keep stable fields such as occurred_at and source as relational columns and use JSONB only for the variable portion.
Basic Reads and Writes
Writing Data
INSERT INTO events (source, payload)
VALUES (
'api',
'{"user_id": 42, "action": "login", "metadata": {"ip": "192.168.1.1", "ua": "Mozilla/5.0"}}'
);
Retrieving Fields
-- Retrieve values as text
SELECT payload->>'user_id' AS user_id,
payload->>'action' AS action
FROM events;
-- Retrieve a nested object as JSON
SELECT payload->'metadata' AS meta
FROM events;
-- Traverse a nested path directly
SELECT payload #>> '{metadata, ip}' AS ip_address
FROM events;
How to choose an operator:
| Operator | Return type | Use |
|---|---|---|
-> |
jsonb |
Retrieve nested JSON |
->> |
text |
Retrieve a leaf value as text |
#> |
jsonb |
Traverse a path array |
#>> |
text |
Traverse a path array and return text |
Search Queries
Exact and Containment Searches
-- Check whether payload contains the specified object (@> operator)
SELECT * FROM events
WHERE payload @> '{"action": "login"}';
-- Check whether a key exists
SELECT * FROM events
WHERE payload ? 'user_id';
-- Check whether any of the keys exist
SELECT * FROM events
WHERE payload ?| ARRAY['user_id', 'session_id'];
-- Check whether all keys exist
SELECT * FROM events
WHERE payload ?& ARRAY['action', 'metadata'];
Comparisons with Type Casts
JSONB fields must be cast explicitly when comparing them as numbers or dates.
-- Numeric comparison
SELECT * FROM events
WHERE (payload->>'user_id')::INTEGER > 100;
-- Date comparison
SELECT * FROM events
WHERE (payload->>'created_at')::TIMESTAMPTZ > NOW() - INTERVAL '1 hour';
Searching Array Fields
-- Rows whose tags array contains "error"
SELECT * FROM events
WHERE payload->'tags' @> '["error"]';
GIN Indexes — The Key to Fast JSONB Searches
JSONB searches can be accelerated with a GIN (Generalized Inverted Index) index.
Default GIN Index
CREATE INDEX idx_events_payload ON events USING GIN (payload);
This supports the @>, ?, ?|, and ?& operators. It is sufficient for most use cases.
jsonb_path_ops
CREATE INDEX idx_events_payload_path ON events
USING GIN (payload jsonb_path_ops);
This operator class is specialized for @>. It often produces a smaller index and better performance, but does not support the ? family of operators.
B-Tree Indexes on Specific Fields
-- When user_id is queried frequently
CREATE INDEX idx_events_user_id
ON events ((payload->>'user_id'));
-- Include the cast when querying it as a number
CREATE INDEX idx_events_user_id_int
ON events (((payload->>'user_id')::INTEGER));
For frequent point queries on a specific field, a B-tree index may be more efficient than GIN.
jsonb_path_query — JSONPath Searches
PostgreSQL 12 and later support more flexible searches through JSONPath syntax.
-- Events where action is "error" and severity is at least 3
SELECT *
FROM events
WHERE payload @? '$.action ? (@ == "error") && $.severity ? (@ >= 3)';
-- Extract values with jsonb_path_query
SELECT jsonb_path_query(payload, '$.metadata.ip')
FROM events
WHERE source = 'api';
Aggregation
Grouping by a Field
SELECT payload->>'action' AS action,
COUNT(*) AS cnt
FROM events
GROUP BY payload->>'action'
ORDER BY cnt DESC;
jsonb_agg / jsonb_object_agg
-- Aggregate payloads into an array for each source
SELECT source,
jsonb_agg(payload ORDER BY occurred_at) AS payloads
FROM events
GROUP BY source;
Handling Schema Evolution
One advantage of JSONB is the ability to change the schema later.
Adding and Updating Fields
-- Add a new field to existing rows
UPDATE events
SET payload = payload || '{"version": 2}'
WHERE source = 'api';
-- Update a nested field
UPDATE events
SET payload = jsonb_set(payload, '{metadata, processed}', 'true')
WHERE id = 42;
Removing Fields
-- Remove a top-level key
UPDATE events
SET payload = payload - 'deprecated_field';
-- Remove a nested key
UPDATE events
SET payload = payload #- '{metadata, tmp}';
Performance Design Guidelines
JSONB vs Relational Columns
JSONB should be used as a tradeoff for flexibility. As a rule, fields that are fixed and queried frequently should be promoted to relational columns.
| Situation | Recommendation |
|---|---|
| Frequently searched or joined | Relational column |
| Fluid schema or highly varied fields | JSONB |
| Writes greatly outnumber reads | JSONB (but use GIN carefully) |
| Full-text or similarity search is required | Separate column + tsvector / pgvector |
The Cost of GIN Indexes
GIN indexes have a high write cost. For workloads with heavy INSERT traffic, consider fastupdate = off or limit the index to the fields that need it.
-- Disable fastupdate to reduce deferred write overhead
CREATE INDEX idx_events_payload ON events
USING GIN (payload) WITH (fastupdate = off);
Verify with EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM events
WHERE payload @> '{"action": "error"}';
If the plan contains Bitmap Index Scan on idx_events_payload, the GIN index is being used. If it shows Seq Scan, update statistics with ANALYZE or reconsider the query condition.
Combining JSONB with Generated Columns (PostgreSQL 12+)
Fields that require repeated casts are convenient candidates for a generated column.
ALTER TABLE events
ADD COLUMN user_id INTEGER
GENERATED ALWAYS AS ((payload->>'user_id')::INTEGER) STORED;
CREATE INDEX idx_events_uid ON events (user_id);
This preserves JSONB’s flexibility while providing fast access to selected fields.
Summary
| Topic | Key Point |
|---|---|
| json vs jsonb | Choose JSONB when search or indexing is required |
| Operators | @> (containment) and ? (key existence) are the essentials |
| Indexes | Start with GIN; consider B-tree for frequent point queries |
| Schema evolution | Append with ||, update with jsonb_set, remove with - |
| Fixed fields | Promote frequently queried fields to relational columns |
| Generated columns | Combine JSONB flexibility with B-tree performance |
Using JSONB as an unrestricted container for everything makes later query optimization difficult. A clear division of responsibility — relational columns for fixed data and JSONB for variable data — leads to a design that remains maintainable over time.