PostgreSQL First
PostgreSQL First
Choosing a data store is often a burden. NoSQL, distributed SQL, graph DBs, time-series DBs all line up. Yet for many projects, PostgreSQL alone is enough for a long time at the start.
1. About PostgreSQL
PostgreSQL traces back to the 1986 POSTGRES research project at UC Berkeley (led by Michael Stonebraker). Once a SQL interface was added, the project was handed over to the open source community in 1996 under the name PostgreSQL. The license is BSD-like (the PostgreSQL License), so commercial use, modification, and redistribution face few constraints.
| Event | Year |
|---|---|
| POSTGRES (Berkeley) | 1986 |
| Postgres95 (SQL added) | 1995 |
| PostgreSQL 1.0 | 1996 |
| MVCC settled | 6.x ~ 7.x |
| Streaming Replication | 9.0 (2010) |
| JSONB type | 9.4 (2014) |
| Logical replication | 10 (2017) |
| Declarative partitioning | 10 (2017) |
PostgreSQL's identity is "an extensible relational DBMS." It started from a structure where types, operators, indexes, and languages can all be added by user code, and that lineage continues today as the extension ecosystem.
2. ACID and standard SQL
PostgreSQL guarantees ACID (atomicity, consistency, isolation, durability) by default for transactions. Within a single instance, the default isolation level is Read Committed, with Repeatable Read and Serializable available.
Standard SQL compliance is also strong. PostgreSQL stabilized standard features such as CTE (WITH), window functions, LATERAL joins, and recursive queries earlier than many other RDBMSes. The JSON standard (SQL/JSON path expressions) has been adopted incrementally.
3. Extension ecosystem
A single line CREATE EXTENSION ... slots in new functionality.
| Extension | Role |
|---|---|
pg_trgm |
Substring similarity search |
unaccent |
Strip Latin accent marks |
postgis |
Geospatial (spatial indexes, SRS, functions) |
pgvector |
Vector embedding storage and similarity search |
timescaledb |
Time-series hypertables and compression |
citus |
Sharding and distribution |
hstore |
Key-value type |
pg_partman |
Automated partition management |
Extensions are tied to PostgreSQL's major version, so it is safer to fold them into upgrade plans together.
4. Comparison with MySQL and MariaDB
MySQL appeared in 1995 and spread widely as part of the LAMP stack. After Oracle acquired Sun in 2009, MariaDB was forked. Strengths include familiarity, abundant tuning material, and throughput on certain workloads.
Frequently mentioned differences with PostgreSQL.
- Transactions and isolation — Similar on InnoDB, but defaults and lock model details differ.
- DDL — Some MySQL DDL triggers an implicit commit. Wrapping DDL inside a transaction is comparatively natural in PostgreSQL.
- Type richness — PostgreSQL is richer in arrays, JSONB, ranges, UUID, enums, domains, and so on.
- Extensibility — PostgreSQL's extension model goes deeper.
The choice hinges on workload, operational resources, and ecosystem familiarity.
5. Difference in role with SQLite
SQLite is a library-style DB that D. Richard Hipp built in 2000. It stores data in a single file without a separate server. It excels in embedded, desktop, mobile, and test environments. Concurrent multi-client writes are weak, but single-process reads are very fast.
For example, a project's food-app and language-app use SQLite. They are local apps that need no server and stay within a single user's PC.
6. CockroachDB and YugabyteDB
CockroachDB (GA in 2015) and YugabyteDB (GA in 2018) are distributed SQL candidates that advertise PostgreSQL compatibility. They emphasize global distribution, automatic sharding, and strong consistency. They follow the PostgreSQL wire protocol and parts of SQL, but the engine itself is separate, so PostgreSQL-specific features (especially some extensions) do not carry over.
7. Comparison with MongoDB
MongoDB is a document-oriented DB that appeared in 2009. It stores BSON (JSON-like) documents in collections. Schema flexibility and fast initial development are often cited as strengths, but it stands in a different place from relational DBs in terms of transactions, joins, and aggregation richness.
Since PostgreSQL added JSONB, the rationale of "we picked MongoDB because we needed JSON" has reportedly weakened.
8. PostgreSQL for everything
Lately the recommendation "start with PostgreSQL alone" appears often.
- Queue: instead of a separate message broker, use
SELECT ... FOR UPDATE SKIP LOCKEDas a job queue (river, graphile-worker, pgmq). - Search:
pg_trgmandtsvectorfull-text indexes cover small to medium search needs. - Vector: the
pgvectorextension stores embeddings and runs similarity search. - Time-series:
timescaledbor BRIN indexes plus partitioning. - Cache: materialized views or cache tables.
The advantage is one operational surface. Backups, monitoring, authentication, and access control sit in one place. The limit is that one DB doing too much enlarges a single point of failure, and once workloads diverge they eventually need separation.
9. Common pitfalls
Major version upgrades — extension compatibility checks are mandatory. pg_upgrade or logical replication can shrink downtime.
Encoding and locale fixed at create time — LC_COLLATE at DB creation affects sorting. It is hard to change mid-operation, so decide early.
Autovacuum disabled — turning it off lets bloat accumulate (covered in detail in postgres-deep).
Sequence cache — the cache setting on nextval can create gaps across multiple processes.
Timezone assumptions — TIMESTAMPTZ and TIMESTAMP differ subtly. The norm is to store in UTC and convert on display.
Closing thoughts
Picking a data store feels heavy at first, but PostgreSQL alone takes us a long way. The point of separation is decided by workload, traffic, and SLA — not by the tool catalog.
Next
- postgres-deep
References: PostgreSQL official docs, PostgreSQL license, Awesome Postgres, PostgreSQL wiki — History.