Back to articles

Why Random UUIDs Hurt Database Indexes — And How UUIDv7 Fixes It

DatabasesIndexesUUIDPerformance

Using UUIDs as primary keys is common in distributed systems: they're unique across nodes without a central coordinator. The problem isn't uniqueness — it's how they interact with database indexes. Most production databases (MySQL/InnoDB, PostgreSQL, etc.) use B-tree or B+tree indexes to store and look up data by key. Random UUIDs cause those structures to fragment and split frequently, hurting write performance and cache locality. This article explains how B+trees work, why sequential keys are efficient, why random UUIDs are not, and how UUIDv7 (time-ordered UUIDs) restores index-friendly behavior without giving up global uniqueness.


How database indexes work: B-trees and B+trees

To understand why primary key choice matters, you need a minimal picture of how indexes are stored.

Why trees?

Databases need to find rows by key (e.g. by id) without scanning every row. A B-tree (or its variant the B+tree) is a balanced tree where:

  • Each node holds a bounded number of keys (and optionally values).
  • Keys inside a node are kept sorted.
  • Navigation follows pointers: “go left if key < X, else go right.”

So a lookup walks from the root down to a leaf, visiting one node per level. The number of levels (the height) is small even for huge datasets, because each node can hold many keys. That makes lookups efficient in both memory and, importantly, on disk: each node is typically sized to match a disk block (e.g. 4 KB or 16 KB), so one disk read loads one node.

B+tree in practice

In a B+tree (used by MySQL InnoDB and many others):

  • Leaf nodes hold the actual key/value pairs (e.g. primary key → row data) and are often linked in order (linked list of leaves).
  • Internal nodes hold only keys (and pointers to children) so more keys fit per node and the tree stays shallower.

When you insert a new key:

  1. The database finds the leaf where that key belongs (by sorted order).
  2. It inserts the key (and value) into that leaf.
  3. If the leaf is full, it splits: one node becomes two, and a key is promoted to the parent. Splits can propagate up to the root.

So: where the new key lands in the sorted order determines which leaf (and which disk page) is touched, and whether a split happens. That’s why key ordering and insert pattern matter so much.


Sequential primary keys: inserts at the end

With an auto-incrementing (or otherwise monotonically increasing) primary key, new rows get IDs 1, 2, 3, 4, 5, … So in the B+tree, new keys always go to the rightmost leaf.

Conceptually:

  • The index looks like: [1, 2, 3, 4, 5] in sorted order.
  • The next insert (6) goes to the end of that sequence.
  • No reshuffling of existing keys, no insert “in the middle,” and often no split (the rightmost leaf is the one that grows).

Effects:

  • Minimal page splits — at most the last page splits when it fills.
  • Good cache locality — the “hot” write path is a small set of rightmost pages.
  • Efficient I/O — fewer pages modified per insert.

So sequential IDs are very friendly to B+tree indexes.


Random UUIDs (e.g. UUIDv4): inserts in the middle

With UUIDv4 (or any random 128-bit value), new IDs have no relationship to each other. In sorted order they might look like:

  • Insert A → a92f...
  • Insert B → 12bc...
  • Insert C → ff01...

Sorted: [12bc..., a92f..., ff01...]. So each new insert can land anywhere in the tree, depending on the random value.

What happens:

  1. The database finds the leaf where the new key belongs (somewhere in the middle of the key space).
  2. It inserts into that leaf.
  3. That leaf is no more likely to be the “end” than any other — so it’s often full or nearly full.
  4. Page splits happen frequently: one full leaf becomes two, and the parent may split too.
  5. More pages are read and written per insert → more I/O, slower writes, and more fragmentation.

So the ID format is fine for uniqueness; the storage behavior is inefficient. The B+tree stays balanced, but at the cost of many splits and scattered writes.


The solution: UUIDv7 (time-ordered UUIDs)

UUIDv7 keeps the same 128-bit size and global uniqueness, but encodes a timestamp in the high bits:

|  timestamp (ms)  |  randomness  |

So newer UUIDs are larger (in numeric order) than older ones. Inserts still don’t need a central coordinator, but they tend to go to the right in the B+tree, like sequential IDs.

Example (conceptually):

018fa2b0-xxxx-xxxx-xxxx-xxxxxxxxxxxx
018fa2b1-xxxx-xxxx-xxxx-xxxxxxxxxxxx
018fa2b2-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Insert order and sorted order are similar: new rows mostly append at the end of the index.

Effects:

  • Far fewer page splits than with random UUIDs.
  • Better cache locality — the “hot” write region is again the rightmost part of the tree.
  • Near-sequential write performance, while keeping:
    • No central ID generator.
    • No coordination between nodes.
    • Very low collision probability.

So UUIDv7 gives you distributed scalability and index efficiency at the same time.


Other considerations

Secondary indexes

Many databases (e.g. MySQL InnoDB) store secondary indexes as separate B+trees where the key is the indexed column(s) and the value is the primary key. So:

  • A large or random primary key is repeated in every secondary index.
  • Smaller, sequential-ish primary keys (e.g. BIGINT auto-increment or UUIDv7) keep secondary index entries smaller and the trees shallower.

So primary key choice affects not only the main table B+tree but also every secondary index.

Key size

UUIDs are 128 bits (16 bytes); a typical BIGINT is 64 bits (8 bytes). For a fixed page size (e.g. 16 KB in InnoDB), larger keys mean fewer keys per node → a deeper tree and more disk reads per lookup. So even with UUIDv7, you pay a size cost vs. an integer; but you avoid the random-insert cost.

When random UUIDs are acceptable

If you insert rarely, or your working set is small and fits in memory, the cost of random UUIDs may be acceptable. The problems show up with high insert rates and large tables where I/O and page splits dominate.


Conclusion

When UUIDs are used as primary keys, the main issue isn’t uniqueness — it’s how they interact with B+tree indexes. Sequential keys (e.g. auto-increment) append at the end of the tree, minimizing splits and I/O. Random UUIDs (e.g. UUIDv4) land all over the tree, causing frequent page splits and slower writes. UUIDv7 (time-ordered) keeps global uniqueness and distributed generation while making inserts behave like sequential keys, giving you much better index performance. For more on B-trees and database indexes, see PlanetScale’s B-trees and database indexes.