Batched Key Access (BKA) join

Parameters (Batched Key Access join)

orders rows (outer): 10000

Rows from the outer (driving) table. Keys are collected into the join buffer.

departments rows (inner): 100000

Rows in the inner table. Looked up via B+tree index.

Row size (bytes): 200

Size of one outer row in the join buffer.

join_buffer_size (bytes): 262144

Buffer for collecting outer keys. Default 256 KiB.

Index key size (bytes): 8

Size of the join key in the inner B+tree index. Affects fan-out and tree height.

Example query this animation executes

-- BKA: batch outer keys → MRR on inner
SELECT o.*, d.name
FROM   orders o
JOIN   departments d ON d.id = o.dept_id
ORDER  BY o.order_date;

BKA collects outer keys into join_buffer, sorts by rowid, then does a Multi-Range Read on the inner index — converting random I/O into sequential.

What you'll see in the animation

Purple zone (top): outer-row join keys flow into the join buffer one by one, forming a batch.
Blue zone (middle): the batch is sorted by rowid so the storage engine can read in disk order. The sorted keys are dispatched as an MRR request.
Green zone (bottom): the storage engine reads matching inner rows sequentially in rowid order — no random seeks.
A sweep bar moves across the green zone to show the sequential nature of the disk reads.
The readout below compares random I/Os (without BKA) vs sequential I/Os (with BKA) and shows the speedup factor.

0.0s 0.0s

BKA cost model

Rows per batch ?How many outer rows fit in one join_buffer_size chunk. More rows per batch = fewer batches = fewer MRR round-trips.

—

Number of batches ?ceil(outer_rows / rows_per_batch). Each batch triggers one MRR request to the storage engine.

—

B+tree height ?Levels in the inner index B+tree. Each batch traverses this many levels before reaching leaf pages.

—

Random I/Os (without BKA) ?Without BKA, each outer row does an independent random index lookup: outer_rows x height page reads.

—

Sequential I/Os (with BKA) ?With BKA + MRR, keys are sorted by rowid so the engine reads pages in disk order: batches x (height + rows_per_batch).

—

Speedup factor ?random_ios / sequential_ios. Higher = more benefit from BKA. Biggest gains on HDD; still helps on SSD.

—

I/O operations vs outer rows (log–log)

Learn more — BKA vs simple Nested Loop, and when to enable it

Simple Nested Loop does one random index lookup per outer row. If the inner table is large and the index is deep, each lookup may touch 3-4 random pages. With 100,000 outer rows that is 300,000-400,000 random I/Os — catastrophic on spinning disks and still expensive on SSDs.

BKA (Batched Key Access) changes the pattern: it collects a batch of outer keys into join_buffer_size, sorts them by the inner table's rowid (primary key order), and hands the sorted batch to the storage engine as a Multi-Range Read (MRR). The engine reads the matching rows in disk order — sequential I/O instead of random.

When to enable it: BKA is not on by default. You need:

SET optimizer_switch = 'batched_key_access=on,mrr=on,mrr_cost_based=off';

mrr_cost_based=off forces MRR even when the optimizer's cost model thinks random I/O is cheap (which it often misjudges on HDD). On MySQL 8.4 you can also set these in my.cnf globally.

Why disk-order reads are faster: HDDs have ~10 ms seek time per random read. Sequential reads bypass seeks entirely — the head just streams. On SSDs, sequential reads still win because of read-ahead, fewer syscalls, and better NAND page utilization. A 10-50x speedup is common on HDD; 2-5x on SSD.

MariaDB 11.4 also supports BKA (called BKA in join_cache_level 5-6). The principle is identical: batch, sort, MRR.