A clean, from-scratch implementation of the Bitcask key–value storage engine in Go, inspired by the original Basho (Riak) design.
Bitcask is a log-structured, append-only key–value storage engine optimized for:
- Fast writes — sequential, append-only I/O
- Fast reads — O(1) lookups via an in-memory index
- Crash recovery — deterministic rebuild from disk
- Operational simplicity — minimal on-disk structures
Core design properties:

- Append-only datafiles (no in-place updates)
- In-memory KeyDir mapping keys → file offsets
- Immutable datafiles once rotated
- Background merge & compaction
- Atomic file replacement
Implemented features:

- Append-only Bitcask datafiles
- In-memory KeyDir
- File rotation and disk sync
- Background merge & compaction
- Hint files for fast startup
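The in-memory KeyDir can be pictured as a plain Go map from key to file location. The struct below is an illustrative sketch; the field names and widths are assumptions, not the repo's actual types:

```go
package main

import "fmt"

// keydirEntry points a key at the exact on-disk location of its latest
// value. Field names and sizes are illustrative, not the repo's types.
type keydirEntry struct {
	fileID    uint32 // which datafile holds the record
	valueSize uint32 // length of the value in bytes
	valuePos  int64  // byte offset of the record within that file
	timestamp int64  // write time, used to pick the newest version
}

func main() {
	// The KeyDir itself is just a map from key to entry.
	keydir := map[string]keydirEntry{}
	keydir["user:1"] = keydirEntry{fileID: 3, valueSize: 5, valuePos: 1024, timestamp: 1700000000}

	// A read seeks straight to valuePos in file fileID: one disk seek per lookup.
	e := keydir["user:1"]
	fmt.Println(e.fileID, e.valuePos)
}
```

Because every key's entry lives in memory, reads cost a single map lookup plus one disk seek, which is where the O(1) read claim comes from.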
A Go client for programmatic access:
- Simple API: `GET`, `SET`, `DELETE`, `EXISTS`, `COUNT`, `LIST`
- Binary protocol
- Designed for correctness and testability
A command-line client:

- Interactive REPL
- Full command coverage
- Useful for debugging and exploration
Requirements:

- Go ≥ 1.23
- Make (optional)
```sh
git clone https://github.com/0xRadioAc7iv/go-bitcask.git
cd go-bitcask
make build
```

This produces:

- `bin/bitcask`: the server
- `bin/bk-cli`: the CLI client
```sh
./bin/bitcask
```

Or with flags:

```sh
./bin/bitcask \
  --dir ./data \
  --df-size 128 \
  --port 9999 \
  --sync 15 \
  --size-check 5 \
  --dead-ratio 0.4
```

| Flag | Description | Default |
|---|---|---|
| `--dir` | Data directory | `./` |
| `--df-size` | Max datafile size (MB) | `64` |
| `--port` | TCP port | `9999` |
| `--sync` | fsync interval (s) | `15` |
| `--size-check` | Rotation check interval (s) | `5` |
| `--dead-ratio` | Garbage ratio threshold | `0.4` |
| `--merge-files` | Min files for merge | `3` |
| `--merge-size` | Min merge size (MB) | `256` |
| `--max-merge-files` | Max files per merge | `5` |
| `--max-merge-bytes` | Max merge bytes (GB) | `2` |
```sh
./bin/bk-cli --host 127.0.0.1 --port 9999
```

Example session:

```
> SET foo bar
ok
> GET foo
bar
> COUNT
1
> DELETE foo
ok
```
Install the client library:

```sh
go get github.com/0xRadioAc7iv/go-bitcask/bitcask
```

```go
client, _ := bitcask.Connect()
defer client.Close()

client.SET("user:1", "Alice")
v, _ := client.GET("user:1")
fmt.Println(v)
```

Write path:

- Append record to active datafile
- Update KeyDir in memory
- fsync periodically
Read path:

- Lookup key in KeyDir
- Read record at offset
- Validate CRC
- Return value
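The read-path steps can be sketched like this; the record layout (`CRC | tstamp | klen | vlen | key | value`) is an assumption for illustration, and the CRC is verified before any value is trusted:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// encode builds an illustrative record: CRC | tstamp | klen | vlen | key | value.
func encode(key, val []byte) []byte {
	var body bytes.Buffer
	binary.Write(&body, binary.LittleEndian, uint32(0)) // tstamp (placeholder)
	binary.Write(&body, binary.LittleEndian, uint32(len(key)))
	binary.Write(&body, binary.LittleEndian, uint32(len(val)))
	body.Write(key)
	body.Write(val)
	var rec bytes.Buffer
	binary.Write(&rec, binary.LittleEndian, crc32.ChecksumIEEE(body.Bytes()))
	rec.Write(body.Bytes())
	return rec.Bytes()
}

// readValue parses the record at the given offset and validates its
// CRC before returning the value, catching torn or corrupted records.
func readValue(data []byte, off int64) ([]byte, error) {
	buf := data[off:]
	crc := binary.LittleEndian.Uint32(buf[0:4])
	klen := binary.LittleEndian.Uint32(buf[8:12])
	vlen := binary.LittleEndian.Uint32(buf[12:16])
	body := buf[4 : 16+klen+vlen] // everything the checksum covers
	if crc32.ChecksumIEEE(body) != crc {
		return nil, errors.New("CRC mismatch: corrupted record")
	}
	return buf[16+klen : 16+klen+vlen], nil
}

func main() {
	file := encode([]byte("foo"), []byte("bar"))
	v, err := readValue(file, 0)
	fmt.Println(string(v), err) // bar <nil>

	file[len(file)-1] ^= 0xFF // flip a bit in the stored value
	_, err = readValue(file, 0)
	fmt.Println(err) // CRC mismatch: corrupted record
}
```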
Merge & compaction:

- Runs periodically
- Selects files with high garbage ratio
- Rewrites only live records
- Atomically replaces old files
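File selection for a merge might look like the sketch below. The struct fields and policy details are assumptions; only the garbage-ratio threshold corresponds to the `--dead-ratio` flag above:

```go
package main

import "fmt"

// datafile carries illustrative per-file accounting: how many bytes
// the file holds and how many belong to superseded or deleted records.
type datafile struct {
	id         int
	totalBytes int64 // all records in the file
	deadBytes  int64 // records shadowed by newer writes or deletes
}

// selectForMerge picks files whose garbage ratio meets the threshold;
// only these are rewritten, so merges touch mostly-dead files first.
func selectForMerge(files []datafile, deadRatio float64) []int {
	var picked []int
	for _, f := range files {
		if f.totalBytes > 0 && float64(f.deadBytes)/float64(f.totalBytes) >= deadRatio {
			picked = append(picked, f.id)
		}
	}
	return picked
}

func main() {
	files := []datafile{
		{id: 1, totalBytes: 100, deadBytes: 60}, // 60% dead: worth merging
		{id: 2, totalBytes: 100, deadBytes: 10}, // mostly live: leave alone
	}
	fmt.Println(selectForMerge(files, 0.4)) // [1]
}
```

The merge then rewrites only the live records of the picked files into fresh datafiles and swaps them in atomically, so readers never see a half-merged state.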
Crash recovery:

- Rebuilds KeyDir from hint files (or datafiles)
- Truncates corrupted tails
- Resumes safely
```sh
make build
make test
make fmt
```

A data generation script is provided for stress-testing merge and compaction:

```sh
make datagen
```

References:

- Bitcask: A Log-Structured Hash Table for Fast Key/Value Data (original PDF)
- Bitcask: A Log-Structured Fast KV Store (Arpit Bhayani's blog)