A Steadybit extension for Redis chaos engineering.
Learn about the capabilities of this extension in our Reliability Hub.
| Environment Variable | Required | Description |
|---|---|---|
| `STEADYBIT_EXTENSION_ENDPOINTS_JSON` | Yes | JSON array of Redis endpoint configurations |
| `STEADYBIT_EXTENSION_DISCOVERY_INTERVAL_INSTANCE_SECONDS` | No | Interval for instance discovery in seconds (default: 30) |
| `STEADYBIT_EXTENSION_DISCOVERY_INTERVAL_DATABASE_SECONDS` | No | Interval for database discovery in seconds (default: 60) |

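
As an illustration of how these variables might be consumed, the following minimal Go sketch decodes the endpoint array and applies the documented interval defaults. It is not the extension's actual code: the `Endpoint` type and the `parseEndpoints` and `intervalSeconds` helpers are hypothetical names, and the struct fields simply mirror the JSON keys used in this README's endpoint examples.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strconv"
)

// Endpoint mirrors the JSON keys of one endpoint object (hypothetical type).
type Endpoint struct {
	URL                string `json:"url"`
	Username           string `json:"username,omitempty"`
	Password           string `json:"password,omitempty"`
	DB                 int    `json:"db,omitempty"`
	Name               string `json:"name,omitempty"`
	InsecureSkipVerify bool   `json:"insecureSkipVerify,omitempty"`
}

// parseEndpoints decodes the JSON array and rejects entries without a URL.
func parseEndpoints(raw string) ([]Endpoint, error) {
	if raw == "" {
		return nil, fmt.Errorf("STEADYBIT_EXTENSION_ENDPOINTS_JSON is required")
	}
	var eps []Endpoint
	if err := json.Unmarshal([]byte(raw), &eps); err != nil {
		return nil, fmt.Errorf("invalid endpoints JSON: %w", err)
	}
	for i, ep := range eps {
		if ep.URL == "" {
			return nil, fmt.Errorf("endpoint %d: missing url", i)
		}
	}
	return eps, nil
}

// intervalSeconds returns raw as a positive integer, or def when raw is
// empty, not a number, or not positive.
func intervalSeconds(raw string, def int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return def
	}
	return n
}

func main() {
	eps, err := parseEndpoints(os.Getenv("STEADYBIT_EXTENSION_ENDPOINTS_JSON"))
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	instance := intervalSeconds(os.Getenv("STEADYBIT_EXTENSION_DISCOVERY_INTERVAL_INSTANCE_SECONDS"), 30)
	database := intervalSeconds(os.Getenv("STEADYBIT_EXTENSION_DISCOVERY_INTERVAL_DATABASE_SECONDS"), 60)
	fmt.Printf("%d endpoint(s), instance discovery every %ds, database discovery every %ds\n",
		len(eps), instance, database)
}
```
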
The STEADYBIT_EXTENSION_ENDPOINTS_JSON environment variable should contain a JSON array of Redis endpoint configurations:

    [
      {
        "url": "redis://localhost:6379",
        "password": "optional-password",
        "username": "optional-username",
        "db": 0,
        "name": "my-redis-instance",
        "insecureSkipVerify": false
      }
    ]

For TLS connections, use the `rediss://` URL scheme:

    [
      {
        "url": "rediss://redis.example.com:6379",
        "password": "secret",
        "insecureSkipVerify": true
      }
    ]

Discovers Redis instances and exposes attributes like:

- `redis.host` - Redis host
- `redis.port` - Redis port
- `redis.version` - Redis version
- `redis.role` - Instance role (master/replica)
- `redis.cluster.enabled` - Cluster mode status

Discovers Redis databases (db0-db15) and exposes:

- `redis.database.index` - Database index
- `redis.database.keys` - Key count in database
- `redis.database.name` - Database name (e.g., "db0")
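
To make the database attributes concrete, here is a minimal sketch of how they could be derived from one line of the keyspace section of Redis `INFO` output, which has the form `db0:keys=42,expires=3,avg_ttl=1200`. Whether the extension reads `INFO` this way is an assumption; `DatabaseInfo` and `parseKeyspaceLine` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// DatabaseInfo holds the three database attributes listed above.
type DatabaseInfo struct {
	Name  string // e.g. "db0"
	Index int    // e.g. 0
	Keys  int64  // number of keys reported by Redis
}

// parseKeyspaceLine parses one keyspace line such as
// "db0:keys=42,expires=3,avg_ttl=1200".
func parseKeyspaceLine(line string) (DatabaseInfo, error) {
	name, fields, ok := strings.Cut(strings.TrimSpace(line), ":")
	if !ok || !strings.HasPrefix(name, "db") {
		return DatabaseInfo{}, fmt.Errorf("not a keyspace line: %q", line)
	}
	index, err := strconv.Atoi(strings.TrimPrefix(name, "db"))
	if err != nil {
		return DatabaseInfo{}, fmt.Errorf("bad database index in %q: %w", line, err)
	}
	info := DatabaseInfo{Name: name, Index: index}
	for _, field := range strings.Split(fields, ",") {
		k, v, _ := strings.Cut(field, "=")
		if k == "keys" {
			if info.Keys, err = strconv.ParseInt(v, 10, 64); err != nil {
				return DatabaseInfo{}, fmt.Errorf("bad key count in %q: %w", line, err)
			}
		}
	}
	return info, nil
}

func main() {
	info, err := parseKeyspaceLine("db0:keys=42,expires=3,avg_ttl=1200")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", info)
}
```
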
- ID: `com.steadybit.extension_redis.instance.connection-exhaustion`
- Target: Instance
- Description: Opens many connections to test connection limit handling
- Parameters:
  - `duration` - How long to hold connections
  - `numConnections` - Number of connections to open (default: 100)
- ID: `com.steadybit.extension_redis.instance.client-pause`
- Target: Instance
- Description: Suspends all client command processing using `CLIENT PAUSE`
- Parameters:
  - `duration` - How long to pause clients
  - `pauseMode` - ALL (all commands) or WRITE (write commands only)
- Reversibility: Auto-reverts after timeout
- ID: `com.steadybit.extension_redis.instance.maxmemory-limit`
- Target: Instance
- Description: Reduces Redis `maxmemory` to force evictions or OOM errors
- Parameters:
  - `duration` - How long to apply the limit
  - `maxmemory` - Memory limit (e.g., "10mb", "1gb")
  - `evictionPolicy` - noeviction, allkeys-lru, allkeys-lfu, volatile-lru, volatile-ttl, or keep original
- Reversibility: Fully reversible - restores original settings on stop
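
Values such as "10mb" follow Redis's memory unit convention, in which `k`, `m`, and `g` are powers of 1000 while `kb`, `mb`, and `gb` are powers of 1024, all case-insensitive. The sketch below converts such a string to bytes, for example to compare the new limit with current usage before applying it. `parseMemory` is a hypothetical helper; the extension may simply pass the string through to Redis.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemory converts a Redis-style size such as "10mb" or "1gb" to bytes.
func parseMemory(s string) (uint64, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	// Two-letter suffixes come first so "kb" is not mistaken for "b".
	units := []struct {
		suffix string
		factor uint64
	}{
		{"kb", 1 << 10}, {"mb", 1 << 20}, {"gb", 1 << 30},
		{"k", 1000}, {"m", 1000 * 1000}, {"g", 1000 * 1000 * 1000},
		{"b", 1},
	}
	digits, factor := s, uint64(1)
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			digits, factor = strings.TrimSuffix(s, u.suffix), u.factor
			break
		}
	}
	n, err := strconv.ParseUint(digits, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid memory size %q: %w", s, err)
	}
	return n * factor, nil
}

func main() {
	for _, in := range []string{"10mb", "1gb", "500k"} {
		n, err := parseMemory(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s = %d bytes\n", in, n)
	}
}
```
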
- ID: `com.steadybit.extension_redis.database.cache-expiration`
- Target: Database
- Description: Sets TTL on string keys matching a pattern to force expiration (non-string keys are skipped)
- Parameters:
  - `duration` - Attack duration (for tracking)
  - `pattern` - Key pattern to match (only string keys are affected)
  - `ttl` - TTL in seconds before keys expire (default: 5)
  - `maxKeys` - Maximum keys to affect (default: 100)
  - `restoreOnStop` - Restore keys with original values and TTLs when attack stops (default: false)
- Reversibility: Reversible when `restoreOnStop` is enabled - recreates expired keys with original values and TTLs
- ID: `com.steadybit.extension_redis.instance.sentinel-stop`
- Target: Instance
- Description: Stops a Redis Sentinel server using `DEBUG SLEEP`, making it unresponsive to all clients and other Sentinels
- Parameters:
  - `duration` - How long the Sentinel should be unresponsive (default: 30s)
- Reversibility: Auto-recovers after the sleep duration
- ID: `com.steadybit.extension_redis.instance.check-memory`
- Target: Instance
- Description: Monitors Redis memory usage and fails if the threshold is exceeded
- Parameters:
  - `duration` - Monitoring duration
  - `maxMemoryPercent` - Max memory as % of maxmemory (default: 80%)
  - `maxMemoryBytes` - Max memory in MB (optional)
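
A percentage threshold like `maxMemoryPercent` can be evaluated from the `used_memory` and `maxmemory` fields of Redis `INFO` output. The sketch below assumes that approach, which the README does not confirm; `parseInfo` and `memoryExceeded` are hypothetical helpers. When `maxmemory` is 0 (no limit configured), the percentage is undefined and the sketch reports no violation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseInfo turns INFO output ("key:value" lines, "#" section headers)
// into a map.
func parseInfo(raw string) map[string]string {
	fields := make(map[string]string)
	for _, line := range strings.Split(raw, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if k, v, ok := strings.Cut(line, ":"); ok {
			fields[k] = v
		}
	}
	return fields
}

// memoryExceeded reports whether used_memory is above maxPercent of maxmemory.
func memoryExceeded(info map[string]string, maxPercent float64) (bool, error) {
	used, err := strconv.ParseUint(info["used_memory"], 10, 64)
	if err != nil {
		return false, fmt.Errorf("used_memory: %w", err)
	}
	limit, err := strconv.ParseUint(info["maxmemory"], 10, 64)
	if err != nil {
		return false, fmt.Errorf("maxmemory: %w", err)
	}
	if limit == 0 {
		return false, nil // no limit configured, nothing to compare against
	}
	percent := float64(used) * 100 / float64(limit)
	return percent > maxPercent, nil
}

func main() {
	info := parseInfo("# Memory\r\nused_memory:900\r\nmaxmemory:1000\r\n")
	exceeded, err := memoryExceeded(info, 80)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("threshold exceeded:", exceeded)
}
```
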
- ID: `com.steadybit.extension_redis.instance.check-latency`
- Target: Instance
- Description: Monitors Redis response latency
- Parameters:
  - `duration` - Monitoring duration
  - `maxLatencyMs` - Maximum allowed latency in ms (default: 100)
- ID: `com.steadybit.extension_redis.instance.check-connections`
- Target: Instance
- Description: Monitors connected clients and fails if the threshold is exceeded
- Parameters:
  - `duration` - Monitoring duration
  - `maxConnectionsPct` - Max connections as % of maxclients (default: 80%)
  - `maxConnections` - Absolute max connections (optional)
- ID: `com.steadybit.extension_redis.instance.check-replication`
- Target: Instance
- Description: Monitors Redis replication status and lag for replicas
- Parameters:
  - `duration` - Monitoring duration
  - `maxLagSeconds` - Maximum allowed replication lag (default: 10s)
  - `requireLinkUp` - Fail if master link is down (default: true)
A complete demo environment with a sample application and chaos engineering experiments is available in the demo/ directory.

    cd demo
    docker-compose up -d

This starts:
- redis-master: Primary Redis (port 6379)
- redis-replica: Replica for HA testing (port 6380)
- demo-app: Sample app with caching (port 3400)
- load-generator: Continuous traffic generator
See demo/CHAOS_EXPERIMENTS.md for detailed chaos engineering scenarios including:
User-Facing Scenarios:
- Cache unavailability impact
- Session loss handling
- Slow cache response
- Cache stampede (thundering herd)
SRE/Platform Scenarios:
- Connection pool exhaustion
- Memory pressure & eviction
- Replication lag
- Redis failover

Install with Helm:

    helm repo add steadybit https://steadybit.github.io/helm-charts
    helm repo update
    helm install steadybit-extension-redis steadybit/steadybit-extension-redis \
      --set redis.auth.managementEndpoints='[{"url":"redis://redis:6379"}]'

Or run with Docker:

    docker run -d \
      -e STEADYBIT_EXTENSION_ENDPOINTS_JSON='[{"url":"redis://redis:6379"}]' \
      -p 8083:8083 \
      ghcr.io/steadybit/extension-redis:latest

Development prerequisites:

- Go 1.25+
- Redis instance for testing
- Docker (for demo environment)

    # Build
    make build

    # Unit tests only
    go test ./clients/... ./config/... ./extredis/... -v
    # All tests including e2e (requires minikube)
    make test

    # Start Redis
    ./scripts/start-redis.sh
    # Run extension
    export STEADYBIT_EXTENSION_ENDPOINTS_JSON='[{"url":"redis://localhost:6379","password":"dev-password"}]'
    make run

MIT License - see LICENSE for details.