Janus

'Presiding over all beginnings and transitions, whether abstract or concrete, sacred or profane.'

An LLM token compression proxy for the Anthropic API. Janus sits between your application and Claude, intelligently compressing requests to reduce token usage and cost without sacrificing context quality.

1x GenAI Genesis Winner: 🏆 Google Sustainability Hack

Inspiration

I wanted to build a tool that runs locally, losslessly, and efficiently, and that significantly reduces token usage, to get the most out of coding agents.

What It Does

Janus intercepts outgoing API requests to Anthropic's /v1/messages endpoint and runs them through a multi-stage compression pipeline before forwarding them upstream. Responses are returned transparently to the client, with both streaming and non-streaming modes supported.

Compression Pipeline

Requests pass through four stages, each targeting a different source of redundancy:

Stage A -- Tool-Result Deduplication

Tracks tool call outputs within a conversation session. When the same tool produces identical output more than once, subsequent occurrences are replaced with a short placeholder, eliminating repeated content.
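As a rough illustration of the deduplication idea, the sketch below keeps a per-session map of hashed tool outputs and returns a placeholder for repeats. The `ToolDedup` type, the placeholder wording, and the use of the standard library's `DefaultHasher` (rather than the xxh3 hash listed in the tech stack) are assumptions made for the example, not Janus's actual code.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Per-session tracker for tool outputs (hypothetical type, for illustration only).
#[derive(Default)]
pub struct ToolDedup {
    // Maps (tool name, output hash) to the message index where the output first appeared.
    seen: HashMap<(String, u64), usize>,
}

impl ToolDedup {
    /// Returns the text to forward: the original output the first time it is seen,
    /// and a short placeholder for every identical repeat from the same tool.
    pub fn process(&mut self, tool: &str, output: &str, index: usize) -> String {
        let mut hasher = DefaultHasher::new();
        output.hash(&mut hasher);
        let key = (tool.to_string(), hasher.finish());

        if let Some(first) = self.seen.get(&key).copied() {
            // Placeholder format is made up for this sketch.
            return format!("[duplicate of {tool} output at message {first}]");
        }
        self.seen.insert(key, index);
        output.to_string()
    }
}
```

A production version would likely compare the full output on a hash match (or use a wider hash) to rule out collisions.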

Stage B -- Regex Structural Compression

Five sub-stages of pattern-based compression:

B1: Docstring removal (Python, JSDoc, Rust doc comments)
B2: Comment stripping
B3: Whitespace normalization
B4: Stack trace condensation
B5: Repeated block deduplication
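To make two of these sub-stages concrete, here is a minimal sketch of what B3 and B4 could look like using only the standard library. The function names, the `keep` parameter, and the omission marker are hypothetical; the real stages are regex-based and may behave differently.

```rust
/// B3-style pass: strip trailing whitespace and collapse runs of blank lines into one.
pub fn normalize_whitespace(text: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    let mut previous_blank = false;
    for line in text.lines() {
        let trimmed = line.trim_end();
        let blank = trimmed.is_empty();
        if blank && previous_blank {
            continue; // drop the second and later blank lines in a run
        }
        out.push(trimmed);
        previous_blank = blank;
    }
    out.join("\n")
}

/// B4-style pass: keep the first and last `keep` frames of a stack trace and
/// replace the middle with a one-line marker. Frame detection is left to the caller.
pub fn condense_frames(frames: &[&str], keep: usize) -> Vec<String> {
    if frames.len() <= keep * 2 {
        // Nothing to gain from condensing a short trace.
        return frames.iter().map(|f| f.to_string()).collect();
    }
    let omitted = frames.len() - keep * 2;
    let mut out: Vec<String> = frames[..keep].iter().map(|f| f.to_string()).collect();
    out.push(format!("... {omitted} frames omitted ..."));
    out.extend(frames[frames.len() - keep..].iter().map(|f| f.to_string()));
    out
}
```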

Stage C -- AST Pruning

Uses tree-sitter to parse code blocks (Python, JavaScript, Rust, Go) and remove functions that are unlikely to be relevant to the current query. Only applied to blocks above a configurable line threshold.
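The selection step that follows parsing might look something like the sketch below. It deliberately leaves out tree-sitter: the caller supplies the functions already extracted from a block. The `FunctionSpan` type, the "query mentions the function name" relevance test, and the stub comment are all assumptions for illustration; the source does not say how Janus scores relevance.

```rust
/// A function found in a code block. In Janus this would come from a tree-sitter
/// parse; here the caller supplies it directly (hypothetical type).
pub struct FunctionSpan<'a> {
    pub name: &'a str,
    pub source: &'a str,
}

/// Returns one entry per function: the original source if it is kept,
/// or a one-line stub if it is pruned.
pub fn prune_functions(
    functions: &[FunctionSpan<'_>],
    query: &str,
    block_lines: usize,
    line_threshold: usize,
) -> Vec<String> {
    functions
        .iter()
        .map(|f| {
            // Blocks at or below the threshold are left untouched, and so are
            // functions whose name appears in the query (assumed heuristic).
            if block_lines <= line_threshold || query.contains(f.name) {
                f.source.to_string()
            } else {
                format!("// {} pruned", f.name)
            }
        })
        .collect()
}
```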

Semantic Cache

On top of the compression pipeline, Janus maintains a semantic cache backed by Redis with vector similarity search. Requests that are semantically similar to previously seen requests (above a configurable similarity threshold) return cached responses directly, skipping the upstream call entirely.

Embeddings generated locally using BGE-small-en-v1.5 (384-dimensional) via fastembed
Configurable similarity cutoff (default: 0.85) and TTL (default: 1 hour)
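The lookup logic can be sketched in memory, without Redis, to show how the similarity cutoff and TTL interact. This is an illustration under stated assumptions: `CacheEntry`, `cosine_similarity`, and `lookup` are hypothetical names, cosine similarity is assumed as the distance metric, and a linear scan stands in for the vector index.

```rust
use std::time::{Duration, Instant};

/// One cached response with the embedding of the request that produced it.
pub struct CacheEntry {
    pub embedding: Vec<f32>,
    pub response: String,
    pub stored_at: Instant,
}

/// Cosine similarity of two vectors; returns 0.0 if either has zero length.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Returns the best unexpired response whose similarity meets the cutoff, if any.
pub fn lookup<'a>(
    entries: &'a [CacheEntry],
    query: &[f32],
    cutoff: f32,
    ttl: Duration,
) -> Option<&'a str> {
    entries
        .iter()
        .filter(|e| e.stored_at.elapsed() < ttl) // skip expired entries
        .map(|e| (cosine_similarity(&e.embedding, query), e))
        .filter(|(score, _)| *score >= cutoff) // enforce the similarity cutoff
        .max_by(|a, b| a.0.total_cmp(&b.0)) // pick the closest match
        .map(|(_, e)| e.response.as_str())
}
```

With the defaults above, a call would pass `0.85` as the cutoff and `Duration::from_secs(3600)` as the TTL. A `None` result corresponds to a cache miss, in which case the request goes upstream.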

Architecture

Client --> Janus Proxy (localhost:8080) --> Anthropic API
               |
               |-- Compression Pipeline (Stages A-D)
               |-- Semantic Cache (Redis + Vector Search)
               |-- TUI Dashboard (real-time metrics)

Tech Stack

Component	Technology
Language	Rust
Async Runtime	Tokio
HTTP Framework	Axum
Terminal UI	Ratatui + Crossterm
AST Parsing	tree-sitter (Python, JS, Rust, Go)
Embeddings	fastembed (BGE-small-en-v1.5)
Cache	Redis with RediSearch
Token Counting	tiktoken-rs
Hashing	xxhash (xxh3)
Containerization	Docker + Docker Compose

Getting Started

Prerequisites

Rust toolchain (1.75+)
Redis server (with RediSearch module for semantic caching)

Build

cargo build --release

Configure

Copy and edit the default configuration file:

cp janus.toml janus.toml.local

Key settings in janus.toml:

[server]
listen = "0.0.0.0:8080"
upstream_url = "https://api.anthropic.com"

[pipeline]
tool_dedup = true
regex_structural = true
ast_pruning = true
semantic_trim = true

[cache]
enabled = true
redis_url = "redis://127.0.0.1:6379"
similarity_cutoff = 0.85
ttl_seconds = 3600

[pricing]
input_cost_per_1k = 0.003
output_cost_per_1k = 0.015
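
The [pricing] values presumably feed the dashboard's cost estimate. As a hedged sketch, assuming the rates are in US dollars per 1,000 tokens and that savings are computed from input tokens only, the arithmetic could look like this (`estimated_savings_usd` is a hypothetical helper, not a Janus function):

```rust
/// Estimated input-side savings for one request, given token counts before and
/// after compression and the configured input rate per 1,000 tokens.
pub fn estimated_savings_usd(tokens_before: u64, tokens_after: u64, input_cost_per_1k: f64) -> f64 {
    // saturating_sub guards against a request that grew instead of shrinking.
    let saved = tokens_before.saturating_sub(tokens_after);
    saved as f64 / 1000.0 * input_cost_per_1k
}
```

For example, compressing a 12,000-token request to 7,000 tokens at a rate of 0.003 saves 5 x 0.003 = 0.015 under these assumptions.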

Run

# Start the proxy with the interactive TUI
janus serve

# Start without the TUI (logs to stdout)
janus serve --no-tui

# Use a custom config file
janus serve --config path/to/config.toml

Docker

# Start Janus + Redis stack
docker-compose up

# Health check
curl http://localhost:8080/health

Other Commands

# Run compression benchmarks
janus benchmark

# Cache management
janus cache flush
janus cache stats
janus cache test

TUI Dashboard

When running with janus serve, an interactive terminal dashboard displays real-time metrics:

Total tokens saved and estimated cost reduction
Per-stage compression breakdown
Request history with cache hit/miss indicators
Error tracking with timestamps

Keyboard controls: q quit, p pause, r reset stats, f flush cache, a toggle auto-flush, arrow keys to scroll.

License

This project is licensed under the MIT License. See LICENSE for details.
