Maestro

Lightweight agent orchestration hooks for Claude Code and OpenAI Codex. No daemons, no databases — just native hook scripts that keep your coding agents working until the job is done.

The Problem

AI coding agents stop prematurely. You end up babysitting them with messages like:

  • "keep going"
  • "commit and push"
  • "fix those CI failures"
  • "address the review feedback"

In our analysis of 193 agent sessions across 14 days (287 agent-hours), 439 user messages were just nudges to keep agents working. That's one interruption every ~40 minutes.

The Solution

Maestro uses each agent's native hook system to check whether work is complete before the agent is allowed to stop:

  • Claude Code: Stop hooks — exit code 2 blocks Claude from stopping and feeds a continuation reason back
  • Codex: notify config — fires after each turn, can send continuation prompts via codex exec resume

When the hook detects incomplete work, it tells the agent what's left. The agent continues autonomously.

What It Checks

  1. Uncommitted changes — "You have uncommitted work. Commit and push before stopping."
  2. Unpushed commits — "Local commits not pushed to remote."
  3. CI failures — "CI is failing: test-suite, lint. Please fix."
  4. PR review feedback — "PR #42 has changes requested."
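The first two checks can be approximated with plain git commands. The sketch below is illustrative only and is not taken from shared/github_checker.py: the function names `run_git` and `find_incomplete_work` are hypothetical, and the CI and PR checks are left out because they need access to GitHub.

```python
import subprocess


def run_git(args, cwd="."):
    """Run a git command and return its stdout, or None if git fails."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout if result.returncode == 0 else None


def find_incomplete_work(cwd=".", git=run_git):
    """Return a list of human-readable reasons the agent should keep working."""
    reasons = []

    # Any output from `git status --porcelain` means the working tree is dirty.
    status = git(["status", "--porcelain"], cwd)
    if status and status.strip():
        reasons.append("You have uncommitted work. Commit and push before stopping.")

    # Count commits on HEAD that the upstream branch does not have yet.
    ahead = git(["rev-list", "--count", "@{upstream}..HEAD"], cwd)
    if ahead and ahead.strip().isdigit() and int(ahead) > 0:
        reasons.append("Local commits not pushed to remote.")

    return reasons
```

The `git` parameter exists so the logic can be exercised without a real repository. A branch with no upstream makes `rev-list` fail, which this sketch treats as "nothing to push".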

Quick Start

git clone https://github.com/evalops/maestro.git
cd maestro
python3 install.py

That's it. The installer wires hooks into ~/.claude/settings.json and ~/.codex/config.toml.

python3 install.py --status     # verify installation
python3 install.py --uninstall  # remove all hooks
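For readers curious what "wiring a hook" means on the Claude Code side, here is a rough sketch of registering a Stop hook in a settings dictionary. It is not the code from install.py: `add_stop_hook` is a hypothetical helper and the script path is a placeholder.

```python
import json


def add_stop_hook(settings, command):
    """Register `command` as a Stop hook in a settings dict, exactly once."""
    hooks = settings.setdefault("hooks", {})
    stop_groups = hooks.setdefault("Stop", [])

    # Skip the append if the same command is already registered (idempotent).
    already = any(
        hook.get("command") == command
        for group in stop_groups
        for hook in group.get("hooks", [])
    )
    if not already:
        stop_groups.append({"hooks": [{"type": "command", "command": command}]})
    return settings


# Placeholder path; a real installer would use the actual checkout location.
example = add_stop_hook({}, "python3 /path/to/maestro/hooks/stop_validator.py")
print(json.dumps(example, indent=2))
```

Making the merge idempotent means running the installer twice should not register the hook twice.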

How It Works

Claude Code (Stop Hook)

When Claude finishes responding, the Stop hook runs:

Claude stops → stop_validator.py runs → checks git/CI/PR status
                                       ├─ All clear → exit 0 (allow stop)
                                       └─ Work remains → exit 2 + reason (block stop)
                                                         Claude receives reason and continues

The continuation reason is fed directly to Claude as a prompt, so it knows exactly what to fix.
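The exit-code protocol above can be sketched in a few lines. This is a simplified illustration, not the contents of stop_validator.py: `decide` and `run_hook` are hypothetical names, and the sketch assumes that a set `stop_hook_active` flag should always let the stop through.

```python
import json
import sys


def decide(payload, reasons):
    """Return (exit_code, message) for one Stop hook invocation."""
    # Assumption: if this stop was itself caused by a hook, allow it to avoid loops.
    if payload.get("stop_hook_active"):
        return 0, ""
    if not reasons:
        return 0, ""  # all clear: allow Claude to stop
    # Exit code 2 blocks the stop; the message goes to stderr for Claude to read.
    return 2, "Work remains: " + " ".join(reasons)


def run_hook(stdin, stderr, reasons):
    """Read the hook payload from stdin, report any reason, and return the exit code."""
    payload = json.load(stdin)
    code, message = decide(payload, reasons)
    if message:
        print(message, file=stderr)
    return code


# In a real hook script the last line would be something like:
#   sys.exit(run_hook(sys.stdin, sys.stderr, reasons))
```

Keeping the decision in a pure function makes it easy to test without spawning a Claude session.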

Codex (Notify Handler)

When a Codex turn completes:

Turn ends → notify_handler.py runs → checks git/CI/PR status
                                    ├─ All clear → done
                                    └─ Work remains → codex exec resume <session> "fix X"
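A rough sketch of the Codex side follows. It only builds the `codex exec resume` command rather than running it, and it is not taken from notify_handler.py. The function names are hypothetical, and the payload key holding the session identifier (`thread-id` here) is an assumption that should be checked against the Codex version in use.

```python
import json


def build_resume_command(session_id, reasons):
    """Build the argv list for resuming a Codex session with a continuation prompt."""
    prompt = "Before stopping, please address: " + "; ".join(reasons)
    return ["codex", "exec", "resume", session_id, prompt]


def handle_notification(raw_payload, reasons, session_key="thread-id"):
    """Return a resume command, or None when there is nothing to do.

    `session_key` is an assumed field name for the session identifier.
    """
    payload = json.loads(raw_payload)
    session_id = payload.get(session_key)
    if not session_id or not reasons:
        return None
    return build_resume_command(session_id, reasons)
```

The returned list can be handed to `subprocess.run` as-is, which avoids shell quoting problems in the prompt text.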

Safety

  • Interactive sessions are left alone by default. Maestro only enforces in headless/automated sessions (claude -p, CI, etc.)
  • A cap of 5 continuations per session prevents infinite loops
  • stop_hook_active flag prevents recursive hook triggers
  • No external dependencies — stdlib only (subprocess, json, pathlib)
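The continuation cap could be implemented with a small per-session counter on disk. The sketch below shows one possible approach and is not Maestro's actual implementation: `may_continue` and the JSON state file are hypothetical.

```python
import json
from pathlib import Path

MAX_CONTINUATIONS = 5  # mirrors the default described above


def may_continue(session_id, state_file, limit=MAX_CONTINUATIONS):
    """Count one more continuation for the session; return False once the cap is hit."""
    path = Path(state_file)
    counts = json.loads(path.read_text()) if path.exists() else {}
    used = counts.get(session_id, 0)
    if used >= limit:
        return False  # cap reached: the hook should let the agent stop
    counts[session_id] = used + 1
    path.write_text(json.dumps(counts))
    return True
```

Keying the counter by session ID means a new session starts with a fresh budget, while a runaway session is cut off after the fifth block.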

Configuration

Edit shared/config.py:

MAX_CONTINUATIONS = 5          # max times hook blocks per session
UNCOMMITTED_CHANGES_BLOCK = True
CI_CHECK_BLOCK = True
OPEN_ISSUES_BLOCK = False      # opt-in: block if assigned GH issues are open

Environment Variables

Variable               Default  Description
MAESTRO_HEADLESS_ONLY  1        Only enforce in headless sessions
MAESTRO_ENFORCE        unset    Force enforcement in all sessions (for testing)
CLAUDE_CODE_HEADLESS   unset    Marker that session is headless

To enforce in interactive sessions too:

MAESTRO_HEADLESS_ONLY=0 claude
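One way to read how these three variables interact is sketched below. The precedence shown (MAESTRO_ENFORCE wins, then MAESTRO_HEADLESS_ONLY, then the headless marker) is an assumption inferred from the descriptions above, and `should_enforce` is a hypothetical name rather than a function from the repository.

```python
import os


def should_enforce(env=None):
    """Decide whether the hook should enforce, based on environment variables."""
    env = os.environ if env is None else env

    # Assumed precedence: an explicit override forces enforcement everywhere.
    if env.get("MAESTRO_ENFORCE"):
        return True

    # Default is "1" (headless only); "0" extends enforcement to interactive sessions.
    headless_only = env.get("MAESTRO_HEADLESS_ONLY", "1") != "0"
    if not headless_only:
        return True

    # Otherwise enforce only when the session is marked as headless.
    return bool(env.get("CLAUDE_CODE_HEADLESS"))
```

Passing a plain dict as `env` makes the decision table easy to check without touching the real environment.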

Project Structure

maestro/
├── install.py              # Hook installer/uninstaller
├── hooks/
│   ├── stop_validator.py   # Claude Code Stop hook
│   └── session_end.py      # Claude Code SessionEnd hook (logging)
├── codex/
│   └── notify_handler.py   # Codex notify callback
├── shared/
│   ├── config.py           # Configuration
│   ├── github_checker.py   # Git/CI/PR status checks
│   └── session_parser.py   # Agent session JSONL analyzer
└── logs/                   # Runtime logs (gitignored)

Testing

# Test the Stop hook manually
echo '{"session_id":"test","cwd":".","stop_hook_active":false}' \
  | MAESTRO_ENFORCE=1 python3 hooks/stop_validator.py

# Check what the hook would do in your repo
echo '{"session_id":"test","cwd":"'"$(pwd)"'","stop_hook_active":false}' \
  | MAESTRO_ENFORCE=1 python3 hooks/stop_validator.py

Bonus: Session Analyzer

shared/session_parser.py can parse JSONL logs from all three agent types (Claude Code, Codex, Factory/Droid):

python3 shared/session_parser.py --list --since 2d    # list recent sessions
python3 shared/session_parser.py --recent              # analyze most recent per agent
python3 shared/session_parser.py <path> --errors       # show errors from a session
python3 shared/session_parser.py <path> --timeline     # user message timeline
python3 shared/session_parser.py <path> --todos        # todo state
python3 shared/session_parser.py <path> --json         # full JSON output
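As a rough idea of what parsing a session log involves, here is a minimal JSONL reader that tallies records by a type field. It is a generic sketch, not the logic in session_parser.py: `count_by_type` is hypothetical, and the `type` key is an assumption, since each agent may use a different schema.

```python
import json


def count_by_type(lines, key="type"):
    """Tally JSONL records by the value of `key`, skipping blank or corrupt lines."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated lines from sessions that were killed
        if not isinstance(record, dict):
            continue
        kind = str(record.get(key, "unknown"))
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Because the function accepts any iterable of strings, it can be fed an open file object directly, for example `count_by_type(open(path))`.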

Inspired By

Maestro takes the lightweight path: instead of a full orchestration daemon, it uses each agent's native hook system to inject just enough automation to eliminate the babysitting loop.

License

MIT
