
DSPy Production Framework

Production-oriented DSPy starter for compiling, serving, and operating LLM programs with a teacher-student workflow.


Highlights

  • Compiles prompts into versioned artifacts
  • Separates optimization from inference cost
  • Exposes FastAPI inference endpoints
  • Includes Docker and Kubernetes deployment
  • Provides tracing, tests, and examples


Overview

This repository shows how to treat DSPy programs like deployable assets rather than ad hoc prompts. It provides reusable DSPy modules, an optimization pipeline that saves compiled JSON artifacts, and a FastAPI service that loads those artifacts at startup for production inference. The project is aimed at teams building RAG, QA, or classification systems that need a cleaner path from offline optimization to online serving.
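The "loads those artifacts at startup" step can be sketched with the standard library alone. This is a minimal, hedged sketch rather than the project's actual loader: the artifacts/compiled_programs/ path comes from this README, while the filename pattern, version-sorting, and fallback behavior are assumptions for illustration.

```python
import json
from pathlib import Path

def load_latest_artifact(kind: str, root: str = "artifacts/compiled_programs"):
    """Return the newest compiled artifact for a module kind (e.g. 'rag'),
    or None so the service can fall back to an uncompiled module.

    Assumes artifacts are named '<kind>_<version>.json'; lexicographic sort
    is good enough for this sketch, not for real version numbers.
    """
    candidates = sorted(Path(root).glob(f"{kind}_*.json"))
    if not candidates:
        return None  # caller serves the fallback (uncompiled) module
    with candidates[-1].open() as f:
        return json.load(f)
```

A FastAPI service would call this once in its startup hook and keep the result in application state, so a missing artifact degrades to the fallback module instead of failing requests.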

Motivation

Most LLM applications stop at prompt iteration, which makes behavior hard to reproduce, version, and deploy. DSPy changes that model by compiling instructions and few-shot examples against a metric. This project packages that idea into a practical workflow: prepare data, optimize with a stronger teacher model, save the compiled artifact, and serve requests with a cheaper student model. The result is a more disciplined way to ship LLM systems with measurable optimization and lower inference cost.
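"Compiling against a metric" just means scoring predictions against labeled examples with an ordinary function. The exact-match metric below is an illustrative sketch following DSPy's `(example, prediction, trace)` metric convention; the field name `answer` and the dict-based access are assumptions, not the project's actual schema (its metrics live in src/core/).

```python
def exact_match(example, prediction, trace=None):
    """Score 1.0 if the predicted answer matches the gold answer.

    Follows DSPy's metric convention: (example, prediction, trace) -> score.
    The 'answer' field is illustrative, not this project's schema.
    """
    gold = example["answer"].strip().lower()
    pred = prediction["answer"].strip().lower()
    return 1.0 if gold == pred else 0.0
```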

Features

  • Provides DSPy modules for QA, RAG, classification, and multi-hop reasoning.
  • Saves compiled programs to artifacts/compiled_programs/ for reuse in production.
  • Runs optimization through configurable teacher and student model settings in config/.
  • Exposes /qa, /rag, /classify, /health, and /artifacts endpoints via FastAPI.
  • Includes sample data preparation, example scripts, and pytest coverage for core behavior.
  • Supports local runs, Docker Compose, Phoenix observability, and Kubernetes manifests.

Architecture

Core components:

  • src/core/ defines signatures, metrics, and DSPy modules.
  • src/pipeline/ loads datasets, splits train/dev/test sets, and runs compilation.
  • src/app/ serves optimized or fallback modules through REST endpoints.
  • src/integrations/ contains vector database integration points.
  • artifacts/compiled_programs/ stores versioned compiled DSPy states.

Flow:

dataset -> metric -> optimizer -> compiled artifact -> FastAPI service -> inference request
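The flow above can be sketched end to end. This toy stands in for a real DSPy optimizer such as BootstrapFewShot: it keeps the demonstrations the student predictor already scores well on and writes the survivors into a JSON artifact. Every name here is illustrative, and the selection logic is far simpler than what DSPy actually does.

```python
import json

def compile_program(dataset, metric, predict, max_demos=4):
    """Toy stand-in for a DSPy optimizer: keep demonstrations the
    student predictor answers correctly, up to max_demos."""
    demos = [ex for ex in dataset if metric(ex, predict(ex)) >= 1.0]
    return {"demos": demos[:max_demos]}  # the 'compiled artifact'

def save_artifact(artifact, path):
    """Persist the compiled state, as artifacts/compiled_programs/ does."""
    with open(path, "w") as f:
        json.dump(artifact, f)
```

The FastAPI service then only reads the saved artifact; the optimizer (and the expensive teacher model behind it) never runs on the request path.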

Tech Stack

  • Python 3.10+
  • DSPy
  • FastAPI + Uvicorn
  • Poetry
  • Pytest, Ruff, Black, MyPy
  • Docker Compose
  • Kubernetes
  • Arize Phoenix

Quick Start

  1. Clone the repo and install dependencies.

     git clone https://github.com/KazKozDev/dspy-optimization-patterns.git
     cd dspy-optimization-patterns
     poetry install --with dev

  2. Create local environment variables.

     cp .env.example .env

     Required for most flows: OPENAI_API_KEY. Optional model configuration lives in config/models.yaml.

  3. Generate sample data.

     make prepare-sample-data

  4. Compile a RAG module into a reusable artifact.

     make optimize-rag

  5. Start the API.

     make run-api

Open http://localhost:8000/docs for the interactive API docs.

Usage

Run the service locally and send a QA request:

curl -X POST http://localhost:8000/qa \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is DSPy?",
    "context": "DSPy is a framework for programming with LLMs."
  }'

Compile a classifier artifact:

make optimize-classifier

Explore example workflows:

python examples/01_basic_usage.py
python examples/02_optimization_workflow.py
python examples/03_rag_with_vector_db.py

Project Structure

config/        model and optimizer settings
data/          raw and processed datasets
artifacts/     compiled DSPy programs
src/core/      signatures, modules, metrics
src/pipeline/  data loading and optimization
src/app/       FastAPI serving layer
tests/         API and module tests

Testing

make test
make lint

License

MIT - see LICENSE

If you like this project, please give it a star ⭐

For questions, feedback, or support, reach out to:

LinkedIn Email

About

A production framework for DSPy implementing the Teacher-Student pattern. Distill the reasoning of expensive models (Teacher) into optimized prompts for cheap, fast models (Student) to reduce inference costs by up to 50x.
