---
layout: default
title: Haystack Tutorial
nav_order: 23
has_children: true
---
Project: Haystack, an open-source framework for building production-ready LLM applications, RAG pipelines, and intelligent search systems.

Haystack, developed by deepset, is built around composable AI pipelines: a modular, component-based architecture that combines retrieval, generation, and evaluation into end-to-end workflows. It supports dozens of LLM providers, vector databases, and retrieval strategies out of the box.
| Feature | Description |
|---|---|
| Pipeline System | Directed graph of components with typed inputs/outputs and automatic validation |
| RAG | First-class retrieval-augmented generation with hybrid search (BM25 + embedding) |
| Multi-Provider | OpenAI, Anthropic, Cohere, Google, Hugging Face, Ollama, and more |
| Document Stores | In-memory, Elasticsearch, OpenSearch, Pinecone, Qdrant, Weaviate, Chroma, pgvector |
| Evaluation | Built-in metrics (MRR, MAP, NDCG) and LLM-based evaluation components |
| Custom Components | @component decorator for building reusable pipeline nodes with typed I/O |
- Repository: deepset-ai/haystack (about 24.4k stars)
- Latest release: v2.25.2 (published 2026-03-05)
```mermaid
graph TB
    subgraph Ingestion["Ingestion Pipeline"]
        FILES[File Converters]
        SPLIT[Document Splitter]
        EMBED_D[Document Embedder]
        WRITER[Document Writer]
    end
    subgraph Store["Document Stores"]
        MEM[In-Memory]
        ES[Elasticsearch]
        PG[pgvector]
        VEC[Pinecone / Qdrant / Weaviate]
    end
    subgraph Query["Query Pipeline"]
        EMBED_Q[Query Embedder]
        BM25[BM25 Retriever]
        EMB_RET[Embedding Retriever]
        JOINER[Document Joiner]
        RANKER[Ranker]
        PROMPT[Prompt Builder]
        GEN[Generator / LLM]
    end
    FILES --> SPLIT --> EMBED_D --> WRITER
    WRITER --> Store
    Store --> BM25
    Store --> EMB_RET
    EMBED_Q --> EMB_RET
    BM25 --> JOINER
    EMB_RET --> JOINER
    JOINER --> RANKER --> PROMPT --> GEN
```
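In the query pipeline above, the Document Joiner merges the BM25 and embedding result lists. One common join strategy is reciprocal rank fusion (RRF), sketched here in plain Python (the document IDs and the constant `k=60` are illustrative, not taken from Haystack's implementation):

```python
# Sketch of reciprocal rank fusion: merge several ranked lists of doc IDs.
def rrf(ranked_lists, k=60):
    """Fuse ranked lists into a single ranking by summed 1/(k + rank)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents found by both retrievers accumulate a higher score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["d3", "d1", "d7"]       # lexical ranking
embedding_hits = ["d1", "d9", "d3"]  # semantic ranking
fused = rrf([bm25_hits, embedding_hits])
```

RRF needs only ranks, not raw scores, which is why it is a popular way to combine lexical and vector retrievers whose score scales are incompatible.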
| Chapter | Topic | What You'll Learn |
|---|---|---|
| 1. Getting Started | Setup | Installation, first RAG pipeline, architecture overview |
| 2. Document Stores | Storage | Store backends, indexing, preprocessing, multi-store patterns |
| 3. Retrievers & Search | Retrieval | BM25, embedding, hybrid search, filtering, re-ranking |
| 4. Generators & LLMs | Generation | Multi-provider LLMs, prompt engineering, streaming, chat |
| 5. Pipelines & Workflows | Composition | Pipeline graph, branching, loops, serialization, async |
| 6. Evaluation & Optimization | Quality | Retrieval metrics, LLM evaluation, A/B testing, optimization |
| 7. Custom Components | Extensibility | @component decorator, typed I/O, testing, packaging |
| 8. Production Deployment | Operations | REST API, Docker, Kubernetes, monitoring, scaling |
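To preview the retrieval metrics covered in Chapter 6, here is Mean Reciprocal Rank (MRR) sketched in plain Python (the queries and relevance labels are made up for illustration):

```python
# Sketch of Mean Reciprocal Rank: average of 1/rank of the first
# relevant document per query.
def mean_reciprocal_rank(results, relevant):
    """results: one ranked doc-ID list per query.
    relevant: one set of relevant doc IDs per query."""
    total = 0.0
    for ranking, rel in zip(results, relevant):
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in rel:
                total += 1.0 / rank  # reciprocal rank of the first hit
                break
    return total / len(results)

mrr = mean_reciprocal_rank(
    results=[["d2", "d5", "d1"], ["d9", "d4"]],
    relevant=[{"d5"}, {"d9"}],
)
# Query 1 hits at rank 2 (1/2), query 2 at rank 1 (1/1) -> MRR = 0.75
```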
| Component | Technology |
|---|---|
| Language | Python 3.9+ |
| Pipeline Engine | Custom directed graph with topological execution |
| Serialization | YAML / JSON pipeline definitions |
| Embeddings | Sentence Transformers, OpenAI, Cohere, Fastembed |
| Vector Search | FAISS, Pinecone, Qdrant, Weaviate, Chroma, pgvector |
| Text Search | Elasticsearch, OpenSearch, BM25 (in-memory) |
| LLM Providers | OpenAI, Anthropic, Google, Cohere, Hugging Face, Ollama |
| API Layer | Hayhooks (FastAPI-based pipeline serving) |
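The "directed graph with topological execution" in the table can be illustrated with a toy executor: run each component once all its upstream dependencies have produced output, passing results along the edges. This is a simplified stdlib sketch, not Haystack's engine, and the component names and callables are invented:

```python
# Toy pipeline engine: execute components in topological order (Kahn's
# algorithm), merging each node's declared inputs with upstream outputs.
from collections import deque

def run_pipeline(components, edges, inputs):
    """components: name -> callable(dict) -> dict of outputs.
    edges: (src, dst) pairs; src's outputs feed dst's inputs."""
    indegree = {name: 0 for name in components}
    for _, dst in edges:
        indegree[dst] += 1
    ready = deque(n for n, d in indegree.items() if d == 0)
    outputs = {}
    while ready:
        name = ready.popleft()
        data = dict(inputs.get(name, {}))
        for src, dst in edges:
            if dst == name:
                data.update(outputs[src])  # pull upstream results
        outputs[name] = components[name](data)
        for src, dst in edges:
            if src == name:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return outputs

result = run_pipeline(
    components={
        "retrieve": lambda d: {"docs": ["doc about " + d["query"]]},
        "prompt": lambda d: {"prompt": f"Context: {d['docs']} Q: {d['query']}"},
    },
    edges=[("retrieve", "prompt")],
    inputs={"retrieve": {"query": "haystack"}, "prompt": {"query": "haystack"}},
)
```

Haystack's real engine adds typed socket validation, branching, and loops on top of this basic dependency-ordered execution.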
Ready to begin? Start with Chapter 1: Getting Started.
Built with insights from the Haystack repository and community documentation.
- Start Here: Chapter 1: Getting Started with Haystack
- Back to Main Catalog
- Browse A-Z Tutorial Directory
- Search by Intent
- Explore Category Hubs
- Chapter 1: Getting Started with Haystack
- Chapter 2: Document Stores
- Chapter 3: Retrievers & Search
- Chapter 4: Generators & LLMs
- Chapter 5: Pipelines & Workflows
- Chapter 6: Evaluation & Optimization
- Chapter 7: Custom Components
- Chapter 8: Production Deployment
Generated by AI Codebase Knowledge Builder