
layout	title	nav_order	has_children
default	Haystack Tutorial	23	true

Haystack: Deep Dive Tutorial

Project: Haystack — An open-source framework for building production-ready LLM applications, RAG pipelines, and intelligent search systems.

What Is Haystack?

Haystack is an open-source LLM framework by deepset for building composable AI pipelines. It provides a modular, component-based architecture that combines retrieval, generation, and evaluation into production-ready workflows. Haystack supports dozens of LLM providers, vector databases, and retrieval strategies, some in the core package and many through separately installed integration packages.

Feature	Description
Pipeline System	Directed graph of components with typed inputs/outputs and automatic validation
RAG	First-class retrieval-augmented generation with hybrid search (BM25 + embedding)
Multi-Provider	OpenAI, Anthropic, Cohere, Google, Hugging Face, Ollama, and more
Document Stores	In-memory, Elasticsearch, OpenSearch, Pinecone, Qdrant, Weaviate, Chroma, pgvector
Evaluation	Built-in metrics (MRR, MAP, NDCG) and LLM-based evaluation components
Custom Components	`@component` decorator for building reusable pipeline nodes with typed I/O
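
The Evaluation row lists MRR among the retrieval metrics. As a rough illustration of what that metric measures, here is a minimal plain-Python sketch of mean reciprocal rank. It does not use Haystack's evaluator components; the function names and the sample document IDs are hypothetical and exist only for this example.

```python
from typing import Sequence, Set


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Return 1/rank of the first relevant document, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    runs: Sequence[Sequence[str]], truths: Sequence[Set[str]]
) -> float:
    """Average the reciprocal rank over all queries (0.0 for an empty run list)."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, t) for r, t in zip(runs, truths)) / len(runs)


# Hypothetical retrieval results for two queries (IDs are made up).
runs = [["d3", "d1", "d7"], ["d2", "d9", "d4"]]
truths = [{"d1"}, {"d4"}]

# Query 1 hits at rank 2 (0.5), query 2 at rank 3 (1/3).
mrr = mean_reciprocal_rank(runs, truths)
print(f"MRR = {mrr:.4f}")
```

A higher value means the first relevant document tends to appear nearer the top of the ranking; Chapter 6 covers the framework's own evaluation components.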

Current Snapshot (auto-updated)

repository: deepset-ai/haystack
stars: about 24.4k
latest release: v2.25.2 (published 2026-03-05)

Architecture Overview

graph TB
    subgraph Ingestion["Ingestion Pipeline"]
        FILES[File Converters]
        SPLIT[Document Splitter]
        EMBED_D[Document Embedder]
        WRITER[Document Writer]
    end

    subgraph Store["Document Stores"]
        MEM[In-Memory]
        ES[Elasticsearch]
        PG[pgvector]
        VEC[Pinecone / Qdrant / Weaviate]
    end

    subgraph Query["Query Pipeline"]
        EMBED_Q[Query Embedder]
        BM25[BM25 Retriever]
        EMB_RET[Embedding Retriever]
        JOINER[Document Joiner]
        RANKER[Ranker]
        PROMPT[Prompt Builder]
        GEN[Generator / LLM]
    end

    FILES --> SPLIT --> EMBED_D --> WRITER
    WRITER --> Store

    Store --> BM25
    Store --> EMB_RET
    EMBED_Q --> EMB_RET
    BM25 --> JOINER
    EMB_RET --> JOINER
    JOINER --> RANKER --> PROMPT --> GEN
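
In the diagram, the BM25 and embedding retrievers both feed a Document Joiner. One common way to merge two ranked lists is reciprocal rank fusion (RRF). The sketch below is a generic, plain-Python version of that technique, offered as an assumption about how such a join can work rather than as Haystack's implementation. The function name, the constant `k = 60`, and the document IDs are illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers in the diagram.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
embedding_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
print(fused)
```

Documents that appear in both lists accumulate two contributions, so they rise above documents found by only one retriever. A ranker can then reorder the fused list before prompt building.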

Tutorial Structure

Chapter	Topic	What You'll Learn
1. Getting Started	Setup	Installation, first RAG pipeline, architecture overview
2. Document Stores	Storage	Store backends, indexing, preprocessing, multi-store patterns
3. Retrievers & Search	Retrieval	BM25, embedding, hybrid search, filtering, re-ranking
4. Generators & LLMs	Generation	Multi-provider LLMs, prompt engineering, streaming, chat
5. Pipelines & Workflows	Composition	Pipeline graph, branching, loops, serialization, async
6. Evaluation & Optimization	Quality	Retrieval metrics, LLM evaluation, A/B testing, optimization
7. Custom Components	Extensibility	@component decorator, typed I/O, testing, packaging
8. Production Deployment	Operations	REST API, Docker, Kubernetes, monitoring, scaling

Tech Stack

Component	Technology
Language	Python 3.9+
Pipeline Engine	Custom directed graph with topological execution
Serialization	YAML / JSON pipeline definitions
Embeddings	Sentence Transformers, OpenAI, Cohere, Fastembed
Vector Search	FAISS, Pinecone, Qdrant, Weaviate, Chroma, pgvector
Text Search	Elasticsearch, OpenSearch, BM25 (in-memory)
LLM Providers	OpenAI, Anthropic, Google, Cohere, Hugging Face, Ollama
API Layer	Hayhooks (FastAPI-based pipeline serving)
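
The Pipeline Engine row mentions topological execution over a directed graph. As a hedged illustration of that idea, the sketch below orders the query-pipeline nodes from the architecture diagram with the standard library's `graphlib`. It is not Haystack's engine: the node names and the `execution_order` helper are hypothetical, and a real pipeline would also pass typed outputs between components.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Map each node to the set of nodes it depends on.
# Names loosely mirror the query pipeline in the architecture diagram.
dependencies: Dict[str, Set[str]] = {
    "bm25_retriever": set(),
    "query_embedder": set(),
    "embedding_retriever": {"query_embedder"},
    "joiner": {"bm25_retriever", "embedding_retriever"},
    "ranker": {"joiner"},
    "prompt_builder": {"ranker"},
    "generator": {"prompt_builder"},
}


def execution_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Return one valid run order in which every node follows all of its dependencies."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
for step, name in enumerate(order, start=1):
    print(step, name)
```

Several orders can be valid (the two retriever branches are independent), so only the relative positions of dependent nodes are guaranteed.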

Ready to begin? Start with Chapter 1: Getting Started.

Built with insights from the Haystack repository and community documentation.

Navigation & Backlinks

Full Chapter Map

Source References

Haystack

Generated by AI Codebase Knowledge Builder
