
Prolog-RAG: Formal Logic for Financial Reasoning


A hybrid RAG system bridging the gap between semantic retrieval and symbolic precision.




🛑 The Problem with Traditional RAG

Standard Vector-based RAG architectures consistently fail in the financial domain because:

  • Numerical Hallucinations: LLMs struggle with multi-step arithmetic, leading to incorrect calculations for growth rates, total costs, and margins.
  • Logical Multi-Hop Gaps: Information scattered across fragmented documents (e.g., across 2012–2014 SEC filings) results in "lost-in-the-middle" reasoning errors.
  • Mathematical Imprecision: Simple semantic search cannot handle complex constraints or temporal comparative logic (e.g., "Find the year with the lowest margin").
  • Opaque Reasoning: Answers are generated as "black boxes," providing no verifiable audit trail, a critical failure for regulatory compliance.

🚀 The Solution: Prolog-RAG

Prolog-RAG addresses these issues by offloading reasoning from the LLM to a symbolic logic engine.

  • Symbolic Fact Extraction: Uses LLMs to turn natural language context into structured Prolog predicates (revenue(aal, 2023, 500)).
  • Deterministic Reasoning: Executes complex financial rules (growth, margin, threshold analysis) through a symbolic Prolog backend.
  • Exact Arithmetic: Performs calculations in the Prolog engine rather than in generated text, removing the arithmetic and rounding errors of LLM-based synthesis. Accuracy still depends on the facts being extracted correctly.
  • Verifiable Proof Traces: Every answer comes with a step-by-step logic trace, showing exactly which financial facts and reasoning rules were used to derive the result.
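
Because the facts are emitted by an LLM, it can help to validate each one before it reaches the logic engine. The following is a minimal sketch of such a check, not the project's actual extractor: the names Fact, FACT_RE, and parse_fact are hypothetical, and the pattern assumes every fact has the three-argument shape predicate(entity, year, value) shown above.

```python
import re
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """Hypothetical container for one extracted fact."""
    predicate: str
    entity: str
    year: int
    value: float


# Assumed shape: predicate(entity, year, value), with an optional trailing period.
FACT_RE = re.compile(
    r"""^\s*
    (?P<predicate>[a-z][A-Za-z0-9_]*)       # predicate name, e.g. revenue
    \(\s*
    (?P<entity>'[^']*'|[a-z][A-Za-z0-9_]*)  # atom, optionally single-quoted
    \s*,\s*
    (?P<year>\d{4})                         # four-digit year
    \s*,\s*
    (?P<value>-?\d+(?:\.\d+)?)              # integer or decimal amount
    \s*\)\s*\.?\s*$""",
    re.VERBOSE,
)


def parse_fact(text: str) -> Optional[Fact]:
    """Return a Fact if the text is a well-formed fact, otherwise None."""
    match = FACT_RE.match(text)
    if match is None:
        return None
    return Fact(
        predicate=match["predicate"],
        entity=match["entity"].strip("'"),
        year=int(match["year"]),
        value=float(match["value"]),
    )


if __name__ == "__main__":
    print(parse_fact("revenue(aal, 2023, 500)"))
    print(parse_fact("revenue was about 500"))  # not a fact, so None
```

Rejecting malformed output at this point keeps free-form LLM text out of the knowledge base, where it would otherwise surface as a Prolog syntax error or a silently missing fact.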

🏗️ System Architecture

Prolog-RAG uses a hybrid routing mechanism to ensure high precision for structured queries while maintaining the flexibility of semantic search.

       ┌───────────────┐
       │   User Query  │
       └───────┬───────┘
               ▼
       ┌───────────────┐
       │ Query Router  │ (LLM-Based Decision Entry)
       └───────┬───────┘
               │
      ┌────────┴────────┐
      ▼ (Arithmetic)    ▼ (Semantic)
┌───────────────┐   ┌───────────────┐
│  Prolog Path  │   │  Vector Path  │
├───────────────┤   ├───────────────┤
│ Fact Extract  │   │ Chroma Search │
│ Logic Engine  │   │ LLM Synthesis │
└────────┬──────┘   └───────┬───────┘
         │                  │
         └────────┬─────────┘
                  ▼
          ┌───────────────┐
          │  Final Answer │ (With Proof Trace if Prolog)
          └───────────────┘
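
The routing decision in the diagram can be sketched as a function that maps a query to one of the two paths. The project describes its router as LLM-based; the keyword heuristic below is only a stand-in that illustrates the contract (query in, path name out). The names ARITHMETIC_CUES and route_query, and the cue list itself, are assumptions made for this example.

```python
import re

# Hypothetical cue words suggesting a calculation or comparison is needed.
ARITHMETIC_CUES = re.compile(
    r"\b(growth|margin|ratio|total|difference|average|lowest|highest|"
    r"percent|percentage|how much|compare)\b",
    re.IGNORECASE,
)


def route_query(query: str) -> str:
    """Return 'prolog' for arithmetic/logical queries, otherwise 'vector'."""
    return "prolog" if ARITHMETIC_CUES.search(query) else "vector"


if __name__ == "__main__":
    print(route_query("What was the gross profit margin for the company in 2017?"))
    print(route_query("Summarize the company's risk factors."))
```

A real router would replace the regular expression with an LLM call but could keep the same return values, so the rest of the pipeline does not need to know how the decision was made.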

🔍 Example: The "Audit Proof" Difference

Query: "What was the gross profit margin for the company in 2017?"

🟒 Prolog-RAG (Reasoning Path)

Answer: "The gross profit margin for 2017 was 18.92%."
Verification Trace:

1. [Extract] revenue(company, 2017, 3314.0).
2. [Extract] cost_of_sales(company, 2017, 2687.0).
3. [Rule]    gross_profit(C, Y, GP) :- revenue(C, Y, R), cost_of_sales(C, Y, S), GP is R - S.
4. [Rule]    margin(C, Y, M) :- gross_profit(C, Y, G), revenue(C, Y, R), M is (G/R)*100.
5. [Execute] M is ((3314 - 2687) / 3314) * 100 = 18.9197...
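
The arithmetic in the [Execute] step can be reproduced independently, which is the point of having a trace. The snippet below is a plain Python sketch of the two rules, not the Prolog code itself; the function names are chosen for this example. The inputs are the two extracted facts from the trace.

```python
def gross_profit(revenue: float, cost_of_sales: float) -> float:
    """Mirror of the gross_profit rule: revenue minus cost of sales."""
    return revenue - cost_of_sales


def margin_pct(revenue: float, cost_of_sales: float) -> float:
    """Mirror of the margin rule: gross profit as a percentage of revenue."""
    if revenue == 0:
        raise ValueError("margin is undefined when revenue is zero")
    return gross_profit(revenue, cost_of_sales) / revenue * 100


if __name__ == "__main__":
    m = margin_pct(3314.0, 2687.0)
    print(f"{m:.4f}")   # 18.9197
    print(f"{m:.2f}%")  # 18.92%
```

Rounded to two decimal places the result is 18.92%; truncating instead of rounding would give 18.91%.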

🔴 Traditional RAG (Semantic Path)

Answer: "The company reported a strong gross margin in 2017, approximately 19% based on the consolidated statements."
Verification Trace:

❌ None. Source of the "19%" figure is opaque and subject to LLM rounding/estimation.



📊 Benchmark Results

Evaluated on our Grounded Financial QA Suite (10 high-stakes financial reasoning questions):

| System | Avg Accuracy Score | Proof Trace Availability | Best For |
|---|---|---|---|
| 🟢 Prolog-RAG | 4.6 / 5 | 100% (10/10) | Logic, Arithmetic, Auditing |
| 🔵 Contextual RAG | 4.8 / 5 | 0% | General Semantic Lookups |
| 🔴 Naive RAG | 3.2 / 5 | 0% | Basic FAQ retrieval |
| 🟣 Graph RAG | 0.7 / 5 | 0% | Complex entity mapping |


🛠️ How It Works

  1. Query Input: The user provides a natural language financial query.
  2. Hybrid Routing: An LLM analyzes the query to determine if it is Semantic (FAQ/Summary) or Arithmetic/Logical (Calculations/Multi-hop).
  3. Fact Extraction: For logical queries, the system retrieves relevant document chunks and extracts structured financial facts (e.g., revenue(co, 2023, 500)).
  4. Symbolic Unification: The facts are asserted into a SWI-Prolog knowledge base alongside domain-specific financial reasoning rules.
  5. Logic Execution: The Prolog engine executes a symbolic query to derive the exact numerical answer or logical conclusion.
  6. Answer Synthesis: The system generates a natural language answer, appending a Proof Trace for full transparency and explainability.
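
Steps 4 and 5 can be illustrated with a small in-memory stand-in for the knowledge base. This is a sketch in plain Python rather than SWI-Prolog: FactStore, MissingFact, and op_income_growth are hypothetical names, and the rule is assumed to be simple year-over-year percentage growth. The property it tries to show is that a rule either finds every fact it needs or fails outright, instead of estimating a missing number.

```python
from typing import Dict, Tuple

FactKey = Tuple[str, str, int]  # (predicate, entity, year)


class MissingFact(LookupError):
    """Raised when a rule needs a fact that was never asserted."""


class FactStore:
    """Toy stand-in for the Prolog knowledge base (illustration only)."""

    def __init__(self) -> None:
        self._facts: Dict[FactKey, float] = {}

    def assert_fact(self, predicate: str, entity: str, year: int, value: float) -> None:
        self._facts[(predicate, entity, year)] = value

    def get(self, predicate: str, entity: str, year: int) -> float:
        try:
            return self._facts[(predicate, entity, year)]
        except KeyError:
            raise MissingFact(f"{predicate}({entity}, {year}, _)") from None


def op_income_growth(store: FactStore, entity: str, start: int, end: int) -> float:
    """Percentage growth in operating income between two years."""
    old = store.get("operating_income", entity, start)
    new = store.get("operating_income", entity, end)
    return (new - old) / old * 100


if __name__ == "__main__":
    store = FactStore()
    store.assert_fact("operating_income", "Technical Solutions", 2017, 21.0)
    store.assert_fact("operating_income", "Technical Solutions", 2018, 32.0)
    print(round(op_income_growth(store, "Technical Solutions", 2017, 2018), 2))  # 52.38
```

Asking for a year with no asserted fact raises MissingFact, which a caller could turn into a "cannot answer from the extracted facts" response instead of a guessed figure.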

🏗️ Project Structure

prolog-rag/
├── prolog_rag_project/
│   ├── core/           # Hybrid pipeline, Query Router, NL-to-Prolog translator
│   ├── baselines/      # Naive, Graph, CRAG, and Contextual RAG implementations
│   └── utils/          # Auto-Evaluator, Reporting, & Visualization tools
├── benchmarks/         # Data generators for NIAH, HotpotQA, and FRAMES
├── docs/               # Comparative analysis and PRD documentation
├── assets/             # Performance charts and visualizations
├── demo_app.py         # Streamlit Interactive Demo
└── arena.py            # Unified benchmarking arena

💻 Tech Stack

| Component | Technology | Role |
|---|---|---|
| Language | Python 3.11+ | Orchestration & Pipeline |
| Logic Engine | SWI-Prolog 9.x | Symbolic Reasoning & Arithmetic |
| LLM (Backbone) | Llama 3.1 (via Groq) | Fact Extraction & Query Routing |
| Vector Database | ChromaDB | Semantic Retrieval & Context Management |
| Embeddings | Sentence-Transformers | Vectorizing Financial Documents |
| Visualization | Matplotlib | Performance & Benchmark Charting |

🛠️ Installation & Setup

Prerequisites

  • Python 3.11 or higher.
  • SWI-Prolog installed and added to your system PATH.
  • A Groq API key (set in .env).

Step-by-Step

  1. Clone the Repo:
    git clone https://github.com/RedLordezh7Venom/prolog-RAG.git && cd prolog-RAG
  2. Install Dependencies:
    uv pip install -e .
  3. Environment Setup: Create a .env file in the root and add your Groq API key:
    GROQ_API_KEY=your_key_here
  4. Run the Benchmark:
    uv run python arena.py
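
If the benchmark fails at startup, a quick preflight check can confirm the prerequisites listed above. The script below is a hypothetical helper, not part of the repository. It looks for the SWI-Prolog executable (swipl) on PATH and for GROQ_API_KEY in the process environment. It reads the environment only, so it assumes the .env file has already been loaded or the variable has been exported in the shell.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def missing_prerequisites(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a human-readable list of unmet prerequisites (empty if all are met)."""
    problems: List[str] = []
    if which("swipl") is None:
        problems.append("SWI-Prolog executable 'swipl' not found on PATH")
    if not env.get("GROQ_API_KEY"):
        problems.append("GROQ_API_KEY is not set in the environment")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    if issues:
        for issue in issues:
            print(f"missing: {issue}")
    else:
        print("all prerequisites found")
```

The env and which parameters exist only so the check can be exercised without touching the real system.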

📜 Explainability Trace Example

Query: "What was the growth in Technical Solutions operating income from 2017 to 2018?"
Trace:
  -> fact: operating_income('Technical Solutions', 2017, 21.0)
  -> fact: operating_income('Technical Solutions', 2018, 32.0)
  -> rule: op_income_growth(Company, 2017, 2018, Growth)
  -> calc: (32.0 - 21.0) / 21.0 * 100 = 52.38%
Answer: "The operating income grew by 52.4%."
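
A trace in this layout can be produced from a list of structured steps. The sketch below assumes each step is a (kind, text) pair, where kind is one of fact, rule, or calc. render_trace is a hypothetical helper written for this example, not the project's reporting code.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (kind, text), where kind is "fact", "rule", or "calc"


def render_trace(query: str, steps: List[Step], answer: str) -> str:
    """Format a query, its reasoning steps, and the answer as a text trace."""
    lines = [f'Query: "{query}"', "Trace:"]
    lines += [f"  -> {kind}: {text}" for kind, text in steps]
    lines.append(f'Answer: "{answer}"')
    return "\n".join(lines)


if __name__ == "__main__":
    old, new = 21.0, 32.0
    growth = (new - old) / old * 100
    steps = [
        ("fact", f"operating_income('Technical Solutions', 2017, {old})"),
        ("fact", f"operating_income('Technical Solutions', 2018, {new})"),
        ("rule", "op_income_growth(Company, 2017, 2018, Growth)"),
        ("calc", f"({new} - {old}) / {old} * 100 = {growth:.2f}%"),
    ]
    print(render_trace(
        "What was the growth in Technical Solutions operating income from 2017 to 2018?",
        steps,
        f"The operating income grew by {growth:.1f}%.",
    ))
```

Keeping the steps as data until the last moment means the same trace could also be emitted as JSON for an audit log, with the text form reserved for display.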

🀝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

About

Experimental setup that hooks LLMs to a logic engine for proof traces and financial math.
