
LoguardLLM

RAG-Powered Log Threat Detection System using Local LLMs on AMD GPUs.

Overview

LoguardLLM is an intelligent security monitoring tool that leverages Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to detect cybersecurity threats in web server logs. Unlike traditional rule-based systems, LoguardLLM uses semantic understanding to identify both known attack patterns and novel threats.

Features

  • Two-Tier Embedding System: Fast individual log embedding (mxbai) + rich context embedding (bge-m3)
  • Real-time Buffered Processing: Immediate embedding with intelligent batch analysis
  • RAG-Based Analysis: Multi-collection retrieval (individual logs + groups + summaries)
  • Intelligent Grouping: Hardcoded grouping by status codes and IP/CIDR blocks for DDoS detection
  • Log Sentencification: Human-readable key-value format for better embeddings
  • Static Asset Filtering: Configurable exclusion of CSS/JS/images to focus on real threats
  • TTL-Based Cleanup: Automatic deletion of old embeddings to prevent unbounded growth
  • Local LLM Inference: Privacy-preserving analysis on AMD GPUs via Ollama
  • Continuous Monitoring: Tail -f behavior with file watcher
  • Django Dashboard: Web interface for viewing threats and configuration
  • Extensible Architecture: Easy to add new log formats and LLM models

Architecture

Two-step process for log analysis (current, real-time)

Log analysis runs in two steps that repeat continuously:

  1. Step 1 – Ingest & index (per log)
    Each new log line is: parsed → sentencified (human-readable key-value) → embedded with a fast model (e.g. mxbai) → stored in ChromaDB (individual_logs). The same log is appended to an in-memory buffer. No LLM call yet; this step is optimized for speed.

  2. Step 2 – Analyze (when buffer is full)
    When the buffer reaches the configured size (e.g. 50 logs):

    • Buffer is flushed and logs are grouped (e.g. by 4xx/5xx and by IP/CIDR).
    • Group summaries are generated and embedded with a context model (e.g. bge-m3) and stored (log_chunks).
    • RAG retrieves recent context from individual_logs, log_chunks, and analysis_summaries (time-windowed).
    • The LLM is called with the current batch of raw logs plus the retrieved context.
    • The analysis summary is embedded and stored (analysis_summaries); threats are written to the database.

So: Step 1 = continuous per-log indexing (fast embed + buffer); Step 2 = periodic batch analysis (group → RAG → LLM → store results).
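The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code — the class, callback, and buffer-size names are hypothetical:

```python
# Sketch of the two-step pipeline: Step 1 buffers each log immediately
# (where the fast parse/sentencify/embed work would happen); Step 2 flushes
# the buffer for batch analysis (group -> RAG -> LLM) once it is full.
BUFFER_SIZE = 50  # default buffer size from the README

class Pipeline:
    def __init__(self, analyze):
        self.buffer = []
        self.analyze = analyze   # Step 2 callback: group -> RAG -> LLM -> store
        self.batches = 0

    def ingest(self, log_line):
        # Step 1: per-log work (parse, sentencify, fast-embed) would go here.
        self.buffer.append(log_line)
        if len(self.buffer) >= BUFFER_SIZE:
            batch, self.buffer = self.buffer[:], []  # flush the full buffer
            self.analyze(batch)                      # Step 2: batch analysis
            self.batches += 1

p = Pipeline(analyze=lambda batch: None)
for i in range(120):
    p.ingest(f"log {i}")
# 120 logs with a buffer of 50 -> two batch analyses, 20 logs still buffered
```

Note that Step 1 never blocks on the LLM: analysis latency only affects how quickly a full buffer is processed, not ingestion.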

How it was done earlier (batch / legacy, removed)

The old batch mode used a different pipeline:

  1. Indexing (one-shot):
    All logs were chunked (e.g. fixed-size or “intelligent” by time/IP/status) into LogChunk objects. Each chunk (not each log) was embedded with a single embedding model and stored in one ChromaDB collection. There was no per-log index and no sentencification (chunks used a simple METHOD path | Status: X | IP: ... style text).

  2. Analysis (separate run):
    The analyzer retrieved pre-built chunks via RAG and passed only those chunks to the LLM. There was no “current batch” of raw logs in the prompt, no group summaries, and no stored analysis summaries for follow-up context. You ran “index everything” once, then “analyze” (optionally on a schedule).

So the earlier approach was: one-time chunk-and-embed, then RAG over chunks → LLM. The current approach is: continuous per-log indexing + buffer, then group → RAG over logs + groups + past summaries → LLM with current raw logs.

Two-Tier Embedding System

Log File (continuous)
    ↓
File Watcher (tail -f)
    ↓
CLF Parser → Sentencify (key-value format)
    ↓                           ↓
Buffer (50 logs)        Fast Embedder (mxbai)
    ↓                           ↓
Intelligent Grouper     ChromaDB: individual_logs
    ↓
Group Summaries
    ↓
Context Embedder (bge-m3)
    ↓
ChromaDB: log_chunks
    ↓
RAG Retriever (30-60 min window)
    ↓
Threat Detector (LLM + RAG context)
    ↓
Analysis Summary → Context Embedder (bge-m3)
    ↓
ChromaDB: analysis_summaries
    ↓
SQLite DB → Django Dashboard
    ↑
TTL Cleanup Thread (auto-delete old embeddings)
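The "Sentencify" step in the diagram can be illustrated with a short sketch. The field names and phrasing below are assumptions for illustration, not LoguardLLM's exact output format:

```python
# Minimal sketch of log "sentencification": turning a parsed log entry into
# a human-readable key-value sentence, which tends to embed better than the
# raw CLF line. Field names and wording here are illustrative only.
def sentencify(entry: dict) -> str:
    parts = [
        f"ip is {entry['ip']}",
        f"method is {entry['method']}",
        f"path is {entry['path']}",
        f"status is {entry['status']}",
        f"size is {entry['size']} bytes",
    ]
    return ", ".join(parts)

entry = {"ip": "203.0.113.7", "method": "GET", "path": "/wp-login.php",
         "status": 401, "size": 512}
print(sentencify(entry))
# -> ip is 203.0.113.7, method is GET, path is /wp-login.php, status is 401, size is 512 bytes
```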

Prerequisites

  • Docker and Docker Compose
  • AMD GPU with ROCm support (e.g. RX 9070 XT) — or CPU-only (slower)
  • At least 16GB RAM
  • 20GB free disk space

Local Setup

Follow these steps to run LoguardLLM on your machine.

1. Clone the repository

git clone git@github.com:codingsasi/loguard_llm.git loguard_llm
cd loguard_llm

Or if you already have the project:

cd /path/to/loguard_llm

2. Start services with Docker Compose

docker compose up

This starts:

  • loguard_ollama — Ollama with ROCm (AMD GPU support) for LLM and embeddings
  • loguard_app — Django app (migrations, admin, dashboard, analyzer)

Check that both containers are running:

docker compose ps

3. Pull required models (Ollama)

The system uses 3 models in total: 1 base LLM (threat analysis) + 2 embedding models (fast tier + context tier).

# 1) Base LLM for threat analysis (pick one; run `ollama list` for sizes)
docker exec -it loguard_ollama ollama pull mistral:7b-instruct
docker exec -it loguard_ollama ollama pull qwen2.5:7b-instruct
docker exec -it loguard_ollama ollama pull llama3.1:8b
docker exec -it loguard_ollama ollama pull llama3.1:8b-instruct-q8_0
docker exec -it loguard_ollama ollama pull codellama:13b
docker exec -it loguard_ollama ollama pull qwen2.5:14b-instruct
docker exec -it loguard_ollama ollama pull qwen2.5:32b-instruct
docker exec -it loguard_ollama ollama pull deepseek-r1:14b
docker exec -it loguard_ollama ollama pull llama3.1:70b-instruct-q4_K_M
docker exec -it loguard_ollama ollama pull athene-v2:latest

# 2) Fast embedding model (individual logs)
docker exec -it loguard_ollama ollama pull mxbai-embed-large                         # 669 MB

# 3) Context embedding model (groups/summaries)
docker exec -it loguard_ollama ollama pull bge-m3                                    # 1.2 GB

Verify models:

docker exec loguard_ollama ollama list

4. Initialize Django

# Run migrations
docker exec loguard_app python manage.py migrate

# Create admin user (for dashboard and config)
docker exec -it loguard_app python manage.py createsuperuser

Follow the prompts to set username, email, and password.

5. (Optional) Clear old data (runs, threats, ChromaDB)

To wipe analysis runs, threat alerts, and embeddings (config is left unchanged):

docker exec loguard_app python reset_and_configure.py

6. Access the dashboard

Open http://localhost:8000 and log in with the superuser account you created.

7. Run the analyzer

docker exec -it loguard_app python manage.py run_analyzer

The analyzer runs in real-time only: it watches the log file, buffers entries, and analyzes with LLM + RAG. Log file path and other options are in Django admin under Config (see Configuration).

Summary: minimal local setup

cd loguard_llm
docker compose up -d
docker exec loguard_ollama ollama pull qwen2.5:7b-instruct
docker exec loguard_ollama ollama pull mxbai-embed-large
docker exec loguard_ollama ollama pull bge-m3
docker exec loguard_app python manage.py migrate
docker exec -it loguard_app python manage.py createsuperuser
# Then open http://localhost:8000 and run: python manage.py run_analyzer

Configuration

Edit the configuration via Django admin at http://localhost:8000/admin/engine/config/.

Analysis Settings

  • Enabled: Toggle analysis on/off
  • Analysis Interval: How often to analyze logs (default: 30 seconds)
  • Max Logs Per Analysis: Maximum logs to retrieve per analysis (default: 100)

Log Source

  • Log File Path: Path to Nginx access log (e.g. /app/sample_logs/realtime.log or attack_simulation.log)
  • Log Format: CLF (Common Log Format) or Combined/Extended

Filtering Options

  • Filter Static Assets: Exclude CSS/JS/images (default: enabled)
  • Excluded Extensions: File extensions to exclude (e.g. .css,.js,.jpg,.png,.gif,.webp,.svg,.ico,.woff,.ttf,.map,.json,.xml,.mp4,.pdf)
  • Excluded Paths: Path substrings to exclude (legitimate app endpoints). Default: /nodeviewcount,/analytics,/api/tracking,/api/metrics,/heartbeat,/health,/ping. Requests whose path contains any of these are not analyzed.
  • Excluded User Agents: Comma-separated substrings (case-insensitive). Default includes: googlebot,bingbot,msnbot,slurp,duckduckbot,baiduspider,yandexbot,facebookexternalhit,twitterbot,linkedinbot,whatsapp,telegrambot
  • Excluded Status Codes: Comma-separated codes to exclude (e.g. 200,301,302,304 to focus on errors). Leave empty to include all.
  • Excluded Methods: HTTP methods to exclude (e.g. HEAD,OPTIONS). Leave empty to include all.
  • Included Methods Override: Methods to always include even if their status code is excluded (e.g. POST,PUT,DELETE,PATCH). Use this to still analyze POST 200 OK for spam/abuse while excluding normal GET 200 traffic.
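The filtering rules above combine as a single keep/drop decision per request. A minimal sketch, with defaults abbreviated from this README (function and constant names are illustrative, not the project's actual config schema):

```python
# Sketch of the filtering logic: extensions, paths, user agents, and status
# codes exclude a request, but the method override can keep it in even when
# its status code is excluded (e.g. still analyze POST 200 for abuse).
import os

EXCLUDED_EXTENSIONS = {".css", ".js", ".jpg", ".png", ".gif", ".svg", ".ico"}
EXCLUDED_PATHS = ["/nodeviewcount", "/analytics", "/heartbeat", "/health", "/ping"]
EXCLUDED_AGENTS = ["googlebot", "bingbot", "yandexbot"]
EXCLUDED_STATUS = {200, 301, 302, 304}
INCLUDED_METHODS_OVERRIDE = {"POST", "PUT", "DELETE", "PATCH"}

def should_analyze(method, path, status, user_agent):
    if os.path.splitext(path)[1].lower() in EXCLUDED_EXTENSIONS:
        return False                                   # static asset
    if any(p in path for p in EXCLUDED_PATHS):
        return False                                   # legitimate app endpoint
    if any(a in user_agent.lower() for a in EXCLUDED_AGENTS):
        return False                                   # known crawler
    if status in EXCLUDED_STATUS and method not in INCLUDED_METHODS_OVERRIDE:
        return False                                   # boring status, no override
    return True

should_analyze("GET", "/style.css", 200, "-")      # False: static asset
should_analyze("GET", "/index.html", 200, "curl")  # False: excluded status
should_analyze("POST", "/login", 200, "curl")      # True: method override
should_analyze("GET", "/admin", 404, "curl")       # True: error status
```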

LLM Configuration (1 of 3 models)

  • Base LLM (threat analysis) — options (run ollama list for sizes):
    • mistral:7b-instruct — 4.4 GB
    • qwen2.5:7b-instruct — 4.7 GB
    • llama3.1:8b — 4.9 GB
    • llama3.1:8b-instruct-q8_0 — 8.5 GB
    • codellama:13b
    • qwen2.5:14b-instruct — 9.0 GB
    • qwen2.5:32b-instruct — 20 GB
    • deepseek-r1:14b
    • llama3.1:70b-instruct-q4_K_M — 42 GB
    • athene-v2:latest — 47 GB

Embedding Models (2 of 3 models – two-tier)

  • Fast Embedding Model: mxbai-embed-large — 669 MB; embeds individual logs
  • Context Embedding Model: bge-m3 — 1.2 GB; embeds groups and summaries

Buffer & RAG Settings

  • Buffer Size: Logs to accumulate before analysis (default: 50)
  • RAG Time Window: Context retrieval window (default: 30 minutes)
  • VectorDB Retention: TTL for embeddings; old data is auto-deleted (default: 60 minutes)

Intelligent grouping (4xx/5xx/IP/CIDR) is hardcoded for consistency.
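The RAG time window and the retention TTL interact as two different cutoffs over the same timestamped embeddings: retrieval only looks inside the window, while cleanup deletes anything past the TTL. A pure-Python sketch of that behaviour (the real system stores these records in ChromaDB with metadata filters; the in-memory store here is only for illustration):

```python
# Sketch of time-windowed retrieval vs. TTL cleanup. Every stored embedding
# carries a timestamp; RAG retrieval ignores anything older than the window,
# and the cleanup thread deletes anything older than the retention TTL.
import time

RAG_WINDOW_S = 30 * 60   # default RAG time window: 30 minutes
RETENTION_S = 60 * 60    # default VectorDB retention: 60 minutes

store = []  # list of {"ts": float, "doc": str} records (stand-in for ChromaDB)

def add(doc, ts=None):
    store.append({"ts": ts if ts is not None else time.time(), "doc": doc})

def rag_context(now):
    # Only documents inside the RAG window are eligible as LLM context.
    return [r["doc"] for r in store if now - r["ts"] <= RAG_WINDOW_S]

def ttl_cleanup(now):
    # Drop anything past the retention TTL to bound vector-store growth.
    store[:] = [r for r in store if now - r["ts"] <= RETENTION_S]
```

With the defaults above, a 45-minute-old embedding is no longer retrieved as context but still survives cleanup until it crosses the 60-minute TTL.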

Usage

Run the analyzer

docker exec -it loguard_app python manage.py run_analyzer

The analyzer runs in real-time only. It will:

  1. Watch the log file continuously (tail -f behavior)
  2. Sentencify each log to human-readable key-value format
  3. Embed immediately with fast model (mxbai) → ChromaDB individual_logs
  4. Buffer logs until the configured buffer size (default 50)
  5. Group by status codes (4xx/5xx) and IP/CIDR blocks
  6. Generate group summaries, embed with context model (bge-m3) → ChromaDB log_chunks
  7. Retrieve RAG context from the last 30–60 minutes (individual + groups + summaries)
  8. Analyze with LLM + RAG context
  9. Embed analysis summary with context model → ChromaDB analysis_summaries
  10. Store threats in the database
  11. Auto-cleanup old embeddings (TTL)

Press Ctrl+C to stop.
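Step 5 above — the hardcoded grouping — can be sketched as follows. Group keys and structure are illustrative; the point is that error-class statuses and /24 CIDR blocks each collect related logs, so a distributed scan from one subnet lands in a single group:

```python
# Sketch of intelligent grouping by status class (4xx/5xx) and /24 CIDR
# block, as described in step 5. Group keys here are illustrative.
import ipaddress
from collections import defaultdict

def group_logs(entries):
    groups = defaultdict(list)
    for e in entries:
        status_class = f"{e['status'] // 100}xx"        # e.g. 404 -> "4xx"
        if status_class in ("4xx", "5xx"):
            groups[("status", status_class)].append(e)  # error-class group
        # /24 block: requests from the same subnet share one group.
        net = ipaddress.ip_network(f"{e['ip']}/24", strict=False)
        groups[("cidr", str(net))].append(e)
    return dict(groups)

entries = [
    {"ip": "203.0.113.5", "status": 404},
    {"ip": "203.0.113.9", "status": 403},
    {"ip": "198.51.100.1", "status": 200},
]
groups = group_logs(entries)
# -> ("status", "4xx") holds both 40x hits; ("cidr", "203.0.113.0/24") also
#    holds both, flagging the subnet; the 200 only appears under its own /24.
```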

Understanding the console output

Each time the buffer fills, you see a block like:

Buffer full (200 logs) - starting batch analysis #12...
  Flushed 200 logs, created 1 group summaries
  Generated 1 group embeddings
  Retrieved RAG context: 19 individual, 5 groups, 2 summaries
  Running LLM analysis...
  ✓ No threats detected
Total processed: 2400 logs, 12 batches
  • Buffer full (200 logs) — The in-memory buffer reached 200 lines (your configured buffer size). Only these 200 logs are analyzed in this batch.
  • Flushed 200 logs, created 1 group summaries — The buffer was cleared; the 200 logs were grouped (e.g. by 4xx/5xx and IP) into one or more text summaries.
  • Generated 1 group embeddings — Each group summary was embedded with the context model (bge-m3) and stored in ChromaDB.
  • Retrieved RAG context — For this batch, RAG pulled recent context: 19 individual log snippets, 5 group summaries, and 2 past analysis summaries from ChromaDB (within the time window). This context is sent to the LLM along with the 200 current logs.
  • Running LLM analysis — The LLM (e.g. Mistral, Llama) is called with the 200 logs + RAG context to detect threats.
  • Total processed: 2400 logs, 12 batches — Cumulative totals since startup: 2400 log lines read from the file so far, and 12 batch analyses run. Each batch still only analyzes 200 logs; the total is a running count, not the size of one request.

So the run is efficient: every batch analyzes only one buffer (e.g. 200 logs). The increasing total is just “how many logs we’ve seen and how many batches we’ve done so far.”
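The pieces named in the console output combine into one LLM prompt per batch. A sketch of how that assembly might look — the section wording and ordering are illustrative, not the project's actual prompt template:

```python
# Sketch of batch prompt assembly: past analysis summaries, group summaries,
# and similar individual logs (all retrieved via RAG) precede the current
# batch of raw logs. Section headings are illustrative.
def build_prompt(current_logs, individual, groups, summaries):
    sections = [
        "You are a security analyst. Identify threats in the logs below.",
        "## Past analysis summaries\n" + "\n".join(summaries),
        "## Related group summaries\n" + "\n".join(groups),
        "## Similar individual logs\n" + "\n".join(individual),
        "## Current batch\n" + "\n".join(current_logs),
    ]
    return "\n\n".join(sections)

prompt = build_prompt(
    current_logs=["203.0.113.7 GET /wp-login.php 401"],
    individual=["203.0.113.7 GET /xmlrpc.php 404"],
    groups=["Group: 2 x 4xx from 203.0.113.0/24"],
    summaries=["Previous batch: no threats detected"],
)
```

Only the current buffer's raw logs appear in full; everything else is compact retrieved context, which is where the token savings come from.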

View results

Web Dashboard

Navigate to http://localhost:8000/ to see:

  • Overview metrics (total threats, recent analysis runs)
  • Threat alerts with details
  • Analysis run history

Threat Details

Visit http://localhost:8000/threats/ to see detected threats with:

  • Threat type (brute force, DDoS, SQL injection, path traversal, reconnaissance, bot activity)
  • Severity level (critical, high, medium, low)
  • Confidence score (0.0-1.0)
  • Description and evidence (actual log entries)
  • Actionable recommendations

Development

Project Structure

loguard_llm/
├── docker-compose.yml          # Container orchestration
├── Dockerfile                  # App container definition
├── requirements.txt            # Python dependencies
├── manage.py                   # Django management script
├── loguard/                    # Django project settings
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
├── engine/                     # Core analysis engine
│   ├── models.py              # Django models
│   ├── collector/             # Log collection
│   ├── indexer/               # RAG indexing
│   ├── llm/                   # LLM providers
│   └── analyzer/              # Threat detection
├── dashboard/                  # Web interface
│   ├── views.py
│   ├── urls.py
│   └── templates/
└── docs/                       # Documentation
    └── phase1_proposal.md

Helper scripts

  • reset_and_configure.py — Clears threat alerts, analysis runs, and ChromaDB embeddings only. Does not change Config (edit filters/LLM in Django admin):
    docker exec loguard_app python reset_and_configure.py
  • view_threats.py — Prints latest threat alerts and evidence from the database:
    docker exec loguard_app python view_threats.py
  • configure_filters.py — Applies a predefined filter set (excluded status codes, method overrides) without clearing data.
  • inspect_chroma (management command) — Show ChromaDB contents (counts and sample documents). Use to verify what is in RAG context (e.g. after a clear, or to see if old excluded paths like /nodeviewcount appear):
    docker exec loguard_app python manage.py inspect_chroma
    docker exec loguard_app python manage.py inspect_chroma --samples=10 --check-excluded

Supported Log Formats

  • Common Log Format (CLF)
  • NCSA Extended/Combined Log Format
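A CLF line can be parsed with a standard regular expression; the Combined/Extended format simply appends quoted referrer and user-agent fields. The pattern below is a common CLF regex, not necessarily LoguardLLM's exact implementation:

```python
# Sketch of a Common Log Format parser. The Combined format adds
# "referrer" "user-agent" after the size field; extending the regex with
# two more quoted groups covers it.
import re

CLF = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf(line):
    m = CLF.match(line)
    return m.groupdict() if m else None

line = '127.0.0.1 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
parse_clf(line)
# -> {'ip': '127.0.0.1', 'time': '10/Oct/2024:13:55:36 +0000', 'method': 'GET',
#     'path': '/index.html', 'proto': 'HTTP/1.1', 'status': '200', 'size': '2326'}
```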

Threat Detection Categories

  • Brute Force: Multiple failed authentication attempts
  • DDoS/DoS: High request rates from single or distributed sources
  • SQL Injection: Malicious SQL patterns in URLs
  • Path Traversal: Directory traversal attempts
  • Reconnaissance: Scanning and probing behavior
  • Bot Activity: Suspicious automated traffic

Performance

With Intelligent Chunking + Static Filtering

  • Log Reduction: 75-90% fewer logs to analyze (filters static assets)
  • Analysis Latency: ~10-30 seconds per batch (50-100 logs)
  • Throughput: 500-1000+ logs/minute
  • Token Efficiency: 85-95% reduction vs. sending all logs to LLM
  • GPU Memory:
    • Mistral 7B: ~4GB VRAM
    • Qwen 2.5 7B: ~5GB VRAM
    • Llama 3.1 70B (quantized q4): ~40GB+ VRAM (or large RAM if CPU offload)

Recommended Settings for Performance

  • Chunk Size: 10-20 logs (smaller = faster embeddings)
  • Max Logs/Analysis: 50-100 (balance between coverage and speed)
  • Model: qwen2.5:7b-instruct for best speed/quality balance
  • Embedding: nomic-embed-text for speed, bge-m3 for larger context

Troubleshooting

Ollama not starting

Check GPU access:

ls -la /dev/kfd /dev/dri

Verify ROCm installation:

rocm-smi

Embedding context length errors

If you see the input length exceeds the context length:

  1. Reduce chunk size: Set to 5-10 logs in Django admin
  2. Use smaller log file: Test with attack_simulation.log (1000 lines) instead of access.log (280k lines)
  3. Use larger embedding model: Switch to bge-m3 (8K context)

ChromaDB errors / stale RAG context

If threat alerts show target paths like /nodeviewcount or /analytics even though those paths are in Excluded paths, old ChromaDB data may still be in RAG context. Inspect and clear:

  1. Inspect what is in ChromaDB:

    docker exec loguard_app python manage.py inspect_chroma --check-excluded

    Samples that contain excluded-path hints are flagged.

  2. Clear the vector database and re-run the analyzer:

    docker exec loguard_app python reset_and_configure.py
    docker exec -it loguard_app python manage.py run_analyzer
  3. If the problem persists (e.g. some Chroma versions leave data after delete_collection), do a force clear (deletes the ChromaDB data directory and reinitializes):

    docker exec loguard_app python reset_and_configure.py --force
    docker exec -it loguard_app python manage.py run_analyzer

Slow LLM inference

If analysis is taking too long:

  1. Use faster model: Switch to mistral:7b-instruct or qwen2.5:7b-instruct
  2. Reduce buffer size: Set buffer size to 25–30 in Django admin
  3. Enable static filtering: Reduces number of logs to analyze by 75%+

Ollama 500 error (Llama 3.1 70B or other large models)

If you see 500 Internal Server Error for http://ollama:11434/api/generate when using a 70B model:

  1. Check Ollama’s error message — The app now surfaces Ollama’s response body (e.g. out of memory, context length). Look at the log line following “Ollama generate error:” for the exact reason.

  2. Resource requirements — Llama 3.1 70B (e.g. llama3.1:70b-instruct-q4_K_M) needs ~40GB+ VRAM or substantial RAM if offloaded to CPU. Ensure your host has enough memory and that the Ollama container can use it.

  3. Reduce load — Lower the buffer size in Django admin (e.g. 50–100) so each request sends fewer logs and a shorter prompt.

  4. Check Ollama logs:

    docker logs loguard_ollama

    Look for OOM, CUDA/ROCm errors, or “context length” messages.

  5. Use a smaller model — If resources are limited, use llama3.1:8b or qwen2.5:7b-instruct for analysis.

Low detection accuracy

Try a larger model:

docker exec -it loguard_ollama ollama pull qwen2.5:14b-instruct

Update the model in Django admin.

No logs collected

Check log file path and permissions:

docker exec loguard_app ls -la /app/sample_logs/

Ensure the log file exists and is readable.

License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0).
You may use, modify, and distribute it under the terms of the GPL v3. See the LICENSE file for the full text.

Acknowledgments

  • Drupalgeddon 2.0 case study for motivation
  • Ollama team for local LLM runtime
  • ChromaDB for vector database
  • Django community
