Security: AreteDriver/memboot

Security Policy

Supported Versions

Version          Supported
0.7.x            Yes
0.4.x – 0.6.x    Security fixes only
< 0.4            No

Reporting a Vulnerability

If you discover a security vulnerability, please report it responsibly:

  1. Do not open a public issue
  2. Email jamesyng79@gmail.com with:
    • Description of the vulnerability
    • Steps to reproduce
    • Potential impact
  3. You will receive an acknowledgment within 48 hours
  4. A fix will be prioritized based on severity

Security Measures

This project uses:

  • CodeQL — static analysis on every push
  • gitleaks — secret scanning on every push
  • pip-audit — dependency vulnerability scanning on every push
  • Dependabot — automated dependency updates
  • Optional SQLCipher at-rest encryption via MEMBOOT_DB_KEY
  • Optional encrypted notes mirroring via MEMBOOT_NOTES_KEY
  • Private local permissions on ~/.memboot storage paths (best effort on POSIX)
  • SSRF guard on web ingestion: http/https only; hostnames resolving to private, loopback, link-local, or cloud-metadata addresses are refused, and every redirect hop is re-validated so a public URL can't use a 30x redirect to reach an internal target. The body size is capped (10 MiB, compression refused) and a fetch timeout is applied. Override the IP check with MEMBOOT_INGEST_ALLOW_PRIVATE=1 for internal docs servers (the scheme check is always enforced).
  • Credential-directory denylist in the default indexing config — .env, .aws, .ssh, .gnupg, secrets, credentials, *.pem, *.key, *.tfvars and similar are skipped by default
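A minimal sketch of how a denylist like the one above can be applied, assuming each pattern is matched against every component of a project-relative path. The names DENY_PATTERNS and is_denied are hypothetical, the pattern list is only the subset named above, and memboot's actual matching rules may differ.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Illustrative subset of the patterns named in the policy (not memboot's full list).
DENY_PATTERNS = [
    ".env", ".aws", ".ssh", ".gnupg", "secrets", "credentials",
    "*.pem", "*.key", "*.tfvars",
]


def is_denied(relative_path: str, patterns=DENY_PATTERNS) -> bool:
    """Return True if any path component matches any deny pattern.

    Assumption: matching is per component and case-sensitive, so a denied
    directory name also excludes everything beneath it.
    """
    parts = PurePosixPath(relative_path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)
```

Under these assumptions, is_denied(".ssh/id_rsa") and is_denied("deploy/prod.tfvars") are both true, while is_denied("docs/keynote.md") is false because "*.key" only matches names that end in ".key".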

Threat Model

memboot is a single-user CLI and library that runs as the invoking user on their own machine. There is no deployed network surface in the default configuration (the optional MCP server is local-only and communicates over stdio). The threat model below describes what memboot does and does not defend against.

In scope — memboot defends against these

  • SSRF via memboot ingest <url> — the URL is validated before any network call, and every redirect hop is re-validated by the same guard. Non-http(s) schemes (file://, gopher://, dict://, …) are rejected outright. Hostnames that resolve to private (RFC1918), loopback, link-local (including the AWS metadata endpoint 169.254.169.254), multicast, reserved, or unspecified addresses are refused. If any address in a multi-record DNS response is non-public, the whole URL is rejected. The fetch is a plain stdlib urllib GET (not the extractor's bundled HTTP client) so memboot controls the redirect chain; redirects are capped at 5, the response body at 10 MiB, and a 20 s timeout is applied. The only residual gap is DNS rebinding within a single fetch — the guard resolves once for the check and the connection resolves again — which is not a meaningful threat in the single-user-CLI model (an attacker who can flip your DNS mid-request can attack you more directly).
  • Accidental credential ingestion — the default ignore_patterns skips common secret-bearing directories and file types. This is defense in depth on top of the file-extension allowlist (which already excludes .env, .pem, .key, .netrc).
  • Path traversal / symlink escape in file ingestion — the watcher and file ingester do not follow symlinks out of the project directory.
  • Plaintext-at-rest exposure — opt-in SQLCipher DB encryption and Fernet notes encryption are available for sensitive hosts.
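The per-URL check described in the SSRF item can be sketched with the standard library alone. This is an illustration under stated assumptions, not memboot's code: check_url and is_public_address are hypothetical names, the resolver is injectable only so the check can be exercised without network access, and redirect handling, the body-size cap, and the timeout are left out.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_public_address(addr: str) -> bool:
    """True only if the address is in none of the refused categories."""
    ip = ipaddress.ip_address(addr.split("%")[0])  # drop any IPv6 zone id
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local  # includes 169.254.169.254
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def check_url(url: str, resolve=socket.getaddrinfo) -> None:
    """Raise ValueError unless the URL is http(s) and every resolved address is public."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no hostname")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # One non-public record is enough to reject the whole URL.
    for info in resolve(parts.hostname, port, type=socket.SOCK_STREAM):
        addr = info[4][0]
        if not is_public_address(addr):
            raise ValueError(f"{parts.hostname} resolves to non-public address {addr}")
```

A caller following the policy would run the same check again on the Location target of every redirect before following it. Because the check and the later connection resolve the hostname separately, this sketch has the same single-fetch DNS rebinding gap the policy describes.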

Out of scope — these are the user's responsibility

  • Indirect prompt injection via indexed content. memboot is a retrieval primitive: when you query it, the matching chunks are returned verbatim — to your LLM client (via the get_context / query_memory MCP tools) and, in smart mode, to a local rerank model. If a file in your indexed project (or a page you ingested, or a PDF you were sent) contains text crafted to manipulate an LLM ("ignore previous instructions and …"), that text becomes part of your model's context on the next retrieval. memboot cannot distinguish a malicious instruction from legitimate content; the host LLM is responsible for not acting on instructions found in retrieved data. Recommendation: only point memboot at content sources you trust, and be aware that a dependency, a forwarded document, or a scraped page is an untrusted source.
  • Embedding inversion on shared or committed databases. memboot stores numpy float32 embedding vectors alongside chunk text in the per-project SQLite DB. Modern embedding-inversion research can reconstruct a large fraction of the original text from the vectors alone — so a memboot DB is not "just an index", it is a recoverable copy of everything you indexed. Never commit ~/.memboot/*.db to a repository (it is gitignored by default in memboot's own repo for this reason), and treat sharing a DB file the same way you would treat sharing the source it was built from. If you need to share an indexed corpus, share it encrypted (MEMBOOT_DB_KEY).
  • Local privilege boundaries. memboot runs with your privileges. It does not defend against an attacker who already has code execution or filesystem write access as your user — at that point they can read your data directly without going through memboot.
  • Denial of service, social engineering, and supply-chain compromise of upstream dependencies beyond what pip-audit + Dependabot catch.

Scope summary

In scope:

  • Code injection vulnerabilities
  • Path traversal in file ingestion
  • SSRF in web ingestion
  • Credential exposure through default indexing behavior
  • Dependency vulnerabilities with known exploits

Out of scope:

  • Indirect prompt injection via content you chose to index
  • Embedding inversion on databases you chose to share
  • Denial of service
  • Social engineering
