A standalone Node.js recommendation system that surfaces relevant articles from your Readwise queue based on your reading interests.
This system analyzes your reading interests and generates daily HTML recommendations from your Readwise "Later" queue and tagged collections. It scores articles based on thematic keyword matching, filters out previously recommended items, and hosts the results as a bookmarkable webpage.
Claude Code is optional — it can help create your interest profile by analyzing your highlights, but you can also create the profile manually.
- Node.js 14+
- Readwise account with API access
- Readwise API token
Optional (for enhanced features):
- Readwise CLI (`npm install -g @readwise/cli`) — required for qmd engine exports and discovery search
- qmd — required only for the qmd scoring engine (BM25+vector full-text search)
```bash
git clone https://github.com/derekvan/Readwise-recommender.git
cd Readwise-recommender
npm install
```

Set your API token:

```bash
export READWISE_TOKEN="your-api-token-here"
```

Add this to your `~/.bashrc` or `~/.zshrc` to make it permanent.
The Readwise CLI is required for the qmd scoring engine and discovery search. It is not needed for the default keyword scoring workflow.
```bash
npm install -g @readwise/cli
readwise login-with-token your-readwise-token
```

Verify it works:

```bash
readwise reader-list-documents --location later --limit 1 --json
```

Your interest profile defines the topics you care about. There are two ways to create it:
Claude Code can analyze your Readwise highlights and automatically generate a high-quality interest profile. This approach identifies thematic patterns across thousands of highlights, picking keywords and weights you might not think of.
Step 1: Fetch your highlights

```
Fetch all my Readwise highlights using readwise_export_highlights
and save to data/highlights_raw.json
```
Step 2: Generate interest profile

```
Analyze all highlights in data/highlights_raw.json. Identify 25 thematic clusters
representing my core interests. For each cluster, provide: theme name, 5-10 keywords,
and calculate weight based on how many highlights match. Save to data/interest_profile.json.
```
Claude Code will produce a JSON file with your personalized reading themes.
Create `data/interest_profile.json` with this structure:

```json
{
  "version": "1.0",
  "metadata": {
    "created_at": "2026-01-29T10:00:00Z",
    "cluster_count": 15
  },
  "clusters": [
    {
      "id": "cluster_001",
      "theme": "Decision Making & Cognitive Frameworks",
      "keywords": ["decision-making", "intuition", "biases", "rationality", "thinking"],
      "weight": 0.18,
      "description": "Frameworks for making better decisions and understanding cognitive biases"
    },
    {
      "id": "cluster_002",
      "theme": "Writing & Communication",
      "keywords": ["writing", "storytelling", "editing", "clarity", "communication"],
      "weight": 0.12,
      "description": "Techniques for clear, effective writing and communication"
    },
    {
      "id": "cluster_003",
      "theme": "Productivity & Focus",
      "keywords": ["productivity", "focus", "deep work", "distraction", "habits"],
      "weight": 0.15,
      "description": "Systems for managing attention and building effective habits"
    }
  ]
}
```

Field guide:
- `theme`: A topic you care about (e.g., "Decision Making")
- `keywords`: Words that signal this topic in article titles/summaries (lowercase, 5-10 per theme)
- `weight`: Importance of this theme (0.0-1.0). Higher = more likely to surface articles on this topic. Weights don't need to sum to 1.0.
- `description`: Human-readable explanation of what this cluster represents
Tips:
- Start with 10-25 themes that represent your core reading interests
- Think about what words appear in article titles/summaries for each topic
- Higher weights prioritize that theme in scoring
- You can refine your profile over time based on which recommendations feel right
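If you hand-edit the profile, a quick sanity check can catch malformed clusters before they silently skew scoring. The validator below is purely illustrative (it is not part of the repo) and simply encodes the field-guide rules above:

```javascript
// Illustrative sanity check for an interest profile object
// (not part of the repo; assumes the structure shown above).
function validateProfile(profile) {
  const errors = [];
  if (!Array.isArray(profile.clusters) || profile.clusters.length === 0) {
    return ['profile.clusters must be a non-empty array'];
  }
  for (const c of profile.clusters) {
    if (!c.theme) errors.push(`${c.id}: missing theme`);
    if (!Array.isArray(c.keywords) || c.keywords.length < 5 || c.keywords.length > 10) {
      errors.push(`${c.id}: expected 5-10 keywords`);
    } else if (c.keywords.some((k) => k !== k.toLowerCase())) {
      errors.push(`${c.id}: keywords should be lowercase`);
    }
    if (typeof c.weight !== 'number' || c.weight <= 0 || c.weight > 1) {
      errors.push(`${c.id}: weight must be in (0, 1]`);
    }
  }
  return errors;
}
```

An empty result array means the profile passes these basic checks.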
Copy the example settings file:

```bash
cp utils/settings.example.json utils/settings.json
```

Edit `utils/settings.json` to customize:

```json
{
  "tagBucket": {
    "enabled": true,
    "tag": "bankruptcy2025-12",
    "label": "Bankruptcy Readings",
    "emoji": "💼",
    "cooldownMonths": 6,
    "count": 3
  },
  "laterBucket": {
    "label": "Top Picks",
    "emoji": "✨",
    "cooldownMonths": 2,
    "count": 2
  },
  "batchCount": 3,
  "recommendationsPerBucket": 5,
  "excludedCategories": ["pdf", "epub"],
  "upload": {
    "method": "local"
  }
}
```

See Configuration Reference below for all options.
Choose a scoring engine path:

Keyword engine (default) — REST API with full-text:

```bash
node utils/fetch_later_incremental.js   # fetches full HTML content (~several minutes)
node utils/merge_chunks.js              # strips HTML → plain text, saves fulltext map
node utils/build_scored_cache.js        # scores documents (uses full text if available)
```

QMD engine — Readwise Export with full-text:

```bash
node utils/export_docs_to_qmd.js        # exports Later docs as markdown, indexes via qmd
node utils/build_scored_cache.js        # scores documents via BM25 search
```

(Optional) Fetch from a tagged collection:

```bash
# Keyword engine
node utils/fetch_tag_incremental.js "your-tag"
node utils/merge_tag_chunks.js "your-tag"
node utils/score_tag_documents.js "your-tag"

# QMD engine
node utils/export_tag_to_qmd.js "your-tag"
node utils/score_tag_documents.js "your-tag"
```

Then generate your first recommendations:

```bash
node daily_recs.js
```

This generates an HTML file at output/recommendations/YYYY-MM-DD.html and prints the file path.
Once setup is complete, run one command each day:

```bash
node daily_recs.js
```

What this does:
- Generates HTML recommendations from cached, scored data
- Filters out previously recommended items
- Creates multiple batches of recommendations (default: 3 batches with 2 "Later" + 3 tagged items each)
- Shows batch 1 by default with a "Get More Recommendations" button to reveal additional batches
- Uploads based on your configured method (or saves locally)
- Prints the URL or file path
Time: < 10 seconds
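The batch assembly step can be sketched as below. This is a simplified illustration, not the repo's actual internals: the function name and document arrays are hypothetical, and it assumes each candidate list is already sorted best-first.

```javascript
// Illustrative batch assembly: batchCount batches, each drawing
// laterBucket.count + tagBucket.count items from sorted candidate lists.
// Simplified sketch — not the repo's actual implementation.
function buildBatches(laterDocs, tagDocs, settings) {
  const batches = [];
  let li = 0; // cursor into the "Later" candidates
  let ti = 0; // cursor into the tagged candidates
  for (let b = 0; b < settings.batchCount; b++) {
    batches.push({
      later: laterDocs.slice(li, li + settings.laterBucket.count),
      tagged: tagDocs.slice(ti, ti + settings.tagBucket.count),
    });
    li += settings.laterBucket.count;
    ti += settings.tagBucket.count;
  }
  return batches;
}
```

With the defaults (3 batches, 2 "Later" + 3 tagged each), this consumes the top 6 "Later" and top 9 tagged candidates.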
The generated page includes:
- Multiple batches of recommendations with progressive disclosure
- Scores, matched themes, and reading time estimates
- Archive buttons to remove articles from your queue without reading
- Mobile-friendly, bookmarkable design
The system caches scored documents to keep daily runs fast. Refresh when you've added new articles:
Refresh "Later" Documents (keyword engine):

```bash
node utils/fetch_later_incremental.js
node utils/merge_chunks.js
node utils/build_scored_cache.js
```

Refresh "Later" Documents (qmd engine):

```bash
node utils/export_docs_to_qmd.js   # incremental by default; use --full to force a full export
node utils/build_scored_cache.js
```

When the qmd engine is enabled, `daily_recs.js` runs an incremental export automatically at startup — you usually don't need to refresh manually.

Refresh Tagged Collection (keyword engine):

```bash
node utils/fetch_tag_incremental.js "your-tag-name"
node utils/merge_tag_chunks.js "your-tag-name"
node utils/score_tag_documents.js "your-tag-name"
```

Refresh Tagged Collection (qmd engine):

```bash
node utils/export_tag_to_qmd.js "your-tag-name"
node utils/score_tag_documents.js "your-tag-name"
```

Recommendation: Refresh weekly, or whenever you've added 50+ new articles.
Edit `utils/settings.json` to customize behavior:

```jsonc
{
  "tagBucket": {
    "enabled": true,                  // Enable/disable tag-based recommendations
    "tag": "bankruptcy2025-12",       // Readwise tag to track
    "label": "Bankruptcy Readings",   // Display name in HTML
    "emoji": "💼",                    // Section emoji
    "cooldownMonths": 6,              // Months before re-recommending the same item
    "count": 3                        // Items per batch from this bucket (default: 3)
  },
  "laterBucket": {
    "label": "Top Picks",             // Display name for "Later" recommendations (default: "Top Picks")
    "emoji": "✨",
    "cooldownMonths": 2,
    "count": 2                        // Items per batch from this bucket (default: 2)
  },
  "batchCount": 3,                    // Number of batches to generate (default: 3)
  "recommendationsPerBucket": 5,      // Fallback count if bucket.count not specified
  "excludedCategories": ["pdf", "epub"], // Document types to skip
  "scoring": {
    "engine": "keyword",              // "keyword" (default) or "qmd" (BM25+vector)
    "strongestThemeMultiplier": 2,    // Boost for the strongest theme match
    "rawScoreMultiplier": 6,          // Overall score scaling
    "maxMatchedClusters": 5,          // Max themes shown per document
    "fullTextScoring": {
      "enabled": true,                // Use full article body in scoring (keyword engine)
      "titleWeight": 3,               // Relative weight for title matches
      "summaryWeight": 2,             // Relative weight for summary matches
      "contentWeight": 1              // Relative weight for body text matches
    }
  },
  "cache": {
    "refreshIntervalDays": 7,         // Days between full cache refreshes
    "maxIncrementalUpdates": 10       // Max incremental updates before a full refresh
  },
  "upload": {
    "method": "local",                // local, github-pages, or scp
    "remoteFilename": "recs.html",
    "urlPath": "/recommendations/"
  }
}
```

Note: `utils/settings.json` is gitignored — your settings stay private.
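For illustration, the re-recommendation filtering that the `cooldownMonths` options drive might look like the sketch below. The function and the docId → last-recommended-date map are hypothetical, not the repo's actual code; the real system tracks this in `data/recommendation_log.json`.

```javascript
// Illustrative cooldown filter: drop candidates recommended within the
// last `cooldownMonths` months. `lastRecommended` is a hypothetical
// map of document_id → ISO date of the most recent recommendation.
function filterByCooldown(candidates, lastRecommended, cooldownMonths, now = new Date()) {
  const cutoff = new Date(now);
  cutoff.setMonth(cutoff.getMonth() - cooldownMonths);
  return candidates.filter((doc) => {
    const last = lastRecommended[doc.document_id];
    // Keep documents never recommended, or recommended before the cutoff.
    return !last || new Date(last) < cutoff;
  });
}
```

Each bucket applies its own cooldown, which is why `tagBucket` and `laterBucket` carry separate `cooldownMonths` values.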
Choose how to view your daily recommendations:
Open the HTML file directly in your browser:

```bash
open output/recommendations/$(date +%Y-%m-%d).html
```

Pros: Zero setup, complete privacy, works offline
Cons: Not bookmarkable, need to find the file each day

Set "upload.method": "local" in settings.json.
Host on GitHub Pages for a consistent, bookmarkable URL.
Initial Setup:

1. Create a public GitHub repository (e.g., `my-recommendations`)

2. Clone it locally and create the `docs` folder:

   ```bash
   git clone https://github.com/yourusername/my-recommendations.git
   cd my-recommendations
   mkdir docs
   echo "# My Recommendations" > docs/README.md
   git add docs/ && git commit -m "Initialize docs" && git push
   ```

3. Enable GitHub Pages:
   - Go to repository Settings → Pages
   - Source: Deploy from a branch
   - Branch: `main` → `/docs` folder → Save
   - Your site: https://yourusername.github.io/my-recommendations/

4. Configure `settings.json`:

   ```json
   {
     "upload": {
       "method": "github-pages",
       "pagesRepoPath": "/absolute/path/to/my-recommendations",
       "pagesUrl": "https://yourusername.github.io/my-recommendations/"
     }
   }
   ```
Daily Publishing:
`node daily_recs.js` automatically copies the HTML to `docs/index.html`, commits, and pushes.
Pros: Free hosting, bookmarkable, archive buttons work from anywhere
Cons: Repository must be public (article titles visible), requires a git push per update
Security Note: HTML contains article titles and reading times. If this is sensitive, use local hosting instead.
Upload to your own server via SCP.
Setup:

1. Configure SCP in `utils/config.json`:

   ```json
   {
     "sshHost": "your.server.com",
     "sshUser": "username",
     "remotePath": "/path/to/webroot/",
     "remoteFilename": "recs.html",
     "urlPath": "/recommendations/"
   }
   ```

2. Set the upload method in `utils/settings.json`:

   ```json
   { "upload": { "method": "scp" } }
   ```

3. Ensure SSH key authentication is configured (no password prompts).
Pros: Complete privacy, custom domain, full control
Cons: Requires a web server with SSH access
Generate HTML without uploading:
```json
{
  "upload": {
    "method": "local"
  }
}
```

`node daily_recs.js` prints the local file path without uploading.
For each document:
- Extract text: title + summary + author (and full article body if available)
- Match keywords: for each theme, count keyword matches across text fields
- Calculate theme contribution: `(matches / total_keywords) × theme_weight`
- Apply hybrid scoring: `(strongest_theme × 2) + sum_of_other_contributions`
- Scale to 1-10: multiply by 6 and cap at 10
Why hybrid? This prioritizes depth (strong single-theme match) over breadth (many weak matches).
Example:
- Document A: Strong "Meditation" match (0.045) + weak matches → Score: 2.5
- Document B: Many weak matches across 5 themes → Score: 2.0
Document A wins because it strongly matches a high-weight theme, even though Document B matches more themes weakly.
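The calculation above can be sketched as follows. This is a minimal illustration simplified to a single flat text field and the default multipliers; the repo's actual engine lives in `utils/scoring_engine.js` and additionally weights title, summary, and body matches.

```javascript
// Minimal sketch of hybrid keyword scoring (single text field,
// case-insensitive substring matching). Not the repo's real engine.
function scoreDocument(text, clusters, { strongestThemeMultiplier = 2, rawScoreMultiplier = 6 } = {}) {
  const haystack = text.toLowerCase();
  const contributions = clusters
    .map((c) => {
      const matches = c.keywords.filter((k) => haystack.includes(k)).length;
      // Theme contribution: (matches / total_keywords) × theme_weight
      return { theme: c.theme, value: (matches / c.keywords.length) * c.weight };
    })
    .filter((c) => c.value > 0)
    .sort((a, b) => b.value - a.value);
  if (contributions.length === 0) return { score: 0, themes: [] };
  const [strongest, ...rest] = contributions;
  // Hybrid: (strongest_theme × 2) + sum of remaining contributions
  const raw = strongest.value * strongestThemeMultiplier +
              rest.reduce((sum, c) => sum + c.value, 0);
  return {
    score: Math.min(10, raw * rawScoreMultiplier), // scale by 6, cap at 10
    themes: contributions.map((c) => c.theme),
  };
}
```

Because the strongest contribution is doubled before the remainder is added, one deep match beats several shallow ones at the same total contribution.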
Two scoring engines are available. Switch with scoring.engine in utils/settings.json.
The keyword engine scores documents via substring matching against their text fields. If full-text content is available (fetched via the REST API), it uses weighted scoring across three fields:
| Field | Default Weight |
|---|---|
| Title | 3× |
| Summary + author | 2× |
| Article body | 1× |
Full-text scoring catches articles whose summaries are vague but whose body is highly relevant. Title matches still count most.
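As a sketch of the idea, a single keyword's weighted match count under these defaults could be computed as below. The helper itself is illustrative (the field names follow the document shapes shown elsewhere in this README, not a guaranteed API):

```javascript
// Illustrative weighted match count for one keyword across fields:
// title counts 3×, summary+author 2×, body 1× (the defaults above).
function weightedMatches(keyword, doc, w = { title: 3, summary: 2, content: 1 }) {
  const hit = (field) => ((field || '').toLowerCase().includes(keyword) ? 1 : 0);
  return hit(doc.title) * w.title +
         hit(`${doc.summary || ''} ${doc.author || ''}`) * w.summary +
         hit(doc.content) * w.content;
}
```

A keyword appearing in both the title and the body scores 4 under the defaults, while one appearing only in the body scores 1.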
How to get full text (keyword engine):

```bash
node utils/fetch_later_incremental.js   # fetches full HTML via REST API
node utils/merge_chunks.js              # strips HTML → plain text, saves fulltext map
node utils/build_scored_cache.js        # scores with full text if available
```

Gotchas:

- The CLI fetch (`fetch_later_cli.js`) does not return article body — metadata only
- Daily incremental updates (`update_cache_from_yesterday.js`) do not fetch full text; new docs added between full rebuilds are scored on title+summary only
- Full-text scoring only takes effect during a full `build_scored_cache.js` rebuild
Configure weights in `utils/settings.json`:

```json
"scoring": {
  "fullTextScoring": {
    "enabled": true,
    "titleWeight": 3,
    "summaryWeight": 2,
    "contentWeight": 1
  }
}
```

The qmd engine scores documents by running BM25 searches against a local qmd index of full markdown documents. This is richer than substring matching: it uses proper term frequency weighting and indexes the entire article body by default.
Set "engine": "qmd" in `utils/settings.json` to enable:

```json
"scoring": {
  "engine": "qmd"
}
```

See QMD Engine Setup below for installation and configuration.
1. Install the Readwise CLI:

   ```bash
   npm install -g @readwise/cli
   readwise login-with-token your-readwise-token
   ```

2. Install qmd and initialize:

   ```bash
   # Follow qmd installation instructions for your platform
   qmd init
   ```

3. Configure qmd collections in `~/.config/qmd/index.yml`:

   ```yaml
   collections:
     readwise-docs:
       path: /absolute/path/to/Readwise-recommender/data/qmd-docs
     readwise-tag-docs:
       path: /absolute/path/to/Readwise-recommender/data/qmd-tag-docs
   ```
Export your Later queue as markdown and index it:

```bash
node utils/export_docs_to_qmd.js   # exports docs, then runs qmd update + qmd embed
```

For tagged collections:

```bash
node utils/export_tag_to_qmd.js "your-tag"
```

Both scripts automatically run qmd update and qmd embed after export.

When "engine": "qmd" is set, `daily_recs.js` automatically runs an incremental export at startup (Step 0) before generating recommendations. You generally don't need to export manually each day.

Force a full (non-incremental) export when needed:

```bash
node utils/export_docs_to_qmd.js --full
```

Articles originally saved via Readwise's internal read.readwise.io URLs are not included in the export ZIP. The export script automatically backfills these from the REST API fulltext JSON (data/documents_later_fulltext.json) if it exists. Run a full REST API fetch first if you're missing documents:

```bash
node utils/fetch_later_incremental.js
node utils/merge_chunks.js
```

Run recommendations automatically each morning using cron.
1. Find your node path:

   ```bash
   which node
   # Example output: /usr/local/bin/node or /opt/homebrew/bin/node
   ```

2. Get your project's absolute path:

   ```bash
   cd /path/to/Readwise-recommender
   pwd
   # Example output: /Users/yourusername/Code/Readwise-recommender
   ```

3. Set READWISE_TOKEN permanently. Add to your shell profile (`~/.bashrc`, `~/.zshrc`, or `~/.bash_profile`):

   ```bash
   export READWISE_TOKEN="your-api-token-here"
   ```

   Then reload:

   ```bash
   source ~/.zshrc   # or ~/.bashrc
   ```

4. Edit crontab:

   ```bash
   crontab -e
   ```

5. Add a cron job. For 6:00 AM daily (replace paths with your actual paths):

   ```
   0 6 * * * cd /Users/yourusername/Code/Readwise-recommender && /usr/local/bin/node daily_recs.js >> /tmp/daily_recs.log 2>&1
   ```

   Cron time format: `minute hour day month weekday`

   - `0 6 * * *` = 6:00 AM daily
   - `0 9 * * *` = 9:00 AM daily
   - `30 7 * * *` = 7:30 AM daily

6. Verify cron is running:

   ```bash
   crontab -l   # List your cron jobs
   ```

7. Check logs:

   ```bash
   tail -f /tmp/daily_recs.log   # Watch output in real time
   cat /tmp/daily_recs.log       # View full log
   ```
Cron job not running?

- Verify the cron service is running: `sudo launchctl list | grep cron` (macOS)
- Check system logs: `grep CRON /var/log/syslog` (Linux)
- Ensure full paths are used (no `~` shorthand)

Environment variables not available?

- Cron runs with a minimal environment. Export READWISE_TOKEN in your shell profile as shown above
- Test the exact cron command manually: `cd /path && /usr/local/bin/node daily_recs.js`

Wrong node version?

- Cron may use the system node instead of your nvm/asdf node
- Use the full path from `which node` to ensure the correct version
For more reliable scheduling on macOS, consider using launchd instead of cron. Create ~/Library/LaunchAgents/com.user.readwise-recs.plist:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.user.readwise-recs</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/node</string>
    <string>/Users/yourusername/Code/Readwise-recommender/daily_recs.js</string>
  </array>
  <key>EnvironmentVariables</key>
  <dict>
    <key>READWISE_TOKEN</key>
    <string>your-api-token-here</string>
  </dict>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key>
    <integer>6</integer>
    <key>Minute</key>
    <integer>0</integer>
  </dict>
  <key>StandardOutPath</key>
  <string>/tmp/readwise-recs.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/readwise-recs-error.log</string>
</dict>
</plist>
```

Load it:

```bash
launchctl load ~/Library/LaunchAgents/com.user.readwise-recs.plist
```

Check status:

```bash
launchctl list | grep readwise-recs
```

Example data/interest_profile.json:

```json
{
  "version": "1.0",
  "metadata": {
    "created_at": "2026-01-29T10:00:00Z",
    "cluster_count": 25
  },
  "clusters": [
    {
      "id": "cluster_001",
      "theme": "Decision Making & Cognitive Frameworks",
      "keywords": ["decision-making", "intuition", "fear", "reason", "biases"],
      "weight": 0.15,
      "description": "Frameworks for making better decisions..."
    }
  ]
}
```

Example data/recommendation_log.json:

```json
{
  "version": "1.0",
  "recommendations": [
    {
      "date": "2026-01-29",
      "bucket_1_later": [
        {
          "document_id": "01kdnmypxhaphr6be3vv7xkgn4",
          "title": "How Can I Stop Being So Afraid of Changing My Life?",
          "score": 10.0,
          "themes": ["Decision Making", "Overcoming Fear"]
        }
      ],
      "bucket_2_bankruptcy": [...],
      "bonus_doc_ids": ["01xyz...", "01abc..."],
      "bonus_buckets": {
        "bucket_1_later": ["01xyz..."],
        "bucket_2_bankruptcy": ["01abc..."]
      }
    }
  ],
  "recommended_doc_ids": ["01kdnmypxhaphr6be3vv7xkgn4", ...]
}
```

No high-scoring recommendations?
- Check your interest profile (`data/interest_profile.json`) — ensure keywords match how articles describe topics
- Lower the threshold or refine keyword lists based on article titles in your queue
Recommendations not aligned with interests?
- Review cluster weights in `data/interest_profile.json`
- Add missing keywords that appear in article titles you're interested in
- If using Claude Code, regenerate the profile after accumulating new highlights
Want more variety?
- Increase `batchCount` in settings.json to generate more batches (e.g., change from 3 to 5)
- Adjust `laterBucket.count` and `tagBucket.count` for more items per batch
- Use the "Get More Recommendations" button to reveal additional batches beyond the first set
Archive buttons not working?
- Archive buttons require `READWISE_TOKEN` in browser localStorage
- Open the browser console and run: `localStorage.setItem('READWISE_TOKEN', 'your-token-here')`
- This is per-origin (GitHub Pages users need to set it once per browser)
Data fetch failing?
- Verify `READWISE_TOKEN` is set: `echo $READWISE_TOKEN`
- Check the token is valid at https://readwise.io/access_token
- Ensure internet connection is active
Scores seem low even for relevant articles?
- Enable full-text scoring by running a full REST API fetch: `fetch_later_incremental.js` → `merge_chunks.js` → `build_scored_cache.js`
- Without full text, scoring uses only the article title, summary, and author — body content is ignored
- Alternatively, switch to the qmd engine (`"engine": "qmd"`), which indexes full article text via BM25
QMD export seems to be missing documents?
- Articles originally saved via Readwise's internal `read.readwise.io` URLs are not included in the export ZIP
- Run a full REST API fetch first (`fetch_later_incremental.js` + `merge_chunks.js`) — the export script will backfill missing articles from the resulting fulltext JSON
Discovery search not working?
- Discovery search requires the Readwise CLI: `npm install -g @readwise/cli`
- Authenticate with `readwise login-with-token your-token`
- Check that `discovery.enabled` is `true` in `utils/settings.json`
```
Readwise-recommender/
├── data/                              # Your data (gitignored)
│   ├── highlights_raw.json            # Cached highlights (optional, for profile generation)
│   ├── interest_profile.json          # Your thematic clusters
│   ├── recommendation_log.json        # Tracking previously recommended items
│   ├── documents_later.json           # Scored candidates from "Later" queue
│   ├── documents_later_fulltext.json  # Extracted plain text map (from merge step)
│   ├── documents_<tag>.json           # Scored candidates from tagged collection
│   ├── qmd-docs/                      # Markdown export for qmd engine (Later queue)
│   └── qmd-tag-docs/                  # Markdown export for qmd engine (tagged docs)
│
├── output/
│   └── recommendations/               # Daily HTML files (local archive)
│       └── YYYY-MM-DD.html
│
├── utils/
│   ├── fetch_later_incremental.js     # Fetch "Later" docs via REST API (with full text)
│   ├── fetch_later_cli.js             # Fetch "Later" docs via CLI (metadata only, no full text)
│   ├── fetch_tag_incremental.js       # Fetch tagged documents
│   ├── merge_chunks.js                # Merge fetched data, extract full text
│   ├── build_scored_cache.js          # Score documents (keyword or qmd engine)
│   ├── scoring_engine.js              # Keyword scoring algorithm
│   ├── qmd_scoring_engine.js          # QMD BM25+vector scoring engine
│   ├── scoring_engine_selector.js     # Routes to keyword or qmd engine per settings
│   ├── export_docs_to_qmd.js          # Export Later docs as markdown for qmd
│   ├── export_tag_to_qmd.js           # Export tagged docs as markdown for qmd
│   ├── score_tag_documents.js         # Score tagged documents
│   ├── update_cache_from_yesterday.js # Incremental daily cache update
│   ├── discover_by_search.js          # Content-based discovery search (requires CLI)
│   ├── upload_recommendations.js      # SCP upload helper
│   ├── upload_github_pages.js         # GitHub Pages upload helper
│   ├── settings.example.json          # Configuration template
│   ├── settings.json                  # Your settings (gitignored)
│   ├── config.json.template           # SCP config template
│   └── config.json                    # SCP credentials (gitignored)
│
├── daily_recs.js                      # Main daily script
└── README.md                          # This file
```
- Feedback Loop: Track read vs. skipped items and adjust theme weights automatically
- Diversity Scoring: Ensure recommendations span multiple themes by default
- Engagement Analysis: "What themes did I engage with most this month?"
- Conversational Refinement: Natural language profile updates ("I want less philosophy, more practical advice")
- Weekly Summaries: Reading pattern analysis and theme trends
Questions or feedback? Open an issue at https://github.com/yourusername/Readwise-recommender/issues