Fast and efficient spreadsheet analysis through atomic operations, built specifically for AI agents
Made with ❤️ for mom by @Jwadow
Analyze Excel spreadsheets with your AI agent through atomic operations, with no data dumping into AI context
Works with OpenCode, Claude Code, Codex app, Cursor, Cline, Roo Code, Kilo Code and other MCP-compatible AI agents
Why This Exists • My Mom's Review • What Your Agent Can Do • Installation • Donate
Local-First Architecture: This server runs entirely on your local machine. Your Excel files are processed locally and never leave your computer.
Is it safe?
- Local Models (Ollama, LM Studio): Your data never leaves your machine. 100% private.
- Cloud Models (OpenRouter, ChatGPT): Only the precise results of operations (counts, sums, formulas) and metadata (column names) are sent to the model. The bulk raw data remains on your disk.
The Problem: Most Excel tools for AI dump raw spreadsheet data into the agent's context. This floods the context window, slows everything down, and the AI can still miscalculate or get confused in large datasets.
This Project: Think SQL for Excel. Your AI agent composes atomic operations (filter_and_count, aggregate, group_by) and gets back precise results, not thousands of rows.
The agent analyzes data without seeing it. Results come as numbers, formulas, and insights.
"This is like working with a database through SQL, not dragging everything into memory." - AI Agent after analyzing a production spreadsheet
Model Context Protocol is an open standard that lets AI agents use external tools.
This project is such a tool. When you connect this server to your AI agent (OpenCode, Claude Code, Codex app, Cursor, Cline, Roo Code, Kilo Code, etc.), your agent gets 25 new tools for working with Excel files: filtering, counting, aggregating, analyzing.
The key benefit: Your AI doesn't load thousands of spreadsheet rows into its memory. Instead, it asks specific questions and gets precise answers. Faster, more accurate, no context overflow.
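The idea of an atomic operation can be sketched in a few lines of plain Python. This is only an illustration, not the server's code: `filter_and_count_sketch`, the sample rows, and the assumption that the filtered column sits in Excel column B are all hypothetical.

```python
import operator

# Hypothetical in-memory table standing in for a loaded sheet.
ROWS = [
    {"Region": "North", "Amount": 1200},
    {"Region": "South", "Amount": 800},
    {"Region": "North", "Amount": 450},
    {"Region": "East", "Amount": 3000},
]

OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt}


def filter_and_count_sketch(rows, column, op, value):
    """Return only a count and an Excel formula, never the rows themselves."""
    count = sum(1 for row in rows if OPS[op](row[column], value))
    # Assumption: the column lives in Excel column B; a real tool would look this up.
    criterion = f'"{value}"' if op == "==" else f'"{op}{value}"'
    return {"count": count, "formula": f"=COUNTIF(B:B,{criterion})"}


result = filter_and_count_sketch(ROWS, "Amount", ">", 1000)
print(result)
```

The agent receives a two-field answer regardless of how many rows the sheet holds, which is what keeps the context window small.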
Translated from Russian. She's not a tech person - types with one finger, uses Excel every day for work.
"Usually takes me an hour to break down this spreadsheet, filter by categories, copy into different columns, calculate totals. Gave it the task and it did everything in 3 minutes. Checked it and its correct. Now its like this with any task, just write what I need and it does it. I'm honestly shocked. Half my life I've been doing this by hand and the computer just gets what I need. Saving so much time for real."
Once connected, your AI agent gets a lot of specialized tools for analyzing spreadsheet data. The agent receives only precise queries and reliable results.
- Inspect files - structure, sheets, columns, data types (auto-detects messy headers)
- Profile columns - statistics, null counts, top values, data quality in one call
- Find data - search across multiple sheets, locate columns anywhere
- 12 filter operators - ==, !=, >, <, >=, <=, in, not_in, contains, startswith, endswith, regex
- Complex logic - nested AND/OR groups, NOT operator, unlimited conditions
- Batch operations - classify data into multiple categories in one request (6x faster)
- Overlap analysis - Venn diagrams, intersection counts, set operations
- 8 aggregation functions - sum, mean, median, min, max, std, var, count
- Group by - pivot tables with multiple grouping columns
- Statistical analysis - correlations (Pearson/Spearman/Kendall), outlier detection (IQR/Z-score)
- Time series - period-over-period growth, moving averages, running totals
- Ranking - top-N, bottom-N, percentile ranking (with grouping support)
- Calculated columns - arithmetic expressions between columns
- Data validation - find duplicates, null values, data quality checks
- Sheet comparison - diff between versions, find changes
- Atomic operations - results in 20-50ms, no matter the file size
- Smart caching - file loaded once, reused for all operations
- Sample rows - preview filtered data without full retrieval
- Context protection - smart limits prevent AI context overflow
- Formula generation - every result includes Excel formula for dynamic updates
- TSV output - copy-paste results directly into Excel
- Legacy support - works with old .xls files (Excel 97-2003)
- Multi-sheet - analyze across multiple sheets in one file
Example queries your agent can now handle:
- "Show me top 10 customers by revenue"
- "Find all orders from Q4 where amount > $1000"
- "Calculate month-over-month growth for each product category"
- "Which customers are both VIP and active? (overlap analysis)"
- "Find duplicates in the email column"
Python 3.10 or higher is required.
git clone https://github.com/jwadow/mcp-excel.git
cd mcp-excel
No Git? Click "Code" → "Download ZIP" at the top of this repository page, extract, and open a terminal in that folder.
Option A: Poetry (Recommended)
Poetry is a modern Python dependency manager (replaces pip+venv+requirements.txt).
Install it: pip install poetry or pipx install poetry
Install dependencies:
poetry install
Configure your AI agent:
Add this to your MCP settings (JSON config):
{
"mcpServers": {
"excel": {
"command": "poetry",
"args": ["run", "python", "-m", "mcp_excel.main"],
"cwd": "C:/path/to/mcp-excel"
}
}
}
Important: Replace C:/path/to/mcp-excel with the actual path to the cloned repository.
Option B: pip with virtual environment
Install dependencies:
# Windows
python -m venv venv
venv\Scripts\activate
pip install -e .
# Linux/Mac
python -m venv venv
source venv/bin/activate
pip install -e .
Find Python path in venv:
# Windows
where python
# Linux/Mac
which python
Configure your AI agent:
Add this to your MCP settings (JSON config):
{
"mcpServers": {
"excel": {
"command": "C:/path/to/mcp-excel/venv/Scripts/python.exe",
"args": ["-m", "mcp_excel.main"],
"cwd": "C:/path/to/mcp-excel"
}
}
}
Important:
- Replace C:/path/to/mcp-excel/venv/Scripts/python.exe with the actual path from the where python command
- On Linux/Mac use the path from which python (e.g., /path/to/mcp-excel/venv/bin/python)
Option C: System Python (Not Recommended)
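As an alternative to copying the path by hand, the config can be printed from inside the activated virtual environment. This is a convenience sketch, not part of the project: `build_mcp_config` is a hypothetical helper, and it assumes the script is run with the venv's own interpreter so that `sys.executable` points at it.

```python
import json
import sys


def build_mcp_config(repo_dir: str) -> dict:
    """Build the MCP settings entry using the interpreter that runs this script."""
    return {
        "mcpServers": {
            "excel": {
                # sys.executable is the venv's python when the venv is active.
                "command": sys.executable,
                "args": ["-m", "mcp_excel.main"],
                "cwd": repo_dir,
            }
        }
    }


# Replace the placeholder with the real path to the cloned repository.
config = build_mcp_config("C:/path/to/mcp-excel")
print(json.dumps(config, indent=2))
```

Paste the printed JSON into your agent's MCP settings.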
Install dependencies globally:
pip install "mcp>=1.1.0" "pandas>=2.2.0" "pydantic>=2.10.0" "xlrd>=2.0.1" "openpyxl>=3.1.0" "psutil>=6.1.0" "python-dateutil>=2.9.0"
Configure your AI agent:
{
"mcpServers": {
"excel": {
"command": "python",
"args": ["-m", "mcp_excel.main"],
"cwd": "C:/path/to/mcp-excel"
}
}
}
Restart your AI agent and test:
"Analyze the Excel file at C:/Users/YourName/Documents/test.xlsx"
If it works - you're done! If not, check:
- Path to repository is correct in cwd
- Python path is correct in command (for pip method)
- All dependencies are installed
Works with any MCP-compatible AI agent.
After configuration, restart your AI agent and ask it to analyze Excel files:
"Analyze the Excel file at C:/Users/YourName/Documents/sales.xls"
"Show me top 10 customers by revenue from sales.xlsx"
"Find duplicates in column 'Email' in contacts.xlsx"
"Calculate month-over-month growth from revenue.xls"
Complete Tool Reference (25 tools)
Get file structure overview - sheets, dimensions, format.
Use for: Initial file exploration, sheet discovery, format validation
Returns: Sheet list, row/column counts, file metadata
Detailed sheet analysis with auto-header detection.
Use for: Understanding data structure, column types, sample preview
Returns: Column names/types, row count, sample data (3 rows), header detection info
Quick column enumeration without loading full data.
Use for: Schema validation, filter building, column availability checks
Returns: Column name list, column count
Comprehensive column profiling - types, stats, nulls, top values.
Use for: Initial data exploration, quality assessment, distribution analysis
Returns: Per-column: type, null %, unique count, stats (numeric), top N values
Efficiency: Replaces 10+ separate calls (get_column_stats + get_value_counts + find_nulls)
Locate column across multiple sheets.
Use for: Multi-sheet navigation, data discovery, cross-sheet analysis
Returns: Sheet list with column locations, indices, row counts (case-insensitive)
Extract unique values from a column.
Use for: Data exploration, filter building, distinct value discovery, data quality checks
Returns: Unique value list, count, truncated flag (if limit exceeded)
Default limit: 100 values
Frequency analysis - top N most common values.
Use for: Distribution analysis, identifying dominant categories, data imbalance detection
Returns: Value → count dictionary, total count, TSV output
Default: Top 10 values
Retrieve filtered rows with pagination.
Use for: Data extraction, sample inspection, detailed analysis, export
Returns: Filtered rows (list of dicts), total count, TSV output
Pagination: limit/offset support
Count rows matching conditions with 14 operators.
Operators: ==, !=, >, <, >=, <=, in, not_in, contains, startswith, endswith, regex, is_null, is_not_null
Logic: Nested AND/OR groups, NOT operator, unlimited conditions
Use for: Classification, segmentation, data validation, category counting
Returns: Count + Excel formula (COUNTIFS), optional sample rows
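Nested AND/OR groups with NOT can be evaluated recursively. The sketch below shows one way to do it in plain Python; the condition-tree shape (`and`, `or`, `not`, `column`/`op`/`value` keys) is an assumption made for illustration and may differ from the server's actual filter schema, and only a subset of the operators is implemented.

```python
import re

# Subset of the listed operators, implemented on plain Python values.
OPERATORS = {
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    ">": lambda a, b: a > b,
    "in": lambda a, b: a in b,
    "contains": lambda a, b: b in str(a),
    "regex": lambda a, b: re.search(b, str(a)) is not None,
    "is_null": lambda a, b: a is None,
}


def matches(row, node):
    """Evaluate a (hypothetical) condition tree against one row dict."""
    if "and" in node:
        return all(matches(row, child) for child in node["and"])
    if "or" in node:
        return any(matches(row, child) for child in node["or"])
    if "not" in node:
        return not matches(row, node["not"])
    return OPERATORS[node["op"]](row.get(node["column"]), node.get("value"))


rows = [
    {"status": "active", "tier": "VIP", "amount": 1500},
    {"status": "active", "tier": "basic", "amount": 200},
    {"status": "closed", "tier": "VIP", "amount": 900},
    {"status": "active", "tier": "basic", "amount": 2500},
]

# status == active AND (tier == VIP OR amount > 1000)
tree = {"and": [
    {"column": "status", "op": "==", "value": "active"},
    {"or": [
        {"column": "tier", "op": "==", "value": "VIP"},
        {"column": "amount", "op": ">", "value": 1000},
    ]},
]}

count = sum(matches(row, tree) for row in rows)
negated_count = sum(matches(row, {"not": tree}) for row in rows)
print(count, negated_count)
```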
Classify data into multiple categories in one call (6x faster).
Use for: Multi-category classification, market segmentation, quality control
Returns: Count + formula per category, TSV table for Excel
Efficiency: Loads file once, applies all filters, returns all results
Venn diagram analysis - intersections, unions, exclusive zones.
Use for: Overlap analysis, cross-sell opportunities, data consistency checks
Returns: Set counts, pairwise intersections (A ∩ B), union, Venn data (2-3 sets)
Examples: VIP AND active customers, product category overlaps, completed orders WITHOUT completion date
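Overlap analysis reduces to set operations on the row indices that match each filter. A minimal sketch with made-up customer data follows; `overlap_counts` is a hypothetical name, not the tool's API.

```python
# Hypothetical rows: (name, is_vip, is_active)
customers = [
    ("Ann", True, True),
    ("Bob", True, False),
    ("Cy", False, True),
    ("Di", False, False),
    ("Ed", True, True),
]

# Each filter yields the set of matching row indices.
vip = {i for i, (_, is_vip, _) in enumerate(customers) if is_vip}
active = {i for i, (_, _, is_active) in enumerate(customers) if is_active}


def overlap_counts(a, b):
    """Counts for the zones of a two-set Venn diagram."""
    return {
        "a_only": len(a - b),
        "b_only": len(b - a),
        "intersection": len(a & b),
        "union": len(a | b),
    }


venn = overlap_counts(vip, active)
print(venn)
```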
Perform aggregation with optional filters (8 operations).
Operations: sum, mean, median, min, max, std, var, count
Use for: Totals, averages, statistical summaries, conditional aggregations, KPIs
Returns: Aggregated value + Excel formula (SUMIF, AVERAGEIF, etc.)
Special: Auto-converts text-stored numbers to numeric
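The conversion of text-stored numbers before aggregating can be illustrated as follows. This is a stdlib-only sketch, not the server's implementation: `to_number` and `aggregate_sketch` are hypothetical, and `std`/`var` use the sample definitions here because the document does not say which definition the server applies.

```python
import statistics


def to_number(value):
    """Convert values like ' 50.5 ' or '1,000' to float; return None if not numeric."""
    if isinstance(value, (int, float)):
        return float(value)
    try:
        return float(str(value).replace(",", "").strip())
    except ValueError:
        return None


AGGREGATIONS = {
    "sum": sum,
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
    "max": max,
    "std": statistics.stdev,      # sample standard deviation (assumption)
    "var": statistics.variance,   # sample variance (assumption)
    "count": len,
}


def aggregate_sketch(values, operation):
    """Aggregate after dropping anything that cannot be read as a number."""
    numbers = [n for n in map(to_number, values) if n is not None]
    return AGGREGATIONS[operation](numbers)


raw = [100, "250", " 50.5 ", "1,000", "n/a", None]
print(aggregate_sketch(raw, "sum"), aggregate_sketch(raw, "count"))
```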
Pivot table with multi-column grouping.
Use for: Category analysis, hierarchical grouping, sales by region/product
Returns: Grouped data with aggregated values, TSV output
Supports: Multiple grouping columns, all 8 aggregation operations
Statistical summary - count, mean, median, std, quartiles.
Use for: Distribution analysis, data profiling, outlier detection prep
Returns: Full stats (min, max, mean, median, std, Q1, Q3), null count, TSV output
Correlation matrix between 2+ columns.
Methods: Pearson (linear), Spearman (rank-based), Kendall (rank-based)
Use for: Relationship analysis, variable dependency, feature selection
Returns: Correlation matrix (-1 to 1), TSV output
Anomaly detection using IQR or Z-score.
Methods: IQR (robust), Z-score (assumes normal distribution)
Use for: Fraud detection, sensor errors, data quality, unusual value identification
Returns: Outlier rows with indices, count, method/threshold used
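The two methods can behave differently on small samples. The sketch below uses the common textbook rules (1.5 × IQR fences and a z-score cutoff); the thresholds and quartile method are assumptions, and the server's defaults may differ.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """(index, value) pairs outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """(index, value) pairs whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) / std > threshold]


data = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(data))
print(zscore_outliers(data, threshold=3.0))
print(zscore_outliers(data, threshold=2.0))
```

With only eight points, the extreme value inflates the standard deviation enough that a cutoff of 3 misses it, while the IQR fences still catch it. This is one reason IQR is described as the robust option.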
Detect duplicate rows by specified columns.
Use for: Data quality, deduplication planning, integrity checks
Returns: All duplicate rows (including first occurrence), count, indices
Note: Uses duplicated(keep=False) to mark all duplicates
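The effect of `keep=False` is that every member of a duplicate group is reported, including the first occurrence. The stdlib sketch below mimics that behaviour without pandas; `find_duplicates_sketch` is a hypothetical name.

```python
from collections import Counter


def find_duplicates_sketch(rows, columns):
    """Indices of all rows whose key (the given columns) occurs more than once."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(keys)
    return [i for i, key in enumerate(keys) if counts[key] > 1]


contacts = [
    {"email": "a@example.com"},
    {"email": "b@example.com"},
    {"email": "a@example.com"},
    {"email": "c@example.com"},
    {"email": "b@example.com"},
]

duplicate_indices = find_duplicates_sketch(contacts, ["email"])
print(duplicate_indices)
```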
Find null/empty values with detailed statistics.
Use for: Completeness checks, missing value analysis, data cleaning
Returns: Per-column: null count, percentage, indices (first 100)
Note: Placeholders (".", "-") are NOT null - use == or in operators
Search value across all sheets.
Use for: Cross-sheet search, value tracking, data location
Returns: Sheet list with match counts, total matches
Supports: Numeric and string values
Diff between two sheets using key column.
Use for: Version comparison, change detection, reconciliation, audit trails
Returns: Rows with differences, status (only_in_sheet1/sheet2/different_values), side-by-side comparison
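A key-based diff can be sketched by indexing both sheets on the key column and classifying each key. The status strings follow the description above; the function name and sample data are hypothetical.

```python
def compare_by_key(sheet1, sheet2, key):
    """Classify each key as only_in_sheet1, only_in_sheet2, or different_values."""
    left = {row[key]: row for row in sheet1}
    right = {row[key]: row for row in sheet2}
    diff = []
    for k in sorted(left.keys() | right.keys()):
        if k not in right:
            diff.append((k, "only_in_sheet1"))
        elif k not in left:
            diff.append((k, "only_in_sheet2"))
        elif left[k] != right[k]:
            diff.append((k, "different_values"))
    return diff  # identical rows are omitted


v1 = [{"id": 1, "price": 10}, {"id": 2, "price": 20}, {"id": 3, "price": 30}]
v2 = [{"id": 1, "price": 10}, {"id": 2, "price": 25}, {"id": 4, "price": 40}]

changes = compare_by_key(v1, v2, "id")
print(changes)
```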
Period-over-period growth analysis.
Periods: month, quarter, year
Use for: Trend analysis, growth tracking, seasonal comparison, YoY analysis
Returns: Periods with values, absolute/percentage changes, Excel formula
Cumulative sum with optional grouping.
Use for: Cumulative analysis, progress tracking, balance calculations, cash flow
Returns: Rows with running totals, Excel formula (SUM($B$2:B2))
Supports: Grouping (running total resets per group)
Smoothing with specified window size.
Use for: Trend detection, noise reduction, pattern identification
Returns: Rows with moving averages, Excel formula (AVERAGE(B1:B7))
Examples: 7-day moving average, 30-day stock price smoothing
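The arithmetic behind the three time-series tools is small enough to show directly. This sketch works on an already-ordered list of period values; the function names are hypothetical and grouping is left out.

```python
from itertools import accumulate


def period_growth(values):
    """(absolute change, percent change) versus the previous period."""
    changes = [(None, None)]  # the first period has nothing to compare against
    for prev, cur in zip(values, values[1:]):
        changes.append((cur - prev, (cur - prev) / prev * 100))
    return changes


def running_total(values):
    """Cumulative sum, like SUM($B$2:B2) filled down a column."""
    return list(accumulate(values))


def moving_average(values, window):
    """Trailing average; None until a full window is available."""
    return [
        sum(values[i - window + 1 : i + 1]) / window if i >= window - 1 else None
        for i in range(len(values))
    ]


monthly = [100.0, 120.0, 90.0, 135.0]
print(period_growth(monthly))
print(running_total(monthly))
print(moving_average(monthly, window=2))
```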
Rank by column value with top-N filtering.
Directions: desc (highest first), asc (lowest first)
Use for: Leaderboards, top/bottom analysis, percentile ranking
Returns: Ranked rows with rank numbers, Excel formula (RANK)
Supports: Top-N filtering, ranking within groups
Arithmetic expressions between columns.
Operations: +, -, *, /, parentheses
Use for: Derived metrics, financial calculations, ratio analysis, KPIs
Returns: Calculated values, Excel formula (e.g., =A2*B2)
Examples: Revenue = Price * Quantity, Margin = (Revenue - Cost) / Revenue
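One safe way to evaluate arithmetic expressions over column names is to walk the parsed syntax tree and allow only the listed operations. The sketch below takes that approach with Python's `ast` module; it is an assumption about how such a feature could work, not a description of the server's parser, and `evaluate` is a hypothetical name.

```python
import ast
import operator

# Only the four listed arithmetic operations are allowed.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression, row):
    """Evaluate +, -, *, / and parentheses over column names in one row."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return row[node.id]  # column lookup
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"Unsupported expression element: {ast.dump(node)}")

    return walk(ast.parse(expression, mode="eval").body)


row = {"Price": 4.0, "Quantity": 25, "Revenue": 200.0, "Cost": 150.0}
print(evaluate("Price * Quantity", row))
print(evaluate("(Revenue - Cost) / Revenue", row))
```

Anything outside the whitelist, such as function calls, raises `ValueError` instead of being executed. Column names containing spaces would need extra handling that this sketch omits.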
Currently Supported:
- XLS - Excel 97-2003 (read-only)
- XLSX - Excel 2007+ (read-only)
Planned:
- XLSM - Excel with macros support
- CSV - Comma-separated values
- TSV - Tab-separated values
- ODS - OpenDocument Spreadsheet
- Parquet - Columnar storage format
- Write operations - Modify spreadsheet files (create calculated columns, update values)
- SSE transport mode - Server-Sent Events for remote access
- Advanced formula generation - More complex Excel formulas with nested functions
- Data export - Export filtered/aggregated results to new files
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
This means:
- ✅ You can use, modify, and distribute this software
- ✅ You can use it for commercial purposes
- ⚠️ You must disclose source code when you distribute the software
- ⚠️ Network use is distribution: if you run a modified version on a server and let others interact with it, you must make the source code available
- ⚠️ Modifications must be released under the same license
See the LICENSE file for the full license text.
AGPL-3.0 ensures that improvements to this software benefit the entire community. If you modify this server and deploy it as a service, you must share your improvements with your users.
If this project saved you time or money, consider supporting it!
Every contribution helps keep this project alive and growing.
One-time Donation • Monthly Support
| Currency | Network | Address |
|---|---|---|
| USDT | TRC20 | TSVtgRc9pkC1UgcbVeijBHjFmpkYHDRu26 |
| BTC | Bitcoin | 12GZqxqpcBsqJ4Vf1YreLqwoMGvzBPgJq6 |
| ETH | Ethereum | 0xc86eab3bba3bbaf4eb5b5fff8586f1460f1fd395 |
| SOL | Solana | 9amykF7KibZmdaw66a1oqYJyi75fRqgdsqnG66AK3jvh |
| TON | TON | UQBVh8T1H3GI7gd7b-_PPNnxHYYxptrcCVf3qQk5v41h3QTM |
Contributions are welcome! Please ensure:
- All dependencies are AGPL-compatible
- Code follows the existing style
- Tests are included for new features
- Documentation is updated
For issues, questions, or contributions, please open an issue on GitHub.
Got questions? Found a bug? Have a feature idea? We're here to help!
Whether you're stuck with installation, found something broken, or just want to suggest an improvement, GitHub Issues is the place. Don't worry if you're new to GitHub: just open the Issues page and describe your situation. We'll figure it out together.