Commit 9ea00c0

Create spec directory (vibe-kanban bb30cf94)
/directory-to-spec Symlink the spec directory in ../docs using a relative symlink and update mkdocs.yml.
1 parent c7233fb commit 9ea00c0

10 files changed

Lines changed: 578 additions & 0 deletions

docs/spec

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
../spec

mkdocs.yml

Lines changed: 9 additions & 0 deletions
@@ -48,6 +48,15 @@ nav:
   - Getting Started: getting-started.md
   - Commands: commands.md
   - Architecture: architecture.md
+  - Specifications:
+    - Code Parsing: spec/code-parsing.md
+    - Description Storage: spec/description-storage.md
+    - Staleness Tracking: spec/staleness-tracking.md
+    - LLM Integration: spec/llm-integration.md
+    - Configuration Management: spec/configuration-management.md
+    - Git Integration: spec/git-integration.md
+    - Hierarchical Scopes: spec/hierarchical-scopes.md
+    - CLI Interface: spec/cli-interface.md
 
 markdown_extensions:
   - pymdownx.highlight:

spec/cli-interface.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
# CLI Interface

## Overview

The CLI interface provides command-line access to all code-lod operations using Typer. Each command is implemented in a separate module for maintainability.

## Requirements

### MUST

- The system MUST provide commands: init, generate, status, validate, read, update, clean, config, hooks
- Each command MUST be implemented in a separate file under `cli/`
- The system MUST use Typer for command parsing and help text
- The system MUST auto-detect the project root from the current directory
- The system MUST provide clear error messages for common failure cases

### Command Descriptions

**init**: Initialize code-lod in a project directory
- Creates `.code-lod` directory structure
- Creates default `config.json`

**generate**: Generate descriptions for code entities
- Parses source files
- Generates descriptions via LLM
- Stores in database and `.lod` files

**status**: Check freshness status of descriptions
- Shows total, fresh, and stale counts
- Lists stale entries

**validate**: Validate descriptions
- Checks for stale descriptions
- Can fail with exit code 1 if stale entries found

**read**: Output descriptions in LLM-consumable format
- Retrieves descriptions from storage
- Formats for LLM input

**update**: Update stale descriptions
- Regenerates only stale entries
- Updates database and `.lod` files

**clean**: Clean all code-lod data
- Removes `.code-lod` directory
- Removes all `.lod` files

**config**: Configuration management
- View and edit configuration
- Set provider and model options

**hooks**: Git hooks management
- install: Install pre-commit hook
- uninstall: Remove installed hooks

### SHOULD

- Commands SHOULD support common options (verbose, quiet, etc.)
- Commands SHOULD provide helpful output for success and failure cases

### MAY

- The system MAY add additional commands in the future
- The system MAY support shell completion for commands

## Implementation

### CLI Structure

```
cli/
├── __init__.py      # Main app registration
├── init.py          # Initialize code-lod
├── generate.py      # Generate descriptions
├── status.py        # Check freshness
├── validate.py      # Validate descriptions
├── read.py          # Output descriptions
├── update.py        # Update stale descriptions
├── clean.py         # Clean all data
├── config.py        # Configuration management
└── hooks.py         # Git hooks
```

### Main App (`cli/__init__.py`)

- Creates the main Typer app
- Registers all sub-commands
- Provides top-level help and version info

### Command Pattern

Each command module:
- Defines one or more Typer functions
- Uses `get_paths()` to find project root
- Handles errors with appropriate exit codes
- Provides user-friendly output via `typer.echo()`

spec/code-parsing.md

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
# Code Parsing

## Overview

The code parsing feature extracts code entities (functions, classes, modules) from source files using Tree-sitter parsers. It computes AST hashes for each entity to enable change detection and staleness tracking.

## Requirements

### MUST

- The parser MUST extract all functions, classes, and module-level entities from source files
- The parser MUST compute AST hashes for each extracted entity using normalized source code
- The parser MUST support Python, JavaScript, TypeScript, Go, Rust, Java, C, C++, Ruby, PHP, C#, Scala, Bash, YAML, JSON, TOML, and Markdown
- The parser MUST provide a file extension to language name mapping
- The parser MUST detect the programming language from file extensions automatically
- The base parser interface MUST be implemented as an abstract base class
- Each parsed entity MUST include: scope, name, location (path, start_line, end_line), source code, AST hash, language, and optional parent name
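
The entity record and the extension mapping required above could look roughly like the sketch below. The field names follow the requirement list; the `Location` class, the `EXTENSION_TO_LANGUAGE` table, and the `detect_language` helper are hypothetical names chosen for illustration, and the mapping covers only a few of the required languages.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class Location:
    """Position of an entity in a source file (hypothetical helper type)."""
    path: str
    start_line: int
    end_line: int


@dataclass(frozen=True)
class ParsedEntity:
    """One extracted entity, with the fields the MUST list enumerates."""
    scope: str                      # e.g. "function", "class", "module"
    name: str
    location: Location
    source: str
    ast_hash: str                   # e.g. "sha256:<hexdigest>"
    language: str
    parent_name: Optional[str] = None


# Illustrative subset only; the real table would cover every supported language.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".go": "go",
    ".rs": "rust",
}


def detect_language(path: str) -> Optional[str]:
    """Return the language name for a file path, or None if the extension is unknown."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
```

Freezing both dataclasses keeps parsed entities hashable and safe to cache, which fits the optional caching requirement, though the spec does not mandate immutability.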

### SHOULD

- The parser SHOULD normalize source code before hashing to ignore cosmetic changes (comments, whitespace)
- The parser SHOULD extract parent names for nested entities (methods in classes)
- The parser SHOULD handle language-specific node types for functions and classes

### MAY

- The parser MAY support additional languages via Tree-sitter language pack
- The parser MAY cache parsed entities for performance

## Implementation

### BaseParser Interface

Abstract base class defining:
- `language` property: Returns the language name
- `parse_file(path)`: Parses a file and returns list of ParsedEntity
- `parse_module(source, path)`: Parses a module as a whole
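
A minimal sketch of this interface using `abc` is shown below. The return types are assumptions (plain dicts stand in for `ParsedEntity`), and `WholeFileParser` is a hypothetical toy subclass that exists only to show how a concrete parser satisfies the contract.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, List

Entity = Dict[str, Any]  # stand-in for ParsedEntity in this sketch


class BaseParser(ABC):
    """Abstract parser contract; subclasses must provide all three members."""

    @property
    @abstractmethod
    def language(self) -> str:
        """Name of the language this parser handles."""

    @abstractmethod
    def parse_file(self, path: Path) -> List[Entity]:
        """Parse a file and return its entities."""

    @abstractmethod
    def parse_module(self, source: str, path: Path) -> Entity:
        """Parse the source as one module-level entity."""


class WholeFileParser(BaseParser):
    """Toy implementation: treats each file as a single module entity."""

    @property
    def language(self) -> str:
        return "text"

    def parse_file(self, path: Path) -> List[Entity]:
        source = path.read_text(encoding="utf-8")
        return [self.parse_module(source, path)]

    def parse_module(self, source: str, path: Path) -> Entity:
        line_count = len(source.splitlines())
        return {
            "scope": "module",
            "name": path.stem,
            "path": str(path),
            "start_line": 1,
            "end_line": max(line_count, 1),
            "source": source,
            "language": self.language,
        }
```

Because every member is abstract, instantiating `BaseParser` directly raises `TypeError`, so an incomplete parser fails at construction time rather than at first use.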

### TreeSitterParser

Concrete implementation using Tree-sitter:
- Maintains language-specific node type mappings for functions and classes
- Traverses the AST to extract entities with proper parent relationships
- Uses tree-sitter-language-pack for dynamic language loading

### Hash Computation

- Normalizes source by stripping comments and normalizing whitespace
- Computes SHA-256 hash prefixed with "sha256:"
- Hashes are used for change detection and staleness tracking
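
The hashing steps can be sketched as follows. This is a text-based approximation, not the Tree-sitter-driven normalization the spec implies: it assumes `#` line comments, strips them with a naive regular expression (which would also cut a `#` inside a string literal), and collapses all whitespace, including indentation. The function names are hypothetical.

```python
import hashlib
import re


def normalize_source(source: str) -> str:
    """Drop '#' comments and blank lines, and collapse runs of whitespace."""
    normalized = []
    for line in source.splitlines():
        line = re.sub(r"#.*$", "", line)   # naive: ignores string literals
        line = " ".join(line.split())      # collapse indentation and spacing
        if line:
            normalized.append(line)
    return "\n".join(normalized)


def compute_ast_hash(source: str) -> str:
    """Return the SHA-256 digest of the normalized source, prefixed with 'sha256:'."""
    digest = hashlib.sha256(normalize_source(source).encode("utf-8")).hexdigest()
    return f"sha256:{digest}"
```

With this normalization, adding a comment or reflowing whitespace leaves the hash unchanged, while any change to the remaining tokens produces a different hash and would mark the description stale.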

spec/configuration-management.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
# Configuration Management

## Overview

The configuration management feature handles project configuration, provider settings, and model selection per scope. Configuration is stored in `.code-lod/config.json` and manages paths relative to the project root.

## Requirements

### MUST

- The system MUST store configuration in `.code-lod/config.json`
- The system MUST auto-detect the project root by searching for the `.code-lod` directory
- The configuration MUST support: languages list, auto_update flag, fail_on_stale flag, provider selection, and per-provider model settings
- Model settings MUST support default and scope-specific models (project, package, module, class, function)
- The system MUST provide standard paths: code_lod_dir, lod_dir, config_file, hash_db
- The system MUST validate hash format in `@lod` comments

### SHOULD

- The system SHOULD provide default configuration when config file doesn't exist
- The system SHOULD handle configuration errors gracefully by falling back to defaults
- The system SHOULD allow querying model configuration for specific scopes and providers

### MAY

- The system MAY support additional configuration options in the future
- The system MAY provide configuration validation and schema checking

## Implementation

### Config Model

Pydantic BaseModel with fields:
- `languages`: List of supported languages (default: ["python"])
- `auto_update`: Whether to auto-update descriptions (default: false)
- `fail_on_stale`: Whether to fail validation on stale descriptions (default: false)
- `provider`: LLM provider to use (default: Provider.MOCK)
- `model_settings`: Dict mapping Provider to ModelConfig

### ModelConfig Model

Pydantic BaseModel for per-provider model settings:
- `default`: Default model for the provider
- `project`: Model for PROJECT scope
- `package`: Model for PACKAGE scope
- `module`: Model for MODULE scope
- `class_`: Model for CLASS scope
- `function`: Model for FUNCTION scope
- `get_model_for_scope(scope)`: Method to retrieve model for a specific scope
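
The scope lookup might be sketched as below. To stay dependency-free, this sketch swaps the Pydantic `BaseModel` for a standard-library dataclass. The `Scope` enum values, the optional scope fields, and the fallback to `default` when no scope-specific model is set are all assumptions, and the model names in the usage note are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(str, Enum):
    """Assumed scope names, mirroring the field list above."""
    PROJECT = "project"
    PACKAGE = "package"
    MODULE = "module"
    CLASS = "class"
    FUNCTION = "function"


@dataclass
class ModelConfig:
    """Per-provider model settings (dataclass stand-in for the Pydantic model)."""
    default: str
    project: Optional[str] = None
    package: Optional[str] = None
    module: Optional[str] = None
    class_: Optional[str] = None     # trailing underscore avoids the `class` keyword
    function: Optional[str] = None

    def get_model_for_scope(self, scope: Scope) -> str:
        """Return the scope-specific model, falling back to the provider default."""
        field_name = "class_" if scope is Scope.CLASS else scope.value
        return getattr(self, field_name) or self.default
```

For example, `ModelConfig(default="small-model", function="tiny-model")` would resolve `Scope.FUNCTION` to `"tiny-model"` and every other scope to `"small-model"`.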

### Paths Dataclass

Frozen dataclass with path management:
- `root_dir`: Project root directory
- `code_lod_dir`: `.code-lod` directory
- `lod_dir`: `.code-lod/.lod` directory
- `config_file`: `.code-lod/config.json`
- `hash_db`: `.code-lod/hash-index.db`
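
One way to realize this layout is a frozen dataclass that stores only `root_dir` and derives the rest. Exposing the derived paths as properties rather than stored fields is a design assumption of this sketch; the spec only fixes the names and locations.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Paths:
    """Standard code-lod paths, all derived from the project root."""
    root_dir: Path

    @property
    def code_lod_dir(self) -> Path:
        return self.root_dir / ".code-lod"

    @property
    def lod_dir(self) -> Path:
        return self.code_lod_dir / ".lod"

    @property
    def config_file(self) -> Path:
        return self.code_lod_dir / "config.json"

    @property
    def hash_db(self) -> Path:
        return self.code_lod_dir / "hash-index.db"
```

Deriving every path from a single field means the layout cannot drift out of sync, and `frozen=True` rejects accidental reassignment of `root_dir` after construction.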

### Configuration Functions

- `find_project_root(start_path)`: Searches upward for `.code-lod` directory
- `get_paths(root_dir)`: Returns Paths object for the project
- `load_config(paths)`: Loads configuration from file or returns defaults
- `save_config(config, paths)`: Saves configuration to file
- `get_model_for_scope(config, provider, scope)`: Retrieves configured model
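
The upward search in `find_project_root` could be sketched like this. Returning `None` when no `.code-lod` directory is found is an assumption; the spec does not say whether the real function returns a sentinel or raises.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start_path: Path) -> Optional[Path]:
    """Walk from start_path up to the filesystem root looking for `.code-lod`."""
    current = start_path.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".code-lod").is_dir():
            return candidate
    return None  # assumption: the caller decides how to report "not initialized"
```

Checking the starting directory before its parents lets a command run from the project root or from any nested source directory with the same result.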

spec/description-storage.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
# Description Storage

## Overview

The description storage feature provides a dual storage system for code descriptions: a SQLite database for metadata and caching, and `.lod` files alongside source code with structured `@lod` comments for human readability.

## Requirements

### MUST

- The system MUST maintain a SQLite database at `.code-lod/hash-index.db` for hash-to-description mapping
- The database MUST store: hash, description, stale status, created_at, updated_at, and hash_history
- The system MUST create `.lod` files alongside source files to store descriptions
- `.lod` files MUST use structured `@lod` comments with hash, stale status, and description
- The system MUST support reading and writing `.lod` files
- The system MUST parse `@lod` comments to extract hash, stale, and description fields
- The database MUST support CRUD operations: get, set, mark_stale, mark_fresh, delete
- Database connections MUST use context managers for proper cleanup

### SHOULD

- `.lod` files SHOULD include function/class signatures for readability
- `.lod` files SHOULD preserve module-level descriptions
- The writer SHOULD format comments appropriately for the programming language

### MAY

- The system MAY support additional storage backends in the future
- The system MAY compress descriptions in the database for large codebases

## Implementation

### SQLite Database (HashIndex)

Table schema:
```sql
CREATE TABLE descriptions (
    hash TEXT PRIMARY KEY,
    description TEXT NOT NULL,
    stale BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    hash_history TEXT DEFAULT '[]'
)
```

Operations:
- `get(hash_)`: Retrieve a description record
- `set(hash_, description, stale, hash_history)`: Create or update a record
- `mark_stale(hash_)`: Mark a description as stale
- `mark_fresh(hash_)`: Mark a description as fresh
- `get_all_stale()`: Retrieve all stale records
- `delete(hash_)`: Remove a record
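
A compact sketch of these operations with the standard-library `sqlite3` module follows. The method names match the list above, but the internals are assumptions: records are returned as plain dicts, `hash_history` is stored as a JSON string, `set` uses an upsert (which needs SQLite 3.24 or newer), and the private helpers are hypothetical.

```python
import json
import sqlite3
from contextlib import closing
from typing import Any, Dict, List, Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS descriptions (
    hash TEXT PRIMARY KEY,
    description TEXT NOT NULL,
    stale BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    hash_history TEXT DEFAULT '[]'
)
"""

UPSERT = """
INSERT INTO descriptions (hash, description, stale, hash_history)
VALUES (?, ?, ?, ?)
ON CONFLICT(hash) DO UPDATE SET
    description = excluded.description,
    stale = excluded.stale,
    hash_history = excluded.hash_history,
    updated_at = CURRENT_TIMESTAMP
"""


class HashIndex:
    """Hash-to-description store; every operation opens and closes its own connection."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        self._write(SCHEMA)

    def _connect(self) -> sqlite3.Connection:
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row  # rows become addressable by column name
        return conn

    def _write(self, sql: str, params: tuple = ()) -> None:
        # closing() releases the connection; the inner `conn` context commits or rolls back.
        with closing(self._connect()) as conn, conn:
            conn.execute(sql, params)

    def _read(self, sql: str, params: tuple = ()) -> List[Dict[str, Any]]:
        with closing(self._connect()) as conn:
            return [dict(row) for row in conn.execute(sql, params).fetchall()]

    def get(self, hash_: str) -> Optional[Dict[str, Any]]:
        rows = self._read("SELECT * FROM descriptions WHERE hash = ?", (hash_,))
        return rows[0] if rows else None

    def set(self, hash_: str, description: str, stale: bool = False,
            hash_history: Optional[List[str]] = None) -> None:
        self._write(UPSERT, (hash_, description, int(stale), json.dumps(hash_history or [])))

    def _set_stale(self, hash_: str, stale: bool) -> None:
        self._write(
            "UPDATE descriptions SET stale = ?, updated_at = CURRENT_TIMESTAMP WHERE hash = ?",
            (int(stale), hash_),
        )

    def mark_stale(self, hash_: str) -> None:
        self._set_stale(hash_, True)

    def mark_fresh(self, hash_: str) -> None:
        self._set_stale(hash_, False)

    def get_all_stale(self) -> List[Dict[str, Any]]:
        return self._read("SELECT * FROM descriptions WHERE stale = 1")

    def delete(self, hash_: str) -> None:
        self._write("DELETE FROM descriptions WHERE hash = ?", (hash_,))
```

Note that `sqlite3.Connection` used as a context manager only manages the transaction, not the connection lifetime, which is why the sketch wraps it in `contextlib.closing` to meet the cleanup requirement.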

### .lod Files

Structure:
- Module-level description at the top (optional)
- Entity descriptions with `@lod` annotations

Comment format:
```
# @lod hash:sha256:<hexdigest> stale:true/false
# @lod description:<description text>
<class_or_function_signature>
```
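
Parsing this comment format might look like the sketch below. It assumes `#` as the comment prefix, a 64-character lowercase hex digest, and that the signature sits on the line directly after the description; `parse_lod` and the dict keys are hypothetical names, not the `LodReader` API.

```python
import re
from typing import Any, Dict, List

# Header line: "# @lod hash:sha256:<64 hex chars> stale:true|false"
HEADER_RE = re.compile(r"^\s*#\s*@lod\s+hash:(sha256:[0-9a-f]{64})\s+stale:(true|false)\s*$")
# Description line: "# @lod description:<text>"
DESCRIPTION_RE = re.compile(r"^\s*#\s*@lod\s+description:(.*)$")


def parse_lod(text: str) -> List[Dict[str, Any]]:
    """Extract hash, stale flag, description, and signature from @lod comment pairs."""
    lines = text.splitlines()
    entries: List[Dict[str, Any]] = []
    i = 0
    while i < len(lines):
        header = HEADER_RE.match(lines[i])
        description = DESCRIPTION_RE.match(lines[i + 1]) if header and i + 1 < len(lines) else None
        if header and description:
            signature = lines[i + 2].strip() if i + 2 < len(lines) else ""
            entries.append({
                "hash": header.group(1),
                "stale": header.group(2) == "true",
                "description": description.group(1).strip(),
                "signature": signature,
                "line": i + 1,  # 1-based line number of the header comment
            })
            i += 3
        else:
            i += 1  # skip malformed or unrelated lines
    return entries
```

Anchoring the hash pattern to `sha256:` plus exactly 64 hex characters means malformed hashes are skipped rather than stored, which is one simple way to enforce hash-format validation at read time.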

### LodReader

Parses `.lod` files and extracts:
- Scope (function, class, module)
- Name
- Hash, stale status, description
- Signature
- Line numbers

### LodWriter

Writes `.lod` files with:
- Module description header
- Entity descriptions with signatures
- Language-appropriate comment syntax

spec/git-integration.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
# Git Integration

## Overview

The Git integration feature provides pre-commit hooks to ensure code descriptions stay fresh. It automatically validates descriptions before commits, preventing stale code documentation from being committed.

## Requirements

### MUST

- The system MUST support installing pre-commit hooks
- The system MUST support uninstalling hooks
- The installed hook MUST run `code-lod validate --fail-on-stale`
- The hook script MUST be executable (chmod 0o755)
- The hook MUST be installed in `.git/hooks/`
- The system MUST verify that code-lod is initialized before installing hooks
- The system MUST verify that the directory is a git repository before installing hooks

### SHOULD

- The system SHOULD support additional hook types (e.g., pre-push)
- The system SHOULD provide clear error messages when initialization or git repository checks fail

### MAY

- The system MAY support hook customization (e.g., different validation commands)
- The system MAY integrate with other hook managers (e.g., pre-commit framework)

## Implementation

### install_hook Function

Creates a git hook script:
1. Validates code-lod is initialized (checks for `.code-lod` directory)
2. Validates the directory is a git repository (checks for `.git/hooks`)
3. Creates the hook script with appropriate content
4. Sets executable permissions (0o755)
5. Reports success to the user

Hook script template:
```bash
#!/bin/sh
# code-lod {hook_type} hook
code-lod validate --fail-on-stale
```
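
The installation steps can be sketched with `pathlib` alone. This is an illustration under assumptions: the signature and the `HookError` exception are hypothetical, and the sketch raises instead of exiting so that the CLI layer can translate the error into exit status 1 and print the message.

```python
from pathlib import Path

HOOK_TEMPLATE = """#!/bin/sh
# code-lod {hook_type} hook
code-lod validate --fail-on-stale
"""


class HookError(Exception):
    """Raised when a precondition for installing a hook is not met (hypothetical)."""


def install_hook(root_dir: Path, hook_type: str = "pre-commit") -> Path:
    """Write an executable hook script into .git/hooks and return its path."""
    # Step 1: code-lod must be initialized in this project.
    if not (root_dir / ".code-lod").is_dir():
        raise HookError("code-lod is not initialized in this directory")
    # Step 2: the project must be a git repository.
    hooks_dir = root_dir / ".git" / "hooks"
    if not hooks_dir.is_dir():
        raise HookError("not a git repository (no .git/hooks directory)")
    # Steps 3 and 4: write the script and make it executable.
    hook_path = hooks_dir / hook_type
    hook_path.write_text(HOOK_TEMPLATE.format(hook_type=hook_type), encoding="utf-8")
    hook_path.chmod(0o755)
    return hook_path
```

Returning the written path leaves the success message (step 5) to the caller, which keeps the function free of output concerns and easy to test. Note that this sketch overwrites any existing hook of the same name, a case the spec does not address.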

### uninstall_hook Function

Removes the git hook:
1. Validates code-lod is initialized
2. Removes `.git/hooks/pre-commit` if it exists
3. Reports success or that no hook was found

### Error Handling

- Exits with status code 1 if code-lod is not initialized
- Exits with status code 1 if not in a git repository
- Writes user-friendly error messages to stderr via `typer.echo(..., err=True)`
