From e29e6e41f484374cb648fa02b0a89db0a3ef488d Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 26 Jan 2026 08:23:23 +0000
Subject: [PATCH 1/4] Initial plan

From 9a0e970f83fc2e2e28f84457337aa9cb14d24caa Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 26 Jan 2026 08:26:53 +0000
Subject: [PATCH 2/4] Add DUPLICATION_ANALYSIS.md and update refactoring checklist

Co-authored-by: MrIbrahem <26301308+MrIbrahem@users.noreply.github.com>
---
 Doc/DUPLICATION_ANALYSIS.md | 221 ++++++++++++++++++++++++++++++++++++
 refactor.md                 |  51 +++++----
 2 files changed, 252 insertions(+), 20 deletions(-)
 create mode 100644 Doc/DUPLICATION_ANALYSIS.md

diff --git a/Doc/DUPLICATION_ANALYSIS.md b/Doc/DUPLICATION_ANALYSIS.md
new file mode 100644
index 0000000..fa6514a
--- /dev/null
+++ b/Doc/DUPLICATION_ANALYSIS.md
@@ -0,0 +1,221 @@
+# Code Duplication Analysis: `get_text` and Related Methods
+
+## Executive Summary
+
+This document analyzes the duplicate functionality between the **legacy** `MainPage` class in `mw_api/super/S_Page/super_page.py` and the **new** refactored architecture in `mw_api/repositories/page_repository.py` and related modules.
+
+**Key Finding:** The duplications are **intentional** and part of an ongoing refactoring effort. The new `repositories/` and `core/` modules are designed to eventually replace the legacy implementation in `super/S_Page/`.
+
+---
+
+## 1. Identified Duplications
+
+### 1.1 `get_text` Method
+
+| Aspect | Legacy (`MainPage.get_text`) | New (`PageRepository.get_text`) |
+|--------|------------------------------|--------------------------------|
+| **Location** | `mw_api/super/S_Page/super_page.py:186` | `mw_api/repositories/page_repository.py:33` |
+| **Lines** | ~70 lines | ~30 lines |
+| **Signature** | `def get_text(self, redirects=False)` | `def get_text(self, page: Page) -> str` |
+| **Side Effects** | Updates `self.text`, `self.user`, `self.ns`, `self.meta`, `self.revisions_data` | None (pure function) |
+| **Testability** | Hard (requires full `MainPage` setup) | Easy (mock `LoginBotProtocol`) |
+| **Type Hints** | Minimal | Full |
+
+**Recommendation:** Keep both during migration. Eventually, `MainPage.get_text()` should delegate to `PageRepository.get_text()`.
+
+---
+
+### 1.2 `get_categories` Method
+
+| Aspect | Legacy (`MainPage.get_categories`) | New (`PageRepository.get_categories`) |
+|--------|-----------------------------------|--------------------------------------|
+| **Location** | `super_page.py:496` | `page_repository.py:182` |
+| **Implementation** | Returns cached data from `get_infos()` | Makes direct API call |
+| **Side Effects** | Lazy loads via `get_infos()` | None |
+
+**Recommendation:** The implementations differ in approach. Legacy uses batch loading; new uses individual queries. Consider merging the lazy-loading pattern into a service layer.
+
+---
+
+### 1.3 `get_langlinks` Method
+
+| Aspect | Legacy | New |
+|--------|--------|-----|
+| **Location** | `super_page.py:514` | `page_repository.py:263` |
+| **Implementation** | Returns cached `self.langlinks` | Makes direct API call |
+
+**Recommendation:** Same as `get_categories` - consider a caching service wrapper.
+
+---
+
+### 1.4 `get_templates` Method
+
+| Aspect | Legacy (`get_templates_API`) | New (`get_templates`) |
+|--------|------------------------------|----------------------|
+| **Location** | `super_page.py:521` | `page_repository.py:209` |
+| **Implementation** | Returns cached data | Makes direct API call |
+
+---
+
+### 1.5 `save` Method
+
+| Aspect | Legacy (`MainPage.save`) | New (`PageRepository.save`) |
+|--------|--------------------------|----------------------------|
+| **Location** | `super_page.py:657` | `page_repository.py:118` |
+| **Lines** | ~90 lines | ~40 lines |
+| **Features** | Diff display, user prompt, error handling, state updates | Basic API call |
+
+**Recommendation:** The legacy `save` has important user interaction features. Create a `PageEditor` service that combines `PageRepository.save` with the UX features.
+
+---
+
+### 1.6 `page_exists` / `exists`
+
+| Aspect | Legacy | New |
+|--------|--------|-----|
+| **Location** | `super_page.py:634` | `page_repository.py:159` |
+| **Implementation** | Lazy loads via `get_text()` | Direct query |
+
+---
+
+## 2. Architecture Comparison
+
+### 2.1 Legacy Architecture (super/S_Page/)
+
+```
+MainPage (God Object - 987 lines)
+├── Inherits from PAGE_APIS, ASK_BOT
+├── Owns login_bot reference
+├── Mixes data access, business logic, and UI concerns
+├── Uses instance variables for caching
+└── Side effects on every operation
+```
+
+**Files:**
+- `super/S_Page/super_page.py` - Main god object
+- `super/S_Page/bot.py` - API helpers
+- `super/S_Page/data.py` - Data classes
+- `super/S_Page/ar_err.py` - Arabic error handling
+
+### 2.2 New Architecture (repositories/, core/, services/)
+
+```
+Page (Dataclass - immutable)
+├── core/page.py - Pure data entity
+├── repositories/page_repository.py - Data access only
+├── services/template_service.py - Template parsing
+├── services/edit_validator.py - Edit permission logic
+└── core/protocols.py - Interfaces for dependency injection
+```
+
+**Advantages:**
+- Testable (dependency injection via protocols)
+- Type-safe (full type hints)
+- Single Responsibility Principle
+- No side effects in data access
+
+---
+
+## 3. Refactoring Status (from refactor.md)
+
+The `refactor.md` checklist shows the following relevant items:
+
+### Completed ✅
+- [x] `mw_api/core/config.py` - `BotConfig` dataclass
+- [x] `mw_api/core/page.py` - `Page` and `PageMetadata` dataclasses
+- [x] `mw_api/repositories/page_repository.py` - `PageRepository` class
+- [x] `mw_api/services/template_service.py` - `TemplateService` class
+- [x] `mw_api/services/edit_validator.py` - `EditValidator` class
+- [x] `mw_api/core/protocols.py` - Interface definitions
+- [x] `tests/test_repositories.py` - Unit tests for `PageRepository`
+
+### Pending ⏳
+- [ ] Update `MainPage` to delegate text retrieval to `PageRepository`
+- [ ] Update `MainPage` to delegate template operations to `TemplateService`
+- [ ] Update `MainPage` to delegate edit permission checks to `EditValidator`
+- [ ] Remove duplicate implementations once delegation is complete
+
+---
+
+## 4. Recommendations
+
+### 4.1 Short-Term (Do Not Remove)
+
+**DO NOT remove either implementation yet.** The legacy code is still in active use via `ALL_APIS.MainPage()`. Removing it would break existing users.
+
+### 4.2 Medium-Term (Integration Phase)
+
+Modify `MainPage.get_text()` to delegate to `PageRepository`:
+
+```python
+# In super_page.py
+def get_text(self, redirects=False):
+    """Retrieves the current wikitext content."""
+    # Create Page entity
+    page = Page(title=self.title, lang=self.lang, family=self.family)
+
+    # Delegate to repository
+    repository = PageRepository(self.login_bot)
+    self.text = repository.get_text(page)
+
+    # Update legacy instance attributes for backward compatibility
+    # ... existing metadata updates ...
+
+    return self.text
+```
+
+### 4.3 Long-Term (New API)
+
+Provide a new, clean API that uses the repository pattern directly:
+
+```python
+from mw_api.core.page import Page
+from mw_api.repositories import PageRepository
+
+# New clean usage
+repo = PageRepository(login_bot)
+page = Page(title="Example")
+text = repo.get_text(page)
+```
+
+---
+
+## 5. Files to Keep vs. Migrate
+
+| File | Status | Action |
+|------|--------|--------|
+| `repositories/page_repository.py` | **NEW** | Keep - This is the target architecture |
+| `core/page.py` | **NEW** | Keep - Core data entities |
+| `core/protocols.py` | **NEW** | Keep - Interface definitions |
+| `services/template_service.py` | **NEW** | Keep - Extracted from MainPage |
+| `services/edit_validator.py` | **NEW** | Keep - Extracted from botEdit.py |
+| `super/S_Page/super_page.py` | **LEGACY** | Keep for now, migrate incrementally |
+| `super/S_Page/bot.py` | **LEGACY** | Migrate to repositories/services |
+| `super/S_Page/data.py` | **LEGACY** | Replace with `core/page.py` dataclasses |
+
+---
+
+## 6. Testing Strategy
+
+| Component | Tests | Status |
+|-----------|-------|--------|
+| `PageRepository` | `tests/test_repositories.py` | ✅ Complete |
+| `TemplateService` | (needs tests) | ⏳ Pending |
+| `EditValidator` | (needs tests) | ⏳ Pending |
+| `MainPage` | `tests/test_MainPage.py` | ✅ Exists |
+
+---
+
+## 7. Conclusion
+
+The duplicate `get_text` methods exist as part of a planned refactoring effort:
+
+1. **`PageRepository.get_text`** is the **new, correct implementation** following clean architecture principles
+2. **`MainPage.get_text`** is the **legacy implementation** that will eventually delegate to the repository
+
+**Do not remove either file.** Instead, follow the migration path in `refactor.md` to gradually delegate legacy functionality to the new architecture.
+
+---
+
+*Analysis Date: 2026-01-26*
+*Related: See `refactor.md` for the complete refactoring roadmap*
diff --git a/refactor.md b/refactor.md
index 3f332f9..76ccd6c 100644
--- a/refactor.md
+++ b/refactor.md
@@ -805,18 +805,20 @@ def parse_api_error(error_dict: dict) -> Optional[ApiError]:
 
 ## 8. Refactoring Checklist
 
+> **Note**: See `Doc/DUPLICATION_ANALYSIS.md` for detailed analysis of duplicate methods between legacy and new implementations.
+
 ### Phase 1: Foundation
 #### Configuration & State
-- [ ] **Create File**: `mw_api/core/config.py`.
-    - [ ] Define `BotConfig` dataclass.
+- [x] **Create File**: `mw_api/core/config.py`. ✅ DONE
+    - [x] Define `BotConfig` dataclass.
 - [ ] **Edit File**: `mw_api/api_utils/botEdit.py`.
     - [ ] Import `BotConfig`.
     - [ ] Replace `sys.argv` checks with `BotConfig` checks.
 - [ ] **Edit File**: `mw_api/super/super_login.py`.
     - [ ] Import `BotConfig`.
     - [ ] Replace `sys.argv` checks with `BotConfig` checks.
-- [ ] **Create File**: `mw_api/core/container.py`.
-    - [ ] Implement `SessionManager` class.
+- [x] **Create File**: `mw_api/core/container.py`. ✅ DONE
+    - [x] Implement `SessionManager` class.
 - [ ] **Edit File**: `mw_api/super/bot.py`.
     - [ ] Remove global `seasons_by_lang`.
     - [ ] Remove global `users_by_lang`.
@@ -826,15 +828,24 @@ def parse_api_error(error_dict: dict) -> Optional[ApiError]:
 
 ### Phase 2: Decomposition
 #### Page Logic
-- [ ] **Create File**: `mw_api/core/page.py`.
-    - [ ] Define `Page` dataclass.
-    - [ ] Define `PageMetadata` dataclass.
-- [ ] **Create File**: `mw_api/repositories/page_repository.py`.
-    - [ ] Define `PageRepository` class.
-- [ ] **Create File**: `mw_api/services/template_service.py`.
-    - [ ] Define `TemplateService` class.
-- [ ] **Create File**: `mw_api/services/edit_validator.py`.
-    - [ ] Define `EditValidator` class.
+- [x] **Create File**: `mw_api/core/page.py`. ✅ DONE
+    - [x] Define `Page` dataclass.
+    - [x] Define `PageMetadata` dataclass.
+- [x] **Create File**: `mw_api/core/protocols.py`. ✅ DONE
+    - [x] Define `LoginBotProtocol` interface.
+    - [x] Define `SessionProtocol` interface.
+- [x] **Create File**: `mw_api/repositories/page_repository.py`. ✅ DONE
+    - [x] Define `PageRepository` class.
+    - [x] Implement `get_text()`, `get_page_info()`, `save()`, `page_exists()`.
+    - [x] Implement `get_categories()`, `get_templates()`, `get_links()`, `get_langlinks()`.
+- [x] **Create File**: `mw_api/services/template_service.py`. ✅ DONE
+    - [x] Define `TemplateService` class.
+- [x] **Create File**: `mw_api/services/edit_validator.py`. ✅ DONE
+    - [x] Define `EditValidator` class.
+- [x] **Create File**: `tests/test_repositories.py`. ✅ DONE
+    - [x] Unit tests for `PageRepository`.
+
+#### Pending: MainPage Migration
 - [ ] **Edit File**: `mw_api/super/S_Page/super_page.py`.
     - [ ] Remove `PAGE_APIS` inheritance.
     - [ ] Remove `ASK_BOT` inheritance.
@@ -845,10 +856,10 @@ def parse_api_error(error_dict: dict) -> Optional[ApiError]:
     - [ ] Delegate edit permission checks to `EditValidator`.
 
 #### API Client
+- [x] **Create File**: `mw_api/core/request_config.py`. ✅ DONE
+    - [x] Define `RequestConfig` dataclass.
 - [ ] **Create File**: `mw_api/api/client.py`.
     - [ ] Define `MediaWikiApiClient`.
-- [ ] **Create File**: `mw_api/core/request_config.py` (or inside `api/client.py`).
-    - [ ] Define `RequestConfig` dataclass.
 - [ ] **Edit File**: `mw_api/super/S_API/bot_api.py`.
     - [ ] Refactor usage to `MediaWikiApiClient`.
     - [ ] Refactor `post_params` to use `RequestConfig`.
@@ -877,12 +888,12 @@ def parse_api_error(error_dict: dict) -> Optional[ApiError]:
     - [ ] Implement `LiteDbRepository` class.
 
 ### Phase 3: Error Handling & Interfaces
-- [ ] **Create File**: `mw_api/core/exceptions.py`.
-    - [ ] Define `NewApiException`.
-    - [ ] Define `ApiError`, `AbuseFilterError`, `MaxLagError`, `ArticleExistsError`.
+- [x] **Create File**: `mw_api/core/exceptions.py`. ✅ DONE
+    - [x] Define `NewApiException`.
+    - [x] Define `ApiError`, `AbuseFilterError`, `MaxLagError`, `ArticleExistsError`.
+    - [x] Implement `parse_api_error` function.
 - [ ] **Edit File**: `mw_api/super/handel_errors.py`.
-    - [ ] Implement `parse_api_error` function.
-    - [ ] Update `handel_err` to return typed exceptions/results.
+    - [ ] Update `handel_err` to use the new exception types.
 
 ### Phase 4: Modernization
 - [ ] **Edit File**: `mw_api/super/S_Page/super_page.py`.

From cc9bbf742226e1c72536bd564e84c4fb64dcf9bc Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 26 Jan 2026 08:28:15 +0000
Subject: [PATCH 3/4] Fix code review comments in DUPLICATION_ANALYSIS.md

Co-authored-by: MrIbrahem <26301308+MrIbrahem@users.noreply.github.com>
---
 Doc/DUPLICATION_ANALYSIS.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/Doc/DUPLICATION_ANALYSIS.md b/Doc/DUPLICATION_ANALYSIS.md
index fa6514a..a68dcc7 100644
--- a/Doc/DUPLICATION_ANALYSIS.md
+++ b/Doc/DUPLICATION_ANALYSIS.md
@@ -33,7 +33,7 @@ This document analyzes the duplicate functionality between the **legacy** `MainP
 | **Implementation** | Returns cached data from `get_infos()` | Makes direct API call |
 | **Side Effects** | Lazy loads via `get_infos()` | None |
 
-**Recommendation:** The implementations differ in approach. Legacy uses batch loading; new uses individual queries. Consider merging the lazy-loading pattern into a service layer.
+**Recommendation:** The implementations differ in approach. Legacy uses batch loading; new uses individual queries. Consider adding a caching service wrapper for the new implementation.
 
 ---
 
@@ -44,7 +44,7 @@ This document analyzes the duplicate functionality between the **legacy** `MainP
 | **Location** | `super_page.py:514` | `page_repository.py:263` |
 | **Implementation** | Returns cached `self.langlinks` | Makes direct API call |
 
-**Recommendation:** Same as `get_categories` - consider a caching service wrapper.
+**Recommendation:** The implementations differ similarly to `get_categories`. Consider adding a caching service wrapper for the new implementation.
 
 ---
 
@@ -100,12 +100,12 @@ MainPage (God Object - 987 lines)
 ### 2.2 New Architecture (repositories/, core/, services/)
 
 ```
-Page (Dataclass - immutable)
-├── core/page.py - Pure data entity
-├── repositories/page_repository.py - Data access only
-├── services/template_service.py - Template parsing
-├── services/edit_validator.py - Edit permission logic
-└── core/protocols.py - Interfaces for dependency injection
+New Clean Architecture:
+├── core/page.py - Page dataclass (pure data entity)
+├── core/protocols.py - LoginBotProtocol interface
+├── repositories/page_repository.py - Data access layer (API calls)
+├── services/template_service.py - Template parsing logic
+└── services/edit_validator.py - Edit permission logic
 ```
 
 **Advantages:**

From a9e24a121efef1678dda5680d68649659cc2f9c8 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 26 Jan 2026 08:59:36 +0000
Subject: [PATCH 4/4] Add merge.md with comprehensive migration plan for super folder

Co-authored-by: MrIbrahem <26301308+MrIbrahem@users.noreply.github.com>
---
 merge.md | 411 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 411 insertions(+)
 create mode 100644 merge.md

diff --git a/merge.md b/merge.md
new file mode 100644
index 0000000..eea68ea
--- /dev/null
+++ b/merge.md
@@ -0,0 +1,411 @@
+# Migration Plan: `super/` Folder to New Architecture
+
+This document provides a comprehensive plan to migrate all functionality from the legacy `mw_api/super/` folder to the new clean architecture in `mw_api/core/`, `mw_api/repositories/`, `mw_api/services/`, and `mw_api/auth/`.
+
+---
+
+## Table of Contents
+
+1. [Overview](#1-overview)
+2. [Current State](#2-current-state)
+3. [Target Architecture](#3-target-architecture)
+4. [Migration Plan by Module](#4-migration-plan-by-module)
+5. [Execution Timeline](#5-execution-timeline)
+6. [Testing Strategy](#6-testing-strategy)
+7. [Deprecation and Removal](#7-deprecation-and-removal)
+
+---
+
+## 1. Overview
+
+### Goals
+- Migrate all functionality from `super/` to the new modular architecture
+- Maintain backward compatibility during migration
+- Improve testability, type safety, and separation of concerns
+- Eventually remove the `super/` folder entirely
+
+### Migration Principles
+1. **Incremental Migration**: Migrate one module at a time
+2. **Delegation First**: Legacy code delegates to new implementations
+3. **No Breaking Changes**: Existing API (`ALL_APIS`) remains functional
+4. **Test Coverage**: Add tests before migrating each component
+
+---
+
+## 2. Current State
+
+### Legacy Files in `super/` Folder
+
+| File | Lines | Classes | Status |
+|------|-------|---------|--------|
+| `super_login.py` | ~315 | `Login` | 🔄 Partially migrated |
+| `S_Page/super_page.py` | ~987 | `MainPage` | 🔄 Partially migrated |
+| `S_API/bot_api.py` | ~952 | `NEW_API` | ⏳ Pending |
+| `S_Category/bot.py` | ~376 | `CategoryDepth` | ⏳ Pending |
+| `handel_errors.py` | ~150 | `HANDEL_ERRORS` | ✅ Migrated to `core/exceptions.py` |
+| `bot.py` | ~150 | `LOGIN_HELPS` | ⏳ Pending |
+| `params_help.py` | ~80 | `PARAMS_HELPS` | ⏳ Pending |
+| `cookies_bot.py` | ~100 | Cookie handling | ⏳ Pending |
+| `login_wrap.py` | ~100 | `LoginWrapState` | ⏳ Pending |
+
+### New Architecture Files (Already Created)
+
+| Module | Files | Status |
+|--------|-------|--------|
+| `core/` | `page.py`, `config.py`, `protocols.py`, `exceptions.py`, `request_config.py`, `container.py`, `namespace.py` | ✅ Complete |
+| `repositories/` | `page_repository.py` | ✅ Complete |
+| `services/` | `template_service.py`, `edit_validator.py` | ✅ Complete |
+| `auth/` | `authenticator.py`, `token_provider.py` | ✅ Complete |
+
+---
+
+## 3. Target Architecture
+
+```
+mw_api/
+├── core/                        # Core entities and interfaces
+│   ├── page.py                  # Page, PageMetadata dataclasses
+│   ├── config.py                # BotConfig
+│   ├── protocols.py             # LoginBotProtocol, SessionProtocol
+│   ├── exceptions.py            # Exception hierarchy
+│   ├── request_config.py        # RequestConfig
+│   ├── container.py             # SessionManager (DI container)
+│   └── namespace.py             # NamespaceRegistry
+│
+├── repositories/                # Data access layer
+│   ├── page_repository.py       # PageRepository
+│   ├── category_repository.py   # CategoryRepository (NEW)
+│   └── api_repository.py        # ApiRepository (NEW)
+│
+├── services/                    # Business logic
+│   ├── template_service.py      # TemplateService
+│   ├── edit_validator.py        # EditValidator
+│   ├── category_service.py      # CategoryService (NEW)
+│   └── search_service.py        # SearchService (NEW)
+│
+├── auth/                        # Authentication
+│   ├── authenticator.py         # Authenticator
+│   ├── token_provider.py        # TokenProvider
+│   └── session_manager.py       # SessionManager (NEW)
+│
+├── api/                         # API client layer (NEW)
+│   ├── client.py                # MediaWikiClient
+│   ├── queries.py               # Query builders
+│   └── token_manager.py         # Token management
+│
+└── super/                       # LEGACY (to be removed)
+    └── ... (deprecated)
+```
+
+---
+
+## 4. Migration Plan by Module
+
+### 4.1 `super_login.py` → `auth/`
+
+**Current State**: Partially migrated. `Authenticator` and `TokenProvider` exist.
+
+**Remaining Tasks**:
+
+| Task | Target File | Priority |
+|------|-------------|----------|
+| Migrate cookie handling | `auth/cookie_store.py` | High |
+| Migrate session management | `auth/session_manager.py` | High |
+| Update `Login` to delegate to new classes | `super_login.py` | Medium |
+| Add tests | `tests/test_auth.py` | Medium |
+
+**Steps**:
+1. [ ] Create `auth/cookie_store.py`:
+   ```python
+   class CookieStore:
+       def save_cookies(self, session: requests.Session, path: str) -> None: ...
+       def load_cookies(self, session: requests.Session, path: str) -> bool: ...
+   ```
+
+2. [ ] Create `auth/session_manager.py`:
+   ```python
+   class SessionManager:
+       def get_session(self, key: str) -> requests.Session: ...
+       def invalidate(self, key: str) -> None: ...
+   ```
+
+3. [ ] Update `Login.__init__()` to use new components
+4. [ ] Add deprecation warnings to legacy methods
+5. [ ] Write unit tests for new auth components
+
+---
+
+### 4.2 `S_Page/super_page.py` → `repositories/` + `services/`
+
+**Current State**: `PageRepository`, `TemplateService`, `EditValidator` exist.
+
+**Remaining Tasks**:
+
+| Method | Target | Priority |
+|--------|--------|----------|
+| `get_text()` | Delegate to `PageRepository.get_text()` | High |
+| `save()` | Delegate to `PageRepository.save()` | High |
+| `get_categories()` | Delegate to `PageRepository.get_categories()` | Medium |
+| `get_langlinks()` | Delegate to `PageRepository.get_langlinks()` | Medium |
+| `get_templates()` | Delegate to `PageRepository.get_templates()` | Medium |
+| `can_edit()` | Delegate to `EditValidator.can_edit()` | Medium |
+| `Get_tags()` | Delegate to `TemplateService` | Low |
+| `get_text_html()` | Add to `PageRepository` | Low |
+| `create()` | Add to `PageRepository` | Low |
+| `purge()` | Add to `PageRepository` | Low |
+
+**Steps**:
+1. [ ] Add missing methods to `PageRepository`:
+   - `get_text_html()`
+   - `create_page()`
+   - `purge()`
+
+2. [ ] Update `MainPage.get_text()` to delegate:
+   ```python
+   def get_text(self, redirects=False):
+       from ..repositories import PageRepository
+       from ..core.page import Page
+
+       page = Page(title=self.title, lang=self.lang, family=self.family)
+       repo = PageRepository(self.login_bot)
+       self.text = repo.get_text(page)
+       # Update legacy attributes for backward compatibility
+       return self.text
+   ```
+
+3. [ ] Repeat for all other methods
+4. [ ] Add deprecation warnings to legacy direct implementations
+5. [ ] Add tests for new methods
+
+---
+
+### 4.3 `S_API/bot_api.py` → `repositories/` + `services/`
+
+**Current State**: Not migrated.
+
+**New Files Needed**:
+
+| File | Purpose |
+|------|---------|
+| `repositories/api_repository.py` | Low-level API queries |
+| `services/search_service.py` | Search functionality |
+| `services/page_info_service.py` | Page info queries |
+
+**Methods to Migrate**:
+
+| Method | Target | Priority |
+|--------|--------|----------|
+| `Find_pages_exists_or_not()` | `PageRepository.check_pages_exist()` | High |
+| `Get_All_pages()` | `ApiRepository.get_all_pages()` | High |
+| `Search()` | `SearchService.search()` | High |
+| `Get_Newpages()` | `ApiRepository.get_recent_pages()` | Medium |
+| `UserContribs()` | `ApiRepository.get_user_contributions()` | Medium |
+| `Get_langlinks_for_list()` | `PageRepository.get_langlinks_batch()` | Medium |
+| `get_revisions()` | `PageRepository.get_revisions()` | Medium |
+| `Get_template_pages()` | `TemplateService.get_transclusions()` | Low |
+| `Get_image_url()` | `ApiRepository.get_image_url()` | Low |
+
+**Steps**:
+1. [ ] Create `repositories/api_repository.py`:
+   ```python
+   class ApiRepository:
+       def get_all_pages(self, namespace: str, limit: int) -> List[str]: ...
+       def get_recent_pages(self, namespace: str, limit: int) -> List[Dict]: ...
+       def get_user_contributions(self, user: str, limit: int) -> List[Dict]: ...
+   ```
+
+2. [ ] Create `services/search_service.py`:
+   ```python
+   class SearchService:
+       def search(self, query: str, namespace: str) -> List[Dict]: ...
+       def prefix_search(self, prefix: str, namespace: str) -> List[str]: ...
+   ```
+
+3. [ ] Update `NEW_API` to delegate to new components
+4. [ ] Add tests for new services
+
+---
+
+### 4.4 `S_Category/bot.py` → `repositories/` + `services/`
+
+**Current State**: Not migrated.
+
+**New Files Needed**:
+
+| File | Purpose |
+|------|---------|
+| `repositories/category_repository.py` | Category data access |
+| `services/category_service.py` | Category traversal logic |
+
+**Steps**:
+1. [ ] Create `repositories/category_repository.py`:
+   ```python
+   class CategoryRepository:
+       def get_members(self, category: str, namespace: str) -> List[str]: ...
+       def get_subcategories(self, category: str) -> List[str]: ...
+   ```
+
+2. [ ] Create `services/category_service.py`:
+   ```python
+   class CategoryService:
+       def get_category_tree(self, category: str, depth: int) -> Dict: ...
+       def get_all_members_recursive(self, category: str, depth: int) -> List[str]: ...
+   ```
+
+3. [ ] Update `CategoryDepth` to delegate
+4. [ ] Add tests
+
+---
+
+### 4.5 `handel_errors.py` → `core/exceptions.py`
+
+**Current State**: ✅ Complete
+
+The exception hierarchy has been migrated to `core/exceptions.py`.
+
+**Remaining Task**:
+- [ ] Update `HANDEL_ERRORS.handel_err()` to use `parse_api_error()` from `core/exceptions.py`
+
+---
+
+### 4.6 Helper Classes (`bot.py`, `params_help.py`)
+
+**Current State**: Not migrated.
+
+**Steps**:
+1. [ ] Extract parameter building logic to `api/queries.py`
+2. [ ] Migrate session helpers to `core/container.py`
+3. [ ] Remove global state variables
+
+---
+
+## 5. Execution Timeline
+
+### Phase 1: Complete Page Module (Week 1-2)
+- [ ] Update `MainPage.get_text()` to delegate to `PageRepository`
+- [ ] Update `MainPage.save()` to delegate to `PageRepository`
+- [ ] Add missing methods to `PageRepository`
+- [ ] Add tests for `PageRepository`
+
+### Phase 2: Complete Auth Module (Week 3)
+- [ ] Create `CookieStore`
+- [ ] Create `SessionManager`
+- [ ] Update `Login` to use new components
+- [ ] Add tests for auth components
+
+### Phase 3: API Module (Week 4-5)
+- [ ] Create `ApiRepository`
+- [ ] Create `SearchService`
+- [ ] Update `NEW_API` to delegate
+- [ ] Add tests
+
+### Phase 4: Category Module (Week 6)
+- [ ] Create `CategoryRepository`
+- [ ] Create `CategoryService`
+- [ ] Update `CategoryDepth` to delegate
+- [ ] Add tests
+
+### Phase 5: Cleanup (Week 7-8)
+- [ ] Add deprecation warnings to all legacy classes
+- [ ] Update documentation
+- [ ] Remove inheritance chains
+- [ ] Final testing
+
+### Phase 6: Removal (Week 9+)
+- [ ] Remove deprecated code after grace period
+- [ ] Remove `super/` folder
+- [ ] Update all imports
+
+---
+
+## 6. Testing Strategy
+
+### Unit Tests (Priority: High)
+Each new component needs unit tests with mocked dependencies.
+
+| Component | Test File | Coverage Target |
+|-----------|-----------|-----------------|
+| `PageRepository` | `tests/test_repositories.py` | ✅ 80%+ |
+| `TemplateService` | `tests/test_services.py` | ⏳ 80%+ |
+| `EditValidator` | `tests/test_services.py` | ⏳ 80%+ |
+| `Authenticator` | `tests/test_auth.py` | ⏳ 80%+ |
+| `TokenProvider` | `tests/test_auth.py` | ⏳ 80%+ |
+| `ApiRepository` | `tests/test_api_repository.py` | ⏳ 80%+ |
+| `CategoryRepository` | `tests/test_category_repository.py` | ⏳ 80%+ |
+
+### Integration Tests (Priority: Medium)
+Test that delegations work correctly.
+
+- [ ] `MainPage.get_text()` → `PageRepository.get_text()` delegation
+- [ ] `Login` → `Authenticator` delegation
+- [ ] `NEW_API` → `ApiRepository` delegation
+
+### Regression Tests (Priority: High)
+Ensure existing functionality works after migration.
+
+- [ ] `ALL_APIS.MainPage()` works as before
+- [ ] `ALL_APIS.NEW_API()` works as before
+- [ ] `ALL_APIS.CatDepth()` works as before
+
+---
+
+## 7. Deprecation and Removal
+
+### Deprecation Schedule
+
+| Component | Deprecation | Removal |
+|-----------|-------------|---------|
+| `super/S_Page/super_page.py` direct methods | v2.0 | v3.0 |
+| `super/S_API/bot_api.py` direct methods | v2.0 | v3.0 |
+| `super/super_login.py` | v2.0 | v3.0 |
+| Entire `super/` folder | v2.5 | v3.0 |
+
+### Deprecation Warnings
+
+Add warnings to legacy code:
+
+```python
+import warnings
+
+def get_text(self, redirects=False):
+    warnings.warn(
+        "MainPage.get_text() is deprecated. Use PageRepository.get_text() instead.",
+        DeprecationWarning,
+        stacklevel=2
+    )
+    # ... delegation code
+```
+
+### Backward Compatibility
+
+During migration, maintain the following API:
+
+```python
+from mw_api import ALL_APIS
+
+# This must continue to work
+api = ALL_APIS(lang='en', family='wikipedia', username='...', password='...')
+page = api.MainPage('Example')
+text = page.get_text()  # Works, but shows deprecation warning
+```
+
+---
+
+## Appendix: File Mapping Reference
+
+| Legacy File | New Location(s) |
+|-------------|-----------------|
+| `super/super_login.py` | `auth/authenticator.py`, `auth/token_provider.py`, `auth/session_manager.py` |
+| `super/S_Page/super_page.py` | `repositories/page_repository.py`, `services/template_service.py`, `services/edit_validator.py` |
+| `super/S_Page/data.py` | `core/page.py` |
+| `super/S_API/bot_api.py` | `repositories/api_repository.py`, `services/search_service.py` |
+| `super/S_Category/bot.py` | `repositories/category_repository.py`, `services/category_service.py` |
+| `super/handel_errors.py` | `core/exceptions.py` |
+| `super/bot.py` | `core/container.py`, `api/queries.py` |
+| `super/params_help.py` | `api/queries.py` |
+| `super/cookies_bot.py` | `auth/cookie_store.py` |
+
+---
+
+*Plan created: 2026-01-26*
+*Last updated: 2026-01-26*