Releases · aborruso/scrape-cli

06 Apr 09:05

aborruso

v1.2.3

ad39e28

v1.2.3 Latest


Agent-friendly CLI improvements

All error messages now go to stderr (stdout stays clean for data)
The error for a missing -e option is now a concise, actionable message with an example
--help extended with examples for all major use cases (XPath, CSS, text, attributes, URL, stdin, check-existence)
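
The stdout/stderr split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `run`, its parameters, and the wording of the error message are all hypothetical. Only the `-e` option and the "errors to stderr, data to stdout" rule come from the release notes.

```python
import sys


def run(expression, matches, out=sys.stdout, err=sys.stderr):
    """Hypothetical helper: data goes to `out`, diagnostics go to `err`.

    Returns a process-style exit code (0 on success, 1 on error).
    """
    if not expression:
        # Assumed wording; the real message text is not given in the notes.
        print("Error: missing -e EXPRESSION (example: -e '//h1')", file=err)
        return 1
    for match in matches:
        # Only extracted data ever reaches stdout, so pipes stay clean.
        print(match, file=out)
    return 0
```

Keeping diagnostics off stdout means a caller (a shell pipeline or an agent) can pass the output straight to the next tool and inspect stderr and the exit code separately.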


22 Feb 23:32

aborruso

v1.2.2

a9cc902

v1.2.2

Added -u/--user-agent option for HTTP requests
Added a default browser-like User-Agent to avoid 403 errors (e.g. from Wikipedia)
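
As a rough sketch of how a default User-Agent with a user override might be wired up, using only the standard library: `build_request` and `DEFAULT_USER_AGENT` are hypothetical names, and the User-Agent string below is a placeholder, since the release notes do not state the actual default value or which HTTP library is used.

```python
from typing import Optional
from urllib.request import Request

# Placeholder value (assumption): the real default string is not given in the notes.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url: str, user_agent: Optional[str] = None) -> Request:
    """Build a request, using the caller's User-Agent or the default one."""
    headers = {"User-Agent": user_agent or DEFAULT_USER_AGENT}
    return Request(url, headers=headers)
```

The value passed as `user_agent` would correspond to what the user supplies with -u/--user-agent; when it is omitted, the browser-like default is sent instead of the library's own identifier, which some sites reject.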


22 Feb 12:00

aborruso

v1.2.1

a18caa6

v1.2.1

What's changed

Added automated pytest coverage for XPath/CSS detection and CLI options, including URL/file/stdin input paths and error handling.
Hardened runtime behavior by adding timeout=30 to URL fetches and replacing a bare except: with except Exception in charset detection.
Raised requires-python to >=3.8, removed legacy setup.py, and expanded .gitignore for local test/venv artifacts.
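
The two hardening changes can be illustrated with a small sketch. The names `fetch`, `guess_charset`, and `FETCH_TIMEOUT` are hypothetical, and the standard-library `urlopen` stands in for whatever HTTP client the project uses. Only the 30-second timeout and the switch from a bare `except:` to `except Exception` are taken from the notes.

```python
from urllib.request import urlopen

FETCH_TIMEOUT = 30  # seconds, matching the timeout=30 mentioned above


def fetch(url: str) -> bytes:
    """Fetch a URL, giving up after FETCH_TIMEOUT seconds instead of hanging."""
    with urlopen(url, timeout=FETCH_TIMEOUT) as response:
        return response.read()


def guess_charset(raw: bytes, detector, default: str = "utf-8") -> str:
    """Hypothetical helper: run a charset detector, falling back on failure."""
    try:
        return detector(raw)
    except Exception:
        # Unlike a bare `except:`, this lets KeyboardInterrupt and
        # SystemExit propagate, so Ctrl-C still stops the program.
        return default
```

The practical difference is that a bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, which can make a CLI appear to ignore Ctrl-C.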

Validation

pytest: 18 passed
twine check dist/*: passed


07 Sep 11:41

aborruso

v1.2.0

3419411

Release v1.2.0

Bug Fix

Fixed XPath detection for expressions wrapped in parentheses
XPath expressions like (//div[@class='coordinate lat'])[1] are now correctly recognized as XPath instead of being incorrectly treated as CSS selectors
Enhanced the is_xpath function with additional pattern recognition for XPath-specific syntax including attribute predicates, position predicates, and XPath functions
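
A heuristic along the lines described above might look like the sketch below. It is not the project's `is_xpath` implementation: the function name `looks_like_xpath` and the exact patterns are assumptions chosen to cover the cases the notes mention (leading parentheses, attribute predicates, position predicates, and XPath functions).

```python
import re

# Each pattern is an assumed hint that an expression is XPath rather than CSS.
_XPATH_HINTS = (
    re.compile(r"^\(*\s*\.?/"),  # /a, //a, ./a, optionally wrapped in "("
    re.compile(r"\[@"),          # attribute predicate such as [@class='x']
    re.compile(r"\[\d+\]"),      # position predicate such as [1]
    re.compile(r"\b(?:last|position|contains|text)\("),  # XPath functions
)


def looks_like_xpath(expression: str) -> bool:
    """Return True if any XPath-specific hint matches the expression."""
    expr = expression.strip()
    return any(pattern.search(expr) for pattern in _XPATH_HINTS)
```

With these patterns, `(//div[@class='coordinate lat'])[1]` is classified as XPath, while a CSS attribute selector such as `a[href*="/talk/"]` is not.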

Installation

pip install scrape_cli==1.2.0


14 Aug 07:29

aborruso

v1.1.9

06657af

v1.1.9: CSS Selector Fix

What Changed

🐛 Bug Fix: Fixed CSS selectors being incorrectly identified as XPath

Details

Fixed is_xpath() function to properly distinguish CSS selectors from XPath expressions
CSS selectors like a[href*="/talk/"] now work correctly
Improved selector recognition logic to be more restrictive for XPath detection

Technical Changes

Updated XPath detection to only recognize expressions starting with / or // or containing ::
This prevents CSS attribute selectors with square brackets from being misidentified as XPath
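
The rule stated above translates almost directly into code. The sketch below is an illustration only; `is_xpath_strict` is a hypothetical name, not the project's function.

```python
def is_xpath_strict(expression: str) -> bool:
    """Treat an expression as XPath only if it starts with "/" (which also
    covers "//") or contains an axis separator "::"."""
    expr = expression.strip()
    return expr.startswith("/") or "::" in expr
```

Under this rule `a[href*="/talk/"]` is treated as CSS because square brackets alone no longer count as an XPath signal. The same rule also rejects an expression that begins with a parenthesis, such as `(//div)[1]`, which is the case the v1.2.0 notes describe as fixed.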

Installation

pip install --upgrade scrape-cli


02 Jun 13:11

aborruso

v1.1.8

acf0d4b

v1.1.8

What's Changed

Features

Added text extraction functionality with -t option
- Extract only text content without HTML tags
- Automatically excludes text from script and style tags
- Cleans up whitespace for better readability
- Particularly useful for LLMs and text processing workflows
- Can be combined with CSS selectors or XPath expressions for targeted text extraction
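
The behaviour listed above (drop tags, skip script and style content, tidy whitespace) can be approximated with the standard library alone. This is a sketch under assumptions: `extract_text` and `_TextCollector` are hypothetical names, and the project itself may rely on a different parser.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps adjacent blocks apart; the regex then
    # collapses every run of whitespace into a single space.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()
```

Joining the pieces with a space is a simplification: it keeps neighbouring paragraphs from running together, at the cost of inserting a space where inline tags split a single word.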


04 May 13:39

aborruso

v1.1.7

a2fbc48

v1.1.7

What's Changed

Features

Improved XPath detection with support for complex expressions:
- Added support for predicates and square brackets
- Added support for XPath functions (last(), position(), contains(), text())
- Added support for XPath axes and attributes
- Better handling of complex XPath expressions


02 May 17:53

aborruso

v1.1.6

cb67dd5

v1.1.6

Added charset detection from HTML meta tags
Added support for ISO-8859-1 encoding fallback
Improved HTML parsing with better encoding handling
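
One way to picture the encoding handling above is the sketch below. The function name `decode_html`, the regular expression, and the intermediate UTF-8 attempt are assumptions made for illustration. The notes only state that the charset is detected from HTML meta tags and that ISO-8859-1 serves as a fallback.

```python
import re

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([\w-]+)', re.IGNORECASE
)


def decode_html(raw: bytes) -> str:
    """Decode HTML bytes using the declared charset when one is usable."""
    candidates = []
    match = _META_CHARSET.search(raw[:4096])  # meta tags sit near the top
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.append("utf-8")  # assumed intermediate attempt
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown or wrong charset: try the next candidate
    # ISO-8859-1 maps every byte to a character, so this cannot fail.
    return raw.decode("iso-8859-1")
```

ISO-8859-1 works as a last resort because every byte value is valid in it, so decoding always succeeds even when the result is not the intended text.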


02 Nov 14:50

aborruso

v1.1.1

e33a91b

v1.1.1

update


02 Nov 14:06

aborruso

1.1

d4f8ec2

1.1

Bump version to 0.2

