lastgenre: Genre spelling normalization (aliases) #6466

Open
JOJ0 wants to merge 22 commits into master from lastgenre_aliases


@JOJ0
Member

@JOJ0 JOJ0 commented Mar 27, 2026

Description

This PR introduces a regex-based normalization (alias) system to unify variant genre tags and improves the plugin's documentation.

The normalization feature uses an ordered list of regular expression aliases to map variant spellings or synonyms to a single canonical name. The mapping keys act as re.Match.expand() templates, supporting \g<N> back-references to regex capture groups:

lastgenre:
    aliases:
        # Maps 'hip-hop' -> 'hip hop' using back-references 
        # (regex full-match)
        \g<1> hop:
            - (hip|trip|jazz)y?[ /-]*hop
        
        # Static normalization examples
        drum and bass:
            - dnb
            - d&b
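
As a rough illustration of the mechanism described above, the following sketch applies the example config with `re.fullmatch` and `re.Match.expand()`. It is not the PR's actual implementation: the in-memory `ALIASES` layout, the lowercasing step, and the function body are assumptions made for this example; only the name `normalize_genre` and the template/pattern semantics come from the description.

```python
import re

# Assumed in-memory form of the `aliases` option: each key is an
# re.Match.expand() template, each value a list of regex patterns.
# Dict order stands in for the ordered alias list.
ALIASES = {
    r"\g<1> hop": [r"(hip|trip|jazz)y?[ /-]*hop"],
    "drum and bass": ["dnb", "d&b"],
}


def normalize_genre(genre, aliases=ALIASES):
    """Return the canonical spelling of `genre`, or `genre` unchanged.

    Assumption: tags are lowercased before matching, and a pattern
    must match the whole tag (full-match), not just a part of it.
    """
    candidate = genre.lower()
    for template, patterns in aliases.items():
        for pattern in patterns:
            match = re.fullmatch(pattern, candidate)
            if match:
                # Fill \g<N> back-references from the capture groups.
                return match.expand(template)
    return genre


print(normalize_genre("Hip-Hop"))        # hip hop
print(normalize_genre("jazzy hop"))      # jazz hop
print(normalize_genre("d&b"))            # drum and bass
print(normalize_genre("hip hop music"))  # hip hop music (no full match)
```

Because matching is full-match, a tag such as `hip hop music` is left alone instead of being partially rewritten.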
  • Where: normalize_genre now runs before is_ignored in both the LastFmClient (for clean lookup/filtering) and the core _resolve_genres loop (for uniform processing of existing file tags).
  • Updated default whitelist and genre tree files for canonical-name consistency, and curated the bundled aliases.yaml from the top 1,000 Last.fm tags to cover common inconsistencies without over-normalizing.
  • Additional documentation changes:
    • Added a Choosing the Right Tool guide. It is written for end users, but it may also be a good starting point for maintainer review.
    • Added non-whitelist mode details to the Canonicalization chapter.
  • Additional unrelated changes:
    • Fixed a few variable-name inconsistencies in the tests for the recently merged ignorelist feature.
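
To show why normalizing before the ignore check matters, here is a minimal sketch. The names `normalize_genre` and `is_ignored` are taken from the description, but their bodies, the `resolve` helper, and the flat set-based ignorelist are simplified assumptions, not the plugin's real code.

```python
import re


def normalize_genre(genre, aliases):
    """Simplified stand-in: lowercase, then apply the first full-match alias."""
    candidate = genre.lower()
    for template, patterns in aliases.items():
        for pattern in patterns:
            match = re.fullmatch(pattern, candidate)
            if match:
                return match.expand(template)
    return candidate


def is_ignored(genre, ignorelist):
    """Simplified stand-in: a plain membership test on canonical names."""
    return genre in ignorelist


def resolve(tags, aliases, ignorelist):
    """Hypothetical loop: normalize first, then filter and de-duplicate."""
    result = []
    for tag in tags:
        genre = normalize_genre(tag, aliases)
        if is_ignored(genre, ignorelist) or genre in result:
            continue
        result.append(genre)
    return result


aliases = {
    r"\g<1> hop": [r"(hip|trip|jazz)y?[ /-]*hop"],
    "drum and bass": ["dnb", "d&b"],
}
print(resolve(["DnB", "d&b", "Hip-Hop", "drum and bass"], aliases, {"hip hop"}))
# ['drum and bass']
```

With this order, a single ignorelist entry for `hip hop` also catches `Hip-Hop`, and the three drum and bass spellings collapse into one genre.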

@JOJ0 JOJ0 requested a review from a team as a code owner March 27, 2026 12:55
@github-actions

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@JOJ0 JOJ0 marked this pull request as draft March 27, 2026 12:56
@codecov

codecov Bot commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 87.30159% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.13%. Comparing base (3d3f5e3) to head (0eaca9d).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| beetsplug/lastgenre/__init__.py | 91.42% | 1 Missing and 2 partials ⚠️ |
| beetsplug/lastgenre/client.py | 66.66% | 1 Missing and 2 partials ⚠️ |
| beetsplug/lastgenre/utils.py | 89.47% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6466      +/-   ##
==========================================
+ Coverage   72.07%   72.13%   +0.05%     
==========================================
  Files         159      159              
  Lines       20633    20681      +48     
  Branches     3273     3287      +14     
==========================================
+ Hits        14871    14918      +47     
+ Misses       5053     5048       -5     
- Partials      709      715       +6     

| Files with missing lines | Coverage Δ |
|---|---|
| beetsplug/lastgenre/utils.py | 93.75% <89.47%> (-6.25%) ⬇️ |
| beetsplug/lastgenre/__init__.py | 82.86% <91.42%> (+1.18%) ⬆️ |
| beetsplug/lastgenre/client.py | 64.58% <66.66%> (+10.92%) ⬆️ |

@JOJ0 JOJ0 force-pushed the lastgenre_aliases branch 2 times, most recently from 20b2d88 to 90064fa Compare March 28, 2026 22:52
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from 66a7d98 to 1785dab Compare March 28, 2026 23:12
@JOJ0 JOJ0 added the lastgenre and plugin labels Mar 29, 2026
@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch from 1785dab to c5a14d3 Compare March 29, 2026 09:11
@JOJ0 JOJ0 force-pushed the lastgenre_aliases branch 4 times, most recently from d0cc527 to 12e7358 Compare March 29, 2026 10:09
@JOJ0 JOJ0 marked this pull request as ready for review March 29, 2026 10:17
@github-actions

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@JOJ0
Member Author

JOJ0 commented Mar 29, 2026

This is closely related to #6449, so it might make sense to have a look at this one when reviewing the former, @snejus. Many thanks in advance! 🙌

@JOJ0 JOJ0 force-pushed the lastgenre_aliases branch from 12e7358 to f4b9436 Compare March 29, 2026 11:03
Comment on lines +252 to +268
Choosing the Right Tool
-----------------------

With multiple ways to filter and map genres, here is a quick guide on when to
use what:

- **Aliases**: Use these first to fix spelling variants and abbreviations (e.g.,
  ``dnb`` → ``drum and bass``).
- **Ignorelist**: Use this for error correction when Last.fm results are not
  accurate, or for precise per-artist or global exclusions (e.g., rejecting
  ``Metal`` for specific electronic artists).
- **Canonicalization**: Use this to automatically map specific sub-genres to
  broader categories (e.g., ``Grindcore`` → ``Metal``).
- **Whitelist**: Use this to finally limit your library to a predefined set of
  genres. When combined with canonicalization, the plugin will try to map a
  sub-genre to its closest whitelisted parent. Anything else is dropped.

Member Author

Reviewers: This new chapter is a good starting point for reviewing. It might help in thinking through where and when things should happen in the plugin, and why multiple filtering features make sense for it!
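
As a thinking aid for the last two entries of the guide (canonicalization combined with a whitelist), here is a minimal sketch of mapping a sub-genre to its closest whitelisted parent. The tiny `PARENTS` tree and the `canonicalize` helper are hypothetical and only illustrate the idea; the plugin's bundled genre tree and its actual code paths differ.

```python
# Hypothetical miniature genre tree: child -> parent (None marks a root).
PARENTS = {
    "grindcore": "extreme metal",
    "extreme metal": "metal",
    "metal": None,
}


def canonicalize(genre, parents, whitelist):
    """Walk up the tree and return the closest whitelisted genre.

    Returns None when neither the genre nor any of its ancestors is
    whitelisted, i.e. the tag would be dropped.
    """
    current = genre
    while current is not None:
        if current in whitelist:
            return current
        current = parents.get(current)
    return None


print(canonicalize("grindcore", PARENTS, {"metal"}))                   # metal
print(canonicalize("grindcore", PARENTS, {"extreme metal", "metal"}))  # extreme metal
print(canonicalize("polka", PARENTS, {"metal"}))                       # None
```

In this sketch, aliases and the ignorelist would already have run, so the tree only needs to know canonical spellings.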

@JOJ0 JOJ0 force-pushed the lastgenre_forbidden branch 9 times, most recently from 87c272c to b12b314 Compare April 3, 2026 06:52
@JOJ0 JOJ0 force-pushed the lastgenre_aliases branch 3 times, most recently from 133b397 to 67ae26c Compare May 1, 2026 17:02
@JOJ0
Member Author

JOJ0 commented May 1, 2026

> Make sense. Let me know once it's ready!

It's ready now @snejus! I also updated the PR description. Please re-read! Thanks!

@JOJ0 JOJ0 force-pushed the lastgenre_aliases branch from d2cfdfa to 0e27872 Compare May 3, 2026 07:31
@snejus
Member

snejus commented May 3, 2026

Note the tests are failing still!

@JOJ0 JOJ0 force-pushed the lastgenre_aliases branch from 40bbd83 to 0eaca9d Compare May 4, 2026 15:51
@JOJ0
Member Author

JOJ0 commented May 4, 2026

> Note the tests are failing still!

Fixed, and narrowed some regexes along the way. (Note that the commit history is still a real mess; I'll clean that up once it's reviewed.)

Labels

lastgenre lastgenre plugin


3 participants