Centralize release id extraction #5761
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.
Pull Request Overview
This PR centralizes the extraction of release IDs by introducing a new utility function extract_release_id and replacing source‐specific regex handling with a common implementation. Key changes include:
- Adding and integrating extract_release_id into various plugins and test files.
- Removing redundant code and tests that previously handled ID extraction separately.
- Updating plugin methods to use the new extraction logic.
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
test/util/test_id_extractors.py | Introduces tests for the centralized extract_release_id function. |
test/test_plugins.py | Removes outdated tests related to individual ID regex extraction. |
test/plugins/test_musicbrainz.py | Removes tests for the old _parse_id functionality. |
test/plugins/test_discogs.py | Deletes legacy Discogs ID parsing tests in favor of the new method. |
beetsplug/spotify.py | Updates album_for_id and track_for_id methods to use extract_release_id. |
beetsplug/musicbrainz.py | Switches from _parse_id to extract_release_id; updates album/track lookups. |
beetsplug/discogs.py | Replaces extract_discogs_id_regex with extract_release_id. |
beetsplug/deezer.py | Refines ID extraction using the new utility with walrus operator. |
beetsplug/beatport.py | Updates release ID extraction handling with the centralized approach. |
beets/util/id_extractors.py | Defines extract_release_id and a PATTERN_BY_SOURCE mapping. |
beets/plugins.py | Refactors _get_id to delegate ID extraction to extract_release_id. |
Thanks for already implementing this! Are we set on the functional approach here? I would prefer this to be part of the plugins themselves (e.g. defined as a required property or method in the ABC). In my opinion, that is closer to the actual logic and would require anyone who creates a new source plugin to properly register their IDs. I know not all plugins are |
👍 I only had a very superficial look at the PR, but I'd agree that this looks like plugins should register their ID patterns. Right now, |
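To make the plugin-level alternative from the two comments above concrete, here is a minimal sketch of what "register the ID pattern on the plugin" could look like. It is not code from this PR: the names `SourcePluginSketch`, `id_pattern`, `extract_id`, and `DeezerSketch`, as well as the regex itself, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import re
from abc import ABC, abstractmethod


class SourcePluginSketch(ABC):
    """Hypothetical base class: each source plugin must declare its ID pattern."""

    @property
    @abstractmethod
    def id_pattern(self) -> re.Pattern[str]:
        """Compiled regex whose first group captures the release ID."""

    def extract_id(self, id_or_url: str) -> str | None:
        # Shared logic lives in the base class; only the pattern is per plugin.
        match = self.id_pattern.search(id_or_url)
        return match.group(1) if match else None


class DeezerSketch(SourcePluginSketch):
    """Hypothetical plugin; the pattern below is an assumption, not beets' own."""

    @property
    def id_pattern(self) -> re.Pattern[str]:
        return re.compile(
            r"(?:^|deezer\.com/)(?:[a-z]{2}/)?(?:album/|track/)?(\d+)"
        )
```

With this shape, a subclass that forgets to define `id_pattern` cannot be instantiated, which is the "forced registration" property the comments ask for. The trade-off is that code outside the plugins no longer has a single place to look up patterns.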
In an ideal world, I would also imagine that metadata source plugins have something like what @semohr suggested:
And they indeed originally had been part of the plugins, however they have been moved out #4633 in order to support saving external IDs by the Arguably, we may be able to remove this ID parsing from On the other hand, what do we do about |
I see. See my question from above:
These additional regexes are only used in the musicbrainz plugin, right? If this is the case, it wouldn't be too big of an issue to just have them as part of the mb plugin. I'm not super set on this approach, but I generally think the extraction should be closer to the plugins.
I had a try at this myself and implemented the extraction at the plugin level in an earlier commit of #5764. To be honest, my approach quickly gets convoluted and incomprehensible, even if it works in theory. I would be happy to go with your initial approach for ID extraction, as it is more understandable and maintainable. Could we move the extractor utils into the beetsplug folder, though? That way they are close to where they are used. I think that should be a decent compromise.
The extraction function is called from the generic |
The
Also, once we merge #5764 into this PR, the Apologies for the back and forth—this really feels more like a design decision we should get right to ensure it's future-proof than a simple refactor. |
No need to apologize! All good 😆
Indeed - another reason why I don't think we need to overthink this too much - this method simply follows DRY design and defines shared functionality within the base class.
I do not think thought We should really think about moving the extractor utils into the beetsplug folder, though. It is kind of confusing that the regex patterns are defined in beets, because they are not used directly in beets but only by beetsplug, i.e. inside specific plugin implementations. Nonetheless, I'm still pretty happy with these enhancements, as they are a major improvement over the previous behaviour. We can still move or remove it in favor of direct imports later.
Just need your approval @semohr if this looks good
Looks good to me 👍 Once we are done with this, it would be awesome to continue with #5787, since it partially depends on this PR.
Description
Refactor: Centralize release ID extraction
This change introduces a new utility function extract_release_id in beets.util.id_extractors to handle the parsing of release IDs (or URLs containing IDs) for various metadata sources (Spotify, Deezer, Beatport, Discogs, MusicBrainz, Bandcamp, Tidal).
Key changes:
- Added the extract_release_id function and the PATTERN_BY_SOURCE regex dictionary.
- Changed the MetadataSourcePlugin._get_id static method to an instance method which uses the data_source property to pick the correct ID extractor.
- Removed the id_regex property and updated _get_id calls in all modules.
- Added test/util/test_id_extractors.py, which tests the extract_release_id function.
This refactoring simplifies the codebase, reduces redundancy, and makes it easier to manage and extend ID extraction logic for different sources.
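As a rough illustration of the centralized approach described above, the sketch below pairs a pattern dictionary with a single lookup function and a base-class method that delegates to it. The names extract_release_id, PATTERN_BY_SOURCE, _get_id, and data_source come from the description; everything else is an assumption, including the function signature, the regexes, the lowercase source keys, and the `SketchSourcePlugin` and `SketchSpotifyPlugin` classes. The actual code in beets may differ.

```python
from __future__ import annotations

import re

# Hypothetical patterns for illustration only; group 1 captures the ID.
PATTERN_BY_SOURCE = {
    "spotify": re.compile(r"(?:^|open\.spotify\.com/[^/]+/)([0-9A-Za-z]{22})"),
    "deezer": re.compile(
        r"(?:^|deezer\.com/)(?:[a-z]{2}/)?(?:album/|track/)?(\d+)"
    ),
    "musicbrainz": re.compile(r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"),
}


def extract_release_id(source: str, id_: str) -> str | None:
    """Return the ID found in a bare ID or URL, or None if nothing matches."""
    pattern = PATTERN_BY_SOURCE.get(source.lower())
    if pattern is None:
        return None  # unknown source: no pattern registered
    match = pattern.search(str(id_))
    return match.group(1) if match else None


class SketchSourcePlugin:
    """Hypothetical stand-in for MetadataSourcePlugin."""

    data_source = ""

    def _get_id(self, id_string: str) -> str | None:
        # Instance method: the plugin's own data_source selects the pattern.
        return extract_release_id(self.data_source, id_string)


class SketchSpotifyPlugin(SketchSourcePlugin):
    data_source = "Spotify"
```

A caller can then combine the lookup with the walrus operator, for example `if (release_id := plugin._get_id(user_input)) is None: return None`, so that a failed extraction short-circuits before any network request is made.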