Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT: Refactor extraction code #367

Merged
merged 9 commits into from
Mar 17, 2023
Merged

MNT: Refactor extraction code #367

merged 9 commits into from
Mar 17, 2023

Conversation

jsheunis
Copy link
Member

This PR:

  • moves the get_required_content method to the base class MetadataExtractorBase so that both FileMetadataExtractor and DatasetMetadataExtractor can make use of it (addresses Should get_required_content() also be available for file-level extractors? #287)
  • refactors extract.py in order to deduplicate code, and updates related tests (addresses MNT: refactor extract.py to deduplicate code #366):
    • merges do_file_extraction() and do_dataset_extraction() into a single function do_extraction(), with a new argument extractor_type, which would be either 'file' or 'dataset'.
    • merges perform_file_metadata_extraction() and perform_dataset_metadata_extraction() into a single function perform_metadata_extraction()

Copy link
Collaborator

@christian-monch christian-monch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx @jsheunis .

Only one thing. Let's move the extractor_type variable into the ExtractionArguments-class (see comments in the code)

datalad_metalad/extractors/base.py Outdated Show resolved Hide resolved
datalad_metalad/extract.py Outdated Show resolved Hide resolved
datalad_metalad/extract.py Outdated Show resolved Hide resolved
datalad_metalad/extract.py Outdated Show resolved Hide resolved
datalad_metalad/extract.py Outdated Show resolved Hide resolved
datalad_metalad/extract.py Outdated Show resolved Hide resolved
datalad_metalad/extract.py Outdated Show resolved Hide resolved
jsheunis and others added 4 commits March 16, 2023 22:24
Co-authored-by: Christian Mönch <christian.moench@web.de>
Co-authored-by: Christian Mönch <christian.moench@web.de>
Co-authored-by: Christian Mönch <christian.moench@web.de>
Co-authored-by: Christian Mönch <christian.moench@web.de>
@codecov-commenter
Copy link

codecov-commenter commented Mar 16, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (c741d49) 86.28% compared to head (9eb3a51) 86.28%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #367   +/-   ##
=======================================
  Coverage   86.28%   86.28%           
=======================================
  Files          88       88           
  Lines        4842     4828   -14     
=======================================
- Hits         4178     4166   -12     
+ Misses        664      662    -2     
Impacted Files Coverage Δ
datalad_metalad/extract.py 88.21% <100.00%> (-0.07%) ⬇️
datalad_metalad/extractors/base.py 88.00% <100.00%> (+1.33%) ⬆️
datalad_metalad/tests/test_extract.py 99.64% <100.00%> (-0.01%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@jsheunis
Copy link
Member Author

Thanks, good suggestion. I committed some of your suggestions, added more changes in 9eb3a51, including updating tests to reflect the changed parameter.

@jsheunis
Copy link
Member Author

@christian-monch I'll leave the merging to you once you're happy with the state of the PR.

@christian-monch christian-monch merged commit 30ed90b into master Mar 17, 2023
@christian-monch
Copy link
Collaborator

Thx a lot @jsheunis

jsheunis added a commit that referenced this pull request Apr 6, 2023
This was supposed to be part of #367, but fell through the cracks. This addition ensures that the correct type is displayed in the metadata record output from `meta-extract`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants