Add functions to support padding and filtering intervals: filter_by_intervals, pad_intervals, parse_locus_intervals #752

Merged
merged 11 commits into from
Jan 14, 2025

jkgoodrich
Contributor

No description provided.

@KoalaQin changed the title from "Add functions to support padding and filtering intervals: pad_intervals, filter_by_intervals, pad_intervals, parse_locus_intervals" to "Add functions to support padding and filtering intervals: filter_by_intervals, pad_intervals, parse_locus_intervals" on Jan 10, 2025
Contributor

@KoalaQin KoalaQin left a comment

Looks good. Thanks for separating the functions. I have a few comments.

Comment on lines +566 to +569
.. note::

    If no Gencode Table is provided, the default version of the Gencode Table
    resource for the genome build of the input Table/MatrixTable will be used.
Contributor

I think the note here is a bit redundant, since it's already mentioned in filter_gencode_ht function.

Contributor Author

I don't think it hurts to have it here for users who don't realize this function is using that function.


if is_expr:
    _ht = intervals._indices.source
    num_intervals = _ht.count()
Contributor

num_intervals is only checked when is_expr is true. What if a list is bigger than our max? Won't the list cause a memory issue?

Contributor Author

I think the memory issue is in the collect, right? So a user passing a list will already have encountered that.
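
The point in this thread can be sketched in plain Python, without Hail. This is only an illustration of the reasoning, not the code from this PR: `FakeTable`, `resolve_intervals`, and `max_collect_intervals` are hypothetical names, and the limit used below is arbitrary. The size check guards the `collect()` call, so it only matters for inputs that have not yet been brought into local memory; a plain list is already local by the time it reaches the function.

```python
class FakeTable:
    """Hypothetical stand-in for a distributed table; not a Hail object."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.collected = False  # records whether collect() was ever called

    def count(self):
        # Counting does not bring the rows into local memory.
        return len(self._rows)

    def collect(self):
        # Collecting is the step that brings every row into local memory.
        self.collected = True
        return list(self._rows)


def resolve_intervals(intervals, max_collect_intervals):
    """Return intervals as a local list, refusing to collect oversized tables.

    `max_collect_intervals` is an assumed parameter name for this sketch.
    """
    is_expr = not isinstance(intervals, (list, tuple))
    if is_expr:
        # Check the size before collecting, because collect() is where the
        # memory cost would be paid.
        num_intervals = intervals.count()
        if num_intervals > max_collect_intervals:
            raise ValueError(
                f"{num_intervals} intervals exceeds the limit of "
                f"{max_collect_intervals}; refusing to collect."
            )
        return intervals.collect()
    # A list is already in local memory, so there is nothing left to guard.
    return list(intervals)
```

With this shape, an oversized table raises before any rows are collected, while an oversized list passes through unchanged because its memory cost was paid when the caller built it.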

@jkgoodrich jkgoodrich requested a review from KoalaQin January 14, 2025 00:32
Contributor

@KoalaQin KoalaQin left a comment

LGTM!

@jkgoodrich jkgoodrich merged commit 538ecf2 into main Jan 14, 2025
5 checks passed
@jkgoodrich jkgoodrich deleted the jg/add_support_for_padding_and_filtering_intervals branch January 14, 2025 14:32
2 participants