Parameterize `parse_read` for custom UMI-tags etc. #24

mschilli87 · 2024-02-23T01:03:29Z

While cellranger uses the UB SAM tag, other scRNA-seq analyses use different SAM tags (e.g. XM) for the same information. By parameterizing the existing parse_read function, this (and other read parsing/filtering options) can be adjusted by the caller without the need for boilerplate code required for a custom callback class. At the same time, by providing the previously hard-coded values as default parameters, compatibility with existing code is kept. All functions using this callback have been extended by optional keyword arguments to allow passing through the newly added parameters.

This addresses one out of two issues raised in #23.

closes #26

While cellranger uses the `UB` SAM tag, other scRNA-seq analyses use different SAM tags (*e.g.* `XM`) for the same information. By parameterizing the existing `parse_read` function, this (and other read parsing/filtering options) can be adjusted by the caller without the need for boilerplate code required for a custom callback class. At the same time, by providing the previously hard-coded values as default parameters, compatibility with existing code is kept. All functions using this callback have been extended by optional keyword arguments to allow passing through the newly added parameters. This addresses one out of two issues raised in arogozhnikov#23.

arogozhnikov · 2024-02-23T04:27:44Z

So looking at this.

For user it is actually more readable to just overwrite handler, for example:

custom_parse_read = lambda read: parse_file(read, umi_tag="UB")

count_snps(... , parse_read=custom_parse_read)

rather than figuring out parameters for parse_read, and then passing corresponding kwargs.

This has advantage of being more direct, and IDE can suggest parameters and complain on arguments.

So I recommend just dropping kwargs :)

mschilli87 · 2024-02-23T08:41:47Z

@arogozhnikov: Great suggestion. As I said, my Python is a bit rusty so the only two options that occured to me where a class full custom, like you suggested in #23 (comment) or the kwargs. But I agree that the lambda is a much better solution. I'll adjust this PR (and #26) accordingly, after another round of testing.

This is a partial revert of commit f460476 which parameterized the `parse_read` callback and added keyword arguments to all its callers to pass down the arguments like this: ```python count_snps(... , parse_read=parse_read, umi_tag="XM") ``` Howerver, as pointed out by Alex Rogozhnikov, the keyword arguments are not required since a lambda function could be used in their place resulting which even results in more readable code: ```python parse_read_xm = lambda read: parse_read(read, umi_tag="XM") count_snps(... , parse_read=parse_read_xm) ``` Thus, this commit reverts all changes introducing keyword argument to `parse_read` callers, leaving only the parameterization of that callback itself.

mschilli87 · 2024-02-23T12:41:17Z

@arogozhnikov: I have made the adjustemts and ran both tests (cellranger-style + my data) again using the lambda appraoch suggested by you and obtained identical results as with the kwargs approach I used initially. If you merge the (updated) PR #26, both, this one and #25 should be closed automatically.

arogozhnikov · 2024-02-24T07:14:08Z

great, already looks good to me.

We're sorting out things with repo, this shouldn't take long

This was referenced Feb 23, 2024

Non cellranger tags #26

Closed

How do I define which BAM tags contain the UMI / cell barcode? #23

Closed

mschilli87 mentioned this pull request Feb 26, 2024

Add support for non- Cell Ranger SAM tags drneavin/Demultiplexing_Doublet_Detecting_Docs#48

Merged

arogozhnikov merged commit a82618d into arogozhnikov:master Feb 27, 2024

mschilli87 deleted the parse_read-params branch February 28, 2024 08:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parameterize `parse_read` for custom UMI-tags etc. #24

Parameterize `parse_read` for custom UMI-tags etc. #24

mschilli87 commented Feb 23, 2024 •

edited

Loading

arogozhnikov commented Feb 23, 2024

mschilli87 commented Feb 23, 2024

mschilli87 commented Feb 23, 2024

arogozhnikov commented Feb 24, 2024

Parameterize parse_read for custom UMI-tags etc. #24

Parameterize parse_read for custom UMI-tags etc. #24

Conversation

mschilli87 commented Feb 23, 2024 • edited Loading

arogozhnikov commented Feb 23, 2024

mschilli87 commented Feb 23, 2024

mschilli87 commented Feb 23, 2024

arogozhnikov commented Feb 24, 2024

Parameterize `parse_read` for custom UMI-tags etc. #24

Parameterize `parse_read` for custom UMI-tags etc. #24

mschilli87 commented Feb 23, 2024 •

edited

Loading