Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update sce_to_seurat function #123

Merged
merged 10 commits into from
Sep 1, 2022

Conversation

sjspielman
Copy link
Member

This PR addresses certain aspects of this conversation about data integration: AlexsLemonade/sc-data-integration#123

We would like to make the sce_to_seurat() more flexible in which assay is chosen to include in the new Seurat object. This PR adds an argument assay_name, default "counts", which can be specified for this purpose. I also included a check to make sure the assay exists.

All tests are passing, but I did not write more tests for different assay names (but I can if requested!), but the code I wrote indeed still works fine for a "counts" assay as the passing tests reflect!

I also thought about whether I should update here as well and supply an argument assay = assay_name, but the default that is used here by Seurat is "RNA" which is generally expected, so I decided to leave it alone.

seurat_obj <- Seurat::CreateSeuratObject(counts = sce_counts,
meta.data = coldata)
# add rowdata and metadata separately adding it while creating leaves the slots empty without warning
seurat_obj[["RNA"]]@var.features <- rowdata

Finally, I also reran devtools::document(), and in addition to the expected man pages update for this function, the sce_to_anndata man was updated, and the roxygen note in DESCRIPTION was bumped. I wonder if the docs were just out of date?

Finally #2, should I add myself to DESCRIPTION author field? This is a pretty small update so not sure where this change lands!

@sjspielman sjspielman requested a review from allyhawkins August 30, 2022 19:26
Copy link
Member

@allyhawkins allyhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment on these lines but I agree with your comment about changing the actual assay name in Seurat so that it corresponds to the assay name in the SCE object that is being used. I just don't like the idea of saying I want to convert the logcounts assay but then putting that in the RNA slot which corresponds to raw data usually. That just seems wrong. So I think we should add the assay = assay_name line to line 55 and then change line 59 to seurat_obj[[assay_name]]@var.features <- rowdata. I also think there should be a note in the docs stating that the assay converted is saved with that assay name in the Seurat object.

Doing this also makes me want to add further functionality to convert multiple assays, like both counts and logcounts. That's outside the scope of this PR and the need for what we're trying to do but I'm going to file an issue because I like the idea of being able to have both.

As far as authorship, feel free to add your self.

@@ -18,7 +18,7 @@ Description: Tools for processing single cell data associated with the
License: BSD_3_clause + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.2.0
RoxygenNote: 7.2.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Josh and I have been going back and forth on committing and removing this change because for some reason we have different versions but can't figure out why.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to join in the party!

@sjspielman
Copy link
Member Author

Finally #2, should I add myself to DESCRIPTION author field? This is a pretty small update so not sure where this change lands!

Please note this means "second finally," and the GH-linked PR is not relevant 😂

@sjspielman
Copy link
Member Author

So I think we should add the assay = assay_name line to line 55 and then change line 59 to seurat_obj[[assay_name]]@var.features <- rowdata. I also think there should be a note in the docs stating that the assay converted is saved with that assay name in the Seurat object.

👍
I'm now also wondering if we'd want to have a default of using the Seurat-y "RNA" name if the default "counts" assay is pulled out?

@allyhawkins
Copy link
Member

I'm now also wondering if we'd want to have a default of using the Seurat-y "RNA" name if the default "counts" assay is pulled out?

I like this idea! I would just add a check that if the assay name is "counts" to make it equal to "RNA" before making the seurat object.

It also looks like you will need to add in the assay_name argument to the test for this function and then change line 37 to bee expect_equal(rowdata_sce, seurat_object[[assay_name]]@var.features) for it to pass tests.

@sjspielman
Copy link
Member Author

sjspielman commented Sep 1, 2022

The test here was failing because I wrote docs for assigning "RNA" but then didn't actually do it (woops!)! So, the name was being assigned as "counts". The test should be passing now with properly assigning the name as "RNA" to go with counts. I updated the test to test both default and non-default assays. I don't think all the expect_*'s were necessary to go with the new test_that() call but they can't hurt and are super quick.

Also when I was running tests locally, I was getting this from git status :

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	tests/testthat/_snaps/
	tests/testthat/test_anndata.h5

I figure we almost certainly want to be ignoring these files which are created by the tests. So, I got a bit ahead of myself and made a .gitignore for it. Let me know if that's ok!

test_that("Converting SCE to Seurat objects works as expected for custom assay name", {

seurat_object <- sce_to_seurat(sce, assay_name = "logcounts")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would also include an explicit check for the assay name being logcounts here? I don't know that you need the rest of the checks but they can't hurt to have since they should all pass if it's converted successfully.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6dd5790!

Copy link
Member

@allyhawkins allyhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once this passes checks this is good to go!

@sjspielman sjspielman merged commit 8597a30 into main Sep 1, 2022
@sjspielman sjspielman deleted the sjspielman/assay_sce_to_seurat_function branch September 1, 2022 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants