Added Seurat <-> SCE conversion sections #668

Merged
merged 13 commits into from
Mar 7, 2023

Conversation

sjspielman
Member

Closes #646

This PR adds sections for seurat/sce conversion. For the first round of review here, I think the most helpful feedback will be in terms of scope. I tried to find a good middle-ground starting point with enough info to get them going in conversion, but without going into excessive (any?) detail about edge cases, and would like to hear feedback on what can be expanded/condensed/have-nuanced-added.
One small note is that I linked the release "OSCA" version, consistent with how we handle link versions in the cheatsheets but different from how we handle OSCA links in the instruction notebooks.

The PDF will come at the end of review!

@sjspielman sjspielman requested a review from jashapiro March 6, 2023 22:00
Member

@jashapiro jashapiro left a comment

This looks like a good start. I think we probably want a bit more detail in a couple of places; see my comments below. tl;dr: I think we need to mention when the simple methods might not work and be sure we are always providing examples with a minimum amount of functionality (i.e., not dropping metadata).

module-cheatsheets/scRNA-seq-advanced-cheatsheet.md
sce_object <- as.SingleCellExperiment(seurat_obj)
```

Alternatively, you can extract individual slots from the `Seurat` object and build your `SCE` object from scratch.
Member

I feel like introducing this without a reason why we want to do it is not particularly helpful. Is there any case where we would want to extract just one assay with no metadata? If there is a specific case that we feel is worth illustrating, we should do that, but I don't want to encourage people to drop metadata as part of the conversion.

If there is something that the conversion does not handle automatically, we might want to show how to add that to an existing SCE, or rename things that might end up in unexpected locations, but I'm not sure linking to building an SCE is useful here, especially without the matching docs for how to extract each part of a Seurat object.

Member Author

Hm, maybe this isn't needed at all. I included it as a way to offer some kind of direction if conversion goes wrong, but really if conversion goes wrong, a few small details won't help (that's what consultations/office hours are more reasonably for).
Based on the Seurat source, it seems like all the bits do get converted, so I'm thinking just call it a day with Seurat::as.SingleCellExperiment().
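
For reference, the single-call conversion mentioned here might look like the following minimal sketch. The toy count matrix, the object names, and `toy_project` are hypothetical; it assumes a Seurat v4-era install with the `SingleCellExperiment` package available, and behavior may differ in other versions.

```r
library(Seurat)
library(SingleCellExperiment)

set.seed(2023)

# Hypothetical toy data: 50 genes x 20 cells of Poisson counts
toy_counts <- matrix(
  rpois(50 * 20, lambda = 5),
  nrow = 50,
  dimnames = list(paste0("gene-", 1:50), paste0("cell", 1:20))
)

# Build and normalize a small Seurat object to convert
seurat_obj <- CreateSeuratObject(
  counts = toy_counts,
  assay = "RNA",
  project = "toy_project"
)
seurat_obj <- NormalizeData(seurat_obj, verbose = FALSE)

# One-call conversion; cell metadata is expected to land in colData()
sce_object <- as.SingleCellExperiment(seurat_obj)

# Inspect what was carried over rather than assuming it
assayNames(sce_object)
colnames(colData(sce_object))
```

Checking `assayNames()` and `colData()` after conversion is a cheap way to confirm that nothing was silently dropped or renamed.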

assay = "RNA",
project = "name of your project goes here")
```

Member

We may want to note that we are not using `Seurat::as.Seurat()`, though that may be an option (it didn't use to work very well, but maybe it works better now?).

Member Author

I'll check it out on whatever versions we have in the training renv; I haven't used that function yet.


This [documentation from the `ScPCA`](https://scpca.readthedocs.io/en/latest/faq.html#what-if-i-want-to-use-seurat-instead-of-bioconductor) introduces how to convert `SCE` objects to `Seurat` objects.
Although this documentation was written for `ScPCA` datasets, the steps generally apply to any `SCE` object.
Briefly, here is how you can convert an `SCE` object to a `Seurat` object, focusing on porting over _assays_.
Member

I think we will want to show the steps to add at least the cell and feature metadata here. We present it as "optional" in the docs, but we probably shouldn't.

Member Author

Something else I also thought about is whether we should straight-up recommend scpcaTools::sce_to_seurat() here, in addition to or instead of individual steps.

Member Author

Hmm, maybe not ^, since our function only allows one assay to get ported over... But I've stubbed something out already so we'll see.

Member

I think you can point people to it as an example, but I would not exactly recommend it.

module-cheatsheets/scRNA-seq-advanced-cheatsheet.md
@sjspielman
Member Author

This is ready for another look. I made some sort of big changes, so part of the next round of review should focus on: "Was that wrong? Should I not have done that?", with extra bonus points if you get this reference.

  • I played around with as.Seurat() on a couple ScPCA-derived objects (_filtered.rds and some more processed files we generated during integration testing), and it seems to work...as long as you use its arguments properly...... I added some bullet points to explain its arguments for a couple scenarios. I separated out ScPCA-derived conversion into a subsection, hence overall header level changes to accommodate that more legibly.
  • Note that many of the PDF cheatsheets were still on blob links, but now they're on raw links which will download when clicked (browser-dependently, maybe).
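
The `as.Seurat()` route described in the first bullet above can be sketched roughly as follows. This is a minimal illustration rather than the cheatsheet's final text: the toy data, the `sample_id` column, and `toy_project` are hypothetical, `log1p()` is only a stand-in for a real normalization step, and it assumes the Seurat v4-era `as.Seurat()` method for `SingleCellExperiment` objects, whose `counts` and `data` arguments name the `SCE` assays to port over.

```r
library(Seurat)
library(SingleCellExperiment)
library(Matrix)

set.seed(2023)

# Hypothetical toy data: 50 genes x 20 cells, stored as a sparse matrix
toy_counts <- Matrix(
  rpois(50 * 20, lambda = 5),
  nrow = 50,
  sparse = TRUE,
  dimnames = list(paste0("gene-", 1:50), paste0("cell", 1:20))
)

# Build a small SCE with raw counts, a normalized assay, and cell metadata
sce_object <- SingleCellExperiment(assays = list(counts = toy_counts))
logcounts(sce_object) <- log1p(toy_counts) # stand-in for real normalization
colData(sce_object)$sample_id <- "toy_sample"

# `counts` and `data` are the names of the SCE assays to convert
seurat_obj <- as.Seurat(
  sce_object,
  counts = "counts",
  data = "logcounts",
  project = "toy_project"
)

# Cell-level metadata from colData() should appear in the Seurat metadata
colnames(seurat_obj[[]])
```

If an `SCE` object has no `logcounts` assay yet, the `data` argument would need to change accordingly, which is one of the scenarios where the arguments matter.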

@sjspielman sjspielman requested a review from jashapiro March 7, 2023 18:00
Member

@jashapiro jashapiro left a comment

This generally looks good! But I think we need to remove or cut way down the scpcaTools part, for reasons discussed below. I had a couple other suggestions where I worry about unexpected behavior that we might want to warn people about.

```

By default, all assays present in the `Seurat` object will be ported into the new `SCE` object.
To only specify that certain assays are retained, you can optionally provide the argument `assays`, as in: `assays = c("assays", "to", "keep")`.
Member

Maybe make this a concrete example with the function call?

sce_object <- Seurat::as.SingleCellExperiment(seurat_obj, assays = c("counts", "logcounts"))

As I wrote that, I realized that I don't necessarily know the default assay names or if/how they get copied/renamed. (I assume the default version does the "right" thing, but I don't know about this assay-specific version.) This seems like an important caveat to include here.

Member Author

The caveat is now a big table and bullet list of sce/seurat bookkeeping. There's a good pun/joke to be made with "confusion matrix" here.

Comment on lines 209 to 210
- `assay = NULL` specifies that, by default, all assays will be converted.
- To specify that an additional assay besides `"counts"` or `"logcounts"` should be converted, include it here as in `assay = "additional_assay_name"`.
Member

I think I'd flip the order of these two instructions. Because what you are really saying is that if you want to exclude some assays, you need to specify this argument with all of the assays you want to include (which is confusing). I think it may be less confusing to say something like "The assays argument allows you to specify additional assays to include. By default, all assays are converted..." It is probably also worth saying that if you _don't_ want additional assays, you need to set this to `c()`, which is non-intuitive.

Member Author

@sjspielman sjspielman Mar 7, 2023

Again, assay apparently does not always mean what we think it means, so I changed this in a different way!

Comment on lines 231 to 235
We also offer a conversion function `sce_to_seurat()` as part of our [`scpcaTools()` package](https://github.com/AlexsLemonade/scpcaTools/), which holds utilities used in the `ScPCA` workflow.
Again, although this function was written to convert `SCE` objects from `ScPCA`, it should generally work for most `SCE` objects, although it will only retain a single assay (raw `"counts"`) in the new `Seurat` object, and it will not retain reduced dimension representations (e.g., PCA or UMAP).
Therefore, this function is mostly useful at the early stages of processing before you have normalized counts and calculated reduced dimensions.

You can obtain this package using the `remotes` package, which may also need to be installed first:
Member

I don't think we should recommend this at this stage, for a bad reason and a good reason. The bad reason is that I'm just not confident in it yet, and don't want to be debugging it as a general tool at the moment.

The good reason is that we are using R 4.1 on the server, and scpcaTools now requires R 4.2, so this won't actually work!

I think it is fine to point people to the code as an example though!

Member Author

Purged scpcaTools for R-versioning reasons, and reorganized the remaining text.

Comment on lines 179 to 187
| Data aspect | `SCE` | `Seurat` |
|------------|---------|---------|
| Raw counts assay | `counts(sce_object)` | `seurat_obj[["RNA"]]@counts` |
| Normalized counts assay | `logcounts(sce_object)` | `seurat_obj[["RNA"]]@data` |
| Reduced dimension: PCA matrix | `reducedDim(sce_object, "PCA")` | `seurat_obj$pca@cell.embeddings` |
| Reduced dimension: UMAP matrix | `reducedDim(sce_object, "UMAP")` | `seurat_obj$umap@cell.embeddings` |
| Cell-level metadata | `colData(sce_object)` | `seurat_obj@meta.data` |
| Feature (gene)-level metadata | `rowData(sce_object)` | `seurat_obj[["RNA"]]@meta.features`|
| Miscellaneous additional metadata | `metadata(sce_object)` | `seurat_obj@misc`|
Member

👏🏼 👏🏼 👏🏼

This I think will be really useful! (My only fear is that it will make people ask more for the "Seurat equivalent" for all the things.)

Member Author

If anything it's useful for me!

@sjspielman sjspielman requested a review from jashapiro March 7, 2023 20:13
Member

@jashapiro jashapiro left a comment

LGTM!

@sjspielman sjspielman merged commit 54a0577 into master Mar 7, 2023
@sjspielman sjspielman deleted the sjspielman/scAdvanced-cheatsheet2 branch March 7, 2023 20:35
Development

Successfully merging this pull request may close these issues.

Cheat sheet(s) for advanced scRNA topics