Added Seurat <-> SCE conversion sections #668
This looks like a good start. I think we probably want a bit more detail in a couple of places; see my comments below. tl;dr: I think we need to mention when the simple methods might not work and be sure we are always providing examples with a minimum amount of functionality (i.e., not dropping metadata).
```
sce_object <- as.SingleCellExperiment(seurat_obj)
```

Alternatively, you can extract individual slots from the `Seurat` object and build your `SCE` object from scratch.
I feel like introducing this without a reason why we want to do it is not particularly helpful. Is there any case where we would want to extract just one assay with no metadata? If there is a specific case that we feel is worth illustrating, we should do that, but I don't want to encourage people to drop metadata as part of the conversion.
If there is something that the conversion does not handle automatically, we might want to show how to add that to an existing SCE, or rename things that might end up in unexpected locations, but I'm not sure linking to building an SCE is useful here, especially without the matching docs for how to extract each part of a Seurat object.
Hm, maybe this isn't needed at all. I included it as a way to offer some kind of direction if conversion goes wrong, but really if conversion goes wrong, a few small details won't help (that's what consultations/office hours are more reasonably for).
Based on the Seurat source, it seems like all the bits do get converted, so I'm thinking just call it a day with `Seurat::as.SingleCellExperiment()`.
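For reference, a minimal sketch of the default conversion with a quick check that cell-level metadata was carried over. It assumes the `Seurat` and `SingleCellExperiment` packages are installed; `convert_and_report()` and `seurat_obj` are hypothetical names used only for illustration, not part of the PR.

```r
# Sketch only, under the assumptions stated above.
convert_and_report <- function(seurat_obj) {
  stopifnot(
    requireNamespace("Seurat", quietly = TRUE),
    requireNamespace("SingleCellExperiment", quietly = TRUE)
  )

  # Default conversion, as discussed in this thread
  sce_object <- Seurat::as.SingleCellExperiment(seurat_obj)

  # Compare cell-level metadata column names before and after conversion
  before <- colnames(seurat_obj@meta.data)
  after <- colnames(SummarizedExperiment::colData(sce_object))

  # Return the new object plus any columns that did not arrive in colData
  list(sce = sce_object, dropped_cell_metadata = setdiff(before, after))
}
```

If `dropped_cell_metadata` comes back empty, the simple conversion kept every cell-level column; anything listed there would need to be added to the `SCE` object by hand.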
```
assay = "RNA",
project = "name of your project goes here")
```
We may want to note that we are not using `Seurat::as.Seurat()`, though that may be an option (it didn't use to work very well, but maybe it works better now?)
I'll check it out on whatever versions we have in the training `renv`; I haven't used that function yet.
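A rough sketch of what trying `Seurat::as.Seurat()` on an `SCE` object might look like. The `counts`, `data`, and `project` arguments are my understanding of Seurat's `SingleCellExperiment` method and should be confirmed against the version in the training `renv`; `sce_to_seurat_sketch()`, `sce_object`, and the project name are hypothetical.

```r
# Sketch only: assumes the Seurat package is installed and that the SCE
# object has assays named "counts" and "logcounts".
sce_to_seurat_sketch <- function(sce_object, project = "my_project") {
  stopifnot(requireNamespace("Seurat", quietly = TRUE))

  Seurat::as.Seurat(
    sce_object,
    counts = "counts",   # SCE assay to use as raw counts
    data = "logcounts",  # SCE assay to use as normalized data
    project = project
  )
}
```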

This [documentation from the `ScPCA`](https://scpca.readthedocs.io/en/latest/faq.html#what-if-i-want-to-use-seurat-instead-of-bioconductor) introduces how to convert `SCE` objects to `Seurat` objects.
Although this documentation was written for `ScPCA` datasets, the steps generally apply to any `SCE` object.
Briefly, here is how you can convert an `SCE` object to a `Seurat` object, focusing on porting over _assays_.
I think we will want to show the steps to add at least the cell and feature metadata here. We present it as "optional" in the docs, but we probably shouldn't.
Something else I also thought about is whether we should straight-up recommend `scpcaTools::sce_to_seurat()` here, in addition to or instead of individual steps.
Hmm, maybe not ^, since our function only allows one assay to get ported over... But I've stubbed something out already so we'll see.
I think you can point people to it as an example, but I would not exactly recommend it.
Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>
This is ready for another look. I made some sort of big changes, so part of the next round of review should focus on: "Was that wrong? Should I not have done that?", with extra bonus points if you get this reference.
This generally looks good! But I think we need to remove or cut way down the scpcaTools part, for reasons discussed below. I had a couple other suggestions where I worry about unexpected behavior that we might want to warn people about.
By default, all assays present in the `Seurat` object will be ported into the new `SCE` object.
To retain only certain assays, you can optionally provide the `assays` argument, as in: `assays = c("assays", "to", "keep")`.
Maybe make this a concrete example with the function call?
`sce_object <- Seurat::as.SingleCellExperiment(seurat_obj, assays = c("counts", "logcounts"))`
As I wrote that, I realized that I don't necessarily know the default assay names or if/how they get copied/renamed. (I assume the default version does the "right" thing, but I don't know about this assay-specific version.) This seems like an important caveat to include here.
The caveat is now a big table and bullet list of sce/seurat bookkeeping. There's a good pun/joke to be made with "confusion matrix" here.
- `assay = NULL` specifies that, by default, all assays will be converted.
- To specify that an additional assay besides `"counts"` or `"logcounts"` should be converted, include it here as in `assay = "additional_assay_name"`.
I think I'd flip the order of these two instructions. Because what you are really saying is that if you want to exclude some assays, you need to specify this argument with all of the assays you want to include (which is confusing). I think it may be less confusing to say something like "The `assays` argument allows you to specify additional assays to include. By default, all assays are converted..." It's probably also worth saying that if you _don't_ want additional assays, you need to set this to `c()`, which is non-intuitive.
Again, `assay` apparently does not always mean what we think it means, so I changed this in a different way!
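To illustrate the naming mismatch, here is a hedged sketch. As far as I can tell, the `assay` argument of `Seurat::as.SingleCellExperiment()` names a `Seurat` assay such as `"RNA"`, not an `SCE` assay such as `"counts"` or `"logcounts"`; this should be checked against the installed Seurat version. `seurat_to_sce_one_assay()` and `seurat_obj` are hypothetical names.

```r
# Sketch only: assumes the Seurat package is installed.
seurat_to_sce_one_assay <- function(seurat_obj, seurat_assay = "RNA") {
  stopifnot(requireNamespace("Seurat", quietly = TRUE))

  # Seurat assay names live on the object itself (e.g., "RNA", "SCT")
  available <- names(seurat_obj@assays)
  if (!seurat_assay %in% available) {
    stop(
      "Assay '", seurat_assay, "' not found. Available Seurat assays: ",
      paste(available, collapse = ", ")
    )
  }

  # Convert using only the requested Seurat assay
  Seurat::as.SingleCellExperiment(seurat_obj, assay = seurat_assay)
}
```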
We also offer a conversion function `sce_to_seurat()` as part of our [`scpcaTools` package](https://github.com/AlexsLemonade/scpcaTools/), which holds utilities used in the `ScPCA` workflow.
Although this function was written to convert `SCE` objects from `ScPCA`, it should generally work for most `SCE` objects. However, it will only retain a single assay (raw `"counts"`) in the new `Seurat` object, and it will not retain reduced dimension representations (e.g., PCA or UMAP).
Therefore, this function is mostly useful at the early stages of processing, before you have normalized counts and calculated reduced dimensions.

You can obtain this package using the `remotes` package, which may also need to be installed first:
I don't think we should recommend this at this stage, for a bad reason and a good reason. The bad reason is that I'm just not confident in it yet, and don't want to be debugging it as a general tool at the moment.
The good reason is that we are using R 4.1 on the server, and `scpcaTools` now requires R 4.2, so this won't actually work!
I think it is fine to point people to the code as an example though!
Purged `scpcaTools` for R-versioning reasons, and reorganized the remaining text.
| Data aspect | `SCE` | `Seurat` |
|------------|---------|---------|
| Raw counts assay | `counts(sce_object)` | `seurat_obj[["RNA"]]@counts` |
| Normalized counts assay | `logcounts(sce_object)` | `seurat_obj[["RNA"]]@data` |
| Reduced dimension: PCA matrix | `reducedDim(sce_object, "PCA")` | `seurat_obj$pca@cell.embeddings` |
| Reduced dimension: UMAP matrix | `reducedDim(sce_object, "UMAP")` | `seurat_obj$umap@cell.embeddings` |
| Cell-level metadata | `colData(sce_object)` | `seurat_obj@meta.data` |
| Feature (gene)-level metadata | `rowData(sce_object)` | `seurat_obj[["RNA"]]@meta.features` |
| Miscellaneous additional metadata | `metadata(sce_object)` | `seurat_obj@misc` |
👏🏼 👏🏼 👏🏼
This I think will be really useful! (My only fear is that it will make people ask more for the "Seurat equivalent" for all the things.)
If anything it's useful for me!
LGTM!
Closes #646
This PR adds sections for seurat/sce conversion. For the first round of review here, I think the most helpful feedback will be in terms of scope. I tried to find a good middle-ground starting point with enough info to get them going in conversion, but without going into excessive (any?) detail about edge cases, and would like to hear feedback on what can be expanded/condensed/have-nuanced-added.
One small note is that I linked the `release` "OSCA" version, consistent with how we handle link versions in the cheatsheets but different from how we handle OSCA links in the instruction notebooks. The PDF will come at the end of review!