Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add violin raster #5076

Merged
merged 7 commits into from
Nov 3, 2021
Merged

add violin raster #5076

merged 7 commits into from
Nov 3, 2021

Conversation

samuel-marsh
Copy link
Collaborator

@samuel-marsh samuel-marsh commented Sep 13, 2021

Hi Seurat Team,

Just a quick PR if you guys like it, that adds raster plotting to VlnPlots. Given individual data points can still be helpful but like other plots rendering that many points in vector form becomes inefficient in multiple ways.

I used ggrastr and added that to package imports page as well.

Thanks,
Sam

@samuel-marsh samuel-marsh changed the base branch from master to develop September 13, 2021 18:42
@saketkc saketkc self-assigned this Oct 8, 2021
@saketkc
Copy link
Collaborator

saketkc commented Oct 8, 2021

Hi @samuel-marsh, thanks for the PR! We would want to limit the dependencies as much as possible. For rasterization elsewhere, we rely on scattermore. Would it be possible to rely here on that instead?

@samuel-marsh
Copy link
Collaborator Author

Hi @saketkc
Ya I tried that at first for that reason and you can pass geom_scattermore(pointsize = pt.size, position = "jitter") and it works. However, when splitting the plots I couldn't figure out how to add the additional jitter positioning in the current Seurat code (Maybe it's some rust on my ggplot2 and I'm missing it or maybe not possible with scattermore). I do really like scattermore and it's speed so have used that where possible.

It certainly doesn't look bad without the additional position if you're really looking it does appear a little off. If you guys are ok with it though I can definitely modify the PR.

Best,
Sam

@saketkc
Copy link
Collaborator

saketkc commented Oct 15, 2021

Thanks Sam! Do you have an example I could look at? Just want to avoid adding another dependency if possible.

@samuel-marsh
Copy link
Collaborator Author

Hey @saketkc so here is quick example using the SeuratData hcabm40k dataset because has decent number of points. A few notes:
--So yes scattermore does raster the points but as mentioned when using split.by they don't really offset in the jitter which makes it difficult to tell the splits apart. Can also see visually by extra space on edges of first and last group in the split. Again maybe there is way to specify this extra info to scattermore that I'm missing (kinda hope so).
--ggrastr has no such issues because you just wrap the rasterize function around the existing ggplot calls so it maintains that info.

--Even in this relatively small dataset scattermore is definitely way faster actually rendering the plot which is part of desire to rasterize in first place. Didn't time it, but very noticeable.

--The file size of scattermore plot is also quite a bit smaller even in this small dataset though both are far cry from vector form. In the plots described below:
Vector Form (p1 or p4): 2.2MB
scattermore no split (p2): 89KB
ggrastr no split (p5): 181KB

Vector Form (p7): 2.3MB
scattermore split (p3): 207KB
ggrastr split (p6): 295KB

If there isn't way to make scattermore offset things when split then overall may not be worth the trouble of the additional dependencies just to add raster option for VlnPlot because scattermore seems like much better solution for any/all of the tSNE/UMAP or FeatureScatter plots.

Best,
Sam

test_data <- hcabm40k.SeuratData::hcabm40k

test_data$groups <- sample(c("group1", "group2", "group3", "group4"), size = ncol(test_data), replace = TRUE)

test_data <- NormalizeData(test_data)

p1 <- VlnPlot_scattermore(object = test_data, "ACTB", raster = FALSE, pt.size = 0.5)
p2 <- VlnPlot_scattermore(object = test_data, "ACTB", raster = TRUE, pt.size = 0.5)
p3 <- VlnPlot_scattermore(object = test_data, "ACTB", raster = TRUE, pt.size = 0.5, split.by = "groups")

scatter_patch <- wrap_plots(p1, p2, p3, ncol = 1)
scatter_patch

p4 <- VlnPlot_ggrastr(object = test_data, "ACTB", raster = FALSE, pt.size = 0.5)
p5 <- VlnPlot_ggrastr(object = test_data, "ACTB", raster = TRUE, pt.size = 0.5)
p6 <- VlnPlot_ggrastr(object = test_data, "ACTB", raster = TRUE, pt.size = 0.5, split.by = "groups")

ggrastr_patch <- wrap_plots(p4, p5, p6, ncol = 1)
ggrastr_patch

p7 <- VlnPlot_ggrastr(object = test_data, "ACTB", raster = FALSE, pt.size = 0.5, split.by = "groups")

scatter_patch.pdf
ggrastr_patch.pdf

@saketkc
Copy link
Collaborator

saketkc commented Oct 19, 2021

Thanks very much! I will take a closer look and also discuss it internally.

@saketkc
Copy link
Collaborator

saketkc commented Oct 28, 2021

Hi @samuel-marsh, I modified your PR to make ggrastr optional. Can you let me know if the changes look good and suit your use case?

@samuel-marsh
Copy link
Collaborator Author

Hi @saketkc makes sense and good idea to make optional. One change is that I believe ggrastr should be moved to the Suggests section instead of Imports in DESCRIPTION file as I believe anything in Imports section will be installed when installing Seurat which doesn't make it optional anymore.

Thanks very much!

@saketkc
Copy link
Collaborator

saketkc commented Nov 3, 2021

Thanks for checking - I did remove it from imports, but did not add it to suggests, which I have fixed.

@saketkc saketkc merged commit b8ff4f8 into satijalab:develop Nov 3, 2021
@saketkc
Copy link
Collaborator

saketkc commented Nov 3, 2021

Thanks a lot for your contribution, Sam!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants