Multiplex community detection with Leiden #1818

Closed
1 of 5 tasks
giovp opened this issue Apr 28, 2021 · 9 comments

@giovp
Member

giovp commented Apr 28, 2021

  • Additional function parameters / changed functionality / changed defaults?
  • New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?
  • New plotting function: A kind of plot you would like to see in sc.pl?
  • External tools: Do you know an existing package that should go into sc.external.*?
  • Other?

Adding multiplex community detection from Leiden: https://leidenalg.readthedocs.io/en/stable/multiplex.html#layer-multiplex

It seems very straightforward and would be the simplest way to integrate two modalities on the graph. We would make great use of it in Squidpy (RNA counts + image), but I think it should live in Scanpy because it could be useful for other multi-modal data.

This is a duplicate of #1107 and it has been extensively discussed in #1117. In the latter, however, lots of thought went into normalization/processing, which is superfluous for this case as it is specific to CITE-seq data. Here we'd just want to allow users to get partitions out of multiple graphs.

This could be done in two ways:

  • Adding arguments to the existing sc.tl.leiden, so that it accepts multiple graphs and a resolution parameter per graph.
  • Creating a separate function, sc.tl.leiden_multiplex.

Any thoughts on this @ivirshup @Koncopd?

I think @WeilerP also had some thoughts along these lines. Have you ever tried this out? Is there any other analysis tool you explored with a similar purpose? Would be interested to hear your thoughts!
Worth mentioning that another approach, the WNN from Seurat, was also mentioned here: #1117 (comment)
although I am not sure how much work that requires.

@WeilerP
Contributor

WeilerP commented Apr 28, 2021

Sorry, I didn't use multiplex community detection. I focused more on preprocessing and briefly on graph construction. I stored the CITE-Seq data in adata.obsm and the gene-protein mapping as a column in adata.var - if this helps.

@LuckyMD
Contributor

LuckyMD commented Apr 28, 2021

Hey! I looked at multiplex louvain a bit a few years ago (and put it in a grant that didn't get funded in the end ^^)... I guess one of the difficult things about actually using this is tuning the inter-layer weight. I reckon this should actually be regarded as a new approach to multi-modal data integration. And it would require quite a bit of parameter tuning to understand how these edge weights need to be set. Hence I'm not sure if we just want to add it like this...
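
To make the tuning concern concrete: the multiplex objective in leidenalg is a weighted sum over layers, Q = Σ_k λ_k Q_k, evaluated on one shared membership. The toy sketch below is pure Python and uses textbook Newman modularity as the per-layer quality, which is an assumption for illustration and not leidenalg's internals; the edge lists and function names are invented. It shows how the preferred partition can flip as the weight of the second layer changes.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman modularity of an undirected, unweighted edge list."""
    m = len(edges)
    internal = defaultdict(int)    # edges inside each community
    degree_sum = defaultdict(int)  # summed degrees per community
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


def multiplex_quality(layers, membership, layer_weights):
    """Weighted sum of per-layer modularity for one shared membership."""
    return sum(
        w * modularity(edges, membership)
        for edges, w in zip(layers, layer_weights)
    )


# Layer A: two triangles joined by a bridge. Layer B: three disjoint pairs.
layer_a = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
layer_b = [(0, 1), (2, 3), (4, 5)]
layers = [layer_a, layer_b]

two_triangles = [0, 0, 0, 1, 1, 1]  # the partition layer A prefers
three_pairs = [0, 0, 1, 1, 2, 2]    # the partition layer B prefers

for weights in ([1.0, 0.2], [1.0, 1.0]):
    q_tri = multiplex_quality(layers, two_triangles, weights)
    q_pairs = multiplex_quality(layers, three_pairs, weights)
    best = "two_triangles" if q_tri > q_pairs else "three_pairs"
    print(weights, round(q_tri, 3), round(q_pairs, 3), best)
```

With a low weight on the second layer the two-triangle partition scores higher, and with equal weights the three-pair partition does, so the layer weight alone can decide which clustering comes out.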

@giovp
Member Author

giovp commented May 1, 2021

Hey all, thanks for the feedback.

@LuckyMD I totally see the point but disagree

I guess one of the difficult things about actually using this is tuning the inter-layer weight.

Exactly, and this will differ (I think?) across multi-modal technologies (e.g. CITE-seq or spatial), and for spatial data it will potentially differ across tissues (some tissues have more structured spatial/image feature graphs than others).

Nevertheless, I think it would be very empowering for users to be able to play around with this. It is "just" another knob to tune, and one that would enrich the analysis experience imho.

@LuckyMD
Contributor

LuckyMD commented May 2, 2021

Hmm... I wonder what the policy should be for Scanpy in these kinds of situations. So far I believe we have mainly added tools that have previously been used for sc analysis (either published tools or ones that have been used in sc papers). I'm not aware of that being the case for multiplex clustering yet. Do we really want to add methods to core that are ML tools, but not necessarily used for sc analysis yet? That would open up quite a large range of methods as possible contributions (but might take us outside the Scanpy core remit... assuming that's clearly defined).

I would be a bit hesitant especially about something as experimental as multiplex clustering. Users will have particular expectations of a tool in Scanpy core. If there isn't a canonical use case example (something we can use as a tutorial, or point to as a reason for when this can work), then it might not meet those expectations.

@bio-la
Contributor

bio-la commented Jun 9, 2021

I agree with Malte that there's so much more ML out there that just adding a function because it can be quickly implemented can be risky.
However, if we're not the ones to try, then who else should? So what if we test leiden_multiplex in comparison to Seurat's WNN on the tutorial data, and decide then? I would be surprised if we didn't find a set of params for leiden_multiplex that allows us to replicate the Seurat clustering results. Similarity network fusion (implemented for CITE-seq in the CiteFuse package) also comes to mind. Probably a project of its own tbh.
Happy to help with this.

@giovp
Member Author

giovp commented Jun 14, 2021

@bio-la thanks for the interest. If you are keen to take a stab at implementation I'd be very happy to support.

re: multiplex partition being a "new algorithm" or too experimental and therefore not suitable for Scanpy, I still disagree. FWIW it's already implemented in the great muon-data package (both for louvain and leiden):

@LuckyMD
Contributor

LuckyMD commented Jun 14, 2021

@giovp Cool! I hadn't seen this. If this is referenced in their paper, then multiplex leiden would fit into the category of "used in sc analysis" that I was arguing for before, and I would be happy with it being in here. I do think that some testing should ideally happen on our side, so it would be great if you want to take this on, @bio-la!

@imeMFK01

Hi,

What should be the current workaround to run the following line of code?

sc.tl.leiden_multiplex(rna, ["rna_connectivities", "protein_connectivities"]) # Adds key "leiden_multiplex" by default

I want to replicate this tutorial.

Thanks.

@ivirshup
Member

This functionality is available through muon: https://muon.readthedocs.io/en/latest/api/generated/muon.tl.leiden.html

Sorry for the confusion, but that tutorial is based on a development branch which is out of date and should be taken down.

I'm also going to close this issue, since multimodal analysis in general is handled through muon.

@ivirshup closed this as not planned on Apr 17, 2024

7 participants