clustering accounting for spatial coordinates #13

Open
giovp opened this issue Sep 9, 2020 · 12 comments
Assignees
Labels
enhancement ✨ New feature or request graph 🕸️

Comments

@giovp
Member

giovp commented Sep 9, 2020

Not a fully formed idea yet, but something along these lines: https://www.biorxiv.org/content/10.1101/2020.09.04.283812v1
Maybe there is a way to achieve similar results without explicit modelling and inference. It's essentially a smoothing of cluster assignments over spatial coordinates.
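
A minimal sketch of what such a smoothing could look like, assuming integer cluster labels and a sparse spatial adjacency matrix. The function name `smooth_labels` and the majority-vote rule are illustrative assumptions, not the method from the linked preprint:

```python
import numpy as np
from scipy import sparse


def smooth_labels(labels, spatial_graph, n_iter=1):
    """Majority-vote smoothing of integer cluster labels over a spatial graph.

    Hypothetical helper: `labels` has shape (n_cells,), `spatial_graph` is a
    sparse (n_cells, n_cells) adjacency matrix. Ties go to the lowest label.
    """
    labels = np.asarray(labels)
    n_cells = len(labels)
    n_clusters = labels.max() + 1
    # Add self-loops so that each cell also votes for its own label
    graph = sparse.csr_matrix(spatial_graph) + sparse.identity(n_cells, format="csr")
    for _ in range(n_iter):
        # One-hot encode the labels: rows are cells, columns are clusters
        onehot = sparse.csr_matrix(
            (np.ones(n_cells), (np.arange(n_cells), labels)),
            shape=(n_cells, n_clusters),
        )
        # votes[i, k] = number of cells in cluster k among cell i and its neighbours
        votes = (graph @ onehot).toarray()
        labels = votes.argmax(axis=1)
    return labels


# Toy example: six cells on a line with two "salt and pepper" assignments
line = sparse.diags([1, 1], offsets=[-1, 1], shape=(6, 6), format="csr")
noisy = np.array([0, 0, 1, 0, 1, 1])
smoothed = smooth_labels(noisy, line)
print(smoothed)  # [0 0 0 1 1 1]
```

With one iteration, the isolated assignments are absorbed into their spatial neighbourhoods, which is the kind of effect described above without any explicit model.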

@giovp giovp added this to the Tools to analyze spatial graph milestone Sep 9, 2020
@SabrinaRichter SabrinaRichter self-assigned this Sep 14, 2020
@M0hammadL M0hammadL self-assigned this Sep 14, 2020
@SabrinaRichter
Contributor

Are you working on a method that creates adjacency matrices for the seqFISH data, specifically split by 'Field of View'? So in practice something like 6 or 7 adjacency matrices?
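
To make the question concrete, here is a rough sketch of building one adjacency matrix per field of view. The helper `knn_adjacency_per_fov` and its inputs are hypothetical, and it assumes that coordinates within a field of view are distinct:

```python
import numpy as np
from scipy import sparse
from scipy.spatial import cKDTree


def knn_adjacency_per_fov(coords, fov, n_neighbors=2):
    """Build one symmetric kNN adjacency matrix per field of view.

    Hypothetical helper: `coords` is an (n_cells, 2) array and `fov` holds one
    field-of-view label per cell. Returns {fov_label: (cell_indices, adjacency)}.
    """
    coords = np.asarray(coords, dtype=float)
    fov = np.asarray(fov)
    result = {}
    for label in np.unique(fov).tolist():
        idx = np.flatnonzero(fov == label)
        n = len(idx)
        k = min(n_neighbors, n - 1)
        if k < 1:
            # A single cell has no neighbours within its field of view
            result[label] = (idx, sparse.csr_matrix((n, n)))
            continue
        tree = cKDTree(coords[idx])
        # Query k + 1 neighbours because each cell's nearest neighbour is itself
        _, nbrs = tree.query(coords[idx], k=k + 1)
        rows = np.repeat(np.arange(n), k)
        cols = nbrs[:, 1:].ravel()
        adj = sparse.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
        # Symmetrise: keep an edge if either cell lists the other as a neighbour
        result[label] = (idx, adj.maximum(adj.T))
    return result


# Two fields of view whose local coordinates overlap
coords = [[0, 0], [1, 0], [3, 0], [0, 0], [0, 1]]
fov = ["a", "a", "a", "b", "b"]
graphs = knn_adjacency_per_fov(coords, fov, n_neighbors=1)
```

Keeping the matrices separate avoids connecting cells from different fields of view that only look close because each field has its own coordinate origin.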

@giovp
Member Author

giovp commented Sep 14, 2020

@Koncopd is working on them!

@giovp giovp mentioned this issue Sep 15, 2020
@giovp
Member Author

giovp commented Dec 5, 2020

Still mixed feelings about this, let's keep it open

@giovp giovp added enhancement ✨ New feature or request graph 🕸️ labels Feb 11, 2021
@giovp
Member Author

giovp commented Jul 14, 2021

related to #246 and scverse/scanpy#1818

@giovp
Member Author

giovp commented Jan 7, 2022

I'm still quite tempted to add this, although the only use case I see is when the spatial graph is not a grid (but has some interesting topology). Also, this should probably live in scanpy (or muon?).

@SabrinaRichter
Contributor

The idea was to include node feature information in the clustering, right? Then it could also be interesting for grid graphs, no? The only question is whether people are interested in spatial pieces/clusters of homogeneous cell type patterns.

@giovp
Member Author

giovp commented Jan 7, 2022

Mmh, that could also be a way to do it, but in scverse/scanpy#1818 the idea is to do multiplex partitioning with the kNN graph from gene expression and the spatial graph jointly (without considering node features). With features (could be image features?), yes, it would be interesting nonetheless, and even doable via joint partitioning of the kNN graphs from gene expression and image features.

@ivirshup
Member

What do you want to achieve by including spatial information in the clustering? I can think of two reasons to do this:

  1. You want to separate cell types which are not near each other into separate categories
  2. You want to make it more likely for nearby cells to be part of the same cluster, e.g. by loosening the similarity criteria for nearby cells.

I see obvious use cases for 1, but I'm not sure you need a dedicated clustering for this. You should be able to break up your non-spatial clustering results by finding connected components in the spatial graph. This would look like:

Example

Setup (just getting to an AnnData I can do stuff with):

import scanpy as sc
import squidpy as sq

import numpy as np
import pandas as pd
from scipy import sparse

import seaborn as sns
from matplotlib import pyplot as plt
plt.rcParams["figure.figsize"] = (12, 8)

adata = sc.datasets.visium_sge("V1_Breast_Cancer_Block_A_Section_1")
adata.var_names_make_unique()

adata.var["mito"] = adata.var_names.str.startswith("MT-")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mito"], inplace=True)
sc.pp.filter_genes(adata, min_counts=1)

adata.layers["counts"] = adata.X.copy()

sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, flavor="seurat_v3", layer="counts", n_top_genes=1000)

sc.pp.pca(adata)
sc.pp.neighbors(adata)

Subsetting clusters by spatial neighbor

sq.gr.spatial_neighbors(adata)
sc.tl.leiden(adata, resolution=0.5)

def find_per_cluster_components(adata, obs_key, graph_key):
    """Label the connected components of a graph within each cluster."""
    clusters = adata.obs[obs_key].astype("category")
    graph = adata.obsp[graph_key]
    components = -np.ones(adata.n_obs, dtype=int)

    for indices in adata.obs.groupby(obs_key, observed=True).indices.values():
        # Restrict the graph to this cluster's cells, then label its components
        subgraph = graph[indices][:, indices]
        components[indices] = sparse.csgraph.connected_components(subgraph)[1]
    return pd.DataFrame({"cluster": clusters, "components": components})


df = find_per_cluster_components(adata, "leiden", "spatial_connectivities")

# Kinda gross: keep component labels for cluster "6" only; all other cells get code -1 (NaN)
in_cluster = adata.obs["leiden"] == "6"
subgroups = pd.Series(-np.ones(adata.n_obs, dtype=int), index=adata.obs_names)
subgroups.loc[in_cluster] = df.loc[in_cluster, "components"]
adata.obs["to_plot"] = pd.Categorical.from_codes(
    codes=subgroups.to_numpy(), categories=[str(x) for x in range(subgroups.max() + 1)]
)

[Image: one selected cluster, split by connected components on the spatial graph.]
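
For anyone who wants to try the idea without downloading the Visium dataset, here is a toy version of the same per-cluster connected-components split. It is a sketch using only NumPy and SciPy; `split_clusters_by_components` is a hypothetical name, and the spatial graph is assumed to be symmetric:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.csgraph import connected_components


def split_clusters_by_components(labels, spatial_graph):
    """Return, for each cell, the index of its spatial component within its cluster."""
    labels = np.asarray(labels)
    graph = sparse.csr_matrix(spatial_graph)
    components = np.full(len(labels), -1, dtype=int)
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        # Keep only the edges between cells that share this cluster label
        subgraph = graph[idx][:, idx]
        components[idx] = connected_components(subgraph, directed=False)[1]
    return components


# Six spots on a line: cluster "A" sits at both ends, cluster "B" in the middle
line = sparse.diags([1, 1], offsets=[-1, 1], shape=(6, 6), format="csr")
labels = np.array(["A", "A", "B", "B", "A", "A"])
components = split_clusters_by_components(labels, line)
print(components)  # [0 0 0 0 1 1]
```

Cluster "A" is split into two spatially separate pieces (components 0 and 1), while "B" stays in one piece. Component indices restart at 0 within each cluster, so a cell is identified by the pair (cluster, component).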

I'm not so sure how useful 2 is, but I could definitely be missing something.

@giovp
Member Author

giovp commented Jan 11, 2022

That's really cool @ivirshup! It'd be a very handy function.

Re 2., I think it'd still be useful and would be a purely "data driven" (not necessarily better) way to achieve 1. That would be done with multi-graph partitioning (native in leidenalg), where the kNN graph from gene expression and the spatial graph are both given as input. This is particularly useful for non-Visium data, where the graph actually has an interesting topology.
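
As a rough illustration of feeding both graphs into one partitioning, the sketch below merges them into a single weighted graph. This is a simpler stand-in, not the multiplex partitioning mentioned above; `combine_graphs` and `spatial_weight` are hypothetical names:

```python
import numpy as np
from scipy import sparse


def combine_graphs(expr_graph, spatial_graph, spatial_weight=0.5):
    """Weighted average of two row-normalised adjacency matrices, symmetrised."""

    def row_normalise(graph):
        graph = sparse.csr_matrix(graph, dtype=float)
        row_sums = np.asarray(graph.sum(axis=1)).ravel()
        row_sums[row_sums == 0] = 1.0  # isolated nodes keep an all-zero row
        return sparse.diags(1.0 / row_sums) @ graph

    joint = (1 - spatial_weight) * row_normalise(expr_graph) \
        + spatial_weight * row_normalise(spatial_graph)
    # Symmetrise so the result can be treated as an undirected graph
    return ((joint + joint.T) / 2).tocsr()


# Toy example with three cells: 0 and 1 are similar in expression,
# 1 and 2 are spatial neighbours
expr_graph = sparse.csr_matrix(np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]]))
spatial_graph = sparse.csr_matrix(np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]]))
joint = combine_graphs(expr_graph, spatial_graph, spatial_weight=0.5)
```

The combined matrix could then be handed to a standard graph clustering, for example through the `adjacency` argument of `sc.tl.leiden`. With leidenalg itself, `find_partition_multiplex` takes the graphs as separate layers instead of merging them, which is closer to what is described above.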

@ivirshup
Member

> re 2., I think it'd still be useful and would be a purely "data driven" (not necessarily better) way to achieve 1.

Could this problem also be thought of as "expression driven segmentation"?

I'm just a little unsure of the case where you want an output like the second plot, but without knowing those were the same cell types. Unless there's a case where you'd find something that looks different?

@LLehner
Member

LLehner commented Jun 12, 2024

Some methods for spatial clustering are now being implemented in PR #831. Potential methods are discussed in issue #789.

@LLehner LLehner closed this as completed Jun 12, 2024
@LLehner LLehner reopened this Jun 12, 2024