clustering accounting for spatial coordinates #13

giovp · 2020-09-09T15:39:50Z

Not a fully formed idea yet, but something along these lines: https://www.biorxiv.org/content/10.1101/2020.09.04.283812v1
Maybe a way to achieve similar results without explicit modelling and inference. It's essentially a smoothing of cluster assignments on spatial coordinates.
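
To make "smoothing of cluster assignments" concrete, here is a minimal sketch of one possible reading: a majority vote over each spot's spatial neighbors. This is an assumption for illustration, not the method of the linked preprint, and `smooth_labels` is a hypothetical helper, not a squidpy or scanpy function.

```python
import numpy as np
from scipy import sparse


def smooth_labels(labels, adjacency, n_iter=1):
    """Majority-vote smoothing of integer cluster labels over a spatial graph.

    Hypothetical helper: `labels` are integer codes (0..k-1) and `adjacency`
    is a symmetric (n_obs x n_obs) matrix of non-negative edge weights.
    """
    labels = np.asarray(labels)
    adjacency = sparse.csr_matrix(adjacency)
    n_obs = labels.shape[0]
    n_clusters = labels.max() + 1
    # Each spot also votes for itself, so isolated spots keep their label.
    neighborhood = adjacency + sparse.identity(n_obs, format="csr")
    for _ in range(n_iter):
        one_hot = sparse.csr_matrix(
            (np.ones(n_obs), (np.arange(n_obs), labels)),
            shape=(n_obs, n_clusters),
        )
        # votes[i, c] = total edge weight from spot i to spots labelled c
        votes = (neighborhood @ one_hot).toarray()
        # Ties are resolved in favor of the lowest label code.
        labels = votes.argmax(axis=1)
    return labels


# Toy example: a path graph 0-1-2-3-4 where spot 2 carries an outlier label.
path = sparse.diags([np.ones(4), np.ones(4)], offsets=[-1, 1], format="csr")
smoothed = smooth_labels([0, 0, 1, 0, 0], path)
print(smoothed)
```

On real data the inputs would presumably be something like `adata.obs["leiden"].cat.codes` and `adata.obsp["spatial_connectivities"]`, but that wiring is untested here.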

SabrinaRichter · 2020-09-14T10:59:50Z

Are you working on a method that creates adjacency matrices for the seqfish data, split by 'Field of View'? So that would be 6 or 7 adjacency matrices?

giovp · 2020-09-14T11:05:38Z

@Koncopd is working on them!

giovp · 2020-09-16T13:59:04Z

@SabrinaRichter https://leidenalg.readthedocs.io/en/stable/reference.html#leidenalg.find_partition_multiplex

giovp · 2020-12-05T16:40:30Z

Still mixed feelings about this, let's keep it open

giovp · 2021-07-14T11:26:22Z

related to #246 and scverse/scanpy#1818

giovp · 2022-01-07T09:53:27Z

I'm still quite tempted to add this, although the only use case I see is when the spatial graph is not a grid (but has some interesting topology). Also, this should probably be in scanpy (or muon?).

SabrinaRichter · 2022-01-07T13:02:45Z

The idea was to include node feature information in the clustering, right? Then it could also be interesting for grid graphs, no? The only question is whether people are interested in spatial pieces/clusters of homogeneous cell type patterns.

giovp · 2022-01-07T15:39:33Z

Mmh, that could also be a way to do it, but in scverse/scanpy#1818 the idea is to do multiplex partitioning with the kNN graph from gene expression and the spatial graph jointly (without considering node features). With features, yes (could be image features?), and it would be interesting nonetheless (and even doable by jointly partitioning the kNN graphs from gene expression and image features).

ivirshup · 2022-01-11T10:36:19Z

What do you want to achieve by including spatial information in the clustering? I can think of two reasons to do this:

1. You want to separate cell types which are not near each other into separate categories.
2. You want to make it more likely for nearby cells to be part of the same cluster, e.g. loosen the similarity criteria for nearby cells.

I see obvious use cases for 1, but I'm not sure you need a new clustering for this. You should just be able to break up your non-spatial clustering results by finding connected components in the spatial graph. This would be like:

Example

setup

Just getting to an AnnData I can do stuff with

import scanpy as sc
import squidpy as sq

import numpy as np, pandas as pd
from scipy import sparse

import seaborn as sns
from matplotlib import pyplot as plt
plt.rcParams["figure.figsize"] = (12, 8)

adata = sc.datasets.visium_sge("V1_Breast_Cancer_Block_A_Section_1")
adata.var_names_make_unique()

adata.var["mito"] = adata.var_names.str.startswith("MT-")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mito"], inplace=True)
sc.pp.filter_genes(adata, min_counts=1)

adata.layers["counts"] = adata.X.copy()

sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata, flavor="seurat_v3", layer="counts", n_top_genes=1000)

sc.pp.pca(adata)
sc.pp.neighbors(adata)

Subsetting clusters by spatial neighbor

sq.gr.spatial_neighbors(adata)
sc.tl.leiden(adata, resolution=0.5)

def find_per_cluster_components(adata, obs_key, graph_key):
    """Label each observation with its connected component within its own cluster."""
    clusters = adata.obs[obs_key].astype("category")
    graph = adata.obsp[graph_key]
    # -1 marks observations that were not assigned to any group
    components = -np.ones(adata.n_obs, dtype=int)

    for indices in adata.obs.groupby(obs_key).indices.values():
        # Restrict the graph to this cluster, then label its connected components
        subgraph = graph[indices][:, indices]
        components[indices] = sparse.csgraph.connected_components(subgraph)[1]
    return pd.DataFrame({"cluster": clusters, "components": components})


df = find_per_cluster_components(adata, "leiden", "spatial_connectivities")

# Kinda gross: keep component labels for one selected cluster only; -1 (plotted as NaN) marks everything else
selected = adata.obs["leiden"] == "6"
subgroups = pd.Series(-np.ones(adata.n_obs, dtype=int), index=adata.obs_names)
subgroups.loc[selected] = df.loc[selected, "components"]
adata.obs["to_plot"] = pd.Categorical.from_codes(codes=subgroups, categories=[str(x) for x in range(subgroups.max() + 1)])

One selected cluster, split by connected components on the spatial graph.

I'm not so sure how useful 2 is, but I could definitely be missing something.

giovp · 2022-01-11T10:47:46Z

that's really cool @ivirshup ! it'd be a very handy function.

re 2., I think it'd still be useful and would be a purely "data driven" (not necessarily better) way to achieve 1. That'd be done with multi-graph partitioning (native in leidenalg), where the kNN graph from gene expression and the spatial graph are both given as input. This is particularly useful for non-Visium data, where the graph actually has an interesting topology.

ivirshup · 2022-01-11T13:30:36Z

re 2., I think it'd still be useful and would be a purely "data driven" (not necessarily better) way to achieve 1.

Could this problem also be thought of as "expression driven segmentation"?

I'm just a little unsure of the case where you want an output like the second plot, but without knowing those were the same cell types. Unless there's a case where you'd find something that looks different?

LLehner · 2024-06-12T08:24:19Z

Some methods for spatial clustering are now being implemented in PR #831. Potential methods are discussed in issue #789.

giovp added this to the Tools to analyze spatial graph milestone Sep 9, 2020

SabrinaRichter self-assigned this Sep 14, 2020

M0hammadL self-assigned this Sep 14, 2020

giovp mentioned this issue Sep 15, 2020

Clustering #25

Merged

giovp added the enhancement ✨ (New feature or request) and graph 🕸️ labels Feb 11, 2021

LLehner closed this as completed Jun 12, 2024

LLehner reopened this Jun 12, 2024
