Understanding the "-" operator when combining two UMAP models #932

kimartin · 2022-10-13T14:51:03Z

I'm currently playing around with a labelled dataset where the labels have two nesting levels (basically, individuals within groups).
I'm trying to separate the two levels so I can see whether there is a group effect independent of the individual-level labels.

Fitting two models and contrasting them might give me what I want, but I'm not sure I understand what the "-" operator really does.

So the code would be something like this (with X the data and y the individual-level labels):
from umap import UMAP
regular_mapper = UMAP(min_dist=0, random_state=42).fit(X)  # unsupervised
individual_mapper = UMAP(min_dist=0, random_state=42).fit(X, y)  # supervised on individual labels
final_mapper = regular_mapper - individual_mapper  # contrast of the two

That is, the first model is a regular unsupervised UMAP, the second is a supervised UMAP given the individual-level labels, and the third is the contrast of the two.
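
To make my mental model concrete, here is a toy sketch of what I assume "-" does to the graph edge weights: a fuzzy set difference, i.e. "in the left graph AND NOT in the right graph". This is only my reading, not umap-learn's actual implementation, which may differ in details. The function name `fuzzy_contrast` and all the edge weights are made up for illustration.

```python
# Toy sketch in pure Python of my ASSUMED reading of "-" as a fuzzy set
# difference on graph edge weights. This is NOT umap-learn's actual code.

def fuzzy_contrast(left, right):
    """Return edge weights for `left AND NOT right` (product of memberships)."""
    contrast = {}
    for edge, left_weight in left.items():
        # An edge missing from the right graph has zero membership there.
        right_weight = right.get(edge, 0.0)
        contrast[edge] = left_weight * (1.0 - right_weight)
    return contrast

# Hypothetical membership strengths for three edges (made-up numbers).
regular_graph = {(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.5}
individual_graph = {(0, 1): 0.9, (0, 2): 0.1}

contrast_graph = fuzzy_contrast(regular_graph, individual_graph)
for edge, weight in sorted(contrast_graph.items()):
    print(edge, round(weight, 2))
```

Under this assumption, an edge that is strong in both graphs ends up weak in the contrast, while an edge that is strong only in the regular graph stays strong. If that is roughly what happens, I would expect the third mapper to down-weight the structure that the two models share.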

After plotting the embeddings, my dataset (with the points coloured by individual label) looks like this under the three mappers above:

Am I correct in thinking the third mapper gives me a view of the data without the influence of the individual-level classes?

Thank you in advance and sorry if I fundamentally misunderstood something.
