fix plotting fast_hdbscan condensed trees #666

Merged: 3 commits, Jan 9, 2025

JelmerBot
Contributor

The recent change to a floating-point child_size in fast_hdbscan's condensed_tree broke its interoperability with the CondensedTree._select_clusters() plotting functionality. This PR resolves the issue by detecting the selected clusters from the cluster labels rather than recomputing them from the condensed tree. The main advantage of this approach is that the plotting code no longer needs to know every cluster-extraction method implemented in the two repositories.
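To illustrate the idea of recovering the selected clusters from labels, here is a minimal sketch. It is not the code merged in this PR. It assumes the condensed tree is a NumPy structured array with `parent`, `child`, `lambda_val`, and `child_size` fields, that point ids run from 0 to `n_points - 1`, and that cluster ids start at `n_points`. The function name `selected_clusters_from_labels` and the toy data are hypothetical. For each label it takes the lowest common ancestor of the tree nodes that the labelled points fall out of.

```python
import numpy as np


def selected_clusters_from_labels(tree, labels):
    """Map each non-noise label to the condensed-tree cluster id it corresponds to."""
    labels = np.asarray(labels)
    n_points = labels.shape[0]

    # Assumption: rows whose child id is >= n_points describe cluster-to-cluster edges.
    is_cluster_row = tree["child"] >= n_points
    cluster_parent = {
        int(child): int(parent)
        for parent, child in zip(
            tree["parent"][is_cluster_row], tree["child"][is_cluster_row]
        )
    }

    # For every point, record the cluster it falls out of.
    point_rows = tree[~is_cluster_row]
    point_parent = np.full(n_points, -1, dtype=np.int64)
    point_parent[point_rows["child"]] = point_rows["parent"]

    def ancestors(cluster):
        # Chain from the cluster itself up to the root.
        chain = [cluster]
        while cluster in cluster_parent:
            cluster = cluster_parent[cluster]
            chain.append(cluster)
        return chain

    selected = {}
    for label in np.unique(labels):
        if label < 0:  # skip noise points
            continue
        direct = np.unique(point_parent[labels == label])
        chains = [ancestors(int(c)) for c in direct]
        common = set(chains[0]).intersection(*chains[1:])
        # The first shared node on the way up is the lowest common ancestor.
        selected[int(label)] = next(c for c in chains[0] if c in common)
    return selected


# Toy tree: root 6 splits into clusters 7 and 8; point 5 drops out as noise.
# child_size is stored as float, mirroring the change described above.
dtype = [
    ("parent", np.intp),
    ("child", np.intp),
    ("lambda_val", np.float64),
    ("child_size", np.float64),
]
tree = np.array(
    [
        (6, 5, 0.1, 1.0),
        (6, 7, 0.2, 3.0),
        (6, 8, 0.2, 2.0),
        (7, 0, 0.5, 1.0),
        (7, 1, 0.6, 1.0),
        (7, 2, 0.7, 1.0),
        (8, 3, 0.4, 1.0),
        (8, 4, 0.5, 1.0),
    ],
    dtype=dtype,
)
labels = np.array([0, 0, 0, 1, 1, -1])
selected = selected_clusters_from_labels(tree, labels)
print(selected)  # {0: 7, 1: 8}
```

Because the lookup relies only on the labels and the parent/child structure, it never reads child_size, so it is unaffected by whether that field is stored as an integer or a float.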

Only the code in flat.py used the _select_clusters() function. That file already contains a re-implementation of the required functionality, _new_select_clusters(), so I changed it to use that function instead.

@JelmerBot
Contributor Author

Tests should pass once assert_raises is replaced with pytest.raises as in #667.
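For readers unfamiliar with that migration, a minimal sketch of the pattern follows. The validator `check_min_cluster_size` and the test name are hypothetical and are not taken from this repository's test suite; only `pytest.raises` itself is real pytest API.

```python
import pytest


def check_min_cluster_size(value):
    # Hypothetical validator, used only to have something that raises.
    if value < 2:
        raise ValueError("min_cluster_size must be at least 2")
    return value


def test_rejects_small_min_cluster_size():
    # Older style, for comparison:
    #     assert_raises(ValueError, check_min_cluster_size, 1)
    # pytest style: the call that should fail goes inside the context manager.
    with pytest.raises(ValueError, match="at least 2"):
        check_min_cluster_size(1)
```

The `match` argument is optional; it additionally checks the exception message against a regular expression.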

@lmcinnes
Collaborator

lmcinnes commented Jan 7, 2025

I think if you resync from master it should pick up the pytest.raises changes now.

@lmcinnes lmcinnes merged commit 4c80850 into scikit-learn-contrib:master Jan 9, 2025
1 check passed