Add new argument for limiting the maximum epsilon #529

Merged
merged 11 commits into from
Oct 27, 2024

Conversation

prodrigues-tdx
Contributor

@prodrigues-tdx prodrigues-tdx commented Feb 22, 2022

This PR adds an argument to HDBSCAN that sets a maximum threshold on the epsilon used when selecting the best clusters. The new argument, cluster_selection_epsilon_max, applies to the EOM search method.

This is very useful when domain knowledge tells you from the start that your samples should not be very far from each other.

In this implementation, cluster_selection_epsilon_max is used in much the same way as max_cluster_size. As a result, clusters with an epsilon larger than cluster_selection_epsilon_max can still appear if there are no valid clusters below that epsilon. This is, in fact, exactly the same behavior as max_cluster_size.
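To make the fallback behavior described above concrete, here is a minimal, self-contained sketch of an EOM-style selection over a toy cluster tree. It is not the code from this PR: the `Node` class, the `select_eom` function, the stability and epsilon values, and the rule that an over-threshold leaf is kept are all assumptions made for illustration, based only on the description above. Only the name cluster_selection_epsilon_max comes from the PR (shortened to `epsilon_max` here).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node of a condensed cluster tree (illustration only)."""
    name: str
    stability: float
    epsilon: float  # assumed distance scale at which this cluster exists
    children: list = field(default_factory=list)


def select_eom(node, epsilon_max=float("inf")):
    """Return (selected cluster names, stability of that selection).

    Sketch of excess-of-mass selection with an epsilon cap: a cluster is
    rejected in favor of its descendants when they are more stable OR when
    its epsilon exceeds epsilon_max.
    """
    if not node.children:
        # Assumption: a leaf has no valid cluster below it, so it is kept
        # even when its epsilon is above the cap.
        return [node.name], node.stability

    picked, child_stability = [], 0.0
    for child in node.children:
        names, stab = select_eom(child, epsilon_max)
        picked.extend(names)
        child_stability += stab

    if child_stability > node.stability or node.epsilon > epsilon_max:
        return picked, child_stability
    return [node.name], node.stability


# Toy tree: "root" splits into "a" and "b"; "a" splits into "a1" and "a2".
tree = Node("root", 10.0, 5.0, [
    Node("a", 3.0, 2.0, [Node("a1", 1.0, 0.5), Node("a2", 1.0, 0.4)]),
    Node("b", 4.0, 1.5),
])

no_cap = select_eom(tree)[0]                   # no threshold
capped = select_eom(tree, epsilon_max=3.0)[0]  # rejects only "root"
tight = select_eom(tree, epsilon_max=1.0)[0]   # rejects "root" and "a"

print(no_cap, capped, tight)
```

In this toy tree, "b" has an epsilon of 1.5 but is still selected with `epsilon_max=1.0`, because it has no descendants to fall back on. That mirrors the statement above that over-threshold clusters can still appear when no valid cluster exists below the threshold.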

@lmcinnes
Collaborator

Sorry for taking so long to get to this. It looks like a useful addition. Any chance you could add a test to the test suite to check that it works as intended?

@prodrigues-tdx
Contributor Author

> Sorry for taking so long to get to this. It looks like a useful addition. Any chance you could add a test to the test suite to check that it works as intended?

I totally missed your comment :s I'll do that, yes.

@joaopmatias
Contributor

Hi @lmcinnes! :D It has been a while since the last update in this PR.

Could you take another look?

Thanks!

@joaopmatias
Contributor

Gentle reminder to revisit this PR @lmcinnes
Thanks!

@lmcinnes lmcinnes merged commit 5dab8e3 into scikit-learn-contrib:master Oct 27, 2024
1 check passed
@lmcinnes
Collaborator

Thanks.

3 participants