I created a parallelized implementation of Agglomerative clustering that's many times faster than existing implementations and has a better runtime

Follow the full discussion on Reddit.
I've been working on a new implementation of agglomerative clustering called Reciprocal Agglomerative Clustering (RAC), based on this paper: https://arxiv.org/abs/2105.11653. In short, agglomerative clustering can be broken down into finding and merging pairs of reciprocal nearest neighbors (two clusters that are each other's nearest neighbor) in parallel, as long as the linkage function is one of the following:
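To make the reciprocal-nearest-neighbor idea concrete, here is a minimal single-threaded sketch in Python. It is not the RAC implementation from this post or the paper. The function names (`single_linkage`, `rac_sketch`) are hypothetical, and single linkage is assumed purely for illustration. Each round finds every cluster's nearest neighbor, keeps the pairs that point at each other, and merges all of those pairs at once.

```python
import math


def single_linkage(a, b, points):
    """Smallest point-to-point distance between clusters a and b (lists of point indices)."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)


def rac_sketch(points):
    """Merge reciprocal nearest-neighbor pairs round by round.

    Returns the merges as (members_a, members_b, distance) tuples.
    Illustrative only: brute-force distances, no actual parallelism.
    """
    clusters = {i: [i] for i in range(len(points))}  # cluster id -> member indices
    merges = []
    while len(clusters) > 1:
        # Step 1: nearest neighbor of every cluster. Each lookup is independent
        # of the others, so this is the step that could be run in parallel.
        # Ties are broken by the lower cluster id via tuple comparison.
        nearest = {
            c: min(
                (single_linkage(clusters[c], clusters[o], points), o)
                for o in clusters
                if o != c
            )
            for c in clusters
        }
        # Step 2: keep pairs whose members are each other's nearest neighbor.
        pairs = [
            (c, o, d)
            for c, (d, o) in nearest.items()
            if c < o and nearest[o][1] == c
        ]
        # Step 3: merge every reciprocal pair found in this round.
        # The pairs are disjoint, so the merges do not interfere with each other.
        for c, o, d in pairs:
            merges.append((sorted(clusters[c]), sorted(clusters[o]), d))
            clusters[c] = clusters[c] + clusters.pop(o)
    return merges


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (10, 0), (10, 1.5), (50, 0)]
    for left, right, dist in rac_sketch(sample):
        print(left, right, dist)
```

On the sample points, the first round merges two pairs (points 0 and 1, and points 2 and 3) in the same pass, which is where the parallel speedup would come from. A real implementation would also avoid recomputing every distance from scratch each round.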
