machine learning - How do I choose a linkage method for Hierarchical Agglomerative Clustering?


I understand HAC has several options in terms of linkage functions. We have:

  • Single linkage, which produces "straggly" clusters
  • Complete linkage, which produces tight, spherical clusters
  • Average linkage, which is sort of a compromise between the two
  • Ward's method, which is based more on variance than on actual distance

What I'm trying to figure out is, how do I know which one of these I want to use? Are there datasets where "straggly" clusters are preferable to spherical ones? Or is it more a function of what I intend to do with the clustered data?

It depends on your data.

Single linkage works reasonably well on clean data.

If you have dirty data, other linkages may be better.
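To make the clean-versus-dirty point concrete, here is a minimal sketch using SciPy's `linkage` and `fcluster`. The toy data (two grid-shaped blobs, plus a thin "bridge" of noise points between them) and the helper names `grid_blob`, `two_cluster_labels` and `separates_blobs` are my own assumptions for illustration, not something from the question.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def grid_blob(cx, cy, side=5, step=0.5):
    """Hypothetical helper: a side x side grid of points centred on (cx, cy)."""
    offsets = (np.arange(side) - (side - 1) / 2) * step
    xs, ys = np.meshgrid(cx + offsets, cy + offsets)
    return np.column_stack([xs.ravel(), ys.ravel()])


def two_cluster_labels(X, method):
    """Build the hierarchy with the given linkage and cut it into at most 2 clusters."""
    Z = linkage(X, method=method)
    return fcluster(Z, t=2, criterion="maxclust")


def separates_blobs(labels, n_per_blob):
    """True if no cluster label is shared between the first and the second blob."""
    left = set(labels[:n_per_blob])
    right = set(labels[n_per_blob:2 * n_per_blob])
    return left.isdisjoint(right)


blob_a = grid_blob(0.0, 0.0)    # x in [-1, 1], neighbours 0.5 apart
blob_b = grid_blob(10.0, 0.0)   # x in [9, 11]
n = len(blob_a)

# "Clean" data: just the two blobs.
clean = np.vstack([blob_a, blob_b])

# "Dirty" data: add a chain of noise points 0.4 apart joining the blobs.
bridge_x = np.linspace(1.4, 8.6, 19)
bridge = np.column_stack([bridge_x, np.zeros_like(bridge_x)])
dirty = np.vstack([blob_a, blob_b, bridge])

single_clean_ok = separates_blobs(two_cluster_labels(clean, "single"), n)
single_dirty_ok = separates_blobs(two_cluster_labels(dirty, "single"), n)
print("single, clean data:", single_clean_ok)
print("single, dirty data:", single_dirty_ok)

# Print the same check for the other linkages so they can be compared directly.
for method in ("complete", "average", "ward"):
    ok = separates_blobs(two_cluster_labels(dirty, method), n)
    print(f"{method}, dirty data:", ok)
```

Single linkage separates the blobs on the clean data but not on the bridged data: every gap along the bridge is smaller than the gaps inside the blobs, so the chain gets absorbed before the blobs are even complete. How the other linkages behave depends on the data, which is why the script prints their results rather than assuming them.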

Ward is similar to k-means. It may be the right choice if you want to talk about centroids and your data should be partitioned into disjoint subsets.
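As a rough illustration of the Ward and k-means connection (both try to keep within-cluster variance small), the sketch below cuts a Ward hierarchy into two clusters, computes the centroids, and then uses them as the starting point for SciPy's `kmeans2`. The synthetic data, the seed and the variable names are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2

# Assumed toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(30, 2)),
    rng.normal(loc=(20.0, 0.0), scale=0.5, size=(30, 2)),
])

# Ward linkage, cut into two disjoint clusters (labels are 1-based).
Z = linkage(X, method="ward")
ward_labels = fcluster(Z, t=2, criterion="maxclust")

# One centroid per Ward cluster, in label order.
ward_centroids = np.array(
    [X[ward_labels == c].mean(axis=0) for c in np.unique(ward_labels)]
)

# Start k-means from the Ward centroids (minit="matrix" takes them as given).
km_centroids, km_labels = kmeans2(X, ward_centroids, minit="matrix")

same_partition = np.array_equal(km_labels, ward_labels - 1)
print("Ward centroids:\n", ward_centroids)
print("k-means keeps the Ward partition:", same_partition)
```

On data this well separated the two partitions coincide. On messier data they can differ, because Ward never undoes an earlier merge while k-means keeps reassigning points.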

The other consideration is speed: SLINK (for single linkage) is fast. The others run in O(n^3) and are not usable on large data sets. Compare that with, e.g., DBSCAN, which runs in O(n log n) if done well, or k-means in O(n)...
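To get a feel for what those growth rates mean, here is a back-of-the-envelope calculation for an assumed data set of 100,000 points. It ignores constant factors entirely, so treat the numbers as orders of magnitude only. The O(n^2) row is there because that is SLINK's running time.

```python
import math

n = 100_000  # assumed data set size, chosen only for illustration

# Rough step counts with all constant factors ignored.
growth = {
    "O(n)": n,
    "O(n log n)": n * math.log2(n),
    "O(n^2)": n ** 2,
    "O(n^3)": n ** 3,
}

for name, steps in growth.items():
    print(f"{name:<11} ~ {steps:.1e} steps")

# How much more work a cubic algorithm does than a quadratic one.
ratio = growth["O(n^3)"] / growth["O(n^2)"]
print(f"O(n^3) is {ratio:.0f} times the work of O(n^2) at n = {n}")
```

The gap between the quadratic and cubic rows grows linearly with n, which is why the cubic linkages stop being practical long before single linkage does.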

