
2023-09-11 05:47:02 Author: 拉钩之后为什么要上吊

I'm looking for a clustering algorithm that allows each document to belong to more than one cluster (e.g. to at least K clusters).


All the clustering algorithms I have studied produce a partition of the dataset, which means that every document ends up in exactly one cluster.




Use a soft, probabilistic clustering algorithm such as a Gaussian Mixture Model. It gives you, for each instance, a probability of belonging to each of the clusters. To allow multiple membership, pick the top-N clusters per instance, or every cluster above a certain probability threshold, or use some other scheme.
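
As a rough illustration of the idea, here is a minimal sketch using scikit-learn's GaussianMixture. The helper name soft_memberships, its parameters, and the toy 2-D data are made up for this example. Real documents would first have to be turned into numeric feature vectors, which is not shown here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def soft_memberships(X, n_clusters=3, top_n=2, threshold=None, seed=0):
    """Fit a GMM and return (proba, members).

    proba[i, j] is the posterior probability that row i belongs to cluster j.
    members[i] is the list of cluster indices assigned to row i.
    (Hypothetical helper, written only for this example.)
    """
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed).fit(X)
    proba = gmm.predict_proba(X)  # shape (n_samples, n_clusters), rows sum to 1

    if threshold is not None:
        # Keep every cluster whose probability reaches the threshold.
        # Note: a high threshold can leave a row with no cluster at all.
        members = [np.flatnonzero(row >= threshold).tolist() for row in proba]
    else:
        # Keep the top_n most probable clusters, most probable first.
        members = np.argsort(-proba, axis=1)[:, :top_n].tolist()
    return proba, members


# Toy data: three overlapping 2-D blobs standing in for document vectors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(50, 2)) for c in (0.0, 3.0, 6.0)])

proba, members = soft_memberships(X, n_clusters=3, top_n=2)
print(members[:5])
```

With top_n every instance gets exactly that many clusters, which matches the "K clusters per document" requirement. With a threshold the number of clusters per instance varies, so which one fits depends on whether a fixed membership count matters.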