Explainable Threshold Tree Construction Based on Beam Search
Citation: LI Yu-Qun, HE Zhen-Feng. Explainable Threshold Tree Construction Based on Beam Search[J]. Computer Systems & Applications, 2023, 32(11): 247-252.
Authors: LI Yu-Qun  HE Zhen-Feng
Affiliation: College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
Received: 2023/3/21
Revised: 2023/5/11
Abstract: Traditional clustering algorithms can partition a dataset into clusters, but the resulting clusters are usually difficult to explain. Iterative mistake minimization (IMM) is a common explainable clustering algorithm: it builds a threshold tree in which every split tests a single feature, so each cluster can be explained by the feature-threshold pairs on the path from the root node to its leaf node. However, in each round of partitioning, the threshold tree considers only the feature-threshold pair with the fewest mistakes, and this greedy strategy is prone to local optima. To address this problem, this study introduces beam search, which retains a predetermined number of states in each round of partitioning to mitigate the local-optimum problem and thereby improves the consistency between the clustering given by the threshold tree and the initial clustering. Finally, experiments verify the effectiveness of the algorithm.
Keywords: explainable clustering  beam search  threshold tree  K-means
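
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`splits`, `beam_threshold_tree`), the state layout, and the ranking of states by cumulative mistakes are assumptions made here for illustration, and the cluster centres are assumed to be distinct and already computed (for example by K-means). With `beam_width=1` the sketch falls back to the greedy choice of the fewest-mistake cut at each node.

```python
def splits(points, centers):
    """Enumerate every axis-aligned cut that separates at least two centres.

    points  : list of (coordinates, cluster_id) pairs
    centers : dict mapping cluster_id -> coordinates
    """
    for f in range(len(next(iter(centers.values())))):
        lo = min(c[f] for c in centers.values())
        hi = max(c[f] for c in centers.values())
        values = {x[f] for x, _ in points} | {c[f] for c in centers.values()}
        for theta in sorted(v for v in values if lo <= v < hi):
            # A mistake is a point that falls on the other side of the cut
            # from its own centre; mistaken points are dropped from the node.
            kept = [(x, k) for x, k in points
                    if (x[f] <= theta) == (centers[k][f] <= theta)]
            left = ([p for p in kept if p[0][f] <= theta],
                    {k: c for k, c in centers.items() if c[f] <= theta})
            right = ([p for p in kept if p[0][f] > theta],
                     {k: c for k, c in centers.items() if c[f] > theta})
            yield len(points) - len(kept), (f, theta), left, right


def beam_threshold_tree(points, centers, beam_width=1):
    """Return (total_mistakes, cuts) for the best tree found by beam search."""
    # Assumed state layout: (mistakes so far, cuts so far, nodes still to split).
    beam = [(0, [], [(points, centers)])]
    # A threshold tree with k leaves has exactly k - 1 internal nodes.
    for _ in range(len(centers) - 1):
        successors = []
        for mistakes, cuts, pending in beam:
            (pts, ctrs), rest = pending[0], pending[1:]
            for m, cut, left, right in splits(pts, ctrs):
                # Only nodes that still hold several centres need another cut.
                children = [n for n in (left, right) if len(n[1]) > 1]
                successors.append((mistakes + m, cuts + [cut], rest + children))
        # Keep the beam_width partial trees with the fewest mistakes so far.
        successors.sort(key=lambda s: s[0])
        beam = successors[:beam_width]
    return beam[0][0], beam[0][1]


# Toy data (hypothetical): three well-separated clusters in two dimensions.
centers = {0: (0, 0), 1: (10, 0), 2: (0, 10)}
points = [((0, 1), 0), ((1, 0), 0), ((10, 1), 1),
          ((9, 0), 1), ((0, 9), 2), ((1, 10), 2)]
greedy_result = beam_threshold_tree(points, centers, beam_width=1)
beam_result = beam_threshold_tree(points, centers, beam_width=3)
```

Each returned cut is a `(feature, threshold)` pair, so a leaf's explanation is the sequence of cuts on its path. Note that in this sketch a wider beam is a heuristic rather than a guarantee: states are pruned by their cumulative mistake count, so a partial tree that would finish best can still be discarded early.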