
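Text feature selection based on sine and cosine algorithm
Cite this article: WEN Wu, WAN Yu-hui, WEN Zhi-yun. Text feature selection based on sine and cosine algorithm[J]. Computer Engineering & Science, 2022, 44(8): 1467-1473.
Authors: WEN Wu  WAN Yu-hui  WEN Zhi-yun
Affiliations: (1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; 2. Research Center of New Telecommunication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065; 3. Chongqing Information Technology Designing Co., Ltd., Chongqing 401121)
Abstract: To obtain a better feature subset from text and remove interfering and redundant features, a hybrid feature optimization algorithm that combines a filter method with a swarm intelligence algorithm is proposed. First, the information gain of each feature word is computed and the better features are chosen as a preselected feature set; the sine cosine algorithm then searches the preselected features to obtain a refined feature set. To better balance global exploration and local exploitation in the sine cosine algorithm, an adaptive inertia weight is added. To evaluate feature subsets more precisely, a fitness function weighted by the number of features and the accuracy is introduced, and a new position update mechanism is proposed. Experimental results on KNN and Bayesian classifiers show that, compared with other feature selection algorithms and with the algorithm before improvement, the proposed feature selection algorithm achieves some improvement in classification accuracy.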

Keywords: feature selection  sine and cosine  inertia weight  classification accuracy
Received: 2020-07-10
Revised: 2021-01-19
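
The first stage described in the abstract, ranking feature words by information gain and keeping the best ones as a preselected set, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each document is a set of words (term presence only), and the function names `entropy`, `information_gain`, and `preselect`, the cutoff `k`, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """IG of one term: H(class) minus H(class | term present or absent)."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    cond = sum(len(part) / len(labels) * entropy(part)
               for part in (with_term, without) if part)
    return entropy(labels) - cond

def preselect(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]

# Toy corpus (hypothetical data): each document is a set of feature words.
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"stock", "team"}]
labels = ["sport", "sport", "finance", "finance"]
top_terms = preselect(docs, labels, 2)
```

In this toy corpus "goal" and "stock" each separate the two classes perfectly, while "team" appears once in each class and carries no information, so it would be dropped before the swarm search begins.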

Text feature selection based on sine and cosine algorithm
WEN Wu, WAN Yu-hui, WEN Zhi-yun. Text feature selection based on sine and cosine algorithm[J]. Computer Engineering & Science, 2022, 44(8): 1467-1473.
Authors:WEN Wu  WAN Yu-hui  WEN Zhi-yun
Affiliation: (1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; 2. Research Center of New Telecommunication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065; 3. Chongqing Information Technology Designing Co., Ltd., Chongqing 401121, China)
Abstract: To obtain a better feature subset from text and eliminate interfering and redundant features, a hybrid feature optimization algorithm that combines a filter method with a swarm intelligence algorithm is proposed. First, the information gain of each feature word is calculated and the better features are selected as the preselected feature set; the sine cosine algorithm is then used to search the preselected features and obtain the refined feature set. To better balance global exploration and local exploitation in the sine cosine algorithm, an adaptive inertia weight is added. To evaluate feature subsets more accurately, a fitness function weighted by the number of features and the accuracy is introduced, and a new position update mechanism is proposed. Experimental results on KNN and Bayesian classifiers show that the proposed feature selection algorithm improves classification accuracy compared with other feature selection methods and with the algorithm before improvement.
Keywords:feature selection  sine and cosine  inertia weight  classification accuracy  
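
The second stage, searching the preselected features with a sine cosine algorithm under a weighted fitness, might look roughly like the sketch below. The abstract does not give the paper's formulas, so several pieces are assumptions and are marked as such: the fitness form and `alpha=0.9`, the linearly decreasing inertia weight (the paper's weight is adaptive, with an unspecified rule), the sigmoid mapping from positions to 0/1 masks, and the clamp to [-6, 6]. Only the sine/cosine step with a decaying `r1` follows the standard sine cosine algorithm. `toy_accuracy` is a hypothetical stand-in for the accuracy of a KNN or Bayesian classifier.

```python
import math
import random

def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Weighted score rewarding high accuracy and few features (assumed form)."""
    return alpha * accuracy + (1 - alpha) * (1 - n_selected / n_total)

def to_mask(position, rng):
    """Map a real-valued position to a 0/1 feature mask via a sigmoid (assumed)."""
    return [1 if rng.random() < 1 / (1 + math.exp(-x)) else 0 for x in position]

def sca_select(evaluate, n_features, n_agents=10, n_iter=30, a=2.0,
               w_max=0.9, w_min=0.4, seed=0):
    """Sine cosine search over feature masks; evaluate(mask) returns accuracy."""
    rng = random.Random(seed)
    agents = [[rng.uniform(-1, 1) for _ in range(n_features)]
              for _ in range(n_agents)]
    best_pos, best_mask, best_fit = None, None, -math.inf
    for t in range(n_iter):
        # Score every agent and remember the best solution found so far.
        for pos in agents:
            mask = to_mask(pos, rng)
            fit = fitness(evaluate(mask), sum(mask), n_features)
            if fit > best_fit:
                best_pos, best_mask, best_fit = pos[:], mask, fit
        r1 = a - t * a / n_iter                    # standard SCA step-size decay
        w = w_max - (w_max - w_min) * t / n_iter   # assumed linear inertia weight
        for pos in agents:
            for j in range(n_features):
                r2 = rng.uniform(0, 2 * math.pi)
                r3 = rng.uniform(0, 2)
                wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
                step = r1 * wave * abs(r3 * best_pos[j] - pos[j])
                # Clamp keeps the sigmoid numerically safe (assumption).
                pos[j] = max(-6.0, min(6.0, w * pos[j] + step))
    return best_mask, best_fit

def toy_accuracy(mask):
    """Hypothetical accuracy: only features 0, 1 and 2 are informative."""
    return 0.5 + 0.5 * sum(mask[:3]) / 3

best_mask, best_fit = sca_select(toy_accuracy, n_features=8)
```

With a real classifier, `evaluate` would train and score the model on the columns selected by the mask. The weight `alpha` controls the trade-off between accuracy and subset size.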
