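Biological sequence classification algorithm based on density-aware patterns
Cite this article:HU Yaowei, DUAN Lei, LI Ling, HAN Chao. Biological sequence classification algorithm based on density-aware patterns[J]. Journal of Computer Applications (计算机应用), 2018, 38(2): 427-432.
Authors:HU Yaowei  DUAN Lei  LI Ling  HAN Chao
Affiliations:1. College of Computer Science, Sichuan University, Chengdu 610065; 2. West China School of Public Health, Sichuan University, Chengdu 610041; 3. College of Life Science, Sichuan University, Chengdu 610041
Funding:National Natural Science Foundation of China (61572332, 81473446); China Postdoctoral Science Foundation Special Funded Project (2016T90850); Fundamental Research Funds for the Central Universities (2016SCU04A22).
Abstract:Existing pattern-based sequence classification algorithms achieve unsatisfactory classification accuracy on biological sequences and require long model training times. To address these problems, the density-aware pattern is proposed, and a biological sequence classification algorithm based on density-aware patterns, named BSC, is designed. First, "density-aware" frequent sequence patterns are mined from the biological sequences. Then, the mined frequent sequence patterns are filtered and ranked to form classification rules. Finally, the classification rules are used to predict the class of unclassified sequences. Experiments were conducted on four real biological sequence datasets; the effect of the BSC parameters on the results was analyzed and recommended parameter settings are provided. The classification results show that, compared with four other pattern-based classification algorithms, BSC improves accuracy on the experimental datasets by at least 2.03 percentage points. The results indicate that BSC achieves high classification accuracy and execution efficiency on biological sequences.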

Keywords:biological sequence; sequence classification; sequential pattern; density-aware pattern; classification rule
Received:2017-07-24
Revised:2017-09-13

Biological sequence classification algorithm based on density-aware patterns
HU Yaowei, DUAN Lei, LI Ling, HAN Chao. Biological sequence classification algorithm based on density-aware patterns[J]. Journal of Computer Applications, 2018, 38(2): 427-432.
Authors:HU Yaowei  DUAN Lei  LI Ling  HAN Chao
Affiliation:1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China; 2. West China School of Public Health, Sichuan University, Chengdu, Sichuan 610041, China; 3. College of Life Science, Sichuan University, Chengdu, Sichuan 610041, China
Abstract:Existing pattern-based classification methods show unsatisfactory classification accuracy on biological sequences and low efficiency in model training. To address these problems, the concept of a density-aware pattern and a biological sequence classification algorithm based on density-aware patterns, named BSC (Biological Sequence Classifier), were proposed. First, frequent sequence patterns based on the density-aware concept were mined. Then, the mined frequent sequence patterns were filtered and sorted to form the classification rules. Finally, unclassified sequences were classified by applying the classification rules. Experiments were conducted on four real biological sequence datasets; the influence of the BSC algorithm parameters on the results was analyzed and recommended parameter settings were provided. The experimental results showed that the accuracy of the BSC algorithm was at least 2.03 percentage points higher than that of four other pattern-based baseline algorithms. The results indicate that the BSC algorithm has high biological sequence classification accuracy and execution efficiency.
Keywords:biological sequence; sequence classification; sequential pattern; density-aware pattern; classification rule
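
The three steps named in the abstract (mine frequent patterns, filter and rank them into rules, classify by rule) can be illustrated with a minimal Python sketch. This is not the BSC algorithm: the abstract does not define the density-aware constraint, so plain contiguous k-mers stand in for the patterns, and the function names, the support and confidence thresholds, and the first-match classification policy are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return the set of contiguous length-k substrings of seq.

    Stand-in for pattern candidates; the paper's density-aware
    patterns are NOT reproduced here (assumption).
    """
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mine_rules(train, k=3, min_support=0.5, min_confidence=0.8):
    """Mine frequent k-mers per class and rank them as rules.

    train: list of (sequence, label) pairs.
    Returns (pattern, label, confidence, support) tuples, best first.
    Thresholds are illustrative defaults, not the paper's settings.
    """
    class_sizes = Counter(label for _, label in train)
    # occurrences[pattern][label] = number of sequences of that
    # class that contain the pattern at least once
    occurrences = defaultdict(Counter)
    for seq, label in train:
        for pattern in kmers(seq, k):
            occurrences[pattern][label] += 1

    rules = []
    for pattern, per_class in occurrences.items():
        total = sum(per_class.values())
        for label, count in per_class.items():
            support = count / class_sizes[label]  # frequency within the class
            confidence = count / total            # share of matches in this class
            if support >= min_support and confidence >= min_confidence:
                rules.append((pattern, label, confidence, support))

    # Rank by confidence, then support; pattern text breaks ties
    # so the ordering is deterministic.
    rules.sort(key=lambda r: (-r[2], -r[3], r[0]))
    return rules


def classify(seq, rules, default):
    """Return the label of the first (highest-ranked) matching rule."""
    for pattern, label, _, _ in rules:
        if pattern in seq:
            return label
    return default


if __name__ == "__main__":
    # Toy data, invented for the example.
    train = [("ACGTACGT", "A"), ("ACGTTT", "A"),
             ("GGGCCC", "B"), ("GGGCCA", "B")]
    rules = mine_rules(train)
    print(classify("TTACGA", rules, default="unknown"))
```

Ranking rules before matching means a sequence that matches patterns from several classes is assigned by the most reliable rule; other policies, such as voting over all matching rules, would fit the same outline.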