
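Short coding sequence identification of human genes based on YKW graphical representation
Cite this article: LUO Jia-wei, YAN Jun, HE Hai-feng. Short coding sequence identification of human genes based on YKW graphical representation [J]. Journal of Computer Applications (计算机应用), 2011, 31(8): 2087-2091.
Authors: LUO Jia-wei (骆嘉伟)  YAN Jun (颜军)  HE Hai-feng (何海峰)
Affiliation: College of Information Science and Engineering, Hunan University, Changsha 410082
Funding: National Natural Science Foundation of China; Natural Science Foundation of Hunan Province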
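Abstract: To address the identification of short human coding sequences, a new graphical representation, the YKW graph, is proposed. It is based on the bias of bases across the three codon positions and on the classification of bases by their physicochemical properties. Nine effective area-matrix features are extracted from this graph. During identification, an incremental feature selection algorithm adds four statistical features to improve the recognition rate, and principal component analysis (PCA) is applied to reduce the dimensionality of the resulting 13 features. Finally, a support vector machine (SVM) classifies short human sequences as coding or non-coding. Experimental results show that, compared with other methods, the proposed method achieves better recognition results with fewer features (seven or four).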

Keywords: graphical representation; short coding sequence identification; area matrix; gene sequence
Received: 2011-01-24
Revised: 2011-03-15

Short coding sequence identification of human genes based on YKW graphical representation
LUO Jia-wei, YAN Jun, HE Hai-feng. Short coding sequence identification of human genes based on YKW graphical representation [J]. Journal of Computer Applications, 2011, 31(8): 2087-2091.
Authors: LUO Jia-wei  YAN Jun  HE Hai-feng
Affiliation: College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
Abstract: Based on the bias of bases across the three codon positions and on the chemical properties of the bases, the YKW graph, a new graphical representation of gene sequences, is introduced for recognizing short coding sequences in human genes. Nine effective area-matrix features are extracted from the YKW curves. During identification, an incremental feature selection algorithm adds four statistical features to improve accuracy. Principal Component Analysis (PCA) is then applied to reduce the dimensionality, and a Support Vector Machine (SVM) classifies short human sequences as coding or non-coding. The experimental results show that the proposed method uses fewer features (seven or four) and achieves better recognition results than other methods.
Keywords: graphical representation; short coding sequence identification; area matrix; gene sequence
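
The abstract does not spell out how the YKW curves or the area-matrix features are built, so the following is only a minimal sketch of the underlying idea of codon-position bias over physicochemical base classes. It assumes that Y, K and W follow the standard IUPAC codes (Y = pyrimidine C/T, K = keto G/T, W = weak A/T); the function names `codon_position_class_freqs` and `feature_vector` are hypothetical, and the nine numbers produced are simple class frequencies, not the paper's area-matrix features.

```python
from collections import Counter

# Assumption: Y/K/W are the standard IUPAC base classes. The paper's actual
# YKW curve construction is not described in the abstract.
CLASSES = {
    "Y": set("CT"),  # pyrimidines (complement class: purines A/G)
    "K": set("GT"),  # keto bases (complement class: amino A/C)
    "W": set("AT"),  # weak hydrogen bonding (complement class: strong C/G)
}


def codon_position_class_freqs(seq):
    """Return {class: [f1, f2, f3]}, the fraction of bases at each codon
    position (1st, 2nd, 3rd) that belong to the given class."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    freqs = {name: [0.0, 0.0, 0.0] for name in CLASSES}
    if usable == 0:
        return freqs
    n_codons = usable // 3
    for pos in range(3):
        # Count the bases found at this codon position across all codons.
        counts = Counter(seq[pos:usable:3])
        for name, members in CLASSES.items():
            freqs[name][pos] = sum(counts[b] for b in members) / n_codons
    return freqs


def feature_vector(seq):
    """Flatten the per-position frequencies into a 9-element list (Y, K, W)."""
    f = codon_position_class_freqs(seq)
    return f["Y"] + f["K"] + f["W"]
```

A vector like this could be one input to the kind of dimensionality reduction and classification the abstract describes; the sequence is read in frame starting at its first base, which is itself an assumption for short fragments.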
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.