Sentence-level fine-grained news classification based on convolutional neural network
Chinese title: 基于卷积神经网络的语句级新闻分类算法
Cite this article: ZENG Fan-feng, LI Yu-ke, XIAO Ke. Sentence-level fine-grained news classification based on convolutional neural network[J]. Computer Engineering and Design (计算机工程与设计), 2020, 41(4): 978-982.
Authors: ZENG Fan-feng (曾凡锋), LI Yu-ke (李玉珂), XIAO Ke (肖珂)
Affiliation: College of Information Technology, North China University of Technology, Beijing 100144, China
Abstract: Traditional Chinese text classification methods struggle to cope with the volume of information on the Internet. To address this, a sentence-level Chinese news classification scheme based on a convolutional neural network is proposed. An information extraction algorithm extracts a fixed-size news digest from news articles of varying length, which compresses the input and unifies its format. During extraction, an improved TF-IDF algorithm raises the quality of the digest; word2vec and a convolutional neural network are then combined to perform the classification. Compared with traditional methods, the word vector model makes up for the shortcomings of the bag-of-words model, and because a sentence carries far more complete semantics than an individual word, classifying on sentences is more reliable. Comparative experiments show that the scheme performs well.

Keywords: text classification; deep learning; convolutional neural network; word vector; TF-IDF algorithm; information extraction
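
The digest-extraction step described in the abstract can be illustrated with a minimal sketch. It uses plain smoothed TF-IDF, not the authors' improved variant (which the abstract does not specify), and it assumes the text has already been split into sentences and tokens (for Chinese, a word segmenter would be needed first). The names `idf_table` and `extract_digest` and the parameter `k` are hypothetical, introduced only for this illustration.

```python
import math
from collections import Counter


def idf_table(corpus):
    """Build a smoothed IDF table.

    corpus: list of documents; each document is a list of sentences;
    each sentence is a list of tokens (already segmented, by assumption).
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        # Count each token at most once per document.
        doc_freq.update({tok for sent in doc for tok in sent})
    # Smoothed IDF so that every known token gets a positive weight.
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def extract_digest(doc, idf, k=2):
    """Return exactly k sentences: the top-scoring ones, in original order.

    A sentence is scored by the mean TF-IDF weight of its tokens
    (plain TF-IDF; an assumption, not the paper's improved scoring).
    """
    tf = Counter(tok for sent in doc for tok in sent)
    total = sum(tf.values())

    def score(sent):
        if not sent:
            return 0.0
        return sum(tf[t] / total * idf.get(t, 1.0) for t in sent) / len(sent)

    ranked = sorted(range(len(doc)), key=lambda i: score(doc[i]), reverse=True)
    keep = sorted(ranked[:k])  # restore the original sentence order
    digest = [doc[i] for i in keep]
    # Pad with empty sentences so every digest has the same fixed size.
    digest += [[] for _ in range(k - len(digest))]
    return digest
```

In a pipeline like the one the abstract outlines, each token of the fixed-size digest would then be mapped to its word2vec vector and the resulting matrix passed to the convolutional network; that stage is not shown here.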
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).