首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 187 毫秒
1.
提出了一种结合加权特征向量空间模型和径向基概率神经网络(RBPNN)的文本分类方法.该方法针对传统的文本特征提取方法的不足,根据文本中特征项的位置信息和所属类别信息定义特征权重,然后,依据特征项的权值计算文档特征项的频数,通过TFIDF函数计算特征值并得到文本的特征向量,最后,采用RBPNN网络分类,通过最小二乘算法求解神经网络的第二隐层和输出层之间的权值,最终训练获得文本分类模型.文本分类实验结果表明,该方法在文本分类中表现出较好的效果,具有较好查全率和查准率.  相似文献   

2.
针对解决新闻文本如何有效提取关键主题信息进行归纳分类的问题,提出一种基于RoBERTa-wwm与注意力机制混合的深度学习文本分类模型RoBERTa-ATTLSTM。模型首先采用RoBERTa-wwm预训练语言模型获取文本的动态特征信息;利用双向长短期记忆网络Bi-LSTM进一步提取文本更深层次的语义关系,将最后一个时序输出作为特征向量输入到注意力机制层;最后通过全连接层神经网络得到文本分类结果。在今日头条与新浪新闻THUCnews数据集上的实验表明,模型RoBERTa-ATTLSTM的准确率、精确率、F1值、召回率均为最高,且模型可有效提取文本中字词特征信息,提高新闻文本分类效果。  相似文献   

3.
作为自然语言处理技术中的底层任务之一,文本分类任务对于上游任务有非常重要的辅助价值。而随着最近几年深度学习广泛应用于NLP中的上下游任务的趋势,深度学习在下游任务文本分类中性能不错。但是目前的基于深层学习网络的模型在捕捉文本序列的长距离型上下文语义信息进行建模方面仍有不足,同时也没有引入语言信息来辅助分类器进行分类。针对这些问题,提出了一种新颖的结合Bert与Bi-LSTM的英文文本分类模。该模型不仅能够通过Bert预训练语言模型引入语言信息提升分类的准确性,还能基于Bi-LSTM网络去捕捉双向的上下文语义依赖信息对文本进行显示建模。具体而言,该模型主要有输入层、Bert预训练语言模型层、Bi-LSTM层以及分类器层搭建而成。实验结果表明,与现有的分类模型相比较,所提出的Bert-Bi-LSTM模型在MR数据集、SST-2数据集以及CoLA数据集测试中达到了最高的分类准确率,分别为86.2%、91.5%与83.2%,大大提升了英文文本分类模型的性能。  相似文献   

4.
景丽  何婷婷 《计算机科学》2021,48(z2):170-175,190
文本分类是自然语言处理领域中的重要内容,常用于信息检索、情感分析等领域.针对传统的文本分类模型文本特征提取不全面、文本语义表达弱的问题,提出一种基于改进TF-IDF算法、带有注意力机制的长短期记忆卷积网络(Attention base on Bi-LSTM and CNN,ABLCNN)相结合的文本分类模型.该模型首先利用特征项在类内、类间的分布关系和位置信息改进TF-IDF算法,突出特征项的重要性,并结合Word2vec工具训练的词向量对文本进行表示;然后使用ABLCNN提取文本特征,ABLCNN结合了注意力机制、长短期记忆网络和卷积神经网络的优点,既可以有重点地提取文本的上下文语义特征,又兼顾了局部语义特征;最后,将特征向量通过softmax函数进行文本分类.在THUCNews数据集和online_shopping_10_cats数据集上对基于改进TF-IDF和ABLCNN的文本分类模型进行实验,结果表明,所提模型在两个数据集上的准确率分别为97.38%和91.33%,高于其他文本分类模型.  相似文献   

5.
针对文本自动分类问题,提出一种基于概率型神经网络(PNN)和学习矢量量化(LVQ)相结合的文本分类算法,该方法借助TFIDF方法提取文本特征及特征值,形成文本分类特征向量,利用概率型神经网络构建分类模型,并利用LVQ学习算法对神经网络模型竞争层网络进行学习,使相应模式向量相互靠拢,远离其他模式,从而实现文本分类.实验结果表明,提出的该方法在文本分类中表现了很好的效果,不仅具有很好的分类准确率,还表现出很好的学习效率.  相似文献   

6.
对于越南语组块识别任务,在前期对越南语组块内部词性构成模式进行统计调查的基础上,该文针对Bi-LSTM+CRF模型提出了两种融入注意力机制的方法: 一是在输入层融入注意力机制,从而使得模型能够灵活调整输入的词向量与词性特征向量各自的权重;二是在Bi-LSTM之上加入了多头注意力机制,从而使模型能够学习到Bi-LSTM输出值的权重矩阵,进而有选择地聚焦于重要信息。实验结果表明,在输入层融入注意力机制后,模型对组块识别的F值提升了3.08%,在Bi-LSTM之上加入了多头注意力机制之后,模型对组块识别的F值提升了4.56%,证明了这两种方法的有效性。  相似文献   

7.
郁友琴  李弼程 《计算机科学》2021,48(12):219-225
微博用户兴趣发现对社交网络的个性化推荐和信息传播的正确引导具有重要意义,因此提出了一种基于多粒度文本特征表示的微博用户兴趣识别方法.首先,从主题层、词序层和词汇层3个方面对微博用户构造文本向量,利用LDA提取内容的主题特征,通过LSTM学习内容的语义特征,引入腾讯AI Lab开源词向量获取词义特征;然后,将以上3种特征向量拼接得到的多粒度文本特征表示矩阵输入CNN中,进行文本分类训练;最后,通过多端输出层实现对微博用户的兴趣识别.实验结果表明,多粒度特征表示模型的分类实验结果比单粒度特征表示模型的精准率、召回率和F1值分别提高了8%,12%和13%.基于对文本粗、细语义粒度和词粒度的综合考量,结合神经网络分类算法,多粒度特征表示模型的评价指标均优于单粒度特征表示模型.  相似文献   

8.
赖文辉  乔宇鹏 《计算机应用》2018,38(9):2469-2476
对垃圾短信进行过滤识别研究具有重要的社会价值和时代背景意义。针对传统的人工设计短信特征选择方法中存在数据稀疏、特征信息共现不足和特征提取困难的问题,提出一种基于词向量和卷积神经网络(CNN)的垃圾短信识别方法。首先,使用word2vec的skip-gram模型根据维基中文语料库训练出短信数据集中每个词的词向量,并将每条短信中各个词组所对应的词向量组成表示短信的二维特征矩阵;然后,把特征矩阵作为卷积神经网络的输入,通过卷积层的不同尺度卷积核提取多尺度短信特征,以及利用1-max pooling池化策略得到局部最优特征;最后,将局部最优特征组成融合特征向量放入softmax分类器中得出分类结果。在10万条短信数据上进行的实验结果表明,在特征提取方式相同的情况下,基于卷积神经网络模型的识别准确率能够达到99.5%,比传统的机器学习模型提高了2.4%~5.1%,且各模型的识别准确率均保持在94%以上。  相似文献   

9.
为了对中文微博进行有效的情感极性识别,基于表情符能改变或加强微博文本的情感极性这一认知事实,提出基于表情符注意力机制的微博情感分析神经网络模型。该模型在使用双向循环神经网络模型(BiLSTM)学习文本的特征表示时,利用表情符注意力机制,得到文本结合表情符后新的特征表示,从而实现微博情感识别。实验结果显示,与输入纯文本和表情符的Bi-LSTM模型相比,基于表情符注意力机制的模型准确率提高了4. 06%;与仅输入纯文本的Bi-LSTM模型相比,基于表情符注意力机制的模型准确率提高了6. 35%。  相似文献   

10.
提出融合领域特征向量与词向量的识别方法,将基于武器装备名特征库与维基语料训练得到的领域特征向量引入Bi-LSTM+CRF模型,并对武器装备名进行自动识别实验。引入领域特征向量后模型的识别准确率由78.30%提升到82.10%,召回率由65.25%提升到67.30%,对未登录武器装备名识别的召回率从45.08%提升到50.16%。此外,将领域特征融入条件随机场(conditional random field,CRF)模型,实验表明,在小规模语料库与领域特征支持的情况下,CRF模型的效果要优于Bi-LSTM+CRF模型且对稀疏特征的利用效率更优。  相似文献   

11.
Abstract This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained.  相似文献   

12.
European Community policy and the market   总被引:1,自引:0,他引:1  
Abstract This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.  相似文献   

13.
融合集成方法已经广泛应用在模式识别领域,然而一些基分类器实时性能稳定性较差,导致多分类器融合性能差,针对上述问题本文提出了一种新的基于多分类器的子融合集成分类器系统。该方法考虑在度量层融合层次之上通过对各类基多分类器进行动态选择,票数最多的类别作为融合系统中对特征向量识别的类别,构成一种新的自适应子融合集成分类器方法。实验表明,该方法比传统的分类器以及分类融合方法识别准确率明显更高,具有更好的鲁棒性。  相似文献   

14.
Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them.  相似文献   

15.
This paper presents control charts models and the necessary simulation software for the location of economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented - one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion defectives.  相似文献   

16.
Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008  相似文献   

17.
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.  相似文献   

18.
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)SM and Team Software Process (TSP){SM}. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.  相似文献   

19.
基于复小波噪声方差显著修正的SAR图像去噪   总被引:4,自引:1,他引:3  
提出了一种基于复小波域统计建模与噪声方差估计显著性修正相结合的合成孔径雷达(Synthetic Aperture Radar,SAR)图像斑点噪声滤波方法。该方法首先通过对数变换将乘性噪声模型转化为加性噪声模型,然后对变换后的图像进行双树复小波变换(Dualtree Complex Wavelet Transform,DCWT),并对复数小波系数的统计分布进行建模。在此先验分布的基础上,通过运用贝叶斯估计方法从含噪系数中恢复原始系数,达到滤除噪声的目的。实验结果表明该方法在去除噪声的同时保留了图像的细节信息,取得了很好的降噪效果。  相似文献   

20.
Abstract  This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号