首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 78 毫秒
1.
在应用SVM对文本进行分类时,用传统的TFIDF算法对文本特征进行选择会产生高维特征向量问题,这个问题干扰了SVM的效率和准确性,使SVM的性能下降.为了解决SVM文本分类过程中产生的这个问题,提出一种基于本体的特征项约简方法.该方法通过本体找出特征向量中具有同义关系、组成关系和上下位关系的冗余特征项,然后对它们进行合并降低特征向量的维数.试验结果表明,采用本体约简特征向量的方法改进了SVM分类器的性能.  相似文献   

2.
近年来,随着生活节奏的提高和互联网的迅速发展,人们更倾向于在众多社交平台上用短文本进行交流,进而可能有人通过发布垃圾文本妨碍人们的正常社交,扰乱网络的绿色环境.为了解决这个问题,我们提出了基于TF-IDF和改进BP神经网络的社交平台垃圾文本检测的方法.通过该方法,实现对社交平台上的垃圾文本过滤.首先,通过结巴分词和去停分词构造关键词数据集;其次,对文本表示的关键词向量运用计算各关键词的权重从而对文本向量进行降维,得到特征向量;最后,在此基础上,运用BP神经网络分类器对短文本进行分类,检测出垃圾文本并进行过滤.实验结果表明用该方法在1000维文本特征向量的情况下分类平均准确率达到了97.720%.  相似文献   

3.
特征向量的高维性以及训练样本分布不均影响文本分类器性能。提出了一种聚类模式下的KNN改进方法。首先使用一种改进的聚类方法对文本特征集进行初步筛选,随后使用一种基于类别的改进KNN分类器进行分类,减少了噪声样本对测试样本类别判定的干扰。试验结果表明本文提出的分类模型在分类效率上得到提高。  相似文献   

4.
文本分类存在维数灾难、数据集噪声及特征词对分类贡献不同等问题,影响文本分类精度。为提高文本分类精度,在数据处理方面提出一种新方法。该方法首先对数据集进行去噪处理,结合特征提取算法和语义分析方法对数据实现降维,再利用词语语义相关度对文本特征向量中每个特征词赋予不同权重;并利用经过上述处理的文本数据学习分类器。实验结果表明,该文本处理方法能够有效提高文本分类精度。  相似文献   

5.
针对文本情感分类准确率不高的问题,提出基于CCA-VSM分类器和KFD的多级文本情感分类方法。采用典型相关性分析对文档的权重特征向量和词性特征向量进行降维,在约简向量集上构建向量空间模型,根据模型之间的差异度设计VSM分类器,筛选出与测试文档差异度较小的R个模型作为核Fisher判别的输入,最终判别出文档的情感观点。实验结果表明:该方法比传统支持向量机有较高的分类准确率和较快的分类速度,权重特征和词性特征对分类准确率的影响较大。  相似文献   

6.
一种神经网络文本分类器的设计与实现   总被引:1,自引:0,他引:1  
李斗  李弼程 《计算机工程与应用》2005,41(17):107-109,119
论文着重介绍了一种基于神经网络的文本分类器,分类器使用神经网络作为分类工具,特征词的词频组成原始特征向量,和神经网络输入层的神经元一一对应。并引入了信息检索中的常用技术——潜在语义索引,训练过程中结合遗传算法,优化神经网络的初始权值。最后对分类器进行了开放性测试,实验表明分类器对文本分类具有较高的平均查全率和平均精度。  相似文献   

7.
基于全信息矩阵的多分类器集成方法   总被引:12,自引:0,他引:12       下载免费PDF全文
唐春生  金以慧 《软件学报》2003,14(6):1103-1109
自动文本分类是提高信息利用效率和质量的有效方法,而多分类器的有效组合能够得到更高的分类准确率.给出了样本集在多分类器下的全信息矩阵概念,并提出一种权重自适应调整的多分类器集成方法.该方法能够自适应地选择分类器组合及确定分类器权重,并利用分类统计信息指导分类结果的集成判决.通过在标准文本集Reuters-21578上的实验表明:该方法能从查准率和查全率两方面提高文本分类的整体性能,同时表明了该方法的有效性.  相似文献   

8.
多标签文本分类问题是多标签分类的重要分支之一,现有的方法往往忽视了标签之间的关系,难以有效利用标签之间存在着的相关性,从而影响分类效果.基于此,本文提出一种融合BERT和图注意力网络的模型HBGA(hybrid BERT and graph attention):首先,利用BERT获得输入文本的上下文向量表示,然后用Bi-LSTM和胶囊网络分别提取文本全局特征和局部特征,通过特征融合方法构建文本特征向量,同时,通过图来建模标签之间的相关性,用图中的节点表示标签的词嵌入,通过图注意力网络将这些标签向量映射到一组相互依赖的分类器中,最后,将分类器应用到特征提取模块获得的文本特征进行端到端的训练,综合分类器和特征信息得到最终的预测结果.在Reuters-21578和AAPD两个数据集上面进行了对比实验,实验结果表明,本文模型在多标签文本分类任务上得到了有效的提升.  相似文献   

9.
免疫阴性选择分类器在信息恢复中的应用   总被引:3,自引:0,他引:3  
文中的信息恢复系统是基于网络获取文本信息的系统,利用基于熵的信息抽取技术将获得的网络文本转换成特征向量文件.免疫阴性选择分类器是基于免疫系统丁细胞选择原理设计检测器,利用协同进化算法进化检测器,进化得到的检测器对信息恢复系统中的文本特征向量进行分类.分类后得到的有用文件用于系统中的信息恢复.实验结果表明,与传统的朴素贝叶斯分类器比较,该方法具有更高的分类准确性,不仅验证了免疫阴性选择分类器的良好性能,同时也提高了信息恢复准确性.  相似文献   

10.
在文本分类研究中,集成学习是一种提高分类器性能的有效方法.Bagging算法是目前流行的一种集成学习算法.针对Bagging算法弱分类器具有相同权重问题,提出一种改进的Bagging算法.该方法通过对弱分类器分类结果进行可信度计算得到投票权重,应用于Attribute Bagging算法设计了一个中文文本自动分类器.采用kNN作为弱分类器基本模型对Sogou实验室提供的新闻集进行分类.实验表明该算法比Attribute Bagging有更好的分类精度.  相似文献   

11.
Abstract This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained.  相似文献   

12.
European Community policy and the market   总被引:1,自引:0,他引:1  
Abstract This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.  相似文献   

13.
融合集成方法已经广泛应用在模式识别领域,然而一些基分类器实时性能稳定性较差,导致多分类器融合性能差,针对上述问题本文提出了一种新的基于多分类器的子融合集成分类器系统。该方法考虑在度量层融合层次之上通过对各类基多分类器进行动态选择,票数最多的类别作为融合系统中对特征向量识别的类别,构成一种新的自适应子融合集成分类器方法。实验表明,该方法比传统的分类器以及分类融合方法识别准确率明显更高,具有更好的鲁棒性。  相似文献   

14.
Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them.  相似文献   

15.
This paper presents control charts models and the necessary simulation software for the location of economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented - one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion defectives.  相似文献   

16.
Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008  相似文献   

17.
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.  相似文献   

18.
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)SM and Team Software Process (TSP){SM}. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.  相似文献   

19.
基于复小波噪声方差显著修正的SAR图像去噪   总被引:4,自引:1,他引:3  
提出了一种基于复小波域统计建模与噪声方差估计显著性修正相结合的合成孔径雷达(Synthetic Aperture Radar,SAR)图像斑点噪声滤波方法。该方法首先通过对数变换将乘性噪声模型转化为加性噪声模型,然后对变换后的图像进行双树复小波变换(Dualtree Complex Wavelet Transform,DCWT),并对复数小波系数的统计分布进行建模。在此先验分布的基础上,通过运用贝叶斯估计方法从含噪系数中恢复原始系数,达到滤除噪声的目的。实验结果表明该方法在去除噪声的同时保留了图像的细节信息,取得了很好的降噪效果。  相似文献   

20.
Abstract  This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号