Similar literature
 20 similar documents found
1.
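Automatic text classification is a research hotspot and core technology in information retrieval and data mining. This paper presents a neural network model for text classification: texts are described in a vector space of feature words, and the network is trained on a sample set of texts, exploiting its learning ability, to obtain a neural network classifier. The classifier can be used to classify ordinary text.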

2.
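Text classification is an important topic in data mining and one of the important ways of acquiring information resources. Based on an analysis of a large number of texts with topics, the neural-network-based text classifier establishes a strict correspondence between its network structure and the title and paragraph structure of a document. The training algorithm of the neural network, including the forward-propagation algorithm and the backward-correction algorithm, is described in some detail, and fairly detailed calculation methods are given for the main steps of the algorithm. Tests of the neural-network-based text classifier show that the model's parameters are relatively simple to set and that its text classification performance is good.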

3.
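After introducing the problem, key technologies, and system structure of text classification, this paper describes in detail an automatic Chinese text classification method that uses a BP neural network with a momentum term as the classifier. The method uses a normalized TFIDF algorithm to compute the weights of the feature vectors and uses the expected cross entropy statistic to prune the feature vector set. In addition, we tested on the TanCorp12 dataset how the number of feature terms and the number of training iterations affect the classifier's macro-averaged and micro-averaged performance.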
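
The normalized TFIDF weighting mentioned in item 3 can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common tf × log(N/df) form followed by L2 normalization; the function name `normalized_tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def normalized_tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed form: weight = tf * log(N / df), then L2-normalized per document.
    This is a common variant, not necessarily the one used in the paper.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        raw = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        # Guard against a zero vector (all terms occur in every document).
        vectors.append({t: (w / norm if norm > 0 else 0.0) for t, w in raw.items()})
    return vectors

# Hypothetical toy corpus, already tokenized.
docs = [
    ["neural", "network", "text"],
    ["text", "classification"],
    ["neural", "text", "text"],
]
vectors = normalized_tfidf(docs)
```

In this toy corpus the term "text" occurs in every document, so its weight is zero regardless of how often it appears; that is the behavior that lets TF-IDF suppress uninformative terms.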

4.
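The vector space model (VSM) is used to describe texts, latent semantic indexing (LSI) is used for feature reconstruction and dimensionality reduction, and a BP neural network text classifier is constructed. Combining Bayesian classification with this classifier yields a hybrid text classifier. Experimental results show that the hybrid classifier improves classification accuracy and classification speed.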

5.
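Traditional recurrent neural network models face exploding or vanishing gradients when handling long-term dependencies, and their many parameters make training slow. To address this, a text classification method based on a bidirectional GRU neural network and a Bayesian classifier is proposed. The bidirectional GRU network extracts text features, the TF-IDF algorithm assigns weights, and a Bayesian classifier makes the classification decision. This remedies the unidirectional GRU's insufficient dependence on the following context, reduces the number of parameters, shortens model training time, and improves text classification efficiency. Comparative simulation experiments on two kinds of text data show that, compared with traditional recurrent neural networks, the algorithm effectively improves the efficiency and accuracy of text classification.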

6.
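To address the low accuracy of a single BP neural network in text classification, this paper cascades multiple BP neural networks and uses the Adaboost algorithm to adjust the weights of the individual BP weak classifiers, obtaining a stable and efficient BP_Adaboost strong classifier. Experimental results show that the text classification accuracy of BP_Adaboost is 9.09% higher than that of the BP neural network.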
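
The weight adjustment that Adaboost performs in item 6 can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard binary AdaBoost update, and it swaps the paper's BP neural network weak classifiers for one-dimensional threshold stumps to stay short. All function names and the toy data are hypothetical.

```python
import math

def train_stump(xs, ys, weights):
    """Weak learner (a threshold stump, standing in for a BP network).

    Returns (weighted_error, threshold, polarity) with the lowest weighted error.
    The stump predicts `polarity` when x >= threshold, else `-polarity`.
    """
    best = None
    for thr in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= thr else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    """Standard binary AdaBoost; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, thr, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak classifier
        ensemble.append((alpha, thr, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [
            w * math.exp(-alpha * y * (polarity if x >= thr else -polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all weak classifiers."""
    score = sum(a * (pol if x >= thr else -pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```

On this toy data every single stump misclassifies at least two points, while the weighted vote of three stumps classifies all six correctly, which is the sense in which boosting turns weak classifiers into a stronger one.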

7.
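A feature representation method for large-scale text classification   Total citations: 4, self-citations: 0, citations by others: 4
With the rapid development of networks and information technology, text classification has become a key technology for processing and organizing large amounts of document data. The feature representation of text severely limits improvements in text classification performance. Building on the classical vector space model and the tf-idf weighting formula, an improved weighting formula intended for text classification, the p-idf formula, is proposed. After comparing four typical text classifiers (Bayes, K-nearest neighbor, neural network, and support vector machine), an experimental text classification system is built with a support vector machine classifier. Experiments compare the effect of three weighting formulas, tf-idf, p-idf, and LTC, on classifier performance in the text classification system, and confirm that the proposed p-idf formula is sound and effective.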

8.
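A text classifier is implemented using supervised machine learning. An improved CHI statistic method is applied to the word segmentation results for feature extraction, the traditional TF-IDF weighting formula is modified (the result is called ETF-IDF), and a resource-optimizing network (RON, Resource-optimizing Networks) is used to build the classifier. Classification experiments on the Chinese text classification corpus provided by Fudan University show that the classifier achieves higher classification quality than the BP algorithm, and that the ETF-IDF weighting formula outperforms the traditional TF-IDF weighting formula, improving classification precision and performance and meeting the requirements of automatic Chinese text classification.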

9.
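An improved neural-network-based text classification algorithm*   Total citations: 1, self-citations: 0, citations by others: 1
A text classification algorithm combining a feedforward neural network with K-nearest neighbor is proposed and implemented. In selecting feature terms, an improved TFIDF feature selection method is used that takes into account the differing meanings and weights of the different tag groups in Web text. Finally, an open test of the designed classifier is carried out; experimental results show that the classifier significantly improves the recall and precision of text classification.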

10.
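Feature word attribute vectors are corrected according to a hidden Markov model, and a genetic neural network classifier for Chinese text is constructed by exploiting the learning ability of a BP neural network and optimizing it with a genetic algorithm. Experiments show that this text classifier achieves high accuracy.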

11.
The generalization problem of an artificial neural network (ANN) classifier with an unlimited training sample size, namely asymptotic optimization in probability, is discussed in this paper. As an improved ANN model, the pre-edited ANN classifier shows better practical performance than the standard one. However, it has not been widely applied because of the absence of related theoretical support. To further promote its application in practice, the asymptotic optimization of the pre-edited ANN classifier is studied in this paper. To help study ANN asymptotic optimization in probability, we review previous research on asymptotic optimization in probability of non-parametric classifiers and group the main methods into four classes: two-step method, one-step method, generalization method and hypothesis method. In this paper, we adopt a mixed generalization/hypothesis method to prove that the pre-edited ANN is asymptotically optimal in probability. Furthermore, a simulation is presented to provide experimental support for our theoretical work.

12.
Control chart pattern recognition is one of the most important tools in statistical process control for identifying process problems. Unnatural patterns exhibited by such charts can be associated with certain assignable causes affecting the process. In this paper, multi-resolution wavelets analysis (MRWA) is used to extract distinct features for unnatural patterns by providing distinct time–frequency coefficients. A reduced set of parameters is derived from these coefficients and used as input to an artificial neural network (ANN) classifier. Results show that the performance of the proposed technique in classifying shift, trend and cyclic patterns is superior to that of an ANN classifier operating on coded observed data.

13.
Empirical results illustrate the pitfalls of applying an artificial neural network (ANN) to classification of underwater active sonar returns. During training, a back-propagation ANN classifier learns to recognize two classes of reflected active sonar waveforms: waveforms having two major sonar echoes or peaks and those having one major echo or peak. It is shown how the classifier learns to distinguish between the two classes. Testing the ANN classifier with different waveforms of each type generated unexpected results: the number of echo peaks was not the feature used to separate classes.

14.
It is well recognized that impact-acoustic emissions contain information that can indicate the presence of adhesive defects in bonding structures. In our previous papers, an artificial neural network (ANN) was adopted to assess the bonding integrity of tile–walls, with features extracted from the power spectral density (PSD) of the impact-acoustic signals acting as the classifier input. However, in addition to the inconvenience posed by general drawbacks such as long training time and the large number of training samples needed, the performance of the classic ANN classifier deteriorates when abnormal impacts produce similar spectral characteristics for different bonding statuses. In this paper our previous work is extended by employing the least-squares support vector machine (LS-SVM) classifier instead of the ANN to derive a bonding integrity recognition approach with better reliability and enhanced immunity to surface roughness. With the help of specially designed artificial sample slabs, experimental results obtained with the proposed method are provided and compared with those obtained using the ANN classifier, demonstrating the effectiveness of the present strategy.

15.
This paper presents a neural network architecture using a support vector machine (SVM) as an inference engine (IE) for classification of light detection and ranging (Lidar) data. Lidar data gives a sequence of laser backscatter intensities obtained from laser shots generated from an airborne object at various altitudes above the earth surface. Lidar data is pre-filtered to remove high frequency noise. As the Lidar shots are taken from above the earth surface, the data contains some air backscatter information, which is of no importance for detecting underwater objects. For this reason, the air backscatter information is eliminated from the data and a segment of the remaining data is subsequently selected to extract features for classification. This segment is then encoded using linear predictive coding (LPC) and polynomial approximation. The coefficients thus generated are used as inputs to the two branches of a parallel neural architecture. The decisions obtained from the two branches are vector multiplied and the result is fed to an SVM-based IE that presents the final inference. Two parallel neural architectures using multilayer perceptron (MLP) and hybrid radial basis function (HRBF) are considered in this paper. The proposed structure fits the Lidar data classification task well due to the inherent classification efficiency of neural networks and the accurate decision-making capability of SVM. A Bayesian classifier and a quadratic classifier were considered for the Lidar data classification task but they failed to offer high prediction accuracy. Furthermore, a single-layered artificial neural network (ANN) classifier was also considered and it failed to offer good accuracy. The parallel ANN architecture proposed in this paper offers high prediction accuracy (98.9%) and is found to be the most suitable architecture for the proposed task of Lidar data classification.

16.
Automatic text classification methods based on the vector space model (VSM), artificial neural networks (ANN), K-nearest neighbor (KNN), Naive Bayes (NB) and support vector machine (SVM) have been applied to English language documents and have gained popularity among text mining and information retrieval (IR) researchers. This paper proposes the application of VSM and ANN to the classification of Tamil language documents. Tamil is a morphologically rich Dravidian classical language. The development of the internet has led to an exponential increase in the amount of electronic documents not only in English but also in other regional languages. The automatic classification of Tamil documents has not been explored in detail so far. In this paper, a corpus is used to construct and test the VSM and ANN models. Methods of document representation and of assigning weights that reflect the importance of each term are discussed. In a traditional word-matching based categorization system, the most popular document representation is VSM. This method needs a high dimensional space to represent the documents. The ANN classifier requires a smaller number of features. The experimental results show that the ANN model achieves 93.33%, which is better than the performance of VSM, which yields 90.33%, on Tamil document classification.

17.
Robust radar target classifier using artificial neural networks   Total citations: 3, self-citations: 0, citations by others: 3
In this paper an artificial neural network (ANN) based radar target classifier is presented, and its performance is compared with that of a conventional minimum distance classifier. Radar returns from realistic aircraft are synthesized using a thin wire time domain electromagnetic code. The time varying backscattered electric field from each target is processed using both a conventional scheme and an ANN-based scheme for classification purposes. It is found that a multilayer feedforward ANN, trained using a backpropagation learning algorithm, provides a higher percentage of successful classification than the conventional scheme. The performance of the ANN is found to be particularly attractive in an environment of low signal-to-noise ratio. The performance of both methods is also compared when a preemphasis filter is used to enhance the contributions from the high frequency poles in the target response.

18.
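Radar target recognition in the high-frequency region is currently a development focus in high technology. Based on a multi-scattering-center target model, this paper applies wavelet analysis as a powerful tool for extracting target feature vectors and, combined with a proposed improved artificial neural network algorithm, explores the radar target recognition problem under a wideband continuous-wave operating regime. Experimental results confirm the effectiveness of the target recognition method that combines wavelet analysis with neural networks.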

19.
It is widely believed in the pattern recognition field that when a fixed number of training samples is used to design a classifier, the generalization error of the classifier tends to increase as the number of features gets larger. In this paper, we discuss the generalization error of artificial neural network (ANN) classifiers in high-dimensional spaces, under the practical condition that the ratio of the training sample size to the dimensionality is small. Experimental results show that the generalization error of ANN classifiers seems much less sensitive to the feature size than that of 1-NN, Parzen and quadratic classifiers.

20.
A case study is conducted on discriminating traffic accidents into accident-free and accident cases on the Konya-Afyonkarahisar highway in Turkey, using a proposed hybrid method that combines a new data preprocessing method called subtractive clustering attribute weighting (SCAW) with classifier algorithms, with the help of Geographical Information System (GIS) technology. Data preprocessing is needed in problems of this kind (the traffic accident case study) to improve the discrimination of classifier algorithms, including artificial neural network (ANN), adaptive network based fuzzy inference system (ANFIS), support vector machine (SVM), and decision tree. We therefore propose the SCAW preprocessing method and combine it with classifier algorithms. In this study, the experimental data were obtained using GIS. The GIS attributes obtained are day, temperature, humidity, weather conditions, and month of the accident. To evaluate the performance of the proposed hybrid method, classification accuracy, sensitivity and specificity values are used. The experimental results are classification success rates of 53.93%, 52.25%, and 38.76% using ANN, ANFIS, and SVM with RBF kernel alone, respectively. With the proposed hybrid method, classification accuracies of 67.98%, 70.22%, and 61.24% are obtained using SCAW with ANN, SCAW with SVM (radial basis function (RBF) kernel), and SCAW with ANFIS, respectively. Combined with classifier algorithms, the proposed SCAW method achieves very promising results in the discrimination of traffic accidents.
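
The three evaluation measures named in item 20 (classification accuracy, sensitivity and specificity) can be computed from binary predictions as in the minimal sketch below. It uses the standard confusion-matrix definitions; the function name `binary_metrics` and the toy labels are hypothetical and are not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of positive cases correctly found
    specificity = TN / (TN + FP): share of negative cases correctly rejected
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity

# Hypothetical labels: 1 = accident case, 0 = accident-free case.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
accuracy, sensitivity, specificity = binary_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters when the two classes are unbalanced, because accuracy alone can look acceptable while one class is mostly misclassified.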
