
1.

Automatic thesaurus construction for spam filtering using revised back propagation neural network (total citations: 1; self-citations: 0; citations by others: 1)

Hao Xu Bo Yu 《Expert systems with applications》2010,37(1):18-23

Email has become one of the fastest and most economical forms of communication, and one of the most pervasive applications, used daily by millions of people worldwide. However, the growth in email users has brought a dramatic increase in spam over the past few years. This paper proposes a new spam filtering system using a revised back propagation (RBP) neural network and automatic thesaurus construction. The conventional back propagation (BP) neural network learns slowly and is prone to becoming trapped in a local minimum, which leads to poor performance and efficiency. The authors present the RBP neural network to overcome these limitations. A well-constructed thesaurus is recognized as a valuable tool for text classification; it can also overcome the problem that keyword-based spam filters ignore the relationships between words. Experiments are conducted on the Ling-Spam corpus. The results show that the proposed system achieves higher performance, especially when the RBP neural network is combined with automatic thesaurus construction.

2.

Chinese document re-ranking based on automatically acquired term resource

Donghong Ji Shiju Zhao Guozheng Xiao 《Language Resources and Evaluation》2009,43(4):385-406

In this paper, we address the problem of document re-ranking in information retrieval, which is usually conducted after initial retrieval to improve the rankings of relevant documents. To deal with this problem, we propose a method which automatically constructs a term resource specific to the document collection and then applies the resource to document re-ranking. The term resource includes a list of terms extracted from the documents as well as their weighting and correlations computed after initial retrieval. The term weighting, based on local and global distribution, makes the re-ranking insensitive to different choices of pseudo relevance, while the term correlation helps avoid bias toward any specific concept embedded in the queries. Experiments with NTCIR3 data show that the approach not only improves the performance of initial retrieval but also contributes significantly to standard query expansion.

3.

Semi-supervised fuzzy co-clustering algorithm for document categorization

Yang Yan Lihui Chen William-Chandra Tjhi 《Knowledge and Information Systems》2013,34(1):55-74

In this paper, we propose a new semi-supervised fuzzy co-clustering algorithm called SS-FCC for categorization of large web documents. In this new approach, the clustering process is carried out by incorporating some prior domain knowledge of a dataset in the form of pairwise constraints provided by users into the fuzzy co-clustering framework. With the help of those constraints, the clustering problem is formulated as the problem of maximizing a competitive agglomeration cost function with fuzzy terms, taking into account the provided domain knowledge. The constraint specifies whether a pair of objects “must” or “cannot” be clustered together. The update rules for fuzzy memberships are derived, and an iterative algorithm is designed for the soft co-clustering process. Our experimental studies show that the quality of clustering results can be improved significantly with the proposed approach. Simulations on 10 large benchmark datasets demonstrate the strength and potentials of SS-FCC in terms of performance evaluation criteria, stability and operating time, compared with some of the existing semi-supervised algorithms.
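
To make the idea of "must" and "cannot" pairwise constraints concrete, here is a minimal Python sketch of how such constraints could be scored against fuzzy memberships. It is not the SS-FCC cost function, which the abstract does not give; the function name `constraint_penalty`, the membership layout `u[i][c]` and the penalty form are assumptions chosen for illustration.

```python
def constraint_penalty(u, must_link, cannot_link):
    """Score how strongly fuzzy memberships violate pairwise constraints.

    u[i][c] is the membership of object i in cluster c; each row is
    assumed to sum to 1. must_link and cannot_link are lists of (i, j)
    index pairs. This penalty form is illustrative, not taken from SS-FCC.
    """
    def shared_mass(i, j):
        # Membership mass that objects i and j place in the same clusters.
        return sum(a * b for a, b in zip(u[i], u[j]))

    penalty = 0.0
    for i, j in must_link:
        # Rows sum to 1, so 1 - shared_mass is the mass split across clusters.
        penalty += 1.0 - shared_mass(i, j)
    for i, j in cannot_link:
        penalty += shared_mass(i, j)
    return penalty


# Hypothetical memberships for three objects and two clusters.
u = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
satisfied = constraint_penalty(u, must_link=[(0, 1)], cannot_link=[(0, 2)])
violated = constraint_penalty(u, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

A penalty of this kind would typically be added to (or subtracted from) the clustering objective so that memberships respecting the user-provided constraints are preferred.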

4.

Generic framework for multilingual short text categorization using convolutional neural network

Enamoto Liriam Weigang Li Filho Geraldo P. Rocha 《Multimedia Tools and Applications》2021,80(9):13475-13490

Online social media is a powerful source of information that can influence users’ decisions. Due to the huge volume of data generated by such media, many...

5.

Web document clustering using a hybrid neural network

M. Shamim Khan Sebastian W. Khor 《Applied Soft Computing》2004,4(4):423-432

The list of documents returned by Internet search engines in response to a query can be quite overwhelming. There is an increasing need to organise this information and present it in a more compact and efficient manner. This paper describes a method for the automatic clustering of World Wide Web documents, according to their relevance to the user’s information needs, by using a hybrid neural network. The objective is to reduce the time and effort the user has to spend finding the information sought. Clustering documents by features representative of their contents, in this case key words and phrases, increases the effectiveness and efficiency of the search process. It is shown that a two-dimensional visual presentation of information on retrieved documents, instead of the traditional linear listing, can create a more user-friendly interface between a search engine and the user.

6.

Supervised labeled latent Dirichlet allocation for document categorization

Ximing Li Jihong Ouyang Xiaotang Zhou You Lu Yanhui Liu 《Applied Intelligence》2015,42(3):581-593


7.

Text categorization based on combination of modified back propagation neural network and latent semantic analysis

Wei Wang Bo Yu 《Neural computing & applications》2009,18(8):875-881

This paper proposes a new text categorization model based on the combination of a modified back propagation neural network (MBPNN) and latent semantic analysis (LSA). The traditional back propagation neural network (BPNN) trains slowly and is easily trapped in a local minimum, which leads to poor performance and efficiency. We propose the MBPNN to accelerate the training of the BPNN and improve categorization accuracy. LSA addresses the problems of word-based indexing by using statistically derived conceptual indices instead of individual words. It constructs a conceptual vector space in which each term or document is represented as a vector. It not only greatly reduces the dimension but also discovers important associative relationships between terms. We test our categorization model on the 20-newsgroup corpus and the Reuters-21578 corpus. Experimental results show that the MBPNN is much faster than the traditional BPNN and also improves on its performance. Applying LSA in our system leads to dramatic dimensionality reduction while achieving good classification results.

8.

An artificial neural network based heuristic for flow shop scheduling problems

T. Radha Ramanan R. Sridharan Kulkarni Sarang Shashikant A. Noorul Haq 《Journal of Intelligent Manufacturing》2011,22(2):279-288

The objective of this paper is to find a sequence of jobs in the flow shop that minimizes makespan. A feed-forward back propagation neural network is used to solve the problem. The network is trained with the optimal sequences of completely enumerated five-, six- and seven-job, ten-machine problems, and the trained network is then used to solve problems with a greater number of jobs. The sequence obtained using the artificial neural network (ANN) is given as the initial sequence to a heuristic proposed by Suliman, and also to a genetic algorithm (GA) as one of the sequences of the population, for further improvement. The approaches are referred to as the ANN-Suliman heuristic and the ANN-GA heuristic, respectively. The makespans of the sequences obtained by these heuristics are compared with those obtained using the heuristic proposed by Nawaz, Enscore and Ham (NEH) and the Suliman heuristic initialized with the Campbell, Dudek and Smith (CDS) heuristic, called the CDS-Suliman approach. The ANN-GA and ANN-Suliman approaches are found to perform better than the NEH and CDS-Suliman heuristics for the problems considered.
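
As a rough illustration of the quantities mentioned in this abstract, the Python sketch below computes the makespan of a job sequence in a permutation flow shop and finds an optimal sequence by complete enumeration, which is feasible only for small instances. It does not reproduce the paper's neural network or heuristics; the function names and the 3-job, 2-machine instance are hypothetical.

```python
from itertools import permutations


def makespan(sequence, proc_times):
    """Completion time of the last job on the last machine.

    proc_times[j][m] is the processing time of job j on machine m;
    every job visits the machines in the same order (permutation flow shop).
    """
    n_machines = len(proc_times[0])
    finish = [0] * n_machines  # finish[m]: when machine m completes its latest job
    for job in sequence:
        prev = 0  # completion time of this job on the previous machine
        for m in range(n_machines):
            start = max(finish[m], prev)  # wait for the machine and for the job
            finish[m] = start + proc_times[job][m]
            prev = finish[m]
    return finish[-1]


def best_sequence(proc_times):
    """Optimal sequence by complete enumeration (small instances only)."""
    jobs = range(len(proc_times))
    return min(permutations(jobs), key=lambda seq: makespan(seq, proc_times))


# Hypothetical instance: 3 jobs, 2 machines.
times = [[3, 2], [1, 4], [2, 1]]
optimal = best_sequence(times)
optimal_makespan = makespan(optimal, times)
```

Enumeration grows factorially with the number of jobs, which is consistent with the abstract's use of enumerated small problems as training data and a learned model for larger ones.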

9.

Minimizer of the Reconstruction Error for multi-class document categorization

《Expert systems with applications》2014,41(3):861-868

In the present article we introduce and validate an approach for single-label multi-class document categorization based on text content features. The approach uses the statistical property of Principal Component Analysis that it minimizes the reconstruction error of the training documents used to compute a low-rank category transformation matrix. Such a matrix transforms the original set of training documents from a given category to a new low-rank space and then optimally reconstructs them in the original space with minimum reconstruction error. The proposed method, called the Minimizer of the Reconstruction Error (mRE) classifier, uses this property, and extends and applies it to new unseen test documents. Several experiments on four multi-class datasets for text categorization are conducted to test the stable and generally better performance of the proposed approach in comparison with other popular classification methods.

10.

Artificial neural network based robot control: An overview (total citations: 3; self-citations: 0; citations by others: 3)

Sameer M. Prabhu Devendra P. Garg 《Journal of Intelligent and Robotic Systems》1996,15(4):333-365

The current thrust of research in robotics is to build robots which can operate in dynamic and/or partially known environments. The ability to learn endows the robot with a form of autonomous intelligence to handle such situations. This paper focuses on the intersection of the fields of robot control and learning methods as represented by artificial neural networks. An in-depth overview of the application of neural networks to the problem of robot control is presented. Some typical neural network architectures are discussed first. The important issues involved in the study of robotics are then highlighted. This paper concentrates on neural network applications to the motion control of robots involved in both non-contact and contact tasks. The current state of research in this area is surveyed and the strengths and weaknesses of the present approaches are emphasized. The paper concludes by identifying areas which need future research work.

11.

Automatic textual document categorization based on generalized instance sets and a metamodel (total citations: 5; self-citations: 0; citations by others: 5)

Wai Lam Yiqiu Han 《IEEE transactions on pattern analysis and machine intelligence》2003,25(5):628-633

We propose a new approach to text categorization known as generalized instance set (GIS) algorithm under the framework of generalized instance patterns. Our GIS algorithm unifies the strengths of k-NN and linear classifiers and adapts to characteristics of text categorization problems. It focuses on refining the original instances and constructs a set of generalized instances. We also propose a metamodel framework based on category feature characteristics. It has a metalearning phase which discovers a relationship between category feature characteristics and each component algorithm. Extensive experiments have been conducted on two large-scale document corpora for both GIS and the metamodel. The results demonstrate that both approaches generally achieve promising text categorization performance.

12.

A fast text classifier based on a self-feedback Hopfield network

Huang Bo Chen Huaixi Ma Peiyu 《计算机工程与设计 (Computer Engineering and Design)》2009,30(11)

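Based on the characteristics of large-scale Chinese text classification, a fast orthogonal text encoding method based on selecting the maximum feature values is proposed, and a Hopfield neural network model with a fast convergence rate is constructed. Using a neurodynamic approach, the stability of the self-feedback Hopfield network structure is analyzed. A KNN re-prediction mechanism is introduced into the Hopfield network so that samples rejected after entering spurious states can effectively escape those states. Experimental results show that the method performs well when applied to large-scale Chinese text classification.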

13.

An automatic patent classification method based on a BP neural network

Li Shengzhen Wang Jianxin Qi Jiandong Zhu Lijun 《计算机工程与设计 (Computer Engineering and Design)》2010,31(23)

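An automatic patent classification method based on a back propagation neural network is proposed. Feature terms are extracted from the patent document collection through Chinese word segmentation and weighted by their frequency of occurrence in each patent document, so that every patent document is represented as a vector of feature terms. To train the BP neural network (BPN) effectively, the χ² statistic is used to reduce the dimensionality of the feature vectors, and a BPN patent classifier is then used to classify the patent documents. Tests on patent documents under international classification code H02 gave good classification results.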
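
The χ² feature selection step described in this abstract can be sketched as follows in Python. This uses the standard χ² statistic on a 2×2 term/category contingency table; whether the paper uses exactly this variant is an assumption, and the function names, the toy tokens and the second category label `F04` are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for one term and one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category without the term
    d: documents outside the category without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or category is degenerate; no evidence either way
    return n * (a * d - c * b) ** 2 / denom


def chi_square_scores(docs, labels, category):
    """Score every term in the vocabulary against one category."""
    vocab = {term for doc in docs for term in doc}
    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            has_term = term in doc
            in_cat = label == category
            if has_term and in_cat:
                a += 1
            elif has_term:
                b += 1
            elif in_cat:
                c += 1
            else:
                d += 1
        scores[term] = chi_square(a, b, c, d)
    return scores


def select_top_terms(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical, already-segmented toy documents.
docs = [["motor", "coil"], ["motor", "rotor"], ["pump", "valve"], ["pump", "coil"]]
labels = ["H02", "H02", "F04", "F04"]
scores = chi_square_scores(docs, labels, "H02")
kept = select_top_terms(scores, 2)
```

Only the kept terms would then be used as input dimensions for the classifier, which is how the dimensionality reduction shortens the feature vectors.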

14.

An end-to-end neural network for detecting hidden people in images based on multiple attention network

Hendaoui Rabeb Nabiyev Vasif 《Multimedia Tools and Applications》2022,81(13):18531-18542

Camouflaged people, such as soldiers on the battlefield, and camouflaged objects in natural environments are hard to detect because of the strong resemblance between the hidden target and the background, which makes seeing them a challenging task. Due to the nature of hidden objects, identifying them requires a significant level of visual perception. To overcome this problem, we present a new end-to-end framework based on a multi-level attention network. We design a novel inception module to extract features with multi-scale receptive fields, aiming to enhance feature representation. Furthermore, we use a dense feature pyramid to take advantage of multi-scale semantic features. Finally, to better locate the camouflaged target and distinguish it from the background, we develop a multi-attention module that generates a more discriminative feature representation and combines semantic information with spatial information from different levels. Experiments on the camouflaged people dataset show that our approach outperformed all state-of-the-art methods.


15.

Proximity-based k-partitions clustering with ranking for document categorization and analysis

《Expert systems with applications》2014,41(16):7095-7105

As one of the most fundamental yet important methods of data clustering, the center-based partitioning approach clusters the dataset into k subsets, each of which is represented by a centroid or medoid. In this paper, we propose a new medoid-based k-partitions approach called Clustering Around Weighted Prototypes (CAWP), which works with a similarity matrix. In CAWP, each cluster is characterized by multiple objects with different representative weights. With this new cluster representation scheme, CAWP aims to simultaneously produce clusters of improved quality and a set of ranked representative objects for each cluster. An efficient algorithm is derived to alternately update the clusters and the representative weights of objects with respect to each cluster. An annealing-like optimization procedure is incorporated to alleviate the local optimum problem for better clustering results and, at the same time, to make the algorithm less sensitive to parameter settings. Experimental results on benchmark document datasets show that CAWP achieves favorable effectiveness and efficiency in clustering, and also provides useful information for cluster-specific analysis.

16.

An efficient document classification model using an improved back propagation neural network and singular value decomposition

Cheng Hua Li Soon Choel Park 《Expert systems with applications》2009,36(2):3208-3215

This paper proposes an improved back propagation neural network and uses an efficient method to reduce the dimension and improve performance. The traditional back propagation neural network (BPNN) learns slowly and is easily trapped in a local minimum, which leads to poor performance and efficiency. We propose the learning phase evaluation back propagation neural network (LPEBP) to improve the traditional BPNN. We adopt a singular value decomposition (SVD) technique to reduce the dimension and construct the latent semantics between terms. Experimental results show that the LPEBP is much faster than the traditional BPNN and also improves on its performance. The SVD technique can not only greatly reduce the high dimensionality but also enhance performance, so SVD further improves the precision and efficiency of document classification systems.

17.

A recurrent neural network based deep learning model for text and non-text stroke classification in online handwritten Devanagari document

Ghosh Rajib 《Multimedia Tools and Applications》2022,81(17):24245-24263


18.

An optimizing BP neural network algorithm based on genetic algorithm (total citations: 4; self-citations: 0; citations by others: 4)

Shifei Ding Chunyang Su Junzhao Yu 《Artificial Intelligence Review》2011,36(2):153-162

A back-propagation (BP) neural network has good self-learning, self-adapting and generalization ability, but it may easily get stuck in a local minimum, and has a poor rate of convergence. Therefore, a method to optimize a BP algorithm based on a genetic algorithm (GA) is proposed to speed the training of BP, and to overcome BP’s disadvantage of being easily stuck in a local minimum. The UCI data set is used here for experimental analysis and the experimental result shows that, compared with the BP algorithm and a method that only uses GA to learn the connection weights, our method that combines GA and BP to train the neural network works better; is less easily stuck in a local minimum; the trained network has a better generalization ability; and it has a good stabilization performance.

19.

An orthogonal neural network for function approximation (total citations: 6; self-citations: 0; citations by others: 6)

Shiow-Shung Yang Ching-Shiow Tseng 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》1996,26(5):779-785

This paper presents a new single-layer neural network which is based on orthogonal functions. This neural network is developed to avoid the problems of traditional feedforward neural networks such as the determination of initial weights and the numbers of layers and processing elements. The desired output accuracy determines the required number of processing elements. Because weights are unique, the training of the neural network converges rapidly. An experiment in approximating typical continuous and discrete functions is given. The results show that the neural network has excellent performance in convergence time and approximation error.
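
The claim that weights are unique when the processing elements are orthogonal functions can be illustrated with a small Python sketch. It assumes Chebyshev polynomials as the orthogonal basis and computes the weights directly by projection at Chebyshev nodes rather than by iterative training; the abstract does not say which basis or training rule the paper uses, so the basis, the function names and the sample count are all assumptions.

```python
import math


def chebyshev_weights(f, n_units, n_samples=64):
    """Project f onto T_0 .. T_{n_units-1} using Chebyshev nodes on [-1, 1].

    Each weight is an inner product of f with one basis function, so it
    does not depend on the other weights or on any initial guess.
    Requires n_units <= n_samples for discrete orthogonality to hold.
    """
    thetas = [math.pi * (j + 0.5) / n_samples for j in range(n_samples)]
    weights = []
    for k in range(n_units):
        # T_k(cos(theta)) = cos(k * theta)
        s = sum(f(math.cos(t)) * math.cos(k * t) for t in thetas)
        scale = 1.0 if k == 0 else 2.0
        weights.append(scale * s / n_samples)
    return weights


def evaluate(weights, x):
    """Output of the single-layer model at x in [-1, 1]."""
    theta = math.acos(x)
    return sum(w * math.cos(k * theta) for k, w in enumerate(weights))


# x^2 = 0.5 * T_0(x) + 0.5 * T_2(x), so three units reproduce it exactly.
w = chebyshev_weights(lambda x: x * x, n_units=3)
approx = evaluate(w, 0.3)
```

Adding more units only appends new weights without changing the earlier ones, which matches the abstract's statement that the desired accuracy determines the number of processing elements.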

20.

An adiabatic neural network for RBF approximation

B. Truyen N. Langloh J. Cornelis 《Neural computing & applications》1994,2(2):69-88

Numerous studies have addressed nonlinear functional approximation by multilayer perceptrons (MLPs) and RBF networks as a special case of the more general mapping problem. The performance of both these supervised network models intimately depends on the efficiency of their learning process. This paper presents an unsupervised recurrent neural network, based on the recurrent Mean Field Theory (MFT) network model, that finds a least-squares approximation to an arbitrary L₂ function, given a set of Gaussian radially symmetric basis functions (RBFs). Essential is the reformulation of RBF approximation as a problem of constrained optimisation. A new concept of adiabatic network organisation is introduced. Together with an adaptive mechanism of temperature control this allows the network to build a hierarchical multiresolution approximation with preservation of the global optimisation characteristics. A revised problem mapping results in a position invariant local interconnectivity pattern, which makes the network attractive for electronic implementation. The dynamics and performance of the network are illustrated by numerical simulation.