Combining neural networks and semantic feature space for email classification
Authors: Bo Yu, Dong-hua Zhu
Affiliation:1. College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China;2. School of Computer Science, Liaocheng University, Liaocheng 252400, China;3. Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China;1. Department of Industrial and Management Systems Engineering, Kyung Hee University, Yongin, Gyeonggi 446-701, Republic of Korea;2. Department of Industrial Engineering, Seoul National University, Seoul 151-744, Republic of Korea;1. School of Management, Xi’an Jiaotong University, PR China, 28 Xianning West Road, Xi’an, Shaanxi, PR China;2. Department of Industrial and Manufacturing Systems Engineering, The University of Hong Kong, Hong Kong SAR, PR China;1. Department of Pediatric Surgery, Konya Education and Research Hospital, Konya, Turkey;2. Division of Pediatric Urology, Department of Pediatric Surgery, Medical Faculty, Sutcu Imam University, Kahramanmaras, Turkey
Abstract: Email is one of the most ubiquitous applications, used daily by millions of people worldwide. Individuals and organizations increasingly rely on email to communicate and to share information and knowledge. However, the growth in email users has brought a dramatic increase in spam over the past few years, and processing and managing email efficiently has become a major challenge for both individuals and organizations. This paper proposes new email classification models that use a linear neural network trained by the perceptron learning algorithm and a nonlinear neural network trained by the back-propagation learning algorithm. An efficient semantic feature space (SFS) method is introduced into these classification models. The traditional back-propagation neural network (BPNN) learns slowly and is prone to becoming trapped in local minima, so a modified back-propagation neural network (MBPNN) is presented to overcome these limitations. Email classification systems based on the vector space model suffer from a large number of features and from ambiguity in the meaning of terms, which leads to a sparse and noisy feature space. We therefore use the SFS to convert the original sparse and noisy feature space into a semantically richer one, which helps to accelerate learning. Experiments are conducted with different training set sizes and extracted feature sizes. The results show that the models using MBPNN outperform the traditional BPNN, and that the use of SFS can greatly reduce feature dimensionality and improve email classification performance.
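The pipeline outlined in the abstract (reduce a sparse term space to a denser semantic space, then train a linear classifier with the perceptron rule) can be illustrated with a minimal sketch. The abstract does not say how the SFS is constructed, so the truncated-SVD projection below is an assumed stand-in in the style of latent semantic analysis, not the authors' method. The toy term counts, the vocabulary, and the function names `semantic_projection`, `train_perceptron`, and `predict` are likewise hypothetical.

```python
import numpy as np

def semantic_projection(X, k):
    """Project documents (rows of X) onto the top-k right singular vectors.

    Assumption: an LSA-style truncated SVD stands in for the paper's SFS.
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:k].T          # shape: (n_terms, k)
    return X @ basis, basis   # reduced documents, and the basis for new emails

def train_perceptron(Z, y, epochs=100, lr=1.0):
    """Classic perceptron learning rule; labels y must be in {-1, +1}."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for z, t in zip(Z, y):
            if t * (z @ w + b) <= 0:   # misclassified (or on the boundary)
                w += lr * t * z
                b += lr * t
                errors += 1
        if errors == 0:                # converged on the training set
            break
    return w, b

def predict(Z, w, b):
    return np.where(Z @ w + b > 0, 1, -1)

# Hypothetical toy term counts; columns stand for the terms
# ["free", "winner", "prize", "meeting", "report", "agenda"].
X = np.array([
    [2, 1, 1, 0, 0, 0],   # spam
    [1, 2, 0, 0, 0, 0],   # spam
    [1, 0, 2, 0, 0, 0],   # spam
    [0, 0, 0, 3, 1, 1],   # legitimate
    [0, 0, 0, 1, 2, 0],   # legitimate
    [0, 0, 0, 1, 0, 2],   # legitimate
], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])   # +1 = spam, -1 = legitimate

Z, basis = semantic_projection(X, k=2)   # 6 terms reduced to 2 features
w, b = train_perceptron(Z, y)
pred = predict(Z, w, b)

# A new email is mapped into the same reduced space before classification.
new_email = np.array([[1, 1, 0, 0, 0, 0]], dtype=float)
new_pred = predict(new_email @ basis, w, b)
```

The sketch covers only the linear model; a nonlinear network trained by back-propagation would replace `train_perceptron` while reusing the same projected features.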
Keywords:
This article is indexed in ScienceDirect and other databases.