
1.

Twitter spam account detection based on clustering and classification methods

Adewole Kayode Sakariyah Han Tao Wu Wanqing Song Houbing Sangaiah Arun Kumar 《The Journal of supercomputing》2020,76(7):4802-4837

The Journal of Supercomputing - Twitter social network has gained more popularity due to the increase in social activities of registered users. Twitter performs dual functions of online social...

2.

Symbiotic filtering for spam email detection

Clotilde Lopes Paulo Cortez Pedro Sousa Miguel Rocha Miguel Rio 《Expert systems with applications》2011,38(8):9365-9372

This paper presents a novel spam filtering technique called Symbiotic Filtering (SF) that aggregates distinct local filters from several users to improve the overall performance of spam detection. SF is a hybrid approach combining some features from both Collaborative Filtering (CF) and Content-Based Filtering (CBF). It allows for the use of social networks to personalize and tailor the set of filters that serve as input to the filtering. A comparison is performed against the commonly used Naive Bayes CBF algorithm. Several experiments were held with the well-known Enron data, under both fixed and incremental symbiotic groups. We show that our system is competitive in performance and is robust against both dictionary and focused contamination attacks. Moreover, it can be implemented and deployed with little effort and low communication costs, while assuring privacy.

3.

Distributed classification for image spam detection

Amiza Amir Bala Srinivasan Asad I. Khan 《Multimedia Tools and Applications》2018,77(11):13249-13278

Spam appears in various forms, and the current trend in spamming is moving towards multimedia spam objects. Image spam is a new type of spam attack that attempts to bypass spam filters, which are mostly text-based. Spamming attacks users in many ways, and these attacks are usually countered by having a server filter the spammers. This paper provides a fully distributed pattern recognition system within P2P networks using the distributed associative memory tree (DASMET) algorithm to detect spam. The system is cost-efficient and not prone to a single point of failure, unlike server-based systems. The algorithm is scalable for large and frequently updated data sets, and is specifically designed for data sets that consist of similar occurring patterns. We have evaluated our system against centralised state-of-the-art algorithms (NN, k-NN, naive Bayes, BPNN and RBFN) and distributed P2P-based algorithms (Ivote-DPV, ensemble k-NN, ensemble naive Bayes, and P2P-GN). The experimental results show that our method is highly accurate, with a 98 to 99% accuracy rate, and incurs a small number of messages; in the best case, it requires only two messages per recall test. In summary, our experimental results show that DASMET performs best with a relatively small amount of resources for spam detection compared to other distributed methods.

4.

Image spam analysis and detection

Annapurna Annadatha Mark Stamp 《Journal in Computer Virology》2018,14(1):39-52

Image spam is unsolicited bulk email, where the message is embedded in an image. Spammers use such images to evade text-based filters. In this research, we analyze and compare two methods for detecting spam images. First, we consider principal component analysis (PCA), where we determine eigenvectors corresponding to a set of spam images and compute scores by projecting images onto the resulting eigenspace. The second approach focuses on the extraction of a broad set of image features and selection of an optimal subset using support vector machines (SVM). Both of these detection strategies provide high accuracy with low computational complexity. Further, we develop a new spam image dataset that cannot be detected using our PCA or SVM approach. This new dataset should prove valuable for improving image spam detection capabilities.

5.

Graph regularization methods for Web spam detection (cited by: 1 in total; self-citations: 0; citations by others: 1)

Jacob Abernethy Olivier Chapelle Carlos Castillo 《Machine Learning》2010,81(2):207-225

We present an algorithm, witch, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.

6.

Genetic optimized artificial immune system in spam detection: a review and a model

Raed Abu Zitar Adel Hamdan 《Artificial Intelligence Review》2013,40(3):305-377

Spam is a serious universal problem that affects almost all computer users. It affects not only ordinary internet users but also companies and organizations, since it costs a huge amount of money in lost productivity and wastes users’ time and network bandwidth. Many studies on spam indicate that spam costs organizations billions of dollars yearly. This work presents a machine learning method inspired by the human immune system, called the Artificial Immune System (AIS), which is an emerging method that still needs further exploration. Core modifications were applied to the standard AIS with the aid of the Genetic Algorithm. An Artificial Neural Network for spam detection is also applied in a new manner. The SpamAssassin corpus is used in all our simulations.

7.

Neighborhood-consensus message passing as a framework for generalized iterated conditional expectations

Tijana Ružić Wilfried Philips 《Pattern recognition letters》2012,33(3):309-318

In this paper we propose a novel inference method for maximum a posteriori estimation with Markov random field prior. The central idea is to integrate a kind of joint “voting” of neighboring labels into a message passing scheme similar to loopy belief propagation (LBP). While the LBP operates with many pairwise interactions, we formulate “messages” sent from a neighborhood as a whole. Hence the name neighborhood-consensus message passing (NCMP). The practical algorithm is much simpler than LBP and combines the flexibility of iterated conditional modes (ICM) with some ideas of more general message passing. The proposed method is also a generalization of the iterated conditional expectations algorithm (ICE): we revisit ICE and redefine it in a message passing framework in a more general form. We also develop a simplified version of NCMP, called weighted iterated conditional modes (WICM), that is suitable for large neighborhoods. We verify the potentials of our methods on four different benchmarks, showing the improvement in quality and/or speed over related inference techniques.

8.

Towards a verification framework for faulty message passing systems in PVS

Concetta Pilotto Jerome White 《Innovations in Systems and Software Engineering》2011,7(2):109-118

We present a library of PVS meta-theories that can be used to verify a class of distributed systems in which agent communication is via message-passing. The theoretical work, as outlined in Chandy et al. (Form Aspect Comput 2011, to appear) consists of iterative schemes for solving systems of linear equations, such as message-passing extensions of the Gauss and Gauss-Seidel methods. We briefly review that work and discuss the challenges in formally verifying it.
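For orientation, the classical sequential Gauss-Seidel iteration that the abstract refers to can be sketched as follows. This is only the textbook method, not the message-passing extension or the PVS theories described in the paper; the function name, the example system, and the fixed iteration count are assumptions made for illustration.

```python
def gauss_seidel(a, b, iterations=50):
    """Approximate the solution of a*x = b with plain Gauss-Seidel sweeps.

    Assumes a is square with a nonzero diagonal; convergence is guaranteed
    for strictly diagonally dominant matrices such as the example below.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Use the newest available values of the other unknowns.
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / a[i][i]
    return x

# Hypothetical example system: 4x + y = 9, 2x + 3y = 13
a = [[4.0, 1.0], [2.0, 3.0]]
b = [9.0, 13.0]
x = gauss_seidel(a, b)
print(x)
```

Each unknown is updated in place, so later rows in a sweep already see the values computed earlier in the same sweep; this is what distinguishes Gauss-Seidel from the Jacobi scheme.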

9.

Using GMDH-based networks for improved spam detection and email feature analysis

El-Sayed M. El-Alfy Radwan E. Abdel-Aal 《Applied Soft Computing》2011,11(1):477-488

Unsolicited or spam email has recently become a major threat that can negatively impact the usability of electronic mail. Spam substantially wastes time and money for business users and network administrators, consumes network bandwidth and storage space, and slows down email servers. In addition, it provides a medium for distributing harmful code and/or offensive content. In this paper, we explore the application of the GMDH (Group Method of Data Handling) based inductive learning approach in detecting spam messages by automatically identifying content features that effectively distinguish spam from legitimate emails. We study the performance for various network model complexities using spambase, a publicly available benchmark dataset. Results reveal that classification accuracies of 91.7% can be achieved using only 10 out of the available 57 attributes, selected through abductive learning as the most effective feature subset (i.e. 82.5% data reduction). We also show how to improve classification performance using abductive network ensembles (committees) trained on different subsets of the training data. Comparison with other techniques such as neural networks and naïve Bayesian classifiers shows that the GMDH-based learning approach can provide better spam detection accuracy with false-positive rates as low as 4.3% and yet requires shorter training time.
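The 82.5% data reduction quoted in the abstract follows directly from keeping 10 of 57 attributes. A minimal check, with a helper name chosen here for illustration:

```python
def data_reduction(selected, total):
    """Fraction of attributes discarded when `selected` of `total` are kept."""
    return 1 - selected / total

# Figures from the abstract: 10 attributes kept out of 57
reduction = data_reduction(10, 57)
print(f"{reduction:.1%}")  # prints 82.5%
```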

10.

SPF: Sunblock for spam

Kennedy Steve 《ITNOW》2005,47(5):22


11.

Link-based web spam detection using weight properties

Kwang Leng Goh Ravi Kumar Patchmuthu Ashutosh Kumar Singh 《Journal of Intelligent Information Systems》2014,43(1):129-145

Link spam is created with the intention of boosting a target’s rank in exchange for business profit. This unethical way of deceiving Web search engines is known as Web spam. Many anti-link-spam detection techniques have since been proposed. Web spam detection is a crucial task because of the damage Web spam does to Web search engines and its global cost of billions of dollars annually. In this paper, we propose a novel technique that incorporates weight properties to enhance Web spam detection algorithms. Weight properties can be defined as the influence of one Web node on another Web node. We modified existing Web spam detection algorithms with our technique and evaluated their performance on a large public Web spam dataset – WEBSPAM-UK2007. The overall results show that the modified algorithms outperform the benchmark algorithms by up to 30.5% at host level and 6.11% at page level.

12.

PADS: a probabilistic activity detection framework for video data

Albanese M Chellappa R Cuntoor N Moscato V Picariello A Subrahmanian VS Udrea O 《IEEE transactions on pattern analysis and machine intelligence》2010,32(12):2246-2261


13.

Traditional and context-specific spam detection in low resource settings

Kawintiranon Kornraphop Singh Lisa Budak Ceren 《Machine Learning》2022,111(7):2515-2536

Machine Learning - Social media data has a mix of high and low-quality content. One form of commonly studied low-quality content is spam. Most studies assume that spam is context-neutral. We show...

14.

A generic statistical approach for spam detection in Online Social Networks

Faraz Ahmed Muhammad Abulaish 《Computer Communications》2013,36(10-11):1120-1129

In this paper, we present a generic statistical approach to identify spam profiles on Online Social Networks (OSNs). Our study is based on real datasets containing both normal and spam profiles crawled from the Facebook and Twitter networks. We have identified a set of 14 generic statistical features to identify spam profiles. The identified features are common to both the Facebook and Twitter networks. For the classification task, we have used three different classification algorithms – naïve Bayes, Jrip, and J48 – and evaluated them on both individual and combined datasets to establish the discriminative property of the identified features. The results obtained on a combined dataset have a detection rate (DR) of 0.957 and a false positive rate (FPR) of 0.048, whereas on the Facebook dataset the DR and FPR values are 0.964 and 0.089, respectively, and on the Twitter dataset they are 0.976 and 0.075, respectively. We have also analyzed the contribution of each individual feature towards the detection accuracy of spam profiles. Thereafter, we have considered the 7 most discriminative features and proposed a clustering-based approach to identify spam campaigns on the Facebook and Twitter networks.
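The two metrics reported in the abstract are conventionally computed from confusion-matrix counts, with DR as the true positive rate. The sketch below assumes those standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
def detection_metrics(tp, fn, fp, tn):
    """Return (detection rate, false positive rate) from confusion counts."""
    dr = tp / (tp + fn)   # share of spam profiles correctly flagged
    fpr = fp / (fp + tn)  # share of normal profiles wrongly flagged
    return dr, fpr

# Hypothetical counts for illustration only
dr, fpr = detection_metrics(tp=90, fn=10, fp=5, tn=95)
print(f"DR={dr:.3f} FPR={fpr:.3f}")
```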

15.

DeepAM: a heterogeneous deep learning framework for intelligent malware detection

Yanfang Ye Lingwei Chen Shifu Hou William Hardy Xin Li 《Knowledge and Information Systems》2018,54(2):265-285

With computers and the Internet being essential in everyday life, malware poses serious and evolving threats to their security, making the detection of malware of utmost concern. Accordingly, there has been much research on intelligent malware detection by applying data mining and machine learning techniques. Though great results have been achieved with these methods, most of them are built on shallow learning architectures. Due to its superior ability in feature learning through multilayer deep architecture, deep learning is starting to be leveraged in industrial and academic research for different applications. In this paper, based on the Windows application programming interface calls extracted from the portable executable files, we study how a deep learning architecture can be designed for intelligent malware detection. We propose a heterogeneous deep learning framework composed of an AutoEncoder stacked up with multilayer restricted Boltzmann machines and a layer of associative memory to detect new, unknown malware. The proposed deep learning model performs a greedy layer-wise training operation for unsupervised feature learning, followed by supervised parameter fine-tuning. Unlike existing works, which only made use of files with class labels (either malicious or benign) during the training phase, we utilize both labeled and unlabeled file samples to pre-train multiple layers in the heterogeneous deep learning framework from the bottom up for feature learning. A comprehensive experimental study on a real and large file collection from Comodo Cloud Security Center is performed to compare various malware detection approaches. Promising experimental results demonstrate that our proposed deep learning framework can further improve the overall performance in malware detection compared with traditional shallow learning methods, deep learning methods with homogeneous framework, and other existing anti-malware scanners. The proposed heterogeneous deep learning framework can also be readily applied to other malware detection tasks.

16.

On spam detection based on cognitive pattern recognition

PI You-guo LIANG Tian-cai YUE Rong 《Journal of Communication and Computer (通讯和计算机)》2008,5(5):28-31

A method of spam detection based on cognitive pattern recognition is proposed. The connection between email category and the cognition of email user interests in life and work is analyzed. Under the guidance of cognitive pattern recognition theory, the mechanism of spam detection, based on intelligent cognition of email user interests in life and work, is discussed. The spam detection algorithm and its concrete implementation are then given. Experimental results demonstrate that the spam detection algorithm has good learning ability and scalability and achieves high recognition accuracy.

17.

HEMD: a highly efficient random forest-based malware detection framework for Android

Zhu Hui-Juan Jiang Tong-Hai Ma Bo You Zhu-Hong Shi Wei-Lei Cheng Li 《Neural computing & applications》2018,30(11):3353-3361

Mobile phones are rapidly becoming the most widespread and popular form of communication; thus, they are also the most important attack target of malware. The amount of malware on mobile phones is increasing exponentially and poses a serious security threat. Google’s Android is the most popular smartphone platform in the world, and its permission-declaration access control mechanisms cannot identify malware. In this paper, we propose an ensemble machine learning system for the detection of malware on Android devices. More specifically, four groups of features, including permissions, monitoring system events, sensitive API and permission rate, are extracted to characterize each Android application (app). Then an ensemble random forest classifier is learned to detect whether an app is potentially malicious or not. The performance of our proposed method is evaluated on the actual data set using tenfold cross-validation. The experimental results demonstrate that the proposed method can achieve a high accuracy of 89.91%. To further assess the performance of our method, we compared it with the state-of-the-art support vector machine classifier. Comparison results demonstrate that the proposed method is extremely promising and could provide a cost-effective alternative for Android malware detection.


18.

A three-layer back-propagation neural network for spam detection using artificial immune concentration

Guangchen Ruan Ying Tan 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2010,14(2):139-150

In this paper, a three-layer back-propagation neural network (BPNN) is employed for spam detection by using a concentration based feature construction (CFC) approach. In the CFC approach, ‘self’ and ‘non-self’ concentrations are constructed through ‘self’ and ‘non-self’ gene libraries, respectively, to form a two-element concentration vector for expressing the e-mail efficiently. A three-layer BPNN with two-element input is then employed to classify e-mails automatically. Comprehensive experiments are conducted on two public benchmark corpora, PU1 and Ling, to demonstrate that the proposed CFC approach based BPNN classifier is not only very fast but also achieves 97% and 99% classification accuracy on PU1 and Ling, respectively, using just a two-element concentration feature vector.
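One plausible reading of the two-element concentration vector is sketched below. It assumes that each gene library is a set of words and that a concentration is the fraction of an e-mail's distinct words found in that library; the abstract does not give the exact definition, so the function name, the tokenization, and the example libraries are all hypothetical.

```python
def concentration_vector(email_text, self_genes, nonself_genes):
    """Return (self concentration, non-self concentration) for one e-mail.

    Assumption: concentration = matched distinct words / all distinct words.
    """
    words = set(email_text.lower().split())
    if not words:
        return (0.0, 0.0)
    self_conc = len(words & self_genes) / len(words)
    nonself_conc = len(words & nonself_genes) / len(words)
    return (self_conc, nonself_conc)

# Hypothetical gene libraries: 'self' taken here as words typical of legitimate mail
self_genes = {"meeting", "report", "schedule"}
nonself_genes = {"free", "winner", "prize"}

vec = concentration_vector("Free prize for the winner of the report",
                           self_genes, nonself_genes)
print(vec)
```

Under this reading, the resulting pair of numbers would be the entire input to the classifier, which is consistent with the abstract's emphasis on speed.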

19.

Web spam detection method based on fitted feature distributions

Liu Yang (刘阳) Zhang Huaxiang (张化祥) 《Computer Engineering and Design (计算机工程与设计)》2013,34(8)

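To detect spam web pages effectively, the distributions of page content features and link features were analyzed. The analysis found that the features of normal pages follow regular distributions, whereas those of spam pages are scattered. Based on this difference, a method is proposed that fits the feature distributions of normal pages with distribution functions, computes the difference between the ratio of normal to spam pages and the distribution function, and uses this difference as a threshold with a C4.5 decision tree to detect spam pages. Experimental results show that the method effectively reduces the number of misclassified normal pages and improves accuracy.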

20.

P2P-based collaborative spam email detection system (cited by: 1 in total; self-citations: 0; citations by others: 1)

Qiu Mingming (邱明明) Wu Guoxin (吴国新) 《Computer Engineering and Design (计算机工程与设计)》2007,28(11):2559-2562,2596

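Many email users are likely to share the same opinion on whether a given message is spam. Exploiting this property, a design for a P2P-based collaborative spam detection system is proposed. The system relies on message digest techniques, which protect user privacy while providing good resistance to attacks. A P2P network with super nodes is built at the mail server layer to share spam information in a distributed manner. At the user layer, a multi-agent architecture is designed: it uses the mail server's preliminary analysis of received mail together with a personalized Bayesian agent to provide a degree of personalized spam detection.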
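The privacy idea behind digest-based sharing can be sketched as follows: peers exchange only a hash of a message, never its text. The abstract does not name a digest algorithm, so SHA-256, the whitespace and case normalization, and the in-memory set standing in for the shared P2P store are all assumptions made for illustration.

```python
import hashlib

def email_digest(body):
    """Hash a normalized message body so peers can compare mail without sharing text."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Stand-in for the digests that peers have reported as spam (hypothetical)
reported = set()
reported.add(email_digest("Win a FREE prize   now!"))

is_known = email_digest("win a free prize now!") in reported
is_unknown = email_digest("Lunch at noon?") in reported
print(is_known, is_unknown)
```

An exact cryptographic hash changes completely when a spammer alters a single word, so a system that needs resistance to such variations would likely use a similarity-preserving digest instead; that choice is outside what the abstract specifies.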