
1.

Malicious sequential pattern mining for automatic malware detection

《Expert systems with applications》2016

Because of the damage it does to Internet security, malware (e.g., viruses, worms, trojans) and its detection have held the attention of both the anti-malware industry and researchers for decades. To protect legitimate users from attacks, the most significant line of defense against malware is anti-malware software, which mainly uses signature-based detection. However, this method fails to recognize new, unseen malicious executables. To solve this problem, we propose an effective sequence mining algorithm that discovers malicious sequential patterns from the instruction sequences extracted from the file sample set, and then construct an All-Nearest-Neighbor (ANN) classifier for malware detection based on the discovered patterns. The resulting data mining framework, composed of the proposed sequential pattern mining method and the ANN classifier, characterizes the malicious patterns in the collected file sample set well enough to detect new, unseen malware samples effectively. A comprehensive experimental study on a real data collection is performed to evaluate the detection framework. The experimental results show that the framework outperforms alternative data mining based detection methods in identifying new malicious executables.

2.

Distributed and scalable sequential pattern mining through stream processing

Chun-Chieh Chen Hong-Han Shuai Ming-Syan Chen 《Knowledge and Information Systems》2017,53(2):365-390

Scalability is a primary issue in existing sequential pattern mining algorithms for dealing with a large amount of data. Previous work, namely sequential pattern mining on the cloud (SPAMC), has already addressed the scalability problem. It supports the MapReduce cloud computing architecture for mining frequent sequential patterns on large datasets. However, this existing algorithm does not address the iterative mining problem, in which reloading data incurs additional costs. Furthermore, it does not study the load balancing problem. To remedy these problems, we devised a powerful sequential pattern mining algorithm, the sequential pattern mining in the cloud-uniform distributed lexical sequence tree algorithm (SPAMC-UDLT), exploiting MapReduce and streaming processes. SPAMC-UDLT dramatically improves overall performance without launching multiple MapReduce rounds and provides perfect load balancing across machines in the cloud. The results show that SPAMC-UDLT can significantly reduce execution time, achieves extremely high scalability, and provides much better load balancing than existing algorithms in the cloud.

3.

NetNMSP: Nonoverlapping maximal sequential pattern mining

Li Yan Zhang Shuai Guo Lei Liu Jing Wu Youxi Wu Xindong 《Applied Intelligence》2022,52(9):9861-9884

Applied Intelligence - Nonoverlapping sequential pattern mining, as a kind of repetitive sequential pattern mining with gap constraints, can find more valuable patterns. Traditional algorithms...

4.

Constraint-based sequential pattern mining: the pattern-growth methods (total citations: 4; self-citations: 0; citations by others: 4)

Jian Pei Jiawei Han Wei Wang 《Journal of Intelligent Information Systems》2007,28(2):133-160

Constraints are essential for many sequential pattern mining applications. However, there is no systematic study on constraint-based sequential pattern mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-pattern mining does not fit our mission well. An extended framework is developed based on a sequential pattern growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into the sequential pattern mining under this new framework. Moreover, this framework can be extended to constraint-based structured pattern mining as well. This research is supported in part by NSERC Grant 312194-05, NSF Grants IIS-0308001, IIS-0513678, BDI-0515813 and National Science Foundation of China (NSFC) grants No. 60303008 and 69933010. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

5.

Simulation of sequential data: An enhanced reinforcement learning approach

Marlies Vanhulsel Davy Janssens Geert Wets Koen Vanhoof 《Expert systems with applications》2009,36(4):8032-8039

The present study aims to contribute to the current state of the art of activity-based travel demand modelling by presenting a framework to simulate sequential data. To this end, the suitability of a reinforcement learning approach to reproduce sequential data is explored. Traditional reinforcement learning techniques cannot learn efficiently in large state and action spaces with respect to memory and computational time requirements, nor can they generalize from infrequent visits of all state-action pairs. The reinforcement learning technique used in most applications is therefore enhanced by means of regression tree function approximation. Three reinforcement learning algorithms are implemented to validate their applicability: traditional Q-learning and Q-learning with bucket-brigade updating are tested against the improved reinforcement learning approach with a CART function approximator. These methods are applied to data of 26 diary days. The results are promising and show that the proposed techniques offer great potential for simulating sequential data. Moreover, the reinforcement learning approach improved by introducing a regression tree function approximator learns a better solution much faster than the two traditional Q-learning approaches.

6.

Scalable and parallel sequential pattern mining using spark

Yu Xiao Li Qing Liu Jin 《World Wide Web》2019,22(1):295-324

World Wide Web - The performance of the existing parallel sequential pattern mining algorithms is often unsatisfactory due to high IO overhead and imbalanced load among the computing nodes. To...

7.

CDARL: a contrastive discriminator-augmented reinforcement learning framework for sequential recommendations

Liu Zhuang Ma Yunpu Hildebrandt Marcel Ouyang Yuanxin Xiong Zhang 《Knowledge and Information Systems》2022,64(8):2239-2265

Knowledge and Information Systems - Sequential recommendations play a crucial role in many real-world applications. Due to the sequential nature, reinforcement learning has been employed to...

8.

Integration of reinforcement learning and optimal decision-making theories of the basal ganglia

Bogacz R Larsen T 《Neural computation》2011,23(4):817-851

This article seeks to integrate two sets of theories describing action selection in the basal ganglia: reinforcement learning theories describing learning which actions to select to maximize reward and decision-making theories proposing that the basal ganglia selects actions on the basis of sensory evidence accumulated in the cortex. In particular, we present a model that integrates the actor-critic model of reinforcement learning and a model assuming that the cortico-basal-ganglia circuit implements a statistically optimal decision-making procedure. The values of cortico-striatal weights required for optimal decision making in our model differ from those provided by standard reinforcement learning models. Nevertheless, we show that an actor-critic model converges to the weights required for optimal decision making when biologically realistic limits on synaptic weights are introduced. We also describe the model's predictions concerning reaction times and neural responses during learning, and we discuss directions required for further integration of reinforcement learning and optimal decision-making theories.

9.

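A method for efficiently generating candidates for time-constrained sequential patterns

尹莉莉 郑诚 郑小波 《微型机与应用》2011,30(10):69-72

To address the shortcomings of several classic sequential pattern algorithms, a method for quickly generating candidates based on time-constrained sequential patterns (TFEGC) is proposed. The algorithm not only avoids frequent scans of the database but also takes time constraints into account, which avoids generating useless candidate sequences and improves the running-time efficiency of the algorithm.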

10.

DFSP: a Depth-First SPelling algorithm for sequential pattern mining of biological sequences

Vance Chiang-Chi Liao Ming-Syan Chen 《Knowledge and Information Systems》2014,38(3):623-639

Scientific progress in recent years has led to the generation of huge amounts of biological data, most of which remains unanalyzed. Mining the data may provide insights into various realms of biology, such as finding co-occurring biosequences, which are essential for biological data mining and analysis. Data mining techniques like sequential pattern mining may reveal implicitly meaningful patterns among the DNA or protein sequences. If biologists hope to unlock the potential of sequential pattern mining in their field, it is necessary to move away from traditional sequential pattern mining algorithms, because they have difficulty handling a small number of items and long sequences in biological data, such as gene and protein sequences. To address the problem, we propose an approach called Depth-First SPelling (DFSP) algorithm for mining sequential patterns in biological sequences. The algorithm’s processing speed is faster than that of PrefixSpan, its leading competitor, and it is superior to other sequential pattern mining algorithms for biological sequences.

11.

Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning

Shahid Asad Ali Piga Dario Braghin Francesco Roveda Loris 《Autonomous Robots》2022,46(3):483-498

Autonomous Robots - This paper presents a learning-based method that uses simulation data to learn an object manipulation task using two model-free reinforcement learning (RL) algorithms. The...

12.

Robot arm reaching through neural inversions and reinforcement learning (total citations: 1; self-citations: 0; citations by others: 1)

Pedro Jos del R. 《Robotics and Autonomous Systems》2000,31(4):227-246

We present a neural method that computes the inverse kinematics of any kind of robot manipulator, both redundant and non-redundant. Inverse kinematics solutions are obtained through the inversion of a neural network that has been previously trained to approximate the manipulator forward kinematics. The inversion provides difference vectors in the joint space from difference vectors in the workspace. Our differential inverse kinematics (DIV) approach can be viewed as a neural network implementation of the Jacobian transpose method for arm kinematic control that does not require previous knowledge of the arm forward kinematics. Redundancy can be exploited to obtain a special inverse kinematic solution that meets a particular constraint (e.g. joint limit avoidance) by inverting an additional neural network. The usefulness of our DIV approach is further illustrated with sensor-based multilink manipulators that learn collision-free reaching motions in unknown environments. For this task, the neural controller has two modules: a reinforcement-based action generator (AG) and a DIV module that computes goal vectors in the joint space. The actions given by the AG are interpreted with regard to those goal vectors.
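
For readers unfamiliar with the Jacobian transpose method mentioned in this abstract, the following is a minimal sketch of the classical, analytic version for a planar two-link arm. It is not the paper's neural inversion: the link lengths, the `gain` value, the starting angles, and the function names `forward`, `jt_step`, and `reach` are all assumptions made for illustration.

```python
import math

L1, L2 = 1.0, 1.0  # link lengths (assumed values, not from the paper)

def forward(t1, t2):
    """Forward kinematics of a planar two-link arm: joint angles -> (x, y)."""
    x = L1 * math.cos(t1) + L2 * math.cos(t1 + t2)
    y = L1 * math.sin(t1) + L2 * math.sin(t1 + t2)
    return x, y

def jt_step(t1, t2, target, gain=0.1):
    """One Jacobian transpose update: delta_theta = gain * J^T * workspace_error."""
    x, y = forward(t1, t2)
    ex, ey = target[0] - x, target[1] - y          # difference vector in the workspace
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    # Analytic Jacobian of forward(); the paper approximates this with a neural network.
    j11, j12 = -L1 * s1 - L2 * s12, -L2 * s12
    j21, j22 = L1 * c1 + L2 * c12, L2 * c12
    dt1 = j11 * ex + j21 * ey                      # first row of J^T times the error
    dt2 = j12 * ex + j22 * ey                      # second row of J^T times the error
    return t1 + gain * dt1, t2 + gain * dt2

def reach(target, t1=0.2, t2=1.2, iters=2000):
    """Iterate the update from an assumed starting posture."""
    for _ in range(iters):
        t1, t2 = jt_step(t1, t2, target)
    return t1, t2
```

Calling `forward(*reach((1.0, 1.0)))` should return a point very close to the target. The transpose is used instead of the inverse because it needs no matrix inversion and stays well-defined near singular postures, at the cost of slower convergence.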

13.

Integration of K-means algorithm and AprioriSome algorithm for fuzzy sequential pattern mining

R.J. Kuo C.M. Chao C.Y. Liu 《Applied Soft Computing》2009,9(1):85-93

Since Agrawal and Srikant proposed sequential pattern mining in 1995, many scholars have worked to improve the efficiency and reduce the processing time of the algorithms. This study proposes a fuzzy AprioriSome algorithm for fuzzy sequential pattern mining that is integrated with a clustering technique, the K-means algorithm. Two experiments, one using transaction data provided by a securities firm and one using foodmarket data from SQL Server 2000, demonstrate the strength of fuzzy AprioriSome sequential pattern mining in mining large quantities of transaction data.

14.

State-chain sequential feedback reinforcement learning for path planning of autonomous mobile robots

Xin MA Ya XU Guo-qiang SUN Li-xia DENG Yi-bin LI 《浙江大学学报:C卷英文版》2013,14(3):167-178

This paper deals with a new approach based on Q-learning for solving the problem of mobile robot path planning in complex unknown static environments. As a computational approach to learning through interaction with the environment, reinforcement learning algorithms have been widely used for intelligent robot control, especially in the field of autonomous mobile robots. However, the learning process is slow and cumbersome. For practical applications, rapid rates of convergence are required. Aiming at the problem of slow convergence and long learning time for Q-learning based mobile robot path planning, a state-chain sequential feedback Q-learning algorithm is proposed for quickly searching for the optimal path of mobile robots in complex unknown static environments. The state chain is built during the searching process. After one action is chosen and the reward is received, the Q-values of the state-action pairs on the previously built state chain are sequentially updated with one-step Q-learning. With the increasing number of Q-values updated after one action, the number of actual steps for convergence decreases and thus, the learning time decreases, where a step is a state transition. Extensive simulations validate the efficiency of the newly proposed approach for mobile robot path planning in complex environments. The results show that the new approach has a high convergence speed and that the robot can find the collision-free optimal path in complex unknown static environments with much shorter time, compared with the one-step Q-learning algorithm and the Q(λ)-learning algorithm.
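
The update rule described in this abstract can be sketched on a toy problem. The code below is an illustrative reading, not the authors' implementation: the one-dimensional corridor environment, the reward of 1 at the goal, the backward sweep order over the chain, the hyperparameter values, and the name `chain_q_learning` are all assumptions.

```python
import random

def chain_q_learning(n_states=6, episodes=30, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Q-learning on a corridor 0..n_states-1 with a state-chain feedback sweep.

    Actions: 0 = left, 1 = right. The last state is the goal (reward 1).
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    def step(s, a):
        s2 = max(0, s - 1) if a == 0 else min(goal, s + 1)
        return s2, (1.0 if s2 == goal else 0.0)

    def update(s, a, r, s2):
        # Ordinary one-step Q-learning update for a single state-action pair.
        target = r + (0.0 if s2 == goal else gamma * max(q[s2]))
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        s, chain = 0, []          # the state chain is rebuilt in every episode
        while s != goal:
            # Epsilon-greedy action choice with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r = step(s, a)
            chain.append((s, a, r, s2))
            # Sequential feedback: after this single action, re-apply the
            # one-step update to every pair on the chain, newest first
            # (the sweep direction is an assumption of this sketch).
            for ps, pa, pr, ps2 in reversed(chain):
                update(ps, pa, pr, ps2)
            s = s2
    return q
```

After training with the default arguments, the greedy action in every non-goal state should be "right". Because each action triggers updates along the whole chain, reward information reaches early states within the same episode rather than one state per episode, which is the speed-up the abstract describes.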

15.

Effective temporal data classification by integrating sequential pattern mining and probabilistic induction

Vincent S. Tseng Chao-Hui Lee 《Expert systems with applications》2009,36(5):9524-9532

Data classification is an important topic in the field of data mining due to its wide applications. A number of related methods have been proposed based on well-known learning models such as decision trees or neural networks. Although data classification has been widely discussed, relatively few studies have explored temporal data classification. Most existing studies focus on improving the accuracy of classification by using statistical models, neural networks, or distance-based methods. However, they cannot interpret the results of classification for users. In many research cases, such as gene expression of microarrays, users prefer interpretable classification information over a classifier that offers only high accuracy. In this paper, we propose a novel pattern-based data mining method, namely classify-by-sequence (CBS), for classifying large temporal datasets. The main methodology behind CBS is integrating sequential pattern mining with probabilistic induction. CBS has the merit of simplicity in implementation, and its pattern-based architecture can supply clear classification information to users. Through experimental evaluation, CBS was shown to deliver classification results with high accuracy on two real time series datasets. In addition, we designed a simulator to evaluate the performance of CBS on datasets with different characteristics. The experimental results show that CBS can discover the hidden patterns and classify data effectively by utilizing the mined sequential patterns.

16.

Efficient mining of sequential patterns with time constraints by delimited pattern growth

Ming-Yen Lin Suh-Yin Lee 《Knowledge and Information Systems》2005,7(4):499-514

An active research topic in data mining is the discovery of sequential patterns, which finds all frequent subsequences in a sequence database. The generalized sequential pattern (GSP) algorithm was proposed to solve the mining of sequential patterns with time constraints, such as time gaps and sliding time windows. Recent studies indicate that the pattern-growth methodology could speed up sequence mining. However, the capabilities to mine sequential patterns with time constraints were previously available only within the Apriori framework. Therefore, we propose the DELISP (delimited sequential pattern) approach to provide the capabilities within the pattern-growth methodology. DELISP reduces the size of projected databases through bounded and windowed projection techniques. Bounded projection keeps only time-gap valid subsequences and windowed projection saves nonredundant subsequences satisfying the sliding time-window constraint. Furthermore, the delimited growth technique directly generates constraint-satisfying patterns and speeds up the pattern growing process. The comprehensive experiments conducted show that DELISP has good scalability and outperforms the well-known GSP algorithm in the discovery of sequential patterns with time constraints.
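
To make the notion of a time-gap constraint concrete, here is a minimal sketch of a naive support count under minimum and maximum gap constraints. It is neither DELISP nor GSP and does not handle sliding windows or itemsets: the event representation, the names `occurs` and `support`, and the parameters `min_gap` and `max_gap` are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

Event = Tuple[int, str]  # (timestamp, item); events are assumed sorted by time

def occurs(seq: Sequence[Event], pattern: Sequence[str],
           min_gap: int, max_gap: int) -> bool:
    """True if `pattern` occurs in `seq` with every consecutive time gap
    inside [min_gap, max_gap]. Uses backtracking, because picking the
    earliest match greedily can violate the maximum gap."""
    def match(p_idx: int, start: int, prev_time: Optional[int]) -> bool:
        if p_idx == len(pattern):
            return True
        for i in range(start, len(seq)):
            t, item = seq[i]
            if prev_time is not None:
                gap = t - prev_time
                if gap > max_gap:
                    break       # later events are even further away
                if gap < min_gap:
                    continue    # too close to the previous matched event
            if item == pattern[p_idx] and match(p_idx + 1, i + 1, t):
                return True
        return False
    return match(0, 0, None)

def support(db, pattern, min_gap: int, max_gap: int) -> int:
    """Number of sequences in `db` that contain `pattern` under the constraints."""
    return sum(occurs(s, pattern, min_gap, max_gap) for s in db)
```

A pattern is frequent when `support` reaches a chosen minimum threshold. Algorithms such as the ones in this abstract avoid rescanning every sequence for every candidate, which this sketch does not attempt.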

17.

Comparison between on- and off-campus behaviour and adaptability in online learning: A case from China

《Behaviour & Information Technology》2012,31(4):281-291

More and more universities and colleges are providing online courses not only for on-campus students but also for off-campus students. Tutors have to consider the differences between on- and off-campus students in order to improve effective instruction. Comparisons are made in this paper between on- and off-campus performances in online learning from four areas: learning time, path of browsing courseware, intercommunication and adaptability towards online learning. The last two areas are emphasized. Multiple approaches were adopted to collect data, which include questionnaires, posted documents, online logs, interviews and observations. This study shows that the rush time of online learning, paths of browsing courseware and favourite intercommunication means of on- and off-campus students are similar. But there are also some differences between these two groups such as competence of self-learning, enthusiasm of interpersonal exchange, dependence on tutors, feeling of learning stress, etc.

18.

Comparison between on- and off-campus behaviour and adaptability in online learning: a case from China

Xiaoyan Xie Fuzong Lin Tao Zhang 《Behaviour & Information Technology》2001,20(4):281-291

More and more universities and colleges are providing online courses not only for on-campus students but also for off-campus students. Tutors have to consider the differences between on- and off-campus students in order to improve effective instruction. Comparisons are made in this paper between on- and off-campus performances in online learning from four areas: learning time, path of browsing courseware, intercommunication and adaptability towards online learning. The last two areas are emphasized. Multiple approaches were adopted to collect data, which include questionnaires, posted documents, online logs, interviews and observations. This study shows that the rush time of online learning, paths of browsing courseware and favourite intercommunication means of on- and off-campus students are similar. But there are also some differences between these two groups such as competence of self-learning, enthusiasm of interpersonal exchange, dependence on tutors, feeling of learning stress, etc.

19.

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

Dulac-Arnold Gabriel Levine Nir Mankowitz Daniel J. Li Jerry Paduraru Cosmin Gowal Sven Hester Todd 《Machine Learning》2021,110(9):2419-2468

Machine Learning - Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, much of the research...

20.

Explaining mixture models through semantic pattern mining and banded matrix visualization

Prem Raj Adhikari Anže Vavpetič Jan Kralj Nada Lavrač Jaakko Hollmén 《Machine Learning》2016,105(1):3-39
