Similar Literature
 Found 16 similar articles (search time: 15 ms)
1.
Because of the damage it does to Internet security, malware (e.g., viruses, worms, trojans) and its detection have attracted the attention of both the anti-malware industry and researchers for decades. To protect legitimate users from attacks, the most significant line of defense against malware is anti-malware software, which mainly uses signature-based methods for detection. However, these methods fail to recognize new, unseen malicious executables. To solve this problem, in this paper, based on the instruction sequences extracted from the file sample set, we propose an effective sequence mining algorithm to discover malicious sequential patterns; an All-Nearest-Neighbor (ANN) classifier is then constructed for malware detection based on the discovered patterns. The resulting data mining framework, composed of the proposed sequential pattern mining method and the ANN classifier, characterizes the malicious patterns in the collected file sample set well and effectively detects new, unseen malware samples. A comprehensive experimental study on a real data collection is performed to evaluate our detection framework. Promising experimental results show that our framework outperforms alternative data-mining-based detection methods in identifying new malicious executables.

2.
Malicious web content detection by machine learning   (Total citations: 1; self-citations: 0; citations by others: 1)
The recent development of dynamic HTML (DHTML) gives attackers a new and powerful technique for compromising computer systems. Malicious DHTML code is usually embedded in a normal webpage, and the malicious webpage infects the victim when a user browses it. Furthermore, such DHTML code can easily disguise itself through obfuscation or transformation, which makes detection even harder. Anti-virus software packages commonly use signature-based approaches, which might not be able to efficiently identify camouflaged malicious HTML code. Therefore, our paper proposes a malicious webpage detection method based on machine learning. Our study systematically analyzes the characteristics of malicious webpages and presents important features for machine learning. Experimental results demonstrate that our method is resilient to code obfuscation and can correctly determine whether a webpage is malicious or not.

3.
To improve software quality, static or dynamic defect-detection tools accept programming rules as input and detect their violations in software as defects. As these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rules as frequent patterns from program source code. These approaches then use static or dynamic defect-detection techniques to detect pattern violations in the source code under analysis. However, these existing approaches often produce many false positives due to various factors. To reduce the false positives produced by these mining approaches, we develop a novel approach, called Alattin, that includes new mining algorithms and a technique for detecting neglected conditions based on our mining algorithms. Our new mining algorithms mine patterns in four pattern formats: conjunctive, disjunctive, exclusive-disjunctive, and combinations of these patterns. We show the benefits and limitations of these four pattern formats with respect to false positives and false negatives among detected violations by applying those patterns to the problem of detecting neglected conditions.

4.
Learning and convergence properties of linear threshold elements or perceptrons are well understood for the case where the input vectors (or the training sets) to the perceptron are linearly separable. Little is known, however, about the behavior of the perceptron learning algorithm when the training sets are linearly nonseparable. We present the first known results on the structure of linearly nonseparable training sets and on the behavior of perceptrons when the set of input vectors is linearly nonseparable. More precisely, we show that using the well-known perceptron learning algorithm, a linear threshold element can learn the input vectors that are provably learnable, and identify those vectors that cannot be learned without committing errors. We also show how a linear threshold element can be used to learn large linearly separable subsets of any given nonseparable training set. To develop our results, we first establish formal characterizations of linearly nonseparable training sets and define learnable structures for such patterns. We also prove computational complexity results for the related learning problems. Next, based on these characterizations, we show that a perceptron does the best one can expect for linearly nonseparable sets of input vectors and learns as much as is theoretically possible.
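
For readers unfamiliar with the perceptron learning algorithm mentioned in this abstract, the following is a minimal sketch of the classic update rule only, not of the paper's analysis of nonseparable sets. The function name `train_perceptron`, the epoch cap, and the +1/-1 label encoding are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def train_perceptron(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    max_epochs: int = 100,
) -> Tuple[List[float], float, int]:
    """Classic perceptron rule (illustrative sketch, labels are +1 or -1).

    Returns the weights, the bias, and the number of mistakes made in the
    last epoch that was run. Zero mistakes means the data were separated.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    mistakes = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified (or on the boundary)
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: converged
            break
    return weights, bias, mistakes


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
and_labels = [-1, -1, -1, 1]  # linearly separable
xor_labels = [-1, 1, 1, -1]   # linearly nonseparable

_, _, and_mistakes = train_perceptron(points, and_labels)
_, _, xor_mistakes = train_perceptron(points, xor_labels)
```

On the separable AND labels the loop stops after an error-free pass, whereas on the nonseparable XOR labels every pass contains at least one mistake, which is the regime the abstract is concerned with.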

5.
6.
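Prolonged, continuous computer use poses health hazards. Because there is currently no effective non-intrusive method for detecting fatigue caused by computer use, this paper proposes a non-interfering method for assessing hand muscle fatigue based on real-time monitoring of keyboard and mouse events. The method applies keystroke action matching, data denoising, feature vector extraction, and classification, and analyzes the delay characteristics of two types of keystrokes over a period of time to assess and monitor the degree of hand muscle fatigue. The detected fatigue state is shared with friends through a social network, so that persuasion by friends and health incentives encourage users to gradually change unhealthy computer usage habits. The method was evaluated in a two-week experiment with 15 users. The results verify the effectiveness of the proposed method for fatigue assessment and the feasibility of sharing related health information on a social network platform, and show that keystroke delay is negatively correlated with the degree of hand muscle fatigue.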

7.
Advances in data mining technologies have enabled intelligent Web capabilities in various applications by utilizing the hidden user behavior patterns discovered from Web logs. Intelligent methods for discovering and predicting users' patterns are important in supporting intelligent Web applications such as personalized services. Although numerous studies have been done on Web usage mining, few of them consider temporal evolution when discovering Web users' patterns. In this paper, we propose a novel data mining algorithm named Temporal N-Gram (TN-Gram) for constructing prediction models of Web user navigation by considering the temporality property in Web usage evolution. Moreover, three new kinds of measures are proposed for evaluating the temporal evolution of navigation patterns over different time periods. Through experimental evaluation on both real-life and simulated datasets, the proposed TN-Gram model is shown to outperform other approaches such as N-gram modeling in terms of prediction precision, in particular when Web users' navigation behavior changes significantly over time.
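
As background for the comparison in this abstract, here is a minimal sketch of the plain N-gram navigation model that the abstract names as a baseline. It is not the TN-Gram algorithm, whose temporal measures are not specified here. The function names, the session format (lists of page identifiers), and the sample pages are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_ngram_model(sessions, n=2):
    """Count which page follows each run of n consecutive pages."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - n):
            context = tuple(session[i:i + n])
            model[context][session[i + n]] += 1
    return model


def predict_next(model, recent_pages, n=2):
    """Return the most frequent successor of the last n pages, or None."""
    counts = model.get(tuple(recent_pages[-n:]))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical navigation sessions extracted from a Web log.
sessions = [
    ["home", "list", "item"],
    ["home", "list", "item"],
    ["home", "list", "cart"],
]
model = build_ngram_model(sessions, n=2)
prediction = predict_next(model, ["home", "list"], n=2)
```

A plain model of this kind pools all sessions regardless of when they occurred, which is the limitation that a temporally aware model sets out to address.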

8.
An active research topic in data mining is the discovery of sequential patterns, which finds all frequent subsequences in a sequence database. The generalized sequential pattern (GSP) algorithm was proposed to mine sequential patterns with time constraints, such as time gaps and sliding time windows. Recent studies indicate that the pattern-growth methodology could speed up sequence mining. However, the capability to mine sequential patterns with time constraints was previously available only within the Apriori framework. Therefore, we propose the DELISP (delimited sequential pattern) approach to provide this capability within the pattern-growth methodology. DELISP reduces the size of projected databases through bounded and windowed projection techniques. Bounded projection keeps only subsequences that satisfy the time-gap constraints, and windowed projection saves nonredundant subsequences satisfying the sliding time-window constraint. Furthermore, the delimited growth technique directly generates constraint-satisfying patterns and speeds up the pattern-growing process. Comprehensive experiments show that DELISP scales well and outperforms the well-known GSP algorithm in the discovery of sequential patterns with time constraints.
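
To make the notion of a time-gap constraint concrete, the following is a minimal sketch of checking whether one time-stamped sequence contains a pattern whose consecutive elements are separated by a gap within given bounds. It is neither GSP nor DELISP, and it simplifies each element to a single item rather than an itemset. The function name and the `(timestamp, item)` representation are assumptions made for illustration.

```python
def occurs_with_gaps(sequence, pattern, min_gap, max_gap):
    """Check a pattern against one sequence under time-gap constraints.

    sequence: list of (timestamp, item) pairs sorted by timestamp.
    pattern:  list of items that must appear in order, with
              min_gap <= gap <= max_gap between consecutive matches.
    """
    def search(start, p_idx, prev_time):
        if p_idx == len(pattern):
            return True
        for i in range(start, len(sequence)):
            t, item = sequence[i]
            if item != pattern[p_idx]:
                continue
            if prev_time is not None:
                gap = t - prev_time
                if gap > max_gap:
                    break  # timestamps are sorted, so later gaps only grow
                if gap < min_gap:
                    continue
            # Try this match; backtrack to a later one if it leads nowhere.
            if search(i + 1, p_idx + 1, t):
                return True
        return False

    return search(0, 0, None)


events = [(1, "a"), (2, "b"), (10, "a"), (12, "b"), (13, "c")]
found_loose = occurs_with_gaps(events, ["a", "b", "c"], min_gap=1, max_gap=3)
found_tight = occurs_with_gaps(events, ["a", "b", "c"], min_gap=1, max_gap=1)
```

The first call succeeds only by backtracking from the early `a` at time 1 to the later `a` at time 10, which illustrates why a greedy leftmost match is not sufficient once gap constraints are involved.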

9.
The generation of road networks from ubiquitous motor-vehicle GPS trajectories has recently gained wide interest. However, few attempts have been made to automatically extract road network properties such as intersections and traffic rules to facilitate the production of high-quality routable maps. For urban street networks, the vehicle trajectory logged by a GPS receiver tends to be straight on streets and curved at intersections, although local deviations exist because vehicle paths deviate from road centrelines and because of GPS positioning errors. This paper uses large curved trajectories at traffic intersections and presents novel algorithms for automatically detecting road intersections and traffic rules. Two inherent issues related to GPS trajectories have been resolved using the proposed approach. First, the serious fluctuations of vehicle trajectories due to multipath reflectivity from high-rise buildings have been eliminated, thereby enabling the effective detection of real curved trajectories occurring at traffic intersections. Second, the heterogeneity of traffic density has been considered when using the curved trajectories to automatically detect road intersections. The proposed algorithm was implemented using open-source software libraries and tested using large taxi trajectories collected in Suzhou City, China. A total of 285 at-grade intersections were detected automatically, and dynamic traffic rules were elucidated for each intersection. Compared with the manually interpreted results, the detection results were of high quality and provided detailed information for the construction of a routable map.

10.
A number of studies on sensor networks have been published in the past few years owing to their wide range of potential applications. Object tracking is an important topic in sensor networks, and the limited power of sensor nodes presents numerous challenges to researchers. Previous studies of energy conservation in sensor networks have considered object movement behavior to be random. However, in some applications, the movement behavior of an object is often driven by certain underlying events rather than being completely random. Moreover, few studies have considered the real-time issue in addition to the energy-saving problem for object tracking in sensor networks. In this paper, we propose a novel strategy named multi-level object tracking strategy (MLOT) for energy-efficient and real-time tracking of moving objects in sensor networks by mining the movement log. In MLOT, we first conduct hierarchical clustering to form a hierarchical model of the sensor nodes. Second, the movement logs of the moving objects are analyzed by a data mining algorithm to obtain the movement patterns, which are then used to predict the next position of a moving object. We use the multi-level structure to represent the hierarchical relations among sensor nodes so as to keep track of moving objects in real time. Through experimental evaluation under various simulated conditions, the proposed method is shown to deliver excellent performance in terms of both energy efficiency and timeliness.

11.
Pattern Analysis and Applications - Existing architectures used in face anti-spoofing tend to deploy registered spatial measurements to generate feature vectors for spoof detection. This means that...

12.
In this paper, we present a new data mining algorithm that incrementally mines user moving patterns in a mobile computing environment and exploits the mining results to develop data allocation schemes that improve the overall performance of a mobile system. First, we propose an algorithm to capture the frequent user moving patterns from a set of log data in a mobile environment. The proposed algorithm is enhanced with an incremental mining capability and is able to discover new moving patterns efficiently without compromising the quality of the results obtained. Then, in light of the mining results of user moving patterns and the properties of data objects, we develop data allocation schemes that can utilize the knowledge of user moving patterns for proper allocation of both personal and shared data. By employing the data allocation schemes, the occurrences of costly remote accesses can be minimized and the performance of a mobile computing system is thus improved. For personal data allocation, two schemes are devised: one utilizes the set level of moving patterns and the other utilizes their path level. Schemes for shared data are also developed. The performance of these schemes is comparatively analyzed.

13.
The purpose of the work described in this paper is to provide an intelligent intrusion detection system (IIDS) that uses two of the most popular data mining tasks, namely classification and association rule mining, together for predicting different behaviors in networked computers. To achieve this, we propose a method based on iterative rule learning using a fuzzy rule-based genetic classifier. Our approach is mainly composed of two phases. First, a large number of candidate rules are generated for each class using fuzzy association rule mining, and they are pre-screened using two rule evaluation criteria in order to reduce the fuzzy rule search space. Candidate rules obtained after pre-screening are used in a genetic fuzzy classifier to generate rules for the classes specified in IIDS: namely Normal, PRB-probe, DOS-denial of service, U2R-user to root and R2L-remote to local. During the next stage, a boosting genetic algorithm is employed for each class to find the fuzzy rules required to classify data each time a fuzzy rule is extracted and included in the system. The boosting mechanism evaluates the weight of each data item to help the rule extraction mechanism focus more on data with relatively more weight, i.e., data covered less well by the rules extracted up to the current iteration. Each extracted fuzzy rule is assigned a weight. Weighted fuzzy rules in each class are aggregated to find the vote of each class label for each data item.

14.
Location-based services allow users to perform check-in actions, which record their geo-spatial activities and provide a plentiful source for more accurate and useful geographical recommendation. In this paper, we present a novel Preferred Time-aware Route Planning (PTRP) problem, which aims to recommend routes whose locations are not only representative but also satisfy users' preferences. The central idea is that the goodness of visiting locations along a route is significantly affected by the visiting time and user preference, and each location has its own proper visiting time due to its category and population. We develop a four-stage preference-based time-aware route planning framework. First, since there is usually either noisy visiting-time information on existing locations or no visiting information on new locations, we devise an inference method, LocTimeInf, to predict the location visiting time on routes. Second, considering the geographical, social, and temporal information of users, we propose the GST-Clus method to group users with similar location visiting preferences. Third, we propose the Time-aware Transit Pattern Mining (TTPM) algorithm to find representative and popular time-aware location-transition behaviors. Finally, based on the mined time-aware transit patterns, we develop a Preferred Route Search (PR-Search) algorithm to construct the final time-aware routes. Experiments on Gowalla and Foursquare check-in data demonstrate the effectiveness and efficiency of the proposed methods compared with a series of competitors.

15.
In this paper, we propose a novel noncontact pulse wave monitoring method that is robust to fluctuations in illumination through use of two-band infrared video signals. Because the proposed method uses infrared light for illumination, the method can be used to detect a pulse wave on a human face without visible lighting. The corresponding two-band pixel values in the video signals can be separated into hemoglobin and shading components by application of a separation matrix in logarithmic space for the two pixel values. Because the shading component has been separated, the extracted hemoglobin component is then robust to fluctuations in the illumination. The pixel values in the region of interest were spatially averaged over all the pixels of each frame. These averaged values were then used to form the raw trace signal. Finally, the pulse wave and the corresponding pulse rate were obtained from the raw trace signal through several signal processing stages, including detrending, use of an adaptive bandpass filter, and peak detection. We evaluated the absolute error rate for the pulse rate between the estimated value and the ground truth obtained using an electrocardiogram. In the experiments, we found that the performance of the proposed method was greatly improved compared with that of conventional methods using single-band infrared video.
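
The separation step described in this abstract can be illustrated with a minimal sketch under a strongly simplified model, which is an assumption and not the paper's actual formulation: the logarithm of each band's pixel value is taken to be a linear mix of a hemoglobin component `h` and a shading component `s`, where shading enters both bands equally and hemoglobin enters with band-specific coefficients. The coefficients `a1` and `a2`, the function names, and the sample values are hypothetical.

```python
import math


def separate_components(p1, p2, a1, a2):
    """Recover (hemoglobin, shading) from two band pixel values.

    Assumed model (illustrative only):
        log(p1) = a1 * h + s
        log(p2) = a2 * h + s
    Inverting the 2x2 mixing matrix [[a1, 1], [a2, 1]] gives h and s.
    """
    if math.isclose(a1, a2):
        raise ValueError("bands must respond differently to hemoglobin")
    log1, log2 = math.log(p1), math.log(p2)
    hemoglobin = (log1 - log2) / (a1 - a2)
    shading = log1 - a1 * hemoglobin
    return hemoglobin, shading


def synthesize(h, s, a1, a2):
    """Forward model used only to build test inputs for this sketch."""
    return math.exp(a1 * h + s), math.exp(a2 * h + s)


A1, A2 = -0.8, -0.3  # hypothetical band coefficients
dim = synthesize(0.5, -1.0, A1, A2)     # same hemoglobin, weak lighting
bright = synthesize(0.5, 0.7, A1, A2)   # same hemoglobin, strong lighting
h_dim, s_dim = separate_components(*dim, A1, A2)
h_bright, s_bright = separate_components(*bright, A1, A2)
```

Under this assumed model the recovered hemoglobin value is the same for the dim and bright inputs while only the shading value changes, which is the sense in which the separated component is insensitive to illumination.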

16.
This paper deals with a technique that can support the re-engineering of parallel programs based on point-to-point communication primitives by detecting typical process interaction patterns in the code. Pattern detection is performed by the static analysis of the parallel program and by solving Diophantine sets of inequalities. The objective is to determine process interactions and to classify them into a set of commonly occurring interaction patterns.

Information on the patterns contained in the program, besides being useful for code comprehension and documentation, makes it possible to obtain more structured and, possibly, efficient versions of the same programs through the use of collective communication constructs. These are primitives for collective data movement or computation often available in current message-passing programming environments.

After the presentation of the basic program analysis technique, several examples involving the detection of common communication patterns are shown. Then the structure of PPAR, a prototype tool that allows the analysis of parallel programs written in Fortran 77 with calls to PVM or MPI unstructured communication primitives, is outlined, and conclusions are drawn.

