Similar Literature
 20 similar documents found (search time: 281 ms)
1.
In this paper, we propose a general learning framework based on local and global regularization. In the local regularization part, our algorithm constructs a regularized classifier for each data point using its neighborhood, while the global regularization part adopts a Laplacian regularizer to smooth the data labels predicted by those local classifiers. We show that such a learning framework can easily be incorporated into unsupervised, semi-supervised, and supervised learning paradigms. Moreover, many existing learning algorithms can be derived from our framework. Finally, we present experimental results to show the effectiveness of our method.
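The global regularization step can be sketched in a few lines (a minimal illustration of Laplacian label smoothing under our own simplifications, not the authors' full framework, which also builds per-point local classifiers): smooth noisy labels y over a similarity graph W by solving (I + λL)f = y, where L = D − W is the graph Laplacian.

```python
import numpy as np

def laplacian_smooth(y, W, lam=1.0):
    """Smooth labels y over similarity graph W by solving (I + lam*L) f = y."""
    D = np.diag(W.sum(axis=1))          # degree matrix
    L = D - W                           # graph Laplacian
    n = len(y)
    return np.linalg.solve(np.eye(n) + lam * L, y)

# Toy graph: points 0 and 1 are neighbors with conflicting labels +1/-1;
# point 2 is isolated, so smoothing leaves its label unchanged.
W = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
y = np.array([1., -1., 1.])
f = laplacian_smooth(y, W, lam=1.0)
```

The connected pair's labels are pulled toward each other (to ±1/3 here), while the isolated point keeps its label, which is exactly the smoothing behavior the Laplacian regularizer enforces.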

2.
The majority of machine learning methodologies operate with the assumption that their environment is benign. However, this assumption does not always hold, as it is often advantageous to adversaries to maliciously modify the training data (poisoning attacks) or test data (evasion attacks). Such attacks can be catastrophic given the growth and penetration of machine learning applications in society. There is therefore a need to secure machine learning, enabling its safe adoption in adversarial settings such as spam filtering, malware detection, and biometric recognition. This paper presents a taxonomy and survey of attacks against systems that use machine learning. It organizes the body of knowledge in adversarial machine learning so as to identify the aspects to which researchers from different fields can contribute. The taxonomy identifies attacks that share key characteristics and can thus potentially be addressed by the same defence approaches. The proposed taxonomy therefore makes it easier to understand the existing attack landscape when developing defence mechanisms, which are not investigated in this survey. The taxonomy is also leveraged to identify open problems that can lead to new research areas within the field of adversarial machine learning.

3.
Illya; Antonio Padua. Neurocomputing, 2008, 71(7-9): 1203-1209
The problem of inductive supervised learning is discussed in this paper within the context of multi-objective (MOBJ) optimization. The smoothness-based apparent (effective) complexity measure for RBF networks is considered, and for the specific case of RBF networks, bounds on the complexity measure are formally described. As the synthetic and real-world data experiments show, the proposed MOBJ learning method is capable of efficient generalization control along with network size reduction.

4.
Quality control of the commutator manufacturing process can be automated by means of a machine learning model that can predict the quality of commutators as they are being manufactured. Such a model can be constructed by combining machine vision, machine learning and evolutionary optimization techniques. In this procedure, optimization is used to minimize the model error, which is estimated using single cross-validation. This work exposes the overfitting that emerges in such optimization. Overfitting is shown for three machine learning methods with different sensitivity to it (trees, additionally pruned trees and random forests) and assessed in two ways (repeated cross-validation and validation on a set of unseen instances). Results on two distinct quality control problems show that optimization amplifies overfitting, i.e., the single cross-validation error estimate for the optimized models is overly optimistic. Nevertheless, minimization of the error estimate by single cross-validation in general results in minimization of the other error estimates as well, showing that optimization is indeed beneficial in this context.
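The optimism the paper exposes can be illustrated with a toy simulation (entirely our own construction, not the paper's experimental setup): when many candidate parameter settings share the same true accuracy but their single cross-validation estimates are noisy, the optimizer that reports the best estimate systematically overstates performance.

```python
import numpy as np

# Toy illustration: 100 candidate settings all have true accuracy 0.80, but
# each single cross-validation estimate carries noise. Selecting the maximum
# estimate (what optimization over a single-CV objective does) yields a
# value biased above the true accuracy.
rng = np.random.default_rng(0)
true_acc = 0.80
cv_estimates = true_acc + rng.normal(0.0, 0.02, size=100)
selected_estimate = cv_estimates.max()   # what single-CV optimization reports
optimism = selected_estimate - true_acc  # positive: the estimate is inflated
```

This is why the paper's second assessment, validation on unseen instances (or repeated cross-validation), is needed to obtain an honest error estimate for the optimized model.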

5.
Building on previous work on ridgelet neural networks, which employ the ridgelet function as the activation function in a feedforward neural network, in this paper we propose a single-hidden-layer regularization ridgelet network (SLRRN). An extra regularization term encoding prior knowledge of the problem to be solved is added to the cost functional to obtain better generalization performance, and a simple and efficient method named the cost-functional-minimized extreme and incremental learning (CFM-EIL) algorithm is proposed. In the CFM-EIL-based SLRRN (CFM-EIL-SLRRN), the ridgelet hidden neurons together with their parameters are tuned incrementally and analytically, which significantly reduces the computational complexity relative to gradient-based or other iterative algorithms. Simulation experiments on time-series forecasting are conducted, and several commonly used regression methods are compared under the same conditions. The results show the superiority of the proposed CFM-EIL-SLRRN over its counterparts in forecasting.
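The analytic (non-iterative) tuning of output weights can be sketched in simplified ELM style. Everything below is our own illustration, with random cosine features standing in for the ridgelet activations and a one-shot regularized closed-form solve, beta = (HᵀH + λI)⁻¹Hᵀy, standing in for the incremental cost-functional minimization.

```python
import numpy as np

# Simplified sketch: fixed random hidden features (cosine units standing in
# for ridgelet activations) plus a regularized closed-form output solve.
rng = np.random.default_rng(1)

n_hidden = 5
W = rng.normal(scale=3.0, size=n_hidden)       # fixed random frequencies
b = rng.uniform(0, 2 * np.pi, size=n_hidden)   # fixed random phases

x = np.linspace(0.0, 1.0, 50)
H = np.cos(np.outer(x, W) + b)                 # hidden-layer output matrix

beta_true = rng.normal(size=n_hidden)          # a target the model can represent
y = H @ beta_true

lam = 1e-8                                     # regularization strength
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)
max_err = np.max(np.abs(H @ beta - y))         # training fit of the analytic solve
```

Because the hidden parameters are fixed before the solve, no gradient iterations are needed; the regularization term λI is where prior knowledge (here, just a smoothness/shrinkage preference) enters the cost functional.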

6.
Machine learning has experienced explosive growth in the last few decades, achieving sufficient maturity to provide effective tools for sundry scientific and engineering fields. Machine learning provides a firm theoretical foundation upon which to build techniques that leverage existing data to extract interesting information or to synthesize more data. In this paper we survey the uses of machine learning methods and concepts in recent computer graphics techniques. Many graphics techniques are data-driven; however, few graphics papers explicitly leverage the machine learning literature to underpin, validate, and develop their proposed methods. This survey provides novel insights by casting many existing computer graphics techniques into a common learning framework. This not only illuminates how these techniques are related, but also reveals possible ways in which they may be improved. We also use our analysis to propose several directions for future work.

7.
The literature has long witnessed efforts to use parallel algorithms and parallel architectures to improve performance, and machine learning is no exception; considerable effort has gone into this area over the past fifteen years. Our report attempts to bring together and consolidate these attempts. It tracks developments since the inception of the idea in 1995, identifies different phases during the period 1995-2011, and marks important achievements. When it comes to performance enhancement, GPU platforms have carved out a special niche. Their strength comes from the ability to speed up computations dramatically through parallel architectures and programming methods. While computationally complex processes such as image processing and gaming clearly stand to gain much from parallel architectures, studies suggest that general-purpose tasks such as machine learning, graph traversal, and finite state machines are also among the parallel applications of the future. MapReduce is another important technique that evolved during this period; the literature shows it to be an important aid in delivering the performance of machine learning algorithms on GPUs. The report concludes with a summary of this path of development.

8.
We present a comparative study of the most popular machine learning methods applied to the challenging problem of customer churn prediction in the telecommunications industry. In the first phase of our experiments, all models were applied and evaluated using cross-validation on a popular, public-domain dataset. In the second phase, the performance improvement offered by boosting was studied. To determine the most efficient parameter combinations, we performed a series of Monte Carlo simulations for each method over a wide range of parameters. Our results demonstrate a clear superiority of the boosted versions of the models over the plain (non-boosted) versions. The best overall classifier was SVM-POLY with AdaBoost, achieving an accuracy of almost 97% and an F-measure over 84%.
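The boosting mechanism behind these gains can be sketched with a from-scratch AdaBoost over decision stumps (our own minimal numpy illustration on toy 1-D data; the paper's experiments use full base learners such as SVMs on real churn data):

```python
import numpy as np

def train_adaboost(X, y, rounds=3):
    """AdaBoost with threshold-stump weak learners; y must be in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                    # uniform initial sample weights
    stumps = []
    for _ in range(rounds):
        best = None                            # exhaustive search for best stump
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] >= thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        pred = sign * np.where(X[:, j] >= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)         # upweight misclassified samples
        w /= w.sum()
        stumps.append((alpha, j, thr, sign))
    return stumps

def predict(stumps, X):
    score = sum(a * s * np.where(X[:, j] >= t, 1, -1) for a, j, t, s in stumps)
    return np.sign(score)

# Interval-shaped labels that no single stump can fit, but three can.
X = np.array([[1.0], [3.0], [4.0], [6.0]])
y = np.array([-1, 1, 1, -1])
stumps = train_adaboost(X, y, rounds=3)
```

After three rounds the weighted vote of stumps classifies all four points correctly, even though each individual stump misclassifies at least one of them.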

9.
This paper describes a synergistic approach that is applicable to a wide variety of system control problems. The approach utilizes a machine learning technique, goal-directed conceptual aggregation (GDCA), to facilitate dynamic decision-making. The application domain employed is Flexible Manufacturing System (FMS) scheduling and control. Simulation is used for the dual purpose of providing a realistic depiction of FMSs, and serves as an engine for demonstrating the viability of a synergistic system involving incremental learning. The paper briefly describes prior approaches to FMS scheduling and control, and machine learning. It outlines the GDCA approach, provides a generalized architecture for dynamic control problems, and describes the implementation of the system as applied to FMS scheduling and control. The paper concludes with a discussion of the general applicability of this approach.

10.
Malicious web content detection by machine learning (cited by: 1)
The recent development of dynamic HTML gives attackers a new and powerful technique to compromise computer systems. Malicious dynamic HTML code is usually embedded in a normal webpage, and the malicious webpage infects the victim when a user browses it. Furthermore, such DHTML code can easily disguise itself through obfuscation or transformation, which makes detection even harder. Anti-virus software packages commonly use signature-based approaches, which may not be able to efficiently identify camouflaged malicious HTML code. Therefore, our paper proposes a malicious web page detection method based on machine learning. Our study systematically analyzes the characteristics of malicious webpages and presents important features for machine learning. Experimental results demonstrate that our method is resilient to code obfuscation and can correctly determine whether a webpage is malicious or not.
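A feature extractor in the spirit of this approach might look like the following (the feature set here is our own hypothetical choice, not the paper's; it simply counts markers commonly associated with obfuscated DHTML before a classifier would be trained on the resulting vectors):

```python
import re

def extract_features(html: str) -> dict:
    """Count simple obfuscation/injection markers in a webpage (illustrative only)."""
    return {
        "n_eval": len(re.findall(r"\beval\s*\(", html)),          # eval( calls
        "n_unescape": len(re.findall(r"\bunescape\s*\(", html)),  # unescape( calls
        "n_long_strings": sum(                                    # very long literals
            1 for s in re.findall(r'"([^"]*)"', html) if len(s) > 200
        ),
        "n_iframes": html.lower().count("<iframe"),               # embedded iframes
    }

benign = '<html><body><p>hello</p></body></html>'
suspicious = ('<html><script>eval(unescape("%75%6e"))</script>'
              '<iframe src="x"></iframe></html>')
fb = extract_features(benign)
fs = extract_features(suspicious)
```

Each page becomes a numeric vector; a standard classifier trained on labelled pages would then learn which feature combinations indicate maliciousness, which is what makes the approach resilient to signature-evading transformations.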

11.

In many practical data mining scenarios, such as network intrusion detection, Twitter spam detection, and computer-aided diagnosis, a source domain that is related to, but distributed differently from, the target domain is commonly available. In general, both the source and target domains contain large numbers of unlabelled samples, and labelling every one of them is difficult, expensive, time-consuming, and sometimes unnecessary. It is therefore important and meaningful to fully exploit the labelled and unlabelled samples of both domains to solve the classification task in the target domain. Combining inductive transfer learning with semi-supervised learning, this paper proposes a semi-supervised inductive transfer learning framework named Co-Transfer. Co-Transfer first generates three TrAdaBoost classifiers that transfer from the original source domain to the original target domain, and another three TrAdaBoost classifiers that transfer from the original target domain to the original source domain. Both groups of classifiers are trained on samples drawn with replacement from the originally labelled examples of the two domains. In each iteration of Co-Transfer, each group of TrAdaBoost classifiers is updated with a new training set consisting partly of the original labelled samples, partly of samples labelled by the group itself, and partly of samples labelled by the other group. After the iterations terminate, the ensemble of the three TrAdaBoost classifiers transferring from the original source domain to the original target domain serves as the target-domain classifier. Experimental results on UCI and text classification datasets show that Co-Transfer can effectively exploit the labelled and unlabelled samples of the source and target domains to improve generalization performance.
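The label-exchange loop at the heart of the framework can be sketched in a heavily simplified form (our own illustration: two 1-nearest-neighbour classifiers on two feature views standing in for Co-Transfer's two groups of three TrAdaBoost classifiers, and agreement standing in for confidence):

```python
import numpy as np

def nn_predict(Xl, yl, Xu):
    """1-NN prediction for 1-D features: label of the closest labelled point."""
    d = np.abs(Xu[:, None] - Xl[None, :])
    return yl[d.argmin(axis=1)]

# Two views of the same five samples; only the first two are labelled (-1 = unlabelled).
XA = np.array([0.0, 10.0, 0.5, 9.5, 0.2])
XB = np.array([1.0, 20.0, 1.2, 19.5, 0.8])
y = np.array([0, 1, -1, -1, -1])

for _ in range(3):                              # a few co-training-style rounds
    lab, unl = y >= 0, y < 0
    if not unl.any():
        break
    pa = nn_predict(XA[lab], y[lab], XA[unl])   # view-A predictions
    pb = nn_predict(XB[lab], y[lab], XB[unl])   # view-B predictions
    agree = pa == pb                            # adopt only agreeing labels
    idx = np.where(unl)[0][agree]
    y[idx] = pa[agree]                          # newly labelled samples feed the next round
```

Each classifier labels samples for the other, and the growing labelled pool retrains both, which is the same mutual-labelling dynamic Co-Transfer runs between its two TrAdaBoost groups across domains.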


12.
Background: In recent years, the application of artificial intelligence in the field of sleep medicine has rapidly emerged. One of the main concerns of many researchers is the recognition of sleep positions, which enables efficient monitoring of changes in sleeping posture for precise and intelligent adjustment. In sleep monitoring, machine learning can analyze the raw data collected and optimize the algorithm in real time to recognize the sleeping position of the human body during sleep. Methodology: A detailed search of relevant databases was conducted through a systematic search process; we reviewed research published since 2017, focusing on 27 articles on sleep position recognition. Results: Through the analysis of these articles, we identify several determinants that objectively affect sleeping posture recognition, including the acquisition of sleep posture data, data pre-processing, recognition algorithms, and validation analysis. Moreover, we analyze the categories of sleeping postures adapted to different body types. Conclusion: A systematic evaluation combining the above determinants provides solutions for system design and the rational selection of recognition algorithms for sleep posture recognition; existing machine learning algorithms need to be regularized and standardized before they can be incorporated into the clinical monitoring of sleep.

13.
Background: Software fault prediction is the process of developing models that can be used by software practitioners in the early phases of the software development life cycle to detect faulty constructs such as modules or classes. Various machine learning techniques have been used in the past for predicting faults. Method: In this study we perform a systematic review of studies in the literature from January 1991 to October 2013 that use machine learning techniques for software fault prediction. We assess the performance capability of machine learning techniques in existing research on software fault prediction, and we compare the performance of machine learning techniques with statistical techniques and with other machine learning techniques. Further, the strengths and weaknesses of machine learning techniques are summarized. Results: We identified 64 primary studies and seven categories of machine learning techniques. The results demonstrate the capability of machine learning techniques to classify modules/classes as fault-prone or not fault-prone. Models using machine learning techniques to estimate software fault proneness outperform traditional statistical models. Conclusion: Based on the results of the systematic review, we conclude that machine learning techniques have the ability to predict software fault proneness and can be used by software practitioners and researchers. However, the application of machine learning techniques in software fault prediction is still limited, and more studies should be carried out to obtain well-formed and generalizable results. We provide future guidelines for practitioners and researchers based on the results obtained in this work.

14.
Curated collections of models are essential for the success of Machine Learning (ML) and Data Analytics in Model-Driven Engineering (MDE). However, current datasets are either too small or not properly curated. In this paper, we present ModelSet, a dataset composed of 5,466 Ecore models and 5,120 UML models which have been manually labelled to support ML tasks. We describe the structure of the dataset and explain how to use the associated library to develop ML applications in Python. Finally, we present some applications which can be addressed using ModelSet. Tool website: https://github.com/modelset

15.
Recent theoretical and practical studies have revealed that malware is one of the most harmful threats to the digital world. Malware mitigation techniques have evolved over the years to ensure security. Earlier, several classical methods were used for detecting malware based on various features such as signatures and heuristics. Traditional malware detection techniques were unable to defeat new generations of malware and their sophisticated obfuscation tactics. Deep Learning (DL) is increasingly used in malware detection, as DL-based systems outperform conventional malware detection approaches at finding new malware variants. Furthermore, DL-based techniques provide rapid malware prediction with excellent detection rates and analysis of different malware types. This work therefore investigates recently proposed Deep Learning-based malware detection systems and their evolution, offering a thorough analysis of recently developed DL-based detection techniques. Furthermore, current trending malware is studied, and detection techniques for mobile malware (both Android and iOS), Windows malware, IoT malware, Advanced Persistent Threats (APTs), and ransomware are reviewed in detail.

16.
Steels of different classes (austenitic, martensitic, pearlitic, etc.) have different applications and characteristic property ranges. In the present work, two methods are used to predict steel class from composition and heat treatment parameters: the physically-based Calphad method and a data-driven machine learning method. They are applied to the same dataset, collected from open sources (mostly steels for high-temperature applications). A classification accuracy of 93.6% is achieved by the machine learning model, trained on the concentrations of three elements (C, Cr, Ni) and heat treatment parameters (heating temperatures). The Calphad method gives 76% accuracy, based on the temperature and cooling rate. The reasons for misclassification by both methods are discussed, and it is shown that part of the errors are caused by ambiguity or inaccuracy in the data or by limitations of the models used; for the remaining cases, reasonable classification accuracy is demonstrated. We suggest that the machine learning classifier prevails because of small variations in the data that do not in fact change the steel class: the properties of a steel should be insensitive to the details of the manufacturing process.

17.
Context: Research related to code clones includes the detection of clones in software systems and the analysis, visualization, and management of clones. The detection of semantic clones and the management of clones have attracted the use of machine learning techniques in code clone research. Objective: The aim of this study is to report the extent of machine learning usage in code clone research areas. Method: The paper uses a systematic review method to report the use of machine learning in research related to code clones. The study considers a comprehensive set of 57 articles published in leading conferences, workshops, and journals. Results: Code clone research using machine learning techniques is classified into different categories. The machine learning and deep learning algorithms used in code clone research are reported, as are the datasets and features used to train the models and the metrics used to evaluate them. The comparative results of the various machine learning algorithms presented in the primary studies are reported. Conclusion: This research helps to identify the status of machine learning use in different code clone research areas. We identify the need for more empirical studies to assess the benefits of machine learning in code clone research and give recommendations for future research.

18.
Point clouds are increasingly being used to improve productivity, quality, and safety throughout the life cycle of construction and infrastructure projects. While applicable for visualizing construction projects, point clouds lack meaningful semantic information. Thus, the theoretical benefits of point clouds, such as productivity, quality, and safety improvement, in the construction and infrastructure domains can only be achieved after the processing of point clouds. Manual processing of point cloud datasets is costly, time-consuming, and error-prone. A variety of automatic approaches, such as machine learning methods, are adopted in different steps of automatic processing of point clouds. This article surveys recent research on point cloud datasets, which were automatically processed with machine learning methods in construction and infrastructure industries. An outline for future research is proposed based on identified research gaps. This review paper aims to be a reference for researchers to acknowledge the state-of-the-art applications of automatically-processed point cloud models in construction and infrastructure domains and a guide to assist stakeholders in developing automatic procedures in construction and infrastructure industries.

19.
A multi-network voting model for improving CAS performance (cited by: 2)
The cascade-correlation network proposed by Fahlman and Lebiere is a typical constructive (growth) algorithm for adaptive neural networks. It is flexible and efficient, but the algorithm involves many sources of uncertainty, introducing numerous free parameters during the growth process; these, together with randomly chosen initial weights, are two direct causes of overfitting in a single network. The basic idea of the multi-network voting model proposed in this paper is to determine the solution for an unknown pattern by a vote over multiple networks; owing to the averaging effect, this avoids the bias of a single network's prediction and achieves satisfactory results. Using the PC-FARM computing environment we have built, we also verify experimentally the advantage of the network voting model.

20.
Large-area land-cover monitoring scenarios, involving large volumes of data, are becoming more prevalent in remote sensing applications. Thus, there is a pressing need for increased automation in the change mapping process. The objective of this research is to compare the performance of three machine learning algorithms (MLAs), namely two classification tree software routines (S-plus and C4.5) and an artificial neural network (ARTMAP), in the context of mapping land-cover modifications in northern and southern California study sites between 1990/91 and 1996. Comparisons were based on several criteria: overall accuracy, sensitivity to dataset size and variation, and noise. ARTMAP produced the most accurate maps overall (84%) for the two study areas in southern and northern California, and was the most resistant to training data deficiencies. The change map generated using ARTMAP has accuracy similar to a human-interpreted map produced by the U.S. Forest Service in the southern study area. ARTMAP appears to be robust and accurate for automated, large-area change monitoring, as it performed equally well across the diverse study areas with minimal human intervention in the classification process.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号