Similar literature
 20 similar documents found (search time: 31 ms)
1.
Whenever machine learning is used to prevent illegal or unsanctioned activity and there is an economic incentive, adversaries will attempt to circumvent the protection provided. Constraints on how adversaries can manipulate training and test data for classifiers used to detect suspicious behavior make problems in this area tractable and interesting. This special issue highlights papers that span many disciplines including email spam detection, computer intrusion detection, and detection of web pages deliberately designed to manipulate the priorities of pages returned by modern search engines. The four papers in this special issue provide a standard taxonomy of the types of attacks that can be expected in an adversarial framework, demonstrate how to design classifiers that are robust to deleted or corrupted features, demonstrate the ability of modern polymorphic engines to rewrite malware so it evades detection by current intrusion detection and antivirus systems, and provide approaches to detect web pages designed to manipulate web page scores returned by search engines. We hope that these papers and this special issue encourage the multidisciplinary cooperation required to address many interesting problems in this relatively new area, including predicting the future of the arms races created by adversarial learning, developing effective long-term defensive strategies, and creating algorithms that can process the massive amounts of training and test data available for internet-scale problems.

2.
In many information analysis tasks, one is often confronted with data of thousands to millions of dimensions, such as images, documents, videos, web data, bioinformatics data, etc. Conventional statistical and computational tools are severely inadequate for processing and analysing high-dimensional data due to the curse of dimensionality, where we often need to conduct inference with a limited number of samples. On the other hand, naturally occurring data may be generated by structured systems with possibly much fewer degrees of freedom than the ambient dimension would suggest. Recently, various works have considered the case when the data is sampled from a submanifold embedded in the much higher dimensional Euclidean space. Learning with full consideration of the low dimensional manifold structure, or specifically the intrinsic topological and geometrical properties of the data manifold, is referred to as manifold learning, which has been receiving growing attention in our community in recent years. This special issue aims to attract articles that (a) address the frontier problems in the scientific principles of manifold learning, and (b) report empirical studies and applications of manifold learning algorithms, including but not limited to pattern recognition, computer vision, web mining, image processing and so on. A total of 13 submissions were received. The papers included in this special issue were selected based on the reviews by experts in the subject area according to the journal's procedure and quality standard. Each paper was reviewed by at least two reviewers and some of the papers were revised for two rounds according to the reviewers' comments. The special issue includes 6 papers in total: 3 papers on the foundational theories of manifold learning, 2 papers on graph-based methods, and 1 paper on the application of manifold learning to video compression. 
The papers on the foundational theories of manifold learning cover the generalization ability of manifold learning, manifold ranking, and multi-manifold factorization. In the paper entitled ``Manifold Learning: Generalizing Ability and Tangential Proximity'', Bernstein and Kuleshov propose a tangential proximity based technique to address the generalized manifold learning problem. The proposed method ensures not only proximity between the points and their reconstructed values but also proximity between the corresponding tangent spaces. The traditional manifold ranking methods are based on the Laplacian regularization, which suffers from the issue that the solution is biased towards constant functions. To overcome this issue, in the paper entitled ``Manifold Ranking using Hessian Energy'', Guan et al. propose to use the second-order Hessian energy as regularization for manifold ranking. In the paper entitled ``Multi-Manifold Concept Factorization for Data Clustering'', Li et al. incorporate the multi-manifold ensemble learning into concept factorization to better preserve the local structure of the data, thus yielding more satisfactory clustering results. The papers on graph-based methods cover label propagation and graph-based dimensionality reduction. In the paper entitled ``Bidirectional Label Propagation over Graphs'', Liu et al. propose a novel label propagation algorithm to propagate labels along positive and negative edges in the graph. The construction of the graph differs from the conventional approach in that it incorporates the dissimilarity among data points into the affinity matrix. In the paper entitled ``Locally Regressive Projections'', Lijun Zhang proposes a novel graph-based dimensionality reduction method that captures the local discriminative structure of the data space. 
The key idea is to fit a linear model locally around each data point, and then use the fitting error to measure the performance of dimensionality reduction. In the last paper entitled ``Combining Active and Semi-Supervised Learning for Video Compression'', motivated by manifold regularization, Zhang and Ji propose a machine learning approach for video compression. Active learning is used to select the most representative pixels in the encoding process, and semi-supervised learning is used to recover the color video in the decoding process. One remarkable property of this approach is that the active learning algorithm shares the same loss function as the semi-supervised learning algorithm, providing a unified framework for video compression. Many people have been involved in making this special issue possible. The guest editor would like to express his gratitude to all the contributing authors for their insightful work on manifold learning. The guest editor would like to thank the reviewers for their comments and useful suggestions, which improved the quality of the papers. The guest editor would also like to thank Prof. Ruqian Lu, the editor-in-chief of the International Journal of Software and Informatics, for providing the precious opportunity to publish this special issue. Finally, we hope the reader will enjoy this special issue and find it useful.
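
The bias of Laplacian regularization towards constant functions, mentioned in the abstract above, can be illustrated with the common graph form of the regularizer. This is only a sketch under assumptions: the abstract gives no formula, and the symbols used here (symmetric nonnegative edge weights $w_{ij}$ collected in $W$, the diagonal degree matrix $D$ with $D_{ii} = \sum_j w_{ij}$, and $L = D - W$) are introduced for illustration.

\[
f^\top L f \;=\; \frac{1}{2}\sum_{i,j} w_{ij}\,(f_i - f_j)^2 \;\ge\; 0,
\qquad
f = c\,\mathbf{1} \;\Longrightarrow\; f^\top L f = 0 .
\]

Every constant ranking function therefore incurs zero penalty, so a strongly regularized solution is pulled towards a constant. This is presumably the motivation for replacing the first-order penalty with a second-order one such as the Hessian energy.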

3.
Big data has become an important issue for a large number of research areas such as data mining, machine learning, computational intelligence, information fusion, the semantic Web, and social networks. The rise of different big data frameworks such as Apache Hadoop and, more recently, Spark, for massive data processing based on the MapReduce paradigm has allowed for the efficient utilisation of data mining methods and machine learning algorithms in different domains. A number of libraries such as Mahout and Spark MLlib have been designed to develop new efficient applications based on machine learning algorithms. The combination of big data technologies and traditional machine learning algorithms has generated new and interesting challenges in other areas such as social media and social networks. These new challenges are focused mainly on problems such as data processing, data storage, data representation, and how data can be used for pattern mining, analysing user behaviours, and visualizing and tracking data, among others. In this paper, we present a review of the new methodologies that are designed to allow for efficient data mining and information fusion from social media, and of the new applications and frameworks that are currently appearing under the “umbrella” of the social networks, social media and big data paradigms.

4.
This article serves as an introduction to the Special Issue on Metalearning and Algorithm Selection. The introduction is divided into two parts. In the first section, we give an overview of how the field of metalearning has evolved in the last 1–2 decades and mention how some of the papers in this special issue fit in. In the second section, we discuss the contents of this special issue. We divide the papers into thematic subgroups and provide information about each subgroup as well as about the individual papers. Our main aim is to highlight how the papers selected for this special issue contribute to the field of metalearning.

5.

Machine learning algorithms typically rely on optimization subroutines and are well known to provide very effective outcomes for many types of problems. Here, we flip the reliance and ask the reverse question: can machine learning algorithms lead to more effective outcomes for optimization problems? Our goal is to train machine learning methods to automatically improve the performance of optimization and signal processing algorithms. As a proof of concept, we use our approach to improve two popular data processing subroutines in data science: stochastic gradient descent and greedy methods in compressed sensing. We provide experimental results that demonstrate the answer is “yes”, machine learning algorithms do lead to more effective outcomes for optimization problems, and show the future potential for this research direction. In addition to our experimental work, we prove relevant Probably Approximately Correct (PAC) learning theorems for our problems of interest. More precisely, we show that there exists a learning algorithm that, with high probability, will select the algorithm that optimizes the average performance on an input set of problem instances with a given distribution.
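
The selection rule described at the end of this abstract (pick the algorithm with the best average performance on a sample of problem instances) can be sketched as follows. This is a toy illustration, not the authors' method: the candidate "algorithms" are gradient descent runs with different step sizes on a 1-D quadratic, and the names `run_gradient_descent`, `select_algorithm`, the step sizes and the instance distribution are all hypothetical.

```python
import random
from statistics import mean


def run_gradient_descent(step_size, curvature, steps=20, x0=1.0):
    """Minimize f(x) = 0.5 * curvature * x**2 and return |x| after `steps` updates."""
    x = x0
    for _ in range(steps):
        x -= step_size * curvature * x  # gradient of f is curvature * x
    return abs(x)


def select_algorithm(candidates, instances, cost):
    """Return the candidate with the lowest average cost over the sampled instances."""
    averages = {
        name: mean(cost(param, inst) for inst in instances)
        for name, param in candidates.items()
    }
    best = min(averages, key=averages.get)
    return best, averages


# Hypothetical instance distribution: curvatures drawn uniformly from [0.5, 2.0].
rng = random.Random(0)
instances = [rng.uniform(0.5, 2.0) for _ in range(200)]

# Hypothetical candidate set: the same method with four different step sizes.
candidates = {"step=0.01": 0.01, "step=0.1": 0.1, "step=0.5": 0.5, "step=1.2": 1.2}

best, averages = select_algorithm(candidates, instances, run_gradient_descent)
print(best, averages[best])
```

With more sampled instances, the empirical average approaches the expected cost under the instance distribution, which is the kind of guarantee a PAC-style statement formalizes.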


6.
Clustering is a powerful machine learning technique that groups “similar” data points based on their characteristics. Many clustering algorithms work by approximating the minimization of an objective function, namely the sum of within-the-cluster distances between points. The straightforward approach involves examining all the possible assignments of points to each of the clusters. This approach guarantees the solution will be a global minimum; however, the number of possible assignments scales quickly with the number of data points and becomes computationally intractable even for very small datasets. In order to circumvent this issue, cost function minima are found using popular local search-based heuristic approaches such as k-means and hierarchical clustering. Due to their greedy nature, such techniques do not guarantee that a global minimum will be found and can lead to sub-optimal clustering assignments. Other classes of global search-based techniques, such as simulated annealing, tabu search, and genetic algorithms, may offer better quality results but can be too time-consuming to implement. In this work, we describe how quantum annealing can be used to carry out clustering. We map the clustering objective to a quadratic binary optimization problem and discuss two clustering algorithms which are then implemented on commercially available quantum annealing hardware, as well as on a purely classical solver “qbsolv.” The first algorithm assigns N data points to K clusters, and the second one can be used to perform binary clustering in a hierarchical manner. We present our results in the form of benchmarks against well-known k-means clustering and discuss the advantages and disadvantages of the proposed techniques.
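
One way to map "assign N points to K clusters" onto a quadratic binary optimization problem is sketched below. The formulation is an assumption made for illustration (one-hot variables `x[i, c]`, within-cluster pairwise distances as the objective, and a quadratic penalty enforcing one cluster per point); the abstract does not state the authors' exact encoding. A brute-force search stands in for the annealer or `qbsolv`, so it is only practical for a handful of points.

```python
from itertools import product


def build_qubo(points, k, penalty):
    """Build an upper-triangular QUBO dict {(u, v): weight} for one-hot clustering."""
    n = len(points)
    qubo = {}

    def var(i, c):
        return i * k + c  # index of binary variable "point i is in cluster c"

    def add(u, v, w):
        key = (min(u, v), max(u, v))
        qubo[key] = qubo.get(key, 0.0) + w

    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    # Objective: pay d(i, j) whenever points i and j share a cluster.
    for c in range(k):
        for i in range(n):
            for j in range(i + 1, n):
                add(var(i, c), var(j, c), dist(points[i], points[j]))

    # Penalty (sum_c x_ic - 1)^2, expanded with x^2 = x and the constant dropped:
    # -sum_c x_ic + 2 * sum_{c < c'} x_ic * x_ic'
    for i in range(n):
        for c in range(k):
            add(var(i, c), var(i, c), -penalty)
            for c2 in range(c + 1, k):
                add(var(i, c), var(i, c2), 2 * penalty)
    return qubo


def energy(qubo, bits):
    return sum(w * bits[u] * bits[v] for (u, v), w in qubo.items())


def brute_force(qubo, num_vars):
    """Exhaustive minimization; a stand-in for an annealer on tiny problems."""
    return min(product((0, 1), repeat=num_vars), key=lambda b: energy(qubo, b))


points = [(0, 0), (0, 1), (5, 5), (5, 6)]  # two obvious groups
k = 2
qubo = build_qubo(points, k, penalty=20.0)
bits = brute_force(qubo, len(points) * k)
labels = [max(range(k), key=lambda c: bits[i * k + c]) for i in range(len(points))]
print(labels)
```

The penalty weight has to be large enough that leaving a point unassigned, or assigning it twice, never pays off; 20.0 is a hand-picked value for this toy data, not a general recommendation.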

7.
Machine learning deals with the issue of how to build programs that improve their performance at some task through experience. Machine learning algorithms have proven to be of great practical value in a variety of application domains. They are particularly useful for (a) poorly understood problem domains where little knowledge exists for the humans to develop effective algorithms; (b) domains where there are large databases containing valuable implicit regularities to be discovered; or (c) domains where programs must adapt to changing conditions. Not surprisingly, the field of software engineering turns out to be a fertile ground where many software development and maintenance tasks could be formulated as learning problems and approached in terms of learning algorithms. This paper deals with the subject of applying machine learning in software engineering. In the paper, we first provide the characteristics and applicability of some frequently utilized machine learning algorithms. We then summarize and analyze the existing work and discuss some general issues in this niche area. Finally we offer some guidelines on applying machine learning methods to software engineering tasks and use some software development and maintenance tasks as examples to show how they can be formulated as learning problems and approached in terms of learning algorithms.

8.
Probabilistic graphical models have had a tremendous impact in machine learning and approaches based on energy function minimization via techniques such as graph cuts are now widely used in image segmentation. However, the free parameters in energy function-based segmentation techniques are often set by hand or using heuristic techniques. In this paper, we explore parameter learning in detail. We show how probabilistic graphical models can be used for segmentation problems to illustrate Markov random fields (MRFs), their discriminative counterparts conditional random fields (CRFs) as well as kernel CRFs. We discuss the relationships between energy function formulations, MRFs, CRFs, hybrids based on graphical models and their relationships to key techniques for inference and learning. We then explore a series of novel 3D graphical models and present a series of detailed experiments comparing and contrasting different approaches for the complete volumetric segmentation of multiple organs within computed tomography imagery of the abdominal region. Further, we show how these modeling techniques can be combined with state of the art image features based on histograms of oriented gradients to increase segmentation performance. We explore a wide variety of modeling choices, discuss the importance and relationships between inference and learning techniques and present experiments using different levels of user interaction. We go on to explore a novel approach to the challenging and important problem of adrenal gland segmentation. We present a 3D CRF formulation and compare with a novel 3D sparse kernel CRF approach we call a relevance vector random field. The method yields state of the art performance and avoids the need to discretize or cluster input features. 
We believe our work is the first to provide quantitative comparisons between traditional MRFs with edge-modulated interaction potentials and CRFs for multi-organ abdominal segmentation and the first to explore the 3D adrenal gland segmentation problem. Finally, along with this paper we provide the labeled data used for our experiments to the community.

9.
Business process modeling is an essential task in business process management. Process models that are comprehensively understood by business stakeholders allow organizations to profit from this field. In this work, we report what is being investigated in the topic “visualization of business process models”, since visualization is known to improve perception and comprehension of structures and patterns in datasets. We performed a systematic literature review through which we selected and analyzed 46 papers from two points of view. Firstly, we observed the similarities between the papers regarding their main scope. From this observation we classified the papers into six categories: “Augmentation of existing elements”, “Creation of new elements”, “Exploration of the 3D space”, “Information visualization”, “Visual feedback concerning problems detected in process models” and “Perspectives”. The less explored categories, which could represent research challenges for further exploration, are “Visual feedback” and “Information visualization”. Secondly, we analyzed the papers based on a well-known visualization analysis framework, which allowed us to obtain a high-level view of the proposals presented in the literature and to identify that few authors explore user interaction features in their works. Besides that, we also found that exactly half of the papers base their proposals on BPMN and present results from evaluation or validation. Since BPMN is an ISO standard and there are many tools based on BPMN, there should be more research intending to improve the knowledge around this topic. We expect that our results will inspire researchers to pursue further work that advances the field of business process model visualization, so that the advantages of information visualization can support the tasks of business process modeling and management.

10.
Annotating linguistic data has become a major field of interest, both for supplying the necessary data for machine learning approaches to NLP applications, and as a research issue in its own right. This comprises issues of technical formats, tools, and methodologies of annotation. We provide a brief overview of these notions and then introduce the papers assembled in this special issue.

11.
Introduction to the Special Issue on Meta-Learning   Total citations: 1, self-citations: 0, citations by others: 1
Recent advances in meta-learning are providing the foundations to construct meta-learning assistants and task-adaptive learners. The goal of this special issue is to foster an interest in meta-learning by compiling representative work in the field. The contributions to this special issue provide strong insights into the construction of future meta-learning tools. In this introduction we present a common frame of reference to address work in meta-learning through the concept of meta-knowledge. We show how meta-learning can be simply defined as the process of exploiting knowledge about learning that enables us to understand and improve the performance of learning algorithms.

12.
We have witnessed the tremendous momentum of the second spring of parallel computing in recent years. But we should remember the low point of the field more than 20 years ago and review the lessons that led to the question, raised at that time, of whether “parallel computing will soon be relegated to the trash heap reserved for promising technologies that never quite make it” in an article entitled “the death of parallel computing” written by the late Ken Kennedy, a prominent leader in parallel computing. Facing the new era of parallel computing, we should learn from the robust history of sequential computation over the past 60 years. We should study the foundation established by the Turing machine model (1936) and its profound impact on this history. To this end, this paper examines the disappointing state of the work on parallel Turing machine models in the past 50 years of parallel computing research. The lack of a solid yet intuitive parallel Turing machine model will continue to be a serious challenge for future parallel computing. Our paper attempts to address this challenge by proposing a parallel Turing machine model. We also discuss why we start our work in this paper from a parallel Turing machine model instead of other choices.

13.
Severe weather, including tornadoes, thunderstorms, wind, and hail, annually causes significant loss of life and property. We are developing spatiotemporal machine learning techniques that will enable meteorologists to improve the prediction of these events by improving their understanding of the fundamental causes of the phenomena and by building skillful empirical predictive models. In this paper, we present significant enhancements of our Spatiotemporal Relational Probability Trees that enable autonomous discovery of spatiotemporal relationships as well as learning with arbitrary shapes. We focus our evaluation on two real-world case studies using our technique: predicting tornadoes in Oklahoma and predicting aircraft turbulence in the United States. We also discuss how to evaluate success for a machine learning algorithm in the severe weather domain, which will enable new methods such as ours to transfer from research to operations, provide a set of lessons learned for embedded machine learning applications, and discuss how to field our technique.

14.
Wearables paired with data analytics and machine learning algorithms that measure physiological (and other) parameters are slowly finding their way into our workplace. Several studies have reported positive effects from using such “physiolytics” devices and advanced the notion that they may lead to significant workplace safety improvements or to increased awareness among employees concerning unhealthy work practices and other job-related health and well-being issues. At the same time, physiolytics may cause an overdependency on technology and create new constraints on privacy, individuality, and personal freedom. While it is easy to understand why organizations are implementing physiolytics, it remains unclear what employees think about using wearables at their workplace. Using an affordance theory lens, we therefore explore the mental models of employees who are faced with the introduction of physiolytics as part of corporate wellness or security programs. We identify five distinct user types, each of which characterizes a specific viewpoint on physiolytics at the workplace: the freedom loving, the individualist, the cynical, the tech independent, and the balancer. Our findings allow for a better understanding of the wider implications of, and possible user responses to, the introduction of wearable technologies in occupational settings and address the need for opening up the “user black box” in IS use research.

15.
As a field, Grammatical Inference addresses both theoretical and empirical learning problems, and the collection of papers within this special issue attests both to the diversity of these problems and to the advances and insights that are being made by the researchers working within it. Thus we hope this special issue is of interest to the readership of Machine Learning.

16.
To analyze the research hotspots on the application of machine learning methods in the field of ergonomics, we collected 1141 articles related to machine learning methods in the field of ergonomics from 2014 to 2021 from the Web of Science (WoS) database. We then used CiteSpace V 6.1.R2 to generate network maps and analyze the authors, institutions, countries, co-cited literature, and keywords. Results show that the correlation between authors in the author co-occurrence network is not strong, which indicates low cooperation among authors. In the analysis of research institutions, the University of Southampton in the United Kingdom is the most frequently cited institution. However, the US leads the country co-occurrence network. “System” and “Model” are the top two cited keywords, while “Methodology” and “Decision-making” were active from 2015 to 2018, with a longer development time. Other keywords, including “Musculoskeletal disorders”, “Performance”, “Low back pain”, “Health”, and “Risk Factors”, are among the most frequently cited keywords and have a high betweenness centrality. “Validation” and “Prediction” have recently become popular keywords in this field. Therefore, we conclude that the application of machine learning methods in the field of ergonomics will continue to increase year by year and that the development of machine learning methods in the field of ergonomics is gaining importance due to its cross-disciplinary nature. In ergonomics, machine learning methods will be further developed and widely used.

17.
Ethnographic approaches to study of work in the field have been widely adopted by HCI researchers as resources for investigation of work settings and for requirements elicitation. Although the value of fieldwork for design is widely recognised, difficulties surround the exploitation of fieldwork data within the design process. Since not every development project can support or justify large-scale field investigation, the issue of how to build on previous work within a domain is particularly important. In this paper we consider this issue in the context of development of mobile healthcare applications. Many such systems will be built in the coming years, and already a number of influential studies have derived concepts from fieldwork data and used them to support analysis of healthcare work. Using a patient review process as an example, we examine how the concepts from such exemplar studies can be leveraged to analyse fieldwork data, and to facilitate requirements elicitation. The concepts, previous interpretation within the domain, prototypical requirements and associated critique together provide a framework for analysis. The concepts are used to highlight issues that must be addressed and to derive requirements. We make the case that these concepts are not “value free” and that the course of our analysis is significantly altered through the palette of concepts used. The methodological implications of this proposition are also considered.

18.
The Web, with its resources and users, offers a wealth of information for data mining and knowledge discovery. Up to now, a great deal of work has been done applying data mining and machine learning methods to discover novel and useful knowledge on the Web. However, many techniques aim only at extracting knowledge for human users to view and use. Recently, more and more work addresses mining the Web for knowledge that computer systems will use. Such actionable knowledge can be applied back to the Web for measurable performance improvements. This special issue of IEEE Intelligent Systems features five articles that address the problem of actionable Web mining.

19.
In this position paper, we summarize the history, current activities and future topics of IFAC Technical Committee (TC) 5.2 “Manufacturing Modelling for Management and Control”. As a special focus, we discuss the results of the 9th IFAC Conference MIM 2019, which was recently organized by IFAC TC 5.2 in Berlin, Germany and attended by 740 participants. We analyse the current activities of the working groups within TC 5.2 and project some future research directions. Finally, we present an overview of papers from TC 5.2 in this special issue.

20.
Although machine learning is becoming commonly used in today's software, there has been little research into how end users might interact with machine learning systems, beyond communicating simple “right/wrong” judgments. If the users themselves could work hand-in-hand with machine learning systems, the users’ understanding and trust of the system could improve and the accuracy of learning systems could be improved as well. We conducted three experiments to understand the potential for rich interactions between users and machine learning systems. The first experiment was a think-aloud study that investigated users’ willingness to interact with machine learning reasoning, and what kinds of feedback users might give to machine learning systems. We then investigated the viability of introducing such feedback into machine learning systems, specifically, how to incorporate some of these types of user feedback into machine learning systems, and what their impact was on the accuracy of the system. Taken together, the results of our experiments show that supporting rich interactions between users and machine learning systems is feasible for both user and machine. This shows the potential of rich human–computer collaboration via on-the-spot interactions as a promising direction for machine learning systems and users to collaboratively share intelligence.
