Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Crowdsourcing allows large-scale and flexible invocation of human input for data gathering and analysis, which introduces a new paradigm for the data mining process. Traditional data mining methods often require experts in the analytic domain to annotate the data, which is expensive and usually takes a long time. Crowdsourcing enables the use of heterogeneous background knowledge from volunteers and distributes the annotation process into small portions of effort from different contributors. This paper reviews the state of the art in crowdsourcing for data mining in recent years. We first review the challenges and opportunities of data mining tasks that use crowdsourcing and summarize their framework. Then we highlight several exemplar works in each component of the framework, including question design, data mining, and quality control. Finally, we conclude with the limitations of crowdsourcing for data mining and suggest related areas for future research.

2.
The number of ontologies and semantic annotations available on the Web is constantly growing. This new type of complex and heterogeneous graph-structured data raises new challenges for the data mining community. In this paper, we present a novel method for mining association rules from semantic instance data repositories expressed in RDF/(S) and OWL. We take advantage of the schema-level (i.e. Tbox) knowledge encoded in the ontology to derive appropriate transactions, which then feed traditional association rule algorithms. This process is guided by the analyst's requirements, expressed in the form of query patterns. Initial experiments performed on semantic data from a biomedical application show the usefulness and efficiency of the approach.
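
As a rough illustration of the final step described above (feeding derived transactions to a traditional association rule algorithm), the sketch below mines single-antecedent rules by support and confidence. It does not reproduce the paper's method: it assumes the transactions have already been derived from the RDF/OWL data, and the item names, thresholds, and the `mine_rules` function are hypothetical.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) tuples for item pairs."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]

    def support(items):
        # Fraction of transactions that contain every item in `items`.
        return sum(1 for s in sets if items <= s) / n

    items = sorted({i for s in sets for i in s})
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(frozenset((a, b)))
        if pair_support < min_support:
            continue
        # Try the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset((lhs,)))
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical transactions, one per patient record (not taken from the paper).
transactions = [
    {"drug:aspirin", "finding:ulcer"},
    {"drug:aspirin", "finding:ulcer", "finding:anemia"},
    {"drug:aspirin"},
    {"drug:ibuprofen", "finding:ulcer"},
]
rules = mine_rules(transactions, min_support=0.5, min_confidence=0.6)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs} (support={sup:.2f}, confidence={conf:.2f})")
```

Only item pairs are considered here to keep the sketch short; a full Apriori-style algorithm would grow larger itemsets level by level.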

3.
An Overview of Data Mining and Knowledge Discovery   Total citations: 9, self-citations: 0, citations by others: 9
With massive amounts of data stored in databases, mining information and knowledge in databases has become an important issue in recent research. Researchers in many different fields have shown great interest in data mining and knowledge discovery in databases. Several emerging applications in information-providing services, such as data warehousing and on-line services over the Internet, also call for various data mining and knowledge discovery techniques to understand user behavior better, to improve the service provided, and to increase business opportunities. In response to such a demand, this article provides a comprehensive survey of recently developed data mining and knowledge discovery techniques and introduces some real application systems as well. In conclusion, the article also lists some problems and challenges for further research.

4.
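Based on the actual state of water-affairs informatization in Shanghai, this work combines multi-objective, multi-task, and multi-level middleware, data bus, data mining, and unified data exchange and sharing management technologies that meet the needs of integrated water-affairs management. With them it builds a water-affairs data exchange platform that can ingest multi-source heterogeneous data and supports real-time monitoring and management. The work offers useful research and practice toward the organic integration, sharing, and mining of water-affairs data.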

5.
Recently the coupling of proton transfer reaction ionization with a time-of-flight mass analyser (PTR-TOF-MS) has been proposed to realise a volatile organic compound (VOC) detector that overcomes the time and mass resolution limitations of the previous instrument, which was based on a quadrupole mass analyser (PTR-Quad-MS). This opens new horizons for research and allows for new applications in fields where the rapid and sensitive monitoring and quantification of VOCs is crucial, for instance environmental sciences, food sciences, and medicine. In particular, if coupled with appropriate data mining methods, it can provide a fast MS-nose system with rich analytical information. The main, perhaps even the only, drawback of this new technique in comparison to its precursor is the increased size and complexity of the data sets obtained. This appears to be the main limitation to its full use and widespread application. Here we present and discuss a complete computer-based strategy for the analysis of PTR-TOF-MS data, from basic mass spectra handling to the application of up-to-date data mining methods. As a case study we apply the whole procedure to the classification of apple cultivars and clones, based on the distinctive profiles of their volatile organic compound emissions.

6.
This paper proposes a data envelopment analysis (DEA) approach to the measurement and benchmarking of service quality. Treating the measurement of overall service quality of multiple units with SERVPERF as a multiple-criteria decision-making (MCDM) problem, the proposed approach utilizes DEA, in particular the pure output model without inputs. The five dimensions of SERVPERF are considered as outputs of the DEA model. A case study of auto repair services is provided for the purpose of illustration. The current practice of benchmarking service quality with SERVQUAL/SERVPERF is limited in that there is little guidance on whom to benchmark against and to what degree service quality should be improved. This study contributes to the field of service quality benchmarking by overcoming these limitations, taking advantage of DEA’s capability to handle MCDM problems and provide benchmarking guidelines.

7.
Survey of data management and analysis in disaster situations   Total citations: 1, self-citations: 0, citations by others: 1
The area of disaster management receives increasing attention from multiple disciplines of research. A key role of computer scientists has been in devising ways to manage and analyze the data produced in disaster management situations. In this paper we make an effort to survey and organize the current knowledge in the management and analysis of data in disaster situations, as well as present the challenges and future research directions. Our findings come as a result of a thorough bibliography survey as well as our hands-on experience from building a Business Continuity Information Network (BCIN) in collaboration with the Miami-Dade county emergency management office. We organize our findings across the following Computer Science disciplines: data integration and ingestion, information extraction, information retrieval, information filtering, data mining, and decision support. We conclude by presenting specific research directions.

8.
Product family design and product configuration based on data mining technology is identified as an intelligent and automated means to improve the efficiency of product development. However, few previous studies have proposed a systematic product family design method based on data mining technology. To make up for this deficiency, this research puts forward a systematic data-mining-based method for product family design and product configuration. First, the customer requirement information and product engineering information in historical orders are formatted into structured data. Second, principal component analysis is performed on the historical orders to extract the customers' differentiated needs. Third, an association rule algorithm is introduced to mine the rules between differentiated needs and module instances in the historical orders, thus obtaining the configuration knowledge that links customer needs to product engineering. Fourth, the mined rules are used to construct an association rule-based classifier (CBA), which is employed to sort out the best product configuration schemes as popular product variants. Fifth, a sequence alignment technique is employed to identify modules for the popular product variants, so that the module instances are divided into optional, common, and special modules, and the product platform is generated from the common modules. Finally, according to new customer needs, the CBA classifier is used to recommend the best configuration schemes, and popular product variants are then configured based on the product platform. The feasibility of the proposed method is demonstrated by a product family design example of desktop computer hosts.

9.
For product design and development, crowdsourcing shows huge potential for fostering creativity and has been regarded as one important approach to acquiring innovative concepts. Nevertheless, before the approach can be effectively implemented, the following challenges concerning crowdsourcing should be properly addressed: (1) a burdensome concept review process to deal with a large number of crowd-sourced design concepts; (2) insufficient consideration of how to integrate design knowledge and principles into existing data processing methods/algorithms for crowdsourcing; and (3) lack of a quantitative decision support process to identify better concepts. To tackle these problems, a product concept evaluation and selection approach comprising three modules is proposed: (1) a data mining module to extract meaningful information from online crowd-sourced concepts; (2) a concept re-construction module to organize word tokens into a unified frame using domain ontology and extended design knowledge; and (3) a decision support module to select better concepts in a simplified manner. A pilot study on future PC (personal computer) design was conducted to demonstrate the proposed approach. The results show that the proposed approach is promising and may help to improve the efficiency of concept review and evaluation, facilitate data processing using design knowledge, and enhance the reliability of concept selection decisions.

10.
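This paper studies a traffic parameter detection method based on the fusion of inductive loop and video data. It analyzes the complementary relationship between the information from these two heterogeneous sensors, improves D-S evidence theory, and uses the improved theory to fuse the detection information from the inductive loops and the video, in order to increase the reliability and response speed of traffic incident detection.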
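
For readers unfamiliar with D-S evidence theory, the sketch below shows the classic Dempster rule of combination applied to two sensors. It is the textbook rule, not the improved variant the abstract refers to (which is not described here); the hypothesis names, the mass values, and the `combine` function are hypothetical.

```python
def combine(m1, m2):
    """Combine two mass functions (dicts keyed by frozenset) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                # Agreeing evidence supports the intersection of the two hypotheses.
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                # Disjoint hypotheses contribute to the conflict mass.
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


INCIDENT = frozenset({"incident"})
NORMAL = frozenset({"normal"})
EITHER = INCIDENT | NORMAL  # ignorance: the sensor cannot tell

# Hypothetical mass assignments for one time window (illustrative values only).
loop_mass = {INCIDENT: 0.6, NORMAL: 0.1, EITHER: 0.3}
video_mass = {INCIDENT: 0.7, NORMAL: 0.2, EITHER: 0.1}

fused = combine(loop_mass, video_mass)
print(fused[INCIDENT], fused[NORMAL], fused[EITHER])
```

With these illustrative numbers the fused belief in an incident is higher than either sensor reports alone, which is the complementary effect that fusion is meant to exploit.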

11.
The key to achieving optimum ship system reliability and safety is to have a sound maintenance management system in place for mitigating or eliminating equipment/component failures. Maintenance has three key elements: risk assessment, maintenance strategy selection, and the process of determining the optimal interval for the maintenance task. The optimisation of these three elements is what constitutes a sound maintenance management system. One of the challenges that marine maintenance practitioners face is the problem of maintenance selection for each equipment item of the ship machinery system. The decision-making process involves using different, conflicting decision criteria to select the optimum maintenance strategy from among multiple maintenance alternatives. For such decision-making problems, a multi-criteria decision-making (MCDM) method is appropriate. Hence in this paper two hybrid MCDM methods, Delphi-AHP and Delphi-AHP-PROMETHEE, are presented for the selection of appropriate maintenance strategies for ship machinery systems and other related ship systems. A case study of a ship machinery system maintenance strategy selection problem is used to demonstrate the suitability of the proposed methods.
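
To make the AHP part of these hybrid methods concrete, the sketch below derives criterion weights from a pairwise comparison matrix using the row geometric-mean approximation. It is only an assumption about how such weights could be computed, not the paper's procedure; the Delphi and PROMETHEE stages are omitted, and the criteria, the judgments, and the `ahp_weights` function are hypothetical.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    # Geometric mean of each row of the reciprocal pairwise comparison matrix.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    # Normalize so that the weights sum to 1.
    return [g / total for g in geo_means]


# Hypothetical criteria: cost, safety, reliability (illustrative judgments only).
# pairwise[i][j] states how much more important criterion i is than criterion j.
pairwise = [
    [1.0, 1 / 2, 1 / 4],
    [2.0, 1.0, 1 / 2],
    [4.0, 2.0, 1.0],
]
weights = ahp_weights(pairwise)
print(weights)
```

This example matrix is perfectly consistent, so the weights come out in the ratio 1:2:4; real expert judgments are usually inconsistent, and a consistency check is normally run before the weights are used.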

12.
Compared with structured data sources that are usually stored and analyzed in spreadsheets, relational databases, and single data tables, unstructured construction data sources such as text documents, site images, web pages, and project schedules have been less intensively studied due to additional challenges in data preparation, representation, and analysis. In this paper, our vision for data management and mining that addresses such challenges is presented, together with related research results from previous work and our recent developments in data mining on text-based, web-based, image-based, and network-based construction databases.

13.
The arrival of new technologies related to smart grids and the resulting ecosystem of applications and management systems pose many new problems. The databases of the traditional grid and the various initiatives related to new technologies have given rise to many different management systems with several formats and different architectures. A heterogeneous data source integration system is necessary to update these systems for the new smart grid reality. Additionally, it is necessary to take advantage of the information smart grids provide. In this paper, the authors propose a heterogeneous data source integration based on IEC standards and metadata mining. Additionally, an automatic data mining framework is applied to model the integrated information.

14.
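University administrative information is large in volume, fast-changing, and highly complex, so an effective university management information system is needed to improve information management capability. Traditional designs build the system on an embedded Visual Basic architecture, which gives poor information re-implantation capability and poor multithreaded performance. This paper proposes a design method for a university management information system based on multivariate feature data mining and the embedded Linux kernel. First, the overall system design and file configuration are carried out in the embedded Linux kernel unit, together with an analysis of the functional modules and a description of the technical specifications. A data mining algorithm based on phase-space reconstruction and association feature extraction is then designed to mine and extract useful features from university management information. The data mining results drive program loading and booting, and the software is developed around a program loading module, a data storage module, a cross-compilation module, and a network communication module, giving an improved, data-mining-based design of the university management information system. Experimental results show that using the system for mining and access scheduling of university management information gives good reliability and human-computer interaction, and that the system performs better on metrics such as throughput and execution time overhead.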
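
Phase-space reconstruction is commonly done by time-delay embedding, and the sketch below shows that generic step only. The abstract does not say how the reconstruction is configured or how association features are extracted from it, so the embedding dimension, the delay, the sample series, and the `delay_embed` function are all assumptions made for illustration.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]]."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    # Number of complete delay vectors that fit inside the series.
    count = len(series) - (dim - 1) * tau
    return [
        [series[i + k * tau] for k in range(dim)]
        for i in range(max(count, 0))
    ]


# Hypothetical series, e.g. a count of records updated per hour (illustrative only).
series = [1, 2, 3, 4, 5, 6]
vectors = delay_embed(series, dim=3, tau=2)
print(vectors)
```

Each delay vector is one point in the reconstructed phase space; a feature extraction step would then operate on these points rather than on the raw series.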

15.
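In recent years, privacy-preserving data mining has become a hot research topic in data mining and has produced a rich body of results. However, with the development of mobile communication, embedded, and positioning technologies, and the emergence of applications such as the Internet of Things, location-based services, and location-based social networks, information containing personal privacy has become more abundant, and comprehensive analysis of data with data mining tools is more likely to violate personal privacy. In light of these new application requirements, in-depth research on privacy-preserving data mining methods is of great practical significance. Based on an analysis of the classification and technical characteristics of existing privacy-preserving data mining methods, this paper identifies the challenges that arise when existing methods are applied to applications built on new distributed system architectures, to high-dimensional data, and to spatio-temporal data, and it points out directions for future research.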

16.
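This paper explores the application of data mining technology at the CNPC (中油集团) Xinjiang Training Center. The database of the existing training management information system has accumulated a large amount of historical data. Building on this data, the paper uses the integrated data mining environment of Microsoft SQL Server 2005, taking the Microsoft Time Series algorithm as an example, to build a data mining model, perform data mining, and predict the training capacity of each organizing department, thereby providing managers with useful information for decisions on allocating training resources. Finally, it summarizes the problems encountered during development and their solutions.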

17.
Uncertain data are data with uncertainty information, which exist widely in database applications. In recent years, uncertainty in data has brought challenges in almost all database management areas, such as data modeling, query representation, query processing, and data mining. There is no doubt that uncertain data management has become a hot research topic in the field of data management. In this study, we explore problems in managing uncertain data, present state-of-the-art solutions, and provide future research directions in this area. The uncertain data management techniques discussed include data modeling, query processing, and data mining in uncertain data in relational, XML, graph, and stream forms.

18.
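Research on Data Mining of Distributed, Heterogeneous, and Legacy Data   Total citations: 3, self-citations: 0, citations by others: 3
This paper studies methods and strategies for data mining in distributed, heterogeneous, and legacy databases. It first discusses mining methods for distributed databases, then extends them to heterogeneous data sources, and finally discusses mining methods for legacy databases.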

19.
The Internet of Things (IoT) aims to create a world that enables the interconnection and integration of things in the physical world and cyber space. With the involvement of a great number of wireless sensor devices, IoT generates a diversity of datasets that are massive, multi-source, heterogeneous, and sparse. To take advantage of these data to further improve IoT services and offer intelligent services, data fusion is usually employed first to reduce the size and dimension of the data, optimize the amount of data traffic, and extract useful information from raw data. Although some surveys on IoT data fusion exist, the literature still lacks comprehensive insight and discussion across different IoT application domains that pays special attention to security and privacy. In this paper, we investigate the properties of IoT data, propose a number of IoT data fusion requirements, including ones about security and privacy, classify IoT applications into several domains, and then provide a thorough review of the state of the art of data fusion in the main IoT application domains. In particular, we employ the requirements of IoT data fusion as a measure to evaluate and compare the performance of existing data fusion methods. Based on this survey, we summarize open research issues, highlight promising future research directions, and specify research challenges.

20.
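As healthcare reform deepens, hospitals face increasingly fierce competition, and with today's highly developed information technology, hospitals have widely undertaken informatization projects. Taking the actual characteristics of hospitals into account, this paper proposes using data warehouse technology to integrate existing operational databases and applying data mining on top of the warehouse to analyze the data and uncover potentially useful knowledge. The aim is to further improve hospital service quality, increase efficiency, reduce costs, guide managers toward sound decision analysis, and strengthen the hospital's core competitiveness.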
