Similar literature
 Found 20 similar documents (search time: 31 ms)
1.

Recent trends in big data show that the amount of data continues to grow at an exponential rate. Over the past few years, this trend has inspired many researchers to explore new research directions across multiple areas of big data. The widespread popularity of big data processing platforms based on the MapReduce framework has created a growing demand to further optimize their performance for various purposes. In particular, improving resource and job scheduling is becoming critical, since scheduling fundamentally determines whether applications can achieve their performance goals in different use cases. Scheduling plays an important role in big data, mainly in reducing the execution time and cost of processing. This paper surveys the research undertaken in the field of scheduling in big data platforms. It analyzes scheduling in MapReduce from two aspects: taxonomy and performance evaluation. The research progress in MapReduce scheduling algorithms is also discussed. The limitations of existing MapReduce scheduling algorithms are pointed out, and future research opportunities are identified, so that researchers can find them easily. Our study can serve as a benchmark for expert researchers proposing novel MapReduce scheduling algorithms, and as a starting point for novice researchers.


2.
Digitization brings about new ways of analyzing data from cultural heritage areas. Automatic error detection, as input to semiautomatic error correction, is one type of analysis that can be found high on the priority list of cultural heritage data managers and researchers. We describe a general approach to cleaning cultural heritage databases. We present four case studies on databases from different cultural heritage institutions, and describe an information system in which we embed our error detector in a larger framework, enabling researchers to access, check, and correct their data more easily than before.

3.
Peer-to-Peer (P2P) Desktop Grids are computing infrastructures that aggregate a set of desktop-class machines in which all the participating entities have the same roles, responsibilities, and rights. In this paper, we present ShareGrid, a P2P Desktop Grid infrastructure based on the OurGrid middleware, which federates the resources provided by a set of small research laboratories so that they can easily share and use their computing resources. We discuss the techniques and tools we employed to ensure scalability, efficiency, and usability, and describe the various applications used on it. We also demonstrate ShareGrid's ability to provide good performance and scalability by reporting the results of experimental evaluations carried out by running various applications with different resource requirements. Our experience with ShareGrid indicates that P2P Desktop Grids can represent an effective answer to the computing needs of small research laboratories, as long as they provide both ease of management and use, and good scalability and performance.

4.
5.
The rapid growth of data exchange on the Internet has created many critical problems that require an answer. Traditional data exchange systems based on client/server communication models scale poorly and incur especially high maintenance costs in the data exchange domain. For these reasons, many researchers have switched their interest to asynchronous communication models. Although Message-Oriented Middleware (MOM) is a middle-tier infrastructure that links operating systems and applications, the asynchronous communication APIs supported by middleware vendors are usually hard to use. In this study, we therefore present a new development environment for asynchronous communication platforms, which we term Ghostwriter. The keyword for our development environment is ‘easy’, that is, easy to use, easy to develop, and easy to deploy. In addition, learning about and implementing the functions of asynchronous communication clients in the Ghostwriter environment is simple. Other benefits are a lower technical learning curve, the freedom to concentrate on system design, easily reusable components, and easily integrated applications.

6.
The inherent complex nature of current distributed computing architectures hinders the widespread adoption of these systems for mainstream use. In general, users have access to a highly heterogeneous set of compute resources, which may include clusters, grids, desktop grids, clouds, and other compute platforms. This heterogeneity is especially problematic when running parallel and distributed applications. Software is needed which easily combines as many resources as possible into one coherent computing platform. In this paper, we introduce Zorilla: peer‐to‐peer (P2P) middleware that creates a single distributed environment from any available set of compute resources. Zorilla imposes minimal requirements on the resource used, is platform independent, and does not rely on central components. In addition to providing functionality on bare resources, Zorilla can exploit locally available middleware. Zorilla explicitly supports distributed and parallel applications, and allows resources from multiple sites to cooperate in a single computation. Zorilla makes extensive use of both virtualization and P2P techniques. We will demonstrate how virtualization and P2P combine into a simple design, while enhancing functionality and ease of use. Together, these techniques bring our goal a step closer: transparent, easy use of resources, even on very heterogeneous distributed systems. Copyright © 2011 John Wiley & Sons, Ltd.

7.
Intuitively, data management and data integration tools should be well suited for exchanging information in a semantically meaningful way. Unfortunately, they suffer from two significant problems: they typically require a common and comprehensive schema design before they can be used to store or share information, and they are difficult to extend because schema evolution is heavyweight and may break backward compatibility. As a result, many large-scale data sharing tasks are more easily facilitated by non-database-oriented tools that have little support for semantics. The goal of the peer data management system (PDMS) is to address this need: we propose the use of a decentralized, easily extensible data management architecture in which any user can contribute new data, schema information, or even mappings between other peers' schemas. PDMSs represent a natural step beyond data integration systems, replacing their single logical schema with an interlinked collection of semantic mappings between peers' individual schemas. This paper considers the problem of schema mediation in a PDMS. Our first contribution is a flexible language for mediating between peer schemas that extends known data integration formalisms to our more complex architecture. We precisely characterize the complexity of query answering for our language. Next, we describe a reformulation algorithm for our language that generalizes both global-as-view and local-as-view query answering algorithms. Then we describe several methods for optimizing the reformulation algorithm and an initial set of experiments studying its performance. Finally, we define and consider several global problems in managing semantic mappings in a PDMS. Received: 16 December 2002. Accepted: 14 April 2003. Published online: 12 December 2003. Edited by: V. Atluri.

8.
While Peer-to-Peer streaming has become increasingly popular over the Internet during recent years, the proper allocation of available resources among peers in a resource-constrained environment remains a challenging problem. In a resource-constrained environment, the allocated resources, and thus the quality delivered to individual peers, should be proportional to their contribution to the system, i.e., resource allocation should be contribution-aware. This in turn results in fairness among peers and encourages active contribution from participating peers, which is essential for the scalability of P2P systems. However, contribution-aware resource allocation is challenging due to the distributed and dynamic nature of resources in P2P systems. In this paper, we present a tax-based contribution-aware scheme for live mesh-based P2P streaming approaches. In our proposed scheme, individual peers use a tax function to determine their number of parent peers (i.e., their share of resources) based on the number of their child peers (i.e., peers’ contributed resources) and the aggregate available resources in the system. We examine the behavior of a commonly used tax function, and describe how the contribution-aware scheme can leverage the tax function. Through extensive simulations we demonstrate the ability of our proposed scheme to properly allocate available resources among participating peers over a wide range of scenarios. We show that the amount of resources (i.e., bandwidth) is divided across peers in proportion to their contribution, and that in our default simulation setting the median quality delivered to high-bandwidth peers with high contribution is improved by 100%. We believe that our results shed light on the dynamics of resource utilization and allocation in the context of live mesh-based P2P streaming.
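The abstract above does not say which tax function the paper examines. As a rough illustration only, the sketch below assumes a linear tax function of the kind sometimes used in contribution-aware P2P work: each peer keeps a 1/t fraction of its own contribution, and the taxed remainder is pooled and split evenly. The function name `entitled_share` and the parameter `tax_rate` are hypothetical and not taken from the paper.

```python
def entitled_share(contribution, all_contributions, tax_rate):
    """Return a peer's entitled share under an ASSUMED linear tax function.

    share = contribution / t + ((t - 1) / t) * mean(all_contributions)

    contribution      -- resources this peer contributes (e.g. child peers served)
    all_contributions -- contributions of every peer in the system, this one included
    tax_rate          -- t >= 1; t == 1 means no redistribution at all
    """
    if tax_rate < 1:
        raise ValueError("tax_rate must be >= 1")
    if not all_contributions:
        raise ValueError("all_contributions must not be empty")
    mean_contribution = sum(all_contributions) / len(all_contributions)
    kept = contribution / tax_rate                               # part the peer keeps
    redistributed = (tax_rate - 1) / tax_rate * mean_contribution  # equal split of the pool
    return kept + redistributed


contributions = [8, 2, 2]
shares = [entitled_share(c, contributions, tax_rate=2) for c in contributions]
print(shares)
```

Under this assumed form the shares always sum to the total contribution, so the scheme hands out exactly the resources that exist; a larger `tax_rate` moves the allocation toward an equal split, while `tax_rate=1` returns each peer exactly what it contributes.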

9.

Since its invention, the Web has evolved into the largest multimedia repository that has ever existed. This evolution is a direct result of the explosion of user-generated content, explained by the wide adoption of social network platforms. The vast amount of multimedia content requires effective management and retrieval techniques. Nevertheless, Web multimedia retrieval is a complex task because users commonly express their information needs in semantic terms, but expect multimedia content in return. This dissociation between semantics and content of multimedia is known as the semantic gap. To solve this, researchers are looking beyond content-based or text-based approaches, integrating novel data sources. New data sources can consist of any type of data extracted from the context of multimedia documents, defined as the data that is not part of the raw content of a multimedia file. The Web is an extraordinary source of context data, which can be found in explicit or implicit relation to multimedia objects, such as surrounding text, tags, hyperlinks, and even in relevance-feedback. Recent advances in Web multimedia retrieval have shown that context data has great potential to bridge the semantic gap. In this article, we present the first comprehensive survey of context-based approaches for multimedia information retrieval on the Web. We introduce a data-driven taxonomy, which we then use in our literature review of the most emblematic and important approaches that use context-based data. In addition, we identify important challenges and opportunities, which had not been previously addressed in this area.


10.
A novel model of a distributed knowledge recommender system is proposed to facilitate knowledge sharing among collaborative team members. Unlike traditional recommender systems built on the client-server architecture, our model is oriented to the peer-to-peer (P2P) environment without centralized control. In the P2P network of collaborative team members, each peer is deployed with one distributed knowledge recommender, which can supply proper knowledge resources to peers who may need them. This paper investigates the key techniques for implementing the distributed knowledge recommender model. Moreover, a series of simulation-based experiments is conducted using data from a real-world collaborative team in an enterprise. The experimental results validate the efficiency of the proposed model. This research paves the way for developing platforms that can share and manage large-scale distributed knowledge resources. This study also provides a new framework for simulating and studying individual or organizational behaviors of knowledge sharing in a collaborative team.

11.
The user experience of current P2P Personal and Social networking systems does not meet the usability needs of the technically naïve users. This is the motivation behind MyNet, a P2P platform that enables non-expert users to easily organize their resources and share them in their immediate social neighborhood. In this paper, we present our experience following a user-centered approach in designing MyNet: using real-world metaphors in the core system, leveraging NFC-based touch to mirror human behavior models, and involving actual users in the design process. The results of our 50-user usability evaluation are also presented in detail.

12.
Chee Shin Yeo  Rajkumar Buyya 《Software》2006,36(13):1381-1419
In utility‐driven cluster computing, cluster Resource Management Systems (RMSs) need to know the specific needs of different users in order to allocate resources according to their needs. This in turn is vital to achieve service‐oriented Grid computing that harnesses resources distributed worldwide based on users' objectives. Recently, numerous market‐based RMSs have been proposed to make use of real‐world market concepts and behavior to assign resources to users for various computing platforms. The aim of this paper is to develop a taxonomy that characterizes and classifies how market‐based RMSs can support utility‐driven cluster computing in practice. The taxonomy is then mapped to existing market‐based RMSs designed for both cluster and other computing platforms to survey current research developments and identify outstanding issues. Copyright © 2006 John Wiley & Sons, Ltd.

13.
Ultrasonic Doppler color imaging can provide anatomic information and simultaneously render flow information within blood vessels for diagnostic purposes. Many researchers are currently developing ultrasound image processing algorithms in order to provide physicians with accurate clinical parameters from the images. Because researchers use a variety of computer languages and work on different computer platforms to implement their algorithms, it is difficult for other researchers and physicians to access those programs. A system has been developed using World Wide Web (WWW) technologies and HTTP communication protocols to publish our ultrasonic Angle Independent Doppler Color Image (AIDCI) processing algorithm and several general measurement tools on the Internet, where authorized researchers and physicians can easily access the program using web browsers to carry out remote analysis of their local ultrasonic images or images provided from the database. In order to overcome potential incompatibility between programs and users' computer platforms, ActiveX technology was used in this project. The technique developed may also be used in other research fields.

14.
The majority of machine learning methodologies operate with the assumption that their environment is benign. However, this assumption does not always hold, as it is often advantageous to adversaries to maliciously modify the training data (poisoning attacks) or the test data (evasion attacks). Such attacks can be catastrophic given the growth and the penetration of machine learning applications in society. Therefore, there is a need to secure machine learning, enabling its safe adoption in adversarial settings such as spam filtering, malware detection, and biometric recognition. This paper presents a taxonomy and survey of attacks against systems that use machine learning. It organizes the body of knowledge in adversarial machine learning so as to identify the aspects to which researchers from different fields can contribute. The taxonomy identifies attacks which share key characteristics and as such can potentially be addressed by the same defence approaches. Thus, the proposed taxonomy makes it easier to understand the existing attack landscape towards developing defence mechanisms, which are not investigated in this survey. The taxonomy is also leveraged to identify open problems that can lead to new research areas within the field of adversarial machine learning.

15.
Optimization of data-parallel applications for modern HPC platforms requires partitioning the computations between the heterogeneous computing devices in proportion to their speed. Heterogeneous data partitioning algorithms are based on computation performance models of the executing platforms. Their implementation is not trivial as it requires: accurate and efficient benchmarking of computing devices, which may share resources and/or execute different codes; appropriate interpolation methods to predict performance; and advanced mathematical methods to solve the data partitioning problem. In this paper, we present FuPerMod, a software tool that addresses these implementation issues and automates the development of data partitioning code in data-parallel applications for heterogeneous HPC platforms.
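To make "partitioning in proportion to speed" concrete, the sketch below shows only the simplest case, assuming each device has a single constant speed that does not depend on problem size. It is not FuPerMod's API or algorithm, which the abstract says relies on benchmarked performance models and interpolation; the function name `partition_by_speed` is hypothetical.

```python
def partition_by_speed(total_units, speeds):
    """Split total_units indivisible work units across devices in proportion
    to their speeds (ASSUMED constant), using largest-remainder rounding so
    that the integer shares add up to total_units exactly."""
    if total_units < 0:
        raise ValueError("total_units must be non-negative")
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")

    total_speed = sum(speeds)
    # Ideal (fractional) share of each device.
    exact = [total_units * s / total_speed for s in speeds]
    # Round every share down first.
    shares = [int(x) for x in exact]
    leftover = total_units - sum(shares)
    # Give the remaining units to the devices with the largest fractional parts.
    by_fraction = sorted(range(len(speeds)),
                         key=lambda i: exact[i] - shares[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical example: a device twice as fast as the other gets about twice the work.
print(partition_by_speed(10, [1, 2]))
```

With constant speeds every device finishes at roughly the same time. When speed varies with the amount of work assigned, as the abstract implies for real platforms, a single proportional split like this is no longer sufficient.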

16.
17.
《Computers & Education》1998,31(3):255-264
The World Wide Web, coupled with user-friendly Web browsers, now provides access to multimedia Web pages in universally accepted formats that can easily be accessed worldwide via inexpensive desktop computers. Everyone appears to agree that this technology will revolutionize how students, faculty, researchers, and the public access and use information. Consequently, university educators now have, for the first time in history, a new way to customize and share their unique approaches to teaching and their information resources, in the form of text, graphics, and sound, with students both on and off campus and, with concern for the future, across time. In this paper we discuss our exploration of the use of interactive learning on the Web in an Introduction to C Programming course taught in the Department of Computer and Information Science at Cleveland State University, and compare the results with the same course taught in a previous semester without interactive WWW learning.

18.
Image acquisition technology is improving very fast from a performance point of view. However, there are physical restrictions that can only be overcome using software processing strategies. This is particularly true in the case of super resolution (SR) methodologies. SR techniques have found a fertile application field in airborne and space optical acquisition platforms. Single-frame SR methods may be advantageous for some remote-sensing platforms and acquisition time conditions. This article makes two contributions: (1) it presents an overview of single-frame SR methods, with a comparative analysis of their performance in different and challenging remote-sensing scenarios, and (2) it proposes a new single-frame SR taxonomy and a common validation strategy. Finally, we should emphasize that, on the one hand, this is the first time, to the best of our knowledge, that such a review and analysis of single-frame SR methods has been made in the framework of remote sensing, and, on the other hand, that the new single-frame SR taxonomy is aimed at shedding some light on the classification of some types of single-frame SR methods.

19.
Schools are increasingly integrating character education to facilitate improved moral thinking and prosocial behavior among students. An effective method for delivering character education is problem solving in moral and social situations represented visually as animated vignettes. However, schools are rarely able to use animated vignettes, since existing tools do not allow them to be easily created and having them created externally is overly expensive. In this paper, we describe the design, use, and evaluation of a computational tool that enables students to construct their own animated vignettes. By building, sharing, and responding to vignettes, students become engaged in problem solving in moral and social situations. Evaluations showed that users are able to build meaningful vignettes, that our tool is easy to learn and fun to use, and that its multimedia features are often used and well liked. Educators can download and use our tool, while researchers can draw upon our design rationale and lessons learned when building similar tools.

20.
In this paper we use a case study of a project to create a Web 2.0-based, Virtual Research Environment (VRE) for researchers to share digital resources in order to reflect on the principles and practices for embedding eResearch applications within user communities. In particular, we focus on the software development methodologies and project management techniques adopted by the project team in order to ensure that the project remained responsive to changing user requirements without compromising their capacity to keep the project ‘on track’, i.e. meeting the goals declared in the project proposal within budget and on time. Drawing on ethnographic fieldwork, we describe how the project team, whose members are distributed across multiple sites (and often mobile), exploit a repertoire of coordination mechanisms, communication modes and tools, artefacts and structuring devices as they seek to establish the orderly running of the project while following an agile, user-centred development approach.
