Similar literature
20 similar documents found.
1.
Today, a large volume of hotel reviews is available on many websites, such as TripAdvisor (http://www.tripadvisor.com) and Orbitz (http://www.orbitz.com). A typical review contains an overall rating, several aspect ratings, and review text. The rating summarizes the review as a numerical score. The task of aspect-based opinion summarization is to extract the aspect-specific opinions hidden in reviews that lack aspect ratings, so that users can digest them quickly without reading the full text. The task consists of aspect identification and aspect rating inference. Most existing studies cannot exploit aspect ratings, which are becoming increasingly abundant on review hosts. In this paper, we propose two topic models that explicitly model aspect ratings as observed variables to improve the performance of aspect rating inference on unrated reviews. The experimental results show that our approaches outperform existing methods on a data set crawled from the TripAdvisor website.

2.
Rapid building detection using machine learning   Cited by: 1 (self-citations: 0, citations by others: 1)
This work describes algorithms for performing discrete object detection, specifically in the case of buildings, where usually only low-quality RGB-only geospatial reflective imagery is available. We utilize new candidate search and feature extraction techniques to reduce the problem to a machine learning (ML) classification task. Here we can harness the complex patterns of contrast features contained in training data to establish a model of buildings. We avoid costly sliding windows to generate candidates; instead we stitch together well-known image processing techniques to produce candidates for building detection that cover 80–85% of buildings. Reducing the number of possible candidates is important due to the scale of the problem. Each candidate is subjected to classification which, although linear, costs time and prohibits large-scale evaluation. We propose a candidate alignment algorithm to boost classification performance to 80–90% precision with a linear time algorithm and show it has negligible cost. Also, we propose a new concept called a Permutable Haar Mesh (PHM) which we use to form and traverse a search space to recover candidate buildings which were lost in the initial preprocessing phase. All code and datasets from this paper are made available online (http://kdl.cs.umb.edu/w/datasets/ and https://github.com/caitlinkuhlman/ObjectDetectionCLUtility).

3.
In the era of big data, with digital information of unprecedented volume being collected and produced in several application domains, it becomes more and more difficult to manage and query large data repositories. In the framework of the PetaSky project (http://com.isima.fr/Petasky), we focus on the problem of managing scientific data in the field of cosmology. The data we consider are those of the LSST project (http://www.lsst.org/). The overall size of the database that will be produced is expected to exceed 60 PB (Lsst data challenge handbook, 2012). In order to evaluate the performance of existing SQL On MapReduce data management systems, we conducted extensive experiments using data and queries from the area of cosmology. The goal of this work is to report on the ability of such systems to support large-scale declarative queries. We mainly investigated the impact of data partitioning, indexing and compression on query execution performance.

4.
5.
Twitter (http://twitter.com) is one of the most popular social networking platforms. Twitter users can easily broadcast disaster-specific information, which, if effectively mined, can assist in relief operations. However, the brevity and informal nature of tweets pose a challenge to Information Retrieval (IR) researchers. In this paper, we successfully use word embedding techniques to improve ranking for ad-hoc queries on microblog data. Our experiments with the ‘Social Media for Emergency Relief and Preparedness’ (SMERP) dataset provided at an ECIR 2017 workshop show that these techniques outperform conventional term-matching-based IR models. In addition, we show that, for the SMERP task, our word-embedding-based method is more effective if the embeddings are generated from the disaster-specific SMERP data than when they are trained on the large social media collection provided for the TREC (http://trec.nist.gov/) 2011 Microblog track dataset.
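To illustrate why embedding-based ranking can succeed where term matching fails, here is a minimal sketch. It assumes each text is represented by the mean of its word vectors and scored by cosine similarity; the abstract does not say how the authors combine embeddings, so this averaging scheme, the function names, and the two-dimensional toy vectors are all hypothetical.

```python
import math

def mean_vector(tokens, emb):
    """Average the embeddings of the tokens that have a vector; None if none do."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, docs, emb):
    """Return docs sorted by descending cosine similarity to the query."""
    q = mean_vector(query.lower().split(), emb)
    scored = []
    for doc in docs:
        d = mean_vector(doc.lower().split(), emb)
        score = cosine(q, d) if q is not None and d is not None else 0.0
        scored.append((score, doc))
    return [doc for score, doc in sorted(scored, key=lambda p: p[0], reverse=True)]

# Hypothetical toy embeddings: disaster words cluster on the first axis.
TOY_EMB = {
    "flood": [0.9, 0.1], "water": [0.8, 0.2],
    "rescue": [0.7, 0.3], "relief": [0.75, 0.25],
    "football": [0.1, 0.9], "match": [0.2, 0.8],
}

ranking = rank("flood rescue", ["football match", "water relief"], TOY_EMB)
```

In this toy setup the query shares no term with either document, so a pure term-matching model would score both at zero, while the embedding score still places "water relief" first.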

6.
The amount of multimedia data collected in museum databases is growing fast, while the capacity of museums to display information to visitors is acutely limited by physical space. Museums must seek the perfect balance of information given on individual pieces in order to provide sufficient information to aid visitor understanding while maintaining sparse usage of the walls and guaranteeing high appreciation of the exhibit. Moreover, museums often target the interests of average visitors instead of the entire spectrum of different interests each individual visitor might have. Finally, visiting a museum should not be an experience contained in the physical space of the museum but a door opened onto a broader context of related artworks, authors, artistic trends, etc. In this paper we describe the MNEMOSYNE system that attempts to address these issues through a new multimedia museum experience. Based on passive observation, the system builds a profile of the artworks of interest for each visitor. These profiles of interest are then used to drive an interactive table that personalizes multimedia content delivery. The natural user interface on the interactive table uses the visitor’s profile, an ontology of museum content and a recommendation system to personalize exploration of multimedia content. At the end of their visit, the visitor can take home a personalized summary of their visit on a custom mobile application. In this article we describe in detail each component of our approach as well as the first field trials of our prototype system built and deployed at our permanent exhibition space at LeMurate (http://www.lemurate.comune.fi.it/lemurate/) in Florence together with the first results of the evaluation process during the official installation in the National Museum of Bargello (http://www.uffizi.firenze.it/musei/?m=bargello).

7.
This paper describes the educational game, TopOpt Game, which invites the player to solve various optimization challenges. The main purpose of gamifying topology optimization is to create a supplemental educational tool which can be used to introduce concepts of topology optimization to newcomers as well as to train human intuition of topology optimization. The players are challenged to solve the standard minimum compliance problem in 2D by distributing material in a design domain given a number of loads and supports with a material constraint. A statistical analysis of the gameplay data shows that players achieve higher scores the more they play the game. The game is freely available for the iOS platform at Apple’s App Store and at http://www.topopt.dtu.dk/?q=node/909 for Windows and OSX.

8.
As the volume of data generated each day continues to increase, more and more interest is put into Big Data algorithms and the insight they provide. Since these analyses require a substantial amount of resources, including physical machines, power, and time, reliable execution of the algorithms becomes critical. This paper analyzes the error resilience of a select group of popular Big Data algorithms and shows how they can effectively be made more fault-tolerant. Using KULFI (http://github.com/quadpixels/kulfi, 2013) and the LLVM (Proceedings of the 2004 international symposium on code generation and optimization (CGO 2004), San Jose, CA, USA, 2004) compiler for compilation allows injection of artificial soft faults throughout these algorithms, giving a thorough analysis of how faults in different locations can affect the outcome of the program. This information is then used to guide the incorporation of fault tolerance mechanisms into the programs, making them as impervious as possible to soft faults.

9.
Website Archivability (WA) is a notion established to capture the core aspects of a website, crucial in diagnosing whether it has the potential to be archived with completeness and accuracy. In this work, aiming at measuring WA, we introduce and elaborate on all aspects of CLEAR+, an extended version of the Credible Live Evaluation Method for Archive Readiness (CLEAR). We use a systematic approach to evaluate WA from multiple perspectives, which we call Website Archivability Facets. We then analyse archiveready.com, a web application we created as the reference implementation of CLEAR+, and discuss the implementation of the evaluation workflow. Finally, we conduct thorough evaluations of all aspects of WA to support the validity, the reliability and the benefits of our method using real-world web data.

10.
The main aim of this paper is to propose an efficient and novel Markov chain-based multi-instance multi-label (Markov-Miml) learning algorithm to evaluate the importance of a set of labels associated with objects of multiple instances. The algorithm computes a ranking of labels to indicate the importance of a set of labels to an object. Our approach is to exploit the relationships between instances and labels of objects. The rank of a class label to an object depends on (i) the affinity metric between the bag of instances of this object and the bag of instances of the other objects, and (ii) the rank of a class label of similar objects. An object, which contains a bag of instances that are highly similar to bags of instances of the other objects with a high rank of a particular class label, receives a high rank of this class label. Experimental results on benchmark data have shown that the proposed algorithm is computationally efficient and effective in label ranking for MIML data. In the comparison, we find that the classification performance of the Markov-Miml algorithm is competitive with that of three popular MIML algorithms based on boosting, support vector machines, and regularization, while the computational time required by the proposed algorithm is less than that of the other three algorithms.
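The dependence of a label's rank on (i) bag-to-bag affinity and (ii) the ranks held by similar objects can be sketched as a random walk with restart over objects. This is not the Markov-Miml algorithm itself, whose details the abstract does not give; the function name `rank_labels`, the restart weight `alpha`, and the precomputed affinity matrix are assumptions made for illustration.

```python
def rank_labels(affinity, labels, alpha=0.85, iters=100):
    """Propagate label scores between objects along an affinity graph.

    affinity: n x n list of non-negative bag-to-bag similarities (assumed given).
    labels:   n x c list of 0/1 known labels; all-zero rows are unlabelled objects.
    Returns an n x c list of scores; a higher score means a higher-ranked label.
    """
    n = len(affinity)
    c = len(labels[0])
    # Row-normalise affinities into Markov transition probabilities.
    transition = []
    for row in affinity:
        total = sum(row)
        transition.append([v / total if total > 0 else 1.0 / n for v in row])
    scores = [[float(v) for v in row] for row in labels]
    for _ in range(iters):
        updated = []
        for i in range(n):
            row = []
            for k in range(c):
                # (ii) scores of similar objects, weighted by (i) affinity.
                propagated = sum(transition[i][j] * scores[j][k] for j in range(n))
                # Restart term keeps the known labels anchored.
                row.append(alpha * propagated + (1.0 - alpha) * labels[i][k])
            updated.append(row)
        scores = updated
    return scores

# Object 2 is unlabelled and is more similar to object 0 than to object 1.
affinity = [[1.0, 0.1, 0.8],
            [0.1, 1.0, 0.2],
            [0.8, 0.2, 1.0]]
labels = [[1, 0],
          [0, 1],
          [0, 0]]
scores = rank_labels(affinity, labels)
```

In this toy example the unlabelled object inherits a higher score for class 0, the label of its closest neighbour, which mirrors the last sentence of the description above.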

11.
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We provide a dataset containing \(\sim\)0.25 M images, \(\sim\)0.76 M questions, and \(\sim\)10 M answers (www.visualqa.org), and discuss the information it provides. Numerous baselines and methods for VQA are provided and compared with human performance. Our VQA demo is available on CloudCV (http://cloudcv.org/vqa).

12.
We present the twelfth edition of the Multi-Agent Programming Contest (https://multiagentcontest.org), an annual, community-serving competition that attracts groups from all over the world. Our contest facilitates comparison of multi-agent systems and provides a concrete problem that is interesting in itself and well-suited to be tackled in educational environments. This time, seven teams competed using strictly agent-based as well as traditional programming approaches.

13.
Gradient vector flow (GVF) snakes are an efficient method for segmentation of ultrasound images of breast cancer. However, the method produces inaccurate results if the seeds are initialized improperly (far from the true boundaries and close to the false boundaries). Therefore, we propose a novel initialization method designed for GVF-type snakes based on walking particles. At the first step, the algorithm locates the seeds at converging and diverging configurations of the vector field. At the second step, the seeds “explode,” generating a set of random walking particles designed to differentiate between the seeds located inside and outside the object. The method has been tested against five state-of-the-art initialization methods on sixty ultrasound images from a database collected by Thammasat University Hospital of Thailand (http://onlinemedicalimages.com). The ground truth was hand-drawn by leading radiologists of the hospital. The competing methods were: trial snake method (TS), centers of divergence method (CoD), force field segmentation (FFS), Poisson Inverse Gradient Vector Flow (PIG), and quasi-automated initialization (QAI). The numerical tests demonstrated that CoD and FFS failed on the selected test images, whereas the average accuracy of PIG and QAI was 73% and 87%, respectively, versus 97% achieved by the proposed method. Finally, TS showed a comparable accuracy of about 93%; however, the method is about ten times slower than the proposed exploding seeds. A video demonstration of the algorithm is at http://onlinemedicalimages.com/index.php/en/presentations.

14.
Recently, Gao et al. (J Time Ser Anal, 2016 doi: 10.1111/jtsa.12178) proposed a new estimation method for the dynamic panel probit model with random effects and derived the theoretical properties of the estimator. In this paper, we extend their estimation method to the \(T\ge 3\) case, and present some Monte Carlo simulations to illustrate the extended estimator.

15.
Foreground detection or moving object detection is a fundamental and critical task in video surveillance systems. Background subtraction using Gaussian Mixture Model (GMM) is a widely used approach for foreground detection. Many improvements have been proposed over the original GMM developed by Stauffer and Grimson (IEEE Computer Society conference on computer vision and pattern recognition, vol 2, Los Alamitos, pp 246–252, 1999. doi: 10.1109/CVPR.1999.784637) to accommodate various challenges experienced in video surveillance systems. This paper presents a review of various background subtraction algorithms based on GMM and compares them on the basis of quantitative evaluation metrics. Their performance analysis is also presented to determine the most appropriate background subtraction algorithm for the specific application or scenario of video surveillance systems.

16.
In our studies of global software engineering (GSE) teams, we found that informal, non-work-related conversations are positively associated with trust. Seeking to use novel analytical techniques to more carefully investigate this phenomenon, we described these non-work-related conversations by adapting the economics literature concept of “cheap talk,” and studied it using Evolutionary Game Theory (EGT). More specifically, we modified the classic Stag-hunt game and analyzed the dynamics in a fixed population setting (an abstraction of a GSE team). Doing so, we were able to demonstrate how cheap talk over the Internet (e-cheap talk) was powerful enough to facilitate the emergence of trust and improve the probability of cooperation where the punishment for uncooperative behavior is comparable to the cost of the cheap talk. To validate the results of our theoretical approach, we conducted two empirical case studies that analyzed the logged IRC development discussions of Apache Lucene (http://lucene.apache.org/) and Chromium OS (http://www.chromium.org/chromium-os) using both quantitative and qualitative methods. The results provide general support to the theoretical propositions. We discuss our findings and the theoretical and practical implications for GSE collaborations and research.
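For readers unfamiliar with the classic Stag-hunt game that the authors modify, the sketch below shows why cooperation needs enough initial trust. It uses plain replicator dynamics in an infinite population, not the authors' fixed-population model, and it omits cheap talk and punishment entirely; the payoff values (4 for a joint stag hunt, 3 for hunting hare) and the function name are illustrative assumptions.

```python
def replicator_stag_hunt(x0, stag_stag=4.0, hare=3.0, dt=0.1, steps=2000):
    """Evolve the fraction x of stag hunters under replicator dynamics.

    A stag hunter earns stag_stag only when paired with another stag hunter
    (and 0 otherwise); a hare hunter always earns hare. Payoffs are assumed.
    """
    x = x0
    for _ in range(steps):
        payoff_stag = stag_stag * x  # expected payoff against a random partner
        payoff_hare = hare           # independent of the partner's choice
        # Euler step of dx/dt = x (1 - x) (payoff_stag - payoff_hare).
        x += dt * x * (1.0 - x) * (payoff_stag - payoff_hare)
    return x

high_trust = replicator_stag_hunt(0.8)  # starts above the 0.75 tipping point
low_trust = replicator_stag_hunt(0.7)   # starts below the 0.75 tipping point
```

With these payoffs the tipping point is x = 3/4: populations starting above it converge to full cooperation, and those starting below it collapse to hare hunting. A mechanism that builds trust, such as the e-cheap talk studied in the paper, matters because it can help a population start on the cooperative side of that threshold.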

17.
We study the class of pseudo-BL-algebras whose every maximal filter is normal. We present an equational base for this class and we extend these results to the class of basic pseudo hoops with fixed strong unit. This is a continuation of the research from Botur et al. (Soft Comput 16:635–644, doi: 10.1007/s00500-011-0763-7, 2012).

18.
In 2013, Farid and Vasiliev [arXiv:1310.4922 [quant-ph]] for the first time proposed a way to construct a protocol for the realisation of “Classical to Quantum” one-way hash function, a derivative of the quantum one-way function as defined by Gottesman and Chuang [Technical Report arXiv:quant-ph/0105032] and used it for constructing quantum digital signatures. We, on the other hand, for the first time, propose the idea of a different kind of one-way function, which is “quantum-classical” in nature, that is, it takes an n-qubit quantum state of a definite kind as its input and produces a classical output. We formally define such a one-way function and propose a way to construct and realise it. The proposed one-way function turns out to be very useful in authenticating a quantum state in any quantum money scheme, and so we can construct many different quantum money schemes based on such a one-way function. Later in the paper, we also give explicit constructions of some interesting quantum money schemes like quantum bitcoins and quantum currency schemes, solely based on the proposed one-way function. The security of such schemes can be explained on the basis of the security of the underlying one-way functions.

19.
This paper analyses online competition between private labels and national brands. Purchase data from a grocery retailer operating both on and offline are used to compute two measures of competition (intrinsic loyalty and conquesting power) for both the private label, and what this paper terms the “reference brand” (a compound of the different national brands within a category), in 36 product categories. The results show that the competitive position of the private label, relative to that of the reference brand, varies across categories and across channels. Using the framework devised by Steenkamp and Dekimpe (Long Range Plan 30(6):917–930, 1997. https://doi.org/10.1016/S0024-6301(97)00077-0) we combine the two computed measures of competition, and classify the private label as a miser, a giant, a fighter or an artisan in each channel and category. The results show: (1) that private labels significantly improve their competitive position online; and (2) that this improvement is not equal across all categories.

20.
Fine particulate matter (\(\hbox {PM}_{2.5}\)) has a considerable impact on human health, the environment and climate change. It is estimated that with better predictions, US$9 billion can be saved over a 10-year period in the USA (State of the science fact sheet air quality. http://www.noaa.gov/factsheets/new, 2012). Therefore, it is crucial to keep developing models and systems that can accurately predict the concentration of major air pollutants. In this paper, our target is to predict \(\hbox {PM}_{2.5}\) concentration in Japan using environmental monitoring data obtained from physical sensors, with improved accuracy over the currently employed prediction models. To do so, we propose a deep recurrent neural network (DRNN) that is enhanced with a novel auto-encoder pre-training method especially designed for time series prediction. Additionally, sensor selection is performed within the DRNN without harming the accuracy of the predictions by taking advantage of the sparsity found in the network. The numerical experiments show that, for time series prediction, the DRNN with our proposed pre-training method is superior to the same network trained with a canonical or a state-of-the-art auto-encoder training method. The experiments confirm that when compared against the \(\hbox {PM}_{2.5}\) prediction system VENUS (National Institute for Environmental Studies. Visual Atmospheric Environment Utility System. http://envgis5.nies.go.jp/osenyosoku/, 2014), our technique improves the accuracy of \(\hbox {PM}_{2.5}\) concentration level predictions that are being reported in Japan.
