Similar Documents
20 similar documents found (search time: 46 ms)
1.
2.
Rapid building detection using machine learning
This work describes algorithms for performing discrete object detection, specifically in the case of buildings, where usually only low-quality, RGB-only geospatial reflective imagery is available. We utilize new candidate search and feature extraction techniques to reduce the problem to a machine learning (ML) classification task. Here we can harness the complex patterns of contrast features contained in training data to establish a model of buildings. We avoid costly sliding windows to generate candidates; instead we innovatively stitch together well-known image processing techniques to produce candidates for building detection that cover 80–85 % of buildings. Reducing the number of possible candidates is important due to the scale of the problem. Each candidate must then be classified, which, although linear, takes time and prohibits large-scale evaluation. We propose a candidate alignment algorithm that boosts classification performance to 80–90 % precision in linear time and show it has negligible cost. We also propose a new concept called a Permutable Haar Mesh (PHM), which we use to form and traverse a search space to recover candidate buildings that were lost in the initial preprocessing phase. All code and datasets from this paper are made available online (http://kdl.cs.umb.edu/w/datasets/ and https://github.com/caitlinkuhlman/ObjectDetectionCLUtility).
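The abstract gives no implementation, but the candidate-generation idea (replacing sliding windows with stitched-together standard image-processing steps, then feeding contrast features to a linear classifier) can be illustrated with a hedged Python/OpenCV sketch. The thresholding parameters, area bounds and feature choices below are illustrative assumptions, not the authors' pipeline:

    import cv2
    import numpy as np

    def building_candidates(rgb_image, min_area=100, max_area=5000):
        # Stitch standard steps: grayscale -> adaptive threshold ->
        # morphological closing -> connected components.
        gray = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2GRAY)
        binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                       cv2.THRESH_BINARY, 31, 5)
        closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE,
                                  np.ones((3, 3), np.uint8))
        n, _, stats, _ = cv2.connectedComponentsWithStats(closed)
        boxes = []
        for i in range(1, n):  # label 0 is the background
            x, y, w, h, area = stats[i]
            if min_area <= area <= max_area:
                boxes.append((x, y, w, h))
        return boxes

    def contrast_features(gray, box, pad=5):
        # Interior vs. surrounding contrast for one candidate box.
        x, y, w, h = box
        inner = gray[y:y + h, x:x + w]
        outer = gray[max(0, y - pad):y + h + pad, max(0, x - pad):x + w + pad]
        return [inner.mean(), inner.std(), outer.mean() - inner.mean()]

    # A linear model (e.g., scikit-learn's LogisticRegression) then
    # classifies each candidate's feature vector as building / non-building.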

3.
We present the twelfth edition of the Multi-Agent Programming Contest (https://multiagentcontest.org), an annual, community-serving competition that attracts groups from all over the world. Our contest facilitates comparison of multi-agent systems and provides a concrete problem that is interesting in itself and well-suited to be tackled in educational environments. This time, seven teams competed using strictly agent-based as well as traditional programming approaches.

4.
With the explosion of data production, the efficiency of data management and analysis has become a concern for both industry and academia. Meanwhile, more and more energy is consumed by IT infrastructure, especially by large-scale distributed systems. In this paper, a novel idea for optimizing the Energy Consumption (EC for short) of MapReduce systems is proposed. We argue that fair data placement helps save energy; we then propose three goals of data placement and a modulo-based Data Placement Algorithm (DPA for short) that achieves these goals. Afterwards, the correctness of the proposed DPA is demonstrated from both theoretical and experimental perspectives. Three different systems which implement the MapReduce model with different DPAs are compared in our experiments. Our algorithm is shown to optimize EC effectively, without introducing additional costs or delaying data loading. With the help of our DPA, the EC for WordCount (https://src/examples/org/apache/hadoop/examples/), Sort (https://src/examples/org/apache/hadoop/examples/sort) and MRBench (https://src/examples/org/apache/hadoop/mapred/) can be reduced by 10.9 %, 8.3 % and 17 % respectively, and time consumption is reduced by 7 %, 6.3 % and 7 % respectively.
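The abstract does not spell out the modulo-based DPA, but a minimal sketch of what a modulo placement rule looks like is below; the assumption that block b is assigned to node (b mod N) is ours, chosen because it yields the balanced placement the abstract argues saves energy:

    def modulo_placement(block_ids, num_nodes):
        # Block b goes to node (b mod num_nodes), so blocks -- and hence
        # load, active time and energy -- are spread evenly across nodes.
        placement = {n: [] for n in range(num_nodes)}
        for b in block_ids:
            placement[b % num_nodes].append(b)
        return placement

    # modulo_placement(range(10), 3)
    # -> {0: [0, 3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}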

5.
We study the class of pseudo-BL-algebras whose every maximal filter is normal. We present an equational base for this class and we extend these results for the class of basic pseudo hoops with fixed strong unit. This is a continuation of the research from Botur et al. (Soft Comput 16:635–644, doi: 10.1007/s00500-011-0763-7, 2012).

6.
Recently, Gao et al. (J Time Ser Anal, 2016, doi: 10.1111/jtsa.12178) proposed a new estimation method for the dynamic panel probit model with random effects, deriving the theoretical properties of the estimator. In this paper, we extend their estimation method to the \(T\ge 3\) case, and some Monte Carlo simulations are presented to illustrate the extended estimator.

7.
As the volume of data generated each day continues to increase, more and more interest is directed toward Big Data algorithms and the insight they provide. Since these analyses require a substantial amount of resources, including physical machines, power, and time, reliable execution of the algorithms becomes critical. This paper analyzes the error resilience of a select group of popular Big Data algorithms and shows how they can effectively be made more fault-tolerant. Using KULFI (http://github.com/quadpixels/kulfi, 2013) with the LLVM compiler (Proceedings of the 2004 international symposium on code generation and optimization (CGO 2004), San Jose, CA, USA, 2004) allows artificial soft faults to be injected throughout these algorithms, giving a thorough analysis of how faults in different locations can affect the outcome of the program. This information is then used to guide the incorporation of fault tolerance mechanisms into the programs, making them as impervious as possible to soft faults.
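KULFI performs fault injection at the LLVM IR level; as a language-level analogue only (not KULFI's mechanism), the following Python sketch shows the idea of a soft fault: randomly flipping one bit in an intermediate value and observing the effect on the result. The function names and fault probability are illustrative:

    import random
    import struct

    def flip_random_bit(x):
        # Flip one random bit in the 64-bit representation of a float.
        bits = struct.unpack('<Q', struct.pack('<d', x))[0]
        bits ^= 1 << random.randrange(64)
        return struct.unpack('<d', struct.pack('<Q', bits))[0]

    def faulty_sum(values, fault_prob=0.001):
        # A soft fault may corrupt any partial result.
        total = 0.0
        for v in values:
            total += v
            if random.random() < fault_prob:
                total = flip_random_bit(total)
        return total

    # Comparing faulty_sum(data) against sum(data) over many trials shows
    # how often, and how badly, a single bit flip perturbs the outcome.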

8.
In model-driven development of safety-critical systems (like automotive, avionics or railways), well-formedness of models is repeatedly validated in order to detect design flaws as early as possible. In many industrial tools, validation rules are still often implemented by a large amount of imperative model traversal code which makes those rule implementations complicated and hard to maintain. Additionally, as models are rapidly increasing in size and complexity, efficient execution of validation rules is challenging for the currently available tools. Checking well-formedness constraints can be captured by declarative queries over graph models, while model update operations can be specified as model transformations. This paper presents a benchmark for systematically assessing the scalability of validating and revalidating well-formedness constraints over large graph models. The benchmark defines well-formedness validation scenarios in the railway domain: a metamodel, an instance model generator and a set of well-formedness constraints captured by queries, fault injection and repair operations (imitating the work of systems engineers by model transformations). The benchmark focuses on the performance of query evaluation, i.e. its execution time and memory consumption, with a particular emphasis on reevaluation. We demonstrate that the benchmark can be adopted to various technologies and query engines, including modeling tools; relational, graph and semantic databases. The Train Benchmark is available as an open-source project with continuous builds from https://github.com/FTSRG/trainbenchmark.
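As a hedged illustration of a declarative well-formedness query over a graph model (not the Train Benchmark's actual query language), the sketch below encodes a railway-style constraint, "every switch must be monitored by a sensor", over a networkx graph; the model elements and reference names are assumptions:

    import networkx as nx

    # Model elements are typed nodes; edges carry a reference name.
    model = nx.MultiDiGraph()
    model.add_node('sw1', type='Switch')
    model.add_node('sw2', type='Switch')
    model.add_node('sn1', type='Sensor')
    model.add_edge('sw1', 'sn1', ref='monitoredBy')

    def unmonitored_switches(m):
        # Query: switches with no outgoing 'monitoredBy' reference.
        bad = []
        for node, data in m.nodes(data=True):
            if data.get('type') != 'Switch':
                continue
            refs = [d.get('ref') for _, _, d in m.out_edges(node, data=True)]
            if 'monitoredBy' not in refs:
                bad.append(node)
        return bad

    print(unmonitored_switches(model))  # ['sw2'] -- repair: attach a sensor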

9.
Learning how to forecast is always important for traders, and divergent learning frequencies prevail among them. The influence of the evolutionary frequency on learning performance has occasioned many studies of agent-based computational finance (e.g., Lettau in J Econ Dyn Control 21:1117–1147, 1997. doi: 10.1016/S0165-1889(97)00046-8; Szpiro in Complexity 2(4):31–39, 1997. doi: 10.1002/(SICI)1099-0526(199703/04)2:4<31::AID-CPLX8>3.0.CO;2-3; Cacho and Simmons in Aust J Agric Resour Econ 43(3):305–322, 1999. doi: 10.1111/1467-8489.00081). Although these studies all suggest that evolving less frequently and, hence, experiencing more realizations help learning, this implication may result from their common stationarity assumption. Therefore, we first attempt to approach this issue in a ‘dynamically’ evolving market in which agents learn to forecast endogenously generated asset prices. Moreover, in these studies’ market settings, evolving less frequently also meant having a longer time horizon. However, this is not true in many market settings that are even closer to real financial markets. The clarification that the evolutionary frequency and the time horizon are two separate notions leaves the effect of the evolutionary frequency on learning even more elusive and worthy of independent exploration. We find that the influence of a trader’s evolutionary frequency on his forecasting accuracy depends on all market participants and the resulting price dynamics. In addition, prior studies also commonly assume that traders have identical preferences, which is too strong an assumption to apply to a real market. Considering the heterogeneity of preferences, we find that converging to the rational expectations equilibrium is hardly possible, and we even suggest that agents in a slow-learning market learn frequently. We also apply a series of econometric tests to explain the simulation results.

10.
Identifying context-specific entity networks from aggregated data is an important task, arising often in bioinformatics and neuroimaging applications. Computationally, this task can be formulated as jointly estimating multiple different, but related, sparse undirected graphical models (UGM) from aggregated samples across several contexts. Previous joint-UGM studies have mostly focused on sparse Gaussian graphical models (sGGMs) and cannot identify context-specific edge patterns directly. We therefore propose a novel approach, SIMULE (detecting Shared and Individual parts of MULtiple graphs Explicitly), to learn multi-UGM via a constrained \(\ell_1\) minimization. SIMULE automatically infers both specific edge patterns that are unique to each context and shared interactions preserved among all the contexts. Through the \(\ell_1\) constrained formulation, this problem is cast as multiple independent subtasks of linear programming that can be solved efficiently in parallel. In addition to Gaussian data, SIMULE can also handle multivariate Nonparanormal data, which greatly relaxes the normality assumption that many real-world applications do not follow. We provide a novel theoretical proof showing that SIMULE achieves a consistent result at the rate \(O(\log (Kp)/n_{tot})\). On multiple synthetic datasets and two biomedical datasets, SIMULE shows significant improvement over state-of-the-art multi-sGGM and single-UGM baselines (the SIMULE implementation and the datasets used are available at https://github.com/QData/SIMULE).
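SIMULE's exact subtasks are not reproduced here, but a simplified single-graph analogue shows how an \(\ell_1\)-constrained column estimate becomes a linear program (CLIME-style): minimize \(\Vert \beta \Vert_1\) subject to \(\Vert \hat{\Sigma }\beta - e_j\Vert_\infty \le \lambda \). The hedged sketch below splits \(\beta = u - v\), \(u, v \ge 0\), to linearize the objective:

    import numpy as np
    from scipy.optimize import linprog

    def l1_column(sigma_hat, j, lam):
        # min ||b||_1  s.t.  ||sigma_hat @ b - e_j||_inf <= lam,
        # linearized by writing b = u - v with u, v >= 0.
        p = sigma_hat.shape[0]
        e_j = np.zeros(p)
        e_j[j] = 1.0
        c = np.ones(2 * p)                      # sum(u) + sum(v) = ||b||_1
        A = np.hstack([sigma_hat, -sigma_hat])  # A @ [u; v] = sigma_hat @ b
        res = linprog(c, A_ub=np.vstack([A, -A]),
                      b_ub=np.concatenate([e_j + lam, lam - e_j]),
                      bounds=(0, None))
        return res.x[:p] - res.x[p:]

    # Each column j is an independent LP, so all p columns (and, in SIMULE,
    # all K contexts) can be solved in parallel.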

11.
We present the RST Signalling Corpus (Das et al. in RST signalling corpus, LDC2015T10. https://catalog.ldc.upenn.edu/LDC2015T10, 2015), a corpus annotated for signals of coherence relations. The corpus is developed over the RST Discourse Treebank (Carlson et al. in RST Discourse Treebank, LDC2002T07. https://catalog.ldc.upenn.edu/LDC2002T07, 2002) which is annotated for coherence relations. In the RST Signalling Corpus, these relations are further annotated with signalling information. The corpus includes annotation not only for discourse markers which are considered to be the most typical (or sometimes the only type of) signals in discourse, but also for a wide array of other signals such as reference, lexical, semantic, syntactic, graphical and genre features as potential indicators of coherence relations. We describe the research underlying the development of the corpus and the annotation process, and provide details of the corpus. We also present the results of an inter-annotator agreement study, illustrating the validity and reproducibility of the annotation. The corpus is available through the Linguistic Data Consortium, and can be used to investigate the psycholinguistic mechanisms behind the interpretation of relations through signalling, and also to develop discourse-specific computational systems such as discourse parsing applications.

12.
In the era of big data, with massive sets of digital information of unprecedented volume being collected and/or produced in several application domains, it becomes more and more difficult to manage and query large data repositories. In the framework of the PetaSky project (http://com.isima.fr/Petasky), we focus on the problem of managing scientific data in the field of cosmology. The data we consider are those of the LSST project (http://www.lsst.org/). The overall size of the database that will be produced is expected to exceed 60 PB (LSST Data Challenge Handbook, 2012). In order to evaluate the performance of existing SQL-on-MapReduce data management systems, we conducted extensive experiments using data and queries from the area of cosmology. The goal of this work is to report on the ability of such systems to support large-scale declarative queries. We mainly investigated the impact of data partitioning, indexing and compression on query execution performance.

13.
Fine particulate matter (\(\hbox {PM}_{2.5}\)) has a considerable impact on human health, the environment and climate change. It is estimated that with better predictions, US$9 billion could be saved over a 10-year period in the USA (State of the science fact sheet air quality. http://www.noaa.gov/factsheets/new, 2012). Therefore, it is crucial to keep developing models and systems that can accurately predict the concentration of major air pollutants. In this paper, our target is to predict \(\hbox {PM}_{2.5}\) concentration in Japan using environmental monitoring data obtained from physical sensors, with improved accuracy over the currently employed prediction models. To do so, we propose a deep recurrent neural network (DRNN) that is enhanced with a novel pre-training method using an auto-encoder especially designed for time series prediction. Additionally, sensor selection is performed within the DRNN without harming the accuracy of the predictions, by taking advantage of the sparsity found in the network. The numerical experiments show that the DRNN with our proposed pre-training method is superior to one trained with a canonical or a state-of-the-art auto-encoder training method when applied to time series prediction. The experiments confirm that, when compared against the \(\hbox {PM}_{2.5}\) prediction system VENUS (National Institute for Environmental Studies. Visual Atmospheric Environment Utility System. http://envgis5.nies.go.jp/osenyosoku/, 2014), our technique improves the accuracy of the \(\hbox {PM}_{2.5}\) concentration level predictions reported in Japan.
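The paper's DRNN and auto-encoder pre-training are not reproduced here; as a stand-in only, the following PyTorch sketch shows the general shape of a recurrent one-step-ahead forecaster over multi-sensor time series. Layer sizes, sensor count and data shapes are illustrative assumptions:

    import torch
    import torch.nn as nn

    class RNNForecaster(nn.Module):
        # Recurrent one-step-ahead forecaster over multi-sensor windows.
        def __init__(self, n_sensors, hidden=64):
            super().__init__()
            self.rnn = nn.GRU(n_sensors, hidden, num_layers=2,
                              batch_first=True)
            self.head = nn.Linear(hidden, 1)   # next PM2.5 level

        def forward(self, x):                  # x: (batch, time, n_sensors)
            out, _ = self.rnn(x)
            return self.head(out[:, -1])       # predict from last hidden state

    model = RNNForecaster(n_sensors=16)
    x = torch.randn(8, 24, 16)                 # 8 windows of 24 hourly readings
    loss = nn.functional.mse_loss(model(x).squeeze(-1), torch.randn(8))
    loss.backward()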

14.
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We provide a dataset containing \(\sim \)0.25 M images, \(\sim \)0.76 M questions, and \(\sim \)10 M answers (www.visualqa.org), and discuss the information it provides. Numerous baselines and methods for VQA are provided and compared with human performance. Our VQA demo is available on CloudCV (http://cloudcv.org/vqa).
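The open-ended accuracy metric used with the VQA dataset compares a predicted answer against the human-provided answers for that question; an answer is counted fully correct when at least three annotators gave it. A minimal sketch (function name ours):

    def vqa_accuracy(predicted, human_answers):
        # An answer is fully correct if at least 3 of the (typically 10)
        # human annotators gave it: min(#matches / 3, 1).
        matches = sum(1 for a in human_answers if a == predicted)
        return min(matches / 3.0, 1.0)

    print(vqa_accuracy('yes', ['yes', 'yes', 'yes'] + ['no'] * 7))  # 1.0
    print(vqa_accuracy('2', ['2'] + ['3'] * 9))                     # 0.33...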

15.
We show how to realize two-factor authentication for a Bitcoin wallet. To do so, we explain how to employ an ECDSA adaptation of the two-party signature protocol by MacKenzie and Reiter (Int J Inf Secur 2(3–4):218–239, 2004. doi: 10.1007/s10207-004-0041-0) in the context of Bitcoin and present a prototypic implementation of a Bitcoin wallet that offers both two-factor authentication and verification over a separate channel. Since we use a smartphone as the second authentication factor, our solution can be used with hardware already available to most users, and the user experience is quite similar to existing online banking authentication methods.
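The MacKenzie–Reiter protocol keeps the two key shares on separate devices and combines them under homomorphic (Paillier) encryption; the toy sketch below only illustrates the underlying algebra of multiplicative ECDSA key and nonce shares (d = d1·d2 mod n, k = k1·k2 mod n) by combining both shares in a single process, which a real two-factor wallet must never do:

    import hashlib
    import secrets

    # secp256k1 parameters, as used by Bitcoin.
    P = 2**256 - 2**32 - 977
    N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
    G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
         0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

    def point_add(p1, p2):
        if p1 is None: return p2
        if p2 is None: return p1
        (x1, y1), (x2, y2) = p1, p2
        if x1 == x2 and (y1 + y2) % P == 0:
            return None                        # point at infinity
        if p1 == p2:
            m = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
        else:
            m = (y2 - y1) * pow(x2 - x1, -1, P) % P
        x3 = (m * m - x1 - x2) % P
        return (x3, (m * (x1 - x3) - y1) % P)

    def scalar_mult(k, point):
        result = None
        while k:
            if k & 1:
                result = point_add(result, point)
            point = point_add(point, point)
            k >>= 1
        return result

    # Wallet holds d1, phone holds d2; the full key d = d1*d2 mod N never
    # exists on one device in the real protocol (unlike in this toy).
    d1 = secrets.randbelow(N - 1) + 1
    d2 = secrets.randbelow(N - 1) + 1
    pub = scalar_mult(d1 * d2 % N, G)

    z = int.from_bytes(hashlib.sha256(b'toy transaction').digest(), 'big') % N
    k1 = secrets.randbelow(N - 1) + 1
    k2 = secrets.randbelow(N - 1) + 1
    R = scalar_mult(k2, scalar_mult(k1, G))    # each party applies its nonce share
    r = R[0] % N
    s = pow(k1 * k2, -1, N) * (z + r * d1 * d2) % N

    # Standard ECDSA verification against the joint public key.
    w = pow(s, -1, N)
    u = point_add(scalar_mult(z * w % N, G), scalar_mult(r * w % N, pub))
    assert u[0] % N == r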

16.
Foreground detection or moving object detection is a fundamental and critical task in video surveillance systems. Background subtraction using Gaussian Mixture Model (GMM) is a widely used approach for foreground detection. Many improvements have been proposed over the original GMM developed by Stauffer and Grimson (IEEE Computer Society conference on computer vision and pattern recognition, vol 2, Los Alamitos, pp 246–252, 1999. doi: 10.1109/CVPR.1999.784637) to accommodate various challenges experienced in video surveillance systems. This paper presents a review of various background subtraction algorithms based on GMM and compares them on the basis of quantitative evaluation metrics. Their performance analysis is also presented to determine the most appropriate background subtraction algorithm for the specific application or scenario of video surveillance systems.
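As a hedged example of GMM-based background subtraction in practice, OpenCV's MOG2 implements an improved GMM along the lines surveyed here; the video file name and post-processing choices are assumptions:

    import cv2

    # MOG2 is OpenCV's improved-GMM background subtractor.
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                    varThreshold=16,
                                                    detectShadows=True)
    cap = cv2.VideoCapture('surveillance.mp4')
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)  # 255 = foreground, 127 = shadow
        _, fg = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
        fg = cv2.medianBlur(fg, 5)      # drop shadows, remove speckles
    cap.release()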

17.
Robust and accurate detection of the pupil position is a key building block for head-mounted eye tracking and a prerequisite for applications on top, such as gaze-based human–computer interaction or attention analysis. Despite a large body of work, detecting the pupil in images recorded under real-world conditions is challenging given significant variability in the eye appearance (e.g., illumination, reflections, occlusions, etc.), individual differences in eye physiology, as well as other sources of noise, such as contact lenses or make-up. In this paper we review six state-of-the-art pupil detection methods, namely ElSe (Fuhl et al. in Proceedings of the ninth biennial ACM symposium on eye tracking research & applications, ACM. New York, NY, USA, pp 123–130, 2016), ExCuSe (Fuhl et al. in Computer analysis of images and patterns. Springer, New York, pp 39–51, 2015), Pupil Labs (Kassner et al. in Adjunct proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing (UbiComp), pp 1151–1160, 2014. doi: 10.1145/2638728.2641695), SET (Javadi et al. in Front Neuroeng 8, 2015), Starburst (Li et al. in Computer vision and pattern recognition-workshops, 2005. IEEE Computer society conference on CVPR workshops. IEEE, pp 79–79, 2005), and Świrski (Świrski et al. in Proceedings of the symposium on eye tracking research and applications (ETRA). ACM, pp 173–176, 2012. doi: 10.1145/2168556.2168585). We compare their performance on a large-scale data set consisting of 225,569 annotated eye images taken from four publicly available data sets. Our experimental results show that the algorithm ElSe (Fuhl et al. 2016) outperforms the other pupil detection methods by a large margin, thus offering robust and accurate pupil positions on challenging everyday eye images.
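None of the six reviewed detectors is reproduced here; for contrast, a deliberately naive baseline, which the reviewed methods are designed to beat under reflections, occlusions and make-up, simply takes the darkest compact blob as the pupil. All thresholds below are illustrative:

    import cv2

    def detect_pupil(eye_gray):
        # The pupil is usually the darkest compact blob in the eye image.
        blurred = cv2.GaussianBlur(eye_gray, (7, 7), 0)
        t = int(blurred.min()) + 25     # threshold near the darkest pixels
        _, binary = cv2.threshold(blurred, t, 255, cv2.THRESH_BINARY_INV)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        largest = max(contours, key=cv2.contourArea)
        (x, y), radius = cv2.minEnclosingCircle(largest)
        return int(x), int(y), int(radius)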

18.
Today, a large volume of hotel reviews is available on many websites, such as TripAdvisor (http://www.tripadvisor.com) and Orbitz (http://www.orbitz.com). A typical review contains an overall rating, several aspect ratings, and review text. The rating is a numerical summary of the review. The task of aspect-based opinion summarization is to extract aspect-specific opinions hidden in reviews that do not have aspect ratings, so that users can quickly digest them without actually reading through them. The task consists of aspect identification and aspect rating inference. Most existing studies cannot utilize the aspect ratings that are becoming increasingly abundant on review websites. In this paper, we propose two topic models which explicitly model aspect ratings as observed variables to improve the performance of aspect rating inference on unrated reviews. The experimental results show that our approaches outperform existing methods on a data set crawled from the TripAdvisor website.

19.
In our studies of global software engineering (GSE) teams, we found that informal, non-work-related conversations are positively associated with trust. Seeking to use novel analytical techniques to more carefully investigate this phenomenon, we described these non-work-related conversations by adapting the economics concept of “cheap talk,” and studied it using Evolutionary Game Theory (EGT). More specifically, we modified the classic stag hunt game and analyzed the dynamics in a fixed-population setting (an abstraction of a GSE team). Doing so, we were able to demonstrate how cheap talk over the Internet (e-cheap talk) was powerful enough to facilitate the emergence of trust and improve the probability of cooperation where the punishment for uncooperative behavior is comparable to the cost of the cheap talk. To validate the results of our theoretical approach, we conducted two empirical case studies that analyzed the logged IRC development discussions of Apache Lucene (http://lucene.apache.org/) and Chromium OS (http://www.chromium.org/chromium-os) using both quantitative and qualitative methods. The results provide general support for the theoretical propositions. We discuss our findings and the theoretical and practical implications for GSE collaborations and research.
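The paper's modified game is not reproduced here; a minimal sketch of standard replicator dynamics on a classic stag hunt payoff matrix (payoff values assumed for illustration) shows the mechanism: if cheap talk raises the initial fraction of trusting “stag” players above the basin boundary, the population converges to cooperation:

    import numpy as np

    # Row player's stag hunt payoffs: hunting stag pays off only if the
    # partner also hunts stag; hare is the safe, lower-payoff choice.
    PAYOFF = np.array([[4.0, 0.0],   # play Stag vs (Stag, Hare)
                       [3.0, 3.0]])  # play Hare vs (Stag, Hare)

    def replicator_step(x, dt=0.01):
        # x = fraction of stag hunters in the population.
        pop = np.array([x, 1.0 - x])
        fitness = PAYOFF @ pop        # expected payoff of each strategy
        return x + dt * x * (fitness[0] - pop @ fitness)

    # With these payoffs the basin boundary is x* = 3/4: cheap talk that
    # lifts initial trust past it tips the population into cooperation.
    for x0 in (0.70, 0.80):
        x = x0
        for _ in range(5000):
            x = replicator_step(x)
        print(x0, '->', round(x, 3))  # 0.7 -> 0.0, 0.8 -> 1.0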

20.
A new algorithm for segmenting documents into regions containing musical scores and text is proposed. Such segmentation is a required step prior to applying optical character recognition and optical music recognition to scanned pages that contain both music notation and text. Our segmentation technique is based on the bag-of-visual-words representation followed by random block voting (RBV) in order to detect the bounding boxes containing the musical score and text within a document image. The RBV procedure consists of extracting a fixed number of blocks whose position and size are sampled from a discrete uniform distribution that “over”-covers the input image. Each block is automatically classified as either musical score or text and votes with a particular posterior probability of classification in its spatial domain. An initial coarse segmentation is obtained by summarizing all the votes in a single image. Subsequently, the final segmentation is obtained by subdividing the image into microblocks and classifying them using an N-nearest-neighbor classifier trained on the coarse segmentation. We demonstrate the potential of the proposed method through experiments on two different datasets. One is a challenging dataset of images collected and artificially combined and manipulated for this project. The other is a music dataset obtained by scanning two music books. The results are reported using precision/recall metrics of the overlapping area with respect to the ground truth. The proposed system achieves an overall averaged F-measure of 85 %. The complete source code package and associated data are available at https://github.com/fpeder/mscr under the FreeBSD license to support reproducibility.
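A hedged sketch of the random block voting step (the block classifier is a hypothetical stand-in for the paper's bag-of-visual-words model, and the block-size bounds are assumptions):

    import numpy as np

    rng = np.random.default_rng(0)

    def random_block_voting(image, classify, n_blocks=2000):
        # classify(patch) -> P(music | patch); each block votes with that
        # probability over its whole spatial extent.
        h, w = image.shape[:2]
        votes = np.zeros((h, w))
        counts = np.zeros((h, w))
        for _ in range(n_blocks):
            bh = int(rng.integers(32, h // 2 + 1))
            bw = int(rng.integers(32, w // 2 + 1))
            y = int(rng.integers(0, h - bh + 1))
            x = int(rng.integers(0, w - bw + 1))
            p = classify(image[y:y + bh, x:x + bw])
            votes[y:y + bh, x:x + bw] += p
            counts[y:y + bh, x:x + bw] += 1
        return votes / np.maximum(counts, 1)  # coarse music-vs-text map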

