Similar Documents
Found 20 similar documents (search time: 46 ms).
1.
2.
This paper extends Common2, the family of objects that implement, and are wait-free implementable from, 2-consensus objects, in two ways. First, the stack object is shown to be in the family, refuting a conjecture to the contrary [6]. Second, Common2 is investigated in the unbounded concurrency model, whereas until now it was considered only in an n-process model. We show that the fetch-and-add, test-and-set, and stack objects are in Common2 even with respect to this stronger notion of wait-free implementation. Our constructions rely on a wait-free implementation of immediate snapshots in the unbounded concurrency model, which was previously not known to be possible. The introduction of unbounded concurrency to the study of Common2 opens several directions of research: are there objects that have n-process implementations but are not unbounded-concurrency implementable? We conjecture that swap is such an object. Additionally, the hope is that a queue impossibility proof, which eludes us in the n-process model, will be easier to establish in the unbounded concurrency model.
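To make the notion concrete, here is a minimal hedged sketch (illustrative Python, not the paper's stack protocol) of the classical fact underlying Common2: a test-and-set object for two processes can be built from a single 2-consensus object, because the consensus winner is exactly the process whose test-and-set returns 0. The lock-based consensus object below is a toy stand-in for an atomic primitive:

```python
import threading

class TwoConsensus:
    """Toy 2-process consensus object: the first proposal wins."""
    def __init__(self):
        self._lock = threading.Lock()   # stands in for an atomic primitive
        self._decided = None

    def propose(self, value):
        with self._lock:
            if self._decided is None:
                self._decided = value
        return self._decided

class TwoProcessTestAndSet:
    """Test-and-set for processes 0 and 1, built from one consensus object."""
    def __init__(self):
        self._cons = TwoConsensus()

    def test_and_set(self, pid):
        winner = self._cons.propose(pid)
        return 0 if winner == pid else 1   # winner gets 0, loser gets 1

tas = TwoProcessTestAndSet()
print(tas.test_and_set(0), tas.test_and_set(1))  # exactly one of them prints 0
```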

3.
Bit commitment schemes are at the basis of modern cryptography. Since information-theoretic security is impossible in both the classical and the quantum regime, we examine computationally secure commitment schemes. In this paper we study worst-case complexity assumptions that imply quantum bit commitment schemes. First, we show that QSZK \(\not\subseteq\) QMA implies a computationally hiding and statistically binding auxiliary-input quantum commitment scheme. We then extend our result to show that the much weaker assumption QIP \(\not\subseteq\) QMA (which is weaker than PSPACE \(\not\subseteq\) PP) implies the existence of auxiliary-input commitment schemes with quantum advice. Finally, to strengthen the plausibility of the separation QSZK \(\not\subseteq\) QMA, we exhibit a quantum oracle relative to which honest-verifier QSZK is not contained in QCMA, the class of languages that can be verified using a classical proof in quantum polynomial time.

4.
How best to select samples in algebraic attacks against block ciphers is not well explored or understood. We introduce a simple strategy for selecting the plaintexts and demonstrate its strength by breaking reduced-round KATAN32, LBlock, and SIMON. For each cipher, we present a practical attack on the reduced-round version which outperforms previous attempts at algebraic cryptanalysis, whose complexities were close to exhaustive search. The attack is based on the selection of samples using the cube attack and ElimLin, which was presented at FSE'12, and a new technique called Universal Proning. In the case of LBlock, we break 10 out of 32 rounds. In KATAN32, we break 78 out of 254 rounds. Unlike previous attempts, which break a smaller number of rounds, we do not guess any bit of the key; we use only structural properties of the cipher to break a higher number of rounds with much lower complexity. We show that cube attacks owe their success to the same properties and can therefore be used as a heuristic for selecting the samples in an algebraic attack. The performance of ElimLin is further enhanced by the new Universal Proning technique, which discovers linear equations that are not found by ElimLin.
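As a hedged sketch of the core ElimLin idea (toy code, not the attack toolchain used in the paper): Gaussian elimination over GF(2) on the monomial coefficient matrix can expose linear polynomials hiding in the linear span of a nonlinear system, which are then available for substitution. The function name, encoding, and the tiny two-equation system are all illustrative:

```python
import numpy as np

def elimlin_linear_span(rows, monomials):
    """Toy ElimLin-style step: Gaussian elimination over GF(2) on the
    equations-by-monomials coefficient matrix; reduced rows supported
    only on degree <= 1 monomials are linear equations found in the
    system's linear span.  `monomials` lists tuples of variable
    indices, with () for the constant term."""
    A = np.array(rows, dtype=np.uint8) % 2
    r = 0
    for c in range(A.shape[1]):
        piv = next((i for i in range(r, A.shape[0]) if A[i, c]), None)
        if piv is None:
            continue
        A[[r, piv]] = A[[piv, r]]
        for i in range(A.shape[0]):
            if i != r and A[i, c]:
                A[i] ^= A[r]              # row reduction in GF(2)
        r += 1
    return [row for row in A if row.any()
            and all(len(m) <= 1 for m, bit in zip(monomials, row) if bit)]

# x0*x1 + x0 + 1 = 0 and x0*x1 + x1 = 0 together imply x0 + x1 + 1 = 0.
monos = [(0, 1), (0,), (1,), ()]
eqs = [[1, 1, 0, 1], [1, 0, 1, 0]]
for row in elimlin_linear_span(eqs, monos):
    print(row)  # [0 1 1 1]  ->  the linear relation x0 + x1 + 1
```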

5.
Reachability is a fundamental problem on the large-scale networks emerging nowadays in various application domains, such as social networks, communication networks, biological networks, and road networks, and it has been studied extensively. However, little existing work has studied reachability with realistic constraints imposed on graphs with real-valued edge or node weights. In fact, such weights are very common in many real-world networks, for example, the bandwidth of a link in communication networks, the reliability of an interaction between two proteins in PPI networks, and the handling capacity of a warehouse/storage point in a distribution network. In this paper, we formalize a new yet important reachability query in weighted undirected graphs, called the weight constraint reachability (WCR) query, which asks: is there a path between nodes \(a\) and \(b\) on which each real-valued edge (or node) weight satisfies a range constraint? We discover an interesting property of WCR, based on which we design a novel edge-based index structure that answers the WCR query in \(O(1)\) time. Furthermore, we consider the case when the index cannot entirely fit in memory, which can be very common for emerging massive networks. An I/O-efficient index is proposed, which provides constant I/O (precisely four I/Os) query time with an \(O(|V|\log |V|)\) disk-based index size. Extensive experimental studies on both real and synthetic datasets demonstrate the efficiency and scalability of our solutions in answering the WCR query.
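For reference, the query semantics can be stated as a naive baseline (this is the brute-force check the paper's \(O(1)\) index avoids, not the index itself): run a BFS that only traverses edges whose weight falls inside the range.

```python
from collections import deque

def wcr_query(adj, a, b, lo, hi):
    """Naive weight constraint reachability (WCR) check: BFS restricted
    to edges with weight in [lo, hi].  `adj` maps node -> list of
    (neighbor, weight) pairs for an undirected graph."""
    seen, queue = {a}, deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            return True
        for v, w in adj[u]:
            if lo <= w <= hi and v not in seen:
                seen.add(v)
                queue.append(v)
    return False

graph = {1: [(2, 0.9)], 2: [(1, 0.9), (3, 0.2)], 3: [(2, 0.2)]}
print(wcr_query(graph, 1, 3, 0.5, 1.0))  # False: edge (2,3) violates the range
print(wcr_query(graph, 1, 3, 0.1, 1.0))  # True
```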

6.
Given a large attributed social network, can we find a compact, diffusion-equivalent representation while keeping the attribute properties? Diffusion networks with user attributes, such as friendship, email communication, and people-contact networks, are increasingly commonplace in the real world. However, analyzing them is challenging due to their large size. In this paper, we first formally define the novel problem of summarizing an attributed diffusion graph so as to preserve its attributes and influence-based properties. Next, we propose ANeTS, an effective sub-quadratic parallelizable algorithm to solve this problem: it finds the best set of candidate nodes and merges them to construct a smaller network of 'super-nodes' preserving the desired properties. Extensive experiments on diverse real-world datasets show that ANeTS outperforms all state-of-the-art baselines (some of which do not even finish in 14 days). Finally, we show how ANeTS helps in multiple applications, such as topic-aware viral marketing and sense-making of diverse graphs from different domains.
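A hedged toy of the contraction step only (ANeTS's candidate scoring and property-preservation guarantees are not reproduced here): merging two nodes into a super-node means unioning their neighborhoods and, in this sketch, averaging their attribute vectors.

```python
def merge_supernode(adj, attrs, u, v):
    """Toy super-node merge: union the neighborhoods of u and v, rewire
    their neighbors, and average the attribute vectors.  `adj` maps
    node -> set of neighbors; `attrs` maps node -> attribute list."""
    s = f"{u}+{v}"
    adj[s] = (adj.pop(u) | adj.pop(v)) - {u, v}
    for n in list(adj):
        if n != s and (u in adj[n] or v in adj[n]):
            adj[n] = (adj[n] - {u, v}) | {s}
    attrs[s] = [(x + y) / 2 for x, y in zip(attrs.pop(u), attrs.pop(v))]
    return s

adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
attrs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
merge_supernode(adj, attrs, "b", "c")
print(adj)  # {'a': {'b+c'}, 'b+c': {'a'}}
```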

7.
This paper studies the problem of probabilistic range queries over uncertain data. Although existing solutions can support such queries, there is still room for improvement. We first propose a novel index called S-MRST for indexing uncertain data. By using an irregular shape to bound uncertain data, it has stronger space-pruning power; and by taking the gradient of the probability density function into consideration, S-MRST is also powerful in terms of probability pruning. More importantly, S-MRST is a general index that can support multiple types of probabilistic queries. Theoretical analysis and extensive experimental results demonstrate the effectiveness and efficiency of the proposed index.
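The query itself is easy to state as an index-free baseline (a Monte Carlo sketch under an assumed Gaussian uncertainty model, nothing like S-MRST's pruning): estimate the probability that an uncertain object falls inside the query rectangle and compare it against the threshold.

```python
import random

def prob_in_range(sample_point, lo, hi, trials=10_000):
    """Baseline probabilistic range query: Monte Carlo estimate of
    P(object inside the box [lo, hi] per dimension), where
    `sample_point()` draws one realization of the uncertain object."""
    hits = sum(
        all(l <= x <= h for x, l, h in zip(sample_point(), lo, hi))
        for _ in range(trials)
    )
    return hits / trials

# Uncertain 2-D object: Gaussian around (0, 0) with sigma = 1 (assumed pdf).
draw = lambda: (random.gauss(0, 1), random.gauss(0, 1))
print(prob_in_range(draw, lo=(-1, -1), hi=(1, 1)))  # ~0.68**2 ~ 0.47
```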

8.
9.
This paper presents the Argonauts multi-agent framework, which was developed as part of a one-year student project at Technische Universität Dortmund. The Argonauts framework builds on a BDI approach to model rational agents that act cooperatively in a dynamic and nondeterministically changing environment. However, our agent model extends the traditional BDI approach in several aspects, most notably by incorporating motivation into the agent's goal selection mechanism. The framework was applied by the Argonauts team in the 2010 edition of the annual multi-agent programming contest organized by Technische Universität Clausthal. In this paper, we present a high-level specification and analysis of the actual system used for solving the given scenario. We do this by applying the GAIA methodology, a high-level and iterative approach to modeling communication and roles in multi-agent scenarios. We further describe the technical details and insights gained during our participation in the multi-agent programming contest.
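One way to picture motivation-aware goal selection (a speculative sketch; the Argonauts mechanism is not specified in this abstract, and the utility-times-motivation scoring below is our assumption): filter goals by their preconditions against current beliefs, then pick the goal with the highest motivation-weighted score.

```python
def select_goal(goals, beliefs):
    """Toy motivation-extended BDI goal selection: score each applicable
    goal by utility weighted by the agent's current motivation level."""
    applicable = [g for g in goals if g["precondition"](beliefs)]
    return max(applicable, key=lambda g: g["utility"] * g["motivation"],
               default=None)

goals = [
    {"name": "explore", "utility": 0.4, "motivation": 0.9,
     "precondition": lambda b: True},
    {"name": "repair",  "utility": 0.8, "motivation": 0.3,
     "precondition": lambda b: b["damaged"]},
]
print(select_goal(goals, {"damaged": True})["name"])  # explore (0.36 > 0.24)
```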

10.
The increasing popularity of location-based social networks encourages more and more users to share their experiences, and this content deeply influences customers' decisions when shopping, traveling, and so on. This paper studies the problem of top-K valuable-document queries over geo-textual data streams. Many researchers have studied this problem; however, they do not consider the reliability of documents, and unreliable documents may mislead customers into making improper decisions. In addition, existing approaches lack the ability to prune documents with low representativeness. In order to increase user satisfaction in recommendation systems, we propose a novel framework named PDS. It first employs an efficient machine learning technique named ELM to prune unreliable documents, and then uses a novel index named \(\mathcal {GH}\) to maintain documents. This index maintains a group of pruning values to filter low-quality documents, and it utilizes the unique property of the sliding window to further enhance PDS performance. Theoretical analysis and extensive experimental results demonstrate the effectiveness of the proposed algorithms.
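A minimal sketch of the sliding-window side of such a system (a plain deque-based stand-in; the \(\mathcal{GH}\) index, its pruning values, and the ELM reliability filter are not reproduced): keep only the last `window` documents and answer top-K over them by score.

```python
import heapq
from collections import deque

class SlidingTopK:
    """Toy sliding-window top-K over a document stream: keep the last
    `window` (doc_id, score) pairs and answer top-K by score."""
    def __init__(self, window, k):
        self.window, self.k, self.docs = window, k, deque()

    def add(self, doc_id, score):
        self.docs.append((doc_id, score))
        if len(self.docs) > self.window:
            self.docs.popleft()          # expire the oldest document

    def topk(self):
        return heapq.nlargest(self.k, self.docs, key=lambda d: d[1])

w = SlidingTopK(window=3, k=2)
for i, s in enumerate([0.2, 0.9, 0.1, 0.7]):
    w.add(i, s)
print(w.topk())  # [(1, 0.9), (3, 0.7)]
```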

11.
Bézier surfaces are mathematical tools employed in a wide variety of applications. Some works in the literature propose parallelization strategies to improve performance for the computation of Bézier surfaces. These approaches, however, are mainly focused on graphics applications and often are not directly applicable to other domains. In this work, we propose a new method for the computation of Bézier surfaces, together with approaches to efficiently map the method onto different platforms (CPUs, discrete and integrated GPUs). Additionally, we explore CPU-GPU cooperation mechanisms for computing Bézier surfaces using two integrated heterogeneous systems with different characteristics. An exhaustive performance evaluation is performed, covering different data types, rendering, and several hardware platforms. The results show that our method achieves speedups as high as 3.12x (double precision) and 2.47x (single precision) on CPU, and 3.69x (double precision) and 13.14x (single precision) on GPU, compared to other methods in the literature. On heterogeneous platforms, CPU-GPU cooperation increases performance by up to 2.09x with respect to the GPU-only version. Our method and the associated parallelization approaches can be easily employed in domains other than computer graphics (e.g., image registration, bio-mechanical modeling, and flow simulation), and extended to other Bézier formulations and Bézier constructions of higher order than surfaces.
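For orientation, here is the textbook reference computation the paper accelerates (the standard tensor-product Bernstein evaluation, not the paper's optimized method): a point on a degree \((n, m)\) Bézier surface is a Bernstein-weighted sum of the control-point grid.

```python
import numpy as np
from math import comb

def bezier_surface_point(P, u, v):
    """Evaluate a tensor-product Bezier surface at (u, v) via Bernstein
    polynomials.  P has shape (n+1, m+1, 3): the control-point grid."""
    n, m = P.shape[0] - 1, P.shape[1] - 1
    Bu = np.array([comb(n, i) * u**i * (1 - u)**(n - i) for i in range(n + 1)])
    Bv = np.array([comb(m, j) * v**j * (1 - v)**(m - j) for j in range(m + 1)])
    return np.einsum("i,j,ijk->k", Bu, Bv, P)   # sum_ij Bu[i]*Bv[j]*P[i,j]

# Bilinear patch (degree 1 x 1): its midpoint is the average of the corners.
P = np.array([[[0, 0, 0], [0, 1, 0]],
              [[1, 0, 0], [1, 1, 1]]], dtype=float)
print(bezier_surface_point(P, 0.5, 0.5))  # [0.5 0.5 0.25]
```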

12.
Reference data and accuracy assessments via error matrices build the foundation for measuring the success of classifications. An error matrix is often based on the traditional holdout method, which utilizes only one training/test dataset. If the training/test dataset does not fully represent the variability in a population, accuracy may be over- or under-estimated. Furthermore, reference data may be flawed by spatial errors or autocorrelation that may lead to overoptimistic results. For a forest study, we first corrected spatially erroneous ground data and then used aerial photography to sample additional reference data around the field-sampled plots (Mannel, S., Hua, D. and Price, M., 2006. A method to obtain large quantities of reference data. International Journal of Remote Sensing, 27: 623-627). These reference data were used to classify forest cover and subsequently determine classification success. Cross-validation randomly separates datasets into several training/test sets and is well documented to provide a more precise accuracy measure than the traditional holdout method. However, random cross-validation of autocorrelated data may overestimate accuracy, which in our case was by between 5% and 8% for a 90% confidence interval. In addition, we observed accuracies differing by up to 35% for different land cover classes depending on which training/test datasets were used. The observed discrepancies illustrate the need to pay attention to autocorrelation and to utilize more than one permanent training/test dataset, for example, through a k-fold holdout method.
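The holdout-versus-cross-validation point is easy to demonstrate in code (a hedged illustration on synthetic data, not the paper's forest imagery or its spatially autocorrelated samples): a single holdout split yields one accuracy number, while k-fold cross-validation exposes the spread across splits.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for the classified forest-cover reference data.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Traditional holdout: one training/test split, one accuracy estimate.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
holdout_acc = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)

# 10-fold cross-validation: several training/test sets, mean and spread.
cv_scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
print(f"holdout: {holdout_acc:.3f}")
print(f"10-fold: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```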

13.
Recently, a great deal of research work has been devoted to the development of algorithms to estimate the intrinsic dimensionality (id) of a given dataset, that is, the minimum number of parameters needed to represent the data without information loss. id estimation is important for several reasons: the capacity and the generalization capability of discriminant methods depend on it; id is necessary information for any dimensionality reduction technique; in neural network design, the number of hidden units in the encoding middle layer should be chosen according to the id of the data; and the id value is strongly related to the model order in a time series, which is crucial for obtaining reliable time series predictions. Although many estimation techniques have been proposed in the literature, most of them fail on noisy data or compute underestimated values when the id is sufficiently high. In this paper, after reviewing some of the most important id estimators related to our work, we provide a theoretical motivation of the bias that causes the underestimation effect, and we present two id estimators based on the statistical properties of manifold neighborhoods, developed in order to reduce this effect. We exhaustively evaluate the proposed techniques on synthetic and real datasets, employing an objective evaluation measure to compare their performance with that achieved by state-of-the-art algorithms; the results show that the proposed methods are promising and produce reliable estimates even in the difficult case of datasets drawn from non-linearly embedded manifolds characterized by high id.
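To ground the task, here is one of the standard neighborhood-based baselines such papers compare against (the Levina-Bickel maximum-likelihood estimator, not one of the two new estimators proposed here), which infers id from the ratios of k-nearest-neighbor distances:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_id(X, k=10):
    """Levina-Bickel MLE intrinsic dimensionality estimator: per-point
    mean log-ratio of the k-th NN distance to the closer NN distances,
    aggregated over all points."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)
    dist = dist[:, 1:]                       # drop the point itself
    inv = np.log(dist[:, -1:] / dist[:, :-1]).sum(axis=1) / (k - 1)
    return float(1.0 / inv.mean())           # averaged inverse estimate

rng = np.random.default_rng(0)
# A 2-D plane linearly embedded in 5-D space: the estimate should be ~2.
X = rng.normal(size=(2000, 2)) @ rng.normal(size=(2, 5))
print(mle_id(X))
```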

14.
Efficient processing of high-dimensional similarity joins plays an important role in a wide variety of data-driven applications. In this paper, we consider the $\varepsilon$-join variant of the problem. Given two $d$-dimensional datasets and a parameter $\varepsilon$, the task is to find all pairs of points, one from each dataset, that are within $\varepsilon$ distance of each other. We propose a new $\varepsilon$-join algorithm, called Super-EGO, which belongs to the EGO family of join algorithms. The new algorithm gains its advantage by using a novel data-driven dimensionality re-ordering technique, by developing a new EGO-strategy that more aggressively avoids unnecessary computation, and by developing a parallel version of the algorithm. We study the newly proposed Super-EGO algorithm on large real and synthetic datasets. The empirical study demonstrates a significant advantage of the proposed solution over existing state-of-the-art techniques.
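A hedged baseline for the problem statement (a simple grid-hash join in the general spirit of grid-based $\varepsilon$-joins, not Super-EGO's re-ordering or EGO-strategy): hash one dataset into cells of side $\varepsilon$, so each point of the other dataset only probes its $3^d$ neighboring cells.

```python
from collections import defaultdict
from itertools import product
import math

def epsilon_join(A, B, eps):
    """Naive grid-based epsilon-join: report all (p, q) in A x B with
    Euclidean distance <= eps."""
    d = len(A[0])
    grid = defaultdict(list)
    for q in B:
        grid[tuple(math.floor(x / eps) for x in q)].append(q)
    out = []
    for p in A:
        cell = tuple(math.floor(x / eps) for x in p)
        for off in product((-1, 0, 1), repeat=d):   # probe 3^d neighbor cells
            for q in grid[tuple(c + o for c, o in zip(cell, off))]:
                if math.dist(p, q) <= eps:
                    out.append((p, q))
    return out

A = [(0.0, 0.0), (5.0, 5.0)]
B = [(0.1, 0.0), (4.0, 4.0)]
print(epsilon_join(A, B, eps=0.5))  # [((0.0, 0.0), (0.1, 0.0))]
```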

15.
16.
Scaling skyline queries over high-dimensional datasets remains challenging because most existing algorithms assume dimensional independence when establishing worst-case complexity, discarding the correlation distribution of the data. In this paper, we present HashSkyline, a systematic and correlation-aware approach for scaling skyline queries over high-dimensional datasets with three novel features. First, it offers a fast hash-based method to prune non-skyline points by utilizing data correlation characteristics, speeding up overall skyline evaluation for correlated datasets. Second, we develop \(HashSkyline_{GPU}\), which can dramatically reduce response time for anti-correlated and independent datasets by capitalizing on the parallel processing power of GPUs. Third, HashSkyline uses a pivot cell-based mechanism combined with a correlation threshold to determine the correlation distribution characteristics of a given dataset, enabling adaptive configuration of HashSkyline for skyline query evaluation by auto-switching between \(HashSkyline_{CPU}\) and \(HashSkyline_{GPU}\). We evaluate the validity of HashSkyline using both synthetic and real datasets. Our experiments show that HashSkyline incurs significantly less pre-processing cost and achieves significantly higher overall query performance than existing state-of-the-art algorithms.
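For readers new to the query, here is the classic block-nested-loops baseline that correlation-aware methods like HashSkyline improve upon (a reference implementation of the skyline definition, not HashSkyline's hashing): a point is in the skyline iff no other point is at least as good in every dimension and strictly better in one (smaller taken as better).

```python
def skyline(points):
    """Block-nested-loops skyline: keep points not dominated by any
    other point, assuming smaller is better in every dimension."""
    def dominates(p, q):
        return (all(a <= b for a, b in zip(p, q))
                and any(a < b for a, b in zip(p, q)))
    result = []
    for p in points:
        if any(dominates(q, p) for q in result):
            continue                                  # p is dominated
        result = [q for q in result if not dominates(p, q)] + [p]
    return result

pts = [(1, 9), (3, 3), (4, 4), (9, 1)]
print(skyline(pts))  # [(1, 9), (3, 3), (9, 1)]; (4, 4) is dominated by (3, 3)
```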

17.
The goal of fairness-aware classification is to categorize data while taking into account potential issues of fairness, discrimination, neutrality, and/or independence. For example, when applying data mining technologies to university admissions, admission criteria must be non-discriminatory and fair with regard to sensitive features such as gender or race. In this context, such fairness can be formalized as statistical independence between classification results and sensitive features. The main purpose of this paper is to analyze this formal fairness in order to achieve better trade-offs between fairness and prediction accuracy, which is important for applying fairness-aware classifiers in practice. We focus on one fairness-aware classifier, Calders and Verwer's two-naive-Bayes (CV2NB) method, which has been shown to be superior to other classifiers in terms of fairness. We hypothesize that this superiority is due to the difference in the type of independence involved: because CV2NB achieves actual independence, rather than satisfying model-based independence like the other classifiers, it can account for model bias and a deterministic decision rule. We empirically validate this hypothesis by modifying two fairness-aware classifiers, a prejudice remover method and a reject option-based classification (ROC) method, so as to satisfy actual independence. The fairness of these two modified methods was drastically improved, showing the importance of maintaining actual independence rather than model-based independence. We additionally extend an approach adopted in the ROC method so as to make it applicable to classifiers other than those with generative models, such as SVMs.
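The statistical independence being targeted is directly measurable on a classifier's outputs. A minimal sketch (a standard statistical-parity check over predicted labels, not CV2NB itself): compare positive-prediction rates across the sensitive groups; a gap of zero corresponds to actual independence of the decisions from the sensitive feature.

```python
def statistical_parity_gap(y_pred, sensitive):
    """Difference in positive-prediction rates between the two sensitive
    groups (0 and 1); 0.0 means the predictions are group-independent."""
    rate = lambda g: (
        sum(p for p, s in zip(y_pred, sensitive) if s == g)
        / sum(1 for s in sensitive if s == g)
    )
    return abs(rate(0) - rate(1))

y_pred    = [1, 0, 1, 1, 0, 0, 1, 0]
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]
print(statistical_parity_gap(y_pred, sensitive))  # |0.75 - 0.25| = 0.5
```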

18.
Many real-world knowledge-based systems must deal with information coming from different sources, which invariably leads to incompleteness, overspecification, or inherently uncertain content. The presence of these varying levels of uncertainty does not mean that the information is worthless; rather, these are hurdles that the knowledge engineer must learn to work with. In this paper, we continue work on an argumentation-based framework that extends the well-known Defeasible Logic Programming (DeLP) language with probabilistic uncertainty, giving rise to the Defeasible Logic Programming with Presumptions and Probabilistic Environments (DeLP3E) model. Our prior work focused on the problem of belief revision in DeLP3E, for which we proposed a non-prioritized class of revision operators called AFO (Annotation Function-based Operators). In this paper, we further study this class and argue that in some cases it may be desirable to define revision operators that take quantitative aspects into account, such as how the probabilities of certain literals or formulas of interest change after the revision takes place. To the best of our knowledge, this problem has not been addressed in the argumentation literature to date. We propose the QAFO (Quantitative Annotation Function-based Operators) class of operators, a subclass of AFO, and then study the complexity of several problems related to their specification and application in revising knowledge bases. Finally, we present an algorithm for computing the probability that a literal is warranted in a DeLP3E knowledge base, and discuss how it could be applied towards implementing QAFO-style operators that compute approximations rather than exact values.
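A speculative sketch of the approximation idea only (the paper's exact algorithm and the DeLP3E warrant machinery are not reproduced; the independence of environment atoms and the `is_warranted` callback are our assumptions): sample possible worlds of the probabilistic environment and count those in which the literal is warranted.

```python
import random

def warrant_probability(env_probs, is_warranted, trials=20_000, rng=random):
    """Monte Carlo estimate of the probability that a literal is
    warranted: sample worlds over independent environment atoms and
    query a (stand-in) argumentation-based warrant check per world."""
    hits = 0
    for _ in range(trials):
        world = {atom: rng.random() < p for atom, p in env_probs.items()}
        hits += is_warranted(world)
    return hits / trials

# Toy: the literal is warranted whenever 'rain' holds and 'strike' does not.
env = {"rain": 0.7, "strike": 0.2}
print(warrant_probability(env, lambda w: w["rain"] and not w["strike"]))  # ~0.56
```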

19.
Nowadays, clustering of massive datasets is a crucial part of many data-analytic tasks. Most available clustering algorithms have two shortcomings when used on big data: (1) a large group of clustering algorithms, e.g. \(k\)-means, has to keep the data in memory and iterate over the data many times, which is very costly for big datasets; (2) clustering algorithms that run within limited memory, especially the family of stream-clustering algorithms, lack parallel implementations that utilize modern multi-core processors, and they also lack decent quality of results. In this paper, we propose an algorithm that combines parallel clustering with single-pass, stream-clustering algorithms. The aim is a clustering algorithm that utilizes the maximum capabilities of a regular multi-core PC to cluster the dataset as fast as possible while producing acceptable cluster quality. Our idea is to split the data into chunks and cluster each chunk in a separate thread; the clusters extracted from the chunks are then aggregated at the final stage using re-clustering. Parameters of the algorithm can be adjusted according to hardware limitations. Experimental results on a 12-core computer show that the proposed method is much faster than its batch-processing equivalents (e.g. \(k\)-means++) and stream-based algorithms. Also, the quality of the solution is often equal to that of \(k\)-means++, while it significantly dominates stream-clustering algorithms. Our solution also scales well with extra available cores and hence provides an effective and fast solution for clustering large datasets on multi-core and multi-processor systems.
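A hedged sketch of the chunk-then-aggregate scheme under simplifying assumptions (plain \(k\)-means per chunk in worker processes, rather than the single-pass stream-clustering components the paper combines): cluster each chunk in parallel, then re-cluster the per-chunk centroids into the final centers.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor
from sklearn.cluster import KMeans

def cluster_chunk(chunk, k):
    return KMeans(n_clusters=k, n_init=10).fit(chunk).cluster_centers_

def parallel_chunk_cluster(X, k=5, n_chunks=4):
    """Split X into chunks, cluster each chunk in its own worker, then
    re-cluster the per-chunk centroids into the final k centers."""
    chunks = np.array_split(X, n_chunks)
    with ProcessPoolExecutor(max_workers=n_chunks) as pool:
        centers = list(pool.map(cluster_chunk, chunks, [k] * n_chunks))
    return KMeans(n_clusters=k, n_init=10).fit(np.vstack(centers)).cluster_centers_

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(loc=c, size=(5000, 2)) for c in (0, 10, 20, 30, 40)])
    print(parallel_chunk_cluster(X, k=5))   # ~the five generating centers
```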

20.
Twitter (http://twitter.com) is one of the most popular social networking platforms. Twitter users can easily broadcast disaster-specific information which, if effectively mined, can assist in relief operations. However, the brevity and informal nature of tweets pose a challenge to Information Retrieval (IR) researchers. In this paper, we successfully use word embedding techniques to improve ranking for ad-hoc queries on microblog data. Our experiments with the 'Social Media for Emergency Relief and Preparedness' (SMERP) dataset, provided at an ECIR 2017 workshop, show that these techniques outperform conventional term-matching-based IR models. In addition, we show that, for the SMERP task, our word embedding-based method is more effective when the embeddings are generated from the disaster-specific SMERP data than when they are trained on the large social media collection provided for the TREC (http://trec.nist.gov/) 2011 Microblog track dataset.
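A minimal sketch in the spirit of embedding-based ranking (the paper's exact SMERP retrieval model is not reproduced; the toy 2-D vectors stand in for embeddings trained on the disaster-specific data): represent the query and each document by the mean of their word vectors and rank by cosine similarity.

```python
import numpy as np

def rank_by_embedding(query, docs, emb):
    """Rank documents against a query by cosine similarity of mean
    word vectors.  `emb` is any word -> vector mapping (e.g. one
    trained with word2vec on the target collection)."""
    def vec(text):
        vs = [emb[w] for w in text.lower().split() if w in emb]
        return (np.mean(vs, axis=0) if vs
                else np.zeros(next(iter(emb.values())).shape))
    cos = lambda a, b: float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    q = vec(query)
    return sorted(docs, key=lambda d: cos(q, vec(d)), reverse=True)

emb = {"flood": np.array([1.0, 0.2]), "rescue": np.array([0.9, 0.4]),
       "concert": np.array([-0.8, 0.6])}
docs = ["rescue teams deployed", "concert tickets on sale"]
print(rank_by_embedding("flood rescue", docs, emb))  # rescue doc ranks first
```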
