1.
We propose an approximate computation technique for inter-object distances in binary data sets. Our approach is based on locality-sensitive hashing. We randomly select a number of projections of the data set and group objects into buckets based on the hash values of these projections. For each pair of objects, occurrences in the same bucket are counted, and the exact Hamming distance is approximated from the number of co-occurrences across all buckets. We parallelize the computation using two main schemes. The first assigns each random subspace to a processor for calculating a local co-occurrence matrix; all local co-occurrence matrices are then combined into the final co-occurrence matrix. The second method provides the same distance approximation at longer runtimes by limiting the total message size in a parallel computing environment, which is especially useful for very large data sets that generate immense message traffic. Our methods produce very accurate results, scale well with the number of objects, and tolerate processor failures. Experimental evaluations on supercomputers and multiprocessor workstations demonstrate the usefulness of our methods.
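The bucket-counting estimate above can be sketched on a single machine as follows (this is a minimal illustration, not the authors' parallel implementation): for each random k-bit projection, objects with equal projected values share a bucket, so the co-occurrence rate p estimates (1 - d/n)^k and the Hamming distance is recovered as d ≈ n(1 - p^(1/k)).

```python
import numpy as np

def approx_hamming(X, n_tables=200, k=4, seed=0):
    """Approximate all pairwise Hamming distances between the binary
    rows of X.  For each random k-bit projection, objects with equal
    projected values share a bucket; the co-occurrence rate p estimates
    (1 - d/n)**k, so d is recovered as n * (1 - p**(1/k))."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    cooc = np.zeros((m, m))
    for _ in range(n_tables):
        bits = rng.choice(n, size=k, replace=False)
        keys = X[:, bits]
        # objects agreeing on every selected bit fall into the same bucket
        cooc += (keys[:, None, :] == keys[None, :, :]).all(axis=2)
    p = np.clip(cooc / n_tables, 1e-12, 1.0)
    return n * (1.0 - p ** (1.0 / k))
```

Identical rows always co-occur (distance 0), while rows that disagree everywhere never do, so the estimate saturates near n for them.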
2.
Scalable pattern mining with Bayesian networks as background knowledge
We study a discovery framework in which background knowledge on variables and their relations within a discourse area is available in the form of a graphical model. Starting from an initial, hand-crafted or possibly empty graphical model, the network evolves in an interactive process of discovery. We focus on the central step of this process: given a graphical model and a database, we address the problem of finding the most interesting attribute sets. We formalize the interestingness of an attribute set as the divergence between its behavior as observed in the data and the behavior that can be explained by the current model. We derive an exact algorithm that finds all attribute sets whose interestingness exceeds a given threshold. We then consider the case of a very large network that renders exact inference infeasible, and a very large database or data stream. We devise an algorithm that efficiently finds the most interesting attribute sets with a prescribed approximation bound and confidence probability, even for very large networks and infinite streams. We study the scalability of the methods in controlled experiments; a case study sheds light on the practical usefulness of the approach.
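A toy version of the interestingness score can be sketched as the Kullback-Leibler divergence between the empirical distribution of an attribute set and the distribution the background model assigns to it; here `model_prob` is a hypothetical callable standing in for inference in the graphical model.

```python
import math
from collections import Counter

def interestingness(rows, model_prob):
    """KL divergence between the empirical distribution of an attribute
    set (observed in `rows`, tuples of attribute values) and the
    distribution the background model assigns (`model_prob` maps a
    value tuple to a probability).  Larger values mean the data
    diverges more from what the current model explains."""
    n = len(rows)
    counts = Counter(rows)
    kl = 0.0
    for pattern, c in counts.items():
        p_emp = c / n
        kl += p_emp * math.log(p_emp / model_prob(pattern))
    return kl
```

Under an independence model, perfectly correlated bits score log 2 of divergence; a model that matches the data scores 0, so the attribute set is uninteresting.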
3.
The (n,k,s)-perceptrons partition the input space V ⊆ R^n into s+1 regions using s parallel hyperplanes. Their learning abilities are examined here. The previously studied homogeneous (n,k,k-1)-perceptron learning algorithm is generalized to the permutably homogeneous (n,k,s)-perceptron learning algorithm, which has a guaranteed convergence property. We also introduce a high-capacity learning method that learns any permutably homogeneously separable k-valued function given as input.
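The decision rule (not the learning algorithm) can be sketched directly from the definition: one weight vector and s ordered thresholds define s parallel hyperplanes w·x = t_i, and the output is the index of the region containing x.

```python
import numpy as np

def nks_perceptron(x, w, thresholds):
    """Decision rule of an (n,k,s)-perceptron: the s ordered thresholds
    define s parallel hyperplanes w.x = t_i, which split the input
    space into s+1 regions; the output is the index of the region
    containing x (a k-valued label when s = k-1)."""
    t = np.asarray(thresholds)
    return int(np.searchsorted(t, np.dot(w, x), side="right"))
```

For example, with w = (1, 1) and thresholds (0.5, 1.5), the three regions correspond to activations below 0.5, between the hyperplanes, and above 1.5.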
4.
We introduce purity dependencies as generalizations of functional dependencies in relational databases, starting from the notion of an impurity measure. The impurity of a subset of a set relative to a partition of that set, and the relative impurity of two partitions, allow us to define the relative impurity of two attribute sets of a table of a relational database and to introduce purity dependencies. We discuss properties of these dependencies that generalize similar properties of functional dependencies, and we highlight their relevance for approximate classifications. Finally, an algorithm that mines data sets for these dependencies is presented. Received: 4 July 2000 / 16 November 2001
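The relative-impurity idea can be sketched with the Gini index standing in for a generic impurity measure: the impurity of attribute set Y relative to the partition induced by X is 0 exactly when the functional dependency X → Y holds, and small positive values indicate an approximate (purity) dependency. The table layout here (a list of dicts) is an assumption for illustration.

```python
from collections import Counter, defaultdict

def gini(values):
    """Gini impurity of a multiset of values."""
    n = len(values)
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def relative_impurity(table, X, Y):
    """Impurity of attribute set Y relative to the partition that X
    induces on the rows: the size-weighted Gini impurity of the
    Y-values within each X-block.  It is 0 iff the functional
    dependency X -> Y holds exactly."""
    blocks = defaultdict(list)
    for row in table:
        blocks[tuple(row[a] for a in X)].append(tuple(row[a] for a in Y))
    n = len(table)
    return sum(len(b) / n * gini(b) for b in blocks.values())
```

A table where each a-value maps to one b-value scores 0; a block whose two rows disagree on b contributes a Gini impurity of 0.5.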
5.
Evaluation of automatic text summarization is a challenging task due to the difficulty of calculating the similarity of two texts. In this paper, we define a new dissimilarity measure, compression dissimilarity, to compute the dissimilarity between documents, and we propose a new automatic evaluation method based on it. The proposed method is a complete "black box" and needs no preprocessing steps. Experiments show that compression dissimilarity clearly distinguishes automatic summaries from human summaries. The compression-dissimilarity measure can evaluate an automatic summary by comparing it with high-quality human summaries, or with its original document. The evaluation results are highly correlated with human assessments, and the correlation between the compression dissimilarity of summaries and that of documents can serve as a meaningful measure of the consistency of an automatic text summarization system.
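Since the measure operates on raw text as a black box, a normalized-compression-distance-style sketch with zlib conveys the idea (the paper's exact formula may differ): texts that share content compress better together, so similar documents score lower.

```python
import zlib

def compression_dissimilarity(x: bytes, y: bytes) -> float:
    """Normalized compression distance as a stand-in for the paper's
    compression dissimilarity: the cost of compressing the
    concatenation beyond the cheaper text alone, normalized by the
    costlier text.  Lower values mean more shared content."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Comparing a document with itself yields a score near 0, while unrelated texts score markedly higher, which is the property the evaluation method relies on.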
6.
Increasing interest in new pattern recognition methods has been motivated by bioinformatics research. The analysis of gene expression data originating from microarrays constitutes an important application area for classification algorithms and illustrates the need to identify important predictors. We show that the Goodman-Kruskal coefficient can be used for constructing minimal classifiers for tabular data, and we give an algorithm that constructs such classifiers.
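A minimal sketch of rating a single attribute as a class predictor with a Goodman-Kruskal-style coefficient (the paper's exact coefficient and classifier-construction algorithm may differ): lambda measures the proportional reduction in classification error from knowing the attribute, so lambda = 1 means the attribute alone determines the class.

```python
from collections import Counter, defaultdict

def goodman_kruskal_lambda(pairs):
    """Goodman-Kruskal lambda over (attribute value, class) pairs:
    the proportional reduction in classification error obtained by
    predicting the modal class per attribute value instead of the
    overall modal class.  1.0 means the attribute determines the
    class; 0.0 means knowing it does not help."""
    n = len(pairs)
    class_counts = Counter(y for _, y in pairs)
    baseline_err = n - max(class_counts.values())
    if baseline_err == 0:
        return 1.0  # a single class: nothing left to predict
    by_value = defaultdict(Counter)
    for x, y in pairs:
        by_value[x][y] += 1
    err = n - sum(max(c.values()) for c in by_value.values())
    return (baseline_err - err) / baseline_err
```

Ranking attributes by this score and keeping those with lambda near 1 is one plausible route to small classifiers of the kind the abstract describes.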