Found 20 similar documents; search took 15 ms
1.
2.
3.
The need to store and query a set of strings – a string dictionary – arises in many kinds of applications. While classically these string dictionaries have accounted for a small share of the total space budget (e.g., in Natural Language Processing or when indexing text collections), recent applications in Web engines, Semantic Web (RDF) graphs, Bioinformatics, and many others handle very large string dictionaries, whose size is a significant fraction of the whole data. In these cases, string dictionary management is a scalability issue by itself. This paper focuses on the problem of managing large static string dictionaries in compressed main memory space. We revisit classical solutions for string dictionaries like hashing, tries, and front-coding, and improve them by using compression techniques. We also introduce some novel string dictionary representations built on top of recent advances in succinct data structures and full-text indexes. All these structures are empirically compared on a heterogeneous testbed formed by real-world string dictionaries. We show that the compressed representations may use as little as 5% of the original dictionary size, while supporting lookup operations within a few microseconds. These numbers outperform the state-of-the-art space/time tradeoffs in many cases. Furthermore, we enhance some representations to provide prefix- and substring-based searches, which also perform competitively. The results show that compressed string dictionaries are a useful building block for various data-intensive applications in different domains.
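The abstract above names front-coding as one of the classical string dictionary techniques it revisits. As a rough illustration only, here is a minimal sketch of plain front-coding over a sorted word list; the function names `front_encode` and `front_decode` are hypothetical, and the sketch does not reproduce the compressed variants the paper evaluates.

```python
from typing import List, Tuple


def front_encode(sorted_words: List[str]) -> List[Tuple[int, str]]:
    """Encode each word as (shared-prefix length with previous word, remaining suffix).

    Assumes the input is already sorted, so neighbours share long prefixes.
    """
    encoded = []
    prev = ""
    for word in sorted_words:
        # Length of the longest common prefix of prev and word.
        lcp = 0
        limit = min(len(prev), len(word))
        while lcp < limit and prev[lcp] == word[lcp]:
            lcp += 1
        encoded.append((lcp, word[lcp:]))
        prev = word
    return encoded


def front_decode(encoded: List[Tuple[int, str]]) -> List[str]:
    """Rebuild the original word list from (prefix length, suffix) pairs."""
    words = []
    prev = ""
    for lcp, suffix in encoded:
        word = prev[:lcp] + suffix
        words.append(word)
        prev = word
    return words


words = ["car", "card", "care", "cat"]
codes = front_encode(words)
print(codes)
print(front_decode(codes) == words)
```

Practical implementations usually store a full string at the start of each fixed-size bucket so that a lookup only has to decode one bucket; that bucketing is left out here for brevity.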
4.
Karl J. Lieberherr 《LISP and Symbolic Computation》1988,1(2):185-212
A class dictionary defines all data structures that appear in a program as well as a language for describing data specified by the data structures. We demonstrate that class dictionaries are ideal for simplifying object-oriented programming. Our class dictionary-based approach to object-oriented programming is independent of any particular programming language, so it is applicable to a large variety of object-oriented systems. The experience in designing and using over one hundred class dictionaries has resulted in a set of useful design techniques. This novel approach to object-oriented programming makes interesting links between language design, data structure design, and data-base design.
5.
《Data Processing》1984,26(6):17-19
A data dictionary describes not only the computer processes and data within a computer system but also the business processes and data which they represent. In this way a computer system actually mirrors the way the user's business works and is not merely a reflection of the way the systems analysts perceive it. The use of entity modelling for building the data dictionary and the development of application generators are also described.
6.
Multimedia Tools and Applications - In this paper we present an improved method for single image super-resolution (SISR). The improvement of our method is mainly attributed to the features that we...
7.
S.M. Lucas 《Pattern recognition letters》1996,17(14):551-1512
A new method of searching large dictionaries given uncertain inputs is described, based on the lazy evaluation of a syntactic neural network (SNN). The new method is shown to significantly outperform a conventional trie-based method for large dictionaries (e.g. in excess of 100,000 entries). Results are presented for the problem of recognising UK postcodes using dictionary sizes of up to 1 million entries. Most significantly, it is demonstrated that the SNN actually gets faster as more data is loaded into it.
8.
《Pattern recognition》2014,47(2):899-913
Dictionary learning is a critical issue for achieving discriminative image representation in many computer vision tasks such as object detection and image classification. In this paper, a new algorithm is developed for learning discriminative group-based dictionaries, where the inter-concept (category) visual correlations are leveraged to enhance both the reconstruction quality and the discrimination power of the group-based discriminative dictionaries. A visual concept network is first constructed for determining the groups of visually similar object classes and image concepts automatically. For each group of such visually similar object classes and image concepts, a group-based dictionary is learned for achieving discriminative image representation. A structural learning approach is developed to take advantage of our group-based discriminative dictionaries for classifier training and image classification. The effectiveness and the discrimination power of our group-based discriminative dictionaries have been evaluated on multiple popular visual benchmarks.
9.
《Information & Management》1986,10(1):21-46
Information resource dictionary systems (data dictionary systems) play an important role in two phases of information resource management. First, information requirements analysis and specification, which is a complex activity requiring data dictionary support: the end result is the specification of an “Enterprise Model,” which embodies the major activities, processes, information flows, organizational constraints, and concepts. This role is examined in detail after analyzing the existing approaches to requirements analysis and specification. Second, information modeling, which uses the information in the Enterprise Model to construct a formal, implementation-independent database specification: several information models and support tools that may aid in transforming the initial requirements into the final logical database design are examined. The metadata (knowledge about both data and processes) contained in the data dictionary can be used to provide views of data for the specialized tools that make up the database design workbench. The role of data dictionary systems in the integration of tools is discussed.
10.
Shuyuan Yang Min Wang Meirong Wei Licheng Jiao 《Engineering Applications of Artificial Intelligence》2012,25(6):1259-1264
In this paper, a multiscale overcomplete dictionary learning approach is proposed for image denoising by exploiting the multiscale property and sparse representation of images. The images are first sparsely represented by a translation-invariant dictionary, and the coefficients are then denoised using learned multiscale dictionaries. Dictionary learning can be reduced to a non-convex $\ell_0$-norm minimization problem with multiple variables, so an evolution-enhanced algorithm is proposed to optimize the variables alternately. Experiments compare the proposed method with its counterparts on benchmark natural images, and its advantages can be observed in both the visual results and several numerical measures.
11.
12.
G. K. Golubev 《Problems of Information Transmission》2009,45(4):378-392
Assume that we observe a Gaussian vector $Y = X\beta + \sigma\zeta$, where $X$ is a known $p \times n$ matrix with $p \ge n$, $\beta \in \mathbb{R}^n$ is an unknown vector, and $\zeta \in \mathbb{R}^n$ is a standard Gaussian white noise. The problem is to reconstruct $X\beta$ from observations $Y$, provided that $\beta$ is a sparse vector.
13.
We present a dynamic comparison-based search structure that supports insertions, deletions, and searches within the unified bound. The unified bound specifies that it is quick to access an element that is near a recently accessed element. More precisely, if $w(y)$ distinct elements have been accessed since the last access to element $y$, and $d(x,y)$ denotes the rank distance between $x$ and $y$ among the current set of elements, then the amortized cost to access element $x$ is $O(\min_y \log[w(y) + d(x,y) + 2])$. This property generalizes the working-set and dynamic-finger properties of splay trees.
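To make the unified bound concrete, the following sketch evaluates the quantity inside the O(·) for a small example. It only computes the bound and does not implement the search structure itself. The function name `unified_bound`, the dictionary `working_set` (mapping each previously accessed element y to w(y)), and the choice of base-2 logarithm are assumptions made for illustration.

```python
import math
from typing import Dict, List


def unified_bound(x: int, elements: List[int], working_set: Dict[int, int]) -> float:
    """Return min over y of log2(w(y) + d(x, y) + 2).

    elements    -- the current set of keys (any order)
    working_set -- hypothetical map: y -> number of distinct elements
                   accessed since the last access to y, i.e. w(y)
    """
    # Rank of each key in sorted order; d(x, y) is the difference in ranks.
    rank = {key: i for i, key in enumerate(sorted(elements))}
    return min(
        math.log2(w + abs(rank[x] - rank[y]) + 2)
        for y, w in working_set.items()
    )


elements = [10, 20, 30, 40, 50]
# 10 was accessed most recently (w = 0); 50 was accessed three distinct elements ago.
working_set = {10: 0, 50: 3}
print(unified_bound(20, elements, working_set))
```

Here accessing 20 is cheap because it is at rank distance 1 from the just-accessed key 10, so the minimum is attained at y = 10 with value log2(0 + 1 + 2).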
14.
《Data Processing》1984,26(6):14-16
Data dictionaries are becoming increasingly useful to DP departments because of their valid and up-to-date information on the state of data. Benefits include increased programmer productivity, less data redundancy and greater security of information.
15.
Valentin Alvarez-Ramos Volodymyr Ponomaryov Rogelio Reyes-Reyes 《Multimedia Tools and Applications》2018,77(11):13487-13511
In image processing, the super-resolution (SR) technique plays an important role in producing high-resolution (HR) images from acquired low-resolution (LR) images. In this paper, a novel technique is proposed that can generate an SR image from a single LR input image. The designed framework can be used on images of different kinds. To reconstruct an HR image, an intermediate step is necessary, which consists of an initial interpolation; next, features are extracted from this initial image via a convolution operation. Then, principal component analysis (PCA) is used to reduce information redundancy after the feature extraction step. Non-overlapping blocks are extracted, and for each block a sparse representation is computed, which is later used to recover the HR image. The proposed technique has been evaluated using objective quality criteria and subjective visual perception, demonstrating competitive performance in comparison with state-of-the-art methods.
16.
Rosalia Maglietta 《International Journal of General Systems》2013,42(8):854-882
This paper focuses on the problem of how data representation influences the generalization error of kernel-based learning machines like support vector machines (SVMs). We analyse the effects of sparse and dense data representations on the generalization error of SVMs. We show that, using sparse representations, the performance of classifiers belonging to hypothesis spaces induced by polynomial or Gaussian kernel functions reduces to the performance of linear classifiers. Sparse representations reduce the generalization error as long as the representation is not too sparse, as it is with very large dictionaries. Dense data representations reduce the generalization error even when very large dictionaries are used. We use two schemes for representing data in data-independent overcomplete Haar and Gabor dictionaries, and measure the generalization error of SVMs on benchmark datasets. We study sparse and dense representations in the case of data-dependent overcomplete dictionaries and we show how this leads to principal component analysis.
17.
In this paper, a novel algorithm is proposed to achieve robust high-resolution detection in sparse multipath channels. Currently used sparse reconstruction techniques are not immediately applicable to multipath channel modeling. The performance of standard compressed sensing formulations based on discretization of the multipath channel parameter space degrades significantly when the actual channel parameters deviate from the assumed discrete set of values. To alleviate this off-grid problem, we use particle swarm optimization (PSO) to perturb each grid point that resides in each multipath component cluster. Orthogonal matching pursuit (OMP) is used to reconstruct sparse multipath components in a greedy fashion. Extensive simulation results quantify the performance gain and robustness obtained by the proposed algorithm against the off-grid problem faced in sparse multipath channels.
18.
The task of matching observations of the same person in disjoint views captured by non-overlapping cameras is known as the person re-identification problem. It is challenging owing to low-quality images, inter-object occlusions, and variations in illumination, viewpoints and poses. Unlike previous approaches that learn Mahalanobis-like distance metrics, we propose a novel approach based on dictionary learning that takes advantage of sparse coding to encode features representing different people discriminatively and in a cross-view-invariant manner. Firstly, we propose a robust and discriminative feature extraction method for different feature levels. The feature representations are projected to a common subspace of lower computational cost. Secondly, we learn a single cross-view-invariant dictionary for each feature level across the different camera views, and a fusion strategy is used to generate the final matching results. Experimental results show the superior performance of our approach compared with state-of-the-art methods on two publicly available benchmark datasets, VIPeR and PRID 2011.
19.
20.
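Objective: Most automatic fingerprint identification systems are based on minutiae matching, so system performance depends on the quality of the input fingerprint. Poor input fingerprint quality is the main problem facing current automatic fingerprint identification systems. To improve system performance and enhance low-quality fingerprints, a fingerprint enhancement method based on sparse representation over multi-scale classification dictionaries is proposed. Method: First, a training set of high-quality fingerprint samples is constructed, and multi-scale classification dictionaries are learned from it. Second, the fingerprint image is pre-enhanced with linear contrast stretching, yielding a pre-enhanced fingerprint. Then, the pre-enhanced fingerprint is partitioned into blocks in the spatial domain, and block quality is evaluated and graded according to the orientation consistency of the points within each block. Finally, a block spectrum enhancement model based on sparse representation over the classification dictionaries is built in the frequency domain; using the block quality grading mechanism and a composite window strategy, combined with spectrum diffusion, the block spectra are enhanced with the multi-scale classification dictionaries. Results: The proposed algorithm was compared experimentally with two traditional fingerprint enhancement algorithms on the FVC2004 fingerprint database. Both visual and quantitative results show that, compared with the traditional algorithms, the proposed method is more robust and effectively improves the quality of low-quality input fingerprints. Conclusion: By introducing the fingerprint ridge pattern prior into classification dictionary learning, a more reliable dictionary is learned separately for the fingerprint blocks of each orientation class, so that the learned classification dictionaries carry more reliable ridge pattern information. The block quality grading mechanism and the composite window strategy not only aid spectrum diffusion and improve the spectral quality of low-quality blocks, but also allow the multi-scale classification dictionaries to be applied successfully. This overcomes the trade-off between enhancement accuracy and noise robustness, makes the block enhancement results more stable and reliable, and significantly improves the enhancement quality of low-quality fingerprint images.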
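The pre-enhancement step in the abstract above uses linear contrast stretching. The abstract does not give the exact formula, so the following is only a generic min-max stretch on a grayscale image stored as nested lists; the function name `linear_contrast_stretch` and the default output range 0 to 255 are assumptions.

```python
from typing import List


def linear_contrast_stretch(
    image: List[List[int]], out_min: int = 0, out_max: int = 255
) -> List[List[int]]:
    """Map the darkest pixel to out_min and the brightest to out_max, linearly."""
    flat = [pixel for row in image for pixel in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return a constant image.
        return [[out_min for _ in row] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [
        [round(out_min + (pixel - lo) * scale) for pixel in row]
        for row in image
    ]


# A low-contrast 2x2 patch occupying only the range 50..200.
patch = [[50, 100], [150, 200]]
print(linear_contrast_stretch(patch))
```

A plain min-max stretch is sensitive to outlier pixels; percentile-based limits are a common alternative, though nothing in the abstract indicates which variant the authors used.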