20 similar documents found (search time: 31 ms)
1.
Recognizing acronyms and their definitions (Total citations: 1; self: 0; others: 1)
Kazem Taghva Jeff Gilbreth 《International Journal on Document Analysis and Recognition》1999,1(4):191-198
This paper introduces an automatic method for finding acronyms and their definitions in free text. The method is based on
an inexact pattern matching algorithm applied to text surrounding the possible acronym. Evaluation shows both high recall
and precision for a set of documents randomly selected from a larger set of full text documents.
Received October 1, 1997 / Revised September 8, 1998
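The abstract does not give the inexact pattern-matching algorithm itself; the following is a rough, exact-match-only sketch of the general idea (the function name and the first-letter heuristic are illustrative simplifications, not the paper's method):

```python
import re

def find_acronym_definitions(text):
    """Find (ACRONYM, definition) pairs where a parenthesized acronym
    follows a phrase whose word initials spell the acronym exactly."""
    pairs = {}
    for match in re.finditer(r'\(([A-Z]{2,6})\)', text):
        acronym = match.group(1)
        # take the preceding words as the candidate definition window
        window = text[:match.start()].split()[-len(acronym):]
        initials = ''.join(w[0].upper() for w in window)
        if initials == acronym:
            pairs[acronym] = ' '.join(window)
    return pairs
```

A real system would additionally tolerate skipped words, stop words, and partial letter matches, which is where the inexact matching comes in.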
2.
We optimize relational queries using connection hypergraphs (CHGs). All operations including value-passing between SQL blocks
can be set-oriented. By introducing partial evaluations, reordering operations can be achieved for nested queries. For a query
using views, we merge CHGs for the views and the query into one CHG and then apply query optimization. Furthermore, we may
simulate magic sets methods elegantly in a CHG. Sideways information-passing strategies (SIPS) in a CHG amount to partial
evaluations of SIPS paths. We introduce the maximum SIPS strategy, which performs SIPS for all bindings and all SIPS paths
for a query. The new method has several advantages. First, the maximum SIPS strategy can be more efficient than the previous
SIPS based on simple heuristics. Second, it is conceptually simple and easy to implement. Third, the processing strategies
may be incorporated with the search space for query execution plans, which is a proven optimization strategy introduced by
System R. Fourth, it provides a general framework of query optimization and may potentially be used to optimize next-generation
database systems.
Received September 1, 1993 / Accepted January 8, 1996
3.
Approximate query mapping: Accounting for translation closeness (Total citations: 2; self: 0; others: 2)
Kevin Chen-Chuan Chang Héctor García-Molina 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(2-3):155-181
In this paper we present a mechanism for approximately translating Boolean query constraints across heterogeneous information
sources. Achieving the best translation is challenging because sources support different constraints for formulating queries,
and often these constraints cannot be precisely translated. For instance, a query [score>8] might be “perfectly” translated
as [rating>0.8] at some site, but can only be approximated as [grade=A] at another. Unlike other work, our general framework
adopts a customizable “closeness” metric for the translation that combines both precision and recall. Our results show that
for query translation we need to handle interdependencies among both query conjuncts and disjuncts. As the basis, we
identify the essential requirements of a rule system for users to encode the mappings for atomic semantic units. Our algorithm
then translates complex queries by rewriting them in terms of the semantic units. We show that, under practical assumptions,
our algorithm generates the best approximate translations with respect to the closeness metric of choice. We also present
a case study to show how our technique may be applied in practice.
Received: 15 October 2000 / Accepted: 15 April 2001 Published online: 28 June 2001
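The closeness metric is customizable in the paper's framework; as one plausible instance, a weighted harmonic mean of the precision and recall of a translated query's answer set against the ideal answer set (the names and the `alpha` parameter are illustrative assumptions):

```python
def closeness(ideal, translated, alpha=0.5):
    """Combine precision and recall of a translated query's answer set
    into one closeness score via a weighted harmonic mean."""
    ideal, translated = set(ideal), set(translated)
    if not ideal or not translated:
        return 0.0
    hits = len(ideal & translated)
    if hits == 0:
        return 0.0
    precision = hits / len(translated)   # how much of the answer is wanted
    recall = hits / len(ideal)           # how much of the wanted set is returned
    return 1.0 / (alpha / precision + (1 - alpha) / recall)
```

With `alpha` near 1 the metric penalizes false hits (favoring tighter translations); near 0 it penalizes misses (favoring looser ones).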
4.
Query by video clip (Total citations: 15; self: 0; others: 15)
Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries
that involve a video clip (say, a 10-s video segment). We propose two schemes: (i) retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from a video, and then extracting image features
around the key frames. For each key frame in the query, a similarity value (using color, texture, and motion) is obtained
with respect to the key frames in the database video. Consecutive key frames in the database video that are highly similar
to the query key frames are then used to generate the set of retrieved video clips. (ii) In retrieval using sub-sampled frames, we uniformly sub-sample the query clip as well as the database video. Retrieval is based on matching color and texture features
of the sub-sampled frames. Initial experiments on two video databases (basketball video with approximately 16,000 frames and
a CNN news video with approximately 20,000 frames) show promising results. Additional experiments using segments from one
basketball video as query and a different basketball video as the database show the effectiveness of feature representation
and matching schemes.
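The second scheme (retrieval using sub-sampled frames) can be sketched minimally as follows, assuming frames reduced to comparable feature values and a hypothetical per-frame similarity function:

```python
def subsample(frames, step):
    """Uniformly sub-sample a frame sequence."""
    return frames[::step]

def clip_similarity(query_frames, db_frames, frame_sim):
    """Slide the (sub-sampled) query over the database video and return
    the best average per-frame similarity over all alignments."""
    q, n = len(query_frames), len(db_frames)
    best = 0.0
    for start in range(n - q + 1):
        score = sum(frame_sim(a, b)
                    for a, b in zip(query_frames, db_frames[start:start + q])) / q
        best = max(best, score)
    return best
```

In the paper the per-frame similarity combines color and texture features; here any callable comparing two frame representations will do.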
5.
Stephan Greene Egemen Tanin Catherine Plaisant Ben Shneiderman Lola Olsen Gene Major Steve Johns 《International Journal on Digital Libraries》1999,2(2-3):79-90
The Human-Computer Interaction Laboratory (HCIL) of the University of Maryland and NASA have collaborated over three years
to refine and apply user interface research concepts developed at HCIL in order to improve the usability of NASA data services.
The research focused on dynamic query user interfaces, visualization, and overview + preview designs. An operational prototype,
using query previews, was implemented with NASA’s Global Change Master Directory (GCMD), a directory service for earth science
datasets. Users can see the histogram of the data distribution over several attributes and choose among attribute values.
A result bar shows the cardinality of the result set, thereby preventing users from submitting queries that would have zero
hits. Our experience confirmed the importance of metadata accuracy and completeness. The query preview interfaces make visible
the problems or gaps in the metadata that are undetectable with classic form fill-in interfaces. This could be seen as a problem,
but we think that it will have a long-term beneficial effect on the quality of the metadata as data providers will be compelled
to produce more complete and accurate metadata. The adaptation of the research prototype to the NASA data required revised
data structures and algorithms.
Received: 12 December 1997 / Revised: June 1999
6.
UnQL: a query language and algebra for semistructured data based on structural recursion (Total citations: 5; self: 0; others: 5)
Peter Buneman Mary Fernandez Dan Suciu 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(1):76-110
Abstract. This paper presents structural recursion as the basis of the syntax and semantics of query languages for semistructured data
and XML. We describe a simple and powerful query language based on pattern matching and show that it can be expressed using
structural recursion, which is introduced as a top-down, recursive function, similar to the way XSL is defined on XML trees.
On cyclic data, structural recursion can be defined in two equivalent ways: as a recursive function which evaluates the data
top-down and remembers all its calls to avoid infinite loops, or as a bulk evaluation which processes the entire data in parallel
using only traditional relational algebra operators. The latter makes it possible for optimization techniques in relational
queries to be applied to structural recursion. We show that the composition of two structural recursion queries can be expressed
as a single such query, and this is used as the basis of an optimization method for mediator systems. Several other formal
properties are established: structural recursion can be expressed in first-order logic extended with transitive closure; its
data complexity is PTIME; and over relational data it is a conservative extension of the relational calculus. The underlying
data model is based on value equality, formally defined with bisimulation. Structural recursion is shown to be invariant with
respect to value equality.
Received: July 9, 1999 / Accepted: December 24, 1999
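The first evaluation strategy, a top-down recursive function that remembers its calls to avoid infinite loops on cyclic data, can be sketched as follows (the edge-labeled adjacency representation and the function name are assumptions for illustration):

```python
def reachable_labels(graph, root):
    """Collect all edge labels reachable from root in edge-labeled,
    possibly cyclic semistructured data (node -> [(label, child), ...])."""
    seen = set()     # remembered calls: visiting a seen node again is a no-op
    labels = set()
    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for label, child in graph.get(node, []):
            labels.add(label)
            visit(child)
    visit(root)
    return labels
```

The equivalent bulk evaluation would compute the same result with relational-algebra operators over the whole edge set, which is what opens the door to relational optimization.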
7.
The use of massive image databases has increased drastically over the past few years due to the evolution of multimedia technology, and image retrieval has become one of the vital tools in image processing applications. Content-Based Image Retrieval (CBIR) is widely used in varied applications, but the results produced using a single image feature are not satisfactory, so multiple image features are often combined to attain better results. Fast and effective searching for relevant images in a database then becomes a challenging task. A previous CBIR system used a combined feature extraction technique based on the color auto-correlogram, Rotation-Invariant Uniform Local Binary Patterns (RULBP) and local energy. However, that system does not provide significant results in terms of recall and precision, and the computational complexity of existing CBIR systems is high. To handle these issues, the Gray Level Co-occurrence Matrix (GLCM) with a Deep Learning based Enhanced Convolutional Neural Network (DLECNN) is proposed in this work. The proposed framework includes noise reduction using histogram equalization, feature extraction using the GLCM, similarity matching using the Hierarchical and Fuzzy c-Means (HFCM) algorithm, and image retrieval using the DLECNN algorithm. Histogram equalization is used for image enhancement, so that the enhanced image has a uniform histogram. The GLCM method is then used to extract features such as shape, texture, colour, annotations and keywords. The HFCM similarity measure computes the query image vector's similarity index with every database image. To enhance the performance of this image retrieval approach, the DLECNN algorithm is proposed to retrieve more accurate image features.
The proposed GLCM+DLECNN algorithm provides better results in terms of accuracy, precision, recall and F-measure, with lower complexity. The experimental results clearly show that the proposed system provides efficient image retrieval for a given query image.
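The GLCM itself is a standard construction; a minimal sketch for a small integer-valued image follows, together with one classic texture feature (contrast) derived from it. The DLECNN and HFCM stages are not reproduced here, and the names are illustrative:

```python
def glcm(image, levels, dx=1, dy=0):
    """Gray-level co-occurrence matrix: counts of intensity pairs (i, j)
    separated by the pixel offset (dx, dy)."""
    h, w = len(image), len(image[0])
    m = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                m[image[y][x]][image[ny][nx]] += 1
    return m

def contrast(m):
    """A classic Haralick texture feature computed from the GLCM."""
    total = sum(sum(row) for row in m)
    return sum(m[i][j] * (i - j) ** 2
               for i in range(len(m)) for j in range(len(m))) / total
```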
8.
Kashif Iqbal Michael O. Odetayo Anne James 《Journal of Computer and System Sciences》2012,78(4):1258-1277
In this paper, we discuss a new content-based image retrieval approach for biometric security, which is based on colour, texture and shape features and controlled by fuzzy heuristics. The proposed approach is based on the three well-known algorithms: colour histogram, texture and moment invariants. The use of these three algorithms ensures that the proposed image retrieval approach produces results which are highly relevant to the content of an image query, by taking into account the three distinct features of the image and similarity metrics based on Euclidean measure. Colour histogram is used to extract the colour features of an image. Gabor filter is used to extract the texture features and the moment invariant is used to extract the shape features of an image. The evaluation of the proposed approach is carried out using the standard precision and recall measures, and the results are compared with the well-known existing approaches. We present results which show that our proposed approach performs better than these approaches.
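The colour-histogram component with a Euclidean similarity metric can be sketched as follows (the bin count and RGB-tuple pixel representation are illustrative choices, not the paper's exact settings):

```python
import math

def colour_histogram(pixels, bins=8):
    """Quantize (r, g, b) pixels into a normalized bins**3 histogram."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    n = len(pixels)
    return [v / n for v in hist]

def euclidean(h1, h2):
    """Euclidean distance between two feature vectors (0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))
```

Ranking a database by ascending distance to the query histogram gives the retrieval order; the paper additionally fuses texture (Gabor) and shape (moment-invariant) features under fuzzy control.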
9.
Aya Soffer Hanan Samet 《The VLDB Journal The International Journal on Very Large Data Bases》1998,7(4):253-274
Symbolic images are composed of a finite set of symbols that have a semantic meaning. Examples of symbolic images include
maps (where the semantic meaning of the symbols is given in the legend), engineering drawings, and floor plans. Two approaches
for supporting queries on symbolic-image databases that are based on image content are studied. The classification approach
preprocesses all symbolic images and attaches a semantic classification and an associated certainty factor to each object
that it finds in the image. The abstraction approach describes each object in the symbolic image by using a vector consisting
of the values of some of its features (e.g., shape, genus, etc.). The approaches differ in the way in which responses to queries
are computed. In the classification approach, images are retrieved on the basis of whether or not they contain objects that
have the same classification as the objects in the query. On the other hand, in the abstraction approach, retrieval is on
the basis of similarity of feature vector values of these objects. Methods of integrating these two approaches into a relational
multimedia database management system so that symbolic images can be stored and retrieved based on their content are described.
Schema definitions and indices that support query specifications involving spatial as well as contextual constraints are presented.
Spatial constraints may be based on both locational information (e.g., distance) and relational information (e.g., north of).
Different strategies for image retrieval for a number of typical queries using these approaches are described. Estimated costs
are derived for these strategies. Results are reported of a comparative study of the two approaches in terms of image insertion
time, storage space, retrieval accuracy, and retrieval time.
Received June 12, 1998 / Accepted October 13, 1998
10.
Image retrieval using interest-point convex hulls and SVM-weighted relevance feedback (Total citations: 4; self: 0; others: 4)
To address the shortcomings of image retrieval methods based on annular color histograms, this paper proposes an image feature extraction method based on convex hulls of interest points. Interest points are detected with a wavelet transform and their convex hulls are computed recursively; the interest points on each hull are assigned to buckets by a fixed rule, a color histogram is computed for each bucket, and the similarity of two images is defined from the bucket-to-bucket similarities. This feature extraction method effectively suppresses stray (isolated) interest points in the interest-point set. Combined with features such as the spatial dispersion of the interest points and Gabor wavelet texture, it yields image retrieval with noticeably higher precision. Finally, a new relevance feedback method is proposed, which improves query-point-movement feedback by assigning weights derived from support vector machine classification results. Experiments on a real image database show that introducing this feedback method raises retrieval precision by about 20% and recall by about 10%.
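The recursive convex-hull peeling of interest points can be sketched with the standard monotone-chain hull algorithm (the bucketing and per-bucket color histograms are omitted, and the names are hypothetical):

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull over 2-D points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_layers(points):
    """Recursively peel hulls off the interest-point set, as in the
    abstract's bucketing step."""
    layers, remaining = [], list(points)
    while remaining:
        hull = convex_hull(remaining)
        layers.append(hull)
        remaining = [p for p in remaining if p not in hull]
    return layers
```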
11.
Manabu Ohta Atsuhiro Takasu Jun Adachi 《International Journal on Digital Libraries》2000,3(2):140-151
Optical character reader (OCR) misrecognition is a serious problem when OCR-recognized text is used for retrieval purposes
in digital libraries. We have proposed fuzzy retrieval methods that, instead of correcting the errors manually, assume that
errors remain in the recognized text. Costs are thereby reduced. The proposed methods generate multiple search terms for each
input query term by referring to confusion matrices, which store all characters likely to be misrecognized and the respective
probability of each misrecognition. The proposed methods can improve recall rates without decreasing precision rates. However,
a few million search terms are occasionally generated in English-text fuzzy retrieval, severely degrading retrieval
speed. Therefore, this paper presents two remedies to reduce the number of generated search terms while maintaining retrieval
effectiveness. One remedy is to restrict the number of errors included in each expanded search term, while the other is to
introduce another validity value different from our conventional one. Experimental results indicate that the former remedy reduced
the number of terms to about 50 and the latter to not more than 20.
Received: 18 December 1998 / Revised: 31 May 1999
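The two remedies, bounding the number of errors per expanded term and applying a probability (validity) cutoff, can be sketched together as follows; the confusion-matrix representation and both thresholds are illustrative assumptions, not the paper's exact formulation:

```python
def expand_term(term, confusion, max_errors=1, min_prob=0.1):
    """Generate fuzzy search-term variants for OCR retrieval.
    confusion[c] maps a character to {misrecognized_as: probability};
    at most max_errors substitutions, each with probability >= min_prob."""
    variants = {term}
    def recurse(prefix, rest, errors):
        if not rest:
            variants.add(prefix)
            return
        c, tail = rest[0], rest[1:]
        recurse(prefix + c, tail, errors)            # keep the character
        if errors < max_errors:
            for sub, p in confusion.get(c, {}).items():
                if p >= min_prob:                    # validity-style cutoff
                    recurse(prefix + sub, tail, errors + 1)
    recurse("", term, 0)
    return variants
```

Raising `min_prob` or lowering `max_errors` shrinks the expanded term set, trading recall for retrieval speed, which is exactly the trade-off the paper tunes.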
12.
A model-driven approach for real-time road recognition (Total citations: 6; self: 0; others: 6)
This article describes a method designed to detect and track road edges starting from images provided by an on-board monocular
monochrome camera. Its implementation on specific hardware is also presented in the framework of the VELAC project. The method
is based on four modules: (1) detection of the road edges in the image by a model-driven algorithm, which uses a statistical
model of the lane sides which manages the occlusions or imperfections of the road marking – this model is initialized by an
off-line training step; (2) localization of the vehicle in the lane in which it is travelling; (3) tracking to define a new
search space of road edges for the next image; and (4) management of the lane numbers to determine the lane in which the vehicle
is travelling. The algorithm is implemented in order to validate the method in a real-time context. Results obtained on marked
and unmarked road images show the robustness and precision of the method.
Received: 18 November 2000 / Accepted: 7 May 2001
13.
Video clip localization algorithms based on gray-level ordinal features are the typical solution to the clip localization problem. They have two shortcomings: the features are not distinctive enough, so localization precision drops quickly once recall is high; and their quadratic time complexity makes response times too long and sensitive to the length of the query video. To address both problems, this paper proposes a video clip localization algorithm based on spatio-temporal ordinal features. Its key steps are: (1) before exact localization, a linear-time real-time filter based on the spatio-temporal binary pattern histogram (STBPH) and a fast filter based on the binary temporal ordinal measure (BTOM) greatly reduce the number of candidate clips that the exact localization stage must compare; (2) in the exact localization stage, sequence matching with the joint spatio-temporal ordinal measure (JSTOM), which is more distinctive while remaining robust, significantly improves localization precision. Experimental results show that the algorithm localizes video clips quickly and accurately and greatly reduces sensitivity to query video length.
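The ordinal (gray-level rank) features these algorithms build on can be sketched as follows: each frame is divided into blocks and represented by the rank order of its block-average intensities. This is a generic ordinal measure, not the STBPH/BTOM/JSTOM variants themselves, and it assumes frame dimensions divisible by the block count:

```python
def ordinal_signature(frame, blocks=3):
    """Rank pattern of block-average intensities for one frame.
    frame: 2-D list of intensities; blocks: grid size per axis."""
    h, w = len(frame), len(frame[0])
    means = []
    for by in range(blocks):
        for bx in range(blocks):
            vals = [frame[y][x]
                    for y in range(by * h // blocks, (by + 1) * h // blocks)
                    for x in range(bx * w // blocks, (bx + 1) * w // blocks)]
            means.append(sum(vals) / len(vals))
    # convert block means to their ranks (0 = darkest block)
    order = sorted(range(len(means)), key=lambda i: means[i])
    ranks = [0] * len(means)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks
```

Because only ranks survive, the signature is robust to global brightness changes; comparing rank vectors frame-by-frame gives the sequence-matching step.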
14.
This paper proposes an image retrieval method based on feature fusion. The HSV histogram of an image is used to build its color histogram, and a quadratic-form histogram distance gives a color similarity measure; a 256-dimensional LBP feature vector captures texture, with Euclidean distance as its similarity measure. The two measures are fused to obtain the similarity between the query image and each database image, yielding better retrieval results. Experiments show that in precision and...
15.
Due to the unconstrained nature of image segmentation, the existing thresholding methods require considerable human intervention
and pre-assumptions to determine appropriate threshold values. In this paper, a fully automatic thresholding method via histogram
modal decomposition by data-dependent-systems methodology is presented. In this method, the histogram of an image is parametrically
modeled by the power spectrum of an autoregressive model to provide vital information about histogram clusters. Utilizing
the modal information, threshold values are then selected to maximize the between-class variance. The proposed method is validated
by illustrative examples; comparison with the existing methods helps explain their differences and the superiority of the
approach.
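The data-dependent-systems modal decomposition is not reproduced here, but the final selection step, choosing the threshold that maximizes between-class variance, is Otsu's classic criterion and can be sketched directly from a histogram:

```python
def best_threshold(histogram):
    """Return the gray level t maximizing between-class variance when
    pixels are split into classes {0..t} and {t+1..}, per Otsu's criterion."""
    total = sum(histogram)
    weighted_sum = sum(i * h for i, h in enumerate(histogram))
    w0 = cum = 0.0
    best_t, best_var = 0, -1.0
    for t, h in enumerate(histogram[:-1]):
        w0 += h                      # mass of the lower class
        cum += t * h                 # weighted mass of the lower class
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = cum / w0, (weighted_sum - cum) / w1   # class means
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t
```

The abstract's contribution is obtaining the modal (cluster) information automatically; the variance-maximizing selection above is the standard criterion it then applies.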
16.
Searching in metric spaces by spatial approximation (Total citations: 5; self: 0; others: 5)
Gonzalo Navarro 《The VLDB Journal The International Journal on Very Large Data Bases》2002,11(1):28-46
We propose a new data structure to search in metric spaces. A metric space is formed by a collection of objects and a distance function defined among them which satisfies the triangle inequality. The goal is, given a set of objects and a query, to retrieve those
objects close enough to the query. The complexity measure is the number of distances computed to achieve this goal. Our data
structure, called sa-tree (“spatial approximation tree”), is based on approaching the searched objects spatially, that is, getting closer and closer
to them, rather than the classic divide-and-conquer approach of other data structures. We analyze our method and show that
the number of distance evaluations to search among n objects is sublinear. We show experimentally that the sa-tree is the best existing technique when the metric space is hard to search or the query has low selectivity. These are the most
important unsolved cases in real applications. As a practical advantage, our data structure is one of the few that does not
need to tune parameters, which makes it appealing for use by non-experts.
Edited by R. Sacks-Davis Received: 17 April 2001 / Accepted: 24 January 2002 / Published online: 14 May 2002
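The sa-tree itself is more involved; as a minimal illustration of the underlying principle, using the triangle inequality to discard objects without computing their distance to the query, here is single-pivot pruning (all names are hypothetical, and a real index stores many pivots):

```python
def build_index(objects, dist):
    """Precompute each object's distance to one pivot (objects[0])."""
    pivot = objects[0]
    return pivot, {obj: dist(obj, pivot) for obj in objects}

def range_search(objects, dist, index, query, radius):
    """Return objects within `radius` of `query`, skipping the real
    distance computation whenever the triangle-inequality lower bound
    |d(q,p) - d(o,p)| already exceeds the radius."""
    pivot, d_to_pivot = index
    d_qp = dist(query, pivot)
    results = []
    for obj in objects:
        if abs(d_qp - d_to_pivot[obj]) > radius:   # cannot qualify
            continue
        if dist(query, obj) <= radius:
            results.append(obj)
    return results
```

The complexity measure in the paper is exactly the number of `dist` calls this pruning avoids; the sa-tree replaces the pivot table with a tree that approaches the query spatially.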
17.
Abstract. The purpose of this study is to discuss existing fractal-based algorithms and propose novel improvements of these algorithms
to identify tumors in brain magnetic resonance (MR) images. Considerable research has been pursued on fractal geometry in various
aspects of image analysis and pattern recognition. Magnetic-resonance images typically have a degree of noise and randomness
associated with the natural random nature of structure. Thus, fractal analysis is appropriate for MR image analysis. For tumor
detection, we describe existing fractal-based techniques and propose three modified algorithms using fractal analysis models.
For each new method, the brain MR images are divided into a number of pieces. The first method involves thresholding the pixel
intensity values; hence, we call the technique piecewise-threshold-box-counting (PTBC) method. For the subsequent methods,
the intensity is treated as the third dimension. We implement the improved piecewise-modified-box-counting (PMBC) and piecewise-triangular-prism-surface-area
(PTPSA) methods, respectively. With the PTBC method, we find the differences in intensity histogram and fractal dimension
between normal and tumor images. Using the PMBC and PTPSA methods, we may detect and locate the tumor in the brain MR images
more accurately. Thus, the novel techniques proposed herein offer satisfactory tumor identification.
Received: 13 October 2001 / Accepted: 28 May 2002
Correspondence to: K.M. Iftekharuddin
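The piecewise variants (PTBC, PMBC, PTPSA) are not reproduced here; plain box counting, the base technique they modify, estimates fractal dimension from the slope of log N(s) versus log s (function and parameter names are illustrative):

```python
import math

def box_counting_dimension(points, sizes):
    """Estimate fractal dimension of a 2-D point set: count occupied
    boxes N(s) at each box size s, then fit log N(s) ~ -D log s."""
    logs, logn = [], []
    for s in sizes:
        boxes = {(int(x // s), int(y // s)) for x, y in points}
        logs.append(math.log(s))
        logn.append(math.log(len(boxes)))
    n = len(sizes)
    mean_x, mean_y = sum(logs) / n, sum(logn) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(logs, logn))
             / sum((x - mean_x) ** 2 for x in logs))
    return -slope   # D is the negated slope of the log-log fit
```

A space-filling patch should come out near dimension 2, a line near 1; tumor regions are flagged where the estimated dimension deviates from that of normal tissue.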
18.
Curvature scale space image in shape similarity retrieval (Total citations: 7; self: 0; others: 7)
19.
Approximate query processing using wavelets (Total citations: 7; self: 0; others: 7)
Kaushik Chakrabarti Minos Garofalakis Rajeev Rastogi Kyuseok Shim 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(2-3):199-223
Approximate query processing has emerged as a cost-effective approach for dealing with the huge data volumes and stringent
response-time requirements of today's decision support systems (DSS). Most work in this area, however, has so far been limited
in its query processing scope, typically focusing on specific forms of aggregate queries. Furthermore, conventional approaches
based on sampling or histograms appear to be inherently limited when it comes to approximating the results of complex queries
over high-dimensional DSS data sets. In this paper, we propose the use of multi-dimensional wavelets as an effective tool
for general-purpose approximate query processing in modern, high-dimensional applications. Our approach is based on building
wavelet-coefficient synopses of the data and using these synopses to provide approximate answers to queries. We develop novel query processing algorithms
that operate directly on the wavelet-coefficient synopses of relational tables, allowing us to process arbitrarily complex
queries entirely in the wavelet-coefficient domain. This guarantees extremely fast response times since our approximate query execution engine
can do the bulk of its processing over compact sets of wavelet coefficients, essentially postponing the expansion into relational
tuples until the end-result of the query. We also propose a novel wavelet decomposition algorithm that can build these synopses
in an I/O-efficient manner. Finally, we conduct an extensive experimental study with synthetic as well as real-life data sets
to determine the effectiveness of our wavelet-based approach compared to sampling and histograms. Our results demonstrate
that our techniques: (1) provide approximate answers of better quality than either sampling or histograms; (2) offer query
execution-time speedups of more than two orders of magnitude; and (3) guarantee extremely fast synopsis construction times
that scale linearly with the size of the data.
Received: 7 August 2000 / Accepted: 1 April 2001 Published online: 7 June 2001
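A one-dimensional Haar decomposition with a top-k coefficient synopsis illustrates the basic idea; the paper works with multi-dimensional wavelets and I/O-efficient construction, neither of which is sketched here, and the input length is assumed to be a power of two:

```python
def haar_decompose(data):
    """Unnormalized 1-D Haar transform: pairwise averages and details,
    repeated until one overall average remains."""
    coeffs = []
    while len(data) > 1:
        avgs = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        coeffs = details + coeffs
        data = avgs
    return data + coeffs   # [overall average, coarse..fine details]

def synopsis(coeffs, k):
    """Keep only the k largest-magnitude coefficients (the synopsis)."""
    keep = set(sorted(range(len(coeffs)),
                      key=lambda i: abs(coeffs[i]), reverse=True)[:k])
    return [c if i in keep else 0 for i, c in enumerate(coeffs)]

def haar_reconstruct(coeffs):
    """Invert the transform (exactly, or approximately from a synopsis)."""
    data, pos = coeffs[:1], 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(data)]
        data = [v for a, d in zip(data, details) for v in (a + d, a - d)]
        pos += len(details)
    return data
```

Reconstructing from the truncated synopsis yields the approximate answer; error concentrates where the dropped coefficients carried detail, which is why keeping the largest-magnitude coefficients works well.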
20.
Abstract. Providing a customized result set based upon a user preference is the ultimate objective of many content-based image retrieval
systems. There are two main challenges in meeting this objective: First, there is a gap between the physical characteristics
of digital images and the semantic meaning of the images. Secondly, different people may have different perceptions of the
same set of images. To address both these challenges, we propose a model, named Yoda, that conceptualizes content-based querying
as the task of soft classifying images into classes. These classes can overlap, and their members are different for different
users. The “soft” classification is hence performed for each and every image feature, including both physical and semantic
features. Subsequently, each image will be ranked based on the weighted aggregation of its classification memberships. The
weights are user-dependent, and hence different users would obtain different result sets for the same query. Yoda employs
a fuzzy-logic based aggregation function for ranking images. We show that, in addition to some performance benefits, fuzzy
aggregation is less sensitive to noise and can support disjunctive queries as compared to weighted-average aggregation used
by other content-based image retrieval systems. Finally, since Yoda heavily relies on user-dependent weights (i.e., user profiles)
for the aggregation task, we utilize the users' relevance feedback to improve the profiles using genetic algorithms (GA).
Our learning mechanism requires fewer user interactions, and results in a faster convergence to the user's preferences as
compared to other learning techniques.
Correspondence to: Y.-S. Chen (E-mail: yishinc@usc.edu)
This research has been funded in part by NSF grants EEC-9529152 (IMSC ERC) and IIS-0082826, NIH-NLM R01-LM07061, DARPA and
USAF under agreement nr. F30602-99-1-0524, and unrestricted cash gifts from NCR, Microsoft, and Okawa Foundation.