Found 19 similar documents; search took 109 ms
1.
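Object-level information retrieval has attracted growing attention and research. To address this problem, the authors design and implement DBORank, an object-level information retrieval method for relational databases that improves retrieval effectiveness over relational data. Approaching the problem from both the database and the information retrieval perspectives, DBORank uses a flexible scoring mechanism that accounts for the link structure of the object-level data graph as well as the internal structure of object nodes, the types and weights of edges, and the content relevance of objects. It also optimizes the iterative algorithm used for object scoring. Experiments show that DBORank achieves good retrieval effectiveness and efficiency.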
2.
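To address query ambiguity and redundant result presentation in tuple-level keyword search over relational databases, this paper proposes a method for clustering object-level retrieval results in relational databases. Taking an object-centered view, the method considers both the relevance and the diversity of retrieval results and clusters them at two levels: structure and content. First-level clustering tests retrieval results for isomorphism based on cover trees. Second-level clustering uses a kernel function to compute the content similarity of results within the same isomorphism class. The clustered result set is also updated dynamically. The method reduces redundancy in the presented results, increases the number of result categories available to the user, and improves the performance of the retrieval system.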
3.
4.
5.
6.
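SEEKER: Keyword-Based Information Retrieval in Relational Databases   Total citations: 20, self-citations: 3, citations by others: 20
Traditionally, SQL has been the main interface for accessing data in relational databases. For inexperienced users, however, learning complex SQL syntax is difficult. Keyword-based information retrieval over relational databases lets users obtain relevant data in the manner of a search engine, without any knowledge of SQL or of the underlying database schema. This paper describes the design and implementation of SEEKER, a keyword-based information retrieval system for relational databases. Existing keyword query systems for relational databases can search only text attributes, whereas SEEKER can also search database metadata and numeric attributes. In addition, SEEKER uses a more reasonable ranking formula and supports Top-k queries. Experimental results show that SEEKER has good query performance.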
7.
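In the real world, some objects are more general than others, and the similarity between two objects may be asymmetric. The similarity relation between two objects may be neither symmetric nor transitive; the authors call this a weak similarity relation. This paper proposes asymmetric redundant tuples for handling weak similarity relations in fuzzy relational databases. The concept of an asymmetric redundant tuple generalizes the concept of redundancy in fuzzy relational databases; it is used to remove redundant information and to represent information more precisely.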
8.
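Similarity-Based Approximate Querying of Rough Relational Databases   Total citations: 3, self-citations: 2, citations by others: 1
Based on database theory and rough set methods, this paper studies the storage, indexing, and retrieval of uncertain data in rough relational databases. It proposes storing the equivalence classes of attribute values in adjacency lists and the tuple data in orthogonal (cross-linked) lists. It proposes an indexing method for rough relational databases that uses Hamming distance and clustering. It also proposes computing the similarity between data items from the upper and lower approximations of rough sets, gives a similarity-based query model for rough relational databases, and designs the corresponding query algorithm. Finally, a concrete example illustrates the feasibility and effectiveness of the query algorithm.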
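As background for the rough-set terms in the abstract above, the following is a minimal sketch of the standard lower and upper approximations of a target set with respect to a partition into equivalence classes. It does not reproduce the paper's similarity measure or query algorithm; the function name `approximations` and the sample data are hypothetical.

```python
def approximations(classes, target):
    """Return (lower, upper) rough-set approximations of `target`.

    `classes` is an iterable of equivalence classes that partition the
    universe; `target` is the set being approximated. The names are
    illustrative and not taken from the paper.
    """
    target = set(target)
    lower, upper = set(), set()
    for cls in classes:
        cls = set(cls)
        if cls <= target:
            # Every member of the class certainly belongs to the target.
            lower |= cls
        if cls & target:
            # At least one member of the class possibly belongs to the target.
            upper |= cls
    return lower, upper


if __name__ == "__main__":
    classes = [{1, 2}, {3}, {4, 5}]
    lower, upper = approximations(classes, {1, 2, 4})
    print(sorted(lower), sorted(upper))
```

The lower approximation collects classes wholly contained in the target, and the upper approximation collects classes that overlap it, so the lower set is always a subset of the upper set.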
9.
10.
11.
Roberto Cornacchia Sándor Héman Marcin Zukowski Arjen P. de Vries Peter Boncz 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(1):151-168
The Matrix Framework is a recent proposal by Information Retrieval (IR) researchers to flexibly represent information retrieval
models and concepts in a single multi-dimensional array framework. We provide computational support for exactly this framework
with the array database system SRAM (Sparse Relational Array Mapping), that works on top of a DBMS. Information retrieval
models can be specified in its comprehension-based array query language, in a way that directly corresponds to the underlying
mathematical formulas. SRAM efficiently stores sparse arrays in (compressed) relational tables and translates and optimizes array queries into relational queries. In this work, we describe
a number of array query optimization rules. To demonstrate their effect on text retrieval, we apply them in the TREC TeraByte
track (TREC-TB) efficiency task, using the Okapi BM25 model as our example. It turns out that these optimization rules enable
SRAM to automatically translate the BM25 array queries into the relational equivalent of inverted list processing including
compression, score materialization and quantization, such as employed by custom-built IR systems. The use of the high-performance
MonetDB/X100 relational backend, that provides transparent database compression, allows the system to achieve very fast response
times with good precision and low resource usage.
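For readers unfamiliar with the Okapi BM25 model mentioned in the abstract above, here is a minimal in-memory sketch of BM25 scoring. It is not SRAM's array query language or its relational translation. The function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant with `+ 1` inside the logarithm are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    `docs` must be a non-empty list of token lists. Parameter defaults
    are common textbook choices, not values taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Assumed IDF variant that stays non-negative for frequent terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        ["array", "database", "query"],
        ["text", "retrieval", "query", "query"],
        ["array", "array", "storage"],
    ]
    print(bm25_scores(["array"], docs))
```

A system like the one described would evaluate the same formula over inverted lists rather than by scanning every document as this sketch does.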
12.
The synergistic application of CBR to IR   Total citations: 1, self-citations: 0, citations by others: 1
In this paper we discuss a hybrid approach combining Case-Based Reasoning (CBR) and Information Retrieval (IR) for the retrieval of full-text documents. Our hybrid CBR-IR approach takes as input a standard symbolic representation of a problem case and retrieves texts of relevant cases from a document collection dramatically larger than the case base available to the CBR system. Our system works by first performing a standard HYPO-style CBR analysis and then using the texts associated with certain important classes of cases found in this analysis to seed a modified version of INQUERY's relevance feedback mechanism in order to generate a query composed of individual terms or pairs of terms. Our approach provides two benefits: it extends the reach of CBR (for retrieval purposes) to much larger corpora, and it enables the injection of knowledge-based techniques into traditional IR. We describe our CBR-IR approach and report on on-going experiments. This research was supported by NSF Grant no. EEC-9209623, State/Industry/University Cooperative Research on Intelligent Information Retrieval, Digital Equipment Corporation, and the National Center for Automated Information Research.
13.
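Building on a study of existing text information retrieval techniques, this paper designs a text retrieval model based on inference networks. It proposes an improved inference algorithm that performs inference from document observation events to index term occurrence events, allowing the new model to make fuller use of the information in text data. Finally, an inference network example illustrates the mathematical process of the inference.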
14.
D. Lillis F. Toolan A. Mur L. Peng R. Collier J. Dunnion 《Artificial Intelligence Review》2006,25(1-2):179-191
Information Retrieval (IR) forms the basis of many information management tasks. Information management itself has become
an extremely important area as the amount of electronically available information increases dramatically. There are numerous
methods of performing the IR task both by utilising different techniques and through using different representations of the
information available to us. It has been shown that some algorithms outperform others on certain tasks. Combining the results
produced by different algorithms has resulted in superior retrieval performance and this has become an important research
area. This paper introduces a probability-based fusion technique probFuse that shows initial promise in addressing this question. It also compares probFuse with the common CombMNZ data fusion technique.
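To make the baseline in the abstract above concrete, here is a minimal sketch of CombMNZ: each document's normalized scores are summed across the input result sets, and the sum is multiplied by the number of result sets that returned the document. The min-max normalization, the function name `comb_mnz`, and the sample scores are assumptions for illustration. probFuse itself is not sketched, since the abstract does not specify it.

```python
def comb_mnz(runs):
    """Fuse several result sets with CombMNZ.

    `runs` is a list of dicts, each mapping a document id to the score
    one retrieval system assigned to it. Returns (doc, fused_score)
    pairs sorted by descending fused score, ties broken by doc id.
    """
    summed = {}
    hits = {}
    for run in runs:
        if not run:
            continue
        lo, hi = min(run.values()), max(run.values())
        span = hi - lo
        for doc, score in run.items():
            # Min-max normalize to [0, 1]; a run with identical scores
            # maps every document to 1.0 (an assumed convention).
            norm = (score - lo) / span if span else 1.0
            summed[doc] = summed.get(doc, 0.0) + norm
            hits[doc] = hits.get(doc, 0) + 1
    fused = [(doc, summed[doc] * hits[doc]) for doc in summed]
    return sorted(fused, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    runs = [
        {"a": 3.0, "b": 1.0, "c": 2.0},
        {"a": 0.5, "c": 1.0},
    ]
    for doc, score in comb_mnz(runs):
        print(doc, score)
```

Multiplying by the hit count rewards documents that several systems agree on, which is why a document found by both runs can outrank the top result of a single run.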
15.
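Query Expansion for Chinese Information Retrieval Using Manually and Automatically Generated Resources   Total citations: 4, self-citations: 0, citations by others: 4
In the research and practice of Chinese information retrieval, mismatches between the words in a query and the words in the document collection cause some relevant documents to go unretrieved; this is a key factor limiting retrieval effectiveness. This paper proposes and implements query expansion for Chinese information retrieval using manually and automatically generated resources. Experiments on the NTCIR-2 Chinese information retrieval test collection show that, compared with retrieval without query expansion, the method yields a statistically significant improvement in retrieval effectiveness.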
16.
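This paper first introduces the development of statistical language models (SLM) and the commonly used n-gram model. It then gives formal descriptions of the main models used in information retrieval, compares them, and points out their similarities and differences with one another and with traditional probabilistic retrieval methods. It analyzes the weaknesses of statistical language models, introduces possible improvements and recent research progress, and discusses their application to Chinese information retrieval and the challenges involved.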
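As an illustration of how a statistical language model can rank documents, here is a minimal sketch of unigram query-likelihood scoring with Jelinek-Mercer smoothing. This is one commonly used formulation, not necessarily one of the models the survey covers; the function name `query_log_likelihood` and the interpolation weight `lam=0.5` are assumptions.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection, lam=0.5):
    """Return log P(query | doc) under a smoothed unigram model.

    `query` and `doc` are token lists and `collection` is the non-empty
    concatenation of all documents' tokens. `lam` is the weight given
    to the collection model (an assumed default).
    """
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    score = 0.0
    for term in query:
        p_doc = doc_tf[term] / len(doc) if doc else 0.0
        p_coll = coll_tf[term] / len(collection)
        # Interpolate the document model with the collection model so
        # that terms missing from the document keep nonzero probability.
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term appears nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


if __name__ == "__main__":
    d1 = ["language", "model", "retrieval"]
    d2 = ["database", "query", "index"]
    collection = d1 + d2
    query = ["language", "model"]
    print(query_log_likelihood(query, d1, collection))
    print(query_log_likelihood(query, d2, collection))
```

Documents are ranked by this score in descending order. Without the collection term, any document missing a single query word would receive probability zero.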
17.
This paper reports on methodological considerations and the results of the Information Retrieval (IR) project PADOK I and II. PADOK has been carried out by the Linguistic Information Science Group of the University of Regensburg (LIR) since November 1984 and has been sponsored by the German Ministry for Research and Technology. The long-term objective is to integrate artificial intelligence topics and the methods of information retrieval research without neglecting traditional IR methodology. In PADOK we consider a type of mass data IR system which indexes its documents rather shallowly (freetext or morphological components) and adds an intelligent information retrieval component to this kernel system. So far we have obtained, on the basis of two large-scale retrieval tests of the German Patent Information System, results which show how the linguistically based functions of an indexing system contribute to its performance and indicate the most reasonable basic content analysis program for a German Patent Information System. This paper focusses on the general principles and aims of PADOK I and PADOK II and on the statistical evaluation of the retrieval tests.
Christa Womser-Hacker has a Ph.D. in Linguistic Information Science. From 1985 until 1990 she was involved in several LIR projects concerning text processing, evaluation of the German Patent Information System, man-machine interaction, and intelligent interfaces for databases. Since May 1990 she has been an LIR staff member. She is interested in information retrieval, (statistical) evaluation methods of man-machine interaction, and intelligent interfaces. She has published Der PADOK-Retrieval-test (1989) and Die statistische Auswertung des Retrievaltests (1990).
Jürgen Krause is professor of Linguistic Information Science at the University of Regensburg. He is a member of the editorial boards of the periodicals Computer and the Humanities and GLDV-Forum, and co-editor of Sprache and Computer. His research interests include office automation, artificial intelligence help systems, information retrieval, and evaluation of natural language systems. He is co-editor (with Christa Womser-Hacker) of Das Deutsche Patentinformationssystem, Entwicklungstendenzen, Retrievaltests and Bewertungen (1990) and co-editor of Computer Talk (1991).
18.