1.

张孝飞 黄河燕 陈肇雄 代六玲 《计算机工程》2007,33(11):166-167,212

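In cross-language information retrieval, the input query is usually a combination of keywords rather than a complete sentence, so the keyword sequence lacks the grammatical and contextual information needed for accurate query translation. Based on a large-scale bilingual corpus, and taking the vector space model and lexical co-occurrence mutual information as its theoretical foundation, this paper applies conventional monolingual information retrieval techniques to recast query translation as the computation of boost values for the dictionary senses of the query keywords, from which the target-language query is reconstructed.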

2.

Formulation of qualitative models using fuzzy logic

Narasimha Bolloju 《Decision Support Systems》1996,17(4):275

Formulation of qualitative models for complex decision problems exhibiting less structure, more imprecision and uncertainty is not adequately addressed in DSS research. Typical characteristics and requirements of such problems prohibit the development of DSS using knowledge based system development methodologies. This paper presents a methodology for formulation of qualitative models using fuzzy logic to handle the imprecision and uncertainty in the problem domain. The problem domain, in this methodology, is represented using problem-solving knowledge, environmental knowledge, and control knowledge components. A high level non-procedural language for representing these components of knowledge is illustrated using a project selection and resource allocation problem. The paper also describes the implementation of a prototype decision support environment based on this methodology.

3.

Cross-language query post-translation expansion based on SRCSAC evaluation framework mining

黄名选 朱丽娜 《控制与决策》2020,35(11):2787-2796

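A weighted association rule mining algorithm for query expansion, based on the evaluation framework SRCSAC (support-relevancy-chi-square analysis-confidence), is proposed, together with a cross-language query post-translation expansion model and a new method for computing expansion-term weights. On this basis, a cross-language query post-translation expansion algorithm based on SRCSAC framework mining is presented. The algorithm mines effective frequent itemsets using the support-relevancy framework and a new pruning strategy, extracts weighted association rules from those itemsets using the chi-square analysis-confidence framework, and obtains high-quality expansion terms from the rules according to the expansion model, thereby carrying out cross-language post-translation expansion. Experimental results show that the proposed algorithm effectively curbs query topic drift and term mismatch. Compared with the baseline retrieval, the minimum average MAP gains of its antecedent expansion, consequent expansion and hybrid expansion are 86.85%, 86.04% and 86.00%, respectively. Compared with the comparison methods, the minimum average MAP gains for long-query retrieval reach 12.23%, 9.06% and 12.6%, respectively, all higher than the gains for short-query retrieval. Compared with the consequent expansion algorithm, the maximum MAP gain of antecedent expansion and hybrid expansion reaches 5.5%. Confidence helps improve the retrieval performance of the antecedent and hybrid expansion algorithms, relevancy benefits the retrieval performance of the consequent expansion algorithm, and support and relevancy are more effective for short-query retrieval with the consequent expansion algorithm.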

4.

A study of the Information Storage and Retrieval course for the Information Management and Information Systems major

张继燕 欧莹元 《软件》2013,34(5):155-156

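Starting from an analysis of the objectives of the Information Management and Information Systems major, this paper establishes the place of the Information Storage and Retrieval course within the major. It then describes the multidisciplinary character of the course, analyzes the main textbooks currently used at universities, selects the one best suited to the major, and sets out the course content and the teaching approaches and methods based on the selected textbook.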

5.

An efficient implementation of trie structures

Jun-Ichi Aoe Katsushi Morimoto Takashi Sato 《Software》1992,22(9):695-721

A new internal array structure, called a double-array, implementing a trie structure is presented. The double-array combines the fast access of a matrix form with the compactness of a list form. The algorithms for retrieval, insertion and deletion are introduced through examples. Although insertion is rather slow, it is still practical, and both the deletion and the retrieval time can be improved from the list form. From the comparison with the list for various large sets of keys, it is shown that the size of the double-array can be about 17 per cent smaller than that of the list, and that the retrieval speed of the double-array can be from 3.1 to 5.1 times faster than that of the list.

6.

A query language for retrieving information from a soil data bank

V.J. Kollias N.J. Yassoglou J.G. Kollias 《Computers & Geosciences》1981,7(4):393-400

The paper presents specifications and implementation details of a query language designed for retrieving information from a soil data bank. The commands of the language are based on operations of relational algebra, and can be employed without previous programming experience. The language is part of the ARSIS (A Relational Soil Information System) system that is being developed in Greece.

7.

Research on open-domain question answering based on retrieval-augmented generation

《计算机科学》2025,52(6A)

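Large language models have made remarkable progress on natural language processing tasks, but their reliance on knowledge encapsulated in their parameters readily leads to hallucination. To mitigate this problem, retrieval-augmented generation uses information retrieval methods to reduce the risk of errors. However, the documents retrieved by existing methods often contain inaccurate or misleading information, and the assessment of document relevance lacks discriminative accuracy. To address these challenges, a simple and efficient method is designed that combines sparse retrieval with dense retrieval to capture both lexical overlap and semantic relevance. In addition, a ranker is introduced to rerank the retrieved candidate passages, and the sparse and dense retrieval scores are injected into the ranker's input, further improving the quality of the passage ranking. To verify the effectiveness of the proposed method, experiments are conducted on the SQuAD and HotpotQA datasets, with comparisons against existing baseline methods. The results show that the proposed method has a significant advantage in improving question-answering performance.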
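
As a rough illustration of combining sparse and dense retrieval scores, the sketch below min-max normalizes each score set and takes a weighted sum. This is a generic fusion scheme chosen for illustration, not the method of the paper above, which feeds both scores into a learned ranker. The function names, the `alpha` weight and the passage scores are all hypothetical.

```python
def min_max(scores):
    """Rescale a non-empty {passage_id: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no ranking signal, so map everything to 0.
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def fuse(sparse, dense, alpha=0.5):
    """Rank passages by alpha * sparse + (1 - alpha) * dense (normalized).

    A passage missing from one retriever contributes 0 for that retriever.
    """
    s, d = min_max(sparse), min_max(dense)
    fused = {
        pid: alpha * s.get(pid, 0.0) + (1 - alpha) * d.get(pid, 0.0)
        for pid in set(s) | set(d)
    }
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical scores: BM25-style values (sparse), cosine similarities (dense).
sparse = {"p1": 12.0, "p2": 7.5, "p3": 3.0}
dense = {"p1": 0.62, "p2": 0.91, "p4": 0.40}
ranking = fuse(sparse, dense)
```

Normalizing first matters because the two retrievers score on different scales; without it, the larger-valued sparse scores would dominate the sum.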

8.

Intelligent handling of data by integration of commonsense reasoning

E. T. Keravnou L. Johnson 《Knowledge》1987,1(1)

Any intelligent problem solving system should be able, given the known data on a case, to decide whether some item of information is true, false or unknown. In this paper the way in which various forms of commonsense reasoning can be integrated to provide such decisions is described. To this end three structural types of knowledge defined over data, and four strategies for exploiting these structures, are identified. ‘Decide-Status’ integrates the reasoning strategies into a task frame. This frame structure not only integrates the reasoning but also affords the appropriate facilities for providing strategic justifications for its conclusions, if required.

9.

A nearest-neighbor language model for information retrieval

韩中元 李生 齐浩亮 杨沐昀 《中文信息学报》2011,25(1):66-71

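Language models for information retrieval build a language model for each individual document and therefore suffer from a fairly severe data sparsity problem. This paper argues that a document's nearest-neighbor information reflects the distribution of words in the document more reasonably and helps alleviate data sparsity. It therefore incorporates this information into the smoothing algorithm of the language model and proposes the nearest-neighbor language model. The model's performance with different sources for neighbor selection is tested on typical document collections from the TREC evaluations: the US Department of Energy documents (DOE) and the Wall Street Journal (WSJ) data sets. The experimental results show that the nearest-neighbor language model yields a certain improvement in retrieval performance.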

10.

A probabilistic justification for using tf×idf term weighting in information retrieval

Djoerd Hiemstra 《International Journal on Digital Libraries》2000,3(2):131-139

This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms. This assumption is not made in well-known existing models of information retrieval, but is essential in the field of statistical natural language processing. Advances already made in statistical natural language processing will be used in this paper to formulate a probabilistic justification for using tf×idf term weighting. The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking. A pilot experiment on the TREC collection shows that the linguistically motivated weighting algorithm outperforms the popular BM25 weighting algorithm. Received: 17 December 1998 / Revised: 31 May 1999
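
For readers unfamiliar with tf×idf weighting, a minimal sketch of the textbook form follows: each document is scored as the sum, over query terms, of term frequency times the log inverse document frequency. This is the generic heuristic, not the probabilistic derivation of the paper above. Whitespace tokenization, the function name and the sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_scores(query, docs):
    """Score each document as the sum over query terms of tf * idf."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term in tf:  # a term in this document always has df >= 1
                score += tf[term] * math.log(n_docs / df[term])
        scores.append(score)
    return scores


# Hypothetical three-document collection.
docs = [
    "probabilistic model of information retrieval",
    "statistical natural language processing",
    "term weighting for retrieval of documents",
]
scores = tfidf_scores("retrieval model", docs)
```

Here the first document matches both query terms and scores highest, the third matches only the more common term "retrieval", and the second matches nothing and scores zero.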

11.

Reasoning about Textual Similarity in a Web-Based Information Access System

William W. Cohen 《Autonomous Agents and Multi-Agent Systems》1999,2(1):65-86

The degree to which information sources are pre-processed by Web-based information systems varies greatly. In search engines like Altavista, little pre-processing is done, while in knowledge integration systems, complex site-specific wrappers are used to integrate different information sources into a common database representation. In this paper we describe an intermediate point between these two models. In our system, information sources are converted into a highly structured collection of small fragments of text. Database-like queries to this structured collection of text fragments are approximated using a novel logic called WHIRL, which combines inference in the style of deductive databases with ranked retrieval methods from information retrieval (IR). WHIRL allows queries that integrate information from multiple Web sites, without requiring the extraction and normalization of object identifiers that can be used as keys; instead, operations that in conventional databases require equality tests on keys are approximated using IR similarity metrics for text. This leads to a reduction in the amount of human engineering required to field a knowledge integration system. Experimental evidence is given showing that many information sources can be easily modeled with WHIRL, and that inferences in the logic are both accurate and efficient.

12.

A survey of statistical language modeling methods for text retrieval Total citations: 2, self-citations: 0, citations by others: 2

丁国栋 白硕 王斌 《计算机研究与发展》2006,43(5):769-776

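Statistical language modeling (SLM) has gradually become one of the mainstream techniques in language information processing. Research and experiments in recent years show that SLM has broad prospects and room for growth in text retrieval. This paper surveys SLM-based text retrieval (SLMTR), focusing on its main methods and key techniques. It first gives a formal description of the query likelihood retrieval model, then discusses in detail language model estimation and data smoothing, including the effect of smoothing on retrieval performance. It then briefly introduces the main extensions and improvements to the query likelihood model, and closes with a discussion of some of the challenges facing SLMTR.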
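
To make the query likelihood model and the role of smoothing concrete, the sketch below ranks documents by log P(query | document), interpolating each document's unigram model with a collection-wide model (Jelinek-Mercer smoothing). It is a minimal illustration under stated assumptions, not taken from the survey: documents are non-empty token lists, and the function name, the `lam` weight and the sample data are hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection, lam=0.5):
    """Return log P(query | doc) with Jelinek-Mercer smoothing.

    Each term probability is (1 - lam) * P(term | doc) + lam * P(term | collection).
    """
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    log_p = 0.0
    for term in query:
        p_doc = doc_tf[term] / len(doc)
        p_coll = coll_tf[term] / len(collection)
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term never occurs anywhere in the collection.
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Hypothetical two-document collection.
docs = [
    "language model smoothing for retrieval".split(),
    "query translation with a bilingual dictionary".split(),
]
collection = [term for doc in docs for term in doc]
query = "language model retrieval".split()
ranked = sorted(
    range(len(docs)),
    key=lambda i: query_log_likelihood(query, docs[i], collection),
    reverse=True,
)
```

Without the collection term, the second document would assign the query probability zero because it contains none of the query words. Smoothing keeps its score finite, which is the data sparsity issue that smoothing is meant to address.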

13.

English-Chinese cross-language information retrieval based on Lucene Total citations: 8, self-citations: 0, citations by others: 8

陈士杰 张玥杰 《计算机工程》2005,31(13):62-64

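This paper describes the design and implementation of an English-Chinese cross-language retrieval system. Its main aims are to find more effective English-Chinese query translation methods and to improve the performance of the Chinese retrieval system. For query translation, a query translation algorithm is built on an English-Chinese bilingual dictionary. For Chinese retrieval, the effect of different indexing units on retrieval performance is analyzed, and a search engine is built on the Lucene full-text indexing toolkit. For system evaluation, a method for quickly constructing evaluation data by topic is proposed.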

14.

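Application and implementation of business intelligence and information retrieval technology in information services for the commercialization of high-tech achievements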

闻云斌《计算机应用与软件》2011,(10)

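Taking an information service platform for the commercialization of high-tech achievements as the practical setting, this paper describes the role played by business intelligence and information retrieval technology and the related implementation methods.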

15.

Characteristics of information retrieval systems on the internet: Theoretical and practical aspects

V. O. Mel’nikov G. S. Melikyan O. A. Maksimov 《Automatic Documentation and Mathematical Linguistics》2009,43(1):42-50


16.

A brief analysis of search engine technology

刘智浓 张永利 《数字社区&智能家居》2006,(1):79-80

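As information search increasingly becomes a main application of the Internet, search engine technology is becoming a hot topic of research and development in both the computer industry and academia. This paper introduces the basic principles, working process and development trends of search engines.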

17.

A brief analysis of search engine technology

刘智浓 张永利 《数字社区&智能家居》2006,(2)

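As information search increasingly becomes a main application of the Internet, search engine technology is becoming a hot topic of research and development in both the computer industry and academia. This paper introduces the basic principles, working process and development trends of search engines.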

18.

Retrieval of online handwriting by synthesis and matching

C.V. Jawahar A. Balasubramanian Anoop M. Namboodiri 《Pattern Recognition》2009,42(7):1445-1457

Search and retrieval is gaining importance in the ink domain due to the increase in the availability of online handwritten data. However, the problem is challenging due to variations in handwriting between various writers, digitizers and writing conditions. In this paper, we propose a retrieval mechanism for online handwriting, which can handle different writing styles, specifically for Indian languages. The proposed approach provides a keyboard-based search interface that enables to search handwritten data from any platform, in addition to pen-based and example-based queries. One of the major advantages of this framework is that information retrieval techniques such as ranking relevance, detecting stopwords and controlling word forms can be extended to work with search and retrieval in the ink domain. The framework also allows cross-lingual document retrieval across Indian languages.

19.

A study of translation between computer programming languages for scientific computing

杜中华 王兴贵 陈永才 《计算机工程》2001,27(12):164-165,193

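This paper discusses the significance of studying translation between computer programming languages used for scientific computing, offers some practical techniques in this area, and closes with a worked example.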

20.

Determining the specificity of terms using inside-outside information: a necessary condition of term hierarchy mining

Pum-Mo Ryu Key-Sun Choi 《Information Processing Letters》2006,100(2):76-82

This paper introduces new specificity measuring methods of terms using inside and outside information. Specificity of a term is the quantity of domain specific information contained in the term. Specific terms have a larger quantity of domain information than general terms. Specificity is an important necessary condition for building hierarchical relations among terms. If t₁ is a hyponym of t₂ in a domain term hierarchy, then the specificity of t₁ is greater than that of t₂. As domain specific terms are commonly compounds of the generic level term and some modifiers, inside information is important to represent the meaning of terms. Outside contextual information is also used to complement the shortcomings of inside information. We propose an information theoretic method to measure the quantity of terms. Experiments showed promising results with a precision of 73.9% when applied to terms in the MeSH thesaurus.