Similar Literature
Found 20 similar documents (search time: 156 ms)
1.
This paper gives a global introduction to the aims and objectives of the EuroWordNet project, and it provides a general framework for the other papers in this volume. EuroWordNet is an EC project that develops a multilingual database with wordnets in several European languages, structured along the same lines as the Princeton WordNet. Each wordnet represents an autonomous structure of language-specific lexicalizations, which are interconnected via an Inter-Lingual-Index. The wordnets are built at different sites from existing resources, starting from a shared level of basic concepts and extended top-down. The results will be publicly available and will be tested in cross-language information retrieval applications.

3.
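A Dictionary-Based Method for Bidirectional English-Chinese Cross-Language Information Retrieval   Total citations: 1, self-citations: 0, citations by others: 1
Yang Hui (杨辉), Zhang Yuejie (张玥杰), Zhang Tao (张涛). Computer Engineering (《计算机工程》), 2009, 35(16): 273-274
Based on the evaluation of the English-Chinese cross-language information retrieval task at the Text Retrieval Conference (TREC), this work takes bidirectional English-Chinese query translation as the dominant strategy and English and Chinese queries as the objects of translation. It uses an English-Chinese electronic dictionary as the source of translation knowledge and combines it with purpose-built English and Chinese monolingual information retrieval systems to implement a complete bidirectional English-Chinese cross-language information retrieval process. Experimental results verify the effectiveness of the system.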
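
As a rough illustration of the dictionary-based query translation this abstract describes, the sketch below replaces each query term with its dictionary translations. The toy dictionary `EN_ZH`, the function name `translate_query`, and the policy of passing untranslated terms through unchanged are all assumptions made for illustration; the abstract does not give the system's actual design.

```python
from typing import Dict, List

# Hypothetical toy English-to-Chinese dictionary (an assumption for this
# sketch); a real system would load a full electronic dictionary.
EN_ZH: Dict[str, List[str]] = {
    "information": ["信息"],
    "retrieval": ["检索"],
    "language": ["语言"],
}


def translate_query(query: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Translate a query term by term using a bilingual dictionary.

    Terms missing from the dictionary are kept unchanged (lowercased),
    which is one simple way to handle out-of-vocabulary words.
    """
    translated: List[str] = []
    for term in query.lower().split():
        # Keep every listed translation; fall back to the original term.
        translated.extend(dictionary.get(term, [term]))
    return translated


print(translate_query("Information Retrieval TREC", EN_ZH))
```

The translated terms would then be handed to a monolingual retrieval system for the target language, which is the second half of the pipeline the abstract outlines.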

4.
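Revisiting Cross-Language Information Retrieval   Total citations: 6, self-citations: 1, citations by others: 6
The multilingual problem is a major obstacle to the worldwide sharing of Internet resources, and cross-language information retrieval (CLIR) is one of the effective ways to address it. Starting from a definition of a CLIR system, this paper presents a standard CLIR system framework and evaluation methods, re-examines the mainstream research approaches, and identifies the core problems that CLIR must solve. Finally, based on an analysis of the current state of research, it suggests likely priorities for future research.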

5.
The effectiveness of information retrieval technology in electronic discovery (E-discovery) has become the subject of judicial rulings and practitioner controversy. The scale and nature of E-discovery tasks, however, have pushed traditional information retrieval evaluation approaches to their limits. This paper reviews the legal and operational context of E-discovery and the approaches to evaluating search technology that have evolved in the research community. It then describes a multi-year effort carried out as part of the Text Retrieval Conference to develop evaluation methods for responsive review tasks in E-discovery. This work has led to new approaches to measuring effectiveness in both batch and interactive frameworks, large data sets, and some surprising results for the recall and precision of Boolean and statistical information retrieval methods. The paper concludes by offering some thoughts about future research in both the legal and technical communities toward the goal of reliable, effective use of information retrieval in E-discovery.

6.
7.
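Cross-Language Information Retrieval Based on English-Chinese Machine Translation   Total citations: 8, self-citations: 0, citations by others: 8
As ever more information becomes available, it is increasingly common for users to query a multilingual text collection. This raises an important problem: matching a user query expressed in one language against texts written in a different language, that is, crossing the language boundary. This problem is known as Cross-Language Information Retrieval (CLIR). For this task, an English-Chinese CLIR system was built, and several sets of run results were submitted on that basis. Combined with a purpose-built Chinese IR system, it implements a complete English-Chinese CLIR process.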

8.
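An Introduction to the Text Retrieval Conference   Total citations: 3, self-citations: 0, citations by others: 3
Introduction to text retrieval: With the growth of the Internet and advances in storage technology, ever more text is available in machine-readable form. It is estimated that by 1999 the Internet held about 5 TB of information, of which text accounted for about 6 TB. Exploiting such rich information resources effectively is not easy, however, because the information is often large in scale, time-sensitive, and stored in scattered locations; it mixes languages and covers a wide range of subjects; it combines text with images in flexible formats; and it sometimes contains spelling or transmission errors. For any particular user, the needed information is typically only a tiny fraction of the whole. Extracting useful information from network resources of this scale places very strict demands on processing speed and accuracy, so faster and more efficient processing of this kind of mixed corpus is urgently needed. People therefore rely more and more on text retrieval tools to find the information they need. Text retrieval means, given a retrieval need expressed as text, finding the texts in an electronic document collection that match the specified expression and returning the original documents that contain them to the user as the retrieval results.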

9.
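This paper proposes a semantics-based algorithm for text comparison and result generation in cross-language information retrieval. Starting from semantics, the algorithm represents both the content to be retrieved and the retrieval request in a formalized context-unit frame structure. It computes and ranks the semantic relevance between the retrieval request and the document data along three aspects of textual semantic representation: domain (static category), situation (dynamic category), and background (reference). Text retrieval is carried out through matching and generation mechanisms over the semantic symbols defined on the context-unit frame. Compared with traditional CLIR techniques, the algorithm avoids the semantic ambiguity introduced by using words in the language space as the retrieval intermediary. Experiments show that it handles text comparison and result generation in semantics-based cross-language information retrieval well.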

10.
Wordnets have been created in many languages, revealing both their lexical commonalities and diversity. The next challenge is to make multilingual wordnets fully interoperable. The EuroWordNet experience revealed the shortcomings of an interlingua based on a natural language. Instead, we propose a model based on the division of the lexicon and a language-independent, formal ontology that serves as the hub interlinking the language-specific lexicons. The ontology avoids the idiosyncrasies of the lexicon and furthermore allows formal reasoning about the concepts it contains. We address the division of labor between ontology and lexicon. Finally, we illustrate our model in the context of a domain-specific multilingual information system based on a central ontology and interconnected wordnets in seven languages.

11.
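An Improved Method for Computing Term Weights in Documents   Total citations: 57, self-citations: 5, citations by others: 52
The formal representation of text has long been a fundamental concern in information retrieval areas such as text retrieval, automatic summarization, and search engines. The tf.idf text representation in the Vector Space Model is widely used in this area and gives fairly good results. Differences in how a term is distributed across a text collection are an important factor in how well the term expresses document content, but the existing tf.idf method cannot capture this factor. To address this, the paper introduces the concept of information gain from information theory and proposes tf.idf.IG, an improved text representation based on tf.idf. The method uses a term's information gain as a factor in the text representation to measure differences in the term's distribution across the collection. In text classification experiments, a vector space model using the tf.idf.IG representation classifies better than one using tf.idf, which supports the effectiveness and feasibility of the improved method.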
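
To make the idea of weighting a term by its information gain concrete, here is a minimal sketch. It assumes the weight is the plain product tf × idf × IG, with IG computed over class labels from the presence or absence of the term; the abstract only says that information gain is used as a factor, so the exact formula, the function names, and the toy corpus below are assumptions rather than the paper's definition.

```python
import math
from collections import Counter
from typing import List, Tuple

Doc = Tuple[List[str], str]  # (tokens, class label)


def entropy(counts: List[int]) -> float:
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(term: str, docs: List[Doc]) -> float:
    """IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t)."""
    labels = sorted({label for _, label in docs})
    with_t = Counter(label for tokens, label in docs if term in tokens)
    without_t = Counter(label for tokens, label in docs if term not in tokens)
    n = len(docs)
    n_with = sum(with_t.values())
    h_c = entropy([with_t[l] + without_t[l] for l in labels])
    h_with = entropy([with_t[l] for l in labels])
    h_without = entropy([without_t[l] for l in labels])
    return h_c - (n_with / n) * h_with - ((n - n_with) / n) * h_without


def tf_idf_ig(term: str, tokens: List[str], docs: List[Doc]) -> float:
    """Assumed weighting: term frequency x inverse document frequency x IG."""
    tf = tokens.count(term) / len(tokens)
    df = sum(1 for toks, _ in docs if term in toks)
    idf = math.log(len(docs) / df) if df else 0.0
    return tf * idf * information_gain(term, docs)


# Toy labeled corpus (hypothetical data, for illustration only).
corpus: List[Doc] = [
    (["goal", "match", "team"], "sports"),
    (["team", "goal", "coach"], "sports"),
    (["stock", "market", "team"], "finance"),
    (["stock", "bank", "market"], "finance"),
]

for term in ("goal", "team"):
    print(term, tf_idf_ig(term, corpus[0][0], corpus))
```

In this toy corpus "goal" appears only in the sports documents while "team" is spread across both classes, so the information gain factor pushes the weight of "goal" well above that of "team" in the first document.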

12.
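Incorporating temporal information into information retrieval can effectively improve retrieval results. Temporal information retrieval has been studied extensively, but existing database information retrieval methods still make little effective use of temporal information. To address this, the paper proposes a keyword search method for relational databases based on temporal semantics: it introduces temporal information to build a temporal data graph, designs a temporal relevance scoring mechanism, applies temporal semantic constraints during search over the temporal graph, and designs a keyword-based temporal retrieval algorithm. Experiments show that the method effectively improves database retrieval effectiveness without degrading retrieval performance.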

13.
Automatic Text Summarization has been shown to be useful for Natural Language Processing tasks such as Question Answering or Text Classification and other related fields of computer science such as Information Retrieval. Since Geographical Information Retrieval can be considered as an extension of the Information Retrieval field, the generation of summaries could be integrated into these systems by acting as an intermediate stage, with the purpose of reducing the document length. In this manner, the access time for information searching will be improved, while at the same time relevant documents will also be retrieved. Therefore, in this paper we propose the generation of two types of summaries (generic and geographical) applying several compression rates in order to evaluate their effectiveness in the Geographical Information Retrieval task. The evaluation has been carried out using GeoCLEF as the evaluation framework and following an Information Retrieval perspective without considering the geo-reranking phase commonly used in these systems. Although single-document summarization has not performed well in general, the slight improvements obtained for some types of the proposed summaries, particularly for those based on geographical information, lead us to believe that the integration of Text Summarization with Geographical Information Retrieval may be beneficial, and consequently, the experimental set-up developed in this research work serves as a basis for further investigations in this field.

14.
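Research and Implementation of Full-Text Retrieval for Chinese Web Document Repositories   Total citations: 13, self-citations: 0, citations by others: 13
Full-text retrieval is a highly effective information retrieval technique. Drawing on the design and development of the national 863 Program project "WWW Document Collaborative Writing System" (《WWW文档协同写作系统》), this paper studies the main techniques for full-text retrieval over a Chinese Web document repository, focuses on the technical details of the character-table method (字表法) of full-text retrieval, and finally describes the implementation of a practical full-text retrieval system.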

15.
This paper proposes an effective query-translation approach that enables a cross-language information retrieval (CLIR) service to be more easily supported in digital library systems that only contain monolingual content. A query-translation engine called LiveTrans is used to process the translation requests of cross-lingual queries from connected digital library systems. To automatically extract translations not covered by standard dictionaries, the engine is developed based on a novel integration of dictionary resources and Web mining approaches, including anchor-text and search-result methods. The engine exploits a broad range of multilingual Web resources used as live bilingual corpora to alleviate translation difficulties. It is shown to be particularly effective for extracting multilingual translation equivalents of query terms containing proper names or new terminology. The obtained results show the feasibility of and great potential for creating English-Chinese CLIR services in existing digital libraries and new applications in cross-language Web searching, although difficulties still remain that need to be investigated further.

16.
17.
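This paper presents a way to implement full-text search in JavaEE applications using Hibernate. It first briefly introduces how to use Hibernate in a JavaEE application, then uses an example to explain how Hibernate simplifies database operations in such applications, and finally gives a quick way to implement full-text search.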

18.
Text retrieval techniques have long focused on the topic of texts rather than the pragmatic role they play per se. In this article, we address two other aspects in text processing that could enhance text retrieval: (a) the detection of functional style in retrieved texts, and (b) the detection of writer's attitude towards a given topic in retrieved texts. The former is justified by the fact that current text databases have become highly heterogeneous in terms of document inclusion, while the latter is dictated by the need for advanced and intelligent retrieval tools. Towards this aim, two generalised methodologies are presented in order to achieve the implementation of the findings in both aspects in text processing respectively. Particularly, the first one is fully developed and thus is analysed and evaluated in detail, while for the second one the theoretical framework is given for its subsequent computational implementation. Both approaches are as language independent as possible, empirically driven, and can be used, apart from information retrieval purposes, in various natural language processing applications. These include grammar and style checking, natural language generation, summarisation, style verification in real-world texts, recognition of style shift between adjacent portions of text, and author identification.

19.
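Content-Based Image Retrieval Using Text Retrieval Methods   Total citations: 2, self-citations: 0, citations by others: 2
The paper uses the mature technology of content-based text retrieval to implement content-based image retrieval. The approach does not require large amounts of complex computation; it is fast and achieves high precision, and it supports interactive image retrieval based on the regions a user is interested in. The problem is discussed in terms of underlying principle, algorithm flow, and retrieval implementation, and a solution is given whose main techniques are content-based text retrieval, mapping images to text, and restoring text to images. Finally, an example system based on these design principles is described.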

20.
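Sketch-based 3D model retrieval (SBSR) has become a research hotspot in 3D model retrieval, pattern recognition, and computer vision. Compared with traditional methods, 3D deep representations based on convolutional neural networks (CNNs) show a clear performance advantage in 3D model retrieval tasks. This paper proposes a 3D model retrieval method for hand-drawn sketches that combines information entropy with a CNN. First, representative views of a model are obtained by computing the information entropy of its projection images, and the representative views are processed with edge detection and related steps to obtain contour images of the 3D model's projections. Then the contour images and the hand-drawn sketches are fed into the CNN to extract feature descriptors, and feature matching is performed. The method is evaluated on the Shape Retrieval Contest (SHREC) 2012 and SHREC 2013 databases. Experiments show that it achieves higher retrieval accuracy than other traditional methods.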
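
The entropy-based view selection step can be sketched as follows. This is only an illustration under stated assumptions: entropy is taken as the Shannon entropy of the grayscale intensity histogram, and the views with the highest entropy are treated as representative. The abstract does not say how the entropy is computed or how views are chosen, and the names `image_entropy` and `select_representative_views` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence

Image = Sequence[Sequence[int]]  # grayscale pixel rows, values 0..255


def image_entropy(image: Image) -> float:
    """Shannon entropy (in bits) of the image's intensity histogram."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_representative_views(views: List[Image], k: int = 1) -> List[int]:
    """Return the indices of the k views with the highest entropy.

    Assumption: a higher-entropy projection carries more shape detail.
    """
    ranked = sorted(
        range(len(views)),
        key=lambda i: image_entropy(views[i]),
        reverse=True,
    )
    return ranked[:k]


# Tiny synthetic "projection images" (hypothetical data).
flat = [[0, 0], [0, 0]]            # one intensity level
two_level = [[0, 255], [0, 255]]   # two equally frequent levels
four_level = [[0, 85], [170, 255]] # four equally frequent levels

print(select_representative_views([flat, two_level, four_level], k=2))
```

In the pipeline the abstract describes, the selected views would next go through edge detection to produce contour images before feature extraction with the CNN.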
