Similar literature
 20 similar documents found (search time: 15 ms)
1.
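The high dimensionality, large data volume, and poor interpretability of multimedia information constrain its retrieval performance. To address this, a model of an intelligent multimedia information retrieval system based on natural language understanding is proposed. By applying natural language understanding, data mining, and self-feedback techniques, the system broadens the retrieval scope and improves retrieval accuracy to some extent.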

2.
Probabilistic latent semantic analysis (PLSA) is a method for computing term and document relationships from a document set. The probabilistic latent semantic index (PLSI) has been used to store PLSA information, but unfortunately the PLSI uses excessive storage space relative to a simple term frequency index, which causes lengthy query times. To overcome the storage and speed problems of PLSI, we introduce the probabilistic latent semantic thesaurus (PLST), an efficient and effective method of storing the PLSA information. We show that through methods such as document thresholding and term pruning, we are able to maintain the high precision results found using PLSA while using a very small percentage (0.15%) of the storage space of PLSI.

3.
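An effective information retrieval model* (total citations: 1; self-citations: 0; citations by others: 1)
An information retrieval model based on user query behavior and query expansion is proposed, and its design rationale, algorithm, and key implementation techniques are described. Experimental results show that the model effectively improves information retrieval performance and has high practical value and broad prospects.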

4.
Optical character reader (OCR) misrecognition is a serious problem when OCR-recognized text is used for retrieval purposes in digital libraries. We have proposed fuzzy retrieval methods that, instead of correcting the errors manually, assume that errors remain in the recognized text. Costs are thereby reduced. The proposed methods generate multiple search terms for each input query term by referring to confusion matrices, which store all characters likely to be misrecognized and the respective probability of each misrecognition. The proposed methods can improve recall rates without decreasing precision rates. However, a few million search terms are occasionally generated in English-text fuzzy retrieval, which has an intolerable effect on retrieval speed. Therefore, this paper presents two remedies to reduce the number of generated search terms while maintaining retrieval effectiveness. One remedy is to restrict the number of errors included in each expanded search term, while the other is to introduce another validity value different from our conventional one. Experimental results indicate that the former remedy reduced the number of terms to about 50 and the latter to not more than 20. Received: 18 December 1998 / Revised: 31 May 1999
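
The first remedy in this abstract, capping the number of errors per expanded search term, can be illustrated with a minimal sketch. The confusion table, its probabilities, and the function name `expand_term` are assumptions made for illustration; the weight computed here (a product of confusion probabilities) is a stand-in and is not the paper's validity value.

```python
from itertools import combinations, product

# Hypothetical confusion table: for each correct character, the strings an
# OCR engine might emit instead, with an assumed misrecognition probability.
CONFUSIONS = {
    "l": {"1": 0.05, "I": 0.03},
    "o": {"0": 0.04},
    "m": {"rn": 0.02},
}


def expand_term(term, confusions, max_errors=1):
    """Map each variant of `term` to a weight, allowing at most `max_errors`
    character substitutions taken from the confusion table."""
    variants = {term: 1.0}  # the unmodified term is always searched
    positions = [i for i, ch in enumerate(term) if ch in confusions]
    for k in range(1, max_errors + 1):
        # Choose which k positions are assumed to be misrecognized.
        for chosen in combinations(positions, k):
            options = [list(confusions[term[i]].items()) for i in chosen]
            for combo in product(*options):
                chars = list(term)
                weight = 1.0
                for i, (wrong, prob) in zip(chosen, combo):
                    chars[i] = wrong
                    weight *= prob
                variant = "".join(chars)
                # Keep the largest weight if a variant is reachable twice.
                variants[variant] = max(variants.get(variant, 0.0), weight)
    return variants
```

Raising `max_errors` makes the variant set grow combinatorially, which is the blow-up the abstract describes; a small cap keeps the expanded query manageable.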

5.
Product development of today is becoming increasingly knowledge intensive. Specifically, design teams face considerable challenges in making effective use of increasing amounts of information. In order to support product information retrieval and reuse, one approach is to use case-based reasoning (CBR) in which problems are solved “by using or adapting solutions to old problems.” In CBR, a case includes both a representation of the problem and a solution to that problem. Case-based reasoning uses similarity measures to identify cases which are more relevant to the problem to be solved. However, most non-numeric similarity measures are based on syntactic grounds, which often fail to produce good matches when confronted with the meaning associated with the words they compare. To overcome this limitation, ontologies can be used to produce similarity measures that are based on semantics. This paper presents an ontology-based approach that can determine the similarity between two classes using feature-based similarity measures that replace features with attributes. The proposed approach is evaluated against other existing similarities. Finally, the effectiveness of the proposed approach is illustrated with a case study on product–service–system design problems.
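
As a rough illustration of a feature-based similarity computed over class attributes, the sketch below uses Tversky's ratio model. The abstract does not say which feature-based measure the paper adopts, so the choice of Tversky's model, the function name, and the sample attribute sets are all assumptions.

```python
def tversky_similarity(attrs_a, attrs_b, alpha=0.5, beta=0.5):
    """Tversky ratio model over two attribute sets:
    |A & B| / (|A & B| + alpha * |A - B| + beta * |B - A|).
    Returns 0.0 when both sets are empty (an arbitrary convention here)."""
    a, b = set(attrs_a), set(attrs_b)
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 0.0


# Hypothetical ontology classes described by their attributes.
PUMP = {"flow_rate", "power", "weight"}
FAN = {"flow_rate", "power", "noise"}
```

With `alpha = beta = 1` the measure reduces to the Jaccard index; unequal weights make the comparison asymmetric, which can matter when one class is treated as the query and the other as the stored case.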

6.
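Ni Na, Liu Kai, Li Yaodong. Application Research of Computers (《计算机应用研究》), 2010, 27(11): 4058-4062
In a metasynthesis workshop environment, time pressure makes conventional web information acquisition methods hard to use directly. An active information acquisition method for such environments is therefore proposed. The method combines domain terms with general terms, extracts topics in real time from the stream of participants' utterances, automatically generates search terms and submits them to a search engine whenever the topic changes, and then filters the important search results through collaborative recommendation among multiple users. Experimental results show that the method provides users of a metasynthesis workshop system with timely, accurate, and context-relevant information services.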

7.
The effectiveness of information retrieval technology in electronic discovery (E-discovery) has become the subject of judicial rulings and practitioner controversy. The scale and nature of E-discovery tasks, however, have pushed traditional information retrieval evaluation approaches to their limits. This paper reviews the legal and operational context of E-discovery and the approaches to evaluating search technology that have evolved in the research community. It then describes a multi-year effort carried out as part of the Text Retrieval Conference to develop evaluation methods for responsive review tasks in E-discovery. This work has led to new approaches to measuring effectiveness in both batch and interactive frameworks, large data sets, and some surprising results for the recall and precision of Boolean and statistical information retrieval methods. The paper concludes by offering some thoughts about future research in both the legal and technical communities toward the goal of reliable, effective use of information retrieval in E-discovery.

8.
Hassler, V. IEEE Software, 2005, 22(5): 78-82
Information retrieval tools, popularly referred to as indexers or search engines, support searches of a local file system, intranet, database, or desktop as well as the Web. They also let you add IR functionality to any application that needs a search method as part of a more complex procedure, for example, periodic surveys about your customers. When you develop your own applications, you can design a search GUI specially tailored for your employees, customers, or type of business. There are two broad tool classes. Desktop search tools usually browse local files. Yahoo Desktop Search is an example product. However, desktop tools sometimes extend to the Web (for example, Google Desktop Search) or to network drives (MSN Toolbar). Enterprise search tools cover a broader search area: an intranet, IP subnets, file systems, or databases. Google Search Appliance is an example commercial product, and you can integrate it with Google Desktop Search to provide a unified search interface.

9.
10.
There is no task that computers regularly perform that is more affected by the nature of human language than the retrieval of texts in response to a human need. Despite this, the techniques actually in use for this task, as well as most of the techniques proposed by information retrieval (IR) researchers, make little use of knowledge about language. In this article we take the view that IR is an inference task, and that natural language processing (NLP) techniques can produce text representations that enable more accurate inferences about document content. By considering previous work on language-based and knowledge-based techniques from this perspective, some clear lessons are apparent, and we are applying these lessons in the ADRENAL (Augmented Document REtrieval using NAtural Language processing) project. Our initial experiments with hand-coded representations suggest that using NLP-produced representations can result in significant performance increases in IR systems, and also demonstrate the attention that must be given to representational issues in language-oriented IR.

11.
12.
13.
A knowledge-based system is used as a front-end to a very large database to increase the relevance of the information being retrieved. The subject domain of the database is modelled in a semantic network and the queries to the database are expanded according to the semantic model. An experiment has been performed on a bibliographic database, by developing the prototype KNOWIT, a knowledge-based front-end to the information retrieval system ESA-QUEST1. An experimental evaluation shows that the number of relevant bibliographic references retrieved with the knowledge-based front-end is significantly increased, without compromising the precision of the retrieval.
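
Expanding a query along a semantic network can be sketched as a bounded breadth-first walk from the query terms. The network contents, the depth limit, and the function name `expand_query` are hypothetical; the abstract does not describe how KNOWIT traverses its semantic model.

```python
from collections import deque

# Hypothetical semantic network: each term maps to its related terms.
SEMANTIC_NET = {
    "satellite": ["spacecraft", "orbit"],
    "spacecraft": ["launch vehicle"],
    "orbit": [],
}


def expand_query(terms, network, max_depth=1):
    """Return the query terms plus every term reachable within `max_depth`
    links of the network, in breadth-first order and without duplicates."""
    expanded = []
    seen = set()
    queue = deque((term, 0) for term in terms)
    while queue:
        term, depth = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        expanded.append(term)
        if depth < max_depth:
            for neighbour in network.get(term, []):
                queue.append((neighbour, depth + 1))
    return expanded
```

Keeping `max_depth` small is one way to add related terms without drifting far from the original query, which is the trade-off between recall and precision the abstract reports on.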

14.
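Ontology-based semantic retrieval of legal information (total citations: 3; self-citations: 0; citations by others: 3)
The massive volume of legal information on the web, together with its ambiguity, makes accurate and efficient querying and retrieval difficult, which in turn hampers judicial adjudication and decision-making. To address these problems in judicial information retrieval, the authors studied domain ontology methods and Semantic Web technologies from China and abroad and used ontology concepts to build a case-oriented prototype for semantic retrieval of legal information, providing a useful reference for knowledge management and information retrieval in the legal domain.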

15.
Most Music Information Retrieval (MIR) researchers will agree that understanding users’ needs and behaviors is critical for developing a good MIR system. The number of user studies in the MIR domain has been gradually increasing since the early 2000s, reflecting this growing appreciation of the need for empirical studies of users. However, despite the growing number of user studies and the wide recognition of their importance, it is unclear how great their impact has been in the field: on how systems are developed, how evaluation tasks are created, and how MIR system developers in particular understand critical concepts such as music similarity or music mood. In this paper, we present our analysis of the growth, publication and citation patterns, topics, and design of 198 user studies. This is followed by a discussion of a number of issues/challenges in conducting MIR user studies and distributing the research results. We conclude by making recommendations to increase the visibility and impact of user studies in the field.

16.
17.
This paper discusses the development of task-specific information retrieval systems for software engineers. We discuss how software engineers interact with information and information retrieval systems and investigate to what extent a domain-specific search and recommendation system can be developed in order to support their work-related activities. We have conducted a user study which is based on the “Cognitive Research Framework” to identify the relation between the information objects used during the code development (code snippets and search queries), the tasks users engage in and the associated use of search interfaces. Based on our user studies, a questionnaire and an automated observation of user interactions with the browser and software development environment, we identify that software engineers engage in a finite number of work-related tasks and they also develop a finite number of “work practices”/“archetypes of behaviour”. Secondly we identify a group of domain-specific behaviours that can successfully be used as a source of strong implicit relevance feedback. Based on our results, we design a snippet recommendation interface, and a code-related recommendation interface which are embedded within the standard search engine.

18.
Searching within public information systems is one of the most complex forms of information retrieval. The use of keywords can facilitate this. The Netherlands PTT has developed an alphanumerically operated keyword search method for Viditel, the Dutch videotex system. Laboratory experiments with untrained users have shown an increase in correctly answered questions and a decrease in search time compared with one of the existing search methods, the numerically operated subject list. Some suggestions for further improvement are given. Implementation of the method in videotex systems is recommended.

19.
20.
Computers & Chemistry, 1993, 17(3): 331-333
An efficient algorithm for rapid retrieval of elemental information (EI) and its implementation on IBM PCs in Microsoft FORTRAN 5.0 are presented. Bit mapping and Boolean operations result in compact storage and flexible, rapid EI search. Its applications are discussed.
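
The idea of bit mapping with Boolean operations can be sketched as follows: each element gets one bit, each record stores a single integer mask, and a query becomes a pair of AND tests. This is an illustrative reconstruction in Python, not the paper's FORTRAN implementation; the element list, the sample records, and the names `encode` and `search` are assumptions.

```python
# Hypothetical element table: each symbol is assigned one bit position.
ELEMENTS = ["H", "C", "N", "O", "S", "Cl"]
BIT = {symbol: 1 << index for index, symbol in enumerate(ELEMENTS)}


def encode(symbols):
    """Pack a collection of element symbols into a single integer bitmask."""
    mask = 0
    for symbol in symbols:
        mask |= BIT[symbol]
    return mask


def search(records, required=(), excluded=()):
    """Return names of records containing all `required` elements and none
    of the `excluded` ones, using only bitwise AND comparisons."""
    required_mask = encode(required)
    excluded_mask = encode(excluded)
    return [
        name
        for name, mask in records.items()
        if (mask & required_mask) == required_mask
        and (mask & excluded_mask) == 0
    ]


# Sample records, stored as one integer per compound.
RECORDS = {
    "water": encode(["H", "O"]),
    "methane": encode(["C", "H"]),
    "ethanol": encode(["C", "H", "O"]),
}
```

Storing one integer per record is what makes the representation compact, and each query costs two bitwise operations per record regardless of how many elements are named.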
