Similar literature
Found 20 similar documents (search time: 125 ms)
1.
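Semantic Web document retrieval based on a semantic query ontology   Cited by: 1 (self-citations: 0, citations by others: 1)
The growth of the Semantic Web has created a need to retrieve Semantic Web documents. To let users formulate semantic queries without specialist knowledge or skills, an ontology-based algorithm is proposed for retrieving Semantic Web documents from a structured knowledge base. A query ontology is constructed by mapping natural-language query terms to word senses, and the Semantic Web documents most similar to the query ontology are retrieved. This improves retrieval precision for Semantic Web documents and lets users make better use of semantic retrieval services.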

2.
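Because current information retrieval models cannot retrieve semantic information, a semantic retrieval model based on description logic is proposed. The model defines a logical view of documents, a logical view of queries, and a method for computing the similarity between the two views, and it specifies a storage structure. It converts the user's retrieval request and the data to be queried (documents) into sets of individuals in a description-logic knowledge base. This not only represents the semantics of documents and queries effectively but also facilitates automated reasoning, and it can effectively improve retrieval precision and recall.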

3.
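To address the lack of semantics in keyword-based search engines, a semantic search engine model for specialized domains is proposed. Starting from a formal description of a domain ontology, an ontology semantic framework is built, from which the semantic search model is derived. Using three kinds of expanded feature terms (concepts, concept-instance pairs, and keywords), the work studies a query expansion algorithm and a document semantic annotation algorithm, builds a semantic index, and uses the vector space model to determine the similarity between expanded query terms and semantic documents. Experimental results show that the model considerably improves precision and recall over traditional models.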
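The abstract above does not give the model's formulas, so the following is only a minimal sketch of the general idea: expand the query terms, then score documents by cosine similarity in a vector space. The `EXPANSIONS` table, the 0.5 weight for expanded terms, and all function names are assumptions made for illustration, not the paper's design.

```python
import math
from collections import Counter

# Hypothetical expansion table (query term -> related concepts/instances).
# In the described model such terms would come from a domain ontology.
EXPANSIONS = {
    "car": ["automobile", "vehicle"],
}


def expand_query(terms):
    """Return a weighted term vector: original terms plus expansion terms."""
    weights = Counter()
    for term in terms:
        weights[term] += 1.0
        for extra in EXPANSIONS.get(term, []):
            weights[extra] += 0.5  # assumed lower weight for expanded terms
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_terms, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    query = expand_query(query_terms)
    scored = [(cosine(query, Counter(doc.split())), i) for i, doc in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

With this sketch a document that mentions only "automobile" still receives a non-zero score for the query "car", which plain keyword matching would miss.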

4.
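Shi Xuelin (师雪霖), Zhao Ying (赵英). 《计算机应用》 (Journal of Computer Applications), 2008, 28(9): 2324-2327
The information that a semantic grid must process is usually semi-structured data. How to represent such data with a suitable model and process queries over it efficiently is one of the core problems a semantic grid must solve. A representation model for semi-structured data based on the Resource Description Framework (RDF) is proposed, together with a corresponding information retrieval mechanism. Finally, the design and implementation of a chemical-engineering semantic grid architecture is described; it is built on a chemical-engineering computing grid platform and supports knowledge sharing and retrieval in the chemical engineering domain.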

5.
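XML keyword retrieval combining document semantics and user query semantics   Cited by: 1 (self-citations: 0, citations by others: 1)
Li Jun (黎军), Xiong Hailing (熊海灵). 《计算机应用》 (Journal of Computer Applications), 2010, 30(11): 2945-2948
To solve the loss of semantic information in XML keyword queries, a semantics-aware keyword retrieval method is proposed. The semi-structured nature of the document is used to extract its implicit semantics, and query syntax is used to capture the user's query intent. Elements satisfying the conditions are then retrieved according to that intent and, combined with the document semantics, query results are expressed as a set of semantically related entity subtrees, an improvement over the smallest lowest common ancestor (SLCA). Experimental results show that the method effectively improves the precision of keyword retrieval results.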
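For context, the SLCA baseline that this entry improves on can be sketched as follows: return every element whose subtree contains all keywords while no child subtree does. This is a naive, quadratic illustration using Python's standard `xml.etree.ElementTree`; it is not the paper's method, it ignores tail text and attributes, and the function names are assumptions.

```python
import xml.etree.ElementTree as ET


def subtree_words(elem):
    """Collect lower-cased tag names and text tokens found in elem's subtree."""
    words = set()
    for node in elem.iter():
        words.add(node.tag.lower())
        if node.text:
            words.update(node.text.lower().split())
    return words


def slca(root, keywords):
    """Return elements that contain all keywords and have no such descendant."""
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(elem):
        # True when elem's subtree covers every keyword.
        if not wanted <= subtree_words(elem):
            return False
        child_hits = [visit(child) for child in elem]  # visit all children
        if not any(child_hits):
            results.append(elem)  # no smaller covering subtree exists
        return True

    visit(root)
    return results
```

Usage: `slca(ET.fromstring(xml_text), ["xml", "li"])` returns the smallest elements covering both keywords. When the keywords only meet at the document root, the root itself is returned, which is the kind of semantically weak answer the entity-subtree approach aims to avoid.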

6.
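To address the lack of semantic information in facet-based component retrieval, an ontology-based component retrieval framework is proposed. Following the given component description model, the component requested by the user is matched against components in a semantic component repository in three respects: function, environment, and quality attributes. Preliminary experiments on the Shanghai component repository show that the prototype system effectively improves the precision and recall of component retrieval.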

7.
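Wang Zhihua (王志华), Jin Yan (金燕), Li Zhanbo (李占波). 《计算机工程》 (Computer Engineering), 2011, 37(11): 83-85, 88
Content-based Semantic Web retrieval considers only the content itself and ignores differences among users, so it cannot accurately reflect user needs. An adaptive Semantic Web retrieval framework is therefore proposed. For Chinese Web documents, an ontology learning method is given with the help of the HowNet knowledge base; a user profile repository is built by extracting objective, explicit, and implicit user information; and algorithms are designed for constructing the user's initial query ontology and personalized query ontology, thereby achieving user-adaptive retrieval. Experimental results show that the method achieves high retrieval efficiency.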

8.
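Computing word-concept relatedness in semantic query expansion   Cited by: 16 (self-citations: 0, citations by others: 16)
Tian Xuan (田萱), Du Xiaoyong (杜小勇), Li Haihua (李海华). 《软件学报》 (Journal of Software), 2008, 19(8): 2043-2053
In semantics-based query expansion, computing word-concept relatedness is a key step in finding the concepts that describe the semantics of a query. A method called K2CM (keyword to concept method) is proposed for this computation. K2CM computes word-concept relatedness from two aspects: the word-document-concept membership degree and the word-concept co-occurrence degree. The word-document-concept membership degree derives from the membership relation between words and concepts in an annotated document collection, that is, words occur in documents and documents are annotated with concepts. The word-concept co-occurrence degree builds on the co-occurrence of word-concept pairs and additionally considers their textual distance and their distribution across documents. Semantic retrieval experiments on three different types of data sets show that, compared with traditional methods, K2CM-based semantic query expansion improves retrieval effectiveness.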
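The abstract names the two ingredients of the relatedness score but not their formulas. The sketch below is therefore a simplified stand-in, not K2CM itself: membership is taken as the share of documents containing the word that are annotated with the concept, co-occurrence as an inverse token distance averaged over all documents, and the two are mixed with an assumed weight `alpha`. The document layout (`tokens`, `concepts`) and all names are hypothetical.

```python
def membership(word, concept, docs):
    """Share of documents containing `word` that are annotated with `concept`."""
    with_word = [d for d in docs if word in d["tokens"]]
    if not with_word:
        return 0.0
    return sum(concept in d["concepts"] for d in with_word) / len(with_word)


def min_distance(tokens, a, b):
    """Smallest token distance between occurrences of a and b, or None."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def cooccurrence(word, concept, docs):
    """Inverse-distance co-occurrence, averaged over the whole collection."""
    if not docs:
        return 0.0
    total = 0.0
    for d in docs:
        dist = min_distance(d["tokens"], word, concept)
        if dist is not None and dist > 0:  # skip the case word == concept label
            total += 1.0 / dist
    return total / len(docs)


def relatedness(word, concept, docs, alpha=0.5):
    """Assumed linear mix of membership and co-occurrence scores."""
    return alpha * membership(word, concept, docs) + (1 - alpha) * cooccurrence(word, concept, docs)
```

Dividing the co-occurrence sum by the collection size is one simple way to account for how widely a pair is distributed across documents; the paper may weight this differently.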

9.
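Restricted ontology similarity   Cited by: 8 (self-citations: 0, citations by others: 8)
After document resources expressed as ontologies are obtained from different Semantic Webs, they are usually converted into ontology descriptions based on one common ontology, which facilitates both document analysis and subsequent information extraction. These document ontologies differ from one another only at the instance and relation level; their classes, properties, rules, and predicates are essentially the same. One of the most common operations in retrieving such documents is computing the similarity between ontologies. Most methods for computing ontology similarity rely on pairwise comparison of entities belonging to different ontologies and often have to consider all related elements. This increases computational complexity and also leads to circular computation. After studying the reasoning capabilities of Semantic Web ontology languages, a second-order ontology similarity technique based on knowledge reasoning is proposed, which solves the circular computation problem.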

10.
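An ontology-based extended retrieval method in knowledge management   Cited by: 2 (self-citations: 0, citations by others: 2)
In knowledge management systems, to effectively resolve mismatches caused by different expressions of the same concept in user queries and documents, a bidirectional extended retrieval method is proposed. It is ontology-based and uses structured, task-context-oriented descriptions as the semantic index of information content. Extended retrieval is realized through two mechanisms, compatible matching and knowledge networking, which correspond to a top-down and a bottom-up approach respectively, and a query rewriting template (QRT) is used to search for knowledge related to the current task. From the original query and the ontology, the QRT generates a large number of subqueries and passes each subquery a weight reflecting its relevance to the original query. The top-down method, or knowledge networking mechanism, retrieves related knowledge items through the organization and task ontologies. The bottom-up method searches the task context for similar tasks and obtains the knowledge items that contain descriptions of those tasks. Both methods use the QRT to implement ontology-based knowledge retrieval. Experimental results show that the method improves the retrieval efficiency and precision of the knowledge management system.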
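How a query rewriting template might generate weighted subqueries can be sketched as below. The abstract does not specify the template format or the weighting scheme, so the `ONTOLOGY` table, the one-term-at-a-time substitution, and the `min_weight` cutoff are all assumptions for illustration.

```python
# Hypothetical ontology fragment: term -> [(related term, relation weight)].
ONTOLOGY = {
    "laptop": [("notebook", 0.9), ("computer", 0.6)],
    "repair": [("maintenance", 0.8)],
}


def rewrite(query_terms, ontology, min_weight=0.5):
    """Return (subquery, weight) pairs, most relevant first.

    The original query keeps weight 1.0; each subquery replaces one term
    with a related term and inherits that relation's weight.
    """
    subqueries = [(tuple(query_terms), 1.0)]
    for index, term in enumerate(query_terms):
        for related, weight in ontology.get(term, []):
            if weight >= min_weight:
                rewritten = list(query_terms)
                rewritten[index] = related
                subqueries.append((tuple(rewritten), weight))
    return sorted(subqueries, key=lambda pair: -pair[1])
```

A retrieval layer could then run each subquery and scale its document scores by the attached weight, so that matches found through looser ontology relations rank lower.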

11.
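Guo Meng (郭猛), Feng Zhiyong (冯志勇). 《微处理机》 (Microprocessors), 2007, 28(4): 116-119
Traditional retrieval techniques based on keyword processing miss a large amount of content that is related to, or synonymous with, the query concept. To address this, an ontology-based Web information retrieval model is proposed. The model parses semantic documents and analyzes the relationships between the required concepts and properties to obtain a similarity measure, which it then uses for semantic expansion during retrieval.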

12.
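Ontology-driven extraction of semi-structured biological data from the Web   Cited by: 3 (self-citations: 0, citations by others: 3)
Cheng Yu (成瑜), He Jieyue (何洁月). 《计算机工程》 (Computer Engineering), 2006, 32(5): 192-194
An ontology-driven method is proposed that locates and extracts information by matching document structure and features, and a user-guided interactive prototype system for information extraction is implemented. The method effectively handles semantic problems in information extraction such as synonymy and polysemy, as well as incomplete data items and items that appear in no fixed order.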

13.
Technology in the field of digital media generates huge amounts of nontextual information (audio, video, and images) along with more familiar textual information. The potential for exchange and retrieval of information is vast and daunting. The key problem in achieving efficient and user-friendly retrieval is the development of a search mechanism that guarantees delivery of minimal irrelevant information (high precision) while ensuring that relevant information is not overlooked (high recall). The traditional solution employs keyword-based search. The only documents retrieved are those containing user-specified keywords. But many documents convey desired semantic information without containing these keywords. This limitation is frequently addressed through query expansion mechanisms based on the statistical co-occurrence of terms. Recall is increased, but at the expense of deteriorating precision. One can overcome this problem by indexing documents according to context and meaning rather than keywords, although this requires a method of converting words to meanings and the creation of a meaning-based index structure. We have solved the problem of an index structure through the design and implementation of a concept-based model using domain-dependent ontologies. An ontology is a collection of concepts and their interrelationships that provide an abstract view of an application domain. With regard to converting words to meaning, the key issue is to identify appropriate concepts that both describe and identify documents as well as language employed in user requests. This paper describes an automatic mechanism for selecting these concepts. An important novelty is a scalable disambiguation algorithm that prunes irrelevant concepts and allows relevant ones to associate with documents and participate in query generation. We also propose an automatic query expansion mechanism that deals with user requests expressed in natural language. This mechanism generates database queries with appropriate and relevant expansion through knowledge encoded in ontology form. Focusing on audio data, we have constructed a demonstration prototype. We have experimentally and analytically shown that our model, compared to keyword search, achieves a significantly higher degree of precision and recall. The techniques employed can be applied to the problem of information selection in all media types.
Received: 7 October 2002. Accepted: 20 May 2003. Published online: 30 September 2003. Edited by: E. Lochovsky. This research has been funded [or funded in part] by the Integrated Media Systems Center, a National Science Foundation Engineering Research Center, Cooperative Agreement No. EEC-9529152.

14.
Conceptual-model-based data extraction from multiple-record Web pages   Cited by: 7 (self-citations: 0, citations by others: 7)
Electronically available data on the Web is exploding at an ever increasing pace. Much of this data is unstructured, which makes searching hard and traditional database querying impossible. Many Web documents, however, contain an abundance of recognizable constants that together describe the essence of a document's content. For these kinds of data-rich, multiple-record documents (e.g., advertisements, movie reviews, weather reports, travel information, sports summaries, financial statements, obituaries, and many others) we can apply a conceptual-modeling approach to extract and structure data automatically. The approach is based on an ontology (a conceptual model instance) that describes the data of interest, including relationships, lexical appearance, and context keywords. By parsing the ontology, we can automatically produce a database scheme and recognizers for constants and keywords, and then invoke routines to recognize and extract data from unstructured documents and structure it according to the generated database scheme. Experiments show that it is possible to achieve good recall and precision ratios for documents that are rich in recognizable constants and narrow in ontological breadth. Our approach is less labor-intensive than other approaches that manually or semiautomatically generate wrappers, and it is generally insensitive to changes in Web-page format.

15.
Program slicing is a well-known technique to extract the program statements that (potentially) affect the values computed at some point of interest. In this work, we introduce a novel slicing method for XML documents. Essentially, given an XML document (which is valid w.r.t. some DTD), we produce a new XML document (a slice) that contains the relevant information in the original XML document according to some criterion. Furthermore, we also output a new DTD such that the computed slice is valid w.r.t. this DTD. A prototype implementation of the XML slicer has been undertaken.
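As a rough illustration of what a slice of an XML document could look like, the sketch below keeps the elements selected by a criterion together with their ancestors and drops everything else. It is an assumption-laden toy built on Python's standard `xml.etree.ElementTree`: the criterion is a plain predicate on elements, and the generation of a matching DTD described in the abstract is not attempted.

```python
import xml.etree.ElementTree as ET


def slice_xml(elem, keep):
    """Return a pruned copy of elem, or None if nothing in it is kept.

    An element survives when keep(elem) is true or when at least one of
    its descendants survives, so the path to every kept node is preserved.
    """
    kept_children = []
    for child in elem:
        sliced = slice_xml(child, keep)
        if sliced is not None:
            kept_children.append(sliced)
    if not keep(elem) and not kept_children:
        return None
    copy = ET.Element(elem.tag, dict(elem.attrib))
    copy.text = elem.text
    copy.extend(kept_children)
    return copy
```

For example, slicing a catalogue on `lambda e: e.tag == "title"` yields a document that contains only the titles and the elements enclosing them.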

16.
User interface and requirements prototyping is a requirements elicitation technique. A user interface and requirements prototype is built during the requirements engineering phase of a software system development. Various documents, such as the system requirements specification, are produced along with the user interface prototype. When a prototype and other documents exist, they may not describe the same functionality, particularly because the prototype may exhibit behaviour that is an artefact of prototyping and is not intended. The problem is that in later development stages, when there is a prototype and other documents, it is often difficult to reconcile the differences between the prototype and the other documents. This paper presents an approach for avoiding this difficulty. It demonstrates the approach by showing its application to parts of a real software development.

17.
18.
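Ontology-based modeling of knowledge base systems   Cited by: 2 (self-citations: 1, citations by others: 1)
Based on the KADS knowledge model, a layered model for constructing domain knowledge base systems is discussed, and the knowledge base is designed using an ontology. Some modules of a learning system are implemented with C++ and ASP. Finally, the user interface of the running prototype system is presented.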

19.
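Assisted generation of compound documents based on ontologies and semantic networks   Cited by: 1 (self-citations: 0, citations by others: 1)
To address the rapid preparation of engineering documents, a method for assisted generation of compound documents based on ontologies and semantic networks is proposed. Through an analysis of the ontology model and the semantic network, the approach, methods, and workflow for realizing assisted compound document generation are described. The technique is significant for accelerating the move of enterprises toward knowledge-based operation and for improving the accuracy, speed, and consistency of compound document preparation.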

20.
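After a study of the linguistic characteristics of writing errors in legal documents, text errors in such documents are classified into direct errors in narrative statements and implicit errors in written composition, and a set of regular-expression matching rules and character/word recognition rules is constructed to identify wrong characters and wrong words. Based on a study of the linguistic features of legal documents, a method combining rules with probabilistic statistics is proposed for proofreading legal documents. Experimental results show that both recall and precision reach 80%, indicating good prospects for practical use.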
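The rule-based half of such a proofreader can be sketched as a table of named regular expressions applied to the text. The two rules below (a repeated word and an impossible month in an ISO-style date) are invented English-language examples, not the rules from the paper, and the probabilistic component is left out entirely.

```python
import re

# Hypothetical rule set: (rule name, compiled pattern).
RULES = [
    ("repeated_word", re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)),
    ("invalid_month", re.compile(r"\b\d{4}-(?:00|1[3-9]|[2-9]\d)-\d{2}\b")),
]


def find_errors(text):
    """Return (rule name, matched text, offset) tuples in order of position."""
    hits = []
    for name, pattern in RULES:
        for match in pattern.finditer(text):
            hits.append((name, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])
```

Keeping the rules in a data table rather than in code makes it easy to add domain-specific patterns; a statistical model could then be applied to the spans that no rule covers.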
