1.

A survey on question answering technology from an information retrieval perspective

Oleksandr Kolomiyets Marie-Francine Moens 《Information Sciences》2011,181(24):5412-5434

This article provides a comprehensive and comparative overview of question answering technology. It presents the question answering task from an information retrieval perspective and emphasises the importance of retrieval models, i.e., representations of queries and information documents, and retrieval functions which are used for estimating the relevance between a query and an answer candidate. The survey suggests a general question answering architecture that steadily increases the complexity of the representation level of questions and information objects. On the one hand, natural language queries are reduced to keyword-based searches; on the other hand, knowledge bases are queried with structured or logical queries obtained from the natural language questions, and answers are obtained through reasoning. We discuss different levels of processing yielding bag-of-words-based and more complex representations integrating part-of-speech tags, classification of the expected answer type, semantic roles, discourse analysis, translation into a SQL-like language and logical representations.

2.

Learning-based SPARQL query performance modeling and prediction

Wei Emma Zhang Quan Z. Sheng Yongrui Qin Kerry Taylor Lina Yao 《World Wide Web》2018,21(4):1015-1035

One of the challenges of managing an RDF database is predicting the performance of SPARQL queries before they are executed. Performance characteristics, such as execution time and memory usage, can help data consumers identify unexpectedly long-running queries before they start and estimate the system workload for query scheduling. Extensive work addresses this performance prediction problem for traditional SQL queries, but it is not directly applicable to SPARQL queries. In this paper, we adopt machine learning techniques to predict the performance of SPARQL queries. Our work focuses on modeling the features of a SPARQL query as a vector representation. Our feature modeling method does not depend on knowledge of the underlying systems or the structure of the underlying data, but only on the nature of SPARQL queries. We then use these features to train prediction models. We propose a two-step prediction process and consider performance in both cold and warm stages. Evaluations are performed on real-world SPARQL queries whose execution times range from milliseconds to hours. The results demonstrate that the proposed approach can effectively predict SPARQL query performance and outperforms state-of-the-art approaches.
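
As a rough illustration of turning a SPARQL query into a feature vector using only the query text, the sketch below counts a few keywords and the distinct variables. The feature list `KEYWORDS` and the function name `sparql_features` are assumptions made for this example; the abstract does not state the paper's actual feature set.

```python
import re

# Hypothetical feature set: keywords whose occurrences are counted.
KEYWORDS = ["SELECT", "OPTIONAL", "UNION", "FILTER", "ORDER BY", "LIMIT"]


def sparql_features(query):
    """Map a SPARQL query string to a fixed-length list of counts.

    The vector holds one count per entry in KEYWORDS, followed by the
    number of distinct variables (tokens such as ?s) in the query.
    """
    upper = query.upper()
    counts = [
        len(re.findall(r"\b" + re.escape(keyword) + r"\b", upper))
        for keyword in KEYWORDS
    ]
    variables = set(re.findall(r"\?\w+", query))
    return counts + [len(variables)]
```

A vector like this depends only on the query string, not on the store or the data, which is the property the abstract stresses. A real feature set would most likely be richer.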

3.

A knowledge base property mapping method based on a hierarchical semantic framework

李豫 周光有 《中文信息学报》2022,36(2):49-57

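Automatic question answering over knowledge bases is an important natural language processing task that aims to give concise, accurate replies to questions posed by users in natural language. At present, a lack of datasets and inconsistent features, among other factors, make it difficult to apply general-purpose data and methods to domain-specific knowledge base question answering. This paper therefore treats "question intent" as a feature that question answering in different domains may share, and takes the process of mapping a "question" to a "relation predicate" in a triple knowledge base as the core task of question answering...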

4.

CONTEXTUAL LANGUAGE MODELS FOR RANKING ANSWERS TO NATURAL LANGUAGE DEFINITION QUESTIONS

Alejandro Figueroa John Atkinson 《Computational Intelligence》2012,28(4):528-548

Question-answering systems make good use of knowledge bases (KBs, e.g., Wikipedia) for responding to definition queries. Typically, systems extract relevant facts about the question from articles across KBs, and these facts are then projected into the candidate answers. However, studies have shown that the performance of this kind of method drops sharply whenever KBs supply narrow coverage. This work describes a new approach to deal with this problem by constructing context models for scoring candidate answers, which are, more precisely, statistical n-gram language models inferred from lexicalized dependency paths extracted from Wikipedia abstracts. Unlike state-of-the-art approaches, context models are created by capturing the semantics of candidate answers (e.g., "novel," "singer," "coach," and "city"). This work is extended by investigating the impact on context models of extra linguistic knowledge such as part-of-speech tagging and named-entity recognition. Results showed the effectiveness of context models as n-gram lexicalized dependency paths and promising context indicators for the presence of definitions in natural language texts.

5.

Core techniques of question answering systems over knowledge bases: a survey

Dennis Diefenbach Vanessa Lopez Kamal Singh Pierre Maret 《Knowledge and Information Systems》2018,55(3):529-569

The Semantic Web contains an enormous amount of information in the form of knowledge bases (KBs). To make this information available, many question answering (QA) systems over KBs have been created in recent years. Building a QA system over KBs is difficult because there are many different challenges to be solved. In order to address these challenges, QA systems generally combine techniques from natural language processing, information retrieval, machine learning and the Semantic Web. The aim of this survey is to give an overview of the techniques used in current QA systems over KBs. We present the techniques used by the QA systems which were evaluated on a popular series of benchmarks: Question Answering over Linked Data. Techniques that solve the same task are first grouped together and then described. The advantages and disadvantages are discussed for each technique. This allows a direct comparison of similar techniques. Additionally, we point to techniques that are used over WebQuestions and SimpleQuestions, which are two other popular benchmarks for QA systems.

6.

Construction of a resource base of modern Chinese adjectives

饶琪 王厚峰 汪梦翔 李慧 《中文信息学报》2018,32(4):50-58

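Adjectives, together with nouns and verbs, make up the main body of Chinese content words. Syntactically they depend heavily on nouns, and their core function, at the conceptual level and under the adjusting effect of the cognitive attention mechanism, is to "evaluate" the features of nouns. This paper describes work on building a knowledge base of Chinese adjectives. First, it surveys how existing resources cover adjectives and, adding adjectives newly produced by language change, builds a fairly comprehensive adjective word set. Second, it explains the ideas behind the construction of the knowledge base. Third, it describes the feature description scheme of the knowledge base. Finally, it looks ahead to application scenarios for the knowledge base.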

7.

SWSNL: Semantic Web Search Using Natural Language

Ivan Habernal Miloslav Konopík 《Expert systems with applications》2013,40(9):3649-3664

As modern search engines are approaching the ability to deal with queries expressed in natural language, full support of natural language interfaces seems to be the next step in the development of future systems. The vision is that of users being able to tell a computer what they would like to find, using any number of sentences and as many details as requested. In this article we describe our effort to move towards this future using currently available technology. The Semantic Web framework was chosen as the best means to achieve this goal. We present our approach to building a complete Semantic Web Search Using Natural Language (SWSNL) system. We cover the complete process, which includes preprocessing, semantic analysis, semantic interpretation, and executing a SPARQL query to retrieve the results. We perform an end-to-end evaluation on a domain dealing with accommodation options. The domain data come from an existing accommodation portal and we use a corpus of queries obtained by a Facebook campaign. In our paper we work with written texts in the Czech language. In addition to that, the Natural Language Understanding (NLU) module is evaluated on another domain (public transportation) and language (English). We expect that our findings will be valuable for the research community as they are strongly related to issues found in real-world scenarios. We struggled with inconsistencies in the actual Web data and with the performance of the Semantic Web engines on a decently sized knowledge base, among other issues.

8.

Type-Aware Question Answering over Knowledge Base with Attention-Based Tree-Structured Neural Networks

Jun Yin Wayne Xin Zhao Xiao-Ming Li 《计算机科学技术学报》2017,32(4):805-813

Question answering (QA) over knowledge base (KB) aims to provide a structured answer from a knowledge base to a natural language question. In this task, a key step is how to represent and understand the natural language query. In this paper, we propose to use tree-structured neural networks constructed based on the constituency tree to model natural language queries. We identify an interesting observation in the constituency tree: different constituents have their own semantic characteristics and might be suitable to solve different subtasks in a QA system. Based on this point, we incorporate the type information as an auxiliary supervision signal to improve the QA performance. We call our approach type-aware QA. We jointly characterize both the answer and its answer type in a unified neural network model with the attention mechanism. Instead of simply using the root representation, we represent the query by combining the representations of different constituents using task-specific attention weights. Extensive experiments on public datasets have demonstrated the effectiveness of our proposed model. More specifically, the learned attention weights are quite useful in understanding the query. The produced representations for intermediate nodes can be used for analyzing the effectiveness of components in a QA system.

9.

An MDE-based methodology for closed-world integrity constraint checking in the semantic web

《Journal of Web Semantics》2022

Ontology-based data-centric systems support open-world reasoning. Therefore, for these systems, Web Ontology Language (OWL) and Semantic Web Rule Language (SWRL) are not suitable for expressing integrity constraints based on the closed-world assumption. Thus, the requirement of integrating the open-world assumption of OWL/SWRL with closed-world integrity constraint checking is inevitable. SPARQL, recommended by the World Wide Web Consortium (W3C), is a query language for RDF graphs, and many research studies have shown that it is a perfect candidate for closed-world constraint checking for ontology-based data-centric applications. In this regard, many research studies have been performed to transform integrity constraints into SPARQL queries, where some studies have shown the limitations of partial expressivity of knowledge bases while performing the indirect transformations, whereas others are limited to a platform-specific implementation. To address these issues, this paper presents a flexible and formal methodology that employs Model-Driven Engineering (MDE) to model closed-world integrity constraints for open-world reasoning. The proposed approach offers semantic validation of data by expressing integrity constraints at both the model level and the code level. Moreover, straightforward transformations from OWL/SWRL to SPARQL can be performed. Finally, the methodology is demonstrated via a real-world case study of water observations data.

10.

An open-domain knowledge base question answering system based on feature enhancement

李帅驰 杨志豪 王鑫雷 韩钦宇 林鸿飞 《计算机工程与应用》2022,58(17):206-212

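Entity disambiguation and predicate matching are the two core tasks in Chinese knowledge base question answering (CKBQA) systems. Open-domain knowledge bases contain huge numbers of entities and predicates, and Chinese questions differ in surface form from the knowledge stored in the knowledge base. To address these problems, a pipelined question answering system built on feature-enhanced BERT (BERT-CKBQA) is proposed, which improves both subtasks. A BERT-CRF model recognises the entities mentioned in the question, yielding a set of candidate entities. The question and the candidate entities, concatenated with predicate features, are fed into a BERT-CNN model for entity disambiguation. A set of candidate predicates is generated from the entity, and a BERT-BiLSTM-CNN model that introduces the predicate features of the answer entity through an attention mechanism is proposed for predicate matching. The entity and predicate scores are combined to determine the query path used to retrieve the final answer. The method yields an open-domain knowledge base question answering system for simple Chinese questions, introduces a pre-trained model and predicate features to enhance the subtask features and improve performance, and achieves an average F1 of 88.75% on the NLPCC-ICCPOL-2016KBQA dataset, improving the answer accuracy of the system.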

11.

A knowledge base question answering system based on multi-feature semantic matching

赵小虎 赵成龙 《计算机应用》2020,40(7):1873-1878

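The main goal of the knowledge base question answering (KBQA) task is to match natural language questions accurately against triples in a knowledge base (KB). Traditional KBQA methods usually focus on entity recognition and predicate matching, and errors in entity recognition propagate so that the correct answer cannot be obtained. To address this problem, an end-to-end solution is proposed that matches questions directly against triples. The system consists of two main parts, candidate triple generation and candidate triple ranking. First, candidate triples are generated by using the BM25 algorithm to compute the relevance between the question and the triples in the knowledge base. Then a multi-feature semantic matching model (MFSMM) ranks the triples: MFSMM computes semantic similarity with a bidirectional long short-term memory network (Bi-LSTM) and character similarity with a convolutional neural network (CNN), and fuses the two to rank the triples. The system achieves an average F1 of 80.35% on the NLPCC-ICCPOL 2016 KBQA dataset, close to the best existing result.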
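
The candidate-generation step can be sketched as a plain BM25 ranking of triples against the question. This is a minimal sketch under several assumptions that are not from the abstract: each triple is flattened into one whitespace-tokenised string, the parameters are `k1=1.5` and `b=0.75`, the IDF uses the non-negative `log(1 + ...)` variant, and the function name `bm25_rank` is hypothetical. The system described above works on Chinese text and would need a suitable tokeniser.

```python
import math
from collections import Counter


def bm25_rank(question, triples, k1=1.5, b=0.75):
    """Return the triples sorted by BM25 relevance to the question, best first.

    Assumes `triples` is a non-empty list of (subject, predicate, object)
    string tuples.
    """
    docs = [" ".join(triple).lower().split() for triple in triples]
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: number of triples that contain each token.
    df = Counter(token for doc in docs for token in set(doc))
    query_tokens = question.lower().split()

    def score(doc):
        tf = Counter(doc)
        total = 0.0
        for token in query_tokens:
            if token not in tf:
                continue
            idf = math.log(1 + (n_docs - df[token] + 0.5) / (df[token] + 0.5))
            norm = tf[token] + k1 * (1 - b + b * len(doc) / avg_len)
            total += idf * tf[token] * (k1 + 1) / norm
        return total

    ranked = sorted(zip(triples, docs), key=lambda pair: score(pair[1]), reverse=True)
    return [triple for triple, _ in ranked]
```

In a pipeline of this kind, only the top few triples returned by the ranking would be passed on to the more expensive neural ranking stage.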

12.

Knowledge base question answering based on representation learning: research progress and prospects

刘康 张元哲 纪国良 来斯惟 赵军 《自动化学报》2016,42(6):807-818

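Question answering over knowledge base (KBQA) is an important component of question answering systems. In recent years, as representation learning techniques, exemplified by deep learning, have been applied successfully in many fields, many researchers have begun to study knowledge base question answering based on representation learning. Its basic assumption is to treat knowledge base question answering as a semantic matching process. By learning semantic representations of the knowledge base and of user questions, the entities and relations in the knowledge base and the question text are converted into numerical vectors in a low-dimensional semantic space; on this basis, numerical computation is used to directly match the answer that is semantically most similar to the user's question. Results to date show that knowledge base question answering systems based on representation learning already outperform traditional knowledge base question answering methods. This paper surveys research progress on representation-learning-based knowledge base question answering, including representative work on knowledge base representation learning and question (text) representation learning, and analyses and discusses the difficulties and open research problems.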

13.

AquaLog: An ontology-driven question answering system for organizational semantic intranets

Vanessa Lopez Victoria Uren Enrico Motta Michele Pasin 《Journal of Web Semantics》2007,5(2):72-105

The semantic web vision is one in which rich, ontology-based semantic markup will become widely available. The availability of semantic markup on the web opens the way to novel, sophisticated forms of question answering. AquaLog is a portable question-answering system which takes queries expressed in natural language and an ontology as input, and returns answers drawn from one or more knowledge bases (KBs). We say that AquaLog is portable because the configuration time required to customize the system for a particular ontology is negligible. AquaLog presents an elegant solution in which different strategies are combined in a novel way. It makes use of the GATE NLP platform, string metric algorithms, WordNet and a novel ontology-based relation similarity service to make sense of user queries with respect to the target KB. Moreover, it also includes a learning component, which ensures that the performance of the system improves over time, in response to the particular community jargon used by end users.

14.

Efficient distributed query processing on large-scale RDF graph data

王鑫 徐强 柴乐乐 杨雅君 柴云鹏 《软件学报》2019,30(3):498-514

15.

gStore: a graph-based SPARQL query engine

Lei Zou M. Tamer Özsu Lei Chen Xuchuan Shen Ruizhe Huang Dongyan Zhao 《The VLDB Journal: The International Journal on Very Large Data Bases》2014,23(4):565-590

We address efficient processing of SPARQL queries over RDF datasets. The proposed techniques, incorporated into the gStore system, handle, in a uniform and scalable manner, SPARQL queries with wildcards and aggregate operators over dynamic RDF datasets. Our approach is graph based. We store RDF data as a large graph and also represent a SPARQL query as a query graph. Thus, the query answering problem is converted into a subgraph matching problem. To achieve efficient and scalable query processing, we develop an index, together with effective pruning rules and efficient search algorithms. We propose techniques that use this infrastructure to answer aggregation queries. We also propose an effective maintenance algorithm to handle online updates over RDF repositories. Extensive experiments confirm the efficiency and effectiveness of our solutions.
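
To make the "query answering as subgraph matching" idea concrete, the sketch below matches a small query graph against a list of RDF-style triples by naive backtracking. It is not gStore's algorithm, which relies on an index and pruning rules. The function name `match_pattern` and the convention that terms starting with `?` are variables are assumptions of this example.

```python
def match_pattern(pattern, triples, binding=None):
    """Yield variable bindings under which every pattern triple occurs in `triples`.

    `pattern` and `triples` are sequences of (subject, predicate, object)
    string tuples. Terms starting with "?" are variables; all other terms
    are constants that must match exactly.
    """
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in triples:
        new_binding = dict(binding)
        matched = True
        for pattern_term, data_term in zip(first, triple):
            if pattern_term.startswith("?"):
                # A variable must map to the same value everywhere it appears.
                if new_binding.setdefault(pattern_term, data_term) != data_term:
                    matched = False
                    break
            elif pattern_term != data_term:
                matched = False
                break
        if matched:
            yield from match_pattern(rest, triples, new_binding)
```

Each pattern triple scans the whole data list here, so the cost grows quickly as the pattern gets larger. This is the kind of cost that the index and pruning rules mentioned in the abstract are meant to reduce.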

16.

Translational relation embeddings for multi-hop knowledge base question answering

《Journal of Web Semantics》2022

Multi-hop Knowledge Base Question Answering (KBQA) aims to predict answers that require multi-hop reasoning from the topic entity in the question over the Knowledge Base (KB). Relation extraction is a core step in KBQA, which extracts the relation path from the topic entity to the answer entity. Compared with single-hop questions, multi-hop ones have more complex syntactic structures to understand, and multi-hop relation paths lead to a larger search space, which makes it much more challenging to extract the correct relation paths. To tackle the above challenges, most existing relation extraction approaches focus on the semantic similarity between questions and relation paths. However, those approaches only consider the word semantics of the relation names but ignore the graph semantics inside the knowledge base. As a result, their generalization ability relies on the naming rules of the relations, making it more difficult to generalize over large knowledge bases. To address the current limitations and take advantage of the graph semantics of relations, we propose a novel translational embedding-based relation extractor that utilizes pretrained embeddings from TransE. In particular, we treat the multi-hop relation path as a translation from the first relation to the last one in the semantic space of TransE. Then we map the question into this space under the supervision of the path embeddings. To take full advantage of the pretrained graph semantics in TransE, we propose a KBQA framework that leverages pretrained relation semantics in relation extraction and pretrained entity semantics in answer selection. Our approach achieves state-of-the-art performance on two benchmark datasets, WebQuestionSP and MetaQA, demonstrating its effectiveness on the multi-hop KBQA task.
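
TransE models a fact (h, r, t) as a translation, h + r ≈ t, so one simple way to embed a multi-hop path is to sum the vectors of its relations. The sketch below uses that reading to pick the candidate path closest to a question vector. The summation rule, the toy two-dimensional vectors in the usage note, and the names `path_embedding` and `best_path` are assumptions of this example, not the paper's stated formulation.

```python
import math


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def path_embedding(path, relation_vecs):
    """Sum the relation vectors along a path (assumed TransE-style composition)."""
    dim = len(next(iter(relation_vecs.values())))
    total = [0.0] * dim
    for relation in path:
        total = [a + b for a, b in zip(total, relation_vecs[relation])]
    return total


def best_path(question_vec, candidate_paths, relation_vecs):
    """Return the candidate path whose embedding lies closest to the question vector."""
    return min(
        candidate_paths,
        key=lambda path: distance(question_vec, path_embedding(path, relation_vecs)),
    )
```

For example, with hypothetical vectors `{"directed_by": [1.0, 0.0], "born_in": [0.0, 1.0]}` and a question vector near `[1.0, 1.0]`, the two-hop path `("directed_by", "born_in")` is selected over either single relation. In the approach described above, the question vector would come from a learned encoder trained against the path embeddings.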

17.

A multi-view hierarchical matching network for KBQA relation detection

朱雅凤 邵清 《小型微型计算机系统》2020,(1):12-18

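Knowledge base question answering (KBQA) answers a natural language question using one or more knowledge triples from a knowledge base, which requires detecting the knowledge base entities and relations mentioned in the question. Relation detection is the core of KBQA. To address the single matching view and the information bottleneck of existing relation detection methods, this paper proposes a Multi-view Hierarchical Matching Network (M-HMN). M-HMN uses a bidirectional attention mechanism to align different features of the question and the candidate relation, sharpening how finely their matching parts are observed, packs the matching information into vectors, and then uses a self-attention mechanism to aggregate the vectors effectively for correct relation detection. For evaluation on the final KBQA task, the paper proposes a simple entity re-ranking algorithm that uses the M-HMN network to refine the candidate entity set. Experimental results show that M-HMN effectively alleviates the information bottleneck in relation detection, and that the proposed entity re-ranking algorithm can disambiguate entities and produce a smaller, more precise candidate entity set, significantly improving performance on the final KBQA task.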

18.

SEMANTIC MAPPING FROM NATURAL LANGUAGE QUESTIONS TO OWL QUERIES

Mingxia Gao Jiming Liu Ning Zhong Furong Chen Chunnian Liu 《Computational Intelligence》2011,27(2):280-314

19.

Knowledge graph question answering based on multi-granularity feature representation

申存 黄廷磊 梁霄 《计算机与现代化》2018,(9):5

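In recent years, question answering systems based on knowledge graphs have become a hot topic for research and application in both academia and industry, but traditional methods are often inefficient and do not make full use of the information in the data. To address these problems, this paper divides Chinese knowledge graph question answering into two subtasks, entity extraction and attribute selection. It uses a bidirectional long short-term memory conditional random field (Bi-LSTM-CRF) model for entity recognition and proposes an attribute selection model with multi-granularity feature representation. The model embeds questions and attributes at both the character level and the word level and encodes them with an encoder; for attributes, it additionally introduces popularity encoding information. By combining text representations of different granularities and computing the similarity between questions and attributes, the system achieves an F1 of 73.96% on the NLPCC-ICCPOL 2016 KBQA dataset and completes the knowledge graph question answering task well.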

20.

Logical reasoning in natural language: It is all about knowledge

Lucja Iwańska 《Minds and Machines》1993,3(4):475-510

A formal, computational, semantically clean representation of natural language is presented. This representation captures the fact that logical inferences in natural language crucially depend on the semantic relation of entailment between sentential constituents such as determiner, noun, adjective, adverb, preposition, and verb phrases. The representation parallels natural language in that it accounts for human intuition about entailment of sentences, it preserves its structure, it reflects the semantics of different syntactic categories, it simulates conjunction, disjunction, and negation in natural language by computable operations with provable mathematical properties, and it allows one to represent coordination on different syntactic levels. The representation demonstrates that Boolean semantics of natural language can be successfully modeled in terms of representation and inference by knowledge representation formalisms with Boolean semantics. A novel approach to the problem of automatic inferencing in natural language is presented. The algorithm for updating a computer knowledge base and reasoning with explicit negative, disjunctive, and conjunctive information, based on computing the subsumption relation between the representations of the appropriate sentential constituents, is discussed with examples.