Similar literature
1.
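Community question answering (CQA) systems, such as Yahoo! Answers, are communities designed to overcome the limitations of traditional search engines and help users obtain useful information. Question retrieval in CQA takes a new question posed by a user and retrieves the most relevant questions from the archive of historical question-answer pairs, which shortens the user's wait and improves the user experience. This paper proposes a new question retrieval method based on topic category information: the category information of questions is used to smooth the language model, and the semantic information of questions is incorporated as well. Comparative experiments on a real annotated dataset extracted from Yahoo! Answers show that the proposed method achieves relatively good performance.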
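
The smoothing idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' model: the two-level Jelinek-Mercer interpolation, the weights `lam` and `beta`, the function names, and the toy data are all assumptions chosen for illustration, and the semantic component mentioned in the abstract is left out.

```python
import math
from collections import Counter


def ml_prob(word, counts, total):
    """Maximum-likelihood estimate of P(word) from raw term counts."""
    return counts[word] / total if total else 0.0


def category_smoothed_score(query, question, category_questions,
                            collection_questions, lam=0.2, beta=0.3):
    """Log-likelihood of `query` under the language model of `question`.

    The question model is interpolated with a background model, which is
    itself a mix of the question's category model and the whole-collection
    model. `lam` and `beta` are hypothetical interpolation weights.
    """
    q_counts = Counter(question)
    cat_counts = Counter(w for q in category_questions for w in q)
    col_counts = Counter(w for q in collection_questions for w in q)
    q_total = sum(q_counts.values())
    cat_total = sum(cat_counts.values())
    col_total = sum(col_counts.values())

    log_p = 0.0
    for word in query:
        p_question = ml_prob(word, q_counts, q_total)
        p_category = ml_prob(word, cat_counts, cat_total)
        p_collection = ml_prob(word, col_counts, col_total)
        background = (1 - beta) * p_category + beta * p_collection
        p = (1 - lam) * p_question + lam * background
        # Floor the probability so words unseen everywhere do not yield log(0).
        log_p += math.log(max(p, 1e-12))
    return log_p
```

With this scoring, a historical question from the same category as the query terms can rank above an unrelated one even when it shares only some of the query words, because the category model contributes probability mass for the missing terms.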

2.
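The rise of community-based question answering (CQA) portals has given users a platform for obtaining information and sharing knowledge, and has also accumulated a large body of question-answer resources. In recent years, research on question matching, expert finding, user satisfaction analysis, and answer quality evaluation in Q&A communities has deepened. Answer quality research in particular has gradually shifted from improving user experience through answer quality evaluation to raising answer quality through answer summarization. This paper explains why answer summarization matters for reusing question-answer pairs in CQA systems, outlines its main tasks, analyzes the similarities and differences between answer summarization and multi-document summarization, surveys the state of research in China and abroad, and summarizes the key technical problems that remain to be solved.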

3.
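An LDA-based method for computing question similarity in community question answering   Total citations: 2, self-citations: 0, citations by others: 2
Traditional question answering (QA) systems simply return the answer to a question and offer no user interaction, whereas community-based question answering (CQA) systems contain a large number of question-answer pairs that can be exploited. This paper proposes an LDA-based matching framework for matching similar questions: question similarity is computed from three aspects (statistical information, semantic information, and topic information), and the three are combined into an overall similarity. Experiments on a real annotated dataset extracted from Yahoo! Answers show that the method achieves good performance.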
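
As a rough illustration of combining three similarity views into one score, the sketch below uses a weighted sum. The abstract does not say how the components are computed or combined, so everything here is an assumption: cosine similarity over term counts stands in for the statistical view, cosine similarity over topic distributions (for example, as produced by an LDA model) stands in for the topic view, the semantic score is taken as a precomputed input, and the weights are arbitrary.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def overall_similarity(tokens_a, tokens_b, topics_a, topics_b,
                       semantic_sim, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of statistical, semantic, and topic similarity.

    `topics_a` and `topics_b` are per-question topic distributions,
    `semantic_sim` is a score in [0, 1] computed elsewhere, and `weights`
    are hypothetical mixing weights that should sum to 1.
    """
    statistical = cosine(Counter(tokens_a), Counter(tokens_b))
    topical = cosine(dict(enumerate(topics_a)), dict(enumerate(topics_b)))
    w_stat, w_sem, w_topic = weights
    return w_stat * statistical + w_sem * semantic_sim + w_topic * topical
```

In practice the weights would be tuned on held-out annotated question pairs rather than fixed by hand.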

4.
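Community question answering (CQA) systems for Q&A communities have been a research hotspot in recent years. To address the complexity of question classification in such systems, this paper proposes a coarse-grained classification taxonomy and a multi-label, multi-class question classification algorithm called MLMC. A complete classification system is implemented on top of an SVM classification model, and it reaches an overall classification accuracy of 73.6%.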

5.
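StackExchange is one of the most popular networks of Q&A communities. Using StackExchange users whose profiles carry geographic information in the United States, this paper builds knowledge diffusion graphs for StackExchange Q&A communities within the US, analyzes the statistical characteristics of the diffusion networks, extracts the diffusion patterns of Q&A community sites, and identifies how users share knowledge. We find that knowledge sharing in StackExchange communities often has more than one diffusion source. We also find that the diffusion graphs built for different communities have similar statistical characteristics, which suggests that different Q&A communities may share similar knowledge diffusion patterns.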

6.
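Question recommendation in CQA analyzes the needs behind a new question posed by a user and then finds, among historical question-answer pairs, the questions most relevant to the user's original question. When no exact result can be provided, this gives the user more choices and improves the user experience. This paper proposes a question recommendation method based on user interests and needs: a PLSA model discovers a user's interests from the record of questions the user has answered, and a translation-based model predicts the user's needs from the query. Interests and needs are then combined to recommend the most relevant questions from the Q&A corpus. Comparative experiments on a real annotated dataset extracted from Yahoo! Answers show that the method achieves relatively good performance.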

7.
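DONG Caizheng (董才正), LIU Baisong (刘柏嵩). Journal of Computer Applications (《计算机应用》), 2016, 36(4): 1060-1065
Traditional question taxonomies are mostly built around factoid questions, and traditional question classification methods rely heavily on interrogative words as a feature. In community question answering (CQA), however, non-factoid questions dominate and many questions contain no interrogative word. To address this, a coarse-grained taxonomy for Q&A communities is proposed, together with a hierarchical question classification method based on interrogative words. The method first automatically identifies the interrogative word in a question. If one is present, a support vector machine (SVM) model classifies the question; otherwise, a purpose-built classifier based on focus words is used. In experiments on a dataset of questions crawled from the Chinese Q&A community Zhihu, the method improves classification accuracy by 4.7 percentage points over a traditional SVM-based method. The results show that choosing the classifier according to whether a question contains an interrogative word reduces the dependence on interrogative words and effectively improves question classification accuracy in Q&A communities.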
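
The routing step described in this abstract can be sketched as a small dispatcher. The sketch is illustrative only: the English interrogative list, the function names, and the two classifier callables are hypothetical stand-ins (the paper works on Chinese questions from Zhihu, and its SVM and focus-word classifiers are not reproduced here).

```python
from typing import Callable, Optional, Sequence

# Hypothetical English interrogative list, used only for illustration.
INTERROGATIVES = ("what", "why", "how", "who", "when", "where", "which")


def find_interrogative(tokens: Sequence[str]) -> Optional[str]:
    """Return the first interrogative word in the question, or None."""
    for token in tokens:
        if token.lower() in INTERROGATIVES:
            return token.lower()
    return None


def classify_question(tokens: Sequence[str],
                      svm_classifier: Callable[[Sequence[str]], str],
                      focus_classifier: Callable[[Sequence[str]], str]) -> str:
    """Route the question to one of two classifiers.

    Questions that contain an interrogative word go to `svm_classifier`;
    all others go to `focus_classifier`. Both are assumed to be trained
    elsewhere and to map a token sequence to a category label.
    """
    if find_interrogative(tokens) is not None:
        return svm_classifier(tokens)
    return focus_classifier(tokens)
```

Keeping the two classifiers behind plain callables means either one can be swapped out without touching the routing logic.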

8.
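Community question answering systems have become an important channel for acquiring and sharing knowledge, but the quality of user-provided information varies widely. For questions with multiple answers, this paper proposes a hybrid answer quality evaluation model for community Q&A that can select the best answer. The model first uses a user-activity-based model (UAM) to obtain the topic similarity between the question and its answers and to remove irrelevant replies. It then combines user authority with multiple evaluation criteria to score the answers, producing a quantitative evaluation of each answer. Experiments on Stack Overflow show that the method scores answer quality effectively and has practical value.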
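
The two-stage pipeline in this abstract (filter irrelevant replies, then score what remains) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's model: the topic similarity and authority values are taken as precomputed inputs in [0, 1], a single `vote_score` stands in for the "multiple evaluation criteria", and the threshold and weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    topic_similarity: float  # similarity to the question, assumed precomputed
    author_authority: float  # authority of the answerer, assumed precomputed
    vote_score: float        # stand-in for other evaluation criteria


def rank_answers(answers, min_similarity=0.2, weights=(0.5, 0.3, 0.2)):
    """Drop off-topic answers, then rank the rest by a weighted score.

    Returns a list of (score, answer) pairs sorted from best to worst.
    `min_similarity` and `weights` are illustrative values only.
    """
    w_sim, w_auth, w_vote = weights
    relevant = [a for a in answers if a.topic_similarity >= min_similarity]
    scored = [
        (w_sim * a.topic_similarity
         + w_auth * a.author_authority
         + w_vote * a.vote_score, a)
        for a in relevant
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

The first element of the returned list plays the role of the selected best answer. Filtering before scoring keeps a highly voted but off-topic reply from winning on authority alone.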

9.
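Knowledge-sharing websites have opened new research opportunities for automatic question answering. However, the quality of user-provided questions and answers is uneven: alongside useful information, they may contain irrelevant or even malicious content. Identifying and filtering such content and selecting high-quality question-answer pairs makes it possible to reuse answers to related questions in community-based QA systems and so improve service quality. A large number of questions and answers were first crawled from a Chinese community Q&A site, and social network methods were used to collect statistics on and analyze the interactions between askers and answerers. Then, based on given quality criteria, more than 3,000 questions and their answers were manually annotated. Textual and non-textual feature sets were extracted, and a feature-based Q&A quality classifier was designed and implemented using machine learning algorithms. Experimental results show that both precision and recall exceed 70%. Finally, the main factors affecting Q&A quality in community networks are analyzed.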

10.
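One of the main tasks of community question answering services such as Baidu Zhidao (百度知道) is to classify questions so that user questions can be organized. Practical use places demands on question classification algorithms: high accuracy, low computational cost, and low sensitivity to noisy data. Classification algorithms based on the Kullback-Leibler distance achieve high accuracy on large-scale text and high-dimensional vector classification tasks. Building on this classification algorithm, this paper...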

11.
An increasingly popular method for retrieving information is via community question answering (CQA) systems such as Yahoo! Answers and Baidu Knows. In CQA, question classification plays an important role in finding answers. However, labeled training examples for a statistical question classifier are expensive to obtain, as they require experienced human effort, whereas unlabeled data are readily available. This paper employs domain adaptation via kernel mapping to solve this problem. In detail, a kernel approach maps the target-domain data and the source-domain data into a common space, where the question classifiers are trained under closer conditional probabilities. The kernel mapping function is constructed from domain knowledge, so domain knowledge can be transferred from the labeled examples in the source domain to the unlabeled ones in the target domain. The statistical training model can be improved by using a large amount of unlabeled data. The Hadoop platform is used to construct the mapping mechanism and reduce the time complexity: Map/Reduce enables kernel mapping for domain adaptation to run in parallel. Experimental results show that kernel mapping improves the accuracy of question classification. Furthermore, the parallel method on the Hadoop platform can effectively schedule computing resources to reduce the running time.

12.
Community question answering (CQA) has recently become a popular form of social media in which users can post questions on any topic of interest and get answers from enthusiasts. The variation of topics in questions and answers indicates how users' interests change over time. Exploiting hot topics and analyzing the trend of a specific topic can help users focus on the most popular products or events and track their changes. In this paper, we present a hot topic detection and trend analysis system that captures hot topics in a CQA system and tracks their evolution over time. Our system consists of hot term extraction, question clustering, and trend analysis. Experimental results using datasets from Yahoo! Answers show that our system can discover meaningful hot topics. We also show that the evolution of topics over time can be accurately captured by trend graphing.

13.
Cao Xing, Zhao Yingsi, Shen Bo. Neural Computing and Applications, 2023, 35(7): 5513-5533
Complex question answering (CQA) is widely used in real-world tasks such as search engines and intelligent customer service. With the development of large-scale...

14.
Contextual question answering (CQA), in which user information needs are satisfied through an interactive question answering (QA) dialog, has recently attracted more research attention. One challenge is to fuse contextual information into the understanding of relevant questions. In this paper, a discourse structure is proposed to maintain semantic information, along with approaches for recognizing the relevancy type and for fusing contextual information according to that type. The system is evaluated on real contextual QA data. The results show that it outperforms a baseline system and performs almost as well as when these contextual phenomena are resolved manually. A detailed evaluation analysis is presented.

15.
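HU Jie (胡婕), CHEN Xiaoxi (陈晓茜), ZHANG Yan (张龑). Journal of Computer Applications (《计算机应用》), 2023, 43(2): 365-373
Current mainstream models cannot fully represent the semantics of question-answer pairs, do not fully consider the relationship between the topic information of the question and the answer, and use activation functions that suffer from soft saturation. All of these affect overall model performance. To address these problems, an answer selection model based on BERT enhanced with pooling and feature combination is proposed. First, adversarial examples are added to the pretrained BERT model and a pooling operation is introduced to represent the semantics of question-answer pairs. Second, a combination of topic information features is introduced to strengthen the link between the topic information of the question and the answer. Finally, the activation function of the hidden layer is improved, and the concatenated vector is passed through the hidden layer and a classifier to perform answer selection. Validation on the SemEval-2016CQA and SemEval-2017CQA datasets shows that, compared with the tBERT model, the proposed model improves accuracy by 3.1 and 2.2 percentage points and F1 by 2.0 and 3.1 percentage points, respectively. The proposed model therefore improves overall answer selection performance, with accuracy and F1 both higher than those of the compared models.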
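
To make the pooling step concrete, here is a minimal sketch of turning a sequence of token vectors into one fixed-size representation. The abstract does not specify which pooling operation is used, so concatenating mean pooling and max pooling is an assumption, and plain Python lists stand in for the encoder's output tensors.

```python
def mean_pool(token_vectors):
    """Average each dimension over all token vectors."""
    count = len(token_vectors)
    return [sum(column) / count for column in zip(*token_vectors)]


def max_pool(token_vectors):
    """Take the maximum of each dimension over all token vectors."""
    return [max(column) for column in zip(*token_vectors)]


def pooled_representation(token_vectors):
    """Concatenate mean-pooled and max-pooled views of a token sequence.

    `token_vectors` is a non-empty list of equal-length float lists, for
    example the per-token outputs of an encoder for one question-answer pair.
    """
    if not token_vectors:
        raise ValueError("token_vectors must not be empty")
    return mean_pool(token_vectors) + max_pool(token_vectors)
```

The result has twice the dimensionality of a single token vector and could then be concatenated with other features before a classification layer.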

16.
Modern Community Question Answering (CQA) web forums make it possible to browse their archives using question-like search queries, as in Information Retrieval (IR) systems. Although these traditional IR methods have become very successful at fetching semantically related questions, they typically leave temporal relations unconsidered. That is to say, a group of questions may be asked more often during specific recurring time frames despite being semantically unrelated. In fact, predicting temporal aspects would not only help these platforms widen the semantic diversity of their search results, but also help them re-state questions whose answers need refreshing and produce more dynamic, especially temporally anchored, displays. In this paper, we devise a new set of time-frame-specific categories for CQA questions, obtained by fusing two distinct earlier taxonomies (i.e., [29] and [50]). These new categories are then used in a large crowd-sourced human annotation effort. Accordingly, we present a systematic analysis of its results in terms of complexity and degree of difficulty as they relate to the different question topics. Incidentally, through a large number of experiments, we investigate the effectiveness of a wider variety of linguistic features than in previous work. We additionally mix evidence and features distilled directly and indirectly from questions by capitalizing on their related web search results. We finally investigate the impact and effectiveness of multi-view learning in boosting a large variety of multi-class supervised learners by optimizing a latent layer built on top of two views: one composed of features harvested from questions, and the other of CQA metadata and evidence extracted from web resources (i.e., snippets and Internet archives).

17.
Community question answering (CQA) refers to Web applications in which people exchange knowledge by asking and answering questions. One significant challenge for most real-world CQA systems is the lack of effective matching between questions and potential good answerers, which hampers efficient knowledge acquisition and circulation. On the one hand, a requester might receive many low-quality answers without getting a quality response in a short time; on the other hand, an answerer might face numerous new questions without being able to identify the questions of interest quickly. In this situation, expert recommendation emerges as a promising technique to address these issues. Instead of passively waiting for users to browse and find their questions of interest, an expert recommendation method actively and promptly draws users' attention to the appropriate questions. The past few years have witnessed considerable efforts to address the expert recommendation problem from different perspectives. These methods all have issues that need to be resolved before the advantages of expert recommendation can be fully realized. In this survey, we first present an overview of the research efforts and state-of-the-art techniques for expert recommendation in CQA. We next summarize and compare the existing methods with respect to their advantages and shortcomings, and then discuss open issues and future research directions.

18.
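Research on methods for evaluating answer quality in Q&A communities   Total citations: 3, self-citations: 0, citations by others: 3
Q&A communities have become an important channel for obtaining information online, but the quality of their information varies widely. This paper studies methods for evaluating answer quality in Q&A communities. It examines the Baidu Zhidao (百度知道) Q&A community environment and builds a large-scale corpus from it. Based on the characteristics of Baidu Zhidao, the paper proposes temporal features, question-granularity features, and features based on Baidu Zhidao community users, so that answer quality is evaluated from more angles. Within a classification learning framework, the three new feature groups are combined with classic textual and link features to classify answers as high-quality or not. Experiments on a large-scale Q&A corpus show that, on top of textual and link features, the temporal and question-granularity features effectively improve answer quality evaluation. The quality scores produced by the framework are also found to predict the best answer effectively.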
