3D visual question answering based on sub-questions asymptotic reasoning

Funding:	National Key Research and Development Program of China (2021YFC3300305)

Received:	2022-08-12
Revised:	2023-03-07

Cite this article:	Li Changjian, Yang Yuwei, Xiao Xiao, Lei Yinjie. 3D visual question answering based on sub-questions asymptotic reasoning[J]. Application Research of Computers, 2023, 40(4): 987-990+995

Authors:	Li Changjian, Yang Yuwei, Xiao Xiao, Lei Yinjie

Affiliation:	College of Electronics and Information Engineering, Sichuan University (all four authors)

Abstract:	3D visual question answering helps people understand spatial information and has promising applications in areas such as early childhood education. Because 3D scene information is complex and most existing methods answer the question directly, they tend to overlook contextual details when the question is complex, which degrades performance. To address this problem, this paper proposes a 3D visual question answering method based on sub-question asymptotic reasoning, which uses text analysis to construct multiple simple sub-questions for each complex original question. While answering the sub-questions, the model learns context information that helps it understand the meaning of the complex question, and it finally uses the accumulated joint information to derive the answer to the original question. The sub-questions stand in an asymptotic reasoning relationship with the original question, which gives the model explicit error interpretability and traceability. Experiments on the existing 3D dataset ScanQA show that the proposed method achieves 51.49% and 61.68% on the EM@10 and CIDEr metrics, respectively, exceeding other existing 3D visual question answering methods and confirming its effectiveness.
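
For readers unfamiliar with the EM@10 metric reported in the abstract, the following is a minimal sketch of how an exact-match-at-K score is commonly computed: a question counts as correct if any of the model's top-K ranked answers exactly matches one of the reference answers. The function names `exact_match_at_k` and `normalize`, and the lowercase/whitespace normalization, are assumptions made for illustration; they are not taken from the paper or from the ScanQA evaluation code.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(answer.lower().split())


def exact_match_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    references: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Fraction of questions whose top-k predictions contain a reference answer.

    ranked_predictions[i] holds the candidate answers for question i,
    ordered from most to least confident; references[i] holds the
    acceptable ground-truth answers for the same question.
    """
    if len(ranked_predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    hits = 0
    for preds, refs in zip(ranked_predictions, references):
        ref_set = {normalize(r) for r in refs}
        # Only the k highest-ranked candidates are allowed to score a hit.
        if any(normalize(p) in ref_set for p in preds[:k]):
            hits += 1
    return hits / len(references)


# Toy data, not results from the paper.
toy_predictions = [["chair", "table"], ["red", "blue"]]
toy_references = [["table"], ["green"]]
em_at_1 = exact_match_at_k(toy_predictions, toy_references, k=1)
em_at_2 = exact_match_at_k(toy_predictions, toy_references, k=2)
```

On the toy data, the first question is missed at K=1 but matched at K=2, and the second question is never matched, which illustrates why EM@10 is always at least as high as EM@1 for the same model.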

Keywords:	3D visual question answering; original question; sub-question; asymptotic reasoning; context information
