
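Automatically constructed Chinese knowledge graph system (自动化构建的中文知识图谱系统)
Cite this article: E Shijia, LIN Peiyu, XIANG Yang. Automatically constructed Chinese knowledge graph system (自动化构建的中文知识图谱系统)[J]. Journal of Computer Applications (计算机应用), 2016, 36(4): 992-996.
Authors: E Shijia (鄂世嘉), LIN Peiyu (林培裕), XIANG Yang (向阳)
Affiliation: College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
Funding: National Basic Research Program of China (973 Program) (2014CB340404); Science and Technology Commission of Shanghai Municipality research program (14511108002)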
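Abstract: To address the low accuracy, long construction time, and heavy manual effort of current Chinese knowledge graph construction, this paper proposes a complete end-to-end solution for automatically building a Chinese knowledge graph from Chinese encyclopedia data, and on that basis develops a user-facing Chinese knowledge graph system. In this solution, a custom Web crawler continuously fetches the entry attributes and related text of the raw encyclopedia data into the local system and saves them as triples with extended attributes. The back-end system then automatically imports the triple files through the graph database Cayley and the MongoDB database system, converting them into a large knowledge graph, so that the front end can offer users a range of knowledge-graph-based application services. Compared with other knowledge graph systems, the solution clearly reduces construction time, and the total number of entities and relations in the knowledge graph is at least 50% higher than that of Chinese knowledge graph systems such as YAGO, HowNet, and the Chinese Concept Dictionary.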

Keywords: knowledge graph; Web crawler; triple file; knowledge base; graph database
Received: 2015-09-06
Revised: 2015-11-12

Automatical construction of Chinese knowledge graph system
E Shijia, LIN Peiyu, XIANG Yang. Automatical construction of Chinese knowledge graph system[J]. Journal of Computer Applications, 2016, 36(4): 992-996.
Authors: E Shijia; LIN Peiyu; XIANG Yang
Affiliation: College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
Abstract: Current methods for constructing Chinese knowledge graphs are time-consuming, have low accuracy, and require a lot of manual intervention. To solve these problems, an integrated end-to-end solution for automatic construction, based on rich data from Chinese encyclopedias, was proposed, and a user-oriented Chinese knowledge graph system was implemented. In this solution, a custom Web crawler continuously scraped entry attributes and related text from the original encyclopedia data to the local system and saved them as triples with extended attributes. The back-end system imported the archived triple files through the graph-oriented database Cayley and the document-oriented database MongoDB, and converted them into a large knowledge graph that supports various services in the front-end system. Compared with other knowledge graph systems, the proposed system significantly reduces construction time; moreover, its total number of entities and relations is at least 50% higher than that of other knowledge graph systems such as YAGO, HowNet and the Chinese Concept Dictionary.
Keywords: knowledge graph; Web crawler; triple file; knowledge base; graph-oriented database
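
The abstract describes saving scraped facts as triples with extended attributes and then loading them into a graph store and a document store. It does not give the file format, so the following is only a minimal sketch of one way such a record could be modeled. The class `ExtendedTriple`, the `attrs` field, the quoted line format (loosely modeled on N-Quads), and the document keys `s`/`p`/`o` are all assumptions made for illustration, not the authors' design and not the import format of Cayley or MongoDB.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExtendedTriple:
    """One (subject, predicate, object) fact plus free-form extra attributes."""
    subject: str
    predicate: str
    obj: str
    # Hypothetical extension slot, e.g. the source URL of the encyclopedia entry.
    attrs: dict = field(default_factory=dict)

    def to_line(self) -> str:
        """Render the core triple as one quoted text line (assumed format)."""
        def quote(term: str) -> str:
            # Escape backslashes first, then double quotes, so terms stay unambiguous.
            return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'
        return f"{quote(self.subject)} {quote(self.predicate)} {quote(self.obj)} ."

    def to_document(self) -> dict:
        """Render the full record, including extra attributes, as a plain dict."""
        return {
            "s": self.subject,
            "p": self.predicate,
            "o": self.obj,
            "attrs": dict(self.attrs),
        }


def dump_triples(triples):
    """Split records into lines for a graph store and dicts for a document store."""
    graph_lines = [t.to_line() for t in triples]
    documents = [t.to_document() for t in triples]
    return graph_lines, documents


if __name__ == "__main__":
    # Illustrative fact only; not taken from the system's data.
    sample = ExtendedTriple("同济大学", "位于", "上海", {"source": "example"})
    lines, docs = dump_triples([sample])
    print(lines[0])
    print(json.dumps(docs[0], ensure_ascii=False))
```

Splitting the output this way keeps the graph side limited to the bare relation, while the extra attributes travel with the document copy; whether the described system divides the data along these lines is not stated in the abstract.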
This article is indexed in CNKI and other databases.