
Semantic-driven learning and classification method of judicial documents

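Cite this article:	MA Jiangang, MA Yinglong. Semantic-driven learning and classification method of judicial documents[J]. Journal of Computer Applications, 2019, 39(6): 1696-1700.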

Authors:	MA Jiangang, MA Yinglong

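Affiliation:	Law School, Renmin University of China, Beijing 100872; National Prosecutors College, Beijing 102206; The People's Procuratorate of Henan Province, Zhengzhou 450004; School of Control and Computer Engineering, North China Electric Power University, Beijing 102206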

Funding:	National Key Research and Development Program of China (2018YFC0831404, 2018YFC0830605); China Postdoctoral Science Foundation (2016M591317).

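Abstract:	Efficient classification of judicial documents, built on massive collections of such documents, supports current intelligent judicial applications such as similar-case recommendation, document retrieval, judgment prediction and sentencing assistance. Text classification methods designed for general domains do not account for the complex structure and knowledge semantics of judicial texts, so they perform poorly on judicial text classification. To address this problem, a semantic-driven method for learning and classifying judicial documents is proposed. First, a domain knowledge model for the judicial domain is proposed and constructed to express document-level semantics clearly; then, domain knowledge is extracted from judicial documents based on this model; finally, a Graph Long Short-Term Memory (Graph LSTM) model is used to train on and classify the judicial documents. Experimental results show that the method clearly outperforms commonly used methods such as the Long Short-Term Memory (LSTM) model, multinomial logistic regression and support vector machines in accuracy and recall.
Keywords:	judicial big data; domain knowledge model; text classification; smart procuratorate; Graph Long Short-Term Memory (Graph LSTM) model
Received:	2018-11-15
Revised:	2019-01-03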
Semantic-driven learning and classification method of judicial documents

MA Jiangang, MA Yinglong. Semantic-driven learning and classification method of judicial documents[J]. Journal of Computer Applications, 2019, 39(6): 1696-1700.

Authors:	MA Jiangang, MA Yinglong

Affiliation:	1. Law School, Renmin University of China, Beijing 100872, China; 2. National Prosecutors College of P. R. C., Beijing 102206, China; 3. The People's Procuratorate of Henan Province, Zhengzhou Henan 450004, China; 4. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China

Abstract:	Efficient document classification techniques based on large-scale judicial documents are crucial to current intelligent judicial applications, such as similar case pushing, legal document retrieval, judgment prediction and sentencing assistance. General-domain document classification methods perform poorly on judicial documents because they do not consider the complex structure and knowledge semantics of these documents. To solve this problem, a semantic-driven method was proposed to learn and classify judicial documents. Firstly, a domain knowledge model oriented to the judicial domain was proposed and constructed to express document-level semantics clearly. Then, domain knowledge was extracted from the judicial documents based on the model. Finally, a Graph Long Short-Term Memory (Graph LSTM) model was trained on the judicial documents and used to classify them. The experimental results show that the proposed method is superior to the Long Short-Term Memory (LSTM) model, Multinomial Logistic Regression (MLR) and Support Vector Machine (SVM) in accuracy and recall.

Keywords:	judicial big data; domain knowledge model; text categorization; smart procuratorate; Graph Long Short-Term Memory (Graph LSTM) model
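
The abstract compares methods by accuracy and recall on a multi-class document classification task. As a rough illustration of what those two numbers measure, the sketch below computes them from lists of true and predicted labels. It is not taken from the paper: the function names, the class labels ("theft", "fraud", "assault") and the choice of macro averaging for recall are all assumptions made for this example, and the abstract does not state how the authors averaged recall across classes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class: correctly predicted documents / all documents of that class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)  # number of documents per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (macro averaging is an assumption here)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical labels for six documents; not data from the paper.
y_true = ["theft", "theft", "fraud", "fraud", "assault", "assault"]
y_pred = ["theft", "fraud", "fraud", "fraud", "assault", "theft"]

print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
print(macro_recall(y_true, y_pred))
```

Macro averaging weights every class equally, which matters when some case types are much rarer than others; a support-weighted average would instead track overall accuracy more closely.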
This article is indexed in databases including VIP and Wanfang Data.