
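Semantic-driven learning and classification method of judicial documents
Cite this article: MA Jiangang, MA Yinglong. Semantic-driven learning and classification method of judicial documents[J]. Journal of Computer Applications, 2019, 39(6): 1696-1700.
Authors: MA Jiangang  MA Yinglong
Affiliations: Law School, Renmin University of China, Beijing 100872; National Prosecutors College, Beijing 102206; The People's Procuratorate of Henan Province, Zhengzhou 450004; School of Control and Computer Engineering, North China Electric Power University, Beijing 102206
Funding: National Key Research and Development Program of China (2018YFC0831404, 2018YFC0830605); China Postdoctoral Science Foundation (2016M591317).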
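Abstract: Efficient classification of judicial documents over massive collections of legal instruments supports current intelligent judicial applications such as similar-case recommendation, document retrieval, judgment prediction, and sentencing assistance. General-domain text classification methods perform poorly on judicial text because they ignore its complex structure and knowledge semantics. To address this problem, a semantic-driven method for learning and classifying judicial documents is proposed. First, a domain knowledge model for the judicial domain is proposed and constructed to express document-level semantics clearly; next, domain knowledge is extracted from judicial documents according to this model; finally, a graph long short-term memory (Graph LSTM) model is used to train on and classify the documents. Experimental results show that the method clearly outperforms the commonly used long short-term memory (LSTM) model, multinomial logistic regression, and support vector machines in accuracy and recall.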

Keywords: judicial big data  domain knowledge model  text classification  smart procuratorate  graph long short-term memory model
Received: 2018-11-15
Revised: 2019-01-03

Semantic-driven learning and classification method of judicial documents
MA Jiangang, MA Yinglong. Semantic-driven learning and classification method of judicial documents[J]. Journal of Computer Applications, 2019, 39(6): 1696-1700.
Authors:MA Jiangang  MA Yinglong
Affiliation:1. Law School, Renmin University of China, Beijing 100872, China;2. National Prosecutors College of P. R. C., Beijing 102206, China;3. The People's Procuratorate of Henan Province, Zhengzhou, Henan 450004, China;4. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China
Abstract:Efficient classification of large-scale judicial documents is crucial to current intelligent judicial applications, such as similar-case pushing, legal document retrieval, judgment prediction, and sentencing assistance. General-domain document classification methods perform poorly on judicial text because they do not consider the complex structure and knowledge semantics of judicial documents. To solve this problem, a semantic-driven method is proposed to learn and classify judicial documents. First, a domain knowledge model for the judicial domain is proposed and constructed to express document-level semantics clearly. Then, domain knowledge is extracted from the judicial documents based on this model. Finally, the judicial documents are used to train a Graph Long Short-Term Memory (Graph LSTM) model, which then classifies them. The experimental results show that the proposed method clearly outperforms the Long Short-Term Memory (LSTM) model, Multinomial Logistic Regression (MLR), and Support Vector Machine (SVM) in accuracy and recall.
Keywords:judicial big data  domain knowledge model  text categorization  smart procuratorate  Graph Long Short-Term Memory (Graph LSTM) model  
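
The abstract compares methods by accuracy and recall, but this record does not say how recall was averaged across document classes. As a minimal sketch, assuming macro-averaged recall and using hypothetical class labels (neither is taken from the paper), the two metrics for a multi-class document classifier could be computed as follows:

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (macro averaging is an assumption)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)  # number of true documents per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    return sum(hits[c] / support[c] for c in support) / len(support)


# Hypothetical labels for six documents; not data from the paper.
y_true = ["theft", "theft", "fraud", "fraud", "assault", "assault"]
y_pred = ["theft", "fraud", "fraud", "fraud", "assault", "theft"]

acc = accuracy(y_true, y_pred)
rec = macro_recall(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_recall={rec:.3f}")
```

Macro averaging gives every class equal weight, which matters when some document categories are much rarer than others; a support-weighted average would instead track overall accuracy more closely.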
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).