Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model
Affiliation:1. School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 400074, China;2. Chongqing College of Electronic Engineering, Chongqing 401331, China;1. Department of Construction Management, Louisiana State University, Baton Rouge 70803, USA;2. Department of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge 70803, USA;1. Engineering of Systems and Environment, University of Virginia, Charlottesville, VA 22903, United States;2. University of California Los Angeles, United States;1. AnHui Province Key Laboratory of Special Heavy Load Robot, Anhui University of Technology, Ma’anshan 243002, China;2. Beijing Advanced Innovation Center for Intelligent Robots and Systems, Beijing Institute of Technology, China;1. The Department of Science and Technology Teaching, China University of Political Science and Law, Beijing, China;2. School of Electronics Engineering and Computer Science, Peking University, Beijing, China
Abstract: As an important data source in the field of bridge management, bridge inspection reports contain large-scale, fine-grained data, including information on bridge members and structural defects. However, because automatic information extraction in this field has received little research attention, valuable bridge inspection information has not been fully utilized. In particular, Chinese bridge inspection entities involve domain-specific vocabulary and are frequently nested, so most existing named entity recognition (NER) solutions are not suitable for them. To address this problem, this paper proposes a novel lexicon-augmented, machine reading comprehension-based NER neural model for identifying flat and nested entities in Chinese bridge inspection text. The proposed model takes the bridge inspection text and predefined question queries as input, which enhances contextual feature representation and integrates prior knowledge. The character-level features encoded by the pre-trained BERT model are combined with bigram embeddings and weighted lexicon features to form a context representation. A bidirectional long short-term memory network then extracts sequence features before the spans of named entities are predicted. The proposed model is evaluated on a Chinese bridge inspection named entity corpus. The experimental results show that it outperforms other mainstream NER models on this corpus. The proposed model not only provides a basis for automatic bridge inspection information extraction but also supports downstream tasks such as knowledge graph construction and question answering systems.
Keywords: Bridge inspection; Named entity recognition; Machine reading comprehension; BERT
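The abstract describes pairing the inspection text with one predefined query per entity type and then predicting entity spans. The following is a minimal sketch of that input pairing and of one possible span-decoding rule, not the authors' implementation. The query wording, the `POSITION` label, the function names, the English example tokens, and the nearest-end matching heuristic are all assumptions made for illustration; the start and end flags stand in for the output of a trained model.

```python
from typing import Dict, List, Tuple

# Hypothetical queries, one per entity type (illustrative wording, not from the paper).
QUERIES: Dict[str, str] = {
    "MEMBER": "Find the bridge members mentioned in the text.",
    "POSITION": "Find the positions on bridge members mentioned in the text.",
    "DEFECT": "Find the structural defects mentioned in the text.",
}


def build_mrc_inputs(tokens: List[str], queries: Dict[str, str]) -> List[Tuple[str, str, List[str]]]:
    """Pair the same token sequence with every query: one triple per entity type."""
    return [(label, query, tokens) for label, query in queries.items()]


def decode_spans(start_flags: List[int], end_flags: List[int]) -> List[Tuple[int, int]]:
    """Match each predicted start with the nearest predicted end at or after it.

    Returns inclusive (start, end) index pairs. This matching rule is an
    assumed heuristic, chosen only to keep the sketch short.
    """
    spans = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, len(end_flags)):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


def extract_entities(
    tokens: List[str],
    predictions: Dict[str, Tuple[List[int], List[int]]],
) -> List[Tuple[str, str]]:
    """Decode spans separately for each entity type, so spans of different types may overlap."""
    entities = []
    for label, (starts, ends) in predictions.items():
        for s, e in decode_spans(starts, ends):
            entities.append((label, " ".join(tokens[s : e + 1])))
    return entities


# Hand-written flags standing in for model predictions on a made-up sentence.
tokens = ["main", "girder", "web", "has", "vertical", "cracks"]
predictions = {
    "MEMBER": ([1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0]),
    "POSITION": ([1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]),
    "DEFECT": ([0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1]),
}
mrc_inputs = build_mrc_inputs(tokens, QUERIES)
entities = extract_entities(tokens, predictions)
```

Because each entity type is decoded from its own query, the overlapping spans "main girder" and "main girder web" can both be returned, which is how a query-per-type formulation can accommodate nested entities.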
This article is indexed in ScienceDirect and other databases.