
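A Semi-Supervised Sensitive Data Extraction Algorithm for Human-Computer Interaction Information Incorporating an Attention Mechanism
Cite this article: Mou Shaoxia, Lü Bingcai. A Semi-Supervised Sensitive Data Extraction Algorithm for Human-Computer Interaction Information Incorporating an Attention Mechanism [J]. Computing Technology and Automation (计算技术与自动化), 2023(3): 85-89.
Authors: Mou Shaoxia (牟少霞)  Lü Bingcai (吕冰彩)
Affiliations: (1. University of Perpetual Help (菲律宾永恒大学), Manila 6015, Philippines; 2. Shandong Provincial Education Admissions and Examination Institute (山东省教育招生考试院), Jinan, Shandong 250011, China)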
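Abstract: To improve sensitive data extraction, a semi-supervised method for extracting sensitive data from human-computer interaction information is proposed that incorporates attention mechanisms. Convolution-like (类卷积) attention and human-computer interaction attention are combined to build an interaction-attention bidirectional long short-term memory model with a conditional random field (Bi-LSTM-CRF). The model first converts each sensitive word into a character matrix and encodes that matrix with a Bi-LSTM to obtain a distributed representation of the word's character-level features. A second Bi-LSTM pass over this representation yields hidden states that carry the word's context. The hidden states are then weighted by a convolution-like attention layer and an interaction attention layer, producing a convolution-like attention matrix and an interaction attention matrix, which are concatenated into a two-layer attention matrix. A gated recurrent unit in the interaction attention layer updates the two-layer attention matrix into a new attention matrix, and a fully connected layer reduces its dimension to give the predicted label for each sensitive word, completing the semi-supervised extraction. Experimental results indicate that the method effectively reduces the complexity of sensitive data extraction and achieves a high recall.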

Keywords: attention mechanism  human-computer interaction  semi-supervised  sensitive data extraction  BiLSTM model  CRF model

Semi-Supervised Sensitive Data Extraction Algorithm for Human-Computer Interaction Information Based on Attention Mechanism
Abstract: To improve the extraction of sensitive data, a semi-supervised method for extracting sensitive data from human-computer interaction information is proposed that integrates attention mechanisms. A Bi-LSTM-CRF model is constructed by combining convolution-like attention with human-computer interaction attention. Sensitive words are first transformed into a character matrix. A Bi-LSTM encodes this matrix to obtain a distributed representation of the character-level features of the sensitive words, and a second Bi-LSTM encoding of that representation yields hidden states carrying the context of each sensitive word. Based on these hidden states, the convolution-like attention layer assigns attention weights to all words within a short range of each word, giving the convolution-like attention matrix, and the interaction attention layer assigns attention weights over all sensitive words, giving the interaction attention matrix. The two matrices are concatenated into a two-layer attention matrix, which the gated recurrent unit of the interaction attention layer updates into a new attention matrix. A fully connected layer then reduces the dimension of this matrix to obtain the predicted label for each sensitive word, realizing semi-supervised sensitive data extraction from human-computer interaction information. Experimental results show that the method effectively reduces the complexity of sensitive data extraction and achieves a high recall rate.
Keywords: attention mechanism  human-computer interaction  semi-supervised  sensitive data extraction  BiLSTM model  CRF model
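
The attention step described in the abstract can be illustrated with a minimal sketch. It covers only the two attention layers and their concatenation; the Bi-LSTM encoders, the gated recurrent unit update, the fully connected layer, and the CRF are omitted. The abstract does not give the scoring function or the range of the convolution-like attention, so the scaled dot-product score, the `window` parameter, and every function name below are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(hidden, i, positions):
    """Attention-weighted sum of hidden[j] for j in positions.

    Scores are scaled dot products against hidden[i] (an assumed
    scoring function; the abstract does not specify one).
    """
    dim = len(hidden[i])
    scale = math.sqrt(dim)
    weights = softmax([dot(hidden[i], hidden[j]) / scale for j in positions])
    return [sum(w * hidden[j][d] for w, j in zip(weights, positions))
            for d in range(dim)]


def local_attention(hidden, window=1):
    """Convolution-like attention: each position attends only to
    neighbours within `window` steps (hypothetical parameter)."""
    n = len(hidden)
    return [attend(hidden, i, range(max(0, i - window), min(n, i + window + 1)))
            for i in range(n)]


def global_attention(hidden):
    """Interaction attention: each position attends to all positions."""
    n = len(hidden)
    return [attend(hidden, i, range(n)) for i in range(n)]


def two_layer_attention(hidden, window=1):
    """Concatenate the local and global attention rows feature-wise."""
    local = local_attention(hidden, window)
    glob = global_attention(hidden)
    return [l + g for l, g in zip(local, glob)]


# Toy hidden states standing in for the second Bi-LSTM's output.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
combined = two_layer_attention(hidden, window=1)
```

Each row of `combined` has twice the hidden dimension, since the local and global attention outputs for a position are joined side by side. In the pipeline the abstract describes, this concatenated matrix would then be passed to the gated recurrent unit and the fully connected layer.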