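Beijing Opera character recognition based on attention mechanism with HyperColumn
Cite this article: QIN Jun, LUO Yifan, TIE Jun, ZHENG Lu, LYU Weilong. Beijing Opera character recognition based on attention mechanism with HyperColumn[J]. Journal of Computer Applications (计算机应用), 2021, 41(4): 1027-1034.
Authors: QIN Jun (覃俊)  LUO Yifan (罗一凡)  TIE Jun (帖军)  ZHENG Lu (郑禄)  LYU Weilong (吕伟龙)
Affiliations: 1. College of Computer Science, South-Central University for Nationalities, Wuhan 430074; 2. Hubei Provincial Engineering Research Center for Intelligent Management of Manufacturing Enterprises (South-Central University for Nationalities), Wuhan 430074; 3. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094
Funding: National Natural Science Foundation of China; Major Special Project of Technological Innovation of Hubei Province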
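Abstract (translated from the Chinese): To overcome the difficulty of extracting visual features of Beijing Opera characters and to meet the demand for real-time recognition, a Convolutional Neural Network based on HyperColumn Attention (HCA-CNN) is proposed for fine-grained feature extraction and recognition of Beijing Opera characters. The attention mechanism that the network uses to locate key regions borrows the idea of HyperColumn features, which are used for image segmentation and fine-grained localization: through the HyperColumn set, features of the backbone classification network are concatenated pixel by pixel to form multi-layer stacked features. This better balances early shallow spatial features against late deep class-level semantic features and improves the accuracy of both the localization task and the backbone classification task. The backbone is the lightweight MobileNetV2, which better meets the real-time requirement of video application scenarios. In addition, a Beijing Opera character dataset (BJOR) was created, and ablation experiments were carried out on it. Experimental results show that, compared with the traditional fine-grained Recurrent Attention Convolutional Neural Network (RA-CNN), HCA-CNN improves Accuracy by 0.63 percentage points, reduces Memory Usage and Params by 162.84 MB and 131.5 MB respectively, and reduces multiply-add operations (Mult-Adds) and floating-point operations (FLOPs) by 39 885×10^6 and 51 886×10^6 respectively. HCA-CNN, designed around the visual features of Beijing Opera characters, therefore improves both the accuracy and the efficiency of Beijing Opera character recognition and meets the needs of practical applications.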

Keywords: HyperColumn  attention mechanism  recurrent network  fine-grained  Beijing Opera character recognition
Received: 2020-08-20
Revised: 2020-10-27

Beijing Opera character recognition based on attention mechanism with HyperColumn
QIN Jun, LUO Yifan, TIE Jun, ZHENG Lu, LYU Weilong. Beijing Opera character recognition based on attention mechanism with HyperColumn[J]. Journal of Computer Applications, 2021, 41(4): 1027-1034.
Authors:QIN Jun  LUO Yifan  TIE Jun  ZHENG Lu  LYU Weilong
Affiliation: 1. College of Computer Science, South-Central University for Nationalities, Wuhan Hubei 430074, China; 2. Hubei Provincial Engineering Research Center for Intelligent Management of Manufacturing Enterprises (South-Central University for Nationalities), Wuhan Hubei 430074, China; 3. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing Jiangsu 210094, China
Abstract: To overcome the difficulty of extracting visual features of Beijing Opera characters and to meet the demand for real-time recognition, a Convolutional Neural Network based on HyperColumn Attention (HCA-CNN) was proposed to extract and recognize the fine-grained features of Beijing Opera characters. The idea of HyperColumn features, which are used for image segmentation and fine-grained localization, was applied to the attention mechanism that locates key regions in the network. Through the HyperColumn set, features of the backbone classification network were concatenated pixel by pixel to form multi-layer stacked features, so as to better take into account both the early shallow spatial features and the late deep class-level semantic features, and to improve the accuracy of both the localization task and the backbone classification task. The lightweight MobileNetV2 was adopted as the backbone network, which better met the real-time requirement of video application scenarios. In addition, the BeiJing Opera Role (BJOR) dataset was created, and ablation experiments were carried out on it. Experimental results show that, compared with the traditional fine-grained Recurrent Attention Convolutional Neural Network (RA-CNN), HCA-CNN not only improves accuracy by 0.63 percentage points, but also reduces Memory Usage and Params by 162.84 MB and 131.5 MB respectively, and reduces multiply-add operations (Mult-Adds) and floating-point operations (FLOPs) by 39 885×10^6 and 51 886×10^6 respectively. This verifies that the proposed HCA-CNN can effectively improve the accuracy and efficiency of Beijing Opera character recognition and can meet the requirements of practical applications.
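The pixel-wise concatenation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `upsample_nearest` and `hypercolumns` are hypothetical, nearest-neighbour upsampling is an assumption (the abstract does not say how maps of different resolution are aligned), and the toy feature maps stand in for real backbone activations. It only shows the general HyperColumn idea of stacking, for each pixel, the activations of several layers into one vector.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]


def upsample_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Resize every channel to (out_h, out_w) by nearest-neighbour sampling.

    Assumption for illustration only; a real network might use bilinear
    interpolation or learned upsampling instead.
    """
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [
        [[channel[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
         for y in range(out_h)]
        for channel in fmap
    ]


def hypercolumns(feature_maps: List[FeatureMap], out_h: int, out_w: int) -> List[List[List[float]]]:
    """Return an [out_h][out_w] grid; each cell is the concatenation of all
    layers' channel activations at that pixel (one hypercolumn per pixel)."""
    resized = [upsample_nearest(f, out_h, out_w) for f in feature_maps]
    return [
        [[channel[y][x] for fmap in resized for channel in fmap]
         for x in range(out_w)]
        for y in range(out_h)
    ]


# Toy stand-ins: a shallow 2-channel 4x4 map and a deep 3-channel 2x2 map.
shallow = [[[float(c * 100 + y * 4 + x) for x in range(4)] for y in range(4)]
           for c in range(2)]
deep = [[[float(1000 + c * 10 + y * 2 + x) for x in range(2)] for y in range(2)]
        for c in range(3)]

columns = hypercolumns([shallow, deep], 4, 4)
print(len(columns), len(columns[0]), len(columns[0][0]))  # grid size and vector length
print(columns[3][3])  # hypercolumn at the bottom-right pixel
```

Each per-pixel vector mixes fine spatial detail from the shallow map with coarser semantic responses from the deep map, which is the property the abstract relies on for locating key regions.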
Keywords:HyperColumn  attention mechanism  recurrent network  fine-grained  Beijing Opera character recognition  
This article is indexed in databases including Wanfang Data (万方数据).