Homography estimation for multimodal coin images


Authors:	DENG Zhuang-lin, ZHANG Shao-bing, CHENG Miao, HE Lian

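Affiliations:	1. Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu, Sichuan 610041, China; 2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; 3. Shenzhen CBPM-KEXIN Banking Technology Company Limited, Shenzhen, Guangdong 518206, China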

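Abstract:	Registering coin images captured under different imaging conditions is a prerequisite for coin surface defect detection algorithms. However, traditional multimodal registration methods based on mutual information are slow and inaccurate, and existing registration methods that rely on deep-learning-based homography estimation apply only to single-modality tasks. To address this, a deep-learning-based homography estimation method for multimodal coin images is proposed, and the estimated homography is then used to register the images. First, a homography estimation layer predicts the homography between the input image pair, and this homography is used to apply a perspective transformation to the image to be registered. Then, an image translation layer maps the transformed image and the target image into the same domain; this layer can be removed at inference to reduce inference time. Finally, the loss between the two images in the same domain is computed and used for training. Experiments show that the method achieves an average distance error of 3.417 pixels on the test set, 38.71% lower than the 5.575 pixels of the traditional mutual-information-based multimodal registration method. Registering a single image pair takes 17.74 ms, far less than the 6 368.49 ms required by the traditional mutual-information-based method.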
Keywords:	homography; image registration; coin; image translation; multimodality
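
The reported improvement figures can be checked with a few lines of arithmetic. The sketch below uses only the four numbers quoted in the abstract; the variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract (pixels and milliseconds).
proposed_error_px = 3.417      # average distance error, proposed method
baseline_error_px = 5.575      # average distance error, mutual-information baseline
proposed_time_ms = 17.74       # time to register one image pair, proposed method
baseline_time_ms = 6368.49     # time to register one image pair, baseline

# Relative reduction in average distance error, as a percentage.
error_reduction_pct = (baseline_error_px - proposed_error_px) / baseline_error_px * 100

# Ratio of baseline runtime to proposed runtime.
speedup = baseline_time_ms / proposed_time_ms

print(f"error reduction: {error_reduction_pct:.2f}%")
print(f"speedup: {speedup:.0f}x")
```

The error reduction reproduces the 38.71% stated in the abstract, and the runtime ratio works out to roughly 359 times.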
Homography estimation for multimodal coin images

Authors:	DENG Zhuang-lin, ZHANG Shao-bing, CHENG Miao, HE Lian

Affiliation:	1. Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu, Sichuan 610041, China; 2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; 3. Shenzhen CBPM-KEXIN Banking Technology Company Limited, Shenzhen, Guangdong 518206, China

Abstract:	Registration of coin images captured under different illumination conditions is a prerequisite for coin surface defect detection. However, the traditional multimodal registration method based on mutual information is slow and inaccurate, and existing image registration methods that rely on deep-learning-based homography estimation work only for single-modality tasks. This paper proposes a deep-learning-based homography estimation method for multimodal coin images; the estimated homography is then used to register the images. First, a homography estimation layer estimates the homography between the pair of input images, and this homography is used to apply a perspective transformation to the image to be registered. Then, an image translation layer translates the pair of images into the same domain; this layer can be removed at inference to reduce inference time. Finally, the network is trained with a loss computed on the pair of images in the same domain. Experiments show that the average distance error of the proposed method on the test set is 3.417 pixels, which is 38.71% lower than the 5.575 pixels of the traditional multimodal registration method based on mutual information. The inference time of the proposed method is 17.74 ms per image pair, far less than the 6368.49 ms of the traditional mutual-information-based method.

Keywords:	homography; image registration; coin; image-to-image translation; multimodality
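
To make the evaluation metric concrete, here is a minimal sketch of how an average distance error between two homographies might be computed. The abstract does not define the metric, so this assumes it is the mean Euclidean distance between reference points mapped by the predicted homography and by a reference homography. The function names, the sample points, and the two matrices are hypothetical.

```python
import math

def apply_homography(H, point):
    """Map a pixel (x, y) through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)

def mean_distance_error(H_pred, H_ref, points):
    """Mean Euclidean distance (pixels) between points mapped by two homographies.

    Assumed definition; the paper's exact metric may differ.
    """
    total = 0.0
    for p in points:
        px, py = apply_homography(H_pred, p)
        rx, ry = apply_homography(H_ref, p)
        total += math.hypot(px - rx, py - ry)
    return total / len(points)

# Hypothetical example: corners of a 256x256 image, an identity reference,
# and a prediction that is off by a pure translation of (3, 4) pixels.
corners = [(0, 0), (255, 0), (255, 255), (0, 255)]
H_ref = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
H_pred = [[1, 0, 3], [0, 1, 4], [0, 0, 1]]

error = mean_distance_error(H_pred, H_ref, corners)
print(error)  # every corner is displaced by (3, 4), so the mean distance is 5.0
```

Under this assumed definition, a lower value means the predicted homography moves the reference points closer to where the reference homography places them.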
