Progressive Erasing Network with consistency loss for fine-grained visual classification |
| |
Affiliation: | 1. DATA61, CSIRO, Sydney, NSW 2122, Australia; 2. School of Electrical and Data Engineering, University of Technology Sydney, Sydney, NSW 2007, Australia; 1. School of Software Technology, Dalian University of Technology, Dalian, China; 2. International School of Information Science & Engineering, Dalian University of Technology, China |
| |
Abstract: | Fine-grained Visual Categorization (FGVC) aims to recognize images belonging to multiple subordinate categories of a super-category. The difficulty of FGVC lies in the high similarity between classes and the large variation within each class. Most existing networks focus on only a few discriminative regions and ignore many subtle complementary features. To address this, we propose a Progressive Erasing Network (PEN). In PEN, a Multi-Grid Erasure mechanism augments the data samples and helps capture local discriminative features by indirectly destroying the overall structure of the image through pixel-wise erasure. Cross-layer feature aggregation, which extracts salient class features, is of great significance in FGVC, but the representational capability of cross-layer features under a simple aggregation strategy remains limited. To this end, the proposed Consistency loss explores cross-layer semantic affinity and guides the Cross-Layer Incentive (CLI) block to mine more efficient feature representations at different granularities. We also combine Cross Entropy and Complementary Entropy to take the distribution of negative classes into account for better classification performance. Our method is trained end-to-end with only classification labels. Experimental results show that our model outperforms the state of the art on three fine-grained benchmarks. |
| |
Keywords: | FGVC; PEN; Multi-grid erasure mechanism; Cross-layer incentive block; Consistency loss |
This article is indexed in ScienceDirect and other databases. |
|