49 Papers from PCL Accepted to CVPR 2020
Date: 2020-06-19 Source:PCL
The 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), one of the world’s top academic conferences in the field of computer vision, concluded on June 19. This year, a total of 1,467 papers were accepted from a record-high 5,865 valid submissions. The 25 percent acceptance rate is on par with CVPR 2019.
A total of 49 papers submitted by PCL researchers were accepted by CVPR 2020, an increase of 14 over last year. Of these, 41 were contributed by the Research Center for Artificial Intelligence, accounting for 84% of the total. According to preliminary statistics, among scientific research institutions and universities in China, the top three contributors are the universities of the Chinese Academy of Sciences, Tsinghua University, and Peking University. The number of papers contributed by PCL is second only to Peking University, ranking fourth. Chinese scholars contributed significantly to CVPR 2020. Baixin Shi and Rongrong Ji, dual-appointed researchers at PCL’s Research Center for AI, ranked 4th and 5th among Chinese authors, contributing 12 and 11 papers respectively.
The oral paper from Rongrong Ji’s team, "HRank: Filter Pruning using High-Rank Feature Map", proposes a novel filter pruning method that exploits the high rank of feature maps (HRank). Compared with existing property-importance-based methods, HRank also delivers significant gains in acceleration and compression, reducing both floating-point operations and parameters without introducing any additional constraints.
It eliminates the need for additional auxiliary constraints or model retraining, thus reducing pruning complexity. HRank is inspired by the discovery that the average rank of the feature maps generated by a single filter stays the same regardless of the number of image batches the CNN receives. As a result, the rank of the feature maps in a CNN can be estimated accurately and efficiently from a small set of images in the dataset. There are three main contributions:
(1) Experiments show that the average rank of multiple feature maps generated by a single filter is almost the same;
(2) Based on HRank, the researchers develop a method that is mathematically formulated to prune filters with low-rank feature maps. The principle behind their pruning is that low-rank feature maps contain less information, and thus pruned results can be easily reproduced;
(3) The paper demonstrates the efficiency and effectiveness of HRank in model compression and acceleration.
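The rank-based selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature maps, and the fixed number of filters to keep are all assumptions made for illustration, and the rank is computed with a small exact Gaussian elimination so the example needs only the Python standard library (in practice a routine such as NumPy's `numpy.linalg.matrix_rank` would more likely be used).

```python
from fractions import Fraction


def matrix_rank(matrix):
    """Rank of a small 2D matrix via Gaussian elimination on exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below position `rank` with a nonzero entry here.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def average_ranks(feature_maps):
    """feature_maps[i][j] is the 2D feature map produced by filter j for image i."""
    n_images = len(feature_maps)
    n_filters = len(feature_maps[0])
    return [
        sum(matrix_rank(feature_maps[i][j]) for i in range(n_images)) / n_images
        for j in range(n_filters)
    ]


def filters_to_keep(avg_ranks, n_keep):
    """Indices of the n_keep filters with the highest average rank, in layer order."""
    ranked = sorted(range(len(avg_ranks)), key=lambda j: avg_ranks[j], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical toy data: 2 input images, 3 filters, 2x2 feature maps.
toy_maps = [
    [[[1, 0], [0, 1]], [[1, 2], [2, 4]], [[1, 2], [3, 4]]],  # image 0
    [[[2, 1], [1, 3]], [[3, 3], [1, 1]], [[0, 0], [0, 0]]],  # image 1
]
avg = average_ranks(toy_maps)
kept = filters_to_keep(avg, n_keep=1)
```

In this toy setup, filter 0 has full-rank feature maps for both images and is the one retained; the filters whose maps are low-rank on average are the candidates for pruning.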
In another oral paper from Rongrong Ji’s team, “Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation”, the authors propose a novel Multi-task Collaborative Network (MCN) that achieves joint learning of referring expression comprehension (REC) and referring expression segmentation (RES) for the first time.
Both tasks show distinct performance gains over state-of-the-art (SOTA) methods while achieving real-time joint detection. REC and RES form an important branch of language-vision tasks, a family that also includes other multimodal tasks. In the past, FRCNN features were highly valued and intuitively believed to perform better. However, evidence shows that they have some disadvantages and perform no better than grid features. Single-stage approaches may become a trend. In addition, there are inextricable relationships between many multimodal tasks. How to find common ground while preserving differences may be a more worthwhile research direction than pre-trained models such as BERT. These two points are the key supporting perspectives of this paper, and may also indicate directions for further development.
In the oral paper “AdderNet: Do we really need multiplications in deep learning?” presented by the team of Baixin Shi, the authors present adder networks (AdderNets), which trade the massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. They develop a special back-propagation approach for AdderNets by investigating the full-precision gradient, and then propose an adaptive learning rate strategy that enhances the training procedure according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers. This work demonstrates the feasibility of using addition instead of multiplication.
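To make the idea of replacing multiplications with additions concrete, here is a minimal sketch. It assumes that the filter response is measured as the negative L1 distance between the filter and each input patch, which is our reading of the approach and may differ in detail from the paper's implementation. The function name `adder_conv2d` and the toy inputs are hypothetical.

```python
def adder_conv2d(image, kernel):
    """Valid-mode, stride-1 filter response using negative L1 distance.

    Only subtraction, absolute value, and addition are used; no multiplication.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            distance = 0
            for i in range(k_h):
                for j in range(k_w):
                    # Accumulate |patch - kernel| instead of patch * kernel.
                    distance += abs(image[top + i][left + j] - kernel[i][j])
            # A smaller distance means a better match, so negate it.
            row.append(-distance)
        output.append(row)
    return output


# Hypothetical toy input: a 3x3 image and a 2x2 filter.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 2], [4, 5]]
response = adder_conv2d(image, kernel)
```

The response is highest (zero) where the patch matches the filter exactly and becomes more negative as the patch differs from it, so it can play the role that the multiply-accumulate response plays in an ordinary convolution.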
Due to the worldwide pandemic, this year's CVPR went online. Even though it was a completely virtual conference, CVPR attracted great attention from researchers around the world. It also showed the strong momentum and prospects of computer vision and pattern recognition on a global scale, and gave practitioners the confidence to continue exploring this area. Building on its own strengths, PCL will seize the historic opportunity of the national strategic plan for AI development and contribute more to the field of computer vision and pattern recognition in China.