* Beijing University of Posts and Telecommunications (BUPT)
COCO 2018 DensePose Test

Method                 AP   AP50  AP75  APm  APl
BUPT-PRIV (Ours)       64   92    75    57   67
PlumSix                58   89    66    50   61
ML_Lab                 57   89    64    51   59
Sound of silent        57   87    66    48   61
DensePose ResNeXt101   56   89    64    51   59

[Diagram: two pipelines for instance-level tasks. Bottom-up: Backbone, then Keypoints / Dense Pose / Person Part, then Grouping. Top-down: Backbone, then Instance B-Box, then instance-level task.]

[Diagram: Mask R-CNN. Backbone, RoI Align, and three heads: Box head, Mask head, Keypoint head.]
The outline of the human body is not good enough.
Dense Pose Estimation. Person Part Segmentation. Can Mask R-CNN handle the problem of person instance analysis well?
[Diagram: Mask R-CNN (Backbone, RoI Align, Box head, Mask head, Keypoint head) next to Parsing R-CNN, whose MR-Parsing Head is used as a Parsing head or a DensePose head.]
Use P2 for parsing: large-scale instances need more detailed information.

FPN levels used      AP
P2-P6 (baseline)     48.9
P2-only              48.5 (-0.4)
P2 for parsing       49.8 (+0.9)

Use P2-P6 for the RPN, and only P2 for the MR-Parsing Head.
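The P2-only choice can be contrasted with the usual FPN heuristic, which picks a pyramid level for each RoI from its area. The following is a minimal sketch, assuming the standard FPN rule k = floor(k0 + log2(sqrt(w*h) / 224)) with k0 = 4 and levels clamped to P2-P5. The function names, and `parsing_level` in particular, are hypothetical and not taken from the authors' code.

```python
import math

def fpn_level(w, h, k0=4, canonical=224, k_min=2, k_max=5):
    """Standard FPN heuristic: map an RoI of w x h image pixels to a pyramid level."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))

def parsing_level(w, h):
    """Hypothetical P2-only rule: the parsing head always pools from P2."""
    return 2

roi_w, roi_h = 300, 500  # a large person instance, in image pixels
print("heuristic level: P%d" % fpn_level(roi_w, roi_h))
print("parsing level:   P%d" % parsing_level(roi_w, roi_h))
```

Under the heuristic, a 300*500 instance would be pooled from P4 (stride 16), while P2 (stride 4) keeps four times finer resolution per side for the same instance.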
[Diagram: RoI Align on P2. A 1000*800 image gives a 250*200 P2 feature map; a 300*500 instance covers 75*125 on P2 and is pooled to 14*14.] Lost too many details!
Enlarging the RoI size (head diagram: 512*32*32 feature maps through 8 conv layers)

RoI size                AP
14*14 (original size)   49.8
32*32 (ours)            52.2 (+2.4)
48*48 (ours)            52.8 (+3.0)

Use a 32*32 RoI size; the output size is 128*128.
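The loss of detail at 14*14 can be made concrete with a little arithmetic on the slide's own numbers (a 300*500 instance in a 1000*800 image, pooled from P2 at stride 4). This is an illustrative sketch only; `cells_per_bin` is a hypothetical helper, not part of any RoI Align implementation.

```python
def cells_per_bin(roi_w, roi_h, out_size):
    """P2 feature cells covered by one RoI Align output bin, as (width, height)."""
    return roi_w / out_size, roi_h / out_size

stride = 4                      # P2 stride: 1000*800 image -> 250*200 feature map
inst_w, inst_h = 300, 500       # instance size in image pixels
roi_w, roi_h = inst_w / stride, inst_h / stride  # 75 x 125 on P2

for out in (14, 32, 48):
    bw, bh = cells_per_bin(roi_w, roi_h, out)
    print(f"{out}*{out}: {bw:.2f} x {bh:.2f} P2 cells per output bin")
```

At 14*14 each output bin has to summarize roughly 5 x 9 feature cells, while at 32*32 it covers about 2 x 4, which is consistent with the AP gain reported for the larger RoI size.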
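[Diagram: 512*32*32 feature maps through 8 conv layers.] The receptive field (RF) of the 8 conv layers is 17, which is smaller than the 32*32 RoI size. [Figure from DeepLab V3.]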
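The RF = 17 figure can be reproduced with the usual receptive-field recurrence. This sketch assumes the 8 conv layers are 3*3 with stride 1 and no dilation, which the slide does not state but which matches the quoted value. The dilation rate of 18 in the second call is one of DeepLab V3's default ASPP rates and is used here only for illustration; the rates used in the MR-Parsing Head are not given in the slides.

```python
def receptive_field(layers):
    """Receptive field of a conv stack; layers is a list of (kernel, stride, dilation)."""
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump  # growth contributed by this layer
        jump *= stride                        # spacing between adjacent outputs
    return rf

eight_convs = [(3, 1, 1)] * 8
print(receptive_field(eight_convs))   # 17, smaller than the 32*32 RoI

one_dilated = [(3, 1, 18)]            # a single 3*3 conv with dilation 18 (assumed rate)
print(receptive_field(one_dilated))   # 37, larger than the 32*32 RoI
```

A single dilated branch can already cover the whole RoI, which is one way to read the motivation for replacing the plain conv stack with ASPP.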
Use the ASPP module for parsing

Head                   AP
8 conv layers          52.2
PPM module (PSPNet)    52.4 (+0.2)
ASPP module            52.9 (+0.7)

The slide also shows the values 0.523, 0.526 (+0.3) and 0.529 (+0.6); their column header and row assignment did not survive extraction.

ASPP is from DeepLab V3. Use the ASPP module instead of the 8 conv layers.
Add 4 conv layers after ASPP (head diagram: 512*32*32 feature maps, ASPP)

Head                   AP
ASPP module            52.9
4 conv before ASPP     53.0 (+0.1)
4 conv after ASPP      53.9 (+1.0)
Both                   54.0 (+1.1)

Before ASPP: semantic space transformation? After ASPP: high-level semantic feature?
Use 4 conv layers after the ASPP module.
Smaller test scale and fewer RoIs for speed-up (head diagram: RoI Align output, ASPP, 4 conv layers)

Setting                        AP
ResNet50 (baseline)            48.9
+ MR-Parsing Head              53.9 (+5.0)
+ 700 scale test               54.4 (+5.5)
+ 700 scale & 100 RoIs test    54.3 (+5.4)

A well-designed head is very important for instance-level parsing.
COCO 2018 DensePose Val (bar chart, AP)

ResNet50 (baseline)    48.9
+ MR-Parsing Head      54.3
+ ResNeXt101           59.4
+ COCO pre-train       61.8
+ Test aug             63.3

We get 64.1 mAP on the test set with a single model.
[Figure: DensePose ResNeXt101 vs. Parsing R-CNN (ours).]
What's next?
[Diagram: Parsing R-CNN (Backbone, RoI Align, Box head, MR-Parsing Head), with RoI Align possibly replaced by "RoI Adaptive Align"??]
And One More Thing
We get 74.9 AP on the test set!!!
Index refine: Index * Mask = refined Index. Source: https://github.com/facebookresearch/densepose/blob/master/detectron/core/test.py
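Read as "multiply the predicted part-index map by the foreground mask", the index refine step can be sketched in a few lines. This is an illustration of that reading only: it uses plain nested lists, and `refine_index` is a hypothetical name, not a function from the linked test.py.

```python
def refine_index(index_map, mask_map):
    """Zero out part indices wherever the foreground mask is 0 (element-wise Index * Mask)."""
    return [
        [idx * (1 if m > 0 else 0) for idx, m in zip(idx_row, mask_row)]
        for idx_row, mask_row in zip(index_map, mask_map)
    ]

index = [[3, 3, 7],
         [3, 7, 7]]   # predicted body-part index per pixel
mask = [[0, 1, 1],
        [1, 1, 0]]    # predicted foreground mask per pixel

refined = refine_index(index, mask)
print(refined)  # [[0, 3, 7], [3, 7, 0]]
```

Pixels the mask marks as background end up with index 0, so they no longer count as body-surface predictions.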
COCO 2018 DensePose Val

Method                                AP (GPS)   AP (GPS without Index refine)
DensePose ResNet50                    48.9       61.5
DensePose ResNeXt101                  55.5       67.2
Parsing R-CNN ResNeXt101              61.8       71.7
Parsing R-CNN ResNeXt101 + Test Aug   63.3       73.7

Without index refine, we can get unreasonably high scores.
The proposed new metric GPS^M

$$\mathrm{GPS}^{M}_{j} = \mathrm{GPS}_{j} \cdot \mathrm{IoU}^{mask}_{j} = \mathrm{GPS}_{j} \cdot \frac{|I_{gt} \cap I_{pred}|}{|I_{gt} \cup I_{pred}|}$$

where I is the class-agnostic index.

COCO 2018 DensePose Val

Method                                AP (GPS)   AP (GPS without Index refine)   AP (GPS^M)   AP (GPS^M without Index refine)   (GPS - GPS^M)
DensePose ResNet50                    48.9       61.5                            39.3         0.43                              9.6
DensePose ResNeXt101                  55.5       67.2                            46.1         0.82                              9.4
Parsing R-CNN ResNeXt101              61.8       71.7                            51.7         0.64                              9.9
Parsing R-CNN ResNeXt101 + Test Aug   63.3       73.7                            53.1         0.73                              9.8
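The mask-weighted score, GPS multiplied by the mask IoU, can be sketched as follows. The sketch assumes that the class-agnostic index I is a binary foreground map (any non-zero part index counts as foreground) and that maps are given as flat lists; `mask_iou` and `gps_m` are hypothetical names, and the per-instance GPS value is taken as given rather than computed.

```python
def mask_iou(pred_index, gt_index):
    """IoU of the class-agnostic foreground (non-zero index) of two flat index maps."""
    inter = sum(1 for p, g in zip(pred_index, gt_index) if p > 0 and g > 0)
    union = sum(1 for p, g in zip(pred_index, gt_index) if p > 0 or g > 0)
    return inter / union if union else 0.0

def gps_m(gps, pred_index, gt_index):
    """Mask-weighted score: GPS scaled by the mask IoU of the same instance."""
    return gps * mask_iou(pred_index, gt_index)

pred = [2, 2, 5, 0]   # predicted part indices; 0 is background
gt = [0, 2, 5, 5]     # ground-truth part indices
print(mask_iou(pred, gt))     # 0.5 (2 shared foreground pixels out of 4)
print(gps_m(0.8, pred, gt))   # 0.8 * 0.5
```

Because the IoU term penalizes foreground predicted outside the ground-truth region, a prediction that skips index refine and labels the whole box as body surface gets a low IoU, and so a low GPS^M.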