Deep High-Resolution Representation Learning for Visual Recognition

Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, Bin Xiao

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation...
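To make the "high-to-low in series" pattern described above concrete, here is a minimal sketch that tracks only the spatial size of the feature map through a chain of downsampling stages and reports how much upsampling is needed to recover the input resolution. The function names, the stride of 2, and the stage count are illustrative assumptions, not taken from the paper or its code.

```python
def series_resolutions(height, width, num_stages=5, stride=2):
    """Return the spatial size after each stage of a series encoder.

    Assumption: every stage reduces resolution by `stride` using
    floor division; real networks differ in padding and layout.
    """
    sizes = [(height, width)]
    for _ in range(num_stages):
        height, width = height // stride, width // stride
        sizes.append((height, width))
    return sizes


def recovery_factor(sizes):
    """Upsampling factor needed to return from the last size to the first."""
    (in_h, _), (out_h, _) = sizes[0], sizes[-1]
    return in_h // out_h


if __name__ == "__main__":
    sizes = series_resolutions(224, 224)
    print(sizes)
    print(recovery_factor(sizes))
```

With a 224 x 224 input and five halving stages, the final map is 7 x 7, so the high-resolution representation has to be recovered across a 32x gap in each spatial dimension.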

Benchmarked Models

RANK  MODEL                 CODE RESULT  PAPER RESULT
3     HRNetV2-W64           79.5%        --
6     HRNetV2-W48           79.3%        --
9     HRNetV2-W40           78.9%        --
12    HRNetV2-W44           78.9%        --
15    HRNetV2-W32           78.4%        --
18    HRNetV2-W30           78.2%        --
21    HRNetV2-W18           76.8%        --
22    HRNet-W18-C-Small-V2  75.1%        --
23    HRNet-W18-C-Small-V2  75.1%        --
24    HRNet-W18 Small V2    75.1%        --
25    HRNet-W18-C-Small-V1  72.3%        --
26    HRNet-W18-C-Small-V1  72.3%        --
27    HRNet-W18 Small V1    72.3%        --