Paper Reading: Regional Multi-Person Pose Estimation

Note: this post is only meant for personal digestion and interpretation. It is incomplete and may mislead readers.


Goal: facilitate pose estimation in the presence of inaccurate human bounding boxes, using three components:

  • Symmetric Spatial Transformer Network (SSTN)
  • Parametric Pose Non-Maximum Suppression (NMS)
  • Pose-Guided Proposals Generator (PGPG)


  • two-step framework
  • part-based framework


Parametric Pose NMS

The most confident pose is selected as the reference, and poses close to it are eliminated by applying an elimination criterion. The criterion is based on pose similarity, so that poses which are too close and too similar to each other are removed.
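A minimal sketch of this greedy selection, to make the control flow concrete. It is not the paper's parametric criterion: as a stand-in, the similarity is the mean Euclidean distance between corresponding keypoints, and `pose_distance`, `pose_nms`, and `threshold` are hypothetical names chosen for illustration. Repeating the selection on the remaining poses is assumed here, as in standard greedy NMS.

```python
import math


def pose_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding keypoints.

    Assumption: a simple stand-in for the pose similarity measure;
    each pose is a list of (x, y) keypoints in the same joint order.
    """
    return sum(math.dist(p, q) for p, q in zip(pose_a, pose_b)) / len(pose_a)


def pose_nms(poses, scores, threshold):
    """Return indices of the poses kept after greedy pose NMS.

    The most confident remaining pose becomes the reference; every other
    pose whose distance to it is at most `threshold` is eliminated.
    """
    # Candidate indices, most confident first.
    remaining = sorted(range(len(poses)), key=lambda i: scores[i], reverse=True)
    kept = []
    while remaining:
        ref = remaining.pop(0)
        kept.append(ref)
        # Keep only the poses that are far enough from the reference.
        remaining = [
            i for i in remaining
            if pose_distance(poses[ref], poses[i]) > threshold
        ]
    return kept


poses = [
    [(0, 0), (10, 10)],        # confident detection
    [(1, 0), (10, 11)],        # near-duplicate of the first pose
    [(100, 100), (110, 110)],  # a different person
]
scores = [0.9, 0.8, 0.7]
kept = pose_nms(poses, scores, threshold=5.0)
```

In this toy input the second pose lies within the threshold of the first and is dropped, while the third is far away and survives as its own reference.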

Pose-Guided Proposals Generator

  1. Learn the atomic poses:
    1. Align all poses so that their torsos have the same length.
    2. Cluster the aligned poses with the k-means algorithm.
    3. Take the computed cluster centers as the atomic poses.
  2. For each person instance sharing the same atomic pose a, calculate the offsets between its ground truth bounding box and its detected bounding box.
  3. Normalize the offsets by the corresponding side length of the ground truth bounding box in that direction.
  4. The normalized offsets form a frequency distribution, which is then fitted with a Gaussian mixture distribution.
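The steps above can be partly sketched in plain Python. This is a simplified illustration under several assumptions, not the paper's implementation: boxes are `(xmin, ymin, xmax, ymax)` tuples, the torso is taken as the segment between one shoulder keypoint and one hip keypoint, the k-means clustering step is omitted, and a single Gaussian per coordinate replaces the Gaussian mixture. All function names are hypothetical, and `sample_proposal` only illustrates how a fitted distribution could be used to draw a perturbed box.

```python
import math
import random


def align_by_torso(pose, shoulder_idx, hip_idx):
    """Scale a pose so its torso has length 1.

    Assumption: the pose is also translated so the hip sits at the origin.
    """
    sx, sy = pose[shoulder_idx]
    hx, hy = pose[hip_idx]
    torso = math.hypot(sx - hx, sy - hy)
    return [((x - hx) / torso, (y - hy) / torso) for x, y in pose]


def normalized_offsets(gt_box, det_box):
    """Offsets of the detected box from the ground truth box, each divided
    by the ground truth side length in that direction."""
    w = gt_box[2] - gt_box[0]
    h = gt_box[3] - gt_box[1]
    return (
        (det_box[0] - gt_box[0]) / w,
        (det_box[1] - gt_box[1]) / h,
        (det_box[2] - gt_box[2]) / w,
        (det_box[3] - gt_box[3]) / h,
    )


def fit_gaussian(samples):
    """Per-coordinate mean and standard deviation.

    Simplification: one Gaussian per coordinate, not a mixture.
    """
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((s[d] - means[d]) ** 2 for s in samples) / n)
        for d in range(dims)
    ]
    return means, stds


def sample_proposal(gt_box, means, stds, rng=random):
    """Draw normalized offsets and apply them to a ground truth box."""
    w = gt_box[2] - gt_box[0]
    h = gt_box[3] - gt_box[1]
    dx1, dy1, dx2, dy2 = (rng.gauss(m, s) for m, s in zip(means, stds))
    return (
        gt_box[0] + dx1 * w,
        gt_box[1] + dy1 * h,
        gt_box[2] + dx2 * w,
        gt_box[3] + dy2 * h,
    )


# Toy data: offsets collected for instances sharing one atomic pose.
pairs = [
    ((0, 0, 10, 20), (1, 2, 9, 22)),
    ((0, 0, 10, 20), (-1, 0, 11, 18)),
]
offsets = [normalized_offsets(gt, det) for gt, det in pairs]
means, stds = fit_gaussian(offsets)
proposal = sample_proposal((50, 50, 90, 130), means, stds, random.Random(0))
```

In a fuller version, one distribution would be fitted per atomic pose, so that the sampled box perturbation depends on the pose of the person inside the box.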

Author: Texot
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If reproduced, please credit the source, Texot.