3D Human Pose Estimation via Spatio-Temporal Matching from Monocular RGB Images

Jielu YAN, Ming Liang ZHOU*, Bin FANG, Ke XU

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

1 Citation (Scopus)

Abstract

Three-dimensional (3D) human pose estimation aims to locate the 3D keypoints of individuals in input RGB images. In two-dimensional (2D) human pose estimation, the majority of methods infer 2D poses from 2D heatmaps. However, this approach is hard to extend to 3D pose inference, because estimating 3D heatmaps sharply increases the computational load. To address this problem, we propose the STM-CNN method, which estimates a reconstruction coefficient matrix and uses it to calculate the final 3D pose, rather than estimating 3D heatmaps, thereby decreasing the computational load. First, STM-CNN performs a preprocessing step to calculate a set of shape and weight bases. Second, the STM-CNN architecture infers a 2D matrix called the reconstruction coefficient matrix. Third, STM-CNN combines the precomputed shape and weight bases with the estimated reconstruction coefficient matrix to calculate the final 3D pose. STM-CNN achieves better performance than state-of-the-art methods on Human3.6M.
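
The abstract does not state the reconstruction formula, so the following is only a minimal sketch of the third step under an assumed bilinear model: a temporal basis and a shape basis (the keywords list "shape base" and "time base") are fixed in advance, and a small 2D coefficient matrix is the only quantity that has to be estimated. The function name, the array shapes, the random orthonormal bases, and the 64-voxel heatmap resolution are all hypothetical and are not taken from the paper.

```python
import numpy as np


def reconstruct_poses(time_basis, coeff, shape_basis):
    """Assumed bilinear reconstruction (not the paper's confirmed formula).

    time_basis:  (T, Kt)  temporal basis, one row per frame
    coeff:       (Kt, Ks) reconstruction coefficient matrix
    shape_basis: (Ks, 3J) shape basis, one row per basis pose
    Returns an array of shape (T, J, 3) with 3D joint positions.
    """
    flat = time_basis @ coeff @ shape_basis  # (T, 3J)
    return flat.reshape(flat.shape[0], -1, 3)


rng = np.random.default_rng(0)
T, J, Kt, Ks = 16, 17, 4, 8  # hypothetical sizes

# Random orthonormal bases stand in for the precomputed ones.
time_basis = np.linalg.qr(rng.standard_normal((T, Kt)))[0]           # (T, Kt)
shape_basis = np.linalg.qr(rng.standard_normal((3 * J, Ks)))[0].T    # (Ks, 3J)

# In the paper this matrix would be the network output; here it is random.
coeff = rng.standard_normal((Kt, Ks))

poses = reconstruct_poses(time_basis, coeff, shape_basis)

# With orthonormal bases, projecting the poses back recovers the coefficients.
recovered = time_basis.T @ poses.reshape(T, -1) @ shape_basis.T

# Output-size comparison against a hypothetical 64^3 heatmap per joint and frame.
heatmap_values = T * J * 64 ** 3
print(poses.shape, coeff.size, heatmap_values)
```

Under these assumptions the network would only need to regress Kt x Ks numbers per clip, whereas a volumetric heatmap grows with the cube of its resolution for every joint and frame, which is the computational argument the abstract makes.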

Original language: English
Article number: 2255017
Journal: International Journal of Pattern Recognition and Artificial Intelligence
Volume: 36
Issue number: 12
Early online date: 19 Aug 2022
Publication status: Published - 30 Sept 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 World Scientific Publishing Company.

Keywords

  • 3D human pose estimation
  • convolution neural network
  • shape base
  • time base
