Bootstrapping Human Optical Flow and Pose

Overview

Abstract:

We propose a bootstrapping framework that improves human optical flow and pose estimation. We show that, for videos of humans in scenes, considering the two tasks jointly improves both the optical flow and the pose estimates for the humans. We enhance the optical flow estimates by fine-tuning them to fit the human pose estimates, and vice versa. In more detail, we optimize the pose and optical flow networks at inference time so that they agree with each other. This yields state-of-the-art results on the Human 3.6M and 3D Poses in the Wild datasets, as well as on a human-related subset of the Sintel dataset, in terms of both pose estimation accuracy and optical flow accuracy at human joint locations.
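To make the idea of agreement between flow and pose concrete, the following is a minimal sketch of one way such a disagreement could be measured. It is an illustration under stated assumptions, not the loss used in the paper. It assumes 2D joint locations in pixel coordinates for two consecutive frames and a dense flow field stored as an (H, W, 2) array of (dx, dy) vectors. The function names `sample_flow_bilinear` and `flow_pose_disagreement` are hypothetical.

```python
import numpy as np


def sample_flow_bilinear(flow, points):
    """Bilinearly sample a dense flow field at sub-pixel locations.

    flow:   (H, W, 2) array, flow[y, x] = (dx, dy) in pixels (assumed layout).
    points: (J, 2) array of (x, y) pixel coordinates.
    Returns a (J, 2) array of flow vectors.
    """
    h, w, _ = flow.shape
    # Clamp to the image so joints slightly outside the frame stay valid.
    x = np.clip(points[:, 0], 0, w - 1)
    y = np.clip(points[:, 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = (1 - wx) * flow[y0, x0] + wx * flow[y0, x1]
    bottom = (1 - wx) * flow[y1, x0] + wx * flow[y1, x1]
    return (1 - wy) * top + wy * bottom


def flow_pose_disagreement(flow, joints_t, joints_t1):
    """Mean Euclidean gap between the flow at each joint in frame t and
    the displacement of that joint implied by the two pose estimates."""
    flow_at_joints = sample_flow_bilinear(flow, joints_t)
    pose_displacement = joints_t1 - joints_t
    gaps = np.linalg.norm(flow_at_joints - pose_displacement, axis=1)
    return float(np.mean(gaps))
```

In an inference-time optimization of the kind described above, a differentiable version of such a quantity could serve as the objective that both networks are fine-tuned to reduce. How the paper handles 3D poses, occluded joints, and confidence weighting is not covered by this sketch.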

Additional Qualitative Highlights

Qualitative Highlights of Optical Flow:

We demonstrate how our method improves optical flow estimates by bootstrapping them with human pose estimation.

Qualitative Highlights of Human Pose:

We further show how our method improves human pose estimation with the help of optical flow.