
Articulated Human Pose Estimation in Unconstrained Images and Videos

dc.contributor.advisor: Gall, Jürgen
dc.contributor.author: Iqbal, Umar
dc.date.accessioned: 2020-04-25T14:07:23Z
dc.date.available: 2020-04-25T14:07:23Z
dc.date.issued: 12.12.2018
dc.identifier.uri: https://hdl.handle.net/20.500.11811/7685
dc.description.abstract: Understanding the articulated pose of the human body is of great interest in many scenarios. While humans have an unmatched ability to effortlessly extract and interpret such information in any unconstrained environment, developing computational methods with similar capabilities is a very challenging task. Such methods have to handle scenes with complex backgrounds, an unknown number of potentially occluded and truncated people, large-scale variations, diverse lighting conditions, and the vast amount of appearance variation due to complex body articulations and clothing. The noise introduced by lossy sensing modalities complicates the problem even further. While there has been a lot of work on human pose estimation in constrained environments, very few works in the literature have addressed these challenges. Further, the estimation of the articulated pose of small functional body parts such as hands has often been ignored in existing works. To this end, this thesis addresses the aforementioned challenges and presents efficient and robust computational methods for 2D and 3D articulated human body and hand pose estimation in unconstrained real-world scenarios.
First, we address the problem of 2D multi-person body pose estimation. We present an efficient approach that estimates the poses of people in groups or crowds. We demonstrate that the problem can be formulated as a set of local joint-to-person association problems, which can be solved efficiently for each person in the image while also handling occlusions and truncations.
Second, we introduce the challenging case of simultaneous multi-person pose estimation and tracking in videos. Approaches for multi-person pose estimation in images cannot be applied directly to this problem, since it also requires solving person associations over time. To this end, we propose a novel method that jointly models both problems in a single formulation using a spatio-temporal graph. Optimizing the graph using integer linear programming directly provides plausible body pose trajectories for each person. The proposed method does not make any assumptions and performs pose estimation and tracking in fully unconstrained videos. We also present a large-scale dataset and a thorough evaluation protocol to evaluate the developed methods quantitatively. Further, we provide an extensive analysis of the performance of state-of-the-art methods and highlight their strengths and weaknesses.
Given the estimated, possibly noisy, 2D pose trajectory of a person, the third direction of this thesis focuses on refining the pose trajectory by exploiting information about human activities. We present an action-conditioned pictorial structure model that predicts and incorporates activity information for body pose refinement.
The fourth direction of this thesis concerns 3D human pose estimation from single images. Given the estimated 2D pose of a person, we present an approach to lift the 2D pose to 3D by using an efficient and robust method for 3D pose retrieval and reconstruction. Unlike existing works, the proposed approach does not require any training images with annotated 3D poses. Since we can estimate 2D poses from any unconstrained image, the proposed method can also reconstruct 3D poses in any unconstrained scenario.
The final part of the thesis concerns the estimation of 3D hand pose from an RGB input. We present a novel 2.5D pose representation that can be estimated reliably from an RGB image and allows the absolute 3D pose of the hand to be reconstructed using a novel 3D reconstruction approach. The proposed method can handle severe occlusions, complex hand articulations, and unconstrained images taken in the wild.
dc.language.iso: eng
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: articulated pose estimation
dc.subject: multi-person pose tracking
dc.subject: human body pose
dc.subject: hand pose
dc.subject: 2D to 3D
dc.subject: 3D reconstruction
dc.subject.ddc: 004 Computer science
dc.title: Articulated Human Pose Estimation in Unconstrained Images and Videos
dc.type: Dissertation or Habilitation
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5n-52928
ulbbn.pubtype: First publication
ulbbn.birthname: Umar Iqbal
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 5292
ulbbnediss.date.accepted: 30.11.2018
ulbbnediss.institute: Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaet: Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee: Lepetit, Vincent

