3D Hand Pose Estimation from Single RGB Images with Auxiliary Information

Yang, Linlin

dc.contributor.advisor	Yao, Angela
dc.contributor.author	Yang, Linlin
dc.date.accessioned	2022-12-14T10:48:03Z
dc.date.available	2022-12-14T10:48:03Z
dc.date.issued	14.12.2022
dc.identifier.uri	https://hdl.handle.net/20.500.11811/10521
dc.description.abstract	3D hand pose estimation from monocular RGB inputs is critical for augmented and virtual reality applications and has made remarkable progress thanks to deep learning. Existing deep-learning-based hand pose estimation systems aim to learn good representations of hand poses, which requires a large amount of accurate ground-truth labels that are difficult to obtain. This dissertation explores different kinds of auxiliary information, namely image factors, multi-modal data, and synthetic data, to aid representation learning and reduce the reliance on data annotation for 3D hand pose estimation. Motivated by the fact that image rendering depends on a number of factors of variation, we propose to learn disentangled representations to better analyze these factors. The disentangled representations enable explicit control over different factors of variation when synthesizing hand images, and allow training with hand factors as weak labels for hand pose estimation. Beyond labelled or shared hand factors, different modalities (e.g., RGB images and depth maps) of the same hand should share information. Therefore, we use additional modalities as auxiliary information for RGB inputs. Specifically, we explore multi-modal alignment in three ways: latent space alignment based on a variational autoencoder and a product of Gaussian experts, pixel-level alignment via attention fusion, and low-dimensional subspace alignment via contrastive learning. Besides multi-modal alignment, the auxiliary modalities can also serve as weak labels for hand pose estimation. To further remove the need for image factors or additional modalities, we emphasize the importance of synthetic data. Synthetic data is flexible, effectively unlimited, and easy to obtain. With synthetic data as auxiliary information, we can significantly reduce the amount of labelled real-world data required.
Therefore, we introduce a challenging scenario in which the model learns only from labelled synthetic data and fully unlabelled real-world data. To address this scenario, we present a semi-supervised framework with pseudo-labelling and consistency training, and attempt to mitigate noisy pseudo-labels using modules such as label correction and self-distillation. This dissertation advances the state of the art in 3D hand pose estimation, explores representation learning as well as weakly- and semi-supervised learning for pose estimation, and paves a path forward for learning pose estimation with diverse auxiliary information.	en
dc.language.iso	eng
dc.rights	In Copyright
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/
dc.subject	3D Hand Pose Estimation
dc.subject	Weakly-Supervised Learning
dc.subject	Semi-Supervised Learning
dc.subject	Multi-Modal Learning
dc.subject	Deep Learning
dc.subject	Computer Vision
dc.subject.ddc	004 Informatik
dc.title	3D Hand Pose Estimation from Single RGB Images with Auxiliary Information
dc.type	Dissertation oder Habilitation
dc.publisher.name	Universitäts- und Landesbibliothek Bonn
dc.publisher.location	Bonn
dc.rights.accessRights	openAccess
dc.identifier.urn	https://nbn-resolving.org/urn:nbn:de:hbz:5-68952
dc.relation.doi	https://doi.org/10.1109/CVPR.2019.01011
dc.relation.doi	https://doi.org/10.1109/ICCV.2019.00242
dc.relation.doi	https://doi.org/10.1109/ICCV48922.2021.01117
ulbbn.pubtype	Erstveröffentlichung
ulbbnediss.affiliation.name	Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location	Bonn
ulbbnediss.thesis.level	Dissertation
ulbbnediss.dissID	6895
ulbbnediss.date.accepted	23.11.2022
ulbbnediss.institute	Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaet	Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee	Klein, Reinhard
ulbbnediss.contributor.gnd	1292283181

Files in this item

Name: 6895.pdf
Size: 19.7MB
Format: PDF

This document appears in:

E-Dissertations (4378)
