de Heuvel, Jorge: Learning Personalized and Human-Aware Robot Navigation. - Bonn, 2026. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online-Ausgabe in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-87541
@phdthesis{handle:20.500.11811/13852,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-87541},
doi = {https://doi.org/10.48565/bonndoc-764},
author = {Jorge de Heuvel},
title = {Learning Personalized and Human-Aware Robot Navigation},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2026,
month = jan,

note = {Robots are increasingly moving from industrial applications into everyday human environments such as healthcare, households, and public spaces. In these interactive and personal contexts, successful human-robot interaction (HRI) critically depends on robots' abilities to interpret, reflect, and adapt to individual human preferences. Yet traditional robot navigation methods, though reliable in structured environments, generally fail to capture and reflect nuanced user preferences, resulting in suboptimal user experience, reduced trust, and limited acceptance.
To address these shortcomings, this thesis presents a comprehensive approach toward personalized, learning-based robot navigation. It specifically focuses on four critical aspects: (1) efficient and intuitive collection of human preferences, (2) balancing user preference reflection with robot navigation goals, (3) deriving expressive sensor representations suitable for dynamic environments, and (4) ensuring adaptability and transparency in HRI once deployed on a robot.
First, user preferences are captured using intuitive interfaces and efficient learning frameworks. We introduce a virtual reality (VR) demonstration interface that enables users to intuitively sketch robot navigation trajectories. The VR interface is complemented with a hybrid reinforcement learning (RL) and behavioral cloning (BC) framework that requires only a few demonstrations. We confirm through a user study that the personalized controller outperforms non-aligned baseline approaches, with users reporting that their preferences were better reflected. Beyond demonstration-based approaches, we also optimize preference collection through RL from human feedback (RLHF). We introduce the novel query generation approach "EnQuery", based on policy ensembles, which maximizes information gain in low-query regimes while providing trajectory options with common start and goal reference points. EnQuery subsequently drives a user study comparing immersive VR and conventional 2D video interfaces for preference collection. Here, we find effects of the interface modality on user experience, preference consistency, and policy alignment.
Second, the thesis develops and validates learning architectures that balance the trade-off between user preference reflection and robot task completion. The proposed hybrid RL+BC learning framework internalizes user preferences while preserving goal-directed performance. To quantify the quality of preference reflection in navigation trajectories, we introduce a new metric that is based on the Fréchet Distance.
Third, we address the challenge of sensor representation for robust navigation in dynamic, human-populated environments. A depth vision-based perception pipeline employing a variational autoencoder and motion prediction compresses sensor observations into latent states, capturing both scene details and the user for effective personalized policy learning. In parallel, a spatiotemporal attention mechanism paired with a novel 2D lidar state representation improves obstacle avoidance and foresight in dynamic human environments over state-of-the-art baselines.
Fourth, the thesis advances the adaptability and transparency of learning-based robot navigation policies. To accommodate adaptability to evolving user preferences, a multi-objective RL framework facilitates principled post-deployment tuning of demonstration reflection and other navigation objectives. For improved transparency between robot and user, an explainable artificial intelligence (AI) interface in VR is developed, visually grounding navigation policy attribution scores semantically in scene context. The approach communicates internal decision-making of black-box neural network policies in an intuitive manner and thereby improves non-expert users' objective and subjective understanding of robot behavior.
These contributions are validated through extensive simulation studies, user experiments, and real-robot deployments. The findings demonstrate that preference-reflecting, learning-based navigation is achievable, robust, and perceived as superior to classical approaches by users. The insights regarding interface modality, interaction sample efficiency, sensor abstraction, and explainability inform the design of future user-centric robotic systems. In summary, this thesis establishes principled methods for navigation preference collection, learning, and behavior explanation, advancing the state-of-the-art towards seamless, preference-aware HRI in daily life.},

url = {https://hdl.handle.net/20.500.11811/13852}
}

The following terms of use are associated with this resource:

InCopyright