Modeling Human Actions in Multi-Label settings

dc.contributor.advisor: Gall, Jürgen
dc.contributor.author: Biswas, Sovan
dc.date.accessioned: 2025-09-15T11:48:53Z
dc.date.available: 2025-09-15T11:48:53Z
dc.date.issued: 15.09.2025
dc.identifier.uri: https://hdl.handle.net/20.500.11811/13448
dc.description.abstract: Human actions are often complex and occur in dynamic contexts, posing a challenge for traditional recognition models. This challenge is further exacerbated by humans' innate tendency to multitask, i.e., a person typically performs multiple actions at the same time. This thesis delves into multi-label human action recognition and analysis, bridging the gap between single and group activities. Furthermore, the thesis acknowledges the cost of the labeled data required for training and discusses novel approaches for developing models in settings that range from fully supervised to data-scarce, weakly supervised ones.
The core contributions lie in developing novel neural network architectures that can capture the intricacies of multi-label action recognition. Our first contribution is Structural Recurrent Neural Networks (SRNNs) for group activity analysis. These networks capture individual actions, interactions between individuals, and the overall group activity, regardless of the size of the group. Moving on from group activity, we also proposed a Hierarchical Graph-RNN that specifically tackles multiple individual actions. This architecture incorporates the temporal context and the relationships between different actions to achieve accurate multi-label recognition in space and time.
Beyond fully supervised settings, we also explored weakly supervised learning, where action annotations are scarce. Here, our approaches rely on sets of actions rather than individual classes as annotations, since such sets are cost- and time-effective to obtain. Our initial approach uses Multi-Instance Multi-Label (MIML) learning followed by constraint-based linear programming to map the set of actions to individual humans in a video. Furthermore, the thesis addresses the challenge of longer videos in weakly supervised settings. Here, a novel Multiple Instance Triplet Loss (MITL) exploits the temporal similarity of consecutive frames, as opposed to temporally distant frames, to train the action recognition model effectively.
Through this dissertation, we advanced the state of the art in multi-label action analysis, proposed novel architectures for group and individual action recognition that exploit temporal and spatial context, and finally explored approaches for developing models in weakly supervised settings. We demonstrated the effectiveness of our approaches through comprehensive experiments and by comparing them with existing state-of-the-art methods on well-known public benchmarks. We conclude by discussing the open challenges and possible future research directions for multi-label human action analysis.
en
dc.language.iso: eng
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Multi-label action recognition
dc.subject: Spatio-temporal action detection
dc.subject: Weak-supervision
dc.subject.ddc: 004 Computer science
dc.title: Modeling Human Actions in Multi-Label settings
dc.type: Dissertation or habilitation
dc.identifier.doi: https://doi.org/10.48565/bonndoc-653
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5-84141
dc.relation.doi: https://doi.org/10.48550/arXiv.1802.02091
dc.relation.doi: https://doi.org/10.48550/arXiv.2101.08581
dc.relation.doi: https://doi.org/10.48550/arXiv.2101.08567
dc.relation.doi: https://doi.org/10.1109/ICCVW54120.2021.00245
ulbbn.pubtype: First publication
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 8414
ulbbnediss.date.accepted: 09.07.2025
ulbbnediss.institute: Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaet: Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee: Kühne, Hilde
ulbbnediss.contributor.orcid: https://orcid.org/0000-0002-9866-8433

