
Statistical learning for multivariate distributional regression with complex dependencies

dc.contributor.advisor: Mayr, Andreas
dc.contributor.author: Strömer, Annika Lisa
dc.date.accessioned: 2025-12-05T16:56:11Z
dc.date.available: 2025-12-05T16:56:11Z
dc.date.issued: 05.12.2025
dc.identifier.uri: https://hdl.handle.net/20.500.11811/13728
dc.description.abstract: Large, complex datasets are becoming increasingly important in biomedical research. Such datasets typically feature a high number of variables per subject, multiple outcomes and complex dependency structures. While they provide new opportunities to examine scientific questions in greater detail, they also pose major statistical challenges. Addressing these challenges requires advanced methods that can handle high dimensionality, capture dependencies between correlated outcomes and provide interpretable results.
This cumulative dissertation develops statistical frameworks for multivariate distributional regression and variable selection techniques, enabling the analysis of complex biomedical data while balancing flexibility, interpretability and efficiency. It comprises five publications covering methodological advances and applications in diverse biomedical contexts.
The first project demonstrates the value of advanced multivariate modeling for uncovering clinically relevant patterns in complex longitudinal data. Using latent class linear mixed models (LCMMs), unobserved patient subgroups are identified with distinct five-year trajectories in weight, depressive symptoms, eating disorder psychopathology and health-related quality of life (HRQoL) after obesity surgery. The results show that physical and psychological changes can evolve differently over time and may vary in sustainability, underscoring the need for joint models that capture both interdependencies and heterogeneity.
The second project develops a model-based boosting approach for multivariate distributional regression within the framework of generalized additive models for location, scale and shape (GAMLSS). This method enables simultaneous modeling of all distribution parameters – including dependence parameters – of arbitrary parametric multivariate outcomes as functions of covariates. It incorporates data-driven variable selection and scales to high-dimensional settings where the number of covariates exceeds the number of observations (p > n).
Building on this, the third project tackles the issue of dependent censoring in survival analysis, a challenging scenario where the common assumption of independent censoring does not hold. In such cases, censoring may be related to the patient's health status; for instance, patients in poorer condition may withdraw from a study earlier. The work proposes a novel model-based boosting method using distributional copula regression to jointly model the marginal distributions of event and censoring times as well as their dependence, as functions of covariates.
The fourth and fifth papers address the challenge of improving interpretability in model-based boosting, particularly for high-dimensional biomedical data. While boosting provides flexibility, it may result in overly complex models by including covariates with negligible importance.
The fourth paper proposes a deselection approach for univariate (distributional) regression that removes irrelevant predictors with only a minor impact on the prediction of the model, yielding simpler and more interpretable models without compromising predictive performance.
The fifth paper extends this approach to distributional copula regression, enabling not only the removal of variables with minor importance but also the determination of whether specific parameters require covariate effects. This controls model complexity and enhances interpretability.
This dissertation includes five research articles published in peer-reviewed international journals (Publications A to E).
dc.language.iso: eng
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject.ddc: 310 Allgemeine Statistiken
dc.title: Statistical learning for multivariate distributional regression with complex dependencies
dc.type: Dissertation oder Habilitation
dc.identifier.doi: https://doi.org/10.48565/bonndoc-732
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5-86720
ulbbn.pubtype: Erstveröffentlichung
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 8672
ulbbnediss.date.accepted: 27.11.2025
ulbbnediss.institute: Medizinische Fakultät / Institute : Institut für Medizinische Biometrie, Informatik und Epidemiologie (IMBIE)
ulbbnediss.fakultaet: Medizinische Fakultät
dc.contributor.coReferee: Klein, Nadja
ulbbnediss.contributor.orcid: https://orcid.org/0000-0002-1284-3318

