Statistical learning for multivariate distributional regression with complex dependencies

| dc.contributor.advisor | Mayr, Andreas | |
| dc.contributor.author | Strömer, Annika Lisa | |
| dc.date.accessioned | 2025-12-05T16:56:11Z | |
| dc.date.available | 2025-12-05T16:56:11Z | |
| dc.date.issued | 05.12.2025 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.11811/13728 | |
| dc.description.abstract | Large, complex datasets are becoming increasingly important in biomedical research. Such datasets typically feature a high number of variables per subject, multiple outcomes and complex dependency structures. While they provide new opportunities to examine scientific questions in greater detail, they also pose major statistical challenges. Addressing these challenges requires advanced methods that can handle high dimensionality, capture dependencies between correlated outcomes and provide interpretable results. This cumulative dissertation develops statistical frameworks for multivariate distributional regression and variable selection techniques, enabling the analysis of complex biomedical data while balancing flexibility, interpretability and efficiency. It comprises five publications covering methodological advances and applications in diverse biomedical contexts. The first project demonstrates the value of advanced multivariate modeling for uncovering clinically relevant patterns in complex longitudinal data. Using latent class linear mixed models (LCMMs), unobserved patient subgroups are identified with distinct five-year trajectories in weight, depressive symptoms, eating disorder psychopathology and health-related quality of life (HRQoL) after obesity surgery. The results show that physical and psychological changes can evolve differently over time and may vary in sustainability, underscoring the need for joint models that capture both interdependencies and heterogeneity. The second project develops a model-based boosting approach for multivariate distributional regression within the framework of generalized additive models for location, scale and shape (GAMLSS). This method enables simultaneous modeling of all distribution parameters – including dependence parameters – of arbitrary parametric multivariate outcomes as functions of covariates. It incorporates data-driven variable selection and scales to high-dimensional settings where the number of covariates exceeds the number of observations (p > n). Building on this, the third project tackles the issue of dependent censoring in survival analysis, a challenging scenario where the common assumption of independent censoring does not hold. In such cases, censoring may be related to the patient's health status; for instance, patients in poorer condition may withdraw from a study earlier. The work proposes a novel model-based boosting method using distributional copula regression to jointly model the marginal distributions of event and censoring times as well as their dependence, as functions of covariates. The fourth and fifth papers address the challenge of improving interpretability in model-based boosting, particularly for high-dimensional biomedical data. While boosting provides flexibility, it may result in overly complex models by including covariates with negligible importance. The fourth paper proposes a deselection approach for univariate (distributional) regression that removes irrelevant predictors with only a minor impact on the prediction of the model, yielding simpler and more interpretable models without compromising predictive performance. The fifth paper extends this approach to distributional copula regression, enabling not only the removal of variables with minor importance but also the determination of whether specific parameters require covariate effects. This controls model complexity and enhances interpretability. This dissertation includes five research articles published in peer-reviewed international journals (Publication A - E). | en |
| dc.language.iso | eng | |
| dc.rights | In Copyright | |
| dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | |
| dc.subject.ddc | 310 General statistics | |
| dc.title | Statistical learning for multivariate distributional regression with complex dependencies | |
| dc.type | Dissertation or habilitation | |
| dc.identifier.doi | https://doi.org/10.48565/bonndoc-732 | |
| dc.publisher.name | Universitäts- und Landesbibliothek Bonn | |
| dc.publisher.location | Bonn | |
| dc.rights.accessRights | openAccess | |
| dc.identifier.urn | https://nbn-resolving.org/urn:nbn:de:hbz:5-86720 | |
| ulbbn.pubtype | First publication | |
| ulbbnediss.affiliation.name | Rheinische Friedrich-Wilhelms-Universität Bonn | |
| ulbbnediss.affiliation.location | Bonn | |
| ulbbnediss.thesis.level | Dissertation | |
| ulbbnediss.dissID | 8672 | |
| ulbbnediss.date.accepted | 27.11.2025 | |
| ulbbnediss.institute | Medizinische Fakultät / Institute : Institut für Medizinische Biometrie, Informatik und Epidemiologie (IMBIE) | |
| ulbbnediss.fakultaet | Medizinische Fakultät | |
| dc.contributor.coReferee | Klein, Nadja | |
| ulbbnediss.contributor.orcid | https://orcid.org/0000-0002-1284-3318 | |