Advances in Machine Learning Approaches for Biostatistical Learning

Welchowski, Thomas

dc.contributor.advisor	Schmid, Matthias
dc.contributor.author	Welchowski, Thomas
dc.date.accessioned	2025-06-27T14:30:42Z
dc.date.available	2025-06-27T14:30:42Z
dc.date.issued	27.06.2025
dc.identifier.uri	https://hdl.handle.net/20.500.11811/13164
dc.description.abstract	This habilitation thesis summarizes current state-of-the-art advances in machine learning for biomedical applications. The first contribution was the development of a framework for tuning kernel deep stacking networks (KDSN) to increase prediction performance (Welchowski and Schmid, 2016). KDSN are a computationally efficient alternative to backpropagation-based artificial neural networks: they allow layer-wise closed-form solutions and achieve comparable prediction performance on biomedical tabular data. The proposed model-based tuning framework requires much less computation time than grid-based search strategies. This work was extended to sparse KDSN (SKDSN), which add variable selection, dropout and regularization to make KDSN more flexible (Welchowski and Schmid, 2019). The SKDSN modifications improved upon the performance of KDSN but could not match the performance of ensemble methods on biomedical tabular data sets, especially when the number of covariates was high. Interpretable machine learning (IML) methods provide tools to gain further insights from such black-box models. A case study in ecology highlighted the strengths and weaknesses of IML methods that quantify the magnitude of effects and their interactions (Welchowski et al., 2022). In particular, graphical tools reached their limits when investigating higher-order interaction effects. Previous approaches for inference on model-agnostic interaction effects were limited to a few comparisons of covariate sets because they relied on computationally intensive resampling and refitting of the prediction model. To address these shortcomings, the follow-up article by Welchowski and Edelmann (2024) developed a model-agnostic hypothesis test for detecting interaction effects. Simulations showed that the test controlled the type I error and achieved reasonable power with a few hundred observations. 
Furthermore, owing to the derived asymptotic distribution, the test is far more computationally efficient than previous approaches and can be flexibly specified for covariate sets of interest.	en
dc.description.abstract	Diese Habilitationsschrift fasste den aktuellen Stand des maschinellen Lernens für biomedizinische Anwendungen zusammen. Der erste Beitrag war die Entwicklung eines Frameworks zur Optimierung von KDSN zur Verbesserung der Vorhersageleistung (Welchowski und Schmid, 2016). KDSN stellen eine rechnerisch effiziente Alternative zu Backpropagation-basierten künstlichen neuronalen Netzen dar und bieten vergleichbare Vorhersageleistung für biomedizinische Tabellendaten, die schichtweise geschlossene Lösungen ermöglichen. Das vorgeschlagene modellbasierte Optimierungsframework ist deutlich rechenzeitsparender als gitterbasierte Suchstrategien. Diese Arbeit wurde auf SKDSN erweitert, das Variablenauswahl, Dropout und Regularisierung umfasst, um KDSN flexibler zu gestalten (Welchowski und Schmid, 2019). SKDSN-Modifikationen verbesserten die Leistung von KDSN, erreichten jedoch nicht die Leistung von Ensemble-Methoden für biomedizinische Tabellendatensätze, insbesondere bei hoher Anzahl an Kovariablen. IML-Methoden bieten Werkzeuge, um weitere Erkenntnisse aus diesen Black-Box-Modellen zu gewinnen. Eine Fallstudie aus der Ökologie verdeutlichte die Stärken und Schwächen von IML-Methoden zur Quantifizierung des Ausmaßes von Effekten und ihrer Wechselwirkungen (Welchowski et al., 2022). Insbesondere grafische Werkzeuge zeigten ihre Grenzen bei der Untersuchung von Interaktionseffekten höherer Ordnung. Frühere Ansätze zur Inferenz modellagnostischer Interaktionseffekte beschränkten sich aufgrund rechenintensiver Resampling- und Modellanpassungen auf wenige Vergleiche von Kovariablen. Der Folgeartikel von Welchowski und Edelmann (2024) entwickelte daraufhin einen modellagnostischen Interaktionshypothesentest zur Erkennung von Interaktionseffekten, um diese Defizite zu beheben. Simulationen zeigten eine Kontrolle des Fehlers erster Art und ein angemessenes Trennschärfeniveau mit etwa einigen hundert Beobachtungen. 
Darüber hinaus ist der Test aufgrund der abgeleiteten asymptotischen Verteilung weitaus rechenzeiteffizienter als frühere Ansätze und kann flexibel an die jeweiligen Kovariablen angepasst werden.	de
dc.language.iso	eng
dc.rights	In Copyright
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/
dc.subject.ddc	310 Allgemeine Statistiken
dc.subject.ddc	570 Biowissenschaften, Biologie
dc.subject.ddc	610 Medizin, Gesundheit
dc.title	Advances in Machine Learning Approaches for Biostatistical Learning
dc.type	Dissertation oder Habilitation
dc.identifier.doi	https://doi.org/10.48565/bonndoc-586
dc.publisher.name	Universitäts- und Landesbibliothek Bonn
dc.publisher.location	Bonn
dc.rights.accessRights	openAccess
dc.identifier.urn	https://nbn-resolving.org/urn:nbn:de:hbz:5-83173
dc.relation.doi	https://doi.org/10.1016/j.artmed.2016.04.002
dc.relation.doi	https://doi.org/10.1007/s00180-018-0832-9
dc.relation.doi	https://doi.org/10.1007/s13253-021-00479-7
dc.relation.doi	https://doi.org/10.3390/make6020061
ulbbn.pubtype	Erstveröffentlichung
ulbbnediss.affiliation.name	Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location	Bonn
ulbbnediss.thesis.level	Habilitation
ulbbnediss.dissID	8317
ulbbnediss.date.accepted	30.01.2025
ulbbnediss.institute	Medizinische Fakultät / Institute : Institut für Medizinische Biometrie, Informatik und Epidemiologie (IMBIE)
ulbbnediss.fakultaet	Medizinische Fakultät
dc.contributor.coReferee	Rügamer, David
ulbbnediss.contributor.orcid	https://orcid.org/0000-0003-2940-647X

Files in this item

Name: 8317.pdf
Size: 38.9 MB
Format: PDF

This item appears in:

E-Dissertationen (2082)
