Exploring and Addressing General Limitations of Compound Potency Predictions using Machine Learning

dc.contributor.advisor: Bajorath, Jürgen
dc.contributor.author: Borges Janela, Tiago
dc.date.accessioned: 2025-03-24T08:33:39Z
dc.date.available: 2025-03-24T08:33:39Z
dc.date.issued: 24.03.2025
dc.identifier.uri: https://hdl.handle.net/20.500.11811/12941
dc.description.abstract (en): Compound potency prediction is a major task in computational drug discovery. Regression models based on machine learning (ML) approaches have become popular for small molecule potency predictions. Recently, deep learning (DL) methods have introduced novel architectures and data representations that have been applied to molecular potency predictions. When a new computational approach is introduced, its performance is initially assessed in benchmark studies. Conventional benchmark calculations use compound potency data for a specific target, divided into training sets for model generation and test sets for performance assessment over several rounds of cross-validation. Under these conditions, performance differences between prediction models are often negligible and do not translate into successful prospective applications. The mechanisms underlying these small performance differences are yet to be determined. This dissertation investigates the intrinsic limitations of current benchmark settings for compound potency predictions using ML models. The first study compares the performance of traditional ML, DL, and control models under different test conditions for several compound activity classes. Next, potency predictions are extended to a wide range of activity classes, using ML and control models. The impact of data composition and potency ranges on prediction accuracy is determined based on different data set generation strategies. At this stage, limitations associated with potency prediction benchmarks, such as limited differences between predictive ML/DL and control models, are uncovered. Furthermore, ML/DL and control models are derived with original and modified training sets of increasing size. Prediction performance is determined over several potency sub-ranges to rationalize the unveiled benchmark limitations.
Moreover, the impact of structural analogs on prediction models is determined using a newly designed compound pair-based evaluation scheme that monitors performance over increasing compound potency differences. Additionally, a novel DL method for compound potency predictions is introduced and compared to state-of-the-art ML models for the prediction of potent compounds. Finally, alternative evaluation schemes are explored, and possible future steps toward better benchmark systems for ML potency predictions are discussed. Taken together, this thesis uncovers current limitations of benchmark systems for comparing ML models and offers alternative approaches to better determine compound potency prediction performance.
dc.language.iso: eng
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: compound potency predictions
dc.subject: machine learning
dc.subject: performance evaluation
dc.subject.ddc: 004 Computer science
dc.title: Exploring and Addressing General Limitations of Compound Potency Predictions using Machine Learning
dc.type: Dissertation or Habilitation
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5-81116
dc.relation.doi: https://doi.org/10.1038/s42256-022-00581-6
dc.relation.doi: https://doi.org/10.3390/ph16040530
dc.relation.doi: https://doi.org/10.1038/s41598-023-45086-3
dc.relation.doi: https://doi.org/10.1021/acs.jcim.3c01530
dc.relation.doi: https://doi.org/10.3390/biom13020393
dc.relation.doi: https://doi.org/10.1016/j.xcrp.2024.101988
ulbbn.pubtype: First publication
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 8111
ulbbnediss.date.accepted: 04.02.2025
ulbbnediss.institute: Central scientific institutions : Bonn-Aachen International Center for Information Technology (b-it)
ulbbnediss.fakultaet: Faculty of Mathematics and Natural Sciences
dc.contributor.coReferee: Fröhlich, Holger
ulbbnediss.contributor.orcid: https://orcid.org/0000-0002-0782-3021


The following terms of use apply to this resource:

In Copyright