Molecular Complexity Effects and Fingerprint-Based Similarity Search Strategies

dc.contributor.advisor: Bajorath, Jürgen
dc.contributor.author: Wang, Yuan
dc.date.accessioned: 2020-04-14T04:16:17Z
dc.date.available: 2020-04-14T04:16:17Z
dc.date.issued: 13.11.2009
dc.identifier.uri: https://hdl.handle.net/20.500.11811/4158
dc.description.abstract: Molecular fingerprints are bit string representations of molecular structure and properties. They are among the most popular descriptors and tools in molecular similarity searching because of their conceptual simplicity and computational efficiency. To calculate molecular similarity, fingerprints are computed for reference and screening database compounds, and their bit settings are compared quantitatively using similarity metrics. One caveat of this approach is the bias caused by complexity effects: complex molecules have higher fingerprint bit density and produce artificially high similarity values.
Tversky similarity is known to behave asymmetrically: comparing A to B does not give the same value as comparing B to A. This phenomenon can be directly attributed to complexity effects. Hence, the preferred parameter settings for the Tversky coefficient depend on the relative difference in molecular complexity. One approach to avoiding such effects is to use fingerprint representations with constant bit density. Alternatively, emphasizing bit positions that are not set, which conventional fingerprint similarity search methods do not take into account, provides another approach to addressing complexity effects. However, for optimizing search performance, eliminating complexity effects in this way is less effective than modulating them. To evaluate the outcome of virtual screening, search performance is monitored for combinations of different parameters. In general, when similarity searching uses highly complex reference compounds, it is difficult to recover potential hits that are less complex.
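The asymmetry described above can be illustrated with a minimal sketch. It assumes fingerprints are stored as Python sets of on-bit positions and uses the standard Tversky formula; the function name `tversky` and the two example fingerprints are hypothetical and are not taken from the thesis.

```python
def tversky(a, b, alpha, beta):
    """Tversky similarity of fingerprints a and b (sets of on-bit positions)."""
    common = len(a & b)   # bits set in both fingerprints
    only_a = len(a - b)   # bits set only in a
    only_b = len(b - a)   # bits set only in b
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom else 0.0


# Hypothetical example: every bit of the small molecule also occurs in the large one.
small = {1, 2, 3}
large = {1, 2, 3, 4, 5, 6, 7, 8, 9}

# alpha = beta = 1 reduces the Tversky coefficient to the Tanimoto coefficient.
tanimoto = tversky(small, large, 1.0, 1.0)

# With alpha = 1 and beta = 0, only bits unique to the first argument are penalized,
# so the result depends on which molecule is treated as the reference.
small_as_reference = tversky(small, large, 1.0, 0.0)
large_as_reference = tversky(large, small, 1.0, 0.0)

print(tanimoto, small_as_reference, large_as_reference)
```

In this toy case the low-complexity reference matches the larger molecule perfectly, while the high-complexity reference scores the smaller molecule much lower, which is the direction of the bias the abstract describes.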
To further investigate complexity effects, the random reduction of fingerprint bit density is also explored. The ensuing loss of chemical information can be compensated for by balancing complexity effects when the fingerprints of reference compounds are modified to reduce their bit density.
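A rough sketch of random bit density reduction follows, again assuming fingerprints are sets of on-bit positions. The helper name `reduce_bit_density`, the `target_bits` parameter, and the fixed seed are illustrative assumptions; the abstract does not specify how the thesis chooses the target density.

```python
import random


def reduce_bit_density(fp, target_bits, seed=0):
    """Return a copy of fp that keeps at most target_bits randomly chosen on-bits."""
    if target_bits >= len(fp):
        return set(fp)  # nothing to remove
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Sort first so the sampled subset does not depend on set iteration order.
    return set(rng.sample(sorted(fp), target_bits))


# Hypothetical reference fingerprint with nine bits set, reduced to four.
reference = {1, 2, 3, 4, 5, 6, 7, 8, 9}
reduced = reduce_bit_density(reference, 4)
print(sorted(reduced))
```

The reduced fingerprint is always a subset of the original, so chemical information is lost but no spurious bits are introduced.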
When this random process is replaced with iterative bit silencing, the significance of each bit position in similarity searching can be analyzed and different weights can be assigned to each position. Such a weighting scheme emphasizes critical bit positions specific to the reference activity class. Class-specific similarity metrics can be derived by utilizing these weights in similarity calculation. Using these similarity metrics, similarity search performance can be improved, especially when conventional methods fail to retrieve potential active compounds.
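One simple way to use per-bit weights in a similarity calculation is a weighted Tanimoto coefficient, sketched below. This is an assumption made for illustration: the abstract does not state which functional form the thesis uses, and the weights shown are invented rather than derived by bit silencing.

```python
def weighted_tanimoto(a, b, weights, default=1.0):
    """Tanimoto similarity in which each bit position contributes its weight."""
    def total(bits):
        # Positions without an explicit weight fall back to the default weight.
        return sum(weights.get(pos, default) for pos in bits)

    union = total(a | b)
    return total(a & b) / union if union else 0.0


# Hypothetical fingerprints and weights: positions 1 and 2 are treated as
# critical for the reference activity class.
reference = {1, 2, 3, 4}
candidate = {1, 2, 5, 6}
class_weights = {1: 3.0, 2: 3.0}

unweighted = weighted_tanimoto(reference, candidate, {})
weighted = weighted_tanimoto(reference, candidate, class_weights)
print(unweighted, weighted)
```

With these made-up weights, a candidate that shares the emphasized positions scores higher than under the unweighted coefficient.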
Information from reference sets can also be utilized directly in the form of Shannon entropy as a measure of similarity. This simple and efficient similarity search strategy assesses the fingerprint entropy penalty induced by introducing external molecules into the reference set. Its performance is comparable to or better than that of nearest neighbor approaches, at lower computational cost.
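The entropy penalty idea can be sketched as follows. The sketch assumes the entropy of a fingerprint set is the sum of binary Shannon entropies over bit positions, and that the penalty is the entropy change when a candidate is added to the reference set. The exact formulation in the thesis may differ, and all names and data here are hypothetical.

```python
import math


def bit_entropy(p):
    """Binary Shannon entropy (in bits) of a position that is set with frequency p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def set_entropy(fps, n_bits):
    """Sum of per-position entropies over a list of fingerprints (sets of on-bits)."""
    total = 0.0
    for pos in range(n_bits):
        freq = sum(1 for fp in fps if pos in fp) / len(fps)
        total += bit_entropy(freq)
    return total


def entropy_penalty(reference, candidate, n_bits):
    """Entropy change caused by adding one candidate to the reference set."""
    return set_entropy(reference + [candidate], n_bits) - set_entropy(reference, n_bits)


# Hypothetical 8-bit fingerprints.
reference_set = [{0, 1, 2}, {0, 1, 3}, {0, 1, 2}]
lookalike = {0, 1, 2}
outlier = {5, 6, 7}

penalty_lookalike = entropy_penalty(reference_set, lookalike, 8)
penalty_outlier = entropy_penalty(reference_set, outlier, 8)
print(penalty_lookalike, penalty_outlier)
```

Database compounds would then be ranked by increasing penalty. Each candidate requires one pass over the bit positions rather than a comparison against every reference compound, which is consistent with the lower computational cost mentioned above.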
dc.language.iso: eng
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Life Science Informatics
dc.subject: Molecular Fingerprint
dc.subject: Similarity Search
dc.subject.ddc: 570 Life sciences, biology
dc.title: Molecular Complexity Effects and Fingerprint-Based Similarity Search Strategies
dc.type: Dissertation or habilitation thesis
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5N-19490
ulbbn.pubtype: First publication
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 1949
ulbbnediss.date.accepted: 05.11.2009
ulbbnediss.fakultaet: Faculty of Mathematics and Natural Sciences
dc.contributor.coReferee: Weber, Andreas

