Explainable Resource-Aware Representation Learning via Semantic Similarity

Brito Chacón, Eduardo Alfredo

dc.contributor.advisor	Bauckhage, Christian
dc.contributor.author	Brito Chacón, Eduardo Alfredo
dc.date.accessioned	2023-12-12T10:26:09Z
dc.date.available	2023-12-12T10:26:09Z
dc.date.issued	12.12.2023
dc.identifier.uri	https://hdl.handle.net/20.500.11811/11174
dc.description.abstract	The rapid advancement of artificial intelligence (AI) systems in recent years is largely due to the impressive capabilities of artificial neural networks. Their strong performance in natural language understanding and computer vision has paved the way for the wide adoption of AI solutions. However, these models often demand significant computational resources and operate as "black boxes", limiting their utility in sensitive domains, such as finance and healthcare, where strict personal data protection regulations apply. This thesis addresses the three-way trade-off between accuracy, explainability, and resource consumption in the context of supervised learning, with an emphasis on representation learning for text applications. It starts by presenting three use cases: semantic segmentation for autonomous driving, sentiment analysis via language models, and text summary evaluation. These cases underscore the need for robust evaluation techniques to enhance system trustworthiness but also highlight their limitations. This motivates the development of RatVec, an explainable, resource-efficient framework leveraging kernel PCA and k-nearest neighbors, which is presented next. RatVec demonstrates competitive performance under certain conditions, especially when tasks can be represented as sequence similarity problems, e.g., protein family classification. For situations where RatVec is less suitable, such as text classification, the thesis proposes an analogous pipeline using Transformer-based text representations. When fine-tuned, this pipeline comes close to the accuracy of purely neural models while maintaining architectural explainability, and it enables granular explanations of semantic similarity via a novel technique of pairing contextualized best-matching tokens.
In sum, this thesis advances the pursuit of trustworthy AI systems by introducing RatVec, a resource-efficient, explainable framework optimally suited to settings that are naturally translatable to sequence similarity problems, and proposing an explainable Transformer-based pipeline for text classification tasks. These advancements address some of the challenges of deploying AI in sensitive domains and suggest several promising avenues for future research.	en
dc.language.iso	eng
dc.rights	In Copyright
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/
dc.subject.ddc	004 Computer science
dc.title	Explainable Resource-Aware Representation Learning via Semantic Similarity
dc.type	Dissertation or Habilitation
dc.identifier.doi	https://doi.org/10.48565/bonndoc-173
dc.publisher.name	Universitäts- und Landesbibliothek Bonn
dc.publisher.location	Bonn
dc.rights.accessRights	openAccess
dc.identifier.urn	https://nbn-resolving.org/urn:nbn:de:hbz:5-72981
dc.relation.doi	https://doi.org/10.1145/3539618.3592017
dc.relation.doi	https://doi.org/10.1007/978-3-031-15791-2_5
dc.relation.doi	https://doi.org/10.1109/IVWorkshops54471.2021.9669248
dc.relation.doi	https://doi.org/10.1145/3342558.3345420
dc.relation.doi	https://doi.org/10.1007/978-3-658-19287-7_8
ulbbn.pubtype	First publication
ulbbnediss.affiliation.name	Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location	Bonn
ulbbnediss.thesis.level	Dissertation
ulbbnediss.dissID	7298
ulbbnediss.date.accepted	20.10.2023
ulbbnediss.institute	Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaet	Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee	Wrobel, Stefan
ulbbnediss.contributor.orcid	https://orcid.org/0000-0003-1235-700X

Files in this item

Name: 7298.pdf
Size: 9.2MB
Format: PDF

This item appears in the following Collection(s)

E-Dissertationen (4379)
