Automatic Evaluation of Dialogue-Systems Using Neural-Network Methods

Nedelchev, Rostislav

dc.contributor.advisor	Lehmann, Jens
dc.contributor.author	Nedelchev, Rostislav
dc.date.accessioned	2023-06-05T13:11:18Z
dc.date.available	2023-06-05T13:11:18Z
dc.date.issued	05.06.2023
dc.identifier.uri	https://hdl.handle.net/20.500.11811/10873
dc.description.abstract	We usually interact with computers by means of specialized tools that are not as common as the language humans use. This has motivated researchers for several decades to develop algorithms that enable interfacing with computer systems using natural language. This is especially prominent in recent times with the rise of voice assistants like Apple Siri or Amazon Alexa. However, the research and development of such systems is expensive in terms of human labor. The high expenses are especially prominent for the evaluation of such systems, which are very often evaluated by human annotators as a final stage, following expensive development. The focus of this thesis is to support the assessment of dialogue systems by creating automatic tools that support humans. Human conversations involve many intricacies that make it difficult to develop an algorithm that could evaluate them reliably and also informatively. To put the challenge into context, one should consider the Turing test, which is a method of examination in artificial intelligence (AI) for ascertaining whether a computer is capable of thinking like a human being. One of its key components is the ability to decide whether a conversation is natural. There are various criteria according to which a dialogue is evaluated, and hence various problems that it can suffer from. In this work, we aim to detect some of these problems. In order to emulate human-like intelligence, we stand on the shoulders of techniques in Natural Language Processing (NLP), machine learning (ML), and deep learning (DL). Since our goal is to reduce human effort in the evaluation of dialogues, we focus on methods that can achieve it without the need for additional annotated data: 1. We apply approaches from various problem domains. The thesis makes use of out-of-distribution (OoD) and anomaly detection approaches to treat low-quality or problematic dialogue utterances as "unusual." 2. Despite being researched for a few decades, Language Models (LMs) became popular in research only in the last few years. In our work, we show that they too can be used to evaluate dialogue quality. 3. Natural Language Processing as a field aims to teach various human-like language skills to computers, e.g., abilities like understanding whether two sentences are similar in meaning or whether a piece of text has a positive or negative sentiment. We show that these skills can be used as indirect indicators of conversation quality. 4. In addition, we show that dialogue systems can be evaluated not by means of reference, but by "opinion." In other words, instead of asking them to generate a solution for a problem, we show that one can ask them to evaluate a reference solution and, based on that, develop an understanding of the abilities of a dialogue system. None of the approaches proposed in this thesis makes use of supervision for dialogue evaluation. They manage to deliver insights from various perspectives that could potentially complement each other in an overall framework for assessing conversation quality.	en
dc.language.iso	eng
dc.rights	Attribution-NonCommercial 4.0 International
dc.rights.uri	http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.ddc	004 Computer science
dc.title	Automatic Evaluation of Dialogue-Systems Using Neural-Network Methods
dc.type	Dissertation or habilitation
dc.publisher.name	Universitäts- und Landesbibliothek Bonn
dc.publisher.location	Bonn
dc.rights.accessRights	openAccess
dc.identifier.urn	https://nbn-resolving.org/urn:nbn:de:hbz:5-70983
ulbbn.pubtype	First publication
ulbbnediss.affiliation.name	Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location	Bonn
ulbbnediss.thesis.level	Dissertation
ulbbnediss.dissID	7098
ulbbnediss.date.accepted	19.04.2023
ulbbnediss.institute	Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaet	Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee	Bauckhage, Christian
dcterms.hasSupplement	https://doi.org/10.60507/FK2/FX37GD
dcterms.hasSupplement	https://doi.org/10.60507/FK2/MAVB6H
ulbbnediss.contributor.orcid	https://orcid.org/0000-0002-0209-6558

Files in this item

Name: 7098.pdf
Size: 4.7MB
Format: PDF

This item appears in:

E-Dissertations (4335)
