Automating the Fact-Checking Task: Challenges and Directions

dc.contributor.advisor: Lehmann, Jens
dc.contributor.author: Nascimento Esteves da Silva, Diego
dc.date.accessioned: 2020-04-26T20:45:00Z
dc.date.available: 2020-04-26T20:45:00Z
dc.date.issued: 30.07.2019
dc.identifier.uri: https://hdl.handle.net/20.500.11811/8030
dc.description.abstract: In recent years, misinformation has caused widespread alarm and become a global concern, given its negative impact on society, democratic institutions and even computing systems whose primary objective is to serve as a reliable information channel, e.g., Knowledge Bases (KBs). Fake news varies widely in its characteristics and motivations. For instance, it can be produced unintentionally (e.g., during the creation of KBs, which relies mostly on automated information extraction methods and is thus naturally error-prone) or intentionally (e.g., the spread of misinformation through social media to persuade). Such claims therefore differ considerably in complexity, structure and number of arguments and propositions. To make matters worse, the ever-increasing amount of fake news on the Web makes it harder still to find correct information. This huge sea of data makes it very difficult for human fact checkers and journalists to assess all the information manually. Addressing this problem is therefore of utmost importance to limit real-world situations that may harm society in general. Fact-checking has emerged as a branch of natural language processing devoted to this goal. Under this umbrella, Automated Fact-Checking frameworks have been proposed to perform claim verification. However, given the nature of the problem, different tasks need to be performed, from natural language understanding to source trustworthiness analysis and credibility scoring. In this thesis, we tackle the problem of fake news and the underlying challenges of estimating the veracity of a given claim, discussing those challenges and proposing novel models to improve the current state of the art on different sub-tasks.
Thus, besides the principal task (i.e., performing automated fact-checking), we also investigate the recognition of entities in noisy data and the computation of website credibility. Finally, because the automated fact-checking task is challenging and requires a complex analysis from several perspectives, we also contribute towards the reproducibility of scientific experiments. First, we tackle the named entity recognition problem. We propose a novel multi-level approach named HORUS which, given an input token, generates heuristics based on computer vision and text mining techniques. These heuristics are then used to detect and classify named entities in noisy data (e.g., the Web). Second, we propose WebCred, a novel model to compute the credibility score of a given website without depending on search engine results, a limiting factor in real scenarios. WebCred does not require any third-party service and is 100% open-source. Third, we conduct several empirical evaluations and extend DeFacto, a fact-checking framework initially designed to verify English claims in RDF format. DeFacto supports both structured claims (e.g., triple-like) and complex claims (i.e., natural language sentences). Last, but not least, we consistently contributed towards better tools, methods, and methodologies for reproducible research. We proposed ontologies (MEX, ML-Schema) and tools (LOG4MEX, MEX-Interfaces, WEB4MEX, WASOTA) which became state of the art for better reproducibility of machine learning experiments and part of a global W3C community.
dc.language.iso: eng
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Fact checker
dc.subject: Trustworthiness
dc.subject: Credibility
dc.subject: Information gathering
dc.subject: NER
dc.subject: Reproducibility
dc.subject.ddc: 004 Computer science
dc.title: Automating the Fact-Checking Task: Challenges and Directions
dc.type: Dissertation or habilitation
dc.publisher.name: Universitäts- und Landesbibliothek Bonn
dc.publisher.location: Bonn
dc.rights.accessRights: openAccess
dc.identifier.urn: https://nbn-resolving.org/urn:nbn:de:hbz:5n-55001
ulbbn.pubtype: First publication
ulbbnediss.affiliation.name: Rheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.location: Bonn
ulbbnediss.thesis.level: Dissertation
ulbbnediss.dissID: 5500
ulbbnediss.date.accepted: 29.05.2019
ulbbnediss.institute: Mathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaet: Mathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coReferee: Auer, Sören

