Sicking, Joachim: On Modeling and Assessing Uncertainty Estimates in Neural Learning Systems. - Bonn, 2023. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online-Ausgabe in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-70545
@phdthesis{handle:20.500.11811/10778,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-70545},
author = {Sicking, Joachim},
title = {On Modeling and Assessing Uncertainty Estimates in Neural Learning Systems},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2023,
month = apr,

note = {While neural networks are universal function approximators when looked at from a theoretical perspective, we face, in practice, model size constraints and highly sparse data samples from open-world contexts. These limitations of models and data introduce uncertainty, i.e., they render it unclear whether a model's output for a given input datapoint can be relied on. This lack of information hinders the use of learned models in critical applications, as unrecognized erroneous predictions may occur. A promising safeguard against such failures is uncertainty estimation, which seeks to measure a model's input-dependent reliability. Theory, modeling, and operationalization of uncertainty techniques are, however, often studied in isolation. In this work, we combine these perspectives to enable the effective use of uncertainty estimators in practice. In particular, it is necessary to address (the interplay of) three points. First, we need to better understand the theoretical properties of uncertainty estimators, specifically, their shortcomings stemming from constrained model capacity. Second, we must find a way to closely model data and error distributions that are not explicitly given. Third, for real-world use cases, we need a deeper understanding of uncertainty estimation requirements and their test-based evaluations.
Regarding the first point, we study how the estimation of uncertainty is affected (and limited) by a learning system's capacity. Beginning with a simple model for uncertain dynamics, a hidden Markov model, we integrate (neural) word2vec-inspired representation learning into it to control its model complexity more directly and, as a result, identify two regimes of differing model quality. Extending this analysis of model capacity to fully neural models, we investigate Monte Carlo (MC) dropout, which provides both complexity control and uncertainty estimates by randomly dropping neurons. In particular, we analyze the different types of output distributions this procedure can induce. While it is commonly assumed that these output distributions can be treated as Gaussians, we show by explicit construction that wider tails can occur.
As to the second point, we borrow ideas from MC dropout and construct a novel uncertainty technique for regression tasks: Wasserstein dropout. It captures heteroscedastic aleatoric uncertainty by input-dependent matchings of model output and data distributions, while preserving the beneficial properties of MC dropout. An extensive empirical analysis shows that Wasserstein dropout outperforms various state-of-the-art methods regarding uncertainty quality, both on vanilla test data and under distributional shifts. It can also be used for critical tasks like object detection for autonomous driving. Moreover, we extend uncertainty assessment beyond distribution-averaged metrics and measure the quality of uncertainty estimation in worst-case scenarios.
To address the third point, we not only need granular evaluations but also have to consider the context of the intended machine learning use case. To this end, we propose a framework that i) structures and shapes application requirements, ii) guides the selection of a suitable uncertainty estimation method, and iii) provides systematic test strategies that validate this choice. The proposed strategies are data-driven and range from general tests that identify capacity issues to specific ones that validate heteroscedastic calibration or expose risks stemming from worst- or rare-case scenarios.},

url = {https://hdl.handle.net/20.500.11811/10778}
}
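The second paragraph of the abstract refers to Monte Carlo (MC) dropout, which keeps dropout active at prediction time and treats repeated stochastic forward passes as samples from an output distribution. The following minimal PyTorch sketch illustrates only this sampling procedure; the toy architecture, dropout rate, and sample count are illustrative assumptions, not taken from the thesis.

import torch
import torch.nn as nn

# Toy regression network with dropout; the architecture is a placeholder.
model = nn.Sequential(
    nn.Linear(1, 64),
    nn.ReLU(),
    nn.Dropout(p=0.1),   # stays stochastic at prediction time for MC dropout
    nn.Linear(64, 1),
)

def mc_dropout_predict(model, x, n_samples=100):
    """Draw samples from the dropout-induced output distribution for x."""
    model.train()  # keep dropout layers active (MC dropout)
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    # Sample mean as the prediction, sample spread as the uncertainty estimate.
    return samples.mean(dim=0), samples.std(dim=0)

x = torch.randn(8, 1)
mean, std = mc_dropout_predict(model, x)

Note that model.train() also affects layers such as batch normalization; careful implementations re-enable only the dropout layers. The abstract's observation about non-Gaussian output distributions means that summarizing these samples by mean and standard deviation alone can understate tail risk.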
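The third paragraph introduces Wasserstein dropout, which trains the network so that the dropout-induced output distribution matches the input-dependent data distribution. The exact objective is derived in the thesis; as a hedged illustration of the matching idea only, one can use the closed-form squared 2-Wasserstein distance between one-dimensional Gaussians, W2(N(m1, s1^2), N(m2, s2^2))^2 = (m1 - m2)^2 + (s1 - s2)^2, and fit the Gaussian summary of dropout samples to the observed residuals. The function name, sample count, and residual-based target scale below are all illustrative assumptions.

import torch

def wasserstein_style_loss(model, x, y, n_samples=10):
    """Sketch of a distribution-matching loss for heteroscedastic regression.
    NOT the thesis' exact objective: it matches the Gaussian summary of the
    dropout output distribution to the label via the closed-form 1-D
    2-Wasserstein distance, using |y - mu| as a one-sample proxy for the
    local data spread."""
    model.train()  # dropout must be active while sampling
    samples = torch.stack([model(x) for _ in range(n_samples)])
    mu = samples.mean(dim=0)
    sigma = samples.std(dim=0)
    target_spread = (y - mu).abs().detach()  # one-sample estimate, no gradient
    return ((mu - y) ** 2 + (sigma - target_spread) ** 2).mean()

Minimizing the first term fits the predictive mean; the second term pushes the dropout-induced spread toward the size of the residuals, which is the mechanism by which such a matching loss can capture heteroscedastic aleatoric uncertainty.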
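The final paragraph mentions data-driven tests, among them checks of heteroscedastic calibration. A standard instance of such a test (shown here for illustration; the framework in the thesis is broader) verifies that standardized residuals z = (y - mu) / sigma behave like draws from a unit Gaussian, e.g., via their empirical variance and the coverage of central prediction intervals.

import numpy as np
from scipy import stats

def calibration_report(y, mu, sigma, level=0.9):
    """Simple data-driven checks for heteroscedastic uncertainty estimates.
    For well-calibrated Gaussian uncertainties, z ~ N(0, 1)."""
    z = (y - mu) / sigma
    half_width = stats.norm.ppf(0.5 + level / 2)   # about 1.645 for level=0.9
    coverage = np.mean(np.abs(z) <= half_width)    # should be close to `level`
    return {"z_variance": np.var(z), "coverage": coverage}

A variance of z far above one signals overconfidence, far below one underconfidence; rerunning the same report on shifted or rare-case subsets of the data gives a crude version of the worst-case evaluations the abstract mentions.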

The following license is associated with this item: InCopyright