Ojeda Marin, Cesar Ali: Approximate Inference Applications to Representation Learning and Stochastic Processes Problems. - Bonn, 2021. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.

Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-61784

@phdthesis{handle:20.500.11811/9028,

urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-61784},

author = {Ojeda Marin, Cesar Ali},

title = {Approximate Inference Applications to Representation Learning and Stochastic Processes Problems},

school = {Rheinische Friedrich-Wilhelms-Universität Bonn},

year = 2021,

month = apr,

note = {This dissertation develops inference algorithms and methodologies for the study of dynamical datasets. We developed techniques to analyze time-series datasets for point processes, switching dynamical systems, and queueing system dynamics. Furthermore, we analyzed the interplay between dynamic population behavior and the semantic structures that inform it. Conversely, we studied how dynamics in semantic spaces can be exploited to explain black-box classifiers’ decisions. Concretely, we first extended the Hawkes process to incorporate richer correlations in the excitations by introducing a sigmoid link function over a Gaussian process prior. We incorporate a Pólya-Gamma data augmentation approach and a sparse Gaussian process approximation into a mean-field treatment of the variational lower bound to perform inference with analytic updates, obtaining a fast and scalable algorithm. Second, we introduce a flexible methodology for handling temporal Poisson process intensities based on spline interpolation for fast and scalable unsupervised analysis of point processes. We then propose a similarity measure for time series that is invariant to translations of local patterns. With this similarity measure, we develop a spectral clustering algorithm with a flexible, piecewise kernel evaluation for efficient computation, scaling to large amounts of data. The clustering procedure incorporates an entropy measure to determine how well a given (intensity) time series is represented by a cluster prototype, allowing for the detection of outliers within a temporal pattern sample. Third, we provide a deep learning solution for the service times of queueing systems; we exploit the representation learning capabilities of deep neural networks for point processes to infer service time distributions modeled either as multilayered parametrizations of known distributions or as nonparametric models through adversarial neural networks.
The adversarial models capture multi-modal and long-tailed distributions of service times. This approach allows us to characterize the independent dynamics of the service times, so that exogenous events are characterized implicitly. As a focus application area, we provide the first deep and nonparametric solution for predicting unconfirmed transactions in the Bitcoin Mempool network. As a fourth contribution, departing from point processes, we studied switching dynamical systems by exploiting recurrent neural networks (RNNs). This approach allows for an explicit description of non-linear and non-Markovian transition functions for both modes and switching dynamics. Within our model, the modes are learned through independent RNNs, whereas, similar to (mixture of) expert systems, the selection of modes is handled via a categorical distribution. As a fifth contribution, we incorporate the knowledge of semantic structures and their influence on dynamical processes. In the context of question answering sites, where knowledge is organized as a series of tags identifying the expert domains of questions, we propose an algorithm to learn hierarchical taxonomies. This algorithm considers co-occurrences of the tags assigned to the questions. Our algorithm infers hidden hierarchies only from sets of co-occurring tags, i.e., from all the n-tuples in which a given tag appears. We finally link the taxonomies with the dynamical behavior of the users posting questions. Our extensive empirical evaluation indicates that the tagging process of parent nodes depends strongly on the tagging process of their descendants, and not only on that of their co-occurring tags. As a final contribution, we study dynamical processes not in population behavior but in semantic spaces themselves, for the purpose of explaining classifier decisions. We aim to generate a set of examples that highlight differences in the decisions of a black-box model.
We use interpolations in latent space to generate a set of examples in feature space connecting the misclassified and the correctly classified points. We then condition the resulting feature-space paths on the black-box classifier’s decisions via a user-defined functional. Optimizing the latter over the space of paths allows us to find paths that highlight classification differences. We introduce and formalize the notion of stochastic semantic paths: stochastic processes on feature space created by latent code interpolations. Expected changes of a data point are characterized in terms of stochastic functionals along the path, which leads to the notion of a semantic Lagrangian. To train, say, a Variational Auto-Encoder, one must thus define a new training cost by solving the variational problem, which minimizes the functional along the paths.},

url = {http://hdl.handle.net/20.500.11811/9028}

}
