Deep Dynamic Language Models

Cvejoski, Kostadin

Author

Cvejoski, Kostadin

ORCID

https://orcid.org/0009-0003-6976-3997

Type of thesis

Dissertation

Date of examination

16 January 2024

Date of publication

15 March 2024

First reviewer

Bauckhage, Christian

Second reviewer

Wrobel, Stefan

Degree-granting institution

Rheinische Friedrich-Wilhelms-Universität Bonn

Citable links

Handle: https://hdl.handle.net/20.500.11811/11435
URN: https://nbn-resolving.org/urn:nbn:de:hbz:5-74907

Abstract

This thesis investigates deep dynamic language models, focusing on the integration of temporal dynamics to enhance language modeling and its application to tasks such as text generation, recommendation, and post popularity prediction. Temporal content change, i.e., the shifting trends and themes found in document collections such as academic journals, news articles, and social media, makes traditional static language models (LMs) a suboptimal solution. To address this limitation, several approaches to developing dynamic LMs are proposed and explored in this thesis.
Initially, the impact of incorporating temporal information is explored, specifically in the context of modeling online communities. To analyze temporal content change on Yelp, a crowd-sourced review platform, an instantaneous language model is proposed. This model combines a temporal point process (TPP), which models review creation times, with an LM that captures textual aspects. Empirical evaluations demonstrate that this model significantly improves the performance of LMs in both language modeling and the prediction of review creation times.
Building upon the success of the instantaneous LM, the research in this thesis is extended to more application-oriented tasks, such as recommender systems. Recognizing that user preferences and item reviews change over time, the proposed model leverages users’ reviews to enhance rating predictions. By developing time-interval-aware representations, the proposed model outperforms several state-of-the-art recommender system models on real-world datasets.
Additionally, the integration of dynamic topic models into LMs is explored. First, the problem of skewed topic distributions in topic modeling is addressed: such distributions can cause models to learn general topics present in the majority of documents rather than rare topics present in only a few. A neural dynamic focused topic model is proposed as a solution, which decouples topic activities from topic proportions in documents using sequences of Bernoulli random variables. Experimental evaluations show that this model outperforms state-of-the-art topic models in generalization tasks while employing a comparable number of parameters and converging twice as fast.
Furthermore, the performance of large pre-trained language models (LPLMs) in dynamic environments is explored. An empirical analysis on Reddit datasets reveals significant performance drops when predicting the popularity of future posts, caused by temporal distribution shifts in the data. To mitigate this issue, a model is proposed that combines neural variational dynamic topic models and attention mechanisms to infer temporal LM representations. The proposed model exhibits improved performance while using only a fraction of the parameters of LPLMs, and it provides interpretable representations that offer insights into real-world events.
In summary, this thesis emphasizes the significance of incorporating temporal dynamics into LMs and explores their application in various tasks.

Keywords

Large Language Models, Dynamic Language Models

Classification (DDC)

004 Computer science

Related publication(s)

https://doi.org/10.48550/arXiv.1912.04132
https://doi.org/10.1109/IJCNN48605.2020.9206768
https://doi.org/10.1007/978-3-031-31836-8_4
https://doi.org/10.1609/aaai.v37i11.26496
https://doi.org/10.48550/arXiv.2211.00384

Suggested citation
BibTeX

Cvejoski, Kostadin: Deep Dynamic Language Models. - Bonn, 2024. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online edition in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-74907

@phdthesis{handle:20.500.11811/11435,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-74907},
author = {Cvejoski, Kostadin},
title = {Deep Dynamic Language Models},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2024,
month = mar,
url = {https://hdl.handle.net/20.500.11811/11435}
}
