Publications

Towards Uncertainty-Aware Low-Bit Quantized LLMs for On-Device Inference
Sparrenberg, Lorenz; Schneider, Tobias; Deußer, Tobias; Berger, Armin; Sifa, Rafet (2026-03-06)
Quantizing large language models (LLMs) significantly reduces memory usage and computational requirements, enabling efficient on-device inference. However, aggressive quantization can degrade model performance and exacerbate ...
Small and Fast LLMs on Commodity Hardware: Post-Training Quantization in llama.cpp
Sparrenberg, Lorenz; Deußer, Tobias; Berger, Armin; Sifa, Rafet (2025-11-24)
Large Language Models (LLMs) have demonstrated remarkable capabilities but their significant computational and memory demands hinder widespread deployment, especially on resource-constrained devices. Quantization, the ...
SciREX: Scientific Relation Extraction: Natural Language Processing Lab : final report
Karar, Sayanta; Altahan, Zyad; Aloradi, Sulaeman; Elshennawy, Abdelwahab (2025-09-23)
The rapid growth of biomedical literature makes it increasingly difficult to identify and organize meaningful knowledge. This project addresses the problem by focusing on relation extraction (RE), i.e., ...