<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>Fachgruppe Informatik</title>
<link>https://hdl.handle.net/20.500.11811/655</link>
<description/>
<pubDate>Fri, 10 Apr 2026 16:34:30 GMT</pubDate>
<dc:date>2026-04-10T16:34:30Z</dc:date>
<item>
<title>Leveraging Synthetically Generated Data for Real Estate Document Classification</title>
<link>https://hdl.handle.net/20.500.11811/13972</link>
<description>Leveraging Synthetically Generated Data for Real Estate Document Classification
Deußer, Tobias; Ramien, Gregor; Weber, Nico; Meidinger, Maximilian; Hahnbück, Max; Bauckhage, Christian; Sifa, Rafet
Document classification in regulated domains like law, finance, or real estate is hindered by the scarcity of labeled data and strict privacy constraints. This paper presents a pipeline for synthetically generating training data for document classifiers using a combination of domain-specific templates, large language models, and data augmentation techniques. Focusing on two key document types relevant to real estate workflows, &lt;em&gt;Child Support Certificate and Refurbishment Roadmap&lt;/em&gt;, we construct realistic multi-page documents and generate negative classes using LLM-generated distractors. We train a BERT-based classifier on this synthetic dataset and evaluate it on real-world OCR-extracted documents, achieving strong performance despite the absence of real documents in training. Our findings highlight the feasibility of using synthetic data to overcome annotation bottlenecks and pave the way for broader applications in privacy-sensitive industries.
</description>
<pubDate>Mon, 01 Dec 2025 00:00:00 GMT</pubDate>
<guid isPermaLink="false">https://hdl.handle.net/20.500.11811/13972</guid>
<dc:date>2025-12-01T00:00:00Z</dc:date>
</item>
<item>
<title>A Survey on Current Trends and Recent Advances in Text Anonymization</title>
<link>https://hdl.handle.net/20.500.11811/13719</link>
<description>A Survey on Current Trends and Recent Advances in Text Anonymization
Deußer, Tobias; Sparrenberg, Lorenz; Berger, Armin; Hahnbück, Max; Bauckhage, Christian; Sifa, Rafet
The proliferation of textual data containing sensitive personal information across various domains requires robust anonymization techniques to protect privacy and comply with regulations, while preserving data usability for diverse and crucial downstream tasks. This survey provides a comprehensive overview of current trends and recent advances in text anonymization techniques. We begin by discussing foundational approaches, primarily centered on Named Entity Recognition, before examining the transformative impact of Large Language Models, detailing their dual role as sophisticated anonymizers and potent de-anonymization threats. The survey further explores domain-specific challenges and tailored solutions in critical sectors such as healthcare, law, finance, and education. We investigate advanced methodologies incorporating formal privacy models and risk-aware frameworks, and address the specialized subfield of authorship anonymization. Additionally, we review evaluation frameworks, comprehensive metrics, benchmarks, and practical toolkits for real-world deployment of anonymization solutions. This review consolidates current knowledge, identifies emerging trends and persistent challenges, including the evolving privacy-utility trade-off, the need to address quasi-identifiers, and the implications of LLM capabilities, and aims to guide future research directions for both academics and practitioners in this field.
</description>
<pubDate>Wed, 01 Oct 2025 00:00:00 GMT</pubDate>
<guid isPermaLink="false">https://hdl.handle.net/20.500.11811/13719</guid>
<dc:date>2025-10-01T00:00:00Z</dc:date>
</item>
<item>
<title>Super-resolution time-resolved imaging using computational sensor fusion</title>
<link>https://hdl.handle.net/20.500.11811/9220</link>
<description>Super-resolution time-resolved imaging using computational sensor fusion
Callenberg, Clara; Lyons, Ashley; den Brok, Dennis; Fatima, Areeba; Turpin, Alex; Zickus, Vytautas; Machesky, Laura; Whitelaw, Jamie; Faccio, Daniele; Hulin, Matthias B.
Imaging across both the full transverse spatial and temporal dimensions of a scene with high precision in all three coordinates is key to applications ranging from LIDAR to fluorescence lifetime imaging (FLIM). However, compromises are often required, for example sacrificing spatial resolution in favor of temporal resolution, in particular when the full 3-dimensional data cube must be acquired in a short time. We introduce a sensor fusion approach that combines data with low spatial resolution but high temporal precision, gathered with a single-photon-avalanche-diode (SPAD) array, with data that has high spatial but no temporal resolution, such as that acquired with a standard CMOS camera. Our method, based on blurring the image on the SPAD array and computational sensor fusion, reconstructs time-resolved images at significantly higher spatial resolution than the SPAD input, upsampling numerical data by a factor of 12 × 12 and experimental data by up to 4 × 4. We demonstrate the technique for both LIDAR applications and FLIM of fluorescent cancer cells. This technique paves the way to high spatial resolution SPAD imaging or, equivalently, FLIM with conventional microscopes at frame rates accelerated by more than an order of magnitude.
</description>
<pubDate>Mon, 18 Jan 2021 00:00:00 GMT</pubDate>
<guid isPermaLink="false">https://hdl.handle.net/20.500.11811/9220</guid>
<dc:date>2021-01-18T00:00:00Z</dc:date>
</item>
</channel>
</rss>
