Show simple item record

Clustering Trajectories: Detecting Patterns in Spatio-Temporal Data

dc.contributor.advisorDriemel, Anne
dc.contributor.authorConradi, Jacobus
dc.date.accessioned2025-12-18T11:15:21Z
dc.date.available2025-12-18T11:15:21Z
dc.date.issued18.12.2025
dc.identifier.urihttps://hdl.handle.net/20.500.11811/13768
dc.description.abstractIn this thesis, we study the detection of patterns in spatio-temporal data sets in the form of trajectory data. Many application areas collect unstructured trajectory data. Applications for the analysis of such data range from gait analysis of human movement to traffic-flow analysis, epidemiological hotspot detection, and Lagrangian analysis of particle simulations. Trajectory data is typically modeled as a sequence of points that defines a polygonal curve via linear interpolation between consecutive points. One of the most fundamental ways of detecting patterns in any kind of data is clustering, which aims to partition the data into groups where points from the same group are 'close' to one another and those from different groups are 'far apart'. The clustering of polygonal curves is the main focus of this thesis.
The first clustering problem we study is the (k, l)-median problem for polygonal curves, which asks for k center curves, each with at most l points, minimizing the sum of dynamic time warping (DTW) distances from each input curve to its closest center. Numerous classical results for clustering problems have been extended to many different settings. A common assumption made is that the distance function is a metric, which DTW is not. Despite this, we provide a polynomial-time approximation algorithm with an approximation factor of O(m l), where m is the maximum number of points in any input curve. This is achieved by extending known algorithms for coreset construction—i.e., problem-specific input condensations—from metric spaces to the non-metric space of polygonal curves under DTW.
The second clustering problem we consider is Subtrajectory Covering, motivated by the need to detect patterns within a single curve by segmenting it into its constituent semantic parts. This is formulated as a set cover problem based on the continuous Fréchet distance. Given a polygonal curve with n points, the goal is to compute the smallest set of center curves, each with at most l points, such that every point on the input curve is covered by some center. A point is considered covered if it lies on a subtrajectory of the input curve whose (continuous) Fréchet distance to some center curve is at most Δ. We provide bicriteria approximation algorithms for arbitrary curves as well as for a subclass of realistic curves. Both algorithms approximate the required number kΔ of centers by a factor of O(log n) and the distance threshold Δ by a factor of O(1). For arbitrary curves, the algorithm runs in roughly O(√kΔn5/2) time. For the subclass of realistic curves (or more formally, c-packed curves whose arc length inside any ball of radius r is bounded by cr), the algorithm runs in roughly O(kΔcn2) time. We demonstrate the viability of both the problem formulation and the algorithm on real-world data.
Additionally, we investigate two related questions. First, given a clustering of trajectory data, a natural question is how to classify a new trajectory which is not part of the input. To this end, we design a data structure that answers approximate nearest neighbor queries: Given n input curves of complexity m and a query curve q, compute the closest input curve to q w.r.t. the continuous Fréchet distance. Our data structure answers such queries in roughly O(F(m, Φ) log n) time, where F(m, Φ) depends only on m and the spread Φ of vertices and edges. Second, we analyze the detection of erroneous patterns within a curve: Given two curves, how can we optimally modify one curve by replacing pieces of the curve with line segments so as to minimize the continuous Fréchet distance to the other curve? We show that this problem is not fixed-parameter tractable in the number of modifications and give a polynomial-time O(1)-approximation algorithm for its decision variant.
en
dc.language.isoeng
dc.rightsIn Copyright
dc.rights.urihttp://rightsstatements.org/vocab/InC/1.0/
dc.subjectClustering
dc.subjectCoreset
dc.subjectCenter Clustering
dc.subjectSubtrajecory Clustering
dc.subjectApproximate Nearest Neighbour
dc.subjectShortcut Fréchet Distance
dc.subjectFréchet Distance
dc.subject.ddc004 Informatik
dc.titleClustering Trajectories: Detecting Patterns in Spatio-Temporal Data
dc.typeDissertation oder Habilitation
dc.identifier.doihttps://doi.org/10.48565/bonndoc-750
dc.publisher.nameUniversitäts- und Landesbibliothek Bonn
dc.publisher.locationBonn
dc.rights.accessRightsopenAccess
dc.identifier.urnhttps://nbn-resolving.org/urn:nbn:de:hbz:5-87045
dc.relation.doihttps://doi.org/10.4230/LIPICS.SOCG.2024.42
dc.relation.doihttps://doi.org/10.4230/LIPICS.ESA.2022.28
dc.relation.doihttps://doi.org/10.48550/ARXIV.2308.14865
dc.relation.doihttps://doi.org/10.4230/LIPICS.ESA.2025.12
dc.relation.doihttps://doi.org/10.57717/CGT.V3I2.45
dc.relation.doihttps://doi.org/10.1145/3663762
ulbbn.pubtypeErstveröffentlichung
ulbbnediss.affiliation.nameRheinische Friedrich-Wilhelms-Universität Bonn
ulbbnediss.affiliation.locationBonn
ulbbnediss.thesis.levelDissertation
ulbbnediss.dissID8704
ulbbnediss.date.accepted28.11.2025
ulbbnediss.instituteMathematisch-Naturwissenschaftliche Fakultät : Fachgruppe Informatik / Institut für Informatik
ulbbnediss.fakultaetMathematisch-Naturwissenschaftliche Fakultät
dc.contributor.coRefereeRöglin, Heiko
ulbbnediss.contributor.orcidhttps://orcid.org/0000-0002-8259-1187


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record

The following license files are associated with this item:

InCopyright