Theoretical Analysis of Hierarchical Clustering and the Shadow Vertex Algorithm

Großwendt, Anna-Klara

Volltext

View/Open (1.4MB)

Author

Großwendt, Anna-Klara

Type of Scholarly Publication

Dissertation

Date of Exam

13.03.2020

Date of Publication

05.05.2020

Advisor

Röglin, Heiko

Co-Referee

Schmidt, Melanie

Involved Institutions

Rheinische Friedrich-Wilhelms-Universität Bonn

Metadata

Show full item record

Citable Links

Handle: https://hdl.handle.net/20.500.11811/8348
URN: https://nbn-resolving.org/urn:nbn:de:hbz:5-58412

Abstract

Agglomerative clustering (AC) is a very popular greedy method for computing hierarchical clusterings in practice, yet its theoretical properties have been studied relatively little. We consider AC with respect to the most popular objective functions, especially the diameter function, the radius function and the k-means function. Given a finite set P of points in R^d, AC starts with each point from P in a cluster of its own and then iteratively merges two clusters from the current clustering that minimize the respective objective function when merged into a single cluster. We study the problem of partitioning P into k clusters such that the largest diameter of the clusters is minimized and we prove that AC computes an O(1)-approximation for this problem for any metric that is induced by a norm, assuming that the dimension d is a constant. This improves the best previously known bound of O(log k) due to Ackermann et al. Our bound also carries over to the k-center and the continuous k-center problem. Moreover we study the behavior of agglomerative clustering for the hierarchical k-means problem. We show that AC computes a 2-approximation with respect to the k-means objective function if the optimal k-clustering is well separated. If additionally the optimal clustering also satisfies a balance condition, then AC fully recovers the optimum solution. These results hold in arbitrary dimension. We accompany our positive results with a lower bound of Ω((3/2)^d) for data sets in R^d that holds if no separation is guaranteed, and with lower bounds when the guaranteed separation is not sufficiently strong. Finally, we show that AC produces an O(1)-approximative clustering for one-dimensional data sets.
Apart from AC we provide improved and in some cases new general upper and lower bounds on the existence of hierarchical clusterings. For the objective function discrete radius we provide a new lower bound of 2 and improve the upper bound of 4. For the k-means objective function we state a lower bound of 32 on the existence of hierarchical clusterings. This improves the best previously known bound of 576.
The simplex algorithm is probably the most popular algorithm for solving linear pro grams in practice. It is determined by so called pivot rules. The shadow vertex simplex algorithm is a popular pivot rule which has gained attention in recent years because it was shown to have polynomial running time in the model of smoothed complexity. In the second part of the dissertation we show that the shadow vertex simplex algorithm can be used to solve linear programs in strongly polynomial time with respect to the number n of variables, the number m of constraints, and 1/δ, where δ is a parameter that measures the flatness of the vertices of the polyhedron. This extends a previous result that the shadow vertex algorithm finds paths of polynomial length (w.r.t. n, m, and 1/δ) between two given vertices of a polyhedron. Our result also complements a result due to Eisenbrand and Vempala who have shown that a certain version of the random edge pivot rule solves linear programs with a running time that is strongly polynomial in the number of variables n and 1/δ, but independent of the number m of constraints. Even though the running time of our algorithm depends on m, it is significantly faster for the important special case of totally unimodular linear programs, for which 1/δ is smaller or equal than n and which have only O(n²) constraints.

Subjects

Hierarchische Clusterverfahren, Greedy-Methoden, Complete-Linkage, Wards Algorithmus, Clustering, Durchmesser, Radius, k-means, lineare Programmierung, Shadow-Vertex-Algorithmus, Hierarchical clustering, Greedy methods, approximation factor, Ward's algorithm, diameter, discrete radius

Classification (DDC)

004 Informatik

Zitiervorschlag
BibTeX

Großwendt, Anna-Klara: Theoretical Analysis of Hierarchical Clustering and the Shadow Vertex Algorithm. - Bonn, 2020. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online-Ausgabe in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-58412

@phdthesis{handle:20.500.11811/8348,
urn: https://nbn-resolving.org/urn:nbn:de:hbz:5-58412,
author = {{Anna-Klara Großwendt}},
title = {Theoretical Analysis of Hierarchical Clustering and the Shadow Vertex Algorithm},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2020,
month = may,
note = {Agglomerative clustering (AC) is a very popular greedy method for computing hierarchical clusterings in practice, yet its theoretical properties have been studied relatively little. We consider AC with respect to the most popular objective functions, especially the diameter function, the radius function and the k-means function. Given a finite set P of points in R^d, AC starts with each point from P in a cluster of its own and then iteratively merges two clusters from the current clustering that minimize the respective objective function when merged into a single cluster. We study the problem of partitioning P into k clusters such that the largest diameter of the clusters is minimized and we prove that AC computes an O(1)-approximation for this problem for any metric that is induced by a norm, assuming that the dimension d is a constant. This improves the best previously known bound of O(log k) due to Ackermann et al. Our bound also carries over to the k-center and the continuous k-center problem. Moreover we study the behavior of agglomerative clustering for the hierarchical k-means problem. We show that AC computes a 2-approximation with respect to the k-means objective function if the optimal k-clustering is well separated. If additionally the optimal clustering also satisfies a balance condition, then AC fully recovers the optimum solution. These results hold in arbitrary dimension. We accompany our positive results with a lower bound of Ω((3/2)^d) for data sets in R^d that holds if no separation is guaranteed, and with lower bounds when the guaranteed separation is not sufficiently strong. Finally, we show that AC produces an O(1)-approximative clustering for one-dimensional data sets.
Apart from AC we provide improved and in some cases new general upper and lower bounds on the existence of hierarchical clusterings. For the objective function discrete radius we provide a new lower bound of 2 and improve the upper bound of 4. For the k-means objective function we state a lower bound of 32 on the existence of hierarchical clusterings. This improves the best previously known bound of 576.
The simplex algorithm is probably the most popular algorithm for solving linear pro grams in practice. It is determined by so called pivot rules. The shadow vertex simplex algorithm is a popular pivot rule which has gained attention in recent years because it was shown to have polynomial running time in the model of smoothed complexity. In the second part of the dissertation we show that the shadow vertex simplex algorithm can be used to solve linear programs in strongly polynomial time with respect to the number n of variables, the number m of constraints, and 1/δ, where δ is a parameter that measures the flatness of the vertices of the polyhedron. This extends a previous result that the shadow vertex algorithm finds paths of polynomial length (w.r.t. n, m, and 1/δ) between two given vertices of a polyhedron. Our result also complements a result due to Eisenbrand and Vempala who have shown that a certain version of the random edge pivot rule solves linear programs with a running time that is strongly polynomial in the number of variables n and 1/δ, but independent of the number m of constraints. Even though the running time of our algorithm depends on m, it is significantly faster for the important special case of totally unimodular linear programs, for which 1/δ is smaller or equal than n and which have only O(n²) constraints.},
url = {https://hdl.handle.net/20.500.11811/8348}
}

The following license files are associated with this item: