Show simple item record

dc.contributor.authorSimonov, Kirill
dc.date.accessioned2021-03-23T14:52:19Z
dc.date.available2021-03-23T14:52:19Z
dc.date.issued2021-03-29
dc.date.submitted2021-03-08T23:42:46.039Z
dc.identifiercontainer/4c/8b/81/b8/4c8b81b8-227c-43cd-ba60-bbebd03d71d4
dc.identifier.isbn9788230866511
dc.identifier.isbn9788230858189
dc.identifier.urihttps://hdl.handle.net/11250/2735169
dc.description.abstractThe two most popular unsupervised learning problems are k-Clustering and Low-Rank Approximation. Consider a set of n datapoints, in the k-Clustering problem, the objective is to partition these points into k clusters and select k centers such that each cluster is represented well by its center. In the Low-Rank Approximation problem, the task is to find an r-dimensional subspace that minimizes the sum of deviations from each point to the subspace. Both problems are of utmost importance for the modern data-driven applications, and both can be thought of as structured representation problems. In this thesis, we provide a thorough study of the multivariate complexity of k-Clustering and Low-Rank Approximation. We focus on extensions of these problems, such as robust and constrained versions, that reach beyond the well-studied standard setting. The main body of the thesis is divided into three parts. In the first part, we study parameterized complexity of exact algorithms, where the parameter is the total cost of the clustering. Problems that constitute the main focus of this part are k-Clustering in L_p-norm for p ∈ [0, ∞], and Categorical Clustering with Row/Column Outliers. We provide a number of fixed-parameter tractable algorithms based on hypergraph enumeration, and a number of hardness results. The second part can be summarized as employing sampling methods to provide (1 + ε)-approximation in FPT time. We show a space-efficient coreset for the Fair Clustering problem, and an FPT approximation scheme for the problem of clustering points with missing entries, where the number of missing entries in each point is bounded. Finally, in the third part we deal with the Low-Rank Approximation problem, and its robust variant, Low-Rank Approximation with Outliers. For the latter, we employ algebraic geometry methods to provide an n^O(rd) exact algorithm that is nearly tight even for arbitrary-factor approximation, and its dimensionality reduction-based improvements. We also present a PTAS for Low-Rank Approximation of binary matrices in column-sum norm.en_US
dc.language.isoengen_US
dc.publisherThe University of Bergenen_US
dc.rightsIn copyright
dc.rights.urihttp://rightsstatements.org/page/InC/1.0/
dc.titleAlgorithmic Complexity of Clustering and Low-Rank Approximation Problemsen_US
dc.typeDoctoral thesisen_US
dc.date.updated2021-03-08T23:42:46.039Z
dc.rights.holderCopyright the Author. All rights reserveden_US
dc.contributor.orcid0000-0001-9436-7310
dc.description.degreeDoktorgradsavhandling
fs.unitcode12-12-0


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record