• norsk
    • English
  • norsk 
    • norsk
    • English
  • Logg inn
Vis innførsel 
  •   Hjem
  • Faculty of Mathematics and Natural Sciences
  • Department of Informatics
  • Department of Informatics
  • Vis innførsel
  •   Hjem
  • Faculty of Mathematics and Natural Sciences
  • Department of Informatics
  • Department of Informatics
  • Vis innførsel
JavaScript is disabled for your browser. Some features of this site may not work without it.

Algorithmic Complexity of Clustering and Low-Rank Approximation Problems

Simonov, Kirill
Doctoral thesis
Thumbnail
Åpne
PDF (3.115Mb)
Permanent lenke
https://hdl.handle.net/11250/2735169
Utgivelsesdato
2021-03-29
Metadata
Vis full innførsel
Samlinger
  • Department of Informatics [746]
Sammendrag
The two most popular unsupervised learning problems are k-Clustering and Low-Rank Approximation. Consider a set of n datapoints, in the k-Clustering problem, the objective is to partition these points into k clusters and select k centers such that each cluster is represented well by its center. In the Low-Rank Approximation problem, the task is to find an r-dimensional subspace that minimizes the sum of deviations from each point to the subspace. Both problems are of utmost importance for the modern data-driven applications, and both can be thought of as structured representation problems.

In this thesis, we provide a thorough study of the multivariate complexity of k-Clustering and Low-Rank Approximation. We focus on extensions of these problems, such as robust and constrained versions, that reach beyond the well-studied standard setting. The main body of the thesis is divided into three parts. In the first part, we study parameterized complexity of exact algorithms, where the parameter is the total cost of the clustering. Problems that constitute the main focus of this part are k-Clustering in L_p-norm for p ∈ [0, ∞], and Categorical Clustering with Row/Column Outliers. We provide a number of fixed-parameter tractable algorithms based on hypergraph enumeration, and a number of hardness results. The second part can be summarized as employing sampling methods to provide (1 + ε)-approximation in FPT time. We show a space-efficient coreset for the Fair Clustering problem, and an FPT approximation scheme for the problem of clustering points with missing entries, where the number of missing entries in each point is bounded. Finally, in the third part we deal with the Low-Rank Approximation problem, and its robust variant, Low-Rank Approximation with Outliers. For the latter, we employ algebraic geometry methods to provide an n^O(rd) exact algorithm that is nearly tight even for arbitrary-factor approximation, and its dimensionality reduction-based improvements. We also present a PTAS for Low-Rank Approximation of binary matrices in column-sum norm.
Utgiver
The University of Bergen
Opphavsrett
Copyright the Author. All rights reserved

Kontakt oss | Gi tilbakemelding

Personvernerklæring
DSpace software copyright © 2002-2019  DuraSpace

Levert av  Unit
 

 

Bla i

Hele arkivetDelarkiv og samlingerUtgivelsesdatoForfattereTitlerEmneordDokumenttyperTidsskrifterDenne samlingenUtgivelsesdatoForfattereTitlerEmneordDokumenttyperTidsskrifter

Min side

Logg inn

Statistikk

Besøksstatistikk

Kontakt oss | Gi tilbakemelding

Personvernerklæring
DSpace software copyright © 2002-2019  DuraSpace

Levert av  Unit