MSSC Clustering of Large Data Using the Limited Memory Bundle Method

Napsu Karmitsa, Adil Bagirov, Sona Taheri, MSSC Clustering of Large Data Using the Limited Memory Bundle Method. TUCS Technical Reports 1164, TUCS, 2016.

Abstract:

Clustering is among the most important tasks in data mining. Large data sets make this problem challenging for most existing clustering algorithms. It is important to develop clustering algorithms that are accurate and can provide real-time clustering on such data sets.
This paper introduces one such algorithm. The clustering problem is formulated as a nonsmooth optimization problem with the minimum sum-of-squares distance function. The limited memory bundle method [Haarala et al., Math. Prog., Vol. 109, No. 1, pp. 181-205, 2007] is then modified and combined with an incremental approach to solve this problem. The method is evaluated using real-world data sets with both large numbers of attributes and large numbers of data points. The new software is also compared with other optimization-based clustering software.
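
The abstract does not state the objective function explicitly. As a rough illustration only, the sketch below evaluates a common nonsmooth form of the minimum sum-of-squares clustering (MSSC) objective: the average, over all data points, of the squared Euclidean distance to the nearest cluster center. The function name mssc_objective, the 1/m averaging, and the toy data are assumptions made for this example; they are not taken from the report, and the code is not the authors' software or the limited memory bundle method.

```python
def mssc_objective(centers, points):
    """Average squared Euclidean distance from each point to its nearest center.

    Hypothetical helper for illustration. The inner min over centers is what
    makes the function nonsmooth (and nonconvex for more than one center).
    Dividing by the number of points is an assumed convention.
    """
    total = 0.0
    for a in points:
        # Squared distance from point a to the closest of the candidate centers.
        total += min(
            sum((xj - aj) ** 2 for xj, aj in zip(x, a))
            for x in centers
        )
    return total / len(points)


# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

one_center = [(5.0, 1.0)]                 # a single center at the overall mean
two_centers = [(0.0, 1.0), (10.0, 1.0)]   # one center per group

print(mssc_objective(one_center, points))   # 26.0
print(mssc_objective(two_centers, points))  # 1.0
```

Adding a second center lowers the objective sharply on this toy data, which is the kind of step-by-step improvement an incremental scheme can exploit by growing the number of clusters one at a time.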

Files:

Full publication in PDF-format

BibTeX entry:

@TECHREPORT{tKaBaTa16b,
  title = {MSSC Clustering of Large Data Using the Limited Memory Bundle Method},
  author = {Karmitsa, Napsu and Bagirov, Adil and Taheri, Sona},
  number = {1164},
  series = {TUCS Technical Reports},
  publisher = {TUCS},
  year = {2016},
  keywords = {Cluster analysis, Nonsmooth optimization, Nonconvex problems, Bundle methods, Limited memory methods},
  ISBN = {978-952-12-3427-9},
}

Belongs to TUCS Research Unit(s): Turku Optimization Group (TOpGroup)
