ai-ku/wkmeans

WKMEANS		            Copyright (c) 2012, Deniz Yuret

Usage: wkmeans [options] < input > output
  -k number of clusters (default 2)
  -r number of restarts (default 0)
  -s random seed
  -l input file contains labels
  -w input file contains instance weights
  -v verbose output
  
Input format (assuming you have m vectors of n dimensions):
[label_1] [weight_1] x_11 ... x_1n
...
[label_m] [weight_m] x_m1 ... x_mn

label_i  : (string) label of the ith vector, required when -l is used
weight_i : (double) weight of the ith vector, required when -w is used
x_ij     : (double) ith vector's jth component
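
To make the input format concrete, here is a minimal sketch of how one
input line could be split into its optional label, optional weight, and
vector components. The function name parse_line, its parameters, and the
default weight of 1.0 are assumptions made for illustration; they are not
taken from the wkmeans source.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of wkmeans.
 * Parses one whitespace-separated input line in place (the buffer is
 * modified by strtok). has_label and has_weight mirror the -l and -w
 * options. Returns the number of components stored in x, or -1 if the
 * line is malformed or has more than max_dim components. */
int parse_line(char *line, int has_label, int has_weight,
               char **label, double *weight, double *x, int max_dim)
{
    const char *delim = " \t\r\n";
    char *tok = strtok(line, delim);
    char *end;
    int n = 0;

    *label = NULL;
    *weight = 1.0;  /* assumed default when no weight column is present */

    if (has_label) {
        if (tok == NULL) return -1;
        *label = tok;                 /* points into the line buffer */
        tok = strtok(NULL, delim);
    }
    if (has_weight) {
        if (tok == NULL) return -1;
        *weight = strtod(tok, &end);
        if (*end != '\0') return -1;  /* trailing junk in the number */
        tok = strtok(NULL, delim);
    }
    while (tok != NULL) {
        if (n >= max_dim) return -1;
        x[n] = strtod(tok, &end);
        if (*end != '\0') return -1;
        n++;
        tok = strtok(NULL, delim);
    }
    return n;
}
```

With both -l and -w in effect, a line such as "doc1 2.5 0.1 0.2 0.3"
would yield the label doc1, the weight 2.5, and a 3-dimensional vector.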

Output format:
[label_1] c_1
...
[label_m] c_m

c_i : (int) cluster of ith vector


Algorithm: wkmeans is a k-means algorithm with (optional) instance
weights.
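
As an illustration of what instance weights mean for k-means, the sketch
below shows the usual weighted center update, where each center becomes
the weighted mean of the vectors assigned to it. The function name
weighted_centroids and the row-major array layout are assumptions for
this example; the actual wkmeans code may be organized differently.

```c
#include <stddef.h>

/* Illustrative sketch, not the wkmeans source.
 * x       : m*n values, row-major (vector i starts at x[i*n])
 * w       : m non-negative instance weights
 * assign  : m cluster ids, each in [0, k)
 * centers : k*n output values, row-major
 * wsum    : k scratch values, receives the total weight per cluster
 * A cluster with zero total weight is left at the origin. */
void weighted_centroids(const double *x, const double *w, const int *assign,
                        size_t m, size_t n, size_t k,
                        double *centers, double *wsum)
{
    size_t i, j, c;

    for (c = 0; c < k; c++) {
        wsum[c] = 0.0;
        for (j = 0; j < n; j++)
            centers[c * n + j] = 0.0;
    }
    /* Accumulate weighted sums per cluster. */
    for (i = 0; i < m; i++) {
        c = (size_t)assign[i];
        wsum[c] += w[i];
        for (j = 0; j < n; j++)
            centers[c * n + j] += w[i] * x[i * n + j];
    }
    /* Divide by the total weight to get the weighted mean. */
    for (c = 0; c < k; c++) {
        if (wsum[c] > 0.0) {
            for (j = 0; j < n; j++)
                centers[c * n + j] /= wsum[c];
        }
    }
}
```

With all weights equal to 1 this reduces to the ordinary k-means center
update.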

* Based on mpi_kmeans-1.5 by Peter Gehler.

* Acceleration based on C. Elkan. Using the triangle inequality to
  accelerate k-means. ICML 2003.

* Initialization based on Arthur, D. and Vassilvitskii,
  S. (2007). K-means++: the advantages of careful seeding.
  Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete
  algorithms. pp. 1027-1035.
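
The triangle-inequality idea from Elkan's paper can be sketched as
follows. If the distance between the current best center c and another
center c' is at least twice the distance from x to c, then c' cannot be
closer to x than c, so the distance from x to c' need not be computed.
The code below applies only this one test, on squared distances (the
condition then becomes d2(c, c') >= 4 * d2(x, c)). Elkan's full method
also keeps per-point upper and lower bounds, which are omitted here. The
function names are assumptions for illustration and do not come from
wkmeans or mpi_kmeans.

```c
#include <stddef.h>

/* Squared Euclidean distance between two n-dimensional vectors. */
static double sqdist(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    size_t j;
    for (j = 0; j < n; j++) {
        double d = a[j] - b[j];
        s += d * d;
    }
    return s;
}

/* Fill cc2 (k*k, row-major) with squared center-to-center distances. */
void center_sqdists(const double *centers, size_t k, size_t n, double *cc2)
{
    size_t a, b;
    for (a = 0; a < k; a++)
        for (b = 0; b < k; b++)
            cc2[a * k + b] = sqdist(centers + a * n, centers + b * n, n);
}

/* Illustrative sketch, not the wkmeans source.
 * Returns the index of the center nearest to x. *skipped receives the
 * number of distance computations avoided by the pruning test. */
int nearest_center(const double *x, const double *centers,
                   const double *cc2, size_t k, size_t n, int *skipped)
{
    size_t best = 0, c;
    double d2best = sqdist(x, centers, n);

    *skipped = 0;
    for (c = 1; c < k; c++) {
        double d2;
        /* d(best, c) >= 2 d(x, best) implies d(x, c) >= d(x, best). */
        if (cc2[best * k + c] >= 4.0 * d2best) {
            (*skipped)++;
            continue;
        }
        d2 = sqdist(x, centers + c * n, n);
        if (d2 < d2best) {
            d2best = d2;
            best = c;
        }
    }
    return (int)best;
}
```

The center-to-center distances are computed once per iteration and shared
by all points, which is where the savings come from when there are many
more points than centers.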

Please see the file LICENSE for terms of use.  The code is standard C,
so running make should build the executable.