Class: BisectingKMeans

eclairjs/ml/clustering.BisectingKMeans

Experimental. A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, modified to fit Spark. The algorithm starts from a single cluster that contains all points. It then iteratively finds the divisible clusters on the bottom level and bisects each of them using k-means, until there are `k` leaf clusters in total or no leaf cluster is divisible. The bisecting steps of clusters on the same level are grouped together to increase parallelism. If bisecting all divisible clusters on the bottom level would result in more than `k` leaf clusters, larger clusters get higher priority.
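The procedure described above can be illustrated with a small, single-process sketch in plain JavaScript. This is not the EclairJS or Spark implementation: the function names (`sqDist`, `mean`, `bisect`, `bisectingKMeans`), the deterministic seeding, and the treatment of `minSize` as an absolute point count are all assumptions made for illustration.

```javascript
// Squared Euclidean distance between two equal-length numeric arrays.
function sqDist(a, b) {
  var s = 0;
  for (var i = 0; i < a.length; i++) { var d = a[i] - b[i]; s += d * d; }
  return s;
}

// Component-wise mean of a non-empty list of points.
function mean(points) {
  var c = points[0].map(function () { return 0; });
  points.forEach(function (p) {
    for (var i = 0; i < c.length; i++) c[i] += p[i] / points.length;
  });
  return c;
}

// Split one cluster in two with plain 2-means. Seeding is deterministic
// (the first point plus the point farthest from it), a simplification
// chosen for this sketch. Returns null when the cluster cannot be split.
function bisect(points, maxIter) {
  var a = points[0];
  var b = points.reduce(function (best, p) {
    return sqDist(p, a) > sqDist(best, a) ? p : best;
  }, a);
  if (sqDist(a, b) === 0) return null; // all points are identical
  var left = [], right = [];
  for (var it = 0; it < Math.max(1, maxIter); it++) {
    left = []; right = [];
    points.forEach(function (p) {
      (sqDist(p, a) <= sqDist(p, b) ? left : right).push(p);
    });
    if (left.length === 0 || right.length === 0) return null;
    a = mean(left); b = mean(right);
  }
  return [left, right];
}

// Bisect level by level until there are k leaves or nothing is divisible.
function bisectingKMeans(points, k, minSize, maxIter) {
  var leaves = [points]; // start from one cluster holding every point
  var bottom = [points]; // clusters on the current bottom level
  while (leaves.length < k && bottom.length > 0) {
    // Larger clusters get priority when not all of them can be split.
    var candidates = bottom
      .filter(function (c) { return c.length >= minSize; })
      .sort(function (x, y) { return y.length - x.length; });
    var next = [];
    candidates.forEach(function (c) {
      if (leaves.length >= k) return; // splitting more would exceed k
      var halves = bisect(c, maxIter);
      if (!halves) return;            // not divisible, stays a leaf
      leaves.splice(leaves.indexOf(c), 1, halves[0], halves[1]);
      next.push(halves[0], halves[1]);
    });
    bottom = next;
  }
  return leaves;
}
```

For example, `bisectingKMeans([[0], [1], [2], [10], [11], [12], [50]], 3, 1, 10)` first separates `[50]` from the rest, then splits the larger six-point cluster because it has priority, and stops once three leaves exist.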

Constructor

new BisectingKMeans(uid)

Parameters:
Name Type Description
uid string
Source:
See:
  • Steinbach, Karypis, and Kumar, A comparison of document clustering techniques, KDD Workshop on Text Mining, 2000. http://glaros.dtc.umn.edu/gkhome/fetch/papers/docclusterKDDTMW00.pdf

Extends

Methods

(static) load(path) → {BisectingKMeans}

Parameters:
Name Type Description
path string
Source:
Returns:
Type
BisectingKMeans

copy(extra) → {BisectingKMeans}

Parameters:
Name Type Description
extra module:eclairjs/ml/param.ParamMap
Overrides:
Source:
Returns:
Type
BisectingKMeans

extractParamMap() → {module:eclairjs/ml/param.ParamMap}

Inherited From:
Source:
Returns:
Type
module:eclairjs/ml/param.ParamMap

fit(dataset) → {BisectingKMeansModel}

Parameters:
Name Type Description
dataset module:eclairjs/sql.Dataset
Overrides:
Source:
Returns:
Type
BisectingKMeansModel

setFeaturesCol(value) → {type}

Parameters:
Name Type Description
value string
Source:
Returns:
Type
type

setK(value) → {type}

Parameters:
Name Type Description
value number
Source:
Returns:
Type
type

setMaxIter(value) → {type}

Parameters:
Name Type Description
value number
Source:
Returns:
Type
type

setMinDivisibleClusterSize(value) → {type}

Parameters:
Name Type Description
value number
Source:
Returns:
Type
type

setPredictionCol(value) → {type}

Parameters:
Name Type Description
value string
Source:
Returns:
Type
type

setSeed(value) → {type}

Parameters:
Name Type Description
value number
Source:
Returns:
Type
type
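The setters documented above can be combined with `fit(dataset)` roughly as in the following sketch. `fitBisectingKMeans` and its `options` object are hypothetical helpers, not part of EclairJS. The estimator and dataset are passed in already constructed, because how they are obtained depends on the EclairJS setup in use. Since this reference lists the setters' return type only as `type`, the sketch does not assume chaining: it keeps a setter's return value when it looks like an estimator and otherwise continues with the original object.

```javascript
// Hypothetical helper, not part of EclairJS: applies the documented setters
// to an estimator the caller has already constructed, then calls fit(dataset).
function fitBisectingKMeans(estimator, dataset, options) {
  // Assumption: a setter may return the configured estimator or nothing.
  // Keep whichever object still exposes fit().
  function apply(est, method, value) {
    if (value === undefined) return est; // option not given, skip the setter
    var result = est[method](value);
    return (result && typeof result.fit === 'function') ? result : est;
  }

  var est = estimator;
  est = apply(est, 'setK', options.k);
  est = apply(est, 'setMaxIter', options.maxIter);
  est = apply(est, 'setSeed', options.seed);
  est = apply(est, 'setMinDivisibleClusterSize', options.minDivisibleClusterSize);
  est = apply(est, 'setFeaturesCol', options.featuresCol);
  est = apply(est, 'setPredictionCol', options.predictionCol);

  // fit(dataset) is documented as returning a BisectingKMeansModel.
  return est.fit(dataset);
}
```

A call such as `fitBisectingKMeans(estimator, dataset, { k: 4, seed: 1 })` would invoke only `setK` and `setSeed` before fitting, leaving every other parameter at whatever value the estimator already holds.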

transformSchema(schema) → {StructType}

Parameters:
Name Type Description
schema module:eclairjs/sql/types.StructType
Source:
Returns:
Type
StructType

uid() → {Promise.<string>}

An immutable unique ID for the object and its derivatives.
Source:
Returns:
Type
Promise.<string>