Class: RFormula

eclairjs/ml/feature.RFormula

Implements the transforms required for fitting a dataset against an R model formula. Currently a limited subset of the R operators is supported: `~`, `.`, `:`, `+`, and `-`. See also the R formula documentation: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/formula.html

The basic operators are:

- `~` separates the target from the terms
- `+` concatenates terms; "+ 0" removes the intercept
- `-` removes a term; "- 1" removes the intercept
- `:` interaction (multiplication for numeric values, or binarized categorical values)
- `.` all columns except the target

Suppose `a` and `b` are double columns. The following simple examples illustrate the effect of `RFormula`:

- `y ~ a + b` means the model `y ~ w0 + w1 * a + w2 * b`, where `w0` is the intercept and `w1, w2` are coefficients.
- `y ~ a + b + a:b - 1` means the model `y ~ w1 * a + w2 * b + w3 * a * b`, where `w1, w2, w3` are coefficients.

RFormula produces a vector column of features and a double or string column of label. As when formulas are used in R for linear regression, string input columns are one-hot encoded and numeric columns are cast to doubles. If the label column is of type string, it is first transformed to double with `StringIndexer`. If the label column does not exist in the Dataset, the output label column is created from the response variable specified in the formula.
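The two example formulas above can be made concrete with a small sketch. It is plain JavaScript and does not call EclairJS: `expandTerms` and `predict` are hypothetical helper names, and the row and weight values are made up for illustration. It assumes every column is numeric, so an interaction such as `a:b` is simply the product of the two values.

```javascript
// Illustrative only: mimics how terms such as "a", "b" and "a:b" map to
// numeric features. This is not the EclairJS implementation.

// Turn a row into a feature array, one entry per term.
// An interaction term "a:b" is the product of its numeric columns.
function expandTerms(row, terms) {
  return terms.map(function (term) {
    return term.split(':').reduce(function (product, column) {
      return product * row[column];
    }, 1);
  });
}

// Linear model: intercept (0 when the formula removes it) plus the
// weighted sum of the features.
function predict(features, weights, intercept) {
  return features.reduce(function (sum, value, i) {
    return sum + weights[i] * value;
  }, intercept);
}

var row = { a: 2, b: 3 };

// y ~ a + b            ->  w0 + w1 * a + w2 * b
var withIntercept = expandTerms(row, ['a', 'b']);
var y1 = predict(withIntercept, [0.5, 0.25], 1.0);

// y ~ a + b + a:b - 1  ->  w1 * a + w2 * b + w3 * a * b  (no intercept)
var noIntercept = expandTerms(row, ['a', 'b', 'a:b']);
var y2 = predict(noIntercept, [0.5, 0.25, 0.125], 0);

console.log(JSON.stringify(withIntercept), y1);
console.log(JSON.stringify(noIntercept), y2);
```

With these made-up values the first formula yields the features `[2, 3]` and the second `[2, 3, 6]`; the extra entry is the `a:b` interaction.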

Constructor

new RFormula(uid)

Parameters:
Name Type Attributes Description
uid string <optional>
Source:

Extends

Methods

(static) load(path) → {module:eclairjs/ml/feature.RFormula}

Parameters:
Name Type Description
path string
Source:
Returns:
Type
module:eclairjs/ml/feature.RFormula

copy(extra) → {module:eclairjs/ml/feature.RFormula}

Parameters:
Name Type Description
extra module:eclairjs/ml/param.ParamMap
Overrides:
Source:
Returns:
Type
module:eclairjs/ml/feature.RFormula

extractParamMap() → {module:eclairjs/ml/param.ParamMap}

Inherited From:
Source:
Returns:
Type
module:eclairjs/ml/param.ParamMap

fit(dataset) → {module:eclairjs/ml/feature.RFormulaModel}

Parameters:
Name Type Description
dataset module:eclairjs/sql.Dataset
Overrides:
Source:
Returns:
Type
module:eclairjs/ml/feature.RFormulaModel

formula() → {module:eclairjs/ml/param.Param}

R formula parameter. The formula is provided in string form.
Source:
Returns:
Type
module:eclairjs/ml/param.Param

getFormula() → {string}

Source:
Returns:
Type
string

hasLabelCol(schema) → {boolean}

Parameters:
Name Type Description
schema module:eclairjs/sql/types.StructType
Source:
Returns:
Type
boolean

setFeaturesCol(value) → {module:eclairjs/ml/feature.RFormula}

Parameters:
Name Type Description
value string
Source:
Returns:
Type
module:eclairjs/ml/feature.RFormula

setFormula(value) → {module:eclairjs/ml/feature.RFormula}

Sets the formula to use for this transformer. Must be called before the transformer is used, for example before `fit`.
Parameters:
Name Type Description
value string an R formula in string form (e.g. "y ~ x + z")
Source:
Returns:
Type
module:eclairjs/ml/feature.RFormula
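
To show how a formula string such as "y ~ x + z" breaks down under the operators listed in the class description, here is a minimal sketch in plain JavaScript. `parseFormula` is a hypothetical helper, not part of EclairJS, and the real parser may behave differently. The sketch assumes simple column names without `+` or `-` in them, and it needs the list of column names passed in to expand `.`.

```javascript
// Illustrative only: a tiny parser for the documented operator subset.
// This is not the EclairJS parser.
function parseFormula(formula, columns) {
  var sides = formula.split('~');
  if (sides.length !== 2) {
    throw new Error('Expected exactly one "~" in formula: ' + formula);
  }
  var label = sides[0].trim();
  var terms = [];
  var hasIntercept = true;

  // Split the right-hand side into signed tokens, e.g. "a", "+b", "-1".
  var tokens = sides[1].replace(/\s+/g, '').match(/[+-]?[^+-]+/g) || [];
  tokens.forEach(function (token) {
    var remove = token.charAt(0) === '-';
    var name = token.replace(/^[+-]/, '');
    if ((!remove && name === '0') || (remove && name === '1')) {
      // "+ 0" and "- 1" both remove the intercept.
      hasIntercept = false;
    } else if (name === '0' || name === '1') {
      // Other intercept spellings, such as "+ 1", leave it in place.
    } else if (remove) {
      // "- term" removes a previously added term.
      terms = terms.filter(function (t) { return t !== name; });
    } else {
      // "." expands to every column except the label.
      var added = name === '.'
        ? (columns || []).filter(function (c) { return c !== label; })
        : [name];
      added.forEach(function (t) {
        if (terms.indexOf(t) === -1) { terms.push(t); }
      });
    }
  });
  return { label: label, terms: terms, hasIntercept: hasIntercept };
}

var parsed = parseFormula('y ~ a + b + a:b - 1');
var dotted = parseFormula('y ~ . - b', ['y', 'a', 'b', 'c']);

console.log(JSON.stringify(parsed));
console.log(JSON.stringify(dotted));
```

In this sketch the first call gives the label `y`, the terms `a`, `b`, and `a:b`, and no intercept. The second call expands `.` to `a`, `b`, `c` and then removes `b`.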

setLabelCol(value) → {module:eclairjs/ml/feature.RFormula}

Parameters:
Name Type Description
value string
Source:
Returns:
Type
module:eclairjs/ml/feature.RFormula

toString() → {string}

Source:
Returns:
Type
string

transformSchema(schema) → {module:eclairjs/sql/types.StructType}

Parameters:
Name Type Description
schema module:eclairjs/sql/types.StructType
Overrides:
Source:
Returns:
Type
module:eclairjs/sql/types.StructType

uid() → {string}

An immutable unique ID for the object and its derivatives.
Source:
Returns:
Type
string