Class: PairRDD

eclairjs/rdd.PairRDD

new PairRDD(rdd)

Parameters:
Name Type Description
rdd module:eclairjs/rdd.RDD of Tuple(value, value).
Source:

Extends

  • RDD

Methods

(static) fromRDD(rdd) → {module:eclairjs/rdd.PairRDD}

Parameters:
Name Type Description
rdd RDD
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD

cache() → {module:eclairjs/rdd.PairRDD}

Persist this PairRDD with the default storage level (`MEMORY_ONLY`).
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD

collect() → {Promise.<Array>}

Asynchronously returns all elements of the PairRDD.
Source:
Returns:
A Promise that resolves to an array containing all elements in the PairRDD.
Type
Promise.<Array>
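
Because actions such as collect() and count() return Promises, their results are only available inside a `then` callback (or after `await`). The sketch below shows that consumption pattern against a hypothetical stand-in object, `fakePairRDD`, which is not the EclairJS class and only imitates the Promise-returning shape documented here.

```javascript
// Hypothetical stand-in for a PairRDD: it only models the Promise-returning
// shape of collect() and count(); it is not the EclairJS class.
const fakePairRDD = {
  data: [["a", 1], ["b", 2]],
  collect() {
    return Promise.resolve(this.data.slice());
  },
  count() {
    return Promise.resolve(this.data.length);
  },
};

// collect() hands back a Promise immediately, not the array itself.
const pending = fakePairRDD.collect();

pending
  .then((elements) => {
    // The resolved value is an array of [key, value] tuples.
    for (const [key, value] of elements) {
      console.log(key, value);
    }
    return fakePairRDD.count();
  })
  .then((n) => console.log("count:", n))
  .catch((err) => console.error("action failed:", err));
```

Chaining the second action from inside the first `then` keeps the two results ordered; a `catch` at the end handles a rejection from either call.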

count() → {Promise.<Number>}

Asynchronously returns the number of elements in the PairRDD.
Source:
Returns:
A Promise that resolves to the number of elements in the PairRDD.
Type
Promise.<Number>

filter(func, bindArgsopt) → {module:eclairjs/rdd.PairRDD}

Return a new PairRDD containing only the elements that satisfy a predicate.
Parameters:
Name Type Attributes Description
func function
bindArgs Array.<Object> <optional>
array whose values will be added to func's argument list.
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD
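
The `bindArgs` convention can be pictured with a small local model in plain JavaScript. This sketch does not call EclairJS: `filterWithBindArgs` is a hypothetical helper, and it assumes the bound values are appended after the element in func's argument list, which the parameter description suggests but does not spell out.

```javascript
// Hypothetical local model; not part of EclairJS.
// Assumption: bindArgs values follow the element in func's argument list.
function filterWithBindArgs(pairs, func, bindArgs = []) {
  return pairs.filter((pair) => func(pair, ...bindArgs));
}

const pairs = [["a", 1], ["b", 5], ["c", 9]];

// The threshold reaches the predicate through bindArgs, not a closure.
const kept = filterWithBindArgs(
  pairs,
  (pair, threshold) => pair[1] > threshold,
  [4]
);

// kept is [["b", 5], ["c", 9]]
console.log(kept);
```

Passing values explicitly is presumably what lets the predicate run somewhere other than the calling process, where variables captured by a closure would not be available.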

flatMap(func, bindArgs) → {module:eclairjs/rdd.PairRDD}

Return a new PairRDD by first applying a function to all elements of this PairRDD, and then flattening the results.
Parameters:
Name Type Description
func function
bindArgs Array.<Object> Optional array whose values will be added to func's argument list.
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD
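
The apply-then-flatten behaviour can be illustrated on an ordinary array of [key, value] tuples. This is a local sketch only: `flatMapLocal` is a hypothetical helper, not an EclairJS function, and it assumes func returns an array of tuples for each input element.

```javascript
// Hypothetical local model of flatMap; not the EclairJS implementation.
function flatMapLocal(pairs, func) {
  // Apply func to every element, then flatten the results by one level.
  return pairs.flatMap(func);
}

const docs = [["doc1", "a b"], ["doc2", "c"]];

// Each (docId, text) tuple expands into one (word, docId) tuple per word.
const wordToDoc = flatMapLocal(docs, ([docId, text]) =>
  text.split(" ").map((word) => [word, docId])
);

// wordToDoc is [["a", "doc1"], ["b", "doc1"], ["c", "doc2"]]
console.log(wordToDoc);
```

Unlike map, the output can contain more or fewer elements than the input, since each element may expand to any number of tuples.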

foreach(func) → {Promise.<Void>}

Applies the function func to all elements of this PairRDD.
Parameters:
Name Type Description
func function
Source:
Returns:
A Promise that resolves to nothing.
Type
Promise.<Void>

groupByKey(numberopt) → {module:eclairjs/rdd.PairRDD}

Group the values for each key in the PairRDD into a single sequence. The optional argument controls the partitioning of the resulting key-value pair RDD by setting its number of partitions. Note: If you are grouping in order to perform an aggregation (such as a sum or average) over each key, using PairRDD.reduceByKey or combineByKey will provide much better performance.
Parameters:
Name Type Attributes Description
number number <optional>
number of partitions
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD
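
A local sketch of the grouping semantics, using plain arrays instead of a distributed dataset. `groupByKeyLocal` is a hypothetical helper and says nothing about how EclairJS or Spark partition the data; it only shows the shape of the result and why aggregating after grouping holds every value per key first.

```javascript
// Hypothetical local model of groupByKey; not the EclairJS implementation.
function groupByKeyLocal(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return [...groups.entries()];
}

const sales = [["east", 10], ["west", 7], ["east", 5]];

// grouped is [["east", [10, 5]], ["west", [7]]]
const grouped = groupByKeyLocal(sales);

// Aggregating after grouping: all values for a key are collected before
// the sum starts, which is the cost the note above warns about.
const totals = grouped.map(([key, values]) => [
  key,
  values.reduce((a, b) => a + b, 0),
]);

// totals is [["east", 15], ["west", 7]]
console.log(grouped, totals);
```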

join(other, optionanl) → {module:eclairjs/rdd.PairRDD}

Return a PairRDD containing all pairs of elements with matching keys in `this` and `other`. Each pair of elements will be returned as a (k, (v1, v2)) tuple, where (k, v1) is in `this` and (k, v2) is in `other`. Performs a hash join across the cluster.
Parameters:
Name Type Description
other module:eclairjs/rdd.PairRDD
optionanl number
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD
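
The (k, (v1, v2)) result shape is easiest to see on a small in-memory example. The sketch below is a single-process hash join over plain arrays; `hashJoin` is a hypothetical helper and does not reflect how the join is distributed across a cluster.

```javascript
// Hypothetical local model of an inner hash join; not the EclairJS code.
function hashJoin(left, right) {
  // Build phase: index the values of `right` by key.
  const table = new Map();
  for (const [k, v2] of right) {
    if (!table.has(k)) table.set(k, []);
    table.get(k).push(v2);
  }
  // Probe phase: emit one (k, (v1, v2)) tuple per matching pair.
  const out = [];
  for (const [k, v1] of left) {
    for (const v2 of table.get(k) || []) {
      out.push([k, [v1, v2]]);
    }
  }
  return out;
}

const users = [[1, "ann"], [2, "bob"], [3, "cy"]];
const orders = [[1, "book"], [1, "pen"], [3, "cup"], [4, "hat"]];

// joined is [[1, ["ann", "book"]], [1, ["ann", "pen"]], [3, ["cy", "cup"]]]
const joined = hashJoin(users, orders);
console.log(JSON.stringify(joined));
```

Keys present on only one side (2 and 4 here) produce no output, and a key that matches several values on the other side produces one tuple per combination.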

map(func, bindArgs) → {module:eclairjs/rdd.PairRDD}

Return a new PairRDD by applying a function to all elements of this RDD.
Parameters:
Name Type Description
func function
bindArgs Array.<Object> Optional array whose values will be added to func's argument list.
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD

mapToFloat(func, bindArgs) → {module:eclairjs/rdd.FloatRDD}

Return a new FloatRDD by applying a function to all elements of this PairRDD.
Parameters:
Name Type Description
func function
bindArgs Array.<Object> Optional array whose values will be added to func's argument list.
Source:
Returns:
Type
module:eclairjs/rdd.FloatRDD

mapToPair(func, bindArgs) → {module:eclairjs/rdd.PairRDD}

Return a new PairRDD by applying a function to all elements of this PairRDD.
Parameters:
Name Type Description
func function
bindArgs Array.<Object> Optional array whose values will be added to func's argument list.
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD

mapValues(func, bindArgs) → {module:eclairjs/rdd.PairRDD}

Return a new PairRDD by applying a function to the value of each key-value pair in this PairRDD, leaving the keys unchanged.
Parameters:
Name Type Description
func function
bindArgs Array.<Object> Optional array whose values will be added to func's argument list.
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD
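
A local sketch of what mapValues computes, assuming it behaves like its Spark namesake: the function receives only the value, and each key is carried through untouched. `mapValuesLocal` is a hypothetical helper operating on a plain array of tuples.

```javascript
// Hypothetical local model of mapValues; not the EclairJS implementation.
// Assumption: func sees only the value and the key is preserved.
function mapValuesLocal(pairs, func) {
  return pairs.map(([key, value]) => [key, func(value)]);
}

const pricesInCents = [["tea", 250], ["milk", 99]];

// Convert each value; the keys stay as they were.
const pricesInUnits = mapValuesLocal(pricesInCents, (cents) => cents / 100);

// pricesInUnits is [["tea", 2.5], ["milk", 0.99]]
console.log(pricesInUnits);
```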

reduce(func, bindArgs) → {Promise.<Object>}

Reduces the elements of this PairRDD using the specified commutative and associative binary operator.
Parameters:
Name Type Description
func function (undocumented) Function with two parameters
bindArgs Array.<Object> Optional array whose values will be added to func's argument list.
Source:
Returns:
Type
Promise.<Object>
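
The requirement that the operator be commutative and associative matters because a distributed reduce combines partial results in no fixed order. The sketch below models that locally: each partition is reduced on its own, then the partial results are reduced again. `reducePartitions` and the partition layout are hypothetical and only illustrate the property.

```javascript
// Hypothetical local model: reduce each partition, then combine partials.
function reducePartitions(partitions, func) {
  const partials = partitions.map((partition) => partition.reduce(func));
  return partials.reduce(func);
}

// Operator on [key, value] tuples: sums the values under a fixed key.
// Addition is commutative and associative, so grouping does not matter.
const sumValues = (x, y) => ["total", x[1] + y[1]];

const layoutA = [[["a", 1], ["b", 2]], [["c", 3]], [["d", 4], ["e", 5]]];
const layoutB = [[["e", 5], ["c", 3], ["a", 1]], [["d", 4], ["b", 2]]];

const resultA = reducePartitions(layoutA, sumValues);
const resultB = reducePartitions(layoutB, sumValues);

// Both layouts give ["total", 15].
console.log(resultA, resultB);
```

An operator such as subtraction would give different answers for the two layouts, which is why it is not a valid choice here.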

reduceByKey(func, bindArgsopt) → {module:eclairjs/rdd.PairRDD}

Merge the values for each key using an associative reduce function. This will also perform the merging locally on each mapper before sending results to a reducer, similarly to a "combiner" in MapReduce.
Parameters:
Name Type Attributes Description
func function
bindArgs Array.<Object> <optional>
array whose values will be added to func's argument list.
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD
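
The "merge locally, then merge across mappers" behaviour described above can be sketched in plain JavaScript. This is a single-process model with hypothetical helpers (`combineLocally`, `reduceByKeyLocal`) and a hand-written partition layout; it is not the EclairJS or Spark implementation.

```javascript
// Combiner step: merge values per key inside a single partition.
function combineLocally(partition, func) {
  const acc = new Map();
  for (const [key, value] of partition) {
    acc.set(key, acc.has(key) ? func(acc.get(key), value) : value);
  }
  return acc;
}

// Hypothetical local model of reduceByKey over several partitions.
function reduceByKeyLocal(partitions, func) {
  const merged = new Map();
  for (const partition of partitions) {
    // Only one value per key leaves each partition.
    for (const [key, value] of combineLocally(partition, func)) {
      merged.set(key, merged.has(key) ? func(merged.get(key), value) : value);
    }
  }
  return [...merged.entries()];
}

const partitions = [
  [["a", 1], ["b", 2], ["a", 3]],
  [["b", 4], ["c", 5]],
];

// sums is [["a", 4], ["b", 6], ["c", 5]]
const sums = reduceByKeyLocal(partitions, (x, y) => x + y);
console.log(sums);
```

In the first partition the two "a" values are merged before anything is passed on, so fewer values cross the partition boundary than with a plain grouping.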

sample(withReplacement, fraction, seedopt) → {module:eclairjs/rdd.PairRDD}

Return a sampled subset of this PairRDD.
Parameters:
Name Type Attributes Description
withReplacement boolean
fraction number
seed number <optional>
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD

saveAsObjectFile(path, overwriteopt) → {Promise.<void>}

Save this PairRDD as a SequenceFile of serialized objects.
Parameters:
Name Type Attributes Description
path string
overwrite boolean <optional>
defaults to false, if true overwrites file if it exists
Source:
Returns:
Type
Promise.<void>

saveAsTextFile(path, overwriteopt) → {Promise.<void>}

Save this PairRDD as a text file, using string representations of elements.
Parameters:
Name Type Attributes Description
path string
overwrite boolean <optional>
defaults to false, if true overwrites file if it exists
Source:
Returns:
Type
Promise.<void>

sortByKey(ascending) → {module:eclairjs/rdd.PairRDD}

Return a PairRDD sorted by key, in ascending order if ascending is true and in descending order otherwise.
Parameters:
Name Type Description
ascending Boolean
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD

take(num) → {Promise.<Array>}

Asynchronously returns the first num elements in this PairRDD.
Parameters:
Name Type Description
num Number
Source:
Returns:
A Promise that resolves to an array containing the first num elements in this PairRDD.
Type
Promise.<Array>

takeOrdered(num, func, bindArgsopt) → {Promise.<Array>}

Asynchronously returns the first num (smallest) elements from this PairRDD, as defined by the ordering function func, and maintains the ordering. This does the opposite of top.
Parameters:
Name Type Attributes Description
num Number
func function (undocumented) Function with one parameter
bindArgs Array.<Object> <optional>
array whose values will be added to func's argument list.
Source:
Returns:
A Promise that resolves to an array containing the first num elements in this RDD.
Type
Promise.<Array>

takeSample(withReplacement, num, seed) → {Promise.<Array>}

Asynchronously returns a fixed-size sampled subset of this PairRDD as an array.
Parameters:
Name Type Description
withReplacement boolean whether sampling is done with replacement
num number size of the returned sample
seed number seed for the random number generator
Source:
Returns:
A Promise that resolves to an array containing the specified number of elements in this PairRDD.
Type
Promise.<Array>
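
To illustrate how the three parameters interact, here is a local sketch of fixed-size sampling over a plain array. Everything in it is hypothetical: `seededRandom` is a small deterministic generator chosen for the example, and `takeSampleLocal` uses a simple shuffle-based approach that is not claimed to match the sampling algorithm Spark or EclairJS use.

```javascript
// Small deterministic generator so that a seed reproduces the sample.
function seededRandom(seed) {
  let a = seed | 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical local model of takeSample; not the EclairJS implementation.
function takeSampleLocal(elements, withReplacement, num, seed) {
  const rand = seededRandom(seed);
  const n = elements.length;
  if (n === 0 || num <= 0) return [];
  if (withReplacement) {
    // The same element may be drawn more than once.
    const out = [];
    for (let i = 0; i < num; i++) {
      out.push(elements[Math.floor(rand() * n)]);
    }
    return out;
  }
  // Without replacement: partial Fisher-Yates shuffle of a copy.
  const pool = elements.slice();
  const size = Math.min(num, n);
  for (let i = 0; i < size; i++) {
    const j = i + Math.floor(rand() * (n - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, size);
}

const data = [["a", 1], ["b", 2], ["c", 3], ["d", 4], ["e", 5]];
const first = takeSampleLocal(data, false, 3, 42);
const again = takeSampleLocal(data, false, 3, 42);
const repeated = takeSampleLocal(data, true, 10, 7);
console.log(first, again, repeated.length);
```

With the same seed the two calls without replacement return the same three tuples, and sampling with replacement can return more elements than the input holds.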

toArray() → {Promise.<Array>}

Return an array that contains all of the elements in this PairRDD.
Source:
Returns:
A Promise that resolves to an array containing all elements in this PairRDD.
Type
Promise.<Array>

values() → {module:eclairjs/rdd.PairRDD}

Return a PairRDD with the values of each tuple.
Source:
Returns:
Type
module:eclairjs/rdd.PairRDD