Class: PairDStream

eclairjs/streaming/dstream. PairDStream

new PairDStream(jDStream)

Parameters:
Name Type Description
jDStream object
Source:

Methods

cache() → {module:eclairjs/streaming/dstream.PairDStream}

Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

combineByKey(createCombiner, mergeValue, mergeCombiners, partitioner, mapSideCombineopt) → {module:eclairjs/streaming/dstream.PairDStream}

Combine elements of each key in DStream's RDDs using custom function. This is similar to the combineByKey for RDDs. Please refer to combineByKey in org.apache.spark.rdd.PairRDDFunctions for more information.
Parameters:
Name Type Attributes Description
createCombiner func
mergeValue func
mergeCombiners func
partitioner module:eclairjs.Partitioner
mapSideCombine boolean <optional>
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

compute(validTime) → {module:eclairjs.PairRDD}

Parameters:
Name Type Description
validTime module:eclairjs/streaming.Time
Source:
Returns:
Type
module:eclairjs.PairRDD

filter(f) → {module:eclairjs/streaming/dstream.PairDStream}

Parameters:
Name Type Description
f func
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

flatMapValues(f) → {module:eclairjs/streaming/dstream.PairDStream}

Return a new DStream by applying a flatmap function to the value of each key-value pairs in 'this' DStream without changing the key.
Parameters:
Name Type Description
f func
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

groupByKey() → {module:eclairjs/streaming/dstream.PairDStream}

Return a new DStream by applying `groupByKey` to each RDD. Hash partitioning is used to generate the RDDs with Spark's default number of partitions.
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

mapValues(f) → {module:eclairjs/streaming/dstream.PairDStream}

Return a new DStream by applying a map function to the value of each key-value pairs in 'this' DStream without changing the key.
Parameters:
Name Type Description
f func
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

persist(storageLevelopt) → {module:eclairjs/streaming/dstream.PairDStream}

Parameters:
Name Type Attributes Description
storageLevel module:eclairjs/storage.StorageLevel <optional>
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

reduceByKey(func) → {module:eclairjs/streaming/dstream.PairDStream}

Return a new DStream by applying `reduceByKey` to each RDD. The values for each key are merged using the associative reduce function. Hash partitioning is used to generate the RDDs with Spark's default number of partitions.
Parameters:
Name Type Description
func func
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

reduceByKeyAndWindow(reduceFunc, windowDuration) → {module:eclairjs/streaming/dstream.PairDStream}

Create a new DStream by applying `reduceByKey` over a sliding window on `this` DStream. Similar to `DStream.reduceByKey()`, but applies it over a sliding window. The new DStream generates RDDs with the same interval as this DStream. Hash partitioning is used to generate the RDDs with Spark's default number of partitions.
Parameters:
Name Type Description
reduceFunc func associative reduce function
windowDuration module:eclairjs/streaming.Duration width of the window; must be a multiple of this DStream's batching interval
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

repartition(numPartitions) → {module:eclairjs/streaming/dstream.PairDStream}

Return a new DStream with an increased or decreased level of parallelism. Each RDD in the returned DStream has exactly numPartitions partitions.
Parameters:
Name Type Description
numPartitions number
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

union(that) → {module:eclairjs/streaming/dstream.PairDStream}

Return a new DStream by unifying data of another DStream with this DStream.
Parameters:
Name Type Description
that module:eclairjs/streaming/dstream.PairDStream Another DStream having the same interval (i.e., slideDuration) as this DStream.
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream

window(windowDuration, slideDurationopt) → {module:eclairjs/streaming/dstream.PairDStream}

Return a new DStream which is computed based on windowed batches of this DStream.
Parameters:
Name Type Attributes Description
windowDuration module:eclairjs/streaming.Duration duration (i.e., width) of the window; must be a multiple of this DStream's interval
slideDuration module:eclairjs/streaming.Duration <optional>
sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's interval
Source:
Returns:
Type
module:eclairjs/streaming/dstream.PairDStream