Constructor
new StreamingContext(sparkContext, duration)
Parameters:
Name | Type | Description |
---|---|---|
sparkContext |
SparkContex | |
duration |
module:eclairjs/streaming.Duration |
Methods
(static) getOrCreate(checkpointPath, creatingFunc) → {JavaStreamingContext}
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext.
If checkpoint data exists in the provided `checkpointPath`, then StreamingContext will be
recreated from the checkpoint data. If the data does not exist, then the provided factory
will be used to create a JavaStreamingContext.
Parameters:
Name | Type | Description |
---|---|---|
checkpointPath |
string | Checkpoint directory used in an earlier JavaStreamingContext program |
creatingFunc |
func | Function to create a new JavaStreamingContext |
Returns:
- Type
- JavaStreamingContext
awaitTermination()
Wait for the execution to stop
awaitTerminationOrTimeout(millis) → {boolean}
Wait for the execution to stop, or timeout
Parameters:
Name | Type | Description |
---|---|---|
millis |
long |
Returns:
- Type
- boolean
checkpoint(directory)
Sets the context to periodically checkpoint the DStream operations for master
fault-tolerance. The graph will be checkpointed every batch interval.
Parameters:
Name | Type | Description |
---|---|---|
directory |
string | HDFS-compatible directory where the checkpoint data will be reliably stored |
queueStream(queue, oneAtATimeopt, defaultRDDopt) → {module:eclairjs/streaming/dstream.DStream}
Create an input stream from an queue of RDDs. In each batch,
it will process either one or all of the RDDs returned by the queue.
NOTE:
1. Changes to the queue after the stream is created will not be recognized.
2. Arbitrary RDDs can be added to `queueStream`, there is no way to recover data of
those RDDs, so `queueStream` doesn't support checkpointing.
Parameters:
Name | Type | Attributes | Default | Description |
---|---|---|---|---|
queue |
Array.<module:eclairjs.RDD> | Queue of RDDs | ||
oneAtATime |
boolean |
<optional> |
true | Whether only one RDD should be consumed from the queue in every interval |
defaultRDD |
module:eclairjs.RDD |
<optional> |
Default RDD is returned by the DStream when the queue is empty |
Returns:
socketTextStream(host, port) → {module:eclairjs/streaming/dstream.DStream}
Create a input stream from TCP source hostname:port.
Parameters:
Name | Type | Description |
---|---|---|
host |
string | |
port |
string |
Returns:
sparkContext() → {module:eclairjs.SparkContext}
The underlying SparkContext
Returns:
start()
Start the execution of the streams.
stop()
Stops the execution of the streams.