Extends
- RDD
Methods
(static) fromRDD(rdd) → {module:eclairjs/rdd.FloatRDD}
Parameters:
Name | Type | Description |
---|---|---|
rdd | module:eclairjs/rdd.RDD | |
- Source:
Returns:
(static) toRDD(rdd) → {module:eclairjs/rdd.RDD}
Parameters:
Name | Type | Description |
---|---|---|
rdd | module:eclairjs/rdd.FloatRDD | |
- Source:
Returns:
cache() → {module:eclairjs/rdd.FloatRDD}
- Source:
Returns:
coalesce(numPartitions, [shuffle]) → {module:eclairjs/rdd.FloatRDD}
Return a new RDD that is reduced into `numPartitions` partitions.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
numPartitions | number | | |
shuffle | boolean | <optional> | |
- Source:
Returns:
distinct([numPartitions]) → {module:eclairjs/rdd.FloatRDD}
Return a new RDD containing the distinct elements in this RDD.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
numPartitions | number | <optional> | |
- Source:
Returns:
filter(func) → {module:eclairjs/rdd.FloatRDD}
Return a new RDD containing only the elements that satisfy a predicate.
Parameters:
Name | Type | Description |
---|---|---|
func | function | |
- Source:
Returns:
first() → {float}
- Source:
Returns:
- Type
- float
histogram0(bucketCount) → {Pair}
Compute a histogram of the data using `bucketCount` buckets evenly
spaced between the minimum and maximum of the RDD. For example, if the min
value is 0, the max is 100, and there are two buckets, the resulting
buckets will be [0,50) and [50,100]. `bucketCount` must be at least 1.
If the RDD contains infinity or NaN, an exception is thrown.
If the elements in the RDD do not vary (max == min), a single bucket is always returned.
Parameters:
Name | Type | Description |
---|---|---|
bucketCount | number | |
- Source:
Returns:
- Type
- Pair
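The even spacing described for `histogram0` can be sketched in plain JavaScript. This is only an illustration of the arithmetic on ordinary numbers, not an EclairJS call; the helper name `evenBucketBoundaries` is hypothetical, and returning `[min, min]` for the single-bucket case is an assumption of this sketch.

```javascript
// Plain-JavaScript illustration; evenBucketBoundaries is a hypothetical helper,
// not part of the EclairJS API.
function evenBucketBoundaries(min, max, bucketCount) {
  if (bucketCount < 1) {
    throw new Error("bucketCount must be at least 1");
  }
  if (!isFinite(min) || !isFinite(max)) {
    // Mirrors the documented rule that infinity and NaN are rejected.
    throw new Error("min and max must be finite numbers");
  }
  if (min === max) {
    // Elements do not vary: assume a single bucket [min, min].
    return [min, min];
  }
  var step = (max - min) / bucketCount;
  var boundaries = [];
  for (var i = 0; i < bucketCount; i++) {
    boundaries.push(min + i * step);
  }
  boundaries.push(max); // use max itself to avoid rounding drift
  return boundaries;
}

console.log(evenBucketBoundaries(0, 100, 2)); // boundaries 0, 50, 100
```

The returned boundaries describe the buckets [0,50) and [50,100] from the example in the description.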
histogram1(buckets) → {Array.<number>}
Compute a histogram using the provided buckets. The buckets are all open
to the right except for the last, which is closed.
For example, for the array [1,10,20,50] the buckets are [1,10) [10,20) [20,50],
i.e. 1<=x<10, 10<=x<20, 20<=x<=50.
For the inputs 1 and 50 the histogram would be 1,0,1.
Note: if the buckets are evenly spaced (e.g. [0, 10, 20, 30]), bucket lookup can be switched
from an O(log n) insertion to O(1) per element (where n = number of buckets) by setting `evenBuckets`
to true (see `histogram2`).
`buckets` must be sorted and must not contain any duplicates.
The `buckets` array must have at least two elements.
All NaN entries are treated the same. If you have a NaN bucket, it must be
the maximum value, in the last position, and all NaN entries will be counted
in that bucket.
Parameters:
Name | Type | Description |
---|---|---|
buckets | Array.<number> | |
- Source:
Returns:
- Type
- Array.<number>
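The bucket rules for `histogram1` can be sketched on a plain array. This is a minimal illustration, not the EclairJS implementation: the helper name `histogramCounts` is hypothetical, values outside the bucket range are assumed to be skipped, and NaN handling is left out for brevity.

```javascript
// Plain-JavaScript illustration; histogramCounts is a hypothetical helper.
function histogramCounts(values, buckets) {
  if (buckets.length < 2) {
    throw new Error("buckets must have at least two elements");
  }
  var last = buckets.length - 1;
  var counts = new Array(last).fill(0);
  values.forEach(function (x) {
    // Skip values outside [first, last]; this also skips NaN (assumption).
    if (!(x >= buckets[0] && x <= buckets[last])) {
      return;
    }
    if (x === buckets[last]) {
      counts[last - 1]++; // the last bucket is closed on the right
      return;
    }
    // Binary search for the largest i with buckets[i] <= x: O(log n).
    var lo = 0;
    var hi = last - 1;
    while (lo < hi) {
      var mid = Math.ceil((lo + hi) / 2);
      if (buckets[mid] <= x) {
        lo = mid;
      } else {
        hi = mid - 1;
      }
    }
    counts[lo]++;
  });
  return counts;
}

console.log(histogramCounts([1, 50], [1, 10, 20, 50])); // counts 1, 0, 1
```

The value 1 lands in [1,10) and 50 lands in the closed last bucket [20,50], so the middle bucket stays empty.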
histogram2(buckets, evenBuckets) → {Array.<number>}
Parameters:
Name | Type | Description |
---|---|---|
buckets | Array.<float> | |
evenBuckets | boolean | |
- Source:
Returns:
- Type
- Array.<number>
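When buckets are evenly spaced, the bucket for a value can be computed directly instead of searched for, which is the O(1) case mentioned in the note under `histogram1`. The following is a sketch of that arithmetic only; `evenBucketIndex` is a hypothetical helper, and returning -1 for out-of-range values is an assumption of this sketch.

```javascript
// Plain-JavaScript illustration; evenBucketIndex is a hypothetical helper.
function evenBucketIndex(x, min, max, bucketCount) {
  if (!(x >= min && x <= max)) {
    return -1; // out of range (or NaN): no bucket, by assumption
  }
  // Multiply before dividing to limit floating-point error at boundaries.
  var index = Math.floor(((x - min) * bucketCount) / (max - min));
  // x === max would compute bucketCount; clamp it into the closed last bucket.
  return Math.min(index, bucketCount - 1);
}

// Buckets [0, 10, 20, 30] are three even buckets between 0 and 30.
console.log(evenBucketIndex(10, 0, 30, 3)); // 1
console.log(evenBucketIndex(30, 0, 30, 3)); // 2
```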
intersection(other) → {module:eclairjs/rdd.FloatRDD}
Return the intersection of this RDD and another one. The output will not contain any duplicate
elements, even if the input RDDs did.
Note that this method performs a shuffle internally.
Parameters:
Name | Type | Description |
---|---|---|
other | module:eclairjs/rdd.FloatRDD | |
- Source:
Returns:
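The duplicate-free result of `intersection` can be illustrated on plain arrays. This sketch only shows the set semantics; it says nothing about the shuffle or about element order in a real RDD, and the helper name `intersectionOf` is hypothetical.

```javascript
// Plain-JavaScript illustration; intersectionOf is a hypothetical helper.
function intersectionOf(thisValues, otherValues) {
  var inOther = new Set(otherValues);
  var common = new Set(); // a Set drops duplicates from the output
  thisValues.forEach(function (x) {
    if (inOther.has(x)) {
      common.add(x);
    }
  });
  return Array.from(common);
}

console.log(intersectionOf([1, 1, 2, 3], [1, 3, 3, 4])); // elements 1 and 3, once each
```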
max() → {float}
Returns the maximum element of this RDD, as defined by
the default comparator (natural order).
- Source:
Returns:
the maximum of the RDD
- Type
- float
mean() → {float}
- Source:
Returns:
- Type
- float
meanApprox(timeout, [confidence]) → {PartialResult}
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
timeout | number | | |
confidence | float | <optional> | |
- Source:
Returns:
- Type
- PartialResult
min() → {float}
Returns the minimum element of this RDD, as defined by
the default comparator (natural order).
- Source:
Returns:
the minimum of the RDD
- Type
- float
persist(newLevel) → {module:eclairjs/rdd.FloatRDD}
Set this RDD's storage level to persist its values across operations after the first time
it is computed. Can only be called once on each RDD.
Parameters:
Name | Type | Description |
---|---|---|
newLevel | module:eclairjs/storage.StorageLevel | |
- Source:
Returns:
repartition(numPartitions) → {module:eclairjs/rdd.FloatRDD}
Return a new RDD that has exactly numPartitions partitions.
Can increase or decrease the level of parallelism in this RDD. Internally, this uses
a shuffle to redistribute data.
If you are decreasing the number of partitions in this RDD, consider using `coalesce`,
which can avoid performing a shuffle.
Parameters:
Name | Type | Description |
---|---|---|
numPartitions | number | |
- Source:
Returns:
sample(withReplacement, fraction, [seed]) → {module:eclairjs/rdd.FloatRDD}
Return a sampled subset of this RDD.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
withReplacement | boolean | | |
fraction | float | | |
seed | number | <optional> | |
- Source:
Returns:
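To show what `fraction` and `seed` mean for sampling without replacement, here is a minimal Bernoulli-sampling sketch on a plain array. It is an assumption-laden illustration, not the sampler EclairJS or Spark actually uses: `mulberry32` is a small public-domain seeded generator chosen only so the result is repeatable, and `bernoulliSample` is a hypothetical helper.

```javascript
// Small seeded pseudo-random generator returning numbers in [0, 1).
function mulberry32(seed) {
  var a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    var t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: keep each element independently with probability
// `fraction` (the withReplacement = false case only).
function bernoulliSample(values, fraction, seed) {
  var random = mulberry32(seed);
  return values.filter(function () {
    return random() < fraction;
  });
}

var data = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5];
var once = bernoulliSample(data, 0.5, 42);
var again = bernoulliSample(data, 0.5, 42);
// Same seed, same sample; the sample size is only approximately
// fraction * data.length.
console.log(once.length === again.length);
```

Because each element is kept independently, `fraction` is an expected proportion rather than an exact count.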
sampleStdev() → {float}
Compute the sample standard deviation of this RDD's elements (which corrects for bias in
estimating the standard deviation by dividing by N-1 instead of N).
- Source:
Returns:
- Type
- float
sampleVariance() → {float}
Compute the sample variance of this RDD's elements (which corrects for bias in
estimating the variance by dividing by N-1 instead of N).
- Source:
Returns:
- Type
- float
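The difference between dividing by N and by N-1 can be seen on a small plain array. This sketch assumes that `variance()` and `stdev()` divide by N while `sampleVariance()` and `sampleStdev()` divide by N-1, as the descriptions above suggest; `varianceOf` is a hypothetical helper, not an EclairJS function.

```javascript
// Plain-JavaScript illustration; varianceOf is a hypothetical helper.
function varianceOf(values, useSampleCorrection) {
  var n = values.length;
  var mean = values.reduce(function (acc, x) { return acc + x; }, 0) / n;
  var sumOfSquares = values.reduce(function (acc, x) {
    return acc + (x - mean) * (x - mean);
  }, 0);
  // Divide by N-1 for the sample estimate, by N otherwise.
  return sumOfSquares / (useSampleCorrection ? n - 1 : n);
}

var sampleData = [2, 4, 4, 4, 5, 5, 7, 9]; // mean 5, squared deviations sum to 32
console.log(varianceOf(sampleData, false));           // 4      (32 / 8)
console.log(Math.sqrt(varianceOf(sampleData, false))); // 2
console.log(varianceOf(sampleData, true));            // 32 / 7, about 4.571
```

The corrected estimate is always the larger of the two, and the gap shrinks as N grows.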
setName(name) → {module:eclairjs/rdd.FloatRDD}
Parameters:
Name | Type | Description |
---|---|---|
name | string | |
- Source:
Returns:
stats() → {StatCounter}
Return a StatCounter object that captures the mean, variance and
count of the RDD's elements in one operation.
- Source:
Returns:
- Type
- StatCounter
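Computing the mean, variance and count "in one operation" is commonly done with a single-pass (Welford-style) update. The sketch below shows that technique on a plain array; it is not the StatCounter implementation, and the function name `onePassStats` and the shape of the returned object are assumptions made for illustration.

```javascript
// Plain-JavaScript illustration of a one-pass (Welford) update;
// onePassStats and its result shape are hypothetical.
function onePassStats(values) {
  var count = 0;
  var mean = 0;
  var m2 = 0; // running sum of squared deviations from the mean
  values.forEach(function (x) {
    count++;
    var delta = x - mean;
    mean += delta / count;
    m2 += delta * (x - mean);
  });
  return {
    count: count,
    mean: mean,
    variance: count > 0 ? m2 / count : NaN // divides by N, not N-1
  };
}

var summary = onePassStats([2, 4, 4, 4, 5, 5, 7, 9]);
// count is 8; mean is 5 and variance is 4, up to floating-point rounding.
console.log(summary);
```

A single pass matters for an RDD because each additional pass over distributed data is a separate job.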
stdev() → {float}
- Source:
Returns:
- Type
- float
subtract0(other) → {module:eclairjs/rdd.FloatRDD}
Return an RDD with the elements from `this` that are not in `other`.
Uses the partitioner and partition count of `this`, because even if `other` is huge, the resulting
RDD will be no larger than `this`.
Parameters:
Name | Type | Description |
---|---|---|
other | module:eclairjs/rdd.FloatRDD | |
- Source:
Returns:
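The semantics of `subtract0` can be illustrated on plain arrays: every element of `this` that also occurs in `other` is dropped. This sketch ignores partitioning and ordering, and it assumes that duplicates in `this` that are absent from `other` are kept; `subtractValues` is a hypothetical helper.

```javascript
// Plain-JavaScript illustration; subtractValues is a hypothetical helper.
function subtractValues(thisValues, otherValues) {
  var exclude = new Set(otherValues);
  return thisValues.filter(function (x) {
    return !exclude.has(x);
  });
}

// 2.5 occurs in both inputs and is removed; 9 only occurs in `other`.
console.log(subtractValues([1.5, 2.5, 2.5, 3.5], [2.5, 9])); // elements 1.5 and 3.5
```

The output can never have more elements than `thisValues`, which is the reason given above for reusing the partitioning of `this`.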
subtract1(other, numPartitions) → {module:eclairjs/rdd.FloatRDD}
Return an RDD with the elements from `this` that are not in `other`.
Parameters:
Name | Type | Description |
---|---|---|
other | module:eclairjs/rdd.FloatRDD | |
numPartitions | number | |
- Source:
Returns:
subtract2(other, p) → {module:eclairjs/rdd.FloatRDD}
Return an RDD with the elements from `this` that are not in `other`.
Parameters:
Name | Type | Description |
---|---|---|
other | module:eclairjs/rdd.FloatRDD | |
p | Partitioner | |
- Source:
Returns:
sum() → {float}
- Source:
Returns:
- Type
- float
sumApprox(timeout, [confidence]) → {PartialResult}
Approximate operation to return the sum within a timeout.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
timeout | number | | |
confidence | float | <optional> | |
- Source:
Returns:
- Type
- PartialResult
union(other) → {module:eclairjs/rdd.FloatRDD}
Return the union of this RDD and another one. Any identical elements will appear multiple
times (use `.distinct()` to eliminate them).
Parameters:
Name | Type | Description |
---|---|---|
other | module:eclairjs/rdd.FloatRDD | |
- Source:
Returns:
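The way `union` keeps duplicates, and how a following `.distinct()` removes them, can be shown on plain arrays. This is an illustration of the semantics only; `unionOf` and `distinctOf` are hypothetical helpers, not EclairJS functions.

```javascript
// Plain-JavaScript illustration; unionOf and distinctOf are hypothetical helpers.
function unionOf(thisValues, otherValues) {
  return thisValues.concat(otherValues); // duplicates are kept
}

function distinctOf(values) {
  return Array.from(new Set(values)); // one copy of each value
}

var combined = unionOf([1, 2], [2, 3]);
console.log(combined);             // elements 1, 2, 2, 3
console.log(distinctOf(combined)); // elements 1, 2, 3
```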
unpersist([blocking]) → {module:eclairjs/rdd.FloatRDD}
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
blocking | boolean | <optional> | Whether to block until all blocks are deleted. |
- Source:
Returns:
variance() → {float}
- Source:
Returns:
- Type
- float
wrapRDD(rdd) → {module:eclairjs/rdd.FloatRDD}
Parameters:
Name | Type | Description |
---|---|---|
rdd | module:eclairjs/rdd.RDD | |
- Source: