Constructor
new SparkSession()
- Source:
Examples
SparkSession.builder().getOrCreate()
The builder can also be used to create a new session:
SparkSession.builder()
  .master("local")
  .appName("Word Count")
  .config("spark.some.config.option", "some-value")
  .getOrCreate()
Methods
(static) builder() → {Builder}
Creates a SparkSession.Builder for constructing a SparkSession.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- Builder
(static) clearActiveSession() → {Promise.<Void>}
Clears the active SparkSession for the current thread. Subsequent calls to getOrCreate will
return the first created context instead of a thread-local override.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
A Promise that resolves to nothing.
- Type
- Promise.<Void>
(static) clearDefaultSession() → {Promise.<Void>}
Clears the default SparkSession that is returned by the builder.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
A Promise that resolves to nothing.
- Type
- Promise.<Void>
(static) setActiveSession(session) → {Promise.<Void>}
Changes the SparkSession that will be returned in this thread and its children when
SparkSession.getOrCreate() is called. This can be used to ensure that a given thread receives
a SparkSession with an isolated session, instead of the global (first created) context.
Parameters:
Name | Type | Description |
---|---|---|
session | module:eclairjs/sql.SparkSession | |
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
A Promise that resolves to nothing.
- Type
- Promise.<Void>
(static) setDefaultSession(session) → {Promise.<Void>}
Sets the default SparkSession that is returned by the builder.
Parameters:
Name | Type | Description |
---|---|---|
session | module:eclairjs/sql.SparkSession | |
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
A Promise that resolves to nothing.
- Type
- Promise.<Void>
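The interplay between the active and the default session described above can be modelled in a few lines of plain JavaScript. This is only a sketch of the lookup order (active session first, then the default, which is assumed to be the first session created). `SessionRegistry` and its methods are hypothetical names used for illustration, not part of EclairJS, and the real static methods return Promises rather than acting synchronously.

```javascript
// Hypothetical model of the session lookup order; this is not EclairJS code.
class SessionRegistry {
  constructor() {
    this.activeSession = null;  // thread-local override
    this.defaultSession = null; // global (first created) session
  }
  setActiveSession(session) { this.activeSession = session; }
  clearActiveSession() { this.activeSession = null; }
  setDefaultSession(session) { this.defaultSession = session; }
  clearDefaultSession() { this.defaultSession = null; }

  // Assumed behaviour: prefer the active session, fall back to the default,
  // and only call `create` (which then becomes the default) when neither exists.
  getOrCreate(create) {
    if (this.activeSession !== null) return this.activeSession;
    if (this.defaultSession === null) this.defaultSession = create();
    return this.defaultSession;
  }
}

const registry = new SessionRegistry();
const first = registry.getOrCreate(() => ({ name: "first" }));

registry.setActiveSession({ name: "isolated" });
const whileActive = registry.getOrCreate(() => ({ name: "unused" }));

registry.clearActiveSession();
const afterClear = registry.getOrCreate(() => ({ name: "unused" }));

console.log(first.name, whileActive.name, afterClear.name); // first isolated first
```

After `clearActiveSession()` the lookup falls back to the first created session, which matches the description of clearActiveSession above.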
baseRelationToDataFrame(baseRelation) → {module:eclairjs/sql.DataFrame}
Converts a BaseRelation created for external data sources into a DataFrame.
Parameters:
Name | Type | Description |
---|---|---|
baseRelation | module:eclairjs/sql/sources.BaseRelation | |
- Since:
- EclairJS 0.6 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.DataFrame
createDataFrame(rowRDD_or_values, schema) → {module:eclairjs/sql.Dataset}
Creates a Dataset from an RDD of Rows, or from an array of value arrays, using the given schema.
Parameters:
Name | Type | Description |
---|---|---|
rowRDD_or_values | module:eclairjs.RDD.<module:eclairjs/sql.Row> or Array.<module:eclairjs/sql.Row> | An RDD of Rows, or an array of arrays containing values of valid DataTypes |
schema | module:eclairjs/sql/types.StructType | |
- Source:
Returns:
- Type
- module:eclairjs/sql.Dataset
Example
var df = sqlSession.createDataFrame([[1,1], [1,2], [2,1], [2,1], [2,3], [3,2], [3,3]], schema);
createDataFrameFromJson(schema) → {module:eclairjs/sql.Dataset}
Creates a Dataset from an RDD of JSON.
Parameters:
Name | Type | Description |
---|---|---|
 | module:eclairjs.RDD | RDD of JSON |
schema | object | Object whose keys correspond to JSON field names (or getter functions) and whose values indicate the data type |
- Source:
Returns:
- Type
- module:eclairjs/sql.Dataset
Example
var df = sqlSession.createDataFrame([{id:1,"name":"jim"},{id:2,"name":"tom"}], {"id":"Integer","name":"String"});
createDataset(data, encoder) → {module:eclairjs/sql.Dataset}
:: Experimental ::
Creates a Dataset from an RDD or an array of objects, using the given encoder.
Parameters:
Name | Type | Description |
---|---|---|
data | module:eclairjs/rdd.RDD or Array.<object> | |
encoder | function | |
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.Dataset
createDatasetFromJson(schema) → {module:eclairjs/sql.Dataset}
Creates a Dataset from an RDD of JSON.
Parameters:
Name | Type | Description |
---|---|---|
 | module:eclairjs.RDD | RDD of JSON |
schema | object | Object whose keys correspond to JSON field names (or getter functions) and whose values indicate the data type |
- Source:
Returns:
- Type
- module:eclairjs/sql.Dataset
Example
var df = sqlSession.createDataFrame([{id:1,"name":"jim"},{id:2,"name":"tom"}], {"id":"Integer","name":"String"});
emptyDataset() → {module:eclairjs/sql.Dataset}
:: Experimental ::
Creates a new Dataset of type T containing zero elements.
- Since:
- Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.Dataset
newSession() → {module:eclairjs/sql.SparkSession}
Starts a new session in which SQL configurations, temporary tables, and registered
functions are isolated, but which shares the underlying SparkContext and cached data.
Note: Other than the SparkContext, all shared state is initialized lazily.
This method will force the initialization of the shared state to ensure that parent
and child sessions are set up with the same shared state. If the underlying catalog
implementation is Hive, this will initialize the metastore, which may take some time.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.SparkSession
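The split between isolated and shared state can be pictured with a toy model in plain JavaScript. This is a sketch only: `ToySession`, its `context` field, and its `conf` map are hypothetical stand-ins chosen for illustration and do not reflect how EclairJS or Spark implement sessions.

```javascript
// Hypothetical toy model of session isolation; this is not EclairJS code.
class ToySession {
  constructor(sharedContext) {
    this.context = sharedContext; // shared by every session derived from it
    this.conf = new Map();        // SQL configuration, private to this session
  }
  // A new session reuses the same context but starts with its own configuration.
  newSession() {
    return new ToySession(this.context);
  }
}

const parent = new ToySession({ appName: "Word Count" });
const child = parent.newSession();
child.conf.set("spark.some.config.option", "some-value");

console.log(child.context === parent.context);            // true
console.log(parent.conf.has("spark.some.config.option")); // false
```

A configuration value set on the child stays invisible to the parent, while both refer to the same context object.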
range(end) → {module:eclairjs/sql.Dataset}
:: Experimental ::
Creates a Dataset with a single LongType column named `id`, containing elements
in the range from 0 to `end` (exclusive) with a step value of 1.
Parameters:
Name | Type | Description |
---|---|---|
end | number | |
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.Dataset
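To make the contents of the `id` column concrete, the following plain-JavaScript sketch lists the values a call such as `range(5)` would be expected to hold. It assumes the semantics of Spark's single-argument `range` (start at 0, step 1, `end` exclusive). `rangeIds` is a hypothetical helper, not an EclairJS function, and it builds a local array rather than a Dataset.

```javascript
// Hypothetical helper: lists the ids assumed to be in the Dataset from range(end).
function rangeIds(end) {
  const ids = [];
  for (let id = 0; id < end; id++) {
    ids.push(id); // start at 0, step 1, stop before `end`
  }
  return ids;
}

console.log(rangeIds(5)); // [ 0, 1, 2, 3, 4 ]
```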
read() → {module:eclairjs/sql.DataFrameReader}
Returns a DataFrameReader that can be used to read non-streaming data in as a
DataFrame.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.DataFrameReader
Example
sparkSession.read().parquet("/path/to/file.parquet")
sparkSession.read().schema(schema).json("/path/to/file.json")
readStream() → {module:eclairjs/sql/streaming.DataStreamReader}
:: Experimental ::
Returns a DataStreamReader that can be used to read streaming data in as a DataFrame.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql/streaming.DataStreamReader
Example
sparkSession.readStream().parquet("/path/to/directory/of/parquet/files")
sparkSession.readStream().schema(schema).json("/path/to/directory/of/json/files")
sparkContext() → {module:eclairjs/SparkContext}
The underlying SparkContext.
- Since:
- EclairJS 0.6 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/SparkContext
sql(sqlText) → {module:eclairjs/sql.DataFrame}
Executes a SQL query using Spark, returning the result as a DataFrame.
The dialect that is used for SQL parsing can be configured with 'spark.sql.dialect'.
Parameters:
Name | Type | Description |
---|---|---|
sqlText | string | |
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.DataFrame
stop() → {Promise.<Void>}
Stop the underlying SparkContext.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
A Promise that resolves to nothing.
- Type
- Promise.<Void>
streams() → {module:eclairjs/sql/streaming.StreamingQueryManager}
:: Experimental ::
Returns a StreamingQueryManager that allows managing all the
StreamingQuery instances active on this session.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql/streaming.StreamingQueryManager
table(tableName) → {module:eclairjs/sql.DataFrame}
Returns the specified table as a DataFrame.
Parameters:
Name | Type | Description |
---|---|---|
tableName | string | |
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.DataFrame
udf() → {module:eclairjs/sql.UDFRegistration}
A collection of methods for registering user-defined functions (UDF).
Note that the user-defined functions must be deterministic. Due to optimization,
duplicate invocations may be eliminated or the function may even be invoked more times than
it is present in the query.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- module:eclairjs/sql.UDFRegistration
Examples
The following example registers a Scala closure as a UDF:
sparkSession.udf.register("myUDF", (arg1: Int, arg2: String) => arg2 + arg1)
The following example registers a UDF in Java:
sparkSession.udf().register("myUDF",
    new UDF2<Integer, String, String>() {
        @Override
        public String call(Integer arg1, String arg2) {
            return arg2 + arg1;
        }
    }, DataTypes.StringType);
Or, to use Java 8 lambda syntax:
sparkSession.udf().register("myUDF",
    (Integer arg1, String arg2) -> arg2 + arg1,
    DataTypes.StringType);
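The determinism requirement for user-defined functions can be illustrated without Spark. The sketch below is plain JavaScript: `applyTwice` is a hypothetical stand-in for an optimizer that evaluates the same expression more than once, and neither function is registered through the EclairJS API.

```javascript
// Plain JavaScript, no Spark involved. `applyTwice` is a hypothetical stand-in
// for an engine that evaluates the same expression more than once.
function applyTwice(fn, a, b) {
  return [fn(a, b), fn(a, b)];
}

// Deterministic: the same inputs always give the same output.
const concat = (n, s) => s + n;

// Non-deterministic: the result depends on hidden state that changes per call.
let calls = 0;
const concatWithCounter = (n, s) => s + n + "#" + (++calls);

const stable = applyTwice(concat, 1, "a");
const unstable = applyTwice(concatWithCounter, 1, "a");

console.log(stable[0] === stable[1]);     // true
console.log(unstable[0] === unstable[1]); // false
```

If the number of invocations can change, only the deterministic function yields results that do not depend on how often it was called.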
version() → {Promise.<string>}
The version of Spark on which this application is running.
- Since:
- EclairJS 0.7 Spark 2.0.0
- Source:
Returns:
- Type
- Promise.<string>
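Because `version()` and `stop()` are documented as returning Promises, callers need to wait for them to resolve. The sketch below shows one way to do that with `async`/`await`. `stubSession` and `reportVersionAndStop` are hypothetical names, and the stub only imitates the two documented return types so that the snippet runs without a Spark cluster. With EclairJS, the session object would come from the builder instead.

```javascript
// `stubSession` is a hypothetical stand-in that imitates the documented return
// types: version() -> Promise.<string>, stop() -> Promise.<Void>.
const stubSession = {
  version: () => Promise.resolve("2.0.0"),
  stop: () => Promise.resolve(),
};

// Waits for the version string, stops the session, then returns a summary line.
async function reportVersionAndStop(session) {
  const version = await session.version();
  await session.stop(); // resolves to nothing
  return "Running on Spark " + version;
}

const report = reportVersionAndStop(stubSession);
report.then((line) => console.log(line));
```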