Constructor
new SparkSession()
Examples
SparkSession.builder().getOrCreate()
The builder can also be used to create a new session:
SparkSession.builder()
.master("local")
.appName("Word Count")
.config("spark.some.config.option", "some-value")
.getOrCreate()
Methods
baseRelationToDataFrame(baseRelation) → {DataFrame}
Converts a BaseRelation created for external data sources into a DataFrame.
Parameters:
Name | Type | Description |
---|---|---|
baseRelation | module:eclairjs/sql/sources.BaseRelation | |
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- DataFrame
builder() → {module:eclairjs/sql.SparkSessionBuilder}
Creates a module:eclairjs/sql.SparkSessionBuilder for constructing a module:eclairjs/sql.SparkSession.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.SparkSessionBuilder
clearActiveSession()
Clears the active SparkSession for the current thread. Subsequent calls to getOrCreate will
return the first created context instead of a thread-local override.
- Since:
- EclairJS 0.6 Spark 2.0.0
clearDefaultSession()
Clears the default SparkSession that is returned by the builder.
- Since:
- EclairJS 0.6 Spark 2.0.0
createDataFrame(rowRDD_or_values, schema) → {module:eclairjs/sql.DataFrame}
Creates a DataFrame from an RDD of Rows, or from an array of value arrays, using the given schema.
Parameters:
Name | Type | Description |
---|---|---|
rowRDD_or_values | module:eclairjs.RDD.<module:eclairjs/sql.Row> \| Array.<module:eclairjs/sql.Row> | An RDD of Rows or an array of arrays that contain values of valid DataTypes |
schema | module:eclairjs/sql/types.StructType | - |
Returns:
- Type
- module:eclairjs/sql.DataFrame
Example
var df = sqlSession.createDataFrame([[1,1], [1,2], [2,1], [2,1], [2,3], [3,2], [3,3]], schema);
createDataFrameFromJson(schema) → {module:eclairjs/sql.Dataset}
Creates a Dataset from an RDD of JSON.
Parameters:
Name | Type | Description |
---|---|---|
module:eclairjs.RDD | RDD of JSON | |
schema | object | Object with keys corresponding to JSON field names (or getter functions), and values indicating the DataType |
Returns:
- Type
- module:eclairjs/sql.Dataset
Example
var df = sqlSession.createDataFrameFromJson([{id:1,"name":"jim"},{id:2,"name":"tom"}], {"id":"Integer","name":"String"});
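The shape of the schema object can be illustrated in plain JavaScript. This is only a sketch and not EclairJS code: `jsonToRows` is a hypothetical helper that assumes the schema keys are plain field names (the getter-function case is not covered) and shows how those keys could fix the order of values in each row.

```javascript
// Hypothetical helper (not part of EclairJS): orders the values of each JSON
// object by the keys of the schema object, giving one array of values per row.
function jsonToRows(records, schema) {
  var fields = Object.keys(schema); // e.g. ["id", "name"]
  return records.map(function (record) {
    return fields.map(function (field) {
      return record[field];
    });
  });
}

var exampleSchema = { id: "Integer", name: "String" };
var exampleRows = jsonToRows(
  [{ id: 1, name: "jim" }, { id: 2, name: "tom" }],
  exampleSchema
);
console.log(JSON.stringify(exampleRows)); // [[1,"jim"],[2,"tom"]]
```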
emptyDataset() → {module:eclairjs/sql.Dataset}
:: Experimental ::
Creates a new Dataset of type T containing zero elements.
- Since:
- Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.Dataset
newSession() → {module:eclairjs/sql.SparkSession}
Starts a new session in which SQL configurations, temporary tables, and registered
functions are isolated, while the underlying SparkContext and cached data are shared.
Note: Other than the SparkContext, all shared state is initialized lazily.
This method will force the initialization of the shared state to ensure that parent
and child sessions are set up with the same shared state. If the underlying catalog
implementation is Hive, this will initialize the metastore, which may take some time.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.SparkSession
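The isolation described for newSession can be pictured with a small plain-JavaScript model. This is a sketch under stated assumptions, not EclairJS internals: `makeSession` and its `conf` object are hypothetical stand-ins for a session and its SQL configuration.

```javascript
// Hypothetical model (not EclairJS internals): each session owns its SQL
// configuration, while the context object is shared by reference.
function makeSession(sharedContext) {
  return {
    sparkContext: sharedContext,
    conf: {},
    newSession: function () {
      return makeSession(sharedContext);
    }
  };
}

var parentSession = makeSession({ appName: "Word Count" });
var childSession = parentSession.newSession();
childSession.conf["spark.some.config.option"] = "some-value";

// The context is shared, but the configuration change stays in the child.
console.log(childSession.sparkContext === parentSession.sparkContext); // true
console.log(parentSession.conf["spark.some.config.option"]);           // undefined
```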
range(tableName, start, end, [step], [numPartitions]) → {module:eclairjs/sql.Dataset}
:: Experimental ::
Creates a Dataset with a single LongType column named `id`, containing elements
in a range from `start` to `end` (exclusive) with the given step value and the
specified number of partitions.
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
tableName | string | | |
start | number | | |
end | number | | |
step | number | <optional> | |
numPartitions | number | <optional> | |
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.Dataset
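The range semantics can be worked through in plain JavaScript. This is a minimal sketch, not EclairJS code: `rangeIds` is a hypothetical helper that lists the `id` values such a Dataset would contain, assuming a default step of 1 and handling only a positive step. Partitioning is ignored.

```javascript
// Hypothetical helper (not part of EclairJS): lists the `id` values that a
// range from `start` to `end` (exclusive) with the given step would contain.
function rangeIds(start, end, step) {
  step = (step === undefined) ? 1 : step; // assumed default step of 1
  if (step <= 0) {
    throw new Error("this sketch only handles a positive step");
  }
  var ids = [];
  for (var id = start; id < end; id += step) {
    ids.push(id);
  }
  return ids;
}

console.log(JSON.stringify(rangeIds(0, 10, 3))); // [0,3,6,9]
console.log(JSON.stringify(rangeIds(5, 8)));     // [5,6,7]
```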
read() → {module:eclairjs/sql.DataFrameReader}
Returns a DataFrameReader that can be used to read non-streaming data in as a
DataFrame.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.DataFrameReader
Example
sparkSession.read().parquet("/path/to/file.parquet")
sparkSession.read().schema(schema).json("/path/to/file.json")
readStream() → {module:eclairjs/sql/streaming.DataStreamReader}
:: Experimental ::
Returns a DataStreamReader that can be used to read streaming data in as a DataFrame.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql/streaming.DataStreamReader
Example
sparkSession.readStream().parquet("/path/to/directory/of/parquet/files")
sparkSession.readStream().schema(schema).json("/path/to/directory/of/json/files")
setActiveSession(session)
Changes the SparkSession that will be returned in this thread and its children when
SparkSession.getOrCreate() is called. This can be used to ensure that a given thread receives
a SparkSession with an isolated session, instead of the global (first created) context.
Parameters:
Name | Type | Description |
---|---|---|
session | module:eclairjs/sql.SparkSession | |
- Since:
- EclairJS 0.6 Spark 2.0.0
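The interplay between the active session, the default session, and getOrCreate can be modelled in a few lines of plain JavaScript. This is a simplified sketch, not EclairJS internals: the `registry` object and the session objects are hypothetical, a single thread is assumed, and the lookup order (active session first, then the default, then a newly created one) is the assumption being illustrated.

```javascript
// Simplified, hypothetical model of session lookup (not EclairJS internals).
// A single thread is assumed, so "active" stands in for the thread-local slot.
var registry = { active: null, defaultSession: null };

function setActiveSession(session) { registry.active = session; }
function clearActiveSession() { registry.active = null; }
function setDefaultSession(session) { registry.defaultSession = session; }
function clearDefaultSession() { registry.defaultSession = null; }

// Lookup order assumed here: active session, then default, then a new one.
function getOrCreate() {
  if (registry.active) {
    return registry.active;
  }
  if (!registry.defaultSession) {
    registry.defaultSession = { name: "global" };
  }
  return registry.defaultSession;
}

var globalSession = getOrCreate();   // first created session becomes the default
var isolatedSession = { name: "isolated" };

setActiveSession(isolatedSession);
console.log(getOrCreate().name);     // isolated

clearActiveSession();
console.log(getOrCreate().name);     // global
```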
setDefaultSession(session)
Sets the default SparkSession that is returned by the builder.
Parameters:
Name | Type | Description |
---|---|---|
session | module:eclairjs/sql.SparkSession | |
- Since:
- EclairJS 0.6 Spark 2.0.0
sparkContext() → {module:eclairjs.SparkContext}
The underlying SparkContext.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs.SparkContext
sql(sqlText) → {module:eclairjs/sql.Dataset}
Executes a SQL query using Spark, returning the result as a module:eclairjs/sql.Dataset.
The dialect that is used for SQL parsing can be configured with 'spark.sql.dialect'.
Parameters:
Name | Type | Description |
---|---|---|
sqlText | string | |
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.Dataset
stop()
Stop the underlying module:eclairjs.SparkContext.
- Since:
- EclairJS 0.6 Spark 2.0.0
streams() → {module:eclairjs/sql/streaming.StreamingQueryManager}
:: Experimental ::
Returns a StreamingQueryManager that allows managing all the
StreamingQuery instances active on this session.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql/streaming.StreamingQueryManager
table(tableName) → {module:eclairjs/sql.Dataset}
Returns the specified table as a module:eclairjs/sql.Dataset.
Parameters:
Name | Type | Description |
---|---|---|
tableName | string | |
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.Dataset
udf() → {module:eclairjs/sql.UDFRegistration}
A collection of methods for registering user-defined functions (UDF).
Note that the user-defined functions must be deterministic. Due to optimization,
duplicate invocations may be eliminated or the function may even be invoked more times than
it is present in the query.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- module:eclairjs/sql.UDFRegistration
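The determinism requirement can be seen with two ordinary JavaScript functions. This sketch does not use the UDF registration API. `plusOne` and `tagWithCallCount` are hypothetical examples that only show why a function with hidden state gives different answers when it is invoked a different number of times.

```javascript
// Deterministic: the result depends only on the argument, so it does not
// matter how many times an optimizer chooses to invoke it.
function plusOne(x) {
  return x + 1;
}

// Non-deterministic: the result depends on hidden state, so an extra or
// eliminated invocation changes what the query would see.
var calls = 0;
function tagWithCallCount(x) {
  calls += 1;
  return x + "#" + calls;
}

console.log(plusOne(1) === plusOne(1));                       // true
console.log(tagWithCallCount("a") === tagWithCallCount("a")); // false
```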
version() → {string}
The version of Spark on which this application is running.
- Since:
- EclairJS 0.6 Spark 2.0.0
Returns:
- Type
- string