Class: SparkSession

eclairjs/sql.SparkSession

The entry point to programming Spark with the Dataset and DataFrame API. In environments where a session has already been created (e.g. REPL, notebooks), use the builder to get the existing session:

Constructor

new SparkSession()

Source:
Examples
SparkSession.builder().getOrCreate()


The builder can also be used to create a new session:
SparkSession.builder()
    .master("local")
    .appName("Word Count")
    .config("spark.some.config.option", "some-value")
    .getOrCreate()
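
As a rough sketch of how the builder chain above might be driven from a plain options object: `buildSession` and the shape of `options` are hypothetical and not part of EclairJS. Only `builder()`, `master()`, `appName()`, `config()` and `getOrCreate()` are taken from the examples above. The `SparkSession` class is passed in as an argument, so the snippet assumes nothing about how the module is loaded.

```javascript
// Hypothetical helper: builds (or fetches) a session from a plain options object.
// `SparkSession` is the class documented on this page, supplied by the caller.
function buildSession(SparkSession, options) {
  var builder = SparkSession.builder()
    .master(options.master)
    .appName(options.appName);

  // Apply each key/value pair with a separate config() call.
  var config = options.config || {};
  Object.keys(config).forEach(function (key) {
    builder = builder.config(key, config[key]);
  });

  return builder.getOrCreate();
}
```

Calling `buildSession(SparkSession, {master: "local", appName: "Word Count", config: {"spark.some.config.option": "some-value"}})` would issue the same chain of calls as the second example above.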

Methods

(static) builder() → {Builder}

Creates a SparkSession.Builder for constructing a SparkSession.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
Builder

(static) clearActiveSession() → {Promise.<Void>}

Clears the active SparkSession for the current thread. Subsequent calls to getOrCreate will return the first created context instead of a thread-local override.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
A Promise that resolves to nothing.
Type
Promise.<Void>

(static) clearDefaultSession() → {Promise.<Void>}

Clears the default SparkSession that is returned by the builder.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
A Promise that resolves to nothing.
Type
Promise.<Void>

(static) setActiveSession(session) → {Promise.<Void>}

Changes the SparkSession that will be returned in this thread and its children when SparkSession.getOrCreate() is called. This can be used to ensure that a given thread receives a SparkSession with an isolated session, instead of the global (first created) context.
Parameters:
Name Type Description
session module:eclairjs/sql.SparkSession
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
A Promise that resolves to nothing.
Type
Promise.<Void>

(static) setDefaultSession(session) → {Promise.<Void>}

Sets the default SparkSession that is returned by the builder.
Parameters:
Name Type Description
session module:eclairjs/sql.SparkSession
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
A Promise that resolves to nothing.
Type
Promise.<Void>

baseRelationToDataFrame(baseRelation) → {module:eclairjs/sql.DataFrame}

Converts a BaseRelation created for external data sources into a DataFrame.
Parameters:
Name Type Description
baseRelation module:eclairjs/sql/sources.BaseRelation
Since:
  • EclairJS 0.6 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.DataFrame

createDataFrame(rowRDD_or_values, schema) → {module:eclairjs/sql.Dataset}

Creates a Dataset from an RDD of Rows, or from an array of value arrays, using the given schema.
Parameters:
Name Type Description
rowRDD_or_values module:eclairjs.RDD.<module:eclairjs/sql.Row> | Array.<module:eclairjs/sql.Row> An RDD of Rows or an array of arrays that contain values of valid DataTypes
schema module:eclairjs/sql/types.StructType -
Source:
Returns:
Type
module:eclairjs/sql.Dataset
Example
var df = sqlSession.createDataFrame([[1,1], [1,2], [2,1], [2,1], [2,3], [3,2], [3,3]], schema);
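
The example above uses `schema` without defining it. The sketch below shows one way such a schema might be built. It assumes that a `DataTypes` object mirroring Spark's factory (`createStructField(name, type, nullable)`, `createStructType(fields)`, `IntegerType`) is available from the types module; that API is not documented on this page. The helper name `pairsToDataFrame` and the column names `user` and `item` are hypothetical.

```javascript
// Hypothetical helper: turns an array of [user, item] integer pairs into a Dataset.
// Assumes `DataTypes` mirrors Spark's factory (createStructField / createStructType);
// that API is an assumption and is not documented on this page.
function pairsToDataFrame(session, DataTypes, pairs) {
  var fields = [
    DataTypes.createStructField("user", DataTypes.IntegerType, false),
    DataTypes.createStructField("item", DataTypes.IntegerType, false)
  ];
  var schema = DataTypes.createStructType(fields);

  // Each inner array supplies one value per schema field, in field order.
  return session.createDataFrame(pairs, schema);
}
```

The session and `DataTypes` are passed in as arguments so the snippet does not depend on a particular way of loading the modules.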

createDataFrameFromJson(schema) → {module:eclairjs/sql.Dataset}

Creates a Dataset from an RDD of JSON.
Parameters:
Name Type Description
module:eclairjs.RDD RDD of JSON
schema object object with keys corresponding to JSON field names (or getter functions), and values indicating Datatype
Source:
Returns:
Type
module:eclairjs/sql.Dataset
Example
var df = sqlSession.createDataFrame([{id:1,"name":"jim"},{id:2,"name":"tom"}], {"id":"Integer","name":"String"});

createDataset(data, encoder) → {module:eclairjs/sql.Dataset}

:: Experimental :: Creates a Dataset
Parameters:
Name Type Description
data module:eclairjs/rdd.RDD | Array.<object>
encoder function
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.Dataset

createDatasetFromJson(schema) → {module:eclairjs/sql.Dataset}

Creates a Dataset from an RDD of JSON.
Parameters:
Name Type Description
module:eclairjs.RDD RDD of JSON
schema object object with keys corresponding to JSON field names (or getter functions), and values indicating Datatype
Source:
Returns:
Type
module:eclairjs/sql.Dataset
Example
var df = sqlSession.createDataFrame([{id:1,"name":"jim"},{id:2,"name":"tom"}], {"id":"Integer","name":"String"});

emptyDataset() → {module:eclairjs/sql.Dataset}

:: Experimental :: Creates a new Dataset of type T containing zero elements.
Since:
  • Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.Dataset

newSession() → {module:eclairjs/sql.SparkSession}

Starts a new session with isolated SQL configurations, temporary tables, and registered functions, while sharing the underlying SparkContext and cached data. Note: Other than the SparkContext, all shared state is initialized lazily. This method forces the initialization of the shared state to ensure that parent and child sessions are set up with the same shared state. If the underlying catalog implementation is Hive, this will initialize the metastore, which may take some time.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.SparkSession

range(end) → {module:eclairjs/sql.Dataset}

:: Experimental :: Creates a Dataset with a single LongType column named `id`, containing elements in a range from 0 to `end` (exclusive) with a step value of 1.
Parameters:
Name Type Description
end number
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.Dataset
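
To make the range semantics concrete, here is a plain-JavaScript illustration that does not call Spark at all. It assumes `range(end)` follows Spark's behaviour of starting at 0, stepping by 1 and excluding `end`; the function name `expectedRangeIds` is hypothetical.

```javascript
// Plain-JavaScript illustration (no Spark involved) of the ids that
// session.range(end) is assumed to produce: 0, 1, ..., end - 1.
function expectedRangeIds(end) {
  var ids = [];
  for (var id = 0; id < end; id++) {
    ids.push(id);
  }
  return ids;
}

console.log(expectedRangeIds(5)); // [ 0, 1, 2, 3, 4 ]
```

Under that assumption, `range(0)` would yield an empty Dataset.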

read() → {module:eclairjs/sql.DataFrameReader}

Returns a DataFrameReader that can be used to read non-streaming data in as a DataFrame.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.DataFrameReader
Example
sparkSession.read().parquet("/path/to/file.parquet")
sparkSession.read().schema(schema).json("/path/to/file.json")

readStream() → {module:eclairjs/sql/streaming.DataStreamReader}

:: Experimental :: Returns a DataStreamReader that can be used to read streaming data in as a DataFrame.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql/streaming.DataStreamReader
Example
sparkSession.readStream().parquet("/path/to/directory/of/parquet/files")
sparkSession.readStream().schema(schema).json("/path/to/directory/of/json/files")

sparkContext() → {module:eclairjs/SparkContext}

The underlying SparkContext.
Since:
  • EclairJS 0.6 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/SparkContext

sql(sqlText) → {module:eclairjs/sql.DataFrame}

Executes a SQL query using Spark, returning the result as a DataFrame. The dialect that is used for SQL parsing can be configured with 'spark.sql.dialect'.
Parameters:
Name Type Description
sqlText string
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.DataFrame
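
A minimal sketch of calling `sql()` with query text assembled in JavaScript. The helper `selectColumns` is hypothetical, as are any table and column names passed to it; only `sql(sqlText)` comes from this page.

```javascript
// Hypothetical helper: selects the given columns from a table via session.sql().
// Names are interpolated as-is, so they must come from trusted code, not user input.
function selectColumns(session, tableName, columns) {
  var sqlText = "SELECT " + columns.join(", ") + " FROM " + tableName;
  return session.sql(sqlText);
}
```

For instance, `selectColumns(sparkSession, "people", ["name", "age"])` would pass `SELECT name, age FROM people` to `sql()`, assuming a table named `people` is visible to the session.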

stop() → {Promise.<Void>}

Stop the underlying SparkContext.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
A Promise that resolves to nothing.
Type
Promise.<Void>
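
Because `stop()` returns a Promise, shutdown can be chained after the work that uses the session. The sketch below is one possible pattern, not an EclairJS facility: `withSession` is a hypothetical helper that runs a callback and stops the session afterwards whether the callback succeeded or failed.

```javascript
// Hypothetical helper: runs work(session), then stops the session in both the
// success and the failure case, preserving the original result or error.
function withSession(session, work) {
  return Promise.resolve()
    .then(function () { return work(session); })
    .then(
      function (result) {
        // Work succeeded: stop, then hand back the result.
        return session.stop().then(function () { return result; });
      },
      function (err) {
        // Work failed: stop, then rethrow the original error.
        return session.stop().then(function () { throw err; });
      }
    );
}
```

The callback may return either a plain value or a Promise; in both cases the session is stopped only after the work has settled.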

streams() → {module:eclairjs/sql/streaming.StreamingQueryManager}

:: Experimental :: Returns a StreamingQueryManager that allows managing all the StreamingQueries active on this session.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql/streaming.StreamingQueryManager

table(tableName) → {module:eclairjs/sql.DataFrame}

Returns the specified table as a DataFrame.
Parameters:
Name Type Description
tableName string
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.DataFrame

udf() → {module:eclairjs/sql.UDFRegistration}

A collection of methods for registering user-defined functions (UDFs). Note that user-defined functions must be deterministic. Due to optimization, duplicate invocations may be eliminated, or the function may even be invoked more times than it appears in the query. The following example registers a Scala closure as a UDF:
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
module:eclairjs/sql.UDFRegistration
Examples
sparkSession.udf.register("myUDF", (arg1: Int, arg2: String) => arg2 + arg1)


The following example registers a UDF in Java:
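sparkSession.udf().register("myUDF",
      new UDF2<Integer, String, String>() {
          @Override
          public String call(Integer arg1, String arg2) {
              return arg2 + arg1;
          }
      }, DataTypes.StringType);

Or, to use Java 8 lambda syntax:
sparkSession.udf().register("myUDF",
      (Integer arg1, String arg2) -> arg2 + arg1,
      DataTypes.StringType);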

version() → {Promise.<string>}

The version of Spark on which this application is running.
Since:
  • EclairJS 0.7 Spark 2.0.0
Source:
Returns:
Type
Promise.<string>
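
Since `version()` resolves asynchronously, a version check has to run inside the Promise chain. The following is a sketch under stated assumptions: both `isAtLeast` and `requireSparkVersion` are hypothetical helpers, and the version string is assumed to begin with `major.minor` (as in `2.0.0`).

```javascript
// Returns true when a version string such as "2.0.0" is at least major.minor.
function isAtLeast(version, major, minor) {
  var parts = version.split(".").map(Number);
  return parts[0] > major || (parts[0] === major && parts[1] >= minor);
}

// Hypothetical helper: resolves to the session if its Spark version is new enough,
// otherwise rejects with an Error. version() returns a Promise, so the check is async.
function requireSparkVersion(session, major, minor) {
  return session.version().then(function (version) {
    if (!isAtLeast(version, major, minor)) {
      throw new Error("Spark " + major + "." + minor + "+ required, found " + version);
    }
    return session;
  });
}
```

A caller could then write `requireSparkVersion(sparkSession, 2, 0).then(function (session) { /* use session */ })`.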