JSDoc: Class: SparkSession

Constructor

new SparkSession()

Source:

eclairjs/sql/SparkSession.js, line 46

Examples

SparkSession.builder().getOrCreate()
 

The builder can also be used to create a new session:

SparkSession.builder()
    .master("local")
    .appName("Word Count")
    .config("spark.some.config.option", "some-value")
    .getOrCreate()

Methods

baseRelationToDataFrame(baseRelation) → {DataFrame}

Converts a BaseRelation created for external data sources into a DataFrame.

Parameters:

Name	Type	Description
`baseRelation`	module:eclairjs/sql/sources.BaseRelation

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 169

Returns:

Type: DataFrame

builder() → {module:eclairjs/sql.SparkSessionBuilder}

Creates a module:eclairjs/sql.SparkSessionBuilder for constructing a module:eclairjs/sql.SparkSession.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 382

Returns:

Type: module:eclairjs/sql.SparkSessionBuilder

clearActiveSession()

Clears the active SparkSession for the current thread. Subsequent calls to getOrCreate will return the first created context instead of a thread-local override.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 406

clearDefaultSession()

Clears the default SparkSession that is returned by the builder.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 428

createDataFrame(rowRDD_or_values, schema) → {module:eclairjs/sql.DataFrame}

Creates a Dataset from an RDD of Rows using the given schema.

Parameters:

Name	Type	Description
`rowRDD_or_values`	module:eclairjs.RDD.<module:eclairjs/sql.Row> | Array.<module:eclairjs/sql.Row>	An RDD of Rows, or an array of arrays containing values of valid DataTypes
`schema`	module:eclairjs/sql/types.StructType	-

Source:

eclairjs/sql/SparkSession.js, line 145

Returns:

Type: module:eclairjs/sql.DataFrame

Example

var df = sqlSession.createDataFrame([[1,1], [1,2], [2,1], [2,1], [2,3], [3,2], [3,3]], schema);

createDataFrameFromJson(schema) → {module:eclairjs/sql.Dataset}

Creates a Dataset from an RDD of JSON.

Parameters:

Name	Type	Description
-	module:eclairjs.RDD	RDD of JSON
`schema`	object	object with keys corresponding to JSON field names (or getter functions), and values indicating Datatype

Source:

eclairjs/sql/SparkSession.js, line 157

Returns:

Type: module:eclairjs/sql.Dataset

Example

var df = sqlSession.createDataFrameFromJson([{id:1,"name":"jim"},{id:2,"name":"tom"}], {"id":"Integer","name":"String"});

emptyDataset() → {module:eclairjs/sql.Dataset}

:: Experimental ::
Creates a new Dataset of type T containing zero elements.

Since:

Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 128

Returns:

Type: module:eclairjs/sql.Dataset

newSession() → {module:eclairjs/sql.SparkSession}

Starts a new session in which SQL configurations, temporary tables, and registered functions are isolated, but which shares the underlying SparkContext and cached data.

Note: Other than the SparkContext, all shared state is initialized lazily. This method will force the initialization of the shared state to ensure that parent and child sessions are set up with the same shared state. If the underlying catalog implementation is Hive, this will initialize the metastore, which may take some time.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 108

Returns:

Type: module:eclairjs/sql.SparkSession

range(tableName, start, end, step, numPartitions) → {module:eclairjs/sql.Dataset}

:: Experimental ::
Creates a Dataset with a single LongType column named `id`, containing elements in the range from `start` to `end` (exclusive), using the given step value and number of partitions.

Parameters:

Name	Type	Attributes	Description
`tableName`	string
`start`	number
`end`	number
`step`	number	<optional>
`numPartitions`	number	<optional>

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 268

Returns:

Type: module:eclairjs/sql.Dataset
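
To make the `start`, `end` and `step` semantics of range concrete, the sketch below lists the `id` values such a range would contain. It is plain JavaScript and does not call EclairJS. `rangeIds` is a hypothetical helper, and it assumes that `start` is inclusive and that `step` defaults to 1 when omitted, as in Spark's own `range`. `numPartitions` is left out because it is assumed to affect only how rows are distributed, not which values appear.

```javascript
// Hypothetical helper (not part of EclairJS): lists the ids a range would hold.
function rangeIds(start, end, step) {
    if (step === undefined) { step = 1; }            // assumed default step
    if (step === 0) { throw new Error("step must not be 0"); }
    var ids = [];
    // Walk upward for a positive step and downward for a negative one;
    // `end` itself is never included.
    for (var id = start; step > 0 ? id < end : id > end; id += step) {
        ids.push(id);
    }
    return ids;
}

var firstFive = rangeIds(0, 5);       // [0, 1, 2, 3, 4]
var everyThird = rangeIds(1, 10, 3);  // [1, 4, 7]
```

Note that `end` is excluded in both calls: 5 does not appear in the first result, and 10 would not appear in the second even if the step landed on it.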

read() → {module:eclairjs/sql.DataFrameReader}

Returns a DataFrameReader that can be used to read non-streaming data in as a DataFrame.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 329

Returns:

Type: module:eclairjs/sql.DataFrameReader

Example

sparkSession.read().parquet("/path/to/file.parquet")
sparkSession.read().schema(schema).json("/path/to/file.json")

readStream() → {module:eclairjs/sql/streaming.DataStreamReader}

:: Experimental ::
Returns a DataStreamReader that can be used to read streaming data in as a DataFrame.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 350

Returns:

Type: module:eclairjs/sql/streaming.DataStreamReader

Example

sparkSession.readStream().parquet("/path/to/directory/of/parquet/files")
sparkSession.readStream().schema(schema).json("/path/to/directory/of/json/files")

setActiveSession(session)

Changes the SparkSession that will be returned in this thread and its children when SparkSession.getOrCreate() is called. This can be used to ensure that a given thread receives a SparkSession with an isolated session, instead of the global (first created) context.

Parameters:

Name	Type	Description
`session`	module:eclairjs/sql.SparkSession

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 393

setDefaultSession(session)

Sets the default SparkSession that is returned by the builder.

Parameters:

Name	Type	Description
`session`	module:eclairjs/sql.SparkSession

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 417
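
The descriptions of setActiveSession, clearActiveSession, setDefaultSession and clearDefaultSession suggest a two-slot lookup: an active (thread-local) session that takes precedence, and a default session used otherwise. The sketch below models that reading in plain JavaScript. It is not EclairJS code. `SessionRegistry` and `resolve` are hypothetical names, strings stand in for session objects, and the thread-local slot is simplified to a single field.

```javascript
// Hypothetical model of the lookup order (not the EclairJS implementation).
function SessionRegistry() {
    this.active = null;    // stands in for the thread-local override
    this.fallback = null;  // stands in for the global default session
}
SessionRegistry.prototype.setActiveSession = function (s) { this.active = s; };
SessionRegistry.prototype.clearActiveSession = function () { this.active = null; };
SessionRegistry.prototype.setDefaultSession = function (s) { this.fallback = s; };
SessionRegistry.prototype.clearDefaultSession = function () { this.fallback = null; };

// What a getOrCreate-style lookup would hand back; null means that
// no existing session is available and a new one would be needed.
SessionRegistry.prototype.resolve = function () {
    return this.active !== null ? this.active : this.fallback;
};

var registry = new SessionRegistry();
registry.setDefaultSession("first-created");
registry.setActiveSession("isolated");
var whileActive = registry.resolve();   // "isolated"
registry.clearActiveSession();
var afterClear = registry.resolve();    // "first-created"
```

Under this model, clearing the active session does not remove the default one, which matches the statement that subsequent calls to getOrCreate fall back to the first created context.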

sparkContext() → {module:eclairjs.SparkContext}

The underlying SparkContext.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 60

Returns:

Type: module:eclairjs.SparkContext

sql(sqlText) → {module:eclairjs/sql.Dataset}

Executes a SQL query using Spark, returning the result as a module:eclairjs/sql.Dataset. The dialect that is used for SQL parsing can be configured with 'spark.sql.dialect'.

Parameters:

Name	Type	Description
`sqlText`	string

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 310

Returns:

Type: module:eclairjs/sql.Dataset

stop()

Stops the underlying module:eclairjs.SparkContext.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 369

streams() → {module:eclairjs/sql/streaming.StreamingQueryManager}

:: Experimental ::
Returns a StreamingQueryManager that allows managing all the StreamingQueries active on this session.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 90

Returns:

Type: module:eclairjs/sql/streaming.StreamingQueryManager

table(tableName) → {module:eclairjs/sql.Dataset}

Returns the specified table as a module:eclairjs/sql.Dataset.

Parameters:

Name	Type	Description
`tableName`	string

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 293

Returns:

Type: module:eclairjs/sql.Dataset

udf() → {module:eclairjs/sql.UDFRegistration}

A collection of methods for registering user-defined functions (UDF). Note that the user-defined functions must be deterministic. Due to optimization, duplicate invocations may be eliminated or the function may even be invoked more times than it is present in the query.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 69

Returns:

Type: module:eclairjs/sql.UDFRegistration
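
The determinism requirement can be illustrated without Spark. The sketch below is plain JavaScript and registers nothing with EclairJS. `plusOne`, `plusCallCount` and `evaluateTwice` are hypothetical names, and `evaluateTwice` merely simulates an engine that happens to evaluate the same expression more than once.

```javascript
// Deterministic: the result depends only on the argument.
function plusOne(x) {
    return x + 1;
}

// Non-deterministic: hidden state changes the result on every call.
var calls = 0;
function plusCallCount(x) {
    calls += 1;
    return x + calls;
}

// Simulates an engine that evaluates the same call twice.
function evaluateTwice(fn, arg) {
    return [fn(arg), fn(arg)];
}

var stable = evaluateTwice(plusOne, 10);          // [11, 11]
var unstable = evaluateTwice(plusCallCount, 10);  // [11, 12]
```

With the deterministic function, the number of invocations does not change the result. With the stateful one, the output depends on how many times the function happened to be called, which is what the note above warns about.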

version() → {string}

The version of Spark on which this application is running.

Since:

EclairJS 0.6 Spark 2.0.0

Source:

eclairjs/sql/SparkSession.js, line 50

Returns:

Type: string
