Class: SQLContext

eclairjs/sql.SQLContext

The entry point for working with structured data (rows and columns) in Spark. Allows the creation of DataFrame objects as well as the execution of SQL queries.

Constructor

new SQLContext(sparkContext)

Parameters:
Name Type Description
sparkContext module:eclairjs.SparkContext
Since:
  • EclairJS 0.1 Spark 1.0.0

Methods

(static) clearActive()

Clears the active SQLContext for the current thread. Subsequent calls to getOrCreate return the first created context instead of a thread-local override.
Since:
  • EclairJS 0.1 Spark 1.6.0

(static) getOrCreate(sparkContext) → {module:eclairjs/sql.SQLContext}

Get the singleton SQLContext if it exists or create a new one using the given SparkContext. This function can be used to create a singleton SQLContext object that can be shared across the JVM.
Parameters:
Name Type Description
sparkContext module:eclairjs.SparkContext
Returns:
Type
module:eclairjs/sql.SQLContext
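
The lookup order used by getOrCreate, together with the setActive and clearActive methods of this class, can be pictured with a small plain-JavaScript model. This is only a sketch of the documented behavior, not EclairJS code: ContextRegistry and the factory callback are hypothetical names, and the per-thread override is collapsed into a single slot.

```javascript
// Plain-JavaScript model of the lookup order; not EclairJS code.
// ContextRegistry is a hypothetical name used only for this illustration.
function ContextRegistry() {
    this.first = null;   // the global (first created) context
    this.active = null;  // override installed by setActive, if any
}

// Return the active override if one is set, otherwise the first created
// context, creating it with the supplied factory when none exists yet.
ContextRegistry.prototype.getOrCreate = function (createContext) {
    if (this.active !== null) {
        return this.active;
    }
    if (this.first === null) {
        this.first = createContext();
    }
    return this.first;
};

// Install an override that getOrCreate returns from now on.
ContextRegistry.prototype.setActive = function (context) {
    this.active = context;
};

// Remove the override so getOrCreate falls back to the first context.
ContextRegistry.prototype.clearActive = function () {
    this.active = null;
};

var registry = new ContextRegistry();
var globalCtx = registry.getOrCreate(function () { return { name: "global" }; });
var sessionCtx = { name: "session" };

registry.setActive(sessionCtx);
var duringOverride = registry.getOrCreate(function () { return { name: "unused" }; });

registry.clearActive();
var afterClear = registry.getOrCreate(function () { return { name: "unused" }; });
```

In this model, duringOverride is the session object and afterClear is the first created object again, which mirrors the wording of the setActive and clearActive descriptions.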

(static) setActive(sqlContext)

Changes the SQLContext that will be returned in this thread and its children when SQLContext.getOrCreate() is called. This can be used to ensure that a given thread receives a SQLContext with an isolated session, instead of the global (first created) context.
Parameters:
Name Type Description
sqlContext module:eclairjs/sql.SQLContext
Since:
  • EclairJS 0.1 Spark 1.6.0

cacheTable(tableName)

Caches the specified table in-memory.
Parameters:
Name Type Description
tableName string
Since:
  • EclairJS 0.1 Spark 1.3.0

clearCache()

Removes all cached tables from the in-memory cache.
Since:
  • EclairJS 0.1 Spark 1.3.0

createDataFrame(rowRDD_or_values, schema) → {module:eclairjs/sql.DataFrame}

Creates a DataFrame from an RDD of Rows, or from an array of value arrays, using the given schema.
Parameters:
Name Type Description
rowRDD_or_values module:eclairjs.RDD.<module:eclairjs/sql.Row> | Array.<module:eclairjs/sql.Row> An RDD of Rows, or an array of arrays that contain values of valid DataTypes
schema module:eclairjs/sql/types.StructType -
Returns:
Type
module:eclairjs/sql.DataFrame
Example
var df = sqlContext.createDataFrame([[1,1], [1,2], [2,1], [2,1], [2,3], [3,2], [3,3]], schema);

createDataFrameFromJson(schema) → {module:eclairjs/sql.DataFrame}

Creates a DataFrame from an RDD of JSON.
Parameters:
Name Type Description
(unnamed) module:eclairjs.RDD RDD of JSON
schema object Object with keys corresponding to JSON field names (or getter functions), and values indicating the DataType
Returns:
Type
module:eclairjs/sql.DataFrame
Example
var df = sqlContext.createDataFrame([{id:1,"name":"jim"},{id:2,"name":"tom"}], {"id":"Integer","name":"String"});

createExternalTable(tableName, path, sourceopt, mapopt) → {module:eclairjs/sql.DataFrame}

:: Experimental :: Creates an external table from the given path and returns the corresponding DataFrame. It will use the default data source configured by spark.sql.sources.default.
Parameters:
Name Type Attributes Description
tableName string
path string
source string <optional>
The data source used to create the external table from the given path.
map object <optional>
Options as (key, value) pairs; if specified, path is ignored.
Since:
  • EclairJS 0.1 Spark 1.3.0
Returns:
Type
module:eclairjs/sql.DataFrame

dropTempTable(tableName)

Drops the temporary table with the given table name in the catalog. If the table has been cached/persisted before, it's also unpersisted.
Parameters:
Name Type Description
tableName string the name of the table to be unregistered.
Since:
  • EclairJS 0.1 Spark 1.3.0

getAllConfs() → {object}

Return all the configuration properties that have been set (i.e. not the default). This creates a new copy of the config properties in the form of a map.
Since:
  • EclairJS 0.1 Spark 1.0.0
Returns:
map of the key value pairs
Type
object

getConf(key, defaultValueopt) → {string}

Return the value of Spark SQL configuration property for the given key. If the key is not set yet, return `defaultValue`.
Parameters:
Name Type Attributes Description
key string
defaultValue string <optional>
Since:
  • EclairJS 0.1 Spark 1.0.0
Returns:
Type
string

isCached(tableName) → {boolean}

Returns true if the table is currently cached in-memory.
Parameters:
Name Type Description
tableName string
Since:
  • EclairJS 0.1 Spark 1.3.0
Returns:
Type
boolean

newSession() → {module:eclairjs/sql.SQLContext}

Returns a SQLContext as a new session with separate SQL configurations, temporary tables, and registered functions, but sharing the same SparkContext, CacheManager, SQLListener and SQLTab.
Since:
  • EclairJS 0.1 Spark 1.6.0
Returns:
Type
module:eclairjs/sql.SQLContext

range(start, end, stepopt, numPartitionsopt) → {module:eclairjs/sql.DataFrame}

:: Experimental :: Creates a DataFrame with a single LongType column named `id`, containing elements in a range from `start` to `end` (exclusive) with step value `step`.
Parameters:
Name Type Attributes Description
start integer
end integer
step integer <optional>
defaults to 1
numPartitions integer <optional>
Since:
  • EclairJS 0.1 Spark 1.4.1
Returns:
Type
module:eclairjs/sql.DataFrame
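
To make the `start`, `end` (exclusive) and `step` semantics concrete, the following plain-JavaScript sketch lists the `id` values such a range would hold. It is a model only, not EclairJS code: rangeIds is a hypothetical helper, it ignores numPartitions, and it assumes a positive step.

```javascript
// Plain-JavaScript model of the id values in a range; not EclairJS code.
// rangeIds is a hypothetical helper name used only for this illustration.
function rangeIds(start, end, step) {
    if (step === undefined) {
        step = 1; // documented default
    }
    if (step <= 0) {
        // Assumption of this model: only positive steps are covered.
        throw new Error("this model only covers positive steps");
    }
    var ids = [];
    for (var id = start; id < end; id += step) { // end is exclusive
        ids.push(id);
    }
    return ids;
}

var byOne = rangeIds(0, 5);       // [0, 1, 2, 3, 4]
var byThree = rangeIds(0, 10, 3); // [0, 3, 6, 9]
```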

read() → {module:eclairjs/sql.DataFrameReader}

:: Experimental :: Returns a DataFrameReader that can be used to read data in as a DataFrame.
Since:
  • EclairJS 0.1 Spark 1.4.0
Returns:
Type
module:eclairjs/sql.DataFrameReader
Example
sqlContext.read().parquet("/path/to/file.parquet");
sqlContext.read().schema(schema).json("/path/to/file.json");

setConf(prop)

Set Spark SQL configuration properties.
Parameters:
Name Type Description
prop string | object If a string, the name of the property, which is set to the value passed as the second argument (see the example). If an object, each of its property names is used as a key and the corresponding property value as the value.
Since:
  • EclairJS 0.1 Spark 1.0.0
Example
sqlContext.setConf("dog", "Golden Retriever");
var map = {"dog": "Golden Retriever", "age": "> 3"};
sqlContext.setConf(map);
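
The two calling styles of setConf, and the way getConf falls back to a default and getAllConfs returns a copy, can be modeled in plain JavaScript as follows. This is a sketch of the documented behavior only: ConfStore is a hypothetical stand-in and does not reflect how EclairJS stores its configuration.

```javascript
// Plain-JavaScript model of the configuration calls; not EclairJS code.
// ConfStore is a hypothetical name used only for this illustration.
function ConfStore() {
    this.settings = {};
}

// Accepts either (name, value) or a single object of key/value pairs.
ConfStore.prototype.setConf = function (prop, value) {
    if (typeof prop === "string") {
        this.settings[prop] = value;
        return;
    }
    var keys = Object.keys(prop);
    for (var i = 0; i < keys.length; i++) {
        this.settings[keys[i]] = prop[keys[i]];
    }
};

// Returns the stored value, or defaultValue when the key has not been set.
ConfStore.prototype.getConf = function (key, defaultValue) {
    if (Object.prototype.hasOwnProperty.call(this.settings, key)) {
        return this.settings[key];
    }
    return defaultValue;
};

// Returns a new copy, so changes to the result do not affect the store.
ConfStore.prototype.getAllConfs = function () {
    var copy = {};
    var keys = Object.keys(this.settings);
    for (var i = 0; i < keys.length; i++) {
        copy[keys[i]] = this.settings[keys[i]];
    }
    return copy;
};

var store = new ConfStore();
store.setConf("dog", "Golden Retriever");
store.setConf({"dog": "Golden Retriever", "age": "> 3"});

var age = store.getConf("age");
var missing = store.getConf("cat", "fallback");
var snapshot = store.getAllConfs();
snapshot.dog = "changed"; // modifies the copy only
```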

sparkContext() → {module:eclairjs.SparkContext}

Returns the SparkContext
Since:
  • EclairJS 0.1
Returns:
Type
module:eclairjs.SparkContext

sql(sqlText) → {module:eclairjs/sql.DataFrame}

Executes a SQL query using Spark, returning the result as a DataFrame. The dialect that is used for SQL parsing can be configured with 'spark.sql.dialect'.
Parameters:
Name Type Description
sqlText string
Since:
  • EclairJS 0.1 Spark 1.3.0
Returns:
Type
module:eclairjs/sql.DataFrame

table(tableName) → {module:eclairjs/sql.DataFrame}

Returns the specified table as a DataFrame.
Parameters:
Name Type Description
tableName string
Since:
  • EclairJS 0.1 Spark 1.3.0
Returns:
Type
module:eclairjs/sql.DataFrame

tableNames(databaseNameopt) → {module:eclairjs/sql.DataFrame}

Returns the names of tables in the database as an array.
Parameters:
Name Type Attributes Description
databaseName string <optional>
if not specified the current database is used.
Since:
  • EclairJS 0.1 Spark 1.3.0
Returns:
Type
module:eclairjs/sql.DataFrame

tables(databaseNameopt) → {module:eclairjs/sql.DataFrame}

Returns a DataFrame containing names of existing tables in the database. The returned DataFrame has two columns, tableName and isTemporary (a Boolean indicating if a table is a temporary one or not).
Parameters:
Name Type Attributes Description
databaseName string <optional>
if not specified the current database is used.
Since:
  • EclairJS 0.1 Spark 1.3.0
Returns:
Type
module:eclairjs/sql.DataFrame

udf() → {module:eclairjs/sql.UDFRegistration}

Returns a UDFRegistration used to register user-defined functions (UDFs).
Returns:
Type
module:eclairjs/sql.UDFRegistration
Example
// Two column arguments are shown; the same pattern extends to more columns (col1 through col22).
sqlContext.udf().register("udfTest", function(col1, col2) {
      return col1 + col2;
}, DataTypes.StringType);
var smt = "SELECT *, udfTest(mytable.col1, mytable.col2) as transformedByUDF FROM mytable";
var result = sqlContext.sql(smt).collect();

uncacheTable(tableName)

Removes the specified table from the in-memory cache.
Parameters:
Name Type Description
tableName string
Since:
  • EclairJS 0.1 Spark 1.3.0