camber.spark

The Camber spark module provides methods for creating a Spark cluster that you can access directly from your Camber JupyterHub or notebook terminal. After you connect to the cluster, you can also use any operation available in the PySpark API.

import camber.spark

Methods

connect

Creates a SparkSession object.

Example:

import camber
spark = camber.spark.connect()
# ... spark magic ...
spark.stop()

Args

engine_size: Optional[str]
Size of the engine. One of XSMALL, SMALL, MEDIUM, or LARGE.
Default is XSMALL.

Returns

SparkSession
The created SparkSession object.
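The connect and stop calls described above can be paired so the session is always stopped, even if the work in between raises an error. The following is a minimal sketch, not part of the Camber API: the names spark_session, normalize_engine_size, and VALID_ENGINE_SIZES are hypothetical helpers, and it assumes engine_size is passed as a keyword argument holding one of the uppercase strings listed under Args. Accepting lowercase input is a convenience of this sketch only.

```python
from contextlib import contextmanager

# Sizes listed under Args; assumed here to be passed as plain strings.
VALID_ENGINE_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE")


def normalize_engine_size(size="XSMALL"):
    """Return the size in upper case, or raise ValueError if it is not listed."""
    name = size.strip().upper()
    if name not in VALID_ENGINE_SIZES:
        raise ValueError(
            f"engine_size must be one of {VALID_ENGINE_SIZES}, got {size!r}"
        )
    return name


@contextmanager
def spark_session(engine_size="XSMALL", connect=None):
    """Yield a session and stop it on exit (hypothetical helper).

    `connect` defaults to camber.spark.connect; it is a parameter only so
    the helper can be exercised outside a Camber environment.
    """
    if connect is None:
        import camber.spark  # available inside a Camber environment

        connect = camber.spark.connect
    spark = connect(engine_size=normalize_engine_size(engine_size))
    try:
        yield spark
    finally:
        # Runs on normal exit and when the body raises.
        spark.stop()


# Usage inside a Camber notebook (assumes camber is installed):
#
#     with spark_session("SMALL") as spark:
#         spark.range(5).show()
```

Validating the size before calling connect surfaces a typo as a local ValueError before any cluster is requested.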
Note: The Stash also has methods to read data from stashes into a Spark DataFrame and to write a Spark DataFrame back to a stash.