camber.spark
The Camber spark module has methods to create a Spark cluster that you can access directly from your Camber JupyterHub or notebook terminal.
After you connect to the cluster, you can also access all operations available in the PySpark API.
import camber.spark
Methods
connect
Creates a SparkSession object.
Example:
import camber
spark = camber.spark.connect()
# ... spark magic ...
spark.stop()
Args
engine_size: Optional[str] - Size of engine, one of XSMALL, SMALL, MEDIUM, LARGE. Default is XSMALL.
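Because engine_size accepts only a fixed set of values, it can help to validate the value before requesting a cluster. The following is a minimal sketch, not part of the Camber API: the helper names normalize_engine_size and connect_with_size are hypothetical, and it assumes that the sizes are passed as the uppercase strings listed above through the engine_size keyword.

```python
# Sizes documented for camber.spark.connect; XSMALL is the documented default.
VALID_ENGINE_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE")


def normalize_engine_size(size=None):
    """Return an uppercase engine size, or raise ValueError if it is unknown.

    Hypothetical helper: None falls back to the documented default, XSMALL.
    """
    if size is None:
        return "XSMALL"
    candidate = str(size).strip().upper()
    if candidate not in VALID_ENGINE_SIZES:
        raise ValueError(
            f"engine_size must be one of {VALID_ENGINE_SIZES}, got {size!r}"
        )
    return candidate


def connect_with_size(size=None):
    """Validate the size first, then create the SparkSession.

    Assumption: connect() takes the size as the keyword engine_size
    with a plain string value such as "SMALL".
    """
    import camber.spark  # imported lazily so validation works without Camber

    return camber.spark.connect(engine_size=normalize_engine_size(size))
```

Validating up front means a typo such as "HUGE" fails immediately with a readable message instead of after a cluster request has been sent.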
Returns
SparkSession - The created SparkSession object.
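Since the returned SparkSession has to be stopped explicitly, one option is to wrap the connect and stop calls in a context manager so that the session is released even if the work in between fails. This is a sketch under assumptions rather than a Camber feature: spark_session is a hypothetical helper, and it takes the connect function as an argument so that it does not depend on how Camber is installed.

```python
from contextlib import contextmanager


@contextmanager
def spark_session(connect, **connect_kwargs):
    """Yield a session created by `connect` and always stop it afterwards.

    `connect` is any callable returning an object with a stop() method,
    for example camber.spark.connect.
    """
    session = connect(**connect_kwargs)
    try:
        yield session
    finally:
        # Runs on normal exit and when the body raises.
        session.stop()


# Possible usage inside a Camber notebook (assumes the camber package is available):
#
#   import camber.spark
#   with spark_session(camber.spark.connect, engine_size="SMALL") as spark:
#       spark.range(5).show()
```

Passing the connect function in, instead of importing it inside the helper, also makes the wrapper easy to exercise with a stand-in object in tests.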
Note: the Stash also has methods to read and write data between stashes and Spark DataFrames.