Manage your data with Stash
You can find this tutorial in the demos folder of your Jupyter notebook environment.
- stash_tutorial.ipynb
The Camber stash package offers an interface to pass your data, code, and analysis between your personal cloud storage and Camber’s public cloud storage.
There are two types of stashes:
- private: your personal cloud storage, which also mirrors your notebook’s local filesystem.
- public: a read-only cloud storage that all Camber users have access to, also known as the “Open Stash”.
Each stash is initialized with its own current working directory:
- private: equivalent to the $HOME of your Jupyter notebook environment, or /home/{username}.
- public: the cloud storage location that Camber uses to provision resources such as datasets.
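Because the private stash's working directory is the notebook's $HOME, a path such as ~/demos should correspond to a local path under your home directory. The sketch below illustrates that mapping with the Python standard library only. The helper local_path_for is hypothetical and not part of the Camber API, and it assumes that private stash paths map one-to-one onto the notebook filesystem.

```python
import os


def local_path_for(stash_path: str) -> str:
    """Expand a private-stash style path (e.g. "~/demos") to a local path.

    Assumption: the private stash mirrors the notebook filesystem, so "~"
    means the same thing in both places. Illustrative helper only.
    """
    return os.path.expanduser(stash_path)


# Prints the demos path under the current user's home directory.
print(local_path_for("~/demos"))
```

Paths that do not start with ~ are returned unchanged, so the helper is safe to call on absolute paths as well.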
In this tutorial, follow along to learn how to use stash to:
- view files and directories in your stash.
- transfer data from open to private stashes.
View files and directories
First, import Camber and assign variables to your stashes.
import camber
prv_stash = camber.stash.private
pub_stash = camber.stash.public
Inspect the home directories of your stashes with the ls method:
print("private stash data:", prv_stash.ls("~"))
print("public stash data:", pub_stash.ls("~"))
Note that demos/ is included in the results of your private stash ls. This is the private stash mirroring your notebook’s local filesystem, as described above.
You are welcome to use shell commands to manipulate files in your Jupyter notebook filesystem. However, using Stash lets you interface with other cloud storage more efficiently, as the next section shows.
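To check for an entry such as demos/ programmatically rather than by reading the printed listing, a small helper can scan the result of ls. This is a sketch under an assumption: the tutorial does not state what ls returns, so the code assumes an iterable whose items identify paths when converted to strings. The name listing_contains is hypothetical and not part of the Camber API.

```python
def listing_contains(stash, directory: str, name: str) -> bool:
    """Return True if `name` appears in any entry of stash.ls(directory).

    Assumption: ls() returns an iterable whose items identify paths when
    converted to strings; the real return type may differ.
    """
    return any(name in str(entry) for entry in stash.ls(directory))


# Usage inside a Camber notebook (not run here):
#   import camber
#   listing_contains(camber.stash.private, "~", "demos")
```

The check is a plain substring match, so it is deliberately loose; tighten it once you know the exact form of the entries that ls returns.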
Copy from open to private stashes
The datasets directory in the public stash holds datasets that are managed by Camber.
Use ls to list the files in the open stash’s tutorial/ dataset:
pub_stash.ls("~/datasets/tutorial")
The public stash is read-only. To manipulate an open dataset, you need to copy it to your private stash.
Before doing that, though, make a directory in your Jupyter space called stash-tutorial. This helps keep your private stash organized.
!mkdir -p ~/demos/20-tutorials/01-stash/stash-tutorial
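If you prefer to stay in Python instead of using a shell escape, the same directory can be created with pathlib from the standard library. This is a minimal sketch; ensure_dir is a hypothetical helper name, not something provided by Camber.

```python
from pathlib import Path


def ensure_dir(path: str) -> Path:
    """Create `path` and any missing parents if needed, like `mkdir -p`."""
    target = Path(path).expanduser()
    # parents=True creates intermediate directories;
    # exist_ok=True makes repeated calls harmless.
    target.mkdir(parents=True, exist_ok=True)
    return target


# Usage in the notebook (same directory as the shell command above):
#   ensure_dir("~/demos/20-tutorials/01-stash/stash-tutorial")
```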
Now use the cp method to copy the cereal dataset from the open stash to the stash-tutorial/ directory in your private stash:
pub_stash.cp(
    dest_stash=prv_stash,
    src_path="~/datasets/tutorial/cereal.csv",
    dest_path="~/demos/20-tutorials/01-stash/stash-tutorial/cereal.csv",
)
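When you need more than one file, the same cp call can be wrapped in a loop. The sketch below uses only the keyword arguments shown in this tutorial (dest_stash, src_path, dest_path). The helper copy_files_to_private is hypothetical, and any file name other than cereal.csv would be a placeholder for files you have confirmed exist with ls.

```python
def copy_files_to_private(pub_stash, prv_stash, src_dir, dest_dir, filenames):
    """Copy each named file from the public stash into the private stash.

    Only the cp() keyword arguments shown in this tutorial are used.
    Returns the list of destination paths that were requested.
    """
    copied = []
    for name in filenames:
        # Strip trailing slashes so the joined paths never contain "//".
        dest_path = f"{dest_dir.rstrip('/')}/{name}"
        pub_stash.cp(
            dest_stash=prv_stash,
            src_path=f"{src_dir.rstrip('/')}/{name}",
            dest_path=dest_path,
        )
        copied.append(dest_path)
    return copied


# Usage inside a Camber notebook (not run here):
#   copy_files_to_private(
#       pub_stash, prv_stash,
#       "~/datasets/tutorial",
#       "~/demos/20-tutorials/01-stash/stash-tutorial",
#       ["cereal.csv"],
#   )
```

Passing the stashes in as arguments keeps the helper independent of how they were created, which also makes it easy to try out with a stand-in object.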
Confirm that it’s in your private stash:
prv_stash.ls("~/demos/20-tutorials/01-stash/stash-tutorial")
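Since the private stash mirrors the notebook filesystem, the copied file should also be readable with ordinary Python file tools. The following sketch previews the first rows of a CSV file using only the standard library. The helper preview_csv is hypothetical, and nothing is assumed about the columns of cereal.csv.

```python
import csv
from pathlib import Path


def preview_csv(path: str, n_rows: int = 5) -> list:
    """Return the header row plus up to `n_rows` data rows from a CSV file."""
    with Path(path).expanduser().open(newline="") as handle:
        reader = csv.reader(handle)
        # zip stops after n_rows + 1 rows (header included) or at end of file.
        return [row for _, row in zip(range(n_rows + 1), reader)]


# Usage in the notebook, after the copy has completed (not run here):
#   for row in preview_csv(
#       "~/demos/20-tutorials/01-stash/stash-tutorial/cereal.csv"
#   ):
#       print(row)
```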
Read more
- Plot a Histogram from GAIA. A more sophisticated example of analysis from a dataset on the Open stash.
- Stash API reference