# Workspaces

A cnvrg workspace is an interactive environment for developing and running code.

You can run any kind of interactive development environment, Python scripts and much more. The environment is pre-configured, with all your dependencies preinstalled. All the files and data in your workspace are preserved across restarts. Your workspace has automatic version control and scalable compute, so you can scale compute resources as your data science research requires.

cnvrg.io has built-in support for running JupyterLab, JupyterLab on Spark, R Studio and Visual Studio Code on remote computes.

# Supported Workspaces

cnvrg supports the following workspace types:

  • JupyterLab
  • JupyterLab on Spark
  • R Studio
  • Visual Studio Code

# JupyterLab

JupyterLab is a next-generation web-based user interface for Project Jupyter. JupyterLab enables you to work with documents and activities such as Jupyter notebooks, text editors, terminals, and custom components in a flexible, integrated, and extensible manner.

Learn more about JupyterLab.

# Night Mode in JupyterLab

To switch your JupyterLab notebook to dark mode, click Settings inside the workspace > JupyterLab Theme > JupyterLab Dark.

# JupyterLab on Spark

JupyterLab can also be run on Spark in cnvrg. Choose this option with a compatible Spark Compute template to get started with JupyterLab backed by Spark.

# R Studio

RStudio is an integrated development environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management.

Learn more about R Studio.

# Visual Studio Code

Visual Studio Code is a lightweight but powerful source code editor which runs on your desktop and is available for Windows, macOS and Linux. It comes with built-in support for JavaScript, TypeScript and Node.js and has a rich ecosystem of extensions for other languages (such as C++, C#, Java, Python, PHP, Go) and runtimes (such as .NET and Unity).

Learn more about Visual Studio Code.

# Visual Studio Code settings

You can add a Visual Studio Code settings file to your user account. This will be used to configure any Visual Studio Code workspaces that you start.
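As a minimal sketch, a settings file of this kind is plain JSON. The keys below are standard Visual Studio Code settings picked purely for illustration; they are assumptions, not cnvrg defaults, and how the file is uploaded to your account is not covered here.

```python
import json

# Illustrative VS Code settings. These keys are standard VS Code options,
# chosen as examples only (they are not cnvrg defaults).
settings = {
    "editor.fontSize": 14,
    "editor.tabSize": 4,
    "files.autoSave": "afterDelay",
    "workbench.colorTheme": "Default Dark+",
}


def render_settings(values):
    """Serialize a settings mapping to JSON text."""
    return json.dumps(values, indent=2, sort_keys=True)


settings_json = render_settings(settings)
print(settings_json)
```

The printed text can be saved as a settings file and adjusted to taste before adding it to your user account.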

# Start a Workspace

To start a new workspace:

  1. Open a project then click the Workspaces tab.
  2. Click Start a Workspace.
  3. Choose the workspace type by clicking Change type: JupyterLab, JupyterLab on Spark, R Studio or Visual Studio Code.
  4. Set the Title for the workspace.
  5. Choose one or more Compute templates for your workspace.
  6. (Optional) Choose Datasets to add to your workspace.
  7. Choose a Docker Image to use for your workspace.
  8. (Optional) Choose an existing Volume or create a new one.
  9. Click Start workspace.

# Stop a Workspace

A cnvrg workspace will stay open until closed or until the Max Workspace Duration is reached.

There are two ways to stop a running workspace:

  1. From inside the workspace.
  2. From the Workspaces tab.

To stop a workspace that is loading or running:

  1. Go to the Workspaces tab.
  2. Either:
    • In the table of workspaces, click the menu at the right end of the workspace you want to stop, then click Stop.
    • Open the workspace you want to stop by clicking its name, then click the Stop button that appears at the top of the pane, next to the workspace name.
  3. Confirm the stop action.

# Rerun a Workspace

There are two ways to rerun or re-open a stopped workspace:

  1. From inside the workspace.
  2. From the Workspaces tab.

To rerun a workspace that has stopped:

  1. Go to the Workspaces tab.
  2. Either:
    • In the table of workspaces, click the menu at the right end of the workspace you want to rerun, then click Start.
    • Open the workspace you want to rerun by clicking its name, then click the Start button that appears at the top of the pane, next to the workspace name.
  3. Confirm the run action.

# Delete a Workspace

To delete a workspace:

  1. Go to the Workspaces tab.
  2. In the table of workspaces, click the menu at the right end of the workspace you want to delete, then click Delete.
  3. Confirm the delete action.

# Using Compute

Select the compute for your workspace when starting the workspace.

Choose one or more compute templates you would like to use for your workspace. If the first compute is unavailable, jobs attempt to run on the next compute you have selected. If you only select one compute template, the job waits until it is available to run or until the wait time reaches the max duration.
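The fallback order described above can be pictured with a small sketch. This is only an illustration of the selection rule, not cnvrg's scheduler; the function name `pick_compute`, the availability check and the template names are hypothetical.

```python
from typing import Callable, Optional, Sequence


def pick_compute(
    templates: Sequence[str], is_available: Callable[[str], bool]
) -> Optional[str]:
    """Return the first available template in selection order, or None."""
    for name in templates:
        if is_available(name):
            return name
    # Nothing is free yet: the job would keep waiting (up to the max duration).
    return None


# Hypothetical state: the first choice is busy, so the next one is used.
busy = {"large"}
chosen = pick_compute(["large", "medium", "small"], lambda n: n not in busy)
print(chosen)
```

With a single template selected and that template busy, the same function returns `None`, which corresponds to the waiting case.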

# Using Datasets

When starting a workspace you have the option of including one or more datasets. Choose the datasets you want to use when starting the workspace. You can also choose a specific query or commit of the dataset at this point.

When you choose to clone or attach a dataset, it is mounted in the remote compute and accessible from the absolute path: /data/name_of_dataset/.

For example, if you include your dataset hotdogs, when running an experiment, the dataset can be found at /data/hotdogs/ and all of the dataset files will be found in that directory.
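A short sketch of reading a mounted dataset from code, assuming only the /data/name_of_dataset/ layout described above. The helper name `dataset_files` and its `root` parameter are hypothetical and exist for illustration.

```python
from pathlib import Path


def dataset_files(name, root="/data"):
    """Return sorted relative paths of all files under <root>/<name>/."""
    base = Path(root) / name
    if not base.is_dir():
        return []  # the dataset is not mounted in this environment
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )


# Inside a workspace with the hotdogs dataset attached, this lists its files.
print(dataset_files("hotdogs"))
```

The `root` parameter only makes the helper easy to try outside a workspace; inside one, the default /data matches the mount location.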

Datasets can also be added to a live workspace.

# Using Images

You choose the Docker image that serves as the basis for the workspace's virtual environment.

cnvrg's default Docker images support most standard use cases and are updated regularly. Learn more about using your own custom Docker image to support your specific use case here.

# Workspace Volumes

Workspace Volumes allow you to create a persistent working environment for your workspace. This enables you to keep the exact same files and work even when shutting down workspaces and restarting them.

When a workspace is started without a volume, a fresh environment is created every time and you have to repeat any configuration. With a volume, the exact setup is loaded as you left it. This allows you to work with a workspace as if it were your own laptop, restarting a workspace to find it the same as when you shut it down.

Volumes are specific to projects and can only be used by the project they were created in. A volume has a title and a size (in GB). A volume can only be used by one workspace at a time.

The volume is attached to /cnvrg (the working directory) and /data. This means that only files in these two locations will be persisted in the volume.
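To make the rule concrete, here is a minimal sketch that checks whether a path falls under one of the two persisted locations. The helper name `is_persisted` is hypothetical, and the check assumes absolute, already-normalized POSIX paths (no `..` segments).

```python
from pathlib import PurePosixPath

# The two locations the volume is attached to, as stated above.
PERSISTED_ROOTS = (PurePosixPath("/cnvrg"), PurePosixPath("/data"))


def is_persisted(path: str) -> bool:
    """True if an absolute, normalized path lies in /cnvrg or /data."""
    p = PurePosixPath(path)
    return any(p == root or root in p.parents for root in PERSISTED_ROOTS)


for candidate in ["/cnvrg/train.py", "/data/hotdogs/1.jpg", "/root/.bashrc"]:
    print(candidate, is_persisted(candidate))
```

A file such as /root/.bashrc falls outside both locations, so by the rule above it would not be kept in the volume.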

When a volume is created, the environment will be created according to the regular process. When a volume is reused, the volume will simply be re-attached and no cnvrg cloning, data cloning or git cloning will take place at start up.

# Create a volume

You can create a workspace volume from the workspace creation page:

  1. On a project's dashboard, or from the Workspaces tab, click Start a Workspace.
  2. In the pop up that appears, fill in the regular workspace form: Title, Computes, Datasets and Image.
  3. Next to Volume, click + Create New. The Volume field will update.
  4. Enter a Volume Title and Size (GB) for the new volume.
  5. Click Start Workspace. The workspace environment will start being constructed and the volume will be created and attached.

# Use an existing volume

When restarting a closed workspace that was previously run with a volume, cnvrg will automatically reattach the volume. Ensure that the volume is not currently in use before restarting the workspace.

You can also create a new workspace with an existing volume:

  1. On a project's dashboard, or from the Workspaces tab, click Start a Workspace.
  2. In the pop up that appears, fill in the regular workspace form: Title, Computes, and Image. You can skip datasets as using a volume will override the Dataset selector.
  3. Choose the volume you wish to use from the Volume selector. Only volumes not currently in use can be selected.
  4. Click Start Workspace. The workspace environment will start being constructed and the volume will be attached.

# The Files in a Workspace

Similar to the execution of any other machine learning workload in cnvrg, when a workspace is created without an existing volume, it constructs its environment as follows:

  1. Pull and execute the chosen container.
  2. Clone the chosen git branch and commit (if project connected to git) into /cnvrg.
  3. Clone the latest version of the files from your project's Files tab (or the chosen cnvrg.io commit) into /cnvrg.
  4. Clone or attach the chosen datasets into /data.
  5. Install packages from the requirements.txt file (if it exists).
  6. Execute the prerun.sh script (if it exists).

When these steps have completed, the environment will be fully constructed and the workspace will load.
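The sequence above, including which steps are conditional, can be summarized in a short sketch. It only models the order listed here and is not cnvrg's implementation; the function and parameter names are hypothetical.

```python
def startup_steps(git_connected, has_requirements, has_prerun, datasets=()):
    """Return the ordered start-up steps for a workspace without a reused volume."""
    steps = ["pull and execute the chosen container"]
    if git_connected:
        steps.append("clone the chosen git branch and commit into /cnvrg")
    steps.append("clone the project files into /cnvrg")
    for name in datasets:
        steps.append(f"clone or attach dataset into /data/{name}")
    if has_requirements:
        steps.append("install packages from requirements.txt")
    if has_prerun:
        steps.append("execute prerun.sh")
    return steps


for number, step in enumerate(
    startup_steps(True, True, False, datasets=["hotdogs"]), start=1
):
    print(number, step)
```

Package installation and prerun.sh come last, so both can rely on the project files and datasets already being in place.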

For more information see Environment.

# Controlling a Workspace

When you are inside a workspace you can see the workspace environment in the centre. There you can access all the functionality that JupyterLab, R Studio and Visual Studio Code provide.

However, you also have access to other controls designed to make interfacing with cnvrg from a workspace as seamless as possible.

# Workspace Controls

The following controls are provided above the workspace:

  • Workspace Name: The workspace title appears here. Click it to edit it.
  • Stop: Stops the workspace from running and releases the compute resources.
  • Sync: Syncs the working directory of the workspace with cnvrg. The workspace will also periodically sync according to the Sync time set in the project Settings > Environment.
  • More: Click here to access the Upload (force) control. It will run the command cnvrg upload --force inside the workspace.

# Workspace Sidebar

Through the workspace sidebar you can add datasets, monitor running experiments, use AI libraries, see information and logs and use tools like SparkUI and TensorBoard.

# Syncing

Because the workspace runs on a remote compute, file changes reach cnvrg only when the workspace is synced. If you have auto-sync enabled in your project settings, the workspace auto-syncs each time the sync interval passes.

You can always sync on demand by clicking the Sync button that appears above the workspace.

# Attach Datasets to a Live Workspace

From the Datasets tab in the right sidebar, you can control the datasets that are attached to your workspaces or attach new ones.

The tab contains a list of the datasets currently attached. For each dataset, you can see the name, size, and file count.

You can also choose a specific query or commit of the datasets.

When you choose to include a dataset within a workspace (whether at start-up or on-the-fly, using the sidebar), the dataset will be mounted in the remote compute and will be accessible from the absolute path: /data/<name_of_dataset>/.

For example, if you include your dataset hotdogs when working in the workspace, the dataset can be found at /data/hotdogs/ and all of the dataset's files will be found in that directory.

If the dataset is still cloning, the cloning symbol appears next to the dataset name.

# Attach new datasets to the workspace

  1. Click Select Datasets To Attach.
  2. For each dataset you want to add to the workspace, choose the commit or query and the portion of the dataset to clone.
  3. Click the dataset to add it to the list of datasets for the workspace.
    You can remove a dataset from the list by clicking the X next to its name.
  4. Click Attach.

The selected datasets begin cloning to the workspace. You can track their statuses from the datasets panel where you attached them.

TIP

Any datasets added on-the-fly are located at the absolute path: /data/name_of_dataset/.

# Sync a dataset (create a new commit)

To sync a dataset that is connected to the workspace, click the Sync button next to the dataset's name in the datasets tab.

You can optionally include a commit message. Clicking Sync uploads any changes made to the dataset in the workspace to the remote dataset as a new commit.

# Use AI Library Components in a Workspace

You can use all of your available AI Library components from within a workspace. To use a component:

  1. Go to the Workspaces tab.
  2. In the right sidebar, click Libraries.
  3. Choose the library you want to use.
  4. Copy the code.
  5. Paste it in your workspace and run it.

# View Experiments Running from a Workspace

To view the experiments that are running from the workspace:

  1. Go to the Workspaces tab.
  2. In the right sidebar, click Experiments.

Experiments that are running from the workspace are displayed here. You can see their status and click through to their pages.

# Track your Workspaces' Experiments

A workspace is a great place to run experiments and start training models on the fly. You can go ahead and run your scripts inside your workspace but still get access to all the cnvrg experiment tracking and visualization features.

To run an experiment locally (in the remote workspace) with cnvrg tracking through the SDK, run the experiment using Experiment.run(). The experiment will be recorded in the Experiments workspace sidebar. Track its progress either in the workspace or from the experiments page inside cnvrg.

Use the Experiment.run() method as follows:

from cnvrg import Experiment
e=Experiment.run(command_or_function, title="Local Workspace Experiment")

For example, to run a Python 3 training script:

from cnvrg import Experiment
e=Experiment.run('python3 train.py')

Or to run a function:

from cnvrg import Experiment
e=Experiment.run(train(input))

Learn more about using the Python SDK in the SDK docs.

# Working with git

With workspaces, you can quickly alter code and then run a new experiment to get quick results. While rapid experimentation is important, saving the code that resulted in the experiment is essential to trackable and repeatable experimentation. While you could push to git before each experiment, this can be avoided by using the git_diff parameter in your Experiment.run() command.

When you set git_diff=True, cnvrg will also sync the git difference, saving the changes relative to your git index along with the experiment. Then, after adjusting your code and running experiments, you can compare the experiments, identify the most successful one, clone its cnvrg.io commit and then push the changes back to your git repository.
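To see what "changes relative to your git index" looks like in practice, the sketch below runs plain `git diff` from Python. It is an illustration of the concept only; how cnvrg collects the difference internally is not described here, and the helper name `working_tree_diff` is hypothetical.

```python
import shutil
import subprocess


def working_tree_diff(repo_dir="."):
    """Return unstaged changes relative to the git index, or None without git.

    Raises subprocess.CalledProcessError if repo_dir is not a git repository.
    """
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "diff"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `working_tree_diff("/cnvrg")` in a git-connected workspace would return the uncommitted edits as a patch, which is the kind of change set that lets an experiment be reproduced without pushing first.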

# Examples:

  • Run an experiment on remote compute using a script:

    from cnvrg import Experiment
    e=Experiment.run('python3 train.py', compute='medium', git_diff=True)
    
  • Run an experiment directly in your workspace (cnvrg workspace) or local IDE (local VS Code/PyCharm):

    from cnvrg import Experiment
    e=Experiment.run(train(), git_diff=True)
    

# Publish Apps and Dashboards from a Workspace

  1. Go to the Workspaces tab.
  2. In the right sidebar, click Apps.
  3. Click Publish Apps.
  4. Follow the instructions here.

# Workspace Info

You can see all the details of a workspace from the workspace info page. You can access the page from the workspace sidebar by selecting Info. Accessing a closed workspace will show the workspace info automatically.

Along the top of the page are summary details of the workspace session:

  • Start time
  • End time
  • Duration
  • Start commit
  • Compute
  • Image
  • Volume

Below that are the logs of the workspace, and underneath the logs is the list of commits associated with the workspace.

Last Updated: 6/29/2020, 12:46:29 PM