# cnvrg SDK V2

# Getting Started

The cnvrg SDK is a Python library designed to help data scientists interact with cnvrg from their code, experiments and models. Through the SDK, you can create experiments, manage models, automate your machine learning pipelines and more.

# Prerequisites

To run the pip commands, Python 3.6 or later must be installed on your system.

# Download and Install the cnvrg SDK

To install, open up your terminal/command prompt and run the following command:

pip3 install cnvrgv2

# Install options

On a self-hosted cnvrg environment, you can specify an install option for cnvrgv2 to match the object storage you intend to use in your cnvrg environment.

For Metacloud, use the default installation without any option.

Add the options to the install command as needed; you can specify multiple options by separating them with commas:

pip install "cnvrgv2[options]"

The available options are:

  • azure - Install packages relevant for Azure storage client
  • google - Install packages relevant for GCP storage client
  • python3.6 - Install specific dependencies for python version 3.6

# SDK Operations

# Authenticating the cnvrg SDK

# Inside a cnvrg job scope

The cnvrg SDK will already be initialized and authenticated with cnvrg using the account that is logged in. You can start using cnvrg SDK functions immediately by running:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()

# Authenticate using a local configuration file

You can authenticate to the cnvrg SDK by creating a configuration file in your working directory:

  • In your working directory, create a directory called .cnvrg. You can create it using the following command:
    mkdir .cnvrg
    
  • Inside the .cnvrg directory, create a configuration file named cnvrg.config
  • Edit the file and insert the following:
    check_certificate: <false/true>
    domain: <cnvrg_full_domain>
    keep_duration_days: null
    organization: <organization_name>
    token: <user_access_token>
    user: <user_email>
    version: null
    
  • Once you finish editing, save the file.

Now you can simply run the following in your code and it will log you in automatically:
from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()

# Authenticate using environment variables

You can authenticate to the cnvrg SDK by setting the following environment variables:

  • CNVRG_JWT_TOKEN: Your API token (you can find it in your user settings page)
  • CNVRG_URL: The cnvrg URL that you use to view cnvrg in the browser, for example: https://app.prod.cnvrg.io
  • CNVRG_USER: The email that you use to log in to cnvrg
  • CNVRG_ORGANIZATION: The organization name you use

Once you set these environment variables, you can simply run the following and it will log you in automatically:
from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
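The environment variables above can be set in a shell session before launching Python. The values below are placeholders, not real credentials:

```shell
# Placeholder values -- replace with your own credentials
export CNVRG_JWT_TOKEN="my-api-token"
export CNVRG_URL="https://app.prod.cnvrg.io"
export CNVRG_USER="johndoe@acme.com"
export CNVRG_ORGANIZATION="my-org"
```

Any Python process started from this shell will inherit the variables and authenticate automatically.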

You can also pass the credentials as parameters:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg(domain="https://app.cnvrg.io",
              email="Johndoe@acme.com",
              password="123123",
              )

If you are on a cnvrg Metacloud environment, you need to use your API key, which can be found in your Account page:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg(domain="https://app.domain.metacloud.cnvrg.io",
              email="Johndoe@acme.com",
              token="YOUR API KEY")

NOTE

As a security measure, please do not put your credentials into your code.

NOTE

The following documentation assumes you have successfully logged in to the SDK and loaded the cnvrg object.

# User Operations

# Get the logged in user object

To get the logged in user object you can simply run:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
user = cnvrg.me()

Once you have the user object, you can access user fields such as: email, username, organizations, git_access_token, name, time_zone. For example:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
user = cnvrg.me()
email = user.email

# Set the default organization

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
cnvrg.set_organization("my-org")

# Resource Operations

# Connect your existing Kubernetes cluster

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
mycluster = cnvrg.clusters.create(resource_name="kubernetes_cluster",
                                  kube_config_yaml_path="kube_config.yaml",
                                  domain="https://app.cnvrg.io")

List of optional parameters:

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| scheduler | string | Supported scheduler to deploy cnvrg jobs | No | cnvrg_scheduler |
| namespace | string | The namespace to use inside the cluster | No | cnvrg |
| https_scheme | bool | Resource supports HTTP/S URLs when accessing jobs from the browser | No | False |
| persistent_volumes | bool | Resource can dynamically create PVCs when running jobs | No | False |
| gaudi_enabled | bool | The cluster supports HPU devices | No | False |

# Create managed EKS cluster

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
mycluster = cnvrg.clusters.create(build_yaml_path="aws.yaml", provider_name="aws")

List of optional parameters:

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| network | string | If left blank, cnvrg will automatically provision the network for your cluster | No | istio |

Yaml example:

name: mycluster
version: '1.21'
roleARN: arn:aws:iam::123456789101:role/cnvrg_role
region: us-west-2
vpc: null
publicSubnets:
  - ''
privateSubnets:
  - ''
securityGroup: ''
nodeGroups:
  - availabilityZones:
      - us-west-2a
      - us-west-2b
      - us-west-2d
      - us-west-2c
    autoScaling: false
    instanceType: m5.metal
    desiredCapacity: 2
    minSize: 0
    maxSize: 2
    spotInstances: false
    volumeSize: 100
    privateNetwork: true
    securityGroups:
      - ''
    tags:
      - key: ''
        value: ''
    taints:
      - key: ''
        value: ''
    labels:
      - key: ''
        value: ''
    attachPolicies:
      - ''
    addonPolicies:
      - key: ''
        value: ''

# Create partner resource

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
mycluster = cnvrg.clusters.create(resource_name="mypartner", provider_name="aibuilders")

# Get an existing resource

You can get the resource object by using the resource's slug:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
mycluster = cnvrg.clusters.get(slug="cluster-slug")

You can also get all of the resources in the organization:

clusters = [c for c in cnvrg.clusters.list()]

# Update an existing resource

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
mycluster = cnvrg.clusters.get(slug="cluster-slug")
mycluster.update(resource_name="new-name")

List of optional parameters:

| Parameter | Type | Description | Required |
|---|---|---|---|
| scheduler | string | Supported scheduler to deploy cnvrg jobs | No |
| namespace | string | The namespace to use inside the cluster | No |
| https_scheme | bool | Resource supports HTTP/S URLs when accessing jobs from the browser | No |
| persistent_volumes | bool | Resource can dynamically create PVCs when running jobs | No |
| gaudi_enabled | bool | The cluster supports HPU devices | No |

# Delete a resource

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
cnvrg.clusters.delete(slug="cluster-slug")

or

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
mycluster = cnvrg.clusters.get(slug="cluster-slug")
mycluster.delete()

# Project Operations

# Create a new project

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.create("myproject")

# Get the project's object:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")

Once you have the project object, you can access project fields such as: title, slug, git_url, git_branch, last_commit. For example:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
title = myproj.title

NOTE

You can also reference the current project from within a job scope:

from cnvrgv2 import Project
myproj = Project()

# List all the projects in the organization:

This lists all projects that the current user is allowed to view:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
projects = cnvrg.projects.list()

To order the projects list by created_at, run the following:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
projects = cnvrg.projects.list(sort="-created_at")

TIP

sort the list by: -key -> DESC | key -> ASC
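The `-key`/`key` convention can be illustrated with plain Python. The helper below is not part of the SDK; it just mimics the sorting convention on ordinary dictionaries:

```python
# Illustration of the sort-key convention used by list(): a leading "-"
# means descending order, no prefix means ascending.
def sort_like_cnvrg(items, sort_key):
    descending = sort_key.startswith("-")
    key = sort_key.lstrip("-")
    return sorted(items, key=lambda item: item[key], reverse=descending)

projects = [
    {"title": "a", "created_at": "2023-01-01"},
    {"title": "b", "created_at": "2023-06-01"},
]

newest_first = sort_like_cnvrg(projects, "-created_at")  # "b" first
oldest_first = sort_like_cnvrg(projects, "created_at")   # "a" first
```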

# Delete a project:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
myproj.delete()

# Project File Operations

# Upload files to a project

myproj.put_files(paths=['/files_dir/file1.txt', '/files_dir/file2.txt'],
                 pattern='*')

Available parameters:

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| paths | List | The list of file paths that will be uploaded to the project | Yes | |
| pattern | string | String defining the filename pattern | No | "*" |
| message | string | The commit message | No | "" |
| override | bool | Whether or not to re-upload even if the file already exists | No | False |
| force | bool | Whether the new commit should copy files from its parent | No | False |

NOTE

If a folder is given, all the relevant files in that folder (those that match the pattern) will be uploaded.

# Remove files from a project

You can remove files from the project:

myproj.remove_files(paths='*',
                    message='This will delete everything!')

NOTE

When deleting files from a project, the paths parameter can be either a list of file paths or a string pattern such as '*'.

# List the project's content

You can list all of the files and folders that are in the project:

myproj.list_files()
myproj.list_folders(commit_sha1='xxxxxxxxx')

Available parameters:

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| commit_sha1 | string | Sha1 string of the commit to list the files from | No | None |
| query | string | Query slug to list files from | No | None |
| query_raw | string | Raw query to list files according to Query language syntax | No | None |
| sort | string | Key to sort the list by (-key -> DESC / key -> ASC) | No | "-id" |

# Clone the project to the current working directory

myproj.clone()

# Download the project files

myproj.download(commit_sha1='xxxxxxxxx')

WARNING

The project must be cloned first.

# Sync the local project with the remote one

myproj.sync_local()

# Update the project's settings

You can change any of the project's settings by passing them as keyword arguments:

myproj.settings.update(title='NewProjectTitle',
                       privacy='private')

Available settings:

| Parameter | Type | Description |
|---|---|---|
| title | string | The name of the project |
| default_image | string | The name of the image to set as the project's default image |
| default_computes | List | The list of the project's default compute template names |
| privacy | string | The project's privacy, either 'private' or 'public' |
| mount_folders | List | Paths to be mounted to the docker container |
| env_variables | List | KEY=VALUE pairs to be exported as environment variables to each job |
| check_stuckiness | bool | Whether to stop or restart experiments that have not printed new logs and have resource utilization below 20% |
| max_restarts | int | When check_stuckiness is True, how many times to restart a single experiment each time it idles |
| stuck_time | int | The duration (in minutes) that an experiment must be idle before it is stopped or restarted |
| autosync | bool | Whether or not to perform periodic automatic sync |
| sync_time | int | The interval (in minutes) between each automatic sync of jobs |
| collaborators | List | The list of users that are collaborators on the project |
| command_to_execute | string | The project's default command to execute when starting a new job |
| run_tensorboard_by_default | bool | Whether or not to run Tensorboard by default with each launched experiment |
| run_jupyter_by_default | bool | Whether or not to run Jupyter by default with each launched experiment |
| requirements_path | string | The default path to the requirements.txt file that will run with every job |
| is_git | bool | Whether the project is linked to a git repo |
| git_repo | string | The address of the git repo |
| git_branch | string | The default branch |
| private_repo | bool | Whether the repo is private |
| output_dir | string | The default path for the jobs output directory |
| email_on_success | bool | Whether an email should be sent when an experiment finishes successfully |
| email_on_error | bool | Whether an email should be sent when an experiment finishes with an error |

# Setup Git Integrations in project settings

For a public git repository:

myproj.settings.update(is_git=True, git_repo="MyGitRepo", git_branch="MyBranch")

For a private git repository using an OAuth token, first make sure the git OAuth token is saved in your profile and then run:

myproj.settings.update(is_git=True, git_repo="PrivateGitRepo", git_branch="MyBranch", private_repo=True)

To disable git integrations:

myproj.settings.update(is_git=False)

# Templates Operations

# Create a new template

# Get an existing template

You can get the template object by using the template's slug:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
cluster = cnvrg.clusters.get("cluster_slug")
template = cluster.templates.get("template_slug")

# List all existing templates

List all templates that the current user is allowed to view:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
cluster = cnvrg.clusters.get("cluster_slug")
templates = cluster.templates.list()
for template in templates:
    print("Template Details: title: {} , slug: {} , cpu: {} , memory: {} "
          .format(template.title, template.slug, template.cpu, template.memory))

# Update an existing template

You can update the existing template attributes:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
cluster = cnvrg.clusters.get("cluster_slug")
template = cluster.templates.get("template_slug")
template.update(title="new title", cpu=3)

# Delete an existing template

You can delete an existing template:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
cluster = cnvrg.clusters.get("cluster_slug")
template = cluster.templates.get("template_slug")
template.delete()

# Workspaces Operations

# Create a new workspace and run it:

from cnvrgv2 import Cnvrg
from cnvrgv2.modules.workflows.workspace.workspace import NotebookType
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
ws = myproj.workspaces.create(title="My Workspace",
                              templates=["small","medium"],
                              notebook_type=NotebookType.JUPYTER_LAB)

If no parameters are provided, the default values are used. To further customize the created workspace, use the following parameters:

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| title | string | The name of the workspace | No | None |
| templates | list | A list containing the names of the desired compute templates | No | None |
| notebook_type | string | The notebook type (currently available: "jupyterlab", "r_studio", "vscode") | No | NotebookType.JUPYTER_LAB |
| volume | Volume | The volume that will be attached to the workspace | No | None |
| datasets | list | A list of datasets to be connected and used in the workspace | No | None |
| image | Image | The image to be used for the workspace environment | No | default organization image |

# Fetch the workspace object

Once the workspace is created you can fetch it by its slug:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
ws = myproj.workspaces.get("workspace-slug")

NOTE

You can also reference the current running workspace from within its job scope:

from cnvrgv2 import Workspace
ws = Workspace()

# Access workspace attributes

You can access the workspace attributes by using regular dot notation:

ws_slug = ws.slug
ws_title = ws.title
ws_datasets = ws.datasets
ws_notebook = ws.notebook_type

# Sync the workspace

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
ws = myproj.workspaces.get("workspace-slug")
ws.sync()

Sync multiple workspaces by providing a list of their slugs:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
myproj.workspaces.sync(["workspace-slug"])

# Stop a running workspace

Stop a running workspace and sync it (the default is sync=False):

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
ws = myproj.workspaces.get("workspace-slug")
ws.stop(sync=True)

Stop multiple workspaces at once:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
myproj.workspaces.stop(["workspace-slug"], sync=True)

# Start a stopped workspace

Start a stopped workspace:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
ws = myproj.workspaces.get("workspace-slug")
ws.start()

# List all of the workspaces

You can list all the workspaces in the current project, as well as sort them by a key in ASC or DESC order:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
workspaces = myproj.workspaces.list(sort="-created_at")

TIP

sort the list by: -key -> DESC | key -> ASC

# Delete workspaces

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
ws = myproj.workspaces.get("workspace-slug")
ws.delete()

Delete multiple workspaces by listing their slugs:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
myproj.workspaces.delete(['workspace-slug1','workspace-slug2'])

# Operate a Tensorboard

Start a Tensorboard session for a running workspace:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
ws = myproj.workspaces.get("workspace-slug")
ws.start_tensorboard()

Get the Tensorboard url:

ws.tensorboard_url

To stop the Tensorboard session:

ws.stop_tensorboard()

# Experiment Operations

# Create a new remote Experiment

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
e = myproj.experiments.create(title="my new exp",
                              template_names=["medium", "small"],
                              command="python3 test.py")

List of optional parameters:

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| title | string | The name of the experiment | No | None |
| templates | List | A list of the compute templates to be used in the experiment (if the cluster cannot allocate the first template, it will try the next one, and so on) | No | None |
| local | bool | Whether or not to run the experiment locally | No | False |
| command | string | The starting command for the experiment (example: command='python3 train.py') | No | None |
| datasets | List[Dataset] | A list of dataset objects to use in the experiment | No | None |
| volume | Volume | Volume to be attached to this experiment | No | None |
| sync_before | bool | Whether or not to sync the environment before running the experiment | No | True |
| sync_after | bool | Whether or not to sync the environment after the experiment has finished | No | True |
| image | object | The image to run on (example: image=cnvrg.images.get(name="cnvrg", tag="v5.0")) | No | project's default image |
| git_branch | string | The branch to pull files from for the experiment, if the project is a git project | No | None |
| git_commit | string | The specific commit to pull files from for the experiment, if the project is a git project | No | None |

# Initialize an empty experiment

You may create an empty experiment that will not be run automatically (by default: local=True, sync_after=False, sync_before=False):

e = myproj.experiments.init(title="my new exp")

# Create a local experiment

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
e = myproj.experiments.create(title="my new exp",
                              local=True,
                              command="python3 test.py",
                              local_arguments={"epochs": 20, "batch_size": 12})

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| local_arguments | dict | If the experiment is local and command is a function, local_arguments is a dictionary of the arguments to pass to the experiment's function | No | None |

# Experiment slug

In many commands, you will need to use an experiment slug. The experiment slug can be found in the URL for the experiment.

For example, if you have an experiment that lives at: https://app.cnvrg.io/my_org/projects/my_project/experiments/kxdjsuvfdcpqkjma5ppq, the experiment slug is kxdjsuvfdcpqkjma5ppq.
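Since the slug is simply the last path segment of the experiment URL, it can be extracted with standard string handling. This is a hypothetical helper for illustration, not an SDK function:

```python
from urllib.parse import urlparse

# Hypothetical helper (not part of the SDK): the experiment slug is the
# last path segment of the experiment's URL.
def slug_from_url(url):
    return urlparse(url).path.rstrip("/").split("/")[-1]

url = "https://app.cnvrg.io/my_org/projects/my_project/experiments/kxdjsuvfdcpqkjma5ppq"
slug = slug_from_url(url)  # "kxdjsuvfdcpqkjma5ppq"
```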

# Get an existing experiment

You can get the experiment object by using the experiment's slug:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
e = myproj.experiments.get("exp-slug")

You can also get all of the experiments in the project:

experiments = [e for e in myproj.experiments.list()]

NOTE

You can also reference the current running experiment from within its job scope:

from cnvrgv2 import Experiment
e = Experiment()

# Delete Experiment

You can delete an experiment from a project by its slug value:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
e = myproj.experiments.get("experiment-slug")
e.delete()

# Do bulk delete on multiple experiments
myproj.experiments.delete(['experiment-slug1','experiment-slug2'])

# Stop a running Experiment

Stop a running experiment by passing its slug value (the experiment must be running):

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
e = myproj.experiments.get("experiment-slug")
e.stop()

# Do bulk stop on multiple experiments
myproj.experiments.stop(['experiment-slug1','experiment-slug2'])

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| sync | bool | Whether or not to sync the experiment's data | No | True |

# Get Experiment's system utilization

You can access the experiment's system resources usage data. For example, let's get up to the last 5 records for memory utilization percentage:

>>> utilization = e.get_utilization()
>>> utilization.attributes['memory']['series'][0]['data'][-5:]
[[1626601529000, 7.7], [1626601559000, 19.85], [1626601589000, 48.05], [1626601620000, 49.26]]

NOTE

The data syntax is [unix_timestamp, metric]
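Judging by the 13-digit values in the example output, the timestamps appear to be Unix epoch values in milliseconds, so they can be converted to readable datetimes with the standard library. A sketch, independent of the SDK:

```python
from datetime import datetime, timezone

# The utilization series stores [unix_timestamp, metric] pairs, with the
# timestamp in milliseconds. Divide by 1000 to get a UTC datetime.
sample = [1626601529000, 7.7]
ts = datetime.fromtimestamp(sample[0] / 1000, tz=timezone.utc)
metric = sample[1]
```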

# Track an Experiment manually

You can initialize an empty experiment in cnvrg:

cnvrg = Cnvrg()
proj = cnvrg.projects.get('my-project')
e = proj.experiments.init(title='my-exp')

Now that the experiment is initialized, its status is ONGOING and you can perform operations from within your code, as with regular cnvrg experiments, in order to track it.

If you have initialized an Experiment object, you should conclude the experiment with the e.finish() command.

To conclude an experiment object:

exit_status = 0
e.finish(exit_status=exit_status)

NOTE

0 is success, -1 is aborted, 1 and higher is error
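Based on the convention above, an exit status can be mapped to a human-readable label. This helper is hypothetical, for illustration only, and is not part of the SDK:

```python
# Maps an exit status to a label, following the documented convention:
# 0 = success, -1 = aborted, 1 and higher = error.
def exit_status_label(status):
    if status == 0:
        return "success"
    if status == -1:
        return "aborted"
    if status >= 1:
        return "error"
    return "unknown"
```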

# Examples

# Metadata operations on experiments

You can create your own logs in an experiment (the timestamp defaults to utcnow()):

from datetime import datetime
e.log("my first log", timestamp=datetime.now())
e.log(["my first log","my second log"])

Get the experiment's last 40 logs:

logs = e.logs()

# Create a Tag

e.log_param("key","value")

# Charts

You can create various charts using the SDK. For example, create a line chart showing the experiment's loss:

from cnvrgv2 import Cnvrg, LineChart
cnvrg = Cnvrg()
myproj = cnvrg.projects.get("myproject")
e = myproj.experiments.get("exp-slug")

loss_vals = []
# experiment loop:
for epoch in range(8):
    loss_vals.append(loss_func())

# attach the chart to the experiment
loss_chart = LineChart('loss')
loss_chart.add_series(loss_vals, 's1')
e.log_metric(loss_chart)

WARNING

chart_name can't include "/"

You will immediately see the chart on the experiment's page.

You can create all different types of charts:

# Heatmap:

A Heatmap takes a list of tuples that form a matrix. For example, a 2x2 matrix: [(0.5,1),(1,1)]

from cnvrgv2 import Heatmap
heatmap_chart = Heatmap('heatmap_example',
                        x_ticks=['x', 'y'], y_ticks=['a', 'b'],
                        colors=[[0,'#000000'],[1, '#7EB4EB']],
                        min=0,
                        max=10)
heatmap_chart.add_series('s1', [(0.5,1),(1,1)])
e.create_chart(heatmap_chart)

Example heat map

Typing information: x_ticks and y_ticks must be a List, and the matrix is a list of tuples in the struct (x,y,z). color_stops is optional and is a List of Lists of size 2, where the first value is a float 0 <= X <= 1 and the second value is the hex value of the color to represent matrix values at that point of the scale. min and max are optional and should be numbers corresponding to the minimum and maximum values for the key (scaling is done automatically when these values are not submitted).

Each struct corresponds to a row in the matrix and to a label from the y_ticks list. The matrix is built from the bottom up, with the first struct and y_tick at the bottom edge. Each value inside the struct corresponds to an x_tick.

Steps and groups:

Using steps and groups allows you to submit the same heatmap across different steps and visualize them in a single chart with a slider to easily move between the steps. steps should be an integer and group a string. Multiple steps should be grouped under a single group.

Animated Heatmap
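The bottom-up row ordering can be shown with plain Python, independent of the SDK: if the first tuple maps to the bottom y_tick, a grid written top-down needs its rows reversed before being passed as the matrix:

```python
# A grid written top-down (first row = top of the chart).
grid_top_down = [
    [0.5, 1.0],   # row for y_tick 'b' (top)
    [1.0, 1.0],   # row for y_tick 'a' (bottom)
]

# The heatmap matrix is built bottom-up: the first tuple corresponds to
# the bottom edge / first y_tick, so reverse the rows and convert to tuples.
matrix = [tuple(row) for row in reversed(grid_top_down)]
```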

# Bar chart:

  • Single bar:

    from cnvrgv2 import BarChart
    bar_chart = BarChart('bar_example', x_ticks=['bar1', 'bar2'])
    bar_chart.add_series('s1', [1, 2])
    e.create_chart(bar_chart)
    
  • Multiple bars:

    from cnvrgv2 import BarChart
    bar_chart = BarChart('bar_example', x_ticks=['bar1', 'bar2'])
    bar_chart.add_series('s1', [1, 2])
    bar_chart.add_series('s2', [3, 4])
    e.create_chart(bar_chart)

Example bar chart

The x_ticks list populates the labels for the bars, and the corresponding series values dictate the value of the bar for each category. min and max are optional numbers that correspond to the lower and upper bounds for the y values. Optionally, you can set each bar to a specific color using the colors list of hex values, with each hex value corresponding to each x value.

Steps and groups:

Using steps and groups allows you to submit bar charts across different steps and visualize them in a single chart with a slider to easily move between the steps. steps should be an integer and group a string. Multiple steps should be grouped under a single group.

Animated bargraph

# Scatter Plot:

You can pass a list of tuple pairs representing points on the axis.

  • Single set of points:
    from cnvrgv2 import ScatterPlot
    points_list = [(1,1), (2,2), (3,3), (4,4)]
    scatter_chart = ScatterPlot('scatter_example')
    scatter_chart.add_series('s1', points_list)
    
  • Multiple sets of points:
    from cnvrgv2 import ScatterPlot
    points_list = [(1,1), (2,2), (3,3), (4,4)]
    scatter_chart = ScatterPlot('scatter_example')
    scatter_chart.add_series('s1', points_list)
    scatter_chart.add_series('s2', points_list[::-1]) # Reversed version of the list

Example scatter plot

# Upload artifacts

You can add local files to the experiment's artifacts and create a new commit for them:

paths = ['output/model.h5']
e.log_artifacts(paths=paths)

| Parameter | Type | Description |
|---|---|---|
| paths | list | List of paths of artifacts to save |

NOTE

Log images with log_images(file_paths=[<images_paths>])

# Download the Experiment's artifacts

You can download the artifacts to your local working directory:

e.pull_artifacts(wait_until_success=True, poll_interval=5)

| Parameter | Type | Description |
|---|---|---|
| wait_until_success | bool | Wait until the current experiment is done before pulling artifacts |
| poll_interval | int | If wait_until_success is True, the time between status poll loops, in seconds |

# Flow Operations

Flows can be created and run from any environment using the SDK. Creating flows requires a flow configuration YAML file.

# Create a Flow

You can use a flow YAML to create a flow inside a project. You can use either the absolute path to a YAML file or include the YAML content directly. Use the flows.create command:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
proj = cnvrg.projects.get("myproject")
flow = proj.flows.create(yaml_path='YAML_PATH')

| Parameter | Type | Description | Required | Default |
|---|---|---|---|---|
| yaml_path | path | A path to the YAML configuration file | No | None |

# Example YAML:

---
flow: Flow Example
recurring:
tasks:
  - title: Training Task
    type: exec
    input: python3 train.py
    computes:
      - medium
    image: cnvrg:v5.0
    relations: []

# Access Flow attributes

You can access the Flow's attributes by using regular dot notation.

Example:

>>> flow.title
'Flow Example'

# Flow Attributes:

| Parameter | Type | Description |
|---|---|---|
| title | string | The name of the Flow |
| slug | string | The Flow slug value |
| created_at | datetime | The time the Flow was created |
| updated_at | datetime | The time the Flow was last updated |
| cron_syntax | string | The schedule Cron expression string (if the Flow was scheduled) |
| webhook_url | string | The URL of the webhook that triggers the Flow (if enabled) |
| trigger_dataset | string | A dataset whose updates will trigger this Flow |

# Flow slug

In some commands, you will need to use a Flow slug. The Flow slug can be found in the Flow page URL.

For example, if you have a Flow that lives at: https://app.cnvrg.io/my_org/projects/my_project/flows/iakzsmftgewhpxx9pqfo, the Flow slug is iakzsmftgewhpxx9pqfo.

# Get a Flow

Get an existing Flow by passing its slug value or title:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
proj = cnvrg.projects.get("myproject")
flow = proj.flows.get("slug/title")

# List Flows

You can list all existing flows:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
proj = cnvrg.projects.get("myproject")
flows = proj.flows.list()
for flow in flows:
    print(flow.title)

# Run a Flow

To run the Flow's latest version:

flow.run()

# Update Flow

You can update the existing Flow's attributes:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
proj = cnvrg.projects.get("myproject")
flow = proj.flows.get("slug/title")
flow.update(title="My Updated Flow")

# Delete Flow

You can delete an existing Flow:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
proj = cnvrg.projects.get("myproject")
flow = proj.flows.get("slug/title")
flow.delete()

Or delete multiple Flows at once by listing their slug values:

proj.flows.delete(["FLOW1_SLUG", "FLOW2_SLUG"])

# Schedule a Flow

You can make the Flow run on a schedule by using Cron expression syntax.

# Set a new schedule:

from cnvrgv2 import Cnvrg
cnvrg = Cnvrg()
proj = cnvrg.projects.get("myproject")
flow = proj.flows.get("slug/title")
flow.set_schedule("* * * * *")  # Run every minute
      

      Disable it with:

      flow.clear_schedule()
      

      # Trigger webhook

      You can create a webhook that will trigger the Flow run.

      Toggle it on/off by setting the toggle parameter to True or False:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      proj = cnvrg.projects.get("myproject")
      flow = proj.flows.get("slug/title")
      flow.toggle_webhook(True)
      

      Get the webhook URL:

      flow.webhook_url
      

      NOTE

      If you just toggled the webhook, call flow.reload() before fetching the webhook_url.

      # Toggle dataset update trigger

      You can toggle triggering the Flow on dataset updates on or off by setting the toggle parameter to True or False:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      ds = cnvrg.datasets.get("myds")
      proj = cnvrg.projects.get("myproject")
      flow = proj.flows.get("slug/title")
      
      flow.toggle_dataset_update(True)
      

      # Flow versions

      Every Flow can have multiple versions, which you can access:

      List all the flow versions:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      proj = cnvrg.projects.get("myproject")
      flow = proj.flows.get("slug/title")
      flow_versions = flow.flow_versions.list()
      for fv in flow_versions:
          print(fv.title)
      

      Get a specific flow version object by slug or title:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      proj = cnvrg.projects.get("myproject")
      flow = proj.flows.get("slug/title")
      flow_version = flow.flow_versions.get("Version 1")
      

      Get info about a flow version's status:

      info = flow_version.info()
      

      Stop a running Flow version:

      flow_version.stop()
      


      # Endpoint Operations

      # Create Endpoint

      from cnvrgv2 import Cnvrg
      from cnvrgv2 import EndpointKind, EndpointEnvSetup
      
      cnvrg = Cnvrg()
      proj = cnvrg.projects.get("myproject")
      ep = proj.endpoints.create(title="myendpoint",
                     templates=["small","medium"],
                     kind=EndpointKind.WEB_SERVICE,
                     file_name="predict.py",
                     function_name="predict",
                     env_setup=EndpointEnvSetup.PYTHON3,
                     kafka_brokers=None,
                     kafka_input_topics=None)
      

      You can use the following parameters to build your Endpoint:

      Parameter type description required default
      title string Name of the Endpoint Yes
      kind int The kind of endpoint to deploy (example: EndpointKind.WEB_SERVICE, options: [WEB_SERVICE, STREAM, BATCH]) No EndpointKind.WEB_SERVICE
      templates List List of template names to be used No None
      image Image Image object to create endpoint with No organization default image
      file_name string The file containing the endpoint's functions Yes
      function_name string The name of the function the endpoint will route to Yes
      env_setup string The interpreter to use (example: EndpointEnvSetup.PYTHON3, options: [PYTHON2, PYTHON3, PYSPARK, RENDPOINT]) No None
      kafka_brokers List List of kafka brokers No None
      kafka_input_topics List List of topics to register as input No None
      queue string Name of the queue to run this job on No None
      kafka_output_topics List List of topics to register as output No None

      # Endpoint slug

      In many commands, you will need to use an endpoint slug. The endpoint slug can be found in the URL for the endpoint.

      For example, if you have an endpoint that lives at: https://app.cnvrg.io/my_org/projects/my_project/endpoints/show/j46mbomoyyqj4xx5f53f, the endpoint slug is j46mbomoyyqj4xx5f53f.

      # Get Endpoint object

      You can get Endpoints by passing their slug value:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      proj = cnvrg.projects.get("myproject")
      ep = proj.endpoints.get('slug')
      

      NOTE

      You can also reference the current running endpoint from within its job scope:

      from cnvrgv2 import Endpoint
      ep = Endpoint()
      

      # List Endpoints

      ep_list = proj.endpoints.list(sort='-created_at')  # Descending order
      

      TIP

      sort the list by: -key -> DESC | key -> ASC
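The same convention applies everywhere a `sort` parameter is accepted; a small helper (hypothetical, not part of the SDK) makes the rule explicit:

```python
def sort_direction(sort_key):
    """Interpret a cnvrg-style sort key: a leading '-' means descending."""
    if sort_key.startswith("-"):
        return sort_key[1:], "DESC"
    return sort_key, "ASC"

print(sort_direction("-created_at"))  # ('created_at', 'DESC')
print(sort_direction("title"))        # ('title', 'ASC')
```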

      # Stop running Endpoints

      Stop a running Endpoint by passing its slug value (sync=False by default; an Endpoint must be running to be stopped):

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      ep = myproj.endpoints.get("endpoint-slug")
      ep.stop(sync=False)
      
      # Do bulk stop on multiple endpoints
      myproj.endpoints.stop(['endpoint-slug1','endpoint-slug2'])
      

      # Start a stopped Endpoint

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      ep = myproj.endpoints.get("endpoint-slug")
      ep.start()
      

      # Delete Endpoints

      You can delete an Endpoint from a project by its slug value:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      ep = myproj.endpoints.get("endpoint-slug")
      ep.delete()
      
      # Do bulk delete on multiple endpoints
      myproj.endpoints.delete(['endpoint-slug1','endpoint-slug2']) 
      

      # Endpoint Attributes

      You can access the Endpoint attributes by using regular dot notation, for example:

      >>> ep.api_key
      '43iVTWTp55N7p62iSZYZLyuk'
      
      Attribute type description
      title string Name of the Endpoint
      kind int The kind of endpoint (webservice, stream, batch)
      updated_at string When this Endpoint was last updated
      last_deployment dict details about the Endpoint's last deployment
      deployments List list of dictionaries containing details about all of the Endpoint's deployments
      deployments_count int The number of deployments that the Endpoint had
      templates List List of compute templates that are assigned to the Endpoint
      endpoint_url string The Endpoint's requests URL
      url string The Endpoint's base URL
      current_deployment dict The active deployment's data
      compute_name string The name of the current compute template that is being used for the Endpoint to run
      image_name string Name of the Endpoint's environment that is currently deployed
      image_slug string The slug value of the Endpoint's deployed image
      api_key string API key to access the Endpoint securely
      created_at string The time that this endpoint was created
      max_replica int Maximum number of pods to run this endpoint on
      min_replica int Minimum number of pods to run this endpoint on
      export_data bool Whether or not to export data
      conditions dict Conditions attached to this Endpoint; each triggers a Flow/email every time it is met

      # Update The Endpoint's version

      You can deploy a new version to the Endpoint and change some of its settings, for example:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      ep = myproj.endpoints.get("endpoint-slug") 
      ep.update_version(file_name="new_predict.py", commit="q7veenevzd83rewxgncx")
      

      # Update the Endpoint's replica set

      You can update the minimum and maximum number of pods to run the Endpoint on:

      ep.update_replicas(min_replica=2, max_replica=5)
      

      # Get sample code

      You can fetch the sample code to query the Endpoint (as shown in the Endpoint's main screen):

      Sample Code example:

      >>> sample_code = ep.get_sample_code()
      >>> sample_code['curl']
      'curl -X POST \\\n    http://endpoint_title.cnvrg.io/api/v1/endpoints/q7veenevzd83rewx...'
      

      # Poll charts

      You can fetch a dictionary with data about the Endpoint's latency performance, number of requests and user generated metrics from the Endpoint's charts:

      >>> ep.poll_charts()
      

      # Rollback version

      If you want to roll back the Endpoint to a previous version, pass the current version's slug value, for example:

      >>> ep.current_deployment['title']
      3  # current version is 3
      >>> last_version_slug = ep.current_deployment["slug"] 
      >>> ep.rollback(version_slug=last_version_slug)
      >>> ep.reload()
      >>> ep.current_deployment['title']
      2  # after the rollback the Endpoint's version is now 2
      

      NOTE

      To fetch the most updated attributes of the Endpoint, use ep.reload()

      # Set feedback loop

      You can grab all inbound data and feed it into a dataset for various uses, such as continuous learning for your models, for example:

      from cnvrgv2 import Cnvrg
      from cnvrgv2.modules.workflows.endpoint.endpoint import FeedbackLoopKind
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      ep = myproj.endpoints.get("endpoint-slug") 
      ds_slug = "dataset-name"
      ep.configure_feedback_loop(dataset_slug=ds_slug,
                                 scheduling_type=FeedbackLoopKind.IMMEDIATE)
      

      Set up the feedback loop behavior with the following parameters:

      Parameter type description required default
      dataset_slug string slug of the receiving dataset No None
      scheduling_type int Whether the feedback loop is immediate (runs for every request) or recurring (runs on a time interval); use FeedbackLoopKind.IMMEDIATE or FeedbackLoopKind.RECURRING No FeedbackLoopKind.IMMEDIATE
      cron_string string Cron syntax string if scheduling type is recurring No None

      Disable the feedback loop:

      ep.stop_feedback_loop()
      

      NOTE

      The data will be automatically saved in predict/predictions.csv
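The saved file can be consumed like any CSV; a sketch with an assumed two-column layout (the real columns depend on your endpoint's request and response payloads):

```python
import csv
import io

# Hypothetical sample of predict/predictions.csv; the actual columns
# depend on the endpoint's inputs and outputs.
sample = "input,prediction\n5,0.91\n3,0.17\n"
rows = list(csv.DictReader(io.StringIO(sample)))
print(rows[0]["prediction"])  # 0.91
```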

      # Control batch Endpoint

      If the Endpoint is of batch type, then you can control it straight from the SDK:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      ep = myproj.endpoints.get("endpoint-slug")
      
      # Check if it is running
      ep.batch_is_running()
      
      # Scale it up or down
      ep.batch_scale_up()
      ep.batch_scale_down()
      

      #

      # Webapps Operations

      # Create a webapp

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      wb = myproj.webapps.create("mywebapp",
                                templates=["small","medium"],
                                webapp_type="dash", 
                                file_name="app.py")
      

      # Available parameters:

      Parameter type description required default
      webapp_type string The type of webapp to create ("shiny", "dash" or "voila") Yes
      file_name string File name of the main app script Yes
      title string Name of the webapp No None
      templates list List of template names to be used No None
      datasets list List of datasets to connect with the webapp. No None

      # Available attributes:

      You can access the WebApp attributes by using regular dot notation, for example:

      >>> wb.webapp_type
      'dash'
      
      Attribute type description
      webapp_type string The type of webapp ("shiny", "dash" or "voila")
      template_ids List The IDs of the compute templates assigned to the webapp
      title string The name of the webapp
      members List List of collaborators on this webapp
      category string The webapp category
      description string Description of the webapp
      num_files int The number of files in the webapp
      last_commit string The last commit on this webapp
      current_commit string The current commit on this webapp object

      # Get webapp object

      Get a specific WebApp by its slug value

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      wb = myproj.webapps.get("webapp-slug")
      

      NOTE

      You can also reference the current running webapp from within its job scope:

      from cnvrgv2 import Webapp
      ws = Webapp()
      

      # List all the webapps in the project

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      # sort them by descending order
      wb = myproj.webapps.list(sort="-created_at")
      

      TIP

      sort the list by: -key -> DESC | key -> ASC

      # Delete webapp

      You can delete a WebApp from a project by its slug value:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      wb = myproj.webapps.get("webapp-slug")
      wb.delete()
      
      # Do bulk delete on multiple webapps
      myproj.webapps.delete(['webapp-slug1','webapp-slug2']) 
      

      # Stop running webapp

      Stop a running WebApp by passing its slug value (sync=False by default; a WebApp must be running to be stopped):

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      myproj = cnvrg.projects.get("myproject")
      wb = myproj.webapps.get("webapp-slug")
      wb.stop(sync=False)
      
      # Do bulk stop on multiple webapps
      myproj.webapps.stop(['webapp-slug1','webapp-slug2'])
      

      # Dataset Operations

      # Create a Dataset

      You can create a new Dataset in cnvrg:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      ds = cnvrg.datasets.create(name="MyDataset", category="general")
      

      You can use the following parameters to customize the Dataset:

      Parameter type description required default
      name string The name of the new Dataset Yes
      category string The type of dataset, can be one of the following: [general, images, audio, video, text, tabular] No "general"

      # Dataset ID

      In some methods, you will need to use a dataset ID. The dataset ID is the name used for the dataset in its URL.

      For example, if you have a dataset that lives at: https://app.cnvrg.io/my_org/datasets/MyDataset, the dataset ID is MyDataset.

      # Access Dataset attributes

      You can access the Dataset attributes by using regular dot notation:

      ds.slug
      ds.members
      ds.last_commit
      

      # Available attributes:

      Attribute type description
      slug string The unique slug value of the Dataset
      size int The size of the Dataset
      title string The name of the Dataset
      members List List of collaborators on this Dataset
      category string The data structure category
      description string Description of the Dataset
      num_files int The number of files in the Dataset
      last_commit string The last commit on this Dataset
      current_commit string The current commit on this Dataset object

      # Get a Dataset

      To fetch a Dataset from cnvrg you can use:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      ds = cnvrg.datasets.get("MyDataset")
      

      # List all existing Datasets

      You can list all the datasets in the current organization:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      ds = cnvrg.datasets.list(sort="-created_at")
      

      TIP

      sort the list by: -key -> DESC | key -> ASC

      # Delete a Dataset

      To delete a Dataset, call the delete() function on its instance:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      ds = cnvrg.datasets.get("MyDataset")
      ds.delete()
      

      # Dataset Commits

      Every Dataset in cnvrg may contain multiple data commits that you can interact with in the following manner:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      ds = cnvrg.datasets.get("MyDataset")  # Get the Dataset object
      
      # Get a specific commit by its sha1 value
      cm = ds.get_commit('xxxxxxxxx')
      # OR list all available commits
      commits = [cm for cm in ds.list_commits()]
      
      # Last and current commits are available as attributes
      last_commit = ds.last_commit
      current_commit = ds.current_commit
      

      Each commit contains the following attributes:

      Attribute type description
      sha1 string The unique sha1 value of this commit
      source int Where this commit was created from
      message string The commit message
      created_at datetime The date the commit was created on

      # Dataset Queries

      You can create and use queries directly from the cnvrg SDK to filter the Dataset exactly the way you want, using the Query language syntax.

      # Create a new Query:

      from cnvrgv2 import Cnvrg
      cnvrg = Cnvrg()
      ds = cnvrg.datasets.get("MyDataset")  # Get the Dataset object
      ds.queries.create(name='OnlyPngFiles',
                        query='{"fullpath":"*.png"}',
                        commit_sha1='xxxxxxxxx')
      

      Query parameters:

      Parameter type description required default
      name string The name of the query Yes
      query string The query string according to Query language syntax Yes
      commit_sha1 string The sha1 value of the commit that this query will be based on No None
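Because the query argument is a JSON string, it is safer to build it with json.dumps than to hand-write it; a minimal sketch producing the query used in the example above:

```python
import json

# Build the JSON query string from a plain dict instead of hand-writing it.
query = json.dumps({"fullpath": "*.png"})
print(query)  # {"fullpath": "*.png"}
```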

      # List all of the Dataset queries:

      ds.queries.list(sort="-created_at")
      

      TIP

      sort the list by: -key -> DESC | key -> ASC

      # Get a specific query

      q = ds.queries.get('slug')
      

      # Delete a query

      q.delete()
      

      # Dataset File operations

      # Upload files to a dataset

      ds.put_files(paths=['/files_dir/file1.txt', '/files_dir/file2.txt'], 
                   pattern='*')
      

      Available parameters:

      Parameter type description required default
      paths List The list of file paths that will be uploaded to the dataset Yes
      pattern string String defining the filename pattern No "*"
      message string The commit message No ""
      override bool Whether or not to re-upload even if the file already exists No False
      force bool Whether the new commit should copy files from its parent No False

      NOTE

      If a folder is given, all the files in that folder that match the pattern will be uploaded.
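The pattern appears to follow shell-style glob matching (the default is "*"); you can preview which filenames a pattern would select with Python's fnmatch module:

```python
import fnmatch

# Preview which local files a glob-style pattern selects
# (assuming shell-style matching, as the default "*" suggests).
files = ["a.txt", "b.csv", "model.pkl", "notes.txt"]
selected = fnmatch.filter(files, "*.txt")
print(selected)  # ['a.txt', 'notes.txt']
```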

      # Remove files from a dataset

      You can remove files from the dataset:

      ds.remove_files(paths='*',
                      message='This will delete everything!')
      

      NOTE

      When deleting files from a dataset, the paths parameter can be either a list of file paths or a string pattern such as '*'.

      # List Dataset content

      You can list all of the files and folders that are in the dataset:

      ds.list_files(query_raw='{"color": "yellow"}',
                    sort='-id')
      ds.list_folders(commit_sha1='xxxxxxxxx')
      

      Available Parameters

      Parameter type description required default
      commit_sha1 string Sha1 string of the commit to list the files from No None
      query string Query slug to list files from No None
      query_raw string Raw query to list files according to Query language syntax No None
      sort string Key to sort the list by (-key -> DESC / key -> ASC) No "-id"

      # Clone the dataset to the current working directory

      ds.clone()
      

      # Download dataset latest commit

      ds.download()
      

      WARNING

      The Dataset must be cloned first

      # Sync the local dataset with the remote one

      ds.sync_local() 
      
      Last Updated: 3/7/2023, 10:29:36 AM