# cnvrg SDK

The cnvrg SDK is written in Python and is designed to help data scientists interact with cnvrg from their code, experiments and models. Through the SDK, you can create experiments, manage models, automate your machine learning pipeline and more.


# Download and Install the cnvrg SDK

To install the cnvrg SDK, use the following pip command:

pip install cnvrg

Or if you're using pip for Python 3, use the following:

pip3 install cnvrg

WARNING

The cnvrg SDK requires Python 3.5 or later.

# SDK Operations

# Authenticating the cnvrg SDK

# Inside a cnvrg workspace or experiment

The cnvrg SDK will already be initialized and authenticated with cnvrg using the account that is logged in. You can start using cnvrg SDK functions immediately.

# Authenticate using the SDK

You can authenticate with the SDK's Cnvrg class inside any script you want to run locally. Authentication using this method does not persist after the script finishes running.

Import the Cnvrg class and initialize a Cnvrg object with your API URL, email and password.

from cnvrg import Cnvrg 
Cnvrg(url='/api', email='email_address', password='password')

In the code above, replace url='/api' with your deployment's API URL and fill in your email address and password. If you are not sure what your unique web URL is, log in via the web and copy the URL from your browser, or contact support for additional help.
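If you know your web URL but not the API endpoint, a common pattern (an assumption here, so verify it against your deployment) is that the API lives at the web domain plus /api. A small helper that derives it:

```python
from urllib.parse import urlparse

def api_url_from_web_url(web_url):
    """Derive the API endpoint from a cnvrg web URL.

    Assumes (not guaranteed for every deployment) that the API
    is served at <scheme>://<host>/api.
    """
    parts = urlparse(web_url)
    return "{}://{}/api".format(parts.scheme, parts.netloc)

# For example, a project page URL maps to the app-wide API endpoint:
api_url_from_web_url("https://app.cnvrg.io/my_org/projects/my_project")
# → "https://app.cnvrg.io/api"
```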

# Authenticate using the CLI

If you are running an experiment on your local machine or in another environment, you can authenticate your local workspace using the cnvrg CLI.

After authenticating with the CLI, ensure you are working in a folder that is linked with cnvrg (you can link a folder using either the cnvrg link or cnvrg clone command).

# Project Operations

# Create a new project

To create a new project:

from cnvrg import Project
p = Project.create('Project_Name')

# Upload files and artifacts to a project

To upload a list of locally stored files to the project, pass put_files() a list containing the absolute file paths of the files to upload:

from cnvrg import Project
# set a project object from existing project or create a new one.
p = Project.create('Project_Name')
p.put_files(['/cnvrg/notebook.ipynb','/cnvrg/model.h5'])

NOTE

If you are using Python on Windows, make sure the file path is formatted correctly, either as a raw string (for example, r'C:\User\cnvrg\notebook.ipynb') or with doubled backslashes (for example, 'C:\\User\\cnvrg\\notebook.ipynb').
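To see that the two spellings above denote the same path, and that pathlib sidesteps the escaping problem entirely, compare them directly:

```python
from pathlib import PureWindowsPath

# Two equivalent spellings of the same Windows path:
raw = r'C:\User\cnvrg\notebook.ipynb'        # raw string
escaped = 'C:\\User\\cnvrg\\notebook.ipynb'  # doubled backslashes

# pathlib accepts forward slashes and normalizes them to backslashes
# for Windows-style paths, so no escaping is needed at all.
p = PureWindowsPath('C:/User/cnvrg/notebook.ipynb')
```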

# Get a list of existing projects

To return a list of existing projects:

from cnvrg import Project
Project.list()

# Experiment Operations

# Experiment ID

In many commands, you will need to use an experiment ID. The experiment ID can be found in the URL for the experiment.

For example, if you have an experiment that lives at: https://app.cnvrg.io/my_org/projects/my_project/experiments/kxdjsuvfdcpqkjma5ppq, the experiment ID is kxdjsuvfdcpqkjma5ppq.

Then to use e=Experiment(), the command would be e=Experiment('kxdjsuvfdcpqkjma5ppq').
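Since the ID is always the last path segment of the experiment URL, a tiny hypothetical helper (not part of the SDK) can extract it:

```python
def id_from_url(url):
    # The experiment ID is the last path segment of its URL.
    return url.rstrip('/').split('/')[-1]

id_from_url("https://app.cnvrg.io/my_org/projects/my_project/experiments/kxdjsuvfdcpqkjma5ppq")
# → "kxdjsuvfdcpqkjma5ppq"
```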

# Run an experiment

If you have a function or file that you would like to run as an experiment, use the Experiment.run() command. The code can be run locally (on your own machine or in a cnvrg workspace) or on remote compute (using the compute parameter).

A function can only be run as an experiment locally (if compute="local").

To create a new experiment and run specific code locally or on a remote machine:

from cnvrg import Experiment
e = Experiment.run(command,title='my title', compute='small')
| Parameter | Type | Description |
|---|---|---|
| command | string | The command to run. Can be a shell command or a function. For example, python3 train.py or main. |
| title | string | The title for the experiment. |
| image | string | The image to use for the experiment. |
| compute | string | The compute to run the experiment on. Default is None (runs locally). For Kubernetes compute, use the format cluster_name.template_name, for example workers.medium; you can leave out cluster_name to use the default cluster. For an on-premise machine, simply write the name of the machine, for example worker1. |
| datasets | array of dictionaries | A list of datasets and commits/queries to attach to the experiment, that is: [{"id": "dataset_id-1", "commit": "commit_sha1-1", "query": "query_name-1", "tree_only": true, "use_cached": true}, {dataset_dict_2}, ...]. You can use either commit or query, not both. use_cached is only relevant when integrated with NFS and is true by default. |
| callable | callable object | A function to run as the experiment, instead of command. |
| arguments | dictionary | Parameters to be parsed on the command line (for example, parameters for grid search). |
| schedule | string | A time at which to execute the experiment, relative to the timezone defined for the current user, in either 24- or 12-hour clock format: dd.mm.yyyy hh:mm<PM/AM> or yyyy.mm.dd hh:mm<PM/AM>. For example, to specify 4:25 PM on the 31st of January 2021, use any of: 31.01.2021 16:25, 31.01.2021 4:25PM, 2021.01.31 16:25, 2021.01.31 4:25PM. |
| recurring | string | A schedule for executing a recurring experiment, in cron format. For example, to execute the experiment at 30 minutes past every hour, specify "30 * * * *". |
| notify_on_success | boolean | Receive an email when the experiment ends successfully. Default is False. |
| notify_on_error | boolean | Receive an email if an error occurs while your experiment is running. Default is False. |
| emails | string | Extra recipients for email notifications, as a comma-separated list. For example, "person1@email.com,person2@email.com". |
| git_branch | string | (When the project is connected to git) The git branch to clone. Default is the project's default. |
| git_commit | string | (When the project is connected to git) The commit of the git branch to clone. Default is the project's default. |
| commit | string | (When the project is connected to git) The commit of the cnvrg project files to clone. Default is the project's default. |
| output_dir | string | (When the project is connected to git) The path of the folder with experiment artifacts to be synced as the commit for the experiment. Default is the project's default. |
| git_diff | boolean | (When the project is connected to git) Whether to additionally sync the files in the git diff. Default is False. |

NOTE

The git_diff parameter is useful when developing in a workspace. With it, you can experiment with your code and run experiments without first syncing to git. The code you run with will still be synced, so you can eventually push back the version you are most happy with.
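The four schedule spellings accepted by the schedule parameter can be checked locally with datetime.strptime; a small sketch (an illustration of the formats only, not cnvrg's own parser) showing that all four resolve to the same moment:

```python
from datetime import datetime

# The four documented schedule formats.
FORMATS = [
    "%d.%m.%Y %H:%M",    # 31.01.2021 16:25
    "%d.%m.%Y %I:%M%p",  # 31.01.2021 4:25PM
    "%Y.%m.%d %H:%M",    # 2021.01.31 16:25
    "%Y.%m.%d %I:%M%p",  # 2021.01.31 4:25PM
]

def parse_schedule(text):
    """Parse a schedule string in any of the documented formats."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized schedule: %r" % text)
```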

# Examples

  • Run an experiment on a remote small compute, using a script with a dataset and a query:

    from cnvrg import Experiment
    e = Experiment.run('python3 train.py',
                        title='my title',
                        compute='small',
                        datasets=['dogscats.only_dogs'])
    
  • Run a grid search:

    from cnvrg import Experiment
    e = Experiment.run('python3 train.py',
                        title='grid search',
                        arguments={"epochs": [10,15,20], "batch_size": [128,256]})
    
  • Run a grid search using a YAML file:

    You can also run a grid search using the SDK by including the parameter information in a YAML file. To use the YAML file, you must first load it.

    from cnvrg import Experiment
    import yaml
    with open('/path/parameters.yaml') as f:
        grid = yaml.full_load(f)
    e = Experiment.run('python3 train.py', grid=grid)
    

    TIP

    Learn more about running grid searches and hyperparameter optimization here.

  • Run an experiment locally:

    from cnvrg import Experiment
    e = Experiment.run('python3 train.py',title='my title')
    
  • Run an experiment on a remote compute:

    from cnvrg import Experiment
    e = Experiment.run('python3 train.py',title='my title',compute='gpu')
    
  • Run an experiment using a specific Docker image:

    from cnvrg import Experiment
    e=Experiment()
    e.run('command', compute='medium', image="tensorflow:19.07")
    
  • Run a local experiment using a function:

    from cnvrg import Experiment
    def main(experiment):
        [print(i) for i in range(100)]
    e = Experiment.run(main,title='my function')
    
  • Load an existing experiment object from inside a running experiment:

    from cnvrg import Experiment
    e = Experiment()
    # run commands on the experiment object:
    e.log_param("tag_key", "tag_value")
    
  • Use email notifications for your experiment:

    from cnvrg import Experiment
    e = Experiment.run('python3 train.py', notify_on_success=True, notify_on_error=True)
    
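For the grid-search examples above, cnvrg runs one experiment per permutation of the argument values. The expansion happens server-side, but its effect can be sketched in plain Python:

```python
from itertools import product

grid = {"epochs": [10, 15, 20], "batch_size": [128, 256]}

# One experiment per combination of argument values: 3 * 2 = 6 runs.
runs = [dict(zip(grid, combo)) for combo in product(*grid.values())]
```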

# Initialize an experiment object

You can also create an Experiment object with the Experiment.init() command. This allows you to easily add SDK tracking into code you will run. Simply initialize the object and then run your code. Use the metadata operations to control what will be tracked by cnvrg.

To create a new experiment object that can track logs, metrics and artifacts:

from cnvrg import Experiment
e = Experiment.init()
# Then run your code and track it using the sdk, for example:
e.log("Starting my experiment")

You can also use the project parameter to tell cnvrg which project the experiment should be part of. If you use Experiment.init() in a cnvrg project folder linked with the CLI (either online or locally) you do not need to set this parameter.

from cnvrg import Experiment
e = Experiment.init(project=Project('slug'))

# Finish an experiment object

If you have initialized an Experiment object using the .init() command, you should conclude the experiment with the .finish() command.

To conclude an experiment object:

from cnvrg import Experiment
e.finish(exit_status=exit_status)

The default exit_status is aborted (exit_status=-1).

# Examples

    # Metadata operations on experiments

    # Add logs to an experiment:

    from cnvrg import Experiment
    e = Experiment()
    e.log("my first log")
    e.log(["my first log","my second log"])
    

    # Get logs of an experiment:

    from cnvrg import Experiment
    e = Experiment()
    e.logs() ### prints the logs of the experiment
             ### until the status of the experiment
             ### is error/aborted/success
    

    # Get the title of an experiment and set a new one:

    from cnvrg import Experiment
    e = Experiment("EXPERIMENT_SLUG")
    e.title # get title
    e.title = "dogs-and-cats" # set new title
    

    # Get various other metadata:

    from cnvrg import Experiment
    e = Experiment("EXPERIMENT_SLUG")
    e['status'] # get current status of experiment
    e['full_href'] # get full URL of the experiment
    e['commit'] # get the start commit of the experiment
    e['end_commit'] # get the end commit of the experiment
    

    # Create a tag:

    from cnvrg import Experiment
    e = Experiment()
    e.log_param("my_tag", "my_value")
    

    # Create a line chart:

    • Single line:

      from cnvrg import Experiment
      e = Experiment()
      e.log_metric("chart_name",
              Ys=[0.1, 0.2, 0.3],
              Xs=[1,2,3])
      
    • Multiple lines:

      from cnvrg import Experiment
      e = Experiment()
      e.log_metric("chart_name",
              Ys=[0.1, 0.2, 0.3],
              Xs=[1,1,1],
              grouping=["loss", "val_loss", "acc"])
      

    Xs and grouping are optional. When Xs is set, you control the X value corresponding to each Y value. When grouping is set, multiple lines are created: the values at the same index in grouping, Xs and Ys are matched as a point on the corresponding line. Xs and Ys can be a float, an integer, a single numpy value, a list of floats, a list of numpy values or a numpy array.

    WARNING

    chart_name can't include "/"
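Conceptually, the grouping match works like zipping the three lists and appending each (x, y) point to its group's line; a plain-Python illustration (not cnvrg's internals):

```python
# Same inputs as the multiple-lines example above.
Xs = [1, 1, 1]
Ys = [0.1, 0.2, 0.3]
grouping = ["loss", "val_loss", "acc"]

# Each (group, x, y) triple becomes a point on that group's line.
lines = {}
for g, x, y in zip(grouping, Xs, Ys):
    lines.setdefault(g, []).append((x, y))
# lines == {"loss": [(1, 0.1)], "val_loss": [(1, 0.2)], "acc": [(1, 0.3)]}
```

Repeated calls to e.log_metric() with the same chart name append further points, so logging once per training step with grouping builds the multi-line chart incrementally.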

    # Create a bar chart:

    Example bar chart

    • Single bar:

      from cnvrg import Experiment
      from cnvrg.charts import Bar
      e = Experiment()
      x_value=["bar1","bar2",...,"barN"]
      y_value=[1,2,...,n]
      e.log_chart("chart_key", title="chart_title",
          data=Bar(x=x_value, y=y_value, name="y_value"))
      
    • Multiple bars:

      from cnvrg import Experiment
      from cnvrg.charts import Bar
      e = Experiment()
      x_value=["bar1","bar2",...,"barN"]
      y_value1=[1,2...n]
      y_value2=[n, n-1,...,1]    
      e.log_chart("chart_key", title="chart_title",
          data=[Bar(x=x_value, y=y_value1, name="y_value1", min_val=0, max_val=10),
                Bar(x=x_value, y=y_value2, name="y_value2", colors=['#FF0000', '#00FF00', '#0000FF',...])])
      

    Typing information: x must be a List and y must be an Array, np.ndarray, pd.array or pd.Series.

    The x list populates the labels for the bars, and the corresponding y value dictates the value of the bar for that category. The name of the y array is the name of the set/category in the graph. min_val and max_val are optional numbers corresponding to the lower and upper bounds for the y values. Optionally, you can set each bar to a specific color using the colors list of hex values, with each hex value corresponding to an x value.

    Steps and groups:

    Using steps and groups allows you to submit bar charts across different steps and visualize them in a single chart with a slider to easily move between the steps. steps should be an integer and group should be a string. Multiple steps should be grouped under a single group.

    Animated bargraph

    NOTE

    e.log_chart() creates a new graph or overwrites an existing graph. It will not append new bars to the previous version of the graph (unlike e.log_metric()).
    You must therefore include all bars in a single e.log_chart() call every time you update the chart, or use a different chart_key.

    # Create a heat map/confusion matrix:

    Example heat map

    from cnvrg import Experiment
    from cnvrg.charts import MatrixHeatmap
    e = Experiment()
    e.log_chart("chart_key", title="Heatmap", x_ticks=['x', 'y'], y_ticks=['a', 'b'],
        data=MatrixHeatmap(matrix=[(0,5), (1,1)],
                           color_stops=[[0,'#000000'],[1, '#7EB4EB']],
                           min_val=0,
                           max_val=10))
    

    Typing information: x_ticks and y_ticks must be a List and matrix is a list of tuples in struct (x,y,z). color_stops is optional and is a List of Lists of size 2, where the first value is a float 0 <= X <= 1 and the second value is the hex color representing matrix values at that point of the scale. min_val and max_val are optional numbers corresponding to the minimum and maximum values for the key (scaling is done automatically when these values are not submitted).

    Each struct corresponds to a row in the matrix and to a label from the y_ticks list. The matrix is built from the bottom up, with the first struct and y_tick at the bottom edge. Each value inside the struct corresponds to an x_tick.

    Steps and groups:

    Using steps and groups allows you to submit heatmaps across different steps and visualize them in a single chart with a slider to easily move between the steps. steps should be an integer and group should be a string. Multiple steps should be grouped under a single group.

    Animated Heatmap

    TIP

    When using the group parameter, make sure the chart's key is unique across the different steps

    NOTE

    When using e.log_chart() you are creating or overwriting an existing graph. It will not append new values to the previous version (unlike e.log_metric()).
    You must therefore include all values in a single e.log_chart() call every time you update the chart, or use a different chart_key.

    # Create a scatter plot:

    Example scatter plot

    • Single set of points:
      from cnvrg import Experiment
      from cnvrg.charts import Scatterplot
      e=Experiment()
      x_values=[1,2,3,4,5]
      y_values=[5,4,3,2,1]  
      e.log_chart("chart_key", title="chart_title",
          data=Scatterplot(x=x_values, y=y_values, name="name"))
      
    • Multiple sets of points:
      from cnvrg import Experiment
      from cnvrg.charts import Scatterplot
      e=Experiment()
      x1_values=[1,2,3,4,5]
      x2_values=[1,2,3,4,5]
      y1_values=[5,4,3,2,1]  
      y2_values=[1,2,3,4,5] 
      e.log_chart("chart_key", title="chart_title",
          data=[Scatterplot(x=x1_values, y=y1_values, name="name"),
                Scatterplot(x=x2_values, y=y2_values, name="name2")])
      

    Typing information: x and y must be an Array, np.ndarray, pd.array or pd.Series. x is the list of x values and y is the list of y values.

    NOTE

    When using e.log_chart() you are creating or overwriting an existing graph. It will not append new values to the previous version (unlike e.log_metric()).
    You must therefore include all values in a single e.log_chart() call every time you update the chart, or use a different chart_key.

    # Log artifacts for your experiment:

    from cnvrg import Experiment
    e = Experiment()
    e.log_artifacts(['path/file.png','path/file2.png'], 'target_dir/')
    
    | Parameter | Type | Description |
    |---|---|---|
    | files | list of strings | List of files to upload (using relative paths). |
    | target_dir | string | Path in the experiment file system to upload the files to. |

    # Show a line chart:

    from cnvrg import Experiment
    e = Experiment("experiment_id")
    e.show_chart("chart_name") ## will open a matplotlib chart
    

    # Operations on experiments

    • Sync during a running experiment:
    from cnvrg import Experiment
    e = Experiment()
    e.sync(message="my commit")
    
    • Set the title of an experiment:
    from cnvrg import Experiment
    e = Experiment()
    e.set_title("my new title")
    
    • Send an email alert during an experiment:
    from cnvrg import Experiment
    e = Experiment()
    e.alert(message="Alert: Accuracy is less than 0.5", subject="Accuracy alert", recipients=["jp@myorg.com"])
    
    • Get a list of commits per experiment
    from cnvrg import Experiment
    e = Experiment()
    e.commits()
    
    • Get the latest commit per experiment
    from cnvrg import Experiment
    e = Experiment()
    e.commits(latest=True)
    

    # Experiment artifacts

    • Retrieve a list of artifacts for a specific experiment:
    from cnvrg import Experiment
    e = Experiment("experiment_id")
    e.artifacts()
    
    • Retrieve a list of artifacts after running an experiment:
    from cnvrg import Experiment
    e = Experiment.run('python3 train.py',title='my title',compute='gpu')
    e.artifacts()
    

    Response:

    [{'name': 'artifact.txt',
      'url': 'URL LINK',
      'type': 'blob',
      'size': '121 Bytes',
      'fullpath': 'artifact.txt',
      'updated': '14 minutes ago'}]
    
    • Pull artifacts of a specific experiment:
    from cnvrg import Experiment
    e = Experiment("experiment_id")
    e.pull_artifacts()
    
    | Parameter | Type | Description |
    |---|---|---|
    | commit | string | A specific commit to list artifacts from. Default is the experiment's latest commit. |
    | wait_until_success | boolean | Wait until the experiment has finished successfully and then pull the artifacts. |
    • Pull artifacts to a specific path:
    from cnvrg import Experiment
    e = Experiment("experiment_id")
    e.pull_artifacts("path_to_folder")
    
    • Wait until the experiment has finished running successfully and then pull the artifacts from the end commit:
    from cnvrg import Experiment
    e = Experiment("experiment_id")
    e.pull_artifacts(wait_until_success=True)
    
    • Pull artifacts after running an experiment:
    from cnvrg import Experiment
    e = Experiment.run('python3 train.py',title='my title',compute='gpu')
    e.pull_artifacts()
    
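Because artifacts() returns plain dictionaries in the format shown above, you can post-process the response with ordinary Python; for example, mapping file names to their download URLs:

```python
# Sample response in the documented artifacts() format.
artifacts = [{'name': 'artifact.txt',
              'url': 'URL LINK',
              'type': 'blob',
              'size': '121 Bytes',
              'fullpath': 'artifact.txt',
              'updated': '14 minutes ago'}]

# Collect download URLs for files (type 'blob'), keyed by name.
urls_by_name = {a['name']: a['url'] for a in artifacts if a['type'] == 'blob'}
```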

    # Flow Operations

    Flows can be created and run from any environment using the SDK. Creating flows requires using a flow configuration YAML file.

    # Create a flow

    You can use a flow YAML to create a flow inside a project. You can use either the absolute path to a YAML file or include the YAML content directly. Use the Flow.create command:

    from cnvrg import Flow
    f = Flow.create(yaml_content=string, project=Project('Project_Name'))
    #OR
    f = Flow.create(file=path_to_YAML, project=Project('Project_Name'))
    
    | Parameter | Type | Description |
    |---|---|---|
    | yaml_content | string | Required (if not using file). A string with the contents of the YAML configuration. |
    | file | path | Required (if not using yaml_content). A path to the YAML configuration file. |
    | project | Project object | A cnvrg Project object for the project to create the flow in. Can be omitted if you are in a cnvrg project (linked with the CLI). |

    # Examples

    1. Using path:

      from cnvrg import Flow
      from cnvrg import Project
      f = Flow.create(file='/local_path/flow.yaml', project=Project('Project_Name'))
      
    2. Using YAML directly:

      ## Define your flow
      example_yaml = '''---
      flow: Flow Example
      recurring: 
      tasks:
      - title: Training Task
        type: exec
        input: python3 train.py
        computes:
        - medium
        image: cnvrg:v5.0
      relations: []'''
      
      from cnvrg import Flow
      from cnvrg import Project
      f = Flow.create(yaml_content=example_yaml, project=Project('Project_Name'))
      

    # Run a flow

    To trigger a flow and run it as described in the YAML file or as constructed using the online flow builder:

    from cnvrg import Flow
    f = Flow("flow_title")  # or use the Flow object returned by Flow.create
    fv = f.run()
    

    # Monitor the status of a running flow

    To check the status of a running flow:

    from cnvrg import Flow
    f = Flow("flow_title")
    fv = f.run()
    fv.info()
    

    # Pull latest task artifacts

    This command pulls the artifacts that were created by the task's experiment in the latest flow version that ran.

    NOTE

    You can only pull artifacts from a task that runs a single experiment.

    from cnvrg import Flow
    flow = Flow("training") #name of the flow
    flow.pull_artifacts(task="SVM")
    
    | Parameter | Type | Description |
    |---|---|---|
    | task | string | Required. The name of the flow task to pull artifacts from. |

    # Endpoint Operations

    # Endpoint ID

    In many commands, you will need to use an endpoint ID. The endpoint ID can be found in the URL for the endpoint.

    For example, if you have an endpoint that lives at: https://app.cnvrg.io/my_org/projects/my_project/endpoints/show/j46mbomoyyqj4xx5f53f, the endpoint ID is j46mbomoyyqj4xx5f53f.

    Then to use e=Endpoint(), the command would be e=Endpoint('j46mbomoyyqj4xx5f53f').

    # Load an endpoint

    To load an endpoint and use it with the SDK:

    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    

    # Query an endpoint

    • Send a single prediction to endpoint:
    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    e.predict("my_predict")
    
    
    • Send batch predictions to endpoint:
    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    e.batch_predict(["my_predict1", "my_predict2",...,"my_predictn"], pool_size=100)
    
    

    # Endpoint tracking and visualizing

    To log a metric in your endpoint, use the log_metric() method in your code:

    from cnvrg import Endpoint
    e = Endpoint()
    e.log_metric(k,v)
    

    k is the name of the metric and v is its value.

    For example: e.log_metric('confidence', 0.8)

    The triggers in your endpoint leverage the metrics you log with this command. For example, to set a trigger on confidence, you must be logging that metric using the cnvrg SDK. Tracked metrics are also used to create live graphs on the Charts tab.

    WARNING

    e.log_metric() currently only supports logging numbers.

    # Get predictions

    To get the previous 1000 predictions made by a model, use the get_predictions() method:

    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    predictions_list = e.get_predictions()
    

    This method will return a list with the last 1000 predictions made by the endpoint.

    # Rollback to previous model

    To roll back a model that is in production, use the rollback() method. This rolls back the latest deployed model, and the endpoint will serve 100% of the traffic using the previous model:

    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    e.rollback()
    

    # Miscellaneous

    • Log a prediction in your endpoint:
    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    e.log_request(input, output, start_time)  # start_time in milliseconds
    
    • Disable logging input in logs:
    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    e.disable_log_input()
    
    • Disable logging output in logs:
    from cnvrg import Endpoint
    e = Endpoint("endpoint_id")
    e.disable_log_output()
    

    # Dataset Operations

    # Dataset ID

    In many methods, you will need to use a dataset ID. The dataset ID is the name used for the dataset in its URL.

    For example, if you have a dataset that lives at: https://app.cnvrg.io/my_org/datasets/dataset_name, the dataset ID is dataset_name.

    Then to use ds=Dataset(), the command would be ds=Dataset('dataset_name').

    # Load a dataset

    To load a dataset with the SDK use the Dataset module.

    from cnvrg import Dataset
    ds = Dataset('dataset_id')
    

    Ensure that the dataset slug matches the dataset's URL inside cnvrg.

    # List dataset query files

    from cnvrg import Dataset
    ds = Dataset('dataset_id') # Initializing dataset
    query_files = ds.get_query("query_slug")
    # [{'name': 'create_tags.rb',
    #  'fullpath': 'create_tags.rb',
    #  'url': 'https://object_storage_url...',
    #  'metadata': {'color': 'blue', 'animal': 'dino'}
    #  },
    # {'name': 'dataset_tags.rb',
    #  'fullpath': 'dataset_tags.rb',
    #  'url': 'https://object_storage_url...',
    #  'metadata': {'color': 'white', 'animal': 'dog'}
    #  }]
    

    Ensure that the dataset slug matches the dataset's URL inside cnvrg and the query slug matches the query's URL in cnvrg.

    # Parameters

    | Parameter | Type | Description |
    |---|---|---|
    | data_dir | string | The dataset's local directory, if the files exist locally. |
    | filter | string | Comma-separated list of result fields to keep: fullpath, name, metadata, url. |

    # Examples

    • Using the data_dir parameter:

      from cnvrg import Dataset
      ds = Dataset('dataset_id') # Initializing dataset
      query_files = ds.get_query("query_slug",data_dir="/home/my_data")
      # [{'name': 'create_tags.rb',
      #  'fullpath': '/home/my_data/create_tags.rb',
      #  'url': 'https://object_storage_url...',
      #  'metadata': {'color': 'blue', 'animal': 'dino'}
      #  },
      # {'name': 'dataset_tags.rb',
      #  'fullpath': '/home/my_data/dataset_tags.rb',
      #  'url': 'https://object_storage_url...',
      #  'metadata': {'color': 'white', 'animal': 'dog'}
      #  }]
      
    • Using a filter:

      from cnvrg import Dataset
      ds = Dataset('dataset_id') # Initializing dataset
      query_files = ds.get_query("query_slug",data_dir="/home/my_data",filter="name, fullpath")
      # [{'name': 'create_tags.rb',
      #  'fullpath': '/home/my_data/create_tags.rb',}
      #  },
      # {'name': 'dataset_tags.rb',
      #  'fullpath': '/home/my_data/dataset_tags.rb',}
      #  }]
      
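The filter parameter is a comma-separated list of keys to keep; roughly, it performs a selection like the following sketch (an illustration, not cnvrg's actual implementation):

```python
def apply_filter(records, filter_str):
    # Keep only the comma-separated keys named in filter_str.
    keys = [k.strip() for k in filter_str.split(',')]
    return [{k: r[k] for k in keys if k in r} for r in records]

# Sample record in the documented query-result format.
records = [{'name': 'create_tags.rb',
            'fullpath': '/home/my_data/create_tags.rb',
            'url': 'https://object_storage_url...',
            'metadata': {'color': 'blue', 'animal': 'dino'}}]

filtered = apply_filter(records, "name, fullpath")
```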

    # Search dataset using a query

    from cnvrg import Dataset
    ds = Dataset('dataset_id') # Initializing dataset
    query_files = ds.search("{'type':'dog'}")
    # [{'name': 'create_tags.rb',
    #  'fullpath': 'create_tags.rb',
    #  'url': 'https://object_storage_url...',
    #  'metadata': {'color': 'blue', 'animal': 'dino'}
    #  },
    # {'name': 'dataset_tags.rb',
    #  'fullpath': 'dataset_tags.rb',
    #  'url': 'https://object_storage_url...',
    #  'metadata': {'color': 'white', 'animal': 'dog'}
    #  }]
    

    Ensure that the dataset slug matches the dataset's URL inside cnvrg and the query slug matches the query's URL in cnvrg.

    # Parameters

    | Parameter | Type | Description |
    |---|---|---|
    | data_dir | string | The dataset's local directory, if the files exist locally. |
    | filter | string | Comma-separated list of result fields to keep: fullpath, name, metadata, url. |

    # Examples

    • Using the data_dir parameter:

      ds = Dataset('dataset_id') # Initializing dataset
      query_files = ds.search("{'type':'dog'}",data_dir="/home/my_data")
      # [{'name': 'create_tags.rb',
      #  'fullpath': '/home/my_data/create_tags.rb',
      #  'url': 'https://object_storage_url...',
      #  'metadata': {'color': 'blue', 'animal': 'dino'}
      #  },
      # {'name': 'dataset_tags.rb',
      #  'fullpath': '/home/my_data/dataset_tags.rb',
      #  'url': 'https://object_storage_url...',
      #  'metadata': {'color': 'white', 'animal': 'dog'}
      #  }]
      
    • Using a filter:

      ds = Dataset('dataset_id') # Initializing dataset
      query_files = ds.search("{'type':'dog'}",data_dir="/home/my_data",filter="name, fullpath")
      # [{'name': 'create_tags.rb',
      #  'fullpath': '/home/my_data/create_tags.rb',}
      #  },
      # {'name': 'dataset_tags.rb',
      #  'fullpath': '/home/my_data/dataset_tags.rb',}
      #  }]
      

    # Fetch the metadata of dataset files

    from cnvrg import Dataset
    ds = Dataset("dataset_id")
    metadata = ds.get_metadata(["dogs/black_dog.png", "cats/black_cat.png", "cats/white_cat.png", "non_existent_file", "file_without_metadata"])
    # {
    #     "dogs/black_dog.png": {
    #                     "color": "black",
    #                     "animal": "dog"
    #                  },
    #     "cats/black_cat.png": {
    #                     "color": "black",
    #                     "animal": "cat"
    #                  }
    #     "cats/white_cat.png": {
    #                     "color": "white",
    #                     "animal": "cat"
    #                  }
    #     "file_without_metadata": {}
    # }
    

    Ensure that the dataset slug matches the dataset's URL inside cnvrg.
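Because get_metadata() returns a plain dict of file path to metadata, you can post-process it directly; for example, grouping files by a metadata key:

```python
# Sample response in the documented get_metadata() format.
metadata = {
    "dogs/black_dog.png": {"color": "black", "animal": "dog"},
    "cats/black_cat.png": {"color": "black", "animal": "cat"},
    "cats/white_cat.png": {"color": "white", "animal": "cat"},
    "file_without_metadata": {},
}

# Group file paths by the value of the "animal" key, skipping
# files that have no metadata.
by_animal = {}
for path, meta in metadata.items():
    if "animal" in meta:
        by_animal.setdefault(meta["animal"], []).append(path)
# by_animal == {"dog": ["dogs/black_dog.png"],
#               "cat": ["cats/black_cat.png", "cats/white_cat.png"]}
```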

    # Cache and clear commits

    • To cache a commit to an external NFS disk:

      from cnvrg import Dataset
      ds = Dataset('dataset_id')
      ds.cache_commit(commit_sha1='sha1', external_disk_slug='disk_slug')
      
    • To clear a commit from an external NFS disk:

      from cnvrg import Dataset
      ds = Dataset('dataset_id')
      ds.cache_clear(commit_sha1='sha1', external_disk_slug='disk_slug')
      
    | Parameter | Type | Description |
    |---|---|---|
    | commit_sha1 | sha1 (string) | The sha1 of the commit. |
    | external_disk_slug | string | The slug of the external disk. |

    # Dataset metadata (commits and external disks)

    • To return the current commit:

      from cnvrg import Dataset
      ds = Dataset('dataset_id')
      ds.get_current_commit()
      
    • To return a tree-structured dictionary of the dataset files in the current commit:

      from cnvrg import Dataset
      ds = Dataset('dataset_id')
      ds.get_current_tree()
      

    # AI Library Operations

    You can run a component from the AI Library using the following method:

    from cnvrg import Library
    library = Library('/cnvrg_libraries/library_name')
    experiment = library.run(arg1="val1",arg2="val2",...)
    

    You can also run a grid search by passing an array for an argument. cnvrg will identify how many experiments need to be run, one per permutation, and start them all. Use the following format:

    from cnvrg import Library
    library = Library('/cnvrg_libraries/library_name')
    experiment = library.run(arg=["val1","val2"])
    

    All of the component's arguments can be included in the library.run() method. If an argument has no default value, it must be provided in the method call.

    You can also configure the experiment's environment using the following arguments:

    command
    arguments
    grid
    title
    project
    computes
    datasets
    local
    library
    working_directory
    sync_before
    sync_after
    prerun
    requirements
    image
    

    For example, to run a library locally:

    from cnvrg import Library
    library = Library('/cnvrg_libraries/library_name')
    experiment = library.run(computes='local')
    

    # Troubleshooting the cnvrg SDK

    If you experience issues running or installing the cnvrg SDK, troubleshoot by performing the following checks:

    # Update the cnvrg SDK

    Sometimes updating the SDK can help resolve the issue.

    To update the SDK, run the command:

    pip install cnvrg -U
    

    Or if you're using pip for Python 3, use the following:

    pip3 install cnvrg -U
    

    If you do not have the latest version, the newest version will be downloaded and installed.

    After updating the SDK, check if your issue persists.

    # Install Xcode Command Line Tools (MacOS)

    If you repeatedly encounter errors installing pip or the SDK, you may not have the Xcode Command Line Tools installed on your macOS machine. Run the following command and then try again:

    xcode-select --install
    

    NOTE

    Sometimes, updating your version of MacOS requires a reinstallation of the Xcode Command Line Tools to fix common command line issues.

    # Update or remove the snowflake-connector-python

    If you are using the python package: snowflake-connector-python in your experiment, please upgrade it to version 2.2.7 or above using the following command:

    pip3 install --upgrade 'snowflake-connector-python>=2.2.7'
    

    If you no longer need the package at all, you can delete it with the following command:

    pip3 uninstall snowflake-connector-python
    
    Last Updated: 10/13/2021, 1:54:34 PM