# Recommenders AI Blueprint - deprecated from 11/2024

# Inference

A recommender system is a set of algorithms that assesses people’s choices and recommends similar items to them. The recommendations are based on a person’s selections from a fixed list of items, and are tailored to avoid repeating items the person has already viewed and rated.

This blueprint is intended for intermediate users such as business analysts, marketing associates, and data scientists. It can also help business owners in industries such as retail, entertainment, and academia make suggestions based on customer interests.

# Purpose

Use this inference blueprint to recommend similar items to customers according to their behaviors. To use this pretrained recommender model, create a ready-to-use API endpoint that can be quickly integrated with your data and application.

This inference blueprint’s model was trained using movie ratings data, specifically the MovieLens 100K Dataset. To use custom data for your specific business, run this blueprint’s training counterpart, which trains the model and establishes an endpoint based on the newly trained model.

# Instructions

NOTE

The minimum resource recommendations to run this blueprint are 3.5 CPU and 8 GB RAM.

Complete the following steps to deploy this recommender API endpoint:

1. Click the Use Blueprint button.
2. In the dialog, select the relevant compute to deploy the API endpoint and click the Start button.
3. The cnvrg software redirects to your endpoint. Complete one or both of the following options:
   - Use the Try it Live section with a relevant user ID to check your model's predictions.
   - Use the bottom integration panel to integrate your API with your code by copying in your code snippet.

An API endpoint that recommends similar items to users based on their behavior has now been deployed. For information on this blueprint's software version and release details, click here.
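As a rough illustration of the integration step, the sketch below builds a POST request for a deployed endpoint using only the Python standard library. The endpoint URL, the API key, the `Cnvrg-Api-Key` header name, and the `input_params` payload key are all assumptions made for illustration; the snippet shown in your endpoint's integration panel is the authoritative source for these values.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the values from your endpoint's integration panel.
ENDPOINT_URL = "https://example.invalid/api/v1/endpoints/your-endpoint-id"
API_KEY = "your-api-key"


def build_recommendation_request(user_id, url=ENDPOINT_URL, api_key=API_KEY):
    """Build (but do not send) a POST request asking for recommendations for one user."""
    # Assumed payload shape: a single "input_params" field carrying the user ID.
    body = json.dumps({"input_params": str(user_id)}).encode("utf-8")
    headers = {
        "Cnvrg-Api-Key": api_key,  # assumed header name
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_recommendations(user_id, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_recommendation_request(user_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the payload and headers before any network call is made.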

Refer to the following blueprints related to this inference blueprint:

# Training

# Overview

The following diagram provides an overview of this blueprint's inputs and outputs.

# Purpose

Use this training blueprint to train a custom model that can recommend similar items to customers according to their behaviors. For the model to learn each customer’s choices and predict recommendations on future items, the blueprint requires custom data in the form of a customer’s preferences on the current items.

The model predictions are based directly on the scores, which are essentially predicted ratings for all items rather than just the ones the customer has already viewed and rated. This blueprint also establishes an endpoint that can recommend similar items to customers according to their behavior based on the newly trained model.
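To make the idea of score-based recommendations concrete, the following sketch ranks one customer's predicted ratings and skips items the customer has already rated. It is an illustration only: the function name, the dictionary layout, and the tie-breaking rule are assumptions, and the blueprint's own ranking logic may differ.

```python
def recommend_top_k(predicted_scores, already_rated, k=5):
    """Return the k highest-scoring item IDs the customer has not yet rated.

    predicted_scores: dict mapping item ID -> predicted rating for one customer.
    already_rated: set of item IDs the customer has already viewed and rated.
    """
    candidates = [
        (item_id, score)
        for item_id, score in predicted_scores.items()
        if item_id not in already_rated
    ]
    # Sort by score descending; break ties by item ID for a stable result.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item_id for item_id, _ in candidates[:k]]


# Toy example with made-up scores.
scores = {10: 4.6, 11: 3.1, 12: 4.9, 13: 2.0, 14: 4.2}
rated = {12}
print(recommend_top_k(scores, rated, k=3))  # [10, 14, 11]
```

Item 12 has the highest predicted rating but is excluded because the customer has already rated it.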

# Deep Dive

The following flow diagram illustrates this blueprint's pipeline:

# Flow

The following list provides a high-level flow of this blueprint’s run:

1. In the S3 Connector, the user provides the data bucket name and the directory path to the CSV file.
2. In the Data Validation task, the user provides the path to the CSV file including the prefix just provided in the S3 Connector task.
3. The blueprint trains the model with the user-provided custom data to make customer preference predictions.
4. The user uses the newly deployed endpoint to make predictions with new data using the newly trained model.

# Arguments/Artifacts

For more information on this blueprint's tasks, its inputs, and outputs, click here.

# Data Validation Inputs

--input_path is the name and path of the file containing the ratings.
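The Instructions section gives the expected form of this path as /input/s3_connector/<prefix>/<csv file>. A minimal sketch of assembling such a path is shown below; the helper name and the example prefix and file name are hypothetical.

```python
import posixpath

# Root folder taken from the path format stated in the Instructions section.
S3_CONNECTOR_ROOT = "/input/s3_connector"


def build_input_path(prefix, csv_file):
    """Join the S3 prefix and CSV file name under the S3 Connector output folder."""
    # Strip stray slashes so the joined path has exactly one separator per level.
    return posixpath.join(S3_CONNECTOR_ROOT, prefix.strip("/"), csv_file.lstrip("/"))


# Hypothetical prefix and file name, for illustration only.
print(build_input_path("recommenders_data/", "ratings.csv"))
# /input/s3_connector/recommenders_data/ratings.csv
```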

# Data Validation Outputs

- --item_dict1 is the item mapping file containing the real and converted item IDs.
- --user_dct1 is the user mapping CSV file containing the real and converted user IDs.
- --ratings_translated is the file containing the converted user IDs and item IDs alongside the ratings to be used.
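The relationship between the two mapping files and the translated ratings can be sketched as follows. The conversion scheme shown here (consecutive integers starting at 0, assigned in order of first appearance) and the (user, item, rating) row layout are assumptions for illustration; the source does not specify how the blueprint converts IDs.

```python
def build_id_map(real_ids):
    """Assign consecutive integers (from 0) to real IDs in order of first appearance."""
    id_map = {}
    for real_id in real_ids:
        if real_id not in id_map:
            id_map[real_id] = len(id_map)
    return id_map


def translate_ratings(rows):
    """Convert (user, item, rating) rows to converted IDs and return both mappings."""
    user_map = build_id_map(user for user, _, _ in rows)
    item_map = build_id_map(item for _, item, _ in rows)
    translated = [(user_map[u], item_map[i], r) for u, i, r in rows]
    return user_map, item_map, translated


# Made-up ratings: (real user ID, real item ID, rating).
rows = [("alice", "B00X", 5.0), ("bob", "A17", 3.0), ("alice", "A17", 4.0)]
user_map, item_map, translated = translate_ratings(rows)
print(user_map)    # {'alice': 0, 'bob': 1}
print(item_map)    # {'B00X': 0, 'A17': 1}
print(translated)  # [(0, 0, 5.0), (1, 1, 3.0), (0, 1, 4.0)]
```

Keeping both mappings lets converted IDs in the model's output be translated back to the real IDs.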

# Train/Test/Split Inputs

--filename is the name and path of the file containing the ratings and user IDs/item IDs in the correct format.

# Train/Test/Split Outputs

- --train_whole is the training dataset CSV file, around 75% of the data.
- --test_whole is the test dataset CSV file, around 25% of the data.

NOTE

Both output dataset files have the same format, but they are disjoint: no rating in one file appears in the other.
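A disjoint split of roughly 75% and 25% can be sketched as below. This assumes a simple seeded random shuffle over all rating rows; the source does not state whether the blueprint splits randomly, per user, or by another rule, and the function name is hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows and split them into disjoint train and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows form the test set; the rest form the training set.
    return shuffled[n_test:], shuffled[:n_test]


rows = [(user, item, 3.0) for user in range(4) for item in range(5)]  # 20 made-up ratings
train, test = train_test_split(rows)
print(len(train), len(test))  # 15 5
```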

# Modes of Operation

This blueprint requires customer ratings data on a list of items to train the model and predict recommendations based on the scores the model generates. The model cannot extrapolate its recommendations to unknown customers (this functionality is a potential future advancement).

The Recommender Training blueprint can be used in two modes:

- Train the model using existing common datasets to provide customer recommendations
- Train the model using a completely new dataset and then use the results to provide recommendations

In both modes, the user can customize the hyperparameters. Advanced users can also select the model used for the final recommendation.

# Instructions

NOTE

The minimum resource recommendations to run this blueprint are 3.5 CPU and 8 GB RAM.

Complete the following steps to train this recommender model:

1. Click the Use Blueprint button. The cnvrg Blueprint Flow page displays.
2. In the flow, click the S3 Connector task to display its dialog.
   - Within the Parameters tab, provide the following Key-Value pair information:
     - Key: bucketname - Value: enter the data bucket name
     - Key: prefix - Value: provide the main path to the data file folder
   - Click the Advanced tab to change resources to run the blueprint, as required.
3. Return to the flow and click the Data Validation task to display its dialog.
   - Within the Parameters tab, provide the following Key-Value pair information:
     - Key: input_path - Value: provide the path to the ratings file including the S3 prefix
     - /input/s3_connector/<prefix>/<csv file> - ensure the CSV file path adheres to this format

     NOTE

     You can use the prebuilt example data paths already provided.
   - Click the Advanced tab to change resources to run the blueprint, as required.
4. Click the Run button.

The cnvrg software launches the training blueprint as a set of experiments, generating a trained recommender model and deploying it as a new API endpoint.

NOTE

The time required for model training and endpoint deployment depends on the size of the training data, the compute resources, and the training parameters.

For more information on cnvrg endpoint deployment capability, see cnvrg Serving.

1. Track the blueprint's real-time progress in its Experiments page, which displays artifacts such as logs, metrics, hyperparameters, and algorithms.
2. Click the Serving tab in the project and locate your endpoint.
3. Complete one or both of the following options:
   - Use the Try it Live section with a relevant user ID to check the model's predictions.
   - Use the bottom integration panel to integrate your API with your code by copying in your code snippet.

A custom model and an API endpoint, which can recommend similar items to customers according to their behavior, have now been trained and deployed. For information on this blueprint's software version and release details, click here.

# Connected Libraries

Refer to the following libraries connected to this blueprint:

Refer to the following blueprints related to this training blueprint:
