# Configure Local Cache For Datasets

A local NFS cache saves time when working with large datasets pulled from external object storage: the data is cached on an NFS server close to the Kubernetes cluster.

There are two options for configuring local dataset caching in the cnvrg platform. The first is a direct connection between the platform and the NFS backend. The second is a Kubernetes persistentVolume and persistentVolumeClaim that point to the NFS backend.

# Requirements

  • kubectl and access to the Kubernetes cluster
  • NFS server and an export, or an NFS service of a cloud provider
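The requirements above can be checked before starting. The following is a minimal sketch, not part of the cnvrg tooling: the function names are hypothetical, and the cnvrg namespace is an assumption taken from the commands used later in this guide. The permission checks on persistentVolumes and persistentVolumeClaims matter only for Option 2.

```shell
# Hypothetical pre-flight helpers (illustrative names, not part of cnvrg).

# Return 0 if every named command is on PATH; report the missing ones otherwise.
require_commands() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Return 0 if the cluster is reachable and the current user may create the
# objects used in Option 2. The namespace defaults to "cnvrg" (assumption).
check_cluster_access() {
  local namespace="${1:-cnvrg}"
  kubectl cluster-info >/dev/null || return 1
  kubectl auth can-i create persistentvolumes >/dev/null || return 1
  kubectl -n "$namespace" auth can-i create persistentvolumeclaims >/dev/null || return 1
}

# Example usage:
#   require_commands kubectl && check_cluster_access cnvrg && echo "ready"
```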

# Option 1, Creating a direct connection to NFS using Rails

Configure the NFS cache on the cnvrg platform by opening a shell in the app pod with kubectl exec and then running the rails nfs:create task.

kubectl -n cnvrg exec -it deploy/app -c cnvrg-app -- bash

Using the Rails utility, configure a new NFS cache providing the following parameters:

  • Type "a" to attach the NFS to all organizations, or "s" to attach it to a single organization
  • Type "n" to select NFS machine
  • NFS IP: the IP address of the NFS server
  • NFS title: the title displayed in the UI
  • NFS host path: the NFS export path
  • NFS capacity in TB: total space as a whole number of terabytes, for example 1, 10, or 100 (not 1TB or 10TB)
  • Resource name associated with this NFS: the cluster name, which can be found on the Compute page

$ rails nfs:create
** Invoke nfs:create (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute nfs:create

Create the NFS for (A)ll organizations, or for a (S)ingle organization?
a
(N)FS Machine or Kubernetes (P)VC?
n
Enter the NFS ip:
192.168.20.49
Enter the NFS title:
NFS-Cache
Enter the NFS host path:
/mnt/nfs_share
Enter the NFS total space (in TB):
1
Enter the resource (cluster) name associated with this NFS:
default

Expected output:

Successfully attached NFS 'NFS-Cache' to 'MyOrg' organization!
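Because the capacity prompt expects a bare integer, a small pre-check can catch a mistyped answer before the interactive task is run. The sketch below is a hypothetical helper, not something cnvrg provides. The rule that the host path must be absolute is an assumption based on the /mnt/nfs_share example.

```shell
# Hypothetical helper: sanity-check two answers before running rails nfs:create.
#   $1 = NFS host path (assumed to be an absolute path, as in /mnt/nfs_share)
#   $2 = capacity in TB (digits only, no unit suffix)
validate_nfs_answers() {
  local host_path="$1" capacity_tb="$2"

  case "$host_path" in
    /*) ;;  # absolute path, accepted
    *)
      echo "host path must be absolute, got: $host_path" >&2
      return 1
      ;;
  esac

  case "$capacity_tb" in
    '' | *[!0-9]*)
      # Empty, or contains a non-digit such as the "TB" in "1TB".
      echo "capacity must be a whole number of TB, got: $capacity_tb" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   validate_nfs_answers /mnt/nfs_share 1 && echo "answers look valid"
```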

# Option 2, Working with Kubernetes persistentVolume and a persistentVolumeClaim

  1. The following is a minimal YAML file that defines the persistentVolume. Save it locally as cnvrg-project-dataset-cache-pv.yaml, changing the following parameters:
  • metadata.name = persistentVolume name
  • spec.capacity.storage = storage size of the NFS share, with a unit suffix (Ki, Mi, Gi, Ti, Pi, or Ei)
  • spec.claimRef.name = the name of the persistentVolumeClaim, for example cnvrg-project-dataset-cache-pvc
  • spec.nfs.path = NFS share path
  • spec.nfs.server = NFS server IP

apiVersion: v1
kind: PersistentVolume
metadata:
  name: cnvrg-project-dataset-cache-pv
spec:
  capacity:
    storage: 1Ti
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  claimRef:
    namespace: cnvrg
    name: cnvrg-project-dataset-cache-pvc
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /mnt/nfs_share
    server: 192.168.20.49

Create the persistentVolume using kubectl:

kubectl create -f cnvrg-project-dataset-cache-pv.yaml

  2. Save the following persistentVolumeClaim YAML file locally as cnvrg-project-dataset-cache-pvc.yaml, changing the following parameters:
  • metadata.name = persistentVolumeClaim name
  • spec.volumeName = the persistentVolume name from the previous step

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cnvrg-project-dataset-cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Ti
  volumeName: cnvrg-project-dataset-cache-pv

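Create the persistentVolumeClaim using kubectl. Create it in the cnvrg namespace, which must match spec.claimRef.namespace in the persistentVolume:

kubectl -n cnvrg create -f cnvrg-project-dataset-cache-pvc.yaml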

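Once both objects are created, verify that the status of the persistentVolumeClaim is Bound by running kubectl get pvc in the cnvrg namespace (the namespace set in spec.claimRef.namespace of the persistentVolume):

kubectl -n cnvrg get pvc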

NAME                              STATUS   VOLUME                           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cnvrg-project-dataset-cache-pvc   Bound    cnvrg-project-dataset-cache-pv   10Gi       RWO            gp2            71m
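Binding can take a moment, so the check above may need repeating. The following is a minimal sketch of a polling loop, assuming the claim lives in the cnvrg namespace. The function name, attempt count, and delay are hypothetical choices, not cnvrg defaults.

```shell
# Hypothetical helper: poll a persistentVolumeClaim until its phase is Bound.
#   $1 = claim name
#   $2 = namespace (defaults to "cnvrg", an assumption)
#   $3 = number of attempts (default 30)
#   $4 = delay between attempts in seconds (default 2)
wait_for_pvc_bound() {
  local pvc="$1" namespace="${2:-cnvrg}" attempts="${3:-30}" delay="${4:-2}"
  local i=1 phase=""

  while [ "$i" -le "$attempts" ]; do
    # .status.phase is Pending, Bound, or Lost for a persistentVolumeClaim.
    phase="$(kubectl -n "$namespace" get pvc "$pvc" -o jsonpath='{.status.phase}' 2>/dev/null)"
    if [ "$phase" = "Bound" ]; then
      echo "PVC $pvc is Bound"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done

  echo "PVC $pvc is still '${phase:-unknown}' after $attempts attempts" >&2
  return 1
}

# Example usage:
#   wait_for_pvc_bound cnvrg-project-dataset-cache-pvc cnvrg
```

If the claim stays Pending, kubectl describe pvc on the claim lists the events that explain why it did not bind.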

# Creating the NFS cache using Rails

Configure the NFS cache on the cnvrg platform by opening a shell in the app pod with kubectl exec and then running the rails nfs:create task.

kubectl -n cnvrg exec -it deploy/app -c cnvrg-app -- bash

Using the Rails utility, configure a new NFS cache providing the following parameters:

  • Type "a" to attach the NFS to all organizations, or "s" to attach it to a single organization
  • Type "p" to select PVC
  • PVC claim name: the name of the persistentVolumeClaim created in the previous step
  • NFS title: the title displayed in the UI
  • NFS host path: the NFS export path
  • NFS capacity in TB: total space as a whole number of terabytes, for example 1, 10, or 100 (not 1TB or 10TB)
  • Resource name associated with this NFS: the cluster name, which can be found on the Compute page

$ rails nfs:create
** Invoke nfs:create (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute nfs:create

Create the NFS for (A)ll organizations, or for a (S)ingle organization?
a
(N)FS Machine or Kubernetes (P)VC?
p
Enter the PVC claim name:
cnvrg-project-dataset-cache-pvc
Enter the NFS title:
NFS-Cache
Enter the NFS host path:
/mnt/nfs_share
Enter the NFS total space (in TB):
1
Enter the resource (cluster) name associated with this NFS:
default

Expected output:

Successfully attached NFS 'NFS-Cache' to 'MyOrg' organization!
Last Updated: 3/7/2022, 6:12:29 PM