# Configure Local Cache For Datasets
A local NFS cache can save time when working with large datasets pulled from external object storage, by keeping a copy of the data on an NFS server close to the Kubernetes cluster.
There are two ways to configure local caching for datasets in the cnvrg platform. The first is to configure a direct connection between the platform and the NFS backend. The second is to configure a Kubernetes `persistentVolume` and `persistentVolumeClaim` that point to the NFS backend.
# Requirements
- kubectl and access to the Kubernetes cluster
- NFS server and an export, or an NFS service of a cloud provider
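A quick preflight check for the requirements above might look like the following sketch. The helper name `require_cmd` is hypothetical, and the commented usage lines assume that `showmount` (from the NFS client utilities) is installed on your workstation and that the NFS server uses the example IP shown later on this page.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given command is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required command: $1" >&2
  return 1
}

# Example usage (not run automatically):
#   require_cmd kubectl && kubectl auth can-i create persistentvolumes
#   require_cmd showmount && showmount -e 192.168.20.49
```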
# Option 1, Creating a direct connection to NFS using Rails
Configure the NFS cache on the cnvrg platform by accessing the `app` pod using `kubectl exec` and using the `rails` sub-command.
kubectl -n cnvrg exec -it deploy/app -c cnvrg-app -- bash
Using the `rails` utility, configure a new NFS cache, providing the following parameters:
- Type "a" to attach the NFS to all organizations, or "s" for a specific organization out of many
- Type "n" to select NFS machine
- NFS IP: the NFS server IP
- NFS title: the title that will be displayed in the UI
- NFS host path: the NFS export path
- NFS capacity in TB: total space in TB, as a bare integer, for example 1, 10, or 100 (not 1TB or 10TB)
- Resource name associated with this NFS (i.e., the cluster name, which can be found on the compute page)
$ rails nfs:create
** Invoke nfs:create (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute nfs:create
Create the NFS for (A)ll organizations, or for a (S)ingle organization?
a
(N)FS Machine or Kubernetes (P)VC?
n
Enter the NFS ip:
192.168.20.49
Enter the NFS title:
NFS-Cache
Enter the NFS host path:
/mnt/nfs_share
Enter the NFS total space (in TB):
1
Enter the resource (cluster) name associated with this NFS:
default
Expected output:
Successfully attached NFS 'NFS-Cache' to 'MyOrg' organization!
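Because the capacity prompt expects a bare integer number of terabytes, the value can be checked before it is typed in. The function below is a minimal sketch; `is_valid_tb` is a hypothetical name and is not part of the cnvrg tooling, and rejecting zero is an assumption rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical helper: accept only a positive whole number (e.g. 1, 10, 100),
# rejecting values with a unit suffix such as 1TB.
is_valid_tb() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit character
    0*)          return 1 ;;  # zero or a leading zero (assumed invalid)
    *)           return 0 ;;
  esac
}

# Example usage:
#   is_valid_tb 10  && echo "ok"
#   is_valid_tb 1TB || echo "use a bare integer"
```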
# Option 2, Working with Kubernetes persistentVolume and a persistentVolumeClaim
- The following is a minimal YAML file that defines the `persistentVolume`. Save the YAML to your computer as `cnvrg-project-dataset-cache-pv.yaml`, changing the following parameters:
  - metadata.name = `persistentVolume` name
  - spec.capacity.storage = Storage size of the NFS share (Ei, Pi, Ti, Gi, Mi, Ki)
  - spec.claimRef.name = The name of the `persistentVolumeClaim`, for example `project-x-dataset-cache-pvc`
  - spec.nfs.path = NFS share path
  - spec.nfs.server = NFS server IP
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cnvrg-project-dataset-cache-pv
spec:
  capacity:
    storage: 1Ti
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  claimRef:
    namespace: cnvrg
    name: cnvrg-project-dataset-cache-pvc
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /mnt/nfs_share
    server: 192.168.20.49
Create the `persistentVolume` using `kubectl`:
kubectl create -f cnvrg-project-dataset-cache-pv.yaml
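To avoid hand-editing the manifest, the `persistentVolume` parameters can be substituted from shell arguments. This is a sketch only: the function name `render_pv` and its argument order are hypothetical, and the remaining field values mirror the example manifest on this page.

```shell
#!/bin/sh
# Hypothetical helper: print a PersistentVolume manifest for an NFS export.
# Arguments: PV name, capacity (e.g. 1Ti), PVC name, NFS path, NFS server IP.
render_pv() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $1
spec:
  capacity:
    storage: $2
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  claimRef:
    namespace: cnvrg
    name: $3
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: $4
    server: $5
EOF
}

# Example usage:
#   render_pv cnvrg-project-dataset-cache-pv 1Ti \
#     cnvrg-project-dataset-cache-pvc /mnt/nfs_share 192.168.20.49 \
#     > cnvrg-project-dataset-cache-pv.yaml
```

Writing the output to a file first, rather than piping it straight into `kubectl`, leaves a copy of the manifest to review before it is applied.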
- Save the following `persistentVolumeClaim` YAML file locally as `cnvrg-project-dataset-cache-pvc.yaml`, changing the following parameters:
  - metadata.name = `persistentVolumeClaim` name
  - spec.volumeName = the `persistentVolume` name from the previous step
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cnvrg-project-dataset-cache-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Ti
  volumeName: cnvrg-project-dataset-cache-pv
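Create the `persistentVolumeClaim` using `kubectl`, in the namespace referenced by `spec.claimRef.namespace` of the `persistentVolume` (`cnvrg` in this example):
kubectl -n cnvrg create -f cnvrg-project-dataset-cache-pvc.yaml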
Once both objects are created, verify that the status of the `persistentVolumeClaim` is `Bound` using the `kubectl get pvc` command:
kubectl -n cnvrg get pvc
NAME                              STATUS   VOLUME                           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cnvrg-project-dataset-cache-pvc   Bound    cnvrg-project-dataset-cache-pv   10Gi       RWO            gp2            71m
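Binding may take a moment, so the status check can be repeated until the claim reports `Bound`. The sketch below polls the claim phase with `kubectl get pvc -o jsonpath`. The names `wait_for_bound`, `KUBECTL`, and `SLEEP_SECS` are hypothetical, and the `cnvrg` namespace in the usage line is taken from the example manifest.

```shell
#!/bin/sh
# Hypothetical helper: poll a PVC until its phase is "Bound".
# Arguments: PVC name, namespace, maximum number of attempts (default 30).
# KUBECTL can be overridden (for example in tests); SLEEP_SECS sets the delay.
wait_for_bound() {
  pvc="$1"
  ns="$2"
  tries="${3:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    phase="$("${KUBECTL:-kubectl}" -n "$ns" get pvc "$pvc" \
      -o jsonpath='{.status.phase}' 2>/dev/null)"
    if [ "$phase" = "Bound" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "${SLEEP_SECS:-2}"
  done
  echo "PVC $pvc in namespace $ns is not Bound (last phase: ${phase:-unknown})" >&2
  return 1
}

# Example usage:
#   wait_for_bound cnvrg-project-dataset-cache-pvc cnvrg
```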
# Creating the NFS cache using Rails
Configure the NFS cache on the cnvrg platform by accessing the `app` pod using `kubectl exec` and using the `rails` sub-command.
kubectl -n cnvrg exec -it deploy/app -c cnvrg-app -- bash
Using the `rails` utility, configure a new NFS cache, providing the following parameters:
- Type "a" to attach the NFS to all organizations, or "s" for a specific organization out of many
- Type "p" to select PVC
- Type the PVC claim name
- NFS title: the title that will be displayed in the UI
- NFS host path: the NFS export path
- NFS capacity in TB: total space in TB, as a bare integer, for example 1, 10, or 100 (not 1TB or 10TB)
- Resource name associated with this NFS (i.e., the cluster name, which can be found on the compute page)
$ rails nfs:create
** Invoke nfs:create (first_time)
** Invoke environment (first_time)
** Execute environment
** Execute nfs:create
Create the NFS for (A)ll organizations, or for a (S)ingle organization?
a
(N)FS Machine or Kubernetes (P)VC?
p
Enter the PVC claim name:
cnvrg-project-dataset-cache-pvc
Enter the NFS title:
NFS-Cache
Enter the NFS host path:
/mnt/nfs_share
Enter the NFS total space (in TB):
1
Enter the resource (cluster) name associated with this NFS:
default
Expected output:
Successfully attached NFS 'NFS-Cache' to 'MyOrg' organization!