TensorRT + Gradient

How to use Gradient and TensorRT together

Using TensorRT to deploy models on Gradient.

TensorRT is an optimization library (SDK) and deep learning inference runtime built on CUDA. NVIDIA developed TensorRT for GPU-enabled production environments that require high throughput and low latency.
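As a quick illustration of the optimization step itself, the trtexec tool that ships with TensorRT converts a trained model (for example, in ONNX format) into a serialized, hardware-optimized engine. Below is a minimal sketch; the filenames model.onnx and model.engine are placeholders, not part of the Gradient workflow above:

# Convert an ONNX model into a serialized TensorRT engine,
# enabling FP16 precision for extra throughput on supported GPUs.
trtexec --onnx=model.onnx \
        --saveEngine=model.engine \
        --fp16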

Gradient includes a regularly updated, pre-built TensorRT image out of the box. Alternatively, customers can use a customized version of TensorRT by supplying their own Docker image hosted on a public or private Docker registry.
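For the custom-image route, one common approach (a sketch, not the only layout) is to extend an NVIDIA TensorRT base image, add your serving code, and push the result to a registry that Gradient can pull from. The registry name, image tag, base-image version, and serve.py entrypoint below are all assumptions:

# Build a custom TensorRT image from NVIDIA's public base image.
# The tag 23.08-py3 is an example; choose one compatible with your driver.
cat > Dockerfile <<'EOF'
FROM nvcr.io/nvidia/tensorrt:23.08-py3
# serve.py is a hypothetical serving entrypoint; replace with your own
COPY serve.py /app/serve.py
CMD ["python3", "/app/serve.py"]
EOF

# Push to a registry Gradient can reach (name is a placeholder)
docker build -t my-registry/my-tensorrt:latest .
docker push my-registry/my-tensorrt:latest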

Deploying models with TensorRT

When creating a Deployment, you can select the pre-built image or bring your own custom image. You can do this via the web UI, via the CLI (as shown below), or as a step within an automated pipeline.

Selecting the pre-built TensorRT image when creating a Deployment

When using the CLI, the command looks something like this:

gradient deployments create \
  --name "my deployment" \
  --deploymentType TensorRT \
  --imageUrl paperspace/tensorrt \
  --machineType P5000 \
  --instanceCount 2
  ...
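Once created, you can confirm the Deployment is up from the same CLI. The list subcommand exists in the Gradient CLI, though its output and flags may vary by version:

# Verify the new TensorRT deployment appears and is running
gradient deployments list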

Learn more about the Gradient TensorRT integration in the docs.