NVIDIA TensorRT 8 Released Today: High Performance Deep Neural Network Inference

NVIDIA TensorRT enables high-performance inference of neural networks trained in TensorFlow and PyTorch. Today NVIDIA released version 8 of this framework. The new version adds sparsity support, an optimization that prunes weak connections that contribute little to the network's output (see the sketch after the chapter list below). TensorRT 8 also adds transformer optimizations that accelerate BERT-Large inference.

Getting Started with TensorRT 8: installation instructions for Windows, Linux, and cloud are in the installation guide, section 4.6 "Zip File Installation" (#installing-zip).

Video chapters:
* 0:44 Quantized Network (QAT)
* 0:55 Sparsity
* 1:46 Setup TensorRT
* 3:10 Using the TensorRT 8 Jupyter Notebook
* 6:21 Query BERT
* 6:54 Ask BERT your Own Question
* 8:00 BERT Weaknesses
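To illustrate how the new sparsity optimization is turned on, here is a minimal sketch of building a TensorRT 8 engine from an ONNX model with the sparse-weights flag enabled. The file names ("bert.onnx", "bert.engine") are placeholders, not files from the video, and the notebook in the video may build its engine differently.

```python
# Minimal sketch: build a TensorRT 8 engine with sparsity enabled.
# Assumes the TensorRT 8 Python bindings are installed and an ONNX
# model exists at "bert.onnx" (illustrative path, not from the video).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# TensorRT 8 networks use explicit batch dimensions.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("bert.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)            # half-precision kernels
config.set_flag(trt.BuilderFlag.SPARSE_WEIGHTS)  # new in TensorRT 8: use structured sparsity

# Serialize the optimized engine to disk for later inference.
engine_bytes = builder.build_serialized_network(network, config)
with open("bert.engine", "wb") as f:
    f.write(engine_bytes)
```

The SPARSE_WEIGHTS flag only pays off on hardware and weights that support structured sparsity (for example, networks pruned with quantization-aware or sparsity-aware training as discussed in the video); otherwise TensorRT falls back to dense kernels.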