NVIDIA announces TensorRT-LLM for Windows that boosts LLMs by up to 4 times with RTX GPUs

NVIDIA is already the king of generative AI in terms of hardware. Its GPUs power the data centers used by Microsoft, OpenAI, and others to run AI services like Bing Chat, ChatGPT, and more. Today, NVIDIA announced a new software tool designed to boost the performance of large language models (LLMs) on local Windows PCs.

In a blog post, NVIDIA announced that its open-source TensorRT-LLM library, which was previously released for data centers, is now available for Windows PCs. The headline feature is that TensorRT-LLM allows LLMs to run up to four times faster on Windows PCs equipped with NVIDIA GeForce RTX GPUs.

NVIDIA describes the benefits of TensorRT-LLM for both developers and end users in the post:

At higher batch sizes, this acceleration significantly improves the experience for more sophisticated LLM use — like writing and coding assistants that output multiple, unique auto-complete results at once. The result is accelerated performance and improved quality that lets users select the best of the bunch.

The blog post included an example of TensorRT-LLM at work. When the standard Llama 2 LLM was asked, "How does NVIDIA ACE generate emotional responses?", it failed to offer an accurate response.

However, when the LLM was paired with a vector library or vector database and then asked the same question, it generated an accurate answer, and the TensorRT-LLM library also delivered that response faster. TensorRT-LLM should be available soon on NVIDIA's developer site.
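For readers curious about what pairing an LLM with a vector library looks like in practice, here is a minimal Python sketch of the retrieval step. It is not NVIDIA's implementation and does not use TensorRT-LLM. The `embed`, `cosine`, `retrieve`, and `build_prompt` helpers, the word-count "embedding", and the placeholder documents are all assumptions made for illustration. A real setup would use a trained embedding model and an actual vector database.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: word counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Prepend the retrieved text so the LLM can answer from it."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hypothetical placeholder documents, not real NVIDIA text.
DOCS = [
    "Placeholder note about emotional responses in NVIDIA ACE.",
    "Placeholder note about driver updates for GeForce GPUs.",
]

if __name__ == "__main__":
    print(build_prompt("How does NVIDIA ACE generate emotional responses?", DOCS))
```

The resulting prompt would then be passed to the LLM, which can draw on the retrieved text rather than relying only on what it learned during training.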

NVIDIA also added some AI-based features in today's new GeForce driver update. These include version 1.5 of its RTX Video Super Resolution feature, which offers better upscaling and fewer compression artifacts when viewing online videos. The update also adds TensorRT AI acceleration for Stable Diffusion Web UI, allowing people with GeForce RTX GPUs to get images from the AI art creator faster than before.
