NVIDIA L40 GPU Accelerator

Powered by the Ada Lovelace architecture, the NVIDIA L40 GPU Accelerator is ideal for data center workloads focused on neural graphics, virtualization, compute, and AI. It delivers next-generation graphics performance by harnessing 3rd Gen RT Cores, 4th Gen Tensor Cores, and Ada generation CUDA cores. With 48GB of GDDR6 memory, the L40 can also make quick work of memory-intensive applications such as data science, simulation, 3D modeling, and rendering.

Overview

NVIDIA L40 GPU Accelerator - System Overview

Performance

Built on NVIDIA’s Ada Lovelace architecture, the L40 offers 18,176 CUDA cores, compared with 10,752 CUDA cores on the previous-generation, Ampere-based NVIDIA A40. The GPU uses a PCIe Gen4 x16 interface with a bi-directional transfer speed of 64GB/s, which is double the throughput of PCIe Gen3. Its 142 3rd Generation Ray Tracing (RT) Cores deliver 2x to 3x the ray-tracing speed of previous-generation cards. Data science and AI model training benefit from 568 4th Generation Tensor Cores, which provide up to 90.5 TF32 TFLOPS.
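The PCIe comparison above can be checked with a little arithmetic. The following is a minimal Python sketch, assuming the per-lane signalling rates (8 GT/s for Gen3, 16 GT/s for Gen4) and the 128b/130b line encoding from the PCIe specifications; the function name is illustrative, and protocol overhead is ignored.

```python
# Per-lane raw signalling rate in GT/s (assumed from the PCIe specifications).
LANE_RATE_GT_S = {3: 8.0, 4: 16.0}
# Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(gen: int, lanes: int = 16, bidirectional: bool = True) -> float:
    """Approximate PCIe bandwidth in GB/s after line encoding, ignoring protocol overhead."""
    per_lane_gb_s = LANE_RATE_GT_S[gen] * ENCODING_EFFICIENCY / 8  # bits -> bytes
    total = per_lane_gb_s * lanes
    return total * 2 if bidirectional else total


gen3 = pcie_bandwidth_gb_s(3)
gen4 = pcie_bandwidth_gb_s(4)
print(f"Gen3 x16: {gen3:.1f} GB/s, Gen4 x16: {gen4:.1f} GB/s, ratio: {gen4 / gen3:.1f}x")
```

The commonly quoted 64GB/s is the rounded raw figure (16 GT/s x 16 lanes x 2 directions / 8 bits per byte); after encoding, the result is slightly lower, and the 2x ratio over Gen3 holds either way.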

Memory

The L40 GPU provides 48GB of GDDR6 memory with Error Correction Code (ECC) and a memory bandwidth of 864GB/s. With NVIDIA RTX Virtual Workstation (RTX vWS) virtual GPU (vGPU) software, that memory can be allocated to multiple users across different teams, such as creative, data science, and design.
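As a rough illustration of how the 48GB might be shared, the Python sketch below counts how many equal-sized allocations fit on one card. The per-user sizes and the `users_per_card` helper are hypothetical; the profile sizes actually available, and whether sizes can be mixed on one GPU, depend on the vGPU software release.

```python
TOTAL_FRAMEBUFFER_GB = 48  # L40 memory capacity stated above


def users_per_card(profile_gb: int, total_gb: int = TOTAL_FRAMEBUFFER_GB) -> int:
    """Number of equal-sized vGPU allocations that fit in the card's memory."""
    if profile_gb <= 0:
        raise ValueError("profile size must be positive")
    return total_gb // profile_gb


# Hypothetical per-user allocations, chosen only for illustration.
for team, profile_gb in [("design", 24), ("data science", 12), ("creative", 8)]:
    print(f"{team}: {profile_gb} GB per user -> {users_per_card(profile_gb)} users per card")
```

Smaller allocations raise user density per card at the cost of memory headroom for each user.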

Cooling and Power

The card is rated for a maximum power consumption of 300W and uses a 16-pin power connector. The NVIDIA L40 GPU features bi-directional passive cooling and depends on the cooling provided by the host server chassis to maintain operating temperatures.

Summary

Offering impressive performance for simulation, ray tracing, AI modeling, and virtual production, the NVIDIA L40 GPU Accelerator is a great general-purpose GPU, ideal for demanding data center workloads. The card provides 48GB of GDDR6 memory, 142 3rd Generation RT Cores, 18,176 CUDA cores, and 568 4th Generation Tensor Cores.

Specifications

NVIDIA L40 GPU Accelerator - Specifications

Memory

Memory Size: 48GB GDDR6 with ECC
Memory Bandwidth: Up to 864GB/s
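The bandwidth figure follows from the usual peak-bandwidth formula: bus width times per-pin data rate, divided by 8 bits per byte. The Python sketch below assumes a 384-bit memory bus and an 18 Gbit/s per-pin data rate; neither value is stated on this page, so treat them as assumptions.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s = bus width (bits) * per-pin data rate (Gbit/s) / 8."""
    return bus_width_bits * data_rate_gbps / 8


# Assumed values, not stated on this page: 384-bit bus, 18 Gbit/s per pin.
print(memory_bandwidth_gb_s(384, 18.0))  # 864.0
```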

Cores

NVIDIA Ada Lovelace CUDA Cores: 18,176
NVIDIA 3rd Generation RT Cores: 142
NVIDIA 4th Generation Tensor Cores: 568

Performance

FP32 performance:

90.5 TFLOPS

TF32 Tensor Core TFLOPS performance:

90.5 TFLOPS

TF32 Tensor Core TFLOPS performance with Sparsity:

181 TFLOPS

BFLOAT16 Tensor Core TFLOPS performance:

181.05 TFLOPS

BFLOAT16 Tensor Core TFLOPS performance with Sparsity:

362 TFLOPS

FP16 Tensor Core performance:

181.05 TFLOPS

FP16 Tensor Core performance with Sparsity:

362 TFLOPS

FP8 Tensor Core performance:

362 TFLOPS

FP8 Tensor Core performance with Sparsity:

724 TFLOPS
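The Tensor Core figures above follow a simple pattern: each halving of precision width doubles throughput, and sparsity doubles it again. The Python sketch below reproduces the list from the dense TF32 figure; the scale factors are inferred from the numbers on this page rather than taken from an official formula, and the results are rounded the same way the sparse entries are (181 rather than 181.05).

```python
DENSE_TF32_TFLOPS = 90.5  # dense TF32 Tensor Core figure from the list above

# Throughput multiplier relative to dense TF32, as implied by the listed figures.
PRECISION_SCALE = {"TF32": 1, "BFLOAT16": 2, "FP16": 2, "FP8": 4}
SPARSITY_SCALE = 2  # the "with Sparsity" entries are twice the dense ones


def tensor_tflops(precision: str, sparse: bool = False) -> float:
    """Estimated Tensor Core TFLOPS for a given precision, dense or sparse."""
    dense = DENSE_TF32_TFLOPS * PRECISION_SCALE[precision]
    return dense * SPARSITY_SCALE if sparse else dense


for precision in PRECISION_SCALE:
    dense = tensor_tflops(precision)
    sparse = tensor_tflops(precision, sparse=True)
    print(f"{precision}: {dense:g} TFLOPS dense, {sparse:g} TFLOPS with sparsity")
```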

Card Interface

PCIe Gen4 x16
64GB/s bi-directional

Display Connectors

4 x DisplayPort 1.4a

Power Consumption

300W

Form Factor

Board Height: 4.4"
Board Length: 10.5"
Dual Slot

Documentation

NVIDIA L40 GPU Accelerator - Documentation

NVIDIA L40 GPU Accelerator Data Sheet
NVIDIA L40 GPU Accelerator Product Brief