NVIDIA Tesla V100 SXM2 GPU - System Overview

Description

The Tesla V100 SXM2 is offered with 16GB or 32GB of memory and is the flagship of NVIDIA's Tesla data center computing platform for deep learning, HPC, and graphics. With 5,120 CUDA cores and 640 Tensor Cores, a single server equipped with V100 GPUs can replace hundreds of commodity CPU servers for HPC and deep learning workloads.

Performance

Powered by the NVIDIA Volta architecture with 21.1 billion transistors, the NVIDIA V100 SXM2 GPU accelerator delivers the performance of up to 100 CPUs in a single GPU for massively parallel workloads, letting data scientists, researchers, and engineers tackle challenges that were once thought impossible. The module provides 5,120 NVIDIA CUDA cores, 320 texture mapping units, and 128 ROPs. It also carries 640 Tensor Cores, delivering 125 TFLOPS of deep learning performance. That translates to 12x the Tensor FLOPS for deep learning training and 6x the TFLOPS for deep learning inference compared to the previous NVIDIA Pascal generation.
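The quoted throughput figures follow directly from unit counts and clock speed. A minimal sketch of the arithmetic, assuming the V100 SXM2's 1,530 MHz boost clock and its 2,560 FP64 units (neither stated above):

```python
# Theoretical peak throughput for the V100 SXM2, derived from unit
# counts and the boost clock (1530 MHz is an assumed spec, not from
# the text above).
BOOST_CLOCK_HZ = 1530e6

def peak_tflops(units, flops_per_unit_per_clock, clock_hz=BOOST_CLOCK_HZ):
    """Peak throughput in TFLOPS: units * FLOPs per clock * clock rate."""
    return units * flops_per_unit_per_clock * clock_hz / 1e12

fp32   = peak_tflops(5120, 2)    # each CUDA core: one FMA = 2 FLOPs per clock
fp64   = peak_tflops(2560, 2)    # half as many FP64 units -> the 1:2 ratio
fp16   = peak_tflops(5120, 4)    # packed half precision doubles the FP32 rate
tensor = peak_tflops(640, 128)   # each Tensor Core: 64 FMAs = 128 FLOPs per clock

print(f"FP32:   {fp32:.2f} TFLOPS")    # ~15.67
print(f"FP64:   {fp64:.3f} TFLOPS")    # ~7.834
print(f"FP16:   {fp16:.2f} TFLOPS")    # ~31.33
print(f"Tensor: {tensor:.0f} TFLOPS")  # ~125
```

The results line up with the figures in the specification table below.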

Memory

The Tesla V100 launched with 16GB of memory; a 32GB configuration was later introduced, doubling capacity. With the Volta architecture, the Tesla V100 provides 1.5x the memory bandwidth of Pascal GPUs, combining 900GB/s of raw bandwidth with DRAM utilization of up to 95%. Just like the PCIe version, the 32GB of ECC-protected memory is connected over a 4096-bit interface. The memory runs at 877 MHz, while the GPU operates at a base clock of 1290 MHz.
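The headline bandwidth figure follows from the interface width and the effective data rate. A quick check of the arithmetic, treating HBM2 as double data rate at the 877 MHz memory clock:

```python
# Peak memory bandwidth = interface width (bytes) * effective data rate.
BUS_WIDTH_BITS  = 4096
MEMORY_CLOCK_HZ = 877e6
DDR_MULTIPLIER  = 2  # HBM2 transfers data on both clock edges

bandwidth_gbs = (BUS_WIDTH_BITS / 8) * MEMORY_CLOCK_HZ * DDR_MULTIPLIER / 1e9
print(f"Peak bandwidth: {bandwidth_gbs:.0f} GB/s")  # ~898, marketed as 900 GB/s
```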

Features

Compared to the previous generation, NVLink in the Tesla V100 provides 2x greater throughput. To maximize application throughput on a single server, up to eight V100 accelerators can be interconnected at up to 300GB/s. The passively cooled SXM2 V100 is built on a 12nm manufacturing process and delivers double-precision, single-precision, and Tensor Core performance, with a 300W power budget per module. The device has no display connectors, since it is not designed to drive monitors, and uses a completely different form factor than a PCIe-based GPU.
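The 300GB/s figure is the aggregate across all NVLink connections on one GPU. A sketch of how it decomposes, assuming the second-generation NVLink layout of six links per V100 at 25 GB/s in each direction (link count and per-link rate are not stated above):

```python
# Aggregate NVLink bandwidth per V100 GPU (assumed NVLink 2.0 layout).
LINKS_PER_GPU = 6
GBS_PER_LINK_PER_DIRECTION = 25

per_direction = LINKS_PER_GPU * GBS_PER_LINK_PER_DIRECTION  # 150 GB/s each way
bidirectional = per_direction * 2                           # 300 GB/s aggregate
print(f"NVLink aggregate: {bidirectional} GB/s bidirectional")
```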

Summary

The SXM2 version of the NVIDIA Tesla V100 graphics processing unit is the leading module for accelerated computing platforms and drives some of the world's largest data centers, while lowering the cost per unit of compute. It is ideal for scientists, researchers, and engineers looking to accelerate AI, HPC, and graphics with substantial performance gains.

NVIDIA Tesla V100 SXM2 GPU - Specifications

Memory

  • GPU Memory: 16 GB or 32 GB HBM2
  • Memory Interface: 4096-bit
  • Memory Bandwidth: Up to 897.0 GB/s

Cores

  • NVIDIA CUDA Cores: 5,120
  • NVIDIA Tensor Cores: 640

Performance

  • FP16 (half) Performance: 31.33 TFLOPS (2:1)
  • FP32 (float) Performance: 15.67 TFLOPS
  • FP64 (double) Performance: 7.834 TFLOPS (1:2)
  • Tensor Performance: 125 TFLOPS

System Interface

  • NVIDIA NVLink

Power Consumption

  • 300W

Engine

    Compute APIs

  • CUDA
  • DirectCompute
  • OpenCL
  • OpenACC
