GPUltima-CI

GPUltima-CI is a cutting-edge, flexible, high-power computing solution for mixed-workload data centers. That flexibility makes it valuable across HPC applications such as AI, deep learning, image processing and scientific modeling. GPUltima-CI makes the data center workload-centric: the hardware adapts to the needs of each application, rather than applications adapting to limited hardware.

GPUltima-CI features disaggregated composable infrastructure, which increases GPU accelerator utilization in mixed-workload data centers: GPU, storage and network resources left idle by one application are automatically released to resource-hungry applications on other server nodes, raising utilization across the rack.

GPUltima-CI is a power-optimized rack that can be configured with up to 32 dual Intel Xeon Scalable processor compute nodes, 64 network adapters, 48 NVIDIA® Volta™ GPUs, and 32 NVMe drives on a 128Gb PCIe switched fabric, supporting tens of thousands of composable server configurations per rack. Using one or many racks, the OSS solution can compose any combination of GPU, NIC and storage resources required in today’s mixed-workload data center.
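
As a rough sanity check on the configuration-count claim, the sketch below counts the distinct (GPU, NIC, NVMe) allocations a single node could draw from the rack's pooled devices. This is illustrative arithmetic only; it assumes a node may claim anywhere from zero up to the entire rack pool of each device type, and it ignores fabric topology and slot limits.

```python
# Illustrative count of per-node device combinations from the rack pools.
# Assumption: a node may be composed with 0..N of each pooled device type;
# real limits depend on fabric topology, so treat this as back-of-envelope.
GPUS, NICS, NVME = 48, 64, 32  # rack-level pools from the paragraph above

configs = (GPUS + 1) * (NICS + 1) * (NVME + 1)
print(f"{configs:,} distinct (GPU, NIC, NVMe) allocations")  # 105,105
```

Even this coarse count lands above the tens-of-thousands figure quoted above; distinguishing drive models or GPU form factors only grows the space.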

Architecture

Rack

  • 42U tall 1200mm traditional rack or Scale Matrix DDC

  • Also available in 24U, 44U and 48U tall versions

  • Supports OSS GPU Accelerators, NVMe Flash Storage Arrays, PCIe fabric switches and quad-node servers

Compute Accelerators

  • 3U SCA8000 8-way SXM2 V100 expansion with up to four 128Gb PCIe fabric connections

  • 4U EB3600 8-way PCIe V100 expansion with up to four 128Gb PCIe fabric connections

  • Half 4U EB3450 4-way PCIe V100 expansion with up to two 128Gb PCIe fabric connections

GPUs

SXM2 V100 with NVLink

  • 5,120 CUDA cores, 640 Tensor cores
  • 7.8 TFLOPS double-precision
  • 15.7 TFLOPS single-precision
  • 125 TFLOPS Tensor performance
  • 300GB/s bi-directional interconnect bandwidth
  • 16GB HBM2 memory
  • 300 watts

PCIe V100

  • 5,120 CUDA cores, 640 Tensor cores
  • 7 TFLOPS double-precision
  • 14 TFLOPS single-precision
  • 112 TFLOPS Tensor performance
  • 32GB/s bi-directional interconnect bandwidth
  • 16GB HBM2 memory
  • 250 watts
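
Since the rack holds up to 48 of these GPUs, the per-GPU figures above scale to rack-level totals with simple multiplication. The sketch below assumes a fully populated, all-SXM2 rack and perfect linear scaling (no interconnect or thermal losses), so the results are theoretical upper bounds.

```python
# Back-of-envelope aggregate compute for a fully populated rack.
# Assumes all 48 GPUs are SXM2 V100s and perfect linear scaling,
# so these are upper bounds, not measured throughput.
GPUS_PER_RACK = 48
FP64_TFLOPS, TENSOR_TFLOPS = 7.8, 125.0  # per SXM2 V100, from the list above

print(f"FP64:   {GPUS_PER_RACK * FP64_TFLOPS:,.1f} TFLOPS")           # 374.4
print(f"Tensor: {GPUS_PER_RACK * TENSOR_TFLOPS / 1000:,.1f} PFLOPS")  # 6.0
```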

Flash Storage Arrays

  • 2U FSAe-2 24-way U.2 NVMe JBOF with up to two 128Gb PCIe fabric connections

  • 4U 4UV 16-way PCIe NVMe JBOF with up to two 128Gb PCIe fabric connections

NVMe Drives

PCIe SN260

  • 6.4TB, 3 DWPD (drive writes per day)
  • PCIe 3.0 x8 (64Gb/s)
  • Max Read (128KB): 6.17GB/s
  • Max Write (128KB): 2.2GB/s
  • Random Read IOPS (4KB): 1,200,000
  • Random Write IOPS (4KB): 200,000
  • Write Latency (512B): 20µs

U.2 SN200

  • 6.4TB, 3 DWPD (drive writes per day)
  • PCIe 3.0 x4 (32Gb/s)
  • Max Read (128KB): 3.35GB/s
  • Max Write (128KB): 2.1GB/s
  • Random Read IOPS (4KB): 835,000
  • Random Write IOPS (4KB): 200,000
  • Write Latency (512B): 20µs
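
At the rack level, the 32 NVMe drives aggregate in the same way. The sketch below assumes an all-SN200 population and no fabric contention, so again these are upper bounds rather than measured numbers.

```python
# Illustrative aggregate NVMe capacity and throughput for 32 drives.
# Assumes all drives are U.2 SN200s and the PCIe fabric is not the
# bottleneck; real throughput depends on contention and access mix.
DRIVES = 32
CAPACITY_TB, READ_GBPS, READ_IOPS = 6.4, 3.35, 835_000  # per SN200

print(f"Capacity: {DRIVES * CAPACITY_TB:.1f} TB")        # 204.8 TB
print(f"Seq read: {DRIVES * READ_GBPS:.1f} GB/s")        # 107.2 GB/s
print(f"4KB IOPS: {DRIVES * READ_IOPS / 1e6:.1f} M")     # 26.7 M
```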

Servers

  • 2U, 4-node, dual Intel Xeon Scalable processor server. Each node contains:
    • Dual Socket P (LGA 3647) “Skylake” CPUs, up to 28 cores and 3.2GHz
    • Up to 2TB ECC DDR4-2666MHz
    • Two Gen 3 x16 PCIe expansion slots
    • Six 2.5” SATA3 SSDs
    • IPMI, dual USB 3.0 and Disk-on-Module support

InfiniBand Switch

  • Mellanox 36-port InfiniBand switch

  • EDR 100Gb/s, QSFP connectors

  • 1U form factor

Composable Fabric Switch

  • Liqid Grid managed switch array

  • Up to 8U, 96 ports

  • 128Gbps PCIe fabric per port

  • Fail-over and multi-topology support

  • 1Gb management port with Xeon D-1548 management CPU

Composable Infrastructure Management

  • Liqid Command Center Management Software
  • Bare-metal machine management via GUI, RESTful API or CLI
  • Cluster management (create, edit, delete)
  • Device management (GPU, NVMe, NIC and CPU)
  • Package management (OS, Configuration and Snapshot)
  • Advanced peer-to-peer GPU and Fabric DMA NIC support
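
As a rough illustration of driving composition programmatically, the sketch below attaches pooled GPUs to a compute node over a REST interface. The host, endpoint path and JSON fields are hypothetical placeholders, not Liqid's documented API; consult the Liqid Command Center API reference for the actual interface.

```python
# Hypothetical sketch of composing GPUs onto a node over a REST API.
# The host, endpoint and JSON fields below are illustrative placeholders;
# they are NOT Liqid's documented API.
import requests

FABRIC_MGR = "http://liqid-director.example.local:8080"  # placeholder host

def compose_gpu(node_id: str, gpu_count: int) -> None:
    """Request that `gpu_count` pooled GPUs be attached to `node_id`."""
    resp = requests.post(
        f"{FABRIC_MGR}/api/compose",  # placeholder endpoint
        json={"node": node_id, "device": "gpu", "count": gpu_count},
        timeout=30,
    )
    resp.raise_for_status()
    print(f"Node {node_id} now has {gpu_count} GPU(s) attached")

if __name__ == "__main__":
    compose_gpu("node-07", 4)  # e.g. give one node four V100s for training
```

The same call pattern would cover NVMe and NIC devices, with releasing a device back to the pool as the inverse operation.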

InfiniBand Interface Card

  • Mellanox ConnectX-5
  • EDR 100Gb/s, QSFP connectors
  • Single or dual port available
  • One card per server

Power Distribution Unit

  • Tripp Lite monitored PDU
  • 27.6kW power per PDU
  • Input: 380/400V 3-phase, 63A
  • Power monitoring via display and Ethernet
  • System utilizes 4 PDUs
  • 110.4kW total capacity (4 × 27.6kW), ~97% over-provisioned relative to peak rack draw
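
The last bullet is simple arithmetic, sketched below. Note that the peak-draw figure is back-solved from the ~97% over-provisioning claim; it is an inference from the spec's own numbers, not a published measurement.

```python
# Worked arithmetic behind the PDU figures above. The peak-load figure is
# back-solved from the ~97% over-provisioning claim; it is an inference,
# not a published measurement.
PDUS, KW_PER_PDU = 4, 27.6
capacity_kw = PDUS * KW_PER_PDU             # 4 * 27.6 = 110.4 kW

implied_peak_kw = capacity_kw / (1 + 0.97)  # ~56 kW implied rack draw
print(f"Total capacity:    {capacity_kw:.1f} kW")
print(f"Implied peak draw: {implied_peak_kw:.1f} kW")
```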

Cables

  • Copper network and fabric cables inside each rack
  • Fiber InfiniBand and PCIe fabric cables up to 100m available for multi-rack GPUltima-CI deployments
  • Fully integrated, cabled, racked and stacked data center solutions

Software OS, Frameworks & Libraries

  • Operating Systems: CentOS, Ubuntu, SUSE, Windows
  • Optional pre-installed deep learning frameworks:
    • Torch
    • Caffe2
    • Theano
    • TensorFlow
  • NVIDIA CUDA drivers
  • Optional pre-installed deep learning libraries:
    • MLPython
    • cuDNN
    • DIGITS
    • CaffeOnSpark
    • NCCL
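
Once a node is composed and booted, a quick sanity check that it actually sees its fabric-attached GPUs can use the standard NVIDIA driver tooling. This is a generic sketch, not an OSS-supplied utility:

```python
# Minimal sanity check that a composed node sees its fabric-attached GPUs.
# Generic sketch using standard NVIDIA driver tooling (nvidia-smi).
import subprocess

def list_gpus() -> list[str]:
    """Return one line per visible GPU, as reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "-L"],  # e.g. "GPU 0: Tesla V100-SXM2-16GB (...)"
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip().splitlines()

if __name__ == "__main__":
    gpus = list_gpus()
    print(f"{len(gpus)} GPU(s) visible to this node:")
    for line in gpus:
        print(" ", line)
```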