
Enabling High-Performance Compute (HPC) at the Edge

April 03, 2023


Seamus Egan

   

Special Guest Blogger

Seamus Egan, Vice President Government Integrated Solutions at TMGcore, Inc.

Introduction

The past two decades have seen exponential growth in data generation, driven by increased digitization of business processes, instrumentation of devices, and the need to capture, collect, and process vast quantities of unstructured data such as video and images. This evolution has spurred the development of new programming languages, new data storage architectures, and rapid advances in the infrastructure and chip architectures required to transform these aggregated data sets into business insight and competitive advantage for businesses and governments. Chip manufacturers such as Intel, AMD, and NVIDIA are responding to this market demand by developing larger and more complex silicon architectures, including some with embedded high-bandwidth memory. Monolithic chip architectures are being replaced by multi-tile designs with 3-D stacking, delivering significant performance gains and design platforms that allow flexible configurations tailored to business workloads. A challenging byproduct of this development is the growing amount of heat generated, which is taxing the limits of traditional air-cooled deployments and threatening to limit the performance of these advanced CPU and GPU systems.


High Density in the Core Data Center versus the Edge

Traditional data centers are adapting to mitigate these increasingly dense workloads by reducing rack densities, supplementing cooling with rear-door heat exchangers, and in some cases leveraging direct-to-chip cooling with its associated cooling distribution units. Server architectures are moving rapidly from 2U to 4, 6, and 8U rack sizes to accommodate the larger heatsinks and enlarged fan systems required to dissipate the heat these systems generate. Power Usage Effectiveness (PUE), the ratio of total facility energy (compute plus the overhead to run that compute) to the energy consumed by the compute equipment itself, is negatively impacted by the additional energy required to air-cool these systems. Customers and data center operators can perform a cost-benefit analysis to determine the best balance of cooling strategies given the constraints of their data centers. Ultimately, data center designs will morph from air-cooled to hybrid cooling designs, and finally to high-efficiency, low-PUE advanced cooling facilities aligned with the silicon roadmaps. This transformation will be accelerated by the Real Estate Investment Trusts (REITs) that dominate the at-risk development of large-scale data centers for businesses. Those REITs, driven by the need for a 15-20% rate of return on the capital expended on new data centers, will be reluctant to develop legacy air-cooled data centers and find themselves with the wrong product for the market within a five-year timeframe.
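To make the PUE definition concrete, here is a minimal sketch of the ratio, using hypothetical load figures chosen only for illustration:

```python
def pue(it_power_kw: float, overhead_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    return (it_power_kw + overhead_kw) / it_power_kw

# Hypothetical loads for comparison: a 100 kW IT load with heavy
# fan/CRAC overhead versus the same load with minimal cooling overhead.
air_cooled = pue(it_power_kw=100.0, overhead_kw=60.0)
immersion = pue(it_power_kw=100.0, overhead_kw=5.0)
print(f"air-cooled PUE: {air_cooled:.2f}, immersion PUE: {immersion:.2f}")
# air-cooled PUE: 1.60, immersion PUE: 1.05
```

A PUE of 1.0 would mean every watt entering the facility reaches the compute equipment; the gap above 1.0 is largely the cooling overhead the text describes.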

While data centers have some levers they can adjust to stretch the capabilities of existing infrastructure, the same cannot be said for the Edge. Edge server design has always been constrained by power limitations and the need to operate in austere locations without the protections of the data center. The exponential growth in data at the Edge, combined with the need to deploy advanced compute architectures to process that data, is driving demand for advanced, energy-efficient cooling designs to meet this challenge.


Liquid Immersion Cooling at The Edge

As discussed earlier, air cooling is just one of many techniques that can be applied to address the cooling needs of compute architectures, and increasing server density is stressing the limits of air as a primary cooling medium. Liquid immersion cooling has the advantage of supporting extremely high-density compute, and doing so very efficiently from a PUE perspective.

Liquid Immersion Cooling

With immersion cooling, responsibility for and control of the cooling process shifts from the individual server to the immersion cooling platform. This simplifies the architecture and design of the server by eliminating fault-prone fans, large-displacement heatsinks, and multi-RU server chassis designed to support airflow. The result is a solid-state server that operates at its optimal performance level and is more conducive to deployment in the diverse scenarios encountered at the Edge.



Server immersed in inert, dielectric two-phase fluid.

 

When it comes to liquid immersion cooling, two-phase liquid immersion cooling (2PLIC) delivers the lowest PUEs for the highest compute densities on the market. 2PLIC leverages an inert, dielectric (non-conductive) fluid to support heat transfer rates that are orders of magnitude higher than air's. When the server is immersed in the 2PLIC fluid, the fluid in contact with the hot processor changes state to a vapor and rapidly carries the heat away from the processor to the surface of the tank, where it condenses back into a liquid. The condensation process provides a very efficient mechanism for moving the heat out to a final heat rejection mechanism, such as a dry cooler or chiller system. TMGcore's 2PLIC systems are designed to control this cycle with zero vapor loss at normal atmospheric pressure. The relatively low boiling point of the dielectric fluid and the buoyancy of the bubbles create an isothermal bath that keeps every component on the server at an optimal temperature, with no need for pumps to circulate the fluid.
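As a back-of-envelope sketch of why the phase change moves heat so effectively, the vapor mass flow needed to carry a given heat load can be estimated from the fluid's latent heat of vaporization. The 142 kJ/kg figure below is an assumed value representative of engineered two-phase dielectric fluids, not a TMGcore specification; a real design would use the fluid datasheet value:

```python
def vapor_mass_flow_kg_s(heat_load_kw: float, h_fg_kj_per_kg: float) -> float:
    """Mass of fluid boiled off per second to absorb heat_load_kw via latent heat.

    kW = kJ/s, so dividing by kJ/kg yields kg/s directly.
    """
    return heat_load_kw / h_fg_kj_per_kg

# Assumed latent heat, typical of engineered dielectric fluids (hypothetical):
H_FG_KJ_PER_KG = 142.0
print(f"{vapor_mass_flow_kg_s(50.0, H_FG_KJ_PER_KG):.3f} kg/s")
# 0.352 kg/s of vapor carries a 50 kW load
```

Because the vapor condenses on the tank's coils and returns to the bath, this mass circulates passively; no pump has to move it.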



TMGcore’s EdgeBox delivers HPC at the Edge

TMGcore has developed the EdgeBox series of platforms to enable the delivery of advanced compute architectures at the Edge, such as One Stop Systems' (OSS) Rigel Edge Supercomputer. The EdgeBox is designed to support all the lifecycle management needs of the 2PLIC environment, with the ability either to operate fully disconnected in an air-gapped deployment or to leverage a cloud-based management capability that centrally monitors and manages many distributed edge nodes.

The EdgeBox X (EB-X) supports up to four Rigel Edge Supercomputers, enabling the delivery of over 5 PetaFlops of TensorFlow AI/ML processing in a footprint of less than 6 sq. ft. with a partial PUE of less than 1.03. The system is virtually silent, allowing deployment adjacent to personnel without impeding their ability to carry out their normal functions. The EdgeBox X is designed to plug into customer-selected final heat dissipation solutions and can be configured with different power architectures based on region or power density requirements.

The linear nature of the cooling properties of 2PLIC allows the physical platforms to scale up as well as down. This enables small form-factor designs for OSS' AI Transportables strategy, as well as larger deployments on seaborne platforms or regional data aggregation nodes.

In summary, the IT market is in the infancy of a huge paradigm shift in which platforms like TMGcore's EdgeBox, paired with world-class advanced compute capabilities such as OSS' Rigel platform, can be deployed at the point of data creation, enabling rapid decision-making with increased data fidelity for business owners and government entities.


TMGcore EdgeBox X with OSS Rigel Edge Supercomputer

Read more about our partnership with TMGcore.


