The AI Transportable Hardware Path: From Ingesting Data to Actionable Intelligence

June 28, 2022

By David Warren-Angelucci, OSS Channel Sales Manager

HPC hardware for AI Workflows on the Edge

The building blocks of an AI workflow are the same as any computational workflow:

  1. Acquire Data
  2. Store that Data
  3. Compute the Data
  4. Make educated decisions based on the computational output
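The four steps above can be sketched as a chain of small functions. This is only an illustrative toy, not OSS software: the `Reading` type, the sensor values, the mean-based analytic, and the 80.0 threshold are all hypothetical stand-ins for a real sensor feed and a real model.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Reading:
    sensor_id: str
    value: float


def acquire() -> List[Reading]:
    # Step 1: acquire data (hypothetical hard-coded sensor samples).
    return [Reading("temp-0", 71.5), Reading("temp-0", 73.0), Reading("temp-0", 98.5)]


def store(readings: List[Reading], archive: List[Reading]) -> List[Reading]:
    # Step 2: store the data (an in-memory list stands in for a flash array).
    archive.extend(readings)
    return archive


def compute(archive: List[Reading]) -> float:
    # Step 3: compute on the data (a simple mean stands in for GPU analytics).
    return mean(r.value for r in archive)


def decide(score: float, threshold: float = 80.0) -> str:
    # Step 4: turn the computed result into a decision.
    return "alert" if score > threshold else "ok"


if __name__ == "__main__":
    archive: List[Reading] = []
    print(decide(compute(store(acquire(), archive))))
```

In a real edge deployment each function would map to a separate hardware stage, but the data still flows in the same order.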

While most AI workflows run in the controlled environment of a datacenter, where servers have the HPC resources the applications need, many current AI applications require some or all of the workflow steps to be performed out in the field, in harsh environmental conditions. Until now, companies with applications on the ‘edge’ have had to rely on low-performance hardware or accept the latency of uploading data to the cloud. Rugged edge-computing devices, like industrial PCs and IoT devices, can withstand extreme environmental conditions, but they do not come close to the computational performance of servers in a datacenter. Because of this, AI applications on the ‘edge’ have had to compromise on performance, but not anymore!

With our latest line of “AI Transportable” products, One Stop Systems (OSS) supplies rugged appliances that deliver datacenter-class performance and can be used for AI workflows in cars, planes, trucks, ships, drones, and other environments that have never been able to support HPC hardware. The products in the AI Transportable line are rugged, datacenter-type HPC products tailored to each of the four steps in the AI workflow. Companies whose edge applications demand the highest compute performance cannot compromise; they need the components of the datacenter in the field.

With our “AI Transportable” product line, OSS brings the power of the datacenter to the edge!

OSS designs and manufactures high-performance computing systems that support each stage of the AI Transportable workflow, with a range of products tailored to the needs of each stage and the requirements of the application.

The 4 Stages of the AI Workflow

The ultimate goal of the AI workflow is to process raw data into actionable intelligence.  OSS provides hardware platforms which expedite AI workflows and significantly reduce the time to take action.

The four fundamental building blocks of an AI workflow are: gathering raw data from sensors and other I/O devices (OSS has products that acquire large amounts of data at high speed), storing that data (OSS has products that support high-density storage in a small footprint), computing on that data (OSS specializes in multi-GPU platforms for high-speed analytics, inference, AI training, and retraining), and then making intelligent decisions based on the knowledge gained from that data.

1. Data Acquisition
    Ingesting data from sensors and I/O devices is a fundamental part of many edge applications. Acquiring large amounts of data at high speed requires all-flash arrays with high-speed sensor inputs.

    OSS has a variety of products built to address those requirements, each with its own benefits. Some of our data-ingest servers, like the 2U flash-storage array in the picture on the left, are designed for speed and utility. This server has 24 2.5” SSD bays (up to 367TB using 15TB drives), so the chassis offers flexible capacity while maintaining high bandwidth and low latency, with throughput of over 50GB/s.

    Others, like the rugged 4U FSAn-4 flash-storage array in the picture on the right, are designed with density and utility in mind. With 32 slots for PCIe NVMe flash add-in cards in four removable canisters, it supports up to 400TB of data at double the bandwidth of traditional 2.5” SSDs, with 30GB/s of net data throughput.
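To put the capacity and throughput figures quoted above in perspective, the short calculation below estimates how long each array could record before filling up. It is a rough sketch under two assumptions that the article does not state: the quoted throughput is sustained for the whole recording, and capacities use decimal units (1TB = 1000GB).

```python
def fill_time_hours(capacity_tb: float, throughput_gb_s: float) -> float:
    """Hours needed to write capacity_tb at a sustained throughput_gb_s.

    Assumes decimal units (1 TB = 1000 GB) and a constant write rate.
    """
    return capacity_tb * 1000 / throughput_gb_s / 3600


# (capacity in TB, throughput in GB/s) as quoted in the text.
arrays = {
    "2U flash-storage array": (367, 50),
    "4U FSAn-4": (400, 30),
}

for name, (capacity_tb, throughput_gb_s) in arrays.items():
    hours = fill_time_hours(capacity_tb, throughput_gb_s)
    print(f"{name}: about {hours:.1f} hours to fill")
```

Under these assumptions the 2U array fills in roughly two hours and the FSAn-4 in a little under four, which is why removable media matters for long recording sessions.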



    2. Data Storage
      In addition to the two flash-storage arrays described above, we have other storage devices designed with density, speed, and utility in mind.

      The SB2000, which is illustrated on the left, has 24 2.5” drive bays in a 2U chassis. The drives are individually hot-swappable, or they can be removed in groups of 8, which makes the chassis more practical for the user. Like the 2U flash-storage array, the SB2000 supports up to 367TB of storage and delivers over 50GB/s of data throughput.

      With similar utility in mind, we are currently developing our newest flagship rugged storage server, the Centauri NVMe. The chassis is 4U tall and only half a rack wide. It is designed to take up minimal space while offering maximum utility for applications out in the field. It is fully ruggedized and supports up to 8 NVMe SSDs in a single hot-swappable canister.

      The benefit of the hot-swappable canister is minimal downtime between filling the drives and replacing them with fresh ones. The application’s data recording can continue without the delay of removing each drive individually or uploading data to the primary datacenter.
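The downtime argument can be made concrete with a small availability calculation. The sketch below compares swapping eight drives one at a time against swapping a single eight-drive canister. The eight-drive count comes from the Centauri description; the one-hour fill time and the 30-second swap times are purely hypothetical numbers chosen for illustration, not measured figures.

```python
def recording_availability(fill_time_s: float, swap_time_s: float) -> float:
    """Fraction of each fill-and-swap cycle spent actually recording."""
    return fill_time_s / (fill_time_s + swap_time_s)


DRIVES = 8               # drives per canister, from the text
FILL_TIME_S = 3600.0     # hypothetical: one hour to fill the drives
SWAP_PER_DRIVE_S = 30.0  # hypothetical: time to swap one drive
SWAP_CANISTER_S = 30.0   # hypothetical: time to swap the whole canister

per_drive = recording_availability(FILL_TIME_S, DRIVES * SWAP_PER_DRIVE_S)
canister = recording_availability(FILL_TIME_S, SWAP_CANISTER_S)

print(f"drive-by-drive swap: {per_drive:.1%} of the time recording")
print(f"canister swap:       {canister:.1%} of the time recording")
```

Whatever the real timings are, the shape of the result is the same: the swap cost is paid once per canister instead of once per drive.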

      3. Compute
        Once the data has been collected and stored, it needs to be processed. The customers and applications that OSS targets require real-time multi-GPU computing out in the field. OSS offers a wide range of GPU-accelerated systems. Some are rackmount units designed to connect to an existing server and simply scale its GPU resources. Others are designed with space limitations in mind and provide a powerful all-in-one solution for rugged multi-GPU computing in the field.

        The chassis shown on the left is our EB4400, which is designed to support up to 4 traditional GPUs. The EB4400 is unusual in that it can be used either as an expansion resource for a customer’s existing server or as a stand-alone GPU server.

        If more than 4 GPUs are required in a single 4U space, our 4U Pro follows the same design as the EB4400 but supports 8 GPUs, offering twice the capacity.

        The Rigel Compute Server on the right is our flagship rugged GPU server. With a rugged chassis designed around NVIDIA’s HGX A100 4-GPU board, the Rigel harnesses the power of four SXM GPUs connected through an NVLink topology in the world’s highest-performing fully rugged GPU solution.

        SXM GPUs are already individually faster than traditional PCIe GPUs. The NVLink topology on the HGX A100 4-GPU board connects the four SXM GPUs with full-mesh peer-to-peer communication, which gives the fastest throughput possible in a four-GPU system.
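"Full mesh" means every GPU has a direct path to every other GPU, with no hop through a third device. The sketch below simply enumerates those direct pairs for a four-GPU board. It models only the topology as a graph; the GPU indices are arbitrary labels, and it says nothing about how many physical NVLink lanes serve each pair.

```python
from itertools import combinations
from typing import List, Tuple


def full_mesh_pairs(num_gpus: int) -> List[Tuple[int, int]]:
    """All unordered GPU pairs that are directly connected in a full mesh."""
    return list(combinations(range(num_gpus), 2))


pairs = full_mesh_pairs(4)
print(f"{len(pairs)} direct GPU-to-GPU connections: {pairs}")

# In a full mesh each GPU is directly connected to all the others.
peers_per_gpu = {
    gpu: sorted({a if b == gpu else b for a, b in pairs if gpu in (a, b)})
    for gpu in range(4)
}
print(peers_per_gpu)
```

For four GPUs this gives six direct connections, and each GPU reaches its three peers in a single hop.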

        Customers around the globe are already using the HGX A100 4-GPU board in NVIDIA DGX workstations in their labs, but our Rigel Supercomputer gives customers the ability to use that immense GPU power in harsh environments. Rigel can be used for a wide range of edge applications; it can be flange-mounted to the side of a truck and run on 48V DC power, or installed in an airplane for real-time surveillance operations.


        4. Actionable Intelligence
          The last two products are hybrids: they combine storage and GPU compute in one system, rather than offering only one or the other like the solutions shown previously.

          The 3U Short Depth Server on the left is designed with both space limitations and harsh environments in mind. It supports up to four double-wide add-in cards (such as GPUs) and has 16 NVMe/SATA SSD bays in two removable canisters. The flexibility of this all-in-one server, combined with its rugged design, suits a wide variety of applications that need more than storage alone or GPU computing alone.

          On the right, you’ll see the Rigel NVMe Storage Server mounted side by side with the Rigel Compute Server. It may be overkill for many applications, but for those that require the highest-performance storage and GPU computing at the edge, the potential capacity of this dual-Rigel solution is unmatched in the high-performance computing industry.

          The Future is Now

          The push to support AI applications in the field is becoming increasingly evident. Companies can no longer accept the compromise of uploading data to the cloud, waiting for it to be stored and processed in a datacenter, and then waiting again for results to be transferred back to the field. Traditional industrial box PCs, meanwhile, can no longer meet the intense storage and compute requirements of many AI workflows.
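A back-of-the-envelope comparison shows why the cloud round trip is costly. The sketch below estimates how long it takes to move a dataset over a network uplink versus writing it to a local array. The 1TB dataset and the 1Gbit/s uplink are hypothetical values picked for illustration; the 50GB/s local rate is the throughput quoted earlier for the 2U flash-storage array and the SB2000.

```python
def gbit_to_gbyte_per_s(rate_gbit_s: float) -> float:
    """Convert a link rate from gigabits per second to gigabytes per second."""
    return rate_gbit_s / 8


def transfer_time_s(data_gb: float, rate_gb_s: float) -> float:
    """Seconds to move data_gb at a sustained rate_gb_s (protocol overhead ignored)."""
    return data_gb / rate_gb_s


DATASET_GB = 1000.0        # hypothetical: 1 TB of recorded sensor data
UPLINK_GBIT_S = 1.0        # hypothetical: 1 Gbit/s uplink from the field
LOCAL_ARRAY_GB_S = 50.0    # throughput quoted in the text for the 2U arrays

upload_s = transfer_time_s(DATASET_GB, gbit_to_gbyte_per_s(UPLINK_GBIT_S))
local_s = transfer_time_s(DATASET_GB, LOCAL_ARRAY_GB_S)

print(f"upload to cloud: {upload_s / 3600:.1f} hours")
print(f"write locally:   {local_s:.0f} seconds")
```

With these example numbers the upload alone takes over two hours, before any processing or return transfer, while the local write finishes in seconds.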

          One Stop Systems is the solution, leading the industry in rugged HPC solutions of varying scale for edge AI applications.

           



