By Braden Cooper, Director of Products at OSS
NVIDIA GTC 2026 in San Jose reinforced how quickly AI infrastructure is evolving across industries. The event continues to serve as a central forum for defining how AI systems are built, deployed, and scaled.
This year’s announcements were less about incremental performance gains and more about architectural direction. Across keynotes and partner discussions, a consistent theme emerged: AI is moving out of centralized data centers and into real-time systems operating closer to where data is generated.
For rugged edge deployments, this shift is already influencing system design, deployment models, and infrastructure requirements.
One of the more notable developments is the shift from standalone models to systems that operate continuously. NVIDIA NeMo, along with emerging agentic frameworks such as OpenClaw, reflects a move toward AI that is persistent, orchestrates multi-step processes, and operates with a degree of autonomy.
This evolution introduces new infrastructure demands. These systems are not invoked intermittently. They run continuously, maintain context over time, and depend on predictable, low-latency access to compute resources. In environments with constrained or unreliable connectivity, this model does not align with a cloud-first approach.
Instead, it favors edge deployment, where systems can execute locally and maintain autonomy. In government applications, industrial automation, and remote operations, systems must function independently for extended periods. The edge is no longer limited to inference endpoints. It is increasingly where decision-making and control reside.
NVIDIA’s framing of infrastructure as AI factories reflects a broader shift toward systems designed to continuously generate outputs, whether in the form of tokens, inferences, or actions. At hyperscale, this concept is implemented through tightly integrated compute, networking, and storage.
At the edge, the same principle applies in a distributed form. Systems deployed in the field are evolving into localized processing nodes that generate and act on data in real time, rather than simply forwarding it upstream. This approach becomes necessary in environments where bandwidth is limited, latency is critical, and data volumes are too large to move efficiently.
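A rough calculation shows why forwarding raw data upstream stops being practical. The sketch below compares the rate at which sensors generate data with the rate an uplink can carry. The function names, sensor count, per-sensor rate, and link speed are all hypothetical values chosen for illustration, not measurements from any deployment.

```python
def uplink_utilization(sensor_count: int, mbps_per_sensor: float, uplink_mbps: float) -> float:
    """Return the generated data rate as a multiple of uplink capacity."""
    if uplink_mbps <= 0:
        raise ValueError("uplink_mbps must be positive")
    return sensor_count * mbps_per_sensor / uplink_mbps


def backlog_gb_per_hour(sensor_count: int, mbps_per_sensor: float, uplink_mbps: float) -> float:
    """Return the data (decimal GB) that piles up each hour when generation exceeds the uplink."""
    excess_mbps = max(0.0, sensor_count * mbps_per_sensor - uplink_mbps)
    # Mb/s -> MB/s (divide by 8) -> MB/h (times 3600) -> GB/h (divide by 1000)
    return excess_mbps / 8 * 3600 / 1000


# Hypothetical site: 8 cameras at 400 Mb/s each behind a 100 Mb/s uplink.
ratio = uplink_utilization(8, 400.0, 100.0)
backlog = backlog_gb_per_hour(8, 400.0, 100.0)
print(f"Generation is {ratio:.0f}x the uplink; backlog grows by {backlog:.0f} GB per hour")
```

Under these assumed figures the sensors produce far more than the link can carry, so the only workable option is to process locally and send results rather than raw streams.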
Form factor and deployability become key considerations in these scenarios. Platforms such as the OSS Torrey 2U Short Depth Server (2U SDS) provide a practical path for deploying high-performance GPU compute in constrained and rugged environments. Built on OCP and NVIDIA MGX architectures, this platform enables deployment of edge AI infrastructure without requiring traditional data center resources.
Another clear trend at GTC 2026 is the expansion of real-time, sensor-driven AI. Platforms like IGX Thor and Holoscan are enabling systems that continuously ingest, process, and respond to data streams.
This model is now common across robotics, autonomous systems, medical imaging, and government applications. These workloads are inherently edge-native. Data is generated at the edge and must be processed immediately to retain value.
As a result, infrastructure must support low and predictable latency, high-throughput data ingestion, and reliable local compute. These systems often require deterministic response times, particularly in safety-critical or mission-critical environments.
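One way to make "predictable latency" concrete is to judge a system by its tail latency rather than its average. The following is a minimal sketch, assuming latency samples in milliseconds and a nearest-rank percentile. The helper names, the sample values, and the 10 ms deadline are hypothetical and serve only to illustrate the idea.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile of samples, with pct in the range (0, 100]."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def meets_deadline(latencies_ms, deadline_ms, pct=99.0):
    """True if the chosen percentile of observed latency is within the deadline."""
    return percentile(latencies_ms, pct) <= deadline_ms


# Hypothetical samples: nine fast responses and one slow outlier.
latencies = [4.1, 4.3, 4.0, 4.2, 19.5, 4.1, 4.4, 4.2, 4.0, 4.3]
print(f"mean = {mean(latencies):.2f} ms")
print(f"p99 within 10 ms deadline: {meets_deadline(latencies, 10.0)}")
```

In this made-up data the mean sits comfortably under the deadline while the 99th percentile does not, which is the gap that matters in safety-critical or mission-critical settings.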

Thermal management is becoming a primary constraint as device power increases. In high-density deployments, traditional air cooling can limit achievable performance or require tradeoffs in ambient operating conditions. Liquid cooling is emerging as a practical approach for extending performance into environments where those tradeoffs are not acceptable.
The OSS 3U SDS-LC liquid-cooled platform addresses this by integrating liquid cooling into a rugged, self-contained system. It enables support for high-power enterprise GPUs while maintaining operation in elevated ambient temperatures and noise-sensitive environments.
Performance remains important, but the emphasis is shifting toward efficiency and responsiveness. There is increasing focus on reducing inference latency and aligning compute resources with specific workload requirements.
GPUs continue to play a central role, but they are now part of broader system architectures that account for memory access patterns, data movement, and power consumption. At the edge, these considerations are amplified by environmental constraints.
Power availability is limited, thermal margins are tighter, and physical space is often restricted. These factors require a system-level approach where performance is evaluated in the context of real-world operating conditions rather than peak theoretical capability.
Cloud infrastructure continues to play a critical role in AI development, particularly for model training and large-scale data processing. However, deployment models are becoming more distributed.
A common pattern is emerging in which models are developed and trained in centralized environments, deployed at the edge, and operated with a degree of independence. Systems synchronize with centralized resources when connectivity allows, but they are not dependent on continuous communication.
This hybrid approach reflects how systems operate in rugged environments. It allows for autonomy without isolation, enabling systems to function independently while still benefiting from centralized updates and coordination.
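The "synchronize when connectivity allows" behavior described above is often implemented as a store-and-forward buffer. The sketch below is one possible shape for it, not a description of any specific product. The class name, the `send` callback, and the buffer limit are hypothetical, and a real system would also persist the buffer to disk.

```python
from collections import deque


class StoreAndForward:
    """Buffer results locally and forward them only while the link is up."""

    def __init__(self, send, max_buffered=1000):
        self._send = send  # hypothetical callable: send(record) -> bool (True on success)
        # A bounded deque discards the oldest record when full, so local
        # operation never stalls waiting on the network.
        self._pending = deque(maxlen=max_buffered)

    def record(self, item):
        """Store a result locally; never blocks on connectivity."""
        self._pending.append(item)

    def flush(self, link_up):
        """Forward buffered records in order; return how many were sent."""
        sent = 0
        while link_up and self._pending:
            if not self._send(self._pending[0]):
                break  # keep the record and retry on the next flush
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self):
        return len(self._pending)


received = []
queue = StoreAndForward(send=lambda r: received.append(r) or True)
queue.record({"id": 1, "label": "anomaly"})
queue.record({"id": 2, "label": "normal"})
queue.flush(link_up=False)  # link down: nothing leaves the node
queue.flush(link_up=True)   # link restored: backlog drains in order
```

The design choice here is that recording never depends on the network, so the edge node keeps working through an outage and catches up afterward.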
GTC 2026 highlighted the continued convergence of industries around a shared AI ecosystem. Robotics, healthcare, automotive, energy, and federal sectors were all strongly represented.
This level of participation accelerates the development of common frameworks and reference architectures. It also increases expectations for interoperability and flexibility across platforms.
For edge deployments, infrastructure must support a wider range of workloads and adapt as requirements evolve. Systems are no longer designed for a single application but are expected to accommodate multiple use cases over their lifecycle.
The trends observed at GTC point toward a consistent set of requirements for edge systems. AI workloads are becoming continuous, moving closer to the source of data, and requiring real-time processing.
This results in several practical considerations:
These requirements define the next generation of rugged edge computing systems.

GTC 2026 reinforced a direction that has been building across the industry. AI is becoming more accessible, more distributed, and more tightly integrated into real-world systems.
For rugged edge computing, this represents a continuation of existing trends rather than a departure. The focus is not just on enabling AI at the edge, but on delivering systems that can sustain performance under real-world constraints.
As high-performance computing continues to move outward, closer to the source of data, system design, integration, and deployability will become the primary differentiators.
Before starting college in 2022, I had considered artificial intelligence (AI) a thing of the future, something I wouldn’t see until much later in life. With the birth of large language models (LLMs) like ChatGPT and the rise of machine learning systems, my world flipped on its head. Since joining the tech industry, I have seen that I am not alone in this experience. From one year to the next, there is no telling which technological developments we will witness. When it comes to the defense and security of our nation, capitalizing on these advancements is paramount, lest we fall behind our adversaries. As a result, marketers in the defense industry must adapt to the shifting needs of their company’s customers.
My time at the booth was spent listening to my colleagues interacting with partners and potential customers. Watching these exchanges, I recognized what was at the heart of West—the real reason why hundreds of people had shown up during the work week to surround themselves with others in the defense industry. The obvious explanations come to mind: to meet customers, establish connections, solidify a brand’s image, and feel out competitors. But when I took a step back to study the messaging on the booths and walk the floor, I saw what I had studied for years as a marketing student come to life. Every conversation and display was geared toward answering one two-part question:
“What do my customers want, and how can my product(s) solve their problem?”
As data-intensive workloads continue to grow across industries like AI, defense, autonomous systems, and high-performance computing (HPC), the need for scalable, high-speed infrastructure has never been greater. Organizations are increasingly hitting the physical and performance limits of traditional server architectures—especially when it comes to GPU density, storage bandwidth, and I/O flexibility.
This is where PCIe Expansion Systems come into play. By extending the capabilities of existing servers, PCIe Expansion enables businesses to scale performance efficiently without complete infrastructure overhauls. In this blog, we’ll explore the key benefits of PCIe Expansion, including how expansion backplanes, GPU expansion, and emerging technologies like CXL and PCIe 6.0 are shaping the future of computing.