GPU Trends: The Quest for Performance, Latency, and Flexibility in ISR Systems

March 13, 2019 | BY: Tammy Carter

Published in Electronic Design

Strategies such as GPUDirect, PCIe Device Lending, and the SISCI API can help system integrators optimize ISR solutions.

For military intelligence, surveillance, and reconnaissance (ISR) applications, such as radar, EO/IR (electro-optic/infrared), or wideband ELINT (electronic intelligence), the ongoing problem is how best to handle the expanding “firehose” of data, fed by an increasing number of wide-bandwidth platform sensors.  To handle this massive inflow of data, and the complex algorithms required to process it, state-of-the-art computational engines and data-transport mechanisms are essential.

Deployed High Performance Embedded Computer (HPEC) systems designed to support these applications typically have a heterogeneous architecture of high-performance FPGAs, GPUs, and digital signal processors, or DSPs (today, often Intel Xeon-D based modules). GPUs provide a large number of floating-point cores tuned for mathematical workloads, which makes them well suited to the complex algorithms used in ISR applications. For comparison, a single Intel Xeon-D processor can provide a peak throughput of ~600 GFLOPS, while NVIDIA’s Pascal P5000 GPU sports 6.4 TFLOPS of peak performance.

Tighter Integration

Today, ISR system integrators have three main goals: minimize latency, maximize system bandwidth, and optimize configuration flexibility within their given size, weight, and power (SWaP) constraints. To meet these goals, leading COTS vendors of OpenVPX modules are seeking ways to provide closer integration between the compute elements.

In the beginning, sensor data preprocessed by the FPGA had to be copied to the CPU, which subsequently copied it to the GPU for further processing. Then, NVIDIA introduced GPUDirect, which added the capability to move the data directly from an FPGA or network interface, such as Mellanox InfiniBand, to a GPU. Eliminating the extra copies decreased both latency and backplane utilization.
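The effect of removing the intermediate copy can be illustrated with a rough back-of-the-envelope model. The sketch below is not a GPUDirect implementation; it only sums per-hop transfer times, and the batch size and the 8-GB/s link figure are hypothetical values chosen for illustration.

```python
def transfer_time_s(n_bytes, hops_gb_per_s):
    """Time to move n_bytes across each hop in turn.

    hops_gb_per_s lists the bandwidth of every hop in GB/s.
    The figures used below are hypothetical, not measured values.
    """
    return sum(n_bytes / (bw * 1e9) for bw in hops_gb_per_s)


BATCH_BYTES = 64e6  # hypothetical 64-MB sensor batch

# Staged path: FPGA -> CPU memory, then CPU memory -> GPU (two hops).
staged = transfer_time_s(BATCH_BYTES, [8, 8])

# Direct path: FPGA -> GPU (one hop).
direct = transfer_time_s(BATCH_BYTES, [8])

print(f"staged: {staged * 1e3:.1f} ms, direct: {direct * 1e3:.1f} ms")
```

With equal link speeds on both hops, the staged path takes twice as long in this model and also crosses the backplane twice, which is the utilization cost the direct path avoids.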

Such an approach works well until the amount of incoming data overwhelms the system, such that one batch of data hasn’t completed processing before the next batch of data arrives. This can result either from the transport systems being overwhelmed (I/O bound) or the GPU not completing the calculations in the required time frame (compute bound).
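The distinction between the two failure modes can be expressed as a simple check. This is a minimal sketch, assuming the transfer and compute stages overlap so that each one must individually fit within the batch arrival period; the function name and the timing values are hypothetical.

```python
def classify_bottleneck(arrival_period_s, transfer_s, compute_s):
    """Classify one processing node given how often batches arrive.

    All arguments are per-batch times in seconds (hypothetical values).
    Assumes transfer and compute run concurrently on successive batches.
    """
    if transfer_s <= arrival_period_s and compute_s <= arrival_period_s:
        return "keeping up"
    # At least one stage is too slow; the slower one sets the limit.
    return "I/O bound" if transfer_s >= compute_s else "compute bound"


# A new batch arrives every 10 ms.
print(classify_bottleneck(0.010, 0.004, 0.006))
print(classify_bottleneck(0.010, 0.012, 0.006))
print(classify_bottleneck(0.010, 0.004, 0.015))
```

If the stages did not overlap, the sum of the two times, rather than the larger of them, would have to fit within the arrival period.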

When using GPUs, the limiting factor is often the I/O, and this is usually addressed by distributing the incoming data round-robin across processing nodes, by pipelining the processing stages, or both. Unfortunately, as sensor data rates continue to increase, it has become apparent that new techniques are required.
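Both techniques can be sketched in a few lines. The example below is illustrative only: the GPU names, batch numbers, and stage times are hypothetical, and the pipeline estimate assumes stages run concurrently with enough buffering that the slowest stage alone sets the steady-state rate.

```python
from itertools import cycle


def round_robin(batches, gpus):
    """Assign each incoming batch to the next GPU in turn."""
    assignment = {gpu: [] for gpu in gpus}
    for batch, gpu in zip(batches, cycle(gpus)):
        assignment[gpu].append(batch)
    return assignment


def serial_time(stage_times, n_batches):
    """Total time when each batch finishes every stage before the next starts."""
    return n_batches * sum(stage_times)


def pipelined_time(stage_times, n_batches):
    """Total time when stages overlap across successive batches."""
    # The first batch pays for every stage; each later batch
    # adds only the duration of the slowest stage.
    return sum(stage_times) + (n_batches - 1) * max(stage_times)


print(round_robin(range(6), ["gpu0", "gpu1"]))

stages_ms = [2, 3, 1]  # hypothetical transfer, compute, output times
print(serial_time(stages_ms, 4), pipelined_time(stages_ms, 4))
```

Neither technique raises the capacity of an individual link or GPU; each only spreads or overlaps the work, which is why they stop helping once the aggregate data rate exceeds what the transport can carry.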

PCI Express

In OpenVPX systems, the standard interface between the FPGAs, GPUs, and CPUs is PCI Express (PCIe). It offers the fastest path to and from the processor and, by definition, connects to other devices via the expansion plane. Moving traffic off Ethernet and onto the PCIe connection reduces latency and increases throughput.

Based on the original PCI parallel bus design, PCIe is controlled by a single “master” host called the root complex, which scans the bus to find and enumerate all connected devices. When a PCIe switch is used to connect multiple devices to a root complex, the switch acts as a transparent bridge (TB), and all devices operate in a single address space. Two root nodes (processors) can’t be connected through a TB because their memory addresses would conflict.

When a PCIe switch port is configured as a non-transparent bridge (NTB), a root node doesn’t attempt to enumerate devices beyond that switch port. Instead, when either of the two processors enumerates its NTB port, the port requests memory on that processor. The NTB port then translates addresses between the two sides, giving each processor a window into the other’s memory.
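The address translation performed at an NTB port can be pictured as a window that maps a range of local addresses onto a range in the peer’s address space. The sketch below models only that arithmetic; the class name, base addresses, and window size are hypothetical, and real NTB hardware is configured through vendor-specific registers and drivers rather than code like this.

```python
class NtbWindow:
    """Hypothetical model of one NTB memory window."""

    def __init__(self, local_base, remote_base, size):
        self.local_base = local_base    # where the window appears locally
        self.remote_base = remote_base  # where it lands in the peer's memory
        self.size = size                # window length in bytes

    def translate(self, local_addr):
        """Map a local address inside the window to the peer's address."""
        offset = local_addr - self.local_base
        if not 0 <= offset < self.size:
            raise ValueError("address outside the NTB window")
        return self.remote_base + offset


# A 1-MB window: local 0x9000_0000 maps to 0x4000_0000 on the peer.
window = NtbWindow(0x9000_0000, 0x4000_0000, 0x10_0000)
print(hex(window.translate(0x9000_0040)))
```

Because each side sees only its own window, the two processors keep separate address spaces and avoid the conflict that a transparent bridge would create.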

Author’s Biography

Tammy Carter

Senior Product Manager – GPGPUs & Software

Tammy Carter is the Senior Product Manager for GPGPUs and software products, featuring OpenHPEC, for Curtiss-Wright Defense Solutions. In addition to an M.S. in Computer Science, she has over 20 years of experience in designing, developing, and integrating real-time embedded systems in the defense, communications, and medical arenas.
