Addressing FPGA HLS (High-Level Synthesis) challenges for heterogeneous computing at the edge
Implementing FPGAs for heterogeneous computing means meeting stringent requirements such as high performance, area efficiency and low power. At the same time, design cycles are shrinking and the need for HLS productivity is growing. How do you get the benefits of hand-coded RTL together with the productivity of HLS?
For heterogeneous solutions such as FPGAs with integrated processors or special compute engines, how do you determine which portion of your design should be implemented on which processing element (FPGA fabric, CPU, engines)? This paper/presentation explores the common design challenges engineers face when using HLS in heterogeneous compute environments at the edge.
HLS tools from FPGA vendors and third-party EDA companies improve productivity through a higher level of abstraction, faster verification and quicker design iterations. For example, verifying your design by simulating in C/C++ can be 10 to 100 times faster than simulating in RTL. In addition, many applications such as image processing require visualization, which is available in C-based simulation but not in traditional waveform-based RTL simulation. Quicker simulation brings faster iterations and easier design space exploration.
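As a rough illustration of what verifying at the C level can look like, the sketch below checks a small kernel against an independent reference model in plain C, with no RTL simulator involved. The kernel `blur3`, the frame length `N`, the reference `blur3_ref_at` and the test vector are all hypothetical and chosen only for this example; they are not taken from any particular tool or design.

```c
#include <stdint.h>

#define N 8  /* fixed frame length, so every loop bound is known at compile time */

/* Hypothetical kernel intended for HLS: 3-tap box filter with clamped edges. */
void blur3(const uint8_t in[N], uint8_t out[N])
{
    for (int i = 0; i < N; i++) {
        int left  = (i == 0)     ? 0     : i - 1;
        int right = (i == N - 1) ? N - 1 : i + 1;
        out[i] = (uint8_t)((in[left] + in[i] + in[right]) / 3);
    }
}

/* Independent golden model for one output sample, written differently on purpose. */
static uint8_t blur3_ref_at(const uint8_t in[N], int i)
{
    int sum = 0;
    for (int k = -1; k <= 1; k++) {
        int j = i + k;
        if (j < 0) j = 0;          /* clamp at the left edge */
        if (j > N - 1) j = N - 1;  /* clamp at the right edge */
        sum += in[j];
    }
    return (uint8_t)(sum / 3);
}

/* C-level testbench: returns the number of samples where kernel and model differ. */
int run_testbench(void)
{
    const uint8_t in[N] = {0, 10, 20, 30, 255, 255, 0, 90};
    uint8_t out[N];
    int mismatches = 0;

    blur3(in, out);
    for (int i = 0; i < N; i++) {
        if (out[i] != blur3_ref_at(in, i)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A `main` that simply returns `run_testbench()` would turn this into a pass/fail check that compiles and runs in well under a second, which is the kind of iteration speed the C-level flow is after.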
The challenges engineers face when using HLS can be difficult to overcome, and working around them costs the designer extra time, which begins to erode the productivity gains of HLS. These challenges include non-synthesizable C code, C/C++ code that is not “hardware aware”, identifying parallelism, and knowing when and where to insert pragmas to guide the compiler toward your performance targets. Software algorithms often rely on simple constructs, such as the C library's random-number function, that have no direct hardware equivalent, which makes the C code non-synthesizable for an FPGA. C/C++ code that does not factor in the hardware implementation, for example in its memory constructs, can have unintended consequences: it can bloat the hardware and slow performance. For heterogeneous designs, deciding what to keep in C/C++ on the processor and what to move into hardware via HLS, to exploit the parallel nature of the FPGA fabric, can take tremendous time over many iterations. Finally, applying the various pragmas provided by the different tools to improve performance can be time-consuming.
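To make these challenges concrete, here is a minimal sketch of one way to rewrite a software routine so that it is more hardware friendly. It assumes the original code called the C library's `rand()`, and it swaps in a 16-bit Galois linear-feedback shift register (LFSR), which is a different generator with different statistics but maps to a few XOR gates and a register. The names `lfsr16_next` and `add_noise`, the buffer length `N` and the default seed are hypothetical. The pragma uses Vivado HLS style syntax as an assumption about the target tool; a regular C compiler ignores it.

```c
#include <stdint.h>

#define N 64  /* fixed-size buffers instead of dynamically allocated memory */

/* One step of a 16-bit Galois LFSR, taps 0xB400 (x^16 + x^14 + x^13 + x^11 + 1).
   The state must be nonzero; zero is a lock-up state. */
uint16_t lfsr16_next(uint16_t state)
{
    uint16_t lsb = state & 1u;
    state >>= 1;
    if (lsb) {
        state ^= 0xB400u;
    }
    return state;
}

/* Hypothetical kernel: add 0..3 of pseudo-random noise to each sample, saturating at 255. */
void add_noise(const uint8_t in[N], uint8_t out[N], uint16_t seed)
{
    uint16_t state = seed ? seed : 0xACE1u;  /* avoid the all-zero lock-up state */

    for (int i = 0; i < N; i++) {
/* Assumed Vivado HLS style directive: request one loop iteration per clock cycle. */
#pragma HLS PIPELINE II=1
        state = lfsr16_next(state);
        unsigned v = in[i] + (state & 0x3u);
        out[i] = (uint8_t)(v > 255u ? 255u : v);
    }
}
```

The fixed loop bound and statically sized arrays give the tool a known trip count and a memory size it can map to on-chip storage. Whether the requested initiation interval is actually achieved depends on the tool and the target device.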
SLX for FPGA analyzes C/C++ code to provide a deep understanding of software interdependencies, application hot spots, and parallelization opportunities to enable code optimization for heterogeneous multicore SoCs with FPGAs. The tool enables HW/SW exploration and recommends which parts of your C/C++ code are best suited to stay on the ARM processor and which to accelerate in the FPGA fabric. In addition, SLX generates pragmas to accelerate your code on multicore ARM processors and FPGA fabric. Further enhancements include tighter integration with Xilinx’s SDSoC development environment.
--- Date: 28.02.2019 Time: 14:30 - 15:00 Location: Exhibitor's Forum, Hall 3, 3-719