HLS Kernel Design Integration into SDAccel

The primary flow described in the application-centric methodology of this guide is a top-down model, in which accelerator kernels are developed and integrated within the host application of a project. In this model, all source code is presented to SDAccel™, and sections of it are designated for synthesis into accelerator modules. The flow calls the Vivado® High-Level Synthesis (HLS) tool to translate each designated function into hardware-implementable accelerator code.

Alternatively, SDAccel provides a bottom-up flow, in which HLS-based hardware kernels are created in a Vivado HLS project and then imported. This allows you to optimize the kernel and validate its performance within the Vivado HLS project. When your kernel meets performance and resource requirements, the resulting Xilinx® object file (.xo) is handed off for inclusion in the SDx™ project. All Vivado HLS kernel optimizations are preserved during hand-off.

Figure: Vivado HLS Design Flow

The benefits of the bottom-up flow include:

  • You can design, validate, and optimize the kernel before integrating it into the complete SDAccel project.
  • Specific kernel optimizations are maintained for each kernel.
  • Independent Vivado HLS and SDAccel project locations allow separation of application and kernels.
  • A Vivado HLS project can be used by multiple projects, much like a library.
  • Teams can collaborate for increased productivity.

Creating SDAccel Kernels with Vivado HLS

Running Vivado HLS to generate kernels from C/C++ for SDAccel follows the regular Vivado HLS flow. However, because the kernel is intended to operate as an accelerator in an SDAccel project, the SDAccel kernel modeling guidelines must be followed (see C/C++ modeling guide). Most importantly, pointer arguments must be modeled as AXI memory-mapped interfaces (m_axi), with their base addresses mapped to the AXI4-Lite control interface. Scalar parameters passed by value and the function return are mapped to the AXI4-Lite interface only. This is illustrated in the following example:

void krnl_idct(const ap_int<512> *block,
               const ap_uint<512> *q,
               ap_int<512> *voutp,
               int ignore_dc,
               unsigned int blocks) {
  #pragma HLS INTERFACE m_axi     port=block     offset=slave bundle=p0      depth=512
  #pragma HLS INTERFACE s_axilite port=block                  bundle=control
  #pragma HLS INTERFACE m_axi     port=q         offset=slave bundle=p1      depth=2
  #pragma HLS INTERFACE s_axilite port=q                      bundle=control
  #pragma HLS INTERFACE m_axi     port=voutp     offset=slave bundle=p2      depth=512
  #pragma HLS INTERFACE s_axilite port=voutp                  bundle=control
  #pragma HLS INTERFACE s_axilite port=ignore_dc              bundle=control
  #pragma HLS INTERFACE s_axilite port=blocks                 bundle=control
  #pragma HLS INTERFACE s_axilite port=return                 bundle=control
  // Kernel body omitted.
}

Note: Using ap-datatypes in the interfaces requires using ap-datatypes in the HLS test bench as well. This can slow down C/C++ simulation, so consider mapping the interfaces to native C/C++ types. Because most host code is based on native data types, using native types in the kernel interfaces is recommended.
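
As a minimal sketch of the recommendation above, the following kernel uses only native C/C++ types on its interface while keeping the same interface pattern as the krnl_idct example. The kernel name krnl_scale, its arguments, the bundle names, and the depth value are hypothetical and chosen only for illustration. A regular C++ compiler ignores the HLS pragmas, so the function can also be called directly from native test code.

```cpp
#include <cassert>

// Hypothetical kernel (not part of the IDCT example): multiplies `count`
// integers by `factor`. Only native C/C++ types appear on the interface.
void krnl_scale(const int *in, int *out, int factor, unsigned int count) {
  // Pointer arguments: data moves over m_axi, and the base address is
  // passed through the s_axilite control bundle.
  #pragma HLS INTERFACE m_axi     port=in     offset=slave bundle=p0 depth=16
  #pragma HLS INTERFACE s_axilite port=in                  bundle=control
  #pragma HLS INTERFACE m_axi     port=out    offset=slave bundle=p1 depth=16
  #pragma HLS INTERFACE s_axilite port=out                 bundle=control
  // Scalars passed by value and the return use s_axilite only.
  #pragma HLS INTERFACE s_axilite port=factor              bundle=control
  #pragma HLS INTERFACE s_axilite port=count               bundle=control
  #pragma HLS INTERFACE s_axilite port=return              bundle=control

  for (unsigned int i = 0; i < count; ++i) {
    out[i] = in[i] * factor;
  }
}
```

Because the arguments are plain int pointers and scalars, host code and test bench can pass ordinary arrays without including the ap-datatype headers.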

For information on creating a new project, see the Vivado Design Suite User Guide: High-Level Synthesis (UG902). For SDAccel kernel projects, you must select the SDAccel Bottom Up Flow check box and specify the Clock Period and Part Selection as shown in the following figure.

Figure: New Vivado HLS Project

Choose the platform by clicking the Browse button to open the Device Selection Dialog and select the accelerator board from the Device list.

Figure: Device Selection

After the project is created, iterate on the kernel optimizations until the implementation meets your performance and resource requirements. For more information, see Vivado Design Suite User Guide: High-Level Synthesis (UG902).

After synthesis of the optimized design is complete, export the design to the SDAccel tool chain. The export command is available through the Main Menu > Solution > Export RTL menu item.

Figure: Export RTL as IP

You only need to confirm the XO file location. This sets the name and path of the generated XO file, which is imported into SDAccel as described in Incorporating Vivado HLS Kernel Projects into SDAccel.

Note: Most of the options shown in the previous section can also be set and changed from a running project through the Main Menu > Solution > Solution Settings. The Synthesis and Export sections have the same content as previously shown in this documentation.

This completes the HLS synthesis part for SDAccel. In the following section, some required details for the command line flow are shown.

Typical Vivado HLS Script for SDAccel Synthesis

If you run your HLS synthesis through command line scripts, the following Tcl code is equivalent to the GUI flow shown before:

open_project guiProj
set_top krnl_idct
add_files src/krnl_idct.cpp
add_files -tb src/idct.cpp
open_solution "solution1"
set_part {xcu200-fsgd2104-2-e} -tool vivado
create_clock -period 10 -name default
config_sdx -optimization_level none -target xocc
config_schedule -effort medium -enable_dsp_full_reg
config_compile -name_max_length 256 -pipeline_loops 64
#source "./guiProj/solution1/directives.tcl"
csim_design
csynth_design
cosim_design
export_design -rtl verilog -format ip_catalog -xo \
    /wrk/bugs/xoFlow/idct_hls/krnl_idct.xo
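
The csim_design and cosim_design steps in this script run the test bench added with add_files -tb. Vivado HLS treats a test bench return value of 0 as a pass and any non-zero value as a failure. The following is a minimal sketch of that self-checking structure, not the actual idct.cpp test bench. The kernel krnl_add_one, the reference function, and run_testbench are hypothetical names, and the kernel is defined inline only to keep the sketch self-contained. In a real project, the kernel lives in its own source file and the test bench main simply returns the result of the check.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical kernel under test: a stand-in for the real top function.
void krnl_add_one(const int *in, int *out, unsigned int count) {
  for (unsigned int i = 0; i < count; ++i) {
    out[i] = in[i] + 1;
  }
}

// Software reference model used to check the kernel output.
static void golden_add_one(const int *in, int *out, unsigned int count) {
  for (unsigned int i = 0; i < count; ++i) {
    out[i] = in[i] + 1;
  }
}

// Returns 0 when kernel and reference agree, 1 otherwise.
// A test bench would end with: int main() { return run_testbench(); }
int run_testbench() {
  const unsigned int N = 16;
  int in[N];
  int hw[N];
  int sw[N];

  // Deterministic input data.
  for (unsigned int i = 0; i < N; ++i) {
    in[i] = static_cast<int>(i) * 3;
  }

  krnl_add_one(in, hw, N);
  golden_add_one(in, sw, N);

  // Compare every element and report mismatches.
  int errors = 0;
  for (unsigned int i = 0; i < N; ++i) {
    if (hw[i] != sw[i]) {
      std::printf("Mismatch at %u: kernel=%d reference=%d\n", i, hw[i], sw[i]);
      ++errors;
    }
  }
  return errors == 0 ? 0 : 1;
}
```

Keeping the comparison inside the test bench means the same check is reused unchanged by C simulation and by C/RTL co-simulation.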

Incorporating Vivado HLS Kernel Projects into SDAccel

The Vivado HLS output is the kernel code exported as a Xilinx object file (.xo). This file can be integrated into SDAccel by selecting the object file as a source (see Adding Sources for more information). When SDAccel imports the .xo file, the kernel name is extracted automatically, so the host code can refer to the kernel by that name.

During SDAccel compilation, it is possible to create multiple compute units from the kernels, but the implementation remains the same as designed during the Vivado HLS run.

In SDAccel, the regular debug and analysis features are supported for this flow, apart from the exceptions listed under Known Limitations. You can build for hardware emulation to test and debug the implementation in detail and to tune the performance of the system and host code.

Note: Software emulation is currently not supported for this flow, because missing or duplicated header file dependencies can cause problems.

Known Limitations

This flow has certain limitations that are not present in the top-down flow:
  • Software emulation is not supported for projects with .xo files (header files can be missing or duplicated).
  • GDB kernel debugging is not supported in the hardware emulation flow.
  • HLS analysis functionality is only available in the Vivado HLS project, not from SDAccel.