Modifying Kernel Placement
The primary issue when targeting a new platform is ensuring that an existing kernel placement will work in the new target platform. Each target platform has an FPGA defined by a static region. As shown in the figure below, the target platform(s) can be different.
- The target platform on the left has four SLRs, and the static region is spread across all four SLRs.
- The target platform on the right has only three SLRs, and the static region is fully-contained in SLR1.
This section explains how to modify the placement of the kernels.
Implications of a New Hardware Platform
The figure below highlights the issue of kernel placement when migrating to a new target platform. In the example below:
- The existing kernel, kernel_B, is too large to fit into SLR1 of the new target platform because most of the SLR is consumed by the static region.
- The existing kernel, kernel_D, must be relocated to a new SLR because the new target platform does not have four SLRs like the existing platform.
When migrating to a new platform, you need to take the following actions:
- Understand the resources available in each SLR of the new target platform, as documented in the Vitis 2019.2 Software Platform Release Notes.
- Understand the resources required by each kernel in the design.
- Use the v++ --config option to specify which SLR each kernel is placed in, and which DDR bank each kernel connects to. For more details, refer to Assigning Compute Units to SLRs and Mapping Kernel Ports to Global Memory.
These items are addressed in the remainder of this section.
Determining Where to Place the Kernels
To determine where to place kernels, two pieces of information are required:
- Resources available in each SLR of the hardware platform (.xsa).
- Resources required for each kernel.
With these two pieces of information you will then determine which kernel or kernels can be placed in each SLR of the target platform.
Keep in mind when performing these calculations that 10% of the available resources can be used by system infrastructure:
- Infrastructure logic can be used to connect a kernel to a DDR interface if it has to cross an SLR boundary.
- In an FPGA, resources are also used for signal routing. It is never possible to use 100% of all available resources in an FPGA because signal routing also requires resources.
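The 10% allowance can be applied as a simple headroom calculation before comparing kernels against an SLR. The following is a minimal sketch, not part of the Vitis tools: the helper name `usable_budget` and the 388,000 LUT figure are illustrative assumptions, and the 10% reserve is the rule of thumb stated above rather than a value reported by the platform.

```python
def usable_budget(available: int, reserve_pct: int = 10) -> int:
    """Return the resource count left for kernels after reserving headroom.

    reserve_pct is the share assumed to be taken by system infrastructure
    and routing (10% by default, per the guideline in this section).
    """
    if not 0 <= reserve_pct < 100:
        raise ValueError("reserve_pct must be in the range [0, 100)")
    # Integer arithmetic avoids floating-point rounding surprises.
    return available * (100 - reserve_pct) // 100


# Hypothetical SLR offering 388,000 CLB LUTs in its dynamic region.
print(usable_budget(388_000))
```

Treat the result as a planning ceiling for the combined usage of all kernels placed in that SLR, not as a guarantee that the design will route.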
Available SLR Resources
The resources available in each SLR of the various platforms supported by a release can be found in the Vitis 2019.2 Software Platform Release Notes. The table shows an example target platform. In this example:
- SLR description indicates which SLR contains static and/or dynamic regions.
- Resources available in each SLR (LUTs, Registers, RAM, etc.) are listed.
This allows you to determine what resources are available in each SLR.
| Area | SLR 0 | SLR 1 | SLR 2 |
|---|---|---|---|
| SLR description | Bottom of device; dedicated to dynamic region. | Middle of device; shared by dynamic and static region resources. | Top of device; dedicated to dynamic region. |
| Dynamic region Pblock name | pfa_top_i_dynamic_region_pblock_dynamic_SLR0 | pfa_top_i_dynamic_region_pblock_dynamic_SLR1 | pfa_top_i_dynamic_region_pblock_dynamic_SLR2 |
| Compute unit placement syntax | set_property CONFIG.SLR_ASSIGNMENTS SLR0 [get_bd_cells <cu_name>] | set_property CONFIG.SLR_ASSIGNMENTS SLR1 [get_bd_cells <cu_name>] | set_property CONFIG.SLR_ASSIGNMENTS SLR2 [get_bd_cells <cu_name>] |
| Global memory resources available in dynamic region | | | |
| Memory channels; system port name | bank0 (16 GB DDR4) | bank1 (16 GB DDR4, in static region); bank2 (16 GB DDR4, in dynamic region) | bank3 (16 GB DDR4) |
| Approximate available fabric resources in dynamic region | | | |
| CLB LUT | 388K | 199K | 388K |
| CLB Register | 776K | 399K | 776K |
| Block RAM Tile | 720 | 420 | 720 |
| UltraRAM | 320 | 160 | 320 |
| DSP | 2280 | 1320 | 2280 |
Kernel Resources
The resources for each kernel can be obtained from the System Estimate report.
The System Estimate report is available in the Assistant view after either the Hardware Emulation or System run is complete. An example of this report is shown below.
- FF refers to the CLB Registers noted in the platform resources for each SLR.
- LUT refers to the CLB LUTs noted in the platform resources for each SLR.
- DSP refers to the DSPs noted in the platform resources for each SLR.
- BRAM refers to the Block RAM Tile noted in the platform resources for each SLR.
This information can help you determine the proper SLR assignments for each kernel.
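Comparing the two sets of numbers can be sketched as a small fit check. In the sketch below, the per-SLR figures are copied from the example platform table, while the kernel estimate, the function names, and the 10% reserve are assumptions made for illustration; real values would come from your own System Estimate report.

```python
# Approximate dynamic-region resources per SLR, from the example platform table.
SLR_RESOURCES = {
    "SLR0": {"LUT": 388_000, "FF": 776_000, "BRAM": 720, "DSP": 2280},
    "SLR1": {"LUT": 199_000, "FF": 399_000, "BRAM": 420, "DSP": 1320},
    "SLR2": {"LUT": 388_000, "FF": 776_000, "BRAM": 720, "DSP": 2280},
}

RESERVE_PCT = 10  # assumed headroom for infrastructure and routing


def fits(kernel: dict, slr: dict) -> bool:
    """True if every resource the kernel needs is within the SLR budget."""
    return all(
        need <= slr.get(resource, 0) * (100 - RESERVE_PCT) // 100
        for resource, need in kernel.items()
    )


def candidate_slrs(kernel: dict) -> list:
    """List the SLRs that could hold this kernel on its own."""
    return [name for name, slr in SLR_RESOURCES.items() if fits(kernel, slr)]


# Hypothetical System Estimate figures for one kernel.
kernel_b = {"LUT": 210_000, "FF": 300_000, "BRAM": 400, "DSP": 900}
print(candidate_slrs(kernel_b))  # ['SLR0', 'SLR2']
```

This checks each kernel in isolation. When several kernels share an SLR, sum their estimates before calling `fits`.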
Assigning Kernels to SLRs
Each kernel in a design can be assigned to an SLR region using the connectivity.slr option in a configuration file specified for the v++ --config command line option. Refer to Assigning Compute Units to SLRs for more information.
When placing kernels, Xilinx recommends assigning the specific DDR memory bank that the kernel will connect to using the connectivity.sp config option as described in Mapping Kernel Ports to Global Memory.
For example, the figure below shows an existing target platform that has four SLRs, and a new target platform with three SLRs. The static region is also structured differently between the two platforms. In this migration example:
- Kernel_A is mapped to SLR0.
- Kernel_B, which no longer fits in SLR1, is remapped to SLR0, where there are available resources.
- Kernel_C is mapped to SLR2.
- Kernel_D is remapped to SLR2, where there are available resources.
The kernel mappings are illustrated in the figure below.
Specifying Kernel Placement
For the example above, the configuration file to assign the kernels would be similar to the following:
[connectivity]
nk=kernel:4:kernel_A.kernel_B.kernel_C.kernel_D
slr=kernel_A:SLR0
slr=kernel_B:SLR0
slr=kernel_C:SLR2
slr=kernel_D:SLR2
The v++ command line to place each of the kernels as shown in the figure above would be:
v++ -l --config config.txt ...
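When the placement is kept in a script or spreadsheet, the configuration file can be generated rather than typed by hand. The following is a minimal sketch: the function name `connectivity_section` is hypothetical, and it assumes all compute units are instances of a single kernel named `kernel`, as in the example above.

```python
def connectivity_section(kernel_name: str, placement: dict) -> str:
    """Build a [connectivity] section with one nk line and one slr line per CU.

    placement maps compute unit name -> SLR name, in the order the compute
    units should be listed on the nk line.
    """
    cu_names = list(placement)
    lines = [
        "[connectivity]",
        # nk=<kernel>:<count>:<cu_1>.<cu_2>... as used in this section
        f"nk={kernel_name}:{len(cu_names)}:{'.'.join(cu_names)}",
    ]
    lines += [f"slr={cu}:{slr}" for cu, slr in placement.items()]
    return "\n".join(lines) + "\n"


placement = {
    "kernel_A": "SLR0",
    "kernel_B": "SLR0",
    "kernel_C": "SLR2",
    "kernel_D": "SLR2",
}
config_text = connectivity_section("kernel", placement)
print(config_text, end="")
```

Writing `config_text` to config.txt gives a file that can be passed to v++ with the --config option shown above.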
Specifying Kernel DDR Interfaces
You should also specify the kernel DDR memory interface when specifying kernel placements. Specifying the DDR interface ensures the automatic pipelining of kernel connections to a DDR interface in a different SLR. This pipelining prevents the timing degradation that would otherwise reduce the maximum clock frequency.
In this example, using the kernel placements in the above figure:
- Kernel_A is connected to Memory Bank 0.
- Kernel_B is connected to Memory Bank 1.
- Kernel_C is connected to Memory Bank 2.
- Kernel_D is connected to Memory Bank 1.
The configuration file that makes these connections, passed to v++ through the --config option, would be as follows:
[connectivity]
nk=kernel:4:kernel_A.kernel_B.kernel_C.kernel_D
slr=kernel_A:SLR0
slr=kernel_B:SLR0
slr=kernel_C:SLR2
slr=kernel_D:SLR2
sp=kernel_A.arg1:DDR[0]
sp=kernel_B.arg1:DDR[1]
sp=kernel_C.arg1:DDR[2]
sp=kernel_D.arg1:DDR[1]
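To see which of these connections cross an SLR boundary, and therefore rely on the automatic pipelining, the placement and bank assignments can be compared. The sketch below rests on two assumptions that the text does not confirm: that the migration example targets the platform in the example table (bank0 in SLR0, bank1 and bank2 in SLR1, bank3 in SLR2), and that DDR[n] corresponds to bank n. The function name `slr_crossings` is hypothetical.

```python
# Assumed bank locations, taken from the example platform table.
BANK_SLR = {0: "SLR0", 1: "SLR1", 2: "SLR1", 3: "SLR2"}

# Kernel placement and DDR bank assignments from the example configuration.
KERNEL_SLR = {
    "kernel_A": "SLR0",
    "kernel_B": "SLR0",
    "kernel_C": "SLR2",
    "kernel_D": "SLR2",
}
KERNEL_BANK = {"kernel_A": 0, "kernel_B": 1, "kernel_C": 2, "kernel_D": 1}


def slr_crossings(kernel_slr: dict, kernel_bank: dict, bank_slr: dict) -> dict:
    """Map each kernel whose bank sits in another SLR to (kernel SLR, bank SLR)."""
    return {
        kernel: (kernel_slr[kernel], bank_slr[bank])
        for kernel, bank in kernel_bank.items()
        if kernel_slr[kernel] != bank_slr[bank]
    }


for kernel, (src, dst) in slr_crossings(KERNEL_SLR, KERNEL_BANK, BANK_SLR).items():
    print(f"{kernel}: placed in {src}, memory bank in {dst}")
```

Under these assumptions only kernel_A connects to a bank in its own SLR. The other three connections cross into SLR1.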
When using the connectivity.sp option to assign kernel ports to memory banks, you must map all interfaces/ports of the kernel. Refer to Mapping Kernel Ports to Global Memory for more information.
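Because every port must be mapped, it can help to check a configuration for missing sp entries before running v++. The following is a minimal sketch: the function name `unmapped_ports` is hypothetical, and the port lists (including `arg2`) are invented for illustration. Take the real port names from your kernel's interface.

```python
def unmapped_ports(kernel_ports: dict, config_lines: list) -> list:
    """Return '<cu>.<port>' names that have no sp= line in the config."""
    mapped = set()
    for line in config_lines:
        key, _, value = line.partition("=")
        if key.strip() != "sp":
            continue  # ignore nk=, slr=, section headers, and so on
        target, _, _bank = value.partition(":")
        mapped.add(target.strip())
    return sorted(
        f"{cu}.{port}"
        for cu, ports in kernel_ports.items()
        for port in ports
        if f"{cu}.{port}" not in mapped
    )


# Hypothetical port lists; arg2 is not part of the example above.
kernel_ports = {"kernel_A": ["arg1", "arg2"], "kernel_B": ["arg1"]}
config_lines = [
    "[connectivity]",
    "slr=kernel_A:SLR0",
    "sp=kernel_A.arg1:DDR[0]",
    "sp=kernel_B.arg1:DDR[1]",
]
print(unmapped_ports(kernel_ports, config_lines))  # ['kernel_A.arg2']
```

An empty list means every listed port has an sp assignment.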