Accumulate Squared
The xFaccumulateSquare function adds the square of an image (src1) to the
accumulator image (src2) and generates the accumulated result (dst).
The accumulated result is returned in a separate argument (dst) rather than being written back to src2. Because the function is implemented with streams, a bi-directional (in-place) accumulator is not possible.
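The per-pixel arithmetic described above can be sketched as a plain C++ software model. This is only an illustration, not the library's hardware implementation: the function name `accumulateSquareRef` and the use of `std::vector` buffers are assumptions made for this example. The pixel widths follow the parameter table (8-bit inputs, 16-bit output).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference model (not part of xfOpenCV).
// Computes dst[i] = src2[i] + src1[i] * src1[i] for every pixel.
std::vector<std::uint16_t> accumulateSquareRef(
    const std::vector<std::uint8_t>& src1,
    const std::vector<std::uint8_t>& src2)
{
    // Both inputs must describe images of the same size.
    assert(src1.size() == src2.size());

    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        // Widen before multiplying. With 8-bit inputs the worst case is
        // 255 * 255 + 255 = 65280, which still fits in 16 bits.
        const std::uint32_t sq = static_cast<std::uint32_t>(src1[i]) * src1[i];
        dst[i] = static_cast<std::uint16_t>(sq + src2[i]);
    }
    return dst;
}
```

A model like this could serve as a golden reference when checking the dst image in a testbench, since src1 and src2 are left unmodified and the result is produced in a separate buffer.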
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void xFaccumulateSquare (
xF::Mat<SRC_T, ROWS, COLS, NPC> src1,
xF::Mat<SRC_T, ROWS, COLS, NPC> src2,
xF::Mat<DST_T, ROWS, COLS, NPC> dst)
Parameter Descriptions
The following table describes the template and the function parameters.
| Parameter | Description |
|---|---|
| SRC_T | Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1) |
| DST_T | Output pixel type. Only 16-bit, unsigned, 1 channel is supported (XF_16UC1) |
| ROWS | Maximum height of input and output image (must be a multiple of 8) |
| COLS | Maximum width of input and output image (must be a multiple of 8) |
| NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations respectively. |
| src1 | Input image (the image to be squared) |
| src2 | Input accumulator image |
| dst | Output image (the accumulated result) |
Resource Utilization
The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2017.1 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image. All columns after the operating frequency are utilization estimates.
| Operating Mode | Operating Frequency (MHz) | BRAM_18K | DSP_48E | FF | LUT | CLB |
|---|---|---|---|---|---|---|
| 1 pixel | 300 | 0 | 1 | 71 | 52 | 14 |
| 8 pixel | 150 | 0 | 8 | 401 | 247 | 48 |
Performance Estimate
The following table summarizes the performance in different configurations, generated using the Vivado HLS 2017.1 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.
| Operating Mode | Latency Estimate: Max Latency (ms) |
|---|---|
| 1 pixel operation (300 MHz) | 6.9 |
| 8 pixel operation (150 MHz) | 1.6 |
Deviation from OpenCV
In OpenCV, the accumulated squared image is stored in the second input image, so src2 acts as both input and output.
In the xfOpenCV implementation, by contrast, the accumulated squared image is stored separately in dst, and src2 is only read.
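The difference between the two calling conventions can be illustrated with a small plain C++ model. Both functions below are hypothetical stand-ins written for this comparison. They are neither the OpenCV nor the xfOpenCV API, and they use `float` buffers throughout purely to keep the comparison simple.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the OpenCV convention: the accumulator is both
// read and written, so it is updated in place.
void accumulateSquareInPlace(const std::vector<float>& src,
                             std::vector<float>& acc)
{
    assert(src.size() == acc.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        acc[i] += src[i] * src[i];
    }
}

// Hypothetical model of the xfOpenCV convention: the previous accumulator
// (src2) is read-only and the result is written to a separate buffer (dst).
void accumulateSquareSeparate(const std::vector<float>& src1,
                              const std::vector<float>& src2,
                              std::vector<float>& dst)
{
    assert(src1.size() == src2.size());
    dst.resize(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        dst[i] = src2[i] + src1[i] * src1[i];
    }
}
```

Under this model the two forms produce the same values. With the separate-output form, a caller that accumulates over several frames would need to feed the dst of one call back in as the src2 of the next.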