xfOpenCV Library API Reference

To facilitate local memory allocation on FPGA devices, the xfOpenCV library functions are provided in templates with compile-time parameters. Data is explicitly copied from cv::Mat to xf::Mat and is stored in physically contiguous memory to achieve the best possible performance. After processing, the output in xf::Mat is copied back to cv::Mat to write it into the memory.

xf::Mat Image Container Class

xf::Mat is a template class that serves as a container for storing image data and its attributes.

Note: The xf::Mat image container class is similar to the cv::Mat class of the OpenCV library.

Class Definition

template<int T, int ROWS, int COLS, int NPC>
class Mat {

  public:
    unsigned char allocatedFlag;            // flag to mark memory allocation in this class
    int rows, cols, size;                   // actual image size

#ifdef __SDSVHLS__
    typedef XF_TNAME(T,NPC) DATATYPE;
#else                                       // When not being built for V-HLS
    typedef struct {
        XF_CTUNAME(T,NPC) chnl[XF_NPIXPERCYCLE(NPC)][XF_CHANNELS(T,NPC)];
    } __attribute__ ((packed)) DATATYPE;
#endif

//#if (defined  (__SDSCC__) ) || (defined (__SYNTHESIS__))
#if defined (__SYNTHESIS__) && !defined (__SDA_MEM_MAP__)
    DATATYPE *data __attribute((xcl_array_geometry((ROWS)*(COLS>> (XF_BITSHIFT(NPC))))));//data[ ROWS * ( COLS >> ( XF_BITSHIFT ( NPC ) ) ) ];
#else
    DATATYPE *data;
#endif


    Mat();                                  // default constructor
    Mat(Size _sz);
    Mat(int _rows, int _cols);
    Mat(int _size, int _rows, int _cols);
    Mat(int _rows, int _cols, void *_data);
    Mat(const Mat&);                        // copy constructor

    ~Mat();

    Mat& operator= (const Mat&);            // Assignment operator
//  XF_TNAME(T, XF_NPPC1) operator() (unsigned int r, unsigned int c);
//  XF_CTUNAME(T, NPC) operator() (unsigned int r, unsigned int c, unsigned int ch);
    XF_TNAME(T,NPC) read(int index);
    float read_float(int index);
    void write(int index, XF_TNAME(T,NPC) val);
    void write_float(int index, float val);

    void init (int _rows, int _cols, bool allocate=true);
    void copyTo (void* fromData);
    unsigned char* copyFrom ();

    const int type() const;
    const int depth() const;
    const int channels() const;

    template<int DST_T>
    void convertTo (Mat<DST_T, ROWS, COLS, NPC> &dst, int otype, double alpha=1, double beta=0);
};

Parameter Descriptions

The following table lists the xf::Mat class parameters and their descriptions:

Table 1. xf::Mat Class Parameter Descriptions
Parameter Description
rows The number of rows in the image or height of the image.
cols The number of columns in the image or width of the image.
size The number of words stored in the data member. The value is calculated using rows*cols/(number of pixels packed per word).
allocatedFlag Flag for memory allocation status.
*data The pointer to the words that store the pixels of the image.

Member Functions Description

The following table lists the member functions and their descriptions:

Table 2. xf::Mat Member Function Descriptions
Member Functions Description
Mat() This default constructor initializes the Mat object sizes, using the template parameters ROWS and COLS.
Mat(int _rows, int _cols) This constructor initializes the Mat object using arguments _rows and _cols.
Mat(const xf::Mat &_src) This constructor clones one Mat object into another. New memory is allocated for the newly created object.
Mat(int _rows, int _cols, void *_data) This constructor initializes the Mat object using arguments _rows, _cols, and _data. The *data member of the Mat object points to the memory allocated for _data argument, when this constructor is used. No new memory is allocated for the *data member.
convertTo(Mat<DST_T,ROWS, COLS, NPC> &dst, int otype, double alpha=1, double beta=0) Refer to xf::convertTo
copyTo(void *fromData) Copies the data from the fromData pointer into the physically contiguous memory allocated inside the constructor.
copyFrom() Returns the pointer to the first location of the *data member.
read(int index) Reads a value from the given location and returns it as a packed (for multi-pixel/clock) value.
read_float(int index) Reads a value from the given location and returns it as a float value.
write(int index, XF_TNAME(T,NPC) val) Writes a packed (for multi-pixel/clock) value into the given location.
write_float(int index, float val) Writes a float value into the given location.
type() Returns the type of the image.
depth() Returns the depth of the image
channels() Returns number of channels of the image
~Mat() This is a default destructor of the Mat object.

Template Parameter Descriptions

Template parameters of the xf::Mat class are used to set the depth of the pixel, number of channels in the image, number of pixels packed per word, maximum number of rows and columns of the image. The following table lists the template parameters and their descriptions:

Table 3. xf::Mat Template Parameter Descriptions
Parameters Description
TYPE Type of the pixel data. For example, XF_8UC1 stands for 8-bit unsigned and one channel pixel. More types can be found in include/common/xf_params.h.
HEIGHT Maximum height of an image.
WIDTH Maximum width of an image.
NPC The number of pixels to be packed per word. For instance, XF_NPPC1 for 1 pixel per word; and XF_NPPC8 for 8 pixels per word.

Pixel-Level Parallelism

The amount of parallelism to be implemented in a function from xfOpenCV is kept as a configurable parameter. In most functions, there are two options for processing data.

  • Single-pixel processing
  • Processing eight pixels in parallel

The following table describes the options available for specifying the level of parallelism required in a particular function:

Table 4. Options Available for Specifying the Level of Parallelism
Option Description
XF_NPPC1 Process 1 pixel per clock cycle
XF_NPPC2 Process 2 pixels per clock cycle
XF_NPPC4 Process 4 pixels per clock cycle
XF_NPPC8 Process 8 pixels per clock cycle

Macros to Work With Parallelism

There are two macros that are defined to work with parallelism.

  • The XF_NPIXPERCYCLE(flags) macro resolves to the number of pixels processed per cycle.
    • XF_NPIXPERCYCLE(XF_NPPC1) resolves to 1
    • XF_NPIXPERCYCLE(XF_NPPC2) resolves to 2
    • XF_NPIXPERCYCLE(XF_NPPC4) resolves to 4
    • XF_NPIXPERCYCLE(XF_NPPC8) resolves to 8
  • The XF_BITSHIFT(flags) macro resolves to the number of times to shift the image size to right to arrive at the final data transfer size for parallel processing.
    • XF_BITSHIFT(XF_NPPC1) resolves to 0
    • XF_BITSHIFT(XF_NPPC2) resolves to 1
    • XF_BITSHIFT(XF_NPPC4) resolves to 2
    • XF_BITSHIFT(XF_NPPC8) resolves to 3
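
The relationship between the two macros and the number of packed words can be sketched in plain C++ (C++14 or later). The helpers below are hypothetical stand-ins, not the library macros, and they assume the parallelism option is expressed directly as a pixel count (1, 2, 4, or 8).

```cpp
// Hypothetical stand-in for XF_NPIXPERCYCLE: the option is assumed to be
// the pixel count itself (1, 2, 4, or 8).
constexpr int npix_per_cycle(int npc) { return npc; }

// Hypothetical stand-in for XF_BITSHIFT: log2 of the pixel count,
// that is 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3.
constexpr int bit_shift(int npc) {
    int shift = 0;
    while ((1 << shift) < npc) ++shift;
    return shift;
}

// Number of packed words for a rows x cols image: rows * (cols >> shift),
// the same expression the class definition uses to size the data member.
constexpr int packed_words(int rows, int cols, int npc) {
    return rows * (cols >> bit_shift(npc));
}

static_assert(bit_shift(8) == 3, "8 pixels per cycle shifts by 3");
static_assert(packed_words(1080, 1920, 8) == 1080 * 240, "HD image, 8 pixels per word");
```

With these helpers, an HD image needs 2073600 words at one pixel per word and 259200 words at eight pixels per word.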

Pixel Types

Parameter types will differ, depending on the combination of the depth of pixels and the number of channels in the image. The generic nomenclature of the parameter is listed below.
XF_<Number of bits per pixel><signed (S) or unsigned (U) or float (F)>C<number of channels>

For example, for an 8-bit, unsigned, 1-channel pixel, the data type is XF_8UC1.

The following table lists the available data types for the xf::Mat class:

Table 5. xf::Mat Class - Available Data Types
Option Number of bits per Pixel Unsigned/ Signed/ Float Type Number of Channels
XF_8UC1 8 Unsigned 1
XF_16UC1 16 Unsigned 1
XF_16SC1 16 Signed 1
XF_32UC1 32 Unsigned 1
XF_32FC1 32 Float 1
XF_32SC1 32 Signed 1
XF_8UC2 8 Unsigned 2
XF_8UC4 8 Unsigned 4
XF_8UC3 8 Unsigned 3
XF_2UC1 2 Unsigned 1

Manipulating Data Type

Based on the number of pixels to process per clock cycle and the type parameter, there are different possible data types. The xfOpenCV library uses these datatypes for internal processing and inside the xf::Mat class. The following are a few supported types:

  • XF_TNAME(TYPE,NPPC) resolves to the data type of the data member of the xf::Mat object. For instance, XF_TNAME(XF_8UC1,XF_NPPC8) resolves to ap_uint<64>.
  • Word width = pixel depth * number of channels * number of pixels to process per cycle (NPPC).
  • XF_DTUNAME(TYPE,NPPC) resolves to the data type of the pixel. For instance, XF_DTUNAME(XF_32FC1,XF_NPPC1) resolves to float.
  • XF_PTSNAME(TYPE,NPPC) resolves to the ‘C’ data type of the pixel. For instance, XF_PTSNAME (XF_16UC1,XF_NPPC2) resolves to unsigned short.
Note: ap_uint<>, ap_int<>, ap_fixed<>, and ap_ufixed<> types belong to the high-level synthesis (HLS) library. For more information, see the Vivado Design Suite User Guide: High-Level Synthesis (UG902).
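
The word-width rule above can be checked with a small compile-time sketch. The function name is hypothetical and is not part of the library; it only restates the arithmetic.

```cpp
// Word width in bits = pixel depth * number of channels * pixels per cycle.
constexpr int word_width_bits(int depth_bits, int channels, int nppc) {
    return depth_bits * channels * nppc;
}

// XF_8UC1 with XF_NPPC8: 8 * 1 * 8 = 64 bits, matching ap_uint<64>.
static_assert(word_width_bits(8, 1, 8) == 64, "XF_8UC1, 8 pixels per cycle");
// XF_16UC1 with XF_NPPC2: 16 * 1 * 2 = 32 bits.
static_assert(word_width_bits(16, 1, 2) == 32, "XF_16UC1, 2 pixels per cycle");
```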

Sample Illustration

The following code illustrates the configurations that are required to build the Gaussian filter on an image, using the SDSoC™ tool for the Zynq® UltraScale™ platform.
Note: In a real-time application where the video is streamed in, it is recommended to place the frame buffer in an xf::Mat and process it using the library function. The resulting location pointer is then passed to the display IPs.

xf_config_params.h

#define FILTER_SIZE_3 1
#define FILTER_SIZE_5 0
#define FILTER_SIZE_7 0
#define RO 0
#define NO 1

#if NO
#define NPC1 XF_NPPC1
#endif
#if RO
#define NPC1 XF_NPPC8
#endif

xf_gaussian_filter_tb.cpp

int main(int argc, char **argv)
{
    cv::Mat in_img, out_img, ocv_ref;
    cv::Mat in_gray, in_gray1, diff;
    in_img = cv::imread(argv[1], 1);    // reading in the color image
    extractChannel(in_img, in_gray, 1); // use one channel as the 8-bit input

    // Allocate the input and output containers at the actual image size
    xf::Mat<XF_8UC1, HEIGHT, WIDTH, NPC1> imgInput(in_img.rows, in_img.cols);
    xf::Mat<XF_8UC1, HEIGHT, WIDTH, NPC1> imgOutput(in_img.rows, in_img.cols);

    // Copy the cv::Mat pixels into the physically contiguous xf::Mat buffer
    imgInput.copyTo(in_gray.data);

    gaussian_filter_accel(imgInput, imgOutput, sigma);

    // Write output image
    xf::imwrite("hls_out.jpg", imgOutput);
}

xf_gaussian_filter_accel.cpp

#include "xf_gaussian_filter_config.h"

void gaussian_filter_accel(xf::Mat<XF_8UC1,HEIGHT,WIDTH,NPC1> &imgInput,xf::Mat<XF_8UC1,HEIGHT,WIDTH,NPC1>&imgOutput,float sigma)
{
	xf::GaussianBlur<FILTER_WIDTH, XF_BORDER_CONSTANT, XF_8UC1, HEIGHT, WIDTH, NPC1>(imgInput, imgOutput, sigma);
}

xf_gaussian_filter.hpp

#pragma SDS data data_mover("_src.data":AXIDMA_SIMPLE)
#pragma SDS data data_mover("_dst.data":AXIDMA_SIMPLE)
#pragma SDS data access_pattern("_src.data":SEQUENTIAL)
#pragma SDS data copy("_src.data"[0:"_src.size"])
#pragma SDS data access_pattern("_dst.data":SEQUENTIAL)
#pragma SDS data copy("_dst.data"[0:"_dst.size"])

template<int FILTER_SIZE, int BORDER_TYPE, int SRC_T, int ROWS, int COLS, int NPC = 1>
void GaussianBlur(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst, float sigma)
{
    // function body
}

The design fetches data from external memory (with the help of SDSoC data movers) and transfers it to the function in 8-bit or 64-bit packets, based on the configured mode. Assuming 8 bits per pixel, 8 pixels can be packed into 64 bits. Therefore, 8 pixels are available to be processed in parallel.
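
As a rough illustration of the packing described above, the following sketch packs eight 8-bit pixels into one 64-bit word and unpacks them again. The function names are hypothetical, and placing pixel 0 in the least significant byte is an assumption of this sketch rather than a documented property of the library.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Pack eight 8-bit pixels into one 64-bit word.
// Assumption: pixel 0 occupies the least significant byte.
inline std::uint64_t pack8(const std::array<std::uint8_t, 8>& px) {
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < px.size(); ++i) {
        word |= static_cast<std::uint64_t>(px[i]) << (8 * i);
    }
    return word;
}

// Recover the eight pixels from a packed 64-bit word (inverse of pack8).
inline std::array<std::uint8_t, 8> unpack8(std::uint64_t word) {
    std::array<std::uint8_t, 8> px{};
    for (std::size_t i = 0; i < px.size(); ++i) {
        px[i] = static_cast<std::uint8_t>((word >> (8 * i)) & 0xFF);
    }
    return px;
}
```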

Enable the FILTER_SIZE_3 and the NO macros in the xf_config_params.h file. The FILTER_SIZE_3 macro sets the filter size to 3x3, and #define NO 1 enables 1-pixel parallelism.

Specify the SDSoC tool-specific pragmas in the xf_gaussian_filter.hpp file.

#pragma SDS data data_mover("_src.data":AXIDMA_SIMPLE)
#pragma SDS data data_mover("_dst.data":AXIDMA_SIMPLE)
#pragma SDS data access_pattern("_src.data":SEQUENTIAL)
#pragma SDS data copy("_src.data"[0:"_src.size"])
#pragma SDS data access_pattern("_dst.data":SEQUENTIAL)
#pragma SDS data copy("_dst.data"[0:"_dst.size"])
Note: For more information on the pragmas used for hardware accelerator functions in SDSoC, see SDSoC Environment User Guide.

xf::imread

The function xf::imread loads an image from the specified file path, copies it into an xf::Mat, and returns it. If the image cannot be read (because of a missing file, improper permissions, or an unsupported or invalid format), the function exits with a non-zero return code and an error statement.
Note: In an HLS standalone mode like Cosim, use cv::imread followed by copyTo function, instead of xf::imread.

API Syntax

template<int PTYPE, int ROWS, int COLS, int NPC>
xf::Mat<PTYPE, ROWS, COLS, NPC> imread (char *filename, int type)

Parameter Descriptions

The table below describes the template and the function parameters.

Table 6. xf::imread Function Parameter Descriptions
Parameter Description
PTYPE Input pixel type. Value should be in accordance with the ‘type’ argument’s value.
ROWS Maximum height of the image to be read
COLS Maximum width of the image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
filename Name of the file to be loaded
type Flag that depicts the type of image. The values are:
  • '0' for gray scale
  • '1' for color image

xf::imwrite

The function xf::imwrite saves the image to the specified file from the given xf::Mat. The image format is chosen based on the file name extension. This function internally uses cv::imwrite for the processing. Therefore, all the limitations of cv::imwrite are also applicable to xf::imwrite.

API Syntax

template <int PTYPE, int ROWS, int COLS, int NPC>
void imwrite(const char *img_name, xf::Mat<PTYPE, ROWS, COLS, NPC> &img)

Parameter Descriptions

The table below describes the template and the function parameters.

Table 7. xf::imwrite Function Parameter Descriptions
Parameter Description
PTYPE Input pixel type. Supported types are: XF_8UC1, XF_16UC1, XF_8UC4, and XF_16UC4
ROWS Maximum height of the image to be read
COLS Maximum width of the image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
img_name Name of the file with the extension
img xf::Mat array to be saved

xf::absDiff

The function xf::absDiff computes the absolute difference between the corresponding pixels of an xf::Mat and a cv::Mat, and returns the difference values in a cv::Mat.

API Syntax

template <int PTYPE, int ROWS, int COLS, int NPC>
void absDiff(cv::Mat &cv_img, xf::Mat<PTYPE, ROWS, COLS, NPC>& xf_img, cv::Mat &diff_img )

Parameter Descriptions

The table below describes the template and the function parameters.

Table 8. xf::absDiff Function Parameter Descriptions
Parameter Description
PTYPE Input pixel type
ROWS Maximum height of the image to be read
COLS Maximum width of the image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively.

cv_img cv::Mat array to be compared
xf_img xf::Mat array to be compared
diff_img Output difference image(cv::Mat)

xf::convertTo

The xf::convertTo function performs bit depth conversion on each individual pixel of the given input image. This method converts the source pixel values to the target data type with appropriate casting.

dst(x,y)= cast<target-data-type>(α(src(x,y)+β))
Note: The output and input Mat cannot be the same. That is, the converted image cannot be stored in the Mat of the input image.

API Syntax

template<int DST_T> void convertTo(xf::Mat<DST_T,ROWS, COLS, NPC> &dst, int ctype, double alpha=1, double beta=0)

Parameter Descriptions

The table below describes the template and function parameters.

Table 9. xf::convertTo Function Parameter Descriptions
Parameter Description
DST_T Output pixel type. Possible values are XF_8UC1, XF_16UC1, XF_16SC1, and XF_32SC1.
ROWS Maximum height of image to be read
COLS Maximum width of image to be read
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1, XF_NPPC4, and XF_NPPC8 for 1-pixel, 4-pixel, and 8-pixel parallel operations respectively. XF_32SC1 and XF_NPPC8 combination is not supported.
dst Converted xf Mat
ctype Conversion type. Possible values are:
  • Down-convert:
    • XF_CONVERT_16U_TO_8U
    • XF_CONVERT_16S_TO_8U
    • XF_CONVERT_32S_TO_8U
    • XF_CONVERT_32S_TO_16U
    • XF_CONVERT_32S_TO_16S
  • Up-convert:
    • XF_CONVERT_8U_TO_16U
    • XF_CONVERT_8U_TO_16S
    • XF_CONVERT_8U_TO_32S
    • XF_CONVERT_16U_TO_32S
    • XF_CONVERT_16S_TO_32S

alpha Optional scale factor
beta Optional delta added to the scaled values

xfOpenCV Library Functions

The xfOpenCV library is a set of select OpenCV functions optimized for Zynq-7000 and Zynq UltraScale+ MPSoC devices. The following table lists the xfOpenCV library functions.

Note: Resolution Conversion (Resize) in 8 pixel per cycle mode, Dense Pyramidal LK Optical Flow, and Dense Non-Pyramidal LK Optical Flow functions are not supported on the Zynq-7000 SoC ZC702 devices, due to the higher resource utilization.
Note: The number of pixels per clock depends on the maximum bus width a device can support.

For example, the Zynq-7000 SoC has a 64-bit interface, so for the pixel type 16UC1 a maximum of four pixels per clock (XF_NPPC4) is possible.
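
The bus-width constraint can be expressed as a small calculation. The helper below is a hypothetical sketch (C++14 or later), not a library function; it assumes the only options are 1, 2, 4, and 8 pixels per clock and that a packed word must fit within the bus width.

```cpp
// Largest pixels-per-clock option (1, 2, 4, or 8) whose packed word still
// fits in the bus. Returns 0 if even a single pixel does not fit.
constexpr int max_pixels_per_clock(int bus_bits, int depth_bits, int channels) {
    int best = 0;
    for (int nppc = 1; nppc <= 8; nppc *= 2) {
        if (depth_bits * channels * nppc <= bus_bits) best = nppc;
    }
    return best;
}

// 16UC1 on a 64-bit interface: 16 * 4 = 64 bits, so four pixels per clock.
static_assert(max_pixels_per_clock(64, 16, 1) == 4, "16UC1 on a 64-bit bus");
```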

Absolute Difference

The absdiff function finds the pixel-wise absolute difference between two input images and returns an output image. The input and the output images must be the XF_8UC1 type.

Iout(x, y) = |Iin1(x, y) - Iin2(x, y)|

Where,
  • Iout(x, y) is the intensity of the output image at (x,y) position.
  • Iin1(x, y) is the intensity of the first input image at (x,y) position.
  • Iin2(x, y) is the intensity of the second input image at (x,y) position.

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1>
void absdiff(
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 11. absdiff Function Parameter Descriptions
Parameter Description
SRC_T Input and Output pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of 8, for 8-pixel operation.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.

src1 Input image
src2 Input image
dst Output image
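
For checking hardware results in a testbench, a plain software model of the pixel-wise absolute difference can be useful. The sketch below is such a model for 8-bit data held in std::vector buffers; the function name and the buffer type are assumptions for illustration and do not reflect the library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software reference: out[i] = |in1[i] - in2[i]| for 8-bit pixels.
std::vector<std::uint8_t> absdiff_ref(const std::vector<std::uint8_t>& in1,
                                      const std::vector<std::uint8_t>& in2) {
    assert(in1.size() == in2.size()); // both images must have the same size
    std::vector<std::uint8_t> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        // Subtract the smaller value from the larger to avoid wrap-around.
        const int diff = in1[i] > in2[i] ? in1[i] - in2[i] : in2[i] - in1[i];
        out[i] = static_cast<std::uint8_t>(diff);
    }
    return out;
}
```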

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 12. absdiff Function Resource Utilization Summary
Operating Mode  Operating Frequency (MHz)  Utilization Estimate
                                           BRAM_18K  DSP_48Es  FF  LUT  CLB
1 pixel         300                        0         0         62  67   17
8 pixel         150                        0         0         67  234  39

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 13. absdiff Function Performance Estimate Summary
Operating Mode               Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.69
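
The order of magnitude of these latencies can be reproduced with a first-order estimate. The sketch below assumes one packed word is consumed per clock cycle and ignores pipeline fill and memory stalls, so it is an approximation and not the method used by the tool; the function name is hypothetical.

```cpp
// First-order latency estimate in milliseconds.
// cycles = rows * cols / pixels_per_clock; time = cycles / frequency.
constexpr double latency_ms(int rows, int cols, int pixels_per_clock, double freq_mhz) {
    return (static_cast<double>(rows) * cols / pixels_per_clock) / (freq_mhz * 1e3);
}
```

For an HD (1080x1920) image this gives roughly 6.9 ms at 300 MHz with one pixel per clock, and roughly 1.7 ms at 150 MHz with eight pixels per clock.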

Deviation from OpenCV

There is no deviation from OpenCV, except that the absdiff function supports only 8-bit pixels.

Accumulate

The accumulate function adds an image (src1) to the accumulator image (src2), and generates the accumulated result image (dst).

API Syntax

template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1> 
void accumulate (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<DST_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 14. accumulate Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
DST_T Output pixel type. Only 16-bit, unsigned, 1 and 3 channels are supported (XF_16UC1 and XF_16UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Recommend using a multiple of 8, for an 8-pixel operation.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
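
A software model makes the role of the wider output type visible: the sum of two 8-bit pixels can reach 510, which needs more than 8 bits. The following sketch is a hypothetical reference model on std::vector buffers, not the library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference model: dst = src1 + src2, widened to 16 bits so that the sum of
// two 8-bit pixels (at most 510) cannot overflow.
std::vector<std::uint16_t> accumulate_ref(const std::vector<std::uint8_t>& src1,
                                          const std::vector<std::uint8_t>& src2) {
    assert(src1.size() == src2.size()); // both inputs must have the same size
    std::vector<std::uint16_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        dst[i] = static_cast<std::uint16_t>(src1[i] + src2[i]);
    }
    return dst;
}
```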

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 15. accumulate Function Resource Utilization Summary
Operating Mode  Operating Frequency (MHz)  Utilization Estimate
                                           BRAM_18K  DSP_48E  FF   LUT  CLB
1 pixel         300                        0         0        62   55   12
8 pixel         150                        0         0        389  285  61

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process 4K 3 Channel image.

Table 16. accumulate Function Resource Utilization Summary
Operating Mode  Operating Frequency (MHz)  Utilization Estimate
                                           BRAM_18K  DSP_48E  FF   LUT  CLB
1 pixel         300                        0         1        207  72   32

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 17. accumulate Function Performance Estimate Summary
Operating Mode               Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.7

Deviation from OpenCV

In OpenCV the accumulated image is stored in the second input image. The src2 image acts as both input and output, as shown below:

src2(x,y) = src1(x,y) + src2(x,y)

Whereas, in the xfOpenCV implementation, the accumulated image is stored separately, as shown below:

dst(x,y) = src1(x,y) + src2(x,y)

Accumulate Squared

The accumulateSquare function adds the square of an image (src1) to the accumulator image (src2) and generates the accumulated result (dst).

dst(x,y) = src1(x,y)^2 + src2(x,y)

The accumulated result is a separate argument in the function, instead of having src2 as the accumulated result. In this implementation, having a bi-directional accumulator is not possible as the function makes use of streams.

API Syntax

template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1> 
void accumulateSquare (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<DST_T, ROWS, COLS, NPC> dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 18. accumulateSquare Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
DST_T Output pixel type. Only 16-bit, unsigned, 1 and 3 channels are supported (XF_16UC1 and XF_16UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
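
The choice of a 16-bit output for 8-bit inputs can be verified with a per-pixel sketch. The function below is a hypothetical model of the arithmetic only (C++14 or later), not the library code.

```cpp
#include <cstdint>

// Per-pixel model: square of src1 plus src2, stored in 16 bits.
constexpr std::uint16_t accumulate_square_pixel(std::uint8_t src1, std::uint8_t src2) {
    return static_cast<std::uint16_t>(src1 * src1 + src2);
}

// Worst case: 255 * 255 + 255 = 65280, which still fits in 16 bits (max 65535).
static_assert(accumulate_square_pixel(255, 255) == 65280, "largest result fits in 16 bits");
```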

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 19. accumulateSquare Function Resource Utilization Summary
Operating Mode  Operating Frequency (MHz)  Utilization Estimate
                                           BRAM_18K  DSP_48E  FF   LUT  CLB
1 pixel         300                        0         1        71   52   14
8 pixel         150                        0         8        401  247  48

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process 4K 3 Channel image.

Table 20. accumulateSquare Function Resource Utilization Summary
Operating Mode  Operating Frequency (MHz)  Utilization Estimate
                                           BRAM_18K  DSP_48E  FF   LUT  CLB
1 pixel         300                        0         3        227  86   37

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 21. accumulateSquare Function Performance Estimate Summary
Operating Mode               Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.6

Deviation from OpenCV

In OpenCV the accumulated squared image is stored in the second input image. The src2 image acts as input as well as output:

src2(x,y) = src1(x,y)^2 + src2(x,y)

Whereas, in the xfOpenCV implementation, the accumulated squared image is stored separately:

dst(x,y) = src1(x,y)^2 + src2(x,y)

Accumulate Weighted

The accumulateWeighted function computes the weighted sum of the input image (src1) and the accumulator image (src2) and generates the result in dst.

dst(x,y) = alpha*src1(x,y) + (1 - alpha)*src2(x,y)

The accumulated result is a separate argument in the function, instead of having src2 as the accumulated result. In this implementation, having a bi-directional accumulator is not possible, as the function uses streams.

API Syntax

template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1> 
void accumulateWeighted (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1,
xf::Mat<SRC_T, ROWS, COLS, NPC> src2,
xf::Mat<DST_T, ROWS, COLS, NPC> dst,
float alpha )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 22. accumulateWeighted Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
DST_T Output pixel type. Only 16-bit, unsigned, 1 and 3 channels are supported (XF_16UC1 and XF_16UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Recommend multiples of 8, for an 8-pixel operation.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image
alpha Weight applied to input image
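
The weighted sum can be modeled per pixel as a blend in which alpha weights the input image and (1 - alpha) weights the accumulator. The sketch below is a hypothetical model; truncating the result toward zero is an assumption, since the rounding behavior of the hardware function is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Per-pixel model: alpha * src1 + (1 - alpha) * src2, stored in 16 bits.
inline std::uint16_t accumulate_weighted_pixel(std::uint8_t src1, std::uint8_t src2,
                                               float alpha) {
    assert(alpha >= 0.0f && alpha <= 1.0f); // weight is expected in [0, 1]
    const float blended = alpha * src1 + (1.0f - alpha) * src2;
    return static_cast<std::uint16_t>(blended); // truncation is an assumption
}
```

With alpha set to 1 the result equals the input image, and with alpha set to 0 it equals the accumulator image.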

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 23. accumulateWeighted Function Resource Utilization Summary
Operating Mode  Operating Frequency (MHz)  Utilization Estimate
                                           BRAM_18K  DSP_48Es  FF   LUT  CLB
1 pixel         300                        0         5         295  255  52
8 pixel         150                        0         19        556  476  88

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3 Channel image.

Table 24. accumulateWeighted Function Resource Utilization Summary
Operating Mode  Operating Frequency (MHz)  Utilization Estimate
                                           BRAM_18K  DSP_48Es  FF   LUT  CLB
1 pixel         300                        0         9         457  387  95

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 25. accumulateWeighted Function Performance Estimate Summary
Operating Mode               Latency Estimate: Max Latency (ms)
1 pixel operation (300 MHz)  6.9
8 pixel operation (150 MHz)  1.7

Deviation from OpenCV

The resultant image in OpenCV is stored in the second input image. The src2 image acts as input as well as output, as shown below:

src2(x,y) = alpha*src1(x,y) + (1 - alpha)*src2(x,y)

Whereas, in the xfOpenCV implementation, the accumulated weighted image is stored separately:

dst(x,y) = alpha*src1(x,y) + (1 - alpha)*src2(x,y)

AddS

The AddS function performs the addition operation between pixels of input image src and given scalar value scl and stores the result in dst.

dst(x,y)= src(x,y) + scl

Where (x,y) is the spatial coordinate of the pixel.

API Syntax

template<int POLICY_TYPE, int SRC_T, int ROWS, int COLS, int NPC =1>
void addS(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src1, unsigned char _scl[XF_CHANNELS(SRC_T,NPC)],xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 26. AddS Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. 8-bit, unsigned, 1 channel is supported (XF_8UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. In case of N-pixel parallelism, width should be multiple of N.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src1 First input image
_scl Input scalar value, the size should be number of channels.
_dst Output image
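
The POLICY_TYPE template parameter is not described in the table above. The sketch below assumes it selects how 8-bit overflow is handled, and shows the two usual choices, saturation and truncation, as hypothetical per-pixel helpers. These names are illustrative and are not library identifiers.

```cpp
#include <cstdint>

// Saturating add: results above 255 are clamped to 255.
inline std::uint8_t add_scalar_saturate(std::uint8_t src, std::uint8_t scl) {
    const int sum = src + scl;
    return static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
}

// Truncating add: only the low 8 bits are kept, so the result wraps modulo 256.
inline std::uint8_t add_scalar_truncate(std::uint8_t src, std::uint8_t scl) {
    return static_cast<std::uint8_t>(src + scl);
}
```

For example, adding 10 to a pixel value of 250 gives 255 with saturation and 4 with truncation.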

Resource Utilization

The following table summarizes the resource utilization of the AddS function in both the resource optimized (8 pixel) mode and normal mode, as generated using Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 27. AddS Function Resource Utilization Summary
Name      Resource Utilization
          1 pixel per clock operation (300 MHz)  8 pixel per clock operation (150 MHz)
BRAM_18K  0                                      0
DSP48E    0                                      0
FF        100                                    101
LUT       52                                     185
CLB       20                                     45

Performance Estimate

The following table summarizes a performance estimate of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA to process a grayscale HD (1080x1920) image.

Table 28. AddS Function Performance Estimate Summary
Operating Mode  Operating Frequency (MHz)  Latency (ms)
1 pixel         300                        6.9
8 pixel         150                        1.7

Addweighted

The addweighted function calculates a weighted sum of two input images src1, src2 and generates the result in dst.

dst(x,y)= src1(x,y)*alpha+src2(x,y)*beta+ gamma

API Syntax

template< int SRC_T , int DST_T,   int ROWS, int COLS, int NPC=1>
void addWeighted(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src1, float alpha, xf::Mat<SRC_T, ROWS, COLS, NPC> & _src2, float beta, float gamma, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 29. Addweighted Function Parameter Descriptions
Parameter Description
SRC_T Input Pixel Type. 8-bit, unsigned,1 channel is supported (XF_8UC1)
DST_T Output Pixel Type. 8-bit, unsigned,1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. In case of N-pixel parallelism, width should be multiple of N
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src1 First Input image
alpha Weight applied on first image
_src2 Second Input image
beta Weight applied on second image
gamma Scalar added to each sum
_dst Output image
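
A per-pixel model of the weighted sum helps when choosing alpha, beta, and gamma so that the result stays within the 8-bit output range. The sketch below is hypothetical; clamping to [0, 255] and truncating the fractional part are assumptions of this model, not documented behavior of the function.

```cpp
#include <cstdint>

// Per-pixel model: src1 * alpha + src2 * beta + gamma, clamped to [0, 255].
inline std::uint8_t add_weighted_pixel(std::uint8_t src1, float alpha,
                                       std::uint8_t src2, float beta, float gamma) {
    const float sum = src1 * alpha + src2 * beta + gamma;
    if (sum <= 0.0f) return 0;       // clamp negative results (assumption)
    if (sum >= 255.0f) return 255;   // clamp overflow (assumption)
    return static_cast<std::uint8_t>(sum); // drop the fractional part
}
```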

Resource Utilization

The following table summarizes the resource utilization of the Addweighted function in Resource optimized (8 pixel) mode and normal mode, as generated in Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 30. Addweighted Function Resource Utilization Summary
Name      Resource Utilization
          1 pixel per clock operation (300 MHz)  8 pixel per clock operation (150 MHz)
BRAM_18K  0                                      0
DSP48E    11                                     25
FF        903                                    680
LUT       851                                    1077
CLB       187                                    229

Performance Estimate

The following table summarizes a performance estimate of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA to process a grayscale HD (1080x1920) image.

Table 31. Addweighted Function Performance Estimate Summary
Operating Mode  Operating Frequency (MHz)  Latency (ms)
1 pixel         300                        6.9
8 pixel         150                        1.7

Bilateral Filter

In general, any smoothing filter smoothens the image which will affect the edges of the image. To preserve the edges while smoothing, a bilateral filter can be used. In an analogous way as the Gaussian filter, the bilateral filter also considers the neighboring pixels with weights assigned to each of them. These weights have two components, the first of which is the same weighing used by the Gaussian filter. The second component takes into account the difference in the intensity between the neighboring pixels and the evaluated one.
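The bilateral filter applied on an image is:

$$BF[I]_p = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}(\lVert p - q \rVert) \, G_{\sigma_r}(\lvert I_p - I_q \rvert) \, I_q$$

Where

$$W_p = \sum_{q \in S} G_{\sigma_s}(\lVert p - q \rVert) \, G_{\sigma_r}(\lvert I_p - I_q \rvert)$$

and $G_{\sigma}$ is a Gaussian filter with variance $\sigma^2$.
The Gaussian filter is given by:

$$G_{\sigma}(x) = e^{-\frac{x^2}{2\sigma^2}}$$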

API Syntax

template<int FILTER_SIZE, int BORDER_TYPE, int TYPE, int ROWS, int COLS, int NPC=1> 
void bilateralFilter (
xf::Mat<TYPE, ROWS, COLS, NPC> src, 
xf::Mat<TYPE, ROWS, COLS, NPC> dst,
float sigma_space, float sigma_color )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 32. bilateralFilter Function Parameter Descriptions
Parameter Description
FILTER_SIZE Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7) are supported
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
TYPE Input and output pixel type. Only 8-bit, unsigned, 1 channel, and 3 channels are supported (XF_8UC1 and XF_8UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; this function supports XF_NPPC1 and XF_NPPC8.
src Input image
dst Output image
sigma_space Standard deviation of filter in spatial domain
sigma_color Standard deviation of filter used in color space
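
The two weight components described for this filter (a spatial Gaussian controlled by sigma_space and an intensity Gaussian controlled by sigma_color) can be sketched in software for a single output pixel. This is a floating-point illustration only: the function name bilateral_pixel_ref is hypothetical, out-of-image pixels are assumed to be 0 to mirror a constant border, and the hardware kernel is not claimed to compute its weights this way.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference for one output pixel; not the xfOpenCV kernel.
// img is a row-major 8-bit grayscale image of size rows x cols.
// Pixels outside the image are treated as 0 (assumed constant border).
float bilateral_pixel_ref(const std::vector<std::uint8_t>& img, int rows, int cols,
                          int r, int c, int filter_size,
                          float sigma_space, float sigma_color) {
    assert(static_cast<int>(img.size()) == rows * cols);
    assert(filter_size % 2 == 1);
    const int half = filter_size / 2;
    const float center = static_cast<float>(img[r * cols + c]);
    float weighted_sum = 0.0f;
    float weight_total = 0.0f;
    for (int dr = -half; dr <= half; ++dr) {
        for (int dc = -half; dc <= half; ++dc) {
            const int rr = r + dr;
            const int cc = c + dc;
            const bool inside = rr >= 0 && rr < rows && cc >= 0 && cc < cols;
            const float q = inside ? static_cast<float>(img[rr * cols + cc]) : 0.0f;
            // Spatial weight: depends only on the distance to the center pixel.
            const float dist2 = static_cast<float>(dr * dr + dc * dc);
            const float spatial = std::exp(-dist2 / (2.0f * sigma_space * sigma_space));
            // Range weight: depends on the intensity difference to the center pixel.
            const float diff = q - center;
            const float range = std::exp(-(diff * diff) / (2.0f * sigma_color * sigma_color));
            weighted_sum += spatial * range * q;
            weight_total += spatial * range;
        }
    }
    // The center pixel always contributes weight 1, so weight_total is never 0.
    return weighted_sum / weight_total;
}
```

With a small sigma_color, a neighbor that differs strongly from the center pixel receives a weight close to zero, which is what preserves edges.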

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 33. bilateralFilter Resource Utilization Summary
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 3x3 300 6 22 4934 4293
1 pixel 5x5 300 12 30 5481 4943
1 pixel 7x7 300 37 48 7084 6195

The following table summarizes the resource utilization of the kernel in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3-channel image.

Table 34. bilateralFilter Resource Utilization Summary
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 3x3 300 12 32 8342 7442
1 pixel 5x5 300 27 57 10663 8857
1 pixel 7x7 300 49 107 12870 12181

Performance Estimate

The following table summarizes a performance estimate of the kernel in different configurations, as generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 35. bilateralFilter Function Performance Estimate Summary
Operating Mode Filter Size Latency Estimate
300 MHz
Max (ms)
1 pixel 3x3 7.18
5x5 7.20
7x7 7.22

Deviation from OpenCV

Unlike OpenCV, xfOpenCV only supports filter sizes of 3, 5 and 7.

Bit Depth Conversion

The convertTo function converts the input image bit depth to the required bit depth in the output image.

API Syntax

template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void convertTo(xf::Mat<SRC_T, ROWS, COLS, NPC> &_src_mat, xf::Mat<DST_T, ROWS, COLS, NPC> &_dst_mat, ap_uint<4> _convert_type, int _shift)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 36. convertTo Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. 8-bit, unsigned, 1 channel (XF_8UC1); 16-bit, unsigned, 1 channel (XF_16UC1); 16-bit, signed, 1 channel (XF_16SC1); 32-bit, unsigned, 1 channel (XF_32UC1); and 32-bit, signed, 1 channel (XF_32SC1) are supported.
DST_T Output pixel type. 8-bit, unsigned, 1 channel (XF_8UC1); 16-bit, unsigned, 1 channel (XF_16UC1); 16-bit, signed, 1 channel (XF_16SC1); 32-bit, unsigned, 1 channel (XF_32UC1); and 32-bit, signed, 1 channel (XF_32SC1) are supported.

ROWS Height of input and output images
COLS Width of input and output images
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. XF_NPPC8 is not supported with the 32-bit input and output pixel type.
_src_mat Input image
_dst_mat Output image
_convert_type This parameter specifies the type of conversion required. (See XF_convert_bit_depth_e enumerated type in file xf_params.h for possible values.)
_shift Optional scale factor

Possible Conversions

The following table summarizes supported conversions. The rows are possible input image bit depths and the columns are corresponding possible output image bit depths (U=unsigned, S=signed).

Table 37. convertTo Function Supported Conversions
INPUT/OUTPUT U8 U16 S16 U32 S32
U8 NA yes yes NA yes
U16 yes NA NA NA yes
S16 yes NA NA NA yes
U32 NA NA NA NA NA
S32 yes yes yes NA NA
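
Table 37 can be encoded as a compile-time lookup so that host code rejects an unsupported input/output pair before launching the kernel. The sketch below is illustrative: the Depth enumeration and the function name is_supported_conversion are hypothetical and are not the XF_convert_bit_depth_e values from xf_params.h.

```cpp
#include <cassert>

// Bit depths as they appear in Table 37. These names are illustrative only.
enum Depth { U8 = 0, U16, S16, U32, S32, DEPTH_COUNT };

// Rows: input depth. Columns: output depth. true means "yes" in Table 37.
constexpr bool kSupported[DEPTH_COUNT][DEPTH_COUNT] = {
    /* U8  */ {false, true,  true,  false, true },
    /* U16 */ {true,  false, false, false, true },
    /* S16 */ {true,  false, false, false, true },
    /* U32 */ {false, false, false, false, false},
    /* S32 */ {true,  true,  true,  false, false},
};

// Returns true when Table 37 lists the in -> out conversion as supported.
constexpr bool is_supported_conversion(Depth in, Depth out) {
    return kSupported[in][out];
}

static_assert(is_supported_conversion(U8, S16), "U8 -> S16 is listed as supported");
static_assert(!is_supported_conversion(U32, U8), "U32 input has no supported conversion");
```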

Resource Utilization

The following table summarizes the resource utilization of the convertTo function, generated using Vivado HLS 2019.1 tool for the Xilinx® Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.
Table 38. convertTo Function Resource Utilization Summary For XF_CONVERT_8U_TO_16S Conversion
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 8 581 523 119
8 pixel 150 0 8 963 1446 290
Table 39. convertTo Function Resource Utilization Summary For XF_CONVERT_16U_TO_8U Conversion
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 8 591 541 124
8 pixel 150 0 8 915 1500 308

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 40. convertTo Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency
1 pixel operation (300 MHz) 6.91 ms
8 pixel operation (150 MHz) 1.69 ms

Bitwise AND

The bitwise_and function performs the bitwise AND operation for each pixel between two input images, and returns an output image:

$$I_{out}(x, y) = I_{in1}(x, y) \,\&\, I_{in2}(x, y)$$

Where,
  • $I_{out}(x, y)$ is the intensity of output image at (x, y) position
  • $I_{in1}(x, y)$ is the intensity of first input image at (x, y) position
  • $I_{in2}(x, y)$ is the intensity of second input image at (x, y) position
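
As a point of comparison for testbenches, the per-pixel AND can be modelled on plain byte buffers. This is a minimal sketch; bitwise_and_ref is a hypothetical helper, not an xfOpenCV function, and it assumes both inputs are 8-bit buffers of equal length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference: dst[i] = src1[i] & src2[i] for every pixel.
std::vector<std::uint8_t> bitwise_and_ref(const std::vector<std::uint8_t>& src1,
                                          const std::vector<std::uint8_t>& src2) {
    assert(src1.size() == src2.size());
    std::vector<std::uint8_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        dst[i] = static_cast<std::uint8_t>(src1[i] & src2[i]);
    }
    return dst;
}
```

A common use is masking: with a second image that holds 0xFF inside a region and 0x00 elsewhere, the result keeps the first image inside the region and clears it outside.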

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1> 
void bitwise_and (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1, 
xf::Mat<SRC_T, ROWS, COLS, NPC> src2, 
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 41. bitwise_and Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Supports 1 channel and 3 channels (XF_8UC1 and XF_8UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8 pixel mode)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations, respectively.
src1 Input image
src2 Input image
dst Output image

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 42. bitwise_and Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 62 44 10
8 pixel 150 0 0 59 72 13

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3-channel image.

Table 43. bitwise_and Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 155 61 22

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 44. bitwise_and Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

Bitwise NOT

The bitwise_not function performs the pixel-wise bitwise NOT operation for the pixels in the input image, and returns an output image:

$$I_{out}(x, y) = \sim I_{in}(x, y)$$

Where,
  • $I_{out}(x, y)$ is the intensity of output image at (x, y) position
  • $I_{in}(x, y)$ is the intensity of input image at (x, y) position
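
When modelling this operation in C++ on the host, note that the ~ operator promotes an 8-bit operand to int, so the result must be cast back to 8 bits. A minimal sketch follows; bitwise_not_ref is a hypothetical helper and assumes the caller has sized the output buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference: dst[i] = ~src[i], truncated to 8 bits.
void bitwise_not_ref(const std::vector<std::uint8_t>& src, std::vector<std::uint8_t>& dst) {
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        // ~src[i] is computed as an int; the cast keeps only the low 8 bits.
        dst[i] = static_cast<std::uint8_t>(~src[i]);
    }
}
```

For 8-bit data this is equivalent to 255 - src[i], which is the photographic negative of the image.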

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1> 
void bitwise_not (
xf::Mat<SRC_T, ROWS, COLS, NPC> src, 
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 45. bitwise_not Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Supports 1 channel and 3 channels (XF_8UC1 and XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations, respectively.
src Input image
dst Output image

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 46. bitwise_not Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 97 78 20
8 pixel 150 0 0 88 97 21

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3-channel image.

Table 47. bitwise_not Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 155 61 22

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 48. bitwise_not Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

Bitwise OR

The bitwise_or function performs the pixel-wise bitwise OR operation between two input images, and returns an output image:

$$I_{out}(x, y) = I_{in1}(x, y) \mid I_{in2}(x, y)$$

Where,
  • $I_{out}(x, y)$ is the intensity of output image at (x, y) position
  • $I_{in1}(x, y)$ is the intensity of first input image at (x, y) position
  • $I_{in2}(x, y)$ is the intensity of second input image at (x, y) position

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1> 
void bitwise_or (
xf::Mat<SRC_T, ROWS, COLS, NPC> src1, 
xf::Mat<SRC_T, ROWS, COLS, NPC> src2, 
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 49. bitwise_or Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Supports 1 channel and 3 channels (XF_8UC1 and XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 50. bitwise_or Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 62 44 10
8 pixel 150 0 0 59 72 13

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3-channel image.

Table 51. bitwise_or Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 155 61 22

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 52. bitwise_or Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

Bitwise XOR

The bitwise_xor function performs the pixel-wise bitwise XOR operation between two input images, and returns an output image, as shown below:

$$I_{out}(x, y) = I_{in1}(x, y) \oplus I_{in2}(x, y)$$

Where,
  • $I_{out}(x, y)$ is the intensity of output image at (x, y) position
  • $I_{in1}(x, y)$ is the intensity of first input image at (x, y) position
  • $I_{in2}(x, y)$ is the intensity of second input image at (x, y) position
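
A property that is convenient when checking results is that XOR is its own inverse: applying the same second image twice restores the first image. The sketch below models the operation on byte buffers; bitwise_xor_ref is a hypothetical helper, not an xfOpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference: dst[i] = src1[i] ^ src2[i] for every pixel.
std::vector<std::uint8_t> bitwise_xor_ref(const std::vector<std::uint8_t>& src1,
                                          const std::vector<std::uint8_t>& src2) {
    assert(src1.size() == src2.size());
    std::vector<std::uint8_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        dst[i] = static_cast<std::uint8_t>(src1[i] ^ src2[i]);
    }
    return dst;
}
```

Pixels that are identical in both inputs produce 0, so the result can also serve as a simple change map between two frames.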

API Syntax

template<int SRC_T, int ROWS, int COLS, int NPC=1> 
void bitwise_xor(
xf::Mat<SRC_T, ROWS, COLS, NPC> src1, 
xf::Mat<SRC_T, ROWS, COLS, NPC> src2, 
xf::Mat<SRC_T, ROWS, COLS, NPC> dst )

Parameter Descriptions

The following table describes the template and the function parameters.

Table 53. bitwise_xor Function Parameter Descriptions
Parameter Description
SRC_T Input and output pixel type. Supports 1 channel and 3 channels (XF_8UC1 and XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src1 Input image
src2 Input image
dst Output image

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image:

Table 54. bitwise_xor Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 62 44 10
8 pixel 150 0 0 59 72 13

The following table summarizes the resource utilization in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3-channel image.

Table 55. bitwise_xor Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 155 61 22

Performance Estimate

The following table summarizes the performance in different configurations, as generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image:

Table 56. bitwise_xor Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

Box Filter

The boxFilter function performs box filtering on the input image. Box filter acts as a low-pass filter and performs blurring over the image. The boxFilter function or the box blur is a spatial domain linear filter in which each pixel in the resulting image has a value equal to the average value of the neighboring pixels in the image.
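
The averaging described above can be sketched in software as follows. This is a minimal reference under stated assumptions: box_filter_ref is a hypothetical helper, pixels outside the image are treated as 0 (a constant border), the sum is always divided by the full window area, and the result is rounded to nearest. The rounding used by the hardware kernel may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference, not the xfOpenCV kernel.
// Averages a filter_size x filter_size window around each pixel of a
// row-major 8-bit image. Pixels outside the image count as 0.
std::vector<std::uint8_t> box_filter_ref(const std::vector<std::uint8_t>& src,
                                         int rows, int cols, int filter_size) {
    assert(static_cast<int>(src.size()) == rows * cols);
    assert(filter_size % 2 == 1);
    const int half = filter_size / 2;
    const int area = filter_size * filter_size;
    std::vector<std::uint8_t> dst(src.size());
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int sum = 0;
            for (int dr = -half; dr <= half; ++dr) {
                for (int dc = -half; dc <= half; ++dc) {
                    const int rr = r + dr;
                    const int cc = c + dc;
                    if (rr >= 0 && rr < rows && cc >= 0 && cc < cols) {
                        sum += src[rr * cols + cc];
                    }
                }
            }
            // Round to nearest; the rounding mode is an assumption.
            dst[r * cols + c] = static_cast<std::uint8_t>((sum + area / 2) / area);
        }
    }
    return dst;
}
```

On a uniform image the interior pixels are unchanged, while pixels at the border become darker because the zero-valued border contributes to the average.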

API Syntax

template<int BORDER_TYPE,int FILTER_TYPE, int SRC_T, int ROWS, int COLS,int NPC=1,bool USE_URAM=false>
void boxFilter(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 57. boxFilter Function Parameter Descriptions
Parameter Description
FILTER_TYPE Filter size. Filter sizes of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7) are supported
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
SRC_T Input and output pixel type. 8-bit unsigned, 16-bit unsigned and 16-bit signed, 1 channel types are supported (XF_8UC1, XF_16UC1 and XF_16SC1)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
USE_URAM Enable to map storage structures to UltraRAM
_src_mat Input image
_dst_mat Output image

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 58. boxFilter Function Resource Utilization Summary
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 3x3 300 3 1 545 519 104
1 pixel 5x5 300 5 1 876 870 189
1 pixel 7x7 300 7 1 1539 1506 300
8 pixel 3x3 150 6 8 1002 1368 264
8 pixel 5x5 150 10 8 1576 3183 611
8 pixel 7x7 150 14 8 2414 5018 942

The following table summarizes the resource utilization of the kernel in different configurations, generated using the SDx™ 2019.1 tool for the xczu7ev-ffvc1156-2-e FPGA, to process a grayscale 4K (3840x2160) image with UltraRAM enabled.

Table 59. boxFilter Function Resource Utilization Summary with UltraRAM enabled
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K URAM DSP_48Es FF LUT
1 pixel 3x3 300 0 1 1 821 521
1 pixel 5x5 300 0 1 1 1204 855
1 pixel 7x7 300 0 1 1 2083 1431
8 pixel 3x3 150 0 3 8 1263 1480
8 pixel 5x5 150 0 5 8 1771 3154
8 pixel 7x7 150 0 7 8 2700 5411

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image:

Table 60. boxFilter Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Filter Size Latency Estimate
Max (ms)
1 pixel 300 3x3 7.2
1 pixel 300 5x5 7.21
1 pixel 300 7x7 7.22
8 pixel 150 3x3 1.7
8 pixel 150 5x5 1.7
8 pixel 150 7x7 1.7

BoundingBox

The boundingbox function highlights the region of interest (ROI) in the input image using the following equations.

P(X,Y) ≤ P(xi, yi) ≤ P(X,Y’)
P(X’,Y) ≤ P(xi, yi) ≤ P(X’,Y’)
Where,
  • P(xi, yi) - Current pixel location
  • P(X,Y) - Top left corner of ROI
  • P(X,Y’) - Top right corner of ROI
  • P(X’,Y) - Bottom left corner of ROI
  • P(X’,Y’) - Bottom right corner of ROI
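
One possible reading of the inequalities above is that a pixel is highlighted when it lies on the outline spanned by the four ROI corners. The sketch below illustrates that reading for a single-channel image. The Box struct is a hypothetical stand-in for xf::Rect_<int>, the helper names are invented for illustration, and a one-pixel outline that includes the corner pixels is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an ROI: top-left corner plus width and height.
struct Box { int x, y, width, height; };

// True when pixel (xi, yi) lies on the one-pixel outline of the box.
bool on_box_outline(const Box& b, int xi, int yi) {
    const int x_end = b.x + b.width - 1;
    const int y_end = b.y + b.height - 1;
    const bool inside = xi >= b.x && xi <= x_end && yi >= b.y && yi <= y_end;
    const bool on_edge = xi == b.x || xi == x_end || yi == b.y || yi == y_end;
    return inside && on_edge;
}

// Writes color onto the outline of the box in a row-major 8-bit image.
void draw_box_ref(std::vector<std::uint8_t>& img, int rows, int cols,
                  const Box& b, std::uint8_t color) {
    assert(static_cast<int>(img.size()) == rows * cols);
    for (int yi = 0; yi < rows; ++yi) {
        for (int xi = 0; xi < cols; ++xi) {
            if (on_box_outline(b, xi, yi)) {
                img[yi * cols + xi] = color;
            }
        }
    }
}
```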

API Syntax

template<int SRC_T, int ROWS, int COLS, int MAX_BOXES=1, int NPC=1>
void boundingbox(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat, xf::Rect_<int> *roi , xf::Scalar<4,unsigned char > *color, int num_box)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 61. boundingbox Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel and 3 channel are supported (XF_8UC1, XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of NPC.
MAX_BOXES Maximum number of boxes, fixed to 5.
NPC Number of pixels to be processed per cycle; only XF_NPPC1 is supported.
_src_mat Input image
roi An xf::Rect object that consists of the top-left corner of the rectangle along with the height and width of the rectangle.
color The xf::Scalar object consists of color information for each box (ROI).
num_box Number of boxes to be highlighted; must be less than or equal to MAX_BOXES.

Resource Utilization

The following table summarizes the resource utilization in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 62. boundingbox Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 5 4 2521 1649 409

Performance Estimate

The following table summarizes the performance of the kernel in 1-pixel mode, as generated using the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA, to process a grayscale 4K (2160x3840) image while highlighting 3 different boundaries (480x640, 100x200, 300x300).

Table 63. boundingbox Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 0.15

OpenCV Reference

The xf::boundingbox function is compliant with the following OpenCV function:
void rectangle(Mat& img, Rect rec, const Scalar& color, int thickness=1, int lineType=8, int shift=0 )

Canny Edge Detection

The Canny edge detector finds the edges in an image or video frame. It is one of the most popular algorithms for edge detection. Canny algorithm aims to satisfy three main criteria:

  1. Low error rate: A good detection of only existent edges.
  2. Good localization: The distance between detected edge pixels and real edge pixels has to be minimized.
  3. Minimal response: Only one detector response per edge.

In this algorithm, the noise in the image is reduced first by applying a Gaussian mask. The Gaussian mask used here is the average mask of size 3x3. Thereafter, gradients along x and y directions are computed using the Sobel gradient function. The gradients are used to compute the magnitude and phase of the pixels. The phase is quantized and the pixels are binned accordingly. Non-maximal suppression is applied on the pixels to remove the weaker edges.
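
The magnitude step depends on the norm selected through the NORM_TYPE template parameter (L1NORM or L2NORM). The sketch below shows the usual definitions of the two norms for one pair of Sobel gradients. The function names are hypothetical, and the bit widths or scaling used inside the kernel are not specified here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// L1 norm of the gradient: |gx| + |gy|. Cheaper, no multipliers needed.
int gradient_magnitude_l1(int gx, int gy) {
    return std::abs(gx) + std::abs(gy);
}

// L2 norm of the gradient: sqrt(gx^2 + gy^2). More accurate, more costly.
float gradient_magnitude_l2(int gx, int gy) {
    return std::sqrt(static_cast<float>(gx * gx + gy * gy));
}
```

The extra multiplications of the L2 norm are consistent with the higher DSP48E counts reported for the L2NORM configurations in the resource tables of this section.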

Edge tracing is applied on the remaining pixels to draw the edges on the image. In this implementation, the Canny stages up to non-maximal suppression are in one kernel and the edge linking module is in another kernel. After non-maximal suppression, the output is represented as 2 bits per pixel, where:

  • 00 - represents the background
  • 01 - represents the weaker edge
  • 11 - represents the strong edge

The output is packed as 8-bit (four 2-bit pixels) in 1 pixel per cycle operation and packed as 16-bit (eight 2-bit pixels) in 8 pixel per cycle operation. For the edge linking module, the input is 64-bit, such that 32 pixels of 2 bits each are packed into a 64-bit word. Edge tracing is then applied on the pixels and returns the edges in the image.
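
The packing of four 2-bit codes into one 8-bit word can be sketched as follows. The enumerator and function names are hypothetical, and placing the first pixel in the least significant bits is an assumption; the actual bit order of the kernel output should be checked against the library sources.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// 2-bit codes produced after non-maximal suppression.
enum EdgeCode : std::uint8_t { kBackground = 0x0, kWeakEdge = 0x1, kStrongEdge = 0x3 };

// Packs four 2-bit pixels into one byte, pixel 0 in bits [1:0] (assumed order).
std::uint8_t pack_four(const std::array<EdgeCode, 4>& px) {
    std::uint8_t word = 0;
    for (int i = 0; i < 4; ++i) {
        word = static_cast<std::uint8_t>(word | ((px[i] & 0x3) << (2 * i)));
    }
    return word;
}

// Extracts the 2-bit pixel at position index (0..3) from a packed byte.
EdgeCode unpack_one(std::uint8_t word, int index) {
    assert(index >= 0 && index < 4);
    return static_cast<EdgeCode>((word >> (2 * index)) & 0x3);
}
```

The same scheme extends to eight pixels in a 16-bit word and to 32 pixels in a 64-bit word by widening the word type and the loop bound.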

API Syntax

The API syntax for Canny is:
template<int FILTER_TYPE,int NORM_TYPE,int SRC_T,int DST_T, int ROWS, int COLS,int NPC,int NPC1,bool USE_URAM=false>
void Canny(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<DST_T, ROWS, COLS, NPC1> & _dst_mat,unsigned char _lowthreshold,unsigned char _highthreshold)
The API syntax for EdgeTracing is:
template<int SRC_T, int DST_T, int ROWS, int COLS,int NPC_SRC,int NPC_DST,bool USE_URAM=false>
void EdgeTracing(xf::Mat<SRC_T, ROWS, COLS, NPC_SRC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC_DST> & _dst)

Parameter Descriptions

The following table describes the xf::Canny template and function parameters:

Table 64. xf::Canny Function Parameter Descriptions
Parameter Description
FILTER_TYPE The filter window dimensions. The options are 3 and 5.
NORM_TYPE The type of norm used. The options for norm type are L1NORM and L2NORM.
SRC_T Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
DST_T Output pixel type. Only XF_2UC1 is supported. With NPC=XF_NPPC1, the output is 8-bit, with four 2-bit pixel values packed into each 8-bit word. With NPC=XF_NPPC8, the output is 16-bit, with eight 2-bit pixel values packed into each 16-bit word.
ROWS Maximum height of input and output image
COLS Maximum width of input and output image (must be a multiple of 8, in case of 8 pixel mode)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively. In XF_NPPC1 mode, the output image pixels are packed and the output precision is XF_NPPC4. In XF_NPPC8 mode, the output precision is XF_NPPC8.
USE_URAM Enable to map some storage structures to URAM
_src_mat Input image
_dst_mat Output image
_lowthreshold The lower value of threshold for binary thresholding.
_highthreshold The higher value of threshold for binary thresholding.

The following table describes the EdgeTracing template and function parameters:

Table 65. EdgeTracing Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type
DST_T Output pixel type
ROWS Maximum height of input and output image
COLS Maximum width of input and output image (must be a multiple of 32)
NPC_SRC Number of pixels to be processed per cycle. Fixed to XF_NPPC32.
NPC_DST Number of pixels to be written to destination. Fixed to XF_NPPC8.
USE_URAM Enable to map storage structures to URAM.
_src Input image
_dst Output image

Resource Utilization

The following table summarizes the resource utilization of xf::Canny and EdgeTracing in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image with a filter size of 3.

Table 66. xf::Canny and EdgeTracing Function Resource Utilization Summary
Name Resource Utilization
1 pixel 1 pixel 8 pixel 8 pixel Edge Linking Edge Linking
L1NORM,FS:3 L2NORM,FS:3 L1NORM,FS:3 L2NORM,FS:3
300 MHz 300 MHz 150 MHz 150 MHz 300 MHz 150 MHz
BRAM_18K 22 18 36 32 84 84
DSP48E 2 4 16 32 3 3
FF 3027 3507 4899 6208 17600 14356
LUT 2626 3170 6518 9560 15764 14274
CLB 606 708 1264 1871 2955 3241

The following table summarizes the resource utilization of xf::Canny and EdgeTracing in different configurations, generated using the SDx 2019.1 tool for the xczu7ev-ffvc1156-2-e FPGA, to process a grayscale 4K image with a filter size of 3.

Table 67. xf::Canny and EdgeTracing Function Resource Utilization Summary with UltraRAM Enabled
Name Resource Utilization
1 pixel 1 pixel 8 pixel 8 pixel Edge Linking Edge Linking
L1NORM,FS:3 L2NORM,FS:3 L1NORM,FS:3 L2NORM,FS:3
300 MHz 300 MHz 150 MHz 150 MHz 300 MHz 150 MHz
BRAM_18K 10 8 3 3 4 4
URAM 1 1 15 13 8 8
DSP48E 2 4 16 32 8 8
FF 3184 3749 5006 7174 5581 7054
LUT 2511 2950 6695 9906 4092 6380

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image with L1NORM and a filter size of 3, including the edge linking module.

Table 68. xf::Canny and EdgeTracing Function Performance Estimate Summary
Operating Mode Latency Estimate
Operating Frequency (MHz) Latency (ms)
1 pixel 300 10.8
8 pixel 150 8.5

Deviation from OpenCV

In OpenCV Canny function, the Gaussian blur is not applied as a pre-processing step.

Channel Combine

The merge function merges single-channel images into a multi-channel image. The number of channels to be merged must be four.
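
Conceptually, merging four 8-bit planes produces one 32-bit pixel per position. The sketch below illustrates this on plain buffers. merge_ref is a hypothetical helper, and placing the first source in the least significant byte is an assumption about the channel order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference: interleaves four 8-bit planes into
// 32-bit pixels, with c1 in bits [7:0] and c4 in bits [31:24] (assumed order).
std::vector<std::uint32_t> merge_ref(const std::vector<std::uint8_t>& c1,
                                     const std::vector<std::uint8_t>& c2,
                                     const std::vector<std::uint8_t>& c3,
                                     const std::vector<std::uint8_t>& c4) {
    assert(c1.size() == c2.size() && c2.size() == c3.size() && c3.size() == c4.size());
    std::vector<std::uint32_t> dst(c1.size());
    for (std::size_t i = 0; i < c1.size(); ++i) {
        dst[i] = static_cast<std::uint32_t>(c1[i]) |
                 (static_cast<std::uint32_t>(c2[i]) << 8) |
                 (static_cast<std::uint32_t>(c3[i]) << 16) |
                 (static_cast<std::uint32_t>(c4[i]) << 24);
    }
    return dst;
}
```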

API Syntax

template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void merge(xf::Mat<SRC_T, ROWS, COLS, NPC> &_src1, xf::Mat<SRC_T, ROWS, COLS, NPC> &_src2, xf::Mat<SRC_T, ROWS, COLS, NPC> &_src3, xf::Mat<SRC_T, ROWS, COLS, NPC> &_src4, xf::Mat<DST_T, ROWS, COLS, NPC> &_dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 69. merge Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1,2 and 3 channel is supported (XF_8UC1)
DST_T Output pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; only XF_NPPC1 (1 pixel operation) is supported.
_src1 Input single-channel image
_src2 Input single-channel image
_src3 Input single-channel image
_src4 Input single-channel image
_dst Output multi-channel image

Resource Utilization

The following table summarizes the resource utilization of the merge function, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process 4 single-channel HD (1080x1920) images.

Table 70. merge Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 8 494 386 85

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process 4 single channel HD (1080x1920) images.

Table 71. merge Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency
1 pixel operation (300 MHz) 6.92 ms

Channel Extract

The extractChannel function splits a multi-channel array (32-bit pixel-interleaved data) into several single-channel arrays and returns a single channel. The channel to be extracted is specified by using the channel argument.

The value of the channel argument is specified by macros defined in the xf_channel_extract_e enumerated data type. The following table summarizes the possible values for the xf_channel_extract_e enumerated data type:

Table 72. xf_channel_extract_e Enumerated Data Type Values
Channel Enumerated Type
Unknown XF_EXTRACT_CH_0
Unknown XF_EXTRACT_CH_1
Unknown XF_EXTRACT_CH_2
Unknown XF_EXTRACT_CH_3
RED XF_EXTRACT_CH_R
GREEN XF_EXTRACT_CH_G
BLUE XF_EXTRACT_CH_B
ALPHA XF_EXTRACT_CH_A
LUMA XF_EXTRACT_CH_Y
Cb/U XF_EXTRACT_CH_U
Cr/V/Value XF_EXTRACT_CH_V
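
Extracting one channel from 32-bit pixel-interleaved data amounts to selecting one byte of every pixel. The sketch below uses a plain channel index from 0 to 3 rather than the xf_channel_extract_e values, because the mapping from those enumerators to byte positions is not given here. extract_channel_ref is a hypothetical helper, and index 0 addressing the least significant byte is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software reference: returns byte number channel_index (0..3)
// of every 32-bit pixel, where index 0 is bits [7:0] (assumed order).
std::vector<std::uint8_t> extract_channel_ref(const std::vector<std::uint32_t>& src,
                                              int channel_index) {
    assert(channel_index >= 0 && channel_index < 4);
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = static_cast<std::uint8_t>((src[i] >> (8 * channel_index)) & 0xFFu);
    }
    return dst;
}
```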

API Syntax

template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1> 
void extractChannel(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat, uint16_t _channel)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 73. extractChannel Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4 channel is supported (XF_8UC4)
DST_T Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. Must be multiple of 8 for 8 pixel mode
NPC Number of pixels to be processed per cycle; only XF_NPPC1 (1 pixel operation) is supported.
_src_mat Input multi-channel image
_dst_mat Output single channel image
_channel Channel to be extracted (See xf_channel_extract_e enumerated type in file xf_params.h for possible values.)

Resource Utilization

The following table summarizes the resource utilization of the extractChannel function, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4 channel HD (1080x1920) image.

Table 74. extractChannel Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 8 508 354 96

Performance Estimate

The following table summarizes the performance in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a 4 channel HD (1080x1920) image.

Table 75. extractChannel Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.92

Color Conversion

The color conversion functions convert one image format to another image format, for the combinations listed in the following table. Each row gives an input format followed by its supported output formats, with the section that describes each conversion in parentheses. Supported conversions are discussed in the following sections.

Table 76. Supported Color Conversions
Input Format Supported Output Formats (section with details)
RGBA NV12 (RGBA to NV12); NV21 (RGBA to NV21); IYUV (RGBA/RGB to IYUV); YUV4 (RGBA/RGB to YUV4)
NV12 RGBA (NV12 to RGBA); NV21 (NV12 to NV21/NV21 to NV12); IYUV (NV12 to IYUV); UYVY and YUYV (NV12/NV21 to UYVY/YUYV); YUV4 (NV12 to YUV4); RGB and BGR (NV12/NV21 to RGB/BGR)
NV21 RGBA (NV21 to RGBA); NV12 (NV12 to NV21/NV21 to NV12); IYUV (NV21 to IYUV); UYVY and YUYV (NV12/NV21 to UYVY/YUYV); YUV4 (NV21 to YUV4); RGB and BGR (NV12/NV21 to RGB/BGR)
IYUV RGBA (IYUV to RGBA/RGB); NV12 (IYUV to NV12); YUV4 (IYUV to YUV4); RGB (IYUV to RGBA/RGB)
UYVY RGBA (UYVY to RGBA); NV12 (UYVY to NV12); IYUV (UYVY to IYUV)
YUYV RGBA (YUYV to RGBA); NV12 (YUYV to NV12); IYUV (YUYV to IYUV)
YUV4 None
RGB NV12 and NV21 (RGB/BGR to NV12/NV21); IYUV (RGBA/RGB to IYUV); UYVY and YUYV (RGB/BGR to UYVY/YUYV); YUV4 (RGBA/RGB to YUV4); BGR (BGR to RGB / RGB to BGR)
BGR NV12 and NV21 (RGB/BGR to NV12/NV21); UYVY and YUYV (RGB/BGR to UYVY/YUYV); RGB (BGR to RGB / RGB to BGR)

Other conversions

The library also provides the following conversions: BGR/RGB<->HSV, BGR/RGB<->HLS, BGR/RGB<->YCrCb, BGR/RGB<->XYZ, and RGB<->BGR.

RGB to YUV Conversion Matrix

Following is the formula to convert RGB data to YUV data:

YUV to RGB Conversion Matrix

Following is the formula to convert YUV data to RGB data:

Source: http://www.fourcc.org/fccyvrgb.php
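
As a rough illustration of the two conversions, the following sketch applies the studio-range coefficients commonly listed on the page referenced above. The coefficient values, the rounding, and the names Rgb, Yuv, rgb_to_yuv, and yuv_to_rgb are assumptions made for this example. The library's fixed-point constants may differ.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };
struct Yuv { std::uint8_t y, u, v; };

// Round to nearest and clamp to the 8-bit range [0, 255].
static std::uint8_t clamp8(double x) {
    return static_cast<std::uint8_t>(std::min(255L, std::max(0L, std::lround(x))));
}

// RGB -> YUV using assumed studio-range coefficients (Y in 16..235, U/V centered on 128).
Yuv rgb_to_yuv(Rgb p) {
    Yuv out;
    out.y = clamp8( 0.257 * p.r + 0.504 * p.g + 0.098 * p.b + 16.0);
    out.u = clamp8(-0.148 * p.r - 0.291 * p.g + 0.439 * p.b + 128.0);
    out.v = clamp8( 0.439 * p.r - 0.368 * p.g - 0.071 * p.b + 128.0);
    return out;
}

// YUV -> RGB using the matching assumed inverse coefficients.
Rgb yuv_to_rgb(Yuv p) {
    const double c = 1.164 * (p.y - 16);  // scaled luma
    const double d = p.u - 128;           // centered U
    const double e = p.v - 128;           // centered V
    Rgb out;
    out.r = clamp8(c + 1.596 * e);
    out.g = clamp8(c - 0.813 * e - 0.391 * d);
    out.b = clamp8(c + 2.018 * d);
    return out;
}
```

Under these assumed coefficients, white maps to Y=235 with U=V=128 and black maps to Y=16 with U=V=128, which is the usual studio-range behavior.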

RGBA/RGB to YUV4

The rgba2yuv4 function converts a 4-channel RGBA image to YUV444 format and the rgb2yuv4 function converts a 3-channel RGB image to YUV444 format. The function outputs Y, U, and V streams separately.

API Syntax
template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgba2yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS, COLS, NPC> & _u_image, xf::Mat<DST_T, ROWS, COLS, NPC> & _v_image)
template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgb2yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS, COLS, NPC> & _u_image, xf::Mat<DST_T, ROWS, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 77. (rgba/rgb)2yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4(RGBA) and 3(RGB)-channel are supported (XF_8UC4 and XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input RGBA/RGB image of size (ROWS, COLS).
_y_image Output Y image of size (ROWS, COLS).
_u_image Output U image of size (ROWS, COLS).
_v_image Output V image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of RGBA/RGB to YUV4 for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 78. (rgba/rgb)2yuv4 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 9 589 328 96
Performance Estimate

The following table summarizes the performance of RGBA/RGB to YUV4 for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 79. (rgba/rgb)2yuv4 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 1.89

RGBA/RGB to IYUV

The rgba2iyuv function converts a 4-channel RGBA image to IYUV (4:2:0) format and the rgb2iyuv function converts a 3-channel RGB image to IYUV (4:2:0) format. The function outputs Y, U, and V planes separately. IYUV holds subsampled data: Y is sampled for every RGBA/RGB pixel, and U and V are each sampled once for every block of 2 rows and 2 columns (2x2) of pixels. The U and V planes are therefore of size (rows/2)*(columns/2); cascading each pair of consecutive rows into a single row gives planes of size (rows/4)*columns.
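
The cascaded plane layout can be sketched as a pair of index helpers. This is a minimal illustration that assumes row-major storage. The names PlaneDim, Pos, cascaded_chroma_dim, and cascaded_pos are hypothetical and are not part of the library.

```cpp
#include <cassert>

struct PlaneDim { int rows, cols; };
struct Pos { int row, col; };

// Dimensions of one cascaded U or V plane for a rows x cols image.
PlaneDim cascaded_chroma_dim(int rows, int cols) {
    assert(rows % 4 == 0 && cols % 2 == 0);
    return PlaneDim{rows / 4, cols};
}

// Map chroma sample (r, c) of the natural (rows/2) x (cols/2) plane
// to its position in the cascaded (rows/4) x cols plane.
Pos cascaded_pos(int r, int c, int cols) {
    const int half = cols / 2;
    assert(r >= 0 && c >= 0 && c < half);
    // Two consecutive half-width rows share one full-width row:
    // even rows fill the left half, odd rows fill the right half.
    return Pos{r / 2, (r % 2) * half + c};
}
```

For an HD (1080x1920) image this gives U and V planes of 270 rows by 1920 columns, which matches the ROWS/4, COLS template arguments in the API syntax.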

API Syntax
template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgba2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)
template <int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgb2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 80. (rgba/rgb)2iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4(RGBA) and 3(RGB)-channel are supported (XF_8UC4 and XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input RGBA/RGB image of size (ROWS, COLS).
_y_image Output Y image of size (ROWS, COLS).
_u_image Output U image of size (ROWS/4, COLS).
_v_image Output V image of size (ROWS/4, COLS).
Resource Utilization

The following table summarizes the resource utilization of RGBA/RGB to IYUV for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 81. (rgba/rgb)2iyuv Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 9 816 472 149
Performance Estimate

The following table summarizes the performance of RGBA/RGB to IYUV for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 82. (rgba/rgb)2iyuv Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 1.8

RGBA to NV12

The rgba2nv12 function converts a 4-channel RGBA image to NV12 (4:2:0) format. The function outputs the Y plane and the interleaved UV plane separately. NV12 holds subsampled data: Y is sampled for every RGBA pixel, and U and V are each sampled once for every block of 2 rows and 2 columns (2x2) of pixels. The UV plane is of size (rows/2)*(columns/2), with each element holding an interleaved U and V pair.

API Syntax
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1>
void rgba2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS, COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & _uv)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 83. rgba2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input RGBA image of size (ROWS, COLS).
_y Output Y image of size (ROWS, COLS).
_uv Output UV image of size (ROWS/2, COLS/2).
Resource Utilization

The following table summarizes the resource utilization of RGBA to NV12 for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 84. rgba2nv12 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 9 802 452 128
Performance Estimate

The following table summarizes the performance of RGBA to NV12 for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 85. rgba2nv12 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 1.8

RGBA to NV21

The rgba2nv21 function converts a 4-channel RGBA image to NV21 (4:2:0) format. The function outputs the Y plane and the interleaved VU plane separately. NV21 holds subsampled data: Y is sampled for every RGBA pixel, and U and V are each sampled once for every block of 2 rows and 2 columns (2x2) of RGBA pixels. The VU plane is of size (rows/2)*(columns/2), with each element holding an interleaved V and U pair.

API Syntax
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1>
void rgba2nv21(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS, COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & _uv)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 86. rgba2nv21 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input RGBA image of size (ROWS, COLS).
_y Output Y image of size (ROWS, COLS).
_uv Output UV image of size (ROWS/2, COLS/2).
Resource Utilization

The following table summarizes the resource utilization of RGBA to NV21 for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 87. rgba2nv21 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 9 802 453 131
Performance Estimate

The following table summarizes the performance of RGBA to NV21 for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 88. rgba2nv21 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 1.89

YUYV to RGBA

The yuyv2rgba function converts a single-channel YUYV (YUV 4:2:2) image to a 4-channel RGBA image. YUYV is a subsampled format: one set of YUYV values gives 2 RGBA pixel values. YUYV is represented in 16-bit values, whereas RGBA is represented in 32-bit values.
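
The relationship between one YUYV set and two pixels can be sketched as follows. The example assumes the byte order Y0, U, Y1, V with each 16-bit value holding the luma sample in its low byte. That packing and the names Yuv422Pair and unpack_yuyv are assumptions for illustration, not the library's documented layout.

```cpp
#include <cstdint>

// The samples carried by one YUYV set: two luma values sharing one U and one V.
struct Yuv422Pair { std::uint8_t y0, y1, u, v; };

// Unpack two consecutive 16-bit YUYV values (assumed: low byte = Y, high byte = U or V).
Yuv422Pair unpack_yuyv(std::uint16_t w0, std::uint16_t w1) {
    Yuv422Pair p;
    p.y0 = static_cast<std::uint8_t>(w0 & 0xFF);  // luma of the first pixel
    p.u  = static_cast<std::uint8_t>(w0 >> 8);    // shared U
    p.y1 = static_cast<std::uint8_t>(w1 & 0xFF);  // luma of the second pixel
    p.v  = static_cast<std::uint8_t>(w1 >> 8);    // shared V
    return p;
}
```

Each unpacked pair then yields two output pixels, (y0, u, v) and (y1, u, v), before the YUV to RGB conversion is applied.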

API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void yuyv2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 89. yuyv2rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_dst Output image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of YUYV to RGBA for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 90. yuyv2rgba Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 6 765 705 165
Performance Estimate

The following table summarizes the performance of YUYV to RGBA for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 91. yuyv2rgba Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

YUYV to NV12

The yuyv2nv12 function converts a single-channel YUYV (YUV 4:2:2) image to NV12 (YUV 4:2:0) format. YUYV is a subsampled format: one set of YUYV values gives 2 Y values and 1 U and 1 V value.

API Syntax
template<int SRC_T,int Y_T,int UV_T,int ROWS,int COLS,int NPC=1,int NPC_UV=1>
void yuyv2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<Y_T, ROWS, COLS, NPC> & _y_image,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 92. yuyv2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
NPC_UV Number of UV image pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_uv_image Output UV plane of size (ROWS/2, COLS/2).
Resource Utilization

The following table summarizes the resource utilization of YUYV to NV12 for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 93. yuyv2nv12 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 831 491 149
8 pixel 150 0 0 1196 632 161
Performance Estimate

The following table summarizes the performance of YUYV to NV12 for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 94. yuyv2nv12 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7
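
The latency figures in this table are consistent with a simple throughput model in which the function consumes a fixed number of pixels on every clock cycle. The sketch below is that model only. It is an assumption rather than the tool's reported calculation, it ignores pipeline fill and memory stalls, and the name frame_latency_ms is hypothetical.

```cpp
#include <cassert>

// Estimated time to stream one frame, in milliseconds, assuming
// pixels_per_cycle pixels are consumed on every clock cycle.
double frame_latency_ms(int rows, int cols, int pixels_per_cycle, double clock_mhz) {
    assert(pixels_per_cycle > 0 && clock_mhz > 0.0);
    const double cycles = static_cast<double>(rows) * cols / pixels_per_cycle;
    return cycles / (clock_mhz * 1000.0);  // clock_mhz * 1000 = cycles per millisecond
}
```

For an HD (1080x1920) image the model gives about 6.91 ms at 1 pixel per cycle and 300 MHz, and about 1.73 ms at 8 pixels per cycle and 150 MHz.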

YUYV to IYUV

The yuyv2iyuv function converts a single-channel YUYV (YUV 4:2:2) image to IYUV (4:2:0) format. The outputs of the function are separate Y, U, and V planes. YUYV is a subsampled format: one set of YUYV values gives 2 Y values and 1 U and 1 V value. The U and V values of the odd rows are dropped, because U and V are sampled once for every 2 rows and 2 columns in the IYUV (4:2:0) format.
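
The row-dropping step can be sketched on a single chroma plane. This minimal example assumes rows are numbered from 0, so rows 1, 3, 5, and so on are the ones dropped, and that the plane is stored row-major. The name chroma_422_to_420 is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reduce a 4:2:2 chroma plane (rows x cols/2 samples) to a 4:2:0 chroma
// plane (rows/2 x cols/2 samples) by keeping only the even rows.
std::vector<std::uint8_t> chroma_422_to_420(const std::vector<std::uint8_t>& chroma422,
                                            int rows, int cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    const int half = cols / 2;
    assert(chroma422.size() == static_cast<std::size_t>(rows) * half);

    std::vector<std::uint8_t> out;
    out.reserve(static_cast<std::size_t>(rows / 2) * half);
    for (int r = 0; r < rows; r += 2) {          // skip every odd row
        for (int c = 0; c < half; ++c) {
            out.push_back(chroma422[static_cast<std::size_t>(r) * half + c]);
        }
    }
    return out;
}
```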

API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void yuyv2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 95. yuyv2iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization

The following table summarizes the resource utilization of YUYV to IYUV for different configurations, generated using Vivado HLS 2019.1 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 96. yuyv2iyuv Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 835 497 149
8 pixel 150 0 0 1428 735 210
Performance Estimate

The following table summarizes the performance of YUYV to IYUV for different configurations, as generated using Vivado HLS 2019.1 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 97. yuyv2iyuv Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

UYVY to IYUV

The uyvy2iyuv function converts a UYVY (YUV 4:2:2) single-channel image to the IYUV format. The outputs of the function are separate Y, U, and V planes. UYVY is a subsampled format: one set of UYVY values gives two Y values and one U and one V value.

API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void uyvy2iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _y_image,xf::Mat<DST_T, ROWS/4, COLS, NPC> & _u_image, xf::Mat<DST_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 98. uyvy2iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization

The following table summarizes the resource utilization of UYVY to IYUV for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 99. uyvy2iyuv Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 835 494 139
8 pixel 150 0 0 1428 740 209
Performance Estimate

The following table summarizes the performance of UYVY to IYUV for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 100. uyvy2iyuv Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

UYVY to RGBA

The uyvy2rgba function converts a UYVY (YUV 4:2:2) single-channel image to a 4-channel RGBA image. UYVY is a subsampled format: one set of UYVY values gives 2 RGBA pixel values. UYVY is represented in 16-bit values, whereas RGBA is represented in 32-bit values.

API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void uyvy2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 101. uyvy2rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_dst Output image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of UYVY to RGBA for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 102. uyvy2rgba Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 6 773 704 160
Performance Estimate

The following table summarizes the performance of UYVY to RGBA for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 103. uyvy2rgba Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.8

UYVY to NV12

The uyvy2nv12 function converts a UYVY (YUV 4:2:2) single-channel image to NV12 format. The outputs are separate Y and UV planes. UYVY is a subsampled format: one set of UYVY values gives 2 Y values and 1 U and 1 V value.

API Syntax
template<int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void uyvy2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<Y_T, ROWS, COLS, NPC> & _y_image,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 104. uyvy2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
Y_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
NPC_UV Number of UV image pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1 pixel and 4 pixel operations respectively.
_src Input image of size (ROWS, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_uv_image Output UV plane of size (ROWS/2, COLS/2).
Resource Utilization

The following table summarizes the resource utilization of UYVY to NV12 for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 105. uyvy2nv12 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 831 488 131
8 pixel 150 0 0 1235 677 168
Performance Estimate

The following table summarizes the performance of UYVY to NV12 for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 106. uyvy2nv12 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

IYUV to RGBA/RGB

The iyuv2rgba function converts a single-channel IYUV (YUV 4:2:0) image to a 4-channel RGBA image, and the iyuv2rgb function converts a single-channel IYUV (YUV 4:2:0) image to a 3-channel RGB image. The inputs to the function are separate Y, U, and V planes. IYUV is a subsampled format: U and V values are sampled once for every 2 rows and 2 columns of the RGBA/RGB pixels. In the U and V planes, the data of two consecutive rows of size (columns/2) is combined to form a single row of size (columns).

API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void iyuv2rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_u,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void iyuv2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_u,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 107. iyuv2(rgba/rgb) Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 4(RGBA) and 3(RGB)-channel are supported (XF_8UC4 and XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_u Input U plane of size (ROWS/4, COLS).
src_v Input V plane of size (ROWS/4, COLS).
_dst0 Output RGBA/RGB image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of IYUV to RGBA/RGB for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 108. iyuv2(rgba/rgb) Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 2 5 1208 728 196
Performance Estimate

The following table summarizes the performance of IYUV to RGBA/RGB for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 109. iyuv2(rgba/rgb) Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

IYUV to NV12

The iyuv2nv12 function converts a single-channel IYUV image to NV12 format. The inputs are separate U and V planes. The Y plane needs no processing, because both formats have the same Y plane. The U and V values are rearranged from plane interleaved to pixel interleaved.
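
The rearrangement from separate planes to interleaved pairs can be sketched as follows. The example treats each plane as a flat, row-major array, which is an assumption for illustration, and the name interleave_uv is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Combine separate U and V planes into one UV plane where each U sample
// is immediately followed by the V sample for the same 2x2 pixel block.
std::vector<std::uint8_t> interleave_uv(const std::vector<std::uint8_t>& u,
                                        const std::vector<std::uint8_t>& v) {
    assert(u.size() == v.size());
    std::vector<std::uint8_t> uv;
    uv.reserve(u.size() * 2);
    for (std::size_t i = 0; i < u.size(); ++i) {
        uv.push_back(u[i]);
        uv.push_back(v[i]);
    }
    return uv;
}
```

Because a cascaded (rows/4)*columns plane and a (rows/2)*(columns/2) plane hold their samples in the same linear order, a flat loop of this kind covers both layouts.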

API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void iyuv2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_u,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v,xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 110. iyuv2nv12 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8 for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
NPC_UV Number of UV Pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_u Input U plane of size (ROWS/4, COLS).
src_v Input V plane of size (ROWS/4, COLS).
_y_image Output Y plane of size (ROWS, COLS).
_uv_image Output UV plane of size (ROWS/2, COLS/2).
Resource Utilization

The following table summarizes the resource utilization of IYUV to NV12 for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 111. iyuv2nv12 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 12 907 677 158
8 pixel 150 0 12 1591 1022 235
Performance Estimate

The following table summarizes the performance of IYUV to NV12 for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 112. iyuv2nv12 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

IYUV to YUV4

The iyuv2yuv4 function converts a single-channel IYUV image to YUV444 format. The Y plane is the same for both formats. The inputs are separate U and V planes of the IYUV image and the outputs are separate U and V planes of the YUV4 image. IYUV stores subsampled U and V values, whereas the YUV4 format stores U and V values for every pixel. Each U and V value is duplicated across 2 rows and 2 columns (2x2) of pixels to produce the required data in the YUV444 format.
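
The 2x2 duplication can be sketched on one chroma plane. This minimal example assumes flat, row-major storage and a subsampled plane of (rows/2)*(columns/2) samples. The name upsample_chroma_2x2 is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a subsampled chroma plane to full resolution by repeating each
// sample over the 2x2 block of pixels it covers.
std::vector<std::uint8_t> upsample_chroma_2x2(const std::vector<std::uint8_t>& chroma,
                                              int rows, int cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    const int half = cols / 2;
    assert(chroma.size() == static_cast<std::size_t>(rows / 2) * half);

    std::vector<std::uint8_t> out(static_cast<std::size_t>(rows) * cols);
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            // Pixel (r, c) reads the chroma sample of block (r/2, c/2).
            out[static_cast<std::size_t>(r) * cols + c] =
                chroma[static_cast<std::size_t>(r / 2) * half + c / 2];
        }
    }
    return out;
}
```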

API Syntax
template<int SRC_T, int ROWS, int COLS, int NPC=1>
void iyuv2yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_u,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & src_v,xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 113. iyuv2yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_u Input U plane of size (ROWS/4, COLS).
src_v Input V plane of size (ROWS/4, COLS).
_y_image Output Y image of size (ROWS, COLS).
_u_image Output U image of size (ROWS, COLS).
_v_image Output V image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of IYUV to YUV4 for different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 114. iyuv2yuv4 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 1398 870 232
8 pixel 150 0 0 2134 1214 304
Performance Estimate

The following table summarizes the performance of IYUV to YUV4 for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 115. iyuv2yuv4 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 13.8
8 pixel operation (150 MHz) 3.4

NV12 to IYUV

The nv122iyuv function converts NV12 format to IYUV format. The function takes the interleaved UV plane as input and outputs separate U and V planes. The Y plane needs no processing, because both formats have the same Y plane. The U and V values are rearranged from pixel interleaved to plane interleaved.
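
Splitting the interleaved plane back into separate planes can be sketched as follows. The example assumes a flat array in which each U sample is followed by its V sample. The names UvPlanes and deinterleave_uv are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct UvPlanes {
    std::vector<std::uint8_t> u;
    std::vector<std::uint8_t> v;
};

// Split an interleaved UV plane (U0 V0 U1 V1 ...) into separate U and V planes.
UvPlanes deinterleave_uv(const std::vector<std::uint8_t>& uv) {
    assert(uv.size() % 2 == 0);
    UvPlanes planes;
    planes.u.reserve(uv.size() / 2);
    planes.v.reserve(uv.size() / 2);
    for (std::size_t i = 0; i + 1 < uv.size(); i += 2) {
        planes.u.push_back(uv[i]);      // even positions hold U
        planes.v.push_back(uv[i + 1]);  // odd positions hold V
    }
    return planes;
}
```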

API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv122iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _u_image,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 116. nv122iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8 pixel mode).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization

The following table summarizes the resource utilization of NV12 to IYUV for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 117. nv122iyuv Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 1344 717 208
8 pixel 150 0 1 1961 1000 263
Performance Estimate

The following table summarizes the performance of NV12 to IYUV for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 118. nv122iyuv Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7
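
The latency figures in Table 118 are close to what a simple cycle count predicts if one assumes that NPC pixels of a 1080x1920 frame are consumed on every clock cycle and that pipeline fill time is negligible. That assumption is not stated in this document, and the helper name `estimate_latency_ms` is hypothetical; the sketch below is only a back-of-the-envelope check.

```cpp
#include <cassert>

// Hypothetical helper: ideal frame time in milliseconds, assuming
// pixels_per_cycle pixels are consumed on every clock cycle.
double estimate_latency_ms(int rows, int cols, int pixels_per_cycle, double clock_mhz) {
    assert(pixels_per_cycle > 0 && clock_mhz > 0.0);
    const double cycles = static_cast<double>(rows) * cols / pixels_per_cycle;
    return cycles / (clock_mhz * 1e3);  // MHz * 1e3 = clock cycles per millisecond
}
```

Under this assumption, `estimate_latency_ms(1080, 1920, 1, 300.0)` gives about 6.91 ms and `estimate_latency_ms(1080, 1920, 8, 150.0)` gives about 1.73 ms.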

NV12 to RGBA

The nv122rgba function converts an NV12 image to a 4-channel RGBA image. The inputs to the function are separate Y and UV planes. NV12 holds subsampled data: the Y plane is sampled at unit rate, and there is one U value and one V value for every 2x2 block of Y values. To generate the RGBA data, each U and V value is duplicated 2x2 times.

API Syntax
template<int SRC_T, int UV_T, int DST_T, int ROWS, int COLS, int NPC=1>
void nv122rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & src_uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 119. nv122rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_dst0 Output RGBA image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of NV12 to RGBA for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 120. nv122rgba Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 2 5 1191 708 195
Performance Estimate

The following table summarizes the performance of NV12 to RGBA for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 121. nv122rgba Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

NV12 to YUV4

The nv122yuv4 function converts an NV12 image to the YUV444 format. The function outputs separate U and V planes. The Y plane is the same for both image formats. The U and V values are duplicated 2x2 times to form the U plane and the V plane of the YUV444 image format.
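
The 2x2 duplication of chroma samples can be modeled as a nearest-neighbour upsampling of one already separated chroma plane. The sketch below is a plain C++ reference model, not the library code; the function name `upsample_2x2` and the row-major `std::vector` layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: nearest-neighbour 2x2 upsampling of one chroma plane.
// `half` holds (rows/2) x (cols/2) samples in row-major order; the result
// holds rows x cols samples.
std::vector<std::uint8_t> upsample_2x2(const std::vector<std::uint8_t>& half,
                                       std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);
    assert(half.size() == (rows / 2) * (cols / 2));
    std::vector<std::uint8_t> full(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Each 2x2 block of output pixels shares one chroma sample.
            full[r * cols + c] = half[(r / 2) * (cols / 2) + (c / 2)];
        }
    }
    return full;
}
```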

API Syntax
template<int SRC_T,int UV_T, int ROWS, int COLS, int NPC=1, int NPC_UV=1>
void nv122yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image,xf::Mat<SRC_T, ROWS, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 122. nv122yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8 pixel mode).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS, COLS).
_v_image Output V plane of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of NV12 to YUV4 for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 123. nv122yuv4 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 1383 832 230
8 pixel 150 0 0 1772 1034 259
Performance Estimate

The following table summarizes the performance of NV12 to YUV4 for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 124. nv122yuv4 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 13.8
8 pixel operation (150 MHz) 3.4

NV21 to IYUV

The nv212iyuv function converts an NV21 image to the IYUV image format. The function takes only the interleaved VU plane as input and produces separate U and V planes as output. The Y plane needs no processing because both formats share the same Y plane. The U and V values are rearranged from pixel-interleaved to plane-interleaved order.

API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>
void nv212iyuv(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image, xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _u_image,xf::Mat<SRC_T, ROWS/4, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 125. nv212iyuv Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS/4, COLS).
_v_image Output V plane of size (ROWS/4, COLS).
Resource Utilization

The following table summarizes the resource utilization of NV21 to IYUV for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 126. nv212iyuv Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 1377 730 219
8 pixel 150 0 1 1975 1012 279
Performance Estimate

The following table summarizes the performance of NV21 to IYUV for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 127. nv212iyuv Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9
8 pixel operation (150 MHz) 1.7

NV21 to RGBA

The nv212rgba function converts an NV21 image to a 4-channel RGBA image. The inputs to the function are separate Y and VU planes. NV21 holds subsampled data: the Y plane is sampled at unit rate, and there is one U value and one V value for every 2x2 block of Y values. To generate the RGBA data, each U and V value is duplicated 2x2 times.

API Syntax
template<int SRC_T, int UV_T, int DST_T, int ROWS, int COLS, int NPC=1>
void nv212rgba(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC> & src_uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 128. nv212rgba Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
DST_T Output pixel type. Only 8-bit, unsigned, 4-channel is supported (XF_8UC4).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_dst0 Output RGBA image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of NV21 to RGBA for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 129. nv212rgba Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 2 5 1170 673 183
Performance Estimate

The following table summarizes the performance of NV21 to RGBA for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 130. nv212rgba Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

NV21 to YUV4

The nv212yuv4 function converts an image in the NV21 format to the YUV444 format. The function outputs separate U and V planes. The Y plane is the same for both formats. The U and V values are duplicated 2x2 times to form the U plane and the V plane of the YUV444 format.

API Syntax
template<int SRC_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>
void nv212yuv4(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv, xf::Mat<SRC_T, ROWS, COLS, NPC> & _y_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _u_image, xf::Mat<SRC_T, ROWS, COLS, NPC> & _v_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 131. nv212yuv4 Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8 pixel mode).
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
NPC_UV Number of UV image Pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC4 for 1 pixel and 4-pixel operations respectively.
src_y Input Y plane of size (ROWS, COLS).
src_uv Input UV plane of size (ROWS/2, COLS/2).
_y_image Output Y plane of size (ROWS, COLS).
_u_image Output U plane of size (ROWS, COLS).
_v_image Output V plane of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of NV21 to YUV4 for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 132. nv212yuv4 Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 1383 817 233
8 pixel 150 0 0 1887 1087 287
Performance Estimate

The following table summarizes the performance of NV21 to YUV4 for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 133. nv212yuv4 Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 13.8
8 pixel operation (150 MHz) 3.5

RGB to GRAY

The rgb2gray function converts a 3-channel RGB image to GRAY format.

Y= 0.299*R+0.587*G+0.114*B
Where,
  • Y = Gray pixel
  • R= Red channel
  • G= Green channel
  • B= Blue channel
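
The weighted sum above can be modeled in integer arithmetic as sketched below. The 14-bit fixed-point weights (4899, 9617, 1868) and the round-to-nearest step are assumptions chosen for illustration; this document does not state the number format or rounding used by the hardware implementation, and the name `rgb_to_gray` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model of Y = 0.299*R + 0.587*G + 0.114*B using
// 14-bit fixed-point weights: 4899/16384 ~ 0.299, 9617/16384 ~ 0.587,
// 1868/16384 ~ 0.114.
std::uint8_t rgb_to_gray(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const std::uint32_t sum = 4899u * r + 9617u * g + 1868u * b;
    assert(sum <= 255u * 16384u);  // the three weights add up to exactly 16384
    return static_cast<std::uint8_t>((sum + 8192u) >> 14);  // +8192 rounds to nearest
}
```

With these weights, a pure red pixel (255, 0, 0) maps to 76, pure green to 150, and pure blue to 29. For BGR input only the order in which the channels are read changes.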
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void rgb2gray(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 134. RGB2GRAY Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image.
NPC Number of pixels to be processed per cycle.
_src RGB input image
_dst GRAY output image
Resource Utilization

The following table summarizes the resource utilization of RGB to GRAY for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 135. RGB2GRAY Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 3 439 280
Performance Estimate

The following table summarizes the performance of RGB to GRAY for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 136. RGB2GRAY Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

BGR to GRAY

The bgr2gray function converts a 3-channel BGR image to GRAY format.

Y= 0.299*R+0.587*G+0.114*B

Where,
  • Y = Gray pixel
  • R= Red channel
  • G= Green channel
  • B= Blue channel
API Syntax
template<int SRC_T, int DST_T, int ROWS, int COLS, int NPC=1>
void bgr2gray(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 137. bgr2gray Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned,1-channel is supported (XF_8UC1).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle.
_src BGR input image
_dst GRAY output image
Resource Utilization

The following table summarizes the resource utilization of BGR to GRAY for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 138. bgr2gray Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 3 439 280
Performance Estimate

The following table summarizes the performance of BGR to GRAY for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 139. bgr2gray Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

GRAY to RGB

The gray2rgb function converts a gray intensity image to RGB color format.

R<-Y, G<-Y, B<-Y

Where,
  • Y = Gray pixel
  • R= Red channel
  • G= Green channel
  • B= Blue channel
API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void gray2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 140. gray2rgb Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle.
_src GRAY input image.
_dst RGB output image.
Resource Utilization

The following table summarizes the resource utilization of gray2rgb for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 141. gray2rgb Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 0 156 184
Performance Estimate

The following table summarizes the performance of gray2rgb for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 142. gray2rgb Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

GRAY to BGR

The gray2bgr function converts a gray intensity image to BGR color format.

R<-Y, G<-Y, B<-Y

Where,
  • Y = Gray pixel
  • R= Red channel
  • G= Green channel
  • B= Blue channel
API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void gray2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 143. gray2bgr Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle.
_src GRAY input image.
_dst BGR output image.
Resource Utilization

The following table summarizes the resource utilization of gray2bgr for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 144. gray2bgr Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 0 156 184
Performance Estimate

The following table summarizes the performance of gray2bgr for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 145. gray2bgr Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

HLS to RGB/BGR

The hls2(rgb/bgr) function converts an image in the HLS color space to a 3-channel RGB/BGR image.

API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void hls2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void hls2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 146. HLS2RGB/BGR Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle.
_src HLS input image.
_dst RGB/BGR output image.
Resource Utilization

The following table summarizes the resource utilization of HLS2RGB/BGR for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process an HD (1080x1920) image.

Table 147. HLS2RGB/BGR Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 3 4366 3096
Performance Estimate

The following table summarizes the performance of HLS2RGB/BGR for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 148. HLS2RGB/BGR Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

RGB to XYZ

The rgb2xyz function converts a 3-channel RGB image to XYZ color space.

  • R= Red channel
  • G= Green channel
  • B= Blue channel
API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void rgb2xyz(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 149. RGB2XYZ Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle.
_src RGB input image.
_dst XYZ output image.
Resource Utilization

The following table summarizes the resource utilization of RGB to XYZ for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 150. RGB2XYZ Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 8 644 380
Performance Estimate

The following table summarizes the performance of RGB to XYZ for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 151. RGB2XYZ Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

BGR to XYZ

The bgr2xyz function converts a 3-channel BGR image to XYZ color space.

  • R= Red channel
  • G= Green channel
  • B= Blue channel
API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void bgr2xyz(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 152. BGR2XYZ Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be a multiple of 8.
COLS Maximum width of input and output image. Must be a multiple of 8.
NPC Number of pixels to be processed per cycle.
_src BGR input image.
_dst XYZ output image.
Resource Utilization

The following table summarizes the resource utilization of BGR to XYZ for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 153. BGR2XYZ Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 8 644 380
Performance Estimate

The following table summarizes the performance of BGR to XYZ for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 154. BGR2XYZ Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

RGB/BGR to YCrCb

The (rgb/bgr)2ycrcb function converts a 3-channel RGB/BGR image to YCrCb color space.
  • Y = 0.299*R + 0.587*G + 0.114*B
  • Cr= (R-Y)*0.713+delta
  • Cb= (B-Y)*0.564+delta
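
The equations above can be modeled in floating point as sketched below. This document does not define delta; the sketch assumes delta = 128, the value OpenCV uses for 8-bit images. The names `YCrCb`, `saturate_u8`, and `rgb_to_ycrcb` are hypothetical, and the rounding and clamping steps are illustrative choices rather than a description of the hardware implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct YCrCb { std::uint8_t y, cr, cb; };

// Rounds to nearest and clamps to the 8-bit range [0, 255].
inline std::uint8_t saturate_u8(double v) {
    assert(std::isfinite(v));
    return static_cast<std::uint8_t>(std::min(255L, std::max(0L, std::lround(v))));
}

// Hypothetical floating-point model of the three equations above.
// ASSUMPTION: delta = 128 (8-bit data); this document does not define delta.
YCrCb rgb_to_ycrcb(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const double delta = 128.0;
    const double y = 0.299 * r + 0.587 * g + 0.114 * b;
    YCrCb out;
    out.y = saturate_u8(y);
    out.cr = saturate_u8((r - y) * 0.713 + delta);
    out.cb = saturate_u8((b - y) * 0.564 + delta);
    return out;
}
```

Any neutral gray input (R = G = B) yields Cr = Cb = delta, which is a quick sanity check for an implementation.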


API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void rgb2ycrcb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void bgr2ycrcb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 155. RGB/BGR2YCrCb Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3)
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3)
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle
_src RGB/BGR input image
_dst YCrCb output image
Resource Utilization

The following table summarizes the resource utilization of RGB/BGR2YCrCb for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 156. RGB/BGR2YCrCb Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 5 660 500
Performance Estimate

The following table summarizes the performance of RGB/BGR2YCrCb for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 157. RGB/BGR2YCrCb Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

RGB/BGR to HSV

The (rgb/bgr)2hsv function converts a 3-channel RGB/BGR image to HSV color space.

API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void rgb2hsv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void bgr2hsv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 158. RGB/BGR2HSV Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle
_src RGB/BGR input image
_dst HSV output image
Resource Utilization

The following table summarizes the resource utilization of RGB/BGR2HSV for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 159. RGB/BGR2HSV Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 6 8 1582 1274
Performance Estimate

The following table summarizes the performance of RGB/BGR2HSV for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 160. RGB/BGR2HSV Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

RGB/BGR to HLS

The (rgb/bgr)2hls function converts a 3-channel RGB/BGR image to HLS color space.

API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void rgb2hls(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void bgr2hls(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 161. RGB/BGR2HLS Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle.
_src RGB/BGR input image.
_dst HLS output image.
Resource Utilization

The following table summarizes the resource utilization of RGB/BGR2HLS for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 162. RGB/BGR2HLS Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 3 4366 3096
Performance Estimate

The following table summarizes the performance of RGB/BGR2HLS for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 163. RGB/BGR2HLS Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

YCrCb to RGB/BGR

The ycrcb2(rgb/bgr) function converts YCrCb color space to 3-channel RGB/BGR image.

Where,
  • R= Y+1.403*(Cr-delta)
  • G= Y-0.714*(Cr-delta)-0.344*(Cb-delta)
  • B= Y+1.773*(Cb-delta)
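
The inverse conversion can be modeled in floating point as sketched below. The sketch takes the B term as a product, Y + 1.773*(Cb-delta), as in the standard OpenCV conversion, and assumes delta = 128 for 8-bit data; neither the value of delta nor the rounding behavior is stated in this document. The names `RGB`, `saturate_u8`, and `ycrcb_to_rgb` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct RGB { std::uint8_t r, g, b; };

// Rounds to nearest and clamps to the 8-bit range [0, 255].
inline std::uint8_t saturate_u8(double v) {
    assert(std::isfinite(v));
    return static_cast<std::uint8_t>(std::min(255L, std::max(0L, std::lround(v))));
}

// Hypothetical floating-point model of the YCrCb to RGB equations.
// ASSUMPTION: delta = 128 (8-bit data); this document does not define delta.
RGB ycrcb_to_rgb(std::uint8_t y, std::uint8_t cr, std::uint8_t cb) {
    const double delta = 128.0;
    RGB out;
    out.r = saturate_u8(y + 1.403 * (cr - delta));
    out.g = saturate_u8(y - 0.714 * (cr - delta) - 0.344 * (cb - delta));
    out.b = saturate_u8(y + 1.773 * (cb - delta));
    return out;
}
```

Clamping matters here: a bright pixel with a large Cr value pushes the red term above 255 before saturation.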
API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void ycrcb2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>
void ycrcb2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 164. YCrCb2RGB/BGR Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be a multiple of 8.
COLS Maximum width of input and output image. Must be a multiple of 8.
NPC Number of pixels to be processed per cycle.
_src YCrCb input image.
_dst RGB/BGR output image.
Resource Utilization

The following table summarizes the resource utilization of YCrCb2RGB/BGR for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 165. YCrCb2RGB/BGR Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 4 538 575
Performance Estimate

The following table summarizes the performance of YCrCb2RGB/BGR for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 166. YCrCb2RGB/BGR Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

HSV to RGB/BGR

The hsv2(rgb/bgr) function converts an HSV color space image to a 3-channel RGB/BGR image.

API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void hsv2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void hsv2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 167. HSV2RGB/BGR Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be a multiple of 8.
COLS Maximum width of input and output image. Must be a multiple of 8.
NPC Number of pixels to be processed per cycle.
_src HSV input image.
_dst RGB/BGR output image.
Resource Utilization

The following table summarizes the resource utilization of HSV2RGB/BGR for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 168. HSV2RGB/BGR Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 8 1543 1006
Performance Estimate

The following table summarizes the performance of HSV2RGB/BGR for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 169. HSV2RGB/BGR Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

NV12/NV21 to RGB/ BGR

The nv122rgb/nv122bgr/nv212rgb/nv212bgr function converts an NV12/NV21 image to a 3-channel RGB/BGR image. The inputs to the function are separate Y and UV planes. NV12/NV21 holds subsampled data: the Y plane is sampled at unit rate, and there is one U and one V value for every 2x2 block of Y values. To generate the RGB data, each U and V value is duplicated across its 2x2 block.
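
The 2x2 chroma duplication can be pictured as an index mapping from a luma position to the chroma pair it shares. The following is a minimal host-side sketch, not the library implementation; `Chroma` and `chroma_for_pixel` are hypothetical names, and the UV plane is assumed to be stored row-major as interleaved byte pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pair type: (U, V) for NV12, (V, U) for NV21.
struct Chroma { std::uint8_t first, second; };

// Return the chroma pair shared by the luma pixel at (row, col).
// uv_plane is assumed to hold (rows/2) x (cols/2) pairs, two bytes per pair.
Chroma chroma_for_pixel(const std::vector<std::uint8_t>& uv_plane,
                        std::size_t cols, std::size_t row, std::size_t col) {
    std::size_t uv_cols = cols / 2;                           // pairs per chroma row
    std::size_t idx = ((row / 2) * uv_cols + (col / 2)) * 2;  // byte offset of the pair
    assert(idx + 1 < uv_plane.size());
    return { uv_plane[idx], uv_plane[idx + 1] };
}
```

All four luma pixels of a 2x2 block map to the same pair, which is the duplication described above.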

API Syntax
NV122RGB:
template<int SRC_T,int UV_T,int DST_T,int ROWS,int COLS,int NPC=1,int NPC_UV=1>void nv122rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
NV122BGR:
template<int SRC_T,int UV_T,int DST_T,int ROWS,int COLS,int NPC=1,int NPC_UV=1>void nv122bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
NV212RGB:
template<int SRC_T,int UV_T,int DST_T,int ROWS,int COLS,int NPC=1,int NPC_UV=1>void nv212rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
NV212BGR:
template<int SRC_T,int UV_T,int DST_T,int ROWS,int COLS,int NPC=1,int NPC_UV=1>void nv212bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & src_y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & src_uv, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst0)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 170. Function Parameter Descriptions
Parameter Description
SRC_T Input Y pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Input UV pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of NPC for N pixel mode.
NPC Number of Y pixels to be processed per cycle. Possible options are XF_NPPC1, XF_NPPC2, XF_NPPC4, and XF_NPPC8.
NPC_UV Number of UV pixels to be processed per cycle. Possible options are XF_NPPC1, XF_NPPC2, and XF_NPPC4.
src_y Y input image of size (ROWS, COLS).
src_uv UV input image of size (ROWS/2, COLS/2).
_dst0 Output RGB/BGR image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of NV12/NV21 to RGB/ BGR function in Normal mode (1 pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 2 5 339 289 76
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2018.3 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 171. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

NV12 to NV21/NV21 to NV12

The nv122nv21/nv212nv12 function converts an NV12 (YUV 4:2:0) image to NV21 (YUV 4:2:0) or vice versa. Both formats consist of an 8-bit Y plane followed by an interleaved chroma plane with 2x2 subsampling; they differ only in the order of the U and V samples.
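
Because NV12 stores chroma as U,V pairs and NV21 stores it as V,U pairs, the conversion leaves the Y plane untouched and only swaps each interleaved pair. A minimal host-side sketch of that swap follows; `swap_chroma_pairs` is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Swap every interleaved chroma pair: U0 V0 U1 V1 ... <-> V0 U0 V1 U1 ...
std::vector<std::uint8_t> swap_chroma_pairs(const std::vector<std::uint8_t>& uv_plane) {
    assert(uv_plane.size() % 2 == 0);  // the plane must hold whole pairs
    std::vector<std::uint8_t> out(uv_plane.size());
    for (std::size_t i = 0; i + 1 < uv_plane.size(); i += 2) {
        out[i] = uv_plane[i + 1];
        out[i + 1] = uv_plane[i];
    }
    return out;
}
```

The same routine serves both directions, since applying it twice returns the original plane.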

API Syntax
NV122NV21:
template<int SRC_Y,int SRC_UV,int ROWS,int COLS,int NPC=1,int NPC_UV=1>
void nv122nv21(xf::Mat<SRC_Y, ROWS, COLS, NPC> & _y,xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & _uv,xf::Mat<SRC_Y, ROWS, COLS, NPC> & out_y,xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & out_uv)
NV212NV12:
template<int SRC_Y, int SRC_UV, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void nv212nv12(xf::Mat<SRC_Y, ROWS, COLS, NPC> & _y, xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & _uv, xf::Mat<SRC_Y, ROWS, COLS, NPC> & out_y, xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & out_uv)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 172. Function Parameter Descriptions
Parameter Description
SRC_Y Input Y pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)
SRC_UV Input UV pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2)
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. Must be a multiple of NPC.
NPC Number of Y pixels to be processed per cycle. Possible options are XF_NPPC1, XF_NPPC2, XF_NPPC4, and XF_NPPC8.
NPC_UV Number of UV Pixels to be processed per cycle. Possible options are XF_NPPC1,XF_NPPC2 and XF_NPPC4.
_y Y input image
_uv UV input image
out_y Y output image
out_uv UV output image
Resource Utilization

The following table summarizes the resource utilization of NV122NV21/NV212NV12 function in Normal mode (1-Pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 258 161 61
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 173. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

NV12/NV21 to UYVY/YUYV

The NV12/NV21 to UYVY/YUYV function converts an NV12/NV21 (YUV 4:2:0) image to a single-channel YUYV/UYVY (YUV 4:2:2) image format. YUYV/UYVY is a sub-sampled format that is represented in 16-bit values, whereas RGB is represented in 24-bit values.

API Syntax
NV122UYVY:
template<int SRC_Y, int SRC_UV, int DST_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void nv122uyvy(xf::Mat<SRC_Y, ROWS, COLS, NPC> & _y,xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & _uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
NV122YUYV:
template<int SRC_Y, int SRC_UV, int DST_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void nv122yuyv(xf::Mat<SRC_Y, ROWS, COLS, NPC> & _y, xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & _uv, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
NV212UYVY:
template<int SRC_Y, int SRC_UV, int DST_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void nv212uyvy(xf::Mat<SRC_Y, ROWS, COLS, NPC> & _y, xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & _uv,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
NV212YUYV:
template<int SRC_Y, int SRC_UV, int DST_T,int ROWS, int COLS, int NPC=1,int NPC_UV=1>void nv212yuyv(xf::Mat<SRC_Y, ROWS, COLS, NPC> & _y, xf::Mat<SRC_UV, ROWS/2, COLS/2, NPC_UV> & _uv, xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 174. Function Parameter Descriptions
Parameter Description
SRC_Y Input Y image pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
SRC_UV Input UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
DST_T Output pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of NPC.
NPC Number of pixels to be processed per cycle. Possible options are XF_NPPC1,XF_NPPC2,XF_NPPC4 and XF_NPPC8.
NPC_UV Number of UV pixels to be processed per cycle. Possible options are XF_NPPC1, XF_NPPC2, and XF_NPPC4.
_y Y input image
_uv UV input image
_dst UYVY/YUYV output image
Resource Utilization

The following table summarizes the resource utilization of NV12/NV21 to UYVY/YUYV function in Normal mode(1-Pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 1 0 337 201 64
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 175. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

UYVY/YUYV to RGB/BGR

The yuyv2rgb/yuyv2bgr/uyvy2rgb/uyvy2bgr function converts a single-channel YUYV/UYVY (YUV 4:2:2) image format to a 3-channel RGB/BGR image. YUYV/UYVY is a sub-sampled format: one set of YUYV/UYVY values gives 2 RGB pixel values. YUYV/UYVY is represented in 16-bit values, whereas RGB/BGR is represented in 24-bit values.

API Syntax
YUYV2RGB:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void yuyv2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
YUYV2BGR:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void yuyv2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
UYVY2RGB:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void uyvy2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
UYVY2BGR:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void uyvy2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 176. Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned,1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. Must be a multiple of NPC for N pixel mode.
NPC Number of Y pixels to be processed per cycle. Possible options are XF_NPPC1,XF_NPPC2,XF_NPPC4 and XF_NPPC8.
_src Input image of size(ROWS, COLS)
_dst Output image of size (ROWS, COLS).
Resource Utilization

The following table summarizes the resource utilization of UYVY/YUYV to RGB/BGR function in Normal mode(1-Pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 6 444 486 109
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 177. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

UYVY to YUYV/ YUYV to UYVY

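The yuyv2uyvy/uyvy2yuyv function converts a YUYV (YUV 4:2:2) image to UYVY (YUV 4:2:2) or vice versa. Both are packed, single-channel 16-bit formats in which Y is sampled for every pixel and each U and V value is shared by a pair of horizontally adjacent pixels; they differ only in the byte order of the luma and chroma samples.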
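
Since YUYV orders the bytes as Y0 U Y1 V and UYVY orders them as U Y0 V Y1, the conversion amounts to swapping the two bytes of each 16-bit element. The sketch below shows this on a host buffer; `swap_luma_chroma` is a hypothetical helper, and it assumes each 16-bit word packs one luma byte and one chroma byte.

```cpp
#include <cstdint>
#include <vector>

// Swap the luma and chroma bytes of every 16-bit packed element.
std::vector<std::uint16_t> swap_luma_chroma(const std::vector<std::uint16_t>& src) {
    std::vector<std::uint16_t> dst;
    dst.reserve(src.size());
    for (std::uint16_t w : src) {
        // The low byte moves to the high position and vice versa.
        dst.push_back(static_cast<std::uint16_t>((w << 8) | (w >> 8)));
    }
    return dst;
}
```

The operation is its own inverse, so one routine covers both conversion directions.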

API Syntax
UYVY2YUYV :
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void uyvy2yuyv(xf::Mat<SRC_T, ROWS, COLS, NPC> & uyvy,xf::Mat<DST_T, ROWS, COLS, NPC> & yuyv)
YUYV2UYVY:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void yuyv2uyvy(xf::Mat<SRC_T, ROWS, COLS, NPC> & yuyv,xf::Mat<DST_T, ROWS, COLS, NPC> & uyvy)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 178. Function Parameter Descriptions
Parameter Description
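SRC_T Input pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
DST_T Output pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of NPC.
NPC Number of pixels to be processed per cycle. Possible options are XF_NPPC1, XF_NPPC2, XF_NPPC4, and XF_NPPC8.
yuyv YUYV image: the input of yuyv2uyvy and the output of uyvy2yuyv.
uyvy UYVY image: the output of yuyv2uyvy and the input of uyvy2yuyv.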
Resource Utilization

The following table summarizes the resource utilization of UYVY to YUYV/ YUYV to UYVY function in Normal mode (1 pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 1 368 176 109
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a grayscale HD (1080x1920) image.

Table 179. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

UYVY/YUYV to NV21

The uyvy2nv21/yuyv2nv21 function converts a single-channel YUYV/UYVY (YUV 4:2:2) image to NV21 (YUV 4:2:0) format. YUYV/UYVY is a sub-sampled format: one set of YUYV/UYVY values gives two Y values and one U and one V value.

API Syntax
UYVY2NV21:
template<int SRC_T,int Y_T,int UV_T,int ROWS,int COLS,int NPC=1,int NPC_UV=1>void uyvy2nv21(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<Y_T, ROWS, COLS, NPC> & _y_image,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)
YUYV2NV21:
template<int SRC_T,int Y_T,int UV_T,int ROWS,int COLS,int NPC=1,int NPC_UV=1>void yuyv2nv21(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<Y_T, ROWS, COLS, NPC> & _y_image,xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv_image)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 180. Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 16-bit, unsigned,1-channel is supported (XF_16UC1).
Y_T Output Y image pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of NPC.
NPC Number of pixels to be processed per cycle; Possible options are XF_NPPC1,XF_NPPC2,XF_NPPC4 and XF_NPPC8.
NPC_UV Number of U, V Pixels to be processed per cycle; Possible options are XF_NPPC1,XF_NPPC2 and XF_NPPC4.
_src Input image
_y_image Y Output image
_uv_image UV Output image
Resource Utilization

The following table summarizes the resource utilization of UYVY/YUYV to NV21 function in Normal mode (1 pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 215 73 42
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 181. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

RGB/ BGR to NV12/NV21

The rgb2nv12/bgr2nv12/rgb2nv21/bgr2nv21 function converts a 3-channel RGB/BGR image to NV12/NV21 (4:2:0) format. The function outputs the Y plane and the interleaved UV/VU plane separately. NV12/NV21 holds subsampled data: Y is sampled for every RGB/BGR pixel, and U and V are sampled once for every 2 rows and 2 columns (2x2) of pixels. The UV/VU plane is of (rows/2)*(columns/2) size, with the U and V values interleaved.
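
To make the plane sizes concrete, the sketch below computes the byte counts of the two output planes for 8-bit data. `Nv12Layout` and `nv12_layout` are hypothetical names used only for this illustration, and even image dimensions are assumed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of the two NV12/NV21 output planes, in bytes.
struct Nv12Layout {
    std::size_t y_bytes;   // one 8-bit Y sample per pixel
    std::size_t uv_bytes;  // one interleaved 2-byte pair per 2x2 pixel block
};

Nv12Layout nv12_layout(std::size_t rows, std::size_t cols) {
    assert(rows % 2 == 0 && cols % 2 == 0);  // assumption: even dimensions
    return { rows * cols, (rows / 2) * (cols / 2) * 2 };
}
```

For a 1080x1920 image this gives 2,073,600 bytes of Y and 1,036,800 bytes of UV, half the size of the 24-bit RGB input.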

API Syntax
RGB2NV12:
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void rgb2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS, COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv)
BGR2NV12:
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void bgr2nv12(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS, COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv)
RGB2NV21:
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void rgb2nv21(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS, COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv)
BGR2NV21:
template <int SRC_T, int Y_T, int UV_T, int ROWS, int COLS, int NPC=1,int NPC_UV=1>void bgr2nv21(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<Y_T, ROWS, COLS, NPC> & _y, xf::Mat<UV_T, ROWS/2, COLS/2, NPC_UV> & _uv)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 182. Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
Y_T Output Y image pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
UV_T Output UV image pixel type. Only 8-bit, unsigned, 2-channel is supported (XF_8UC2).
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. Must be a multiple of NPC for N pixel mode.
NPC Number of Pixels to be processed per cycle. Possible options are XF_NPPC1,XF_NPPC2,XF_NPPC4 and XF_NPPC8.
NPC_UV Number of UV pixels to be processed per cycle. Possible options are XF_NPPC1, XF_NPPC2, and XF_NPPC4.
_src RGB/BGR input image of size (ROWS, COLS).
_y Output Y image of size (ROWS, COLS).
_uv Output UV image of size (ROWS/2, COLS/2).
Resource Utilization

The following table summarizes the resource utilization of RGB/BGR to NV12/NV21 function in Normal mode (1-Pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 9 413 279 66
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 183. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

BGR to RGB / RGB to BGR

The bgr2rgb/rgb2bgr function converts a 3-channel BGR to RGB format or RGB to BGR format.

API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void bgr2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void rgb2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 184. Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be a multiple of NPC.
NPC Number of Pixels to be processed per cycle. Possible options are XF_NPPC1,XF_NPPC2,XF_NPPC4 and XF_NPPC8.
_src BGR/RGB input image
_dst RGB/BGR output image
Resource Utilization

The following table summarizes the resource utilization of RGB to BGR/ BGR to RGB function in Normal mode (1-Pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 0 317 118 98
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 185. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

RGB/BGR to UYVY/YUYV

The RGB/BGR to UYVY/YUYV function converts a 3-channel RGB/BGR image to a single-channel YUYV/UYVY (YUV 4:2:2) image format. YUYV/UYVY is a sub-sampled format: two RGB/BGR pixels give one set of YUYV/UYVY values. YUYV/UYVY is represented in 16-bit values, whereas RGB is represented in 24-bit values.

API Syntax

RGB to UYVY:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void rgb2uyvy(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
RGB to YUYV:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void rgb2yuyv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
BGR to UYVY:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void bgr2uyvy(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
BGR to YUYV:
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void bgr2yuyv(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 186. Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3)
DST_T Output pixel type. Only 16-bit, unsigned, 1-channel is supported (XF_16UC1)
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. Must be multiple of NPC.
NPC Number of pixels to be processed per cycle. Possible options are XF_NPPC1, XF_NPPC2, XF_NPPC4, and XF_NPPC8.
_src RGB/BGR input image
_dst UYVY/YUYV output image
Resource Utilization

The following table summarizes the resource utilization of RGB/BGR to UYVY/YUYV function in normal mode(1-Pixel), as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA.

Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 300 0 9 249 203 55
Performance Estimate

The following table summarizes the performance of the kernel in single pixel configuration as generated using Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA to process a HD (1080x1920) image.

Table 187. Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

XYZ to RGB/BGR

The xyz2rgb/xyz2bgr function converts an XYZ color space image to a 3-channel RGB/BGR image.

API Syntax
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void xyz2rgb(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
template<int SRC_T,int DST_T,int ROWS,int COLS,int NPC=1>void xyz2bgr(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst)
Parameter Descriptions

The following table describes the template and the function parameters.

Table 188. XYZ2RGB/BGR Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 3-channel is supported (XF_8UC3).
ROWS Maximum height of input and output image. Must be multiple of 8.
COLS Maximum width of input and output image. Must be multiple of 8.
NPC Number of pixels to be processed per cycle.
_src XYZ input image.
_dst RGB/BGR output image.
Resource Utilization

The following table summarizes the resource utilization of XYZ2RGB/BGR for different configurations, as generated in the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a HD (1080x1920) image.

Table 189. XYZ2RGB/BGR Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 300 0 8 639 401
Performance Estimate

The following table summarizes the performance of XYZ2RGB/BGR for different configurations, as generated using the Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1, to process a HD (1080x1920) image.

Table 190. XYZ2RGB/BGR Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 6.9

Color Thresholding

The colorthresholding function compares the color space values of the source image with low and high threshold values, and returns either 255 or 0 as the output.
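
The per-pixel decision can be sketched on the host as follows. This is an illustration under stated assumptions rather than the kernel itself: it assumes the threshold arrays hold three consecutive channel bounds per color, that the bounds are inclusive, and that a pixel matching any one of the colors yields 255. `threshold_pixel` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return 255 if the pixel lies inside at least one [low, high] color range, else 0.
// Assumed layout: low/high contain 3 channel bounds per color, stored back to back.
std::uint8_t threshold_pixel(const std::array<std::uint8_t, 3>& px,
                             const std::vector<std::uint8_t>& low,
                             const std::vector<std::uint8_t>& high) {
    assert(low.size() == high.size() && low.size() % 3 == 0);
    for (std::size_t c = 0; c < low.size(); c += 3) {
        bool inside = true;
        for (std::size_t ch = 0; ch < 3; ++ch) {
            inside = inside && px[ch] >= low[c + ch] && px[ch] <= high[c + ch];
        }
        if (inside) return 255;  // the pixel matches this color range
    }
    return 0;
}
```

With two color ranges, for example, a pixel that falls inside either range is marked 255 in the single-channel output mask.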

API Syntax

template<int SRC_T,int DST_T,int MAXCOLORS, int ROWS, int COLS,int NPC>
          void colorthresholding(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat,unsigned char low_thresh[MAXCOLORS*3], unsigned char high_thresh[MAXCOLORS*3])

Parameter Descriptions

The table below describes the template and the function parameters.
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 3 channel is supported (XF_8UC3).
DST_T Output pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1).
MAXCOLORS Maximum number of color values
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. Must be a multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle. Only XF_NPPC1 supported.
_src_mat Input image
_dst_mat Thresholded image
low_thresh Lowest threshold values for the colors
high_thresh Highest threshold values for the colors

Compare

The Compare function performs the per element comparison of pixels in two corresponding images src1, src2 and stores the result in dst.

dst(x,y)=src1(x,y) CMP_OP src2(x,y)
CMP_OP – a flag that specifies the relation to be checked between the pixels.
  • XF_CMP_EQ : src1 is equal to src2
  • XF_CMP_GT : src1 is greater than src2
  • XF_CMP_GE : src1 is greater than or equal to src2
  • XF_CMP_LT : src1 is less than src2
  • XF_CMP_LE : src1 is less than or equal to src2
  • XF_CMP_NE : src1 is unequal to src2

If the comparison result is true, then the corresponding element of dst is set to 255; else it is set to 0.
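
A per-pixel reference for this behavior might look like the sketch below. The `CmpOp` enumeration is a hypothetical stand-in for the XF_CMP_* flags, and `compare_pixel` is not a library function; the operation is a template parameter here only to mirror the compile-time CMP_OP of the API.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the XF_CMP_* flags.
enum class CmpOp { EQ, GT, GE, LT, LE, NE };

// Compare two 8-bit pixels and return 255 when the relation holds, else 0.
template <CmpOp OP>
std::uint8_t compare_pixel(std::uint8_t a, std::uint8_t b) {
    bool result = false;
    switch (OP) {
        case CmpOp::EQ: result = (a == b); break;
        case CmpOp::GT: result = (a > b);  break;
        case CmpOp::GE: result = (a >= b); break;
        case CmpOp::LT: result = (a < b);  break;
        case CmpOp::LE: result = (a <= b); break;
        case CmpOp::NE: result = (a != b); break;
    }
    return result ? 255 : 0;
}
```

Applying the function to every pixel pair of the two inputs yields the 0/255 output mask described above.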

API Syntax

template<int CMP_OP,  int SRC_T , int ROWS, int COLS, int NPC=1>
void compare(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src1, xf::Mat<SRC_T, ROWS, COLS, NPC> & _src2, xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 191. Compare Function Parameter Descriptions
Parameter Description
CMP_OP The flag that specifies the relation to be checked between the elements.
SRC_T Input Pixel Type. 8-bit, unsigned, 1 channel is supported (XF_8UC1)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. In case of N-pixel parallelism, width should be multiple of N
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src1 First input image
_src2 Second input image
_dst Output image

Resource Utilization

The following table summarizes the resource utilization of the Compare XF_CMP_NE configuration in Resource optimized (8 pixels) mode and normal mode as generated using Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 192. Compare Function Resource Utilization Summary
Name Resource Utilization
1 pixel per clock operation 8 pixel per clock operation
300 MHz 150 MHz
BRAM_18K 0 0
DSP48E 0 0
FF 87 60
LUT 38 84
CLB 16 20

Performance Estimate

The following table summarizes a performance estimate of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA to process a grayscale HD (1080x1920) image.

Table 193. Compare Function Performance Estimate Summary
Operating Mode Latency Estimate
Operating Frequency (MHz) Latency (ms)
1 pixel 300 6.9
8 pixel 150 1.7
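
The figures in Table 193 are consistent with a simple streaming model in which one group of NPC pixels is consumed per clock cycle. The sketch below reproduces that arithmetic; `ideal_latency_ms` is a hypothetical helper, and the model ignores pipeline fill time and memory stalls, so it is an approximation and not the tool's own estimate.

```cpp
#include <cassert>
#include <cstdint>

// Idealized latency in milliseconds for a fully pipelined streaming kernel:
// (rows * cols / npc) clock cycles at freq_mhz.
double ideal_latency_ms(std::uint64_t rows, std::uint64_t cols, unsigned npc, double freq_mhz) {
    assert(npc > 0 && freq_mhz > 0.0);
    double cycles = static_cast<double>(rows * cols) / npc;
    return cycles / (freq_mhz * 1e6) * 1e3;
}
```

For a 1080x1920 image, the model gives about 6.91 ms for 1 pixel per cycle at 300 MHz and about 1.73 ms for 8 pixels per cycle at 150 MHz.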

CompareS

The CompareS function performs the comparison of a pixel in the input image (src1) and the given scalar value scl, and stores the result in dst.

dst(x,y)=src1(x,y) CMP_OP scalar
CMP_OP – a flag that specifies the relation to be checked between the pixel and the scalar.
  • XF_CMP_EQ : src1 is equal to scl
  • XF_CMP_GT : src1 is greater than scl
  • XF_CMP_GE : src1 is greater than or equal to scl
  • XF_CMP_LT : src1 is less than scl
  • XF_CMP_LE : src1 is less than or equal to scl
  • XF_CMP_NE : src1 is unequal to scl

If the comparison result is true, then the corresponding element of dst is set to 255, else it is set to 0.

API Syntax

template<int CMP_OP,  int SRC_T , int ROWS, int COLS, int NPC=1>
void compareS(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src1, unsigned char _scl[XF_CHANNELS(SRC_T,NPC)], xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 194. CompareS Function Parameter Descriptions
Parameter Description
CMP_OP The flag that specifies the relation to be checked between the elements.
SRC_T Input pixel type. 8-bit, unsigned, 1 channel is supported (XF_8UC1).
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. In case of N-pixel parallelism, the width should be a multiple of N
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixels operations respectively.
_src1 First input image
_scl Input scalar value, the size should be number of channels
_dst Output image

Resource Utilization

The following table summarizes the resource utilization of the CompareS function with XF_CMP_NE configuration in Resource optimized (8 pixels) mode and normal mode as generated using Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 195. CompareS Function Resource Utilization Summary
Name Resource Utilization
1 pixel per clock operation 8 pixel per clock operation
300 MHz 150 MHz
BRAM_18K 0 0
DSP48E 0 0
FF 93 93
LUT 39 68
CLB 21 28

Performance Estimate

The following table summarizes a performance estimate of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA to process a grayscale HD (1080x1920) image.

Table 196. CompareS Function Performance Estimate Summary
Operating Mode Latency Estimate
Operating Frequency (MHz) Latency (ms)
1 pixel 300 6.9
8 pixel 150 1.7

Crop

The Crop function extracts the region of interest (ROI) from the input image.

P(X,Y) ≤ P(xi, yi) ≤ P(X’,Y’)
  • P(X,Y) - Top left corner of ROI
  • P(X’,Y’) - Bottom right corner of ROI

Figure: Crop Function
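
The ROI relation above can be sketched as a plain C++ reference model. The `Roi` struct and `crop_ref` function are hypothetical names used only for this illustration. A row-major, single-channel image and an ROI that lies fully inside the input are assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-C++ stand-in for an ROI: top-left corner plus size.
struct Roi { unsigned x, y, width, height; };

// Copies the ROI out of a row-major, single-channel image.
// Assumes the ROI lies fully inside the source image.
std::vector<std::uint8_t> crop_ref(const std::vector<std::uint8_t>& src,
                                   unsigned src_cols, const Roi& roi) {
    std::vector<std::uint8_t> dst;
    dst.reserve(static_cast<std::size_t>(roi.width) * roi.height);
    for (unsigned r = roi.y; r < roi.y + roi.height; ++r) {
        for (unsigned c = roi.x; c < roi.x + roi.width; ++c) {
            dst.push_back(src[static_cast<std::size_t>(r) * src_cols + c]);
        }
    }
    return dst;
}
```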

API Syntax

template<int SRC_T, int ROWS, int COLS,int ARCH_TYPE=0,int NPC=1>
void crop(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<SRC_T, ROWS, COLS, NPC>  &_dst_mat,xf::Rect_<unsigned int> &roi)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 197. Crop Function Parameter Descriptions
Parameter Description
SRC_T Input pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3).
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image. Must be multiple of 8 for 8-pixel operation.
ARCH_TYPE Architecture type. 0 resolves to stream implementation and 1 resolves to memory mapped implementation.
NPC Number of pixels to be processed per cycle. NPC should be power of 2.
_src_mat Input image
_dst_mat Output ROI image
roi ROI is an xf::Rect object that consists of the top left corner of the rectangle along with the height and width of the rectangle.

Resource Utilization

The following table summarizes the resource utilization of crop function in normal mode (NPC=1) for 3 ROIs (480x640, 100x200, 300x300) as generated in the Vivado HLS 2019.1 tool for the Xilinx xczu9eg-ffvb1156-2-i-es2 FPGA.

Table 198. Crop Function Resource Utilization Summary
Name Resource Utilization
1-pixel per clock operation 8-pixel per clock operation
300 MHz 300 MHz
BRAM_18K 6 8
DSP48E 10 10
FF 17482 16995
LUT 16831 15305

Performance Estimate

The following table summarizes a performance estimate of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA to process a grayscale HD (1080x1920) image for 3 ROIs (480x640, 100x200, 300x300).

Table 199. Crop Function Performance Estimate Summary
Operating Mode Latency Estimate
Operating Frequency (MHz) Latency (ms)
1 pixel 300 1.7
8 pixel 300 0.6

Multiple ROI Extraction

You can call the xf::crop function multiple times in accel.cpp.

Multiple ROI Extraction Example

void crop_accel(xf::Mat<TYPE, HEIGHT, WIDTH, NPIX> &_src,
                xf::Mat<TYPE, HEIGHT, WIDTH, NPIX> _dst[NUM_ROI],
                xf::Rect_<unsigned int> roi[NUM_ROI])
{
    // Template arguments follow the crop signature: SRC_T, ROWS, COLS, ARCH_TYPE, NPC.
    // ARCH_TYPE 0 selects the stream implementation.
    xf::crop<TYPE, HEIGHT, WIDTH, 0, NPIX>(_src, _dst[0], roi[0]);
    xf::crop<TYPE, HEIGHT, WIDTH, 0, NPIX>(_src, _dst[1], roi[1]);
    xf::crop<TYPE, HEIGHT, WIDTH, 0, NPIX>(_src, _dst[2], roi[2]);
}

Custom Convolution

The filter2D function performs convolution over an image using a user-defined kernel.

Convolution is a mathematical operation on two functions f and g that produces a third function. The third function is typically viewed as a modified version of one of the original functions. It gives the area of overlap between the two functions as a function of the amount by which one of the original functions is translated.

The filter can be a unity gain filter or a non-unity gain filter. The filter must be of type XF_16SP. If the coefficients are floating point, they must be converted to the Qm.n fixed-point format before being provided as input, and the shift parameter must be set to the value of n. If the coefficients are not floating point, the filter is provided directly and the shift parameter is set to zero.
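
The conversion of floating-point coefficients to Qm.n can be sketched as follows. This is a minimal host-side illustration, not library code: `to_fixed` is a hypothetical helper, round-to-nearest is an assumption, and every scaled value is assumed to fit in 16 bits.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Converts floating-point coefficients to 16-bit fixed point with `shift`
// fractional bits (the n of Qm.n), rounding to nearest.
// Assumes every scaled value fits in int16_t.
std::vector<std::int16_t> to_fixed(const std::vector<float>& coeffs, unsigned shift) {
    std::vector<std::int16_t> out;
    out.reserve(coeffs.size());
    const float scale = static_cast<float>(1u << shift);
    for (float c : coeffs) {
        out.push_back(static_cast<std::int16_t>(std::lround(c * scale)));
    }
    return out;
}
```

With a shift of 8, the coefficients 0.5, -0.25, and 1.0 become 128, -64, and 256. The same shift value would then be passed as the shift parameter so that the result is scaled back by n bits.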

API Syntax

template<int BORDER_TYPE,int FILTER_WIDTH,int FILTER_HEIGHT, int SRC_T,int DST_T, int ROWS, int COLS,int NPC=1>
void filter2D(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat,short int filter[FILTER_HEIGHT*FILTER_WIDTH],unsigned char _shift)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 200. filter2D Function Parameter Descriptions
Parameter Description
BORDER_TYPE Border Type supported is XF_BORDER_CONSTANT
FILTER_HEIGHT Number of rows in the input filter
FILTER_WIDTH Number of columns in the input filter
SRC_T Input pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
DST_T Output pixel type. 8-bit, unsigned, 1 and 3 channels (XF_8UC1, XF_8UC3) and 16-bit, signed, 1 and 3 channels (XF_16SC1, XF_16SC3) are supported.
ROWS Maximum height of input and output image
COLS Maximum width of input and output image. Must be multiple of 8, for 8 pixel mode.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image
filter The input filter. It can be of any size, provided that both dimensions are odd. Each filter coefficient is either a 16-bit value or a 16-bit fixed-point equivalent value.
_shift Shift value for fixed-point coefficients. The filter must be of type XF_16SP. If the coefficients are floating point, they must be converted to the Qm.n format and the shift parameter must be set to the value of n. If the coefficients are not floating point, the filter is provided directly and the shift parameter is set to zero.

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 201. filter2D Function Resource Utilization Summary
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 3x3 300 3 9 1701 1161 269
1 pixel 5x5 300 5 25 3115 2144 524
8 pixel 3x3 150 6 72 2783 2768 638
8 pixel 5x5 150 10 216 3020 4443 1007

The following table summarizes the resource utilization of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3 Channel image.

Table 202. filter2D Function Resource Utilization Summary
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 3x3 300 18 27 886 801
1 pixel 5x5 300 30 75 1793 1445

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 203. filter2D Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Filter Size Latency Estimate
Max (ms)
1 pixel 300 3x3 7
1 pixel 300 5x5 7.1
8 pixel 150 3x3 1.86
8 pixel 150 5x5 1.86

Delay

In image processing pipelines, it is possible that the inputs to a function with FIFO interfaces are not synchronized. That is, the first data packet for the first input might arrive a finite number of clock cycles after the first data packet of the second input. If the FIFOs at the interface of the function have insufficient depth, this causes the whole design to stall on hardware. To synchronize the inputs, this function delays the input packet that arrives early by a finite number of clock cycles.

API Syntax

template<int MAXDELAY, int SRC_T, int ROWS, int COLS,int NPC=1 >
void delayMat(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst)

Parameter Descriptions

The table below describes the template and the function parameters.
Parameter Description
SRC_T Input and output pixel type
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8 pixel operation)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
MAXDELAY Maximum delay that the function is to be instantiated for.
_src Input image
_dst Output image

Demosaicing

The Demosaicing function converts the single-plane Bayer pattern output of a digital camera sensor to a color image. This function implements an improved bilinear interpolation technique proposed by Malvar, He, and Cutler.

Figure: Bayer Mosaic for Color Image

The above figure shows the Bayer mosaic for color image capture in single-CCD digital cameras.

API Syntax

template<int BFORMAT, int SRC_T, int DST_T, int ROWS, int COLS, int NPC,bool USE_URAM=false>
void demosaicing(xf::Mat<SRC_T, ROWS, COLS, NPC> &src_mat, xf::Mat<DST_T, ROWS, COLS, NPC> &dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 204. Demosaicing Function Parameter Descriptions
Parameter Description
BFORMAT Input Bayer pattern. XF_BAYER_BG, XF_BAYER_GB, XF_BAYER_GR, and XF_BAYER_RG are the supported values.
SRC_T Input pixel type. 8-bit, unsigned, 1 and 3 channel (XF_8UC1 and XF_8UC3) and 16-bit, unsigned, 1 and 3 channel (XF_16UC1 and XF_16UC3) are supported.
DST_T Output pixel type. 8-bit, unsigned, 4 channel (XF_8UC4) and 16-bit, unsigned, 4 channel (XF_16UC4) are supported.
ROWS Number of rows in the image being processed.
COLS Number of columns in the image being processed. Must be multiple of 8, in case of 8 pixel mode.
NPC Number of pixels to be processed per cycle; single pixel parallelism (XF_NPPC1), two-pixel parallelism (XF_NPPC2) and four-pixel parallelism (XF_NPPC4) are supported. XF_NPPC4 is not supported with XF_16UC1 pixel type.
USE_URAM Enable to map storage structures to UltraRAM.
src_mat Input image
dst_mat Output image

Resource Utilization

The following table shows the resource utilization of the Demosaicing function, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 205. Demosaicing Function Resource Utilization Summary
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP48E FF LUT CLB
1 pixel 300 8 0 1906 1915 412
2 pixel 300 8 0 2876 3209 627
4 pixel 300 8 0 2950 3222 660

The following table shows the resource utilization of the Demosaicing function, generated using SDx 2019.1 version tool for the xczu7ev-ffvc1156-2-e FPGA.

Table 206. Demosaicing Function Resource Utilization Summary with UltraRAM Enabled
Operating Mode Operating Frequency (MHz) Utilization Estimate
BRAM_18K URAM DSP48E FF LUT CLB
1 pixel 300 0 1 0 1366 1339 412

Performance Estimate

The following table shows the performance in different configurations, generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 to process a 4K (3840x2160) image.

Table 207. Demosaicing Function Performance Estimate Summary
Operating Mode Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 27.82
2 pixel operation (300 MHz) 13.9
4 pixel operation (300 MHz, 8-bit image only) 6.95

Dilate

During a dilation operation, the current pixel intensity is replaced by the maximum intensity value in an n x n neighborhood of the current pixel.
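
The neighborhood maximum can be sketched as a plain C++ reference model. This is an illustration only, not the library implementation: `dilate_ref` is a hypothetical name, a rectangular k x k kernel on a row-major single-channel image is assumed, and neighbors outside the image are simply skipped, which may differ from the XF_BORDER_CONSTANT behavior of the library.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference dilation with a k x k rectangular structuring element on a
// row-major, single-channel image. Neighbours outside the image are
// skipped, which is an assumption made for this sketch.
std::vector<std::uint8_t> dilate_ref(const std::vector<std::uint8_t>& src,
                                     int rows, int cols, int k) {
    std::vector<std::uint8_t> dst(src.size());
    const int half = k / 2;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            std::uint8_t best = 0;
            for (int dr = -half; dr <= half; ++dr) {
                for (int dc = -half; dc <= half; ++dc) {
                    const int rr = r + dr, cc = c + dc;
                    if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) continue;
                    best = std::max(best, src[static_cast<std::size_t>(rr) * cols + cc]);
                }
            }
            dst[static_cast<std::size_t>(r) * cols + c] = best;
        }
    }
    return dst;
}
```

A single bright pixel in the middle of a 3x3 image spreads to all nine pixels after one 3x3 dilation.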



API Syntax

template<int BORDER_TYPE, int TYPE, int ROWS, int COLS,int K_SHAPE,int K_ROWS,int K_COLS, int ITERATIONS, int NPC=1>
void dilate (xf::Mat<TYPE, ROWS, COLS, NPC> & _src, xf::Mat<TYPE, ROWS, COLS, NPC> & _dst,unsigned char _kernel[K_ROWS*K_COLS])

Parameter Descriptions

The following table describes the template and the function parameters.

Table 208. dilate Function Parameter Descriptions
Parameter Description
BORDER_TYPE Border Type supported is XF_BORDER_CONSTANT
TYPE Input and Output pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be multiple of 8, for 8-pixel operation)
K_SHAPE Shape of the kernel. The supported kernel shapes are RECT, CROSS, and ELLIPSE.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
K_ROWS Height of the kernel.
K_COLS Width of the kernel.
ITERATIONS Number of times the dilation is applied. Currently supported only for the rectangular kernel shape.
_src Input image
_dst Output image
_kernel Dilation kernel of size K_ROWS * K_COLS.

Resource Utilization

The following table summarizes the resource utilization of the Dilation function with a rectangular structuring element in 1 pixel operation and 8 pixel operation, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA for an HD (1080x1920) image.

Table 209. dilate Function Resource Utilization Summary
Name Resource Utilization
1 pixel per clock operation 8 pixel per clock operation
300 MHz 150 MHz
BRAM_18K 3 6
DSP48E 0 0
FF 411 657
LUT 392 1249
CLB 96 255

The following table summarizes the resource utilization of the Dilation function with a rectangular structuring element in 1 pixel operation, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA for a 4K 3-channel image.

Table 210. dilate Function Resource Utilization Summary
Name Resource Utilization
1 pixel per clock operation
300 MHz
BRAM_18K 18
DSP48E 0
FF 983
LUT 745
CLB 186

Performance Estimate

The following table summarizes a performance estimate of the Dilation function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 211. dilate Function Performance Estimate Summary
Operating Mode Latency Estimate
Min (ms) Max (ms)
1 pixel (300 MHz) 7.0 7.0
8 pixel (150 MHz) 1.87 1.87

Duplicate

When various functions in a pipeline are implemented in programmable logic, FIFOs are instantiated between two functions for dataflow processing. When the output from one function is consumed by two functions in a pipeline, the FIFOs need to be duplicated. This function facilitates the duplication process of the FIFOs.

API Syntax

template<int SRC_T, int ROWS, int COLS,int NPC=1>
void duplicateMat(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src, xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst1,xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst2)

Parameter Descriptions

The table below describes the template and the function parameters.
Parameter Description
SRC_T Input and output pixel type
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle. Possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image
_dst1 Duplicate output for _src
_dst2 Duplicate output for _src

Erode

The erode function finds the minimum pixel intensity in the N x N neighborhood of a pixel and replaces the pixel intensity with that minimum value.
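
The effect of applying the erosion more than once can be sketched with a plain C++ reference model. This is an illustration under stated assumptions, not the library implementation: `erode_ref` is a hypothetical name, the kernel is rectangular, the image is row-major and single-channel, and neighbors outside the image are ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference erosion with a k x k rectangular structuring element, applied
// `iterations` times to a row-major, single-channel image. Neighbours that
// fall outside the image are ignored (an assumption of this sketch).
std::vector<std::uint8_t> erode_ref(std::vector<std::uint8_t> img,
                                    int rows, int cols, int k, int iterations) {
    const int half = k / 2;
    for (int it = 0; it < iterations; ++it) {
        std::vector<std::uint8_t> out(img.size());
        for (int r = 0; r < rows; ++r) {
            for (int c = 0; c < cols; ++c) {
                std::uint8_t lowest = 255;
                for (int dr = -half; dr <= half; ++dr) {
                    for (int dc = -half; dc <= half; ++dc) {
                        const int rr = r + dr, cc = c + dc;
                        if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) continue;
                        lowest = std::min(lowest, img[static_cast<std::size_t>(rr) * cols + cc]);
                    }
                }
                out[static_cast<std::size_t>(r) * cols + c] = lowest;
            }
        }
        img.swap(out);  // the output of one pass feeds the next pass
    }
    return img;
}
```

In a 5x5 image with one dark pixel at the center, one 3x3 pass darkens the central 3x3 block and a second pass darkens the whole image.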



API Syntax

template<int BORDER_TYPE, int TYPE, int ROWS, int COLS,int K_SHAPE,int K_ROWS,int K_COLS, int ITERATIONS, int NPC=1>
void erode (xf::Mat<TYPE, ROWS, COLS, NPC> & _src, xf::Mat<TYPE, ROWS, COLS, NPC> & _dst,unsigned char _kernel[K_ROWS*K_COLS]) 

Parameter Descriptions

The following table describes the template and the function parameters.

Table 212. erode Function Parameter Descriptions
Parameter Description
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
TYPE Input and Output pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be multiple of 8, for 8-pixel operation)
K_SHAPE Shape of the kernel. The supported kernel shapes are RECT, CROSS, and ELLIPSE.
K_ROWS Height of the kernel.
K_COLS Width of the kernel.
ITERATIONS Number of times the erosion is applied. Currently supported only for the rectangular kernel shape.
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src Input image
_dst Output image
_kernel Erosion kernel of size K_ROWS * K_COLS.

Resource Utilization

The following table summarizes the resource utilization of the Erosion function with a rectangular structuring element, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, for a Full HD (1080x1920) image.

Table 213. erode Function Resource Utilization Summary
Name Resource Utilization
1 pixel per clock operation 8 pixel per clock operation
300 MHz 150 MHz
BRAM_18K 3 6
DSP48E 0 0
FF 411 657
LUT 392 1249
CLB 96 255

The following table summarizes the resource utilization of the Erosion function with a rectangular structuring element, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, for a 4K image with 3 channels.

Table 214. erode Function Resource Utilization Summary
Name Resource Utilization
1 pixel per clock operation
300 MHz
BRAM_18K 18
DSP48E 0
FF 983
LUT 3745
CLB 186

Performance Estimate

The following table summarizes a performance estimate of the Erosion function for Normal Operation (1 pixel) and Resource Optimized (8 pixel) configurations, generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA.

Table 215. erode Function Performance Estimate Summary
Operating Mode Latency Estimate
Min (ms) Max (ms)
1 pixel (300 MHz) 7.0 7.0
8 pixel (150 MHz) 1.85 1.85

FAST Corner Detection

Features from accelerated segment test (FAST) is a corner detection algorithm that is faster than most other feature detectors.

The fast function picks a candidate pixel in the image and compares its intensity with that of the 16 pixels on a circle around it, called the Bresenham circle. If 9 contiguous pixels on the circle are all brighter or all darker than the candidate pixel by more than a given threshold, the pixel is declared a corner. Once the corners are detected, non-maximal suppression is applied to remove the weaker corners.

This function can be used for both still images and videos. The corners are marked in the image. If the corner is found in a particular location, that location is marked with 255, otherwise it is zero.
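
The segment test on the 16-pixel circle can be sketched in plain C++ as follows. This is a minimal software illustration, not the library implementation: `is_fast_corner` and `kCircle` are hypothetical names, the circle is assumed to have a radius of 3 pixels, and non-maximal suppression is not included.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Offsets (dx, dy) of the 16 pixels on a radius-3 Bresenham circle, in order.
static const int kCircle[16][2] = {
    {0, -3}, {1, -3}, {2, -2}, {3, -1}, {3, 0},  {3, 1},   {2, 2},   {1, 3},
    {0, 3},  {-1, 3}, {-2, 2}, {-3, 1}, {-3, 0}, {-3, -1}, {-2, -2}, {-1, -3}};

// Segment test at (r, c): true if at least 9 contiguous circle pixels are all
// brighter than centre + threshold or all darker than centre - threshold.
// The caller must keep (r, c) at least 3 pixels away from every border.
bool is_fast_corner(const std::vector<std::uint8_t>& img, int cols,
                    int r, int c, int threshold) {
    const int centre = img[static_cast<std::size_t>(r) * cols + c];
    int run_bright = 0, run_dark = 0;
    // Visit 16 + 8 positions so that runs wrapping past index 15 are counted.
    for (int i = 0; i < 16 + 8; ++i) {
        const int dx = kCircle[i % 16][0], dy = kCircle[i % 16][1];
        const int p = img[static_cast<std::size_t>(r + dy) * cols + (c + dx)];
        run_bright = (p > centre + threshold) ? run_bright + 1 : 0;
        run_dark = (p < centre - threshold) ? run_dark + 1 : 0;
        if (run_bright >= 9 || run_dark >= 9) return true;
    }
    return false;
}
```

A flat 7x7 image produces no corner at its center, while a dark center pixel surrounded by bright pixels passes the test.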

API Syntax

template<int NMS,int SRC_T,int ROWS, int COLS,int NPC=1>
void fast(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_mat,xf::Mat<SRC_T, ROWS, COLS, NPC> & _dst_mat,unsigned char _threshold)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 216. fast Function Parameter Descriptions
Parameter Description
NMS If NMS == 1, non-maximum suppression is applied to detected corners (keypoints). The value should be 0 or 1.
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1)
ROWS Maximum height of input image.
COLS Maximum width of input image (must be a multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src_mat Input image
_dst_mat Output image. The corners are marked in the image.
_threshold Threshold on the intensity difference between the center pixel and its neighbors. Usually it is taken around 20.

Resource Utilization

The following table summarizes the resource utilization of the kernel for different configurations, generated using Vivado HLS 2019.1 for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image with NMS.

Table 217. fast Function Resource Utilization Summary

Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 10 20
DSP48E 0 0
FF 2695 7310
LUT 3792 20956
CLB 769 3519

Performance Estimate

The following table summarizes the performance of kernel for different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image with non-maximum suppression (NMS).

Table 218. fast Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Filter Size Latency Estimate
Max (ms)
1 pixel 300 3x3 7
8 pixel 150 3x3 1.86

Gaussian Filter

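The GaussianBlur function applies Gaussian blur on the input image. Gaussian filtering is done by convolving each point in the input image with a Gaussian kernel.

$$G_0(x, y) = A \exp\left( -\frac{(x - \mu_x)^2}{2\sigma_x^2} - \frac{(y - \mu_y)^2}{2\sigma_y^2} \right)$$

Where A is a normalization constant, $\mu_x$, $\mu_y$ are the mean values and $\sigma_x^2$, $\sigma_y^2$ are the variances in x and y directions respectively. In the GaussianBlur function, values of $\mu_x$, $\mu_y$ are considered as zeroes and the values of $\sigma_x$, $\sigma_y$ are equal.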
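
A Gaussian kernel with zero mean and equal standard deviation in both directions can be generated as in the sketch below. This is an illustration in plain C++, not the way the library derives its coefficients: `gaussian_kernel` is a hypothetical helper, and normalizing the weights so that they sum to 1 is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Builds a size x size Gaussian kernel with zero mean and the same sigma in
// x and y, normalised so that the weights sum to 1. `size` must be odd.
std::vector<float> gaussian_kernel(int size, float sigma) {
    std::vector<float> k(static_cast<std::size_t>(size) * size);
    const int half = size / 2;
    float sum = 0.0f;
    for (int y = -half; y <= half; ++y) {
        for (int x = -half; x <= half; ++x) {
            const float w = std::exp(-(x * x + y * y) / (2.0f * sigma * sigma));
            k[static_cast<std::size_t>(y + half) * size + (x + half)] = w;
            sum += w;
        }
    }
    for (float& w : k) w /= sum;  // normalise to unity gain
    return k;
}
```

The center weight is the largest, and weights at the same distance from the center are equal.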

API Syntax

template<int FILTER_SIZE, int BORDER_TYPE, int SRC_T, int ROWS, int COLS, int NPC =  1>
void GaussianBlur(xf::Mat<SRC_T, ROWS, COLS, NPC> & src, xf::Mat<SRC_T, ROWS, COLS, NPC> & dst, float sigma)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 219. GaussianBlur Function Parameter Descriptions
Parameter Description
FILTER_SIZE Filter size. Filter size of 3 (XF_FILTER_3X3), 5 (XF_FILTER_5X5) and 7 (XF_FILTER_7X7) are supported.
BORDER_TYPE Border type supported is XF_BORDER_CONSTANT
SRC_T Input and Output pixel type. Only 8-bit, unsigned, 1 and 3 channels are supported (XF_8UC1 and XF_8UC3)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
src Input image
dst Output image
sigma Standard deviation of Gaussian filter

Resource Utilization

The following table summarizes the resource utilization of the Gaussian Filter in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 220. GaussianBlur Function Resource Utilization Summary
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT CLB
1 pixel 3x3 300 3 17 3641 2791 610
1 pixel 5x5 300 5 27 4461 3544 764
1 pixel 7x7 250 7 35 4770 4201 894
8 pixel 3x3 150 6 52 3939 3784 814
8 pixel 5x5 150 10 111 5688 5639 1133
8 pixel 7x7 150 14 175 7594 7278 1518

The following table summarizes the resource utilization of the Gaussian Filter in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a 4K 3-channel image.

Table 221. GaussianBlur Function Resource Utilization Summary
Operating Mode Filter Size Operating Frequency (MHz) Utilization Estimate
BRAM_18K DSP_48Es FF LUT
1 pixel 3x3 300 18 33 4835 3472
1 pixel 5x5 300 30 51 5755 3994
1 pixel 7x7 300 42 135 8086 5422

Performance Estimate

The following table summarizes a performance estimate of the Gaussian Filter in different configurations, as generated using Vivado HLS 2019.1 tool for Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 222. GaussianBlur Function Performance Estimate Summary
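Operating Mode Filter Size Latency Estimate
Max Latency (ms)
1 pixel operation (300 MHz) 3x3 7.01
1 pixel operation (300 MHz) 5x5 7.03
1 pixel operation (300 MHz) 7x7 7.06
8 pixel operation (150 MHz) 3x3 1.6
8 pixel operation (150 MHz) 5x5 1.7
8 pixel operation (150 MHz) 7x7 1.74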

Gradient Magnitude

The magnitude function computes the magnitude for the images. The input images are x-gradient and y-gradient images of type 16S. The output image is of same type as the input image.

For L1NORM normalization, the magnitude computed image is the pixel-wise sum of the absolute values of the x-gradient and y-gradient, as shown below:

$$g = |g_x| + |g_y|$$

For L2NORM normalization, the magnitude computed image is as follows:

$$g = \sqrt{g_x^2 + g_y^2}$$

API Syntax

template< int NORM_TYPE ,int SRC_T,int DST_T, int ROWS, int COLS,int NPC=1>
void magnitude(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_matx,xf::Mat<DST_T, ROWS, COLS, NPC> & _src_maty,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 223. magnitude Function Parameter Descriptions
Parameter Description
NORM_TYPE Normalization type can be either L1 or L2 norm. Values are XF_L1NORM or XF_L2NORM
SRC_T Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)
DST_T Output pixel type. Only 16-bit, signed,1 channel is supported (XF_16SC1)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; possible values are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src_matx First input, x-gradient image.
_src_maty Second input, y-gradient image.
_dst_mat Output, magnitude computed image.
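
The L1 and L2 norms described for this function can be sketched per pixel in plain C++. This is an illustration only: `magnitude_l1` and `magnitude_l2` are hypothetical names, truncation of the square root to an integer is an assumption, and saturation to the 16-bit output type is not modeled.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// L1 norm of one gradient pair: |gx| + |gy|.
int magnitude_l1(std::int16_t gx, std::int16_t gy) {
    return std::abs(static_cast<int>(gx)) + std::abs(static_cast<int>(gy));
}

// L2 norm of one gradient pair: sqrt(gx^2 + gy^2), truncated to an integer
// (the truncation is an assumption of this sketch).
int magnitude_l2(std::int16_t gx, std::int16_t gy) {
    const double sq = static_cast<double>(gx) * gx + static_cast<double>(gy) * gy;
    return static_cast<int>(std::sqrt(sq));
}
```

For the gradient pair (3, -4), the L1 norm is 7 and the L2 norm is 5.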

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image and for L2 normalization.

Table 224. magnitude Function Resource Utilization Summary
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 0 0
DSP48E 2 16
FF 707 2002
LUT 774 3666
CLB 172 737

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image and for L2 normalization.

Table 225. magnitude Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Latency Estimate
Max (ms)
1 pixel 300 7.2
8 pixel 150 1.7

Gradient Phase

The phase function computes the polar angles of two images. The input images are x-gradient and y-gradient images of type 16S. The output image is of same type as the input image.

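For radians:

$$\theta = \operatorname{atan2}(g_y, g_x)$$

For degrees:

$$\theta = \operatorname{atan2}(g_y, g_x) \cdot \frac{180}{\pi}$$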

API Syntax

template<int RET_TYPE ,int SRC_T,int DST_T, int ROWS, int COLS,int NPC=1 >
void phase(xf::Mat<SRC_T, ROWS, COLS, NPC> & _src_matx,xf::Mat<DST_T, ROWS, COLS, NPC> & _src_maty,xf::Mat<DST_T, ROWS, COLS, NPC> & _dst_mat)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 226. phase Function Parameter Descriptions
Parameter Description
RET_TYPE Output format can be either in radians or degrees. Options are XF_RADIANS or XF_DEGREES.
  • If the XF_RADIANS option is selected, the phase API returns the result in Q4.12 format. The output range is (0, 2 pi).
  • If the XF_DEGREES option is selected, the phase API returns the result in Q10.6 format, in degrees. The output range is (0, 360).
SRC_T Input pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1).
DST_T Output pixel type. Only 16-bit, signed, 1 channel is supported (XF_16SC1)
ROWS Maximum height of input and output image.
COLS Maximum width of input and output image (must be a multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
_src_matx First input, x-gradient image.
_src_maty Second input, y-gradient image.
_dst_mat Output, phase computed image.

Resource Utilization

The following table summarizes the resource utilization of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

Table 227. phase Function Resource Utilization Summary
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 6 24
DSP48E 6 19
FF 873 2396
LUT 753 3895
CLB 185 832

Performance Estimate

The following table summarizes the performance of the kernel in different configurations, as generated using Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1, to process a grayscale HD (1080x1920) image.

Table 228. phase Function Performance Estimate Summary
Operating Mode Operating Frequency (MHz) Latency Estimate (ms)
1 pixel 300 7.2
8 pixel 150 1.7

Deviation from OpenCV

In phase implementation, the output is returned in a fixed point format. If XF_RADIANS option is selected, phase API will return result in Q4.12 format. The output range is (0, 2 pi). If XF_DEGREES option is selected, phase API will return result in Q10.6 degrees and output range is (0, 360).
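
The two fixed-point output formats can be illustrated with a small host-side sketch. This is not the library implementation: the function names are hypothetical, and round-to-nearest is an assumption. Q4.12 has 12 fractional bits (scale factor 4096) and Q10.6 has 6 fractional bits (scale factor 64).

```cpp
#include <cmath>

// Angle of the gradient (gx, gy) in radians, wrapped into [0, 2*pi).
double phase_radians(int gx, int gy) {
    const double two_pi = 2.0 * std::acos(-1.0);
    double a = std::atan2(static_cast<double>(gy), static_cast<double>(gx));
    if (a < 0.0) a += two_pi;
    return a;
}

// Same angle expressed in degrees, in [0, 360).
double phase_degrees(int gx, int gy) {
    return phase_radians(gx, gy) * 180.0 / std::acos(-1.0);
}

// Q4.12: 12 fractional bits, so the value is scaled by 2^12 = 4096.
int to_q4_12(double radians) {
    return static_cast<int>(std::lround(radians * 4096.0));
}

// Q10.6: 6 fractional bits, so the value is scaled by 2^6 = 64.
int to_q10_6(double degrees) {
    return static_cast<int>(std::lround(degrees * 64.0));
}
```

For a purely vertical gradient (gx = 0, gy = 1), the angle is pi/2, which is 6434 in Q4.12, or 90 degrees, which is 5760 in Q10.6. To convert an output value back to a real angle, divide by 4096 or 64 respectively.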

Harris Corner Detection

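In order to understand Harris Corner Detection, let us consider a grayscale image I. Sweeping a window w(x,y) over the image (with displacements u in the x-direction and v in the y-direction), we calculate the variation of intensity:

$$E(u,v) = \sum_{x,y} w(x,y) \, [I(x+u,y+v) - I(x,y)]^2$$

Where:
  • w(x,y) is the window position at (x,y)
  • I(x,y) is the intensity at (x,y)
  • I(x+u,y+v) is the intensity at the moved window (x+u,y+v).

Since we are looking for windows with corners, we are looking for windows with a large variation in intensity. Hence, we have to maximize the equation above, specifically the term:

$$\sum_{x,y} [I(x+u,y+v) - I(x,y)]^2$$

Using Taylor expansion, where $I_x$ and $I_y$ are the partial derivatives of I in the x and y directions:

$$E(u,v) \approx \sum_{x,y} w(x,y) \, [I(x,y) + u I_x + v I_y - I(x,y)]^2$$

Expanding the equation and cancelling I(x,y) with -I(x,y):

$$E(u,v) \approx \sum_{x,y} w(x,y) \, (u^2 I_x^2 + 2 u v I_x I_y + v^2 I_y^2)$$

The above equation can be expressed in a matrix form as:

$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} \left( \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} \right) \begin{bmatrix} u \\ v \end{bmatrix}$$

So, denoting the summed matrix by M, our equation is now:

$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$$

A score is calculated for each window, to determine if it can possibly contain a corner:

$$R = \det(M) - k \, (\operatorname{trace}(M))^2$$

Where:
  • $\det(M) = \lambda_1 \lambda_2$
  • $\operatorname{trace}(M) = \lambda_1 + \lambda_2$
  • $\lambda_1$ and $\lambda_2$ are the eigenvalues of M.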
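
The corner score R = det(M) - k * trace(M)^2 for a single window can be sketched as follows. This is an illustration in plain C++ with floating-point arithmetic, not the fixed-point implementation of the library. `harris_score` is a hypothetical name, and the arguments a, b, and c stand for the windowed sums of Ix*Ix, Ix*Iy, and Iy*Iy.

```cpp
// Harris response for the 2x2 matrix M = [[a, b], [b, c]], where
// a = sum(Ix*Ix), b = sum(Ix*Iy) and c = sum(Iy*Iy) over the window.
double harris_score(double a, double b, double c, double k) {
    const double det = a * c - b * b;  // product of the eigenvalues
    const double trace = a + c;        // sum of the eigenvalues
    return det - k * trace * trace;
}
```

With k = 0.04, a window with strong gradients in both directions (a = c = 10, b = 0) gives a positive score of about 84, while a window with a gradient in only one direction (a = 10, b = c = 0) gives a negative score, which indicates an edge rather than a corner.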

Non-Maximum Suppression:

In non-maximum suppression (NMS), if radius = 1, then the bounding box is 2*r+1 = 3.

In this case, consider a 3x3 neighborhood around the center pixel. If the center pixel is greater than all of the surrounding pixels, then it is considered a corner. The comparison is made with the surrounding pixels that are within the radius.

Radius = 1

x-1, y-1 x-1, y x-1, y+1
x, y-1 x, y x, y+1
x+1, y-1 x+1, y x+1, y+1

Threshold:

Thresholds of 442, 3109, and 566 are used for 3x3, 5x5, and 7x7 filters respectively. These thresholds were verified over 40 sets of images. The threshold can be varied, based on the application. The corners are marked in the output image. If a corner is found in a particular location, that location is marked with 255, otherwise it is zero.

API Syntax

template<int FILTERSIZE,int BLOCKWIDTH, int NMSRADIUS,int SRC_T,int ROWS, int COLS,int NPC=1,bool USE_URAM=false>
void cornerHarris(xf::Mat<SRC_T, ROWS, COLS, NPC> & src,xf::Mat<SRC_T, ROWS, COLS, NPC> & dst,uint16_t threshold, uint16_t k)
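
The thresholding and non-maximum suppression steps described for this function can be sketched as a plain C++ reference model. This is an illustration under stated assumptions, not the library implementation: `nms_ref` is a hypothetical name, the scores are held in a row-major integer array, the threshold is applied before the neighborhood comparison, and pixels closer than the radius to a border are left at zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Marks (r, c) with 255 if its score exceeds `threshold` and is strictly
// greater than every neighbour within `radius`; all other pixels are 0.
// Pixels closer than `radius` to a border are left at 0 (an assumption).
std::vector<std::uint8_t> nms_ref(const std::vector<int>& score,
                                  int rows, int cols, int radius, int threshold) {
    std::vector<std::uint8_t> out(score.size(), 0);
    for (int r = radius; r < rows - radius; ++r) {
        for (int c = radius; c < cols - radius; ++c) {
            const int centre = score[static_cast<std::size_t>(r) * cols + c];
            bool is_max = centre > threshold;
            for (int dr = -radius; dr <= radius && is_max; ++dr) {
                for (int dc = -radius; dc <= radius; ++dc) {
                    if (dr == 0 && dc == 0) continue;
                    if (score[static_cast<std::size_t>(r + dr) * cols + (c + dc)] >= centre) {
                        is_max = false;
                        break;
                    }
                }
            }
            if (is_max) out[static_cast<std::size_t>(r) * cols + c] = 255;
        }
    }
    return out;
}
```

With radius 1, only the center of a 3x3 score patch can be marked, and only if it exceeds both the threshold and all eight neighbors.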

Parameter Descriptions

The following table describes the template and the function parameters.

Table 229. cornerHarris Function Parameter Descriptions
Parameter Description
FILTERSIZE Size of the Sobel filter. 3, 5, and 7 supported.
BLOCKWIDTH Size of the box filter. 3, 5, and 7 supported.
NMSRADIUS Radius considered for non-maximum suppression. Values supported are 1 and 2.
SRC_T Input pixel type. Only 8-bit, unsigned, 1-channel is supported (XF_8UC1).
ROWS Maximum height of input image.
COLS Maximum width of input image (must be multiple of 8, for 8-pixel operation)
NPC Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1 pixel and 8 pixel operations respectively.
USE_URAM Enable to map some storage structures to URAM
src Input image
dst Output image.
threshold Threshold applied to the corner measure.
k Harris detector parameter

Resource Utilization

The following table summarizes the resource utilization of the Harris corner detection in different configurations, generated using Vivado HLS 2019.1 version tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image.

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=3 and NMS_RADIUS =1.

Table 230. Resource Utilization Summary - For Sobel Filter = 3, Box filter=3 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 33 66
DSP48E 10 80
FF 3254 9330
LUT 3522 13222
CLB 731 2568

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=5 and NMS_RADIUS =1.

Table 231. Resource Utilization Summary - Sobel Filter = 3, Box filter=5 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 45 90
DSP48E 10 80
FF 5455 12459
LUT 5675 24594
CLB 1132 4498

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=7 and NMS_RADIUS =1.

Table 232. Resource Utilization Summary - Sobel Filter = 3, Box filter=7 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 57 114
DSP48E 10 80
FF 8783 16593
LUT 9157 39813
CLB 1757 6809

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=3 and NMS_RADIUS =1.

Table 233. Resource Utilization Summary - Sobel Filter = 5, Box filter=3 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 200 MHz
BRAM_18K 35 70
DSP48E 10 80
FF 4656 11659
LUT 4681 17394
CLB 1005 3277

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=5 and NMS_RADIUS =1.

Table 234. Resource Utilization Summary - Sobel Filter = 5, Box filter=5 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 47 94
DSP48E 10 80
FF 6019 14776
LUT 6337 28795
CLB 1353 5102

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=7 and NMS_RADIUS =1.

Table 235. Resource Utilization Summary - Sobel Filter = 5, Box filter=7 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 59 118
DSP48E 10 80
FF 9388 18913
LUT 9414 43070
CLB 1947 7508

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=3 and NMS_RADIUS =1.

Table 236. Resource Utilization Summary - Sobel Filter = 7, Box filter=3 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 37 74
DSP48E 11 88
FF 6002 13880
LUT 6337 25573
CLB 1327 4868

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=5 and NMS_RADIUS =1.

Table 237. Resource Utilization Summary - Sobel Filter = 7, Box filter=5 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 49 98
DSP48E 11 88
FF 7410 17049
LUT 8076 36509
CLB 1627 6518

The following table summarizes the resource utilization for Sobel Filter = 7, Box filter=7 and NMS_RADIUS =1.

Table 238. Resource Utilization Summary - Sobel Filter = 7, Box filter=7 and NMS_RADIUS =1
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 61 122
DSP48E 11 88
FF 10714 21137
LUT 11500 51331
CLB 2261 8863

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=3 and NMS_RADIUS =2.

Table 239. Resource Utilization Summary - Sobel Filter = 3, Box filter=3 and NMS_RADIUS =2
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 41 82
DSP48E 10 80
FF 5519 10714
LUT 5094 16930
CLB 1076 3127

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=5 and NMS_RADIUS =2.

Table 240. Resource Utilization Summary - Sobel Filter = 3, Box filter=5 and NMS_RADIUS =2
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 53 106
DSP48E 10 80
FF 6798 13844
LUT 6866 28286
CLB 1383 4965

The following table summarizes the resource utilization for Sobel Filter = 3, Box filter=7 and NMS_RADIUS =2.

Table 241. Resource Utilization Summary - Sobel Filter = 3, Box filter=7 and NMS_RADIUS =2
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 65 130
DSP48E 10 80
FF 10137 17977
LUT 10366 43589
CLB 1940 7440

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=3 and NMS_RADIUS =2.

Table 242. Resource Utilization Summary - Sobel Filter = 5, Box filter=3 and NMS_RADIUS =2
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 43 86
DSP48E 10 80
FF 5957 12930
LUT 5987 21187
CLB 1244 3922

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=5 and NMS_RADIUS =2.

Table 243. Resource Utilization Summary - Sobel Filter = 5, Box filter=5 and NMS_RADIUS =2
Name Resource Utilization
1 pixel 8 pixel
300 MHz 150 MHz
BRAM_18K 55 110
DSP48E 10 80
FF 5442 16053
LUT 6561 32377
CLB 1374 5871

The following table summarizes the resource utilization for Sobel Filter = 5, Box filter=7 and NMS_RADIUS =2.

Table 244. Resource Utilization Summary - Sobel