HOG
The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision for object detection. The feature descriptors produced by this approach are widely used in pedestrian detection.
The technique counts the occurrences of gradient orientations in localized portions of an image. HOG is computed over a dense grid of uniformly spaced cells and normalized over overlapping blocks for improved accuracy. The concept behind HOG is that object appearance and shape within an image can be described by the distribution of intensity gradients or edge directions.
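The per-cell counting step described above can be illustrated with a minimal sketch. It assumes central-difference gradients, zero padding outside the image, unsigned orientations in the range 0 to 180 degrees, and hard assignment of each pixel to a single bin. The function name `cellHistogram` and its signature are hypothetical, and the arithmetic is not claimed to match the library's internal implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

constexpr int kCellSize = 8;  // CELL_HEIGHT and CELL_WIDTH in the fixed configuration
constexpr int kNumBins = 9;   // NOB in the fixed configuration

// Hypothetical helper: accumulates the gradient magnitudes of one 8x8 cell
// into 9 unsigned-orientation bins. `img` is a row-major grayscale image and
// (x0, y0) is the top-left pixel of the cell.
std::array<float, kNumBins> cellHistogram(const std::vector<std::uint8_t>& img,
                                          int width, int height, int x0, int y0) {
    assert(static_cast<int>(img.size()) == width * height);

    // Assumption: pixels outside the image read as zero.
    auto at = [&](int x, int y) -> int {
        if (x < 0 || y < 0 || x >= width || y >= height) return 0;
        return img[y * width + x];
    };

    const float kPi = 3.14159265f;
    std::array<float, kNumBins> hist{};  // all bins start at zero

    for (int y = y0; y < y0 + kCellSize; ++y) {
        for (int x = x0; x < x0 + kCellSize; ++x) {
            // Central differences in x and y.
            const float gx = static_cast<float>(at(x + 1, y) - at(x - 1, y));
            const float gy = static_cast<float>(at(x, y + 1) - at(x, y - 1));
            const float magnitude = std::sqrt(gx * gx + gy * gy);

            // Fold the signed angle into [0, pi] so opposite directions share a bin.
            float angle = std::atan2(gy, gx);
            if (angle < 0.0f) angle += kPi;

            // Hard assignment to one bin; an angle of exactly pi wraps to bin 0.
            const int bin = static_cast<int>(angle / kPi * kNumBins) % kNumBins;
            hist[bin] += magnitude;
        }
    }
    return hist;
}
```

For an image whose intensity increases only along x, every pixel of an interior cell votes into bin 0, because the gradient points along the x axis.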
The function accepts both RGB and gray inputs. In RGB mode, gradients are computed for each plane separately, and the gradient with the highest magnitude is selected. With the configurations provided, the window dimensions are 64x128 (width x height) and the block dimensions are 16x16.
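The plane selection in RGB mode can be sketched as follows. The `Gradient` struct and the function names are hypothetical, and two details are assumptions: the comparison uses the squared magnitude, which gives the same ordering as the magnitude itself, and a tie keeps the earlier plane.

```cpp
#include <array>
#include <cassert>

// Hypothetical container for the x and y gradient of one colour plane.
struct Gradient {
    int gx;
    int gy;
};

// Squared magnitude is sufficient for comparing gradients and avoids a square root.
constexpr int squaredMagnitude(Gradient g) { return g.gx * g.gx + g.gy * g.gy; }

// Returns the per-plane gradient with the largest magnitude.
// Assumption: on a tie, the earlier plane is kept.
Gradient selectDominantGradient(const std::array<Gradient, 3>& planes) {
    Gradient best = planes[0];
    for (int c = 1; c < 3; ++c) {
        if (squaredMagnitude(planes[c]) > squaredMagnitude(best)) {
            best = planes[c];
        }
    }
    return best;
}
```

With per-plane gradients (3, 4), (-10, 0), and (1, 1), the second plane is selected because its squared magnitude of 100 exceeds 25 and 2.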
API Syntax
template<int WIN_HEIGHT, int WIN_WIDTH, int WIN_STRIDE, int BLOCK_HEIGHT, int BLOCK_WIDTH, int CELL_HEIGHT, int CELL_WIDTH, int NOB, int ROWS, int COLS, int SRC_T, int DST_T, int DESC_SIZE, int NPC = XF_NPPC1, int IMG_COLOR, int OUTPUT_VARIANT>
void xFHOGDescriptor(xF::Mat<SRC_T, ROWS, COLS, NPC> &_in_mat,
                     xF::Mat<DST_T, 1, DESC_SIZE, NPC> &_desc_mat);
Parameter Descriptions
The following table describes the template parameters.
| PARAMETERS | DESCRIPTION |
|---|---|
| WIN_HEIGHT | The number of pixel rows in the window. It is fixed at 128. |
| WIN_WIDTH | The number of pixel cols in the window. It is fixed at 64. |
| WIN_STRIDE | The pixel stride between two adjacent windows. It is fixed at 8. |
| BLOCK_HEIGHT | Height of the block. It is fixed at 16. |
| BLOCK_WIDTH | Width of the block. It is fixed at 16. |
| CELL_HEIGHT | Number of rows in a cell. It is fixed at 8. |
| CELL_WIDTH | Number of cols in a cell. It is fixed at 8. |
| NOB | Number of histogram bins for a cell. It is fixed at 9. |
| ROWS | Number of rows in the image being processed. (Should be a multiple of 8) |
| COLS | Number of columns in the image being processed. (Should be a multiple of 8) |
| SRC_T | Input pixel type. Must be either XF_8UC1 or XF_8UC4, for gray and color respectively. |
| DST_T | Output descriptor type. Must be XF_32UC1. |
| DESC_SIZE | The size of the output descriptor. |
| NPC | Number of pixels to be processed per cycle. This function supports only XF_NPPC1 (1 pixel per cycle). |
| IMG_COLOR | The type of the image. Must be either XF_GRAY or XF_RGB. |
| OUTPUT_VARIANT | Must be either XF_HOG_RB or XF_HOG_NRB. |
The following table describes the function parameters.
| PARAMETERS | DESCRIPTION |
|---|---|
| _in_mat | Input image, of xF::Mat type |
| _desc_mat | Output descriptors, of xF::Mat type |
The operating modes referred to in this section are:
- NO is normal operation (single pixel processing).
- RB is repetitive blocks (descriptor data is written window-wise).
- NRB is non-repetitive blocks (descriptor data is written block-wise, in order to reduce the number of writes).
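The difference in write volume between RB and NRB can be estimated with a short sketch. It assumes the fixed configuration from the parameter table and a block stride of one cell (8 pixels), as in Dalal's setup. All function names are hypothetical. The results are counts of feature values, not necessarily the value to pass as DESC_SIZE, since how values are packed into output words is not covered here.

```cpp
#include <cassert>

// Fixed configuration from the parameter table.
constexpr int kWinHeight = 128, kWinWidth = 64, kWinStride = 8;
constexpr int kBlockHeight = 16, kBlockWidth = 16;
constexpr int kCellHeight = 8, kCellWidth = 8;
constexpr int kBins = 9;

// Feature values in one block: 2 x 2 cells, 9 bins each.
constexpr long long valuesPerBlock() {
    return (kBlockHeight / kCellHeight) * (kBlockWidth / kCellWidth) * kBins;
}

// Assumption: blocks advance by one cell (8 pixels) in both directions.
constexpr long long blocksPerWindow() {
    return ((kWinHeight - kBlockHeight) / kCellHeight + 1) *
           ((kWinWidth - kBlockWidth) / kCellWidth + 1);
}

constexpr long long valuesPerWindow() { return blocksPerWindow() * valuesPerBlock(); }

// Number of detection windows in a rows x cols image.
constexpr long long windowsInImage(int rows, int cols) {
    return static_cast<long long>((rows - kWinHeight) / kWinStride + 1) *
           ((cols - kWinWidth) / kWinStride + 1);
}

// Number of distinct blocks in a rows x cols image.
constexpr long long blocksInImage(int rows, int cols) {
    return static_cast<long long>((rows - kBlockHeight) / kCellHeight + 1) *
           ((cols - kBlockWidth) / kCellWidth + 1);
}

// RB: every window writes its full descriptor, so shared blocks are repeated.
constexpr long long rbValueCount(int rows, int cols) {
    return windowsInImage(rows, cols) * valuesPerWindow();
}

// NRB: every block is written exactly once.
constexpr long long nrbValueCount(int rows, int cols) {
    return blocksInImage(rows, cols) * valuesPerBlock();
}
```

Under these assumptions a window holds 105 blocks of 36 values, or 3780 values. For a 1920x1080 image, RB writes 105,688,800 values and NRB writes 1,152,936, roughly a 92-fold difference.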
Resource Utilization
The following table shows the resource utilization of the xFHOGDescriptor function in normal operation (1 pixel) mode, as generated by the Vivado HLS 2017.1 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 part at 300 MHz, to process a 1920x1080 image.
| Resource | NRB, Gray | NRB, RGB | RB, Gray | RB, RGB |
|---|---|---|---|---|
| BRAM_18K | 43 | 49 | 171 | 177 |
| DSP48E | 34 | 46 | 36 | 48 |
| FF | 15365 | 15823 | 15205 | 15663 |
| LUT | 12868 | 13267 | 13443 | 13848 |
Performance Estimate
The following table shows the performance estimates of the xFHOGDescriptor() function for different configurations, as generated by the Vivado HLS 2017.1 tool for the Xilinx Xczu9eg-ffvb1156-1-i-es1 part, to process a 1920x1080 image.
| Operating Mode | Operating Frequency (MHz) | Min Latency Estimate (ms) | Max Latency Estimate (ms) |
|---|---|---|---|
| NRB-Gray | 300 | 6.98 | 8.83 |
| NRB-RGBA | 300 | 6.98 | 8.83 |
| RB-Gray | 300 | 176.81 | 177 |
| RB-RGBA | 300 | 176.81 | 177 |
Deviations from OpenCV
- Border handling
OpenCV uses BORDER_REFLECT_101 for border handling in the gradient computation, in which the border padding is a reflection of the neighboring pixels. The Xilinx implementation instead uses BORDER_CONSTANT (zero padding).
- Gaussian weighting
In OpenCV, Gaussian weights are applied to the pixels over the block: a block has 256 pixels, and each position in the block is multiplied by its corresponding Gaussian weight. The HLS implementation does not perform Gaussian weighting.
- Cell-wise interpolation
The magnitude values of the pixels are distributed across different cells in the blocks, but into the corresponding bins. Pixels in region 1 belong only to their corresponding cells, whereas pixels in regions 2 and 3 are interpolated into the adjacent 2 cells and 4 cells respectively. This operation is not performed in the HLS implementation.
- Output handling
The OpenCV output is in column-major form, whereas the HLS implementation output is in row-major form. Also, the feature vector is in the fixed-point type Q0.16 in the HLS implementation, while in OpenCV it is in floating point.
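When comparing HLS output against OpenCV, the two output-handling differences can be bridged with helpers along the following lines. This is a sketch under two assumptions: each Q0.16 value is an unsigned 16-bit quantity with 16 fractional bits, and the reordering is a plain row-major to column-major index swap. Both function names are hypothetical. How values are packed into the XF_32UC1 output words, and at which granularity the ordering differs, are not specified here, so the index helper is kept generic.

```cpp
#include <cassert>
#include <cstdint>

// Converts an unsigned Q0.16 fixed-point value to float.
// Q0.16 has 16 fractional bits, so the raw integer is scaled by 2^-16.
constexpr float q016ToFloat(std::uint16_t raw) {
    return static_cast<float>(raw) / 65536.0f;
}

// Maps a row-major index in a rows x cols grid to the index of the same
// element in column-major order.
constexpr int rowMajorToColumnMajor(int index, int rows, int cols) {
    return (index % cols) * rows + (index / cols);
}
```

For example, a raw value of 32768 converts to 0.5, and in a grid of 15 rows by 7 columns the row-major index 1 (row 0, column 1) maps to column-major index 15.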
Limitations
- The configurations are limited to Dalal’s implementation.
- Image height and image width must be a multiple of cell height and cell width respectively.
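The image-size limitation can be caught at compile time in user code. The following is a minimal sketch using a hypothetical helper struct, `HogImageSizeCheck`, which is not part of the library. It defaults to the fixed 8x8 cell size.

```cpp
#include <cassert>

// Hypothetical compile-time guard: instantiating it with an image size that
// is not a multiple of the cell size fails to compile.
template <int ROWS, int COLS, int CELL_HEIGHT = 8, int CELL_WIDTH = 8>
struct HogImageSizeCheck {
    static_assert(ROWS % CELL_HEIGHT == 0,
                  "Image height must be a multiple of the cell height");
    static_assert(COLS % CELL_WIDTH == 0,
                  "Image width must be a multiple of the cell width");
    static constexpr bool value = true;
};
```

Referring to `HogImageSizeCheck<1080, 1920>::value` compiles, because both dimensions are multiples of 8, whereas `HogImageSizeCheck<1084, 1920>::value` triggers the first assertion.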