Using Vivado HLS to implement the development process of OpenCV

This paper introduces the image types and function processing methods in OpenCV and, through a design example, describes how to call OpenCV library functions in Vivado HLS to implement several basic steps of image processing, completing the development flow from an OpenCV design to RTL synthesis.

Open Source Computer Vision (OpenCV) is widely used to develop computer vision applications. It contains a library of more than 2500 optimized video and image functions, tuned specifically for desktop processors and GPUs. OpenCV has thousands of users, and OpenCV designs run on the ARM processors in Zynq devices without modification. However, high-definition processing with OpenCV is often limited by external memory: storage bandwidth can become a performance bottleneck, and memory accesses also limit power efficiency. With the VivadoHLS high-level synthesis tool, it is easy to convert an OpenCV C++ video processing design into RTL code, producing a hardware accelerator or implementing real-time video processing directly on an FPGA. At the same time, the Zynq All Programmable SoC is an excellent platform for embedded computer vision applications, overcoming the low video processing performance and high power consumption of a single processor: Zynq's high-performance programmable logic combined with the embedded ARM cores forms a power-optimized, integrated solution.

1 The relationship between the IplImage, CvMat and Mat image types in OpenCV, and the hls::Mat image type in VivadoHLS

The common data containers related to image operations in OpenCV are Mat, CvMat and IplImage. All three types can represent and display images; however, the Mat type focuses on computation and is highly mathematical, and OpenCV optimizes computations on the Mat type. The CvMat and IplImage types are more focused on "images", and OpenCV optimizes the image operations on them (scaling, single-channel extraction, image thresholding operations, etc.). Before OpenCV 2.0, OpenCV was implemented entirely in C, yet the relationship between the IplImage type and the CvMat type resembles inheritance in object-oriented programming. In fact, there is a more abstract base class above CvMat, namely CvArr, which appears frequently in the source code.

1.1 Mat type in OpenCV: matrix type (Matrix).

In OpenCV, Mat is a dense multidimensional array. It can be used to handle common multidimensional vectors and matrices, images, histograms, and similar data.

Mat has 3 important associated functions:

1. Mat mat = imread(const string& filename); read an image

2. imshow(const string& winname, InputArray mat); display an image

3. imwrite(const string& filename, InputArray img); save an image

Compared with the CvMat and IplImage types, the Mat type has stronger matrix operation capabilities and supports common matrix calculations. In computation-intensive applications, converting CvMat and IplImage objects to the Mat type greatly reduces computation time.

1.2 CvMat type and IplImage type in OpenCV: “image” type

In OpenCV, the Mat type as well as the CvMat and IplImage types can all represent and display images; however, the Mat type focuses on computation and is highly mathematical, and OpenCV optimizes Mat computations. The CvMat and IplImage types are more focused on "images", and OpenCV optimizes the image operations on them (scaling, single-channel extraction, image thresholding operations, etc.).

Note: IplImage is derived from CvMat, and CvMat is derived from CvArr, i.e. CvArr -> CvMat -> IplImage.

CvArr is used as a function parameter: whether a CvMat or an IplImage is passed in, it is treated internally as a CvMat.

OpenCV has no dedicated vector data structure; whenever we want to represent a vector, a matrix representation is used instead.

However, the CvMat type is more abstract than the vector concept taught in linear algebra courses; for example, the element type of a CvMat is not limited to basic data types. For example, the following creates a two-dimensional data matrix:

CvMat* cvCreateMat(int rows, int cols, int type);

The type here can be any predefined data type, such as RGB or other multi-channel data. In this way, a CvMat matrix can represent color images.

1.3 IplImage type in OpenCV

In terms of OpenCV's type relationships, the IplImage type can be said to inherit from the CvMat type, adding further members that interpret the data as an image.

The IplImage type has many more parameters than CvMat, such as depth and the number of channels. In a plain matrix type, depth and channels are usually represented together, for example using 32 bits to represent RGB plus alpha. In image processing, however, depth is often handled separately from the number of channels; this is one of OpenCV's optimizations for image representation.

Another IplImage optimization for images is the variable origin. In computer vision processing, one important inconvenience is that the definition of the origin is not uniform: the source of the image, the encoding format, and even the operating system can influence where the origin lies. To compensate, OpenCV lets users define their own origin setting: a value of 0 places the origin in the upper-left corner of the image, and 1 places it in the lower-left corner.
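As a toy illustration of the origin convention (the helper name is invented for this sketch and is not part of OpenCV):

```cpp
#include <cstdint>

// Illustrative only: map a logical row index to a stored row index under the
// IplImage-style `origin` flag (0 = origin at top-left, 1 = origin at
// bottom-left). The function name is made up for this example.
int stored_row(int origin, int y, int height) {
    return (origin == 0) ? y : (height - 1 - y);
}
```

With origin 1, logical row 0 (the first row of the coordinate system) lives at the bottom of the stored buffer.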

1.4 Image data type hls::Mat<> in VivadoHLS

The VivadoHLS video processing library uses the hls::Mat<> data type, which models video pixel stream processing and is essentially equivalent to the hls::stream<> stream type, not the matrix type stored in external memory as in OpenCV. Therefore, when implementing an OpenCV design in HLS, the inputs and outputs of the synthesizable video design must be changed to video stream interfaces; that is, the synthesizable video interface functions provided by HLS are used to convert between AXI4 video streams and the hls::Mat<> type in VivadoHLS.
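To make the distinction concrete, here is a toy C++ stand-in (not the real hls::Mat) that mimics the stream semantics described above: pixels pass through in raster order via << and >>, and there is no random access such as cv::Mat::at():

```cpp
#include <cstdint>
#include <queue>

// A toy model of a pixel stream: pixels enter and leave in raster order.
// Unlike a frame buffer in DDR, a consumed pixel is gone; the structure
// cannot be indexed at an arbitrary (row, col).
struct PixelStream {
    std::queue<uint8_t> fifo;
    PixelStream& operator<<(uint8_t px) { fifo.push(px); return *this; }
    PixelStream& operator>>(uint8_t& px) { px = fifo.front(); fifo.pop(); return *this; }
};
```

Reading a pixel removes it from the stream, which is exactly why the random-access patterns of OpenCV code have to be restructured before synthesis.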

2 The process of using VivadoHLS to implement OpenCV to RTL code conversion

2.1 Tradeoffs in OpenCV Design

OpenCV image processing is built on memory frame buffers: it always assumes that video frame data is stored in external DDR memory. OpenCV therefore performs poorly on local image accesses, because the processor's small caches cannot keep up with this access pattern, and for performance reasons architectures based on OpenCV are more complex and consume more power. At low resolutions or frame rates, or when only selected features or regions of a larger image need processing, OpenCV is sufficient for many applications; but for real-time processing at high resolution and high frame rate, OpenCV struggles to meet the requirements of high performance and low power consumption.

A streaming-based architecture provides high performance and low power consumption: chained image processing functions reduce external memory accesses, and video-optimized line buffers and window buffers, implemented with dataflow optimization in VivadoHLS, are simpler and easier to map onto FPGA parts than processor caches.
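The following self-contained C++ sketch (plain arrays standing in for FPGA line buffers; names invented for illustration) shows the idea: a streamed image is processed with only two lines buffered on chip, never a whole frame:

```cpp
#include <cstdint>
#include <vector>

// For each pixel of a `width`-wide image arriving in raster order, emit the
// sum of that pixel and the two pixels directly above it (missing neighbors
// at the top count as 0). Only two image lines are kept on chip; the full
// frame is never stored, which is the point of the streaming architecture.
std::vector<int> vertical_sum3(const std::vector<uint8_t>& img, int width) {
    std::vector<int> out;
    std::vector<uint8_t> line1(width, 0), line2(width, 0);  // two-line buffer
    int height = static_cast<int>(img.size()) / width;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            uint8_t px = img[y * width + x];
            out.push_back(px + line1[x] + line2[x]);
            line2[x] = line1[x];  // shift the line buffer down one row
            line1[x] = px;
        }
    }
    return out;
}
```

The same pattern generalizes to the window buffers used by 2-D filters: a KxK window needs only K-1 buffered lines, regardless of frame height.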

VivadoHLS support for OpenCV does not mean that OpenCV library functions can be synthesized directly into RTL code; the code must first be converted into synthesizable code. These synthesizable video libraries, known as the HLS video library, are provided with VivadoHLS.

OpenCV functions cannot be synthesized directly by HLS, because they generally rely on dynamic memory allocation and floating point, and assume that images are stored in and modified through external memory.

The VivadoHLS video library replaces many basic OpenCV functions. It has interfaces and algorithms similar to OpenCV's, targets image processing functions implemented in the FPGA fabric, and contains FPGA-specific optimizations, such as fixed-point arithmetic instead of floating point (not necessarily bit-accurate) and on-chip line buffers and window buffers.
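As a hedged illustration of the fixed-point substitution, here is a hypothetical Q8.8 pixel scaler (not an actual HLS video library function):

```cpp
#include <cstdint>

// Scale a pixel by a Q8.8 fixed-point gain instead of a float, the kind of
// substitution the HLS video library makes internally. Results are close to,
// but not necessarily bit-accurate against, the floating-point version.
uint8_t scale_q88(uint8_t px, uint16_t gain_q88) {
    uint32_t y = (uint32_t(px) * gain_q88) >> 8;  // drop the 8 fraction bits
    return y > 255 ? 255 : uint8_t(y);            // saturate like an 8-bit pixel
}
```

A gain of 384 encodes 384/256 = 1.5; the multiply and shift map to cheap FPGA logic where a floating-point unit would not.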

2.2 Introduction of VivadoHLS to implement OpenCV design process

Using VivadoHLS to develop an OpenCV application involves three main steps:

  1. Develop the OpenCV application on a computer. Because it is an open-source design, a C++ compiler is used to compile, simulate and debug it, finally generating an executable. These designs also run as OpenCV applications on the ARM cores without modification.
  2. Use I/O functions to extract the parts to be implemented in the FPGA, and replace the OpenCV function calls with synthesizable VivadoHLS video library functions.
  3. Run HLS to generate the RTL code, start co-simulation in the VivadoHLS project, and reuse the OpenCV test stimulus to verify the generated RTL code. Then perform RTL integration and SoC/FPGA implementation in the ISE or Vivado development environment.

2.2.1 VivadoHLS video library functions

The HLS video library is C++ code contained in the hls namespace; include it with #include "hls_video.h".

It has interfaces similar to, and behavior equivalent to, OpenCV; for example:

OpenCV library: cvScale(src, dst, scale, shift);

HLS video library: hls::Scale<...>(src, dst, scale, shift);

Some constructors have similar or alternative template parameters, for example:

OpenCV library: cv::Mat mat(rows, cols, CV_8UC3);

HLS video library: hls::Mat<ROWS, COLS, HLS_8UC3> mat(rows, cols);

ROWS and COLS specify the maximum image size to be processed.

Table 2.2.1 VivadoHLS video processing function library

2.2.2 Limitations of VivadoHLS implementations of OpenCV designs

First, OpenCV calls must be replaced with HLS video library functions.

Secondly, OpenCV's access to the frame buffer through pointers is not supported; use the VDMA and the AXI Stream adapter functions in HLS instead.

Furthermore, OpenCV's random access is not supported: HLS must replicate any data that is read more than once; see the hls::Duplicate() function.

Finally, OpenCV's in-place updates, such as cvRectangle(img, point1, point2), are not supported.
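Because stream data can be read only once, a design that consumes the same image twice needs a Duplicate-style fan-out. Below is a plain C++ sketch of the idea, with std::queue standing in for hls streams; this is not the real hls::Duplicate implementation:

```cpp
#include <cstdint>
#include <queue>

// Sketch of the fan-out that hls::Duplicate performs: each pixel is read
// once from the input and pushed into two output streams, so that two
// downstream functions can each consume a full copy of the image.
void duplicate_stream(std::queue<uint8_t>& in,
                      std::queue<uint8_t>& out0,
                      std::queue<uint8_t>& out1) {
    while (!in.empty()) {
        uint8_t px = in.front();
        in.pop();
        out0.push(px);
        out1.push(px);
    }
}
```

In the fast-corner example later in this article, the source image is both converted to grayscale and painted with the detected corners, which is exactly this two-consumer situation.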

Table 2.2.2 below lists how random pixel accesses to a frame of image data in OpenCV correspond to the HLS video library.


Operation          OpenCV Library                 HLS Video Library

read operation     pix =, j)    hls_img >> pix
                   pix = cvGet2D(cv_img, i, j)

write operation, j) = pix    hls_img << pix

Table 2.2.2 Corresponding methods for accessing a frame of image pixels in OpenCV and HLS

2.3 Example of implementing an OpenCV application with HLS (fast corner filter image_filter)

We use the fast corners example to illustrate the process of implementing OpenCV with VivadoHLS. First, a fast corner algorithm is designed based on OpenCV and validated with OpenCV-based test-stimulus simulation. Next, the usual OpenCV design is rewritten as a chained, video-stream-based OpenCV processing algorithm; this rewriting makes it compatible with the HLS video library's processing mechanism and eases the function replacement in the following step. Finally, the functions in the rewritten OpenCV design are replaced with the corresponding video functions provided by HLS, synthesized with VivadoHLS, and implemented in the Xilinx development environment. Of course, this synthesizable code can still run on the ARM as well.

2.3.1 OpenCV-based video filter design and test stimulus

In this example, we first design and develop a fast corner filter, opencv_image_filter.cpp, that calls only OpenCV library functions, together with its test stimulus, opencv_image_filter_tb.cpp. The test stimulus simulates and verifies the function of the opencv_image_filter algorithm. The algorithm and test stimulus code are as follows:

void opencv_image_filter(IplImage* src, IplImage* dst)
{
    IplImage* gray = cvCreateImage( cvGetSize(src), 8, 1 );
    std::vector<cv::KeyPoint> keypoints;
    cv::Mat gray_mat(gray, 0);

    cvCvtColor( src, gray, CV_BGR2GRAY );
    cv::FAST( gray_mat, keypoints, 20, true );
    cvCopy( src, dst );

    for (int i = 0; i < keypoints.size(); i++) {
        cvRectangle(dst,
                    cvPoint(keypoints[i].pt.x-1, keypoints[i].pt.y-1),
                    cvPoint(keypoints[i].pt.x+1, keypoints[i].pt.y+1),
                    cvScalar(255,0,0), CV_FILLED);
    }

    cvReleaseImage( &gray );
}


Example The usual OpenCV video processing code opencv_image_filter.cpp

int main(int argc, char** argv)
{
    IplImage* src = cvLoadImage(INPUT_IMAGE);
    IplImage* dst = cvCreateImage(cvGetSize(src), src->depth, src->nChannels);

    opencv_image_filter(src, dst);
    cvSaveImage(OUTPUT_IMAGE_GOLDEN, dst);

    cvReleaseImage(&src);
    cvReleaseImage(&dst);
    return 0;
}


Example OpenCV video processing test stimulus code opencv_image_filter_tb.cpp

The above is an example of calling OpenCV directly to implement a software application on the processor. The algorithm design calls OpenCV library functions directly, while the test stimulus reads in an image, then saves and analyzes the output image after filtering. Note that the algorithm operates on the IplImage type; both the input and output images use this type.

2.3.2 Replacing the OpenCV library with I/O functions and the Vivado HLS video library

Note that the video processing modules commonly used by Xilinx exchange pixel data in different modes on top of the AXI4-Stream protocol, i.e. the AXI4 video interface protocol format. To stay consistent with the Xilinx video library interface protocol, VivadoHLS provides a library of video interface functions used to extract the top-level functions that need RTL synthesis from the OpenCV program and isolate this synthesizable code from the non-synthesizable OpenCV code. Next, the OpenCV functions to be synthesized into RTL are replaced with the synthesizable video functions of corresponding functionality provided by Xilinx VivadoHLS. Finally, the C/C++ compilation environment is used to verify that the OpenCV code and the code after function replacement behave identically, and the code is synthesized in the VivadoHLS development environment, with co-simulation used to verify the generated RTL code.

VivadoHLS synthesizable video interface functions:

hls::AXIvideo2Mat converts an AXI4 video stream to the hls::Mat representation format

hls::Mat2AXIvideo converts hls::Mat data to an AXI4 video stream

First, we rewrite the OpenCV design of 2.3.1. The rewritten code is still based on OpenCV functions; the purpose is to process the video as a video stream, consistent with the processing mechanism provided by the VivadoHLS video library. Below is the alternative way of writing the OpenCV design:

void opencv_image_filter(IplImage* src, IplImage* dst)
{
    IplImage* gray  = cvCreateImage( cvGetSize(src), 8, 1 );
    IplImage* mask  = cvCreateImage( cvGetSize(src), 8, 1 );
    IplImage* dmask = cvCreateImage( cvGetSize(src), 8, 1 );
    std::vector<cv::KeyPoint> keypoints;
    cv::Mat gray_mat(gray, 0);

    cvCvtColor(src, gray, CV_BGR2GRAY );
    cv::FAST(gray_mat, keypoints, 20, true);
    GenMask(mask, keypoints);                 // build a mask image from the keypoints
    cvDilate(mask, dmask);                    // dilate the mask
    cvCopy(src, dst);
    PaintMask(dst, dmask, cvScalar(255,0,0)); // paint the dilated mask onto the output

    cvReleaseImage( &mask );
    cvReleaseImage( &dmask );
    cvReleaseImage( &gray );
}


Example Another OpenCV design application opencv_image_filter.cpp

Second, use the Vivado HLS video library to replace the standard OpenCV functions, and use the synthesizable video interface functions to exchange video data as video streams. The hardware-synthesizable module for the FPGA consists of VivadoHLS video library functions and interface functions: we replace the OpenCV functions with similar functions in the hls namespace, and add interface functions to build AXI4-Stream interfaces.

void image_filter(AXI_STREAM& input, AXI_STREAM& output, int rows, int cols)
{
    //Create AXI streaming interfaces for the core
#pragma HLS RESOURCE variable=input core=AXIS metadata="-bus_bundle INPUT_STREAM"
#pragma HLS RESOURCE variable=output core=AXIS metadata="-bus_bundle OUTPUT_STREAM"
#pragma HLS RESOURCE core=AXI_SLAVE variable=rows metadata="-bus_bundle CONTROL_BUS"
#pragma HLS RESOURCE core=AXI_SLAVE variable=cols metadata="-bus_bundle CONTROL_BUS"
#pragma HLS RESOURCE core=AXI_SLAVE variable=return metadata="-bus_bundle CONTROL_BUS"
#pragma HLS interface ap_stable port=rows
#pragma HLS interface ap_stable port=cols

    hls::Mat<MAX_HEIGHT, MAX_WIDTH, HLS_8UC3> _src(rows, cols);
    hls::Mat<MAX_HEIGHT, MAX_WIDTH, HLS_8UC3> _dst(rows, cols);
#pragma HLS dataflow
    hls::AXIvideo2Mat(input, _src);
    hls::Mat<MAX_HEIGHT, MAX_WIDTH, HLS_8UC3> src0(rows, cols);
    hls::Mat<MAX_HEIGHT, MAX_WIDTH, HLS_8UC3> src1(rows, cols);
#pragma HLS stream depth=20000 variable=src1.data_stream
    hls::Mat<MAX_HEIGHT, MAX_WIDTH, HLS_8UC1> mask(rows, cols);
    hls::Mat<MAX_HEIGHT, MAX_WIDTH, HLS_8UC1> dmask(rows, cols);
    hls::Scalar<3, unsigned char> color(255, 0, 0);

    hls::Duplicate(_src, src0, src1);
    hls::Mat<MAX_HEIGHT, MAX_WIDTH, HLS_8UC1> gray(rows, cols);
    hls::CvtColor<HLS_BGR2GRAY>(src0, gray);   // grayscale conversion
    hls::FASTX(gray, mask, 20, true);          // fast corner detection into a mask
    hls::Dilate(mask, dmask);
    hls::PaintMask(src1, dmask, _dst, color);
    hls::Mat2AXIvideo(_dst, output);
}


Example Design opencv_image_filter.cpp that can be synthesized after replacing with VivadoHLS video library

Finally, synthesize the design in the VivadoHLS development environment, generate the RTL code, and reuse the OpenCV test stimulus to verify the RTL code's functionality.

3 Summary of implementing the OpenCV design flow with VivadoHLS

As the chapters above and the example of implementing an OpenCV design in the VivadoHLS tool show, OpenCV functions enable rapid prototyping of computer vision algorithms, and the VivadoHLS tool can convert them to RTL code to achieve high-resolution, high-frame-rate real-time video processing on an FPGA or Zynq SoC. The inherent heterogeneity of computer vision applications calls for a combined software and hardware solution, and the Vivado HLS video library accelerates the mapping of OpenCV functions to the FPGA programmable fabric.
