The combination of tiling and ping-pong DMA produced a speed-up of 15X to 20X when the algorithm is executed on the prototyping FPGA hardware target.
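The tiling and ping-pong pattern can be sketched in plain C as follows. This is only an illustrative sketch, not the Vision DSP or XI library API: the helper names (`dma_fetch_tile`, `dma_store_tile`, `process_tile`, `threshold_tiled`), the image and tile dimensions, and the threshold kernel are all assumptions, and the "DMA" here is a synchronous `memcpy` standing in for an asynchronous 2D DMA engine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sizes only; real tile sizes depend on local memory capacity. */
#define IMG_W  64
#define IMG_H  32
#define TILE_W 16
#define TILE_H 8

/* Hypothetical stand-in for a 2D DMA read: copies one tile from the frame
 * in system memory into a local buffer. On real hardware this transfer
 * would run asynchronously while the core processes the other buffer. */
static void dma_fetch_tile(uint8_t *dst, const uint8_t *frame, int tx, int ty)
{
    for (int r = 0; r < TILE_H; r++)
        memcpy(dst + r * TILE_W,
               frame + (ty * TILE_H + r) * IMG_W + tx * TILE_W,
               TILE_W);
}

/* Hypothetical stand-in for a 2D DMA write of a processed tile. */
static void dma_store_tile(uint8_t *frame, const uint8_t *src, int tx, int ty)
{
    for (int r = 0; r < TILE_H; r++)
        memcpy(frame + (ty * TILE_H + r) * IMG_W + tx * TILE_W,
               src + r * TILE_W,
               TILE_W);
}

/* Example per-tile kernel (an assumption): in-place binary threshold. */
static void process_tile(uint8_t *tile)
{
    for (int i = 0; i < TILE_W * TILE_H; i++)
        tile[i] = (tile[i] > 127) ? 255 : 0;
}

/* Processes the whole frame tile by tile using two alternating buffers. */
void threshold_tiled(const uint8_t *in, uint8_t *out)
{
    static uint8_t buf[2][TILE_W * TILE_H];   /* ping and pong buffers */
    const int tiles_x = IMG_W / TILE_W;
    const int n_tiles = tiles_x * (IMG_H / TILE_H);
    int cur = 0;

    assert(IMG_W % TILE_W == 0 && IMG_H % TILE_H == 0);

    /* Prime the pipeline: fetch the first tile into the "ping" buffer. */
    dma_fetch_tile(buf[cur], in, 0, 0);

    for (int t = 0; t < n_tiles; t++) {
        int next = cur ^ 1;

        /* Start fetching tile t+1 into the idle buffer. With a real
         * asynchronous DMA this overlaps with process_tile() below, and
         * the earlier store from this buffer must have completed first. */
        if (t + 1 < n_tiles)
            dma_fetch_tile(buf[next], in, (t + 1) % tiles_x, (t + 1) / tiles_x);

        process_tile(buf[cur]);
        dma_store_tile(out, buf[cur], t % tiles_x, t / tiles_x);

        cur = next;   /* swap ping and pong */
    }
}
```

In this sketch the transfers are synchronous, so it shows only the buffer management. The benefit on real hardware comes from the DMA moving the next tile while the core works on the current one, which hides memory latency behind computation.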
In this paper, we used the implementation of an ADAS lane-detection algorithm as an example to present the embedded CV software development flow and the challenges that CV algorithm developers face in real-life applications: selecting and implementing a robust algorithm while achieving real-time performance under constrained system resources. We also demonstrated the superior architecture of the Tensilica Vision DSP IP and showed how to use the high-performance XI software library to rapidly port and optimize CV algorithms for embedded hardware targets. The project to develop the ADAS lane-detection algorithm was completed within three months by one software engineer with no prior experience programming a Tensilica Vision DSP. The tasks included application and algorithm research, algorithm prototyping in MATLAB, development of generic functional C code, optimization for the Vision DSP, and demonstration on an FPGA prototyping hardware platform. With built-in support for advanced hardware features such as 2D DMA, and for programming techniques such as tiling and ping-pong buffer management, a highly optimized implementation of the lane-detection algorithm can be demonstrated in real time on a prototyping hardware target that runs at only a fraction of the operating frequency achievable by the Vision DSP in an embedded semiconductor SoC.
About the authors:
Charles Qi is a Senior System Solutions Architect in Cadence’s IP Group, responsible for providing system solutions based on the Cadence Tensilica DSP and Interface IP portfolio.
Han Lin is a Computer Vision Application Software Engineer at Cadence.