
In order to minimize the impact of the memory access, the implementation also deployed a ping-pong DMA buffering scheme shown in Figure 6. In this scheme, the first iDMA descriptor is programmed to fetch source tile 1 from the DDR Memory to the ping buffer located in the local RAM. After the completion of the DMA transaction, the DSP software starts processing the ping buffer for tile 1. In the meantime, the second iDMA descriptor is programmed to fetch source tile 2 from the DDR Memory to the pong buffer located in the local RAM. Since the iDMA performs the second tile fetch in parallel with the DSP processing, it effectively hides the memory access latency.