Unlike H.264/AVC, where parallelism was an afterthought, the current HEVC draft contains several proposals aimed at making the codec easier to parallelize. H.264/AVC supports slices, which were introduced mainly to limit quality loss in the case of transmission errors but can also be used to parallelize the decoder. Employing slices for parallelism, however, has several problems. First and foremost, using many slices to increase parallelism incurs significant coding losses. Second, the number of slices is determined by the encoder, so a decoder that relies on slices to obtain real-time performance may not achieve it when it receives a video sequence with only one or a few slices per frame. One of the two parallelization approaches included in HEVC is Wavefront Parallel Processing (WPP). WPP allows creating picture partitions that can be processed in parallel without incurring high coding losses.
Wavefront Parallel Processing (WPP) processes rows of treeblocks in parallel while preserving all coding dependencies. Since a treeblock being processed requires the left, top-left, top, and top-right treeblocks to be available in order for predictions to operate correctly, a shift of at least two treeblocks is enforced between consecutive rows of treeblocks processed in parallel. WPP therefore requires additional inter-core communication compared to Tiles in the non-cross-border filtering mode. Inter-core communication is typically not a burden for today’s multi-core processor architectures, so WPP is suited for both software and hardware implementations. In particular, implementations of WPP are straightforward, since WPP does not affect the ability to perform single-step processing, i.e. entropy coding, predictive coding, and in-loop filtering can all be applied in a single processing step. An example use case for WPP is high-quality streaming over robust channels. In combination with Dependent Slices, this tool can also be used in ultra-low-delay applications.
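The two-treeblock shift between rows can be illustrated with a small scheduling sketch. It is only a toy model, not taken from the HEVC draft or any codec: it assumes one thread per row and one time step per treeblock, and the function name `wavefront_schedule` is made up for this example.

```python
def wavefront_schedule(rows, cols):
    """Earliest time step at which each treeblock can be processed.

    Toy model (assumption): one thread per row, one time step per treeblock.
    """
    step = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            deps = []
            if c > 0:
                deps.append(step[r][c - 1])  # left neighbour in the same row
            if r > 0:
                # Top-right neighbour in the row above (top neighbour in the
                # last column). Rows run left to right, so once it is done
                # the top and top-left neighbours are done as well.
                deps.append(step[r - 1][min(c + 1, cols - 1)])
            step[r][c] = max(deps) + 1 if deps else 0
    return step


if __name__ == "__main__":
    for row in wavefront_schedule(3, 5):
        print(" ".join(f"{t:2d}" for t in row))
    #  0  1  2  3  4
    #  2  3  4  5  6
    #  4  5  6  7  8
```

Each row starts two steps after the row above it, so in this model a picture of 3 rows and 5 columns finishes in 9 steps instead of the 15 needed when the treeblocks are processed one after another.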
Overlapped Wavefront (OWF) allows for overlapping the execution of consecutive pictures using Wavefronts. When a thread has finished a treeblock row in the current picture and no more rows are available, it can start processing the next picture instead of waiting for the current picture to finish.
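The benefit of overlapping can be seen in a small simulation. This is a rough sketch under strong assumptions, not the scheme from the referenced work: every treeblock takes one time step, rows are handed to idle threads in order, and dependencies between pictures (which a real codec must respect for inter prediction) are ignored. The function `simulate` and its parameters are hypothetical.

```python
def simulate(pictures, rows, cols, threads, overlap):
    """Count the time steps needed to process all pictures.

    Toy model (assumption): unit time per treeblock, no inter-picture
    dependencies. With overlap=False a thread idles until the current
    picture is complete; with overlap=True it moves on to the next picture.
    """
    done = [[0] * rows for _ in range(pictures)]  # treeblocks finished per row
    pending = [(p, r) for p in range(pictures) for r in range(rows)]
    active = []  # rows currently owned by a thread
    finished_rows = 0
    steps = 0
    while finished_rows < pictures * rows:
        # Hand out rows, in order, to idle threads.
        while pending and len(active) < threads:
            p, _ = pending[0]
            if not overlap and p > 0 and any(d < cols for d in done[p - 1]):
                break  # previous picture still in flight: thread stays idle
            active.append(pending.pop(0))
        snapshot = [row[:] for row in done]  # state at the start of this step
        for p, r in active:
            c = snapshot[p][r]
            # A treeblock needs its top-right neighbour, so the row above
            # must be at least two treeblocks ahead (or already complete).
            if r == 0 or snapshot[p][r - 1] >= min(c + 2, cols):
                done[p][r] += 1
        still_active = [(p, r) for p, r in active if done[p][r] < cols]
        finished_rows += len(active) - len(still_active)
        active = still_active
        steps += 1
    return steps


if __name__ == "__main__":
    print(simulate(2, 3, 5, 3, overlap=False))  # 18
    print(simulate(2, 3, 5, 3, overlap=True))   # 14
```

With two pictures of 3 rows and 5 columns and three threads, the model needs 18 steps when each picture must finish before the next one starts and 14 steps when idle threads move on to the next picture.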
Reposted from “Wavefronts for HEVC Parallelism”.
A document describing one detailed implementation of WPP is given below:
A Multi-Threaded Full-feature HEVC Encoder Based on Wavefront Parallel Processing
A copy has been saved to Baidu Netdisk.