New data formats that include both video and the corresponding depth maps, such as multiview plus depth (MVD), enable new video applications in which intermediate video views (virtual views) can be generated using the transmitted/stored video views (reference views) and the corresponding depth maps as inputs. We propose a depth map coding method based on a new distortion measure, derived from the relationship between distortion in the coded depth map and distortion in the rendered view. In our experiments we use a codec based on H.264/AVC tools, in which the rate-distortion (RD) optimization for depth encoding makes use of the new distortion metric. Our experimental results show the efficiency of the proposed method, with coding gains of up to 1.6 dB in interpolated-frame quality compared to encoding the depth maps with the same coding tools but with RD optimization based on conventional distortion metrics.
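The idea of measuring depth-coding distortion in the rendered view rather than in the depth map itself can be sketched as follows. This is an illustrative approximation, not the paper's exact metric: it assumes a depth-value error displaces the warped pixel horizontally by a proportional disparity error (scale factor `k` is a made-up parameter), and weights that shift by the horizontal texture gradient.

```python
import numpy as np

def rendered_view_distortion(orig_depth, coded_depth, texture, k=0.1):
    """Approximate the view-synthesis distortion caused by depth coding errors.

    Assumption (illustrative): a depth-value error dZ shifts the warped pixel
    horizontally by a disparity error k*dZ, so the rendered-view error is
    roughly the horizontal texture gradient scaled by that shift.
    """
    disparity_err = k * (coded_depth.astype(float) - orig_depth.astype(float))
    grad_x = np.abs(np.gradient(texture.astype(float), axis=1))
    return float(np.sum((grad_x * disparity_err) ** 2))

def rd_cost(orig_depth, coded_depth, texture, rate_bits, lam):
    """Lagrangian cost J = D_rendered + lambda * R for depth mode decision."""
    return rendered_view_distortion(orig_depth, coded_depth, texture) + lam * rate_bits
```

An encoder would evaluate `rd_cost` for each candidate macroblock mode and pick the minimum; conventional RD optimization would instead use the sum of squared differences on the depth map itself, ignoring how depth errors propagate into the synthesized view.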
To facilitate new video applications such as three-dimensional video (3DV) and free-viewpoint video (FVV), the multiview plus depth (MVD) format, which consists of both video views and the corresponding per-pixel depth images, is being investigated. Virtual views can be generated using depth image based rendering (DIBR), which takes video and the corresponding depth images as input. This paper discusses view synthesis techniques based on DIBR, including forward warping, blending and hole filling. In particular, we emphasize the techniques brought to the MPEG view synthesis reference software (VSRS). Unlike in computer graphics, ground-truth depth images for natural content are very difficult to obtain, and the estimated depth images used for view synthesis typically contain various types of noise. Robust synthesis modes that combat such depth errors are also presented in this paper. In addition, we briefly discuss how synthesis techniques can be used, with minor modifications, to generate the occlusion layer information for layered depth video (LDV) data, which is another potential format for 3DV applications.
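The forward-warping and hole-filling steps of DIBR can be sketched on a grayscale image. This is a minimal toy version, not the VSRS implementation: the disparity model (a pixel shifts horizontally by `round(k * depth)`, with `k` a made-up camera-geometry constant) and the nearest-neighbour hole filling are simplifying assumptions.

```python
import numpy as np

def forward_warp(texture, depth, k=0.01):
    """Warp a grayscale texture to a virtual view using per-pixel depth.

    Each pixel shifts horizontally by round(k * depth); when two source
    pixels land on the same target, a z-buffer keeps the nearer one
    (here: the larger value, assuming 8-bit inverse-depth convention).
    """
    h, w = depth.shape
    out = np.zeros_like(texture)
    zbuf = np.full((h, w), -1, dtype=int)
    for y in range(h):
        for x in range(w):
            xv = x + int(round(k * depth[y, x]))
            if 0 <= xv < w and int(depth[y, x]) > zbuf[y, xv]:
                out[y, xv] = texture[y, x]
                zbuf[y, xv] = int(depth[y, x])
    hole = zbuf < 0  # disocclusions: no source pixel mapped here
    return out, hole

def fill_holes(out, hole):
    """Fill disocclusions from the nearest non-hole pixel in the same row
    (a crude stand-in for the background extrapolation used in practice)."""
    filled = out.copy()
    h, w = hole.shape
    for y in range(h):
        for x in range(w):
            if hole[y, x]:
                for d in range(1, w):
                    for xn in (x - d, x + d):
                        if 0 <= xn < w and not hole[y, xn]:
                            filled[y, x] = filled[y, xn]
                            break
                    else:
                        continue
                    break
    return filled
```

A full synthesizer would warp from two reference views, blend the two warped images, and only then fill the remaining holes; the sketch shows the single-view pipeline those steps build on.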
KEYWORDS: Scalable video coding, Computer programming, Quantization, Video coding, Distortion, Video, Signal processing, Signal to noise ratio, Receivers, Image processing
It is well-known that the problem of addressing heterogeneous networks in multicast can be solved by simultaneous transmission of multiple bitstreams of different bitrates (simulcast) and by layered encoding. This paper analyzes the use of H.264/AVC video coding in simulcast and for layered encoding. The sub-sequence feature of H.264/AVC enables hierarchical temporal scalability, which allows disposal of reference pictures from a coded bitstream without affecting the decoding of the remaining stream. In this paper we extend the scope of the H.264/AVC sub-sequence coding technique to quality scalability. The resulting quality scalable coding technique is similar to conventional coarse-granularity quality scalability but fully compatible with the H.264/AVC standard. It is found that the proposed method reduces bitrate consumption in the core network by up to 20% compared to simulcast. However, the bitrate required for enhanced-quality reception of scalably coded bitstreams is considerably higher than that of non-scalable bitstreams.
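The disposability property that sub-sequence coding provides can be illustrated with a small model of a hierarchical GOP. The data layout below (`poc`, `layer`, `refs` fields) is invented for illustration, not taken from the standard; the point is the invariant it checks: dropping all pictures above a chosen layer must leave every remaining picture's references intact.

```python
def drop_layers(pictures, max_layer):
    """Keep only pictures whose sub-sequence layer <= max_layer and verify
    that the remaining stream is still decodable (every reference is kept).

    `pictures`: list of dicts {"poc": int, "layer": int, "refs": [poc, ...]}.
    """
    kept = [p for p in pictures if p["layer"] <= max_layer]
    kept_pocs = {p["poc"] for p in kept}
    for p in kept:
        for r in p["refs"]:
            if r not in kept_pocs:
                raise ValueError(f"picture {p['poc']} references dropped picture {r}")
    return kept
```

In a correctly layered stream, pictures in layer N reference only layers <= N, so truncation at any layer boundary passes the check; the paper's contribution is reusing this standard-compatible mechanism for quality (rather than temporal) layers.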
This paper investigates the transmission of H.264/AVC video in the 3GPP Multimedia Broadcast/Multicast Streaming service (MBMS). Application-layer forward error correction (FEC) codes are used to combat transmission errors in the radio access network. In this FEC protection scheme, the media RTP stream is organized into source blocks spanning many RTP packets, over which FEC repair packets are generated. This paper proposes a novel method for unequal error protection that is applicable in MBMS. The method reduces the expected tune-in delay when a new user joins a broadcast. It is based on four steps. First, temporally scalable H.264/AVC streams are coded, including reference and non-reference pictures or sub-sequences. Second, the constituent pictures of a group of pictures (GOP) are grouped according to their temporal scalability layer. Third, the interleaved packetization mode of RFC 3984 is used to transmit the groups in ascending order of relevance for decoding. As an example, the non-reference pictures of a GOP are sent earlier than the reference pictures of the GOP. Fourth, each group is considered a source block for FEC coding, and the strength of the FEC is selected according to its importance. Simulations show that the proposed method improves the quality of the received video stream and decreases the expected tune-in delay.
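Steps two through four above can be sketched as a simple scheduling function. The packet representation and the per-layer overhead ratios are illustrative assumptions, not values from the MBMS specification; the sketch only captures the grouping, ordering, and unequal-protection logic.

```python
import math

def uep_plan(gop_packets, overhead_by_layer):
    """Group the RTP packets of one GOP into per-layer FEC source blocks and
    schedule them in ascending order of relevance for decoding
    (highest temporal layer, i.e. least important, first).

    `gop_packets`: list of dicts with a "layer" field (temporal layer).
    `overhead_by_layer`: layer -> FEC repair overhead ratio (illustrative).
    """
    blocks = {}
    for pkt in gop_packets:
        blocks.setdefault(pkt["layer"], []).append(pkt)
    plan = []
    for layer in sorted(blocks, reverse=True):  # least important sent first
        k = len(blocks[layer])                               # source packets
        r = math.ceil(k * overhead_by_layer.get(layer, 0.0))  # repair packets
        plan.append({"layer": layer, "source": k, "repair": r})
    return plan
```

Sending the least important block first means that a receiver tuning in mid-transmission is more likely to have received the reference pictures (sent last, just before the next GOP) by the time decoding starts, which is what reduces the expected tune-in delay.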