This paper reports on verification testing of the coding performance of the screen content coding (SCC) extensions of the High Efficiency Video Coding (HEVC) standard (Rec. ITU-T H.265 | ISO/IEC 23008-2 MPEG-H Part 2). The coding performance of the HEVC screen content model (SCM) reference software is compared with that of the HEVC test model (HM) without the SCC extensions, as well as with the Advanced Video Coding (AVC) joint model (JM) reference software, for both lossy and mathematically lossless compression using All-Intra (AI), Random Access (RA), and Low-delay B (LB) encoding structures and using similar encoding techniques. Video test sequences in 1920×1080 RGB 4:4:4, YCbCr 4:4:4, and YCbCr 4:2:0 colour sampling formats with 8 bits per sample are tested in two categories: “text and graphics with motion” (TGM) and “mixed” content. For lossless coding, the encodings are evaluated in terms of relative bit-rate savings. For lossy compression, subjective testing was conducted at 4 quality levels for each coding case, and the test results are presented through mean opinion score (MOS) curves. The relative coding performance is also evaluated in terms of Bjøntegaard-delta (BD) bit-rate savings for equal PSNR quality. The perceptual tests and objective metric measurements showed a very substantial benefit in coding efficiency for the SCC extensions and provided consistent results with a high degree of confidence. For TGM video, the estimated bit-rate savings ranged from 60–90% relative to the JM and 40–80% relative to the HM, depending on the AI/RA/LB configuration category and colour sampling format.
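The BD bit-rate metric used above summarizes the average bit-rate difference between two rate-PSNR curves at equal quality. As a rough illustration, here is a simplified sketch that interpolates log-rate piecewise-linearly over the overlapping PSNR range (the reference Bjøntegaard method fits a cubic polynomial instead, so exact values differ):

```python
import math

def bd_rate(anchor, test):
    """Approximate BD bit-rate (percent) between two rate-PSNR curves.
    anchor, test: lists of (psnr_db, bitrate) pairs, in any order.
    Negative result => bit-rate savings for `test` over `anchor`.
    Simplified sketch: piecewise-linear interpolation of log10(rate),
    not the cubic fit of the reference method.
    """
    def to_curve(points):
        pts = sorted(points)  # ascending PSNR
        return [p for p, _ in pts], [math.log10(r) for _, r in pts]

    ap, ar = to_curve(anchor)
    tp, tr = to_curve(test)
    lo, hi = max(ap[0], tp[0]), min(ap[-1], tp[-1])  # overlap interval

    def interp(xs, ys, x):
        # piecewise-linear interpolation inside [xs[0], xs[-1]]
        for i in range(len(xs) - 1):
            if xs[i] <= x <= xs[i + 1]:
                t = (x - xs[i]) / (xs[i + 1] - xs[i])
                return ys[i] + t * (ys[i + 1] - ys[i])
        raise ValueError("x outside curve range")

    # trapezoidal integration of the log-rate difference over the overlap
    n = 100
    total = 0.0
    for i in range(n):
        x0 = lo + (hi - lo) * i / n
        x1 = lo + (hi - lo) * (i + 1) / n
        d0 = interp(tp, tr, x0) - interp(ap, ar, x0)
        d1 = interp(tp, tr, x1) - interp(ap, ar, x1)
        total += 0.5 * (d0 + d1) * (x1 - x0)
    avg = total / (hi - lo)
    return (10 ** avg - 1) * 100
```

For example, a codec that halves the bit rate at every PSNR point yields a BD-rate of about -50%.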
The Advanced Display Stream Compression (ADSC) codec project is in development in response to a call for technologies from the Video Electronics Standards Association (VESA). This codec targets visually lossless compression of display streams at a high compression ratio (typically 6 bits/pixel) for mobile/VR/HDR applications. The functionality of the ADSC codec is described in this paper, and subjective trial results are provided using the ISO 29170-2 testing protocol.
KEYWORDS: Clouds, Image compression, Computer programming, Principal component analysis, Raster graphics, 3D scanning, Data modeling, Data compression, Neural networks, RGB color model
Recent advances in point cloud capture and applications in VR/AR have sparked new interest in point cloud data compression. Point clouds are often organized and compressed with octree-based structures. The octree subdivision sequence is typically serialized into a sequence of bytes that are subsequently entropy encoded using range coding, arithmetic coding, or other methods. Such octree-based algorithms are efficient only up to a certain level of detail, as they have an exponential run-time in the number of subdivision levels. In addition, the compression efficiency diminishes as the number of subdivision levels increases. Therefore, in this work we present an alternative enhancement layer to the coarsely octree-coded point cloud. The base layer of the point cloud is coded in the known octree-based fashion, but the higher levels of detail are coded differently, in an enhancement-layer bit-stream. The enhancement-layer coding method takes the distribution of the points into account and projects points onto geometric primitives, i.e. planes. It then stores residuals and applies entropy encoding with a learning-based technique. The plane projection method is used for both geometry compression and color attribute compression. For color coding, the method is used to enable efficient raster scanning of the color attributes on the plane to map them to an image grid. Results show that both improved compression performance and faster run-times are achieved for geometry and color attribute compression of point clouds.
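The octree serialization that forms the base layer above can be sketched in a few lines: each internal node emits one occupancy byte whose bit i is set when child octant i contains at least one point. This is an illustrative depth-first sketch (real coders typically traverse breadth-first and follow the bytes with range/arithmetic coding):

```python
def serialize_octree(points, origin, size, depth):
    """Serialize point positions into one occupancy byte per internal
    octree node, recursing `depth` levels. Illustrative sketch only;
    function name and depth-first order are this example's choices.
    """
    if depth == 0 or not points:
        return []
    half = size / 2.0
    ox, oy, oz = origin
    children = [[] for _ in range(8)]
    for (x, y, z) in points:
        # octant index: bit 0 = x half, bit 1 = y half, bit 2 = z half
        i = (int(x >= ox + half)
             | (int(y >= oy + half) << 1)
             | (int(z >= oz + half) << 2))
        children[i].append((x, y, z))
    out = [sum(1 << i for i in range(8) if children[i])]
    for i in range(8):
        if children[i]:
            child_origin = (ox + (half if i & 1 else 0),
                            oy + (half if i & 2 else 0),
                            oz + (half if i & 4 else 0))
            out += serialize_octree(children[i], child_origin, half, depth - 1)
    return out
```

The exponential growth in node count with subdivision depth, noted in the abstract, is visible here: each extra level can multiply the number of occupied nodes by up to eight.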
Transform coefficient coding in HEVC encompasses the scanning patterns and the coding methods for the last significant coefficient, significance map, coefficient levels and sign data. Unlike H.264/AVC, HEVC has a single entropy coding mode based on the context adaptive binary arithmetic coding (CABAC) engine. Due to this, achieving high throughput for transform coefficient coding was an important design consideration. This paper analyzes the throughput of different components of transform coefficient coding with special emphasis on the explicit coding of the last significant coefficient position and high throughput binarization. A comparison with H.264/AVC transform coefficient coding is also presented, demonstrating that HEVC transform coefficient coding achieves higher average and worst case throughput.
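The high-throughput binarization mentioned above relies heavily on bypass-coded bins. As an illustration, here is a sketch of the Golomb-Rice family used for coefficient-level remainders; it omits HEVC's Exp-Golomb escape for large values, so it is not the exact H.265 binarization:

```python
def rice_binarize(value, k):
    """Golomb-Rice binarization of a non-negative value: a unary prefix
    of (value >> k) terminated by '0', followed by k suffix bits.
    The suffix bits are bypass-coded in CABAC, which is what enables
    high throughput. Simplified: no Exp-Golomb escape for large values.
    """
    prefix = '1' * (value >> k) + '0'
    suffix = format(value & ((1 << k) - 1), '0{}b'.format(k)) if k else ''
    return prefix + suffix

def rice_debinarize(bits, k):
    """Inverse of rice_binarize for a single complete codeword."""
    q = 0
    while bits[q] == '1':
        q += 1
    rem = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return (q << k) | rem
```

For instance, the value 9 with Rice parameter k=2 binarizes to the five bins '11001' (prefix '110', suffix '01').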
This paper describes the video coding technology proposal submitted by Qualcomm Inc. in response to the joint call for proposals (CfP) issued by ITU-T SG16 Q.6 (VCEG) and ISO/IEC JTC1/SC29/WG11 (MPEG) in January 2010. The proposed video codec follows a hybrid coding approach based on temporal prediction, followed by transform, quantization, and entropy coding of the residual. Some of its key features are extended block sizes (up to 64x64), recursive integer transforms, single-pass switched interpolation filters with offsets (single-pass SIFO), mode-dependent directional transforms (MDDT) for intra coding, luma and chroma high-precision filtering, geometry motion partitioning, and adaptive motion vector resolution. It also incorporates internal bit-depth increase (IBDI) and modified quadtree-based adaptive loop filtering (QALF). Simulation results are presented for a variety of bit rates, resolutions, and coding configurations to demonstrate the high compression efficiency achieved by the proposed video codec at a moderate level of encoding and decoding complexity. For the random access hierarchical B (HierB) configuration, the proposed video codec achieves an average BD-rate reduction of 30.88% compared to the H.264/AVC alpha anchor. For the low delay hierarchical P (HierP) configuration, the proposed video codec achieves average BD-rate reductions of 32.96% and 48.57% compared to the H.264/AVC beta and gamma anchors, respectively.
This paper describes the design of transforms for extended block sizes for video coding. The proposed transforms are orthogonal integer transforms, based on a simple recursive factorization structure, and allow very compact and efficient implementations. We discuss the techniques used for finding the integer and scale factors in these transforms, and describe our final design. We evaluate the efficiency of our proposed transforms in VCEG's H.265/JMKTA framework, and show that they achieve nearly identical performance compared to the much more complex transforms in the current test model.
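The recursive factorization idea can be illustrated with the Walsh-Hadamard transform, whose size-2N butterfly structure is built from two size-N transforms. This is only a structural analogy: the paper's actual transforms use different (DCT-like) integer and scale factors.

```python
def wht(x):
    """Recursive Walsh-Hadamard transform of a length-2^k list.
    Each stage is a butterfly (sums and differences of the two halves)
    followed by two half-size transforms, so the operation count is
    O(n log n) instead of O(n^2) for a direct matrix multiply.
    Illustrates the recursive factorization structure only; not the
    transforms proposed in the paper.
    """
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    a = wht([x[i] + x[i + half] for i in range(half)])  # sum branch
    b = wht([x[i] - x[i + half] for i in range(half)])  # difference branch
    return a + b
```

Orthogonality up to scale is easy to check: applying the transform twice returns n times the input.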
KEYWORDS: Quantization, Computer programming, Lithium, Distortion, Video coding, Binary data, Motion estimation, Chemical elements, Video compression, Digital image processing
In this paper, a rate-distortion optimized quantization scheme is described with application to H.264 video encoding. An efficient implementation of H.264 macroblock level adaptive quantization parameter selection is also described. Together these two encoder-only changes can achieve on average over 6% bit rate reduction under common testing conditions that are used in the H.264 standardization community. The described techniques provide this improvement in compression capability while retaining conformance of the encoded data to the H.264 standard. Thus, full compatibility with standard decoders can be achieved when applying these techniques.
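The core idea of rate-distortion optimized quantization is to pick, per coefficient, the level minimizing D + λR rather than simply rounding. A minimal single-coefficient sketch, with a crude stand-in rate model (real RDOQ uses CABAC-based rate estimates and considers coefficient context, neither of which is modeled here):

```python
def rdoq_level(coeff, qstep, lam):
    """Choose the quantized level minimizing distortion + lam * rate
    for one transform coefficient. Candidate levels are zero and the
    two levels bracketing the hard-quantized magnitude. The rate model
    (1 bit for zero, 2*level + 1 bits otherwise) is an illustrative
    assumption, not the H.264 CABAC rate estimate.
    """
    base = int(abs(coeff) / qstep)  # hard-quantized magnitude
    best, best_cost = 0, None
    for level in {0, base, base + 1}:
        dist = (abs(coeff) - level * qstep) ** 2
        rate = 1 if level == 0 else 2 * level + 1
        cost = dist + lam * rate
        if best_cost is None or cost < best_cost:
            best, best_cost = level, cost
    return best if coeff >= 0 else -best
```

With λ = 0 this reduces to nearest-level quantization; as λ grows, levels shrink toward zero, trading distortion for rate.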
In medical imaging, the popularity of image capture modalities such as multislice CT and MRI is resulting in an exponential increase in the amount of volumetric data that needs to be archived and transmitted. At the same time, the increased data is taxing the interpretation capabilities of radiologists. One of the workflow strategies recommended for radiologists to overcome the data overload is the use of volumetric navigation. This allows the radiologist to seek a series of oblique slices through the data. However, it might be inconvenient for a radiologist to wait until all the slices are transferred from the PACS server to a client, such as a diagnostic workstation. To overcome this problem, we propose a client-server architecture based on JPEG2000 and the JPEG2000 Interactive Protocol (JPIP) for rendering oblique slices through 3D volumetric data stored remotely at a server. The client uses the JPIP protocol to obtain JPEG2000 compressed data from the server on an as-needed basis. In JPEG2000, the image pixels are wavelet-transformed and the wavelet coefficients are grouped into precincts. Based on the positioning of the oblique slice, compressed data from only certain precincts is needed to render the slice. The client communicates this information to the server so that the server can transmit only the relevant compressed data. We also discuss the use of caching on the client side for further reduction in bandwidth requirements. Finally, we present simulation results to quantify the bandwidth savings for rendering a series of oblique slices.
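The precinct-selection step can be sketched as a simple grid intersection: given the pixel region an oblique slice touches at one resolution level, list the precincts that overlap it, and request only those. This assumes precinct grids anchored at the image origin (JPEG2000 allows grid offsets, which this sketch ignores):

```python
def precincts_for_region(x0, y0, x1, y1, precinct_w, precinct_h):
    """Return (row, col) indices of every precinct overlapped by the
    half-open pixel region [x0, x1) x [y0, y1) at one resolution level.
    A JPIP client can then request compressed data for exactly these
    precincts instead of the whole codestream. Simplified sketch:
    precinct grid assumed anchored at the image origin.
    """
    cols = range(x0 // precinct_w, (x1 - 1) // precinct_w + 1)
    rows = range(y0 // precinct_h, (y1 - 1) // precinct_h + 1)
    return [(r, c) for r in rows for c in cols]
```

Client-side caching then amounts to requesting only precincts not already present in the cache as the slice orientation changes.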
One of the key properties of the JPEG2000 standard is that it is possible to parse a JPEG2000 bit-stream to extract a lower resolution and/or quality image without having to perform dequantization and requantization. This property is especially useful given the variety of devices with vastly differing bandwidth and display capabilities that can now access the Internet. It is anticipated that a high-resolution JPEG2000-compressed image stored at an image server will be accessed by a variety of clients with differing needs for resolution and image quality. To satisfy the needs of these heterogeneous clients, it is essential that the server have the ability to transcode a JPEG2000 image in an efficient manner with very little loss in image quality. In this paper, we present a number of methods for transcoding a JPEG2000 image and evaluate each with respect to computational complexity and the quality of the transcoded image.
When the same set of compression parameters is used, it is desirable for a compression algorithm to be idempotent under multiple cycles of compression and decompression. However, this condition is generally not satisfied for most images and compression settings of interest. Furthermore, if the image undergoes cropping before recompression, there is a severe degradation in image quality. In this paper we compare the multiple-compression-cycle performance of JPEG and JPEG2000. The performance is compared for different quantization tables (shaped or flat) and a variety of bit rates, with and without cropping. It is shown that, in the absence of clipping errors, it is possible to derive conditions on the quantization tables under which the image is idempotent under repeated compression cycles. Simulation results show that when images have the same mean squared error (MSE) after the first compression cycle, there are situations in which images compressed with JPEG2000 can degrade more rapidly than those compressed with JPEG in subsequent compression cycles. Also, the multiple-compression-cycle performance of JPEG2000 depends on the specific choice of wavelet filters. Finally, we observe that in the presence of cropping, JPEG2000 is clearly superior to JPEG. Also, when it is anticipated that images will be cropped between compression cycles when using JPEG2000, it is recommended that the canvas system be used.
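The idempotence condition has a simple core: if a compression cycle is modeled as uniform scalar quantization of transform coefficients (with lossless transform and entropy coding, and no clipping or rounding between cycles), the second cycle reproduces the first exactly. A minimal sketch of that observation:

```python
def compress_cycle(coeffs, qstep):
    """One compress-decompress cycle modeled as uniform scalar
    quantization of transform coefficients: each value snaps to the
    nearest multiple of qstep. Transform and entropy coding are
    assumed lossless and are omitted, as are the clipping/rounding
    errors that break idempotence in real pixel-domain codecs.
    """
    return [round(v / qstep) * qstep for v in coeffs]
```

Applying the same cycle again changes nothing, since every value is already a multiple of the step; idempotence fails in practice when clipping occurs, when the quantization changes between cycles, or when cropping shifts the block/codeblock grid.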
Spatially varying quantization schemes try to exploit the non-stationary nature of image subbands. One technique for spatially varying quantization is classification based on the AC energy of blocks. Several different methods of subband classification have been proposed in the literature. One method is to optimally classify each subband and send the classification maps as side information. Although image subbands can be shown to be roughly uncorrelated, they are not independent. Naveen and Woods proposed a method in which classification is done based on the AC energy of the block corresponding to the same spatial location, but from the lower frequency band. In their method, inter-subband dependence is exploited to almost completely eliminate side information, albeit at the cost of decreased classification gain. In this paper, we propose a new method of classification based on vector quantization of AC-energy n-tuples formed from the energies of blocks that correspond to the same spatial location in the original image but belong to different subbands. This method allows us to reduce the side information while maximizing the classification gain for each band under the vector constraint. The performance of the new method is compared with that of the other two methods. The comparison is based on conditional entropies as well as actual bit rates.
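The side-information cost of an explicit classification map can be illustrated with a minimal sketch: classify blocks by AC energy against fixed thresholds and measure the zeroth-order entropy of the class map in bits per block. This only shows the baseline scalar scheme; the paper's contribution is to vector-quantize n-tuples of energies across subbands so that one index serves several bands at once.

```python
import math

def classify_blocks(energies, thresholds):
    """Assign each block an energy class (index of the first threshold
    it falls below; thresholds must be sorted ascending) and return the
    class map together with its zeroth-order entropy in bits/block,
    an estimate of the side-information cost of sending the map.
    Illustrative sketch of scalar energy classification only.
    """
    classes = []
    for e in energies:
        c = 0
        while c < len(thresholds) and e >= thresholds[c]:
            c += 1
        classes.append(c)
    n = len(classes)
    entropy = 0.0
    for k in set(classes):
        p = classes.count(k) / n
        entropy -= p * math.log2(p)
    return classes, entropy
```

Reducing this per-band entropy, while keeping the classes well separated in energy, is exactly the trade-off between side information and classification gain discussed above.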