We introduce a method for unsupervised change detection under non-uniform changes of intensity. A locally adaptive normalizing window is correlated with the two images, and morphological processing is then applied to isolate objects that have been added to or removed from the scene. The multiplicative model used represents image intensity changes well when the locations of the light sources are unchanged between images; when the illuminating source changes location between exposures, the model depends on the geometry of the surroundings. Computer simulations show that the method works well when the model is satisfied. An example of detecting camouflaged targets is presented. In real images in which the light sources have changed, artifacts are introduced in the difference image. Results from images taken with a web camera are shown.
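The multiplicative model lends itself to a compact sketch. The following is only an illustrative implementation of the idea, not the authors' exact method: the window size, threshold, and the choice of a uniform (box) filter for the local normalizing window are all assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter, binary_opening

def detect_changes(img1, img2, win=15, thresh=0.2):
    """Toy change detection under a multiplicative illumination model:
    dividing each image by its local mean (a locally adaptive
    normalizing window) cancels smooth multiplicative intensity
    changes; the normalized difference is thresholded and cleaned
    with a morphological opening to isolate added/removed objects."""
    eps = 1e-6
    n1 = img1.astype(float) / (uniform_filter(img1.astype(float), win) + eps)
    n2 = img2.astype(float) / (uniform_filter(img2.astype(float), win) + eps)
    mask = np.abs(n1 - n2) > thresh
    return binary_opening(mask, structure=np.ones((3, 3)))
```

A global gain change between the two images is cancelled by the normalization, so only a genuinely added or removed object survives the thresholding.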
This paper focuses on simulating image processing algorithms and exploring issues related to reducing high-resolution images to the 25 x 25 pixels suitable for the retinal implant. Field of view (FoV) is explored, and a novel method of virtual eye movement is discussed. Several issues beyond the normal model of human vision are addressed through context-based processing.
An object recognition technique has been developed that allows the rapid screening of multispectral images for objects with known spectral signatures. The technique is based on the configuration of a radial basis neural network (RBN) that is specific to a particular object spectral signature or series of object spectral signatures. The method has been used to identify features in CASI-2 and HYDICE images with results comparable to those of conventional spatial object recognition techniques, with a significant reduction in processing time. Radial basis neural networks have several advantages over the more common backpropagation neural networks, including better selectivity and faster training, resulting in a significant reduction in overall image processing time and greater accuracy.
With the advance of multispectral imaging, image fusion has emerged as a new and important research area. Many studies have considered the advantages of specific fusion methods over the individual input bands in terms of human performance, yet few comparison studies have been conducted to determine which fusion methods are preferable. This paper examines four different fusion methods and compares the performance of human observers viewing fused images in a target detection task. In the presented experiment, we implemented an approach that has not been generally used in the context of image fusion evaluation: the paired comparison technique, used to qualitatively assess and scale the subjective value of the fusion methods. Results indicated that the false color and average methods performed best.
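Paired-comparison data can be turned into an interval scale in several ways; the abstract does not specify the scaling procedure, so the sketch below uses Thurstone's Case V as one standard choice. The win-count matrix layout and the clipping guard are assumptions.

```python
import numpy as np
from statistics import NormalDist

def thurstone_scale(wins):
    """Thurstone Case V scaling of paired-comparison data: wins[i][j]
    counts how often method i was preferred over method j.  Preference
    proportions are converted to z-scores via the inverse normal CDF
    and averaged per row to give each method's scale value."""
    W = np.asarray(wins, dtype=float)
    n = W + W.T                                  # trials per pair
    P = np.where(n > 0, W / np.where(n > 0, n, 1), 0.5)
    np.fill_diagonal(P, 0.5)                     # a method ties with itself
    P = np.clip(P, 0.01, 0.99)                   # guard against 0/1 proportions
    z = np.vectorize(NormalDist().inv_cdf)(P)
    return z.mean(axis=1)                        # higher = more preferred
```

The resulting values place the fusion methods on a common interval scale, so statements like "method A is preferred by a larger margin over B than B is over C" become quantitative.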
Block-based discrete transform domain algorithms are developed to retrieve information from digital image data. Specifically, discrete, real, and circular Fourier transforms of the data blocks are filtered by coefficients chosen in the discrete frequency domain to improve feature detection. In this paper, the proposed approach is applied to improve the identification of edge discontinuities in digital image data.
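As a concrete illustration of block-based transform-domain filtering, the sketch below applies a crude high-pass mask per block. The block size, the use of the complex FFT rather than the paper's real and circular transforms, and the particular coefficient mask are all assumptions made for clarity.

```python
import numpy as np

def block_highpass(img, bs=8, cutoff=1):
    """Transform each block to the discrete frequency domain, zero the
    coefficients nearest DC (a simple per-block high-pass chosen in the
    discrete frequency domain), and invert; the response magnitude
    emphasizes edge discontinuities."""
    h, w = (d - d % bs for d in img.shape)
    out = np.zeros((h, w))
    for i in range(0, h, bs):
        for j in range(0, w, bs):
            F = np.fft.fft2(img[i:i + bs, j:j + bs].astype(float))
            F[:cutoff + 1, :cutoff + 1] = 0   # DC and lowest frequencies
            F[-cutoff:, :cutoff + 1] = 0      # negative-frequency mirrors
            F[:cutoff + 1, -cutoff:] = 0
            F[-cutoff:, -cutoff:] = 0
            out[i:i + bs, j:j + bs] = np.abs(np.fft.ifft2(F))
    return out
```

Flat blocks carry all their energy at DC and are silenced, while blocks containing an edge retain high-frequency energy and respond strongly.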
A class of implementations of mutual information (MI) based image registration estimates MI from the joint histogram of the overlap of two images. The consequence of this approach is that the MI estimate is not overlap invariant: its value tends to increase as the overlapped region shrinks. When the two images are very noisy or so different that the correct MI peak is very weak, this may lead to incorrect registration results under the maximization of mutual information (MMI) criterion. In this paper, we present a new joint histogram estimation scheme for overlap-invariant MI estimation. The idea is to keep the number of samples used for joint histogram estimation constant. When one image is completely within the other, this condition is automatically satisfied. When one image (the floating image) only partially overlaps the other (the reference image) after a geometric transformation is applied, a pixel from the floating image may have no corresponding point in the reference image. In this case, we generate a corresponding point by assuming that its value is a random variable following the intensity distribution of the reference image. In this way, the number of samples used for joint histogram estimation always equals the number of pixels in the floating image. The efficacy of this joint histogram estimation scheme is demonstrated on several pairs of remote sensing images. Our results show that the proposed method produces a mutual information measure that is less sensitive to the size of the overlap, and its peak is more reliable for image registration.
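A minimal sketch of the sampling idea follows, under assumed interfaces: the transformed (nearest-neighbour) coordinates of the floating pixels are precomputed, and intensities are binned directly into the joint histogram.

```python
import numpy as np

def overlap_invariant_mi(float_img, ref_img, mapped_coords, bins=32, rng=None):
    """Sketch of the overlap-invariant MI estimate: every floating-image
    pixel contributes exactly one sample to the joint histogram.  Pixels
    whose mapped coordinates fall outside the reference are paired with
    a value drawn from the reference intensity distribution, so the
    sample count is constant regardless of the overlap size."""
    rng = np.random.default_rng(rng)
    f = float_img.ravel().astype(float)
    h, w = ref_img.shape
    r, c = mapped_coords            # integer row/col coords in the reference
    inside = (r >= 0) & (r < h) & (c >= 0) & (c < w)
    ref_vals = np.empty_like(f)
    ref_vals[inside] = ref_img[r[inside], c[inside]]
    # out-of-overlap pixels: random variable following the reference distribution
    ref_vals[~inside] = rng.choice(ref_img.ravel(), size=int((~inside).sum()))
    joint, _, _ = np.histogram2d(f, ref_vals, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / np.outer(px, py)[nz])).sum())
```

Because the out-of-overlap samples are statistically independent of the floating image, they dilute MI toward zero rather than inflating it as the overlap shrinks.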
In this paper, we present an adaptive algorithm to improve the quality of millimeter-wave video sequences by separating each video frame into foreground and background regions and handling them differently. We separate the foreground from the background using an adaptive Kalman filter. The background is then denoised by both spatial and temporal algorithms. The foreground is denoised by block-based motion-compensated averaging and enhanced by a wavelet-based multi-scale edge representation. Finally, adaptive contrast enhancement is applied to the reconstructed foreground. The experimental results show that our algorithm produces a sequence with a smoother background, reduced noise, an enhanced foreground, and higher contrast in the region of interest.
A low-complexity three-dimensional image compression algorithm based on wavelet transforms and a set-partitioning strategy is presented. The Subband Block Hierarchical Partitioning (SBHP) algorithm is modified and extended to three dimensions, and applied to every code block independently. The resulting algorithm, 3D-SBHP, efficiently encodes 3D image data by exploiting the dependencies in all dimensions, while enabling progressive SNR and resolution decompression and region-of-interest (ROI) access from the same bit stream. The code-block selection method by which random access decoding can be achieved is outlined. The resolution-scalable and random-access performance is empirically investigated. The results show that 3D-SBHP is a good candidate for compressing 3D image data sets for multimedia applications.
Block-based image and video coding systems are used extensively in practice. In low bit-rate applications, however, they suffer from blocking artifacts. Reduction of blocking artifacts can improve visual quality and PSNR. Most methods proposed in the literature to reduce blocking artifacts apply post-processing techniques to the compressed image. One major benefit of such methods is that they require no modification to current encoders. In this paper, we propose an approach in which blocking artifacts are reduced using side information transmitted by the encoder. One major benefit of this approach is the ability to compare the processed image directly with the original undegraded image to improve performance. For example, we could process an image with different methods and transmit the choice of the most effective method as part of the side information. We compare our proposed approach with a post-processing type of system and illustrate that it has the potential to be beneficial in both visual quality and PSNR for some range of coding bit-rates.
In this paper, we propose a new deinterlacing algorithm based on motion estimation and compensation with variable block size. Motion-compensated methods using a fixed block size tend to produce undesirable artifacts when complicated motion and high-frequency components are present. In the proposed algorithm, the initial block size for motion estimation is determined based on the existence of global motion. The block is then further divided depending on block characteristics. Since motion-compensated deinterlacing may not always provide satisfactory results, the proposed method also uses intrafield spatial deinterlacing. Experimental results show that the proposed method provides noticeable improvements over motion-compensated deinterlacing with a fixed block size.
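The intrafield spatial fallback can be as simple as line averaging; the sketch below is one such interpolator (the abstract does not specify the exact spatial method used).

```python
import numpy as np

def intrafield_deinterlace(field, top=True):
    """Reconstruct a full frame from a single field by copying the field
    lines and interpolating each missing line as the average of its two
    vertical neighbours (frame edges fall back to the single neighbour)."""
    h, w = field.shape
    frame = np.zeros((2 * h, w), dtype=float)
    start = 0 if top else 1
    frame[start::2] = field                   # copy the known field lines
    for y in range(1 - start, 2 * h, 2):      # interpolate the missing lines
        above = frame[y - 1] if y > 0 else frame[y + 1]
        below = frame[y + 1] if y + 1 < 2 * h else frame[y - 1]
        frame[y] = (above + below) / 2
    return frame
```

Purely spatial interpolation like this avoids motion artifacts entirely, at the cost of vertical resolution, which is why it serves as the fallback when motion compensation is unreliable.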
The detection of moving objects in complex scenes is the basis of many applications in surveillance, event
detection, and tracking. Complex scenes are difficult to analyze due to camera noise and lighting conditions.
Currently, moving objects are detected primarily using background subtraction algorithms, with block matching
techniques as an alternative. In this paper, we complement our earlier work on the comparison of background
subtraction methods by performing a similar study of block matching techniques. Block matching techniques
first divide a frame of a video into blocks and then determine where each block has moved from in the preceding frame. These techniques are composed of three main components: block determination, which specifies the blocks; search methods, which specify where to look for a match; and matching criteria, which determine when a good match has been found. In our study, we compare various options for each component using publicly available video sequences of a traffic intersection taken under different traffic and weather conditions. Our results indicate that a simple block determination approach is significantly faster with minimal performance reduction, the three-step search method detects more moving objects, and the mean-squared-difference matching criterion provides the best performance overall.
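Two of the preferred components above can be made concrete: a sketch of the three-step search using the mean-squared-difference criterion. The block size, step schedule, and boundary handling are assumptions, not the study's exact settings.

```python
import numpy as np

def msd(a, b):
    """Mean-squared-difference matching criterion."""
    d = a.astype(float) - b.astype(float)
    return (d * d).mean()

def three_step_search(prev, cur, top, left, bs=8, step=4):
    """Three-step search: evaluate the 9 candidates at the current step
    size around the best position so far, halve the step, and repeat.
    Returns the (dy, dx) displacement of the current block's content
    within the previous frame."""
    h, w = prev.shape
    block = cur[top:top + bs, left:left + bs]
    by, bx = 0, 0
    best = msd(prev[top:top + bs, left:left + bs], block)
    while step >= 1:
        cy, cx = by, bx
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                y, x = top + cy + dy, left + cx + dx
                if 0 <= y <= h - bs and 0 <= x <= w - bs:
                    cost = msd(prev[y:y + bs, x:x + bs], block)
                    if cost < best:
                        best, by, bx = cost, cy + dy, cx + dx
        step //= 2
    return by, bx
```

With steps of 4, 2, and 1, the search covers a ±7 pixel range with at most 25 block comparisons instead of the 225 an exhaustive search would need.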
Synthetic Aperture Radar (SAR) techniques employ radar waves to generate high-resolution images in all illumination/weather conditions. The onboard implementation of the image reconstruction algorithms allows for the transmission of real-time video feeds, rather than raw radar data, from unmanned aerial vehicles (UAVs), saving significant communication bandwidth. This in turn saves power, enables longer missions, and allows the transmission of more useful information to the ground. For this application, we created a hardware architecture for a portable implementation of the motion compensation algorithms, which are more computationally intensive than the SAR reconstruction itself, and without which the quality of the SAR images is severely degraded, rendering them unusable.
In this paper, a modified algorithm is introduced to improve the Rice coding algorithm, and image compression with the CDF (2,2) wavelet lifting scheme is studied. Our experiments show that its lossless image compression performance is much better than Huffman, Zip, lossless JPEG, and RAR, and slightly better than (or equal to) the well-known SPIHT: the lossless compression rate is improved by about 60.4%, 45%, 26.2%, 16.7%, and 0.4% on average, respectively. The encoder is about 11.8 times faster than SPIHT's, improving time efficiency by 162%; the decoder is about 12.3 times faster than SPIHT's, improving time efficiency by about 148%. Rather than requiring the largest number of wavelet transform levels, the algorithm achieves high coding efficiency when more than three transform levels are used. For source models with distributions similar to the Laplacian, it improves coding efficiency and realizes progressive transmission for both encoding and decoding.
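For reference, a plain (unmodified) Rice coder is sketched below; it illustrates why such codes suit Laplacian-like wavelet residuals: small values, which dominate, get short codewords. The parameter k and the bit-list representation are illustrative choices.

```python
def rice_encode(values, k):
    """Rice code: each nonnegative value n is split into a quotient
    n >> k, written in unary (q ones then a zero), followed by the
    k-bit binary remainder."""
    bits = []
    for n in values:
        q = n >> k
        bits.extend([1] * q)                                  # unary quotient
        bits.append(0)                                        # terminator
        bits.extend((n >> i) & 1 for i in range(k - 1, -1, -1))  # remainder
    return bits

def rice_decode(bits, k):
    """Inverse of rice_encode for the same parameter k."""
    out, i = [], 0
    while i < len(bits):
        q = 0
        while bits[i] == 1:                                   # read unary part
            q += 1
            i += 1
        i += 1                                                # skip terminator
        r = 0
        for _ in range(k):                                    # read k-bit remainder
            r = (r << 1) | bits[i]
            i += 1
        out.append((q << k) | r)
    return out
```

Signed prediction residuals are typically mapped to nonnegative integers first (e.g. by interleaving positives and negatives) before Rice coding is applied.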
An algorithm combining LZC and arithmetic coding for image compression is presented; both theoretical deduction and simulation results support its correctness and feasibility. Based on the characteristics of context-based adaptive binary arithmetic coding and entropy, LZC was modified to cooperate with an optimized piecewise arithmetic coding. The algorithm improves the compression ratio without any additional time consumption compared to the traditional method.
Replica detection is a prerequisite for discovering copyright infringement and detecting illicit content. For this purpose, content-based systems can be an efficient alternative to watermarking. Rather than imperceptibly embedding a signal, content-based systems rely on content-similarity concepts. Certain content-based systems use adaptive classifiers to detect replicas. In such systems, a suspect content is tested against every original, which can become computationally prohibitive as the number of originals grows. In this paper, we propose an image replica detection approach that hierarchically estimates the partition of the image space where the replicas of an original lie by means of R-trees. Experimental results show that the proposed system achieves high performance. For instance, a fraction of 0.99975 of the test images is filtered out by the system when the test images are unrelated to any of the originals, while only a fraction of 0.02 of the test images is rejected when the test image is a replica of one of the originals.
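The partitioning idea can be pictured through its simplest degenerate case: one axis-aligned bounding box per original in an assumed feature space (a single R-tree leaf, loosely speaking; the actual system builds a full hierarchical R-tree over such rectangles).

```python
import numpy as np

def build_replica_boxes(replica_features):
    """For each original, estimate the region of feature space where its
    replicas lie as an axis-aligned bounding box over training replicas.
    replica_features: one (n_replicas, n_dims) array per original."""
    return [(f.min(axis=0), f.max(axis=0)) for f in replica_features]

def filter_query(q, boxes):
    """Return the indices of originals whose replica region contains the
    query feature vector q; an empty list means the query is filtered
    out as unrelated to every original."""
    return [i for i, (lo, hi) in enumerate(boxes)
            if np.all(q >= lo) and np.all(q <= hi)]
```

Unrelated queries fall outside every box and are rejected with a handful of comparisons, instead of being run through one adaptive classifier per original.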
In this paper, we propose Secure JPEG, an open and flexible standardized framework to secure JPEG images. Its goal is to allow the efficient integration and use of security tools enabling a variety of security services such as confidentiality, integrity verification, source authentication, or conditional access. In other words, Secure JPEG aims to accomplish for JPEG what JPSEC enables for JPEG 2000. We describe three specific examples of security tools in more detail. The first addresses integrity verification, using a hash function to compute local digital signatures. The second considers the use of encryption for confidentiality. Finally, the third describes a scrambling technique.
This paper presents an objective structural distortion measure that reflects the visual similarity between 3D meshes and thus can be used for quality assessment. The proposed tool is not linked to any specific application and can therefore be used to evaluate any kind of 3D mesh processing algorithm (simplification, compression, watermarking, etc.). This measure follows the concept of structural similarity recently introduced for 2D image quality assessment by Wang et al.1 and is based on curvature analysis (mean, standard deviation, covariance) over local windows of the meshes. Evaluation and comparison with geometric metrics are done through a subjective experiment based on human evaluation of a set of distorted objects. A quantitative perceptual metric is also derived from the proposed structural distortion measure for the specific case of watermarking quality assessment, and is compared with recent state-of-the-art algorithms. Both visual and quantitative results demonstrate the robustness of our approach and its strong correlation with subjective ratings.
We consider the mobile service scenario where video programming is broadcast to low-resolution wireless terminals. In such a scenario, broadcasters utilize the simultaneous data services and bi-directional communications capabilities of the terminals in order to offer substantially enriched viewing experiences to users by allowing user participation and user-tuned content. While users immediately benefit from this service when using their phones in mobile environments, the service is less appealing in stationary environments where a regular television provides competing programming at much higher display resolutions. We propose a fast super-resolution technique that allows the mobile terminals to show a much enhanced version of the broadcast video on nearby high-resolution devices, extending the appeal and usefulness of the broadcast service. The proposed single-frame super-resolution algorithm uses recent sparse recovery results to provide high-quality, high-resolution video reconstructions based solely on the individual decoded frames provided by the low-resolution broadcast.
Mobile devices have been transformed from voice communication tools into advanced tools for consuming multimedia content. The extensive use of such mobile devices entails watching multimedia content on small LCD panels. However, most video sequences are captured for normal viewing on standard TV or HDTV and, for cost reasons, are merely resized and delivered without additional editing. This may give small-display viewers an uncomfortable experience in understanding what is happening in a scene. For instance, in a soccer video sequence taken with a long-shot camera technique, the tiny objects (e.g., the soccer ball and players) may not be clearly visible on the small LCD panel. Thus, an intelligent display technique needs to be developed to provide small-display viewers with a better experience. To this end, one of the key technologies is to determine the region of interest (ROI), the part of the scene to which viewers pay more attention than other regions, and display the magnified ROI on the screen. In this paper, which extends our prior work, we focus on soccer video display for mobile devices and propose a fully automatic and computationally efficient method. Instead of taking a generic approach based on visually salient features, we take a domain-specific approach that exploits the attributes of soccer video. The proposed scheme consists of two stages: shot classification and ROI determination. The experimental results show that the proposed scheme offers useful tools for intelligent video display on multimedia mobile devices.
The effect of parameterization combined with joint-layer control for video streaming over wireless IEEE 802.11 networks is presented in this paper. We describe an architecture that provides cross-layer optimization and prioritization of the video stream. The proposed approach allows us to assign a separate parameter set to each priority class independently. We examine the performance of the FEC mechanism available at the application layer for efficient and robust transmission of MPEG-coded video over IEEE 802.11a WLANs. In addition, we develop a joint-layer control policy to reduce bandwidth consumption and maximize received video quality by investigating and dynamically selecting the optimal combination of application-layer FEC, interleaving depth, UDP packet size, and MAC retransmission rate. The performance of the proposed concept is demonstrated through experiments in real wireless networks. The results show that even a small number of parameters in the optimization process has a significant influence on received video quality and resource consumption.
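Application-layer FEC in such a scheme can be as simple as one XOR parity packet per interleaved group, which lets the receiver recover any single loss in the group; real deployments typically use stronger erasure codes, so this sketch only conveys the idea.

```python
def xor_parity(packets):
    """Application-layer FEC sketch: XOR all packets of a group into one
    parity packet (all packets are assumed to have equal length)."""
    parity = bytes(len(packets[0]))
    for p in packets:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return parity

def recover_lost(group, parity):
    """Recover the single missing packet (the None entry in group) by
    XOR-ing the parity packet with every received packet."""
    missing = parity
    for p in group:
        if p is not None:
            missing = bytes(a ^ b for a, b in zip(missing, p))
    return missing
```

The group size and interleaving depth then trade recovery strength against bandwidth overhead and delay, which is exactly the kind of parameter the joint-layer controller tunes.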
H.264/MPEG-4 AVC, finalized in May 2003, is now a well-established video coding standard, and derivative standardization projects are beginning to emerge based on it. The first of these is the so-called Scalable Video Coding (SVC) project. Launched within MPEG about the time AVC was finishing, it was moved in 2005 to the Joint Video Team (JVT), the joint committee of video experts set up by ISO/IEC MPEG and ITU-T VCEG in 2001 to develop AVC. The SVC project aims to develop a fully scalable video codec with the AVC codec as its backbone. While several scalable codecs have been standardized before (i.e., in MPEG-2, H.263, MPEG-4), each has faced barriers to deployment, mainly inadequate performance against single-rate coding. SVC, due out in 2007, appears on the brink of overcoming those barriers and finally bringing scalable coding to fruition. This paper aims at an elementary, general account of its current status, which seems unavailable in the literature.
This paper reconsiders the rate-distortion performance comparison of JPEG2000 with H.264/AVC High Profile I-frame coding for high-definition (HD) video sequences. This work is a follow-on to our paper at SPIE 2005 [14], wherein we further optimize both codecs. It also extends a similar earlier study involving the H.264/AVC Main Profile [2]. Coding simulations are performed on a set of 720p and 1080p HD video sequences that have been commonly used for H.264/AVC standardization work. As expected, our experimental results show that H.264/AVC I-frame coding offers consistent R-D performance gains (around 0.2 to 1 dB in peak signal-to-noise ratio) over JPEG2000 color image coding. As in [1, 2], we do not consider scalability or complexity in this study (JPEG2000 is used in non-scalable but optimal mode).
In this work, a performance evaluation of AVC Intra and JPEG2000 in terms of rate-distortion performance is conducted. A rich set of test sequences with different spatial resolutions is used in this evaluation. Furthermore, the comparison is made with both the Main and High profiles of AVC Intra.

For high spatial resolution sequences, our results show that JPEG2000 is very competitive with AVC High Profile Intra and outperforms the Main Profile. For intermediate and low spatial resolution sequences, JPEG2000 is outperformed by both profiles of AVC Intra.
MPEG LA, LLC offers a joint patent license for the AVC (a/k/a H.264) Standard (ISO/IEC IS 14496-10:2004). Like
MPEG LA's other licenses, the AVC Patent Portfolio License is offered for the convenience of the marketplace as an
alternative enabling users to access essential intellectual property owned by many patent holders under a single license
rather than negotiating licenses with each of them individually.
The AVC Patent Portfolio License includes essential patents owned by DAEWOO Electronics Corporation; Electronics
and Telecommunications Research Institute (ETRI); France Telecom, societe anonyme; Fujitsu Limited; Hitachi, Ltd.;
Koninklijke Philips Electronics N.V.; LG Electronics Inc.; Matsushita Electric Industrial Co., Ltd.; Microsoft
Corporation; Mitsubishi Electric Corporation; Robert Bosch GmbH; Samsung Electronics Co., Ltd.; Sedna Patent
Services, LLC; Sharp Kabushiki Kaisha; Siemens AG; Sony Corporation; The Trustees of Columbia University in the
City of New York; Toshiba Corporation; UB Video Inc.; and Victor Company of Japan, Limited. Another licensor is also expected to join as of August 1, 2006.
MPEG LA's objective is to provide worldwide access to as much AVC essential intellectual property as possible for the
benefit of AVC users. Therefore, any party that believes it has essential patents is welcome to submit them for
evaluation of their essentiality and inclusion in the License if found essential.
We investigate new and efficient methods for coding motion-compensated residues in a hybrid video coder framework that are able to improve upon the performance of the very successful DCT-based adaptive-block-size integer transform used in H.264/AVC.
We use an algorithm based on adaptive-block-size recurrent pattern matching, which encodes each block of motion-compensated predicted data using a scaled pattern stored in an adaptive dictionary. We refer to this algorithm as the Multidimensional Multiscale Parser, or MMP. A video encoding method is presented that uses MMP instead of the integer transform in an H.264/AVC encoder framework. Experimental results show that MMP achieves consistent gains in final average PSNR when the new encoder is compared with H.264/AVC's High profile.
Reduction of the bitrate of video content is necessary in order to satisfy the different constraints imposed by networks and terminals. A fast and elegant solution for bitrate reduction is requantization, which has been successfully applied to MPEG-2 bitstreams. Because of the improved intra prediction in the H.264/AVC specification, existing transcoding techniques are no longer suitable. In this paper we compare requantization transcoders for H.264/AVC bitstreams. The discussion is restricted to intra 4x4 macroblocks, but the same techniques are also applicable to intra 16x16 macroblocks. Besides the open-loop transcoder and the transcoder with mode reuse, two architectures with drift compensation are described, one in the pixel domain and the other in the transform domain. Experimental results show that these architectures approach the quality of the full decode-and-recode architecture for low to medium bitrates. Because of their reduced computational complexity, in particular that of the transform-domain compensation architecture, they are highly suitable for real-time adaptation of video content.
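The open-loop architecture mentioned above can be sketched as a simple scalar requantization of the decoded coefficient levels. The uniform quantizer below is an illustration of the principle only, not the exact H.264/AVC quantization formula:

```python
import numpy as np

def requantize_open_loop(levels, q1, q2):
    """Open-loop requantization: dequantize the incoming coefficient
    levels with step q1, then requantize with a coarser step q2.
    Illustrative uniform quantizer, not the exact H.264/AVC one."""
    coeffs = levels * q1                       # dequantize
    return np.rint(coeffs / q2).astype(int)    # requantize (coarser step)

# Doubling the quantization step roughly halves the coefficient levels.
print(requantize_open_loop(np.array([10, -6, 1]), 2, 4))  # → [ 5 -3  0]
```

Because no decoding loop is closed around the requantizer, the error it introduces accumulates in predicted blocks; that drift is what the compensation architectures correct.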
The resources for evaluation of moving imagery coding include a variety of subjective and objective methods for quality
measurement. These are applied to a variety of imagery, ranging from synthetically-generated to live capture. NIST has
created a family of synthetic motion imagery (MI) materials providing image elements such as moving spirals, blocks,
text, and spinning wheels. Through the addition of a colored noise background, the materials support the generation of
graded levels of MI coding impairments such as image blocking and mosquito noise, impairments that are found in
imagery coded with Moving Picture Experts Group (MPEG) and similar codecs. For typical available synthetic imagery,
human viewers respond unfavorably to repeated viewings; so in this case, the use of objective (computed) metrics for
evaluation of quality is preferred. Three such quality metrics are described: a standard peak-signal-to-noise measure, a
new metric of edge-blurring, and another of added-edge-energy. As applied to the NIST synthetic clips, the metrics
confirm an approximate doubling [1] of compression efficiency between two commercial codecs, one an implementation
of AVC/H.264 and the other of MPEG-2.
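The first of the three metrics, peak signal-to-noise ratio, has the standard form; a minimal version is sketched below (the edge-blurring and added-edge-energy metrics are NIST-specific and not reproduced here):

```python
import numpy as np

def psnr(ref, deg, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    degraded frame: 10*log10(peak^2 / MSE). Standard definition."""
    mse = np.mean((ref.astype(np.float64) - deg.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

Identical frames give infinite PSNR; a uniform error of one grey level on 8-bit imagery gives about 48.13 dB.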
Searching multimedia content for image, audio, and video is getting more attention especially for personal media content due to the affordability of consumer electronic devices such as MP3 recordable players, digital cameras, DV camcorders, and well-integrated smart phones. The precise search and retrieval of the content derived from these devices can be a very challenging task. Many leading edge search engine vendors have been applying sophisticated and advanced indexing and retrieval techniques to various text-based document formats, but when it comes to retrieving multimedia content, searching based on the media clip filename is the most common practice. As a result, there is an imprecise and ineffective user experience for searching multimedia content. This paper presents a new development underway from a joint effort between International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Subcommittee (SC) 29 Working Group (WG) 11 MPEG (Moving Picture Experts Group) and WG1 JPEG (Joint Photographic Experts Group) for a universal standard query format called MPEG-7 Query Format (MP7QF) as a means to enable a good user experience for consumers searching multimedia content. It also provides the industry with a unified way to accept and respond to user queries. This paper presents the core requirements for such a universal query format.
Due to the growing amount of digital audio, an increasing need to automatically categorize music and to create self-controlled and suitable playlists has emerged. A few approaches to this task relying on low-level features have been published so far. Unfortunately, the results utilizing those technologies are not yet sufficient. This paper introduces how to enhance the results with regard to perceptual similarity using different high-level descriptors and a powerful interaction between the algorithm and the user to take the user's preferences into account. A successful interaction between server and client requires a powerful standardized query language. This paper describes the tools of the MPEG-7 Audio standard in detail and gives examples of already established query languages. Furthermore, the requirements of a multimedia query language are identified and its application is exemplified by an automatic audio creation system using a query language.
We propose a smart camera which performs video analysis and generates an MPEG-7 compliant stream. By producing a content-based metadata description of the scene, the MPEG-7 camera extends the capabilities of conventional cameras. The metadata is then directly interpretable by a machine. This is especially helpful in a number of applications such as video surveillance, augmented reality and quality control. As a use case, we describe an algorithm to identify moving objects and produce the corresponding MPEG-7 description. The algorithm runs in real-time on a Matrox Iris P300C camera.
The utility of target shadows for automatic target recognition (ATR) in synthetic aperture radar (SAR)
imagery is investigated. Although target shadow, when available, is not a powerful target discriminating
feature, it can effectively increase the overall accuracy of the target classification when it is combined
with other target discriminating features such as peaks, edges, and corners. A second and more important
utility of target shadow is that it can be used to identify the target pose. Identification of the target pose
before the recognition process reduces the number of reference images used for comparison/matching,
i.e., the training sets, by at least fifty percent. Since the pose detection algorithm is simple to implement and computationally inexpensive, the proposed two-step process, i.e., pose detection followed by matching, considerably reduces the complexity of the overall ATR system.
Close-range mapping applications such as cultural heritage restoration, virtual reality modeling for the entertainment
industry, and anatomical feature recognition for medical activities require 3D data that is usually acquired by high
resolution close-range laser scanners. Since these datasets are typically captured from different viewpoints and/or at
different times, accurate registration is a crucial procedure for 3D modeling of mapped objects. Several registration
techniques are available that work directly with the raw laser points or with extracted features from the point cloud.
Some examples include the commonly known Iterative Closest Point (ICP) algorithm and a recently proposed technique
based on matching spin-images. This research focuses on developing a surface matching algorithm that is based on the
Modified Iterated Hough Transform (MIHT) and ICP to register 3D data. The proposed algorithm works directly with
the raw 3D laser points and does not assume point-to-point correspondence between two laser scans. The algorithm can simultaneously establish correspondence between two surfaces and estimate the transformation parameters relating them. An experiment with two partially overlapping laser scans of a small object was performed with the proposed algorithm and showed successful registration. A high quality of fit between the two scans is achieved, with an improvement over the results obtained using the spin-image technique. The results demonstrate the feasibility of the
proposed algorithm for registering 3D laser scanning data in close-range mapping applications to help with the
generation of complete 3D models.
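The classic point-to-point ICP iteration that the proposed method builds on can be sketched as nearest-neighbour matching followed by a closed-form rigid fit via SVD (Kabsch). This is a minimal sketch of the generic ICP step only; the MIHT coarse-alignment stage is not shown:

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration: match each source point to its nearest
    destination point, then solve the least-squares rigid transform."""
    # brute-force nearest neighbour in dst for each src point
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    # best-fit rotation/translation via SVD of the cross-covariance
    mu_s, mu_m = src.mean(0), matched.mean(0)
    H = (src - mu_s).T @ (matched - mu_m)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_m - R @ mu_s
    return src @ R.T + t, R, t
```

Iterating this step to convergence is the conventional ICP whose runtime the paper's coarse registration and data-reduction steps are designed to cut.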
We present a fractal measure based pattern classification algorithm for automatic feature extraction and identification of fungus associated with an infection of the cornea of the eye. A white-light confocal microscope image
of suspected fungus exhibited locally linear and branching structures. The pixel intensity variation across the
width of a fungal element was Gaussian. Linear features were extracted using a set of 2D directional matched
Gaussian filters. Portions of fungus profiles that were not in the same focal plane appeared relatively blurred. We
use Gaussian filters with standard deviation slightly larger than the width of a fungus to reduce discontinuities. Cell
nuclei of the cornea and nerves also exhibited locally linear structure. Cell nuclei were excluded by their relatively
shorter lengths. Nerves in the cornea exhibited less branching compared with the fungus. Fractal dimensions
of the locally linear features were computed using a box-counting method. A set of corneal images with fungal
infection was used to generate class-conditional fractal measure distributions of fungus and nerves. The a priori
class-conditional densities were built using an adaptive-mixtures method to reflect the true nature of the feature
distributions and improve the classification accuracy. A maximum-likelihood classifier was used to classify
the linear features extracted from test corneal images as 'normal' or 'with fungal infiltrates', using the a priori
fractal measure distributions. We demonstrate the algorithm on the corneal images with culture-positive fungal
infiltrates. The algorithm is fully automatic and will help diagnose fungal keratitis by generating a diagnostic
mask of locations of the fungal infiltrates.
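The box-counting step mentioned above can be sketched as counting occupied boxes at several scales and fitting the slope of log N(s) against log(1/s); the scale set below is illustrative:

```python
import numpy as np

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Box-counting estimate of the fractal dimension of a binary
    feature mask: count boxes containing any set pixel at each box
    size s and fit the slope of log N(s) versus log(1/s)."""
    counts = []
    for s in sizes:
        h, w = (d - d % s for d in mask.shape)   # trim to a multiple of s
        boxes = mask[:h, :w].reshape(h // s, s, w // s, s).any(axis=(1, 3))
        counts.append(boxes.sum())
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope
```

A straight line yields a dimension near 1 and a filled region near 2; branching fungal structures fall in between, which is what separates them from the less-branching nerves.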
Plastic-Bonded Explosives (PBXs) are a newer generation of explosive compositions developed at Los Alamos National Laboratory (LANL). Understanding the micromechanical behavior of these materials is critical. The size of the crystal particles and the porosity within the PBX influence its shock sensitivity. Current methods to characterize the prominent structural characteristics include manual examination by scientists and attempts to use commercially available image processing packages. Both methods are time-consuming and tedious. LANL personnel, recognizing this as a manually intensive process, have worked with the Kansas City Plant / Kirtland Operations to develop a system which utilizes image processing and pattern recognition techniques to characterize PBX material. System hardware consists of a CCD camera, zoom lens, two-dimensional motorized stage, and coaxial, cross-polarized light. System integration of this hardware with the custom software is at the core of the machine vision system. Fundamental processing steps involve capturing images from the PBX specimen and extraction of void, crystal, and binder regions. For crystal extraction, a quadtree decomposition segmentation technique is employed. Benefits of this system include: (1) reduction of the overall characterization time; (2) a process which is quantifiable and repeatable; (3) utilization of personnel for intelligent review rather than manual processing; and (4) significantly enhanced characterization accuracy.
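The quadtree decomposition used for crystal extraction can be sketched as a recursive split on intensity homogeneity; the intensity-range threshold below is an illustrative splitting criterion, not the system's exact one:

```python
import numpy as np

def quadtree(img, x=0, y=0, size=None, thresh=10, leaves=None):
    """Recursive quadtree decomposition: keep splitting a square block
    into four quadrants while its intensity range exceeds `thresh`.
    Returns the homogeneous leaf blocks as (x, y, size) tuples."""
    if size is None:
        size = img.shape[0]
    if leaves is None:
        leaves = []
    block = img[y:y + size, x:x + size]
    if size == 1 or int(block.max()) - int(block.min()) <= thresh:
        leaves.append((x, y, size))          # homogeneous: stop here
    else:
        h = size // 2
        for dx, dy in ((0, 0), (h, 0), (0, h), (h, h)):
            quadtree(img, x + dx, y + dy, h, thresh, leaves)
    return leaves
```

Crystal regions then correspond to clusters of leaf blocks sharing similar intensity statistics.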
A machine capable of digitizing two 8 inch by 10 inch (203 mm by 254 mm) glass astrophotographic plates or a single
14 inch by 17 inch (356 mm by 432 mm) plate at a resolution of 11 μm per pixel or 2309 dots per inch (dpi) in 92
seconds is described. The purpose of the machine is to digitize the ~500,000 plate collection of the Harvard College
Observatory in a five-year time frame. The digitization must meet the requirements for scientific work in astrometry,
photometry, and archival preservation of the plates. This paper describes the requirements for and the design of the
subsystems of the machine that was developed specifically for this task.
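The stated resolution figures are internally consistent, as a quick check shows: 11 μm per pixel corresponds to 25.4 mm / 11 μm ≈ 2309 dpi, and an 8 x 10 inch plate then spans roughly 426 megapixels (illustrative arithmetic only):

```python
# Consistency check on the scanner's stated resolution.
MICRONS_PER_INCH = 25_400        # 1 inch = 25.4 mm
pitch_um = 11                    # stated pixel pitch

dpi = MICRONS_PER_INCH / pitch_um
print(round(dpi))                # → 2309

# An 8 x 10 inch plate (203 mm x 254 mm) at this pitch:
pixels = (203_000 // pitch_um) * (254_000 // pitch_um)
print(round(pixels / 1e6))       # megapixels → 426
```

At two such plates per 92-second cycle, the machine digitizes on the order of 9 megapixels per second, which is what makes the ~500,000-plate collection tractable in five years.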
Recent developments in digital cameras in terms of an increase in the size of the charge-coupled device and complementary metal oxide semiconductor arrays, as well as a reduction in costs, are leading to their use for traditional and new photogrammetric, surveying, and mapping functions. Such usage should be preceded by careful calibration of the implemented cameras in order to determine their interior orientation parameters. In addition, the wide diversity of expected users mandates the development of a convenient calibration procedure that does not require professional photogrammetrists and/or surveyors. This paper introduces a methodology for calibrating medium-format digital cameras using a test field consisting of straight lines and a few signalized point targets. A framework for the automatic extraction of the linear features and the point targets from the images, and for their incorporation into the calibration procedure, is presented and tested. In addition, the research introduces an approach for testing camera stability, in which the degree of similarity between the bundles reconstructed from two sets of interior orientation parameters is quantitatively evaluated. Experimental results with real data proved the feasibility of the line-based self-calibration approach. In addition, the analysis of the internal characteristics of the utilized camera estimated from various calibration sessions revealed the camera's stability over a long period.
Myocardial motion analysis and quantification is of utmost importance for analyzing contractile heart abnormalities, which can be a symptom of coronary artery disease. A fundamental problem in processing sequences of images is the computation of the optical flow, which is an approximation to the real image motion. This paper presents a new algorithm for optical flow estimation based on a spatiotemporal-frequency (STF) approach, more specifically on the computation of the Wigner-Ville distribution (WVD) and the Hough transform (HT) of the motion sequences. The latter is a well-known line and shape detection method that is very robust against incomplete data and noise. The rationale for using the HT in this context is that it provides a value of the displacement field from the STF representation. In addition, a
probabilistic approach based on Gaussian mixtures has been implemented in order to improve the accuracy of the motion
detection. Experimental results with synthetic sequences are compared against an implementation of the variational
technique for local and global motion estimation, where it is shown that the results obtained here are accurate and robust
to noise degradations. Real cardiac magnetic resonance images have been tested and evaluated with the current method.
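The HT voting step can be sketched as a (ρ, θ) accumulator over candidate points; the code below is the generic line-parameter form of the transform, not the full WVD-based displacement estimator, and the grid sizes are illustrative:

```python
import numpy as np

def hough_peak(points, n_theta=180, n_rho=200, rho_max=10.0):
    """Minimal Hough transform: each point votes for every line
    rho = x*cos(theta) + y*sin(theta) through it; the accumulator
    maximum gives the dominant line's (rho, theta)."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.rint((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        ok = (idx >= 0) & (idx < n_rho)
        acc[idx[ok], np.nonzero(ok)[0]] += 1
    r, t = np.unravel_index(acc.argmax(), acc.shape)
    return r / (n_rho - 1) * 2 * rho_max - rho_max, thetas[t]
```

Because votes from incomplete or noisy point sets still pile up at the true parameters, the accumulator peak is robust in exactly the way the abstract relies on.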
The steered multiscale Hermite transform is introduced as a tool for image fusion. It is shown how this transform's
particular characteristics, closely related to important visual perception properties, efficiently reproduce relevant image
structures in the fused products. Two cases of remote sensing image fusion are presented, namely multispectral with
panchromatic fusion and SAR with multispectral fusion. In the latter, a noise reduction algorithm also based on the
Hermite transform is incorporated within the fusion scheme so that characteristic SAR image speckle is reduced and prevented from corrupting the fused products.
Registration plays a key role in multimodal data fusion to extract synergistic information from multiple non-destructive
evaluation (NDE) sources. One of the common techniques for registration of point datasets is the Iterative Closest Point
(ICP) algorithm. Generally, modern-day NDE techniques generate large datasets, and the conventional ICP algorithm
requires a huge amount of time to register them to the desired accuracy. In this paper, we present algorithms to aid in
the registration of large 3D NDE data sets in less time with the required accuracy. Various methods of coarse registration
of data, partial registration and data reduction are used to realize this. These techniques have been used in registration
and it is shown that registration can be accomplished to the desired accuracy with more than a 90% reduction in time
compared to the conventional ICP algorithm. Volumes of interest (VOI) can be defined on the data sets and merged together
so that only the features of interest are used in the registration. The proposed algorithm also provides capability for
eliminating noise in the data sets. Registration of Computed Tomography (CT) Image data, Coordinate Measuring
Machine (CMM) Inspection data and CAD model has been discussed in the present work. The algorithm is generic in
nature and can be applied to any other NDE inspection data.
This paper describes tabletop display systems utilizing stereoscopic 3D display technology. The authors have
previously researched 3D display systems using polarized glasses and liquid-crystal shutter glasses, image splitters such as a
parallax barrier or a lenticular screen, and holographic optical elements1)2)3). These image splitting technologies
for displaying a stereoscopic 3D image are available for developing the tabletop display that can provide different
images to two users surrounding the system. To separate dual images using the polarizer slits, it is necessary to display
two orthogonally polarized images. We developed a dual LCD panel using two liquid crystal layers to make the
display system thin and compact. This display panel enables observers to view full-screen, high-resolution images.
This study shows that it is possible to simplify the optical system.
The rear-projection lenticular 3D display system using a projector has superior characteristics, such as a large screen
with a wide field of view. However, a conventional system has disadvantages, such as images with divided
horizontal resolution. We describe a 3D display system using a lenticular screen with vertically striped
polarizer slits attached. This 3D display avoids the problem of the conventional system because it shows twice the 3D
image resolution. Moreover, we propose a lenticular display using double polarizer slits to eliminate the pseudoscopic
viewing area and solve the pseudoscopic image problem.
Finding the distance of an object in a scene from vision information is an important problem in machine vision. A large number of techniques for passive ranging of unknown objects have been developed over the years (i.e., range from stereo, motion, focus, and defocus). Nearly all such techniques may be framed in terms of a differential formalism. In the case of binocular stereo, two different images are taken from cameras at different discrete viewpoints; similarly, differences between consecutive images are often used to determine viewpoint derivatives for structure from motion, and two or more images taken from cameras with different aperture sizes are used to compute the derivative with respect to aperture size in range-from-focus and range-from-defocus methods. All these methods fall into a discrete differentiation category. Farid proposed a consecutive differentiation method for range estimation which employs the intensity variation of the images along with the aperture changes to measure range information. In this paper, we first consider the plenoptic function, a powerful mathematical tool for understanding the primary vision problem. We then show an algorithm within a differential framework for range estimation based on the assumption of brightness constancy. Finally, we show several implementations of passive ranging using this differential algorithm.
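The brightness-constancy assumption at the heart of the differential framework is the constraint Ix·u + Iy·v + It = 0 relating spatial and temporal image derivatives to the unknown displacement. A generic least-squares solve of this constraint over a window (in the style of Lucas-Kanade) illustrates the formalism; it is not Farid's aperture-differentiation estimator:

```python
import numpy as np

def brightness_constancy_solve(Ix, Iy, It):
    """Least-squares solve of Ix*u + Iy*v + It = 0 over a window,
    stacking one constraint per pixel. Generic illustration of the
    differential (brightness-constancy) formalism."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)   # one row per pixel
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

Replacing the time derivative It with a derivative along viewpoint or aperture yields the stereo, motion, and focus/defocus variants the abstract groups under one formalism.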
One of the main problems in visual image processing is incomplete information owing to occlusion of objects by other objects. Since correlation filters mainly use contour information of objects to carry out pattern recognition, conventional correlation filters without training often yield poor performance in recognizing partially occluded objects. Adaptive correlation filters based on synthetic discriminant functions for recognition of partially occluded objects embedded in a cluttered background are proposed. The designed correlation filters are adaptive to an input test scene, which is constructed from fragments of the target, false objects, and the background to be rejected. These filters are able to suppress sidelobes of the given background as well as false objects. The performance of the adaptive filters in real scenes is compared with that of various correlation filters in terms of discrimination capability and robustness to noise.
In pattern recognition, two different tasks are distinguished: detection of objects and estimation of their exact
positions (localization) in images. Traditional methods for pattern recognition are based on correlation or template
matching. These methods are attractive because they can be easily implemented with digital or optical processors.
However, they are very sensitive to the intensity degradations that are always present in observed images. In this paper we
analyze and compare correlation-based methods for reliable detection and localization of degraded objects.
In this paper, a new 3D object tracking system using the disparity motion vector (DMV) is presented. In the proposed
method, time-sequential disparity maps are extracted from the sequence of stereo input image pairs, and these
disparity maps are used to sequentially estimate the DMV, defined as the disparity difference between two consecutive
disparity maps. Similarly to motion vectors in conventional video signals, the DMV provides motion
information about a moving target by showing a relatively large change in the disparity values in the target areas.
Accordingly, the DMV helps detect the target area and its location coordinates. Based on these location data of a moving
target, the pan/tilt unit embedded in the stereo camera system can be controlled to achieve real-time stereo
tracking of a moving target. Results of experiments with 9 frames of stereo image pairs of 256x256
pixels show that the proposed DMV-based stereo object tracking system can track the moving target with a
relatively low error ratio of about 3.5% on average.
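The DMV computation and target localization described above reduce to a per-pixel difference of consecutive disparity maps followed by thresholding; the threshold and centroid localization below are illustrative choices:

```python
import numpy as np

def disparity_motion_vector(d_prev, d_curr, thresh):
    """DMV: per-pixel difference between two consecutive disparity maps.
    Regions where |DMV| exceeds `thresh` flag the moving target; the
    target location is summarized here by the mask centroid."""
    dmv = d_curr.astype(np.int32) - d_prev.astype(np.int32)
    ys, xs = np.nonzero(np.abs(dmv) > thresh)
    if xs.size == 0:
        return dmv, None                      # no moving target found
    return dmv, (int(xs.mean()), int(ys.mean()))  # centroid (x, y)
```

The centroid then drives the pan/tilt control loop that keeps the target centred.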
In this paper, a new real-time, intelligent mobile robot system for path planning and navigation using a stereo camera embedded in a pan/tilt system is proposed. In the proposed system, the face area of a moving person is detected from a sequence of stereo image pairs using the YCbCr color model, and depth information is obtained from the disparity map computed from the left and right images captured by the pan/tilt-controlled stereo camera system. The distance between the mobile robot and the face of the moving person is then calculated from the detected depth information. Accordingly, based on the analysis of these data, three-dimensional objects can be detected. Finally, using these detected data, a 2D spatial map is constructed for a visually guided robot that can plan paths, navigate around surrounding objects, and explore an indoor environment. Experiments on target tracking with 480 frames of sequential stereo images show that the error ratio between the calculated and measured values of the relative position is very low, 1.4% on average. The proposed target tracking system also achieves a high speed of 0.04 sec/frame for target detection and 0.06 sec/frame for target tracking.
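The distance-from-disparity step uses the standard stereo triangulation relation Z = f·B/d; the parameter values below are illustrative, not the paper's camera geometry:

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Stereo depth from triangulation: Z = f * B / d, with focal
    length f in pixels, baseline B in metres, disparity d in pixels.
    Values used in the example are illustrative only."""
    return f_px * baseline_m / disparity_px

print(depth_from_disparity(700.0, 0.12, 42.0))  # → 2.0 (metres)
```

Larger disparities correspond to closer objects, which is why the face region's disparity directly yields the robot-to-person distance.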
In this paper, we use adaptive rank-order filters for the localization and extraction of desired details from images. To improve on the performance of linear correlation, we use novel locally adaptive correlations based on the nonparametric Spearman's rank correlation. These filters are based on the correlation between the ranks of the input scene, computed in a moving window, and those of the target. Their performance and noise robustness are compared with those of conventional linear correlation. Computer simulation results are provided and discussed.
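As a rough illustration of the idea behind such filters (not the authors' implementation), the sketch below slides a target template over a scene and computes Spearman's rank correlation between the ranks of each window and those of the target; the function names and the brute-force window loop are our own choices.

```python
import numpy as np

def rankdata(a):
    """Assign 1-based ranks to a 1-D array; ties receive their mean rank."""
    order = np.argsort(a, kind="stable")
    ranks = np.empty(len(a), dtype=float)
    ranks[order] = np.arange(1, len(a) + 1)
    for v in np.unique(a):          # average ranks over tied values
        mask = a == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_correlation_map(scene, target):
    """Spearman correlation between the target's ranks and the ranks of
    every target-sized window of the scene (a rank-based correlation plane)."""
    th, tw = target.shape
    tr = rankdata(target.ravel())
    tr -= tr.mean()
    out = np.zeros((scene.shape[0] - th + 1, scene.shape[1] - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            wr = rankdata(scene[i:i + th, j:j + tw].ravel())
            wr -= wr.mean()
            denom = np.sqrt((wr ** 2).sum() * (tr ** 2).sum())
            out[i, j] = (wr * tr).sum() / denom if denom else 0.0
    return out
```

Because only ranks enter the computation, the map is invariant to any monotonic intensity change of the scene, which is the source of the robustness the abstract refers to.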
The progress of data transmission technology over the Internet has popularized a variety of realistic content. One such content type is multi-view video, which is acquired from multiple camera sensors. In general, multi-view video processing requires as many encoders and decoders as there are cameras, and this processing complexity makes practical implementation difficult.
To address this problem, this paper considers a simple multi-view system utilizing a single encoder and a single decoder. On the encoder side, the input multi-view YUV sequences are combined on GOP units by a video mixer. The mixed sequence is then compressed by an H.264/AVC encoder. The decoding side is composed of a single decoder and a scheduler controlling the decoding process. The goal of the scheduler is to assign an approximately equal number of decoded frames to each view sequence by estimating the decoder utilization of a GOP and subsequently applying frame-skip algorithms. Furthermore, for the frame skip, efficient frame selection algorithms are studied for the H.264/AVC baseline and main profiles, based on a cost function related to perceived video quality.
The proposed method has been tested on various multi-view test sequences adopted by MPEG 3DAV. Experimental results show that approximately equal decoder utilization is achieved for each view sequence, so that every view sequence is displayed fairly. Finally, the performance of the proposed method is compared with that of a simulcast encoder in terms of bit-rate and PSNR using a rate-distortion curve.
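The scheduler's budget-splitting step can be sketched minimally as follows; `schedule_gop` and its even-split policy are our own illustrative simplification of the paper's decoder-utilization estimate, not its actual algorithm.

```python
def schedule_gop(num_views, frames_per_view, budget):
    """Distribute a per-GOP decoding budget as evenly as possible across
    views; any remainder goes to the first views. Frames beyond each
    view's quota are the candidates for the frame-skip step."""
    share, extra = divmod(budget, num_views)
    quotas = [min(frames_per_view, share + (1 if v < extra else 0))
              for v in range(num_views)]
    return quotas
```

For example, with 4 views of 15 frames each and a budget of 50 decodable frames per GOP, each view receives 12 or 13 frames, so no single view monopolizes the decoder.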
Digital watermarking is an important technique for protecting copyrighted multimedia data. The technique works by hiding secret information in images and can therefore be used to discourage illicit copying or distribution of copyrighted materials. In this paper, we propose a robust frequency-domain digital watermarking algorithm for still images based on the discrete cosine transform. Adjustable parameters are introduced during the watermark embedding process, which adaptively change the JPEG quantization factor as well as the depth at which the watermark is embedded. The proposed watermarking technique maintains its validity under common image processing operations such as low-pass filtering and image cropping. Compared with previous methods, it has improved performance under Joint Photographic Experts Group (JPEG) compression attacks: the extracted watermark maintains high quality in terms of normalized correlation even under a high JPEG compression ratio.
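A minimal sketch of the general DCT-domain embed/extract idea (not the paper's exact algorithm) is given below; the mid-band coefficient positions and the `depth` strength parameter are illustrative assumptions, with `depth` loosely playing the role of the adjustable embedding depth mentioned above.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix, so the inverse transform is its transpose."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] *= np.sqrt(1.0 / n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

# Hypothetical choice of mid-frequency positions in an 8x8 block.
POSITIONS = [(2, 3), (3, 2), (3, 3), (2, 4)]

def embed_watermark(block, bits, depth=8.0):
    """Shift selected mid-band DCT coefficients of one 8x8 block by
    +/- depth according to the watermark bits, then inverse-transform."""
    C = dct_matrix(8)
    coeffs = C @ block @ C.T
    for (u, v), b in zip(POSITIONS, bits):
        coeffs[u, v] += depth if b else -depth
    return C.T @ coeffs @ C

def extract_watermark(block, original):
    """Recover the bits from the sign of the coefficient differences
    (a non-blind scheme: the original image is needed for extraction)."""
    C = dct_matrix(8)
    d = C @ (block - original) @ C.T
    return [1 if d[u, v] > 0 else 0 for (u, v) in POSITIONS]
```

A larger `depth` survives coarser JPEG quantization at the cost of visibility, which is the trade-off the adjustable parameters in the abstract control.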
This paper describes an in situ image recognition system designed to inspect the quality standards of chocolate pops during their production. The essence of the recognition system is the localization of events (i.e., defects) in the input images that affect the quality standards of the pops. To this end, processing modules based on a correlation filter and on image segmentation are employed to measure the quality standards. We therefore designed the correlation filter and defined a set of features from the correlation plane. The desired values for these parameters are obtained by exploiting information about the objects to be rejected, in order to find the optimal discrimination capability of the system. From this set of features, each pop can be correctly classified. The efficacy of the system has been tested thoroughly under laboratory conditions using at least 50 images containing 3 different types of possible defects.
In this paper, a new resolution-enhanced computational integral imaging reconstruction method employing an intermediate-view reconstruction (IVR) technique is presented. In the proposed method, as many intermediate elemental images as required can be synthesized from the limited number of picked-up elemental images by using the IVR technique. With sufficient overlapping of this increased number of elemental images in the reconstruction image plane, a resolution-enhanced 3D image can be displayed. To show the feasibility of the proposed scheme, some experiments were performed and their results are presented as well.
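As a loose illustration of synthesizing views between neighboring elemental images, the sketch below inserts linearly blended images between adjacent pairs; the actual IVR technique is disparity-based, so plain blending is only a stand-in, and the function name and `factor` parameter are our own.

```python
import numpy as np

def intermediate_views(elemental, factor=2):
    """Insert (factor - 1) synthesized images between each adjacent pair
    of elemental images by linear blending, increasing the number of
    images available for overlapping reconstruction."""
    out = []
    for a, b in zip(elemental[:-1], elemental[1:]):
        out.append(a)
        for k in range(1, factor):
            t = k / factor
            out.append((1 - t) * a + t * b)  # blended intermediate image
    out.append(elemental[-1])
    return out
```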
We propose a very simple scalable video coding (SVC) system based on the H.264 baseline profile codec. The proposed SVC algorithm offers three levels of temporal and spatial scalability: QVGA@15fps, QVGA@30fps, and VGA@30fps. The proposed system achieves temporal scalability by encoding every other picture as a non-reference P-picture, so that the base-layer codec handling the QVGA@15fps sequence is fully compatible with the satellite digital multimedia broadcasting (S-DMB) system in Korea. In addition, the same decoder can reconstruct the QVGA@30fps sequence when it receives the bits representing the non-reference pictures. For the spatial enhancement layer, the encoder follows the standard H.264 baseline profile except for the inter-layer intra prediction. To reduce the computational burden of the encoder, the enhancement-layer encoder may skip the motion estimation procedure by interpolating the motion field from that of the base layer. Simulation results show that the proposed system yields less than about 12% loss in reconstructed picture quality compared with the anchor H.264 JM encoder. The proposed SVC system still has room for improving coding efficiency by trading it off against computational complexity, so considerable further work is required.
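The temporal-scalability idea, coding every other picture as a non-reference P-picture so the enhancement frames can be dropped without breaking prediction, can be sketched as a simple layer assignment (illustrative only, not the S-DMB bitstream syntax):

```python
def layer_assignment(num_frames):
    """Base layer: even-indexed pictures (used as references, 15 fps).
    Temporal enhancement: odd-indexed non-reference P-pictures, which a
    decoder may discard freely to fall back to the base frame rate."""
    base = [i for i in range(num_frames) if i % 2 == 0]
    enhancement = [i for i in range(num_frames) if i % 2 == 1]
    return base, enhancement
```

Because no picture ever predicts from an enhancement frame, a 15 fps decoder simply ignores the odd-indexed pictures.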
The high computational complexity of level set methods has excluded them from many real-time applications. The high complexity is mainly due to the need to solve partial differential equations (PDEs) numerically. For image segmentation and object tracking applications, it is possible to approximate the level set curve evolution process without solving PDEs, since we are interested in the final object boundary rather than in the accurate curve evolution process. This paper proposes a fast parallel method that simplifies the curve evolution process using simple binary morphological operations. The proposed fast implementation allows real-time image segmentation and object tracking using level set curve evolution, while preserving the advantage of level set methods in automatically handling topological changes. It can utilize the parallel processing capability of existing embedded hardware, parallel computers, or optical processors for fast curve evolution.
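A minimal sketch of PDE-free outward curve evolution using only binary dilation is shown below; the boolean `speed_positive` map stands in for the level-set speed term and is an assumption of this sketch, not the paper's formulation.

```python
import numpy as np

def binary_dilate(mask):
    """3x3 plus-shaped binary dilation via shifted copies (no external deps)."""
    out = mask.copy()
    out[1:, :] |= mask[:-1, :]
    out[:-1, :] |= mask[1:, :]
    out[:, 1:] |= mask[:, :-1]
    out[:, :-1] |= mask[:, 1:]
    return out

def evolve(seed, speed_positive, iterations=50):
    """Approximate outward level-set evolution: grow the region by one
    pixel per iteration, but only into pixels where the (precomputed)
    speed is positive. Merging of separate fronts is handled implicitly,
    mirroring the topological flexibility of level set methods."""
    region = seed.copy()
    for _ in range(iterations):
        grown = binary_dilate(region) & speed_positive
        if np.array_equal(grown, region):
            break  # converged: the front has reached the object boundary
        region = grown
    return region
```

Each iteration is a purely local, data-parallel operation, which is what makes such a scheme attractive for embedded or optical hardware.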
Super-resolution reconstruction algorithms have been demonstrated to be very effective in enhancing image spatial resolution by combining several low-resolution images to yield a single high-resolution image. However, their high computational complexity has become a major obstacle to the use of super-resolution techniques in real-time applications. Most previous computationally efficient super-resolution techniques have focused on reducing the number of iterations, owing to the iterative nature of most super-resolution algorithms. In this paper, we propose a region-of-interest (ROI) image preprocessing technique to improve the processing speed of super-resolution reconstruction. To better integrate the preprocessing with super-resolution, the proposed ROI extraction technique is developed under the same statistical framework as the super-resolution algorithm. Simulation results are provided to demonstrate the performance of the proposed method.
In this paper, a method for effective and intelligent route decision-making by an unmanned ground vehicle (UGV) using a 2D spatial map from a stereo camera system is proposed.
The depth information and disparity map are detected from the input images of a parallel stereo camera. The distance between the autonomous mobile robot and a detected obstacle is computed, a 2D spatial map is obtained from the location coordinates, and the relative distances between the obstacle and the other objects are then derived from it. The unmanned ground vehicle moves autonomously, making effective and intelligent route decisions based on the obtained 2D spatial map. In experiments on robot driving with 24 frames of stereo images, the error ratio between the calculated and measured values of the inter-object distance is found to be very low, 1.57% on average.
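The depth computation underlying such a parallel stereo setup is the classic relation Z = fB/d; the sketch below is a generic illustration rather than the authors' code, and the parameter names are our own.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Parallel-stereo depth: Z = f * B / d.
    disparity_px: horizontal pixel offset of a point between left and
    right views; focal_px: focal length expressed in pixels;
    baseline_m: separation between the two cameras in meters."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

For instance, a 40-pixel disparity with an 800-pixel focal length and a 10 cm baseline places the obstacle 2 m away; repeating this per obstacle yields the coordinates from which the 2D spatial map is built.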
The proposed method offers one possibility for restoring climate in order to avoid overheating. A method for creating a supplemental ice cover is considered in this paper. We theoretically investigate the creation of artificial rafts at the border of the water-ice area in the northern seas. First, such artificial rafts or films can serve as additional mirrors for solar energy. Second, these rafts can damp local water vibration so that ice forms more easily in northern regions. Finally, these rafts can act as crystallization centers in supercooled water.
Crystal location and alignment to the x-ray beam is an enabling technology necessary for the automation of macromolecular crystallography at synchrotron beamlines. In the process of crystal structure determination, a small synchrotron x-ray beam, with a FWHM as small as 70 μm (bending-magnet beamlines) or 20 μm (undulator beamlines), is focused at or downstream of the crystal sample. Protein crystals used in structure determination are becoming smaller, approaching 50 μm or less, and need to be placed precisely in the focused x-ray beam. At the Structural Biology Center, crystals are mounted on a goniostat, allowing precise xyz positioning and rotation of the crystal. One low- and two high-magnification cameras integrated into the synchrotron beamline permit imaging of the crystal mounted on the goniostat. The crystals are held near liquid-nitrogen temperature using a cryostream to limit secondary radiation damage. Image processing techniques are used for automatic and precise placement of protein crystals in the synchrotron beam. Here we discuss the automatic crystal centering process considered for the Structural Biology Center, which utilizes several image processing techniques.