JEI Letters

Fast illumination normalization for robust eye localization under variable illumination

Fei Yang, Jianbo Su

Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, China

J. Electron. Imaging. 18(1), 010503 (February 25, 2009). doi:10.1117/1.3086868
History: Received November 22, 2008; Revised January 14, 2009; Accepted January 16, 2009; Published February 25, 2009


Most eye localization methods suffer from illumination variation. To overcome this problem, we propose an illumination normalization technique as a preprocessing step before localizing eyes. This technique requires no training process, no assumption about the lighting conditions, and no alignment between different images. Moreover, it is fast and thus suitable for real-time applications. Experimental results verify the effectiveness and efficiency of the eye localization scheme with the proposed illumination normalization technique.


Eye localization has been an intensive research topic because of its wide applications in face recognition, driver fatigue monitoring systems, etc.1-4 Unfortunately, it is still very hard to find an illumination-independent eye localization algorithm. Thus, active infrared-based approaches are often employed to diminish the influence of illumination variation.1,2 However, eye localization in visible-spectrum images is more relevant, since it is more practical in applications and requires no extra equipment. Consequently, it is often essential to normalize illumination before localizing eyes in visible-spectrum images taken under widely varying lighting. Jung et al.3 apply the self-quotient image (SQI) to rectify illumination and obtain satisfactory eye localization results on their private face database. However, in SQI the image is normalized by division by its smoothed version, and the result depends to a great extent on the kernel size of the weighted Gaussian filter. The kernel size is rather difficult to determine: intrinsic information is severely reduced if the kernel is small, and halo effects may appear if it is large. Although a multiscale technique is adopted to alleviate this problem, it adds computational cost, and overcompensated regions sometimes still exist. In addition, image noise is amplified by the division operation, which also deteriorates eye localization performance. This letter proposes to take advantage of a more effective illumination normalization method, the logarithmic total variation (LTV) model,5 as a preprocessing step for eye localization, and it validates the performance of this eye localization scheme on several public face databases. In Sec. 2, the LTV model is presented to normalize illumination on the face. Since the computational cost of the LTV model is too high for real-time applications, a modified graph cut-based algorithm is proposed in Sec. 3 to solve the model, so that the proposed preprocessing step is accelerated. Experimental results are given in Sec. 4, followed by conclusions in Sec. 5.

According to the Lambertian model, the captured face image I(x,y) can be represented as

\[ I(x,y) = \rho(x,y)\,S(x,y), \tag{1} \]

where ρ is the albedo of the object surface and S(x,y) is the final light strength received at location (x,y). The albedo ρ is the intrinsic representation of the captured face and is independent of the ambient lighting condition, so it can be exploited for illumination-independent eye localization. Taking the logarithm of Eq. 1, we have

\[ \log I = \log \rho + \log S. \tag{2} \]

If we denote f = log I, v = log ρ, and u = log S, then

\[ f = v + u. \tag{3} \]

Chen et al.5 argued that one of the differences between the intrinsic structure and the illumination pattern of a face image is their scale: the intrinsic facial structure is usually of smaller scale than the illumination artifacts and shadows. In other words, v carries the albedo variations of the small-scale facial features. Thus, in order to eliminate the interference of ambient lighting in eye localization, one needs to extract v from f. The TV-L1 model has shown its effectiveness for this task.5 Hence, we use the TV-L1 model to estimate u:

\[ u = \arg\min_{u} \int_{\Omega} \big( |\nabla u| + \lambda\,|f - u| \big)\, dx, \tag{4} \]
where λ is a penalty parameter. As λ increases, the fidelity term |f − u| becomes more dominant and u is kept closer to f; with a smaller λ, u becomes smoother. One advantage of this LTV model is that the parameter λ, which depends only on the scale of the image, is very easy to set. According to the LTV model, larger structures such as extrinsic illumination are left in u (or S), and ρ, which is taken as the output image of the illumination normalization, can be obtained by ρ = exp(v) = exp(f − u).
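To make the preprocessing pipeline concrete, the following is a minimal sketch of LTV normalization in Python; it is not code from this letter. The function name ltv_normalize, the tv_l1 solver argument, and the default λ value are illustrative assumptions; any TV-L1 solver, for example the graph cut-based one described in the next section, can be plugged in.

```python
import numpy as np

def ltv_normalize(img, tv_l1, lam=0.5):
    """Illustrative LTV normalization (Eqs. 1-4): return the albedo estimate
    rho = exp(f - u), where u is the TV-L1-smoothed log image."""
    img = np.asarray(img, dtype=np.float64)
    f = np.log(img + 1.0)        # log-domain image, Eq. 2 (the +1 avoids log(0))
    # for the discrete graph-cut solver sketched later, f would first be
    # quantized to integer gray levels; here tv_l1 is treated as a black box
    u = tv_l1(f, lam)            # large-scale illumination component, Eq. 4
    rho = np.exp(f - u)          # intrinsic (small-scale) structure, Eq. 3
    # rescale to [0, 255] for display and for the valley-based eye locator
    return 255.0 * (rho - rho.min()) / (rho.max() - rho.min() + 1e-12)
```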

Therefore, the illumination normalization problem is transformed into the problem of solving the TV-L1 model. Since partial differential equation-based algorithms often run into numerical difficulties, Chen et al.5 cast Eq. 4 as a second-order cone program and solved it with modern interior-point methods. However, this iterative solution is expensive in both memory and computation time. To make this illumination normalization technique suitable for real-time applications, a graph cut-based algorithm is proposed here to solve Eq. 4.

First, we show how to decompose Eq. 4 into several independent binary energy minimization problems. The images discussed in this letter are defined as m×n matrices in Z^{m×n}, where Z denotes the set of nonnegative integers representing the grayscale levels of images, and m×n is the image size. Let f ∈ Z^{m×n} and u ∈ Z^{m×n} denote the original and separated images, respectively. According to Eq. 3, each element of these matrices satisfies

\[ f_{i,j} = v_{i,j} + u_{i,j}, \qquad i = 1,\dots,m,\; j = 1,\dots,n. \tag{5} \]
Moreover, we assume that all images satisfy the Neumann condition on the boundary of the domain Ω; i.e., the differentials across the image border are defined to be zero. This assumption can be guaranteed by padding the image with its boundary elements. In this letter, to simplify and accelerate our algorithm, we just use the 4-neighbors of u_{i,j} to approximate the gradient of u at location (i,j). Consequently, the regularization term in Eq. 4 can be defined in the discrete case by

\[ \int_{\Omega} |\nabla u|\, dx = \sum_{i,j} \big( |u_{i+1,j} - u_{i,j}| + |u_{i,j+1} - u_{i,j}| \big). \tag{6} \]
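The discretization in Eq. 6 is easy to compute directly. Below is a small illustrative helper (not part of the letter) that evaluates this anisotropic discrete TV; replicate padding of the last row and column realizes the Neumann boundary condition by making the boundary differences zero.

```python
import numpy as np

def discrete_tv(u):
    """Anisotropic discrete total variation of Eq. 6 with Neumann boundaries."""
    u = np.asarray(u, dtype=np.float64)
    # replicate the last row/column so the forward differences vanish at the border
    du_i = np.abs(np.diff(u, axis=0, append=u[-1:, :]))   # |u[i+1,j] - u[i,j]|
    du_j = np.abs(np.diff(u, axis=1, append=u[:, -1:]))   # |u[i,j+1] - u[i,j]|
    return du_i.sum() + du_j.sum()
```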
Suppose μ is given, and define B^μ_{i,j} = 1 if u_{i,j} ≥ μ and B^μ_{i,j} = 0 otherwise. Define x^+ = max{x, 0} for an arbitrary real number x, so that |B^μ_{i,j} − B^μ_{k,l}| = (B^μ_{i,j} − B^μ_{k,l})^+ + (B^μ_{k,l} − B^μ_{i,j})^+. For each pair of neighboring pixels (i,j) and (k,l), |u_{i,j} − u_{k,l}| can be expressed in terms of the elements of B^μ over all grayscale levels μ = 0, 1, …, l_max as follows:

\[ |u_{i,j} - u_{k,l}| = \sum_{\mu=0}^{l_{\max}} |B^{\mu}_{i,j} - B^{\mu}_{k,l}| = \sum_{\mu=0}^{l_{\max}} \big[ (B^{\mu}_{i,j} - B^{\mu}_{k,l})^{+} + (B^{\mu}_{k,l} - B^{\mu}_{i,j})^{+} \big], \tag{7} \]

where l_max = max_{i,j}{u_{i,j}} ≤ 255. In this way, the original problem is reformulated into several independent binary problems based on the decomposition of a function into its level sets. Hence, combining Eq. 6 with Eq. 7, the first term on the right-hand side of Eq. 4 can be binarized as

\[ \int_{\Omega} |\nabla u|\, dx = \sum_{\mu=0}^{l_{\max}} \sum_{i,j} \Big\{ \big[ (B^{\mu}_{i,j} - B^{\mu}_{i+1,j})^{+} + (B^{\mu}_{i+1,j} - B^{\mu}_{i,j})^{+} \big] + \big[ (B^{\mu}_{i,j} - B^{\mu}_{i,j+1})^{+} + (B^{\mu}_{i,j+1} - B^{\mu}_{i,j})^{+} \big] \Big\}. \tag{8} \]
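As a sanity check of the level-set decomposition behind Eqs. 7 and 8, the following illustrative snippet (not code from the letter) verifies numerically that summing the binary TV of every level set B^μ reproduces the discrete TV of u; it reuses the discrete_tv helper sketched above.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.integers(0, 256, size=(16, 16))        # random integer "image"

tv_direct = discrete_tv(u)                     # Eq. 6
tv_levels = sum(
    discrete_tv((u >= mu).astype(np.float64))  # binary TV of level set B^mu, Eq. 8
    for mu in range(int(u.max()) + 1)          # mu = 0, 1, ..., l_max
)
assert np.isclose(tv_direct, tv_levels)
```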
Similarly, we define B′^μ_{i,j} = 1 if f_{i,j} ≥ μ and B′^μ_{i,j} = 0 otherwise. For binary numbers B^μ_{i,j} and B′^μ_{i,j}, we have |B^μ_{i,j} − B′^μ_{i,j}| = (1 − B^μ_{i,j}) B′^μ_{i,j} + B^μ_{i,j}(1 − B′^μ_{i,j}). The second term on the right-hand side of Eq. 4 can then be binarized as

\[ \int_{\Omega} |f - u|\, dx = \sum_{i,j} |f_{i,j} - u_{i,j}| = \sum_{\mu=0}^{l_{\max}} \sum_{i,j} \big[ (1 - B^{\mu}_{i,j}) B'^{\mu}_{i,j} + B^{\mu}_{i,j} (1 - B'^{\mu}_{i,j}) \big], \tag{9} \]

where l_max = max_{i,j}{f_{i,j}} ≤ 255. As a result, Eq. 4 is reformulated by combining Eq. 8 with Eq. 9. For a given input f and penalty λ, Eq. 4 can be rewritten as

\[ u = \arg\min_{\{B^{\mu}\}} \sum_{\mu=0}^{l_{\max}} E(B^{\mu}; f, \lambda, \mu), \tag{10} \]

where, for a fixed level μ ∈ {0, 1, …, l_max},

\[ E(B^{\mu}; f, \lambda, \mu) = \sum_{i,j} \Big\{ \big[ (B^{\mu}_{i,j} - B^{\mu}_{i+1,j})^{+} + (B^{\mu}_{i+1,j} - B^{\mu}_{i,j})^{+} \big] + \big[ (B^{\mu}_{i,j} - B^{\mu}_{i,j+1})^{+} + (B^{\mu}_{i,j+1} - B^{\mu}_{i,j})^{+} \big] + \lambda \big[ (1 - B^{\mu}_{i,j}) B'^{\mu}_{i,j} + B^{\mu}_{i,j} (1 - B'^{\mu}_{i,j}) \big] \Big\}. \tag{11} \]

Thus, the problem of minimizing the discretized Eq. 4 is decomposed into minimizing E(B^μ; f, λ, μ) for all levels μ = 0, 1, …, l_max. It is noted that the minimizer u* of Eq. 4 can be constructed from the minimizers {B^{μ*} : μ = 0, 1, …, l_max} using the relationship6

\[ u^{*}_{i,j} = \max\{\mu : B^{\mu*}_{i,j} = 1\}. \tag{12} \]

We then construct a directed capacitated graph corresponding to E(B;f,λ,μ) to find its minimizer Bμ* at every level μ=0,1,,lmax. It is worth noting that the nodes/pixels in the graph are all binary and that the cost of each n-link connecting one pair of neighboring pixels equals 1 and a t-link connecting (i,j) with the source or the sink costs λBi,j and λ(1Bi,j), respectively. In this way, a simplified two-terminal s-t graph representation of Eq.11 is constructed, and then the minimizer Bμ* is obtained via the min-cut algorithm on the graph.7
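The sketch below shows how such a per-level min cut could be assembled in Python with the third-party PyMaxflow package (a wrapper around the solver of Ref. 7). This is an illustrative assumption, not the letter's implementation, which is in C++; in particular, the mapping between the cut segments and the binary labels follows PyMaxflow's documented examples.

```python
import numpy as np
import maxflow   # PyMaxflow wrapper around the Boykov-Kolmogorov min-cut solver (Ref. 7)

def solve_tv_l1_graphcut(f, lam):
    """Illustrative FLTV solver: one s-t min cut per gray level (Eq. 11),
    with the level sets reassembled into u* via Eq. 12."""
    f = np.asarray(f, dtype=np.int64)
    l_max = int(f.max())
    u = np.zeros_like(f)
    for mu in range(l_max + 1):
        b_prime = (f >= mu).astype(np.float64)       # B' at level mu
        g = maxflow.Graph[float]()
        nodes = g.add_grid_nodes(f.shape)
        g.add_grid_edges(nodes, 1.0)                 # n-links between 4-neighbors, cost 1
        # t-links: cutting the source edge costs lam*B', the sink edge lam*(1-B')
        g.add_grid_tedges(nodes, lam * b_prime, lam * (1.0 - b_prime))
        g.maxflow()
        # nodes left on the source side are taken as B = 1 at this level
        b_mu = np.logical_not(g.get_grid_segments(nodes))
        u[b_mu] = mu                                 # Eq. 12: largest mu with B = 1 wins
    return u
```

Following Ref. 6, the binary minimizers can be chosen nested across levels, so simply overwriting u with the current μ wherever B^μ = 1 implements the reconstruction of Eq. 12.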

To sum up, by introducing a divide-and-conquer methodology and a simplified graph representation, the minimizer u* of Eq. 4 can be computed much more efficiently. This method is essentially identical to that of Ref. 6 but is easier to understand and implement. The resulting accelerated LTV-based illumination normalization technique is called fast LTV (FLTV) in this letter.

Three well-known benchmark databases were chosen to evaluate the performance of the proposed eye localization scheme under both good and bad lighting conditions. From the Chinese Academy of Sciences Pose, Expression, Accessory, Lighting (CAS-PEAL) face database,8 the Lighting and Normal subsets were used, which contain 2450 frontal face images under widely varying illumination and 1040 frontal face images under normal illumination, respectively. The Yale face database B (Ref. 9), which contains 650 frontal face images, was also adopted, since it allows testing under large illumination variations, including strong shadows and side lighting. Another 3368 frontal face images under general illumination were chosen from the Face Recognition Technology (FERET) face database.10 All images are roughly cropped so that only the facial regions remain and are then resized to a fixed size. Illumination normalization is then performed on the images; SQI and FLTV are both applied here for comparison.

The darkest pixel in the eye region is most often part of a pupil. This gray valley can therefore be employed to localize eyes in face images. To suppress noise and alleviate the interference of other objects (e.g., hair, eye corners), a mean filter and a circular averaging filter are usually applied first to enhance the image. This simple eye localization approach requires no initialization and no training process. Moreover, it is extremely fast and easy to implement and thus is widely used in practical applications. This approach is also used here to test the illumination normalization methods. For higher accuracy and speed, we restrict the search for gray valleys to the top half of the face image.
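A rough sketch of such a valley-based locator is given below. It is an illustrative assumption rather than the letter's exact implementation: the filter sizes, the circular_kernel helper, and the left/right split of the upper half are all illustrative choices.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def circular_kernel(radius):
    """Normalized circular ("pillbox") averaging kernel."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    k = (x * x + y * y <= radius * radius).astype(np.float64)
    return k / k.sum()

def locate_eyes(face, mean_size=3, disk_radius=3):
    """Return (row, col) of the left and right gray valleys in the upper face half."""
    face = np.asarray(face, dtype=np.float64)
    top = face[: face.shape[0] // 2, :]                 # search only the top half
    smoothed = uniform_filter(top, size=mean_size)      # mean filter suppresses noise
    smoothed = convolve(smoothed, circular_kernel(disk_radius))  # circular averaging
    h, w = smoothed.shape
    left, right = smoothed[:, : w // 2], smoothed[:, w // 2:]
    ly, lx = np.unravel_index(np.argmin(left), left.shape)     # darkest pixel = valley
    ry, rx = np.unravel_index(np.argmin(right), right.shape)
    return (ly, lx), (ry, rx + w // 2)
```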

A few localization results on the Lighting subset of CAS-PEAL and on the Yale B face database are illustrated in Fig. 1 and Fig. 2, respectively. The upper images are the original images; the lower images are the eye localization results on the corresponding illumination-normalized images obtained with FLTV. Note that the same method used to select λ in Ref. 4 is also adopted here. It can be observed that there are no overcompensated regions in the normalized images. To evaluate the accuracy of eye localization, a general criterion4 for claiming successful localization is adopted:

\[ \mathrm{err} = \frac{\max\{\|l_c - l_c'\|,\; \|r_c - r_c'\|\}}{\|l_c - r_c\|} < 0.25, \tag{13} \]
where l_c and r_c are the manually marked left and right eye positions, and l_c′ and r_c′ are the automatically located positions. The correct localization rates obtained directly on the original images and on the images preprocessed with SQI and FLTV are reported for the four test sets in Table 1. It can be seen that SQI and FLTV both greatly improve eye localization accuracy on all test sets, and that FLTV outperforms SQI under both good and bad illumination. To evaluate the computational cost, we measure the average localization time per image as the mean of the total execution time. The average localization times on a 128×128 face image using SQI and FLTV as the preprocessing step are 0.781 s and 0.057 s, respectively, so FLTV is much faster than SQI and is suitable for real-time eye localization. In addition, the original LTV model takes 6.53 s on average to process a 128×128 image, which is much slower than both the proposed FLTV and SQI. All experiments were conducted in C++ on a 2.8-GHz Pentium D computer.
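For reference, Eq. 13 translates directly into a few lines of code; the function name and the (row, col) coordinate convention below are illustrative assumptions.

```python
import numpy as np

def localization_error(lc, rc, lc_est, rc_est):
    """Relative eye localization error of Eq. 13; a localization is counted
    correct when the returned value is below 0.25."""
    lc, rc = np.asarray(lc, float), np.asarray(rc, float)
    lc_est, rc_est = np.asarray(lc_est, float), np.asarray(rc_est, float)
    worst = max(np.linalg.norm(lc - lc_est), np.linalg.norm(rc - rc_est))
    return worst / np.linalg.norm(lc - rc)

# e.g., localization_error((40, 30), (40, 70), (41, 31), (39, 69)) < 0.25  -> True
```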

Table 1: Correct localization rates on the four test sets.

Fig. 1: Correct localization samples from the CAS-PEAL Lighting subset.

Fig. 2: Correct localization samples from Yale B.

The experimental results demonstrate that the proposed illumination normalization technique enables robust eye localization under extreme lighting conditions. It also greatly improves eye localization accuracy on images taken under good lighting conditions. One reason may be that FLTV not only retains the information useful for eye localization, but also eliminates the interference of large-scale structures such as hair by leaving them in S. Therefore, even this simple eye localization approach can achieve accuracy better than or comparable to that of more complicated eye localization algorithms.

In this letter, we propose an illumination normalization technique as a preprocessing step before localizing eyes. The resulting eye localization scheme is shown to be very fast and reliable under variable illumination. Motivated by the effectiveness and efficiency of the proposed illumination normalization technique, we expect good performance when it is combined with other existing eye localization algorithms, which is left as future work.

References

1. W. Hizem, Y. Yang, and B. Dorizzi, "Near-infrared sensing and associated landmark detection for face recognition," J. Electron. Imaging 17(1), 011005 (2008).
2. Z. Zhu and Q. Ji, "Robust real-time eye detection and tracking under variable lighting conditions and various face orientations," Comput. Vis. Image Underst. 98(1), 124–154 (2005).
3. S. U. Jung and J. H. Yoo, "A robust eye detection method in facial region," Lect. Notes Comput. Sci. 4418, 596–606 (2007).
4. Z. H. Zhou and X. Geng, "Projection functions for eye detection," Pattern Recogn. 37(5), 1049–1056 (2004).
5. T. Chen, X. S. Zhou, D. Comaniciu, and T. S. Huang, "Total variation models for variable lighting face recognition," IEEE Trans. Pattern Anal. Mach. Intell. 28(9), 1519–1524 (2006).
6. J. Darbon and M. Sigelle, "Image restoration with discrete constrained total variation. Part I: fast and exact optimization," J. Math. Imaging Vision 26(3), 261–276 (2006).
7. Y. Boykov and V. Kolmogorov, "An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision," IEEE Trans. Pattern Anal. Mach. Intell. 26(9), 1124–1137 (2004).
8. W. Gao, B. Cao, S. Shan, X. Chen, D. Zhou, X. Zhang, and D. Zhao, "The CAS-PEAL large-scale Chinese face database and evaluation protocols," IEEE Trans. Syst. Man Cybern., Part A 38(1), 149–161 (2008).
9. A. Georghiades, D. Kriegman, and P. Belhumeur, "From few to many: generative models for recognition under variable pose and illumination," IEEE Trans. Pattern Anal. Mach. Intell. 23(6), 643–660 (2001).
10. P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, "The FERET evaluation methodology for face-recognition algorithms," IEEE Trans. Pattern Anal. Mach. Intell. 22(10), 1090–1104 (2000).
© 2009 SPIE and IS&T
