Eye Localization in a Full Frontal Still Image

This paper discusses the detection of the human face and eyes in still frontal color images. First, the required preprocessing is performed: image resizing, RGB to gray-scale conversion, binarization, noise removal, and small-object removal. A proposed algorithm then localizes the face by detecting its edges from pixel color changes in the binary image. Finally, normalized cross correlation is applied to find the accurate position of the eyes within the localized face area.


Introduction
Face recognition, as a leading subject of pattern recognition and artificial intelligence, has broad application in human-machine interfaces, biometric information security, and so on. Facial feature localization is the key to face piece fitting and recognition [1].
The analysis of face images is a popular research area with applications such as face recognition, virtual tools, and human identification security systems [2].
As one of the salient features of the human face, the eyes play an important role in face recognition and facial expression analysis [3][4]. In fact, the eyes can be considered a salient and relatively stable feature of the face in comparison with other facial features. Therefore, when detecting facial features, it is advantageous to detect the eyes first. The position of the other facial features can then be estimated from the eye position [2].
Many methods and approaches have been proposed over the years for face detection [5][6], but it is still a very challenging task because of variability in scale, location, orientation, and pose. Facial expression, occlusion, and lighting conditions also change the overall appearance of faces [5]. A novel algorithm for exact eye contour detection in frontal face images has been proposed by Vladimir and Anna [7]. A snake model has been proposed by Lam and Yan [8] for detecting the face boundary. Wang et al. [9] have employed a genetic algorithm to detect human faces by calculating the projection of each face candidate onto the eigenfaces space. Rowley et al. [10] have improved a frontal face detection system based on a neural network.
This paper presents a robust face and eye detection algorithm for color frontal images. The idea of the method is to combine the respective advantages of two existing techniques, image processing and cross correlation, and to overcome their shortcomings. First, image processing techniques are used to locate the face region. Once the face region is located, the eyes are detected accurately by cross correlation between the located face region and a previously prepared eye template. Experimental results on the test images show that the proposed approach is robust and quite efficient. The paper is organized as follows: besides this introduction, section 2 presents the proposed algorithm for face detection, section 3 describes the eye locating process, section 4 presents the experimental results and discussion, and the last section is the conclusion.

Proposed Algorithm
The algorithm steps are summarized below:
1. Input the image on which face and eye detection is to be performed.
2. Resize the image.
3. Convert the color image into a gray-scale image.
4. Binarize the gray-scale image using a threshold.
5. Detect the face top and width.
6. Remove noise.
7. Remove small objects.
8. Find the top of the head and the sides of the face accurately (as a reference).
9. Crop the face area to narrow down the processing area.
10. Localize the eyes accurately using the cross correlation between the face area cropped from the original image and a previously prepared eye image (template).
A diagram of the major functions of the face and eye detection algorithm is shown in Figure (1).

Figure (1) Algorithm Steps
The following explains the face and eye detection procedure in the order of the processing operations. All images are generated in MATLAB using the Image Processing Toolbox. The input image is first resized to 640×480 for the purpose of normalization and compatibility.

Face Detection
The resized input image is then converted into a gray-scale image as shown in Figure (2). The following equation is used to convert the color (RGB) image to a gray-scale image [11][12]:

$$\text{gray} = \frac{299R + 587G + 114B}{1000} \tag{1}$$

where R, G, and B represent the red, green, and blue color components, respectively.
The binarization step converts the resulting gray-scale image to a binary image, in which each pixel takes one of only two discrete values: 0, representing black, and 1, representing white. With the binary image it is easy to distinguish objects from the background. The gray-scale image is converted to a binary image via thresholding: the output has the value 0 (black) for all pixels of the original image whose luminance is less than a specified level (the threshold), and 1 (white) for all other pixels. Thresholds are often determined from the surrounding lighting conditions. After observing many images of different faces under various lighting conditions, a fixed threshold value of 160 was found to be effective. The criterion used in choosing the threshold is that the binary image of the face should be mostly white, allowing a few black blobs from the eyes, nose, and/or lips. Figure (3) demonstrates optimally thresholded images. Figure (4) (a, b, and c) illustrates binarization of the same image using threshold values of 120, 160, and 200, respectively. Figure (4b) is an example of an optimal binary image for the face and eye detection algorithm: the background is uniformly black and the face is primarily white. This allows the edges of the face to be found as described in the next steps. The result of this method depends on the value of the current pixel, as in the following equation [4][11][13]:

$$I_{new} = \begin{cases} 1, & I_{old} \ge \text{threshold} \\ 0, & I_{old} < \text{threshold} \end{cases} \tag{2}$$

Figure (2) : RGB to gray-scale image conversion
where I_old represents a pixel of the gray-scale image matrix, I_new represents the corresponding pixel of the binary image matrix, and the threshold is a value chosen by trying many threshold values on a large number of images to find the optimum (provided that the lighting of the images is approximately equal). Converting an image from gray scale to binary with a fixed threshold achieves a high success rate, but it fails when the lighting changes from one image to another. A variable threshold that adapts to the image lighting is the appropriate alternative.
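The gray-scale conversion and fixed-threshold binarization described above can be sketched as follows. This is only an illustration in plain Python, not the MATLAB implementation used in the paper: the function names are hypothetical, the image is assumed to be a nested list of (R, G, B) tuples, and integer division is assumed for Eq. (1).

```python
def rgb_to_gray(rgb):
    """Convert rows of (R, G, B) tuples to gray levels using Eq. (1).

    Integer (floor) division is an assumption made for this sketch.
    """
    return [[(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
            for row in rgb]


def binarize(gray, threshold=160):
    """Return 1 (white) where gray >= threshold, otherwise 0 (black)."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]


# Tiny 2x2 example: white, black, mid-gray, and a light skin-like tone.
image = [[(255, 255, 255), (0, 0, 0)],
         [(128, 128, 128), (220, 180, 160)]]
gray = rgb_to_gray(image)    # [[255, 0], [128, 189]]
binary = binarize(gray)      # [[1, 0], [0, 1]]
```

Nested lists keep the sketch free of dependencies; an array library would normally be used for full-size images.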
The following algorithm can be used to obtain the threshold value automatically:
1. Select an initial estimate for the threshold.
2. Segment the image using the threshold. This produces two groups of pixels: G_1, consisting of all pixels with gray-level values > threshold, and G_2, consisting of pixels with values ≤ threshold.
3. Compute the average gray-level values µ_1 and µ_2 for the pixels in regions G_1 and G_2.
4. Compute a new threshold value: threshold = (µ_1 + µ_2)/2.
5. Repeat steps 2 through 4 until the difference in threshold between successive iterations is smaller than a predefined parameter T_o.
A good initial value for the threshold is the average gray level of the image. The parameter T_o stops the algorithm once the change between iterations becomes smaller than this value [11].
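A minimal sketch of this iterative threshold selection is given below. It assumes that the new threshold in step 4 is the midpoint of the two group means, which is the usual choice for this scheme. The function name, the `tolerance` argument (standing in for the stopping parameter), and the early exit for an empty group are assumptions of the sketch.

```python
def iterative_threshold(gray, tolerance=0.5):
    """Estimate a global threshold by repeatedly averaging two group means."""
    pixels = [value for row in gray for value in row]
    threshold = sum(pixels) / len(pixels)   # initial estimate: mean gray level
    while True:
        group1 = [p for p in pixels if p > threshold]
        group2 = [p for p in pixels if p <= threshold]
        # Assumption: if one group is empty the image is uniform, so stop.
        if not group1 or not group2:
            return threshold
        mu1 = sum(group1) / len(group1)
        mu2 = sum(group2) / len(group2)
        new_threshold = (mu1 + mu2) / 2     # assumed update rule for step 4
        if abs(new_threshold - threshold) < tolerance:
            return new_threshold
        threshold = new_threshold
```

For a small image with gray levels [[10, 20, 30, 40], [200, 220]], the initial estimate is the mean (about 86.7), and the procedure settles at 117.5, halfway between the dark mean (25) and the bright mean (210).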
The two methods were tested on the same set of images, and the results show that adaptive binarization is more effective and makes the system more robust.
The next step in the eye detection function is determining the top and sides of the face. This is important because finding the outline of the face narrows down the region in which the eyes fall, which makes it computationally easier to localize the eyes. The first step is to find the top of the face. A starting point on the face is identified first, and the y-coordinate is then decremented until the top of the face is detected. Assuming that the person's face is approximately in the center of the image (shifting the face within the image borders does not affect the operation of the system), the initial starting point used is (100,240).
The starting x-coordinate of 100 is chosen to ensure that the starting point is a black pixel (not on the face). The following algorithm describes how to find the actual starting point on the face, which is then used to find the top of the face.
1. Starting at (100,240), increment the x-coordinate until a white pixel is found. This is taken as the left side of the face (L point), although it is not yet accurate.
2. If the initial white pixel is followed by 30 more white pixels, keep incrementing x until a black pixel is found.
3. Count the number of black pixels following the pixel found in step 2. If a series of 30 black pixels is found, this is the right side (R point).
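The row scan in the steps above might be sketched as follows. The text does not say what happens when a white or black run is shorter than 30 pixels. This sketch assumes such runs are skipped as noise or facial blobs and scanning continues. The function names are hypothetical, and the image is assumed to be indexed as binary[y][x].

```python
def run_length(row, start, value):
    """Number of consecutive pixels equal to `value` beginning at `start`."""
    n = 0
    while start + n < len(row) and row[start + n] == value:
        n += 1
    return n


def find_face_sides(binary, start_x=100, y=240, run=30):
    """Scan row y from start_x and return (left, right), or None if not found.

    left  : first white pixel that is followed by at least `run` more whites
    right : first black pixel after it that begins a run of `run` blacks
    """
    row = binary[y]
    x = start_x
    left = None
    while x < len(row):
        if row[x] == 1:
            whites = run_length(row, x, 1)
            if left is None and whites > run:
                left = x
            x += whites      # jump to the first black pixel after this run
        else:
            blacks = run_length(row, x, 0)
            if left is not None and blacks >= run:
                return left, x
            x += blacks      # short black run: assumed to be a blob, skip it
    return None
```

With a 640-pixel row whose face spans x = 150 to 399 and contains a 10-pixel black blob, the function returns (150, 400). The right value is the first black pixel just past the face.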

Noise Removal
The noise in the binary image under study is usually due to blobs of black pixels on the face, primarily at the eyes, nose, and lips. To fix this problem, an algorithm to remove the black blobs was developed as follows. Starting at the top, at point (mid2, y2), move 5 pixels down and label this point (mid2, y3), where y3 = y2 + 5. From this point move left (decrement x) until a black pixel is met and label this point Lnew; then move right (increment x) until a black pixel is met and label it Rnew. Calculate the horizontal distance between Lnew and Rnew, divide it by 2, and label the result Xnew.
From the point (mid2, y3), move left one pixel by decrementing mid2 and set the pixel value to white. Repeat this process (Xnew-25) times.
From the point (mid2, y3), move right one pixel by incrementing mid2 and set the pixel value to white. Repeat this process (Xnew-25) times.
The key is to stop before the left and right edges of the face. For this reason 25 pixels are left untouched on each side; otherwise the information about where the edges of the face are would be lost.
Repeat the last two steps for 250 values of y (250 is the approximate height of the face in the image). Figure (7) illustrates the results of applying the noise removal algorithm.
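The noise removal procedure can be sketched as below, under several assumptions: Xnew is computed once on the row y3 and reused for all 250 rows, the band is clipped to the image borders, and the centre column itself is also whitened. The function name and argument names are hypothetical; mid2 and y2 are taken as given inputs, and the image is modified in place.

```python
def remove_face_noise(binary, mid2, y2, margin=25, height=250, offset=5):
    """Whiten a band centred on column mid2, starting `offset` rows below y2."""
    y3 = y2 + offset
    row = binary[y3]
    # Walk outwards from the centre until a black pixel (or the border) is met.
    l_new = mid2
    while l_new > 0 and row[l_new] == 1:
        l_new -= 1
    r_new = mid2
    while r_new < len(row) - 1 and row[r_new] == 1:
        r_new += 1
    x_new = (r_new - l_new) // 2
    reach = x_new - margin           # pixels whitened on each side of mid2
    if reach <= 0:
        return binary                # face too narrow: nothing to whiten
    for y in range(y3, min(y3 + height, len(binary))):
        lo = max(mid2 - reach, 0)
        hi = min(mid2 + reach, len(binary[y]) - 1)
        for x in range(lo, hi + 1):
            binary[y][x] = 1         # black blobs inside the band become white
    return binary
```

Because the band stops 25 pixels short of each side, black pixels at the face boundary are left intact and remain available for the later edge detection.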
After this step, the small black spots remaining on the face can be removed by a further step called small objects removal.

Small Objects Removal
A binary image consists of a set of objects, each of which is a set of connected pixels. This process removes the connected components (objects) of the binary image whose pixel count is less than or equal to a number fixed in advance; it works much like the filters that remove unwanted parts of an image such as noise. The number of pixels selected in this application is 150. Figure (8) shows the binary image after applying the small objects removal algorithm. Finally, with the black blobs on the face removed, the edges of the face can be found accurately starting from the point (mid2, y2).
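One possible way to remove small objects is a breadth-first search over connected components, sketched below. The paper does not state the connectivity or which pixel value forms the objects. This sketch assumes 8-connectivity and treats black (0) regions as the objects to remove, since the stated goal is to clear black spots on the face. The function name is hypothetical, and the image is modified in place.

```python
from collections import deque


def remove_small_objects(binary, max_size=150, value=0):
    """Flip every 8-connected component of `value` with at most max_size pixels."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] != value or seen[y0][x0]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and not seen[ny][nx]
                                and binary[ny][nx] == value):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Small components are treated as noise and flipped.
            if len(component) <= max_size:
                for y, x in component:
                    binary[y][x] = 1 - value
    return binary
```

A large black region such as the background exceeds the 150-pixel limit and is kept, while isolated spots on the face are turned white.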

Accurate Face Edges
Starting at (mid2, y2), the following algorithm describes the accurate detection of the face edges: 1. Increment the y-coordinate. The result of the face top and sides detection is shown in Figure (9), where it is marked on the picture as part of the computer simulation. Figure (10) shows the face image cropped within these marks.

Eyes Locating
Now that the face boundaries have been found, the face region is cropped into a separate image to narrow down the area in which the eyes lie, which significantly reduces the computation time of the next step.

Matching by correlation [11][12][14][15]
Correlation is commonly exploited to measure the similarity between a stored template and the window image under consideration. Templates should be deliberately designed to cover the variety of possible image variations [3]. The first essential step in template matching is therefore to create a template. Eye templates are easily obtained by manually cropping them from a real face image. Given the original image f(x, y) and the sub-image w(x, y), the correlation between f and w in its simplest form is computed according to the following equation [11]:

$$c(x, y) = \sum_{s}\sum_{t} f(s, t)\, w(x + s, y + t) \tag{3}$$

for x = 0, 1, 2, ..., M − 1 and y = 0, 1, 2, ..., N − 1, where the summation is taken over the image region in which w and f overlap. The point that gives the highest value of c(x, y) is the center of the sub-image being searched for within the main image; this operation is called the cross-correlation between f and w. Figure (11) illustrates the procedure, assuming that the origin of f is at its top left and the origin of w is at its center. For one value of (x, y), say (x_n, y_n), inside f, application of Eq. (3) yields one value of c. As x and y are varied, w moves around the image area, and the maximum of c(x, y) indicates where w best matches f. (The higher the similarity between two statistical sets, the larger the correlation coefficient.)
For image-processing applications in which the brightness of the image and template can vary due to lighting and exposure conditions, the images can be first normalized.
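The correlation function given in Eq. (3) has the disadvantage of being sensitive to changes in the amplitude of f and w. For example, doubling all values of f doubles the value of c(x, y). An approach frequently used to overcome this difficulty is to perform matching via the correlation coefficient, which is defined as [11]:

$$\gamma(x, y) = \frac{\sum_{s}\sum_{t}\left[f(s,t) - \bar{f}(s,t)\right]\left[w(x+s, y+t) - \bar{w}\right]}{\left\{\sum_{s}\sum_{t}\left[f(s,t) - \bar{f}(s,t)\right]^{2} \sum_{s}\sum_{t}\left[w(x+s, y+t) - \bar{w}\right]^{2}\right\}^{1/2}}$$

where $\bar{w}$ is the average value of the pixels in w and $\bar{f}$ is the average value of f in the region coincident with the current location of w.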
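Matching by the correlation coefficient can be sketched as follows. This is an illustrative plain-Python version with hypothetical function names, not the MATLAB code used in the experiments. For simplicity it indexes each window by its top-left corner rather than by the template center, evaluates only positions where the template lies fully inside the image, and assigns a score of 0 to windows with zero variance.

```python
from math import sqrt


def correlation_coefficient(f, w, top, left):
    """Correlation coefficient between template w and the window of f
    whose top-left corner is at row `top`, column `left`."""
    h, k = len(w), len(w[0])
    window = [f[top + i][left + j] for i in range(h) for j in range(k)]
    templ = [w[i][j] for i in range(h) for j in range(k)]
    f_mean = sum(window) / len(window)
    w_mean = sum(templ) / len(templ)
    num = sum((a - f_mean) * (b - w_mean) for a, b in zip(window, templ))
    den = sqrt(sum((a - f_mean) ** 2 for a in window)
               * sum((b - w_mean) ** 2 for b in templ))
    return num / den if den else 0.0   # flat window: score assumed to be 0


def best_match(f, w):
    """Return (score, top, left) of the best-matching window position."""
    h, k = len(w), len(w[0])
    best = None
    for top in range(len(f) - h + 1):
        for left in range(len(f[0]) - k + 1):
            score = correlation_coefficient(f, w, top, left)
            if best is None or score > best[0]:
                best = (score, top, left)
    return best
```

Because the means are subtracted and the result is normalized, a copy of the template with doubled brightness still scores 1, which is the amplitude insensitivity that motivates the coefficient.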

Experimental Results and Discussions
In this section, the experimental results of the proposed algorithm are presented. The proposed algorithm was implemented in MATLAB 6.5 running on a computer with a 1.6 GHz Atom processor and 1 GB of RAM. The images used in the experiments are color images of size 180×200 under various illumination conditions, obtained from the publicly available database of Computer Vision Science Research Projects [16].

Figure (13) Examples of eyes and faces located correctly
To test the performance of the proposed method, a comparison between the proposed method and other methods is shown in Table (2).

Conclusions
An approach to accurate face and eye detection for frontal color still images has been presented in this paper, combining image processing techniques with cross correlation. The proposed algorithm first uses image processing techniques to detect the face region. The precise locations of the eyes are then detected by template matching (cross correlation).
The proposed method has been tested on images from the Computer Vision Science Research Projects face database. Experimental results show that the method works well. For 278 images, the detection accuracy is 73.4% when the fixed threshold is used and 95.7% when the adaptive threshold is used. In addition, the average execution times of the proposed algorithm (4.16 seconds and 4.37 seconds for the fixed and adaptive thresholds, respectively) are acceptable, which shows that the approach is also efficient.
After comparing and analyzing the detection results, it was found that:
1. False detection in the failed images is mainly due to failure in face detection, which leads to false template matching.
2. All images that pass the 2nd stage (face detection) also pass the 3rd stage (eye locating) correctly, which means that every failure is caused by a face detection failure.
3. Template matching is an excellent and accurate detection method, despite its long execution time.
4. The system becomes more efficient and robust when the adaptive threshold is used.
5. The large difference between the results obtained with the fixed threshold (low success rate) and the adaptive threshold (high success rate) is due to the more appropriate binary image produced by the adaptive threshold method, which allows the face edges to be detected accurately in the next stage.
6. Comparing the results of this paper with previous work shows that the proposed algorithm is efficient and robust.