Face Recognition Using Kernel Methods



    1. Overview

    Eigenface, or Principal Component Analysis (PCA), methods have demonstrated their success in face recognition, detection, and tracking. The representation in PCA is based on the second order statistics of the image set and does not address higher order statistical dependencies such as the relationships among three or more pixels. Recently, Higher Order Statistics (HOS) have been used as a more informative low dimensional representation than PCA for face and vehicle detection. In this paper we investigate a generalization of PCA, Kernel Principal Component Analysis (Kernel PCA), for learning low dimensional representations in the context of face recognition.
    In contrast to HOS, Kernel PCA computes the higher order statistics without the combinatorial explosion of time and memory complexity. Whereas PCA captures only second order correlations among patterns, Kernel PCA provides a replacement that also takes higher order correlations into account. We compare the recognition results of kernel methods with those of Eigenface methods on two benchmarks. Empirical results show that Kernel PCA outperforms the Eigenface method in face recognition on both.
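
    The claim about avoiding the combinatorial explosion can be illustrated with a small sketch. It assumes the homogeneous polynomial kernel k(x, y) = (x . y)^d, which the text does not spell out, and uses made-up 4-pixel vectors. The explicit feature map lists every ordered product of d pixels, whereas the kernel obtains the same inner product directly from the raw pixels.

```python
from itertools import product

def explicit_features(x, d):
    """All ordered degree-d pixel products of x: len(x)**d features."""
    feats = []
    for idx in product(range(len(x)), repeat=d):
        term = 1.0
        for i in idx:
            term *= x[i]
        feats.append(term)
    return feats

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def poly_kernel(x, y, d):
    """Assumed kernel form: same inner product, computed in O(len(x)) time."""
    return dot(x, y) ** d

# Toy 4-pixel "images" (hypothetical values, for illustration only).
x = [0.5, 1.0, 2.0, 0.0]
y = [1.0, 0.5, 1.0, 3.0]
d = 3

explicit = dot(explicit_features(x, d), explicit_features(y, d))
implicit = poly_kernel(x, y, d)
print(len(explicit_features(x, d)), explicit, implicit)
```

    For a 29 x 41 image (1189 pixels) and d = 3, the explicit map would need roughly 1.7 billion features per image, while the kernel evaluation still costs a single 1189-term dot product.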

    2. Results

    We tested Kernel PCA with polynomial kernels against conventional PCA using two image databases. The Yale database contains 165 images of 15 subjects (11 images per subject) with variation in both facial expression and lighting. For efficiency, each image has been downsampled to 29 x 41 pixels. Figure 1 shows 22 closely cropped images, which include internal facial structures such as the eyebrows, eyes, nose, mouth and chin but do not contain the facial contours.
     


    Figure 1: The Yale Database.


    The experiments were performed using the "leave-one-out" strategy: to classify an image of a person, that image is removed from the set of M images, and the dimensionality reduction matrix is computed from the remaining M-1 training images. All M images (the M-1 training images and the held-out image) are then projected to the reduced space using the computed matrix, and recognition is performed with a nearest neighbor classifier. The number of eigenvectors (or principal components) is determined empirically for each method to achieve its lowest error rate. Table 1 shows the experimental results. The Kernel PCA method with a cubic polynomial kernel (d=3) achieves the lowest error rate, 24.24%. Furthermore, the Kernel PCA error rates for d=2 through d=10 stay within about three percentage points of one another (24.24% to 27.27%), which suggests that Kernel PCA is relatively insensitive to the degree of the polynomial kernel.
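
    The leave-one-out protocol described above can be sketched as follows. This is a minimal illustration rather than the code used in the experiments: it assumes NumPy, the kernel form k(x, y) = (x . y)^d, and a synthetic two-subject data set, and the function names are hypothetical.

```python
import numpy as np

def kernel_pca_fit(X, degree, n_components):
    """Fit Kernel PCA with the assumed kernel (x . y)**degree on the rows of X."""
    K = (X @ X.T) ** degree
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one        # center in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)              # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]   # keep the largest ones
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale coefficients so each component has unit norm in feature space.
    alphas = eigvecs / np.sqrt(np.maximum(eigvals, 1e-12))
    return {"X": X, "K": K, "alphas": alphas, "degree": degree}

def kernel_pca_transform(model, Z):
    """Project the rows of Z onto the fitted kernel principal components."""
    K, n = model["K"], model["K"].shape[0]
    K_new = (Z @ model["X"].T) ** model["degree"]
    one = np.full((n, n), 1.0 / n)
    one_new = np.full((Z.shape[0], n), 1.0 / n)
    Kc_new = K_new - one_new @ K - K_new @ one + one_new @ K @ one
    return Kc_new @ model["alphas"]

def leave_one_out_error(X, labels, degree, n_components):
    """Fraction of images whose nearest training neighbor has another label."""
    labels = np.asarray(labels)
    errors = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i                  # hold out image i
        model = kernel_pca_fit(X[keep], degree, n_components)
        train = kernel_pca_transform(model, X[keep])
        probe = kernel_pca_transform(model, X[i:i + 1])
        nearest = np.argmin(np.linalg.norm(train - probe, axis=1))
        errors += int(labels[keep][nearest] != labels[i])
    return errors / len(X)

# Synthetic stand-in for a face database: 2 "subjects", 5 noisy images each.
rng = np.random.default_rng(0)
prototypes = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
labels = np.repeat([0, 1], 5)
X = prototypes[labels] + 0.05 * rng.standard_normal((10, 6))

error = leave_one_out_error(X, labels, degree=3, n_components=2)
print(f"leave-one-out error rate: {100 * error:.2f}%")
```

    In this sketch the projection is refit for every held-out image, so the eigenvalue problem is solved M times. Choosing n_components would then amount to repeating the loop over candidate values and keeping the one with the lowest error rate.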
     
     

    Table 1.


     
    Method              Reduced Space (dimensions)    Error Rate (%)
    Eigenface           40                            28.49
    Kernel PCA, d=2     80                            27.27
    Kernel PCA, d=3     60                            24.24
    Kernel PCA, d=4     60                            24.85
    Kernel PCA, d=10    50                            26.01

    The AT&T (formerly Olivetti) database contains 400 images of 40 subjects with variation in facial expression and pose. Each face image is downsampled to 23 x 28 pixels to reduce the computational complexity. Figure 2 shows images of two subjects. In contrast to the Yale database, the images include the facial contours and certain pose variations. However, the lighting conditions remain the same.

    Figure 2: The AT&T Face Database.

    We use the same leave-one-out strategy as in the experiments on the Yale data set. Table 2 summarizes the empirical results. Consistent with the experiments on the Yale database, Kernel PCA methods achieve lower error rates than the Eigenface approach on the AT&T data set.
     


    Table 2.


     
    Method              Reduced Space (dimensions)    Error Rate (%)
    Eigenface           30                            2.75
    Kernel PCA, d=2     50                            2.50
    Kernel PCA, d=3     50                            2.00
    Kernel PCA, d=4     60                            2.25
    Kernel PCA, d=10    80                            2.25
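
    To put the Table 2 percentages in concrete terms, the short calculation below converts each error rate into a count of misclassified images. It assumes that each rate is the fraction of the 400 leave-one-out trials that were misclassified; the counts are derived here and are not reported in the source.

```python
M = 400  # images in the AT&T database, each classified once under leave-one-out

error_rates = {  # percentages copied from Table 2
    "Eigenface": 2.75,
    "Kernel PCA, d=2": 2.50,
    "Kernel PCA, d=3": 2.00,
    "Kernel PCA, d=4": 2.25,
    "Kernel PCA, d=10": 2.25,
}

# Assumption: error rate = misclassified images / M.
misclassified = {name: round(rate / 100 * M) for name, rate in error_rates.items()}

for name, count in misclassified.items():
    print(f"{name:18s} {count:3d} of {M}")
```

    Under that assumption, the cubic kernel misclassifies 8 images and the Eigenface method 11, so the gap on this database amounts to only a few images.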

     

    3. Conclusion

    The representation in the Eigenface approach is based on the second order statistics of the image set, i.e., the covariance matrix, and does not use high order statistical dependencies such as the relationships among three or more pixels. In a task such as face recognition, much of the important information may be contained in the high order statistical relationships among the pixels. We have investigated Kernel PCA and demonstrated that it provides a more effective representation for face recognition. Compared to other techniques for nonlinear feature extraction, Kernel PCA has the advantage that it does not require nonlinear optimization, only the solution of an eigenvalue problem. Experimental results on two benchmark databases show that the Kernel PCA method achieves a lower error rate than the Eigenface approach in face recognition.
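
    The eigenvalue problem referred to above can be written out as follows. This is the standard Kernel PCA formulation rather than notation taken from the text: n training images x_1, ..., x_n, a kernel k (here assumed to be the polynomial kernel of degree d), and symbols K, alpha and lambda introduced only for illustration.

```latex
% Illustrative notation; n, k, K, \alpha and \lambda are not defined in the source.
\begin{align}
  K_{ij} &= k(x_i, x_j) = (x_i \cdot x_j)^d, \qquad i, j = 1, \dots, n \\
  \tilde{K} &= K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n,
      \qquad (\mathbf{1}_n)_{ij} = 1/n \\
  \tilde{K} \alpha^{(m)} &= \lambda_m \alpha^{(m)},
      \qquad \lambda_m \, \bigl(\alpha^{(m)} \cdot \alpha^{(m)}\bigr) = 1 \\
  y_m(x) &= \sum_{i=1}^{n} \alpha^{(m)}_i \, \tilde{k}(x_i, x)
\end{align}
```

    Here the second line centers the data in feature space, the third is the symmetric n x n eigenvalue problem with its normalization, and y_m(x) is the m-th coordinate of an image x in the reduced space, with \tilde{k} denoting the centered kernel. No iterative nonlinear optimization appears at any step.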

    Future research will focus on analyzing face recognition with other kernel methods in high dimensional space. We plan to investigate and compare the performance of face recognition methods using Kernel Fisher Linear Discriminant, Independent Component Analysis, and Kernel PCA.
     
     

    Publications