Robust Recognition of Planar Shapes Under Affine Transforms Using Principal Component Analysis

A scheme, based on principal component analysis (PCA), is proposed that can be used for the recognition of 2-D planar shapes under affine transformations. A PCA step is first used to map the object boundary to its canonical form, reducing the problem of the nonuniform sampling of the object contour introduced by the affine transformation. Then, a PCA-based scheme is employed to train a set of basis functions on the signals extracted from the objects' boundaries. The derived bases are used to analyze the boundary locally. Based on the theory of invariants and local boundary analysis, a novel invariant function is constructed. The performance of the proposed framework is compared with a standard wavelet-based approach with promising results.


Georgios Tzimiropoulos, Nikolaos Mitianoudis and Tania Stathaki

I. INTRODUCTION
Assume that a collection of objects of interest is stored in a database and that we wish to design an algorithm which uses shape information, extracted from the objects' boundaries, to identify these objects in an unknown environment. A desirable property of a robust recognition system is invariance to the deformation of the object shape caused by arbitrary camera viewpoints. To tackle the problem, a common simplification made by the computer vision community is to approximate shape variation with an affine transformation. Let $c(t) = [c_x(t), c_y(t)]^T$ denote a parametric closed curve in 2-D space, representing the boundary of an object, where $t$ is an arbitrary parameter used for curve parameterization (e.g. the arc length). An affine transformation models scaling, rotation, shearing and translation of the object boundary as follows:

$$c'(t') = A\,c(t) + b \qquad (1)$$

where $A$ is a $2 \times 2$ nonsingular matrix and $b$ is a $2 \times 1$ vector. The matrix $A$ can be decomposed into factors parameterized by $s$, $\theta$ and $\alpha$, where $s \in \mathbb{R}^+$ models global scaling, $\theta \in [0, 2\pi)$ models rotation and $\alpha \in \mathbb{R}$ is the shearing parameter. The vector $b$ represents translation. The transformed parameter $t'$, which is, in general, a function of $t$, reflects the problem of sampling the object contour appropriately.
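The affine model $c' = A c + b$ described above can be illustrated with a short NumPy sketch. The factorization used here, $A = s\,R(\theta)\,\mathrm{Shear}(\alpha)$, is an assumption made for illustration and is not necessarily the decomposition intended by the authors. The function names and the elliptical test contour are likewise hypothetical.

```python
import numpy as np

def affine_matrix(s, theta, alpha):
    """Build A from scaling s, rotation theta (radians) and shear alpha.

    Assumed factorization (not taken from the source):
    A = s * R(theta) * Shear(alpha).
    """
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
    shear = np.array([[1.0, alpha],
                      [0.0, 1.0]])
    return s * (rotation @ shear)

def apply_affine(c, A, b):
    """Map a 2 x N array of contour points c to A c + b."""
    return A @ c + np.asarray(b, dtype=float).reshape(2, 1)

# Hypothetical test contour: an ellipse sampled uniformly in t.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
c = np.vstack([2.0 * np.cos(t), np.sin(t)])

A = affine_matrix(s=1.5, theta=np.deg2rad(60.0), alpha=2.0)
c_prime = apply_affine(c, A, b=[3.0, -1.0])

# Relative spread of the spacing between consecutive samples, before and
# after the transform; a change indicates altered sampling density.
spacing = np.linalg.norm(np.diff(c, axis=1), axis=0)
spacing_prime = np.linalg.norm(np.diff(c_prime, axis=1), axis=0)
print(spacing.std() / spacing.mean(), spacing_prime.std() / spacing_prime.mean())
```

Comparing the two printed ratios gives a rough indication of how the transform redistributes the samples along the contour, which is the sampling problem the text refers to.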
This work is sponsored by SELEX through the Systems Engineering for Autonomous Systems (SEAS) Defence Technology Centre.
Suppose we compute a quantity $I$ from $c$ and the corresponding quantity $I'$ from $c'$. If $I$ and $I'$ are related as follows:

$$I' = \mu I$$

where $\mu \neq 0$ is a constant, then $I$ is an affine invariant. If $\mu = 1$, $I$ is called an absolute invariant; otherwise, $I$ is called a relative invariant.
Signal processing techniques that derive features from the boundary representation which remain invariant under affine transformations have played a key role in shape-based recognition algorithms. Classical shape analysis includes Fourier descriptors [1], moments [2] and matched filtering [3]. More recently, methods based on the dyadic wavelet transform have been proposed, which are reported to achieve state-of-the-art performance. The object boundary is analyzed at different scales, yielding the approximation and detail signals, which are then used for the construction of affine invariant functions. The choice of the signals, the number of decomposition levels and the wavelet functions used has resulted in a number of different approaches [4], [5], [6], [7], [8]. Unfortunately, the performance of the above methods is strongly affected by the non-uniform sampling introduced by the boundary extraction process and by the parameter transformation problem described above.
In this work, we propose a scheme, based on Principal Component Analysis (PCA), that can be used for object identification under large deformations and noise. Using PCA, the affine transformed boundary is first mapped to its canonical form. This step preserves object shape information and, at the same time, provides an efficient way to tackle the problem of the non-uniform sampling of the object contour. Then, the object boundary is analyzed locally, similarly to the wavelet analysis, using basis functions, derived from a PCA-based training scheme and the boundaries of the objects of interest. Based on the theory of determinants, an invariant function is constructed, which can be used for object identification under affine transforms.

A. Robust boundary encoding using PCA
In this section, the problem of an appropriate parameterization of the object boundary is considered. As mentioned earlier, to preserve the linearity of the affine transformation, the object contour must be parameterized using a parameter that transforms linearly under affine transformation. In this case, a one-to-one correspondence can be established between points that are equally spaced, in terms of the affine invariant parameter, on the original contour and the corresponding points on its affine transformed version [4]. The most popular parameter used is the enclosed area, defined as [1]:

$$\sigma = \frac{1}{2} \int_a^b \left| x(t)\,\dot{y}(t) - y(t)\,\dot{x}(t) \right| dt$$

where $\dot{x}$, $\dot{y}$ are the first-order derivatives of $x(t)$ and $y(t)$ with respect to $t$ and the integration interval $[a, b]$ denotes a segment along the curve. It transforms linearly under affine transformation only if the translation vector $b$ is zero. For this purpose, the origin of the coordinate system is set to the object area center [1]. Unfortunately, the computation of both the enclosed area and the area center is very sensitive to large deformations and noise, which degrades the performance of the matching process.
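A discrete version of the enclosed area parameter can be sketched as follows, to show why it transforms linearly only when there is no translation. The sketch assumes the standard triangle-area discretization of $\tfrac{1}{2}|x\dot{y} - y\dot{x}|$. The function name, the test contour and the matrix `A` are hypothetical, and the origin is taken as given rather than set to the area center.

```python
import numpy as np

def enclosed_area_parameter(c):
    """Cumulative enclosed-area parameter of a closed 2 x N contour.

    Each increment is the area of the triangle spanned by the origin and two
    consecutive contour points, a discrete version of 0.5*|x*dy - y*dx|.
    """
    x, y = c
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    increments = 0.5 * np.abs(x * y_next - x_next * y)
    return np.concatenate([[0.0], np.cumsum(increments)])

# Hypothetical star-like contour around the origin.
t = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
r = 1.0 + 0.3 * np.cos(5.0 * t)
c = np.vstack([r * np.cos(t), r * np.sin(t)])

A = np.array([[1.2, 0.8], [-0.3, 0.9]])
sigma = enclosed_area_parameter(c)
sigma_prime = enclosed_area_parameter(A @ c)                       # b = 0
sigma_shifted = enclosed_area_parameter(A @ c + np.array([[0.5], [0.2]]))

# With b = 0 the parameter is scaled by |det(A)| at every sample.
print(np.allclose(sigma_prime, abs(np.linalg.det(A)) * sigma))
```

With a nonzero translation (`sigma_shifted`) the proportionality no longer holds, which is why the origin must first be moved to a reference point that follows the affine map.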
To reduce the effect of the parameterization problem, we use Principal Component Analysis to map the affine transformed shape to its canonical form. For this purpose, $c'$ is first normalized to zero mean and, then, its covariance matrix, defined as $R_{c'} = E\{c' c'^T\}$, is estimated. Let $H$ be the matrix containing the eigenvectors of $R_{c'}$ and $D$ the diagonal matrix containing the eigenvalues of $R_{c'}$, such that the $i$-th diagonal element corresponds to the $i$-th column of $H$. Then, the canonical form $z'$ of the affine distorted shape $c'$ is given by:

$$z' = W c' \qquad (5)$$

where $W$ is the whitening matrix defined as

$$W = D^{-1/2} H^T$$

It can be shown that $z'$ is uncorrelated and of unit variance, that is, $E\{z' z'^T\}$ equals the identity matrix. Using the same principle, suppose that we compute the canonical form $z$ of the original contour $c$. Then, it can be shown that $z'$ and $z$ are related as follows:

$$z' = \Theta z$$

where $\Theta$ is a $2 \times 2$ orthogonal matrix representing a rotation, possibly combined with a reflection [9]. Note that the whitening transform preserves all the shape information of $c'$. This is simply due to the fact that Eq. (5) can be seen as an affine transformation itself. We further mention that the accurate computation of the second moment matrix $R_{c'}$ also requires an appropriate affine invariant curve parameterization. However, in our experiments we have observed that the shape principal axes can be restored successfully, under large deformations and noise, without employing the enclosed area parameter.
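The whitening step can be checked numerically with a minimal sketch. It assumes the covariance is estimated as a sample average over the contour points, and that the same sample points are mapped by the affine transform, so that point correspondence is exact. In practice the transformed contour would be re-extracted and re-sampled. The function name and the test shape are hypothetical.

```python
import numpy as np

def canonical_form(c):
    """Whiten a 2 x N contour: zero mean, identity sample covariance."""
    c0 = c - c.mean(axis=1, keepdims=True)
    R = c0 @ c0.T / c0.shape[1]            # sample covariance (2 x 2)
    eigvals, H = np.linalg.eigh(R)         # columns of H are eigenvectors
    W = np.diag(eigvals ** -0.5) @ H.T     # whitening matrix D^(-1/2) H^T
    return W @ c0, W

# Hypothetical asymmetric test contour.
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
r = 1.0 + 0.3 * np.cos(3.0 * t) + 0.2 * np.sin(2.0 * t)
c = np.vstack([r * np.cos(t), r * np.sin(t)])

# Scaling and shear, followed by a translation.
A = 1.5 * np.array([[1.0, 2.0], [0.0, 1.0]])
c_prime = A @ c + np.array([[4.0], [-2.0]])

z, _ = canonical_form(c)
z_prime, _ = canonical_form(c_prime)

# Because z has identity covariance, Theta can be read off as E{z' z^T}.
Theta = z_prime @ z.T / z.shape[1]
print(np.round(Theta @ Theta.T, 6))        # close to the identity matrix
```

Under these assumptions the two canonical forms differ only by the orthogonal matrix `Theta`, so scaling, shear and translation have been removed.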
Once the object has been brought to its canonical form, we can encode the object boundary using a set of equally spaced points, in terms of the arc length, whose computation is simpler and more accurate. In this way, the stability and robustness of the invariant features, derived from the object boundary, are much less affected by the parameter transformation problem.
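One simple way to obtain points that are equally spaced in arc length is linear interpolation along the polygon defined by the extracted contour. The sketch below is an assumption about how this encoding could be implemented, not the authors' procedure. It assumes that no two consecutive contour points coincide, and the function name is hypothetical.

```python
import numpy as np

def resample_arc_length(c, n_points):
    """Resample a closed 2 x N contour at n_points equal arc-length steps.

    The first point is not repeated at the end of c; consecutive points are
    assumed to be distinct so that the cumulative length is increasing.
    """
    closed = np.hstack([c, c[:, :1]])                      # close the polygon
    seg = np.linalg.norm(np.diff(closed, axis=1), axis=0)  # edge lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])            # cumulative length
    targets = np.linspace(0.0, s[-1], n_points, endpoint=False)
    x = np.interp(targets, s, closed[0])
    y = np.interp(targets, s, closed[1])
    return np.vstack([x, y])

# Hypothetical example: a unit circle sampled very non-uniformly.
u = np.linspace(0.0, 1.0, 400, endpoint=False)
t = 2.0 * np.pi * u ** 2
c = np.vstack([np.cos(t), np.sin(t)])

c_uniform = resample_arc_length(c, 64)
gaps = np.linalg.norm(c_uniform - np.roll(c_uniform, -1, axis=1), axis=0)
print(gaps.std() / gaps.mean())    # small when the spacing is nearly uniform
```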

B. An invariant function using PCA bases
We have considered, so far, a PCA-based preprocessing step, which can be used to reduce the effect of the non-uniform sampling of the object contour introduced by random affine transformations. We will now describe a methodology to derive kernels that can be employed for shape analysis, based on Principal Component Analysis and the boundaries of the objects of interest.
We assume that all objects are normalized to their canonical form and represented by a set of $N_c$ equally spaced points, in terms of the arc length parameter. Let us denote by $s(n) = [s_x(n), s_y(n)]^T$ any curve segment of length $N_s$ along the boundary of any of our objects, and let $s_{xy}$ be an $N_s \times 1$ vector representing either $s_x(n)$ or $s_y(n)$. Our target is to estimate a set of appropriate basis functions to represent $s_{xy}$. Principal Component Analysis can be used to train a set of uncorrelated bases on the curve segments extracted from the objects' boundaries, by optimizing an energy compaction mechanism. It has been used for feature extraction and dimensionality reduction in a wide range of applications [10], [11]. Since it operates on the second-order statistics of the signal, it implicitly assumes a Gaussian distribution of the data. However, this Gaussian profile imposed by PCA seems quite reasonable due to the low-pass nature of the signals obtained from the objects' boundaries.
Curve segments from all objects are placed in a matrix $S$ as follows:

$$S = \left[ s_{xy}^{(1)}, s_{xy}^{(2)}, \ldots, s_{xy}^{(K)} \right]$$

where $K$ is the total number of curve segments. We note that, when constructing the matrix $S$, the contour of each object should be divided into a number of overlapping segments, to ensure signal stationarity and to capture all local structure. In the usual way, we compute the eigenvalue decomposition of the covariance matrix of $S$, defined as $R_S = S S^T / K$. If $U$ is the matrix containing the eigenvectors of $R_S$ and $Y$ is the diagonal matrix containing the eigenvalues of $R_S$, such that the $i$-th diagonal element corresponds to the $i$-th column of $U$, then the PCA bases are defined as the rows of a matrix $V$ obtained from this decomposition. Overall, a total of $N_s$ basis functions is provided by the scheme described above. However, one can easily form a reduced set of bases that can be used for an efficient data representation with only a small loss of information. This is because the significance of the derived bases is indicated by the corresponding eigenvalues. Therefore, by selecting the most significant basis functions, one can retain most of the signal structure and, at the same time, perform considerable noise reduction. In fact, this is an advantage of our approach over the popular wavelet-based methods, which, depending on the application, entail a thorough investigation of the wavelet decomposition tree in order to identify the wavelet and pick the levels which yield the best possible performance [4]. In contrast, the PCA analysis functions are directly estimated from the actual data, along with a measure of significance, which simplifies the selection process. The eight most important PCA bases for the application of aircraft silhouette identification (considered in the following section), along with their frequency content, can be seen in Fig. 1. It can be observed that the boundary is analyzed locally using a series of bandpass filters, similarly to the wavelet transform.
We note that the function corresponding to the largest eigenvalue is not presented, since it represents a DC component, i.e. a change in signal level, and, therefore, is of no significance.
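The training scheme described in this subsection can be sketched as follows. Several choices here are assumptions made for illustration: the bases are taken as the orthonormal eigenvectors themselves (any eigenvalue-dependent scaling of $V$ is omitted), the segments are extracted circularly with a hypothetical step of 8 samples, and the two synthetic contours merely stand in for real canonical-form boundaries.

```python
import numpy as np

def extract_segments(signal, seg_len, step):
    """Overlapping circular segments of one coordinate signal (x or y)."""
    n = len(signal)
    starts = np.arange(0, n, step)[:, None]
    idx = (starts + np.arange(seg_len)[None, :]) % n
    return signal[idx].T                       # seg_len x num_segments

def train_pca_bases(contours, seg_len=64, step=8):
    """Return (bases, eigenvalues); rows of `bases` are sorted by significance."""
    cols = [extract_segments(coord, seg_len, step)
            for z in contours for coord in z]  # x and y of every contour
    S = np.hstack(cols)                        # N_s x K
    R = S @ S.T / S.shape[1]                   # N_s x N_s second-moment matrix
    eigvals, U = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    return U[:, order].T, eigvals[order]

# Hypothetical stand-ins for canonical-form boundaries with N_c = 512 points.
t = np.linspace(0.0, 2.0 * np.pi, 512, endpoint=False)
contours = [np.vstack([np.cos(t) + 0.2 * np.cos(k * t),
                       np.sin(t) + 0.2 * np.sin(k * t)]) for k in (3, 5)]

V, significance = train_pca_bases(contours)
reduced = V[1:3]   # e.g. skip the first basis and keep the next two
```

The eigenvalues in `significance` give the ranking used to pick a reduced set of bases. Skipping the first row mirrors the remark that the basis with the largest eigenvalue represents a DC component.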
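Let us denote by $P_i\varphi$ the set of coefficients obtained by projecting the signal $\varphi$ onto the $i$-th basis function, using a sliding window approach, and by $z_x$, $z_y$ the two coordinate signals of the canonical form $z$. Since the projection is linear and the canonical forms are related by $z' = \Theta z$, by applying the $i$-th and the $j$-th kernel to the object boundary, one can form:

$$\begin{bmatrix} P_i z'_x & P_j z'_x \\ P_i z'_y & P_j z'_y \end{bmatrix} = \Theta \begin{bmatrix} P_i z_x & P_j z_x \\ P_i z_y & P_j z_y \end{bmatrix}$$

Taking the determinants of both sides yields:

$$I'_1 = \det(\Theta)\, I_1 = \pm I_1, \qquad I_1 = P_i z_x\, P_j z_y - P_i z_y\, P_j z_x \qquad (10)$$

since $\det(\Theta) = \pm 1$. Therefore, the function $I_1$ is a relative invariant with $\mu = \pm 1$. Note that the above derivation has been used to provide invariants for the case of general affine transformations, whereas in our case the matrix $\Theta$ represents only a rotation or reflection. However, it still provides an efficient way to combine the transform coefficients, obtained by applying two different kernels to the object boundary, in a single function. Fig. 2 shows the boundary extracted from an aircraft model, its canonical form and the corresponding derived invariant function $I_1$.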
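The determinant construction can be verified numerically with a brief sketch. It assumes a circular sliding window, so that every boundary sample yields one coefficient. The function names, the random test signal and the random kernels are hypothetical and serve only to test the invariance property.

```python
import numpy as np

def project(signal, basis):
    """Circular sliding-window projection of a signal onto one basis.

    P[n] = sum_k basis[k] * signal[(n + k) mod N]
    """
    n = len(signal)
    idx = (np.arange(n)[:, None] + np.arange(len(basis))[None, :]) % n
    return signal[idx] @ basis

def invariant_function(z, basis_i, basis_j):
    """Determinant of the 2 x 2 coefficient matrix at every boundary sample."""
    zx, zy = z
    return (project(zx, basis_i) * project(zy, basis_j)
            - project(zy, basis_i) * project(zx, basis_j))

rng = np.random.default_rng(0)
z = rng.standard_normal((2, 128))        # stand-in for a canonical contour
basis_i = rng.standard_normal(16)
basis_j = rng.standard_normal(16)

angle = 0.7
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle),  np.cos(angle)]])
reflection = np.diag([1.0, -1.0])

I1 = invariant_function(z, basis_i, basis_j)
I1_rotated = invariant_function(rotation @ z, basis_i, basis_j)
I1_mirrored = invariant_function(reflection @ z, basis_i, basis_j)
print(np.allclose(I1_rotated, I1), np.allclose(I1_mirrored, -I1))
```

A rotation leaves the function unchanged and a reflection flips its sign, which is the sign ambiguity of the relative invariant.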
The degree of similarity between two objects $\kappa$ and $\lambda$ can be measured as the maximum value of the normalized circular cross-correlation [6]:

$$C_{1,\kappa,\lambda}(m) = \frac{\sum_n I_{1,\kappa}(n)\, I_{1,\lambda}(n - m)}{\sqrt{\sum_n I_{1,\kappa}(n)^2 \, \sum_n I_{1,\lambda}(n)^2}} \qquad (11)$$

where $I_{1,\kappa}$ and $I_{1,\lambda}$ are the invariant functions derived from the objects $\kappa$ and $\lambda$ respectively and $I_{1,\lambda}$ is circularly shifted. If the maximum absolute value is used, then the sign ambiguity introduced by Eq. (10) is removed and the function $I_1$ can be considered an absolute invariant. The circular cross-correlation is used to reduce the effect of the unknown shift between the starting points of the two contours [5]. This is because both $c(t)$ and $c(t + t_0)$, where $t_0$ is any shift in the origin of the object boundary, essentially represent the same object.
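The similarity measure can be computed efficiently for all shifts $m$ at once using the FFT. The sketch below is one possible implementation under the assumption that both invariant functions are real and have the same length. The function name and the random test signals are hypothetical.

```python
import numpy as np

def similarity(I_k, I_l):
    """Maximum absolute normalized circular cross-correlation of two signals."""
    # corr[m] = sum_n I_k[n] * I_l[(n - m) mod N], evaluated for every shift m.
    corr = np.fft.ifft(np.fft.fft(I_k) * np.conj(np.fft.fft(I_l))).real
    norm = np.sqrt(np.sum(I_k ** 2) * np.sum(I_l ** 2))
    return np.max(np.abs(corr)) / norm

rng = np.random.default_rng(1)
a = rng.standard_normal(256)
b = rng.standard_normal(256)

same_shifted = similarity(a, np.roll(a, 17))    # same signal, new start point
same_flipped = similarity(a, -np.roll(a, 5))    # sign flip from a reflection
different = similarity(a, b)
print(same_shifted, same_flipped, different)
```

Taking the absolute value before the maximum makes the score insensitive both to the starting point of the contour and to the sign ambiguity of the relative invariant.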

III. RESULTS
To evaluate the performance of our approach, the method is applied to the within-class object recognition problem of aircraft silhouette identification. Fig. 3 shows the contours of $K = 20$ aircraft models, which have also been used in [6] to test the discrimination power of the invariant function and its ability to capture small variations. To model a $128 \times 128$ image resolution, the resolution of all model images in Fig. 3 is such that each aircraft approximately fits in a $d_1 \times d_2$ rectangular grid, where $d_1 d_2 = 128^2$. From these images, the aircraft contours are extracted using a simple 8-point connectivity algorithm [12]. Each model is then transformed to its canonical form and represented by a set of $N_c = 512$ equally spaced points, in terms of the arc length parameter. The length of the curve segment $N_s$ is chosen to be equal to 64 points. The PCA basis functions are extracted and, for each model, the invariant function $I_1$ is computed using the two most significant PCA bases, as described in the previous section. For each aircraft model, a set of test images is generated, which depict the same model under large deformations. We have used the following affine parameters: $\theta = \{0^\circ, 60^\circ, 120^\circ, 180^\circ\}$ and $\alpha = \{2, 3\}$. The boundaries of the test images are extracted in the same way and, for each test boundary, the corresponding invariant function is derived.
The proposed scheme is compared with a popular wavelet-based affine invariant function [6], denoted as $I_2$. For the computation of $I_2$, we first parameterize the object boundary using the signed enclosed area parameterization and a scheme based on the methods proposed in [3] and [4]. We do not claim that our implementation is optimal; however, for each model, we have used the parameters (see [3]) that appear to yield the best possible performance. In our method, by contrast, all parameters remain fixed and independent of the model under examination. The signed enclosed area parameter is employed, since it is found to be less sensitive to noise [13]. The invariant function $I_2$ is computed at scales (5, 6) and (6, 7), which provide the best trade-off between discrimination capability and robustness to noise. The quadratic B-spline wavelet has been used, which is reported to give the best results [8]. The performance of the two invariant functions is evaluated in a noisy environment. Uniformly distributed noise is artificially added to the $x$ and $y$ coordinates of each contour point, after the extraction of the boundary from each test image. The amount of noise added is controlled using the Signal to Noise Ratio (SNR) defined in [4]. We have considered a large noise level of SNR = 20 dB [6]. For each test image, the experiment is repeated 100 times. The classification results are given in Table I. As can be observed, the proposed algorithm features robust performance and appears to be much more stable than the standard wavelet-based method.

IV. CONCLUSIONS
We have presented a PCA-based framework with the goal of robust shape-based object recognition under affine transformations. The role of Principal Component Analysis in the proposed scheme is twofold. First, it is used to map the affine transformed shape to its canonical form in order to tackle the problem of sampling the object contour appropriately. Then, it is used to train a set of basis functions with desired properties on the signals extracted from the object boundaries. The derived basis functions are then used for the construction of a novel invariant function. The proposed framework is applied to the problem of aircraft silhouette identification and compared with a popular wavelet-based method. Simulation results show that our method outperforms the wavelet-based method. In addition, the extra problem of identifying the most appropriate wavelet and decomposition levels does not exist in the proposed scheme.