Optimal Endmember-Based Super-Resolution Land Cover Mapping

Super-resolution mapping (SRM) aims to determine the spatial distribution of the land cover classes contained in the area represented by mixed pixels, to obtain a more accurate map at a finer spatial resolution than the input remotely sensed image. Image-based SRM models directly use the observed images as input and can mitigate the uncertainty caused by class fraction errors. However, existing image-based SRM models always adopt a fixed set of endmembers for the entire image, ignoring the spatial variability and spectral uncertainty of endmembers. To address this problem, this letter proposes an optimal endmember-based SRM (OESRM) model, which considers the spatial variations in endmembers and determines the best-fit endmember for each coarse resolution pixel using the spectral angle and the spectral distance as the spectral similarity indexes. A Sentinel-2A image and a Landsat-8 multispectral image were used to analyze the performance of OESRM by comparison with three other SRM methods that adopt a fixed endmember set or multiple endmember sets. The results showed that OESRM generated land cover maps with more spatial detail and reduced the confusion between land cover classes with similar spectral features. The proposed OESRM model produced the results with the highest overall accuracy in both experiments, showing its effectiveness in reducing the effect of endmember uncertainty on SRM.


I. INTRODUCTION
SUPER-RESOLUTION mapping (SRM) is a process that aims to determine the spatial distribution of different land cover classes within mixed pixels. SRM can be regarded as a way to enhance the spatial resolution of remotely sensed images, as it produces a land cover map with a higher spatial resolution than the input remotely sensed data [1,2]. Therefore, SRM is a promising approach to reduce the negative effects of mixed pixels on the extraction of land cover information from remotely sensed images. In the past two decades, various SRM algorithms have been proposed, such as the Hopfield neural network [3], pixel swapping [4], spatial interpolation [5], and SRM with a direct mapping model [6]. SRM has also been successfully used in many fields, including urban tree mapping [7] and waterline mapping [8].
According to the input data, there are two types of SRM model: fraction based SRM and image based SRM. Fraction based SRM is a method in which land cover fraction images are produced from the remotely sensed imagery by spectral unmixing and used as the input to the SRM analysis to estimate the fine spatial resolution land cover map. Fraction based SRM is widely used, but it is limited because the fraction images produced by spectral unmixing often include errors, which may degrade the accuracy of the resultant land cover map [9]. In contrast, image based SRM directly uses the remotely sensed imagery as its input. Consequently, image based SRM avoids errors associated with the production of fraction images. The fuzzy c-means based SRM model [10], the spectral and spatial integration SRM model [11], and the Markov random field based SRM model [9,12] are representative image based SRM models.
The aim of image based SRM models is the direct generation of a fine spatial resolution land cover map from coarse resolution remotely sensed imagery. During the process, endmembers, each of which represents the spectral information of a land cover class, are necessary to transform observed spectral information into resultant land cover category information. It is critical to select suitable endmembers to make the conversion between the spectrum and category accurate. However, existing image based SRM methods typically use a fixed set of endmembers over the entire image. The effects of spatial variability in the spectral properties of the classes and of spectral uncertainty of the endmembers have not been fully considered in SRM.
In contrast to the very few studies of endmember uncertainty in SRM, extensive research has focused on the effect of endmember uncertainty in spectral unmixing [13,14]. Typically, an endmember library is first constructed, and then the optimal endmember combination is selected for each coarse resolution pixel according to a certain criterion, such as the root-mean-squared error (RMSE) [15], the spectral angle mapper (SAM) criterion [16], or the spectral angle and spectral distance parameter [17]. This kind of method can, to a large extent, reduce the errors in spectral unmixing related to endmember variability.
In this letter, an optimal endmember-based SRM model (OESRM) is proposed. Unlike traditional SRM models that use a fixed endmember set for the entire image, the proposed OESRM model uses the optimal endmember combination for each coarse resolution pixel to reduce the effect of [...] be the $B$-band multispectral remotely sensed imagery with the spatial resolution of $R$. Let $N = I \times J$ be the total number of coarse pixels in $Y$. By setting $z$ as the scale factor, SRM aims to generate a labeled land cover map $X$ containing $(z \times I) \times (z \times J)$ finer resolution pixels. The fine resolution pixel label in $X$ is defined as $c$ ($c = 1, 2, \ldots, C$, where $C$ is the number of land cover classes in $X$).
In general, the image based SRM model is established by minimizing an objective function $E$, which is made up of two parts [11]. The first part, $E_{\mathrm{spectral}}$, is the spectral term, which provides spectral information from the remotely sensed image $Y$. The second part, $E_{\mathrm{spatial}}$, is the spatial term, which gives spatial information of the fine resolution land cover map $X$. These two terms of the objective function are balanced by a weighting parameter.

B. Spectral Term
The objective of the spectral term is to minimize the difference between the spectra observed in the coarse resolution pixels and the spectra simulated from the land cover labels of the fine resolution pixels. The spectral constraint is formulated as the minimization of the energy function $E_{\mathrm{spectral}}$ [18], in which $y_{ij}$ is the observed spectrum of the coarse resolution pixel $(i,j)$, and $f_{ij}$ is the class fraction vector, calculated by dividing the number of fine resolution pixels of each land cover class in the coarse resolution pixel $(i,j)$ by $z \times z$. $e_{ij}$ is a $B \times C$ matrix that represents the endmembers of all land cover classes in the coarse resolution pixel $(i,j)$. Therefore, $e_{ij} f_{ij}$ represents the synthetic spectrum for the coarse resolution pixel $(i,j)$ on the basis of the linear mixture model.
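As an illustration of the quantities defined above, the following minimal sketch computes the class fraction vector and a spectral residual for a single coarse pixel. The exact form of the energy function is not reproduced here, so the squared Euclidean residual is an assumption, as are the function names `class_fractions` and `spectral_energy` and the toy values. Labels are 0-based in the code, whereas the text numbers classes from 1 to $C$.

```python
import numpy as np

def class_fractions(labels, n_classes):
    """Fraction of each class among the z*z fine pixels of one coarse pixel."""
    counts = np.bincount(labels.ravel(), minlength=n_classes)
    return counts / labels.size

def spectral_energy(y, e, labels):
    """Squared residual between the observed spectrum y (B,) and the
    synthetic spectrum e @ f, where e is the (B, C) endmember matrix.
    The squared Euclidean norm is an assumption of this sketch."""
    f = class_fractions(labels, e.shape[1])
    residual = y - e @ f
    return float(residual @ residual)

# Toy example: B = 3 bands, C = 2 classes, scale factor z = 2.
e = np.array([[0.1, 0.5],
              [0.2, 0.6],
              [0.3, 0.7]])
labels = np.array([[0, 0],
                   [0, 1]])          # three fine pixels of class 0, one of class 1
f = class_fractions(labels, 2)       # fractions 0.75 and 0.25
y = e @ f                            # spectrum that matches these labels exactly
energy = spectral_energy(y, e, labels)
```

With a label configuration that reproduces the observed spectrum, the residual vanishes; any relabeling that changes the class fractions increases it.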
The endmember combination of the coarse resolution pixel $(i,j)$, $e_{ij}$, has a great influence on the spectral term. Rather than using a single fixed endmember set, an optimal endmember combination is estimated for each coarse pixel to account for the spatial variability and spectral uncertainty of endmembers.
Here, the spectral similarity index (SSI) is used as the criterion for selecting the optimal combination of endmembers for each coarse resolution pixel. First, for each land cover class, a set of representative endmembers is extracted from the original image to construct the candidate endmember library. Then, for each coarse resolution pixel $(i,j)$ and $e_{cm}$ (the $m$th candidate endmember of the land cover class $c$), the value of SSI, which measures the similarity between the two spectra, is calculated from the spectral angle and the spectral distance between the two spectra. In OESRM, for each land cover class in the coarse resolution pixel $(i,j)$, the endmember with the maximum $SSI_{ijcm}$ is regarded as the most probable endmember in this specific coarse resolution pixel [16]. The most probable endmembers of all land cover classes form the optimal endmember combination for the coarse resolution pixel $(i,j)$.
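The per-pixel selection rule can be sketched as follows. Because the SSI formula itself is not reproduced here, this sketch substitutes the cosine of the spectral angle as a stand-in similarity measure; this is an assumption, not the SSI of the letter. The names `cosine_similarity` and `select_optimal_endmembers`, as well as the library values, are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the spectral angle between two spectra (stand-in for SSI)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_optimal_endmembers(y, library, similarity=cosine_similarity):
    """Pick, for each class, the candidate most similar to the pixel spectrum.

    y       : (B,) observed spectrum of one coarse pixel
    library : list of (M_c, B) arrays, one array of candidates per class
    Returns the (B, C) endmember matrix and the chosen candidate indices.
    """
    chosen = [int(np.argmax([similarity(y, cand) for cand in cands]))
              for cands in library]
    e = np.stack([cands[m] for cands, m in zip(library, chosen)], axis=1)
    return e, chosen

# Toy library: two classes with two candidates each, B = 3 bands.
library = [
    np.array([[0.05, 0.04, 0.02], [0.10, 0.06, 0.03]]),
    np.array([[0.20, 0.25, 0.30], [0.05, 0.10, 0.40]]),
]
y = np.array([0.06, 0.11, 0.41])
e, chosen = select_optimal_endmembers(y, library)
```

Any other similarity function with the same signature, for example one that also accounts for spectral distance, can be passed through the `similarity` argument without changing the selection logic.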

C. Spatial Term
The aim of the spatial term is to model the spatial distribution of land cover for fine spatial resolution pixels. Here, the maximal spatial dependence model, which is used to make the fine spatial resolution land cover map spatially smooth [12], was adopted as the spatial term $E_{\mathrm{spatial}}$, in which $N_a$ is the square spatial neighborhood composed of all fine spatial resolution pixels inside a square window whose center is [...]

D. OESRM Initialization and Optimization
The Iterated Conditional Modes (ICM) algorithm was adopted to minimize the OESRM global energy for the entire remotely sensed image. The implementation steps of OESRM are:
1) Set the parameters, including the scale factor $z$, the number of classes $C$, the balancing parameter of the spatial term, the neighborhood window size $W$, and the number of iterations $T$.
2) Construct the candidate endmember library from the input multispectral image $Y$.
3) Select the optimal endmember combination for each coarse resolution pixel according to the SSI principle.
4) Random initialization: all fine resolution pixels are randomly labeled to generate an initial fine resolution land cover map.
5) The class labels are iteratively updated according to Eq. (1) over the entire image. For each fine resolution pixel, the class label that minimizes the objective function is taken as the candidate label.
6) ICM converges when no pixel class label changes between two consecutive iterations, or when the predefined number of iterations has been completed.
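Steps 4) to 6) can be sketched as a plain ICM loop. This is a minimal illustration under stated assumptions rather than the implementation used in the letter: the spectral energy is taken as a squared residual, the spatial term is replaced by the fraction of neighbors inside the $W \times W$ window that carry a different label (a simple stand-in for the maximal spatial dependence model), and the function name `icm_srm` and the weight name `lam` are hypothetical. Labels are 0-based.

```python
import numpy as np

def icm_srm(Y, E, z, n_classes, lam, W, T, seed=0):
    """Y: (I, J, B) coarse image; E: (I, J, B, C) per-pixel endmember matrices.
    Returns a (z*I, z*J) label map after at most T ICM sweeps."""
    rng = np.random.default_rng(seed)
    I, J, _ = Y.shape
    X = rng.integers(0, n_classes, size=(z * I, z * J))   # step 4: random labels
    half = W // 2
    for _ in range(T):                                    # step 5: iterative update
        changed = False
        for r in range(z * I):
            for c in range(z * J):
                i, j = r // z, c // z
                block = X[i * z:(i + 1) * z, j * z:(j + 1) * z]
                counts = np.bincount(block.ravel(), minlength=n_classes).astype(float)
                counts[X[r, c]] -= 1                      # leave out the pixel itself
                nb = X[max(r - half, 0):r + half + 1, max(c - half, 0):c + half + 1]
                nb_counts = np.bincount(nb.ravel(), minlength=n_classes)
                nb_counts[X[r, c]] -= 1                   # exclude the window center
                n_nb = max(nb.size - 1, 1)
                energies = np.empty(n_classes)
                for k in range(n_classes):
                    f = counts.copy()
                    f[k] += 1
                    f /= z * z
                    res = Y[i, j] - E[i, j] @ f           # assumed squared-residual term
                    e_spatial = (n_nb - nb_counts[k]) / n_nb   # assumed smoothness term
                    energies[k] = res @ res + lam * e_spatial
                best = int(np.argmin(energies))
                if best != X[r, c]:
                    X[r, c] = best
                    changed = True
        if not changed:                                   # step 6: no label changed
            break
    return X

# Toy scene: 2 x 2 coarse pixels, 2 bands, 2 classes, pure pixels only.
Y = np.zeros((2, 2, 2))
Y[0, :, 0] = 1.0                                          # top row is pure class 0
Y[1, :, 1] = 1.0                                          # bottom row is pure class 1
E = np.broadcast_to(np.eye(2), (2, 2, 2, 2))              # same endmembers everywhere
X = icm_srm(Y, E, z=2, n_classes=2, lam=0.05, W=3, T=10)
```

In this toy scene the spectral term dominates the small spatial weight, so the fine pixels of the top coarse row settle on class 0 and those of the bottom row on class 1. For OESRM, `E` would hold a different endmember matrix for each coarse pixel, as produced by step 3.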

III. DATA AND METHODS
The potential of the OESRM approach was evaluated in experiments based upon two remotely sensed data sets.
For purposes of comparison, one other image based SRM method and two fraction based SRM methods were also applied to the same image: SRM_LM, an image based SRM using a fixed endmember set [11]; MESMA_PS, the pixel-swapping algorithm [4] that uses the fraction images estimated by multiple endmember spectral mixture analysis as the input; and SMA_PS, the pixel-swapping algorithm that uses the fraction images estimated by spectral mixture analysis with a fixed endmember set as the input. For all four SRM methods (SMA_PS, MESMA_PS, SRM_LM, and OESRM), the scale factor was set to $z = 5$, the neighborhood window size was set to $W = 5$, and the balancing parameter of the spatial term was estimated by trial and error. The resultant fine resolution land cover maps therefore have a spatial resolution of 2 m, which is the same as that of the reference land cover map produced from the Google Earth image.
For all four methods, endmember selection is vital, and an endmember spectral library must be constructed. Various methods have been proposed to select the candidate endmembers, such as manual selection [19], selection from spectral libraries [15,20], and automatic extraction methods such as the Pixel Purity Index (PPI) and N-FINDR [21]. Here, for simplicity, candidate endmembers were selected manually, directly from the image. The selected candidate endmembers for the four land cover classes are shown as green lines in Fig.2. By directly selecting the endmembers from the image, we can ensure that the different endmembers are evenly distributed over different locations of the image and therefore reduce the effect of spatial heterogeneity of the spectrum. Meanwhile, for land cover classes with strong spectral variability, more candidate endmembers need to be selected. It is evident that there is a considerable spectral difference between candidate endmembers for some land cover classes, especially for bare land and urban, as shown in Fig.2(c)-(d).
For SMA_PS and SRM_LM, a fixed endmember set was adopted in the entire image. In this letter, the average of all candidate endmembers was considered as the fixed endmember for each land cover class, shown as the red lines in Fig.2. For MESMA_PS and OESRM, the optimal endmember combination was selected for each coarse spatial resolution pixel according to the SSI principle.
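The fixed endmember set described above amounts to a per-class average over the candidate library. The sketch below shows this averaging; the function name `fixed_endmember_set` and the spectra are hypothetical and chosen only to illustrate how averaging can hide within-class variability.

```python
import numpy as np

def fixed_endmember_set(library):
    """Average the candidates of each class; returns a (B, C) matrix.

    library : list of (M_c, B) arrays, one array of candidates per class
    """
    return np.stack([np.asarray(c).mean(axis=0) for c in library], axis=1)

# Toy library with B = 3 bands (values are illustrative, not from the letter).
library = [
    np.array([[0.10, 0.20, 0.30], [0.30, 0.20, 0.10]]),   # class 0: two dissimilar candidates
    np.array([[0.20, 0.20, 0.20]]),                       # class 1: a single candidate
]
e_fixed = fixed_endmember_set(library)
```

In this contrived case the mean of the two dissimilar class 0 candidates coincides with the class 1 spectrum, so the two classes become indistinguishable once averaged, even though each class 0 candidate is clearly different from class 1.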
2) Landsat-8 image: A Landsat-8 multispectral image acquired over Caidian District, Wuhan, Hubei Province, China, was used to further evaluate the performance of OESRM. The input image is 80×80 pixels in size and comprises bands 1 to 7 at 30-m spatial resolution. As in the Sentinel-2A experiment, the scale factor was set to 5, and the land cover in the map was divided into four classes: water, vegetation, bare land, and urban. A Google Earth image was digitized as a reference image with a spatial resolution of 6 m, as shown in Fig.3(b). In this experiment, the same comparison methods were used to evaluate the model.

IV. RESULTS AND DISCUSSION
The land cover maps generated by the four different methods are displayed in Fig.1(c)-(f). Comparing these maps with the reference map, it was evident that the map produced by the OESRM method included more spatial detail and was visually closer to the reference map than the maps from the SMA_PS, MESMA_PS and SRM_LM.
The land cover maps produced by SMA_PS and MESMA_PS (Fig.1(c)-(d)) were fuzzy, with a lot of noise. While the map from SRM_LM was smoother than those from SMA_PS and MESMA_PS, spatial details were not well represented in a few regions. For example, in the area indicated by the black ellipse in Fig.1(b), the linear objects of the urban class were mapped but were fuzzy in the results of SMA_PS and MESMA_PS, and these linear objects were not mapped at all in the result of SRM_LM. In contrast, the land cover map from OESRM had smoother boundaries with less noise than those of the other three methods. In the area indicated by the black ellipse, linear objects were mapped more accurately than in the maps from SMA_PS, MESMA_PS, and SRM_LM.
The confusion matrices in Table I show that, in the maps obtained from SRM_LM and SMA_PS, the bare land and urban classes were extensively confused, whereas in the maps from OESRM and MESMA_PS, there was a higher degree of separation between the two classes. The reason is that the spectral characteristics of urban areas are complex, and the endmember variability of the urban class is higher than that of the other classes (Fig.2). Simply averaging the candidate endmembers in SRM_LM and SMA_PS discards useful endmember information for mapping the land cover classes, especially those with high spectral variability in endmembers. Moreover, some endmember spectral curves for the urban class are similar to those of bare land, and the average endmember spectral curve of urban is similar to that of bare land. As a result, both SRM_LM and SMA_PS have large commission and omission errors for the bare land and urban classes. By contrast, MESMA_PS and OESRM can find the optimal endmembers for each coarse resolution pixel by taking account of the variability of endmembers. Therefore, the land cover maps produced by OESRM and MESMA_PS have much lower misclassification errors for these two classes. However, although the optimal endmembers are adopted in both MESMA_PS and OESRM, the result of OESRM is much more accurate than that of MESMA_PS. Similarly, despite using the same fixed set of endmembers, SRM_LM has a higher overall accuracy than SMA_PS. This shows that, when using the same set of endmembers, image based SRM can avoid the effect of the potential errors of fraction images produced by spectral unmixing and can therefore produce a more accurate result than fraction based SRM. In general, OESRM increased the overall accuracy compared with SMA_PS, MESMA_PS, and SRM_LM, showing the advantage of the proposed method.
The maps generated by the four methods applied to the Landsat-8 imagery are shown in Fig.3(c)-(f). Visual comparison of the results shows that the proposed OESRM model is superior to the other three methods. The SMA_PS and MESMA_PS maps contain many unsmoothed boundaries with a lot of noise. The SRM_LM map has less noise than those of SMA_PS and MESMA_PS, but many spatial details were lost, and many narrow linear objects were not distinguished, as shown in the area indicated by the black circle in Fig.3(d). In contrast, there was more spatial detail in the map obtained from OESRM, and the boundaries of linear objects were more continuous. The map from the proposed OESRM method had the highest overall accuracy, 82.24%, compared with 78.46% for SRM_LM, 70.64% for MESMA_PS, and 69.74% for SMA_PS.
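For reference, the accuracy measures used in this section (overall accuracy, commission error, and omission error) can be derived from a confusion matrix as sketched below. The matrix orientation (rows for mapped classes, columns for reference classes), the function name `accuracy_metrics`, and the counts are assumptions made for illustration; the counts are not taken from Table I.

```python
import numpy as np

def accuracy_metrics(cm):
    """cm[i, j]: number of pixels mapped as class i whose reference class is j."""
    cm = np.asarray(cm, dtype=float)
    overall = np.trace(cm) / cm.sum()                 # correctly mapped share of all pixels
    commission = 1 - np.diag(cm) / cm.sum(axis=1)     # per mapped class (rows)
    omission = 1 - np.diag(cm) / cm.sum(axis=0)       # per reference class (columns)
    return overall, commission, omission

# Illustrative two-class matrix with 100 validation pixels.
cm = [[40, 10],
      [5, 45]]
overall, commission, omission = accuracy_metrics(cm)
```

For this toy matrix, 85 of the 100 pixels lie on the diagonal, so the overall accuracy is 0.85, while the commission errors of the two classes are 0.2 and 0.1.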

V. CONCLUSIONS
In this letter, an optimal endmember based SRM model was proposed, in order to reduce the impact of endmember uncertainty on the accuracy of SRM. The proposed OESRM model takes the spectral similarity index as the criterion to select the optimal endmember combination for each coarse pixel. Therefore, OESRM can use the spectral information more effectively, and generate the fine resolution land cover map with a higher accuracy. Experiments on both the Sentinel-2A and Landsat-8 images showed that the proposed OESRM model generated fine resolution land cover maps that were closer to the reference, compared with SRM_LM, MESMA_PS and SMA_PS, showing that the proposed OESRM model can effectively reduce the effect of endmember variability and the errors of fraction images produced by spectral unmixing on SRM.