Improving super-resolution mapping through combining multiple super-resolution land-cover maps

ABSTRACT Super-resolution mapping (SRM) is an ill-posed problem, and different SRM algorithms may generate non-identical fine-spatial resolution land-cover maps (sub-pixel maps) from the same input coarse-spatial resolution image. The output sub-pixel maps may each have differing strengths and weaknesses. A multiple SRM (M-SRM) method that combines the sub-pixel maps obtained from a set of SRM analyses, generated by a single algorithm or by multiple algorithms, is proposed in this study. Plurality voting, which selects the class with the most votes, is used to label each sub-pixel. In this study, three popular SRM algorithms, namely, the pixel-swapping algorithm (PSA), the Hopfield neural network (HNN) algorithm, and the Markov random field (MRF)-based algorithm, were used. The proposed M-SRM algorithm was validated using two data sets: a simulated multispectral image and an Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) hyperspectral image. Results show that the highest overall accuracies were obtained by M-SRM in all experiments. For example, in the AVIRIS image experiment, the highest overall accuracies of PSA, HNN, and MRF were 88.89, 93.81, and 82.70%, respectively, and these increased to 95.06, 95.37, and 85.56%, respectively, for M-SRM obtained from the multiple PSA, HNN, and MRF analyses.


Introduction
Super-resolution mapping (SRM) is a process used to predict the spatial distribution of land-cover classes in image pixels at a finer spatial resolution than that of the input data. As such, SRM has an important role to play in reducing the mixed-pixel problem that is commonly encountered in mapping land cover from remotely sensed data. A variety of SRM methods are available and often employ constraints to guide the analysis to an appropriate solution (Foody, Muslim, and Atkinson 2005; Ge 2013; Ge et al. 2014; Hu et al. 2015; Ling et al. 2010, 2013; Wang, Wang, and Liu 2012). For example, an analysis may be constrained to ensure that the land-cover class areal proportions for a coarse-resolution pixel, estimated by a soft classification, are maintained within the geographical area it represents and/or that prior information on the spatial pattern of the land cover is used to generate the sub-pixel land-cover map. However, the solution space of SRM is large, and multiple plausible solutions may satisfy the constraints. Previous studies have shown that a varied set of land-cover representations may arise from the same coarse-spatial resolution image through the use of different SRM methods (Makido, Messina, and Shortridge 2008). Typically, the identification of an optimal SRM method in advance is a difficult, if not impossible, challenge. The multiple classifier system is a powerful solution to difficult pattern recognition problems involving large class sets (Ho, Hull, and Srihari 1994), and this system has shown considerable potential to increase the accuracy of classification of remotely sensed imagery (Benediktsson and Sveinsson 2003; Briem, Benediktsson, and Sveinsson 2002; Bruzzone, Cossu, and Vernazza 2004; Kavzoglu and Colkesen 2013). Since each classifier usually generates a unique land-cover map that satisfies the classifier's objective function, a set of different maps may be generated from a suite of classifiers. 
The multiple classifier system combines the set of maps, aiming to produce a final map that is of superior quality to the individual maps it is made from. Although the multiple classifier system has been extensively investigated for the classification of remotely sensed imagery, it has been mostly used to combine multiple land-cover maps generated with conventional (hard) image classification at the pixel scale. As the latter type of analysis may be degraded by the mixed-pixel problem, the multiple classifier approach may also be used to combine multiple soft classifications. Although soft classification can predict sub-pixel scale class areal proportion information, it does not indicate the geographical location of the classes within the area of each coarse-resolution pixel. A simple enhancement would be to generate a set of sub-pixel maps from the soft classifications via a series of SRMs and combine them. Little research has, however, focused on ensembles of multiple SRM algorithms. Many studies show that no single SRM algorithm can be expected to perform perfectly, and each SRM output has its own strengths and weaknesses (Atkinson 2009; Ling et al. 2014). The combination of multiple SRM outputs could exploit the different information in each while addressing the drawbacks of the individual methods, and this combination is expected to produce a more accurate sub-pixel map than that produced by an individual SRM algorithm.
The use of different SRM algorithms, or a single algorithm with, for example, dissimilar parameter settings, allows the generation of non-identical sub-pixel maps from the same data (Makido, Messina, and Shortridge 2008). In this study, M-SRM approaches that combine multiple maps from a single SRM algorithm and from multiple SRM algorithms are explored. Three popular SRM algorithms, namely, the pixel-swapping algorithm (PSA) (Atkinson 2005), the Hopfield neural network (HNN) algorithm (Su et al. 2012a; Tatem et al. 2001), and the Markov random field (MRF)-based algorithm (Kasetkasem, Arora, and Varshney 2005), were used. The combination of multiple sub-pixel maps obtained from a set of SRM analyses was accomplished with a voting-based approach. The proposed M-SRM was validated using two data sets: a simulated multispectral image and an Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) hyperspectral image. Moreover, analyses using different parameter settings for each algorithm were undertaken, allowing the combination process to be based upon outputs from a single algorithm or multiple algorithms.

SRM algorithms introduction
Three popular SRM algorithms, the PSA, HNN and MRF, were adopted. In these methods, the coarse-resolution pixel is broken down to sub-pixels (fine-resolution pixels) initially, and the different algorithms have dissimilar strategies to label the sub-pixels. This section outlines the salient features of each.
The PSA is applied to a soft classification output. It is designed to convert the class areal proportions predicted by a soft classification into a set of (hard) sub-pixel land-cover class allocations. This is achieved by swapping sub-pixel class labels in a way that maximizes the spatial autocorrelation between neighbouring sub-pixels, under the constraint that the original class areal proportions for the area represented by each coarse-resolution pixel are maintained (Atkinson 2005). If swapping a pair of sub-pixels in a coarse-resolution pixel would increase the spatial autocorrelation of the output map, the sub-pixels are swapped. Otherwise, no swap is made. The PSA iterates until either no further swap is made or a predefined number of iterations is reached. This approach is reasonable when the land cover exists as a mosaic of patches that are larger than the size of the coarse-resolution pixel (Atkinson 2009). Because sub-pixels are only swapped within a coarse pixel, its class areal proportions are unchanged by each swap.
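The swapping step described above can be illustrated with a minimal two-class sketch. This is not the implementation used in the study: the function names, the exponential distance weighting with parameter `decay`, zero padding at the image border, and the limit of one candidate swap per coarse pixel per iteration are all assumptions made for illustration.

```python
import numpy as np

def attractiveness(binary_map, L=1, decay=1.0):
    """Distance-weighted count of class-1 neighbours within L sub-pixels.

    Assumption: weights fall off as exp(-distance / decay) and neighbours
    outside the image count as class 0 (zero padding).
    """
    H, W = binary_map.shape
    padded = np.pad(binary_map.astype(float), L)
    A = np.zeros((H, W))
    for dy in range(-L, L + 1):
        for dx in range(-L, L + 1):
            if dy == 0 and dx == 0:
                continue  # a sub-pixel is not its own neighbour
            w = np.exp(-np.hypot(dy, dx) / decay)
            A += w * padded[L + dy:L + dy + H, L + dx:L + dx + W]
    return A

def psa_iteration(binary_map, s, L=1, decay=1.0):
    """One pass: at most one swap inside each s x s coarse pixel."""
    A = attractiveness(binary_map, L, decay)
    out = binary_map.copy()
    swaps = 0
    H, W = out.shape
    for i in range(0, H, s):
        for j in range(0, W, s):
            blk = out[i:i + s, j:j + s]      # view into `out`
            a_blk = A[i:i + s, j:j + s]
            ones, zeros = blk == 1, blk == 0
            if not ones.any() or not zeros.any():
                continue                      # pure coarse pixel, nothing to swap
            # Least attractive sub-pixel currently labelled 1
            p1 = np.unravel_index(np.where(ones, a_blk, np.inf).argmin(), blk.shape)
            # Most attractive sub-pixel currently labelled 0
            p0 = np.unravel_index(np.where(zeros, a_blk, -np.inf).argmax(), blk.shape)
            if a_blk[p1] < a_blk[p0]:
                blk[p1], blk[p0] = 0, 1       # swap keeps the class counts fixed
                swaps += 1
    return out, swaps
```

Repeating `psa_iteration` until `swaps` is zero or an iteration limit is reached mirrors the stopping rule in the text, and the class counts inside every coarse pixel stay constant throughout.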
The HNN is also applied to a soft classification output. The HNN is a recurrent neural network and is formulated as an energy minimization tool to predict the sub-pixel land-cover distribution within the geographical area of each coarse-resolution pixel (Tatem et al. 2001). By utilizing information contained in surrounding pixels, the land cover within each pixel may be mapped using a simple spatial clustering function coded into the HNN. In the HNN-based SRM, sub-pixels are allocated (hard) land-cover class labels in a manner that reflects directly the class areal proportions predicted by a soft classification. The relative weights of a set of goal functions control the nature of the final output. The HNN class areal proportions constraint aims to retain the class areal proportion information output from the soft classification that informs the SRM. However, these proportions are not necessarily maintained exactly in the sub-pixel map; how closely they are maintained depends on the weight given to the class areal proportions constraint in the HNN goal function.
The MRF-based SRM is applied directly to remotely sensed imagery and is thus different from the PSA and HNN. The MRF-based SRM goal function is not directly related to the class areal proportions, but is modelled by analysing the image spectral information and land-cover spatial information (Kasetkasem, Arora, and Varshney 2005). The MRF-based SRM goal function includes an image spectral constraint and a land-cover spatial constraint. The spectral constraint is the assumption that the coarse pixel has a spectral response that is generated from the combined spectra of the classes contained in the sub-pixel map. The spectral constraint aims to refine the sub-pixel labels in order that the degraded and observed coarse-resolution pixel spectra are similar (Tolpekin and Stein 2009). In the spatial constraint, it is assumed that a sub-pixel map has MRF properties, and the land-cover classes occupying neighbouring sub-pixels are more likely to come from the same class than from different classes. The MRF-based SRM land-cover spatial constraint is similar to that adopted in the PSA and HNN, which maximizes the spatial autocorrelation between neighbouring sub-pixels in the resultant sub-pixel map.

SRM map initialization
Both PSA and HNN use class areal proportions generated from the soft classification as input and aim to maintain the class areal proportions in the resultant sub-pixel map, whereas the MRF is applied directly to the original remotely sensed image. The final sub-pixel land-cover map is generated by iteratively refining an initial fine-resolution map that is provided, along with the class areal proportions or the original remotely sensed image, to the PSA, HNN, and MRF algorithms. The initial value at each sub-pixel location will have an effect on SRM performance, and different initialization maps may result in dissimilar SRM outputs (Makido, Messina, and Shortridge 2008).
The PSA initialization map is generated based on the soft classification output. This map is a sub-pixel land-cover map, and each sub-pixel is given an initial class value of $c$ ($c \in \{1, \ldots, C\}$, where $C$ is the number of land-cover classes). The PSA initialization map is produced by randomly assigning sub-pixel class labels in a manner that maintains the class proportion information conveyed by the prior soft classification (Atkinson 2005). The MRF initialization map is also a land-cover map, and can be generated based on the soft classification output or without using the soft classification output by assigning each sub-pixel label randomly within the range 1 to $C$. An initial sub-pixel map based on the soft classification output is an appropriate starting point that results in faster convergence of the MRF algorithm (Kasetkasem, Arora, and Varshney 2005). The HNN initialization map is not a hard-classified land-cover map but a set of soft-classified class areal proportion images, and is generated without using the soft classification output (Tatem et al. 2001). The $C$ class proportion images are represented by $C$ interconnected layers, and the neurons within these layers are referred to by coordinate notation at the sub-pixel scale. An iterative analysis is then undertaken in which the neurons ultimately indicate the class label for each sub-pixel given the goal constraints applied.
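A proportion-preserving random initialization of the kind described for PSA can be sketched as follows. The function name, the 0-based class labels, and the largest-remainder rounding used to turn fractions into whole sub-pixel counts are assumptions for illustration; the source does not specify how fractional counts are rounded.

```python
import numpy as np

def init_subpixel_map(proportions, s, rng=None):
    """Randomly place labels so each coarse pixel keeps its class proportions.

    proportions: (C, I, J) array of class fractions, assumed to sum to 1
                 over the class axis for every coarse pixel.
    s:           scale factor (each coarse pixel holds s * s sub-pixels).
    Returns an (I * s, J * s) integer map with labels 0 .. C-1.
    """
    rng = np.random.default_rng(rng)
    C, I, J = proportions.shape
    out = np.empty((I * s, J * s), dtype=int)
    for i in range(I):
        for j in range(J):
            raw = proportions[:, i, j] * s * s
            counts = np.floor(raw).astype(int)
            # Hand out leftover sub-pixels to the largest fractional parts
            leftover = s * s - counts.sum()
            order = np.argsort(-(raw - counts), kind="stable")
            counts[order[:leftover]] += 1
            labels = np.repeat(np.arange(C), counts)
            rng.shuffle(labels)               # random positions inside the coarse pixel
            out[i * s:(i + 1) * s, j * s:(j + 1) * s] = labels.reshape(s, s)
    return out
```

Calling the function with different seeds yields different initialization maps that share the same class counts per coarse pixel, which is the setting in which repeated SRM runs produce dissimilar outputs.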

M-SRM
In M-SRM, the combination of multiple sub-pixel maps obtained from a set of SRM analyses is accomplished via voting. Voting is a simple rule for combining the outputs of multiple estimators by treating the output of each estimator as a vote. There are many voting strategies that can be implemented, such as plurality voting, weighted voting, and soft voting (Latif-Shabgahi, Bass, and Bennett 2004; Parhami 1994). Plurality voting (Lin et al. 2003) is a combination strategy that selects the candidate with the most votes, assuming that the choice with the most votes should be the optimal choice. It rests on the premise that the decision of a group is superior to that of a single individual. Plurality voting is one of the most extensively used combination strategies and can achieve an enhanced trade-off between identification and rejection rates. Plurality voting was used here to select the class label for each sub-pixel from the multiple SRM outputs available.
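As a minimal sketch of plurality voting over a stack of sub-pixel maps, the function below counts, for every sub-pixel, how many maps assign each class and keeps the most frequent one. The function name, the 0-based labels, and the tie-breaking rule (lowest class index wins) are assumptions; the source does not state how ties are resolved.

```python
import numpy as np

def plurality_vote(label_maps, n_classes):
    """Combine sub-pixel maps by per-sub-pixel plurality voting.

    label_maps: (M, H, W) integer array holding M sub-pixel maps
                with labels 0 .. n_classes-1.
    Returns an (H, W) map of the most frequently assigned label.
    Ties go to the lowest class index (an assumption of this sketch).
    """
    label_maps = np.asarray(label_maps)
    # votes[c, y, x] = number of maps that label sub-pixel (y, x) as class c
    votes = np.stack([(label_maps == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)
```

The stack may hold repeated runs of one algorithm, runs of several algorithms, or both, since every map simply contributes one vote per sub-pixel.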
The voting procedure can be illustrated for analysis of a coarse-spatial resolution remotely sensed image that contains $I \times J$ pixels. The SRM output generated from this image is a fine-resolution land-cover map (sub-pixel map) with $(I \times s) \times (J \times s)$ pixels, where $s$ is the scale factor and each coarse-resolution pixel contains $s^2$ sub-pixels. Each sub-pixel $\alpha_{h,i,j}$ is labelled by one of the $C$ classes, with $c(\alpha_{h,i,j})$ denoting the class label of sub-pixel $\alpha_{h,i,j}$. Let $V_k\big(c(\alpha_{h,i,j}) = c\big)$ be the vote of class $c$ for sub-pixel $\alpha_{h,i,j}$ from the $k$th ($k = 1, \ldots, K$) SRM algorithm. The predicted class for sub-pixel $\alpha_{h,i,j}$ in M-SRM obtained with plurality voting is derived by maximizing the following function:

$$\hat{c}(\alpha_{h,i,j}) = \arg\max_{c \in \{1, \ldots, C\}} \sum_{k=1}^{K} V_k\big(c(\alpha_{h,i,j}) = c\big)$$

The use of different initialization maps can result in dissimilar SRM outputs. In order to explore the influence of different initialization maps on M-SRM, each SRM algorithm for M-SRM combination is run a predefined number of times with different initialization maps. Each SRM is run $N$ times, and the vote $V_k\big(c(\alpha_{h,i,j}) = c\big)$ is related to the $N$ sub-pixel maps from the $k$th SRM algorithm. The label of sub-pixel $\alpha_{h,i,j}$ in M-SRM depends on the classes depicted in the multiple maps for that sub-pixel, and thus the vote is determined according to the number of times that the sub-pixel $\alpha_{h,i,j}$ is labelled as class $c$ in the $N$ runs of the $k$th SRM ($k = 1, \ldots, K$) as:

$$V_k\big(c(\alpha_{h,i,j}) = c\big) = \sum_{n=1}^{N} \delta\big(c_{k,n}(\alpha_{h,i,j}), c\big)$$

where $c_{k,n}(\alpha_{h,i,j})$ is the label of $\alpha_{h,i,j}$ from the $n$th ($n = 1, \ldots, N$) result derived from the $k$th SRM algorithm; $\delta\big(c_{k,n}(\alpha_{h,i,j}), c\big)$ is the Kronecker delta function that equals 1 if $c_{k,n}(\alpha_{h,i,j}) = c$, and 0 otherwise. This ensemble approach of M-SRM, called the pixel-based M-SRM, processes the labels of each sub-pixel from the multiple SRM outputs without considering the autocorrelation between spatially adjacent sub-pixels. Spatial context captures spatial information relative to local features in an image, and has been used in the improvement of image classification accuracy (Tarabalka et al. 2010). 
Spatial context can be described in terms of relations of neighbouring objects. It creates connections among pixels, and can be used to investigate the spatial autocorrelation between spatially close pixels. The basis is that sub-pixels that are close together are more likely to be similar in labelling than those that are far apart. With the context information, the problem of speckling (i.e. individual pixels differing in class label from their surrounding pixels) is reduced in image classification. In the aforementioned pixel-based M-SRM, the ensemble of different SRM outputs depends only on the labels of a sub-pixel in the multiple outputs but ignores the spatial context information for that sub-pixel, and the speckling problem may affect M-SRM accuracy. A context-based M-SRM that incorporates the spatial context information among neighbouring sub-pixels in the available SRM outputs is proposed, and is expected to minimize the speckling problem. Context-based M-SRM, in which the labelling of each sub-pixel is related to the labels of neighbouring sub-pixels, is designed as follows.
Define $\eta(\alpha_{h,i,j})$ as the sub-pixel neighbourhood that includes all sub-pixels inside a square window of $W \times W$ sub-pixels centred on $\alpha_{h,i,j}$. The neighbourhood window size, $W$, is the length of the square side, and can be set to 1, 3, 5, 7, or any other odd integer.
Assume $\alpha_l$ is a neighbourhood sub-pixel of $\alpha_{h,i,j}$ in $\eta(\alpha_{h,i,j})$. The context-based M-SRM integrates local spatial autocorrelation between neighbourhood sub-pixels, with the magnitude of the autocorrelation inversely related to the distance between the sub-pixels under consideration. The effect of $\alpha_l$ on the labelling of $\alpha_{h,i,j}$ in the $W \times W$ window therefore varies with the distance between $\alpha_l$ and $\alpha_{h,i,j}$. The effect of sub-pixel $\alpha_l$ on $\alpha_{h,i,j}$ in the $W \times W$ neighbourhood window is defined as the weight, $w(\alpha_{h,i,j} \mid \alpha_l)$. Many distance-dependent weighting functions, including the Gaussian model, the inverse distance weighting function, and the exponential decay function, can be employed to measure the variation of $w(\alpha_{h,i,j} \mid \alpha_l)$ with the distance between $\alpha_l$ and $\alpha_{h,i,j}$. The Gaussian model is adopted in this paper:

$$w(\alpha_{h,i,j} \mid \alpha_l) = \exp\left(-\frac{d(\alpha_{h,i,j}, \alpha_l)^2}{r^2}\right)$$

where $d(\alpha_{h,i,j}, \alpha_l)$ is the Euclidean distance between $\alpha_l$ and $\alpha_{h,i,j}$, and $r$ is the range value that controls the relative magnitude of $w(\alpha_{h,i,j} \mid \alpha_l)$ with the distance $d(\alpha_{h,i,j}, \alpha_l)$. The variation in the magnitude of the weight $w(\alpha_{h,i,j} \mid \alpha_l)$ with $d(\alpha_{h,i,j}, \alpha_l)$ for different range values $r$ is shown in Figure 1. The weight decreases very slowly with distance when $r = 10$, and the spatial autocorrelations between distant sub-pixels in the $W \times W$ window are high ($w(\alpha_{h,i,j} \mid \alpha_l)$ approximates to 0.8 when $d(\alpha_{h,i,j}, \alpha_l) = 5$). In contrast, the weight decreases most sharply with distance when $r = 1$, and the spatial autocorrelations between distant sub-pixels in the $W \times W$ window are low ($w(\alpha_{h,i,j} \mid \alpha_l)$ approximates to 0 when $d(\alpha_{h,i,j}, \alpha_l) = 5$). Note that when $W = 1$, the window is of size $1 \times 1$ and hence the label of a sub-pixel depends only on the classes depicted in the multiple maps for that sub-pixel, and no contextual information is used.
The context-based M-SRM reduces to pixel-based M-SRM in this case.
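Given the $N$ sub-pixel maps from the $k$th SRM algorithm and according to the neighbourhood system $\eta(\alpha_{h,i,j})$ and the weight $w(\alpha_{h,i,j} \mid \alpha_l)$, the vote in context-based M-SRM is determined as:

$$V_k\big(c(\alpha_{h,i,j}) = c\big) = \sum_{n=1}^{N} \sum_{\alpha_l \in \eta(\alpha_{h,i,j})} w(\alpha_{h,i,j} \mid \alpha_l)\, \delta\big(c_{k,n}(\alpha_l), c\big)$$

where $c_{k,n}(\alpha_l)$ is the label of $\alpha_l$ from the $n$th ($n = 1, \ldots, N$) result derived from the $k$th SRM algorithm.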
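The context-based vote can be sketched as a weighted count over the $W \times W$ window. The sketch below assumes the Gaussian weight takes the form $\exp(-d^2/r^2)$, which reproduces the values quoted in the text (about 0.8 at $d = 5$ for $r = 10$, about 0 for $r = 1$) but should be treated as an assumption. The function names, the 0-based labels, and the handling of image borders (neighbours outside the map cast no vote) are likewise illustrative choices.

```python
import numpy as np

def gaussian_weights(W, r):
    """W x W kernel of weights exp(-d^2 / r^2), d = Euclidean distance to the centre."""
    half = W // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    return np.exp(-(xx ** 2 + yy ** 2) / float(r) ** 2)

def context_vote(label_maps, n_classes, W=3, r=1.0):
    """Context-based plurality voting over a stack of sub-pixel maps.

    label_maps: (M, H, Wd) integer array with labels 0 .. n_classes-1.
    Each neighbour's label counts towards the centre sub-pixel with a
    distance-dependent weight; W = 1 reduces to pixel-based voting.
    """
    label_maps = np.asarray(label_maps)
    _, H, Wd = label_maps.shape
    kernel = gaussian_weights(W, r)
    half = W // 2
    votes = np.zeros((n_classes, H, Wd))
    for c in range(n_classes):
        # count[y, x] = number of maps labelling sub-pixel (y, x) as class c
        count = (label_maps == c).sum(axis=0).astype(float)
        padded = np.pad(count, half)          # zeros: outside neighbours cast no vote
        for dy in range(W):
            for dx in range(W):
                votes[c] += kernel[dy, dx] * padded[dy:dy + H, dx:dx + Wd]
    return votes.argmax(axis=0)
```

An isolated sub-pixel that every map labels differently from its surroundings is outvoted once the weighted neighbours outweigh its own unit weight, which is how the contextual vote suppresses speckle.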

Accuracy assessment
The accuracy of each land-cover map obtained from the SRM analyses was assessed relative to a reference land-cover map of the same geographical area and the same resolution as the SRM output, and was expressed as the percentage of cases correctly allocated (i.e. overall accuracy); details of the reference maps are provided below for each experiment. The accuracy of the class areal proportion images unmixed from soft classification was also assessed. The reference class areal proportion images were first calculated based on the reference land-cover map, and the class areal proportion for each class in each coarse-resolution pixel was calculated by dividing the number of sub-pixels of that class in the coarse pixel by the square of the scale factor ($s^2$). Then the unmixed and reference class areal proportion images were compared using the root mean square error (RMSE) value (Jin, Wang, and Zhang 2010):

$$\mathrm{RMSE} = \sqrt{\frac{1}{C\,I\,J} \sum_{c=1}^{C} \sum_{i=1}^{I} \sum_{j=1}^{J} \left(\theta_{c,i,j} - \omega_{c,i,j}\right)^2}$$

where $\theta_{c,i,j}$ and $\omega_{c,i,j}$ are the class areal proportions of class $c$ in the coarse-resolution pixel $(i, j)$ in the reference land-cover map and unmixed class areal proportion image, respectively.
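The two accuracy measures can be sketched as below. The function names and 0-based labels are hypothetical, and the RMSE is averaged over all classes and coarse pixels, which is an assumption about the normalization.

```python
import numpy as np

def overall_accuracy(pred_map, ref_map):
    """Percentage of sub-pixels whose label matches the reference map."""
    return 100.0 * float(np.mean(np.asarray(pred_map) == np.asarray(ref_map)))

def reference_proportions(ref_map, s, n_classes):
    """Class areal proportions per coarse pixel from a fine reference map.

    ref_map: (I * s, J * s) integer array with labels 0 .. n_classes-1.
    Returns a (C, I, J) array: sub-pixel count of each class divided by s^2.
    """
    ref_map = np.asarray(ref_map)
    I, J = ref_map.shape[0] // s, ref_map.shape[1] // s
    blocks = ref_map[:I * s, :J * s].reshape(I, s, J, s)
    return np.stack([(blocks == c).sum(axis=(1, 3)) / s ** 2
                     for c in range(n_classes)])

def proportion_rmse(theta, omega):
    """RMSE between reference (theta) and unmixed (omega) proportion images."""
    diff = np.asarray(theta, dtype=float) - np.asarray(omega, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Both proportion arrays passed to `proportion_rmse` are expected to share the same `(C, I, J)` layout.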

Experiments and results
Experiments using a simulated multispectral image and an AVIRIS hyperspectral image were conducted to assess the proposed M-SRM method. The PSA and MRF performances are related to the number of neighbourhood sub-pixels ($L$). When $L = 1$, the analysis is based upon the eight immediate sub-pixel neighbours that lie within a 3 × 3 window centred on the sub-pixel of interest. In PSA and MRF, the scale factor $s$ and the number of neighbouring sub-pixels $L$ are two correlated parameters, and different combinations of $s$ and $L$ will yield different SRM results (Atkinson 2005; Su et al. 2012b; Tolpekin and Stein 2009). The number of neighbouring sub-pixels, $L$, should not be too large and should be no more than the scale factor, $s$ (Atkinson 2005), and was set to $L = s - 1$ in both PSA (Su et al. 2012b) and MRF (Tolpekin and Stein 2009). Both pixel- and context-based M-SRM were assessed. The set of M-SRM analyses undertaken is summarized in Table 1. The SRM repetition number $N$ was set to 10, and each single SRM algorithm was performed 10 times using different initialization maps. The initialization maps are sub-pixel land-cover maps for PSA and MRF, and sub-pixel soft-classified class areal proportion images for HNN. In order to fairly compare the accuracy of the single PSA and MRF algorithms, the same set of 10 different sub-pixel initialization maps was input to PSA and MRF. For context-based M-SRM, the sub-pixel neighbourhood window size $W$ was set to 3, 5, 7, and 9, and the range value $r$ was set to 1, 2, 3, and 10.

Simulated multispectral image experiment
Overview
A simulated multispectral image was used to control for possible sources of endmember extraction error. A real fine-resolution image was used as the starting point. Visual classification of this image yielded a ground reference map for the test site. A five-waveband multispectral image of the site was then generated using a set of spectral endmembers generated to fit with the classes depicted in the reference map. The derived multispectral imagery was then degraded with a 5 × 5 pixel mean filter. A soft classification of the latter coarse-spatial resolution imagery was obtained using a linear mixture model (Hu and Weng 2011; Settle and Drake 1993). The class areal proportion images generated from soft classification were used as the class proportion constraints in the PSA and HNN, and to generate the initial sub-pixel land-cover maps for PSA and MRF. The coarse-resolution image was also used in the MRF spectral constraint.
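A soft classification with a linear mixture model can be sketched as a least-squares problem. The version below is one common simplification and not necessarily the solver used in the study: the sum-to-one condition is enforced softly through a heavily weighted extra equation, and negative fractions are clipped and renormalized afterwards. The function name and the weight `delta` are hypothetical.

```python
import numpy as np

def unmix_linear(pixels, endmembers, delta=1e3):
    """Estimate class fractions under a linear mixture model.

    pixels:     (N, B) array of coarse-pixel spectra.
    endmembers: (C, B) array, one spectrum per class.
    delta:      weight of the appended sum-to-one equation (assumed value).
    Returns an (N, C) array of non-negative fractions summing to 1.
    Assumes at least one fraction per pixel stays positive after clipping.
    """
    pixels = np.asarray(pixels, dtype=float)
    endmembers = np.asarray(endmembers, dtype=float)
    C = endmembers.shape[0]
    # Model: pixel = endmembers.T @ f, plus the row delta * sum(f) = delta
    A = np.vstack([endmembers.T, delta * np.ones((1, C))])
    b = np.hstack([pixels, delta * np.ones((pixels.shape[0], 1))]).T
    f, *_ = np.linalg.lstsq(A, b, rcond=None)
    f = np.clip(f.T, 0.0, None)               # drop negative fractions
    return f / f.sum(axis=1, keepdims=True)   # renormalize to sum to 1
```

Small non-zero fractions estimated for classes that are actually absent from a pixel are the kind of unmixing error that a proportion-preserving SRM algorithm carries into the sub-pixel map.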

Data
The starting point image was a subset of a QuickBird panchromatic image of Wuhan, Hubei Province, China (Figure 2, spatial resolution 0.6 m, 30°35′51″ N and 114°19′56″ E). The panchromatic image was manually interpreted to yield a reference map for an area of 120 × 120 pixels, in which four classes were identified: tree, grass, bare earth, and path. A simulated five-band multispectral image was generated using four sets of spectral endmembers, and the digital number values of the four endmembers are $[630, 425, 270, 130, 185]^T$, $[210, 380, 130, 260, 310]^T$, $[150, 590, 340, 560, 440]^T$, and $[400, 220, 520, 360, 650]^T$. The covariance matrices were defined following the approach discussed in Tolpekin and Stein (2009), where the covariance matrices for all the classes were manually set to $M \times A$. $M$ is an identity matrix of size $B \times B$ ($B$ is the number of spectral bands, and $B = 5$ in this experiment), and $A = 1200$ is a constant. The spectral response of each class was normally distributed in each waveband (Tolpekin and Stein 2009).
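The image simulation can be sketched as follows, using the endmember values and the variance constant $A = 1200$ given above. Several points are assumptions: the mapping from class index to endmember row (the source does not say which endmember belongs to which class), the function names, and the reading of the 5 × 5 mean filter as non-overlapping block averaging.

```python
import numpy as np

# Endmember digital numbers from the text; the row order is an assumed class order.
ENDMEMBERS = np.array([
    [630, 425, 270, 130, 185],
    [210, 380, 130, 260, 310],
    [150, 590, 340, 560, 440],
    [400, 220, 520, 360, 650],
], dtype=float)

def simulate_fine_image(ref_map, endmembers=ENDMEMBERS, variance=1200.0, rng=None):
    """Draw a spectrum for every fine pixel from its class distribution.

    Covariance is variance * identity, so bands are independent normals
    centred on the class endmember. ref_map holds labels 0 .. C-1.
    Returns an (H, W, B) float image.
    """
    rng = np.random.default_rng(rng)
    mean = endmembers[np.asarray(ref_map)]            # (H, W, B) class means
    return mean + rng.normal(0.0, np.sqrt(variance), size=mean.shape)

def degrade(image, s):
    """Average non-overlapping s x s blocks to obtain the coarse image."""
    H, W, B = image.shape
    I, J = H // s, W // s
    return image[:I * s, :J * s].reshape(I, s, J, s, B).mean(axis=(1, 3))
```

With a 120 × 120 reference map and `s = 5`, `degrade(simulate_fine_image(ref_map), 5)` gives a 24 × 24 coarse image with five bands.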

Results and discussion
The sub-pixel maps obtained from PSA, HNN, and MRF with the highest overall accuracies, as well as those produced by the pixel- and context-based M-SRM with highest accuracy for each analysis, are shown in Figure 2. In the map obtained from PSA, many speckle-like artefacts (examples are highlighted by red circles in Figure 2) were observed. These arose from spectral unmixing errors. Linear unmixing analysis may, for example, allocate a small fractional cover of a class that is absent to a pixel and, because of the constraints used in PSA, this fractional cover must be maintained in the SRM. However, the representation obtained was close to the reference, with an RMSE for the unmixed class areal proportions of 0.0324. Scatter plots of the reference and unmixed class areal proportions are shown in Figure 3. The scatter plots indicate that many estimated class areal proportion values are close but not identical to the reference values for different classes. Unlike PSA, which maintains the class proportional information, HNN and MRF eliminated the speckle-like artefacts due to the spatial smoothing effect based on the spatial autocorrelation model. It was also evident that parts of the path were poorly represented, with some sections disconnected in PSA, HNN, and MRF results (highlighted by the green circle in Figure 2). This is because class spatial autocorrelation, which is reasonable where the land-cover target of interest is larger than the pixel size, is adopted as the land-cover prior information in PSA, HNN, and MRF. Many parts of the linear connected path were no larger than the coarse-resolution pixel size, and were smoothed and disconnected in the results. The outputs from the M-SRM approach differed from those obtained from the single SRM analyses. The map generated from the pixel-based M-PSA contained more connected path and fewer speckle-like artefacts than the PSA map. 
The maps from pixel-based M-HNN and M-MRF showed the path to be more connected than the maps from the standard single HNN and MRF analyses. This is because a sub-pixel that is mislabelled in one output is often labelled correctly in the other maps available to M-SRM. Context-based M-SRM integrates the neighbourhood sub-pixel information and this eliminated most speckle-like artefacts. The maps from context-based M-HNN and M-MRF showed the path to be more fully connected than those from pixel-based M-HNN and M-MRF. The maps from pixel- and context-based M-PSA-HNN, M-PSA-MRF, M-HNN-MRF, and M-PSA-HNN-MRF showed few speckle-like artefacts and the path to be highly connected. This is because the maps obtained from different SRM algorithms were different (Figure 2), and the sub-pixels labelled as speckle-like artefacts or the disconnected features from one SRM algorithm were labelled correctly in the output of other SRM algorithms.
Tables 2-4 show the overall accuracies of single SRM and M-SRM algorithms. With a single SRM algorithm it was evident that the combination of a set of SRMs obtained from it yielded an increase in accuracy. The highest overall accuracies of PSA, HNN, and MRF were 90.14, 89.15, and 89.76%, respectively, increased to 91.52, 91.38, and 91 [...] It was also evident that the accuracy of M-SRM was influenced by several factors. First, the highest overall accuracies of context-based M-SRM were higher than those of pixel-based M-SRM for each M-SRM. The accuracy of the context-based M-SRM was affected by the neighbourhood window size W and range value r. In context-based M-SRM, a larger W indicates a larger neighbourhood window that explores more sub-pixels with local spatial autocorrelation, and the spatial autocorrelations between distant sub-pixels in the W × W neighbourhood window are higher with a larger r. In general, context-based M-SRM with W < 5 and r < 3 generated the highest overall accuracy in this experiment. The reference map contains linear path objects that are no larger than the coarse-spatial resolution pixel, and these could be over-smoothed if the neighbourhood window W is large and the spatial autocorrelations between distant sub-pixels are high because r is large. Second, for M-SRM that combined different SRM algorithms, the algorithms selected for inclusion played a key role in determining the accuracy of the final map. The mean overall accuracy of PSA was higher than that of HNN and MRF; the overall accuracies of M-SRM that combined PSA were higher than those that excluded PSA. Specifically, the highest overall accuracy of M-PSA was higher than that of M-HNN and M-MRF, and the highest overall accuracies of M-PSA-HNN, M-PSA-MRF, and M-PSA-HNN-MRF were higher than that of M-HNN-MRF, which excluded PSA. These results highlight the importance of careful selection of algorithms for use in a multiple classifier system. 
Note, for instance, that the highest overall accuracy of M-PSA-HNN-MRF was lower than that of M-PSA and M-PSA-HNN. Thus a multiple classifier system using only a subset of the classification methods can be more accurate than one using the whole set available.

Overview
A set of analyses based on a real remotely sensed data set were undertaken. This research used an AVIRIS image to map land cover, with the result validated against reference data obtained from visual interpretation of imagery in Google Earth.

Data
An AVIRIS image acquired on 11 June 2008, comprising 224 spectral bands with a spatial resolution of 17 m for a test site centred on the airport located in Moffett Field, San Francisco Bay, USA, was used (Figure 4, 37°24′54″ N and 122°02′54″ W). The focus was on a 180 × 70 pixel subset of the imagery, for which a reference map was generated using a 900 × 350 pixel fine-spatial resolution image available in Google Earth acquired on 13 October 2008. The Google Earth image was geo-registered to the AVIRIS image (root mean squared error was 4.12 m). The scale factor was set to s = 5. The image contained four land-cover classes, namely, water, grass, dark surface, and white surface. The endmember signatures in the AVIRIS image were selected using the N-finder algorithm (Winter 1999). According to the geometry of convex sets, the N-finder is based on the fact that in p spectral dimensions, the p-volume contained by a simplex formed of the purest pixels is larger than any other volume formed from any other combination of pixels. The multiple endmember spectral mixture analysis was applied to generate land-cover class areal proportion images.

Results and discussion
The SRMs obtained from the analyses of the AVIRIS image are shown in Figure 4. In the maps obtained from PSA, HNN, and MRF, many speckle-like artefacts (examples highlighted by red circles in Figure 4) were observed. This is because the fractional covers, which were absent in a pixel but allocated by soft classification, were maintained in the PSA and were partly smoothed in HNN and MRF. Some speckle-like artefacts were still found in HNN and MRF. This occurred because the class proportion RMSE value was 0.2302 for the spectral unmixing output, which is very large. Scatter plots of the reference and unmixed class areal proportions are shown in Figure 5. The scatter plots indicate that there was obvious overestimation and underestimation in grass area, and obvious underestimation in light surface area. In the PSA in Figure 4, the speckle-like artefacts in region A were due to the underestimation of the grass fraction, and the speckle-like artefacts in regions B and C were due to the underestimation of the light surface fraction. There are more fractional covers represented as large speckle-like artefacts in the coarse pixels in the AVIRIS image than in the simulated image; the [...]
More accurate sub-pixel maps were obtained from the M-SRM relative to the single-algorithm SRM analyses. The highest overall accuracies of PSA, HNN, and MRF were 88.89, 93.81, and 82.70%, respectively, increased to 89.99, 94.05, and 82.92%, respectively, for pixel-based M-PSA, M-HNN, and M-MRF, and increased to 95.06, 95.37, and 85.56%, respectively, for context-based M-PSA, M-HNN, and M-MRF. The highest overall accuracies of context-based M-SRM were much higher than those of pixel-based M-SRM for each M-SRM. Higher overall accuracies were found for context-based M-SRM with W = 9 and r ≥ 3, which differs from the results obtained with the simulated data. 
This is because the spectral unmixing error was larger in the AVIRIS image experiment than the simulated image experiment, and the maps obtained from PSA, HNN, and MRF for the AVIRIS image contained more large speckle-like artefacts. The context-based M-SRM with larger W and r, which indicates a larger neighbourhood window size that explores more sub-pixels with local spatial autocorrelation and a higher spatial autocorrelation between distant sub-pixels in the W × W neighbourhood window, eliminated most of the artefacts more efficiently.
In terms of overall classification accuracy, the map obtained from MRF was less accurate than those from PSA and HNN. The highest overall accuracy of M-MRF was lower than those of M-PSA and M-HNN, and the highest overall accuracies of M-PSA-MRF, M-HNN-MRF, and M-PSA-HNN-MRF, which included MRF, were lower than that of M-PSA-HNN, which excluded it. As with the simulated data, the results showed that M-SRM combining the whole set of different maps did not always perform better than M-SRM combining only subsets of maps with high accuracy, highlighting the need for care in the selection of algorithms to use within a multi-classifier system (see Tables 5-7).

Conclusion
The potential to enhance land-cover mapping from remotely sensed data through the combination of multiple sub-pixel maps obtained from a set of SRM analyses was explored. In the multiple SRM approach, M-SRM, each sub-pixel is allocated the class label that is most frequently predicted for it in the available SRM outputs. Critically, the results of two studies using PSA, HNN, and MRF show that the M-SRM approach can increase the accuracy of land-cover maps over that achieved through the conventional use of a single SRM analysis. The land-cover maps generated from the M-SRM were also visually superior to those from standard single SRM analyses, with fewer speckle-like artefacts and with linear features such as paths more fully connected. Given that researchers often run an SRM algorithm several times in order to determine the optimal parameter settings, these results show that using, rather than discarding, the outputs of these trial runs can sometimes enhance the accuracy of SRM. The algorithms selected for use in M-SRM also play a key role in relation to map accuracy. The accuracy of maps obtained from M-SRM that used different algorithms was not always higher than that based on the use of a single algorithm.
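The plurality-voting rule summarised above can be illustrated with a minimal sketch. It assumes that each SRM output is an integer-labelled array of the same shape, with class labels 0 to n_classes - 1; the function name `plurality_vote` is hypothetical, and resolving ties in favour of the lowest class index is an assumption made here for illustration, since no tie-breaking rule is stated in this section.

```python
import numpy as np


def plurality_vote(maps, n_classes):
    """Label each sub-pixel with the class most often predicted across maps.

    maps      : sequence of 2-D integer arrays of identical shape,
                with labels in 0 .. n_classes - 1
    n_classes : number of land-cover classes
    Ties are resolved in favour of the lowest class index (an assumption).
    """
    stack = np.stack(maps)  # shape: (n_maps, rows, cols)
    # votes[c] counts how many maps assign class c to each sub-pixel
    votes = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)


# Three hypothetical 2 x 2 sub-pixel maps with three classes
map_a = np.array([[0, 1], [2, 2]])
map_b = np.array([[0, 1], [1, 2]])
map_c = np.array([[1, 1], [2, 0]])
combined = plurality_vote([map_a, map_b, map_c], n_classes=3)
```

In this toy case each sub-pixel takes the label on which at least two of the three maps agree.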
We assessed pixel-based M-SRM, which labels a sub-pixel based on the classes depicted in the multiple maps for that sub-pixel, and context-based M-SRM, which labels a sub-pixel based on the classes depicted in the multiple maps for that sub-pixel and its neighbouring sub-pixels. The highest overall accuracies of the context-based M-SRM were higher than those of pixel-based M-SRM for each M-SRM. In addition, the performance of the context-based M-SRM was found to depend on the neighbourhood window size used and on the magnitude of the parameter r, which controls the magnitude of spatial autocorrelation between sub-pixels in the W × W neighbourhood window. In context-based M-SRM, a larger W indicates a larger neighbourhood window that explores more sub-pixels with local spatial autocorrelation, and the spatial autocorrelation between distant sub-pixels in the W × W neighbourhood window is higher with a larger r than with a smaller r. With small values of W and r, context-based M-SRM would be expected to better preserve spatial details of land cover that are no larger than the coarse-resolution pixel size. In the simulated image analysis, most land-cover features smaller than the coarse-resolution pixel size were reconstructed with W < 5 and r < 3. With large values of W and r, context-based M-SRM would be expected to better eliminate speckle-like artefacts. In the AVIRIS data set, the class areal proportion image error was large and the PSA, HNN, and MRF maps contained many large speckle-like artefacts; most of these speckle-like artefacts were eliminated in context-based M-SRM with W = 9 and r ≥ 3.
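The roles of W and r described above can be illustrated with a hedged sketch of a context-based vote. The weighting function is an assumption: votes from a neighbour at distance d (in sub-pixels) are weighted by exp(-d / r), chosen only because it gives distant neighbours more influence as r grows; the weighting actually used in the study is not reproduced in this section. The function name `context_vote`, the zero padding at the map border, and the tie-breaking towards the lowest class index are likewise illustrative choices.

```python
import numpy as np


def context_vote(maps, n_classes, W=3, r=1.0):
    """Context-based vote over a W x W neighbourhood (illustrative sketch).

    maps      : sequence of 2-D integer arrays of identical shape
    n_classes : number of land-cover classes
    W         : odd neighbourhood window size; W = 1 reduces to a
                pixel-based plurality vote
    r         : positive decay parameter; ASSUMED weight is exp(-d / r)
    """
    stack = np.stack(maps)
    half = W // 2
    # Per-class vote counts at each sub-pixel, summed over all maps
    votes = np.stack(
        [(stack == c).sum(axis=0) for c in range(n_classes)]
    ).astype(float)
    rows, cols = votes.shape[1:]
    # Zero padding: sub-pixels outside the map cast no votes (assumption)
    padded = np.pad(votes, ((0, 0), (half, half), (half, half)))
    scores = np.zeros_like(votes)
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            weight = np.exp(-np.hypot(dy, dx) / r)  # assumed decay form
            scores += weight * padded[
                :, half + dy:half + dy + rows, half + dx:half + dx + cols
            ]
    return scores.argmax(axis=0)


# Hypothetical example: an isolated one-sub-pixel speckle present in all maps
speckled = np.zeros((5, 5), dtype=int)
speckled[2, 2] = 1
pixel_based = context_vote([speckled] * 3, n_classes=2, W=1)
context_based = context_vote([speckled] * 3, n_classes=2, W=3, r=3.0)
```

Under these assumptions the speckle survives the pixel-based vote, because all maps agree on it, but is removed once the neighbouring sub-pixels are allowed to vote. The same mechanism would also remove genuine features smaller than the window, which is consistent with the trade-off between small and large W and r noted above.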
In addition, the selection of optimal W and r values in context-based M-SRM may have been affected by the geo-reference error between the reference and input images. Context-based M-SRM performed better with relatively small W and r values in the simulated image experiment, in which the geo-reference root mean squared error was 0, and with relatively large W and r values in the AVIRIS image experiment, in which the geo-reference root mean squared error was 4.12 m. A comprehensive study of the impact of geo-reference error on M-SRM should be conducted in the future.

Disclosure statement
No potential conflict of interest was reported by the authors.