Multigranulation Supertrust Model for Attribute Reduction

As big data often contains a significant amount of uncertain, unstructured, and imprecise data that are structurally complex and incomplete, traditional attribute reduction methods are less effective when applied to large-scale incomplete information systems to extract knowledge. Multigranular computing provides a powerful tool for big data analysis conducted at different levels of information granularity. In this article, we present a novel multigranulation supertrust fuzzy-rough set-based attribute reduction (MSFAR) algorithm to support the formation of hierarchies of information granules of higher types and higher orders, which addresses newly emerging data mining problems in big data analysis. First, a multigranulation supertrust model based on the valued tolerance relation is constructed to identify the fuzzy similarity of the changing knowledge granularity with multimodality attributes. Second, an ensemble consensus compensatory scheme is adopted to calculate the multigranular trust degree based on the reputation at different granularities to create reasonable subproblems with different granulation levels. Third, an equilibrium method of multigranular coevolution is employed to ensure a wide range of balancing of exploration and exploitation; this strategy can classify super elitists' preferences and detect noncooperative behaviors with a global convergence ability and high search accuracy. The experimental results demonstrate that the MSFAR algorithm achieves high performance in addressing uncertain and fuzzy attribute reduction problems with a large number of multigranularity variables.


I. INTRODUCTION
RECENTLY, there has been explosive growth in the amount of data generated, and the term "big data" is being used to refer to the challenges of handling data with a high volume, variety, velocity, intrinsic value, and uncertain veracity. These "five V's" are the key features defining the essence of big data [1], [2]. Big data has attracted considerable attention from a variety of fields, including scientific research, marketing, business management, and government decision making, leading to an upsurge of research [3]-[8]. Although a large candidate set of attributes is provided in big data problems, most attributes may be redundant or irrelevant, which greatly diminishes the learning performance of decision-making algorithms. The complexity of the big data problem mainly arises from the very large number of decision variables and various types of constraints. Thus, it has become highly desirable to develop effective attribute reduction (feature selection) methods to extract useful knowledge hidden in large-scale data repositories. Since big data can often be incomplete, uncertain, and vague in reality, conventional knowledge discovery techniques, ranging from models, algorithms, and systems to applications, have been challenged in terms of how to store, manage, process, and analyze the complex attribute sets of big data [9]-[11].
With the increase in big data, researchers have started realizing the existence of data space alongside natural and social spaces and have shown remarkable interest in its exploration. Structuralized knowledge organization and reasoning is considered an effective paradigm for handling large-scale tasks. In the recent past, a considerable amount of work has focused on a new research area, granular computing (GrC), which is situated against the background of other human-centered information processing paradigms. GrC, a term coined by Zadeh [12], refers to a new knowledge representation and reasoning paradigm with information granules. Fuzzy sets and rough sets are the two main formal frameworks among active branches of GrC [13], which are of vital importance for the understanding of big data analysis conducted at different granularity levels. They provide two powerful conceptual and algorithmic vehicles for multiple-view data analysis. The fuzzy set theory introduced by Zadeh [14] in 1965 is a formal mechanism by which to represent and manipulate concepts with ambiguous boundaries and to understand and apply the processes employed in human reasoning. However, a fuzzy set is characterized by only a membership function, which ignores uncertain information and thus degrades its performance in big data analysis. The rough set theory proposed by Pawlak [15] in 1982 has been applied to contend with uncertainty caused by indiscernibility and incompleteness [16]-[21]. Since the rough set theory is complementary to the fuzzy set theory, fuzzy-rough sets have appeared as a newly emerging combination delivering the advantages of both complementary areas and are considered to provide a more powerful model for analyzing uncertainty in big data [22]-[24]. A fuzzy-rough set is defined by two fuzzy sets, the fuzzy upper and lower approximations, which are obtained by extending the corresponding notions of a rough set.
In the fuzzy-rough framework, elements have membership grades located within some range, which allows for greater flexibility in handling uncertain information [25]. In the Boolean case, elements that belong to the lower approximation are represented as belonging to the approximated set with absolute certainty. Therefore, it has become timely and strongly justified to develop effective fuzzy-rough set algorithms with multigranulation to enhance understanding and reasoning in big data analytics. Type-2 fuzzy sets, as a higher type with a higher order of information granules, extend the expressive capabilities of Type-1 fuzzy sets, and they are able to represent the imprecision of the membership function of fuzzy sets. A Type-2 interval number (IN) is a mathematical object that can be interpreted either probabilistically or possibilistically. The use of an IN is particularly appropriate when modeling linguistic concepts [26], [27]. Type-2 fuzzy sets have the potential to model uncertainties despite the large number of associated computations, especially when applied to nonreal-time applications [28], [29]. However, Type-2 fuzzy sets do not place any constraints on the continuity or other properties of their embedded sets.
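As a small illustration of how an interval Type-2 membership grade is bounded by two Type-1 membership functions (the footprint of uncertainty), consider the Python sketch below. The triangular membership functions and sample point are illustrative assumptions, not taken from this article:

```python
def it2_membership(x, lower_mf, upper_mf):
    """Interval Type-2 membership: the grade of x is the interval
    [lower_mf(x), upper_mf(x)] bounded by two Type-1 membership
    functions forming the footprint of uncertainty."""
    lo, hi = lower_mf(x), upper_mf(x)
    return (min(lo, hi), max(lo, hi))

def tri(a, b, c):
    """Triangular Type-1 membership function with support [a, c], peak b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))

# An inner (lower) and outer (upper) triangle sharing the same peak
lower, upper = tri(2, 5, 8), tri(1, 5, 9)
print(it2_membership(4.0, lower, upper))  # interval grade at x = 4
```

A crisp Type-1 set is recovered when the two bounding functions coincide, collapsing the interval to a single membership grade.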
Fuzzy-rough set models provide a method by which discrete data, real-valued noisy data, or a mixture of both can be greatly reduced, so that the models can be effectively applied to both regression and classification of large-scale datasets. Fuzzy-rough set research has attracted considerable attention in recent years, and several approaches have been proposed to improve the performance of traditional fuzzy sets and rough sets, as follows. Wang et al. [30] presented a new nearest-neighbor clustering classification algorithm based on the fuzzy-rough set theory, in which every training sample was assessed according to its fuzzy roughness, and training sample points in class boundaries or overlapping regions were then removed. Hassanien [31] introduced a hybrid scheme in conjunction with statistical feature extraction techniques by combining the advantages of both rough sets and fuzzy sets, wherein rough sets were employed for the generation of a minimal set of features, and fuzzy sets were used as an image preprocessing technique to enhance the contrast of the whole image. It was reported, however, that fuzzy-rough sets are sensitive to noisy samples. To alleviate this shortcoming, Hu et al. [32] discussed why rough set models are sensitive to noise and developed some robust fuzzy-rough set models based on fuzzy lower approximations. Petrosino and Salvi [33] presented a multiscale algorithm based on rough fuzzy sets in which rough sets handled the vagueness and fuzzy sets handled the coarseness. Sarkar [34] generalized the concept of rough membership functions to rough-fuzzy membership functions, wherein the value signified the rough uncertainty as well as the fuzzy uncertainty associated with the pattern. An et al. [35] proposed a novel robust data-distribution-aware fuzzy-rough set model by computing lower and upper approximations. However, the proposed models cannot be used to handle multimodal big data in real-world applications. Xu et al.
[36] put forward a novel data redundancy reduction approach based on both the fuzzy-rough set theory and the information theory. Salama [37] provided fuzzy-rough attribute reduction software to facilitate the reduction of high-dimensional data. Zeng et al. [38] proposed the fuzzy-rough set approach for incremental feature selection in hybrid information systems. Maji and Garai [39] presented an interval type-2 (IT2) fuzzy-rough feature selection method, judiciously integrating the merits of the IT2 fuzzy set and rough sets to effectively reduce real-valued noisy features. Zhao et al. [40] analyzed the nested topological structure of fuzzy-rough sets with incremental parameters and designed a novel algorithm to compute a nested classifier by reflecting all possible parameters. Feng and Mi [41] considered the multigranulation fuzzy-rough sets of an information system by the minimal and maximal membership degrees based on multifuzzy tolerance relations. Yang et al. [42] presented two incremental algorithms for attribute reduction with fuzzy-rough sets for one and multiple incoming samples. Wang et al. [43] introduced a fitting fuzzy-rough set model to guarantee the maximal membership degree of a sample to its own category. Because fuzziness is employed in the rough set theory, more reduction information relevant to continuous attributes can be successfully acquired. Hu et al. [50] proposed a model of multikernel fuzzy-rough sets and described a parallel strategy to handle large-scale multimodality fuzzy data attribute reduction.
Although fuzzy-rough attribute reduction methods have shown promising performance, they cannot cope well with the multimodality of big data and a large variety of real-world applications that involve the challenging complexity of big data. In practice, most attribute reduction and classification tasks are associated with mixtures of numerical and categorical attribute features. The size of multimodal big data is usually very large, resulting in extensive time consumption and the use of massive parallel processing databases in performing attribute reduction. Obviously, this can be greatly detrimental to the traditional attribute reduction performance for analyses of incomplete large-scale information systems. Furthermore, noisy attributes are also one of the main sources of uncertainty in big data applications. Although a few fuzzy-rough attribute reduction methods are robust toward complex noisy attributes, they require more user-supplied information, and there is a lack of continuity and inheritance in their internal relationships, which results in unsatisfactory performances. Meanwhile, when the volume of massive data objects increases in the database, much more computing time and space overhead are necessary to address the new rendering decision attributes. Currently, few works have considered the multigranulation algorithm for big data analysis at different granularity levels, and there has been a shortage of fuzzy-rough set analytical research.
The focus of this article is to devise a fuzzy-rough attribute reduction approach capable of addressing structurally complex and granular large-scale attributes. The efficiency of fuzzy-rough attribute reduction algorithms on very large-scale datasets is an important research topic for the future. The multigranulation fuzzy-rough model is an appropriate solution by which to accelerate the process of finding attribute reduction sets. In this article, we propose a novel multigranulation supertrust fuzzy-rough attribute reduction (MSFAR) algorithm to support the formation of hierarchies of information granules of higher types and higher orders. MSFAR is suitable not only to address newly emerging attribute reduction problems associated with an irregular distribution of changing large-scale datasets, but also to satisfy scenarios with complex noisy attributes. Furthermore, the multigranulation supertrust model offers a new way to classify data with different degrees of overlap, resulting in the creation of reasonable subproblems with different levels of granulation. Its main advantages are its high efficiency and robustness. Therefore, the main contributions of this article are as follows.
1) A multigranulation supertrust model based on valued tolerance relations is constructed to identify the fuzzy similarity of the changing knowledge granularity for fuzzy classification with multimodality attributes, which effectively solves the attribute reduction problem of missing data in a large-scale information system.
2) An ensemble consensus compensatory scheme is adopted to calculate the multigranular trust degree in different granularities, resulting in the creation of reasonable subproblems with different granulation levels, from coarsened to refined.
3) An equilibrium method of multigranular coevolution is employed to ensure a wide range of balancing of exploration and exploitation. This strategy can classify super elitists' preferences and detect noncooperative behaviors with a global convergence ability and high search accuracy.
The rest of this article is organized as follows. Section II provides background information about the fuzzy-rough set model based on the valued tolerance relation. Section III introduces a novel multigranulation supertrust model with a self-evolving compensatory scheme, wherein the multigranulation supertrust model, the ensemble consensus compensatory scheme, and an equilibrium method of multigranular coevolution are described in detail. Section IV details the primary steps of MSFAR. Section V describes the extensive experimental evaluations. Finally, Section VI concludes this article.

II. FUZZY-ROUGH SET MODEL BASED ON VALUED TOLERANCE RELATIONS
This section provides the relevant definitions for the fuzzy-rough set model based on valued tolerance relations.
Definition 1 [15]: In the rough set theory, the universe is divided into a set of equivalence classes according to the attribute values of objects. An information system can be defined as a decision table T = (U, C, D, V, f), where U is the universe of objects, C is the set of condition attributes, D is the set of decision attributes, V is the value set of all attributes, and f : U × (C ∪ D) → V is the information function assigning each object a value on each attribute.
Definition 2 [16], [17]: For P ⊆ C ∪ D, an equivalence relation IND(P) is defined as IND(P) = {(x, y) ∈ U × U : f(x, a) = f(y, a) for all a ∈ P}. IND(P) partitions U into disjoint subsets. Let U/P denote the family of all equivalence classes of the relation IND(P), i.e., U/P = {P_1, P_2, ..., P_i, ..., P_n}, where P_i is an equivalence class of P, denoted [x_i]_P. Note that equivalence classes are defined with respect to their own attribute sets. The equivalence classes U/C and U/D are called the condition and decision classes, respectively.
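The equivalence classes U/P of Definition 2 can be computed by grouping objects on their values over P. A minimal Python sketch (the toy decision table is a made-up example):

```python
from collections import defaultdict

def equivalence_classes(objects, attrs):
    """Partition objects into equivalence classes of IND(P): two
    objects are equivalent iff they agree on every attribute in P."""
    classes = defaultdict(list)
    for name, row in objects.items():
        key = tuple(row[a] for a in attrs)   # value profile over P
        classes[key].append(name)
    return [sorted(group) for group in classes.values()]

# Toy decision table: condition attributes a, b and decision d
table = {
    "x1": {"a": 1, "b": 0, "d": "yes"},
    "x2": {"a": 1, "b": 0, "d": "no"},
    "x3": {"a": 0, "b": 1, "d": "yes"},
}
print(equivalence_classes(table, ["a", "b"]))  # [['x1', 'x2'], ['x3']]
```

Grouping on ["a", "b"] yields the condition classes U/C, while grouping on ["d"] yields the decision classes U/D.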
Definition 3 [14]: The fuzzy decision D̃_i(x) ∈ [0, 1] denotes the degree to which an object x ∈ U belongs to the decision class D_i.
Definition 4: For an object x ∈ U, the fuzzy positive region of D relative to B is defined as POS_B(D)(x) = max_i R_B(D_i)(x), where the D_i are the decision classes of D and R_B is the lower approximation operator defined below.
Definition 5 [42]: For each condition attribute a ∈ A, one can define a fuzzy binary relation R_a, which is called a fuzzy equivalence relation if R_a is reflexive (R(x, x) = 1), symmetric (R(x, y) = R(y, x)), and sup-min transitive (R(x, y) ≥ sup_{z∈U} min{R(x, z), R(z, y)}) for all x, y ∈ U. A subset B ⊆ A also defines a fuzzy equivalence relation, denoted by R_B = ∩_{a∈B} R_a. Based on the fuzzy equivalence relation, the concept of a fuzzy rough set is defined as follows.
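The three conditions of Definition 5 can be checked numerically for a fuzzy relation stored as an n × n membership matrix. A Python sketch (illustrative, not code from the article):

```python
import numpy as np

def is_fuzzy_equivalence(R, tol=1e-9):
    """Check reflexivity, symmetry, and sup-min transitivity of a
    fuzzy relation given as an n x n matrix of membership degrees."""
    reflexive = np.allclose(np.diag(R), 1.0)
    symmetric = np.allclose(R, R.T)
    # sup-min transitivity: R(x, y) >= max_z min(R(x, z), R(z, y))
    composed = np.max(np.minimum(R[:, :, None], R[None, :, :]), axis=1)
    transitive = np.all(R >= composed - tol)
    return bool(reflexive and symmetric and transitive)
```

The intersection R_B = ∩_{a∈B} R_a corresponds elementwise to `np.minimum` over the matrices of the individual attributes.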
Let F(U) be the fuzzy power set of U and B ⊆ A. For each x ∈ U and X ∈ F(U), a pair of lower and upper approximation operators is defined by
R_B(X)(x) = inf_{y∈U} max{1 − R_B(x, y), X(y)}
R̄_B(X)(x) = sup_{y∈U} min{R_B(x, y), X(y)}
where the lower approximation R_B(X)(x) is considered as the degree to which x certainly belongs to X, whereas the upper approximation R̄_B(X)(x) is the degree to which x possibly belongs to X. The pair (R_B(X), R̄_B(X)) is referred to as the fuzzy rough set of X with respect to B, which is defined based on the max t-conorm and min t-norm.
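The lower and upper approximation operators translate directly into vectorized code. A Python sketch, assuming the max t-conorm / min t-norm pair named above:

```python
import numpy as np

def fuzzy_approximations(R, X):
    """Fuzzy-rough approximations of fuzzy set X under relation R:
      lower(x) = inf_y max(1 - R(x, y), X(y))
      upper(x) = sup_y min(R(x, y), X(y))
    R is an n x n membership matrix, X a length-n membership vector."""
    lower = np.min(np.maximum(1.0 - R, X[None, :]), axis=1)
    upper = np.max(np.minimum(R, X[None, :]), axis=1)
    return lower, upper
```

For a crisp relation (0/1 entries) and a crisp set, the pair reduces to the classical Pawlak lower and upper approximations.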
Definition 6: For the fuzzy-rough attribute reduction process, it is necessary to determine the dependence degree of the decision features. The dependence function of D relative to B is formally described by
∂_B(D) = |POS_B(D)| / |U| = (Σ_{x∈U} POS_B(D)(x)) / |U|
where 0 ≤ ∂_B(D) ≤ 1; it is defined as the ratio of the size of the positive region to the number of all samples in the feature space.
Definition 7: The main idea behind the attribute reduction process using fuzzy-rough sets is to find a minimal subset of attributes that keeps the positive region unchanged, so that the discarded features do not affect decision making. Considering a decision table (U, A ∪ D), a subset Red ⊆ A is called a reduct of A relative to D if the following conditions are satisfied:
1) ∂_Red(D) = ∂_A(D);
2) ∀a ∈ Red, ∂_{Red−{a}}(D) < ∂_Red(D).
Note that the reduct is usually not unique. Let Red_D(A) denote the set of all reducts with respect to (U, A ∪ D); then Core_D(A) = ∩Red_D(A) is called the core of (U, A ∪ D). It is easier to obtain the core first and then find a reduct based on the core.
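Definitions 6 and 7 suggest the classic greedy search for a reduct: repeatedly add the attribute that raises the dependence degree most. The Python sketch below implements this QuickReduct-style baseline, which is only a simple stand-in for the MSFAR search described later; the relation matrices and decision sets are toy inputs:

```python
import numpy as np

def dependency(R_B, decision_sets):
    """Dependence degree of D on B: mean fuzzy positive-region membership,
    POS_B(D)(x) = max over decision classes of the lower approximation."""
    pos = np.zeros(R_B.shape[0])
    for X in decision_sets:
        lower = np.min(np.maximum(1.0 - R_B, X[None, :]), axis=1)
        pos = np.maximum(pos, lower)
    return pos.mean()

def quickreduct(relations, decision_sets):
    """Greedy forward selection: add the attribute with the largest
    dependency gain until no attribute improves it (a reduct sketch)."""
    remaining, reduct = set(relations), []
    R_B = np.ones_like(next(iter(relations.values())))  # empty B: R_B = 1
    best = 0.0
    while remaining:
        gains = {a: dependency(np.minimum(R_B, relations[a]), decision_sets)
                 for a in remaining}
        a_star = max(gains, key=gains.get)
        if gains[a_star] <= best:
            break
        best = gains[a_star]
        R_B = np.minimum(R_B, relations[a_star])  # R_B = intersection of R_a
        reduct.append(a_star)
        remaining.remove(a_star)
    return reduct, best
```

Greedy selection finds a superreduct quickly but does not guarantee minimality, which is one motivation for the evolutionary search developed in Section III.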
Definition 8 [44], [45]: In the improved quantitative tolerance model for an incomplete information system DIIS = {U, AT, V, f}, the similarity degree of any two objects x, y ∈ U based on the value of an attribute a ∈ A is defined per attribute; the similarity between x and y over A ⊆ AT is obtained by aggregating these per-attribute degrees, and the quantitative tolerance class of x in A ⊆ AT collects the objects whose similarity to x meets the classification threshold.
Definition 9: A multigranular valued tolerance relation is considered a good decomposition fuzzy-rough set framework for addressing large-scale problems with dynamically increasing complexity. For the information system DIS = {U, A, V, f}, let A_1, A_2, ..., A_m ⊆ A be a sequence of m attribute sets, each associated with its own classification threshold. For any X ⊆ U, the upper and lower approximations based on the graded multigranulations are defined with respect to these attribute sets, and the resulting pair is the optimistic multigranularity fuzzy-rough set model based on quantitative tolerance.
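For intuition, per-attribute similarity under a valued tolerance relation with missing values ("*") might be computed as below. The neutral 0.5 score for a comparison against a missing value is an assumed convention for illustration; the article's exact formula is given by its numbered equations:

```python
def tolerance_similarity(x, y, attrs, missing="*"):
    """Average per-attribute similarity of two objects in an incomplete
    information system. Assumed convention: equal known values score 1,
    any comparison involving a missing value scores a neutral 0.5, and
    differing known values score 0."""
    scores = []
    for a in attrs:
        if x[a] == missing or y[a] == missing:
            scores.append(0.5)
        elif x[a] == y[a]:
            scores.append(1.0)
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)
```

The quantitative tolerance class of x is then simply the set of objects y whose similarity to x meets the chosen classification threshold.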

III. MULTIGRANULATION SUPERTRUST MODEL WITH A SELF-EVOLVING COMPENSATORY SCHEME
Traditional attribute reduction methods are satisfactory to a certain extent, but they are not capable of addressing massive amounts of complex large-scale data. Thus, there is a need to devise an effective trust method to efficiently handle the inherent multimodal attribute characteristics of big data. A multigranulation supertrust fuzzy-rough set model based on the valued tolerance relation is constructed to extract the fuzzy similarity of changing knowledge granularity for fuzzy classification. This model effectively solves the problem of missing data in an incomplete large-scale information system. With the increasing dimensionality of the multigranulation space, most approaches to extracting the fuzzy similarity of knowledge granularity are easily trapped in local optima due to overexploitation; therefore, their performance deteriorates. To achieve a better balance between the exploration and exploitation of knowledge granularity when solving complex large-scale datasets, we propose a novel multigranulation supertrust model with a self-evolving compensatory scheme to calculate the multigranular trust degree according to (17) at different granularities, thereby splitting the large data set into reasonable subdatasets. In addition, this model can explore the search space and locate the globally best region during the fuzzy-rough attribute reduction process, as well as alleviate premature convergence.

A. Multigranulation Supertrust Model
Since there are various methods for calculating the credibility of a population, a practical method is to use the multigranulation supertrust framework to adjust trust relationships based on subpopulations' interactions in different granularity spaces. We construct a granu-subpopulation architecture according to the trust degree of evolutionary elitists, which comprise super elitists and ordinary elitists. The credibility of subpopulations in the same granu-subpopulation is calculated according to the trust calculation mechanism based on their respective reputations, which has been proven to be a good reflection of the trust relationships between subpopulations of different granularities. Since the interactions between elitists in the same granu-subpopulation are more frequent than those between granu-subpopulations in the common topology, the trust degree can be quickly established, and the granu-subpopulation can be evaluated effectively. The dynamic trust execution process is described in Fig. 1. Two types of supertrust relationships are employed to play two roles within different granu-subpopulations for fuzzy-rough attribute reduction. As specified in Fig. 2, the supertrust relationships between a super elitist and an ordinary elitist and between ordinary elitists are both direct trust relationships, and those between the super elitists of different granu-subpopulations are recommendation trust relationships. The main steps of the process are described in Algorithm 1.

B. Ensemble Consensus Compensatory Scheme
In this section, an ensemble consensus compensatory scheme is presented to evaluate the trust information of recommended granu-subpopulations. Its main novelty lies in calculating the multigranular trust degrees at different granularities based on the fuzzy granularity compensatory scheme, as well as addressing reasonable subproblem splitting. By analyzing the change in the knowledge granularity produced by coarsening and refining in the process of attribute reduction, the overall performance is greatly improved. Fig. 2 shows the framework of the ensemble consensus compensatory scheme, and its main steps are described in Algorithm 2.

Algorithm 1: Continued.
3. Initialize the first granularity subpopulation by assigning the collective preference in round P^t to the granularity subpopulation center GS^t_1. Then, initialize the second granularity subpopulation center GS^t_2 as the elitist; compute the minimum distance between each remaining ordinary elitist's preference E^t_i and all current initial granularity subpopulation centers, find the super elitist preference whose minimum distance is the largest, and assign it to GS^t_h. Repeat this step until all N granularity subpopulations are initialized.
4. In Granu-subpopulation_i, to compute the distances between preference relations (both elitists' preferences and cluster centers indistinctly), the trust degree of different elitists in the same granularity subpopulation is defined by the corresponding trust equation, where n is the total number of elitists, SP_i is the super elitist, and P_ij is the ordinary elitist in Granu-subpopulation_i.
5. Compute the trust degree uGS_h(P_i) ∈ [0, 1] of each super elitist SP_i toward each granularity subpopulation center GS^t_h (h ≥ 2).
6. Update the granularity subpopulation centers GS^t_h (h ≥ 2) and the cluster trust degrees uGS_h(P_i) iteratively. Reputations represent the trust degrees of the different granu-subpopulations.
7. Determine the similarity between each granularity subpopulation GS^t_h (h ≥ 2) in the current round t, t ∈ {2, ..., Maxrounds − 1}, and each granularity subpopulation C^{t−1}_u (u ≥ 2) of the previous round (t − 1). Thus, the scale of the subdatasets is dynamically updated by the trust-degree relationships based on the subpopulation interactions in different granularity spaces.
8. Define a granularity subpopulation similarity measure sim(GS^t_h, GS^{t−1}_u) in terms of Δ^t_hu(P_i) ∈ [0, 1], the variation in the membership of P_i between the two granularity subpopulations. If the measure exceeds the similarity threshold for the granularity, GS^t_h and GS^{t−1}_u are assumed to represent the same granu-subpopulation.
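The iterative update of trust degrees toward subpopulation centers resembles a fuzzy-clustering membership update. The Python sketch below computes inverse-distance memberships; this is an assumed simplification for illustration, not the article's exact trust equation:

```python
import numpy as np

def trust_degrees(preferences, centers, eps=1e-12):
    """Fuzzy membership of each elitist's preference vector toward each
    granularity-subpopulation center, from inverse distances (a common
    fuzzy-clustering form; the article's trust formula is given by its
    numbered equations). Rows of the result sum to 1."""
    d = np.linalg.norm(preferences[:, None, :] - centers[None, :, :],
                       axis=2) + eps       # pairwise elitist-center distances
    inv = 1.0 / d
    return inv / inv.sum(axis=1, keepdims=True)
```

Centers can then be re-estimated as membership-weighted means of the preferences, and the two updates alternated until the memberships stabilize.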

C. Equilibrium Adjustment Strategy of Multigranular Coevolution
Information granularities are not completely independent; they usually overlap and overlay each other. Therefore, a dynamic-approximation equilibrium adjustment strategy is needed in the multigranular-coevolution space to keep the super elitists from being trapped in local optima and, thus, to provide good guidance for all elitists. When super elitists are in the local-optima state according to the premature-judgment mechanism, the mutative-scale equilibrium adjustment strategy is used to ensure a wide range of balancing of exploration and exploitation. This strategy can classify super elitists' preferences and detect noncooperative behaviors with a global convergence ability and high search accuracy. Fig. 3 shows the updating approximations of the equilibrium adjustment strategy when using different multigranularities for super elitists. Our projection produces different isosceles right triangles, and the arrow shows the direction of sorting super elitists in each triangle area. If two super elitists start with the lower granularity of a_N3, the attribute values of the updating approximations converge to the equilibrium pair (a_N3, a_N3). Similarly, if both super elitists start with a very high granularity of a_N1, the attribute values of the updating approximations converge to the equilibrium pair (a_N1, a_N1). Thus, the results show that using the equilibrium adjustment strategy of multigranular coevolution for super elitists based on isosceles right triangles can lead to an increase in the size of the basin of multigranularity attraction. Its main steps are described in Algorithm 3.

Algorithm 2: Ensemble Consensus Compensatory Scheme (ECCS).
1. Dedicate each granu-subpopulation to its corresponding attribute set. The fitness evaluation is distributed equally among the participating Granu-subpopulation_i with the importance weight w_i of the ith elitist. Granu-subpopulation_i is evaluated in its own domain.
2. Compute the weights by the fuzzy measures over each Granu-subpopulation_i, where σ is the permutation of the evaluation values of the alternative A_i, C_iσ(j) is the criterion corresponding to d_iσ(j), w_iσ(j) is the weight of criterion C_iσ(j), and μ(A_iσ(j)) is the fuzzy measure.
3. Based on the given weight vector w = (w_1, w_2, ..., w_m)^T, the intuitionistic fuzzy matrix, a matrix of pairs of nonnegative weight numbers, is constructed as I = (a_ij)_{n×m} (i = 1, 2, ..., n; j = 1, 2, ..., m).
4. Each super elitist SP_i ∈ E provides its preference for alternatives according to the fuzzy preference relation P_i = (p^lk_i)_{n×n}, which consists of a matrix of assessments p^lk_i for each pair (x_l, x_k), l, k ∈ {1, 2, ..., n}. The ensemble consensus of preferences can be improved if the super elitists provide reciprocal assessments: if p^lk_i = p with p ∈ [0, 1] and l ≠ k, then p^kl_i = 1 − p.
5. Compute the similarity degree of each pair of super elitists (SP_i, SP_j); the similarity matrix SM_ij = (sm^lk_ij)_{n×m} is then defined, where sm^lk_ij ∈ [0, 1] is the similarity degree between super elitists SP_i and SP_j in their assessments p^lk_i and p^lk_j, as obtained by the similarity function.
6. A consensus matrix CM = (cm^lk)_{n×m} is computed by aggregating the similarity matrices with the importance weights w_ij ∈ [0, 1] of each pair of super elitists (SP_i, SP_j); cm^lk ∈ [0, 1] (l ≠ k) is the weighted average of the similarity degrees.
7. If all super elitists are given equal importance weights, cm^lk is redefined with the factor m(m − 1)/2, the number of different pairs of elitists (e_i, e_j) in Granu-subpopulation_i.
8. Design the average weight cm^lk_ij associated with each pair of super elitists (e_i, e_j) to select each pair of features (x_l, x_k), where w_ij ∈ [0, 1] is computed from the single super elitists' weights w_i and w_j.
9. An ensemble consensus degree with a compensatory scheme is computed at three different levels. Let S^t_h (h ≥ 2) be the sums of the elitists' membership degrees to cluster C^t_h; analogously, let S^t_1 = Σ^m_{i=1} μ_{C^t_1}(P^t_i) be the sum of the elitists' membership degrees to the collective preference. Super elitists' fuzzy membership degrees uC_h(P_i) to Granu-subpopulation_i are computed using similarity measures, where the distance between preference P_i and the super elitist center C_h is represented as d(P_i, C_h). Finally, calculate the consistency ratio CR for each of the super elitists, with C^t = (c^t_ij)_{n×n} and D^t = (d^t_ij)_{n×n}, t ∈ {1, 2, ..., s}.
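The aggregation of pairwise preference similarities into a consensus matrix (steps 5-7 of Algorithm 2) can be illustrated as below. The sketch assumes the common similarity choice sm = 1 − |p_i − p_j|, which may differ from the article's own similarity function:

```python
import numpy as np

def consensus_matrix(preference_mats, weights=None):
    """Aggregate pairwise similarity of super elitists' fuzzy preference
    relations into a consensus matrix. Assumed similarity between
    assessments: sm = 1 - |p_i - p_j| (a standard choice)."""
    P = np.asarray(preference_mats)                # shape (m, n, n)
    m = P.shape[0]
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    if weights is None:                            # equal importance weights
        weights = {pair: 1.0 / len(pairs) for pair in pairs}
    cm = np.zeros(P.shape[1:])
    for (i, j) in pairs:
        cm += weights[(i, j)] * (1.0 - np.abs(P[i] - P[j]))
    return cm
```

Entries of the consensus matrix close to 1 indicate assessments on which the super elitists agree; low entries flag the feature pairs driving disagreement.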

IV. PROPOSED MSFAR ALGORITHM
We propose a novel MSFAR algorithm to support the formation of hierarchies of information granules of higher types and higher orders for big data analysis. To accomplish this, we implement the described multigranulation fuzzy-rough sets and supertrust model with a self-evolving compensatory scheme to calculate the multigranular trust degree. The algorithm explicitly permits identifying interdependent variables and adaptively decomposes them in the multigranulation space, so that the complexity and nonseparability of interdependent variables can be minimized among different fuzzy attribute subsets. It also extracts the fuzzy similarity of the changing knowledge granularity for fuzzy classification with multimodality attributes, effectively solving the problem of missing data in incomplete large-scale information systems. The proposed MSFAR algorithm can split a large data set into reasonable subdatasets with the multigranulation supertrust model. It incorporates an additional multigranulation module and supertrust coevolution to achieve the desirable goal of detecting complex interdependent variables, which can serve as a guide for carrying out fuzzy-rough attribute reduction tasks with multigranulation flexible classification thresholds in big data. Its main steps are detailed as follows.
First, the fuzzy attribute sets are mapped into the evolutionary population space, and the fuzzy reduction model is completed as the optimization objective model.
Second, a multigranulation fuzzy-rough model based on the valued tolerance relation is constructed to identify the fuzzy similarity of the changing knowledge granularity for fuzzy classification with multimodality attributes, which effectively solves the problem of missing data in an incomplete large-scale information system. Then, an equilibrium method of multigranular coevolution is employed to classify super elitists' preferences and detect noncooperative behaviors.
Third, the multigranulation supertrust-coevolution model with a self-evolving compensatory scheme is adopted to calculate the multigranular trust degrees of different granularities, which represent their reputations in the group, resulting in the creation of reasonable subproblems with different levels from coarsened to refined granulation. It can self-adapt among different multigranulation layers and capture interdependent fuzzy-rough attribute subsets.
The pseudocodes of MSFAR are listed in Algorithm 4.

V. EXPERIMENTAL EVALUATION AND DISCUSSIONS
To validate the efficiency and effectiveness of the proposed MSFAR algorithm, we carry out a thorough series of experiments in this section: we compare the computational times and accuracies of MSFAR against other representative algorithms on large-scale datasets, and we compare their robustness on big datasets with attribute noise.

A. Experimental Setup
All experiments are performed on computers with a Windows 10 operating system, an Intel i7-3770K CPU, and 64 GB of RAM, using the Java 11 programming language. We select five publicly available large-scale datasets with heterogeneous attributes from the UCI repository [46]. In addition, we produce three synthetic large-scale datasets with the WEKA data mining software [47]; these datasets contain large numbers of samples with different statistical characteristics. Descriptive information about the eight datasets is given in Table I, where "KddCup99" is the network connectivity dataset from a USA air force simulation over nine weeks and "RLCP" stands for "record linkage comparison patterns." The Susy and PokerHand datasets are duplicated several times. We employ stratified tenfold cross-validation for data validation. The original dataset is equally partitioned into ten parts, wherein two parts are used for testing and the remaining eight parts are used as the training set for attribute reduction. A classifier is then learned with the reduced training set, and the classification accuracy is obtained on the reduced testing data. The cross-validation process is repeated ten times, and the average values are taken as the final performance. We compare the experimental results of MSFAR with those achieved by other representative algorithms.

Algorithm 4: Proposed MSFAR algorithm.
1. Initialize the search space of the fuzzy attribute sets by assigning the collective preference in round P_t, initialize the granularity subpopulations GS_h, and generate a list of candidate fuzzy-rough attribute subsets (A_1, A_2, ..., A_n).
2. Decompose the fuzzy-rough attribute sets, compute the equivalence classes of the decision table, and classify the super elitists' preferences using Algorithm 1 (MSTM); then obtain S = {E_r1, E_r2, ..., E_rn}.
3. Conduct the ensemble consensus compensatory scheme using Algorithm 2 (ECCS) to determine whether a granularity subpopulation is composed of the same super elitists: suppose that GS_h^t and GS_u^(t-1) represent two granularity subpopulations of super elitists and ordinary elitists whose super elitists' trust degrees have values close to each other for all e_i ∈ E; then compose the two similar granularity subpopulations.
4. Perform the equilibrium adjustment strategy of multigranular coevolution using Algorithm 3 (EASMC) and obtain the corresponding perfect-consistency equilibrium degree CED based on isosceles right triangles, with C̃^t = (c̃_ij^t)_{n×n}, t ∈ {1, 2, ..., s}, for any inconsistent matrix, so that a perfect-consistency pair T̃^t = (C̃^t, D̃^t) can be constituted for t ∈ {1, 2, ..., s} (29).
6. Reconstruct the energy function, where G denotes a K × K matrix defining the connectivity between elitist populations k and j, and v_j denotes the accumulated class probabilities in the neighborhood information N_i of elitist population i.
7. Formulate the multigranulation flexible threshold FT_R(X) of R.
(iii) Select the best feature subset FS_i^best for each granularity subpopulation and obtain a cascade feature set of fuzzy-rough attribute subsets (32).
9. Evaluate whether the accuracy of the fuzzy-rough attribute reduction satisfies the predefined accuracy. If satisfied, output the optimal set FS_Opt = ∪_{i=1}^{n} FS_i^best; otherwise, go to Step 6.
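The stratified partition used in the experimental setup, with two parts for testing and the remaining eight for training, can be sketched as follows. The round-robin assignment of shuffled per-class indices is an illustrative implementation choice, not the paper's stated procedure.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds=10, seed=0):
    """Partition sample indices into n_folds parts, preserving class ratios."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % n_folds].append(idx)  # round-robin keeps folds balanced
    return folds

def split_two_test_folds(folds, a, b):
    """Two parts form the test set; the remaining eight form the training set."""
    test_idx = folds[a] + folds[b]
    train_idx = [i for k, f in enumerate(folds) if k not in (a, b) for i in f]
    return train_idx, test_idx

# Toy example: 100 samples, two balanced classes.
labels = [0] * 50 + [1] * 50
folds = stratified_folds(labels)
train_idx, test_idx = split_two_test_folds(folds, 0, 1)
```

Repeating the split over the ten possible runs and averaging the accuracies reproduces the cross-validation protocol described above.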

B. Computational Times and Accuracies of Different Algorithms
To test the computational feasibility of MSFAR for use with large-scale datasets, we quantitatively compare its classification accuracy with those of some representative fuzzy attribute reduction algorithms: fuzzy boundary-region-based FS (B-FRFS) [48], the mutual information-based algorithm for fuzzy-rough attribute reduction (MIBAFRAR) [36], unsupervised fuzzy-rough feature selection (UFRFS) [49], multimodality fuzzy data attribute reduction (MFDAR) [50], and the multilabel fuzzy rough set (MLFRS) [51]. To avoid the influence of random selection, 50 independent runs are conducted for each dataset, and the averaged computational time and accuracy are presented as the final results. In Table II, we employ "−" to indicate no acceptable solution and a bold number to represent the best computational time (time/s ×10^2). Table III shows the numerical comparison of the features selected by the different fuzzy attribute reduction algorithms. As described in Tables II and III, the proposed MSFAR algorithm can reduce the amount of redundant uncertain, unstructured, and imprecise data and significantly improve the computational time. It consistently acquires the optimal number of selected features of the fuzzy-rough attribute sets, performing much better than its rivals on seven of the large-scale datasets. The main reason behind these results is that the multigranulation supertrust-coevolution model with a self-evolving compensatory scheme employed in MSFAR calculates the multigranular trust degree at different granularities and splits large datasets into reasonable subdatasets. MSFAR can consider both strongly relevant features and their corresponding correlated features simultaneously, and it selects important correlated features from a set of attribute features for classification.
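The 50-run averaging protocol used for these comparisons can be sketched as below; the timed routine is a stand-in for one attribute-reduction-plus-classification run, and the function names are illustrative.

```python
import time
import statistics

def benchmark(fn, runs=50):
    """Run fn repeatedly; return mean wall-clock time and mean returned accuracy."""
    times, accuracies = [], []
    for _ in range(runs):
        start = time.perf_counter()
        acc = fn()  # one full reduction + classification run (stand-in here)
        times.append(time.perf_counter() - start)
        accuracies.append(acc)
    return statistics.mean(times), statistics.mean(accuracies)

# Stand-in for a single experimental run returning a classification accuracy.
def dummy_run():
    return 0.95

mean_time, mean_acc = benchmark(dummy_run, runs=50)
```

Averaging over many independent runs smooths out the randomness of population initialization, which is why the tables report means rather than single-run figures.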
In the following experiment, we further evaluate the classification accuracy of MSFAR on dynamically increasing sample sizes of the large-scale Higgs and Weka-1.8G datasets, compared with the representative algorithms. We employ two classifiers, namely, the support vector machine (SVM) [52] and C4.5 [53], to process datasets whose attributes have been selected by six different methods: B-FRFS, MIBAFRAR, UFRFS, MLFRS, MFDAR, and MSFAR. Tables IV and V report the classification accuracy versus the dynamically increasing sample size of the large datasets with the SVM classifier and the C4.5 classifier, respectively. The six attribute reduction algorithms result in different sets of attributes for the large-scale increasing datasets. It is obvious that MSFAR significantly surpasses most of the representative algorithms. As an extreme case, both B-FRFS and MIBAFRAR fail with increasing sample sizes of the Higgs dataset because they are easily overwhelmed when processing high-dimensional large datasets. However, MSFAR achieves much better classification accuracy because the multigranulation fuzzy-rough set model can accurately capture interdependent variables associated with structurally complex and incomplete attribute sets and can eliminate most irrelevant attribute sets without lowering the classification performance. For example, for the 80 × 10^6 sample size of Weka-1.8G, MSFAR reaches 97.81% classification accuracy, whereas B-FRFS, UFRFS, and MLFRS reach 92.21%, 93.43%, and 94.62%, respectively. For the ever-growing large-scale datasets, MSFAR performs significantly better. Therefore, the performance of MSFAR improves for large datasets: the larger the dataset, the higher the classification accuracy.
As indicated by the experimental results, MSFAR is suitable for fuzzy-rough attribute reduction and classification on large-scale increasing datasets, thereby overcoming the limitations of the representative algorithms.
The experimental results clearly indicate that the classification system employing MSFAR as the fuzzy-rough attribute reduction algorithm can acquire the optimal reduction results of structurally complex and incomplete attribute sets and lead to an appealing performance in the classification accuracy, irrespective of different classifiers.

C. Statistical Analysis
To sufficiently report the classification accuracies, an appropriate statistical test needs to be applied to evaluate the statistical significance of the results of the fuzzy-rough attribute reduction algorithms. In this article, a paired t-test is used to determine the statistical significance of the results at the 0.05 significance level; the results for the SVM classifier are reported in Table VI, where a "b" symbol next to a value indicates that the performance is statistically better than that of MSFAR, and a "w" symbol indicates that the performance is statistically worse. The final line in Table VI summarizes the counts of statistically better, equivalent, and worse results for each representative fuzzy attribute reduction algorithm in comparison with MSFAR. The statistical comparisons between each compared algorithm and MSFAR indicate that there are only three datasets (KddCup99, Susy, and RLCP) on which MSFAR is bettered, by MLFRS and MFDAR; for the remainder, MSFAR achieves accuracy results statistically equivalent to or better than those of the other representative fuzzy attribute reduction algorithms.
It is obvious from Table VI that B-FRFS and MIBAFRAR do not perform as well as MSFAR in classification accuracy with the SVM classifier, as there are four or five cases wherein they produce statistically worse results than MSFAR. From the results reported in this table, it can be seen that the representative fuzzy attribute reduction algorithms attain a few results that are statistically comparable with the accuracy on the unreduced data, but MSFAR achieves significantly better classification results than they do. Generally speaking, it can be concluded that MSFAR is superior to the other algorithms in the paired t-test: it achieves significance in the tests while retaining the semantics of the data. The better performance of the proposed MSFAR algorithm is due to the fact that it provides an efficient way to select a reduced attribute set of real-valued datasets with maximum significance and relevance, without lowering the classification performance.
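The paired t-test used here can be computed directly from per-fold accuracy differences, as in the following minimal sketch. The accuracy values are hypothetical placeholders; the two-tailed critical value of about 2.262 for nine degrees of freedom at the 0.05 level is standard.

```python
import math
import statistics

def paired_t_test(a, b, t_critical):
    """Paired t-test on matched samples a and b.

    Returns the t statistic and whether the mean difference is
    significant at the level corresponding to t_critical (two-tailed).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd = statistics.stdev(diffs)          # sample standard deviation of differences
    t = mean_d / (sd / math.sqrt(n))
    return t, abs(t) > t_critical

# Hypothetical per-fold accuracies of MSFAR vs. a rival over ten folds.
msfar = [0.97, 0.96, 0.98, 0.97, 0.95, 0.96, 0.97, 0.98, 0.96, 0.97]
rival = [0.93, 0.94, 0.92, 0.93, 0.94, 0.92, 0.93, 0.92, 0.94, 0.93]
# Two-tailed critical value for df = 9 at the 0.05 level is approximately 2.262.
t_stat, significant = paired_t_test(msfar, rival, t_critical=2.262)
```

Because the folds are matched pairs, the test is applied to the fold-wise differences rather than to the two accuracy samples independently, which is what makes it a paired test.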

D. Discussion
From the average classification accuracies presented in Tables IV and V, it can be observed that the multigranulation fuzzy-rough set model performs better with lower average subset sizes than both the UFRFS and MLFRS algorithms. In particular, the abovementioned results make it clear that for the proposed MSFAR algorithm, the multigranulation model offers a greater reduction of the large-scale attribute set. This reduction is to be expected, since much of the discriminative information is contained in the decision features. The main reason behind this result is that we construct a multigranulation fuzzy-rough set model based on a valued tolerance relation to extract the fuzzy similarity of the changing knowledge granularity for fuzzy classification with consideration of multimodality attributes; the model accurately captures interdependent variables associated with structurally complex and incomplete attribute sets, eliminates most irrelevant attribute sets without lowering the classification performance, and effectively solves the problem of missing data in large-scale information systems. Meanwhile, the multilayered supertrust-coevolution model with a self-evolving compensatory scheme calculates the multigranular trust degree at different granularities based on the reputation in the group to split the large problem into reasonable subproblems.
It is necessary for the algorithm to perform efficiently in real time so that it functions consistently without requiring a significant amount of computing resources. In our experiments, the time complexity of the proposed MSFAR algorithm is also analyzed by observing its real-time performance on selected big datasets of varying sizes. The experimental results indicate that the computational cost of MSFAR is clearly lower than those of UFRFS and MLFRS. Hence, if we were to perform the evaluation on even larger datasets, the computational time would become unacceptable for the comparison algorithms, whereas MSFAR would require a smaller amount of time to obtain the optimal solution.
Of course, in a few special cases, the results of MSFAR are slightly poorer than those of the representative algorithms. In general, the two main components of MSFAR work together to allow it to obtain better results. Furthermore, MSFAR can dynamically adapt its main operators to suit various large-scale instances with dynamically increasing percentages of noise.

VI. CONCLUSION
Recently, big data has been an emerging topic that has attracted the attention of many researchers. The significant amount of unstructured, uncertain, and imprecise large-scale data exhibits structurally complex and granular characteristics. Uncertain data have been widely adopted for attribute reduction, but alone they may be insufficient for batch feature selection. Recent progress in fuzzy-rough set approaches can be helpful for analyses of big data problems. In this article, we presented a novel MSFAR algorithm for big data analysis at different granularity levels. A multigranulation fuzzy-rough set model based on a valued tolerance relation is constructed to identify the fuzzy similarity of the changing knowledge granularity for fuzzy classification with multimodality attributes. Meanwhile, the multigranulation supertrust-coevolution model with a self-evolving compensatory scheme is adopted to calculate the multigranular trust degree at different granularities, and it can be directly applied to a variety of knowledge analysis problems with continuous or numerical large-scale datasets. The experimental results demonstrate that MSFAR produces very good results; it is indicated theoretically and experimentally that the multigranulation supertrust-coevolution model with a self-evolving compensatory scheme enables MSFAR to achieve much better performance than the representative models. These represent important developments for improving the reasoning and understanding of big data.
In the era of big data, the size of large data usually increases dynamically, including currently changing and interconnected datasets. It is time-consuming to perform efficient attribute reduction and classification for these uncertain and redundant datasets. In the future, it is expected that using analytical methods to learn from big data can significantly improve the fuzzy-rough attribute reduction process. We will also explore effective and robust multigranulation mechanisms of fuzzy-rough reduction estimation to achieve an improved understanding of large-scale feature selection. We intend to exert great effort in promoting our research to offer a new avenue by which to address the problem of optimally predicting disorders from neonatal brain magnetic resonance images.