Research Repository

All Outputs (37)

Handling uncertainty in citizen science data: towards an improved amateur-based large-scale classification (2018)
Journal Article
Jiménez, M., Triguero, I., & John, R. (2019). Handling uncertainty in citizen science data: towards an improved amateur-based large-scale classification. Information Sciences, 479, 301-320. https://doi.org/10.1016/j.ins.2018.12.011

Citizen Science, traditionally known as the engagement of amateur participants in research, is showing great potential for large-scale processing of data. In areas such as astronomy, biology, or geo-sciences, where emerging technologies genera...

Transforming big data into smart data: An insight on the use of the k-nearest neighbors algorithm to obtain quality data (2018)
Journal Article
Triguero, I., Garcia-Gil, D., Maillo, J., Luengo, J., Garcia, S., & Herrera, F. (2019). Transforming big data into smart data: An insight on the use of the k-nearest neighbors algorithm to obtain quality data. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(2), Article e1289. https://doi.org/10.1002/widm.1289

The k-nearest neighbours algorithm is characterised as a simple yet effective data mining technique. The main drawback of this technique appears when massive amounts of data - likely to contain noise and imperfections - are involved, turning this algo...

Coevolutionary fuzzy attribute order reduction with complete attribute-value space tree (2018)
Journal Article
Ding, W., Triguero, I., & Lin, C. (2018). Coevolutionary fuzzy attribute order reduction with complete attribute-value space tree. IEEE Transactions on Emerging Topics in Computational Intelligence, https://doi.org/10.1109/tetci.2018.2869919

Since big data sets are structurally complex, high-dimensional, and their attributes exhibit some redundant and irrelevant information, the selection, evaluation, and combination of those large-scale attributes pose huge challenges to traditional met...

Instance reduction for one-class classification (2018)
Journal Article
Krawczyk, B., Triguero, I., García, S., Woźniak, M., & Herrera, F. (in press). Instance reduction for one-class classification. Knowledge and Information Systems, https://doi.org/10.1007/s10115-018-1220-z

Instance reduction techniques are data preprocessing methods originally developed to enhance the nearest neighbor rule for standard classification. They reduce the training data by selecting or generating representative examples of a given problem. T...

On the use of convolutional neural networks for robust classification of multiple fingerprint captures (2017)
Journal Article
Peralta, D., Triguero, I., García, S., Saeys, Y., Benitez, J. M., & Herrera, F. (in press). On the use of convolutional neural networks for robust classification of multiple fingerprint captures. International Journal of Intelligent Systems, 33(1), https://doi.org/10.1002/int.21948

Fingerprint classification is one of the most common approaches to accelerate the identification in large databases of fingerprints. Fingerprints are grouped into disjoint classes, so that an input fingerprint is compared only with those belonging to...

KEEL 3.0: an open source software for multi-stage analysis in data mining (2017)
Journal Article
Triguero, I., González, S., Moyano, J. M., García, S., Alcalá-Fdez, J., Luengo, J., …Herrera, F. (2017). KEEL 3.0: an open source software for multi-stage analysis in data mining. International Journal of Computational Intelligence Systems, 10(1), https://doi.org/10.2991/ijcis.10.1.82

This paper introduces the 3rd major release of the KEEL Software. KEEL is an open source Java framework (GPLv3 license) that provides a number of modules to perform a wide variety of data mining tasks. It includes tools to perform data management, des...

An Immune-Inspired Technique to Identify Heavy Goods Vehicles Incident Hot Spots (2017)
Journal Article
Figueredo, G. P., Triguero, I., Mesgarpour, M., Maciel Guerra, A., Garibaldi, J. M., & John, R. (2017). An Immune-Inspired Technique to Identify Heavy Goods Vehicles Incident Hot Spots. IEEE Transactions on Emerging Topics in Computational Intelligence, 1(4), 248-258. https://doi.org/10.1109/TETCI.2017.2721960

We report on the adaptation of an immune-inspired instance selection technique to solve a real-world big data problem of determining vehicle incident hot spots. The technique, which is inspired by the Immune System self-regulation mechanism, was orig...

Self-labeling techniques for semi-supervised time series classification: an empirical study (2017)
Journal Article
González, M., Bergmeir, C., Triguero, I., Rodríguez, Y., & Benítez, J. M. (in press). Self-labeling techniques for semi-supervised time series classification: an empirical study. Knowledge and Information Systems, https://doi.org/10.1007/s10115-017-1090-9

An increasing amount of available unlabeled time series data renders the semi-supervised paradigm a suitable approach to tackle classification problems with a reduced quantity of labeled data. Self-labeled techniques stand out from semi-supervised cla...

Distributed incremental fingerprint identification with reduced database penetration rate using a hierarchical classification based on feature fusion and selection (2017)
Journal Article
Peralta, D., Triguero, I., García, S., Saeys, Y., Benitez, J. M., & Herrera, F. (2017). Distributed incremental fingerprint identification with reduced database penetration rate using a hierarchical classification based on feature fusion and selection. Knowledge-Based Systems, 126, https://doi.org/10.1016/j.knosys.2017.03.014

Fingerprint recognition has been a hot research topic over the last few decades, with many applications and ever-growing populations to identify. The need for flexible, fast identification systems is therefore patent in such situations. In this conte...

EPRENNID: An evolutionary prototype reduction based ensemble for nearest neighbor classification of imbalanced data (2016)
Journal Article
Vluymans, S., Triguero, I., Cornelis, C., & Saeys, Y. (2016). EPRENNID: An evolutionary prototype reduction based ensemble for nearest neighbor classification of imbalanced data. Neurocomputing, 216, https://doi.org/10.1016/j.neucom.2016.08.026

Classification problems with an imbalanced class distribution have received an increased amount of attention within the machine learning community over the last decade. They are encountered in a growing number of real-world situations and pose a chal...

kNN-IS: an iterative spark-based design of the k-nearest neighbors classifier for big data (2016)
Journal Article
Maillo, J., Ramirez, S., Triguero, I., & Herrera, F. (2017). kNN-IS: an iterative spark-based design of the k-nearest neighbors classifier for big data. Knowledge-Based Systems, 117, 3-15. https://doi.org/10.1016/j.knosys.2016.06.012

The k-Nearest Neighbors classifier is a simple yet effective, widely renowned method in data mining. The actual application of this model in the big data domain is not feasible due to time and memory restrictions. Several distributed alternatives base...

DPD-DFF: a dual phase distributed scheme with double fingerprint fusion for fast and accurate identification in large databases (2016)
Journal Article
Peralta, D., Triguero, I., García, S., Herrera, F., & Benitez, J. M. (2016). DPD-DFF: a dual phase distributed scheme with double fingerprint fusion for fast and accurate identification in large databases. Information Fusion, 32(Part A), https://doi.org/10.1016/j.inffus.2016.03.002

Nowadays, many companies and institutions need fast and reliable identification systems that are able to deal with very large databases. Fingerprints are among the most used biometric traits for identification. In the current literature there are fin...

Labelling strategies for hierarchical multi-label classification techniques (2016)
Journal Article
Triguero, I., & Vens, C. (2016). Labelling strategies for hierarchical multi-label classification techniques. Pattern Recognition, 56, 170-183. https://doi.org/10.1016/j.patcog.2016.02.017

Many hierarchical multi-label classification systems predict a real valued score for every (instance, class) couple, with a higher score reflecting more confidence that the instance belongs to that class. These classifiers leave t...

ROSEFW-RF: the winner algorithm for the ECBDL’14 big data competition: an extremely imbalanced big data bioinformatics problem (2015)
Journal Article
Triguero, I., del Río, S., López, V., Bacardit, J., Benítez, J. M., & Herrera, F. (2015). ROSEFW-RF: the winner algorithm for the ECBDL’14 big data competition: an extremely imbalanced big data bioinformatics problem. Knowledge-Based Systems, 87, https://doi.org/10.1016/j.knosys.2015.05.027

The application of data mining and machine learning techniques to biological and biomedicine data continues to be a ubiquitous research theme in current bioinformatics. The rapid advances in biotechnology are allowing us to obtain and store large qu...

MRPR: A MapReduce solution for prototype reduction in big data classification (2014)
Journal Article
Triguero, I., Peralta, D., Bacardit, J., García, S., & Herrera, F. (2015). MRPR: A MapReduce solution for prototype reduction in big data classification. Neurocomputing, 150(Part A), 331-345. https://doi.org/10.1016/j.neucom.2014.04.078

In the era of big data, analyzing and extracting knowledge from large-scale data sets is a very interesting and challenging task. The application of standard data mining tools in such data sets is not straightforward. Hence, a new class of scalable m...

SEG-SSC: a framework based on synthetic examples generation for self-labeled semi-supervised classification (2014)
Journal Article
Triguero, I., Garcia, S., & Herrera, F. (2015). SEG-SSC: a framework based on synthetic examples generation for self-labeled semi-supervised classification. IEEE Transactions on Cybernetics, 45(4), https://doi.org/10.1109/TCYB.2014.2332003

Self-labeled techniques are semi-supervised classification methods that address the shortage of labeled examples via a self-learning process based on supervised models. They progressively classify unlabeled data and use them to modify the hypothesis...

Minutiae filtering to improve both efficacy and efficiency of fingerprint matching algorithms (2014)
Journal Article
Peralta, D., Galar, M., Triguero, I., Miguel-Hurtado, O., Benitez, J. M., & Herrera, F. (2014). Minutiae filtering to improve both efficacy and efficiency of fingerprint matching algorithms. Engineering Applications of Artificial Intelligence, 32, 37-53. https://doi.org/10.1016/j.engappai.2014.02.016

Fingerprint minutiae extraction is a critical issue in fingerprint recognition. Both missing and spurious minutiae hinder the posterior matching process. Spurious minutiae are more frequent than missing ones, but they can be removed by post-processin...