EUSC: A clustering-based surrogate model to accelerate evolutionary undersampling in imbalanced classification

Le, Hoang Lam; Landa-Silva, Dario; Galar, Mikel; Garcia, Salvador; Triguero, Isaac


Authors

Hoang Lam Le

Dario Landa-Silva (dario.landasilva@nottingham.ac.uk)
Professor of Computational Optimisation

Mikel Galar

Salvador Garcia



Abstract

© 2020. Learning from imbalanced datasets is in high demand in real-world applications and is a challenge for standard classifiers, which tend to be biased towards the classes that hold the majority of the examples. Undersampling approaches reduce the size of the majority class to balance the class distributions. Evolutionary approaches are prominent among them: they treat undersampling as a binary optimisation problem that determines which examples are removed. However, their use is limited to small datasets because of the cost of fitness evaluation. This work proposes a two-stage clustering-based surrogate model that enables evolutionary undersampling to compute fitness values faster. The main novelty lies in the development of a surrogate model for binary optimisation that is based on the meaning of candidate solutions (phenotype) rather than on their binary representation (genotype). We conduct an evaluation on 44 imbalanced datasets, showing that, in comparison with the original evolutionary undersampling, we can save up to 83% of the runtime without significantly deteriorating the classification performance.
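
To make the binary optimisation view concrete, the following is a minimal sketch of how a candidate solution and its fitness might be represented. It is not the authors' implementation and does not include the surrogate model. Everything specific in it is an assumption made for illustration: the 1-nearest-neighbour classifier, the geometric mean of per-class recall as the fitness, the simple (1+1) bit-flip evolutionary loop, the synthetic data, and all function names.

```python
import numpy as np

def g_mean_1nn(X_train, y_train, X_eval, y_eval):
    """Geometric mean of per-class recall for a 1-NN classifier (assumed fitness)."""
    # Pairwise squared Euclidean distances, shape (n_eval, n_train).
    d = ((X_eval[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    pred = y_train[d.argmin(axis=1)]
    recalls = [np.mean(pred[y_eval == c] == c) for c in np.unique(y_eval)]
    return float(np.prod(recalls) ** (1.0 / len(recalls)))

def fitness(mask, X, y, majority_label):
    """Score a chromosome: one bit per majority example, 1 keeps it, 0 removes it."""
    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y != majority_label)
    # All minority examples are always kept; only the majority class is reduced.
    keep = np.concatenate([maj_idx[np.asarray(mask, dtype=bool)], min_idx])
    # Each call classifies the whole training set, which is the costly step.
    return g_mean_1nn(X[keep], y[keep], X, y)

def one_plus_one_ea(X, y, majority_label, generations=50, seed=0):
    """Hypothetical (1+1) evolutionary loop with bit-flip mutation."""
    rng = np.random.default_rng(seed)
    n_maj = int(np.sum(y == majority_label))
    best = rng.integers(0, 2, size=n_maj)
    best_fit = fitness(best, X, y, majority_label)
    for _ in range(generations):
        flip = rng.random(n_maj) < 1.0 / n_maj  # flip each bit with prob. 1/n
        child = np.where(flip, 1 - best, best)
        child_fit = fitness(child, X, y, majority_label)
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit

# Synthetic imbalanced data: 40 majority (label 0) and 8 minority (label 1) points.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(2, 1, (8, 2))])
y = np.array([0] * 40 + [1] * 8)
mask, score = one_plus_one_ea(X, y, majority_label=0)
print(int(mask.sum()), "majority examples kept; g-mean =", round(score, 3))
```

In this sketch every generation triggers a full fitness evaluation, so the cost grows with both the number of generations and the dataset size. That is the bottleneck a surrogate model is meant to reduce, by estimating the fitness of some candidates instead of computing it exactly.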

Citation

Le, H. L., Landa-Silva, D., Galar, M., Garcia, S., & Triguero, I. (2021). EUSC: A clustering-based surrogate model to accelerate evolutionary undersampling in imbalanced classification. Applied Soft Computing, 101, Article 107033. https://doi.org/10.1016/j.asoc.2020.107033

Journal Article Type: Article
Acceptance Date: Dec 12, 2020
Online Publication Date: Dec 19, 2020
Publication Date: Mar 1, 2021
Deposit Date: Jan 5, 2021
Publicly Available Date: Dec 20, 2021
Journal: Applied Soft Computing
Print ISSN: 1568-4946
Electronic ISSN: 1872-9681
Publisher: Elsevier
Peer Reviewed: Peer Reviewed
Volume: 101
Article Number: 107033
DOI: https://doi.org/10.1016/j.asoc.2020.107033
Keywords: Software
Public URL: https://nottingham-repository.worktribe.com/output/5201456
Publisher URL: https://www.sciencedirect.com/science/article/pii/S1568494620309728
Additional Information: This article is maintained by: Elsevier; Article Title: EUSC: A clustering-based surrogate model to accelerate evolutionary undersampling in imbalanced classification; Journal Title: Applied Soft Computing; CrossRef DOI link to publisher maintained version: https://doi.org/10.1016/j.asoc.2020.107033; Content Type: article; Copyright: Crown Copyright © 2020 Published by Elsevier B.V. All rights reserved.
