Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data
Glaab, Enrico; Bacardit, Jaume; Garibaldi, Jonathan M.; Krasnogor, Natalio
Authors
Enrico Glaab
Jaume Bacardit
Jonathan M. Garibaldi
Natalio Krasnogor
Abstract
Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scientific and clinical applications. Increasing the interpretability of prediction models while retaining a high accuracy would help to exploit the information content in microarray data more effectively. For this purpose, we evaluate our rule-based evolutionary machine learning systems, BioHEL and GAssist, on three public microarray cancer datasets, obtaining simple rule-based models for sample classification. A comparison with other benchmark microarray sample classifiers based on three diverse feature selection algorithms suggests that these evolutionary learning techniques can compete with state-of-the-art methods like support vector machines. The obtained models reach accuracies above 90% in two-level external cross-validation, with the added value of facilitating interpretation by using only combinations of simple if-then-else rules. As a further benefit, a literature mining analysis reveals that prioritizations of informative genes extracted from BioHEL's classification rule sets can outperform gene rankings obtained from a conventional ensemble feature selection in terms of the pointwise mutual information between relevant disease terms and the standardized names of top-ranked genes.
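The abstract compares gene rankings by the pointwise mutual information (PMI) between disease terms and gene names in the literature. This record does not give the exact formulation used in the paper, so the following is only a minimal sketch of the standard PMI definition, log2(p(x, y) / (p(x) p(y))), estimated from document counts. The function names `pmi` and `mean_pmi` and all counts are hypothetical and chosen for illustration.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Standard PMI, log2(p(x, y) / (p(x) * p(y))), from document counts.

    n_xy    -- documents mentioning both the gene and the disease term
    n_x     -- documents mentioning the gene
    n_y     -- documents mentioning the disease term
    n_total -- documents in the corpus
    (Illustrative helper; not the paper's implementation.)
    """
    if n_x <= 0 or n_y <= 0 or n_total <= 0:
        raise ValueError("marginal counts and corpus size must be positive")
    if n_xy == 0:
        return float("-inf")  # the two terms never co-occur
    # p(x, y) / (p(x) * p(y)) simplifies to n_xy * n_total / (n_x * n_y)
    return math.log2(n_xy * n_total / (n_x * n_y))


def mean_pmi(gene_counts, n_disease, n_total):
    """Average PMI over a list of (n_gene, n_gene_and_disease) count pairs,
    e.g. for the top-ranked genes of one ranking. Genes that never co-occur
    with the disease term are skipped here, which is an assumption."""
    scores = [
        pmi(n_both, n_gene, n_disease, n_total)
        for n_gene, n_both in gene_counts
        if n_both > 0
    ]
    return sum(scores) / len(scores) if scores else float("-inf")


# Hypothetical counts: a gene in 100 of 1000 abstracts, a disease term in 50,
# and both together in 20. Co-occurrence is 4x the rate expected by chance.
example_score = pmi(n_xy=20, n_x=100, n_y=50, n_total=1000)
```

A positive score means the gene and the disease term co-occur more often than expected under independence, zero means they co-occur at the chance rate, and a negative score means less often. Averaging the score over the top-ranked genes of two rankings gives one simple way to compare them.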
Citation
Glaab, E., Bacardit, J., Garibaldi, J. M., & Krasnogor, N. (2012). Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data. PLoS ONE, 7(7), Article e39932. https://doi.org/10.1371/journal.pone.0039932
Journal Article Type | Article |
---|---|
Publication Date | Jul 1, 2012 |
Deposit Date | Jul 17, 2012 |
Publicly Available Date | Jul 17, 2012 |
Journal | PLoS ONE |
Electronic ISSN | 1932-6203 |
Publisher | Public Library of Science |
Peer Reviewed | Peer Reviewed |
Volume | 7 |
Issue | 7 |
Article Number | e39932 |
DOI | https://doi.org/10.1371/journal.pone.0039932 |
Keywords | gene, protein, expression, microarray analysis, literature mining, classification, machine learning, prediction, cancer, cross-validation, sample classification, feature selection |
Public URL | https://nottingham-repository.worktribe.com/output/1007141 |
Publisher URL | http://dx.doi.org/10.1371/journal.pone.0039932 |
Files
biohel_plosone_journal.pone.0039932.pdf
(567 Kb)
PDF
Copyright Statement
Copyright information regarding this work can be found at the following address: http://creativecommons.org/licenses/by/4.0