Clustering breast cancer data by consensus of different validity indices
Authors
Daniele Soria
Jonathan M. Garibaldi
Federico Ambrogi
Paulo J.G. Lisboa
Patrizia Boracchi
Elia M. Biganzoli
Abstract
Clustering algorithms will, in general, either partition a given data set into a pre-specified number of clusters or will produce a hierarchy of clusters. In this paper we analyse several different clustering techniques and apply them to a particular data set of breast cancer data. When we do not know a priori which is the best number of groups, we use a range of different validity indices to test the quality of clustering results and to determine the best number of clusters. While for the K-means method there is not absolute agreement among the indices as to which is the best number of clusters, for the PAM algorithm all the indices indicate 4 as the best cluster number.
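As an illustration of the idea in the abstract, the following is a minimal sketch of picking the number of clusters by letting several validity indices vote. It is not the authors' implementation: the abstract does not name the indices used, so the three here (silhouette, Calinski-Harabasz, Davies-Bouldin) are assumed for illustration, the clustering step is a plain K-means with a deterministic seeding chosen only for reproducibility, and all function names (`kmeans`, `consensus_k`, and so on) are hypothetical.

```python
import math
from collections import Counter

def group(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    groups = {}
    for p, l in zip(points, labels):
        groups.setdefault(l, []).append(p)
    return groups

def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(x) / len(pts) for x in zip(*pts))

def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means; farthest-point seeding is an assumption of this sketch."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        groups = group(points, labels)
        # An empty cluster keeps its previous centre.
        updated = [centroid(groups[j]) if j in groups else centers[j] for j in range(k)]
        if updated == centers:
            break
        centers = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette width (higher is better); assumes >= 2 clusters, distinct points."""
    groups = group(points, labels)
    total = 0.0
    for p, l in zip(points, labels):
        own = groups[l]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in g) / len(g)
                for m, g in groups.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

def calinski_harabasz(points, labels):
    """Between/within dispersion ratio (higher is better); assumes 2 <= k < n."""
    groups = group(points, labels)
    n, k = len(points), len(groups)
    overall = centroid(points)
    between = sum(len(g) * math.dist(centroid(g), overall) ** 2 for g in groups.values())
    within = sum(math.dist(p, centroid(g)) ** 2 for g in groups.values() for p in g)
    return (between / (k - 1)) / (within / (n - k))

def davies_bouldin(points, labels):
    """Mean worst-case cluster similarity (lower is better); assumes >= 2 clusters."""
    groups = list(group(points, labels).values())
    cents = [centroid(g) for g in groups]
    scatter = [sum(math.dist(p, c) for p in g) / len(g) for g, c in zip(groups, cents)]
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in range(len(groups)) if j != i)
             for i in range(len(groups))]
    return sum(worst) / len(worst)

# Each index is paired with the direction in which it is optimised.
INDICES = {
    "silhouette": (silhouette, max),
    "calinski_harabasz": (calinski_harabasz, max),
    "davies_bouldin": (davies_bouldin, min),
}

def consensus_k(points, k_values):
    """Return the k chosen by most indices, plus each index's individual vote."""
    scores = {name: {} for name in INDICES}
    for k in k_values:
        labels = kmeans(points, k)
        for name, (index_fn, _) in INDICES.items():
            scores[name][k] = index_fn(points, labels)
    votes = {name: pick(scores[name], key=scores[name].get)
             for name, (_, pick) in INDICES.items()}
    winner, _ = Counter(votes.values()).most_common(1)[0]
    return winner, votes

# Toy data: four well-separated groups of five points each (not breast cancer data).
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
toy = [(cx + dx, cy + dy) for cx in (0, 10) for cy in (0, 10) for dx, dy in offsets]
best_k, votes = consensus_k(toy, range(2, 7))
print(best_k, votes)
```

The `votes` dictionary makes disagreement visible: when the indices do not all pick the same k, as the abstract reports for K-means, the majority value is returned but the individual choices remain available for inspection. Swapping `kmeans` for a medoid-based routine such as PAM would leave the voting logic unchanged.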
Citation
Soria, D., Garibaldi, J. M., Ambrogi, F., Lisboa, P. J., Boracchi, P., & Biganzoli, E. M. (2008). Clustering breast cancer data by consensus of different validity indices. Presented at International Conference on Advances in Medical, Signal and Information Processing (4th).
Conference Name | International Conference on Advances in Medical, Signal and Information Processing (4th) |
---|---|
End Date | Jul 16, 2008 |
Deposit Date | Mar 18, 2015 |
Peer Reviewed | Peer Reviewed |
Keywords | Clustering algorithms, Breast cancer, Validity indices |
Public URL | https://nottingham-repository.worktribe.com/output/1016192 |
Publisher URL | http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4609085&filter%3DAND%28p_IS_Number%3A4609057%29%26rowsPerPage%3D75 |
Additional Information | Published in: 4th IET International Conference on Advances in Medical, Signal and Information Processing, 2008: MEDSIP 2008. IEEE, 2008. ISBN: 978-0-86341-934-8. © 2008 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Files
Soria2008a.pdf (PDF, 451 Kb)