Dr Gavin Smith gavin.smith@nottingham.ac.uk
Associate Professor
Towards optimal symbolization for time series comparisons
Smith, Gavin; Goulding, James; Barrack, Duncan
Authors
Dr James Goulding james.goulding@nottingham.ac.uk
Professor of Data Science
Duncan Barrack
Abstract
The abundance of large time series data sets, and the value of mining them, have long been acknowledged. Such data sets are ubiquitous in fields ranging from astronomy and biology to web science, and their size and number continue to increase, a situation exacerbated by the exponential growth of our digital footprints. The prevalence and potential utility of this data has led to a vast number of time series data mining techniques. Many of these require symbolization of the raw time series as a pre-processing step, for which a number of well-used, pre-existing approaches from the literature are typically employed. In this work we note that these standard approaches are sub-optimal in (at least) the broad application area of time series comparison, leading to unnecessary data corruption and potential performance loss before any real data mining takes place. To address this, we present a novel quantizer based upon optimization of comparison fidelity, together with a computationally tractable algorithm for its implementation on big data sets. We demonstrate empirically that our new approach provides a statistically significant reduction in the amount of error introduced by the symbolization process compared to the current state of the art. The approach therefore provides a more accurate input for the vast number of data mining techniques in the literature, offering the potential for increased real-world performance across a wide range of existing data mining algorithms and applications.
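As a rough illustration of the problem the abstract describes, the sketch below symbolizes two series with a generic equal-frequency (quantile) quantizer and measures how far the distance between them shifts after symbolization. This is not the quantizer proposed in the paper. The function names, the use of bin means as reconstruction values, and the Euclidean distance are all assumptions made for the example.

```python
import numpy as np

def quantile_breakpoints(values, n_symbols):
    """Equal-frequency breakpoints: n_symbols - 1 cut points at evenly spaced quantiles."""
    qs = np.linspace(0.0, 1.0, n_symbols + 1)[1:-1]
    return np.quantile(values, qs)

def symbolize(series, breakpoints):
    """Map each real value to an integer symbol in the range 0..len(breakpoints)."""
    return np.searchsorted(breakpoints, series)

def bin_centers(values, breakpoints):
    """Representative value per symbol: the mean of the pooled values in that bin."""
    symbols = symbolize(values, breakpoints)
    centers = np.empty(len(breakpoints) + 1)
    for s in range(len(centers)):
        members = values[symbols == s]
        # A bin can only be empty if no pooled value maps to it, in which
        # case it is never looked up for series drawn from the pool.
        centers[s] = members.mean() if members.size else np.nan
    return centers

def comparison_error(a, b, n_symbols):
    """Absolute change in Euclidean distance between a and b caused by symbolization."""
    pool = np.concatenate([a, b])
    bp = quantile_breakpoints(pool, n_symbols)
    centers = bin_centers(pool, bp)
    a_hat = centers[symbolize(a, bp)]
    b_hat = centers[symbolize(b, bp)]
    return abs(np.linalg.norm(a - b) - np.linalg.norm(a_hat - b_hat))

# Hypothetical data: two random walks standing in for real time series.
rng = np.random.default_rng(0)
a = np.cumsum(rng.normal(size=500))
b = np.cumsum(rng.normal(size=500))
for k in (4, 16, 64):
    print(k, comparison_error(a, b, k))
```

Any such quantizer perturbs the distance between series to some degree. The abstract's claim is that choosing the quantizer to optimize comparison fidelity directly reduces this kind of error relative to standard approaches.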
Citation
Smith, G., Goulding, J., & Barrack, D. (2013). Towards optimal symbolization for time series comparisons. Presented at IEEE 13th International Conference on Data Mining Workshops (ICDMW 2013).
Conference Name | IEEE 13th International Conference on Data Mining Workshops (ICDMW 2013) |
---|---|
End Date | Dec 10, 2013 |
Acceptance Date | Oct 26, 2013 |
Publication Date | Dec 7, 2013 |
Deposit Date | Jun 4, 2018 |
Publicly Available Date | Jun 4, 2018 |
Peer Reviewed | Peer Reviewed |
Book Title | 2013 IEEE 13th International Conference on Data Mining Workshops |
DOI | https://doi.org/10.1109/ICDMW.2013.59 |
Keywords | Time series analysis; Quantization (signal); Equations; Mathematical model; Data mining; Approximation methods; Simulated annealing |
Public URL | https://nottingham-repository.worktribe.com/output/720523 |
Publisher URL | https://doi.org/10.1109/ICDMW.2013.59 |
Additional Information | © 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Published in the Proceedings of the IEEE International Conference on Data Mining Workshops (ICDMW 2013). |
Contract Date | Jun 4, 2018 |
Files
Towards optimal.pdf
(233 Kb)
PDF