Rebecca Tickle
PAS3-HSID: a Dynamic Bio-Inspired Approach for Real-Time Hot Spot Identification in Data Streams
Tickle, Rebecca; Triguero, Isaac; Figueredo, Grazziela P.; Mesgarpour, Mohammad; John, Robert I.
Authors
ISAAC TRIGUERO VELAZQUEZ I.TrigueroVelazquez@nottingham.ac.uk
Associate Professor
GRAZZIELA FIGUEREDO G.Figueredo@nottingham.ac.uk
Associate Professor
Mohammad Mesgarpour
Robert I. John
Abstract
© 2019, Springer Science+Business Media, LLC, part of Springer Nature. Hot spot identification is a relevant problem in a wide variety of areas such as health care, energy and transportation. A hot spot is defined as a region with a high likelihood of occurrence of a particular event. Identifying hot spots requires location data for those events, which is typically collected by telematics devices. These sensors gather information constantly, generating very large volumes of data. Current state-of-the-art solutions can identify hot spots from big static batches of data by means of variations of clustering or instance selection techniques that pre-process the original input data and return the most relevant locations. However, these approaches do not address changes in hot spots over time. This paper presents a dynamic bio-inspired approach to detect hot spots in big data streams. This computational intelligence method is designed and applied to the transportation sector as a case study to identify incidents on the roads caused by heavy goods vehicles. We adapt an immune-based algorithm to account for the temporal aspect of hot spots, inspired by the idea of pheromones, and implement it using Apache Spark Streaming. Experimental results on real datasets with up to 4.5 million data points, provided by a telematics company, show that the algorithm can quickly process large streaming batches of data and successfully adapts over time to detect hot spots. The outcome of this method is twofold: it reduces data storage requirements and demonstrates resilience to sudden changes in the input data (concept drift).
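The pheromone idea mentioned in the abstract can be illustrated with a small sketch. This is not the PAS3-HSID algorithm and it does not use Apache Spark Streaming; it is a plain-Python illustration, under our own assumptions, of how an evaporating weight lets hot spots fade when events stop arriving. The grid-cell representation, the function names (`to_cell`, `update`, `hot_spots`) and all parameter values are hypothetical and do not come from the paper.

```python
import math
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[int, int]      # grid cell index (hypothetical representation)
Point = Tuple[float, float]  # event location, e.g. (latitude, longitude)


def to_cell(point: Point, cell_size: float) -> Cell:
    """Map a location to the grid cell that contains it."""
    lat, lon = point
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def update(
    levels: Dict[Cell, float],
    batch: Iterable[Point],
    evaporation: float = 0.5,  # assumed fraction of pheromone lost per batch
    min_level: float = 0.2,    # assumed level below which a cell is forgotten
    cell_size: float = 1.0,    # assumed grid resolution
) -> Dict[Cell, float]:
    """Process one micro-batch and return the new pheromone levels."""
    # 1. Evaporation: every known cell loses part of its pheromone.
    updated = {cell: level * (1.0 - evaporation) for cell, level in levels.items()}
    # 2. Deposit: each new event adds one unit to the cell it falls in.
    for point in batch:
        cell = to_cell(point, cell_size)
        updated[cell] = updated.get(cell, 0.0) + 1.0
    # 3. Forgetting: drop cells whose level has decayed away, which keeps
    #    the stored state small.
    return {cell: level for cell, level in updated.items() if level >= min_level}


def hot_spots(levels: Dict[Cell, float], threshold: float) -> List[Cell]:
    """Return the cells whose pheromone level reaches the threshold."""
    return sorted(cell for cell, level in levels.items() if level >= threshold)


if __name__ == "__main__":
    levels: Dict[Cell, float] = {}
    levels = update(levels, [(0.5, 0.5), (0.6, 0.4), (3.2, 3.1)])
    print(hot_spots(levels, threshold=2.0))
    levels = update(levels, [(3.5, 3.5), (3.1, 3.9)])
    print(hot_spots(levels, threshold=2.0))
```

In this sketch only the per-cell levels are kept between batches, so raw events can be discarded after each update, and a cell that stops receiving events eventually drops out of the state altogether.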
Citation
Tickle, R., Triguero, I., Figueredo, G. P., Mesgarpour, M., & John, R. I. (2019). PAS3-HSID: a Dynamic Bio-Inspired Approach for Real-Time Hot Spot Identification in Data Streams. Cognitive Computation, 11(3), 434–458. https://doi.org/10.1007/s12559-019-09638-y
Journal Article Type | Article |
---|---|
Acceptance Date | Mar 10, 2019 |
Online Publication Date | Apr 10, 2019 |
Publication Date | 2019-06 |
Deposit Date | May 8, 2019 |
Publicly Available Date | Apr 11, 2020 |
Journal | Cognitive Computation |
Print ISSN | 1866-9956 |
Electronic ISSN | 1866-9964 |
Publisher | Springer Verlag |
Peer Reviewed | Peer Reviewed |
Volume | 11 |
Issue | 3 |
Pages | 434–458 |
DOI | https://doi.org/10.1007/s12559-019-09638-y |
Keywords | Cognitive Neuroscience; Computer Vision and Pattern Recognition; Computer Science Applications |
Public URL | https://nottingham-repository.worktribe.com/output/2030211 |
Publisher URL | https://link.springer.com/article/10.1007%2Fs12559-019-09638-y |
Additional Information | This is a post-peer-review, pre-copyedit version of an article published in Cognitive Computation. The final authenticated version is available online at: http://dx.doi.org/10.1007/s12559-019-09638-y. Received: 11 May 2018; Accepted: 10 March 2019; First Online: 10 April 2019. The authors declare that they have no conflict of interest. This article does not contain any studies with human participants or animals performed by any of the authors. |
Contract Date | May 8, 2019 |
Files
Dynamic_Hot_Spots (PDF, 2.7 Mb)