A Dynamic Multiarmed Bandit-Gene Expression Programming Hyper-Heuristic for Combinatorial Optimization Problems

Sabar, Nasar R.; Ayob, Masri; Kendall, Graham; Qu, Rong

doi:10.1109/TCYB.2014.2323936

A Dynamic Multiarmed Bandit-Gene Expression Programming Hyper-Heuristic for Combinatorial Optimization Problems

Sabar, Nasar R.; Ayob, Masri; Kendall, Graham; Qu, Rong

Authors

Nasar R. Sabar

Masri Ayob

Graham Kendall

Professor RONG QU rong.qu@nottingham.ac.uk
PROFESSOR OF COMPUTER SCIENCE

Abstract

Hyper-heuristics are search methodologies that aim to provide high-quality solutions across a wide variety of problem domains, rather than developing tailor-made methodologies for each problem instance/domain. A traditional hyper-heuristic framework has two levels, namely, the high level strategy (heuristic selection mechanism and the acceptance criterion) and low level heuristics (a set of problem specific heuristics). Due to the different landscape structures of different problem instances, the high level strategy plays an important role in the design of a hyper-heuristic framework. In this paper, we propose a new high level strategy for a hyper-heuristic framework. The proposed high-level strategy utilizes a dynamic multiarmed bandit-extreme value-based reward as an online heuristic selection mechanism to select the appropriate heuristic to be applied at each iteration. In addition, we propose a gene expression programming framework to automatically generate the acceptance criterion for each problem instance, instead of using human-designed criteria. Two well-known, and very different, combinatorial optimization problems, one static (exam timetabling) and one dynamic (dynamic vehicle routing) are used to demonstrate the generality of the proposed framework. Compared with state-of-the-art hyper-heuristics and other bespoke methods, empirical results demonstrate that the proposed framework is able to generalize well across both domains. We obtain competitive, if not better results, when compared to the best known results obtained from other methods that have been presented in the scientific literature. We also compare our approach against the recently released hyper-heuristic competition test suite. We again demonstrate the generality of our approach when we compare against other methods that have utilized the same six benchmark datasets from this test suite.

Citation

Sabar, N. R., Ayob, M., Kendall, G., & Qu, R. (2015). A Dynamic Multiarmed Bandit-Gene Expression Programming Hyper-Heuristic for Combinatorial Optimization Problems. IEEE Transactions on Cybernetics, 45(2), 217-228. https://doi.org/10.1109/TCYB.2014.2323936

Journal Article Type	Article
Acceptance Date	May 11, 2014
Online Publication Date	Jun 2, 2014
Publication Date	2015-02
Deposit Date	Feb 19, 2015
Publicly Available Date	Feb 19, 2015
Journal	IEEE Transactions on Cybernetics
Electronic ISSN	2168-2275
Publisher	Institute of Electrical and Electronics Engineers
Peer Reviewed	Peer Reviewed
Volume	45
Issue	2
Pages	217-228
DOI	https://doi.org/10.1109/TCYB.2014.2323936
Keywords	Gene Expression Programming, Hyper-heuristic, Timetabling, Vehicle Routing, Dynamic Optimization
Public URL	https://nottingham-repository.worktribe.com/output/744195
Publisher URL	http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6824192
Additional Information	(c) 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Contract Date	Feb 19, 2015