Dr Wasiur Rahman Khuda Bukhsh (wasiur.khudabukhsh@nottingham.ac.uk)
Assistant Professor
Generalized Cost-Based Job Scheduling in Very Large Heterogeneous Cluster Systems
KhudaBukhsh, Wasiur R.; Kar, Sounak; Alt, Bastian; Rizk, Amr; Koeppl, Heinz
Authors
Sounak Kar
Bastian Alt
Amr Rizk
Heinz Koeppl
Abstract
We study job assignment in large, heterogeneous resource-sharing clusters of servers with finite buffers. This load-balancing problem arises naturally in today's communication and big data systems, such as Amazon Web Services, Network Service Function Chains, and Stream Processing. Each arriving job is dispatched to a server according to a load-balancing policy that optimizes a performance criterion such as job completion time. Our contribution is a randomized Cost-Based Scheduling (CBS) policy in which job assignment is driven by general cost functions of the server queue lengths. Beyond existing schemes such as Join the Shortest Queue (JSQ), the power-of-d policy SQ(d), and capacity-weighted JSQ, the notion of CBS yields new application-specific policies such as hybrid locally uniform JSQ. Because today's data center clusters have thousands of servers, exact analysis of CBS policies is tedious. In this article, we derive a scaling limit as the number of servers grows large, which facilitates a comparison of various CBS policies with respect to their transient as well as steady-state behavior. A byproduct of our derivations is the relationship between the queue filling proportions and the server buffer sizes, which cannot be obtained from infinite-buffer models. Finally, we provide extensive numerical evaluations and discuss several applications, including multi-stage systems.
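The abstract describes CBS only at a high level, so the following is a minimal sketch of one plausible reading rather than the authors' algorithm. It assumes that a dispatcher samples `d` servers uniformly at random, ignores sampled servers whose finite buffer is full, and sends the job to the sampled server with the lowest cost, breaking ties at random. The names `cbs_dispatch`, `jsq_cost`, and `capacity_weighted_cost`, and the specific cost formulas, are hypothetical and chosen for illustration.

```python
import random
from typing import Callable, Optional, Sequence

# A cost function maps (server index, current queue length) to a cost.
CostFn = Callable[[int, int], float]


def cbs_dispatch(
    queues: Sequence[int],
    buffers: Sequence[int],
    cost: CostFn,
    d: int,
    rng: random.Random,
) -> Optional[int]:
    """Pick a server for one arriving job, or None if the job is blocked.

    Assumption (not from the paper): d servers are sampled without
    replacement, and only sampled servers with free buffer space compete.
    """
    sampled = rng.sample(range(len(queues)), d)
    open_servers = [i for i in sampled if queues[i] < buffers[i]]
    if not open_servers:
        return None  # every sampled buffer is full, so the job is blocked
    costs = {i: cost(i, queues[i]) for i in open_servers}
    lowest = min(costs.values())
    # Break ties uniformly at random among the cheapest servers.
    return rng.choice([i for i in open_servers if costs[i] == lowest])


def jsq_cost(server: int, queue_len: int) -> float:
    """Cost equal to the queue length: recovers SQ(d), and JSQ when d = N."""
    return float(queue_len)


def capacity_weighted_cost(rates: Sequence[float]) -> CostFn:
    """Illustrative capacity-weighted cost: backlog divided by service rate."""
    def cost(server: int, queue_len: int) -> float:
        return (queue_len + 1) / rates[server]
    return cost


rng = random.Random(0)
# With d equal to the number of servers, the queue-length cost behaves like JSQ.
jsq_choice = cbs_dispatch([3, 1, 2], [5, 5, 5], jsq_cost, d=3, rng=rng)
# A faster server can win despite a longer queue under the weighted cost.
weighted_choice = cbs_dispatch(
    [1, 2], [5, 5], capacity_weighted_cost([1.0, 4.0]), d=2, rng=rng
)
# All sampled buffers are full, so the job is blocked.
blocked = cbs_dispatch([2, 2], [2, 2], jsq_cost, d=2, rng=rng)
```

Swapping the cost function is the only change needed to move between policies in this sketch, which is the sense in which a cost-based formulation can cover JSQ, SQ(d), and capacity-weighted variants under one dispatcher.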
Citation
KhudaBukhsh, W. R., Kar, S., Alt, B., Rizk, A., & Koeppl, H. (2020). Generalized Cost-Based Job Scheduling in Very Large Heterogeneous Cluster Systems. IEEE Transactions on Parallel and Distributed Systems, 31(11), 2594-2604. https://doi.org/10.1109/tpds.2020.2997771
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | May 16, 2020 |
Online Publication Date | May 26, 2020 |
Publication Date | Nov 1, 2020 |
Deposit Date | Apr 9, 2022 |
Publicly Available Date | Apr 14, 2022 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Print ISSN | 1045-9219 |
Electronic ISSN | 1558-2183 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 31 |
Issue | 11 |
Pages | 2594-2604 |
DOI | https://doi.org/10.1109/tpds.2020.2997771 |
Keywords | Computational Theory and Mathematics; Hardware and Architecture; Signal Processing |
Public URL | https://nottingham-repository.worktribe.com/output/7715621 |
Publisher URL | https://ieeexplore.ieee.org/document/9099971 |
Additional Information | © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Files
Main (PDF, 5 MB)