eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

Safety-Aware Deep Reinforcement Learning in Job Scheduling

Creative Commons 'BY-ND' version 4.0 license
Abstract

Resource allocation in computing clusters is an NP-hard problem, and existing solutions often rely on generalized heuristics that overlook the specifics of individual job streams. Past applications of machine learning to this task have shown promise, yet the lack of robustness and reliability guarantees often limits confidence in the resulting scheduling decisions. Our work introduces a novel reinforcement learning-based scheduling model that is designed to incorporate job stream characteristics and whose decisions are verified for robustness. Our findings underline the potential of deep reinforcement learning to produce quality-controlled scheduling decisions, laying the groundwork for future research on safe and verifiable reinforcement learning.
