Towards a Realistic Scheduler for Mixed Workloads with Workflows
More Info
expand_more
Abstract
Many fields of modern science require huge amounts of computation, and workflows are a very popular tool in e-Science since they allow to organize many small, simple tasks to solve big problems. They are used in astronomy, bioinformatics, machine learning, social network analysis, physics, and many other branches of science. Workflows are notoriously difficult to schedule, and the vast majority of research on workflow scheduling is concerned with scheduling single workflows with known runtimes. The goal of this PhD research is to bring more realism to the problem of workflow scheduling in actual systems. First, in real systems, multiple workflows may be contending for the available resources. Second, task runtime estimates are not always known, and task runtime estimates may be wrong. Third, workflows are usually not the only type of jobs submitted to a system, there may for instance also be parallel applications and bags-of-tasks. Accordingly, the purpose of this PhD research is to create and analyze policies for online scheduling of workloads of workflows with and without known task runtimes that also contain jobs of other types. We are in the process of simulating policies, and we will validate our results by means of an implementation and real-world experiments with the KOALA-W workflow processing system.