Scippy

UG

Ubiquity Generator framework

Concept of UG's high-level task parallelization framework

One of the most famous high-level task parallelization frameworks would be one that follows the Master-Worker paradigm: A Task represents an operation that is running on a Worker. In this context, the word high-level means that the granularity of the Task is coarse. The words Task and its granularity are generally used in the field of parallel programming, where the granularity indicates how much computation is needed to process one Task.

UG is a high-level task parallelization framework following the Supervisor-Worker paradigm: From the configuration point of view, UG is composed by the two elements LoadCoordinator and Solvers, where the LoadCoordinator is the Supervisor and a Solver is a Worker. In the UG's high-level task parallelization framework, the goal is to parallelize a state-of-the-art single-threaded or multi-threaded Solver, in which the Task operation is very much complicated and is processed by a large number of lines of code. For example, a Mixed Integer Programming problem (MIP) solver was the initial main target. In that case, a Task is a (sub-)MIP.

To understand UG, the concepts of Master-Worker and Supervisor-Worker are compared using the example of the way a parallel Branch-and-bound solver works.

In the Master-Worker paradigm, the communication is very simple, the key messages are:

  • Task : contains task information, such as a (sub-)problem representation and current best incumbent value
  • Result : contains all information regarding the task execution. In the case of a parallel MIP solver, the best incumbent solution and all open nodes as examples.
Master-Worker.png
A parallel MIP solver based on the Master-Worker paradigm

The granularity of a Task can be controlled in the Worker by a termination criterion for a (sub-)problem computation, such as the number of open nodes generated. Note that all open branch-and-bound nodes are managed by the Master side and the number of transferred nodes is huge in case the solution process generates a huge search tree. In order to reduce the number of open nodes, usually depth-first search is used in the Worker side. An advantage is that the all nodes in the Master are independent, that is, all nodes can be a (sub-)problem root. This fact allows for a simple checkpoint and restart mechanism, that can fully save and resume the tree search. This loosely coupled Master-Worker paradigm is very well suitable for high-throughput computing, but is not so efficient in large-scale high-performance computing.

Different from the previously explained Master-Worker paradigm, the Supervisor-Worker paradigm used in UG's high-level task parallelization framework defines a very flexible message passing protocol. The key messages are:

  • Task : contains task information, such as a (sub-)problem representation, and it indicates the beginning of the Task computation.
  • Status : contains the Task computation status and the nitification frequency to Supervisor, which can be specified at run-time.
  • Completion : indicates the termination of the task computation.

In between the Task and Completion messages, any message passing protocol can be defined, for example

  • Solution : represents the best incumbent solution. Then, the solution can be share whenever a single Solver found a new one.
  • InCollecting : indicates that the Supervisor needs new Tasks. This message allows to collect (sub-)problems on demand.
  • OutCollecting : indicates that the Supervisor does not need new Tasks.
  • Interrupt : indicates the current executing Task can be interrupted
  • etc.
Supervisor-Worker.png
A parallel MIP solver based on Supervisor-Worker

The LoadCoordinator manages and does a dynamic load balancing among Solvers. However, the LoadCoordinaor only keeps root (sub-)problem nodes at run-time. For checkpoints, an even smaller number of only essential nodes are saved. These essential nodes are checking relation among the root nodes, that is, if its ancestor node exits. If it does exist, then the node does not have to be saved, since it can be regenerated from its ancestor. This looks inefficient, but in a state-of-the-art MIP solver, it sometimes works very well (see Solving Hard MIPLIB2003 Problems with ParaSCIP on Supercomputers: An Update). For Supervisor-Worker, see Parallel Solvers for Mixed Integer Linear Optimization in more detailed.

UG can flexibly define the message passing protocol in between Task and Completion.

The key feature of UG's high-level Task parallelization framework is the above described flexible message passing protocol during a Task computation.