In a multiprocessor system, the set of runnable tasks may be organized in a central task queue that is accessible from all processors, or distributed in local task queues. If there are many processors, a central task queue may become a system bottleneck, and the distributed organization may cause a load imbalance. The authors propose a hierarchical approach between these two extremes. A performance analysis shows that the advantages of the two quoted organizations are nearly preserved while the disadvantages are nearly omitted.
All arriving tasks are put into a central task queue at the root of a tree. Each node of the tree is a task queue. Each time a processor fetches a task from an inner node, some tasks are taken from the queue and distributed among the nodes of the subtree. A processor looks for a task for execution in its local queue. If empty, the search is continued at the parent node, and so on.
The performance analysis consists of a workload model. It uses Poisson job arrivals; the number of tasks per job is “truncated exponentially” distributed (likely “geometrically” is meant), and the service time of a task is “exponentially distributed,” sometimes with a coefficient of variation different from one (this is a contradiction).
Simple formulas allow comparisons of the task queue utilizations of the three organizations and for the saturation points that determine the conditions under which the queues become a bottleneck. Job response times for the three task queue organizations are determined by simulation. From the simulation results, the authors conclude that the proposed organization schema has favorable properties.
This paper is clearly written, and the new organization is supported by performance evaluations.