In parallel program performance prediction, static techniques
offer compile-time, parametric feedback at low cost. While dynamic
techniques have access to a wealth of run-time profile data,
static techniques have the advantage of retaining analytical
information on the performance effects of symbolic program/machine
parameters, without requiring costly execution runs (or simulations)
for each different parameter setting or data set.
A well-known disadvantage of static techniques, however, is their
inability to account for dynamic, data-dependent program behavior,
typically reflected in data-dependent (parallel/sequential) loop bounds
and branch conditions, which causes execution time to be stochastic
rather than deterministic. For instance, the disparity between
traditional static estimates and the observed mean execution time
grows logarithmically with the amount of parallelism.
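To see why a single deterministic estimate diverges from the mean, consider a barrier-synchronized section of N parallel tasks whose times are i.i.d. exponential with mean 1: the expected makespan equals the N-th harmonic number, which grows like ln N, while a static estimate based on the mean task time stays at 1. The following Monte Carlo sketch illustrates this effect; the exponential task-time model is an illustrative assumption, not part of the prediction technique itself.

```python
import random

def mean_makespan(n_tasks, trials=20000, seed=1):
    """Monte Carlo estimate of the mean makespan of n_tasks parallel
    tasks with i.i.d. Exp(1) execution times (makespan = max over tasks)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.expovariate(1.0) for _ in range(n_tasks))
    return total / trials

# A static estimate based on the mean task time predicts a makespan of 1
# for any degree of parallelism, but the true mean grows like the
# harmonic number H_n ~ ln(n).
for n in (1, 8, 64):
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    print(n, round(mean_makespan(n), 2), round(harmonic, 2))
```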
We present a static prediction technique that analytically approximates
a program's execution time distribution, employing statistical information
on data-dependent control parameters gathered in the course of
previous program runs over a training corpus of input data sets.
We show that our statistical enhancement does not increase prediction
time complexity, and experimental results demonstrate that the first
four statistical moments of the program execution time can be predicted
with an inherent accuracy in the 1% range, provided the statistical
information on the program parameters is correct.
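As a toy illustration of how parameter statistics gathered from training runs can be propagated into execution-time moments, consider a sequential loop with a data-dependent bound n and a fixed per-iteration cost t_body, so that T = t_body * n. The raw moments then transform as E[T^k] = t_body^k * E[n^k]. The linear cost model here is a hypothetical example chosen for simplicity, not the paper's actual compositional analysis; it only shows that prediction reduces to cheap arithmetic on a few moments rather than repeated execution.

```python
def time_moments(bound_moments, t_body):
    """Given the raw moments [E[n], E[n^2], ..., E[n^k]] of a data-dependent
    loop bound n (measured over a training corpus of input data sets),
    return the raw moments of the loop's execution time T = t_body * n,
    using E[T^k] = t_body**k * E[n**k]."""
    return [t_body ** (k + 1) * m for k, m in enumerate(bound_moments)]

# Example: loop-bound values observed over previous training runs.
samples = [10, 12, 8, 14, 16, 10]
moments_n = [sum(x ** k for x in samples) / len(samples) for k in (1, 2, 3, 4)]
# Predicted first four moments of T, at 2.5 time units per iteration;
# the prediction cost is O(k), independent of the number of training runs.
moments_T = time_moments(moments_n, t_body=2.5)
```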