TY - CHAP
T1 - Statistical regimes and runtime prediction
AU - Hurley, Barry
AU - O'Sullivan, Barry
PY - 2015
Y1 - 2015
AB - The last decade has seen a growing interest in solver portfolios, automated solver configuration, and runtime prediction methods. At their core, these methods rely on a deterministic, consistent behaviour from the underlying algorithms and solvers. However, modern state-of-the-art solvers have elements of stochasticity built in such as randomised variable and value selection, tie-breaking, and randomised restarting. Such features can elicit dramatic variations in the overall performance between repeated runs of the solver, often by several orders of magnitude. Despite the success of the aforementioned fields, such performance variations in the underlying solvers have largely been ignored. Supported by a large-scale empirical study employing many years of industrial SAT Competition instances including repeated runs, we present statistical and empirical evidence that such a performance variation phenomenon necessitates a change in the evaluation of portfolio, runtime prediction, and automated configuration methods. In addition, we demonstrate that this phenomenon can have a significant impact on empirical solver competitions. Specifically, we show that the top three solvers from the 2014 SAT Competition could have been ranked in any permutation. These findings demonstrate the need for more statistically well-founded regimes in empirical evaluations.
UR - https://www.scopus.com/pages/publications/84949816668
M3 - Chapter
AN - SCOPUS:84949816668
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 318
EP - 324
BT - IJCAI 2015 - Proceedings of the 24th International Joint Conference on Artificial Intelligence
A2 - Wooldridge, Michael
A2 - Yang, Qiang
PB - International Joint Conferences on Artificial Intelligence
T2 - 24th International Joint Conference on Artificial Intelligence, IJCAI 2015
Y2 - 25 July 2015 through 31 July 2015
ER -