Measuring up to Stability
Guidelines towards accurate energy consumption measurement results of Rust benchmarks
Abstract
In Sustainable Software Engineering there is a need for tooling and guidelines for developers. In this research we aim to provide such guidelines. We find that, for our experimental setup and set of benchmarks, 500 samples give results that are likely stable at a 1% threshold on their Relative Confidence Interval Width. Running benchmarks with a variable CPU clock speed, as well as initialising benchmarks with random data, can lead to higher variability of measurements. We also investigate the effect of benchmark length on stability, but we cannot rule out that the observed effect is caused by the experimental setup. Lastly, we identify control-flow statements and code related to memory accesses as potentially large sources of instability.
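To illustrate the stability criterion mentioned above, the following is a minimal Rust sketch, not the paper's implementation: it assumes the Relative Confidence Interval Width (RCIW) is the width of a 95% confidence interval of the mean divided by the sample mean, uses a normal approximation (z = 1.96), and applies the 1% threshold from the abstract. The sample values are hypothetical.

```rust
// Hedged sketch: checks whether a set of energy samples is "stable" under an
// assumed RCIW definition (95% CI width of the mean / sample mean).
fn rciw(samples: &[f64]) -> f64 {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Sample variance (Bessel's correction) and standard error of the mean.
    let var = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let sem = (var / n).sqrt();
    // 95% CI width under a normal approximation, relative to the mean.
    (2.0 * 1.96 * sem) / mean
}

fn main() {
    // Hypothetical energy measurements (joules) for one benchmark run.
    let samples = vec![12.1, 12.3, 11.9, 12.2, 12.0, 12.4, 12.1, 12.2];
    let rciw = rciw(&samples);
    let stable = rciw < 0.01; // 1% threshold
    println!("RCIW = {:.4} -> stable: {}", rciw, stable);
}
```

In practice one would collect many more samples (e.g. the 500 reported above) before evaluating the threshold; the point of the sketch is only to show how a relative-width criterion turns raw measurements into a stable/unstable decision.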