On the Extrapolation of Rank-Biased Overlap and the Assumption of Constant Agreement

Abstract

As a point estimate of the similarity between two possibly indefinite rankings, extrapolated rank-biased overlap (RBO_EXT) assumes that the agreement observed at the last evaluation depth continues indefinitely across the unseen tails of the two lists. This assumption ignores any patterns in the visible prefixes and thus imposes a strict restriction on the extrapolation. To improve the accuracy of RBO_EXT, this paper proposes three reformulations with a relaxed theoretical basis: one continually re-uses the agreement from the previous depth, while the other two use regression to fit a function to the seen agreements. Using synthetic data, the new extrapolation methods are compared with the original in terms of closeness to the true RBO score and the average distance between assumed and actual agreement in the rankings' unseen tails. Overall, a substantial difference is observed among the agreement estimates generated by the four approaches: because the simpler techniques barely capture the trends in the visible prefixes while the more flexible ones reproduce them closely, the trade-off between under- and overfitting becomes increasingly relevant. The results thus indicate the need for a middle ground that factors in the observed patterns while also generalizing well to the tails.
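
To make the constant-agreement assumption concrete, the following is a minimal Python sketch of the baseline extrapolated RBO, assuming two tie-free prefixes evaluated to the same depth k and the commonly used form (1 - p)/p * sum_{d=1..k} A_d * p^d + A_k * p^k, where A_d is the agreement at depth d. The function names `agreements` and `rbo_ext` and the default persistence `p=0.9` are illustrative assumptions, not taken from the paper, and the three proposed reformulations are not shown.

```python
def agreements(s, t):
    """Return A_d = |S[:d] & T[:d]| / d for d = 1..k, with k = min(len(s), len(t)).

    Assumes neither ranking contains duplicate items.
    """
    k = min(len(s), len(t))
    seen_s, seen_t = set(), set()
    overlap = 0
    result = []
    for d in range(1, k + 1):
        a, b = s[d - 1], t[d - 1]
        if a == b:
            overlap += 1
        else:
            # Each new item adds to the overlap if the other list already had it.
            if a in seen_t:
                overlap += 1
            if b in seen_s:
                overlap += 1
        seen_s.add(a)
        seen_t.add(b)
        result.append(overlap / d)
    return result


def rbo_ext(s, t, p=0.9):
    """Extrapolated RBO under the constant-agreement assumption (illustrative)."""
    agr = agreements(s, t)
    k = len(agr)
    # Contribution of the visible prefixes, depths 1..k.
    seen_part = (1 - p) / p * sum(a * p ** d for d, a in enumerate(agr, start=1))
    # Contribution of the unseen tails: A_k is assumed to hold at every depth > k.
    tail_part = agr[-1] * p ** k
    return seen_part + tail_part
```

In this sketch the tail term depends only on the last observed agreement, so two pairs of rankings with very different agreement trends over the prefix but the same A_k receive the same tail contribution. That is the restriction the reformulations aim to relax.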