Towards Minimal Necessary Data

The Case for Analyzing Training Data Requirements of Recommender Algorithms

Abstract

This paper states the case for the principle of minimal necessary data: If two recommender algorithms achieve the same effectiveness, the better algorithm is the one that requires less user data. Applying this principle involves carrying out training data requirements analysis, which we argue should be adopted as best practice for the development and evaluation of recommender algorithms. We take the position that responsible recommendation is recommendation that serves the people whose data it uses. To minimize the imposition on users’ privacy, it is important that a recommender system does not collect or store more user information than it absolutely needs. Further, algorithms using minimal necessary data reduce training time and address the cold start problem. To illustrate the trade-off between training data volume and accuracy, we carry out a set of classic recommender system experiments. We conclude that consistently applying training data requirements analysis would represent a relatively small change in researchers’ current practices, but a large step towards more responsible recommender systems.