In recent years, the novel framing of the caching problem as an Online Convex Optimisation (OCO) problem has led to the introduction of several online caching policies. These policies are proven optimal with respect to regret for any arbitrary request pattern, including patterns of adversarial origin. Nevertheless, their performance depends heavily on the tuning of their so-called hyperparameters (e.g., the learning rate), which is typically done under the presumption of adversarial conditions. Since not all request patterns encountered in practice are adversarial, this leaves room for further improvement. Unfortunately, tuning these hyperparameters to achieve optimal performance across non-adversarial request sequences remains an open problem and poses a new challenge. This paper proposes an online meta-learner that combines several instances of the OGA caching policy, each with a distinct learning rate, framing the problem as a prediction-with-expert-advice problem. The proposed meta-learner dynamically shifts between caching experts and achieves a regret upper bound similar to that of the best-performing caching expert. The penalty incurred for learning the best expert is limited to a logarithmic dependence on the number of experts, which improves upon previous works. Numerical evaluations on synthetic request traces demonstrate consistent improvements when the caching experts' relative performance varies over time.
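To make the setting concrete, the following is a minimal sketch of such a meta-learner. It assumes the standard fractional-caching formulation of OGA (gradient ascent on the linear cache-hit utility, followed by Euclidean projection onto the capacity-constrained box) and uses a Hedge-style multiplicative-weights update over the experts; the exact meta-update used in the paper may differ, and all names here (`OGAExpert`, `HedgeMetaLearner`, the bisection-based projection) are illustrative, not the authors' implementation.

```python
import numpy as np

def project_capped_simplex(y, capacity):
    # Euclidean projection onto {x in [0,1]^N : sum(x) = capacity},
    # computed by bisection on the dual (shift) variable.
    lo, hi = y.min() - 1.0, y.max()
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if np.clip(y - mid, 0.0, 1.0).sum() > capacity:
            lo = mid  # cache too full: shift down more
        else:
            hi = mid
    return np.clip(y - (lo + hi) / 2.0, 0.0, 1.0)

class OGAExpert:
    # One OGA caching instance with a fixed learning rate.
    def __init__(self, n_files, capacity, lr):
        self.x = np.full(n_files, capacity / n_files)  # fractional cache state
        self.capacity = capacity
        self.lr = lr

    def reward(self, request):
        # Cache-hit utility: the cached fraction of the requested file.
        return self.x[request]

    def update(self, request):
        # Gradient of the linear hit utility is the request indicator.
        g = np.zeros_like(self.x)
        g[request] = 1.0
        self.x = project_capped_simplex(self.x + self.lr * g, self.capacity)

class HedgeMetaLearner:
    # Expert-advice layer: keeps a distribution over OGA experts and
    # earns the weighted-average reward of their cache configurations.
    def __init__(self, experts, meta_lr):
        self.experts = experts
        self.meta_lr = meta_lr
        self.w = np.ones(len(experts)) / len(experts)

    def step(self, request):
        rewards = np.array([e.reward(request) for e in self.experts])
        combined = float(self.w @ rewards)      # meta-learner's hit utility
        self.w *= np.exp(self.meta_lr * rewards)  # multiplicative update
        self.w /= self.w.sum()
        for e in self.experts:
            e.update(request)
        return combined
```

On a request trace where one learning rate happens to be well matched, the Hedge weights concentrate on that expert, so the meta-learner's cumulative reward tracks the best expert's up to the logarithmic learning penalty mentioned above.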