Curriculum Learning Strategies for IR

Penha, G.; Hauff, C.

An Empirical Study on Conversation Response Ranking

Conference paper (2020)

Authors

G. Penha (Web Information Systems)

C. Hauff (Web Information Systems)

Research Group

Web Information Systems (TU Delft)

Curriculum learning, Conversation response ranking

To reference this document use:

http://resolver.tudelft.nl/uuid:3d6216fb-d31f-47d4-af0d-911d3bccaccb

Published Date

2020

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Department

Software Technology

Abstract

Neural ranking models are traditionally trained on a series of random batches, sampled uniformly from the entire training set. Curriculum learning has recently been shown to improve neural models’ effectiveness by sampling batches non-uniformly, going from easy to difficult instances during training. In the context of neural Information Retrieval (IR), curriculum learning has not been explored yet, and so it remains unclear (1) how to measure the difficulty of training instances and (2) how to transition from easy to difficult instances during training. To address both challenges and determine whether curriculum learning is beneficial for neural ranking models, we need large-scale datasets and a retrieval task that allows us to conduct a wide range of experiments. For this purpose, we resort to the task of conversation response ranking: ranking responses given the conversation history. To address challenge (1), we explore scoring functions that measure the difficulty of conversations based on different input spaces. To address challenge (2), we evaluate different pacing functions, which determine the pace at which we move from easy to difficult instances. We find that, overall, by just intelligently sorting the training data (i.e., by performing curriculum learning) we can improve the retrieval effectiveness by up to 2%. The source code is available at https://github.com/Guzpenha/transformers_cl.
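The two ingredients named in the abstract, a scoring function and a pacing function, can be illustrated with a minimal Python sketch. It is not taken from the authors' repository. The word-count heuristic, the names `score_by_context_length`, `linear_pacing`, `root_pacing` and `curriculum_batches`, and the defaults `start_fraction=0.33` and `degree=2` are all assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, Iterator, List, Tuple

# One training instance: (conversation history as utterances, candidate responses).
Instance = Tuple[List[str], List[str]]


def score_by_context_length(instance: Instance) -> float:
    """Hypothetical difficulty heuristic: more words in the history means harder."""
    history, _ = instance
    return float(sum(len(utterance.split()) for utterance in history))


def linear_pacing(step: int, total_steps: int, start_fraction: float = 0.33) -> float:
    """Fraction of the easiest-first training set that is available at `step`."""
    progress = min(1.0, step / max(1, total_steps))
    return min(1.0, start_fraction + (1.0 - start_fraction) * progress)


def root_pacing(step: int, total_steps: int,
                start_fraction: float = 0.33, degree: int = 2) -> float:
    """Like linear_pacing, but harder instances are introduced faster early on."""
    progress = min(1.0, step / max(1, total_steps))
    base = start_fraction ** degree
    return min(1.0, (base + (1.0 - base) * progress) ** (1.0 / degree))


def curriculum_batches(
    instances: List[Instance],
    scoring_fn: Callable[[Instance], float],
    pacing_fn: Callable[[int, int], float],
    total_steps: int,
    batch_size: int,
    seed: int = 0,
) -> Iterator[List[Instance]]:
    """Yield one batch per step, sampled uniformly from the currently allowed prefix."""
    rng = random.Random(seed)
    ordered = sorted(instances, key=scoring_fn)  # easiest instances first
    for step in range(total_steps):
        fraction = pacing_fn(step, total_steps)
        # Always keep at least one batch worth of instances available.
        available = max(batch_size, math.ceil(fraction * len(ordered)))
        pool = ordered[:available]
        yield rng.sample(pool, min(batch_size, len(pool)))
```

With a uniform sampler the pool would be the whole training set at every step. Here the pool starts as the easiest third and grows as training proceeds, so early batches contain only low-scoring (easy) conversations. Passing `root_pacing` in place of `linear_pacing` changes only how quickly the pool grows.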
