Improving Long Content Question Generation with Multi-level Passage Encoding
Abstract
Generating questions that can be answered with word spans from passages is an important natural language processing task, with applications in education, question-answering systems, and conversational systems. Existing question generation models often produce questions that are unrelated to the context passage and answer span. In this paper, we first analyze questions generated by a common baseline model and find that over half of the questions rated as lowest quality are semantically unrelated to the context passage. We then investigate how humans ask factual questions and show that, most often, they reformulate the target sentence together with information from the context passage. Based on these findings, we propose a neural question generation (QG) model built on multi-level encoding and gated attention fusion that overcomes these shortcomings. Our experiments demonstrate that our model outperforms existing state-of-the-art seq2seq QG models.
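To illustrate the gated attention fusion idea named in the abstract, the minimal sketch below combines a word-level passage encoding with a higher-level (e.g., sentence-level) encoding through a learned gate. The module name, tensor shapes, and the specific gating formula are assumptions for illustration only, not the paper's exact architecture.

```python
# A minimal sketch of gated attention fusion between two levels of passage
# encoding, assuming a PyTorch setup; all layer sizes, names, and the exact
# fusion formula are illustrative assumptions, not the paper's specification.
import torch
import torch.nn as nn


class GatedAttentionFusion(nn.Module):
    """Fuse word-level states with passage-level states via attention and a learned gate."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Scores each word-level state against the passage-level states.
        self.attn = nn.Linear(hidden_size, hidden_size, bias=False)
        # Gate controlling how much passage context flows into each word state.
        self.gate = nn.Linear(2 * hidden_size, hidden_size)
        # Projects the concatenated state into a fused candidate representation.
        self.proj = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, word_states, passage_states):
        # word_states:    (batch, num_words, hidden)
        # passage_states: (batch, num_sents, hidden)
        scores = torch.bmm(self.attn(word_states), passage_states.transpose(1, 2))
        weights = torch.softmax(scores, dim=-1)         # (batch, num_words, num_sents)
        context = torch.bmm(weights, passage_states)    # (batch, num_words, hidden)
        fused_in = torch.cat([word_states, context], dim=-1)
        gate = torch.sigmoid(self.gate(fused_in))       # element-wise mixing gate
        candidate = torch.tanh(self.proj(fused_in))
        return gate * candidate + (1.0 - gate) * word_states


if __name__ == "__main__":
    batch, num_words, num_sents, hidden = 2, 20, 5, 64
    fusion = GatedAttentionFusion(hidden)
    words = torch.randn(batch, num_words, hidden)
    sents = torch.randn(batch, num_sents, hidden)
    print(fusion(words, sents).shape)  # torch.Size([2, 20, 64])
```

In this reading, the gate lets the decoder-facing representation lean on passage-level context only where it helps relate the question to the surrounding passage, which is one plausible way to address the relatedness failures the abstract describes.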