An adversarial model for paraphrase generation
Universidad Católica San Pablo
Paraphrasing is the act of expressing the idea of a sentence in different words. Paraphrase generation is an interesting and challenging task for three main reasons: (1) text is discrete, (2) it is difficult to modify a sentence slightly without changing its meaning, and (3) there are no accurate automatic metrics for evaluating the quality of a paraphrase. Several methods have been proposed for this problem, and neural network-based approaches have recently been applied to it. This thesis presents a novel framework for paraphrase generation in English. The work focuses on and evaluates three aspects of the model, as the teaser figure shows: (a) static input representations extracted from pre-trained language models, (b) convolutional sequence-to-sequence models as the main architecture, and (c) a hybrid loss function that combines maximum likelihood with adversarial REINFORCE while avoiding the computationally expensive Monte Carlo search. We compare our best models with several baselines on the Quora Question Pairs dataset. The results show that our framework is competitive with previous benchmarks.
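As an illustration of aspect (c), the following minimal sketch shows one way a maximum-likelihood term and a REINFORCE term could be mixed into a single loss. It is not the thesis implementation: the function names, the mixing weight `mix`, the `baseline`, and the use of one sequence-level discriminator reward shared by all tokens (a common way to avoid per-token Monte Carlo rollouts) are assumptions made for this example.

```python
import math


def mle_loss(target_log_probs):
    """Mean negative log-likelihood of the reference paraphrase tokens."""
    return -sum(target_log_probs) / len(target_log_probs)


def reinforce_loss(sampled_log_probs, reward, baseline=0.0):
    """REINFORCE surrogate loss for one sampled paraphrase.

    Assumption: `reward` is a single score for the whole sampled sentence
    (for example, a discriminator's probability that it is human-written),
    shared by every token, so no per-token rollouts are needed.
    """
    advantage = reward - baseline
    return -advantage * sum(sampled_log_probs) / len(sampled_log_probs)


def hybrid_loss(target_log_probs, sampled_log_probs, reward, mix=0.5, baseline=0.0):
    """Convex combination of the two losses; `mix` is a hypothetical weight."""
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must be in [0, 1]")
    return (mix * mle_loss(target_log_probs)
            + (1.0 - mix) * reinforce_loss(sampled_log_probs, reward, baseline))


# Toy log-probabilities standing in for the generator's outputs.
reference = [math.log(0.5), math.log(0.25)]
sampled = [math.log(0.4), math.log(0.1)]
loss = hybrid_loss(reference, sampled, reward=0.8, mix=0.5, baseline=0.5)
```

With `mix=1.0` the sketch reduces to plain maximum-likelihood training, and with `mix=0.0` it reduces to the adversarial policy-gradient term alone.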