Graduate Thesis or Dissertation
 

Explaining and Improving Neural Machine Translation

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/6q182t650

Descriptions

Abstract
Machine Translation, the task of automatically translating between human languages, has been studied for decades. It was traditionally solved by count-based statistical models, e.g. Phrase-Based Statistical Machine Translation (PBSMT), which decomposes the translation problem into a separately trained statistical language model and translation model. More recently, Neural Machine Translation (NMT) was proposed to solve this problem by training a Deep Neural Network (DNN) in an end-to-end fashion; it demonstrates superior performance and has become the state-of-the-art approach to Machine Translation. However, black-box DNN models are inscrutable, and it is difficult to explain the common failure modes of NMT systems. In this thesis, we attempt to explain and improve the modern Neural Machine Translation model by looking closely into multiple aspects of it and proposing new algorithms and heuristics to improve translation performance.

At inference time, it is common to adopt the beam search algorithm to explore the exponential candidate space. Yet the NMT model is widely found to generate worse translations as the search budget (i.e., the beam size) increases. This phenomenon, commonly referred to as the beam search curse, hurts translation performance and in practice limits beam sizes to less than 10. In Chapter 3, we examine the beam search curse in depth and propose new rescoring methods to ameliorate it.

With its deep stack of architecturally identical neural networks, it is nearly impossible to explain the internal functionality of DNN-based NMT models. In Chapter 4, we explain the module-level functionalities of the NMT decoder through our proposed information probing framework. Somewhat surprisingly, we find that half of its parameters can be dropped with minimal performance loss.

Finally, for the recently popularized Multilingual NMT model, a single model is trained on all language pairs to translate between all languages. We find that it faces a severe off-target translation issue, where it frequently produces outputs in the wrong language. In Chapter 5, we propose representation-level and gradient-level regularizations at training time that significantly improve translation performance and reduce off-target errors. In Chapter 6, we explain how off-target translation emerges during beam search decoding, and propose a language-informed beam search algorithm that notably reduces off-target errors solely at decoding time.
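To make the beam search curse concrete, below is a minimal, self-contained sketch of beam search decoding. It uses a hypothetical toy next-token distribution (toy_log_prob) standing in for a real NMT decoder; the toy model's end-of-sentence probability grows with prefix length, which reproduces the short-hypothesis bias behind the curse. The length-normalized rescoring shown here is one standard remedy used purely for illustration, not the specific rescoring methods the thesis proposes in Chapter 3.

import math
from typing import Callable

BOS, EOS = "<s>", "</s>"

def toy_log_prob(prefix: tuple) -> dict:
    """Hypothetical next-token log-probabilities standing in for an NMT decoder."""
    p_eos = min(0.9, 0.2 + 0.1 * len(prefix))  # stopping grows likelier with length
    rest = (1.0 - p_eos) / 2.0
    return {"a": math.log(rest), "b": math.log(rest), EOS: math.log(p_eos)}

def beam_search(beam_size: int, max_len: int = 10,
                rescore: Callable[[float, int], float] = lambda s, n: s):
    """Standard beam search; rescore(score, length) ranks finished hypotheses."""
    beams = [((BOS,), 0.0)]   # live hypotheses: (token prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = [(prefix + (tok,), score + lp)
                      for prefix, score in beams
                      for tok, lp in toy_log_prob(prefix).items()]
        candidates.sort(key=lambda c: c[1], reverse=True)
        kept = candidates[:beam_size]            # prune below the beam cutoff
        finished += [c for c in kept if c[0][-1] == EOS]
        beams = [c for c in kept if c[0][-1] != EOS]
        if not beams:
            break
    finished += beams  # include any hypotheses still unfinished at max_len
    # Rank by the rescored value; length counts generated tokens (BOS excluded).
    return max(finished, key=lambda c: rescore(c[1], len(c[0]) - 1))

for k in (1, 5, 50):
    raw = beam_search(k)                               # raw log-probability
    norm = beam_search(k, rescore=lambda s, n: s / n)  # length normalization
    print(f"beam={k:2d}  raw pick: {len(raw[0]) - 1} tokens,"
          f"  normalized pick: {len(norm[0]) - 1} tokens")

Running this shows the curse in miniature: under raw log-probability, widening the beam from 1 to 5 lets a degenerate immediate end-of-sentence hypothesis win (the output shrinks from two generated tokens to one), while the length-normalized ranking keeps preferring a longer hypothesis. Chapter 3 studies this effect on real NMT models and develops more refined rescoring methods.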


In Collection:

Articles