Graduate Thesis or Dissertation
 

Explaining and Improving Neural Machine Translation

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/6q182t650

Descriptions

Abstract
Machine Translation, the task of automatically translating between human languages, has been studied for decades. It was traditionally solved by count-based statistical models, e.g. Phrase-Based Statistical Machine Translation (PBSMT), which decomposes the translation problem into a separately trained statistical language model and translation model. More recently, Neural Machine Translation (NMT) was proposed to solve this problem by training a Deep Neural Network (DNN) in an end-to-end fashion; it demonstrates superior performance and has become the state-of-the-art approach to Machine Translation. However, black-box DNN models are inscrutable, and it is difficult to explain the common failure modes of NMT systems. In this thesis, we attempt to explain and improve the modern Neural Machine Translation model by looking closely into multiple aspects of it and proposing new algorithms and heuristics to improve translation performance.

At inference time, it is common to adopt the beam search algorithm to explore the exponential candidate space. Yet the NMT model is widely found to generate worse translations as the search budget (i.e., the beam size) increases. This phenomenon, commonly referred to as the beam search curse, hurts translation performance and in practice limits beam sizes to less than 10. In Chapter 3, we examine the beam search curse in depth and propose new rescoring methods to ameliorate it.

With its deep stack of architecturally identical neural networks, it is nearly impossible to explain the internal functionality of DNN-based NMT models. In Chapter 4, we explain the module-level functionalities of the NMT decoder through our proposed information probing framework. Somewhat surprisingly, we find that half of its parameters can be dropped with minimal performance loss.

Finally, for the recently popularized Multilingual NMT model, a single model is trained on all language pairs to translate between all languages. We find that it faces a severe off-target translation issue, where it frequently produces outputs in the wrong language. In Chapter 5, we propose representation-level and gradient-level regularizations at training time that significantly improve translation performance and reduce off-target errors. In Chapter 6, we explain how off-target translation emerges during beam search decoding, and propose a language-informed beam search algorithm that notably reduces off-target errors solely at decoding time.
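To make the beam search curse concrete, below is a minimal, self-contained sketch of beam search decoding. It uses a hypothetical toy next-token distribution (toy_log_prob) standing in for a real NMT decoder; the toy model's end-of-sentence probability grows with prefix length, which reproduces the short-hypothesis bias behind the curse. The length-normalized rescoring shown here is one standard remedy used purely for illustration, not the specific rescoring methods the thesis proposes in Chapter 3.

import math
from typing import Callable

BOS, EOS = "<s>", "</s>"

def toy_log_prob(prefix: tuple) -> dict:
    """Hypothetical next-token log-probabilities standing in for an NMT decoder."""
    p_eos = min(0.9, 0.2 + 0.1 * len(prefix))  # stopping grows likelier with length
    rest = (1.0 - p_eos) / 2.0
    return {"a": math.log(rest), "b": math.log(rest), EOS: math.log(p_eos)}

def beam_search(beam_size: int, max_len: int = 10,
                rescore: Callable[[float, int], float] = lambda s, n: s):
    """Standard beam search; rescore(score, length) ranks finished hypotheses."""
    beams = [((BOS,), 0.0)]   # live hypotheses: (token prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = [(prefix + (tok,), score + lp)
                      for prefix, score in beams
                      for tok, lp in toy_log_prob(prefix).items()]
        candidates.sort(key=lambda c: c[1], reverse=True)
        kept = candidates[:beam_size]            # prune below the beam cutoff
        finished += [c for c in kept if c[0][-1] == EOS]
        beams = [c for c in kept if c[0][-1] != EOS]
        if not beams:
            break
    finished += beams  # include any hypotheses still unfinished at max_len
    # Rank by the rescored value; length counts generated tokens (BOS excluded).
    return max(finished, key=lambda c: rescore(c[1], len(c[0]) - 1))

for k in (1, 5, 50):
    raw = beam_search(k)                               # raw log-probability
    norm = beam_search(k, rescore=lambda s, n: s / n)  # length normalization
    print(f"beam={k:2d}  raw pick: {len(raw[0]) - 1} tokens,"
          f"  normalized pick: {len(norm[0]) - 1} tokens")

Running this shows the curse in miniature: under raw log-probability, widening the beam from 1 to 5 lets a degenerate immediate end-of-sentence hypothesis win (the output shrinks from two generated tokens to one), while the length-normalized ranking keeps preferring a longer hypothesis. Chapter 3 studies this effect on real NMT models and develops more refined rescoring methods.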


In Collection:

Articles