Honors College Thesis
 

Leveraging transformer encoder output for effective token summary in simultaneous translation

https://ir.library.oregonstate.edu/concern/honors_college_theses/j9602806f

Descriptions

Creator
Abstract
  • Simultaneous speech-to-text translation remains a difficult yet important problem for modern machine learning models: a text translation is generated while partial speech input is still being received. One state-of-the-art simultaneous speech-to-text model is the augmented memory transformer, whose encoder breaks a speech input into fixed-size overlapping segments composed of left, right, and center contexts. For each segment, the encoder saves a summarization of its hidden states as memory banks to use in the encoder attention calculation for subsequent segments. Despite the solid performance of the current augmented memory transformer, it fails to adequately reuse information learned from prior segments and is not adapted to variable-length inputs, which cause gaps in the center context. We therefore propose leveraging the encoder output in the left-context calculation for effective token summarization. Additionally, we introduce a secondary, variable-length left context to fill empty spaces in the center context. Our experiments on the English-German MuST-C dataset show that, in isolation, these contributions increase the BLEU score by an average of 3.91 and 5.69 points, respectively, over their baseline counterparts without affecting latency.
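
  The abstract describes the segmentation scheme only at a high level. The sketch below is a minimal, hypothetical illustration of that scheme, not the thesis's actual implementation: the context sizes, function names, and the average-pooling summary are placeholder assumptions, standing in for the learned memory-bank summarization of encoder hidden states.

    # Hypothetical sketch of fixed-size segmentation with left/center/right
    # contexts and a per-segment memory bank, as described in the abstract.
    # All names and sizes are illustrative assumptions.
    import numpy as np

    CENTER, LEFT, RIGHT = 16, 8, 4  # assumed context sizes, in frames

    def segment_with_contexts(frames):
        """Split a (T, d) frame matrix into (left, center, right) segments."""
        segments = []
        t = 0
        while t < len(frames):
            center = frames[t : t + CENTER]
            left = frames[max(0, t - LEFT) : t]          # preceding frames
            right = frames[t + CENTER : t + CENTER + RIGHT]  # look-ahead frames
            segments.append((left, center, right))
            t += CENTER
        return segments

    def summarize(states):
        """Placeholder for memory-bank summarization: simple average pooling
        here; the model itself learns a summary of encoder hidden states."""
        return states.mean(axis=0)

    # Toy run: 50 frames of 4-dim features; memory banks accumulate so that
    # later segments' encoder attention can reuse earlier summaries.
    frames = np.random.randn(50, 4)
    memory_banks = []
    for left, center, right in segment_with_contexts(frames):
        states = np.concatenate([x for x in (left, center, right) if len(x)])
        memory_banks.append(summarize(states))
    print(len(memory_banks), memory_banks[0].shape)  # 4 segments, one (4,) bank each
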
Resource Type
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Non-Academic Affiliation
Rights Statement
Publisher
Peer Reviewed
Language
Embargo reason
  • Pending Publication
Embargo date range
  • 2023-04-05 to 2023-06-03
