Graduate Project
 

A Study of Temporal Convolutional Neural Networks for Music Language Modeling as Applied to Automatic Music Transcription

Public Deposited

Downloadable Content

Download PDF
https://ir.library.oregonstate.edu/concern/graduate_projects/kh04dw03f

Descriptions

Attribute Name | Values
Creator
Abstract
  • Automatic music transcription (AMT) is the task of recovering, from an acoustic representation of music, a symbolic notation of the written notes expressed by the sound. Transcribing music with multiple notes sounding simultaneously is difficult for both humans and machines. Much existing work on AMT has focused on suitable acoustic or combined acoustic/sequence discriminative models. However, it has been speculated that music language modeling (MLM), which seeks to build generative models of musical sequences, could be useful for building better AMT systems. Although existing work demonstrates only modest effectiveness, there are plausible reasons to expect that graphical models combining a language model with an acoustic model could significantly improve the performance of AMT acoustic models. Common sequence modeling techniques for the MLM task are Hidden Markov Models and recurrent neural networks such as Long Short-Term Memory (LSTM) models. In contrast to these well-established sequential modeling techniques, the present work studies a feed-forward, temporal convolutional neural network (TCNN) architecture for MLM within a local radius of up to one minute of music. This network and a typical LSTM network are trained on a substantial set of digital symbolic music data, and both networks are used in graphical models to postprocess the predictions of a preexisting acoustic model. Although neither MLM improves time-frame transcription performance, both substantially improve note-wise and onset/offset performance over the acoustic model alone. To the author's knowledge, this work is the first to use a TCNN architecture specifically for language modeling of symbolic musical data and to successfully use that model to improve the AMT note transcription performance of an acoustic model.
License
Resource Type
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Academic Affiliation
Rights Statement
Publisher
Peer Reviewed
Language
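
The abstract describes a feed-forward temporal convolutional network whose context is limited to a local radius of up to one minute of music. As a rough illustration of how such a bounded context can arise, the sketch below implements a generic causal dilated 1-D convolution and computes the receptive field of a stack of such layers. It is not the architecture from the project: the kernel size, the dilation schedule, the frame rate, and the function names `causal_dilated_conv` and `receptive_field` are all assumptions chosen for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Input time steps visible to one output of stacked causal dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv(x, weights, dilation):
    """Compute y[t] = sum_j weights[j] * x[t - j * dilation].

    Indices before the start of the sequence are treated as zero,
    so y[t] never depends on inputs later than t (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


# Hypothetical configuration (not taken from the project):
# kernel size 2 with dilations doubling per layer, 1, 2, 4, ..., 512.
KERNEL_SIZE = 2
DILATIONS = [2 ** i for i in range(10)]
FRAMES_PER_SECOND = 16  # assumed time resolution of the symbolic frames

frames = receptive_field(KERNEL_SIZE, DILATIONS)
seconds = frames / FRAMES_PER_SECOND
print(f"receptive field: {frames} frames, about {seconds:.0f} s of context")
```

Under these assumed settings, ten layers cover 1024 frames, or roughly one minute at 16 frames per second. The general point is that doubling dilations make the context grow exponentially with depth while each layer remains a fixed-size feed-forward operation, in contrast to the recurrent state an LSTM carries.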

Relationships

Parents:

This work has no parents.

Items