Graduate Thesis Or Dissertation
 

Action Segmentation in Videos: Deep Models, Prediction Diffusion, and Deep-Temporal Augmentations

Public Deposited

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/cz30q253p

Descriptions

Attribute Name | Values
Creator
Abstract
  • This paper addresses the high model complexity and overconfident frame labeling of state-of-the-art (SOTA) action segmenters. Their complexity is typically justified by the need to sequentially refine action segmentation through multiple stages of a deep architecture. However, this multistage refinement does not take into account the uncertainty of frame labeling predicted by the previous stage, and hence may temporally propagate highly confident prediction errors from neighboring frames to other frames. To reduce model complexity and account for prediction uncertainty, we replace the refinement stages with a new, lighter module for conducting Dirichlet diffusion. Given the initial framewise prediction of the first stage of a SOTA model, the Dirichlet diffusion re-calibrates the confidence of the initial prediction and diffuses the class distributions of video frames, thus producing the final action segmentation. The diffusion is controlled by learnable Dirichlet parameters, efficiently estimated only for a few key frames with a transformer-based network. Our experimental results demonstrate performance gains over SOTA models on the benchmark datasets, including Breakfast, GTEA, 50Salads, and Assembly101, while reducing complexity.
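The abstract's core idea, re-calibrating framewise confidence via Dirichlet parameters and then diffusing class distributions across neighboring frames, can be illustrated with a minimal sketch. This is not the thesis's implementation (which learns Dirichlet parameters with a transformer for key frames); it is a simplified stand-in where the evidence comes directly from the first-stage logits and the diffusion is plain neighbor averaging of Dirichlet parameters. All function names and hyperparameters (`steps`, `rate`) are illustrative assumptions.

```python
import numpy as np

def dirichlet_diffusion(logits, steps=10, rate=0.25):
    """Illustrative sketch only, not the authors' method.

    logits: (T, C) first-stage framewise class scores.
    Returns (labels, probs): diffused framewise labels and
    expected class distributions under the Dirichlet.
    """
    # Map logits to non-negative Dirichlet evidence (softplus + 1,
    # so alpha >= 1 and the distribution is proper).
    alpha = np.log1p(np.exp(logits)) + 1.0          # (T, C)
    for _ in range(steps):
        # Diffuse Dirichlet parameters toward temporal neighbors,
        # replicating the boundary frames at the sequence edges.
        left = np.vstack([alpha[:1], alpha[:-1]])
        right = np.vstack([alpha[1:], alpha[-1:]])
        alpha = (1 - rate) * alpha + rate * 0.5 * (left + right)
    # Expected class distribution of a Dirichlet is alpha / sum(alpha);
    # low total evidence (small alpha) yields a flatter, less
    # overconfident distribution.
    probs = alpha / alpha.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs

# Toy usage: 100 frames, 5 action classes.
rng = np.random.default_rng(0)
labels, probs = dirichlet_diffusion(rng.normal(size=(100, 5)))
```

The neighbor averaging smooths isolated high-confidence errors, which mirrors, in spirit, the abstract's point that diffusion should temper overconfident framewise predictions before the final segmentation is read off.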
License
Resource Type
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Academic Affiliation
Rights Statement
Publisher
Peer Reviewed
Language
File Format

Relationships

Parents:

This work has no parents.

In Collection:

Articles