Despite an increase in the number of people who rely on manual wheelchairs, there are still substantial economic barriers to affordable and accessible localization systems. As a result, there is a pressing need to build a versatile yet low-cost localization system for manual wheelchairs. Such systems allow users to...
This paper addresses the high model complexity and overconfident frame labeling of state-of-the-art (SOTA) action segmenters. Their complexity is typically justified by the need to sequentially refine action segmentation through multiple stages of a deep architecture. However, this multistage refinement does not take into account uncertainty of frame labeling predicted...
Sonification, or the “technique of rendering sound in response to data and interactions” [1], is an alternative to visual graphs and has the potential to make data more interpretable and accessible. As a combination of the arts (music) and the sciences (data), sonification is an interdisciplinary tool for connecting people...
Distributed version control allows developers to manage software evolution among distributed development teams. But it does not eliminate all consistency and concurrency issues, and instead introduces additional complexity when merging code. And resolving merge conflicts is nontrivial when automated merging fails. In such cases, developers are forced to inspect the...
Metric spaces (X, d) are ubiquitous objects in mathematics and computer science that are able to capture pairwise distance relationships d(x, y) between points x, y ∈ X. Because of this, it is natural to ask what useful generalizations there are of metric spaces for capturing “k-wise distance relationships” d(x1,...
Explainable Artificial Intelligence (XAI) systems aim to improve users’ understanding of AI but rarely consider the inclusivity aspects of XAI. Without inclusive approaches, improving explanations might not work well for everyone. This study investigates leveraging users’ diverse problem-solving styles as an inclusive strategy to fix an XAI prototype, with the...
We present student perceptions of a new first-year engineering programming class that was designed by informed research practices. While the College of Engineering at Oregon State University saw a lot of major switching in the first year, there were not many students switching into computer science (CS). This could have...
The use of genetic algorithms to compose music and generate sounds is an area of interest in the artificial intelligence field. Music and instrument sounds have known rules and structures that can be followed which make them well-suited for genetic algorithms. However, genetic algorithms still struggle to generate sounds comparable...
Over 37,000 people die each year in automobile accidents, with many of these fatalities resulting from collisions with emergency vehicles. The rise of autonomous cars creates the need for an accurate and failsafe method of detecting and responding to emergency vehicles safely and on time. This thesis investigates the ability...
Social media platforms use many techniques to engage users' attention with their platforms, including notifications, popups, and gamification elements. The impact of social media on physical and mental health has been studied, but limited publicly available research exists on how social media users can be helped to disengage from these...
This study investigates the evaluation of Return on Investment (ROI) in education from the perspective of high school students, introducing a theoretical model that encompasses both financial and non-financial aspects, with a primary focus on the unique insights provided by high school students. Drawing from a literature review and survey-based...
Autonomous robotic agents are on their way to becoming in-home personal assistants, construction assistants, and warehouse workers. The degree of autonomy of such systems is reflected by the manner in which we specify goals to them; the abstraction of low-level commands to high-level goals goes hand-in-hand with increased autonomy. In...
Learning to recognize objects is a fundamental and essential step in human perception and understanding of the world. Accordingly, research on object discovery across diverse modalities plays a pivotal role in computer vision. This field not only contributes significantly to enhancing our understanding of visual information but...
In this thesis, we propose a systematic code for correcting t = 1 insertion/deletion errors of the character “0” that can occur between any two consecutive 1’s in a binary string. The code requires balanced input strings, where each word of length n contains ⌈n/2⌉ 0’s and ⌊n/2⌋ 1’s. This...
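To make the balance condition above concrete, here is a minimal Python sketch that checks whether a binary word of length n contains ⌈n/2⌉ 0’s and ⌊n/2⌋ 1’s. It illustrates only the input requirement, not the proposed code itself; the function name `is_balanced` is hypothetical.

```python
def is_balanced(word: str) -> bool:
    """Return True if a binary word has ceil(n/2) zeros and floor(n/2) ones."""
    if any(ch not in "01" for ch in word):
        raise ValueError("word must be a binary string")
    n = len(word)
    zeros = word.count("0")
    ones = n - zeros
    # (n + 1) // 2 equals ceil(n/2) and n // 2 equals floor(n/2) for n >= 0.
    return zeros == (n + 1) // 2 and ones == n // 2
```

For odd n the condition is asymmetric: a word with one more 1 than 0 does not qualify.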
Using supervised machine learning (ML) to train a computer vision model typically requires human annotators to label objects in images and video. Given a large training dataset, this can be labor intensive, presenting a significant bottleneck in the model-development process. LabelFlicks is an open-source desktop application that aims to address...
One of the pervasive problems arising in our modern, digital world surrounds data breaches where an adversary, through zero-day exploitations, phishing, or old-fashioned social engineering attacks, gains access to a service’s data stores. Our society increasingly relies on these cloud-based services for everything from our taxes to personal communication. As...
A secret sharing scheme allows a dealer to distribute a secret with a set of parties, such that only a certain subset of parties can collaborate and learn the shared secret. Traditional secret sharing schemes have been used as building blocks in various subdomains of cryptography. Recently, two new extensions...
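As one classic instance of the idea described above, the following Python sketch implements Shamir's threshold scheme, in which any `threshold` of the `num_shares` parties can recover the secret. This is a textbook example chosen for illustration and is not claimed to be one of the schemes studied in this work; the prime modulus and the function names are assumptions of the sketch.

```python
import secrets

# Field modulus for the sketch (assumption): the Mersenne prime 2^127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Dealer: hide `secret` as the constant term of a random polynomial
    of degree threshold - 1 and hand out one point per party."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Parties: Lagrange-interpolate the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `shares = split_secret(s, 3, 5)`, any three of the five shares passed to `recover_secret` return `s`.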
SpotFinder is the mobile frontend of a parking system that helps drivers find a parking spot on campus. (The backend piece of the parking system was developed by others in the lab as part of a previous project.) Finding parking can be viewed as both a search problem and a...
The Pacific Islander diaspora in the Pacific Northwest United States is a unique community characterized by diverse cultures, experiences, and motivations for migration. This study aimed to explore the narratives of individuals from different Pacific Islander backgrounds residing in Oregon, shedding light on their reasons for migration and the impact...
Emerging research shows that individual differences in how people use technology sometimes cluster by socioeconomic status (SES) and that when technology is not socioeconomically inclusive, low-SES individuals may abandon it. To understand how to improve technology’s SES-inclusivity, we present a multi-phase case study on SocioEconomicMag (SESMag), an emerging inspection method...
Over time, Open Source Software (OSS) has become indispensable in the creation and upkeep of software products, serving as the fundamental building block for widely used solutions in our daily lives, including applications that enable communication, entertainment, and productivity. A sustainable OSS ecosystem is one that attracts and retains a...
While digital inclusivity researchers and software practitioners have been trying to address exclusion biases in Windows, Icons, Menus, and Pointers (WIMP) user interfaces (UIs) for a long time, little has been done to investigate if and how inclusive software design and its methods that have been devised for WIMP UIs...
The understanding of Discipline-Specific Language is an important competency for students of any field to begin mastering early in their studies, since it serves as a prerequisite for both the analysis of expert text and precise communication. Therefore, an introductory curriculum should pay careful attention to how it incorporates, defines,...
In the ever-evolving field of computer science (CS) education, the significance of teachers and their backgrounds has often been overshadowed by the predominant focus on students. K-12 teachers often lack the necessary expertise and receive limited support from existing CS-based curricula. While research on CS education effectiveness...
This thesis presents innovative pedagogical approaches to teach fundamental Computer Science (CS) concepts, such as abstraction, representation, algorithms, and computation utilizing manipulatives, which are physical objects that students interact with to teach or reinforce a concept. Teaching and learning with manipulatives has a long history in science and mathematics education,...
Intuitively, it seems as though natural language processing tasks might benefit from explicit representations of the syntactic and semantic properties of text. Ontonotes is a dataset which attempts to annotate texts, to represent as much as possible of the meaning of the text explicitly within the annotation. Many tools exist...
Simulation has provided much design inspiration for the field of robotics. While many astute ideas can be computationally formulated, there are some good structures in animals that people call “biologically-inspired”. Many robots are designed based on natural creatures such as crabs, spiders, etc. Before we can take full advantage of...
This paper contributes to the ongoing discussion on the impacts of nuclear energy on the economy and energy security of select European countries. While previous literature has identified a connection between nuclear energy and economic growth, this study focuses on assessing the comparative effects of nuclear energy, measured by operable...
Forests are important to Oregon for their beauty as well as economic value, and Douglas fir trees are among the most common and important in the state. Managing and monitoring Oregon’s forests is imperative to ensure they can remain healthy and productive. One tool that helps forest scientists to understand...
Secure Computation is a powerful tool that enables a set of parties to jointly compute any function over their private inputs, without a trusted third party. Private Set Intersection is a specific case of two-party Secure Computation, where Alice (with private set X) and Bob (with private set Y) specifically...
In an increasingly computation-driven world, algorithms and mathematical models significantly impact decision making across various fields. To foster trust and understanding, it is crucial to provide users with clear and concise explanations of the reasoning behind the results produced by computational tools, especially when recommendations appear counterintuitive. Legal frameworks in...
We construct a website to explain how Gaussian mixture models can be optimized using the expectation maximization algorithm. Previous free, online material on this process has been extremely limited. All sources surveyed failed to entirely describe our identified criteria for an in-depth description and useful visualizations. After surveying a variety...
Compactness in deep learning can be critical to a model’s viability in low-resource applications, and a common approach to extreme model compression is quantization. We consider Iterative Product Quantization (iPQ) with Quant-Noise [Fan et al., 2020] to be state-of-the-art in this area, but this quantization framework suffers from preventable inference...
Voltage fault injection is a technique that disrupts the power supply so that the data or instruction flow in a microcontroller can be modified. Recently, a new class of voltage glitches, termed arbitrary wave voltage glitches, was introduced. Despite its demonstrated success in practical studies, it comes with additional challenges, such...
The types and rates of label noise in real-world data sets present a challenge to machine learning projects. In this thesis, we propose a novel approach to address this issue. Our method combines a noise modelling technique for correcting label noise across the entire data set with a robust loss...
This paper synthesizes various works on Wang tiles up to this point, including the reduction from the Halting Problem to the Wang Tiling Problem and notions around various aspects of periodicity, such as minimal period, aperiodicity, and axis-aligned periodicity. Additionally, it presents new work, including a proof that a tiling can be periodic...
Ornithology is an exciting field with novel research emerging every day. Researchers in bioacoustics often spend hours in the wilderness recording bird calls to analyze later in their lab. Sifting through hours of audio recordings from the field remains a time-consuming, manual task, despite the...
Users curate music playlists based on emotion to focus or relax, so streaming services often create playlists of songs to aid this process. Prior research focuses on generating playlists of a single mood or genre, although a few studies work to construct automatic playlists that transition between the genres of...
We present a tool that converts any image into a painterly rendered one, or one that looks as though it was hand-painted. To do this, we implemented the Sobel filter to calculate the edge field that is used to create streamlines, which serve as the foundation for brush strokes. We...
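The edge-field step mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: it applies the standard 3×3 Sobel kernels to a grayscale image stored as a list of rows and, as an assumption of this sketch, takes the streamline direction to be the gradient rotated by 90 degrees. The function names are hypothetical.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity change.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradients(image):
    """Return (gx, gy) grids for a 2D grayscale image given as a list of rows.
    Border pixels are left at 0.0 because the kernel does not fit there."""
    h, w = len(image), len(image[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            sx = sy = 0.0
            for dr in range(3):
                for dc in range(3):
                    p = image[r + dr - 1][c + dc - 1]
                    sx += SOBEL_X[dr][dc] * p
                    sy += SOBEL_Y[dr][dc] * p
            gx[r][c], gy[r][c] = sx, sy
    return gx, gy


def edge_direction(gx, gy):
    """Unit vector along the edge (gradient rotated by 90 degrees).
    Returns (0.0, 0.0) where the image is flat. Assumption of this sketch."""
    mag = math.hypot(gx, gy)
    if mag == 0:
        return (0.0, 0.0)
    return (-gy / mag, gx / mag)
```

On an image with a vertical step edge, the gradient points across the edge and the returned direction runs along it, which is the direction a brush stroke following the edge would take.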
This dissertation delves into understanding, characterizing, and addressing dataset shift in deep learning, a pervasive issue for deployed machine learning systems. Integral aspects of the problem are examined: We start with the use of counterfactual explanations in order to characterize the behavior of deep reinforcement learning agents in visual input...
The advancement of artificial intelligence (AI) has led to transformative developments across multiple sectors, fostering innovation and redefining our interactions with technology. As AI matures and becomes integrated into society, it offers numerous opportunities to address global challenges and revolutionize a wide array of human endeavors. These advances are driven...
New capabilities in wireless network security are now possible through deep learning, which can identify and leverage patterns in radio frequency (RF) data. One area of deep learning, known as open set detection, is focused on detecting data instances from new devices encountered during deployment that were not previously seen...
Deep learning is now widely used in applications where sensitive data is used for model training, for example, in health care. In this scenario, any data leakage raises privacy concerns for those whose data records are used to train the model. An attacker can actively cause privacy leakage...
Transmit beamforming is an important technique employed to improve efficiency and signal quality in wireless communication systems by steering signals towards their intended users. It often arises jointly with the antenna selection problem due to various reasons, such as a limited number of radio frequency (RF) chains and energy/resource effi-...
Visualization and simulation software serves an important role in education, especially in the education of abstract topics. The field of computational theory, and specifically the topics of formal languages and finite automata, is well suited to visualization. When done properly, this improves the learning experience for both students and educators....
Tensor field topology is of importance to research areas of medicine, science, and engineering. Degenerate curves are one of the crucial topological features that provide valuable insights for tensor field visualization. In this thesis, we study the atomic bifurcations of degenerate curves in 3D linear second-order symmetric tensor fields, and...
Simultaneous speech translation (SimulST) is widely useful in many cross-lingual communication scenarios, including multinational conferences and international traveling. Since text-based simultaneous machine translation (SimulMT) has achieved great success in recent years, the conventional cascaded approach for SimulST uses a pipeline of streaming ASR followed by simultaneous MT but suffers from...
RNAs play important roles in multiple cellular processes, and many of their functions rely on folding to specific structures. To maintain their functions, secondary structures of RNA homologs are conserved across evolution. These conserved structures provide critical targets for diagnostics and therapeutics. Thus, there is a need for developing fast...
Secure two-party computation (2PC) is the task of performing arbitrary calculations on secret inputs provided by two parties, while maintaining secrecy if at least one party is honest. 2PC has been applied to privacy-preserving record linkage and machine learning, in areas such as medicine where maintaining privacy is crucial. One...
Learning latent space representations of high-dimensional world states has been at the core of recent rapid growth in reinforcement learning (RL). At the same time, RL algorithms have suffered from ignored uncertainties in the predicted estimates of model-free or model-based methods. In our work, we investigate both of these aspects...
Iterative algorithms are simple yet efficient in solving large-scale optimization problems in practice. With a surge in the amount of data in past decades, these methods have become increasingly important in many application areas including matrix/tensor recovery, deep learning, data mining, and reinforcement learning. To optimize or improve iterative algorithms,...
Oceanic plankton have a large impact on both our oceans’ health and the atmospheric balance of CO2. The overall health of planktonic life is determined by many physical, chemical, and ecological factors that drive taxonomic abundances and the relative amount of non-living biomass, called detritus. Recent advances in microscopic imaging...
In standard training regimes, one assumes that the classes presented to a model constitute all of the classes that the model will encounter when it is deployed. In real deployment scenarios, however, a model can sometimes encounter situations or objects that it has never seen. When these scenarios are safety-critical,...
Predicting the average affect of a piece of music is a task which has been of recent interest in the field of music information retrieval. We investigate the use of sentiment analysis on online social media conversations to predict a song’s valence and arousal. Using four music emotion datasets -...
Many home users nowadays use various smart devices to improve the efficiency and convenience of their home environments. Trigger-action platforms such as “If-This-Then-That” (IFTTT) enable end users to connect different smart devices and services using simple apps to control these devices and automate the tasks (e.g., if the camera detects...
In this thesis, a new learning algorithm is introduced that is targeted towards individual fairness. In order to be individually fair, mispredictions need to be avoided as each such prediction means the learning algorithm was unfair towards some individual. Therefore, achieving individual fairness implies having a perfect classifier, which is...
Currently, a popular approach to image classification uses the deep Transformer architecture. In a Transformer, the attention mechanism enables the model to learn efficiently with fewer computational resources than the convolutional neural networks (CNNs). In this thesis, we study the sparse attention mechanism widely used in the Transformers developed specifically...
Programming is integrated across the workflow of multiple domains where end-user programmers, those who need to program as a means to an end, regularly need to code. In the modern setting of collaborative development, end-user programmers have to interpret the intentions behind existing code to contribute and build solutions to...
With continuing improvements in performance and capability, GPU processing has gained significant and growing interest across science and industry. With this interest, research has increasingly focused upon methods of processing algorithms with stochastic, non-uniform branching while maintaining low divergence. Central among these methods is thread-data remapping (TDR), whereby data is...
Object detection models are being widely used in many applications, such as autonomous driving, construction management, and cancer detection. Evaluating the performance of the object detection model is more complicated than other computer vision models such as image classification models. Most of the images have several objects to be detected,...
The use of board games for teaching introductory computer science is a promising recent avenue of research. The goal is to introduce computing concepts through their use in the implementations of simple games, thereby keeping students engaged through their learning process. However, there is a gap between students' algorithmic descriptions...
Papers proposing novel machine learning algorithms tend to present the algorithm or technique in question in the best possible light. The standard practice is generally for authors to emphasize their proposed algorithms' performance in the precise setting where it is maximally impressive, often by only fully evaluating their best known...
Tracr is a modern browser-based user interface, designed to be used with languages that can generate customized explanations from execution traces. While Tracr is primarily designed for use with the Xtra language, Tracr defines a generalized interface that would allow it to be used with other languages as well. Explanations...
We explore the application of deep learning to the disparate fields of natural language processing and computational biology. Both the sentences uttered by humans as well as the RNA and protein sequences found within the cells of their bodies can be considered formal languages in computer science, as sets of...
This study compares three approaches in the design of an autonomous machine listening agent that predicts harbor porpoise ultrasonic echolocation clicks in diverse noise environments. Considering the temporal variations of noisy coastal ocean soundscapes which the harbor porpoises inhabit, we propose a leave-one-day-out (LODO) cross-validation strategy in the training of...
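A leave-one-day-out split of the kind named above can be sketched as follows. This is a minimal illustration under assumed conventions, not the study's pipeline: each sample is a hypothetical `(day, features, label)` tuple, and each fold holds out every sample recorded on one day while training on all other days.

```python
from collections import defaultdict


def leave_one_day_out(samples):
    """Yield (held_out_day, train_indices, test_indices), one fold per day.

    `samples` is a list of (day, features, label) tuples; this layout is an
    assumption of the sketch.
    """
    by_day = defaultdict(list)
    for idx, (day, _features, _label) in enumerate(samples):
        by_day[day].append(idx)
    for day in sorted(by_day):
        test_idx = by_day[day]
        # Train on every sample that was NOT recorded on the held-out day.
        train_idx = sorted(
            i for other, idxs in by_day.items() if other != day for i in idxs
        )
        yield day, train_idx, test_idx
```

Grouping by day keeps clips from the same recording day out of both the training and test sets of a fold, so each fold evaluates on noise conditions from a day the model has not seen.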
This research paper describes the impacts of a new programming class on first-year, non-CS engineering students. In the spring of 2021, the College of Engineering at Oregon State University piloted a new programming course that will be required of all engineering majors in the following academic years. We used the...
This paper presents a method of implementing real-time apple detection for closed-loop control to approach apples for grabbing. The approach is to train two known real-time object detection networks, Faster R-CNN and YOLOv5, on a novel dataset to verify that it is possible to achieve mean average precision (mAP) above 50%...
Machine Translation, the task of automatically translating between human languages, has been studied for decades. This task used to be solved by count-based statistical models, e.g., Phrase-based Statistical Machine Translation (PBSMT), which solves the translation problem by separately training a statistical language model and a translation model. Recently, Neural...
When my mother’s mother sent her over to the new, western world, she sent her with a suitcase full of spices you couldn’t find at an American grocery store in the 90’s: tamarind and turmeric, coriander root and cumin. Ammu, my mother, had truly packed her entire life and everything...
We do not know how to align a very intelligent AI agent's behavior with human interests. I investigate whether, absent a full solution to this AI alignment problem, we can build smart AI agents which have limited impact on the world, and which do not autonomously seek power. In this thesis, I...
The widespread adoption of computerized systems around the turn of the century as a means of more efficiently conducting elections introduced more issues than these computer systems were intended to address. Though many of these flaws were not considered for years or decades after the introduction of digital election infrastructure,...
Many object recognition applications require detecting and responding to objects drawn from a different distribution from that of the training data. This task is referred to as out-of-distribution (OOD) detection, and it is often formulated as an outlier detection problem wherein the probability distribution of the known data P(X) is...
This algorithm presents the first steps towards a solution for novice database administrators that helps them transform a non-normalized relational database into a database in the third normal form. The algorithm uses relational algebra operations that apply principles from the third normal form. This provides the database administrator with an...
Automatic Music Transcription is a growing area of interest in Music Information Retrieval, and recent research has shown promise using onset detection on spectrograms. We propose a deep learning model that takes raw audio as input in order to transcribe a solo piano performance from audio to MIDI without complicated...
Continuous Improvement (CI) of academic computing programs is a main requirement of accreditation. Academic computing programs must have a well-documented CI plan in order to be granted accreditation. Based on the existing literature, we developed a comprehensive CI (or 360-CI) model consisting of 8 components: course, curriculum, administration, faculty, research,...
All life depends on the reliable translation of RNA to protein according to complex interactions between translation machinery and RNA sequence features. While ribosomal occupancy and codon frequencies vary across coding regions, well-established metrics for computing coding potential of RNA do not capture such positional dependence. Here, we investigate position-dependent...
As one of the most popular data types, the point cloud is widely used in various applications, including computer vision, computer graphics and robotics. The capability to directly measure 3D point clouds is invaluable in those applications as depth information could remove a lot of the segmentation ambiguities in...
This full research paper explores two factors of increasing importance for first-year university engineering curricula: sustainability and diversity. Over the past fifteen years, many universities in the United States have adjusted their engineering programs in response to these two values expressed by industry, professional organizations, and the Accreditation Board for...
Data science is a rapidly growing industry permeating throughout every aspect of society. Everything collects data these days, and people use this data to find meaningful patterns leading to benefits ranging from more intuitive marketing to better cancer detection. However, increased data collection also leads to increased complexity, and data...
Assessing AI systems is difficult. Humans rely on AI systems in increasing ways, both visible and invisible, meaning a variety of stakeholders need a variety of assessment tools (e.g., a professional auditor, a developer, and an end user all have different needs). We posit that it is possible to provide...
University students first learning about computer science (CS) can be intimidated and frustrated by programming. In addition, the general-purpose programming languages chosen for introducing students to programming contain several features that have the potential to overwhelm and distract them from focused curriculum topics, which can lead to reduced retention of...
This thesis seeks to determine whether a self-driving car's behavior should depend on the hearing status of its passengers: namely, hard-of-hearing (including deaf) or hearing. It is believed that auditory deprivation provokes adaptations in visual attention. These adaptations may lead to atypical movement patterns that can translate to different driving...
With the rapid advancement of educational technology and the need for personalized, engaging content to accommodate diverse learning needs, Virtual Reality (VR) holds promise for the present and future. However, VR applications suffer from challenges, including usability concerns, lack of pedagogical value, and evaluation standards. This thesis focuses on two...
Through comparison and analysis of selected holidays in countries with significantly different features (Kyrgyzstan, Russia, and the United States), this study aims to gauge the impact of holiday celebrations on cultural values in the countries studied. K. Zygulski’s classification system for holidays was applied to divide the holidays of...
The core element of democracy is elections. Modern elections not only cost a lot of money to conduct, but society also bears heavy social costs when an election is questioned. For this reason, the US and European countries have been considering ways to innovate by introducing IT...
In computer science, learning abstract fundamental programming concepts requiring students to understand memory management can be very difficult and lead to misunderstandings that carry on into advanced topics. This is especially true in data structures with abstract data types. Understanding how novice students think and reason about data structures is...
Real-world datasets are dirty and contain many errors. Examples of these issues are violations of integrity constraints, duplicates, and inconsistencies in representing data values and entities. Applying machine learning on dirty databases may lead to inaccurate results. Users have to spend a lot of time and effort repairing data errors...
Relational binary operators, such as join, are arguably the most costly and frequently used operations in relational data systems. In many join algorithms, the majority of the process time is spent on scanning and attempting to join the parts of the relations that do not satisfy the join condition and...
Autonomous vehicles bring great societal benefits but also potential impact and disruption to road safety, traffic congestion, and driving behaviors. One important technology that is indispensable to the success of such systems is vehicular networks. Vehicular networks provide the backbone for ensuring communication and connectivity among vehicles, all crucial to...
When contributing to a software system, developers need to understand the rationale for previous design decisions so that they can adhere to the system’s design. Not doing so can lead to erosion of the overall design quality of the system. However, discussions embedded in a large volume of communication on...
As the number of nodes in high-performance computing (HPC) systems continues to grow, it becomes increasingly important to design scalable interconnection network topologies. Prior work has shown promise in adding random shortcuts on top of an existing topology to reduce average hop count and network diameter, but has been limited...
Top-performing approaches to embodied AI tasks like point-goal navigation often rely on training agents via reinforcement learning over tens of millions (or even billions) of experiential steps – learning neural agents that map directly from visual observations to actions. In this work, we question whether these extreme training durations are...
This paper discusses opportunities for developments in spatial clustering methods to help leverage broad scale community science data for building species distribution models (SDMs). SDMs are critical tools that inform the science and policy needed to mitigate the impacts of climate change on biodiversity. Community science data span spatial and...
Data centers (DCs) have been witnessing unprecedented growth in size, number and complexity in recent years. They consist of tens of thousands of servers interconnected by fast network switches, hosting and enabling numerous applications with various traffic characteristics and requirements. As a result, DC networks have been presented with several...
Exemplar-based style transfer techniques reimagine the content of a target image in the style of an exemplar image. Style transfer techniques that make use of patch-based synthesis copy sections of the exemplar to corresponding regions of the target, creating a new image in a patchwork manner. The artistic control one...
The online computer science classroom is growing, but there is little research on how to teach inclusive design online. As a result, online CS students are graduating without learning how to avoid bias in their software designs. Through the lens of the Inclusive Design Pedagogical Content Knowledge (PCK), this thesis...
When a single object is captured by multiple edge devices, the data captured by every edge device could contain both public and private information. The private information in every copy of the data captured has to be preserved while the public information has to be utilized for any further machine...
Rhizome is an information-centric model that uses different interaction methods than traditional desktop systems. I built Rhizome with the specific use case of sharing a photo to observe people using the Rhizome operating system (OS) shell and modern OS shells. The purpose of this is to measure the cognitive load...
Hand detection is a fundamental step for many hand-related computer vision tasks, such as gesture recognition, hand pose estimation, hand sign language translation, and so on. However, robustly detecting hands is a challenging task because of drastic changes in appearance based on finger articulation and changes in lighting conditions, camera...