Intuitively, it seems as though natural language processing tasks might benefit from explicit representations of the syntactic and semantic properties of text. Ontonotes is a dataset which attempts to annotate texts, to represent as much as possible of the meaning of the text explicitly within the annotation. Many tools exist...
It is common practice in the unsupervised anomaly detection literature to create experimental benchmarks by sampling from existing supervised learning datasets. We seek to improve this practice by identifying four dimensions important to real-world anomaly detection applications --- point difficulty, clusteredness of anomalies, relevance of features, and relative frequency of...
How can software practitioners assess whether their software supports diverse users? Although there are empirical processes that can be used to find “inclusivity bugs” piecemeal, what is often needed is a systematic inspection method to assess software’s support for diverse populations. To help fill this gap, this thesis introduces InclusiveMag,...
The ability to create reproducible cryptographically secure keys from temporal environments (e.g., images) has the potential to be a contributor to effective cryptographic mechanisms. Due to the noisy nature of these environments, achieving this goal in a user friendly fashion is a very challenging task, especially since there exists a...
Developers spend a considerable amount of time comprehending code and building accurate mental models of the code. Understanding the relationships between software features within IDEs is difficult, with information split across different visual hierarchies making navigation cumbersome. Canvas-based IDEs mitigate some of the navigation costs by allowing relevant information to...
Newcomers’ seamless onboarding is important for open collaboration communi- ties, particularly those that leverage outsiders’ contributions to remain sustainable. Nevertheless, previous work shows that OSS newcomers often face several barriers to contribute, which lead them to lose motivation and even give up on contributing. A well-known way to help newcomers...
Emergence of highly accurate Convolutional Neural Networks (CNNs) with the capability to process large datasets, has led to their popularity in many applications, including safety/security-sensitive (e.g. disease recognition, self-driving cars). Despite the high accuracy of convolutional neural networks, they have been found to be susceptible to adversarial noise added to...
The performance of deep learning frameworks could be significantly improved through considering the particular underlying structures for each dataset. In this thesis, I summarize our three work about boosting the performance of deep learning models through leveraging structures of the data. In the first work, we theoretically justify that, for...
End-user programming has become widespread. The increasing size of this population and the prevalence of barriers that they face has sparked the development of approaches that promote end-user programing by helping them overcome barriers and teaching them programming. Despite the fact that these approaches have done well in achieving those...
Programming is integrated across the workflow of multiple domains where end-user programmers, those who need to program as a means to an end, regularly need to code. In the modern setting of collaborative development, end-user programmers have to interpret the intentions behind existing code to contribute and build solutions to...