Real-world datasets are dirty and contain many errors. Examples of these issues are violations of integrity constraints, duplicates, and inconsistencies in representing data values and entities. Applying machine learning on dirty databases may lead to inaccurate results. Users have to spend a lot of time and effort repairing data errors...
RNAs play important roles in the central dogma of molecular biology, and are involved in multiple biology processes such as chromatin modification, transcriptional interference and translation initiation. The functions of RNAs, especially non-coding RNAs, are highly related to its secondary structures, therefore computational methods for RNA structure prediction are of...
An important problem in computer graphics is to determine where contour lines and ridges appear in a surface constructed from a triangle mesh. In this presentation we will investigate a new answer to this problem – the horizon measure. The horizon measure determines the likelihood of contour lines to appear...
Simultaneous translation, which translates concurrently with the source language speech, is widely used in many scenarios including multilateral organizations. However, it is well known to be one of the most challenging tasks for humans due to the simultaneous perception and production in two languages. On the other hand, simultaneous translation...
Obesity and metabolic syndrome (MetS) affect over 1.9 billion adults and 12.7 million youth in the US. If all of today’s obese American youth become obese adults, the total societal costs over their lifetime may exceed $1.1 trillion. We show that feeding rodents a high-fat diet supplemented with xanthohumol (XN)...
In areas such as spatial analysis and time series analysis, it is essential to understand and quantify spatial or temporal heterogeneity. In this dissertation, we focus on a spatially varying coefficient model, in which spatial heterogeneity is accommodated by allowing the regression coefficients to vary in a given spatial domain....
With continuing improvements in performance and capability, GPU processing has gained significant and growing interest across science and industry. With this interest, research has increasingly focused upon methods of processing algorithms with stochastic, non-uniform branching while maintaining low divergence. Central among these methods is thread-data remapping (TDR), whereby data is...
Simultaneous speech translation (SimulST) is widely useful in many cross-lingual communication scenarios, including multinational conferences and international traveling. Since text-based simultaneous machine translation (SimulMT) has achieved great success in recent years. The conventional cascaded approach for SimulST uses a pipeline of streaming ASR followed by simultaneous MT but suffers from...
RNAs play important roles in multiple cellular processes, and many of their functions rely on folding to specific structures. To maintain their functions, secondary structures of RNA homologs are conserved across evolution. These conserved structures provide critical targets for diagnostics and therapeutics. Thus, there is a need for developing fast...
We explore the application of deep learning to the disparate fields of natural language processing and computational biology. Both the sentences uttered by humans as well as the RNA and protein sequences found within the cells of their bodies can be considered formal languages in computer science, as sets of...