A copula is a representation of the dependence structure of a multivariate distribution, separate from its marginals. Copulas are used to model multivariate data in many fields. Recent developments include copula models for spatial data and for discrete marginals. We will present a new methodological approach for modeling discrete spatial processes and for predicting the process at unobserved...
DNA microarray technology is a powerful tool for analyzing patterns in gene expression data for thousands of genes. Because of systematic variation in microarray experiments, raw gene expression data are often obscured by undesirable technical noise. Various normalization techniques have been designed in an attempt to remove...
Local signal detection is useful in many scientific areas, such as image processing and speech recognition, for extracting meaningful patterns from noisy signals. In this dissertation, we study estimation and local signal detection for spatial data distributed over irregular domains. In particular, we use bivariate splines defined on triangulations to...
Analysis of observations on sequential events over time is common in real life. Sequential measurements that describe the behavior of a system over time are usually called time series data, and they are collected in a wide range of disciplines. Over the years there have been multiple research areas in studying stochastic...
This thesis consists of three papers which investigate marginal models, nonparametric approaches, generalized mixed effects models and variance components estimation in longitudinal data analysis. In the first paper, a new marginal approach is introduced for high-dimensional cell-cycle microarray data with no replicates. There are two kinds of correlation for cell-cycle...
The recent advent of large-scale microbiome studies enabled by high-throughput sequencing calls for innovative statistical methodologies that are capable of tackling a myriad of challenges presented by microbiome data. Compelled by this need, we focus on the development of statistical tools for two types of problems that arise in microbiome...
We consider two semiparametric regression models for data analysis: the stochastic additive model (SAM) for nonlinear time series data and the additive coefficient model (ACM) for randomly sampled data with nonparametric structure. We employ SCAD-penalized polynomial spline estimation for simultaneous estimation and variable selection in both models. It...
Missing data is one of the major methodological problems in longitudinal studies. It not only reduces the sample size but can also result in biased estimation and inference. It is crucial to correctly understand the missingness mechanism and appropriately incorporate it into the estimation and inference procedures. Traditional methods, such...