This thesis consists of three papers which investigate marginal models, nonparametric approaches, generalized mixed effects models and variance components estimation in longitudinal data analysis. In the first paper, a new marginal approach is introduced for high-dimensional cell-cycle microarray data with no replicates. There are two kinds of correlation for cell-cycle...
Area frame sampling for agricultural statistics is a
procedure currently used by the Statistical Reporting Service of
the US Department of Agriculture as well as by agriculture
departments in other countries. A primary advantage of the area
frame is that it provides complete coverage of the population.
In area frame...
Environmental monitoring poses two challenges to statistical analysis: complex data and complex survey designs. Monitoring for system health involves measuring physical, chemical, and biological properties that have complex relations. Exploring these relations is an integral part of understanding how systems are changing under stress. How does one explore high dimensional...
Multiple linear regression was used to develop equations for 12-,
24-, and 36-hour surface wind forecasts for the wind energy site at
Goodnoe Hills. Equations were derived separately for warm and cool
seasons. The potential predictors included LFM II model output, MOS
surface wind forecasts extrapolated from surrounding stations, pressure...
We propose a new classification method for longitudinal data based on a semiparametric approach. Our approach builds a classifier by taking advantage of modeling information between response and covariates for each class, and assigns a new subject to the class with the smallest quadratic distance. This enables one to overcome...
Differential expression (DE) analysis is a key task in gene expression studies because it uncovers the association between the expression levels of a gene and the covariates of interest. This dissertation pertains to two particular aspects of DE analysis: identifying stably expressed genes for count normalization and accounting for correlation between DE...
An important impact of the genome technology revolution will be the elucidation of mechanisms of cancer pathogenesis, leading to improvements in the diagnosis of cancer and the selection of cancer treatment. Integrated with the extensive, well-studied body of existing knowledge about the role of protein-coding mutations in cancer, demystifying the functional...
The National Agricultural Statistics Service (NASS) conducts quarterly surveys to estimate some primary commodities produced on farms and ranches. The commodities often have highly skewed distributions, with a few farms producing very large amounts. NASS uses dual sampling frames composed of the list frame for efficient stratification and the...
Obtaining accurate estimates of animal abundance is made difficult by the fact that most
animal species are detected imperfectly. Early attempts at building likelihood models that
account for unknown detection probability, however, impose a simplifying assumption unrealistic
for many populations: no births, deaths, immigration or emigration can occur in the...