Respondent-driven sampling (RDS) is used throughout the world to estimate prevalence and population size for hidden populations. Although RDS is an effective method for enrolling people from key populations in studies, it relies on a partially unknown sampling mechanism, and thus each individual’s inclusion probability is unknown. Current estimators for...
In this dissertation we present a compilation of the research conducted during the author’s doctoral program. In the first part, we discuss a case study regarding the impact of scholarships on student success at Oregon State University (OSU). Specifically, we look at the graduation and retention rates and aim to...
In this dissertation I will demonstrate a novel application of self-exciting point process models to mass shooting data. I will also introduce two adaptations to the traditional nonparametric Hawkes process modeling framework. One such modification allows for the estimation of the additional productivity introduced by an event that is not...
Analysis of observations on sequential events over time is common in real life. Sequential measurements that describe the behavior of a system over time are usually called time series data, and they are collected in a wide range of disciplines. Over the years, multiple research areas have emerged for studying stochastic...
Salmonid fish raised in hatcheries often have lower fitness (number of returning adult offspring) than wild fish when both spawn in the wild. Body size at release from hatcheries is positively correlated with survival at sea. So one explanation for reduced fitness is that hatcheries inadvertently select for trait values...
The recent advent of large-scale microbiome studies enabled by high-throughput sequencing calls for innovative statistical methodologies that are capable of tackling a myriad of challenges presented by microbiome data. Compelled by this need, we focus on the development of statistical tools for two types of problems that arise in microbiome...
Multiple hypothesis testing has been a popular topic in statistical research. Although much work has been done, controlling false discoveries remains a challenging task when the corresponding test statistics are dependent. Various methods have been proposed to estimate the false discovery proportion (FDP) under arbitrary dependence among the test...
Agent-based models (ABMs) are widely used in network data analysis, and because of their simple structures and sophisticated outcomes, they serve as good tools for understanding the dynamics of networks. In this thesis, we develop an agent-based dynamic network model, and show that it can replicate the expected degree distribution...
In areas such as spatial analysis and time series analysis, it is essential to understand and quantify spatial or temporal heterogeneity. In this dissertation, we focus on a spatially varying coefficient model, in which spatial heterogeneity is accommodated by allowing the regression coefficients to vary in a given spatial domain....
This dataset contains records of identified benthic and aquatic emergent invertebrates from pre-timber harvest, post-timber harvest, and reference (no harvest) collection sites in the Hinkle Creek watershed. A variety of fish-bearing tributaries, fishless tributaries, and mainstem creek sites were sampled. Benthic invertebrates were collected using Surber nets and emergent adults...