In this dissertation we present a compilation of the research conducted during the author’s doctoral program. In the first part, we discuss a case study regarding the impact of scholarships on student success at Oregon State University (OSU). Specifically, we look at the graduation and retention rates and aim to...
We explore the possibility of estimating sparse inverse covariance matrices when, for scientific reasons, the covariance matrix is restricted to be a non-negative matrix. The process mirrors the graphical lasso procedure developed by Friedman and others (2008), which did not have this additional constraint. Accordingly, the lasso procedure is done through coordinate...
Large wood has been used in many restoration projects to improve in-stream habitat for salmon in the Pacific Northwest. However, the benefits of this practice remain the subject of ongoing debate, and these projects have rarely been evaluated for non-salmonid species such as lamprey. In this study we...
In probability and statistics, Simpson’s paradox is an apparent paradox in which a trend is present in different groups, but is reversed when the groups are combined. Joel Cohen (1986) has shown that continuously distributed lifetimes can never have a Simpson’s paradox. We investigate the same question for discrete random...
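The reversal described above can be checked with a small numeric sketch. The counts, the group names, and the arm labels `A` and `B` below are illustrative assumptions chosen for this example; they are not taken from the dissertation. Exact fractions are used so the comparisons do not depend on floating-point rounding.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts per group and arm.
# These numbers are assumptions for this sketch, not data from the dissertation.
groups = {
    "group_1": {"A": (81, 87), "B": (234, 270)},
    "group_2": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    """Success rate as an exact fraction."""
    return Fraction(successes, trials)

def pooled_rate(arm):
    """Success rate of one arm after combining all groups."""
    successes = sum(groups[g][arm][0] for g in groups)
    trials = sum(groups[g][arm][1] for g in groups)
    return Fraction(successes, trials)

# Within every group, arm A has the higher success rate...
a_wins_within = all(
    rate(*groups[g]["A"]) > rate(*groups[g]["B"]) for g in groups
)
# ...yet arm B has the higher rate once the groups are combined.
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

print(a_wins_within, b_wins_pooled)
```

Both flags come out true for these counts because the two arms are spread very unevenly across the groups, which is the mechanism behind the apparent paradox.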
In this dissertation I will demonstrate a novel application of self-exciting point process models to mass shooting data. I will also introduce two adaptations to the traditional nonparametric Hawkes process modeling framework. One such modification allows for the estimation of the additional productivity introduced by an event that is not...
Methods that are applied to smooth distribution functions are useful in many applications. Areas of application include economics, financial markets, and survival analysis. The empirical cumulative distribution function (ecdf) is unbiased and its asymptotic distribution is normal. However, the jump discontinuities of $\frac{1}{n}$ are undesirable in estimation because they make...
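As a minimal sketch of the ecdf mentioned above, the following builds the step function for a small sample and shows the jump of $\frac{1}{n}$ at an observed value. The helper name `make_ecdf` and the sample values are assumptions made for illustration, and the sample is assumed to have no ties; no smoothing method from the dissertation is implemented here.

```python
from bisect import bisect_right

def make_ecdf(sample):
    """Return the empirical CDF t -> (number of observations <= t) / n."""
    xs = sorted(sample)
    n = len(xs)

    def ecdf(t):
        # bisect_right counts how many sorted observations are <= t.
        return bisect_right(xs, t) / n

    return ecdf

# Hypothetical sample with n = 4 distinct observations.
ecdf = make_ecdf([2.0, 0.5, 1.5, 3.0])

# The function is flat between observations and jumps by 1/n = 0.25 at each one.
jump_at_1_5 = ecdf(1.5) - ecdf(1.4999)
print(ecdf(0.4), ecdf(1.5), ecdf(3.0), jump_at_1_5)
```

A smoothed estimator would replace these steps with a continuous curve; which smoother is used is outside what this sketch covers.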
In many scientific settings, investigators are interested in the effect of a new treatment. To have a point of comparison, data are collected from patients before treatment groups are assigned. When seeking to test for the presence of a significant treatment effect, it may be unclear how baseline measurements should be...
Due to recent advances in computer technology, the cost of collecting and storing data has dropped drastically. This makes it feasible to collect large amounts of information for each data point. This increasing trend in feature dimensionality justifies the need for research on variable selection. Random forest (RF) has demonstrated...
This thesis mainly consists of two parts: (1) comparing statistical modeling methods based on the area-based approach (ABA) for predicting forest inventory attributes using airborne light detection and ranging (LiDAR) data (Chapter 2), and (2) suggesting a new methodology fusing the individual tree detection (ITD) approach and the ABA for...
Nonparametric model-assisted estimators have been proposed to improve estimates of finite population parameters. Owing to the flexibility of nonparametric models, these estimators are more efficient when the parametric model is misspecified. In this dissertation, we derive information criteria to select appropriate auxiliary variables to use in an additive model-assisted method....