In many scientific settings, investigators are interested in the effect of a new treatment. To have a point of comparison data is collected from patients before treatment groups are assigned. When seeking to test the presence of a significant treatment effect it may be unclear how baseline measurements should be...
Diverse scientific fields collect multiple time series data to investigate the dynamical behavior of complex systems: atmospheric and climate science, geophysics, neuroscience, epidemiology, ecology, and environmental science. Identifying patterns of mutual dependence among such data generates valuable knowledge that can be applied either for inferential or forecasting purposes. Vector autoregressive...
Due to recent advances in computer technology, the cost of collecting and storing data has dropped drastically. This makes it feasible to collect large amounts of information for each data point. This increasing trend in feature dimensionality justifies the need for research on variable selection. Random forest (RF) has demonstrated...
Anomaly detection is the task of identifying observations (points) that differ from the majority of other points, which requires some measure of difference, or distance. Many anomaly detection methods rely on “implicit distance” measures: rather than directly calculating an explicitly defined distance, these approaches quantify a point’s “abnormality” by examining...
Bayesian optimization (BO) aims to optimize costly-to-evaluate functions by running a limited number of experiments that each evaluates the function at a selected input. Typical BO formulations assume that experiments are selected sequentially, or in fixed batches. Moreover, these experiments can be executed immediately upon request and have the same...
Changes in the global climate and forest management practices have given rise to increasing numbers and severity of wildfires. More than five million acres burned in the United States in 2017, while in Canada 7.4 million acres burned. In particular, an increasing amount of dead woody biomass is a key...
Randomized trials are the gold standard for the clinical assessment of a new treatment
compared to a placebo or standard of care. Often in clinical trials, patients are
accrued sequentially rather than all at once. Thus, the data from such a trial becomes
available sequentially to the researcher. Monitoring and...
Agent-based models (ABM) are widely used in network data analysis, and due to their simple structures and sophisticated outcomes, they serve as good tools in understanding the dynamics in networks. In this thesis, we develop an agent-based dynamic network model, and show that it can replicate the expected degree distribution...
Humans are remarkably efficient in learning by interacting with other people and observing their behavior. Children learn by watching their parents’ actions and mimic their behavior. When they are not sure about their parents demonstration, they communicate with them, ask questions, and learn from their feedback. On the other hand,...