<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>Theses and Dissertations (Statistics)</title>
<link>http://hdl.handle.net/1957/18468</link>
<description/>
<pubDate>Mon, 20 May 2013 19:52:25 GMT</pubDate>
<dc:date>2013-05-20T19:52:25Z</dc:date>
<item>
<title>Fisher and logistic discriminant function estimation in the presence of collinearity</title>
<link>http://hdl.handle.net/1957/37471</link>
<description>Fisher and logistic discriminant function estimation in the presence of collinearity
O'Donnell, Robert P. (Robert Paul)
The relative merits of the Fisher linear discriminant function&#13;
(Efron, 1975) and logistic regression procedure (Press and Wilson,&#13;
1978; McLachlan and Byth, 1979), applied to the two group&#13;
discrimination problem under conditions of multivariate normality and&#13;
common covariance, have been debated. In related research, DiPillo&#13;
(1976, 1977, 1979) has argued that a biased Fisher linear&#13;
discriminant function is preferable when one or more collinearities&#13;
exist among the classifying variables.&#13;
This paper proposes a generalized ridge logistic regression&#13;
(GRL) estimator as a logistic analog to DiPillo's biased alternative&#13;
estimator. Ridge and Principal Component logistic estimators&#13;
proposed by Schaefer et al. (1984) for conventional logistic&#13;
regression are shown to be special cases of this generalized ridge&#13;
logistic estimator.&#13;
Two Fisher estimators (Linear Discriminant Function (LDF) and&#13;
Biased Linear Discriminant Function (BLDF)) and three logistic&#13;
estimators (Linear Logistic Regression (LLR), Ridge Logistic&#13;
Regression (RLR) and Principal Component Logistic Regression (PCLR))&#13;
are compared in a Monte Carlo simulation under varying conditions of&#13;
distance between populations, training set size and degree of&#13;
collinearity. A new approach to the selection of the ridge parameter&#13;
in the BLDF method is proposed and evaluated.&#13;
The results of the simulation indicate that two of the biased&#13;
estimators (BLDF, RLR) produce smaller MSE values and are more stable&#13;
estimators (smaller standard deviations) than their unbiased&#13;
counterparts. But the improved performance for MSE does not&#13;
translate into equivalent improvement in error rates. The expected&#13;
actual error rates are only marginally smaller for the biased&#13;
estimators. The results suggest that small training set size, rather&#13;
than strong collinearity, may produce the greatest classification&#13;
advantage for the biased estimators.&#13;
The unbiased estimators (LDF, LLR) produce smaller average apparent&#13;
error rates. The relative advantage of the Fisher&#13;
estimators over the logistic estimators is maintained. But, given&#13;
that the comparison is made under conditions most favorable to the&#13;
Fisher estimators, the absolute advantage of the Fisher estimators is&#13;
small. The new ridge parameter selection method for the BLDF&#13;
estimator performs as well as, but no better than, the method used by&#13;
DiPillo.&#13;
The PCLR estimator shows performance comparable to the other&#13;
estimators when there is a high level of collinearity. However, the&#13;
estimator gives up a significant degree of performance in conditions&#13;
where collinearity is not a problem.
Graduation date: 1991
</description>
<pubDate>Thu, 27 Sep 1990 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/1957/37471</guid>
<dc:date>1990-09-27T00:00:00Z</dc:date>
</item>
<item>
<title>Factors affecting bird counts and their influence on density estimates</title>
<link>http://hdl.handle.net/1957/37264</link>
<description>Factors affecting bird counts and their influence on density estimates
McCracken, Marti L.
Variable area surveys are used in large geographic regions to estimate the&#13;
density of birds distributed over a region. If some birds go undetected, a&#13;
measure of the effective area surveyed, the amount of area occupied by the birds&#13;
detected, is needed. The effective area surveyed is determined by observational,&#13;
biological, and environmental factors relating to detectability. It has been&#13;
suggested that density estimates are inaccurate, and that it is risky to compare&#13;
bird populations intraspecifically over time and space, since factors influencing&#13;
bird counts will vary.&#13;
There have been several controversial studies where variable area survey&#13;
density estimates were evaluated using density estimates calculated from spot&#13;
mapping as the standard for comparison. Spot mapping itself is an unproven&#13;
estimator that the previously mentioned factors also influence. Without a known&#13;
population density, it is difficult to assess how the different density estimators&#13;
perform. Variable area surveys of inanimate objects whose densities&#13;
were known have been conducted under controlled circumstances with results&#13;
generally supporting the variable area survey method, but time and inability to&#13;
control for all factors limit the application of this type of study. A simulation&#13;
program that distributes over a region vegetation and a known density of birds,&#13;
and then simulates the process of gathering bird detection data is one tool&#13;
available for evaluating variable area density estimates. Within such a simulation&#13;
study, various observational, biological, and environmental factors could be&#13;
introduced.&#13;
This thesis introduces such a simulation program, VABS, that was written with the objectives of identifying factors that influence bird counts and determining the limitations of the variable area survey. Within this thesis are discussions concerning the several factors that have been identified as influencing bird counts and the effects that these factors had on the Fourier series, exponential power series, and Cum-D density estimates when these factors were simulated in VABS. Critical assumptions of the variable area survey are identified, and the ability of the variable area survey to estimate density for different detectability curves is examined. Also included are discussions on the topics of pooling data gathered under different detectabilities and monitoring population trends.
Graduation date: 1994
</description>
<pubDate>Thu, 22 Jul 1993 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/1957/37264</guid>
<dc:date>1993-07-22T00:00:00Z</dc:date>
</item>
<item>
<title>Uses of Bayesian posterior modes in solving complex estimation problems in statistics</title>
<link>http://hdl.handle.net/1957/36872</link>
<description>Uses of Bayesian posterior modes in solving complex estimation problems in statistics
Lin, Lie-fen
In Bayesian analysis, means are commonly used to&#13;
summarize Bayesian posterior distributions. Problems with&#13;
a large number of parameters often require numerical&#13;
integrations over many dimensions to obtain means. In this&#13;
dissertation, posterior modes with respect to appropriate&#13;
measures are used to summarize Bayesian posterior&#13;
distributions, using the Newton-Raphson method to locate&#13;
modes. Further inference of modes relies on the normal&#13;
approximation, using asymptotic multivariate normal&#13;
distributions to approximate posterior distributions. These&#13;
techniques are applied to two statistical estimation&#13;
problems.&#13;
First, Bayesian sequential dose selection procedures&#13;
are developed for Bioassay problems using Ramsey's prior&#13;
[28]. Two adaptive designs for Bayesian sequential dose&#13;
selection and estimation of the potency curve are given.&#13;
The relative efficiency is used to compare the adaptive&#13;
methods with other non-Bayesian methods (Spearman-Karber,&#13;
up-and-down, and Robbins-Monro) for estimating the ED50.&#13;
Second, posterior distributions of the order of an&#13;
autoregressive (AR) model are determined following Robb's&#13;
method (1980). Wolfer's sunspot data is used as an example&#13;
to compare the estimation results with FPE, AIC, BIC, and&#13;
CIC methods. Both Robb's method and the normal&#13;
approximation for estimation of the order have full&#13;
posterior results.
Graduation date: 1992
</description>
<pubDate>Tue, 17 Mar 1992 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/1957/36872</guid>
<dc:date>1992-03-17T00:00:00Z</dc:date>
</item>
<item>
<title>Analysis of epidemiological data with covariate errors</title>
<link>http://hdl.handle.net/1957/36494</link>
<description>Analysis of epidemiological data with covariate errors
Delongchamp, Robert
In regression analysis, random errors in an explanatory variable cause the&#13;
usual estimates of its regression coefficient to be biased. Although this problem has&#13;
been studied for many years, routine methods have not emerged. This thesis&#13;
investigates some aspects of this problem in the setting of analysis of epidemiological&#13;
data.&#13;
A major premise is that methods to cope with this problem must account for&#13;
the shape of the frequency distribution of the true covariable, e.g., exposure. This is&#13;
not widely recognized, and many existing methods focus only on the variability of the&#13;
true covariable, rather than on the shape of its distribution. Confusion about this&#13;
issue is exacerbated by the existence of two classical models, one in which the&#13;
covariable is a sample from a distribution and the other in which it is a collection of&#13;
fixed values. A unified approach is taken here, in which for the latter of these models&#13;
more attention than usual is given to the frequency distribution of the fixed values.&#13;
In epidemiology the distribution of exposures is often very skewed, making&#13;
these issues particularly important. In addition, the data sets can be very large, and&#13;
another premise is that differences in the performance of methods are much greater&#13;
when the samples are very large.&#13;
Traditionally, methods have largely been evaluated by their ability to remove&#13;
bias from the regression estimates. A third premise is that in large samples there may&#13;
be various methods that will adequately remove the bias, but they may differ widely in&#13;
how nearly they approximate the estimates that would be obtained using the&#13;
unobserved true values.&#13;
A collection of old and new methods is considered, representing a variety of&#13;
basic rationales and approaches. Some comparisons among them are made on&#13;
theoretical grounds provided by the unified model. Simulation results are given which&#13;
tend to confirm the major premises of this thesis. In particular, it is shown that the&#13;
performance of one of the most standard approaches, the "correction for attenuation"&#13;
method, is poor relative to other methods when the sample size is large and the&#13;
distribution of covariables is skewed.
Graduation date: 1993
</description>
<pubDate>Thu, 18 Feb 1993 00:00:00 GMT</pubDate>
<guid isPermaLink="false">http://hdl.handle.net/1957/36494</guid>
<dc:date>1993-02-18T00:00:00Z</dc:date>
</item>
</channel>
</rss>
