Graduate Thesis Or Dissertation
 

Species Distribution Modeling of Citizen Science Data as A Classification Problem with Class-Conditional Label Noise

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/8k71nn952

Descriptions

Abstract
  • Species distribution models (SDMs), which quantify the correlation between a species' distribution and environmental factors, are increasingly used to map and monitor animal and plant distributions amid growing awareness of environmental change and its ecological consequences. With perfect data, this is a straightforward classification problem that maps environmental features to presence or absence labels. With imperfect data, such as the citizen science data from eBird, in which volunteers report the locations where they observed or failed to observe sets of species, observer mistakes introduce label noise. In this case, both the class features and the observation features can be sources of false positive and false negative noise. However, few common modeling approaches for this task address these sources of noise explicitly. In this work, I explore the idea of treating the task as a classification problem with class-conditional label noise. By leveraging additional information about observation features, the proposed model significantly outperforms other candidates when sufficient data is available. I describe the conditions under which the parameters of the proposed model are identifiable, explore the impact of model misspecification, and apply the model to simulated data and to real data from the eBird citizen science project.
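
The idea of class-conditional label noise mentioned in the abstract can be illustrated with a small sketch. This is a generic textbook formulation, not necessarily the exact model developed in the thesis: the probability that a checklist reports a species combines the probability of true presence with a false positive rate and a false negative rate. The function names, the logistic link for the false negative rate, the `effort_hours` feature, and all numeric values are assumptions made purely for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def reported_presence_prob(p_present: float, fp_rate: float, fn_rate: float) -> float:
    """Probability that a checklist reports the species as present.

    p_present : P(species truly present | environmental features)
    fp_rate   : P(reported present | truly absent), the false positive rate
    fn_rate   : P(reported absent | truly present), the false negative rate
    """
    return (1.0 - fn_rate) * p_present + fp_rate * (1.0 - p_present)


# Hypothetical checklist: all numbers below are made up for illustration.
p_present = 0.6      # assumed output of an occupancy classifier
fp_rate = 0.05       # assumed constant misidentification rate
effort_hours = 1.0   # assumed observation feature

# Assumed link: more search effort lowers the chance of missing the species.
fn_rate = sigmoid(0.5 - 1.0 * effort_hours)

print(reported_presence_prob(p_present, fp_rate, fn_rate))
```

With both noise rates set to zero, the reported probability reduces to the true presence probability, which corresponds to the perfect-data case described in the abstract.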
