Graduate Thesis Or Dissertation


Anomaly Detection and Probabilistic Diagnosis for Automated Data Quality Control


Abstract
  • Advances in sensor technology are greatly expanding the range of quantities that can be measured while simultaneously reducing the cost of measurement. However, deployed sensors drift out of calibration and fail, so every sensor network requires quality control procedures to detect these failures promptly. To address this problem, we propose a two-level architecture, SENSOR-DX, for automated quality control. SENSOR-DX is based on defining a collection of views of the network, where each view captures the behavior of one or more sensors at one or more sites over a specified time interval. The lower level of SENSOR-DX consists of anomaly detectors, one trained for each view, that produce an anomaly score from the sensor readings in that view. The upper level of SENSOR-DX performs probabilistic reasoning over these anomaly scores to infer which individual sensors are malfunctioning. SENSOR-DX thus combines two strengths: modeling correlations among multiple sensors improves the detection of sensor failures, and probabilistic inference determines which individual sensors are responsible. This dissertation also studies two subproblems that arise as part of SENSOR-DX. First, the data collected from sensors may contain missing values, which existing anomaly detection methods cannot handle. We studied various methods for addressing this and concluded that two of them, proportional distribution and imputation, work best. Second, some weather variables, such as precipitation, have difficult probability distributions that are not handled well by general-purpose anomaly detection methods. We study special-purpose models that predict the amount of precipitation at each station as a function of the precipitation observed at neighboring stations. We find that a conditional mixture model gives the most effective anomaly detection for this task.
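The two-level idea in the abstract can be illustrated with a minimal sketch. It is not the SENSOR-DX implementation: the abstract does not specify the probabilistic model, so the independent failure prior, the thresholded view flags, the detection and false-alarm rates, and the names `diagnose`, `prior`, `p_detect`, and `p_false` are all assumptions made here for illustration. The lower level is reduced to a boolean flag per view, and the upper level computes the posterior probability that each sensor is broken by enumerating every combination of sensor states.

```python
from itertools import product


def diagnose(views, flags, prior=0.05, p_detect=0.9, p_false=0.05):
    """Posterior P(sensor broken | view flags), by brute-force enumeration.

    views: list of tuples of sensor names, one tuple per view (hypothetical layout)
    flags: list of bools, True where that view's anomaly score crossed a threshold
    prior, p_detect, p_false: assumed rates, not values taken from the thesis
    """
    sensors = sorted({s for view in views for s in view})
    posterior = {s: 0.0 for s in sensors}
    total = 0.0
    for state in product([False, True], repeat=len(sensors)):
        broken = dict(zip(sensors, state))
        # Prior: each sensor fails independently with probability `prior`.
        weight = 1.0
        for s in sensors:
            weight *= prior if broken[s] else 1.0 - prior
        # Likelihood: a view containing a broken sensor is flagged with
        # probability p_detect; a fully healthy view with probability p_false.
        for view, flagged in zip(views, flags):
            p_flag = p_detect if any(broken[s] for s in view) else p_false
            weight *= p_flag if flagged else 1.0 - p_flag
        total += weight
        for s in sensors:
            if broken[s]:
                posterior[s] += weight
    # Normalize the accumulated weights into marginal probabilities.
    return {s: posterior[s] / total for s in sensors}


# Three views over two sensors: A alone, B alone, and A with B.
views = [("A",), ("B",), ("A", "B")]
flags = [True, False, True]
post = diagnose(views, flags)
print(post)
```

In this example the single-sensor view of A and the joint view are flagged while the single-sensor view of B is not, so the posterior blames A far more than B. Enumeration is exponential in the number of sensors, so a real network would need a structured inference method instead.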
Funding Statement
  • This thesis is based upon work supported by the National Science Foundation under Grant No. 1514550.

In Collection: