Anomaly Detection Meta-Analysis Benchmarks

dc.creator Emmott, Andrew
dc.creator Das, Shubhomoy
dc.creator Dietterich, Thomas
dc.creator Fern, Alan
dc.creator Wong, Weng-Keen
dc.date.accessioned 2016-06-10T15:33:15Z
dc.date.available 2016-06-10T15:33:15Z
dc.date.issued 2016-06-10
dc.identifier.uri http://hdl.handle.net/1957/59114
dc.description Benchmarks are derived from several data sets found at the UC Irvine Machine Learning Repository: https://archive.ics.uci.edu/ml/index.html en_US
dc.description [Suggested Citation: Emmott, Andrew; Das, Shubhomoy; Dietterich, Thomas; Fern, Alan; Wong, Weng-Keen (2016): Anomaly Detection Meta-Analysis Benchmarks. Oregon State University Libraries & Press. Dataset. http://dx.doi.org/10.7267/N97H1GGX] en_us
dc.description This dataset is hosted at: https://oregonstate.app.box.com/s/5txnlc6z7aifsptu415vrxc03bhhm40x
dc.description See the related paper: Emmott A, Das S, Dietterich T, Fern A, Wong WK (2016). A meta-analysis of the anomaly detection problem. arXiv:1503.01158v2
dc.description.abstract This article provides a thorough meta-analysis of the anomaly detection problem. To accomplish this we first identify approaches to benchmarking anomaly detection algorithms across the literature and produce a large corpus of anomaly detection benchmarks that vary in their construction across several dimensions we deem important to real-world applications: (a) point difficulty, (b) relative frequency of anomalies, (c) clusteredness of anomalies, and (d) relevance of features. We apply a representative set of anomaly detection algorithms to this corpus, yielding a very large collection of experimental results. We analyze these results to understand many phenomena observed in previous work. First we observe the effects of experimental design on experimental results. Second, results are evaluated with two metrics, ROC Area Under the Curve and Average Precision. We employ statistical hypothesis testing to demonstrate the value (or lack thereof) of our benchmarks. We then offer several approaches to summarizing our experimental results, drawing several conclusions about the impact of our methodology as well as the strengths and weaknesses of some algorithms. Last, we compare results against a trivial solution as an alternate means of normalizing the reported performance of algorithms. The intended contributions of this article are many; in addition to providing a large publicly-available corpus of anomaly detection benchmarks, we provide an ontology for describing anomaly detection contexts, a methodology for controlling various aspects of benchmark creation, guidelines for future experimental design and a discussion of the many potential pitfalls of trying to measure success in this field. Link to the dataset can be found below. en_US
dc.description.sponsorship Defense Advanced Research Projects Agency (DARPA) en_US
dc.language.iso en_US en_US
dc.rights CC0 1.0 Universal *
dc.rights.uri http://creativecommons.org/publicdomain/zero/1.0/ *
dc.subject Anomaly Detection en_US
dc.subject Benchmarks en_US
dc.subject Machine Learning en_US
dc.subject Artificial Intelligence en_US
dc.subject Meta-Analysis en_US
dc.subject Point Difficulty en_US
dc.subject Relative Frequency en_US
dc.subject Clusteredness en_US
dc.subject Feature Irrelevance en_US
dc.title Anomaly Detection Meta-Analysis Benchmarks en_US
dc.type Dataset en_US
dc.description.peerreview yes en_US
dcterms.bibliographicCitation Emmott, Andrew; Das, Shubhomoy; Dietterich, Thomas; Fern, Alan; Wong, Weng-Keen (2016): Anomaly Detection Meta-Analysis Benchmarks. Oregon State University Libraries & Press. Dataset. http://dx.doi.org/10.7267/N97H1GGX
dcterms.isCitedBy This dataset is cited by: Emmott A, Das S, Dietterich T, Fern A, Wong WK (2016). A meta-analysis of the anomaly detection problem. arXiv:1503.01158v2 https://arxiv.org/abs/1503.01158
dcterms.isCitedBy arXiv:1503.01158v2
dcterms.identifier 10.7267/N97H1GGX
dcterms.identifierType DOI
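
The abstract names two evaluation metrics, ROC Area Under the Curve and Average Precision. As a rough illustration only (this is not the authors' code), both can be sketched in plain Python. The function names `roc_auc` and `average_precision` are hypothetical, and the conventions that label 1 marks an anomaly and that a higher score means "more anomalous" are assumptions made for this sketch, not details taken from the record.

```python
def roc_auc(scores, labels):
    """Fraction of (anomaly, nominal) pairs ranked correctly; ties count half.

    Assumption: labels use 1 for anomaly and 0 for nominal, and a higher
    score means the detector considers the point more anomalous.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one nominal point")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values measured at the rank of each anomaly.

    Assumption: tied scores keep their input order (no special tie handling).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one anomaly")
    return precision_sum / hits


# Made-up scores for illustration; these are not values from the benchmark corpus.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
example_labels = [1, 0, 1, 0, 0]
print(roc_auc(example_scores, example_labels))
print(average_precision(example_scores, example_labels))
```

The pairwise AUC loop is quadratic in the number of points, which keeps the sketch short; a rank-based computation would be the usual choice for benchmarks of realistic size.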


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • Datasets
    The central location for datasets in ScholarsArchive@OSU.

Except where otherwise noted, this item's license is described as CC0 1.0 Universal.
