Technical Report
 

Exploiting symmetry properties in the evaluation of inductive learning algorithms: an empirical domain-independent comparative study

https://ir.library.oregonstate.edu/concern/technical_reports/br86bc065

Abstract
Although numerous Boolean concept learning algorithms have been introduced in the literature, little is known about which categories of concepts most of these algorithms actually learn satisfactorily. Conventional comparison studies, which test various algorithms in some chosen domain, do not provide such information, since their conclusions are limited to the domain considered. A more general way to evaluate a learning algorithm is to test it on all the possible concepts defined on a given number of Boolean features. However, this immediately leads to unaffordable computational costs, since we must consider as many as 2^(2^n) concepts when the number of features is n. In [D89], experiments of this type were reported for the case of three features, while the cases of four or more features were concluded to be infeasible. This paper builds directly on the work of [D89]. We introduce two techniques that significantly cut the computational cost of the desired experiments and enable us to perform experiments over the space of concepts defined on up to five variables. The first technique exploits the fact that inductive learning algorithms are generally insensitive to permuting and/or complementing the features of the domain. We give a method for eliminating redundancy in the experiments by computing a set of representative concepts that suffices to characterize the behavior of a given algorithm over the space of all concepts. The second technique resorts to statistical approximation to avoid running algorithms on all the possible samples of a concept. We show that testing a feasibly small number of samples suffices to obtain results with a high level of confidence. Applying these techniques, we report experimental results analogous to those of [D89] for some decision tree building algorithms over five Boolean features. The results we present are rather surprising and demonstrate that there is still much to be learned about the algorithms we tested. The paper also discusses the possibility of extending the above techniques to the cases of six or more Boolean features.
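
The first technique can be illustrated with a minimal sketch (not the report's actual implementation; all names and encodings here are illustrative). A Boolean concept on n features is encoded as a truth table of 2^n output bits, two concepts are treated as equivalent if one can be obtained from the other by permuting and/or complementing the features, and a canonical representative is taken to be the lexicographically smallest truth table in the equivalence class. For n = 3 this enumerates all 2^(2^3) = 256 concepts and groups them into representative classes:

```python
from itertools import permutations

def transform_truth_table(tt, n, perm, comp_mask):
    """Apply a feature permutation and complementation to a truth table.

    tt: tuple of 2**n output bits, indexed by the integer encoding of the input.
    perm: tuple mapping each feature position of the new function to a feature
          position of the original function.
    comp_mask: integer whose bit j is 1 if feature j is complemented.
    """
    size = 1 << n
    new_tt = [0] * size
    for x in range(size):
        # Rebuild the input of the original function that corresponds to x.
        y = 0
        for j in range(n):
            bit = (x >> j) & 1
            if (comp_mask >> j) & 1:
                bit ^= 1
            if bit:
                y |= 1 << perm[j]
        new_tt[x] = tt[y]
    return tuple(new_tt)

def canonical_form(tt, n):
    """Smallest truth table over all feature permutations/complementations."""
    return min(
        transform_truth_table(tt, n, perm, mask)
        for perm in permutations(range(n))
        for mask in range(1 << n)
    )

if __name__ == "__main__":
    n = 3
    representatives = set()
    for f in range(1 << (1 << n)):  # all 2**(2**n) concepts on n features
        tt = tuple((f >> i) & 1 for i in range(1 << n))
        representatives.add(canonical_form(tt, n))
    print(f"{1 << (1 << n)} concepts on {n} features, "
          f"{len(representatives)} representatives under permutation/complementation")
```

Running an experiment once per representative class, rather than once per concept, is what makes the jump from three to five features affordable, since the permutation/complementation group on n features has n! * 2^n elements and collapses the concept space by roughly that factor.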
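
The abstract does not spell out the statistical procedure behind the second technique, so the following is only an assumed, standard illustration: a Hoeffding-style bound on how many independently drawn samples of a concept are needed before an observed success rate lies within ±epsilon of the true rate with confidence 1 − delta.

```python
import math

def required_trials(epsilon, delta):
    """Hoeffding-style bound: number of independent trials needed so that the
    observed success rate is within +/- epsilon of the true rate with
    probability at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

if __name__ == "__main__":
    # Estimate a rate to within 5 percentage points at 99% confidence.
    print(required_trials(0.05, 0.01))  # -> 1060
```

Under these illustrative parameters, roughly a thousand trials per representative concept suffice, which is far smaller than enumerating every possible sample of the concept.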