Improving SVM accuracy by training on auxiliary data sources

Oregon State University. Dept. of Computer Science; Wu, Pengcheng; Dietterich, Thomas Glen

Technical Report

Citeable URL: https://ir.library.oregonstate.edu/concern/technical_reports/q237ht16k

Descriptions

Attribute Name	Values
Creator	Oregon State University. Dept. of Computer Science; Wu, Pengcheng; Dietterich, Thomas Glen
Abstract	The standard model of supervised learning assumes that training and test data are drawn from the same underlying distribution. This paper explores an application in which a second, auxiliary, source of data is available drawn from a different distribution. This auxiliary data is more plentiful, but of significantly lower quality, than the training and test data. In the SVM framework, a training example has two roles: (a) as a data point to constrain the learning process and (b) as a candidate support vector that can form part of the definition of the classifier. The paper considers using the auxiliary data in either (or both) of these roles. This auxiliary data framework is applied to a problem of classifying images of leaves of maple and oak trees using a kernel derived from the shapes of the leaves. Experiments show that when the training data set is very small, training with auxiliary data can produce large improvements in accuracy, even when the auxiliary data is significantly different from the training (and test) data. The paper also introduces techniques for adjusting the kernel scores of the auxiliary data points to make them more comparable to the training data points.
Resource Type	Research Paper
Date Available	2012-12-03T17:12:42+00:00
Date Issued	2004
Series	Technical report (Oregon State University. Department of Computer Science)
Subject	Supervised learning (Machine learning); Leaves -- Identification -- Computer programs
Rights Statement	Copyright Not Evaluated
Publisher	Corvallis, OR : Oregon State University, Dept. of Computer Science
Peer Reviewed	No
Language	English [eng]
Replaces	http://hdl.handle.net/1957/35379
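
The two roles the abstract assigns to a training example, (a) a constraint on learning and (b) a candidate support vector, can be illustrated with a small sketch. This is not the report's formulation: it assumes a Gaussian kernel in place of the leaf-shape kernel, plain subgradient descent on a weighted hinge loss in place of a quadratic-programming solver, and a simple L2 penalty on the coefficients. The names `fit`, `predict`, `weights`, and the toy data are all hypothetical.

```python
import math

def rbf(a, b, gamma=0.5):
    """Gaussian kernel; an assumed stand-in for the report's leaf-shape kernel."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def fit(constraints, weights, candidates, lam=0.01, lr=0.05, epochs=200):
    """Subgradient descent on a weighted hinge loss (illustrative only).

    constraints: (x, y) pairs whose margins are penalised      -> role (a)
    weights:     per-constraint penalty (hypothetical knob)
    candidates:  points allowed in the kernel expansion        -> role (b)
    """
    gram = [[rbf(x, c) for c in candidates] for x, _ in constraints]
    alpha, bias = [0.0] * len(candidates), 0.0
    for _ in range(epochs):
        grad = [lam * a for a in alpha]  # simplified L2 penalty on coefficients
        grad_bias = 0.0
        for row, (_, y), w in zip(gram, constraints, weights):
            score = sum(a * k for a, k in zip(alpha, row)) + bias
            if y * score < 1.0:  # margin violated, hinge subgradient is active
                for j, k in enumerate(row):
                    grad[j] -= w * y * k
                grad_bias -= w * y
        alpha = [a - lr * g for a, g in zip(alpha, grad)]
        bias -= lr * grad_bias
    return alpha, bias

def predict(model, candidates, x):
    """Sign of the kernel expansion over the candidate support vectors."""
    alpha, bias = model
    score = sum(a * rbf(x, c) for a, c in zip(alpha, candidates)) + bias
    return 1 if score >= 0 else -1

# Toy data invented for illustration: a few clean examples, more auxiliary ones.
train = [((2.0, 2.0), 1), ((2.5, 1.5), 1), ((-2.0, -2.0), -1), ((-2.5, -1.5), -1)]
aux = [((1.5, 2.5), 1), ((3.0, 2.0), 1), ((2.0, 1.0), 1),
       ((-1.5, -2.5), -1), ((-3.0, -2.0), -1), ((-2.0, -1.0), -1)]

# Auxiliary data in both roles, penalised less heavily than the training data.
constraints = train + aux
weights = [1.0] * len(train) + [0.3] * len(aux)
candidates = [x for x, _ in train + aux]
model = fit(constraints, weights, candidates)
```

Passing `train` alone as `constraints` while keeping the auxiliary points in `candidates` would use the auxiliary data only as candidate support vectors; the reverse choice would use it only as constraints.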

Relationships

Parents: This work has no parents.

Items

Title	Date Uploaded	Visibility
2004-23.pdf	2017-07-18	Public

ScholarsArchive@OSU
