The Peaking Phenomenon in Semi-supervised Learning

Conference paper (2016)

Authors

J.H. Krijthe (Pattern Recognition and Bioinformatics, Leiden University Medical Center)

M. Loog (University of Copenhagen, Pattern Recognition and Bioinformatics)

Research Group

Pattern Recognition and Bioinformatics (TU Delft)

Keywords

Peaking, Semi-supervised learning, Least squares classifier, Pseudo-inverse

To reference this document use:

http://resolver.tudelft.nl/uuid:eedff80e-b79c-400e-817f-de76ef50e3d9

Published Date

2016

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Electrical Engineering, Mathematics and Computer Science

Department

Intelligent Systems

Abstract

For the supervised least squares classifier, when the number of training objects is smaller than the dimensionality of the data, adding more data to the training set may first increase the error rate before decreasing it. This possibly counterintuitive phenomenon is known as peaking. In this work, we observe that a similar but more pronounced version of this phenomenon also occurs in the semi-supervised setting, where unlabeled rather than labeled objects are added to the training set. Through simulation studies and an approximation of the learning curve based on the work of Raudys and Duin, we explain why the learning curve has a steeper incline and a more gradual decline in this setting.
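
The supervised peaking effect described in the abstract can be reproduced with a small simulation. The following is a minimal sketch and not the authors' experimental setup: it assumes two Gaussian classes with identity covariance and means at +mu and -mu, omits the intercept, and fits the least squares classifier with NumPy's pseudo-inverse. The dimensionality, training set sizes, mean offset, and the function name peaking_curve are all illustrative choices.

```python
import numpy as np


def peaking_curve(d=20, sizes=(10, 20, 80), reps=50, n_test=2000, seed=0):
    """Mean test error of a pseudo-inverse least squares classifier per training size.

    All defaults are illustrative assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(d)
    mu[0] = 1.5  # assumed offset: classes are N(+mu, I) and N(-mu, I)

    def sample(n):
        # Draw labels in {-1, +1} and shift standard normal noise by the class mean.
        y = rng.choice([-1.0, 1.0], size=n)
        X = y[:, None] * mu + rng.standard_normal((n, d))
        return X, y

    errors = {}
    for n in sizes:
        errs = []
        for _ in range(reps):
            X, y = sample(n)
            # Least squares weights via the pseudo-inverse: the minimum-norm
            # solution when n < d, the ordinary least squares solution when n > d.
            w = np.linalg.pinv(X) @ y
            X_test, y_test = sample(n_test)
            errs.append(np.mean(np.sign(X_test @ w) != y_test))
        errors[n] = float(np.mean(errs))
    return errors


curve = peaking_curve()
for n, err in curve.items():
    print(f"n = {n:3d}: mean test error = {err:.3f}")
```

Under these assumptions, the mean error would be expected to be highest near n = d (20 here), where the pseudo-inverse solution is most sensitive to noise, and lower on either side. The sketch covers only the supervised case; the semi-supervised variant studied in the paper is not implemented here.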