Active learning has the potential to reduce labeling costs in terms of both time and money. In practice, active learning serves as an efficient data labeling strategy. Another way to view active learning is as a learning problem in which the training data is queried by the active learner. Under this perspective, an important question is inconsistency: can classifiers trained with active learning converge to the same result as those trained with random sampling, given an infinite amount of data? In this paper, we discuss the possibility and potential consequences of using sampling settings other than sampling without replacement in active learning to analyze the inconsistency problem. Moreover, a third sampling setting is defined to simulate the infinite-data scenario underlying inconsistency. We compare the traditional setting, sampling without replacement in active learning, with sampling with replacement in active learning and with true active learning. These two unusual sampling settings provide insight into the inconsistency problem: (1) a regularization parameter left unadjusted can lead to inconsistency, and (2) querying data "really" close to the decision boundary can also pose a threat to active learning.
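The sketch below is not the paper's implementation; it is a minimal, hypothetical illustration of how the three sampling settings contrasted above differ only in how the unlabeled pool is maintained after each query. The labeling rule, the fixed margin-based scorer, and all names (`query_once`, `oracle`, `uncertainty`) are assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code) of the three query settings:
# sampling without replacement, sampling with replacement, and a
# "true active learning" setting that redraws the pool from the data
# distribution to mimic an infinite pool.
import numpy as np

rng = np.random.default_rng(0)

def oracle(x):
    # Hypothetical labeling rule: sign of the first feature.
    return int(x[0] > 0)

def uncertainty(pool, w):
    # Margin-based uncertainty for a linear scorer: smaller |w.x| means
    # closer to the decision boundary, hence more uncertain.
    return -np.abs(pool @ w)

def query_once(pool, labeled_X, labeled_y, w, setting):
    i = int(np.argmax(uncertainty(pool, w)))  # most uncertain pool point
    x = pool[i]
    labeled_X.append(x)
    labeled_y.append(oracle(x))
    if setting == "without_replacement":
        pool = np.delete(pool, i, axis=0)     # queried point leaves the pool
    elif setting == "with_replacement":
        pass                                  # point stays; may be queried again
    elif setting == "true_active_learning":
        # Simulate an infinite pool: redraw all candidates after every query.
        pool = rng.normal(size=pool.shape)
    return pool, labeled_X, labeled_y

# Usage example: 5 queries under each setting with a fixed scorer w.
for setting in ("without_replacement", "with_replacement", "true_active_learning"):
    pool = rng.normal(size=(200, 2))
    X, y, w = [], [], np.array([1.0, 0.0])
    for _ in range(5):
        pool, X, y = query_once(pool, X, y, w, setting)
    print(setting, "pool size after 5 queries:", len(pool))
```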