Generalizability of Deep Domain Adaptation in case of Sample Selection Bias
Abstract
Domain adaptation allows machine learning models to perform well in a domain that differs from the available training data. This non-trivial task is approached in many ways and often relies on assumptions about the source (train) and target (test) domains. Unsupervised domain adaptation uses unlabeled target data to mitigate a shift or bias between the domains. Deep domain adaptation (DDA) is a powerful class of these methods that uses deep learning to extract high-level features which are shared across the domains and robust against the shift. These algorithms adapt to a specific target domain, which has two possible downsides. Firstly, the model might not generalize and may therefore require retraining for each new domain. Secondly, obtaining data for the target domain(s) might be difficult. There are situations in which both the source and target domains originate from a “global” domain, from which the samples are selected in a biased way. We explore a new application of existing DDA methods and answer the question: how effective is deep domain adaptation when adapting with the global domain, instead of the target domain, in the case of sample selection bias? Results on synthetic data show that where target adaptation works, global adaptation also improves accuracy compared to supervised learning, although to a lesser extent.
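To make the setting concrete, the sketch below (not taken from the thesis) constructs a synthetic global domain, draws biased source and target subsets from it via sample selection bias, and indicates where a DDA trainer would receive either the unlabeled target set (target adaptation) or the global pool (global adaptation). The trainer name `train_dann` is hypothetical; any domain-adversarial or discrepancy-based DDA method could stand in.

```python
import numpy as np

rng = np.random.default_rng(0)

# Global domain: two Gaussian features, label determined by a linear boundary.
n = 10_000
X_global = rng.normal(size=(n, 2))
y_global = (X_global[:, 0] + X_global[:, 1] > 0).astype(int)

def biased_subset(X, y, weights, size, rng):
    """Draw a subset whose selection probability is proportional to `weights`,
    i.e. a sample-selection-biased view of the global domain."""
    p = weights / weights.sum()
    idx = rng.choice(len(X), size=size, replace=False, p=p)
    return X[idx], y[idx]

# Source over-represents low values of the first feature, target high values;
# both are selected from the same global pool.
w_source = np.exp(-X_global[:, 0])
w_target = np.exp(+X_global[:, 0])
X_s, y_s = biased_subset(X_global, y_global, w_source, 2000, rng)
X_t, _ = biased_subset(X_global, y_global, w_target, 2000, rng)

# Target adaptation: the DDA method aligns features with the unlabeled target set.
# model_target = train_dann(X_s, y_s, unlabeled=X_t)        # hypothetical trainer
# Global adaptation (the setting studied here): align with the global pool instead,
# so the adapted model is not tied to one specific biased target.
# model_global = train_dann(X_s, y_s, unlabeled=X_global)    # hypothetical trainer
```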