Exploring the Potential of Performance Bounds in Multi-Source Domain Adaptation

Abstract

Trained machine learning models are now readily available, but their training data may not be, for example for privacy reasons. This thesis investigates how pre-trained models can be combined to perform well on all of their source domains without access to that data. The problem is formulated as a Multiple-Source Domain Adaptation (MSA) setting, in which models trained on source domains are combined so that the combiner is robust when applied to any unknown target domain. The thesis explores the MSA setting and presents a perspective on MSA theory from the literature. The issue in the MSA setting is that target models are not robust in general, which leads to negative transfer. First, this issue is illustrated with examples of the source models and of linearly weighted combinations of the source models. Next, existing theory that guarantees the existence of a robust model is investigated. It is argued that a performance bound can potentially be extended from the perspective of additional knowledge: in addition to the source models available in the MSA setting, the model might use some additional knowledge of the source domains. The assumptions of existing MSA theory are clarified and the theory is split in two. One half is inherent to the MSA setting and guarantees a model whose robustness property is its loss on a matching mixture. The other half assumes a combiner that uses additional knowledge of the source domains, for which the robustness is proven satisfactory. Finally, it is investigated what makes additional knowledge useful in the MSA setting. Current literature assumes a specific target model, the distribution-weighted (DW) combiner, which is viewed as using the sources' joint distributions as additional knowledge. It is argued that knowledge of the training process of the source models can also be used as additional knowledge.
In conclusion, this thesis discusses how robustness in the MSA setting can be improved beyond that of the source models by basing the combiner on additional knowledge of the source domains.
