Domain Adaptation for Rare Classes Augmented with Synthetic Samples

Tuhin Das, Robert-Jan Bruintjes, Attila Lengyel, Jan van Gemert, Sara Beery
ArXiv (ArXiv)


@article{das2021domain, title={Domain Adaptation for Rare Classes Augmented with Synthetic Samples}, author={Das, Tuhin and Bruintjes, Robert-Jan and Lengyel, Attila and van Gemert, Jan and Beery, Sara}, journal={arXiv preprint arXiv:2110.12216}, year={2021} }


To alleviate lower classification performance on rare classes in imbalanced datasets, a possible solution is to augment the underrepresented classes with synthetic samples. Domain adaptation can be incorporated in a classifier to decrease the domain discrepancy between real and synthetic samples. While domain adaptation is generally applied on completely synthetic source domains and real target domains, we explore how domain adaptation can be applied when only a single rare class is augmented with simulated samples. As a testbed, we use a camera trap animal dataset with a rare deer class, which is augmented with synthetic deer samples. We adapt existing domain adaptation methods to two new methods for the single rare class setting: DeerDANN, based on the Domain-Adversarial Neural Network (DANN), and DeerCORAL, based on deep correlation alignment (Deep CORAL) architectures. Experiments show that DeerDANN has the highest improvement in deer classification accuracy of 24.0% versus 22.4% improvement of DeerCORAL when compared to the baseline. Further, both methods require fewer than 10k synthetic samples, as used by the baseline, to achieve these higher accuracies. DeerCORAL requires the least number of synthetic samples (2k deer), followed by DeerDANN (8k deer).