1 Introduction
Deep Convolutional Neural Networks (CNNs) have shown strong results on a
variety of representation-based learning tasks [1], but the success of these
models relies on the availability of a substantial amount of labeled data,
because a model trained on one dataset often fails to transfer to another owing
to a phenomenon known as domain shift or dataset bias [2]. However, in many
real-world scenarios, acquiring training labels can be expensive and tedious.
One promising solution to this problem is domain adaptation, which aims to
transfer the knowledge of a labeled source domain to an unlabeled target domain
and fine-tune the underlying networks according to task-specific datasets. As
the domains follow different distributions, domain shift is a major concern.
Thus, the goal of domain adaptation is to make the model generalise well on the
target domain.
Adversarial networks are employed to generalise across different domains
when a substantial amount of labeled data is available to train a deep CNN
for the source domain and no annotated data is available for training on the
target domain. Many distinct models have been proposed to tackle this problem, primarily
2 Reshma Rastogi and Ritesh Gangnani
concentrating on reducing the distribution shift between the source and target
domains in order to perform classification on the target domain. Alternative
methods transform the image representations by mapping source domain images to
the target domain, or learn a generator model to reconstruct new target images
from the source data [3]. Several recent Generative Adversarial Networks (GANs)
have been proposed along these lines to adapt different data distributions
[4][5][6] in a uni-directional manner. In contrast, the authors of [7] proposed
a symmetric bi-directional adaptation GAN model that introduces a symmetric
mapping from the source domain to the target domain and, additionally, from the
target domain to the source domain, followed by training of two classifiers,
one in each direction. In this model, the classifiers are trained in a
supervised manner on the labeled source data and the generated target-like
source data, respectively. However, the authors did not consider target domain
images at all while training the target classifier.
Although the above-mentioned classifiers rely only on a supervised learning
framework, ignoring the available unlabeled data while learning the classifier
for the target data, it is intuitive that adversarial learning on both the
source and target domains should take advantage of both labeled and unlabeled
information. Hence, we propose a framework termed Semi-supervised
Multi-category Classification with Generative Adversarial Networks (SMC-GAN),
where the ultimate task is to learn a semi-supervised classifier for the
unlabeled target data. As illustrated in Fig. 1, we first perform unsupervised
domain adaptation, which uses GANs with a domain-adversarial loss to map the
labeled source images to the target domain, generating target-like source
images. Subsequently, we train the classifier on the generated target-like
source images and use it to assign pseudo-labels to the unlabeled target
images, choosing for each image the class with the maximum predicted
probability. Finally, we re-train the classifier with these target data
annotations, resulting in semi-supervised learning of the classifier model.
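The pseudo-labeling step described above can be illustrated with a minimal, framework-free sketch. It assumes the trained classifier has already produced one row of class probabilities per unlabeled target image; the function name `assign_pseudo_labels` and the example probabilities are hypothetical and are not taken from the authors' implementation.

```python
def assign_pseudo_labels(class_probabilities):
    """Return, for each sample, the index of the class with the highest probability."""
    pseudo_labels = []
    for probs in class_probabilities:
        # argmax over the class axis; ties resolve to the lowest class index
        best_class = max(range(len(probs)), key=lambda k: probs[k])
        pseudo_labels.append(best_class)
    return pseudo_labels


# Hypothetical classifier outputs for three unlabeled target images (three classes).
target_probs = [
    [0.10, 0.70, 0.20],
    [0.55, 0.25, 0.20],
    [0.05, 0.15, 0.80],
]
pseudo_labels = assign_pseudo_labels(target_probs)
print(pseudo_labels)  # [1, 0, 2]
```

The resulting labels would then be treated as annotations for the target images when the classifier is re-trained.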
The rest of the paper is organized as follows: Section 2 reviews related work
on Generative Adversarial Networks. Section 3 details the proposed
Semi-supervised Multi-category Classification approach based on GANs. Section 4
summarizes the results and, finally, Section 5 concludes the paper.
2 Related Work
and real samples from training data. GANs have been applied successfully to
different domains, such as imitating images of digit datasets and faces. They
have been extended in several ways for unsupervised domain adaptation to tackle
the domain shift. The conditional GAN [9] is a type of GAN that feeds the class
annotations as additional input to both the generator and the discriminator,
resulting in better learning of the networks when compared to baseline GAN
models. In Adversarial Discriminative Domain Adaptation (ADDA) [10], the
authors proposed a framework for unsupervised domain adaptation based on
adversarial as well as discriminative learning. A discriminative mapping of
target images to the source feature space is learnt, and the target test images
are mapped to the same space for classification. The authors of [11] proposed
CycleGAN, which employs a round-trip mapping approach using two GANs for image
translation, i.e. source to target to source: mapping source samples in one
direction and then mapping them back in the opposite direction should return
the original samples, which is enforced by a cycle-consistency loss. This
approach generates high-quality images translated to the target domain. In
CyCADA [12], the model adapts representations at both the pixel level and the
feature level while enforcing the cycle-consistency loss. Similarly, SBADAGAN
[7] employs two generative adversarial losses to encourage the entire network
to produce target-like source images as well as source-like target images.
Further, it minimizes both classification losses jointly to produce the final
annotations for the target images.
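To make the round-trip idea behind the cycle-consistency loss concrete, the following sketch computes the mean absolute (L1) difference between each sample and its reconstruction after a forward and a backward mapping. The two mapping functions are toy stand-ins for the learned generators and are purely hypothetical; samples are plain lists of floats rather than images.

```python
def cycle_consistency_loss(samples, forward_map, backward_map):
    """Mean absolute (L1) difference between samples and their round-trip reconstructions."""
    total, count = 0.0, 0
    for x in samples:
        reconstructed = backward_map(forward_map(x))
        for original_value, recon_value in zip(x, reconstructed):
            total += abs(original_value - recon_value)
            count += 1
    return total / count


# Toy stand-ins for the two generators (hypothetical, not learned networks).
def source_to_target(x):
    return [2.0 * v for v in x]


def exact_inverse(y):
    return [v / 2.0 for v in y]


def imperfect_inverse(y):
    return [v / 2.0 + 0.5 for v in y]


source_batch = [[0.0, 1.0], [2.0, 3.0]]
loss_exact = cycle_consistency_loss(source_batch, source_to_target, exact_inverse)
loss_imperfect = cycle_consistency_loss(source_batch, source_to_target, imperfect_inverse)
print(loss_exact, loss_imperfect)  # 0.0 0.5
```

A pair of mappings that invert each other exactly incurs zero loss, while any systematic reconstruction error is penalized in proportion to its magnitude.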
In [13], the authors used pseudo-labels for unlabeled data within a
semi-supervised learning framework, where the pseudo-label is the class with
the maximum predicted probability. The model is thus learnt simultaneously on
labeled and unlabeled data belonging to the same distribution. Similarly, in
another approach [14], the semi-supervised model is trained by simultaneously
minimizing the sum of supervised and unsupervised loss functions. Combining
CNNs with semi-supervised learning has been explored in some recent works;
however, to the best of our knowledge, semi-supervised learning with
adversarial domain adaptation has not been explored yet.
In the adversarial approaches discussed above, after the feature mapping from
the source domain to the target domain and/or from the target domain to the
source domain has been learnt, the classifier is trained in a completely
supervised manner on the labeled source data and used to predict the labels of
the target data. In our model, instead of training the classifier solely on
source data, we annotate the target data with pseudo-labels and train the
classifier in a semi-supervised manner using target-like source data (labeled)
and actual target data (unlabeled).
3 Proposed Model
where $\hat{q}^s = C(G(p^s, z^s))$ and $q^s$ is the one-hot encoding of the
labels assigned to the target-like source data.
The loss for the assignment of annotations to $X_t$, i.e. the self-loss, is a
simple softmax cross-entropy classification loss, given as:
$$L_{\text{unsupervised}} = L_{\text{self}}(G, C) = \mathbb{E}_{\{p^t, q^t\} \sim T,\; z^t \sim \text{noise}} \left[ -q^t_{\text{self}} \cdot \log(\hat{q}^t_{\text{self}}) \right], \quad (4)$$
where $\hat{q}^t_{\text{self}} = C(p^t)$ and $q^t_{\text{self}}$ is the one-hot
encoding of the assigned target labels. This loss is backpropagated to the
generator G, which encourages the network to preserve the annotated category of
the target images.
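As a worked illustration of the self-loss in Eq. (4), the sketch below evaluates the softmax cross-entropy between one-hot pseudo-labels and classifier outputs, averaged over a batch. It assumes the classifier returns raw logits; the function names and the example values are hypothetical, and the code is a plain-Python illustration rather than the authors' Keras implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, num_classes):
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def self_loss(batch_logits, pseudo_labels):
    """Batch average of -q . log(q_hat), with q the one-hot pseudo-label."""
    losses = []
    for logits, label in zip(batch_logits, pseudo_labels):
        q_hat = softmax(logits)
        q = one_hot(label, len(logits))
        # Only the entry selected by the one-hot vector contributes to the dot product.
        losses.append(-sum(qk * math.log(pk) for qk, pk in zip(q, q_hat) if qk > 0.0))
    return sum(losses) / len(losses)


# Hypothetical logits for two target images and their assigned pseudo-labels.
example_logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]]
example_labels = [0, 2]
print(self_loss(example_logits, example_labels))
```

The loss is small when the predicted distribution concentrates on the assigned pseudo-label and grows as probability mass moves to other classes.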
By collecting all the loss functions discussed above, we obtain the overall
SMC-GAN objective:
$$L_{\text{SMC-GAN}}(G, D, C) = \min_{G,C} \max_{D} \; \beta L_D(D, G) + \alpha_1 L_C(G, C) + \alpha_2 L_{\text{self}}(G, C). \quad (5)$$
Here, $(\alpha_1, \alpha_2, \beta) \geq 0$ are weights that control the
interaction among the loss terms.
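The weighted combination in Eq. (5) can be sketched as a single scalar computation. The snippet assumes the three loss terms have already been evaluated on a batch; the function name and the numeric values are hypothetical. It only forms the weighted sum and does not perform the min-max optimization, in which the discriminator maximizes the objective while the generator and classifier minimize it.

```python
def smc_gan_objective(loss_d, loss_c, loss_self, beta, alpha1, alpha2):
    """Weighted sum of the adversarial, classification and self-labeling loss terms."""
    for name, weight in (("beta", beta), ("alpha1", alpha1), ("alpha2", alpha2)):
        if weight < 0:
            raise ValueError(f"{name} must be non-negative, got {weight}")
    return beta * loss_d + alpha1 * loss_c + alpha2 * loss_self


# Hypothetical per-batch loss values.
total = smc_gan_objective(loss_d=0.5, loss_c=0.2, loss_self=0.3,
                          beta=10.0, alpha1=1.0, alpha2=0.0)
print(total)
```

Setting a weight to zero removes the corresponding term from the objective, which is how an individual loss can be switched off during part of training.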
4 Experimental Results
The model is implemented in Python, and experiments are performed using the
Keras framework [20]. Our model architecture is analogous to that used in [21].
SMC-GAN 7
We used the ADAM [22] optimizer with a learning rate of $10^{-4}$ for both the
generator and the discriminator networks. The model is trained for 500 epochs
with a batch size of 32, and we observed no overfitting. The parameter α1
defined in Eq. (5) is set to 1, whereas β is set to 10 to prevent the generator
from indirectly switching labels. Training starts without the self-labeling
loss, i.e. α2 is initially set to zero, as this loss hinders the convergence of
the generator model in the initial stage. After the generator starts to
converge, α2 is switched to 1 in order to increase the performance of the model.
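The training settings and the two-stage weighting of the self-labeling loss can be summarized in a short configuration sketch. The values for the learning rate, epochs, batch size, α1 and β are those stated above; the text gives no explicit criterion for when the generator "starts to converge", so the fixed `warmup_epochs` threshold used here is purely an assumption for illustration, as are all the names.

```python
# Settings reported in the text; the dictionary layout itself is hypothetical.
TRAINING_CONFIG = {
    "optimizer": "adam",
    "learning_rate": 1e-4,  # used for both generator and discriminator
    "epochs": 500,
    "batch_size": 32,
    "alpha1": 1.0,          # weight of the classification loss
    "beta": 10.0,           # weight of the adversarial loss
}


def alpha2_for_epoch(epoch, warmup_epochs):
    """Weight of the self-labeling loss: 0 during warm-up, 1 afterwards.

    `warmup_epochs` is an assumed stand-in for "the generator has started to
    converge"; the source does not specify how this point is detected.
    """
    return 0.0 if epoch < warmup_epochs else 1.0


# Example schedule with an assumed warm-up of 50 epochs.
schedule = [alpha2_for_epoch(epoch, warmup_epochs=50)
            for epoch in range(TRAINING_CONFIG["epochs"])]
print(schedule[0], schedule[49], schedule[50])  # 0.0 0.0 1.0
```

In practice the switch could equally be triggered by monitoring the generator loss rather than by a fixed epoch count.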
5 Conclusions
In this paper, we proposed a semi-supervised learning framework for efficient
prediction of unlabeled samples in the target domain. We utilized unsupervised
domain adaptation techniques based on adversarial objectives, followed by
semi-supervised classification. Our method maps the source samples into the
target domain to tackle the problem of distribution shift between source and
target data, and learns a semi-supervised classifier for classifying the test
patterns in the target domain. We applied self-labelling to the target samples
and used them, along with the generated target-like source images, to fine-tune
the resulting classification model. The proposed framework significantly
improves prediction performance and training time when compared to related
models. In the future, we would like to explore the semi-supervised methodology
for cross-domain adaptation.
References
1. J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features
in deep neural networks?” in Advances in neural information processing systems,
2014, pp. 3320–3328.
2. A. Gretton, A. Smola, J. Huang, M. Schmittfull, K. Borgwardt, and B. Schölkopf,
Covariate shift and local learning by distribution matching. Cambridge, MA, USA:
MIT Press, 2009, pp. 131–160.
3. M. Ghifary, W. B. Kleijn, M. Zhang, D. Balduzzi, and W. Li, “Deep reconstruction-
classification networks for unsupervised domain adaptation,” in European Confer-
ence on Computer Vision. Springer, 2016, pp. 597–613.
4. Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette,
M. Marchand, and V. Lempitsky, “Domain-adversarial training of neural net-
works,” The Journal of Machine Learning Research, vol. 17, no. 1, pp. 2096–2030,
2016.
5. X. Zhang, F. X. Yu, S.-F. Chang, and S. Wang, “Deep transfer network: Unsuper-
vised domain adaptation,” CoRR, vol. abs/1503.00591, 2015.
6. Y. Ganin and V. Lempitsky, “Unsupervised domain adaptation by backpropaga-
tion,” in International Conference on Machine Learning, 2015, pp. 1180–1189.
7. P. Russo, F. M. Carlucci, T. Tommasi, and B. Caputo, “From source to target and
back: symmetric bi-directional adaptive gan,” in Proceedings of the IEEE Confer-
ence on Computer Vision and Pattern Recognition, 2018, pp. 8099–8108.
8. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair,
A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in neural
information processing systems, 2014, pp. 2672–2680.
9. M. Mirza and S. Osindero, “Conditional generative adversarial nets,” CoRR, vol.
abs/1411.1784, 2014. [Online]. Available: http://arxiv.org/abs/1411.1784
10. E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell, “Adversarial discriminative do-
main adaptation,” in Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, 2017, pp. 7167–7176.
11. J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image trans-
lation using cycle-consistent adversarial networks,” in Proceedings of the IEEE
international conference on computer vision, 2017, pp. 2223–2232.
12. J. Hoffman, E. Tzeng, T. Park, J.-Y. Zhu, P. Isola, K. Saenko, A. Efros, and
T. Darrell, “CyCADA: Cycle-consistent adversarial domain adaptation,” in
Proceedings of the 35th International Conference on Machine Learning, ser.
Proceedings of Machine Learning Research, J. Dy and A. Krause, Eds., vol. 80.
Stockholmsmässan, Stockholm, Sweden: PMLR, 10–15 Jul 2018, pp. 1989–1998.
13. D.-H. Lee, “Pseudo-label: The simple and efficient semi-supervised learning method
for deep neural networks,” in Workshop on Challenges in Representation Learning,
ICML, vol. 3, 2013, p. 2.
14. A. Rasmus, M. Berglund, M. Honkala, H. Valpola, and T. Raiko, “Semi-supervised
learning with ladder networks,” in Advances in neural information processing sys-
tems, 2015, pp. 3546–3554.
15. X. Mao, Q. Li, H. Xie, R. Y. K. Lau, and Z. Wang, “Multi-class generative adver-
sarial networks with the l2 loss function,” CoRR, vol. abs/1611.04076, 2016.
16. M.-Y. Liu and O. Tuzel, “Coupled generative adversarial networks,” in Advances
in neural information processing systems, 2016, pp. 469–477.
17. K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, and D. Krishnan, “Unsupervised
pixel-level domain adaptation with generative adversarial networks,” 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), pp. 95–104,
2017.
18. R. Shu, H. Bui, H. Narui, and S. Ermon, “A DIRT-t approach to unsupervised
domain adaptation,” in International Conference on Learning Representations,
2018. [Online]. Available: https://openreview.net/forum?id=H1q-TM-AW
19. P. Haeusser, T. Frerix, A. Mordvintsev, and D. Cremers, “Associative domain
adaptation,” in Proceedings of the IEEE International Conference on Computer
Vision, 2017, pp. 2765–2773.
20. F. Chollet et al., “Keras,” 2015.
21. A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with
deep convolutional generative adversarial networks,” CoRR, vol. abs/1511.06434,
2016.
22. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” CoRR,
vol. abs/1412.6980, 2015.