Semi-supervised Multi-category Classification with Generative Adversarial Networks

Reshma Rastogi and Ritesh Gangnani

South Asian University, New Delhi 110021, India
reshma.khemchandani@sau.ac.in, ritesh.gangnani@gmail.com

Abstract. Generative Adversarial Networks (GANs) have shown promising performance in recent years for training robust deep neural architectures that generate complex samples across varied domains. Prior work has demonstrated their effectiveness in transforming images from a labeled source domain to an unlabeled target domain. In this paper, we outline a generalized semi-supervised learning framework in which the proposed 'Semi-supervised Multi-category Classification with Generative Adversarial Networks (SMC-GAN)' model first maps data from the source domain to the target domain to generate target-like source images, and then learns to discriminate the target domain data using a semi-supervised classifier. Extensive experimental evaluations on standard cross-domain datasets show that the proposed model is an efficient classifier and converges faster than a conventional GAN approach on digit classification tasks.

Keywords: Domain Adaptation · Adversarial Learning · GAN · Semi-supervised Learning.

1 Introduction

Deep Convolutional Neural Networks (CNNs) have shown impressive results on a variety of representation-learning tasks [1], but the success of these models relies heavily on the availability of a substantial amount of labeled data, and their performance degrades when the training and test distributions differ, a phenomenon known as domain shift or dataset bias [2]. However, in many real-world scenarios, acquiring training labels can be expensive and tedious. One promising solution to this problem is domain adaptation, which aims to transfer the knowledge of a labeled source domain to an unlabeled target domain and fine-tune the underlying networks according to task-specific datasets. As the domains follow different distributions, the domain shift problem is a major concern. Thus, the goal of domain adaptation is to make the model generalise well on the target domain.
Adversarial networks are employed to generalise across different domains when a substantial amount of labeled data is available to train a deep CNN for the source domain but no annotated data is available for the target domain. Many distinct models have been proposed to tackle this problem, primarily
concentrating on reducing the distribution shift between the source and target domains in order to perform classification on the target domain. Alternative methods transform the image representations by mapping the source domain images to the target domain, or learn a generator model to reconstruct new target images from the source data [3]. Several recent Generative Adversarial Networks (GANs) have been proposed along these lines to adapt different data distributions [4][5][6] in a uni-directional manner. In [7], however, the authors proposed a symmetric bi-directional adaptive GAN model that introduces a mapping from the source domain to the target domain and, additionally, a mapping from the target domain to the source domain, followed by the training of two classifiers, one for each direction. In this model, both classifiers are trained in a supervised manner, on the labeled source data and on the generated target-like source data, respectively. However, the authors did not consider target domain images at all while training the target classifier.
The above-mentioned classifiers rely only on a supervised learning framework, ignoring the available unlabeled data while learning the classifier for the target data; intuitively, however, adversarial learning on both the source and target domains should take advantage of both labeled and unlabeled information. Hence, we propose a framework termed Semi-supervised Multi-category Classification with Generative Adversarial Networks (SMC-GAN), whose ultimate task is to learn a semi-supervised classifier for the unlabeled target data. As illustrated in Fig. 1, we first perform unsupervised domain adaptation, mapping the labeled source images to the target domain using GANs and generating target-like source images from the labeled source images under a domain-adversarial loss. Subsequently, we train the classifier on the generated target-like source images and then assign pseudo-labels to the unlabeled target images with this trained classifier, choosing the class with the maximum predicted probability. Finally, we re-train the classifier with these target data annotations, resulting in semi-supervised learning of the classifier model.
The rest of the paper is organized as follows: Section 2 reviews related work on Generative Adversarial Networks. Section 3 details the proposed Semi-supervised Multi-category Classification approach based on GANs. Subsequently, Section 4 summarizes the results and, finally, Section 5 concludes the paper.

2 Related Work

Unsupervised domain adaptation has been a challenging research problem in recent years, both theoretically and practically. Since a large body of prior work exists, our literature review primarily focuses on approaches that use CNNs for domain adaptation, owing to their strong empirical performance on the problem.
Generative Adversarial Networks (GANs) [8] consist of two components, a generator and a discriminator. The generator produces samples that approximately match the distribution of the real data, whereas the discriminator distinguishes samples drawn from the generator
from real samples in the training data. GANs have been applied successfully to different domains, such as imitating images of digits and faces, and have been extended in several ways to tackle domain shift in unsupervised domain adaptation. The conditional GAN [9] employs class annotations as an additional input to both the generator and the discriminator, resulting in better learning of the networks when compared to baseline GAN models. In Adversarial Discriminative Domain Adaptation (ADDA) [10], the authors proposed a framework for unsupervised domain adaptation based on adversarial as well as discriminative learning: a discriminative mapping of target images to the source feature space is learnt, and target test images are mapped to the same space for classification. The authors in [11] proposed CycleGAN, which employs a round-trip mapping approach using two GANs for image translation, i.e. source to target to source; mapping source samples in one direction followed by the opposite mapping should return them to their original values, which is enforced through a cycle-consistency loss. This approach generates high-quality images transformed to the target domain. In CyCADA [12], the model adapts representations at both the pixel level and the feature level while enforcing the cycle-consistency loss. Similarly, SBADA-GAN [7] employs two generative adversarial losses to encourage the network to produce target-like source images as well as source-like target images, and simultaneously minimizes both classification losses to produce the final annotations for the target images.
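For reference, the cycle-consistency constraint used in [11] and [12] is typically expressed as an L1 reconstruction penalty over both round trips; with $G: S \to T$ and $F: T \to S$ denoting the two mappings (our notation, not taken from those papers), it reads:

$$\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{p_s \sim S}\big[\lVert F(G(p_s)) - p_s \rVert_1\big] + \mathbb{E}_{p_t \sim T}\big[\lVert G(F(p_t)) - p_t \rVert_1\big].$$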
In [13], the authors use pseudo-labels for unlabeled data, assigning each sample the class with the maximum predicted probability within a semi-supervised learning framework; the model is thus learnt simultaneously on supervised and unsupervised data belonging to the same distribution. Similarly, in another approach [14], the semi-supervised model is trained by simultaneously minimizing the sum of supervised and unsupervised loss functions. To the best of our knowledge, combining CNNs with semi-supervised learning has been explored in some recent works; however, semi-supervised learning with adversarial domain adaptation has not been explored yet.
In the adversarial approaches discussed above, after successfully mapping features from the source domain to the target domain, and/or learning a mapping from the target to the source domain, the classifier is trained entirely in a supervised manner on the labeled source data and then used to predict labels for the target data. In our model, instead of training the classifier solely on source data, we annotate the target data with pseudo-labels and train the classifier in a semi-supervised manner using the target-like source data (labeled) and the actual target data (unlabeled).

3 Proposed Model

Our model is primarily focused on the semi-supervised classification task: predicting the correct labels for the given unlabeled target domain in an efficient and well-generalised manner. In unsupervised adaptation, we assume access to two related datasets, i.e. a labeled source dataset $X_s = \{p_s^i, q_s^i\}_{i=1}^{N_s}$
drawn from a source domain $S$ containing $N_s$ samples, and an unlabeled set of target images $X_t = \{p_t^i\}_{i=1}^{N_t}$ drawn from a target domain $T$ with a different distribution, containing $N_t$ samples for which no annotations are available during training. The ultimate task is to build a model that predicts the class labels of the target dataset using the knowledge available in the source dataset.

Fig. 1. An overview of Semi-supervised Multi-category Classification with Generative Adversarial Networks (SMC-GAN).
The structure of our GAN model is diagrammed in Figure 1. Since there is a domain shift between the datasets, we first propose a framework that bridges this gap by adapting the knowledge of the source images after mapping them to the distribution of the target images in $X_t$. To perform the domain adaptation task, we use GANs to learn a mapping from a random noise vector $z$ together with $X_s$ to the corresponding generated image. Thus, we train a generator network $G$ that learns to map each source sample $p_s^i$ to its target-like version $p_{st}^i = G(p_s^i)$, defining the set $X_{st} = \{p_{st}^i, q_s^i\}_{i=1}^{N_s}$. The model is further extended with an adversarial discriminator, $D$, and a semi-supervised classifier, $C$. The discriminator takes $X_t$ as well as the generated target-like images $X_{st}$ as input and learns to recognize them as two distinct sets, providing the adversarial signal for learning $G$. Thus, the generator aims to learn the distribution of the real data, while the discriminator aims to correctly classify whether its input comes from the real data or from the generator. The classifier network then takes the generated images as input and assigns the task-specific labels $q_s^i$ to the generated target-like source images.
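As an illustration of this three-network setup, a minimal Keras sketch is given below. The paper only states that the architecture is analogous to DCGAN [21], so the image resolution, noise dimension, layer sizes, and the way the noise vector z is combined with the source image are our assumptions, not the exact configuration used in the paper.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

IMG_SHAPE = (32, 32, 3)   # assumed input resolution
NOISE_DIM = 100           # assumed dimension of the noise vector z
NUM_CLASSES = 10

def build_generator():
    # G maps a source image p_s together with a noise vector z to a target-like image.
    img_in = keras.Input(shape=IMG_SHAPE)
    z_in = keras.Input(shape=(NOISE_DIM,))
    z_map = layers.Dense(int(np.prod(IMG_SHAPE)))(z_in)
    z_map = layers.Reshape(IMG_SHAPE)(z_map)
    x = layers.Concatenate()([img_in, z_map])
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    out = layers.Conv2D(IMG_SHAPE[-1], 3, padding="same", activation="tanh")(x)
    return keras.Model([img_in, z_in], out, name="G")

def build_discriminator():
    # D scores whether an image comes from the real target domain or from G.
    img_in = keras.Input(shape=IMG_SHAPE)
    x = layers.Conv2D(64, 4, strides=2, padding="same")(img_in)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.Conv2D(128, 4, strides=2, padding="same")(x)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.Flatten()(x)
    out = layers.Dense(1)(x)   # linear score, paired with the least-squares loss below
    return keras.Model(img_in, out, name="D")

def build_classifier():
    # C assigns one of the task-specific class labels to a (target-like) image.
    img_in = keras.Input(shape=IMG_SHAPE)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(img_in)
    x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    out = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return keras.Model(img_in, out, name="C")
```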
After the successful generation of target-like source data from the generator, we train the semi-supervised classifier network on the generated labeled source data (i.e., from the generator) and the unlabeled target data. Specifically, we propose a simple approach to predict the labels of the unlabeled target domain data using the knowledge of data with a similar distribution. To achieve this, along with the GAN network, we simultaneously train a classifier using the labeled target-like generated images, which also helps the generator improve through backpropagation. With the motivation of learning the
classifier in a semi-supervised manner, we also need to train the classifier using the knowledge contained in the unlabeled target data. To handle the non-availability of labels for images in the target domain, we use the self-labeling approach [13]. Consequently, we annotate each target sample $p_t^i$ using the classifier model trained on the generated target-like source samples $X_{st}$. These assigned pseudo-labels are then used transductively to re-train our classifier network on the original target data along with the generated target-like source data. Self-labeling has a successful track record on domain adaptation problems and has proved effective for modern deep architectures. The assigned pseudo-labels are reasonable because the distribution of the real target data and that of the generated target-like source data is the same, so the target data should belong to one of the classes present in the target-like source data. In the case of a slight domain shift, the correct pseudo-labels help to regularize the learning of the model, whereas, in the case of a significant domain shift, the possibly mislabeled samples do not hinder the efficiency of the model, as the distribution of both the labeled and unlabeled target data is the same.
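A minimal sketch of this self-labeling step is given below, assuming a trained Keras classifier with a softmax output and an array of unlabeled target images; the function name and shapes are our assumptions.

```python
import numpy as np

def assign_pseudo_labels(classifier, target_images, num_classes=10):
    """Label each unlabeled target image with the class of maximum predicted probability."""
    probs = classifier.predict(target_images, verbose=0)   # shape (N_t, num_classes)
    labels = np.argmax(probs, axis=1)                       # pseudo-label per sample
    return np.eye(num_classes)[labels]                      # one-hot pseudo-labels
```

The resulting one-hot pseudo-labels are then mixed with the labeled target-like source set $X_{st}$ when the classifier is re-trained.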

3.1 Problem Formulation

In this section, we formalize our proposed model and specify how the model is optimized using the loss functions.
We first describe the optimization of the discriminator $D$ given the generator $G$. We input a noise vector $z$ drawn from a uniform distribution, along with the source images, to the generator model; this allows an additional degree of freedom for modeling external variations in the dataset. We introduce a mapping from source to target through the generator $G$ and train it to produce target-like samples that can fool the discriminator $D$. The adversarial discriminator attempts to classify whether its input is real or fake. Thus, rather than the binary cross-entropy loss that is common practice, we adopt a least-squares loss function [15] for optimizing the discriminator parameters:
$$\min_{G} \max_{D} \; \mathcal{L}_D(D, G) = \mathbb{E}_{p_t \sim T}\big[(D(p_t) - 1)^2\big] + \mathbb{E}_{p_s \sim S,\, z \sim \text{noise}}\big[(D(G(p_s, z)))^2\big], \quad (1)$$

where $p_s$ is sampled from the source data distribution $S$, $p_t$ is sampled from the target data distribution $T$, $z$ is sampled from the prior distribution $P_z(z)$ such as the uniform distribution, and $\mathbb{E}(\cdot)$ represents the expectation.
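As a sketch, the least-squares objective of Eq. (1) could be implemented as follows, assuming the raw (linear) discriminator scores $D(p_t)$ and $D(G(p_s, z))$ have already been computed; following common least-squares GAN practice, the generator's adversarial term pushes its outputs towards the real label, which is the usual practical surrogate for the min-max formulation above.

```python
import tensorflow as tf

def discriminator_loss(d_real_target, d_fake_target_like):
    # D pushes real target images towards a score of 1 and
    # generated target-like images towards a score of 0 (Eq. (1)).
    loss_real = tf.reduce_mean(tf.square(d_real_target - 1.0))
    loss_fake = tf.reduce_mean(tf.square(d_fake_target_like))
    return loss_real + loss_fake

def generator_adversarial_loss(d_fake_target_like):
    # G tries to make D score its generated images as real (close to 1).
    return tf.reduce_mean(tf.square(d_fake_target_like - 1.0))
```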
As the semi-supervised classifier network is trained on labeled as well as unlabeled data, the corresponding loss consists of two parts and is given by:

$$\mathcal{L}_C = \alpha_1 \mathcal{L}_{\text{supervised}} + \alpha_2 \mathcal{L}_{\text{unsupervised}}, \quad (2)$$

where $\mathcal{L}_{\text{supervised}}$ is the loss on the labeled target-like source data and $\mathcal{L}_{\text{unsupervised}}$ is the loss on the unlabeled target data, i.e. the self-labeling loss. For the semi-supervised classifier evaluated on the transformed source images as well as the target images, the corresponding loss is a standard softmax cross-entropy, given as:

$$\mathcal{L}_{\text{supervised}} = \mathcal{L}_C(G, C) = \mathbb{E}_{\{p_s, q_s\} \sim S,\, z_s \sim \text{noise}}\big[-q_s \cdot \log(\hat{q}_s)\big], \quad (3)$$
where $\hat{q}_s = C(G(p_s, z_s))$ and $q_s$ is the one-hot encoding of the labels assigned to the target-like source data.

The loss for the assignment of annotations to $X_t$, i.e. the self-labeling loss, is a simple softmax cross-entropy classification loss, given as:

$$\mathcal{L}_{\text{unsupervised}} = \mathcal{L}_{\text{self}}(G, C) = \mathbb{E}_{\{p_t, q_{t_{\text{self}}}\} \sim T,\, z_t \sim \text{noise}}\big[-q_{t_{\text{self}}} \cdot \log(\hat{q}_{t_{\text{self}}})\big], \quad (4)$$

where $\hat{q}_{t_{\text{self}}} = C(p_t)$ and $q_{t_{\text{self}}}$ is the one-hot encoding of the assigned target labels. This loss is backpropagated to the generator $G$, which encourages the network to preserve the annotated category of the target images.
Collecting all the loss functions discussed above, we obtain the overall SMC-GAN objective:

$$\mathcal{L}_{SMC\text{-}GAN}(G, D, C) = \min_{G,C} \max_{D} \; \beta \mathcal{L}_D(D, G) + \alpha_1 \mathcal{L}_C(G, C) + \alpha_2 \mathcal{L}_{\text{self}}(G, C). \quad (5)$$

Here, $(\alpha_1, \alpha_2, \beta) \geq 0$ are weights that control the interaction among the loss terms.
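A sketch of how the two classification terms could be combined in a training step is shown below, assuming Keras models G and C as sketched earlier and pre-computed pseudo-labels q_t_self; the adversarial term $\beta \mathcal{L}_D$ of Eq. (1) would be added when updating G. Variable names are our assumptions.

```python
import tensorflow as tf

cce = tf.keras.losses.CategoricalCrossentropy()

def classifier_loss(G, C, p_s, q_s, z, p_t, q_t_self, alpha1=1.0, alpha2=1.0):
    # Weighted combination of the supervised and self-labeling terms, Eq. (2).
    p_st = G([p_s, z], training=True)               # target-like source images
    l_sup = cce(q_s, C(p_st, training=True))        # supervised term, Eq. (3)
    l_self = cce(q_t_self, C(p_t, training=True))   # self-labeling term, Eq. (4)
    return alpha1 * l_sup + alpha2 * l_self
```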

4 Experimental Results

We evaluate SMC-GAN for unsupervised domain adaptation across four different domain shifts. We use four digit datasets with varying distributions, namely MNIST, MNIST-M, USPS, and SVHN, each consisting of 10 classes. Descriptions of these datasets can be found in [7]. We compare our model against multiple state-of-the-art approaches, all based upon domain adversarial learning objectives.
Method                 MNIST→USPS   MNIST→MNIST-M   SVHN→MNIST   USPS→MNIST
DANN [4]               85.1         77.4            73.9         73.0±2.0
CoGAN [16]             91.2         62.0            not conv.    89.1
ADDA [10]              89.4±0.2     -               76.0±1.8     90.1±0.8
PixelDA [17]           95.9         98.2            -            -
DTN [5]                -            -               84.4         -
DIRT-T [18]            -            98.7            99.4         -
DAass [19]             -            89.5            97.6         -
CyCADA [12]            95.6±0.2     -               90.4±0.4     96.5±0.1
SBADA-GAN [7]          97.6         99.4            76.1         95.0
SMC-GAN (Proposed)     91.74        98.4            75.43        98.07

Table 1. Comparison against existing works on unsupervised domain adaptation. SMC-GAN reports the accuracy obtained by the proposed semi-supervised classifier.

4.1 Implementation Details

The model is implemented in Python and the experiments are performed using the Keras framework [20]. Our model architecture is analogous to that used in [21].
We used the ADAM [22] optimizer with a learning rate of $10^{-4}$ for both the generator and the discriminator network. The model is trained for 500 epochs with a batch size of 32, with no sign of overfitting. The parameter $\alpha_1$ defined in Eq. (5) is set to 1, whereas $\beta$ is set to 10 to prevent the generator from indirectly switching labels. Training starts with $\alpha_2$ set to zero, since the self-labeling loss hinders the convergence of the generator model in the initial stage; once the generator starts to converge, $\alpha_2$ is switched to 1 in order to increase the performance of the model.
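A sketch of this two-phase weighting schedule is given below; the exact epoch at which $\alpha_2$ is switched is not stated in the paper, so the warm-up length here is an illustrative assumption.

```python
EPOCHS = 500
BATCH_SIZE = 32
WARMUP_EPOCHS = 50   # hypothetical warm-up before enabling the self-labeling loss
alpha1, beta = 1.0, 10.0

for epoch in range(EPOCHS):
    alpha2 = 0.0 if epoch < WARMUP_EPOCHS else 1.0
    # For each mini-batch of size BATCH_SIZE:
    #   1. update D with the least-squares loss of Eq. (1)
    #   2. pseudo-label the target batch with the current classifier C
    #   3. update G and C with beta * L_D + alpha1 * L_sup + alpha2 * L_self  (Eq. (5))
```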

Fig. 2. Examples of generated digits. The top row represents the original samples, whereas the bottom row represents the corresponding generated images.

Fig. 3. Comparison of accuracies obtained by the supervised and semi-supervised classification frameworks as a function of the number of epochs.

4.2 Results Discussion

Table 1 shows the results in the above-mentioned evaluation settings. It can be seen that the proposed SMC-GAN model achieves competitive results and performs significantly better on most of the tasks when compared to previous uni-directional approaches [10][4][16] and a few cyclic approaches [7][12]. In comparison to previous approaches, our model learns to generalise better for the target discriminative task under the guidance of the generator, thus directly learning a target discriminative model through adversarial adaptation losses. The ultimate task of our model is to learn a semi-supervised classifier, which performs well when compared to the supervised classifier approaches used in earlier frameworks [10][9][7], and yields a faster convergence rate of the generator model due to the backpropagation from the classifier network. Moreover, we also experimented with the MNIST-M (source) → MNIST (target) pair and achieved 99.2% accuracy for the classification task on the target dataset using the proposed SMC-GAN. However, we have not included this setting in Table 1 as it is not reported by the other referred papers. Furthermore, Figure 2 shows the generated target-like source images obtained from the source samples.

Figure 3 shows the epoch-wise comparison of accuracies obtained by the standard supervised classifier and the proposed semi-supervised classification framework. It is evident that the SMC-GAN framework achieves better performance in fewer epochs, which validates the claim that the proposed framework results in a faster and more generalizable model.
5 Conclusions

In this paper, we proposed a semi-supervised learning framework for the efficient prediction of unlabeled samples in the target domain. We utilized unsupervised domain adaptation techniques based on adversarial objectives, followed by semi-supervised classification. Our method maps the source samples into the target domain to tackle the problem of distribution shift between the source and target data, and learns a semi-supervised classifier for classifying the test patterns in the target domain. We utilized the self-labelling approach on the target samples and used these, along with the generated target-like source images, for fine-tuning the resulting classification model. The proposed framework significantly boosts the performance of the model in terms of prediction performance and training time when compared to related models. In the future, we would like to explore the semi-supervised methodology for further cross-domain adaptation tasks.

References
1. J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features
in deep neural networks?” in Advances in neural information processing systems,
2014, pp. 3320–3328.
2. A. Gretton, A. Smola, J. Huang, M. Schmittfull, K. Borgwardt, and B. Schölkopf,
Covariate shift and local learning by distribution matching. Cambridge, MA, USA:
MIT Press, 2009, pp. 131–160.
3. M. Ghifary, W. B. Kleijn, M. Zhang, D. Balduzzi, and W. Li, “Deep reconstruction-
classification networks for unsupervised domain adaptation,” in European Confer-
ence on Computer Vision. Springer, 2016, pp. 597–613.
4. Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette,
M. Marchand, and V. Lempitsky, “Domain-adversarial training of neural net-
works,” The Journal of Machine Learning Research, vol. 17, no. 1, pp. 2096–2030,
2016.
5. X. Zhang, F. X. Yu, S.-F. Chang, and S. Wang, “Deep transfer network: Unsuper-
vised domain adaptation,” CoRR, vol. abs/1503.00591, 2015.
6. Y. Ganin and V. Lempitsky, “Unsupervised domain adaptation by backpropaga-
tion,” in International Conference on Machine Learning, 2015, pp. 1180–1189.
7. P. Russo, F. M. Carlucci, T. Tommasi, and B. Caputo, “From source to target and
back: symmetric bi-directional adaptive gan,” in Proceedings of the IEEE Confer-
ence on Computer Vision and Pattern Recognition, 2018, pp. 8099–8108.
8. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair,
A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in neural
information processing systems, 2014, pp. 2672–2680.
9. M. Mirza and S. Osindero, “Conditional generative adversarial nets,” CoRR, vol.
abs/1411.1784, 2014. [Online]. Available: http://arxiv.org/abs/1411.1784
10. E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell, “Adversarial discriminative do-
main adaptation,” in Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, 2017, pp. 7167–7176.
11. J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image trans-
lation using cycle-consistent adversarial networks,” in Proceedings of the IEEE
international conference on computer vision, 2017, pp. 2223–2232.
SMC-GAN 9

12. J. Hoffman, E. Tzeng, T. Park, J.-Y. Zhu, P. Isola, K. Saenko, A. Efros, and T. Dar-
rell, “CyCADA: Cycle-consistent adversarial domain adaptation,” in Proceedings
of the 35th International Conference on Machine Learning, ser. Proceedings of Ma-
chine Learning Research, J. Dy and A. Krause, Eds., vol. 80. Stockholmsmässan,
Stockholm Sweden: PMLR, 10–15 Jul 2018, pp. 1989–1998.
13. D.-H. Lee, “Pseudo-label: The simple and efficient semi-supervised learning method
for deep neural networks,” in Workshop on Challenges in Representation Learning,
ICML, vol. 3, 2013, p. 2.
14. A. Rasmus, M. Berglund, M. Honkala, H. Valpola, and T. Raiko, “Semi-supervised
learning with ladder networks,” in Advances in neural information processing sys-
tems, 2015, pp. 3546–3554.
15. X. Mao, Q. Li, H. Xie, R. Y. K. Lau, and Z. Wang, “Multi-class generative adver-
sarial networks with the l2 loss function,” CoRR, vol. abs/1611.04076, 2016.
16. M.-Y. Liu and O. Tuzel, “Coupled generative adversarial networks,” in Advances
in neural information processing systems, 2016, pp. 469–477.
17. K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, and D. Krishnan, “Unsupervised
pixel-level domain adaptation with generative adversarial networks,” 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), pp. 95–104,
2017.
18. R. Shu, H. Bui, H. Narui, and S. Ermon, “A DIRT-T approach to unsupervised
domain adaptation,” in International Conference on Learning Representations,
2018. [Online]. Available: https://openreview.net/forum?id=H1q-TM-AW
19. P. Haeusser, T. Frerix, A. Mordvintsev, and D. Cremers, “Associative domain
adaptation,” in Proceedings of the IEEE International Conference on Computer
Vision, 2017, pp. 2765–2773.
20. F. Chollet et al., “Keras,” 2015.
21. A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with
deep convolutional generative adversarial networks,” CoRR, vol. abs/1511.06434,
2016.
22. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” CoRR,
vol. abs/1412.6980, 2015.
