Document Type: Original/Review Paper

Authors

1 Department of Computer Engineering, South Tehran Branch, Islamic Azad University, Tehran, Iran.

2 School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.

Abstract

Generative Adversarial Networks (GANs) have become a major research focus in artificial intelligence because of their strong data generation capabilities. Their ability to produce high-quality synthetic data has led to applications in diverse domains such as image and video generation, classification, and style transfer. Beyond these continuous-data applications, GANs are also used for discrete-data tasks, including text and music generation. Continuous and discrete data pose different challenges for GANs. In particular, sampling discrete values is not differentiable, so the discriminator's gradient cannot be back-propagated directly to the generator as it is for continuous values; discrete GANs therefore commonly rely on policy gradient algorithms from reinforcement learning. The generator must map latent variables into a discrete domain and, unlike in continuous value generation, cannot make arbitrarily small changes to an individual output; instead, guided by the discriminator, its output distribution is adjusted gradually to align with real discrete data. This paper provides a thorough review of GAN architectures, fundamental concepts, and applications in the context of discrete data. It also addresses existing challenges, evaluation metrics, and future research directions in this field.
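
The policy gradient idea mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from any of the reviewed models: it assumes a one-token "generator" that is just a softmax over four logits, and a frozen, hypothetical discriminator (`DISCRIMINATOR_SCORES`) whose output serves as the reward. All names and constants below are illustrative assumptions. The point is only that the update uses the score-function (REINFORCE) estimator, so no gradient has to flow through the discrete sampling step.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one token index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_step(logits, reward_fn, rng, lr=0.5, baseline=0.0):
    """One REINFORCE update: logits += lr * (reward - baseline) * grad log pi(token)."""
    probs = softmax(logits)
    token = sample(probs, rng)  # discrete, non-differentiable choice
    advantage = reward_fn(token) - baseline
    # For a softmax policy, d/d logit_k of log pi(token) = 1[k == token] - pi_k.
    return [
        w + lr * advantage * ((1.0 if k == token else 0.0) - probs[k])
        for k, w in enumerate(logits)
    ]


# Hypothetical frozen discriminator: its estimate that each token is "real".
# In an actual GAN the discriminator is trained jointly with the generator.
DISCRIMINATOR_SCORES = [0.1, 0.2, 0.9, 0.3]


def discriminator(token):
    return DISCRIMINATOR_SCORES[token]


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(DISCRIMINATOR_SCORES)
    # Mean score used as a fixed baseline to reduce gradient variance.
    baseline = sum(DISCRIMINATOR_SCORES) / len(DISCRIMINATOR_SCORES)
    for _ in range(steps):
        logits = reinforce_step(logits, discriminator, rng, baseline=baseline)
    return softmax(logits)


final_probs = train()
```

After training, `final_probs` should concentrate on the token the assumed discriminator rates as most realistic. Sequence models of the kind reviewed here apply the same estimator per generated token, typically with a learned discriminator and rollouts to score partial sequences.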
