Transformer-based Generative Chatbot Using Reinforcement Learning

Esfandiari, Nura; Kiani, Kourosh; Rastgoo, Razieh

doi:10.22044/jadm.2024.14466.2549

Document Type: Original/Review Paper

Authors

Electrical and Computer Engineering Department, Semnan University, Semnan, Iran.

Abstract

A chatbot is a computer program designed to simulate human-like conversation and interact with users. It is a form of conversational agent that uses Natural Language Processing (NLP) and sequential models to understand user input, interpret the user's intent, and generate an appropriate answer. This approach aims to generate word sequences in the form of coherent phrases. A notable challenge with previous models lies in their sequential training process, which can result in less accurate outcomes. To address this limitation, a novel generative chatbot is proposed that integrates Reinforcement Learning (RL) with transformer models. The proposed approach employs a Double Deep Q-Network (DDQN) architecture in which a transformer model serves as the agent. This agent takes the human question as the input state and generates the bot answer as the action. To the best of our knowledge, this is the first generative chatbot built on a DDQN architecture with an embedded transformer as the agent. Results on two public datasets, Daily Dialog and Chit-Chat, measured with various evaluation metrics, validate the superiority of the proposed approach over state-of-the-art models.
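For readers unfamiliar with DDQN, the following is a minimal sketch of the standard Double DQN target computation, in which the online network selects the next action and a separate target network evaluates it. It is not the authors' implementation: the function name `double_dqn_target`, the choice to treat each vocabulary token as one action, the discount factor, and the toy Q-values are all assumptions made for illustration.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return the standard Double DQN target for a single transition.

    next_q_online and next_q_target hold one Q-value per candidate action
    for the next state (assumed here: one entry per vocabulary token).
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network selects the greedy next action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network supplies the value of that action.
    return reward + gamma * next_q_target[best_action]


# Toy example with a three-token vocabulary (all numbers are made up).
online_q = [0.2, 0.9, 0.4]  # online network prefers token 1
target_q = [0.5, 0.3, 0.8]  # target network's estimates for the same tokens
ddqn = double_dqn_target(1.0, online_q, target_q, gamma=0.5)
dqn = 1.0 + 0.5 * max(target_q)  # plain DQN target, for comparison
print(ddqn, dqn)
```

In this toy case the Double DQN target is lower than the plain DQN target, because the action is chosen by one network and scored by the other. This decoupling is the usual motivation for DDQN, since it reduces the overestimation that arises when the same network both selects and evaluates the action.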

Main Subjects

H.3.8. Natural Language Processing



Volume 12, Issue 3, July 2024, Pages 349-358