Document Type: Original/Review Paper

Authors

Electrical and Computer Engineering Department, Semnan University, Semnan, Iran.

Abstract

Chatbots are computer programs designed to simulate human conversation. Powered by artificial intelligence (AI), chatbots are increasingly used to provide customer service, and many are now built on large language models (LLMs). Fine-tuning is commonly employed to personalize the answers an LLM generates, but it demands substantial high-quality data and computational resources. In this article, to overcome the computational hurdles associated with fine-tuning LLMs, an innovative hybrid approach is proposed. The approach aims to enhance the answers generated by LLMs, specifically for Persian chatbots used in mobile-phone customer service. A transformer-based evaluation model was developed to score the generated answers and select the most appropriate one. Additionally, a Persian-language dataset tailored to the mobile-phone sales domain was collected to support both the personalization of the Persian chatbot and the training of the evaluation model. This approach is expected to foster increased customer interaction and boost sales within the Persian mobile-phone market. Experiments conducted on four different LLMs demonstrate the effectiveness of the proposed approach in generating more relevant and semantically accurate answers for users.
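The abstract describes a hybrid pipeline in which several LLMs each generate a candidate answer, a separately trained transformer-based evaluation model scores each question-answer pair, and the highest-scoring answer is returned to the user instead of fine-tuning any single LLM. The following is a minimal Python sketch of that selection step, written only under the assumptions stated in the abstract; the names select_best_answer, generators, and score_pair are hypothetical placeholders, not code or APIs from the paper.

# Minimal sketch of the hybrid selection idea: each LLM proposes one answer,
# the evaluation model scores every (question, answer) pair, and the
# highest-scoring answer is kept. All names here are illustrative placeholders.
from typing import Callable, List, Tuple

def select_best_answer(
    question: str,
    generators: List[Callable[[str], str]],   # one callable per LLM, question -> answer
    score_pair: Callable[[str, str], float],  # evaluation model, (question, answer) -> score
) -> Tuple[str, float]:
    """Generate one candidate answer per LLM and return the best-scoring one."""
    candidates = [generate(question) for generate in generators]
    scores = [score_pair(question, answer) for answer in candidates]
    best_idx = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_idx], scores[best_idx]

In the paper's setting, the generators would wrap the four LLMs under study and score_pair would be the trained transformer-based evaluation model; the sketch only illustrates the scoring-and-selection logic, not the models themselves.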

