H.3.8. Natural Language Processing
Davud Mohammadpur; Mehdi Nazari
Abstract
Text summarization has become a popular research topic because of the rapid growth of available content. Title generation is a key task within text summarization: a concise and meaningful title is essential because it reflects an article's content, objectives, methodology, and findings, so generating an effective title requires a thorough understanding of the article. Various methods based on machine learning and deep learning have been proposed to generate titles automatically and to improve the quality of the results. This study aims to develop a title generation system for scientific articles that uses transformer-based methods to create suitable titles from article abstracts. Pre-trained transformer-based models such as BERT, T5, and PEGASUS are optimized for constructing complete sentences, but their ability to generate scientific titles is limited. We address this limitation with a proposed method that combines different models and trains them on a dataset suited to the task. To build this dataset, we collected the abstracts and titles of articles published on the ScienceDirect.com website and, after preprocessing, obtained a dataset of 50,000 articles. Evaluation of the proposed method shows an improvement of more than 20% on various ROUGE metrics for the generation of scientific titles. In addition, experts in each scientific field examined the results and found the generated titles acceptable.
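For readers unfamiliar with the ROUGE metrics mentioned above, the following minimal sketch shows how ROUGE-N (n-gram overlap) and ROUGE-L (longest common subsequence) can be computed between a reference title and a generated title. It is an illustration under stated assumptions, not the evaluation code used in the study: tokenization is plain lowercase whitespace splitting with no stemming, the function names `rouge_n` and `rouge_l` are hypothetical, and the two example titles are invented for demonstration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def prf(overlap, ref_total, cand_total):
    """Turn an overlap count into precision, recall, and F1."""
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def rouge_n(reference, candidate, n=1):
    """ROUGE-N with clipped n-gram counts (assumed whitespace tokenization)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    return prf(overlap, sum(ref.values()), sum(cand.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """ROUGE-L based on the longest common subsequence of tokens."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    return prf(lcs_length(ref, cand), len(ref), len(cand))


# Invented example titles, for illustration only.
reference_title = "Transformer based title generation for scientific articles"
generated_title = "Title generation for scientific articles using transformers"

rouge1 = rouge_n(reference_title, generated_title, n=1)
rouge2 = rouge_n(reference_title, generated_title, n=2)
rougel = rouge_l(reference_title, generated_title)
print(rouge1["f1"], rouge2["f1"], rougel["f1"])
```

In this toy case "transformer" and "transformers" do not match because no stemming is applied; established ROUGE implementations typically offer stemming and more careful tokenization, so their scores can differ from this sketch.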