H.3.8. Natural Language Processing
Milad Allahgholi; Hossein Rahmani; Parinaz Soltanzadeh
Abstract
Stance detection is the process of identifying and classifying an author's point of view, or stance, towards a specific target in a given text. Most previous studies on stance detection neglect the contextual information hidden in the input data and, as a result, produce less accurate results. In this paper, we propose a novel method called ConSPro, which uses decoder-only transformers to take contextual input data into account during stance detection. First, ConSPro applies zero-shot prompting of decoder-only transformers to extract the context of the target in the input data. Second, in addition to the target and the input text, ConSPro uses the extracted context as a third input to its ensemble method. We evaluate ConSPro on SemEval2016, and the empirical results indicate that ConSPro outperforms non-contextual approaches by 9% on average with respect to F-measure. The findings of this study show the strong capability of zero-shot prompting to extract informative contextual information with significantly less effort than previous context-extraction methods.
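The two steps described in the abstract can be illustrated with a minimal sketch. It is not the ConSPro implementation: the prompt wording, the function names, and the `generate` callable (a stand-in for any decoder-only model that maps a prompt string to a completion string) are all assumptions made for illustration, and the ensemble classifier itself is left out.

```python
from typing import Callable, Dict


def build_context_prompt(target: str, text: str) -> str:
    """Build a zero-shot prompt asking for the context of a target.

    The wording is hypothetical; the abstract does not give the actual prompt.
    """
    return (
        "Briefly describe the background context of the target "
        f"'{target}' as it relates to the following text.\n"
        f"Text: {text}\n"
        "Context:"
    )


def extract_context(target: str, text: str, generate: Callable[[str], str]) -> str:
    """Step 1: query the model once, with no labeled examples in the prompt."""
    return generate(build_context_prompt(target, text)).strip()


def build_stance_inputs(
    target: str, text: str, generate: Callable[[str], str]
) -> Dict[str, str]:
    """Step 2: bundle target, text, and extracted context as three inputs.

    A downstream stance classifier (not shown) would consume this record.
    """
    return {
        "target": target,
        "text": text,
        "context": extract_context(target, text, generate),
    }


def fake_generate(prompt: str) -> str:
    """Stub standing in for a real decoder-only model call (assumption)."""
    return "  Placeholder context produced by the model.  "


inputs = build_stance_inputs(
    "Atheism", "An invented example sentence about belief.", fake_generate
)
```

Passing the model as a callable keeps the sketch independent of any particular library; a real setup would replace `fake_generate` with a call to whichever decoder-only model is available.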
H.3.8. Natural Language Processing
Milad Allahgholi; Hossein Rahmani; Amirhossein Derakhshan; Saman Mohammadi Raouf
Abstract
Document similarity matching is essential for efficient text retrieval, plagiarism detection, and content analysis. Existing studies in this field can be categorized into three approaches: statistical analysis, deep learning, and hybrid approaches. However, to the best of our knowledge, none have incorporated the importance of named entities into their methodologies. In this paper, we propose DOSTE, a method that first extracts named entities and then utilizes them to enhance document similarity matching through statistical and graph-based analysis. Empirical results indicate that DOSTE achieves better results by emphasizing named entities, with an average improvement of 9% in average recall compared to baseline methods. Also, unlike LLM-based approaches, DOSTE does not require extensive GPU resources. Additionally, non-empirical interpretations of the results indicate that DOSTE is particularly effective in identifying similarity in short documents and in complex document comparisons.
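The idea of emphasizing named entities when comparing documents can be sketched as follows. This is not the DOSTE algorithm, whose statistical and graph-based components are not detailed in the abstract. It is a simple stand-in that blends Jaccard overlap of entity sets with Jaccard overlap of ordinary tokens. The function names and the `entity_weight` parameter are hypothetical, and the entity sets are assumed to come from any external named entity recognizer.

```python
import re
from typing import Set


def tokenize(doc: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a document, as a set."""
    return set(re.findall(r"[a-z0-9]+", doc.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def entity_weighted_similarity(
    doc_a: str,
    doc_b: str,
    entities_a: Set[str],
    entities_b: Set[str],
    entity_weight: float = 0.7,  # illustrative value, not from the paper
) -> float:
    """Blend entity overlap and token overlap, favoring shared entities."""
    entity_score = jaccard(entities_a, entities_b)
    token_score = jaccard(tokenize(doc_a), tokenize(doc_b))
    return entity_weight * entity_score + (1.0 - entity_weight) * token_score


# Invented example documents; entity sets are supplied by hand here.
doc_a = "Tehran hosted the conference"
doc_b = "The conference was held in Tehran"
score = entity_weighted_similarity(doc_a, doc_b, {"Tehran"}, {"Tehran"})
plain = jaccard(tokenize(doc_a), tokenize(doc_b))
```

In this example the two documents share their only named entity, so the blended `score` is higher than the plain token overlap `plain`. A method of this kind runs on CPU with set operations only, which is in line with the abstract's point about not needing extensive GPU resources.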