H.3.2.10. Medicine and science
Fahimeh Hafezi; Maryam Khodabakhsh
Abstract
Coronavirus disease, a persistent epidemic of acute respiratory syndrome, has posed a challenge to global healthcare systems. Unprecedented quarantine practices around the world have forced many people to stay in their homes. Since most people used social media during the Coronavirus epidemic, analyzing user-generated social content can provide new insights and help track changes and their occurrence over time. An active area in this space is the prediction of newly infected cases from Coronavirus-related social content. Identifying the social content that relates to Coronavirus is challenging because a significant number of posts contain Coronavirus-related content but do not include hashtags or Corona-related words. Conversely, some posts include the hashtag or the word Corona but are not really related to Coronavirus and are mostly promotional. In this paper, we propose a semantic approach based on word embedding techniques to model Corona, and then introduce a new feature, namely semantic similarity, to measure the similarity of a given post to Corona in semantic space. Furthermore, we propose two other features, namely fear emotion and hope feeling, to identify Coronavirus-related posts. These features are used as statistical indicators in a regression model to estimate the number of newly infected cases. We evaluate our features on a Persian dataset of Instagram posts collected during the first wave of Coronavirus, and demonstrate that the proposed features improve the performance of Coronavirus incidence rate estimation.
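The semantic similarity feature described in this abstract can be illustrated with a minimal sketch. It assumes that a post and the Corona concept are each represented by the mean of their word vectors and compared with cosine similarity; the abstract does not state the exact model, so this is one plausible reading rather than the authors' method. The toy three-dimensional vectors and the names `TOY_EMBEDDINGS`, `mean_vector`, and `semantic_similarity` are hypothetical; real vectors would come from a trained word embedding model.

```python
import math

# Hypothetical toy word vectors (assumption); a real system would load
# embeddings trained on a large corpus.
TOY_EMBEDDINGS = {
    "corona": [0.9, 0.1, 0.0],
    "quarantine": [0.8, 0.2, 0.1],
    "mask": [0.7, 0.3, 0.0],
    "discount": [0.0, 0.1, 0.9],
    "sale": [0.1, 0.0, 0.8],
}


def mean_vector(words, embeddings):
    """Average the vectors of the known words; return None if none are known."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_similarity(post, concept_words, embeddings=TOY_EMBEDDINGS):
    """Score how close a post is to a concept in the embedding space."""
    post_vec = mean_vector(post.lower().split(), embeddings)
    concept_vec = mean_vector(concept_words, embeddings)
    if post_vec is None or concept_vec is None:
        return 0.0
    return cosine(post_vec, concept_vec)


# A related post that never mentions the word "corona".
related = semantic_similarity("quarantine mask", ["corona"])
# A promotional post that does mention it.
promotional = semantic_similarity("corona discount sale", ["corona"])
print(related > promotional)
```

With these toy vectors, the post that never mentions Corona scores higher than the promotional post that does, which is the behavior the abstract motivates. In a full pipeline, a score like this would be aggregated over time and passed, together with the fear and hope features, to the regression model.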
H.6.3.2. Feature evaluation and selection
A. Zangooei; V. Derhami; F. Jamshidi
Abstract
Phishing is one of the luring techniques used to exploit personal information. A phishing webpage detection system (PWDS) extracts features from a webpage to determine whether it is a phishing page. Selecting appropriate features improves the performance of a PWDS, where the performance criteria are detection accuracy and system response time. Most of the time consumed by a PWDS arises from feature extraction, which is treated as the feature cost in this paper. Here, two novel features are proposed. They use a semantic similarity measure to determine the relationship between the content and the URL of a page. Since the suggested features do not rely on third-party services such as search engine results, the feature extraction time decreases dramatically. A login form pre-filter is utilized to reduce unnecessary calculations and the false positive rate. In this paper, a cost-based feature selection method is presented to choose the most effective features. The selected features are employed in the suggested PWDS. The extreme learning machine algorithm is used to classify webpages. The experimental results demonstrate that the suggested PWDS achieves a high accuracy of 97.6% and a short average detection time of 120.07 milliseconds.
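The idea of trading feature usefulness against extraction time can be sketched as follows. The abstract does not specify the selection algorithm, so this example substitutes a simple greedy heuristic: rank candidate features by relevance per millisecond of extraction cost and keep those that fit a time budget. The feature names, relevance scores, costs, and the function `select_features` are all hypothetical values chosen for illustration, not results from the paper.

```python
def select_features(candidates, time_budget_ms):
    """Greedy cost-aware selection (illustrative heuristic, an assumption).

    candidates: list of (name, relevance, cost_ms) with cost_ms > 0.
    Returns the selected feature names and the total extraction cost.
    """
    # Prefer features that give the most relevance per unit of extraction time.
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    selected, spent = [], 0.0
    for name, relevance, cost_ms in ranked:
        if spent + cost_ms <= time_budget_ms:
            selected.append(name)
            spent += cost_ms
    return selected, spent


# Hypothetical candidates: (name, relevance score, extraction cost in ms).
CANDIDATES = [
    ("url_content_similarity", 0.30, 15.0),
    ("has_login_form", 0.20, 5.0),
    ("search_engine_rank", 0.35, 900.0),  # third-party lookup, slow
    ("url_length", 0.05, 1.0),
]

selected, spent = select_features(CANDIDATES, time_budget_ms=100.0)
print(selected, spent)
```

In this toy setup the search engine feature has the highest relevance but is excluded because its extraction cost alone exceeds the budget, mirroring the abstract's point that avoiding third-party services shortens detection time.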