Original/Review Paper
H.3. Artificial Intelligence
Ali Zahmatkesh Zakariaee; Hossein Sadr; Mohamad Reza Yamaghani
Abstract
Machine learning (ML) is a popular tool in healthcare because it can help analyze large amounts of patient data, such as medical records, predict diseases, and identify early signs of cancer. Gastric cancer starts in the cells lining the stomach and is the fifth most common cancer worldwide. Therefore, predicting patients' survival, monitoring their health status, and detecting their risk of gastric cancer in the early stages can be very beneficial. Notably, with the help of machine learning methods, this is possible without any invasive procedures, which can help both patients and physicians make informed decisions. Accordingly, a new hybrid machine learning-based method for detecting the risk of gastric cancer is proposed in this paper. The proposed model is compared with traditional methods; based on the empirical results, not only does the proposed method outperform existing methods with an accuracy of 98%, but the results also indicate that gastric cancer is one of the most important consequences of H. pylori infection. Additionally, it can be concluded that lifestyle and dietary factors can heighten the risk of gastric cancer, especially among individuals who frequently consume fried foods and suffer from chronic atrophic gastritis and stomach ulcers. This risk is further exacerbated in individuals with limited fruit and vegetable intake and high salt consumption.
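To illustrate the general idea of a hybrid classifier on tabular risk-factor data, the following is a minimal sketch using a stacking ensemble; the file name, column names, and model components are illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of a hybrid (stacked) classifier for gastric-cancer risk,
# assuming a tabular dataset of self-reported risk factors. File path, label
# column, and base learners are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("gastric_risk.csv")           # hypothetical file
X = df.drop(columns=["cancer"])                # hypothetical label column
y = df["cancer"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

hybrid = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200)),
                ("gb", GradientBoostingClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),
)
hybrid.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, hybrid.predict(X_test)))
```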
Original/Review Paper
H.6.3.2. Feature evaluation and selection
Farhad Abedinzadeh Torghabeh; Yeganeh Modaresnia; Seyyed Abed Hosseini
Abstract
Finding and selecting relevant features without class labels using Unsupervised Feature Selection (UFS) approaches has recently become necessary in various data analysis research. Although several open-source toolboxes provide feature selection techniques to reduce redundant features, data dimensionality, and computation costs, these tools require programming knowledge, which limits their popularity, and they have not adequately addressed unlabeled real-world data. The Automatic UFS Toolbox (Auto-UFSTool) for MATLAB, proposed in this study, is a user-friendly and fully automatic toolbox that utilizes several UFS approaches from the most recent research. It is a collection of 25 robust UFS approaches, most of which were developed within the last five years. Therefore, a clear and systematic comparison of competing methods is feasible without writing a single line of code. Even users without any previous programming experience can use the implementation through the Graphical User Interface (GUI). The toolbox also provides the opportunity to evaluate the feature selection results and generate graphs that facilitate the comparison of subsets of varying sizes. It is freely accessible in the MATLAB File Exchange repository and includes scripts and source code for each technique. The toolbox is freely available to the general public at: bit.ly/AutoUFSTool
Original/Review Paper
H.3. Artificial Intelligence
Amir Mehrabinezhad; Mohammad Teshnelab; Arash Sharifi
Abstract
Due to the growing number of data-driven approaches, especially in artificial intelligence and machine learning, extracting appropriate information from the gathered data with the best performance is a remarkable challenge. The other important aspect of this issue is storage costs. Principal component analysis (PCA) and autoencoders (AEs) are examples of typical feature extraction methods in data science and machine learning that are widely used in various approaches. The current work integrates the advantages of AEs and PCA to present an online supervised feature extraction method. Accordingly, the desired labels for the final model are involved in the feature extraction procedure and embedded in the PCA method as well. Also, stacking the nonlinear autoencoder layers with the PCA algorithm eliminates the kernel selection required by traditional kernel PCA methods. Besides the performance improvement demonstrated by the experimental results, the main advantage of the proposed method is that, in contrast with traditional PCA approaches, the model does not require all samples to be available for feature extraction. Compared with previous works, the proposed method outperforms other state-of-the-art approaches in terms of accuracy and authenticity of feature extraction.
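The sketch below illustrates only the general idea of stacking a nonlinear autoencoder with PCA (linear PCA applied to the learned code, avoiding explicit kernel selection); it is not the authors' online supervised formulation, and the data and layer sizes are placeholders.

```python
# Minimal sketch: train an autoencoder, then apply PCA to its nonlinear codes.
# Data, dimensions, and training settings are illustrative assumptions.
import numpy as np
import tensorflow as tf
from sklearn.decomposition import PCA

X = np.random.rand(1000, 64).astype("float32")    # placeholder data

inp = tf.keras.Input(shape=(64,))
code = tf.keras.layers.Dense(32, activation="relu")(inp)
code = tf.keras.layers.Dense(16, activation="relu")(code)
out = tf.keras.layers.Dense(64, activation="linear")(
    tf.keras.layers.Dense(32, activation="relu")(code))

autoencoder = tf.keras.Model(inp, out)
encoder = tf.keras.Model(inp, code)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=10, batch_size=32, verbose=0)

# Linear PCA on the nonlinear codes stands in for kernel PCA without kernel selection.
Z = encoder.predict(X, verbose=0)
features = PCA(n_components=8).fit_transform(Z)
print(features.shape)
```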
Methodologies
H.3. Artificial Intelligence
Zeinab Poshtiban; Elham Ghanbari; Mohammadreza Jahangir
Abstract
Analyzing the influence of people and nodes in social networks has attracted a lot of attention. Social networks gain meaning through the groups, associations, and people interested in a specific issue or topic, and people demonstrate their theoretical and practical tendencies in such places. Influential nodes are often identified based on information related to the social network structure, and less attention is paid to the information spread by social network users. The present study aims to use the structural information of the network to identify influential users in addition to the information they share on the social network. To this aim, users' feelings were extracted. Then, an emotional or affective score was assigned to each user based on an emotional dictionary, and each user's weight in the network was determined using centrality criteria. Here, the Twitter network was used: the structure of the social network was defined and its graph was drawn after collecting and processing the data. Influential users and nodes were then identified by the proposed algorithm based on the network analysis and the extracted data. Based on the results, the nodes identified by the proposed algorithm are of high quality, and the simulated information spread is faster than with other existing algorithms.
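As a minimal sketch of the general approach (not the paper's exact algorithm), the following combines a lexicon-based emotion score with a centrality measure to rank users; the toy lexicon, graph, and combination rule are illustrative assumptions.

```python
# Minimal sketch: weight each user's centrality by a dictionary-based emotion score.
# The lexicon, graph, tweets, and scoring rule are toy assumptions.
import networkx as nx

emotion_lexicon = {"great": 1.0, "love": 0.8, "bad": -0.7, "hate": -1.0}   # toy dictionary

def emotion_score(text):
    words = text.lower().split()
    hits = [emotion_lexicon[w] for w in words if w in emotion_lexicon]
    return sum(hits) / len(hits) if hits else 0.0

G = nx.DiGraph()
G.add_edges_from([("a", "b"), ("b", "c"), ("a", "c"), ("c", "a")])   # toy follower graph
tweets = {"a": "love this great idea", "b": "bad experience", "c": "hate waiting"}

centrality = nx.pagerank(G)   # one possible centrality criterion
influence = {u: centrality[u] * (1 + abs(emotion_score(tweets.get(u, ""))))
             for u in G.nodes}
print(sorted(influence, key=influence.get, reverse=True))
```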
Technical Paper
H.5. Image Processing and Computer Vision
Mohammad Mahdi Nakhaie; Sasan Karamizadeh; Mohammad Ebrahim Shiri; Kambiz Badie
Abstract
Lung cancer is a highly serious illness, and detecting cancer cells early significantly enhances patients' chances of recovery. Doctors regularly examine a large number of CT scan images, which can lead to fatigue and errors. Therefore, there is a need to create a tool that can automatically detect and classify lung nodules in their early stages. Computer-aided diagnosis systems, often employing image processing and machine learning techniques, assist radiologists in identifying and categorizing these nodules. Previous studies have often used complex models or pre-trained networks that demand significant computational power and a long time to execute. Our goal is to achieve accurate diagnosis without the need for extensive computational resources. We introduce a simple convolutional neural network with only two convolution layers, capable of accurately classifying nodules without requiring advanced computing capabilities. We conducted training and validation on two datasets, LIDC-IDRI and LUNA16, achieving impressive accuracies of 99.7% and 97.52%, respectively. These results demonstrate the superior accuracy of our proposed model compared to state-of-the-art research papers.
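For orientation, the following is a minimal sketch of a two-convolution-layer CNN for nodule patch classification, assuming 64x64 grayscale patches; the layer sizes are illustrative and not the paper's exact architecture.

```python
# Minimal sketch of a lightweight CNN with only two convolution layers for
# binary nodule classification; input size and filter counts are assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # nodule vs. non-nodule
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```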
Original/Review Paper
H.3. Artificial Intelligence
Hamid Ghaffari; Hemmatollah Pirdashti; Mohammad Reza Kangavari; Sjoerd Boersma
Abstract
An intelligent growth chamber was designed in 2021 to model and optimize rice seedlings' growth. Accordingly, an experiment was implemented at Sari University of Agricultural Sciences and Natural Resources, Iran, in March, April, and May 2021. The model inputs included radiation, temperature, carbon dioxide, and soil acidity. These growth factors were studied at ambient and incremental levels. The model outputs were the seedlings' height, root length, chlorophyll content, CGR, RGR, number of leaves, and shoot dry weight. Rice seedlings' growth was modeled using LSTM neural networks and optimized by the Bayesian method. It was concluded that the best parameter setting was epoch = 100, learning rate = 0.001, and iteration number = 500. The best performance during training was obtained at a validation RMSE of 0.2884.
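A minimal sketch of an LSTM regressor for growth traits is shown below, using the reported settings (learning rate 0.001, 100 epochs); the data shapes and placeholder arrays are assumptions, and the Bayesian hyperparameter search itself is omitted.

```python
# Minimal sketch of an LSTM regression model for seedling growth outputs.
# Sequence length, data, and output dimensionality are placeholder assumptions.
import numpy as np
import tensorflow as tf

# 500 samples, 7 time steps, 4 inputs (radiation, temperature, CO2, soil acidity)
X = np.random.rand(500, 7, 4).astype("float32")
y = np.random.rand(500, 7).astype("float32")       # 7 growth outputs (placeholder)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(7, 4)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(7),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="mse",
              metrics=[tf.keras.metrics.RootMeanSquaredError()])
model.fit(X, y, epochs=100, validation_split=0.2, verbose=0)
```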
Original/Review Paper
H.3. Artificial Intelligence
Ali Rebwar Shabrandi; Ali Rajabzadeh Ghatari; Nader Tavakoli; Mohammad Dehghan Nayeri; Sahar Mirzaei
Abstract
To mitigate COVID-19’s overwhelming burden, a rapid and efficient early screening scheme for COVID-19 at the front line is required. Much research has utilized laboratory tests, CT scans, and X-ray data, which are obstacles to agile and real-time screening. In this study, we propose a user-friendly and low-cost COVID-19 detection model based on self-reportable data collected at home. An exhaustive set of input features was identified and grouped into demographic, symptom, semi-clinical, and past/present disease categories. We employed grid search to identify the optimal combination of hyperparameter settings that yields the most accurate prediction. Next, we applied the proposed model with tuned hyperparameters to 11 classic state-of-the-art classifiers. The results show that the XGBoost classifier provides the highest accuracy of 73.3%; although statistical analysis shows no significant difference between the accuracy of XGBoost and AdaBoost, it confirms the superiority of these two methods over the others. Furthermore, the most important features obtained using SHapley Additive exPlanations (SHAP) were analyzed. “Contact with infected people,” “cough,” “muscle pain,” “fever,” “age,” “cardiovascular comorbidities,” “PO2,” and “respiratory distress” are the most important variables. Among these, the first three have a relatively large positive impact on the target variable, whereas “age,” “PO2,” and “respiratory distress” are highly negatively correlated with the target variable. Finally, we built a clinically operable, visible, and easy-to-interpret decision tree model to predict COVID-19 infection.
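The following is a minimal sketch of grid-searching an XGBoost classifier on self-reportable features and inspecting SHAP importances; the file name, feature columns, and grid values are illustrative assumptions.

```python
# Minimal sketch: grid search over XGBoost hyperparameters, then SHAP analysis.
# Data file, label column, and parameter grid are hypothetical placeholders.
import pandas as pd
import shap
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV

df = pd.read_csv("covid_self_report.csv")            # hypothetical file
X, y = df.drop(columns=["covid_positive"]), df["covid_positive"]

grid = GridSearchCV(
    XGBClassifier(eval_metric="logloss"),
    param_grid={"n_estimators": [100, 300], "max_depth": [3, 5],
                "learning_rate": [0.05, 0.1]},
    scoring="accuracy", cv=5,
)
grid.fit(X, y)
print("best cross-validated accuracy:", grid.best_score_)

# SHAP values indicate which features push predictions toward infection.
explainer = shap.TreeExplainer(grid.best_estimator_)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)
```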
Original/Review Paper
H.3. Artificial Intelligence
Mahdi Rasouli; Vahid Kiani
Abstract
The identification of emotions in short texts of low-resource languages poses a significant challenge, requiring specialized frameworks and computational intelligence techniques. This paper presents a comprehensive exploration of shallow and deep learning methods for emotion detection in short Persian texts. Shallow learning methods employ feature extraction and dimension reduction to enhance classification accuracy. On the other hand, deep learning methods utilize transfer learning and word embedding, particularly BERT, to achieve high classification accuracy. A Persian dataset called "ShortPersianEmo" is introduced to evaluate the proposed methods, comprising 5472 diverse short Persian texts labeled in five main emotion classes. The evaluation results demonstrate that transfer learning and BERT-based text embedding classify short Persian texts more accurately than the alternative approaches. The dataset of this study, ShortPersianEmo, will be publicly available online at https://github.com/vkiani/ShortPersianEmo.
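A minimal sketch of BERT-based transfer learning for five emotion classes is given below; the Persian checkpoint name, CSV files, and column names are assumptions, not details taken from the paper.

```python
# Minimal sketch of fine-tuning a Persian BERT model for 5-class emotion detection.
# Checkpoint name, data files, and "text"/"label" columns are assumptions.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

checkpoint = "HooshvareLab/bert-fa-base-uncased"    # assumed Persian BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=5)

dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True,
                                          padding="max_length", max_length=64),
                      batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="emo_bert", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```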
Technical Paper
G.3.7. Database Machines
Abdul Aziz Danaa Abukari; Mohammed Daabo Ibrahim; Alhassan Abdul-Barik
Abstract
Hidden Markov Models (HMMs) are machine learning models that have been applied to a range of real-life applications including intrusion detection, pattern recognition, thermodynamics, and statistical mechanics, among others. A multi-layered HMM for real-time fraud detection and prevention that drastically reduces the number of false positives and negatives is proposed and implemented in this study. The study also focused on reducing the parameter optimization and detection times of the proposed models using a hybrid algorithm comprising the Baum-Welch, Genetic, and Particle Swarm Optimization algorithms. Simulation results revealed that, in terms of precision, recall, and F1-score, our proposed model performed better than other approaches proposed in the literature.
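To show the basic mechanism only, the sketch below fits a single HMM to a transaction sequence with Baum-Welch (via hmmlearn's fit) and flags low-likelihood sequences; the multi-layered structure and GA/PSO hybrid tuning from the paper are not reproduced, and the amounts and threshold are toy assumptions.

```python
# Minimal sketch: fit an HMM to past legitimate transaction amounts and flag
# incoming sequences whose average log-likelihood drops below a baseline.
import numpy as np
from hmmlearn.hmm import GaussianHMM

# Placeholder: amounts of past legitimate transactions for one account
legit = np.array([[12.0], [40.0], [25.0], [18.0], [33.0], [22.0], [15.0]])

model = GaussianHMM(n_components=3, n_iter=100, random_state=0)
model.fit(legit)                                    # Baum-Welch (EM) training
baseline = model.score(legit) / len(legit)          # average log-likelihood

new_seq = np.array([[20.0], [950.0], [990.0]])      # incoming transactions
if model.score(new_seq) / len(new_seq) < baseline - 5.0:   # illustrative threshold
    print("flag sequence for review")
```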
Technical Paper
B.3. Communication/Networking and Information Technology
S. Mojtaba Matinkhah; Roya Morshedi; Akbar Mostafavi
Abstract
The Internet of Things (IoT) has emerged as a rapidly growing technology that enables seamless connectivity between a wide variety of devices. However, with this increased connectivity comes an increased risk of cyber-attacks. In recent years, the development of intrusion detection systems (IDS) has become critical for ensuring the security and privacy of IoT networks. This article presents a study that evaluates the accuracy of an IDS for detecting network attacks in IoT networks. The proposed IDS uses a decision tree classifier and is tested on four benchmark datasets: NSL-KDD, BOT-IoT, CICIDS2017, and MQTT-IoT. The impact of noise in the training and test datasets on classification accuracy is analyzed. The results indicate that clean data yields the highest accuracy, while noisy datasets significantly reduce it; when both training and test datasets are noisy, classification accuracy decreases further. The findings demonstrate the importance of using clean data for training and testing an IDS in IoT networks to achieve accurate classification. This research provides valuable insights for the development of a robust and accurate IDS for IoT networks.
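The sketch below illustrates the noise experiment in miniature: a decision tree is trained and tested with Gaussian noise injected into the features, and accuracy is compared across noise levels. Synthetic data stands in for the benchmark datasets, so the numbers are not the paper's results.

```python
# Minimal sketch: measure how feature noise degrades a decision-tree IDS.
# Synthetic data replaces the NSL-KDD/BOT-IoT/CICIDS2017/MQTT-IoT benchmarks.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for noise in [0.0, 0.5, 1.0]:                        # noise standard deviation
    rng = np.random.default_rng(0)
    clf = DecisionTreeClassifier(random_state=0)
    clf.fit(X_tr + rng.normal(0, noise, X_tr.shape), y_tr)      # noisy training set
    acc = accuracy_score(y_te, clf.predict(X_te + rng.normal(0, noise, X_te.shape)))
    print(f"noise std {noise}: accuracy {acc:.3f}")
```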
Original/Review Paper
H.3. Artificial Intelligence
Seyed Alireza Bashiri Mosavi; Omid Khalaf Beigi
Abstract
A speedy and accurate transient stability assessment (TSA) is obtained by employing efficient machine learning- and statistics-based (MLST) algorithms on the transient nonlinear time series space. In the MLST framework, forming a compacted optimal transient feature space (COTFS) from raw high-dimensional transient data through feature selection can pave the way for high-performance TSA. Hence, designing a comprehensive feature selection scheme (FSS) that populates the COTFS with relevant-discriminative transient features (RDTFs) is an urgent need. This work introduces a twin hybrid FSS (THFSS) to select RDTFs from transient 28-variate time series data. Each fold of the THFSS comprises filter-wrapper mechanisms. The conditional relevancy rate (CRR), based on mutual information (MI) and entropy calculations, is used as the filter method, while incremental wrapper subset selection (IWSS) and IWSS with replacement (IWSSr), formed by a kernelized support vector machine (SVM) and a twin SVM (TWSVM), serve as the wrapper methods. After applying the THFSS to the transient univariate series, the RDTFs are entered into a cross-validation-based train-test procedure to evaluate their efficiency in TSA. The results show that the THFSS-based RDTFs achieve a prediction accuracy of 98.87% and a processing time of 102.653 milliseconds for TSA.
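As a simplified illustration of the filter-wrapper idea (not the paper's exact CRR/THFSS scheme), the sketch below ranks features by mutual information and then adds them incrementally while an SVM's cross-validated accuracy improves; the synthetic 28-feature data is an assumption.

```python
# Minimal sketch of a filter-wrapper scheme: MI ranking (filter) followed by an
# IWSS-style incremental search with an RBF-kernel SVM (wrapper).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=28, n_informative=6, random_state=1)

ranking = np.argsort(mutual_info_classif(X, y, random_state=1))[::-1]    # filter step
selected, best = [], 0.0
for f in ranking:                                                         # wrapper step
    trial = selected + [int(f)]
    score = cross_val_score(SVC(kernel="rbf"), X[:, trial], y, cv=5).mean()
    if score > best:
        selected, best = trial, score

print("selected features:", selected, "cv accuracy:", round(best, 3))
```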
Methodologies
H.6. Pattern Recognition
Sadegh Rahmani Rahmani-Boldaji; Mehdi Bateni; Mahmood Mortazavi Dehkordi
Abstract
Efficient regular-frequent pattern mining from sensor-produced data has become a challenge. The large volume of data leads to prolonged runtime, thus delaying vital predictions and decision making that need an immediate response. Therefore, using big data platforms and parallel algorithms is an appropriate solution. Additionally, an incremental technique is more suitable for mining patterns from big data streams than static methods. This study presents an incremental parallel approach and a compact tree structure for extracting regular-frequent patterns from wireless sensor network data. Furthermore, fewer database scans are performed in an effort to reduce the mining runtime. The experiments were performed on the Intel 5-day and 10-day datasets with 6-, 4-, and 2-node clusters. The findings show the runtime was improved in all three cluster modes by 14%, 18%, and 34% for the 5-day dataset and by 22%, 55%, and 85% for the 10-day dataset, respectively.
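For readers unfamiliar with the notion, the sketch below checks whether a single item is "regular-frequent" in a stream of sensor transactions (support at least a minimum count and maximum gap between occurrences bounded); the incremental tree structure and cluster parallelism described in the paper are omitted, and the thresholds and stream are toy assumptions.

```python
# Minimal sketch of the regular-frequent criterion: an item must occur at least
# min_sup times, and the maximum period between occurrences must not exceed max_reg.
def is_regular_frequent(transactions, item, min_sup, max_reg):
    positions = [i for i, t in enumerate(transactions, start=1) if item in t]
    if len(positions) < min_sup:
        return False
    # Periods include the gap before the first and after the last occurrence.
    gaps = [positions[0]] + [b - a for a, b in zip(positions, positions[1:])]
    gaps.append(len(transactions) - positions[-1])
    return max(gaps) <= max_reg

stream = [{"s1", "s2"}, {"s1"}, {"s2", "s3"}, {"s1", "s3"}, {"s1"}]   # toy sensor stream
print(is_regular_frequent(stream, "s1", min_sup=3, max_reg=2))        # True
```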