A.1. General
Morteza Mohammadi Zanjireh; Farzad Morady
Abstract
This paper predicts crash severity by analyzing multiple variables with machine learning methods. Crash data from 2012 to 2024 for the city of Tempe, Arizona, USA, were used. Features were selected using a metaheuristic method. Crash severity was then classified using a decision tree and an artificial neural network. Based on the evaluation metrics, the decision tree, with an overall accuracy of 54%, was the better model. Finally, this model was interpreted using the permutation feature importance method. The results show that the year (0.22), the spatial characteristics (0.11), and the collision manner (0.1) are the most important features for predicting crash severity on urban roads.
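As an illustration of the permutation feature importance idea mentioned in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes accuracy as the score; the toy data, the stand-in `model`, and the function names are hypothetical. A feature's importance is taken as the mean drop in accuracy after its column is shuffled.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only feature j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
X = [[0, 5], [1, 3], [0, 7], [1, 1], [0, 2], [1, 9], [0, 4], [1, 6]]
y = [0, 1, 0, 1, 0, 1, 0, 1]
model = lambda row: row[0]  # stand-in for a trained classifier

importances = permutation_importance(model, X, y)
```

Because the stand-in model ignores feature 1, shuffling that column cannot change its predictions, so its importance is exactly zero, while shuffling feature 0 degrades accuracy.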
H.3. Artificial Intelligence
Mohamad Mahdi Yadegar; Hossein Rahmani
Abstract
In recent years, new technologies have brought innovations to the financial and commercial world, but they have also given fraudsters many ways to commit fraud and cause companies substantial losses. Advanced technologies make it possible to build systems that detect fraudulent patterns and prevent future incidents. Machine learning algorithms are increasingly used for fraud detection in financial data. A common challenge, however, is dataset imbalance, which hinders traditional machine learning methods, and finding the best approach to imbalanced datasets is a problem many researchers face. In this paper, we propose a method called FinFD-GCN that uses Graph Convolutional Networks (GCNs) for fraud detection in credit card transaction datasets. FinFD-GCN represents transactions as a graph in which each node represents a transaction and each edge represents the similarity between two transactions. With this graph representation, FinFD-GCN can capture complex relationships and anomalies that traditional methods may overlook or cannot detect at all, thus enhancing the accuracy and robustness of fraud detection in financial data. We evaluate the proposed method using common evaluation metrics and confusion matrices. FinFD-GCN achieves significant improvements in recall and AUC compared to traditional methods such as logistic regression, support vector machines, and random forests, making it a robust solution for credit card fraud detection. On this credit card dataset, the GCN model outperformed the baseline models by 5% in F1 and 10% in AUC.
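The graph representation described in this abstract (one node per transaction, edges between similar transactions) can be sketched as follows. The abstract does not state the similarity measure or how edges are chosen, so the cosine similarity, the `threshold` value, the toy feature vectors, and the function names here are all assumptions. The `mean_aggregate` step is only a simplified neighbourhood average, not a full GCN layer (it has no learned weights and no degree normalization).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_similarity_graph(features, threshold=0.9):
    """Adjacency sets linking transactions whose similarity is >= threshold."""
    n = len(features)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(features[i], features[j]) >= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

def mean_aggregate(features, adjacency):
    """Replace each node's features by the mean over itself and its neighbours."""
    aggregated = []
    for i, row in enumerate(features):
        group = [row] + [features[j] for j in sorted(adjacency[i])]
        aggregated.append([sum(col) / len(group) for col in zip(*group)])
    return aggregated

# Hypothetical transaction feature vectors (already scaled).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = build_similarity_graph(features, threshold=0.9)
smoothed = mean_aggregate(features, graph)
```

In this toy example, transactions 0 and 1 are linked, as are 2 and 3, and each node's smoothed features sit halfway between the pair. The all-pairs comparison is quadratic in the number of transactions, so a real dataset would likely need a nearest-neighbour index or another sparsification strategy.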
I.3.7. Engineering
Elahe Moradi
Abstract
Thyroid disease is common worldwide, and early diagnosis plays an important role in effective treatment and management. Machine learning techniques can play a vital role in thyroid disease diagnosis. This research proposes tree-based machine learning algorithms with hyperparameter optimization to predict thyroid disease. The thyroid disease dataset from the UCI Repository is used as a benchmark to evaluate the performance of the proposed algorithms. After data preprocessing and normalization, the data were balanced using the random oversampling (ROS) technique. Two methods, grid search (GS) and random search (RS), were employed to optimize hyperparameters. Finally, the proposed algorithms (decision tree, random forest, AdaBoost, and extreme gradient boosting) were implemented in Python and evaluated using several criteria. The simulation results indicate that the Extreme Gradient Boosting (XGB) algorithm with grid search outperforms all the other algorithms, achieving an accuracy, AUC, sensitivity, precision, and MCC of 99.39%, 99.97%, 98.85%, 99.40%, and 98.79%, respectively. These results demonstrate the potential of the proposed method for accurately predicting thyroid disease.
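Two of the ingredients named in this abstract, random oversampling (ROS) and the Matthews correlation coefficient (MCC), can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the toy data and function names are hypothetical, and the MCC helper assumes binary labels in {0, 1}.

```python
import math
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels in {0, 1}; returns 0.0 when the denominator is zero."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical imbalanced data: four negatives, one positive.
X = [[1], [2], [3], [4], [5]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)

mcc_perfect = matthews_corrcoef([1, 1, 0, 0], [1, 1, 0, 0])
mcc_partial = matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0])
```

A common precaution is to oversample only the training split, so that duplicated rows cannot appear in both training and test data; the abstract does not say how this was handled.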
H.3.2.6. Games and infotainment
Shaqayeq Saffari; Morteza Dorrigiv; Farzin Yaghmaee
Abstract
Procedural Content Generation (PCG) through automated and algorithmic content generation is an active research field in the gaming industry. Recently, Machine Learning (ML) approaches have played a pivotal role in advancing this area. While recent studies have primarily focused on examining one or a few specific approaches in PCG, this paper provides a more comprehensive perspective by exploring a wider range of approaches, their applications, advantages, and disadvantages. Furthermore, the current challenges and potential future trends in this field are discussed. Although this paper does not aim to provide an exhaustive review of all existing research due to the rapid and expansive growth of this domain, it is based on the analysis of selected articles published between 2020 and 2024.
H.3. Artificial Intelligence
Sajjad Alizadeh Fard; Hossein Rahmani
Abstract
Fraud in financial data is a significant concern for both businesses and individuals. Credit card transactions involve numerous features, some of which may lack relevance for classifiers and could lead to overfitting. A pivotal step in the fraud detection process is feature selection, which profoundly impacts model accuracy and execution time. In this paper, we introduce an ensemble-based, explainable feature selection framework founded on SHAP and LIME algorithms, called "X-SHAoLIM". We applied our framework to diverse combinations of the best models from previous studies, conducting both quantitative and qualitative comparisons with other feature selection methods. The quantitative evaluation of the "X-SHAoLIM" framework across various model combinations revealed consistent accuracy improvements on average, including increases in Precision (+5.6), Recall (+1.5), F1-Score (+3.5), and AUC-PR (+6.75). Beyond enhanced accuracy, our proposed framework, leveraging explainable algorithms like SHAP and LIME, provides a deeper understanding of features' importance in model predictions, delivering effective explanations to system users.
H.3. Artificial Intelligence
Damianus Kofi Owusu; Christiana Cynthia Nyarko; Joseph Acquah; Joel Yarney
Abstract
Head and neck cancer (HNC) recurrence is increasing among Ghanaian men and women. Not all machine learning classifiers are created equal: even when several of them suit a given task well, it can be very difficult to find one that performs optimally across different data distributions. Stacking learns how best to combine weak classifier models into a strong model. To build a prognostic model for classifying HNSCC recurrence patterns, this study sought to identify the best stacked ensemble classifier when the same ML classifiers are used for both feature selection and stacked ensemble learning. Four stacked ensemble models were developed, each using a GBM meta-classifier: the first used two base classifiers, gradient boosting machine (GBM) and distributed random forest (DRF); the second used three, GBM, DRF, and deep neural network (DNN); the third used four, GBM, DRF, DNN, and generalized linear model (GLM); and the fourth used five, GBM, DRF, DNN, GLM, and naïve Bayes (NB). The results showed that the stacked ensemble with five base classifiers performed better on the gradient boosted features than on the other feature subsets, and that it also outperformed the other stacked ensembles on both the gradient boosted features and the other feature subsets. A stacked ensemble with five base classifiers trained on GBM features is clinically appropriate as a prognostic model for classifying and predicting recurrence in HNSCC patients.
H.3. Artificial Intelligence
Ali Zahmatkesh Zakariaee; Hossein Sadr; Mohamad Reza Yamaghani
Abstract
Machine learning (ML) is a popular tool in healthcare because it can help analyze large amounts of patient data, such as medical records, predict diseases, and identify early signs of cancer. Gastric cancer starts in the cells lining the stomach and is the fifth most common cancer worldwide. Predicting patient survival, monitoring health status, and detecting the risk of gastric cancer in its early stages can therefore be very beneficial. With the help of machine learning methods, this is possible without any invasive procedures, which can help both patients and physicians make informed decisions. Accordingly, a new hybrid machine learning-based method for detecting the risk of gastric cancer is proposed in this paper. The proposed model is compared with traditional methods. Based on the empirical results, the proposed method not only outperforms existing methods with an accuracy of 98%, but the results also indicate that gastric cancer can be one of the most important consequences of H. pylori infection. Additionally, it can be concluded that lifestyle and dietary factors can heighten the risk of gastric cancer, especially among individuals who frequently consume fried foods and suffer from chronic atrophic gastritis and stomach ulcers. This risk is further exacerbated in individuals with limited fruit and vegetable intake and high salt consumption.
H.3. Artificial Intelligence
Ali Rebwar Shabrandi; Ali Rajabzadeh Ghatari; Nader Tavakoli; Mohammad Dehghan Nayeri; Sahar Mirzaei
Abstract
To mitigate COVID-19’s overwhelming burden, a rapid and efficient early screening scheme for first-line use is required. Much research has relied on laboratory tests, CT scans, and X-ray data, which are obstacles to agile, real-time screening. In this study, we propose a user-friendly and low-cost COVID-19 detection model based on data that can be self-reported at home. The most exhaustive set of input features was identified and grouped into demographic, symptom, semi-clinical, and past/present disease categories. We employed grid search to identify the combination of hyperparameter settings that yields the most accurate prediction. Next, we applied the proposed model with tuned hyperparameters to 11 classic state-of-the-art classifiers. The results show that the XGBoost classifier provides the highest accuracy, 73.3%. Statistical analysis shows no significant difference between the accuracy of XGBoost and AdaBoost, but it confirms the superiority of these two methods over the others. Furthermore, the most important features, obtained using SHapley Additive exPlanations (SHAP), were analyzed. “Contact with infected people,” “cough,” “muscle pain,” “fever,” “age,” “cardiovascular comorbidities,” “PO2,” and “respiratory distress” are the most important variables. Among these, the first three have a relatively large positive impact on the target variable, whereas “age,” “PO2,” and “respiratory distress” are highly negatively correlated with it. Finally, we built a clinically operable, visible, and easy-to-interpret decision tree model to predict COVID-19 infection.
H.3. Artificial Intelligence
Saheb Ghanbari Motlagh; Fateme Razi Astaraei; Mojtaba Hajihosseini; Saeed Madani
Abstract
This study explores the potential use of Machine Learning (ML) techniques to enhance three types of nano-based solar cells. Perovskites of methylammonium-free formamidinium (FA) and mixed cation-based cells exhibit boosted efficiency when ML techniques are employed. Moreover, ML methods are used to identify optimal donor complexes and high blind temperature materials, and to advance the thermodynamic stability of perovskites. Another significant application of ML, in dye-sensitized solar cells (DSSCs), is the detection of novel dyes, solvents, and molecules that improve the efficiency and performance of solar cells. Some of these materials have increased cell efficiency, short-circuit current, and light absorption by more than 20%. ML algorithms that fine-tune network and plasmonic field bandwidths improve the efficiency and light absorption of surface plasmonic resonance (SPR) solar cells. This study outlines the potential of ML techniques to optimize and improve the development of nano-based solar cells, leading to promising results for the field of solar energy generation and supporting the demand for sustainable and dependable energy.
H.3. Artificial Intelligence
Mohammad Hossein Shayesteh; Behrooz Shahrokhzadeh; Behrooz Masoumi
Abstract
This paper provides a comprehensive review of the potential of game theory as a solution for sensor-based human activity recognition (HAR) challenges. Game theory is a mathematical framework that models interactions between multiple entities in various fields, including economics, political science, and computer science. In recent years, game theory has been increasingly applied to machine learning challenges, including HAR, as a potential solution to improve recognition performance and efficiency of recognition algorithms. The review covers the shared challenges between HAR and machine learning, compares previous work on traditional approaches to HAR, and discusses the potential advantages of using game theory. It discusses different game theory approaches, including non-cooperative and cooperative games, and provides insights into how they can improve the HAR systems. The authors propose new game theory-based approaches and evaluate their effectiveness compared to traditional approaches. Overall, this review paper contributes to expanding the scope of research in HAR by introducing game-theoretic concepts and solutions to the field and provides valuable insights for researchers interested in applying game-theoretic approaches to HAR.
Oladosu Oladimeji; Olayanju Oladimeji
Abstract
Breast cancer is the second major cause of death and accounts for 16% of all cancer deaths worldwide. Most methods of detecting breast cancer, such as mammography, are very expensive and difficult to interpret. They also have limitations, such as cumulative radiation exposure, over-diagnosis, and false positives and negatives in women with dense breasts, which create uncertainty in high-risk populations. The objective of this study is to detect breast cancer from blood analysis data using classification algorithms, as a complement to these expensive methods. High-ranking features were extracted from the dataset. The KNN, SVM, and J48 algorithms were trained to classify 116 instances. Furthermore, 10-fold cross-validation and holdout procedures were used, together with varying the random seed. The results showed that the KNN algorithm had the highest accuracy, 89.99% for cross-validation and 85.21% for the holdout procedure. It was followed by J48, with 84.65% and 75.65% for the two procedures, respectively. SVM achieved 77.58% and 68.69%, respectively. Although blood glucose level was found to be a major determinant in detecting breast cancer, it has to be combined with other attributes to reach a decision, because it is also affected by other health issues such as diabetes. Based on these results, women are advised to have regular check-ups, including blood analysis, to learn which blood components need attention to prevent breast cancer, according to the model generated in this study.
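The evaluation procedure named in this abstract, k-nearest neighbours scored with 10-fold cross-validation, can be sketched as below. This is an illustration under stated assumptions rather than the study's setup: the toy data, the choice of `k=3`, squared Euclidean distance, the fold assignment scheme, and the function names are all hypothetical.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Majority label among the k training rows closest to the query
    (squared Euclidean distance)."""
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

def k_fold_accuracy(X, y, n_folds=10, k=3, seed=0):
    """Accuracy pooled over n_folds held-out folds; changing the seed
    changes the fold assignment."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in indices if i not in held_out]
        train_y = [y[i] for i in indices if i not in held_out]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold
        )
    return correct / len(X)

# Hypothetical toy data: two well-separated clusters of ten points each.
X = [[i * 0.1, 0.0] for i in range(10)] + [[10 + i * 0.1, 0.0] for i in range(10)]
y = [0] * 10 + [1] * 10

cv_accuracy = k_fold_accuracy(X, y, n_folds=10, k=3, seed=0)
```

Repeating the call with different `seed` values gives a rough sense of how much the estimate depends on the fold assignment, which is similar in spirit to varying the random seed as described in the abstract.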