H.3. Artificial Intelligence
Sajjad Alizadeh Fard; Hossein Rahmani
Abstract
Fraud in financial data is a significant concern for both businesses and individuals. Credit card transactions involve numerous features, some of which may lack relevance for classifiers and could lead to overfitting. A pivotal step in the fraud detection process is feature selection, which profoundly ...
Read More
Fraud in financial data is a significant concern for both businesses and individuals. Credit card transactions involve numerous features, some of which may lack relevance for classifiers and could lead to overfitting. A pivotal step in the fraud detection process is feature selection, which profoundly impacts model accuracy and execution time. In this paper, we introduce an ensemble-based, explainable feature selection framework founded on SHAP and LIME algorithms, called "X-SHAoLIM". We applied our framework to diverse combinations of the best models from previous studies, conducting both quantitative and qualitative comparisons with other feature selection methods. The quantitative evaluation of the "X-SHAoLIM" framework across various model combinations revealed consistent accuracy improvements on average, including increases in Precision (+5.6), Recall (+1.5), F1-Score (+3.5), and AUC-PR (+6.75). Beyond enhanced accuracy, our proposed framework, leveraging explainable algorithms like SHAP and LIME, provides a deeper understanding of features' importance in model predictions, delivering effective explanations to system users.
H.3. Artificial Intelligence
Damianus Kofi Owusu; Christiana Cynthia Nyarko; Joseph Acquah; Joel Yarney
Abstract
Head and neck cancer (HNC) recurrence is ever increasing among Ghanaian men and women. Because not all machine learning classifiers are equally created, even if multiple of them suite very well for a given task, it may be very difficult to find one which performs optimally given different distributions. ...
Read More
Head and neck cancer (HNC) recurrence is ever increasing among Ghanaian men and women. Because not all machine learning classifiers are equally created, even if multiple of them suite very well for a given task, it may be very difficult to find one which performs optimally given different distributions. The stacking learns how to best combine weak classifier models to form a strong model. As a prognostic model for classifying HNSCC recurrence patterns, this study tried to identify the best stacked ensemble classifier model when the same ML classifiers for feature selection and stacked ensemble learning are used. Four stacked ensemble models; in which first one used two base classifiers: gradient boosting machine (GBM) and distributed random forest (DRF); second one used three base classifiers: GBM, DRF, and deep neural network (DNN); third one used four base classifiers: GBM, DRF, DNN, and generalized linear model (GLM); and fourth one used five base classifiers: GBM, DRF, DNN, GLM, and Naïve bayes (NB) were developed, using GBM meta-classifier in each case. The results showed that implementing stacked ensemble technique consisting of five base classifiers on gradient boosted features achieved better performance than achieved on other feature subsets, and implementing this stacked ensemble technique on gradient boosted features achieved better performance compared to other stacked ensemble techniques implemented on gradient boosted features and other feature subsets used. Learning stacked ensemble technique having five base classifiers on GBM features is clinically appropriate as a prognostic model for classifying and predicting HNSCC patients’ recurrence data.
H.3. Artificial Intelligence
Ali Zahmatkesh Zakariaee; Hossein Sadr; Mohamad Reza Yamaghani
Abstract
Machine learning (ML) is a popular tool in healthcare while it can help to analyze large amounts of patient data, such as medical records, predict diseases, and identify early signs of cancer. Gastric cancer starts in the cells lining the stomach and is known as the 5th most common cancer worldwide. ...
Read More
Machine learning (ML) is a popular tool in healthcare while it can help to analyze large amounts of patient data, such as medical records, predict diseases, and identify early signs of cancer. Gastric cancer starts in the cells lining the stomach and is known as the 5th most common cancer worldwide. Therefore, predicting the survival of patients, checking their health status, and detecting their risk of gastric cancer in the early stages can be very beneficial. Surprisingly, with the help of machine learning methods, this can be possible without the need for any invasive methods which can be useful for both patients and physicians in making informed decisions. Accordingly, a new hybrid machine learning-based method for detecting the risk of gastric cancer is proposed in this paper. The proposed model is compared with traditional methods and based on the empirical results, not only the proposed method outperform existing methods with an accuracy of 98% but also gastric cancer can be one of the most important consequences of H. pylori infection. Additionally, it can be concluded that lifestyle and dietary factors can heighten the risk of gastric cancer, especially among individuals who frequently consume fried foods and suffer from chronic atrophic gastritis and stomach ulcers. This risk is further exacerbated in individuals with limited fruit and vegetable intake and high salt consumption.
H.3. Artificial Intelligence
Ali Rebwar Shabrandi; Ali Rajabzadeh Ghatari; Nader Tavakoli; Mohammad Dehghan Nayeri; Sahar Mirzaei
Abstract
To mitigate COVID-19’s overwhelming burden, a rapid and efficient early screening scheme for COVID-19 in the first-line is required. Much research has utilized laboratory tests, CT scans, and X-ray data, which are obstacles to agile and real-time screening. In this study, we propose a user-friendly ...
Read More
To mitigate COVID-19’s overwhelming burden, a rapid and efficient early screening scheme for COVID-19 in the first-line is required. Much research has utilized laboratory tests, CT scans, and X-ray data, which are obstacles to agile and real-time screening. In this study, we propose a user-friendly and low-cost COVID-19 detection model based on self-reportable data at home. The most exhausted input features were identified and included in the demographic, symptoms, semi-clinical, and past/present disease data categories. We employed Grid search to identify the optimal combination of hyperparameter settings that yields the most accurate prediction. Next, we apply the proposed model with tuned hyperparameters to 11 classic state-of-the-art classifiers. The results show that the XGBoost classifier provides the highest accuracy of 73.3%, but statistical analysis shows that there is no significant difference between the accuracy performance of XGBoost and AdaBoost, although it proved the superiority of these two methods over other methods. Furthermore, the most important features obtained using SHapely Adaptive explanations were analyzed. “Contact with infected people,” “cough,” “muscle pain,” “fever,” “age,” “Cardiovascular commodities,” “PO2,” and “respiratory distress” are the most important variables. Among these variables, the first three have a relatively large positive impact on the target variable. Whereas, “age,” “PO2”, and “respiratory distress” are highly negatively correlated with the target variable. Finally, we built a clinically operable, visible, and easy-to-interpret decision tree model to predict COVID-19 infection.
H.3. Artificial Intelligence
Saheb Ghanbari Motlagh; Fateme Razi Astaraei; Mojtaba Hajihosseini; Saeed Madani
Abstract
This study explores the potential use of Machine Learning (ML) techniques to enhance three types of nano-based solar cells. Perovskites of methylammonium-free formamidinium (FA) and mixed cation-based cells exhibit a boosted efficiency when employing ML techniques. Moreover, ML methods are utilized to ...
Read More
This study explores the potential use of Machine Learning (ML) techniques to enhance three types of nano-based solar cells. Perovskites of methylammonium-free formamidinium (FA) and mixed cation-based cells exhibit a boosted efficiency when employing ML techniques. Moreover, ML methods are utilized to identify optimal donor complexes, high blind temperature materials, and to advance the thermodynamic stability of perovskites. Another significant application of ML in dye-sensitized solar cells (DSSCs) is the detection of novel dyes, solvents, and molecules for improving the efficiency and performance of solar cells. Some of these materials have increased cell efficiency, short-circuit current, and light absorption by more than 20%. ML algorithms to fine-tune network and plasmonic field bandwidths improve the efficiency and light absorption of surface plasmonic resonance (SPR) solar cells. This study outlines the potential of ML techniques to optimize and improve the development of nano-based solar cells, leading to promising results for the field of solar energy generation and supporting the demand for sustainable and dependable energy.
H.3. Artificial Intelligence
Mohammad Hossein Shayesteh; Behrooz Shahrokhzadeh; Behrooz Masoumi
Abstract
This paper provides a comprehensive review of the potential of game theory as a solution for sensor-based human activity recognition (HAR) challenges. Game theory is a mathematical framework that models interactions between multiple entities in various fields, including economics, political science, ...
Read More
This paper provides a comprehensive review of the potential of game theory as a solution for sensor-based human activity recognition (HAR) challenges. Game theory is a mathematical framework that models interactions between multiple entities in various fields, including economics, political science, and computer science. In recent years, game theory has been increasingly applied to machine learning challenges, including HAR, as a potential solution to improve recognition performance and efficiency of recognition algorithms. The review covers the shared challenges between HAR and machine learning, compares previous work on traditional approaches to HAR, and discusses the potential advantages of using game theory. It discusses different game theory approaches, including non-cooperative and cooperative games, and provides insights into how they can improve the HAR systems. The authors propose new game theory-based approaches and evaluate their effectiveness compared to traditional approaches. Overall, this review paper contributes to expanding the scope of research in HAR by introducing game-theoretic concepts and solutions to the field and provides valuable insights for researchers interested in applying game-theoretic approaches to HAR.
Oladosu Oladimeji; Olayanju Oladimeji
Abstract
Breast cancer is the second major cause of death and accounts for 16% of all cancer deaths worldwide. Most of the methods of detecting breast cancer are very expensive and difficult to interpret such as mammography. There are also limitations such as cumulative radiation exposure, over-diagnosis, false ...
Read More
Breast cancer is the second major cause of death and accounts for 16% of all cancer deaths worldwide. Most of the methods of detecting breast cancer are very expensive and difficult to interpret such as mammography. There are also limitations such as cumulative radiation exposure, over-diagnosis, false positives and negatives in women with a dense breast which pose certain uncertainties in high-risk population. The objective of this study is Detecting Breast Cancer Through Blood Analysis Data Using Classification Algorithms. This will serve as a complement to these expensive methods. High ranking features were extracted from the dataset. The KNN, SVM and J48 algorithms were used as the training platform to classify 116 instances. Furthermore, 10-fold cross validation and holdout procedures were used coupled with changing of random seed. The result showed that KNN algorithm has the highest and best accuracy of 89.99% and 85.21% for cross validation and holdout procedure respectively. This is followed by the J48 with 84.65% and 75.65% for the two procedures respectively. SVM had 77.58% and 68.69% respectively. Although it was also discovered that Blood Glucose level is a major determinant in detecting breast cancer, it has to be combined with other attributes to make decision as a result of other health issues like diabetes. With the result obtained, women are advised to do regular check-ups including blood analysis in order to know which of the blood components need to be worked on to prevent breast cancer based on the model generated in this study.