H.3. Artificial Intelligence
Sajjad Alizadeh Fard; Hossein Rahmani
Abstract
Fraud in financial data is a significant concern for both businesses and individuals. Credit card transactions involve numerous features, some of which may be irrelevant to classifiers and can lead to overfitting. Feature selection is therefore a pivotal step in fraud detection, with a strong effect on model accuracy and execution time. In this paper, we introduce "X-SHAoLIM", an ensemble-based, explainable feature selection framework built on the SHAP and LIME algorithms. We applied the framework to diverse combinations of the best models from previous studies and compared it quantitatively and qualitatively with other feature selection methods. Across the model combinations, the quantitative evaluation showed consistent average improvements in Precision (+5.6), Recall (+1.5), F1-Score (+3.5), and AUC-PR (+6.75). Beyond these gains, because the framework relies on explainable algorithms such as SHAP and LIME, it provides a deeper understanding of each feature's importance in model predictions and delivers effective explanations to system users.
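The abstract does not spell out how X-SHAoLIM combines the two explainers, so the following is only a minimal sketch of one plausible way to merge two sets of feature importances into a single selection. The function names, the feature names, and the equal-weight averaging rule are all assumptions made for illustration; the score dictionaries stand in for importances that SHAP and LIME would produce.

```python
from typing import Dict, List


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale absolute importances so they sum to 1 (all zeros stay zero)."""
    total = sum(abs(v) for v in scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: abs(v) / total for name, v in scores.items()}


def select_features(
    shap_scores: Dict[str, float],
    lime_scores: Dict[str, float],
    k: int,
) -> List[str]:
    """Return the k features with the highest averaged importance.

    Assumption: both explainers are weighted equally. This is an
    illustrative aggregation rule, not the rule used in the paper.
    """
    a = normalize(shap_scores)
    b = normalize(lime_scores)
    names = sorted(set(a) | set(b))
    combined = {n: 0.5 * (a.get(n, 0.0) + b.get(n, 0.0)) for n in names}
    # Sort by descending combined score, breaking ties by name.
    ranked = sorted(names, key=lambda n: (-combined[n], n))
    return ranked[:k]


# Hypothetical importances for three made-up transaction features.
shap_like = {"amount": 0.6, "hour": -0.3, "merchant_id": 0.1}
lime_like = {"amount": 0.5, "hour": 0.1, "merchant_id": 0.4}
selected = select_features(shap_like, lime_like, k=2)
```

Normalizing before averaging keeps one explainer from dominating simply because its scores are on a larger scale.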
H.3.7. Learning
Laleh Armi; Elham Abbasi
Abstract
In this paper, we propose an innovative method for tree bark classification and tree species identification. The method consists of two steps. In the first step, we take advantage of ILQP, a rotation-invariant, noise-resistant, and fully descriptive color texture feature extraction method. In the second step, we propose a new classification method called the stacked mixture of ELM-based experts with a trainable gating network (stacked MEETG). The method is evaluated on the Trunk12, BarkTex, and AFF datasets, where it provides better accuracy than other state-of-the-art methods, achieving average classification accuracies of 92.79% (Trunk12), 92.54% (BarkTex), and 91.68% (AFF). The results also demonstrate that ILQP extracts texture features better than similar methods such as ILTP, and that stacked MEETG has a substantial influence on classification accuracy.
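As a rough illustration of the general mixture-of-experts idea with a gating network, the sketch below mixes the class probabilities of several experts using softmax gate weights. It is not the stacked MEETG architecture: the experts here are fixed stand-in functions rather than trained ELMs, the gate is a plain linear layer with hand-picked weights, and no stacking is shown. All names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def mixture_predict(
    x: Sequence[float],
    experts: Sequence[Expert],
    gate_weights: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Combine expert class probabilities with input-dependent gates.

    gate_weights has one row per expert; each row is dotted with x to
    give that expert's gate logit (an assumed linear gating network).
    """
    gate_logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    gates = softmax(gate_logits)
    outputs = [expert(x) for expert in experts]
    n_classes = len(outputs[0])
    mixed = [
        sum(g * out[c] for g, out in zip(gates, outputs))
        for c in range(n_classes)
    ]
    return mixed.index(max(mixed)), mixed


# Two stand-in experts that always return fixed class probabilities.
def expert_a(x: Sequence[float]) -> List[float]:
    return [0.9, 0.1]


def expert_b(x: Sequence[float]) -> List[float]:
    return [0.2, 0.8]


# With equal gate logits the experts are averaged; a larger weight on
# the second row shifts the decision toward expert_b.
label_equal, probs_equal = mixture_predict([1.0], [expert_a, expert_b], [[0.0], [0.0]])
label_b, probs_b = mixture_predict([1.0], [expert_a, expert_b], [[0.0], [5.0]])
```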
H. Sarabi Sarvarani; F. Abdali-Mohammadi
Abstract
Bone age assessment is routinely used for investigating growth abnormalities, endocrine gland treatment, and pediatric syndromes. For several decades since the advent of digital imaging, bone age has been assessed by visually examining the ossification of the left hand, usually with the G&P reference method. However, the subjective nature of manual methods, the large number of ossification centers in the hand, and the large changes between ossification stages make bone age difficult to evaluate. Many efforts have therefore been made to develop image processing methods that automatically extract the main features of the bone formation stages in order to assess bone age more effectively and accurately. In this paper, a new fully automatic method is proposed to reduce the errors of subjective methods and to improve on existing automatic methods of age estimation. The method was applied to 1400 radiographs of healthy children from 0 to 18 years of age, gathered from 4 continents. It starts by extracting all regions of the hand (the five fingers and the wrist) and independently estimates the age of each region by examining the joints and growth regions associated with it using CNNs; it ends with a final age assessment through an ensemble of CNNs. The results indicate that the proposed method has an average assessment accuracy of 81% and performs better than the commercial system currently in use.
M. Salehi; J. Razmara; Sh. Lotfi
Abstract
Prediction of cancer survivability using machine learning techniques has become a popular approach in recent years. An important issue is that obtaining some features may require difficult and costly experiments, even though these features have little impact on the final decision and can be omitted from the feature set. Developing a machine for survivability prediction that ignores these features for simple cases while still yielding acceptable accuracy has therefore become a challenge for researchers. In this paper, we develop an ensemble multi-stage machine for survivability prediction that ignores difficult features for simple cases. In the first stage, the machine employs three base learners, namely a multilayer perceptron (MLP), a support vector machine (SVM), and a decision tree (DT), to predict survivability using simple features. If the learners agree on the output, the machine makes the final decision in the first stage. Otherwise, for difficult cases where the learners' outputs differ, the machine makes the decision in the second stage using an SVM over all features. The developed model was evaluated using the Surveillance, Epidemiology, and End Results (SEER) database. The experimental results show that the machine obtains considerable accuracy while ignoring difficult features for most of the input samples.
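The agree-or-escalate control flow described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the learners are passed in as plain callables (simple threshold rules stand in for the trained MLP, SVM, and DT), and `fetch_full` is a hypothetical callback that obtains the costly features only when the first stage disagrees.

```python
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[Sequence[float]], int]


def two_stage_predict(
    simple_features: Sequence[float],
    fetch_full: Callable[[], Sequence[float]],
    first_stage: Sequence[Predictor],
    second_stage: Predictor,
) -> Tuple[int, int]:
    """Return (label, stage_used).

    Stage 1: all base learners vote on the cheap features.
    Stage 2: only if they disagree, fetch all features and defer
    to the second-stage model.
    """
    votes = [clf(simple_features) for clf in first_stage]
    if len(set(votes)) == 1:
        return votes[0], 1
    return second_stage(fetch_full()), 2


def threshold_rule(t: float) -> Predictor:
    """Stand-in learner: predicts 1 when the first feature exceeds t."""
    return lambda x: int(x[0] > t)


# Record each time the costly features are actually requested.
fetch_calls: List[int] = []


def fetch_full() -> List[float]:
    fetch_calls.append(1)
    return [0.6, 0.2]


learners = [threshold_rule(t) for t in (0.3, 0.5, 0.7)]


def second_stage_model(full: Sequence[float]) -> int:
    # Stand-in for the second-stage SVM over all features.
    return int(sum(full) > 1.0)


easy = two_stage_predict([0.9], fetch_full, learners, second_stage_model)
calls_after_easy = len(fetch_calls)
hard = two_stage_predict([0.6], fetch_full, learners, second_stage_model)
```

Passing the full feature set as a callback rather than a value mirrors the motivation in the abstract: the expensive measurements are requested only for cases the first stage cannot settle.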
H.6.4. Clustering
M. Owhadi-Kareshki; M.R. Akbarzadeh-T.
Abstract
The growing scale of available data and increasingly strict concerns about privacy are among the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with consideration for data confidentiality: only the local cluster centers are negotiated in the consensus process, so no private data are transferred. With the proposed use of entropy as an internal measure of clustering validity at each machine, the cluster centers of local machines with higher expected clustering validity have more influence on the final consensus centers. We also employ the relative cost function of the local Fuzzy C-Means (FCM) as a measure of each machine's validity relative to the other machines, and the number of data points on each machine as a measure of its reliability. The utility of the proposed consensus strategy is examined on 18 datasets from the UCI repository in terms of clustering accuracy and speedup against the centralized version of FCM. Several experiments confirm that the proposed approach yields higher speedup and accuracy while maintaining data security through its protected, distributed processing.
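To make the idea of validity-weighted consensus concrete, the sketch below averages local cluster centers with weights derived from the entropy of each machine's fuzzy memberships and from its number of data points. The exact EC3 weighting is not given in the abstract, so the formula `(1 - normalized entropy) * n_points` is an assumption, as is the premise that clusters are already aligned across machines (index j refers to the same cluster everywhere). Only centers and membership summaries are combined; no raw data points appear.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def partition_entropy(memberships: Matrix) -> float:
    """Mean Shannon entropy of the membership rows, scaled to [0, 1].

    Assumes at least two clusters, so that log(c) is nonzero.
    """
    c = len(memberships[0])
    total = 0.0
    for row in memberships:
        total -= sum(u * math.log(u) for u in row if u > 0)
    return total / (len(memberships) * math.log(c))


def consensus_centers(
    local_centers: Sequence[Matrix],
    local_memberships: Sequence[Matrix],
) -> List[List[float]]:
    """Weighted average of aligned local centers.

    Assumed weight per machine: (1 - entropy) * number of points, so
    crisper partitions and larger machines count for more.
    """
    weights = [
        max(0.0, 1.0 - partition_entropy(m)) * len(m)
        for m in local_memberships
    ]
    weight_sum = sum(weights)
    if weight_sum == 0:
        raise ValueError("all machines have zero weight")
    n_clusters = len(local_centers[0])
    dim = len(local_centers[0][0])
    return [
        [
            sum(w * centers[j][d] for w, centers in zip(weights, local_centers))
            / weight_sum
            for d in range(dim)
        ]
        for j in range(n_clusters)
    ]


# Machine A has a crisp partition; machine B is maximally uncertain,
# so under the assumed weighting B contributes (almost) nothing.
centers_a = [[0.0, 0.0], [10.0, 10.0]]
centers_b = [[5.0, 5.0], [5.0, 5.0]]
memberships_a = [[1.0, 0.0], [0.0, 1.0]]
memberships_b = [[0.5, 0.5], [0.5, 0.5]]
consensus = consensus_centers([centers_a, centers_b], [memberships_a, memberships_b])
```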
H.6.3.1. Classifier design and evaluation
M. Moradi; J. Hamidzadeh
Abstract
Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering systems, used either to predict whether a user will prefer an item (the prediction problem) or to identify a set of k items that will be of interest to the user (the Top-k recommendation problem). Requiring sufficient ratings to make robust predictions and suggesting high-quality recommendations are two significant challenges in recommender systems. The latter remains far from satisfactory because human decisions are affected by environmental conditions and may change over time. In this paper, we introduce an innovative method to impute ratings for the missing entries of the rating matrix. We also design an ensemble-based method to obtain Top-k recommendations. To evaluate the performance of the proposed method, several experiments were conducted using 10-fold cross-validation over real-world data sets. Experimental results show that the proposed method is superior to state-of-the-art competing methods on the applied evaluation metrics.
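The two tasks named in this abstract, imputing missing ratings and producing a Top-k list, can be illustrated with a deliberately simple baseline. The paper's imputation and ensemble methods are not described here, so this sketch substitutes item-mean imputation and a plain sort; the function names and the small rating matrix are hypothetical.

```python
from typing import List, Optional, Sequence

Ratings = Sequence[Sequence[Optional[float]]]


def impute_with_item_means(ratings: Ratings) -> List[List[float]]:
    """Fill missing entries (None) with the item's mean rating.

    Items with no ratings at all fall back to the global mean.
    This is a stand-in baseline, not the paper's imputation method.
    """
    observed = [r for row in ratings for r in row if r is not None]
    global_mean = sum(observed) / len(observed)
    n_items = len(ratings[0])
    item_means = []
    for j in range(n_items):
        column = [row[j] for row in ratings if row[j] is not None]
        item_means.append(sum(column) / len(column) if column else global_mean)
    return [
        [item_means[j] if r is None else r for j, r in enumerate(row)]
        for row in ratings
    ]


def top_k(ratings: Ratings, user: int, k: int) -> List[int]:
    """Return indices of the k unrated items with the highest imputed rating."""
    filled = impute_with_item_means(ratings)
    candidates = [j for j, r in enumerate(ratings[user]) if r is None]
    # Highest imputed rating first; ties broken by item index.
    candidates.sort(key=lambda j: (-filled[user][j], j))
    return candidates[:k]


# Rows are users, columns are items, None marks a missing rating.
example = [
    [5, None, None, 1],
    [4, 2, 5, None],
    [None, 1, 4, 2],
]
recommended = top_k(example, user=0, k=2)
```

Only items the user has not yet rated are considered as candidates, which matches the usual reading of the Top-k recommendation problem.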