H.3. Artificial Intelligence
Sajjad Alizadeh Fard; Hossein Rahmani
Abstract
Fraud in financial data is a significant concern for both businesses and individuals. Credit card transactions involve numerous features, some of which may lack relevance for classifiers and could lead to overfitting. A pivotal step in the fraud detection process is feature selection, which profoundly impacts model accuracy and execution time. In this paper, we introduce an ensemble-based, explainable feature selection framework founded on SHAP and LIME algorithms, called "X-SHAoLIM". We applied our framework to diverse combinations of the best models from previous studies, conducting both quantitative and qualitative comparisons with other feature selection methods. The quantitative evaluation of the "X-SHAoLIM" framework across various model combinations revealed consistent accuracy improvements on average, including increases in Precision (+5.6), Recall (+1.5), F1-Score (+3.5), and AUC-PR (+6.75). Beyond enhanced accuracy, our proposed framework, leveraging explainable algorithms like SHAP and LIME, provides a deeper understanding of features' importance in model predictions, delivering effective explanations to system users.
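The abstract does not say how X-SHAoLIM combines the two explainers, so the following is only a minimal sketch of one plausible ensemble step: normalize per-feature importance scores from each explainer, average them, and keep the top-k features. The feature names, the score values, and the functions `normalize`, `combine_importances`, and `select_top_k` are hypothetical and are not taken from the paper or from the SHAP or LIME libraries.

```python
def normalize(scores):
    """Scale absolute importance scores so that they sum to 1."""
    total = sum(abs(v) for v in scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: abs(v) / total for name, v in scores.items()}


def combine_importances(shap_scores, lime_scores):
    """Average two normalized importance maps (assumes identical feature names)."""
    a = normalize(shap_scores)
    b = normalize(lime_scores)
    return {name: (a[name] + b[name]) / 2 for name in a}


def select_top_k(combined, k):
    """Return the k features with the highest combined importance."""
    ranked = sorted(combined, key=lambda name: (-combined[name], name))
    return ranked[:k]


# Hypothetical mean absolute attributions per feature (illustrative values only).
shap_scores = {"amount": 0.40, "hour": 0.10, "merchant_id": 0.30, "device": 0.20}
lime_scores = {"amount": 0.50, "hour": 0.30, "merchant_id": 0.15, "device": 0.05}

combined = combine_importances(shap_scores, lime_scores)
selected = select_top_k(combined, 2)
print(selected)
```

Averaging normalized scores is only one way to build such an ensemble; rank aggregation or voting would be equally reasonable choices under the same assumptions.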
Document and Text Processing
Mina Tabatabaei; Hossein Rahmani; Motahareh Nasiri
Abstract
The search for effective treatments for complex diseases, while minimizing toxicity and side effects, has become crucial. However, identifying synergistic combinations of drugs is often a time-consuming and expensive process, relying on trial and error due to the vast search space involved. To address this issue, we present a deep learning framework. Our framework utilizes a diverse set of features, including chemical structure, biomedical literature embedding, and biological network interaction data, to predict potential synergistic combinations. Additionally, we employ autoencoders and principal component analysis (PCA) for dimension reduction in sparse data. Through 10-fold cross-validation, we achieved an area under the curve (AUC) of 98%, surpassing seven previous state-of-the-art approaches by an average of 8%.
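For readers unfamiliar with the evaluation protocol, the k-fold split behind "10-fold cross-validation" can be sketched as follows. This is a generic illustration, not the authors' code: the function name `k_fold_indices` is hypothetical, and the sketch partitions sample indices in order without the shuffling or stratification a real experiment would likely use.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, test) index pairs of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# 25 hypothetical samples split into 10 folds.
folds = k_fold_indices(25, 10)
for train, test in folds:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so a metric such as AUC can be averaged over the ten held-out folds.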
Mohammad Nazari; Hossein Rahmani; Dadfar Momeni; Motahare Nasiri
Abstract
Graph representation of data can better define relationships among data components and thus provide better and richer analysis. So far, movies have been represented in graphs many times using different features for clustering, genre prediction, and even for use in recommender systems. In constructing movie graphs, little attention has been paid to textual features such as subtitles, even though they contain the entire content of the movie and hold a lot of hidden information. In this paper, we propose a method called MoGaL to construct a movie graph using LDA on subtitles. In this method, each node is a movie and each edge represents the novel relationship discovered by MoGaL between two associated movies. First, we extracted the important topics of the movies using LDA on their subtitles. Then, we visualized the relationship between the movies in a graph, using cosine similarity. Finally, we evaluated the proposed method with respect to two measures, genre homophily and genre entropy. MoGaL significantly outperforms the baseline method on these measures. Accordingly, our empirical results indicate that movie subtitles could be considered a rich source of information for various movie analysis tasks.
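The graph-construction step can be illustrated with a minimal sketch, assuming each movie is already summarized by an LDA topic distribution. The topic vectors, the similarity threshold of 0.9, and the function names below are hypothetical; the abstract does not state how MoGaL decides which movie pairs receive an edge.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_movie_graph(topic_vectors, threshold):
    """Connect every pair of movies whose topic similarity reaches the threshold."""
    edges = {}
    for a, b in combinations(sorted(topic_vectors), 2):
        similarity = cosine_similarity(topic_vectors[a], topic_vectors[b])
        if similarity >= threshold:
            edges[(a, b)] = similarity
    return edges


# Hypothetical topic distributions over three LDA topics.
topic_vectors = {
    "movie_a": [0.8, 0.1, 0.1],
    "movie_b": [0.7, 0.2, 0.1],
    "movie_c": [0.1, 0.1, 0.8],
}

edges = build_movie_graph(topic_vectors, threshold=0.9)
print(edges)
```

With these illustrative vectors, only the two movies dominated by the same topic end up connected; a lower threshold would yield a denser graph.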
M. Nasiri; H. Rahmani
Abstract
Determining the personality dimensions of individuals is very important in psychological research. The most well-known model of personality dimensions is the Five-Factor Model (FFM). There are two approaches to determining personality dimensions: manual and automatic. In the manual approach, psychologists discover these dimensions through personality questionnaires. In the automatic approach, various types of personal input (text, image, video) are gathered and analyzed for this purpose. In this paper, we propose a method called DENOVA (DEep learning based on the ANOVA), which predicts FFM using deep learning based on the analysis of variance (ANOVA) of words. For this purpose, DENOVA first applies ANOVA to select the most informative terms. Then, DENOVA employs Word2Vec to extract document embeddings. Finally, DENOVA uses Support Vector Machine (SVM), Logistic Regression, XGBoost, and Multilayer Perceptron (MLP) classifiers to predict FFM. The experimental results show that DENOVA outperforms the state-of-the-art methods in predicting FFM by 6.91% on average with respect to accuracy.
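The term-selection step can be sketched with a one-way ANOVA F-statistic computed per term, assuming term counts are grouped by the class label of the document they come from. This is a generic illustration rather than DENOVA's implementation: the terms, the counts, and the function name `anova_f` are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical per-document counts of two terms, split by a binary trait label.
term_counts = {
    "party": [[1, 2, 3], [4, 5, 6]],  # usage differs between the two classes
    "the": [[2, 3, 4], [3, 2, 4]],    # usage is the same in both classes
}

scores = {term: anova_f(groups) for term, groups in term_counts.items()}
top_terms = sorted(scores, key=lambda term: -scores[term])
print(scores, top_terms)
```

A term whose usage differs strongly between classes gets a high F-score and would be retained; a term used equally across classes scores near zero and would be dropped.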
H. Rahmani; H. Kamali; H. Shah-Hosseini
Abstract
Nowadays, a significant number of studies are devoted to discovering important nodes in graph data. Social networks, as graph data, have attracted a lot of attention. There are various purposes for discovering the important nodes in social networks, such as finding their leaders, i.e., the users who play an important role in promoting advertising. Different criteria have been proposed for discovering important nodes in graph data. Measuring a node’s importance by a single criterion may be ineffective due to the variety of graph structures. Recently, combinations of criteria have been used to discover important nodes. In this paper, we propose a system for the Discovery of Important Nodes in social networks using Genetic Algorithms (DINGA). In our proposed system, important nodes in social networks are discovered by employing a combination of eight informative criteria and their intelligent weighting. We compare our results with a manually weighted method that uses random weightings for each criterion, on four real networks. Our method shows an average improvement of 22% in the accuracy of important-node discovery.
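The idea of weighting several criteria with a genetic algorithm can be sketched as follows. This is a toy illustration under stated assumptions, not DINGA itself: it uses three hypothetical criteria instead of eight, made-up node scores, a hypothetical set of known important nodes as the fitness target, and simple averaging crossover with Gaussian mutation. All function names are illustrative.

```python
import random


def combined_score(criteria, weights):
    """Weighted sum of a node's criterion values."""
    return sum(w * c for w, c in zip(weights, criteria))


def top_k(node_criteria, weights, k):
    """The k nodes with the highest combined score."""
    key = lambda n: (-combined_score(node_criteria[n], weights), n)
    return sorted(node_criteria, key=key)[:k]


def fitness(weights, node_criteria, known_important):
    """Fraction of known important nodes recovered in the top-k ranking."""
    picked = top_k(node_criteria, weights, len(known_important))
    return len(set(picked) & known_important) / len(known_important)


def normalize(weights):
    """Clip negative weights to zero and rescale so the weights sum to 1."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in clipped]


def evolve_weights(node_criteria, known_important, n_criteria,
                   generations=30, pop_size=20, seed=0):
    """Search for criterion weights with a small elitist genetic algorithm."""
    rng = random.Random(seed)

    def score(w):
        return fitness(w, node_criteria, known_important)

    population = [normalize([rng.random() for _ in range(n_criteria)])
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[:pop_size // 2]  # the fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Averaging crossover followed by Gaussian mutation.
            child = [(a + b) / 2 + rng.gauss(0, 0.1) for a, b in zip(p1, p2)]
            children.append(normalize(child))
        population = parents + children
    return max(population, key=score)


# Hypothetical criterion values per node (for example, three centrality measures).
node_criteria = {
    "u1": [0.9, 0.2, 0.5],
    "u2": [0.8, 0.3, 0.4],
    "u3": [0.2, 0.9, 0.6],
    "u4": [0.1, 0.8, 0.7],
    "u5": [0.3, 0.1, 0.2],
}
known_important = {"u1", "u2"}

best = evolve_weights(node_criteria, known_important, n_criteria=3)
print(best, fitness(best, node_criteria, known_important))
```

Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.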