M. Nasiri; H. Rahmani
Abstract
Determining the personality dimensions of individuals is very important in psychological research. The best-known model of personality dimensions is the Five-Factor Model (FFM). There are two approaches to determining personality dimensions: manual and automatic. In the manual approach, psychologists discover these dimensions through personality questionnaires. In the automatic approach, various types of personal input (text/image/video) are gathered and analyzed for this purpose. In this paper, we propose a method called DENOVA (DEep learning based on the ANOVA), which predicts the FFM using deep learning based on the analysis of variance (ANOVA) of words. DENOVA first applies ANOVA to select the most informative terms. Then, it employs Word2Vec to extract document embeddings. Finally, it uses Support Vector Machine (SVM), logistic regression, XGBoost, and multilayer perceptron (MLP) classifiers to predict the FFM. The experimental results show that DENOVA outperforms the state-of-the-art methods in predicting the FFM by 6.91% on average with respect to accuracy.
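A minimal sketch of the described pipeline (ANOVA term selection, Word2Vec document embeddings, then a classical classifier), assuming scikit-learn and gensim; the hyperparameters and the mean-of-word-vectors document embedding are illustrative assumptions, not the authors' exact configuration:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from gensim.models import Word2Vec

def denova_features(texts, labels, k=1000, dim=100):
    # 1) ANOVA: keep the k terms whose counts best separate the FFM classes
    counts = CountVectorizer().fit(texts)
    X = counts.transform(texts)
    selected = SelectKBest(f_classif, k=k).fit(X, labels)
    keep = set(np.array(counts.get_feature_names_out())[selected.get_support()])
    # 2) Word2Vec on the filtered tokens; document vector = mean of word vectors
    docs = [[w for w in t.lower().split() if w in keep] for t in texts]
    w2v = Word2Vec(docs, vector_size=dim, min_count=1)
    return np.array([
        np.mean([w2v.wv[w] for w in d], axis=0) if d else np.zeros(dim)
        for d in docs
    ])

# 3) Any of the paper's classifiers (SVM shown) predicts the FFM trait label:
# clf = SVC().fit(denova_features(train_texts, train_labels), train_labels)
```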
C.3. Software Engineering
Saba Beiranvand; Mohammad Ali Zare Chahooki
Abstract
Software Cost Estimation (SCE) is one of the most widely used and effective activities in project management. In machine learning methods, some features have adverse effects on accuracy. Thus, preprocessing methods that remove non-effective features can improve the accuracy of these methods. In clustering techniques, samples are categorized into different clusters according to their semantic similarity. Accordingly, in the proposed study, to improve SCE accuracy, samples are first clustered based on the original features. Then, a feature selection (FS) technique is applied separately to each cluster. The proposed FS method is a combination of filter and wrapper FS methods, exploiting the advantages of both in selecting the effective features of each cluster, with less computational complexity and more accuracy. Furthermore, as the assessment criterion has a significant impact on wrapper methods, a fused criterion has also been used. The proposed method was applied to the Desharnais, COCOMO81, COCONASA93, Kemerer, and Albrecht datasets, and the obtained Mean Magnitude of Relative Error (MMRE) values for these datasets were 0.2173, 0.6489, 0.3129, 0.4898, and 0.4245, respectively. These results were compared with previous studies and showed an improvement in the SCE error rate.
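For reference, a short sketch of the reported error metric, Mean Magnitude of Relative Error (MMRE), over actual vs. estimated project efforts (the effort values below are placeholders):

```python
import numpy as np

def mmre(actual, estimated):
    # MMRE = mean of |actual - estimated| / actual over all projects
    actual, estimated = np.asarray(actual, float), np.asarray(estimated, float)
    return float(np.mean(np.abs(actual - estimated) / actual))

# e.g. mmre([400, 520], [430, 480]) -> mean of 0.075 and ~0.0769, i.e. ~0.076
```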
Document and Text Processing
Zobeir Raisi; Vali Mohammad Nazarzehi
Abstract
The Persian language presents unique challenges for scene text recognition due to its distinctive script. Despite advancements in AI, recognition in non-Latin scripts like Persian still faces difficulties. In this paper, we extend the vanilla transformer architecture to recognize arbitrary shapes of Persian text instances. We apply Contextual Position Encoding (CPE) to the baseline transformer architecture to improve the recognition of Persian script in wild images, especially for oriented and spaced characters. The CPE utilizes position information to generate contrastive data pairs that help to better capture Persian characters written in different directions. Moreover, we evaluate several state-of-the-art deep learning models on our prepared, challenging Persian scene text recognition dataset and develop a transformer-based architecture to enhance recognition accuracy. Our proposed scene text recognition architecture achieves superior word recognition accuracy compared to existing methods on a real-world Persian text dataset.
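For context, this is the vanilla transformer's fixed sinusoidal position encoding, i.e. the baseline component that CPE replaces or extends; the CPE mechanism itself is not reproduced here, and this sketch only shows what "position encoding" refers to:

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    # PE(pos, 2i) = sin(pos / 10000^(2i/d)), PE(pos, 2i+1) = cos(...)
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))
    # shape (seq_len, d_model); added to the token embeddings
```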
R. Mohammadian; M. Mahlouji; A. Shahidinejad
Abstract
Multi-view face detection in open environments is a challenging task due to the wide variations in illumination, face appearance, and occlusion. In this paper, a robust method for multi-view face detection in open environments, using a combination of Gabor features and neural networks, is presented. Firstly, the effect of changing the Gabor filter parameters (orientation, frequency, standard deviation, aspect ratio, and phase offset) on an image is analysed; secondly, the range of Gabor filter parameter values is determined; and finally, the best values for these parameters are specified. A multilayer feedforward neural network with a back-propagation algorithm is used as the classifier. The input vector is obtained by convolving the input image with a Gabor filter whose angle and frequency values are both equal to π/2. The proposed algorithm is tested on 1,484 image samples with simple and complex backgrounds. The experimental results show that the proposed detector achieves high detection accuracy compared with several popular face-detection algorithms, such as OpenCV's Viola-Jones detector.
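A hedged sketch of the Gabor feature stage in OpenCV: build a Gabor kernel and convolve it with the input image. The paper fixes both the angle and the frequency at π/2; the remaining values here (kernel size, sigma, gamma, psi, the frequency-to-wavelength reading, and the file name) are illustrative assumptions:

```python
import cv2
import numpy as np

theta = np.pi / 2                 # orientation fixed at pi/2
lambd = 2 * np.pi / (np.pi / 2)   # wavelength for frequency pi/2 (one reading)
kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                            lambd=lambd, gamma=0.5, psi=0.0)

img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)      # hypothetical input
gabor_response = cv2.filter2D(img, cv2.CV_32F, kernel)  # source of the NN input vector
```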
A.H. Damia; M. Esnaashari; M.R. Parvizimosaed
Abstract
In structural software testing, test data generation is essential. Generating test data is a search problem, and search algorithms can be used to solve it. The genetic algorithm is one of the most widely used algorithms in this field, and adjusting its parameters helps to increase its effectiveness. In this paper, an Adaptive Genetic Algorithm (AGA) is used to maintain population diversity while generating test data based on the path coverage criterion; it computes the recombination and mutation rates from the similarity between chromosomes and their fitness during each iteration of the algorithm. Experiments show that this method generates test data faster than other versions of the genetic algorithm.
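A hedged sketch of the adaptive idea: crossover and mutation rates derived from population similarity and relative fitness so that diversity is preserved. The exact formulas below are illustrative assumptions, not the paper's published update rules:

```python
import numpy as np

def adaptive_rates(fitness, similarity, pc_max=0.9, pm_max=0.1):
    # fitness: per-chromosome fitness; similarity: population similarity in
    # [0, 1] (1 = chromosomes nearly identical). High similarity raises
    # mutation to restore diversity; fitter chromosomes get gentler operators.
    fitness = np.asarray(fitness, float)
    f_norm = (fitness - fitness.min()) / (np.ptp(fitness) + 1e-12)
    pc = pc_max * (1.0 - 0.5 * similarity) * (1.0 - 0.5 * f_norm)
    pm = pm_max * (0.5 + 0.5 * similarity) * (1.0 - 0.5 * f_norm)
    return pc, pm
```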
H.3.8. Natural Language Processing
Nura Esfandiari; Kourosh Kiani; Razieh Rastgoo
Abstract
Chatbots are computer programs designed to simulate human conversation. Powered by artificial intelligence (AI), chatbots are increasingly used to provide customer service, particularly chatbots built on large language models (LLMs). A process known as fine-tuning is employed to personalize LLM answers, but it demands substantial high-quality data and computational resources. In this article, an innovative hybrid approach is proposed to overcome the computational hurdles associated with fine-tuning LLMs. This approach aims to enhance the answers generated by LLMs, specifically for Persian chatbots used in mobile customer services. A transformer-based evaluation model was developed to score generated answers and select the most appropriate one. Additionally, a Persian-language dataset tailored to the domain of mobile sales was collected to support the personalization of the Persian chatbot and the training of the evaluation model. This approach is expected to foster increased customer interaction and boost sales within the Persian mobile phone market. Experiments conducted on four different LLMs demonstrated the effectiveness of the proposed approach in generating more relevant and semantically accurate answers for users.
H. Gholamalinejad; H. Khosravi
Abstract
Optimizers are vital components of deep neural networks that perform weight updates. This paper introduces a new updating method for gradient-descent-based optimizers, called whitened gradient descent (WGD). The method is easy to implement, can be used with any optimizer based on the gradient descent algorithm, and does not significantly increase the training time of the network. It smooths the training curve and improves classification metrics. To evaluate the proposed algorithm, we performed 48 different tests on two datasets, CIFAR-100 and Animals-10, using three network structures: DenseNet121, ResNet18, and ResNet50. The experiments show that using the WGD method in gradient-descent-based optimizers improves the classification results significantly. For example, integrating WGD into the RAdam optimizer increased the accuracy of DenseNet from 87.69% to 90.02% on the Animals-10 dataset.
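A hedged sketch of one plausible reading of gradient "whitening": standardize each parameter's gradient tensor to zero mean and unit variance before the base optimizer's update. This is an assumption about the mechanism, not the paper's published formula:

```python
import torch

def whiten_gradients(model, eps=1e-8):
    # Standardize every gradient tensor in place (assumed whitening step)
    for p in model.parameters():
        if p.grad is not None:
            g = p.grad
            p.grad = (g - g.mean()) / (g.std() + eps)

# Usage inside a training step, with any gradient-descent-based optimizer:
# loss.backward(); whiten_gradients(model); optimizer.step()
```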
A. Omondi; I.A. Lukandu; G.W. Wanyembi
Abstract
Redundant and irrelevant features in high-dimensional data increase the complexity of the underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach that uses lightweight random simulations to balance the exploitation of relevant features against the exploration of features that have the potential to be relevant. In doing so, the study evaluated how effective manipulating the search component of feature selection is at achieving high accuracy with reduced dimensions. A control-group experimental design was used to observe factual evidence. The context of the experiment was the high-dimensional data encountered in performance tuning of complex database systems. The Wilcoxon signed-rank test at the .05 level of significance was used to compare repeated classification accuracy measurements on the independent experiment and control group samples. Encouraging results with a p-value < 0.05 were recorded and provided evidence to reject the null hypothesis in favour of the alternative hypothesis, which states that meta-heuristic search approaches are effective in achieving high accuracy with reduced dimensions, depending on the outcome variable under investigation.
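A sketch of the reported significance test: a Wilcoxon signed-rank test at the .05 level on paired accuracy measurements from the experiment and control samples. The accuracy values below are placeholders, not the study's data:

```python
from scipy.stats import wilcoxon

experiment = [0.91, 0.88, 0.93, 0.90, 0.92, 0.89]  # reduced-dimension runs
control    = [0.85, 0.84, 0.88, 0.86, 0.87, 0.83]  # all-features runs

stat, p = wilcoxon(experiment, control)  # paired, non-parametric test
if p < 0.05:
    print(f"p={p:.4f}: reject the null hypothesis at the .05 level")
```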
I.3.6. Electronics
Samira Mavaddati; Mohammad Razavi
Abstract
Rice is one of the most important staple crops in the world and provides millions of people with a significant source of food and income. Problems related to rice classification and quality detection can significantly impact the profitability and sustainability of rice cultivation, which is why the importance of solving them cannot be overstated. Improving classification and quality-detection techniques can help ensure the safety and quality of rice crops and increase the productivity and profitability of rice cultivation. However, such techniques are often limited in their ability to accurately classify rice grains due to factors such as lighting conditions, background, and image quality. To overcome these limitations, a deep-learning-based classification algorithm is introduced in this paper that combines convolutional neural network (CNN) and long short-term memory (LSTM) networks to better represent the structural content of different types of rice grains. This hybrid model, called CNN-LSTM, combines the benefits of both networks to enable more effective and accurate classification. Three scenarios are demonstrated in this paper: a CNN, a CNN combined with a transfer learning technique, and the CNN-LSTM deep model. The performance of these scenarios is compared with other deep learning models and dictionary-learning-based classifiers. The experimental results demonstrate that the proposed algorithm detects different rice varieties with an impressive accuracy of over 99.85% and identifies the quality of varying combinations of rice varieties with an average accuracy of 99.18%.
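A hedged sketch of a CNN-LSTM hybrid of the kind described: a small CNN extracts a spatial feature map, which is read as a sequence by an LSTM, and a linear head classifies the variety. Layer sizes, sequence construction, and the class count are illustrative assumptions:

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):                                 # x: (B, 3, H, W)
        f = self.cnn(x)                                   # (B, 64, H/4, W/4)
        B, C, H, W = f.shape
        seq = f.permute(0, 2, 3, 1).reshape(B, H * W, C)  # spatial positions as a sequence
        _, (h, _) = self.lstm(seq)
        return self.head(h[-1])                           # class logits

# logits = CNNLSTM()(torch.randn(8, 3, 64, 64))  -> shape (8, 5)
```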
E. Pejhan; M. Ghasemzadeh
Abstract
This research is related to the development of technology in the field of automatic text-to-image generation. In this regard, two main goals are pursued: first, the generated image should look as real as possible; second, the generated image should be a meaningful depiction of the input text. Our proposed method is a Multi-Sentence Hierarchical GAN (MSH-GAN) for text-to-image generation. In this research project, we have considered two main strategies: 1) produce a higher-quality image in the first step, and 2) use two additional descriptions to improve the original image in the next steps. Our goal is to use more information to generate images with higher resolution by using more than one sentence of input text. We have proposed different models based on GANs and Memory Networks. We have also used a more challenging dataset called ids-ade; this is the first time this dataset has been used in this area. We have evaluated our models based on the IS, FID, and R-precision evaluation metrics. Experimental results demonstrate that our best model performs favorably against basic state-of-the-art approaches such as StackGAN and AttGAN.
H.3. Artificial Intelligence
Saiful Bukhori; Muhammad Almas Bariiqy; Windi Eka Y. R; Januar Adi Putra
Abstract
Breast cancer is a disease of abnormal cell proliferation in the breast tissue. One method for diagnosing and screening breast cancer is mammography. However, mammography images are limited by low contrast, high noise, and non-coherence. This research segments breast cancer images derived from ultrasonography (USG) photos with a Convolutional Neural Network (CNN) using the U-Net architecture. Testing the CNN model with the U-Net architecture yields the highest Mean Intersection over Union (Mean IoU), 77%, in the data scenario with a 70:30 ratio, 100 epochs, and a learning rate of 5x10^-5, while the lowest Mean IoU, 64.4%, occurs in the data scenario with a 90:10 ratio, 50 epochs, and a learning rate of 1x10^-4.
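For reference, a short sketch of the reported segmentation metric, Mean Intersection over Union between a predicted mask and the ground truth, averaged over classes (binary foreground/background assumed here):

```python
import numpy as np

def mean_iou(pred, target, n_classes=2, eps=1e-7):
    # pred, target: integer class masks of the same shape
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append((inter + eps) / (union + eps))
    return float(np.mean(ious))
```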
H. Sarabi Sarvarani; F. Abdali-Mohammadi
Abstract
Bone age assessment is a method that is constantly used for investigating growth abnormalities, endocrine gland treatment, and pediatric syndromes. Since the advent of digital imaging, bone age assessment has for several decades been performed by visually examining the ossification of the left hand, usually using the G&P reference method. However, the subjective nature of manual methods, the large number of ossification centers in the hand, and the large changes across ossification stages make evaluating bone age difficult. Therefore, many efforts have been made to develop image processing methods that automatically extract the main features of the bone formation stages to assess bone age more effectively and accurately. In this paper, a new fully automatic method is proposed to reduce the errors of subjective methods and improve automatic age estimation. The model was applied to 1400 radiographs of healthy children from 0 to 18 years of age, gathered from four continents. The method starts by extracting all regions of the hand, namely the five fingers and the wrist, and independently estimates the age of each region with CNNs that examine the joints and growth regions associated with it; it ends with a final age assessment through an ensemble of CNNs. The results indicate that the proposed method has an average assessment accuracy of 81% and performs better than the commercial system currently in use.
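A hedged sketch of the final stage: per-region age predictions (five fingers and the wrist, each from its own CNN) fused into a single estimate. The averaging fusion rule shown is an illustrative assumption about how the ensemble combines its members:

```python
import numpy as np

def fuse_region_ages(region_predictions, weights=None):
    # region_predictions: ages (in years) from the six region-specific CNNs;
    # optional weights could reflect per-region reliability.
    return float(np.average(region_predictions, weights=weights))

# fuse_region_ages([10.2, 9.8, 10.5, 10.1, 9.9, 10.4]) -> 10.15
```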
H. Khodadadi; V. Derhami
Abstract
A prominent weakness of dynamic programming methods is that, in every updating phase, they perform operations throughout the entire set of states in a Markov decision process. This paper proposes a novel chaos-based method to solve the problem. For this purpose, a chaotic system is first initialized, and the resulting numbers are mapped onto the environment states through initial processing. In each traverse of the policy iteration method, policy evaluation is performed only once, and only a few states, proposed by the chaotic system, are updated. In this method, the policy evaluation and improvement cycle lasts until an optimal policy is formulated for the environment. The same procedure is followed in the value iteration method: only the values of the few states proposed by the chaotic system are updated in each traverse, whereas the values of the other states are left unchanged. Unlike conventional methods, the proposed method obtains an optimal solution by updating only a limited number of states, which chaos distributes properly all over the environment. The test results indicate the improved speed and efficiency of chaotic dynamic programming methods in obtaining the optimal solution in different grid environments.
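A hedged sketch of the state-selection idea: a chaotic map generates a sequence that is mapped onto state indices, and only those states are updated in each sweep of value iteration. The logistic map, its constants, and the reward/transition shapes are illustrative assumptions:

```python
import numpy as np

def chaotic_states(n_states, n_updates, x=0.731, r=4.0):
    # Logistic map x <- r*x*(1-x) stays in (0, 1) for r = 4
    states = []
    for _ in range(n_updates):
        x = r * x * (1.0 - x)
        states.append(int(x * n_states))   # map chaos value onto a state index
    return states, x

def partial_value_iteration(V, R, P, gamma, n_updates=10, x=0.731):
    # V: (S,) values, R: (S,) rewards, P: (A, S, S) transition matrices
    idx, x = chaotic_states(len(V), n_updates, x)
    for s in idx:                          # update only chaos-proposed states
        V[s] = R[s] + gamma * max(P[a][s] @ V for a in range(P.shape[0]))
    return V, x                            # x carries the chaos state forward
```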
H.3. Artificial Intelligence
Akram Pasandideh; Mohsen Jahanshahi
Abstract
Link prediction (LP) has become a hot topic in the data mining, machine learning, and deep learning communities. This study applies bibliometric analysis to establish the current status of LP studies and investigate them from different perspectives. It provides a Scopus-based bibliometric overview of the LP research landscape since 1987, when LP studies were first published. Various kinds of analysis, including document, subject, and country distributions, are applied. Moreover, author productivity, citation analysis, and keyword analysis are used, and Bradford's law is applied to discover the main journals in the field. Most documents were published by conferences, and the majority of LP documents belong to the computer science and mathematics fields. So far, China has been at the forefront of publishing countries. In addition, the most active sources of LP publications are Lecture Notes in Computer Science (including its subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) and IEEE Access. The keyword analysis demonstrates that while social networks attracted attention in the early period, knowledge graphs have attracted more attention recently. Since the LP problem has recently been approached using machine learning (ML), the current study may encourage researchers to concentrate on ML techniques. This is the first bibliometric study of the "link prediction" literature and provides a broad landscape of the field.
H.3. Artificial Intelligence
Mohamad Mahdi Yadegar; Hossein Rahmani
Abstract
In recent years, new technologies have brought innovations to the financial and commercial world, giving fraudsters many ways to commit fraud and impose large costs on companies. Using advanced technologies, we can build systems that detect fraudulent patterns and prevent future incidents. Machine learning algorithms are increasingly used for fraud detection in financial data, but a common challenge is dataset imbalance, which hinders traditional machine learning methods; finding the best approach to these imbalanced datasets is a problem many researchers face. In this paper, we propose a method called FinFD-GCN that uses Graph Convolutional Networks (GCNs) for fraud detection in credit card transaction datasets. FinFD-GCN represents transactions as a graph in which each node represents a transaction and each edge represents the similarity between transactions. Using this graph representation, FinFD-GCN can capture complex relationships and anomalies that may be overlooked by, or even impossible to detect with, conventional approaches, thus enhancing the accuracy and robustness of fraud detection in financial data. We use common evaluation metrics and confusion matrices to evaluate the proposed method. FinFD-GCN achieves significant improvements in recall and AUC compared to traditional methods such as logistic regression, support vector machines, and random forests, making it a robust solution for credit card fraud detection; it outperformed the base models by 5% and 10% with respect to F1 and AUC, respectively.
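A hedged sketch of the transaction-graph idea: a minimal graph convolution (symmetrically normalized adjacency aggregation) over transaction nodes, followed by a fraud/legitimate head. Layer sizes and the similarity-edge construction are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, X, A):
        # A: (N, N) similarity adjacency with self-loops already added;
        # aggregation uses the symmetric normalization D^-1/2 A D^-1/2
        deg = A.sum(1)
        D_inv_sqrt = torch.diag(deg.clamp(min=1).pow(-0.5))
        return torch.relu(self.lin(D_inv_sqrt @ A @ D_inv_sqrt @ X))

class FraudGCN(nn.Module):
    def __init__(self, d_feat, d_hid=64):
        super().__init__()
        self.g1, self.g2 = GCNLayer(d_feat, d_hid), GCNLayer(d_hid, d_hid)
        self.head = nn.Linear(d_hid, 2)      # fraud vs. legitimate logits

    def forward(self, X, A):                 # X: (N, d_feat) transaction features
        return self.head(self.g2(self.g1(X, A), A))
```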
Z. Anari; A. Hatamlou; B. Anari; M. Masdari
Abstract
Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page, one type of web data, can be regarded as a trapezoidal membership function (TMF) and used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on the membership functions, and since the membership functions of each web page differ from those of other web pages, automatically finding the number and positions of TMFs is significant. In this paper, a different reinforcement-based optimization approach called LA-OMF is proposed to find both the number and the positions of TMFs for fuzzy association rules. In the proposed algorithm, the centers and spreads of the TMFs are treated as parameters of the search space, and a new representation using learning automata (LA) is proposed to optimize these parameters. The performance of the proposed approach was evaluated, and the results were compared with those of other algorithms on a real dataset. Experiments on datasets of different sizes confirmed that the proposed LA-OMF improves the efficiency of mining fuzzy association rules by extracting optimized membership functions.
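For reference, a sketch of the object being tuned: a trapezoidal membership function defined by parameters a ≤ b ≤ c ≤ d, with full membership on [b, c] and linear slopes on the sides (the browsing-time example values are hypothetical):

```python
import numpy as np

def tmf(x, a, b, c, d):
    # Trapezoid: 0 outside [a, d], 1 on [b, c], linear in between
    x = np.asarray(x, float)
    left = np.clip((x - a) / max(b - a, 1e-12), 0, 1)
    right = np.clip((d - x) / max(d - c, 1e-12), 0, 1)
    return np.minimum(left, right)

# e.g. membership of page browsing times (seconds) in a "medium stay" set:
# tmf([5, 20, 45, 90], a=10, b=30, c=60, d=80) -> [0., 0.5, 1., 0.]
```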
Seyyed A. Hoseini; P. Kabiri
Abstract
When a camera moves in an unfamiliar environment, it is desirable for many computer vision and robotic applications to estimate the camera position and orientation. Camera tracking is perhaps the most challenging part of Visual Simultaneous Localization and Mapping (Visual SLAM) and Augmented Reality problems. This paper proposes a feature-based approach for tracking a hand-held camera that moves within an indoor place with a maximum depth of around 4-5 meters. In the first few frames, the camera observes a chessboard as a marker to bootstrap the system and construct the initial map. Thereafter, upon the arrival of each new frame, the algorithm pursues the camera tracking procedure. This procedure is carried out in a framework that operates using only the extracted visible natural feature points and the initial map. The constructed initial map is extended as the camera explores new areas. In addition, the proposed system employs a hierarchical method on the basis of the Lucas-Kanade registration technique to track FAST features. For each incoming frame, 6-DOF camera pose parameters are estimated using an Unscented Kalman Filter (UKF). The proposed algorithm is tested on real-world videos, and the performance of the UKF is compared against other camera tracking methods. Two evaluation criteria (i.e., relative pose error and absolute trajectory error) are used to assess the performance of the proposed algorithm. The reported experimental results show the accuracy and effectiveness of the presented approach. The conducted experiments also indicate that the type of extracted feature points has no significant effect on the precision of the proposed approach.
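A sketch of the feature-tracking stage in OpenCV: FAST corners tracked frame-to-frame with pyramidal (hierarchical) Lucas-Kanade optical flow, feeding 2D point tracks to the pose estimator. Window size, pyramid depth, and the frame variables are illustrative assumptions:

```python
import cv2
import numpy as np

# prev_gray, next_gray: consecutive grayscale frames (assumed already loaded)
fast = cv2.FastFeatureDetector_create(threshold=25)
kps = fast.detect(prev_gray, None)
pts = np.float32([kp.pt for kp in kps]).reshape(-1, 1, 2)

next_pts, status, err = cv2.calcOpticalFlowPyrLK(
    prev_gray, next_gray, pts, None,
    winSize=(21, 21), maxLevel=3)            # 3-level pyramid = hierarchical LK
tracked = next_pts[status.ravel() == 1]      # surviving tracks go to the UKF
```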
Y. Sharafi; M. Teshnelab; M. Ahmadieh Khanesar
Abstract
A new multi-objective evolutionary optimization algorithm based on the competitive optimization algorithm (COOA) is presented to solve multi-objective optimization problems (MOPs). The competitive optimization algorithm is inspired by the natural competition among animals such as birds, cats, bees, and ants. The main contributions of the present study are as follows: First, a novel method is presented to prune the external archive while keeping the diversity of the Pareto front (PF). Second, a hybrid of powerful mechanisms, opposition-based learning and chaotic maps, is used to maintain diversity in the search space of the initial population. Third, a novel method is provided to transform a multi-objective optimization problem into a single-objective one. The simulation results of the proposed algorithm were compared with those of several well-known optimization algorithms. The comparisons show that the proposed approach is a strong candidate for solving MOPs.
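For reference, a sketch of the Pareto bookkeeping such an external archive relies on: a dominance test and a filter that keeps only non-dominated solutions (minimization assumed); the archive-pruning heuristic itself is not reproduced:

```python
import numpy as np

def dominates(u, v):
    # u dominates v iff u is no worse in all objectives and better in at least one
    return bool(np.all(u <= v) and np.any(u < v))

def non_dominated(objectives):
    objectives = np.asarray(objectives, float)  # shape (n_solutions, n_objectives)
    keep = [i for i, u in enumerate(objectives)
            if not any(dominates(v, u)
                       for j, v in enumerate(objectives) if j != i)]
    return objectives[keep]
```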
H.5.7. Segmentation
Mohsen Erfani Haji Pour
Abstract
The segmentation of noisy images remains one of the primary challenges in image processing. Traditional fuzzy clustering algorithms often exhibit poor performance in the presence of high-density noise due to insufficient consideration of spatial features. In this paper, a novel approach is proposed that leverages both local and non-local spatial information, utilizing a Gaussian kernel to counteract high-density noise. This method enhances the algorithm's sensitivity to spatial relationships between pixels, thereby reducing the impact of noise. Additionally, a C+ means initialization approach is introduced to improve performance and reduce sensitivity to initial conditions, along with an automatic smoothing parameter tuning method. The evaluation results, based on the criteria of fuzzy assignment coefficient, fuzzy segmentation entropy, and segmentation accuracy, demonstrate a significant improvement in the performance of the proposed method.
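For context, a sketch of the fuzzy-clustering core the method builds on: one fuzzy c-means update of memberships and centers. The paper's local/non-local spatial terms and Gaussian kernel extend this step and are not reproduced here:

```python
import numpy as np

def fcm_step(X, centers, m=2.0, eps=1e-12):
    # X: (N, D) pixels/features, centers: (C, D); m is the fuzzifier
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    # membership u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
    U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
    Um = U ** m
    centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # weighted mean per cluster
    return U, centers
```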
F. Jafarinejad
Abstract
In recent years, new word embedding methods have clearly improved the accuracy of NLP tasks. A review of the progress of these methods shows that the complexity of the models and the number of their training parameters keep growing, so there is a need for methodological innovation in word embedding. Most current word embedding methods use a large corpus of unstructured data to train the semantic vectors of words. This paper pursues the basic idea of utilizing the structure of structured data to derive embedding vectors, so that the need for high processing power, a large amount of memory, and long processing time is avoided by exploiting these structures and the conceptual knowledge that lies in them. For this purpose, a new embedding method, Word2Node, is proposed. It uses a well-known structured resource, WordNet, as its training corpus, on the hypothesis that the graph structure of WordNet contains valuable linguistic knowledge that should not be ignored and can yield cost-effective, small-sized embedding vectors. The Node2Vec graph embedding method allows us to benefit from this powerful linguistic resource. Evaluation of this idea on two tasks, word similarity and text classification, has shown that this method performs the same as or better than the word embedding method embedded in it (Word2Vec), while the required training data is reduced by about 50,000,000%. These results provide a view of the capacity of structured data to improve the quality of existing embedding methods and the resulting vectors.
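A hedged sketch of the Word2Node idea: build a graph from WordNet relations (via NLTK) and embed its nodes with node2vec. The community `node2vec` package, the single relation type, and the walk settings are tooling assumptions for illustration, not the paper's exact setup:

```python
import networkx as nx
from nltk.corpus import wordnet as wn   # requires nltk.download("wordnet")
from node2vec import Node2Vec          # pip install node2vec

# Build the WordNet graph from one relation type (hypernymy), for brevity
G = nx.Graph()
for syn in wn.all_synsets():
    for hyper in syn.hypernyms():
        G.add_edge(syn.name(), hyper.name())

# Node2Vec random walks + skip-gram; fit() returns a gensim Word2Vec model
model = Node2Vec(G, dimensions=100, walk_length=30, num_walks=10).fit()
vec = model.wv["dog.n.01"]             # embedding vector for a synset node
```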
H.3. Artificial Intelligence
Ali Zahmatkesh Zakariaee; Hossein Sadr; Mohamad Reza Yamaghani
Abstract
Machine learning (ML) is a popular tool in healthcare, as it can help analyze large amounts of patient data, such as medical records, predict diseases, and identify early signs of cancer. Gastric cancer starts in the cells lining the stomach and is the fifth most common cancer worldwide. Therefore, predicting patients' survival, checking their health status, and detecting their risk of gastric cancer in the early stages can be very beneficial. With the help of machine learning methods, this is possible without any invasive procedures, which can be useful for both patients and physicians in making informed decisions. Accordingly, a new hybrid machine-learning-based method for detecting the risk of gastric cancer is proposed in this paper. The proposed model is compared with traditional methods, and the empirical results show that it outperforms existing methods with an accuracy of 98%. They also suggest that gastric cancer can be one of the most important consequences of H. pylori infection, and that lifestyle and dietary factors can heighten the risk of gastric cancer, especially among individuals who frequently consume fried foods and suffer from chronic atrophic gastritis and stomach ulcers. This risk is further exacerbated in individuals with limited fruit and vegetable intake and high salt consumption.
I.3.7. Engineering
Elahe Moradi
Abstract
Thyroid disease is common worldwide, and early diagnosis plays an important role in effective treatment and management. Utilizing machine learning techniques is vital in thyroid disease diagnosis. This research proposes tree-based machine learning algorithms with hyperparameter optimization techniques to predict thyroid disease. The thyroid disease dataset from the UCI Repository is used as the benchmark to evaluate the performance of the proposed algorithms. After the data preprocessing and normalization steps, the data were balanced using the random oversampling (ROS) technique. Also, two methods, grid search (GS) and random search (RS), were employed to optimize the hyperparameters. Finally, using Python, various criteria were applied to evaluate the performance of the proposed algorithms: decision tree, random forest, AdaBoost, and extreme gradient boosting. The simulation results indicate that the Extreme Gradient Boosting (XGB) algorithm with the grid search method outperforms all the other algorithms, obtaining an impressive accuracy, AUC, sensitivity, precision, and MCC of 99.39%, 99.97%, 98.85%, 99.40%, and 98.79%, respectively. These results demonstrate the potential of the proposed method for accurately predicting thyroid disease.
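A sketch of the reported pipeline for the best-performing configuration: random oversampling to balance the classes, then a grid search over XGBoost hyperparameters. The grid values and the `X_train`/`y_train` variables are illustrative assumptions:

```python
from imblearn.over_sampling import RandomOverSampler
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# X_train, y_train: preprocessed, normalized UCI thyroid features and labels
X_bal, y_bal = RandomOverSampler(random_state=42).fit_resample(X_train, y_train)

grid = GridSearchCV(
    XGBClassifier(eval_metric="logloss"),
    param_grid={"n_estimators": [100, 300],
                "max_depth": [3, 5, 7],
                "learning_rate": [0.05, 0.1]},
    scoring="accuracy", cv=5)
grid.fit(X_bal, y_bal)
print(grid.best_params_, grid.best_score_)
```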
J. Tayyebi; E. Hosseinzadeh
Abstract
The fuzzy c-means clustering algorithm is a useful tool for clustering, but it is convenient only for crisp, complete data. In this article, an enhancement of the algorithm is proposed that is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is presented to cluster incomplete fuzzy data. The method substitutes each missing attribute with a trapezoidal fuzzy number determined from the corresponding attribute of the q nearest neighbors. Comparisons and analysis of the experimental results demonstrate the capability of the proposed method.
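A hedged sketch of such a ranking-based distance: a linear ranking function R maps a trapezoidal fuzzy number (a, b, c, d) to a scalar, and distances between fuzzy data are taken between ranks. The specific R shown (mean of the four parameters) is one common linear choice, assumed here for illustration rather than taken from the article:

```python
import numpy as np

def rank(t):
    # t = (a, b, c, d) with a <= b <= c <= d; a simple linear ranking function
    a, b, c, d = t
    return (a + b + c + d) / 4.0

def fuzzy_distance(x, y):
    # x, y: samples given as tuples of trapezoidal fuzzy attributes
    return float(np.sqrt(sum((rank(xi) - rank(yi)) ** 2
                             for xi, yi in zip(x, y))))

# fuzzy_distance([(1, 2, 3, 4)], [(2, 3, 4, 5)]) -> 1.0
```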
E. Feli; R. Hosseini; S. Yazdani
Abstract
In Vitro Fertilization (IVF) is one of the scientifically established methods of infertility treatment. This study aimed at improving the performance of predicting IVF success using machine learning optimized through evolutionary algorithms. A Multilayer Perceptron Neural Network (MLP) was proposed to classify the infertility dataset, and a genetic algorithm was used to improve the performance of the MLP model. The proposed model was applied to a dataset of 594 eggs from 94 patients undergoing IVF, of which 318 were good-quality embryos and 276 were lower-quality embryos. For performance evaluation of the MLP model, an ROC curve analysis was conducted and 10-fold cross-validation was performed. The results revealed that this intelligent model is highly efficient, with an accuracy of 96% for the multilayer perceptron neural network, which is promising compared to counterpart methods.
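A sketch of the stated evaluation protocol: 10-fold cross-validation of an MLP scored by ROC AUC. The GA-tuned architecture is not reproduced; the hidden-layer sizes and the `X`/`y` variables are illustrative assumptions:

```python
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# X, y: the 594-egg feature matrix and embryo-quality labels (assumed prepared)
mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000)
scores = cross_val_score(mlp, X, y, cv=10, scoring="roc_auc")
print(scores.mean())  # mean ROC AUC over the 10 folds
```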
E. Zarei; N. Barimani; G. Nazari Golpayegani
Abstract
Cardiac arrhythmias are known as one of the most dangerous cardiac diseases. Applying intelligent algorithms in this area reduces the ECG signal processing time required of the physician as well as the probable mistakes caused by specialist fatigue. The purpose of this study is to introduce an intelligent algorithm for separating three cardiac arrhythmias by using chaos features of the ECG signal and combining three of the most common classifiers in this signal processing area. First, ECG signals related to three cardiac arrhythmias, Atrial Fibrillation, Ventricular Tachycardia, and Post Supra-Ventricular Tachycardia, along with the normal cardiac signal, were gathered from the MIT-BIH arrhythmia database. Then, chaos features describing the nonlinear dynamics of the ECG signal were extracted by calculating Lyapunov exponent values and the signal's fractal dimension. Finally, a compound classifier was built by combining a multilayer perceptron neural network, a support vector machine, and a K-Nearest-Neighbor classifier. The obtained results were compared with classification based on time-domain and time-frequency-domain features as proof of the efficacy of the chaos features of the ECG signal. Likewise, to evaluate the efficacy of the compound classifier, each network was used both separately and in combination, and the results were compared. The results showed that, using the chaos features of the ECG signal and the compound classifier, cardiac arrhythmias can be classified with an accuracy of 99.1% ± 0.2, a sensitivity of 99.6% ± 0.1, and a specificity of 99.3% ± 0.1.
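A sketch of the compound-classifier idea using scikit-learn: MLP, SVM, and K-Nearest-Neighbor outputs combined by voting, with the chaos features (Lyapunov exponents, fractal dimension) as inputs. The soft-voting rule and hyperparameters are illustrative assumptions about the combination scheme:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

ensemble = VotingClassifier(
    estimators=[("mlp", MLPClassifier(max_iter=1000)),
                ("svm", SVC(probability=True)),
                ("knn", KNeighborsClassifier(n_neighbors=5))],
    voting="soft")  # average predicted probabilities across the three models

# ensemble.fit(chaos_features_train, labels_train)  # 4 classes: AF, VT, PSVT, normal
```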