I.3.7. Engineering
Elahe Moradi
Abstract
Thyroid disease is common worldwide, and early diagnosis plays an important role in effective treatment and management. Machine learning techniques are valuable in thyroid disease diagnosis. This research proposes tree-based machine learning algorithms with hyperparameter optimization to predict thyroid disease. The thyroid disease dataset from the UCI Repository is used as a benchmark to evaluate the performance of the proposed algorithms. After data preprocessing and normalization, the data are balanced using the random oversampling (ROS) technique. Two methods, grid search (GS) and random search (RS), are employed to optimize the hyperparameters. Finally, using Python, several criteria are applied to evaluate the performance of the proposed algorithms: decision tree, random forest, AdaBoost, and Extreme Gradient Boosting (XGB). The simulation results indicate that the XGB algorithm with grid search outperforms all the other algorithms, obtaining an accuracy, AUC, sensitivity, precision, and MCC of 99.39%, 99.97%, 98.85%, 99.40%, and 98.79%, respectively. These results demonstrate the potential of the proposed method for accurately predicting thyroid disease.
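As an illustration of two steps named in this abstract, the sketch below shows random oversampling and the confusion-matrix metrics (accuracy, sensitivity, precision, MCC) in plain Python. It is a minimal sketch and not the authors' code: the function names `random_oversample` and `binary_metrics`, the binary-label setting, and the toy data are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))  # sample with replacement
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, precision, and MCC from a 2x2 confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # MCC is conventionally set to 0 when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    X = [[0.1], [0.2], [0.3], [0.4], [0.9]]  # toy features, not real data
    y = [0, 0, 0, 0, 1]
    X_bal, y_bal = random_oversample(X, y)
    print(Counter(y_bal))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In practice, oversampling is usually applied to the training split only, so that duplicated samples do not leak into the evaluation data; whether the study followed this order is not stated in the abstract.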
M. Kakooei; Y. Baleghi
Abstract
Semantic labeling is an active field in remote sensing applications. Although handling highly detailed objects in Very High Resolution (VHR) optical images and VHR Digital Surface Models (DSM) is a challenging task, it can improve the accuracy of semantic labeling methods. In this paper, a semantic labeling method is proposed that fuses optical and normalized DSM data. Spectral and spatial features are fused into a Heterogeneous Feature Map to train the classifier. The classes in the evaluation database are impervious surface, building, low vegetation, tree, car, and background. The proposed method is implemented on Google Earth Engine and consists of several levels. First, Principal Component Analysis is applied to vegetation indices to find the color space that best separates vegetation from non-vegetation areas. The Gray Level Co-occurrence Matrix is computed to provide texture information as spatial features. Several Random Forests are trained with an automatically selected training dataset. Several spatial operators follow the classification to refine the result. A Leaf-Less-Tree feature is used to solve the underestimation problem in tree detection. The area and the major and minor axes of connected components are used to refine building and car detection. Evaluation shows significant improvement in tree, building, and car accuracy. The overall accuracy and Kappa coefficient are also satisfactory.
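The texture step mentioned in this abstract can be illustrated with a small Gray Level Co-occurrence Matrix computation. This is a plain-Python sketch under stated assumptions, not the Google Earth Engine implementation used in the paper: the function names, the single horizontal offset, the symmetric counting, and the 4-level toy image are all chosen here for illustration.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Count how often gray level i co-occurs with gray level j
    at pixel offset (dx, dy). `image` is a list of rows of ints in [0, levels)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                matrix[i][j] += 1
                if symmetric:
                    matrix[j][i] += 1  # also count the reversed pair
    return matrix


def normalize(matrix):
    """Turn co-occurrence counts into joint probabilities."""
    total = sum(sum(row) for row in matrix)
    return [[v / total for v in row] for row in matrix]


def contrast(p):
    """One common texture feature: sum of (i - j)^2 * p(i, j)."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p)
               for j, v in enumerate(row))


if __name__ == "__main__":
    # Toy 4x4 image with 4 gray levels (illustrative values only).
    image = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    counts = glcm(image, levels=4)
    for row in counts:
        print(row)
    print(contrast(normalize(counts)))
```

Other texture features (for example homogeneity or entropy) follow the same pattern of weighting the normalized matrix entries; which features the paper actually uses is not stated in the abstract.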
R. Satpathy; V. B. Konkimalla; J. Ratha
Abstract
The present work was designed to classify and differentiate dehalogenase enzymes from non-dehalogenases (other hydrolases) using the amino acid propensity at the core, at the surface, and in both parts together. The data sets were built individually by selecting 3D protein structures available in the PDB (Protein Data Bank). The core amino acids were predicted with the IPFP tool, and their structural propensity was calculated with an in-house software tool, Propensity Calculator, which is available online. All data sets were finally grouped into two categories, dehalogenase and non-dehalogenase, using the Naïve Bayes, J-48, Random Forest, K-means clustering, and SMO classification algorithms. In a comparison of the classification methods, the tree-based method (Random Forest) performs well, with a maximum classification accuracy of 98.88% for the core propensity data set. We therefore propose that core amino acid propensity could serve as a novel potential descriptor for the classification of enzymes.
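The abstract does not give the formula used by Propensity Calculator. As a rough illustration only, the sketch below assumes one common definition of structural propensity: the fraction of a residue type among core positions divided by its fraction in the whole sequence. The function name `core_propensity`, the toy sequence, and the core positions are hypothetical.

```python
from collections import Counter


def core_propensity(sequence, core_positions):
    """Propensity of each residue type for the core, under the ASSUMED definition
    (count_in_core / core_size) / (count_in_protein / protein_length).
    Values above 1 mean the residue is over-represented in the core."""
    core_positions = sorted(set(core_positions))  # ignore duplicate indices
    if not sequence or not core_positions:
        raise ValueError("sequence and core_positions must be non-empty")
    total = Counter(sequence)
    core = Counter(sequence[i] for i in core_positions)
    n_total, n_core = len(sequence), len(core_positions)
    return {
        aa: (core[aa] / n_core) / (total[aa] / n_total)
        for aa in sorted(total)
    }


if __name__ == "__main__":
    # Toy 6-residue sequence; positions 2-4 are treated as core (illustrative).
    print(core_propensity("AAVVLG", [2, 3, 4]))
```

A per-protein vector of such values, one entry per amino acid type, is the kind of feature vector that could then be passed to a classifier such as Random Forest.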