R. Satpathy; V. B. Konkimalla; J. Ratha
Abstract
The present work was designed to distinguish dehalogenase enzymes from non-dehalogenases (other hydrolases) using amino acid propensity at the core, at the surface, and in both parts together. Individual data sets were built from the 3D protein structures available in the Protein Data Bank (PDB). Core amino acids were predicted with the IPFP tool, and their structural propensities were calculated with an in-house software tool, Propensity Calculator, which is available online. All data sets were then grouped into two categories, dehalogenase and non-dehalogenase, using the Naïve Bayes, J-48, Random forest, K-means clustering, and SMO algorithms. In a comparison of these classification methods, the tree-based method (Random forest) performed well, with a maximum classification accuracy of 98.88 % for the core propensity data set. We therefore propose that core amino acid propensity could serve as a novel potential descriptor for the classification of enzymes.
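The abstract does not give the formula used by Propensity Calculator. As a minimal sketch of the kind of descriptor involved, the following assumes a common definition of propensity: the frequency of an amino acid in the core divided by its frequency in the whole protein. The function name `core_propensity`, the toy sequence, and the choice of core positions are hypothetical and are not taken from the original work.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def core_propensity(sequence, core_positions):
    """Return a dict mapping each amino acid to its core propensity.

    Assumed definition (not confirmed by the source):
        propensity(a) = (count of a in core / core size)
                        / (count of a in protein / protein length)
    Amino acids absent from the protein get a propensity of 0.0.
    """
    positions = set(core_positions)
    if not sequence or not positions:
        raise ValueError("sequence and core_positions must be non-empty")

    total_counts = Counter(sequence)
    core_counts = Counter(sequence[i] for i in positions)
    n_total = len(sequence)
    n_core = len(positions)

    propensities = {}
    for aa in AMINO_ACIDS:
        if total_counts[aa] == 0:
            propensities[aa] = 0.0
        else:
            core_freq = core_counts[aa] / n_core
            overall_freq = total_counts[aa] / n_total
            propensities[aa] = core_freq / overall_freq
    return propensities


# Toy example: residues at 0-based positions 4..7 are treated as the core.
toy = core_propensity("AAGGLLVV", {4, 5, 6, 7})
print(toy["L"], toy["A"])
```

Under this assumed definition, a value above 1 means the residue is enriched in the core relative to the protein as a whole. The resulting 20-value vector is the kind of feature row that could be passed to a classifier such as Random forest.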