[1]
C. Yu, S.-Y. Cheng, R. L. He and S. S.-T. Yau, "Protein map: an alignment-free sequence comparison method based on various properties of amino acids," Gene, vol. 486, no. 1-2, pp. 110-118, 2011.
|
[2]
|
F. Zhang, H. Song, M. Zeng, Y. Li, L. Kurgan and M. Li, "DeepFunc: a deep learning framework for accurate prediction of protein functions from protein sequences and interactions," Proteomics, vol. 19, no. 12, p. 1900019, 2019.
|
[3]
|
P. Larranaga, B. Calvo, R. Santana, C. Bielza, J. Galdiano, I. Inza, J. Lozano, R. Armananzas, G. Santafe, A. Perez and V. Robles, "Machine learning in bioinformatics," Briefings in Bioinformatics, vol. 7, no. 1, pp. 86-112, 2006.
|
[4]
|
J. Shen, J. Zhang, X. Luo, W. Zhu, K. Yu, K. Chen, Y. Li and H. Jiang, "Predicting protein-protein interactions based only on sequences information," Proceedings of the National Academy of Sciences, vol. 104, no. 11, pp. 4337-4341, 2007.
|
[5]
|
Y. Ge, S. Zhao and X. Zhao, "A step-by-step classification algorithm of protein secondary structures based on double-layer SVM model," Genomics, vol. 112, no. 2, pp. 1941-1946, 2020.
|
[6]
|
Z. Lv, S. Jin, H. Ding and Q. Zou, "A random forest sub-Golgi protein classifier optimized via dipeptide and amino acid composition features," Frontiers in Bioengineering and Biotechnology, vol. 7, p. 215, 2019.
|
[7]
|
C. L. P. Gupta, A. Bihari and S. Tripathi, "Protein Classification using Machine Learning and Statistical Techniques: A Comparative Analysis," arXiv preprint arXiv:1901.06152, 2019.
|
[8]
|
O. Yakhnenko, A. Silvescu and V. Honavar, "Discriminatively trained Markov model for sequence classification," in Fifth IEEE International Conference on Data Mining (ICDM'05), IEEE, 2005, 8 pp.
|
[9]
|
W. Zheng, L. Yang, R. J. Genco, J. Wactawski-Wende, M. Buck and Y. Sun, "SENSE: Siamese neural network for sequence embedding and alignment-free comparison," Bioinformatics, vol. 35, no. 11, pp. 1820-1828, 2019.
|
[10]
|
B. Dogan, "An alignment-free method for bulk comparison of protein sequences from different species," Balkan Journal of Electrical and Computer Engineering, vol. 7, no. 4, pp. 405-416, 2019.
|
[11]
|
S. Biđin, I. Vujaklija, T. Paradžik, A. Bielen and D. Vujaklija, "Leitmotif: protein motif scanning 2.0," Bioinformatics, vol. 36, no. 11, pp. 3566-3567, 2020.
|
[12]
|
S. Seo, M. Oh, Y. Park and S. Kim, "DeepFam: deep learning based alignment-free method for protein family modeling and prediction," Bioinformatics, vol. 34, no. 13, pp. i254-i262, 2018.
|
[13]
|
D. Zhang and M. Kabuka, "Protein Family Classification from Scratch: A CNN based Deep Learning Approach," IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2020.
|
[14]
|
A. Dabba, A. Tari and D. Zouache, "Multiobjective artificial fish swarm algorithm for multiple sequence alignment," INFOR: Information Systems and Operational Research, vol. 58, no. 1, pp. 38-59, 2020.
|
[15]
|
M. S. Waterman, T. F. Smith and W. A. Beyer, "Some biological sequence metrics," Advances in Mathematics, vol. 20, no. 3, pp. 367-387, 1976.
|
[16]
|
J. D. Thompson, D. G. Higgins and T. J. Gibson, "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice," Nucleic Acids Research, vol. 22, no. 22, pp. 4673-4680, 1994.
|
[17]
|
K. Katoh, K. Misawa, K.-i. Kuma and T. Miyata, "MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform," Nucleic Acids Research, vol. 30, no. 14, pp. 3059-3066, 2002.
|
[18]
|
R. C. Edgar, "MUSCLE: a multiple sequence alignment method with reduced time and space complexity," BMC Bioinformatics, vol. 5, no. 1, p. 113, 2004.
|
[19]
|
C. Notredame, D. G. Higgins and J. Heringa, "T-Coffee: A novel method for fast and accurate multiple sequence alignment," Journal of Molecular Biology, vol. 302, no. 1, pp. 205-217, 2000.
|
[20]
|
F. Naznin, R. Sarker and D. Essam, "Vertical decomposition with genetic algorithm for multiple sequence alignment," BMC Bioinformatics, vol. 12, no. 1, p. 353, 2011.
|
[21]
|
H. Zhu, Z. He and Y. Jia, "A novel approach to multiple sequence alignment using multiobjective evolutionary algorithm based on decomposition," IEEE Journal of Biomedical and Health Informatics, vol. 20, no. 2, pp. 717-727, 2015.
|
[22]
|
S. R. Eddy, "Profile hidden Markov models," Bioinformatics (Oxford, England), vol. 14, no. 9, pp. 755-763, 1998.
|
[23]
|
F. Naznin, R. Sarker and D. Essam, "Progressive alignment method using genetic algorithm for multiple sequence alignment," IEEE Transactions on Evolutionary Computation, vol. 16, no. 5, pp. 615-631, 2012.
|
[24]
|
W. R. Pearson and D. J. Lipman, "Improved tools for biological sequence comparison," Proceedings of the National Academy of Sciences, vol. 85, no. 8, pp. 2444-2448, 1988.
|
[25]
|
W. R. Pearson, "Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms," Genomics, vol. 11, no. 3, pp. 635-650, 1991.
|
[26]
|
S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller and D. J. Lipman, "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs," Nucleic Acids Research, vol. 25, no. 17, pp. 3389-3402, 1997.
|
[27]
|
M. Bhagwat, L. Young and R. R. Robison, "Using BLAT to find sequence similarity in closely related genomes," Current Protocols in Bioinformatics, vol. 37, no. 1, pp. 1-41, 2012.
|
[28]
|
S. Schwartz, W. J. Kent, A. Smit, Z. Zhang, R. Baertsch, R. C. Hardison, D. Haussler and W. Miller, "Human-mouse alignments with BLASTZ," Genome Research, vol. 13, no. 1, pp. 103-107, 2003.
|
[29]
|
B. Ma, J. Tromp and M. Li, "PatternHunter: faster and more sensitive homology search," Bioinformatics, vol. 18, no. 3, pp. 440-445, 2002.
|
[30]
|
A. Chakraborty and S. Bandyopadhyay, "FOGSAA: Fast optimal global sequence alignment algorithm," Scientific Reports, vol. 3, p. 1746, 2013.
|
[31]
|
A. Wong, T. Reichert, D. Cohen and B. Aygun, "A generalized method for matching informational macromolecular code sequences," Computers in Biology and Medicine, vol. 4, no. 1, pp. 43-57, 1974.
|
[32]
|
S. Batzoglou, L. Pachter, J. P. Mesirov, B. Berger and E. S. Lander, "Human and mouse gene structure: comparative analysis and application to exon prediction," Genome Research, vol. 10, no. 7, pp. 950-958, 2000.
|
[33]
|
M. Brudno, C. B. Do, G. M. Cooper, M. F. Kim, E. Davydov, E. D. Green, A. Sidow and S. Batzoglou, "LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA," Genome Research, vol. 13, no. 4, pp. 721-731, 2003.
|
[34]
|
A. L. Delcher, A. Phillippy, J. Carlton and S. L. Salzberg, "Fast algorithms for large-scale genome alignment and comparison," Nucleic Acids Research, vol. 30, no. 11, pp. 2478-2483, 2002.
|
[35]
|
N. Bray, I. Dubchak and L. Pachter, "AVID: A global alignment program," Genome Research, vol. 13, no. 1, pp. 97-102, 2003.
|
[36]
|
W. Huang, D. M. Umbach and L. Li, "Accurate anchoring alignment of divergent sequences," Bioinformatics, vol. 22, no. 1, pp. 29-34, 2006.
|
[37]
|
S. Min, B. Lee and S. Yoon, "Deep learning in bioinformatics," Briefings in Bioinformatics, vol. 18, no. 5, pp. 851-869, 2017.
|
[38]
|
N. Liu, J. Han, D. Zhang, S. Wen and T. Liu, "Predicting eye fixations using convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 362-370.
|
[39]
|
J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho and Y. Bengio, "Attention-based models for speech recognition," in Advances in Neural Information Processing Systems, 2015, pp. 577-585.
|
[40]
|
R. Kiros, Y. Zhu, R. R. Salakhutdinov, R. Zemel, R. Urtasun, A. Torralba and S. Fidler, "Skip-thought vectors," in Advances in Neural Information Processing Systems, 2015, pp. 3294-3302.
|
[41]
|
E. Asgari and M. R. Mofrad, "Continuous distributed representation of biological sequences for deep proteomics and genomics," PLoS ONE, vol. 10, no. 11, p. e0141287, 2015.
|
[42]
|
M. Zeng, F. Zhang, F.-X. Wu, Y. Li, J. Wang and M. Li, "Protein-protein interaction site prediction through combining local and global features with deep neural networks," Bioinformatics, vol. 36, no. 4, pp. 1114-1120, 2020.
|
[43]
|
W. Zhong and F. Gu, "Predicting Local Protein 3D Structures Using Clustering Deep Recurrent Neural Network," IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2020.
|
[44]
|
B. Panda and B. Majhi, "A novel improved prediction of protein structural class using deep recurrent neural network," Evolutionary Intelligence, pp. 1-8, 2018.
|
[45]
|
R. Jafari and M. M. Javidi, "Solving the protein folding problem in hydrophobic-polar model using deep reinforcement learning," SN Applied Sciences, vol. 2, no. 2, p. 259, 2020.
|
[46]
|
H. Hou, T. Gan, Y. Yang, X. Zhu, S. Liu, W. Guo and J. Hao, "Using deep reinforcement learning to speed up collective cell migration," BMC Bioinformatics, vol. 20, no. 18, pp. 1-10, 2019.
|
[47]
|
B. Liu, C.-C. Li and K. Yan, "DeepSVM-fold: protein fold recognition by combining support vector machines and pairwise sequence similarity scores generated by deep learning networks," Briefings in Bioinformatics, 2019.
|
[48]
|
P. Baldi and G. Pollastri, "The principled design of large-scale recursive neural network architectures--DAG-RNNs and the protein structure prediction problem," Journal of Machine Learning Research, vol. 4, pp. 575-602, 2003.
|
[49]
|
D. Bhowmik, S. Gao, M. T. Young and A. Ramanathan, "Deep clustering of protein folding simulations," BMC Bioinformatics, vol. 19, no. 18, pp. 47-58, 2018.
|
[50]
|
Y. Cao, T. A. Geddes, J. Y. H. Yang and P. Yang, "Ensemble deep learning in bioinformatics," Nature Machine Intelligence, vol. 2, no. 9, pp. 500-508, 2020.
|
[51]
|
Z. Guo, J. Liu, Y. Wang, M. Chen, D. Wang, D. Xu and J. Cheng, "Diffusion models in bioinformatics: A new wave of deep learning revolution in action," arXiv preprint arXiv:2302.10907, 2023.
|
[52]
|
S. Zhang, R. Fan, Y. Liu, S. Chen, Q. Liu and W. Zeng, "Applications of transformer-based language models in bioinformatics: a survey," Bioinformatics Advances, vol. 3, no. 1, 2023.
|
[53]
|
T. N. Kinyanjui, K. Mugoye and R. Kibuku, "Multi-Head Self-Attention Fusion Network for Enhanced Multi-Class Crop Disease Classification," Journal of AI and Data Mining, vol. 13, no. 2, pp. 227-240, 2025.
|
[54]
|
V. Vimbi, N. Shaffi and M. Mahmud, "Interpreting artificial intelligence models: a systematic review on the application of LIME and SHAP in Alzheimer’s disease detection," Brain Informatics, vol. 11, no. 1, p. 10, 2024.
|
[55]
|
C. Molnar, "Interpretable machine learning," 2020.
|
[56]
|
H. Peters, "Game theory: A multi-leveled approach," 2015.
|
[57]
|
R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618-626.
|
[58]
|
J. Vig, A. Madani, L. R. Varshney, C. Xiong, R. Socher and N. F. Rajani, "BERTology meets biology: Interpreting attention in protein language models," arXiv preprint arXiv:2006.15222, 2020.
|
[59]
|
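A. Rives, J. Meier, T. Sercu, S. Goyal, Z. Lin, J. Liu, D. Guo, M. Ott, C. L. Zitnick, J. Ma and R. Fergus, "Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences," Proceedings of the National Academy of Sciences, vol. 118, no. 15, p. e2016239118, 2021.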
|
[60]
|
IUPAC-IUB Commission on Biochemical Nomenclature, "Abbreviations and symbols for nucleic acids, polynucleotides, and their constituents," Biochemistry, vol. 9, no. 20, pp. 4022-4027, 1970.
|
[61]
|
X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 249-256.
|
[62]
|
D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
|
[63]
|
R. L. Tatusov, M. Y. Galperin, D. A. Natale and E. V. Koonin, "The COG database: a tool for genome-scale analysis of protein functions and evolution," Nucleic Acids Research, vol. 28, no. 1, pp. 33-36, 2000.
|
[64]
|
R. L. Tatusov, E. V. Koonin and D. J. Lipman, "A genomic perspective on protein families," Science, vol. 278, no. 5338, pp. 631-637, 1997.
|
[65]
|
M. Y. Galperin, K. S. Makarova, Y. I. Wolf and E. V. Koonin, "Expanded microbial genome coverage and improved protein family annotation in the COG database," Nucleic Acids Research, vol. 43, no. D1, pp. D261-D269, 2015.
|
[66]
|
N. M. Razali, Y. B. Wah et al., "Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests," Journal of Statistical Modeling and Analytics, vol. 2, no. 1, pp. 21-33, 2011.
|
[67]
|
R. C. Blair and J. J. Higgins, "Comparison of the power of the paired samples t test to that of Wilcoxon's signed-ranks test under various population shapes," Psychological Bulletin, vol. 97, no. 1, p. 119, 1985.
|