Document Type: Original/Review Paper


1 Department of Computer Engineering, Faculty of Engineering, Alzahra University, Tehran, Iran

2 Data Mining Lab, Department of Computer Engineering, Faculty of Engineering, Alzahra University, Tehran, Iran



Regression test suite reduction is an essential phase in software testing. In this phase, redundant and unnecessary test cases are eliminated without degrading software accuracy or performance. Various approaches to regression test suite reduction have been proposed so far. The main challenge in this area is to provide a method that maintains fault-detection capability while reducing the test suite. In this paper, a new test suite reduction technique based on data mining is proposed. In addition to reducing the test suite, the method preserves its fault-detection capability by using both clustering and classification. Regression test cases are reduced at two levels using a bi-criteria data mining-based method. At each level, different coverage criteria and clustering algorithms are used to strike a better compromise between test suite size and the fault-detection ability of the reduced test suite. The results of the proposed method are compared with those of five other methods in terms of PSTR and PFDL. The experiments show that the proposed method reduces the test suite efficiently while maintaining its fault-detection capability.
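The abstract does not define the two evaluation measures, so the following is only an illustrative sketch. It assumes that PSTR denotes the percentage of test suite size reduction and PFDL the percentage of fault-detection loss, both computed in the usual way as a relative difference between the original and the reduced suite. The function name `reduction_metrics`, the data layout, and the sample data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def reduction_metrics(
    full_suite: Dict[str, Set[str]],
    reduced_ids: Set[str],
) -> Dict[str, float]:
    """Return size-reduction and fault-detection-loss percentages.

    full_suite maps each test-case id to the set of fault ids it detects
    (an assumed data layout). reduced_ids is the subset of test cases
    kept after reduction.
    """
    unknown = reduced_ids - set(full_suite)
    if unknown:
        raise ValueError(f"unknown test ids: {sorted(unknown)}")

    # Faults detected by the original suite and by the reduced suite.
    faults_full: Set[str] = set().union(*full_suite.values())
    faults_reduced: Set[str] = set()
    for test_id in reduced_ids:
        faults_reduced |= full_suite[test_id]

    # Assumed definitions: relative loss in size and in detected faults.
    size_reduction = (
        100.0 * (len(full_suite) - len(reduced_ids)) / len(full_suite)
        if full_suite else 0.0
    )
    fault_loss = (
        100.0 * (len(faults_full) - len(faults_reduced)) / len(faults_full)
        if faults_full else 0.0
    )
    return {"size_reduction_pct": size_reduction, "fault_loss_pct": fault_loss}


# Hypothetical example: four test cases detecting three faults in total.
suite = {"t1": {"f1", "f2"}, "t2": {"f2"}, "t3": {"f3"}, "t4": {"f1"}}
kept_all_faults = reduction_metrics(suite, {"t1", "t3"})
lost_one_fault = reduction_metrics(suite, {"t1"})
```

Under these assumed definitions, a good reduction technique pushes the first value up while keeping the second close to zero, which is the compromise the abstract describes.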

