[Figure: design alternatives organized around the interactive algorithm: tester's actions (modify test sequence, parameters, assertions); interaction mechanism (tester-based evaluation of readability, tester-guided search for fault detection); feedback integration. Five design questions: When? (min. coverage, detected mutants, ...); What? (best tests, common target, ...); How? (test case, test case + mutated lines); For how long? (current test, transfer to other tests/runs); Change? (permanent or editable feedback).]
Fig. 1. Design alternatives for a search-based interactive algorithm to address the test case generation problem.
to avoid premature interactions, and the relation between test cases and targets can be used to guide the selection of the test cases to be shown. The tester's feedback, which might or might not be revisited, could be applied to the selected test cases, or transferred to other tests in the same or a different run.
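As an illustration of the "when" dimension in Fig. 1, a trigger policy could delay interactions until the search has made some progress. The following Java sketch is a minimal example under assumed parameters (minimum coverage, minimum number of generations, interaction budget); none of these names come from EvoSuite.

/* Hypothetical trigger policy: all names and thresholds are assumptions,
 * not part of EvoSuite. */
public class InteractionTrigger {
    private final double minCoverage;   // e.g., 0.5: wait until half the targets are covered
    private final int minGenerations;   // avoid interacting too early in the run
    private final int maxInteractions;  // budget: how often the tester may be interrupted
    private int interactionsSoFar = 0;

    public InteractionTrigger(double minCoverage, int minGenerations, int maxInteractions) {
        this.minCoverage = minCoverage;
        this.minGenerations = minGenerations;
        this.maxInteractions = maxInteractions;
    }

    /** Returns true if the search should pause now and ask the tester for feedback. */
    public boolean shouldInteract(int generation, double coverage) {
        if (interactionsSoFar >= maxInteractions) return false;
        if (generation < minGenerations || coverage < minCoverage) return false;
        interactionsSoFar++;
        return true;
    }
}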
III. TOWARDS INTERACTIVE READABILITY ASSESSMENT
Our current efforts are focused on the challenge of improving the readability of the generated test suites based on the tester's preferences. With that purpose in mind, we are designing a new version of EvoSuite that includes an interactive module to pause the search at different points throughout the execution and to incorporate readability assessments of several test cases. At each of those points, the system is intended to prepare an interaction with the tester in which:
• Different test cases from the population are selected, all of them covering one of the targets pursued by the search.
• Both the test cases and the target are shown to the tester for their revision.
• The system receives the readability assessment made by the tester in the form of a readability score for each test.
The collected scores are intended to be used when forming the final test suite, prioritizing the test cases that the tester finds most readable. This interactive version will be configurable to set the desired number of interactions and how many tests are selected for revision, among other parameters.
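To make the intended flow concrete, the following Java sketch illustrates one interaction round under simplified assumptions: TestCase, Tester, and the selection and prioritization methods are hypothetical stand-ins, not EvoSuite's actual internal API.

import java.util.*;

/** Illustrative sketch of one interaction round (hypothetical types, not EvoSuite's API). */
public class ReadabilityInteraction {
    /** Minimal stand-ins for the tool's internal types (assumed, not real). */
    record TestCase(String id, String code, Set<String> coveredTargets) {}
    interface Tester { int rateReadability(String targetId, TestCase test); } // e.g., 1..5 scale

    private final Map<String, Integer> scores = new HashMap<>(); // test id -> readability score

    /** Select tests from the population that all cover the given target and have the tester rate them. */
    public void interact(List<TestCase> population, String targetId, Tester tester, int maxTests) {
        population.stream()
                .filter(t -> t.coveredTargets().contains(targetId))
                .limit(maxTests)
                .forEach(t -> scores.put(t.id(), tester.rateReadability(targetId, t)));
    }

    /** When forming the final suite, prefer the tests the tester rated as most readable. */
    public List<TestCase> prioritize(List<TestCase> candidates) {
        return candidates.stream()
                .sorted(Comparator.comparingInt((TestCase t) -> scores.getOrDefault(t.id(), 0)).reversed())
                .toList();
    }
}

In this sketch the scores simply reorder candidate tests when the suite is assembled; in EvoSuite, the equivalent step would have to cooperate with its own post-processing, such as the removal of redundant tests mentioned below.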
IV. CHALLENGES AND OPEN ISSUES
We are currently facing the following challenges:
a) Technical challenges: The interactive module should monitor the state of the evolution to choose the right moments to interact and to decide which test cases are worth revising by the tester. This is clearly conditioned by the inner procedures of EvoSuite, which includes different evolutionary algorithms and performs additional steps to build the final test suite (e.g., removal of redundant tests). If the tester is asked to evaluate readability, a practical rating scale should be defined. Moreover, incorporating this information should not divert the search from its primary objective, i.e., the test suite should not gain code readability at the expense of losing coverage. Also, the algorithm should be prepared to deal with possible inconsistencies in the tester's feedback.
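One way to keep readability from diverting the search from coverage, sketched below as an assumption rather than as EvoSuite's actual mechanism, is a lexicographic comparison in which coverage always dominates and the tester's readability score only breaks ties.

import java.util.Comparator;

/** Hypothetical individual: coverage-based fitness is primary, readability secondary. */
record Individual(double coverageFitness, double readability) {}

class LexicographicComparator implements Comparator<Individual> {
    private static final double EPS = 1e-6; // tolerance for "equal" coverage

    @Override
    public int compare(Individual a, Individual b) {
        // Coverage dominates: readability can never trade off against it.
        if (Math.abs(a.coverageFitness() - b.coverageFitness()) > EPS) {
            return Double.compare(b.coverageFitness(), a.coverageFitness()); // higher coverage first
        }
        return Double.compare(b.readability(), a.readability()); // tie-break: more readable first
    }
}

Inconsistent feedback (e.g., the same test rated differently across interactions) could be smoothed before it reaches such a comparator, for instance by averaging repeated scores; this too is an assumption rather than a settled design choice.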
b) Experimental challenges: Choosing the class under test becomes an important decision: while simple classes might be easily covered by EvoSuite, complex ones might prove difficult for the human to understand. Another aspect is the difficulty of recruiting participants with the necessary expertise in software testing to empirically validate the interactive approach in a realistic scenario. Also, participants with different levels of testing knowledge will have different perceptions of readability, since it is a highly subjective concept.
c) Practicability issues: Transferring our approach to an industrial setting represents a long-term challenge. Whether this interactive approach could be practical when applied to industrial codebases, and whether it could actually serve to bridge the gap between the state of research and practice, remains an open issue.
ACKNOWLEDGMENT
Work supported by the European Commission (FEDER), the Spanish Ministry of Science and Innovation (RTI2018-093608-BC33, RED2018-102472-T, PID2020-115832GB-I00) and the Andalusian Regional Government (DOC 00944).