Publication
A. Ramírez, J.A. Parejo, J.R. Romero*, S. Segura and A. Ruiz-Cortés. “Evolutionary composition of QoS-aware web services: a many-objective perspective”. Expert Systems with Applications, vol. 72, pp. 357-370. 2017.
Abstract
Web service based applications usually invoke services provided by third-parties in their workflow. The Quality of Service (QoS) provided by the invoked supplier can be expressed in terms of the Service Level Agreement specifying the values contracted for particular aspects like cost or throughput, among others. Hence, developers are required to scrutinize the service market in order to select those candidates that best fit with the expected composition focusing on different QoS aspects. This search problem, a.k.a. QoS-aware web service composition, is characterized by the presence of many diverse QoS properties to be simultaneously optimized from a multi-objective perspective. This paper explores the suitability of many-objective evolutionary algorithms for tackling the binding problem of web services on the basis of a real-world benchmark with 9 QoS properties. Then, a complete comparative study provides empirical evidence on the adequacy of the most recent and sophisticated techniques to achieve a better trade-off between all the QoS properties. Furthermore, an in-depth study shows that some algorithms are able to promote specific QoS properties while keeping high values for the rest of attributes, enabling appealing advantages for the application of many-objective evolutionary algorithms within the field of service oriented computation.
Highlights
- QoS-aware web service composition requires the simultaneous optimization of multiple QoS attributes.
- Having conflicting QoS properties requires computationally efficient approaches.
- A comparative experimental study of multi- and many-objective algorithms is presented.
- Many-objective proposals can promote certain QoS properties while preserving the overall trade-off.
Additional material
Experimental study
- Experiment #1. It considers web service compositions with a maximum of 10, 20, 30, 40, or 50 tasks, where each task contains a different set of candidate services. Combining these elements, a total of 15 problem instances have been generated, i.e. 3 instances per maximum number of tasks, each one associated with a different set of candidate services but sharing the same workflow (a data-layout sketch is given after this list).
- Experiment #2. In order to validate the conclusions drawn from Experiment #1, Experiment #2 serves to verify that the fixed parameter, i.e. the workflow, does not have a marked influence on the outcomes. Therefore, 15 different composition structures were generated for 3 representative sizes, i.e. 10, 30 and 50 tasks, leading to a total of 45 problem instances.
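The exact instance format is defined by the generator referenced below; purely as an illustration, the following Python sketch shows one plausible in-memory layout for such an instance (all names, sizes and the random QoS values are hypothetical):

```python
# Purely illustrative: the real instances were produced with the generator by
# Parejo et al. referenced below; all names, sizes and the random QoS values
# here are hypothetical.
import random

N_QOS_ATTRIBUTES = 9  # the benchmark uses 9 QoS properties per candidate service

def generate_instance(max_tasks: int, candidates_per_task: int, seed: int = 0):
    """Build a toy composition instance: for each task of the workflow, a list
    of candidate services, each described by a vector of QoS values in [0, 1]."""
    rng = random.Random(seed)
    n_tasks = rng.randint(1, max_tasks)
    return [
        [[rng.random() for _ in range(N_QOS_ATTRIBUTES)]
         for _ in range(candidates_per_task)]
        for _ in range(n_tasks)
    ]

# A candidate solution ("binding") selects one service index per task.
instance = generate_instance(max_tasks=10, candidates_per_task=20, seed=42)
binding = [random.randrange(len(candidates)) for candidates in instance]
```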
Problem instances
Problem instances for Experiments #1 and #2 are available for download as a ZIP file (855 KB).
All the problem instances used in the experimentation were generated with the instance generator proposed in: J.A. Parejo, S. Segura, P. Fernández and A. Ruiz-Cortés. “QoS-aware web services composition using GRASP with path relinking”. Expert Systems with Applications, vol. 41(9), pp. 4211-4223. 2014.
The QoS values of the candidate services have been extracted from the QWS dataset.
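For illustration only, the sketch below shows how QWS-style attribute values might be min-max scaled to a common [0, 1] range in which every objective is maximized; the attribute names are those of the public QWS dataset, but the exact scaling and aggregation used in the paper are not reproduced here.

```python
# Hedged sketch: min-max scaling of QWS-style attribute values so that every
# objective is maximized. Attribute names follow the public QWS dataset; the
# exact scaling/aggregation used in the paper is not reproduced here.
COST_ATTRIBUTES = {"Response Time", "Latency"}  # lower raw values are better
BENEFIT_ATTRIBUTES = {"Availability", "Throughput", "Successability",
                      "Reliability", "Compliance", "Best Practices",
                      "Documentation"}          # higher raw values are better

def normalize(value: float, lo: float, hi: float, attribute: str) -> float:
    """Min-max normalization so that 1.0 is always the preferred extreme."""
    if hi == lo:
        return 1.0
    scaled = (value - lo) / (hi - lo)
    return 1.0 - scaled if attribute in COST_ATTRIBUTES else scaled
```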
Experimental results
Results are available for download in Excel format:
- Experiment #1 (79 KB)
- Experiment #2 (172 KB)
These files contain the mean and standard deviation of the QoS values of the solutions belonging to the Pareto sets returned by each algorithm, as well as the quality indicators used for the statistical validation.
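As a rough guide to how such summaries can be reproduced, the sketch below computes the per-objective mean and standard deviation of a Pareto set together with one common formulation of the spacing indicator (Schott's metric over minimum L1 distances); it is an illustrative approximation, not the exact tooling used for the paper.

```python
# Illustrative approximation of the per-algorithm summaries in these files:
# per-objective mean and standard deviation over a Pareto set, plus one common
# formulation of the spacing indicator (Schott's metric).
import numpy as np

def qos_summary(pareto_front):
    """pareto_front: array of shape (n_solutions, n_objectives)."""
    front = np.asarray(pareto_front, dtype=float)
    return front.mean(axis=0), front.std(axis=0)

def spacing(pareto_front) -> float:
    front = np.asarray(pareto_front, dtype=float)
    n = len(front)
    if n < 2:
        return 0.0
    # d_i: L1 distance from each solution to its nearest neighbour in the set
    dists = np.abs(front[:, None, :] - front[None, :, :]).sum(axis=2)
    np.fill_diagonal(dists, np.inf)
    d = dists.min(axis=1)
    return float(np.sqrt(((d.mean() - d) ** 2).sum() / (n - 1)))
```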
Statistical tests
Experiment #1
Friedman and Holm tests
Comparison of the algorithms in terms of hypervolume
i | Algorithm | Ranking | z | p | alpha/i | Hypothesis |
---|---|---|---|---|---|---|
7 | NSGA-III | 8.0000 | 7.3790 | 1.5945E-13 | 0.0071 | Rejected
6 | SPEA2 | 6.1333 | 5.2920 | 1.2097E-07 | 0.0083 | Rejected |
5 | GrEA | 6.0000 | 5.1430 | 2.7045E-07 | 0.0100 | Rejected |
4 | MOEA/D | 4.8000 | 3.8013 | 1.4393E-04 | 0.0125 | Rejected |
3 | IBEA | 4.2667 | 3.2050 | 1.3505E-03 | 0.0167 | Rejected |
2 | NSGA-II | 3.4000 | 2.2361 | 2.5347E-02 | 0.0250 | Accepted |
1 | HypE | 2.0000 | 0.6708 | 5.0233E-01 | 0.0500 | Accepted |
0 | e-MOEA | 1.4000 | – | – | – | – |
Friedman test:
Iman-Davenport statistic (distributed according to an F-distribution with 7 and 98 degrees of freedom): 63.1879
Critical value at the significance level (alpha=0.01): 2.8272
Holm test:
Holm test rejects those hypotheses that have a p-value < 0.025.
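For readers who want to reproduce this kind of table, the sketch below outlines the usual pipeline under standard formulations: average Friedman ranks per algorithm, the Iman-Davenport correction of the Friedman statistic, and Holm's step-down comparison against the best-ranked (control) algorithm. It is a generic reimplementation, not the exact tool used to produce the numbers above.

```python
# Generic reimplementation, under standard formulations, of the statistics in
# the tables above: average Friedman ranks, the Iman-Davenport correction and
# Holm's step-down post-hoc test against the best-ranked (control) algorithm.
import numpy as np
from scipy.stats import norm, rankdata

def friedman_holm(scores, minimize=False, alpha=0.05):
    """scores: array of shape (n_instances, n_algorithms) holding a quality
    indicator (e.g. hypervolume, where higher is better, so minimize=False)."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape

    # Rank the algorithms on every instance (rank 1 = best) and average.
    ranks = np.vstack([rankdata(row if minimize else -row) for row in scores])
    avg = ranks.mean(axis=0)

    # Friedman statistic and its Iman-Davenport correction, F-distributed
    # with (k - 1) and (k - 1)(n - 1) degrees of freedom.
    chi2 = 12 * n / (k * (k + 1)) * (np.sum(avg ** 2) - k * (k + 1) ** 2 / 4)
    iman_davenport = (n - 1) * chi2 / (n * (k - 1) - chi2)

    # Post-hoc comparison of every algorithm against the best-ranked control.
    control = int(np.argmin(avg))
    se = np.sqrt(k * (k + 1) / (6 * n))
    z = (avg - avg[control]) / se
    p = 2 * norm.sf(np.abs(z))

    # Holm step-down: sort the k-1 p-values ascending and compare the j-th
    # smallest to alpha / (k - 1 - j); stop rejecting at the first acceptance.
    order = [i for i in np.argsort(p) if i != control]
    rejected, stop = {}, False
    for j, idx in enumerate(order):
        stop = stop or p[idx] > alpha / (k - 1 - j)
        rejected[int(idx)] = not stop
    return avg, iman_davenport, z, p, rejected
```

With k = 8 algorithms, the degrees of freedom are (7, 98) for Experiment #1 (n = 15 instances) and (7, 308) for Experiment #2 (n = 45 instances), as reported in the tables.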
Comparison of the algorithms in terms of spacing
i | Algorithm | Ranking | z | p | alpha/i | Hypothesis |
---|---|---|---|---|---|---|
7 | IBEA | 8.0000 | 7.8262 | 5.0269E-15 | 0.0071 | Rejected |
6 | HypE | 6.5333 | 6.1865 | 6.1532E-10 | 0.0083 | Rejected |
5 | GrEA | 6.4000 | 6.0374 | 1.5663E-09 | 0.0100 | Rejected |
4 | NSGA-III | 4.7333 | 4.1740 | 2.9931E-05 | 0.0125 | Rejected |
3 | e-MOEA | 4.0000 | 3.3541 | 7.9623E-04 | 0.0167 | Rejected |
2 | SPEA2 | 2.7333 | 1.9379 | 5.2632E-02 | 0.0250 | Accepted |
1 | MOEA/D | 2.6000 | 1.7889 | 7.3638E-02 | 0.0500 | Accepted |
0 | NSGA-II | 1.0000 | – | – | – | – |
Friedman test:
Iman-Davenport statistic (distributed according to an F-distribution with 7 and 98 degrees of freedom): 202.1765
Critical value at the significance level (alpha=0.01): 2.8272
Holm test:
Holm test rejects those hypotheses that have a p-value < 0.025.
Cliff’s Delta test (effect size)
Cliff’s Delta test results in raw format (.txt):
- Hypervolume (9 KB)
- Spacing (9 KB)
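The files above report Cliff's delta values; as a reference, a minimal formulation of this non-parametric effect size is sketched below (a value of +1 or -1 indicates that all pairwise comparisons favour one algorithm, 0 indicates complete overlap between the two samples).

```python
# Reference formulation of Cliff's delta, the non-parametric effect size in
# the raw files above.
import numpy as np

def cliffs_delta(a, b) -> float:
    """a, b: 1-D sequences of a quality indicator (e.g. hypervolume) obtained
    by two algorithms over the same runs/instances."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a[:, None] - b[None, :]
    return float((np.sum(diff > 0) - np.sum(diff < 0)) / (a.size * b.size))
```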
Experiment #2
Friedman and Holm tests
Comparison of the algorithms in terms of hypervolume
i | Algorithm | Ranking | z | p | alpha/i | Hypothesis |
---|---|---|---|---|---|---|
7 | NSGA-III | 8.0000 | 12.4366 | 1.6544E-35 | 0.0071 | Rejected
6 | SPEA2 | 6.4222 | 9.3812 | 6.5210E-21 | 0.0083 | Rejected |
5 | GrEA | 5.7778 | 8.1333 | 4.1788E-16 | 0.0100 | Rejected |
4 | IBEA | 4.6667 | 5.9816 | 2.2095E-09 | 0.0125 | Rejected |
3 | MOEA/D | 4.6444 | 5.9386 | 2.8751E-09 | 0.0167 | Rejected |
2 | NSGA-II | 2.9556 | 2.6681 | 7.6292E-03 | 0.0250 | Rejected |
1 | HypE | 1.9556 | 0.7316 | 4.6444E-01 | 0.0500 | Accepted |
0 | e-MOEA | 1.5778 | – | – | – | – |
Friedman test:
Iman-Davenport statistic (distributed according to an F-distribution with 7 and 308 degrees of freedom): 220.9533
Critical value at the significance level (alpha=0.01): 2.6977
Holm test:
Holm test rejects those hypotheses that have a p-value < 0.05.
Comparison of the algorithms in terms of spacing
i | Algorithm | Ranking | z | p | alpha/i | Hypothesis |
---|---|---|---|---|---|---|
7 | IBEA | 8.0000 | 13.5554 | 7.3568E-42 | 0.0071 | Rejected |
6 | HypE | 6.6222 | 10.8874 | 5.2671E-25 | 0.0083 | Rejected |
5 | GrEA | 6.3333 | 10.3280 | 5.2671E-25 | 0.0100 | Rejected |
4 | NSGA-III | 4.3333 | 6.4550 | 1.0824E-10 | 0.0125 | Rejected |
3 | e-MOEA | 3.9778 | 5.7664 | 8.0963E-09 | 0.0167 | Rejected |
2 | SPEA2 | 3.2444 | 4.3463 | 1.3842E-05 | 0.0250 | Rejected |
1 | MOEA/D | 2.4889 | 2.8832 | 3.9363E-03 | 0.0500 | Rejected |
0 | NSGA-II | 1.0000 | – | – | – | – |
Friedman test:
Iman-Davenport statistic (distributed according to an F-distribution with 7 and 308 degrees of freedom): 453.6330
Critical value at the significance level (alpha=0.01): 2.6977
Holm test:
Holm test rejects all the hypotheses.
Cliff’s Delta test (effect size)
Cliff’s Delta test results in raw format:
- Hypervolume (10 KB)
- Spacing (9 KB)