An iterated greedy algorithm for improving the generation of synthetic patterns in imbalanced learning

Research areas:
Year:
2017
Type of Publication:
In Proceedings
Keywords:
Over-sampling, imbalanced classification, ADASYN, iterative greedy algorithm, metaheuristics
Authors:
Volume:
10305
Book title:
14th International Work-Conference on Artificial and Natural Neural Networks (IWANN2017)
Series:
Lecture Notes in Computer Science (LNCS)
Pages:
513-524
Organization:
Cádiz, Spain
Month:
14th-16th June
ISBN:
978-3-319-59146-9
Abstract:
Real-world classification datasets often have a skewed class distribution, in which one or more classes are under-represented with respect to the rest. One of the most successful approaches for alleviating this problem is to generate synthetic minority samples as convex combinations of existing ones. Within this framework, adaptive synthetic (ADASYN) sampling is a relatively recent method that weights minority examples according to their learning complexity, so that difficult examples are more likely to be over-sampled. This paper proposes an improvement of the ADASYN method in which the learning complexity of these patterns is also used to decide which sample of the neighbourhood is selected. Moreover, to avoid suboptimal results from the random convex combination, the paper explores an iterative greedy algorithm that refines the synthetic patterns by repeatedly replacing a subset of them. The experiments consider six binary datasets and four over-sampling methods. The results show that the new version of ADASYN is more robust and that the iterative greedy metaheuristic significantly improves the quality of the generated patterns, with a positive effect on the final classification model.
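The ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the objective used to judge synthetic patterns, so `score` is a hypothetical user-supplied callable, and all function names and parameters (`adasyn_weights`, `generate`, `iterated_greedy`, `frac`, `iters`) are assumptions made for illustration. Neighbours are chosen uniformly, as in plain ADASYN-style sampling, rather than by the complexity-aware rule the paper proposes, and the refinement loop is a simplified destroy-and-rebuild scheme with greedy acceptance.

```python
import numpy as np

def knn_indices(X, query, k):
    """Indices of the k rows of X nearest to each row of query (Euclidean)."""
    d = np.linalg.norm(query[:, None, :] - X[None, :, :], axis=2)
    return np.argsort(d, axis=1)[:, :k]

def adasyn_weights(X, y, minority=1, k=5):
    """ADASYN-style difficulty: share of majority points among the k nearest
    neighbours of each minority point, normalised to sum to 1."""
    X_min = X[y == minority]
    # Skip the first neighbour, which is the point itself (assumes no duplicates).
    idx = knn_indices(X, X_min, k + 1)[:, 1:]
    r = (y[idx] != minority).mean(axis=1)
    if r.sum() == 0:  # no minority point is "difficult": fall back to uniform
        return np.full(len(X_min), 1.0 / len(X_min))
    return r / r.sum()

def generate(X_min, weights, n, k, rng):
    """Draw n synthetic points, each a random convex combination of a
    weight-sampled minority seed and one of its k minority neighbours."""
    k = min(k, len(X_min) - 1)
    seeds = rng.choice(len(X_min), size=n, p=weights)
    neigh = knn_indices(X_min, X_min[seeds], k + 1)[:, 1:]
    partner = neigh[np.arange(n), rng.integers(0, k, size=n)]  # uniform choice
    lam = rng.random((n, 1))  # mixing coefficient in [0, 1)
    return X_min[seeds] + lam * (X_min[partner] - X_min[seeds])

def iterated_greedy(X_min, weights, n, score, k=5, frac=0.2, iters=50, seed=0):
    """Simplified refinement loop: discard a fraction of the synthetic points,
    regenerate them, and keep the candidate only if `score` improves.
    `score` is a hypothetical quality measure (higher is better)."""
    rng = np.random.default_rng(seed)
    best = generate(X_min, weights, n, k, rng)
    best_score = score(best)
    n_replace = max(1, int(frac * n))
    for _ in range(iters):
        cand = best.copy()
        drop = rng.choice(n, size=n_replace, replace=False)       # destruction
        cand[drop] = generate(X_min, weights, n_replace, k, rng)  # reconstruction
        s = score(cand)
        if s > best_score:                                        # greedy acceptance
            best, best_score = cand, s
    return best, best_score
```

In practice `score` could be, for example, the validation performance of a classifier trained on the original data plus the candidate synthetic set; that choice is an assumption here, not something stated in the abstract.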