Statistically-driven Coral Reef metaheuristic for automatic hyperparameter setting and architecture design of Convolutional Neural Networks

Research areas:
Year:
2020
Type of Publication:
In Proceedings
Keywords:
Convolutional Neural Networks, Coral Reef based optimisation, architecture definition, optimisation
Authors:
Book title:
Proceedings of the 2020 IEEE Congress on Evolutionary Computation (CEC2020)
Pages:
1-8
Organization:
Glasgow, UK
Month:
19th-24th July
ISBN:
978-1-7281-6926-2
Abstract:
Adjusting the hyperparameters and network structure of Convolutional Neural Networks (CNNs) is an important step towards building learning models that are effective yet still efficient. Selecting the best configuration is a problem-dependent task that involves exploring an enormous and complex search space. For this reason, heuristic-based search is well suited to the task, since it seeks a near-optimal solution in a large and complex search space. This paper presents SCRODeep, a self-adapting algorithm based on a statistically-driven Coral Reef Optimisation algorithm (SCRO), for selecting the most adequate CNN architecture in a particular domain. This metaheuristic has been designed to navigate a search space that represents both the architecture (the particular set of layers, including convolutional or pooling layers) and the hyperparameters of the network (e.g. activation functions, number of units or the kernel initializer, among others), while the connection weights and biases are inferred using typical CNN optimisation algorithms. In contrast to other approaches, where using a metaheuristic in turn requires fixing a series of hyperparameters (e.g. the mutation probability in a genetic algorithm), our approach follows a self-parametrisation perspective, thus removing the need to fix these values. The method has been tested on the design of CNNs for image classification, showing that SCRODeep is able to find competitive solutions while constraining the complexity of the architectures found.
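The kind of search space described in the abstract can be pictured with the minimal Python sketch below. It is not SCRODeep's actual encoding: the layer types, value ranges and the helpers `random_layer`, `random_candidate` and `mutate` are all illustrative assumptions. The sketch only shows a candidate that stores the layer sequence and per-layer hyperparameters, leaves weights and biases out of the encoding, and can be perturbed by a search operator.

```python
import random

# Hypothetical search-space values; the paper's real ranges are not stated here.
ACTIVATIONS = ["relu", "tanh", "sigmoid"]
INITIALIZERS = ["glorot_uniform", "he_normal"]
LAYER_TYPES = ["conv", "pool", "dense"]


def random_layer(rng):
    """Sample one layer description as a plain dict."""
    kind = rng.choice(LAYER_TYPES)
    if kind == "conv":
        return {"type": "conv",
                "filters": rng.choice([16, 32, 64]),
                "kernel": rng.choice([3, 5]),
                "activation": rng.choice(ACTIVATIONS),
                "init": rng.choice(INITIALIZERS)}
    if kind == "pool":
        return {"type": "pool", "size": 2}
    return {"type": "dense",
            "units": rng.choice([32, 64, 128]),
            "activation": rng.choice(ACTIVATIONS),
            "init": rng.choice(INITIALIZERS)}


def random_candidate(rng, max_layers=6):
    """A candidate is a variable-length list of layers.

    Weights and biases are deliberately absent: they would be fitted
    by ordinary gradient-based training when the candidate is evaluated.
    """
    n_layers = rng.randint(1, max_layers)
    return [random_layer(rng) for _ in range(n_layers)]


def mutate(candidate, rng):
    """Return a copy of the candidate with one random layer resampled."""
    child = [dict(layer) for layer in candidate]
    child[rng.randrange(len(child))] = random_layer(rng)
    return child


rng = random.Random(0)
parent = random_candidate(rng)
child = mutate(parent, rng)
```

This sketch does not enforce a valid layer ordering (for example, convolutional layers before dense ones), and it uses a fixed, hand-written mutation operator, whereas the abstract states that SCRODeep avoids hand-fixed search parameters through self-parametrisation.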