IMPROVING THE ARCHITECTURE AND INTERPRETABILITY OF DEEP NEURAL NETWORKS: AN EVOLUTIONARY COMPUTING APPROACH
BASIC INFORMATION
Ph.D. Student: Antonio R. Moya Martín-Castaño
Advisor: Sebastián Ventura
Defended on: September 2024
Keywords: Evolutionary Algorithms, Neural Networks
THESIS PROPOSAL
Many advances have been made in recent years in the field of artificial intelligence. Among them, deep learning stands out as a growing paradigm in which models inspired by the networks of neurons in the human nervous system are developed to perform computationally costly tasks such as image recognition and speech synthesis.
For these deep models to learn properly, their architecture and hyper-parameters must be carefully adjusted, a task that is computationally infeasible by brute force. Numerous techniques have already been developed for the optimization of architecture and hyper-parameters, including random search, grid search, and Bayesian optimization. In our proposal, we focus on another family of techniques used in this and other areas: evolutionary algorithms. The literature reports numerous cases in which these algorithms perform well on optimization problems. We highlight in particular their capacity to balance exploitation and exploration during the search process.
Although they reduce the time required by brute force, optimization techniques (and evolutionary algorithms in particular) generally have to deal with very high computational times. We therefore propose an evolutionary algorithm in which the search for the best solution is guided by partial evaluations of the deep learning models. Each partial evaluation yields an approximate result that indicates how good each model could be, so that the overall time spent by the evolutionary algorithm is notably reduced.
Thus, the partial objectives are the following:
- Define an evolutionary algorithm that allows the optimization of architecture and hyper-parameters of deep learning models used for different tasks, such as image classification, natural language processing, or human activity recognition.
- Adapt our proposal to alleviate the computational cost by using partial evaluations of each model.
- Apply our evolutionary algorithm to optimize the hyper-parameters of very complex deep learning models.
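The idea of guiding an evolutionary search with partial evaluations can be illustrated with a minimal sketch. It is not the algorithm developed in the thesis: the search space, the mutation operator, the selection scheme, and the function `partial_fitness` (a toy surrogate standing in for training a model for only a few epochs) are all assumptions made for illustration.

```python
import math
import random

# Hypothetical search space; names and ranges are illustrative assumptions.
SEARCH_SPACE = {
    "log10_lr": (-5.0, -1.0),  # base-10 exponent of the learning rate
    "num_layers": (1, 8),      # network depth (integer)
}


def random_config(rng):
    """Sample one candidate uniformly from the search space."""
    return {
        "log10_lr": rng.uniform(*SEARCH_SPACE["log10_lr"]),
        "num_layers": rng.randint(*SEARCH_SPACE["num_layers"]),
    }


def mutate(config, rng):
    """Perturb both genes and clip them back into their ranges."""
    child = dict(config)
    lo, hi = SEARCH_SPACE["log10_lr"]
    child["log10_lr"] = min(hi, max(lo, child["log10_lr"] + rng.gauss(0.0, 0.5)))
    lo, hi = SEARCH_SPACE["num_layers"]
    child["num_layers"] = min(hi, max(lo, child["num_layers"] + rng.choice((-1, 0, 1))))
    return child


def partial_fitness(config, epochs):
    """Toy surrogate for 'train for `epochs` epochs, return validation accuracy'.

    A real implementation would train a deep model here; this closed-form
    stand-in only mimics a score that improves with the training budget.
    """
    quality = 1.0 / (
        1.0
        + (config["log10_lr"] + 3.0) ** 2
        + 0.1 * (config["num_layers"] - 4) ** 2
    )
    return quality * (1.0 - math.exp(-0.3 * epochs))


def evolve(pop_size=20, generations=15, partial_epochs=3, seed=0):
    """Evolutionary search in which every candidate gets only a small budget."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Partial evaluation: one cheap, approximate score per candidate.
        scored = [(partial_fitness(c, partial_epochs), c) for c in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Elitism: the top quarter survives unchanged (exploitation).
        elite = [c for _, c in scored[: max(1, pop_size // 4)]]
        children = []
        while len(elite) + len(children) < pop_size:
            # Binary tournament followed by mutation (exploration).
            (fa, a), (fb, b) = rng.sample(scored, 2)
            children.append(mutate(a if fa >= fb else b, rng))
        population = elite + children
    return max(population, key=lambda c: partial_fitness(c, partial_epochs))
```

In this sketch only the returned candidate would need a full-budget evaluation, for example `partial_fitness(evolve(), epochs=50)`, which is where the time saving would come from if partial scores rank candidates similarly to full training.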
FUNDS
The development of this thesis was supported by:
- Spanish Ministry of Education, Culture and Sports under the FPU program (FPU18/06307).
- Spanish Ministry of Science and Innovation and the European Regional Development Fund, under project PID2020-115832GB-I00.