Partitions and Source Code
This page provides the partitions of the benchmark datasets, the dataset partitions used in specific papers, and source code.
- Generic benchmark datasets' partitions
- A. Sáez, J. Sánchez-Monedero, P. A. Gutiérrez, C. Hervás-Martínez, "Machine learning methods for binary and multiclass classification of melanoma thickness from dermoscopic images", IEEE Transactions on Medical Imaging, Accepted, 2015 (datasets and links to methods' implementation)
- Javier Sánchez-Monedero, Pilar Campoy-Muñoz, Pedro Antonio Gutiérrez and César Hervás-Martínez, "A guided data projection technique for classification of sovereign ratings: the case of European Union 27", Applied Soft Computing, 2014 (datasets and links to methods' implementation)
- J. Sánchez-Monedero, P.A. Gutiérrez, Peter Tiño, C. Hervás-Martínez. "Exploitation of Pairwise Class Distances for Ordinal Classification", Neural Computation, Accepted, 2013 (Matlab source code and datasets)
- J. Sánchez-Monedero, P.A. Gutiérrez, M. Pérez-Ortiz, C. Hervás-Martínez. "An n-spheres based synthetic data generator for supervised classification", International Work Conference on Artificial Neural Networks (IWANN), 2013 (Matlab source code of the synthetic data generator)
- M. Pérez-Ortiz, R. Colmenarejo, J.C. Fernández and C. Hervás-Martínez. "Can machine learning techniques help to improve the Common Fisheries Policy?", International Work Conference on Artificial Neural Networks (IWANN), 2013 (best model)
- M. Pérez-Ortiz, P. A. Gutiérrez and C. Hervás-Martínez. "Projection based ensemble learning for ordinal regression". 2012 (datasets and statistical results)
- P. A. Gutiérrez, C. Hervás-Martínez, F. J. Martínez-Estudillo and M. Carbonero-Ruz. "A two-stage evolutionary algorithm based on sensitivity and accuracy for multi-class problems", Information Sciences. 2012 (datasets and statistical results)
- P. A. Gutiérrez, C. Hervás-Martínez and F. J. Martínez-Estudillo. "Logistic Regression by Means of Evolutionary Radial Basis Function Neural Networks", IEEE Transactions on Neural Networks, 2011 (datasets)
- F. Fernández-Navarro, C. Hervás-Martínez, J. Sánchez-Monedero and P. A. Gutiérrez. "MELM-GRBF: A modified version of the extreme learning machine for generalized radial basis function neural networks", Neurocomputing, Vol. 74, Issue 16, 2011, pp. 2502-2510. (source code)
- F. Fernández-Navarro, C. Hervás-Martínez and P. A. Gutiérrez. "A dynamic over-sampling procedure based on sensitivity for multi-class problems", Pattern Recognition, 2011 (datasets)
- P. A. Gutiérrez, C. Hervás-Martínez, M. Carbonero-Ruz and J. C. Fernández-Caballero. "Combined Projection and Kernel Basis Functions for Classification in Evolutionary Neural Networks", Neurocomputing, 2009 (datasets and source code)
- F. J. Martínez-Estudillo, C. Hervás-Martínez, P. A. Gutiérrez and A. C. Martínez-Estudillo. "Evolutionary Product-Unit Neural Networks Classifiers", Neurocomputing, 2008 (datasets)
Benchmark datasets' partitions
The following zip file contains the partitions for some of the datasets used by the research group. You can download it to compare against the published results.
Most of the datasets were obtained from the UCI (University of California, Irvine) repository.
- PREPROCESSING: All nominal variables have been transformed to binary variables. Missing values have been replaced with the average (in the case of continuous variables) or the mode (in the case of nominal or binary variables).
- EXPERIMENTAL DESIGN: The experimental design was conducted using a "hold-out" cross-validation procedure. Each dataset has been split into two stratified partitions, which means that each partition keeps the original class distribution. The training partition (train_*.dat) contains 75% of the original instances and the test partition (test_*.dat) the remaining 25%.
- DATA FORMAT: For each dataset we have generated one file for training purposes and another file for testing purposes, each similar to the following one:
200 4 2
1 1 1 1 -1 2 2
0.696481734 0.358437482 0.425834333 0.330313732 0.222490899 0 1
0.590389914 0.430674851 0.869041807 0.070911615 0.634302531 0 1
0.827655687 0.617833022 0.949440873 0.670138426 0.640808376 0 1
0.810716912 0.262116166 0.454194418 0.854706083 0.279769507 1 0
...
So the format of the files is the following:
NumPatterns NumInputs NumOutputs
CodeVector
Pattern1
Pattern2
Pattern3
...
PatternN
where NumPatterns is the number of patterns that the file contains, NumInputs is the number of input variables of the file patterns, NumOutputs is the number of outputs, and CodeVector is a vector with as many elements as the total number of variables of the file patterns. Each element of the vector codes the interpretation of the corresponding variable: 1 marks an input variable, 2 marks an output variable, and -1 marks a variable that should be ignored. In the example above, the file holds 200 patterns with 4 inputs and 2 outputs, and the fifth variable is ignored. All the elements of the file have to be tab- or space-separated.
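As an illustration of the file layout described above (a header line with the counts, a line with the code vector, then one pattern per line), the following Python sketch reads the text of one partition file into input and output rows. The function name `parse_partition` and the `SAMPLE` text are assumptions made for this example and are not part of the distributed files; the sample reuses the four pattern rows shown earlier, with the pattern count in the header reduced to 4 so that it is self-consistent.

```python
from typing import List, Tuple

# Hypothetical sample: the four rows shown on this page, header adjusted to 4 patterns.
SAMPLE = """4 4 2
1 1 1 1 -1 2 2
0.696481734 0.358437482 0.425834333 0.330313732 0.222490899 0 1
0.590389914 0.430674851 0.869041807 0.070911615 0.634302531 0 1
0.827655687 0.617833022 0.949440873 0.670138426 0.640808376 0 1
0.810716912 0.262116166 0.454194418 0.854706083 0.279769507 1 0
"""


def parse_partition(text: str) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (inputs, outputs), one row per pattern, from a partition file's text."""
    # str.split() with no argument accepts both tabs and spaces as separators.
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    n_patterns, n_inputs, n_outputs = (int(v) for v in lines[0])
    codes = [int(v) for v in lines[1]]  # 1 = input, 2 = output, -1 = ignore

    if codes.count(1) != n_inputs or codes.count(2) != n_outputs:
        raise ValueError("code vector does not match the header counts")

    rows = lines[2:2 + n_patterns]
    if len(rows) != n_patterns:
        raise ValueError("fewer pattern lines than announced in the header")

    inputs, outputs = [], []
    for row in rows:
        if len(row) != len(codes):
            raise ValueError("pattern length differs from code vector length")
        values = [float(v) for v in row]
        inputs.append([v for v, c in zip(values, codes) if c == 1])
        outputs.append([v for v, c in zip(values, codes) if c == 2])
    return inputs, outputs


inputs, outputs = parse_partition(SAMPLE)
print(len(inputs), "patterns,", len(inputs[0]), "inputs,", len(outputs[0]), "outputs")
```

To read a real partition, pass the contents of a train_*.dat or test_*.dat file instead of `SAMPLE`. Columns coded -1 are dropped, and the sketch does not interpret the output columns any further.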
- LISTING OF ALL AVAILABLE DATASETS:
- Anneal
- Audio
- Autos
- Balance
- Breast-Cancer
- Breast-Cancer Wisconsin
- Card
- Dermatology
- Ecoli
- Gene (Splice)
- German
- Glass
- Glassg2
- Heart Statlog
- Heart-C
- Heart Disease Problem
- Hepatitis
- Horse
- Hypothyroid
- Ionosphere
- Iris
- Krkopt
- KrVsKp
- Labor
- Lenses
- Letter
- Liver
- Lymphography
- Newthyroid
- Optdigits
- Page-Blocks
- Pendigits
- Pima
- Post-Operatory
- Primary-Tumor
- Promoters
- Satimage
- Segment
- Sick
- Sonar
- Soybean
- Tic-Tac-Toe
- Vehicle
- Vote
- Vowel
- Waveform
- Wine
- Yeast
- Zoo
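The preprocessing and hold-out design described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used to generate the distributed partitions: the function names, the one-indicator-per-category encoding, the use of `None` for missing values, the rounding rule, and the random seed are all choices made for this example.

```python
import random
from collections import Counter, defaultdict
from statistics import mean


def impute(column, nominal):
    """Replace None with the mean (continuous) or the mode (nominal or binary)."""
    known = [v for v in column if v is not None]
    fill = Counter(known).most_common(1)[0][0] if nominal else mean(known)
    return [fill if v is None else v for v in column]


def to_binary(column):
    """Encode a nominal column as one 0/1 indicator per distinct value (assumed encoding)."""
    categories = sorted(set(column))
    return [[1 if v == c else 0 for c in categories] for v in column]


def stratified_holdout(labels, train_fraction=0.75, seed=0):
    """Return (train_idx, test_idx) so that each class is split in the same proportion."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)  # per-class training size
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


# Hypothetical toy data: 8 patterns of class "a" and 4 of class "b".
labels = ["a"] * 8 + ["b"] * 4
train_idx, test_idx = stratified_holdout(labels)
print(len(train_idx), "training patterns,", len(test_idx), "test patterns")
```

Splitting each class separately is what keeps the original class distribution in both partitions. With very small classes the rounding step means the 75%/25% proportion can only be approximated.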
Efficient Fog Prediction with Multi-objective Evolutionary Neural Networks. Data and Source Code of the Multi-objective and Mono-objective Algorithms. A. M. Durán-Rosal, J. C. Fernández, C. Casanova-Mateo, J. Sanz-Justo, S. Salcedo-Sanz, C. Hervás-Martínez
The following file contains the data and the source code of the multi-objective and mono-objective algorithms.
M. Pérez-Ortiz, R. Colmenarejo, J.C. Fernández and C. Hervás-Martínez. "Can machine learning techniques help to improve the Common Fisheries Policy?", International Work Conference on Artificial Neural Networks (IWANN), 2013.
The following file contains the best model (decision tree) obtained for the problem of predicting the environmental impact of the Spanish Fleet.
M. Pérez-Ortiz, P. A. Gutiérrez and C. Hervás-Martínez. "Projection based ensemble learning for ordinal regression".
The following file includes the detailed results for the different test sets used, taking into account 6 different measures for evaluating an ordinal classifier and 16 different methodologies.
This file includes the specific partitions (in different formats) which have been used for obtaining the previous results.
P. A. Gutiérrez, C. Hervás-Martínez, F. J. Martínez-Estudillo and M. Carbonero-Ruz. "A two-stage evolutionary algorithm based on sensitivity and accuracy for multi-class problems", Information Sciences. Vol. 197. 2012, pp. 20-37.
This file includes the detailed results obtained for testing sets of some benchmark datasets from the UCI repository, using different fitness functions and the "E+A" methodology.
This file includes all the specific partitions used for obtaining the above results.
P. A. Gutiérrez, C. Hervás-Martínez and F. J. Martínez-Estudillo. "Logistic Regression by Means of Evolutionary Radial Basis Function Neural Networks", IEEE Transactions on Neural Networks, Vol. 22. 2011, pp. 246-263.
This file includes all the UCI datasets used for obtaining the results of the paper where the "LIRBF" methodology is presented.
A second file includes the Java source code of the LIRBF algorithm.
F. Fernández-Navarro, C. Hervás-Martínez, J. Sánchez-Monedero and P. A. Gutiérrez. "MELM-GRBF: A modified version of the extreme learning machine for generalized radial basis function neural networks", Neurocomputing, Vol. 74, Issue 16, 2011, pp. 2502-2510.
The MELM-GRBF source code is a modified version of the original ELM source code by Mr. Qin-Yu Zhu and Dr. Guang-Bin Huang available at http://www.ntu.edu.sg/home/egbhuang/.
F. Fernández-Navarro, C. Hervás-Martínez and P. A. Gutiérrez. "A dynamic over-sampling procedure based on sensitivity for multi-class problems", Pattern Recognition, Vol. 44. 2011, pp. 1821–1833.
This file includes all the specific partitions used for obtaining the results of the paper "A dynamic over-sampling procedure based on sensitivity for multi-class problems".
P. A. Gutiérrez, C. Hervás-Martínez, M. Carbonero-Ruz and J. C. Fernández-Caballero. "Combined Projection and Kernel Basis Functions for Classification in Evolutionary Neural Networks", Neurocomputing, Vol. 72. 2009, pp. 2731-2742.
This file includes all the specific partitions used for obtaining the results of the paper.
A second file includes the Java source code of the CBFEP algorithm.
F. J. Martínez-Estudillo, C. Hervás-Martínez, P. A. Gutiérrez and A. C. Martínez-Estudillo. "Evolutionary Product-Unit Neural Networks Classifiers", Neurocomputing, Vol. 72. 2008, pp. 548-561.
The following file includes the training and testing sets of the "Diabetes(12fold)" and "Australian(10fold)" experiments performed in the paper.
Francisco Bérchez-Moreno, Antonio M. Durán-Rosal, César Hervás-Martínez, Pedro A. Gutiérrez and Juan C. Fernández. "A Memetic Dynamic Coral Reef Optimisation Algorithm for simultaneous training, design, and optimisation of artificial neural networks".
The following file contains the training and validation sets used in the development of the article.