Authors
Eduardo Pérez and Sebastián Ventura
Problem statement
Despite the expertise of dermatologists, the early diagnosis of melanoma remains a difficult task, because it presents in many different shapes, sizes and colors, even between samples of the same category, as can be seen in the following figure (images taken from the DERM-LIB dataset).
Skin image datasets
Dataset | Source | Img | ImbR | IntraC | InterC | DistR | Silho |
---|---|---|---|---|---|---|---|
BCN20000 | (Combalia, 2019) | 17,393 | 2.848 | 9,014 | 10,107 | 0.892 | 0.153 |
DERM-LIB | (Ballerini, 2013) | 407 | 4.355 | 7,171 | 9,163 | 0.783 | 0.270 |
DERM7PT-C | (Kawahara, 2019) | 827 | 2.282 | 15,442 | 16,318 | 0.946 | 0.086 |
DERM7PT-D | (Kawahara, 2019) | 827 | 2.282 | 15,971 | 16,866 | 0.947 | 0.087 |
HAM10000 | (Tschandl, 2018) | 7,818 | 6.024 | 8,705 | 9,770 | 0.891 | 0.213 |
ISBI2016 | (Gutman, 2016) | 1,273 | 4.092 | 10,553 | 10,992 | 0.960 | 0.101 |
ISBI2017 | (Codella, 2018) | 2,745 | 4.259 | 9,280 | 9,674 | 0.959 | 0.089 |
MED-NODE | (Giotis, 2015) | 170 | 1.429 | 9,029 | 9,513 | 0.949 | 0.068 |
MSK-1 | (Codella, 2018) | 1,088 | 2.615 | 11,753 | 14,068 | 0.835 | 0.173 |
MSK-2 | (Codella, 2018) | 1,522 | 3.299 | 9,288 | 9,418 | 0.986 | 0.062 |
MSK-3 | (Codella, 2018) | 225 | 10.842 | 8,075 | 8,074 | 1.000 | 0.112 |
MSK-4 | (Codella, 2018) | 943 | 3.366 | 6,930 | 7,162 | 0.968 | 0.065 |
PH2 | (Mendonca, 2013) | 200 | 4.000 | 12,688 | 14,928 | 0.850 | 0.210 |
SDC-198 | (Sun, 2016) | 648 | 4.735 | 14,054 | 14,840 | 0.947 | 0.116 |
UDA-1 | (Gutman, 2016) | 557 | 2.503 | 11,730 | 12,243 | 0.958 | 0.083 |
UDA-2 | (Gutman, 2016) | 60 | 1.609 | 11,297 | 11,601 | 0.974 | 0.020 |
Total/Avg | | 36,703 | 3.784 | 10,686 | 11,546 | 0.928 | 0.119 |
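Table 1: Datasets. Img: number of images; ImbR: imbalance ratio; IntraC: intra-class distance; InterC: inter-class distance; DistR: distance ratio (IntraC / InterC); Silho: silhouette score.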
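As a rough illustration of how statistics like those in Table 1 could be computed, the sketch below assumes that ImbR is the size of the largest class divided by the size of the smallest, that IntraC and InterC are mean pairwise distances within and between classes, that DistR is their quotient (consistent with the table, e.g. 9,014 / 10,107 ≈ 0.892 for BCN20000), and that Silho is the mean silhouette coefficient. The Euclidean distance, the toy feature vectors and all function names are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import fmean


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def class_distances(features, labels):
    """Return (mean intra-class distance, mean inter-class distance).

    Assumption: Euclidean distance averaged over all pairs of samples.
    """
    intra, inter = [], []
    for (x, lx), (y, ly) in combinations(zip(features, labels), 2):
        (intra if lx == ly else inter).append(math.dist(x, y))
    return fmean(intra), fmean(inter)


def silhouette(features, labels):
    """Mean silhouette coefficient; requires at least two classes."""
    scores = []
    for i, (x, lx) in enumerate(zip(features, labels)):
        # Distances from sample i to every other sample, grouped by class.
        per_class = {}
        for j, (y, ly) in enumerate(zip(features, labels)):
            if i != j:
                per_class.setdefault(ly, []).append(math.dist(x, y))
        if lx not in per_class:  # only member of its class: score is 0
            scores.append(0.0)
            continue
        a = fmean(per_class[lx])  # mean distance to the sample's own class
        b = min(fmean(d) for c, d in per_class.items() if c != lx)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return fmean(scores)


# Toy data (hypothetical): 2-D feature vectors standing in for image features.
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
labels = ["nevus", "nevus", "nevus", "melanoma", "melanoma"]

intra, inter = class_distances(features, labels)
print(f"ImbR   = {imbalance_ratio(labels):.3f}")
print(f"IntraC = {intra:.3f}")
print(f"InterC = {inter:.3f}")
print(f"DistR  = {intra / inter:.3f}")
print(f"Silho  = {silhouette(features, labels):.3f}")
```

Under these definitions, a DistR close to 1 together with a silhouette close to 0 means that samples of the same class are about as far apart as samples of different classes, which matches the problem statement above.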
- The UDA, MSK, HAM10000 and BCN20000 datasets are included in the ISIC repository. The images in BCN20000 are considered hard to diagnose: the lesions had to be excised and histopathologically diagnosed.
- The PH2 dataset comprises high-quality dermoscopic images, together with manual segmentation, clinical diagnosis and the identification of several dermoscopic structures, all performed by expert dermatologists.
- The Dermofit Image Library (DERM-LIB) dataset gathers 1,300 focal, high-quality skin lesion images taken under standardised conditions. Each image has a diagnosis based on expert opinion and, like the PH2 dataset, includes a binary segmentation mask that denotes the lesion area. We obtained it for a small fee.
- The MED-NODE dataset collects 170 non-dermoscopic images taken with common digital cameras; this type of image is important for testing models on data from affordable devices.
- The SDC-198 dataset contains 6,584 real-world images from 198 categories, intended to encourage further research and its application in real-life scenarios.
- DERM7PT is a benchmark dataset composed of clinical and dermoscopic images, which makes it possible to check whether using dermoscopic images rather than images taken with digital cameras makes a significant difference.
References
- Combalia, M., Codella, N. C. F., Rotemberg, V., Helba, B., Vilaplana, V., Reiter, O., Carrera, C., Barreiro, A., Halpern, A. C., Puig, S., & Malvehy, J. (2019). BCN20000: Dermoscopic Lesions in the Wild. https://arxiv.org/abs/1908.02288
- Ballerini, L., Fisher, R. B., Aldridge, B., & Rees, J. (2013). A color and texture based hierarchical K-NN approach to the classification of non-melanoma skin lesions. In Lecture Notes in Computational Vision and Biomechanics (Vol. 6). https://doi.org/10.1007/978-94-007-5389-1_4
- Kawahara, J., Daneshvar, S., Argenziano, G., & Hamarneh, G. (2019). Seven-point checklist and skin lesion classification using multitask multimodal neural nets. IEEE Journal of Biomedical and Health Informatics, 23(2), 538–546. https://doi.org/10.1109/JBHI.2018.2824327
- Tschandl, P., Rosendahl, C., & Kittler, H. (2018). The HAM10000 Dataset: A Large Collection of Multi-Source Dermatoscopic Images of Common Pigmented Skin Lesions. arXiv preprint arXiv:1803.10417.
- Gutman, D., Codella, N. C. F., Celebi, E., Helba, B., Marchetti, M., Mishra, N., & Halpern, A. (2016). Skin Lesion Analysis toward Melanoma Detection: A Challenge at the International Symposium on Biomedical Imaging (ISBI) 2016, hosted by the International Skin Imaging Collaboration (ISIC). http://arxiv.org/abs/1605.01397
- Codella, N. C. F., Gutman, D., Celebi, M. E., Helba, B., Marchetti, M. A., Dusza, S. W., Kalloo, A., Liopyris, K., Mishra, N., Kittler, H., & Halpern, A. (2018). Skin lesion analysis toward melanoma detection: A challenge at the 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC). Proceedings – International Symposium on Biomedical Imaging, 2018–April, 168–172. https://doi.org/10.1109/ISBI.2018.8363547
- Giotis, I., Molders, N., Land, S., Biehl, M., Jonkman, M. F., & Petkov, N. (2015). MED-NODE: A computer-assisted melanoma diagnosis system using non-dermoscopic images. Expert Systems with Applications, 42(19), 6578–6585. https://doi.org/10.1016/J.ESWA.2015.04.034
- Mendonca, T., Ferreira, P. M., Marques, J. S., Marcal, A. R. S., & Rozeira, J. (2013). PH2 – A dermoscopic image database for research and benchmarking. Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, 5437–5440. https://doi.org/10.1109/EMBC.2013.6610779
- Sun, X., Yang, J., Sun, M., & Wang, K. (2016). A benchmark for automatic visual classification of clinical skin disease images. European Conference on Computer Vision, 206–222.