
Academic Editor: Christos Bouras
Received: 3 October 2025
Revised: 28 October 2025
Accepted: 29 October 2025
Published: 31 October 2025
Citation: Ali, M.D.; Iqbal, M.A.; Lee, S.; Duan, X.; Kim, S.K. Explainable AI Based Multi Class Skin Cancer Detection Enhanced by Meta Learning with Generative DDPM Data Augmentation. Appl. Sci. 2025, 15, 11689. https://doi.org/10.3390/app152111689
Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Explainable AI Based Multi Class Skin Cancer Detection
Enhanced by Meta Learning with Generative DDPM
Data Augmentation
Muhammad Danish Ali 1, Muhammad Ali Iqbal 2, Sejong Lee 3, Xiaoyun Duan 4 and Soo Kyun Kim 2,*
1 Department of Electronic Engineering, Jeju National University, Jeju 63243, Republic of Korea;
2 Department of Computer Engineering, Jeju National University, Jeju 63243, Republic of Korea;
3 School of Computer Science and Engineering, Yeungnam University, 280 Daehak-ro, Gyeongsan 38541, Republic of Korea; [email protected]
4 School of Software, Anyang Normal University, Anyang 455002, China; [email protected]
* Correspondence: [email protected]
Abstract
Despite the widespread success of convolutional deep learning frameworks in computer
vision, significant limitations persist in medical image analysis. These include low image
quality caused by noise and artifacts, limited data availability compromising robustness
on unseen data, class imbalance leading to biased predictions, and insufficient feature
representation, as conventional CNNs often fail to capture subtle patterns and complex
dependencies. To address these challenges, we propose DAME (Diffusion-Augmented
Meta-Learning Ensemble), a unified architecture that integrates hybrid modeling with
generative learning using the Denoising Diffusion Probabilistic Model (DDPM). The DDPM
component improves resolution, augments scarce data, and mitigates class imbalance. A
hybrid backbone combining CNN, Vision Transformer (ViT), and CBAM captures both local
dependencies and long-range spatial relationships, while CBAM further enhances feature
representation by adaptively emphasizing informative regions. Predictions from multiple
hybrids are aggregated, and a logistic regression meta classifier learns from these outputs
to produce robust decisions. The framework is evaluated on the HAM10000 dataset, a
benchmark for multi-class skin cancer classification. Explainable AI is incorporated through
Grad CAM, providing visual insights into the decision-making process. This synergy
mitigates CNN limitations and demonstrates superior generalizability, achieving 98.6%
accuracy, 0.986 precision, 0.986 recall, and a 0.986 F1-score, significantly outperforming
existing approaches. Overall, the proposed framework enables accurate, interpretable, and
reliable medical image diagnosis through the joint optimization of contextual modeling,
feature discrimination, and data generation.
Keywords: skin cancer; convolutional neural networks (CNN); deep learning; meta learning; Convolutional Block Attention Module (CBAM); data augmentation with diffusion models (DDPM)
1. Introduction
Skin cancer is one of the most common and aggressive cancers worldwide, leading to
significant health deterioration or even loss of life. In the United States alone, it is estimated
that over 9500 individuals are diagnosed with skin cancer every day, while more than
Appl. Sci. 2025, 15, 11689 https://doi.org/10.3390/app152111689
two individuals lose their lives due to this disease [1,2]. Unfortunately, skin cancer is
not limited to developed nations, as recent research from Asian countries also reveals its
growing incidence and severity as a public health and clinical concern. According to the
World Health Organization, reported skin cancer cases result in approximately 853 deaths
per year. India faces a similar challenge, with an estimated 1.5 million new instances
identified annually. In China, there has been a significant rise in various types of skin
cancer, particularly in urban regions. Overall, among all types of cancer affecting Asian
countries, skin cancer accounts for approximately 2 to 4 percent of cases, highlighting the
significant burden of this disease in the region [3–5]. Skin cancer is generally classified into
several categories, as illustrated in Figure 1.
Skin cancer includes a wide range of malignant pathologies, including dermatofibromas, melanoma, vascular lesions, actinic keratosis, basal cell carcinomas, melanocytic nevi,
and benign keratoses. Identifying and preventing these types of skin cancer at an early
stage is critical for preserving life. Many people face challenges in scheduling regular check-ups due to limited availability of and access to healthcare, as well as individual circumstances. Moreover, initially underestimating skin irregularities can allow them to advance into critical, life-threatening stages [6–8].
However, the diagnosis of skin cancer continues to be an essential but challenging task.
The advancement of computer-aided techniques for diagnosing skin lesions has become
a top priority in recent research. The ABCD rule, which assesses asymmetry, border irregularity, color variation, and differential dermoscopic structures, is one of the most frequently employed approaches and is widely used by dermatologists to diagnose skin cancer.
Nevertheless, it may be challenging to differentiate between malignant and non-cancerous
images due to factors such as noise, poor contrast, and uneven boundaries.
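As context, the ABCD rule is often quantified through the Total Dermoscopy Score (TDS). The sketch below uses the commonly reported Stolz weights and cutoffs; these values are illustrative background and are not taken from this paper:

```python
def total_dermoscopy_score(asymmetry, border, colors, structures):
    """Total Dermoscopy Score (TDS) for the ABCD rule.

    asymmetry:  0-2 axes of asymmetry
    border:     0-8 segments with abrupt cutoff
    colors:     1-6 distinct colors present
    structures: 1-5 differential dermoscopic structures present
    """
    return 1.3 * asymmetry + 0.1 * border + 0.5 * colors + 0.5 * structures

def interpret_tds(tds):
    # Commonly cited cutoffs: <4.75 benign, 4.75-5.45 suspicious, >5.45 likely malignant.
    if tds < 4.75:
        return "benign"
    if tds <= 5.45:
        return "suspicious"
    return "likely malignant"

# Worked example: 1.3*2 + 0.1*4 + 0.5*3 + 0.5*4 = 6.5, above the malignancy cutoff.
score = total_dermoscopy_score(asymmetry=2, border=4, colors=3, structures=4)
```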
Figure 1. Seven types of skin cancer are included in the HAM10000 dataset.
1.1. Role of Machine Learning and Deep Learning in the Diagnosis of Skin Cancer
Accurate diagnosis of skin cancer is a crucial area of research, and machine learning in particular offers the potential for significant improvements [9–11].
The key to successful treatment and increased chances of survival lies in early detection.
While conventional diagnostic approaches have long been the standard, the introduction
of advanced technologies such as deep learning and transfer learning has opened up
new opportunities. These innovative techniques are enhancing both the accuracy and
speed of skin cancer diagnosis, representing a significant advancement in medical research
and a beacon of hope for the future. Deep learning, a branch of machine learning, uses artificial neural networks to identify feature patterns specific to different kinds of skin lesions [12–14].
For instance, a neural network can be trained on a large dataset of skin cancer im-
ages. Then, when presented with a new image, it can quickly and accurately identify
potential cancerous lesions. Skin cancer can be diagnosed using convolutional neural
networks (CNNs), a particular type of neural network that has demonstrated remarkable
performance in image-based diagnosis [15,16]. CNNs can speed up the diagnostic process
and improve accuracy by identifying essential features such as texture, color, and pattern.
Additionally, CNNs can be enhanced to better extract relevant information in medical
images by incorporating attention mechanisms [17].
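To illustrate the attention idea, the following is a minimal numpy sketch of CBAM-style channel attention, in which a shared two-layer MLP scores both the average-pooled and max-pooled channel descriptors; the weights and dimensions here are illustrative, not the paper's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(fmap, w1, w2):
    """CBAM-style channel attention on a (C, H, W) feature map.

    A shared two-layer MLP (w1: C -> C//r, w2: C//r -> C) scores the
    average-pooled and max-pooled channel descriptors; the summed scores
    pass through a sigmoid to give per-channel weights in (0, 1).
    """
    avg = fmap.mean(axis=(1, 2))          # (C,) average-pooled descriptor
    mx = fmap.max(axis=(1, 2))            # (C,) max-pooled descriptor
    score = w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0)
    weights = sigmoid(score)              # (C,) channel weights
    return fmap * weights[:, None, None]  # rescale each channel map

rng = np.random.default_rng(0)
C, r = 8, 2                               # channels and reduction ratio
fmap = rng.standard_normal((C, 16, 16))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = channel_attention(fmap, w1, w2)
```

Because the sigmoid weights lie in (0, 1), the module can only down-weight uninformative channels, never amplify them; CBAM follows this with an analogous spatial attention step, omitted here for brevity.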
Although there have been advancements in deep neural network architectures and
attention mechanisms, skin cancer diagnosis remains a challenging task due to inter-class and intra-class variations. Due to factors such as growth stage, patient demographics, or environmental conditions, lesions of the same type, such as melanoma, can differ significantly in size, color, shape, and texture. This variation makes it difficult for
models to generalize across cases within the same class. Another challenge is inter-class similarity: benign and malignant lesions can appear visually alike, making them hard to distinguish for deep learning models and medical professionals alike [18,19].
This similarity raises the chance of misclassification, especially in borderline cases. Class imbalance is another problem in medical imaging: in skin cancer datasets, some classes are far more heavily represented than others. Training and generalization are further hampered because most available datasets are small and do not capture the full variety of skin types and lesion manifestations [20,21].
Noise is another source of error in dermoscopy images: hair, lighting differences, and imaging artifacts can obscure important lesion features and lower diagnostic accuracy. Moreover, more sophisticated models that use transformers or attention mechanisms tend to be more accurate but are computationally complex, making them impractical for real-time clinical use, especially in resource-constrained environments. Lastly, when trained on small or imbalanced datasets, deep learning models are prone to overfitting, resulting in poor generalization to new and unseen cases.
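One remedy for such data scarcity is generative augmentation with denoising diffusion probabilistic models. As background, the DDPM forward (noising) process can be sampled in closed form at any step; below is a minimal numpy sketch with an illustrative linear noise schedule (the schedule and the stand-in image are assumptions for the demo, not values from this paper):

```python
import numpy as np

def ddpm_forward(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form for a DDPM.

    q(x_t | x_0) = N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I),
    where alpha_bar_t is the cumulative product of (1 - beta_s) up to step t.
    """
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(x0.shape)     # Gaussian noise sample
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)       # common linear schedule
x0 = rng.standard_normal((32, 32))          # stand-in for an image
x_early = ddpm_forward(x0, t=10, betas=betas, rng=rng)    # mild corruption
x_late = ddpm_forward(x0, t=999, betas=betas, rng=rng)    # near pure noise
```

Generation runs this process in reverse: a network trained to predict the added noise denoises step by step from pure Gaussian noise, yielding synthetic samples that can rebalance under-represented lesion classes.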
To address these challenges, we propose the DAME (Diffusion-Augmented Meta-Learning Ensemble) framework, shown in Figure 2, which integrates the local feature extraction capabilities of ResNet50 and VGG19 with the global context modeling of Vision Transformers for explainable medical image classification.
Figure 2. The proposed innovative Diffusion-Augmented Meta-Learning Ensemble Framework.
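The final stacking stage of such a framework, a logistic-regression meta classifier fitted on the base hybrids' class probabilities, can be sketched as follows. The toy data, dimensions, and gradient-descent fit are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_meta_classifier(base_probs, labels, n_classes, lr=0.5, steps=500):
    """Fit a multinomial logistic-regression meta classifier.

    base_probs: (N, F) concatenated class probabilities from the base
                models (F = n_models * n_classes); labels: (N,) ints.
    Returns a weight matrix W of shape (F + 1, n_classes).
    """
    X = np.hstack([base_probs, np.ones((len(base_probs), 1))])  # bias column
    Y = np.eye(n_classes)[labels]                               # one-hot targets
    W = np.zeros((X.shape[1], n_classes))
    for _ in range(steps):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)    # cross-entropy gradient step
    return W

def meta_predict(base_probs, W):
    X = np.hstack([base_probs, np.ones((len(base_probs), 1))])
    return softmax(X @ W).argmax(axis=1)

# Toy demo: two simulated base models whose probabilities carry a noisy
# signal about the true class.
rng = np.random.default_rng(1)
n, k = 300, 3
y = rng.integers(0, k, size=n)
def noisy_probs():
    return 0.5 * np.eye(k)[y] + 0.5 * softmax(rng.standard_normal((n, k)))
base = np.hstack([noisy_probs(), noisy_probs()])
W = fit_meta_classifier(base, y, n_classes=k)
acc = (meta_predict(base, W) == y).mean()
```

The meta classifier learns which base model to trust for which class, rather than simply averaging their outputs; in practice it is fitted on held-out predictions to avoid leaking training labels.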
1.2. Research Contribution
The main contributions of this research are as follows:
1. Our research proposes the DAME (Diffusion-Augmented Meta-Learning Ensemble) framework, a unified multi-architecture deep learning model that synergistically combines convolutional backbones (ResNet50 and VGG19) with a Vision Transformer (ViT) module.
2. To enhance local and global feature representation, the architecture incorporates the Convolutional Block Attention Module (CBAM), which adaptively refines spatial and channel-wise information.
3. The proposed model is enhanced through the integration of generative modeling, specifically the Denoising Diffusion Probabilistic Model (DDPM), which facilitates robust feature learning under data scarcity, class imbalance, and noise.
4. We incorporated a meta classifier trained on hybrid model predictions to refine the decision boundary further, enabling accurate and generalizable detection of skin cancer metastases.
5. This research introduces a novel approach to enhance black-box model explainability by applying Grad-CAM to visualize and highlight regions that impact classification outcomes.
The organization of the paper in the subsequent sections is as follows:
Section 2 reviews related work, Section 3 covers problem formulation and the research objective, and Section 4 covers the proposed methodology. Section 5 covers the experimental setting, Section 6 covers the results and evaluation, and Section 7 covers the discussion. Section 8 covers the limitations and future work, and Section 9 covers the conclusions.
2. Related Work
With the widespread application of complex neural network technologies across
biomedical domains, healthcare imaging analysis has emerged as a fundamental technique
for supporting clinical decision-making. It has also become a primary research focus at the
intersection of visual computing and healthcare intelligence. Nevertheless, the complex
multidimensional characteristics of clinical images, insufficient data availability, and difficulties in annotation continue to pose persistent challenges to training efficiency and
model generalizability. To address these issues, scholars have thoroughly investigated data
augmentation strategies, representation learning techniques, and the development of novel
classification frameworks. Table 1 provides a summary of recent studies in skin cancer classification and the identified research gaps.
2.1. Medical Image Feature Extraction and Classification
Huang et al. [22–24] studied the application of multispectral imaging technology for
the identification and classification of skin cancer. They specifically focused on seborrheic
keratosis (SK), squamous cell carcinoma (SCC), and basal cell carcinoma (BCC). The experimental results of the HSI-based system demonstrate a performance enhancement of 7.5% over the conventional RGB-based method. This improvement is primarily attributed to the larger dataset employed for training the convolutional neural networks; given the computational demands of image processing tasks, larger and more heterogeneous datasets are needed to develop and test CNNs thoroughly. Dataset size is therefore a key avenue for further gains, and future research should emphasize dataset augmentation and precision enhancement before extending to other architectural aspects.
Yang et al. [25] proposed a multipurpose convolutional neural network model for multi-class categorization of seven types of skin lesions. Despite its promise, the segmentation results were weak with respect to their relevance to the work. Moreover, classification accuracy varied across lesion categories, with only two classes achieving satisfactory predictive performance. The validation dataset comprises 7.5% of the total data, which remains reliable because representative samples were ensured during the stratified splitting process. Although the proposed multipurpose deep neural network was promising for binary cancer detection and lesion segmentation, it is not suitable for complex multi-class classification. These challenges underscore why continued research and development in skin cancer classification is important.
Priyadharshini et al. [26,27] introduced an Extreme Learning Machine (ELM) framework using the
Teaching–Learning-Based Optimization (TLBO) approach. The ELM functions as an efficient and precise one-hidden-layer, unidirectional neural network that extracts texture features for skin cancer categorization, while the TLBO algorithm tunes the model parameters to improve performance. This combination aims to categorize skin lesions as benign or malignant.
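For reference, the ELM core is a single hidden layer with random, frozen input weights whose output weights are solved in closed form by a least-squares pseudoinverse, with no iterative training. Below is a minimal numpy sketch on synthetic stand-in features (the TLBO parameter tuning described above is omitted, and all data here is illustrative):

```python
import numpy as np

def train_elm(X, Y, n_hidden, rng):
    """Extreme Learning Machine: random hidden layer, closed-form output.

    Input weights W and biases b are drawn at random and frozen; only the
    output weights beta are learned, via the Moore-Penrose pseudoinverse
    of the hidden activations.
    """
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                  # random hidden activations
    beta = np.linalg.pinv(H) @ Y            # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy binary demo on a nonlinear target (features are synthetic stand-ins
# for texture descriptors).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
y = (X[:, 0] * X[:, 1] > 0).astype(int)     # XOR-like, not linearly separable
Y = np.eye(2)[y]                            # one-hot targets
W, b, beta = train_elm(X, Y, n_hidden=200, rng=rng)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
acc = (pred == y).mean()
```

The appeal of the ELM is that the only "training" is one pseudoinverse solve, which is why optimizers such as TLBO are then layered on top to tune the remaining hyperparameters.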
In Abhiram et al. [28], an image classification framework named Deskinned was proposed for the identification of skin lesions. Their model was optimized and assessed using
the HAM10000 dataset, and its results were compared against three widely recognized
pre-trained frameworks: Inception V3, VGG16, and AlexNet. With a substantially greater
Avez-vous trouvé des erreurs dans l'interface ou les textes ? Ou savez-vous comment améliorer l'interface utilisateur de StudyLib ? N'hésitez pas à envoyer vos suggestions. C'est très important pour nous!