The growing complexity and diversity of clinical data have increased the demand for intelligent systems capable of learning from both visual and non-visual medical information. Traditional deep learning models, although highly effective at pattern recognition, struggle with heterogeneous data sources and offer limited interpretability for clinical decision-making. This study presents a hybrid deep learning framework that integrates convolutional neural networks (CNNs) with transformer-based architectures to enhance both medical image understanding and data-driven analytics. The proposed model combines the spatial feature extraction of CNNs with the contextual modeling of transformers to interpret imaging data and patient metadata jointly. Experimental evaluations on benchmark datasets demonstrate that the hybrid approach consistently outperforms conventional single-model architectures, yielding an average improvement of 4-7% in diagnostic accuracy and a substantial reduction in false-positive rates. Beyond these numerical gains, the framework improves model transparency by producing attention-based feature maps that help clinicians understand the rationale behind predictions. These findings highlight the practical potential of hybrid deep learning architectures for computer-aided diagnosis, clinical risk assessment, and data-centric healthcare research.
Keywords: Hybrid Deep Learning, Medical Imaging, Data Analytics, Multimodal AI, Healthcare Informatics, CNN, Transformer
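The abstract does not specify an implementation, but the fusion pattern it describes (CNN features for images, transformer attention over image tokens and patient metadata jointly) can be sketched as below. This is a minimal illustration assuming PyTorch; the module names, dimensions, and the simple convolutional backbone are assumptions for exposition, not details taken from the paper.

```python
import torch
import torch.nn as nn

class HybridCNNTransformer(nn.Module):
    """Illustrative hybrid model: a CNN encodes the image into spatial
    tokens, patient metadata is projected into the same embedding space,
    and a transformer encoder attends over both jointly."""

    def __init__(self, num_classes: int, meta_dim: int, embed_dim: int = 256):
        super().__init__()
        # Small CNN backbone (a stand-in for e.g. a ResNet feature extractor).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, embed_dim, kernel_size=3, stride=2, padding=1),
        )
        # Project tabular patient metadata into the token embedding space.
        self.meta_proj = nn.Linear(meta_dim, embed_dim)
        # Transformer encoder supplies the contextual learning over all tokens.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=4)
        # Learnable [CLS] token whose final state drives the prediction.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, image: torch.Tensor, metadata: torch.Tensor):
        b = image.size(0)
        # (B, C, H, W) -> (B, H*W, C): each spatial location becomes a token.
        feats = self.cnn(image).flatten(2).transpose(1, 2)
        meta_tok = self.meta_proj(metadata).unsqueeze(1)   # (B, 1, C)
        cls = self.cls_token.expand(b, -1, -1)             # (B, 1, C)
        tokens = torch.cat([cls, meta_tok, feats], dim=1)
        encoded = self.transformer(tokens)
        return self.head(encoded[:, 0])                    # classify via [CLS]

model = HybridCNNTransformer(num_classes=2, meta_dim=12)
logits = model(torch.randn(4, 3, 224, 224), torch.randn(4, 12))
print(logits.shape)  # torch.Size([4, 2])
```

In a design of this shape, the encoder's attention weights over the spatial tokens can be visualized to obtain the kind of attention-based feature maps the abstract cites as an interpretability aid.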