E-ISSN: 2583-049X

International Journal of Advanced Multidisciplinary Research and Studies

Volume 5, Issue 6, 2025

An Effective Ensemble Multimodal Learning System for Medical Diagnosis



Author(s): Thirumalaimuthu Thirumalaiappan Ramanathan

Abstract:

Multimodal learning integrates heterogeneous data modalities to achieve better prediction performance on clinical datasets. In this paper, we extend our approach to multimodal deep learning models and evaluate it on three tabular datasets: Stroke Prediction, Framingham Heart Study (FHS), and Chronic Kidney Disease (CKD), each split into demographic, clinical, and laboratory measurements. We design three multimodal schemes, namely a Multimodal Deep Neural Network (MDNN), a Multimodal Autoencoder (MAE), and Transformer-based fusion, to adaptively extract intra- and inter-modality feature interactions. For reference, we also compare against a single-stream deep model (Early Fusion MLP) and traditional machine learning classifiers: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting Machine (GBM), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). Model performance is measured with three metrics: accuracy, F1-score, and area under the ROC curve (AUC). The study demonstrates that MDNN, MAE, and Transformer-based fusion are effective in structured multimodal settings. The results suggest that multimodal learning is an effective strategy for exploiting diverse medical data types in clinical prediction problems by using modality-specific feature representations.
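
The abstract does not specify the architectures, so the following is only a minimal sketch of the general idea behind modality-specific encoding followed by fusion, as opposed to early fusion of raw features. It is not the authors' MDNN. The feature counts, hidden size, and the names `modality_dims`, `encoders`, and `forward` are hypothetical, and the weights are random and untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_dense(n_in, n_out):
    """Random (untrained) weights and zero bias for one dense layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

# Hypothetical feature counts per modality (assumed, not from the paper).
modality_dims = {"demographics": 4, "clinical": 6, "laboratory": 8}
hidden = 5  # assumed size of each modality-specific representation

# One encoder per modality, plus a classification head on the fused vector.
encoders = {name: init_dense(d, hidden) for name, d in modality_dims.items()}
head_w, head_b = init_dense(hidden * len(modality_dims), 1)

def forward(batch):
    """batch maps modality name -> array of shape (n_samples, n_features)."""
    encoded = []
    for name in modality_dims:
        w, b = encoders[name]
        # Each modality is encoded separately before any interaction.
        encoded.append(relu(batch[name] @ w + b))
    # Fusion by concatenating the modality-specific representations.
    fused = np.concatenate(encoded, axis=1)
    # Sigmoid head yields one probability per sample (binary diagnosis).
    return sigmoid(fused @ head_w + head_b).ravel()

# Synthetic inputs for three patients, only to exercise the forward pass.
batch = {name: rng.normal(size=(3, d)) for name, d in modality_dims.items()}
probs = forward(batch)
```

An early-fusion baseline would instead concatenate the raw feature columns and pass them through a single network. The sketch keeps a separate encoder per modality so that each representation is learned independently before fusion.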


Keywords: Machine Learning, Deep Learning, Multimodal Deep Neural Network, Multimodal Autoencoders, Multimodal Transformers, Medical Diagnosis

Pages: 1710-1719
