Lung cancer remains a critical global health challenge, and current risk assessment methods are limited by their reliance on single-modal data and traditional modeling approaches. This research addresses the need for more accurate and comprehensive risk prediction by developing an advanced deep learning framework for multi-modal biomedical data fusion. Motivated by recent discoveries linking clinical phenotypes, molecular biomarkers (circRNAs), and the gut microbiome to lung cancer and its complications, I propose the Cross-Modal Attention Fusion Network (CMAF-Net). CMAF-Net integrates specialized deep encoders for tabular clinical data, circRNA expression profiles, and phylogenetically structured microbiome data. Its core innovation is a cross-modal attention fusion module that dynamically learns inter-modal dependencies, complemented by a contrastive-learning-based modal alignment loss that enforces semantically consistent feature representations across modalities. A multi-task prediction head simultaneously forecasts lung cancer risk and associated complications. Evaluated on a comprehensive simulated dataset, CMAF-Net consistently outperforms traditional machine learning models and state-of-the-art baselines, achieving an AUC-ROC of 0.91 for lung cancer risk prediction. Ablation studies confirm the crucial contribution of each architectural component. This framework represents a significant step toward leveraging heterogeneous biological information for robust, precise lung cancer screening and personalized patient management.
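To make the cross-modal attention fusion idea concrete, the following is a minimal sketch using single-head scaled dot-product attention, where one modality (e.g. clinical features) queries another (e.g. circRNA features). All dimensions, function names, and weight matrices here are illustrative assumptions for exposition, not the paper's actual implementation, which uses learned encoders and multiple fusion directions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, context_feats, Wq, Wk, Wv):
    """One direction of cross-modal attention (illustrative).

    query_feats:   (n_q, d) features from the querying modality
    context_feats: (n_c, d) features from the attended modality
    Wq, Wk, Wv:    (d, d_k) projection matrices (learned in practice;
                   random here for the sketch)
    Returns (n_q, d_k) context-enriched query features.
    """
    Q = query_feats @ Wq
    K = context_feats @ Wk
    V = context_feats @ Wv
    # Scaled dot-product scores: each query row attends over all context rows.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    attn = softmax(scores, axis=-1)  # rows sum to 1
    return attn @ V

# Toy usage: 4 clinical feature vectors attending over 6 circRNA vectors.
rng = np.random.default_rng(0)
clinical = rng.standard_normal((4, 16))
circ_rna = rng.standard_normal((6, 16))
Wq, Wk, Wv = (rng.standard_normal((16, 8)) for _ in range(3))
fused = cross_modal_attention(clinical, circ_rna, Wq, Wk, Wv)  # shape (4, 8)
```

In the full model, such attention would run in both directions for each modality pair, with the outputs concatenated and passed to the multi-task prediction head; the contrastive alignment loss would additionally pull paired cross-modal representations together in the shared embedding space.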