Hui Du and 5 more authors

In recent years, molecular property prediction methods based on graph neural networks (GNNs) have shown significant advantages by modeling molecules as graphs and exploiting the strong feature extraction capabilities of GNNs. However, because molecular labels are typically obtained through time-consuming and expensive experimental validation, data scarcity has become a major bottleneck for further improving model performance. Graph contrastive learning offers a promising solution to this challenge: by pre-training on unlabeled datasets, it learns discriminative molecular representations and thereby mitigates the shortage of labeled data. Nevertheless, because molecular properties are sensitive to structural changes, data augmentation may introduce semantic drift or impair the model’s generalization ability. To address this, we propose a Strongly and Weakly Augmented Graph Contrastive Learning model for Molecular Property Prediction (SWA-GCMPP), which aims to improve generalization while preserving molecular semantic information. Specifically, SWA-GCMPP introduces two types of augmented views: a weakly augmented view and a strongly augmented view. The weakly augmented view uses a trainable topology augmenter to generate molecular graphs that preserve the molecule’s core topological structure, ensuring semantic consistency for contrastive learning. In contrast, the strongly augmented view applies four distinct graph augmentation strategies to introduce diverse structural variations, thereby improving the model’s generalization capability. Comprehensive experiments on multiple molecular datasets from MoleculeNet demonstrate the effectiveness of SWA-GCMPP in molecular property prediction.
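To make the two-view idea concrete, the following is a minimal sketch and not the SWA-GCMPP implementation. The abstract names neither the four strong augmentation strategies nor the contrastive loss, so both choices here are assumptions: `drop_nodes` is a hypothetical strong augmentation (random atom removal), and `info_nce` is the InfoNCE objective commonly used in graph contrastive learning. The trainable topology augmenter and the GNN encoder are omitted; the embeddings are toy vectors standing in for encoder outputs.

```python
import math
import random


def drop_nodes(num_nodes, edges, drop_ratio, rng):
    """Hypothetical strong augmentation (an assumption, not from the paper):
    remove a random subset of atoms and every bond touching a removed atom."""
    num_drop = int(num_nodes * drop_ratio)
    dropped = set(rng.sample(range(num_nodes), num_drop))
    kept_edges = [(u, v) for u, v in edges
                  if u not in dropped and v not in dropped]
    return dropped, kept_edges


def cosine(a, b):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(weak_views, strong_views, temperature=0.5):
    """Mean InfoNCE loss (assumed objective). For molecule i, the strong view
    of i is the positive; strong views of other molecules are negatives."""
    losses = []
    for i, anchor in enumerate(weak_views):
        logits = [cosine(anchor, s) / temperature for s in strong_views]
        # Log-sum-exp with max subtraction for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy six-atom ring as an edge list; one atom is dropped at ratio 0.2.
rng = random.Random(0)
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
dropped, kept = drop_nodes(6, ring, drop_ratio=0.2, rng=rng)

# Toy embeddings: matched views should give a lower loss than mismatched ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the loss is lower when each weak view is paired with the strong view of the same molecule, which is the behavior contrastive pre-training rewards. Random atom removal also illustrates the semantic-drift risk the abstract mentions: deleting an atom can yield a graph that no longer corresponds to the original molecule.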