Abstract

Accurate prediction of protein-ligand binding affinities (PLAs) is essential for drug discovery, repositioning, and design. Deep learning (DL) techniques have shown promise in PLA prediction owing to their high expressive power, and have led to significant advances. However, the relationships among ligand and protein entities are not fully characterized, which yields relatively simplistic representations that may limit accuracy and generalization. In this study, we propose GIF-PLA (Graph-based heterogeneous Information Fusion framework for Protein-Ligand binding Affinity prediction), a structure-aware approach that estimates the binding affinity scores of protein-ligand pairs with the assistance of protein-ligand binding complexes. The binding complexes are transformed into heterogeneous graphs with meta-paths and, in parallel with the protein sequences and ligand SMILES strings, are fed into separate cascaded deep neural networks. GIF-PLA thereby acquires structure-oriented information, encompassing topological interactions and high-order nonlinear relationships, as well as sequence-oriented 1D grammatical information. Finally, a fusion module integrates the multi-level information, and a visualization technique demonstrates the biological significance of the predictions. The resulting GIF-PLA model substantially outperforms state-of-the-art methods on well-known datasets.

Instructions:

CSV files

Dataset Files

Files have not been uploaded for this dataset

The dataset of GIF-PLA
