Document Type : Original Article
Authors
1
Ph.D. in IT Engineering, Department of Computer Engineering and Information Technology, University of Qom, Qom, Iran.
2
Kheirolah Rahsepar Fard, Department of Computer Engineering and Information Technology, University of Qom, Qom, Iran.
3
Department of Computer Engineering, Malek Ashtar University of Technology, Tehran
Abstract
Large language models (LLMs) have changed how knowledge graphs (KGs) are integrated in the biomedical and cognitive sciences, helping to overcome the limitations of traditional machine learning methods in capturing intricate semantic links among genes, diseases, and cognitive processes. This paper introduces MultiCNKG, a framework that merges three distinct knowledge sources: the Cognitive Neuroscience Knowledge Graph (CNKG), containing 2.9K nodes and 4.3K edges across 9 node types and 20 edge types; the Gene Ontology (GO), featuring 43K nodes and 75K edges across 3 node types and 4 edge types; and the Disease Ontology (DO), comprising 11.2K nodes and 8.8K edges with 1 node type and 2 edge types. Using LLMs such as GPT-4, we perform automated entity alignment, semantic similarity computation, and graph augmentation to construct a unified KG that interconnects genetic mechanisms, neurological disorders, and cognitive functions. The resulting MultiCNKG graph comprises 6.9K nodes across 5 node types and 11.3K edges spanning 7 relation types, supporting multi-layered analysis from molecular to behavioral domains. Empirical evaluation yields 85.20% precision, 87.30% recall, 92.18% coverage, 82.50% graph consistency, a 40.28% novelty detection rate, and an 89.50% expert validation score. In link prediction, TransE (MR: 391, MRR: 0.411) and RotatE (MR: 263, MRR: 0.395) are competitive with results on standard benchmarks such as FB15k-237 and WN18RR. This integrated KG supports clinical and research applications in personalized medicine, cognitive disorder diagnostics, and data-driven hypothesis formulation within cognitive neuroscience.
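For readers unfamiliar with the link prediction metrics quoted in the abstract, the following is a minimal sketch of how mean rank (MR) and mean reciprocal rank (MRR) are conventionally computed from the rank assigned to each correct test triple. The function names and the example ranks are hypothetical and are not taken from the paper; the authors' evaluation code is not shown in this abstract.

```python
from statistics import mean


def mean_rank(ranks):
    """Mean rank (MR): average position of the correct entity; lower is better."""
    return mean(ranks)


def mean_reciprocal_rank(ranks):
    """Mean reciprocal rank (MRR): average of 1/rank; higher is better."""
    return mean(1.0 / r for r in ranks)


# Hypothetical ranks (1 = best) of the correct entity for four test triples.
example_ranks = [1, 2, 4, 10]

mr = mean_rank(example_ranks)
mrr = mean_reciprocal_rank(example_ranks)
print(f"MR = {mr:.2f}, MRR = {mrr:.4f}")
```

Because MR is dominated by a few badly ranked triples while MRR is dominated by triples ranked near the top, one model can have a worse MR yet a better MRR than another, which is the pattern of the TransE and RotatE figures reported in the abstract.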
Keywords