ConnectomeDB Methods


Ligand-Receptor Interaction Modelling

ConnectomeDB models ligand–receptor interactions involving peptide ligands (non-peptide ligands are excluded). All ligand–receptor pairs are treated as simplex interactions with a direction from a ligand-expressing cell to a receptor-expressing cell.

Heteromeric receptors and ligands

ConnectomeDB splits heteromeric receptor and ligand complexes into simplex pairs. For example, IFNB1, which binds to the IFNAR1:IFNAR2 complex, is modelled as two separate pairs, IFNB1–IFNAR1 and IFNB1–IFNAR2. Similarly, a heteromeric ligand such as TNFSF13:TNFSF13B binding to TNFRSF17 is split into TNFSF13–TNFRSF17 and TNFSF13B–TNFRSF17.
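The splitting rule can be illustrated with a minimal Python sketch. The function name split_complex and the use of ":" as the subunit separator are assumptions for illustration, and taking every ligand-subunit/receptor-subunit combination when both sides are heteromeric is also an assumption; this is not the ConnectomeDB pipeline itself.

```python
from itertools import product


def split_complex(ligand: str, receptor: str, sep: str = ":") -> list[tuple[str, str]]:
    """Expand one ligand-receptor entry into simplex (ligand, receptor) pairs.

    Subunits of a complex are assumed to be joined by `sep`, e.g. "IFNAR1:IFNAR2".
    """
    ligands = [s.strip() for s in ligand.split(sep) if s.strip()]
    receptors = [s.strip() for s in receptor.split(sep) if s.strip()]
    # Assumption: every ligand subunit is paired with every receptor subunit.
    return list(product(ligands, receptors))


# The two examples given in the text above.
ifnb1_pairs = split_complex("IFNB1", "IFNAR1:IFNAR2")
tnf_pairs = split_complex("TNFSF13:TNFSF13B", "TNFRSF17")
print(ifnb1_pairs)  # [('IFNB1', 'IFNAR1'), ('IFNB1', 'IFNAR2')]
print(tnf_pairs)    # [('TNFSF13', 'TNFRSF17'), ('TNFSF13B', 'TNFRSF17')]
```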

Bidirectional signalling

In the case of bidirectional signalling, where signals are triggered in both cells, the pair is modelled as two directional interactions, e.g. CD28–CD86 and CD86–CD28.
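As a small sketch of this convention, a bidirectional pair can be stored as two ordered (ligand, receptor) records. The function name expand_bidirectional is hypothetical and used only for illustration.

```python
def expand_bidirectional(partner_a: str, partner_b: str) -> list[tuple[str, str]]:
    """Return both directed (ligand, receptor) records for a bidirectional pair."""
    # Each tuple is ordered: the first element is the ligand (sending cell),
    # the second is the receptor (receiving cell).
    return [(partner_a, partner_b), (partner_b, partner_a)]


directed_pairs = expand_bidirectional("CD28", "CD86")
print(directed_pairs)  # [('CD28', 'CD86'), ('CD86', 'CD28')]
```

Because the tuples are ordered, the two records are distinct entries and can carry separate supporting evidence.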

Incorporation of Existing LR Pairs

Ligand–receptor pairs mediated by protein ligands were downloaded from our previous database release (ConnectomeDB2020) and from other major resources, including CellChatDB_v2, CellPhoneDB_v5, CellTalkDB, ICELLNET_v2, and LIANA_consensus, and were frozen in May 2025.

Note: Non-protein ligands were removed, and pairs involving protein complexes were split into simplex pairs. For further details, see Ligand-Receptor Interaction Modelling above.

Identification of Putative New LR Pairs

Multiple strategies were used to identify novel ligand–receptor (LR) pairs. Initially, we conducted non-exhaustive manual searches of abstracts using keywords such as “ligand” and “receptor”, and collected candidate pairs encountered over the past five years.

This process was significantly accelerated using prompt-based queries to large language models. Perplexity AI was employed to perform an exhaustive reciprocal search: (i) identifying receptors that bind to each known ligand in our database, and (ii) identifying ligands that bind to each known receptor.

All candidate pairs were parsed to extract gene symbols, and only those representing valid gene pairs not already present in ConnectomeDB2020 were retained for manual validation.
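The filtering step described above can be sketched as follows. This is a minimal illustration, assuming candidates have already been parsed into (ligand, receptor) symbol tuples; the function name filter_candidates and the toy symbol and pair sets are hypothetical stand-ins, not actual database content.

```python
def filter_candidates(candidates, valid_symbols, known_pairs):
    """Keep candidate pairs whose symbols are both valid and that are not already known."""
    retained = []
    seen = set()
    for ligand, receptor in candidates:
        pair = (ligand.strip(), receptor.strip())
        # Drop pairs in which either symbol is not a recognised gene symbol.
        if pair[0] not in valid_symbols or pair[1] not in valid_symbols:
            continue
        # Drop pairs already in the previous release, and duplicate candidates.
        if pair in known_pairs or pair in seen:
            continue
        seen.add(pair)
        retained.append(pair)
    return retained


# Toy inputs for illustration only (not real ConnectomeDB2020 content).
toy_valid_symbols = {"IFNB1", "IFNAR1", "CD28", "CD86", "PENK", "OGFR"}
toy_known_pairs = {("IFNB1", "IFNAR1")}
toy_candidates = [
    ("IFNB1", "IFNAR1"),   # already known: dropped
    ("CD28", "CD86"),      # retained
    ("CD28", "CD86"),      # duplicate candidate: dropped
    ("NOTAGENE", "OGFR"),  # invalid symbol: dropped
    (" PENK ", "OGFR"),    # retained after whitespace is stripped
]
for_validation = filter_candidates(toy_candidates, toy_valid_symbols, toy_known_pairs)
print(for_validation)  # [('CD28', 'CD86'), ('PENK', 'OGFR')]
```

The retained pairs are only candidates at this point; each still requires manual validation against primary literature.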

Manual Curation of Ligand–Receptor Triplets

Each interaction in ConnectomeDB is recorded as a triplet: ligand symbol : receptor symbol : PubMed ID. Only valid LR interactions supported by primary experimental evidence are included; literature reviews are excluded, as they often overgeneralize (e.g. “Eph receptors interact with ephrins”) without specifying direct evidence for individual pairs.

For entries from existing databases with a PubMed ID, a single curator manually assessed the cited article. If it contained primary evidence for the interaction, the triplet was retained in ConnectomeDB2025.

For interactions lacking primary evidence—due to incorrect, retracted or missing PubMed IDs, review-only citations, or newly identified pairs (via manual or AI-assisted searches)—two independent curators conducted manual literature searches. One curator identified a candidate publication, and a second curator independently verified whether it provided valid primary evidence. If there was disagreement, a third expert curator acted as a tie-breaker to determine inclusion.
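The two-curator workflow with a tie-breaker can be summarised as a simple decision rule. The sketch below is an assumption-laden simplification: it reduces each curator's assessment to a yes/no judgement on whether the candidate publication provides valid primary evidence, and the function name include_triplet is hypothetical.

```python
from typing import Optional


def include_triplet(
    proposer_supports: bool,
    verifier_supports: bool,
    tie_breaker_supports: Optional[bool] = None,
) -> bool:
    """Decide whether a ligand : receptor : PubMed ID triplet is included."""
    # If the two curators agree, their shared judgement stands.
    if proposer_supports == verifier_supports:
        return proposer_supports
    # On disagreement, a third expert curator decides.
    if tie_breaker_supports is None:
        raise ValueError("curators disagree: a tie-breaker decision is required")
    return tie_breaker_supports


agreed = include_triplet(True, True)
resolved_in = include_triplet(True, False, tie_breaker_supports=True)
resolved_out = include_triplet(True, False, tie_breaker_supports=False)
```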

Subcellular localization

The subcellular localization of ligands and receptors was annotated using Perplexity, UniProt, the Human Protein Atlas (HPA), and manual curation. For HPA data, only localization information with “Enhanced”, “Supported”, or “Approved” reliability was considered; entries marked as “Uncertain” were excluded.
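The HPA reliability filter can be sketched as below. The record layout (dictionaries with "gene", "location" and "reliability" keys) and the placeholder gene names are assumptions for illustration; they do not reflect the actual HPA download format.

```python
# Reliability categories accepted in the text above; "Uncertain" is excluded.
ACCEPTED_RELIABILITY = {"Enhanced", "Supported", "Approved"}


def filter_hpa_records(records):
    """Keep only localization records with an accepted HPA reliability category."""
    return [r for r in records if r["reliability"] in ACCEPTED_RELIABILITY]


# Placeholder records (hypothetical genes and locations, for illustration only).
toy_records = [
    {"gene": "GENE_A", "location": "Plasma membrane", "reliability": "Enhanced"},
    {"gene": "GENE_B", "location": "Nucleoplasm", "reliability": "Uncertain"},
    {"gene": "GENE_C", "location": "Vesicles", "reliability": "Approved"},
]
kept_records = filter_hpa_records(toy_records)
print([r["gene"] for r in kept_records])  # ['GENE_A', 'GENE_C']
```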

The vast majority of ligand–receptor pairs involve ligands that are secreted or located on the plasma membrane of the sending cell, interacting with a receptor on the plasma membrane of the receiving cell.

Note: ConnectomeDB2025 also includes some non-canonical pairs, for example:

  • In the PENK–OGFR pair, an endogenous opioid peptide (OGF, processed from PENK) binds to OGFR, which is localized to the nuclear membrane.

  • In the S100A1–TLR4 pair, the ligand S100A1 is normally intracellular but is released via non-canonical mechanisms upon cellular stress or damage.

Orthologous Pairs

Original interactions

For each literature-verified ligand–receptor pair, we recorded the species of origin for both ligand and receptor proteins, generating a total of 6,094 quintuplets of ligand : receptor : PMID : ligand_species : receptor_species. Most experimental evidence came from human or mouse proteins, supporting 3,634 and 1,017 quintuplets, respectively. Additional evidence came from rat (170), chicken (29), Xenopus (21), zebrafish (19), cow (3), rabbit (1), and Tetraodon (1). Notably, 1,199 quintuplets were supported by cross-species experiments, where the ligand and receptor were derived from different organisms (e.g. porcine RLN2 binding to human RXFP1).
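As a quick consistency check, the reported counts can be summed. The sketch assumes that each per-species count refers to quintuplets in which ligand and receptor come from the same species, so that the same-species and cross-species groups do not overlap; under that assumption the groups add up to the stated total.

```python
# Counts as reported in the text above; same-species interpretation is an assumption.
same_species_counts = {
    "human": 3634,
    "mouse": 1017,
    "rat": 170,
    "chicken": 29,
    "Xenopus": 21,
    "zebrafish": 19,
    "cow": 3,
    "rabbit": 1,
    "tetraodon": 1,
}
cross_species_count = 1199

same_species_total = sum(same_species_counts.values())
total_quintuplets = same_species_total + cross_species_count
print(same_species_total)  # 4895
print(total_quintuplets)   # 6094
```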

Orthology mapping

To enable analyses in other species, ligands and receptors in the original pairs were mapped to orthologues in human, mouse and 12 other vertebrate species: Pan troglodytes (chimp), Macaca mulatta (macaque), Callithrix jacchus (marmoset), Equus caballus (horse), Sus scrofa (pig), Canis lupus familiaris (dog), Bos taurus (cow), Ovis aries (sheep), Rattus norvegicus (rat), Gallus gallus (chicken), Xenopus tropicalis (frog) and Danio rerio (zebrafish). Orthology data from the Alliance of Genome Resources and Ensembl were used.

All interactions and their evidence (Direct/Inferred) are provided in the species-based Ligand-Receptor Browser tables.
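The idea of projecting a pair onto another species and labelling its evidence can be sketched as follows. Several points here are assumptions for illustration: pairs are taken to be stored with human symbols, the orthologue table is a toy one-to-one dictionary (real orthology data can be one-to-many), "Direct" is interpreted as experimental evidence from the target species itself and "Inferred" as evidence transferred by orthology, and the function name map_pair is hypothetical.

```python
from typing import Optional

# Toy orthologue table keyed by (human symbol, target species); illustration only.
TOY_ORTHOLOGUES = {
    ("IFNB1", "mouse"): "Ifnb1",
    ("IFNAR1", "mouse"): "Ifnar1",
}


def map_pair(
    ligand: str,
    receptor: str,
    ligand_species: str,
    receptor_species: str,
    target_species: str,
    orthologues: dict,
) -> Optional[tuple[str, str, str]]:
    """Project a pair onto target_species and label its evidence type."""
    if target_species == "human":
        # Assumption: pairs are stored with human symbols, so no lookup is needed.
        target_ligand, target_receptor = ligand, receptor
    else:
        target_ligand = orthologues.get((ligand, target_species))
        target_receptor = orthologues.get((receptor, target_species))
    # Skip pairs where either partner has no orthologue in the target species.
    if target_ligand is None or target_receptor is None:
        return None
    direct = ligand_species == target_species and receptor_species == target_species
    return (target_ligand, target_receptor, "Direct" if direct else "Inferred")


inferred_row = map_pair("IFNB1", "IFNAR1", "human", "human", "mouse", TOY_ORTHOLOGUES)
direct_row = map_pair("IFNB1", "IFNAR1", "mouse", "mouse", "mouse", TOY_ORTHOLOGUES)
print(inferred_row)  # ('Ifnb1', 'Ifnar1', 'Inferred')
print(direct_row)    # ('Ifnb1', 'Ifnar1', 'Direct')
```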