cdl2025 provides tools and pipelines for importing clinical ontologies
(ICD10, HPO, UMLS) into a Neo4j graph database, computing ontology
embeddings, and integrating patient-generated data for downstream
analytics.
Slides of the masterclass: Google Drive
- Overview
- Installation
- Data Pipeline
- Import Commands
- Neo4j & APOC Virtualization
- Contributing
- License
This project supports:
- Loading ICD10 codes, chapters, groups, and embeddings
- Loading HPO ontology and embeddings
- Mapping UMLS concepts to ICD10 and HPO
- Ingesting patient-generated annotations
- Creating virtualized views in Neo4j via APOC Data Virtualization
Create and activate a Python virtual environment, then install the pinned dependencies:
virtualenv -p python3 venv
source venv/bin/activate
pip install -r requirements.lock
The ICD10 pipeline loads:
- Disease codes + hierarchy
- Chapters
- Groups
- Embeddings
The HPO pipeline loads:
- The full HPO ontology
- Precomputed HPO embeddings
The UMLS mapping pipeline maps UMLS concepts (from MRCONSO.RRF) to:
- ICD10 codes
- HPO Terms
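As a rough illustration of this mapping step, the sketch below groups MRCONSO.RRF rows by concept identifier (CUI) and keeps only rows whose source is ICD10 or HPO. It is not the project's `factory.umls_map` implementation. The function name `map_cui_to_codes` is hypothetical, and the source abbreviations `ICD10` and `HPO` are assumptions about how the relevant rows are labelled in your UMLS release. The column positions follow the standard pipe-delimited MRCONSO.RRF layout.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Column positions in the pipe-delimited MRCONSO.RRF layout.
CUI, SAB, CODE = 0, 11, 13

# Assumption: these source abbreviations mark ICD10 and HPO rows.
TARGET_SOURCES = {"ICD10", "HPO"}


def map_cui_to_codes(lines: Iterable[str]) -> Dict[str, Dict[str, Set[str]]]:
    """Return {CUI: {source: {codes}}} for rows from the target sources."""
    mapping: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= CODE:
            continue  # skip blank or truncated rows
        source = fields[SAB]
        if source in TARGET_SOURCES:
            mapping[fields[CUI]][source].add(fields[CODE])
    return mapping
```

Passing an open file handle for MRCONSO.RRF (opened with `encoding="utf-8"`) streams the file line by line. A CUI whose entry contains both an `ICD10` and an `HPO` key is a candidate link between the two ontologies.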
The patient pipeline imports patient-generated annotations from CSV files.
TODO: Add schema specification and examples.
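The import commands in this README can also be chained from a small driver script. The sketch below is one possible way to do that, not part of the project. The helper names `build_command` and `run_all` are hypothetical. The module names, `--backend` and `--file` flags, and file names are copied from the commands listed here. Running them in the listed order is an assumption about what the pipeline expects.

```python
import subprocess
import sys
from typing import List, Optional

# (module, input file or None), in the order the commands appear in this README.
IMPORT_STEPS = [
    ("factory.icd10", "icd102019syst_codes.txt"),
    ("factory.icd10_chapter", "icd102019syst_chapters.txt"),
    ("factory.icd10_group", "icd102019syst_groups.txt"),
    ("factory.icd10_embedding", None),
    ("factory.hpo", None),
    ("factory.hpo_embedding", None),
    ("factory.umls_map", "MRCONSO.RRF"),
    ("factory.patient_annotation", "patient03.csv"),
]


def build_command(module: str, file: Optional[str], backend: str = "neo4j") -> List[str]:
    """Build the argument list for one `python -m <module>` import step."""
    cmd = [sys.executable, "-m", module, "--backend", backend]
    if file is not None:
        cmd += ["--file", file]
    return cmd


def run_all(dry_run: bool = False) -> List[List[str]]:
    """Print each import command and, unless dry_run is set, execute it."""
    commands = [build_command(module, file) for module, file in IMPORT_STEPS]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step
    return commands
```

Calling `run_all(dry_run=True)` prints the commands without touching the database, which is a cheap way to check file names before a long import.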
python -m factory.icd10 --backend neo4j --file icd102019syst_codes.txt
python -m factory.icd10_chapter --backend neo4j --file icd102019syst_chapters.txt
python -m factory.icd10_group --backend neo4j --file icd102019syst_groups.txt
python -m factory.icd10_embedding --backend neo4j
python -m factory.hpo --backend neo4j
python -m factory.hpo_embedding --backend neo4j
python -m factory.umls_map --backend neo4j --file MRCONSO.RRF
python -m factory.patient_annotation --backend neo4j --file patient03.csv
Download the appropriate release:
https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/5.26.0
Place it in Neo4j's plugins/ directory, then enable APOC with the following configuration settings:
dbms.security.procedures.unrestricted=apoc.*
dbms.security.procedures.allowlist=apoc.*
apoc.import.file.enabled=true
apoc.import.file.use_neo4j_config=true
CALL apoc.dv.catalog.install(
  "patient", "cdl2025",
  {
    type: "CSV",
    url: "file:///patient.csv",
    labels: ["Patient"],
    query: "map.PatientID = $patientId",
    desc: "Patient details with patientId"
  }
);
Pull requests are welcome!
Please use feature branches and follow existing module patterns.