Personalized medicine

Personalized medicine divides patients into increasingly fine categories, called clinical phenotypes, so that diagnosis and treatment can be tailored to each group.

Finer clinical phenotypes

Common Data Models (CDMs)

Common Data Models standardize healthcare data so multiple institutions can combine and analyze it for research on specific diseases, treatments, outcomes, and more.

Common data models overview

Healthcare systems store and label data differently. The same concept can be named in different ways even within a single organization, and certainly across countries. Without standardization, queries are inconsistent and data integration is hard.

Different terms across organizations

CDMs define common structures and fields so the same kinds of data live in the same place, regardless of source system.
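To make the idea concrete, here is a minimal sketch of harmonizing two differently labeled source extracts into one shared structure. Everything in it is hypothetical: the `CommonVisit` class, the two hospital layouts, and their field names are invented for illustration and are not actual OMOP tables.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CommonVisit:
    """Hypothetical common structure (illustration only, not an OMOP table)."""
    person_id: int
    visit_start: date
    visit_type: str


def from_hospital_a(row: dict) -> CommonVisit:
    # Hypothetical source A: English field names, ISO dates.
    return CommonVisit(
        person_id=int(row["patient_no"]),
        visit_start=date.fromisoformat(row["admit_dt"]),
        visit_type=row["encounter_kind"].lower(),
    )


def from_hospital_b(row: dict) -> CommonVisit:
    # Hypothetical source B: French field names, DD/MM/YYYY dates.
    day, month, year = row["date_entree"].split("/")
    visit_types = {"hospitalisation": "inpatient", "consultation": "outpatient"}
    return CommonVisit(
        person_id=int(row["id_patient"]),
        visit_start=date(int(year), int(month), int(day)),
        visit_type=visit_types[row["type_sejour"]],
    )


visits = [
    from_hospital_a({"patient_no": "17", "admit_dt": "2023-04-02", "encounter_kind": "Inpatient"}),
    from_hospital_b({"id_patient": "42", "date_entree": "05/04/2023", "type_sejour": "hospitalisation"}),
]

# Once both sources share one structure, a single query covers both.
inpatient_ids = [v.person_id for v in visits if v.visit_type == "inpatient"]
print(inpatient_ids)
```

Once both extracts are in the same shape, the final filter no longer needs to know which hospital a row came from.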

CDM details

Widely used open-source research CDMs include:

There is no single “best” CDM. Each has strengths and trade-offs depending on your use case, data shape, and query patterns.

OMOP

OMOP is the CDM I focus on here because I work with it regularly. It has an active community and solid documentation.

OMOP = Observational Medical Outcomes Partnership.

It is maintained by the OHDSI community (Observational Health Data Sciences and Informatics), an international collaborative whose coordinating center is at Columbia University, New York.

History

Adoption

OMOP has broad global adoption. Estimates suggest data for roughly 1.4 billion individuals have been mapped to OMOP worldwide.

OMOP worldwide adoption

Key features

Data model

The OMOP CDM is primarily relational.

Terminology differences across languages and sources are handled via standardization. For example, codes from ICD‑10 (English) and CIM‑10 (its French edition) are both mapped to a common standard such as SNOMED CT.

OMOP concept hierarchy

Useful links

Deep dive: OMOP CDM

DDL scripts to create the model are available in the OHDSI/CommonDataModel repository on GitHub. These SQL scripts create the standard tables of the OMOP CDM.

OMOP model and conventions

OMOP v5 model

Conceptual view

OMOP distinguishes two sides: source and standard.

Source

Standard

Mapping

All source terms should be mapped to standard concepts. This allows multiple data sources to converge on a shared, community‑understood vocabulary.
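As a rough sketch of how that mapping is looked up, the snippet below builds a trimmed-down `concept` and `concept_relationship` table in an in-memory SQLite database and follows the 'Maps to' relationship from source codes to a standard concept. The table and column names follow the OMOP vocabulary tables, but only a subset of columns is kept. The concept_id values and the rows themselves are placeholders chosen for illustration, so check real mappings in Athena.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id       INTEGER PRIMARY KEY,
    concept_name     TEXT,
    vocabulary_id    TEXT,
    concept_code     TEXT,
    standard_concept TEXT      -- 'S' marks a standard concept
);
CREATE TABLE concept_relationship (
    concept_id_1     INTEGER,  -- source concept
    concept_id_2     INTEGER,  -- target concept
    relationship_id  TEXT
);
""")

# Toy rows: the concept_id values are placeholders, not real OMOP IDs.
conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?, ?)", [
    (1001, "Non-insulin-dependent diabetes mellitus without complications", "ICD10", "E11.9", None),
    (1002, "Diabète sucré non insulinodépendant, sans complication", "CIM10", "E11.9", None),
    (2001, "Type 2 diabetes mellitus", "SNOMED", "44054006", "S"),
])
conn.executemany("INSERT INTO concept_relationship VALUES (?, ?, ?)", [
    (1001, 2001, "Maps to"),
    (1002, 2001, "Maps to"),
])

# Follow 'Maps to' from each source code to its standard concept.
rows = conn.execute("""
    SELECT src.vocabulary_id, src.concept_code, std.concept_id, std.concept_name
    FROM concept AS src
    JOIN concept_relationship AS cr
      ON cr.concept_id_1 = src.concept_id
     AND cr.relationship_id = 'Maps to'
    JOIN concept AS std
      ON std.concept_id = cr.concept_id_2
     AND std.standard_concept = 'S'
    WHERE src.concept_code = ?
    ORDER BY src.vocabulary_id
""", ("E11.9",)).fetchall()

for vocabulary, code, standard_id, standard_name in rows:
    print(vocabulary, code, "->", standard_id, standard_name)
```

In this toy data, both the English and the French source code resolve to the same standard concept, which is what lets downstream queries ignore the source vocabulary.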

OMOP mapping

You can search and download standardized vocabularies and mappings in Athena (official OHDSI vocabulary portal): https://athena.ohdsi.org/search-terms/start

OMOP disease hierarchy

Medical hierarchies

Hierarchical terminologies enable powerful roll‑ups and drill‑downs.

Medical terminologies hierarchy

Example hierarchies:

Medication terminology

For medications, OMOP uses RxNorm, maintained by the U.S. National Library of Medicine, as the primary standard.

RxNorm and related terminologies

OMOP also includes additional medication hierarchies. One example is NDF‑RT (National Drug File – Reference Terminology), which organizes medications by related diseases/indications.

Why standardize with OMOP?

Summary

Querying the OMOP CDM

Key concepts:

Concept IDs and names mapping

Example hierarchy:

OMOP parent-children ancestor hierarchy

OMOP uses broad, standardized table and field names so it can generalize across many health systems worldwide. A few examples:

These umbrella names intentionally cover many similar events (e.g., external consults, urgent care, inpatient stays) under a single “visit” construct. The same idea applies to other domains.
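A small sketch of the “visit” umbrella, assuming a trimmed-down `visit_occurrence` table in an in-memory SQLite database. The table and column names follow the OMOP CDM, but only a few columns are kept and the rows are invented. The three visit concept IDs and labels are the ones I recall for the standard Visit concepts, so treat them as an assumption and verify them in Athena.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visit_occurrence (
    visit_occurrence_id INTEGER PRIMARY KEY,
    person_id           INTEGER,
    visit_concept_id    INTEGER,  -- the kind of visit, as a concept
    visit_start_date    TEXT
);
""")

# Assumed labels for the visit concepts used below (verify in Athena).
VISIT_LABELS = {
    9201: "Inpatient Visit",
    9202: "Outpatient Visit",
    9203: "Emergency Room Visit",
}

# Invented rows: very different kinds of encounter share one table.
conn.executemany("INSERT INTO visit_occurrence VALUES (?, ?, ?, ?)", [
    (1, 1, 9201, "2023-01-10"),
    (2, 1, 9202, "2023-02-01"),
    (3, 2, 9203, "2023-02-03"),
    (4, 3, 9202, "2023-03-15"),
])

# One query over one table, whatever the kind of visit.
visit_counts = dict(conn.execute("""
    SELECT visit_concept_id, COUNT(*)
    FROM visit_occurrence
    GROUP BY visit_concept_id
""").fetchall())

for concept_id, count in sorted(visit_counts.items()):
    print(VISIT_LABELS[concept_id], count)
```

The kind of encounter is carried by `visit_concept_id` rather than by separate tables, so adding a new kind of visit does not change the schema.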

Using hierarchies can drastically simplify queries: instead of listing hundreds or thousands of specific concepts, you can select a single ancestor concept_id and include all of its descendants via CONCEPT_ANCESTOR.
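Here is a minimal sketch of that pattern, again with in-memory SQLite. The `concept_ancestor` and `condition_occurrence` names and columns follow the OMOP CDM (trimmed to what the query needs), but the concept IDs, the small diabetes hierarchy, and the patient rows are placeholders invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_ancestor (
    ancestor_concept_id      INTEGER,
    descendant_concept_id    INTEGER,
    min_levels_of_separation INTEGER,
    max_levels_of_separation INTEGER
);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id               INTEGER,
    condition_concept_id    INTEGER
);
""")

# Placeholder hierarchy (not real concept IDs):
#   100 diabetes mellitus
#     101 type 1 diabetes
#     102 type 2 diabetes
#       103 type 2 diabetes with renal complication
#   200 asthma (unrelated)
# Each concept is also listed as its own ancestor at distance 0.
conn.executemany("INSERT INTO concept_ancestor VALUES (?, ?, ?, ?)", [
    (100, 100, 0, 0), (100, 101, 1, 1), (100, 102, 1, 1), (100, 103, 2, 2),
    (101, 101, 0, 0),
    (102, 102, 0, 0), (102, 103, 1, 1),
    (103, 103, 0, 0),
    (200, 200, 0, 0),
])
conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?)", [
    (1, 1, 101),  # type 1 diabetes
    (2, 2, 103),  # type 2 diabetes with renal complication
    (3, 3, 200),  # asthma
    (4, 4, 100),  # diabetes mellitus, unspecified
])


def persons_with(ancestor_concept_id: int) -> list[int]:
    """Return person_ids with the given concept or any of its descendants."""
    rows = conn.execute("""
        SELECT DISTINCT co.person_id
        FROM condition_occurrence AS co
        JOIN concept_ancestor AS ca
          ON ca.descendant_concept_id = co.condition_concept_id
        WHERE ca.ancestor_concept_id = ?
        ORDER BY co.person_id
    """, (ancestor_concept_id,)).fetchall()
    return [person_id for (person_id,) in rows]


print(persons_with(100))  # any diabetes
print(persons_with(102))  # type 2 diabetes and its descendants
```

Because each concept appears as its own ancestor, a single join covers both the ancestor itself and everything beneath it, with no need for an explicit list of codes.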