Programming Language Lineage Dataset

An open, evidence-backed dataset of programming language implementation and influence relationships. Every relationship includes a confidence score and at least one evidence source URL.

112 Total nodes
98 Languages
14 Tools
347 Relationships

Relationship Breakdown

Type                    Count
influenced              189
compiler written in     78
runtime written in      56
bootstrap written in    14
transpiled to           8
rewritten in            2

Schema

Each language node contains: id, name, first_release_year, paradigm, typing, cluster_hint.

Each relationship contains: from_language, to_language, relationship, confidence (0–1), evidence_source (URL), notes.
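
As a rough illustration, the two record shapes above could be modeled in Python as follows. The field names come from the schema. The field types, the optional notes default, and the validation rules are assumptions made for this sketch, not part of the published dataset.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class LanguageNode:
    # Field names follow the schema; the types are assumed for illustration.
    id: str
    name: str
    first_release_year: int
    paradigm: str
    typing: str
    cluster_hint: str


@dataclass(frozen=True)
class Relationship:
    # Field names follow the schema; the types are assumed for illustration.
    from_language: str
    to_language: str
    relationship: str
    confidence: float
    evidence_source: str
    notes: str = ""  # assumed to be free text that may be empty

    def __post_init__(self) -> None:
        # The schema states that confidence lies in the range 0 to 1.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")
        # The schema states that evidence_source is a URL; this check
        # (http or https scheme plus a host) is an assumed, simple test.
        parsed = urlparse(self.evidence_source)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a valid URL: {self.evidence_source}")


# Placeholder values only; this is not a record from the dataset.
example = Relationship(
    from_language="lang_a",
    to_language="lang_b",
    relationship="influenced",
    confidence=0.9,
    evidence_source="https://example.org/evidence",
)
```

Validating at construction time means a malformed record fails as soon as it is loaded instead of surfacing later during analysis.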

Download

The raw dataset JSON is available at:

https://languagelineage.org/dataset/v4/lineage_v4.json
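
A minimal sketch of fetching the file and tallying relationships by type, using only the Python standard library. The top-level layout of the JSON is not documented here, so the "relationships" key is an assumption, and the helper names fetch_dataset and count_relationship_types are hypothetical. Adjust the key after inspecting the downloaded file.

```python
import json
from collections import Counter
from urllib.request import urlopen

DATASET_URL = "https://languagelineage.org/dataset/v4/lineage_v4.json"


def fetch_dataset(url: str = DATASET_URL, timeout: float = 30.0) -> dict:
    """Download the dataset and parse it as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def count_relationship_types(dataset: dict, key: str = "relationships") -> Counter:
    """Count records per relationship type.

    Assumes the records sit in a list under `key` (an assumed layout) and
    that each record carries the `relationship` field named in the schema.
    """
    return Counter(record["relationship"] for record in dataset.get(key, []))
```

Calling count_relationship_types(fetch_dataset()) would then give per-type totals that can be compared with the Relationship Breakdown table.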

Citation

Language Lineage dataset (languagelineage.org). Accessed 2026.