Providers sit at the center
Provider SDKs account for 23.1% of all normalized edges, and OpenAI SDK appears in 546 final repos. The provider layer is the most common and the most connective part of the stack.
+--------------------------------------------------------------+
| OSS AI STACK MAP :: SNAPSHOT REPORT                          |
| SOURCE: data/run-2026-03-29-methodology-v5                   |
| LENS: major, active, public OSS AI repos on GitHub           |
+--------------------------------------------------------------+
This report reads directly from data/run-2026-03-29-methodology-v5 and summarizes the current stack choices across the project’s final GitHub AI set.
Study frame: GitHub-only, public, non-fork, non-archived, active within 1 month, and at least 1,000 stars. Published stack edges come from manifests, SBOMs, bounded import fallback, repo identity, and reviewed README fallback when an included repo would otherwise remain unmapped.
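The study frame reads naturally as a single predicate over repo metadata. The sketch below is illustrative only: the `Repo` record, its field names, the use of last push time as the activity signal, and the 31-day window are all assumptions, not the project's actual schema or rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Repo:
    # Hypothetical record; field names are illustrative, not the project's schema.
    full_name: str
    stars: int
    is_private: bool
    is_fork: bool
    is_archived: bool
    pushed_at: datetime


def in_study_frame(
    repo: Repo,
    snapshot: datetime,
    min_stars: int = 1000,
    activity_window: timedelta = timedelta(days=31),  # assumed reading of "1 month"
) -> bool:
    """Return True only if the repo passes every study-frame filter."""
    return (
        not repo.is_private
        and not repo.is_fork
        and not repo.is_archived
        and repo.stars >= min_stars
        and snapshot - repo.pushed_at <= activity_window
    )
```

Keeping the thresholds as parameters makes it easy to rerun the same filter against a different snapshot date or star floor.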
303 of 979 technology-mapped repos (30.9%) use at least two tracked providers. The most common provider pairing is Anthropic SDK plus OpenAI SDK in 244 repos. That is followed by Google GenAI SDK plus OpenAI SDK in 218 repos. Major OSS projects are not clustering around a single vendor.
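A minimal sketch of how multi-provider share and provider pairings could be tallied from a repo-to-provider mapping. The toy data and function names are hypothetical; the report's actual pipeline is not shown in this section.

```python
from collections import Counter
from itertools import combinations

# Toy input: repo name -> set of tracked provider SDKs (illustrative data only).
repo_providers = {
    "org/a": {"OpenAI SDK", "Anthropic SDK"},
    "org/b": {"OpenAI SDK", "Anthropic SDK", "Google GenAI SDK"},
    "org/c": {"OpenAI SDK"},
    "org/d": set(),
}


def provider_pair_counts(mapping):
    """Count how many repos contain each unordered provider pair."""
    pairs = Counter()
    for providers in mapping.values():
        # Sorting gives each pair one canonical ordering.
        for pair in combinations(sorted(providers), 2):
            pairs[pair] += 1
    return pairs


def multi_provider_share(mapping):
    """Fraction of repos that use at least two tracked providers."""
    multi = sum(1 for providers in mapping.values() if len(providers) >= 2)
    return multi / len(mapping)


pairs = provider_pair_counts(repo_providers)
share = multi_provider_share(repo_providers)
```

A repo with three providers contributes to three pairs, so pair counts can sum to more than the number of multi-provider repos, as they do in the figures above (244 and 218 against 303).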
Training, orchestration, providers, and retrieval dominate the graph. Evaluation and observability remain comparatively thin, with only 32 guardrail/eval edges and 70 observability edges.
Inference from the aggregate counts: the modal major OSS AI repo in this snapshot is organization-owned, Python-first, anchored on a provider SDK, often layers in Hugging Face training tools, and then adds orchestration, retrieval, and a lightweight UI shell.
966 final repos have manifests and 837 have SBOM dependency evidence. 13 repos map only via canonical repo identity, 0 combine direct evidence with fallback signals, and 41 remain README-only.
2 included repos (0.2%) have no normalized technology edge, so graph-like analysis describes the mapped subset, not the entire final population.
651 repos were judge-reviewed and 157 judge overrides were applied in this snapshot. The published set remains rule-first, but not purely rule-only. A seeded validation sample reviewed 99 final repos (10.1%) in addition to hardening. The validation audit reviewed 119 repos, changed 17 decisions, excluded 7 repos from the sampled final set, and estimated a false-positive rate of 5.9% with a 95% interval of 2.9% to 11.7%.
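The 5.9% point estimate is consistent with 7 excluded repos out of 119 reviewed, though the report does not state that derivation or name its interval method. As an assumption, the sketch below uses a Wilson score interval, one common choice for small-sample proportions; it lands close to the published range but need not match it exactly.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (z defaults to ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    center = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed inputs: 7 exclusions among 119 audited repos.
point_estimate = 7 / 119
low, high = wilson_interval(7, 119)
```

Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] and behaves sensibly when the observed rate is near zero, which is why it suits an audit with only a handful of exclusions.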
Rule-only yields 972 final repos. Judge adjustment yields 981. Direct-only evidence maps 938 repos, and reviewed fallback lifts that to 979. Baseline comparison: /home/agent/oss-ai-stack-map/data/run-2026-03-25-repaired-v16
These visuals summarize the technology-connected subset of the final population. Eigenvector centrality highlights the core hubs, betweenness centrality isolates bridge technologies, repo degree shows stack breadth per mapped repo, and category mixing shows which layers of the stack actually co-occur.
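To make two of these metrics concrete, here is a small dependency-free sketch that computes repo degree and eigenvector centrality on a toy graph. It assumes an unweighted technology co-occurrence graph, which is only one possible construction; the report does not specify how its graph is built, and the repo and technology names are illustrative.

```python
import math
from itertools import combinations

# Toy input: repo -> technologies it maps to (illustrative data only).
repo_techs = {
    "org/a": ["openai", "langchain", "chromadb"],
    "org/b": ["openai", "transformers"],
    "org/c": ["openai", "langchain"],
    "org/d": ["transformers", "peft"],
}


def cooccurrence_graph(mapping):
    """Link two technologies if at least one repo uses both (unweighted)."""
    adj = {}
    for techs in mapping.values():
        for tech in techs:
            adj.setdefault(tech, set())
        for a, b in combinations(set(techs), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def eigenvector_centrality(adj, iterations=200, tol=1e-9):
    """Power iteration on A + I; the identity shift avoids oscillation."""
    scores = {node: 1.0 for node in adj}
    for _ in range(iterations):
        new = {n: scores[n] + sum(scores[m] for m in adj[n]) for n in adj}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - scores[n]) for n in adj) < tol:
            return new
        scores = new
    return scores


# Repo degree: number of distinct technologies per mapped repo.
repo_degree = {repo: len(set(techs)) for repo, techs in repo_techs.items()}

adj = cooccurrence_graph(repo_techs)
centrality = eigenvector_centrality(adj)
```

In this toy graph the provider node scores highest because it connects to every other cluster, which mirrors the hub role the report attributes to provider SDKs.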