Community · Code · Contracts

Law shapes the future.
It should be accessible to everyone.

We're building open source tools to unlock the legal knowledge trapped inside documents, contracts, and systems, and to make it available to developers, researchers, and anyone who needs it.

Contracts govern nearly every meaningful transaction in modern life. Employment terms, rental agreements, insurance policies, software licenses — these documents determine who owes what to whom, and under what conditions. Yet the tools for understanding, comparing, and analyzing them have remained locked behind commercial paywalls, inaccessible to the people most affected by their contents.

The legal profession has long treated its documents as proprietary artifacts. The irony is hard to miss: the very system meant to codify public rights and obligations operates on infrastructure that most of the public — and most developers — cannot touch.

We believe this has to change. Not through disruption or rhetoric, but through the quiet, compounding power of open source. One tool at a time. One contribution at a time. One knowledge base that anyone can fork, annotate, and extend.

the process
OPEN SOURCE LEGAL
The Knowledge Distillation Funnel
Natural Language: raw legal material (contracts, regulations, statutes)
Version Controlled Collections: stored and versioned (corpora, forks, document stores)
Structured Data: labeled and queryable (annotations, schemas, ontologies)
Automation & Curation: refined and verified (AI agents + human review)
Presentation: answers and interfaces (viewers, redlines, dashboards)
Distillation output: distilled insights and answers
Each layer is open source. Each compounds on the others.
the work

The Stack

From document parsing to AI-powered annotation, these projects form a complete open source toolkit for legal knowledge work.

Open-Source-Legal Python · React
OpenContracts
The platform at the center of everything. Self-hosted document annotation, version control, semantic search, and MCP integration. Humans and AI agents building knowledge bases together — annotating with precision, forking public corpora, and querying with structured LLM-powered extraction.
Annotation — Custom label schemas with precise multi-page span selection
AI Agents — Configurable assistants that search, annotate, and reason over your corpus
MCP — Expose your knowledge base to Claude, Cursor, and any MCP-compatible tool
Version Control — Git for knowledge: fork, branch, restore, never lose work
Community — Threaded discussions, @mentions, voting, leaderboards
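To make the annotation and fork ideas above concrete, here is a minimal Python sketch of how a label schema, a multi-page annotation, and a forked corpus might fit together. It is an illustration only: `Span`, `Annotation`, `Corpus`, and the example labels are hypothetical names invented for this sketch and are not OpenContracts' actual data model or API.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Span:
    """A character range on one page (end is exclusive). Hypothetical type."""
    page: int
    start: int
    end: int


@dataclass(frozen=True)
class Annotation:
    """One label applied to one or more spans, possibly on different pages."""
    label: str
    spans: Tuple[Span, ...]

    def pages(self) -> List[int]:
        # Sorted, de-duplicated list of pages this annotation touches.
        return sorted({span.page for span in self.spans})


@dataclass
class Corpus:
    """A named set of annotations that remembers which corpus it was forked from."""
    name: str
    schema: FrozenSet[str]  # the labels this corpus accepts
    annotations: List[Annotation] = field(default_factory=list)
    parent: Optional["Corpus"] = None

    def annotate(self, label: str, spans: List[Span]) -> Annotation:
        # Reject labels that are not part of the corpus schema.
        if label not in self.schema:
            raise ValueError(f"{label!r} is not in the label schema")
        annotation = Annotation(label, tuple(spans))
        self.annotations.append(annotation)
        return annotation

    def fork(self, name: str) -> "Corpus":
        # Copy the list so later edits to the fork never touch the parent.
        # Annotations are immutable, so sharing the objects themselves is safe.
        return Corpus(name, self.schema, list(self.annotations), parent=self)


# Usage: annotate a public corpus, then fork it and extend the fork.
public = Corpus("public-leases", frozenset({"Termination", "Indemnity"}))
clause = public.annotate("Termination", [Span(3, 120, 410), Span(4, 0, 85)])
mine = public.fork("my-leases")
mine.annotate("Indemnity", [Span(7, 10, 200)])
```

In this sketch the fork starts with the parent's annotations and diverges from there, so work added to `mine` leaves `public` untouched. A real system would also need persistence, history, and a way to contribute changes back.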
JSv4 C# · WASM
Docxodus
Office XML redline engine forked from Open-Xml-PowerTools and upgraded to .NET 8.0. Compares Word documents, detects moved sections, converts DOCX to HTML, and runs entirely in the browser via WebAssembly. Available as a NuGet package, an npm module, and a CLI tool.
JSv4 TypeScript · React
react-docxodus-viewer
Drop-in React component for viewing DOCX documents entirely in the browser. Powered by Docxodus WASM — no server required. Supports tracked changes, comments, pagination, header/footer rendering, and move detection with configurable display modes.
JSv4 Python · C#
Python-Redlines
Tracked-change redlines for DOCX files in the Python ecosystem. Wraps the .NET WmlComparer to democratize document comparison, a capability historically locked behind commercial software. Cross-platform binaries for Linux, macOS, and Windows.
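For readers new to redlining, the core idea can be sketched in a few lines of Python using only the standard library. This is not Python-Redlines' API and not how WmlComparer works internally: it compares plain text word by word with `difflib`, whereas a real DOCX redline has to work on the document's XML and emit tracked-change markup. The function names `redline` and `render` and the `[-…-]` / `{+…+}` markers are assumptions made for this example.

```python
import difflib
from typing import List, Tuple


def redline(original: str, modified: str) -> List[Tuple[str, str]]:
    """Return (kind, text) pairs where kind is 'keep', 'delete', or 'insert'."""
    a, b = original.split(), modified.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    ops: List[Tuple[str, str]] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("keep", " ".join(a[i1:i2])))
            continue
        # 'replace' is split into a deletion followed by an insertion.
        if i2 > i1:
            ops.append(("delete", " ".join(a[i1:i2])))
        if j2 > j1:
            ops.append(("insert", " ".join(b[j1:j2])))
    return ops


def render(ops: List[Tuple[str, str]]) -> str:
    """Render the operations as inline text with deletion and insertion markers."""
    parts = []
    for kind, text in ops:
        if kind == "delete":
            parts.append(f"[-{text}-]")
        elif kind == "insert":
            parts.append(f"{{+{text}+}}")
        else:
            parts.append(text)
    return " ".join(parts)


before = "Tenant shall pay rent within 10 days"
after = "Tenant shall pay rent within 30 days"
result = render(redline(before, after))
print(result)  # Tenant shall pay rent within [-10-] {+30+} days
```

The sketch ignores formatting, tables, and moved sections, which are the parts of document comparison that make a dedicated engine worthwhile.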
Open-Source-Legal TypeScript
CAML
Corpus Article Markup Language — a human-readable markdown superset for authoring beautiful, interactive legal articles and knowledge bases. Write like markdown, render like a publication. Includes a zero-dependency parser and a React renderer with themed components, interactive blocks, and customizable design tokens.

The Conviction

Most knowledge lives in documents. Contracts, regulations, research papers, policies — the material that governs how organizations and societies actually work. That knowledge is usually trapped: locked in PDFs, scattered across drives, understood fully by a handful of people who happened to read the right things at the right time.

Then large language models arrived, and the world suddenly needed exactly what careful curation has always produced: structured, annotated, version-controlled knowledge bases that AI can actually reason over. The collaborators these platforms were designed for finally showed up — they just turned out to be AI agents.

But the best AI systems still need carefully curated data. The difference now is that curation and AI can happen in the same place. Human annotation remains the ground truth. AI builds on top of that work — it doesn't replace it.

This is the DRY principle applied to institutional knowledge: annotate once, build on it forever. Fork a public corpus to refine someone else's annotations. Contribute back. Let the community compound what any individual couldn't do alone.

The code is open. The knowledge should be too.

Whether you're a developer, a legal professional, a researcher, or just someone who believes the law should be understandable — there's a place for you here.