Open Contracts

The first open source PDF document labeling tool. It comes with a smooth, React-based frontend and lets you control your training data.


We're biased on this one, but Open Contracts offers a smooth, modern tool for collaboratively labeling contracts for machine learning applications. Every other free or open source tool we know of converts your contract to text first and then has you label plain, unformatted text. Open Contracts, by contrast, was built from the ground up to work with PDFs. Because of this decision, we hope it will be well received even by non-technical users, who can browse and search a contract PDF just as they normally would. The goal is for end users to need no technical knowledge.

Why use Open Contracts? It not only lets you label contracts and store your labelled data in an open source format, it also lets you share your label sets and create new datasets from existing ones. Why constantly reinvent the wheel? Create a base dataset or use an open source one (like the Atticus Project's), then fork it to build a bespoke dataset for a specific application. Open Contracts puts you in control of what's most valuable: your legal data.


Open Source: Yes
Paid Support: No





Easily Share and Export Your Data

You can easily export annotated or unannotated documents. You can also export entire datasets, which can then be shared and loaded into other instances of Open Contracts.


Easy, Drag-and-Drop Labeling of Native PDFs

Unlike other text labeling tools (that we know of), Open Contracts is designed to let you view and label native PDFs in a high-quality PDF viewer (Mozilla's excellent PDF.js). Non-technical users should have no problem using this tool.


Manage Multiple Datasets and Easily Fork Existing Datasets

Open Contracts makes it easy to create multiple collections of labelled documents from the same source material. "Fork" an existing dataset to create a customized version that uses your existing data as a base. This is a great way to leverage public datasets like the Atticus Project's to create your own bespoke training data.
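
To make the idea of forking concrete, here is a minimal Python sketch. It is not Open Contracts' actual data model or API: the `Annotation`, `Document`, and `Dataset` classes and the `fork` function are hypothetical names invented for illustration. The only point it shows is that a fork starts as an independent copy of the base dataset, so later changes to the fork leave the original untouched.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    label: str  # e.g. "Indemnification Clause"
    text: str   # the labelled span of text
    page: int   # page of the PDF where the span appears


@dataclass
class Document:
    title: str
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    documents: List[Document] = field(default_factory=list)
    forked_from: Optional[str] = None  # name of the base dataset, if any


def fork(base: Dataset, new_name: str) -> Dataset:
    """Return an independent copy of `base` that remembers where it came from."""
    forked = copy.deepcopy(base)  # deep copy so documents and labels are not shared
    forked.name = new_name
    forked.forked_from = base.name
    return forked


# Hypothetical usage: start from a public base set, then customize the fork.
base = Dataset(
    name="public-base",
    documents=[Document("Sample NDA", [Annotation("Governing Law", "laws of New York", 3)])],
)
custom = fork(base, "my-custom-set")
custom.documents[0].annotations.append(
    Annotation("Indemnification Clause", "shall indemnify and hold harmless", 5)
)
```

After this runs, `base` still holds its single original annotation, while `custom` holds two and records `public-base` as its origin.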


Simple, React-based Frontend Is Easy to Navigate

Open Contracts is built on Node.js and React. Its MIT-licensed frontend is smooth and easy for non-technical users to navigate. You can run plain-text searches, or search and filter by document label type or labelled text (e.g., quickly find all text labelled "Indemnification Clause" in your datasets).
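
If you want to run the same kind of filter on labelled data outside the tool, the logic can be sketched in a few lines of Python. This assumes annotations are available as simple records with `label` and `text` fields. That record shape and the `search_annotations` function are assumptions made for illustration, not Open Contracts' documented export format or API.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical record shape: {"label": ..., "text": ..., "document": ...}
Annotation = Dict[str, str]


def search_annotations(
    annotations: Iterable[Annotation],
    label: Optional[str] = None,
    query: Optional[str] = None,
) -> List[Annotation]:
    """Keep annotations matching the given label and/or containing the query text."""
    results = []
    for ann in annotations:
        # Filter by exact label type when one is requested.
        if label is not None and ann["label"] != label:
            continue
        # Case-insensitive plain-text match against the labelled text.
        if query is not None and query.lower() not in ann["text"].lower():
            continue
        results.append(ann)
    return results


sample = [
    {"label": "Indemnification Clause", "text": "Seller shall indemnify Buyer", "document": "MSA.pdf"},
    {"label": "Governing Law", "text": "governed by the laws of Delaware", "document": "MSA.pdf"},
    {"label": "Indemnification Clause", "text": "Licensee will hold Licensor harmless", "document": "License.pdf"},
]

indemnification = search_annotations(sample, label="Indemnification Clause")
harmless_only = search_annotations(sample, label="Indemnification Clause", query="harmless")
```

Here `indemnification` contains the two records labelled "Indemnification Clause", and `harmless_only` narrows that to the one from `License.pdf`.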