Blackstone is a natural language processing library built on top of the excellent, open-source spaCy NLP framework. It was developed by Daniel Hoadley at ICLR&D, the research lab of the Incorporated Council of Law Reporting for England and Wales.
Blackstone is experimental and not quite production-grade, but it offers some intriguing functionality. Given raw text extracted from a contract or other legal document, it can identify:
Named Entities:
| Ent | Name | Examples |
|------------|------------------------------------------------------------------|---------------------------------------------------------------|
| CASENAME | Case names | e.g. Smith v Jones, In re Jones, In Jones' case |
| CITATION | Citations (unique identifiers for reported and unreported cases) | e.g. (2002) 2 Cr App R 123 |
| INSTRUMENT | Written legal instruments | e.g. Theft Act 1968, European Convention on Human Rights, CPR |
| PROVISION | Unit within a written legal instrument | e.g. section 1, art 2(3) |
| COURT | Court or tribunal | e.g. Court of Appeal, Upper Tribunal |
| JUDGE | References to judges | e.g. Eady J, Lord Bingham of Cornhill |
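As a rough illustration of how these labels might be consumed, the sketch below groups recognised entities by label. It assumes Blackstone exposes its entities through spaCy's standard `doc.ents` interface once the `en_blackstone_proto` model is loaded. The helper names `group_entities` and `extract_entities` are hypothetical, and the sample pairs are hand-written for illustration rather than real model output.

```python
from collections import defaultdict


def group_entities(pairs):
    """Group (text, label) pairs by label, preserving order of appearance."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


def extract_entities(text, model="en_blackstone_proto"):
    """Run the model over text and return (text, label) pairs.

    Assumes spaCy and the Blackstone model are installed. The import is
    deferred so that group_entities() can be used without them.
    """
    import spacy

    nlp = spacy.load(model)
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


# Hand-written sample in the shape extract_entities() would return.
sample = [
    ("Smith v Jones", "CASENAME"),
    ("(2002) 2 Cr App R 123", "CITATION"),
    ("section 1", "PROVISION"),
    ("Theft Act 1968", "INSTRUMENT"),
    ("Court of Appeal", "COURT"),
]
grouped = group_entities(sample)
print(grouped["INSTRUMENT"])  # ['Theft Act 1968']
```

Grouping by label makes it easy to, say, list every instrument cited in a document without scanning the full entity list each time.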
Legal Text Categories:
| Cat | Description |
|------------|--------------------------------------------------------------------------|
| AXIOM | The text appears to postulate a well-established principle |
| CONCLUSION | The text appears to make a finding, holding, determination or conclusion |
| ISSUE | The text appears to discuss an issue or question |
| LEGAL_TEST | The text appears to discuss a legal test |
| UNCAT | The text does not fall into one of the four categories above |
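A text categoriser of this kind typically returns a score per category rather than a single label. The sketch below picks the highest-scoring category from such a mapping. It assumes the scores arrive as a plain dictionary, as spaCy's `doc.cats` attribute provides. The function name `top_category` is hypothetical and the scores are made up for illustration.

```python
def top_category(cats):
    """Return the (label, score) pair with the highest score."""
    label = max(cats, key=cats.get)
    return label, cats[label]


# Made-up scores in the shape of spaCy's doc.cats dictionary.
scores = {
    "AXIOM": 0.02,
    "CONCLUSION": 0.91,
    "ISSUE": 0.03,
    "LEGAL_TEST": 0.01,
    "UNCAT": 0.03,
}
label, score = top_category(scores)
print(label, score)  # CONCLUSION 0.91
```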
Abbreviation Detection:
Blackstone has a custom component that helps link abbreviations to the full names they stand for.
<u>For example, the project's sample code resolves “ECtHR” to “European Court of Human Rights”:</u>
import spacy
from blackstone.pipeline.abbreviations import AbbreviationDetector

# Load the Blackstone model (the en_blackstone_proto package must be installed).
nlp = spacy.load("en_blackstone_proto")

# Add the abbreviation pipe to the spaCy pipeline (spaCy 2.x add_pipe style).
abbreviation_pipe = AbbreviationDetector(nlp)
nlp.add_pipe(abbreviation_pipe)

doc = nlp('The European Court of Human Rights ("ECtHR") is the court ultimately responsible for applying the European Convention on Human Rights ("ECHR").')

print("Abbreviation", "\t", "Definition")
# Each detected abbreviation is a span; its long form is stored on a custom extension.
for abrv in doc._.abbreviations:
    print(f"{abrv} \t ({abrv.start}, {abrv.end}) {abrv._.long_form}")
>>> "ECtHR" (7, 10) European Court of Human Rights
>>> "ECHR" (25, 28) European Convention on Human Rights
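To give a feel for what abbreviation linking involves, here is a deliberately naive, standard-library sketch. It is not Blackstone's algorithm. It assumes the short form always appears in parentheses and straight double quotes directly after its long form, and it guesses the long form by walking backwards to the nearest word that starts with the short form's first letter. The names `SHORT_FORM` and `find_abbreviations` are hypothetical.

```python
import re

# Matches a parenthesised, quoted short form such as ("ECtHR").
SHORT_FORM = re.compile(r'\(\s*"([A-Za-z]{2,10})"\s*\)')


def find_abbreviations(text):
    """Map each quoted short form to a guessed long form.

    Heuristic: look at a window of words before the parenthesis and stop
    at the nearest word starting with the short form's first letter.
    """
    found = {}
    for match in SHORT_FORM.finditer(text):
        short = match.group(1)
        words = text[: match.start()].split()
        # Limit the search to twice as many words as the short form has letters.
        window = words[-(len(short) * 2):]
        for i in range(len(window) - 1, -1, -1):
            if window[i][0].lower() == short[0].lower():
                found[short] = " ".join(window[i:])
                break
    return found


text = (
    'The European Court of Human Rights ("ECtHR") is the court ultimately '
    "responsible for applying the European Convention on Human Rights "
    '("ECHR").'
)
abbreviations = find_abbreviations(text)
print(abbreviations["ECtHR"])  # European Court of Human Rights
print(abbreviations["ECHR"])   # European Convention on Human Rights
```

The heuristic breaks as soon as a later word in the long form shares the first letter, which is one reason a dedicated pipeline component is worth having.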
Compound Case References:
Blackstone can also detect compound case references, in which a case name is followed by its citation, as used in many common law jurisdictions.
<u>For example, given the text below:</u>
“As I have indicated, this was the central issue before the judge. On this issue the defendants relied (successfully below) on the decision of the High Court in Gelmini v Moriggia [1913] 2 KB 549. In Jones' case [1915] 1 KB 45, the defendant wore a hat.”
Blackstone can identify the following case names:
>>> Gelmini v Moriggia [1913] 2 KB 549
>>> Jones' case [1915] 1 KB 45
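One plausible way to build such compound references, sketched here under assumptions and not necessarily how Blackstone does it, is to join a CASENAME entity with a CITATION entity that immediately follows it. The `Entity` type, the `pair_case_references` function, and the token offsets below are all hypothetical and hand-written for illustration.

```python
from typing import List, NamedTuple


class Entity(NamedTuple):
    text: str
    label: str
    start: int  # index of the first token
    end: int    # index one past the last token


def pair_case_references(entities: List[Entity], max_gap: int = 0) -> List[str]:
    """Join each CASENAME with a CITATION that follows within max_gap tokens."""
    compounds = []
    for current, following in zip(entities, entities[1:]):
        if (
            current.label == "CASENAME"
            and following.label == "CITATION"
            and 0 <= following.start - current.end <= max_gap
        ):
            compounds.append(f"{current.text} {following.text}")
    return compounds


# Hand-written entities with illustrative token offsets (not real model output).
entities = [
    Entity("High Court", "COURT", 30, 32),
    Entity("Gelmini v Moriggia", "CASENAME", 33, 36),
    Entity("[1913] 2 KB 549", "CITATION", 36, 40),
    Entity("Jones' case", "CASENAME", 42, 45),
    Entity("[1915] 1 KB 45", "CITATION", 45, 49),
]
compounds = pair_case_references(entities)
print(compounds)
```

The `max_gap` parameter is there because real text sometimes places a comma or bracket between the name and the citation. Zero requires strict adjacency.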
Legislation Linking:
Blackstone attempts to match each provision it finds to its parent legislation, and then to retrieve the URL for that legislation. This appears to be limited to UK legislation so far, but it is an impressive feature that may be useful to readers in the UK and may provide inspiration for those in other jurisdictions.
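The final URL-building step can be pictured with a small sketch. This is not Blackstone's implementation. It assumes the provision and instrument have already been paired, that the instrument's path on legislation.gov.uk is known from a lookup table, and that section URLs follow the `/section/N` layout used for the Theft Act 1968. `INSTRUMENT_PATHS` and `provision_url` are hypothetical names, and a real linker would need a search step or a far larger index.

```python
import re

# Hypothetical lookup from instrument title to its legislation.gov.uk path.
INSTRUMENT_PATHS = {
    "Theft Act 1968": "ukpga/1968/60",
}

BASE_URL = "https://www.legislation.gov.uk"

# Accepts forms such as "section 1", "s 1" or "s. 1".
SECTION = re.compile(r"(?:section|s\.?)\s*(\d+[A-Z]*)", re.IGNORECASE)


def provision_url(provision, instrument):
    """Build a URL for a section of a known instrument, or return None."""
    path = INSTRUMENT_PATHS.get(instrument)
    match = SECTION.fullmatch(provision.strip())
    if path is None or match is None:
        return None
    return f"{BASE_URL}/{path}/section/{match.group(1)}"


print(provision_url("section 1", "Theft Act 1968"))
print(provision_url("art 2(3)", "European Convention on Human Rights"))  # None
```

Returning `None` for anything outside the lookup keeps the sketch honest about how little it covers. Instruments such as the European Convention on Human Rights would need a different source entirely.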