11 publicly accessible legal datasets

Here are 11 publicly accessible legal datasets for practicing and building your AI skills, no client data required.

Note: Always check copyright status and licensing terms before using any dataset in a commercial product. I'm not an IP lawyer, and this is a fast-moving area. If you have specific questions about copyright status or licensing for your use case, consult one.

  1. The Atticus Project (CUAD & ACORD): 500+ annotated commercial contracts plus 126,000 expert-rated clause pairs. Built by lawyers, for lawyers. https://www.atticusprojectai.org/datasets/
  2. LEDGAR: 80,000 contract provisions from public SEC filings. https://huggingface.co/datasets/coastalcph/ledgar
  3. ContractNLI: 607 annotated NDAs from Stanford. Clean and structured, free under Creative Commons. https://stanfordnlp.github.io/contract-nli/
  4. Stanford MCC (Material Contracts Corpus): 1 million+ real contracts from SEC filings dating back to 2000. mcc.law.stanford.edu
  5. Caselaw Access Project (Harvard): Millions of U.S. court decisions spanning centuries, structured and API-accessible. https://case.law/
  6. CourtListener by the Free Law Project: Full-text federal and state opinions with a robust API for building on top of real decisions. https://www.courtlistener.com/help/api/
  7. GovInfo Federal Court Decisions: Official federal court opinions from the U.S. Government Publishing Office, with an API connector. https://www.govinfo.gov/app/collection/uscourts
  8. ICC Case Transcripts: Public transcripts from International Criminal Court proceedings. Useful for international law, evidence, and witness examination workflows. https://www.icc-cpi.int/case-transcripts
  9. LexGLUE: Multiple legal datasets bundled in one place. Great for exploring across task types. https://huggingface.co/datasets/coastalcph/lex_glue
  10. GAO Bid Protest Decisions: 5,000+ bid protest decisions from the Government Accountability Office, structured and ready to use. https://github.com/KMisener90/GAO-Bid-Protest-Dataset
  11. CBCA Decision Dataset: Civilian Board of Contract Appeals decisions for government contract claims. https://github.com/KMisener90/CBCA-Decision-Dataset-2007-8.27.2025
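
If you want to go from a link in this list to data in a notebook, here is a minimal Python sketch for the Hugging Face entries. It assumes you have the `datasets` library installed and network access, and that LexGLUE exposes LEDGAR as a task named `ledgar`. The helper names `hf_repo_id` and `load_lexglue_task` are made up for this example and are not part of any library.

```python
from urllib.parse import urlparse


def hf_repo_id(url: str) -> str:
    """Turn a Hugging Face dataset page URL into an "<owner>/<name>" repo id."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Dataset pages look like /datasets/<owner>/<name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def load_lexglue_task(task: str = "ledgar", split: str = "train"):
    """Load one LexGLUE task (assumed task name: "ledgar").

    Needs `pip install datasets` and network access.
    """
    # Imported lazily so hf_repo_id works even without the library installed.
    from datasets import load_dataset

    repo_id = hf_repo_id("https://huggingface.co/datasets/coastalcph/lex_glue")
    return load_dataset(repo_id, task, split=split)
```

Calling `load_lexglue_task("ledgar")` should give you a dataset you can index into and inspect row by row. Check the dataset card for the exact field names and license before building on it.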