In Re: Federal Cases
README FILE
This file is http://bulk.resource.org/courts.gov/0_README.html and was last revised on Mon Feb 11 08:19:53 PST 2008. This directory contains:
-
The b subdirectory contains books. See the index file for more information.
-
The c subdirectory contains 1,858 volumes of case law converted to XHTML.
-
The f subdirectory contains our initial release of some experiments in ultrafiche scanning. See the readme for more information.
-
The hein subdirectory contains the Federal Cases, donated by William S. Hein & Co. See the readme for more information.
-
The pacer subdirectory contains documents recycled from the U.S. PACER system. See pacer.resource.org for more information.
The c subdirectory contains the initial release of case law, and the data will be extensively reformatted. If you are going to mirror this data, please grab the tarballs, not the individual files. The data are available using the following protocols:
-
Via FTP at ftp://bulk.resource.org/courts.gov/
-
Via HTTP at http://bulk.resource.org/courts.gov/
-
Via rsync at rsync://bulk.resource.org/courts.gov/
-
Via BitTorrent at http://torrent.resource.org/ [not yet fully operational]
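For anyone mirroring over rsync, the request to fetch tarballs rather than individual files could be expressed with include/exclude filters. The following is a minimal sketch, not an official mirroring script: the function name build_mirror_command and the tarball pattern "*.tar*" are assumptions for illustration, so check the actual file names in the directory listing before relying on the pattern.

```python
def build_mirror_command(dest_dir, tarball_pattern="*.tar*",
                         source="rsync://bulk.resource.org/courts.gov/"):
    """Build an rsync argument list that copies only tarballs.

    tarball_pattern is an assumed naming convention, not something
    this README specifies; adjust it to match the real file names.
    """
    return [
        "rsync",
        "--archive",            # recurse and preserve times/permissions
        "--prune-empty-dirs",   # skip directories that hold no tarballs
        "--include=*/",         # descend into every subdirectory
        "--include=" + tarball_pattern,
        "--exclude=*",          # drop everything that did not match above
        source,
        dest_dir,
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine with rsync installed. The filter order matters, because rsync applies the first rule that matches each file.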
A few comments about the formatting of the case law:
-
We started with bulk data from Fastcase. The raw data is in the c/raw subdirectory, and you are welcome to use it.
-
You can use our transformation code without restriction. The alpha release is in the raw/code subdirectory.
-
All files are stamped with a CC0 label and we are asserting these are Works of the United States Government. You can learn more about CCZERO on the Creative Commons wiki.
-
A SHA-1 hash is computed on all files and can be viewed in the individual volume indices. It would be better if the United States government signed their own opinions, but this intermediate step allows you to verify the integrity of the files.
-
The c/css subdirectory contains the core case and print style sheets and an index style sheet.
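The SHA-1 integrity check mentioned above can be reproduced locally. This is a minimal sketch: the function names are hypothetical, and it assumes you have copied the hex digest for a file out of its volume index by hand, since the layout of those indices is not described here.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare a local file against a digest copied from a volume index."""
    return sha1_of_file(path) == published_hex.strip().lower()
```

A mismatch means the local copy differs from the file that was hashed, for example because of a truncated download.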
Developers and members of the public are invited to submit their comments to us about transformation issues. The preferred method of communication is an open mailing list, which can be reached at:
-
http://groups.google.com/group/open-case-law
-
open-case-law@googlegroups.com
Known bugs in the case law transformation process include:
-
The initial data we received was very rough, and we still have significant problems recognizing metadata; parsing broken paragraphs, footnotes, and other mangled markup; and parsing files with missing markup.
-
Creative Commons still has CC0 URLs on their lab system. We will move towards a stable URL in the future for CC0 assertion.
-
Slip opinions do not always have full page citations to current opinions. We intend to reparse the Supreme Court opinions in particular so that they have full citations.
-
Citation marking with <cite> tags and insertion of pagination are intended for future releases.
-
Short case names are missing and will be added by comparing against parallel collections.
Carl AT media.org for Public.Resource.Org