Harvard Law School Library Innovation Lab Launches U.S. Federal Data Vault

The Harvard Law School Library Innovation Lab published more than 311,000 datasets as part of a new collection on Thursday. By Julian J. Giordano
By Caroline G. Hennigan and Bradford D. Kimball, Crimson Staff Writers

The Harvard Law School Library Innovation Lab published a vast collection of federal datasets on Thursday, preserving them as part of its newly established data vault project.

The collection contains more than 311,000 datasets, which were archived from data.gov, federal GitHub repositories, and the National Institutes of Health’s online paper database. The release comes as many federal sites, particularly those referencing diversity, equity, and inclusion, have been taken offline after President Donald Trump’s recent slate of executive orders.

“This isn’t about the change in administration specifically, but the change in administration does offer a great example of why it’s so valuable to preserve things and for citizens to preserve things,” LIL Director Jack Cushman said.

The datasets, which range from the national death index to fruit and vegetable prices, are intended to be used for academic research, policymaking, and general public use.

“I really encourage anyone to go to data.gov and just click most of the datasets, because it’s such a great way to see the breadth of what it is that the government collects,” Cushman said.

The program is a continuation of LIL’s previous efforts to preserve online materials. In 2013, the group released Perma.cc, a tool that helps archive sites and documents that are linked to or cited in legal filings.

In addition to the data, LIL will release the software used to preserve it.

This isn’t the first large data release from LIL in recent months. In December, LIL’s Institutional Data Initiative announced plans to expand the public domain data available for training AI models.

“We’re finding those things that libraries have been doing for hundreds of years and bringing them to the places where people access information now, whether it’s through an AI tool, through search engines, through writing software,” Cushman said.

“People are getting information in very different ways now, and we want to make sure that they’re still keeping the stuff that matters and able to use it to empower themselves,” he added.

Amanda Watson, HLS’s assistant dean for library and information services, said in a Thursday press release that the project continued the longstanding values of the Law School Library.

“This project isn’t just about investing in technology, it’s about upholding our fundamental belief that government information belongs to the public,” she said in the release.

—Staff writer Caroline G. Hennigan can be reached at caroline.hennigan@thecrimson.com. Follow her on X @cghennigan.


—Staff writer Bradford D. Kimball can be reached at bradford.kimball@thecrimson.com.
