Harvard Law School Library today announced the launch of the Institutional Data Initiative, a research initiative through which it will work with libraries and museums, government agencies, and others to publish their collections as data. The data can be used for various purposes, including training AI models.
The Institutional Data Initiative (IDI) will first focus on refining a collection of one million public domain books, scanned at Harvard Library. It will also work with the Boston Public Library to make millions of pages from historical newspapers available as data. While these collections consist of long-form text, IDI is looking to partner with others on all forms of data, including scientific and biomedical data.
IDI’s launch is supported by Microsoft and OpenAI. For long-term funding, IDI is planning to work with several philanthropic and industry supporters.
Burton Davis, Vice President and Deputy General Counsel, Microsoft, said the following regarding IDI:
“Microsoft is proud to support the establishment of the Institutional Data Initiative, which will work to increase access to knowledge and high-quality data for all builders of AI. We are committed to enabling broad access to data and empowering a more inclusive AI ecosystem. Since 2020, we have worked to close the data divide, ensuring that every organization has access to the data it needs to innovate and achieve more, which is essential to growing a vibrant, competitive AI economy.”
Microsoft has always believed that everyone can benefit from collaboration around open and available data. In 2020, Microsoft launched the Open Data Campaign to help organizations of all sizes access the data they need to develop AI applications.
Tom Rubin, Chief of Intellectual Property and Content, OpenAI, said the following regarding the IDI launch:
“Academic institutions have long been key partners in artificial intelligence research and progress, and Harvard’s Institutional Data Initiative is a powerful example of this. The public domain plays a vital role in the spread of knowledge and creativity, and OpenAI is delighted to support this effort. We are inspired by Prof. Zittrain’s leadership throughout this important project and are eager to see its impact.”
By making large datasets readily available, IDI is contributing to the advancement of AI technology and its accessibility for all.