
McDonald, John. 2006. A Review of Relevant International Initiatives. http://www.collectionscanada.gc.ca/digital-initiatives/012018-3300-e.html

Summarized by Ernest Hoffman

This report provides an overview of current digital information initiatives and organizations from around the world, together with analysis of how they could inform the development of CDIS. It focuses on national and regional initiatives that have articulated a strategy, assessed the feasibility of a strategy, or constitute a de facto strategy. The report concentrates on 'born digital' information, for two reasons: it assumes that the greatest challenges facing national governments concern information that is created electronically and whose ongoing use and preservation in digital form depend on technologies that will change over time, and approaches to managing born-digital information are in flux.

Most of the initiatives address born-digital data such as government records and scientific research, and so are unrelated to our project. Because we are concerned with the preservation of complete webpages, including news content together with links, comments, social media elements, ads, etc., the areas of the report relevant to our project are web-harvesting policies, systems and practices.

The most developed web-harvesting systems covered in the report are:

The Swedish Royal Library's Kulturarw3 project, which has been harvesting Swedish websites since 1996. Its approach has often been cited as an example of a 'whole domain' or 'comprehensive collection' and is based on the approach taken by the Internet Archive in the U.S. This is the most comprehensive and longest-running program of its kind in the world, and would provide a good starting point for a Canadian online news harvesting program.

The e-Depot of the national library of the Netherlands, an automated system for the ingestion, description, management and long-term storage of electronic publications. Like the Swedish program, it is characterized as more of a storage system than an active preservation system. It was created in collaboration with IBM. Unlike the Swedish program, however, web-harvesting is only one aspect of the e-Depot, which functions as a highly developed automation model for all kinds of records and could provide a good template for large-scale digital acquisition in Canada.

Australia's PANDORA initiative (Preserving and Accessing Networked Documentary Resources of Australia), a national network of distributed archives similar to the TDR network envisioned by the CDIS, with different institutions responsible for different types of information. PANDORA collects both websites and discrete publications, so presumably it is archiving Australian news websites.

The New Zealand National Library has also established a "trusted digital repository" based on the Library's own "digital information strategy", which has among its objectives "to ensure the long-term storage and preservation of New Zealand's online heritage". That heritage would also include online news content.

On the policy and rights side, the report highlights the Danish Royal Library's Digital Policies Framework. The review of existing policies and the development of a Strategic Plan led to the establishment of the "Hybrid Library" initiative. Danish legislation on archiving web pages now allows the Royal Library to harvest published documents without problems of copyright. Both the review of existing policies and the legislation that permits harvesting would be worth looking into.

Outside of rights issues, the main challenge to automated mass-archiving of websites is accessing them later, so the ability to emulate the original hardware and software environment is essential. The report highlights CAMiLEON (Creative Archiving at Michigan & Leeds: Emulating the Old on the New), a joint initiative between the University of Michigan and Leeds University. It ran from 1999 to 2003 and explored various emulation techniques for the preservation of digital information, so it is likely to have something to say about website retrieval and emulation.

Because this report provides only brief descriptions of the initiatives it covers, its main value lies in its links to the most important plans and projects underway around the world. It would be worthwhile to search the domains of the various national libraries, government sites and other organizations listed here for keywords related to web-harvesting, as well as those related to online news in particular. This is especially the case for the British and U.S. digital preservation initiatives, which are too large and complex to have been given detailed treatment in this report, but which no doubt have web-harvesting and news-specific elements within them. The report also contains a list of journals where digital preservation research is published, and these could also be searched with our chosen keywords.