You are currently browsing the monthly archive for January 2008.

Article by Steve Bailey of JISC, in JSA vol 28 no 2, October 2007.

Bailey says that much of the debate about digi pres has been led by technical issues, such as media degradation, bitstream preservation, and emulation vs migration arguments. “Far less attention has been paid to developing technologies for deciding what of the vast volume of information we create must be kept, for what purpose, and for how long” (p119). This means that selection and appraisal become crucial. You would think that, in a world where massive volumes of information are created, the ability to sort the wheat from the chaff would be valued, but this does not seem to be the case. Indeed the trend seems to be to keep everything, and Google for what you want, rather than classify and file. Bailey suggests that this sounds the death-knell for EDRM systems.

He also points out that archivists have long had a problem with physical preservation anyway. Paper and parchment preservation problems are dealt with by conservators, not by archivists, few of whom understand the hard science of the subject. And even before e-records were invented, archivists had decided to deal with non-direct media, such as video or sound recordings, by treating them as “special collections” cases. “Digital records are perhaps the latest and most extreme example of our reluctance to get our professional hands dirty” (p119). The difference now is that digital records are simply too ubiquitous for archivists to fob off onto someone else.

Review in JSA vol 28 no 2, October 2007, by Caroline Shenton.

Brown’s purpose is to provide a broad overview of the subject, aimed at policy makers and webmasters, although Shenton points out that this book would be useful to ICT professionals too. Brown avoids discussing the details of technical methodologies, in order to prevent his book from becoming quickly outdated. Brown covers aspects such as the models and processes for selection, the main methods of web archiving, QA and cataloguing issues, legal issues, and some speculations about the future. Shenton thinks the only real aspect which Brown has missed is the issue of cost. Overall she rates it highly, so I’ll need to add this one to my reading list.

Noted from OAIS. It strikes me that the concept of the Designated Community is central to how an OAIS even begins to think about its digital preservation. No one is saving records just for fun. They save records so that someone else will consult them at a later date. How we define ‘someone else,’ together with their interests and concerns, determines what features we need to preserve.

The atomic unit here is the Consumer, which is defined in the Model (1.7.2) as “those persons or client systems who interact with OAIS services to find preserved information of interest and to access that information in detail. This can include other OAISs as well as internal OAIS persons or systems.” The Consumer is the entity which receives a DIP.

Digital Preservation (Digital Futures Series) (Hardcover), by Marilyn Deegan (Editor), Simon Tanner (Editor). Hardcover: 260 pages; Publisher: Facet Publishing (18 Sep 2006); ISBN-10: 1856044858. Available at Amazon.

This is the most recent book published in the UK on digital preservation, and if I can speak from a parochial viewpoint for a bit, it’s nice to have a UK slant on things, with details given about UK projects. This means that Digital Preservation contains some practical information which is not present in Borghoff et al.

Article by Jeffrey Darlington in TNA’s RecordKeeping magazine, Summer 2004. Nearly 4 years old now, of course.

Digital Archive

TNA established a Digital Preservation Dept to preserve the increasing number of born-digital records which UK government departments were creating, and to offer guidance on digipres issues to the wider community. In April 2003 TNA’s Digital Archive was launched. The Digital Archive “uses open standards and technologies wherever possible, including extensive use of Java and XML. The system stores electronic records with their associated preservation metadata.” The DA can store WP docs, emails, websites, sound, video and databases.

A brief report on this appeared in ARC 205, Sept 2006.

The working party was set up following a conference in Nov 2005 and first met in April 2006. Some worrying trends and issues became apparent, including:

  • all councils represented on the WP were implementing EDRMS, or were planning to, but not all archives services were properly involved
  • no archives service had set up guidelines for managing e-accessions or for advising creators or depositors about digi pres
  • it is difficult to engage archives colleagues in discussions about e-records, which is due to a number of factors, including lack of IT knowledge and the current cultural/political focus on outreach and education
  • archives services have such limited resources that development of a selection policy is de-prioritised
  • authenticity of records (outside an EDRMS) is a challenge
  • some services even doubt their capacity to collect digital records due to patchy ICT provision, under-resourcing and skill shortages.

The same would be true of other regions, I imagine?

I have added a new page called “So you want to keep all your stuff?”, which is aimed at the home PC user. I’ve tried to make it as straightforward as possible, in the hope that anyone who stumbles across it will get a useful guide on how to preserve their own documents.

Any comments about how to improve it, or make it clearer, are very welcome.

Noted from OAIS. Representation Information is a crucial concept, as it is only through our understanding of the Representation Information that a Data Object can be opened and viewed. The Representation Information itself can only be interpreted with respect to a suitable Knowledge Base.

The Representation Information concept is also inextricably tied in with the concept of the Designated Community, because how we define the Designated Community (and its associated Knowledge Base) determines how much Representation Information we need. “The OAIS must understand the Knowledge Base of its Designated Community to understand the minimum Representation Information that must be maintained… Over time, evolution of the Designated Community’s Knowledge Base may require updates to the Representation Information to ensure continued understanding” (2.2.1).
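
To make that dependency concrete for myself, here is a minimal Python sketch. It is my own illustration and not anything defined in the OAIS model: the function name, the sample data object and the two communities are all hypothetical. It treats the Representation Information we must hold as whatever the Data Object requires minus whatever the Designated Community's Knowledge Base already covers.

```python
def representation_info_needed(required, knowledge_base):
    """Return the concepts needed to interpret a Data Object that the
    Designated Community's Knowledge Base does not already cover.
    (Hypothetical helper, not part of the OAIS model.)"""
    return set(required) - set(knowledge_base)


# Hypothetical Data Object: a CSV file of parish register entries.
required = {"CSV format", "UTF-8 encoding", "Latin abbreviations"}

# Two hypothetical Designated Communities with different Knowledge Bases.
specialists = {"CSV format", "UTF-8 encoding", "Latin abbreviations"}
general_public = {"CSV format"}

ri_specialists = representation_info_needed(required, specialists)
ri_public = representation_info_needed(required, general_public)

# If the Knowledge Base shrinks over time (say CSV falls out of common
# use), the Representation Information has to grow to compensate.
later_specialists = specialists - {"CSV format"}
ri_later = representation_info_needed(required, later_specialists)

print(sorted(ri_specialists))
print(sorted(ri_public))
print(sorted(ri_later))
```

Under these assumptions the specialists need no extra Representation Information, the general public needs two items, and the later specialists need one. That is the same pattern the quote from 2.2.1 describes: change the community, or let its knowledge drift, and the minimum Representation Information changes with it.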

Notes from Borghoff et al. Emulation has some notable advantages over migration, not least that it guarantees the greatest possible authenticity. The document’s original bitstream will always remain unchanged. All (!) we have to do is make sure that a working copy of the original app is available. As it’s impossible to keep the hardware running, we have to emulate the original system on new systems.

In theory there are no limitations on the format of the record: even dynamic behaviour should be preserved OK. But there are three massive worries with emulation: (a) can it be achieved at reasonable cost? (b) is it possible to resolve all the copyright and legal issues involved in running software programs over decades? and (c) will the human-computer interface of the long-term future be able to cope with the mouse-and-keyboard interface of today’s applications? The only realistic way to answer (c) would be to create a “vernacular copy” (p.78), but this strikes me as migration under a different name (just my own thought).
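
The “unchanged bitstream” point can be shown with a checksum comparison. This is a sketch of my own, not something taken from Borghoff et al., and the sample bytes and variable names are made up for illustration. A copy kept for emulation has the same SHA-256 digest as the original, whereas a migrated copy, whose bytes have been rewritten into a new form, does not.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a bitstream."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical record contents, standing in for a real file's bytes.
original = b"example record bitstream"
recorded_digest = sha256_of(original)

# Emulation keeps the bytes as they are and changes the environment.
emulation_copy = bytes(original)

# Migration rewrites the bytes into a new representation
# (simulated here by a trivial transformation).
migrated_copy = original.upper()

emulation_unchanged = sha256_of(emulation_copy) == recorded_digest
migration_unchanged = sha256_of(migrated_copy) == recorded_digest

print(emulation_unchanged, migration_unchanged)
```

A matching digest only shows that the bits are intact. It says nothing about whether anyone can still open the record, which is where the three worries above come in.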

Noted from the OAIS model.

The OAIS model generally is not prescriptive, but it contains one section (3.1) where it lays out the responsibilities that an organisation must discharge in order to operate as an OAIS. These are:

1. Negotiate with Information Producers and accept appropriate information from them. This is simply the digital equivalent of what any record office does, though an OAIS in practice needs to gather much more information about a given accession, for PDI purposes.

2. Obtain sufficient control of the information to the level needed to ensure long term preservation. In a paper archive this is largely (a) keeping the stuff in a box and (b) capturing any access, copyright and legal restrictions as necessary. In a digital repository there is (c) the need to capture all the technical metadata for PDI purposes too. There may be additional legal issues as well, concerning authenticity, software copyright etc. “It is important for the OAIS to recognize the separation that can exist between physical ownership or possession of Content Information and ownership of intellectual property rights in this information” (3.2.2). The OAIS in practice may need to obtain authority to migrate Content Information to new representation forms.

3. Determine which groups should become the Designated Community able to understand the information. This is a more important task in a digital archive than a paper one, because how we define the DC determines what sort and level of Representation Information we need to keep alongside the Content data. The DC may change over time. OAIS suggests (3.2.3) that selecting a broader rather than a narrower definition helps long term preservation, as it means that more detailed RI is captured at an early stage, rather than leaving it until later.

4. Ensure that the preserved information is independently understandable to the DC, so that no further expert assistance is needed. [AA: this is an interesting point as paper repositories often work in the opposite way: the DC is so large (“the general public”) that a searchroom has to employ professional archivists and well-trained archive assistants to be on hand to explain the documents to the visitor.] The quality of being “independently understandable” will change over time. This means that RI will have to be updated as the years go by, even if the DC itself does not change.

5. Follow documented policies and procedures to ensure that (a) the information can be preserved against all reasonable contingencies, and (b) the information can be disseminated as authenticated copies of the original or as traceable back to the original. Section 3.2.5 suggests that these policies should be available to producers, consumers and any related repositories, and that the DC should be monitored so that the Content Information is still understandable to them. An OAIS should also have a long term technology usage plan.

6. Make the preserved data available to the DC. An OAIS should have published policies on access and restrictions, so that the rights of all parties are protected.