You are currently browsing the tag archive for the ‘emulation’ tag.

Geoffrey Brown of the Indiana University Department of Computer Science has a nice presentation available online about the CIC Floppy Disk Project, which along the way argues the case for emulation. The CIC FDP is intended to make publications deposited with federal US libraries available via FTP over the Web. In many cases this means saving not just the contents of the floppy, but also the applications needed to make the contents readable. One of his diagrams makes the point that emulation results in two separate repositories, one for documents and the other for software.
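Just to make that two-repository point concrete for myself, here is a minimal sketch in Python (every class, field and identifier below is my own invention for illustration, not anything from Brown’s presentation): each preserved document in one store carries a pointer to the software, held in the other store, needed to read it.

    from dataclasses import dataclass, field

    # Illustrative only: two separate stores, one for documents, one for software.
    @dataclass
    class SoftwareItem:
        software_id: str
        name: str         # e.g. "WordPerfect 5.1"
        platform: str     # e.g. "MS-DOS"

    @dataclass
    class DocumentItem:
        document_id: str
        source_floppy: str
        requires: list = field(default_factory=list)  # software_ids needed to read it

    software_repo = {"wp51": SoftwareItem("wp51", "WordPerfect 5.1", "MS-DOS")}
    document_repo = {"doc-001": DocumentItem("doc-001", "floppy-0042.img", requires=["wp51"])}

    def software_needed(doc_id):
        # Resolve a document's dependencies by looking them up in the software repository.
        return [software_repo[s] for s in document_repo[doc_id].requires]

    print([s.name for s in software_needed("doc-001")])  # ['WordPerfect 5.1']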

The project doesn’t appear to be strict emulation, in that some leeway is allowed. For instance, slide no. 16 lists the software necessary for the project, one item of which is Windows 98, even though “most disks were for msdos, Win 3.1”. I take that to mean that while most floppies were created on Win 3.1, they work just as well in Win 98, so let’s use Win 98 instead. Strict emulation theory probably isn’t too happy with that.

Slide 21 is the most interesting as it contains a handy summary of the problems of migration:

  • Loss of information (e.g. word edits)
  • Loss of fidelity (e.g. “WordPerfect to Word isn’t very good”). WordPerfect is one of the apps listed earlier as necessary for their emulation.
  • Loss of authenticity: users of migrated document need access to the original to verify authenticity [AA: but this depends on how you define authenticity, surely?]
  • Not always technically possible (e.g. closed proprietary formats)
  • Not always practically feasible (e.g. costs may be too high)
  • Emulation may be necessary anyway to enable migration.

Dioscuri is (as far as I am aware) the first ever emulator created specifically with long term digital preservation in mind. It is available for download from Sourceforge, and the project’s own website is here.

This awesomely ambitious project began in 2005 as a co-operative venture between the Dutch National Archives and the Koninklijke Bibliotheek. The first working example came out in late 2007. The project has now been subsumed within the European PLANETS project.


Most creators of digital records do not care tuppence about the long term preservation of their documents, which is why people in the digi pres field continually try to raise awareness of the issues.

Which prompts a question – does successful emulation undermine our efforts? If the creators of records believe that someone 75 years from now will create a successful emulator which will run Excel 2003 (say), then there is no pressure on them to create their records now in any other format, is there? Creators can carry on creating records in closed, proprietary formats, to their hearts’ content. Every new report of a successful emulation project is yet another nail in the coffin of trying to persuade creators to use different formats.

Noted from OAIS.

Section 5 of the OAIS model explicitly addresses practical approaches to preserving digital information. The model immediately nails its colours to the migration mast. “No matter how well an OAIS manages its current holdings, it will eventually need to migrate much of its holdings to different media and/or to a different hardware or software environment to keep them accessible” (5.1). Emulation is mentioned later on, but always with some sort of proviso or concern attached.

Migration 

OAIS identifies three main motivators behind migration (5.1.1). These are:

  • keeping the repository cost-effective by taking advantage of new technologies and storage capabilities
  • staying relevant to changing consumer expectations
  • simple media decay.

OAIS then models four primary digital migration types (5.1.3). In order of increasing risk of information loss, they are:

  • refreshment of the bitstream from old media to newer media of the exact same type, in such a way that no metadata needs to be updated. Example: copying from one CD to a replacement CD.
  • replication of the bitstream from old media to newer media, for which some metadata does need updating. The only metadata change is the link between the AIP’s own unique ID and the location of the AIP in storage (the “Archival Storage mapping infrastructure”). Example: moving a file from one directory on the storage to another directory. (A minimal sketch of refreshment and replication follows this list.)
  • repackaging the bitstream in some new form, requiring a change to the Packaging Information. Example: moving files off a CD to new media of a different type.
  • transformation of the Content Information or PDI, while attempting to preserve the full information content. This last one is what we traditionally term “migration,” and it is the one which poses the most risk of information loss.
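A minimal sketch of the two lowest-risk types, in Python (the function names, the storage_map structure and the file paths are my own illustration, not anything defined by OAIS): refreshment copies the bitstream and checks its fixity without touching any metadata, while replication does the same copy but must also update the mapping from the AIP’s identifier to its new storage location.

    import hashlib
    import shutil
    from pathlib import Path

    def sha256(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    # "Archival Storage mapping": AIP identifier -> current storage location.
    storage_map = {"aip-0001": Path("store/old_media/report.wp")}

    def refresh(aip_id: str, replacement_media: Path) -> None:
        # Refreshment: copy the bitstream onto replacement media of the same type.
        # The fixity check confirms the bits are identical; no metadata changes.
        src = storage_map[aip_id]
        replacement_media.mkdir(parents=True, exist_ok=True)
        dst = replacement_media / src.name
        shutil.copy2(src, dst)
        assert sha256(src) == sha256(dst), "bitstream changed during refreshment"

    def replicate(aip_id: str, new_location: Path) -> None:
        # Replication: the same fixity-checked copy, but the ID-to-location
        # mapping must be updated because the AIP now lives somewhere else.
        src = storage_map[aip_id]
        new_location.mkdir(parents=True, exist_ok=True)
        dst = new_location / src.name
        shutil.copy2(src, dst)
        assert sha256(src) == sha256(dst), "bitstream changed during replication"
        storage_map[aip_id] = dst  # the only metadata that changes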

In practice there might be mixtures of all these. Transformation is the biggie, and section 5.1.3.4 goes into it in some detail. Within transformation you can get reversible transformation, such as replacing ASCII codes with UNICODE codes, or using a lossless compression algorithm; and non-reversible transformation, where the two representations are not semantically equivalent. Whether a non-reversible transformation has preserved enough information content may be difficult to establish.
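The reversible/non-reversible distinction is easy to demonstrate (this is my own toy example, not anything prescribed by OAIS): an ASCII-to-Unicode re-encoding or a lossless compression can be undone bit for bit and checked automatically, whereas a lossy transformation offers no such guarantee.

    import gzip

    original = b"Quarterly report, 1987"   # an ASCII bitstream

    # Reversible transformation 1: re-encode the ASCII text as UTF-16.
    as_unicode = original.decode("ascii").encode("utf-16")
    assert as_unicode.decode("utf-16").encode("ascii") == original

    # Reversible transformation 2: lossless compression.
    compressed = gzip.compress(original)
    assert gzip.decompress(compressed) == original

    # A non-reversible transformation (say, rendering a word-processor file to
    # plain text) discards information that cannot be reconstructed, so whether
    # enough content survives has to be judged rather than verified.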

Because the Content Information has changed in a transformation, the new AIP qualifies as a new version of the previous AIP. The PDI should be updated to identify the source AIP and its version, and to describe what was done and why (5.1.4). The first version of the AIP is referred to as the original AIP and can be retained for verification of information preservation.
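As a sketch of the bookkeeping involved (the dictionary layout is my own, not OAIS’s): when a transformation produces a new AIP version, its PDI gains a provenance entry pointing back at the source AIP and version, recording what was done and why, while the original AIP is kept untouched.

    from datetime import date

    # Illustrative AIP records; a real PDI would be far richer than this.
    original_aip = {"aip_id": "aip-0001", "version": 1, "pdi": {"provenance": []}}

    def transform(source_aip: dict, action: str, reason: str) -> dict:
        # Create a new AIP version whose PDI records the source AIP, its
        # version, what was done and why (cf. OAIS 5.1.4).
        entry = {
            "source_aip": source_aip["aip_id"],
            "source_version": source_aip["version"],
            "action": action,
            "reason": reason,
            "date": date.today().isoformat(),
        }
        return {
            "aip_id": source_aip["aip_id"],
            "version": source_aip["version"] + 1,
            "pdi": {"provenance": source_aip["pdi"]["provenance"] + [entry]},
        }

    new_version = transform(original_aip,
                            action="converted WordPerfect file to PDF/A",
                            reason="original format no longer renderable in-house")
    # original_aip is retained unchanged, for verification of information preservation.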

The OAIS Model also looks at the possibility of improving or upgrading the AIP over time. Strictly speaking, this isn’t a transformation, but is instead creating a new Edition of an AIP, with all its own associated metadata. This can be viewed as a replacement for a previous edition, but it may be useful to retain the previous edition anyway.

There’s also a Derived AIP, which could be a handy extraction of information aggregated from multiple AIPs. But this does not replace the earlier AIPs.

Emulation 

All that is fine for pure data. But what if the look and feel needs preserving too?

The easy thing to do in the short to medium term is simply to pay techies to port the original software to the new environment. But OAIS points out that there are hidden problems. It may not be obvious when the app runs that it is functioning incorrectly, and testing all possible output values is unlikely to be cost effective for any particular OAIS. Commercial bridges – commercially provided conversion software packages that transform data to other forms with similar look and feel – suffer from the same problems, and in addition give rise to potential copyright issues.
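In practice the best anyone seems able to do is spot-check the port against outputs recorded from the original while it still ran, something like the sketch below (the executable, sample files and digests are all hypothetical). The weakness is obvious: only the sampled inputs are ever compared, so an error anywhere else goes unnoticed.

    import hashlib
    import subprocess
    from pathlib import Path

    def output_digest(executable: str, input_file: Path) -> str:
        # Run the (hypothetical) ported application on one input file and
        # hash whatever it writes to standard output.
        result = subprocess.run([executable, str(input_file)],
                                capture_output=True, check=True)
        return hashlib.sha256(result.stdout).hexdigest()

    # Digests recorded from the original application before it was retired.
    reference = {
        "sample-01.dat": "9f2c...",   # placeholder values, not real digests
        "sample-02.dat": "b41e...",
    }

    def spot_check(ported_app: str, samples_dir: Path) -> list:
        # Compare the port against the recorded outputs for a handful of
        # sample inputs; anything outside the sample set goes untested.
        return [name for name, expected in reference.items()
                if output_digest(ported_app, samples_dir / name) != expected]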

“If source code or commercial bridges are not available and there is an absolute requirement for the OAIS to preserve the Access look and feel, the OAIS would have to experiment with “emulation” [sic] technology” (5.2.2).

Emulation of apps has even more problems than porting. If the output isn’t visible data but is something like sound, then it becomes nearly impossible to know whether the current output is exactly the same as the sound made 20 years ago on a different combination of app and environment. We would also need to record the sound in some other (non-digital!) form, to use as validation information.

A different approach would be to emulate the hardware instead. But the OAIS model has an excellent paragraph summarising the problems here, too, which I’ll quote in full (in 5.2.2.2):

 “One advantage of hardware emulation is the claim that once a hardware platform is emulated successfully all operating systems and applications that ran on the original platform can be run without modification on the new platform. However, this does not take into account dependencies on input/output devices. Emulation has been used successfully when a very popular operating system is to be run on a hardware system for which it was not designed, such as running a version of Windows™ on an Apple™ machine. However even in this case, when strong market forces encourage this approach, not all applications will necessarily run correctly or perform adequately under the emulated environment. For example, it may not be possible to fully simulate all of the old hardware dependencies and timings, because of the constraints of the new hardware environment. Further, when the application presents information to a human interface, determining that some new device is still presenting the information correctly is problematical and suggests the need to have made a separate recording of the information presentation to use for validation. Once emulation has been adopted, the resulting system is particularly vulnerable to previously unknown software errors that may seriously jeopardize continued information access. Given these constraints, the technical and economic hurdles to hardware emulation appear substantial.” 

Top stuff.

Notes from Borghoff et al. Emulation has some notable advantages over migration, not least that it guarantees the greatest possible authenticity. The document’s original bitstream will always remain unchanged. All (!) we have to do is make sure that a working copy of the original app is available. As it’s impossible to keep the hardware running, we have to emulate the original system on new systems.

In theory there are no limitations on the format of the record – even dynamic behaviour should be preserved OK. But there are three massive worries with emulation: (a) can it be achieved at reasonable cost? (b) is it possible to resolve all the copyright and legal issues involved in running software programs over decades? and (c) will the human-computer interface of the long term future be able to cope with the mouse-and-keyboard interface of today’s applications? The only realistic way to answer (c) would be to create a “vernacular copy” (p.78), but this strikes me as migration under a different name – just my own thought.


This is running a program on a future computer which makes it emulate the hardware of an older computer, enabling software written for that older computer to run, so nothing ever needs to be migrated. Writing such software is not trivial (although in theory it only needs to be done once). AA: I think this is the Macintosh Lisa project?? You can also emulate specific applications (AA: I’m thinking of Marathon here), but this involves a huge amount of reverse-engineering if the format is proprietary. Strictly speaking, you are not actually running the original program at all, just something pretending to be the program (I’m not running Marathon, I’m running Aleph One).

Presumably the DOS command window in today’s Windows systems is an OS emulator. (Rich says it is.) And you can run the original 1981 VisiCalc spreadsheet software on it – as far as I’m aware the code used is still the original code, not a rewrite (http://www.bricklin.com/history/vcexecutable.htm accessed 28.11.2007).

Emulation is attractive, as in theory it captures all aspects of the original file – the content, the formulae, their relationships, the behaviour, the appearance. But it is very difficult, not least because it all has to be worked out while the original platform is still active. You then need to preserve the emulator, the OS, the application installation files and the records. (So you need to remember to keep them.) It’s probably too much work for individual files like spreadsheets. It’s worth noting that emulators have only really been written for games, not for spreadsheet programs.
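One way to “remember to keep them” is to write the dependencies down in a manifest stored alongside the records themselves. A minimal sketch (the file names and layout below are my own invention, not any standard):

    import json
    from pathlib import Path

    # Everything needed to re-run the records under emulation, kept together.
    manifest = {
        "records": ["budget-1998.xls", "forecast-1999.xls"],
        "application": "excel97_setup/",           # installation files for the app
        "operating_system": "win98_install.iso",   # the OS the app expects
        "emulator": "pc-emulator-src.tar.gz",      # the emulator itself, ideally with source
        "documentation": ["install-notes.txt"],    # how to put it all back together
    }
    Path("emulation-package.json").write_text(json.dumps(manifest, indent=2))

    # A later curator can read the manifest back and check nothing has gone missing:
    kept = json.loads(Path("emulation-package.json").read_text())
    missing = [f for f in kept["records"] if not Path(f).exists()]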

Brief article about the BBC Domesday project in TNA’s RecordKeeping for Autumn 2004.

Original project

The original project was only possible because of a Government programme which had put a BBC Micro into every school in the country by the mid-1980s, creating a user base of compatible computers. School children in 1986 entered their own data onto their school computers, and the data was copied onto floppy disks or tapes and sent to the BBC. All this text and these images, together with analogue photographs of OS maps, were transferred to analogue videotape. The community data finally totalled 29,000 photographs and 27,000 maps. The whole database was then assembled on master videotapes from which the final videodiscs were produced. The monitor was usually a TV, which imposed a limit on the level of detail visible at once: users needed to switch between maps, pictures and text.

Restoration project

There were a number of parallel rescue projects but the one which actually worked was a collaboration between TNA, BBC and others. It did not rescue data from the videodiscs, but from the master tapes.

Independently, LongLife Data Ltd had developed a new PC interface to the community data. It works in the same way as the real one but because a modern monitor has higher resolution than a 1980s TV screen, pictures and text can be shown simultaneously. This is the version now available on the web.

Alan’s thoughts

  • the data was restored from analogue videotapes, not from the videodiscs or from the submitted floppy disks. After 15 years the tapes were still readable. So in a sense it’s a straightforward media refreshing thing.
  • the new interface is not an exact emulation of the old interface. It is a wholly new app. The current browsing experience has therefore lost authenticity. (Though the data is the same.)
  • can we find out anything about the authenticity of the data itself?

1. Emulation requires technical software knowledge, i.e. it requires techies, which in turn means it requires money. Migration, however, can be done in-house by suitably trained staff.

2. According to strict emulation theory, we should open last year’s documents on an emulation of last year’s system. But of course no one does that: they just migrate the doc up to the new version of the app.

3. Emulation seems to mean different things at different times. It could mean

  • same functionality as the original app, but not the appearance. This is true of (say) the web-based Enigma code machine emulators. They don’t emulate the wooden box appearance of the original, just the process. Or it could mean
  • different functionality, same appearance. This is true of Marathon, and many other game emulations I imagine.

The Marathon Open Source Project. Perhaps more complicated than I thought. I think the chronology is:

  • Marathon released on Mac, 1994
  • Marathon 2 released on Mac, using a different game engine, 1995
  • Bungie release the source code for M2’s game engine, 1999
  • Aleph One released, a new game engine based on Bungie’s source code
  • M1A1 uses the Aleph One game engine but the original Marathon texture files etc, 2002
  • Whole thing gets ported to Windows.
  • Bungie make whole trilogy available as freeware, 2004

Hmm. Aleph One is actually better code than the original, since it includes loads of enhancements, as Wikipedia points out:

“A number of aesthetic additions to Marathon Infinity have been developed. In early 2000, OpenGL rendering support was added, which at the preference of the user could smooth walls, landscapes, monsters, items and weapons to give them less of a pixelated appearance. Additional features using OpenGL include translucent media (allowing for translucent liquids) and colored fog. As time progressed, anisotropic filtering replaced smoothing and the addition of z-buffer increased game performance. Aleph One supports higher screen resolutions than Marathon Infinity and can use external background tracks in MP3 format. Though not heavily emphasized, there is support for three-dimensional models.

Though many of the changes are sensory, some involve greater engine capabilities. More than twice as many polygons can be drawn on the screen at a single time as Marathon Infinity and viewing distances can be far larger. Lighting effects can be more advanced and more polygons with transparent edges can be viewed in a single frame, allowing for structures such as pyramids and incredibly tall staircases. Though it is currently not supported, early versions of Aleph One were able to accomplish truly three-dimensional polygons, allowing for real bridges and balconies as opposed to just creating illusory 3D with overlapping polygons. The maximum number of creatures a level can hold is three hundred and the sprite-drawing capabilities of Aleph One are far superior to those of Marathon Infinity. Controls have been slightly expanded as well. Aleph One has an option that allows interchanged running and walking, as well as sinking and swimming in liquids. The mouse can be used more effectively and its sensitivity can be set. If desired, weapon switching may be disabled.”

Marathon Markup Language

Wikipedia again: “In 2000, support for a markup language which would eventually be called the Marathon Markup Language or MML for short was added. MML files can set things such as file names, weapons order, the colors of the automap feature, transparency of certain sprites and other things. One of the most frequent uses of this language is for installing high-resolution wall and weapon textures for play.”

AA: is Aleph One actually an emulation, then?