Permanence of the web, part 2

May 20, 2013 | By Jim Stovall | Filed in: news web sites.

A number of years ago I called the local newspaper office because I was interested in getting a copy of a picture of an old and prominent local building in its early years. I spoke to the managing editor, a good friend, who told me with some embarrassment that the newspaper didn’t have any pictures like that.

Surely, I said, this building was in many news events and would have appeared in some photo that the newspaper had kept.

Nope, he said. It wasn’t there. He sounded as if he had had this conversation before.

Some years before that, he said, the man who was then managing editor had decided that the newspaper did not need to keep a lot of old photos, ones that he knew they would never use again. So he had them thrown out. It saved money and space.

How could that have happened, I wondered. How could the man have been so stupid, so dense? Didn’t he understand that part of the newspaper’s role is to preserve an archive that helps the community understand its history – its “story,” if you will?

Obviously, he didn’t.

That’s one of the points that Victoria McCargar, a senior editor of the Los Angeles Times, touches on in her Seybold Report essay (Feb. 9, 2005) about the disappearance of data from digital files.

The problem with funding archives, moreover, is that it’s difficult for budgeters to see a return on investment. While digital preservation costs are still mostly a matter of speculation, most researchers agree that it will be expensive. True, some news archives generate a modest revenue stream from reselling old images and articles in new digital forms, but beyond that, publishers and chief financial officers aren’t necessarily willing to spend money to meet some vaguely perceived obligation to maintain a record of history in the making.

McCargar’s article goes more deeply into the dangers to digital data than just the neglect (and occasional stupidity) of the keepers of the data. It serves as a wake-up call to those of us who have blithely identified “permanence” as one of the major assets of the web. True, digital data are not subject to the deteriorating effects of air, light and time, but McCargar points out other dangers. Some of these are obvious; others less so.

Software obsolescence. All of us have run into this problem from time to time. We suddenly find that our new software won’t open our old files, or that our old software isn’t compatible with another piece of software we often use. McCargar mentions WordStar, the most widely used word processing program of the late 1970s and 1980s. No current word processing program will open a WordStar file.

Hardware obsolescence. Who remembers punch cards? 8-inch floppy disks?

Inadequate metadata. Metadata is the information about information, and you can think of it as technical and content-related. The technical describes the file in which the data reside. The content-related describes the information itself. We do not always do a good job of providing either of these kinds of information about a file, and consequently, we lose track of what is in the file and how it is set up. This is the age-old indexing problem that librarians and archivists have always struggled with.
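To make the two kinds of metadata concrete, here is a minimal sketch in Python of what a record for one archived file might look like. Everything in it is an assumption made for illustration: the class names, the field names and the sample values are hypothetical and do not come from McCargar's article or from any real archive system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TechnicalMetadata:
    """Describes the file in which the data reside (hypothetical fields)."""
    file_name: str
    file_format: str      # for example "TIFF"
    created_with: str     # software that produced the file
    size_bytes: int


@dataclass
class ContentMetadata:
    """Describes the information itself (hypothetical fields)."""
    headline: str
    creator: str
    date_created: date
    keywords: list[str] = field(default_factory=list)


@dataclass
class ArchiveRecord:
    technical: TechnicalMetadata
    content: ContentMetadata

    def missing_fields(self) -> list[str]:
        """Return the names of fields that were left empty."""
        gaps = []
        for part in (self.technical, self.content):
            for name, value in vars(part).items():
                if value in ("", None, [], 0):
                    gaps.append(name)
        return gaps


# Invented sample values: a photo filed without its software, creator or keywords.
record = ArchiveRecord(
    technical=TechnicalMetadata("courthouse.tif", "TIFF", "", 2_400_000),
    content=ContentMetadata("Courthouse fire", "", date(1975, 3, 2)),
)
print(record.missing_fields())
```

The empty fields that the last line reports are the kind of gap that makes a file hard to find, or hard to open, years later.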

Lack of standards and best practices. Digital data are still so new that we are still trying to understand them. The development of a standard set of practices is only beginning.

Lack of institutional discipline. Few journalists give much thought to preserving what they have done from day to day. They figure that is someone else’s job. Their job is to meet their deadlines and produce their publications, broadcasts or web sites. In a sense, they’re right, but McCargar argues that they must be at least a part of the preservation process and must understand and perform the responsibilities they are assigned in this area.

Copyright. Who owns the information, and how should the owner be compensated for its use? As with indexing, these are questions that have been around for years, but we are likely to lose digital data because copyright issues cannot always be easily or readily resolved.

McCargar’s article also suggests some solutions to these problems, but the suggestions are only a beginning – only, in her words, “short-term, stop-gap” methods to stem the tide of digital disappearance. We are at a point where many people involved with producing digital information are not aware of the problems with preserving it. We have barely begun to address even the most basic questions:

What are we archiving? In the days of shelves and manila envelopes, limits on archives were a function of space, and it was obvious that periodic decisions had to be made about what to discard. One of the interesting developments of the Digital Age is the gradual abandonment of archival policies, written or otherwise, that spelled out what was going to be kept permanently, what was to be kept temporarily and for how long, and what was to be “de-accessioned” outright. Creators and archivists didn’t always see eye to eye on the policies, though, so it’s not surprising that as technology improved, creators began asking archivists to take in more material than ever before, whether or not they were equipped to handle it.

One of the strengths of the web continues to be its permanence – and with that, its retrievability and duplicability. These were the promises made by some of the thinkers who first proposed digitizing information, which was simply overwhelming us physically. Ironically, these strengths are being eroded because we have not recognized the problems outlined in McCargar’s article.

Jim Stovall (Posted April 4, 2005)
