Standards and the Problems with Digital Technology

(This page is part of the essay I wrote for the Open Government book. For copyright info, see the introduction.)

The switch to digital documents entails two separate problems: obsolescent media and unreadable software formats. The obsolescent media problem is hardware-related. Digital storage media are much more fragile than nondigital ones: parchment lasts for millennia when handled well; hard drives last just a few years. Furthermore, digital media go out of date as new and better ones are invented. For instance, lots of people stored documents on floppy disks in the 1980s and 1990s, but hardly any computer today still has a floppy disk drive.

The second problem is much more serious. Even when the container works perfectly, bit sequences are absolutely useless if you don't know what they mean and the software you need to read or translate them is lost or too expensive to buy. These are not hypothetical concerns. Almost all the files created by public agencies and private businesses around the world are already encoded in a way that only one suite of programs, from one single, for-profit company, can read without compatibility problems. What if that company went belly up? Think it's too big to fail? Isn't that what everybody would have said, before 2008, about Lehman Brothers, General Motors, or Chrysler?

According to Jerome P. McDonough, assistant professor in the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign, the total amount of data, in files of all types from "government records to tax files, email, music, and photos," that could be lost due to "ever-shifting platforms and file formats" is about 369 billion billion bytes. (For reference, this chapter is less than 30,000 bytes.) The Nimitz diagrams are locked inside files whose format, being unknown, can't be decoded with modern software. This is not an isolated example: all over the world, billions of designs, from furniture to water purification systems, bridges and buildings, plane and car parts, are stored in formats that only the few developers of one program ever knew how to read without errors.

We can’t go back to the predigital era. It would be stupid to do so. But if we don’t start managing digital data and communications the right way—with a view toward both real interoperability and future readability—both private and public life will become harder to manage.

Luckily, many governments are aware of these hardware and software problems, but the only recourse they've found is precisely the one I just derided: sticking to nondigital media. Most national archives, and many other public and private organizations around the world, still waste a lot of money and resources because they don't feel safe depending only on digital documents for long-term storage. For example, the Virginia State library "cannot accept records for permanent storage on digital media at this time due to the lack of hardware and software standards." Consequently, "Electronic records identified as permanent…must be converted to archival quality microfilm or alkaline paper before being transferred to the Library." What if, 20, 30, or 40 years from now, the digital records of your pension payments were unreadable? And what good are digital documents to a small business that must continuously update its software and hardware for no reason except to keep its archives readable, or must keep re-entering data by hand into incompatible systems?

The health care system bears the worst of this situation, suffering from both high costs and subpar care. For example, Rep. Tauzin explicitly complained that none of the hospitals he visited could share digital records with one another. Many governments worldwide are fighting the same battle on a much bigger scale. In the United States, the George W. Bush administration left behind 100 trillion bytes of electronic records. That's 50 times as much as President Clinton left in 2001, but surely much less than what the Obama administration will produce. Already, the Bush archives, which include historical documents such as top-secret email tracing plans for the Iraq war, contain data in "formats not previously dealt with" by the U.S. National Archives. So, to come full circle: if Obama's pen were like digital media, anyone wanting to read his memoranda would have to buy the same kind of pen. Not much openness or freedom of information in that!

Go to part 4: Why Has Digital Gone Bad So Often?