What seems permanent on the internet may not actually be the case forever. (PHOTO: Andy Tarnoff)

The missing JS archives and the digital age's great failure

Last month, the Milwaukee Journal Sentinel made headlines across the country when its archives disappeared from Google News. As Michail Takach wrote for Urban Milwaukee, "On Tuesday, Aug. 16, the Milwaukee Journal, Milwaukee Sentinel and Milwaukee Journal Sentinel listings vanished from the Google News Archive home page. This change came without any advance warning and still has no official explanation."

It's worse than that. Not only are the archives gone from Google, but NewsBank, the company that now controls 120 years of historic Milwaukee newspaper archives, attempted to sell them for $1.5 million to the Milwaukee library system, even though that library provided much of the original archived content to Google in the first place. Milwaukee needs access to the Journal, the Sentinel and the merged Journal Sentinel content, but not at the cost of almost all the library's annual acquisition budget.

While the city's merged daily paper was never perfect, when it was owned by Journal Communications, it was a remarkably good paper for a city of this size. Its beat and investigative reporters were dogged and, even if not first to a story, willing to dig until there was no dirt left to turn over. Its website, even in the early days of newspaper websites, was easy to navigate and search, besting almost any other medium-market newspaper web edition in both usability and aesthetics.

In the last few years, after reporter and editor buyouts, a sale to Scripps and the most recent sale to Gannett – publisher of USA Today, the McDonald's of newspapers – quality on the page slipped and, now, its website has the ugly, unmanageable Gannett cookie-cutter template. You can't even easily find all of today's stories at jsonline.com, let alone find past stories through searching.

This may not matter to 99 percent of the Journal Sentinel's subscribing customers or occasional visitors. Nor may it matter to the staffers of the paper today, who presumably have access to whatever they need internally.

But for people like me who rely on internet news not just for self-congratulatory columns like this one but also for education or research purposes, access to new, recent and old news is critical. Journalists write the first draft of history, they say, but if those drafts are inaccessible to those who follow, they are as worthless as the missing paper (or pixels) they were originally printed on.

Further digging by the Urban Milwaukee folks suggests the Journal Sentinel archive's disappearance is just a temporary hiccup as the paper is folded into the Gannett mothership.

But the story is a large and stunning reminder of how, as historian Erik Loomis put it, "anything online can disappear in a blink."

As much as we like to think these things we create and adore and use frequently are permanent, all it takes is one decision in a boardroom somewhere for things to disappear. The Google News archive project itself, which digitized the Journal and Sentinel archives and made them searchable, was abandoned in 2011 because it wasn't terribly profitable. Useful and necessary, sure – but Google, even with its "Don't Be Evil" mantra, is a for-profit enterprise.

I personally worry about what Google finds profitable at any particular moment, because I have more than a decade of writing lodged deep in the belly of Google's Blogger and blogspot.com services. The commenting service I used back then is already dead, and none of the tens of thousands of comments posted to my blog exist anywhere that I'm aware of. On the other hand, I do have archived copies of all of those Blogger posts saved to my hard drive, which is in turn backed up to another hard drive in my house and also to a couple of different places out there in the cloud. It really only takes one fried hard drive to learn how important redundant backups are!

But that's not the point: As someone who still writes actively today, I don't just need to know what I wrote back then; I often want you, the reader, to know as well. I want you to be able to click on a link here at OnMilwaukee and be redirected to a working, readable version of my source material.

Increasingly, though, when I do go back to my previous online writing, I find lots of what the internet has come to term "linkrot," broken links to web pages that are no longer available. Just to make the point, I picked out a random post of mine from 2008; three of five links to specific stories or other bloggers in that post now go to 404 pages. The links have rotted; there is no way for you to judge whether I fairly or accurately characterized those sources at the time.

This is also true for many links to the Journal Sentinel's website in the early to mid-aughts, and for pages from organizations and institutions that still exist today but may have changed hosting services or redone their file structure without redirecting old links. Sometimes the internet's "wayback machine," archive.org, has cached versions of these pages if you know what to look for, but that's not guaranteed, either. To the Journal Sentinel's credit, most contemporary links to the paper's online stories from about 2008 on do redirect to working versions of those stories.

Again, this "linkrot" in old blog posts is probably not important to most people even if they're aware it exists. But when you start to think about the ever-expanding scope of digital content creation by normal people, the trend can start to feel pretty alarming. How easily can you find something you wrote on Facebook a few years ago? Could you track down a particularly witty tweet of yours from 2011 if you wanted to? If you're an oldster like me, you too may wonder what happened to all the BBS posts you used to read and write at alt.startrek.kirkvpicard.

And, when the time comes to memorialize you, will your grandchildren be able to find pictures you posted to Snapchat or Instagram – or be able to read the data on the camera memory cards they find buried with the floppy disks, Zip Drives and CDRs in that shoebox under your bed?

I recently laughed (not out loud, though) at my mother when she asked me to email her updated pictures of my cats so she could print them out and put them into magnetic refrigerator frames. In retrospect, I have to admit her habit of printing many physical copies of digital pictures has distinct appeal when I consider the fragility of all the photos I have existing only as files on my phone or laptop.

These are not just the idle ramblings of a grumpy old man, or even a jaded Gen-Xer. All of us rely on many free and paid online services that exist now but, history suggests, will eventually not. Email, cloud backups, social media and blogs, chat programs, photo storage and even major news and entertainment websites have all come and gone. In that last case, Gawker's archives can for now still be accessed, but for how long? Those stories live on a server somewhere, and someone's got to keep paying the bills or they will disappear.

And disappear things do! This story from the Atlantic last year is a fascinating chronicle of a single Pulitzer Prize-winning print series and its struggle to get and stay online. The newspaper folded, the rights-holders took years to agree to let it be published again and technology changed – the original web story made heavy use of Adobe Flash, an all-but-dead platform in 2015. That story has a happy ending, in that the prize-winning series is available online now. But the rest of the news from that paper? The news from many other outlets that have gone under over the years? Gone forever.

To quote Erik Loomis again, "If it is going to be a requirement that someone profit in order to make primary sources public, the future of the historical profession is grim indeed." The same is true for amateur historians, cranky bloggers and even just normal people who have come to rely on ones and zeroes whose accessibility exists only as long as someone else finds it profitable.

I don't have an answer to linkrot, and I am not aware of anyone who does. I don't know what to offer by way of reassurance when sites you love and providers you trust disappear from the internet. I have nothing other than indignation and disgust to offer in the current case of where archived Milwaukee Journal and Milwaukee Sentinel stories are – or aren't.

But now is as good a time as any to remind the world that our digital present – and especially our digital past – is fragile and transitory. We must remember that, and we must find ways to guard against it that don't involve extorting libraries or counting on authors to back up their own work and fund its continued existence.
