Anatomy of a Web Archive

I'm inclined to blame the semantic flexibility of the word "archive" for the fact that someone with no previous exposure to web archives might variously suppose that they are: the result of saving web pages from the browser, institutions acting as repositories for web resources, a navigational feature of some websites allowing for browsing of past content, online storage platforms imagined to be more durable than the web itself, or, simply, "the Wayback Machine." However many different policies and practices guide cultural heritage institutions' approaches to web archiving, the "web archives" that they create and preserve are remarkably consistent. What are web archives, exactly?

Link Persistence, Website Persistence

A previous post on this blog explored why it's so hard to come up with a reliable measurement of the average lifespan of a webpage. In essence, the argument came down to this: links and the websites they represent tend to become decoupled over time. Without a broad understanding of how that process takes place, it's hard to make definitive claims about the persistence of websites when the available automated tools can reliably check only for the persistence of links.
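
To make the distinction concrete, here is a minimal Python sketch of the gap between "the link still resolves" and "the website is still there." Everything in it is an assumption for illustration: the Observation record, the function names, and especially the use of an exact SHA-256 hash as a stand-in for "same content," which is far cruder than anything a real study would use (a rotated ad or an updated timestamp would count as a change). It does no network fetching; it only compares two observations of the same URL that were recorded some other way.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One hypothetical record of requesting a URL at some point in time."""
    status: Optional[int]  # HTTP status code, or None if the request failed outright
    body: bytes = b""      # response body as received


def link_persists(obs: Observation) -> bool:
    # This is all a typical link checker can tell us: did the request succeed?
    return obs.status is not None and 200 <= obs.status < 300


def content_fingerprint(body: bytes) -> str:
    # Assumption: an exact hash stands in for "same content". Any byte-level
    # difference, however trivial, produces a different fingerprint.
    return hashlib.sha256(body).hexdigest()


def compare(earlier: Observation, later: Observation) -> str:
    """Classify what happened to a URL between two observations."""
    if not link_persists(later):
        return "link broken"
    if content_fingerprint(earlier.body) == content_fingerprint(later.body):
        return "link and content both persist"
    # The link still resolves, but it no longer leads to what it once did.
    return "link persists, content changed"
```

A status-only checker would report the second and third outcomes identically, which is exactly the decoupling described above: the link can outlive the website it once pointed to.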

