Web Archivability


Enhancing metadata makes your content more legible to search engines and improves both its discoverability and its placement in time within web archives.

Report date of publication or updates

The timestamps associated with archived content in an archival replay interface like the Internet Archive Wayback Machine reflect when the content was visited by the crawler, not necessarily when it was published or updated. Providing the date of publication or last update through the HTTP Last-Modified response header, the body text, or both helps users better understand the temporal context of the content. This is especially important in a web archive, which may reflect a composite of resources collected at widely varying times. Timestamps are also valuable in legal proceedings, as they substantiate when content may or may not have been published or updated.
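As a minimal sketch of the header side of this advice, the Python snippet below formats a publication or update time as an HTTP Last-Modified value. The function name last_modified_header is hypothetical, and how the header is attached to a response depends on your server or framework, which is not shown here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def last_modified_header(updated_at: datetime) -> str:
    """Return an HTTP-date string suitable for a Last-Modified header.

    Hypothetical helper for illustration. Requires a timezone-aware
    datetime so the conversion to GMT is unambiguous.
    """
    if updated_at.tzinfo is None:
        raise ValueError("updated_at must be timezone-aware")
    # HTTP dates are always expressed in GMT.
    as_utc = updated_at.astimezone(timezone.utc)
    return format_datetime(as_utc, usegmt=True)


if __name__ == "__main__":
    published = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
    print("Last-Modified:", last_modified_header(published))
```

Showing the same date in the visible body text as well covers archives and readers that never see the response headers.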

Improve embedded metadata

Using page-specific titles and description <META> elements simplifies curation of websites by the many memory institutions engaged in web content collecting and improves the presentation of search result snippets.
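One way to produce page-specific title and description elements is sketched below in Python. The function name render_head_metadata and the sample strings are assumptions for illustration; a real site would normally generate these elements from its templating system.

```python
from html import escape


def render_head_metadata(title: str, description: str) -> str:
    """Build page-specific <title> and description <meta> elements.

    Hypothetical helper for illustration. Values are HTML-escaped so
    that quotes and ampersands in the text cannot break the markup.
    """
    safe_title = escape(title, quote=True)
    safe_description = escape(description, quote=True)
    return (
        f"<title>{safe_title}</title>\n"
        f'<meta name="description" content="{safe_description}">'
    )


if __name__ == "__main__":
    print(render_head_metadata(
        "Annual Report 2020 & Highlights",
        'Summary of "key" findings.',
    ))
```

The point of the sketch is that each page supplies its own values rather than inheriting one site-wide title and description.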

Use meaningful web addresses

If the only information available about a web resource that has since disappeared is its web address, the best hope for recovering it is that the specific resource is available somewhere in a web archive and that the date(s) of capture or last update align with the intended date of the original reference.

However, a web address that conveys something about the resource provides additional clues that can be used to rediscover the resource's new location, which can in turn lead to previously unknown archived versions. If your content management system allows it, configure it so that web addresses include the publication date and at least a truncated version of the content title.
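The Python sketch below shows one possible way to derive such an address from a publication date and a title. The /YYYY/MM/DD/slug layout, the function name build_path, and the max_words parameter are assumptions for illustration, not a scheme prescribed by any particular content management system.

```python
import re
import unicodedata
from datetime import date


def build_path(published: date, title: str, max_words: int = 6) -> str:
    """Build a URL path containing the publication date and a short slug.

    Hypothetical helper for illustration; the path layout is an assumption.
    """
    # Reduce accented characters to their closest ASCII equivalents.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    # Keep only the first few words so the address stays readable.
    words = [word for word in slug.split("-") if word][:max_words]
    return f"/{published:%Y/%m/%d}/{'-'.join(words)}"


if __name__ == "__main__":
    print(build_path(
        date(2019, 3, 14),
        "Improving Web Archivability: Metadata, Dates & Addresses",
        max_words=4,
    ))
```

An address built this way still tells a reader roughly what the page was and when it was published, even after the page itself has moved or disappeared.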

Forked from an (archived) crosspost to Stanford Libraries Web Archiving. Creative Commons Attribution-ShareAlike 4.0 International License.