Web Archivability


There are various enhancements you can make to your website that will improve its usability in both its future archived and present contexts.

Maintain stable links

Stable links, maintained either through redirects or by not changing web addresses, ensure the continued usability of content shared via social media as well as individual and social bookmarks; they are among the earliest best practices of the nascent web. Because replay systems like the Internet Archive Wayback Machine index web addresses and not websites per se, maintaining stable links over time also ensures that future users can browse the successive captures of an individual website in a single timeline.
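
As a rough illustration of the redirect approach, the sketch below resolves a retired address through a redirect map to its current location, so that a single permanent redirect can be issued. The paths and the resolve helper are hypothetical and not tied to any particular web server or framework.

```python
# Hypothetical map of retired paths to their replacements (illustrative only).
REDIRECTS = {
    "/news/2014/launch.html": "/blog/launch",
    "/blog/launch": "/articles/launch",
}


def resolve(path, redirects=REDIRECTS):
    """Follow the redirect map and return (status, final_path).

    The status is 301 if the path has moved and 200 if it is still current.
    A ValueError is raised if the map contains a loop.
    """
    seen = [path]
    while seen[-1] in redirects:
        target = redirects[seen[-1]]
        if target in seen:
            raise ValueError("redirect loop: " + " -> ".join(seen + [target]))
        seen.append(target)
    status = 301 if len(seen) > 1 else 200
    return status, seen[-1]
```

Collapsing a chain of moves into one hop is a design choice in this sketch: old links that have been moved more than once still reach the current address with a single redirect.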

Conform to web standards

Contemporary browsers are (and have to be) remarkably tolerant in rendering markup with widely varying degrees of conformance to both past and present formal specifications. To increase the chance that future browsers will be able to interpret today's code, validate against current web standards.

Use durable data formats

Along with markup, file formats may be at risk of becoming obsolete; even if the content is saved to long-term storage, there may be no way of interpreting it in the future. Research suggests that the tacit user base of common web formats tends to ensure those formats' continued support, making a format's popularity a good predictor of its durability. Prefer open formats or at least those that can be read using open-source software. National memory institutions, including the Library of Congress and the Portuguese Web Archive, also offer specific examples of preferred formats and criteria for durability.

Follow web accessibility best practices

Once web content is archived, it can no longer be changed, which means that its accessibility can be improved no further than when it was archived. Following web accessibility best practices may not only be a legal mandate or organizational priority; it critically ensures the usability of your website for (the growing number of) users with impairments. The specific guideline to provide equivalent text for non-textual content can facilitate both search crawler indexing and later full-text search in the archive.
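
One part of the equivalent-text guideline can be spot-checked before a site is archived. The following is a minimal sketch, using only Python's standard html.parser module, that lists images lacking an alt attribute; the class and function names are hypothetical, and the check covers only this one case, not accessibility conformance in general.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src"))


def images_missing_alt(html):
    """Return the src values of images in `html` that have no alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing
```

An empty alt attribute is deliberately not flagged here, since it is the conventional way to mark a purely decorative image.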

Report media type and character encoding

Reporting the correct internet media type helps the browser to determine whether it can render or execute the content itself or must hand it off to a helper application. Confirm that the HTTP Content-Type response header is accurate and consistent with any page-level http-equiv <META> tags with value Content-Type.

Reporting the correct character encoding ensures that non-English characters, including accented characters and characters in non-Latin scripts, will be rendered correctly. Character encoding is specified using the aforementioned HTTP Content-Type response header or page-level http-equiv <META> tags with value Content-Type.
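
The consistency check described above can be sketched as follows. This is a simplified illustration using only Python's standard library: it compares a Content-Type header value against the first page-level http-equiv tag and ignores other ways a page can declare its encoding. The function and class names are hypothetical.

```python
from html.parser import HTMLParser


def parse_content_type(value):
    """Split a Content-Type value into (media_type, charset or None)."""
    media_type, _, params = value.partition(";")
    charset = None
    for param in params.split(";"):
        name, _, val = param.partition("=")
        if name.strip().lower() == "charset":
            charset = val.strip().strip('"').lower()
    return media_type.strip().lower(), charset


class MetaContentType(HTMLParser):
    """Record the content of the first <meta http-equiv="Content-Type"> tag."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.value is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() == "content-type":
            self.value = attributes.get("content") or ""


def is_consistent(header_value, html):
    """Return True unless the header and the page-level tag contradict each other."""
    finder = MetaContentType()
    finder.feed(html)
    if finder.value is None:
        return True  # the page declares nothing that could contradict the header
    header_type, header_charset = parse_content_type(header_value)
    meta_type, meta_charset = parse_content_type(finder.value)
    if header_type != meta_type:
        return False
    # Assumption: a charset conflict exists only when both sides declare one.
    if header_charset and meta_charset:
        return header_charset == meta_charset
    return True
```

For example, a header of text/html; charset=ISO-8859-1 paired with a page-level tag declaring charset=utf-8 would be reported as inconsistent.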

Use responsive design

User-agent personalization is opaque to the archival crawler, meaning that only one of many possible versions of a website will be archived. Using responsive design ensures that archive users will continue to have a comparable experience of the original website, regardless of the platform they use for access. It is also an effective way of creating a mobile-friendly website, a ranking factor for at least one major search provider. If mobile device detection for the purpose of serving alternate content is absolutely necessary, ensure that well-known archival crawler user-agents aren't treated as mobile clients.
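
Where device detection cannot be avoided, the exemption for archival crawlers might look like the sketch below. The token lists are illustrative assumptions rather than an authoritative registry of crawler or mobile user-agent strings, and the function name is hypothetical.

```python
# Illustrative substrings only; a real deployment would maintain its own list.
ARCHIVAL_CRAWLER_TOKENS = ("archive.org_bot", "ia_archiver", "heritrix")
MOBILE_TOKENS = ("mobile", "android", "iphone")


def wants_mobile_variant(user_agent):
    """Return True only for mobile clients that are not archival crawlers."""
    ua = user_agent.lower()
    # Check for crawlers first so that they always receive the default site,
    # even if their user-agent string also happens to contain a mobile token.
    if any(token in ua for token in ARCHIVAL_CRAWLER_TOKENS):
        return False
    return any(token in ua for token in MOBILE_TOKENS)
```

Testing for crawlers before testing for mobile tokens is the point of the sketch: the order guarantees that the archived capture is the default version of the site.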

Forked from an (archived) crosspost to Stanford Libraries Web Archiving. Creative Commons Attribution-ShareAlike 4.0 International License.