WIKIPEDIA: LINK ROT, by Wikipedia



This page is primarily about link rot in external links. For broken section links within Wikipedia, see Wikipedia:Database reports/Broken section anchors. For internal links that point to deleted or non-existent articles, see WP:REDLINKS. For other uses, see Wikipedia:Citing sources § Preventing and repairing dead links.

Like most large websites, Wikipedia suffers from the phenomenon known as link rot, in which external links go dead because the linked web pages or entire websites disappear, change their content, or move without HTML redirection. This presents a significant threat to Wikipedia’s reliability policy and its source citation guideline.

In general, do not delete cited information solely because the URL to the source no longer works. Tools, procedures, and processes for dealing with dead links are outlined in this document.

Preventing link rot

Links added by editors to the English Wikipedia mainspace are automatically saved to the Wayback Machine within about 24 hours (in practice, not every link gets saved, for various reasons). This is done by a program called “NoMore404”, which the Internet Archive runs and maintains; other language wikis are also covered. It monitors the EventStreams API, extracts new external URLs, and adds a snapshot of each to the Wayback Machine. This system became active sometime after 2015, though there were earlier efforts. In addition, sometime after 2012, archive.today (also known as archive.is) attempted to archive all external links that existed on Wikipedia at the time. The effort was incomplete, but a significant number of links were added to archive.today during this period, making it a major archival source that fills gaps in coverage. As of 2020, archive.today still makes some automated archives, though the extent and frequency of its coverage are unknown.
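
The monitor, extract, and save flow described above can be sketched roughly as follows. This is not the NoMore404 source code, which is not shown here. It assumes an event shaped like those on the Wikimedia page-links-change stream (an "added_links" list whose entries carry "link" and "external" fields) and assumes the Wayback Machine's "Save Page Now" URL prefix; the function names are hypothetical.

```python
# Illustrative sketch only; not the actual NoMore404 implementation.
# Assumption: Save Page Now accepts requests of the form <prefix><url>.
SAVE_PREFIX = "https://web.archive.org/save/"


def new_external_urls(event: dict) -> list[str]:
    """Return the external http(s) links that one link-change event added."""
    urls = []
    for entry in event.get("added_links", []):
        link = entry.get("link", "")
        # Keep only links flagged as external that use http or https.
        if entry.get("external") and link.startswith(("http://", "https://")):
            urls.append(link)
    return urls


def save_request_url(url: str) -> str:
    """Build the URL that would ask the Wayback Machine for a new snapshot."""
    return SAVE_PREFIX + url
```

A real monitor would read events from the stream continuously and rate-limit its save requests; this sketch only shows the per-event filtering step.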

As of 2015, there is a Wikipedia bot and tool called WP:IABOT that automates fixing link rot. It runs continuously, checking every article on Wikipedia for dead links, adding archives to the Wayback Machine (if not yet there), and replacing dead links in the wikitext with an archived version. The bot runs automatically, but end users can also direct it through its web interface. The interface is available when viewing any page's history: near the top of the page, on the “External Tools” line, choose the “Fix dead links” option.
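
To illustrate what "replacing dead links in the wikitext with an archived version" can look like for a citation template, here is a minimal sketch. It is not IABot's actual code. It assumes the citation is a single template string ending in "}}" and uses the Citation Style 1 parameters archive-url, archive-date, and url-status; the function name add_archive is hypothetical.

```python
# Illustrative sketch only; IABot's real logic handles many more cases.
def add_archive(citation: str, archive_url: str, archive_date: str) -> str:
    """Append archive parameters to a citation template given as a string."""
    if "archive-url" in citation:
        return citation  # an archive is already recorded; leave it alone
    stripped = citation.rstrip()
    if not stripped.endswith("}}"):
        raise ValueError("expected a template ending in '}}'")
    body = stripped[:-2].rstrip()  # drop the closing braces
    return (
        f"{body} |archive-url={archive_url} "
        f"|archive-date={archive_date} |url-status=dead}}}}"
    )
```

The original URL stays in place, so the citation still records where the source used to live.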

As of 2015, the periodic bot WP:WAYBACKMEDIC checks for link rot in the archive links themselves. Archive databases are dynamic: archives move or go missing, new ones are added, and so on. This bot maintains existing archive links on English Wikipedia. It also archives resources on request at WP:URLREQ. It is a flexible tool that can carry out many custom jobs, such as URL migrations and moves, handling usurped domains, and soft-404 discovery and repair.

Repairing a dead link

Check for archived versions at one of the many web archive services. The “Big 3” archive services are web.archive.org, webcitation.org and archive.today. Together these account for over 90% of all archive links on Wikipedia, with web.archive.org alone accounting for over 80%. Other archive services are listed at WP:WEBARCHIVES.
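
Checking web.archive.org for an archived version can also be done programmatically. The sketch below uses the Wayback Machine availability endpoint and the JSON shape it returns ("archived_snapshots" containing a "closest" entry); treat both as assumptions to verify against the Internet Archive's own documentation. The function names are hypothetical, and the other archive services are not covered.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Wayback Machine availability endpoint.
API = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the closest snapshot URL from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(url, timestamp=None):
    """Query the service over the network and return a snapshot URL or None."""
    with urlopen(availability_query(url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Passing a timestamp near the date the citation was added helps find a snapshot that reflects the content the editor actually saw.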

Keeping dead links

A dead, unarchived source URL may still be useful. Such a link indicates that the information was (probably) verifiable in the past, and it might give another user with greater resources or expertise enough information to find the reference. The link could also come back to life. With a dead link, it is still possible to determine whether it has been cited elsewhere, or to contact the person originally responsible for the source. For example, one could contact the Yale Computer Science department if http://www.cs.yale.edu/~EliYale/Defense-in-Depth-PhD-thesis.pdf [dead link] were dead.

