User:GreenC/WaybackMedic 2.5
by GreenC
WaybackMedic 2.5 is a bot that adds and maintains links from the list of known web archive services in use on the English Wikipedia.
Edits made after 2018-12-04 are by version 2.5.
The bot operator is User:GreenC. The bot account is User:GreenC bot. The bot (software) is "WaybackMedic".
WaybackMedic Fixes
Fix 1: fixthespuriousone
Example edit: Example
Description: Remove spurious |1= in cite templates.
Date added: August 2016

Fix 2: fixmissingprotocol
Example edit: Example
Description:
1. Add https if the protocol is missing from the archive.org URL.
2. Convert existing protocol http to https.
3. Add the web subdomain if missing (archive.org/web/ → web.archive.org/web/).
4. Add the /web/ path (web.archive.org/2016/ → web.archive.org/web/2016/). In some URLs adding /web/ breaks the link; test for those.
Notes: HTTPS per RFC
Date added: August 2016

Fix 3: fixemptyarchive
Example edit: Example
Description:
1. If |archiveurl= is empty or missing but |archivedate= has content, attempt to find a working archive URL based on the archive date, otherwise add {{dead link}} if appropriate.
2. If |archivedate= is empty or missing but |archiveurl= has content, generate the date value from the timestamp in the archive URL.
3. If |archiveurl= and |archivedate= are both empty, remove both and leave a {{dead link}} if appropriate.
Date added: August 2016

Fix 4: fixbadstatus
Example edit: Example
Description: Check all Wayback Machine URLs for response code errors (anything but 200s). If there is an error code, try for a better URL via the Wayback API: first using accessdate, then using the earliest date available. If none is found there, check the WebCite API. Try the Memento API, which checks a few dozen other archives. Other techniques are undocumented. If still none found, remove |archiveurl= and |archivedate= and add
Date added: August 2016

Fix 6: fixemptywayback
Example edit: Example
Description: The wayback template is mangled in a certain way. Action: re-assemble. It won't delete multiple instances if they exist in the same ref (as in the Example).
Date added: August 2016

Fix 7: fixencodedurl
Example edit: Example
Description: The URL was incorrectly encoded. Fully decode the URL and re-encode.
Date added: August 2016

Fix 8: fixdatemismatch
Example edit: Example
Description:
1. Ensure |archivedate= matches the snapshot date in the URL.
2. Ensure the date format matches dmy or mdy if set (retain ymd if in use).
Date added: August 2016

Description:
Convert WebCite URLs from short-form to long-form
Convert Freezepage.com URLs from short-form to long-form
Notes: WebCite Usage
Date added: January 2017

Fix 10: fixstraydt
Example edit: Example
Description: Remove stray {{dead link}} template when an archive exists for the link
Date added: January 2017

Fix 11: fixwam
Example edit: Example
Description: Merge and -->
Merge completed February 5, 2017
Notes: Webarchive TfM
Date added: January 2017

Fix 12: fixiats
Example edit: Example
Description: archive url -> |archive-url)
Date added: January 2017

Fix 13: fixswitchurl
Example edit: Example
Description: Move an archive.org URL from |url= to |archiveurl= and add |archivedate= if missing.
Date added: January 2017

Description:
1. A {{wayback}} is embedded in a CS template.
2. A {{dead link}} is embedded in a CS template.
Date added: January 2017

Fix 16: <various>
Example edit: Example
Description: Timestamp and/or |archivedate= is 19700101 and/or out-of-bounds.
Date added: January 2017

Fix 17: fixdoubleurl
Example edit: Example
Description: archive.org URLs are doubled, tripled, etc.
Date added: January 2017

Fix 18: fixemptywebarchive
Example edit: Example
Description: {{webarchive}} |date= is missing or has an empty value.
Date added: January 2017

Fix 19: fixdoublewebarchive
Example edit: Example
Description: Remove duplicate {{webarchive}} instances.
Date added: January 2017

Fix 20: fixembwebarchive
Example edit: Example
Description: A {{cite web}} is embedded in a {{webarchive}}
Date added: January 2017

Description:
1. Convert Archive.is URLs from short-form to long-form
2. Fix URL encoding of broken links
Notes: Archive.is Usage
Date added: January 2017

Fix 22: fixitems
Example edit: Example
Description: Change "/items/" URLs that are using machine IDs
Notes: BRFA
Date added: January 2017

Fix 23: encodemag
Example edit: Example
Description: Convert MediaWiki encoding to URL encoding in URLs (i.e. {{!}} and {{=}})
Notes: RFC 3986
Date added: January 2017

Fix 24: decodespace
Example edit: Example
Description: Convert %20 to +, + to %20, etc. in URLs that can be repaired this way
Notes: See also
Date added: June 2017

Description: Remove typical garbage characters found at the end of URLs: .,;:-"l(%XX)('')
Date added: February 2018

Fix 26: fixcommentarchive
Example edit: Example
Description: Open up commented-out archives and add a |deadurl= "yes" or "no"
Date added: February 2018

Fix 27: waytree_x2encoding
Example edit: Example
Description: Repair double URL-encoding, e.g. %3A -> %253A
Date added: February 2018

Fix 28: fixencodebug
Example edit: Example
Description: Repair missed URL-encoding of square brackets
Notes: T186417
Date added: February 2018

Description: Restore truncated Wayback URL
Date added: February 2018

Fix 30: fixiats
Example edit: Example
Description: Convert |title={title} -> |title=Archived copy
Notes: T203865
Date added: September 2018

Fix 31: urlchanger
Example edit: Example
Description: Move broken URL to a new working URL and undo previous archives.
Notes: BOTREQ
Date added: November 2018

Description: Edits that might be cosmetic; made only alongside other edits.
1. Delete trailing # in URLs
2. Delete empty archive fields
3. archive.is --> archive.today
4. Fix double fragments
5. Convert protocol-relative URLs
Notes: WP:PRURL, T214855, Archive.today
Date added: January 2019
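
As an illustration only, two of the fixes listed above (the URL normalization described for fixmissingprotocol, and deriving an archive date from the Wayback timestamp as in fixemptyarchive and fixdatemismatch) could be sketched in Python roughly as follows. This is not WaybackMedic's actual code: the function names normalize_wayback_url and archive_date_from_url are hypothetical, the regular expressions are assumptions about what typical archive.org URLs look like, and the network test for links that break when /web/ is added is left out.

```python
import re
from datetime import datetime
from typing import Optional


def normalize_wayback_url(url: str) -> str:
    """Hypothetical sketch of the four fixmissingprotocol steps."""
    url = url.strip()

    # Steps 1-2: add https if the protocol is missing, or upgrade http.
    if url.startswith("//"):
        url = "https:" + url
    elif url.startswith("http://"):
        url = "https://" + url[len("http://"):]
    elif not url.startswith("https://"):
        url = "https://" + url

    # Step 3: archive.org/web/ -> web.archive.org/web/
    url = re.sub(
        r"^https://(?:www\.)?archive\.org/web/",
        "https://web.archive.org/web/",
        url,
    )

    # Step 4: web.archive.org/2016.../ -> web.archive.org/web/2016.../
    # (assumption: a path starting with 4-14 digits is a timestamp;
    # the real bot also tests whether adding /web/ breaks the link).
    url = re.sub(
        r"^https://web\.archive\.org/(?=\d{4,14}/)",
        "https://web.archive.org/web/",
        url,
    )
    return url


def archive_date_from_url(url: str) -> Optional[str]:
    """Return the snapshot date (YYYY-MM-DD) encoded in a Wayback URL,
    or None if no valid date can be read from it."""
    match = re.search(r"/web/(\d{8})", url)
    if match is None:
        return None
    try:
        snapshot = datetime.strptime(match.group(1), "%Y%m%d")
    except ValueError:
        # Impossible dates such as month 13 are rejected here.
        return None
    return snapshot.strftime("%Y-%m-%d")
```

For example, normalize_wayback_url("archive.org/web/2016/http://example.com") yields "https://web.archive.org/web/2016/http://example.com", and archive_date_from_url applied to a URL containing /web/20160805123456/ yields "2016-08-05". The date is returned in ymd form; converting it to dmy or mdy to match an article's existing style would be a separate step.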
Technical details
About every 2–3 months, the bot builds a new batch of about 50,000 to 100,000 articles, which takes about 1–2 weeks to process; it then pauses until the next batch. It typically follows behind IABot, editing the same articles IABot edited during that 2–3 month period, for two reasons. First, WaybackMedic started life as a bug fixer for IABot, a task it can still perform as needed. Second, WaybackMedic has no dead link checker of its own, so it relies on IABot to tag links as dead in order to know which ones might be saved.
Paid Editor
GreenC, in accordance with the Wikimedia Foundation's Terms of Use, discloses that he has been paid by the Internet Archive for his contributions to Wikipedia. This funding is for the ongoing development of WaybackMedic and a module of InternetArchiveBot related to books.
Last edited on 26 February 2021, at 15:11