Mirroring Wikimedia project XML dumps
This page coordinates efforts to mirror Wikimedia project XML dumps around the globe on independent servers, much as mirror sites do for GNU/Linux ISO images. See the list of current mirrors below.
Requirements
Space
A mirror needs 25.1 TB to hold the 5 most recent dumps (the most desired option): 3 sets of full dumps and 2 sets of partial dumps. This figure is based on estimates from December 2020.
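A prospective mirror can compare this figure with the free space on its target volume. The following is a minimal sketch, assuming that "TB" here means decimal terabytes (10^12 bytes), which the page does not state; the function name `has_room` is made up for illustration.

```python
import shutil

REQUIRED_TB = 25.1  # December 2020 estimate quoted on this page


def has_room(free_bytes, required_tb=REQUIRED_TB):
    """Return True if free_bytes covers required_tb decimal terabytes.

    Assumption: 1 TB = 10**12 bytes. The page does not say whether it
    means decimal or binary units.
    """
    return free_bytes >= required_tb * 10**12


if __name__ == "__main__":
    # Check the volume that holds the current directory.
    free = shutil.disk_usage(".").free
    print(f"free: {free / 10**12:.2f} TB, enough: {has_room(free)}")
```

Run it from the directory where the dumps would be stored. Leaving headroom above the quoted figure is prudent, since the estimate is dated.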
Alternative options:
Additional options:
Bandwidth
Wikimedia provides about 70 MB/s via dumps.wikimedia.org for XML dumps, as of January 2017.
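To get a rough feel for what this rate means for an initial sync, the arithmetic can be sketched as below. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and, unrealistically, that a single mirror receives the entire 70 MB/s. The size and bandwidth figures also come from different years, so treat the result as an order-of-magnitude lower bound only.

```python
def transfer_days(size_tb, rate_mb_per_s):
    """Days needed to move size_tb terabytes at rate_mb_per_s megabytes/second.

    Assumes decimal units and a constant, uncontended transfer rate.
    """
    seconds = (size_tb * 10**12) / (rate_mb_per_s * 10**6)
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    # 25.1 TB (space estimate) at 70 MB/s (bandwidth figure)
    print(f"{transfer_days(25.1, 70):.1f} days")
```

Under those assumptions the full 25.1 TB set takes a little over four days to copy, which is why incremental daily syncs are preferable to repeated full transfers.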
Current mirrors
Dumps
Organization | Contents | Location | Access
Academic Computer Club, Umeå University | Last 2 good XML dumps | Umeå, Sweden |
Bytemark | Last 5 good XML dumps (stalled since May 2020) | York, United Kingdom |
C3SL | Last 5 good XML dumps (stalled since November 2019) | Paraná, Brazil |
BringYour | Last 5 good XML dumps | California, United States |
Your.org | All public data | Illinois, United States |
N/A (individual hoster) | Last 5 good XML dumps (defunct, March 2021) | Colorado, United States |
Internet Archive | All public data (updated semi-manually) | California, United States |
Instructions for finding old Wikidata entity dumps (RDF and JSON) can be found on Wikidata:Database download.
Media
Note: The media files in the mirror may be outdated; use them with care and check the last-modified date.
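One way to check how old a mirrored file is, sketched here as an illustration rather than a recommended tool, is to read the `Last-Modified` value a mirror's HTTP server returns and compute its age. The helper names and the 30-day threshold are hypothetical; only the standard-library parser `email.utils.parsedate_to_datetime` is used.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def age_in_days(last_modified, now):
    """Age of a file in days, given an HTTP Last-Modified header string."""
    modified = parsedate_to_datetime(last_modified)
    return (now - modified).total_seconds() / 86400


def is_stale(last_modified, now, max_age_days=30):
    """True if the file is older than max_age_days (hypothetical threshold)."""
    return age_in_days(last_modified, now) > max_age_days


if __name__ == "__main__":
    # Example header value in the standard HTTP date format.
    header = "Wed, 21 Oct 2015 07:28:00 GMT"
    print(is_stale(header, datetime.now(timezone.utc)))
```

The header string can be taken from the response to an HTTP HEAD request for the file in question.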
Organisation | Contents | Access
Your.org | Media (current version only) |
Media tarballs
Organisation | Contents | Access
Your.org | Media tarballs per project (except Commons) |
Internet Archive | Media tarballs per day for Wikimedia Commons |
Notes for the wikimediacommons collection
Other notes
Pageview stats, MediaWiki tarballs, other files
The nd.edu site is restricted to certain institutions with Internet2/ESnet/Geant connectivity; those with access (primarily academics and researchers) get high-bandwidth downloads.
Organisation | Contents | Access
Academic Computer Club, Umeå University | 'Other' datasets |
Your.org | 'Other' datasets |
Center for Research Computing, University of Notre Dame | Wikidata entity dumps, pageview and other stats, Picture of the Year tarballs, Kiwix openzim files, other. Restricted ESnet/Geant/I2 access only! |
Potential mirrors
If you are a hosting organization and want to volunteer, please send an email to ops-dumps@wikimedia.org with "XML dumps mirror" somewhere in the subject line.
Based on your space and bandwidth restrictions, decide how many dumps you want to mirror, and whether you also (or instead) want to mirror the archives (pre-2009 dumps) and/or the "other" datasets. Let us know that in the email. We'll need the hostname for our rsync config, the name for the IPv6 address if there is a separate name (or, if there is no IPv6 connectivity, a note to that effect), and a contact email address.
Once your information is added to our rsync config, you'll be able to pick up the desired directories and files from the appropriate rsync module.
We recommend a daily cron job for this.
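A daily sync job could look roughly like the sketch below. The remote host, module name, and local path are hypothetical placeholders; the real values come from Wikimedia once your server is in their rsync config. The flags used (`-a`, `--delete`, `--bwlimit`) are standard rsync options, but whether `--delete` suits your retention policy is an assumption you should check before using it.

```python
import shlex

# Hypothetical placeholders, not real Wikimedia endpoints or paths.
REMOTE = "rsync://rsync.example.org/example-module/"
LOCAL_DIR = "/srv/mirror/dumps/"


def build_rsync_command(remote, local_dir, bwlimit_kib=None):
    """Return the rsync argument list for one mirror pass.

    -a        archive mode (recursive, preserves times and permissions)
    --delete  remove local files that no longer exist on the source
    --bwlimit optional cap, in KiB per second by rsync's default units
    """
    cmd = ["rsync", "-a", "--delete"]
    if bwlimit_kib is not None:
        cmd.append(f"--bwlimit={bwlimit_kib}")
    cmd += [remote, local_dir]
    return cmd


if __name__ == "__main__":
    # Print only. To run the sync, pass the list to
    # subprocess.run(cmd, check=True) instead.
    print(shlex.join(build_rsync_command(REMOTE, LOCAL_DIR)))
```

Once the script actually invokes rsync, a crontab entry such as `0 3 * * * python3 /path/to/mirror_sync.py` (path hypothetical) would run it once a day at 03:00.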
If you are brainstorming organizations that might be interested, see the discussion page.
See also
External links
dumps.wikimedia.org
Last edited on 14 October 2021, at 21:40
Content is available under CC BY-SA 3.0 unless otherwise noted.