[DEPRECATED] Wikistats pageview files
Maintained by WMF Analytics
NOTE: This dataset has had some problems, and we have not generated new data since September 2020. We are phasing it out in favor of Pageviews Complete. This new dataset is a work in progress; we still have some formatting issues to fix. When it is finished we will announce it widely and explain how to migrate.
Hourly page views per article for around 30 million article titles (Sept 2013) in more than 800 Wikimedia wikis. Repackaged (with extreme shrinkage, without losing granularity), corrected, reformatted. Daily files and two monthly files (see notes below).
Notes for hourly page views
Both sets of hourly files are derived from the best data available at the time.
The huge hourly files for page views per article per wiki have been massively compressed by merging 720 files per month, which removes massive redundancy (80% of record space is the article title, and a title can occur in all 720 files). Hourly granularity is preserved. Each record contains the following fields:
- wiki code (subproject.project)
- article title
- monthly total (with interpolation when data is missing)
- hourly counts
In the wiki code field, the subproject is the language code (fr, el, ja, etc) or meta, commons etc.
The project is one of b (wikibooks), k (wiktionary), n (wikinews), o (wikivoyage), q (wikiquote), s (wikisource), v (wikiversity), z (wikipedia), m (wikimedia subprojects: commons, meta, species, etc). An .m project suffix combined with a language subproject, i.e. en.m, means the page counts come from the mobile site.
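The wiki code rules above can be sketched in Python as follows. This is only an illustration: the function name `parse_wiki_code` is hypothetical, and the set of Wikimedia subprojects is limited to the three named in the text, which is an assumption (the real list is longer).

```python
# Project letters as listed in the notes above.
PROJECTS = {
    "b": "wikibooks", "k": "wiktionary", "n": "wikinews", "o": "wikivoyage",
    "q": "wikiquote", "s": "wikisource", "v": "wikiversity", "z": "wikipedia",
    "m": "wikimedia",
}

# Assumption: only the subprojects named in the notes; extend as needed.
WIKIMEDIA_SUBPROJECTS = {"commons", "meta", "species"}


def parse_wiki_code(code):
    """Split 'subproject.project' into (subproject, project name, is_mobile)."""
    subproject, sep, project = code.rpartition(".")
    if not sep or project not in PROJECTS:
        raise ValueError(f"unrecognized wiki code: {code!r}")
    # '.m' after a language code (anything that is not a known Wikimedia
    # subproject) is treated as the mobile site.
    if project == "m" and subproject not in WIKIMEDIA_SUBPROJECTS:
        return subproject, "mobile", True
    return subproject, PROJECTS[project], False


print(parse_wiki_code("fr.z"))
print(parse_wiki_code("commons.m"))
print(parse_wiki_code("en.m"))
```

Treating every unknown subproject with an `.m` suffix as mobile is a simplification; a stricter version would validate the subproject against a list of language codes.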
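Hourly counts can be deciphered as follows. Each entry is a day letter, then an hour letter, then the count; entries are separated by commas.
- Hour: from 0 to 23, written as 0 = A, 1 = B ... 22 = W, 23 = X
- Day: from 1 to 31, written as 1 = A, 2 = B ... 25 = Y, 26 = Z, 27 = [, 28 = \, 29 = ], 30 = ^, 31 = _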
Example: 33 views on day 2, hour 4, and 155 views on day 3, hour 7 are coded as 'BE33,CH155'
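A minimal Python sketch of this decoding, assuming the field looks exactly like the example (comma-separated entries, each a day character, an hour character, and a decimal count). The function name `decode_hourly_counts` is hypothetical, and skipping empty entries is a defensive assumption rather than documented behavior.

```python
import re

# One entry: day character (A.._ for 1..31), hour character (A..X for 0..23),
# then the view count in decimal digits.
_ENTRY = re.compile(r"([A-Z\[\\\]^_])([A-X])(\d+)")


def decode_hourly_counts(field):
    """Return a dict mapping (day, hour) to the view count."""
    counts = {}
    for entry in field.split(","):
        if not entry:
            continue  # tolerate a stray or trailing comma
        match = _ENTRY.fullmatch(entry)
        if match is None:
            raise ValueError(f"malformed hourly count entry: {entry!r}")
        day = ord(match.group(1)) - ord("A") + 1   # 'A' -> 1 ... '_' -> 31
        hour = ord(match.group(2)) - ord("A")      # 'A' -> 0 ... 'X' -> 23
        counts[(day, hour)] = int(match.group(3))
    return counts


print(decode_hourly_counts("BE33,CH155"))
# {(2, 4): 33, (3, 7): 155}
```

The day characters after Z work because `[`, `\`, `]`, `^` and `_` directly follow `Z` in ASCII, so a single offset from `A` covers days 1 to 31.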