Reading/Web/PDF Functionality
Update on PDF rendering, July 15 2019
We’ve launched the new PDF renderer. We’re looking at feedback, but haven't so far seen any significant issues. We might incorporate some suggestions, but want to note that this is not an ongoing project with continuous development. In other words, now that it's deployed and proven to work, the new renderer is entering maintenance mode. The talk page of this page won’t be abandoned, but it could take a while before anyone reacts, simply because everyone's got so much else on their plate.
In terms of books, we've left it in the hands of volunteer developers and PediaPress. We'll be glad to reach out to them with questions, but we're not planning any involvement in terms of the technical implementation.
Update on PDF rendering, June 4, 2019
We have deployed the new renderer for single-article PDFs for all projects. We hope this will resolve the issues associated with the Electron renderer, which was often unable to generate PDFs as expected. Please feel free to try out the new renderer and let us know if you have questions or come across any bugs or other issues.
Update on PDF rendering, March 18 2019
We're getting close to deploying our new renderer, Proton, with only a few tasks remaining as blockers (as can be seen in the task graph in phab:T181084). We will post another update once the deployment date is set. Proton will replace the Electron renderer as the default renderer for single-page PDFs.
Update on books, August 17 2018
Sample book from PediaPress
Here is an updated and more comprehensive sample of the new book renderer. The layout has changed quite a bit from the first version presented at Wikimania. Thanks for all the feedback. The export still has a number of significant issues: page breaks, infoboxes, tables, and math formulas need to be improved substantially. This sample file, which focuses on international scripts and math formulas, reveals some of the problems that still need to be solved. Math formulas are currently rendered using MathML; switching to LaTeX should lead to significant improvements.
Update on books, August 8 2018
We have been working with PediaPress on generating and styling the new books. They have provided us with a sample of the current output, which will be very similar to the final version. We discussed points of improvement with the PediaPress team, which they are addressing currently. If you have any feedback or other comments on these samples, please let us know on the talk page.
Update on books, April 2018
Books functionality will be returning via PediaPress. After investigating the new renderer in depth, we realized that core features of the original book creator (such as page numbers and a table of contents) would be very difficult to implement using the new renderer. In addition, we had significant issues with our concatenation code. We therefore had to look for alternatives for bringing back the PDF books functionality on Wikimedia projects. We reached out to PediaPress, who were the original patrons of books on Wikipedia, to see if they would be interested in taking up PDF rendering for books once again. They have agreed, and we are currently working on the details and schedule. They will start by working on a temporary solution based on an older technology that has previously been used to create PDFs. This might have some drawbacks when it comes to graphical elements, such as maps, but will mean a faster working solution. They then plan to work on a new HTML-to-PDF renderer, based on feedback on the first implementation.
Update January 2018
We're currently preparing performance tests of the PDF to book function. We should know more in early February.
Update September 2017
Our current PDF rendering service, the offline content generator (OCG), is no longer maintainable. Simply put, it's breaking down. The Reading team at the Wikimedia Foundation has been working towards replacing it for months. OCG has been running on outdated code which may introduce security vulnerabilities and other major issues in the future. Over the last three months, we’ve had banners on the PDF creation page asking for feedback on the prototype for our new renderer. The new renderer will improve on OCG: it will be able to print tables and infoboxes and will include styling focused on better readability. We've gathered a lot of good feedback on the prototype and are working on making the required updates to our new PDFs.
Later addendum: Turning PDF book rendering OFF for the short term
Unfortunately, major issues with our old renderer (OCG) will require us to remove it as a rendering option prior to completing the necessary updates for the books feature. This is earlier than we wanted. By the time we remove OCG, the work for rendering of single articles will be completed. However, the rendering of books will be paused while we evaluate and complete the necessary work. Our initial choice of renderer for the replacement, the Electron rendering service, is not capable of supporting PDFs of larger sizes and fails when attempting to render a book with multiple articles. We will be working to select a new rendering system for books which can handle the size of the files and support our requirements. This is not how we planned to do this. We never aimed to temporarily remove the book PDF functionality.
Timeline:
Functionality:
For a full list of current and upcoming functionality, see below.
In addition to this page being updated, this will be communicated in a banner on PDF creation page, in Tech News and on some Wikimedia mailing lists.
Introduction
Our current PDF rendering service, the offline content generator (OCG), is no longer maintainable. Simply put, it's breaking down. Originally created by a third party, it currently runs on outdated code which may introduce security vulnerabilities and other major issues in the future. If we're to keep the PDF functionality, we unfortunately have to replace it, or we might suddenly find ourselves in a situation where we'd have to take it down without having planned to do so.
Additionally, it does not support a number of rendering requests from the community, the main one being the ability to render tables. We have selected a new service, the Electron rendering service, as a suitable replacement. Our next step is to duplicate the functionality provided by OCG using the Electron rendering service. Below, we describe the main portions of the functionality we have identified as necessary. We would like to invite conversation around what is missing or what is superfluous in the provided list. We would also like to outline our future plans for PDF rendering to gather initial feedback.
Known Issues
There is currently an upstream bug in Firefox that affects the styling of infoboxes displayed across multiple pages. Progress from Mozilla can be tracked here: https://bugzilla.mozilla.org/show_bug.cgi?id=688556
Userbase
The following table shows a sample of traffic to the Electron "Download as PDF" service over a 6-hour period. The traffic is broken down by operating system (OS), browser, and browser major version (e.g. Windows 7, Chrome v61.*). Note that the majority of our traffic appears to come from Windows-based machines.
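| OS | Browser | Browser Major Version | % of requests |
| --- | --- | --- | --- |
| Other | Other | - | 14.38 |
| Windows 7 | Chrome | 61 | 12.42 |
| Windows 10 | Chrome | 61 | 8.83 |
| Windows 7 | IE | 11 | 7.33 |
| Windows 7 | Firefox | 56 | 6.59 |
| Windows 10 | Firefox | 56 | 3.82 |
| Windows 10 | Edge | 15 | 3.24 |
| Windows 8.1 | Chrome | 61 | 3.07 |
| Windows XP | Chrome | 49 | 2.2 |
| Windows 10 | Chrome | 59 | 1.53 |
| Windows 10 | IE | 11 | 1.51 |
| Windows 8.1 | Firefox | 56 | 1.31 |
| Windows XP | Firefox | 52 | 1.22 |
| Windows 8 | Chrome | 61 | 1.15 |
| Windows 8.1 | IE | 11 | 1.15 |
| Mac OS X | Safari | 11 | 0.9 |
| Windows 7 | Firefox | 53 | 0.89 |
| Windows 7 | Firefox | 52 | 0.78 |
| Ubuntu | Firefox | 56 | 0.78 |
| Windows XP | IE | 6 | 0.7 |
| Windows 7 | Chrome | 55 | 0.68 |
| Windows 7 | Firefox | 55 | 0.62 |
| Mac OS X | Chrome | 61 | 0.62 |
| Android | UC Browser | 11 | 0.6 |
| Windows 10 | Edge | 14 | 0.59 |
| Windows 7 | Opera | 48 | 0.53 |
| Android | Chrome Mobile | 61 | 0.49 |
| Windows 10 | Opera | 48 | 0.44 |
| Windows 7 | Chrome | 60 | 0.4 |
| Windows Vista | Chrome | 49 | 0.39 |
| Windows 7 | Yandex Browser | 17 | 0.37 |
| Windows 10 | Firefox | 55 | 0.37 |
| Mac OS X | Safari | 10 | 0.36 |
| Windows 10 | Chrome | 50 | 0.34 |
| Android | Android | 4 | 0.33 |
| Mac OS X | Firefox | 56 | 0.33 |
| Windows 10 | Chrome | 60 | 0.32 |
| Windows 8.1 | Chrome | 43 | 0.3 |
| Android | Amazon Silk | 60 | 0.29 |
| Windows 7 | Sogou Explorer | 1 | 0.27 |
| Windows 8 | IE | 10 | 0.26 |
| Windows 7 | IE | 8 | 0.26 |
| Windows 7 | IE | 9 | 0.25 |
| Windows 8 | Opera | 12 | 0.25 |
| Linux | Firefox | 52 | 0.25 |
| Mac OS X | Firefox | 53 | 0.24 |
| Windows 7 | Firefox | 45 | 0.24 |
| Windows 10 | Firefox | 57 | 0.24 |
| Windows 7 | Firefox | 38 | 0.22 |
| Windows 10 | Firefox | 47 | 0.21 |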
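The observation that most traffic comes from Windows machines can be checked by summing the percentages per OS family. The following is a minimal Python sketch, not part of the original analysis: it uses only the table rows with a share of at least 0.75%, and the helper names (`os_family`, `share_by_family`) are hypothetical.

```python
from collections import defaultdict

# (OS, browser, major version, % of requests).
# Only rows with a share of at least 0.75% are copied from the table above;
# the long tail of smaller rows is left out to keep the sketch short.
ROWS = [
    ("Other", "Other", None, 14.38),
    ("Windows 7", "Chrome", 61, 12.42),
    ("Windows 10", "Chrome", 61, 8.83),
    ("Windows 7", "IE", 11, 7.33),
    ("Windows 7", "Firefox", 56, 6.59),
    ("Windows 10", "Firefox", 56, 3.82),
    ("Windows 10", "Edge", 15, 3.24),
    ("Windows 8.1", "Chrome", 61, 3.07),
    ("Windows XP", "Chrome", 49, 2.2),
    ("Windows 10", "Chrome", 59, 1.53),
    ("Windows 10", "IE", 11, 1.51),
    ("Windows 8.1", "Firefox", 56, 1.31),
    ("Windows XP", "Firefox", 52, 1.22),
    ("Windows 8", "Chrome", 61, 1.15),
    ("Windows 8.1", "IE", 11, 1.15),
    ("Mac OS X", "Safari", 11, 0.9),
    ("Windows 7", "Firefox", 53, 0.89),
    ("Windows 7", "Firefox", 52, 0.78),
    ("Ubuntu", "Firefox", 56, 0.78),
]


def os_family(os_name):
    """Collapse 'Windows 7', 'Windows 10', ... into one 'Windows' family."""
    return "Windows" if os_name.startswith("Windows") else os_name


def share_by_family(rows):
    """Sum the request percentages for each OS family."""
    totals = defaultdict(float)
    for os_name, _browser, _version, pct in rows:
        totals[os_family(os_name)] += pct
    return dict(totals)


shares = share_by_family(ROWS)

# Print the families from largest to smallest share.
for family, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{family:10s} {pct:6.2f}%")
```

Even with the long tail omitted, the Windows family accounts for more than half of the sampled requests, which is consistent with the note above.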
Current Functionality Requirements
The following is a list of the current requirements for PDF rendering, for both single-article PDFs and books. Requirements that differ from the current implementation are displayed in bold.
History
Update After Consultation
Proposed PDF and print styles based on feedback from consultation
We launched a consultation on the current implementation of the PDF renderer in early June, 2017. After reviewing the consultation responses, we have made the following observations:
Based on the feedback, we have incorporated the following into our new print styles:
The remainder of the requests above will be postponed until the second iteration of the PDF renderer, in which we plan to build a settings mode that will allow for customization of the available options.
Proposal
The following is a proposal for the scope of functionality necessary for PDF rendering:
Differences between current and future implementation
| Feature | OCG | New Service | Notes |
| --- | --- | --- | --- |
| Rendering individual articles | Yes | Yes | |
| Rendering multiple articles using the book creator | Yes | Yes | |
| Contains table of contents for multiple articles | Yes | Yes | |
| Renders tables | No | Yes | |
| Attribution | Yes | Yes | Open question: location of attribution within the new service |
| Styling | LaTeX | New styles | |
| N-column layout | Yes | No | |
| Default 2-column layout | Yes | Tentative | Default one-column or two-column layout will be chosen based on feedback and quantitative and/or qualitative testing |
| Output format | PDF, Plaintext | PDF only | |
Design
The new PDF styles will be designed for increased readability. Based on community feedback and qualitative or quantitative testing, support for a 2-column layout may be built for the book creator and/or for individual PDFs.
Development and Deployment Roadmap
The following is a rough outline of the development and deployment roadmap. It is subject to change.
  1. April – May 2017:
    1. The Reading team builds back-end support for functionality identified above
    2. Communities are consulted on expanding or shrinking proposed functionality
    3. Qualitative test performed for styling
  2. June – July 2017:
    1. New styles implemented
    2. First iteration is launched along with OCG on all projects and performance is compared
    3. Iterations based on consultations and identified edge cases
  3. August – September 2017:
    1. Additional changes made if necessary
  4. October 2017:
    1. Second iteration launched without OCG on all projects
Single Articles
Phabricator Tracking
All PDF-related changes, including sunsetting OCG, replacing the Electron PDF renderer, and any updates to books or the collections extension, are tracked under the Phabricator project Proton. The project page displays recent updates for all tasks related to PDFs.
Books
Functionality available in October, 2017
Note: no changes will be made to the current book creator workflow at this time
Functionality available in November - December, 2017
Books will contain a table of contents with page numbers
Selecting a section from the table of contents will navigate the user to the corresponding section within the book
Styles for books will be updated for improved readability
Alternative
There is an alternative way of exporting MediaWiki to LaTeX, PDF, ODT and EPUB:
http://mediawiki2latex.wmflabs.org/
The computational resources on the server are limited.
If you run Ubuntu Linux and want results faster, you can install the m2l-pyqt or mediawiki2latex packages.
Last edited on 3 April 2021, at 23:11
Content is available under CC BY-SA 3.0 unless otherwise noted.