News/Toolforge.org
Tracked in Phabricator
Task T234617 (RESOLVED)
The Toolforge project is ready to start using toolforge.org as its DNS domain for naming both infrastructure (for example the Toolforge bastion servers) and Toolforge hosted webservices, replacing tools.wmflabs.org. This change includes introducing "host based" naming for webservices. This page describes the changes, how to handle them, and what to expect. The change is expected to primarily affect Toolforge webservices; tools that do not use a webservice interface should need very few changes.
What is changing?
Timeline
What should I do?
SSH to the Toolforge bastions using new hostnames
Update your SSH configuration and muscle memory to use the new canonical bastion names, login.toolforge.org and dev.toolforge.org.
Test your webservice
All webservices running on both the Kubernetes cluster and the job grid are behind HTTP proxy layers which understand URLs in the legacy tools.wmflabs.org/<toolname> and new <toolname>.toolforge.org forms. Initial testing of a tool can be done simply by using the new hostname and URL to access the tool's web interface.
Some Kubernetes webservices may be missing the ingress object needed to route <toolname>.toolforge.org requests to the tool. When this is the case, visiting https://<toolname>.toolforge.org/ will return a "No webservice" response. See "My Kubernetes webservice does not have the proper ingress objects" below for how to fix this.
Migrate your webservice
Once you have tested your webservice and are fairly confident that it works as expected using the new hostname and URL path, you can configure it to redirect all visitors to the new canonical URL. This can simplify migration for tools where the code does not easily handle both URLs at the same time. It is also a neat way to introduce the new domain to your tool's users.
What are the primary changes with moving to the new domain?
New service names for accessing Toolforge SSH bastions
Please update your SSH configuration to use the new fully qualified domain names. The SSH host key fingerprints are the same and may be verified at Help:SSH Fingerprints/login.toolforge.org and Help:SSH Fingerprints/dev.toolforge.org respectively.
The legacy service names will continue to exist for the foreseeable future, but documentation should be updated to use the new canonical names.
New URLs for webservices
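Webservices running in Toolforge are now served under a new URL scheme, https://<toolname>.toolforge.org/, in place of the legacy https://tools.wmflabs.org/<toolname>/ form.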
A system will be put in place to ensure that links to the legacy URLs will continue to work. We understand that many of the tools running on Toolforge have been advertised and used widely and that "cool URIs don't change".
Temporary --canonical switch for webservice
During the compatibility period, the webservice utility has a command line switch, --canonical, to force a redirect from the legacy URL scheme to the new one (for example, webservice --backend=kubernetes --canonical [type] start).
When you start your webservice with this switch, incoming requests are automatically redirected using the HTTP status code 307 Temporary Redirect. For example, a request for https://tools.wmflabs.org/<toolname>/<path>?<query> is redirected to https://<toolname>.toolforge.org/<path>?<query>.
Note how the tool name was removed from the path, but URL parameters and arguments are kept the same.
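The mapping described above can be sketched in a few lines of Python. This is only an illustration of the URL transformation, not the code used by the Toolforge proxy; the function name canonical_url and the tool name my-tool are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

LEGACY_HOST = "tools.wmflabs.org"
CANONICAL_DOMAIN = "toolforge.org"


def canonical_url(legacy_url: str) -> str:
    """Map a legacy path-based URL to the host-based form (illustrative only)."""
    parts = urlsplit(legacy_url)
    if parts.netloc != LEGACY_HOST:
        raise ValueError(f"not a legacy Toolforge URL: {legacy_url}")
    # The first path segment is the tool name; everything after it is kept.
    tool, _, rest = parts.path.lstrip("/").partition("/")
    if not tool:
        raise ValueError("legacy URL does not name a tool")
    return urlunsplit(
        (parts.scheme, f"{tool}.{CANONICAL_DOMAIN}", "/" + rest,
         parts.query, parts.fragment)
    )


# "my-tool" is a made-up tool name used only for this example.
print(canonical_url("https://tools.wmflabs.org/my-tool/search.php?q=cats&page=2"))
# prints https://my-tool.toolforge.org/search.php?q=cats&page=2
```

The tool name moves from the path into the hostname, while the remaining path and the query string pass through unchanged.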
When the redirect mechanism is activated, requests for the legacy URL will not reach your application code because the redirection happens on an external component. Your webservice will only see requests in the new domain and URL scheme.
If you stop your webservice and start it again without the --canonical argument, everything will return to the normal legacy behavior and URL scheme, with no redirection. This ability to revert is why we are using the 307 status code for the redirects during the compatibility period rather than the 301 status code. Modern browsers cache 301 redirects, which makes reverting to the older URL scheme more difficult.
When the compatibility period is over, the --canonical command line switch will no longer be available (see the Timeline and After migration period sections).
Webservice templates
The $HOME/.webservicerc configuration file was intended to help us deal with the growing number of arguments to the webservice command by making it easy to state that the Kubernetes backend should be used. In the years since we have added even more arguments and that does not seem likely to stop as we attempt to expose more control of Kubernetes deployments.
During the 2020 Kubernetes cluster migration we had a temporary migrate action which made it possible to reuse a tool's $HOME/service.manifest file across restarts. This was very handy, and it inspired us to think about a reasonable way to make this functionality available for the start action more generally.
Webservice version 0.65 introduces the concept of a "template" file which can be used to store arguments (and eventually other structured content) for starting a webservice. The code will look for a "--template=..." CLI argument and fall back to looking for a $HOME/service.template file. The latter is what most tools will be expected to use, but we may find interesting uses for multiple templates in a single tool as well.
A webservice template file is a YAML document. It can contain these settings:
By saving desired startup state in a file, the user can use simple webservice stop; webservice start commands again!
Solutions to common problems
My Kubernetes webservice does not have the proper ingress objects
Some Kubernetes webservices will need to perform a full stop and start cycle to create the necessary Kubernetes ingress object for routing <toolname>.toolforge.org requests to the tool. When this is the case, visiting https://<toolname>.toolforge.org/ will return a "No webservice" response.
OAuth not working at new domain
Tracked in Phabricator
Task T244473 (RESOLVED)
The OAuth grants created at meta:Special:OAuthConsumerRegistration/propose include a "callback" URL that tells MediaWiki where to send the user after they approve an OAuth consumer to have access to some of their on-wiki rights.
MediaWiki's OAuth system does not currently allow updating the registered URL after the initial request has been approved. After discussing various alternatives that could be used to work around this, it was decided that the easiest general solution to this problem would be to have Tool maintainers request new grants and be in control of which grant is used at runtime in their application.
To assist in getting new grants approved as quickly as reasonably possible, please use this process:
See the grant for the bash tool as an example.
If you are anxious to have the new grant approved quickly, you can join the #wikimedia-cloud IRC channel and ask for an expedited review with a message like "!help please approve <URL to your proposed grant>". This will not guarantee immediate action, but it can help a bit. If that succeeds, you can skip the usual Steward requests/Miscellaneous + {{oauthapprequest}} process.
Flask and cookies
TODO: This section could be retitled to be about cookie handling in general and include the Flask tutorial specific notes.
Under the legacy scheme, it was necessary to configure the APPLICATION_ROOT (or directly the SESSION_COOKIE_PATH) to ensure that the session cookie would only be sent to your tool and not other tools on tools.wmflabs.org. Under the new scheme, this is no longer necessary; if you followed the instructions at Help:Toolforge/My first Flask OAuth tool, the fix is to remove the APPLICATION_ROOT line from the tool's ~/www/python/src/config.yaml file:
$ sed -i.bak '/APPLICATION_ROOT/d' ~/www/python/src/config.yaml
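To see why the cookie path mattered under the legacy scheme and no longer does, here is a small sketch of how a browser decides whether to send a cookie. It assumes a host-only session cookie (no Domain attribute) and follows the path-match rule from RFC 6265; the helper names and the tool names tool-a and tool-b are hypothetical.

```python
def path_matches(cookie_path: str, request_path: str) -> bool:
    """Path-match rule from RFC 6265, section 5.1.4."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end on a "/" boundary to count as a match.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def cookie_is_sent(cookie_host: str, cookie_path: str,
                   request_host: str, request_path: str) -> bool:
    """Assumes a host-only cookie, so the host must match exactly."""
    return cookie_host == request_host and path_matches(cookie_path, request_path)


# Legacy scheme, cookie path left at "/": tool-b also receives tool-a's cookie.
print(cookie_is_sent("tools.wmflabs.org", "/", "tools.wmflabs.org", "/tool-b/"))
# prints True

# Legacy scheme, cookie path restricted to "/tool-a": tool-b no longer sees it.
print(cookie_is_sent("tools.wmflabs.org", "/tool-a", "tools.wmflabs.org", "/tool-b/"))
# prints False

# New scheme: separate hostnames keep the cookie private even with path "/".
print(cookie_is_sent("tool-a.toolforge.org", "/", "tool-b.toolforge.org", "/"))
# prints False
```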
Interwiki links point to legacy URLs rather than toolforge.org
Tracked in Phabricator
Task T247432 (RESOLVED)
Multiple interwiki (IW) prefixes reference the legacy tools.wmflabs.org domain. During the transition period, while all tools still have a tools.wmflabs.org URL, these interwiki links will continue to work as expected. A visitor following an IW link will be redirected to the proper $TOOL.toolforge.org URL if the target tool has applied the --canonical change to its webservice. Once we reach the final stage of the migration, the Toolforge admin team will work with others to update the IW prefix configuration used on the Wikimedia wikis to point directly at the $TOOL.toolforge.org hosts. This will probably involve a new redirection service hosted in Toolforge to preserve the existing IW link structure.
URL query string is not sent with redirect to toolforge.org
Tracked in Phabricator
Task T250625 (RESOLVED)
Kubernetes ingress objects created by webservice before v0.68 did not include the proper configuration to send query string data when redirecting from tools.wmflabs.org/TOOL to TOOL.toolforge.org.
This has been fixed in the version of webservice deployed on all Toolforge bastions. Affected tools will need to run the following to recreate the ingress object:
webservice stop; webservice --backend=kubernetes --canonical [type] start
If you see this behavior using the gridengine backend, please contact us.
Rewrites in $HOME/.lighttpd.conf not working
A $HOME/.lighttpd.conf configuration file is used by some tools to rewrite URLs internally to reference a different file on disk. This is commonly used in PHP applications which have a "front router" pattern where all requests should actually go to a single PHP script (for example MediaWiki with pretty URLs).
These rewrites will be prefixed with the tool's name matching the legacy URL pattern of tools.wmflabs.org/$TOOL. For your new $TOOL.toolforge.org URL pattern, the tool name should be removed from the rewrite.
Example from the persondata tool:
url.rewrite-once += ( "/persondata/p/(.+)" => "/persondata/person.php?title=$1" )
Updated configuration that will work for both URL schemes:
url.rewrite-once += ( "/persondata/p/(.+)" => "/persondata/person.php?title=$1", "/p/(.+)" => "/person.php?title=$1" )
Updated configuration for tools using --canonical (only $TOOL.toolforge.org):
url.rewrite-once += ( "/p/(.+)" => "/person.php?title=$1" )
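As a rough way to reason about which rule fires for which URL, the combined rule list can be modelled in Python. This is a simplified sketch that assumes rules are tried in order, the first matching pattern wins, and the whole URL is replaced by the target; it is not lighttpd's implementation, and the sample page title is made up.

```python
import re

# The two rules from the "works for both URL schemes" configuration above.
RULES = [
    (r"/persondata/p/(.+)", "/persondata/person.php?title=$1"),
    (r"/p/(.+)", "/person.php?title=$1"),
]


def rewrite_once(url: str, rules=RULES) -> str:
    """Apply the first matching rule and replace the whole URL (simplified model)."""
    for pattern, target in rules:
        match = re.search(pattern, url)
        if match:
            # Substitute $1, $2, ... with the captured groups.
            return re.sub(r"\$(\d)", lambda d: match.group(int(d.group(1))), target)
    return url


# Legacy form, as seen behind tools.wmflabs.org/persondata:
print(rewrite_once("/persondata/p/Ada_Lovelace"))
# prints /persondata/person.php?title=Ada_Lovelace

# New form, as seen behind persondata.toolforge.org:
print(rewrite_once("/p/Ada_Lovelace"))
# prints /person.php?title=Ada_Lovelace
```

In this model the order of the rules matters: the shorter /p/(.+) pattern would also match the legacy URL, so the more specific legacy rule is listed first.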
Cross-Origin Resource Sharing (CORS) requests broken
A deliberate breaking change in this migration is separating each tool into its own "origin" in the browser's sandbox policies. This makes inter-tool communication in the browser more restrictive. Tools which are meant to be accessed from other tools using a browser (cross origin requests or CORS requests) will need to explicitly allow those requests by setting an "Access-Control-Allow-Origin" header.
For a tool using Lighttpd as its webserver (the webserver for the PHP backends), this configuration in $HOME/.lighttpd.conf should allow such requests:
setenv.add-response-header += ( "Access-Control-Allow-Origin" => "*" )
This is the same header that has historically been needed for any tool allowing CORS requests from on-wiki. Using the * wildcard origin makes responses readable by scripts from ANY origin, including every other tool. If your tool needs credentialed requests (for example, requests that send cookies), the wildcard is not sufficient and you will need a more advanced setup.
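For tools that do not use Lighttpd, the same header can be added in application code. The following is a minimal sketch for a Python WSGI application, assuming you want the same wildcard policy as the Lighttpd example; the names allow_any_origin and demo_app are hypothetical and not part of any Toolforge tooling.

```python
def allow_any_origin(app):
    """Wrap a WSGI app so every response carries Access-Control-Allow-Origin: *."""
    def wrapped(environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Drop any existing value so the header is not sent twice.
            headers = [h for h in headers
                       if h[0].lower() != "access-control-allow-origin"]
            headers.append(("Access-Control-Allow-Origin", "*"))
            return start_response(status, headers, exc_info)
        return app(environ, cors_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Tiny stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Wrap the application once at startup and serve the wrapped object.
application = allow_any_origin(demo_app)
```

Frameworks such as Flask expose a WSGI application object that could be wrapped the same way, though a framework-specific CORS extension may be a better fit for anything beyond the wildcard case.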
URLs to static files named /$TOOL... returning 404
Tracked in Phabricator
Task T254640 (RESOLVED)
The system-generated lighttpd config contains a directive for the legacy URL scheme which is intended to strip the tool's name from the root of URLs: alias.url = ( "/$TOOL" => "/data/project/$TOOL/public_html/" ). This alias causes issues with URLs like $TOOL.toolforge.org/$TOOL.css: the leading /$TOOL is treated as the alias prefix and replaced, so lighttpd looks for a file named $HOME/public_html/.css.
This behavior is fixed by webservice v0.72 which was deployed on 2020-06-17. When the --canonical flag is applied while starting a webservice the generated lighttpd config file will no longer include the alias directive. Note that the fix only works when --canonical is used, so you could still see the problem with tools where you are just testing the $TOOL.toolforge.org routing and not forcing it to be used exclusively.
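The prefix substitution behind this bug can be reproduced with a short Python sketch. It models the alias as a plain string-prefix replacement, which is an assumption made for illustration and not lighttpd's actual code; the tool name mytool is hypothetical.

```python
def apply_alias(url_path: str, prefix: str, target: str):
    """Replace a matching URL prefix with a filesystem path (simplified model)."""
    if url_path.startswith(prefix):
        return target + url_path[len(prefix):]
    return None  # alias does not apply


ALIAS_PREFIX = "/mytool"
ALIAS_TARGET = "/data/project/mytool/public_html/"

# Request for https://mytool.toolforge.org/mytool.css
print(apply_alias("/mytool.css", ALIAS_PREFIX, ALIAS_TARGET))
# prints /data/project/mytool/public_html/.css

# A path that does not start with the tool name is left alone by the alias.
print(apply_alias("/style.css", ALIAS_PREFIX, ALIAS_TARGET))
# prints None
```

Because the prefix match does not stop at a path separator, any static file whose name begins with the tool name loses that part of its name.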
Why are we doing this?
All webservices running on Toolforge have traditionally been served using the URL scheme https://tools.wmflabs.org/<toolname>/. URLs in this form are known as "path based routing". This means that all webservices running in Toolforge share the same domain (tools.wmflabs.org), even though they are different tools, managed by different developers, with different source code, different purposes, and different scopes. The separate applications are only differentiated by the path component of the URL.
Path based routing was commonly used in shared hosting environments in the early years of the World Wide Web. As web applications became more complex, using "host based routing" where each application is given its own hostname became more common. Web browsers also adopted many internal features related to security and sandboxing which rely on this host based separation. In 2020 it is long past time for Toolforge to catch up to this trend.
The new toolforge.org domain also helps us get a step closer to moving away from the labs keyword. Toolforge is an established service for the Wikimedia community and should no longer be considered an experiment.
After migration period
When the migration period is over, we expect every webservice running in Toolforge to be working well with the new domain and URL scheme.
We plan to preserve old URLs by introducing a legacy redirector which will work mostly like what the --canonical option does.
Every request to URLs in the legacy scheme will receive an HTTP 308 Permanent Redirect status code.
The list of tools that will receive this redirection will be static, and when the migration period is over new webservices will only be allowed in the new scheme.
Exceptions to service migration
Toolforge email will still be operational under the legacy address formats and is not currently being updated:
Last edited on 14 July 2020, at 20:57
Wikitech
Content is available under CC BY-SA 3.0 unless otherwise noted.