Help:Toolforge/Elasticsearch

From Wikitech

This page contains information about local Elasticsearch services in Toolforge. For information about the replica of the Wikimedia CirrusSearch Elasticsearch indices available from Toolforge and Cloud VPS instances, see Help:CirrusSearch elasticsearch replicas.

About Elasticsearch

Elasticsearch is a full-text search system built on Apache Lucene. It can be used to index and search data stored as JSON documents.

Elasticsearch is the technology used to power Wikimedia's CirrusSearch system.

Elasticsearch for Toolforge

An Elasticsearch version 7 cluster for all tools is available at http://elasticsearch.svc.tools.eqiad1.wikimedia.cloud, on the non-standard port 80.

This Elasticsearch cluster is a shared resource. All documents indexed in it can be read by anonymous users from within Toolforge. Write access is needed to create new indexes, and a password is needed to store or update documents. However, see the additional notes on the talk page.

Read-only access

The Elasticsearch servers allow anyone to read any of the indexes that they contain. This access is limited to other hosts in the Toolforge project (e.g. Kubernetes containers and the bastion servers).

The Elasticsearch service is available on port 80 at http://elasticsearch.svc.tools.eqiad1.wikimedia.cloud

Note: The default Elasticsearch port (9200) is not used.
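As a minimal sketch of anonymous read access, the following uses only the Python standard library to run a URI search with a GET request (GET is used because PUT, POST, and DELETE require credentials on this cluster). The index name `mytool-pages` and the field `title` are hypothetical placeholders, and the request only succeeds from a host inside Toolforge.

```python
import json
import urllib.parse
import urllib.request

# Base URL from this page; the service listens on port 80, not 9200.
ES_URL = "http://elasticsearch.svc.tools.eqiad1.wikimedia.cloud"


def search_url(index, query, size=10):
    """Build the URL for a URI search (the `q` parameter) against one index."""
    params = urllib.parse.urlencode({"q": query, "size": size})
    return f"{ES_URL}/{urllib.parse.quote(index, safe='')}/_search?{params}"


def search(index, query, size=10, timeout=10):
    """Run an anonymous GET search and return the decoded JSON response.

    Only reachable from inside Toolforge (for example a bastion or a
    Kubernetes container).
    """
    with urllib.request.urlopen(search_url(index, query, size), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # "mytool-pages" and "title:cat" are hypothetical; substitute a real index.
    print(search_url("mytool-pages", "title:cat", size=5))
```

Calling `search("mytool-pages", "title:cat")` from a Toolforge host would return the parsed response, with matching documents under `hits.hits`.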

Write access

Elasticsearch does not offer multi-tenant access control in its open source version.

PUT, POST, or DELETE requests sent to the Elasticsearch servers require HTTP Basic Authentication using a username and password specific to each tool.

Requests for write access can be made by filing this Phabricator task.

Once credentials have been created, they are made available to your tool as the environment variables TOOL_ELASTICSEARCH_USER and TOOL_ELASTICSEARCH_PASSWORD.
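A minimal sketch of an authenticated write, assuming the two environment variables above are set for the tool. It builds a PUT request with an HTTP Basic Authentication header using only the standard library; the index name `mytool-pages`, the document ID, and the document body are hypothetical.

```python
import base64
import json
import os
import urllib.request

# Base URL from this page; the service listens on port 80, not 9200.
ES_URL = "http://elasticsearch.svc.tools.eqiad1.wikimedia.cloud"


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authentication header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_index_request(index, doc_id, document, environ=os.environ):
    """Build (but do not send) a PUT request that stores one JSON document.

    Credentials are read from TOOL_ELASTICSEARCH_USER and
    TOOL_ELASTICSEARCH_PASSWORD; a KeyError means they are not set.
    """
    user = environ["TOOL_ELASTICSEARCH_USER"]
    password = environ["TOOL_ELASTICSEARCH_PASSWORD"]
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={
            "Authorization": basic_auth_header(user, password),
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

Passing the returned request to `urllib.request.urlopen` from inside Toolforge would send it. Keeping the request construction separate from the network call makes the credential handling easy to check in isolation.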

An older procedure placed the credentials in /data/project/$TOOL/.elasticsearch.ini.

Access requests are currently processed manually and may take a few days to be fulfilled.

Python considerations

If you get an error message:

elasticsearch.exceptions.UnsupportedProductError: The client noticed that the server is not a supported distribution of Elasticsearch

you have probably installed a client library that is incompatible with the version running on the server. Running the obvious pip install elasticsearch will get you the wrong version, as will pip install opensearch. What you want is pip install opensearch-py. There's more about this on Stack Overflow.
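A minimal sketch of connecting with the opensearch-py client, assuming the package is installed and, for writes, that the credential environment variables are set. The helper names `client_kwargs` and `make_client` are hypothetical; `OpenSearch`, `hosts`, `http_auth`, and `use_ssl` come from the opensearch-py library.

```python
import os

# Host and port from this page; the service uses port 80, not 9200.
ES_HOST = "elasticsearch.svc.tools.eqiad1.wikimedia.cloud"
ES_PORT = 80


def client_kwargs(environ=os.environ):
    """Collect constructor arguments for the OpenSearch client.

    Credentials are added only when both environment variables are present,
    so the same helper works for anonymous read-only use.
    """
    kwargs = {
        "hosts": [{"host": ES_HOST, "port": ES_PORT}],
        "use_ssl": False,  # the service is plain HTTP
    }
    user = environ.get("TOOL_ELASTICSEARCH_USER")
    password = environ.get("TOOL_ELASTICSEARCH_PASSWORD")
    if user and password:
        kwargs["http_auth"] = (user, password)
    return kwargs


def make_client(environ=os.environ):
    """Create a client; requires `pip install opensearch-py`."""
    # Imported here so client_kwargs() stays usable without the package.
    from opensearchpy import OpenSearch

    return OpenSearch(**client_kwargs(environ))
```

From inside Toolforge, something like `make_client().search(index="mytool-pages", body={"query": {"match_all": {}}})` would then query a (hypothetical) index.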

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)