Help:CirrusSearch elasticsearch replicas

From Wikitech
Revision as of 12:39, 17 May 2019 by Ebernhardson

Cloud Elastic is a replica of the CirrusSearch elasticsearch indices made available to WMF cloud applications. Applications can use the full power of the elasticsearch search APIs to query the search indices in ways that CirrusSearch doesn't expose directly on the wikis themselves.

Accessing

There are three clusters, named chi, psi and omega. chi contains approximately the 200 largest wikis. The remaining smaller wikis are split equally between psi and omega.

Name URL
chi https://cloudelastic1001.wikimedia.org:8243/
psi https://cloudelastic1001.wikimedia.org:8643/
omega https://cloudelastic1001.wikimedia.org:8443/
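
The table above can be captured in a small shell helper so that scripts refer to clusters by name rather than by port. This is only a minimal sketch: the function name cloudelastic_base_url is hypothetical and not part of any Wikimedia tooling, and the URLs are copied from the table.

```shell
# Map a cluster name to its base URL, using the ports listed in the table.
# cloudelastic_base_url is a hypothetical helper name used for illustration.
cloudelastic_base_url() {
    case "$1" in
        chi)   echo "https://cloudelastic1001.wikimedia.org:8243" ;;
        psi)   echo "https://cloudelastic1001.wikimedia.org:8643" ;;
        omega) echo "https://cloudelastic1001.wikimedia.org:8443" ;;
        *)     echo "unknown cluster: $1" >&2; return 1 ;;
    esac
}
```

With this in place, a request could be written as curl -XGET "$(cloudelastic_base_url psi)/".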

Clusters can be accessed through each other using the elasticsearch cross-cluster search syntax. For example, labswiki, which lives on the omega cluster, can be queried through the chi cluster with:

curl -XGET 'https://cloudelastic1001.wikimedia.org:8243/omega:labswiki/_search?q=example'
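
The URL shape used above (base URL, then remote cluster name, a colon, and the index) can be assembled by a small helper. This is a sketch under stated assumptions: cross_cluster_search_url is a hypothetical name, and the query string is assumed to be URL-safe already, since no escaping is performed.

```shell
# Build a search URL that reaches an index on a remote cluster through another cluster.
# cross_cluster_search_url is a hypothetical helper; arguments:
#   $1 base URL of the cluster receiving the request (a trailing slash is tolerated)
#   $2 name of the remote cluster that holds the index
#   $3 index name or wiki db name
#   $4 value for the q= parameter (assumed to be URL-safe; no escaping is done)
cross_cluster_search_url() {
    printf '%s/%s:%s/_search?q=%s\n' "${1%/}" "$2" "$3" "$4"
}
```

For example, curl -XGET "$(cross_cluster_search_url https://cloudelastic1001.wikimedia.org:8243/ omega labswiki example)" would issue the same request as the command above.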

Indices Available

All wikis have two indices, of the format <dbname>_content and <dbname>_general. The content index contains all of the content namespaces of the wiki; the general index contains everything else. For example, on the Wikipedias, articles are found in the content index and talk pages are found in the general index. Querying both indices can be done by providing only the wiki db name.
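
The naming scheme can be illustrated with a short helper that derives both index names from a db name. This is a minimal sketch; wiki_indices is a hypothetical function name.

```shell
# Print the two index names for a wiki, following the
# <dbname>_content / <dbname>_general format, one per line.
# wiki_indices is a hypothetical helper name used for illustration.
wiki_indices() {
    printf '%s_content\n%s_general\n' "$1" "$1"
}
```

To search only the content namespaces of labswiki on the omega cluster, a request might look like curl -XGET 'https://cloudelastic1001.wikimedia.org:8443/labswiki_content/_search?q=example'.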

Schema

See mw:Extension:CirrusSearch/Schema.

Example Use Cases

Query all wikis

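# '*' matches every index on the queried cluster; '*:*' matches every index on every remote cluster.
curl -XGET 'https://cloudelastic1001.wikimedia.org:8243/*,*:*/_search?q=example'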