Help:CirrusSearch elasticsearch replicas: Difference between revisions

From Wikitech

Revision as of 13:23, 17 May 2019

Cloud Elastic is a replica of the CirrusSearch elasticsearch indices made available to WMF cloud applications. The servers are not reachable from the internet at large; they are only accessible to applications running inside WMF cloud. Applications can use the full power of the elasticsearch search APIs to query the search indices in ways that CirrusSearch doesn't expose directly on the wikis themselves.

Accessing

There are three clusters, named chi, psi and omega. chi contains approximately the 200 largest wikis; psi and omega each contain about half of the remaining, smaller wikis.

Name URL
chi https://cloudelastic1001.wikimedia.org:8243/
psi https://cloudelastic1001.wikimedia.org:8643/
omega https://cloudelastic1001.wikimedia.org:8443/
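
As a small convenience, the table above can be wrapped in a shell function so scripts refer to clusters by name. This is only a sketch: the function name cloudelastic_url is hypothetical, and the ports are copied from the table.

```shell
# Map a cluster name to its base URL, using the ports listed in the table.
# cloudelastic_url is an illustrative helper, not part of any existing tooling.
cloudelastic_url() {
  case "$1" in
    chi)   echo "https://cloudelastic1001.wikimedia.org:8243" ;;
    psi)   echo "https://cloudelastic1001.wikimedia.org:8643" ;;
    omega) echo "https://cloudelastic1001.wikimedia.org:8443" ;;
    *)     echo "unknown cluster: $1" >&2; return 1 ;;
  esac
}

# Usage (only reachable from inside WMF cloud):
#   curl -XGET "$(cloudelastic_url psi)/"
```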

Each cluster can reach the others through the elasticsearch cross cluster search syntax, which prefixes the index name with the remote cluster's name and a colon. For example labswiki, which lives on the omega cluster, can be queried through the chi cluster with:

curl -XGET 'https://cloudelastic1001.wikimedia.org:8243/omega:labswiki/_search?q=example'
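
The cross cluster form of the URL can be assembled from its parts, which makes the cluster:index structure explicit. A minimal sketch, assuming a hypothetical helper named ccs_search_url; the query string is passed through as-is and is not URL-encoded.

```shell
# Build a search URL for an index that lives on a remote cluster.
# $1 = base URL of the cluster being contacted
# $2 = name of the remote cluster that holds the index
# $3 = index name (or wiki alias)
# $4 = query string (not URL-encoded here; keep it simple or encode it first)
ccs_search_url() {
  printf '%s/%s:%s/_search?q=%s\n' "$1" "$2" "$3" "$4"
}

# Usage (only reachable from inside WMF cloud):
#   curl -XGET "$(ccs_search_url https://cloudelastic1001.wikimedia.org:8243 omega labswiki example)"
```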

Indices Available

All wikis have two indices, named <dbname>_content and <dbname>_general. The content index contains all of the content namespaces of the wiki; the general index contains everything else. For example, on a Wikipedia the articles are found in the content index and talk pages are found in the general index. Both indices can be queried at once through an alias by providing only the wiki db name.
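
The naming convention can be expressed as a small lookup from a wiki db name and a scope to the index or alias to search. This is an illustrative sketch; wiki_index and its scope names (content, general, all) are hypothetical.

```shell
# Resolve a wiki db name ($1) and a scope ($2) to the index or alias to search.
wiki_index() {
  case "$2" in
    content) echo "${1}_content" ;;  # content namespaces only
    general) echo "${1}_general" ;;  # everything else
    all)     echo "$1" ;;            # alias covering both indices
    *)       echo "unknown scope: $2" >&2; return 1 ;;
  esac
}

# Usage (only reachable from inside WMF cloud):
#   curl -XGET "https://cloudelastic1001.wikimedia.org:8243/$(wiki_index enwiki all)/_search?q=example"
```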

The set of indices that exist in a cluster can be queried through the elasticsearch cat indices API.

curl -XGET https://cloudelastic1001.wikimedia.org:8243/_cat/indices
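
The cat indices output can be narrowed to an index pattern and to selected columns using the API's v (header row), h (columns) and s (sort) parameters. A sketch, assuming a hypothetical helper named cat_indices_url and the chi port from the table above:

```shell
# Build a cat indices URL limited to an index pattern, with a header row,
# three columns, and rows sorted by index name.
# $1 = cluster base URL, $2 = index pattern such as 'enwiki_*'
cat_indices_url() {
  printf '%s/_cat/indices/%s?v&h=index,docs.count,store.size&s=index\n' "$1" "$2"
}

# Usage (only reachable from inside WMF cloud):
#   curl -XGET "$(cat_indices_url https://cloudelastic1001.wikimedia.org:8243 'enwiki_*')"
```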

Schema

See mw:Extension:CirrusSearch/Schema.

Example Use Cases

Query all indices

curl -XGET 'https://cloudelastic1001.wikimedia.org:8243/*,*:*/_search?q=example'

Query all content indices

curl -XGET 'https://cloudelastic1001.wikimedia.org:8243/*_content,*:*_content/_search?q=example'

Fetch full document for single page by page id

curl -XGET https://cloudelastic1001.wikimedia.org:8243/enwiki_content/page/33179123

Fetch full document for single page by title

curl -XGET 'https://cloudelastic1001.wikimedia.org:8243/enwiki_content/_search?q=title.keyword:Elasticsearch'
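
Queries that outgrow the q= parameter can be sent as a JSON request body instead. The following is a sketch rather than a recommended query: title_query_body is a hypothetical helper, the match query and size are illustrative choices, and the title argument is inserted without JSON escaping.

```shell
# Print a search request body that matches on the title field and
# returns only the title of up to five hits.
# $1 = text to match (not JSON-escaped; avoid quotes and backslashes)
title_query_body() {
  cat <<EOF
{
  "query": { "match": { "title": "$1" } },
  "_source": ["title"],
  "size": 5
}
EOF
}

# Usage (only reachable from inside WMF cloud):
#   title_query_body Elasticsearch | curl -XGET -H 'Content-Type: application/json' \
#     'https://cloudelastic1001.wikimedia.org:8243/enwiki_content/_search' -d @-
```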