On 9/8/21 11:55 PM, Ryan Kemper wrote:
We noticed a user who was responsible for the most requests by far
(albeit still not a large percentage of total requests) and banned
them, and that immediately restored full service availability
(following another quick round of blazegraph restarts to get the
deadlocked blazegraph processes back up and running properly).
This problem is resolved (for now at least). I'll be sending an e-mail
out to the user we banned informing them of the user agent ban.
Note my Phabricator response at:
https://phabricator.wikimedia.org/T206560#7342750.
It covers the "Anytime Query" functionality in Virtuoso, i.e., one of the
built-in features you can use to protect against attacks (intentional or
inadvertent).
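For readers unfamiliar with the feature, here is a minimal client-side sketch of how an "Anytime Query" request might look. It rests on two assumptions that are not stated in this thread and should be checked against the Virtuoso documentation: that the /sparql endpoint accepts a `timeout` parameter in milliseconds, and that a partial result is flagged with the response header `X-SQL-State: S1TAT`. The helper names are hypothetical, and the endpoint URL is only an illustration.

```python
from urllib.parse import urlencode

# Illustrative endpoint; any Virtuoso-backed SPARQL endpoint could be substituted.
ENDPOINT = "https://dbpedia.org/sparql"


def build_anytime_url(query, timeout_ms, endpoint=ENDPOINT):
    """Build a GET URL asking the server to stop after timeout_ms.

    Assumption: the server honours a `timeout` parameter (milliseconds)
    and returns whatever results it has gathered when the limit is hit.
    """
    params = {
        "query": query,
        "timeout": str(timeout_ms),
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


def is_partial_result(headers):
    """Return True if the response headers mark the result as incomplete.

    Assumption: a timed-out "anytime" response carries
    the header X-SQL-State with the value S1TAT.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-sql-state", "").strip() == "S1TAT"
```

A client would fetch the URL with any HTTP library and pass the response headers to `is_partial_result` to decide whether to treat the answer as complete.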
Sometimes folks don't have a clear sense of the impact of their queries
relative to the usage needs of others. On other occasions, they just
want to download everything.
In some cases, you may have to ban an account. Historically though, the
"Anytime Query" has kept the "Fair Use" rules of DBpedia intact [1].
[1]
https://www.dbpedia.org/resources/sparql/ -- search on "Fair Use"
Kingsley
On Wed, Sep 8, 2021 at 8:03 PM Ryan Kemper <rkemper(a)wikimedia.org> wrote:
Our WDQS backend servers (in CODFW only) have incredibly patchy
availability currently.
As a result, a sizeable portion of queries made to
query.wikidata.org are failing or taking unusually long.
We're doing our best to isolate a cause (basically a user or
users submitting particularly expensive or error-generating
queries). Until we succeed in that, service availability is likely
to be quite poor.
Note that we have a mitigation in place where we're
restarting blazegraph across the affected hosts (codfw) hourly,
but that mitigation is currently insufficient.
You can see the current status of wdqs backend server availability
here:
https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=7&from=now-1h&to=now&refresh=1m
^ This is a graph of our total triple count (i.e. not explicitly a
graph of service availability), but servers affected by the
blazegraph deadlock issue that we're experiencing fail to report
metrics while they're affected. So the presence or absence of RDF
triple counts for a given host corresponds to its uptime.
_______________________________________________
Wikidata mailing list -- wikidata(a)lists.wikimedia.org
--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software