Wikipedia talk:Wikidata/2017 State of affairs/Archive 13


An example of how Wikidata works

An editor, User:ScorumME, with no edits to Wikidata whatsoever and no information on his edits on other wikis, appears, and proposes/requests a bot as his very first edit[1]. At the bot request, User:Mbch331 (an admin there) requests to make a few test edits. The bot owner complies, and User:Ymblanter, another Wikidata admin and bureaucrat, checks the test edits and approves the bot.

The test edits which led to the approval of this bot are:

Looking at these additions, we see that all entries are referenced to [5] https://scorum.me/football/tourneys/1103-2/j-league. Notice anything particular about that website? Right, the website name is the same as that of the bot owner.

It gets worse. When you follow that link, you go to [6] https://scorumcoins.com/, a site to promote a new cryptocurrency with "Tokens Crowdsale Starts In X days": "1 SCR = 1 USD WE ACCEPT ONLY BTC AND ETH" and so on. Scroll down, down, down and you'll find "KEY PARTNER: Microsoft" (may be true or not; I have neither the means nor the interest to verify) and "OUR FRIENDS: Wikipedia":

"we've partnered up with Wikipedia to provide sports fans with the match data from Scorum right on Wikipedia pages." Um, what? No way, José. You've infiltrated one unreliable affiliate of Wikipedia, where even people with serious responsibilities seem to accept anything, no questions asked. That doesn't mean that you've partnered up with Wikipedia, or that e.g. enwiki as a whole would ever accept your contributions (some pro-Wikidata editors may accept this shit, but let's hope they are still a minority here).

Does the site even contain the statistics it is supposed to source? Well, I can go here, but clicking on "Japan" gives no result. I'm not able to type into the search box either.

This is the kind of site and editor which gets speedily promoted to "bot" so they can edit at high speeds, create numerous pages like the same one five times in a row[7][8][9][10][11] even though they had already created the same page a month earlier[12]. Never mind that Wikidata already had an item for this[13]...

So please tell me again, why should we trust this site, Wikidata, for anything at all, if their admins and bureaucrats promote spambots and no one checks their history or contributions either before or after this promotion? Fram (talk) 12:36, 30 October 2017 (UTC)

I am sorry to say it, but this looks like an endless rant along the lines of "Wikipedia lies" and "Wikipedia is not reliable". I approved the bot because it complied with the policy. I can block it or remove the flag if somebody, rather than posting here, goes to d:Wikidata:Administrators' noticeboard, makes a case, and gets consensus. I am not going to act on the basis of what someone writes on a project other than Wikidata. This would be as ridiculous as if someone complained on Wikidata about someone's behavior on the English Wikipedia, and I administered a topic ban without discussing it here.--Ymblanter (talk) 12:52, 30 October 2017 (UTC)
If you are unhappy with my activity as a bureaucrat on Wikidata, you can initiate the procedure of flag removal. My prediction is you will be laughed at, but you can try anyway.--Ymblanter (talk) 12:53, 30 October 2017 (UTC)
Thanks for exemplifying the problems with Wikidata. "I approved the bot because it complied with the policy." Then your policy sucks big time. If the edits made by that bot are policy-compliant, and you blindly approve such bots, not even bothering to check or comment on these issues, then this is rather frightening (and even more so when one realises that you are an admin here as well!). Anyway, if some discussion on Wikidata would point out similar problems with a bot on enwiki, and I would come across that discussion, I would have no problem with blocking that bot or at least shutting it down after having checked that the complaints are valid. The important thing to me is the integrity of enwiki, not whether a discussion follows some bureaucracy. Your prediction about flag removal is probably correct, but whether that is a good thing or yet another indication of the problems with Wikidata is of course another question. Fram (talk) 13:18, 30 October 2017 (UTC)
Ok. You are bullshitting again. Fine, go on without me. You have an interesting skill of immediately alienating people.--Ymblanter (talk) 13:29, 30 October 2017 (UTC)
And I should care about alienating you because... ? Alienating a Wikidata defender who sees no problem with the above issues is really the least of my concerns. Alienating an enwiki admin who hijacks an AN section to complain about "stalking" and "fucking lies" because a problematic discussion they made at Wikidata is criticized should be something I should avoid in the future because what exactly? You have not addressed anything about the bot edits I highlighted, only said that they "complied with policy". And that doesn't worry you and doesn't indicate some major problem at Wikidata? If your policy allows such bots, then it is high time to adjust your policy. Fram (talk) 13:42, 30 October 2017 (UTC)
The data is being used to populate Template:2017 J1 League table with WD, transcluded by J1 League and 2017 J1 League. These articles credit https://stats.scorum.com/football/tourneys/1103-2/j-league Thincat (talk) 14:06, 30 October 2017 (UTC)
Thanks. I have undone the additions of his template to some pages, and nominated two templates for deletion. I have also warned the user about spamming his website. Fram (talk) 14:27, 30 October 2017 (UTC)
  • @Ymblanter: have I understood correctly that this bot was approved in September even though the user had no background at all with Wikimedia projects until that point? No edits at all. Two people said on Wikidata that the same thing could happen here, but it couldn't, unless I've misunderstood something. Why would a bot from a completely unknown person be approved? SarahSV (talk) 16:16, 31 October 2017 (UTC)
    @SlimVirgin: Yes, this is correct. We do approve bots from external persons if the task is reasonable and there are no objections. Our bot policy does not say the person should have edited projects before, only that the bot must have test edits. This can be changed of course; however, in the discussion I raised today on Wikidata (d:Wikidata:Project Chat#Bots) this has so far not been brought up as an argument. If there is consensus that this has been a serious problem, I will be more than happy to help work towards improving the bot approval policy, which could include a minimum tenure of a bot owner on Wikidata, or on Wikimedia projects, or whatever.--Ymblanter (talk) 16:22, 31 October 2017 (UTC)
    @Ymblanter: thank you for the reply. That's a high-risk approach, especially when it's someone who's adding material from his own website. He can change the website's content at any point and none of you would notice. SarahSV (talk) 17:30, 31 October 2017 (UTC)
    @SlimVirgin: This is right, though in this case it looks like an oversight, not as deliberate spam (there is no difference for us of course). I actually think it is more important that we have explicit support for the bot tasks, not the absence of opposition, but then the consequence would be that most tasks never get approved. Let us see whether the community is ready to accept this at this point.--Ymblanter (talk) 17:35, 31 October 2017 (UTC)

Oh joy! Let's fill Wikipedia with Wikidata infoboxes importing refspam from wikidata. I'm sure the wikidata community will pay attention to refspam shortly after they address BLP, RS and V. Alsee (talk) 18:48, 31 October 2017 (UTC)

Well, there is way more spam in the Wikipedia infoboxes (not Wikidata related), but I will leave you the joy of discovering it.--Ymblanter (talk) 18:54, 31 October 2017 (UTC)
  • This is mind-blowing. The account has been indefinitely blocked here for USERNAME and being here purely for self-promotion and with a very clear financial COI with the cryptocurrency. This is not even a little ambiguous. User:Ymblanter you are pretty aggressively defending this, but I don't understand why or how. Are the values of Wikidata so different, that this kind of blatant self-promotion is really OK there? Even beyond that, from a librarian kind of standpoint, I can't understand the basis for allowing so many "references" to a site where the contents themselves have no sourcing, and that was just recently created and may not exist in a few months if the venture fails. If you made a bad judgement, that is understandable (everybody screws up) but if you are actually defending it, please explain this in the values of the movement.
What you did with this bot and your defense of that is incomprehensible to me. Please explain. Jytdog (talk) 15:00, 1 November 2017 (UTC)
Jytdog, Ymblanter is not responsible for Wikidata's lack of policy regarding this. If it's within the Wikidata (lack of) policy framework to approve that, then tough. ENWP is not in a position to dictate how Wikidata runs itself as a separate project. Just add it to the list of clear reasons we can't trust anything coming out of Wikidata: along with all the other nonexistent policies, a project that allows insufficiently scrutinized spambots is not something we can allow to be drawn from. Only in death does duty end (talk) 15:18, 1 November 2017 (UTC)
I think that is basically the point Jytdog is trying to make. Wikidata is unsuitable for inclusion in English Wikipedia, because it is unfortunately full of crap, and although the same can be claimed for Wikipedia, we can fix it here when we find it, according to our own standards and customs and with our own tools and edit interfaces. We are not, as English Wikipedians, here to fix Wikidata. As it happens, I do to some extent, edit Wikidata, but that is (or at least should be) a completely separate issue. · · · Peter (Southwood) (talk): 15:37, 1 November 2017 (UTC)
Oh I understand the point he is making, I was referring to his quizzing of Ymblanter - it's completely redundant because whatever Ymblanter replies, they are not responsible for Wikidata's lack of compliance with ENWP policy. Only in death does duty end (talk) 15:44, 1 November 2017 (UTC)
Well, the site does contain information on the J-league results, and it was difficult to foresee that (i) the owner would come here and include their links in a template via Wikidata; (ii) URLs would break and links would turn into spam. As soon as I learned that this had happened, I took the flag back. It would be great if anybody volunteered to add links, e.g. from the official site of the J-league, but nobody has done so far. In Wikidata, we have a mechanism for ranking references (we can, for example, always rank these down if there are better ones), and there is no major problem with having references which are not reliable (as long as there is no concern they are false); they can always be deprecated. This is why it is not a problem to have, for example, references to Wikipedia. On the other hand, somebody in Wikipedia seriously fucked up by letting these references in. And this somebody was not me.--Ymblanter (talk) 16:15, 1 November 2017 (UTC)
What I do not understand, and what I am "quizzing" you about, is your judgement. In the absence of policy the judgement of admins is all the more important. It is incomprehensible to me that you allowed somebody with a financial COI to create a bot to spam their website into Wikidata and from there into any project that uses that data. How is it that the financially-driven spamming, and the potential death of the new website, didn't factor into your judgement? How does that judgement reflect the values of the movement? Jytdog (talk) 15:02, 2 November 2017 (UTC)
Why don't you discuss it in the proper venue, d:Wikidata:Project Chat#Bots? So far most people there are of the opinion that what I have done is acceptable. Concerning my judgement in general, I have the admin flag on four projects, and so far nobody except Fram has ever complained. If you are convinced that I have suddenly lost my clue and cannot be an admin any more, join Fram in their Arbcom case, which I believe has no merit.--Ymblanter (talk) 16:24, 2 November 2017 (UTC)
"and from there into any project that uses that data" - this is exactly my point. Just do not use the data if you do not think they are coming from a good source. There is a lot of bad data in Wikidata. There is data sourced to Wikipedia, there is data sourced to IMDB, there is data sourced to Findagrave. The data cited to the website in question was better than data cited to IMDB, for example. The problem is on your side. Just do not use it.--Ymblanter (talk) 16:36, 2 November 2017 (UTC)
In the same way, by the way, there is a lot of junk in Wikipedia. I always discourage people from using it if they do not know how to use it. We have a lot of incompetent users complaining about vandalism, POV, stubs and all kinds of things - is this really our fault?--Ymblanter (talk) 16:39, 2 November 2017 (UTC)
The "And you are lynching Negroes" (aka whataboutism) response is abysmal. Nobody in WP would ever allow a bot like that to run, and what this is showing is that Wikidata has neither the policy background nor the administrative judgement (even from somebody who is also an admin in en-WP!) to keep spam out. There is no way in hell we should be opening doors to Wikidata in this situation. If you were even coming close to acknowledging, "yes i fucked up here" we would be having a different discussion. But your 100% resistant and bullshitty response drives my trust in Wikidata to zero. If that was your goal (and it certainly was not to win trust) you have succeeded. Completely. Jytdog (talk) 05:59, 4 November 2017 (UTC)
Okay trying to wrap my head around this? Because someone allowed the use of WD we messed up here? Seems like a strong justification not to use WD until more checks and balances are in place. Doc James (talk · contribs · email) 20:48, 1 November 2017 (UTC)
Well, if we want a big picture, clearly import of well-sourced statements from Wikidata would be beneficial as soon as we can solve vandalism problems (which are currently small compared with Wikipedia but still non-negligible). But unfortunately most people refuse to see the big picture and insist on either completely banning Wikidata from the English Wikipedia or on allowing everything here.--Ymblanter (talk) 20:55, 1 November 2017 (UTC)
Some of us just insist upon proper checks and balances before WD is to be used on EN WP. And we have been making that same statement for years.
Just looked at WD and it still contains the spam links... Doc James (talk · contribs · email) 21:01, 1 November 2017 (UTC)
Actually, I did not mean you. As far as I know, you have not been on a crusade.--Ymblanter (talk) 21:04, 1 November 2017 (UTC)
So what we have here is actually a back door for getting spam links into Wikipedia. When you do a search for the link here you find nothing in main space.[14]
When I search on WD nothing comes up.[15]
Yet looking here I see 5 spam links.[16]
Does WD have no mechanism to find spam links? Doc James (talk · contribs · email) 21:08, 1 November 2017 (UTC)
From what I see, [17] refers to the version of the item which since has been edited, and links changed. Wikidata unsurprisingly does not find them. (I am not an expert in link search though).--Ymblanter (talk) 21:18, 1 November 2017 (UTC)
Doc James - one of the seriously annoying issues with wikidata is that the search box can't search things like spam links. Not long ago I was trying to search reference links. I realized that some items were bypassing the filter that (supposedly) prevents importing data items which are unsourced or Wikipedia-sourced. After struggling and extensively researching how to do it in the search box, I discovered that it's not possible. I had to resort to teaching myself the Wikidata-database-query-language in order to search the ref links. Trust me you do not want to try it. Even as a programmer I found it painful. However I did discover two notable things. #1 I found upwards of a million Wikipedia-sourced items bypassing the filter.... the references don't mention that it's sourced from Wikipedia so the filter can't catch them. #2 Trying to search reference links via wikidata-database-query-language is broken as hell. Wikidata is seriously not-designed for searching the contents of ref-links. There's a time-limit on how long a query is allowed to run, and searching ref links is so slow that it dies before the search can complete. So when I say I found "upwards of a million refs" bypassing the filter, I mean the search found 1.1 million before it died. There could actually be 99 million hits for all I know. If there is a spam-ref you're looking for, the query will only scan the beginning of the database then it will time-out and die. If you want to search whether a spam-ref exists, or if you want to find and clean up all occurrences, it would be extremely difficult. I think if you manually break the search into pieces you could try searching chunk-by-chunk without dying to a timeout. However I couldn't even begin to guess how long it would take, how difficult it would be, or how realistic and reliable the search would be. The virtually nonexistent ability to search refs probably shouldn't be so surprising. 
The Wikidata community/philosophy has approximately zero interest in refs and sources. Alsee (talk) 23:18, 1 November 2017 (UTC)
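The chunk-by-chunk approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `run_query`, `QueryTimeout`, the in-memory `records` list and the `max_scan` limit are all hypothetical stand-ins for a query service with a time limit, not the actual Wikidata query interface. The idea is to split the search range into pieces and halve the piece size whenever a piece times out.

```python
from typing import Callable, List, Tuple


class QueryTimeout(Exception):
    """Raised by the hypothetical backend when one query scans too much."""


def make_backend(records: List[Tuple[str, str]], max_scan: int) -> Callable:
    """Build a fake query function that 'times out' on ranges above max_scan."""
    def run_query(start: int, end: int, needle: str) -> List[str]:
        if end - start > max_scan:
            raise QueryTimeout(f"range {start}-{end} is too large")
        # Return the ids of items whose reference URL contains the needle.
        return [item for item, url in records[start:end] if needle in url]
    return run_query


def search_in_chunks(run_query: Callable, total: int, needle: str,
                     chunk: int) -> List[str]:
    """Scan the range [0, total) piece by piece, shrinking pieces on timeout."""
    hits: List[str] = []
    start = 0
    while start < total:
        end = min(start + chunk, total)
        try:
            hits.extend(run_query(start, end, needle))
            start = end  # this piece completed, move on to the next one
        except QueryTimeout:
            if chunk == 1:
                raise  # cannot shrink further, give up
            chunk = max(1, chunk // 2)  # retry the same piece, but smaller
    return hits


# Hypothetical data: every third item carries a spam reference URL.
records = [(f"Q{i}", "https://example.org/spam" if i % 3 == 0
            else "https://example.org/ok") for i in range(10)]
backend = make_backend(records, max_scan=4)
found = search_in_chunks(backend, len(records), "spam", chunk=8)
```

Whether this would be practical against a real database of that size is exactly the open question raised above; the sketch only shows the control flow.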
  • Note, the discussion at Wikidata chat about this incident has been archived there, and is here. We should refer to this often in discussions of the use of Wikidata in Wikipedia. Bad decisions are made in Wikidata to allow bots to run; bad claims are created there, allowing bad content to flow into en-Wikipedia where ever Wikidata is used. Jytdog (talk) 15:28, 18 November 2017 (UTC)


Whatever happens, Wikidata by itself is a small community, and it is doomed to remain that way. A database simply can't attract people the way Wikipedia does. As a consequence, Wikipedias can be as picky as they want; Wikidatians by themselves most likely won't have the manpower to fulfil their requirements. The only way is … for Wikipedians to involve themselves in Wikidata. By doing this, we can pool the best of our respective knowledge so that every Wikimedia project can use the best data of the others. All we need is to understand that Wikidata is a common place, and as such it must accept data that may not be accepted locally, because another wiki would accept them. They can be filtered on the Wikipedias at import time, in the various ways Wikidata and Lua provide. Wikidata also needs some courageous Wikipedians to leave their comfort zone and participate in Wikidata. Putting hermetic borders between projects is doomed to fail. What we need is more cooperation and more understanding between communities, so that the work of some benefits the others. Curating data is better done on one project than independently on 200+ projects … enwiki is probably the project with the most contributors, by far. English is also a pivot language that a lot of people read, and it is the basis for a lot of other wikis' work. Building good descriptions for items on Wikidata helps to avoid confusion. Being able to "fork" discussions does not help to build consistency between (inter)wikis or to find interwiki conflicts. It doubles the work instead of pooling it. It defeats the whole purpose of sharing data between projects. Being picky about data imports is useless if the Wikidata community is not big enough to manage them. Fighting vandalism is too hard if Wikipedians don't look at it and if we don't cooperate on it.
Long story short: if Wikipedias don't use Wikidata data, and if communities don't make efforts of their own to understand each other and cooperate, they will all lose. This is not a WMF vs. enwiki issue; it's an enwiki vs. Wikidata vs. frwiki vs. … community issue. TomT0m (talk) 17:39, 25 November 2017 (UTC)

Technical question to WMF

DannyH (WMF), How does the mobile view handle redirects? · · · Peter (Southwood) (talk): 18:28, 14 November 2017 (UTC)

On the mobile web, "Redirected from (...)" shows up in a black box at the bottom of the screen which disappears after a few seconds. On the apps, "Redirected from (...)" shows up in search results, in place of the short description. Does that help? -- DannyH (WMF) (talk) 18:58, 14 November 2017 (UTC)
If it means that redirects do not need a short description, then yes. · · · Peter (Southwood) (talk): 04:43, 15 November 2017 (UTC)
DannyH (WMF) Please confirm above, and
How are disambiguation pages handled? If they do not need a short description, please specify.· · · Peter (Southwood) (talk): 19:15, 16 November 2017 (UTC)
 
A usage of descriptions by a disambiguation page
There is no technical reason for a page to need a description, so no, redirects don't need a description. They also rarely show up in UI elements, though I guess if there is an exact title match widget (rarely used) then they might. Disambiguation pages are pages. I personally find the Wikidata description "Wikipedia disambiguation page" very useful in the use case on the right here. It was previously stated, however, that all descriptions for disambiguation pages should be removed; please check the earlier discussion. —TheDJ (talk • contribs) 20:45, 16 November 2017 (UTC)
TheDJ, I agree with you that Wikipedia disambiguation page is about as useful a description as can reasonably be expected for a disambiguation page. The point for me is that this is a description that can either be very easily bot inserted, or maybe not as easily be extracted from the category, making disambiguation pages a trivial problem to manage. I would prefer someone technical from WMF to confirm this, because it is WMF that would be implementing it if necessary. · · · Peter (Southwood) (talk): 15:31, 17 November 2017 (UTC)
Hi Pbsouthwood: Redirects don't need a description. I wrote a bit about other pages that don't need descriptions above; I'll quote it here because there's a lot of "above". :)
We can remove descriptions completely on types of pages where the description is inherently worthless or repetitive. The examples that I know right now are list pages, disambiguation pages, category pages and the main page. "Disambiguation page" may be useful in search and in the VE link modal, but they're not useful at the top of the article page, and descriptions for list/category pages are worthless in all circumstances. It's possible that there are other examples where descriptions are inappropriate or meaningless, and it would be really helpful for any/everybody on this page to identify more examples, so that we can exclude them as well.
-- DannyH (WMF) (talk) 22:11, 17 November 2017 (UTC)
It must be possible for Wikipedia editors to see the short descriptions from ordinary desktop read view, so they can see whether they are appropriate, and take action where they are not. They will be part of the article, not a separate entity independent of the article. The first iteration will not always be the best description possible. Some descriptions may take years to settle down to an optimum. We don't even have a guideline for maximum length yet. (If there is a technical limit, it should be specified.) It would be nice if those people who really don't want to see them can hide them as an option, but for maintenance, more eyes are better. It must also be possible from edit view to distinguish whether the short description has been deliberately left out as unnecessary, or just has not been created yet. Failure to provide this information could lead to edit wars and other undesirable and potentially uncivil activity. Maybe WMF does not care if this happens, but it is a serious waste of resources to Wikipedia. Just like this discussion is a waste of time we could be using to improve the encyclopaedia, but we have to expend this time to prevent the encyclopaedia from being made worse. Descriptions for list pages are not always unnecessary. We must allow for the occasional instance where a short description would be useful or necessary. Use of a null parameter is intended to deal with this problem. If the short description is set as none, don't display a short description because an editor has assessed this as the best option. If someone later comes up with a better idea, the word none is simply overwritten with the better description, which must be displayed. The converse may also happen occasionally, where the editors of an article decide that no description is necessary and by overwriting the existing description with the string none, indicate that no description is to be displayed. 
This could also be done by using two parameters, one for the description, and one to indicate whether or not it must be displayed, but that looks more likely to be misunderstood and misused. · · · Peter (Southwood) (talk): 09:58, 20 November 2017 (UTC)
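The single-parameter rule proposed above might look like this in code. It is a minimal sketch, not a specification: the function name, the three state labels and the case-insensitive match on "none" are assumptions made for illustration, and treating an empty value like an absent one is likewise an assumption.

```python
from typing import Optional, Tuple


def resolve_short_description(value: Optional[str]) -> Tuple[str, Optional[str]]:
    """Classify a short-description parameter and return (state, text_to_display).

    States (hypothetical labels):
      "missing"    - no description has been written yet
      "suppressed" - an editor deliberately set the value to "none"
      "set"        - a real description that must be displayed
    """
    if value is None or value.strip() == "":
        # Assumption: an empty value is treated the same as an absent one.
        return ("missing", None)
    text = value.strip()
    if text.lower() == "none":
        # Deliberate editorial decision: display nothing.
        return ("suppressed", None)
    return ("set", text)
```

Returning the state separately from the display text is what lets the edit view distinguish "deliberately left out" from "not created yet", even though both display nothing to readers.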
Pbsouthwood: So far, I've heard from people who want to see the short description in fewer places, rather than more. Have you talked to other people about your desire to display the short description on desktop article view? Regarding the "null" use case: if there's a feature request, then I need to know what the use cases are -- at least some specific examples. That's how our jobs work, as a product team -- we evaluate feature requests based on whether the problem the feature is supposed to solve actually exists. If you're asking for "null" to render a blank description, can you give me some examples of pages where that would be useful? -- DannyH (WMF) (talk) 02:33, 21 November 2017 (UTC)
DannyH (WMF), If there is to be a short description used in the way it is currently being used, I think you will find that Wikipedians will want to be able to monitor it and fix it when it is broken. That is the whole point of getting it to be part of Wikipedia and not of Wikidata. As Iridescent says above, people fix things that they see are broken, and that often/usually happens when they read the article, not so much when they are already editing for another reason, particularly when they are editing a section, or when the short description is not particularly visible in the editor. I foresee that there will be editors who will resent having a new short description displayed in the read view, so I suggest that it should be possible to hide it from those who actively want to hide it from themselves. Whether this should be done by default or by a preference is something that can only be decided by an RfC, as no-one has the authority to make that decision for the general Wikipedia community, and we have seen many fiascoes to demonstrate that point. I have not specifically spoken to anyone about this, I am simply applying logic and experience, and relying on fellow Wikipedians to step forward and make their point when they disagree. So far this has not happened on the point of visibility in read view, but it might. I am not sure of who you mean by people who want to see short descriptions in fewer places, maybe those are the people who would prefer them not to be used at all. Have you checked with them how they would like the use of short descriptions to be continued and remain invisible to Wikipedians? · · · Peter (Southwood) (talk): 13:54, 21 November 2017 (UTC)
Pbsouthwood, maybe I misunderstood. There's two definitions of visible here: #1) visible as a magic word in the edit window when you're editing an article, and #2) visible to all readers when you're looking at an article page on a desktop computer. We both agree that #1 is important -- if you can edit the magic word, you need to be able to see it in the edit window. #2 is the one that I may have misunderstood. Do you think the descriptions should be visible to all readers on the article page? -- DannyH (WMF) (talk) 01:08, 22 November 2017 (UTC)
Regarding the matter of use cases for a blank description (where "none" would be used in the template): I assume you mean cases other than the obvious case of most list articles. There have been a few that I found when experimenting with short descriptions for WPSCUBA articles, but I did not record which ones they were. I will try to find them, but there are several hundred to search through, and I do not have skills in automated searches, so it might take some time to dig them up. · · · Peter (Southwood) (talk): 14:06, 21 November 2017 (UTC)
Yes, I've already got a set of page types where I think short descriptions are not helpful in article view: list pages, category pages, disambiguation pages, the main page. We can suppress showing those descriptions. What I'd like to see are specific examples of article pages where having a short description would be unhelpful or inappropriate. We can't build a "null" feature unless we know actual use cases where that would be an appropriate choice. -- DannyH (WMF) (talk) 01:08, 22 November 2017 (UTC)
DannyH (WMF), I think there is some miscommunication or other problem here. Why are you considering suppressing anything?!? If we don't put a description on a page, then there is nothing to display. That includes any page. If we do put a description on any page then it should be displayed. That includes list pages, category pages, disambiguation pages, and the main page. In fact unless there is a technical obstacle, that should include WP: pages, USER: pages, various TALK pages, and everywhere else. I am not specifically arguing for non-article use cases at the moment. However one of the most valuable essences of wiki is that a page-is-a-page, and virtually all functionality is general to all pages. If and when we want descriptions on other pages, it should "just work".
The keyword specification is very very simple. If the keyword is present then the page has a description and it is displayed in relevant locations. An absent keyword, or a keyword with no value, is no-description. I think everything else can be handled by a template and regular editing.
There could be a transition period before the keyword is activated, or where wikidata is a temporary default value for blank descriptions. This would give the community time to work on the new descriptions, before shutting off Wikidata descriptions.
Things like list pages, category pages, disambiguation pages, and the main page are merely pages that might be skipped when we work out details for a bot-run to add a keyword-or-template to pages. Alsee (talk) 05:08, 22 November 2017 (UTC)
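The keyword behaviour and the transition period outlined in this comment could be sketched as below. Everything named here is hypothetical: `description_to_display`, its parameters and the `transition` flag are illustrative only and do not describe any existing MediaWiki feature.

```python
from typing import Optional


def description_to_display(keyword_value: Optional[str],
                           wikidata_description: Optional[str],
                           transition: bool = False) -> Optional[str]:
    """Pick the description to show for a page, or None for no description.

    keyword_value: text of the local keyword, None if the keyword is absent.
    wikidata_description: the Wikidata description, used only as a
        temporary default while `transition` is True.
    """
    if keyword_value is not None and keyword_value.strip():
        # Keyword present with a value: the local description always wins.
        return keyword_value.strip()
    if transition and wikidata_description:
        # Transition period: blank local descriptions fall back to Wikidata.
        return wikidata_description
    # Absent keyword, or keyword with no value: no description at all.
    return None
```

Once the transition flag is switched off, the Wikidata text is never consulted, which matches the "shutting off Wikidata descriptions" step described above.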
Alsee, you said that you want us to collaborate on this solution. A WMF product team is going to build this tool, and we need to understand the use cases before we add your feature requests to the spec. I understand that you have a specific vision for the feature, but that needs to be a shared vision if we're going to work together. -- DannyH (WMF) (talk) 05:27, 22 November 2017 (UTC)
DannyH (WMF), yes I was discussing a specific "vision". As I've said, I'm more than happy to collaborate on an RFC including multiple options. I will cheerfully help include any proposal you like. However the last RFC on wikidata descriptions resulted in descriptions being removed completely from mobile-web. I think it likely that consensus is going to continue to be against wikidata descriptions. The discussion on this page started with the (shocking-to-the-community) discovery that wikidata descriptions were only partially removed from mobile. I personally helped shut-down a hasty and blunt RFC against the rest of the descriptions, in the hopes of a more collaborative WMF-community process.
To be explicit and clear: are we collaborating towards an RFC which includes <any proposal you want> and a non-wikidata option? An RFC drafted collaboratively, with a good-faith effort to include and address any and all concerns you have? If so, great, and I apologize if my comments failed to maintain a multi-proposal style. Alsee (talk) 06:49, 22 November 2017 (UTC)
Hi Alsee, yes, we can continue this discussion anywhere and in any format. I'd be happy to collaborate on an RfC. -- DannyH (WMF) (talk) 19:13, 28 November 2017 (UTC)
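
The keyword behaviour proposed in this thread (a non-empty keyword means the page has a description; an absent or empty keyword means no description; optionally, a transition period in which the Wikidata description serves as a temporary default for blank descriptions) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `resolve_description` and its parameters are hypothetical and do not correspond to any existing MediaWiki feature or API.

```python
from typing import Optional


def resolve_description(local_keyword: Optional[str],
                        wikidata_description: Optional[str],
                        transition_period: bool = False) -> Optional[str]:
    """Return the description to display, or None for "no description".

    All names here are hypothetical; this only illustrates the rules
    discussed above, not an actual implementation.
    """
    # A keyword that is present and non-blank always wins.
    if local_keyword is not None and local_keyword.strip():
        return local_keyword.strip()
    # During the (assumed) transition period, fall back to Wikidata
    # when the local description is blank.
    if transition_period and wikidata_description:
        return wikidata_description
    # An absent keyword, or a keyword with no value, is no-description.
    return None
```

With `transition_period=False` the Wikidata description is never consulted, which corresponds to the state after Wikidata descriptions are shut off.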

The mobile search description image on the right has the description for Cardiac arrest: "A congestive heart breath". This is nonsense. It has been changed on Wikidata on 26 September to "sudden stop in effective blood flow due to the failure of the heart to contract effectively; congestive heart breath" which is too long and still ends with the same nonsense. This vandalism has persisted since March 2017[18], apart from a few periods when it was replaced with worse (or more obvious) vandalism. This page is seen on average 2,700 times per day on enwiki, e.g. in April it was seen 84,000 times. I'll leave it to others to count how many times this description was the first thing people saw on the app, mobile, search, ... Fram (talk) 08:23, 20 November 2017 (UTC)

Vandalism on Wikipedia

Since people apparently are keen on sharing examples of vandalism on Wikidata (just for fun, without making any effort to remove it), let me share this: This edit was made in December 2012, and introduced deliberately false information to a BLP article. After that, the article was edited 28 times until I accidentally came across this a couple of days ago (looking for something else). I just happen to know the guy, and I suspected something was wrong, and eventually found that BLP vandalism was in the article for five years before I reverted it. I have been telling everybody for a long time that Wikipedia is not reliable and can contain deliberate vandalism, and this is just one more illustration. Every reader of this article for the last five years was deliberately misled by the English Wikipedia. We just do not have any mechanisms of dealing with this, despite having a large editor base.--Ymblanter (talk) 15:49, 5 December 2017 (UTC)

This vandalism btw never propagated to Wikidata.--Ymblanter (talk) 15:50, 5 December 2017 (UTC)
And you don't see the difference between vandalism on one field in one relatively obscure article, and vandalism changing the name of a country in an unknown (and technically unlimited) number of pages at once? Fram (talk) 15:53, 5 December 2017 (UTC)
Fram, I learned already that it is useless talking to you because you do not listen, but yes, I see the difference. Wikipedia vandalism was there for 5 years and was misleading; the readers really thought the guy graduated from Tomsk. The vandalism on Wikidata lasted several hours and was not misleading: Everybody knew this is vandalism, it was noticed and corrected by a responsible user who was interested in removing vandalism rather than advertising it.--Ymblanter (talk) 15:57, 5 December 2017 (UTC)
Perhaps the editor who inserted really thought he graduated from Tomsk as well? In any case, that vandalism is not misleading (i.e. is much, much easier to spot) does not somehow make it less serious. Yes, enwiki has old vandalism as much as you like, just like Wikidata has. What the relevance of this is for something Wikidata specific (i.e. the ability to change data which affects many, many pages at once) where we have an example from today, on the talk page of the state of affairs of Wikidata nowadays, escapes me, unless your argument is "enwiki isn't perfect, so you shouldn't criticize Wikidata". In any case, if you can't discuss things without personalizing them and introducing personal attacks, then please just don't reply in the future. I haven't made comments about you (or anyone), the vandalism and reversal were not about you, so perhaps we can just stick to the actual issues and leave personal animosity out of it? Fram (talk) 16:48, 5 December 2017 (UTC)
Oh, and I did remove the vandalism on enwiki, in the place where I encountered it, immediately. But I refuse to edit Wikidata, and I have no obvious way to find all articles where a certain Wikidata item may be displayed. And sharing it here isn't "just for fun", it is a necessary part of any discussion about the future of Wikidata on enwiki to know what kind of problems we may get by using such data (just like everyone is free to illustrate what kind of benefits we might get by using Wikidata). If I use hypothetical problems, people ask for actual examples. So I just report on reality without affecting it (I don't introduce or facilitate Wikidata vandalism, I don't revert or report it either). As far as Wikidata is concerned, I'm a reader, not an editor. Fram (talk) 16:54, 5 December 2017 (UTC)
By the way, did he study at the Moscow State University or the Moscow Institute of Physics and Technology (infobox)? They don't seem to be the same. Fram (talk) 16:57, 5 December 2017 (UTC)
They were the same when he enrolled at Moscow State University, and then after one or two years the faculty he studied at became an independent university, which is the Moscow Institute of Physics and Technology. It could have been better written in the article, but at least this is not incorrect.--Ymblanter (talk) 17:23, 5 December 2017 (UTC)
OK, thanks. Fram (talk) 18:36, 5 December 2017 (UTC)

Circular "sourcing" on Wikidata

User:Alsee has raised this issue already, but it doesn't seem to have been thoroughly discussed and resolved yet. Wikidata infoboxes like Template:Infobox person/Wikidata only show sourced data (as default, one can overrule this), but "sourced" only means "doesn't have the indication "imported from Wikipedia"" and includes quite a few other common methods of "sourcing" on Wikidata which don't involve any actual source. Queries of WMFlabs are one example, "inferred from" is another.

If you for example would change John F. Kennedy to have {{Infobox person/Wikidata | fetchwikidata=ALL}} at the top, it would show rather surprising results (e.g. the only award shown is an Italian one, and his alma mater shows 4 schools but not the one currently in his infobox, Harvard). It would also only include 1 of his children, Caroline. This is because it is the only one that is sourced on Wikidata.

Well, actually, it isn't sourced, it is "Inferred from Caroline Kennedy". That page on Wikidata indeed says that John F. is her father, but without a source. So this is unsourced information which is shown in infoboxes here even if we supposedly only use sourced information. Now, obviously in this case the information isn't wrong, it is simply an example.

But we should have a list of all such sourcing versions on Wikidata which aren't actually external sources (never mind reliable ones, which is a different issue). Everything that is "inferred from", "stated in" with an internal link to another Wikidata item only, without an actual ID or page or URL, "imported from: corresponding Wikidata item", ... Fram (talk) 09:43, 7 December 2017 (UTC)
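
As a rough illustration of the kind of check suggested above, the following sketch treats a reference as an actual external source only if it carries some locator (a URL, a page, or an external ID) and is not of the "imported from" or "inferred from" kind. The reference model is a deliberately simplified, hypothetical one (plain dictionaries keyed by property label); it is not the real Wikidata data model, and the property labels and function names are assumptions chosen for readability.

```python
# Hypothetical, simplified model: a reference is a dict mapping a
# property label to its value. Real Wikidata references are richer.
NON_SOURCE_PROPERTIES = {"imported from", "inferred from"}
LOCATOR_PROPERTIES = {"reference URL", "page", "external ID"}


def is_external_reference(ref: dict) -> bool:
    """True only if the reference points outside Wikidata/Wikipedia."""
    # "imported from ..." and "inferred from ..." are never real sources.
    if ref.keys() & NON_SOURCE_PROPERTIES:
        return False
    # "stated in <item>" alone, with no URL, page or ID, is not enough.
    return bool(ref.keys() & LOCATOR_PROPERTIES)


def has_real_source(references: list) -> bool:
    """True if at least one reference on a statement is external."""
    return any(is_external_reference(ref) for ref in references)


# The child statement from the example above would not count as sourced:
print(has_real_source([{"inferred from": "Caroline Kennedy"}]))
```

Under these assumptions, a statement whose only reference is "stated in" plus an internal link is treated as unsourced, while "stated in" combined with a page or URL is accepted.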

Wikidata vandalism again affecting enwiki articles

 
Image from enwiki article showing Wikidata vandalism

For nearly 7 hours today, Nepal no longer existed, and the Nepalese on Wikidata (and enwiki wherever the Wikidata label was used) lived in "Nepalpeneflacido" instead (with "flacido" meaning "flaccid", and, well, you can guess the rest).[19] Like I said before, changing the label on Wikidata is the equivalent of a page move on enwiki. Wikidata has no means at the moment to prevent such moves (or they need to protect all of the page, they can't protect only the label), and not enough editors to patrol this (despite claims about the much larger base of editors they have and so on). And on enwiki (or on other wikis), not enough people (hardly any) have Wikidata changes enabled in their watchlist as that produces loads of unreadable garbage and changes which don't affect enwiki at all. So these changes time and again remain unnoticed for hours (or longer), affecting an unknown number of pages. It happened with Romania recently, now with Nepal, probably others I didn't notice at the time as well.

The effect of this on enwiki is limited now. But if e.g. many more biographies would have the Wikidata version of the infobox, or other types of infoboxes would be converted to pure Wikidata versions, this would become much more problematic. The strength of Wikidata (one change affecting many pages at once) is a serious weakness if you can't be reasonably sure that such changes are either beneficial or very quickly reverted. Fram (talk) 15:38, 5 December 2017 (UTC)

Did you first notice the vandalism when you initially edited the Lumbini article? Richard Nevell (talk) 19:07, 5 December 2017 (UTC)
Yes. Fram (talk) 21:45, 5 December 2017 (UTC)
So you just left it rather than fixed the vandalism on Wikidata? Richard Nevell (talk) 22:34, 5 December 2017 (UTC)
Richard, why do you think Fram has a moral obligation to edit Wikidata? The rules, the knowledge base, the expectations, and the community seem to me to be different from those on Wikipedia (though I have to admit a lot of ignorance on this subject). Shouldn't a volunteer be able to choose which communities they want to invest their time in? - Dank (push to talk) 23:17, 5 December 2017 (UTC)
I don't think it's a matter of being obligated to do something, but I do find it curious that someone would notice a mistake and then leave it. For me it does change the initial framing of this post. Richard Nevell (talk) 23:32, 5 December 2017 (UTC)
I want to be clear that I'm not anti-Wikidata; I don't know enough about it to be against it. I see a large volunteer community putting effort into it, and that's probably a good thing, and worth doing. But when Wikipedians try to explain the problems that arise ... and Fram was bringing up a relevant point here, I thought ... we're confronted by people who seem to want to shame us for not being sufficiently pro-Wikidata. Maybe that wins points on some kind of scorecard, but it doesn't seem like a strategy that's likely to produce an end result of successful integration and cooperation. - Dank (push to talk) 23:34, 5 December 2017 (UTC)
Wikipedia and Wikidata have a lot more in common than they do separating them. Using that common ground as the basis for collaboration between the two communities would be beneficial for both sites. Richard Nevell (talk) 23:45, 5 December 2017 (UTC)
When I notice a template using Wikidata (World Heritage Site) creating problems on hundreds of enwiki articles and propose a solution, you oppose that solution and then proceed to do nothing about the problems, meaning that the problems (actual wrong information on enwiki articles) persist for many more months. This is apparently not a problem for you. But when I see a problem on enwiki caused by Wikidata vandalism, fix it on enwiki where I notice it, and then do nothing further, you go all moral outrage on me? Even though I have tried rather hard to fix the root cause (using Wikidata on enwiki), which would make the symptom (vandalism of an English label on Wikidata) a rather futile form of vandalism instead of the effective one it was now. You are the one promoting the use of Wikidata on enwiki, and claiming that this is so beneficial for both; then you have the obligation to organize things so that such problems happen less and less often, and to search for solutions. But all I see is someone who rejects solutions but then is surprised when others don't want to edit their pet project which causes the problems. I had seen this coming (that people would expect us to edit Wikidata whether we want to or not), but it's not going to work. Fram (talk) 05:37, 6 December 2017 (UTC)
Whatever Wikipedia(s) and Wikidata may have in common is more likely to be lost than strengthened by attempts to coerce Wikipedians into maintaining Wikidata, whether they are technical effects like imposing Wikidata descriptions on Wikipedia displays on Mobile view, or rhetoric claiming that Wikipedians have any ethical obligation to work on another project. From the Wikipedian point of view Wikidata is becoming more trouble than it is worth. Beware the backlash. · · · Peter (Southwood) (talk): 06:27, 6 December 2017 (UTC)

Similar vandalism (changing the English label) from the last few days, which all lasted for hours:

  • Oceania was named "africa" for more than 5 hours
  • Canada was "Culo" for nearly five hours
  • Guinea was only "Negrazo" for nearly two hours, so that's relatively quick
  • Faggot (food) is now "Meatballs" (not reverted yet, after more than 24 hours)
  • Astronomy has been "천문학" in English for 36 hours now (not reverted yet)
  • Henry VIII: for more than two hours was completely vandalized, resulting e.g. in everyone getting "obey hitler" as the English description on their apps or elsewhere (same for French and German readers, by the way). The IP could vandalize as much as it wanted for 23 consecutive edits spread over more than 30 minutes, so it's not one subtle edit which slipped through the cracks

As a bonus, for people comparing relatively short-term vandalism on wikidata with long-term vandalism on enwiki: Diego Simeone, a page seen by some 900 people a day on enwiki alone: Since 9 May until yesterday (i.e. for nearly seven months), he had a completely wrong name "Roberto Fernández" and the not-so-flattering and not-so-English description "futbolista medio".

The above is only a selection of some of the most obvious and high profile vandalism, and doesn't include some very high-profile unreverted vandalism examples (yeah, sue me). It only focuses on one aspect of vandalism (English labels), and doesn't include things like J. K. Rowling being a Reptilian with 43 children for hours... Fram (talk) 10:07, 6 December 2017 (UTC)

One of the unreverted high profile examples has meanwhile been found: Muhammad Ali was known for more than 1 day as "Muhammad L'kahba"... And the two unreverted ones I linked to above were reverted four minutes later! Fram (talk) 19:51, 6 December 2017 (UTC)

In the case of K-pop artist Suga, though I do not feel the least obligation to fix things in wikidata, I *did* try to fix the vandalism there, but it was so intermixed with good edits that it was a complete mess. It's pretty obvious that vandalism control is not working conveniently at Wikidata, and there is well-founded doubt that it ever will, so one first, important step towards the usability of that resource could be closing it to anonymous/newbie editing. And even so, we'll still have to deal with editwarring migrating from the wikipedia articles into Wikidata, something there's no solution for at the moment, but to completely remove the Wikidata gadget (infobox or whatever) from the already protected article on the Wikipedia side. At this point, each time I use information from wikidata live on Wikipedia articles, I feel like I'm doing the wrong thing. It's blatantly not reliable.-- Darwin Ahoy! 01:20, 7 December 2017 (UTC)

And for 21 hours, Wikidata didn't have an entry for John F. Kennedy (like I said, a rather high profile page), but they had one for Putita loca ("crazy bitch") instead. Fram (talk) 08:19, 7 December 2017 (UTC)

@Fram and DarwIn: I don't think you'll find anyone that *likes* that this vandalism happens. However, there is a reason why {{sofixit}} was created. Remember that this type of vandalism used to be (and still sometimes is!) a common argument against Wikipedia vs. traditional encyclopaedias - and we have tackled that on enwp/ptwp with a mixed success rate. Just complaining about this issue really doesn't help, it's much better to be pro-active either individually (by reverting the vandalism, or pointing it out to someone that can revert it for you) or systematically (by passing on the lessons learnt here / figuring out better ways of catching Wikidata vandalism - e.g., see m:2017 Community Wishlist Survey/Wikidata/Better countervandalism tools). Personally, I don't draw a line between Wikipedia/Commons/Wikidata/Wikisource/etc. - they are all different ways to share knowledge, and I try to use the best tool for the job. However, I acknowledge that some don't like to cross project borders (even if they can use the same login on either side of the border), and although I can't understand it, I still want to help fix the problems, both individually and systematically. This conversation so far hasn't done that, though (except for @Ymblanter's reverts). Thanks. Mike Peel (talk) 22:31, 7 December 2017 (UTC)
@Mike Peel: Unfortunately, like it or not, there are borders, and I learned about them the hard way when I was blocked in an alien Wikipedia for trying to remove blatant original research from there, because on that project - as I found out later - original research produced by local Wikipedians was perfectly OK. It is simply not acceptable that Wikipedia editors can't control information and vandalism on their own project, and have to go on errands to alien projects begging for something to be done. It's simply not acceptable, from whatever side we look at it. It is not anyone's obligation that every time vandalism spreads, or an edit war happens, they have to find their way to the Wikidata admin board - wherever that is - and explain the whole situation to an alien community nobody knows anything about, with their own rules, and request that some action is taken. In the case of edit wars, it would basically mean to extend the conflict to yet another project. This is not acceptable in the least. Either Wikidata solves their problems, or as sad as it can be, we would be much better off using Wikidata only for interwiki purposes - (and, hopefully, without the Wikidata community messing up with that, and interfering in the way other projects work those issues, as they have been doing). You asked for suggestions, I already made one: Completely blocking Wikidata to IP/newbie editing, at least until some functional vandalism control is put in place, would be a good start to make it usable.-- Darwin Ahoy! 00:21, 8 December 2017 (UTC)
There are many ways to fix things of course. Vandalism reversion on Wikidata is dealing with individual cases. Improving or getting rid of Wikidata infoboxes and the like is dealing with the root cause on the enwiki side. Getting better anti-vandal tools is a possible solution on the Wikidata side. "Just complaining about this issue really doesn't help" but that's hardly the only thing I have done of course. That you don't like many of my actions doesn't mean that they haven't happened of course. That vandalism happens on enwiki as well is hardly an argument to use a site which is even worse at catching vandalism instead. I agree that we should use the best tools available, but I don't see how you can pretend that Wikidata is that tool at the moment (or perhaps ever). Wikidata (like Wikisource, Wikiquote, ...) is a tool with a different purpose. We wouldn't transclude info from Wikivoyage or Wikinews into enwiki either. "I still want to help fix the problems, both individually and systematically." As long as it means using Wikidata though, like at the WHS infobox? You are causing many of the problems we have here with Wikidata, so I don't think you are the best person to lecture about "sofixit" and about willingness to fix problems. You are willing to fix minor issues as long as no one questions the major one, which is "is Wikidata really the best tool for this or that job". This conversation, like many others here, is a way to increase understanding and awareness of the actual scale of the problems, which aren't anecdotal but chronic and serious. Reverts of individual cases or protection of individual cases minutes after I have pointed them out here is not fixing the actual problems, it's addressing a forest fire with a water gun and berating someone else who is on the phone (to the fire brigade) instead of picking up a second water gun. Fram (talk) 08:02, 8 December 2017 (UTC)