Wikidata:Requests for comment/Is it worth to import labels from interlanguage links of other wikipedia projects to wikidata

An editor has requested the community to provide input on "Is it worth to import labels from interlanguage links of other wikipedia projects to wikidata" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.

If you have an opinion regarding this issue, feel free to comment below. Thank you!

Dear wikipedians:

I have run a bot task for five years. The bot reads interlanguage links on zhwiki and jawiki and adds the labels to Wikidata. For me, it helps a lot. People can read the translation of the entity easily. Moreover, people can see which articles refer to the entity, just from the edit summary the bot makes. However, when there are mistakes on zhwiki or jawiki, they cause bad results, which then need to be fixed on zhwiki or jawiki. (Please refer to the archive of my talk page.) This is annoying for some Wikipedians. I think this is a common problem for all bots importing data from other wiki projects, and that if there are bad interlanguage links, we should find and fix them rather than leave them alone. Still, many people advise me not to import labels anymore.
So I have a question: Is it worth importing labels from interlanguage links of other Wikipedia projects to Wikidata, or should I abandon the task entirely? Thank you for your time and attention. --Kanashimi (talk) 05:00, 12 April 2021 (UTC)
My objection is to a more specific case: I ask that those labels not be imported into the alias ("Also known as") field.
  1. Writing to the "Also known as" field means adding extra declarations: "this label is a valid term in the target language", and "this label is known among speakers of the target language". Wikipedians place the ILL (interlanguage link) template to link to the source article. In practice, the target-language part may be labeled casually or arbitrarily, especially if there is no established word in that language, because few would bother to meet all the rigorous rules required for a proper article name. ja.wikipedia has the additional issue that the ILL template is named "Template:仮リンク", literally "temporary link". Overall, its practical quality as a source of aliases is very low.
  2. Importing wrong information means doubling the places where misinformation appears and doubling the manpower needed to correct it. Or more, because disproving a term, that is, establishing that it is not an alias in that language, is a very difficult task. It is far more difficult than deleting a stray word that was casually written in a Wikipedia article.
The idea of cross-referencing interlanguage links is itself nice, as long as:
  1. The cross-reference list bears a more "robotic" name, without any semantic or linguistic implications.
  2. It is a dynamic list that automatically reflects all edits and deletions of the ILL, not a permanent write into items.
Perhaps enhancing "What links here" into a cross-Wikipedia system, with ILL flags, may be the best solution. At the very least, write somewhere other than the alias field. Wotheina (talk) 08:00, 12 April 2021 (UTC)
Thank you for the comments. In fact, I also maintain the interlanguage links on jawiki, converting w:ja:Template:仮リンク to normal wikilinks. If the Japanese link target of a 仮リンク exists, I convert it to a normal wikilink. And what I choose to put into Wikidata is that Japanese link target. I am sorry, but I do not think the quality of the 仮リンク link targets is very low; otherwise the jawiki Wikipedians would not allow this kind of conversion.
For the second part, if Special:WhatLinksHere could list the link sources, that would be good. But since the template format differs between wiki projects and changes over time, and not every wiki project has an interlanguage link template, I am afraid the function would be difficult to implement. Kanashimi (talk) 08:52, 12 April 2021 (UTC)
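For readers unfamiliar with the workflow described in the comment above, here is a minimal sketch of it in Python. It is not the bot's actual code. It assumes the template is written as {{仮リンク|Japanese title|language code|foreign title|...}}, that nested templates do not occur, and that a set of existing jawiki titles is already available. The names ILL_PATTERN, extract_ill, convert_resolved_links and label_candidates are hypothetical.

```python
import re

# Simplified, assumed pattern for {{仮リンク|ja title|lang|foreign title|...}}.
# It ignores nested templates and unusual parameter orders.
ILL_PATTERN = re.compile(
    r"\{\{仮リンク\|([^|{}]+)\|([^|{}]+)\|([^|{}]+)(?:\|[^{}]*)?\}\}"
)


def extract_ill(wikitext):
    """Return (ja_title, lang, foreign_title) tuples found in the wikitext."""
    return [
        tuple(part.strip() for part in match.groups())
        for match in ILL_PATTERN.finditer(wikitext)
    ]


def convert_resolved_links(wikitext, existing_ja_titles):
    """Replace a 仮リンク template with a plain wikilink when the Japanese
    target already exists; leave all other templates untouched."""
    def replace(match):
        ja_title = match.group(1).strip()
        if ja_title in existing_ja_titles:
            return "[[" + ja_title + "]]"
        return match.group(0)

    return ILL_PATTERN.sub(replace, wikitext)


def label_candidates(wikitext, existing_ja_titles):
    """Map (lang, foreign_title) to the Japanese title, but only for links
    whose Japanese target exists. These are the candidate labels."""
    return {
        (lang, foreign): ja
        for ja, lang, foreign in extract_ill(wikitext)
        if ja in existing_ja_titles
    }
```

In this sketch only link targets that already exist as jawiki articles become label candidates, which mirrors the condition stated in the comment. Whether such a candidate belongs in the label or the alias field is the question under discussion here and is not decided by the code.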
I agree with you that it's better to just fix the problematic cases, rather than having to do all the work from scratch. I think the bot is useful and should continue this task. NMaia (talk) 22:22, 12 April 2021 (UTC)
  • I can't say much about the task in relation to additions in zh or ja. Speakers of these languages would have to evaluate whether an occasional error is problematic given the thousands (or tens of thousands?) of labels it adds in each run. It still puzzles me that so much can be found, but it appears to be made possible by the specific way these Wikipedia editions use interwikis.
    I'm still very much in support of another task of that bot: it attempts to extract Latin-script names from jawiki when the Wikidata item doesn't have such labels. I think this helps/helped us avoid duplicates not easily spotted by other means, as we generally try to compare (Latin-script) labels.
    As with any bot, I suppose its functions evolve over time, and I have found Kanashimi always open to input. --- Jura 22:33, 12 April 2021 (UTC)
  • I don't quite understand the problem with ja and zh, but as for the title question: full support. There are many maintenance items which have different labels in Wikipedia and Wiktionary (and other projects), so they are worth adding, at least for the sake of search. And yes, in aliases. And even if they seem wrong, adding them will also help to check the correctness of the interlanguage linking. --Infovarius (talk) 12:35, 14 April 2021 (UTC)