Wikidata:Requests for permissions/Bot/Pi bot 18
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved --Lymantria (talk) 06:15, 11 January 2021 (UTC)
Pi bot 18
Pi bot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Mike Peel (talk • contribs • logs)
Task/s: Import coordinates from Commons and Wikipedias
Function details: The code looks for coordinates that can be imported from sitelinked Commons categories (currently via uses of commons:Template:Object location in categories; other sources may be added later), or from sitelinked Wikipedia articles (currently using en:Category:Coordinates not on Wikidata), and imports them to Wikidata. On Commons it also removes the template (provided it is within 100 m of the Wikidata coordinate) and adds the Wikidata Infobox if it's not already present. This would run twice weekly.
Example edits from Commons: [1], [2] (NB: as demonstrated here, this script does not check for P625 as a qualifier value, as such values aren't used on Commons or enwp). From enwp: [3], [4], [5]. On Commons the removal looks like this: [6] (this should already be covered by another bot task there, but I'll double-check on Commons before running this part, as it's outside the scope of a Wikidata bot request).
Thanks. Mike Peel (talk) 19:15, 25 December 2020 (UTC)
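For illustration only, the "within 100 m" condition in the function details could be checked with a great-circle distance, as in the sketch below. This is not taken from the bot's source: the function names `distance_m` and `can_remove_template`, the haversine formula, and the spherical Earth radius are all assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def can_remove_template(commons_coord, wikidata_coord, threshold_m=100.0):
    """True if the Commons (lat, lon) lies within threshold_m of the Wikidata (lat, lon)."""
    return distance_m(*commons_coord, *wikidata_coord) <= threshold_m
```

With this sketch, two points 0.0005° of latitude apart (roughly 56 m) would pass the check, while points 0.002° apart (roughly 220 m) would not.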
- @Mike Peel: I think your code has a minor bug. Here you load the coordinates in order, but if the item has multiple coordinates and one is None (because it's set to unknown value, for example) then you will add the claim. Also, if there are multiple templates with coordinates you grab one of them arbitrarily. I think we should maximally err on the side of not importing when there's confusion. Also, I'm wondering if you should add a reference to the claim (to enwiki/commons) with a link to the version of the article you used so we can track data lineage. Support in concept though. BrokenSegue (talk) 00:13, 26 December 2020 (UTC)
- @BrokenSegue: Thanks for the code review! I've modified the code so that if there are multiple categories but one is bad then it will skip the item. I'm also now counting the number of coordinate templates, and skipping the item if there isn't exactly one. I would prefer not to add a reference, as it's not really a reference, and the source wiki is in the edit summary if people want to check where it came from. Thanks. Mike Peel (talk) 18:24, 26 December 2020 (UTC)
- I agree "imported from enwiki" or whatever isn't a great reference but it's better than nothing. Going through the edit history isn't a tractable process in the general case. BrokenSegue (talk) 18:28, 26 December 2020 (UTC)
- @BrokenSegue: I generally think it is worse than nothing, since it implies that the statement is referenced when it isn't really, particularly since the references section is auto-collapsed. But I can add it if need be. Thanks. Mike Peel (talk) 18:34, 26 December 2020 (UTC)
- Well, I'd guess a majority, or maybe just a plurality, of references here are just "imported from enwiki", so people should already be aware that the presence of a reference isn't that meaningful. I think tracking the information's lineage is useful but it's not absolutely critical. BrokenSegue (talk) 18:37, 26 December 2020 (UTC)
- Also, maybe it should check for coordinates as qualifiers? Looking at your example edit here, it seems the coordinate is kinda already present, but for the HQ location (which makes more sense; do battalions really have coordinates?) BrokenSegue (talk) 00:16, 26 December 2020 (UTC)
- @BrokenSegue: The problem is that coordinates as qualifiers aren't used on enwp/commons/elsewhere, and they're not likely to be. Most of them were added by @DeltaBot: without a bot task; this was discussed at [7], and hopefully that script is no longer being run. It would generally be better to have P625 as the main property, with a qualifier of applies to part (P518). I can write another bot script to move them back if needed, but that adds complications beyond just importing them again (perhaps there are valid qualifier coordinate values that are different from also-valid commons/enwp values). Thanks. Mike Peel (talk) 18:24, 26 December 2020 (UTC)
- Category:Coordinates not on Wikidata contains a large number of bad entries (e.g. [8]). And most of the time the item does not exist and you need to create one. --GZWDer (talk) 15:28, 26 December 2020 (UTC)
- @GZWDer: You're linking to 2018 edits; surely the bad entries were also fixed on enwp at the same time as they were removed from here? I don't understand the 'item does not exist' point. Thanks. Mike Peel (talk) 18:24, 26 December 2020 (UTC)
- Even if the pages listed are fixed on enwp, there may be newer cases where coordinate templates are applied to articles about spacecraft, people, or companies (where a headquarters statement is preferred). Also, most newer pages with coordinates not in Wikidata are newly created, and do not have Wikidata items. --GZWDer (talk) 20:44, 26 December 2020 (UTC)
- @GZWDer: I've added an exclusion for instance of (P31)=human (Q5) (although perhaps coordinates do make sense for humans with graves?); I can add other checks if needed. I didn't add it for spacecraft as there aren't that many articles. For companies, see the above comments; perhaps I'm missing where this was previously discussed? For pages without Wikidata items, I mostly rely on your bot script to create new items. Any news on when it will be running again? Thanks. Mike Peel (talk) 18:56, 28 December 2020 (UTC)
- You are welcome to comment on Wikidata:Requests_for_permissions/Bot/RegularBot 2 for regular creation of new items. In the meantime, items may be imported from given sets of categories (e.g. recent deaths), but this obviously will not cover all unconnected pages. --GZWDer (talk) 19:00, 28 December 2020 (UTC)
- I've commented (again) there. In this case, unconnected items would be skipped over. I might propose another pi bot task to create items if needed, but I'd prefer it if your already existing bot script did that! Thanks. Mike Peel (talk) 19:13, 28 December 2020 (UTC)
- @GZWDer: Sorry ... I've proposed Wikidata:Requests for permissions/Bot/Pi bot 19 to create the Wikidata items for enwp articles/categories, but I'd still rather see your bot do this. Thanks. Mike Peel (talk) 19:30, 3 January 2021 (UTC)
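The safeguards agreed in the discussion above (skip humans, skip items that already have any coordinate statement including an unknown value, and skip pages that do not have exactly one coordinate template) can be sketched roughly as follows. This is a hypothetical illustration using plain Python data rather than the bot's actual code or any Pywikibot calls; the function name `should_import` and its argument shapes are assumptions.

```python
HUMAN = "Q5"  # Wikidata item ID for "human"


def should_import(instance_of, existing_coords, template_coords):
    """Decide whether a coordinate may be imported; returns (ok, reason).

    instance_of:     list of P31 target IDs on the item, e.g. ["Q5"]
    existing_coords: P625 main-statement values already on the item;
                     None stands for an "unknown value" or "no value" statement
    template_coords: (lat, lon) pairs parsed from the source page
    """
    if HUMAN in instance_of:
        return False, "item is a human"
    if existing_coords:
        # A non-empty list blocks the import even if every entry is None,
        # so an unknown-value statement is never overwritten or duplicated.
        return False, "item already has a coordinate statement"
    if len(template_coords) != 1:
        # Zero templates means nothing to import; several means ambiguity.
        return False, f"expected exactly 1 coordinate template, found {len(template_coords)}"
    return True, "ok"
```

The checks err on the side of not importing: any ambiguity results in the item being skipped rather than a guess being written to Wikidata.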
I will approve this request in a couple of days, provided that no objections are raised. Lymantria (talk) 16:57, 9 January 2021 (UTC)