Duplicate catalogs

Hello Jonathan, I noticed that some of the catalogues you uploaded to mix'n'match are duplicates of others which you had uploaded earlier:

Perhaps @Magnus Manske: can help you merge these catalogues (or just delete one of them). Mahir256 (talk) 04:53, 16 December 2018 (UTC)

Thanks for the note, I am aware of my duplication (after the fact of hitting submit), doh! It is a limitation of Mix-n-match that uploaders cannot truly fix their mistakes; even marking a catalog as not active won't hide the old one. You beat me to pinging Magnus on those. It is nice to know people are actually using Mix-n-match (or at least watching), since I can't do all these on my own. My technique in OpenRefine has matured, so I need to fix some older uploads that are not complete and excluded some territories, etc. I know of a few other corrections to descriptions that are needed, i.e. outlying " , " and Louisiana has Parishes not Counties... well, learning is fun, and it doesn't harm the matching. Thanks again. Wolfgang8741 (talk) 05:10, 16 December 2018 (UTC)

GNIS Matches

Hi Jonathan, thanks for contacting me about helping with the GNIS catalogs on mix & match. I had fun working on them (love mix & match) and the best thing is, someone finds the effort useful! I'll continue to work on them. Hope you will come to one of the upcoming Wikimedia DC events in the New Year. Best wishes, Uncommon fritillary (talk) 02:56, 17 December 2018 (UTC)

New page for catalogues

Hi, I created a new page for collecting sites that could be added to Mix'n'match, and I plan to expand it with the ones that already have scrapers, grouped by category. Feel free to expand it or use it for property creation. Best, Adam Harangozó (talk) 10:45, 6 November 2019 (UTC)

Thank you for participating in the FindingGLAMs Challenge!

  Thank you for participating in the FindingGLAMs Challenge!
By improving information about GLAM institutions on Wikidata, you made the Wikimedia projects better for everyone!

Alicia Fagerving (WMSE) (talk) 14:19, 16 March 2020 (UTC)

New OpenRefine reconciliation service

Hi!

Thank you for wearing the {{User loves OpenRefine}} userbox on your user page!

Because the existing Wikidata reconciliation service has had severe performance issues recently, I have created a new one which should be faster and more robust. You can add it to OpenRefine in the reconciliation dialog with the following URL: https://wikidata.reconci.link/en/api (or by replacing en by any other language code).

If you have any issues with this new service, let me know.

Happy reconciling! − Pintoch (talk)
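For anyone who wants to poke at the endpoint above outside of OpenRefine, here is a minimal sketch of how a batch of reconciliation queries could be prepared in Python. It assumes the service accepts the usual reconciliation protocol shape (a form field named `queries` holding a JSON object keyed `q0`, `q1`, ...); the helper names `endpoint_for` and `build_query_body` are made up for illustration, and nothing is sent over the network here.

```python
import json
from urllib.parse import urlencode, parse_qs

# URL pattern quoted in the message above; "en" can be replaced by another language code.
ENDPOINT_TEMPLATE = "https://wikidata.reconci.link/{lang}/api"


def endpoint_for(lang="en"):
    """Return the reconciliation endpoint for a given language code."""
    return ENDPOINT_TEMPLATE.format(lang=lang)


def build_query_body(names, type_qid=None, limit=3):
    """Build a form-encoded body for a batch of reconciliation queries.

    Assumption: the service expects a `queries` form field containing JSON,
    as reconciliation services commonly do. Verify against the service itself.
    """
    queries = {}
    for i, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_qid:
            query["type"] = type_qid  # restrict candidates to one Wikidata class
        queries[f"q{i}"] = query
    return urlencode({"queries": json.dumps(queries)})


if __name__ == "__main__":
    print(endpoint_for("en"))
    print(build_query_body(["Douglas Adams"], type_qid="Q5"))
```

The resulting body could then be POSTed to the endpoint with any HTTP client; that step is left out so the sketch stays runnable offline.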

US State Legislators

Thanks for all your work on adding position held (P39) statements for the state legislators: this is really great and super-useful. I've been considering adding some of this myself, but if you're in the middle of an upload spree, I don't want to step on your toes with any of it. Are you hoping to get to some degree of completeness with any of it at the minute, or is that something that's likely to be more of a longer-term hope? --Oravrattas (talk) 08:32, 19 January 2022 (UTC)

Thanks, I've started this manual import of the current U.S. State Senators, covering their current position and term dates from their Wikipedia page Infobox, to get more familiar with the data and the data errors in the office holders' Infoboxes before considering any automated import or tool for continuous verification across Wikidata and English Wikipedia. Incremental completeness is a longer-term goal, but my current scope of work is aimed at sync between Wikipedia and Wikidata to highlight the data gaps where both communities could work together to tackle completeness.
If you tackle the current representatives, it won't interfere with my current scope and may be a worthy comparison of technique.
My objectives so far are:
  1. Become familiar with the English Wikipedia infobox data and embedded errors as well as inconsistencies and missing Q's in Wikidata for the current position holders and data sync gaps.
  2. Import position held, start date of the current session, and replaces/replaced for all current U.S. State Senate position holders from the English Wikipedia Infobox, in sync with Wikidata.
  3. Flag missing Infoboxes.
  4. Identify where constraints and validation would improve Infobox parsing for officeholders while identifying pages and infoboxes needing cleanup in Wikipedia.
  5. Flag missing WikiProject working group claim for officeholders on English Wikipedia.
  6. Identify which external identifiers are missing from mix-n-match for politicians and would be useful to compare for coverage and details, i.e. Vote Smart candidate ID (P3344) and Ballotpedia ID (P2390).
My consideration here is that current office holders are an anchor from which to work back in time, filling out detail and working out issues with the data model as we go. For example, electoral district statements currently use the named district as the value without accounting for district reshaping (i.e. redistricting) or recording the years in which a given district shape existed. The importance of this depends on what detail Wikimedia and Wikidata users need to query. For now, linking to a single numeric district entity makes sense, as that is the level that exists in Wikipedia, with further detail and differentiation added to the model later when the need to increase resolution is warranted.
I don't have an immediate timeline on this work, more a direction of where it may go. I've mostly been focused on geographic items in Wikidata rather than political office holders, but thought WikiProject every politician would be a fun look into U.S. civics. Happy to hear any feedback or to coordinate on the data ingestion and sync. Wolfgang8741 (talk) 16:22, 20 January 2022 (UTC)
@Wolfgang8741: This is all really great! Are you pulling the data out of the infoboxes manually, or are there good tools for extracting/importing that information? One thing I'd like to do is set up some comparisons against an external source to help see where there are gaps. Unfortunately Open States only usually has useful identifiers for cross-matching when a state legislator has also been in Congress, so using it for bulk upload would require a lot of reconciliation first, but I suspect that simply comparing their data with what's already in Wikidata could be useful purely as a reporting tool / sanity check. It'll probably be a few weeks before I get a chance to look at any of that, but do you have a sense of what might be a good state to test with first? --Oravrattas (talk) 06:35, 25 January 2022 (UTC)
Yes, manually at this moment. The Harvest Templates function of Pywikibot might be a step toward automating this, but based on what I'm seeing I would want to put some sanity checks and data checks in place between extracting from the templates and inserting into Wikidata through any automated means.
  • I took a look at OpenStates (this group is new to me, but I remember the original work of the Sunlight Foundation), which is cool and CC0, meaning their data is compatible for import, but currently their discussions and issues have zero mention of Wikidata. A step forward might be to propose an OpenStates ID property on Wikidata and to reach out to OpenStates about adding a Wikidata Q field to their records, allowing for cross-linking and sanity checks with less need to resolve entities again. To help with the heavy lift of syncing and resolving identities, https://mix-n-match.toolforge.org/ might be a good tool as a collaborative space, and it allows monitoring. It looks like OpenStates is using the format ocd-person/9b425a88-36ae-439c-b2f9-8da167b9ff27 for unique IDs for people. Have you worked with them or reached out to them?
  • Before I started, New Jersey had been worked on, and I modeled some of the structure on it, but I don't know how functionally complete that data currently is compared to the other states I've started. If you would like to collaborate on a state I'd be happy to focus my effort on one. I'd personally be more interested in Michigan, Ohio, or West Virginia. Wolfgang8741 (talk) 16:39, 28 January 2022 (UTC)
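As a small illustration of the sanity checks mentioned above, the OpenStates person ID quoted in this thread could be validated before loading it into a catalog. This is only a sketch: the pattern is inferred from the single example ID above (the prefix `ocd-person/` followed by a lowercase UUID), and the function name `parse_ocd_person_id` is hypothetical, not part of any OpenStates tooling.

```python
import re
import uuid

# Assumption: every person ID looks like the one example quoted in the thread,
# i.e. "ocd-person/" followed by a lowercase, hyphenated UUID.
OCD_PERSON_RE = re.compile(
    r"^ocd-person/"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$"
)


def parse_ocd_person_id(value):
    """Return the UUID part of an ocd-person ID, or None if the value does not match."""
    match = OCD_PERSON_RE.match(value.strip())
    if match is None:
        return None
    return uuid.UUID(match.group(1))


if __name__ == "__main__":
    example = "ocd-person/9b425a88-36ae-439c-b2f9-8da167b9ff27"
    print(parse_ocd_person_id(example))
    print(parse_ocd_person_id("ocd-person/not-a-uuid"))
```

Rows that come back as None would be candidates for manual review rather than blind import.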
I reached out on the OpenStates Slack about crosslinking Wikidata and OpenStates and proposing an ID. They seem open to crosslinking if Wikidata IDs could be provided. I'm going to work on a Mix-n-Match catalog to assist with resolving one state and propose an OpenStates ID property to link back. This would help with your sanity check and give an idea of how much work it is to bring the workflows together. Both being CC0 is useful too. Wolfgang8741 (talk) 17:33, 28 January 2022 (UTC)
@Oravrattas I've created a Mix-n-Match catalog for OpenStates people and will add the OpenStates property once the proposed ID is settled and approved. Wolfgang8741 (talk) 18:37, 10 February 2022 (UTC)
@Wolfgang8741: that's great! I've done some matching, but Mix-n-Match is a bit too unbearably slow for me at the minute, so I'll try coming back to it again tomorrow and hopefully it'll be a bit faster then. --Oravrattas (talk) 20:21, 10 February 2022 (UTC)
@Oravrattas I've completed matches for 7259 IDs. The remaining 737 do not have an EN Wiki article (maybe a few have been created since) and may have Wikidata IDs, but without the property proposal being approved so that IDs can be added, there is not much motivation to dig into the additional data. I found issues with some names, so a blind import may not be the best approach, but definitely let me know if you do any import or sanity checks with the resolved OpenStates data. I also proposed the Michigan Legislative Bio ID, which may aid in validation of Michigan politicians. Wolfgang8741 (talk) 06:01, 23 February 2022 (UTC)
Wow, that's phenomenal work! I've tinkered a little bit around the edges here and there, as I've come across relevant things, but it's likely to be at least another few weeks before I get the chance to go deep on any of it.
NB: I didn't get a notification of the Michigan Legislative Bio ID proposal, so it's possible that the 'ping project' didn't work. (It also looks like you might have doubled up the references to the Senate a couple of times: I'm presuming this was meant to be the House and Senate?) Oravrattas (talk) 06:13, 23 February 2022 (UTC)
Your contributions to the matching were appreciated; I definitely didn't do it all on my own. Thanks for the heads up on the proposal. I fixed the double ref, you were correct. Wolfgang8741 (talk) 06:25, 23 February 2022 (UTC)

Call for participation in a task-based online experiment

Dear Wolfgang8741,

I hope you are doing well.

I am Kholoud, a researcher at King's College London, and I work on a project as part of my PhD research, in which I have developed a personalised recommender system that suggests Wikidata items for the editors based on their past edits. I am collaborating on this project with Elena Simperl and Miaojing Shi.

I am inviting you to a task-based study that will ask you to provide your judgments about the relevance of the items suggested by our system based on your previous edits.

Participation is completely voluntary, and your cooperation will enable us to evaluate the accuracy of the recommender system in suggesting relevant items to you. We will analyse the results in anonymised form, and they will be published at a research venue.

The study will start in late January 2022 or early February 2022, and it should take no more than 30 minutes.

If you agree to participate in this study, please either contact me at kholoud.alghamdi@kcl.ac.uk or use this form https://docs.google.com/forms/d/e/1FAIpQLSees9WzFXR0Vl3mHLkZCaByeFHRrBy51kBca53euq9nt3XWog/viewform?usp=sf_link

I will contact you with the link to start the study.

For more information about the study, please read this post: https://www.wikidata.org/wiki/User:Kholoudsaa

In case you have further questions or require more information, don't hesitate to contact me through my mentioned email.

Thank you for considering taking part in this research.

Regards

Kholoudsaa (talk) 18:32, 28 January 2022 (UTC)

Territorial overlaps

Regarding this revert, I think my edit was closer to current best practices. territory overlaps (P3179) was created with the express purpose that located in the administrative territorial entity (P131) would be limited to stating the territories that the subject lies completely within, in this case only Ohio (Q1397) but neither Butler County (Q485561) nor Warren County (Q489576). Otherwise the two properties would be redundant to each other. I recognize that this gets especially awkward in the rare cases where a township extends across county lines, but it would be better to model that case with the township's government's item having more than one parent organization (P749) statement. Minh Nguyễn 💬 05:50, 5 November 2022 (UTC)

@Mxn I noticed the territory overlaps (P3179) property after reverting, but left the revert in place since the property is not commonly applied across cities with such boundaries nor, probably more importantly, adopted for display in Wikimedia Commons or other infoboxes across Wikimedia projects. This may be part of a broader issue: property behavior gets modified without tracking the templates that rely on the property. I agree that it makes for cleaner modeling, but relying on it currently has real-world impacts until the templates and queries are updated to provide the same resolution about the location of objects overlapping administrative boundaries. I'm not opposed to moving to the modeling you're suggesting, but how to make that move, and how to check that downstream impacts are handled prior to modifying models, may be a good discussion to bring up at WikiConference North America and more broadly. I find the communication channels across the wikis limited and lacking when it comes to ensuring there are maintainers ready and willing to make modifications prior to breaking changes. Wolfgang8741 (talk) 13:25, 5 November 2022 (UTC)

Mooring dolphin

Hi Wolfgang,

I don't speak English as my native language, but I would call mooring cell (Q116944508) a mooring dolphin. Kind regards Jackie Bensberg (talk) 21:33, 27 March 2023 (UTC)

Hi @Jackie Bensberg. From what I've read the mooring cell is not considered a mooring dolphin. The CATMOR classification in [1] suggests it should be classified as a post/pile type (or it may be best represented as a subclass of post/pile on Wikidata). There are a number of mooring types not yet represented as separate Q concepts that are ripe for being added or organized to line up with the International Hydrographic Organization (Q233611)'s S-100 data model found at [2] [3] and with the OpenStreetMap moorings tagging schema. Wolfgang8741 (talk) 16:14, 30 March 2023 (UTC)

District edits

Hey, thanks for the edits you've been making to the state district items. I noticed you were making a lot of very similar edits. You may already know this, but you may want to look into petscan and quickstatements for ways to make big batches of edits like this with less manual effort. Either way, thanks for doing it. BrokenSegue (talk) 04:28, 11 April 2024 (UTC)

I kicked off a job to set the description for all remaining undescribed districts: https://quickstatements.toolforge.org/#/batch/228079 BrokenSegue (talk) 04:45, 11 April 2024 (UTC)
Yeah, I knew there weren't that many so I defaulted to manual, but I've used both when there are more than 100 items to deal with. I wanted to check other statements for consistency too and had the time. It was more going down a rabbit hole manually than an intentional large batch edit. Wolfgang8741 (talk) 23:48, 11 April 2024 (UTC)
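For reference, a description batch like the one mentioned above can be prepared as tab-separated QuickStatements V1 commands, where `Den` sets the English description. The sketch below is not the actual batch that was run: the item IDs and description text are placeholders, and the helper `description_commands` is a made-up name.

```python
def description_commands(items, lang="en"):
    """Build QuickStatements V1 lines that set a description on each item.

    `items` is an iterable of (qid, description) pairs. Each output line has the
    form: QID <TAB> D<lang> <TAB> "description".
    """
    lines = []
    for qid, description in items:
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not a valid item ID: {qid!r}")
        if '"' in description or "\t" in description or "\n" in description:
            # Keep the sketch simple: refuse text that would need escaping.
            raise ValueError(f"unsupported characters in description for {qid}")
        lines.append(f'{qid}\tD{lang}\t"{description}"')
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder item IDs and description text, for illustration only.
    batch = [
        ("Q1000001", "state legislative district in Ohio"),
        ("Q1000002", "state legislative district in Ohio"),
    ]
    print(description_commands(batch))
```

The printed text can be pasted into the QuickStatements import box; the list of item IDs itself could come from a PetScan or query result export.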