Module talk:Language

This module is within the scope of WikiProject Languages, a collaborative effort to improve the coverage of languages on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This module does not require a rating on Wikipedia's content assessment scale.

Italicisation of Halkomelem

This discussion moved to Template talk:Lang#Italicisation of Halkomelem, as it pertains to {{lang}}.

Wikt-lang Template: Link to subsection for words with multiple "Etymologies" in the language?

For example, consider the English word "see". It has two etymologically distinct meanings: a verb for visually perceiving or understanding, and a Catholic diocese headed by a bishop.

To show this, the Wiktionary page for "see" has multiple Etymology subsections within the English section. (In this case, only two, but some pages have many more.)

When using the {{Wikt-lang}} template, there's no option to link to a specific Etymology subsection. For example, {{Wikt-lang|en|see}} produces "see", which always links to the beginning of the "English" section. If I'm editing the Wikipedia page about Episcopal Sees, I'd like to be able to link to the second etymology of "see" in Wiktionary.

This probably requires a new, fifth parameter for "Etymology Number" or something like that. (@Erutuon:, @Jberkel:?) Dark Jackalope (talk) 23:47, 9 November 2022 (UTC)

Greenlandic / Kalaallisut

Wiktionary lists Greenlandic language words under the heading "Greenlandic" but the template provides the word "Kalaallisut". Eievie (talk) 03:46, 7 January 2023 (UTC)

missing data?

Am I mistaken, or should every language tag that is listed in the ["languages"] table have both of ["name"] and ["article"] entries? If that is so then:

these are missing ["name"]:

these are missing ["article"]:

In the above list, the language names are the names that Module:Lang associates with the language tag. When a second name is listed, the Module:Lang and Wiktionary definitions of the language tag disagree. In the case of the IETF-like language tags, Module:Lang does not support extlangs and the extlangs that Wiktionary uses are not defined in the IANA language-subtag-registry file. Wiktionary names used above were taken from one of:

Am I mistaken? Is there a reason that these language tags do not have both of ["name"] and ["article"]?

Trappist the monk (talk) 15:50, 14 December 2023 (UTC)

@Editor Erutuon: you are (apparently) the original author of this module. Do you have an answer for my questions?
Trappist the monk (talk) 20:32, 17 December 2023 (UTC)
It's been a long time since I looked at this module. It doesn't currently use the article field in any code that can be invoked by a template. makeLinkedName uses the article field, but it's orphaned and apparently I never tested it, because it would give a "tried to concatenate nil" error for language codes that don't have an article field or don't have a Wikipedia_name or name field. I think I meant to use it in another invokable module function that would prepend the language name like {{lang-fr}} does, and never got around to it. But it would probably be better to have Module:Lang generate that language name. So for now the article field is unnecessary.
The name field is used to specify the canonical name (the name of the language used in level-2 headers in Wiktionary entries). If I remember right, in the code that is invoked by templates, it is only needed when mw.language.fetchLanguageName(languageCode, 'en') doesn't return the canonical name. Thus, be doesn't have a name field because mw.language.fetchLanguageName('be', 'en') returns Belarusian, the canonical name; but ab has a name field with the canonical name Abkhaz because mw.language.fetchLanguageName('ab', 'en') returns Abkhazian and aaq has Abnaki because mw.language.fetchLanguageName('aaq', 'en') returns nil.
The name fields are incomplete because I hadn't worked out how to fully populate the list of idiosyncratic Wiktionary language names and keep it up to date. There's no way for Lua modules on Wikipedia to access data modules on Wiktionary, so unless someone manually copies modules from Wiktionary to Wikipedia every time there is a change, the syncing would have to be done with JavaScript or an off-wiki script.
On Wiktionary we don't use real extlang subtags or IETF language tags, but only idiosyncratic codes identifying languages that can have entries, or language families, or variants (dialects for instance) of languages or language families, or variants of variants. This Wikipedia module only cares about languages that have entries. (It could also be made to handle codes of variants of those languages, if we wanted to figure that out; they are listed in wikt:Module:etymology languages/data along with variants of language families.) Some languages that have entries use ISO 639 language codes (probably sometimes with a different circumscription from what ISO prescribes), and others use codes made of two or three segments, each two or three letters long, joined by hyphens. (The codes for languages that have entries all match a regex ^[a-z][a-z][a-z]?(-[a-z][a-z][a-z]?){0,2}$, but according to this table, most hyphenated codes have segments of three letters.) There's logic to how the language codes are made, such as that the first segment is usually an existing code, but as far as I know, those making Wiktionary language codes never look at the IANA subtag registry.
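As an illustration only, the regex quoted above can be exercised off-wiki. The sketch below is in Python (an assumption; the module itself is Lua), the function name is hypothetical, and the sample strings are chosen to show the shapes the pattern accepts or rejects, not taken from any Wiktionary data module.

```python
import re

# Pattern copied verbatim from the comment above.
WIKT_CODE_RE = re.compile(r"^[a-z][a-z][a-z]?(-[a-z][a-z][a-z]?){0,2}$")


def looks_like_wiktionary_code(code: str) -> bool:
    """Return True if `code` has the shape described for entry-language codes.

    Hypothetical helper: it checks shape only and says nothing about
    whether the code is actually defined on Wiktionary.
    """
    return WIKT_CODE_RE.fullmatch(code) is not None


# Illustrative inputs: one to three segments of two or three lowercase letters.
for sample in ["en", "smp", "ine-pro", "aa-bb-ccc", "ine-x-proto", "English", "a"]:
    print(sample, looks_like_wiktionary_code(sample))
```

Note that a private-use tag such as ine-x-proto does not fit this pattern, because its middle segment is a single letter.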
Wiktionary puts these idiosyncratic language codes in lang attributes in the HTML. When they happen to be valid IETF language tags (as many are), it's because Wiktionary uses an ISO language code and the IANA registry also uses it. There has been talk about translating Wiktionary language codes into IETF tags, but no work on it so far. If Wiktionarians do that, we'll have to come up with some private-use tags, as Wikipedia has done with ine-x-proto, and it could simplify things if Wiktionary and Wikipedia use the same private-use tags.
Anyway, that's a lot of background information. Syncing the name field from the Wiktionary language data modules requires 1. grabbing all the pairs of codes and canonical names of languages that have entries from wikt:Module:languages/code to canonical name (this can't be done in Lua on Wikipedia), 2. filtering out the names that are equal to the output of mw.language.fetchLanguageName(code, 'en'), 3. writing the remaining codes and names to a new data module on Wikipedia (separate from Module:Language/data, which contains some data that has to be manually edited), 4. making this module use the new data module. The name field could then be removed from Module:Language/data. There are several other problems in this module, but this would solve the name problem. — Eru·tuon 21:49, 17 December 2023 (UTC)
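Steps 2 and 3 above could be sketched as an off-wiki script along the following lines. This is a minimal sketch under stated assumptions, not existing code: the function names are hypothetical, the MediaWiki lookup is replaced by a stand-in dictionary holding only the three examples given earlier in this thread, and step 1 (fetching the code-to-name pairs from Wiktionary) is assumed to have been done already.

```python
# Stand-in for mw.language.fetchLanguageName(code, 'en'); the values are the
# ones quoted earlier in this thread. Unknown codes yield None here, following
# the thread's description of a nil result.
MEDIAWIKI_NAMES = {"be": "Belarusian", "ab": "Abkhazian"}


def fake_fetch_language_name(code):
    """Hypothetical substitute for the MediaWiki language-name lookup."""
    return MEDIAWIKI_NAMES.get(code)


def names_needing_override(wikt_names, fetch_name):
    """Step 2: keep only codes whose canonical name differs from MediaWiki's."""
    return {
        code: name
        for code, name in wikt_names.items()
        if fetch_name(code) != name
    }


def render_lua_data_module(overrides):
    """Step 3: render the remaining pairs as the text of a Lua data module.

    Assumes names contain no double quotes or backslashes (no escaping is done).
    """
    lines = ["return {"]
    for code, name in sorted(overrides.items()):
        lines.append(f'\t["{code}"] = "{name}",')
    lines.append("}")
    return "\n".join(lines)


# Code-to-canonical-name pairs as step 1 would supply them (thread examples only).
wikt_names = {"be": "Belarusian", "ab": "Abkhaz", "aaq": "Abnaki"}
overrides = names_needing_override(wikt_names, fake_fetch_language_name)
print(render_lua_data_module(overrides))
```

With the thread's three examples, be is filtered out because both sources agree on Belarusian, while ab and aaq remain.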
Ah, thanks for that. My purpose is to add a level-2-header anchor to a Wiktionary interwiki link. With what you've written above, it looks like a two-step process. Step one is to query Module:Language/data for the Wiktionary preferred name. When that query returns nil, then step two: attempt to fetch the language name from MediaWiki using mw.language.fetchLanguageName(<tag>, 'en'). When both fail, use the value in |anchor= if present and emit an error category; when |anchor= is missing or empty, emit an error message and an error category so that the underlying data can be fixed. It ain't pretty but it will work I think.
Trappist the monk (talk) 23:38, 17 December 2023 (UTC)
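The fallback order described in the comment above can be sketched as follows. This is an illustration in Python rather than module code: the function name, the return shape, the error-category name and the stand-in lookup tables are all assumptions, and the sample data uses only the examples given earlier in this thread.

```python
# Placeholder name; the thread does not say what the error category is called.
ERROR_CATEGORY = "Hypothetical error category"


def resolve_anchor(tag, data_names, fetch_name, anchor=None):
    """Return (anchor_text, error_message, categories) for a language tag.

    data_names stands in for the name fields of Module:Language/data and
    fetch_name for mw.language.fetchLanguageName(tag, 'en').
    """
    name = data_names.get(tag)      # step one: Wiktionary preferred name
    if not name:
        name = fetch_name(tag)      # step two: MediaWiki's language name
    if name:
        return name, None, []
    if anchor:                      # both failed, but |anchor= was supplied
        return anchor, None, [ERROR_CATEGORY]
    # Both failed and |anchor= is missing or empty.
    message = f"no language name found for tag '{tag}'"
    return None, message, [ERROR_CATEGORY]


data_names = {"ab": "Abkhaz", "aaq": "Abnaki"}
mediawiki = {"be": "Belarusian", "ab": "Abkhazian"}

print(resolve_anchor("ab", data_names, mediawiki.get))
print(resolve_anchor("be", data_names, mediawiki.get))
print(resolve_anchor("zzz", data_names, mediawiki.get, anchor="Example"))
print(resolve_anchor("zzz", data_names, mediawiki.get))
```

Here zzz is just a tag that appears in neither stand-in table.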
Looks good. That's roughly what {{wikt-lang}} does already, apart from not having a |anchor= parameter. — Eru·tuon 22:06, 30 December 2023 (UTC)

Samaritan

{{lang|smp|example}} (the code for Samaritan Hebrew language) adds a page to Category:Articles containing Samaritan-language text, but Samaritan language redirects to Samaritan Aramaic language (sam). -- Error (talk) 20:06, 17 January 2024 (UTC)

That is not a 'failing' of Module:Language (because that module is not used by {{lang}}). It may be a failing of Module:Lang which when queried gives these results:
{{lang|fn=category_from_tag|smp|link=yes}} → Category:Articles containing Samaritan Hebrew-language text
{{lang|fn=name_from_tag|smp|link=yes}} → Samaritan Hebrew
{{lang|fn=category_from_tag|sam|link=yes}} → Category:Articles containing Samaritan Aramaic-language text
{{lang|fn=name_from_tag|sam|link=yes}} → Samaritan Aramaic
The ISO 639-3 custodian assigns smp to Samaritan. Similarly, the IANA language-subtag-registry file also assigns smp to Samaritan. Apparently, no one at en.wiki has cared enough about which article is linked when smp is used in {{lang}} to gain a consensus to override the ISO 639-3 name or to repoint the Samaritan-language-redirect so that it points elsewhere.
Issues involving {{lang}}, {{transliteration}}, and other templates that use Module:Lang, should be discussed at Template talk:Lang, not here. Where is it best to discuss repointing the Samaritan-language-redirect? I don't know, except to say that such discussion should not be here.
Trappist the monk (talk) 20:48, 17 January 2024 (UTC)
Thanks. Asked at Template talk:lang#Samaritan --Error (talk) 01:23, 18 January 2024 (UTC)