Wikidata:Lexicographical data/Documentation
This documentation page is currently being reworked. Some important changes may occur.
This is the main documentation page for lexicographical data on Wikidata.
See also the technical documentation on extension WikibaseLexeme.
Introduction
Data Model
Visualization of the Lexeme data model
The data model of WikibaseLexeme describes the structure of the data that is handled as "Lexemes" in Wikibase. The text below is a summary; for more detailed information, see Extension:WikibaseLexeme/Data Model.
A Lexeme is a lexical element of a language, such as a word, a phrase, or a prefix (see Lexeme on Wikipedia). Lexemes are Entities in the sense of the Wikibase data model.
At a high level, the Lexeme hierarchy is modeled as follows:
Lexeme (ID)
A Lexeme is described using the following information:
This data model is further extended by the set of properties typically used for Lexeme statements, Form statements, and Sense statements. See Wikidata:Lexicographical data/Properties for an overview of these properties and Wikidata:Property proposal/Lexemes for current proposals of additional properties.
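The structure summarized above can be sketched with plain Python dataclasses. This is only an illustrative model, not the WikibaseLexeme implementation: the class and field names are assumptions chosen for readability, statements are left out, and the Form ID, Sense ID and gloss text in the example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    id: str                           # e.g. "L536-F1" (hypothetical Form ID)
    representations: dict[str, str]   # language code -> spelling
    grammatical_features: list[str] = field(default_factory=list)  # Item IDs


@dataclass
class Sense:
    id: str                           # e.g. "L536-S1" (hypothetical Sense ID)
    glosses: dict[str, str]           # language code -> short definition


@dataclass
class Lexeme:
    id: str
    lemmas: dict[str, str]            # language code -> lemma
    language: str                     # Item ID of the language
    lexical_category: str             # Item ID of the lexical category
    forms: list[Form] = field(default_factory=list)
    senses: list[Sense] = field(default_factory=list)


# "book (L536)" from the sample table: English (Q1860), noun (Q1084).
book = Lexeme(id="L536", lemmas={"en": "book"},
              language="Q1860", lexical_category="Q1084")
book.forms.append(Form(id="L536-F1", representations={"en": "book"},
                       grammatical_features=["Q110786"]))  # Q110786: singular
book.senses.append(Sense(id="L536-S1",
                         glosses={"en": "illustrative gloss text"}))
```

Forms and Senses carry IDs derived from the ID of their Lexeme, which is why they are nested inside the Lexeme rather than modeled as independent entities.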
Sample Lexeme by Language and Lexical Category
Columns: verb, noun, pronoun, adjective, adverb, preposition, postposition, conjunction, interjection, numeral, determiner, grammatical particle
Arabic
ذهب ‎(L7882)
كِتَابٌ ‎(L2233)
أنا ‎(L7883)
جميل ‎(L7884)
عادةً ‎(L7885)
في ‎(L2452)
لَكِنَّ (L7886)
يعني ‎(L7887)
واحد ‎(L7891)
هذا ‎(L7892)
English
go (L3006)
book (L536)
I (L487)
beautiful (L3360)
usually (L4114)
in (L2987)
ago (L3240)
but (L1387)
oh (L4327)
one (L327)
this (L2994)
German
wissen (L2058)
Zukunft (L80)
ich (L7877)
ausgezeichnet (L530)
querbeet (L7059)
in (L6748)
aber (L7879)
ach (L7889)
eins (L7880)
dieser (L7881)
Korean
먹다 (L17)
사람 (L130)
나 (L246)
괴롭다 (L100)
함께 (L168)
가만 (L86)
극/極 (L83)
고전적/古典的 (L49)
Spanish
ir (L7385)
libro (L317)
yo (L55951)
hermoso (L55952)
normalmente (L55953)
en (L11741)
pero (L55954)
oh
uno (L44969)
esto (L55955)
French
aller (L750)
livre (L6873)
je (L9094)
beau (L7026)
toujours (L9105)
dans (L9148)
mais (L9261)
merci (L11618)
un (L9167)
ce (L9203)
Pashto
تلل
کتاب
زه
ښکلی
په
خو
یو
Persian
رفتن ‎(L2921)
کتاب
من ‎(L2377)
زیبا
در
را
اما
آخ
یک
این
Russian
быть (L2111)
вода (L189)
я (L2027)
хороший (L10951)
хорошо (L10948)
в/въ (L2109)
-
и (L2108)
всё (L2115)
три (L32930)
-
не (L2110)
Swedish
göra (L38963)
boll (L32310)
han (L35645)
listig (L39404)
ofta (L35726)
på (L35650)
-
och (L35648)
hej (L246342)
fem (L46944)
den (L47066)
ju (L53540)
Mandarin Chinese
设置
纪录片
他 (L7967)
坚韧不拔
慢慢地
但是
一 (L1773)
?
的 (L7975)
In some cases or languages there may be several entities for related words; in others there is just one. The table below gives an overview of how they may be linked:

One or several lexemes for nouns?
Difference in | 1 lexeme | 2+ lexemes
sense | add several senses | add applicable sense to lexeme; link other(s) with homograph lexeme; duplicate forms on each
etym. | add etym. to each sense | add etym. to lexeme base; link other(s) with homograph lexeme; duplicate forms on each
gender | add gender to each sense | add gender to lexeme base; link other(s) with homograph lexeme; duplicate forms on each
common/proper | add several senses; use lexical category "noun" | add applicable sense to lexeme; link other(s) with homograph lexeme; duplicate forms on each
caps/lowercase | add several forms; qualify forms to applicable senses | add applicable sense to lexeme; link other(s) with homograph lexeme; add only applicable forms
singular/plural | add several forms; qualify forms to applicable senses | add applicable sense; if possible link other(s) with homograph lexeme; add only applicable forms
pronunciation | add the same form twice; qualify forms to applicable senses, add pronunciation | add applicable sense; if possible link other(s) with homograph lexeme; add form and applicable pronunciation
forms/spelling | add several forms or alternate forms; qualify forms to applicable senses | add applicable sense; if possible link other(s) with homograph lexeme; add only applicable forms
For a given language and criterion (first column), only one of the two approaches might apply.

Interface
Lexeme
Screenshot of the Lexeme creation page
Create a new Lexeme
  1. Go to Special:NewLexeme
  2. Enter a lemma (dictionary form of a word) — Lemma
  3. Enter the language of the lexeme by typing the name of the language or Q-ID — Language of Lexeme
  4. In the field that appears above, enter the language code of the lemma — Spelling variant of the Lemma
  5. Enter the lexical category by typing its name or the Q-ID (example: verb, noun, adjective...) — Lexical category
  6. Click on "Create"
  7. The Lexeme is now created with this basic information; you can continue editing it
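For scripted creation, the same four pieces of information can be assembled into request parameters for the Wikibase `wbeditentity` API module. The following is a minimal sketch under stated assumptions: the function name `new_lexeme_request` is hypothetical, the code only builds the parameters and does not send them, and a real request would additionally need a logged-in session and an edit token.

```python
import json


def new_lexeme_request(lemma, spelling_variant, language_qid, category_qid):
    """Build (but do not send) parameters for creating a Lexeme.

    Mirrors the form fields: Lemma, Spelling variant of the Lemma,
    Language of Lexeme, Lexical category.
    """
    data = {
        "lemmas": {spelling_variant: {"language": spelling_variant,
                                      "value": lemma}},
        "language": language_qid,
        "lexicalCategory": category_qid,
    }
    return {
        "action": "wbeditentity",
        "new": "lexeme",
        "data": json.dumps(data, ensure_ascii=False),
        "format": "json",
    }


# English (Q1860) noun (Q1084) with the lemma "book".
params = new_lexeme_request("book", "en", "Q1860", "Q1084")
print(params["data"])
```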
Screenshot of the top of a Lexeme page
Edit a Lexeme
  1. Click on the edit button, next to the lemma
  2. Edit the content of the different fields
    • Lemma
    • Language code of the lemma — Spelling variant
    • Language of the Lexeme — Language
    • Lexical category
  3. Click on "publish"
Screenshot of the interface to edit a statement
Add, edit or delete statements of a Lexeme
  1. To add a statement of a Lexeme, click on "add statement"
  2. Enter a property: start typing its name in the property field (example: derived from lexeme) and select it in the suggester
  3. Enter a value.
    Note: A Wikidata property for lexicographic senses (Q54275340), such as translation (P5972) or synonym (P5973), does not currently support searching for Senses by Lexeme name. To enter a value for such a statement, you need to enter the exact Sense ID of the Sense you want as the value. For example, mother (L3625) has a synonym (P5973) statement pointing to a Sense of mom/mum (L11530); entering L11530-S2 is the only way this value can be published.
    Searching by Lexeme name does not find Lexemes or their Senses, whereas searching by an exact Sense ID returns a publishable result.
  4. Just like on Items, you can add qualifiers and references
  5. Save by clicking "publish"
  6. To edit a statement, click on "edit"
  7. To delete a statement, click on "edit", then "remove"
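Because Sense values have to be entered as exact IDs such as L11530-S2, it can help to check an ID before pasting it into the value field. Below is a small sketch; the helper name `parse_lexeme_id` is hypothetical, and the pattern assumes the usual ID shapes: `L<number>` for a Lexeme, `L<number>-F<number>` for a Form and `L<number>-S<number>` for a Sense.

```python
import re

# Lexeme ID, optionally followed by a Form (F) or Sense (S) suffix.
ID_PATTERN = re.compile(r"^L(?P<lexeme>[1-9]\d*)(?:-(?P<kind>[FS])(?P<index>[1-9]\d*))?$")

KINDS = {None: "lexeme", "F": "form", "S": "sense"}


def parse_lexeme_id(text):
    """Return a dict describing the ID, or None if it is not well formed."""
    match = ID_PATTERN.match(text.strip())
    if match is None:
        return None
    return {
        "kind": KINDS[match.group("kind")],
        "lexeme": "L" + match.group("lexeme"),
        "id": text.strip(),
    }


print(parse_lexeme_id("L11530-S2"))
print(parse_lexeme_id("mom"))
```

A check like this only validates the shape of the ID; it does not confirm that the Sense actually exists.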
Delete a Lexeme
To request the deletion of a Lexeme, go to WD:RFD (Wikidata:Requests for deletions).
Search for a Lexeme
Here's how you can look for Lexemes, Lemmas, Forms or Senses, via Special:Search or the search box on any page:
Note that the selector (the drop-down menu that suggests results as you type) does not work yet. If you press Enter or click "Search" after typing your keyword, you will still get the results.
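Outside the web interface, Lexemes can also be looked up through the `wbsearchentities` API module with the entity type set to `lexeme`. The sketch below only builds the request URL and sends nothing; the function name `lexeme_search_url` and the default values are assumptions made for illustration.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def lexeme_search_url(keyword, language="en", limit=10):
    """Build a wbsearchentities URL that looks for Lexemes matching keyword."""
    params = {
        "action": "wbsearchentities",
        "search": keyword,
        "language": language,
        "type": "lexeme",
        "limit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


url = lexeme_search_url("book")
print(url)
```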
Form
add a Form
Create a new Form
  1. In the Forms section, click on "add Form"
  2. Fill the representation — Representation (mandatory)
  3. Fill the language code of the representation — Spelling variant (mandatory)
  4. Enter one or several grammatical features, by typing their name and selecting them in the list of items — Grammatical features
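The same fields map onto the data passed to the `wbladdform` API module provided by WikibaseLexeme. The following is a hedged sketch rather than a complete client: the function name `add_form_request` is hypothetical, the request is only assembled and not sent, and authentication and the edit token are omitted.

```python
import json


def add_form_request(lexeme_id, representation, spelling_variant,
                     grammatical_features=()):
    """Build (but do not send) parameters for adding a Form to a Lexeme."""
    # Representation and spelling variant are mandatory in the interface.
    if not representation or not spelling_variant:
        raise ValueError("representation and spelling variant are mandatory")
    data = {
        "representations": {
            spelling_variant: {"language": spelling_variant,
                               "value": representation},
        },
        "grammaticalFeatures": list(grammatical_features),
    }
    return {
        "action": "wbladdform",
        "lexemeId": lexeme_id,
        "data": json.dumps(data, ensure_ascii=False),
        "format": "json",
    }


# Plural form "books" for book (L536); Q146786 is the Item for "plural".
form_params = add_form_request("L536", "books", "en", ["Q146786"])
print(form_params["data"])
```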
Edit a Form
  1. Click on the "edit" button next to the representation
  2. Modify the content in the fields
  3. Click on "publish"
Delete a Form
  1. Click on the "edit" button next to the representation
  2. Click on "remove"
Transliterations (Scripts/Phonetics)
A link to a new subpage is to be added here (proposed on the mailing list by Thadguidry (talk) 04:29, 13 December 2020 (UTC)).
Sense
Create a new Sense
  1. In the Senses section of a Lexeme, click on "add Sense"
  2. Enter a language code (for example: en, fr, zh) — Language
  3. Enter a gloss (a very short phrase defining the meaning; equivalent to skos:definition) — Gloss. Note: if a gloss is quoted or citable from a source, use gloss quote (P8394)
  4. You can add new glosses by clicking on "add"
  5. Click on "publish"
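Glosses in several languages can be prepared in one go for the `wbladdsense` API module. This is a minimal sketch with assumptions: the function name `add_sense_request` and the example gloss texts are hypothetical, and the code only builds the parameters without sending them or handling the edit token.

```python
import json


def add_sense_request(lexeme_id, glosses):
    """Build (but do not send) parameters for adding a Sense.

    glosses maps a language code (for example "en", "fr", "zh")
    to a very short phrase defining the meaning.
    """
    if not glosses:
        raise ValueError("at least one gloss is needed")
    data = {
        "glosses": {code: {"language": code, "value": text}
                    for code, text in glosses.items()},
    }
    return {
        "action": "wbladdsense",
        "lexemeId": lexeme_id,
        "data": json.dumps(data, ensure_ascii=False),
        "format": "json",
    }


# Illustrative glosses for book (L536); the wording is made up for this example.
sense_params = add_sense_request("L536", {
    "en": "set of written or printed pages bound together",
    "fr": "ensemble de pages reliées",
})
print(sense_params["data"])
```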
Translations, Synonyms, etc.
Each Sense can have many Sense statements, linking it not only to other Senses but also to Items, for example through translations, synonyms, antonyms, connotations, register, evokes, usage examples, refers-to-concept, etc.
This is shown in the colored visualization of the Lexeme data model above.
Edit a Sense
  1. Click on the "edit" button, next to the Sense ID
  2. Edit the content of the different fields
  3. Click on "publish"
Remove a Sense
  1. Click on the "edit" button, next to the Sense ID
  2. Click on "remove"
Features
See also: Wikidata:Lexicographical data/Development
What is included in the first version
What will be added in the future
Ordered from near to long-term plans
See also
Last edited on 18 September 2021, at 15:13
All structured data from the main, Property, Lexeme, and EntitySchema namespaces is available under the Creative Commons CC0 License; text in the other namespaces is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy.