SAMPA was devised as a hack to work around the inability of text encodings to represent IPA symbols. Later, as Unicode support for IPA symbols became more widespread, the necessity for a separate, computer-readable system for representing the IPA in ASCII decreased. However, X-SAMPA is still useful as the basis for an input method for true IPA.
- The IPA symbols that are ordinary lower case letters have the same value in X-SAMPA as they do in the IPA.
- X-SAMPA uses backslashes as modifying suffixes to create new symbols. The backslash has no fixed meaning: for example, O is a distinct sound from O\, to which it bears no relation. This use of the backslash can be a problem, since many programs interpret it as an escape character for the character that follows it. For example, such X-SAMPA symbols do not work in EMU, so backslashes must be replaced with some other symbol (e.g., an asterisk: '*') when adding phonemic transcription to an EMU speech database.
- X-SAMPA diacritics follow the symbols they modify. Except for ~ for nasalization, = for syllabicity, and ` for retroflexion and rhotacization, diacritics are joined to the character with the underscore character _.
- The underscore character is also used to encode the IPA tiebar: k_p codes for /k͡p/.
- The numbers _1 to _6 are reserved diacritics as shorthand for language-specific tone numbers.
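The conventions above can be illustrated with a short Python sketch. It is not part of any X-SAMPA or EMU tooling: the function names `to_emu_safe`, `from_emu_safe` and `split_symbol` are hypothetical, and the splitter assumes it is given a single transcription unit in which every underscore introduces exactly one joined part (it does not tell a diacritic from a tiebar, and it does not handle diacritics that themselves contain more than one underscore).

```python
DIRECT_DIACRITICS = "~=`"  # attached without an underscore


def to_emu_safe(xsampa: str, marker: str = "*") -> str:
    """Replace each backslash with `marker` (hypothetical helper)."""
    if marker in xsampa:
        # Refuse ambiguous input so the replacement stays reversible.
        raise ValueError(f"marker {marker!r} already occurs in {xsampa!r}")
    return xsampa.replace("\\", marker)


def from_emu_safe(text: str, marker: str = "*") -> str:
    """Undo to_emu_safe by restoring the backslashes."""
    return text.replace(marker, "\\")


def split_symbol(token: str):
    """Split one unit into (base, direct diacritics, underscore-joined parts)."""
    head, *joined = token.split("_")
    base = head.rstrip(DIRECT_DIACRITICS)
    if not base:
        # The token consists only of ~, = or `; treat it as a bare symbol.
        base = head
    direct = head[len(base):]
    return base, list(direct), joined


# Backslashes are written doubled in ordinary Python string literals.
print(to_emu_safe("O\\"))    # O*
print(split_symbol("t_h"))   # ('t', [], ['h'])
print(split_symbol("n="))    # ('n', ['='], [])
print(split_symbol("k_p"))   # ('k', [], ['p'])
```

Note that the backslash problem described above shows up in the example itself: the X-SAMPA symbol O\ has to be typed as "O\\" in the source code, because Python also treats the backslash as an escape character.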
Lower case symbols
Asterisks (*) mark sounds that do not have X-SAMPA symbols. Daggers (†) mark IPA symbols that have recently been added to Unicode. Since April 2008, this has been the case for the labiodental flap, symbolized by a right-hook v in the IPA: ⱱ. A dedicated symbol for the labiodental flap does not yet exist in X-SAMPA.
Last edited on 27 May 2021, at 22:32
Content is available under CC BY-SA 3.0 unless otherwise noted.