The Penn Treebank Tag Set
The demo corpus available here is tagged with the Penn Treebank tag set,
described, for example, in Mitchell P. Marcus, Beatrice Santorini, and
Mary Ann Marcinkiewicz: Building a Large Annotated Corpus of English:
The Penn Treebank, in Computational Linguistics, Volume 19, Number 2
(June 1993), pp. 313-330 (Special Issue on Using Large Corpora).
The tagging was done at UPenn.
The following part-of-speech tags are used in the corpus:
1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition or subordinating conjunction
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NP Proper noun, singular
15. NPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PP Personal pronoun
19. PP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund or present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd person singular present
32. VBZ Verb, 3rd person singular present
33. WDT Wh-determiner
34. WP Wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB Wh-adverb
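For programmatic use, the list above can be held in a simple lookup table. The following is a minimal Python sketch, not part of the corpus tools: the names PENN_TAGS, describe_tag and describe_tagged are hypothetical, and the example sentence is hand-tagged purely for illustration. The tags and descriptions are copied from the list as given here.

```python
# Tag inventory copied from the list above (36 part-of-speech tags).
# PENN_TAGS, describe_tag and describe_tagged are hypothetical names
# chosen for this sketch.
PENN_TAGS = {
    "CC": "Coordinating conjunction",
    "CD": "Cardinal number",
    "DT": "Determiner",
    "EX": "Existential there",
    "FW": "Foreign word",
    "IN": "Preposition or subordinating conjunction",
    "JJ": "Adjective",
    "JJR": "Adjective, comparative",
    "JJS": "Adjective, superlative",
    "LS": "List item marker",
    "MD": "Modal",
    "NN": "Noun, singular or mass",
    "NNS": "Noun, plural",
    "NP": "Proper noun, singular",
    "NPS": "Proper noun, plural",
    "PDT": "Predeterminer",
    "POS": "Possessive ending",
    "PP": "Personal pronoun",
    "PP$": "Possessive pronoun",
    "RB": "Adverb",
    "RBR": "Adverb, comparative",
    "RBS": "Adverb, superlative",
    "RP": "Particle",
    "SYM": "Symbol",
    "TO": "to",
    "UH": "Interjection",
    "VB": "Verb, base form",
    "VBD": "Verb, past tense",
    "VBG": "Verb, gerund or present participle",
    "VBN": "Verb, past participle",
    "VBP": "Verb, non-3rd person singular present",
    "VBZ": "Verb, 3rd person singular present",
    "WDT": "Wh-determiner",
    "WP": "Wh-pronoun",
    "WP$": "Possessive wh-pronoun",
    "WRB": "Wh-adverb",
}


def describe_tag(tag):
    """Return the description for a tag, or None if it is not in the list."""
    return PENN_TAGS.get(tag)


def describe_tagged(tokens):
    """Attach a description to each (word, tag) pair."""
    return [(word, tag, PENN_TAGS.get(tag, "unknown tag")) for word, tag in tokens]


if __name__ == "__main__":
    # Hand-tagged illustration, not taken from the corpus.
    sentence = [("The", "DT"), ("dog", "NN"), ("barks", "VBZ")]
    for word, tag, description in describe_tagged(sentence):
        print(f"{word}\t{tag}\t{description}")
```

A plain dictionary is enough here because the tag set is small and fixed; tags outside the list (punctuation tags, for instance) fall through to "unknown tag" rather than raising an error.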
IMS Stuttgart / WWW@IMS.Uni-Stuttgart.DE / Tue May 19 18:04:13 1998 (hofmanaa)