1.6 Annotated Text Corpora

Many text corpora contain linguistic annotations, representing POS tags, named entities, syntactic structures, semantic roles, and so forth. NLTK provides convenient ways to access several of these corpora, and has data packages containing corpora and corpus samples, freely downloadable for use in teaching and research. Table 1.2 lists some of the corpora. For information about downloading them, see http://nltk.
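
As a rough illustration of what access to an annotated corpus can look like, the sketch below reads the first few (word, tag) pairs from the Brown corpus, one of the POS-tagged corpora distributed with NLTK. It assumes that NLTK is installed and that the brown data package has already been downloaded (for example with nltk.download("brown")). The helper names first_tagged_words and format_tagged are hypothetical and exist only for this example.

```python
def first_tagged_words(n=5):
    """Return the first n (word, tag) pairs from the Brown corpus.

    Returns None when NLTK or the Brown data package is not available,
    so the sketch degrades gracefully instead of crashing.
    """
    try:
        from nltk.corpus import brown  # lazy loader; data is read on first use
        return list(brown.tagged_words()[:n])
    except (ImportError, LookupError):
        # ImportError: NLTK is not installed.
        # LookupError: NLTK is installed but the corpus data is missing.
        return None


def format_tagged(pairs):
    """Render (word, tag) pairs in the conventional word/TAG notation."""
    return " ".join(f"{word}/{tag}" for word, tag in pairs)


if __name__ == "__main__":
    pairs = first_tagged_words(5)
    if pairs is None:
        print("NLTK or the Brown corpus data is not available.")
    else:
        print(format_tagged(pairs))
```

The same pattern should carry over to other annotated corpora that expose a tagged_words() method, although the tagset, and therefore the tags returned, differs from corpus to corpus.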