Fran Ramovš Institute of Slovenian Language ZRC SAZU
Corpus Laboratory

Searching the POS-tagged core corpus

Examples:
 se je zdelo - the multiword unit "se je zdelo".
 *svet* - words containing "svet".
 :a a a - three adverbs in a row.
 :pže* sže* - all occurrences of a feminine singular adjective followed by a corresponding noun.
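
The four example queries can be imitated offline with a small Python sketch. It is not the search engine behind this page. It assumes that `*` behaves like a shell-style wildcard, that a leading `:` switches matching from word forms to POS tags, and that space-separated patterns must match consecutive tokens. The function name `find_sequences`, the toy sentence, and the tag strings other than their leading letters are invented for illustration.

```python
from fnmatch import fnmatchcase


def find_sequences(query, tokens):
    """Return every run of consecutive tokens that matches the query.

    tokens is a list of (word, tag) pairs. A query starting with ":" is
    matched against tags, any other query against word forms (an assumed
    reading of the examples above, not a documented rule).
    """
    by_tag = query.startswith(":")
    patterns = (query[1:] if by_tag else query).split()
    if not patterns:
        return []
    field = 1 if by_tag else 0  # 0 = word form, 1 = POS tag
    hits = []
    for start in range(len(tokens) - len(patterns) + 1):
        window = tokens[start:start + len(patterns)]
        # Every pattern must match the token at the same position.
        if all(fnmatchcase(tok[field], pat) for tok, pat in zip(window, patterns)):
            hits.append(" ".join(tok[0] for tok in window))
    return hits


# Toy data: the tags are made up and only mimic the shape of the examples.
tokens = [
    ("lepa", "pžei"), ("hiša", "sžei"),
    ("se", "z"), ("je", "g"), ("zdelo", "g"),
    ("zelo", "a"), ("zelo", "a"), ("daleč", "a"),
    ("posvet", "smei"),
]

for query in ["se je zdelo", "*svet*", ":a a a", ":pže* sže*"]:
    print(query, "->", find_sequences(query, tokens))
```

The sketch only compares tag prefixes, so it does not check that the noun actually agrees with the adjective beyond what the two patterns require.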



The POS-tagged core corpus (1 million words) contains the following Slovenian texts and translations into Slovenian:
  1. Collected works of Ciril Kosmač - 408,000 words
  2. Tomo Križnar: O iskanju ljubezni / In Search of Love - 132,000 words
  3. George Orwell: 1984 - 91,000 words
  4. Plato: Republic - 93,000 words
  5. The Bible, New Testament - 150,000 words
  6. Gustave Flaubert: Bouvard and Pécuchet - 86,000 words
  7. The newspaper DELO on the Internet (sample from 1997) - 52,000 words
A search expression may be up to 74 characters long, and the number of hits shown is limited to 200.
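
For anyone scripting queries, the two limits can be checked before a request is sent. The following is a minimal sketch under stated assumptions: the names `MAX_QUERY_CHARS`, `MAX_HITS_SHOWN`, `check_query` and `hits_to_show` are hypothetical, and how the server itself reacts to an over-long expression (rejecting or truncating it) is not stated on this page.

```python
MAX_QUERY_CHARS = 74   # longest search expression accepted, per this page
MAX_HITS_SHOWN = 200   # most hits displayed, per this page


def check_query(expression):
    """Return the expression unchanged, or raise if it exceeds the limit.

    Rejecting is an assumption made for this sketch; the real service
    might truncate instead. len() counts characters, not bytes.
    """
    if len(expression) > MAX_QUERY_CHARS:
        raise ValueError(
            f"search expression has {len(expression)} characters, "
            f"limit is {MAX_QUERY_CHARS}"
        )
    return expression


def hits_to_show(hits):
    """Keep only as many hits as the page would display."""
    return hits[:MAX_HITS_SHOWN]


print(check_query(":pže* sže*"))
print(len(hits_to_show(list(range(500)))))
```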



Page posted on 24 September 2008; date of last change: 14 October 2008.

URL: http://bos.zrc-sazu.si/gradivo_en.html