Text categorisation by content and genre (GMD)

Application domain: Text categorisation by content and genre
Source: LIMAS corpus of contemporary German
Dataset size: 500 documents with 2,000 words each, up to 75,000 facts per run
Data format: RIBL (sets of ground facts in Prolog notation)
Systems used: RIBL, LVQ (OLVQ1), IBL, IBL-IG
References: [1]
Pointers: mathias.kirsten@gmd.de


  1. Wolters, Maria and Kirsten, Mathias (1999): Exploring the Use of Linguistic Features in Domain and Genre Classification. In: Proceedings of the Meeting of the European Chapter of the Association for Computational Linguistics, Bergen, Norway. Available online at http://www.ikp.uni-bonn.de/~mwo/publik.html

