Generated on 1/13/2010 at 23:46:8 with version v0.3.2 (5 March 2009 11:12)

*Statistics for entities and category structure:
Known entities: 369417
Categories:
Categories with 100 entries and more (all words):
Categories with 100 entries and more (first 2 words):
Categories with 100 entries and more (first word):
Clustering results:

*Structure info of original file: /DATA_NLG/eswiki-20080712-pages-articles.xml
Stats
Total count: 37863340
Entries: 369407
Redirects: 272560
Homonyms: 37824

Score for the named entity classification task:
Label pers (42) Precision=0.739130434782609 Recall=0.80952380952381 FS=0.772727272727273 (34:46)
Label org (10) Precision=0.5 Recall=0.7 FS=0.583333333333333 (7:14)
Label date (3) Precision=0 Recall=0 FS=0 (:)
Label place (28) Precision=0 Recall=0 FS=0 (:)
Label unk (31) Precision=0.888888888888889 Recall=0.516129032258065 FS=0.653061224489796 (16:18)
Label prod (13) Precision=0.615384615384615 Recall=0.615384615384615 FS=0.615384615384615 (8:13)
Page sets used for test = 131 (in test file 369407: taken 127)
FScore global=0.523880131866634 (local=0.43741774098917)

Stats for graphs and redirections
Number of graphs: 369407
Redirects in graph=1422121
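
The per-label scores can be re-derived from the counts printed on each line. The sketch below is a reading of the report, not the tool's actual code. It assumes that the number after each label is the count of test pages carrying that label, that a pair such as (34:46) means 34 correct out of 46 predicted, and that an empty pair (:) means zero for both. Under those assumptions the arithmetic suggests that "local" is the unweighted mean of the six FS values and "global" is their mean weighted by label count. The names COUNTS, prf and summarize are hypothetical.

```python
# Counts read off the report, as (gold, correct, predicted) per label.
# Assumption: "(34:46)" = 34 correct of 46 predicted; "(:)" = 0 and 0.
COUNTS = {
    "pers": (42, 34, 46),
    "org": (10, 7, 14),
    "date": (3, 0, 0),
    "place": (28, 0, 0),
    "unk": (31, 16, 18),
    "prod": (13, 8, 13),
}


def prf(gold, correct, predicted):
    """Return (precision, recall, F-score), using 0 where a ratio is undefined."""
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


def summarize(counts):
    """Return per-label scores, the unweighted mean F and the count-weighted mean F."""
    scores = {label: prf(*triple) for label, triple in counts.items()}
    total_gold = sum(gold for gold, _, _ in counts.values())
    macro_f = sum(f for _, _, f in scores.values()) / len(scores)
    weighted_f = sum(counts[label][0] * scores[label][2] for label in counts) / total_gold
    return scores, macro_f, weighted_f


scores, macro_f, weighted_f = summarize(COUNTS)
for label, (p, r, f) in scores.items():
    print(f"Label {label} ({COUNTS[label][0]}) Precision={p:.6f} Recall={r:.6f} FS={f:.6f}")
print(f"weighted mean FS={weighted_f:.6f} unweighted mean FS={macro_f:.6f}")
```

The label counts sum to 127, which agrees with the 127 pages the report says were taken for the test.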