Generated on 1/13/2010 at 23:46:8 with version v0.3.2 (5 March 2009 11:12)

*Statistics for entities and category structure:


Known entities: 369417

Categories : Categories with 100 or more entries (all words)

: Categories with 100 or more entries (first 2 words)

: Categories with 100 or more entries (first word)

: Clustering results



*Structure info of original file: /DATA_NLG/eswiki-20080712-pages-articles.xml

Stats
Total count : 37863340
Entries : 369407
Redir : 272560
Homonyms : 37824


Score for the named entity classification task:

Label pers (42) Precision=0.739130434782609 Recall=0.80952380952381 FS=0.772727272727273 (34:46)
Label org (10) Precision=0.5 Recall=0.7 FS=0.583333333333333 (7:14)
Label date (3) Precision=0 Recall=0 FS=0 (:)
Label place (28) Precision=0 Recall=0 FS=0 (:)
Label unk (31) Precision=0.888888888888889 Recall=0.516129032258065 FS=0.653061224489796 (16:18)
Label prod (13) Precision=0.615384615384615 Recall=0.615384615384615 FS=0.615384615384615 (8:13)
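
The per-label figures above appear to follow the usual definitions of precision, recall and F-score. The sketch below assumes that the number in parentheses after each label is the count of gold (reference) items and that the trailing pair is correct:predicted; this reading is inferred from the numbers and is not stated in the report. The function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(correct, predicted, gold):
    """Return (precision, recall, F-score) from raw counts.

    Assumed reading of a report line such as "Label pers (42) ... (34:46)":
    gold=42, correct=34, predicted=46.
    """
    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        # No correct predictions: the F-score is defined as 0 here.
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


p, r, f = precision_recall_f1(correct=34, predicted=46, gold=42)
print(f"pers: Precision={p:.6f} Recall={r:.6f} FS={f:.6f}")
# prints: pers: Precision=0.739130 Recall=0.809524 FS=0.772727
```

Under the same reading, the org, unk and prod lines are reproduced by the counts 7:14 of 10, 16:18 of 31 and 8:13 of 13.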

Page sets used for test = 131 (in test file 369407: 127 taken) FScore global=0.523880131866634 (local=0.43741774098917)
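
The two summary scores are numerically consistent with two ways of averaging the per-label F-scores: "local" matches the unweighted mean over the six labels, and "global" matches the mean weighted by the count shown in parentheses after each label (which sums to 127). This is an inference from the figures, not something the report states; the sketch below shows the arithmetic, with hypothetical function names.

```python
# (label, count in parentheses, F-score), copied from the per-label lines.
rows = [
    ("pers", 42, 0.772727272727273),
    ("org", 10, 0.583333333333333),
    ("date", 3, 0.0),
    ("place", 28, 0.0),
    ("unk", 31, 0.653061224489796),
    ("prod", 13, 0.615384615384615),
]


def macro_average(rows):
    """Unweighted mean of the per-label F-scores."""
    return sum(f for _, _, f in rows) / len(rows)


def weighted_average(rows):
    """Mean of the per-label F-scores, weighted by each label's count."""
    total = sum(n for _, n, _ in rows)
    return sum(n * f for _, n, f in rows) / total


print("local (macro):", macro_average(rows))
print("global (weighted):", weighted_average(rows))
```

The labels date and place contribute an F-score of 0 to both averages, which is why the weighted score stays well below the pers score even though pers has the largest count.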


Stats for graphs and redirections
Number of graphs: 369407
Redirects in graphs = 1422121