Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 1, 2012

Is Wikipedia Going To Explode?

Filed under: Combinatorics,Wikipedia — Patrick Durusau @ 9:10 pm

I ran across a problem in Wikipedia that may mean it is about to explode. You decide.

You have heard about the danger of “combinatorial explosions” if we have more than one identifier: every identifier has to be mapped to every other identifier, so n identifiers need n(n-1)/2 pairwise mappings.

Imagine that a – j represent different identifiers for the same subject.

This graphic represents a “small” combinatorial explosion.

[Image: combinatorial explosion]

If that looks hard to read, here is a larger version:

[Image: Large Explosion]

Is that better? 😉

Here is where I noticed the problem: the Wikipedia XML file has synonyms for the entries.

The article on anarchism has one hundred and one other names:

  1. af:Anargisme
  2. als:Anarchismus
  3. ar:لاسلطوية
  4. an:Anarquismo
  5. ast:Anarquismu
  6. az:Anarxizm
  7. bn:নৈরাজ্যবাদ
  8. zh-min-nan:Hui-thóng-tī-chú-gī
  9. be:Анархізм
  10. be-x-old:Анархізм
  11. bo:གཞུང་མེད་ལམ་སྲོལ།
  12. bs:Anarhizam
  13. br:Anveliouriezh
  14. bg:Анархизъм
  15. ca:Anarquisme
  16. cs:Anarchismus
  17. cy:Anarchiaeth
  18. da:Anarkisme
  19. pdc:Anarchism
  20. de:Anarchismus
  21. et:Anarhism
  22. el:Αναρχισμός
  23. es:Anarquismo
  24. eo:Anarkiismo
  25. eu:Anarkismo
  26. fa:آنارشیسم
  27. hif:Khalbali
  28. fo:Anarkisma
  29. fr:Anarchisme
  30. fy:Anargisme
  31. ga:Ainrialachas
  32. gd:Ain-Riaghailteachd
  33. gl:Anarquismo
  34. ko:아나키즘
  35. hi:अराजकता
  36. hr:Anarhizam
  37. id:Anarkisme
  38. ia:Anarchismo
  39. is:Stjórnleysisstefna
  40. it:Anarchismo
  41. he:אנרכיזם
  42. jv:Anarkisme
  43. kn:ಅರಾಜಕತಾವಾದ
  44. ka:ანარქიზმი
  45. kk:Анархизм
  46. sw:Utawala huria
  47. lad:Anarkizmo
  48. krc:Анархизм
  49. la:Anarchismus
  50. lv:Anarhisms
  51. lb:Anarchismus
  52. lt:Anarchizmas
  53. jbo:nonje’asi’o
  54. hu:Anarchizmus
  55. mk:Анархизам
  56. ml:അരാജകത്വവാദം
  57. mr:अराजकता
  58. arz:اناركيه
  59. ms:Anarkisme
  60. mwl:Anarquismo
  61. mn:Анархизм
  62. nl:Anarchisme
  63. ja:アナキズム
  64. no:Anarkisme
  65. nn:Anarkisme
  66. oc:Anarquisme
  67. pnb:انارکی
  68. ps:انارشيزم
  69. pl:Anarchizm
  70. pt:Anarquismo
  71. ro:Anarhism
  72. rue:Анархізм
  73. ru:Анархизм
  74. sah:Анархизм
  75. sco:Anarchism
  76. simple:Anarchism
  77. sk:Anarchizmus
  78. sl:Anarhizem
  79. ckb:ئانارکیزم
  80. sr:Анархизам
  81. sh:Anarhizam
  82. fi:Anarkismi
  83. sv:Anarkism
  84. tl:Anarkismo
  85. ta:அரசின்மை
  86. th:อนาธิปไตย
  87. tg:Анархизм
  88. tr:Anarşizm
  89. uk:Анархізм
  90. ur:فوضیت
  91. ug:ئانارخىزم
  92. za:Fouzcwngfujcujyi
  93. vec:Anarchismo
  94. vi:Chủ nghĩa vô chính phủ
  95. fiu-vro:Anarkism
  96. war:Anarkismo
  97. yi:אנארכיזם
  98. zh-yue:無政府主義
  99. diq:Anarşizm
  100. bat-smg:Anarkėzmos
  101. zh:无政府主义

Now you can imagine the “combinatorial explosion” that awaits the entry on anarchism in Wikipedia: one hundred and two names (102, including English), compared to my ten identifiers.
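To put rough numbers on that comparison, here is a minimal Python sketch that counts the mappings needed if every identifier really were mapped to every other one. The function name `pairwise_mappings` is mine, and the sketch assumes mappings are undirected (a-to-b and b-to-a count once).

```python
from itertools import combinations
from math import comb


def pairwise_mappings(n: int) -> int:
    """Number of undirected pairs among n identifiers: n * (n - 1) / 2."""
    return comb(n, 2)


# The ten identifiers a - j from the graphic, enumerated pair by pair.
identifiers = "abcdefghij"
pairs = list(combinations(identifiers, 2))

print(len(pairs))              # 45
print(pairwise_mappings(10))   # 45
print(pairwise_mappings(102))  # 5151
```

Ten identifiers give 45 mappings; 102 names would give 5,151 for a single article.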

Except that Wikipedia leaves the relationships between all these identifiers for anarchism unspecified.

You can call them into existence, one to the other, as needed, but then you assume the burden of processing them. All the identifiers remain available to other users for their purposes as well.
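One way to picture “calling them into existence as needed” is the sketch below. It is only an illustration, not how Wikipedia stores anything: the names are kept as a flat set, and a mapping is built only when someone asks for it. The function names are hypothetical, and the `en:Anarchism` entry is my assumption for the English name.

```python
# A few of the names listed above. "en:Anarchism" is an assumed label
# for the English entry; the set itself stores no relationships.
anarchism = {"en:Anarchism", "de:Anarchismus", "fr:Anarchisme", "es:Anarquismo"}


def same_subject(a, b, names):
    """True if both identifiers appear in the same set of names."""
    return a in names and b in names


def map_on_demand(source, names):
    """Build only the mappings that start from one identifier, when asked."""
    if source not in names:
        raise KeyError(source)
    return [(source, other) for other in sorted(names) if other != source]


print(same_subject("de:Anarchismus", "fr:Anarchisme", anarchism))  # True
print(len(map_on_demand("de:Anarchismus", anarchism)))             # 3
```

Storage grows with the number of names, not with the number of pairs, and whoever asks for a mapping pays for computing it.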

Hmmm, with the language prefixes mapping to scopes, this looks like a good source for names and variant names for topics in a topic map.
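As a rough sketch of that idea, the prefix can be split off each entry and treated as the scope of the name. This uses a plain Python dict rather than any topic map API, and `scoped_name` is a hypothetical helper name.

```python
def scoped_name(entry):
    """Split 'prefix:Name' into (scope, name) at the first colon."""
    scope, sep, name = entry.partition(":")
    if not sep or not scope or not name:
        raise ValueError(f"expected 'prefix:Name', got {entry!r}")
    return scope, name


# Three entries copied from the list above.
entries = [
    "sw:Utawala huria",
    "zh-min-nan:Hui-thóng-tī-chú-gī",
    "vi:Chủ nghĩa vô chính phủ",
]

# Map each scope (language prefix) to the name used in that scope.
names_by_scope = dict(scoped_name(e) for e in entries)

print(names_by_scope["sw"])  # Utawala huria
```

Splitting at the first colon only keeps hyphenated prefixes such as `zh-min-nan` intact and leaves spaces in the name alone.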

What do you think?


According to my software, this is post #4,000. I am looking for ways to better deliver information about topic maps and their construction. Suggestions (not to mention support) welcome!
