Two impromptu Esperanto statistics

✍️ Written on 2023-07-09 in 1154 words. Part of life languages Esperanto

Motivation

I came across Rajki’s list of etymological data and immediately wanted to know which source languages most of Esperanto’s vocabulary can be attributed to. I created an illustration of this data in May.

Furthermore, issue 1378 (4) of April 2023 of the UEA magazine “Esperanto” lists the number of members per country. The absolute numbers were not meaningful to me, so I looked up the ratio of members to each country’s population.

Illustration 1: Etymology

It took me 20 minutes to find the Etimologia Vortaro de Esperanto by András Rajki (2006). It took me 20 more minutes to turn the PDF into a text file and evaluate it with Python, and another 20 minutes to visualize the data with LibreOffice.

The dictionary features only about 4800 entries. The linked version does not contain his remarks, which I am paraphrasing here:

  • András Rajki focused on Zamenhof’s native languages: pol, rus, yid.

  • Zamenhof was very interested in Lithuanian and German and learned the other languages mentioned in the diagram, except Dutch, which András added deliberately.

  • Greek and Spanish are deliberately and heavily underrepresented.

  • Non-Indo-European languages like Aramaic and Hebrew would occur but are left out.

I also drew some conclusions of my own:

  • There are better databases, but András’s was easily accessible to me.

  • I normalized language identifiers to ISO 639-3. For Greek, András only looked at Ancient Greek references (grc, not ell).

  • A rigorous approach would put Spanish into the top 3. I would even expect it in first place.

  • To quote Pierre Janton: “More than 75% of the lexemes are taken from Latin languages, in particular from Latin and French, and 20% from the Anglo-germanic languages, the remaining include loan-words from Greek, above all scientific ones, Slavic languages and, for a very small proportion from Hebrew (amen), Arabic (alkazabo) and Japanese (anzuo), etc.”
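
For readers who want to reproduce the normalization and counting step, here is a minimal sketch in Python. It is not the original script: the raw tags on the left side of `TO_ISO`, the helper names `normalize` and `count_sources`, and the sample entries are assumptions made for illustration, since the exact notation of the PDF is not reproduced in this post. Only the ISO 639-3 codes on the right side are real.

```python
from collections import Counter

# Hypothetical raw tags -> ISO 639-3 codes. The left-hand spellings are
# assumptions for illustration; the right-hand codes are real ISO 639-3.
TO_ISO = {
    "fre": "fra",
    "ger": "deu",
    "gre": "grc",  # Ancient Greek, not Modern Greek (ell)
    "dut": "nld",
    "ita": "ita",
    "eng": "eng",
    "lat": "lat",
    "pol": "pol",
    "rus": "rus",
    "yid": "yid",
    "lit": "lit",
}

def normalize(tag: str) -> str:
    """Map a raw tag to ISO 639-3; unknown tags are returned lowercased."""
    key = tag.strip().lower().rstrip(".")
    return TO_ISO.get(key, key)

def count_sources(entries):
    """Count how many entries cite each source language.

    `entries` is an iterable of (word, [raw tags]) pairs.
    """
    counts = Counter()
    for _word, tags in entries:
        # A set ensures a language cited twice for one word counts once.
        counts.update({normalize(t) for t in tags})
    return counts

# Made-up sample entries, only to show the shape of the data.
sample = [
    ("akvo", ["Lat", "Ita"]),
    ("hundo", ["Ger."]),
    ("birdo", ["Eng"]),
    ("domo", ["Lat", "Rus", "Pol"]),
]

counts = count_sources(sample)
for lang, n in counts.most_common():
    print(lang, n)
```

Counting each language at most once per entry is a design choice of this sketch; counting every mention would weight entries with many cognates more heavily.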

The final illustration looks as follows:

Illustration of the etymology of Esperanto vocabulary

Illustration 2: UEA members

A nice Esperanto friend gave me issue 1378 (4) of “Esperanto”, the magazine of the UEA (Universala Esperanto-Asocio), which is the largest Esperanto association to my knowledge. The final page shows the official membership numbers for 2022. Looking at the absolute values of 1411 for China, 869 for Germany, and 975 for Japan, I wondered how these numbers relate to each country’s population.

I collected the membership numbers of all countries with more than 50 members. Picking the threshold was not obvious, and I had to adjust it once: I wanted to get rid of countries with a low ratio, but the ratio can also be very high for tiny countries with only a handful of members. 50 is the threshold I settled on. The population figures are taken from the English Wikipedia. Often the Wikipedia page of a country does not provide census data from 2022; in those cases I used the closest available value.
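
The computation itself is small. The following is a minimal sketch, assuming a strict "more than 50 members" filter and a share expressed in percent; the names `THRESHOLD`, `rows`, and `share_percent` are made up for this example, and the rows are a handful of values copied from the listing in this post.

```python
THRESHOLD = 50  # minimum member count, as chosen in the text

# (country, UEA members, population) - a few rows from this post
rows = [
    ("Ĉinio", 1411, 1_411_750_000),
    ("Germanio", 869, 84_270_625),
    ("Japanio", 975, 124_840_000),
    ("Luksemburgo", 106, 660_809),
    ("Albanio", 30, 2_793_592),
]

def share_percent(members: int, population: int) -> float:
    """Members as a percentage of the population."""
    return 100.0 * members / population

# Keep only countries above the threshold, then sort by share.
kept = [(name, share_percent(m, p)) for name, m, p in rows if m > THRESHOLD]
kept.sort(key=lambda item: item[1])

for name, pct in kept:
    print(f"{name}\t{pct:.5f} %")
```

With these five rows, Albanio (30 members) is dropped by the filter, and Luksemburgo ends up last in the sorted output because it has the highest share.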

Illustration of UEA members per country

Country; UEA members (2022); population
Argentino; 109; 46044703
Albanio; 30; 2793592
Armenio; 3000756
Aŭstralio; 113; 26000000
Aŭstrio; 95; 9027999
Belgio; 305; 11697557
Benino; 73; 13754688
Bosnio-Hercegovino; 89; 3475000
Brazilo; 290; 203062512
Britio; 342; 60800000
Bulgario; 152; 6447710
Burundo; 395; 13162952
Ĉeĥio; 90; 10827529
Ĉilio; 52; 18549457
Ĉinio; 1411; 1411750000
Danio; 117; 5935619
Finnlando; 174; 5614571
Francio; 713; 68042591
Germanio; 869; 84270625
Hinda Unio; 81; 1428627663
Hispanio; 328; 47222613
Hungario; 161; 9678000
Indonezio; 83; 277749853
Irano; 130; 87590873
Israelo; 130; 9745760
Italio; 534; 58853482
Japanio; 975; 124840000
Kanado; 137; 39858480
Kolombio; 172; 49336454
Kongo (Dem. Resp.); 136; 111859928
Korea Resp.; 314; 51966948
Kroatio; 112; 3871833
Kubo; 451; 10985974
Latvio; 98; 1842226
Litovio; 119; 2862380
Luksemburgo; 106; 660809
Madagaskaro; 57; 28812195
Meksiko; 162; 129875529
Nederlando; 252; 17886100
Nepalo; 76; 30666598
Norvegio; 114; 5488984
Pakistano; 55; 249566743
Peruo; 55; 32440172
Pollando; 272; 38036118
Rusio; 198; 147182123
Serbia; 81; 6647003
Slovenio; 112; 2116972
Svedio; 246; 10481937
Svislando; 183; 8738791
Ukrainio; 101; 36744636
Usono; 607; 333287557
Vjetnamio; 139; 99460000

== Total membership

Albanio	30
Ĉilio	52
Peruo	55
Pakistano	55
Madagaskaro	57
Benino	73
Nepalo	76
Serbia	81
Hinda Unio	81
Indonezio	83
Bosnio-Hercegovino	89
Ĉeĥio	90
Aŭstrio	95
Latvio	98
Ukrainio	101
Luksemburgo	106
Argentino	109
Slovenio	112
Kroatio	112
Aŭstralio	113
Norvegio	114
Danio	117
Litovio	119
Israelo	130
Irano	130
Kongo (Dem. Resp.)	136
Kanado	137
Vjetnamio	139
Bulgario	152
Hungario	161
Meksiko	162
Kolombio	172
Finnlando	174
Svislando	183
Rusio	198
Svedio	246
Nederlando	252
Pollando	272
Brazilo	290
Belgio	305
Korea Resp.	314
Hispanio	328
Britio	342
Burundo	395
Kubo	451
Italio	534
Usono	607
Francio	713
Germanio	869
Japanio	975
Ĉinio	1411

== Proportion per country

Hinda Unio	0.00001 %
Pakistano	0.00002 %
Indonezio	0.00003 %
Ĉinio	0.00010 %
Kongo (Dem. Resp.)	0.00012 %
Meksiko	0.00012 %
Rusio	0.00013 %
Vjetnamio	0.00014 %
Brazilo	0.00014 %
Irano	0.00015 %
Peruo	0.00017 %
Usono	0.00018 %
Madagaskaro	0.00020 %
Argentino	0.00024 %
Nepalo	0.00025 %
Ukrainio	0.00027 %
Ĉilio	0.00028 %
Kanado	0.00034 %
Kolombio	0.00035 %
Aŭstralio	0.00043 %
Benino	0.00053 %
Britio	0.00056 %
Korea Resp.	0.00060 %
Hispanio	0.00069 %
Pollando	0.00072 %
Japanio	0.00078 %
Ĉeĥio	0.00083 %
Italio	0.00091 %
Germanio	0.00103 %
Francio	0.00105 %
Aŭstrio	0.00105 %
Albanio	0.00107 %
Serbia	0.00122 %
Israelo	0.00133 %
Nederlando	0.00141 %
Hungario	0.00166 %
Danio	0.00197 %
Norvegio	0.00208 %
Svislando	0.00209 %
Svedio	0.00235 %
Bulgario	0.00236 %
Bosnio-Hercegovino	0.00256 %
Belgio	0.00261 %
Kroatio	0.00289 %
Burundo	0.00300 %
Finnlando	0.00310 %
Kubo	0.00411 %
Litovio	0.00416 %
Slovenio	0.00529 %
Latvio	0.00532 %
Luksemburgo	0.01604 %
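
Both listings can be derived from the raw semicolon-separated rows. The sketch below shows one way to do it in Python; `RAW`, `parse`, and the choice to silently skip rows that lack a field (such as the Armenio row, which has no member count) are assumptions of this example, not a description of how the listings above were actually produced.

```python
# A few raw rows in the "country; members; population" layout of this post.
RAW = """\
Argentino; 109; 46044703
Armenio; 3000756
Kubo; 451; 10985974
Ĉinio; 1411; 1411750000
Latvio; 98; 1842226
"""

def parse(raw: str):
    """Yield (country, members, population); skip rows that lack a field."""
    for line in raw.splitlines():
        parts = [p.strip() for p in line.split(";")]
        if len(parts) != 3:
            continue  # incomplete row, e.g. no member count
        name, members, population = parts
        yield name, int(members), int(population)

records = list(parse(RAW))

# Ascending by absolute member count and by members per inhabitant.
by_members = sorted(records, key=lambda r: r[1])
by_share = sorted(records, key=lambda r: r[1] / r[2])

print("Members in total")
for name, members, _population in by_members:
    print(f"{name}\t{members}")

print("Share per country")
for name, members, population in by_share:
    print(f"{name}\t{100 * members / population:.5f} %")
```

Skipping incomplete rows keeps the sketch short; logging them instead would make gaps in the source data easier to spot.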

Conclusion

The illustrations show that French, Italian, and English contributed most to the vocabulary (assuming Spanish was excluded on purpose). Furthermore, Luxembourg, Latvia, Slovenia, Lithuania, and Cuba are the countries with the highest ratio of UEA members to population. China leads in absolute numbers, but its proportion is rather small; in relative terms, countries like Cuba or Burundi rank far higher.

Fun projects!