Word cloud for programming languages according to TIOBE Index

I wanted to generate a word cloud showing the current usage of programming languages for my talk at Grazer Linuxtage. I extracted the statistics from the TIOBE Index, wrote a script that prints each language's name repeatedly in proportion to its frequency of occurrence, and generated the word cloud from the resulting string.

Let’s go through it step by step:

data = {
    'Java': 16.041,
    'C': 15.745,
    'C++': 6.962,
    'Objective-C': 5.890,
    'C#': 4.947,
    'JavaScript': 3.297,
    'PHP': 3.009,
    # ... remaining languages, see the full dictionary below
}

# normalize names: remove spaces so that each name stays a single word
data = {name.replace(' ', ''): value for name, value in data.items()}

# distribute by frequency: repeat each name int(10 * percentage) times
out = []
for name, stats in data.items():
    out.extend([name] * int(10 * stats))

print(' '.join(out))

print('\n{} words printed.'.format(len(out)))

The data dictionary maps each programming language to its relative frequency in percent. The values are taken from the TIOBE Index. Be aware that TIOBE does not give individual ratings for the languages ranked below place 50 and states the following disclaimer instead:

The following list of languages denotes #51 to #100. Since the differences are relatively small, the programming languages are only listed (in alphabetical order).

I assumed a frequency of 0.1% for those languages. This gives us the following data dictionary:

data = { 'Java': 16.041, 'C': 15.745, 'C++': 6.962, 'Objective-C': 5.890, 'C#': 4.947, 'JavaScript': 3.297, 'PHP': 3.009, 'Python': 2.690, 'Visual Basic': 2.199, 'Visual Basic .Net': 2.126, 'Delphi/Object Pascal': 1.469, 'Perl': 1.340, 'Transact-SQL': 1.275, 'MATLAB': 1.263, 'ABAP': 1.228, 'F#': 1.196, 'PL/SQL': 1.110, 'Ruby': 1.068, 'R': 1.028, 'Pascal': 1.027, 'SAS': 1.001, 'PostScript': 0.954, 'ML': 0.890, 'Swift': 0.882, 'Scala': 0.873, 'Logo': 0.783, 'COBOL': 0.744, 'J': 0.706, 'Assembly': 0.656, 'Fortran': 0.600, 'Scratch': 0.587, 'OpenEdge ABL': 0.543, 'Lisp': 0.503, 'Ada': 0.454, 'ActionScript': 0.415, 'Max/MSP': 0.408, 'Lua': 0.403, 'D': 0.403, 'Prolog': 0.349, 'RPG (OS/400)': 0.330, 'Inform': 0.305, 'Go': 0.297, 'Groovy': 0.292, 'PL/I': 0.265, 'Scheme': 0.263, 'Q': 0.261, 'LabVIEW': 0.260, 'C shell': 0.245, 'VBScript': 0.242, 'Erlang': 0.239, '(Visual) FoxPro' : 0.1, '4th Dimension/4D' : 0.1, 'Alice' : 0.1, 'Apex' : 0.1, 'Arc' : 0.1, 'Automator' : 0.1, 'Awk, Bash' : 0.1, 'bc' : 0.1, 'Bourne shell' : 0.1, 'cg' : 0.1, 'CL (OS/400)' : 0.1, 'Clean' : 0.1, 'Clojure' : 0.1, 'cT' : 0.1, 'Dart' : 0.1, 'DiBOL' : 0.1, 'Factor' : 0.1, 'Forth' : 0.1, 'Hack' : 0.1, 'Haskell' : 0.1, 'Icon' : 0.1, 'IDL' : 0.1, 'Io' : 0.1, 'Ioke' : 0.1, 'J#' : 0.1, 'JScript' : 0.1, 'Korn shell': 0.1, 'Ladder Logic': 0.1, 'M4': 0.1, 'Magic': 0.1, 'Mathematica': 0.1, 'Moto': 0.1, 'NATURAL': 0.1, 'NXT-G': 0.1, 'OpenCL': 0.1, 'Oz': 0.1, 'PILOT': 0.1, 'PowerShell': 0.1, 'Programming Without Coding Technology': 0.1, 'Pure Data': 0.1, 'S': 0.1, 'SPARK': 0.1, 'SPSS': 0.1, 'Standard ML': 0.1, 'Tcl': 0.1, 'TOM': 0.1, 'VHDL': 0.1, 'X10': 0.1, 'Z shell': 0.1 }
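The flat 0.1% entries do not have to be typed by hand. The following is only a minimal sketch of one way to build them. The names `ranked`, `unranked` and `ASSUMED_SHARE` are hypothetical, and both collections are short excerpts of the real data:

```python
# Excerpt of the languages that have an explicit TIOBE rating (in percent).
ranked = {'Java': 16.041, 'C': 15.745, 'C++': 6.962}

# Excerpt of the languages that TIOBE only lists alphabetically (#51 to #100).
unranked = ['Clojure', 'Dart', 'Haskell', 'PowerShell', 'Tcl']

# Assumption from the text: every unrated language gets a flat 0.1%.
ASSUMED_SHARE = 0.1

# Copy the rated languages, then add every unrated one with the assumed share.
data = dict(ranked)
data.update(dict.fromkeys(unranked, ASSUMED_SHARE))

print(len(data))        # 8
print(data['Haskell'])  # 0.1
```

Keeping the assumed value in a single constant makes it easy to try a different share for the unrated languages later.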

We normalize the dictionary keys (here we just remove spaces from the names, so that each name remains a single token once the names are joined with spaces). Then each name is appended to the out list as many times as its statistics dictate. Because we multiply by 10 and truncate to an integer, a programming language with a frequency of 6.962% will occur 69 times, and one with a frequency of 3.009% will occur 30 times. Finally, these occurrences are joined with spaces, giving us:

C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C C PureData LabVIEW LabVIEW Prolog Prolog Prolog ProgrammingWithoutCodingTechnology JScript Assembly Assembly Assembly Assembly Assembly Assembly Dart Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal Delphi/ObjectPascal F# F# F# F# F# F# F# F# F# F# F# Inform Inform Inform Logo Logo Logo Logo Logo Logo Logo Ioke Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java Java RPG(OS/400) RPG(OS/400) RPG(OS/400) ML ML ML ML ML ML ML ML Bourneshell Cshell Cshell SPARK NXT-G LadderLogic PILOT Icon Scheme Scheme Mathematica VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net 
VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net VisualBasic.Net cg Haskell Ada Ada Ada Ada Oz JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript JavaScript R R R R R R R R R R StandardML Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL Transact-SQL C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ C++ Erlang Erlang Swift Swift Swift Swift Swift Swift Swift Swift Clean Max/MSP Max/MSP Max/MSP Max/MSP IDL NATURAL Zshell cT Pascal Pascal Pascal Pascal Pascal Pascal Pascal Pascal Pascal Pascal Magic Forth Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C Objective-C OpenEdgeABL OpenEdgeABL OpenEdgeABL OpenEdgeABL OpenEdgeABL PL/I PL/I D D D D TOM VBScript VBScript Q Q Ruby Ruby Ruby Ruby 
Ruby Ruby Ruby Ruby Ruby Ruby Fortran Fortran Fortran Fortran Fortran Fortran Arc SAS SAS SAS SAS SAS SAS SAS SAS SAS SAS (Visual)FoxPro M4 PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP PHP C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# C# Lua Lua Lua Lua Tcl Kornshell Alice MATLAB MATLAB MATLAB MATLAB MATLAB MATLAB MATLAB MATLAB MATLAB MATLAB MATLAB MATLAB Io Awk,Bash PostScript PostScript PostScript PostScript PostScript PostScript PostScript PostScript PostScript X10 SPSS S VHDL ABAP ABAP ABAP ABAP ABAP ABAP ABAP ABAP ABAP ABAP ABAP ABAP Clojure VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic VisualBasic PowerShell Automator J J J J J J J COBOL COBOL COBOL COBOL COBOL COBOL COBOL Moto Hack DiBOL 4thDimension/4D Scala Scala Scala Scala Scala Scala Scala Scala OpenCL Factor J# Go Go Perl Perl Perl Perl Perl Perl Perl Perl Perl Perl Perl Perl Perl Apex ActionScript ActionScript ActionScript ActionScript CL(OS/400) bc Lisp Lisp Lisp Lisp Lisp Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Python Groovy Groovy Scratch Scratch Scratch Scratch Scratch PL/SQL PL/SQL PL/SQL PL/SQL PL/SQL PL/SQL PL/SQL PL/SQL PL/SQL PL/SQL PL/SQL

925 words printed.
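To double-check the repetition counts, the truncation step can be isolated and the generated string counted again. This is just a sketch on a three-language excerpt of the data. The helper names `occurrences` and `build_words` are hypothetical and not part of the script above:

```python
from collections import Counter


def occurrences(share_percent, factor=10):
    """How often a name is repeated; int() truncates toward zero."""
    return int(factor * share_percent)


def build_words(data, factor=10):
    """Repeat every name according to its share and return the flat list."""
    words = []
    for name, share in data.items():
        words.extend([name] * occurrences(share, factor))
    return words


# Small excerpt of the data dictionary (names already contain no spaces).
sample = {'C++': 6.962, 'PHP': 3.009, 'Haskell': 0.1}
words = build_words(sample)

# Count the words of the joined string, as a word cloud tool would see them.
counts = Counter(' '.join(words).split())

print(counts['C++'], counts['PHP'], counts['Haskell'])  # 69 30 1
print(len(words))  # 100
```

Note that with a factor of 10, any share below 0.1% is truncated to zero occurrences, so such a language would disappear from the cloud entirely.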

I plugged this data into an awesome word cloud generator and got the following result (scale = n, # of words = 925, 4 orientations, Archimedean):

[Figure: TIOBE Index programming languages by frequency in word cloud, as of 24th of April 2015]

Thanks go to Jason Davies (word cloud generator) and TIOBE.
