Introduction
Some time ago when discussing with my brother, we concluded that
switch statements can be better optimized at machine
level than if/else. The discussion started because
python does not support switch constructs.
Some time later I was questioning myself if this was really true.
I mean the most obvious way to implement switch was
a list of conditional jumps in assembly code. In this case there
would be no performance advantage at all. But there is not a
hashtable created by the compiler, right?!
switch in C
Okay, let's take a simple program with a switch.
C, C++, Java, PHP and ActionScript use the same syntax. It's like
"switch
("<condition>")"
"{"
("case"
<constant>":"
(<expression>;)+
(break;)?
)+
"}".
An optional default block can be added, which is executed
whenever no other case matches. break terminates the
current switch; missing to use it will lead to a program executing all
kinds of case blocks.
Let's start with this C code. It uses some input character as condition and has 3 case branches. Just for the record: The input character is only used here to urge the compiler to not optimize the switch away (which is the case whenever a constant is the condition).
#include <stdio.h> int main() { char a = getchar(); switch (a) { case 'A': a = 0; break; case 'B': a = 1; break; case 'C': a = 2; break; case 'D': a = 3; break; case 'E': a = 4; break; default: a = -1; break; } return a; }
My assumption
My assumption is that this source code will get translated into the following machine pseudo code:
00 0x41 (ASCII A) 01 0x42 (ASCII B) 02 0x43 (ASCII C) 03 0x44 (ASCII D) 04 0x45 (ASCII E) 10 IF ([a]-00 == 0) GOTO 15 11 IF ([a]-01 == 0) GOTO 17 12 IF ([a]-02 == 0) GOTO 19 13 IF ([a]-03 == 0) GOTO 1B 14 IF ([a]-04 == 0) GOTO 1D 13 [a] = -1 (default case) 14 GOTO 1E 15 [a] = 0 (case A) 16 GOTO 1E 17 [a] = 1 (case B) 18 GOTO 1E 19 [a] = 2 (case C) 1A GOTO 1E 1B [a] = 3 (case D) 1C GOTO 1E 1D [a] = 4 (case E) 1E EAX = [a] (return a)
In this case the machine has to test each condition separately, which
simply takes Ο(n). This would be equal to a program using
if else repetitions.
if (a == 'A') a = 0; else if (a == 'B') a = 1; else if (a == 'C') a = 2; else if (a == 'D') a = 3; else if (a == 'E') a = 4; else a = -1;
elif in python
python does not have a switch statement. Guido asked the
community and they did not like it
[pyt01]. Like me. Default case,
break and lots of control flow. It's just as low-level like
for (i=0; i<42; i++) and not easy to process in our
brain. Furthermore in some languages there is not only one comparison
operator (weakly vs strongly typed) and the way of comparsion has to
define another convention. Well, the last point is actually not a
problem in python…
So we have to use if/else again. The equivalent python
program looks like this:
#!/usr/bin/env python import sys def func(a): if a == 'A': a = 0 elif a == 'B': a = 1 elif a == 'C': a = 2 elif a == 'D': a = 3 elif a == 'E': a = 4 else: a = -1 return a if __name__ == '__main__': sys.exit(func(raw_input()[0]))
Of course we are interested into the assembler generated by python. So let's have a look using the dis module [dis01].
>>> dis.dis(ifelse.func)
6 0 LOAD_FAST 0 (a)
3 LOAD_CONST 1 ('A')
6 COMPARE_OP 2 (==)
9 POP_JUMP_IF_FALSE 21
7 12 LOAD_CONST 2 (0)
15 STORE_FAST 0 (a)
18 JUMP_FORWARD 90 (to 111)
8 >> 21 LOAD_FAST 0 (a)
24 LOAD_CONST 3 ('B')
27 COMPARE_OP 2 (==)
30 POP_JUMP_IF_FALSE 42
9 33 LOAD_CONST 4 (1)
36 STORE_FAST 0 (a)
39 JUMP_FORWARD 69 (to 111)
10 >> 42 LOAD_FAST 0 (a)
45 LOAD_CONST 5 ('C')
48 COMPARE_OP 2 (==)
51 POP_JUMP_IF_FALSE 63
11 54 LOAD_CONST 6 (2)
57 STORE_FAST 0 (a)
60 JUMP_FORWARD 48 (to 111)
12 >> 63 LOAD_FAST 0 (a)
66 LOAD_CONST 7 ('D')
69 COMPARE_OP 2 (==)
72 POP_JUMP_IF_FALSE 84
13 75 LOAD_CONST 8 (3)
78 STORE_FAST 0 (a)
81 JUMP_FORWARD 27 (to 111)
14 >> 84 LOAD_FAST 0 (a)
87 LOAD_CONST 9 ('E')
90 COMPARE_OP 2 (==)
93 POP_JUMP_IF_FALSE 105
15 96 LOAD_CONST 10 (4)
99 STORE_FAST 0 (a)
102 JUMP_FORWARD 6 (to 111)
17 >> 105 LOAD_CONST 11 (-1)
108 STORE_FAST 0 (a)
18 >> 111 LOAD_FAST 0 (a)
114 RETURN_VALUE
Comparisons for each case. Pretty much the same like
if/else in C. Now we have seen enough source code.
What about the reality?
What it looks like when we choose the red pill.
Hashtables
Hashtables are important data structures. They are widely used among different software implementations. This includes programming languages like PHP [php01] and python [pyh01].
The primary reason we are mentioning hashtables here is the lookup
runtime. We were talking about Ο(n) before in context
of if/else. Looking up the last case statement (= worst
case) takes as long as iterating over all case conditions. Hashtables
using hashing functions allow us runtimes with Ο(1).
C to amd64 assembly
gcc -Wall -S switch.c
I will reduce the source code to important parts and add comments.
main: .LFB0: pushq %rbp movq %rsp, %rbp call getchar # char a = getchar(); movb %al, -1(%rbp) movsbl -1(%rbp), %eax subl $65, %eax # [1] a = a - 65 cmpl $4, %eax ja .L2 # [2] if (4 > a) GOTO L2 mov %eax, %eax movq .L8(,%rax,8), %rax # [3] load branch address to %rax jmp *%rax # jump to case branch .section .rodata # array of pointers to cases .L8: .quad .L3 .quad .L4 .quad .L5 .quad .L6 .quad .L7 .text .L3: # case 'A': movb $0, -1(%rbp) jmp .L9 .L4: # case 'B': movb $1, -1(%rbp) jmp .L9 .L5: # case 'C': movb $2, -1(%rbp) jmp .L9 .L6: # case 'D': movb $3, -1(%rbp) jmp .L9 .L7: # case 'E': movb $4, -1(%rbp) jmp .L9 .L2: # default: movb $-1, -1(%rbp) nop .L9: movsbl -1(%rbp), %eax # return a; leave ret
This assembler code comes with 3 neat tricks. The first one is to subtract 65 from variable a. 65 is so special only because it's the first number to be a condition to be tested (ASCII A). If you subtract 65 from a, you have to jump to the default case if a is greater than 4. The compiler recognized this situation (all numbers between 65 and 69 point to some case) and this is trick number two for me.
The third trick is also referenced in the source code by brackets and describes the way to lookup the pointer to the case codeblock by a hashtable. We have to do some arithmetic and have to explain GAS (GNU assembly syntax):
.L8(,%rax,8)
means take the content of the address .L8 + (%rax * 8).
And what does that mean? Well, on L8 we have the array of pointers to
the cases. The margin between the pointers is 8 on a 64bit system. So
case 'A' is on .L8 + 1 * 8.
case 'B' is on .L8 + 2 * 8. And so on.
Solving this expression returns an pointer to the first command of the
case code block. So .L8 + (%rax * 8) is actually our
hashing function and there is no iteration over cases. Lookup takes
Ο(1).
Conclusion
So it is true: switch statements can be translated to
hashtables at assembler level and therefore are more efficient than
if/else constructs. If you use the else if
code in C at least GCC is not able to evaluate this "hashtable would
be possible" situation. As far as python is concerned, it does not
support switches and therefore cannot be translated to hashtables
easily.