Division by constant in assembly

Written on 2020-11-18 in 610 words ✍️.
Part of cs IT-security pqcrypto

Motivation

CRYSTALS-Kyber is one of the schemes in the third round of the NIST post-quantum standardization project. In particular, the (reference implementation as well as) optimized implementation uses a divison by KYBER_Q (= 3329) in the decoding step. Our claim is that division by a constant is often represented as multiplication (sometimes called “division by invariant integers using multiplication”). But on which platforms and optimization levels?

Code snippet

The following code snippet can be inserted on godbolt to reproduce the data:

#include <stdint.h>
int square(uint16_t num) { return num / 3329; }

godbolt results

Table 1. Analysis which instruction is used for representing division on godbolt with different setups
Platform -O3 Result

x86-64 gcc 10.2

no

imul, no div

x86-64 gcc 10.2

yes

imul, no div

x86-64 gcc 8.3

no

imul, no div

x86-64 gcc 8.3

yes

imul, no div

x86-64 clang 11.0.0

no

idiv

x86-64 clang 11.0.0

yes

imul

ARM gcc 9.2.1

no

umull, no div

ARM gcc 9.2.1

yes

umull, no div

ARM64 gcc 8.2

no

umull, no div

ARM64 gcc 8.2

yes

umull, no div

armv8-a clang 11.0.0

no

sdiv

armv8-a clang 11.0.0

yes

umull

RISC-V rv32gc clang 10.0.0

no

mulhu

RISC-V rv32gc clang 10.0.0

yes

mulhu

WebAssembly clang (trunk)

no

i32.div_s

WebAssembly clang (trunk)

yes

i32.div_u

Result

I think the overall conclusion is roughly: This is compiler-specific and less platform-specific. gcc always uses multiplication. LLVM prefers to use division unless it is an optimized build.

Credits

The idea and background on this topic was contributed by my co-author, Peter. Thank you!