classic-mceliece-rust 1.0

Written on 2022-04-01 in 1285 words ✍️.
Part of cs IT-security pqcrypto software-development programming-languages rustlang

Update 2024-10-15: I discovered the attribute “lattice-based” slipped into this original article. I guess it must be related to the date of publication (April’s fool!) and removed it.

Motivation

In my previous job, I advised students on bachelor theses. Topics, I offered, included the implementation of post-quantum cryptography schemes in rust. One student chose to implement Classic McEliece in rust. This week I finally made all KAT KEM testcases pass and our 1.0 release of classic-mceliece-rust, a pure-rust implementation of the Classic McEliece KEM.

classic-mceliece-rust

Classic McEliece

Classic McEliece is a third-NIST-round post-quantum key encapsulation mechanism to negotiate a secret key between two parties. It is a finalist and considered as conservative choice since the general design was found already by Robert McEliece in 1978. It has a rich history of security analysis. Since it is based on error-correcting codes, operations on large matrices is the main topic for the implementation.

API

The API consists of some constants that define the size of buffers as well as the three KEM steps (key generation, encryption, decryption) and an RNG implementation AesState. It follows the API design of other implementations such as rusty-saber.

use classic_mceliece_rust::AesState;
use classic_mceliece_rust::{crypto_kem_dec, crypto_kem_enc, crypto_kem_keypair};
use classic_mceliece_rust::{
    CRYPTO_BYTES, CRYPTO_CIPHERTEXTBYTES, CRYPTO_PUBLICKEYBYTES, CRYPTO_SECRETKEYBYTES,
};
use std::error;

To run the negotiation, you need to allocate all buffers and then use a proper seed (like /dev/urandom). We left out fetching a proper seed in the example code:

let mut pk = [0u8; CRYPTO_PUBLICKEYBYTES];
let mut sk = [0u8; CRYPTO_SECRETKEYBYTES];
let mut ct = [0u8; CRYPTO_CIPHERTEXTBYTES];
let mut ss_alice = [0u8; CRYPTO_BYTES];
let mut ss_bob = [0u8; CRYPTO_BYTES];
let mut rng = AesState::new();
//rng.randombytes_init([0u8; 48]);  // TODO use a proper seed (like bytes from /dev/urandom) here

Then, we run the KEM steps:

crypto_kem_keypair(&mut pk, &mut sk, &mut rng)?;
crypto_kem_enc(&mut ct, &mut ss_bob, &pk, &mut rng)?;
crypto_kem_dec(&mut ss_alice, &ct, &sk)?;
assert_eq!(ss_bob, ss_alice);

The assert shows our desired property: the shared secret of both parties is the same.

The API design includes the preference of annotating the array size as part of the argument (i.e. references to arrays, not references to slices). I hope in the future, this leads to optimizations such as a skip of boundary checks. I once more modelled the different variants through feature flags. Andreas Kage pointed me to an article explaining an alternative proc_macro approach. However this approach takes more effort again and I did have the time for it.

My notes on Classic McEliece

CME was by far the most annoying among {NTRU, Saber, CME} to implement. Why is that?

The huge public key size prevents unit tests to be embedded in the implementation file. I store the test vectors inside a separate testdata.txt which I built a tiny parser for.
In general, the C reference implementation uses pointer arithmetic a lot, uses type casts and does not document the array size. All these things require research to get an appropriate rust implementation running.
The NIST submission provides all implementations separately whereas the rust implementation is just one codebase. As a result, I had to analyze the differences between all versions and add guarded for-loops where I missed them for one specific variant. This is not a problem in general, but 10 variants is quite a lot.
controlbits.rs took a huge effort to be ported. Why? The C implementation calls cbrecursion with an array which is then casted to a different type in recursive subcalls. Tracking memory accesses based on pointer arithmetic took much time as well.
layer_ex in benes.rs actually takes a two-dimensional array. In C, the algorithm just assume contiguous memory arrays and thus access items with pointer arithmetic. In rust, the two dimensions need to be modelled properly. In short: the algorithm is inadequate.
pk_gen contains a for loop which is 30 times slower in the unoptimized executable. But already opt-level=1 makes is decent fast. It took me a while to recognize.

As a result, the implementation would have been much smoother, if array sizes are documented. My other requests might have resulted in a worse runtime in C and thus, I cannot reasonably complain here.

Characteristics

The README file lists some properties:

Classic McEliece is a code-based key encapsulation mechanism (KEM)
The implementation is based on the Classic McEliece reference implementation of NIST round 3
The implementation does not utilize any concurrency techniques (SIMD/threading/…, except maybe auto-vectorization on your CPU)
It depends on sha3 as SHA-3 implementation and aes as AES block cipher (used as RNG) implementation
It passes the 100 testcases of the C reference implementation
It implements all 10 variants of the Classic McEliece KEM
The implementation takes between 100 milliseconds (mceliece348864) and 500 milliseconds (mceliece8192128f) to run on a modern computer
The implementation is constant-time on software instruction level
The random number generator is based on AES256 in counter mode
First described in 1978, the cryptographic scheme has a rich history in security analysis. Its large public key size, however, often limits adoption.

Conclusion

I am glad to finally have this piece of software running stable and see some more progress in the intersection of rust and PQC. Thank you Bernhard for this enduring journey!