Papers

Due to my time at university, I read a bunch of papers. Since the beginning of my PhD studies, I systematically review and summarize them. Here are my notes.

Table of contents:

Papers and notes

A Masked Ring-LWE Implementation §

Title: “A Masked Ring-LWE Implementation” by Oscar Reparaz, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede [url] [dblp]
Published in 2015 at CHES 2015 and I read it in 2020/07
Abstract: Lattice-based cryptography has been proposed as a postquantum public-key cryptosystem. In this paper, we present a masked ring-LWE decryption implementation resistant to first-order side-channel attacks. Our solution has the peculiarity that the entire computation is performed in the masked domain. This is achieved thanks to a new, bespoke masked decoder implementation. The output of the ring-LWE decryption are Boolean shares suitable for derivation of a symmetric key. We have implemented a hardware architecture of the masked ring-LWE processor on a Virtex-II FPGA, and have performed side channel analysis to confirm the soundness of our approach. The area of the protected architecture is around 2000 LUTs, a 20 % increase with respect to the unprotected architecture. The protected implementation takes 7478 cycles to compute, which is only a factor ×2.6 larger than the unprotected implementation.

ambiguity

\[ \operatorname{th}(x) = \begin{cases} 0 & \text{if } x \in (0, q/4) \cup (3q/4, q) \\ 1 & \text{if } x \in (q/4, 3q/4) \end{cases} \]

The intervals apparently should denote inclusive-exclusive boundaries.

error

On page 686, c1 and c2 is swapped erroneously all the time. On page 685, c2 is defined such that it contains the message. Consequently, factor r2 must be applied to c1, not c2. However, page 686 repeatedly multiplies c2 to the different r values.

Masked Decoder table

Sorry, I don't have table support in Zotero.

First line is meant to be read as “If a' is in quadrant (1) and a'' is in quadrant (1), then a is inconclusive (∅) unlike bit 0 or 1”

a' a'' a

1 1 ∅
1 2 1
1 3 ∅
1 4 0
2 1 1
2 2 ∅
2 3 0
2 4 ∅
3 1 ∅
3 2 0
3 3 ∅
3 4 1
4 1 0
4 2 ∅
4 3 1
4 4 ∅

quotes

summary

Nice paper. It is sad that c1 and c2 got mixed up on page 686. The idea to mask is indeed natural with r2 = r2' + r2'' (mod q) and a' := a' + Δ1 for the decoder a'' := a'' - Δ1. Isn't sampling also a problem when masking ring-LWE? If so, the title is inappropriate and should be “A Masked Ring-LWE Decoder Implementation”. The described implementation works for CPA-only. It gives a factor of 2.679 in terms of CPU cycles and 3.2 in terms of runtime in microseconds in the decryption step. With N=16 (maximum number of iterations in the decoder), you get 1 error in about 400 000 bits. This means in NewHope's 256 bit encapsulated keys, you get 1 error in 1420 messages. I did not understand why the masked decoder requires the value of the previous iteration (loop in Figure 3) for a long time. Then I recognized. I don't like the fact that the source code was not published with the paper (→ reproducible research).

typo

A Side-channel Resistant Implementation of SABER §

Title: “A Side-channel Resistant Implementation of SABER” by Michiel Van Beirendonck, Jan-Pieter D'Anvers, Angshuman Karmakar, Josep Balasch, Ingrid Verbauwhede [url] [dblp]
Published in 2020 at IACR eprint 2020 and I read it in 2020-11
Abstract: The candidates for the NIST Post-Quantum Cryptography standardization have undergone extensive studies on efficiency and theoretical security, but research on their side-channel security is largely lacking. This remains a considerable obstacle for their real-world deployment, where side-channel security can be a critical requirement. This work describes a side-channel resistant instance of Saber, one of the lattice-based candidates, using masking as a countermeasure. Saber proves to be very efficient to mask due to two specific design choices: power-of-two moduli, and limited noise sampling of learning with rounding. A major challenge in masking lattice-based cryptosystems is the integration of bit-wise operations with arithmetic masking, requiring algorithms to securely convert between masked representations. The described design includes a novel primitive for masked logical shifting on arithmetic shares, as well as adapts an existing masked binomial sampler for Saber. An implementation is provided for an ARM Cortex-M4 microcontroller, and its side-channel resistance is experimentally demonstrated. The masked implementation features a 2.5x overhead factor, significantly lower than the 5.7x previously reported for a masked variant of NewHope. Masked key decapsulation requires less than 3,000,000 cycles on the Cortex-M4 and consumes less than 12kB of dynamic memory, making it suitable for deployment in embedded platforms.

open questions

quotes

summary

 

typo

Additively Homomorphic Ring-LWE Masking §

Title: “Additively Homomorphic Ring-LWE Masking” by Oscar Reparaz, Ruan de Clercq, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede [url] [dblp]
Published in 2016 at PQCrypto 2016 and I read it in 2020-07
Abstract: In this paper, we present a new masking scheme for ringLWE decryption. Our scheme exploits the additively-homomorphic property of the existing ring-LWE encryption schemes and computes an additive-mask as an encryption of a random message. Our solution differs in several aspects from the recent masked ring-LWE implementation by Reparaz et al. presented at CHES 2015; most notably we do not require a masked decoder but work with a conventional, unmasked decoder. As such, we can secure a ring-LWE implementation using additive masking with minimal changes. Our masking scheme is also very generic in the sense that it can be applied to other additively-homomorphic encryption schemes.

quotes

The most important statement is equation 3:

decryption(c1, c2) ⊕ decryption(c1’, c2’) = decryption(c1 + c1’, c2 + c2’)

summary

A paper of rather low quality. The core essence is Equality 3: decryption(c1, c2) ⊕ decryption(c1’, c2’) = decryption(c1 + c1’, c2 + c2’). Besides that, there are many confusing statements. The workings of Equality 3 are barely mentioned (i.e. correctness of Equality 3 is IMHO insufficiently discussed), but should be understandable for everyone in the field. Correctness is non-obvious, because we have an addition on the RHS and XOR on the LHS. But it is trivial, because on the RHS, we consider ℤq whereas LHS uses ℤ2 and the XOR is addition in ℤ2. Unlike the CHES 2015 paper, no additional circuitry is needed, which makes it a super interesting approach. The failure rate increases by a factor of 92 (3.6 × 10-5 → 3.3 × 10-3 per bit), which is a difficult problem of its own, but given in the CHES 2015 approach as well.

In summary, the approach computes the encryption of the original message (⇒ (c1, c2)) and also the encryption of some random value (⇒ (c1’, c2’)). Then, we don't decrypt dec(c1, c2), but dec(c1 + c1’, c2 + c2’).

typo

Aggregated Private Information Retrieval §

Title: “Aggregated Private Information Retrieval” by Lukas Helminger, Daniel Kales, Christian Rechberger [url] [dblp]
Published in 2020-05 at and I read it in 2020-07
Abstract: With the outbreak of the coronavirus, governments rely more and more on location data shared by European mobile network operators to monitor the advancements of the disease. In order to comply with often strict privacy requirements, this location data, however, has to be anonymized, limiting its usefulness for making statements about a filtered part of the population, like already infected people.

quotes

summary

The paper implements a nice idea. The usefulness of the usecase needs to be discussed with public health experts (what does it help if I know that many infected people live in this block of houses?). However, I have been told the entire paper was written in 1 month and that is quite impressive considering the technical depth in the field of Homomorphic Encryption.

There are many typos and to me, the main purpose of the protocol in Figure 1 was not comprehensible before talking to the authors. I also didn't understand in which ways ε-Differential Privacy is desirable or how it can be ensured and which definition they used for “vector c is binary” before going into details in section 3.2. Apparently, a binary vector is desirable to prevent leakage. For Figure 2, they used the Teχ package cryptocode to illustrate the protocol. To the best of my understanding, this is just a reiteration of Figure 2. On page 13, the paragraph “Note that, as we have already mentioned in Section 3.6” should be moved to the concluding remarks. On page 14, it is unclear, what “isolated profiles” are. I didn't go through the details of section 5.

BasicBlocker: Redesigning ISAs to Eliminate Speculativ… §

Title: “BasicBlocker: Redesigning ISAs to Eliminate Speculative-Execution Attacks” by Jan Philipp Thoma, Jakob Feldtkeller, Markus Krausz, Tim Güneysu, Daniel J. Bernstein [url] [dblp]
Published in 2020-07 at and I read it in 2020-08
Abstract: Recent research has revealed an ever-growing class of microarchitectural attacks that exploit speculative execution, a standard feature in modern processors. Proposed and deployed countermeasures involve a variety of compiler updates, firmware updates, and hardware updates. None of the deployed countermeasures have convincing security arguments, and many of them have already been broken.

Comment: Preprint

quotes

summary

Quite good paper.

“Preventing speculative execution exploits” conjured up more sophisticated expectations for my part, but in general the idea is legit and the research was done properly. Essentially, the additional bb instruction annotates how many following instructions do not contain control flow instructions which allows a processor to prefetch them without any speculative considerations. An additional lcnt instruction for RISC-V handles loops.

In the reading group it was criticized that formal definitions were cumbersome and inappropriate, but I consider it a strong suit that the idea was not only presented, but formalized. I think the negative aspects are that some statements are not properly attributed; like Observation 1 which is not a contribution by this paper, but prior work. Furthermore statements like “One could easily extend our design to larger, more complex CPUs” seem unjustified. Intel Skylake CPUs are built with 19 stage pipelines, which certainly shows different performance metrics. So more research into the relation of number of pipeline stages and performance is required.  A new instruction type for RISC-V is also a non-trivial modification in practice. The corresponding code is not yet published, but the paper was just released 1 month prior reading. Another weak point is selling, which focuses on the good performance of BasicBlocker in the nettle-aes and nettle-sha benchmarks. As cryptographic applications, these obviously do not rely on branching except for “for loops”, which is completely non-representative, but statements like “In some cases, BasicBlocker even outperforms sophisticated branch predictors” can be found. Finally, Claudio pointed out very well, that JIT compilation cannot be implemented with BB since the number of instructions of a block are most likely not known.

Overall, a quite good paper, but less aggressive selling would have increased reputation from my perspective.

Cyclone: A safe dialect of C §

Title: “Cyclone: A safe dialect of C” by Greg Morrisett, James Cheney, Dan Grossman, Michael Hicks, Yanling Wang [url] [dblp]
Published in 2002 at USENIX 2002 and I read it in 2020-02
Abstract:

Ambiguity

“Arrays and strings are converted to ?-pointers as necessary (automatically by the compiler).”

→ When exactly?

Example

“We don’t consider it an error if non-pointers are uninitialized. For example, if you declare a local array of non-pointers, you can use it without initializing the elements:

char buf[64];       // contains garbage ..
sprintf(buf,"a");   // .. but no err here
char c = buf[20]; // .. or even here

This is common in C code; since these array accesses are in-bounds, we allow them.”

Example

“However, this technique will not catch even the following simple variation:

char *itoa(int i) {
  char buf[20];
  char *z;
  sprintf(buf,"%d",i);
  z = buf;
  return z;
}

Here, the address of buf is stored in the variable z, and then z is returned. This passes gcc -Wall without complaint.”

Quotes

Region blocks

“Therefore, Cyclone provides a feature called growable regions. The following code declares a growable region, does some allocation into the region, and deallocates theregion:

region h {
  int *x = rmalloc(h,sizeof(int));
  int ?y = rnew(h) { 1, 2, 3 };
  char ?z = rprintf(h,"hello");
}

The code uses a region block to start a new, growable region that lives on the heap. The region is deallocated on exit from the block (without an explicit free).”

Summary

Great paper with pragmatic approach derived from the Typed Assembly Language project. Definitely worth a read. Essential for everyone interested in the Rust programming language as this project inspired many ideas related to proper memory management and lifetimes. Implementation needs to be compiled on your own, but it is not maintained anymore anyways. Furthermore, there are more papers following the progress of the project and they also introduced more drastic changes which discards the label “pragmatic”.

Tagged unions

“We solve this in Cyclone in two steps. First, we add tagged unions to the language:”

tunion t {
  Int(int);
  Str(char ?);
};

[…]

void pr(tunion t x) {
  switch (x) {
    case &Int(i): printf("%d",i); break;
    case &Str(s): printf("%s",s); break;
  }
}

“The printf function itself accesses the tagged arguments through a fat pointer (Cyclone’s varargs are bounds checked)”

Everything Old is New Again: Binary Security of WebAss… §

Title: “Everything Old is New Again: Binary Security of WebAssembly” by Daniel Lehmann, Johannes Kinder, Michael Pradel [url] [dblp]
Published in 2020 at USENIX Security 2020 and I read it in 2020-07
Abstract: WebAssembly is an increasingly popular compilation target designed to run code in browsers and on other platforms safely and securely, by strictly separating code and data, enforcing types, and limiting indirect control flow. Still, vulnerabilities in memory-unsafe source languages can translate to vulnerabilities in WebAssembly binaries. In this paper, we analyze to what extent vulnerabilities are exploitable in WebAssembly binaries, and how this compares to native code. We find that many classic vulnerabilities which, due to common mitigations, are no longer exploitable in native binaries, are completely exposed in WebAssembly. Moreover, WebAssembly enables unique attacks, such as overwriting supposedly constant data or manipulating the heap using a stack overflow. We present a set of attack primitives that enable an attacker (i) to write arbitrary memory, (ii) to overwrite sensitive data, and (iii) to trigger unexpected behavior by diverting control flow or manipulating the host environment. We provide a set of vulnerable proof-of-concept applications along with complete end-to-end exploits, which cover three WebAssembly platforms. An empirical risk assessment on real-world binaries and SPEC CPU programs compiled to WebAssembly shows that our attack primitives are likely to be feasible in practice. Overall, our findings show a perhaps surprising lack of binary security in WebAssembly. We discuss potential protection mechanisms to mitigate the resulting risks.

quotes

summary

Wonderful paper. Firstly, a security analysis was necessary and I think they covered an appropriate amount of attack vectors. Secondly it is, of course, sad to see that many old vulnerabilities still work on a new platform.

I am unconditionally in favor of true read-only memory in WebAssembly. As David points out, this can also be enforced by a compiler by appropriate runtime checks. However, David also specified a criterion: Iff a memory attack could lead to an escape from the WASM sandbox, then it is an issue of WebAssembly sandbox and should be prevented at this level.

I keep wondering about stack canaries and guard pages. Maybe it was a decision between the trade-off of security and performance. But I am not convinced by it with 100%. Thirdly, the paper is well-structured and gives sufficient data to reason its arguments. The attacks were okay, the introduction to WebAssembly was superb and justifying the claims regarding indirect calls with quantitative data in section 6 was outstanding. I think everyone in IT security can easily follow it. I love it!

WebAssembly architecture:

toc

• Introduction
• Background on WebAssembly
• Security Analysis of Linear Memory
• Managed vs. Unmanaged Data
• Memory Layout
• Memory Protections
• Attack Primitives
• Obtaining a Write Primitive
• Stack-based Buffer Overflow
• Stack Overflow
• Heap Metadata Corruption
• Overwriting Data
• Overwriting Stack Data
• Overwriting Heap Data
• Overwriting “Constant” Data
• Triggering Unexpected Behavior
• Redirecting Indirect Calls
• Code Injection into Host Environment
• Application-specific Data Overwrite
• End-to-End Attacks
• Cross-Site Scripting in Browsers
• Remote Code Execution in Node.js
• Arbitrary File Write in Stand-alone VM
• Quantitative Evaluation
• Experimental Setup and Analysis Process
• Measuring Unmanaged Stack Usage
• Measuring Indirect Calls and Targets
• Comparing with Existing CFI Policies
• Discussion of Mitigations
• WebAssembly Language
• Compilers and Tooling
• Application and Library Developers
• Related Work
• Conclusion

FastSpec: Scalable Generation and Detection of Spectre… §

Title: “FastSpec: Scalable Generation and Detection of Spectre Gadgets Using Neural Embeddings” by M. Caner Tol, Koray Yurtseven, Berk Gulmezoglu, Berk Sunar [url] [dblp]
Published in 2020 at and I read it in 2020-07
Abstract: Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules which fall short in covering subtle variations of known Spectre gadgets as well as demand a huge amount of time to analyze each conditional branch in software. Since it requires arduous effort to craft new gadgets manually, the evaluations of detection mechanisms are based only on a handful of these gadgets.

ambiguity

quotes

summary

Very advanced paper. Perfect combination of Machine Learning technology with microarchitectural attack work. Shows a huge effort and nice considerations regarding Generative Adversarial Networks. However, I could not understand technical aspects of the machine learning part

The masking rate (page 6) seems to be the percentage of hidden tokens during the training phase. Figure 4 is a little bit awkward and a little bit random. FastSpec/scan.sh seems to show how FastSpec was called to evaluate OpenSSL. And commands.txt tries to explain it somehow.

typo

High-speed Instruction-set Coprocessor for Lattice-bas… §

Title: “High-speed Instruction-set Coprocessor for Lattice-based Key Encapsulation Mechanism: Saber in Hardware” by Sujoy Sinha Roy, Andrea Basso [url] [dblp]
Published in 2020 at CHES 2020 and I read it in 2020-11
Abstract: In this paper, we present an instruction set coprocessor architecture for lattice-based cryptography and implement the module lattice-based post-quantum key encapsulation mechanism (KEM) Saber as a case study. To achieve fast computation time, the architecture is fully implemented in hardware, including CCA transformations. Since polynomial multiplication plays a performance-critical role in the module and ideal lattice-based public-key cryptography, a parallel polynomial multiplier architecture is proposed that overcomes memory access bottlenecks and results in a highly parallel yet simple and easy-to-scale design. Such multipliers can compute a full multiplication in 256 cycles, but are designed to target any area/performance trade-offs. Besides optimizing polynomial multiplication, we make important design decisions and perform architectural optimizations to reduce the overall cycle counts as well as improve resource utilization.

open questions

quotes

summary

A good read with appropriate comparisons with other schemes and implementations. Reasonable arguments are provided for design decisions. The runtime results were expectable and have been met.

Instruction                             Cycle Count
Keygen Encapsulation Decapsulation
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
SHA3-256 339 585 303
SHA3-512 0 62 62
SHAKE-128 1,461 1,403 1,403
Vector sampling 176 176 176
Polynomial multiplications 2,685 3,592 4,484
(i.e. 47% of column) (i.e. 54%) (i.e. 56%)
Remaining operations 792 800 1,606
Total cycles 5,453 6,618 8,034
Total time at 250 MHz 21.8 μs 26.5 μs 32.1 μs

typo

Historical Notes on the Fast Fourier Transform §

Title: “Historical Notes on the Fast Fourier Transform” by W Lewis, Peter D Welch [url] [dblp]
Published in 1967 at IEEE 1967 and I read it in 2020-03
Abstract: The fast Fourier transform algorithm has a long and interesting history that has only recently been appreciated. In this paper, the contributions of many investigators are described and placed in historical perspective.

summary

Number "Not" Used Once - Key Recovery Fault Attacks on… §

Title: “Number "Not" Used Once - Key Recovery Fault Attacks on LWE Based Lattice Cryptographic Schemes” by Prasanna Ravi, Shivam Bhasin, Anupam Chattopadhyay [url] [dblp]
Published in 2018 at COSADE 2019 and I read it in 2020-05
Abstract: This paper proposes a simple single bit flip fault attack applicable to several LWE (Learning With Errors Problem) based lattice based schemes like KYBER, NEWHOPE, DILITHIUM and FRODO which were submitted as proposals for the NIST call for standardization of post quantum cryptography. We have identified a vulnerability in the usage of nonce, during generation of secret and error components in the key generation procedure. Our fault attack, based on a practical bit flip model (single bit flip to very few bit flips for proposed parameter instantiations) enables us to retrieve the secret key from the public key in a trivial manner. We fault the nonce in order to maliciously use the same nonce to generate both the secret and error components which turns the LWE instance into an exactly defined set of linear equations from which the secret can be trivially solved for using Gaussian elimination.

ambiguity

What is z in the provided interval in section 2.6? Seems to be either an arbitrary integer or z of Algorithm 1 (NewHope)

error

“later converted to the NTT domain using the PolyBitRev function” … no, it is a separate step

inconsistency

quotes

summary

Short paper, tiny errors, no practical implementation (how successful can the attack on Frodo be implemented?) (correction: unlike the paper, their presentation slides show evaluation data), the structure of the text is beautiful and it is well-written. I particularly enjoyed the summary of weak LWE instances. Masking is also another countermeasure, but is heavy-weight compared to the proposed error-correcting codes

typo

The aforementioned fault vulnerabilities associated with the nonce only occur due to the implementation strategies used and hence can we believe can be easily corrected albeit with a small performance overhead.

⇒ hence can, we believe, can be easily corrected

Piret and Quisquater’s DFA on AES Revisited §

Title: “Piret and Quisquater’s DFA on AES Revisited” by Christophe Giraud, Adrian Thillard [url] [dblp]
Published in 2010 at and I read it in 2020-06
Abstract: At CHES 2003, Piret and Quisquater published a very efficient DFA on AES which has served as a basis for many variants published afterwards. In this paper, we revisit P&Q’s DFA on AES and we explain how this attack can be much more efficient than originally claimed. In particular, we show that only 2 (resp. 3) faulty ciphertexts allow an attacker to efficiently recover the key in the case of AES-192 (resp. AES-256). Our attack on AES-256 is the most efficient attack on this key length published so far.

quotes

summary

Good paper. Ideas are immediate. Not all attacks presented give runtimes, but the (more important) number of faulty ciphertexts is analyzed properly. The original P&Q attack is neat in general.

The original attack from 2003 uniquely identifies the key with 2 faulty ciphertext pairs with probability 98%.

typo

Figure 2 swaps the last SubBytes and ShiftRows operation

SEVurity: No Security Without Integrity §

Title: “SEVurity: No Security Without Integrity” by Luca Wilke, Jan Wichelmann, Mathias Morbitzer, Thomas Eisenbarth [url] [dblp]
Published in 2020 at IEEE Symposium on Security and Privacy 2020 and I read it in 2020-06
Abstract: One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them. Full memory encryption for Virtual Machines (VM) protects against curious cloud providers as well as otherwise compromised hypervisors. AMD Secure Encrypted Virtualization (SEV) is the most prevalent hardware-based full memory encryption for VMs. Its newest extension, SEV-ES, also protects the entire VM state during context switches, aiming to ensure that the host neither learns anything about the data that is processed inside the VM, nor is able to modify its execution state. Several previous works have analyzed the security of SEV and have shown that, by controlling I/O, it is possible to exfiltrate data or even gain control over the VM’s execution. In this work, we introduce two new methods that allow us to inject arbitrary code into SEV-ES secured virtual machines. Due to the lack of proper integrity protection, it is sufficient to reuse existing ciphertext to build a high-speed encryption oracle. As a result, our attack no longer depends on control over the I/O, which is needed by prior attacks. As I/O manipulation is highly detectable, our attacks are stealthier. In addition, we reverse-engineer the previously unknown, improved Xor-Encrypt-Xor (XEX) based encryption mode, that AMD is using on updated processors, and show, for the first time, how it can be overcome by our new attacks.

Intel SGX's role

Intel Software Guard Extensions (SGX) was the first widely available solution for protecting data in RAM. However, it only can protect a small chunk of RAM, not the VM as a whole.

Memory encryption systems for VMs

Comparison: S. Mofrad, F. Zhang, S. Lu, and W. Shi, “A comparison study of intel sgx and amd memory encryption technology,”, 2018

Notes

On virtual memory with virtual machines

On virtualized systems, two different page tables are used. Within the VM, the VA used by the guest, the Guest Virtual Address (GVA), is translated to the Guest Physical Address (GPA). The GPA is the address which the VM considers to be the PA. However, on the host system itself another page table is introduced, to allow multiple VMs to run on the same physical machine. This second page table is called the Second Level Address Translation (SLAT), or Nested Page Table (NPT) [5]. The NPT translates the GPA into the Host Physical Address (HPA), the actual address of the data in physical memory.

summary

Trust in cloud providers

One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them.

Tweakable block ciphers

One popular method for storage encryption are tweakable block ciphers, such as AES-XTS [1], which is, e.g., used in Apple’s FileVault, MS Bitlocker and Android’s file-based encryption. Tweakable block ciphers provide encryption without data expansion as well as some protection against plaintext manipulation. A tweak allows to securely change the behavior of the block cipher, similar to instantiating it with a new key, but with little overhead.

XEX and Xor-Encrypt (XE) are methods to turn a block cipher such as AES into a tweakable blockcipher, where a tweak-derived value is XORed with the plaintext before encryption (and XORed again after encryption in the case of XEX).

typo

“before it continuous its operation” → “before it continues its operation.”

Templates vs. Stochastic Methods: A Performance Analys… §

Title: “Templates vs. Stochastic Methods: A Performance Analysis for Side Channel Cryptanalysis” by Benedikt Gierlichs, Kerstin Lemke-Rust, Christof Paar [url] [dblp]
Published in 2006 at CHES 2006 and I read it in 2020-06
Abstract: Template Attacks and the Stochastic Model provide advanced methods for side channel cryptanalysis that make use of ‘a-priori’ knowledge gained from a profiling step. For a systematic comparison of Template Attacks and the Stochastic Model, we use two sets of measurement data that originate from two different microcontrollers and setups. Our main contribution is to capture performance aspects against crucial parameters such as the number of measurements available during profiling and classification. Moreover, optimization techniques are evaluated for both methods under consideration. Especially for a low number of measurements and noisy samples, the use of a T-Test based algorithm for the choice of relevant instants can lead to significant performance gains. As a main result, T-Test based Templates are the method of choice if a high number of samples is available for profiling. However, in case of a low number of samples for profiling, stochastic methods are an alternative and can reach superior efficiency both in terms of profiling and classification.

ambiguity

quotes

summary

Important empirical results. Parameters and assumptions are okay. Results are fundamental and significant. However, the description of the methods (templates, stochastic) are quite bad. Looking up the references is required.

Notes:

The design of a Unicode font §

Title: “The design of a Unicode font” by C Bigelow, K Holmes [url] [dblp]
Published in 1993 at and I read it in 2020-10
Abstract: The international scope of computing, digital information interchange, and electronic publishing has created a need for world-wide character encoding standards. Unicode is a comprehensive standard designed to meet such a need. To be readable by humans, character codes require fonts that provide visual images — glyphs — corresponding to the codes. The design of a font developed to provide a portion of the Unicode standard is described and discussed.

ambiguity

“The inclusion of non-alphabetic symbols and non-Latin letters in these 8-bit character
sets required font developers to decide whether the assorted symbols and non-Latin letters
should be style-specific or generic.”

A definition of style-specific and generic would be nice.

“Hence, although accents need to be clearly differentiated, they do not need to be emphatic, and, indeed, overly tall or heavy accents can be more distracting than helpful to readers.”

What does empathic mean in this context?

nicely written

“To design such a font is a way to study and appreciate, on a microcosmic scale, the manifold variety of literate culture and history.”

“A ‘roman’ typeface design (perhaps upright would be a less culture-bound term, since a Unicode font is likely to include Greek, Cyrillic, Hebrew, and other scripts)” …

question

One of the advantages of Unicode is that it includes a Generic Diacritical Marks set of ‘floating’ diacritics that can be combined with arbitrary letters. These are not case-sensitive, i.e. there is only one set of floating diacritics for both capitals and lowercase

Isn't this a property of the font file format?

quotes

summary

A neat paper. Due to my lack of understanding of the status quo in 1993, I cannot judge on the approach and quality. There are some considerations, I am not familiar with (e.g. why do we need to distinguish only two kinds of diacritics - lowercase and uppercase), but the paper gives a broad overview over design decisions that need to be made, when designing a ‘Unicode’ font. They designed 1700 characters, but Unicode 1.1 specifies 34,168 characters. It is part of the paper to discuss “One big font vs. many little fonts”.

Too Much Crypto §

Title: “Too Much Crypto” by Jean-Philippe Aumasson [url] [dblp]
Published in 2020-01-03 at Real-World Crypto 2020 and I read it in 2020/01
Abstract: We show that many symmetric cryptography primitives would not be less safe with significantly fewer rounds. To support this claim, we review the cryptanalysis progress in the last 20 years, examine the reasons behind the current number of rounds, and analyze the risk of doing fewer rounds. Advocating a rational and scientific approach to round numbers selection, we propose revised number of rounds for AES, BLAKE2, ChaCha, and SHA-3, which offer more consistent security margins across primitives and make them much faster, without increasing the security risk.

ambiguity

“Where we examine the reasons behind the number of rounds, comment on the risk posed by quantum computers, and finally propose new primitives for a future where less energy is wasted on computing superfluous rounds.”

Is this an English sentence?

notes

Good paper; its survey is its strong suit. However, the particular choice of proposed parameters has little justification

typo

If all the best cryptanalysts could find was a distinguisher in the chosen-ciphertext related-key model, then even less likely are practical key recovery attack in the chosen-plaintext model.

typo

But as as noted in §2, numbers such as

Tweaks and Keys for Block Ciphers: The TWEAKEY Framewo… §

Title: “Tweaks and Keys for Block Ciphers: The TWEAKEY Framework” by Jérémy Jean, Ivica Nikolić, Thomas Peyrin [url] [dblp]
Published in 2014 at ASIACRYPT 2014 and I read it in 2020-06
Abstract: We propose the TWEAKEY framework with goal to unify the design of tweakable block ciphers and of block ciphers resistant to related-key attacks. Our framework is simple, extends the key-alternating construction, and allows to build a primitive with arbitrary tweak and key sizes, given the public round permutation (for instance, the AES round). Increasing the sizes renders the security analysis very difficult and thus we identify a subclass of TWEAKEY, that we name STK, which solves the size issue by the use of finite field multiplications on low hamming weight constants. We give very efficient instances of STK, in particular, a 128-bit tweak/key/state block cipher Deoxys-BC that is the first AES-based ad-hoc tweakable block cipher. At the same time, Deoxys-BC could be seen as a secure alternative to AES-256, which is known to be insecure in the related-key model. As another member of the TWEAKEY framework, we describe Kiasu-BC, which is a very simple and even more efficient tweakable variation of AES-128 when the tweak size is limited to 64 bits.

Properties of tweakable block ciphers:

quotes (on contributions)

quotes (on history)

Non-intuitive results on birthday paradox bounds

Future work

Tweakey

Performance and area

summary

I think this paper is some good research product. I am not an expert on symmetric cryptography and cannot judge upon the security analysis and possible attacks, but to me it seems to consider relevant properties. Unrelated to the paper, I was not aware of beyond-birthday-attack-security which totally intrigued me. Related to the paper, follow-up work could be made regarding the question “What are sufficient conditions to achieve a secure tweakable block cipher with Tweakey?”. Well done!

typo

When a Patch is Not Enough - HardFails: Software-Explo… §

Title: “When a Patch is Not Enough - HardFails: Software-Exploitable Hardware Bugs” by Ghada Dessouky, David Gens, Patrick Haney, Garrett Persyn, Arun Kanuparthi, Hareesh Khattri, Jason M. Fung, Ahmad-Reza Sadeghi, Jeyavijayan Rajendran [url] [dblp]
Published in 2018-12-01 at and I read it in 2020-07
Abstract: Modern computer systems are becoming faster, more efficient, and increasingly interconnected with each generation. Consequently, these platforms also grow more complex, with continuously new features introducing the possibility of new bugs. Hence, the semiconductor industry employs a combination of different verification techniques to ensure the security of System-on-Chip (SoC) designs during the development life cycle. However, a growing number of increasingly sophisticated attacks are starting to leverage cross-layer bugs by exploiting subtle interactions between hardware and software, as recently demonstrated through a series of real-world exploits with significant security impact that affected all major hardware vendors.

HCF

“A behavior humorously hinted at in IBM System/360 machines in the form of a Halt-and-Catch-Fire (HCF) instruction.”

→ See also Christopher Domas' research on x86 since 2017
→ Pentium F00F (C7C8) bug

quotes

summary

A technical report discussing how bugs are introduced in the hardware design process and slip through testing tools. Specifically, they define HardFails as RTL bugs that are difficult to detect and can be triggered from software potentially compromising the entire platform. Their classification in 4 HardFails properties {Cross-modular effects, Timing-flow gap, Cache-state gap, Hardware/Firmware-interactions} is non-exhaustive (as pointed out in the conclusion). In my opinion, it is too copious when discussing the hardware development process and laying out the advantages/disadvantages of various tools. I think it could have been more concise (e.g. in “Proof assistant and theorem-proving” the scalability issue is mentioned twice).
Besides that, I think it give a nice overview over the issues hardware design has to deal with and yes, we need better tool support. But there is (obviously) no solution to the mentioned problems in the paper.
Catherine pointed out that CLKScrew in Table 2 does not need Firmware interaction and Foreshadow has nothing to do with Firmware interaction.