# Papers

During my time at university, I read a bunch of papers. Since the beginning of my PhD studies, I have systematically reviewed and summarized them. Here are my notes.

• § A Masked Ring-LWE Implementation
• § A Side-channel Resistant Implementation of SABER
• § Additively Homomorphic Ring-LWE Masking
• § Aggregated Private Information Retrieval
• § BasicBlocker: Redesigning ISAs to Eliminate Speculative-Execution Attacks
• § Cyclone: A safe dialect of C
• § Everything Old is New Again: Binary Security of WebAssembly
• § FastSpec: Scalable Generation and Detection of Spectre Gadgets Using Neural Emb…
• § High-speed Instruction-set Coprocessor for Lattice-based Key Encapsulation Mech…
• § Historical Notes on the Fast Fourier Transform
• § Number "Not" Used Once - Key Recovery Fault Attacks on LWE Based Lattice Crypto…
• § On the criteria to be used in decomposing systems into modules
• § PDF/A considered harmful for digital preservation
• § Piret and Quisquater’s DFA on AES Revisited
• § SEVurity: No Security Without Integrity
• § Software-based Power Side-Channel Attacks on x86
• § Templates vs. Stochastic Methods: A Performance Analysis for Side Channel Crypt…
• § The design of a Unicode font
• § Too Much Crypto
• § Tweaks and Keys for Block Ciphers: The TWEAKEY Framework
• § When a Patch is Not Enough - HardFails: Software-Exploitable Hardware Bugs

## Papers and notes

### A Masked Ring-LWE Implementation §

Title: “A Masked Ring-LWE Implementation” by Oscar Reparaz, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede [url] [dblp]
Published in 2015 at CHES 2015 and I read it in 2020/07
Abstract: Lattice-based cryptography has been proposed as a postquantum public-key cryptosystem. In this paper, we present a masked ring-LWE decryption implementation resistant to ﬁrst-order side-channel attacks. Our solution has the peculiarity that the entire computation is performed in the masked domain. This is achieved thanks to a new, bespoke masked decoder implementation. The output of the ring-LWE decryption are Boolean shares suitable for derivation of a symmetric key. We have implemented a hardware architecture of the masked ring-LWE processor on a Virtex-II FPGA, and have performed side channel analysis to conﬁrm the soundness of our approach. The area of the protected architecture is around 2000 LUTs, a 20 % increase with respect to the unprotected architecture. The protected implementation takes 7478 cycles to compute, which is only a factor ×2.6 larger than the unprotected implementation.

ambiguity

$\operatorname{th}(x) = \begin{cases} 0 & \text{if } x \in (0, q/4) \cup (3q/4, q) \\ 1 & \text{if } x \in (q/4, 3q/4) \end{cases}$

The intervals apparently should denote inclusive-exclusive boundaries.
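A minimal Python sketch of this threshold function, assuming the inclusive-exclusive boundaries noted above (the modulus 12289 in the example is just a common ring-LWE choice, not necessarily the paper's):

```python
def th(x: int, q: int) -> int:
    """Threshold decoder for a single ring-LWE coefficient.

    Maps x in [0, q) to a message bit: coefficients near q/2 decode to 1,
    coefficients near 0 (or q) decode to 0. Interval boundaries are taken
    as inclusive-exclusive, as noted above.
    """
    return 1 if q // 4 <= x < 3 * q // 4 else 0

# Values near q/2 decode to 1, values near 0 or q decode to 0.
assert th(0, 12289) == 0
assert th(6144, 12289) == 1
```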

error

On page 686, c1 and c2 are swapped erroneously throughout. On page 685, c2 is defined such that it contains the message. Consequently, the factor r2 must be applied to c1, not c2. However, page 686 repeatedly multiplies c2 with the different r values.

First line is meant to be read as “If a' is in quadrant (1) and a'' is in quadrant (1), then a is inconclusive (∅), rather than bit 0 or 1”.

| a' | a'' | a |
|----|-----|---|
| 1 | 1 | ∅ |
| 1 | 2 | 1 |
| 1 | 3 | ∅ |
| 1 | 4 | 0 |
| 2 | 1 | 1 |
| 2 | 2 | ∅ |
| 2 | 3 | 0 |
| 2 | 4 | ∅ |
| 3 | 1 | ∅ |
| 3 | 2 | 0 |
| 3 | 3 | ∅ |
| 3 | 4 | 1 |
| 4 | 1 | 0 |
| 4 | 2 | ∅ |
| 4 | 3 | 1 |
| 4 | 4 | ∅ |
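The rule table above translates directly into a lookup; a sketch (the quadrant numbering 1–4 over [0, q) and the use of None for ∅ are my assumptions):

```python
# Masked decoder rule table from above: given the quadrants (1..4) of the
# two shares a' and a'', either deduce th(a) or return None (inconclusive).
RULES = {
    (1, 1): None, (1, 2): 1,    (1, 3): None, (1, 4): 0,
    (2, 1): 1,    (2, 2): None, (2, 3): 0,    (2, 4): None,
    (3, 1): None, (3, 2): 0,    (3, 3): None, (3, 4): 1,
    (4, 1): 0,    (4, 2): None, (4, 3): 1,    (4, 4): None,
}

def quadrant(x: int, q: int) -> int:
    """Quadrant 1..4 of x in [0, q)."""
    return 4 * x // q + 1

def masked_th_rule(a1: int, a2: int, q: int):
    """Try to deduce th(a1 + a2 mod q) from the shares' quadrants alone."""
    return RULES[(quadrant(a1, q), quadrant(a2, q))]
```

For example, shares in quadrants (1, 2) always sum to a value in [q/4, 3q/4), hence the deterministic bit 1; the diagonal pairs span both decision regions and stay inconclusive.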

quotes

• “The area of the protected architecture is around 2000 LUTs, a 20 % increase with respect to the unprotected architecture. The protected implementation takes 7478 cycles to compute, which is only a factor ×2.6 larger than the unprotected implementation.”
• “So far, the reported implementations have focused mainly on efficient implementation strategies, and very little research work has appeared in the area of side channel security of the lattice-based schemes.”
• “Most notably, masking is both a provably sound and popular in industry.”
• “However, there are not many masking schemes specifically designed for postquantum cryptography. In Brenner et al. present a masked FPGA implementation of the post-quantum pseudo-random function SPRING.”
• “In the rest of the paper, we focus on protecting the ring-LWE decryption operation against side-channel attacks with masking. The decryption algorithm is considerably exposed to DPA attacks since it repeatedly uses long-term private keys. In contrast, the encryption or key-generation procedures use ephemeral secrets only”
• “We analyze the error rates of the decryption operation in Sect. 6 and apply error correcting codes.”
• “the message is first lifted to a ring element m̄ ∈ R q by multiplying the messagebits by q/2.”
• “The most natural way to split the computation of the decryption as Eq. 2 is to split the secret polynomial r additively into two shares r' and r'' such that r[i] = r'[i] + r''[i] (mod q) for all i.”
• “The final threshold th(·) operation of Eq. 2 is obviously non-linear in the base field Fq, and hence cannot be independently applied to each branch”
• “For instance, in [4] an approach based on masked tables was used.”
• “We design a bespoke masked decoder that results in a very compact implementation.”
• “a denotes a single coefficient and (a', a'') its shares such that a' + a'' = a (mod q).”
• “In roughly half of the cases, we can apply one of the 8 rules previously described to deduce the value of th(a).”
• “a' ← a' + Δ1 and a'' ← a'' − Δ1 for certain Δ1
• “See the extended version of this paper for exemplary values of Δi.” → http://www.reparaz.net/oscar/ches2015-lwe → 404 Not Found
• Masked Table Lookup: “This is a well-studied problem that arises in other situations (for instance, when masking the sbox lookup in a typical block cipher) and there are plenty of approaches here to implement such masked table lookup. We opted for the approach of masked tables as in [26].”
• “The usual precautions are applied when implementing f. For our target FPGA platform, we carefully split the 7-bit input to 1-bit output function f into a balanced tree of 4-bit input LUTs, in such a way that any intermediate input or output of LUTs does not leak in the first order.”
• “In Table 1, we can see that the proposed masking of the ring-LWE architecture incurs an additional area overhead of only 301 LUTs and 129 FFs in comparison to the unprotected version.”
• “we could straightforward reduce the additional area cost by reusing the 13-bit addition and subtraction circuits present in the arithmetic coprocessor. […] For simplicity, we did not implement this approach.”
• “Thus in total, a masked decryption operation requires 7478 cycles. The arithmetic coprocessor and the masked decoder run in constant time and constant flow.”
• “We point out that the approach laid out in Sect. 3 scales quite well with the security order. To achieve security at level d+1, one would need to split the computation of Eq. 2 into d branches analogously to Eq. 3.”
• “We provide a very advantageous setting for the adversary: we assume that the evaluator knows the details about the implementation (for example, pipeline stages).”
• “The evaluation methodology to test if the masking is sound is as follows. We first proceed with first-order key-recovery attacks when the randomness source (PRNG) is switched off. […] Then we switch on the PRNG and repeat the attacks. If the masking is sound, the first-order attacks shall not succeed. In addition, we perform second-order attacks to confirm that the previous first-order analyses were carried out with enough traces.”
• “We can see that starting from ≈2000 measurements this second-order attack is successful.”
• “We remark that the relatively low number of traces required for the second-order attack is due to the very friendly scenario for the evaluator. The platform is low noise and no other countermeasure except than masking was implemented.”
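The iterative refresh described in the quotes above (a' ← a' + Δ1, a'' ← a'' − Δ1, combined with the quadrant rules) can be sketched as follows. Drawing Δ uniformly at random is a simplification; the paper uses specific Δi values, and the extended version that lists them is no longer available:

```python
import random

def quadrant(x: int, q: int) -> int:
    """Quadrant 1..4 of x in [0, q)."""
    return 4 * x // q + 1

# Pairs of share quadrants from which th(a) can be deduced (see table above).
CONCLUSIVE = {(1, 2): 1, (2, 1): 1, (3, 4): 1, (4, 3): 1,
              (1, 4): 0, (4, 1): 0, (2, 3): 0, (3, 2): 0}

def masked_decode(a1: int, a2: int, q: int, max_iter: int = 16):
    """Deduce th(a1 + a2 mod q) without ever recombining the shares.

    In each iteration the shares are re-randomized by a Delta that cancels
    in the sum, until their quadrant pair becomes conclusive. N = 16
    iterations matches the paper's choice; the residual inconclusive
    cases are the source of the increased decryption error rate.
    """
    for _ in range(max_iter):
        rule = CONCLUSIVE.get((quadrant(a1, q), quadrant(a2, q)))
        if rule is not None:
            return rule
        delta = random.randrange(q)       # paper: specific Delta_i values
        a1 = (a1 + delta) % q             # a'  <- a'  + Delta
        a2 = (a2 - delta) % q             # a'' <- a'' - Delta, sum unchanged
    return None  # still inconclusive after N iterations
```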

summary

Nice paper. It is sad that c1 and c2 got mixed up on page 686. The masking idea is indeed natural: r2 = r2' + r2'' (mod q), and for the decoder a' := a' + Δ1 and a'' := a'' - Δ1. Isn't sampling also a problem when masking ring-LWE? If so, the title is inappropriate and should be “A Masked Ring-LWE Decoder Implementation”. The described implementation is only CPA-secure. It incurs a factor of 2.679 in CPU cycles and 3.2 in runtime (microseconds) in the decryption step. With N=16 (the maximum number of iterations in the decoder), you get 1 error in about 400 000 bits. This means in NewHope's 256 bit encapsulated keys, you get 1 error in 1420 messages. For a long time I did not understand why the masked decoder requires the value of the previous iteration (loop in Figure 3); eventually it clicked. I don't like the fact that the source code was not published with the paper (→ reproducible research).

typo

• “In our implementation, N = 16 iterations produces a satisfactory result.”
• “essentially maps the output of each quadrant qi' and qi'' (2 bits each) after the i-the iteration”

### A Side-channel Resistant Implementation of SABER §

Title: “A Side-channel Resistant Implementation of SABER” by Michiel Van Beirendonck, Jan-Pieter D'Anvers, Angshuman Karmakar, Josep Balasch, Ingrid Verbauwhede [url] [dblp]
Published in 2020 at IACR eprint 2020 and I read it in 2020-11

open questions

• “Saber.Masked.KEM.Decaps does not require multiplication of two masked polynomials, which is a significantly more expensive computation.”
It doesn't?
• “A possible countermeasure is to randomize the order of execution of these vulnerable routines. Randomness should be used to shuffle the order of operations in Saber’s multiplication or introduce dummy operations.”
Is there an actual implementation?

quotes

• “This work describes a side-channel resistant instance of Saber, one of the lattice-based candidates, using masking as a countermeasure.”
• “NIST has already announced that, in the second round, more stress will be put on implementation aspects.”
• “The security of both problems relies on introducing noise into a linear equation. However, in LWE-based schemes the noise is explicitly generated and added to the equation, while the LWR problem introduces noise through rounding of some least significant bits.”
• “In our masked implementation of Saber, we develop a novel primitive to perform masked logical shifting on arithmetic shares.”
• “Furthermore, Saber avoids excessive noise sampling due to its choice for LWR.”
• “We integrate and profile our masked CCA-secure decapsulation in the PQM4 [KRSS] post-quantum benchmark suite for the Cortex-M4, showing our close-to-ideal 2.5x overhead in CPU cycles. This factor can directly be compared to the overhead factor 5.7x reported by Oder et al., which is the work most closely related to ours, and we show that it can largely be attributed to the masking-friendly design choices of Saber.”
• “We say that a PKE is δ-correct if P[Decrypt(sk, ct) ≠ m : ct ← Encrypt(pk, m)] ≤ δ.” (i.e. smaller is better)
• “The Saber package is based on the Module Learning With Rounding (MLWR) problem, and its security can be reduced to the security of this problem. MLWR is a variant of the well known Learning With Errors (LWE) problem [Reg04], which combines a module structure as introduced by Langlois and Stehlé [LS15] with the introduction of noise through rounding as proposed by Banerjee et al. [BPR12].”
• “The additions with the constant terms h1, h2 and h are needed to center the errors introduced by rounding around 0, which reduce the failure probability of the protocol.”
• “The reason being that even without side-channel information, the Saber.PKE is vulnerable to chosen-ciphertext attacks if the secret key is re-used, which was shown by Fluhrer [Flu16] to be the case for all current LWE-based and LWR-based IND-CPA secure encryption schemes.”
• “Several secure A2B as well as B2A conversion algorithms exist. These generally come in two flavours, depending on whether the arithmetic shares use a power-of-two or a prime modulus. The former group have received considerably more research interest due to their use in symmetric primitives, and they are typically more efficient and simpler to implement.”
• “Finally, Schneider et al. [SPOG19] combine the previous two algorithms, and at the same time present a new algorithm, B2Aq, which works for arbitrary moduli as well as arbitrary security orders. However, when instantiated as a power-of-two conversion, e.g. q = 2⁸, B2Aq only outperforms [BCZ18] and [CGV14] for more than nine shares.”
• “In the remainder of this section, we first describe the Coron-Tchulkine [CT03] table-based A2B algorithm, including the fix from [Deb12].”
• “Another similarity we share with [KMRV18] is the use of the ARM Cortex-M4’s support for SIMD instructions to speed up execution.”
• “We use the Test Vector Leakage Assessment (TVLA) methodology introduced by Goodwill et al. [GJJR11] in order to validate the security of our implementation.”
• “In our experiments we use a non-specific fix vs. random test.”
• “TVLA uses the Welch’s t-test to detect differences in the mean power consumption between the two sets.”
• “After 100 000 measurements, our t-test results for Saber.Masked.KEM.Decaps with masks ON still show some slight excursions past the ±4.5 confidence boundary. This is sometimes expected for long traces, and therefore, as per [GJJR11], we conduct a second independent t-test showing that these excursions are never at the same time instant.”
• “Our most efficient design is (D), where both A2A tables and the SecBitSlicedSampler are implemented.”
• “Masking has so far received limited attention in post-quantum cryptography, but will become increasingly important in the continuation of the NIST standardization process.”
• “Oder et al. do not present the dynamic memory consumption for an unmasked design, such that we only make the masked comparison for that performance metric.”
• “From Table 5, these two A2A conversions take roughly 60,000 CPU cycles, whereas masked sampling of four error polynomials from β μ would take approximately 1,026,000 CPU cycles. The high cost of masked binomial sampling is further illustrated in [OSPG18] (Table 2), where roughly 71% of the decapsulation’s CPU cycles are spent in the masked sampling routine.”
• “A possible countermeasure is to randomize the order of execution of these vulnerable routines. Randomness should be used to shuffle the order of operations in Saber’s multiplication or introduce dummy operations.”

summary

• Table 2 is an illustration for ℤ4
• “From this table, it can be seen that the linear operations, i.e. polynomial arithmetic, have roughly a factor 2x overhead in the masked design, due to the duplication of every polynomial multiplication. Non-linear operations, on the other hand, have overhead factors ranging from 7x for A2A conversion to 23x for binomial sampling. Our design requires 5048 random bytes, and spends roughly 100,000 cycles sampling these from the TRNG”
• Saber.Masked.PKE.Dec takes 1.96 times as long in the masked version
• Saber.Masked.PKE.Enc takes 2.48 times as long in the masked version
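The roughly 2x overhead for linear operations follows directly from share-wise computation: a public polynomial multiplied with a masked secret is simply multiplied with each share. A toy sketch (schoolbook negacyclic multiplication with a tiny ring degree, not Saber's actual Toom-Cook/Karatsuba arithmetic):

```python
import random

Q = 2**13   # Saber's power-of-two ciphertext modulus
N = 4       # toy ring degree; Saber uses N = 256

def poly_mul(a, b):
    """Schoolbook multiplication in Z_Q[X]/(X^N + 1)."""
    res = [0] * N
    for i in range(N):
        for j in range(N):
            k = (i + j) % N
            sign = -1 if i + j >= N else 1   # negacyclic wrap-around
            res[k] = (res[k] + sign * a[i] * b[j]) % Q
    return res

def split(s):
    """First-order arithmetic masking: s = s1 + s2 (mod Q), coefficient-wise."""
    s1 = [random.randrange(Q) for _ in range(N)]
    s2 = [(c - c1) % Q for c, c1 in zip(s, s1)]
    return s1, s2

# Linear operations act share-wise: multiplying a public polynomial `a`
# with a masked secret just duplicates the multiplication (~2x overhead).
a = [random.randrange(Q) for _ in range(N)]
s = [random.randrange(Q) for _ in range(N)]
s1, s2 = split(s)
p1, p2 = poly_mul(a, s1), poly_mul(a, s2)
assert [(x + y) % Q for x, y in zip(p1, p2)] == poly_mul(a, s)
```

Non-linear steps (the logical shift that implements LWR rounding, the binomial sampler) cannot be duplicated this way, which is where the 7x–23x overhead factors come from.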

typo

• Symmetric crypto primitive names are printed in math mode, making the kerning terrible.
• “Where Reparaz et al. successfully masked a Chosen-Plaintext Attack (CPA)-secure RLWE decryption, real-world applications typically require Chosen-Ciphertext Attack (CCA) secure primitives, which can be obtained using an appropriate CCA-transform.”
“Whereas Reparaz et al. successfully masked a Chosen-Plaintext Attack (CPA)-secure RLWE decryption, real-world applications typically require Chosen-Ciphertext Attack (CCA) secure primitives, which can be obtained using an appropriate CCA-transform.”
• “A first-order masking splits any sensitive variable x in the algorithm into two shares x1 and x2, such that x = x1 + x2, and perform all operations in the algorithm on the shares separately.”
“A first-order masking splits any sensitive variable x in the algorithm into two shares x1 and x2, such that x = x1 + x2, and performs all operations in the algorithm on the shares separately.”
• “In grey the operations that are influenced by the long term secret s and thus vulnerable to side-channel attacks.”
“In grey are operations that are influenced by the long term secret s and thus vulnerable to side-channel attacks.”
• “Even though previous attacks have focused on schoolbook mulitplication …”
“Even though previous attacks have focused on schoolbook multiplication …”

### Additively Homomorphic Ring-LWE Masking §

Title: “Additively Homomorphic Ring-LWE Masking” by Oscar Reparaz, Ruan de Clercq, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede [url] [dblp]
Published in 2016 at PQCrypto 2016 and I read it in 2020-07

quotes

The most important statement is equation 3:

decryption(c1, c2) ⊕ decryption(c1’, c2’) = decryption(c1 + c1’, c2 + c2’)

• “we do not require a masked decoder but work with a conventional, unmasked decoder.”
• “Masking [CJRR99,GP99] is a provable sound countermeasure against DPA.”
• “A caveat of our approach is that we need to place additional assumptions on the underlying arithmetic hardware compared to the CHES 2015 approach.”
• “The operation ⊕ is the xor operation on bits or strings of bits.”
• “In the literature there are several encryption schemes based on the ring-LWE problem, for example [LPR10,FV12,BLLN13] etc.”
• “Among all the computations, polynomial multiplication is the costliest. Most of the reported implementations use the Number Theoretic Transform (NTT) to accelerate the polynomial multiplications.”
• “The proposed randomized decryption. To perform the decryption of (c1, c2) in a randomized way, the implementation follows the following steps:
1. Internally generate a random message m’ unknown to the adversary
2. Encrypt m’ to (c1’, c2’)
3. Perform decryption(c1 + c1’, c2 + c2’) to recover m ⊕ m’.
4. The masked recovered message is the tuple (m’, m ⊕ m’).”
• “This approach has the nice property of not requiring a masked decoder.”
• “The obvious disadvantage is that extra circuitry or code is required to perform the encryption. Another disadvantage is the increased decryption failure rate. When two ciphertexts are added, the amount of noise increases. The added noise increases the decryption failure rate as we will see in Sect. 4.3.”
• “Our countermeasure can be thought of as ciphertext blinding.”
• “Thus, straightforward first-order DPA attack does not immediately apply. Nevertheless, more refined first-order DPA attacks do apply.”
•  “In Appendix A we describe a strategy to detect whether s[i] = 0 or s[i] ≠ 0, which leads to an entropy loss.”
• “after all, Eq. 3 may seem to imply that the decoding function is linear. However, this is clearly not the case.”
• “When the masking is turned off, the decryption failure rate is 3.6 × 10⁻⁵ per bit. The failure rate increases to 3.3 × 10⁻³ per bit when the masking turned on.”
• “In terms of speed, the costliest process is the encryption. It is 2.8 times slower than the decryption.”

summary

A paper of rather low quality. The core essence is Equality 3: decryption(c1, c2) ⊕ decryption(c1’, c2’) = decryption(c1 + c1’, c2 + c2’). Besides that, there are many confusing statements. The workings of Equality 3 are barely explained (i.e. its correctness is IMHO insufficiently discussed), but should be understandable for everyone in the field. Correctness is non-obvious, because we have an addition on the RHS and an XOR on the LHS. But it is trivial once you note that the RHS works in ℤq whereas the LHS works in ℤ2, and XOR is addition in ℤ2. Unlike the CHES 2015 paper, no masked decoder circuitry is needed (though encryption circuitry must be available during decryption), which makes it a super interesting approach. The failure rate increases by a factor of 92 (3.6 × 10⁻⁵ → 3.3 × 10⁻³ per bit), which is a difficult problem of its own, but present in the CHES 2015 approach as well.

In summary, the approach computes the encryption of the original message (⇒ (c1, c2)) and also the encryption of some random value (⇒ (c1’, c2’)). Then, we don't decrypt dec(c1, c2), but dec(c1 + c1’, c2 + c2’).
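Equality 3 can be checked on a toy LWE instance. The following sketch uses a single-coefficient (non-ring) simplification with made-up parameters, so it only illustrates why additively masking ciphertexts yields an XOR-masked message:

```python
import random

q = 12289  # toy modulus; any q with enough noise margin works

def th(x):
    """Threshold decoder: 1 for values near q/2, else 0."""
    return 1 if q // 4 <= x % q < 3 * q // 4 else 0

def encrypt(m, s, noise=20):
    """Toy one-coefficient LWE encryption of bit m under secret s."""
    a = random.randrange(q)
    e = random.randrange(-noise, noise + 1)
    b = (a * s + e + m * (q // 2)) % q
    return a, b

def decrypt(c, s):
    a, b = c
    return th((b - a * s) % q)

s = random.randrange(q)
m = 1
c = encrypt(m, s)                       # ciphertext of the real message
m_rand = random.randrange(2)
c_rand = encrypt(m_rand, s)             # ciphertext of a random mask m'
c_sum = ((c[0] + c_rand[0]) % q, (c[1] + c_rand[1]) % q)

# Equality 3: dec(c) XOR dec(c') == dec(c + c'), as long as the summed
# noise stays small; the masked recovered message is (m', m XOR m').
assert decrypt(c_sum, s) == m ^ m_rand
```

Note how the increased failure rate falls out of this picture for free: the summed ciphertext carries the noise of both encryptions.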

• “A caveat of our approach is that we need to place additional assumptions on the underlying arithmetic hardware compared to the CHES 2015 approach.”
• which ones? performance assumption on encryption?
• “Thus, straightforward first-order DPA attack does not immediately apply. Nevertheless, more refined first-order DPA attacks do apply.”
• “In particular, the practitioner should pay careful attention to leaking distances if implemented in software, since during the masked decoding both shares are handled in contiguous temporal locations.”
• undefined terminology “distances”
• “are handled in contiguous temporal locations” is not necessarily true (only likely true)
• “after all, Eq. 3 may seem to imply that the decoding function is linear. However, this is clearly not the case.”
• elaboration would be nice
• decryption failure is inherent to ring-LWE
• essentially, it is close to linearity
• but decryption is non-linear due to error
• and even neglecting the error as constant, the decryption failure is skewed here and thus linearity is not really given
• Figure 2 is incomprehensible and axes insufficiently explained
• Appendix A is vacuous. How can it be done? I guess the only statement is “it can be classified”, which is a trivial statement (considering template attacks) which does not need more justifications.

typo

• “The countermeasure makes harder the DPA attack” → “The countermeasure makes the DPA attack harder”
• “Note that the distribution of […] when s = 0 and […] is uniform random is different from the distribution of […] when s = 0.” → “Note that the distribution of […] when s = 0 and […] is uniform random but is different from the distribution of […] when s = 0.”

### Aggregated Private Information Retrieval §

Title: “Aggregated Private Information Retrieval” by Lukas Helminger, Daniel Kales, Christian Rechberger [url] [dblp]
Published in 2020-05 at and I read it in 2020-07
Abstract: With the outbreak of the coronavirus, governments rely more and more on location data shared by European mobile network operators to monitor the advancements of the disease. In order to comply with often strict privacy requirements, this location data, however, has to be anonymized, limiting its usefulness for making statements about a ﬁltered part of the population, like already infected people.

quotes

• “In this research, we aim to assist with the disease tracking efforts by designing a protocol to detect coronavirus hotspots from mobile data while still maintaining compliance with privacy expectations.”
• “Governments in Italy, Germany, and Austria are relying on this metadata to monitor how people are complying with stay-at-home orders.”
• “we design a specialized private information retrieval (PIR) protocol.”
• “In this paper, we are interested in the single-server variants, namely computational PIR (CPIR), which rely on cryptographic hardness assumptions to hide the query from the server. Recent work has heavily improved on the original ideas of Chor et al. Many CPIR implementations use homomorphic encryption (HE) to hide the queries from the server while still allowing him to perform operations on the query.”
• “HE is a cryptographic primitive that allows performing computations on encrypted data without knowing the secret decryption key.”
• “We assume without loss of generality that the first column of the database of the server consists of unique identifiers.”
• “The threat model of the APIR protocol is similar to PIR protocols, i.e., the server should not know which elements were retrieved by the client.”
• “To prevent the client from learning individual entries, we make sure that the client’s list of identifiers has a guaranteed minimum cardinality and that each identifier is unique.”
• “We report multithreaded runtimes of 30 minutes for the standard APIR protocol, and 1 hour when extra steps are applied to ensure the input vector is not malicious.”
• “The RSA encryption scheme is homomorphic for multiplication and Paillier’s cryptosystem is homomorphic for addition. However, it was not until Gentry’s groundbreaking work from 2009 that we were able to construct the first fully homomorphic encryption (FHE) scheme, a scheme which in theory can evaluate an arbitrary circuit on encrypted data. His construction is based on ideal lattices and is deemed to be too impractical ever to be used, but it led the way to construct more efficient schemes in many following publications”
• “Once this noise becomes too large and exceeds a threshold, the ciphertext cannot be decrypted correctly anymore. We call such a scheme a somewhat homomorphic encryption scheme (SHE), a scheme that allows evaluating an arbitrary circuit over encrypted data up to a certain depth.”
• “In his work, Gentry introduced the novel bootstrapping technique, a procedure that reduces the noise in a ciphertext and can turn a (bootstrappable) SHE scheme into an FHE scheme.”
• “In many practical applications it is, therefore, faster to omit bootstrapping and choose a SHE scheme with large enough parameters to evaluate the desired circuit.”
• “two different types of PIR: information theoretic PIR (IT-PIR) protocols which rely on multiple, non-colluding servers to ensure privacy; and computational PIR (CPIR) where a single server manages the database and encryption is used to hide the query.”
• “An established technique to achieve ε-differential privacy is the Laplace mechanism, i.e., to add Laplacian noise to the final result of the computation.”
• “H should not learn the movement pattern of any individual. More concretely, H is not allowed to query the location data for less than W different people. W has to be chosen in such a way that the data aggregation provides anonymity and its exact value will highly depend on the actual underlying data, which is the reason why we do not give a generic value in this paper.”
• “First, M could try to find the place of residence by assuming people sleep at home.”
• […] “remove those places for the heat map creation.”
• “The computationally most expensive phase in the protocol is the Data Aggregation phase, in which the server multiplies a huge matrix to a homomorphically encrypted input vector.”

summary

The paper implements a nice idea. The usefulness of the use case needs to be discussed with public health experts (how does it help to know that many infected people live in this block of houses?). However, I have been told the entire paper was written in one month, which is quite impressive considering the technical depth in the field of homomorphic encryption.

There are many typos, and to me the main purpose of the protocol in Figure 1 was not comprehensible before talking to the authors. I also didn't understand in which ways ε-differential privacy is desirable, how it can be ensured, or which definition of “vector c is binary” they used before going into the details in section 3.2. Apparently, a binary vector is desirable to prevent leakage. For Figure 2, they used the TeX package cryptocode to illustrate the protocol; to the best of my understanding, it is just a reiteration of Figure 1. On page 13, the paragraph “Note that, as we have already mentioned in Section 3.6” should be moved to the concluding remarks. On page 14, it is unclear what “isolated profiles” are. I didn't go through the details of section 5.
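As I understand it, the core of the protocol is a matrix-vector product between the server's database and a client-supplied binary selection vector, which in the real protocol is homomorphically encrypted so the server never learns the selection. This plaintext sketch (names and the threshold value are made up) only illustrates the aggregation and the minimum-cardinality check W:

```python
W = 2  # minimum number of selected identifiers (anonymity threshold)

def aggregate(database, selection):
    """Sum the rows of `database` picked by the binary `selection` vector.

    In the real protocol the selection vector is encrypted under HE and the
    binarity/cardinality properties are enforced cryptographically; here
    they are plain assertions.
    """
    assert all(bit in (0, 1) for bit in selection), "vector must be binary"
    assert sum(selection) >= W, "fewer than W identifiers selected"
    cols = len(database[0])
    result = [0] * cols
    for bit, row in zip(selection, database):
        if bit:
            result = [r + v for r, v in zip(result, row)]
    return result

# Each row: location counts of one identifier across, say, 3 regions.
db = [[1, 0, 2],
      [0, 3, 1],
      [5, 1, 0]]
assert aggregate(db, [1, 0, 1]) == [6, 1, 2]
```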

### BasicBlocker: Redesigning ISAs to Eliminate Speculativ… §

Title: “BasicBlocker: Redesigning ISAs to Eliminate Speculative-Execution Attacks” by Jan Philipp Thoma, Jakob Feldtkeller, Markus Krausz, Tim Güneysu, Daniel J. Bernstein [url] [dblp]
Published in 2020-07 at and I read it in 2020-08
Abstract: Recent research has revealed an ever-growing class of microarchitectural attacks that exploit speculative execution, a standard feature in modern processors. Proposed and deployed countermeasures involve a variety of compiler updates, ﬁrmware updates, and hardware updates. None of the deployed countermeasures have convincing security arguments, and many of them have already been broken.

Comment: Preprint

quotes

• “The obvious way to simplify the analysis of speculative-execution attacks is to eliminate speculative execution. This is normally dismissed as being unacceptably expensive, but the underlying cost analyses consider only software written for current instruction-set architectures, so they do not rule out the possibility of a new instruction-set architecture providing acceptable performance without speculative execution.”
• “The IBM Stretch computer in 1961 automatically speculated that a conditional branch would not be taken: it began executing instructions after the conditional branch, and rolled the instructions back if it turned out that the conditional branch was taken.”
• “Software analyses in the 1980s such as [12] reported that programs branched every 4–6 instructions.”
• “The penalty for mispredictions grew past 10 cycles. Meanwhile the average number of instructions per cycle grew past two, so the cost of each mispredicted branch was more than 20 instructions.”
• “The P-cycle branch-misprediction cost is the time from early in the pipeline, when instructions are fetched, to late in the pipeline, when a branch instruction computes the next program counter.”
• “A branch delay slot means that a branch takes effect only after the next instruction.”
• “The recent paper [49] introduces a RISC-V extension to flush microarchitectural state and shows that the extension stops several covert channels.”
• “Another approach to ISA modifications against transient-execution attacks is to explicitly tell the CPU which values are secret, and to limit the microarchitectural operations that can be carried out on secret values [35, 50, 51].”
• “The standard separation of fetch from decode also means that every instruction is being speculatively fetched.”
• “Program code can be divided into basic blocks. To do so, all possible control flow paths are mapped into a directed graph called Control Flow Graph (CFG) so that each edge of the CFG represents a control flow from one instruction to the next. If two vertices A and B are connected by a single edge (A → B) with A having only a single outbound edge and B having only a single inbound edge, the vertices are merged if the two instructions are sequential in memory.”
• “The microarchitectural state of a CPU is affected only by instructions that will eventually be retired.”
• “The most important implication of this goal is that the CPU must abandon any speculative behavior. This eliminates a major source of complexity inside the security analysis of modern CPUs.”
• “Within this basic block, the CPU is allowed to fast-fetch instructions, knowing that upcoming instructions can be found in a sequential order in memory and will definitely be executed. That is, since per definition, within the basic block, no control flow changes can occur. The instruction further provides information whether the basic block is sequential, stating that the control flow continues with the next basic block in the sequence in memory. If a basic block does not contain a control-flow instruction it is therefore sequential.”
• “We also modified the behavior of existing control-flow instructions, such as bne, j and jlre.”
• “If the current basic block does not contain a control-flow instruction, which is indicated by the sequential flag of the bb instruction, the CPU can fetch the next bb instruction directly.”
• “A processor supporting the bb instruction is required to have an instruction counter IC, a target register T , a branch flag B, and an exception flag E, all initialized to 0 on processor reset and used only as defined below.”
• “We can seamlessly support hardware loop counters in our design concept. One new instruction (lcnt) is necessary to store the number of loop iterations into a dedicated register.”
• “BasicBlocker can also be used as a form of coarse-grained CFI [1,2], as it only allows control flow changes to beginnings of basic blocks, indicated by a bb instruction. This reduces the prospects of success for (JIT-)ROP [36, 37] attacks, as the variety of potential gadgets is reduced.”
• “One could easily extend our design to larger, more complex CPUs.”
• “The more parallel pipelines a CPU has, the more important it gets to build software with large basic blocks, as small basic blocks with branch dependent instructions cause a higher performance loss on a multi-issue CPU compared to a single-issue CPU.”
• “It would also be possible to integrate our solution into a secure enclave by providing a modified fetch unit for the enclave.”
• “The bb instruction does not fit into any of the existing RISC-V instruction types so that we define a new instruction type to achieve an optimal utilization of the instruction bits (Figure 6).”
• “We used a configuration with five stages (IF, ID, EX, MEM, WB) and 4096 byte, one-way instruction- and data cache.”
• “Our compiler is based on the LLVM [28] Compiler Framework version 10.0.0, where we modified the RISC-V backend by introducing our ISA extension and inserting new compilation passes at the very end of the compilation pipeline to not interfere with other passes that do not support our new instructions.”
• “Linker relaxation, however, is one optimization that could reduce the number of instructions by substituting calls with a short jumping distance by a single jump instruction instead of two instructions (auipc and jalr).”
• “Highly optimized code, such as cryptographic libraries, are barely affected by branch prediction at all.”
• “Throughout our benchmarks, the average code size overhead was 17%.”
• “In some cases, BasicBlocker even outperforms sophisticated branch predictors.”

summary

Quite good paper.

“Preventing speculative execution exploits” conjured up more sophisticated expectations on my part, but in general the idea is legitimate and the research was done properly. Essentially, the additional bb instruction annotates how many of the following instructions do not contain control flow instructions, which allows a processor to prefetch them without any speculative considerations. An additional lcnt instruction for RISC-V handles loops.

In the reading group it was criticized that the formal definitions were cumbersome and inappropriate, but I consider it a strong suit that the idea was not only presented, but formalized. I think the negative aspects are that some statements are not properly attributed; Observation 1, for example, is not a contribution of this paper, but prior work. Furthermore, statements like “One could easily extend our design to larger, more complex CPUs” seem unjustified. Intel Skylake CPUs are built with 19-stage pipelines, which certainly show different performance metrics, so more research into the relation between the number of pipeline stages and performance is required. A new instruction type for RISC-V is also a non-trivial modification in practice. The corresponding code is not yet published, but the paper was released just one month prior to reading. Another weak point is the selling, which focuses on the good performance of BasicBlocker in the nettle-aes and nettle-sha benchmarks. As cryptographic applications, these obviously do not rely on branching except for “for loops”, which is completely non-representative, yet statements like “In some cases, BasicBlocker even outperforms sophisticated branch predictors” can be found. Finally, Claudio pointed out very well that JIT compilation cannot be implemented with BasicBlocker, since the number of instructions of a block is most likely not known ahead of time.

Overall, a quite good paper, but less aggressive selling would have increased reputation from my perspective.

• A branch delay slot executes the instruction placed after a jump regardless of the branch outcome; the compiler fills the slot with an instruction that has no data dependency on the branch. This concept is older than out-of-order execution.
• Definitions (page 4):
• Definition 1: Microarchitectural Effects
• Definition 2: Retired Instructions
• Definition 3: Instruction Stream
• Definition 4: Control Flow Instructions
• Definition 5: Basic Block
• Definition 6: t-security
• Definitions (page 5 & 7):
• Definition 7: Hardware secure processor
• Definition 8: BB Instruction
• Definition 9: BB Delayed Branches
• Definition 10: BB Exceptions
• Definition 11: BB Required
• Definition 12: BB Prefetching
• “Obviously, BasicBlocker slightly increases the code size as every basic block is extended by an additional instruction” with “average code size overhead was 17%”
• Gem5 project: “The gem5 simulator is a modular platform for computer-system architecture research, encompassing system-level architecture as well as processor microarchitecture. gem5 is a community led project with an open governance model”
• https://en.wikichip.org/wiki/WikiChip

### Cyclone: A safe dialect of C §

Title: “Cyclone: A safe dialect of C” by Greg Morrisett, James Cheney, Dan Grossman, Michael Hicks, Yanling Wang [url] [dblp]
Published in 2002 at USENIX 2002 and I read it in 2020-02
Abstract: Cyclone is a safe dialect of C. It has been designed from the ground up to prevent the buffer overflows, format string attacks, and memory management errors that are common in C programs, while retaining C’s syntax and semantics. This paper examines safety violations enabled by C’s design, and shows how Cyclone avoids them, without giving up C’s hallmark control over low-level details such as data representation and memory management.

Ambiguity

“Arrays and strings are converted to ?-pointers as necessary (automatically by the compiler).”

→ When exactly?

Example

“We don’t consider it an error if non-pointers are uninitialized. For example, if you declare a local array of non-pointers, you can use it without initializing the elements:

char buf[64];       // contains garbage ..
sprintf(buf,"a");   // .. but no err here
char c = buf[20]; // .. or even here

This is common in C code; since these array accesses are in-bounds, we allow them.”

Example

“However, this technique will not catch even the following simple variation:

char *itoa(int i) {
  char buf[20];
  char *z;
  sprintf(buf,"%d",i);
  z = buf;
  return z;
}

Here, the address of buf is stored in the variable z, and then z is returned. This passes gcc -Wall without complaint.”

Quotes

• “Cyclone is a safe dialect of C. It has been designed from the ground up to prevent the buffer overflows, format string attacks, and memory management errors that are common in C programs, while retaining C’s syntax and semantics. This paper examines safety violations enabled by C’s design, and shows how Cyclone avoids them, without giving up C’s hallmark control over low-level details such as data representation and memory management.”
• “Every introductory C programming course warns against them and teaches techniques to avoid them, yet they continue to be announced in security bulletins every week. There are reasons for this that are more fundamental than poor training:
• One cause of buffer overflows in C is bad pointer arithmetic, and arithmetic is tricky. […]
• C uses NUL-terminated strings. […]
• Out-of-bounds pointers are commonplace in C. […]”
• “Our goal is to design Cyclone so that it has the safety guarantee of Java (no valid program can commit a safety violation) while keeping C’s syntax, types, semantics, and idioms intact.”
• “We must reject some safe programs, because it is impossible to implement an analysis that perfectly separates the safe programs from the unsafe programs.”
• Table 1: “Restrictions imposed by Cyclone to preserve safety”
• NULL checks are inserted to prevent segmentation faults
• Pointer arithmetic is restricted
• Pointers must be initialized before use
• Dangling pointers are prevented through region analysis and limitations on free
• Only “safe” casts and unions are allowed
• goto into scopes is disallowed
• switch labels in different scopes are disallowed
• Pointer-returning functions must execute return
• setjmp and longjmp are not supported
• Table 2: “Extensions provided by Cyclone to safely regain C programming idioms”
• Never-NULL pointers do not require NULL checks
• “Fat” pointers support pointer arithmetic with run-time bounds checking
• Growable regions support a form of safe manual memory management
• Tagged unions support type-varying arguments
• Injections help automate the use of tagged unions for programmers
• Polymorphism replaces some uses of void *
• Varargs are implemented with fat pointers
• Exceptions replace some uses of setjmp and longjmp
• “If you call getc(NULL), what happens? The C standard gives no definitive answer.”
• “Cyclone’s region analysis is intraprocedural — it is not a whole-program analysis.”
• “Here ‘r is a region variable.” (c.f. rust's notation)
• “Obviously, programmers still need a way to reclaim heap-allocated data. We provide two ways. First, the programmer can use an optional garbage collector. This is very helpful in getting existing C programs to port to Cyclone without many changes. However, in many cases it constitutes an unacceptable loss of control.”
• “A goto that does not enter a scope is safe, and is allowed in Cyclone. We apply the same analysis to switch statements, which suffer from a similar vulnerability in C.”
• “The Cyclone compiler is implemented in approximately 35,000 lines of Cyclone. It consists of a parser, a static analysis phase, and a simple translator to C. We use gcc as a back end and have also experimented with using Microsoft Visual C++.”
• “When a user compiles with garbage collection enabled, we use the Boehm-Demers-Weiser conservative garbage collector as an off-the-shelf component.”
• “We achieve near-zero overhead for I/O bound applications such as the web server and the http programs, but there is a considerable overhead for computationally-intensive benchmarks;”
• “Cyclone’s representation of fat pointers turned out to be another important overhead. We represent fat pointers with three words: the base address, the bounds address, and the current pointer location (essentially the same representation used by Mc-Gary’s bounded pointers [26]).”
• “Good code generation can make a big difference: we found that using gcc’s -march=i686 flag increased the speed of programs making heavy use of fat pointers (such as cfrac and grobner) by as much as a factor of two, because it causes gcc to use a more efficient implementation of block copy.”
• “We found array bounds violations in three benchmarks when we ported them from C to Cyclone: mini_httpd, grobner, and tile. This was a surprise, since at least one (grobner) dates back to the mid 1980s.”
• “Cyclone began as an offshoot of the Typed Assembly Language (TAL) project”
• “In C, a switch case by default falls through to the next case, unless there is an explicit break. This is exactly the opposite of what it should be: most cases do not fall through, and, moreover, when a case does fall through, it is probably a bug. Therefore, we added an explicit fallthru statement,” […] “Our decision to “correct” C’s mistake was wrong. It made porting error-prone because we had to examine every switch statement to look for intentional fall throughs, and add a fallthru statement.”
• “There is an enormous body of research on making C safer. Most techniques can be grouped into one of the following strategies:”
1. Static analysis. […]
2. Inserting run-time checks. […]
3. Combining static analysis and run-time checks. […]

Region blocks

“Therefore, Cyclone provides a feature called growable regions. The following code declares a growable region, does some allocation into the region, and deallocates the region:

region h {
  int *x = rmalloc(h,sizeof(int));
  int ?y = rnew(h) { 1, 2, 3 };
  char ?z = rprintf(h,"hello");
}

The code uses a region block to start a new, growable region that lives on the heap. The region is deallocated on exit from the block (without an explicit free).”

Summary

Great paper with a pragmatic approach, derived from the Typed Assembly Language project. Definitely worth a read. Essential for everyone interested in the Rust programming language, as this project inspired many ideas related to proper memory management and lifetimes. The implementation needs to be compiled on your own, but it is not maintained anymore anyway. Furthermore, there are more papers following the progress of the project, and they also introduced more drastic changes, which discard the label “pragmatic”.

Tagged unions

“We solve this in Cyclone in two steps. First, we add tagged unions to the language:”

tunion t {
  Int(int);
  Str(char ?);
};

[…]

void pr(tunion t x) {
  switch (x) {
    case &Int(i): printf("%d",i); break;
    case &Str(s): printf("%s",s); break;
  }
}

“The printf function itself accesses the tagged arguments through a fat pointer (Cyclone’s varargs are bounds checked)”

### Everything Old is New Again: Binary Security of WebAss… §

Title: “Everything Old is New Again: Binary Security of WebAssembly” by Daniel Lehmann, Johannes Kinder, Michael Pradel [url] [dblp]
Published in 2020 at USENIX Security 2020 and I read it in 2020-07
Abstract: WebAssembly is an increasingly popular compilation target designed to run code in browsers and on other platforms safely and securely, by strictly separating code and data, enforcing types, and limiting indirect control ﬂow. Still, vulnerabilities in memory-unsafe source languages can translate to vulnerabilities in WebAssembly binaries. In this paper, we analyze to what extent vulnerabilities are exploitable in WebAssembly binaries, and how this compares to native code. We ﬁnd that many classic vulnerabilities which, due to common mitigations, are no longer exploitable in native binaries, are completely exposed in WebAssembly. Moreover, WebAssembly enables unique attacks, such as overwriting supposedly constant data or manipulating the heap using a stack overﬂow. We present a set of attack primitives that enable an attacker (i) to write arbitrary memory, (ii) to overwrite sensitive data, and (iii) to trigger unexpected behavior by diverting control ﬂow or manipulating the host environment. We provide a set of vulnerable proof-of-concept applications along with complete end-to-end exploits, which cover three WebAssembly platforms. An empirical risk assessment on real-world binaries and SPEC CPU programs compiled to WebAssembly shows that our attack primitives are likely to be feasible in practice. Overall, our ﬁndings show a perhaps surprising lack of binary security in WebAssembly. We discuss potential protection mechanisms to mitigate the resulting risks.

quotes

• “We find that many classic vulnerabilities which, due to common mitigations, are no longer exploitable in native binaries, are completely exposed in WebAssembly. Moreover, WebAssembly enables unique attacks, such as overwriting supposedly constant data or manipulating the heap using a stack overflow.”
• “WebAssembly is an increasingly popular bytecode language that offers a compact and portable representation, fast execution, and a low-level memory model. Announced in 2015 and implemented by all major browsers in 2017, WebAssembly is supported by 92% of all global browser installations as of June 2020. The language is designed as a compilation target, and several widely used compilers exist, e.g., Emscripten for C and C++, or the Rust compiler, both based on LLVM”
• “There are two main aspects to the security of the WebAs-sembly ecosystem: (i) host security, the effectiveness of the runtime environment in protecting the host system against malicious WebAssembly code; and (ii) binary security, the effectiveness of the built-in fault isolation mechanisms in preventing exploitation of otherwise benign WebAssembly code”
• “Comparing the exploitability of WebAssembly binaries with native binaries, e.g., on x86, shows that WebAssembly re-enables several formerly defeated attacks because it lacks modern mitigations. One example are stack-based buffer overflows, which are effective again because WebAssembly binaries do not deploy stack canaries.”
• “The original WebAssembly paper addresses this question briefly by saying that “at worst, a buggy or exploited WebAssembly program can make a mess of the data in its own memory”
• “Regarding data-based attacks, we find that one third of all functions make use of the unmanaged (and unprotected) stack in linear memory. Regarding control-flow attacks, we find that every second function can be reached from indirect calls that take their target directly from linear memory. We also compare WebAssembly’s type-checking of indirect calls with native control-flow integrity defenses.”
• “There are four primitive types: 32 and 64 bit integers (i32 , i64) and single and double precision floats (f32 , f64). More complex types, such as arrays, records, or designated pointers do not exist.”
• “Branches can only jump to the end of surrounding blocks, and only inside the current function. Multi-way branches can only target blocks that are statically designated in a branch table. Unrestricted gotos or jumps to arbitrary addresses are not possible. In particular, one cannot execute data in memory as bytecode instructions.”
• “The call_indirect instruction on the left pops a value from the stack, which it uses to index into the so called table section. Table entries map this index to a function, which is subsequently called. Thus, a function can only be indirectly called if it is present in the table.”
• “In contrast to other byte-code languages, WebAssembly does not provide managed memory or garbage collection. Instead, the so called linear memory is simply a single, global array of bytes.”
• “One of the most basic protection mechanisms in native programs is virtual memory with unmapped pages. A read or write to an unmapped page triggers a page fault and terminates the program, hence an attacker must avoid writing to such addresses. WebAssembly’s linear memory, on the other hand, is a single, contiguous memory space without any holes, so every pointer ∈ [0, max_mem] is valid. […] This is a fundamental limitation of linear memory with severe consequences. Since one cannot install guard pages between static data, the unmanaged stack, and the heap, overflows in one section can silently corrupt data in adjacent sections.”
• “In WebAssembly, linear memory is non-executable by design, as it cannot be jumped to.”
• “an overflow while writing into a local variable on the unmanaged stack, e.g., buffer, may overwrite other local variables in the same and even in other stack frames upwards in the stack”
• “Because in WebAssembly no default allocator is provided by the host environment, compilers include a memory allocator as part of the compiled program”
• “While standard allocators, such as dlmalloc, have been hardened against a variety of metadata corruption attacks, simplified and lightweight allocators are often vulnerable to classic attacks. We find both emmalloc and wee_alloc to be vulnerable to metadata corruption attacks, which we illustrate for a version of emmalloc in the following.”
• “As WebAssembly has no way of making data immutable in linear memory, an arbitrary write primitive can change the value of any non-scalar constant in the program, including, e.g., all string literals.”
• “Version 1.6.35 of libpng suffers from a known buffer overflow vulnerability (CVE-2018-14550 [3]), which can be exploited when converting a PNM file to a PNG file. When the library is compiled to native code with modern compilers on standard settings, stack canaries prevent this vulnerability from being exploited. In WebAssembly, the vulnerability can be exploited unhindered by any mitigations.”
• “While exec and the log_* functions have different C++ types, all three functions have identical types on the WebAssembly level (Figure 8b). The reason is that both integers and pointers are represented as i32 types in WebAssembly, i.e., the redirected call passes WebAssembly’s type check.”
• “To the best of our knowledge, it is the first security analysis tool for WebAssembly binaries. The analysis is written in Rust”
• “For example, with ten nested calls (assuming a uniform distribution of functions), there would be some data on the unmanaged stack with 1 − (1 − 0.33)^10 ≈ 98.2% probability.”
• “Averaged over all 26 programs, 9.8% of all call instructions are indirect,”
• “[…] how many functions are type-compatible with at least one call_indirect instruction and present in the table section. The percentage of indirectly callable functions ranges from 5% to 77.3%, with on average 49.2% of all functions in the program corpus.”
• “WebAssembly’s type checking of indirect calls can be seen as a form of control-flow integrity (CFI) for forward edges. Backward edges, i.e., returns, are protected due to being managed by the VM and offer security conceptually similar to shadow stack solutions.”
• multiple memories proposal: Andreas Rossberg. “Multiple per-module memories for Wasm”. https://github.com/WebAssembly/multi-memory, 2019.
• reference types proposal: Andreas Rossberg. “Proposal for adding basic reference types”. https://github.com/WebAssembly/reference-types, 2019.
• “Examples that would benefit Web-Assembly compilers are FORTIFY_SOURCE-like code rewriting, stack canaries, CFI defenses, and safe unlinking in memory allocators. In particular for stack canaries and rewriting commonly exploited C string functions, we believe there are no principled hindrances to deployment.”
• “Developers of WebAssembly applications can reduce the risk by using as little code in “unsafe” languages, such as C, as possible.”
• “The language has a formally defined type system shown to be sound”
• via [37] = “Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code”: “We find that the mean slowdown of WebAssembly vs. native across SPEC bench-marks is 1.55×for Chrome and 1.45×for Firefox, with peak slowdowns of 2.5×in Chrome and 2.08×in Firefox”

summary

Wonderful paper. Firstly, a security analysis was necessary and I think they covered an appropriate amount of attack vectors. Secondly it is, of course, sad to see that many old vulnerabilities still work on a new platform.

I am unconditionally in favor of true read-only memory in WebAssembly. As David points out, this can also be enforced by a compiler through appropriate runtime checks. However, David also specified a criterion: iff a memory attack could lead to an escape from the WASM sandbox, then it is an issue of the WebAssembly sandbox and should be prevented at this level.

I keep wondering about stack canaries and guard pages. Maybe it was a trade-off decision between security and performance, but I am not 100% convinced by it. Thirdly, the paper is well-structured and gives sufficient data to support its arguments. The attacks were okay, the introduction to WebAssembly was superb, and justifying the claims regarding indirect calls with quantitative data in section 6 was outstanding. I think everyone in IT security can easily follow it. I love it!

WebAssembly architecture:

• not possible
  • overwriting string literals in supposedly constant memory is not possible
  • In WebAssembly, linear memory is non-executable by design, as it cannot be jumped to
  • WebAssembly’s linear memory is a single, contiguous memory space without any holes, so every pointer ∈ [0, max_mem] is valid.
• measures, restrictions
  • program memory, data structures of the underlying VM, and the stack are separated
  • binaries are easily type-checked
  • only jumps to designated code locations are possible
  • WebAssembly has two mechanisms that limit an attacker’s ability to redirect indirect calls. First, not all functions defined in or exported into a WebAssembly binary appear in the table for indirect calls, but only those that may be subject to an indirect call. Second, all calls, both direct and indirect, are type checked.
• missing
  • In WebAssembly, there is no ASLR.
  • stack canaries are not used
  • buffer and stack overflows are thus very powerful attack primitives in WebAssembly
  • an overflow while writing into a local variable on the unmanaged stack, e.g., buffer, may overwrite other local variables in the same and even in other stack frames upwards in the stack
  • As WebAssembly has no way of making data immutable in linear memory, an arbitrary write primitive can change the value of any non-scalar constant in the program, including, e.g., all string literals.

toc

• Introduction
• Background on WebAssembly
• Security Analysis of Linear Memory
  • Managed vs. Unmanaged Data
  • Memory Layout
  • Memory Protections
• Attack Primitives
  • Obtaining a Write Primitive
    • Stack-based Buffer Overflow
    • Stack Overflow
    • Heap Metadata Corruption
  • Overwriting Data
    • Overwriting Stack Data
    • Overwriting Heap Data
    • Overwriting “Constant” Data
  • Triggering Unexpected Behavior
    • Redirecting Indirect Calls
    • Code Injection into Host Environment
    • Application-specific Data Overwrite
• End-to-End Attacks
  • Cross-Site Scripting in Browsers
  • Remote Code Execution in Node.js
  • Arbitrary File Write in Stand-alone VM
• Quantitative Evaluation
  • Experimental Setup and Analysis Process
  • Measuring Unmanaged Stack Usage
  • Measuring Indirect Calls and Targets
  • Comparing with Existing CFI Policies
• Discussion of Mitigations
  • WebAssembly Language
  • Compilers and Tooling
  • Application and Library Developers
• Related Work
• Conclusion

### FastSpec: Scalable Generation and Detection of Spectre… §

Title: “FastSpec: Scalable Generation and Detection of Spectre Gadgets Using Neural Embeddings” by M. Caner Tol, Koray Yurtseven, Berk Gulmezoglu, Berk Sunar [url] [dblp]
Published in 2020 and I read it in 2020-07
Abstract: Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules which fall short in covering subtle variations of known Spectre gadgets as well as demand a huge amount of time to analyze each conditional branch in software. Since it requires arduous effort to craft new gadgets manually, the evaluations of detection mechanisms are based only on a handful of these gadgets.

ambiguity

• DNN (page 3) is undefined
• Equation 1: Notation X~Pdata is unknown to me
• “which alters the flags”. What is a flag? (Step 4, page 5)

quotes

• “Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules which fall short in covering subtle variations of known Spectre gadgets as well as demand a huge amount of time to analyze each conditional branch in software.”
• “In this work, we employ deep learning techniques for automated generation and detection of Spectre gadgets.”
• “Using mutational fuzzing, we produce a data set with more than 1 million Spectre-V1 gadgets which is the largest Spectre gadget data set built to date.”
• “we conduct the first empirical usability study of Generative Adversarial Networks (GANs) for creating assembly code without any human interaction.”
• “While the initial variants of Spectre [26] exploit conditional and indirect branches, Koruyeh et al. [27] proposes another Spectre variant obtained by poisoning the entries in Return-Stack-Buffers (RSBs).”
• “The proposed detection tools mostly implement taint analysis [66] and symbolic execution [15,65] to identify potential gadgets in benign applications. However, the methods proposed so far have two shortcomings: (1) the scarcity of Spectre gadgets prevents the comprehensive evaluation of the tools, (2) the scanning time increases drastically with increasing binary file sizes.”
• “BERT [9] was proposed by the Google AI team to learn the relations between different words in a sentence by applying a self-attention mechanism [63].”
• “In summary,
• We extend 15 base Spectre examples to 1 million gadgets via mutational fuzzing,
• We propose SpectreGAN which leverages conditional GANs to create new Spectre gadgets by learning the distribution of existing Spectre gadgets in a scalable way,
• We show that both mutational fuzzing and SpectreGAN create diverse and novel gadgets which are not detected by oo7 and Spectector tools
• We introduce FastSpec which is based on supervised neural word embeddings to identify the potential gadgets in benign applications orders of magnitude faster than rule-based methods.”
• “Hence, Spectre-type attacks are not completely resolved yet and finding an efficient countermeasure is still an open problem.”
• “There are a number of known Spectre variants: Spectre-V1 (bounds check bypass), Spectre-V2 (branch target injection), Spectre-RSB [27, 34] (return stack buffer speculation), and Spectre-V4 [19] (speculative store bypass).”
• “The heavy part of the training is handled by processing unlabeled data in an unsupervised manner. The unsupervised phase is called pre-training which consists of masked language model training and next sentence prediction procedures.”
• “Defenses against Spectre. There are various detection methods for speculative execution attacks. Taint analysis is used in oo7 [66] software tool to detect leakages. As an alternative way, the taint analysis is implemented in the hardware context to stop the speculative execution for secret dependent data [53, 71]. The second method relies on symbolic execution analysis. Spectector [15] symbolically executes the programs where the conditional branches are treated as mispredicted. Furthermore, SpecuSym [18] and KleeSpectre [65] aim to model cache usage with symbolic execution to detect speculative interference which is based on Klee symbolic execution engine. Following a different approach, Speculator [35] collects performance counter values to detect mispredicted branches and speculative execution domain. Finally, Specfuzz [44] uses a fuzzing strategy to analyze the control flow paths which are most likely vulnerable against speculative execution attacks.”
• “Our mutation operator is the insertion of random Assembly instructions with random operands.”
• “The overall success rate of fuzzing technique is around 5% out of compiled gadgets.”
• “the input assembly functions are converted to a sequence of tokens T' = {x'1, … x'N} where each token represents an instruction, register, parenthesis, comma, intermediate value or label. SpectreGAN is conditionally trained with each sequence of tokens where a masking vector m = (m1, …, mN) with elements mt ∈ {0, 1} is generated.”
• “The training procedure consists of two main phases namely, pre-training and adversarial training.”
• “We keep commas, parenthesis, immediate values, labels, instruction and register names as separate tokens.”
• “The tokenization process converts the instruction "movq (%rax), %rdx" into the list ["movq", "(", "%rax", ")", ",", "%rdx"] where each element of the list is a token x't. Hence, each token list T' = {x'1, …, x'N} represents an assembly function in the data set.”
• “SpectreGAN is trained with a batch size of 100 on NVIDIA GeForce GTX 1080 Ti until the validation perplexity converges in Figure 2. The pre-training lasts about 50 hours while the adversarial training phase takes around 30 hours.” (80 hours = 3d 8h)
• “SpectreGAN is trained with a masking rate of 0.3, the success rate of gadgets increases up to 72%. Interestingly, the success rate drops for other masking rates, which also demonstrates the importance of masking rate choice.”
• “To illustrate an example of the generated samples, we fed the gadget in Listing 2 to SpectreGAN and generated a new gadget in Listing 3.”
• “Mutational fuzzing and SpectreGAN generated approximately 1.2 million gadgets in total.”
• “The quality of generated texts is mostly evaluated by analyzing the number of unique n-grams.”
• “However, it is challenging to examine the effects of instructions in the transient domain since they are not visible in the architectural state. After we carefully analyzed the performance counters for the Haswell architecture, we determined that two counters namely, UOPS_ISSUED:ANY and UOPS_RETIRED:ANY give an idea to what extent the speculative window is altered. UOPS_ISSUED:ANY counter is incremented every time a μop is issued which counts both speculative and non-speculative μops. On the other hand, UOPS_RETIRED:ANY counter only counts the executed and committed μops which automatically excludes speculatively executed μops.”
• “we have selected 100,000 samples from each gadget example uniformly random due to the immense time consumption of oo7 (150 hours for 100K gadgets) which achieves 94% detection rate.”
• “Interestingly, specific gadget types from both fuzzing and SpectreGAN are not caught by oo7. When a gadget contains cmov or xchg or set instruction and its variants, it is not identified as a Spectre gadget.”
• “Listing 5: XCHG gadget: When a past value controlled by the attacker is used in the Spectre gadget, oo7 cannot detect the XCHG gadget”
• “For each Assembly file, Spectector is adjusted to track 25 symbolic paths of at most 5000 instructions each, with a global timeout of 30 minutes. The remaining parameters are kept as default.”
• “23.75% of the gadgets are not detected by Spectector. We observed that 96% of the undetected gadgets contain unsupported instruction/register which is the indicator of an implementation issue in Spectector.”
• “After we examined the undetected gadgets, we observed that if the gadgets include either sfence/mfence/lfence or 8-bit registers (%al, %bl, %cl, %dl), they are likely to bypass Spectector.”
• “Differently, the mask positions are selected from 15% of the training sequence and the selected positions are masked and replaced with <MASK> token with 0.80 probability, replaced with a random token with 0.10 probability or kept as the same token with 0.10 probability.”
• “Since it is not possible to visualize the high dimensional embedding vectors, we leverage the t-SNE algorithm [33] which maps the embedding vectors to a three-dimensional space as shown in Figure 4.”
• “The output probabilities of the softmax layer are the predictions on the assembly code sequence.”
• “We combine the assembly data set that was generated in Section 4 and the disassembled Linux libraries to train FastSpec. Although it is possible that Linux libraries contain Spectre-V1 gadgets, we assume that the total number of hidden Spectre gadgets are negligible comparing the total size of the data set.”
• “In total, a dataset of 107 million lines of assembly code is collected which consists of 370 million tokens after the pre-processing.”
• “The pre-training phase takes approximately 6 hours with a sequence length of 50. We further train the positional embeddings for 1 hour with a sequence length of 250. The fine-tuning takes only 20 minutes on the pre-trained model to achieve classifying all types of samples in the test data set correctly.”
• “In the evaluation of FastSpec, we obtained 1.3 million true positives and 110 false positives (99.9% precision rate) in the test data set which demonstrates the high performance of FastSpec. We assume that the false positives are Spectre-like gadgets in Linux libraries, which needs to be explored deeply in the future work. Moreover, we only have 55 false negatives (99.9% recall rate) which yields 0.99 F-1 score on the test data set.”
• “The processing time of FastSpec is independent of the number of branches whereas for Spectector and oo7 the analysis time increases drastically.”
• “Consequently, FastSpec is faster than oo7 and Spectector 455 times and 75 times on average, respectively.”
• “The total number of tokens is 203,055 while the analysis time is around 17 minutes.”
• “This work for the first time proposed NLP inspired approaches for Spectre gadget generation and detection.”
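The tokenization described in the quotes above can be sketched with a small regex. This is my own illustrative sketch, not FastSpec's code; the regex and function name are assumptions that merely reproduce the paper's example.

```python
import re

# Hypothetical tokenizer for AT&T-syntax assembly: keeps instruction
# names, registers, immediates, parentheses, commas and labels as
# separate tokens, as described in the quoted tokenization step.
TOKEN_RE = re.compile(r"%\w+|\$-?\w+|[(),]|[.\w]+:?")

def tokenize(line):
    return TOKEN_RE.findall(line)

# tokenize("movq (%rax), %rdx") yields
# ["movq", "(", "%rax", ")", ",", "%rdx"]
```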

summary

Very advanced paper. Perfect combination of Machine Learning technology with microarchitectural attack work. Shows a huge effort and nice considerations regarding Generative Adversarial Networks. However, I could not understand all technical aspects of the machine learning part.

• Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules. Current shortcomings are: (1) the scarcity of Spectre gadgets prevents the comprehensive evaluation of the tools, (2) the scanning time increases drastically with increasing binary file sizes
• Approach:
1. Mutational fuzzing is used to expand Kocher's 15 + Spectector's 2 Spectre gadgets to more than 1 million
2. A Generative Adversarial Network (SpectreGAN; based on MaskGAN) is used to generate Assembly
3. FastSpec (based on BERT by Google) takes Assembly and determines whether some binary contains a Spectre gadget
• They achieve 455× the performance of oo7 and 75× the performance of Spectector
• On the combined data set (generated gadgets plus disassembled Linux libraries, 107 million lines of ASM code in total), FastSpec obtained 1.3 million true positives and 110 false positives
• 379 matches were found in the OpenSSL 1.1.1g library
• On GitHub, there is a 390 MB tar.gz archive (split up). Decompressed, it has a size of 6.7 GB. 972 MB seem to be 239 025 Spectre test gadget files in ASM format

The masking rate (page 6) seems to be the percentage of hidden tokens during the training phase. Figure 4 is a little bit awkward and a little bit random. FastSpec/scan.sh seems to show how FastSpec was called to evaluate OpenSSL. And commands.txt tries to explain it somehow.
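The 15% / 80-10-10 masking scheme quoted earlier can be sketched as follows. This is a generic BERT-style sketch under my own naming assumptions, not the paper's training code:

```python
import random

def mask_tokens(tokens, vocab, rate=0.15, rng=random):
    """Select `rate` of the positions; replace each selected token with
    <MASK> (p=0.8), a random vocabulary token (p=0.1), or keep it
    unchanged (p=0.1) -- the scheme described in the quotes above."""
    out = list(tokens)
    for i in range(len(out)):
        if rng.random() < rate:
            r = rng.random()
            if r < 0.8:
                out[i] = "<MASK>"
            elif r < 0.9:
                out[i] = rng.choice(vocab)
            # else: keep the original token
    return out
```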

typo

• “The critical time period before the flush happens is commonly referred to the transient domain.” → “The critical time period before the flush happens is commonly referred to as the transient domain.”
• “microarchtiectures” → “microarchitectures”
• “An resule of a sample gadget” → “A result of a sample gadget”

### High-speed Instruction-set Coprocessor for Lattice-bas… §

Title: “High-speed Instruction-set Coprocessor for Lattice-based Key Encapsulation Mechanism: Saber in Hardware” by Sujoy Sinha Roy, Andrea Basso [url] [dblp]
Published in 2020 at CHES 2020 and I read it in 2020-11
Abstract: In this paper, we present an instruction set coprocessor architecture for lattice-based cryptography and implement the module lattice-based post-quantum key encapsulation mechanism (KEM) Saber as a case study. To achieve fast computation time, the architecture is fully implemented in hardware, including CCA transformations. Since polynomial multiplication plays a performance-critical role in the module and ideal lattice-based public-key cryptography, a parallel polynomial multiplier architecture is proposed that overcomes memory access bottlenecks and results in a highly parallel yet simple and easy-to-scale design. Such multipliers can compute a full multiplication in 256 cycles, but are designed to target any area/performance trade-offs. Besides optimizing polynomial multiplication, we make important design decisions and perform architectural optimizations to reduce the overall cycle counts as well as improve resource utilization.

open questions

• Figure 3: Why are control signals leading to the building blocks like AddPack? Wouldn't it be simpler (for synchronization) to make control signals part of the communication over the bus?
• “The overhead of memory access during polynomial multiplication plays a critical role in lattice-based cryptography (e.g., [RVM+14], [BMTK+20]) and could hinder or complicate logic-level parallel processing.”
• What kind of role?

quotes

• “In 2012, Göttert et al. [GFS + 12] reported the first hardware implementation of the ideal lattice-based LPR [LPR10] public-key encryption scheme. Their implementation used a massively parallel and unrolled NTT-based polynomial multiplier architecture that consumed millions of LUTs and flip-flops.”
• “A comparison of most round 2 submissions, including Saber, can be found in [DFA + 20].”
• “In [DKSRV18] the authors of Saber proposed a fast polynomial multiplier based on the Toom-Cook algorithm [Knu97] and showed that a non-NTT parameter set does not make their implementation slow.”
• “In practice, more than 50% of the computation time is spent on generating pseudo-random numbers using SHAKE128, thus making it the performance bottleneck.”
• “Dang et al. [DFAG19] compare seven lattice-based key encapsulation methods on HW/SW codesign platforms. They report that out of the seven tested protocols (FrodoKEM, Round5, Saber, NTRU-HPS, NTRU-HRSS, Streamlined NTRU Prime and NTRULPRime), Saber is the fastest protocol in the encapsulation operation and second fastest in the decapsulation operation.”
• “At the same time, implementing such an accelerator is a challenging research topic because it requires making careful design decisions that take into account both algorithmic and architectural alternatives for the internal building blocks and their interactions at the protocol level.”
• “When a HW-only implementation is considered, one design option is to cascade different building blocks in the data-path following the standard data-flow model, if the blocks are required in multiple parallel instances.”
• “To achieve programmability and flexibility, we realize an instruction-set coprocessor architecture for Saber.”
• “In this work, we use the open-source high-speed implementation of the Keccak core that was designed by the Keccak Team [Tea19]. This high-speed implementation of Keccak computes ‘state-permutations’ at a gap of only 28 cycles, thus generating 1,344 bits of pseudo-random string every 28 cycles during the extraction-phase. Furthermore, we observed that one instance of the Keccak core consumes around 5K LUTs and 3K registers, which are respectively nearly 21% and 31% of the overall area in our implementation.”
• via SaberX4 paper: “Our proof-of-concept software implementation of SaberX4 achieves nearly 1.5 times higher throughput at the cost of latency degradation within acceptable margins, compared to the AVX2-optimized non-batched implementation of Saber by its authors.”
• “The use of ‘4-bit signed-magnitude’ representation simplifies the hardware architecture because we can store 16 such samples easily in a 64-bit word of the data memory. Thus, no sample is split across two words.”
• Consider that [-3, 3] requires 3 bits (7 states), [-4, 4] requires 4 bits (9 states) and [-5, 5] requires 4 bits (11 states). With 4 bits we cover all cases and no sample is split across word boundaries. With 3 bits, the samples would be partitioned across bytes as [3 3 2][1 3 3 1][2 3 3]
• “asymptotically the second fastest after the NTT-based polynomial multiplication.”
• Remove “the”, since Schönhage-Strassen and Fürer are two NTT-based multiplication algorithms
• “The hardware implementation of the Toom-Cook polynomial multiplication by Bermudo Mera et al. [BMTK + 20] describes the challenges in implementing the recursive function calls in hardware and proposes efficient architectures.”
• “This nega-cyclic rotation happens since the reduction-polynomial is x^256 + 1.”
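The nega-cyclic wrap-around can be illustrated with a schoolbook multiplier. This is a plain software sketch of the reduction, not the paper's parallel MAC architecture: any product term reaching degree n wraps back with a sign flip because x^n ≡ −1 (mod x^n + 1).

```python
def negacyclic_mul(a, b, q):
    """Schoolbook multiplication of polynomials a, b (coefficient lists
    of length n) modulo x^n + 1 and modulo q: product terms of degree
    >= n wrap around with a sign flip since x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] = (c[i + j] + a[i] * b[j]) % q
            else:
                c[i + j - n] = (c[i + j - n] - a[i] * b[j]) % q
    return c
```

For example, with n = 4, multiplying x by x^3 gives x^4, which reduces to −1.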

summary

A good read with appropriate comparisons with other schemes and implementations. Reasonable arguments are provided for design decisions. The runtime results were expectable and have been met.

• “In this paper, we present an instruction set coprocessor architecture for lattice-based cryptography and implement the module lattice-based post-quantum key encapsulation mechanism (KEM) Saber as a case study.”
• “Since polynomial multiplication plays a performance-critical role in the module and ideal lattice-based public-key cryptography, a parallel polynomial multiplier architecture is proposed that overcomes memory access bottlenecks and results in a highly parallel yet simple and easy-to-scale design.”
• “For the module dimension 3 (security comparable to AES-192), the coprocessor computes CCA key generation, encapsulation, and decapsulation in only 5,453, 6,618 and 8,034 cycles respectively, making it the fastest hardware implementation of Saber to our knowledge.”
• “module dimension 3” corresponds to “NIST PQC category 3”
• “The Vivado project and all HDL source codes are available at https://github.com/sujoyetc/SABER_HW.”
• In Figure 3, the pseudo-random word from data memory is split into chunks of size μ. μ bits are turned into a binomial sample of size 4 bits.
• “There is no conditional branching in the algorithms used and all the building blocks have been designed to be constant-time.”
• “Software benchmarking [KRSS19] of many lattice-based KEM schemes have reported that 50-70% of the overall computation time is spent on executing the Keccak function, thus making it the most performance-critical component.”
• The Saber reference implementation uses code similar to Kyber for binomial sampling
• In Figure 7, the sign bit only depends on one argument, because s(x) is always positive.
• “The instruction-set coprocessor architecture is described in mixed Verilog and VHDL and is compiled using Xilinx Vivado for the target platform Xilinx ZCU102 board that has an UltraScale+ XCZU9EG-2FFVB1156 FPGA.”
• “We tested the functional correctness of the coprocessor on the ZCU102 board and at 250 MHz clock frequency, the CCA-secure key generation, encapsulation and decapsulation operations take 21.8, 26.5, and 32.1 μs respectively.”
• “As the polynomial multiplier architecture is scalable, we implemented a variant of it with MAC units fitting two multipliers. With this higher-performing architecture, the cycle counts for polynomial multiplications nearly halves, …”
• “The overall cycle count for Saber (module dimension 3) is 4,320, 5,231 and 6,461 for key generation, encapsulation, and decapsulation respectively. Thus, the cycle count is reduced by 21%, 21%, and 20% respectively. The increased speed comes with increased area consumption of 1.83× for LUTs and 1.74× for flip-flops (this is both due to the increased area consumption of the MAC units with two multipliers and due to the pipelining).”
• “In Table 5 we compare our flexible architecture with some of the recent hardware implementations of post-quantum KEM schemes.”
• “The SIKE [JF11] scheme relies on the computational hardness of the supersingular isogeny problem. Its most recent hardware implementation by Massolino et al. [MLRB20] targets high speed and even beats Frodo KEM. Our hardware implementation of Saber is around 500 to 600 times faster than their implementation.”
• Cycle count breakdown (table from the paper):

```
Instruction                  Keygen   Encapsulation   Decapsulation
-------------------------------------------------------------------
SHA3-256                        339             585             303
SHA3-512                          0              62              62
SHAKE-128                     1,461           1,403           1,403
Vector sampling                 176             176             176
Polynomial multiplications    2,685           3,592           4,484
  (share of total)              47%             54%             56%
Remaining operations            792             800           1,606
Total cycles                  5,453           6,618           8,034
Total time at 250 MHz       21.8 μs         26.5 μs         32.1 μs
```
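The binomial sampling mentioned above (μ pseudo-random bits turned into one sample, cf. Figure 3) can be sketched as a centered binomial distribution. A generic software sketch, not the paper's hardware sampler; the function name is my own:

```python
def cbd_sample(bits):
    """Turn mu pseudo-random bits into one centered binomial sample in
    [-mu/2, mu/2]: Hamming weight of the first half minus that of the
    second half. For Saber's small mu, the result fits the 4-bit
    signed-magnitude word discussed in the quotes above."""
    half = len(bits) // 2
    return sum(bits[:half]) - sum(bits[half:2 * half])
```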

typo

• “This means that each MAC unit should receive in input”
“This means that each MAC unit should receive an input”
• “Nevertheless, our coprocessor has been tested in the hardware”
“Nevertheless, our coprocessor has been tested in hardware”

### Historical Notes on the Fast Fourier Transform §

Title: “Historical Notes on the Fast Fourier Transform” by James W. Cooley, Peter A. W. Lewis, Peter D. Welch [url] [dblp]
Published in 1967 at IEEE 1967 and I read it in 2020-03
Abstract: The fast Fourier transform algorithm has a long and interesting history that has only recently been appreciated. In this paper, the contributions of many investigators are described and placed in historical perspective.

summary

• Survey paper on the historical developments of the four years before. Nice and straightforward summary, though I didn't go through the details of the prime factor algorithm.
• “The greatest emphasis, however, was on the computational economy that could be derived from the symmetries of the sine and cosine functions”
• “use the periodicity of the sine-cosine functions to obtain a 2N-point Fourier analysis from two N-point analyses with only slightly more than N operations. Going the other way, if the series to be transformed is of length N and N is a power of 2, the series can be split into log₂ N subseries”
• “The number of computations in the resulting successive doubling algorithm is therefore proportional to N log₂ N rather than N²”
• “The fast Fourier transform algorithm of Cooley and Tukey is more general in that it is applicable when N is composite and not necessarily a power of 2”
• “the algorithms are different for the following reasons: 1) in the Thomas algorithm the factors of N must be mutually prime; 2) in the Thomas algorithm the calculation is precisely multidimensional Fourier analysis with no intervening phase shifts or “twiddle factors” as they have been called; and 3) the correspondences between the one-dimensional index and the multidimensional indexes in the two algorithms are quite different.”
• “The factor W_N^{j_0 n_0}, referred to as the “twiddle factor” by Gentleman and Sande, is usually combined with either the W_r^{j_0 n_1} factor in (eq 4) or the W_s^{j_1 n_0} factor in (eq 5)”
• eq 4: A_1(j_0, n_0) = \sum_{n_1=0}^{r-1} A(n_1, n_0) W_r^{j_0 n_1}
• eq 5: X(j_1, j_0) = \sum_{n_0=0}^{s-1} A_1(j_0, n_0) W_s^{j_1 n_0} W_N^{j_0 n_0}
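The successive-doubling idea quoted above (two N-point analyses combined into one 2N-point analysis via twiddle factors, giving N log₂ N operations) can be sketched as a recursive radix-2 FFT. A textbook sketch, not code from the paper:

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # N/2-point transform of even-indexed samples
    odd = fft(x[1::2])   # N/2-point transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t            # butterfly: combine the two halves
        out[k + n // 2] = even[k] - t
    return out
```

Each level of recursion does O(N) work and there are log₂ N levels, which is exactly the N log₂ N count quoted above.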

### Number "Not" Used Once - Key Recovery Fault Attacks on… §

Title: “Number "Not" Used Once - Key Recovery Fault Attacks on LWE Based Lattice Cryptographic Schemes” by Prasanna Ravi, Shivam Bhasin, Anupam Chattopadhyay [url] [dblp]
Published in 2018 at COSADE 2019 and I read it in 2020-05
Abstract: This paper proposes a simple single bit flip fault attack applicable to several LWE (Learning With Errors Problem) based lattice based schemes like KYBER, NEWHOPE, DILITHIUM and FRODO which were submitted as proposals for the NIST call for standardization of post quantum cryptography. We have identified a vulnerability in the usage of nonce, during generation of secret and error components in the key generation procedure. Our fault attack, based on a practical bit flip model (single bit flip to very few bit flips for proposed parameter instantiations) enables us to retrieve the secret key from the public key in a trivial manner. We fault the nonce in order to maliciously use the same nonce to generate both the secret and error components which turns the LWE instance into an exactly defined set of linear equations from which the secret can be trivially solved for using Gaussian elimination.

ambiguity

What is z in the provided interval in section 2.6? Seems to be either an arbitrary integer or z of Algorithm 1 (NewHope)

error

“later converted to the NTT domain using the PolyBitRev function” … no, it is a separate step

inconsistency

• t = As1 + s2 in flow text in section 2.5
• t = A × s1 + s2 in Algorithm 4

quotes

• “Nonces are predominantly used in all the aforementioned schemes in order to reduce the amount of randomness required to generate the secret and error components used in the generation of the LWE instance during key generation.”
• “main focus on the key generation algorithm, as that is the target of our attack.”
• “We also use x ← Sη to denote the module x whose coefficients lie in the range [−η, η].”
• About Kyber: “Apart from this, there are other modifications to the scheme like the modified technique for generation of the public module A as in [4], compressed public key and ciphertexts through "Bit-dropping" using the Learning-with-rounding (LWR) problem”
• “every coefficient of the recovered t is only a perturbed version of the original t generated during the key-generation procedure KYBER.CPAPKE.GEN(). Thus, the public key can be assumed to be built based on the hardness of both the MLWE and MLWR problem.”
• About Frodo: “The modulus q is chosen to be a power of 2 that enables easy reduction through bit masking.”
• “For example, if the error component is an all zero vector, then the LWE instance is converted into a set of linear equations with equal number of equations and unknowns. This instance can be solved using straight forward Gaussian elimination. If the error component only has values in a fixed interval [z + 1/2, z - 1/2], then one can just "round away" the non-integral part and subtract z to remove the error from every sample [29]. There are also other easy instances of LWE, For eg. From a given set of n LWE instances, if k of the n error components add up to zero, then one can simply add the corresponding samples to cancel the error and obtain an error free sample. It is also possible to solve an LWE instance in roughly nd time and space using nd samples if the error in the samples lie in a known set of size d [5]. For a very small d, this yields a very practical attack.”
• “Thus, if an attacker can perform a single bit flip of the nonce such that both the calls to the function poly_sample use the same seed, then it yields s = e.”
• “Thus, ultimately the attacker has to inject a minimum of just two bit flips in order to create a very easy LWE instance that results in direct retrieval of the secret key S through Gaussian elimination.”
• “In KYBER, the dimensions of both s1 and s2 are the same and equal k. But, the generated MLWE instance t = a × s1 + s2 is further protected by the hardness of the MLWR problem through the Compressq function and hence the public key is formed by a combination of MLWE and MLWR instances.”
• “Since the nonce is deterministically generated, one can use an error correction scheme to check if the correct value of nonce is being used for the generation of polynomials.”
• “The motivation to use a nonce for generation of polynomials is mainly to reduce the randomness requirement and use the same seed to generate multiple polynomials.”

summary

• paper from 2019
• public key in Learning With Errors problem is given by A×s + e
• If s = e, then A×s + s gives a trivial linear equation system to solve to get the secret key s
• if the nonce is the same between the generation of s and e, then s = e
• stuck-at fault attack on nonces in NewHope, Kyber, Dilithium, and Frodo schemes
• trivial for NewHope (keep poly_sample(&shat, noiseseed, 0) the same, apply stuck-at(0) to poly_sample(&ehat, noiseseed, 1))
• Technically more difficult for Frodo, because stuck-at(1) must be applied to Frodo.SampleMatrix(seed_E, n, n_bar, T_chi, 2). Requires two stuck-at bits unlike NewHope
• For Dilithium, l stuck-ats are required, but then the equation system is only solvable if enough equations (i.e. signatures) are available
• For Kyber, k stuck-ats are required, but the equation system is not necessarily solvable, since rounding still protects it

Short paper, tiny errors, no practical implementation (how successfully can the attack on Frodo be implemented? Correction: unlike the paper, their presentation slides show evaluation data). The structure of the text is beautiful and it is well-written. I particularly enjoyed the summary of weak LWE instances. Masking is another countermeasure, but it is heavy-weight compared to the proposed error-correcting codes.

typo

The aforementioned fault vulnerabilities associated with the nonce only occur due to the implementation strategies used and hence can we believe can be easily corrected albeit with a small performance overhead.

⇒ hence, we believe, can be easily corrected

### On the criteria to be used in decomposing systems into… §

Title: “On the criteria to be used in decomposing systems into modules” by David Lorge Parnas [url] [dblp]
Published in 1972 at CACM, Volume 15, 1972 and I read it in 2021-03
Abstract: This paper discusses modularization as a mechanism for improving the flexibility and comprehensibility of a system while allowing the shortening of its development time. The effectiveness of a “modularization” is dependent upon the criteria used in dividing the system into modules. Two system design problems are presented, and for each, both a conventional and unconventional decomposition are described. It is shown that the unconventional decompositions have distinct advantages for the goals outlined. The criteria used in arriving at the decompositions are discussed. The unconventional decomposition, if implemented with the conventional assumption that a module consists of one or more subroutines, will be less efficient in most cases. An alternative approach to implementation which does not have this effect is sketched.

quotes

• “A well-defined segmentation of the project effort ensures system modularity”
• “The major advancement in the area of modular programming has been the development of coding techniques and assemblers which (1) allow one module to be written with little knowledge of the code in another module, and (2) allow modules to be reassembled and replaced without reassembly of the whole system”
• “Below are several partial system descriptions called modularizations. In this context ‘module’ is considered to be a responsibility assignment rather than a subprogram. The modularizations include the design decisions which must be made before the work on independent modules can begin.”
• “The system is divided into a number of modules with well-defined interfaces; each one is small enough and simple enough to be thoroughly understood and well programmed.”
• “In the first modularization the interfaces between the modules are the fairly complex formats and table organizations described above.”
• “In the second modularization the interfaces are more abstract; they consist primarily in the function names and the numbers and types of the parameters”
• “In the first decomposition the criterion used was to make each major step in the processing a module”
• “The second decomposition was made using ‘information hiding’ as a criterion”
• “In discussions of system structure it is easy to confuse the benefits of a good decomposition with those of a hierarchical structure”
• “We have tried to demonstrate by these examples that it is almost always incorrect to begin the decomposition of a system into modules on the basis of a flowchart. We propose instead that one begins with a list of difficult design decisions or design decisions which are likely to change. Each module is then designed to hide such a decision from the others”

summary

First, someone needs to get familiar with KWIC (recognize the paper reference in the section below). KWIC felt like an arbitrary index someone came up with. I got confused by phrases like “the characters are packed four to a word”, which make little sense outside the index context of a book. But after reading the paper, I looked up the Wikipedia article and learned about its use case (an index of keywords before full-text search was available, or in print). The paper is considered an ACM Classic and Parnas received high praise for it.

Essentially, first I had to understand the setting when this paper was written. I grew up with data encapsulation in object-oriented programming, local scoping in programming languages and manipulating data behind a pointer was already a foreign, dangerous idea. The setting is the transition of assembler language towards more high-level languages with questions regarding information hiding arising.

In modular design, his double dictum of high cohesion within modules and loose coupling between modules is fundamental to modular design in software. However, in Parnas's seminal 1972 paper On the Criteria to Be Used in Decomposing Systems into Modules, this dictum is expressed in terms of information hiding, and the terms cohesion and coupling are not used. He never used them.
Wikipedia: David Parnas

I would define a module as a set of functionality (independent of its representation in a programming language). High cohesion within modules and loose coupling between modules is a defining criterion for a good programmer. What I consider an open question, and what often triggers bugs, is missing documentation for the interface between modules. Often a data structure transfers the data from one module to another. An informal description often triggers different expectations regarding the content of the data structure.

Back to the paper, it illustrates the decomposition of a system by two exemplary modularizations. Whereas the first decomposition was created along the major steps of the processing routines, the second decomposition was created with information hiding in mind. Then several recommendable criteria for decompositions are mentioned:

1. A data structure, its internal linkings, accessing procedures and modifying procedures are part of a single module.
2. The sequence of instructions necessary to call a given routine and the routine itself are part of the same module
3. The formats of control blocks used in queues in operating systems and similar programs must be hidden within a “control block module.”
4. Character codes, alphabetic orderings, and similar data should be hidden in a module for greatest flexibility
5. The sequence in which certain items will be processed should (as far as practical) be hidden within a single module

In the end, I think the paper advocates a clean style which (in some sense and with some limitations) is still true today (e.g. web frameworks heavily limit the way you can define your architecture). I recommend every programmer to reflect about possible decompositions of a system, because the most intuitive one might not be the best. The notion of a flowchart approach being the sequential one, is however awkward and foreign to me.

### PDF/A considered harmful for digital preservation §

Title: “PDF/A considered harmful for digital preservation” by Marco Klindt [url] [dblp]
Published in 2017 at iPRES 2017 and I read it in 2020-12
Abstract: Today, the Portable Document Format (PDF) is the prevalent file format for the exchange of fixed content electronic documents for publication, research, and dissemination work in the academic and cultural heritage domains. Therefore it is not surprising that PDF/A is perceived to be an archival format suitable for digital archiving workflows.

clarification needed

• “PDF reduces the computational burden of the display device by executing the necessary PostScript programs during the creation of the PDF file.”
• PostScript is presented as imposing a computational burden.
• PDF is consequently described as an object store of PostScript elements
• How does this reduce the computational burden?
• I assume that reuse of objects is the answer, but stating so explicitly would be useful
• “Fonts with open licenses like SIL Open Font License 4 circumvent possible restrictions but also complicate conversion due to differences in substitute font dimensions.”
• not an inherent property of OFL fonts?

errors

“The textual markup of Markdown variants is machine actionable while being human friendly to read at the same time. It is suitable for structured texts (including lists and tables) where the exact layout is not as important. Markdown is not well suited for validation.”

• lists are troublesome as nested lists occur in various flavors and were not considered in its initial design
• tables were not part of the first Markdown design and various contradicting implementations exist

quotes

Interesting:

• “In a quick analysis of institutional repositories hosted at the ZIB, the siegfried file identification tool 1 identified 44,114 or 84% from a total of 52,611 documents as PDF (and 1,168 or 0.03% of these as PDF/A). Other file formats included Word, WordPerfect, PostScript files and a long tail of more obscure document formats.”
• “Digital preservation is primarily concerned with keeping information contained in digital objects or documents usable for future use.”
• “The accepted reference model for digital preservation systems is the Open Archival Information System (ISO 14721:2012, OAIS)[11]”
• “The usage of and commercial success began with the release of the free Acrobat Reader 2.0 in 1996 for PDF 1.1 and licensing all patents royalty free for everyone using its format. It became the de-facto exchange format for electronic documents and version 1.7 was finally standardized by the International Standards Organization as ISO 32000-1[15] in 2008.”
• “Until the 10.1.5 and 11.0.01 updates Adobe Acrobat products have historically opened a PDF as long as the %PDF-header started anywhere within the first 1024 bytes of the file.”
• “To extract information from content in PDF, tags can be attached to PDF objects from version 1.4 onward. These tags act as markup to denote the logical structure (semantic elements), and logical order (flow) of the content.”
• “All content shall be marked in the structure tree with semantically appropriate tags (i.e. headings, formulas, paragraphs and such) in the logical, intended reading order.”
• “PDF also does not provide different perspectives on textual content. Electronic documents may want to provide different views of the text or data, either in multiple languages, diplomatic or critical transcriptions, or from different sources.”
• “Usability issues aside, Willinsky et al.[32] give an excellent overview about current issues with using PDF in the scholarly environment. They hope, that their observations will influence further development of PDF or even the ‘Great PDF Replacement Format (GPDFRF)’.”
• “Converting “normal” PDFs to PDF/A a-level conformance automatically is not advisable as a lot of information may already be lost during the creation process of the document.”
• “But even PDF/A a-level conformance may not guarantee full text recovery due to the fact that some tagging features are only recommendations and not mandatory. Hyphenation (the word division at the end of a line) shall be treated as an incidental artifact and be represented as a unicode soft-hyphen (U+00AD) instead of a hard-hyphen (U+002D) as suggested by the standard.”
• “It is alternatively possible to provide the /ActualText attribute without the hyphen.”
• “For some time, the go-to-tool for PDF/A validation was JHOVE 5 using PDF profiles. As it was discovered that it was not suitable for validating PDF/A files[29], the EU funded PREFORMA project 6 included a provision to create veraPDF 7 , a validator which aims at checking conformance of all PDF/A flavors while also allowing for policy checks that are customizable to institutional policy.”
• “Some possible strategies for the better handling of PDFs mostly involve the content producers but also create more involved workflows within the archive:
• Negotiate non-PDF documents better suited for their domain and supported by your archive system.
• Consider using PDF/A as a dissemination format only (and therefore use a PDF rendition server only for access not ingest).
• Save the original source documents alongside the PDFs for full text and structure retention. With PDF/A-3 these could be embedded and linked as source of the document.
• Require data producers to implement workflows that adhere to the Matterhorn protocol to assure fully, meaningful tagged PDFs (including MathML formulas, semantically tagged data and so on) and to provide /ActualText for every textual information contained in the PDF that is not easily extractable otherwise.“
• “WebArchive (WARC) files bundle all necessary components and are already in use in digital archiving.”

Well put:

• “While humans have the ability to recognize the structure of text from layout, which is a necessary requirement for meaningful extraction of information and therefore gaining knowledge from texts and illustrations including diagrams, formulas, and tables, machine-based technology is not yet able to achieve this to the same extent. This makes it difficult for such technology to use or reuse the information contained in PDFs.”
• “Adobe extended the PDF specification multiple times over the years to allow for more features like encryption, transparency, device-independent colors, forms, web-links, javascript, audio, video, 3D objects and many more[18].”
• “PDF supports incremental updates of its content. New objects, a new cross-reference table and a new trailer can be appended to the end of the file, if the content of the PDF is updated, without the need to rewrite the whole file. As objects can be marked as deleted in the xref-table there is no need to delete the corresponding objects in the body section.”
• “PDF can also define different rectangles useful in print like crop boxes, bleed boxes, trim boxes, and art boxes (refer to the PDF reference[7] for additional information).”
• “A standard for required tag usage was published by ISO as ISO 14289[10] known as PDF/UA in 2014 (thus after the publication of PDF/A-2/3). Even though being accessible by AT (i.e. software) is a legal requirement in some domains, creating compliant documents is still a complex and cumbersome endeavor. Even assessing compliance to PDF/UA is quite hard: The Matterhorn protocol[24] provides a testing model that defines 31 checkpoints comprised of 136 failure conditions encompassing file format requirements for AT accessible PDF/UAs of which some are not applicable to PDF/A (e.g. related to javascript). While 87 failure conditions are determinable by software 47 usually require human judgement or assessment. Failure condition 06-003 for example is machine testable and requires the metadata stream to contain a dublincore:title while 06-004 requires that the title clearly identifies the document in respect to human knowledge, a check that obviously is not decidable by algorithms.”
• “Nielsen [23] argued in 2001 that the fixed, page-based layout of  PDF is not well suited for on-screen reading in contrast to web pages or other hypertext documents.”
• “If optical character recognition results are available they also are embedded into the PDF as a invisible text layer over the corresponding areas in the image of the original.”
• “An insightful analogue of the difference between human content understanding and machine extraction capabilities would be the visible communication of music. While storing the layout of sheet music is perfectly achievable with PDF the placement of note glyphs on lines with annotating glyphs for bars, clefs and so on, it is easily understood and transformed into audible sound by humans trained in reading musical notation. A machine would have a hard time extracting enough information to reproduce or compare the musical score.”
• “What constitutes a word and finding word boundaries might be difficult by itself depending on the layout or script of the text. Selecting rows or columns from tables in PDF reader applications often also results in frustration.”
• “Searching for the string ”Rheinland” (German for Rhineland, a part of Germany) in the PDF/A-1a file of the nestor newsletter number 28[22] for example would result in no matches in macOS Preview or Adobe Reader as it is stored as a hard-hyphen. The hyphen in ”Ostwestfalen-Lippe” is a regular one.”
• “PDF/A is perceived to be an archival solution for digital documents. Discussion within the community revealed the reason for that is three-fold: Firstly, it is marketed as an archival format. The A in PDF/A might stand for “Archive” or “Archival” or simply for the letter “A”; I haven’t found any official explanation for the choice of A in the acronym. The second reason may be that it is used by so many institutions to a point where a critical mass is reached. They cannot altogether err in their risk assessment, so the reasoning is that you simply cannot be wrong when you run with the flock. And thirdly, there does not seem to be a better alternative available (see below).”
• “Pages are useful for citation in the traditional format of books or journals but with the advancement of digital publishing and linked data technologies it will be more useful to refer to information sets identified (and locatable) by persistent digital identifiers like URIs or IRIs.”
• “One teenager even wondered why YouTube isn’t mentioned in a book from 2005.”
• “In contrast to XHTML, an XML language, it is very robust to formal errors.”
• “Sullivan reports in her 2003 article (emphasis added): ‘The intent was not to claim that PDF-based solutions are the best way to preserve electronic documents. PDF/A simply defines an archival profile of PDF that is more amenable to long-term preservation than traditional PDF.’[27]”
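
The hyphenation problem quoted above (soft hyphen U+00AD vs. hard hyphen U+002D, and the “Rheinland” search failure) is easy to reproduce outside of any PDF tooling; a minimal sketch in Python:

```python
# Minimal sketch of the hyphenation issue quoted above: a hard hyphen
# (U+002D) baked into extracted text breaks search, while a soft hyphen
# (U+00AD) marks an incidental line break and can be stripped before search.

HARD = "\u002D"  # HYPHEN-MINUS, as stored in the nestor newsletter example
SOFT = "\u00AD"  # SOFT HYPHEN, as recommended by the PDF/A standard

extracted_bad  = f"Rhein{HARD}land"
extracted_good = f"Rhein{SOFT}land"

def normalize_for_search(text: str) -> str:
    # soft hyphens denote incidental word division and may be removed
    return text.replace(SOFT, "")

assert "Rheinland" not in extracted_bad                     # search fails
assert "Rheinland" in normalize_for_search(extracted_good)  # search succeeds
```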

summary

This paper summarizes some shortcomings why/how PDF/A is not the final solution for long-term archival of documents.

PDF features not available in PDF/A (list from external resource):

• embedded video and audio files
• encryption
• transparency
• JavaScript
• executable files
• functions
• layers

I would summarize those arguments as a lack of annotations in the PDF to make PDFs accessible for machines and people with visual disabilities. This makes it difficult to index and retrieve document information as part of a large corpus (→ database). The problems boil down to the design of PDF, which allows multiple ways of encoding text information and might lose e.g. reading order information. Refer to Table 2, for example. In the end, the tooling support is not great, but a lack of alternatives can be identified. However, if we consider information extraction capabilities as more important than e.g. font embedding and reproducible layout, some alternatives are mentioned in section 5.1 (e.g. HTML/CSS or ODF/OOXML). One intriguing analogy is given in the paper: musical typesetting can be done with PDF as output, but retrieving information about the music is very difficult.

In the conclusion the author admits that the PDF/A authors were aware of its shortcomings: “The intent was not to claim that PDF-based solutions are the best way to preserve electronic  documents. PDF/A simply defines an archival profile of PDF that is more amenable to long-term preservation than traditional PDF”

In the end, the paper is a summary which does not provide any solution as pointed out by the author. As a critic of Markdown, I am saddened to see that Markdown was even mentioned (but other markup languages are neglected).

### Piret and Quisquater’s DFA on AES Revisited §

Title: “Piret and Quisquater’s DFA on AES Revisited” by Christophe Giraud, Adrian Thillard [url] [dblp]
Published in 2010 and I read it in 2020-06
Abstract: At CHES 2003, Piret and Quisquater published a very eﬃcient DFA on AES which has served as a basis for many variants published afterwards. In this paper, we revisit P&Q’s DFA on AES and we explain how this attack can be much more eﬃcient than originally claimed. In particular, we show that only 2 (resp. 3) faulty ciphertexts allow an attacker to eﬃciently recover the key in the case of AES-192 (resp. AES-256). Our attack on AES-256 is the most eﬃcient attack on this key length published so far.

quotes

• “we show that only 2 (resp. 3) faulty ciphertexts allow an attacker to efficiently recover the key in the case of AES-192 (resp. AES-256).”
• “Since its publication in 1996, Fault Analysis has become the most efficient way to attack cryptosystems implemented on embedded devices such as smart cards. In October 2000, Rijndael was selected as AES and since then many researchers have studied this algorithm in order to find very efficient differential fault attacks. Amongst the dozen DFA on AES published so far, Piret and Quisquater’s attack published at CHES 2003 [5] is now a reference which has been used as a basis for several variants published afterwards.”
• “Therefore the last round key can be recovered by using 8 faulty ciphertexts with faults induced at chosen locations.”
• “From our experiments, an exhaustive search on AES-128 amongst 2³⁴ key candidates takes about 8 minutes on average on a 4-core 3.2Ghz Xeon by using a non-optimised C code. Therefore such an attack is practical.”
• “The triviality of the extension of Piret and Quisquater’s attack comes from the fact that, since MixColumns is linear, one can rewrite the last two rounds of the AES as depicted in Fig. 3.” → Figure 3 moves MixColumns into the last round and uses key MC-1(Kr-1) for the second-to-last AddRoundKey operation
• “To conclude, the original P&Q’s DFA on AES can uniquely identify the AES key in the 192 and 256-bit cases by using 4 faulty ciphertexts.”
• “From our experiments, an exhaustive search on AES-192 amongst 2⁴² key candidates takes about 1.5 day on average on a 4-core 3.2Ghz Xeon by using a non-optimised C code. Therefore such an attack can be classified as practical.”
• “To conclude, one can reduce the number of candidates for the AES-192 key to 2¹⁰ by using 3 faulty ciphertexts.”
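
The linearity argument behind the Fig. 3 rewrite can be checked directly: since MixColumns is linear over GF(2⁸), MC(x) ⊕ K = MC(x ⊕ MC⁻¹(K)), so AddRoundKey can be pulled through MixColumns using the modified key MC⁻¹(K). A small sketch on a single AES column (my own illustration, with arbitrary column values):

```python
# Demonstrates on one AES column that AddRoundKey commutes with MixColumns
# when the key column is replaced by MC⁻¹(K), which is the trick used to
# rewrite the last two AES rounds in P&Q-style attacks.

def xtime(b: int) -> int:
    # multiply by x in GF(2^8) with the AES reduction polynomial 0x11B
    b <<= 1
    return (b ^ 0x1B) & 0xFF if b & 0x100 else b

def gmul(a: int, b: int) -> int:
    # GF(2^8) multiplication
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        a, b = xtime(a), b >> 1
    return r

def mix_column(col, matrix):
    # matrix-vector product over GF(2^8)
    return [
        gmul(matrix[r][0], col[0]) ^ gmul(matrix[r][1], col[1])
        ^ gmul(matrix[r][2], col[2]) ^ gmul(matrix[r][3], col[3])
        for r in range(4)
    ]

MC     = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
INV_MC = [[14, 11, 13, 9], [9, 14, 11, 13], [13, 9, 14, 11], [11, 13, 9, 14]]

x = [0x32, 0x88, 0x31, 0xE0]  # arbitrary state column
k = [0x2B, 0x7E, 0x15, 0x16]  # arbitrary round-key column

lhs = [a ^ b for a, b in zip(mix_column(x, MC), k)]  # MC(x) ⊕ K
rhs = mix_column([a ^ b for a, b in zip(x, mix_column(k, INV_MC))], MC)
assert lhs == rhs
```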

summary

Good paper. Ideas are immediate. Not all attacks presented give runtimes, but the (more important) number of faulty ciphertexts is analyzed properly. The original P&Q attack is neat in general.

The original attack from 2003 uniquely identifies the key with 2 faulty ciphertext pairs with probability 98%.

typo

Figure 2 swaps the last SubBytes and ShiftRows operations

### SEVurity: No Security Without Integrity §

Title: “SEVurity: No Security Without Integrity” by Luca Wilke, Jan Wichelmann, Mathias Morbitzer, Thomas Eisenbarth [url] [dblp]
Published in 2020 at IEEE Symposium on Security and Privacy 2020 and I read it in 2020-06
Abstract: One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them. Full memory encryption for Virtual Machines (VM) protects against curious cloud providers as well as otherwise compromised hypervisors. AMD Secure Encrypted Virtualization (SEV) is the most prevalent hardware-based full memory encryption for VMs. Its newest extension, SEV-ES, also protects the entire VM state during context switches, aiming to ensure that the host neither learns anything about the data that is processed inside the VM, nor is able to modify its execution state. Several previous works have analyzed the security of SEV and have shown that, by controlling I/O, it is possible to exﬁltrate data or even gain control over the VM’s execution. In this work, we introduce two new methods that allow us to inject arbitrary code into SEV-ES secured virtual machines. Due to the lack of proper integrity protection, it is sufﬁcient to reuse existing ciphertext to build a high-speed encryption oracle. As a result, our attack no longer depends on control over the I/O, which is needed by prior attacks. As I/O manipulation is highly detectable, our attacks are stealthier. In addition, we reverse-engineer the previously unknown, improved Xor-Encrypt-Xor (XEX) based encryption mode, that AMD is using on updated processors, and show, for the ﬁrst time, how it can be overcome by our new attacks.

Intel SGX's role

Intel Software Guard Extensions (SGX) was the first widely available solution for protecting data in RAM. However, it can only protect a small chunk of RAM, not the VM as a whole.

Memory encryption systems for VMs

• AMD Secure Memory Encryption (SME) (2016): drop-in, AES-based RAM encryption. Controlled by Secure Processor (SP) co-processor. A special bit in the page table – the so-called C-bit – is used to indicate whether a page should be encrypted. The page table in the guest is encrypted and thus not accessible by the hypervisor.
• AMD Secure Encrypted Virtualization (SEV): SEV extends SME for VMs by using different encryption keys per VM, in order to prohibit the hypervisor from inspecting the VM’s main memory
• SEV Encrypted State (SEV-ES)
• Intel Total Memory Encryption (TME)
• Intel Multi-Key Total Memory Encryption (MKTME)

Comparison: S. Mofrad, F. Zhang, S. Lu, and W. Shi, “A comparison study of intel sgx and amd memory encryption technology,”, 2018

Notes

• Why is the tweak value only defined for index ≥ 4? “We denote the first one as t_4, since there are no dedicated constants for the least significant bits 3 to 0.” seems to be a resulting condition, not a reason.
• Is the linear equation system resulting from Dec_K(Enc_N(…), q_j) a linear equation system? XOR with m makes it affine linear, doesn't it? Is bit(i, p) linear? I don't think so unless we are in ℤ_2?!
• See Table 1. They used 4-byte repeating patterns for the tweak constants.
• 0x7413 is “jump if equal” on x86_64, but 0xEB13 is “jump unconditionally”
• page 8: 16 MB = 16 777 216 bytes = 134 217 728 bits. (The computation below left `seconds` undefined; the quoted outputs imply seconds = 37.5.)
seconds = 37.5
bytes = 16 * 1024**2
bits = 8 * bytes
(bytes/seconds) / 1024**2  # MBytes/sec
⇒ 0.4266666666666667
(bits/seconds) / 1024**2  # MBits/sec
⇒ 3.4133333333333336    # ok, this occurs in the paper
(bytes/seconds) / 1024    # KBytes/sec
⇒ 436.9066666666667     # this diverges from the paper with “426.67 KB/s”
• “The first mechanism utilizes the cpuid instruction, which is emulated by the hypervisor and features a 2-byte opcode: Each time cpuid is executed, the hypervisor is called to emulate it.”

On virtual memory with virtual machines

On virtualized systems, two different page tables are used. Within the VM, the VA used by the guest, the Guest Virtual Address (GVA), is translated to the Guest Physical Address (GPA). The GPA is the address which the VM considers to be the PA. However, on the host system itself another page table is introduced, to allow multiple VMs to run on the same physical machine. This second page table is called the Second Level Address Translation (SLAT), or Nested Page Table (NPT) [5]. The NPT translates the GPA into the Host Physical Address (HPA), the actual address of the data in physical memory.
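
The two-level translation described above can be sketched with toy page tables (all page numbers and mappings are made up for illustration):

```python
# Toy sketch of GVA -> GPA -> HPA translation; flat dicts stand in for the
# real multi-level page-table structures, with 4 KiB pages.

PAGE = 4096

guest_pt = {0x1: 0x7}   # guest page table: GVA page 0x1 -> GPA page 0x7
npt      = {0x7: 0x42}  # NPT/SLAT:         GPA page 0x7 -> HPA page 0x42

def translate(addr: int, table: dict) -> int:
    page, offset = divmod(addr, PAGE)
    return table[page] * PAGE + offset

def gva_to_hpa(gva: int) -> int:
    gpa = translate(gva, guest_pt)  # first level, inside the VM
    return translate(gpa, npt)      # second level, controlled by the host

assert gva_to_hpa(0x1234) == 0x42 * PAGE + 0x234
```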

summary

• Point of this paper: AMD SEV lacks integrity protection and thus memory encryption of a virtual machine can be broken by an untrusted hypervisor. Hypervisor can thus modify memory content (i.e. data or instructions) of a VM arbitrarily.
• The researchers determined tweak values T(p) using a linear equation system. The AMD Epyc Embedded processor line was considered. AMD Epyc 7251 (released June 2017) used the XE encryption mode whereas AMD Epyc (released Feb 2018) uses the XEX encryption mode
• Xor-Encrypt (XE): Enc_K(m, p) := AES_K(m ⊕ T(p)) and Dec_K(c, p) := AES⁻¹_K(c) ⊕ T(p)
• Xor-Encrypt-Xor (XEX): Enc_K(m, p) := AES_K(m ⊕ T(p)) ⊕ T(p) and Dec_K(c, p) := AES⁻¹_K(c ⊕ T(p)) ⊕ T(p)
• We never know the key K. Thus we cannot decrypt text and encrypt it again. But since we know T(p) where p is chosen to be physical address bit (i.e. address of a 16-byte block), then moving blocks changes the ciphertext in a predictable way
• XE: Dec_K(Enc_K(m, p), q) = AES⁻¹_K(AES_K(m ⊕ T(p))) ⊕ T(q) = m ⊕ T(p) ⊕ T(q) = m ⊕ T(p ⊕ q)
• When can we make changes in the memory? Whenever control from the VM is handed back from the VM to the hypervisor.
• We can simply remove the executable bit from the page so that a page fault is triggered and control returns to the hypervisor; then we also know the address p. Adding the executable bit again and returning control allows execution to continue stepwise (stepwise execution is a common SGX attack strategy, but unimportant here).
• Or we can inject the cpuid instruction. The hypervisor is responsible for emulating/answering cpuid by copying the return values to a predefined memory segment.
• So we can move pages without screwing up the encryption. But how can we actually modify memory content? We define an encryption oracle:
1. At boot time, cpuid is issued. We flip bits and create a loop to repeatedly call cpuid.
2. The hypervisor provides content as return value of cpuid. The kernel in the VM encrypts it. Thus the hypervisor can read the encrypted version of the plaintext it provided.

Trust in cloud providers

One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them.

Tweakable block ciphers

One popular method for storage encryption are tweakable block ciphers, such as AES-XTS [1], which is, e.g., used in Apple’s FileVault, MS Bitlocker and Android’s file-based encryption. Tweakable block ciphers provide encryption without data expansion as well as some protection against plaintext manipulation. A tweak allows to securely change the behavior of the block cipher, similar to instantiating it with a new key, but with little overhead.

XEX and Xor-Encrypt (XE) are methods to turn a block cipher such as AES into a tweakable blockcipher, where a tweak-derived value is XORed with the plaintext before encryption (and XORed again after encryption in the case of XEX).

typo

“before it continuous its operation” → “before it continues its operation.”

### Software-based Power Side-Channel Attacks on x86 §

Title: “Software-based Power Side-Channel Attacks on x86” by Moritz Lipp, Andreas Kogler, David Oswald, Michael Schwarz, Catherine Easdon, Claudio Canella, Daniel Gruss [url] [dblp]
Published in 2020-11 and I read it in 2020-12
Abstract: Power side-channel attacks exploit variations in power consumption to extract secrets from a device, e.g., cryptographic keys. Prior attacks typically required physical access to the target device and specialized equipment such as probes and a high-resolution oscilloscope.

questions

• “Fig. 1: A histogram of the power consumption of various instructions on the i7-6700K (desktop) system.”
with the same or different operands?
• How does SGX-Step work? Zero-stepping? I don't get Figure 7
• Interpreting Figure 9
• “However, using zero stepping (Section IV-B2) and the possibility to observe the Hamming weight of bytes (Section III-E), masking is insufficient against our attacks on SGX enclaves.”
• “Timing-Independent Covert Channel” which information do you want to transmit?

quotes

• “However, until recently, power analysis attacks had two limitations. First, they primarily targeted small embedded microcontrollers rather than more complex high-performance desktop and server CPUs. Second, software-based attacks relying on the available interfaces were so far not successfully applied on x86 to leak fine-grained information, e.g., cryptographic key bits.”
• “CPA [11] is an extension of DPA, which examines the correlation between variations in the set of traces and a leakage model depending on the value of intermediate values [49].”
• “The Intel Running Average Power Limit (RAPL) mechanism was introduced with the Sandy Bridge microarchitecture to ensure the CPU remains within desired thermal and power constraints [27].”
• “Since Haswell, it has provided three distinct capabilities for controlling average power over timescales of multiple seconds, ~10 ms, and <10 ms (PL1, PL2, and PL3, respectively).”
• “Intel defines four different domains for RAPL: package (PKG), power planes (PP0 and PP1), and DRAM.”
• “Intel generally considers physical side-channel attacks on SGX out of scope. Side channels [9], [73], race conditions, and memory-safety violations are not in the threat model,”
• “Note that while we primarily refer to runtime energy consumption rather than power consumption throughout this work, these are directly related, as power = energy ÷ time.”
• “We performed the experiment on our Intel Xeon E3-1240 v5 (server) system, collecting measurements for all possible byte values for 627 hours.”
• “Moreover, note that Intel RAPL does not provide the energy consumption per core but per processor package. Thus, code executed on other cores have a direct influence on the measurement of a specific piece of code running on one core and, thus, the number of overall measurements increases to average out the noise introduced by the other cores.”
• “the attacker needs to align the recorded traces. The trace needs to contain a distinctive feature, e.g., a distinct peak in power consumption, so that traces can be shifted into alignment with each other. While a privileged attacker can precisely control the victim’s execution and interrupt it at will, an unprivileged attacker cannot. However, if the attacker can control when the execution of the attacked code begins, or use a trigger signal such as a cache-based side channel [72], then the collected traces can be aligned based on that timing information.”
• “We measured over 96 000 execution runs, yielding an overall attack time of 8.11 h on the E3-1275 v5. The result is illustrated in Figure 7.”
• “Incidentally, we note that the key recovery specifically fails for key bytes 0, 4, 8, and 12, i.e., the first byte of each 4-byte word.”
• “However, using zero stepping (Section IV-B2) and the possibility to observe the Hamming weight of bytes (Section III-E), masking is insufficient against our attacks on SGX enclaves.”

summary

The authors present PLATYPUS attacks, novel software-based power side-channel attacks on Intel server, desktop, and laptop CPUs. They exploit unprivileged access to the Intel Running Average Power Limit (RAPL) interface, which exposes values directly correlated with power consumption, forming a low-resolution side channel.

• Targets Linux and Intel CPUs after Sandy Bridge. Might work on AMD and ARM also.
• Voltage package works best, then core package.
• Zero-Stepping is used as delaying technique
• RAPL allows power to be managed in more detail, including monitoring in order to apply cooling appropriately.
• We are not aware of user applications requiring RAPL access. So restriction of access as countermeasure seems reasonable.
• Intel: medium severity
• “Observing Intra-Cacheline Activity” is motivated by Intel's internal symmetric cryptography implementation guidelines
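
For context, the interface in question is exposed on Linux via the powercap sysfs tree. A hedged sketch of reading it (the path is the standard intel-rapl one; access requires an Intel CPU with RAPL and, on patched kernels, root privileges, which is exactly the restriction mentioned above; the wrap-around handling is factored out so it can be checked without hardware):

```python
# Sketch: reading the Linux powercap counter for the RAPL PKG domain.

RAPL_DIR = "/sys/class/powercap/intel-rapl:0"  # package domain, socket 0

def read_energy_uj() -> int:
    with open(f"{RAPL_DIR}/energy_uj") as f:
        return int(f.read())

def read_wrap_uj() -> int:
    with open(f"{RAPL_DIR}/max_energy_range_uj") as f:
        return int(f.read())

def energy_delta_uj(before: int, after: int, wrap: int) -> int:
    # the counter is monotonic modulo `wrap`; handle a single wrap-around
    return after - before if after >= before else after - before + wrap

def measure(fn):
    """Return (result, consumed energy in microjoules) for one call of fn."""
    wrap = read_wrap_uj()
    before = read_energy_uj()
    result = fn()
    return result, energy_delta_uj(before, read_energy_uj(), wrap)
```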

### Templates vs. Stochastic Methods: A Performance Analys… §

Title: “Templates vs. Stochastic Methods: A Performance Analysis for Side Channel Cryptanalysis” by Benedikt Gierlichs, Kerstin Lemke-Rust, Christof Paar [url] [dblp]
Published in 2006 at CHES 2006 and I read it in 2020-06
Abstract: Template Attacks and the Stochastic Model provide advanced methods for side channel cryptanalysis that make use of ‘a-priori’ knowledge gained from a proﬁling step. For a systematic comparison of Template Attacks and the Stochastic Model, we use two sets of measurement data that originate from two diﬀerent microcontrollers and setups. Our main contribution is to capture performance aspects against crucial parameters such as the number of measurements available during proﬁling and classiﬁcation. Moreover, optimization techniques are evaluated for both methods under consideration. Especially for a low number of measurements and noisy samples, the use of a T-Test based algorithm for the choice of relevant instants can lead to signiﬁcant performance gains. As a main result, T-Test based Templates are the method of choice if a high number of samples is available for proﬁling. However, in case of a low number of samples for proﬁling, stochastic methods are an alternative and can reach superior eﬃciency both in terms of proﬁling and classiﬁcation.

ambiguity

• page 2: “a multivariate characterization of the noise” → noise is defined as “noise + performed operation”
• page 3: “for each time instant” → undefined term “time instant”
• page 3: “it is the average m_i² of all available samples” → do we square the average? is it already squared?
• page 3: “(P1, …, Pp)” → what is P?
• page 6: “(II) the number of curves for profiling” → which curves? difference curves? they were only defined for the template method not for the stochastic method

quotes

• “Especially for a low number of measurements and noisy samples, the use of a T-Test based algorithm for the choice of relevant instants can lead to significant performance gains.”
• “However, in case of a low number of samples for profiling, stochastic methods are an alternative and can reach superior efficiency both in terms of profiling and classification.”
• “The underlying working hypothesis for side channel cryptanalysis assumes that computations of a cryptographic device have an impact on instantaneous physical observables in the (immediate) vicinity of the device, e.g., power consumption or electromagnetic radiation”
• approaches depending on number of stages:
• one-stage approach:
1. directly extract key
• two-stage approach:
1. “profiling step”
2. “attack step”
• “Templates were introduced as the strongest side channel attack possible from an information theoretic point of view”
• “This is due to the fact that positive and negative differences between the averages may zeroize, which is desirable to filter noise but hides as well valuable peaks that derive from significant signal differences with alternating algebraic sign.”
• “Templates estimate the data-dependent part ht itself, whereas the Stochastic model approximates the linear part of ht in the chosen vector subspace (e.g., F9) and is not capable of including non-linear parts.”
• “The number of measurements, both during profiling and key extraction, is regarded as the relevant and measurable parameter.”
• “We focus on the number of available samples (side channel quality) since computational complexity is of minor importance for the attacks under consideration.”
• “Hence, a general statement on which attack yields better success rates is not feasible as this depends on the number of curves that are available in the profiling step. If a large number of samples is available (e.g., more than twenty thousand), the Template Attack yields higher success rates. If only a small number of samples is available (e.g., less than twenty thousand), stochastic methods are the better choice.” (w.r.t. Metric 3)
• “The Stochastic Model’s strength is the ability to “learn” quickly from a small number of samples. One weakness lies in the reduced precision due to the linear approximation in a vector subspace.”
• “The Template Attack’s weakness is its poor ability to reduce the noise in the side channel samples if the adversary is bounded in the number of samples in the profiling step.”
• “The T-Test Template Attack is the best possible choice in almost all parameter ranges.”
• “For example, using N = 200 profiling measurements and N3 = 10 curves for classification it still achieves a success rate of 81.7%.”

summary

Important empirical results. Parameters and assumptions are okay. Results are fundamental and significant. However, the description of the methods (templates, stochastic) are quite bad. Looking up the references is required.

Notes:

• sosd: \sum_{i,j=1}^K (m_i - m_j)^2 for i ≥ j
• sost: \sum_{i,j=1}^K \left(\frac{m_i - m_j}{\sqrt{\frac{\sigma_i^2}{n_i} + \frac{\sigma_j^2}{n_j}}}\right)^2 for i ≥ j
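
Restated as code (my own sketch from the two formulas above; m_i, σ_i², n_i are the per-class means, variances, and sample counts at one time instant):

```python
from itertools import combinations
from math import sqrt

def sosd(means):
    # sum of squared pairwise differences of the class means
    return sum((mi - mj) ** 2 for mi, mj in combinations(means, 2))

def sost(means, variances, counts):
    # T-Test based variant: each mean difference is normalized by its
    # estimated standard deviation before squaring
    classes = list(zip(means, variances, counts))
    return sum(
        ((mi - mj) / sqrt(vi / ni + vj / nj)) ** 2
        for (mi, vi, ni), (mj, vj, nj) in combinations(classes, 2)
    )
```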

### The design of a Unicode font §

Title: “The design of a Unicode font” by C Bigelow, K Holmes [url] [dblp]
Published in 1993 and I read it in 2020-10
Abstract: The international scope of computing, digital information interchange, and electronic publishing has created a need for world-wide character encoding standards. Unicode is a comprehensive standard designed to meet such a need. To be readable by humans, character codes require fonts that provide visual images — glyphs — corresponding to the codes. The design of a font developed to provide a portion of the Unicode standard is described and discussed.

ambiguity

“The inclusion of non-alphabetic symbols and non-Latin letters in these 8-bit character sets required font developers to decide whether the assorted symbols and non-Latin letters should be style-specific or generic.”

A definition of style-specific and generic would be nice.

“Hence, although accents need to be clearly differentiated, they do not need to be emphatic, and, indeed, overly tall or heavy accents can be more distracting than helpful to readers.”

What does emphatic mean in this context?

nicely written

“To design such a font is a way to study and appreciate, on a microcosmic scale, the manifold variety of literate culture and history.”

“A ‘roman’ typeface design (perhaps upright would be a less culture-bound term, since a Unicode font is likely to include Greek, Cyrillic, Hebrew, and other scripts)” …

question

“One of the advantages of Unicode is that it includes a Generic Diacritical Marks set of ‘floating’ diacritics that can be combined with arbitrary letters. These are not case-sensitive, i.e. there is only one set of floating diacritics for both capitals and lowercase”

Isn't this a property of the font file format?
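The ‘floating’ diacritics described in the quote survive in today's Unicode as combining marks; a small stdlib-only Python check (my own illustration, not from the paper) shows one combining accent serving both cases at the encoding level, while positioning per case is left to the font:

```python
# The 'floating' diacritics described above exist in Unicode as combining
# marks (U+0300..U+036F). One combining mark serves capitals and lowercase
# alike; it is the font, not the encoding, that must position it per case.
import unicodedata

acute = "\u0301"  # COMBINING ACUTE ACCENT

for base in ("o", "O"):
    seq = base + acute                         # decomposed: letter + floating accent
    composed = unicodedata.normalize("NFC", seq)  # precomposed form, if one exists
    print(base, "->", composed, len(seq), "->", len(composed))
```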

quotes

• “The Unicode standard distinguishes between characters and glyphs in the following way: ‘Characters reside only in the machine, as strings in memory or on disk, in the backing store.”
• “In contrast to characters, glyphs appear on the screen or paper as particular representations of one or more backing store characters. A repertoire of glyphs comprises a font.’”
• “By script or writing system we mean a graphical representation of language.”
• “Accumulated contextual glyphic variations are what transformed capitals into lowercase, for example, and turned roman into italic.”
• “Version 1.0 of the standard encodes approximately 28,000 characters, of which some 3,000 are alphabetic letters and diacritical marks for European, Indic, and Asian languages, 1,000 are various symbols and graphic elements, and 24,000 are ideographic (logographic), syllabic, and phonetic characters used in Chinese, Japanese, and Korean scripts [1,2].”
• “Although the textura typeface used by Gutenberg in the 42-line Bible of 1455-56, the first printed book in Europe, included more than 250 characters, and the humanistica corsiva typeface cut by Francesco Griffo for the Aldine Virgil of 1501, the first book printed in italic, included more than 200 characters, character sets became smaller in later fonts, to reduce the costs of cutting, founding, composing, and distributing type.”
• “Within one font, we wanted the different alphabets and symbols to maintain a single design theme.”
• “A problem with this kind of haphazard development is that the typographic features of documents, programming windows, screen displays, and other text-based images will not be preserved when transported between systems using different fonts.”
• “By ‘harmonization’, we mean that the basic weights and alignments of disparate alphabets are regularized and tuned to work together, so that their inessential differences are minimized, but their essential, meaningful differences preserved.”
• “Latinate typographers”
• “As designers, we would like to know how the ‘rules’ that govern legibility in non-Latin scripts compare to the rules for Latin typefaces. Shimron and Navon, for example, report a significant difference in the spatial distribution of distinctive features in the Roman and Hebrew alphabets [20].”
• “Although many typographic purists believe that simple obliques are inferior to true cursives, Stanley Morison argued in 1926 that the ideal italic companion to roman should be an inclined version of the roman [21].”
• “The European mode of distinction between formal and cursive type forms is not as strong a tradition in some non-Latin scripts, e.g. Hebrew (though there may be other kinds of highly regulated graphical distinctions), so a simple oblique is a more universal, albeit minimal, graphic distinction that can apply equally to all non-Latin scripts.”
• “we concluded that, for improved legibility in international text composition, accents and diacritics should be designed somewhat differently than in the standard version of Lucida Sans.”
• “Accordingly, we designed the lowercase diacritics of Lucida Sans Unicode to be slightly taller and a little different in modulation than those of the original Lucida Sans. Following current practice, we used the lowercase accents to compose accented capitals.”
• “One of the advantages of Unicode is that it includes a Generic Diacritical Marks set of ‘floating’ diacritics that can be combined with arbitrary letters. These are not case-sensitive, i.e. there is only one set of floating diacritics for both capitals and lowercase. In our first version of Lucida Sans Unicode, we implemented these as lowercase diacritics and adjusted their default position to float over the centre of a lowercase o. Ideally, there should be at least two sets of glyphs, one for lowercase and one for upper case (determined automatically by the text line layout manager of the OS or application), along with a set of kerning tables that optimizes the visual appearance of each combination of letter + diacritic.”
• “In a proportional font (like the Times Roman before the eyes of the reader), the advance width of a character is proportional to the geometric logic of its form. A proportionally-spaced m, which has a spatial frequency of three cycles per letter, is wider than an n, which has a frequency of two cycles, which in turn is wider than an i, which has one.”
• “In a fixed pitch font like Courier, all characters are of the same width, so that m is cramped and i extended, and ‘minimum’ has an irregular rhythm, since the spatial frequency of the letters is continually changing within the fixed width of the cells.”
• “Among the alphabetic Unicode character sets, Cyrillic poses an interesting problem in fixed-pitch mode because it has, compared to the Latin, a greater percentage of characters with higher spatial frequency (three or more cycles per letter) on the horizontal axis. The Hebrew alphabet, on the other hand, is more easily transformed to fixed-pitch mode because it has many fewer letters of high spatial frequency.”
• “While the assumption that characters are fully contained within cells was invariably true for primitive terminal and TTY fonts, it is not necessarily true of typographic fonts. In many PostScript and TrueType digital fonts, diacritics on capitals extend above the nominal upper boundary of the body of the font because, in an effort to reduce font file size, capital and lowercase accented characters are built as composites in which letters and diacritics are treated as subroutines,”
• “The lowercase form of most diacritics is taller than the capital form,”
• “A font that would contain all of Unicode 1.0 would be of daunting size, and the standard continues to grow as the Unicode committee adds more characters to it. Even without the Chinese/Japanese/Korean set, the alphabets and symbols comprise almost 4,000 separate characters, sixteen times larger than the usual 8-bit character sets.”
• “To call an incomplete font containing Unicode subsets a ‘Unicode’ font could be misleading, since some users could mistakenly assume that any font called ‘Unicode’ will contain a full set of 28,000 characters.”
• “The Japanese term ‘gothic’ is equivalent to ‘sans serif’, and is also used in English for sans serif faces, particularly those of 19th-century American design, e.g. Franklin Gothic and News Gothic.”
• “To satisfy the French critics and give Times greater appeal in the French market, the Monotype Corporation cut a special version of the face in accord with the dictates of Maximilien Vox, a noted French typographic authority [15,26]. Vox’s re-design of some fourteen characters brought Times slightly closer to the sophisticated style of the French Romain du Roi, cut by Philippe Grandjean circa 1693.”
• “For the German typographic market, Monotype cut a version of Times with lighter capitals that are less distracting in German orthography, where every noun is capitalized”
• “Robert Granjon” … “His ecclesiastical patrons in Rome called him ‘the excellent . . .’, ‘the most extraordinary . . .’, ‘the best ever . . .’ cutter of letters [31].”
• “we followed a traditional Hebrew thick/thin modulation, in which horizontal elements are thicker than vertical – the opposite of the Latin convention – but weighted the Hebrew characters to have visual ‘presence’ equivalent to that of the Latin.”
• “Unicode is a character encoding standard, not a glyph standard.”
• “For example, Unicode treats Latin capital B as one character and lowercase b as another because these are significant differences in Latin orthography.”
• “Unicode does not treat italic b or bold b as separate from b, because those letters are merely allographs, as the linguists would say, the graphic differences not being orthographically significant.”
• “S-cedilla and T-cedilla are used in both Turkish and Romanian, but the cedilla may be rendered as either the French form of cedilla or as a comma-like accent below the letter.”
• “Pike and Thompson [3] discuss the advantage of economical memory management and greater font loading speed when a complete Unicode character set is implemented as many small subfonts from which characters can be accessed independently, and this is the method used in the Plan 9 operating system, both for bitmap fonts and the Lucida Sans Unicode outline fonts. The other method is used in Microsoft Windows NT 3.1, in which the first version of the Lucida Sans Unicode font is implemented as a single TrueType font of 1,740 glyphs. In the current version of Windows NT, this allows a simpler font-handling mechanism, makes the automatic ‘hinting’ of the font easier, since all characters can be analyzed by the hinting software in one pass, and preserves the default design coordination of the subsets, if the font is based on a harmonized set of designs.”

summary

A neat paper. Due to my lack of understanding of the status quo in 1993, I cannot judge the approach and quality. There are some considerations I am not familiar with (e.g. why do we need to distinguish only two kinds of diacritics, lowercase and uppercase), but the paper gives a broad overview of the design decisions that need to be made when designing a ‘Unicode’ font. They designed 1700 characters, but Unicode 1.1 specifies 34,168 characters. It is part of the paper to discuss “One big font vs. many little fonts”.

• “Our initial motivation in designing a Unicode font was to provide a standardized set of glyphs that could be used as a default core font for different operating systems and languages. Within one font, we wanted the different alphabets and symbols to maintain a single design theme.”
• “Having described reasons in favor of creating a Unicode font, we should also discuss arguments against such an undertaking, and various problems we encountered.”
• “Ars longa, vita brevis” (i.e. huge development effort)
• “Culture-bound design” (i.e. typographers only ‘truely’ understand the writing system they use in their culture)
• “Homogenization” (“If it erases distinctive differences between scripts, it increases the possibility of confusion”)
• “Character standard vs. glyph standard”
• “One big font vs. many little fonts”

### Too Much Crypto §

Title: “Too Much Crypto” by Jean-Philippe Aumasson [url] [dblp]
Published in 2020-01-03 at Real-World Crypto 2020 and I read it in 2020/01
Abstract: We show that many symmetric cryptography primitives would not be less safe with significantly fewer rounds. To support this claim, we review the cryptanalysis progress in the last 20 years, examine the reasons behind the current number of rounds, and analyze the risk of doing fewer rounds. Advocating a rational and scientific approach to round numbers selection, we propose revised number of rounds for AES, BLAKE2, ChaCha, and SHA-3, which offer more consistent security margins across primitives and make them much faster, without increasing the security risk.

ambiguity

“Where we examine the reasons behind the number of rounds, comment on the risk posed by quantum computers, and finally propose new primitives for a future where less energy is wasted on computing superfluous rounds.”

Is this an English sentence?

notes

• “Designed in the 1970’s, neither DES nor GOST are practically broken by cryptanalysis.”
• “(We restrict this reassuring outlook to symmetric primitives, and acknowledge that spectacular failures can happen for more sophisticated constructions. An example is characteristic-2 supersingular curves’ fall from 128-bit to 59-bit security [32].)”
• “The speed of symmetric primitives being inversely proportional to their number of rounds, a natural yet understudied question is whether fewer rounds would be sufficient assurance against cryptanalysis’ progress.”
• “We conclude by proposing reduced-round versions of AES, BLAKE2, ChaCha, and SHA-3 that are significantly faster yet as safe as their full-round versions.”
• “in 2009 Bruce Schneier wrote this [51]:”
Cryptography is all about safety margins. If you can break n rounds of a cipher, you design it with 2n or 3n rounds. What we’re learning is that the safety margin of AES is much less than previously believed. And while there is no reason to scrap AES in favor of another algorithm, NIST should increase the number of rounds of all three AES variants. At this point, I suggest AES-128 at 16 rounds, AES-192 at 20 rounds, and AES-256 at 28 rounds. Or maybe even more; we don’t want to be revising the standard again and again.
• “128-bit security is often acknowledged as sufficient for most applications”
• “at the time of writing mining a Bitcoin block requires approximately 2^74 evaluations of SHA-256.”
• “Lloyd [46] estimated that ‘[the] Universe could currently register 10^90 [or 2^299] bits. To register this amount of information requires every degree of freedom of every particle in the Universe’. Applying a more general bound and the holographic principle, Lloyd further calculates that the observable Universe could register approximately 2^399 bits by using all the information capacity matter, energy, and gravity.”
• “Unlike complexity theoretic estimates that use asymptotic notations such as O(n log n) where n is the problem size, cryptanalysts work with fixed-length values and can’t work with asymptotics.”
• “complexities in cryptanalysis papers ignore the fact that a memory access at a random address is typically orders of magnitude slower than simple arithmetic operations.”
• “This example stresses that the area-time (AT) metric model, where the attack cost is viewed as the product between area and time requirements, is more realistic than the model where only time is considered”
• “Grigg and Gutmann called ‘cryptographic numerology’”
• “Using any standard commercial risk management model, cryptosystem failure is orders of magnitude below any other risk.”
• “For example, the greatest risks with e-voting systems are not the cryptographic protocols and key lengths, but the operational and information security concerns.”
• “Our proposed categories are:
• Analyzed: […]
• Attacked: […]
• Wounded: […]
• Broken: […]”
• “Schneier’s law is the tautological-sounding statement ‘Attacks always get better, they never get worse’, which Bruce Schneier popularized, and that (he heard) comes from inside the NSA.”
• “Rarely have number of rounds been challenged as too high. A possible reason (simplifying) is that people competent to constructively question the number of rounds have no incentive to promote faster cryptography, and that those who don’t have the expertise to confidently suggest fewer rounds.”
• “Although Grover reduces key search from O(2^n) to O(2^(n/2)), one shouldn’t ignore the constant factors hiding in the O(). Translating this asymptotic speed-up into a square-root of the actual cost is a gross oversimplification; between constant factors, the size and cost of a quantum circuit implementing the attacked primitive, the lack of parallelism [30], and the latency of the circuit, it’s actually unclear, given today’s quantum computing engineering knowledge, whether Grover would actually be more cost-efficient than classical computers. It’s nonetheless a safe bet to assume that it would be.”
• “Anyway, the number of rounds would not matter much would AES be Groverable, the answer to that question is therefore not important in choosing a number of rounds.”
• “[…] we propose the following:
• AES: 9 rounds instead of 10 for AES-128, 10 instead of 12 for AES-192, 11 instead of 14 for AES-256, yielding respectively a 1.1×, 1.2×, and 1.3× speed-up.
• BLAKE2: 8 rounds instead of 12 for BLAKE2b, 7 rounds instead of 10 for BLAKE2s (we’ll call these versions BLAKE2bf and BLAKE2sf), yielding respectively a 1.5× and 1.4× speed-up.
• ChaCha: 8 rounds instead of 20 (that is, ChaCha8), yielding a 2.5× speed-up.
• SHA-3: 10 rounds instead of 24 (we’ll call this version KitTen, inspired by Keccak family member KangarooTwelve), yielding a 2.4× speed-up.”
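Since the paper's premise (quoted above) is that a primitive's speed is inversely proportional to its number of rounds, the claimed speed-ups follow from the round ratios alone; a quick sanity check:

```python
# Check the claimed speed-ups from round ratios, using the paper's premise
# that speed is inversely proportional to the number of rounds.
proposals = {
    "AES-128": (10, 9),   # (full rounds, proposed rounds)
    "AES-192": (12, 10),
    "AES-256": (14, 11),
    "BLAKE2b": (12, 8),
    "BLAKE2s": (10, 7),
    "ChaCha":  (20, 8),
    "SHA-3":   (24, 10),
}

for name, (full, reduced) in proposals.items():
    print(f"{name}: {full} -> {reduced} rounds, ~{full / reduced:.1f}x speed-up")
```

All seven ratios reproduce the 1.1×, 1.2×, 1.3×, 1.5×, 1.4×, 2.5×, and 2.4× figures from the paper.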

Good paper; its survey is its strong suit. However, the particular choice of proposed parameters has little justification.

typo

If all the best cryptanalysts could find was a distinguisher in the chosen-ciphertext related-key model, then even less likely are practical key recovery attack in the chosen-plaintext model.

typo

But as as noted in §2, numbers such as

### Tweaks and Keys for Block Ciphers: The TWEAKEY Framewo… §

Title: “Tweaks and Keys for Block Ciphers: The TWEAKEY Framework” by Jérémy Jean, Ivica Nikolić, Thomas Peyrin [url] [dblp]
Published in 2014 at ASIACRYPT 2014 and I read it in 2020-06
Abstract: We propose the TWEAKEY framework with goal to unify the design of tweakable block ciphers and of block ciphers resistant to related-key attacks. Our framework is simple, extends the key-alternating construction, and allows to build a primitive with arbitrary tweak and key sizes, given the public round permutation (for instance, the AES round). Increasing the sizes renders the security analysis very difficult and thus we identify a subclass of TWEAKEY, that we name STK, which solves the size issue by the use of finite field multiplications on low hamming weight constants. We give very efficient instances of STK, in particular, a 128-bit tweak/key/state block cipher Deoxys-BC that is the first AES-based ad-hoc tweakable block cipher. At the same time, Deoxys-BC could be seen as a secure alternative to AES-256, which is known to be insecure in the related-key model. As another member of the TWEAKEY framework, we describe Kiasu-BC, which is a very simple and even more efficient tweakable variation of AES-128 when the tweak size is limited to 64 bits.

Properties of tweakable block ciphers:

• formalized in 2002 by Liskov et al.
• tweaks are completely public, keys are not
• retweaking (changing the tweak value) is less costly than changing its secret key
• security model considers that the attacker has full control over both the message and the tweak inputs

quotes (on contributions)

• “We propose the TWEAKEY framework with goal to unify the design of tweakable block ciphers and of block ciphers resistant to related-key attacks.”
• “We give very efficient instances of STK, in particular, a 128-bit tweak/key/state block cipher Deoxys-BC that is the first AES-based ad-hoc tweakable block cipher.”
• “we describe Kiasu-BC, which is a very simple and even more efficient tweakable variation of AES-128 when the tweak size is limited to 64 bits.”

quotes (on history)

• “[…] designs that allowed to prove their security against classical differential or linear attacks have been a very important step forward, […]”
• “The security of the block ciphers, both Feistel and Substitution-Permutation networks, has been well studied when the key is fixed and secret, however, when the attacker is allowed to ask for encryption or decryption with different (and related) keys the situation becomes more complicated.”
• “Most key schedule constructions are ad-hoc, in the sense that the designers came up with a key schedule that is quite different from the internal permutation of the cipher, in a hope that no meaningful structure is created by the interaction of the two components.”
• “This extra input T, later renamed as tweak, was supposed to be completely public and to randomize the instance of the block cipher: to different values of T correspond different and independent families of permutations E_K”
• “This feature was formalized in 2002 by Liskov et al., who showed that tweakable block ciphers are valuable building blocks if retweaking (changing the tweak value) is less costly than changing its secret key.”
• “disk encryption where each block is ciphered with the same key, but the block index is used as tweak value.”
• “Simple constructions of a tweakable block cipher E_K(T, P) based on a block cipher E_K(P), like XORing the tweak into the key input and/or message input, are not satisfactory. For example, only XORing the tweak into the key input would result in an undesirable property that E_K(T, P) = E_{K ⊕ X}(T ⊕ X, P).”
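The undesirable property in the last quote is easy to reproduce with any toy keyed permutation; the 8-bit ‘cipher’ below is purely illustrative, not a construction from the paper:

```python
# Toy demonstration of the quoted weakness: if the tweak is only XORed into
# the key, then E~(K, T, P) = E~(K ^ X, T ^ X, P) for every X, because the
# effective key K ^ T is unchanged. The 'cipher' below is a made-up toy
# permutation on 8-bit values, just to make the algebraic property concrete.
def toy_cipher(key, pt):
    return ((pt ^ key) * 167 + 13) % 256  # affine map with odd multiplier: a bijection

def tweaked(key, tweak, pt):
    return toy_cipher(key ^ tweak, pt)    # tweak XORed into the key input only

K, T, P, X = 0x4B, 0x17, 0xA5, 0x3C
assert tweaked(K, T, P) == tweaked(K ^ X, T ^ X, P)  # related (key, tweak) pairs collide
```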

Non-intuitive results on birthday paradox bounds

• “More importantly, these methods ensure only security up to the birthday-bound (relative to the block cipher size).”
• “Minematsu [46] partially overcomes this limitation by proving beyond birthday-bound security for his design, but at the expense of a very reduced efficiency.”
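For context, the birthday bound referred to here is the usual q(q−1)/2^(n+1) collision term; a minimal sketch (my own illustration) of why an n-bit block gives only about 2^(n/2) blocks of security:

```python
# The 'birthday bound' referenced above: among q random n-bit values, the
# collision probability grows like q*(q-1)/2^(n+1), so a distinguishing
# collision is expected after roughly 2^(n/2) blocks.
def collision_bound(q, n):
    """Upper bound on collision probability for q random n-bit values."""
    return q * (q - 1) / 2 ** (n + 1)

n = 128
print(collision_bound(2 ** 32, n))  # ~2^-65: negligible
print(collision_bound(2 ** 64, n))  # ~0.5: the birthday bound is reached
```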

Future work

• “As of today, it remains an open problem to design an ad-hoc AES-like tweakable block cipher, which in fact would be very valuable for authenticated encryption as AES-NI instruction sets guarantee extremely fast software implementations.”

Tweakey

• “we emphasize that not all TWEAKEY instances are secure”
• “E is a key-alternating cipher when the general form f(s_i, K_i) = s_{i+1} for i < r becomes f(s_i ⊕ K_i) = s_{i+1}”
• “The signature of standard block ciphers can be described as E: {0, 1}^k × {0, 1}^n → {0, 1}^n where an n-bit plaintext P is transformed into an n-bit ciphertext C = E(K, P) using a k-bit key K.”
• “The signature for a tweakable block cipher therefore becomes E: {0, 1}^k × {0, 1}^t × {0, 1}^n → {0, 1}^n, the ciphertext C = E(K, T, P), where the tweak T does not need to be secret and thus can be placed in the public domain.”
• “It is important to note that the security model considers that the attacker has full control over both the message and the tweak inputs.”
• related-tweakey := related-key related-tweak
• open-tweakey := open-key open-tweak
• Figure 3 shows the TWEAKEY design, where the top wires transmit t+k bits, g outputs n bits and the bottom wires transmit n bits
• subtweakey extraction function g
• internal state update permutation f
• tweakey state update function h
• “This can be summarized as: s_{i+1} = f(s_i ⊕ g(tk_i)) followed by tk_{i+1} = h(tk_i)”
• “The functions f, g and h must be chosen along with the number of rounds r such that no known attack can apply on the resulting primitives.”
• “One of the main causes for the low number of ad-hoc tweakable block ciphers is the fact that adding a tweak input makes the security analysis much harder.”
• “The trick we use is to apply a nibble-wise multiplication with a distinct coefficient α^j for all tweakey words.”
• “when we deal with differences in several tweakey words (which is supposedly very hard to analyze due to the important number of nibbles), the study of the STK construction is again the same as for a classical TK-1 analysis, except that at most p − 1 active output nibbles can be erased in each subgroup.”
• Figure 4 shows the STK construction
• “The chosen round functions (and the nibble sizes), suggest that Deoxys-BC is software oriented, while Joltik-BC is hardware (and lightweight) oriented design.”
• “For instance, XORing two columns (instead of rows) would immediately lead to an insecure variant.” (context Kiasu-BC)
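The f/g/h recurrence collected in the bullets above can be sketched generically; the placeholder functions below are trivial stand-ins of my own, not the Deoxys-BC or Joltik-BC instantiations:

```python
# Generic TWEAKEY round structure from the notes above:
#   s_{i+1} = f(s_i XOR g(tk_i)),  tk_{i+1} = h(tk_i)
# f, g, h below are toy stand-ins (NOT a real cipher), only to show the data
# flow: the (t+k)-bit tweakey state evolves in parallel with the n-bit state.
MASK64 = (1 << 64) - 1     # toy n = 64-bit internal state
MASK128 = (1 << 128) - 1   # toy t + k = 128-bit tweakey state

def f(s):    # internal state update permutation (placeholder: 64-bit rotation)
    return ((s << 1) | (s >> 63)) & MASK64

def g(tk):   # subtweakey extraction function (placeholder: fold t+k bits to n)
    return (tk ^ (tk >> 64)) & MASK64

def h(tk):   # tweakey state update function (placeholder: 128-bit rotation)
    return ((tk << 7) | (tk >> 121)) & MASK128

def tweakey_encrypt(state, tweakey, rounds=8):
    s, tk = state, tweakey
    for _ in range(rounds):
        s = f(s ^ g(tk))   # s_{i+1} = f(s_i ^ g(tk_i))
        tk = h(tk)         # tk_{i+1} = h(tk_i)
    return s ^ g(tk)       # final subtweakey addition, as in key-alternating ciphers

print(hex(tweakey_encrypt(0x0123456789ABCDEF, (0xFEDC << 112) | 0xBA98)))
```

Since f is a rotation and the XORs are invertible, the map is a permutation of the state for any fixed tweakey, mirroring the key-alternating structure the paper extends.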

Performance and area

• “a complete 128-bit tweak 128-bit key 128-bit block cipher proposal Deoxys-BC based on the AES round function, but faster and more lightweight than other tentatives to build a tweakable block cipher from AES-128. When used in ΘCB3 [38] authenticated encryption, Deoxys-BC runs at about 1.3 c/B on the latest Intel processors. This has to be compared to OCB3, which runs at 0.7-0.88 c/B when instantiated with AES-128, but only ensures birthday-bound security. Alternatively, Deoxys-BC could be a replacement for AES-256, which has related-key issues as shown in [8].”
• “On longer inputs and modes based on parallelizable block cipher calls (such as ΘCB3), Deoxys-BC-256 runs at around 1.3 cycles per byte, while Deoxys-BC-384 at around 1.55 cycles per byte. This is to be compared to AES in OCB3 mode, which runs at around 0.70 - 0.88 cycles per byte (but has only birthday bound security).”
• “Therefore, we estimate that the entire Deoxys-BC-256 can be implemented with around 3400 GE, and Deoxys-BC-384 with around 4400 GE.”

summary

I think this paper is a good piece of research. I am not an expert on symmetric cryptography and cannot judge the security analysis and possible attacks, but to me it seems to consider the relevant properties. Unrelated to the paper, I was not aware of beyond-birthday-bound security, which totally intrigued me. Related to the paper, follow-up work could address the question “What are sufficient conditions to achieve a secure tweakable block cipher with TWEAKEY?”. Well done!

• TWEAKEY framework
• Kiasu-BC (tweak size = 64 bits, AES-128 based)
• STK (goal: ease of cryptanalysis with existing tools)
• Deoxys-BC (n=128 bit blocks input, f = AES round function)
• Deoxys-BC-256
• Deoxys-BC-384
• Joltik-BC (n = 64 bits, f = AES-like using 4-bit nibbles)
• Joltik-BC-128 (r = 24 rounds)
• Joltik-BC-192 (r = 32)
• “The QARMA Block Cipher Family” by Roberto Avanzi is a follow-up on this work

typo

• “these scheme might not be really efficient” → “these schemes might not be really efficient”
• “This might be seen as counter intuitive as it is required the tweak input to be somehow more efficient than the key input,” → “This might be seen as counter intuitive as it requires the tweak input to be somehow more efficient than the key input,”
• “but at the same time the security requirement on the tweak seem somehow stronger than on the key,” → “but at the same time the security requirement on the tweak seems somehow stronger than on the key,”
• “and then multiply each c-bit cell of the j-th” → and then multiplies each c-bit cell of the j-th
• “Most automated differential analysis tools for AES-like ciphers (e.g., [9, 23, 28]) use truncated differential representation to make feasible the search for differential characteristics.” → “Most automated differential analysis tools for AES-like ciphers (e.g., [9, 23, 28]) use truncated differential representation to make the search for differential characteristics feasible.”

### When a Patch is Not Enough - HardFails: Software-Explo… §

Title: “When a Patch is Not Enough - HardFails: Software-Exploitable Hardware Bugs” by Ghada Dessouky, David Gens, Patrick Haney, Garrett Persyn, Arun Kanuparthi, Hareesh Khattri, Jason M. Fung, Ahmad-Reza Sadeghi, Jeyavijayan Rajendran [url] [dblp]
Published in 2018-12-01 and I read it in 2020-07
Abstract: Modern computer systems are becoming faster, more efficient, and increasingly interconnected with each generation. Consequently, these platforms also grow more complex, with continuously new features introducing the possibility of new bugs. Hence, the semiconductor industry employs a combination of different verification techniques to ensure the security of System-on-Chip (SoC) designs during the development life cycle. However, a growing number of increasingly sophisticated attacks are starting to leverage cross-layer bugs by exploiting subtle interactions between hardware and software, as recently demonstrated through a series of real-world exploits with significant security impact that affected all major hardware vendors.

HCF

“A behavior humorously hinted at in IBM System/360 machines in the form of a Halt-and-Catch-Fire (HCF) instruction.”

→ See also Christopher Domas' research on x86 since 2017
→ Pentium F00F (C7C8) bug

quotes

• “This approach does not ensure security at the hardware implementation level. Hardware vulnerabilities can be introduced due to: (a) incorrect or ambiguous security specifications, (b) incorrect design, (c) faulty implementation of the design, or (d) a combination thereof.”
• “To detect such bugs, the semiconductor industry makes extensive use of a variety of verification and analysis techniques, such as simulation and emulation (also called dynamic verification)”
• “industry-standard tools include Incisive, Solidify, Questa Simulation and Questa Formal, OneSpin 360, and JasperGold”
• “This process incorporates a combination of many different techniques and toolsets such as RTL manual code audits, assertion-based testing, dynamic simulation, and automated security verification.”
• “recent outbreak of cross-layer bugs” with 15 references appended ^^
• “To reproduce this effect, we implemented the list of bugs using two popular and freely available processor designs for the widely used open-source RISC-V architecture.”
• “Specifically, we observe that RTL bugs arising from complex and cross-modular interactions in real-world SoCs render RTL bugs extremely difficult to detect in practice. Further, it may often be feasible to exploit them from software to compromise the entire platform, and we call such bugs HardFails.”
• “As all vendors keep their proprietary industry designs and implementations inaccessible, we use the popular open-source RISC-V architecture and hardware micro-architecture as a baseline”
• “We investigated how these vulnerabilities can be effectively detected using formal verification techniques (Section V) using an industry-standard tool and in a second case study through simulation and manual RTL analysis (Section VI).”
• “As a result, real-world SoCs can easily approach 100,000 lines of RTL code, and some open-source designs significantly outgrow this to many millions lines of code”
• “However, since RTL code is usually compiled and hardwired as integrated circuitry logic, the underlying bugs will remain and cannot, in principle, be patched after production. This is why RTL bugs pose a severe security threat in practice.”
• “We call these the HardFail properties of a bug:”
• “Cross-modular effects (HF-1).” […]
• “Timing-flow gap (HF-2).” […] “In practice, this leads to vast sources of information leakage due to software-exploitable timing channels (see Section IX).” […]
• “Cache-state gap (HF-3).” […] “In particular, current tools reason about the architectural state of a processor by exclusively focusing on the state of registers. However, this definition of the architectural state completely discards that modern processors feature a highly complex microarchitecture and diverse hierarchy of non-register caches. This problem is amplified as these caches have multiple levels and shared across multiple privilege levels. Caches represent a state that is influenced directly or indirectly by many control-path signals.” […]
• “Hardware/firmware interactions (HF-4).” […] “Hence, reasoning on whether an RTL bug exists is inconclusive when considering the hardware RTL in isolation.” […]
• “On analyzing the RTL of Ariane, we observed that TLB page faults due to illegal accesses occur in a different number of clock cycles than page faults that occur due to unmapped memory (we contacted the developers and they acknowledged the vulnerability).”
• “Once the instruction is retired, the execution mode of the core is changed to the unprivileged level, but the entries that were prefetched into the cache (at the system privilege level) do not get flushed.”
• “We emphasize that in a real-world security testing (see Section II), engineers will not have prior knowledge of the specific vulnerabilities they are trying to find. Our goal, however, is to investigate how an industry-standard tool can detect RTL bugs that we deliberately inject in an open-source SoC and have prior knowledge of (see Table I).”
• “Our results in this study are based on two formal techniques: Formal Property Verification (FPV) and Security Path Verification (SPV).”
• “To describe our assertions correctly, we examined the location of each bug in the RTL and how it is manifested in the behavior of the surrounding logic and input/output relationships. Once we specified the security properties using assert, assume and cover statements, we determined which RTL modules we need to model to prove these assertions.”
• “Out of the 31 bugs we investigated, shown in Table I, using the formal verification techniques described above, only 15 or 48%, were detected. While we tried to detect all 31 bugs formally, we were only able to formulate security properties for only 17 bugs. This indicates that the main challenge with using formal verification tools is identifying and expressing security properties that the tools are capable of capturing and checking.”
• “Our results, shown in the SPV and FPV bars of Figure 3, indicate that integer overflow and address overlap bugs had the best detection rates, 80% and 100%, respectively.”
• “The implications of these findings are especially grave for real-world more complex SoC designs where these bug classes are highly relevant from a security standpoint.”
• “we replaced the PULP_SECURE variable, which controls access privileges to the registers, with the PULP_SEC variable.”
• “We present next the results of our second case study. 54 teams of researchers participated in Hack@DAC 2018, a recently conducted capture-the-flag competition to identify hardware bugs that were injected deliberately in real-world open-source SoC designs. This is the equivalent of bug bounty programs that semiconductor companies offer”
• “The goal is to investigate how well these bugs can be detected through dynamic verification and manual RTL audit without prior knowledge of the bugs.”
• “This RTL vulnerability manifests in the hardware behaving in the following way. When an error signal is generated on the memory bus while the underlining logic is still handling an outstanding transaction, the next signal to be handled will instead be considered operational by the module unconditionally.”
• “While existing industry SoCs support hot-fixes by microcode patching, this approach is inherently limited to a handful of changes to the instruction set architecture, e.g., modifying the interface of individual complex instructions and adding or removing instructions. Thus, such patches at this higher abstraction level in the firmware only act as a "symptomatic" fix that circumvent the RTL bug.”
• “VeriCoq based on the Coq proof assistant transforms the Verilog code that describes the hardware design into proof-carrying code.”
• “Finally, computational scalability to verifying real-world complex SoCs remains an issue given that the proof verification for a single AES core requires 30 minutes to complete”
• “Murφ model”
• “Information flow analysis (such as SPV) are better suited for this purpose where a data variable or input is assigned a security label (or a taint), and the taint propagation is monitored.”
• “IFT techniques are proposed at different levels of abstraction: gate-, RT, and language-levels.”
• “At the language level, Caisson and Sapper are security-aware HDLs that use a typing system where the designer assigns security “labels” to each variable (wire or register) by the security policies required. However, they both require redesigning the RTL using a new hardware description language which is not practical. SecVerilog [33, 100] overcomes this by extending the Verilog language with a dynamic security type system.”
• “In the Meltdown attack, speculative execution can be exploited on modern processors (affecting all main vendors) to completely bypass all memory access restrictions.”
• “a selection multiplexer to select between AES, SHA1, MD5, and the temperature sensor.”
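The taint-propagation idea behind SPV and the other IFT techniques quoted above can be sketched in a few lines. This is a toy model in Python, not the paper's method: real gate- or RT-level IFT operates on hardware netlists, and all names here (`Tainted`, `combine`, `check_sink`) are hypothetical.

```python
# Toy illustration of information-flow tracking (IFT): every value carries
# a security label ("taint"), any operation propagates the taint, and a
# security property is checked by inspecting labels at a sink.

from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    value: int
    tainted: bool  # True = derived from secret or untrusted input


def combine(a: Tainted, b: Tainted, op) -> Tainted:
    # Propagation rule: the result is tainted if any operand is.
    return Tainted(op(a.value, b.value), a.tainted or b.tainted)


def check_sink(v: Tainted) -> bool:
    # Property: "no tainted value reaches an unprivileged output port".
    return not v.tainted


secret = Tainted(0x2A, tainted=True)   # e.g. a key byte
public = Tainted(0x10, tainted=False)  # attacker-visible constant

mixed = combine(secret, public, lambda x, y: x ^ y)

print(check_sink(public))  # True: safe to expose
print(check_sink(mixed))   # False: would leak secret-derived data
```

The same lattice-of-labels idea underlies the security-typed HDLs mentioned above (Caisson, Sapper, SecVerilog), where the label is part of each wire's or register's type rather than a runtime field.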

summary

A technical report discussing how bugs are introduced in the hardware design process and slip through testing tools. Specifically, they define HardFails as RTL bugs that are difficult to detect and can be triggered from software, potentially compromising the entire platform. Their classification into four HardFail properties {Cross-modular effects, Timing-flow gap, Cache-state gap, Hardware/firmware interactions} is non-exhaustive (as pointed out in the conclusion). In my opinion, the paper is too copious when discussing the hardware development process and laying out the advantages/disadvantages of various tools. I think it could have been more concise (e.g. in “Proof assistant and theorem-proving” the scalability issue is mentioned twice).
Besides that, I think it gives a nice overview of the issues hardware design has to deal with, and yes, we need better tool support. But the paper (obviously) offers no solution to the mentioned problems.
Catherine pointed out that, in Table 2, CLKScrew does not need firmware interaction and Foreshadow has nothing to do with firmware interaction.