Papers

I read a bunch of papers. This peaked during my PhD studies, but I enjoyed good papers before and after that. I systematically review and summarize them. Here are my notes.

Table of contents:

Papers and notes

A Masked Ring-LWE Implementation §

Title: “A Masked Ring-LWE Implementation” by Oscar Reparaz, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede [url] [dblp]
Published in 2015 at CHES 2015 and I read it in 2020-07
Abstract: Lattice-based cryptography has been proposed as a postquantum public-key cryptosystem. In this paper, we present a masked ring-LWE decryption implementation resistant to first-order side-channel attacks. Our solution has the peculiarity that the entire computation is performed in the masked domain. This is achieved thanks to a new, bespoke masked decoder implementation. The output of the ring-LWE decryption are Boolean shares suitable for derivation of a symmetric key. We have implemented a hardware architecture of the masked ring-LWE processor on a Virtex-II FPGA, and have performed side channel analysis to confirm the soundness of our approach. The area of the protected architecture is around 2000 LUTs, a 20 % increase with respect to the unprotected architecture. The protected implementation takes 7478 cycles to compute, which is only a factor ×2.6 larger than the unprotected implementation.

ambiguity

\[ \operatorname{th}(x) = \begin{cases} 0 & \text{if } x \in (0, q/4) \cup (3q/4, q) \\ 1 & \text{if } x \in (q/4, 3q/4) \end{cases} \]

The intervals apparently should denote inclusive-exclusive boundaries.
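
As a sanity check of this reading, a minimal Python sketch of th(x) with inclusive-exclusive boundaries (the boundary handling is my assumption, not spelled out in the paper):

def th(x, q):
    """Threshold decoder: map a coefficient x in Z_q to a message bit.
    Assumes inclusive-exclusive intervals: [q/4, 3q/4) -> 1, the rest -> 0."""
    assert 0 <= x < q
    return 1 if q / 4 <= x < 3 * q / 4 else 0

assert th(0, 1000) == 0       # toy modulus, arbitrary example values
assert th(500, 1000) == 1     # q/2 encodes bit 1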

error

On page 686, c1 and c2 are swapped erroneously throughout. On page 685, c2 is defined such that it contains the message. Consequently, the factor r2 must be applied to c1, not c2. However, page 686 repeatedly multiplies c2 by the different r values.

Masked Decoder table

Sorry, I don't have table support in Zotero.

The first line is meant to be read as “If a' is in quadrant (1) and a'' is in quadrant (1), then a is inconclusive (∅), rather than bit 0 or 1”.

a' a'' a

1 1 ∅
1 2 1
1 3 ∅
1 4 0
2 1 1
2 2 ∅
2 3 0
2 4 ∅
3 1 ∅
3 2 0
3 3 ∅
3 4 1
4 1 0
4 2 ∅
4 3 1
4 4 ∅
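
The table can be encoded as a small lookup; the following Python sketch is just my transcription of the table above, with the quadrant numbering being my assumption (quadrant 1 = [0, q/4), counted upwards):

# Quadrant combination table of the masked decoder (my transcription of the table
# above): key = (quadrant of a', quadrant of a''), value = decoded bit, or None
# if the combination is inconclusive and another (re-randomized) iteration is needed.
DECODE = {
    (1, 1): None, (1, 2): 1,    (1, 3): None, (1, 4): 0,
    (2, 1): 1,    (2, 2): None, (2, 3): 0,    (2, 4): None,
    (3, 1): None, (3, 2): 0,    (3, 3): None, (3, 4): 1,
    (4, 1): 0,    (4, 2): None, (4, 3): 1,    (4, 4): None,
}

def quadrant(x, q):
    """Quadrant 1..4 of Z_q, with quadrant 1 = [0, q/4)."""
    return int(4 * (x % q) // q) + 1

def masked_decode_step(a1, a2, q):
    """One decoding attempt on the arithmetic shares a1 + a2 = a (mod q)."""
    return DECODE[(quadrant(a1, q), quadrant(a2, q))]

assert masked_decode_step(100, 260, 1000) == 1      # shares of a value just above q/4
assert masked_decode_step(100, 100, 1000) is None   # inconclusive: refresh shares and retry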

quotes

summary

Nice paper. It is sad that c1 and c2 got mixed up on page 686. The idea behind the masking is indeed natural: r2 = r2' + r2'' (mod q) and, for the decoder, a' := a' + Δ1 and a'' := a'' - Δ1. Isn't sampling also a problem when masking ring-LWE? If so, the title is inappropriate and should be “A Masked Ring-LWE Decoder Implementation”. The described implementation covers CPA security only. It gives a factor of 2.679 in terms of CPU cycles and 3.2 in terms of runtime in microseconds for the decryption step. With N=16 (maximum number of iterations in the decoder), you get 1 error in about 400,000 bits. This means that in NewHope's 256-bit encapsulated keys, you get 1 error in 1420 messages. For a long time I did not understand why the masked decoder requires the value of the previous iteration (loop in Figure 3); then I recognized it. I don't like the fact that the source code was not published with the paper (→ reproducible research).

typo

A Practical Analysis of Rust’s Concurrency Story §

Title: “A Practical Analysis of Rust’s Concurrency Story” by Aditya Saligrama, Andrew Shen, Jon Gjengset [url] [dblp]
Published in 2019 and I read it in 2021-11
Abstract: Correct concurrent programs are difficult to write; when multiple threads mutate shared data, they may lose writes, corrupt data, or produce erratic program behavior. While many of the data-race issues with concurrency can be avoided by the placing of locks throughout the code, these often serialize program execution, and can significantly slow down performance-critical applications. Programmers also make mistakes, and often forget locks in less-executed code paths, which leads to programs that misbehave only in rare situations.

quotes

summary

A nice report about the implementation of a concurrent hashmap in Rust. The basic primitives of Rust are well explained, but the lock-free design of the hash map is not discussed at all. The relative performance characteristics are presented in a figure without an appropriate caption, and nice examples are provided for struggles with Rust. Overall a nice bachelor/master-level project outcome.

A Provably Secure True Random Number Generator with Bu… §

Title: “A Provably Secure True Random Number Generator with Built-In Tolerance to Active Attacks” by Berk Sunar, William Martin, Douglas Stinson [url] [dblp]
Published in 2007-01 and I read it in 2021-11-08
Abstract: This paper is a contribution to the theory of true random number generators based on sampling phase jitter in oscillator rings. After discussing several misconceptions and apparently insurmountable obstacles, we propose a general model which, under mild assumptions, will generate provably random bits with some tolerance to adversarial manipulation and running in the megabit-per-second range. A key idea throughout the paper is the fill rate, which measures the fraction of the time domain in which the analog output signal is arguably random. Our study shows that an exponential increase in the number of oscillators is required to obtain a constant factor improvement in the fill rate. Yet, we overcome this problem by introducing a postprocessing step which consists of an application of an appropriate resilient function. These allow the designer to extract random samples only from a signal with only moderate fill rate and, therefore, many fewer oscillators than in other designs. Last, we develop fault-attack models and we employ the properties of resilient functions to withstand such attacks. All of our analysis is based on rigorous methods, enabling us to develop a framework in which we accurately quantify the performance and the degree of resilience of the design.

clarifications and questions

quotes

summary

A very good paper. It specifies the requirements of a True Random Number Generator, builds up the theory with mathematical background and finally presents one particular instance with desired properties.

I struggled to see the connection to linear codes given by Theorem 1, but I assume one just has to follow reference [15], which proves Theorem 1.

A good TRNG relies on {entropy source, harvesting mechanism, postprocessing}. The design uses ring oscillators (quantified by the urn model with prime ring lengths and the Coupon Collector's Problem), an XOR tree, and a linear code, namely a simplex code (justified by the (n, m, t)-resilient function model).
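
As a toy illustration of the postprocessing step (not the paper's concrete parameters), the resilient function can be taken from the generator matrix of a small [7, 3, 4] simplex code; a linear [n, m, d] code yields an (n, m, d−1)-resilient function:

import itertools

# Generator matrix of the [7, 3, 4] binary simplex code: its columns are all
# non-zero vectors of length 3 (toy parameters, not the paper's instantiation).
G = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]

def resilient(bits):
    """Postprocess 7 raw bits into 3 output bits: out = G · bits (mod 2)."""
    return tuple(sum(g * b for g, b in zip(row, bits)) % 2 for row in G)

# Sanity check of 3-resiliency: fix any 3 input bits arbitrarily; the output is
# still uniform over the 2^4 assignments of the remaining 4 bits.
for fixed_pos in itertools.combinations(range(7), 3):
    for fixed_val in itertools.product([0, 1], repeat=3):
        counts = {}
        free_pos = [i for i in range(7) if i not in fixed_pos]
        for free_val in itertools.product([0, 1], repeat=4):
            x = [0] * 7
            for p, v in zip(fixed_pos, fixed_val):
                x[p] = v
            for p, v in zip(free_pos, free_val):
                x[p] = v
            out = resilient(x)
            counts[out] = counts.get(out, 0) + 1
        assert all(c == 2 for c in counts.values())   # 16 inputs spread over 8 outputs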

Prior work:

Test suites:

4 gigasamples/sec = 4·10⁹ samples/s corresponds to 1/(4·10⁹) s = 0.25 ns between 2 samples

Figure 5 shows a neat network to make the XOR tree more robust by redundancy.

A Replication Study on Measuring the Growth of Open So… §

Title: “A Replication Study on Measuring the Growth of Open Source” by Michael Dorner, Maximilian Capraro, Ann Barcomb, Krzysztof Wnuk [url] [dblp]
Published in 2022-01 and I read it in 2022-04
Abstract: Context: Over the last decades, open-source software has pervaded the software industry and has become one of the key pillars in software engineering. The incomparable growth of open source reflected that pervasion: Prior work described open source as a whole to be growing linearly, polynomially, or even exponentially. Objective: In this study, we explore the long-term growth of open source and corroborating previous findings by replicating previous studies on measuring the growth of open source projects. Method: We replicate four existing measurements on the growth of open source on a sample of 172,833 open-source projects using Open Hub as the measurement system: We analyzed lines of code, commits, new projects, and the number of open-source contributors over the last 30 years in the known open-source universe. Results: We found growth of open source to be exhausted: After an initial exponential growth, all measurements show a monotonic downwards trend since its peak in 2013. None of the existing growth models could stand the test of time. Conclusion: Our results raise more questions on the growth of open source and the representativeness of Open Hub as a proxy for describing open source. We discuss multiple interpretations for our observations and encourage further research using alternative data sets.

quotes

  • “Prior work described open source as a whole to be growing linearly, polynomially, or even exponentially.” (Dorner et al., 2022)
  • “We replicate four existing measurements on the growth of open source on a sample of 172,833 open-source projects using Open Hub as the measurement system: We analyzed lines of code, commits, new projects, and the number of open-source contributors over the last 30 years in the known open-source universe.” (Dorner et al., 2022, p. 1)
  • “Open source has evolved from small communities of volunteers driven by non-monetary incentives to foundations that host large projects and support decentralized innovation among many global industries [13]” (Dorner et al., 2022, p. 2)
  • “Three horizontal, longitudinal studies investigated the growth of open source as a whole: [9] from 2003, [24] from 2007, and in 2008 [11].” (Dorner et al., 2022, p. 2)
  • “In detail, the contributions of this paper are:

    • a detailed discussion of the three prior studies,
    • the multi-dimensional measurements using Open Hub as a measuring system to quantify the growth of open source with respect to lines of code, commits, contributors, and projects, and, thereby,
    • the dependent and independent replication of the measurements by three prior studies.” (Dorner et al., 2022, p. 2)
  • “Study A [9] from 2003 at FreshMeat.net with 406 projects accounted until 2002-07-01
    Study B [24] from 2007 at SourceForge.net with 4,047 projects
    Study C [11] from 2008 at Ohloh.net with 5,122 projects from 1995-01-01 until 2006-12-31
    from Table 1: Three prior [9, 11, 24] studies on the evolution of open source, showing data source, project sample size, and the considered data time frame.”
    (Dorner et al., 2022, p. 3)
  • “Their construction is unclear and, in consequence, we cannot replicate the measurements.” (Dorner et al., 2022, p. 4)
  • “Vitality V is defined by V = R·A/L (1) where—according to description—R is the number of releases in a given period (t), A is the age of the project in days, and L is the number of releases in the period t.” (Dorner et al., 2022, p. 4)
  • “However, a comment is a valuable contribution to a software project. In modern programming languages like Go or Python, comments are directly embedded in the source code for documentation purposes.” (Dorner et al., 2022, p. 7)
  • “Therefore, our measurement is consistent with Hypothesis 1, which states that open source grows (in bytes).” (Dorner et al., 2022, p. 7)
  • “When commit size is calculated as the total number of commits in a period, it follows a power-law distribution [27]” (Dorner et al., 2022, p. 8)
  • “At the time of the data collection (2021-06-04 to 2021-06-07), Open Hub lists 355,111 open-source projects as of 2021-06-06.” (Dorner et al., 2022, p. 10)
  • “After filtering our data set for the time frame from 1991-01-01 to 2020-12-31, 173,265 projects were available for further analysis.” (Dorner et al., 2022, p. 10)
  • “Observation 2: Although initially growing exponentially until 2009, the growth in lines of code has continuously slowed down since 2013.” (Dorner et al., 2022, p. 13)
  • “Observation 4: Although growing exponentially until 2009 and reaching its peak in March 2013 with 107,915 contributors, the number of open-source contributors has, as of 2018, decreased to the level of 2008.” (Dorner et al., 2022, p. 14)
  • “We also encountered similar problems while crawling Open Hub: 181,846 of 355,111 projects do not contain information on the development activity. We also found that the accumulated number of lines added does not fit the measured lines of code. Additionally, we found a large drop in added projects in 2011 we are not able to explain (Figure 5). We speculate that Open Hub could have decreased the number of projects it adds, so that newer projects are under-represented.” (Dorner et al., 2022, p. 16)
  • “In this study, we conducted a large-scale sample study on open-source projects and their cumulative growth. We analyzed the number of developers and their contributions with respect to lines of code, commits, and new projects to the open-source universe. We leveraged Open Hub as a measuring system to measure development activities of 172,833 open-source projects over the last 30 years concerning those four quantities.” (Dorner et al., 2022, p. 19)

replications

  • exact replications

    • dependent: keep all conditions identical
    • independent: vary one or more major aspects

summary

In this study (I read version #5), the authors replicate results from 2003 (Capiluppi, A., Lago, P., Morisio, M., “Characteristics of open source projects”), 2007 (Koch, S., “Software evolution in open source projects—a large-scale investigation”), and 2008 (Deshpande, A., Riehle, D., “The Total Growth of Open Source”) on the Open Hub project index. Open Hub lists 355,111 open-source projects as of 2021-06-06, and development activity is available for 173,265 of them. The previous studies claimed that open source grows w.r.t. byte size (2003), grows quadratically (2007) or exponentially (2008) w.r.t. lines of code, as well as exponentially w.r.t. the number of projects (2008).

For the 2003 study, the authors remark that “their construction is unclear and, in consequence, we cannot replicate the measurements”. The status (e.g. “beta”, “mature”) is also determined in an unclear way. The 2003 study's failure to specify the magnitude of growth (“grows”?!) is another weakness.

In essence, the data sources from 2003 and 2007 are not publicly available anymore. As such, the statements had to be replicated with new data over an extended timeline. The result is that the authors recognize a peak between 2010 and 2013 and a steady decline afterwards in various parameters (size in bytes, LoC, number of projects, new projects).

The paper is well written and the intention is clear. A more thorough investigation of the peaks between 2010 and 2013 could be done, since they occur in multiple parameters (cf. Fig. 2 and Fig. 3) and thus seem significant. I suggest validating the hypothesis that activities around GitHub or the Linux Foundation influenced the way projects are maintained. On page 16, it is mentioned that 181,846 projects do not contain information on the development activity. It should be pointed out explicitly that those are not included in the respective figures (at least this is what I assume).

The paper shows how bad the situation regarding empirical replication of data in software development is. On the other hand, it shows how the situation has improved, because version control and public availability of data improved. My personal takeaway is simply that open source changed over time, and since 2013 in particular.

A Side-channel Resistant Implementation of SABER §

Title: “A Side-channel Resistant Implementation of SABER” by Michiel Van Beirendonck, Jan-Pieter D'Anvers, Angshuman Karmakar, Josep Balasch, Ingrid Verbauwhede [url] [dblp]
Published in 2020 at IACR eprint 2020 and I read it in 2020-11
Abstract: The candidates for the NIST Post-Quantum Cryptography standardization have undergone extensive studies on efficiency and theoretical security, but research on their side-channel security is largely lacking. This remains a considerable obstacle for their real-world deployment, where side-channel security can be a critical requirement. This work describes a side-channel resistant instance of Saber, one of the lattice-based candidates, using masking as a countermeasure. Saber proves to be very efficient to mask due to two specific design choices: power-of-two moduli, and limited noise sampling of learning with rounding. A major challenge in masking lattice-based cryptosystems is the integration of bit-wise operations with arithmetic masking, requiring algorithms to securely convert between masked representations. The described design includes a novel primitive for masked logical shifting on arithmetic shares, as well as adapts an existing masked binomial sampler for Saber. An implementation is provided for an ARM Cortex-M4 microcontroller, and its side-channel resistance is experimentally demonstrated. The masked implementation features a 2.5x overhead factor, significantly lower than the 5.7x previously reported for a masked variant of NewHope. Masked key decapsulation requires less than 3,000,000 cycles on the Cortex-M4 and consumes less than 12kB of dynamic memory, making it suitable for deployment in embedded platforms.

open questions

quotes

summary

 

typo

A Sound Method for Switching between Boolean and Arith… §

Title: “A Sound Method for Switching between Boolean and Arithmetic Masking” by Louis Goubin [url] [dblp]
Published in 2001 at CHES 2001 and I read it in 2021-10
Abstract: Since the announcement of the Differential Power Analysis (DPA) by Paul Kocher and al., several countermeasures were proposed in order to protect software implementations of cryptographic algorithms. In an attempt to reduce the resulting memory and execution time overhead, a general method was recently proposed, consisting in “masking” all the intermediate data.

quotes

summary

One of the most beautiful papers I have read.
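
From memory, the Boolean→arithmetic direction of the conversion works roughly like the following Python sketch (my reconstruction, to be checked against the paper; the register width K is an assumption):

import secrets

K = 32                        # register width: an assumption for this sketch
MASK = (1 << K) - 1

def boolean_to_arithmetic(x_prime, r):
    """Boolean shares x = x_prime ^ r  ->  arithmetic share A with x = (A + r) mod 2^K.

    A fresh random gamma ensures that no intermediate value depends on x alone
    (the first-order security argument of the paper, as far as I recall it).
    """
    gamma = secrets.randbits(K)
    t = x_prime ^ gamma
    t = (t - gamma) & MASK
    t ^= x_prime
    gamma ^= r
    a = x_prime ^ gamma
    a = (a - gamma) & MASK
    a ^= t
    return a

# self-check
x, r = secrets.randbits(K), secrets.randbits(K)
assert (boolean_to_arithmetic(x ^ r, r) + r) & MASK == x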

Minor issues:

A system for typesetting mathematics §

Title: “A system for typesetting mathematics” by Brian W. Kernighan, Lorinda L. Cherry [url] [dblp]
Published in 1975 and I read it in 2022-01
Abstract: The syntax of the language is specified by a small context-free grammar; a compiler-compiler is used to make a compiler that translates this language into typesetting commands. Output may be produced on either a phototypesetter or on a terminal with forward and reverse half-line motions. The system interfaces directly with text formatting programs, so mixtures of text and mathematics may be handled simply. This paper was typeset by the authors using the system described.

error

quotes

summary

A design paper introducing a fundamental tool in the UNIX space. It is neat to see how the fundamentals of mathematical typesetting developed. It is truly an achievement that the paper was typeset using eqn & troff itself. However, the formal grammar behind eqn is not specified rigorously and seems pretty complex (custom disambiguation rules).

Additively Homomorphic Ring-LWE Masking §

Title: “Additively Homomorphic Ring-LWE Masking” by Oscar Reparaz, Ruan de Clercq, Sujoy Sinha Roy, Frederik Vercauteren, Ingrid Verbauwhede [url] [dblp]
Published in 2016 at PQCrypto 2016 and I read it in 2020-07
Abstract: In this paper, we present a new masking scheme for ringLWE decryption. Our scheme exploits the additively-homomorphic property of the existing ring-LWE encryption schemes and computes an additive-mask as an encryption of a random message. Our solution differs in several aspects from the recent masked ring-LWE implementation by Reparaz et al. presented at CHES 2015; most notably we do not require a masked decoder but work with a conventional, unmasked decoder. As such, we can secure a ring-LWE implementation using additive masking with minimal changes. Our masking scheme is also very generic in the sense that it can be applied to other additively-homomorphic encryption schemes.

quotes

The most important statement is equation 3:

decryption(c1, c2) ⊕ decryption(c1’, c2’) = decryption(c1 + c1’, c2 + c2’)

summary

A paper of rather low quality. The core essence is Equality 3: decryption(c1, c2) ⊕ decryption(c1’, c2’) = decryption(c1 + c1’, c2 + c2’). Besides that, there are many confusing statements. The workings of Equality 3 are barely mentioned (i.e. the correctness of Equality 3 is IMHO insufficiently discussed), but should be understandable for everyone in the field. Correctness is non-obvious, because we have an addition on the RHS and an XOR on the LHS. But it is trivial once you note that the RHS operates in ℤq whereas the LHS operates in ℤ2, and XOR is addition in ℤ2. Unlike the CHES 2015 paper, no additional circuitry is needed, which makes it a super interesting approach. The failure rate increases by a factor of 92 (3.6 × 10⁻⁵ → 3.3 × 10⁻³ per bit), which is a difficult problem of its own, but present in the CHES 2015 approach as well.

In summary, the approach computes the encryption of the original message (⇒ (c1, c2)) and also the encryption of some random value (⇒ (c1’, c2’)). Then, we don't decrypt dec(c1, c2), but dec(c1 + c1’, c2 + c2’).
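
A toy numeric illustration of Equality 3 in Python, with scalar “ciphertexts” instead of ring elements; the parameters and noise bounds are arbitrary toy choices, just enough to see the XOR-vs-addition correspondence:

import random

q = 12289                          # toy modulus; any q much larger than the noise works
random.seed(1)

def small():                       # toy "small noise" distribution
    return random.randint(-3, 3)

s = small()                        # secret
a = random.randrange(q)            # public parameter
b = (a * s + small()) % q          # public key: b = a*s + e

def encrypt(m):                    # m is a single bit
    r, e1, e2 = small(), small(), small()
    return (a * r + e1) % q, (b * r + e2 + m * (q // 2)) % q

def decrypt(c1, c2):               # threshold decoding with the (global) secret s
    x = (c2 - c1 * s) % q
    return 1 if q // 4 <= x < 3 * (q // 4) else 0

# Equality 3: XOR of decryptions == decryption of the component-wise sum, because
# the noise stays far below q/4 and the bit encodings m*(q//2) add up to the
# encoding of the XOR, up to a small offset absorbed by the decoder.
for m in (0, 1):
    for k in (0, 1):               # k plays the role of the random mask message
        c, cm = encrypt(m), encrypt(k)
        lhs = decrypt(*c) ^ decrypt(*cm)
        rhs = decrypt((c[0] + cm[0]) % q, (c[1] + cm[1]) % q)
        assert lhs == rhs == m ^ k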

typo

Aggregated Private Information Retrieval §

Title: “Aggregated Private Information Retrieval” by Lukas Helminger, Daniel Kales, Christian Rechberger [url] [dblp]
Published in 2020-05 and I read it in 2020-07
Abstract: With the outbreak of the coronavirus, governments rely more and more on location data shared by European mobile network operators to monitor the advancements of the disease. In order to comply with often strict privacy requirements, this location data, however, has to be anonymized, limiting its usefulness for making statements about a filtered part of the population, like already infected people.

quotes

summary

The paper implements a nice idea. The usefulness of the use case needs to be discussed with public health experts (what good does it do to know that many infected people live in this block of houses?). However, I have been told the entire paper was written in one month, which is quite impressive considering the technical depth in the field of Homomorphic Encryption.

There are many typos, and the main purpose of the protocol in Figure 1 was not comprehensible to me before talking to the authors. I also didn't understand in which ways ε-Differential Privacy is desirable, how it can be ensured, or which definition of “vector c is binary” they used before going into the details in section 3.2. Apparently, a binary vector is desirable to prevent leakage. For Figure 2, they used the Teχ package cryptocode to illustrate the protocol; to the best of my understanding, this is just a reiteration of Figure 1. On page 13, the paragraph “Note that, as we have already mentioned in Section 3.6” should be moved to the concluding remarks. On page 14, it is unclear what “isolated profiles” are. I didn't go through the details of section 5.

BasicBlocker: Redesigning ISAs to Eliminate Speculativ… §

Title: “BasicBlocker: Redesigning ISAs to Eliminate Speculative-Execution Attacks” by Jan Philipp Thoma, Jakob Feldtkeller, Markus Krausz, Tim Güneysu, Daniel J. Bernstein [url] [dblp]
Published in 2020-07 and I read it in 2020-08
Abstract: Recent research has revealed an ever-growing class of microarchitectural attacks that exploit speculative execution, a standard feature in modern processors. Proposed and deployed countermeasures involve a variety of compiler updates, firmware updates, and hardware updates. None of the deployed countermeasures have convincing security arguments, and many of them have already been broken.

Comment: Preprint

quotes

summary

Quite good paper.

“Preventing speculative execution exploits” conjured up more sophisticated expectations on my part, but in general the idea is legitimate and the research was done properly. Essentially, the additional bb instruction annotates how many of the following instructions contain no control-flow instructions, which allows a processor to fetch them without any speculative considerations. An additional lcnt instruction for RISC-V handles loops.
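
A toy model of the fetch logic in Python (my own illustration of the concept, not the paper's ISA encoding): each block starts with a bb marker announcing how many straight-line instructions follow, so the frontend can fetch exactly those without speculating, and only the block terminator decides the successor:

# Toy program: each basic block starts with ("bb", n), announcing that the next
# n instructions contain no control flow; the entry after those n instructions
# is the block terminator (branch/jump/ret). Purely illustrative.
PROGRAM = {
    "entry": [("bb", 2), ("add",), ("cmp",), ("branch", "then", "else")],
    "then":  [("bb", 1), ("mul",), ("jump", "exit")],
    "else":  [("bb", 1), ("sub",), ("jump", "exit")],
    "exit":  [("bb", 0), ("ret",)],
}

def run(program, cond=True):
    block, trace = "entry", []
    while True:
        header, *body = program[block]
        assert header[0] == "bb"
        straight = body[:header[1]]        # safe to fetch without speculation
        terminator = body[header[1]]       # the only place control flow happens
        trace += [op[0] for op in straight]
        if terminator[0] == "ret":
            return trace
        if terminator[0] == "jump":
            block = terminator[1]
        else:                              # ("branch", taken, not_taken)
            block = terminator[1] if cond else terminator[2]

print(run(PROGRAM, cond=True))    # ['add', 'cmp', 'mul']
print(run(PROGRAM, cond=False))   # ['add', 'cmp', 'sub']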

In the reading group it was criticized that the formal definitions were cumbersome and inappropriate, but I consider it a strong suit that the idea was not only presented but formalized. I think the negative aspects are that some statements are not properly attributed; for example, Observation 1 is not a contribution of this paper but prior work. Furthermore, statements like “One could easily extend our design to larger, more complex CPUs” seem unjustified. Intel Skylake CPUs are built with 19-stage pipelines, which certainly show different performance characteristics, so more research into the relation between the number of pipeline stages and performance is required. A new instruction type for RISC-V is also a non-trivial modification in practice. The corresponding code is not yet published, but the paper was released just one month prior to my reading. Another weak point is the selling, which focuses on the good performance of BasicBlocker in the nettle-aes and nettle-sha benchmarks. As cryptographic applications, these obviously do not rely on branching except for “for” loops, which is completely non-representative, yet statements like “In some cases, BasicBlocker even outperforms sophisticated branch predictors” can be found. Finally, Claudio pointed out very well that JIT compilation cannot be implemented with BasicBlocker, since the number of instructions of a block is most likely not known.

Overall, a quite good paper, but less aggressive selling would have increased its credibility from my perspective.

Benchmarking Post-quantum Cryptography in TLS §

Title: “Benchmarking Post-quantum Cryptography in TLS” by Christian Paquin, Douglas Stebila, Goutam Tamvada [url] [dblp]
Published in 2020-04 at PQCRYPTO 2020 and I read it in 2021-06
Abstract: Post-quantum cryptographic primitives have a range of tradeoffs compared to traditional public key algorithms, either having slower computation or larger public keys and ciphertexts/signatures, or both. While the performance of these algorithms in isolation is easy to measure and has been a focus of optimization techniques, performance in realistic network conditions has been less studied. Google and Cloudflare have reported results from running experiments with post-quantum key exchange algorithms in the Transport Layer Security (TLS) protocol with real users’ network traffic. Such experiments are highly realistic, but cannot be replicated without access to Internet-scale infrastructure, and do not allow for isolating the effect of individual network characteristics.

quotes

summary

Neat experiment. The Open Quantum Safe project's implementation is used to run a network experiment on Linux virtual network devices, where packet loss and round-trip times are the considered variables. The key finding is obvious, but the collected data is provided publicly, enabling future research. It was very interesting to read about Firefox telemetry's reference packet-loss data.

Crouching error, hidden markup [Microsoft Word] §

Title: “Crouching error, hidden markup [Microsoft Word]” by N. Holmes [url] [dblp]
Published in 2001-09 and I read it in 2021-08
Abstract:

quotes

summary

In this article Holmes derives, from his personal experience, the requirements for a modern text formatting system. He favors a markup language over the intransparent data model of WYSIWYG editors. He concludes that a dual system is the preferred user-centric way to go, whereas Teχ is praised for its visual result.

I like his fundamental discussion of standards and former text formatting tools. His preference for Lilac seems understandable, but lacks depth. While the concept was also applied by Macromedia's Dreamweaver to the HTML markup language, the situation becomes increasingly difficult with a larger data model and a more powerful layout scheme. CSS allows users to place objects at specific pixels, which makes it difficult to select and edit elements in a visual editor once elements get close to each other. One of the issues the author completely ignores is Teχ's nature as a representational notation (in contrast to HTML). As such, Teχ is less a markup language than a sequence of instructions to generate a document.

 

Cryptanalysis of ring-LWE based key exchange with key … §

Title: “Cryptanalysis of ring-LWE based key exchange with key share reuse” by Scott Fluhrer [url] [dblp]
Published in 2016 and I read it in 2020-06
Abstract:

quotes

summary

 

typos

Cryptographic competitions §

Title: “Cryptographic competitions” by Daniel J Bernstein [url] [dblp]
Published in 2020-12 and I read it in 2021-06-20
Abstract: Competitions are widely viewed as the safest way to select cryptographic algorithms. This paper surveys procedures that have been used in cryptographic competitions, and analyzes the extent to which those procedures reduce security risks.

quotes

summary

A very interesting meta-level read with many historical remarks. As an organizer of cryptographic competitions, djb also has the necessary background information to share. Performance as a criterion is broadly discussed in the first half. The contextualization of running competitions is done nicely, and personal remarks can be found towards the end. However, the paper cannot offer solutions to the posed (difficult) problems.

Hitchhiker's guide to the paper:

Significant quotes:

 

Cyclone: A safe dialect of C §

Title: “Cyclone: A safe dialect of C” by Greg Morrisett, James Cheney, Dan Grossman, Michael Hicks, Yanling Wang [url] [dblp]
Published in 2002 at USENIX 2002 and I read it in 2020-02
Abstract: Cyclone is a safe dialect of C. It has been designed from the ground up to prevent the buffer overflows, format string attacks, and memory management errors that are common in C programs, while retaining C’s syntax and semantics. This paper examines safety violations enabled by C’s design, and shows how Cyclone avoids them, without giving up C’s hallmark control over low-level details such as data representation and memory management.

Ambiguity

“Arrays and strings are converted to ?-pointers as necessary (automatically by the compiler).”

→ When exactly?

Example

“We don’t consider it an error if non-pointers are uninitialized. For example, if you declare a local array of non-pointers, you can use it without initializing the elements:

char buf[64];       // contains garbage ..
sprintf(buf,"a");   // .. but no err here
char c = buf[20]; // .. or even here

This is common in C code; since these array accesses are in-bounds, we allow them.”

Example

“However, this technique will not catch even the following simple variation:

char *itoa(int i) {
  char buf[20];
  char *z;
  sprintf(buf,"%d",i);
  z = buf;
  return z;
}

Here, the address of buf is stored in the variable z, and then z is returned. This passes gcc -Wall without complaint.”

Quotes

Region blocks

“Therefore, Cyclone provides a feature called growable regions. The following code declares a growable region, does some allocation into the region, and deallocates the region:

region h {
  int *x = rmalloc(h,sizeof(int));
  int ?y = rnew(h) { 1, 2, 3 };
  char ?z = rprintf(h,"hello");
}

The code uses a region block to start a new, growable region that lives on the heap. The region is deallocated on exit from the block (without an explicit free).”

Summary

Great paper with a pragmatic approach derived from the Typed Assembly Language project. Definitely worth a read. Essential for everyone interested in the Rust programming language, as this project inspired many ideas related to proper memory management and lifetimes. The implementation needs to be compiled on your own, but it is not maintained anymore anyway. Furthermore, there are follow-up papers documenting the progress of the project; they also introduce more drastic changes, which discards the label “pragmatic”.

Tagged unions

“We solve this in Cyclone in two steps. First, we add tagged unions to the language:”

tunion t {
  Int(int);
  Str(char ?);
};

[…]

void pr(tunion t x) {
  switch (x) {
    case &Int(i): printf("%d",i); break;
    case &Str(s): printf("%s",s); break;
  }
}

“The printf function itself accesses the tagged arguments through a fat pointer (Cyclone’s varargs are bounds checked)”

Detecting Unsafe Raw Pointer Dereferencing Behavior in… §

Title: “Detecting Unsafe Raw Pointer Dereferencing Behavior in Rust” by Zhijian Huang, Yong Jun Wang, Jing Liu [url] [dblp]
Published in 2018 at IEICE 2018 and I read it in 2021-11
Abstract: The rising systems programming language Rust is fast, efficient and memory safe. However, improperly dereferencing raw pointers in Rust causes new safety problems. In this paper, we present a detailed analysis into these problems and propose a practical hybrid approach to detecting unsafe raw pointer dereferencing behaviors. Our approach employs pattern matching to identify functions that can be used to generate illegal multiple mutable references (We define them as thief function) and instruments the dereferencing operation in order to perform dynamic checking at runtime. We implement a tool named UnsafeFencer and has successfully identified 52 thief functions in 28 real-world crates∗, of which 13 public functions are verified to generate multiple mutable references.

quotes

summary

A nice, specific LLVM compiler plugin to detect a special case where multiple mutable references are generated in the context of unsafe. Its practicality is limited, but the authors provide pragmatic tools to make it as accessible as possible.

typos/problems

EWD1300: The Notational Conventions I Adopted, and Why §

Title: “EWD1300: The Notational Conventions I Adopted, and Why” by Edsger W. Dijkstra [url] [dblp]
Published in 2002 and I read it in 2021-09
Abstract:

quotes

summary

A nice read summing up opinions related to mathematical notation. Prefix/infix notation, precedence rules, and the layout of proofs are discussed. Details are often lacking, but I expected it to be just an opinion paper.

Engineering a sort function §

Title: “Engineering a sort function” by Jon L. Bentley, M. Douglas McIlroy [url] [dblp]
Published in 1993 at Software - Practice and Experience and I read it in 2022-12-29
Abstract: We recount the history of a new qsort function for a C library. Our function is clearer, faster and more robust than existing sorts. It chooses partitioning elements by a new sampling scheme; it partitions by a novel solution to Dijkstra’s Dutch National Flag problem; and it swaps efficiently. Its behavior was assessed with timing and debugging testbeds, and with a program to certify performance. The design techniques apply in domains beyond sorting.

quotes

  • “C libraries have long included a qsort function to sort an array, usually implemented by Hoare’s Quicksort.1 Because existing qsorts are flawed, we built a new one. This paper summarizes its evolution.” (Bentley and McIlroy, 1993, p. 1249)
  • “Shopping around for a better qsort, we found that a qsort written at Berkeley in 1983 would consume quadratic time on arrays that contain a few elements repeated many times—in particular arrays of random zeros and ones.” (Bentley and McIlroy, 1993, p. 1249)
  • “The sort need not be stable; its specification does not promise to preserve the order of equal elements.” (Bentley and McIlroy, 1993, p. 1250)
  • “Sedgewick studied Quicksort in his Ph.D. thesis” (Bentley and McIlroy, 1993, p. 1251)
  • “A more efficient (and more familiar) partitioning method uses two indexes i and j. Index i scans up from the bottom of the array until it reaches a large element (greater than or equal to the partition value), and j scans down until it reaches a small element. The two array elements are then swapped, and the scans continue until the pointers cross. This algorithm is easy to describe, and also easy to get wrong—Knuth tells horror stories about inefficient partitioning algorithms.” (Bentley and McIlroy, 1993, p. 1252)
  • “As a benchmark, swapping two integers in inline code takes just under a microsecond.” (Bentley and McIlroy, 1993, p. 1253)
  • “using inline swaps for integer-sized objects and a function call otherwise.” (Bentley and McIlroy, 1993, p. 1254)
  • “When a colleague and I modified sort to improve reliability and efficiency, we found that techniques that improved performance for other sorting applications sometimes degraded the performance of sort.’” (Bentley and McIlroy, 1993, p. 1254)
  • “Partitioning about a random element takes Cn ≈ 1.386 n lg n comparisons. We now whittle away at the constant in the formula. If we were lucky enough to choose the median of every subarray as the partitioning element, we could reduce the number of comparisons to about n lg n.” (Bentley and McIlroy, 1993, p. 1254)
  • “We adopted Tukey’s ‘ninther’, the median of the medians of three samples, each of three elements.” (Bentley and McIlroy, 1993, p. 1255)
  • “Tripartite partitioning is equivalent to Dijkstra’s ‘Dutch National Flag’ problem.” (Bentley and McIlroy, 1993, p. 1257)
  • “Quicksort with split-end partitioning (Program 7) is about twice as fast as the Seventh Edition qsort.” (Bentley and McIlroy, 1993, p. 1258)
  • “Since the expected stack size is logarithmic in n, the stack is likely to be negligible compared to the data—only about 2,000 bytes when n = 1,000,000.” (Bentley and McIlroy, 1993, p. 1259)
  • “We therefore emulated Knuth’s approach to testing TeX: ‘I get into the meanest, nastiest frame of mind that I can manage, and I write the nastiest code I can think of; then I turn around and embed that in even nastier constructions that are almost obscene.’” (Bentley and McIlroy, 1993, p. 1260)
  • “P. McIlroy’s merge sort has guaranteed O (n log n) worst-case performance and is almost optimally adaptive to data with residual order (it runs the highly nonrandom certification suite of Figure 1 almost twice as fast as Program 7), but requires O (n) additional memory.” (Bentley and McIlroy, 1993, p. 1262)
  • “The key to performance is elegance, not battalions of special cases.” (Bentley and McIlroy, 1993, p. 1263)

summary

In this paper, the authors try to improve the performance of qsort, a Quicksort implementation by Lee McMahon based on Scowen's ‘Quickersort’ that shipped with the Seventh Edition Unix system. Predictably, that Quicksort exhibits quadratic behavior on an easy-to-discover set of inputs. As a result, the authors look at other proposals and use techniques like split-end partitioning and a better pivot choice (Tukey's ‘ninther’) to improve upon these sets of inputs.
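
For reference, a short Python sketch of Tukey's ‘ninther’ pivot selection quoted above (median of the medians of three spread-out triples); the concrete sampling positions are my simplification, and Program 7 only uses the ninther for large subarrays:

def median3(a, i, j, k):
    """Index of the median of a[i], a[j], a[k]."""
    if a[i] < a[j]:
        if a[j] < a[k]: return j
        return k if a[i] < a[k] else i
    if a[i] < a[k]: return i
    return k if a[j] < a[k] else j

def ninther(a, lo, hi):
    """Pivot index for a[lo:hi]: median of the medians of three spread-out triples."""
    n = hi - lo
    step, mid = n // 8, lo + n // 2
    m1 = median3(a, lo, lo + step, lo + 2 * step)
    m2 = median3(a, mid - step, mid, mid + step)
    m3 = median3(a, hi - 1 - 2 * step, hi - 1 - step, hi - 1)
    return median3(a, m1, m2, m3)

data = [27, 3, 14, 8, 41, 7, 19, 2, 33, 11, 25, 6, 38, 1, 22, 16]
pivot = data[ninther(data, 0, len(data))]   # pivot candidate for Quicksort-style partitioning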

An old but down-to-earth paper. It is interesting to see how asymptotic behavior was the main driver for considerations at that time (in contrast to today's cache-related considerations). But I want to emphasize that actual benchmarks were also provided in the paper.

typo

“to sift the” (Bentley and McIlroy, 1993, p. 1250)

Everything Old is New Again: Binary Security of WebAss… §

Title: “Everything Old is New Again: Binary Security of WebAssembly” by Daniel Lehmann, Johannes Kinder, Michael Pradel [url] [dblp]
Published in 2020 at USENIX Security 2020 and I read it in 2020-07
Abstract: WebAssembly is an increasingly popular compilation target designed to run code in browsers and on other platforms safely and securely, by strictly separating code and data, enforcing types, and limiting indirect control flow. Still, vulnerabilities in memory-unsafe source languages can translate to vulnerabilities in WebAssembly binaries. In this paper, we analyze to what extent vulnerabilities are exploitable in WebAssembly binaries, and how this compares to native code. We find that many classic vulnerabilities which, due to common mitigations, are no longer exploitable in native binaries, are completely exposed in WebAssembly. Moreover, WebAssembly enables unique attacks, such as overwriting supposedly constant data or manipulating the heap using a stack overflow. We present a set of attack primitives that enable an attacker (i) to write arbitrary memory, (ii) to overwrite sensitive data, and (iii) to trigger unexpected behavior by diverting control flow or manipulating the host environment. We provide a set of vulnerable proof-of-concept applications along with complete end-to-end exploits, which cover three WebAssembly platforms. An empirical risk assessment on real-world binaries and SPEC CPU programs compiled to WebAssembly shows that our attack primitives are likely to be feasible in practice. Overall, our findings show a perhaps surprising lack of binary security in WebAssembly. We discuss potential protection mechanisms to mitigate the resulting risks.

quotes

summary

Wonderful paper. Firstly, a security analysis was necessary and I think they covered an appropriate set of attack vectors. Secondly, it is, of course, sad to see that many old vulnerabilities still work on a new platform.

I am unconditionally in favor of true read-only memory in WebAssembly. As David points out, this can also be enforced by a compiler through appropriate runtime checks. However, David also specified a criterion: iff a memory attack could lead to an escape from the WASM sandbox, then it is an issue of the WebAssembly sandbox and should be prevented at this level.

I keep wondering about stack canaries and guard pages. Maybe it was a trade-off decision between security and performance, but I am not 100% convinced. Thirdly, the paper is well-structured and gives sufficient data to reason about its arguments. The attacks were okay, the introduction to WebAssembly was superb, and justifying the claims regarding indirect calls with quantitative data in section 6 was outstanding. I think everyone in IT security can easily follow it. I love it!

WebAssembly architecture:

toc

• Introduction
• Background on WebAssembly
• Security Analysis of Linear Memory
• Managed vs. Unmanaged Data
• Memory Layout
• Memory Protections
• Attack Primitives
• Obtaining a Write Primitive
• Stack-based Buffer Overflow
• Stack Overflow
• Heap Metadata Corruption
• Overwriting Data
• Overwriting Stack Data
• Overwriting Heap Data
• Overwriting “Constant” Data
• Triggering Unexpected Behavior
• Redirecting Indirect Calls
• Code Injection into Host Environment
• Application-specific Data Overwrite
• End-to-End Attacks
• Cross-Site Scripting in Browsers
• Remote Code Execution in Node.js
• Arbitrary File Write in Stand-alone VM
• Quantitative Evaluation
• Experimental Setup and Analysis Process
• Measuring Unmanaged Stack Usage
• Measuring Indirect Calls and Targets
• Comparing with Existing CFI Policies
• Discussion of Mitigations
• WebAssembly Language
• Compilers and Tooling
• Application and Library Developers
• Related Work
• Conclusion

FastSpec: Scalable Generation and Detection of Spectre… §

Title: “FastSpec: Scalable Generation and Detection of Spectre Gadgets Using Neural Embeddings” by M. Caner Tol, Koray Yurtseven, Berk Gulmezoglu, Berk Sunar [url] [dblp]
Published in 2020 and I read it in 2020-07
Abstract: Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules which fall short in covering subtle variations of known Spectre gadgets as well as demand a huge amount of time to analyze each conditional branch in software. Since it requires arduous effort to craft new gadgets manually, the evaluations of detection mechanisms are based only on a handful of these gadgets.

ambiguity

quotes

summary

Very advanced paper. Perfect combination of machine learning technology with microarchitectural attack work. It shows a huge effort and nice considerations regarding Generative Adversarial Networks. However, I could not understand the technical aspects of the machine learning part.

The masking rate (page 6) seems to be the percentage of hidden tokens during the training phase. Figure 4 is a little bit awkward and a little bit random. FastSpec/scan.sh seems to show how FastSpec was called to evaluate OpenSSL, and commands.txt tries to explain it somehow.

typo

High-speed Instruction-set Coprocessor for Lattice-bas… §

Title: “High-speed Instruction-set Coprocessor for Lattice-based Key Encapsulation Mechanism: Saber in Hardware” by Sujoy Sinha Roy, Andrea Basso [url] [dblp]
Published in 2020 at CHES 2020 and I read it in 2020-11
Abstract: In this paper, we present an instruction set coprocessor architecture for lattice-based cryptography and implement the module lattice-based post-quantum key encapsulation mechanism (KEM) Saber as a case study. To achieve fast computation time, the architecture is fully implemented in hardware, including CCA transformations. Since polynomial multiplication plays a performance-critical role in the module and ideal lattice-based public-key cryptography, a parallel polynomial multiplier architecture is proposed that overcomes memory access bottlenecks and results in a highly parallel yet simple and easy-to-scale design. Such multipliers can compute a full multiplication in 256 cycles, but are designed to target any area/performance trade-offs. Besides optimizing polynomial multiplication, we make important design decisions and perform architectural optimizations to reduce the overall cycle counts as well as improve resource utilization.

open questions

quotes

summary

A good read with appropriate comparisons with other schemes and implementations. Reasonable arguments are provided for design decisions. The runtime results were as expected.

Instruction                    Keygen   Encapsulation   Decapsulation
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
SHA3-256                          339             585             303
SHA3-512                            0              62              62
SHAKE-128                       1,461           1,403           1,403
Vector sampling                   176             176             176
Polynomial multiplications      2,685           3,592           4,484
  (share of column)               47%             54%             56%
Remaining operations              792             800           1,606
Total cycles                    5,453           6,618           8,034
Total time at 250 MHz         21.8 μs         26.5 μs         32.1 μs
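
To make concrete what one “full multiplication” of the 256-cycle multiplier computes, here is a plain schoolbook negacyclic multiplication in Z_q[x]/(x^n + 1) in Python; this is only the mathematical operation, not the hardware architecture:

def poly_mul_negacyclic(a, b, q, n=256):
    """Schoolbook product of a and b in Z_q[x] / (x^n + 1).

    x^n = -1, so coefficients overflowing index n wrap around with a sign flip.
    (The paper's multiplier computes this ring product in 256 cycles.)
    """
    assert len(a) == len(b) == n
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c

# tiny example with n = 4, q = 17: x^3 * x^2 = x^5 = -x  (mod x^4 + 1)
assert poly_mul_negacyclic([0, 0, 0, 1], [0, 0, 1, 0], 17, n=4) == [0, 16, 0, 0]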

typo

Historical Notes on the Fast Fourier Transform §

Title: “Historical Notes on the Fast Fourier Transform” by James W. Cooley, Peter A. W. Lewis, Peter D. Welch [url] [dblp]
Published in 1967 at IEEE 1967 and I read it in 2020-03
Abstract: The fast Fourier transform algorithm has a long and interesting history that has only recently been appreciated. In this paper, the contributions of many investigators are described and placed in historical perspective.

summary

How Usable are Rust Cryptography APIs? §

Title: “How Usable are Rust Cryptography APIs?” by Kai Mindermann, Philipp Keck, Stefan Wagner [url] [dblp]
Published in 2018 at QRS 2018 and I read it in 2021-12
Abstract: Context: Poor usability of cryptographic APIs is a severe source of vulnerabilities. Aim: We wanted to find out what kind of cryptographic libraries are present in Rust and how usable they are. Method: We explored Rust’s cryptographic libraries through a systematic search, conducted an exploratory study on the major libraries and a controlled experiment on two of these libraries with 28 student participants. Results: Only half of the major libraries explicitly focus on usability and misuse resistance, which is reflected in their current APIs. We found that participants were more successful using rustcrypto which we considered less usable than ring before the experiment. Conclusion: We discuss API design insights and make recommendations for the design of crypto libraries in Rust regarding the detail and structure of the documentation, higher-level APIs as wrappers for the existing low-level libraries, and selected, good-quality example code to improve the emerging cryptographic libraries of Rust.

quotes

summary

A well-written usability paper. The authors thoroughly determine a list of Rust cryptographic libraries and their history. Subsequently, one author tried to implement an advanced use case with the rust-crypto, ring, rust-openssl, rust_sodium, and octavo libraries. Then 22 participants with little cryptographic knowledge were asked to implement a basic use case. The results were as expected, but they lead to practical improvement suggestions (which partially happened).

typos

In defense of PowerPoint §

Title: “In defense of PowerPoint” by N. Holmes [url] [dblp]
Published in 2004 and I read it in 2021-08
Abstract:

quotes

summary

The argument is “blame the user, not the tool”. Otherwise this article just recites known arguments.

Languages and the computing profession §

Title: “Languages and the computing profession” by N. Holmes [url] [dblp]
Published in 2004 and I read it in 2021-09
Abstract:

quotes

summary

A very shallow read which proposes an intermediate language. He discusses some properties (specificity, precision, regularity, literality, neutrality) but fails to deliver a coherent requirements analysis.

McBits: Fast Constant-Time Code-Based Cryptography §

Title: “McBits: Fast Constant-Time Code-Based Cryptography” by Daniel J. Bernstein, Tung Chou, Peter Schwabe [url] [dblp]
Published in 2013 at CHES 2013 and I read it in 2022-04
Abstract: This paper presents extremely fast algorithms for code-based public-key cryptography, including full protection against timing attacks. For example, at a 2¹²⁸ security level, this paper achieves a reciprocal decryption throughput of just 60493 cycles (plus cipher cost etc.) on a single Ivy Bridge core. These algorithms rely on an additive FFT for fast root computation, a transposed additive FFT for fast syndrome computation, and a sorting network to avoid cache-timing attacks.

quotes

  • “To summarize, all of these examples of bitsliced speed records are for small Sboxes or large binary fields, while code-based cryptography relies on medium-size fields and seems to make much more efficient use of table lookups.” (Bernstein et al., 2013, p. 3)
  • “[…] we point out several ways that our decryption algorithm improves upon the algorithm used in [44]: we use an additive FFT rather than separate evaluations at each point (“Chien search”); we use a transposed additive FFT rather than applying a syndrome-conversion matrix; we do not even need to store the syndrome-conversion matrix, the largest part of the data stored in [44]; and we use a simple hash (see Section 6) rather than a constant-weight-word-to-bit-string conversion” (Bernstein et al., 2013, p. 6)
  • “For multipoint evaluation we use a characteristic-2 “additive FFT” algorithm introduced in 2010 [39] by Gao and Mateer (improving upon an algorithm by von zur Gathen and Gerhard in [40], which in turn improves upon an algorithm proposed by Wang and Zhu in [77] and independently by Cantor in [29]), together with some new improvements described below.” (Bernstein et al., 2013, p. 7)
  • “The basic idea of the algorithm is to write f in the form f0(x² − x) + x f1(x² − x) for two half-degree polynomials f0, f1 ∈ Fq[x]; this is handled efficiently by the ‘radix conversion’ described below. This form of f shows a large overlap between evaluating f(α) and evaluating f(α + 1). Specifically, (α + 1)² − (α + 1) = α² − α, so
    f(α) = f0(α² − α) + α f1(α² − α),
    f(α + 1) = f0(α² − α) + (α + 1) f1(α² − α).
    Evaluating both f0 and f1 at α² − α produces both f(α) and f(α + 1) with just a few more field operations: multiply the f1 value by α, add the f0 value to obtain f(α), and add the f1 value to obtain f(α + 1).” (Bernstein et al., 2013, p. 8)
  • “Consider the problem of computing the vector (∑α rα, ∑α rα α, …, ∑α rα α^d), given a sequence of q elements rα ∈ Fq indexed by elements α ∈ Fq, where q = 2^m. This vector is called a ’syndrome’.” (Bernstein et al., 2013, p. 10)
  • “The transposition principle states that if a linear algorithm computes a matrix M (i.e., M is the matrix of coefficients of the inputs in the outputs) then reversing the edges of the linear algorithm, and exchanging inputs with outputs, computes the transpose of M .” (Bernstein et al., 2013, p. 12)
  • “In particular, since syndrome computation is the transpose of multipoint evaluation, reversing a fast linear algorithm for multipoint evaluation produces a fast linear algorithm for syndrome computation.” (Bernstein et al., 2013, p. 12)
  • “This procedure produced exactly the desired number of operations in Fq but was unsatisfactory for two reasons. First, there were a huge number of nodes in the graph, producing a huge number of variables in the final software. Second, this procedure eliminated all of the loops and functions in the original software, producing a huge number of lines of code in the final software. Consequently the C compiler, gcc, became very slow as m increased and ran out of memory around m = 13 or m = 14, depending on the machine we used for compilation.” (Bernstein et al., 2013, p. 13)
  • “A ‘sorting network’ uses a sequence of ‘comparators’ to sort an input array S. A comparator is a data-independent pair of indices (i, j); it swaps S[i] with S[j] if S[i] > S[j]. This conditional swap is easily expressed as a data-independent sequence of bit operations: first some bit operations to compute the condition S[i] > S[j], then some bit operations to overwrite (S[i], S[j]) with (min {S[i], S[j]}, max {S[i], S[j]}). There are many sorting networks in the literature. We use a standard ‘odd-even’ sorting network by Batcher [4], which uses exactly (m² − m + 4)·2^(m−2) − 1 comparators to sort an array of 2^m elements. This is more efficient than other sorting networks such as Batcher’s bitonic sort [4] or Shell sort [72]. The odd-even sorting network is known to be suboptimal when m is very large (see [2]), but we are not aware of noticeably smaller sorting networks for the range of m used in code-based cryptography.” (Bernstein et al., 2013, p. 14)
  • “Our goals in this paper are more conservative, so we avoid this approach: we are trying to reduce, not increase, the number of questions for cryptanalysts.” (Bernstein et al., 2013, p. 16)
  • “Code-based cryptography is often presented as encrypting fixed-length plaintexts. McEliece encryption multiplies the public key (a matrix) by a k-bit message to produce an n-bit codeword and adds t random errors to the codeword to produce a ciphertext. The Niederreiter variant (which has several well-known advantages, and which we use) multiplies the public key by a weight-t n-bit message to produce an (n − k)-bit ciphertext. If the t-error decoding problem is difficult for the public code then both of these encryption systems are secure against passive attackers who intercept valid ciphertexts for random plaintexts.” (Bernstein et al., 2013, p. 16)
  • “However, this argument relies implicitly on a detailed analysis of how much information the attacker actually obtains through timing. By systematically eliminating all timing leaks we eliminate the need for such arguments and analyses.” (Bernstein et al., 2013, p. 18)
  • “A security proof for Niederreiter KEM/DEM appeared very recently in Persichetti’s thesis [64]. The proof assumes that the t-error decoding problem is hard; it also assumes that a decoding failure for w is indistinguishable from a subsequent MAC failure. This requires care in the decryption procedure; see below.” (Bernstein et al., 2013, p. 18)
  • “Many authors have stated that Patterson’s method is somewhat faster than Berlekamp’s method.” (Bernstein et al., 2013, p. 19)
  • “CFS is a code-based public-key signature system proposed by Courtois, Finiasz, and Sendrier in [31]. The main drawbacks of CFS signatures are large public-key sizes and inefficient signing; the main advantages are short signatures, fast verification, and post-quantum security.” (Bernstein et al., 2013, p. 19)

strong statement

“We have found many claims that NTRU is orders of magnitude faster than RSA and ECC, but we have also found no evidence that NTRU can match our speeds” (Bernstein et al., 2013, p. 5)

summary

The paper describes multiple optimization techniques useful for code-based cryptography. It mentions 400,000 decryptions per second at the 2⁸⁰ security level and 200,000 per second at the 2¹²⁸ security level. Among the techniques, Horner's rule, the Gao–Mateer additive FFT (replacing Chien search), and syndrome computation as the transpose of multipoint evaluation are mentioned.

A neat paper, starting with prior work, that describes optimized software implementations of Niederreiter's cryptosystem as well as the CFS signature scheme. One question discussed is whether bitslicing is worth the effort. Interestingly, the scheme is explained at the end (section 6), not the beginning. The data is described in text and not summarized in a table. I think the main contribution is the set of optimization techniques in section 3.
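
Since the data-independent sorting network is one of the key ingredients quoted above, here is the textbook construction of Batcher's odd-even mergesort comparator schedule in Python (power-of-two lengths only); this is a generic reconstruction, not the paper's bitsliced C code:

def oddeven_merge(lo, hi, r):
    """Comparators that merge two sorted halves of the index range lo..hi."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)
        yield from oddeven_merge(lo + r, hi, step)
        yield from ((i, i + r) for i in range(lo + r, hi - r, step))
    else:
        yield (lo, lo + r)

def oddeven_merge_sort(lo, hi):
    """Comparator pairs (i, j) sorting indices lo..hi (hi inclusive, power-of-two size)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort(lo, mid)
        yield from oddeven_merge_sort(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)

def sort_with_network(xs):
    """Data-independent sort: the comparator sequence never depends on the data."""
    xs = list(xs)
    for i, j in oddeven_merge_sort(0, len(xs) - 1):
        xs[i], xs[j] = min(xs[i], xs[j]), max(xs[i], xs[j])   # conditional swap
    return xs

comparators = list(oddeven_merge_sort(0, 7))
assert len(comparators) == (3 * 3 - 3 + 4) * 2 ** (3 - 2) - 1   # 19, matching the quoted formula for m = 3
assert sort_with_network([5, 3, 7, 1, 6, 2, 8, 4]) == [1, 2, 3, 4, 5, 6, 7, 8]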

NTRU: A ring-based public key cryptosystem §

Title: “NTRU: A ring-based public key cryptosystem” by Jeffrey Hoffstein, Jill Pipher, Joseph H. Silverman [url] [dblp]
Published in 1998 at ANTS 1998 and I read it in 2021-11
Abstract: We describe NTRU, a new public key cryptosystem. NTRU features reasonably short, easily created keys, high speed, and low memory requirements. NTRU encryption and decryption use a mixing system suggested by polynomial algebra combined with a clustering principle based on elementary probability theory. The security of the NTRU cryptosystem comes from the interaction of the polynomial mixing system with the independence of reduction modulo two relatively prime integers p and q.

quotes

summary

Very fundamental, important paper. NTRU has the unique property among lattice schemes of switching between moduli (“polynomial mixing system” or “lifting operation”). It is interesting to see that they saw RSA and GGH as competitors for NTRU. I think they did a very good job on the security analysis, but I didn't look into the details.

New directions in cryptography §

Title: “New directions in cryptography” by W. Diffie, M. Hellman [url] [dblp]
Published in 1976 at Invited paper, IEEE Transactions on Information Theory, Volume 22, Issue 6 and I read it in 2021-05
Abstract:

definitions

quotes

summary

A historically important document. The paper has SoK-style content giving a look at contemporary asymmetric cryptography in 1976. I liked the high-level overview and the collection of established vocabulary. The only part I failed to understand was the math related to “kn = 5000” after parameterizing Leslie Lamport's system.

It was very interesting to see their discussion of the relation to complexity theory. In particular, I have never seen this written in such clear words: “As systems whose strength had been so argued were repeatedly broken, the notion of giving mathematical proofs for the security of systems fell into disrepute and was replaced by certification via cryptanalytic assault.”

“We stand today on the brink of a revolution in cryptography” is Diffie-Hellman's first statement which is justified by the last sentence of the paper: “We hope this will inspire others to work in this fascinating area in which participation has been discouraged in the recent past by a nearly total government monopoly”.

typo

Number "Not" Used Once - Key Recovery Fault Attacks on… §

Title: “Number "Not" Used Once - Key Recovery Fault Attacks on LWE Based Lattice Cryptographic Schemes” by Prasanna Ravi, Shivam Bhasin, Anupam Chattopadhyay [url] [dblp]
Published in 2018 at COSADE 2019 and I read it in 2020-05
Abstract: This paper proposes a simple single bit flip fault attack applicable to several LWE (Learning With Errors Problem) based lattice based schemes like KYBER, NEWHOPE, DILITHIUM and FRODO which were submitted as proposals for the NIST call for standardization of post quantum cryptography. We have identified a vulnerability in the usage of nonce, during generation of secret and error components in the key generation procedure. Our fault attack, based on a practical bit flip model (single bit flip to very few bit flips for proposed parameter instantiations) enables us to retrieve the secret key from the public key in a trivial manner. We fault the nonce in order to maliciously use the same nonce to generate both the secret and error components which turns the LWE instance into an exactly defined set of linear equations from which the secret can be trivially solved for using Gaussian elimination.

ambiguity

What is z in the provided interval in section 2.6? Seems to be either an arbitrary integer or z of Algorithm 1 (NewHope)

error

“later converted to the NTT domain using the PolyBitRev function” … no, it is a separate step

inconsistency

quotes

summary

Short paper with tiny errors and no practical implementation (how successfully can the attack on Frodo be implemented? correction: unlike the paper, their presentation slides show evaluation data). The structure of the text is beautiful and it is well-written. I particularly enjoyed the summary of weak LWE instances. Masking is another countermeasure, but it is heavy-weight compared to the proposed error-correcting codes.
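A minimal sketch of why the fault works (a toy matrix version with my own names, ignoring the polynomial ring structure of the actual schemes): if the faulted nonce forces e = s, then b = A·s + s = (A + I)·s over Z_q, and Gaussian elimination recovers s.

    import random

    def gauss_solve(M, b, q):
        """Solve M x = b over Z_q (q prime, M assumed invertible) by Gauss-Jordan elimination."""
        n = len(M)
        A = [row[:] + [b[i]] for i, row in enumerate(M)]
        for col in range(n):
            piv = next(r for r in range(col, n) if A[r][col] % q != 0)
            A[col], A[piv] = A[piv], A[col]
            inv = pow(A[col][col], -1, q)          # modular inverse (Python >= 3.8)
            A[col] = [(v * inv) % q for v in A[col]]
            for r in range(n):
                if r != col and A[r][col] % q != 0:
                    f = A[r][col]
                    A[r] = [(A[r][c] - f * A[col][c]) % q for c in range(n + 1)]
        return [A[i][n] for i in range(n)]

    q, n = 3329, 4                                 # toy parameters
    A = [[random.randrange(q) for _ in range(n)] for _ in range(n)]
    s = [random.randrange(q) for _ in range(n)]    # secret; the fault forces e = s
    b = [(sum(A[i][j] * s[j] for j in range(n)) + s[i]) % q for i in range(n)]
    A_plus_I = [[(A[i][j] + (i == j)) % q for j in range(n)] for i in range(n)]
    assert gauss_solve(A_plus_I, b, q) == s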

typo

The aforementioned fault vulnerabilities associated with the nonce only occur due to the implementation strategies used and hence can we believe can be easily corrected albeit with a small performance overhead.

⇒ and hence, we believe, can be easily corrected

On the Security of Password Manager Database Formats §

Title: “On the Security of Password Manager Database Formats” by Paolo Gasti, Kasper B. Rasmussen [url] [dblp]
Published in 2012 at ESORICS 2012 and I read it in 2021-05
Abstract: Password managers are critical pieces of software relied upon by users to securely store valuable and sensitive information, from online banking passwords and login credentials to passport- and social security numbers. Surprisingly, there has been very little academic research on the security these applications provide.

question

Why does Definition 2 (IND-CDBA security) include 1/2 + negl(κ) as threshold whereas Definition 3 (MAL-CDBA security) uses negl(κ)?

Game 1 (on which Definition 2 relies) asks to decide a single binary value (⇒ 50 % success probability by guessing). Game 2 (on which Definition 3 relies) asks to create a new, valid DB (⇒ negligible success probability by guessing).

quotes

summary

Well written paper with a clear agenda. It is perfectly structured and can be followed easily.

The paper defines two notions of security and evaluates password managers against them. The notions are well defined, but their practicality is debatable:

  • “We argue that MAL-CDBA security (together with IND-CDBA security) is an appropriate security notion for a password manager database format in practice.”
  • “We defined two realistic security models, designed to represent the capabilities of real-world attacks.”

I don't think so. Adding entries to the password manager has a different impact on security than actually replacing existing entries. To retrieve the actual password, the adversary has to set up a typo-squatting domain and then replace the defined URL with the malicious one. Such a scenario is discussed in the paper, was evaluated and is practical, but the security notion does not distinguish between modification and extension. A pure extension attack has a much more limited impact.

I enjoyed the proper discussion of related work on security notions and the formal definition of password managers. As of 2021, some results are deprecated (e.g. Internet Explorer does not exist anymore) and some password managers are not maintained anymore (e.g. PINs). Recommendable for everyone to see how the database formats are defined.

typo

On the criteria to be used in decomposing systems into… §

Title: “On the criteria to be used in decomposing systems into modules” by David Lorge Parnas [url] [dblp]
Published in 1972 at CACM, Volume 15, 1972 and I read it in 2021-03
Abstract: This paper discusses modularization as a mechanism for improving the flexibility and comprehensibility of a system while allowing the shortening of its development time. The effectiveness of a "modularization" is dependent upon the criteria used in dividing the system into modules. Two system design problems are presented, and for each, both a conventional and unconventional decomposition are described. It is shown that the unconventional decompositions have distinct advantages for the goals outlined. The criteria used in arriving at the decompositions are discussed. The unconventional decomposition, if implemented with the conventional assumption that a module consists of one or more subroutines, will be less efficient in most cases. An alternative approach to implementation which does not have this effect is sketched.

quotes

summary

First, someone needs to get familiar with KWIC (recognize the paper reference in the section below). KWIC felt like an arbitrary index someone came up with. I got confused by phrases like “the characters are packed four to a word”, which make little sense outside the index context of a book. But after reading the paper, I looked up the Wikipedia article and learned about its use case (an index of keywords before full-text search was available, or in print). The paper is considered an ACM Classic and Parnas got high praise for it.

Essentially, I first had to understand the setting in which this paper was written. I grew up with data encapsulation in object-oriented programming and local scoping in programming languages, and manipulating data behind a pointer was already a foreign, dangerous idea. The setting is the transition from assembly language towards higher-level languages, with questions regarding information hiding arising.

In modular design, his double dictum of high cohesion within modules and loose coupling between modules is fundamental to modular design in software. However, in Parnas's seminal 1972 paper On the Criteria to Be Used in Decomposing Systems into Modules, this dictum is expressed in terms of information hiding, and the terms cohesion and coupling are not used. He never used them.
Wikipedia: David Parnas

I would define a module as a set of functionality (independent of its representation in a programming language). High cohesion within modules and loose coupling between modules is a defining criterion for a good programmer. What I consider an open question, and what often triggers bugs, is the missing documentation for the interface between modules. Often a data structure transfers the data from one module to another. An informal description often triggers different expectations regarding the content of the data structure.

Back to the paper, it illustrates the decomposition of a system by two exemplary modularizations. Whereas the first decomposition was created along the major steps of the processing routines, the second decomposition was created with information hiding in mind. Then several recommendable criteria for decompositions are mentioned:

  1. A data structure, its internal linkings, accessing procedures and modifying procedures are part of a single module (see the sketch after this list).
  2. The sequence of instructions necessary to call a given routine and the routine itself are part of the same module
  3. The formats of control blocks used in queues in operating systems and similar programs must be hidden within a “control block module.”
  4. Character codes, alphabetic orderings, and similar data should be hidden in a module for greatest flexibility
  5. The sequence in which certain items will be processed should (as far as practical) be hidden within a single module
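A minimal sketch of criterion 1 (my own illustration, loosely modeled on the line-storage module of the paper's KWIC example): the data structure and every procedure that reads or modifies it live in one module, so callers never depend on the internal representation.

    class LineStorage:
        """Hides how lines and words are stored; only these procedures touch the data."""

        def __init__(self):
            self._lines = []                 # internal representation, free to change

        def set_line(self, index, words):
            while len(self._lines) <= index:
                self._lines.append([])
            self._lines[index] = list(words)

        def word(self, line_index, word_index):
            return self._lines[line_index][word_index]

        def line_count(self):
            return len(self._lines)

    store = LineStorage()
    store.set_line(0, ["on", "the", "criteria"])
    assert store.word(0, 2) == "criteria"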

In the end, I think the paper advocates a clean style which (in some sense and with some limitations) still holds today (e.g. web frameworks heavily limit the way you can define your architecture). I recommend that every programmer reflect on possible decompositions of a system, because the most intuitive one might not be the best. The notion of the flowchart approach being the sequential one is, however, awkward and foreign to me.

PDF/A considered harmful for digital preservation §

Title: “PDF/A considered harmful for digital preservation” by Marco Klindt [url] [dblp]
Published in 2017 at iPRES 2017 and I read it in 2020-12
Abstract: Today, the Portable Document Format (PDF) is the prevalent file format for the exchange of fixed content electronic documents for publication, research, and dissemination work in the academic and cultural heritage domains. Therefore it is not surprising that PDF/A is perceived to be an archival format suitable for digital archiving workflows.

clarification needed

errors

“The textual markup of Markdown variants is machine actionable while being human friendly to read at the same time. It is suitable for structured texts (including lists and tables) where the exact layout is not as important. Markdown is not well suited for validation.”

quotes

Interesting:

Well put:

summary

This paper summarizes some shortcomings why/how PDF/A is not the final solution for long-term archival of documents.

PDF features not available in PDF/A (list from external resource):

  • Embedded video and audio files
  • encryption
  • transparency
  • Javascript
  • executable files
  • functions
  • external links
  • layers

I would summarize those arguments as a lack of annotations in the PDF to make PDFs accessible for machines and people with visual disabilities. This makes it difficult to index and retrieve document information as part of a large corpus (→ database). The problems boil down to the design of PDF, which allows multiple ways of encoding text information, which might lose e.g. reading order information. Refer to Table 2, for example. In the end, the tooling support is not great, but a lack of alternatives can be identified. However, if we consider information extraction capabilities as more important than, e.g., font embedding and reproducible layout, some alternatives are mentioned in section 5.1 (e.g. HTML/CSS or ODF/OOXML). One intriguing analogy is given in the paper: musical typesetting can be done with PDF as output, but retrieving information about the music is very difficult.

In the conclusion the author admits that the PDF/A authors were aware of its shortcomings: “The intent was not to claim that PDF-based solutions are the best way to preserve electronic documents. PDF/A simply defines an archival profile of PDF that is more amenable to long-term preservation than traditional PDF”

In the end, the paper is a summary which does not provide any solution as pointed out by the author. As a critic of Markdown, I am saddened to see that Markdown was even mentioned (but other markup languages are neglected).

Piret and Quisquater’s DFA on AES Revisited §

Title: “Piret and Quisquater’s DFA on AES Revisited” by Christophe Giraud, Adrian Thillard [url] [dblp]
Published in 2010 at and I read it in 2020-06
Abstract: At CHES 2003, Piret and Quisquater published a very efficient DFA on AES which has served as a basis for many variants published afterwards. In this paper, we revisit P&Q’s DFA on AES and we explain how this attack can be much more efficient than originally claimed. In particular, we show that only 2 (resp. 3) faulty ciphertexts allow an attacker to efficiently recover the key in the case of AES-192 (resp. AES-256). Our attack on AES-256 is the most efficient attack on this key length published so far.

quotes

summary

Good paper. Ideas are immediate. Not all attacks presented give runtimes, but the (more important) number of faulty ciphertexts is analyzed properly. The original P&Q attack is neat in general.

The original attack from 2003 uniquely identifies the key with 2 faulty ciphertext pairs with probability 98%.

typo

Figure 2 swaps the last SubBytes and ShiftRows operation

Power analysis attack on Kyber §

Title: “Power analysis attack on Kyber” by Alexandre Karlov, Natacha Linard de Guertechin [url] [dblp]
Published in 2021-09 at and I read it in 2021-09
Abstract: This paper describes a practical side-channel power analysis on the CRYSTALS-Kyber key-encapsulation mechanism. In particular, we analyse the polynomial multiplication in the decapsulation phase to recover the secret key in a semi-static setting. The power analysis attack was performed against the KYBER512 implementation from pqm4 [1] running on an STM32F3 Cortex-M4 CPU.

errors

quotes

summary

The idea is immediate (CPA on the multiplication in the decryption step; implementation by pqm4; ChipWhisperer Pro with the CW308 UFO board and an STM32F3 target). The scientific contribution is the evaluation data. That 200 traces suffice is nice. The paper's writing is very preliminary (e.g. the definition of B in Bη is missing, the semi-static setting is not defined, etc.).
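A minimal correlation power analysis sketch (my own simplification: a Hamming-weight model of the coefficient-wise product c·s mod q; the paper targets the pointwise multiplication in Kyber's decapsulation with real ChipWhisperer traces):

    import numpy as np

    def cpa_best_guess(traces, cipher_coeffs, candidates, q=3329):
        """Rank secret-coefficient candidates by max |Pearson correlation| over all time samples."""
        hw = np.array([bin(v).count("1") for v in range(q)], dtype=float)
        tc = traces - traces.mean(axis=0)                     # centered traces, shape (N, T)
        t_norm = np.sqrt((tc * tc).sum(axis=0)) + 1e-12       # guard against flat samples
        best = None
        for s in candidates:
            h = hw[(cipher_coeffs * s) % q]                   # hypothetical leakage, shape (N,)
            h = h - h.mean()
            corr = (h @ tc) / (np.sqrt(h @ h) * t_norm + 1e-12)
            score = np.max(np.abs(corr))
            if best is None or score > best[1]:
                best = (s, score)
        return best                                           # (guessed coefficient, score)

With on the order of 200 traces, as the authors report, the correct candidate should produce a visibly higher correlation peak than the wrong ones.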

Practical CCA2-Secure and Masked Ring-LWE Implementati… §

Title: “Practical CCA2-Secure and Masked Ring-LWE Implementation” by Tobias Oder, Tobias Schneider, Thomas Pöppelmann [url] [dblp]
Published in 2018 at TCHES 2018 and I read it in 2020-10
Abstract: During the last years public-key encryption schemes based on the hardness of ring-LWE have gained significant popularity. For real-world security applications assuming strong adversary models, a number of practical issues still need to be addressed. In this work we thus present an instance of ring-LWE encryption that is protected against active attacks (i.e., adaptive chosen-ciphertext attacks) and equipped with countermeasures against side-channel analysis. Our solution is based on a postquantum variant of the Fujisaki-Okamoto (FO) transform combined with provably secure first-order masking. To protect the key and message during decryption, we developed a masked binomial sampler that secures the re-encryption process required by FO. Our work shows that CCA2-secured RLWE-based encryption can be achieved with reasonable performance on constrained devices but also stresses that the required transformation and handling of decryption errors implies a performance overhead that has been overlooked by the community so far. With parameters providing 233 bits of quantum security, our implementation requires 4,176,684 cycles for encryption and 25,640,380 cycles for decryption with masking and hiding countermeasures on a Cortex-M4F. The first-order security of our masked implementation is also practically verified using the non-specific t-test evaluation methodology.

quotes

summary

iae

typo

Practical Evaluation of Masking for NTRUEncrypt on ARM… §

Title: “Practical Evaluation of Masking for NTRUEncrypt on ARM Cortex-M4” by Thomas Schamberger, Oliver Mischke, Johanna Sepulveda [url] [dblp]
Published in 2019 at COSADE 2019 and I read it in 2021-10
Abstract: To protect against the future threat of large scale quantum computing, cryptographic schemes that are considered appropriately secure against known quantum algorithms have gained in popularity and are currently in the process of standardization by NIST. One of the more promising so-called post-quantum schemes is NTRUEncrypt, which withstood scrutiny from the scientific community for over 20 years.

quotes

summary

A good read and average academic paper.

This paper presents a masking scheme for the NIST round 1 submission NTRUEncrypt, variants NTRU-443 and NTRU-743. They discuss 4 proposed countermeasures (1 of them is masking) and conclude that two of the countermeasures should only be applied in combination with masking. They go on to mask the polynomial multiplication per coefficient in the decryption as m = f * (e + mask) − f * mask = f * e. Then they evaluate the countermeasure on an STM32F303RCT7 ARM Cortex-M4 microcontroller mounted on the NewAE CW308 UFO board.
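A tiny sketch of the masking identity in the ring Z_q[x]/(x^N − 1) (toy parameters and helper names are mine, not NTRU-443/-743):

    import random

    def conv(a, b, q):
        """Cyclic convolution, i.e. multiplication in Z_q[x]/(x^N - 1)."""
        n = len(a)
        c = [0] * n
        for i in range(n):
            for j in range(n):
                c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % q
        return c

    def add(a, b, q): return [(x + y) % q for x, y in zip(a, b)]
    def sub(a, b, q): return [(x - y) % q for x, y in zip(a, b)]

    n, q = 8, 2048                                  # toy parameters
    f = [random.randrange(q) for _ in range(n)]     # secret key polynomial
    e = [random.randrange(q) for _ in range(n)]     # ciphertext polynomial
    r = [random.randrange(q) for _ in range(n)]     # fresh random mask

    assert sub(conv(f, add(e, r, q), q), conv(f, r, q), q) == conv(f, e, q)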

I think they could have skipped trivial figures 4 and 5 and instead argue in more detail why the masking only has to be applied to the multiplication but not other parts.

Region-Based Memory Management in Cyclone §

Title: “Region-Based Memory Management in Cyclone” by Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang, James Cheney [url] [dblp]
Published in 2002-06 at LPDI 2002 and I read it in 2023-05-14
Abstract: Cyclone is a type-safe programming language derived from C. The primary design goal of Cyclone is to let programmers control data representation and memory management without sacrificing type-safety. In this paper, we focus on the region-based memory management of Cyclone and its static typing discipline. The design incorporates several advancements, including support for region subtyping and a coherent integration with stack allocation and a garbage collector. To support separate compilation, Cyclone requires programmers to write some explicit region annotations, but a combination of default annotations, local type inference, and a novel treatment of region effects reduces this burden. As a result, we integrate C idioms in a region-based framework. In our experience, porting legacy C to Cyclone has required altering about 8% of the code; of the changes, only 6% (of the 8%) were region annotations.

quotes

  • “Cyclone is a type-safe programming language derived from C. The primary design goal of Cyclone is to let programmers control data representation and memory management without sacrificing type-safety. In this paper, we focus on the region-based memory management of Cyclone and its static typing discipline.” (Grossman et al., p. 1)
  • “to reduce the need for type casts, Cyclone has features like parametric polymorphism, subtyping, and tagged unions. To prevent bounds violations without making hidden data-representation changes, Cyclone has a variety of pointer types with different compile-time invariants and associated run-time checks.” (Grossman et al., p. 1)
  • “Following the seminal work of Tofte and Talpin [28], the system is region-based: each object lives in one region and, with the exception that a distinguished heap region may be garbage collected, a region’s objects are all deallocated simultaneously.” (Grossman et al., p. 2)
  • “In our experience, porting C code has required altering about 8% of the code, and the vast majority of changes have not been region annotations.” (Grossman et al., p. 2)
  • “Dynamic regions are created with the construct region r{s}, where r is an identifier and s is a statement. The region’s lifetime is the execution of s. In s, r is bound to a region handle, which primitives rmalloc and rnew use to allocate objects into the associated region. For example, rnew(r) 3 returns a pointer to an int allocated in the region of handle r and initialized to 3.” (Grossman et al., p. 2)
  • “Like a declaration block, a dynamic region is deallocated precisely when execution leaves the body of the enclosed statement.” (Grossman et al., p. 2)
  • “Pointers to arrays of unknown size (denoted τ ?) are implemented with extra fields to support bounds-checks, but this design is orthogonal to regions.” (Grossman et al., p. 2)
  • “int*ρ describes a pointer to an int that is in the region whose name is ρ.” (Grossman et al., p. 2)
  • “A handle for a region corresponding to ρ has the type region_t<ρ>.” (Grossman et al., p. 3)
  • “A block labeled L (e.g., L:{int x=0;s}) has name ρL and refers to the stack region that the block creates.” (Grossman et al., p. 3)
  • “We can now give types to some small examples. If e1 has type region_t<ρ> and e2 has type τ, then rnew(e1) e2 has type τ*ρ” (Grossman et al., p. 3)
  • “Preventing dangling-pointer dereferences. To dereference a pointer, safety demands that its region be live.” (Grossman et al., p. 3)
  • “Functions in Cyclone are region polymorphic; they can abstract the actual regions of their arguments or results. That way, functions can manipulate pointers regardless of whether they point into the stack, the heap, or a dynamic region.” (Grossman et al., p. 3)
  • “The ? is Cyclone notation for a pointer to a dynamically sized array.” (Grossman et al., p. 3)
  • “Because struct definitions can contain pointers, Cyclone allows these definitions to be parameterized by region names.” (Grossman et al., p. 3)
  • “To solve this problem, we observe that if the region corresponding to ρ1 outlives the region corresponding to ρ2, then it is sound to use a value of type τ*ρ1 where we expect one of type τ*ρ2. Cyclone supports such coercions implicitly. The last-in-first-out region discipline makes such outlives relationships common: when we create a region, we know every region currently alive will outlive it.” (Grossman et al., p. 4)
  • “To ensure soundness, we do not allow casting τ1*ρ to τ2*ρ, even if τ1 is a subtype of τ2, as this cast would allow putting a τ2 in a location where other code expects a τ1. (This problem is the usual one with covariant subtyping on references.)” (Grossman et al., p. 4)
  • “We emphasize that our approach to inference is purely intraprocedural and that prototypes for functions are never inferred. Rather, we use a default completion of partial prototypes to minimize region annotations. This approach permits separate compilation.” (Grossman et al., p. 4)
  • “the compiler deduces an appropriate annotation based on context: 1. For local declarations, a unification-based inference engine infers the annotation from the declaration’s (intraprocedural) uses. This local inference works well in practice, especially when declarations have initializers. 2. Omitted region names in argument types are filled in with fresh region names that are generalized implicitly. So by default, functions are region polymorphic without any region equalities. 3. In all other contexts (return types, globals, type definitions), omitted region names are filled in with ρH (i.e., the heap). This default works well for global variables and for functions that return heap-allocated results. However, it fails for functions like strcpy that return one of their parameters. Without looking at the function body, we cannot determine which parameter (or component of a parameter) the function might return.” (Grossman et al., p. 4)
  • “Cyclone does not have closures, but it has other typing constructs that hide regions. In particular, Cyclone provides existential types [22, 14], which suffice to encode closures [21] and simple forms of objects [5]. Therefore, it is possible in Cyclone for pointers to escape the scope of their regions.” (Grossman et al., p. 5)
  • “To address this problem, the Cyclone type system keeps track of the subset of region names that are considered live at each control-flow point. Following Walker, Crary, and Morrisett [29], we call the set of live regions the capability. To allow dereferencing a pointer, the type system ensures that the associated region name is in the capability. Similarly, to allow a function call, Cyclone ensures that regions the function might access are all live. To this end, function types carry an effect that records the set of regions the function might access.” (Grossman et al., p. 5)
  • “The second major departure from TT is that we do not have effect variables.” (Grossman et al., p. 5)
  • “To simplify the system while retaining the benefit of effect variables, we use a type operator, regions_of(τ). This novel operator is just part of the type system; it does not exist at runtime. Intuitively, regions_of(τ) represents the set of regions that occur free in τ” (Grossman et al., p. 5)
  • “For type variables, regions_of(α) is treated as an abstract set of region variables, much like effect variables. For example, regions_of(α*ρ) = {ρ} ∪ regions_of(α)” (Grossman et al., p. 6)
  • “Cyclone supports existential types, which allow programmers to encode closures.” (Grossman et al., p. 6)
  • “In a separate technical report [15], we have defined an operational model of Core Cyclone, formalized the type system, and proven type soundness.” (Grossman et al., p. 6)
  • “The code generator ensures that regions are deallocated even when their lifetimes end due to unstructured control flow.” (Grossman et al., p. 8)
  • “In this fashion, we ensure that a region is always deallocated when control returns.” (Grossman et al., p. 8)
  • “We took two approaches to porting. First, we changed all the programs as little as possible to make them correct Cyclone programs. Then, for cfrac and mini_httpd, we regionized the code: We made functions more region polymorphic and, where possible, eliminated heap allocation in favor of dynamic region allocation with rnew. We also added compiler-checked “not null” annotations to pointer types where possible to avoid some null checks.” (Grossman et al., p. 8)
  • “There are two interesting results regarding the difficulty of minimal porting. First, the overall changes in the programs are relatively small — less than 10% of the program code needed to be changed. The vast majority of the differences arise from pointer-syntax alterations. These changes are typically easy to make — e.g., the type of strings are changed from char * to char ?. We are currently experimenting with interpreting char * as a safe null-terminated string type by default; doing so allows many fewer changes. The most encouraging result is that the number of region annotations is small: only 124 changes (which account for roughly 6% of the total changes) in more than 18,000 lines of code. The majority of these changes were completely trivial, e.g., many programs required adding ρH annotations to argv so that arguments could be stored in global variables.” (Grossman et al., p. 9)
  • “The cost of porting a program to use dynamic regions was also reasonable; in this case roughly 13% of the total differences were region-related.” (Grossman et al., p. 9)
  • “For the non-web benchmarks (and some of the web benchmarks) the median and mean were essentially identical, and the standard deviation was at most 2% of the mean. The factor columns for the Cyclone programs show the slowdown factor relative to the C versions.” (Grossman et al., p. 10)
  • “we pay a substantial penalty for compute-intensive benchmarks; the worst is grobner, which is almost a factor of three slower than the C version.” (Grossman et al., p. 10)
  • “As Table 3 demonstrates, bounds-checks are also an important component of the overhead, but less than we expected. We found that a major cost is due to the representation of fat pointers. A fat pointer is represented with three words: the base address, the bounds address, and the current pointer location (essentially the same representation used by McGary’s bounded pointers [20]).” (Grossman et al., p. 10)
  • “Many systems, including but certainly not limited to LCLint [10, 9], SLAM [3], Safe-C [2], and CCured [25], aim to make C code safe.” (Grossman et al., p. 10)
  • “As they note, region-based programming in C is an old idea; they contribute language support for efficient reference counting to detect if a region is deallocated while there remain pointers to it (that are not within it). This dynamic system has no apriorirestrictions on regions’ lifetimes and a pointer can point anywhere, so the RC approach can encode more memory-management idioms.” (Grossman et al., p. 11)
  • “Instead, we are currently developing a traditional intraprocedural flow analysis to track region aliasing and region lifetimes.” (Grossman et al., p. 11)

summary

This paper introduces region-based memory management for the C programming language, following the seminal work of Tofte and Talpin. A compiler is written that takes regions into account: dereferencing a dangling pointer becomes a compile-time error, memory is reclaimed by deallocating a region once the enclosing statement's body goes out of scope, and arbitrary-size arrays get run-time bounds checking via fat pointers.

The paper describes the extended type system, notions of subtyping, polymorphism, and “outliving” for regions, provides a soundness proof (in full in a technical report), and presents inference techniques that avoid most region declarations. As a result, only about 8% of the code had to be changed. Performance is often barely affected, but the run-time checks worsen the performance of compute-intensive tasks (the worst benchmark is almost a factor of three slower than the C version).

SEVurity: No Security Without Integrity §

Title: “SEVurity: No Security Without Integrity” by Luca Wilke, Jan Wichelmann, Mathias Morbitzer, Thomas Eisenbarth [url] [dblp]
Published in 2020 at IEEE Symposium on Security and Privacy 2020 and I read it in 2020-06
Abstract: One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them. Full memory encryption for Virtual Machines (VM) protects against curious cloud providers as well as otherwise compromised hypervisors. AMD Secure Encrypted Virtualization (SEV) is the most prevalent hardware-based full memory encryption for VMs. Its newest extension, SEV-ES, also protects the entire VM state during context switches, aiming to ensure that the host neither learns anything about the data that is processed inside the VM, nor is able to modify its execution state. Several previous works have analyzed the security of SEV and have shown that, by controlling I/O, it is possible to exfiltrate data or even gain control over the VM’s execution. In this work, we introduce two new methods that allow us to inject arbitrary code into SEV-ES secured virtual machines. Due to the lack of proper integrity protection, it is sufficient to reuse existing ciphertext to build a high-speed encryption oracle. As a result, our attack no longer depends on control over the I/O, which is needed by prior attacks. As I/O manipulation is highly detectable, our attacks are stealthier. In addition, we reverse-engineer the previously unknown, improved Xor-Encrypt-Xor (XEX) based encryption mode, that AMD is using on updated processors, and show, for the first time, how it can be overcome by our new attacks.

Intel SGX's role

Intel Software Guard Extensions (SGX) was the first widely available solution for protecting data in RAM. However, it can only protect a small chunk of RAM, not the VM as a whole.

Memory encryption systems for VMs

Comparison: S. Mofrad, F. Zhang, S. Lu, and W. Shi, “A comparison study of intel sgx and amd memory encryption technology,”, 2018

Notes

On virtual memory with virtual machines

On virtualized systems, two different page tables are used. Within the VM, the VA used by the guest, the Guest Virtual Address (GVA), is translated to the Guest Physical Address (GPA). The GPA is the address which the VM considers to be the PA. However, on the host system itself another page table is introduced, to allow multiple VMs to run on the same physical machine. This second page table is called the Second Level Address Translation (SLAT), or Nested Page Table (NPT) [5]. The NPT translates the GPA into the Host Physical Address (HPA), the actual address of the data in physical memory.
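A toy two-stage lookup (addresses and page-table contents are made up) to make the GVA → GPA → HPA chain concrete:

    PAGE = 0x1000                                      # 4 KiB pages

    guest_page_table = {0x0040_0000: 0x0001_2000}      # GVA page -> GPA page (guest-managed)
    nested_page_table = {0x0001_2000: 0x7FFE_3000}     # GPA page -> HPA page (hypervisor-managed NPT/SLAT)

    def translate(gva):
        page, offset = gva & ~(PAGE - 1), gva & (PAGE - 1)
        gpa = guest_page_table[page] | offset
        hpa = nested_page_table[gpa & ~(PAGE - 1)] | offset
        return gpa, hpa

    assert translate(0x0040_0ABC) == (0x0001_2ABC, 0x7FFE_3ABC)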

summary

Trust in cloud providers

One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them.

Tweakable block ciphers

One popular method for storage encryption are tweakable block ciphers, such as AES-XTS [1], which is, e.g., used in Apple’s FileVault, MS Bitlocker and Android’s file-based encryption. Tweakable block ciphers provide encryption without data expansion as well as some protection against plaintext manipulation. A tweak allows to securely change the behavior of the block cipher, similar to instantiating it with a new key, but with little overhead.

XEX and Xor-Encrypt (XE) are methods to turn a block cipher such as AES into a tweakable block cipher, where a tweak-derived value is XORed with the plaintext before encryption (and XORed again with the output after encryption in the case of XEX).
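A minimal sketch of the XEX structure for one block (the tweak derivation is omitted and the pycryptodome package is assumed for the AES primitive; XE would simply drop the second XOR):

    from Crypto.Cipher import AES   # assumes pycryptodome

    def xor16(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def xex_encrypt_block(key, tweak, plaintext):
        return xor16(AES.new(key, AES.MODE_ECB).encrypt(xor16(plaintext, tweak)), tweak)

    def xex_decrypt_block(key, tweak, ciphertext):
        return xor16(AES.new(key, AES.MODE_ECB).decrypt(xor16(ciphertext, tweak)), tweak)

    key, tweak, pt = bytes(16), bytes(range(16)), b"sixteen byte msg"
    assert xex_decrypt_block(key, tweak, xex_encrypt_block(key, tweak, pt)) == pt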

typo

“before it continuous its operation” → “before it continues its operation.”

SOK: On the Analysis of Web Browser Security §

Title: “SOK: On the Analysis of Web Browser Security” by Jungwon Lim, Yonghwi Jin, Mansour Alharthi, Xiaokuan Zhang, Jinho Jung, Rajat Gupta, Kuilin Li, Daehee Jang, Taesoo Kim [url] [dblp]
Published in 2021-12 at and I read it in 2022-02
Abstract: Web browsers are integral parts of everyone’s daily life. They are commonly used for security-critical and privacy sensitive tasks, like banking transactions and checking medical records. Unfortunately, modern web browsers are too complex to be bug free (e.g., 25 million lines of code in Chrome), and their role as an interface to the cyberspace makes them an attractive target for attacks. Accordingly, web browsers naturally become an arena for demonstrating advanced exploitation techniques by attackers and state-of-the-art defenses by browser vendors. Web browsers, arguably, are the most exciting place to learn the latest security issues and techniques, but remain as a black art to most security researchers because of their fast-changing characteristics and complex code bases.

quotes

summary

The paper looks at Chrome/Blink/V8, Safari/WebKit/JavaScriptCore, Firefox/Gecko/SpiderMonkey, and Internet Explorer/Trident/Chakra web browsers and their architecture with respect to security requirements. This concerns the rendering, IPC, Site isolation, Javascript engine security as well as web technologies like Same-Origin Policy. In Table 1, one can see many sandboxing/isolation mechanisms in place, especially in Chrome and Firefox. Browser exploitation scenarios and bug classifications are provided in Figure 2. The paper also looks at the history and timeline of bugs and bug classes as well as strategies to detect them. Table 4 provides a very detailed look at mitigations in browsers in relation to time.

In summary, the paper provides a very detailed look at web browser security. It can serve as a list of keywords to get started on web security topics. In case you plan to build your own web browser engine, it will help you understand requirements and give you architectural design ideas. But it also gives a historical overview of which mitigations were deployed as time progressed.

Lessons:

  1. Using memory safe languages is an effective mitigation against memory-safety bugs.
  2. Higher payouts motivate more bug reports.
  3. UAF mitigations are effective towards reducing DOM bug exploits.
  4. Mitigating JS engine bugs is difficult.
  5. UXSS bugs are mostly mitigated by Site Isolation.
  6. Collaborative efforts on mitigations are good.

One controversial aspect:

“Although vendors are trying, they are consistently behind in this arms race. Mitigations from vendors are mostly reactive, which means they are developed long after each wave of attacks. By the time an attack surface is finally closed, attackers have already come up with a better exploit.”

I am not really sure about this statement. I think it is too strong. Did the authors consider how many mitigations were put in place to prevent security exploits in the first place? I don't think credit is provided for these decisions in the statement.

citation 121 = “Safer Usage Of C++” by Google Security

Scribble: Closing the Book on Ad Hoc Documentation Too… §

Title: “Scribble: Closing the Book on Ad Hoc Documentation Tools” by Matthew Flatt, Eli Barzilay, Robert Bruce Findler [url] [dblp]
Published in 2009-09 at ICFP'09 and I read it in 2022-01
Abstract: Scribble is a system for writing library documentation, user guides, and tutorials. It builds on PLT Scheme’s technology for language extension, and at its heart is a new approach to connecting prose references with library bindings. Besides the base system, we have built Scribble libraries for JavaDoc-style API documentation, literate programming, and conference papers. We have used Scribble to produce thousands of pages of documentation for PLT Scheme; the new documentation is more complete, more accessible, and better organized, thanks in large part to Scribble’s flexibility and the ease with which we cross-reference information across levels. This paper reports on the use of Scribble and on its design as both an extension and an extensible part of PLT Scheme.

quotes

summary

Excellent tool which expands some existing concepts from Scribe (2005). Fundamentally the prefix “#lang scribble/doc” dispatches parsing based on the module loaded. In particular S-expressions are considered as building blocks for the language defining typesetting elements. The primary advantage is to support the “program is data” paradigm in the typesetting context. The disadvantage is the tight coupling between the PLT Scheme infrastructure and the document.

Seven great blunders of the computing world §

Title: “Seven great blunders of the computing world” by N. Holmes [url] [dblp]
Published in 2002-07 at and I read it in 2021-08
Abstract:

quotes

summary

The article discusses seven topics that the author considers to have been decided wrongly by the technical community.

Even though the mentioned ‘blunders’ are a neat collection of historically decided debates, I don't think the term ‘blunder’ is justified. Especially blunder 4, “Commercial programming”, is highly controversial in retrospect and mostly claimed without proper arguments. Blunder 1, “Terminology”, on the other hand, made me reflect on the terms “information” and “data”.

Software-based Power Side-Channel Attacks on x86 §

Title: “Software-based Power Side-Channel Attacks on x86” by Moritz Lipp, Andreas Kogler, David Oswald, Michael Schwarz, Catherine Easdon, Claudio Canella, Daniel Gruss [url] [dblp]
Published in 2020-11 at and I read it in 2020-12
Abstract: Power side-channel attacks exploit variations in power consumption to extract secrets from a device, e.g., cryptographic keys. Prior attacks typically required physical access to the target device and specialized equipment such as probes and a high-resolution oscilloscope.

questions

quotes

summary

In this paper, we present PLATYPUS attacks which are novel software-based power side-channel attacks on Intel server, desktop, and laptop CPUs. We exploit unprivileged access to the Intel Running Average Power Limit (RAPL) interface that exposes values directly correlated with power consumption, forming a low-resolution side channel.
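A minimal sketch of the kind of interface the attack builds on: reading the package energy counter via the Linux powercap sysfs files (the exact path varies by system, the counter wraps around, and on patched systems the file is typically no longer readable without privileges):

    import time

    RAPL_ENERGY = "/sys/class/powercap/intel-rapl:0/energy_uj"   # package domain, micro-joules

    def read_energy_uj():
        with open(RAPL_ENERGY) as f:
            return int(f.read())

    e0 = read_energy_uj()
    time.sleep(0.5)
    e1 = read_energy_uj()
    print("average package power: %.2f W" % ((e1 - e0) / 0.5 / 1e6))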

Some instructive mathematical errors §

Title: “Some instructive mathematical errors” by Richard P. Brent [url] [dblp]
Published in 2021-06-20 at and I read it in 2020-08
Abstract: We describe various errors in the mathematical literature, and consider how some of them might have been avoided, or at least detected at an earlier stage, using tools such as Maple or Sage. Our examples are drawn from three broad categories of errors. First, we consider some significant errors made by highly-regarded mathematicians. In some cases these errors were not detected until many years after their publication. Second, we consider in some detail an error that was recently detected by the author. This error in a refereed journal led to further errors by at least one author who relied on the (incorrect) result. Finally, we mention some instructive errors that have been detected in the author’s own published papers.

Comment: 25 pages, 75 references. Corrected typos and added footnote 5 in v2

feedback

Links to Wikipedia should be permalinks.

quotes

summary

Nice paper showing how some errors crept into the mathematical literature. Some examples are historical, others are related to the author's own work. Apparently the author works in the field of number theory and computational mathematics. Fewer examples are provided from algebra and geometry.

Reproducibility and numeric verification seem to be approaches to combat the underlying problems. I also wonder how much of a difference it would make if formulas were easier to search for, and how accessible sagemath, Maple and others are for researchers (can they easily verify theorems of fields they are not familiar with?).

Some well-known errors:

Ad claim 2 - disproof methods:

  1. algebraic argument
  2. numeric evaluation
  3. analytical argument

 

Templates vs. Stochastic Methods: A Performance Analys… §

Title: “Templates vs. Stochastic Methods: A Performance Analysis for Side Channel Cryptanalysis” by Benedikt Gierlichs, Kerstin Lemke-Rust, Christof Paar [url] [dblp]
Published in 2006 at CHES 2006 and I read it in 2020-06
Abstract: Template Attacks and the Stochastic Model provide advanced methods for side channel cryptanalysis that make use of ‘a-priori’ knowledge gained from a profiling step. For a systematic comparison of Template Attacks and the Stochastic Model, we use two sets of measurement data that originate from two different microcontrollers and setups. Our main contribution is to capture performance aspects against crucial parameters such as the number of measurements available during profiling and classification. Moreover, optimization techniques are evaluated for both methods under consideration. Especially for a low number of measurements and noisy samples, the use of a T-Test based algorithm for the choice of relevant instants can lead to significant performance gains. As a main result, T-Test based Templates are the method of choice if a high number of samples is available for profiling. However, in case of a low number of samples for profiling, stochastic methods are an alternative and can reach superior efficiency both in terms of profiling and classification.

ambiguity

quotes

summary

Important empirical results. Parameters and assumptions are okay. Results are fundamental and significant. However, the description of the methods (templates, stochastic model) is quite poor; looking up the references is required.
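A minimal sketch of a t-test based choice of relevant instants (my own simplification of the idea, not the paper's exact algorithm): compute Welch's t-statistic per time sample between two groups of traces and keep the samples with the largest |t|.

    import numpy as np

    def t_test_poi(traces_a, traces_b, n_poi=5):
        """Return indices of the n_poi time samples with the largest Welch t-statistic."""
        ma, mb = traces_a.mean(axis=0), traces_b.mean(axis=0)
        va, vb = traces_a.var(axis=0, ddof=1), traces_b.var(axis=0, ddof=1)
        t = (ma - mb) / np.sqrt(va / len(traces_a) + vb / len(traces_b) + 1e-12)
        return np.argsort(-np.abs(t))[:n_poi]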

Notes:

The Aesthetics of Reading §

Title: “The Aesthetics of Reading” by Kevin Larson, Rosalind Picard [url] [dblp]
Published in 2005 at and I read it in 2021-08
Abstract: In this paper we demonstrate a new methodology that can be used to measure aesthetic differences by examining the cognitive effects produced by elevated mood. Specifically in this paper we examine the benefits of good typography and find that good typography induces a good mood. When participants were asked to read text with either good or poor typography in two studies, the participants who received the good typography performed better on relative subjective duration and on certain cognitive tasks.

quotes

summary

This paper tries to evaluate ClearType's typographic performance in a user study by evaluating performance in creative tasks. I think the assumptions are very strong (e.g. “Our hope is that RSD not only detects task difficulty, but also aesthetic differences”).

The Case of Correlatives: A Comparison between Natural… §

Title: “The Case of Correlatives: A Comparison between Natural and Planned Languages” by Federico Gobbo [url] [dblp]
Published in 2011 at Journal of Universal Language 2011 and I read it in 2021-07
Abstract:

quotes    

summary

A very neat paper from the field of linguistics. Language elements are first established based on grammatical categories: Bausani (1974) and Gobbo (2008) distinguish esoteric/exoteric languages, Blanke (1985) differentiates project versus language on a one-dimensional axis, and Tesnière (1959) provides the Grammar Characters.

In the following, planned languages, with their peak development at the transition from the 19th to the 20th century and a focus on the Standard Average European sprachbund, are discussed. Here the distinction between similarity (with established languages) and regularity (formed along formal principles) is pointed out as fundamental, similar to the notion of suppletion. The paper then contributes the correlatives in various languages: {English, German, French, Latin, Volapük, Spelin, Zahlensprache, Lithuanian, Esperanto, Esperanto (1984), Ido, Novial, Neo, Interlingua, Klingon, Na'vi}. The paper appropriately discusses the relations between those languages.

The main result is given as “some natural languages are surprisingly far more regular than their planned daughters in spite of the fact that regularity was a major claim of the efforts in planning IALs during the late 19th and early 20th century”. The result is less unexpected in the end: as the discussed developments show, similarity is sometimes favored over regularity, thus neglecting a regular table of correlatives. Furthermore, some of the discussed auxiliary languages never had simplicity/easiness/regularity as their main goal. In essence, Esperanto, an auxiliary language, shows the most regular system.

typo

page 73, remove “took”

The European Union and the Semantic Web §

Title: “The European Union and the Semantic Web” by Neville Holmes [url] [dblp]
Published in 2008 at and I read it in 2021-09
Abstract:

quotes

summary

Without a particular reason, the author creates the E-speranto proposal, a “simplification” of Esperanto. The elements are appropriately discussed for an introductory article, but the relationship to the EU also remains unmotivated. In the end, the author emphasizes the lexicon, which is interesting to consider as a concept separate from the language itself.

The proposal:

Word endings: As an Indo-European language, E-speranto uses endings to express grammatical qualification of word stems. Thus, adverbs end in –e, infinitives in –i, and imperatives in –u. Synthesis starts showing in noun and adjective endings. Using a BNF-like notation, these are –[a|o][y][n] where the required a or o signal an adjective or noun respectively, the optional y signals plurality, and the optional n signals accusativity. Verb endings use the five vowels for a kind of temporal placement. Thus a signals here and now, o signals ahead or the future, i behind or the past, u conditional or propositional, and e perpetual or definitional. The full verb endings are –as for present tense, –os for future tense, –is for past tense, –us for conditional, and –es for perpetual. These verb endings can also be used as morphemes, so that osa is the adjectival future and la aso means the present. The vowels are used as placers in other synthetic syllables that can be used as suffixes or ordinary morphemes. The formula [vowel][n][t] yields 10 morphemes that qualify actions. With the n the action is active, without it passive, so amanta and amata are adjectives meaning loving and loved, and ante and ate are adverbs meaning actively and passively, all these set in the present. This construction can be simply extended to specify horizontal spatial placement by [vowel][m][p] and vertical by [vowel][n][k], where the m and n specify placement relative to the sentence’s subject. Also, u means left and e means right. Thus la domompo means the house ahead while la domopo means the front of the house. Such compounds can take some getting used to, but they are regular and powerful.

Structural words: Synthesis at the phonemic level is even more expressive in its use when building the pronouns and correlatives essential to expressiveness. In Esperanto these center on the vowel i, with affixed phonemes to give the particular class of meaning. The singular pronouns are mi, ci, li, xi, and ji for I, thou, he, she, and it, and si for reflexion. Here I use E-speranto spellings where c is pronounced like the ts in tsunami, and x as sh. There are also two plural pronouns: ni and vi for we and you. There are also two prefixes that can be used: i- to widen the scope and o- to abstract it. Thus ili means he and his associates or they, and omi means someone like me or one. The pronouns can also take grammatical suffixes such as –n for the accusative (min means me) and –a for the adjectival (mia means my). The correlatives are two-dimensional, with one range of meanings given by a suffix, and an independent range by a prefix. The generic correlatives have no prefix. Having a simple vowel suffix, ia, ie, io, and iu mean respectively some kind of, somewhere, something, and somebody. Having a vowel+consonant suffix, ial, iam, iel, ies, iol, and iom mean respectively for some reason, at some time, in some way, someone’s, in some number, and in some amount. The specific correlatives apply a variety of prefixes to the generic correlatives. The prefixes are k–, t–, nen–, and q– (pronounced ch–), and they give selective, indicative, negative, and inclusive meanings. For example, kiam, tiam, neniam, and qiam mean when, then, never, and always, respectively. This description shows how phonemic synthesis can yield dramatic richness simply. Further, the correlatives can also take the i– and o– scoping prefixes, and both the pronouns and correlatives can be grammatically suffixed.
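The ending synthesis is regular enough to generate mechanically; a toy sketch of the –[a|o][y][n] scheme and the correlative table described above (spellings follow the article, the helper names are mine):

    def noun_adj_forms(stem):
        """All endings of the form stem + [a|o] + optional y (plural) + optional n (accusative)."""
        return [stem + pos + plural + acc
                for pos in "ao"            # a = adjective, o = noun
                for plural in ("", "y")
                for acc in ("", "n")]

    def correlatives():
        prefixes = ["", "k", "t", "nen", "q"]   # generic, selective, indicative, negative, inclusive
        suffixes = ["ia", "ie", "io", "iu", "ial", "iam", "iel", "ies", "iol", "iom"]
        return [p + s for p in prefixes for s in suffixes]

    assert noun_adj_forms("dom") == ["doma", "doman", "domay", "domayn",
                                     "domo", "domon", "domoy", "domoyn"]
    assert "kiam" in correlatives() and "neniam" in correlatives()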

The Implementation of Lua 5.0 §

Title: “The Implementation of Lua 5.0” by Roberto Ierusalimschy, Luiz Henrique de Figueiredo, Waldemar Celes [url] [dblp]
Published in 2005 at and I read it in 2022-12-10
Abstract: We discuss the main novelties of the implementation of Lua 5.0: its register-based virtual machine, the new algorithm for optimizing tables used as arrays, the implementation of closures, and the addition of coroutines.

quotes

  • “Currently the compiler accounts for approximately 30% of the size of the Lua core.” (Ierusalimschy et al., p. 3)
  • “the hand-written parser is smaller, more efficient,more portable, and fully reentrant.” (Ierusalimschy et al., p. 3)
  • “Lua is really lightweight: for instance, on Linux its stand-alone interpreter, complete with all standard libraries, takes less than 150 Kbytes; the core is less than 100 Kbytes.” (Ierusalimschy et al., p. 4)
  • “We also consider Lua a simple language, being syntactically similar to Pascal and semantically similar to Scheme, but this is subjective.” (Ierusalimschy et al., p. 4)
  • “Lua represents values as tagged unions, that is, as pairs (t, v), where t is an integer tag identifying the type of the value v, which is a union of C types implementing Lua types.” (Ierusalimschy et al., p. 5)
  • “One consequence of using tagged unions to represent Lua values is that copying values is a little expensive: on a 32-bit machine with 64-bit doubles, the size of a TObject is 12 bytes (or 16 bytes, if doubles are aligned on 8-byte boundaries) and so copying a value requires copying 3 (or 4) machine words.” (Ierusalimschy et al., p. 5)
  • “The hash function does not look at all bytes of the string if the string is too long.” (Ierusalimschy et al., p. 6)
  • “The hash part uses a mix of chained scatter table with Brent's variation [3].” (Ierusalimschy et al., p. 8)
  • “Most procedural languages avoid this problem by restricting lexical scoping (e.g., Python), not providing first-class functions (e.g., Pascal), or both (e.g., C). Functional languages do not have those restrictions. Research in non-pure functional languages like Scheme and ML has created a vast body of knowledge about compilation techniques for closures (e.g., [19, 1,21]).” (Ierusalimschy et al., p. 9)
  • “For instance, just the control flow analysis of Bigloo, an optimizer Scheme compiler [20], is more than ten times larger than the whole Lua implementation: The source for module Cfa of Bigloo 2.6f has 106,350 lines, versus 10,155 lines for the core of Lua 5.0. As explained in Section 2, Lua needs something simpler.” (Ierusalimschy et al., p. 9)
  • “Lua uses a structure called an upvalue to implement closures. Any outer local variable is accessed indirectly through an upvalue. The upvalue originally points to the stack slot wherein the variable lives (Figure 4,left). When the variable goes out of scope, it migrates into a slot inside the upvalue itself (Figure 4, right).” (Ierusalimschy et al., p. 9)
  • “Coroutines in Lua are stackful, in the sense that we can suspend a coroutine from inside any number of nested calls.” (Ierusalimschy et al., p. 10)
  • “As such, they allow programmers to implement several advanced control mechanisms, such as cooperative multithreading, generators, symmetric coroutines, backtracking, etc. [7].” (Ierusalimschy et al., p. 10)
  • “A key point in the implementation of coroutines in Lua is that the interpreter cannot use its internal C stack to implement calls in the interpreted code. (The Python community calls an interpreter that follows that restriction a stackless interpreter [23].)” (Ierusalimschy et al., p. 11)
  • “Because a function running in a coroutine may have been created in another coroutine, it may refer to variables in a different stack. This leads to what some authors call a cactus structure [18]. The use of flat closures, as we discussed in Section 5, avoids this problem altogether.” (Ierusalimschy et al., p. 11)
  • “Most instructions in a stack machine have implicit operands.” (Ierusalimschy et al., p. 12)
  • “There are 35 instructions in Lua's virtual machine.” (Ierusalimschy et al., p. 12)
  • “Branch instructions pose a difficulty because they need to specify two operands to be compared plus a jump offset.” (Ierusalimschy et al., p. 14)
  • “The solution adopted in Lua is that, conceptually, a test instruction simply skips the next instruction when the test fails;” (Ierusalimschy et al., p. 14) (see the toy sketch after this list)
  • “For function calls, Lua uses a kind of register window. It evaluates the call arguments in successive registers, starting with the first unused register. When it performs the call, those registers become part of the activation record of the called function, which therefore can access its parameters as regular local variables.” (Ierusalimschy et al., p. 15)
  • “Lua uses two parallel stacks for function calls.” (Ierusalimschy et al., p. 15)
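A toy register-based VM (my own sketch, not Lua's actual instruction set) illustrating two of the quoted points: operands are registers, and a failed test simply skips the following jump instruction.

    def run(program, registers):
        pc = 0
        while pc < len(program):
            op, *args = program[pc]
            if op == "LOADK":                  # LOADK r, constant
                registers[args[0]] = args[1]
            elif op == "ADD":                  # ADD r_dst, r_a, r_b
                registers[args[0]] = registers[args[1]] + registers[args[2]]
            elif op == "LT":                   # LT r_a, r_b: skip next instruction unless a < b
                if not (registers[args[0]] < registers[args[1]]):
                    pc += 1
            elif op == "JMP":                  # JMP offset, relative to the following instruction
                pc += args[0]
            elif op == "RETURN":
                return registers[args[0]]
            pc += 1

    # while r0 < r1: r0 += r2   (compiled by hand into the toy ISA)
    prog = [
        ("LOADK", 0, 0), ("LOADK", 1, 10), ("LOADK", 2, 3),
        ("LT", 0, 1),      # loop test; on failure skip the JMP and fall through to RETURN
        ("JMP", 1),        # jump over RETURN into the loop body
        ("RETURN", 0),
        ("ADD", 0, 0, 2),  # loop body: r0 += r2
        ("JMP", -5),       # back to the LT test
    ]
    assert run(prog, [0, 0, 0]) == 12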

summary

Wonderful read with a beautiful overview on the Lua 5.0 interpreter runtime design. The sections illustrate the content: {The representation of Values, Tables, Functions and Closures, Threads and Coroutines, The Virtual Machine}.

unclarity

  • “The equivalent Lua program, a={[1000000000]=1}, creates a table with a single entry.” (Ierusalimschy et al., p. 6)
    But the two-table design later on contradicts this statement. Since we skip storing the indices, we again require a huge array allocation.

The Profession as a Culture Killer §

Title: “The Profession as a Culture Killer” by Neville Holmes [url] [dblp]
Published in 2007 at and I read it in 2021-09
Abstract:

quotes

summary

Mentions a myriad of topics. His discussion of characters of ASCII is informative, but his remarks about technology and writing systems are incomplete and biased.

The Security Risk of Lacking Compiler Protection in We… §

Title: “The Security Risk of Lacking Compiler Protection in WebAssembly” by Quentin Stievenart, Coen De Roover, Mohammad Ghafari [url] [dblp]
Published in 2021 at QRS 2021 and I read it in 2022-12-12
Abstract: WebAssembly is increasingly used as the compilation target for cross-platform applications. In this paper, we investigate whether one can rely on the security measures enforced by existing C compilers when compiling C programs to WebAssembly. We compiled 4,469 C programs with known buffer overflow vulnerabilities to x86 code and to WebAssembly, and observed the outcome of the execution of the generated code to differ for 1,088 programs. Through manual inspection, we identified that the root cause for these is the lack of security measures such as stack canaries in the generated WebAssembly: while x86 code crashes upon a stack-based buffer overflow, the corresponding WebAssembly continues to be executed. We conclude that compiling an existing C program to WebAssembly without additional precautions may hamper its security, and we encourage more research in this direction.

potential contradiction

  • “We performed our evaluation using -O1 as the optimisation level.” (Stievenart et al., p. 7)
    “For this reason, we decided to use -O2 as the level of optimization.” (Stievenart et al., p. 6)

quotes

  • “We compiled 4,469 C programs with known buffer overflow vulnerabilities to x86 code and to WebAssembly, and observed the outcome of the execution of the generated code to differ for 1,088 programs. Through manual inspection, we identified that the root cause for these is the lack of security measures such as stack canaries in the generated WebAssembly: while x86 code crashes upon a stack-based buffer overflow, the corresponding WebAssembly continues to be executed.” (Stievenart et al., p. 1)
  • “The standard has been designed with security in mind, as evidenced among others by the strict separation of application memory from the execution environment’s memory. Thanks to this separation, a compromised WebAssembly binary cannot compromise the browser that executes the binary [2], [7].” (Stievenart et al., p. 1)
  • “This contradicts the design document of the WebAssembly standard [2] which states “common mitigations such as data execution prevention (DEP) and stack smashing protection (SSP) are not needed by WebAssembly programs”” (Stievenart et al., p. 1)
  • “The execution model of WebAssembly is stack-based” (Stievenart et al., p. 2)
  • “A WebAssembly program also contains a single linear memory, i.e., a consecutive sequence of bytes that can be read from and written to by specific instructions.” (Stievenart et al., p. 2)
  • “An example function is the following:
    (func $main (type 4)
     (param i32 i32)
     (result i32)
     (local i32)
     local.get 0)”
    (Stievenart et al., p. 2) [with two params, one return value, one local variable]
  • “After the last instruction, the function execution ends and the value remaining on the stack is the return value.” (Stievenart et al., p. 2)
  • “g0 acts as the stack pointer.” (Stievenart et al., p. 2)
  • “Moreover, multiple features of WebAssembly render programs less vulnerable than their native equivalent. Unlike in x86, the return address of a function in WebAssembly is implicit and can only be accessed by the execution environment, preventing return-oriented programming attacks among others, diminishing the potential of stack smashing attacks. Also, function pointers are supported in WebAssembly through indirect calls, where the target function of the call is contained in a statically-defined table: this reduces the number of possible control flow exploits.” (Stievenart et al., p. 3)
  • “We rely on the Juliet Test Suite 1.3 for C2 of the Software Assurance Reference Dataset [3], released in October 2017 and which has been used to compare static analysis tools that detect security issues in C and Java applications [5].” (Stievenart et al., p. 3)
  • “In total, for CWE 121, we have 2785/5906 programs (47%) that can be compiled to WebAssembly, and for CWE 122, we have 1666/3656 programs (46%) that can be compiled.” (Stievenart et al., p. 3)
  • “Moreover, stack smashing protection need to be ensured by the compiler rather than the WebAssembly runtime, that has no knowledge of how the program stack is managed.” (Stievenart et al., p. 5)
  • “We now turn our attention to the impact of optimisations on the ability to prevent vulnerable code to make it into the binary. To illustrate this, we inspect one specific example that exhibits different behaviour depending on the optimisation levels, which is in essence similar to our first example.” (Stievenart et al., p. 5)
  • “For instance, one aspect which we have not encountered in the programs we inspected is that there is no use of function pointers.” (Stievenart et al., p. 7)
  • “Arteaga et al. [4] propose an approach to achieve code diversification for WebAssembly: given an input program, multiple variants of this program can be generated. Narayan et al. [14] propose Swivel, a new compiler framework for hardening WebAssembly binaries against Spectre attacks, which can compromise the isolation guarantee of WebAssembly. Stiévenart and De Roover propose a static analysis framework [16] for WebAssembly, used to build an information flow analysis [17] to detect higher-level security concerns such as leaks of sensitive information.” (Stievenart et al., p. 7)

summary

4,469 C programs (from a corpus of static-analysis test cases) are compiled to WASM, and differences in behavior between the two platforms are observed. 1,088 programs showed different behavior. Specifically, the paper concludes that the generated WASM lacks stack smashing protection. The study is useful, but limited in scope and with an expected result.
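
For illustration, here is a minimal, hypothetical CWE-121-style program (not taken from the Juliet suite) of the kind the study compiles to both targets. Built for x86 with stack protection enabled (e.g. gcc -fstack-protector-strong), the overflow typically aborts with a stack-smashing error; compiled to WebAssembly, where the paper finds no such canaries and the return address is not stored in linear memory, the program may simply keep running (the exact behavior depends on the compiler's stack layout).

#include <stdio.h>
#include <string.h>

/* Hypothetical CWE-121 (stack-based buffer overflow) example in the
 * spirit of the Juliet test cases used in the paper. */
static void vulnerable(const char *input) {
    char buf[8];            /* small, fixed-size buffer on the stack */
    strcpy(buf, input);     /* no bounds check: long input overflows buf */
    printf("copied: %s\n", buf);
}

int main(void) {
    /* 32 'A' characters easily overrun the 8-byte buffer. */
    vulnerable("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA");
    return 0;
}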

The UNIX Time-Sharing System §

Title: “The UNIX Time-Sharing System” by Dennis M Ritchie, Ken Thompson, Bell Laboratories [url] [dblp]
Published in 1974-07 at Communications of the ACM and I read it in 2021-03
Abstract:

contradiction

Funny process

9.2 Per day (24-hour day, 7-day week basis)
There is a "background" process that runs at the lowest possible priority; it is used to soak up any idle CPU time. It has been used to produce a million-digit approximation to the constant e - 2, and is now generating composite pseudoprimes (base 2).

quotes

summary

Interesting retrospective read.

It was interesting to observe what has changed since then. Though I assume in 1974 they were quite experienced with their system design, such a huge design necessarily has a lot of controversial points.
The paper explains the filesystem and the shell as core concepts.
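
The shell's command execution, as the paper describes it, comes down to the fork/execute/wait pattern. A minimal sketch using today's POSIX spellings (not the 1974 code):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Minimal shell-style command execution: fork a child, let it exec the
 * command, and have the parent wait for the child's termination. */
int main(void) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {                          /* child: becomes the command */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");                    /* reached only if exec fails */
        _exit(127);
    }
    int status;                              /* parent: the "shell" waits */
    waitpid(pid, &status, 0);
    if (WIFEXITED(status))
        printf("command exited with status %d\n", WEXITSTATUS(status));
    return EXIT_SUCCESS;
}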

I think the idle background process and the casual statement that “there is about one crash every other day” are funny from today's perspective.

Underspecified statement

“To provide an indication of the overall efficiency of UNIX and of the file system in particular, timings were made of the assembly of a 7621-line program. The assembly was run alone on the machine; the total clock time was 35.9 sec, for a rate of 212 lines per sec.”

… which program? What does it do?

The design of a Unicode font §

Title: “The design of a Unicode font” by C Bigelow, K Holmes [url] [dblp]
Published in 1993 at and I read it in 2020-10
Abstract: The international scope of computing, digital information interchange, and electronic publishing has created a need for world-wide character encoding standards. Unicode is a comprehensive standard designed to meet such a need. To be readable by humans, character codes require fonts that provide visual images — glyphs — corresponding to the codes. The design of a font developed to provide a portion of the Unicode standard is described and discussed.

ambiguity

“The inclusion of non-alphabetic symbols and non-Latin letters in these 8-bit character sets required font developers to decide whether the assorted symbols and non-Latin letters should be style-specific or generic.”

A definition of style-specific and generic would be nice.

“Hence, although accents need to be clearly differentiated, they do not need to be emphatic, and, indeed, overly tall or heavy accents can be more distracting than helpful to readers.”

What does emphatic mean in this context?

nicely written

“To design such a font is a way to study and appreciate, on a microcosmic scale, the manifold variety of literate culture and history.”

“A ‘roman’ typeface design (perhaps upright would be a less culture-bound term, since a Unicode font is likely to include Greek, Cyrillic, Hebrew, and other scripts)” …

question

“One of the advantages of Unicode is that it includes a Generic Diacritical Marks set of ‘floating’ diacritics that can be combined with arbitrary letters. These are not case-sensitive, i.e. there is only one set of floating diacritics for both capitals and lowercase”

Isn't this a property of the font file format?

quotes

summary

A neat paper. Due to my lack of understanding of the status quo in 1993, I cannot judge the approach and quality. There are some considerations I am not familiar with (e.g. why do we need to distinguish only two kinds of diacritics, lowercase and uppercase), but the paper gives a broad overview of the design decisions that need to be made when designing a ‘Unicode’ font. They designed 1700 characters, but Unicode 1.1 specifies 34,168 characters. Part of the paper discusses “One big font vs. many little fonts”.

The problem with unicode §

Title: “The problem with unicode” by N. Holmes [url] [dblp]
Published in 2003-06 at and I read it in 2021-08
Abstract:

quotes

summary

In this article, the author argues that Unicode is a blunder: too complex and unstable. First, he contrasts markup and plaintext. Then he discusses the Latin writing system, suggesting an eight-bit categorization system (without actually assigning glyphs). He continues with keyboards, leading towards CJK setups. In the end, he claims that Unicode does not take full advantage of the systematic graphical features of the various writing systems.

At least from today's perspective, the author claims to solve typesetting problems without actually offering a solution in detail. The suggested modifiers in Table 1 require further discussion by typographers: does the font specify the positioning of diacritics, or is that part of his suggested scheme? Today these questions are solved in OpenType, and the original bit-level encoding does not matter. His suggestion of various 8-bit systems would require a proper annotation of the encoding per text file.

In the end, I think the author has some valid points, but his statements lack depth and time has solved these issues differently than he suggested.

typo

esthetically → aesthetically

Too Much Crypto §

Title: “Too Much Crypto” by Jean-Philippe Aumasson [url] [dblp]
Published in 2020-01-03 at Real-World Crypto 2020 and I read it in 2020/01
Abstract: We show that many symmetric cryptography primitives would not be less safe with significantly fewer rounds. To support this claim, we review the cryptanalysis progress in the last 20 years, examine the reasons behind the current number of rounds, and analyze the risk of doing fewer rounds. Advocating a rational and scientific approach to round numbers selection, we propose revised number of rounds for AES, BLAKE2, ChaCha, and SHA-3, which offer more consistent security margins across primitives and make them much faster, without increasing the security risk.

ambiguity

“Where we examine the reasons behind the number of rounds, comment on the risk posed by quantum computers, and finally propose new primitives for a future where less energy is wasted on computing superfluous rounds.”

Is this an English sentence?

notes

Good paper; its survey is its strong suit. However, the particular choice of proposed parameters has little justification.

typo

If all the best cryptanalysts could find was a distinguisher in the chosen-ciphertext related-key model, then even less likely are practical key recovery attack in the chosen-plaintext model.

typo

But as as noted in §2, numbers such as

Toward decent text encoding §

Title: “Toward decent text encoding” by N. Holmes [url] [dblp]
Published in 1998 at and I read it in 2020-08
Abstract:

quotes

summary

This article debates requirements for a decent text encoding scheme. It criticizes Unicode and argues in favor of Multicode.

Reading this 1998 article in 2021 certainly creates some issues. First and foremost, it is not understandable to me why the author equates Unicode with a 16-bit encoding (last page, left column). Unicode is called an “obese character set” and 8-bit encodings are praised. The author neglects UTF-8, invented in 1992, which avoids the “obese” property. His lack of understanding of writing systems shows when he describes “the traditional Chinese writing system” as “the one exception” “which encompasses thousands of distinct characters”. This statement excludes the rest of the CJK community, which includes Kanji in Japanese text and Hanja in Hangul (admittedly reduced to minority use since 1970 and limited to South Korea).
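
To substantiate the UTF-8 remark, here is a minimal C sketch of the UTF-8 encoding rule for code points up to U+FFFF (simplified; it ignores surrogates and the 4-byte range above U+FFFF): ASCII stays at one byte per character, so plain Latin-script text pays none of the claimed "obesity" penalty.

#include <stdio.h>

/* Encode a Unicode code point (restricted here to U+0000..U+FFFF) as
 * UTF-8. Returns the number of bytes written into out (1..3). */
static int utf8_encode(unsigned int cp, unsigned char out[3]) {
    if (cp < 0x80) {                       /* ASCII: a single byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* two bytes */
        out[0] = 0xC0 | (cp >> 6);
        out[1] = 0x80 | (cp & 0x3F);
        return 2;
    }
    out[0] = 0xE0 | (cp >> 12);            /* three bytes */
    out[1] = 0x80 | ((cp >> 6) & 0x3F);
    out[2] = 0x80 | (cp & 0x3F);
    return 3;
}

int main(void) {
    unsigned char buf[3];
    printf("U+0041 'A' -> %d byte(s)\n", utf8_encode(0x0041, buf));  /* 1 */
    printf("U+00E9 'é' -> %d byte(s)\n", utf8_encode(0x00E9, buf));  /* 2 */
    printf("U+4E2D '中' -> %d byte(s)\n", utf8_encode(0x4E2D, buf)); /* 3 */
    return 0;
}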

At the same time, Holmes praises 8-bit encoding systems like the Multicode scheme, which make operations like “convert to lowercase” difficult to implement (thus ignoring the computational burden).

It seems to me the author did not properly research his topic of interest. But I agree upon the goal mentioned in the article:

“For text encoding, the world needs a standard for each writing system that suits each and every language using that system.”

Tweaks and Keys for Block Ciphers: The TWEAKEY Framewo… §

Title: “Tweaks and Keys for Block Ciphers: The TWEAKEY Framework” by Jérémy Jean, Ivica Nikolić, Thomas Peyrin [url] [dblp]
Published in 2014 at ASIACRYPT 2014 and I read it in 2020-06
Abstract: We propose the TWEAKEY framework with goal to unify the design of tweakable block ciphers and of block ciphers resistant to related-key attacks. Our framework is simple, extends the key-alternating construction, and allows to build a primitive with arbitrary tweak and key sizes, given the public round permutation (for instance, the AES round). Increasing the sizes renders the security analysis very difficult and thus we identify a subclass of TWEAKEY, that we name STK, which solves the size issue by the use of finite field multiplications on low hamming weight constants. We give very efficient instances of STK, in particular, a 128-bit tweak/key/state block cipher Deoxys-BC that is the first AES-based ad-hoc tweakable block cipher. At the same time, Deoxys-BC could be seen as a secure alternative to AES-256, which is known to be insecure in the related-key model. As another member of the TWEAKEY framework, we describe Kiasu-BC, which is a very simple and even more efficient tweakable variation of AES-128 when the tweak size is limited to 64 bits.

Properties of tweakable block ciphers:

quotes (on contributions)

quotes (on history)

Non-intuitive results on birthday paradox bounds

Future work

Tweakey

Performance and area

summary

I think this paper is a solid piece of research. I am not an expert on symmetric cryptography and cannot judge the security analysis and possible attacks, but to me it seems to consider the relevant properties. Unrelated to the paper, I was not aware of beyond-birthday-bound security, which totally intrigued me. Related to the paper, follow-up work could address the question “What are sufficient conditions to achieve a secure tweakable block cipher with Tweakey?”. Well done!

typo

Underproduction: An Approach for Measuring Risk in Ope… §

Title: “Underproduction: An Approach for Measuring Risk in Open Source Software” by Kaylea Champion, Benjamin Mako Hill [url] [dblp]
Published in 2021-02 at SANER 2021 and I read it in 2022-04
Abstract: The widespread adoption of Free/Libre and Open Source Software (FLOSS) means that the ongoing maintenance of many widely used software components relies on the collaborative effort of volunteers who set their own priorities and choose their own tasks. We argue that this has created a new form of risk that we call ‘underproduction’ which occurs when the supply of software engineering labor becomes out of alignment with the demand of people who rely on the software produced. We present a conceptual framework for identifying relative underproduction in software as well as a statistical method for applying our framework to a comprehensive dataset from the Debian GNU/Linux distribution that includes 21,902 source packages and the full history of 461,656 bugs. We draw on this application to present two experiments: (1) a demonstration of how our technique can be used to identify at-risk software packages in a large FLOSS repository and (2) a validation of these results using an alternate indicator of package risk. Our analysis demonstrates both the utility of our approach and reveals the existence of widespread underproduction in a range of widelyinstalled software components in Debian.

Lehman’s laws of software evolution

(via “Laws of Software Evolution Revisited” by M M Lehman)

“All relate specifically to E-type systems that is, broadly speaking, to software systems that solve a problem or implement a computer application in the real world”:

  1. Continuing Change: An E-type program that is used must be continually adapted else it becomes progressively less satisfactory.
  2. Increasing Complexity: As a program is evolved its complexity increases unless work is done to maintain or reduce it.
  3. Self Regulation: The program evolution process is self regulating with close to normal distribution of measures of product and process attributes.
  4. Conservation of Organisational Stability: The average effective global activity rate on an evolving system is invariant over the product life time.
  5. Conservation of Familiarity: During the active life of an evolving program, the content of successive releases is statistically invariant.
  6. Continuing Growth: Functional content of a program must be continually increased to maintain user satisfaction over its lifetime.
  7. Declining Quality: E-type programs will be perceived as of declining quality unless rigorously maintained and adapted to a changing operational environment.
  8. Feedback System: E-type Programming Processes constitute Multi-loop, Multi-level Feedback systems and must be treated as such to be successfully modified or improved.

peer production

as defined by Yochai Benkler (in the 1990s):

  1. decentralized goal setting and execution
  2. a diverse range of participant motives, including non-financial ones
  3. non-exclusive approaches to property (e.g. copyleft or permissive licensing)
  4. governance through participation, notions of meritocracy, and charisma (rather than through property or contract)

quotes

  • “The widespread adoption of Free/Libre and Open Source Software (FLOSS) means that the ongoing maintenance of many widely used software components relies on the collaborative effort of volunteers who set their own priorities and choose their own tasks. We argue that this has created a new form of risk that we call ‘underproduction’ which occurs when the supply of software engineering labor becomes out of alignment with the demand of people who rely on the software produced. We present a conceptual framework for identifying relative underproduction in software as well as a statistical method for applying our framework to a comprehensive dataset from the Debian GNU/Linux distribution that includes 21,902 source packages and the full history of 461,656 bugs.” (Champion and Hill, 2021, p. 1)
  • “In this paper, we describe an approach for identifying other important but poorly maintained FLOSS packages” (Champion and Hill, 2021, p. 1)
  • “In an early and influential practitioner account, Raymond argued that FLOSS would reach high quality through a process he dubbed ‘Linus’ law’ and defined as ‘given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone’ [5]. Benkler coined the term ‘peer production’ to describe the method through which many small contributions from large groups of diversely motivated individuals could be integrated together into high quality information goods like software [6].
    A growing body of research suggests reasons to be skeptical about Linus’ law [7] and the idea that simply opening the door to one’s code will attract a crowd of contributors [8, 9].”
    (Champion and Hill, 2021, p. 1)
  • “Over time, it has become clear that peer produced FLOSS projects’ reliance on volunteer labor and self-selection into tasks has introduced types of risk that traditional software engineering processes have typically not faced. Foremost among these is what we call ‘underproduction.’ We use the term underproduction to refer to the fact that although a large portion of volunteer labor is dedicated to the most widely used open source projects, there are many places where the supply of quality software and volunteer labor is far out of alignment with demand. Because underproduction may go unnoticed or unaddressed until it is too late, we argue that it represents substantial risk to the stability and security of software infrastructure.” (Champion and Hill, 2021, p. 2)
  • “How can we measure underproduction in FLOSS?” (Champion and Hill, 2021, p. 2)
  • “Our paper contributes to software engineering research in three distinct ways. First, we describe a broad conceptual framework to identify relative underproduction in peer produced FLOSS repositories: identifying software packages of lower relative quality than one would expect given their relative popularity. Second, we describe an application of this conceptual framework to a dataset of 21,902 source packages from the Debian GNU/Linux distribution using measures derived from multilevel Bayesian regression survival models. Finally, we present results from two experiments. The first experiment identifies a pool of relatively underproduced software in Debian. The second experiment seeks to validate our application of our framework for identifying underproduction by correlating underproduction with an alternate indicator of risk.” (Champion and Hill, 2021, p. 2)
  • “FLOSS began with the free software movement in the 1980s and its efforts to build the GNU operating system as a replacement for commercial UNIX operating systems [23]. Over time, free software developers discovered that their free licenses and practices of working openly supported new forms of mass collaboration and bug fixes [4].” (Champion and Hill, 2021, p. 2)
  • “‘Peer production’ is a term coined by Yochai Benkler to describe the model of organizing production discovered by FLOSS communities in the early 1990s that involved the mass aggregation of many small contributions from diversely motivated individuals. Benkler [9] defines peer production for online groups in terms of four criteria: (1) decentralized goal setting and execution, (2) a diverse range of participant motives, including non-financial ones, (3) non-exclusive approaches to property (e.g. copyleft or permissive licensing), and (4) governance through participation, notions of meritocracy, and charisma, rather than through property or contract.” (Champion and Hill, 2021, p. 2)
  • “The process of building and maintaining software is often collaborative and social, including not only code but code comments, commit messages, pull requests, and code reviews, as well as bug reporting, issue discussing, and shared problem-solving [24].” (Champion and Hill, 2021, p. 2)
  • “The team found that the laws are frequently not upheld in FLOSS, especially when effort from outside a core team is considered. This work suggests that the effort available to maintain a piece of FLOSS software may increase as it grows in popularity.” (Champion and Hill, 2021, p. 3)
  • “Prior studies have suggested that bug resolution rate is closely associated of a range of important software engineering outcomes, including codebase growth, code quality, release rate, and developer productivity [32, 33, 34]. By contrast, lack of maintenance activity as reflected in a FLOSS project’s bug tracking system can be considered a sign of failure [35].” (Champion and Hill, 2021, p. 3)
  • “In particular, we are inspired by a study of Wikipedia by Warncke-Wang et al. [36] who build off previous work by Gorbatâi [37] to formalize what Warncke-Wang calls the “perfect alignment hypothesis” (PAH). The PAH proposes that the most heavily used peer produced information goods (for Warncke-Wang et al., articles in Wikipedia) will be the highest quality, that the least used will be the lowest quality, and so on. In other words, the PAH proposes that if we rank peer production products in terms of both quality and importance—for example, in the simple conceptual diagram shown in Figure 1—the two ranked lists will be perfectly correlated.” (Champion and Hill, 2021, p. 3)
  • “Despite the central role that FLOSS plays in peer production, we know of no efforts to conceptualize or measure underproduction in software.” (Champion and Hill, 2021, p. 3)
  • “A low quality Wikipedia article on an extremely popular subject seems likely to pose much less risk to society than a bug like the Heartbleed vulnerability described earlier which could occur when FLOSS is underproduced.” (Champion and Hill, 2021, p. 3)
  • “The measure of deviation resulting from this process serves as our measure of (mis-)alignment between quality and importance (i.e., over- or underproduction).” (Champion and Hill, 2021, p. 3)
  • “With a community in operation since 1993, Debian is widely used and is the basis for other widely used distributions like Ubuntu. Debian had more than 1,400 different contributors in 2020 and contains more than 20,000 of the most important and widely used FLOSS packages.” (Champion and Hill, 2021, p. 4)
  • “A single source package may produce many binary packages. For examples, although it is an outlier, the Linux kernel source package produces up to 1,246 binary packages from its single source package (most are architecture specific subcollections of kernel modules).” (Champion and Hill, 2021, p. 4)
  • “However, software engineering researchers have noted that the quantity of bugs reported against a particular piece of FLOSS may be more related to the number of users of a package [50, 52], or the level of effort being expended on bug-finding [1] in ways that limit its ability to serve as a clear signal of software quality. In fact, Walden [1] found that OpenSSL had a lower bug count before Heartbleed than after. Walden [1] argued that measures of project activity and process improvements are a more useful sign of community recovery and software quality than bug count.” (Champion and Hill, 2021, p. 4)
  • “Time to resolution has been cited as an important measure of FLOSS quality by a series of software engineering scholars” (Champion and Hill, 2021, p. 4)
  • “A second challenge in measuring quality as time to resolution comes from the fact that the distribution of bugs across packages is highly unequal. Most of the packages we examine (14,604 of 21,902) have 10 or fewer bugs and more than one out of six (3,857 of 21,902) have only one bug reported.” (Champion and Hill, 2021, p. 5)
  • “Given this construction, Uj will be zero when a package is fully aligned, negative if it is overproduced, and positive if it is underproduced.” (Champion and Hill, 2021, p. 6)
  • “Our first experiment describes results from the application of our method described in §V and suggests that a minimum of 4,327 packages in Debian are underproduced.” (Champion and Hill, 2021, p. 6)
  • “Underproduction is a concept borrowed from economics and involves a relationship between supply and demand.” (Champion and Hill, 2021, p. 8)
  • “For example, resolution time is an imperfect and partial measure of quality.” (Champion and Hill, 2021, p. 8)
  • “Our results suggest that underproduction is extremely widespread in Debian. Our non-parametric survival analysis shown in Figure 2 suggests that Debian resolves most bugs quickly and that release-critical bugs in Debian are fixed much more quickly than non-release-critical bugs. The presence of substantial underproduction in widely-installed components of Debian exposes Debian’s users to risk.” (Champion and Hill, 2021, p. 9)
  • “One striking feature of our results is the predominance of visual and desktop-oriented components among the most underproduced packages (see Figure 5). Of the 30 most underproduced packages in Debian, 12 are directly part of the XWindows, GNOME, or KDE desktop windowing systems. For example, the “worst” ranking package, GNOME Power Manager (gnome-power-manager) tracks power usage statistics, allows configuration of power preferences, screenlocking, screensavers, and alerts users to power events such as an unplugged AC adaptor.” (Champion and Hill, 2021, p. 9)
  • “These results might simply reflect the difficulty of maintaining desktop-related packages. For example, maintaining gnome-power-manager includes complex integration work that spans from a wide range of low-level kernel features to high-level user-facing and usability issues.” (Champion and Hill, 2021, p. 9)
  • “FLOSS acts as global digital infrastructure. Failures in that infrastructure ripple through supply chains and across sectors.” (Champion and Hill, 2021, p. 9)

summary

The paper defines the notion of underproduction, borrowed from economics, for software projects. Roughly, the notion captures the imbalance between the activity/attention an open source package receives and the demand for it (values below 1, in particular). To quantify underproduction empirically, they compared the time to bug resolution with the installations of 21,902 Debian packages. 4,327 packages were identified as underproduced (about 20%).

All decisions are made with proper rationale, and limitations are discussed in section 7. The normalization of the data, in particular the assignment of bugs to packages through the BTS, must have taken a large effort. However, my confidence in the statistical model is rather low (for example, I am not sure a uniform model is appropriate for packages with such divergent bug-reporting behavior, as explained on page 5). A ‘control group’ of commercial software projects would be nice, but is obviously infeasible. I would like to point out that this is purely subjective; I cannot support it with a formal statement, since my empirical background is very small.

The paper is well-written except for the screwup on page 5.

The result that the most underproduced packages are GUI applications is interesting.

Vnodes: An Architecture for Multiple File System Types… §

Title: “Vnodes: An Architecture for Multiple File System Types in Sun UNIX” by S R Kleiman [url] [dblp]
Published in 1986 at USENIX summer 1986 and I read it in 2023-07
Abstract:

quotes

  • “vfs_sync(vfsp): Write out all cached information for vfsp. Note that this is not necessarily done synchronously. When the operation returns all data has not necessarily been written out, however it has been scheduled.” (Kleiman, 1986, p. 7)
  • “The current interface has been in operation since the summer of 1984, and is a released Sun product.” (Kleiman, 1986, p. 9)
  • “Vnodes has been proven to provide a clean, well defined interface to different file system implementations.” (Kleiman, 1986, p. 9)
  • “In addition, a prototype "/proc" file system[5] has been implemented.” (Kleiman, 1986, p. 9)

summary

In this paper, the author S. R. Kleiman explains the filesystem interface developed for Sun UNIX. The basic idea is a uniform vfs/vnode interface of C structs that every filesystem implements. The interface and its proposed semantics are presented; a rough sketch of the idea follows below.
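
A hedged sketch of what such an interface looks like (illustrative struct and field names, not the actual Sun UNIX declarations): each filesystem type fills in a table of function pointers, and the filesystem-independent layer only ever calls through that table.

/* Illustrative sketch of a vnode-style interface, not the real header. */
struct vnode;                       /* one object per open file/directory */
struct uio;                         /* describes an I/O request */
struct ucred;                       /* caller credentials */

struct vnodeops {                   /* filled in once per filesystem type */
    int (*vn_open)   (struct vnode *vp, int flags, struct ucred *cred);
    int (*vn_close)  (struct vnode *vp, int flags, struct ucred *cred);
    int (*vn_rdwr)   (struct vnode *vp, struct uio *uio, int rw,
                      struct ucred *cred);
    int (*vn_getattr)(struct vnode *vp, void *attrs, struct ucred *cred);
};

struct vnode {
    const struct vnodeops *v_op;    /* operations of the owning filesystem */
    void *v_data;                   /* filesystem-private per-file state */
};

/* The filesystem-independent layer dispatches through the table. */
static inline int VOP_OPEN(struct vnode *vp, int flags, struct ucred *cred) {
    return vp->v_op->vn_open(vp, flags, cred);
}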

Very nice to see a white paper on such fundamental software architecture.

I am not familiar with filesystem requirements, but the interface, examples, and some rationale are provided. I was surprised that fsync is not synchronous. The fid structure was not understandable to me either (does the unique file ID have a length of 1 byte?).

typo

page 6, “the file pointer is is changed”

When a Patch is Not Enough - HardFails: Software-Explo… §

Title: “When a Patch is Not Enough - HardFails: Software-Exploitable Hardware Bugs” by Ghada Dessouky, David Gens, Patrick Haney, Garrett Persyn, Arun Kanuparthi, Hareesh Khattri, Jason M. Fung, Ahmad-Reza Sadeghi, Jeyavijayan Rajendran [url] [dblp]
Published in 2018-12-01 at and I read it in 2020-07
Abstract: Modern computer systems are becoming faster, more efficient, and increasingly interconnected with each generation. Consequently, these platforms also grow more complex, with continuously new features introducing the possibility of new bugs. Hence, the semiconductor industry employs a combination of different verification techniques to ensure the security of System-on-Chip (SoC) designs during the development life cycle. However, a growing number of increasingly sophisticated attacks are starting to leverage cross-layer bugs by exploiting subtle interactions between hardware and software, as recently demonstrated through a series of real-world exploits with significant security impact that affected all major hardware vendors.

HCF

“A behavior humorously hinted at in IBM System/360 machines in the form of a Halt-and-Catch-Fire (HCF) instruction.”

→ See also Christopher Domas' research on x86 since 2017
→ Pentium F00F (C7C8) bug

quotes

summary

A technical report discussing how bugs are introduced in the hardware design process and slip through testing tools. Specifically, they define HardFails as RTL bugs that are difficult to detect and can be triggered from software, potentially compromising the entire platform. Their classification into four HardFails properties {Cross-modular effects, Timing-flow gap, Cache-state gap, Hardware/Firmware-interactions} is non-exhaustive (as pointed out in the conclusion). In my opinion, the paper is too verbose when discussing the hardware development process and laying out the advantages/disadvantages of various tools. I think it could have been more concise (e.g. in “Proof assistant and theorem-proving” the scalability issue is mentioned twice).
Besides that, I think it gives a nice overview of the issues hardware design has to deal with and, yes, we need better tool support. But there is (obviously) no solution to the mentioned problems in the paper.
Catherine pointed out that CLKScrew in Table 2 does not need Firmware interaction and Foreshadow has nothing to do with Firmware interaction.

You Really Shouldn't Roll Your Own Crypto: An Empirica… §

Title: “You Really Shouldn't Roll Your Own Crypto: An Empirical Study of Vulnerabilities in Cryptographic Libraries” by Jenny Blessing, Michael A. Specter, Daniel J. Weitzner [url] [dblp]
Published in 2021-07 at and I read it in 2021-10
Abstract: The security of the Internet rests on a small number of opensource cryptographic libraries: a vulnerability in any one of them threatens to compromise a significant percentage of web traffic. Despite this potential for security impact, the characteristics and causes of vulnerabilities in cryptographic software are not well understood. In this work, we conduct the first comprehensive analysis of cryptographic libraries and the vulnerabilities affecting them. We collect data from the National Vulnerability Database, individual project repositories and mailing lists, and other relevant sources for eight widely used cryptographic libraries.

quotes

summary

Well-designed methodology. All decisions made are well-documented and put into context. I am still a bit skeptical about the statement “You really shouldn't roll your own crypto”. The data shows that cryptographic bugs occur and that cryptographic libraries should prevent errors at the API level. However, it does not really show results regarding “own crypto”: it provides no examples of custom-designed cryptographic libraries or their effects in industry.

Prior work:

π is the Minimum Value for Pi §

Title: “π is the Minimum Value for Pi” by C. L. Adler, James Tanton [url] [dblp]
Published in 2000 at and I read it in 2021-10
Abstract:

summary

If we consider Pi as the ratio of a circle's circumference to its diameter, its value depends on the chosen metric. Varying this metric gives values that run from 4 down to π and back up to 4; thus π is the minimum value of Pi. An analytic proof follows. Nice little result. Also attached is a proof that a² + b² ≥ 2ab: if the bright grey area equals a·b and the dark grey area equals a·b as well, then together they give an area of 2·a·b. If the dark grey area outside the large square is a² and the large square is b², we can observe that a² + b² ≥ 2ab because of the white square on the top.
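
As a quick check of both facts (assuming, as I read it, that the varying metrics are the p-norms; that assumption is mine, not a quote from the paper): the inequality is just a rearranged square,

\[ (a-b)^2 \ge 0 \iff a^2 - 2ab + b^2 \ge 0 \iff a^2 + b^2 \ge 2ab, \]

and the endpoint value 4 can be computed directly. In the 1-norm, the unit circle is the square with vertices (±1, 0) and (0, ±1); each of its four sides has 1-norm length 2 and the diameter is 2, so

\[ \pi_1 = \frac{4 \cdot 2}{2} = 4, \qquad \pi_\infty = 4 \text{ (analogously)}, \qquad \pi_2 = \pi. \]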