Papers

During my time at university, I read a bunch of papers. Since the beginning of my PhD studies, I have systematically reviewed and summarized them. Here are my notes.

Papers and notes

Cyclone: A safe dialect of C §

Title: “Cyclone: A safe dialect of C” by Morrisett, Cheney, Grossman, Hicks, Wang [url] [dblp]
Published at USENIX 2002 and I read it in 2020-02
Abstract:

Ambiguity

“Arrays and strings are converted to ?-pointers as necessary (automatically by the compiler).”

→ When exactly?

Example

“We don’t consider it an error if non-pointers are uninitialized. For example, if you declare a local array of non-pointers, you can use it without initializing the elements:

char buf[64];       // contains garbage ..
sprintf(buf,"a");   // .. but no err here
char c = buf[20]; // .. or even here

This is common in C code; since these array accesses are in-bounds, we allow them.”

Example

“However, this technique will not catch even the following simple variation:

char *itoa(int i) {
  char buf[20];
  char *z;
  sprintf(buf,"%d",i);
  z = buf;
  return z;
}

Here, the address of buf is stored in the variable z, and then z is returned. This passes gcc -Wall without complaint.”

Quotes

  • “Cyclone is a safe dialect of C. It has been designed from the ground up to prevent the buffer overflows, format string attacks, and memory management errors that are common in C programs, while retaining C’s syntax and semantics. This paper examines safety violations enabled by C’s design, and shows how Cyclone avoids them, without giving up C’s hallmark control over low-level details such as data representation and memory management.”
  • “Every introductory C programming course warns against them and teaches techniques to avoid them, yet they continue to be announced in security bulletins every week. There are reasons for this that are more fundamental than poor training:
    • One cause of buffer overflows in C is bad pointer arithmetic, and arithmetic is tricky. […]
    • C uses NUL-terminated strings. […]
    • Out-of-bounds pointers are commonplace in C. […]”
  • “Our goal is to design Cyclone so that it has the safety guarantee of Java (no valid program can commit a safety violation) while keeping C’s syntax, types, semantics, and idioms intact.”
  • “We must reject some safe programs, because it is impossible to implement an analysis that perfectly separates the safe programs from the unsafe programs.”
  • Table 1: “Restrictions imposed by Cyclone to preserve safety”
    • NULL checks are inserted to prevent segmentation faults
    • Pointer arithmetic is restricted
    • Pointers must be initialized before use
    • Dangling pointers are prevented through region analysis and limitations on free
    • Only “safe” casts and unions are allowed
    • goto into scopes is disallowed
    • switch labels in different scopes are disallowed
    • Pointer-returning functions must execute return
    • setjmp and longjmp are not supported
  • Table 2: “Extensions provided by Cyclone to safely regain C programming idioms”
    • Never-NULL pointers do not require NULL checks
    • “Fat” pointers support pointer arithmetic with run-time bounds checking
    • Growable regions support a form of safe manual memory management
    • Tagged unions support type-varying arguments
    • Injections help automate the use of tagged unions for programmers
    • Polymorphism replaces some uses of void *
    • Varargs are implemented with fat pointers
    • Exceptions replace some uses of setjmp and longjmp
  • “If you call getc(NULL), what happens? The C standard gives no definitive answer.”
  • “Cyclone’s region analysis is intraprocedural — it is not a whole-program analysis.”
  • “Here ‘r is a region variable.” (cf. Rust's notation)
  • “Obviously, programmers still need a way to reclaim heap-allocated data. We provide two ways. First, the programmer can use an optional garbage collector. This is very helpful in getting existing C programs to port to Cyclone without many changes. However, in many cases it constitutes an unacceptable loss of control.”
  • “A goto that does not enter a scope is safe, and is allowed in Cyclone. We apply the same analysis to switch statements, which suffer from a similar vulnerability in C.”
  • “The Cyclone compiler is implemented in approximately 35,000 lines of Cyclone. It consists of a parser, a static analysis phase, and a simple translator to C. We use gcc as a back end and have also experimented with using Microsoft Visual C++.”
  • “When a user compiles with garbage collection enabled, we use the Boehm-Demers-Weiser conservative garbage collector as an off-the-shelf component.”
  • “We achieve near-zero overhead for I/O bound applications such as the web server and the http programs, but there is a considerable overhead for computationally-intensive benchmarks;”
  • “Cyclone’s representation of fat pointers turned out to be another important overhead. We represent fat pointers with three words: the base address, the bounds address, and the current pointer location (essentially the same representation used by McGary’s bounded pointers [26]).” (see the toy sketch after this list)
  • “Good code generation can make a big difference: we found that using gcc’s -march=i686 flag increased the speed of programs making heavy use of fat pointers (such as cfrac and grobner) by as much as a factor of two, because it causes gcc to use a more efficient implementation of block copy.”
  • “We found array bounds violations in three benchmarks when we ported them from C to Cyclone: mini_httpd, grobner, and tile. This was a surprise, since at least one (grobner) dates back to the mid 1980s.”
  • “Cyclone began as an offshoot of the Typed Assembly Language (TAL) project”
  • “In C, a switch case by default falls through to the next case, unless there is an explicit break. This is exactly the opposite of what it should be: most cases do not fall through, and, moreover, when a case does fall through, it is probably a bug. Therefore, we added an explicit fallthru statement,” […] “Our decision to “correct” C’s mistake was wrong. It made porting error-prone because we had to examine every switch statement to look for intentional fall throughs, and add a fallthru statement.”
  • “There is an enormous body of research on making C safer. Most techniques can be grouped into one of the following strategies:”
    1. Static analysis. […]
    2. Inserting run-time checks. […]
    3. Combining static analysis and run-time checks. […]
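
A toy Python model of the three-word fat-pointer representation (my own illustration of the concept, not Cyclone's actual implementation):

class FatPointer:
    # Three words: base, bound (one past the end), and the current position.
    def __init__(self, buf, base=0, cur=0):
        self._buf, self.base, self.bound, self.cur = buf, base, len(buf), cur

    def add(self, offset):
        # Pointer arithmetic may go out of bounds without error ...
        return FatPointer(self._buf, self.base, self.cur + offset)

    def deref(self):
        # ... but every dereference is bounds-checked at run time.
        if not (self.base <= self.cur < self.bound):
            raise IndexError("fat pointer dereference out of bounds")
        return self._buf[self.cur]

p = FatPointer(bytearray(b"hello"))
assert p.add(4).deref() == ord("o")
try:
    p.add(5).deref()
except IndexError:
    print("out-of-bounds access caught")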

Region blocks

“Therefore, Cyclone provides a feature called growable regions. The following code declares a growable region, does some allocation into the region, and deallocates the region:

region h {
  int *x = rmalloc(h,sizeof(int));
  int ?y = rnew(h) { 1, 2, 3 };
  char ?z = rprintf(h,"hello");
}

The code uses a region block to start a new, growable region that lives on the heap. The region is deallocated on exit from the block (without an explicit free).”

Summary

Great paper with a pragmatic approach derived from the Typed Assembly Language project. Definitely worth a read. Essential for everyone interested in the Rust programming language, as this project inspired many ideas related to proper memory management and lifetimes. The implementation needs to be compiled on your own, but it is not maintained anymore anyway. Furthermore, there are more papers following the progress of the project, and they introduced more drastic changes which discard the label “pragmatic”.

Tagged unions

“We solve this in Cyclone in two steps. First, we add tagged unions to the language:”

tunion t {
  Int(int);
  Str(char ?);
};

[…]

void pr(tunion t x) {
  switch (x) {
    case &Int(i): printf("%d",i); break;
    case &Str(s): printf("%s",s); break;
  }
}

“The printf function itself accesses the tagged arguments through a fat pointer (Cyclone’s varargs are bounds checked)”

Historical Notes on the Fast Fourier Transform §

Title: “Historical Notes on the Fast Fourier Transform” by Cooley, Lewis, Welch [url] [dblp]
Published at IEEE 1967 and I read it in 2020-03
Abstract: The problem of estimating the errors arising in a variety of numerical operations involving bandlimited functions is considered. The errors are viewed as responses of suitably created systems, and the analysis is based on the evaluation of the maximum response of these systems in terms of the energy or power of their input. The investigation includes deterministic and random signals, and it is extended to two-dimensional functions and Hankel transforms. Finally, the results are related to the uncertainty principle in one and two variables.

summary

  • Survey paper on the historical developments of the four preceding years. Nice and straightforward summary, though I didn't go through the details of the prime factor algorithm.
  • “The greatest emphasis, however, was on the computational economy that could be derived from the symmetries of the sine and cosine functions”
  • “use the periodicity of the sine-cosine functions to obtain a 2N-point Fourier analysis from two N-point analyses with only slightly more than N operations. Going the other way, if the series to be transformed is of length N and N is a power of 2, the series can be split into log₂ N subseries”
  • “The number of computations in the resulting successive doubling algorithm is therefore proportional to N log₂ N rather than N²” (see the sketch after this list)
  • “The fast Fourier transform algorithm of Cooley and Tukey is more general in that it is applicable when N is composite and not necessarily a power of 2”
  • “the algorithms are different for the following reasons: 1) in the Thomas algorithm the factors of N must be mutually prime; 2) in the Thomas algorithm the calculation is precisely multidimensional Fourier analysis with no intervening phase shifts or “twiddle factors” as they have been called; and 3) the correspondences between the one-dimensional index and the multidimensional indexes in the two algorithms are quite different.”
  • “The factor W_N^{j_0 n_0}, referred to as the “twiddle factor” by Gentleman and Sande, is usually combined with either the W_r^{j_0 n_1} factor in (eq 4) or the W_s^{j_1 n_0} factor in (eq 5)”
    • eq 4: A_1(j_0, n_0) = \sum_{n_1=0}^{r-1} A(n_1, n_0) W_r^{j_0 n_1}
    • eq 5: X(j_1, j_0) = \sum_{n_0=0}^{s-1} A_1(j_0, n_0) W_s^{j_1 n_0} W_N^{j_0 n_0}
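
A minimal Python sketch of the successive-doubling idea (my own illustration, not from the paper; the input length must be a power of 2):

import cmath

def fft(x):
    # Recursive radix-2 successive doubling: an N-point analysis from
    # two N/2-point analyses, giving N log2 N operations overall.
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor W_N^k
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out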

SoK: Science, Security and the Elusive Goal of Securit… §

Title: “SoK: Science, Security and the Elusive Goal of Security as a Scientific Pursuit” by Herley, Oorschot [url] [dblp]
Published at the 2017 IEEE Symposium on Security and Privacy and I read it in 2020-05
Abstract: The past ten years has seen increasing calls to make security research more “scientific”. On the surface, most agree that this is desirable, given universal recognition of “science” as a positive force. However, we find that there is little clarity on what “scientific” means in the context of computer security research, or consensus on what a “Science of Security” should look like. We selectively review work in the history and philosophy of science and more recent work under the label “Science of Security”. We explore what has been done under the theme of relating science and security, put this in context with historical science, and offer observations and insights we hope may motivate further exploration and guidance. Among our findings are that practices on which the rest of science has reached consensus appear little used or recognized in security, and a pattern of methodological errors continues unaddressed.

Bacon's scientific approach

Bacon [8] formalized an inductive method of generalizing from observations—his method of observation, generalization and correction is acknowledged as an early articulation of what many historically view as the basic scientific method

Empirism versus formal approaches

Mathematical models have proved enormously useful in various branches of Science. It is worth emphasizing that a mathematical model is judged on the accuracy of its predictions rather than the perceived reasonableness of its assumptions.

There is no confusion on this point when checking predictions against measurements is easy: when there’s a conflict between the model and the data, the data wins (see, e.g., remarks by Feynman and others in Section II-D). However, in disciplines where direct measurement and controlled experiments are hard (e.g., Economics and Security) it can be tempting to seek other forms of validation for a mathematical model.

This is a limitation of any mathematical model: how well a model matches reality can only be tested empirically.

inductive / deductive

Probably the most significant settled point in the philosophy of science is that inductive and deductive statements constitute different types of knowledge claims.

The importance of this distinction has long been known. Versions date back to Plato, who distinguished the messy, physical realm of things observed from the perfect non-physical world of “forms.”

Despite broad consensus in the scientific community, in Security there is repeated failure to respect the separation of inductive and deductive statements. Both types have value, provided their limitations are recognized and they are kept separate.

Kant’s very thorough treatment was influential [5]; he calls statements whose truth is independent of experience a priori and those that depend on observation a posteriori.

Mill argued essentially the reverse: induction is our only path to understanding the world since, on its own, deduction is incapable of helping us discover anything about the world [9]:

But this is, in fact, to say that nothing ever was, or can be, proved by syllogism, which was not known, or assumed to be known, before.

Ayer, a logical positivist, summarizes [6, p.57]:

the characteristic mark of a purely logical inquiry is that it is concerned with the formal consequences of our definitions and not with questions of empirical fact.

Is security special?

Biological and military systems must also guarantee robustness in the presence of adversaries (see Forrest et al. [78], [79]). In that many of its discoveries are expressible as laws, Physics is an exception rather than the rule [3, pp.197-208]. The compactness and elegance of the mathematical expression of much of Physics is unmatched in other Sciences

Main points

History/Philosophy of Science

  • How to organize “Science of Security”?
  • inductive versus deductive; a priori versus a posteriori; analytic versus synthetic
    • “all swans are white” (inductive reasoning; like in physics) versus Euclidean geometry/Pythagoras' theorem (deductive reasoning; like in math)
    • Hume’s “problem of induction”: the logical basis for believing inductive claims is weak in comparison with the certainty of deductive ones.
    • Galileo discovered the moons of Jupiter. Aristotle proclaimed that all heavenly bodies orbited the Earth
    • Bacon: formalized an inductive method of generalizing from observations
    • Plato: we could observe only an illusory shadow of the perfect world of forms
    • Mill: induction is our only path to understanding the world; deduction is incapable of helping us discover anything about the world
    • Medawar: Deduction in itself is quite powerless as a method of scientific discovery
  • Falsification as Demarcation Criterion (“scientific” versus “non-scientific”)
    • Thomson’s “plum-pudding” model of the atom and Pauling’s triple helix model of DNA are incorrect. It would seem harsh to describe them as unscientific
    • Popper: “A theory which is not refutable by any conceivable event is non-scientific. Irrefutability is not a virtue of a theory (as people often think) but a vice.”
      • Counterargument 1 (Hempel, The raven paradox): “all ravens are black” ⇔ “all non-black things are not ravens”. Observing a green apple supports the claim
      • Counterargument 2 (The tacking problem): “all ravens are black” is falsifiable ⇒ scientific, “all ravens are black and Loch Ness exists” is falsifiable ⇒ scientific. Mayo introduces the notion of “severe test”
      • Counterargument 3 (Duhem-Quine thesis, holism): “there’s a $20 bill under a rock on a planet 1 million light years from here”; must falsification be theoretical or practical? “The holism problem is that it is difficult, or even impossible, to isolate falsification of a single statement from assumptions about the observation.” Observations themselves are subject to error
      • Counterargument 4: “if no possible observation conflicts with a theory, then it is unscientific”. “Falsification offers a clear criterion for ruling theories unscientific.” but not for being scientific.
    • Shannon on information theory: “If, for example, the human being acts in some situations like an ideal decoder, this is an experimental and not a mathematical fact, and as such must be tested under a wide variety of experimental situations.” (corresponds to “how well a model matches reality can only be tested empirically“)
    • “when there’s a conflict between the model and the data, the data wins”
  • Relation between Mathematics and Science
    • “Whether a geometry can be applied to the actual physical world or not, is an empirical question which falls outside the scope of the geometry itself. There is no sense, therefore, in asking which of the various geometries known to us are false and which are true. In so far as they are all free from contradiction, they are all true.”
    • “how well a model matches reality can only be tested empirically” (example: Shannon expressed this about his information theory)
  • Viewpoints of Major Scientists
    • Bohr: identified with the logical positivists ⇒ questions that could not be tested are nonscientific
    • Bohr: “It is wrong to think that the task of physics is to find out how Nature is. Physics concerns what we can say about Nature”
    • Feynman: “In general, we look for a new law by the following process. First, we guess it, no, don’t laugh, that’s really true. Then we compute the consequences of the guess, to see what, if this is right, if this law we guess is right, to see what it would imply and then we compare the computation results to nature, or we say compare to experiment or experience, compare it directly with observations to see if it works. If it disagrees with experiment, it’s wrong. In that simple statement is the key to science.”
    • Platt (notion of strong inference): “any conclusion that is not an exclusion is insecure.”
    • Feynman: “You can’t prove a vague theory wrong”
    • Ayala: “A hypothesis is scientific only if it is consistent with some but not other possible states of affairs not yet observed, so that it is subject to the possibility of falsification by reference to experience”
  • Hypothetico-deductive Model
    • Required properties of a scientific theory:
      • Consistency
      • Falsifiability
      • Predictive power and progress
    • hypothetico-deductive model:
      1. Form hypotheses from what is observed.
      2. Formulate falsifiable predictions from those hypotheses
      3. If new observations agree with the predictions, an hypothesis is supported (but not proved); if they disagree, it is rejected
    • Sherlock Holmes: “Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth.”
  • Sciences of the Artificial (Research on Human Artifacts)
    • “The place of much younger Computer Science (II-C2) is less clear, with debate on its status as a science [29]”
    • “Differences between fields and methodologies have led to historical tensions between the pure and applied sciences”
    • “Much of Security research is directly dependent on human-made artifacts; e.g., results often depend directly on specific software”
    • “both the generality and longevity of results impact the type of scientific results we might expect in Security.”
  • Pasteur’s Quadrant (uniting basic and applied research)
    • two-by-two grid with Yes/No entries
      rows ask “Quest for fundamental understanding?”
      columns ask “Considerations of use?”
      • (Yes, No) is Bohr’s quadrant (his modeling of atomic structure exemplifies basic research)
      • (No, Yes) is Edison’s quadrant (his pursuit of commercial electric lighting exemplifies applied researchers uninterested in basic science implications)
      • (Yes, Yes) is Pasteur’s quadrant (his many contributions combine pursuit of basic knowledge and target use)
      • (No, No) is empty
    • McLean (1987): “Hence, we have developed an environment where our documented foundations are inadequate, yet shielded from adversity by appeals to implicit assumptions “which everybody knows about” (even if people disagree on what these assumptions are!) ... Such is the path to neither science nor security.”

Science of Security

  • Saltzer and Schroeder, 1975, “The protection of information in computer systems”
  • US Department of Defense, 1983, “Trusted Computer System Evaluation Criteria (TCSEC)” aka Orange Book
  • P. A. Karger and R. R. Schell, 2002, “Thirty Years Later: Lessons from the Multics Security Evaluation”
  • P. Karger and R. Schell, 1974, “Multics Security Evaluation: Vulnerability Analysis, ESD-TR-74-193, Vol. II”
  • McLean, 1987: “Hence, we have developed an environment where our documented foundations are inadequate, yet shielded from adversity by appeals to implicit assumptions “which everybody knows about” (even if people disagree on what these assumptions are!) ... Such is the path to neither science nor security”
  • Good, 1987, “What is important is that the abstractions are done in such a way so that when we prove some property about the abstraction, then that property is true of the real, running system”
  • “Good paints the way forward as formal verification techniques, carefully defined as “the use of rigorous mathematical reasoning and logic in the system engineering process to produce real systems that are proved to meet their requirements”
  • DeMillo, 1979, “formal verifications of programs, no matter how obtained, will not play the same key role in the development of computer science and software engineering as proofs do in mathematics.”
  • “Scientists should not confuse mathematical models with reality—and verification is nothing but a model of believability.”
  • JASON report 2010: “The science seems under-developed in reporting experimental results, and consequently in the ability to use them. The research community does not seem to have developed a generally accepted way of reporting empirical studies so that people could reproduce the work.”
  • Claims of what we need more or less of
    • More formal approaches
    • Better support/tool/understanding for Empiricism and data collection
      • Defined metrics and measurements
    • Scientific training
    • Attack papers
  • Pfleeger: “stop insisting that quantitative is better than qualitative; both types of measurement are useful”
  • Is a Science of Security even possible?
    • Adaptive Adversary: Unlike other fields, security has a quick adaptive adversary. “Nature does not form new types of storms in response to improved techniques.”
    • Absence of invariant laws: There might not be any universal truths. Evans and Stolfo (2011): “Computer security is too entwined with human behavior and engineered systems to have universal laws at the physics level”
    • Man-made artifacts: JASON report “cyber-security is an artificially constructed environment that is only weakly tied to the physical universe” and “the threats associated with cybersecurity are dynamic”
    • Security is not special: “Biological and military systems must also guarantee robustness in the presence of adversaries. In that many of its discoveries are expressible as laws, Physics is an exception rather than the rule. The compactness and elegance of the mathematical expression of much of Physics is unmatched in other Sciences”
    • “Pleading uniqueness to avoid being held to scientific approaches is common in unscientific fields, and would place Security in poor company.”
  • Controversial examples
    • “‘Provable security’ involves proofs showing that breaking a target cryptosystem allows solving a believed-hard problem in not much further effort (i.e., “not much” in the asymptotic sense). Bellare [80] suggests reductionist security as a less misleading term.”
    • Is crypto the role model of security?
      • Degrabriele: provable security has transitioned crypto to a science
    • The gap between provable security and the real world
      • “Describing an SSH side-channel attack [88] by two of them, they note that the obvious question is [84]: “how we would be able to attack a variant of SSH that was already proven secure.” The answer: the security model failed to consider differences in failure modes by which real implementations report errors—here, allowing attackers to send small units of ciphertext to repeatedly extract small amounts of information by observed differences in how the receiving system responds,”
      • “Side-channels continue to epitomize the difficulty of modeling real world problems—raising problems even with definitions.”

Failures to apply lessons from science

  • Failure to observe inductive-deductive split
    • “Despite broad consensus in the scientific community, in Security there is repeated failure to respect the separation of inductive and deductive statements. Both types have value, provided their limitations are recognized and they are kept separate.”
    • “Schneider [51] suggests modifying the definition of Science to include deductive statements. “The status of the natural sciences remains unaffected by changing the definition of a science in this way. But computer science now joins.””
    • “Speaking of mathematical guarantees as if they are properties of real-world systems is a common error.”
    • “A simple example may flush out misunderstandings. For example, we might think that the claim “a 128-bit key is more secure than a 64-bit key” can be evaluated purely by deduction. First, we must separate the question of whether we believe a particular claim from the question of whether the reasoning behind it is sound; a valid conclusion doesn’t imply a solid argument.”
  • Reliance on unfalsifiable claims
    • “We can observe that something is insecure (by observing a failure) but no observation allows us to determine empirically that something is secure (this observation is often used to motivate formal approaches)”
    • “claims of the form “X improves security” are unfalsifiable”
  • Failure to bring theory into contact with observation
    • “A scientific model is judged on the accuracy of its predictions”
  • Failure to make claims and assumptions explicit
    • “If a theory says “X should never happen under assumptions A, B and C” then showing that it does suffices to refute the claim. But when a statement is vague, or assumptions implicit, it is unclear what, if anything, is ruled out. Thus, difficulty articulating what evidence would falsify a claim suggests implicit assumptions or an imprecise theory [3].”
    • “The problem of implicit assumptions seems widespread”
    • “This leads Bishop and Armstrong [101] to suggest that the skill of reverse-engineering to uncover assumptions implicit in a security design is a vital part of a computer security education”
  • Failure to seek refutation rather than confirmation
    • hypotheses are most useful when they allow anticipation of as-yet unseen things, and observations are most useful when they present severe tests to existing hypotheses
    • observations must actively seek to refute existing belief

Ways forward: Insights and discussion

Summary observations and constructive suggestions follow:

  • T1: “Pushes for “more science” in security, that rule nothing in or out, are too ambiguous to be effective. Many insights and methods from philosophy of science remain largely unexplored in security research.”
  • T2: “Ignoring the sharp distinction between inductive and deductive statements is a consistent source of confusion in security.”
  • “It is worth being unequivocal on this point. There is no possibility whatsoever of proving rigorously that a real-world system is “secure” in the commonly interpreted sense of: invulnerable to (all) attacks.”
  • T3: “Unfalsifiable claims are common in security—and they, along with circular arguments, are used to justify many defensive measures in place of evidence of efficacy.”
  • T4: “Claims that unique aspects of security exempt it from practices ubiquitous elsewhere in science are unhelpful and divert attention from identifying scientific approaches that advance security research.”
  • T5: “Physics-envy is counterproductive; seeking “laws of cybersecurity” similar to physics is likely to be a fruitless search.”
  • T6: “Crypto-envy is counterproductive; many areas of security, including those involving empirical research, are less amenable to formal treatment or mathematical role models.”
  • T7: “Both theory and measurement are needed to make progress across the diverse set of problems in security research.”
  • “The process is iterative, with theory and observation in constant contact. […] While both are essential, recent history suggests that theory that has not been tested by observation is currently a greater problem in security than measurement that fails to test theory”
  • T8: “More security research of benefit to society may result if researchers give precise context on how their work fits into full solutions—to avoid naive claims of providing key components, while major gaps mean full-stack solutions never emerge.”
  • “Regardless, more research of societal benefit may result if researchers took responsibility for explaining how contributions add up to full solutions; these rarely appear by chance.”
  • T9: “Conflating unsupported assertions, and argument-by-authority, with evidence-supported statements, is an avoidable error especially costly in security.”
  • T10: “Despite consensus that assumptions need be carefully detailed, undocumented and implicit assumptions are common in security research.”
  • “One possibility is to find a forcing function to make assumptions explicit.”
  • “As one example (towards a different goal), Nature demands that abstracts contain a sentence beginning ‘Here we show that.’”
  • “Platt recommends answering either “what experiment would disprove your hypothesis” or “what hypothesis does your experiment disprove.””
  • T11: “Science prioritizes efforts at refutation. Empirical work that aims only to verify existing beliefs, but does not suggest new theory or disambiguate possibilities falls short of what science can deliver.”
  • “It is often noted (e.g.,[62]) that learning accelerates if we learn from mistakes in other disciplines; arguably, security research is learning neither from other disciplines nor its own literature, and questioning security foundations is not new”

Appendix

  • “The problem with such terms as ‘automated theorem-proving’, ‘computer-aided theorem proving’, and ‘automated proof-checking’ is the same as the problem with the term ‘provable security’ ... they promise a lot more than they deliver.”
  • “Maxion [56] emphasizes experiments which are repeatable (produce consistent measurements), reproducible by others (cf. [133]), and valid (well-grounded and generalizable), with focus on confounding errors. He cites Feynman: “Experiment is the sole judge of scientific ‘truth’.””
  • “Seven years after Kocher’s well-known timing attacks, formal models still failed to address side-channels.”

Popper's criterion

A simplified statement of Popper’s criterion is that scientific theories should be falsifiable [14]:

A theory which is not refutable by any conceivable event is non-scientific. Irrefutability is not a virtue of a theory (as people often think) but a vice.

Einstein’s Relativity, by contrast, even though it assaulted basic intuitions about time and space, made testable predictions.

Programs as predicates

The origins of Computer Science and Computer Security are mathematical. WWII ciphers and code-breaking are justly revered. Without the accomplishments of modern cryptography, it is doubtful the World-Wide-Web would have attained such influence. This has contributed to the strongly mathematical flavor of our discipline. In 1984, Hoare describes a computer program as [18] “a logical predicate describing all of its permitted behaviors”; and further, that the final design meets the original requirements “can be mathematically proved before starting the implementation of its components.” This view of programs as implementing mathematical predicates, is unsurprising given the importance of the development of algorithms. For an algorithm that involves sorting, searching etc., it is important that its behavior is well-understood under all possible inputs.

Shamir and Schell on scientific security

Shamir, in accepting the 2002 Turing award, described non-crypto security as “a mess.” Schell, in 2001, described the field as being filled with “pseudo-science and flying pigs” [2].

Side-channel attacks reveal the gap between theory and practice

Side-channels continue to epitomize the difficulty of modeling real world problems—raising problems even with definitions.

summary

This paper tackles the difficult question of discussing IT security from a scientific perspective. A demarcation criterion distinguishes between “scientific” and “non-scientific” content. Are any approaches in security scientific at all? Is it an inductive or deductive process? How does security compare to other sciences? Where does security position itself in Pasteur's Quadrant, and is falsification an admissible demarcation criterion? What are the conclusions of the JASON report? A thorough discussion follows, closing with summary observations and constructive suggestions.

Extensive, well-written paper. Nice summary in section 2 “History/Philosophy of Science”, which can be debated with any curious mind of any field. The following sections are security-specific. Drawing conclusions on the central question is difficult, and the paper cannot provide them either. But it shows a lack of procedures for empirical approaches that are established in other sciences. I think Appendix A is very important and should have been put into the main text (because falsification is such an extensively discussed criterion in the paper).

typo

Further, Science generally requires that a hypothesis survives severe tests before it is trusted.

(Appendix A)

typo

give a highly accessible exposition

Number "Not" Used Once - Key Recovery Fault Attacks on… §

Title: “Number "Not" Used Once - Key Recovery Fault Attacks on LWE Based Lattice Cryptographic Schemes” by Ravi, Bhasin, Chattopadhyay [url] [dblp]
Published at COSADE 2019 and I read it in 2020-05
Abstract: This paper proposes a simple single bit flip fault attack applicable to several LWE (Learning With Errors Problem) based lattice based schemes like KYBER, NEWHOPE, DILITHIUM and FRODO which were submitted as proposals for the NIST call for standardization of post quantum cryptography. We have identified a vulnerability in the usage of nonce, during generation of secret and error components in the key generation procedure. Our fault attack, based on a practical bit flip model (single bit flip to very few bit flips for proposed parameter instantiations) enables us to retrieve the secret key from the public key in a trivial manner. We fault the nonce in order to maliciously use the same nonce to generate both the secret and error components which turns the LWE instance into an exactly defined set of linear equations from which the secret can be trivially solved for using Gaussian elimination.

ambiguity

What is z in the provided interval in section 2.6? Seems to be either an arbitrary integer or z of Algorithm 1 (NewHope)

error

“later converted to the NTT domain using the PolyBitRev function” … no, it is a separate step

inconsistency

  • t = As1 + s2 in the running text of section 2.5
  • t = A × s1 + s2 in Algorithm 4

quotes

  • “Nonces are predominantly used in all the afore mentioned schemes in order to reduce the amount of randomness required to generate the secret and error components used in the generation of the LWE instance during key generation.”
  • “main focus on the key generation algorithm, as that is the target of our attack.”
  • “We also use x ← Sη to denote the module x whose coefficients lie in the range [−η, η].”
  • About Kyber: “Apart from this, there are other modifications to the scheme like the modified technique for generation of the public module A as in [4], compressed public key and ciphertexts through "Bit-dropping" using the Learning-with-rounding (LWR) problem”
  • “every coefficient of the recovered t is only a perturbed version of the original t generated during the key-generation procedure KYBER.CPAPKE.GEN(). Thus, the public key can be assumed to be built based on the hardness of both the MLWE and MLWR problem.”
  • About Frodo: “The modulus q is chosen to be a power of 2 that enables easy reduction through bit masking.”
  • “For example, if the error component is an all zero vector, then the LWE instance is converted into a set of linear equations with equal number of equations and unknowns. This instance can be solved using straight forward Gaussian elimination. If the error component only has values in a fixed interval [z + 1/2, z - 1/2], then one can just "round away" the non-integral part and subtract z to remove the error from every sample [29]. There are also other easy instances of LWE, For eg. From a given set of n LWE instances, if k of the n error components add up to zero, then one can simply add the corresponding samples to cancel the error and obtain an error free sample. It is also possible to solve an LWE instance in roughly nd time and space using nd samples if the error in the samples lie in a known set of size d [5]. For a very small d, this yields a very practical attack.”
  • “Thus, if an attacker can perform a single bit flip of the nonce such that both the calls to the function poly_sample use the same seed, then it yields s = e.”
  • “Thus, ultimately the attacker has to inject a minimum of just two bit flips in order to create a very easy LWE instance that results in direct retrieval of the secret key S through Gaussian elimination.”
  • “In KYBER, the dimensions of both s1 and s2 are the same and equal k. But, the generated MLWE instance t = a × s1 + s2 is further protected by the hardness of the MLWR problem through the Compressq function and hence the public key is formed by a combination of MLWE and MLWR instances.”
  • “Since the nonce is deterministically generated, one can use an error correction scheme to check if the correct value of nonce is being used for the generation of polynomials.”
  • “The motivation to use a nonce for generation of polynomials is mainly to reduce the randomness requirement and use the same seed to generate multiple polynomials.”

summary

  • paper from 2019
  • public key in Learning With Errors problem is given by A×s + e
  • If s = e, then A×s + s gives a trivial linear equation system to solve to get the secret key s (see the sketch after this list)
  • if the nonce is the same between the generation of s and e, then s = e
  • stuck-at fault attack on nonces in NewHope, Kyber, Dilithium, and Frodo schemes
  • trivial for NewHope (keep poly_sample(&shat, noiseseed, 0) the same, apply stuck-at(0) to poly_sample(&ehat, noiseseed, 1))
  • Technically more difficult for Frodo, because stuck-at(1) must be applied to Frodo.SampleMatrix(seed_E, n, n_bar, T_chi, 2). Requires two stuck-at bits unlike NewHope
  • For Dilithium, l stuck-ats are required, but then the equation system is only solvable if enough equations (i.e. signatures) are available
  • For Kyber, k stuck-ats are required, but the equation system is not necessarily solvable, since rounding still protects it
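
A minimal sketch (my own, with toy parameters; it assumes sympy) of why s = e trivializes the problem: the public key b = A×s + e collapses to b = (A + I)×s mod q, a plain linear system.

# Toy sketch: with e = s, the LWE public key b = A*s + e mod q becomes
# b = (A + I)*s mod q, and Gaussian elimination (here via sympy's inv_mod)
# recovers the secret directly.
from sympy import eye, randMatrix

q, n = 257, 8                            # toy prime modulus and dimension
A = randMatrix(n, n, min=0, max=q - 1)   # public matrix
s = randMatrix(n, 1, min=0, max=2)       # small secret, reused as the "error"
b = (A * s + s) % q                      # faulted key generation: e = s

# A + I is invertible mod q with overwhelming probability for random A.
recovered = ((A + eye(n)).inv_mod(q) * b) % q
assert recovered == s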

Short paper, tiny errors, no practical implementation (how successfully can the attack on Frodo be implemented?) (correction: unlike the paper, their presentation slides show evaluation data). The structure of the text is beautiful and it is well-written. I particularly enjoyed the summary of weak LWE instances. Masking is another countermeasure, but it is heavyweight compared to the proposed error-correcting codes.

typo

The aforementioned fault vulnerabilities associated with the nonce only occur due to the implementation strategies used and hence can we believe can be easily corrected albeit with a small performance overhead.

⇒ hence, we believe, can be easily corrected

SEVurity: No Security Without Integrity §

Title: “SEVurity: No Security Without Integrity” by Wilke, Wichelmann, Morbitzer, Eisenbarth [url] [dblp]
Published at IEEE Symposium on Security and Privacy 2020 and I read it in 2020-06
Abstract: One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them. Full memory encryption for Virtual Machines (VM) protects against curious cloud providers as well as otherwise compromised hypervisors. AMD Secure Encrypted Virtualization (SEV) is the most prevalent hardware-based full memory encryption for VMs. Its newest extension, SEV-ES, also protects the entire VM state during context switches, aiming to ensure that the host neither learns anything about the data that is processed inside the VM, nor is able to modify its execution state. Several previous works have analyzed the security of SEV and have shown that, by controlling I/O, it is possible to exfiltrate data or even gain control over the VM’s execution. In this work, we introduce two new methods that allow us to inject arbitrary code into SEV-ES secured virtual machines. Due to the lack of proper integrity protection, it is sufficient to reuse existing ciphertext to build a high-speed encryption oracle. As a result, our attack no longer depends on control over the I/O, which is needed by prior attacks. As I/O manipulation is highly detectable, our attacks are stealthier. In addition, we reverse-engineer the previously unknown, improved Xor-Encrypt-Xor (XEX) based encryption mode, that AMD is using on updated processors, and show, for the first time, how it can be overcome by our new attacks.

Intel SGX's role

Intel Software Guard Extensions (SGX) was the first widely available solution for protecting data in RAM. However, it can only protect a small chunk of RAM, not the VM as a whole.

Memory encryption systems for VMs

  • AMD Secure Memory Encryption (SME) (2016): drop-in, AES-based RAM encryption. Controlled by Secure Processor (SP) co-processor. A special bit in the page table – the so-called C-bit – is used to indicate whether a page should be encrypted. The page table in the guest is encrypted and thus not accessible by the hypervisor.
  • AMD Secure Encrypted Virtualization (SEV): SEV extends SME for VMs by using different encryption keys per VM, in order to prohibit the hypervisor from inspecting the VM’s main memory
    • SEV Encrypted State (SEV-ES)
  • Intel Total Memory Encryption (TME)
  • Intel Multi-Key Total Memory Encryption (MKTME)

Comparison: S. Mofrad, F. Zhang, S. Lu, and W. Shi, “A comparison study of intel sgx and amd memory encryption technology,”, 2018

Notes

  • Why is the tweak value only defined for index ≥ 4? “We denote the first one as t_4, since there are no dedicated constants for the least significant bits 3 to 0.” seems to be a resulting condition, not a reason.
  • Is the equation system resulting from Dec_K(Enc_K(…), q_j) really linear? XOR with m makes it affine linear, doesn't it? Is bit(i, p) linear? I don't think so, unless we are in ℤ_2?!
  • See Table 1. They used 4-byte repeating patterns for the tweak constants.
  • 0x7413 is “jump if equal” on x86_64, but 0xEB13 is “jump unconditionally”
  • page 8: 16 MB = 16 777 216 bytes = 134 217 728 bits.
    bytes = 16 * 1024**2           # 16 MB
    bits = 8 * bytes
    seconds = 37.5                 # my assumption; inferred from the paper’s 3.41 MBit/s figure (128 MBit / 37.5 s)
    (bytes / seconds) / 1024**2    # MByte/s
    ⇒ 0.4266666666666667
    (bits / seconds) / 1024**2     # MBit/s
    ⇒ 3.4133333333333336    # ok, this occurs in the paper
    (bytes / seconds) / 1024       # KByte/s
    ⇒ 436.9066666666667     # this diverges from the paper with “426.67 KB/s”
  • “The first mechanism utilizes the cpuid instruction, which is emulated by the hypervisor and features a 2-byte opcode: Each time cpuid is executed, the hypervisor is called to emulate it.”

On virtual memory with virtual machines

On virtualized systems, two different page tables are used. Within the VM, the VA used by the guest, the Guest Virtual Address (GVA), is translated to the Guest Physical Address (GPA). The GPA is the address which the VM considers to be the PA. However, on the host system itself another page table is introduced, to allow multiple VMs to run on the same physical machine. This second page table is called the Second Level Address Translation (SLAT), or Nested Page Table (NPT) [5]. The NPT translates the GPA into the Host Physical Address (HPA), the actual address of the data in physical memory.
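
A minimal sketch of this two-level translation (my own toy model with 4 KiB pages and single-entry page tables, not from the paper):

guest_page_table = {0x1000: 0x5000}   # GVA page -> GPA page (managed by the guest OS)
nested_page_table = {0x5000: 0x9000}  # GPA page -> HPA page (managed by the hypervisor)

def translate(gva):
    gpa = guest_page_table[gva & ~0xFFF] | (gva & 0xFFF)    # first level: GVA -> GPA
    return nested_page_table[gpa & ~0xFFF] | (gpa & 0xFFF)  # second level (SLAT/NPT): GPA -> HPA

assert translate(0x1234) == 0x9234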

summary

  • Point of this paper: AMD SEV lacks integrity protection and thus memory encryption of a virtual machine can be broken by an untrusted hypervisor. Hypervisor can thus modify memory content (i.e. data or instructions) of a VM arbitrarily.
  • The researchers determined tweak values T(p) using a linear equation system. The AMD Epyc Embedded processor line was considered. AMD Epyc 7251 (released June 2017) used the XE encryption mode whereas AMD Epyc (released Feb 2018) uses the XEX encryption mode
    • Xor-Encrypt (XE): Enc_K(m, p) := AES_K(m ⊕ T(p)) and Dec_K(c, p) := AES⁻¹_K(c) ⊕ T(p)
    • Xor-Encrypt-Xor (XEX): Enc_K(m, p) := AES_K(m ⊕ T(p)) ⊕ T(p) and Dec_K(c, p) := AES⁻¹_K(c ⊕ T(p)) ⊕ T(p)
  • We never know the key K. Thus we cannot decrypt text and encrypt it again. But since we know T(p), where p is the physical address (i.e. the address of a 16-byte block), moving blocks changes the ciphertext in a predictable way
    • XE: Dec_K(Enc_K(m, p), q) = AES⁻¹_K(AES_K(m ⊕ T(p))) ⊕ T(q) = m ⊕ T(p) ⊕ T(q) = m ⊕ T(p ⊕ q) (see the toy sketch after this list)
  • When can we make changes in the memory? Whenever control is handed back from the VM to the hypervisor.
    • We can simply remove the executable bit from the page and a page fault will be triggered. Then control is returned, and we also know the address p. Adding the executable bit again and returning control allows execution to continue stepwise (stepwise execution is a common SGX attack strategy, but unimportant for us)
    • Or we can inject the cpuid instruction. The hypervisor is responsible for emulating/answering cpuid by copying the return values to a predefined memory segment.
  • So we can move pages without screwing up the encryption. But how can we actually modify memory content? We define an encryption oracle:
    1. At boot time, cpuid is issued. We flip bits and create a loop to repeatedly call cpuid.
    2. The hypervisor provides content as return value of cpuid. The kernel in the VM encrypts it. Thus the hypervisor can read the encrypted version of the plaintext it provided.
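
A toy demonstration of the XE malleability (my own sketch; it assumes pycryptodome and uses an invented tweak function T, since the real key and tweak constants live inside the SoC):

from Crypto.Cipher import AES   # pycryptodome
import hashlib

aes = AES.new(bytes(16), AES.MODE_ECB)  # toy key; the real key never leaves the SoC

def T(p):
    # Invented stand-in for the tweak derived from the physical block address p.
    return hashlib.sha256(p.to_bytes(8, "little")).digest()[:16]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def enc_xe(m, p):
    return aes.encrypt(xor(m, T(p)))    # Enc_K(m, p) = AES_K(m ⊕ T(p))

def dec_xe(c, p):
    return xor(aes.decrypt(c), T(p))    # Dec_K(c, p) = AES⁻¹_K(c) ⊕ T(p)

m, p, q = b"attack at dawn!!", 0x1000, 0x2000
moved = dec_xe(enc_xe(m, p), q)          # the hypervisor moves the block from p to q
assert moved == xor(m, xor(T(p), T(q)))  # predictable change: m ⊕ T(p) ⊕ T(q)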

Trust in cloud providers

One reason for not adopting cloud services is the required trust in the cloud provider: As they control the hypervisor, any data processed in the system is accessible to them.

Tweakable block ciphers

One popular method for storage encryption are tweakable block ciphers, such as AES-XTS [1], which is, e.g., used in Apple’s FileVault, MS Bitlocker and Android’s file-based encryption. Tweakable block ciphers provide encryption without data expansion as well as some protection against plaintext manipulation. A tweak allows to securely change the behavior of the block cipher, similar to instantiating it with a new key, but with little overhead.

XEX and Xor-Encrypt (XE) are methods to turn a block cipher such as AES into a tweakable block cipher, where a tweak-derived value is XORed with the plaintext before encryption (and XORed again after encryption in the case of XEX).

typo

“before it continuous its operation” → “before it continues its operation.”

Piret and Quisquater’s DFA on AES Revisited §

Title: “Piret and Quisquater’s DFA on AES Revisited” by Giraud, Thillard [url] [dblp]
I read it in 2020-06
Abstract: At CHES 2003, Piret and Quisquater published a very efficient DFA on AES which has served as a basis for many variants published afterwards. In this paper, we revisit P&Q’s DFA on AES and we explain how this attack can be much more efficient than originally claimed. In particular, we show that only 2 (resp. 3) faulty ciphertexts allow an attacker to efficiently recover the key in the case of AES-192 (resp. AES-256). Our attack on AES-256 is the most efficient attack on this key length published so far.

quotes

  • “we show that only 2 (resp. 3) faulty ciphertexts allow an attacker to efficiently recover the key in the case of AES-192 (resp. AES-256).”
  • “Since its publication in 1996, Fault Analysis has become the most efficient way to attack cryptosystems implemented on embedded devices such as smart cards. In October 2000, Rijndael was selected as AES and since then many researchers have studied this algorithm in order to find very efficient differential fault attacks. Amongst the dozen DFA on AES published so far, Piret and Quisquater’s attack published at CHES 2003 [5] is now a reference which has been used as a basis for several variants published afterwards.”
  • “Therefore the last round key can be recovered by using 8 faulty ciphertexts with faults induced at chosen locations.”
  • “From our experiments, an exhaustive search on AES-128 amongst 2³⁴ key candidates takes about 8 minutes on average on a 4-core 3.2 GHz Xeon by using a non-optimised C code. Therefore such an attack is practical.”
  • “The triviality of the extension of Piret and Quisquater’s attack comes from the fact that, since MixColumns is linear, one can rewrite the last two rounds of the AES as depicted in Fig. 3.” → Figure 3 moves MixColumns into the last round and uses the key MC⁻¹(K_{r-1}) for the second-to-last AddRoundKey operation (see the identity after this list)
  • “To conclude, the original P&Q’s DFA on AES can uniquely identify the AES key in the 192 and 256-bit cases by using 4 faulty ciphertexts.”
  • “From our experiments, an exhaustive search on AES-192 amongst 2⁴² key candidates takes about 1.5 days on average on a 4-core 3.2 GHz Xeon by using a non-optimised C code. Therefore such an attack can be classified as practical.”
  • “To conclude, one can reduce the number of candidates for the AES-192 key to 2¹⁰ by using 3 faulty ciphertexts.”
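
The identity behind this rewriting (my own rendering; it uses only the linearity of MixColumns over GF(2)):

\mathrm{MC}(x) \oplus K_{r-1} = \mathrm{MC}\left( x \oplus \mathrm{MC}^{-1}(K_{r-1}) \right)

so AddRoundKey with K_{r-1} can be pulled through MixColumns by replacing the round key with MC⁻¹(K_{r-1}).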

summary

Good paper. The ideas are immediate. Not all attacks presented come with runtimes, but the (more important) number of faulty ciphertexts is analyzed properly. The original P&Q attack is neat in general.

The original attack from 2003 uniquely identifies the key with 2 faulty ciphertext pairs with probability 98%.

typo

Figure 2 swaps the last SubBytes and ShiftRows operations

Exploring the Vulnerability of R-LWE Encryption to Fau… §

Title: “Exploring the Vulnerability of R-LWE Encryption to Fault Attacks” by Valencia, Oder, Güneysu, Regazzoni [url] [dblp]
Published at CS2@HiPEAC 2018 and I read it in 2020-06
Abstract: The future advent of quantum computer pushes for the design and implementation of public-key cryptosystems capable of resisting quantum attacks. Lattice-based cryptography, especially when implemented over ideal lattices, is one of the most promising candidates to replace current public-key systems. Area and performance of different designs have been explored in a wide number of platforms, including embedded ones. However, the resistance of these constructions against physical attacks, although fundamental to guarantee security, still remains largely unexplored.

ambiguity

Description of “Randomization of the key” (page 5):

“Decryption can also be attacked by altering the key injecting a random fault in one coefficient. The drawback of this attack is that it is necessary to know the difference δ between the original key value and the new one. If this is known, the attacker starts to send faulty ciphertexts to the decryption oracle. For each δ we note for each bit in the decrypted message whether it was decrypted correctly or not. We repeat this procedure until we know the decryption error status for each δ and each bit of the message. Then, for each message bit, we count the number of the times the error status has changed between δi and δi+1 , for each δ ∈ [0, q-1]. The number of error status changes divided by two is the absolute value of the secret key at this position.”

  • “altering the key” → secret key is faulted/modified
    “send faulty ciphertexts” → so the ciphertext (c1, c2) is also faulted?
  • Does this attack work coefficient-wise? Since c1·sk is a polynomial multiplication (unless in NTT domain?!), it affects multiple coefficients. On the other hand, the text talks about individual bits.
  • δ is defined as “δ := sk - sk'” where sk' is the faulted secret key. “changed between δi and δi+1” implies that δi must be ordered. How?

Best effort interpretation:

  • I fault the secret key during decryption. There is no faulty ciphertext.
  • I assume it is a coefficient-wise attack.
  • I assume we increase δ from 1 to q−1, i.e. δi = i for i = 1, …, q−1. Decode(…) thus gives only one bit of interest.
  • Let C := #{Decode(c1·(sk+δi)+c2) ≠ Decode(c1·(sk+δi+1)+c2) | i ∈ [0, q-1]} be the “number of the times the error status has changed”.
    The main statement is abs(sk) = C/2. The larger δ, the more reductions in ℤq we get; note that the factor c1 is multiplied in. Thus the more “error status changes” occur ⇒ the larger C. If δ gets closer to q/2, then the error status changes between δi and δi+1 for almost all i, because an offset of q/2 (or close to it) switches the decoded bit from 0 to 1 (or vice versa).

Robert pointed out another attack that is somewhat similar:

  • If (sk+δi) == 0, then c1·(sk+δi) == 0. Thus, if δi = -sk, then Decode(c2) == Decode(pke1 + e3 + m) will be computed. Thus, we can try various ciphertexts, and almost only for δi == -sk will the ciphertext always be decoded correctly. Hence, we can read off the secret key from that δi.

inconsistency

  • change of notation: sometimes m with a tilde above, sometimes enc(m)
  • “cipher text” versus “ciphertext”

quotes

  • “Our qualitative analysis represents one of the first steps towards the deployment of reliable post-quantum cryptography resistant against fault attacks.”
  • “Lattice-based primitives are based on the hard and quantum resistant problem of finding the solutions of a system of linear equations when an error is introduced.”
  • “To the best of our knowledge, this paper is one of the first attempt of systematically discussing the robustness of a post quantum scheme against fault attacks.”
  • “This transformation is very interesting for lattice-based cryptography because polynomial multiplication in the frequency domain has linear complexity instead of quadratic complexity.”
  • “To allow efficient computation of the NTT, the coefficient ring has to contain primitive roots of unity”
  • “To do this, we consider three types of adversaries:
    1. an adversary that can set to zero a single or multiple bits of a vector,
    2. and adversary that can skip a single instruction within a loop at the desired iteration, and
    3. an adversary that can change any data bit to a random value.”
  • “In our case we obtained the possible set of values of A using 2 ciphertexts and we continue with the next iteration, thus we use 2*N number of decryptions to recover the key using this attack.”
  • “Our simulations using Matlab demonstrated that running 1000 attacks having as parameter n = 192 and q = 4093 the average number of cipher texts needed to completely recover the secret key is 7.8.”
  • “Our results showed that, on average, the secret key can be recovered using 268.8 ciphertext, when the parameters have been set to n = 192 and q = 4093.”
  • “When running 1000 simulated attacks using Matlab with n = 192 and q = 4093, we needed on average 2 ciphertexts for recovering a key component.”

summary

I like the overview-style nature of the paper. It is properly structured. The overview section is very short, dense, and fine. The fault attacks mentioned are not sophisticated, but the overview makes it a nice paper. However, the last attack “Randomization of the key” is incomprehensible. Furthermore it is only published on ACM and all references are broken there. In conclusion: Nice read if you are interested in ring-LWE and fault attacks.

typo

  • “one of the first attempt” → “one of the first attempts”
  • “Learning with Errors problem problem” → “Learning with Errors problem”
  • “In our case we obtained the possible set of values of A using 2 ciphertext and we continue with the next iteration, thus we use 2*N number of decryption to recover the key” → “In our case we obtained the possible set of values of A using 2 ciphertexts and we continue with the next iteration, thus we use 2*N number of decryptions to recover the key”
  • “and the attack succesfully recovers the keys.” → “and the attack successfully recovers the keys.”
  • “the secret key can be recovered using 268.8 ciphertext,” → “the secret key can be recovered using 268.8 ciphertexts,”
  • “Since the result include the threshold,” → “Since the result includes the threshold,”

Templates vs. Stochastic Methods: A Performance Analys… §

Title: “Templates vs. Stochastic Methods: A Performance Analysis for Side Channel Cryptanalysis” by Gierlichs, Lemke-Rust, Paar [url] [dblp]
Published at CHES 2006 and I read it in 2020-06
Abstract: Template Attacks and the Stochastic Model provide advanced methods for side channel cryptanalysis that make use of ‘a-priori’ knowledge gained from a profiling step. For a systematic comparison of Template Attacks and the Stochastic Model, we use two sets of measurement data that originate from two different microcontrollers and setups. Our main contribution is to capture performance aspects against crucial parameters such as the number of measurements available during profiling and classification. Moreover, optimization techniques are evaluated for both methods under consideration. Especially for a low number of measurements and noisy samples, the use of a T-Test based algorithm for the choice of relevant instants can lead to significant performance gains. As a main result, T-Test based Templates are the method of choice if a high number of samples is available for profiling. However, in case of a low number of samples for profiling, stochastic methods are an alternative and can reach superior efficiency both in terms of profiling and classification.

ambiguity

  • page 2: “a multivariate characterization of the noise” → noise is defined as “noise + performed operation”
  • page 3: “for each time instant” → undefined term “time instant”
  • page 3: “it is the average mi² of all available samples” → do we square the average? is it already squared?
  • page 3: “(P1, …, Pp)” → what is P?
  • page 6: “(II) the number of curves for profiling” → which curves? difference curves? they were only defined for the template method not for the stochastic method

quotes

  • “Especially for a low number of measurements and noisy samples, the use of a T-Test based algorithm for the choice of relevant instants can lead to significant performance gains.”
  • “However, in case of a low number of samples for profiling, stochastic methods are an alternative and can reach superior efficiency both in terms of profiling and classification.”
  • “The underlying working hypothesis for side channel cryptanalysis assumes that computations of a cryptographic device have an impact on instantaneous physical observables in the (immediate) vicinity of the device, e.g., power consumption or electromagnetic radiation”
  • approaches depending on number of stages:
    • one-stage approach:
      1. directly extract key
    • two-stage approach:
      1. “profiling step”
      2. “attack step”
  • “Templates were introduced as the strongest side channel attack possible from an information theoretic point of view”
  • “This is due to the fact that positive and negative differences between the averages may zeroize, which is desirable to filter noise but hides as well valuable peaks that derive from significant signal differences with alternating algebraic sign.”
  • “Templates estimate the data-dependent part ht itself, whereas the Stochastic model approximates the linear part of ht in the chosen vector subspace (e.g., F9) and is not capable of including non-linear parts.”
  • “The number of measurements, both during profiling and key extraction, is regarded as the relevant and measurable parameter.”
  • “We focus on the number of available samples (side channel quality) since computational complexity is of minor importance for the attacks under consideration.”
  • “Hence, a general statement on which attack yields better success rates is not feasible as this depends on the number of curves that are available in the profiling step. If a large number of samples is available (e.g., more than twenty thousand), the Template Attack yields higher success rates. If only a small number of samples is available (e.g., less than twenty thousand), stochastic methods are the better choice.” (w.r.t. Metric 3)
  • “The Stochastic Model’s strength is the ability to “learn” quickly from a small number of samples. One weakness lies in the reduced precision due to the linear approximation in a vector subspace.”
  • “The Template Attack’s weakness is its poor ability to reduce the noise in the side channel samples if the adversary is bounded in the number of samples in the profiling step.”
  • “The T-Test Template Attack is the best possible choice in almost all parameter ranges.”
  • “For example, using N = 200 profiling measurements and N3 = 10 curves for classification it still achieves a success rate of 81.7%.”

summary

Important empirical results. Parameters and assumptions are okay. Results are fundamental and significant. However, the description of the two methods (templates, stochastic) is quite bad. Looking up the references is required.

Notes:

  • sosd: \sum_{i,j=1}^K (m_i - m_j)^2 for i ≥ j
  • sost: \sum_{i,j=1}^K \left(\frac{m_i - m_j}{\sqrt{\frac{\sigma_i^2}{n_i} + \frac{\sigma_j^2}{n_j}}}\right)^2 for i ≥ j
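
A direct transcription of these two point-of-interest criteria in Python (function and variable names are my own; K classes with per-class means m_i, variances σ_i², and sample counts n_i, evaluated at one time instant):

import math

def sosd(means):
    # sum of squared differences of class means, over pairs with i > j
    # (the i = j terms of the formula are zero anyway)
    K = len(means)
    return sum((means[i] - means[j]) ** 2
               for i in range(K) for j in range(i))

def sost(means, variances, counts):
    # like sosd, but each difference is normalized by its T-test denominator
    K = len(means)
    return sum(((means[i] - means[j])
                / math.sqrt(variances[i] / counts[i] + variances[j] / counts[j])) ** 2
               for i in range(K) for j in range(i))

# toy usage: K = 3 classes at one time instant
print(sosd([1.0, 1.5, 3.0]))
print(sost([1.0, 1.5, 3.0], [0.2, 0.3, 0.25], [100, 120, 90]))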

FastSpec: Scalable Generation and Detection of Spectre… §

Title: “FastSpec: Scalable Generation and Detection of Spectre Gadgets Using Neural Embeddings” by Tol, Yurtseven, Gulmezoglu, Sunar [url] [dblp]
Published in 2020 and I read it in 2020-07
Abstract: Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules which fall short in covering subtle variations of known Spectre gadgets as well as demand a huge amount of time to analyze each conditional branch in software. Since it requires arduous effort to craft new gadgets manually, the evaluations of detection mechanisms are based only on a handful of these gadgets.

ambiguity

  • DNN (page 3) is undefined
  • Equation 1: Notation X~Pdata is unknown to me
  • “which alters the flags”. What is a flag? (Step 4, page 5)

quotes

  • “Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules which fall short in covering subtle variations of known Spectre gadgets as well as demand a huge amount of time to analyze each conditional branch in software.”
  • “In this work, we employ deep learning techniques for automated generation and detection of Spectre gadgets.”
  • “Using mutational fuzzing, we produce a data set with more than 1 million Spectre-V1 gadgets which is the largest Spectre gadget data set built to date.”
  • “we conduct the first empirical usability study of Generative Adversarial Networks (GANs) for creating assembly code without any human interaction.”
  • “While the initial variants of Spectre [26] exploit conditional and indirect branches, Koruyeh et al. [27] proposes another Spectre variant obtained by poisoning the entries in Return-Stack-Buffers (RSBs).”
  • “The proposed detection tools mostly implement taint analysis [66] and symbolic execution [15,65] to identify potential gadgets in benign applications. However, the methods proposed so far are have two shortcomings: (1) the scarcity of Spectre gadgets prevents the  comprehensive evaluation of the tools, (2) the scanning time increases drastically with increasing binary file sizes.”
  • “BERT [9] was proposed by the Google AI team to learn the relations between different words in a sentence by applying a self-attention mechanism [63].”
  • “In summary,
    • We extend 15 base Spectre examples to 1 million gadgets via mutational fuzzing,
    • We propose SpectreGAN which leverages conditional GANs to create new Spectre gadgets by learning the distribution of existing Spectre gadgets in a scalable way,
    • We show that both mutational fuzzing and SpectreGAN create diverse and novel gadgets which are not detected by oo7 and Spectector tools
    • We introduce FastSpec which is based on supervised neural word embeddings to identify the potential gadgets in benign applications orders of magnitude faster than rule-based methods.”
  • “Hence, Spectre-type attacks are not completely resolved yet and finding an efficient countermeasure is still an open problem.”
  • “There are a number of known Spectre variants: Spectre-V1 (bounds check bypass), Spectre-V2 (branch target injection), Spectre-RSB [27, 34] (return stack buffer speculation), and Spectre-V4 [19] (speculative store bypass).”
  • “The heavy part of the training is handled by processing unlabeled data in an unsupervised manner. The unsupervised phase is called pre-training which consists of masked language model training and next sentence prediction procedures.”
  • “Defenses against Spectre. There are various detection methods for speculative execution attacks. Taint analysis is used in oo7 [66] software tool to detect leakages. As an alternative way, the taint analysis is implemented in the hardware context to stop the speculative execution for secret dependent data [53, 71]. The second method relies on symbolic execution analysis. Spectector [15] symbolically executes the programs where the conditional branches are treated as mispredicted. Furthermore, SpecuSym [18] and KleeSpectre [65] aim to model cache usage with symbolic execution to detect speculative interference which is based on Klee symbolic execution engine. Following a different approach, Speculator [35] collects performance counter values to detect mispredicted branches and speculative execution domain. Finally, Specfuzz [44] uses a fuzzing strategy to analyze the control flow paths which are most likely vulnerable against speculative execution attacks.”
  • “Our mutation operator is the insertion of random Assembly instructions with random operands.” (a toy sketch of this operator follows this quote list)
  • “The overall success rate of fuzzing technique is around 5% out of compiled gadgets.”
  • “the input assembly functions are converted to a sequence of tokens T' = {x'1, … x'N} where each token represents an instruction, register, parenthesis, comma, intermediate value or label. SpectreGAN is conditionally trained with each sequence of tokens where a masking vector m = (m1, …, mN) with elements mt ∈ {0, 1} is generated.”
  • “The training procedure consists of two main phases namely, pre-training and adversarial training.”
  • “We keep commas, parenthesis, immediate values, labels, instruction and register names as separate tokens.”
  • “The tokenization process converts the instruction "movq (%rax), %rdx" into the list ["movq", "(", "%rax", ")", ",", "%rdx"] where each element of the list is a token x't. Hence, each token list T' = {x'1, …, x'N} represents an assembly function in the data set.” (a tokenizer sketch also follows this quote list)
  • “SpectreGAN is trained with a batch size of 100 on NVIDIA GeForce GTX 1080 Ti until the validation perplexity converges in Figure 2. The pre-training lasts about 50 hours while the adversarial training phase takes around 30 hours.” (80 hours = 3d 8h)
  • “SpectreGAN is trained with a masking rate of 0.3, the success rate of gadgets increases up to 72%. Interestingly, the success rate drops for other masking rates, which also demonstrates the importance of masking rate choice.”
  • “To illustrate an example of the generated samples, we fed the gadget in Listing 2 to SpectreGAN and generated a new gadget in Listing 3.”
  • “Mutational fuzzing and SpectreGAN generated approximately 1.2 million gadgets in total.”
  • “The quality of generated texts is mostly evaluated by analyzing the number of unique n-grams.”
  • “However, it is challenging to examine the effects of instructions in the transient domain since they are not visible in the architectural state. After we carefully analyzed the performance counters for the Haswell architecture, we determined that two counters namely, UOPS_ISSUED:ANY and UOPS_RETIRED:ANY give an idea to what extent the speculative window is altered. UOPS_ISSUED:ANY counter is incremented every time a μop is issued which counts both speculative and non-speculative μops. On the other hand, UOPS_RETIRED:ANY counter only counts the executed and committed μops which automatically excludes speculatively executed μops.”
  • “we have selected 100,000 samples from each gadget example uniformly random due to the immense time consumption of oo7 (150 hours for 100K gadgets) which achieves 94% detection rate.”
  • “Interestingly, specific gadget types from both fuzzing and SpectreGAN are not caught by oo7. When a gadget contains cmov or xchg or set instruction and its variants, it is not identified as a Spectre gadget.”
  • “Listing 5: XCHG gadget: When a past value controlled by the attacker is used in the Spectre gadget, oo7 cannot detect the XCHG gadget”
  • “For each Assembly file, Spectector is adjusted to track 25 symbolic paths of at most 5000 instructions each, with a global timeout of 30 minutes. The remaining parameters are kept as default.”
  • “23.75% of the gadgets are not detected by Spectector. We observed that 96% of the undetected gadgets contain unsupported instruction/register which is the indicator of an implementation issue in Spectector.”
  • “After we examined the undetected gadgets, we observed that if the gadgets include either sfence/mfence/lfence or 8-bit registers (%al, %bl, %cl, %dl), they are likely to bypass Spectector.”
  • “Differently, the mask positions are selected from 15% of the training sequence and the selected positions are masked and replaced with <MASK> token with 0.80 probability, replaced with a random token with 0.10 probability or kept as the same token with 0.10 probability.”
  • “Since it is not possible to visualize the high dimensional embedding vectors, we leverage the t-SNE algorithm [33] which maps the embedding vectors to a three-dimensional space as shown in Figure 4.”
  • “The output probabilities of the softmax layer are the predictions on the assembly code sequence.”
  • “We combine the assembly data set that was generated in Section 4 and the disassembled Linux libraries to train FastSpec. Although it is possible that Linux libraries contain Spectre-V1 gadgets, we assume that the total number of hidden Spectre gadgets are negligible comparing the total size of the data set.”
  • “In total, a dataset of 107 million lines of assembly code is collected which consists of 370 million tokens after the pre-processing.”
  • “The pre-training phase takes approximately 6 hours with a sequence length of 50. We further train the positional embeddings for 1 hour with a sequence length of 250. The fine-tuning takes only 20 minutes on the pre-trained model to achieve classifying all types of samples in the test data set correctly.”
  • “In the evaluation of FastSpec, we obtained 1.3 million true positives and 110 false positives (99.9% precision rate) in the test data set which demonstrates the high performance of FastSpec. We assume that the false positives are Spectre-like gadgets in Linux libraries, which needs to be explored deeply in the future work. Moreover, we only have 55 false negatives (99.9% recall rate) which yields 0.99 F-1 score on the test data set.”
  • “The processing time of FastSpec is independent of the number of branches whereas for Spectector and oo7 the analysis time increases drastically.”
  • “Consequently, FastSpec is faster than oo7 and Spectector 455 times and 75 times on average, respectively.”
  • “The total number of tokens is 203,055 while the analysis time is around 17 minutes.”
  • “This work for the first time proposed NLP inspired approaches for Spectre gadget generation and detection.”
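
As referenced in the quotes above, two tiny Python sketches. First, the mutation operator (the instruction and register pools are my own toy choices, not the paper's):

import random

INSTRS = ["movq", "addq", "xorq", "subq"]
REGS = ["%rax", "%rbx", "%rcx", "%rdx"]

def mutate(asm_lines):
    # insert one random instruction with random operands at a random position
    new = "    {} {}, {}".format(random.choice(INSTRS),
                                 random.choice(REGS), random.choice(REGS))
    pos = random.randrange(len(asm_lines) + 1)
    return asm_lines[:pos] + [new] + asm_lines[pos:]

Second, the tokenization rule (the regex is my own approximation of the quoted behavior, not the paper's implementation):

import re

TOKEN = re.compile(r"%\w+|\$?-?\w+|[(),:]")

def tokenize(line):
    # split one line of AT&T assembly into instruction/register/punctuation tokens
    return TOKEN.findall(line)

print(tokenize("movq (%rax), %rdx"))
# ['movq', '(', '%rax', ')', ',', '%rdx']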

summary

Very advanced paper. Perfect combination of Machine Learning technology with microarchitectural attack work. Shows a huge effort and nice considerations regarding Generative Adversarial Networks. However, I could not understand the technical aspects of the machine learning part.

  • Several techniques have been proposed to detect vulnerable Spectre gadgets in widely deployed commercial software. Unfortunately, detection techniques proposed so far rely on hand-written rules. Current shortcomings are: (1) the scarcity of Spectre gadgets prevents the comprehensive evaluation of the tools, (2) the scanning time increases drastically with increasing binary file sizes.
  • Approach:
    1. Mutational fuzzing is used to expand Kocher's 15 + Spectector's 2 Spectre gadgets to more than 1 million
    2. A Generative Adversarial Network (SpectreGAN; based on MaskGAN) is used to generate Assembly
    3. FastSpec (based on BERT by Google) takes Assembly and determines whether some binary contains a Spectre gadget
  • They achieve 455× the performance of oo7 and 75× the performance of Spectector
  • On the test data set (generated gadgets plus disassembled Linux libraries; 107 million lines of ASM code in total), FastSpec yielded 1.3 million true positives and 110 false positives
  • 379 matches were found in the OpenSSL 1.1.1g library
  • On GitHub, there is a 390 MB tar.gz archive (split up). Decompressed, it has a size of 6.7 GB. 972 MB seem to be 239 025 Spectre test gadget files in ASM format.

The masking rate (page 6) seems to be the percentage of hidden tokens during the training phase. Figure 4 is a little bit awkward and a little bit random. FastSpec/scan.sh seems to show how FastSpec was called to evaluate OpenSSL. And commands.txt tries to explain it somehow.

typo

  • “The critical time period before the flush happens is commonly referred to the transient domain.” → “The critical time period before the flush happens is commonly referred to as the transient domain.”
  • “microarchtiectures” → “microarchitectures”
  • “An resule of a sample gadget” → “A result of a sample gadget”

Everything Old is New Again: Binary Security of WebAss… §

Title: “Everything Old is New Again: Binary Security of WebAssembly” by Lehmann, Kinder, Pradel [url] [dblp]
Published in at USENIX Security 2020 and I read it in 2020-07
Abstract: WebAssembly is an increasingly popular compilation target designed to run code in browsers and on other platforms safely and securely, by strictly separating code and data, enforcing types, and limiting indirect control flow. Still, vulnerabilities in memory-unsafe source languages can translate to vulnerabilities in WebAssembly binaries. In this paper, we analyze to what extent vulnerabilities are exploitable in WebAssembly binaries, and how this compares to native code. We find that many classic vulnerabilities which, due to common mitigations, are no longer exploitable in native binaries, are completely exposed in WebAssembly. Moreover, WebAssembly enables unique attacks, such as overwriting supposedly constant data or manipulating the heap using a stack overflow. We present a set of attack primitives that enable an attacker (i) to write arbitrary memory, (ii) to overwrite sensitive data, and (iii) to trigger unexpected behavior by diverting control flow or manipulating the host environment. We provide a set of vulnerable proof-of-concept applications along with complete end-to-end exploits, which cover three WebAssembly platforms. An empirical risk assessment on real-world binaries and SPEC CPU programs compiled to WebAssembly shows that our attack primitives are likely to be feasible in practice. Overall, our findings show a perhaps surprising lack of binary security in WebAssembly. We discuss potential protection mechanisms to mitigate the resulting risks.

quotes

  • “We find that many classic vulnerabilities which, due to common mitigations, are no longer exploitable in native binaries, are completely exposed in WebAssembly. Moreover, WebAssembly enables unique attacks, such as overwriting supposedly constant data or manipulating the heap using a stack overflow.”
  • “WebAssembly is an increasingly popular bytecode language that offers a compact and portable representation, fast execution, and a low-level memory model. Announced in 2015 and implemented by all major browsers in 2017, WebAssembly is supported by 92% of all global browser installations as of June 2020. The language is designed as a compilation target, and several widely used compilers exist, e.g., Emscripten for C and C++, or the Rust compiler, both based on LLVM”
  • “There are two main aspects to the security of the WebAssembly ecosystem: (i) host security, the effectiveness of the runtime environment in protecting the host system against malicious WebAssembly code; and (ii) binary security, the effectiveness of the built-in fault isolation mechanisms in preventing exploitation of otherwise benign WebAssembly code”
  • “Comparing the exploitability of WebAssembly binaries with native binaries, e.g., on x86, shows that WebAssembly re-enables several formerly defeated attacks because it lacks modern mitigations. One example are stack-based buffer overflows, which are effective again because WebAssembly binaries do not deploy stack canaries.”
  • “The original WebAssembly paper addresses this question briefly by saying that “at worst, a buggy or exploited WebAssembly program can make a mess of the data in its own memory”
  • “Regarding data-based attacks, we find that one third of all functions make use of the unmanaged (and unprotected) stack in linear memory. Regarding control-flow attacks, we find that every second function can be reached from indirect calls that take their target directly from linear memory. We also compare WebAssembly’s type-checking of indirect calls with native control-flow integrity defenses.”
  • “There are four primitive types: 32 and 64 bit integers (i32 , i64) and single and double precision floats (f32 , f64). More complex types, such as arrays, records, or designated pointers do not exist.”
  • “Branches can only jump to the end of surrounding blocks, and only inside the current function. Multi-way branches can only target blocks that are statically designated in a branch table. Unrestricted gotos or jumps to arbitrary addresses are not possible. In particular, one cannot execute data in memory as bytecode instructions.”
  • “The call_indirect instruction on the left pops a value from the stack, which it uses to index into the so called table section. Table entries map this index to a function, which is subsequently called. Thus, a function can only be indirectly called if it is present in the table.”
  • “In contrast to other byte-code languages, WebAssembly does not provide managed memory or garbage collection. Instead, the so called linear memory is simply a single, global array of bytes.”
  • “One of the most basic protection mechanisms in native programs is virtual memory with unmapped pages. A read or write to an unmapped page triggers a page fault and terminates the program, hence an attacker must avoid writing to such addresses. WebAssembly’s linear memory, on the other hand, is a single, contiguous memory space without any holes, so every pointer ∈ [0, max_mem] is valid. […] This is a fundamental limitation of linear memory with severe consequences. Since one cannot install guard pages between static data, the unmanaged stack, and the heap, overflows in one section can silently corrupt data in adjacent sections.” (illustrated by a toy memory model after this quote list)
  • “In WebAssembly, linear memory is non-executable by design, as it cannot be jumped to.”
  • “an overflow while writing into a local variable on the unmanaged stack, e.g., buffer, may overwrite other local variables in the same and even in other stack frames upwards in the stack”
  • “Because in WebAssembly no default allocator is provided by the host environment, compilers include a memory allocator as part of the compiled program”
  • “While standard allocators, such as dlmalloc, have been hardened against a variety of metadata corruption attacks, simplified and lightweight allocators are often vulnerable to classic attacks. We find both emmalloc and wee_alloc to be vulnerable to metadata corruption attacks, which we illustrate for a version of emmalloc in the following.”
  • “As WebAssembly has no way of making data immutable in linear memory, an arbitrary write primitive can change the value of any non-scalar constant in the program, including, e.g., all string literals.”
  • “Version 1.6.35 of libpng suffers from a known buffer overflow vulnerability (CVE-2018-14550 [3]), which can be exploited when converting a PNM file to a PNG file. When the library is compiled to native code with modern compilers on standard settings, stack canaries prevent this vulnerability from being exploited. In WebAssembly, the vulnerability can be exploited unhindered by any mitigations.”
  • “While exec and the log_* functions have different C++ types, all three functions have identical types on the WebAssembly level (Figure 8b). The reason is that both integers and pointers are represented as i32 types in WebAssembly, i.e., the redirected call passes WebAssembly’s type check.”
  • “To the best of our knowledge, it is the first security analysis tool for WebAssembly binaries. The analysis is written in Rust”
  • “For example, with ten nested calls (assuming a uniform distribution of functions), there would be some data on the unmanaged stack with 1 − (1 − 0.33)^10 ≈ 98.2% probability.”
  • “Averaged over all 26 programs, 9.8% of all call instructions are indirect,”
  • “[…] how many functions are type-compatible with at least one call_indirect instruction and present in the table section. The percentage of indirectly callable functions ranges from 5% to 77.3%, with on average 49.2% of all functions in the program corpus.”
  • “WebAssembly’s type checking of indirect calls can be seen as a form of control-flow integrity (CFI) for forward edges. Backward edges, i.e., returns, are protected due to being managed by the VM and offer security conceptually similar to shadow stack solutions.”
  • multiple memories proposal: Andreas Rossberg. “Multiple per-module memories for Wasm”. https://github.com/WebAssembly/multi-memory, 2019.
  • reference types proposal: Andreas Rossberg. “Proposal for adding basic reference types”. https://github.com/WebAssembly/reference-types, 2019.
  • “Examples that would benefit Web-Assembly compilers are FORTIFY_SOURCE-like code rewriting, stack canaries, CFI defenses, and safe unlinking in memory allocators. In particular for stack canaries and rewriting commonly exploited C string functions, we believe there are no principled hindrances to deployment.”
  • “Developers of WebAssembly applications can reduce the risk by using as little code in “unsafe” languages, such as C, as possible.”
  • “The language has a formally defined type system shown to be sound”
  • via [37] = “Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code”: “We find that the mean slowdown of WebAssembly vs. native across SPEC benchmarks is 1.55× for Chrome and 1.45× for Firefox, with peak slowdowns of 2.5× in Chrome and 2.08× in Firefox”
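
As referenced above, a tiny Python model of the “no guard pages in linear memory” consequence (offsets, region sizes, and layout order are my own toy choices; real layouts depend on the compiler):

# Linear memory as one contiguous byte array: static data, unmanaged stack,
# and heap sit back to back, so every offset is a valid write target.
memory = bytearray(64 * 1024)          # one 64 KiB WebAssembly page
STATIC, STACK, HEAP = 0, 1024, 2048    # adjacent regions, no holes, no guards

memory[HEAP:HEAP + 6] = b"secret"      # some "heap" object

def vulnerable_write(dst, data):
    # no bounds check, like an unchecked C memcpy compiled to Wasm
    memory[dst:dst + len(data)] = data

# a 2 KiB write into a 1 KiB "stack" region runs straight into the heap
vulnerable_write(STACK, b"A" * 2048)
print(memory[HEAP:HEAP + 6])           # b'AAAAAA' -- silently corrupted, no trap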

summary

Wonderful paper. Firstly, a security analysis was necessary, and I think they covered an appropriate amount of attack vectors. Secondly, it is, of course, sad to see that many old vulnerabilities still work on a new platform.

I am unconditionally in favor of true read-only memory in WebAssembly. As David points out, this can also be enforced by a compiler through appropriate runtime checks. However, David also specified a criterion: if (and only if) a memory attack could lead to an escape from the WASM sandbox, then it is an issue of the WebAssembly sandbox and should be prevented at this level.

I keep wondering about stack canaries and guard pages. Maybe it was a trade-off decision between security and performance, but I am not 100% convinced. Thirdly, the paper is well-structured and gives sufficient data to support its arguments. The attacks were okay, the introduction to WebAssembly was superb, and justifying the claims regarding indirect calls with quantitative data in section 6 was outstanding. I think everyone in IT security can easily follow it. I love it!

WebAssembly architecture:

  • not possible
    • marking memory as read-only is not possible (hence string literals in supposedly constant memory can be overwritten)
    • In WebAssembly, linear memory is non-executable by design, as it cannot be jumped to
    • WebAssembly’s linear memory is a single, contiguous memory space without any holes, so every pointer ∈ [0, max_mem] is valid.
  • measures, restrictions
    • program memory, data structures of underlying VM and stack are separated
    • binaries are easily type-checked
    • only jump to designated code locations
    • WebAssembly has two mechanisms that limit an attacker’s ability to redirect indirect calls. First, not all functions defined in or exported into a WebAssembly binary appear in the table for indirect calls, but only those that may be subject to an indirect call. Second, all calls, both direct and indirect, are type checked.
  • missing
    • In WebAssembly, there is no ASLR.
    • does not use stack canaries
    • buffer and stack overflows are thus very powerful attack primitives in WebAssembly
    • an overflow while writing into a local variable on the unmanaged stack, e.g., buffer, may overwrite other local variables in the same and even in other stack frames upwards in the stack
    • As WebAssembly has no way of making data immutable in linear memory, an arbitrary write primitive can change the value of any non-scalar constant in the program, including, e.g., all string literals.

toc

• Introduction
• Background on WebAssembly
• Security Analysis of Linear Memory
  • Managed vs. Unmanaged Data
  • Memory Layout
  • Memory Protections
• Attack Primitives
  • Obtaining a Write Primitive
    • Stack-based Buffer Overflow
    • Stack Overflow
    • Heap Metadata Corruption
  • Overwriting Data
    • Overwriting Stack Data
    • Overwriting Heap Data
    • Overwriting “Constant” Data
  • Triggering Unexpected Behavior
    • Redirecting Indirect Calls
    • Code Injection into Host Environment
    • Application-specific Data Overwrite
• End-to-End Attacks
  • Cross-Site Scripting in Browsers
  • Remote Code Execution in Node.js
  • Arbitrary File Write in Stand-alone VM
• Quantitative Evaluation
  • Experimental Setup and Analysis Process
  • Measuring Unmanaged Stack Usage
  • Measuring Indirect Calls and Targets
  • Comparing with Existing CFI Policies
• Discussion of Mitigations
  • WebAssembly Language
  • Compilers and Tooling
  • Application and Library Developers
• Related Work
• Conclusion