Browse the glossary using this index

Special | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | ALL

A


Abstract interpretation

by Yee Wei Law - Wednesday, 17 May 2023, 10:01 AM
 

High-level overview

Abstract interpretation, first formalised by Cousot and Cousot in 1977 [CC77], executes a program on an abstract machine to determine the properties of interest [vJ11, p. 1255].

The level of detail in the abstract machine specification typically determines the accuracy of the results.

At one end of the spectrum is an abstract machine that faithfully represents the concrete semantics of the language.

Abstract interpretation on such a machine 👆 would correspond to a concrete execution on the real machine.

Typical abstract interpretation analyses, however, do not require detailed semantics.

A bit of theory

Fundamentally, the correctness problem of a program is undecidable, so approximation is necessary [Cou01, Abstract].

The purpose of abstract interpretation is to formalise this idea 👆 of approximation; see Fig. 1.

Formally, abstract interpretation is founded in the theory for approximating sets and set operations [Cou01, Sec. 2].

The semantics of a program can be defined as the solution to a fixpoint (short for fixed point) equation.

Thus, an essential role of abstract interpretation is providing constructive and effective methods for fixpoint approximation and checking by abstraction.

Observing computations at different levels of abstraction (trace semantics → relational semantics → denotational semantics → weakest precondition semantics → Hoare logics) allows fixpoints to be approximated.
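For a concrete feel, here is a minimal, hypothetical sketch (not taken from the cited papers) of abstract interpretation over the sign domain {NEG, ZERO, POS, TOP}: concrete integers are abstracted to their signs, arithmetic is re-interpreted over the abstract values, and the analysis soundly concludes that z is positive without ever computing a concrete value.

    #include <stdio.h>

    /* Abstract domain: the sign of an integer; TOP means "sign unknown". */
    typedef enum { NEG, ZERO, POS, TOP } Sign;

    /* Abstraction function: map a concrete integer to its abstract value. */
    static Sign alpha(int x) { return x < 0 ? NEG : (x == 0 ? ZERO : POS); }

    /* Abstract multiplication: a sound over-approximation of concrete "*". */
    static Sign abs_mul(Sign a, Sign b) {
        if (a == ZERO || b == ZERO) return ZERO;
        if (a == TOP  || b == TOP)  return TOP;
        return (a == b) ? POS : NEG;   /* like signs give POS, unlike signs give NEG */
    }

    /* Abstract addition: mixing signs loses information (NEG + POS could be anything). */
    static Sign abs_add(Sign a, Sign b) {
        if (a == ZERO) return b;
        if (b == ZERO) return a;
        return (a == b) ? a : TOP;
    }

    static const char *sign_name(Sign s) {
        static const char *names[] = { "NEG", "ZERO", "POS", "TOP" };
        return names[s];
    }

    int main(void) {
        /* Abstractly "execute" the fragment:  x = -3;  y = x * x;  z = y + 1;  */
        Sign x = alpha(-3);             /* NEG */
        Sign y = abs_mul(x, x);         /* POS */
        Sign z = abs_add(y, alpha(1));  /* POS */
        printf("x: %s, y: %s, z: %s\n", sign_name(x), sign_name(y), sign_name(z));
        return 0;
    }

Real analysers use richer abstract domains (e.g., intervals or polyhedra) and iterate to a fixpoint over loops; the sign domain above merely illustrates the abstraction and approximation machinery.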

Fig. 1: In abstract interpretation, the program analyzer computes an approximate semantics of the program [Cou01, Fig. 12].

The generator generates equations/constraints, the solution to which is a computer representation of the program semantics.

The solver solves these equations/constraints.

The diagnoser checks the solutions with respect to the specification and outputs “yes”, “no” or “unknown”.

References

[Cou01] P. Cousot, Abstract Interpretation Based Formal Methods and Future Challenges, pp. 138–156, Springer Berlin Heidelberg, Berlin, Heidelberg, 2001. https://doi.org/10.1007/3-540-44577-3_10.
[CC77] P. Cousot and R. Cousot, Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints, in Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, POPL ’77, Association for Computing Machinery, 1977, pp. 238–252. https://doi.org/10.1145/512950.512973.
[NNH99] F. Nielson, H. R. Nielson, and C. Hankin, Principles of Program Analysis, Springer Berlin, Heidelberg, 1999. https://doi.org/10.1007/978-3-662-03811-6.
[vJ11] H. C. van Tilborg and S. Jajodia (eds.), Encyclopedia of Cryptography and Security, Springer, Boston, MA, 2011. https://doi.org/10.1007/978-1-4419-5906-5.


Address space layout randomisation (ASLR)

by Yee Wei Law - Tuesday, 16 May 2023, 11:04 AM
 

Address space layout randomisation (ASLR) refers to the randomisation of the locations of the key components of an executable (stack, heap, libraries and memory-mapped files, executable code) in the address space, so as to mitigate remote code execution and other attacks that exploit memory-corruption weaknesses such as CWE-787 (out-of-bounds write) and CWE-125 (out-of-bounds read).

Watch Dr. Yeongjin Jang’s lecture on ASLR:

ASLR targets memory corruption vulnerabilities such as buffer overflows, which typically occur when data are written to a buffer smaller than the data themselves, a common programming error in memory-unsafe languages like C.

By introducing uncertainty into the locations of the shellcode (or any other attack code), ASLR hinders exploitations of memory corruption vulnerabilities.
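As a quick illustration (assuming a Linux-like system with ASLR enabled and the program built as a position-independent executable, e.g., gcc -fPIE -pie aslr_demo.c), the hypothetical program below prints the addresses of objects in the main memory areas; running it several times shows the addresses changing between runs, which is exactly the uncertainty an attacker must overcome.

    #include <stdio.h>
    #include <stdlib.h>

    int global_var;                       /* data segment of the executable image */

    int main(void) {
        int stack_var;                    /* stack */
        int *heap_var = malloc(sizeof *heap_var);     /* heap */

        printf("code  (main)      : %p\n", (void *)main);
        printf("data  (global_var): %p\n", (void *)&global_var);
        printf("stack (stack_var) : %p\n", (void *)&stack_var);
        printf("heap  (heap_var)  : %p\n", (void *)heap_var);
        printf("libc  (printf)    : %p\n", (void *)printf);

        free(heap_var);
        return 0;
    }

If ASLR is disabled, or if the binary is not position independent, some of these addresses stay fixed across runs; this connects to the position-independent code (PIC) requirement discussed below.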

There are many ways ASLR can be done, and different operating systems do it differently.

There are three dimensions to be considered [MGRR19]:

When

This is about when the entropy used for ASLR is generated (see Table 1):

  • Per-deployment: The application is randomised when it is installed in the system.

    This limited form of randomisation, called pre-linking, enables randomisation on systems that do not support position-independent code (PIC) or relocatable code.

  • Per-boot: Randomisation is done whenever the system boots.

    Useful for systems whose shared libraries are not compiled as PIC.

  • Per-exec: Randomisation is done whenever a new executable image is loaded into memory.

    This per-process randomisation is triggered by the exec() system call but not the fork() system call; see the sketch after this list.

  • Per-fork: Randomisation is done whenever a new process is created/forked.

  • Per-object: Every object is randomised when it is created.

    Objects located at a constant distance from another already-mapped object are not considered to be randomised on a per-object basis, even if the reference object is randomised, because knowledge of one object’s location reveals the location of the other.
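The sketch below (assuming a typical Linux system, which randomises at exec() time but not at fork() time) illustrates the per-exec versus per-fork distinction: the forked child reports the same address for main as its parent, because fork() merely duplicates the parent’s already-randomised address space, whereas a freshly exec()ed image would receive a new layout.

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid < 0) { perror("fork"); return 1; }

        if (pid == 0) {
            /* Child: inherits the parent's (already randomised) layout verbatim. */
            printf("child  %ld: main at %p\n", (long)getpid(), (void *)main);
            return 0;
        }
        /* Parent: prints the same address for main as the child. */
        printf("parent %ld: main at %p\n", (long)getpid(), (void *)main);
        wait(NULL);      /* reap the child */
        return 0;
    }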

What

This is about what to randomise.

In the extreme, if even a single object (e.g., the executable) is not randomised, ASLR is considered to be broken.

An “aggressive” form of ASLR is when the randomisation is applied to the logical elements contained in the memory objects: processor instructions, blocks of code, functions, and even the data structures of the application; see Table 1.

In this case 👆, the compiler, linker and loader are required to play an active role.

This form of ASLR is not known to be in use.

How

Table 1 lists the main strategies:

  • Partial-VM vs full-VM: In the former case, a subset of the virtual memory space is divided into non-overlapping ranges or zones, and each zone contains one or more objects.

    In the latter case, randomisation applies to the whole virtual memory space.

    Full-VM randomisation does not honour the order of the main areas (i.e., exec, heap, libs, stack, ...), but it is not known to be in use.

  • Isolated-object vs correlated-object: In the former case, every object is randomly mapped with respect to any other.

    In the latter case, the position of an object is calculated as a function of the position of another object and a random value; a small demonstration follows Table 1.

  • For details on the other strategies in Table 1, please refer to [MGRR19, pp. 5-6].
Table 1: Summary of the three dimensions of ASLR [MGRR19, Table 1]. vDSO = virtual dynamic shared object; VM = virtual memory.
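As a hedged, Linux-oriented illustration of the isolated-object versus correlated-object distinction mentioned above: the distance between two function symbols that resolve within the same mapped object (here printf and malloc) stays roughly constant across runs even though their absolute addresses change, whereas the distance between such a symbol and a stack variable varies, because those objects are randomised independently.

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    int main(void) {
        int stack_var;

        /* Roughly constant across runs: both function addresses resolve within
           the same mapped object, so the two locations are correlated.         */
        long long lib_delta = (long long)((intptr_t)(void *)printf
                                          - (intptr_t)(void *)malloc);

        /* Varies across runs: that object and the stack are randomised
           independently of each other, i.e., they are isolated objects.        */
        long long stack_delta = (long long)((intptr_t)(void *)printf
                                            - (intptr_t)&stack_var);

        printf("printf - malloc    : %lld\n", lib_delta);
        printf("printf - stack_var : %lld\n", stack_delta);
        return 0;
    }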

References

[MGRR19] H. Marco-Gisbert and I. Ripoll Ripoll, Address space layout randomization next generation, Applied Sciences 9 no. 14 (2019). https://doi.org/10.3390/app9142928.


Advanced persistent threat (APT)

by Yee Wei Law - Thursday, 15 June 2023, 2:54 PM
 

Advanced persistent threat (APT, see Definition 1) has been occupying the attention of many cybersecurity firms.

Definition 1: Advanced persistent threat (APT) [NIS11, Appendix B]

An adversary that possesses sophisticated levels of expertise and significant resources which allow it to create opportunities to achieve its objectives using multiple attack vectors (e.g., cyber, physical, and deception).

These objectives typically include establishing and extending footholds within the information technology infrastructure of the targeted organisations for purposes of exfiltrating information, undermining or impeding critical aspects of a mission, program, or organization; or positioning itself to carry out these objectives in the future.

The advanced persistent threat: 1️⃣ pursues its objectives repeatedly over an extended period of time; 2️⃣ adapts to defenders’ efforts to resist it; and 3️⃣ is determined to maintain the level of interaction needed to execute its objectives.

Since APT groups are characterised by sophistication, persistence and resourcefulness, they are challenging to counter. Lists of APT groups are being actively maintained, e.g., by MITRE and Mandiant.

References

[NIS11] NIST, Managing Information Security Risk: Organization, Mission, and Information System View, NIST Special Publication 800-39, March 2011. Available at https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-39.pdf.


Application security testing

by Yee Wei Law - Sunday, 4 June 2023, 9:42 AM
 

Application security testing can be static or dynamic.

Static application security testing (SAST) is a specialisation of static (code/program) analysis.

Watch an introduction to static code analysis on LinkedIn Learning:

Static analysis from Developing Secure Software by Jungwoo Ryoo

Exploring tools for static analysis from Developing Secure Software by Jungwoo Ryoo

Definitions and history:

  • Static analysis is the process of extracting semantic information about a program at compile time [Lan92, Sec. 1].
  • Classic example: the live-variables problem, where a variable is live at a statement iff, on some execution path, the variable is used/accessed after the statement is executed and before being redefined [Lan92, Sec. 1; NNH99, p. 49]. This is also an example of data flow analysis (discussed below); see the worked fragment after this list.
  • In security, static analysis is supposed to classify the potential behaviour of a program as malicious or benign without executing it [vJ11, p. 754].
  • Static analysis algorithms historically have come from compiler research and implementations [vJ11, p. 1254], evolving from intraprocedural analysis to interprocedural analysis [Lan92, Sec. 1].
  • In the 1970s, two main frameworks were established: 1️⃣ data flow analysis, 2️⃣ abstract interpretation. Nowadays, we also have 3️⃣ model checking, and 4️⃣ type checking.
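As a small worked example of the live-variables problem (a hypothetical fragment, not from the cited references), the comments below record what a live-variables analysis infers at each point:

    /* Liveness facts a data flow analysis would compute for this fragment: */
    int f(int a, int b) {
        int x = a * 2;   /* a is dead after this point: its value is never used again */
        int y = b + 1;   /* b is dead after this point                                */
        x = y;           /* the value assigned to x above is never used before this   */
                         /* redefinition, so the first assignment is a dead store     */
        return x + y;    /* x and y are both live just before this statement          */
    }

A compiler can exploit such facts, e.g., by deleting the dead store and by not keeping a and b in registers beyond their last uses.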

Challenges:

  • Fundamentally, no tool can determine whether an arbitrary program terminates, because the halting problem is undecidable, as we learn from basic computability theory.

    The halting problem aside, finding an exact solution to typical static analysis questions is almost always undecidable [vJ11, p. 1254].

  • As codebases grow, static analysis tools take longer to parse and traverse code because they generally operate over all possible branches of execution in a program [Tho21].

    Furthermore, static analyses are inherently computationally expensive — often quadratic, sometimes even cubic — in terms of required space or time [Tho21].

    Consequently, static analysis tools are under constant pressure to be more efficient.

  • In general, there exist code obfuscation techniques that can defeat static analysis [MKK07], which makes 1️⃣ static analysis better at finding bugs in benign programs than at detecting malware, and 2️⃣ the complementary role of dynamic analysis indispensable.

  • Besides obfuscation, malware often employs polymorphism, such as the Shikata Ga Nai polymorphic encoding scheme [MRC19], to deter static analysis [vJ11, p. 754].

Tools:

  • Static analysis is especially important for C/C++ code.

    Among the open-source static analysis tools for C/C++ code, the Clang Static Analyzer and Frama-C offer robust detection rates [ACC+17]; see the example after this list.

  • For common programming languages, see OWASP’s extensive catalog of static analysis tools.
  • Regardless of the tool used, it helps to write code that facilitates checking, e.g., by adopting Holzmann’s power-of-ten rules [Hol06] for writing safety-critical code.
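As a hedged illustration of the kind of defect such tools are designed to catch (exact diagnostics vary by tool and version), the snippet below dereferences a possibly NULL pointer on the malloc() failure path; the Clang Static Analyzer typically reports this without the code ever being executed.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Returns a heap-allocated copy of s, but misbehaves if malloc() fails. */
    static char *duplicate(const char *s) {
        char *copy = malloc(strlen(s) + 1);
        strcpy(copy, s);     /* BUG: copy may be NULL here; a static analyser  */
                             /* reports a possible null-pointer dereference    */
        return copy;
    }

    int main(void) {
        char *d = duplicate("hello");
        if (d != NULL) {
            puts(d);
            free(d);
        }
        return 0;
    }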

Dynamic application security testing (DAST) is a specialisation of dynamic (code/program) analysis.

Watch an introduction to dynamic code analysis on LinkedIn Learning:

Dynamic analysis from Developing Secure Software by Jungwoo Ryoo

Dynamic analysis tools from Developing Secure Software by Jungwoo Ryoo

Definitions and history:

  • Dynamic analysis refers to the broad class of techniques that make inferences about a program by observing its runtime execution behaviour [vJ11, p. 365].

  • An example of dynamic analysis is fuzz testing, which is the execution of a program-under-test (PUT) using input(s) sampled from an input space (the “fuzz input space”) that “protrudes” the expected input space of the PUT, to test if the PUT violates a correctness policy [MHH+21, Definition 1]; a minimal harness is sketched after this list.
  • Another example of dynamic analysis is taint analysis, also called information flow tracking, which is the tracking of “tainted” data throughout a system while a program manipulating this data is executed [ESKK08].
  • Unlike static analysis, dynamic analysis is robust to malware polymorphism, including low-level obfuscations that can thwart disassembly [vJ11, p. 755].
  • Applications: software debugging, software profiling and host-based intrusion detection [vJ11, p. 366].
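To make fuzz testing concrete, below is a minimal libFuzzer-style harness; parse_record is a hypothetical, deliberately buggy function standing in for the PUT, and the build command is an assumption (a clang toolchain with libFuzzer and AddressSanitizer).

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical program-under-test: copies a length-prefixed payload.
     * Deliberately buggy: it trusts the length byte without checking `size`,
     * giving out-of-bounds accesses of `buf` and `out` (cf. CWE-125/787).   */
    static int parse_record(const uint8_t *buf, size_t size) {
        uint8_t out[64];
        if (size < 1) return -1;
        size_t payload_len = buf[0];           /* attacker-controlled length */
        memcpy(out, buf + 1, payload_len);     /* no bounds check            */
        return (int)payload_len;
    }

    /* libFuzzer calls this entry point over and over with mutated inputs,
     * watching for crashes, hangs and sanitizer-detected memory errors.
     * Build (assumed clang/libFuzzer toolchain):
     *     clang -g -fsanitize=fuzzer,address fuzz_record.c                  */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        parse_record(data, size);
        return 0;   /* non-zero return values are reserved by libFuzzer */
    }

libFuzzer supplies main(), repeatedly mutates inputs, and stops with a reproducing input file once the sanitizer detects the out-of-bounds access.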

Challenges:

  • Dynamic analysis tools are typically strong on soundness (the analysis results are consistent with the actual behaviour of the program) but weaker on completeness (the ability to infer all behaviours of interest of the program).

    In general, static analysis often suffers from a high false-positive rate, whereas dynamic analysis is limited in coverage.

  • There exist anti-emulation techniques that check for certain low-level processor features (e.g., undocumented instructions) or timing behaviour, enabling malware to determine whether its execution environment is an emulator [vJ11, p. 755].

    If it detects an emulator, the malware can terminate without performing any malicious action, thereby avoiding detection.

Tools:

  • Among the open-source dynamic analysis tools for C/C++ code, Valgrind [NS07] is probably the best known; see the example after this list.
  • For common programming languages, an extensive catalog of dynamic analysis tools can be found on GitHub.
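As a hedged example of what Valgrind’s Memcheck tool reports at runtime (exact messages vary by version), the program below reads past the end of a heap buffer and never frees it; running it under valgrind --leak-check=full ./a.out flags both an invalid read and a definite leak.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        int *a = malloc(4 * sizeof *a);     /* room for 4 ints                    */
        if (a == NULL) return 1;
        for (int i = 0; i < 4; i++)
            a[i] = i;

        printf("%d\n", a[4]);               /* invalid read: one past the end;    */
                                            /* Memcheck: "Invalid read of size 4" */

        return 0;                           /* `a` is never freed; Memcheck's     */
                                            /* leak summary: "definitely lost"    */
    }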

References

[ACC+17] A. Arusoaie, S. Ciobâca, V. Craciun, D. Gavrilut, and D. Lucanu, A Comparison of Open-Source Static Analysis Tools for Vulnerability Detection in C/C++ Code, in 2017 19th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2017, pp. 161–168. https://doi.org/10.1109/SYNASC.2017.00035.
[ESKK08] M. Egele, T. Scholte, E. Kirda, and C. Kruegel, A survey on automated dynamic malware-analysis techniques and tools, ACM Comput. Surv. 44 no. 2 (2008). https://doi.org/10.1145/2089125.2089126.
[Hol06] G. J. Holzmann, The power of 10: rules for developing safety-critical code, Computer 39 no. 6 (2006), 95–99. https://doi.org/10.1109/MC.2006.212.
[Lan92] W. Landi, Undecidability of static analysis, ACM Lett. Program. Lang. Syst. 1 no. 4 (1992), 323 – 337. https://doi.org/10.1145/161494.161501.
[MHH+21] V. J. Manès, H. Han, C. Han, S. K. Cha, M. Egele, E. J. Schwartz, and M. Woo, The art, science, and engineering of fuzzing: A survey, IEEE Transactions on Software Engineering 47 no. 11 (2021), 2312–2331. https://doi.org/10.1109/TSE.2019.2946563.
[MRC19] S. Miller, E. Reese, and N. Carr, Shikata Ga Nai encoder still going strong, Mandiant threat research, 2019. Available at https://www.mandiant.com/resources/blog/shikata-ga-nai-encoder-still-going-strong.
[MKK07] A. Moser, C. Kruegel, and E. Kirda, Limits of static analysis for malware detection, in Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007), 2007, pp. 421–430. https://doi.org/10.1109/ACSAC.2007.21.
[NS07] N. Nethercote and J. Seward, Valgrind: A framework for heavyweight dynamic binary instrumentation, SIGPLAN Not. 42 no. 6 (2007), 89 – 100. https://doi.org/10.1145/1273442.1250746.
[NNH99] F. Nielson, H. R. Nielson, and C. Hankin, Principles of Program Analysis, Springer Berlin, Heidelberg, 1999. https://doi.org/10.1007/978-3-662-03811-6.
[SKK+16] S. Schrittwieser, S. Katzenbeisser, J. Kinder, G. Merzdovnik, and E. Weippl, Protecting software through obfuscation: Can it keep pace with progress in code analysis?, ACM Comput. Surv. 49 no. 1 (2016). https://doi.org/10.1145/2886012.
[Tho21] P. Thomson, Static analysis: An introduction: The fundamental challenge of software engineering is one of complexity, Queue 19 no. 4 (2021), 29 – 41. https://doi.org/10.1145/3487019.3487021.
[vJ11] H. C. van Tilborg and S. Jajodia (eds.), Encyclopedia of Cryptography and Security, Springer, Boston, MA, 2011. https://doi.org/10.1007/978-1-4419-5906-5.
[ZWCX22] X. Zhu, S. Wen, S. Camtepe, and Y. Xiang, Fuzzing: A survey for roadmap, ACM Comput. Surv. 54 no. 11s (2022). https://doi.org/10.1145/3512345.


Asymmetric-key cryptography

by Yee Wei Law - Friday, 26 May 2023, 11:00 PM
 


Asymptotic notation

by Yee Wei Law - Sunday, 23 June 2024, 11:46 PM
 


Authenticated encryption with associated data (AEAD)

by Yee Wei Law - Friday, 9 August 2024, 8:34 PM
 

See 👇 attachment (coming soon) or the latest source on Overleaf.
