Differential Testing
   HOME

TheInfoList



OR:

Differential testing,
William M. McKeeman William is a male given name of Germanic origin.Hanks, Hardcastle and Hodges, ''Oxford Dictionary of First Names'', Oxford University Press, 2nd edition, , p. 276. It became very popular in the English language after the Norman conquest of Engl ...
, “Differential testing for software,” Digital Technical Journal, vol. 10, no. 1, pp. 100–107, 1998.
also known as differential fuzzing, is a popular
software testing Software testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to apprecia ...
technique that attempts to detect bugs, by providing the same input to a series of similar applications (or to different implementations of the same application), and observing differences in their execution. Differential testing complements traditional software testing, because it is well-suited to find
semantic Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comput ...
or logic bugs that do not exhibit explicit erroneous behaviors like crashes or assertion failures. Differential testing is sometimes called back-to-back testing. Differential testing finds semantic bugs by using different implementations of the same functionality as cross-referencing
oracles An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination. Description The word '' ...
, pinpointing differences in their outputs over the same input: any discrepancy between the program behaviors on the same input is marked as a potential bug.


Application domains

Differential testing has been used to find semantic bugs successfully in diverse domains like
SSL/TLS Transport Layer Security (TLS) is a cryptographic protocol designed to provide communications security over a computer network. The protocol is widely used in applications such as email, instant messaging, and voice over IP, but its use in securi ...
implementations,C. Brubaker, S. Jana, B. Ray, S. Khurshid, and V. Shmatikov, “Using frankencerts for automated adversarial testing of certificate validation in SSL/TLS implementations,” in Proceedings of the 2014 IEEE Symposium on Security and Privacy (S&P). IEEE Computer Society, 2014, pp. 114–129.Y. Chen and Z. Su, “Guided differential testing of certificate validation in SSL/TLS implementations,” in Proceedings of the 10th Joint Meeting on Foundations of Software Engineering (FSE). ACM, 2015, pp. 793– 804.Petsios, T., Tang, A., Stolfo, S., Keromytis, A. D., & Jana, S. (2017, May). NEZHA: Efficient Domain-Independent Differential Testing. In Proceedings of the 38th IEEE Symposium on Security & Privacy,(San Jose, CA).S. Sivakorn, G. Argyros, K. Pei, A. D. Keromytis and S. Jana, "HVLearn: Automated Black-Box Analysis of Hostname Verification in SSL/TLS Implementations," 2017 IEEE Symposium on Security and Privacy (S&P), San Jose, CA, USA, 2017, pp. 521–538. C compilers,X. Yang, Y. Chen, E. Eide, and J. Regehr, “Finding and understanding bugs in C compilers,” in Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 2011, pp. 283–294.
JVM A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describes ...
implementations,Y. Chen, T. Su, C. Sun, Z. Su, and J. Zhao, “Coverage-directed differential testing of JVM implementations,” in Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 2016, pp. 85–99. Web application firewalls,G. Argyros, I. Stais, S. Jana, A. D. Keromytis, and A. Kiayias, “SFADiff: Automated evasion attacks and fingerprinting using black-box differential automata learning,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM, 2016, pp. 1690–1701. security policies for APIs,V. Srivastava, M. D. Bond, K. S. McKinley, and V. Shmatikov, “A security policy oracle: Detecting security holes using multiple API implementations,” ACM SIGPLAN Notices, vol. 46, no. 6, pp. 343–354, 2011. and
antivirus software Antivirus software (abbreviated to AV software), also known as anti-malware, is a computer program used to prevent, detect, and remove malware. Antivirus software was originally developed to detect and remove computer viruses, hence the nam ...
.S. Jana and V. Shmatikov, “Abusing file processing in malware detectors for fun and profit,” in Proceedings of the 2012 IEEE Symposium on Security and Privacy (S&P). IEEE Computer Society, 2012, pp. 80– 94. Differential testing has also been used for automated fingerprint generation from different
network protocol A communication protocol is a system of rules that allows two or more entities of a communications system to transmit information via any kind of variation of a physical quantity. The protocol defines the rules, syntax, semantics and synchroniza ...
implementations.D. Brumley, J. Caballero, Z. Liang, J. Newsome, and D. Song, “Towards automatic discovery of deviations in binary implementations with applications to error detection and fingerprint generation,” in 16th USENIX Security Symposium (USENIX Security ’07). USENIX Association, 2007.


Input generation


Unguided

Unguided differential testing tools generate test inputs independently across iterations without considering the test program’s behavior on past inputs. Such an input generation process does not use any information from past inputs and essentially creates new inputs at random from a prohibitively large input space. This can make the testing process highly inefficient, since large numbers of inputs need to be generated to find a single bug. An example of a differential testing system that performs unguided input generation is "Frankencerts". This work synthesizes Frankencerts by randomly combining parts of real certificates. It uses syntactically valid certificates to test for semantic violations of SSL/TLS certificate validation across multiple implementations. However, since the creation and selection of Frankencerts are completely unguided, it is significantly inefficient compared to the guided tools.


Guided

Guided input generation process aims to minimize the number of inputs needed to find each bug by taking program behavior information for past inputs into account.


Domain-specific evolutionary guidance

An example of a differential testing system that performs domain-specific
coverage Coverage may refer to: Filmmaking * Coverage (lens), the size of the image a lens can produce * Camera coverage, the amount of footage shot and different camera setups used in filming a scene * Script coverage, a short summary of a script, wri ...
-guided input generation is Mucerts. Mucerts relies on the knowledge of the partial grammar of the
X.509 In cryptography, X.509 is an International Telecommunication Union (ITU) standard defining the format of public key certificates. X.509 certificates are used in many Internet protocols, including TLS/SSL, which is the basis for HTTPS, the secu ...
certificate format and uses a stochastic sampling algorithm to drive its input generation while tracking the program coverage. Another line of research builds on the observation that the problem of new input generation from existing inputs can be modeled as a stochastic process. An example of a differential testing tool that uses such a stochastic process modeling for input generation is Chen et al.’s tool. It performs differential testing of
Java virtual machine A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describes ...
s (JVM) using
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
(MCMC) sampling for input generation. It uses custom domain-specific mutations by leveraging detailed knowledge of the Java class file format.


Domain-independent evolutionary guidance

NEZHA is an example of a differential testing tool that has a path selection mechanism geared towards domain-independent differential testing. It uses specific metrics (dubbed as delta-diversity) that summarize and quantify the observed asymmetries between the behaviors of multiple test applications. Such metrics that promote the relative diversity of observed program behavior have shown to be effective in applying differential testing in a domain-independent and black-box manner.


Automata-learning-based guidance

For applications, such as
cross-site scripting Cross-site scripting (XSS) is a type of security vulnerability that can be found in some web applications. XSS attacks enable attackers to inject client-side scripts into web pages viewed by other users. A cross-site scripting vulnerability may ...
(XSS) filters and X.509 certificate hostname verification, which can be modeled accurately with
finite-state automata A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number o ...
(FSA), counter-example-driven FSA learning techniques can be used to generate inputs that are more likely to find bugs.


Symbolic-execution-based guidance

Symbolic execution In computer science, symbolic execution (also symbolic evaluation or symbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for inp ...
J. C. King, “Symbolic execution and program testing,” Communications of the ACM, vol. 19, no. 7, pp. 385–394, 1976. is a white-box technique that executes a program symbolically, computes constraints along different paths, and uses a constraint solver to generate inputs that satisfy the collected constraints along each path. Symbolic execution can also be used to generate input for differential testing.D. A. Ramos and D. R. Engler, “Practical, low-effort equivalence verification of real code,” in International Conference on Computer Aided Verification. Springer, 2011, pp. 669–685. The inherent limitation of symbolic-execution-assisted testing tools—path explosion and scalability—is magnified especially in the context of differential testing where multiple test programs are used. Therefore, it is very hard to scale symbolic execution techniques to perform differential testing of multiple large programs.


See also

*
Software testing Software testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to apprecia ...
*
Software diversity Software diversity is a research field about the comprehension and engineering of diversity in the context of software. Areas The different areas of software diversity are discussed in surveys on diversity for fault-tolerance or for security. A r ...


References

{{software testing Software testing