Reducing Failure-Inducing Inputs
Overview
This article covers the Reducing Failure-Inducing Inputs chapter from Fuzzing Book. This chapter builds on the concepts from the Fuzzing and Fuzzing With Grammars chapters. Let’s get started and learn more about increasing the efficiency of fuzzed inputs and bug finding!
Summary
Taking failure-inducing input is a reducer which is reduced to a minimum that still give the failure. This article talks about “Reducer” classes that use reducers. Reduction helps us debug the failure so the instance where it happens can be found. There are manual ways to do this but for large and complex inputs a way to automate it will help reducing be faster.
Delta Debugging
Delta debugging uses fuzzing to find what fails and then takes the large input and reduces it to the part of the string that is failing. This decreases the inputs and helps the programmer know what is failing so that then they are able to find the bug more easily. The goal is to find failures in order to then find the reason why the code failed and how to fix the issue.
The chapter explains that “A reducer takes a failure-inducing input and reduces it to the minimum that still reproduces the failure.” This chapter shows the Reducer
class which is implemented in the following way:
class Reducer:
"""Base class for reducers."""
def __init__(self, runner: Runner, log_test: bool = False) -> None:
"""Attach reducer to the given `runner`"""
self.runner = runner
self.log_test = log_test
self.reset()
def reset(self) -> None:
"""Reset the test counter to zero. To be extended in subclasses."""
self.tests = 0
def test(self, inp: str) -> Outcome:
"""Test with input `inp`. Return outcome.
To be extended in subclasses."""
= self.runner.run(inp)
result, outcome self.tests += 1
if self.log_test:
print("Test #%d" % self.tests, repr(inp), repr(len(inp)), outcome)
return outcome
def reduce(self, inp: str) -> str:
"""Reduce input `inp`. Return reduced input.
To be defined in subclasses."""
self.reset()
# Default: Don't reduce
return inp
Manual reduction input has to do with manually testing and iterating through the code in order to find the bug; where as delta debugging automates this process of testing in order to find the failures.
Delta debugging first tests the first half of the code to see if it fails. If it passes it will then test the second half of the string. If that also passes the delta debugger will iterate through different combinations of the string quarters. The size of the string that the delta debugger tests gets smaller and smaller until there are only one or two characters left. But this can be confusing so lets walk through an example:
= DeltaDebuggingReducer(mystery, log_test=True)
dd_reducer reduce(failing_input) dd_reducer.
Round 1 (1/2):
Test #1 ' 7:,>((/$$-/->.;.=;(.%!:50#7*8=$&&=$9!%6(4=&69\':\'<3+0-3.24#7=!&60)2/+";+<7+1<2!4$>92+$1<(3%&5\'\'>#' 97 FAIL
Test #2 '\'<3+0-3.24#7=!&60)2/+";+<7+1<2!4$>92+$1<(3%&5\'\'>#' 49 PASS
Test #3 " 7:,>((/$$-/->.;.=;(.%!:50#7*8=$&&=$9!%6(4=&69':" 48 PASS
Round 2 (1/4):
Test #4 '50#7*8=$&&=$9!%6(4=&69\':\'<3+0-3.24#7=!&60)2/+";+<7+1<2!4$>92+$1<(3%&5\'\'>#' 73 FAIL
Test #5 "50#7*8=$&&=$9!%6(4=&69':<7+1<2!4$>92+$1<(3%&5''>#" 49 PASS
Test #6 '50#7*8=$&&=$9!%6(4=&69\':\'<3+0-3.24#7=!&60)2/+";+' 48 FAIL
Test #7 '\'<3+0-3.24#7=!&60)2/+";+' 24 PASS
Test #8 "50#7*8=$&&=$9!%6(4=&69':" 24 PASS
Round 3 (1/8):
Test #9 '9!%6(4=&69\':\'<3+0-3.24#7=!&60)2/+";+' 36 FAIL
Test #10 '9!%6(4=&69\':=!&60)2/+";+' 24 FAIL
Test #11 '=!&60)2/+";+' 12 PASS
Test #12 "9!%6(4=&69':" 12 PASS
Test #13 '=&69\':=!&60)2/+";+' 18 PASS
Test #14 '9!%6(4=!&60)2/+";+' 18 FAIL
Test #15 '9!%6(42/+";+' 12 PASS
Test #16 '9!%6(4=!&60)' 12 FAIL
Test #17 '=!&60)' 6 PASS
Test #18 '9!%6(4' 6 PASS
Test #19 '6(4=!&60)' 9 FAIL
Test #20 '6(460)' 6 FAIL
Test #21 '60)' 3 PASS
Test #22 '6(4' 3 PASS
Test #23 '(460)' 5 FAIL
Test #24 '460)' 4 PASS
Test #25 '(0)' 3 FAIL
Test #26 '0)' 2 PASS
Test #27 '(' 1 PASS
Test #28 '()' 2 FAIL
Test #29 ')' 1 PASS
'()'
Benefits
First, one benefit of a delta debugger is that it “reduces the cognitive load of the programmer”. It makes it so the test cases can address one problem specifically rather than broad unavoidable problems. Second, it also makes the problem easier to communicate to others. And finally, reducing the failure-inducing input can make it more obvious to programmers if there are duplicates in their test cases.
But is this effective?
The best case for this is logarithmic Big O curve, while the worst case is an exponential Big O curve as pictured on the graph below. The worst chase situation happens when the program runs and the last character fails. If the last character fails the program will then return an AssertionError. The goal is for the last character to pass so that it is clear that the error resulted in the penultimate character. Here is a graph of these Big O notations:
Drawbacks
One issue is that with delta debugging is that it is prone to failure. This method is not always accurate. In addition, another issue is that it is not the most efficient method of determining the failing piece of code. Which leads us to the another method of iterating through the fuzzing produced string in order to find the piece of the string that is failing called Grammar-Based Reduction.
Grammar Based Reduction
Delta debugging may take multiple attempts at reduction, if the input language is syntactically complex.This is where Grammar-Based Reduction comes in, this approach uses grammars to simplify complex syntactic inputs. With Lexical Reduction, the input is broken down based on character, without taking into account the grammar rules. In the example below the string, This is an issue because it produces many invalid invalid which in turn lead to meaningless invalid test cases. On the other hand using syntactic rules allows you to produce valid inputs to parse so that your test cases are meaningful, since the inputs it produces still follow the intended grammatical pattern. In this example the inputs are either valid or unresolved allowing you to use the valid one in you test cases. The grammar-based reduction approach simplifies complex inputs by operating on the structure of the input’s derivation tree, rather than just reducing the string. There are two key strategies: “Replacing the subtrees” and “Alternative Expansions”. Let’s investigate each of these strategies in more detail! Replacing the sub-trees means to take a larger part of the tree and replace it with a smaller part of the tree. In the example given it shows the original tree which makes up the string Using alternative expansions we would change how the term is expanded. In the example given in the book it took the non-terminal The log above after test Delta debugging is trying to split inputs in half trying to get to a minimum input. Having a tree be replaces with subtrees it could make a tree less important. However this may take sever tries. A good way to go about this is to look at large subtrees first. To locate large subtrees, we reduce the depth of the search for replacements. Looking at direct descendants first, then lower descendants. The The The The approach is to begin with a depth of 0 and increase it as we move forward: Discussion Question: What does a depth-oriented strategy show? There are some differences between text-based delta debugging and grammar-based reduction. Building a long expression: For grammars a couple of tests are needed to find the failure-inducing input: However, with delta debugging it takes more tests and the reduction is not as good compared to grammar-based reducer. A grammar to reduce inputs better to use if an input is syntactically complex.Lexical Reduction versus Syntactic Rules
"1 + (2 * 3)"
is broken down many times into various versions of itself, but it doesn’t follow the same grammatical pattern.= "1 + (2 * 3)"
expr_input = DeltaDebuggingReducer(mystery, log_test=True)
dd_reducer reduce(expr_input) dd_reducer.
class EvalMysteryRunner(MysteryRunner):
def __init__(self) -> None:
self.parser = EarleyParser(EXPR_GRAMMAR)
def run(self, inp: str) -> Tuple[str, Outcome]:
try:
*_ = self.parser.parse(inp)
tree, except SyntaxError:
return (inp, Runner.UNRESOLVED)
return super().run(inp)
= EvalMysteryRunner()
eval_mystery
= DeltaDebuggingReducer(eval_mystery, log_test=True)
dd_reducer reduce(expr_input) dd_reducer.
A Grammar-Based Reduction Approach
Replacing the Subtrees
"1 + (2 * 3)"
and then being broken down to "(2 * 3)"
by replacing the upper most <expr>
with the lowest <expr>
.from Grammars import Grammar
from GrammarFuzzer import all_terminals, expansion_to_children, display_tree
*_ = EarleyParser(EXPR_GRAMMAR).parse(expr_input)
derivation_tree, display_tree(derivation_tree)pyt
Alternative Expansions
<term> * <factor>
which would have given the output (2 * 3)
, and broke it by altering how the term would expand, which is <term> : := <factor>
. Therefore changing the grammatical rule of how <term>
would expand.import copy
from Grammars import Grammar
from GrammarFuzzer import all_terminals, expansion_to_children, display_tree
= copy.deepcopy(derivation_tree)
new_derivation_tree # We really should have some query language
= new_derivation_tree[1][0][1][2]
sub_expr_tree display_tree(sub_expr_tree)
A Depth-Oriented Strategy
#2
is reduced to "2 * 3"
for the input tree the GrammarReducer
goes to replace the tree "2"
and "3"
being the alternate <term>
subtrees. This can work but if there are many subtrees, it will take time trying all the subtrees possible.depth
parameter in the subtrees_with_symbol()
function, which is passed through the calling functions, controls this behavior. When set, it ensures that only symbols at the specified depth are returned. An example with derivation tree:= GrammarReducer(
grammar_reducer
mystery,
EarleyParser(EXPR_GRAMMAR),=True)
log_reduce
all_terminals(derivation_tree)
display_tree(derivation_tree)
<term>
symbol is not here at depth of 1:for t in grammar_reducer.subtrees_with_symbol(
[all_terminals(t) "<term>", depth=1)] derivation_tree,
<term>
subtree is here at depth of 2 on the left hand side:for t in grammar_reducer.subtrees_with_symbol(
[all_terminals(t) "<term>", depth=2)] derivation_tree,
class GrammarReducer(GrammarReducer):
def reduce_tree(self, tree):
= 0
depth while depth < max_height(tree):
= self.reduce_subtree(tree, tree, depth)
reduced if reduced:
= 0 # Start with new tree
depth else:
+= 1 # Extend search for subtrees
depth return tree
= GrammarReducer(
grammar_reducer
mystery,
EarleyParser(EXPR_GRAMMAR),=True)
log_testreduce(expr_input) grammar_reducer.
Click to Expand for the Answer
It needs less stepsComparing Strategies
from GrammarFuzzer import GrammarFuzzer
= GrammarFuzzer(EXPR_GRAMMAR, min_nonterminals=100).fuzz()
long_expr_input long_expr_input
from Timer import Timer
= GrammarReducer(eval_mystery, EarleyParser(EXPR_GRAMMAR))
grammar_reducer with Timer() as grammar_time:
print(grammar_reducer.reduce(long_expr_input))
grammar_reducer.tests
grammar_time.elapsed_time()
= DeltaDebuggingReducer(eval_mystery)
dd_reducer with Timer() as dd_time:
print(dd_reducer.reduce(long_expr_input))
dd_reducer.tests
dd_time.elapsed_time()
Conclusion
This article talked about reducing inputs down to the smallest piece that is breaking or hindering the code. This has many really world applications. Can you think of any real world uses for delta debugging?