- Main
Automatic Detection and Repair of Input Validation and Sanitization Bugs
- Alkhalaf, Muath
- Advisor(s): Bultan, Tevfik
Abstract
A crucial problem in developing dependable web applications is the
correctness of the input validation and sanitization. Bugs in string
manipulation operations used for validation and sanitization are common,
resulting in erroneous application behavior and vulnerabilities that are
exploitable by malicious users. In this dissertation, we investigate the
problem of automatic detection and repair of validation and sanitization bugs
both at the client-side (JavaScript) and the server-side (PHP or Java) code.
We first present a formal model for input validation and sanitization
functions along with a new domain specific intermediate language
to represent them. Then, we show how to extract input validation and
sanitization functions in our intermediate language from both client and
server-side code in web applications. After the extraction phase, we use
automata-based static string-analysis techniques to automatically verify
and fix the extracted functions. One of our contributions is the development
of efficient automata-based string analysis techniques for frequently used,
complex string operations.
We developed two basic approaches to bug detection and repair: 1)
policy-based, and 2) differential. In the policy-based approach, input
validation and sanitization policies are expressed using two regular
expressions, one specifying the maximum policy (the upper bound for the
set of strings that should be allowed) and the other specifying the minimum
policy (the lower bound for the set of strings that should be allowed). Using
our string analysis techniques we can identify two types of errors in
an input validation and sanitization function: 1) it accepts a set of strings that
is not permitted by the maximum policy (i.e., it is under-constrained),
or 2) it rejects a set of strings that is permitted by the minimum policy
(i.e., it is over-constrained).
Our differential bug detection and repair approach does not require any
policy specifications. It exploits the fact that, in web applications,
developers typically perform redundant input validation and sanitization
in both the client and the server-side since client-side checks can
be by-passed. Using automata-based string analysis, we compare the
input validation and sanitization functions extracted from the client- and
server-side code, and identify and report the inconsistencies between them.
Finally, we present an automated differential repair technique that can
repair client and server-side code with respect to each other, or across
applications in order to strengthen the validation and sanitization
checks. Given a reference and a target function, our differential repair
technique strengthens the validation and sanitization operations in the
target function based on the reference function by automatically generating
a set of patches.
We experimented with a number of real world web applications and found many
bugs and vulnerabilities. Our analysis generates counter-example behaviors
demonstrating the detected bugs and vulnerabilities to help the developers
with the debugging process. Moreover, we automatically generate patches
that can be used to mitigate the detected bugs and vulnerabilities until
developers write their own patches.
Main Content
Enter the password to open this PDF file:
-
-
-
-
-
-
-
-
-
-
-
-
-
-