Definition
Traces are the HTTP requests generated while navigating the application during attack-free sessions. These traces can be generated either manually or automatically, and they are analyzed to extract the intended behavior of the application.
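For concreteness, a trace can be viewed as an ordered sequence of the HTTP requests issued during one attack-free session. The sketch below is purely illustrative: the field names, paths, and parameters are hypothetical and not taken from the cited papers.

```python
# One attack-free session trace, represented as an ordered list of HTTP requests.
# All paths, parameters, and field names are hypothetical placeholders.
trace = [
    {"method": "GET",  "url": "/login",    "params": {}},
    {"method": "POST", "url": "/login",    "params": {"user": "alice", "pass": "***"}},
    {"method": "GET",  "url": "/cart",     "params": {}},
    {"method": "POST", "url": "/checkout", "params": {"item_id": "42", "qty": "1"}},
]

# The intended behavior extracted from many such traces would be, for example,
# the expected ordering login -> cart -> checkout.
```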
Automatic approach (web crawler)
HTTP traces can be fetched automatically with a web crawler that navigates the SUT (system under test).
Benefit
- automatic, fast, and less error-prone
- scalable and reliable
Shortcomings
- false behavior may be inferred
- a few vulnerable pages may not be identified, because the crawler may not navigate through the application in the way intended by the programmer
- it is difficult to implement a context-aware crawler for workflow extraction: a crawler does not know the best way to get from one page to another
Implementation example (Deepa, Thilagam, et al., 2018):
- the crawler starts from a seed URL and fetches the first HTTP response
- the crawler looks for hyperlinks in the HTML: it inspects attributes such as `src`, `href`, and `action` in the HTML content, as well as JavaScript events such as `window.location`, `window.open`, `.location.assign`, `.href`, `.load`, `.action`, and `.src`, to identify the URLs in a web page
- the captured URLs are stored in a list, and the crawler explores the web pages of the application in a DFS (depth-first search) fashion until all the web pages have been visited (a minimal sketch of this loop follows this list)
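The following is a minimal sketch of this crawling loop under stated assumptions: the `requests` library is used for HTTP, URL extraction is approximated with regular expressions over the attributes and JavaScript patterns listed above, only same-host pages are followed, and `SEED_URL` is a placeholder. The actual implementation in (Deepa, Thilagam, et al., 2018) may differ.

```python
# Sketch of a DFS crawler that collects attack-free HTTP traces from a SUT.
import re
from urllib.parse import urljoin, urlparse

import requests

SEED_URL = "http://localhost:8080/"  # hypothetical seed URL of the SUT

# HTML attributes and JavaScript navigation patterns that may carry URLs.
ATTR_RE = re.compile(r'(?:src|href|action)\s*=\s*["\']([^"\']+)["\']', re.I)
JS_RE = re.compile(
    r'(?:window\.location|window\.open|\.location\.assign|\.href|\.load|\.action|\.src)'
    r'\s*[(=]\s*["\']([^"\']+)["\']', re.I)

def extract_urls(base_url, html):
    """Collect candidate URLs from HTML attributes and JavaScript patterns."""
    found = set()
    for match in ATTR_RE.findall(html) + JS_RE.findall(html):
        found.add(urljoin(base_url, match))
    return found

def crawl(seed):
    """Depth-first crawl starting from the seed URL, restricted to its host."""
    host = urlparse(seed).netloc
    visited, stack, trace = set(), [seed], []
    while stack:
        url = stack.pop()  # LIFO order gives depth-first exploration
        if url in visited or urlparse(url).netloc != host:
            continue
        visited.add(url)
        try:
            response = requests.get(url, timeout=5)
        except requests.RequestException:
            continue
        trace.append((url, response.status_code))
        for link in extract_urls(url, response.text):
            if link not in visited:
                stack.append(link)
    return trace

if __name__ == "__main__":
    for url, status in crawl(SEED_URL):
        print(status, url)
```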
Manual approach (human tester)
Manual traces are generated by having a human tester navigate through the application.
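One way such manual traces could be recorded is by routing the tester's browser through an intercepting proxy. The sketch below assumes mitmproxy is used for this purpose; the proxy, the output file name, and the recorded fields are illustrative assumptions, not part of the cited techniques.

```python
# record_trace.py - append each request issued by the tester's browser
# to a JSON-lines trace file (output file name is a hypothetical choice).
import json
from mitmproxy import http

TRACE_FILE = "manual_trace.jsonl"

def request(flow: http.HTTPFlow) -> None:
    # Record the method, full URL, and query parameters of every request.
    entry = {
        "method": flow.request.method,
        "url": flow.request.pretty_url,
        "params": dict(flow.request.query),
    }
    with open(TRACE_FILE, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

It could be run with `mitmdump -s record_trace.py` while the tester browses the application through the proxy.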
Benefit
- accurate, detailed, relevant traces can be gathered
Shortcomings
- a time-consuming and error-prone approach
References
- Technique used by (Deepa, Thilagam, et al., 2018)
- Technique used by (Pellegrino, Balzarotti, 2014)
- Technique used by (Li, Xue, et al., 2013) to build a system that detects logic flaws in a black-box fashion by analyzing URL tampering