Definition

Traces are the HTTP requests generated while navigating the application during attack-free sessions. They can be generated either manually or automatically and are then analyzed to extract the intended behavior of the application.
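
As a concrete illustration, a trace can be thought of as an ordered list of request records. The field names in the sketch below are assumptions chosen for illustration, not a schema defined in the source:

  # One attack-free trace: the ordered HTTP requests observed while a user
  # navigates the application (field names are illustrative assumptions).
  trace = [
      {"method": "GET",  "url": "/login",    "params": {}},
      {"method": "POST", "url": "/login",    "params": {"user": "alice", "pass": "***"}},
      {"method": "GET",  "url": "/account",  "params": {}},
      {"method": "POST", "url": "/transfer", "params": {"to": "bob", "amount": "10"}},
  ]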

Automatic approach (web crawler)

HTTP traces can be collected automatically with a web crawler that navigates the SUT (system under test).

Benefits

  • automatic, fast, and less error-prone
  • scalable and reliable

Shortcomings

  • false behavior may be inferred from the crawl
  • some vulnerable pages may be missed, because the crawler may not navigate the application in the way the programmer intended
  • it is difficult to implement a context-aware crawler for workflow extraction: a crawler does not know the best way to get from one page to another

Implementation example (Deepa, Thilagam, et al., 2018):

  • the crawler starts from a seed URL and fetches the first HTTP response
  • the crawler looks for hyperlinks in the HTML: it searches for attributes such as src, href, and action, as well as JavaScript constructs such as window.location, window.open, .location.assign, .href, .load, .action, and .src, to identify the URLs in a web page
  • the captured URLs are stored in a list, and the crawler explores the web pages of the application in DFS (depth-first search) fashion until all web pages have been visited (a minimal sketch follows this list)
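
A minimal sketch of such a crawler, assuming the SUT is reachable over HTTP and using the requests library plus a regular-expression scan of the HTML. The URL patterns follow the attributes and JavaScript constructs listed above, but this is an illustrative reconstruction, not the authors' implementation:

  import re
  from urllib.parse import urljoin, urlparse

  import requests

  # Attribute and JavaScript patterns used to extract candidate URLs,
  # following the description above.
  URL_PATTERNS = [
      r'(?:src|href|action)\s*=\s*["\']([^"\']+)["\']',
      r'(?:window\.location|window\.open|\.location\.assign|\.href|\.load|\.action|\.src)'
      r'\s*[(=]\s*["\']([^"\']+)["\']',
  ]

  def crawl(seed_url, max_pages=100):
      """Depth-first crawl starting from seed_url, restricted to the seed's host."""
      host = urlparse(seed_url).netloc
      visited = set()
      stack = [seed_url]          # explicit stack -> depth-first exploration
      trace = []                  # (method, url) pairs, i.e. the collected trace

      while stack and len(visited) < max_pages:
          url = stack.pop()
          if url in visited:
              continue
          visited.add(url)

          try:
              response = requests.get(url, timeout=5)
          except requests.RequestException:
              continue
          trace.append(("GET", url))

          # Extract candidate URLs from the HTML/JavaScript content.
          for pattern in URL_PATTERNS:
              for match in re.findall(pattern, response.text):
                  absolute = urljoin(url, match)
                  if urlparse(absolute).netloc == host and absolute not in visited:
                      stack.append(absolute)

      return trace

  if __name__ == "__main__":
      # "http://localhost:8080/" is a placeholder seed URL for the SUT.
      for method, url in crawl("http://localhost:8080/"):
          print(method, url)

The host restriction keeps the crawl inside the SUT; a production crawler would additionally render JavaScript and submit forms, which this sketch omits.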

Manual approach (human tester)

Manual traces are generated by letting a human tester navigate through the application.
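
One possible way to record such traces while the tester browses is to route the session through an intercepting proxy. The sketch below uses mitmproxy's addon API as an assumed tooling choice (the source does not prescribe any tool), and the output file name is illustrative:

  # trace_logger.py -- run with: mitmdump -s trace_logger.py
  # Logs each HTTP request the tester issues while navigating the
  # application through the proxy, producing a manual attack-free trace.
  import json

  class TraceLogger:
      def __init__(self):
          self.out = open("manual_trace.jsonl", "a")

      def request(self, flow):
          # flow is a mitmproxy HTTPFlow; log method, URL, and form parameters.
          record = {
              "method": flow.request.method,
              "url": flow.request.pretty_url,
              "params": dict(flow.request.urlencoded_form or {}),
          }
          self.out.write(json.dumps(record) + "\n")
          self.out.flush()

  addons = [TraceLogger()]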

Benefits

  • accurate, detailed, relevant traces can be gathered

Shortcomings

  • a time-consuming and error-prone approach

References