ha.ckers sla.cking
Web application scanner
Posted by: holiman
Date: May 20, 2009 05:23AM

I am working on a little web application scanner. However, I would like to use an existing webcrawler to locate application entry points on a given site, and wonder if anyone has any recommendations.

What I want :
A spider that navigates through and produces a list of all entry-points, such as


Most software that should be able to do this easily seems to be web-grabbers. HTTrack, for example, would be great, but it does not seem 'tweakable' enough to get what I want out of it. Any ideas?

Re: Web application scanner
Posted by: wireghoul
Date: May 21, 2009 07:15PM



Google for crawler, mirror, spider + your favorite os/programming language. You will get some hits


Re: Web application scanner
Posted by: holiman
Date: May 22, 2009 01:06PM

How can I make wget produce such a list ?
Curl is a tool for transferring files over many protocols, but it does not parse HTML or spider webpages (afaik).
Can Lynx do that? How? I thought it was just an ascii browser?

And I don't want to program a web crawler, since people have done great jobs before - I want to use an existing one that also gives me output in a format I can massage into some kind of list. Yes, I can google further and look them all up, but if anyone has any one-line how-tos, I'd be glad for the help.

Re: Web application scanner
Posted by: Spyware
Date: May 22, 2009 06:21PM

HTTrack is open source...

Re: Web application scanner
Posted by: Reiners
Date: May 24, 2009 11:15AM

quick and dirty:


// get HTML
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, ""); // target URL goes here
$html = curl_exec($ch);
curl_close($ch);

// load DOM object (@ suppresses warnings on malformed HTML)
$dom = new DOMDocument();
@$dom->loadHTML($html);

// get all <a> links
$links = $dom->getElementsByTagName('a');

foreach ($links as $link) {
	echo $link->getAttribute('href')."<br>";
}

// get all <form action=""> links
$forms = $dom->getElementsByTagName('form');

foreach ($forms as $form) {
	$method = $form->getAttribute('method');
	$action = $form->getAttribute('action');

	$inputs = $form->getElementsByTagName('input');

	// build a name=value& string from the form's inputs
	$data = "";
	foreach ($inputs as $input) {
		$name = $input->getAttribute('name');
		$value = $input->getAttribute('value');
		$data .= "$name=$value&";
	}

	// eregi() is gone in PHP 7; stripos() does the same case-insensitive check
	if (stripos($method, 'post') !== false) {
		echo $action." POST: $data<br>";
	} else {
		echo $action."?$data<br>";
	}
}



<a> and <form> are the only tags with links to follow that come to mind atm. You can easily edit the above code to recursively get all links.
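For anyone who prefers Python, here is a rough sketch of the recursive version of the same idea, using only the standard library. The names `EntryPointParser` and `crawl`, and the `fetch` callback, are made up for this illustration; `fetch(url)` is assumed to return the page's HTML as a string, or None on failure, so you can plug in whatever HTTP client you like. It stays on the starting host and does not touch JavaScript or Flash.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class EntryPointParser(HTMLParser):
    """Collects <a href> targets and <form> actions with their input names."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []    # absolute URLs from <a href="...">
        self.forms = []    # dicts with "action", "method", "inputs"
        self._form = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "form":
            self._form = {
                "action": urljoin(self.base_url, attrs.get("action") or ""),
                "method": (attrs.get("method") or "get").upper(),
                "inputs": [],
            }
            self.forms.append(self._form)
        elif tag == "input" and self._form is not None and attrs.get("name"):
            self._form["inputs"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._form = None


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl of one host; returns (sorted URLs seen, forms found)."""
    host = urlparse(start_url).netloc
    queue, seen, forms = deque([start_url]), {start_url}, []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        parser = EntryPointParser(url)
        parser.feed(html)
        forms.extend(parser.forms)
        for link in parser.links:
            link = link.split("#")[0]  # drop fragments so /b and /b#top are one page
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen), forms
```

Passing the fetcher in as a function keeps the crawl logic testable against a dict of canned pages (e.g. `crawl(start, pages.get)`) before you point it at a real site.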

Edited 2 time(s). Last edit at 05/24/2009 11:48AM by Reiners.

Re: Web application scanner
Posted by: nEUrOO
Date: May 25, 2009 10:42AM

It's quite tough to create a good crawler: there are many client-side technologies to parse (at least JS & Flash, for the common ones). If you want something very basic, then Reiners' code might work well. Anyhow, w3af has a good crawler (for an open-source product), so you might want to take a look at it -- and btw, it's already a web app scanner framework, so you could use it to develop your custom attacks, ideas etc.

nEUrOO

Re: Web application scanner
Posted by: holiman
Date: May 26, 2009 12:20PM

I touched briefly on the idea of writing my own crawler, but I realised there are a lot of requirements on a good one:

* It needs a more complex workflow than recursion, probably a couple of threads: one handling a "link-funnel" that drops already-visited and already-queued links and adds to the visit-queue those that meet the BL-rules (such as allowed domains, allowed paths etc.), and another thread to visit pages and analyze content.
* Some kind of disk-based history of visited locations, probably sqlite or some kind of key-value store, so it does not bounce around stupidly and can be stopped and resumed later.
* It needs pretty complex HTML parsing, to handle (as nEUrOO points out) JavaScript and such.
* More advanced configuration is probably wanted: stealth, the ability to load cookies, fake UA-strings.
* I would probably want to store pages in a best-effort kind of way, perhaps saving pages that contain forms.
* It would be nice to aggregate some info from all spidered pages, like Paros Proxy does, and present all HTML comments found on a domain.
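To make the first two requirements concrete, here is a minimal single-threaded sketch of the "link-funnel" with a sqlite-backed history, in Python. The class name `Frontier`, its methods, and the table layout are all invented for illustration, and the rules are written as allow-lists (hosts and path prefixes) because that is the simplest reading of "allowed domains, allowed paths". A threaded version would need one connection per thread or a lock around it.

```python
import sqlite3
from collections import deque
from urllib.parse import urlparse


class Frontier:
    """Link funnel: filters candidate URLs and remembers them in SQLite."""

    def __init__(self, db_path, allowed_hosts, allowed_prefixes=("/",)):
        self.allowed_hosts = set(allowed_hosts)
        self.allowed_prefixes = tuple(allowed_prefixes)
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS urls ("
            "url TEXT PRIMARY KEY, visited INTEGER NOT NULL DEFAULT 0)"
        )
        # Resume support: reload anything queued earlier but never visited.
        rows = self.db.execute(
            "SELECT url FROM urls WHERE visited = 0 ORDER BY rowid"
        )
        self.queue = deque(row[0] for row in rows)

    def allowed(self, url):
        """Apply the scope rules: scheme, host and path prefix."""
        parts = urlparse(url)
        path = parts.path or "/"
        return (
            parts.scheme in ("http", "https")
            and parts.hostname in self.allowed_hosts
            and path.startswith(self.allowed_prefixes)
        )

    def offer(self, url):
        """Queue url if it is in scope and not already queued or visited."""
        if not self.allowed(url):
            return False
        known = self.db.execute(
            "SELECT 1 FROM urls WHERE url = ?", (url,)
        ).fetchone()
        if known:
            return False
        self.db.execute("INSERT INTO urls (url) VALUES (?)", (url,))
        self.db.commit()
        self.queue.append(url)
        return True

    def next_url(self):
        """Pop the next URL and mark it visited; returns None when empty."""
        if not self.queue:
            return None
        url = self.queue.popleft()
        # Marked visited on hand-out, so a crash mid-fetch skips that page.
        self.db.execute("UPDATE urls SET visited = 1 WHERE url = ?", (url,))
        self.db.commit()
        return url
```

Pointing `db_path` at a file instead of `":memory:"` is what gives the stop-and-resume behaviour: a new `Frontier` on the same file picks up the unvisited rows and refuses anything it has already seen.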

So, it just kinda grew. I don't want something very basic, I want something very good... I am going to check out the open-source variants and see if I can tweak one of them into what I want.

Re: Web application scanner
Posted by: nEUrOO
Date: May 26, 2009 12:38PM

If you want a benchmark for your crawler:

And based on the results, that's why I told you that w3af could be a good open-source choice

nEUrOO

Re: Web application scanner
Posted by: holiman
Date: May 26, 2009 12:44PM

I took a peek at it just now, it looks good! Also, it's in python, which I am getting the hang of now, so it seems like a good choice.

Re: Web application scanner
Posted by: jibsn
Date: February 13, 2010 05:40AM

Structured Query Language (SQL) is a specialized programming language for sending queries to databases. Most small and industrial-strength database applications can be accessed using SQL statements. SQL is both an ANSI and an ISO standard. However, many database products supporting SQL do so with proprietary extensions to the standard language. Web applications may use user-supplied input to create custom SQL statements for dynamic web page requests.


Re: Web application scanner
Posted by: ntp
Date: April 11, 2010 08:07PM

nEUrOO Wrote:
> If you want a benchmark for your crawler:

Wivet stats are interesting to me. Recently, I've benchmarked quite a few scanners.

Netsparker Community Edition -- 84%
Google Skipfish -- 46%
Qualys WAS -- 33%
GNUCITIZEN Websecurify -- 21%
Burp Scanner -- 14%

Honestly, it's best to avoid scanning at first when you have a target for an app assessment. Instead, you should walk the app as best you can using passive analysis tools such as Burp Suite Pro. Sometimes I will also run a good crawler (such as Netsparker Community Edition) through a passive analysis proxy such as Casaba Watcher, Casaba x5s, or Google Ratproxy. I also like to run Skipfish against the TLD (e.g. -D, which will generally find a lot of detailed information and reconnaissance data, including link extraction across domain names.

Metasploit WMAP (and their auxiliary modules for http scanning) takes an even stranger approach where they use Google Ratproxy data (after manually walking an app) from trace files to automatically scan the web application for vulnerabilities. More information on WMAP is available in the Metasploit Unleashed training (MSFU) --

I have been meaning to run the Acunetix Free Edition and perhaps Metasploit WMAP (assuming it does any link extraction -- not sure yet) against Wivet. Anyone else have any other suggestions?

I'm also looking for Wivet statistics on some of the other missing commercial scanners, such as NTOSpider.

It's interesting to compare scanners using Wivet vs. Larry Suto's work. Larry sees the top web application security scanners as:
(best) 1. NTOBJECTives NTOSpider
2. Mavituna Security Netsparker
3. Cenzic Hailstorm
4. IBM AppScan
5. Acunetix WVS
6. Burp Suite Professional
7. HP WebInspect
8. Qualys WAS

However, Larry's data has always been questioned and, more so, he only tested 6 web applications, not many of which had Ajax, Flash, or other hard-to-extract links or coverage problems.

The wivet statistics list top link extractors as follows:
(best) 1. [tied] WebInspect, Acunetix WVS
2. Netsparker Standard/Pro
3. Netsparker Community Edition
4. Hailstorm
5. AppScan
6. W3AF
7. Skipfish
8. Qualys WAS
9. Websecurify
10. Burp Scanner

If we are to truly take any of this data seriously, then we must realize that Netsparker was the only web application security scanner that performed well in any sort of benchmarks I've seen yet. Crazier, it's the only one that's free that performs better than W3AF or Skipfish (and by a lot!). Netsparker Pro also carries one of the cheapest price tags I've seen or heard of. I would be interested to try it out and benchmark it more, especially after seeing the Community Edition. It's possible that Netsparker was released this way because they know that they have a superior product compared to the rest of the market.

Personally, I do not care about false positives. I am the type of person who loves to verify false positives and study them. Of course, nobody ever mentions "false false positives" which happens very often. In fact, in many models (where less-talented pen-testers perform "peripheral security testing" and where the top-talent performs "adversarial security testing") lots of true positives get flagged as false positives, when in reality they are exploitable conditions that have not been fully fleshed or scoped out. In other more professional medical/engineering practices, "false false positives" are called Type III errors --

For me, my primary focus is on accuracy, performance, stability, and data exportability/synergy. It is for this reason that I recommend that this industry look further into Burp Suite Professional to ultimately meet our needs. With the latest 1.3.02 release, Intruder can now be configured to work with external fuzzing files. The Scanner supports export of XML reports to the Dradis Framework.

If I were to target other XML report data to go into the Dradis Framework, I would first target Netsparker support, which already has XML -- then Casaba Watcher, which also has XML. Ratproxy and Skipfish report data would be very useful to import.

I'm not sure what the long-term vision of some web application scanners is. I want to say that W3AF will become more stable; that Websecurify will get better link extraction; that Metasploit WMAP will integrate more application security instead of server-platform security. Clearly, money is not an issue when it comes to building a quality web application security scanner. Or perhaps HP and IBM have simply not invested time, money, or resources in WebInspect or AppScan and have alienated their internal talent pools due to the competitive nature of this market, patents, and other politics.

It is also sad to see Cenzic, first to market, fail to hold on to product research that could withstand competition from its largest competitors -- the newcomers, Acunetix and Netsparker. And what is up with Qualys WAS? Surely, they have a nice false positive rate, but this is obviously at the cost of a high false negative rate.

I also get very upset that the Dradis Framework is most useful for reconnaissance work and only with Nmap, Nikto, Nessus, and Burp Scanner output. I would like to see it accept a few other kinds of output (e.g. httsquash, check_http_ssl from nagios-plugins, NeXpose Community Edition, Metasploit, as well as Netsparker, Casaba Watcher, et al), but really I'd like to see this platform turn into something a bit more clever. I find it odd that Offensive-Security recommends the Dradis Framework, but that Metasploit and Rapid7 don't seem to have it on their radar.

Perhaps the HP AMP Open API or the HoneyApps Conduit products will allow for other reconnaissance / data import / data synergy for reporting or other purposes between web application security scanners and the passive tools surrounding them.

One topic I did not bring up was AcuSensor. Only Acunetix claims to have this combined functionality, although Fortify promises it with PTA and Hybrid 2.0 claims to be able to combine WebInspect and PTA. There are small benefits to either of these products, and they are especially not worth it for the money when open-source alternatives such as Xdebug, Emma, and PartCover exist.

Edited 1 time(s). Last edit at 04/12/2010 02:32AM by ntp.

Re: Web application scanner
Posted by: ntp
Date: July 14, 2010 11:40PM

The shellphish guys' paper appears to have confirmed my suspicions.

You basically have scanners that maintain a high level of quality:
1) HP WebInspect
2) Netsparker Pro
3) Acunetix WVS
4) Burp Scanner

The above are my favorites from best to worst, but really it would be nice to have all 4, at least on occasion.

And scanners that do not hold up to today's tech or are just otherwise low quality (again, in order of my most favorite to least favorite):
1) Cenzic Hailstorm (I do like the ability to configure the attacks in Javascript)
2) IBM AppScan (although I hear that the State Inducer tool it comes with is unique and awesome!)
3) NTOBJECTives NTOSpider (I really don't see the fuss about this from RSnake or Larry Suto. Who else likes this tool -- and WHY??? -- it did very poorly in the shellphish report as seen from the Wivet and other results)
4) Qualys WAS
5) WhiteHat Security Sentinel

There are some scanners or scanner-friendly tools that are just damn useful, mostly because they don't cost any money:
1) Casaba Watcher
2) Casaba x5s
3) Google Ratproxy
4) Google Skipfish
5) Websecurify
6) Netsparker CE
7) XSS_Rays
(in no particular order)

I really like how Websecurify is coming along.

I do not like how skipfish works, but I have a new use for it in mind. Basically, stop it from doing all of that crazy content discovery and instead use it to run through Ratproxy patched with the Metasploit WMAP patches (or perhaps the msfproxy). Basically, skipfish will feed the WMAP database so that the WMAP module and the auxiliary modules can be run against the URLs including parameter queries/forms (but not cookies unfortunately). This is mostly useful if you have a huge environment that you want to test really fast, and it works even better if you utilize either existing discovery data or skipfish's internal "-D" flag.

For penetration-tests, scanners can be useful. I like to run Websecurify and XSS_Rays to hit an XSS and then start exploring manually with a browser (which both tools happen to be a part of already). If the authentication looks weak (i.e. no lockouts), Fireforce is a great tool, especially when combined with pre-existing password re-use knowledge or a copy of the skullsecurity-lists (I like rockyou-75.txt best because it's a lot but not too too much -- although rockyou.txt is my favorite if you want to do an exhaustive test or second test).

If Websecurify or XSS_Rays and manual poking fail to get me a quick XSS (even just for demonstration purposes), then running Netsparker (Pro or CE) in crawl-only mode through Casaba x5s will definitely get you one. As an added benefit, Casaba Watcher can be running at the same time as x5s, and you may even get a Clickjacking finding out of it. Websecurify has an added benefit here in that it's good at finding CSRF, especially when you run it through a properly configured Ratproxy.

Armed with the capability to become a user of an app (or a sub-part of an app, such as an admin panel), you can now attack other pieces and parts of the web application. Finding admin panels or other places that load local or remote files (file upload especially) is second only to finding an immediate command injection bug.

In order to find these panels (or forums or similar places that do obvious file interactions), I would suggest looking through the `discovery' section of fuzzdb, especially the ones marked with "cgi-x-platform", "interesting-files-", etc., and pick at targets like apache, a particular cms, iis, sharepoint, etc. You can often identify your target fairly easily, but these days it seems like most applications run more than one type of webapp on the server (e.g. PHP, ASP.NET and Java Enterprise all at once!), so it's easy to miss obvious stuff. There are not too many application scanners that help here, but I would bet that WebInspect and AppScan perform rather well compared to the others. However, I've seen Netsparker Pro and Acunetix WVS find the same issues that WebInspect or AppScan do with regards to this.
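If you want to feed such a list into your own tooling rather than a scanner, the first step is just turning wordlist entries into candidate URLs. A small Python sketch follows; the function name `candidate_urls` is made up, and it assumes one path per line, with blank lines and lines starting with "#" treated as skippable, which may not match every fuzzdb file.

```python
def candidate_urls(base_url, wordlist_lines):
    """Yield one candidate URL per usable wordlist entry, without duplicates."""
    if not base_url.endswith("/"):
        base_url += "/"
    seen = set()
    for line in wordlist_lines:
        entry = line.strip()
        # Assumption: blank lines and '#' lines carry no path.
        if not entry or entry.startswith("#"):
            continue
        url = base_url + entry.lstrip("/")
        if url not in seen:
            seen.add(url)
            yield url
```

Because it takes any iterable of lines, you can pass an open file object straight in, e.g. `candidate_urls(target, open(path))`, and hand the results to whatever does the actual requests. Only run that against systems you are authorized to test.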

Platform vulnerability discovery is certainly the biggest missing feature in Burp Suite Professional. Let's just pretend that Webinspect is the best in this category simply because they have a policy specific to platform vulns, and you can run it from the command line, extract the XML, and totally automate these kinds of findings really easily compared to most of the other tools. However, it is kind of heavy on the false positives -- even when it comes to discovery of these panels sometimes. The benefit of going through these is that you'll have to dig around for PII or other useful reconnaissance info anyways (especially post-authenticated), so looking for those path traversals and predictable resource locations can be done at the same time -- and you might even find something more interesting than an admin panel! It's times like that that you want a Google Enterprise Appliance that has pre-authenticated as every possible user on every possible web application across the organization. However, the concept of dorking can sometimes be applied to a webapp's own search interface. Dig up all of the goods while you can!

After you find a panel, you probably just want to manually figure out how to load up a script -- usually PHP, ASP, aspx/asmx, or JSP. However, this could also be a JAR or WAR file. In essence, you have to poke around the web server install/config as well as the app or parts of the app. You may need to query OSVDB to find similar bugs (although occasionally you may even find an unpatched well-known admin panel!). You may also want to check for file inclusion or script inclusion type bugs using scanners that support it. Burp does a great job with this in both the scanner as well as manually with the intruder. It's easy to sort and look for errors that could cause this. Generic errors and unhandled exceptions may also lead to a file/script read or write inclusion -- so almost any scanner that's finding those quickly and easily for you (personally, I think that WebInspect and Netsparker Pro do great jobs here) is going to be a win and save you time. Metasploit is gaining some traction here as well, especially brute-forcing certain panels such as the Tomcat admin panel. Metasploit Express automates this even further.

So far you've attacked the users and the server (and thus you can start leveraging both to attack each other with lucky-punch attacks). If you do get access to the server, the penetration test can certainly turn into a Metasploit style penetration-test very quickly.

Assuming you've only captured users (and a lot of the users don't appear to be admins of anything) then it's probably best to turn to SQL, XPath, and command injections next in your penetration-test. Netsparker Pro certainly does a great (and fast!) job at enumerating all of the possibilities, but you may want to target some hotspots manually first -- with SQL Inject-Me, TamperData, etc. The skipfish+ratproxy+WMAP method I described earlier would also be nice to try here. As far as XPath and command injection go -- I really think that Burp scanner and intruder+fuzzdb are your best bets.

If you really want to drive home the part about attacking users, you'll want to find more CSRF, Clickjacking, HTTP header injections, open redirects, and session management attacks. Some of this is going to be a lot more manual than the above -- scanners are not exceptional with regards to these attacks. Don't forget about the Burp sequencer functionality, though. The final part here would be testing the authorization models, and this usually works best with pair testers who can spend time sharing session information knowledge while trying to pass parameters and objects around. I think Burp Scanner is exceptional at finding HTTP header injections (although I've heard HHI/HRS as the major benefit of WHS Sentinel if you care enough). Most scanners can find open redirects, but my top 4 choices seem to find them best. XSRFTester (the Fiddler2 plugin) looks fairly decent for finding CSRF.
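On the open redirect point: before reaching for a scanner, a crude triage over URLs you have already collected can tell you where to look first. The Python sketch below flags query parameters whose values look like absolute or scheme-relative URLs. The function name `redirect_candidates` and the heuristic are only an illustration, not how any of the scanners above work, and a hit is just a parameter worth testing by hand.

```python
from urllib.parse import parse_qsl, urlparse


def redirect_candidates(url):
    """Return names of query parameters whose values look like URLs."""
    params = parse_qsl(urlparse(url).query, keep_blank_values=True)
    return [
        name
        for name, value in params
        # parse_qsl has already percent-decoded the values.
        if value.lower().startswith(("http://", "https://", "//"))
    ]
```

Run it over the crawler's URL list and sort the hits by parameter name; names like "next" or "return" showing up repeatedly are usually the interesting ones.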

If you want to gather a lot of information on users, you may want to target LDAP injection or path traversal type attacks. I also think Burp does a great job here.

In Flash or Ajax heavy sites (or ones that employ other external components), this may vary a lot. Burp Suite Professional has a nice search feature to find SWF files, which can be fed to SWFScan. Almost all of the top 10 scanners claim to have some Ajax or Flash support, which is true, but the findings are very wild and vary quite a lot more than you would suspect! Because external components like these can be downloaded, it is nice to get to know them a lot better.

For example, with Flash, start reading the code in SWFScan. If you run into problems with SWFScan, perhaps use IDA Pro with the SWF decompiler plugin. Testing the Flash may be easy or may be difficult. Look for LSOs or other cached Flash content. Check out external content quickly with swfdump -D and grep out URLs. Some other flash decompilers provide nice overviews of the content, which can help you understand the code better. It might be nice to dump the source code into an editor such as Source Insight. Using the OWASP Code Review guide and some other common places can get you at least some knowledge for what to look for beyond what the SWFScan analysis mode gives you. The runtime might do things that scanners can interface with -- such as Burp's AMF capability (and some Burp plugins). I have successfully run WPE Pro on Internet Explorer to view other SWF-to-server data on the wire. There are a few tools from SpiderLabs and Gotham Digital Science that go a bit farther here, as well.
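The "grep out URLs" step can be sketched in a few lines of Python if you want de-duplicated output you can feed to other tools. This assumes you have already saved the text from swfdump -D to a string or file; the function name `extract_urls` and the regex are illustrative and deliberately loose, so expect some junk on real dumps.

```python
import re

# Loose match: http(s) URL up to whitespace, a quote, an angle bracket or ')'.
URL_RE = re.compile(r"""https?://[^\s"'<>)]+""")


def extract_urls(dump_text):
    """Return unique http(s) URLs in order of first appearance."""
    seen, urls = set(), []
    for match in URL_RE.finditer(dump_text):
        url = match.group(0).rstrip(".,;")  # trim trailing punctuation
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

It only finds URLs that appear as literal strings; anything the SWF assembles at runtime will still need the decompiler or a look at the traffic on the wire.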

With regards to Ajax, I suggest pulling down the content locally and running it through AppCodeScan (Blueinfy). It is sometimes very nice to get some runtime knowledge, and the obvious solution is FireBug (although I prefer WiderBug). There are some great plugins to FireBug/WiderBug such as Firerainbow, which does syntax highlighting on the Javascript. I suggest learning how to use the Javascript debugger and profiler. However, AppCodeScan seems to provide a very fast look at the Javascript code, and to boot it's even better when you're running into Ajax proxies such as found along with Java Enterprise or ASP.NET Web Services. I'm not sure about other static analysis or code review tools that support Ajax or Javascript libraries very well, although the Fortify products claim to. If you are looking at a lot of XML or JSON, then the Blueinfy Web2.0Proxy tool makes it a little easier to deal with than going into the Burp XML tabs or similar. Burp Scanner does a great job at finding XML injections, but be wary, as these are some of its most common false positives. If you hit true positives here though, expect Web Services code to "be around". See if you can grab some of it and load it into AppCodeScan, if supported. I'm assuming that Burp Scanner implements very nice SOAP injection capability, although again this is something that appears to vary wildly between all of the scanners. They don't support Web Services very well, especially not REST.

Java applets are about to be blown apart with some of the upcoming Blackhat talks, such as those by Stephen de Vries and Arshan Dabirsiaghi. In the meantime (or regardless), you may want to check out your friendly neighborhood Java decompiler or a runtime tool such as Echo Mirage (which has an excellent feature that appears to be built specifically for Java: "Inject Into Process"). Any HTTP traffic can certainly be reverse proxied through Burp, though, just like with Flash.

Scanners save time and money, but it's best to use them wisely. I hope that some of my wisdom here will help you make the best out of the scanners that you have. I hope that it helps even more so if you are trying to make a decision on how to fill that missing piece of your penetration-testing automation. Soon, new aggregation tools will solidify some integration pieces between these tools. Already, the Dradis Framework appears to be integrating well with Burp Scanner and potentially soon, Netsparker. HoneyApps Conduit is going to demonstrate some serious long-term effects on the future of commercial scanner success because importing tons of data from 5-10 scanners (plus secure static analysis tools) about tons of apps is going to point to the winners in an obvious fashion. HP AMP has an Open API and currently can connect Fortify 360. The new 1.1 version of O2 is already beginning to assemble integration features that may potentially rival all of the other scanner-integration-frameworks I've mentioned in this paragraph.

Scanners might even be useful when full-knowledge or server access is possible. Most people that I know who have done these assessments for a long time still use scanners for some pieces, even when they have source code -- especially when on a penetration-testing goal time-crunch (or even an open-ended ethical hacking approach where there are no specific goals, but time is at least partially a factor). Scanner technology is going to work well with runtime instrumentation technology such as found in Fortify PTA (Java Enterprise or ASP.NET) or the Chorizo-Scanner Morcilla plugin (for PHP). Other languages don't have this security-flavored instrumentation yet, although they appear to need it, so this is a market niche. However, PTA and Morcilla should work with SWF and Ajax as long as the vulnerability is server-side and not client-side.

For SWF/AIR files, applets, or fat applications that implement HTTP or SSL, Burp is the clear winner. If only part of the SWF/AIR file, applet, or fat app is HTTP/TLS, you can see which parts are with WPE Pro, Echo Mirage, or perhaps even Uhooker. Then you can exercise that specific functionality with Burp. One of the nicest features in a web application scanner that I saw recently was Netsparker Pro adding support for Burp import. This brings automated fuzz testing of SWF/AIR files, applets, and fat apps to a new level.

Edited 2 time(s). Last edit at 07/14/2010 11:54PM by ntp.

Re: Web application scanner
Posted by: rsnake
Date: July 15, 2010 02:38PM

That's a lot to respond to so I'll only respond to the part that mentions me to be brief. I don't think I ever made a fuss out of NTO. I think the only thing I've ever even said about them is that they got number one in Larry's report (twice) and that I agree that depth metrics is one solid metric among several that is worth thinking about when evaluating a scanner. In fact I've said nice things about a lot of scanners. I like the Acunetix DNS enumeration (although I still think Fierce is better). I like Mavituna's pivoting. I like Whitehat's foray into integration with WAFs, although clearly there is room for improvement, no doubt. Etc... I try not to focus too much on the negative and just talk about the things I do like in regards to scanners because of precisely this sort of thread - it's religion to a lot of people and I'd rather not get involved in religious arguments.

Oh, but I do agree. Burp is awesome - I think I talk about that one the most because it is by far my favorite for manual professional penetration testers, as opposed to scanner jockeys. It's also the least scalable. At least as of today...

- RSnake
Gotta love it.

Re: Web application scanner
Posted by: ntp
Date: July 15, 2010 03:37PM

"Burp is awesome - I think I talk about that one the most because it is by far my favorite for manual professional penetration testers, as opposed to scanner jockies. It's also the least scalable. At least as of today..."

I know that 3250 US dollars worth of tools per person per year sounds like a lot, but there is value in giving everybody you know (or forcing them to buy) a copy of Burp Suite Professional and Netsparker Pro.

Using the Dradis Framework, it's possible to split up penetration-testing work between a few people on a team. You could have newbies running Nmap, graduating to Nessus/Nikto, apprenticing to appsec with Netsparker Pro, and mastering the universe with the Burp import. The import functionality of this tool is quite nice, especially with the ability to write in comments. The export functionality is even more impressive -- with the ability to merge data from all of those sources and dump it into a Word document or HTML.

The Dradis Framework can be used in multiple ways. It could be used in a training class. It could be used for on-the-job training. It could be used where the scanner jockey runs the scans and an expert drops in later for a second to identify false positives. I think it was meant for a few experts to quickly share information about an on-going, live pen-test.

