Network Security Web-App-Sec

Re: Hit Throttling - Content Theft Prevention

Subject: Re: Hit Throttling - Content Theft Prevention
Date: Wed, 19 Oct 2005 17:03:36 +1000
On 19/10/05, Kurt Seifried <bt@seifried.org> wrote:
One effective strategy is to have hidden links (i.e. white text on a white
background or a 1x1 pixel image stashed somewhere) that regular browsers
won't show at all. Have them go to a page with more links that specifically
say "do not click this, you will be blocked," etc. These links go to a CGI,
and the CGI blocks that IP/etc. (firewall rules, Apache config, whatever).
Make sure you stick these in various alphabetical orders and at the top and
bottom of the pages (many scrapers start at the top of a page or go in
alphabetical order).
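
For illustration only, the trap idea quoted above might look roughly like this in Python. This is a minimal sketch, not anyone's production setup: the `/trap/` path, the `pixel.gif` image, and the `TrapBlocklist` class are all hypothetical names, and the in-memory set merely stands in for real firewall rules or web server config.

```python
from dataclasses import dataclass, field

# Hypothetical trap path; any URL linked only from hidden markup would do.
TRAP_PREFIX = "/trap/"


@dataclass
class TrapBlocklist:
    """In-memory stand-in for firewall rules or a web server deny list."""
    blocked: set = field(default_factory=set)

    def handle(self, ip: str, path: str) -> int:
        """Return the HTTP status this request should receive."""
        if ip in self.blocked:
            return 403  # already caught by an earlier trap hit
        if path.startswith(TRAP_PREFIX):
            self.blocked.add(ip)  # the client followed a link no human can see
            return 403
        return 200


def hidden_trap_link(name: str) -> str:
    """Markup for a 1x1 pixel link; pixel.gif is an assumed transparent image."""
    return (
        f'<a href="{TRAP_PREFIX}{name}">'
        '<img src="/pixel.gif" width="1" height="1" alt=""></a>'
    )
```

Giving the links names such as `aardvark` and `zzz` covers crawlers that walk links alphabetically from either end. Listing the trap path under `Disallow` in robots.txt should keep well-behaved crawlers from tripping it.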

Thanks for the tips, Kurt. In one instance, I have seen a crawler that
iterates IDs within a GET request, i.e.

/content.asp?bookid=1000&page=1
/content.asp?bookid=1000&page=2

and keeps going until it gets a 404, at which point it increments the
first ID. We obfuscated the IDs by hashing them (MD5), but the bot caught
on and started hashing its guesses the same way. We had to put that case
down to a design fault in the application, in that the URLs were
easily guessable. When you have content of high value at stake, the
'other side' seems to get more sophisticated than your
standard home user who has downloaded a website scraper from
download.com. What your tips are leading towards are ways to
distinguish human visitors from bots, which with some attackers simply
leads to a game of cat-and-mouse rather than a solution that can be
handed to the client. We have also tried CAPTCHA, but that resulted in
a noticeable drop in website hits, with 20-30% of visitors not going
past the image challenge screen. The CAPTCHA led to a one-time URL,
which also posed a problem when the user refreshed the page.
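
To make the guessable-URL point concrete, here is a rough Python sketch of why a plain MD5 of the ID does not help, next to one possible alternative: a keyed hash (HMAC) carried in the URL alongside the ID. This is an assumption about what a fix could look like, not what the application in question did. `SECRET_KEY` and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it would normally come from configuration.
SECRET_KEY = b"replace-with-a-random-secret"


def md5_token(book_id: int) -> str:
    """Unkeyed hash of the ID: anyone can recompute it for book_id + 1."""
    return hashlib.md5(str(book_id).encode()).hexdigest()


def keyed_token(book_id: int, key: bytes = SECRET_KEY) -> str:
    """HMAC-SHA256 of the ID: cannot be recomputed without the key."""
    return hmac.new(key, str(book_id).encode(), hashlib.sha256).hexdigest()


def verify(book_id: int, token: str, key: bytes = SECRET_KEY) -> bool:
    """Constant-time check that the token in the URL matches the ID."""
    return hmac.compare_digest(keyed_token(book_id, key), token)
```

A request for an ID whose token fails `verify` could be rejected outright. Note that this only stops blind enumeration; a bot that follows the links the site itself publishes still gets valid tokens.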

I have contacted a number of appliance vendors to see if they offer a
transparent application-layer firewall that could identify bad bots
and drop them, but surprisingly not one had a solution to offer. This
is an area where we keep getting more and more requests. It is a pity
that the Big Co's aren't taking up the opportunity, considering that
most of the companies that are affected and need to protect their
content would pay almost anything.

Regards,
Nik

--
Nik Cubrilovic  -  <http://www.nik.com.au>
