modperl-ip_based_instant_throttle

This is part of The Pile, a partial archive of some open source mailing lists and newsgroups.



To: Justin <jb@dslreports.com>
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: IP based instant throttle?
Date: 07 Jun 2001 19:34:45 -0700

>>>>> "Justin" == Justin  <jb@dslreports.com> writes:

Justin> Does anyone see the value in a Throttle module that looked at
Justin> the apache parent status block and rejected any request where
Justin> another child was already busy servicing *that same IP* ?
Justin> (note: the real IP is in the header in a backend setup so it
Justin>  is not possible to dig it out across children without
Justin>  creating another bit of shared memory or using the filesystem?).

Justin> I'm still finding existing throttle modules do not pickup and
Justin> block parallel or fast request streams fast enough .. ok there are
Justin> no massive outages but 10 seconds of delay for everyone because
Justin> all demons are busy servicing the same guy before we can conclude
Justin> we're being flooded is not really great.. modperl driven forums
Justin> (or PHP ones even) can be killed this way since there are so
Justin> many links on one page, all active.. 

It would be pretty simple, basing it on my CPU-limiting throttle that
I've published in Linux Magazine
<http://www.stonehenge.com/merlyn/LinuxMag/col17.html>.  Just grab a
flock on the CPU-logging file in the post-read-request phase instead
of writing to it.  If you can't get the flock, reject the request.
Release the flock by closing the file in the log phase.

But this'd sure mess up my ordinary visit to you, since my browser
makes 4 connections in parallel to fetch images, and I believe most
browsers do that these days.

===

To: "Ken Williams" <ken@forum.swarthmore.edu>
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: IP based instant throttle?
Date: 08 Jun 2001 06:13:05 -0700

>>>>> "Ken" == Ken Williams <ken@forum.swarthmore.edu> writes:

Ken> merlyn@stonehenge.com (Randal L. Schwartz) wrote:
>> It would be pretty simple, basing it on my CPU-limiting throttle that
>> I've published in Linux Magazine
>> <http://www.stonehenge.com/merlyn/LinuxMag/col17.html>.  Just grab a
>> flock on the CPU-logging file in the post-read-request phase instead
>> of writing to it.  If you can't get the flock, reject the request.
>> Release the flock by closing the file in the log phase.
>> 
>> But this'd sure mess up my ordinary visit to you, since my browser
>> makes 4 connections in parallel to fetch images, and I believe most
>> browsers do that these days.

Ken> I was thinking about that too, and concluded that you'd only want to
Ken> throttle the back-end server in a 2-server setup.  That would usually
Ken> (save for subrequests) only be 1 request throttled per page-load.  I
Ken> tend not to care about the front-end, because overload is rarely a
Ken> problem there.

Well, if the reason you're throttling is to block excessive usage of
the machine, the full monty of CPU limiting will do that just fine,
since images are delivered quickly, but anything that eats CPU starts
pushing the counter up to the max.  That's why I have my CPU
throttler, and it worked fine to prevent me from being "slashdotted"
that one day I was mentioned there.  I'm told that my CPU throttler
was used at etoys.com for a similar purpose, and permitted them to
keep from losing millions of dollars of revenue due to people
spidering their catalog.

===

To: modperl@apache.org
From: Roman Maeder <maeder@mathconsult.ch>
Subject: Re: IP based instant throttle? 
Date: Fri, 08 Jun 2001 15:55:57 +0200

merlyn@stonehenge.com said:
> Well, if the reason you're throttling is to block excessive usage of
> the machine, the full monty of CPU limiting will do that just fine, 

one kind of DoS would not be caught by looking at CPU usage. It is one
that I have experienced a number of times, namely some
misconfigured offline browsing tool that just opens as many
concurrent connections as it can until it has read all the pages on your
server. I don't know whether some of these tools are misconfigured out of
the box, or whether users changed the settings. Some idiots do that even
behind a modem, so the limit is not CPU but bandwidth, as all of
these connections go through the same slow wire. Your CPU
will then be mostly idle, with full IP output queues and all Apache
processes in the "W" state. As soon as one of the requests times out, the
tool opens a new one.

It should be easy to hack Apache::SpeedLimit to count concurrent
accesses instead of the number of accesses over a certain time, and
lock out the client when it reaches some maximum. Is this the
best way to do this, or are there better ideas?
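
The bookkeeping Roman proposes might look like the following sketch. This is Python with in-process state only; a real Apache::SpeedLimit hack would need shared memory or a file, since Apache children are separate processes, and the limit of 4 is an arbitrary assumption:

```python
import threading
from collections import defaultdict

MAX_CONCURRENT = 4  # assumed limit; tune per site

class ConcurrencyThrottle:
    """Count in-flight requests per client IP and reject any
    request that would push an IP past the limit."""

    def __init__(self, limit=MAX_CONCURRENT):
        self.limit = limit
        self.lock = threading.Lock()
        self.active = defaultdict(int)

    def enter(self, ip):
        """Call at request start; True = serve it, False = reject."""
        with self.lock:
            if self.active[ip] >= self.limit:
                return False
            self.active[ip] += 1
            return True

    def leave(self, ip):
        """Call when the request completes (e.g. in the log phase)."""
        with self.lock:
            self.active[ip] -= 1
            if self.active[ip] <= 0:
                del self.active[ip]
```

The key difference from a requests-per-interval counter is that `leave()` decrements immediately, so a well-behaved client that waits for each page never accumulates a count.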

===

To: modperl@apache.org
From: Justin <jb@dslreports.com>
Subject: Re: IP based instant throttle?
Date: Fri, 8 Jun 2001 17:17:16 -0400

good ideas, thanks.

as someone said, it's clogging on the backend, due to either
SQL server contention or, more likely, largish pages draining
to the user even with all the buffers en route helping to
mitigate this. You can't win: if they are on a modem they can
tie up 8 mod_perl daemons, and if they are on a cable modem they
can disrupt your SQL server, creating select/insert locks and
yet more stalled stuff. A cable-modem user could request
1 Mbit/s of dynamic content.. that's a big ask..

Since the clogging is not images (those are hopefully handled by
an image server like mathopd) but mod_perl pages, I'm going to
try a timed ban triggered by parallel requests from a single IP.
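
A timed ban of that sort might be bookkept like this hedged Python sketch; the parallel limit and ban length are invented numbers, and the clock is injectable only so the expiry is testable:

```python
import time

PARALLEL_LIMIT = 3   # assumed: concurrent requests that trigger a ban
BAN_SECONDS = 600    # assumed: how long the ban lasts

class TimedBan:
    """Ban an IP once it exceeds PARALLEL_LIMIT simultaneous
    requests; the ban expires on its own after BAN_SECONDS."""

    def __init__(self, limit=PARALLEL_LIMIT, ban=BAN_SECONDS, now=time.time):
        self.limit = limit
        self.ban = ban
        self.now = now            # injectable clock, for testing
        self.active = {}          # ip -> in-flight request count
        self.banned_until = {}    # ip -> expiry timestamp

    def allow(self, ip):
        """Call at request start; False means reject (banned or
        just crossed the parallel limit, which starts a ban)."""
        t = self.now()
        if self.banned_until.get(ip, 0) > t:
            return False
        if self.active.get(ip, 0) >= self.limit:
            self.banned_until[ip] = t + self.ban
            return False
        self.active[ip] = self.active.get(ip, 0) + 1
        return True

    def done(self, ip):
        """Call when the request completes."""
        if self.active.get(ip, 0) > 0:
            self.active[ip] -= 1
```

Because the ban expires by itself, a DHCP address handed to the next innocent user is only blocked for the ban window, which addresses the concern below about after-the-event IP bans.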

And yes, it does happen often enough to annoy.. ("often" might be
two or three times a day, even though as a percentage of uniques
it's very tiny). Since many of the culprits don't even know what
they've got installed on their PCs, are on DHCP addresses, and
probably never return anyway, IP bans after the event are
never any good and may hit the next user who picks up the IP.

===

To: modperl@apache.org
From: Justin <jb@dslreports.com>
Subject: Re: IP based instant throttle?
Date: Fri, 8 Jun 2001 17:34:37 -0400

I'm glad I haven't got your user.. I think almost any site on the
net can be brought to its knees by, for example, stuffing its
site search form with random but very common words, pressing
the post button, and issuing these requests as frequently as
possible from a long list of open proxies.. or how about repeatedly
fetching random pages of very old postings, such that the SQL
server's index/table memory cache becomes useless... nightmare ;)

All one can do is respond with appropriate measures at the time
of the attack, which is why working in mod_perl is cool: it is
easy to patch in defenses and modify them while the site is live.

Writing a short script that takes the last 20 minutes of access_log,
automatically identifies abuse based on request frequency, IPs,
and URL patterns, and drops the route to those IPs is a good
start.. having this auto-triggered from a site-monitoring
script is even better.
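
The frequency-counting part of such a script might look like this Python sketch. It assumes Common Log Format lines; the threshold of 200 requests per window is an arbitrary example, and only the per-IP frequency test is shown, not URL patterns:

```python
import re
from collections import Counter

# Assumed input: Common Log Format lines, e.g.
# 1.2.3.4 - - [08/Jun/2001:17:00:01 -0400] "GET /forum HTTP/1.0" 200 5120
IP_RE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}) ")

def find_abusers(lines, threshold=200):
    """Count requests per IP in a slice of access_log and return
    (ip, count) pairs over `threshold`, busiest first."""
    hits = Counter()
    for line in lines:
        m = IP_RE.match(line)
        if m:
            hits[m.group(1)] += 1
    return [(ip, n) for ip, n in hits.most_common() if n > threshold]
```

The offending IPs could then be fed to something like a null route or firewall rule by the monitoring script; that part is system-specific and left out here.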

===


the rest of The Pile (a partial mailing list archive)

doom@kzsu.stanford.edu