Bug 4157 - Reducing System Load with Temporary Rejections - Penalty Box
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin
Version: unspecified
Hardware: Other other
Importance: P5 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-02-25 07:45 UTC by Marc Perkel
Modified: 2005-02-25 10:25 UTC (History)
0 users



Description Marc Perkel 2005-02-25 07:45:05 UTC
I've discovered a trick that has significantly reduced the system load when
using SpamAssassin, and I think the idea should be incorporated into SA and
done better than I am doing it.

Often a spammer sends the same spam over and over to different people, and
SA correctly identifies it each time - but at a cost of load on the system.
Sometimes spammers pound the server over and over with dictionary and various
other attacks. This suggestion is geared to reducing the load on the system by
slowing down the spammers with temporary errors, using what I call a penalty box.

The idea is that once a from address has sent a spam, any email from that address
will get a temporary error (come back later) from the MTA for the next 5 minutes.
If the sender is sending ham, the message will eventually get through. But in
many cases spammers make only one attempt and move on.

I'm using Exim, and most of what I'm doing is at the MTA level. Basically,
spammers are put into a temporary blacklist that is used to return temporary
errors. Sometimes I put the IP address in a similar list to return temp errors.
Every 5 minutes the list is emptied by a cron job. And - it is working very
well in reducing the load of having to process the same spam over and over, as
well as reducing the load of other "sins" that spammers commit.
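
As a rough illustration of the penalty box described above, here is a minimal Python sketch. It is not the Exim setup from this report: the class name `PenaltyBox`, the helper `smtp_response`, and the reply strings are all hypothetical, and it swaps the cron job that empties the whole list every 5 minutes for a per-entry expiry time, so each offender is delayed for a full 5 minutes from the last offence.

```python
import time


class PenaltyBox:
    """In-memory list of recent offenders with a fixed penalty period."""

    def __init__(self, penalty_seconds=300, clock=time.time):
        self.penalty_seconds = penalty_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._expires = {}  # key (from address or IP) -> expiry timestamp

    def add(self, key):
        """Record an offender; the penalty period runs from now."""
        self._expires[key.lower()] = self.clock() + self.penalty_seconds

    def is_penalized(self, key):
        """Return True while the key's penalty period has not expired."""
        key = key.lower()
        expiry = self._expires.get(key)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._expires[key]  # forget offenders whose time is up
            return False
        return True

    def smtp_response(self, key):
        """Hypothetical helper: choose the reply the MTA would send."""
        if self.is_penalized(key):
            return "451 Temporary failure, please try again later"
        return "250 OK"
```

A caller would `add()` the from address or IP whenever a message scores as spam, and check `is_penalized()` before handing the next message from that sender to SA.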

So - how does this tie into SpamAssassin? It would be handy if SA could
maintain a short-lived database (DB file? text file?) that contained a list of
recent spammers or spam information in a way that can be read from Exim or
other MTAs - or SA itself - for the purpose of reducing system load from
spammers that hammer the server over and over in a short period of time. It's
sort of a recent sinners list and can contain either from addresses or IP
addresses of offenders.
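
One possible shape for such a short-lived list is sketched below in Python, purely as an assumption about how it could look: a flat text file with one `key<TAB>expiry` line per offender. The function names, the file layout, and the 300-second default are hypothetical and are not an existing SA or Exim feature. The sketch does no locking, so concurrent writers could lose an entry.

```python
import os
import tempfile
import time


def load_active(path, now=None):
    """Return {key: expiry} for entries that have not expired yet."""
    now = time.time() if now is None else now
    active = {}
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                key, _, expiry = line.rstrip("\n").partition("\t")
                if key and expiry and float(expiry) > now:
                    active[key] = float(expiry)
    except FileNotFoundError:
        pass  # no offenders recorded yet
    return active


def record_offender(path, key, ttl=300, now=None):
    """Add or refresh one offender and drop expired entries in the same pass."""
    now = time.time() if now is None else now
    active = load_active(path, now)
    active[key] = now + ttl
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for entry, expiry in sorted(active.items()):
            handle.write(f"{entry}\t{expiry}\n")
    # Swap the new file into place so readers never see a partial write.
    os.replace(tmp_path, path)
```

Because expired entries are dropped on every write and ignored on every read, no separate cleanup job is needed in this variant.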

This is similar in many ways to greylisting, but with greylisting you penalize
everyone new with delays. This method delays only those who have
previously offended. It isn't as effective as greylisting in some ways - but it
eliminates the delays that greylisting causes on new ham, which I consider to be
unacceptable.

The penalty box idea is working very well for me and it gets rid of some nasty
load spikes that used to hit pretty hard. I think it's worth considering ways to
reduce load by reducing the number of messages SA has to process.
Comment 1 Sidney Markowitz 2005-02-25 13:11:51 UTC
My first reaction was that this would be a great idea for my ISP to use, and I
started to write a letter to a sysadmin there to suggest it.

While writing it I realized that it would put the system behavior at the mercy
of individual users' local configuration options. At a minimum, you should not
have a system-wide temporary fail of email because of a hit on somebody's
personal blacklist. Similarly, people could have local scores set high for some
specific rules. I know of people who are so against HTML in email that they have
set scores of some SA rules that only hit when there is HTML to 1000. That may
not make a lot of sense from a pure spam-filtering point of view, but they do
it. How could this idea work in an ISP environment where users have the ability
to set their local options?
Comment 2 Tom Schulz 2005-02-25 14:09:45 UTC
Perhaps a small modification to SpamAssassin to create two scores.  One would
be that produced by the system-wide rules and the other would include the
user's rules and modifications.  This would still be one pass, just keep two
running totals as you go through the rules.  You would need a new header to
display the score for the system-wide rules, and would not normally need to
display the details of how this score was arrived at.
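
The two running totals suggested here could look roughly like the Python sketch below. It is only an illustration, not SpamAssassin's actual scoring code: `score_message` and its arguments are hypothetical, and it assumes every rule has already been evaluated, so the totals are computed from the list of rules that hit.

```python
def score_message(hits, system_scores, user_scores):
    """Return (system_total, user_total) for the rules that hit a message.

    hits: iterable of names of the rules that matched.
    system_scores: site-wide score for each rule.
    user_scores: per-user overrides and user-defined rules; a rule that is
        not listed here falls back to its site-wide score.
    """
    system_total = 0.0
    user_total = 0.0
    for rule in hits:
        base = system_scores.get(rule, 0.0)  # user-only rules add nothing here
        system_total += base
        user_total += user_scores.get(rule, base)
    return system_total, user_total
```

With illustrative scores, a user who rescored an HTML rule to 1000 would see a huge personal total while the system-wide total, the one a penalty box would look at, stays small:

```python
system = {"HTML_MESSAGE": 0.5, "RAZOR2_CHECK": 1.5}  # illustrative values
user = {"HTML_MESSAGE": 1000.0}
print(score_message(["HTML_MESSAGE", "RAZOR2_CHECK"], system, user))
# prints (2.0, 1001.5)
```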
Comment 3 Marc Perkel 2005-02-25 17:06:51 UTC
Subject: Re:  Reducing System Load with Temporary Rejections - Penalty Box

bugzilla-daemon@bugzilla.spamassassin.org wrote:

> http://bugzilla.spamassassin.org/show_bug.cgi?id=4157

The idea isn't to bounce messages system wide. It's only to delay
senders for a 5 minute period. And - it would be triggered by high scores.

Comment 4 Sidney Markowitz 2005-02-25 17:31:10 UTC
I understand that the effect is not as drastic as a bounce, and I agree that a
temp fail penalty box can be a great idea -- especially since spam senders
usually do not retry temp fails. The problem is that you can't count on a high
score really meaning that some piece of mail is spam when individual users are
able to set their own preferences. They can blacklist anyone they choose, they
can assign high scores to rules that would FP for everyone else, they can even
rescore everything and set their spam threshold to 100 if they feel like it.

The only way this could work without making it vulnerable to individual user
preferences is to have some mechanism to keep track of rule hits and scores
using the default scoring.

To do that, you would have to deal with the problem that assigning a rule a
score of 0 now means that the rule is never run. That's a problem with Tom's
suggestion of keeping two running totals. I'm not saying that the idea is not
feasible, but I'm raising issues that I think an implementation would need to
address if it is going to be practical.

I agree that the goal of finding a way to temporarily temp fail likely spam
senders is a good one.
Comment 5 Daniel Quinlan 2005-02-25 19:25:10 UTC
This is not SA's job.
Comment 6 Loren Wilton 2005-02-27 03:58:12 UTC
Subject: Re:  Reducing System Load with Temporary Rejections - Penalty Box

> The problem is that you can't count on a high
> score really meaning that some piece of mail is spam when individual users
> are able to set their own preferences.

So implement it first only for sites that don't allow individual user rules
or scores.  That way the site rules prevail, because they are the only ones
around.  At a guess, that might be 60% of the SA sites.

After that, yes, some additional trickery would be needed to remove the
effect of user rules.

Comment 7 Marc Perkel 2005-02-27 08:15:24 UTC
Subject: Re:  Reducing System Load with Temporary Rejections - Penalty Box

I don't think that additional trickery is necessary because it's only a 
5 minute temp error so if something gets it wrong it's only a 5 minute 
problem and no real email is lost.

The beauty of this system is that you don't have to be precise. If you 
get it wrong - no big deal. The whole point of this is to reduce system 
load without sacrificing any email and the short delays on false 
positives are not significant and far less of a delay than greylisting.

This is about keeping it simple.