Bug 2609 - HTML anchors and non-local HTML and image references
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 2.60
Hardware: Other All
Importance: P5 enhancement
Target Milestone: 2.70
Assignee: SpamAssassin Developer Mailing List
Reported: 2003-10-18 19:52 UTC by Lee Howard
Modified: 2003-10-19 11:33 UTC (History)
Description Lee Howard 2003-10-18 19:52:21 UTC
I don't currently see a ruleset for this, but it would be a nice thing to have.
If there is a way to do this with 2.60, please let me know.

Most of the spam I see getting through SpamAssassin these days contains
non-local HTML and image references... for example:

<x-html><html>
<a href="http://some.host/some.htm">
<img src="http://some.host/some.img">
</a></html></x-html>

The anchor/link may not always be there.  For example, I see generic Viagra ads
come through in JPEG format where the JPEG is not an inline attachment but is
instead fetched from a non-local source.
Comment 1 Daniel Quinlan 2003-10-19 17:44:52 UTC
Unfortunately, non-local images are quite common in HTML email that is not spam.

There's another bug for handling URLs in a blacklist should an appropriate
DNS-based blacklist become available, though.

Thanks.
Comment 2 Lee Howard 2003-10-19 19:33:02 UTC
Thanks Dan.

I am quite surprised, however, by your response.

SpamAssassin has a myriad of rules that already match common non-spam mail.
For example, NO_REAL_NAME, HTML_MESSAGE, HTML_30_40, HTML_70_80,
HTML_FONTCOLOR_UNKNOWN, and FWD_MSG are all rules that have regularly matched
non-spam mail in my own mailbox.  I am not at all concerned that non-spam mail
matches those rules, because SpamAssassin doesn't declare a mail spam for
matching just one rule, but rather because it scores enough points across all
of the rules it matches.
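
The additive scoring described above can be sketched roughly as follows. This
is only an illustration in Python, not SpamAssassin's implementation: the
per-rule scores are made up, REMOTE_IMAGE is a hypothetical name for the
proposed rule, and the 5.0 threshold is assumed for the example.

```python
# Hypothetical per-rule scores; these are NOT SpamAssassin's real values.
RULE_SCORES = {
    "NO_REAL_NAME": 0.3,
    "HTML_MESSAGE": 0.1,
    "FWD_MSG": 0.2,
    "REMOTE_IMAGE": 0.5,  # hypothetical name for the proposed rule
}

# Assumed threshold for this sketch.
THRESHOLD = 5.0


def total_score(matched_rules):
    """Sum the scores of every rule the message matched."""
    return sum(RULE_SCORES[name] for name in matched_rules)


def is_spam(matched_rules, threshold=THRESHOLD):
    """A message is flagged only when the combined score reaches the threshold."""
    return total_score(matched_rules) >= threshold
```

Under these assumed numbers, a message that matches only HTML_MESSAGE and
REMOTE_IMAGE scores well below the threshold, so a low-scoring rule that also
hits legitimate mail does not by itself cause a false positive.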

The fact that a particular mail matches a "non-local HTML/image" rule doesn't
necessarily mean that it is spam any more than the fact that it matches the
HTML_MESSAGE or FWD_MSG rules.  It is simply an indicator by which a
SpamAssassin user can judge whether or not it is spam.  Your refusal to consider
any such "non-local HTML/image" rule simply because non-local references are
also common in non-spam seems a bit of a double standard.

Can such a rule be user-created by defining a body pattern test?

Thanks.
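
For illustration, the kind of pattern test being asked about might look like
the sketch below. It is written in Python rather than as a SpamAssassin rule,
the names REMOTE_IMG, REMOTE_LINK, has_remote_image and has_remote_link are
hypothetical, and it assumes the pattern is run against the raw HTML of the
message rather than against rendered text. Whether an equivalent rule can be
defined in 2.60 is not settled in this thread.

```python
import re

# Matches an <img> tag whose src is an absolute http(s) URL, i.e. the image is
# fetched from a remote host instead of being attached to the message.
REMOTE_IMG = re.compile(r"""<img\b[^>]*\bsrc\s*=\s*["']?https?://""", re.IGNORECASE)

# Matches an <a> tag whose href is an absolute http(s) URL.
REMOTE_LINK = re.compile(r"""<a\b[^>]*\bhref\s*=\s*["']?https?://""", re.IGNORECASE)


def has_remote_image(raw_html):
    """Return True if the raw HTML references at least one remote image."""
    return REMOTE_IMG.search(raw_html) is not None


def has_remote_link(raw_html):
    """Return True if the raw HTML contains at least one remote anchor."""
    return REMOTE_LINK.search(raw_html) is not None


# The sample from the original description.
sample = (
    '<x-html><html>\n'
    '<a href="http://some.host/some.htm">\n'
    '<img src="http://some.host/some.img">\n'
    '</a></html></x-html>'
)
print(has_remote_image(sample), has_remote_link(sample))
```

An image embedded in the message and referenced with a `cid:` URL does not
match REMOTE_IMG, which is the distinction the original report draws between
inline attachments and non-local sources.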