Bug 3576 - Not valid ISO codes should be tagged
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 2.63
Hardware: All All
Importance: P5 minor
Target Milestone: 3.1.0
Assignee: Daniel Quinlan
Reported: 2004-07-07 05:56 UTC by Frank Urban
Modified: 2005-01-21 02:11 UTC

Attachment | Type | Status | Submitter/CLA Status
All valid code pages according to the previously mentioned url (man I love regex ;) ) | text/plain | None | Jesse Houwing [HasCLA]
rules to catch invalid charsets in content-type, subject and html | text/plain | None | Jesse Houwing [HasCLA]
Example for wrong hit 1 | text/plain | None | Frank Urban [NoCLA]
Example for wrong hit 2 | text/plain | None | Frank Urban [NoCLA]
Example for wrong hit 3 | text/plain | None | Frank Urban [NoCLA]
Test Perl program for regex | text/plain | None | Frank Urban [NoCLA]
New, optimized incorrect charset catcher | text/plain | None | Jesse Houwing [HasCLA]
Stupid last error fixed | text/plain | None | Jesse Houwing [HasCLA]
This should do it (for real) | text/plain | None | Jesse Houwing [HasCLA]
Updated to swim around two FP's | text/plain | None | Jesse Houwing [HasCLA]
Updated to swim around the other FP's that were logged. | text/plain | None | Jesse Houwing [HasCLA]
Another update to fix the =3d issues | text/plain | None | Jesse Houwing [HasCLA]

Description Frank Urban 2004-07-07 05:56:29 UTC
We are receiving a lot of mails with an invalid ISO code in the
header, such as "iso-8237-4". It would be nice if such invalid ISO codes could be
tacked.
Comment 1 Frank Urban 2004-07-07 05:58:13 UTC
I mean tagged not tacked :)
Comment 2 Jesse Houwing 2004-07-07 11:46:32 UTC
All valid charsets are listed here:

http://www.iana.org/assignments/character-sets

I might do something with this later this week.
Comment 3 Jesse Houwing 2004-07-07 14:08:38 UTC
Created attachment 2098
All valid code pages according to the previously mentioned url (man I love regex ;) )

I attached a list of all available code pages and aliases. I'm writing a rough
rule from this, but it would probably need some optimization ;)
Comment 4 Jesse Houwing 2004-07-08 01:41:09 UTC
Created attachment 2099
rules to catch invalid charsets in content-type, subject and html

I've attached a set of rules to do what you asked, but the results (at least on
my system) are disappointing (only one spam hit in 85000 messages). At least it
does not hit ham.
Comment 5 Frank Urban 2004-07-08 01:50:13 UTC
Wow. 12 hours to fix a call! We get a lot of such mails. I will test it and give
you a response soon.
Comment 6 Frank Urban 2004-07-08 04:01:58 UTC
I've got hundreds of wrong hits. It seems that the tagged mails do not have
an entry in the header with "charset=xxxx". How does this rule work? Every hit
is an invalid_charset_2 hit. What is rawbody?
Comment 7 Frank Urban 2004-07-08 04:16:49 UTC
Here is one example of a mail which was tagged as invalid_charset_2, but I
expected only an invalid_charset_1 hit for this mail:

Received: from pfx2.example.com (sg001168.intranet.example.com 
[140.100.200.10]) by sv028081.exchange.example.com with SMTP (Microsoft 
Exchange Internet Mail Service Version 5.5.2657.72)
	id 3PVCBKYQ; Thu, 8 Jul 2004 13:11:26 +0200
Received: from localhost (localhost [127.0.0.1])
	by pfx2.example.com (example Internal Mail-System) with ESMTP id 
BC99612ED4
	for <SPAMBuffer@example.com>; Thu,  8 Jul 2004 13:11:27 +0200 (CEST)
Received: by pfx2.example.com (example Internal Mail-System, from userid 501)
	id 6CB3812FBE; Thu,  8 Jul 2004 13:11:27 +0200 (CEST)
Received: from mail.example.com (extern.postfix.example.com [140.100.155.100])
	by pfx2.example.com (example Internal Mail-System) with ESMTP id 
5965512EC2
	for <SPAMBuffer@example.com>; Thu,  8 Jul 2004 13:11:27 +0200 (CEST)
Received: by mail.example.com (example Mail-System, from userid 501)
	id 209582641; Thu,  8 Jul 2004 11:11:25 +0000 (UTC)
Received: from localhost by mail4.example.com
	with SpamAssassin (2.63 2004-01-11);
	Thu, 08 Jul 2004 13:11:24 +0200
From: "Royce Peterson" <celrlrogfht@icq.com>
To: iain.barbour@exampleib.com
Subject: *****SPAM***** this is the best  brindisi squalid
Date: Thu, 08 Jul 2004 13:19:15 +0200
Message-Id: <20040708111111.78E95260B@mail.example.com>
X-Spam-DCC: xmailer: mail4 1192; Body=1 Fuz1=1 Fuz2=1
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on 
	mail4.example.com
X-Spam-Level: *********************
X-Spam-Status: Yes, hits=21.8 required=6.3 tests=BAYES_99,
	FORGED_RCVD_NET_HELO,HTML_50_60,HTML_FONT_BIG,HTML_FONT_INVISIBLE,
	HTML_MESSAGE,HTML_TITLE_UNTITLED,INVALID_CHARSET_2,MIME_HTML_ONLY,
	MIME_HTML_ONLY_MULTI,MSGID_FROM_MTA_SHORT,RCVD_IN_BL_SPAMCOP_NET,
	RCVD_IN_DSBL,RCVD_IN_NJABL,RCVD_IN_NJABL_PROXY,RCVD_IN_RFCI,
	RCVD_IN_SORBS,RCVD_IN_SORBS_HTTP,RCVD_IN_SORBS_SOCKS autolearn=spam 
	version=2.63
X-Spam-Pyzor: Reported 0 times.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----------=_40ED2BDC.CB948519"
X-AntiVirus: checked by AntiVir MailGate (version: 2.0.2-6; AVE: 6.26.0.3; VDF: 
6.26.0.19; host: pfx2)

This is a multi-part message in MIME format.

------------=_40ED2BDC.CB948519
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable


Your Email was identified as SPAM and did not reached the recipient.
Please check for the reasons below.

Ihre Email ist als SPAM Mail identifiziert worden und wurde dem
Empfaenger nicht zugestellt. Die Begruendung finden Sie weiter unten.

Contact address:  SPAM@example.com.

Content preview:  Untitled Document Say goodbye to expensive Refills! We
  are not retreating - we are advancing in another Direction. - General
  Douglas MacArthur (1880-1964) R_X Warehouse Direct!
  URI:http://Geraldine.tgoiwe.com/_55d958a932f5b91262baa654773c6a8e/
  >>more info<< [...]=20

Content analysis details:   (21.8 points, 6.3 required)
 0.1 HTML_MESSAGE           BODY: HTML included in message
 0.3 HTML_FONT_BIG          BODY: HTML has a big font
 5.4 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 1.0000]
 0.3 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.4 HTML_TITLE_UNTITLED    BODY: HTML title contains "Untitled"
 0.6 HTML_FONT_INVISIBLE    BODY: HTML font color is same as background
 0.1 HTML_50_60             BODY: Message is 50% to 60% HTML
 1.0 INVALID_CHARSET_2      BODY: INVALID_CHARSET_2
 3.0 MSGID_FROM_MTA_SHORT   Message-Id was added by a relay
 4.1 FORGED_RCVD_NET_HELO   Host HELO'd using the wrong IP network
 1.1 RCVD_IN_SORBS_HTTP     RBL: SORBS: sender is open HTTP proxy server
                            [210.205.152.10 listed in dnsbl.sorbs.net]
 0.5 RCVD_IN_NJABL_PROXY    RBL: NJABL: sender is an open proxy
                            [210.205.152.10 listed in dnsbl.njabl.org]
 0.1 RCVD_IN_SORBS          RBL: SORBS: sender is listed in SORBS
                            [210.205.152.10 listed in dnsbl.sorbs.net]
 0.1 RCVD_IN_NJABL          RBL: Received via a relay in dnsbl.njabl.org
                            [210.205.152.10 listed in dnsbl.njabl.org]
 1.2 RCVD_IN_SORBS_SOCKS    RBL: SORBS: sender is open SOCKS proxy server
                            [210.205.152.10 listed in dnsbl.sorbs.net]
 0.7 RCVD_IN_DSBL           RBL: Received via a relay in list.dsbl.org
                            [<http://dsbl.org/listing?ip=3D210.205.152.10=
>]
 1.5 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
              [Blocked - see <http://www.spamcop.net/bl.shtml?210.205.152=
.10>]
 0.1 RCVD_IN_RFCI           RBL: Sent via a relay in ipwhois.rfc-ignorant=
.org
                            [$ has inaccurate or missing WHOIS data at th=
e]
                            [RIR]
 1.1 MIME_HTML_ONLY_MULTI   Multipart message only has text/html MIME par=
ts

The original message was not completely plain text, and may be unsafe to
open with some email clients; in particular, it may contain a virus,
or confirm that your address can receive spam.  If you wish to view
it, it may be safer to save it to a file and open it with an editor.


------------=_40ED2BDC.CB948519
Content-Type: message/rfc822; x-spam-type=original
Content-Description: original message before SpamAssassin
Content-Disposition: attachment
Content-Transfer-Encoding: 7bit

Return-Path: <celrlrogfht@icq.com>
Received: from 212.149.48.150 (unknown [210.205.152.10])
	by mail.example.com (example Mail-System) with SMTP
	id 78E95260B; Thu,  8 Jul 2004 13:11:11 +0200 (CEST)
Original-Encoded-Information-Types: multipart/alternative
Language: English
Disclose-Recipients: No
Reply-To: "Royce Peterson" <celrlrogfht@icq.com>
From: "Royce Peterson" <celrlrogfht@icq.com>
To: iain.barbour@exampleib.com
Subject: this is the best  brindisi squalid
Date: Thu, 08 Jul 2004 13:19:15 +0200
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="--613821744884439"
Message-Id: <20040708111111.78E95260B@mail.example.com>

----613821744884439
Content-Type: text/html;
	charset="iso-3846-0"
Content-Transfer-Encoding: 7Bit

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>

<body bgcolor="#0099FF" text="#FFFFFF" link="#FFFFFF">
<p><font color="#FFFF33" size="4" face="Arial, Helvetica, sans-serif"></font> 
  <font face="Arial, Helvetica, sans-serif">Say goodbye to expensive Refills!
<br>
  <font color="#0099FF">We are not retreating - we are advancing in another 
Direction. - General Douglas MacArthur (1880-1964) </font></font></p>
<h2><font face="Arial, Helvetica, sans-serif">R_X Warehouse Direct!</font> </h2>
<p><font face="Arial, Helvetica, sans-serif"><a 
href="http://Geraldine.tgoiwe.com/_55d958a932f5b91262baa654773c6a8e/">&gt;&gt;mo
re 
  info&lt;&lt;</a></font></p>
<p><font color="#0099FF" face="Arial, Helvetica, sans-serif">Everyone is a 
genius at least once a year; a real genius has his original ideas closer 
together. - Georg Lichtenberg (1742-1799) <br><br>
If you were plowing a field which would you rather use? Two strong oxen or 1024 
chickens? - Seymour Cray (1925-1996) father of supercomputing</font></p>
</body>
</html>

----613821744884439--

------------=_40ED2BDC.CB948519--
Comment 8 Jesse Houwing 2004-07-09 01:34:47 UTC
I'll look into it over the weekend. The rawbody rule hits on, for example, illegal
charset codes in the HTML content of a message.

You could temporarily disable the second rule until I've taken a look at it.
Remember that rules posted to the Bugzilla system might not work completely as
expected until the bug says so.

Please, also note that it is easier for people working on these bugs if you use
the attachment feature to attach sample messages.
Comment 9 Frank Urban 2004-07-09 02:07:41 UTC
Hi,

Thanks for the answer. I also tried to fix the problem myself, but I'm not a
fan of regex :)
I also found that my example was not the best. The rule really does match this
example. So I will attach some other examples of wrongly tagged mails to this
bug report, as you requested.
... and don't forget to do some other, more fun things at the weekend...

Greetings

Frank
Comment 10 Frank Urban 2004-07-09 02:18:12 UTC
Created attachment 2104
Example for wrong hit 1
Comment 11 Frank Urban 2004-07-09 02:18:32 UTC
Created attachment 2105
Example for wrong hit 2
Comment 12 Frank Urban 2004-07-09 02:18:46 UTC
Created attachment 2106
Example for wrong hit 3
Comment 13 Jesse Houwing 2004-07-10 09:54:04 UTC
Hmz... I don't see why it hits in those messages. As far as I can see the regex
is correct, I might be missing something...
Comment 14 Frank Urban 2004-07-12 00:12:37 UTC
Created attachment 2117
Test Perl program for regex
Comment 15 Frank Urban 2004-07-12 00:15:58 UTC
Hi,

Regex is not my world...
I attached a small test program for regex rules.
This program also tells me that something is wrong:

charset="us-ascii"
Matched: |<charset="us-ascii">|
charset="fggfh"
Matched: |<charset="fggfh">|
charset=us-ascii
No match.

As you can see, charset="us-ascii" matches and it shouldn't.
charset=us-ascii did not match, and that is OK.

Frank
Comment 16 Jesse Houwing 2004-07-13 13:37:01 UTC
Created attachment 2121
New, optimized incorrect charset catcher

This one should do better. Perl was backtracking around the ['"]? to make the
rule match. The rule is now written in such a way that this can't happen.

I optimized the list along the way.
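
The backtracking problem described here can be reproduced with a small sketch. It is written in Python rather than Perl, the three-entry charset list is only a stand-in for the attached list, and the names `naive` and `fixed` are made up for this illustration; the rule in the attachment may well be structured differently.

```python
import re

# Illustrative stand-in for the full list of valid charsets (assumption).
VALID = r"(?:us-ascii|iso-8859-1|utf-8)"

# The optional quote sits outside the lookahead. When the lookahead fails,
# the engine gives the quote back and retries the lookahead against the
# quote character itself, which is not a valid charset name, so it "hits".
naive = re.compile(r"charset=['\"]?(?!" + VALID + r")", re.IGNORECASE)

# The optional quote sits inside the negative lookahead. Any way of reading
# "optional quote + valid name" now blocks the hit, so backtracking over
# the quote can no longer produce a false match.
fixed = re.compile(
    r"charset=(?!['\"]?" + VALID + r"(?![\w-]))", re.IGNORECASE
)


def hits(pattern, text):
    """Return True when the pattern flags the text as an invalid charset."""
    return pattern.search(text) is not None


for sample in ('charset="us-ascii"', 'charset="fggfh"', "charset=us-ascii"):
    print(sample, "naive:", hits(naive, sample), "fixed:", hits(fixed, sample))
```

The naive pattern reproduces the test program output from comment 15: the quoted valid charset is flagged while the unquoted one is not.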
Comment 17 Jesse Houwing 2004-07-13 13:44:36 UTC
Created attachment 2122
Stupid last error fixed
Comment 18 Jesse Houwing 2004-07-13 15:43:30 UTC
Created attachment 2123
This should do it (for real)
Comment 19 Frank Urban 2004-07-13 22:25:53 UTC
Great! That looks much better. I've got 10 correct hits in 2 minutes.
It's very interesting that spammers are not able to use their tools. Or how else can I
get something like this?
Content-Type: text/html; charset=%CHARSET  :)
I will take a look at the tagged mails over the day.
It seems that we only get mails of the type SARE_ILL_CS_2, but a lot
of them.
Frank
Comment 20 Jesse Houwing 2004-07-13 22:40:08 UTC
Created attachment 2124
Updated to swim around two FP's
Comment 21 Frank Urban 2004-07-13 22:46:02 UTC
I have one FP so far, but with your first version:
<META http-equiv=3DContent-Type content=3D"text/html; charset=3Diso-8859-1">
Is that what you have fixed in the newest version?
Comment 22 Frank Urban 2004-07-13 22:51:12 UTC
FP or not FP.....
Content-Type: text/plain; charset=
This was a "not delivered" message. It had nothing after the =
But is that worth tagging?
Comment 23 Frank Urban 2004-07-13 22:53:12 UTC
Next FP with the newest version:

<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; charset=3Diso-885=
9-1">
Comment 24 Frank Urban 2004-07-13 22:54:40 UTC
And the next "not delivered" message from another company with only:
Content-Type: text/plain; charset=
It seems that this is normal.
Comment 25 Frank Urban 2004-07-13 23:02:16 UTC
Hard job:
<META http-equiv=3DContent-Type content=3D"text/html; charset=3Diso-8859-1"=
>
I think this kind of FP should be fixed before I send you more....
Is the CR at the end of the line a problem?
Comment 26 Frank Urban 2004-07-13 23:59:32 UTC
It seems that spaces are also hit:
Content-Type: text/plain; charset= "iso-8859-1"
Comment 27 Jesse Houwing 2004-07-14 00:01:31 UTC
I'll update the rule to handle this (it is quite easy):

Content-Type: text/plain; charset=

just add $| after (?!

I'll run some tests on this tonight

I'm having more trouble with the following: 

charset=3Diso-885=
9-1">

I can think of a rule to catch this, but it would be VERY ugly. I might just 
check if the line doesn't end with a '=' sign.

<META http-equiv=3DContent-Type content=3D"text/html; charset=3Diso-8859-1"=
>

Should already be fixed, if not please forward me the whole message privately.
Comment 28 Jesse Houwing 2004-07-14 00:01:57 UTC
Adding whitespace shouldn't be too hard. I'll have a look.
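
A rough sketch of how the empty `charset=` value and the whitespace case could both be allowed inside one negative lookahead. It is Python rather than Perl, the charset list is a tiny illustrative subset, and `rule` and `is_flagged` are hypothetical names; it is not the rule from the attachments.

```python
import re

# Illustrative subset of valid charset names (assumption).
VALID = r"(?:us-ascii|iso-8859-1|utf-8)"

# Inside the negative lookahead: optional whitespace, then either the end
# of the line (empty charset, the "$|" idea) or an optionally quoted valid
# name that is not followed by further name characters.
rule = re.compile(
    r"charset=(?!\s*(?:$|['\"]?" + VALID + r"(?![\w-])))",
    re.IGNORECASE | re.MULTILINE,
)


def is_flagged(line):
    """Return True when the line carries a charset the sketch rejects."""
    return rule.search(line) is not None


print(is_flagged("Content-Type: text/plain; charset="))              # empty value
print(is_flagged('Content-Type: text/plain; charset= "iso-8859-1"'))  # leading space
print(is_flagged("Content-Type: text/html; charset=%CHARSET"))        # template leftover
```

Quoted-printable soft line breaks such as `charset=3Diso-885=` are deliberately not handled here; they need decoding first.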
Comment 29 Frank Urban 2004-07-14 00:02:06 UTC
One of these two lines from one mail was tagged:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Type: text/x-vcard; charset=utf8;
Comment 30 Justin Mason 2004-07-14 00:08:06 UTC
it is interesting BTW.  I think it must be deliberate for some
reason.

--j.
Comment 31 Frank Urban 2004-07-14 00:13:27 UTC
All the HTML mail FPs were sent as attachments to a new mail. So that is the
reason for the CR in the mail body. I think the problem is the header of an
attached mail. What happens when I write something like this in the text of a mail?
charset="1234"
Comment 32 Frank Urban 2004-07-14 01:12:15 UTC
The major problem seems to be the CR. This mail was not an attachment:
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3DISO-8859-1">

Comment 33 Theo Van Dinter 2004-07-14 08:17:54 UTC
OMG that rule is ugly!

it's best redone as a plugin or eval test.  don't need RE to try to get the charset out (just call the internal 
functions to parse Content-Type), and the RE can probably be made into something more efficient like 
a table lookup.
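
A minimal sketch of that idea, parsing Content-Type with a mail library and checking the charset against a lookup table. It uses Python's standard `email` package as a stand-in for SpamAssassin's internal parsing functions, which are not shown in this bug; `KNOWN_CHARSETS` is a tiny illustrative subset rather than the IANA list, and `unknown_charsets` is a hypothetical helper name.

```python
from email import message_from_string

# Small illustrative table; a real check would load the full IANA
# character-set registry plus its aliases (assumption).
KNOWN_CHARSETS = frozenset(
    {"us-ascii", "iso-8859-1", "iso-8859-15", "utf-8", "windows-1252"}
)


def unknown_charsets(raw_message):
    """Return the declared charsets of all MIME parts that are not in the table."""
    msg = message_from_string(raw_message)
    found = []
    for part in msg.walk():
        # get_content_charset() parses the Content-Type header and returns
        # the charset parameter in lower case, or None when it is absent.
        charset = part.get_content_charset()
        # An empty value is skipped, since comments 22 and 24 suggest that
        # a bare "charset=" shows up in ordinary bounce messages.
        if charset and charset not in KNOWN_CHARSETS:
            found.append(charset)
    return found


SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    'Content-Type: text/html; charset="iso-3846-0"\n'
    "\n"
    "<html></html>\n"
    "--b1\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "hello\n"
    "--b1--\n"
)

print(unknown_charsets(SAMPLE))
```

A set lookup avoids the large alternation in the regex, and the header parser takes care of quoting and case.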
Comment 34 Jesse Houwing 2004-07-14 09:48:55 UTC
I'm all for it, but haven't got the experience in Perl to get this done.

It would need to check both the content-type header and the HTML tags that can
contain a charset.
Comment 35 Jesse Houwing 2004-07-14 10:37:52 UTC
Created attachment 2125
Updated to swim around the other FP's that were logged.

New rule results are below. The one FP is from someone setting the charset to
"ansi", which is not a valid value. But since it is a Microsoft tech newsletter, I might
add it anyway.

OVERALL     SPAM      HAM     S/O    RANK   SCORE  NAME
  35727    12376    23351    0.346   0.00    0.00  (all messages)
    665      664        1    0.999   1.00    0.50  SARE_ILL_CS_2
      1        1        0    1.000   0.00    0.50  SARE_ILL_CS_1

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  35727    12376    23351    0.346   0.00    0.00  (all messages)
100.000  34.6405  65.3595    0.346   0.00    0.00  (all messages as %)
  1.861   5.3652   0.0043    0.999   1.00    0.50  SARE_ILL_CS_2
  0.003   0.0081   0.0000    1.000   0.00    0.50  SARE_ILL_CS_1

I've disabled the subject, from and to lines because they're not catching
anything.
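
For readers unfamiliar with the S/O column, here is a small sketch of how the SARE_ILL_CS_2 figures can be recomputed. It assumes S/O is the spam percentage divided by the sum of the spam and ham percentages; the numbers in the table are consistent with that reading, but the column is not defined in this bug. The function name is hypothetical.

```python
def spam_over_overall(spam_hits, ham_hits, spam_total, ham_total):
    """S/O under the assumption spam% / (spam% + ham%); needs at least one hit."""
    spam_pct = 100.0 * spam_hits / spam_total
    ham_pct = 100.0 * ham_hits / ham_total
    return spam_pct / (spam_pct + ham_pct)


# Corpus sizes and hit counts taken from the table in this comment.
SPAM_TOTAL, HAM_TOTAL = 12376, 23351

print(round(100.0 * 664 / SPAM_TOTAL, 4))   # SPAM% for SARE_ILL_CS_2
print(round(100.0 * 1 / HAM_TOTAL, 4))      # HAM% for SARE_ILL_CS_2
print(round(spam_over_overall(664, 1, SPAM_TOTAL, HAM_TOTAL), 3))
```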
Comment 36 Frank Urban 2004-07-14 22:15:26 UTC
Morning :)
Like every day, it looks much better. No FP so far, but a lot of hits:
out.26275:      charset="iso-9999-9"
out.26287:      charset="iso-3808-1"
out.26299:      charset=3D"iso-2D52-3"
out.26311:      charset="iso-9680-8"
out.26323:      charset="iso-4458-8"
out.26362:      charset="iso--"
out.26395:      charset="iso-5833-9"
out.26407:      charset="iso-5305-3"
out.26419:      charset="iso-9089-6"
out.26431:      charset="iso-9976-8"
I will know by this afternoon whether it is OK now.
Comment 37 Frank Urban 2004-07-14 22:55:30 UTC
200 hits and no FP so far...
Comment 38 Frank Urban 2004-07-14 23:14:37 UTC
Now I have the first wrong hit. It was a very colorful joke mail:
Content-Type: text/plain; charset=iso-8859-1
Content-Type: text/html; charset=iso-8859-1
<META http-equiv=3DContent-Type content=3D"text/html; charset=3Diso-8859-1"=
>
and I think the last line was the problem...
Comment 39 Jesse Houwing 2004-07-14 23:58:21 UTC
That seems to be backtracking related as well... I think I know how to catch
this bugger. I'm testing again tonight. The change would be:

"(?:3d?)?" -> "(?:3d?)?(?!3d)"
Comment 40 Frank Urban 2004-07-15 00:30:37 UTC
It seems that everything is allowed:
<META http-equiv=3DContent-Type content=3D"text/html; charset=3DISO-8859-1">
Comment 41 Frank Urban 2004-07-15 00:34:12 UTC
Here I'm not sure if this was a spam mail or not:
Content-type: text/plain; charset=x-euc-jp
Is that a valid ISO code?
Comment 42 Frank Urban 2004-07-15 00:36:13 UTC
This was not spam, but I think the ISO code is not valid:
Content-Type: text/html; charset=Cp1252
Comment 43 Frank Urban 2004-07-15 00:38:16 UTC
Oh, the next mail with that:
Content-Type: text/plain; charset=Cp1252
It seems that this is a valid code after all.
Comment 44 Frank Urban 2004-07-15 03:32:05 UTC
One of them matched:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Type: text/x-vcard; charset=utf8;
Comment 45 Frank Urban 2004-07-15 03:38:34 UTC
Wrong hit:
<META http-equiv=3DContent-Type content=3D"text/html; charset=3Dunicode">
Comment 46 Frank Urban 2004-07-15 03:42:56 UTC
What is wrong here?
Content-Type: text/html; charset="ISO-8859-1"
Content-Type: text/plain; charset="ISO-8859-1"
Comment 47 Frank Urban 2004-07-15 03:56:40 UTC
This also happens sometimes in newsletters, but I think it's OK to leave them tagged:
Content-Type: text/plain; charset=ISO8859-1
Comment 48 Jesse Houwing 2004-07-15 13:30:33 UTC
Created attachment 2134
Another update to fix the =3d issues
Comment 49 Roy Badami 2004-07-15 19:42:53 UTC
I may be being dumb here, but I thought SpamAssassin 3 was capable of looking at
the message body after content-transfer-encodings had been dealt with?

It seems to me that a rule looking at body text really shouldn't have to
explicitly know about '=' being encoded in quoted-printable as '=3D'.

Surely there must be a better way?
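
To illustrate the point, here is a sketch of what decoding the content-transfer-encoding first would look like, using Python's standard `quopri` module on the line from comment 23. It only shows the decoding step; it says nothing about how SpamAssassin 3 exposes decoded bodies to rules, and the simple charset-extraction regex is an assumption made for the example.

```python
import quopri
import re

# Quoted-printable body line from comment 23, including the soft line
# break ("=" at the end of the line) that splits the charset name.
raw = (
    b'<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; charset=3Diso-885=\n'
    b'9-1">'
)

# Decoding turns "=3D" back into "=" and removes the soft line break.
decoded = quopri.decodestring(raw).decode("ascii")
print(decoded)

# After decoding, a plain pattern is enough to pull out the charset name.
match = re.search(r"charset=['\"]?([\w-]+)", decoded, re.IGNORECASE)
charset = match.group(1) if match else None
print(charset)
```

Once the body is decoded, the rule no longer needs the `(?:3d?)?` workaround or any special handling of line-ending `=` signs.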
Comment 50 Frank Urban 2004-07-15 22:11:36 UTC
You mean something like:
body SARE_ILL_CS       Content-transfer =~ /ansi|unicode|437|8(?:5[/i
That would be helpful.
Comment 51 Frank Urban 2004-07-15 22:29:09 UTC
It seems that one valid code is missing from the list:
iso-8851-15
Comment 52 Frank Urban 2004-07-15 23:41:25 UTC
Another valid code which is missing:
windows-874
Comment 53 Frank Urban 2004-07-16 00:30:26 UTC
There are a few hits I can't understand,
from one mail:
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset="ISO-8859-1"
Content-Disposition: inline
Content-Length: 8217
...
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="ISO-8859-1"
Content-Disposition: inline
Content-Length: 1585
Comment 54 Frank Urban 2004-07-16 00:31:49 UTC
Valid ISO code which is missing:
Content-type: text/plain; charset=x-euc-jp
Comment 55 Frank Urban 2004-07-16 00:37:31 UTC
Valid code missing:
Content-Type: text/plain; charset=Cp1252
Comment 56 Jesse Houwing 2004-07-19 16:31:32 UTC
Apart from some missing charsets... is this rule working OK now?
Comment 57 Frank Urban 2004-07-19 16:31:50 UTC
Subject: Out of Office AutoReply:  Not valid ISO codes should be
	 tagged

19.07.04 - 03.08.04
For problems and questions, please send a mail to
mailservices@commerzbank.com.

Comment 58 Justin Mason 2004-07-19 16:56:10 UTC
lol!  Outlook meeting its usual quality standards there ;)
Comment 59 Frank Urban 2004-08-04 00:02:43 UTC
Outlook was here before me :)
It's hard work to switch 70,000 mailboxes to another system.
.. but the backbone is Linux. And that's my part.
I can't see any other problem at the moment. But it would be nice to have a
version with all valid codes, because I get a copy of every tagged mail and
there are too many mails with the missing valid codes for me to check them all.
Can you include the missing codes, and I will take a look at it for some more
days?
Frank
Comment 60 Frank Urban 2004-08-12 05:35:40 UTC
Hi,
is there a chance that this problem will be solved soon?
Frank
Comment 61 Daniel Quinlan 2005-01-21 11:10:55 UTC
If I already did this...
Comment 62 Daniel Quinlan 2005-01-21 11:11:48 UTC
... then I can close it as FIXED in HEAD.

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 394900   318772    76128    0.807   0.00    0.00  (all messages)
100.000  80.7222  19.2778    0.807   0.00    0.00  (all messages as %)
  5.810   7.1973   0.0000    1.000   0.95    1.00  MIME_BAD_ISO_CHARSET