SA Bugzilla – Bug 6183
ISO-2022-JP false positives on FM_FRM_RN_L_BRACK
Last modified: 2009-09-03 14:21:03 UTC
FM_FRM_RN_L_BRACK is described as having a > without < in From. It seems some legitimate ISO-2022-JP mail can trigger this rule in error. Attaching a few examples.
Created attachment 4522
mbox containing 4 samples showing FM_FRM_RN_L_BRACK bug
great! keep 'em coming ;)
if we want to change this for 3.3.0, it needs to be in SVN by this Thursday; see bug 6155.
describe FM_FRM_RN_L_BRACK From name has > but not <

I have access to 13 of the 45 spam hits of FM_FRM_RN_L_BRACK in my own corpus. They are confirmed spam, but they are all Japanese ISO-2022-JP mail, without the broken brackets the rule describes. Are the other spam hits of this rule also Japanese ISO-2022-JP without broken brackets?
Created attachment 4525
Example ISO-2022-JP "From" that does not trigger FM_FRM_RN_L_BRACK
rulesrc/sandbox/emailed/00_FVGT_File001.cf

header __FROM_LEFT_BRACK From:name =~ /</
header __FROM_RIGH_BRACK From:name =~ />/
meta FM_FRM_RN_L_BRACK (__FROM_RIGH_BRACK && !__FROM_LEFT_BRACK)
describe FM_FRM_RN_L_BRACK From name has > but not <

__FROM_LEFT_BRACK is somehow broken? Any ideas?

If we can't fix this, perhaps we are better off disabling this rule. All of the ham and spam in my corpus that triggers FM_FRM_RN_L_BRACK show that the rule is incorrect. This rule isn't identifying spam. It is identifying a certain subset of ISO-2022-JP Japanese mail. In the past all Japanese mail might have been in the spam corpus, without ham samples, so we didn't notice this problem.
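One way the rule could fire on legitimate Japanese mail, sketched in Python. This is only a guess at the mechanism: it assumes the patterns are applied to the raw, undecoded ISO-2022-JP bytes of the display name, which nothing in this report confirms. ISO-2022-JP encodes each JIS X 0208 character as two 7-bit bytes in the range 0x21 to 0x7E, so the byte 0x3E (">") can occur inside a kanji with no 0x3C ("<") anywhere in the name. The function names are hypothetical and are not part of SpamAssassin.

```python
import re

LEFT_BRACK = re.compile(rb"<")   # mirrors __FROM_LEFT_BRACK
RIGHT_BRACK = re.compile(rb">")  # mirrors __FROM_RIGH_BRACK


def fm_frm_rn_l_brack(name: bytes) -> bool:
    """Mirror of the meta rule: the name contains '>' but no '<'."""
    return bool(RIGHT_BRACK.search(name)) and not LEFT_BRACK.search(name)


def kanji_that_trip_rule() -> list:
    """CJK ideographs whose ISO-2022-JP bytes contain 0x3E but not 0x3C."""
    hits = []
    for codepoint in range(0x4E00, 0xA000):
        char = chr(codepoint)
        try:
            raw = char.encode("iso2022_jp")
        except UnicodeEncodeError:
            continue  # character is not representable in ISO-2022-JP
        if fm_frm_rn_l_brack(raw):
            hits.append(char)
    return hits


if __name__ == "__main__":
    tripping = kanji_that_trip_rule()
    print(len(tripping), "ideographs would trip the rule on raw bytes")
    # A plain ASCII name without brackets does not trip it.
    print(fm_frm_rn_l_brack("Taro Yamada".encode("iso2022_jp")))
```

If the assumption holds, any display name containing one of these characters, and none whose bytes include 0x3C, would match regardless of whether the sender is spam or ham.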
(In reply to comment #6)
> rulesrc/sandbox/emailed/00_FVGT_File001.cf
>
> header __FROM_LEFT_BRACK From:name =~ /</
> header __FROM_RIGH_BRACK From:name =~ />/
> meta FM_FRM_RN_L_BRACK (__FROM_RIGH_BRACK && !__FROM_LEFT_BRACK)
> describe FM_FRM_RN_L_BRACK From name has > but not <
>
> __FROM_LEFT_BRACK is somehow broken? Any ideas?
>
> If we can't fix this, perhaps we are better off disabling this rule. All of
> the ham and spam in my corpus that triggers FM_FRM_RN_L_BRACK show that the
> rule is incorrect. This rule isn't identifying spam. It is identifying a
> certain subset of ISO-2022-JP Japanese mail. In the past all Japanese mail
> might have been in the spam corpus, without ham samples, so we didn't notice
> this problem.

May I suggest:

header __FROM_LEFT_BRACK From:name =~ /^</
header __FROM_RIGH_BRACK From:name =~ />$/
meta FM_FRM_RN_L_BRACK (__FROM_RIGH_BRACK && !__FROM_LEFT_BRACK)
describe FM_FRM_RN_L_BRACK From name has > but not <

comments?
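A rough Python sketch of how the anchored patterns behave compared with the original ones. The sample names are invented for illustration and do not come from any corpus; the helper names are hypothetical. Python's `re.search` is used as a stand-in for the Perl matching that SpamAssassin performs, which is equivalent for these simple patterns.

```python
import re


def meta_hit(name: str, left: str, right: str) -> bool:
    """True when `right` matches the name and `left` does not."""
    return bool(re.search(right, name)) and not re.search(left, name)


def original(name: str) -> bool:
    """Unanchored patterns: '>' anywhere, '<' nowhere."""
    return meta_hit(name, r"<", r">")


def anchored(name: str) -> bool:
    """Anchored patterns: '>' at the end, no '<' at the start."""
    return meta_hit(name, r"^<", r">$")


if __name__ == "__main__":
    samples = [
        "Special offer> now",  # '>' in the middle of the name
        "Special offer>",      # '>' at the end
        "<Special offer>",     # balanced brackets
        "a<b>",                # '<' present, but not at the start
    ]
    for name in samples:
        print(f"{name!r}: original={original(name)} anchored={anchored(name)}")
```

The anchored variant only fires when ">" is the last character of the name, and it ignores any "<" that is not the first character, so the two versions disagree on names such as the first and last samples.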
http://ruleqa.spamassassin.org/20090831-r809502-n/
http://ruleqa.spamassassin.org/20090901-r809894-n

FM_FRM_RN_L_BRACK disappeared between these two masscheck runs. What happened?
Bug #5201 and Bug #6082 are the same issue. I now understand this isn't the address portion of From but the free string of the name which is usually prior to the address. Due to the rule disappearing in masscheck and my poor understanding of this code I am unable to test the suggested rule in Comment #7.
(In reply to comment #7)
> May I suggest:
>
> header __FROM_LEFT_BRACK From:name =~ /^</
> header __FROM_RIGH_BRACK From:name =~ />$/
> meta FM_FRM_RN_L_BRACK (__FROM_RIGH_BRACK && !__FROM_LEFT_BRACK)
> describe FM_FRM_RN_L_BRACK From name has > but not <
>
> comments?

unfortunately that misses the two good hits I have in my corpus. the > is halfway through the line.

here's a fix:

: 45...; svn commit -m "bug 6183: avoid ISO-2022-JP FPs on FM_FRM_RN_L_BRACK rule"
Sending rulesrc/sandbox/emailed/00_FVGT_File001.cf
Adding t.rules/FM_FRM_RN_L_BRACK
Adding t.rules/FM_FRM_RN_L_BRACK/bug6183_hit1
Adding t.rules/FM_FRM_RN_L_BRACK/bug6183_hit2
Adding t.rules/FM_FRM_RN_L_BRACK/fp_bug6183_att4522_1
Adding t.rules/FM_FRM_RN_L_BRACK/fp_bug6183_att4522_2
Adding t.rules/FM_FRM_RN_L_BRACK/fp_bug6183_att4522_3
Adding t.rules/FM_FRM_RN_L_BRACK/fp_bug6183_att4522_4
Adding t.rules/FM_FRM_RN_L_BRACK/fp_bug6183_att4525
Transmitting file data ........
Committed revision 811129.