SA Bugzilla – Bug 7192
US Dollars rules FP
Last modified: 2017-04-15 09:58:06 UTC
I pulled a few messages out of the quarantine this evening that hit NA_DOLLARS, US_DOLLARS_3 and MILLION_USD solely based on the fact that there was a legitimate discussion between a CFO and their accounting firm. Is it really the case that all that's necessary for an email to be considered spam is a discussion of large sums of money without any other classifier?

I realize I could of course make the scores lower, but that's not good, guys. Any Nigerian scam or other effort to extort or steal money surely would involve some other characteristic, no?

Is there anything more that can be done here? I bet TxRep might help, but should that be necessary?

__FRAUD_KDT ======> got hit: "USD $6,000,000"
MILLION_USD ======> got hit: "million U.S. dollars (USD"
__LOTSA_MONEY_01 ======> got hit: "$ 6,000,000"
__hk_bigmoney ======> got hit: "$ 6,000,000"
__FRAUD_DBI ======> got hit: "Euros"
NA_DOLLARS ======> got hit: "million U.S. dollar"
__LOTSA_MONEY_04 ======> got hit: "million Euros"
US_DOLLARS_3 ======> got hit: "$ 6,000,000"
__KAM_REFI4 ======> got hit: "$6,000"
__FRAUD_LTX ======> got hit: "million U.S. dollars"
__HUSH_HUSH ======> got hit: "confidentiality"

X-Spam-Report:
    * -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3)
    *      [208.70.234.51 listed in wl.mailspike.net]
    *  0.0 RELAYCOUNTRY_US Relayed through United States
    * -0.0 SPF_PASS SPF: sender matches SPF record
    * -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay
    *      domain
    *  3.6 NA_DOLLARS BODY: Talks about a million North American dollars
    *  1.8 US_DOLLARS_3 BODY: Mentions millions of $ ($NN,NNN,NNN.NN)
    *  3.2 MILLION_USD BODY: Talks about millions of dollars
    *  0.0 HTML_MESSAGE BODY: HTML included in message
    *  0.1 LOC_CDIS_INLINE BODY: Content-Disposition: inline
    * -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
    *      [score: 0.0000]
    *  0.0 LOTS_OF_MONEY Huge... sums of money
    * -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders
    *  0.1 LOC_IMGSPAM Probably inline image
    *  0.0 SAGREY Adds 0.01 to spam from first-time senders
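To make the complaint concrete, here is a rough tally of the report above. It is only a sketch: the per-rule numbers are the rounded values shown in the X-Spam-Report, and the 5.0 threshold is SpamAssassin's default required_score, which is an assumption here since the reporter's actual threshold is not stated in this bug.

```python
# Scores as displayed in the X-Spam-Report above (already rounded to one decimal).
report = {
    "RCVD_IN_MSPIKE_H3": -0.0,
    "RELAYCOUNTRY_US": 0.0,
    "SPF_PASS": -0.0,
    "T_RP_MATCHES_RCVD": -0.0,
    "NA_DOLLARS": 3.6,
    "US_DOLLARS_3": 1.8,
    "MILLION_USD": 3.2,
    "HTML_MESSAGE": 0.0,
    "LOC_CDIS_INLINE": 0.1,
    "BAYES_00": -1.9,
    "LOTS_OF_MONEY": 0.0,
    "RCVD_IN_MSPIKE_WL": -0.0,
    "LOC_IMGSPAM": 0.1,
    "SAGREY": 0.0,
}

# The three rules this bug is about.
MONEY_RULES = ("NA_DOLLARS", "US_DOLLARS_3", "MILLION_USD")

# Assumption: the stock default required_score; the reporter's site may differ.
ASSUMED_REQUIRED_SCORE = 5.0

total = round(sum(report.values()), 1)
money = round(sum(report[name] for name in MONEY_RULES), 1)
without_money = round(total - money, 1)

print(f"total score:           {total}")
print(f"from the 3 money rules: {money}")
print(f"everything else:        {without_money}")
print(f"over assumed threshold: {total >= ASSUMED_REQUIRED_SCORE}")
```

Under those assumptions the three money rules contribute more than the whole threshold on their own, while every other test on the message nets out negative.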
KAM's additional comment was to move the rules from stock to sandbox so they are evaluated for rule promotion and better scoring.
I've created 20_rules_to_sandbox.cf in my kmcgrail sandbox and removed the force_publish for these 3 rules (MILLION_USD, NA_DOLLARS & US_DOLLARS) so they will go through ruleqa, be given better scores, and have their S/O analyzed for promotion.

If this works, we should look at moving all rules to the sandbox so everything goes through ruleqa.

svn commit -m 'Bug 7192 moving MILLION_USD, NA_DOLLARS & US_DOLLARS to sandbox for ruleqa/promotion, etc.'
Sending rules/20_phrases.cf
Sending rules/30_text_de.cf
Sending rules/30_text_fr.cf
Sending rules/30_text_nl.cf
Sending rules/30_text_pl.cf
Sending rules/30_text_pt_br.cf
Sending rules/50_scores.cf
Sending rulesrc/10_force_active.cf
Adding rulesrc/sandbox/kmcgrail/20_rules_to_sandbox.cf
Transmitting file data .........
Committed revision 1679253.

regards,
KAM
No updates at all in the last 24hrs. Just to clarify, should I expect to see 20_rules_to_sandbox.cf with the next sa-update and the MILLION_USD, NA_DOLLARS & US_DOLLARS rules removed from normal distribution?
(In reply to Alex from comment #3)
> No updates at all in the last 24hrs. Just to clarify, should I expect to see
> 20_rules_to_sandbox.cf with the next sa-update and the MILLION_USD,
> NA_DOLLARS & US_DOLLARS rules removed from normal distribution?

You will not see 20_rules_to_sandbox. The rules are now in the sandbox, so they are relegated to ruleqa, and promotion to live rules is determined based on their merit.

Rules are published nightly *if* everything goes well. Last night, it didn't have enough SPAM:

HAM: 227525 (150000 required)
SPAM: 145299 (150000 required)
Insufficient spam corpus to generate scores; aborting.
Exit Status 9 is not zero for do-nightly-rescore-example

If you join the ruleqa@ list, you can see these reported.

Anyway, give it a few more days and let me know if you see any of the rules disappear or the scores change. This is considered resolved, but the rules update is pending that process and we can tweak things based on that when it occurs.

Regards,
KAM
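For anyone reading that log output, the check it reports can be sketched roughly as follows. This is not the actual do-nightly-rescore-example logic; the function names are hypothetical, and the only things taken from the log are the two corpus counts and the 150000 requirement printed next to each.

```python
# Minimum number of messages required per class, as printed in the log above.
REQUIRED_PER_CLASS = 150000


def corpus_shortfall(ham, spam, required=REQUIRED_PER_CLASS):
    """Return how many messages each class is short of the requirement (0 if enough)."""
    return {
        "ham": max(0, required - ham),
        "spam": max(0, required - spam),
    }


def can_rescore(ham, spam, required=REQUIRED_PER_CLASS):
    """True only when both the ham and the spam corpus meet the requirement."""
    return all(missing == 0 for missing in corpus_shortfall(ham, spam, required).values())


# Counts from the failed nightly run quoted above.
shortfall = corpus_shortfall(ham=227525, spam=145299)
print(shortfall)
print(can_rescore(ham=227525, spam=145299))
```

With the numbers from that run, ham is comfortably over the requirement and spam is a few thousand messages short, which is why no update was published that night.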
None of the rules were considered for automatic promotion and all are removed from the latest rule update:

http://ruleqa.spamassassin.org/?daterev=20150514-r1679324-n&rule=MILLION_USD+NA_DOLLARS+US_DOLLARS_3&srcpath=&g=Change

They might be worth pushing out with a lower score ceiling if someone has any input.
(In reply to Kevin A. McGrail from comment #5)
> None of the rules were considered for automatic promotion and all are
> removed from the latest rule update:
> http://ruleqa.spamassassin.org/?daterev=20150514-r1679324-n&rule=MILLION_USD+NA_DOLLARS+US_DOLLARS_3&srcpath=&g=Change
>
> They might be worth pushing out with a lower score ceiling if someone has
> any input.

They overlap with LOTSA_MONEY, which does go out.
This was labeled as version 3.4.1 but was committed and closed after the 3.4.1 release and was never ported to the 3.4 branch. I'm changing the target version to 3.4.2 and will port the changes to the branch.
Committed to 3.4 branch merge from trunk as Revision 1791448