SA Bugzilla – Bug 734
PYZOR check fails under spamd
Last modified: 2002-08-26 21:49:59 UTC
If PYZOR_CHECK is enabled, messages over somewhere near 5K cause spamd to stop processing the message before finishing. The message is sent back to spamc with no SpamAssassin headers. This has caused a bunch of 'false positives' that appeared untouched by SA. Details below. If someone else can duplicate this, I recommend disabling Pyzor checking until this is figured out. I guess it could also be something specific to my system.

The following is the tail end of spamd's debug output when this happens. This particular test message is 9156 bytes long.

...
debug: DCC is available:
debug: DCC: got response: X-DCC-sackHeads-Metrics: host.yrex.com 1012; Body=7 Fuz1=7 Fuz2=7
debug: Pyzor is available:
debug: Pyzor -> check failed - Broken pipe.
debug: forged_rcvd_trail: entry 0: by=(undef) from=(undef) mismatches=0
debug: forged_rcvd_trail: entry 1: by=starlingtech.com from=gmtiresponder.com mismatches=0
debug: DNS MX records found: 4
-----(it ends here. it shouldn't.)

When the same message is run through spamc with the PYZOR_CHECK score set to zero, spamd successfully marks the message as spam:

....
debug: DCC is available:
debug: DCC: got response: X-DCC-sackHeads-Metrics: host.yrex.com 1012; Body=8 Fuz1=8 Fuz2=8
debug: forged_rcvd_trail: entry 0: by=(undef) from=(undef) mismatches=0
debug: forged_rcvd_trail: entry 1: by=starlingtech.com from=gmtiresponder.com mismatches=0
debug: DNS MX records found: 4
debug: running meta tests; score so far=13.2
debug: AWL active, pre-score: 14.9, mean: undef, originating-ip: 216.10.23.150
debug: Post AWL score: 14.9
debug: is spam? score=14.9 required=7 tests=COMPLETELY_FREE,CTYPE_JUST_HTML,HTML_50_70,HTML_FONT_COLOR_NAME,HTML_FONT_COLOR_RED,HTML_FONT_COLOR_YELLOW,HTML_FONT_FACE_ODD,HTML_WITH_BGCOLOR,JAVASCRIPT,MAILTO_LINK,NO_REAL_NAME,SPAM_PHRASE_00_01,SUBJECT_IS_NEWS,TABLE_THICK_BORDER
logmsg: identified spam (14.9/7.0) for root:99 in 28 seconds, 9156 bytes.
-----(it ends here, correctly.)

I did a bit more testing and the following may help narrow it down:

- Editing the same message to be below about 5K in size makes spamd work fine even with Pyzor enabled. I haven't determined the exact size where this starts to happen.
- This only happens with spamc/spamd. `spamassassin' works fine.
- It's not a specific message, I've tried several.
bugger, this one is serious :( we should really try to nobble it before the 2.40 release. I think it's probably something to do with forking a subprocess (i.e. pyzor-check) and perl's IO buffering. note also that Mike is using DCC, too, which also forks and execs dcc's checker tool. Mike -- what perl version and OS are you using BTW, in case that's relevant?
I'm using Perl 5.6.1 under Red Hat Linux 7.1.
excellent! I think this is fixed now, in b2_4_0. It was indeed a buffering issue caused by the use of open2(), so I fixed it for dcc as well, just in case. Mike, could you check?
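For readers unfamiliar with this class of bug: the thread does not show the actual change made in b2_4_0, so the following is only a generic sketch, in Python rather than SpamAssassin's Perl, of one common way to feed a whole message to an external checker and collect its reply without blocking on a full pipe or dying when the child exits early. The names `run_checker`, `FAKE_CHECKER` and `EARLY_EXIT` are hypothetical stand-ins, not part of SpamAssassin, Pyzor or DCC.

```python
import subprocess
import sys


def run_checker(cmd, message: bytes, timeout: float = 10.0):
    """Feed `message` to `cmd` on stdin; return (exit_code, output_text).

    communicate() interleaves writing and reading, so neither side can
    block forever on a full pipe, and it tolerates a child that closes
    its stdin before the whole message has been written.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    try:
        out, _ = proc.communicate(input=message, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not let a hung checker stall message processing.
        proc.kill()
        out, _ = proc.communicate()
    return proc.returncode, out.decode(errors="replace")


# Hypothetical checker that reads all of stdin and reports its size.
FAKE_CHECKER = [
    sys.executable, "-c",
    "import sys; data = sys.stdin.buffer.read(); print(len(data))",
]

# Hypothetical checker that complains and exits without reading stdin.
EARLY_EXIT = [sys.executable, "-c", "print('HOME is unset')"]

if __name__ == "__main__":
    print(run_checker(FAKE_CHECKER, b"x" * 9156))
    print(run_checker(EARLY_EXIT, b"x" * 200000))
```

The second call mimics a checker that quits before consuming its input; the caller still gets the child's output back instead of failing partway through the message.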
Hmmm, I'm still getting "Broken pipe" on Pyzor but the spam is scored correctly now. I'm going to run some tests now to see if Pyzor is working on my system.

debug: Pyzor is available:
debug: Pyzor -> check failed - Broken pipe.
debug: forged_rcvd_trail: entry 0: by=(undef) from=(undef) mismatches=0
debug: forged_rcvd_trail: entry 1: by=starlingtech.com from=gmtiresponder.com mismatches=0
debug: DNS MX records found: 4
debug: running meta tests; score so far=13.2
debug: AWL active, pre-score: 14.9, mean: undef, originating-ip: 216.10.23.150
debug: Post AWL score: 14.9
debug: is spam? score=14.9 required=7 tests=COMPLETELY_FREE,CTYPE_JUST_HTML,HTML_50_70,HTML_FONT_COLOR_NAME,HTML_FONT_COLOR_RED,HTML_FONT_COLOR_YELLOW,HTML_FONT_FACE_ODD,HTML_WITH_BGCOLOR,JAVASCRIPT,MAILTO_LINK,NO_REAL_NAME,SPAM_PHRASE_00_01,SUBJECT_IS_NEWS,TABLE_THICK_BORDER
logmsg: identified spam (14.9/7.0) for root:99 in 2 seconds, 9156 bytes.

Did some more checking. I'm still getting "broken pipe" on messages > ~5K. On smaller messages, I get a different error in the debug output:

debug: Pyzor is available:
debug: Pyzor: got response: environment variable HOME is unset; please set it
debug: Pyzor: couldn't grok response "environment variable HOME is unset; please set it "

In either case, I never get a result for PYZOR_CHECK, even on messages that are definitely listed. HOME is definitely set correctly when I start spamd. The Pyzor check works fine with spamassassin -t.

Just to clarify, the original problem (spamd failing altogether on the Pyzor check) seems to be fixed, but Pyzor checking under spamc/spamd still seems broken.
I think we change the $HOME somewhere in the Razor code to be a directory in /tmp. Maybe there's a problem...
Yes, looks like it's because $HOME is reset when spamd is started, and Pyzor relies on it. Can anyone think *why* we do this?? What we can do is copy $ENV{'HOME'} when spamd starts, then set $ENV{'HOME'} in the Pyzor check to that value for the duration of the check. I've just implemented this for Razor, Pyzor et al, and it seems to fix it.
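The save-and-restore idea described here can be sketched as follows. This is an illustration in Python, not the Perl that went into spamd; `ORIGINAL_HOME` and `original_home` are hypothetical names, and the assumption is simply that the process rewrites HOME at some point after startup.

```python
import os
from contextlib import contextmanager

# Captured once at startup, before anything rewrites HOME.
ORIGINAL_HOME = os.environ.get("HOME")


@contextmanager
def original_home(saved=ORIGINAL_HOME):
    """Temporarily put the startup-time HOME back for an external check."""
    current = os.environ.get("HOME")
    if saved is not None:
        os.environ["HOME"] = saved
    try:
        yield
    finally:
        # Restore whatever HOME the daemon was using before the check.
        if current is None:
            os.environ.pop("HOME", None)
        else:
            os.environ["HOME"] = current
```

A check that needs the real home directory would then run inside `with original_home(): ...`, so any child process started there inherits the saved value, and the daemon's own HOME is put back afterwards even if the check raises.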
I think it was done because otherwise razor put its razor.lst into / when SA is called as root or without $HOME. Don't ask me for details. There might be a bug about this somewhere here in zilla...
I can confirm that this is fixed in b2_4_0 - spamd now successfully scores Pyzor on messages both < 5K and > 5K. Nice work!
excellent! closing bug.