Bug 5720 - mass-check malformed result lines from clients
Summary: mass-check malformed result lines from clients
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Masses (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
: P3 normal
Target Milestone: 3.3.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-11-14 23:00 UTC by Daryl C. W. O'Shea
Modified: 2007-11-21 13:32 UTC (History)
0 users



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Daryl C. W. O'Shea 2007-11-14 23:00:13 UTC
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: ror Y  9
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESS
AGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=1836279597,scantime=0,
format=r,reuse=no,host=talon1.pccc.com 


I haven't really looked into how this happened... revision 595224 hacks around
it in cs_schedule_cache mode so that the server can exit upon completion.
Comment 1 Daryl C. W. O'Shea 2007-11-15 23:22:24 UTC
Reproduced a failure on talon1 (with the zone), patched the zone and completed a
mass-check with talon1.

[dos@cyan trunk]$ svn ci masses/mass-check -m 'bug 5720: send the client a
malformed message index, get one back; duh!'
Sending        masses/mass-check
Transmitting file data .
Committed revision 595585.
[dos@cyan trunk]$

Leaving this open until it continues to work, and then I'll try to break it by
reverting revision 595224. :)
Comment 2 Justin Mason 2007-11-16 03:17:04 UTC
yay! confirmed it doesn't hang now.  I was able to complete the bb-jm mass-check
(25000 ham, 25000 spam) using talon and infiltrator in 22 minutes.  nice, and
with more servers, it can only get better ;)

I still see a batch of errors in the middle, but it recovers from them and
carries on just fine...

status: client requested 800 messages                    now: 2007-11-16 10:55:25
status: sent 777 of 800 intended messages                now: 2007-11-16 10:55:26
status: 1777 messages outstanding                        now: 2007-11-16 10:55:26
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: msg-error Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=f,reuse=no,host=infiltrator

Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: msg-error Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=f,reuse=no,host=infiltrator

Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: f Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=,reuse=no,host=infiltrator

Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: f Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=,reuse=no,host=infiltrator

Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: msg-error Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=f,reuse=no,host=infiltrator

Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: msg-error Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=f,reuse=no,host=infiltrator

Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: msg-error Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=f,reuse=no,host=infiltrator

Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: s Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=,reuse=no,host=infiltrator

Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: msg-error Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=f,reuse=no,host=infiltrator

Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: s Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=,reuse=no,host=infiltrator

Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: f Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=,reuse=no,host=infiltrator

Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: f Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=,reuse=no,host=infiltrator

Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
Use of uninitialized value in concatenation (.) or string at mass-check line 826.
>> WTH!?  result is not in the correct format: f Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=0,format=,reuse=no,host=infiltrator

Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
Use of uninitialized value in concatenation (.) or string at mass-check line 826.
Use of uninitialized value in substitution (s///) at mass-check line 881.
Use of uninitialized value in sprintf at mass-check line 889.
>> WTH!?  result is not in the correct format: f Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=1,format=,reuse=no,host=infiltrator

>> WTH!?  result is not in the correct format: s Y  9 
EMPTY_MESSAGE,MISSING_DATE,MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,__ENV_AND_HDR_FROM_MATCH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__UNUSABLE_MSGID
time=0,scantime=1,format=,reuse=no,host=infiltrator

status: 1777 messages outstanding                        now: 2007-11-16 10:55:32
status:   1% ham: 142    spam: 359    date: 2007-05-31   now: 2007-11-16 10:55:34
status:   2% ham: 295    spam: 707    date: 2007-01-15   now: 2007-11-16 10:55:34
status: POST request from 63.216.184.50                  now: 2007-11-16 10:55:34
status: client requested 1000 messages                   now: 2007-11-16 10:55:34
status: sent 1000 of 1000 intended messages              now: 2007-11-16 10:55:34
[.etc...]
Comment 3 Justin Mason 2007-11-16 07:35:44 UTC
this stuff is hard.  :(

I noticed an error around this stanza near line 695:

  # client mode is a little crazy because we need to kluge around the fact
  # that the information needed to do the run is different than the
  # information that goes into the results.
  if ($opt_client) {
    # change the format and id to the real version, make sure to remember
    # the server's message number
    # note: pop/push makes it work regardless of any changes to M::SA::AI
    $origid = pop @{$real{$id}};
    $format = $real{$id}->[2];
    $id = $real{$id}->[3];
  }


at one stage, $id was undef after that.  I think it's probably because one of
the *earlier* entries in @{$real{$id}} was undef, maybe origid?
Comment 4 Daryl C. W. O'Shea 2007-11-21 13:32:06 UTC
The problem is immediately clear when you have more than a single missing
message per batch (in non cs_paths_only mode... which I don't use, so it's not
tested as well).  The bogus message id of 0 gets used twice (or more), and
instead of just skipping the message the clients were scanning the bogus
message, deleting the id the first time around and then complaining about it
being missing the second time around.

Anyway, fixed...

[dos@cyan masses]$ svn ci -m "bug 5720: get client to skip messages when the
server indicates an error with a particular message; this eliminates the issue
of trying to use the same message id in the same batch when more than one
message has a problem in a batch... and keeps us from scanning bogus
place-holder messages that we were going to ignore the results of anyway"
Sending        masses/mass-check
Transmitting file data .
Committed revision 597212.
[dos@cyan masses]$