SA Bugzilla – Bug 5288
[review] ArchiveIterator fails on filenames with leading spaces
Last modified: 2007-02-04 11:40:11 UTC
While trying to run sa-learn on some mailboxes with leading spaces in their names, I discovered that ArchiveIterator.pm fails to open such mailboxes because it uses the 2 argument version of open() which strips leading whitespace from filenames. The attached patch fixes this fixes by prepending ./ to such filenames (as per the Camel book)
Created attachment 3821 [details] patch to lib/Mail/SpamAssassin/ArchiveIterator.pm
Hrm, interesting! I didn't know about this "special" behavior of open()... Unfortunately, both leading and trailing whitespace is removed, and redirection characters are honored, which is bad. So we'll have to deal with this a little more directly. If anyone's interested, here's a snippet from "perldoc -f open": The filename passed to 2-argument (or 1-argument) form of open() will have leading and trailing whitespace deleted, and the normal redirection characters honored. This property, known as "magic open", can often be used to good effect. A user could specify a filename of "rsh cat file |", or you could change certain filenames as needed: $filename =~ s/(.*\.gz)\s*$/gzip -dc < $1|/; open(FH, $filename) or die "Can't open $filename: $!"; Use 3-argument form to open a file with arbitrary weird charac- ters in it, open(FOO, '<', $file); otherwise it's necessary to protect any leading and trailing whitespace: $file =~ s#^(\s)#./$1#; open(FOO, "< $file\0");
indeed, it would probably be better to use the 3 argument version of open() in mail_open (execpt in the .gz/.bz2 cases)
(In reply to comment #3) > indeed, it would probably be better to use the 3 argument version of open() in > mail_open (execpt in the .gz/.bz2 cases) Actually we need to do it there as well to deal with the whitespace issue. I have a patch that I'm almost done with which does this. :)
ok, patch applied to 3.2: Sending lib/Mail/SpamAssassin/ArchiveIterator.pm Transmitting file data . Committed revision 494733. will put up a 3.1 version shortly
Created attachment 3823 [details] suggested patch This patch backports 3.2's mail_open function, so it includes the fix for this bug as well as bug 5249 (use binmode after opening the file).
there is one thing your patch does not take into account: opening of STDIN via a filename of '-', which the 2 arg version of open() does. This breaks e.g. piping mail into spamassassin...
(In reply to comment #7) > there is one thing your patch does not take into account: opening of STDIN via a > filename of '-', which the 2 arg version of open() does. This breaks e.g. piping > mail into spamassassin... Actually, ArchiveIterator doesn't support "-", so that's not an issue. spamassassin and sa-learn handle reading from STDIN and creating a temp file, which is what's actually used. (bug 5145 dealt with this for 3.1.8 -- you're probably right that things would break if you only apply this patch to 3.1.7) :)
yes, the patch breaks 3.1.7, which is how I found out about it... the following extra condition at the top of mail_open fixes it: if ($file eq "-") { $mode = '<&'; @expr = ( "STDIN" ); }
+1
Sending lib/Mail/SpamAssassin/ArchiveIterator.pm Transmitting file data . Committed revision 503461.