Bug 5612

Summary: [review] DB_File: infinite loop causing 100% cpu usage when opening dbm files
Product: Spamassassin Reporter: J. Nick Koston <bdraco>
Component: LibrariesAssignee: SpamAssassin Developer Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: ben, dan, jshanley, nick
Priority: P5    
Version: 3.2.3   
Target Milestone: 3.2.4   
Hardware: Other   
OS: other   
Whiteboard: go
Attachments: Proof of concept to create the loop
Proposed Workaround for BasedAddrList.pm
Solution Patch
Perl script to patch sa for testing
alt patch
Remove all affected temp names
and a patch against the one already applied (convience)
new patch
Full patch against 3.2 branch

Description J. Nick Koston 2007-08-17 18:54:58 UTC
 
Comment 1 J. Nick Koston 2007-08-17 18:56:41 UTC
#!/usr/bin/perl
use DB_File;
use Mail::SpamAssassin::Locker::UnixNFSSafe;
open(my $fake_lock_fh,'>','__db.breakme');
close($fake_lock_fh);
my $locker = new Mail::SpamAssassin::Locker::UnixNFSSafe;
$locker->safe_lock('breakme', 30, 0600);   # simulate
Mail::SpamAssassin::DBBasedAddrList::new_checker's tie in r/w;
tie %x, 'DB_File', 'breakme', O_RDWR | O_CREAT, 0600;
$locker->safe_unlock('breakme');


This will result in 100% cpu usage as it will try over and over again

open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/koston/.spamassassin/x/breakme", 0xbfe4c720) = -1 ENOENT (No such
file or directory)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/koston/.spamassassin/x/breakme", 0xbfe4c720) = -1 ENOENT (No such
file or directory)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/koston/.spamassassin/x/breakme", 0xbfe4c720) = -1 ENOENT (No such
file or directory)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/koston/.spamassassin/x/breakme", 0xbfe4c720) = -1 ENOENT (No such
file or directory)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/koston/.spamassassin/x/breakme", 0xbfe4c720) = -1 ENOENT (No such
file or directory)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/koston/.spamassassin/x/breakme", 0xbfe4c720) = -1 ENOENT (No such
file or directory)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/koston/.spamassassin/x/__db.breakme",
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)

This happens under normal conditions when 
1) spamd crashes
2) the server reboots
3) the user exceeds quota 

The __db.[DBNAME] temp file is there but the db is not.

Comment 2 J. Nick Koston 2007-08-17 19:10:47 UTC
Created attachment 4093 [details]
Proof of concept to create the loop
Comment 3 J. Nick Koston 2007-08-17 19:19:01 UTC
It would appear that the problem is with DB_File itself

use DB_File;
open(my $fake_lock_fh,'>','__db.breakme');
close($fake_lock_fh);
tie %x, 'DB_File', 'breakme', O_RDWR | O_CREAT, 0600;
Comment 4 J. Nick Koston 2007-08-17 19:22:47 UTC
Created attachment 4094 [details]
Proposed Workaround for BasedAddrList.pm
Comment 5 J. Nick Koston 2007-08-17 19:23:30 UTC
SpamAssassin/BayesStore/DBM.pm would need a similar fix.
Comment 6 J. Nick Koston 2007-08-18 08:12:55 UTC
A dupe of this is: http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5598
Comment 7 Sidney Markowitz 2007-08-18 14:12:59 UTC
I secured this bug report when I saw that it was in the Security group, just to
be prudent. Now I see that it appears to be a dupe of public bug 5598 and I
don't see any way that it can be triggered remotely or by a crafted email. By
the way Nick, this is not a criticism of your initially putting it in the
Security category -- Better safe than sorry.

I think this should be taken out of the closed security group and closed as a
dupe  of bug 5598 or it should be closed as a dupe of this.

Before I do that, is there anyone who thinks that this does need to be kept as a
limited access security bug?
Comment 8 J. Nick Koston 2007-08-18 15:20:46 UTC
Looks like the root cause is a bug in db-4.2.x

diff -u db-4.2.52/fileops/fop_util.c db-4.6.19/fileops/fop_util.c |less

Seems like it is solved in 4.6.19
Comment 9 J. Nick Koston 2007-08-18 15:22:34 UTC
Its not very easy to trigger remotely:

You'd have to know that the user is about to exceed their quota to trigger it
remotely.

Comment 10 J. Nick Koston 2007-08-18 15:25:59 UTC
diff -u db-4.2.52/fileops/fop_util.c db-4.3.29/fileops/fop_util.c |less

Looks like it _may_ be ok with 4.3 as well.  

It looks like RHEL 4/CentOS 4 users are going to the ones out in the cold. 
However I've yet to test it with db-4.3.29 yet
Comment 11 J. Nick Koston 2007-08-18 15:31:04 UTC
with db-4.3 it only tried 300 times before it just gives up.

# strace perl t.pl 2>&1 |grep open |grep breakme | wc
    301    2405   22857


You'll still end up data loss however thats much less critical as it will just
give up and never be able to write to the db.

Mac OS X is going to be affected as it uses an affected db version as well.
Comment 12 J. Nick Koston 2007-08-19 12:15:08 UTC
Created attachment 4096 [details]
Solution Patch

This could use a few comments.	This patch does appear to solve the problem and
I have tested it in a live env.
Comment 13 J. Nick Koston 2007-08-19 17:19:26 UTC
Created attachment 4101 [details]
Perl script to patch sa for testing
Comment 14 Justin Mason 2007-08-20 03:49:25 UTC
agreed, this is not a security issue -- since you would need write access to the
user conf dir anyway to exploit it, and at that point there are much worse
things an attacker could do.  opening up...
Comment 15 Justin Mason 2007-08-20 04:05:28 UTC
Created attachment 4102 [details]
alt patch

Thanks for the patch!  I've taken it and mangled it a bit -- moving the helper
logic to a Util function, modified it to fit with the SA coding style a little
bit more, and put it closer to the tie() calls.

Could you try out this version and see if it still works ok?
Comment 16 Justin Mason 2007-08-20 04:06:46 UTC
*** Bug 5598 has been marked as a duplicate of this bug. ***
Comment 17 Justin Mason 2007-08-20 04:08:24 UTC
committed to trunk:

: jm 178...; svn commit -m "bug 5612: DB_File version 4.2.x has a bug that loops
infinitely if files named '__db.{filename}' are present; work around.  thanks to
J. Nick Koston for the report and fix"
Sending        lib/Mail/SpamAssassin/BayesStore/DBM.pm
Sending        lib/Mail/SpamAssassin/DBBasedAddrList.pm
Sending        lib/Mail/SpamAssassin/Util.pm
Transmitting file data ...
Committed revision 567653.
Comment 18 Daniel Muey 2007-08-20 06:57:16 UTC
Just a side note, I'd created an rt report about this for DB_file at 
 http://rt.cpan.org/Ticket/Display.html?id=28885 
and of course referenced this bugzilla.
Comment 19 J. Nick Koston 2007-08-20 07:10:31 UTC
Reviewed & Tested JMASON's patch.   Looks good.   Excellent work!
Comment 20 Michael Parker 2007-08-20 07:19:19 UTC
Assign to dev since it no longer a security issue.
Comment 21 J. Nick Koston 2007-08-30 17:58:44 UTC
stat64("/home/xxx/.spamassassin/bayes_toks.expire347", 0xbfe40940) 
= -1 ENOENT (No such file or directory)
open("/home/xxx/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/xxx/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/xxx/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/xxx/.spamassassin/bayes_toks.expire347", 0xbfe40940) 
= -1 ENOENT (No such file or directory)
open("/home/catonesp/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/xxx/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/xxx/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
stat64("/home/xxx/.spamassassin/bayes_toks.expire347", 0xbfe40940) 
= -1 ENOENT (No such file or directory)
open("/home/xxx/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
open("/home/xxx/.spamassassin/__db.bayes_toks.expire347", 
O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EEXIST (File exists)
Process 347 detached

It looks like $db_tmpfile(.*) needs to be removed

in avoid_db_file_locking_bug
Comment 22 J. Nick Koston 2007-08-31 09:39:33 UTC
.spamassassin/__db.bayes_toks.new as well can cause this
Comment 23 J. Nick Koston 2007-08-31 09:40:48 UTC
to be safe 

remove 

/^$db_tmpfile\Q.\E/o
and
/^$db_tmpfile$/o
Comment 24 J. Nick Koston 2007-08-31 14:04:22 UTC
using substr is going to be faster then a regex here.

     Regex:  1 wallclock secs ( 0.74 usr +  0.00 sys =  0.74 CPU) @ 1351351.35/s
(n=1000000)
    Substr:  0 wallclock secs ( 0.68 usr +  0.00 sys =  0.68 CPU) @ 1470588.24/s
(n=1000000)
Comment 25 J. Nick Koston 2007-08-31 15:31:43 UTC
Created attachment 4113 [details]
Remove all affected temp names
Comment 26 J. Nick Koston 2007-08-31 15:32:34 UTC
Created attachment 4114 [details]
and a patch against the one already applied (convience)
Comment 27 Justin Mason 2007-09-24 12:48:08 UTC
Created attachment 4130 [details]
new patch

here's another take on that, using the more convenient (<foo.*>) globbing
support in perl.  let me know if this works for you...
Comment 28 Justin Mason 2007-09-26 03:15:00 UTC
applied 4130 to trunk:

: jm 149...; svn commit -m "bug 5612: some versions of Berkeley DB can get into
a deadlock condition if files named __db.{filename} are present; work around
more thoroughly this time.  (previous patch omitted some failure cases)"
lib/Mail/SpamAssassin/Util.pm
Sending        lib/Mail/SpamAssassin/Util.pm
Transmitting file data .
Committed revision 579561.
Comment 29 Mark Martinec 2007-09-26 05:11:54 UTC
A widespread usage of -e $file and similar simple file tests
would need to be replaced by a stat() or lstat() call for 3.3.0
because -e does not distinguish between ENOENT and EACCES
(which would not fix the current issue if a file existed
but was inaccessible), but that should do for the time being.
I plan to embark on a mission to replace all usage of
-e, -r, -x etc with stat, stat & -r _, stat & -x _ etc
some day.

+1 as it is for 3.2.4
Comment 30 Daryl C. W. O'Shea 2007-11-06 13:21:33 UTC
+1
Comment 31 Justin Mason 2007-12-03 08:24:25 UTC
doh! the avoid_db_file_locking_bug() method isn't present in the 3.2.x branch :(
taking the patch back out of review state.
Comment 32 Sidney Markowitz 2007-12-22 15:29:15 UTC
Created attachment 4212 [details]
Full patch against 3.2 branch

For convenience I applied the rev 567653 of comment #17 and then applied patch
#4130 to get a full combined patch against branch 3.2.

It appears to me not to break anything and I'm putting this back in review with
my vote. Justin, it's still up you to commit if you are happy with it and it
gets the needed votes -- I'm doing this for your convenience and to help move
things along for 3.2.4, but I'm not trying to take over your work on this.
Comment 33 Doc Schneider 2007-12-22 16:27:40 UTC
+1
Comment 34 Justin Mason 2007-12-23 02:53:08 UTC
(In reply to comment #32)
> Created an attachment (id=4212) [edit]
> Full patch against 3.2 branch
> 
> For convenience I applied the rev 567653 of comment #17 and then applied patch
> #4130 to get a full combined patch against branch 3.2.
> 
> It appears to me not to break anything and I'm putting this back in review with
> my vote. Justin, it's still up you to commit if you are happy with it and it
> gets the needed votes -- I'm doing this for your convenience and to help move
> things along for 3.2.4, but I'm not trying to take over your work on this.

don't worry about that, I have no problems with you doing that ;)  Thanks
Sidney.

+1
Comment 35 Justin Mason 2007-12-28 05:14:16 UTC
thanks! applied to 3.2.x:

: jm 231...; svn commit -m "bug 5612: DB_File version 4.2.x has a bug that loops
infinitely if files named '__db.{filename}' are present; work around.  thanks to
J. Nick Koston for the report and fix"
Sending        lib/Mail/SpamAssassin/BayesStore/DBM.pm
Sending        lib/Mail/SpamAssassin/DBBasedAddrList.pm
Sending        lib/Mail/SpamAssassin/Util.pm
Transmitting file data ...
Committed revision 607235.