SA Bugzilla – Bug 6937
3.3.2 and Perl 5.18.0: Altering hash requires restarting loop else UNDEFINED behavior.
Last modified: 2013-05-29 09:24:03 UTC
Error in syslog: Use of each() on hash after insertion without resetting hash iterator results in undefined behavior at /usr/local/lib/perl5/site_perl/5.18.0/Mail/SpamAssassin/AsyncLoop.pm line 363. (SpamAssassin 3.3.2 installed.) Code (starting at line 363 - a while loop): while (my($key,$ent) = each %$pending) { my $id = $ent->{id}; if (defined $ent->{poll_callback}) { # call a "poll_callback" if exists # be nice, provide fresh info to a callback routine $ent->{status} = 'FINISHED' if exists $self->{finished}->{$id}; # a callback might call set_response_packet() or report_id_complete() # dbg("async: calling poll_callback on key $key"); $ent->{poll_callback}->($ent); } my $finished = exists $self->{finished}->{$id}; if ($finished) { $anydone = 1; delete $self->{finished}->{$id}; $ent->{status} = 'FINISHED'; $ent->{finish_time} = $now if !defined $ent->{finish_time}; my $elapsed = $ent->{finish_time} - $ent->{start_time}; dbg("async: completed in %.3f s: %s", $elapsed, $ent->{display_id}); # call a "completed_callback" sub, if one exists if (defined $ent->{completed_callback}) { # dbg("async: calling completed_callback on key $key"); $ent->{completed_callback}->($ent); } $self->{timing_by_query}->{". $key"} += $elapsed; $self->{queries_completed}++; $self->{total_queries_completed}++; delete $pending->{$key}; } } This is beyond my ability to suggest a fix.
> Error in syslog: Use of each() on hash after insertion > without resetting hash iterator results in undefined behavior > at ... 5.18.0/Mail/SpamAssassin/AsyncLoop.pm line 363. I'm running trunk (3.4) under 5.18.0 and am not seeing this, although I can't pinpoint any change which had been addressing this problem. Will investigate...
Created attachment 5143 [details] Some essential changes backported to 3.3.2 to make it work under perl 5.18 The attached combo patch for 3.3.2 should make it usable under more recent versions of Perl.
> I'm running trunk (3.4) under 5.18.0 and am not seeing this, > although I can't pinpoint any change which had been addressing > this problem. Will investigate... Took me two days to grasp it all. It is true that the current trunk did not suffer from this problem, the reason is my changes to DNS resolving (grouping of DNS queries & caching DNS responses for future use - r1434360, Bug 6884) - which unintentionally also avoided inserting entries into a hash while an each() context is active. What happens is that SA 3.3 invoked callbacks of finished DNS queries from MS::AsyncLoop::complete_lookups within an for/each loop. Some of the callbacks (e.g. NS lookup) could generate another DNS query (for an A record), which in turn inserted a new entry into a hash of pending queries. If this happened within an active each() context, older perl would just quietly suffer and possibly do something unpredictable (or maybe not), while the perl 5.18.0 issues a warning. The attached patch for 3.3.2 is just a quick workaround, building a list of keys of a 'pending' hash before the loop is entered. Changes by (r1434360, Bug 6884) moved calling of callback subroutines from MS::AsyncLoop::complete_lookups into MS::DnsResolver::poll_responses, i.e. somewhat earlier, but also outside of the each() context, so it avoided the problem altogether. The main reason for this change was re-using results of DNS queries when multiple independent rules happen to query the same DNS resources. A potential downside of the current approach of invoking callbacks earlier is that actions are more intertwined in comparison to decoupled calls to callback, delayed till after the poll_responses loop has already exited. ... which made me try to revert to the decoupled approach, but it all ended up in a mess of solving unnecessary new problems, so I finally gave up and kept the new approach. As a result of this exercise I updated some documentation, added/adjusted some debugging and added some hardening, and re-introduced the 'aborted' callbacks, even though they are not currently used by any callback. Also ditched some of the dead code: currently unused poll_callback and get_pending_lookups(), and the 'completed_callback' mechanism of specifying callbacks which has been superseded by the $cb argument of sub bgsend_and_start_lookup. trunk (3.4): Bug 6937: 3.3.2 and Perl 5.18.0: Altering hash requires restarting loop else UNDEFINED behavior Sending lib/Mail/SpamAssassin/AsyncLoop.pm Sending lib/Mail/SpamAssassin/Dns.pm Sending lib/Mail/SpamAssassin/DnsResolver.pm Sending lib/Mail/SpamAssassin/Plugin/ASN.pm Sending lib/Mail/SpamAssassin/Plugin/AskDNS.pm Sending lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm Committed revision 1487178.
If 3.4 (trunk) is unaffected by this, that's yet another reason to finalize it and push it out. ;-) In the meantime, thanks for hunting this down and fixing it (I assume the patch works).
Fixed in 3.4, patch available for 3.3.2, closing.