SA Bugzilla – Bug 3692
[review] check_dcc() code shortcut with header-RE doesn't work
Last modified: 2004-08-17 06:02:46 UTC
From the mailinglist (archive see URL): On Tuesday 17 August 2004 04:29 CET Theo Van Dinter wrote: > On Tue, Aug 17, 2004 at 12:33:28AM +0200, Malte S. Stretz wrote: > > While I tracked another possible bug (see [1]), I had a look how > > check_dcc works. And was completely puzzled by the code. > > Heh. > > > check_dcc() calls > > $self->get('X-DCC-(?:[^:]+-)?Metrics') > > to shortcut some ocde path. That's obviously intended to be a RE. But > > on the whole path through the code I can't find the point where the > > header is matched via a RE. > > Well, now that you mention it... PMS::get() doesn't accept an RE. > > > Follow me through the code, maybe you can enlighten me: > > 2. PerMsgStatus.pm:1175: get('X-DCC-(?:[^:]+-)?Metrics', undef) > > Yeah, that's not going to work. > > > -> PerMsgStatus.pm:1227: calls Message::Node::get_header() > > doesn't do an RE either. > > > Ok, so where's the RE applied? Even if other code paths were taken I > > can't find any point where the string is made an RE and I can't believe > > that the mail parser puts the "Header" check_dcc() wants into the > > cache-hash. > > Yeah, so we ought to replace: > > $_ = $self->get('X-DCC-(?:[^:]+-)?Metrics'); > return 1 if /bulk/; > > with: > > return 1 if grep(/^X-DCC-(?:[^:]+-)?Metrics:/ && /bulk/, > @{$self->{msg}->get_all_headers()}); On Tuesday 17 August 2004 16:04 CET Theo Van Dinter wrote: > On Mon, Aug 16, 2004 at 08:56:58PM -0700, Dan Quinlan wrote: > > > return 1 if grep(/^X-DCC-(?:[^:]+-)?Metrics:/ && /bulk/, > > > @{$self->{msg}->= get_all_headers()}); > > > > If we can expand out the list of headers, we should just go with that. > > Fetching the entire list of headers is definitely inefficient unless we > > really need it. > > Sure. I was reproducing the initial behavior, but if we know the list > of headers, that'd be better. The issue, however, is: > > [...] > DCC SMTP header lines are of the form: > > X-DCC-brand-Metrics: chost server-ID; bulk cknm1=count cknm2=count > ... where > brand is the "brand name" of the DCC server, such as > "RHYOLITE". [...] > > So "brand" is open to how things are configured and which DCC server is > being used. I think we have to do the get_all_headers() call there.
d'oh. there shouldn't be an array dereference bit around get_all_headers... <sigh> I'll be putting up a proper (and tested :)) patch shortly.
Created attachment 2265 [details] suggested patch I did a very quick manual test of this and the new version works, so that's a plus. :)
it's not a major issue, but we may as well fix it before 3.0.0 since it's known.
What do you think if we made it possible to send a real RE to those routines? It would look like $self->get(qr/X-DCC-(?:[^:]+-)?Metrics/) The routines (probably hahnded through till M::N::headers()) then check if ref ($request) eq "Regexp" and grep the wanted headers from the ->{headers} hash. We could also introduce another suffix :re which would be more consistent with the :raw etc. ones though a bit more computation overhead. It would look like $self->get('X-DCC-(?:[^:]+-)?Metrics:re') I prefer the first one.
Subject: Re: check_dcc() code shortcut with header-RE doesn't work On Tue, Aug 17, 2004 at 09:48:02AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > What do you think if we made it possible to send a real RE to those routines? > It would look like > $self->get(qr/X-DCC-(?:[^:]+-)?Metrics/) I think it would end up slowing down the most common case which is grabbing specific headers. In a quick look at the code... The DCC thing is is the only line I can find calling "get()" with an RE. There are 3 calls to get_all_headers() right now, none of them do the "match multiple headers" type thing.
changing to review status.
I doubt that the two if()s necessary will slow down checking significantly. OTOH ain't I sure if not fetching all headers in the first place will make any difference either. Anyway, +1 on the patch as it at least makes the stuff work ;)
Subject: Re: check_dcc() code shortcut with header-RE doesn't work >> It would look like >> $self->get(qr/X-DCC-(?:[^:]+-)?Metrics/) > I think it would end up slowing down the most common case which is > grabbing specific headers. I agree with Theo. Keep the common case fast. I don't mind doing a get_all_headers() with DCC too much. DCC is already dirt slow. Daniel
+1
+1 minor suggestion: I'd replace the + with {1,80}
committed with quinlan's suggested + -> {1,80} change. r36540
maybe I'll close the ticket this time too. <G>