Bug 7183

Summary: Spamc/Spamd very slow with -z compression and ssl
Product: Spamassassin Reporter: Kevin A. McGrail <kmcgrail>
Component: spamc/spamdAssignee: SpamAssassin Developer Mailing List <dev>
Status: NEW ---    
Severity: normal CC: giovanni, kmcgrail, mjcaley
Priority: P2    
Version: 3.4 SVN branch   
Target Milestone: 4.0.0   
Hardware: PC   
OS: Windows 7   
Whiteboard:
Attachments: Read only expected bytes
exit when data has been read

Description Kevin A. McGrail 2015-04-30 03:02:47 UTC
Using spamc with -z and --ssl is either just a bad idea or has some inefficiency

Here is the same message with compression and then without:

Apr 29 22:54:27 intel1 spamd[14348]: spamd: accept() on fd 5 
Apr 29 22:54:27 intel1 spamd[14348]: spamd: connection from firewall.pccc.com [71.163.15.130]:46846 to port 783, fd 5 
Apr 29 22:54:27 intel1 spamd[14348]: spamd: compress header received 
Apr 29 22:54:27 intel1 spamd[14348]: spamd: running as uid 103 
Apr 29 22:54:42 intel1 spamd[14348]: spamd: processing message <0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com> for root:103 
Apr 29 22:54:44 intel1 spamd[14348]: spamd: clean message (1.8/6.5) for root:103 in 16.9 seconds, 17307 bytes. 
Apr 29 22:54:44 intel1 spamd[14348]: spamd: result: . 1 - DKIM_SIGNED,DKIM_VALID,FREEMAIL_FORGED_REPLYTO,FREEMAIL_REPLYTO_END_DIGIT,HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,KAM_RPTR_PASSED,RCVD_IN_DNSWL_NONE,SPF_PASS,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD scantime=16.9,size=17307,user=root,uid=103,required_score=6.5,rhost=firewall.pccc.com,raddr=71.163.15.130,rport=46846,mid=<0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com>,autolearn=disabled 


Without compression: 

Apr 29 22:54:49 intel1 spamd[14348]: spamd: accept() on fd 5 
Apr 29 22:54:49 intel1 spamd[14348]: spamd: connection from firewall.pccc.com [71.163.15.130]:46847 to port 783, fd 5 
Apr 29 22:54:49 intel1 spamd[14348]: spamd: running as uid 103 
Apr 29 22:54:50 intel1 spamd[14348]: spamd: processing message <0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com> for root:103 
Apr 29 22:54:51 intel1 spamd[14348]: spamd: clean message (1.8/6.5) for root:103 in 2.1 seconds, 17307 bytes. 
Apr 29 22:54:51 intel1 spamd[14348]: spamd: result: . 1 - DKIM_SIGNED,DKIM_VALID,FREEMAIL_FORGED_REPLYTO,FREEMAIL_REPLYTO_END_DIGIT,HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,KAM_RPTR_PASSED,RCVD_IN_DNSWL_NONE,SPF_PASS,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD scantime=2.1,size=17307,user=root,uid=103,required_score=6.5,rhost=firewall.pccc.com,raddr=71.163.15.130,rport=46847,mid=<0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com>,autolearn=disabled 

Note that the processing times are 2.1 to 16.9 seconds with a tiny 17K email though scores and rules appear the same.

I can reproduce this over and over.

Without SSL, the times for roughly the same.  Here is with compression:

Apr 29 22:59:06 intel1 spamd[15407]: spamd: accept() on fd 5 
Apr 29 22:59:06 intel1 spamd[15407]: spamd: connection from firewall.pccc.com [71.163.15.130]:46850 to port 783, fd 5 
Apr 29 22:59:06 intel1 spamd[15407]: spamd: compress header received 
Apr 29 22:59:06 intel1 spamd[15407]: spamd: running as uid 103 
Apr 29 22:59:06 intel1 spamd[15407]: spamd: processing message <0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com> for root:103 
Apr 29 22:59:08 intel1 spamd[15407]: spamd: clean message (1.8/6.5) for root:103 in 1.7 seconds, 17307 bytes. 
Apr 29 22:59:08 intel1 spamd[15407]: spamd: result: . 1 - DKIM_SIGNED,DKIM_VALID,FREEMAIL_FORGED_REPLYTO,FREEMAIL_REPLYTO_END_DIGIT,HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,KAM_RPTR_PASSED,RCVD_IN_DNSWL_NONE,SPF_PASS,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD scantime=1.7,size=17307,user=root,uid=103,required_score=6.5,rhost=firewall.pccc.com,raddr=71.163.15.130,rport=46850,mid=<0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com>,autolearn=disabled 


And here is without:

Apr 29 22:59:25 intel1 spamd[15407]: spamd: accept() on fd 5 
Apr 29 22:59:25 intel1 spamd[15407]: spamd: connection from firewall.pccc.com [71.163.15.130]:46851 to port 783, fd 5 
Apr 29 22:59:25 intel1 spamd[15407]: spamd: running as uid 103 
Apr 29 22:59:25 intel1 spamd[15407]: spamd: processing message <0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com> for root:103 
Apr 29 22:59:27 intel1 spamd[15407]: spamd: clean message (1.8/6.5) for root:103 in 1.9 seconds, 17307 bytes. 
Apr 29 22:59:27 intel1 spamd[15407]: spamd: result: . 1 - DKIM_SIGNED,DKIM_VALID,FREEMAIL_FORGED_REPLYTO,FREEMAIL_REPLYTO_END_DIGIT,HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,KAM_RPTR_PASSED,RCVD_IN_DNSWL_NONE,SPF_PASS,TXREP,T_REMOTE_IMAGE,T_RP_MATCHES_RCVD scantime=1.9,size=17307,user=root,uid=103,required_score=6.5,rhost=firewall.pccc.com,raddr=71.163.15.130,rport=46851,mid=<0000014d069d705a-018d325a-a0a1-4c53-9eff-506ef19842e2-000000@email.amazonses.com>,autolearn=disabled 

1.9 seconds vs 1.7 seconds though it jumps around on retests.


So is there a code issue?  Should we add a warn to -z if --ssl is used?
Comment 1 Kevin A. McGrail 2015-04-30 03:09:44 UTC
NOTE: Added a small debug for compress header acknowledgement to spamd so I could tell for sure that compression was working as expected.

svn commit -m '3.4 commit - adding compress header debug statement - bug 7183' 
Sending        spamd/spamd.raw
Transmitting file data .
Committed revision 1676888.

svn commit -m 'trunk commit - adding compress header debug statement - bug 7183' 
Sending        spamd/spamd.raw
Transmitting file data .
Committed revision 1676889.
Comment 2 Kevin A. McGrail 2018-09-03 04:43:49 UTC
Continuing to target for 4.0 to look at this issue
Comment 3 Michael Caley 2020-11-11 16:44:51 UTC
I've had some problems with doing this as well.  It looks like the socket is reading a static number of bytes.  Until the socket is closed it'll keep waiting for more data.

I've made a patch that changes the read call to look for the expected number of bytes.  Let me know what your thoughts are.
Comment 4 Michael Caley 2020-11-11 16:46:55 UTC
Created attachment 5730 [details]
Read only expected bytes
Comment 5 Giovanni Bechis 2020-12-13 09:55:44 UTC
Created attachment 5733 [details]
exit when data has been read

I think your diff will read all data at once which could be bad with big emails.
The attached diff will exit the sub as soon as enough data has been read.
Comment 6 Kevin A. McGrail 2021-03-14 03:57:46 UTC
Giovanni, do we need someone to test your revised patch?
Comment 7 Giovanni Bechis 2021-03-14 08:55:45 UTC
Yes, one more test could be useful'