Bug 6168 - html title concatenation too tight
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P3 normal
Target Milestone: 3.3.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-29 20:28 UTC by Cedric Knight
Modified: 2009-08-05 02:17 UTC



Attachments:
  html content without spaces between tags (type: message/rfc822, status: None), submitted by Cedric Knight [HasCLA]
  patch to put whitespace after title (type: patch, status: None), submitted by Cedric Knight [HasCLA]

Description Cedric Knight 2009-07-29 20:28:24 UTC
Created attachment 4490
html content without spaces between tags

In the attached example:
<head><title>example.org</title></head><body><h1>example.net</h1>
the title is apparently concatenated to the body without any space.  As a result, \b and ^ may not work as expected at the start of a body rule, and URI parsing is confused as well:
dbg: uri: parsed uri found of type parsed, example.orgexample.net

</title> and </body> should imply at least as much of a break as <p> or <td>.
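
To illustrate the effect on rules, here is a minimal sketch in Python. It uses Python's re module as a stand-in for the Perl regexes that SpamAssassin body rules use, and the two rule patterns are hypothetical, not taken from any real ruleset. The "tight" string mimics the rendering reported above; the "spaced" string mimics a rendering where the title sits on its own line.

```python
import re

# Rendered text as reported in this bug: title and body run together.
tight = "example.org" + "example.net"
# Rendered text if </title> implied a line break (assumed desired behaviour).
spaced = "example.org" + "\n" + "example.net"

# Hypothetical body rules; not real SpamAssassin rules.
word_rule = re.compile(r"\bexample\.net\b")
line_rule = re.compile(r"^example\.net", re.MULTILINE)

for label, text in (("tight", tight), ("spaced", spaced)):
    # \b needs a word/non-word transition, and ^ needs a line start;
    # neither exists between "org" and "example" in the tight rendering.
    print(label,
          "word_rule:", bool(word_rule.search(text)),
          "line_rule:", bool(line_rule.search(text)))
```

In the tight rendering neither pattern matches, because "g" followed by "e" is neither a word boundary nor a line start; in the spaced rendering both match.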
Comment 1 Cedric Knight 2009-08-04 18:02:47 UTC
Created attachment 4501
patch to put whitespace after title

This patch fixes the run-together domain names in the test case (which was based on an actual false positive) by putting the HTML title on a line of its own.  However, it doesn't deal with two adjacent h[n] headers; see also bug #5749.
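
The idea can be sketched roughly as follows. This is not the attached patch and not the code in HTML.pm; it is an illustrative Python renderer built on the standard html.parser module, and the class name, the render() helper, and the BREAK_TAGS set are all assumptions made for the example. Heading tags are deliberately left out of the set, so the adjacent-heading limitation mentioned above is still visible.

```python
from html.parser import HTMLParser


class TextRenderer(HTMLParser):
    """Illustrative text renderer; not SpamAssassin's HTML.pm."""

    # Tags assumed to imply a line break (hypothetical set for this sketch).
    BREAK_TAGS = {"title", "body", "p", "td"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BREAK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BREAK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def render(html):
    """Render HTML to text, one non-empty line per break-delimited chunk."""
    renderer = TextRenderer()
    renderer.feed(html)
    renderer.close()
    lines = (line.strip() for line in "".join(renderer.parts).split("\n"))
    return "\n".join(line for line in lines if line)


sample = "<head><title>example.org</title></head><body><h1>example.net</h1>"
print(render(sample))                  # title and heading land on separate lines
print(render("<h1>a</h1><h2>b</h2>"))  # adjacent headings still run together
```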
Comment 2 Justin Mason 2009-08-05 02:17:02 UTC
thanks; applied. 

: 219...; svn commit -m "bug 6168: <title> tags should be surrounded by an implicit whitespace char in text rendering of HTML" lib/Mail/SpamAssassin/HTML.pm 
Sending        lib/Mail/SpamAssassin/HTML.pm
Transmitting file data .
Committed revision 801100.