SA Bugzilla – Bug 7443
HTML_FONT_LOW_CONTRAST not being checked for td/tr/th if style attribute exists for these elements
Last modified: 2022-04-16 06:43:58 UTC
Created attachment 5453 [details] patch -p0 patch.txt Enclosed is a tiny patch to HTML.pm It is to accommodate changes in the HTML standard as it relates to inline style. Specifically, inline style was not being parsed because the only allowed attribute for td/tr/th were bgcolor. The standard allows for style. Without this, rules like the HTML_FONT_LOW_CONTRAST were not being checked. Saw some spam today getting around that check. Spam contained the following pattern but didn't trigger HTML_FONT_LOW_CONTRAST <table align="center" border="0" cellpadding="10" cellspacing="0" width="100%"> <tbody> <tr> <td align="center" style="font-size: 10.17px; font-family:Copperplate,'Copperplate Gothic Light',fantasy; background-color:#737373; color:#737373;">elected on just 777 votes from an electoral ... ... ... </tr> </tbody> </table>
Created attachment 5454 [details] patch -p0 < patch.txt First patch wasn't general enough... This works better with a few other cases.
Created attachment 5457 [details] patch -p0 < patch.txt Continuing and fixing problems related to colors and inheritance of color. Probably best to wait as we locate more problems. At this point, we have html_font_low_contrast working much better without the previous false positives that existed because of updates to the style standards. This will not be the last patch from what we have seen in the code.
Thanks, Jad. Incorporated the parts about handling inline style attributes in table and anchor tags. Committed to trunk and stable 3.4 branch. Sending SpamAssassin/HTML.pm Committed revision 1806019. Sending SpamAssassin/HTML.pm Committed revision 1806020. Ignoring whitespace changes, the remaining bits in attachment 5457 [details] need more polish first.
JDunphy, can you submit an ICLA https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7443 so we can consider this code?
Cla received, thank you.
Sorry, was fixing his account. CLA received, thanks!
Pushing to 3.4.3. KB's comments, off bz agree with mine: These would be my current objections: - The (background)?-?color change is not correct, obvious intention was to make the hyphen optional, so /(bg-?)?c/ nested. If leaving out the hyphen is abuseable at all and some renderers do accept it. - The "protecting from word values" should be a non-issue. The immediately following regex test for $value does the same. - The code with the "# repetitive from above -- need to collapse logic" comment is self-explaining, why my comment was "needs polish". No argument what it does and why it is necessary in the report, either. - The code change of setting $new right after the comment that it's getting set below is icky. If correct at all, it misses to adjust the comments. In other words, still needs polish. Or an explanation...
This code appears to be committed. Karsten, is it production ready for 3.4.3 release? Or does it need to be uncommitted and moved to 4.0.0? Moving back to 3.4.3 blocker.
(In reply to Kevin A. McGrail from comment #8) > This code appears to be committed. Karsten, is it production ready for > 3.4.3 release? Or does it need to be uncommitted and moved to 4.0.0? > > Moving back to 3.4.3 blocker. Only the simple part was committed and is already in 3.4.2 production, there's nothing to roll back. Only additional stuff to polish and commit as per your last comment.
THanks!
Postponing as rest of the unpolished bits are minor updates, should not delay 4.0.