Bug 46055 - Batik CSS Scanner does not handle unicode-range correctly
Status: NEW
Alias: None
Product: Batik - Now in Jira
Classification: Unclassified
Component: CSS
Version: 1.6
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Batik Developer's Mailing list
Keywords: RFC
Depends on:
Reported: 2008-10-21 20:03 UTC by Peter Farland
Modified: 2009-02-11 11:46 UTC

Attempt to support the variants of CSS-2 @font-face unicode-ranges (6.42 KB, patch)
2009-02-11 11:44 UTC, Peter Farland

Description Peter Farland 2008-10-21 20:03:13 UTC
The nextToken() method of Batik's org.apache.batik.css.parser.Scanner class contains a branch for the case where a "U" character is followed by a "+" character, as is typical for the unicode character ranges specified in CSS-2 style @font-face rules.

For example, a simple unicode range might look like this:

@font-face {
    unicode-range: U+0030-U+0039;
}

However, a unicode-range value can contain multiple entries separated by commas, and each entry can be a range, a single character, or a pattern with wildcards. The Batik CSS Scanner does not appear to handle these multiple entries.

@font-face {
    unicode-range: U+0030-U+0039,U+002E;
}
Comment 1 Peter Farland 2008-10-23 18:31:13 UTC
Note that the error manifests itself when unicode-range isn't the last descriptor in the style declaration.

@font-face {
    unicode-range: U+0030-U+0039,U+002E;
Comment 2 Peter Farland 2009-02-11 11:44:04 UTC
Created attachment 23248
Attempt to support the variants of CSS-2 @font-face unicode-ranges

The existing code did not handle all of the variants of unicode ranges described in the CSS2 specification.

An example test case that needs to be supported is as follows:

    unicode-range: U+0030-0039,U+002E,U+004?;

Comment 3 Peter Farland 2009-02-11 11:46:40 UTC
Note that the format originally reported in this bug was incorrect (the second part of a range pair should not start with U+). An example of correct formats that Batik still did not support is included below:

unicode-range: U+0030-0039,U+002E,U+004?;
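
To make the three forms in that example concrete, here is a minimal Java sketch that expands a unicode-range value into inclusive [start, end] code point pairs. It is only an illustration of the formats discussed in this bug, not Batik's Scanner code and not the attached patch; the class name UnicodeRangeSketch and its parse method are hypothetical, and the sketch assumes the value has already been isolated from the surrounding declaration (no trailing semicolon).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Batik.
class UnicodeRangeSketch {

    /** Expands a unicode-range value into inclusive [start, end] code point pairs. */
    static List<int[]> parse(String value) {
        List<int[]> ranges = new ArrayList<>();
        // Entries are separated by commas, e.g. "U+0030-0039,U+002E,U+004?".
        for (String raw : value.split(",")) {
            String entry = raw.trim();
            if (entry.length() < 3 || !entry.regionMatches(true, 0, "U+", 0, 2)) {
                throw new IllegalArgumentException("Missing U+ prefix: " + entry);
            }
            String body = entry.substring(2);
            int dash = body.indexOf('-');
            int start;
            int end;
            if (dash >= 0) {
                // Interval form: U+0030-0039 (the second part has no U+ prefix).
                start = parseHex(body.substring(0, dash));
                end = parseHex(body.substring(dash + 1));
            } else if (body.indexOf('?') >= 0) {
                // Wildcard form: trailing '?' characters stand for any hex digit,
                // so U+004? covers 0x0040 through 0x004F.
                if (!body.matches("[0-9A-Fa-f]*\\?+")) {
                    throw new IllegalArgumentException("Bad wildcard: " + entry);
                }
                start = parseHex(body.replace('?', '0'));
                end = parseHex(body.replace('?', 'F'));
            } else {
                // Single code point form: U+002E.
                start = parseHex(body);
                end = start;
            }
            if (start > end) {
                throw new IllegalArgumentException("Descending range: " + entry);
            }
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }

    // Accepts one to six hex digits and returns the numeric value.
    private static int parseHex(String hex) {
        if (!hex.matches("[0-9A-Fa-f]{1,6}")) {
            throw new IllegalArgumentException("Bad hex value: " + hex);
        }
        return Integer.parseInt(hex, 16);
    }
}
```

Calling UnicodeRangeSketch.parse("U+0030-0039,U+002E,U+004?") yields three pairs: 0x30 to 0x39, 0x2E to 0x2E, and 0x40 to 0x4F. The originally reported form, U+0030-U+0039, is rejected by this sketch because the second half of the interval is not plain hex.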