Bug 61169

Summary: Text with Japanese characters overflows textbox
Product: POI
Reporter: François Beaune <dictoon>
Component: SL Common
Assignee: POI Developers List <dev>
Status: REOPENED ---    
Severity: normal    
Priority: P2    
Version: 3.16-FINAL   
Target Milestone: ---   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 45140    
Attachments: Java repro case
PowerPoint file generated by repro case
Test class with registered font
Result when using Apache POI (commit a753adb84805ff0f7b7385905780b07e5fe9e4ab on GitHub)

Description François Beaune 2017-06-09 09:27:54 UTC
Created attachment 35041 [details]
Java repro case

When using the XSLF API, text with Japanese characters (left-to-right) overflows the textbox, even when using default styling (default font family, size and style).
Comment 1 François Beaune 2017-06-09 09:28:24 UTC
Created attachment 35042 [details]
PowerPoint file generated by repro case
Comment 2 Andreas Beeker 2017-06-14 22:17:27 UTC
tl;dr: the textbox is too short because of an undefined/unregistered font and there is an issue in calculating the text height / width in POI.

There are a few issues with the current rendering code, which also apply to calculating the text height:
- the textbox indents are ignored when the text height is calculated
- you need to register a font that contains those Japanese glyphs
- my test font (mona) has a TextLayout leading of 0, hence the leading needs to be fixed somehow
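
The three points above can be illustrated with plain java.awt, independent of POI. This is only a minimal sketch, not the code POI uses: the class `TextBoxHeightSketch`, its method names, and the `fallbackLeadingRatio` parameter are hypothetical, and the text is assumed to be non-empty. It wraps the text with `LineBreakMeasurer`, adds the top and bottom insets to the height, and substitutes a fraction of the line height when the font reports a leading of 0.

```java
import java.awt.Font;
import java.awt.font.FontRenderContext;
import java.awt.font.LineBreakMeasurer;
import java.awt.font.TextAttribute;
import java.awt.font.TextLayout;
import java.text.AttributedString;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the POI API.
class TextBoxHeightSketch {

    // Wraps non-empty text at wrapWidth and returns {ascent, descent, leading} for each line.
    static List<float[]> measureLines(String text, Font font, float wrapWidth) {
        AttributedString as = new AttributedString(text);
        as.addAttribute(TextAttribute.FONT, font);
        FontRenderContext frc = new FontRenderContext(null, true, true);
        LineBreakMeasurer lbm = new LineBreakMeasurer(as.getIterator(), frc);
        List<float[]> lines = new ArrayList<>();
        while (lbm.getPosition() < text.length()) {
            TextLayout layout = lbm.nextLayout(wrapWidth);
            lines.add(new float[] {layout.getAscent(), layout.getDescent(), layout.getLeading()});
        }
        return lines;
    }

    // Sums the line heights and adds the vertical insets.
    // If a line reports no leading, fallbackLeadingRatio * (ascent + descent) is used instead
    // (an assumed workaround, not a documented rule).
    static double boxHeight(List<float[]> lines, double insetTop, double insetBottom,
                            double fallbackLeadingRatio) {
        double height = insetTop + insetBottom;
        for (float[] m : lines) {
            double ascentPlusDescent = m[0] + m[1];
            double leading = m[2] > 0 ? m[2] : fallbackLeadingRatio * ascentPlusDescent;
            height += ascentPlusDescent + leading;
        }
        return height;
    }
}
```

When calling `measureLines`, the wrap width would be the box width minus the left and right insets, and the `Font` would typically come from `Font.createFont(Font.TRUETYPE_FONT, file)` followed by `GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(font)`, so that a font with Japanese glyphs is actually used for measuring.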

The rendering in LibreOffice seems to use some kind of tracking (= opposite of kerning). Although the TRACKING attribute can be added to the AttributedString, it is ignored when breaking the text. The alternative of modifying the registered font [1] doesn't work either.

[1] https://stackoverflow.com/questions/13229725
Comment 3 Andreas Beeker 2017-06-14 22:23:59 UTC
For the record, the corresponding SO issue:
Comment 4 Andreas Beeker 2017-06-15 16:21:42 UTC
"LineBreakMeasurer does not measure correctly if TextAttribute.TRACKING is set."
(Affects Version/s: 6.0, 7, 8, 8u102, 9)

To recap: LibreOffice uses more lines to display the text, because the glyphs are spread wider than in the Java rendering. Although the rendering can be modified with the TRACKING attribute, the line break measurer does not take it into account.

Maybe it's possible to copy&adapt the standard linebreak measurer ...
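
A minimal sketch of how the TRACKING behaviour described above could be checked with plain java.awt. The class `TrackingWrapSketch` and both method names are hypothetical. `countLines` sets TRACKING together with FAMILY and SIZE rather than FONT, because the TextAttribute documentation says TRACKING is ignored when a FONT attribute is present. `trackedAdvance` is a rough manual estimate based on the documented meaning of TRACKING (the value times the font size is added to each glyph cluster's advance); it ignores font transforms and is not what any JDK or POI code does.

```java
import java.awt.font.FontRenderContext;
import java.awt.font.LineBreakMeasurer;
import java.awt.font.TextAttribute;
import java.text.AttributedString;

// Hypothetical helper for illustration; not part of the JDK or POI.
class TrackingWrapSketch {

    // Counts the lines LineBreakMeasurer produces for non-empty text with the given tracking.
    static int countLines(String text, String family, float size, float tracking, float wrapWidth) {
        AttributedString as = new AttributedString(text);
        as.addAttribute(TextAttribute.FAMILY, family);
        as.addAttribute(TextAttribute.SIZE, size);
        as.addAttribute(TextAttribute.TRACKING, tracking);
        FontRenderContext frc = new FontRenderContext(null, true, true);
        LineBreakMeasurer lbm = new LineBreakMeasurer(as.getIterator(), frc);
        int lines = 0;
        while (lbm.getPosition() < text.length()) {
            lbm.nextLayout(wrapWidth);
            lines++;
        }
        return lines;
    }

    // Estimates the advance of a run once tracking is applied:
    // each glyph cluster gets an extra tracking * fontSize added to its advance.
    static float trackedAdvance(float plainAdvance, int glyphClusters, float tracking, float fontSize) {
        return plainAdvance + glyphClusters * tracking * fontSize;
    }
}
```

Comparing `countLines(text, family, size, 0f, width)` with `countLines(text, family, size, TextAttribute.TRACKING_LOOSE, width)` shows whether the measurer on a given JDK takes tracking into account; according to the bug quoted above, the two counts may come out identical even when the tracked text is visibly wider.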
Comment 5 Andreas Beeker 2017-06-17 00:02:33 UTC
Created attachment 35059 [details]
Test class with registered font
Comment 6 Andreas Beeker 2017-06-17 00:06:14 UTC
Added a (partial *) fix via r1798986

Let's forget about the tracking issue mentioned above. You also need to specify the "ea" attribute for Asian fonts, see my test class.

*) ... at least for the Mona font, the rendering output is similar to the LibreOffice dimensions, so I'm closing this now.
Comment 7 François Beaune 2017-06-26 13:49:34 UTC
Thanks for the updates and the fix Andreas. We just tried our repro case with the latest Apache POI cloned from GitHub.

Unfortunately it looks like it doesn't entirely fix our problem. On Windows, there is pretty much no difference between Apache POI 3.16 Final and Git master as of today (with your fix). On Linux, the box is indeed taller but it still doesn't enclose all the text, see my latest attachment.
Comment 8 François Beaune 2017-06-26 13:52:54 UTC
Created attachment 35076 [details]
Result when using Apache POI (commit a753adb84805ff0f7b7385905780b07e5fe9e4ab on GitHub)
Comment 9 Andreas Beeker 2017-07-08 22:26:58 UTC
Added resize methods with a Graphics argument via r1801329

I still need to provide new methods to specify the charset for East Asian and complex script fonts. Otherwise LibreOffice, and probably also Office, don't use the configured font family but fall back to something else, which makes any textbox calculation futile.