Bug 58992 - failed to extract the correct paragraph direction
Summary: failed to extract the correct paragraph direction
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.13-FINAL
Hardware: All All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2016-02-10 21:49 UTC by nan.yu
Modified: 2016-02-11 18:56 UTC

The Word document contains RTL content (166.50 KB, application/msword)
2016-02-10 21:49 UTC, nan.yu

Description nan.yu 2016-02-10 21:49:13 UTC
Created attachment 33545
The Word document contains RTL content

I use paragraph.cloneProperties().getFBiDi() to get the direction of each paragraph. When HWPFDocument reads an Arabic or Hebrew document, I would expect getFBiDi() to return true for the RTL paragraphs. However, it returns false, which indicates the LTR direction.
Comment 2 Tim Allison 2016-02-11 18:56:15 UTC
File the following as not exceedingly useful.

Yes, I'd want the same behavior that you expected.

With a test doc with one paragraph of Arabic and one paragraph of English, I was not able to see any difference in the paragraph properties or in the run properties between those two paragraphs and their runs.

In the MS-DOC spec, p. 275, there should be a specification for LtrPara/RtlPara or LtrRun/RtlRun in the FCI enumeration, which is stored in the CidFci. I don't think we're currently extracting these command fields, though I could be wrong.
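
Until the direction can be read from the file itself, one possible workaround is to infer a paragraph's base direction from its text. The sketch below is not a fix for this bug and does not use any POI API: it relies only on java.text.Bidi from the JDK, which picks the base direction from the first strong directional character. The class and method names (ParagraphDirectionFallback, isRtlByContent) are hypothetical, and the assumption is that the caller passes in the paragraph text it has already extracted. A paragraph that Word marks as RTL but that begins with Latin text would still be reported as LTR, so this is only a heuristic.

```java
import java.text.Bidi;

// Hypothetical helper, not part of POI: guesses paragraph direction from content.
final class ParagraphDirectionFallback {
    private ParagraphDirectionFallback() {}

    /**
     * Returns true when the first strong directional character of the
     * text is right-to-left (for example Arabic or Hebrew letters).
     * Empty text, or text with no strong characters, is treated as LTR.
     */
    static boolean isRtlByContent(String paragraphText) {
        if (paragraphText == null || paragraphText.isEmpty()) {
            return false;
        }
        // DIRECTION_DEFAULT_LEFT_TO_RIGHT: derive the base level from the
        // text, falling back to LTR when no strong character is present.
        Bidi bidi = new Bidi(paragraphText, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627 POI"; // Arabic word followed by Latin text
        String english = "Hello POI";
        System.out.println(isRtlByContent(arabic));
        System.out.println(isRtlByContent(english));
    }
}
```

The result could be compared against getFBiDi() per paragraph to confirm which paragraphs in the attached document are affected.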