Summary: | failed to extract the correct paragraph direction | | |
---|---|---|---|
Product: | POI | Reporter: | nan.yu <yunan05404> |
Component: | HWPF | Assignee: | POI Developers List <dev> |
Status: | NEW --- | | |
Severity: | major | | |
Priority: | P2 | | |
Version: | 3.13-FINAL | | |
Target Milestone: | --- | | |
Hardware: | All | | |
OS: | All | | |
Attachments: | The Word document contains RTL content | | |
Also reported at http://stackoverflow.com/questions/35326966/extract-wrong-paragraph-direction-in-word-using-apache-poi-library

File the following as not exceedingly useful. Yes, I'd want the same behavior that you expected. With a test document containing one paragraph of Arabic and one paragraph of English, I was not able to see any difference in the paragraph properties or in the run properties between those two paragraphs and their runs. According to the MS-DOC spec, p. 275, there should be a specification for LtrPara/RtlPara or LtrRun/RtlRun in the FCI enumeration, which is stored in the CidFci. I don't think we're currently extracting these command fields, though I could be wrong.
Created attachment 33545: The Word document contains RTL content

I use paragraph.cloneProperties().getFBiDi() to get the directional information for paragraphs. When HWPFDocument reads Arabic/Hebrew documents, I would expect getFBiDi() to return TRUE for the RTL paragraphs. However, it returns FALSE, which represents the "LTR" direction.
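
Until the stored direction flag can be read reliably, one possible stopgap is to guess the base direction from the paragraph text itself. The sketch below is only that: it uses the JDK's java.text.Bidi, and the class name TextDirectionGuess and the method looksRtl are hypothetical names chosen for illustration, not part of POI. It reports the direction of the first strongly directional character, which is not the same thing as the paragraph property stored in the .doc file, so an RTL paragraph that starts with Latin text would be misclassified.

```java
import java.text.Bidi;

// Hypothetical helper, not part of Apache POI: guesses a paragraph's base
// direction from its text instead of from the stored paragraph properties.
final class TextDirectionGuess {

    private TextDirectionGuess() {}

    // Returns true when the first strongly directional character is RTL
    // (for example Arabic or Hebrew). Null, empty, or neutral-only text
    // falls back to LTR, so this returns false for it.
    static boolean looksRtl(String paragraphText) {
        if (paragraphText == null || paragraphText.isEmpty()) {
            return false;
        }
        // DIRECTION_DEFAULT_LEFT_TO_RIGHT asks Bidi to derive the base level
        // from the text, defaulting to LTR when no strong character exists.
        Bidi bidi = new Bidi(paragraphText, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }
}
```

With HWPF, the input would presumably be the string returned by paragraph.text(), for example TextDirectionGuess.looksRtl(paragraph.text()). This only works around the symptom; it does not address why getFBiDi() returns FALSE.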