Bug 58992

Summary:	failed to extract the correct paragraph direction
Product:	POI	Reporter:	nan.yu <yunan05404>
Component:	HWPF	Assignee:	POI Developers List <dev>
Status:	NEW ---
Severity:	major
Priority:	P2
Version:	3.13-FINAL
Target Milestone:	---
Hardware:	All
OS:	All
Attachments:	The Word document contains RTL content

Description nan.yu 2016-02-10 21:49:13 UTC

Created attachment 33545
The Word document contains RTL content

I use paragraph.cloneProperties().getFBiDi() to get the directional information for paragraphs. When HWPFDocument reads Arabic/Hebrew documents, I would expect getFBiDi() to return TRUE for the RTL paragraphs. However, it returns FALSE, which represents the LTR direction.

Comment 1 Dominik Stadler 2016-02-11 15:42:53 UTC

Also reported at http://stackoverflow.com/questions/35326966/extract-wrong-paragraph-direction-in-word-using-apache-poi-library

Comment 2 Tim Allison 2016-02-11 18:56:15 UTC

File the following as not exceedingly useful.

Y, I'd want the same behavior that you expected.

With a test doc with one paragraph of Arabic and one paragraph of English, I was not able to see any difference in the paragraph properties or in the run properties between those two paragraphs and their runs.

In the MS-DOC spec, p. 275, there should be a specification for LtrPara/RtlPara or LtrRun/RtlRun in the FCI enumeration, which is stored in the CidFci. I don't think we're currently extracting these command fields... could be wrong, though.
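
As a possible stopgap while the paragraph properties do not expose the direction, the direction could be guessed from the paragraph text itself (for example the string returned by paragraph.text()). The following is only a minimal sketch using java.lang.Character from the JDK; the class name ParagraphDirectionGuess and the method looksRtl are hypothetical and not part of POI, and the "first strong character wins" rule is an assumption borrowed from the usual bidi convention, not something read from the .doc file.

```java
public class ParagraphDirectionGuess {

    /**
     * Guesses the paragraph direction from its text: returns true if the
     * first strongly directional character is Hebrew or Arabic script.
     * Hypothetical helper, not a POI API.
     */
    public static boolean looksRtl(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            byte dir = Character.getDirectionality(cp);
            if (dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
            if (dir == Character.DIRECTIONALITY_LEFT_TO_RIGHT) {
                return false;
            }
            // Digits, punctuation and whitespace are not strong; keep scanning.
            i += Character.charCount(cp);
        }
        // No strong character found; assume LTR.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(looksRtl("\u0645\u0631\u062D\u0628\u0627")); // Arabic sample
        System.out.println(looksRtl("Hello"));                          // English sample
    }
}
```

This only approximates what the document stores: a paragraph explicitly set to RTL in Word but starting with Latin text would be reported as LTR, so it is no substitute for extracting the real property.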