Bug 53556 - Mispositioned Textboxes In Reading Doc Files Through HWPF
Status: RESOLVED WONTFIX
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.8-FINAL
Hardware: PC Linux
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords: APIBug
Depends on:
Blocks:
 
Reported: 2012-07-17 07:40 UTC by Vipul Kumar
Modified: 2017-09-11 19:36 UTC

Attachments
This is the document which I was unable to read properly. (53.00 KB, application/msword)
2012-07-17 07:40 UTC, Vipul Kumar

Description Vipul Kumar 2012-07-17 07:40:03 UTC
Created attachment 29070
This is the document which I was unable to read properly.

I tried reading .doc and .docx files using Apache POI 3.8. It worked fine until I encountered textboxes.

If the format of the document is like this: 
paragraph 1 
textbox 1 
paragraph 2 
textbox 2 
paragraph 3 

Then the output should be: 
paragraph 1 textbox 1 paragraph 2 textbox 2 paragraph 3 
But HWPF reads such .doc file as: 
paragraph 1 paragraph 2 paragraph 3 textbox 1 textbox 2 

HWPF seems to append the textbox content at the end of the output rather than where it belongs, i.e. between the paragraphs.

In the case of .docx files, XWPF didn't read the textboxes at all.

I tried the methods getText(), getTextFromPieces(), extractText() and getParagraphText(), but none of them helped.
Comment 1 Sergey Vladimirov 2012-11-06 16:42:33 UTC
Vipul,

Textboxes are graphical objects. Currently POI is unable to detect the exact place where a textbox sits on the page. Another problem: a textbox can be anchored to the page (not to a paragraph), and in that case there is no way to find the position in the text at which to insert the textbox content without rendering the page (which POI doesn't do).

Patches that detect the textbox anchor position (the exact point at which to insert the textbox content into the document text) are always welcome.
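
To make the difference between the two orderings concrete, here is a minimal plain-Java sketch. It does not use the POI API: the TextBox class, the anchorParagraph index and both method names are hypothetical, and it simply assumes the anchor paragraph of each textbox is already known, which is exactly the part POI cannot determine today. Page-anchored textboxes have no position in the text, so the sketch falls back to appending them at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TextboxOrderSketch {

    /** Hypothetical marker for a textbox anchored to the page rather than to a paragraph. */
    static final int PAGE_ANCHORED = -1;

    /** Hypothetical model: textbox text plus the index of the paragraph it is anchored to. */
    static final class TextBox {
        final String text;
        final int anchorParagraph;

        TextBox(String text, int anchorParagraph) {
            this.text = text;
            this.anchorParagraph = anchorParagraph;
        }
    }

    /** Ordering reported in this bug: all paragraphs first, then all textboxes. */
    static List<String> appendAtEnd(List<String> paragraphs, List<TextBox> boxes) {
        List<String> out = new ArrayList<>(paragraphs);
        for (TextBox box : boxes) {
            out.add(box.text);
        }
        return out;
    }

    /** Requested ordering: each textbox directly follows its anchor paragraph. */
    static List<String> insertAtAnchors(List<String> paragraphs, List<TextBox> boxes) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < paragraphs.size(); i++) {
            out.add(paragraphs.get(i));
            for (TextBox box : boxes) {
                if (box.anchorParagraph == i) {
                    out.add(box.text);
                }
            }
        }
        // Textboxes without a usable paragraph anchor cannot be placed; append them last.
        for (TextBox box : boxes) {
            if (box.anchorParagraph < 0 || box.anchorParagraph >= paragraphs.size()) {
                out.add(box.text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> paragraphs = Arrays.asList("paragraph 1", "paragraph 2", "paragraph 3");
        List<TextBox> boxes = Arrays.asList(
                new TextBox("textbox 1", 0),
                new TextBox("textbox 2", 1));

        // prints: paragraph 1 paragraph 2 paragraph 3 textbox 1 textbox 2
        System.out.println(String.join(" ", appendAtEnd(paragraphs, boxes)));
        // prints: paragraph 1 textbox 1 paragraph 2 textbox 2 paragraph 3
        System.out.println(String.join(" ", insertAtAnchors(paragraphs, boxes)));
    }
}
```

A textbox created with PAGE_ANCHORED as its anchor ends up after the last paragraph in both methods, which mirrors the case described above where no insertion point can be derived without rendering.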
Comment 2 Dominik Stadler 2017-09-11 19:36:23 UTC
There has been no update on this for a very long time and, as explained above, it is very hard to get this right for all cases. Therefore I am closing this as WONTFIX. Please open a new bug if there are any contributions in this area.