Bug 54049 - [PATCH] XWPFWordExtractor does not extract footnote references
Alias: None
Product: POI
Classification: Unclassified
Component: XWPF
Version: 3.9-dev
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2012-10-24 19:31 UTC by Andrey Hihlovskiy
Modified: 2015-01-04 19:48 UTC

svn patch (1.91 KB, text/plain)
2012-10-24 19:31 UTC, Andrey Hihlovskiy

Description Andrey Hihlovskiy 2012-10-24 19:31:32 UTC
Created attachment 29508
svn patch


Suppose a source docx file contains text with a footnote. XWPFWordExtractor extracts the footnote text correctly, but the location of the footnote reference within the body text is lost.


extractor.getText() on "test-data/document/footnotes.docx" returns:
"Eto ochen prostoy text so snoskoy
[1:  snoska]"
In the document, the position just after the word "prostoy" contains a footnote reference, but the returned text contains no such reference.

Suggested improvement:

Insert a footnote reference into the output text, for example like this:
"Eto ochen prostoy[fnote:1] text so snoskoy
[1: snoska]"
Footnotes marked this way could be used for accurate rendering in HTML (or any other relevant format).

Solution: the attached patch contains a change to "XWPFRun.java" and a corresponding test in "TestXWPFWordExtractor.java".
Comment 1 Dominik Stadler 2015-01-04 19:48:00 UTC
This was already applied in r1492308 on 2013-06-12 via GitHub pull request #3, so it is included in POI 3.10 and newer releases. The only difference from the proposal is that the footnote reference is printed as "[footnoteRef:1]".
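
For anyone consuming this output downstream, here is a minimal sketch of how the marker could be turned into an HTML footnote link, as the description suggests. It assumes the extracted text uses the "[footnoteRef:N]" form mentioned above. The class FootnoteRefHtml and its methods are hypothetical helpers written for illustration and are not part of POI; the sketch works on a plain string and does not call any POI API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Apache POI.
final class FootnoteRefHtml {
    // Assumed marker format, taken from the comment above: "[footnoteRef:1]".
    private static final Pattern REF = Pattern.compile("\\[footnoteRef:(\\d+)\\]");

    // Escapes the characters that are significant in HTML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Replaces every footnote marker with a superscript link to an anchor "fnN".
    // The anchor naming scheme is an assumption of this sketch.
    static String toHtml(String extractedText) {
        Matcher m = REF.matcher(escape(extractedText));
        return m.replaceAll("<sup><a href=\"#fn$1\">$1</a></sup>");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("Eto ochen prostoy[footnoteRef:1] text so snoskoy"));
    }
}
```

Text without markers passes through unchanged apart from HTML escaping, so the helper can be applied to any extractor output. The footnote bodies (the "[1: snoska]" lines) would still need their own handling to produce the matching anchors.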