On TIKA-2125, Seva Alekseyev provided a document that contains "odd" as the footer type.

According to ECMA[1] (labeled-page 727, pdf-page 737), there can be three types of footers: first, odd and even. However, the way to encode those is "first", "default" and "even", and the xsd makes this clear.

<w:sectPr>
  …
  <w:footerReference r:id="rId6" w:type="first" />
  <w:footerReference r:id="rId7" w:type="default" />
  <w:footerReference r:id="rId10" w:type="even" />
  …
</w:sectPr>

So the file submitted on TIKA-2125 is out of compliance. Do we want to add special handling to convert "odd" to "default"?

[1] http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-376,%20Fourth%20Edition,%20Part%201%20-%20Fundamentals%20And%20Markup%20Language%20Reference.zip
I would say that depends on what Microsoft Word does. Can you attach the sample?
Created attachment 34412: triggering doc from TIKA-2125
Word seems to handle it without complaining.
Looks that way. I suspect it is just treating everything that is not FIRST or EVEN as DEFAULT. I am not sure we need to convert anything; we just need to handle it. This document does not look like it was created by MS Word, since Word writes "default" and "even" (at least as of version 2016), although it does accept "odd".
r1767353 Agreed. This is non-standard/invalid. I added a try/catch block, and I applied "default" in the catch block. Thank you!
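For illustration only, the fallback discussed in this thread could look roughly like the sketch below. It is not the code committed in r1767353. The names FooterTypeFallback, FooterType and parse are hypothetical, and the null handling and case-insensitive matching are assumptions added for the example.

```java
import java.util.Locale;

public class FooterTypeFallback {
    /** Hypothetical stand-in for the three values the schema allows for w:type. */
    public enum FooterType { FIRST, DEFAULT, EVEN }

    /**
     * Maps a raw w:type attribute value to a FooterType.
     * Anything that is not a recognised value (for example "odd") becomes DEFAULT.
     */
    public static FooterType parse(String rawType) {
        if (rawType == null) {
            // Assumption: a missing attribute is also treated as the default footer.
            return FooterType.DEFAULT;
        }
        try {
            return FooterType.valueOf(rawType.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Non-standard value: tolerate it rather than fail on the whole document.
            return FooterType.DEFAULT;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("first")); // prints FIRST
        System.out.println(parse("odd"));   // prints DEFAULT
    }
}
```

This handles the invalid value without rewriting it in the document, which matches the "just handle it" suggestion above.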