Apache OpenOffice (AOO) Bugzilla – Issue 117672
Opening OOXML files fails when the Relationship "Target" attribute in _rels/.rels has superfluous slashes
Last modified: 2023-02-07 17:55:11 UTC
When I try to open the attached spreadsheet, OpenOffice fails silently. The file is generated by a .NET open source component, EPPlus, but opens without problem in Microsoft Excel. If I save the file in Excel, the file will open in OpenOffice. So probably there is an issue with EPPlus as well. The issue I have with OpenOffice is that there is no error report, and no log that I can find.
Created attachment 76260: File created with EPPlus
getting rid of value "enhancement" for field "severity". For enhancement the field "issue type" shall be used.
This defect exists in OO 3.4.1.
Still exists in the latest Git. The content type passed to this method in main/oox/source/core/filterdetect.cxx:

OUString FilterDetectDocHandler::getFilterNameFromContentType( const OUString& rContentType ) const

is always "application/xml", while we are expecting "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" and the like. Where is it being passed from?

--snip--
#0  oox::core::FilterDetectDocHandler::getFilterNameFromContentType(rtl::OUString const&) const (this=<optimized out>, rContentType=...) at source/core/filterdetect.cxx:166
#1  0x000000080e7109d8 in oox::core::FilterDetectDocHandler::parseContentTypesDefault(oox::AttributeList const&) (this=this@entry=0x80ae71f80, rAttribs=...) at source/core/filterdetect.cxx:205
#2  0x000000080e710787 in oox::core::FilterDetectDocHandler::startFastElement(int, com::sun::star::uno::Reference<com::sun::star::xml::sax::XFastAttributeList> const&) (this=0x80ae71f80, nElement=<optimized out>, rAttribs=<optimized out>) at source/core/filterdetect.cxx:100
--snip--

So this:

---snip---
 95         // cases for [Content_Types].xml
 96         case PC_TOKEN( Types ):
 97         break;
 98         case PC_TOKEN( Default ):
 99             if( !maContextStack.empty() && (maContextStack.back() == PC_TOKEN( Types )) )
100                 parseContentTypesDefault( aAttribs );
101         break;
102         case PC_TOKEN( Override ):
103             if( !maContextStack.empty() && (maContextStack.back() == PC_TOKEN( Types )) )
104                 parseContentTypesOverride( aAttribs );
105         break;
---snip---

calls this:

---snip---
196 void FilterDetectDocHandler::parseContentTypesDefault( const AttributeList& rAttribs )
197 {
198     // only if no overridden part name found
199     if( mrFilterName.getLength() == 0 )
200     {
201         // check if target path ends with extension
202         OUString aExtension = rAttribs.getString( XML_Extension, OUString() );
203         sal_Int32 nExtPos = maTargetPath.getLength() - aExtension.getLength();
204         if( (nExtPos > 0) && (maTargetPath[ nExtPos - 1 ] == '.') && maTargetPath.match( aExtension, nExtPos ) )
205             mrFilterName = getFilterNameFromContentType( rAttribs.getString( XML_ContentType, OUString() ) );
206     }
207 }
---snip---

So the "application/xml" being passed to getFilterNameFromContentType() comes from the "ContentType" attribute of the "Default" element in [Content_Types].xml.

But why is "Default" being used and not "Override" on line 102? Because this method always fails:

---snip---
209 void FilterDetectDocHandler::parseContentTypesOverride( const AttributeList& rAttribs )
210 {
211     if( rAttribs.getString( XML_PartName, OUString() ).equals( maTargetPath ) )
212         mrFilterName = getFilterNameFromContentType( rAttribs.getString( XML_ContentType, OUString() ) );
213 }
---snip---

because "maTargetPath" is "//xl/workbook.xml" instead of "/xl/workbook.xml", so it never matches the "PartName" attribute of "/xl/workbook.xml". That happens because a leading slash was prefixed when it was initialized by this method:

---snip---
157 void FilterDetectDocHandler::parseRelationship( const AttributeList& rAttribs )
158 {
159     OUString aType = rAttribs.getString( XML_Type, OUString() );
160     if( aType.equalsAscii( "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" ) )
161         maTargetPath = OUString( sal_Unicode( '/' ) ) + rAttribs.getString( XML_Target, OUString() );
162 }
---snip---

and the bad file's _rels/.rels contained a "Target" attribute that already had a leading slash:

<Relationship Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="/xl/workbook.xml" Id="Rfd55d326fad84da6" />

In normal XLSX files, "Target" never has a leading slash. I don't think this file is valid, but Excel apparently accepts it, so we should too.

As a workaround, we can check for a leading slash and not prefix another if it is already present. A patch that does that gets the file to open successfully :-).
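For illustration, the workaround described above could look roughly like the sketch below. It is not the actual patch: it uses std::string instead of OUString so that it compiles on its own, and the helper name makeTargetPath is hypothetical.

```cpp
#include <string>

// Hypothetical helper, NOT the actual AOO patch. It builds the absolute
// part path from a Relationship "Target" value and adds a leading slash
// only when the value does not already start with one.
std::string makeTargetPath(const std::string& target)
{
    // Target="/xl/workbook.xml" already starts with a slash: keep it as-is
    // so the result still matches the Override "PartName" attribute.
    if (!target.empty() && target.front() == '/')
        return target;

    // Target="xl/workbook.xml" (the usual case): prefix exactly one slash.
    return "/" + target;
}
```

With this check, both "xl/workbook.xml" and "/xl/workbook.xml" yield "/xl/workbook.xml", so the comparison in parseContentTypesOverride() would succeed for either form. It handles a single leading slash only.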
Testing against Excel and LO, however, shows they go even further than allowing one leading slash in the _rels/.rels Relationship "Target" attribute and the [Content_Types].xml Override "PartName" attribute.

<Relationship Target="///xl/workbook.xml" ...>
<Override PartName="/xl/workbook.xml" ...>
Excel and LO open successfully, no warnings.

<Relationship Target="///xl///workbook.xml" ...>
<Override PartName="/xl/workbook.xml" ...>
Excel opens successfully, no warnings. LO offers to repair the file.

<Relationship Target="///xl///workbook.xml" ...>
<Override PartName="//xl/workbook.xml" ...>
Both Excel and LO offer to repair the file.

So at least in Relationship "Target" we should allow multiple leading slashes, and multiple slashes anywhere else where one slash would be ok. The Override element's "PartName" attribute is apparently stricter and we can get away with doing fewer compatibility fixes there.
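A minimal sketch of the more lenient "Target" handling suggested by these tests: collapse every run of slashes into one and make sure the result starts with exactly one slash. The function name normalizeTargetPath is hypothetical and std::string stands in for OUString; the committed fix may be structured differently.

```cpp
#include <string>

// Hypothetical helper, NOT the committed fix. It normalizes a Relationship
// "Target" value by collapsing runs of '/' into a single '/' and ensuring
// the result begins with exactly one leading slash.
std::string normalizeTargetPath(const std::string& target)
{
    std::string result = "/";      // the path always starts with one slash
    for (char c : target)
    {
        // Skip a slash that directly follows another slash.
        if (c == '/' && result.back() == '/')
            continue;
        result.push_back(c);
    }
    return result;
}
```

Under this sketch, "///xl///workbook.xml" becomes "/xl/workbook.xml", which matches an Override "PartName" of "/xl/workbook.xml". The "PartName" side is deliberately left untouched, in line with the stricter behavior observed above.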
Even when I allow superfluous slashes in filterdetect.cxx, loading the files with multiple slashes still breaks with "General I/O error" somewhere else. Maybe main/package also has to change.
*** Issue 126570 has been marked as a duplicate of this issue. ***
I've now found the other place that needed patching and have fixed it too, and pushed the commit to trunk. So this bug is fixed by the commit below. Resolving FIXED. Thank you for your bug report and sample file :-).

Oh, and we now open this test case successfully too:

<Relationship Target="///xl///workbook.xml" ...>
<Override PartName="/xl/workbook.xml" ...>
Excel opens successfully, no warnings. LO offers to repair the file.

commit 3ff2b12a82734e8b46c6f7693a7e1b8eef8ada96
Author: Damjan Jovanovic <damjan@apache.org>
Date: Sat Jan 7 20:25:36 2023 +0200

    Allow the XLSX Relationship "Target" attribute in _rels/.rels to have superfluous slashes.
Resolving FIXED, and removing "interop_OOXML" as this is not standard OOXML but Excel-specific behavior.
Cherry-picked for AOO42X with: https://github.com/apache/openoffice/commit/46a6857e37cac5bd8a2a5f8798fd0ddd334adf81
This affects Writer too, so updating summary.
*** Issue 116755 has been marked as a duplicate of this issue. ***
*** Issue 124266 has been marked as a duplicate of this issue. ***
*** Issue 97242 has been marked as a duplicate of this issue. ***
Cherry-picked for AOO41X in commit 74eb9fee553bfc2739c7523bac1b787b6ee509f7.