Issue 117672 - Opening OOXML files fails when the Relationship "Target" attribute in _rels/.rels has superfluous slashes
Status: RESOLVED FIXED
Alias: None
Product: Calc
Classification: Application
Component: open-import
Version: OOo 3.3
Hardware: All All
Importance: P3 Normal
Target Milestone: 4.1.14
Assignee: AOO issues mailing list
QA Contact: Rajashree
URL:
Keywords: ms_interoperability
Duplicates: 97242 116755 124266 126570
Depends on:
Blocks:
 
Reported: 2011-04-03 13:07 UTC by bk
Modified: 2023-02-07 17:55 UTC
CC List: 7 users

See Also:
Issue Type: ENHANCEMENT
Latest Confirmation in: 4.2.0-dev
Developer Difficulty: ---


Attachments
File created with EPPlus (7.37 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2011-04-03 13:10 UTC, bk

Description bk 2011-04-03 13:07:43 UTC
When I try to open the attached spreadsheet, OpenOffice fails silently.
The file is generated by a .NET open source component, EPPlus, but opens without problem in Microsoft Excel.

If I save the file in Excel, the file will open in OpenOffice. So probably there is an issue with EPPlus as well.

The issue I have with OpenOffice is that there is no error report, and no log that I can find.
Comment 1 bk 2011-04-03 13:10:53 UTC
Created attachment 76260
File created with EPPlus
Comment 2 Oliver-Rainer Wittmann 2012-06-13 12:25:19 UTC
getting rid of value "enhancement" for field "severity".
For enhancement the field "issue type" shall be used.
Comment 3 Rajashree 2013-01-25 17:41:32 UTC
This defect exists in OO 3.4.1.
Comment 4 damjan 2023-01-07 12:12:56 UTC
Still exists in the latest Git.

The content type passed to this method in main/oox/source/core/filterdetect.cxx

OUString FilterDetectDocHandler::getFilterNameFromContentType( const OUString& rContentType ) const

is always "application/xml", while we are expecting "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" and the like.


Where is it being passed from??

--snip--
#0  oox::core::FilterDetectDocHandler::getFilterNameFromContentType(rtl::OUString const&) const (this=<optimized out>, rContentType=...) at source/core/filterdetect.cxx:166
#1  0x000000080e7109d8 in oox::core::FilterDetectDocHandler::parseContentTypesDefault(oox::AttributeList const&) (this=this@entry=0x80ae71f80, rAttribs=...) at source/core/filterdetect.cxx:205
#2  0x000000080e710787 in oox::core::FilterDetectDocHandler::startFastElement(int, com::sun::star::uno::Reference<com::sun::star::xml::sax::XFastAttributeList> const&)
    (this=0x80ae71f80, nElement=<optimized out>, rAttribs=<optimized out>) at source/core/filterdetect.cxx:100
--snip--


So this:

---snip---
     95         // cases for [Content_Types].xml
     96         case PC_TOKEN( Types ):
     97         break;
     98         case PC_TOKEN( Default ):
     99             if( !maContextStack.empty() && (maContextStack.back() == PC_TOKEN( Types )) )
    100                 parseContentTypesDefault( aAttribs );
    101         break;
    102         case PC_TOKEN( Override ):
    103             if( !maContextStack.empty() && (maContextStack.back() == PC_TOKEN( Types )) )
    104                 parseContentTypesOverride( aAttribs );
    105         break;
---snip---

calls this:

---snip---
    196 void FilterDetectDocHandler::parseContentTypesDefault( const AttributeList& rAttribs )
    197 {
    198     // only if no overridden part name found
    199     if( mrFilterName.getLength() == 0 )
    200     {
    201         // check if target path ends with extension
    202         OUString aExtension = rAttribs.getString( XML_Extension, OUString() );
    203         sal_Int32 nExtPos = maTargetPath.getLength() - aExtension.getLength();
    204         if( (nExtPos > 0) && (maTargetPath[ nExtPos - 1 ] == '.') && maTargetPath.match( aExtension, nExtPos ) )
    205             mrFilterName = getFilterNameFromContentType( rAttribs.getString( XML_ContentType, OUString() ) );
    206     }
    207 }
---snip---

So the "application/xml" being passed to getFilterNameFromContentType() comes from the "ContentType" attribute of the "Default" element in [Content_Types].xml.
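For context, the general lookup rule for [Content_Types].xml is that an "Override" whose "PartName" matches the part exactly takes precedence, and a "Default" keyed by file extension is only the fallback. The following is a minimal sketch of that rule, not the filterdetect.cxx code: ContentTypes and resolveContentType are hypothetical names, std::string stands in for OUString, and extension matching is simplified to an exact, case-sensitive comparison.

```cpp
#include <map>
#include <string>

// Hypothetical container for the two kinds of entries in [Content_Types].xml.
struct ContentTypes
{
    std::map<std::string, std::string> defaults;   // Extension -> ContentType
    std::map<std::string, std::string> overrides;  // PartName  -> ContentType
};

// Hypothetical helper: an exact PartName match in an Override wins,
// otherwise fall back to the Default registered for the file extension.
std::string resolveContentType(const ContentTypes& types, const std::string& partName)
{
    auto ov = types.overrides.find(partName);
    if (ov != types.overrides.end())
        return ov->second;

    std::string::size_type dot = partName.rfind('.');
    if (dot != std::string::npos)
    {
        auto def = types.defaults.find(partName.substr(dot + 1));
        if (def != types.defaults.end())
            return def->second;
    }
    return std::string();  // no content type known for this part
}
```

With a Default for "xml" and an Override for "/xl/workbook.xml", a lookup of exactly "/xl/workbook.xml" yields the Override's content type, while any part name that does not match the Override character for character only gets the generic Default.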

But why is "Default" being used and not "Override" on line 102?

Because this method always fails:

---snip---
    209 void FilterDetectDocHandler::parseContentTypesOverride( const AttributeList& rAttribs )
    210 {
    211     if( rAttribs.getString( XML_PartName, OUString() ).equals( maTargetPath ) )
    212         mrFilterName = getFilterNameFromContentType( rAttribs.getString( XML_ContentType, OUString() ) );
    213 }
---snip---

because "maTargetPath" is "//xl/workbook.xml" instead of "/xl/workbook.xml" and so it never matches the "PartName" attribute of "/xl/workbook.xml",

because when it got initialized by this method, a leading slash was prefixed:

---snip---
    157 void FilterDetectDocHandler::parseRelationship( const AttributeList& rAttribs )
    158 {
    159     OUString aType = rAttribs.getString( XML_Type, OUString() );
    160     if( aType.equalsAscii( "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" ) )
    161         maTargetPath = OUString( sal_Unicode( '/' ) ) + rAttribs.getString( XML_Target, OUString() );
    162 }
---snip---

and the bad file's _rels/.rels contained a "Target" attribute that had a leading slash already:

<Relationship
  Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
  Target="/xl/workbook.xml"
  Id="Rfd55d326fad84da6"
/>

In normal XLSX files, "Target" never has a leading slash.

I don't think this file is valid, but Excel apparently accepts it, so we should too. As a workaround, we can check for a leading slash and not prefix another if it is already present.

A patch that does that gets the file to open successfully :-).
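
The workaround described above could look roughly like the following. This is only a sketch under assumptions, not the code from the actual patch: makeTargetPath is a hypothetical helper, and std::string is used in place of OUString so the example is self-contained.

```cpp
#include <string>

// Hypothetical helper: build the absolute part path from a Relationship
// "Target" value, prefixing a slash only if the target lacks one.
std::string makeTargetPath(const std::string& target)
{
    if (!target.empty() && target.front() == '/')
        return target;       // already absolute, do not prefix another slash
    return "/" + target;     // the usual case: relative target like "xl/workbook.xml"
}
```

Both "xl/workbook.xml" and "/xl/workbook.xml" then produce "/xl/workbook.xml", which matches the Override "PartName".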
Comment 5 damjan 2023-01-07 12:57:59 UTC
Testing against Excel and LO, however, shows they go even further than allowing one leading slash in the _rels/.rels Relationship "Target" attribute and the [Content_Types].xml Override "PartName" attribute.

<Relationship Target="///xl/workbook.xml" ...>
<Override PartName="/xl/workbook.xml" ...>
Excel and LO open successfully, no warnings.

<Relationship Target="///xl///workbook.xml" ...>
<Override PartName="/xl/workbook.xml" ...>
Excel opens successfully, no warnings. LO offers to repair the file.

<Relationship Target="///xl///workbook.xml" ...>
<Override PartName="//xl/workbook.xml" ...>
Both Excel and LO offer to repair the file.

So at least in Relationship "Target" we should allow multiple leading slashes, and multiple slashes anywhere else where one slash would be ok. The Override element's "PartName" attribute is apparently stricter and we can get away with doing fewer compatibility fixes there.
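
One way to tolerate superfluous slashes in "Target" is to collapse every run of slashes into a single one while building the path. The following is a rough sketch of that idea only, assuming std::string instead of OUString; normalizeTargetPath is a hypothetical name and this is not necessarily how the eventual fix is written.

```cpp
#include <string>

// Hypothetical helper: return the target with exactly one leading slash
// and every run of consecutive slashes collapsed into a single slash.
std::string normalizeTargetPath(const std::string& target)
{
    std::string result = "/";          // always start with one leading slash
    for (char c : target)
    {
        if (c == '/' && result.back() == '/')
            continue;                  // skip a slash that directly follows a slash
        result.push_back(c);
    }
    return result;
}
```

With this, "xl/workbook.xml", "///xl/workbook.xml" and "///xl///workbook.xml" all normalize to "/xl/workbook.xml". The Override "PartName" is deliberately left untouched here, since the tests above suggest it is treated more strictly.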
Comment 6 damjan 2023-01-07 18:28:39 UTC
Even when I allow superfluous slashes in filterdetect.cxx, loading the files with multiple slashes still breaks with "General I/O error" somewhere else. Maybe main/package also has to change.
Comment 7 damjan 2023-01-08 09:48:21 UTC
*** Issue 126570 has been marked as a duplicate of this issue. ***
Comment 8 damjan 2023-01-08 13:08:36 UTC
I've now found the other place that needed patching and have fixed it too, and pushed the commit to trunk.

So this bug is fixed by the commit below. Resolving FIXED.

Thank you for your bug report and sample file :-).

Oh and we now open this test case successfully too:
<Relationship Target="///xl///workbook.xml" ...>
<Override PartName="/xl/workbook.xml" ...>
Excel opens successfully, no warnings. LO offers to repair the file.




commit 3ff2b12a82734e8b46c6f7693a7e1b8eef8ada96
Author: Damjan Jovanovic <damjan@apache.org>
Date:   Sat Jan 7 20:25:36 2023 +0200

    Allow the XLSX Relationship "Target" attribute in _rels/.rels to have superfluous slashes.
Comment 9 damjan 2023-01-08 13:10:02 UTC
Resolving FIXED, and removing "interop_OOXML" as this is not standard OOXML but Excel-specific behavior.
Comment 10 Matthias Seidel 2023-01-09 15:52:37 UTC
Cherry-picked for AOO42X with:
https://github.com/apache/openoffice/commit/46a6857e37cac5bd8a2a5f8798fd0ddd334adf81
Comment 11 damjan 2023-01-13 15:24:30 UTC
This affects Writer too, so updating summary.
Comment 12 damjan 2023-01-13 15:25:20 UTC
*** Issue 116755 has been marked as a duplicate of this issue. ***
Comment 13 damjan 2023-01-13 15:36:42 UTC
*** Issue 124266 has been marked as a duplicate of this issue. ***
Comment 14 damjan 2023-02-01 22:48:55 UTC
*** Issue 97242 has been marked as a duplicate of this issue. ***
Comment 15 damjan 2023-02-07 17:42:46 UTC
Cherry-picked for AOO41X in commit 74eb9fee553bfc2739c7523bac1b787b6ee509f7.