Apache OpenOffice (AOO) Bugzilla – Issue 76465
RTF to ODT, then cannot open ODT
Last modified: 2017-05-20 11:13:12 UTC
Open the sample file with OOo Writer 2.2, then save as ODT. The created ODT cannot be opened: Writer shows an error ("error reading file").
Created attachment 44499 [details] sample RTF
Created attachment 44500 [details] saved as ODT
MRU->HBRINKM: can confirm this. Open the attached RTF, save as odt and reopen -> "Error reading file".
Created attachment 50796 [details] More Specific Test Case
Created attachment 50797 [details] Odt version of second test case
See http://user.services.openoffice.org/en/forum/viewtopic.php?f=5&t=1532 for background. This is the same issue, but the title is actually wrong: it describes a symptom rather than the underlying bug discussed below.

One of the most common denial-of-service issues that we see on both the user.services forum and OOoForums is users posting “I suddenly can't read my (usually ODT) file and I have now lost all my work. What do I do to get it back?” However, without hard test cases it hasn't been realistic to raise this as a hard issue. This time I took the effort to do a binary chop on the content.xml to isolate the troublesome tags.

In all three cases, the problem was caused by a style:text-position attribute within a style:text-properties tag, which sets the vertical position of the text on the line. These all conformed to the ODF spec. The issue is that whilst MS Word allows vertical offsets > 1 line, in Writer these are limited in the GUI to a maximum of +/- 100%.

I've had a look at the code for the XML exporter and importer. It seems to be using a standard framework which is generated from the XML DTD, with a whole load of stubs to do the filling in so that the internal structures can be mapped to XML and vice versa. The issue is that the outbound validation is a lot more lax than the inbound. (After all, why bother validating the outbound? It's valid already, isn't it?) Well, this actually breaks a pretty basic design principle for such converters, because if there is any logic path which results in the internal state being inconsistent with the input validation parameters, you can still successfully save your document, thereby overwriting a valid document with an unloadable one. This is at *least* a P2 error.

I've created a minimal RTF which replicates the Topic 1532 case. Here the problem style is T2. The equivalent tags in Attachment 2 [details] ODT are P262 and T5. Set all 3 to “-100% 100%” and the docs will load.

This is not an RTF error.
RTF is purely the access path to load the rich text, bypassing XML input validation. There are others: open the Attachment 3 [details] RTF, then select all and copy. Now open any large ODT and paste the clipboard. Save, close and try to reopen: bang, you have now lost your precious document. The poster on topic 1532 mentioned that the user was pasting content from PPTs (opened in Calc) to create this failure.

In general, the whole concept of aborting file loads because a parameter is out of bounds is flawed. At a minimum there should be a load option to enable demotion of such errors to a warning, or a dialogue to the effect that “This document contains formatting that may be lost in OpenOffice. Click Yes to continue loading”. That way at least the user might have the odd height position clipped rather than losing access to the whole document.
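
To illustrate the "clip with a warning instead of aborting" idea, here is a minimal sketch in Python. It is not the actual OpenOffice filter code. It assumes the +/- 100% limit described in this issue, and that the first token of style:text-position is the vertical offset ("super", "sub" or a percentage) with an optional second token for the font height. The names clamp_text_position and clamp_in_tree are hypothetical, made up for this example.

```python
import re
import warnings
import xml.etree.ElementTree as ET

# Assumption taken from this issue: Writer accepts offsets within +/-100%.
MAX_OFFSET_PERCENT = 100.0

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
ATTR = f"{{{STYLE_NS}}}text-position"

_PERCENT = re.compile(r"^(-?\d+(?:\.\d+)?)%$")


def clamp_text_position(value, limit=MAX_OFFSET_PERCENT):
    """Return a text-position value whose offset lies within +/-limit.

    Only a percentage offset (first token) is touched; the optional
    font-height token is passed through unchanged.
    """
    tokens = value.split()
    if not tokens:
        return value
    match = _PERCENT.match(tokens[0])
    if match is None:
        # "super", "sub", or something this sketch does not parse: keep it.
        return value
    offset = float(match.group(1))
    clamped = max(-limit, min(limit, offset))
    if clamped != offset:
        # Demote the out-of-range value to a warning instead of failing.
        warnings.warn(f"text-position offset {offset:g}% clipped to {clamped:g}%")
    tokens[0] = f"{clamped:g}%"
    return " ".join(tokens)


def clamp_in_tree(root):
    """Clamp every style:text-position under root; return how many were rewritten."""
    rewritten = 0
    for elem in root.iter(f"{{{STYLE_NS}}}text-properties"):
        old = elem.get(ATTR)
        if old is None:
            continue
        new = clamp_text_position(old)
        if new != old:
            elem.set(ATTR, new)
            rewritten += 1
    return rewritten
```

The same clamp could in principle be applied on export as well, so that a document that saves successfully is also one that loads again. Where the real importer and exporter would need to hook this in is not something this sketch tries to answer.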
target 3.0
retargeted due to lack of resources
Henning, as this obviously is not an RTF import problem, perhaps OD should take over?
No more a blocker for 3.1
Reset assignee to the default "issues@openoffice.apache.org".