Issue 128356 - Track Changes and Annotations on text range can cause corruption. Applies to 4.x (all versions?)
Status: CONFIRMED
Alias: None
Product: Writer
Classification: Application
Component: editing
Version: 4.1.7
Hardware: All All
Importance: P2 Critical with 2 votes
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact: Keith N. McKenna
URL:
Keywords: data_loss, ms_interoperability
Depends on:
Blocks:
 
Reported: 2020-04-03 18:03 UTC by roryof
Modified: 2020-05-14 19:32 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
file showing the annotation (25.04 KB, application/vnd.oasis.opendocument.text)
2020-04-03 18:03 UTC, roryof
Broken_File.odt gives Read Error on opening (113.34 KB, application/vnd.oasis.opendocument.text)
2020-05-14 19:32 UTC, John

Description roryof 2020-04-03 18:03:59 UTC
Created attachment 86921
file showing the annotation

When Track Changes is enabled and an Annotation is attached to a text range, the file is often corrupted when it is reopened.  This manifests as two Annotation reference numbers attached to paragraph format P1, which OpenOffice reports as a SaxParse error.  Removing either one of these Annotation reference numbers allows the file to open correctly.

  We can now replicate the fault reliably: it seems to happen when "text containing two comments attached to a range of characters" is deleted.  You can see it happen with comments.odt, which I created for this bug report.
  Open comments.odt
  Set Edit > Changes > Record if not already set
  Highlight sentences one and two.
  Press delete.
  Save.
  Close and reopen comments.odt
  
  Expected behaviour:  File opens correctly
  
  Actual behaviour:  File does not open and gives "Format error discovered in the file in sub-document content.xml at 2,2778."
  
  Examination of content.xml shows that the first paragraph style definition (P1) has been corrupted by the addition of office:name="__Annotation__2_10671659881" office:name="__Annotation__3_10671659881".
  
  Notes
  
  1.  You must set Edit > Changes > Record.  If it is not set, the error does not occur.
  
  2. Deleting only one sentence does not cause the error.
  
  3.  Deleting each comment by the Delete comment command within the comment does not cause the problem.  Note that the range of characters is no longer highlighted after the comment has been deleted. 
  
  4. If the comments are at a location in the text, and not attached to a range of characters, the error does not occur.
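
  The corruption described above (one attribute name repeated inside a single start tag) can be checked for without opening the file in OpenOffice.  The following is a minimal Python sketch and not part of the original report: the names find_duplicate_attributes and scan_odt are illustrative, the regular expressions assume plain start tags (comments and CDATA sections are not skipped), and the sub-documents are assumed to be UTF-8.

```python
import re
import zipfile
from collections import Counter

# One start tag or empty-element tag: group 1 is the element name,
# group 2 is the raw text of all its attributes.
TAG_RE = re.compile(
    r'<([A-Za-z_][\w:.-]*)((?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|\'[^\']*\'))*)\s*/?>'
)
# One attribute inside that raw text: group 1 is the attribute name.
ATTR_RE = re.compile(r'([\w:.-]+)\s*=\s*(?:"[^"]*"|\'[^\']*\')')


def find_duplicate_attributes(xml_text):
    """Return (element, attribute, count) for each attribute repeated in one tag."""
    hits = []
    for tag in TAG_RE.finditer(xml_text):
        counts = Counter(ATTR_RE.findall(tag.group(2)))
        hits.extend((tag.group(1), name, n) for name, n in counts.items() if n > 1)
    return hits


def scan_odt(path, members=("content.xml", "styles.xml")):
    """Scan the named sub-documents of an ODF package (a zip archive).

    `path` is whatever local file you want to check; members that are
    missing from the package are skipped.
    """
    with zipfile.ZipFile(path) as package:
        present = set(package.namelist())
        return {
            member: find_duplicate_attributes(package.read(member).decode("utf-8"))
            for member in members
            if member in present
        }
```

  If a saved copy of comments.odt shows the fault described here, scan_odt("comments.odt") would be expected to list office:name under content.xml.  An empty list for every member means no repeated attribute was found.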
Comment 1 John 2020-04-03 20:36:55 UTC
1.  Minor correction - AOO does not give a SAXParse error.  AOO gives a "Read Error:  Format error discovered ... at n,nnnn (row,col)"

2.  See Issue 127745 - Read Error: Format error discovered ... at n,nnnn (row,col) which appears to be related.
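
The row and column in the Read Error can be cross-checked outside AOO with any XML parser that reports positions.  Below is a minimal sketch using Python's standard xml.parsers.expat module.  The helper name is illustrative, and it is an assumption that expat's position lines up with the row,col that AOO prints (expat numbers lines from 1 and columns from 0).

```python
import xml.parsers.expat


def first_wellformedness_error(xml_bytes):
    """Return (row, col, message) for the first well-formedness error, else None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # True marks this as the final (and only) chunk of the document.
        parser.Parse(xml_bytes, True)
    except xml.parsers.expat.ExpatError as err:
        # err.lineno is 1-based; err.offset is the 0-based column in that line.
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None
```

Feeding it the bytes of content.xml or styles.xml extracted from the .odt should report a duplicate attribute error when the file has the corruption described in this issue.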
Comment 2 Keith N. McKenna 2020-05-09 17:09:22 UTC
Confirmed using AOO 4.1.7 on Windows 10
Comment 3 John 2020-05-10 12:37:14 UTC
As this bug causes complete data loss / data corruption I think it should be more important than P5 = LOWEST.
Comment 4 John 2020-05-14 10:17:33 UTC
I have just repaired a user's file where styles.xml was corrupted with two office:name annotations.  The corruption was similarly in the first style definition in the file.

The file was full of comments, and I could replicate the problem by deleting a range of text that included two comments, each attached to a range of text, while Track Changes was ON.
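
A manual repair of this kind can be sketched in Python.  This is an illustration only, not the procedure actually used on the user's file and not the French forum utility: the function names are hypothetical, keeping the first office:name and dropping the later repeats is an arbitrary choice (the description says removing either one lets the file open), and the sub-documents are assumed to be UTF-8.  It does nothing about the cause of the duplicated annotation.

```python
import re
import zipfile

# One start tag or empty-element tag, including all of its attributes.
START_TAG_RE = re.compile(
    r'<[A-Za-z_][\w:.-]*(?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|\'[^\']*\'))*\s*/?>'
)


def drop_repeated_attribute(xml_text, attr="office:name"):
    """Within each start tag, keep the first `attr` and delete later repeats."""
    attr_re = re.compile(r'\s+' + re.escape(attr) + r'\s*=\s*(?:"[^"]*"|\'[^\']*\')')

    def fix_tag(tag_match):
        seen = False

        def keep_first(attr_match):
            nonlocal seen
            if seen:
                return ""  # a repeat: remove it together with its leading space
            seen = True
            return attr_match.group(0)

        return attr_re.sub(keep_first, tag_match.group(0))

    return START_TAG_RE.sub(fix_tag, xml_text)


def repair_odt(src_path, dst_path, members=("content.xml", "styles.xml")):
    """Copy an ODF package to dst_path, cleaning only the named sub-documents.

    Entries are written in their original order, and each one keeps the
    compression type recorded in its ZipInfo, so the mimetype entry keeps
    its position and compression setting.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            data = src.read(info)
            if info.filename in members:
                data = drop_repeated_attribute(data.decode("utf-8")).encode("utf-8")
            dst.writestr(info, data)
```

Always write to a new file (for example repair_odt("Broken.odt", "Repaired.odt"), both names hypothetical) and keep the original, since the comment that lost its reference may no longer be anchored to its text range.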
Comment 5 John 2020-05-14 19:25:36 UTC
I believe the corruption is in styles.xml because the reviewer had Record Changes set to ON and had made a change to a style.  Such recorded changes are stored in styles.xml.

This suggests that "deleting some text which includes two comments attached to a range of text while Record changes is ON" is an effect of the bug and not the cause.

I think investigation needs to focus on how and why the duplicated annotation gets written.

Note that Issue 127745 - Read Error: Format error discovered ... at n,nnnn (row,col) concluded it was AOO writing the file which caused the corruption.

I have attached Broken_File.odt, which has been anonymised but behaves as described.

It would help the forum greatly if a resolution of this bug could be applied to v4.2.  See OpenOffice Writer issue - Read-Error - Format error at https://forum.openoffice.org/en/forum/viewtopic.php?f=15&t=101969&p=492539#p492539.  

This bug has led the French forum to post a utility for repairing such corrupted files.  It has been downloaded 1,497 times in 8 months.  See Utilitaire de réparation de fichier ODF at https://forum.openoffice.org/fr/forum/viewtopic.php?f=26&t=60992
Comment 6 John 2020-05-14 19:32:01 UTC
Created attachment 86947
Broken_File.odt gives Read Error on opening

Broken_File.odt gives Read Error - Format error in styles.xml at 2,15035 due to repeated office:name annotations in the first style definition in styles.xml.

Note the file still has Record Changes set to ON and extensive changes and comments have been applied to the file.

As the file is a user's file the text has been anonymised.