Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing
|Summary:||Lost images while editing a Writer .odt file - two scenarios|
|Component:||editing||Assignee:||AOO issues mailing list <issues>|
|Status:||UNCONFIRMED ---||QA Contact:|
|Priority:||P5 (lowest)||CC:||jes, john.ha24, oliver.brinzing, orcmid|
|Issue Type:||DEFECT||Latest Confirmation in:||---|
Description John 2016-05-15 15:31:22 UTC
Created attachment 85532 [details]
Files for Scenario 2 lost images

I am creating this new bug report to capture details about instances of lost images. Lost images is one of the problems identified in Issue 126846 - Analysis Task: Major Recurring Data/Operation Loss/Corruption Situations at https://bz.apache.org/ooo/show_bug.cgi?id=126846. I wrote the original analysis which caused the meta-task to be created.

I have now experienced three cases of lost images in a matter of weeks, and I have done some in-depth analyses of the cases which I believe may assist the developers in locating the problematic area of the code.

There appear to be two completely different scenarios for image loss:
Scenario 1 - the temporary image files get deleted
Scenario 2 - the temporary files do NOT get deleted, but Writer loses contact with them

Scenario 1 - the temporary image files get deleted

Writer stores all temporary files in the default temporary folder C:\Users\John\AppData\Local\Temp\sv1naht3.tmp\. Note that if there are three documents opened at the same time, then all three documents' temporary files are stored in that single temporary folder (making unique names critical).

For some reason, the temporary image files are NOT flagged as being open or owned by Writer, whereas the "as opened .odt file stored as a .tmp file" is flagged as being open. As this ...\Temp folder is a generic, system wide temporary folder, disk cleaning utilities will delete the temporary image files if they are run.

I have reproduced this scenario by opening a .odt file with images, flushing the images from memory, and deleting the temporary files - the images are then lost from the document. See the PDF uploaded to Issue 126846 - Analysis Task: Major Recurring Data/Operation Loss/Corruption Situations for a description.

There are two suggested fixes for this:
1 Do not use a system wide generic folder for temporary AOO files. Instead, use a temporary folder in the AOO profile.
C:\Users\John\AppData\Roaming\OpenOffice\4\user\temp\ is already present - why not use this location for AOO temporary files?
2 Flag ALL temporary files as being "Open" while they are placed in the temporary folder.

There are numerous reports of image loss in the forum and in bugzilla. LibreOffice also has reports of image loss - see Images disappeared in Writer at http://en.libreofficeforum.org/node/13346 where one poster has identified the problem as being caused by Auslogics Boostspeed deleting temporary files.

Scenario 2 - the temporary files do NOT get deleted, but Writer loses contact with them

In this scenario, the temporary image files are NOT deleted - they are still in the temporary folder, but the Writer document displays Read Error messages and the images are not displayed. This has now happened twice to me. I reported the first occurrence at https://bz.apache.org/ooo/show_bug.cgi?id=126869

Below is the second occurrence:

I had received the file Advert Blagdon May 2016.docx with a new advert. I opened my ads A9.odt file and began to edit it to reproduce the advert drafted in the .docx file. I have uploaded the .docx and the A9.odt file, and also the contents of the C:\Users\John\AppData\Roaming\OpenOffice\4\user\temp\sv1jk9x1.tmp\ folder after I lost images and saw the Read Error messages. Both .docx and .odt files were one page with some images. My Remove from memory was set to 1 minute to force images to flush quickly for diagnostic purposes.

I was editing ads A9.odt (uploaded and also see image) where the Zumba advert words had changed, and I needed to delete two images and add three new images (highlighted). The document is one page, where each advert is in a table cell. I had edited the words and deleted the two images no longer needed, and had added the three images shown.

Suddenly, without any warning, I saw Read Error messages for ALL THE OTHER images on the page EXCEPT for the three highlighted images which I had added.
I doubt it was because these were still in memory, as my Remove from memory was set to 1 minute.

I immediately went to C:\Users\John\AppData\Roaming\OpenOffice\4\user\temp\sv1jk9x1.tmp\ and took a copy of the entire folder. I have uploaded it here. I then saved the .odt file and re-opened it. As expected, the three outlined images were present but all the other images were missing. I did not get a Read Error message for the missing images - all trace of them was gone.

Analysis.

It appears that the "in-memory" document had somehow lost contact with the images which were and are still in the ...\Temp folder. Why has this happened? Has Writer lost the name of the images? The names are pseudo-random - could Writer be re-using names?

What causes Writer to display the Read Error message? I think this is a very good line of investigation - when and for what reason does Writer display this message? Does that reason give any hint as to what is happening? What does Writer do after it gets this error - it obviously does not save the image files which are still in the temporary folder - why not?

How can I "dump" the contents of the "in-memory" document to see why it has lost contact with the images?

Is it sensible for Writer to place all open documents' temporary files in the same folder? This has implications if the same random name gets used for images in two different documents. Would it be better to have one folder per open file?

Why has this never happened to me before, yet it has happened three times in one month? I edit a monthly magazine with Writer so my usage has not changed.

Note that the image showing the contents of the temp folder has the files sorted in date order. However, although it looks like the top set of images are for the .odt file and the bottom set are for the .docx file, this is not so. 6 women at bar and Barree Fitness were never in ads A9.odt. Similarly, pilates and photo of Lynne were never in the .docx file.
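The manual "take a copy of the entire folder" step above could be scripted so that the two scenarios are easier to tell apart. What follows is a minimal sketch, assuming Python 3 is available on the test machine; the function names `snapshot` and `diff_snapshots` are hypothetical helpers and are not part of AOO. If temporary files show up as removed between two snapshots, that would point towards Scenario 1; if nothing was removed but Writer still shows Read Error, that would be consistent with Scenario 2.

```python
import hashlib
from pathlib import Path


def snapshot(folder):
    """Map each file's path (relative to folder) to (size in bytes, SHA-256 digest)."""
    root = Path(folder)
    result = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        key = str(path.relative_to(root))
        try:
            data = path.read_bytes()
        except OSError:
            # The file is locked or vanished during the scan; record it as unreadable.
            result[key] = (None, None)
            continue
        result[key] = (len(data), hashlib.sha256(data).hexdigest())
    return result


def diff_snapshots(before, after):
    """Report which files were removed, added, or changed between two snapshots."""
    return {
        "removed": sorted(set(before) - set(after)),
        "added": sorted(set(after) - set(before)),
        "changed": sorted(
            name for name in set(before) & set(after) if before[name] != after[name]
        ),
    }
```

One possible use: call `snapshot()` on the sv....tmp folder once the images are visible, call it again as soon as the Read Error messages appear, and attach the output of `diff_snapshots()` to the report.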
Comment 1 John 2016-05-15 15:35:45 UTC
Created attachment 85533 [details]
ads A9.odt - saved after images were lost

Note that I cropped the Barree Fitness photo before inserting it into the .odt file. There is NO CROPPED TEMPORARY file, which suggests that the cropped image was still in memory and had never been written to the ...\Temp folder.
Comment 2 John 2016-05-15 15:37:00 UTC
Created attachment 85534 [details] ads A9 - after.odt This is what I was attempting to achieve when I lost all the other images apart from the last three images.
Comment 3 John 2016-05-15 15:41:48 UTC
Created attachment 85535 [details] Temporary folder contents - part 1 Split as maximum size is 1Mb
Comment 4 John 2016-05-15 15:44:16 UTC
Created attachment 85536 [details] temporary folder contents - part 2
Comment 5 John 2016-05-15 15:44:46 UTC
Created attachment 85537 [details] Temporary folder contents - part 3
Comment 6 orcmid 2016-05-15 17:22:26 UTC
(In reply to John from comment #0)
> Created attachment 85532 [details]
> Files for Scenario 2 lost images
>
> I am creating this new bug report to capture details about instances of lost
> images.
>
> Lost images is one of the problems identified in Issue 126846 - Analysis
> Task: Major Recurring Data/Operation Loss/Corruption Situations at
> https://bz.apache.org/ooo/show_bug.cgi?id=126846.

The original analysis of this specific case appears on Issue 126869 beginning with its comment 13, https://bz.apache.org/ooo/show_bug.cgi?id=126869#c13. There are also relevant attachments on Issue 126869.

The present issue continues refinement and continuation of that thread.
Comment 7 Oliver Brinzing 2016-05-15 17:35:19 UTC
> For some reason, the temporary image files are NOT flagged as being open or owned by Writer,

confirming, it's possible to delete the *.tmp file. but the file will be rewritten immediately (with a different name) if one starts editing the picture.
Comment 8 John 2016-05-15 18:49:17 UTC
> confirming, it's possible to delete the *.tmp file. but the file will be
> rewritten immediately (with a different name) if one starts editing the
> picture.

I think that the deletion and the re-write are completely unrelated. Any image temporary file is rewritten whenever the image is edited. If a temporary image file is deleted, Writer does not know, and therefore makes no attempt to fix the problem. The image is lost when the document is scrolled and the image, having been flushed from memory, is called back so it can be displayed - it is not there and Writer generates the error message.
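The behaviour described in this comment can be illustrated with a toy model. This is NOT OpenOffice code and makes no claim about how the real graphics cache is implemented; the class `ToyImageCache` and its methods are hypothetical, written in Python only to show why a deletion can go unnoticed until the image is needed again.

```python
import os
import tempfile


class ToyImageCache:
    """Toy model only: swaps images out to closed temp files and reads them back on demand."""

    def __init__(self, temp_dir):
        self.temp_dir = temp_dir
        self.in_memory = {}    # image name -> bytes currently held in memory
        self.swap_paths = {}   # image name -> path of its swapped-out temp file

    def add(self, name, data):
        self.in_memory[name] = data

    def swap_out(self, name):
        """Write the image to a temp file and drop it from memory."""
        fd, path = tempfile.mkstemp(suffix=".tmp", dir=self.temp_dir)
        with os.fdopen(fd, "wb") as handle:
            handle.write(self.in_memory.pop(name))
        # The file is closed here, so nothing stops another program deleting it.
        self.swap_paths[name] = path

    def get(self, name):
        """Return the image bytes, or None if the swap file has gone (the 'Read Error' case)."""
        if name in self.in_memory:
            return self.in_memory[name]
        try:
            with open(self.swap_paths[name], "rb") as handle:
                data = handle.read()
        except FileNotFoundError:
            return None
        self.in_memory[name] = data
        return data
```

In this model the loss only becomes visible at `get()`, which corresponds to scrolling the image back into view. Keeping the swap file open, or checking for it before dropping the in-memory copy, are the kinds of changes suggested fix 2 in the description hints at.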
Comment 9 Oliver Brinzing 2016-05-16 08:27:30 UTC
i found there were several fixes for the aoo 411 build:

"Read Error" with embedded images after saving in Writer
https://bz.apache.org/ooo/show_bug.cgi?id=114361

Pictures replaced by "read error" message and dropped
https://bz.apache.org/ooo/show_bug.cgi?id=118725
Comment 10 John 2016-05-16 09:19:50 UTC
Created attachment 85539 [details]
Backup copy and AutoSave options

Comment 23 in https://bz.apache.org/ooo/show_bug.cgi?id=118725 states

> Check "Always create backup copy" and "Save AutoRecovery information every"
> and set the time to 3 Minutes. I guess, the problem is connected with
> autosave, because I could not reproduce it, when the options were unchecked.

I have Always create backup copy set, and Save AutoRecovery set to 7 minutes. I did not notice if there was an AutoSave immediately before losing the images.

Comment 10 in https://bz.apache.org/ooo/show_bug.cgi?id=118725 states

> Since the bulk of the image handling code is shared between OpenOffice and
> LibreOffice, it might be worthwhile to track the LibreOffice metabug issue
> that tries to collect all bugs that have to do with image caching:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=47148

It might be worth setting up a joint team with LO developers to examine this problem.
Comment 11 John 2016-05-16 09:26:23 UTC
(In reply to orcmid from comment #6)
> (In reply to John from comment #0)
> > Created attachment 85532 [details]
> > Files for Scenario 2 lost images
> >
> > I am creating this new bug report to capture details about instances of lost
> > images.
> >
> > Lost images is one of the problems identified in Issue 126846 - Analysis
> > Task: Major Recurring Data/Operation Loss/Corruption Situations at
> > https://bz.apache.org/ooo/show_bug.cgi?id=126846.
>
> The original analysis of this specific case appears on Issue 126869
> beginning with its comment 13,
> https://bz.apache.org/ooo/show_bug.cgi?id=126869#c13. There are also
> relevant attachments on Issue 126869.
>
> The present issue continues refinement and continuation of that thread.

orcmid

That is not quite correct. I have had three image losses. I did not report the first case here as I did no investigation.

I reported the second case beginning with comment 13 in https://bz.apache.org/ooo/show_bug.cgi?id=126869#c13, and uploaded some files there. The .odt being edited was multipage, with 20? images (each small, less than 200kB), and the .odt was over 1MB. AutoSave and Backup were both set.

This bug report is about the third case where I was editing a single page .odt file.
Comment 12 John 2016-05-16 11:14:34 UTC
Comment 43 in Bug 52226 - FILESAVE Images in .docx and .xlsx files show "Read-Error", probably corrupted by auto-save[Summary in comment # 31] at https://bugs.documentfoundation.org/show_bug.cgi?id=52226#c43 in the LO bugs is interesting. > As I've written somewhere else in one of the many occurrences of this problem: > Knowing that there are various in memory copies of a document, and in Impress > maybe related to various views (side panel, normal view, slide sorter, ..) it > may be well possible that there is a problem in synchronising those copies. > This is supported by some other comments too. The "problem in synchronising various in memory copies of a document" corresponds to my thought "the 'in-memory' document had somehow lost contact with the images".
Comment 13 orcmid 2016-05-16 14:12:54 UTC
(In reply to Oliver Brinzing from comment #9) > i found there were several fixes for the aoo 411 build: > > "Read Error" with embedded images after saving in Writer > https://bz.apache.org/ooo/show_bug.cgi?id=114361 > > Pictures replaced by "read error" message and dropped > https://bz.apache.org/ooo/show_bug.cgi?id=118725 Thanks for digging these up, Oliver. I notice, especially in the second one that there was recognition that there seem to be multiple cases and not all are fixed and, as usual, different users have varying success at reproduction. For Issue 118725 the later comments provide links to other issues created to reflect that the product was also creating incorrect links to images sometimes -- that is, the links had become mangled in some manner and not ones that found the image files. There is the possibility that we have had a regression, along with the prospect that the fix has been applied (needs to be confirmed) but it doesn't verify as solving all of the cases in hand.
Comment 14 John 2016-05-17 12:04:52 UTC
I searched bugzilla with images and got the following (maybe) relevant hits:

Issue 126844 - saving a document multiplies images in it [Writer] 2016
Issue 126682 - After inserting picture from file, after save, within an hour, the graphic is gone [Writer] 2015
Issue 125767 - Images lost in document when autosave takes place AND previous version was saved on another platform. [Writer] 2015
Issue 125267 - Saves odt files with pictures incorrectly, specifically fills content.xml with draw frame repeats. [Writer] 2014
Issue 118725 - Pictures replaced by "read error" message and dropped [Writer] 2012
Issue 115994 - Images which were inserted with copy-paste disappear after the file is saved. [Writer] 2010
Issue 110255 - Sluggish viewing and freezes (document with 4096 pictures) [Writer] 2010
Issue 49781 - "Read Error" Displayed Between Over-lapped Cropped Images [Writer] 2005
Issue 51222 - Images disapear when I save a document [Writer] 2005
Issue 121433 - Inserted pictures sometimes vanish randomly from the file [Impress] 2012
Issue 117173 - graphic images dropped randomly, and repeatedly, during editing [Impress] 2011
Issue 105879 - Saving .ppt file looses slide images [Impress] 2009
Issue 125997 - aoo startup - missing images*.zip files [Code] 2015
Issue 102376 - Crash on saving document containing large mount of graphics [Writer] 2009
Issue 95616 - broken image links don't displayed completely [Writer] 2008
Issue 81946 - Writer crashes when scrolling through document containing many large images [Writer] 2007
Issue 80658 - Picture Selection Selects Wrong File If Preview Not Fully Loaded [UI] 2007
Issue 63253 - upper bound of graphics cache should not be constrained [UI] 2006
Issue 59915 - image loss in writer documents, dataloss [Code] 2005
Comment 15 John 2016-05-17 14:19:26 UTC
Created attachment 85544 [details]
List of over 30 LibreOffice bug reports relating to image loss

I did a similar search with images on the LibreOffice bugzilla at https://bugs.documentfoundation.org and I have attached LO bug reports.odt with links to over 30 bug reports mentioning lost images. I cannot post the links here as the page seems to interpret links as being in this forum.

I think Michael Meeks' LO bug report (2012)

47148 - image caching / management is utterly shambolic

is relevant as it states

> The code we've inherited that deals with image caching, swapping in, out,
> lifecycle management of images via strings, swapping in and out to documents
> etc. is broken beyond belief.
>
> This is a tracker bug to start aggregating these horrors.
Comment 16 orcmid 2016-05-17 16:18:47 UTC
I think there is ample evidence that there is something wrong with the handling of images, how they are cached for documents being worked on and also how they are collected and connected within ODF document files (.odt, .odp, and probably .ods too). It is also clear that this has been a long-standing problem, carried forward in legacy code upon which Apache OpenOffice and LibreOffice are built. The similar issues on the LibreOffice bugzilla are evidence for this. (I also observe that there are other issues with image-laden documents that we should not automatically consider part of this issue, even though later analysis might reveal a connection.) We don't lack evidence that there is a problem. It would be handy to have small files that demonstrate occurrences with adequate repeatability.
Comment 17 orcmid 2016-05-17 16:50:24 UTC
(In reply to John from comment #15) > Created attachment 85544 [details] > I think the Michael Meeks' LO bug report (2012) > > 47148 - image caching / management is utterly shambolic > > is relevant as it states > > > The code we've inherited that deals with image caching, swapping in, out, > > lifecycle management of images via strings, swapping in and out to documents > > etc. is broken beyond belief. > > > > This is a tracker bug to start aggregating these horrors. The LibreOffice "47148" is at https://bugs.documentfoundation.org/show_bug.cgi?id=47148. Note there that the "81378" and "99612" in the "See also" are recent. In general, there is consideration that the lifecycle management of images and their caching in an opened document is not handled properly and the way it is hacked together is not taking advantage of the UNO RTL technology for accomplishing such control. Remedy for this needs to be extended to cases where still-live material is spilled to disk. That introduces interaction with auto-recovery, dealing with shut-down cases, etc., and also cleaning up spilled files that are no longer part of a managed lifecycle. Unfortunately, this is all unsurprising. It is important for those working on AOO QA and the many community members who contribute on this matter to understand that any code solutions that are devised for LibreOffice are, even when readily adaptable to AOO, not available without their contributor's permission to incorporate under a license that is acceptable for Apache project source code. (The reverse direction does work.) Please do not quote code and substantial analysis of others in examining bug reports and repositories elsewhere for confirmation of common problems and the availability of analysis.
Comment 18 John 2016-05-18 13:55:42 UTC
Created attachment 85546 [details]
Files for Case 4 - Comment 18

It has happened to me again - the fourth time in about a month. This time it is Scenario 1 - "image disappeared from ...\Temp folder". I have uploaded Files - Case 4.zip.

I received a one page document 2 parish councilAJB.odt with 340 words and a single, tiny PNG graphic as an email attachment. I opened it, set Edit File so I could edit it (it became Untitled 1) and accepted the changes, leaving the comment. I then File > Save As > and saved it as 2 parish council.odt.

When I looked again at the still open document, the image was missing, there was an error message "Graphic cannot be displayed", but Navigator showed there was a graphics2. The Properties of graphics2 had a picture of a dog. No other documents were open. On checking, the saved 2 parish council.odt had lost all trace of the image.

I looked in ...\Temp\sv3355e5.tmp\ which had only two .tmp files in it, but no image file. I have uploaded the temporary folder sv3355e5.tmp\, 2 parish councilAJB.odt as received, the saved 2 parish council.odt, and a screenshot of the error message.

...\Temp contains:
sv33bzfj.tmp 18 May 2016 13:39 19kB Has a PK header, is a zipped .odt file
sv33d9g8.tmp 25 May 2014 10:04 6kB Has a PK header, is a zipped .odt file

sv33bzfj.tmp >>> There is no Pictures folder. The thumbnail image in the unzipped sv33bzfj.tmp is an image of the page with the error message! content.xml contains the edited text - i.e. I opened the file and made some edits, and content.xml has those edits.

sv33d9g8.tmp >>> There is a Pictures folder but it is empty. The thumbnail image is a 113 x 160 pixels transparent image with no content. content.xml contains only the headers and no document text.

Could it be that something has changed with a recent Windows update which has changed timings for things such that a race hazard in the code is making it more likely that images will be lost?
In fifteen years use of OOo and AOO I have never lost images - I am now doing so regularly. I edit a magazine with images in many documents so I am an ideal candidate for diagnostics. Can someone suggest a suitable instrumented developer's version of AOO (or just Writer) for me to download which will allow this problem to be diagnosed better?
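The manual inspection of the two .tmp files in this comment (PK header, Pictures folder, content.xml) could be repeated on other machines with a short script. The following is a minimal sketch, assuming Python 3; `inspect_package` is a hypothetical helper name, and the only format facts it relies on are that an .odt file is a ZIP package containing content.xml, with embedded images conventionally stored under Pictures/.

```python
import zipfile
from pathlib import Path


def inspect_package(path):
    """Summarise whether a file looks like a zipped .odt package and which pictures it holds."""
    path = Path(path)
    with path.open("rb") as handle:
        has_pk_header = handle.read(2) == b"PK"
    report = {
        "has_pk_header": has_pk_header,
        "is_zip": False,
        "has_content_xml": False,
        "pictures": [],
    }
    if not zipfile.is_zipfile(path):
        return report
    report["is_zip"] = True
    with zipfile.ZipFile(path) as package:
        names = package.namelist()
    report["has_content_xml"] = "content.xml" in names
    # Directory entries end with "/"; keep only real files inside Pictures/.
    report["pictures"] = [
        name for name in names if name.startswith("Pictures/") and not name.endswith("/")
    ]
    return report
```

Running this over every .tmp file in the sv....tmp folder would give the same kind of summary as above: an empty `pictures` list together with `has_content_xml` being true matches what was seen for both files in this case.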
Comment 19 John 2016-05-19 17:11:56 UTC
It has happened again. This time, I had received a 6,000 word .odt file with 13 small images, mostly clipart. It was a file which had changes marked and comments added. I opened it while attached to the email, clicked Edit file, which renamed it to Untitled 1. I then opened my Master document version of the file and updated it. (I use a master document, and Edit > Accept Change does not work with Master documents so I have to apply the changes manually). I then opened one sub-document, edited it and saved it. I went back to the file I had received, scrolled it - and all the graphics were gone and I have Read Error messages for each image. I have two copies of each image in ...\Temp, strongly suggesting this is Scenario 2 - the images remain in the ...\Temp folder, but Writer has "lost contact" with them. Something has changed which is now making me susceptible to this problem. I am making one change. I am changing Tools > Options > OO > Paths > Temporary Files from the default C:\Users\John\AppData\Local\Temp to C:\Users\John\AppData\Roaming\OpenOffice\4\user\temp.
Comment 20 John 2016-05-20 15:46:15 UTC
Created attachment 85549 [details]
Screen dump of images while AutoSave is taking place, and immediately after

Amazingly, I now have a file where the image loss is reproducible every time I open the file. I have uploaded 0 June_MASTER v2 AJB.odt to https://www.dropbox.com/s/n3zgrpx5l1s2jx1/0%20June_MASTER%20v2%20AJB.odt?dl=0. I have also saved my complete profile (140 MB) and can upload it if required.

Images are lost on
- my desktop and my wife's laptop (both Windows 10 Home, 64 bit, AOO412m3(Build:9782) - Rev. 1709696 2015-10-21 09:53:29 (Mi, 21 Okt 2015))
- a laptop (Windows 10 Home 64 bit, AOO 4.1.2 as above)
- RoryOF's Xubuntu 16.04 64 bit, AOO 4.1.2
- acknak's Fedora release 23 (with /opt/aoodev/AOO412rc3/openoffice4/program/soffice -h OpenOffice 4.1.2 412m3(Build:9782) and $ cat /etc/redhat-release Fedora release 23 (Twenty Three))

History: I created the file using AOO to create a Master document. I exported the Master and sub docs to a .odt to send to my proofreader so that his changes could be recorded. He uses LibreOffice, and he recorded changes in the .odt, and returned the file to me.

The repeatable every time sequence to lose images is as follows:

First, I have these settings:
a) Tools > Options > OpenOffice > Memory ...
- set Remove [graphics images] from memory = 1 minute
b) Tools > Options > Load/Save ...
- Tick Always create backup copy.
- Tick AutoSave and set to 2 minutes.

1 Close all AOO documents etc. Close AOO. I don't use quickstart, so close quickstart just in case.
2 Open 0 June_MASTER v2 AJB.odt by double clicking it.
- Say NO to update links.
- Format > Sections > Select all > untick Link. Note how unlinking causes the Save icon to go dark - this starts the 2 minutes clock for AutoSave as you have changed the document.
3 Slowly scroll down until you see an image or two on screen. Now do nothing - just wait ... But keep the Writer window as the active window with the mouse in it.
4 Observe what happens when the AutoSave takes place after 2 minutes - you will know because the blue dashed bar crosses the bottom of the screen. I lose the images after the blue bar finishes - see attached image of screen shot taken during AutoSave (bar 75% across screen - images present) and immediately after AutoSave (images gone).

Analysis shows that when I scroll, each image is written to ...\Temp when the image appears on the screen, irrespective of the graphics flush time. So, if I scroll such that only the first two images appear on the screen, these two, and only these two, images have temporary files written to ...\Temp. But when AutoSave takes place, I lose ALL images in the document, these two and all the others. After the AutoSave, all the other images are written to ...\Temp.

Note how this is "Scenario 2 - Writer loses contact with images even though the image temporary files are still in the ...\Temp folder".

Other analyses:
1 It does not happen with LO Version: 220.127.116.11 Build ID: 55b006a02d247b5f7215fc6ea0fde844b30035b3 Locale: en-GB (en_GB)
2 Making a slight change to the .odt file prevents its happening! I wanted to anonymise the phone numbers so I used Find and Replace to replace all digits by n. Images were then not lost by doing steps 1 - 4 above.
3 When I scroll DOWN after image loss, I get 4 x Read Error message boxes (which are blank) for each image. When I then scroll UP, each Read Error message box contains the text immediately below where the image was. This is reproducible as I continue to scroll up and down.
4 RoryOF (Ubuntu) reports that he set AutoSave to 20 minutes. He then opened the file as above ... and just left it. After 15 minutes, the images were lost.
5 I repeated Rory's test by setting AutoSave to 40 minutes and leaving the file. My images were present at 35 minutes, but were lost after the AutoSave at 40 minutes.
6 acknak reports that at first, it showed "Read error" in red for the images.
The message later changed to "Unable to display" (or something like that). We will continue our testing but hope that this file may prove useful to developers to allow this apparently random image loss problem to be diagnosed further.
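Since the loss in this comment is tied to AutoSave, it may help to compare the image links inside the file before and after the save. The following is a minimal sketch, assuming Python 3 and assuming the images are embedded in the package rather than linked externally; `image_references` is a hypothetical helper name. The two namespace URIs are the standard ODF drawing and XLink namespaces.

```python
import xml.etree.ElementTree as ET
import zipfile

# Standard ODF / XLink namespaces for <draw:image xlink:href="...">.
DRAW_IMAGE = "{urn:oasis:names:tc:opendocument:xmlns:drawing:1.0}image"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def image_references(odt_path):
    """List (href, present_in_package) for every draw:image in content.xml."""
    with zipfile.ZipFile(odt_path) as package:
        names = set(package.namelist())
        root = ET.fromstring(package.read("content.xml"))
    results = []
    for image in root.iter(DRAW_IMAGE):
        href = image.get(XLINK_HREF)
        if href is None:
            # Image data stored inline rather than as a package link; nothing to check.
            continue
        if href.startswith("./"):
            href = href[2:]
        # Externally linked images (for example http://...) will show as not present.
        results.append((href, href in names))
    return results
```

Running this on the original .odt and again on the copy saved after the images vanish would show whether the draw:image elements were dropped altogether, or are still there but point at Pictures/ entries that were never written.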
Comment 21 John 2016-05-20 15:58:52 UTC
(In reply to John from comment #20) > > 4 RoryOF (Ubuntu) reports that he set AutoSave to 20 minutes. He then > opened the file as above ... and just left it. After 15 minutes, the images > were lost. > RoryOF corrects his report: image loss did actually occur at AutoSave.
Comment 22 John 2016-05-23 08:29:23 UTC
Created attachment 85552 [details] Master document loses all images It has now happened with a Master document which calls about 30 sub documents, many of which have images in them. Every image in every sub-document has been lost. When I opened the Master document and Update All, every image is missing and is replaced by a frame with Graphics n, where n is the image number. When I open a sub-document, the image is missing in the sub document. I obviously have had the Master document open for long periods so AutoSave has happened. I do not think I have had EVERY sub-document open for the AutoSave to have taken place, and whenever I have edited a sub-document, the image has been there when I have saved it. The Master document is a later version of the file exported to the .odt file in Comment 20.
Comment 23 John 2016-05-23 09:40:46 UTC
Created attachment 85553 [details] Unable to insert images even into a new document I am now unable to insert images into even a new document. File > New > Text document. Now insert an image either by ctrl/V paste or by Insert > Picture > From File gets a Graphics n message - see file. I rebooted in Safe Mode and it still happened. I am running fully up to date Windows 7 Home Edition 64 bit.
Comment 24 John 2016-05-23 10:05:47 UTC
Created attachment 85554 [details] registrymodifications.xcu from my Profile PLEASE DELETE Comment 22 and Comment 23 and the associated uploaded files. I apologise - this was my error. I had yesterday been debugging a file on the forum (MS Word had corrupted the .odt file when Recording changes by adding about 2,000 images to the file) and I had switched off View Images in Tools > Options to view the file. I had forgotten to switch it back on. Please accept my sincere apologies.