Bug 48593

Summary: [PATCH] Multiple Saves Causes Slide Corruption
Product: POI
Reporter: jdente <jdente>
Component: HSLF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: major    
Priority: P2    
Version: 3.6-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: A simple java program that demonstrates the bug
My example of the same problem
[PATCH] Bug 48593 - Multi Rewrites
[PATCH] Bug 48593 - Multi Rewrites
[PATCH] multiple rewrites fixed

Description jdente 2010-01-21 13:19:15 UTC
Created attachment 24881 [details]
A simple java program that demonstrates the bug

I have a program that adds slides to a PowerPoint slide deck one slide at a time based on a user action. In between every addition of a new slide a save is performed so that the new slide is written to the slide deck. This save is necessary because the user may decide to view the PowerPoint presentation after the addition of a single slide. The first slide that is added is fine. The second slide that is added is fine. The third slide that is added is corrupt every time. This has nothing to do with slide contents, as the corruption occurs with blank slides as well as slides with content. 

The attached sample code demonstrates this bug. The sample code generates two slide decks. One has two slides and works fine, the other has three slides and the third slide is corrupt. Saving after every addition of a slide is done simply to demonstrate the problem. If we add 3 slides and then perform a single save, the corruption does not happen. But this is not a feasible workaround for the issue; we need to be able to write the new slide to the slide deck every time the user initiates the action, since the user could decide to view his new slide deck after the addition of every slide.

It is also interesting to note that a slide deck created through PowerPoint (not programmatically using POI) does not have this problem. I can create a blank PowerPoint presentation and then append and save however many slides I want using POI.

I've also attempted to alleviate the problem by re-reading in the SlideShow object in between every save. This had no effect.
Comment 1 Yegor Kozlov 2010-01-24 06:09:41 UTC
Can you attach the template ppt? Evidently, the problem is in the way POI modifies existing ppt files. So, I need the source ppt to analyze what is wrong.

Yegor
Comment 2 jdente 2010-01-25 11:52:22 UTC
(In reply to comment #1)
> Can you attach the template ppt? Evidently, the problem is in the way POI
> modifies existing ppt files. So, I need the source ppt to analyze what is
> wrong.
> 
> Yegor

I did not use a template ppt. Initially the ppt is created by simply using "new SlideShow()" and then writing that slide show to a new *.ppt file. If you look at the sample java program for demonstrating the bug, the very first run of the program will create the new ppt file which is written during the save operation. In order to reproduce the bug, the sample java program assumes that the "threeSlides.ppt" file does not exist initially. In fact, if you run this program with a ppt file generated through PowerPoint (not POI), the problem does not occur. It only happens when POI creates the ppt file.
Comment 3 Rafael Lopez 2013-02-18 07:17:05 UTC
Hi there,

I am having the same problem under Ubuntu 11.10, java-6-openjdk and Apache POI 3.9 (downloaded from Maven Repository).

My exception trace is:

java.lang.NullPointerException
	at org.apache.poi.hslf.usermodel.SlideShow.buildSlidesAndNotes(SlideShow.java:401)
	at org.apache.poi.hslf.usermodel.SlideShow.<init>(SlideShow.java:109)
	at Main.getPresentation(Main.java:56)
	at Main.method2(Main.java:48)
	at Main.main(Main.java:14)

I am attaching my source code too. Note that method1() works while method2() is the one that throws the exception.

Maybe this additional information gives you some guidance: I once had the same experience with just the Sun JDK, without using Apache POI at all. I serialized some objects, creating a new ObjectOutputStream each time, and after overwriting the same file several times I got a StreamCorruptedException. There was a bug in the Sun JDK forums at that time, but I cannot find it now.
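
For reference, one well-known way to get a StreamCorruptedException from plain JDK serialization is sketched below. It assumes the objects end up appended to the same underlying stream, which may not match the overwriting scenario described above. Every new ObjectOutputStream writes its own stream header, and a single ObjectInputStream does not expect a second header in the middle of the data. The class and method names are made up for illustration, and an in-memory buffer stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;

class AppendedObjectStreams {

    // Writes each value through a brand-new ObjectOutputStream onto the same
    // underlying buffer, so a stream header precedes every object.
    static byte[] writeWithFreshStreams(String... values) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        for (String value : values) {
            ObjectOutputStream out = new ObjectOutputStream(sink);
            out.writeObject(value);
            out.flush();
        }
        return sink.toByteArray();
    }

    // Reads 'count' objects through a single ObjectInputStream and reports
    // whether the read failed with a StreamCorruptedException.
    static boolean readFailsAsCorrupted(byte[] data, int count)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                in.readObject();
            }
            return false;
        } catch (StreamCorruptedException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = writeWithFreshStreams("first", "second");
        System.out.println("corrupted: " + readFailsAsCorrupted(data, 2));
    }
}
```

With a single object the data reads back fine; the failure only shows up once a second header follows the first object.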

The problem here is that I am building a service that needs to persist the PowerPoint file after every change (e.g. after I append one slide or change some text), so I don't know what I can do, or whether you can provide a workaround for that.
Comment 4 Rafael Lopez 2013-02-18 07:22:19 UTC
Created attachment 29961 [details]
My example of the same problem

Note that if you execute the application using method1() (commented) instead of method2(), it doesn't fail
Comment 5 Rafael Lopez 2013-02-18 07:22:58 UTC
Comment on attachment 29961 [details]
My example of the same problem

Comment 6 Andreas Beeker 2013-10-18 23:49:29 UTC
Created attachment 30944 [details]
[PATCH] Bug 48593 - Multi Rewrites

There were two issues:
- the newly created slide had the same record position as an already existing record, so the mapping of old to new positions went wrong
- the position of the usr record wasn't updated
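
The first issue can be pictured with a small sketch. This is not the POI code; Rec, buildOldToNew and the offsets are hypothetical stand-ins, and the only assumption is that old positions are used as keys in a map to new positions. When two records report the same old position, the second entry silently replaces the first, so a pointer to the older record resolves to the wrong place.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OffsetRemapSketch {

    // Hypothetical stand-in for a record: the offset it claims to have had
    // on disk before the rewrite, and its length in bytes.
    static final class Rec {
        final String name;
        final int oldOffset;
        final int length;

        Rec(String name, int oldOffset, int length) {
            this.name = name;
            this.oldOffset = oldOffset;
            this.length = length;
        }
    }

    // Lays the records out back to back and maps old offset -> new offset.
    static Map<Integer, Integer> buildOldToNew(List<Rec> records) {
        Map<Integer, Integer> oldToNew = new HashMap<>();
        int position = 0;
        for (Rec rec : records) {
            // A second record with the same old offset replaces the first entry.
            oldToNew.put(rec.oldOffset, position);
            position += rec.length;
        }
        return oldToNew;
    }

    // Made-up layout: the new slide reuses the old offset of an existing slide.
    static List<Rec> sampleRecords() {
        List<Rec> records = new ArrayList<>();
        records.add(new Rec("document", 0, 100));
        records.add(new Rec("existingSlide", 100, 50));
        records.add(new Rec("newSlide", 100, 50));
        return records;
    }

    public static void main(String[] args) {
        System.out.println(buildOldToNew(sampleRecords()));
    }
}
```

In this made-up layout the existing slide is written at new offset 100 and the new slide at 150, but the map keeps only one entry for old offset 100, and it points at 150.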

I've copied some code from HSLFSlideShow.write() for updating the position ids.
It would be nice if these code parts were refactored into one some day ...

Apart from the bugfix, I've taken the liberty ;) of refactoring the SlideShow class a bit, similar to the patch #55579, which also needs to add a new persistent object.

(Tested with Libre Office 4.0 and Excel Viewer 2010)
Comment 7 Andreas Beeker 2013-10-19 18:14:27 UTC
Created attachment 30948 [details]
[PATCH] Bug 48593 - Multi Rewrites

The old patch didn't contain a new test helper class.
Comment 8 Andreas Beeker 2013-12-27 00:05:11 UTC
Created attachment 31161 [details]
[PATCH] multiple rewrites fixed

This patch is a merged version, because in the meantime there were a few changes to the SlideShow class. Furthermore, the position recalculation code is now only in HSLFSlideShow.
Comment 9 Andreas Beeker 2013-12-27 00:50:34 UTC
applied with svn ver r1553610

Unlike the uploaded attachment, I have removed the HashMap replacements again, as I wasn't sure whether the Hashtables were used on purpose ... although the scratchpad tests ran in 52-54 sec. with HashMaps vs. 1 min 02 sec. with Hashtables on my PC.

If it turns out there's no need for thread safety, the Hashtables should be replaced one day ...
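
As a rough illustration of what such a replacement involves (a sketch only, not taken from the patch; the class and method names are invented), code written against the Map interface can switch between Hashtable and HashMap without other changes. The two differ in synchronization and in null handling: Hashtable synchronizes its methods and rejects null keys, while HashMap does neither. If thread safety does turn out to matter, Collections.synchronizedMap is one standard way to wrap a HashMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class MapChoiceSketch {

    // Fills any Map implementation with the same pairs; the caller only
    // depends on the Map interface, so the implementation can be swapped.
    static Map<Integer, Integer> fill(Map<Integer, Integer> target, int entries) {
        for (int i = 0; i < entries; i++) {
            target.put(i, i * 2);
        }
        return target;
    }

    // Reports whether the given map accepts a null key.
    static boolean acceptsNullKey(Map<Integer, Integer> map) {
        try {
            map.put(null, 0);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<Integer, Integer> legacy = fill(new Hashtable<>(), 3);
        Map<Integer, Integer> plain = fill(new HashMap<>(), 3);
        Map<Integer, Integer> wrapped = fill(Collections.synchronizedMap(new HashMap<>()), 3);

        System.out.println("same contents: " + (legacy.equals(plain) && plain.equals(wrapped)));
        System.out.println("Hashtable accepts null key: " + acceptsNullKey(legacy));
        System.out.println("HashMap accepts null key: " + acceptsNullKey(plain));
    }
}
```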