Bug 57552 - XMLSlideShow.addPicture() finds the incorrect image
Summary: XMLSlideShow.addPicture() finds the incorrect image
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: XSLF (show other bugs)
Version: unspecified
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-02-09 14:40 UTC by mark.o
Modified: 2015-05-03 10:10 UTC (History)
0 users



Attachments
demonstration of possible fix for posted bug (3.46 KB, text/x-java)
2015-02-09 14:48 UTC, mark.o
Details
patch (generated with git format-patch against trunk) (9.68 KB, patch)
2015-03-15 16:48 UTC, mark.o
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description mark.o 2015-02-09 14:40:42 UTC

    
Comment 1 mark.o 2015-02-09 14:48:41 UTC
Created attachment 32443 [details]
demonstration of possible fix for posted bug

The patch has been implemented as a static 'fix-up' to allow easy testing without touching the original sources.

However, I think that it would make more sense to have the OPCPackage.getPartsByName() with a comparator, so that the indices are properly sorted to start with. Then there would only be the need to deal 0-based vs. 1-based.
Comment 2 Nick Burch 2015-02-09 14:56:52 UTC
Might be worth posting to the dev list - I seem to recall that a few months ago someone was talking about fixing up some problem with opc relations that was going to need some moderately major internal changes to do things cleanly. You might find this helps with that, or you can otherwise assist with the refactor :)
Comment 3 mark.o 2015-03-15 16:48:30 UTC
Created attachment 32573 [details]
patch (generated with git format-patch against trunk)

Should I following up on a dev-list?
I have a few more ideas for fixes (pptx handling), but I don't want to be coding in vain.
Comment 4 Andreas Beeker 2015-04-27 21:19:39 UTC
I see that your code introduces a sorting in combination with a comparing interface, but I don't understand what problem it solves.
Is sorting necessary to suffice the M1.12 rule?
Why is the old code wrong, which just compares case-insensitive?

Please give an example as to when addPicture() finds the incorrect image, i.e. adding a sample pptx and a test case (failing in the old code/working with the new code) would be good.
Comment 5 mark.o 2015-04-28 06:32:46 UTC
Hi Andreas,

It'll take some time for me to put together a test pptx, but the problem/solution are fairly quickly explained. When adding a picture, the code examines the chksum of the existing pictures to decide if a new entry is needed or an existing media can be reused. With the current code, the media will be listed in lexical order (eg, image1.png, image10.png, image11.png, image2.png, image20.png, image3.png, ...). If the picture to be added already exists, the 0-based index in the list is returned and (later) changed to a 1-based index for the referencing. This results in total rubbish.

For the small example, the original image10.png is found as index 2 in the list (0-based), which becomes index 3 (1-based) for the slide reference. The slide will thus now reference /media/image3.png !!

The current code will only work properly if you have a maximum of 9 images.

The patch introduces a (case-insensitive) alpha-numerical comparator so that the list is ordered as (image1.png, image2.png, ... image10.png, image11.png ...) which re-establishes the proper connection between the indices.

/mark
Comment 6 mark.o 2015-04-28 06:35:16 UTC
Errata:

The original image10.png is found as index 1 in the list (0-based), which becomes index 2 (1-based) for the slide reference. The slide will thus now reference /media/image2.png !!
Comment 7 Nick Burch 2015-04-29 18:33:39 UTC
I think I've been able to write a unit test to show this, starting from a blank slideshow. See commit r1676810 / TestXSLFBugs.test57552()

If someone (mark.o perhaps?) could confirm if the logic in that test is sane + it's triggering the same bug, then I can have a go at applying the patch
Comment 8 Andreas Beeker 2015-04-29 22:13:54 UTC
Do we really need to work with indices?
How about changing the return type of addPicture to something like a abstract PictureData class which encapsulate the PackagePart (XSLF) or EscherBSERecord (HSLF)?
Need image files always conform to the pattern /ppt/media/image<num>.<ext>? [1]

[1] http://web.mit.edu/~stevenj/www/ECMA-376-new-merged.pdf  (see 15.2.13 Image Part)
Comment 9 Nick Burch 2015-04-29 22:24:04 UTC
Some sort of helpful wrapper class would seem a cleaner fix to me, though with the small disadvantage of breaking some existing code. However, with the common SL code coming soon, people will need to tweak things anyway, so maybe we just roll this change into that?
Comment 10 mark.o 2015-05-01 08:19:31 UTC
(In reply to Nick Burch from comment #7)
> I think I've been able to write a unit test to show this, starting from a
> blank slideshow. See commit r1676810 / TestXSLFBugs.test57552()
> 
> If someone (mark.o perhaps?) could confirm if the logic in that test is sane
> + it's triggering the same bug, then I can have a go at applying the patch


The unit test looks plausible - I'll see if I get time in the next days to test and give confirmation.
Comment 11 mark.o 2015-05-01 08:44:14 UTC
(In reply to Nick Burch from comment #9)
> Some sort of helpful wrapper class would seem a cleaner fix to me, though
> with the small disadvantage of breaking some existing code. However, with
> the common SL code coming soon, people will need to tweak things anyway, so
> maybe we just roll this change into that?

I found the use of plain integers slightly fragile (or weird) too. A wrapper class would definitely be an improvement, even if the first incarnation is nothing better than a wrapped integer.

In the longer term, I had figured it would be even more convenient to have an addPicture() method on the slide class (I think I've seen something similar in the ppt or the docx code). The user could then simply add a picture to the slide directly without worrying about any of the internal bookkeeping. In the background the method would do the 'if-exists-or-add-new' logic on the slide show.

With this approach, it would be fairly easy to add a wrapper-class without immediately breaking any existing code. We could also provide extra XMLSlideShow.addPicture() methods that conveniently takes other parameters (eg,InputStream, File, etc) and returns the new wrapper-class. This would also provide an easy way to move forward without breaking existing code.


The PictureDataXXX class could also be slightly extended to include its CRC and (for plain image data) its original dimensions. The would still be fairly lean but would make the existence lookups much easier. Retaining information about the original dimensions makes it easier to handle some other functionality that I've already started to code. Eg,

/** Resize this picture into target rectangle, while maintaining its aspect ratio */
XSLFPictureShape.resize(Rectangle2D target, RectAlign align);

I'm using this type of method to fit pictures to the placeholder dimensions.
Comment 12 mark.o 2015-05-01 10:19:44 UTC
> The unit test looks plausible - I'll see if I get time in the next days to
> test and give confirmation.

Before:
  test57552 failed
  expected: <5> but was: <11> at line 356

After patch:
  test57552 passed

Looks OK.
Comment 13 Nick Burch 2015-05-03 08:49:15 UTC
Thanks for this, applied in r1677373.

If you're happy, I'd suggest we close this bug now, and open a new enhancement one to track the improvements to addPicture discussed here
Comment 14 mark.o 2015-05-03 10:01:32 UTC
(In reply to Nick Burch from comment #13)
> Thanks for this, applied in r1677373.
> 
> If you're happy, I'd suggest we close this bug now, and open a new
> enhancement one to track the improvements to addPicture discussed here

Thanks. We can close the bug and follow up with an enhancement.