Bug 60552 - Corrupt slideshow after importing a picture from another slideshow
Status: RESOLVED WORKSFORME
Alias: None
Product: POI
Classification: Unclassified
Component: XSLF
Version: 3.15-FINAL
Hardware: PC All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-04 17:57 UTC by Anand
Modified: 2018-08-13 19:23 UTC



Attachments
Zip containing all required files (234.10 KB, application/x-zip-compressed)
2017-01-05 03:46 UTC, Anand
PFA java code (6.41 KB, text/plain)
2017-01-05 04:31 UTC, Anand
Diff file (4.52 KB, text/plain)
2017-01-05 05:23 UTC, Anand
Working example (4.84 KB, text/plain)
2018-08-13 19:21 UTC, Andreas Beeker

Description Anand 2017-01-04 17:57:27 UTC
I am using a pptx file as a template to create another pptx file. When I generate the presentation and open it in MS PowerPoint 2016 (the same happens with the Mac version), it shows the typical "PowerPoint found a problem with the content ..." error. I have tried all the usual PowerPoint tricks to see whether it is a security issue; it does not seem to be.
I first had this issue with POI 3.9, so I downloaded 3.15 from http://www-eu.apache.org/dist/poi/release/bin/poi-bin-3.15-20160924.zip, but the issue persists.

The relevant code snippets follow. In the process of removing the proprietary code I may have made some small mistakes, but the PPT generation part is intact.

	private static byte[] resizeImage(byte[] inPic, double img_width, double img_height) {
		try {
			ByteArrayInputStream in = new ByteArrayInputStream(inPic);
			BufferedImage originalImage = ImageIO.read(in);
			int type = originalImage.getType() == 0 ? BufferedImage.TYPE_INT_ARGB : originalImage.getType();

			if (img_height == 0) {
				img_height = originalImage.getHeight();
			}

			BufferedImage resizedImage = new BufferedImage((int) img_width, (int) img_height, type);
			Graphics2D g = resizedImage.createGraphics();
			g.drawImage(originalImage, 0, 0, (int) img_width, (int) img_height, null);
			g.dispose();

			ByteArrayOutputStream buffer = new ByteArrayOutputStream();
			ImageIO.write(resizedImage, "jpeg", buffer);
			return buffer.toByteArray();
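		} catch (IOException ex) {
			// Fail loudly instead of swallowing the error and returning null.
			throw new IllegalStateException("Could not resize image", ex);
		}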
	}

	private static XSLFShape getShape(BLogger logger, XSLFSlide slide, String shapeName) {
		XSLFShape shape = null;
		Iterator<XSLFShape> sit = slide.getSlideLayout().getShapes().iterator();
		while(sit.hasNext()) {
			XSLFShape lshape = sit.next();
			if (lshape != null && lshape.getShapeName().equalsIgnoreCase(shapeName)) {
				shape = lshape;
				break;
			}
		}
		return shape;
	}

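	private static void setPic(XMLSlideShow ppt, XSLFSlide slide, byte[] pic, String shapeName) throws BException {
		// Look up the layout placeholder whose anchor the picture should take over.
		XSLFShape shape = getShape(logger, slide, shapeName);

		Rectangle2D anchor = shape.getAnchor();
		// Scale the image to the placeholder size and embed the resized bytes.
		byte[] ri = resizeImage(pic, anchor.getWidth(), anchor.getHeight());
		XSLFPictureData pd = ppt.addPicture(ri, PictureType.JPEG);
		// Renamed from "pic", which clashed with the byte[] parameter of the same name.
		XSLFPictureShape picShape = slide.createPicture(pd);
		slide.removeShape(shape);
		picShape.setAnchor(anchor);
	}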

	public static XSLFSlide createSlide(XMLSlideShow ppt, XSLFSlide slide,
			Map<XSLFSlide, XSLFSlide> srcDstrcSlideMap, 
			Set<XSLFSlideLayout> visitedLOs) {
		slide = ppt.createSlide();

		XSLFSlideLayout srcLayout = cachedSlide.getSlideLayout(); //this is coming from a cached slide
		if (!visitedLOs.contains(srcLayout)) {
			visitedLOs.add(srcLayout);
			XSLFSlideLayout dstLayout = slide.getSlideLayout();
			dstLayout.importContent(srcLayout);
		}
		slide.importContent(pptlo.slide);
		return slide;
	}
Comment 1 Javen O'Neal 2017-01-04 18:34:08 UTC
Could you include your pptx template and the code that opens the template and copies slides to another pptx file (include this file too if you're not starting with a blank slideshow)?

Without these, I can't reproduce your problem or have insight into code that may be the source of the problem (for example, using a slideshow after it has been closed).
Comment 2 Anand 2017-01-05 03:46:13 UTC
Created attachment 34590 [details]
Zip containing all required files

The attached zip file contains all the required data files and the Java file.
Please note that you need to place all data files into one folder and point the environment variable PPT_TEST to this folder.
Please revert ASAP, my customer demo is blocked on this :(
Comment 3 Javen O'Neal 2017-01-05 04:22:49 UTC
(In reply to Javen O'Neal from comment #1)
> Could you include ... the code that opens the template
> and copies slides to another pptx file

I didn't see any java code in your attachment from comment 2. Did you forget to add it to the zip file?

(In reply to Anand from comment #2)
> Please revert ASAP, my customer demo is blocked on this :(

Apache POI is an unpaid, volunteer-run project. We cannot commit to prioritizing one bug over another or fixing any bug by a certain date. Generally bugs get closed quickest when the contributor finds the source of the problem and submits a patch with relevant unit tests to prevent the problem from recurring.

If you need an immediate fix, I'd recommend forking the last stable release (3.15 or 3.16 beta 1, depending on your stability needs) and troubleshooting the bug yourself. If you come up with a fix that is not a hack, feel free to attach it and we'll review and commit your fix.

If you have no idea where to start, try:
1. repairing dest.pptx in PowerPoint and saving to dest-repaired.pptx
2. unzip dest.pptx and dest-repaired.pptx
3. do a file-wise comparison on the XML files in each directory. This will help you figure out what PowerPoint deemed corrupt and how it repaired it. You could also insert the images into source.pptx with PowerPoint and compare that to the POI-generated or repaired versions to see what POI is doing differently.
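
The three steps above can also be done without unzipping by hand. The following is only a rough sketch, assuming both files are ordinary zip packages; the class name PptxPartDiff and its methods are made up for illustration and are not part of POI. It lists the parts that exist on only one side or whose bytes differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of POI: compares two zip packages part by part.
public class PptxPartDiff {

    /** Reads one zip entry fully into memory. */
    static byte[] readEntry(ZipFile zip, ZipEntry entry) throws IOException {
        try (InputStream in = zip.getInputStream(entry)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Returns sorted report lines for parts that are missing on one side or differ in content. */
    static Set<String> diffParts(String leftPath, String rightPath) throws IOException {
        Set<String> report = new TreeSet<String>();
        try (ZipFile left = new ZipFile(leftPath); ZipFile right = new ZipFile(rightPath)) {
            Enumeration<? extends ZipEntry> entries = left.entries();
            while (entries.hasMoreElements()) {
                ZipEntry le = entries.nextElement();
                if (le.isDirectory()) {
                    continue;
                }
                ZipEntry re = right.getEntry(le.getName());
                if (re == null) {
                    report.add("only in left: " + le.getName());
                } else if (!Arrays.equals(readEntry(left, le), readEntry(right, re))) {
                    report.add("differs: " + le.getName());
                }
            }
            // Second pass: parts that exist only in the right-hand package.
            entries = right.entries();
            while (entries.hasMoreElements()) {
                ZipEntry re = entries.nextElement();
                if (!re.isDirectory() && left.getEntry(re.getName()) == null) {
                    report.add("only in right: " + re.getName());
                }
            }
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PptxPartDiff dest.pptx dest-repaired.pptx
        for (String line : diffParts(args[0], args[1])) {
            System.out.println(line);
        }
    }
}
```

A byte-wise comparison will flag many parts that differ only in formatting or attribute order, so the list is a starting point for a closer XML comparison rather than a verdict.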
Comment 4 Anand 2017-01-05 04:31:03 UTC
Created attachment 34591 [details]
PFA java code

Sorry for missing the Java code; it is attached now. Thanks for the quick response.
Comment 5 Anand 2017-01-05 04:48:04 UTC
BTW I am using jdk1.7.0_79, if it matters.
Comment 6 Anand 2017-01-05 05:23:21 UTC
Created attachment 34592 [details]
Diff file

I did a diff and I see that there are about 38 files that are different!
Comment 7 Javen O'Neal 2017-01-05 06:11:59 UTC
What are the differences in the XML files?

Either use an XML-specific diff tool or convert each XML file to some canonical representation and use a regular line-based text diff tool. On Windows, I have used Notepad++ with the XML Tools plugin to Pretty Print all open files. Microsoft has an XML-aware diff tool [1], though I haven't used it. A crude way would be to find-replace ">" with ">\n" in all XML files, and a slow way would be to copy-paste the XML into a website-based XML pretty printer.
 
I expect the changes will be:
* corrupted content that is removed from the repaired version
* implied attributes that PowerPoint explicitly adds or removes
* metadata such as last modified user, last modified date
* reordered relationship IDs

[1] https://www.microsoft.com/en-us/download/details.aspx?id=24313
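
Another way to get line-diffable XML is the JDK's own transformer. The following is a minimal sketch, not a true canonicalizer: the class name XmlPrettyPrinter is made up for illustration, the indent-amount property is an implementation-specific hint honoured by the JDK's bundled Xalan-based transformer, and the exact whitespace in the output can vary between JDK versions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: re-serializes XML with one element per line for text diffing.
public class XmlPrettyPrinter {

    /** Returns the given XML re-serialized with indentation and without an XML declaration. */
    static String prettyPrint(String xml) throws TransformerException {
        // An identity transform: copies the input to the output unchanged apart from formatting.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific hint; assumed to be honoured by the JDK's built-in transformer.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws IOException, TransformerException {
        // Usage: java XmlPrettyPrinter ppt/slides/slide1.xml > slide1.pretty.xml
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.print(prettyPrint(xml));
    }
}
```

Run it over the same part from the POI-generated and the repaired package with the same JDK, then feed both outputs to a regular line-based diff tool.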
Comment 8 Anand 2017-01-05 06:37:27 UTC
BTW, is there any issue in the way I have used POI, or is it a bug in POI? Is it possible to confirm? I thought what I am doing was pretty basic stuff. Am I right?
Comment 9 Javen O'Neal 2017-01-05 09:18:01 UTC
(In reply to Anand from comment #8)
> BTW, is there any issue in the way I have used POI, or is it a bug in POI?
> Is it possible to confirm? I thought what I am doing was pretty basic stuff.
> Am I right?

Your code looks fairly straightforward, though importing content into a slide and embedding pictures are more advanced features.

There are a couple spots that look suspicious to me:
1) You never close srcPPT, destPPT, or spptIS
2) You import slide layout and slide. This might cause a problem if importing a slide implicitly imports a layout. If the same layout is imported twice, this would be reason for PowerPoint to complain.
3) Your second call to setText writes the text to slide1. Should it have written to slide2?

(modified for clarity)
> XSLFSlide destSlide = destPPT.createSlide();
> destSlide.getSlideLayout().importContent( srcSlide.getSlideLayout() );
> destSlide.importContent( srcSlide );

Your test code currently does several things:
* create a blank slide and import slide layout and content from the source slide
* add two jpeg pictures to the slide
* add a textbox to the slide
* repeat these 3 things to a second slide.

Do you still get a corrupt slideshow if you comment out some of these lines?
Without understanding where the slideshow becomes corrupted, we have no chance of determining whether it's a bug in POI or a bug in your code. Only then could we figure out whether POI can raise an exception just before the slideshow becomes corrupt.

> XSLFSlide slide1 = createSlide(destPPT, srcSlide, visitedLOs);
> setPic(destPPT, slide1, cvPic, CV_PH_NAME);
> setPic(destPPT, slide1, lvPic, LV_PH_NAME);
> setText(slide1, LOC_PH_NAME, locTxt);			
> 
> XSLFSlide slide2 = createSlide(destPPT, srcSlide, visitedLOs);
> setPic(destPPT, slide2, cvPic, CV_PH_NAME);
> setPic(destPPT, slide2, lvPic, LV_PH_NAME);
> setText(slide1, LOC_PH_NAME, locTxt);	//slide2???
Comment 10 Anand 2017-01-05 09:46:44 UTC
I have come this far through trial and error :) If I recollect correctly, when I did not import the source slide (i.e. imported only the source layout), I could not see the source slide content such as the logo in the top right corner and the content in the footer.
BTW, if the layout is imported twice it throws org.apache.xmlbeans.impl.values.XmlValueDisconnectedException; I have already been through and got past that phase :)

Anyway I will try again and see what combo works.
Comment 11 Anand 2017-01-05 14:33:26 UTC
I have tried different combinations of importing layout, slide master, and slide.
If I create the slide using the layout, or import only the layout, PowerPoint opens the resulting file without any errors, but the header and footer are missing from the destination slide.
Please let me know if there's any workaround you can suggest.
Comment 12 Andreas Beeker 2018-08-13 19:21:45 UTC
Created attachment 36089 [details]
Working example

After fixing a small copy-and-paste error and some refactoring, this works for me.
Tested with LibreOffice and PowerPoint 2016.
Comment 13 Andreas Beeker 2018-08-13 19:23:02 UTC
The attachment "Working example" was tested with the 4.0.0 SNAPSHOT trunk.