Bug 52262 - [PATCH] [BUG] XSLFSlide.getMasterSheet() broken in rev1198658
Summary: [PATCH] [BUG] XSLFSlide.getMasterSheet() broken in rev1198658
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: XSLF
Version: 3.8-dev
Hardware: PC All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-29 16:54 UTC by Jeremy
Modified: 2011-11-30 13:47 UTC (History)

Attachments
Multi-embedded test Word document (154.13 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2011-11-29 16:54 UTC, Jeremy
Patch file with changes (578 bytes, patch)
2011-11-29 16:58 UTC, Jeremy

Description Jeremy 2011-11-29 16:54:41 UTC
Created attachment 28001 [details]
Multi-embedded test Word document

This bug was introduced during changes submitted in revision 1190347.

The bug was discovered using daily builds of Tika and POI. Tika triggers it through a call to getMasterSheet() whose result is assigned to an unused variable (a bug has also been submitted to Tika for the unused variable).

Essentially, the return type of getMasterSheet() accidentally changed between revisions, from XSLFSlideMaster to XSLFSlideLayout. The patch changes the returned value back to what it was before, leaving the newly added @Override annotation in place.

Attached are the patch file and an example multi-embedded Word document, which was used with a Tika-based RecursiveMetadataParser.

Stack Trace:
ERROR LogFaultActivity org.apache.poi.xslf.usermodel.XSLFSlide.getMasterSheet()Lorg/apache/poi/xslf/usermodel/XSLFSlideMaster;
java.lang.NoSuchMethodError: org.apache.poi.xslf.usermodel.XSLFSlide.getMasterSheet()Lorg/apache/poi/xslf/usermodel/XSLFSlideMaster;
	at org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator.buildXHTML(XSLFPowerPointExtractorDecorator.java:81)
	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:110)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:97)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:69)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
	at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
	at com.eastportanalytics.services.textextract.TikaTextExtractionService$RecursiveMetadataParser.parse(TikaTextExtractionService.java:364)
	at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
	at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:109)
	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedFile(AbstractOOXMLExtractor.java:228)
	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:148)
	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:113)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:97)
	at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:69)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
	at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
	at com.eastportanalytics.services.textextract.TikaTextExtractionService$RecursiveMetadataParser.parse(TikaTextExtractionService.java:364)
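
For readers wondering why a changed return type shows up as a NoSuchMethodError rather than a compile error: the JVM links calls by method descriptor, and the descriptor includes the return type (the trace above ends in Lorg/apache/poi/xslf/usermodel/XSLFSlideMaster;). Code compiled against the old signature therefore cannot find the method once the return type differs. The following is a minimal sketch of that effect, assuming hypothetical stand-in classes (OldSlide, NewSlide, SlideMaster, SlideLayout) rather than the real POI types:

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class DescriptorDemo {
    // Hypothetical stand-ins for the POI types; these are NOT the real classes.
    static class SlideMaster {}
    static class SlideLayout {}

    // Shape of the method before the change: returns the slide master.
    static class OldSlide {
        public SlideMaster getMasterSheet() { return new SlideMaster(); }
    }

    // Shape of the method after the change: returns the slide layout.
    static class NewSlide {
        public SlideLayout getMasterSheet() { return new SlideLayout(); }
    }

    // Builds the JVM method descriptor (parameter types plus return type)
    // for a public no-argument method looked up by name.
    static String descriptorOf(Class<?> owner, String name) {
        try {
            Method m = owner.getMethod(name);
            return MethodType.methodType(m.getReturnType(), m.getParameterTypes())
                    .toMethodDescriptorString();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same method name, same (empty) parameter list, different descriptors.
        System.out.println(descriptorOf(OldSlide.class, "getMasterSheet"));
        System.out.println(descriptorOf(NewSlide.class, "getMasterSheet"));
    }
}
```

The two printed descriptors differ only in the return type, which is enough for a caller compiled against the old one to fail at link time until it is recompiled or updated.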
Comment 1 Jeremy 2011-11-29 16:58:15 UTC
Created attachment 28002 [details]
Patch file with changes

Patch for the XSLFSlide.java file
Comment 2 Jeremy 2011-11-29 17:25:20 UTC
TIKA bug #795 opened for issue as well.
Comment 3 Yegor Kozlov 2011-11-30 13:35:53 UTC
It was an intentional change. The master sheet of a slide is its slide layout, and the master sheet of a slide layout is the slide master. To be clear, my change reflects the sheet hierarchy in the .pptx format:

slide.xml <-- slideLayout.xml <-- slideMaster.xml

The immediate fix on the Tika side is to use XSLFSlide.getSlideMaster() instead of XSLFSlide.getMasterSheet(). With this change everything should compile and run. 
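
As a rough illustration of the hierarchy described above, here is a minimal sketch using hypothetical stand-in classes (Sheet, Slide, SlideLayout, SlideMaster), not the actual POI implementation. It assumes that getMasterSheet() moves exactly one level up the chain and that getSlideMaster() is a convenience that walks two levels:

```java
public class SheetHierarchySketch {
    // Hypothetical stand-ins for the POI sheet types; NOT the real classes.
    static abstract class Sheet {
        // Returns the sheet one level up in the hierarchy, or null at the top.
        abstract Sheet getMasterSheet();
    }

    static class SlideMaster extends Sheet {
        // The slide master is the top of the chain.
        @Override
        Sheet getMasterSheet() { return null; }
    }

    static class SlideLayout extends Sheet {
        private final SlideMaster master;
        SlideLayout(SlideMaster master) { this.master = master; }

        // The master of a slide layout is the slide master.
        @Override
        SlideMaster getMasterSheet() { return master; }
    }

    static class Slide extends Sheet {
        private final SlideLayout layout;
        Slide(SlideLayout layout) { this.layout = layout; }

        // The master sheet of a slide is its slide layout.
        @Override
        SlideLayout getMasterSheet() { return layout; }

        // Convenience accessor: slide -> layout -> master.
        SlideMaster getSlideMaster() { return getMasterSheet().getMasterSheet(); }
    }

    public static void main(String[] args) {
        SlideMaster master = new SlideMaster();
        Slide slide = new Slide(new SlideLayout(master));
        // A caller that wants the slide master asks for it directly.
        System.out.println(slide.getSlideMaster() == master); // prints "true"
    }
}
```

Under these assumptions, a caller that previously relied on getMasterSheet() returning the slide master only needs to switch to getSlideMaster().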

Meanwhile, I'm going to rework Tika's XSLFPowerPointExtractorDecorator - most of the logic can be simplified and written in a much nicer form.

Yegor

(In reply to comment #2)
> TIKA bug #795 opened for issue as well.
Comment 4 Jeremy 2011-11-30 13:47:33 UTC
Sounds good, thanks Yegor.

Nick already added the patch to TIKA-700 yesterday using that additional method; I missed it when I was looking at it yesterday. After seeing his patch, I began to suspect that the change was made on purpose and for a reason. I'll update the status to resolved and note it as an invalid bug.

Thanks again... keep up the great work!!