Bug 63541

Summary: [PATCH] NullPointerException from XSLFSimpleShape.getAnchor for empty xfrm tags
Product: POI
Reporter: petoalbert32
Component: XSLF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: normal
CC: petoalbert32
Priority: P2
Version: 4.0.x-dev
Target Milestone: ---
Hardware: PC
OS: Mac OS X 10.1
Attachments:
    pptx file that contains a shape for which getAnchor throws NullPointerException
    Proposed one-liner change in XSLFSimpleShape

Description petoalbert32 2019-07-03 14:39:09 UTC
Created attachment 36643
pptx file that contains a shape for which getAnchor throws NullPointerException

I have stumbled upon a pptx file that contains a shape for which getAnchor throws NullPointerException. The issue is with the following two lines from XSLFSimpleShape:

    CTPoint2D off = xfrm.getOff();
    double x = Units.toPoints(off.getX());

Apparently the shape in slide1.xml contains an empty xfrm tag and an empty path, so xfrm.getOff() returns null and the following off.getX() call throws the NullPointerException.

I have attached a simplified pptx file containing the weird shape. The mentioned shape is not visible in various editors/viewers, but they show other parts of the presentation correctly.

Test with:

    XMLSlideShow pptx = ... // attached file
    pptx.getSlides().get(0).getShapes().get(0).getAnchor();
Comment 1 petoalbert32 2019-07-03 14:50:42 UTC
As for the possible fix: I could wrap the mentioned block of code in a conditional statement with the condition xfrm.isSetOff(), but it seems that xfrm is also used from other places, and I am not sure how to interpret a shape like this conceptually.
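
A minimal sketch of such a guard, not the actual POI code. PointStub and XfrmStub are hypothetical stand-ins for the XmlBeans types (only the isSetOff()/getOff() naming mirrors them). Returning null for a missing offset and also checking the extent are assumptions made for illustration.

```java
import java.awt.geom.Rectangle2D;

final class AnchorGuardSketch {
    // 914400 EMU per inch / 72 points per inch
    static final double EMU_PER_POINT = 12700.0;

    static double toPoints(long emu) {
        return emu / EMU_PER_POINT;
    }

    // Unguarded read, shaped like the two lines quoted in the description.
    static double unguardedX(XfrmStub xfrm) {
        PointStub off = xfrm.getOff();
        return toPoints(off.x); // throws NullPointerException if the offset is missing
    }

    // Guarded read: check for the presence of offset (and extent) before use.
    static Rectangle2D guardedAnchor(XfrmStub xfrm) {
        if (xfrm == null || !xfrm.isSetOff() || !xfrm.isSetExt()) {
            return null; // assumption: no position information means no anchor
        }
        PointStub off = xfrm.getOff();
        PointStub ext = xfrm.getExt();
        return new Rectangle2D.Double(
                toPoints(off.x), toPoints(off.y),
                toPoints(ext.x), toPoints(ext.y));
    }

    public static void main(String[] args) {
        XfrmStub empty = new XfrmStub(null, null); // models <a:xfrm/>
        XfrmStub full = new XfrmStub(new PointStub(12700, 25400),
                                     new PointStub(127000, 63500));
        System.out.println(guardedAnchor(empty));
        System.out.println(guardedAnchor(full));
    }
}

// Hypothetical stand-in for a point or size measured in EMU.
final class PointStub {
    final long x;
    final long y;

    PointStub(long x, long y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical stand-in for the xfrm bean; null fields model missing child elements.
final class XfrmStub {
    private final PointStub off;
    private final PointStub ext;

    XfrmStub(PointStub off, PointStub ext) {
        this.off = off;
        this.ext = ext;
    }

    boolean isSetOff() { return off != null; }
    boolean isSetExt() { return ext != null; }
    PointStub getOff() { return off; }
    PointStub getExt() { return ext; }
}
```

The guard sits in one place, so the presence check does not have to be repeated by every caller that only needs the final rectangle.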

I could not reproduce the problem from an editor, only by manually inserting the following shape into slide1.xml:

<p:sp>
  <p:nvSpPr>
    <p:cNvPr id="119" name="Invalid shape"/>
    <p:cNvSpPr/>
    <p:nvPr/>
  </p:nvSpPr>
  <p:spPr>
    <a:xfrm/>
    <a:custGeom>
      <a:avLst/>
      <a:gdLst/>
      <a:ahLst/>
      <a:cxnLst/>
      <a:rect l="l" t="t" r="r" b="b"/>
      <a:pathLst>
        <a:path/>
      </a:pathLst>
    </a:custGeom>
    <a:solidFill/>
    <a:ln>
      <a:solidFill/>
      <a:prstDash/>
    </a:ln>
  </p:spPr>
  <p:txBody>
    <a:bodyPr rtlCol="0" anchor="ctr"/>
    <a:lstStyle/>
    <a:p>
      <a:pPr algn="ctr"/>
      <a:endParaRPr lang="en-US" altLang="zh-CN"/>
    </a:p>
  </p:txBody>
</p:sp>

Is this valid at all? It seems that other tools can handle it.
Is there a better solution than explicitly checking for the presence of offset everywhere where we use it?
Comment 2 Andreas Beeker 2019-07-03 20:12:19 UTC
(In reply to petoalbert32 from comment #1)
> Is this valid at all? It seems that other tools can handle it.
I thought it was invalid, but schema validation against ECMA-376 (5th edition) passed. Be aware that we still use the 1st edition (see also #56205).

> Is there a better solution than explicitly checking for the presence of
> offset everywhere where we use it?
If you access the XmlBeans directly -> nope
If you use the POI API, I'd return null as the anchor; I think an empty anchor doesn't make much sense. Furthermore, I would add your test file to our corpus, see where it crashes, and fix those places.
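
If the API returns null for such a shape, callers need to tolerate a missing anchor. A rough sketch of what that could look like on the caller side, assuming anchors arrive as java.awt.geom.Rectangle2D values; the class and method names here are hypothetical and not part of POI:

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;

final class NullAnchorCallerSketch {
    // Union of all non-null anchors, or null if no shape has an anchor.
    static Rectangle2D unionOfAnchors(List<Rectangle2D> anchors) {
        Rectangle2D bounds = null;
        for (Rectangle2D anchor : anchors) {
            if (anchor == null) {
                continue; // shape without position information, e.g. an empty <a:xfrm/>
            }
            if (bounds == null) {
                bounds = (Rectangle2D) anchor.clone(); // copy so the input is not mutated
            } else {
                bounds.add(anchor); // grows bounds to the union of both rectangles
            }
        }
        return bounds;
    }

    public static void main(String[] args) {
        // Arrays.asList is used because it permits null elements.
        List<Rectangle2D> anchors = Arrays.asList(
                new Rectangle2D.Double(0, 0, 10, 10),
                null,
                new Rectangle2D.Double(5, 5, 10, 10));
        System.out.println(unionOfAnchors(anchors));
    }
}
```

Skipping the shape is only one option; a caller could equally log it or treat it as an error, depending on how such a shape should be interpreted.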
Comment 3 petoalbert32 2019-07-24 11:52:59 UTC
Created attachment 36681
Proposed one-liner change in XSLFSimpleShape

Fixes the NullPointerException in getAnchor when called with the attached pptx file. I have run TestAllFiles with the file, and this was the only place where the exception occurred.
Comment 4 Andreas Beeker 2019-09-11 22:02:48 UTC
Applied via r1866810