Bug 66049

Summary: removing last remaining textparagraph results in corrupt file
Product: POI
Reporter: bugzilla-apache
Component: XSLF
Assignee: POI Developers List <dev>
Status: NEW ---    
Severity: normal    
Priority: P2    
Version: 5.2.2-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description bugzilla-apache 2022-05-04 11:56:38 UTC
When the last remaining text paragraph of a text box is removed, the file written afterwards is corrupt.

@Test
  void removeLastRemainingTextParagraph() throws IOException {
    final XMLSlideShow ppt = new XMLSlideShow();

    final XSLFSlide slide = ppt.createSlide();

    final XSLFTextBox textBox = slide.createTextBox();
    textBox.setAnchor(new java.awt.Rectangle(0, 0, 100, 100));

    textBox.getTextParagraphs().get(0).getTextRuns().get(0).setText("lorem");
    textBox.addNewTextParagraph().addNewTextRun().setText("ipsum");

    assertEquals(2, textBox.getTextParagraphs().size());
    assertEquals("lorem\nipsum", textBox.getText());
    assertEquals("lorem", textBox.getTextParagraphs().get(0).getText());
    assertEquals("ipsum", textBox.getTextParagraphs().get(1).getText());

    textBox.removeTextParagraph(textBox.getTextParagraphs().get(1));

    assertEquals(1, textBox.getTextParagraphs().size());
    assertEquals("lorem", textBox.getText());
    assertEquals("lorem", textBox.getTextParagraphs().get(0).getText());

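    // Build paths with File so a tmpdir without a trailing separator still works.
    final String tmpDir = System.getProperty("java.io.tmpdir");
    String filepath = new java.io.File(tmpDir, "valid_file.pptx").getPath();

    // try-with-resources closes the stream even if write() throws.
    try (FileOutputStream fos = new FileOutputStream(filepath)) {
      ppt.write(fos);
    }

    System.out.println("file written to " + filepath);

    // Removing the only remaining paragraph leaves the text box with no paragraphs.
    textBox.removeTextParagraph(textBox.getTextParagraphs().get(0));

    assertEquals(0, textBox.getTextParagraphs().size());
    assertEquals("", textBox.getText());

    filepath = new java.io.File(tmpDir, "invalid_file.pptx").getPath();

    try (FileOutputStream fos = new FileOutputStream(filepath)) {
      ppt.write(fos);
    }

    System.out.println("file written to " + filepath);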
  }

It would be helpful if an exception were thrown explaining why the text paragraph cannot be removed. That would guarantee that a readable file is still produced, and users would not have to debug and analyze the output to find the cause of the problem.
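
A minimal sketch of the kind of guard meant here, written against a plain `List` so it does not depend on POI internals. `ParagraphGuard` and `removeKeepingOne` are hypothetical names and not part of POI. The error message reflects my assumption that the corruption comes from the text body ending up without any paragraph; that has not been confirmed.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper, not part of POI: refuses to remove the last remaining paragraph. */
public final class ParagraphGuard {
    private ParagraphGuard() {}

    /**
     * Removes {@code target} through {@code remover} unless it is the only
     * element left in {@code paragraphs}.
     *
     * @throws IllegalArgumentException if target is not in the list
     * @throws IllegalStateException if target is the last remaining element
     */
    public static <P> void removeKeepingOne(List<P> paragraphs, P target, Consumer<P> remover) {
        if (!paragraphs.contains(target)) {
            throw new IllegalArgumentException("paragraph does not belong to this text shape");
        }
        if (paragraphs.size() <= 1) {
            // Assumption: a text body without any paragraph is what makes the file unreadable.
            throw new IllegalStateException(
                "cannot remove the last remaining paragraph; clear its text runs instead");
        }
        remover.accept(target);
    }
}
```

With the test above this could be called as `ParagraphGuard.removeKeepingOne(textBox.getTextParagraphs(), paragraph, textBox::removeTextParagraph)`, so the second removal would fail fast instead of writing a corrupt file.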



By the way, removing the last remaining text run of a text paragraph does not corrupt the file:

@Test
  void removeLastRemainingTextRun() throws IOException {
    final XMLSlideShow ppt = new XMLSlideShow();

    final XSLFSlide slide = ppt.createSlide();

    final XSLFTextBox textBox = slide.createTextBox();
    textBox.setAnchor(new java.awt.Rectangle(0, 0, 100, 100));

    textBox.getTextParagraphs().get(0).getTextRuns().get(0).setText("lorem");
    textBox.getTextParagraphs().get(0).addNewTextRun().setText("ipsum");

    assertEquals(1, textBox.getTextParagraphs().size());
    assertEquals(2, textBox.getTextParagraphs().get(0).getTextRuns().size());
    assertEquals("loremipsum", textBox.getText());
    assertEquals("lorem", textBox.getTextParagraphs().get(0).getTextRuns().get(0).getRawText());
    assertEquals("ipsum", textBox.getTextParagraphs().get(0).getTextRuns().get(1).getRawText());

    textBox.getTextParagraphs().get(0).removeTextRun(textBox.getTextParagraphs().get(0).getTextRuns().get(1));

    assertEquals(1, textBox.getTextParagraphs().size());
    assertEquals(1, textBox.getTextParagraphs().get(0).getTextRuns().size());
    assertEquals("lorem", textBox.getText());
    assertEquals("lorem", textBox.getTextParagraphs().get(0).getTextRuns().get(0).getRawText());

    // Build paths with File so a tmpdir without a trailing separator still works.
    final String tmpDir = System.getProperty("java.io.tmpdir");
    String filepath = new java.io.File(tmpDir, "valid_file.pptx").getPath();

    // try-with-resources closes the stream even if write() throws.
    try (FileOutputStream fos = new FileOutputStream(filepath)) {
      ppt.write(fos);
    }

    System.out.println("file written to " + filepath);

    // Removing the only remaining run keeps the (now empty) paragraph in place.
    textBox.getTextParagraphs().get(0).removeTextRun(textBox.getTextParagraphs().get(0).getTextRuns().get(0));

    assertEquals(1, textBox.getTextParagraphs().size());
    assertEquals(0, textBox.getTextParagraphs().get(0).getTextRuns().size());
    assertEquals("", textBox.getText());

    filepath = new java.io.File(tmpDir, "still_valid_file.pptx").getPath();

    try (FileOutputStream fos = new FileOutputStream(filepath)) {
      ppt.write(fos);
    }

    System.out.println("file written to " + filepath);
  }

Comment 1 PJ Fanning 2022-08-24 17:41:08 UTC
Many of the XSLF methods are no longer actively worked on. The XDDF classes were added more recently so that code common to docx and pptx support can be shared.

Could you try using `public XDDFTextBody getTextBody()` on the XSLFTextBox instead? You can add and remove XDDFTextParagraphs from an XDDFTextBody. I'm not guaranteeing that this works better than the equivalent XSLF classes but it is worth experimenting with.
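
A rough sketch of that experiment, assuming POI 5.2.x on the classpath. `XddfParagraphSketch` and `addThenRemove` are made-up names for illustration. Whether this path avoids the corruption is untested here, so the sketch deliberately never removes the only remaining paragraph.

```java
import java.io.IOException;

import org.apache.poi.xddf.usermodel.text.XDDFTextBody;
import org.apache.poi.xddf.usermodel.text.XDDFTextParagraph;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFTextBox;

/** Illustrative only: adds and removes a paragraph through the XDDF text body. */
public final class XddfParagraphSketch {
    private XddfParagraphSketch() {}

    /** Returns the number of paragraphs left in the text body after the round trip. */
    public static int addThenRemove() throws IOException {
        try (XMLSlideShow ppt = new XMLSlideShow()) {
            XSLFTextBox textBox = ppt.createSlide().createTextBox();

            // XDDF view of the same underlying text body as the XSLF text box.
            XDDFTextBody body = textBox.getTextBody();

            XDDFTextParagraph added = body.addNewParagraph();
            added.appendRegularRun("ipsum");

            // Remove the paragraph just added, but only while another one remains.
            int count = body.getParagraphs().size();
            if (count > 1) {
                body.removeParagraph(count - 1);
            }
            return body.getParagraphs().size();
        }
    }
}
```

Note that `removeParagraph` takes an index rather than a paragraph object, unlike `removeTextParagraph` on the XSLF side.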