Created attachment 33348
File from govdocs1

With 3.14-beta1-rc1, we're now getting 113 new exceptions out of ~7000 ppt files compared with 3.13-final.

java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.poi.hslf.usermodel.HSLFTable.getCell(HSLFTable.java:125)
    at org.apache.poi.hslf.extractor.PowerPointExtractor.extractTableText(PowerPointExtractor.java:327)
    at org.apache.poi.hslf.extractor.PowerPointExtractor.getText(PowerPointExtractor.java:258)
    at org.apache.poi.hslf.extractor.PowerPointExtractor.getText(PowerPointExtractor.java:173)

Again, apologies for not providing an actual unit test:

    InputStream is = new FileInputStream(new File(dir, fName));
    PowerPointExtractor ex = new PowerPointExtractor(is);
    System.out.println(ex.getText());
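For anyone who wants to reproduce the count across a corpus rather than a single file, the snippet above could be wrapped in a small harness along these lines. This is only a sketch, not the harness actually used for the numbers in this report: the class ExceptionTally, the Extractor interface and the key format are hypothetical names introduced here for illustration. It uses only the JDK, so the extraction call itself is passed in as a function.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of POI): groups extraction failures by
// exception class and the top stack frame, so two runs can be compared.
public class ExceptionTally {

    /** Hypothetical hook wrapping whatever extraction call is under test. */
    @FunctionalInterface
    public interface Extractor {
        String extract(Path file) throws Exception;
    }

    /** Runs the extractor on every file and counts failures per key. */
    public static Map<String, Integer> tally(List<Path> files, Extractor extractor) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Path file : files) {
            try {
                extractor.extract(file);
            } catch (Exception e) {
                // One bad file must not stop the run; record it and continue.
                counts.merge(key(e), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Builds a key such as "java.lang.Foo at some.Class.method". */
    static String key(Throwable t) {
        StackTraceElement[] trace = t.getStackTrace();
        String where = trace.length > 0
                ? trace[0].getClassName() + "." + trace[0].getMethodName()
                : "<no stack trace>";
        return t.getClass().getName() + " at " + where;
    }
}
```

With POI on the classpath, the extractor argument would presumably open each file as an InputStream and call new PowerPointExtractor(is).getText(), as in the snippet above. Running the same tally against 3.13-final and 3.14-beta1-rc1 and diffing the two maps should surface regressions such as the HSLFTable.getCell failure.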
Fixed in r1720035. There was another issue with recognizing tables; that is also included in the fix.