"commoncrawl/CC-MAIN-2021-31/75/c0/75c0f36adec322c1460ffac4406c5df4826c14c16eb16f7057511b5bc0c66397",1,False,"617","java.lang.NullPointerException at org.apache.poi.xssf.eventusermodel.XSSFReader$SheetIterator.next(XSSFReader.java:352) at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:155) from what I understand, the problem is here: PackagePart sheetPkg = sheetMap.get(sheetId); return sheetPkg.getInputStream(); the result of get() isn't null checked. You could throw a POIXMLException instead. I don't have the file, but maybe Tim has it.
add r1903972