Bug 65396

Summary: WorkbookFactory.create bugs on Tomcat
Product: POI
Reporter: monnomiznogoud
Component: POI Overall
Assignee: POI Developers List <dev>
Status: RESOLVED INFORMATIONPROVIDED    
Severity: blocker    
Priority: P2    
Version: 5.0.0-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description monnomiznogoud 2021-06-23 12:41:17 UTC
Hi,

WorkbookFactory.create (POI 5.0.0) behaves strangely when executed on a running Tomcat server (8.0.28, with jdk 8.0.251):

XSSFWorkbook wb = new XSSFWorkbook(new File("c:/temp/test.xlsx"));
// Finds and prints the sheet in a standalone Java application; prints null in Tomcat
System.out.println(wb.getSheet("mysheet"));
// Prints the sheet's name (mysheet) in a standalone Java application; prints null in Tomcat (although the sheet itself is found)
System.out.println(wb.getSheetAt(0).getSheetName());

I've spent a few hours trying to find out why, without success.

Any help?

I've reverted to my old POI3 code to open the file and find the sheet:

      OPCPackage p = OPCPackage.open(in);
      XSSFReader xssfReader = new XSSFReader(p);
      this.sharedStringsTable = xssfReader.getSharedStringsTable();
      this.stylesTable = xssfReader.getStylesTable(); 
      XSSFReader.SheetIterator iter = (XSSFReader.SheetIterator) xssfReader.getSheetsData();
      while (iter.hasNext())
      {
        InputStream stream = iter.next();
        if (this.queriedSheetName == null || this.queriedSheetName.equals(iter.getSheetName()))
        {
          this.nextDataType = xssfDataType.NUMBER;
          this.formatter = new DataFormatter();
          InputSource sheetSource = new InputSource(stream);
          SAXParserFactory saxFactory = SAXParserFactory.newInstance();
          SAXParser saxParser = saxFactory.newSAXParser();
          XMLReader sheetParser = saxParser.getXMLReader();
          sheetParser.setContentHandler(this);
          sheetParser.parse(sheetSource);
          break;
        }
      }
Comment 1 monnomiznogoud 2021-06-23 14:10:04 UTC
It seems to be a QName problem. In Tomcat the local name alone isn't enough; the namespace seems to be needed as well, or something like that.

Not only sheet names are affected, but also cell values. For example, a cell containing a string is read as numeric; I guess the number must be the string's hash or something.

To sum up, there seems to be a problem getting node attributes by name.
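
For illustration only, here is a minimal sketch of how namespace handling in the XML parser can change what a lookup by local name returns. It uses only the JDK's JAXP DOM API; the class name NamespaceAwareDemo, the sample XML and the urn:example namespace are made up for this example, and it is not confirmed that this is what happens inside POI on Tomcat.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of POI or of the reporter's code.
public class NamespaceAwareDemo {
    // Made-up sample document with a prefixed root element.
    static final String XML = "<x:sheet xmlns:x=\"urn:example\" name=\"mysheet\"/>";

    // Parses the sample and returns the root element's local name.
    static String localNameOfRoot(boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Element root = builder
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        return root.getLocalName();
    }

    public static void main(String[] args) throws Exception {
        // A namespace-aware parser reports "sheet" as the local name.
        System.out.println("namespace-aware:     " + localNameOfRoot(true));
        // A parser that is not namespace-aware may report no local name at all.
        System.out.println("not namespace-aware: " + localNameOfRoot(false));
    }
}
```

If the two lines printed differ, code that matches elements or attributes by local name will see different results depending on how the parser in use was configured or implemented.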
Comment 2 PJ Fanning 2021-06-23 14:14:50 UTC
Could you try adding xerces 2.12 to the classpath to see if that helps? It seems possible that the XML parser that you are relying on has non-standard behaviour.
Comment 3 monnomiznogoud 2021-06-23 15:33:24 UTC
(In reply to PJ Fanning from comment #2)
> Could you try adding xerces 2.12 to the classpath to see if that helps? It
> seems possible that the XML parser that you are relying on has non-standard
> behaviour.

Thanks, that solved the problem, although I do not know which parser was being used before. I guess it must have been Java's default parser.
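
To find out which parser is actually picked up, a small diagnostic along the following lines could be run from inside the web application (for example from a servlet or a startup listener). This is only a sketch using standard JAXP factory lookups; the class name WhichXmlParser and the describe helper are hypothetical names introduced for this example.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;

// Hypothetical diagnostic class; not part of POI.
public class WhichXmlParser {
    // Returns the implementation class name and, where available, the jar it was loaded from.
    static String describe(Object factory) {
        Class<?> type = factory.getClass();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "(no code source, typically the JDK's built-in classes)"
                : source.getLocation().toString();
        return type.getName() + " from " + location;
    }

    public static void main(String[] args) {
        // Each newInstance() call goes through the standard JAXP lookup,
        // so the result depends on the classpath of the running application.
        System.out.println("DOM:  " + describe(DocumentBuilderFactory.newInstance()));
        System.out.println("SAX:  " + describe(SAXParserFactory.newInstance()));
        System.out.println("StAX: " + describe(XMLInputFactory.newInstance()));
    }
}
```

Comparing the output in the standalone application with the output inside Tomcat should show whether a different XML implementation is being loaded in the server.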