Bug 61374 - Opening an XSSFWorkbook in CICS using an EBCDIC encoding
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: OPC
Version: 3.16-FINAL
Hardware: Other other
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Reported: 2017-08-02 16:05 UTC by slstpeter
Modified: 2017-09-13 20:04 UTC (History)



Description slstpeter 2017-08-02 16:05:13 UTC
We get the following error when attempting to open an Excel file from Java code running under CICS, in an environment where EBCDIC is the default encoding. We see this error only under CICS; it does not occur in ASCII environments.

org.apache.poi.openxml4j.exceptions.OpenXML4JRuntimeException: Package.init() : this exception should never happen, if you read this message please send a mail to the developers team.
	at org.apache.poi.openxml4j.opc.OPCPackage.init(OPCPackage.java:161)
	at org.apache.poi.openxml4j.opc.OPCPackage.<init>(OPCPackage.java:136)
	at org.apache.poi.openxml4j.opc.Package.<init>(Package.java:54)
	at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:81)
	at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:220)
	at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
	at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:186)
	at com.optum.cire.tops.rdp.ReferenceDataProvider.parseXlsRDP(Unknown Source)
	at com.optum.cire.tops.rdp.ReferenceDataProvider.<init>(Unknown Source)
	at com.optum.cire.tops.rdp.ReferenceDataProvider.getInstance(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:56)
	at java.lang.reflect.Method.invoke(Method.java:620)

Here is the latest version of the code we are using to open the file. 
private void parseXlsRDP(String fileName, HashSet<String> set) {
    try {
        Properties p = System.getProperties();
        p.put("file.encoding", "ISO8859-1");
        System.setProperties(p);
        InputStream inputStream = getClass().getResourceAsStream(fileName);

        XSSFWorkbook exWorkBook = new XSSFWorkbook(inputStream);
        XSSFSheet sheet = exWorkBook.getSheetAt(0);

        XSSFFont wbFont;
        wbFont = exWorkBook.createFont();
        wbFont.setCharSet(HSSFFont.ANSI_CHARSET);

        Row row;
        Cell cell;
        Cell cell1;
        int noOfColumns = 0;
        for (int rowIndex = 0; rowIndex <= sheet.getLastRowNum(); rowIndex++) {
            row = sheet.getRow(rowIndex);
            if (row != null) {
                noOfColumns = sheet.getRow(0).getLastCellNum();
                if (noOfColumns == 2) {
                    cell = row.getCell(0);
                    cell1 = row.getCell(1);
                    if (CellType.STRING == cell.getCellTypeEnum() && CellType.STRING == cell1.getCellTypeEnum())
                        set.add(new String(cell.getStringCellValue().getBytes(Charset.forName("UTF-8"))) + new String(cell1.getStringCellValue().getBytes(Charset.forName("UTF-8"))));
                    else if (CellType.NUMERIC == cell.getCellTypeEnum() && CellType.STRING == cell1.getCellTypeEnum())
                        set.add(new String(String.valueOf(cell.getNumericCellValue()).getBytes(Charset.forName("UTF-8"))) + new String(cell1.getStringCellValue().getBytes(Charset.forName("UTF-8"))));
                    else if (CellType.NUMERIC == cell.getCellTypeEnum() && CellType.NUMERIC == cell1.getCellTypeEnum())
                        set.add(new String(String.valueOf(cell.getNumericCellValue()).getBytes(Charset.forName("UTF-8"))) + new String(String.valueOf(cell1.getNumericCellValue()).getBytes(Charset.forName("UTF-8"))));
                    else if (CellType.STRING == cell.getCellTypeEnum() && CellType.NUMERIC == cell1.getCellTypeEnum())
                        set.add(new String(cell.getStringCellValue().getBytes(Charset.forName("UTF-8"))) + new String(String.valueOf(cell1.getNumericCellValue()).getBytes(Charset.forName("UTF-8"))));
                } else {
                    for (int colIndex = 0; colIndex < noOfColumns; colIndex++) {
                        cell = row.getCell(colIndex);
                        if (CellType.NUMERIC == cell.getCellTypeEnum())
                            set.add(new String(String.valueOf((long) cell.getNumericCellValue()).getBytes(Charset.forName("UTF-8"))));
                        else if (CellType.STRING == cell.getCellTypeEnum())
                            set.add(new String(cell.getStringCellValue().getBytes(Charset.forName("UTF-8"))));
                    }
                }
            }
        }
        inputStream.close();
        exWorkBook.close();
        //System.out.println(":: ReferenceDataProvider :: parseXlsRDP :: Name: "+fileName+" File size read: "+set.size());
    }
Comment 1 Javen O'Neal 2017-08-02 17:07:01 UTC
A few suggestions for your code:
> for (int rowIndex=0; rowIndex <= sheet.getLastRowNum(); rowIndex++) {
>     Row row = sheet.getRow(rowIndex);
>     if (row != null) {
>         ...
>     }
> }
can be replaced with
> for (final Row row : sheet) {
>     ...
> }
which will iterate over all non-null rows. That'll save you one indentation layer and improve readability.

You can close inputStream as soon as you're done opening the workbook. This may help with memory consumption. If you aren't reading from an embedded resource, it's even better to read straight from a File, which avoids buffering the contents of the file in memory prior to initializing the workbook.
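
One more thing to check in the posted code: `new String(bytes)` with no charset argument decodes using the platform default charset, so on a JVM whose default is EBCDIC, the UTF-8 bytes produced by `getBytes(...)` would be decoded as EBCDIC. This is separate from the exception in the stack trace. A minimal sketch of the effect using only the JDK follows; the class and method names are made up for illustration, and ISO-8859-1 stands in for a non-UTF-8 default because it is guaranteed to be available:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTripDemo {
    // Same encode-then-decode pattern as the posted code, but with the
    // decoding charset passed explicitly instead of taken from the platform.
    static String reDecode(String value, Charset decodeWith) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, decodeWith);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";

        // Decoding with the same charset used for encoding preserves the text.
        System.out.println(reDecode(original, StandardCharsets.UTF_8).equals(original));      // true

        // Decoding with a different charset does not.
        System.out.println(reDecode(original, StandardCharsets.ISO_8859_1).equals(original)); // false

        // What new String(bytes) effectively does; the result depends on the JVM.
        System.out.println(reDecode(original, Charset.defaultCharset()).equals(original));
    }
}
```

Since `getStringCellValue()` already returns a Java `String`, adding that value to the set directly would avoid the round trip altogether.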


As for the issue at hand: based on your stack trace, the exception is thrown from `new XSSFWorkbook(inputStream)`, so the rest of the example code should be irrelevant. If that is correct, a minimal test case that reproduces the issue would look something like this:

private void parseXlsRDP(String fileName) throws IOException {
    Properties p = System.getProperties();
    p.put("file.encoding","ISO8859-1");
    System.setProperties(p);

    InputStream inputStream = getClass().getResourceAsStream(fileName);
    XSSFWorkbook wb = new XSSFWorkbook(inputStream);
    inputStream.close();
    wb.close();
}
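
A note on the `file.encoding` lines: the Javadoc for `Charset.defaultCharset()` says the default charset is determined during virtual-machine startup, so changing the property from running code typically has no effect on it. The property is commonly passed as `-Dfile.encoding=...` on the JVM command line instead. Whether this holds on the JVM used under CICS is an assumption that would need verifying there. A small check using only the JDK (the class and method names are hypothetical):

```java
import java.nio.charset.Charset;

class FileEncodingCheck {
    // Returns true if Charset.defaultCharset() is the same before and after
    // the file.encoding system property is changed at runtime.
    static boolean defaultCharsetUnchangedAfterSet(String newEncoding) {
        Charset before = Charset.defaultCharset();
        String previous = System.getProperty("file.encoding");
        try {
            System.setProperty("file.encoding", newEncoding);
            return before.equals(Charset.defaultCharset());
        } finally {
            // Restore the original property value so the check has no side effects.
            if (previous != null) {
                System.setProperty("file.encoding", previous);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("unchanged after runtime set: "
                + defaultCharsetUnchangedAfterSet("ISO8859-1"));
    }
}
```

If this prints `true` on the affected system, the `file.encoding` lines can be dropped from the test case without changing its behaviour.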

I don't have access to a CICS system to be able to test this issue, nor do I have the file you used that produced the problem, so I'll need your help to resolve this issue.
1) Does this problem occur with certain files or all XLSX files on CICS? Could you try with a file that is known to work with POI, such as https://svn.apache.org/repos/asf/poi/trunk/test-data/spreadsheet/SampleSS.xlsx
2) Does this problem occur when reading non-Excel OOXML file formats (docx, pptx)? 
https://svn.apache.org/repos/asf/poi/trunk/test-data/document/SampleDoc.docx
https://svn.apache.org/repos/asf/poi/trunk/test-data/slideshow/SampleShow.pptx
3) Does my guess at the minimum reproducing code reproduce your issue? Is setting the file.encoding system property required to reproduce the issue?
4) If you open the XSSFWorkbook from a java.io.FileInputStream or java.io.File rather than a resource InputStream, does the problem still occur?
5) Does the problem occur if you open the workbook from WorkbookFactory.create()?
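
For questions 1 and 4, it may also help to check whether the bytes reaching POI are still a ZIP archive. An XLSX file is a ZIP-based OPC package, and a non-empty ZIP archive starts with the bytes `50 4B 03 04` ("PK" followed by 03 04). If the file had been transferred or packaged in text mode with an ASCII-to-EBCDIC conversion, those bytes would differ. That this happened here is only a guess; nothing in the report confirms it. A JDK-only sketch, with illustrative class and method names:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ZipSignatureCheck {
    // Reads the first four bytes of the stream and compares them with the
    // ZIP local file header signature 0x50 0x4B 0x03 0x04.
    static boolean startsWithZipSignature(InputStream in) {
        byte[] header = new byte[4];
        int read = 0;
        try {
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // stream ended before four bytes were available
                }
                read += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return read == header.length
                && header[0] == 0x50 && header[1] == 0x4B
                && header[2] == 0x03 && header[3] == 0x04;
    }

    public static void main(String[] args) {
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, 0x14};
        // "PK" as it would look after an ASCII-to-EBCDIC text conversion.
        byte[] converted = {(byte) 0xD7, (byte) 0xD2, 0x03, 0x04, 0x14};

        System.out.println(startsWithZipSignature(new ByteArrayInputStream(zipLike)));   // true
        System.out.println(startsWithZipSignature(new ByteArrayInputStream(converted))); // false
    }
}
```

On the affected system, the same method could be called with `getClass().getResourceAsStream(fileName)`. The check consumes the first four bytes, so a fresh stream should be opened afterwards for POI.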
Comment 2 Dominik Stadler 2017-09-13 20:04:58 UTC
There has been no response for over a month. We cannot analyse this with the information provided, and we don't have such a system available for testing, so there is not much we can do for now. Please reopen with more information if this is still an issue for you.