I am using the POI-3.10-FINAL-20140208 release. I have an agent server that picks up files from the file system and reads them. Below is the code I am using:

    OPCPackage opcPackage = OPCPackage.open(filename);
    XSSFReader reader = new XSSFReader(opcPackage);

From the reader I get the workbook XML as an InputStream and use a custom SAX parser to get the sheet names and the corresponding RIDs:

    InputStream workbookData = reader.getWorkbookData();

Using the RIDs, I get the individual sheet XMLs as InputStream objects and parse them with a custom SAX parser. I pass the shared strings table and styles table from the reader to the custom parser so it can use them while parsing the sheet data:

    DefaultSheetParser sheetParser = new DefaultSheetParser(reader.getSharedStringsTable(), reader.getStylesTable());
    InputStream sheet = reader.getSheet(relId);

All of this works fine, but then all of a sudden I start to get "IOException - Can't obtain the input stream from /xl/sharedStrings.xml" at the reader.getSharedStringsTable() call. The files open in Excel without any error. Most of the failing files are 400KB in size. Once I restart the agent server and reprocess the same files, the error does not occur. I checked the JVM memory settings: there is enough memory allocated (about 4GB) and I do not get any out-of-memory error.
Keeping too many file handles open, perhaps? This would be more likely to show up in a long-running server process. Make sure you close your resource streams when you're done with them and you don't have too many open simultaneously. Once you've read through your code, skim through the POI classes that you're using to see if they leak any file handles/resources. Eclipse or other tools might make the process of finding leaked resources easier.
I think the monitoring tools that ship with the JDK (JConsole, for example; HotSpot itself is the JVM, not a monitor) can show instantaneous resource usage (CPU, heap, permgen, and file handles) on running processes. Check that before you start looking for file handle leaks.
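If attaching a monitoring tool to the server is inconvenient, the same number can probably be read from inside the process. The following is only a sketch, assuming a Unix-like host and a JVM that exposes com.sun.management.UnixOperatingSystemMXBean; the class and method names (FdUsage, fileDescriptorUsage) are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class FdUsage {
    /**
     * Returns {open, max} file-descriptor counts for this JVM,
     * or null when the platform bean does not expose them (e.g. on Windows).
     */
    static long[] fileDescriptorUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = fileDescriptorUsage();
        if (usage == null) {
            System.out.println("File-descriptor counts are not available on this JVM");
        } else {
            System.out.println("open=" + usage[0] + " max=" + usage[1]);
        }
    }
}
```

Logging this value before and after each processed file would show whether the open count keeps climbing towards the limit, which is what a handle leak would look like.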
As well as ensuring you close your resources as Javen says, note that 3.10 is over 2 years old (the clue is the date in the filename!). You might want to try 3.14 or, better, wait a few more days and then try 3.15 beta 1.
I am closing the workbook by calling opcPackage.close() after I finish reading from it. Is there any other handle that needs to be closed?
I have updated the code to close every InputStream handle once the custom SAX parsers have finished parsing the workbook and sheet streams.
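For reference, the close-everything pattern could look roughly like the sketch below. It deliberately uses java.util.zip from the JDK as a stand-in rather than POI (an .xlsx file is a zip container), so the class and method names (PartReader, countPartBytes) are illustrative only. The same discipline would apply to opcPackage.close() and to the streams returned by reader.getWorkbookData() and reader.getSheet(relId). It assumes Java 7 or later for try-with-resources; on older JVMs a try/finally block does the same job.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class PartReader {
    /**
     * Counts the bytes of one part inside an .xlsx (zip) file.
     * Both the container and the part stream are closed on every exit path,
     * including when reading throws.
     */
    static long countPartBytes(String xlsxPath, String partName) throws IOException {
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            ZipEntry entry = zip.getEntry(partName);
            if (entry == null) {
                throw new IOException("Part not found: " + partName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                byte[] buffer = new byte[8192];
                long total = 0;
                int n;
                while ((n = in.read(buffer)) != -1) {
                    total += n;
                }
                return total;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PartReader <file.xlsx>");
            return;
        }
        // Zip entry names carry no leading slash, unlike the part name in the error message.
        System.out.println(countPartBytes(args[0], "xl/sharedStrings.xml"));
    }
}
```

The point of nesting the two try blocks is that an exception thrown while parsing one part still releases both the part stream and the underlying file handle.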
Also, on Unix you can look at the output of "ls /proc/<pid>/fd", using the pid of the server process, to see which files are currently open. This might indicate which part of your application is leaking file handles (if that is the actual problem here).

Anyway, I don't see an actual problem in POI here for now. We have extensive tests which verify that file handles are closed properly as long as the respective close() method is called. If there is still a problem, please update to a current version and retry. If it still does not work after that, please reopen this bug with the list of open files at the time when the application fails.
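If shell access to the server is awkward, the same list of open files could be captured from inside the JVM at the moment the IOException is caught. This is a minimal sketch, assuming a Linux host where /proc/self/fd exists; the class and method names (OpenFiles, listOpenFiles) are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class OpenFiles {
    /**
     * Lists the descriptors currently open in this process as "fd -> target".
     * Returns an empty list on systems without /proc/self/fd.
     */
    static List<String> listOpenFiles() throws IOException {
        Path fdDir = Paths.get("/proc/self/fd");
        List<String> targets = new ArrayList<>();
        if (!Files.isDirectory(fdDir)) {
            return targets;
        }
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    targets.add(fd.getFileName() + " -> " + Files.readSymbolicLink(fd));
                } catch (IOException e) {
                    // The descriptor was closed between listing and reading; skip it.
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        for (String line : listOpenFiles()) {
            System.out.println(line);
        }
    }
}
```

Calling listOpenFiles() in the catch block around reader.getSharedStringsTable() and writing the result to the log would give the list to attach when reopening the report.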