Bug 58420 - Document with many sheets may exceed FileHandle limit
Summary: Document with many sheets may exceed FileHandle limit
Status: RESOLVED WONTFIX
Alias: None
Product: POI
Classification: Unclassified
Component: SXSSF (show other bugs)
Version: 3.12-FINAL
Hardware: PC All
: P2 enhancement (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-09-16 10:44 UTC by Henning Moll
Modified: 2016-03-12 15:40 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Henning Moll 2015-09-16 10:44:40 UTC
SXSSFWorkbook creates a temporary file for each sheet. The FileHandle to this temporary file is closed during the call to workbook.write. This is no problem as long as the number of sheets is lower than the process' openfilehandle limit. Here's some code to reproduce:

    public static void main(String[] args) throws IOException {
        Workbook wb = new SXSSFWorkbook();
        for (int i = 0; i < 10000; i++) {
            Sheet sheet = wb.createSheet("sheet_" + i);

            //manipulate current sheet
        }

        FileOutputStream fileOut = new FileOutputStream("workbook.xlsx");
        wb.write(fileOut);
        fileOut.close();
    }

Depending on the current openfilehandel limit, this code will fail.

After manipulateion of the current sheet, it should not be neccessary to keep the filehandle open. I tried to close the handle manually by adding

            ((SXSSFSheet) sheet).getWorksheetXMLInputStream().close();

just before the end of the "sheet loop". Note that an side effect of getWorksheetXMLInputStream is to close the writer FileHandle.
Unfortunately the final call to 

        wb.write(fileOut);

tries to call flush again:

Exception in thread "main" java.io.IOException: Stream closed
	at java.io.BufferedWriter.ensureOpen(BufferedWriter.java:116)
	at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:126)
	at java.io.BufferedWriter.flush(BufferedWriter.java:253)
	at org.apache.poi.xssf.streaming.SheetDataWriter.close(SheetDataWriter.java:97)
	at org.apache.poi.xssf.streaming.SXSSFSheet.getWorksheetXMLInputStream(SXSSFSheet.java:86)
	at org.apache.poi.xssf.streaming.SXSSFWorkbook.injectData(SXSSFWorkbook.java:353)
	at org.apache.poi.xssf.streaming.SXSSFWorkbook.write(SXSSFWorkbook.java:891)
	at com.gfk.ait.convertserver.xml2excel.xlsx.Xml2XlsxConverter.main(Xml2XlsxConverter.java:56)

If i debug-eleminate the final calls to SheetDataWriter.close (as they have been closed before), the code works and the resulting xlsx document is valid.

A call getWorksheetXMLInputStream should try to close the _writer stream only once.
The interface should provide a clean method to release the filehandle.
Comment 1 Dominik Stadler 2015-09-28 19:03:26 UTC
Creating such a big workbook is probably a bit out of the normal use of POI, I'm not sure if we should invest time to add complexity only to handle such extreme cases when a simple raise of the ulimit will make it work for you as well. Is there a reason why you cannot simply set the max open files on your machine to a higher value?
Comment 2 Dominik Stadler 2016-03-12 15:40:33 UTC
As there was no response I am closing this for now, the required changes seem quite risky and only very few people will work with such big files. 

A simple workaround is to raise the maximum number of file-handles in these cases.