Bug 51158

Summary: Writing a workbook multiple times produces unreadable content
Product: POI
Reporter: Dominik <dominik.schuberth>
Component: XSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: major
CC: onealj
Priority: P2    
Version: 3.8-dev   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Bug Depends on: 49940    
Bug Blocks:    
Attachments: "sheet1.xml" file of the Excel workbook "test2.xlsx"
Example files used to reproduce InvocationTargetException

Description Dominik 2011-05-06 09:17:54 UTC
Created attachment 26962 [details]
"sheet1.xml" file of the Excel workbook "test2.xlsx"

When the write() method of an XSSFWorkbook is called multiple times, the Excel files created after the first call cannot be read by Excel. The reason is that the new content of some files contained in the .xlsx zip archive is appended to their old content instead of replacing it.


Example code:

-----------------------------------------------------

// imports
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFFont;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// example code
FileOutputStream fos = null;

try {

    // create a workbook
    final XSSFWorkbook workbook = new XSSFWorkbook();
    XSSFSheet sheet = workbook.createSheet("Test Sheet");
    XSSFRow row = sheet.createRow(2);
    XSSFCell cell = row.createCell(3);
    cell.setCellValue("test1");

    // write the first excel file
    fos = new FileOutputStream(new File("test1.xlsx"));
    workbook.write(fos);
    fos.flush();
    fos.close();

    // add a new cell to the sheet
    cell = row.createCell(4);
    cell.setCellValue("test2");

    // write the second excel file
    fos = new FileOutputStream(new File("test2.xlsx"));
    workbook.write(fos);
    fos.flush();
    fos.close();

} catch (IOException ex) {

    Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);

} finally {

    try {
        // fos is still null if the first FileOutputStream could not be opened
        if (fos != null) {
            fos.close();
        }
    } catch (IOException ex) {
        Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
    }

}

-----------------------------------------------------

The content of the "sheet1.xml" file ("xl/worksheets/sheet1.xml") that has been created in the first excel file is correct. In the second Excel file, an incorrect "sheet1.xml" file is created. It includes the content of the first "sheet1.xml" file and in addition appends the content that this file SHOULD have. (The "sheet1.xml" file of the second Excel file of this example is attached to this bug report.)

Further files that are affected in the same way in the posted example:
docProps/app.xml
xl/sharedStrings.xml
xl/styles.xml
xl/workbook.xml
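
One way to check which parts of a written archive are affected is to parse every XML entry in the .xlsx file and report those that are not a single well-formed document. The following is a minimal sketch using only the JDK; the class name XlsxPartCheck and the method findMalformedParts are hypothetical helpers for illustration and are not part of POI.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic helper, not part of POI.
class XlsxPartCheck {

    /** Returns the names of all .xml/.rels entries that do not parse as one well-formed XML document. */
    static List<String> findMalformedParts(InputStream xlsx)
            throws IOException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        List<String> broken = new ArrayList<String>();
        ZipInputStream zip = new ZipInputStream(xlsx);
        try {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !(name.endsWith(".xml") || name.endsWith(".rels"))) {
                    continue;
                }
                // Copy the current entry into memory; read() returns -1 at the end of the entry.
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    content.write(buffer, 0, n);
                }
                DocumentBuilder builder = factory.newDocumentBuilder();
                // DefaultHandler rethrows fatal errors without printing them to stderr.
                builder.setErrorHandler(new DefaultHandler());
                try {
                    builder.parse(new ByteArrayInputStream(content.toByteArray()));
                } catch (SAXException e) {
                    // For example, a second XML declaration or a second root element.
                    broken.add(name);
                }
            }
        } finally {
            zip.close();
        }
        return broken;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: XlsxPartCheck <file.xlsx>");
            return;
        }
        InputStream in = new FileInputStream(args[0]);
        try {
            for (String name : findMalformedParts(in)) {
                System.out.println("malformed: " + name);
            }
        } finally {
            in.close();
        }
    }
}
```

Run against the two files from the example code, such a check would be expected to flag the parts listed above for "test2.xlsx" only, assuming the duplicated content makes each of them ill-formed.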

I already searched for the cause of this bug but I could not find the source code that is producing the appended XML code.

---

Used libraries:
poi-3.8-beta2-20110408.jar
poi-ooxml-3.8-beta2-20110408.jar
poi-ooxml-schemas-3.8-beta2-20110408.jar

Used external libraries:
dom4j-1.6.1.jar
stax-api-1.0.1.jar
xmlbeans-2.3.0.jar

Comment 1 Nick Burch 2011-05-06 13:18:48 UTC
Bug #49940 is a similar issue around writing an XSSFWorkbook multiple times. That one looks like an XMLBeans bug, but more investigation is needed.

Comment 2 Yegor Kozlov 2011-06-18 10:06:31 UTC
The behavior depends on how you construct XSSFWorkbook:

If you open an existing file and write it twice, you will get an XmlValueDisconnectedException, as described in Bug 49940:

org.apache.xmlbeans.impl.values.XmlValueDisconnectedException
	at org.apache.xmlbeans.impl.values.XmlObjectBase.check_orphaned(XmlObjectBase.java:1213)
	at org.apache.xmlbeans.impl.values.XmlObjectBase.newCursor(XmlObjectBase.java:243)
	at org.apache.xmlbeans.impl.values.XmlComplexContentImpl.arraySetterHelper(XmlComplexContentImpl.java:1073)
	at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTFontsImpl.setFontArray(Unknown Source)
 
However, with a newly constructed XSSFWorkbook, writing twice completes without an exception, but the output is broken.

Something to fix in future versions of POI.

Yegor

Comment 3 Michael L. 2011-12-20 16:49:34 UTC
While generating an example for Bug #52349, I believe I stumbled upon this bug (albeit without having to explicitly call "write"). I'm re-posting that example here with details on the "new" issue I ran into:


Attached "test.java" simply walks through a given spreadsheet, outputs the contents (formulas, cached values, FormulaEvaluator results, etc.), and closes the workbook. For some spreadsheets this works fine. With others, however (specifically with the simple example spreadsheet I'm attaching, test.xlsx), the method runs fine the first time, but on subsequent calls it fails with an InvocationTargetException:


Caused by: java.io.IOException: error: Unexpected end of file after null
	at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:125)
	at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:102)
	... 9 more


If you then open the file in Excel, you get an "unreadable content" error, and if you allow Excel to repair the file it responds with: "Removed Part: /xl/sharedStrings.xml part with XML error.  (Strings) A document must contain exactly one root element. Line 1, column 0."

Confirmed under Java versions "1.6.0_24" and "1.7.0" using poi-3.8-beta4.

Comment 4 Michael L. 2011-12-20 16:54:39 UTC
Created attachment 28093 [details]
Example files used to reproduce InvocationTargetException

Comment 5 Dominik Stadler 2013-12-26 18:00:04 UTC
I have fixed the issue originally reported in this bug in r1553525, i.e. multiple saves should no longer cause double-written XML.

Michael L., if your problem still persists, please report it as a separate bug.