Overview If there is a newline (or perhaps a carriage return) in the text of a header cell in a table in an XSSFSheet, when the workbook is written out by POI and then read by Excel, Excel reports some unreadable content in the workbook. Steps to Reproduce 1) Use Excel to prepare a workbook that has a table in the first sheet. 2) Select a header cell in the table, and enter text containing a new-line in the cell. In Excel 2011 for Mac, this is done with the keystroke combination cmd-option-return. 3) Save the workbook. 4) Read and then write the workbook using the following POI code: final File testFile = new File(srcFile.getParentFile(), "test.xlsx"); final InputStream is = new FileInputStream(srcFile); final XSSFWorkbook wb = (XSSFWorkbook) WorkbookFactory.create(is); is.close(); try (OutputStream fileOut = new FileOutputStream(testFile)) { wb.write(fileOut); wb.close(); } 5) Open the workbook in Excel. You will get warning message about repairing unreadable content; when you examine the repair log file, you will get the following information: Repair Result to test 04744.xml Errors were detected in file ‘[path redacted]:test.xlsx' Repaired Records: Table from /xl/tables/table1.xml (List) This has been seen in version 3.17 of POI, on Macintosh running OS X 10.11.6, using Excel 2011 for Mac
Can you attach such a sample Excel file that can be used to reproduce the problem?
Created attachment 36573 [details] Testcase for bug 62444 This JUnit test class illustrates the problem. It writes 4 files to build/custom-reports-test (JUnit reports 4 successes). All contain a 4 column, 3 row table, and differ only in the data in the first column. I am also attaching all 4 file, as well as testLfInHeader.repaired.xlsx, which is a copy of testLfInHeader.xlsx after it has been repaired by Excel 2010. testNoLfInHeader.xlsx contains no newline characters in any headers. Excel opens this file silently. testLfInHeaderCell.xlsx contains a newline character in the first header cell, but no newline character in the XSSFTable column name. Excel 2010 opens this file silently, but the first header cell shows no sign of a newline. testLfInHeaderCell.xlsx contains no newline character in the first header cell, but has a newline character in the XSSFTable column name. Excel 2010 opens this file silently, but the first header cell shows no sign of a newline. testLfInHeaderCell.xlsx contains a newline character in the first header cell, andhas a newline character in the XSSFTable column name. Excel 2010 opens this file after warning that the table needs repair, and the first header cell is 2 lines high.
Created attachment 36574 [details] Zip archive containing results of testcase This zip archive contains the 4 files described in the test case comment, plus one of the files after it has been repaired by Excel 2010. The repair changes the text in the file xl/tables/table1.xml inside the xlsx archive. It adds several xmlns attributes to the table tag, and an xr3:uid tag to each tableColumn tag. The contents of table1.xml are listed below. Before the repair, table1.xml has the following content: <?xml version="1.0" encoding="ISO-8859-1"?> <table id="1" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" name="Table1" displayName="Table1" ref="C3:F5"> <tableColumns count="4"> <tableColumn id="1" name="Col 1"/> <tableColumn id="2" name="Col2"/> <tableColumn id="3" name="Col3"/> <tableColumn id="4" name="Col4"/> </tableColumns> <tableStyleInfo name="TableStyleMedium2" showLastColumn="false" showFirstColumn="false" showRowStripes="true" showColumnStripes="false"/> </table> After the repair, table1.xml contains the following text: <?xml version="1.0" encoding="ISO-8859-1"?> <table id="1" ref="C3:F5" displayName="Table1" name="Table1" xr:uid="{00000000-000C-0000-FFFF-FFFF00000000}" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" mc:Ignorable="xr xr3" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"> <tableColumns count="4"> <tableColumn id="1" name="Col 1" xr3:uid="{00000000-0010-0000-0000-000001000000}"/> <tableColumn id="2" name="Col2" xr3:uid="{00000000-0010-0000-0000-000002000000}"/> <tableColumn id="3" name="Col3" xr3:uid="{00000000-0010-0000-0000-000003000000}"/> <tableColumn id="4" name="Col4" xr3:uid="{00000000-0010-0000-0000-000004000000}"/> </tableColumns> <tableStyleInfo name="TableStyleMedium2" showColumnStripes="0" showRowStripes="1" showLastColumn="0" showFirstColumn="0"/> </table>