Bug 54523 - .xlsx files with more than 80000 rows cannot be opened in Excel (got "Excel found unreadable content" error)
Status: RESOLVED DUPLICATE of bug 57342
Alias: None
Product: POI
Classification: Unclassified
Component: SXSSF
Version: 3.10-dev
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Blocks: 57342
Reported: 2013-02-05 04:36 UTC by ppggff
Modified: 2016-11-01 20:04 UTC

Description ppggff 2013-02-05 04:36:27 UTC
I use a simple script to generate 81000 rows in a .xlsx file,
but the file cannot be opened in Excel 2010: it fails with an
"Excel found unreadable content" error. I also get a CRC error
when trying to decompress it with 7zip on Windows.

A .xlsx file with 80000 rows is OK.

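script:
	import java.io.FileOutputStream;
	import org.apache.poi.ss.usermodel.Cell;
	import org.apache.poi.ss.usermodel.Row;
	import org.apache.poi.ss.usermodel.Sheet;
	import org.apache.poi.xssf.streaming.SXSSFWorkbook;

	// Keep a window of 100 rows in memory; older rows are flushed to a temp file.
	SXSSFWorkbook wb = new SXSSFWorkbook(100);
	Sheet sh = wb.createSheet();

	// 81000 rows x 1000 string cells each
	for (int i = 0; i < 81000; i++) {
		Row row = sh.createRow(i);
		for (int j = 0; j < 1000; j++) {
			Cell cell = row.createCell(j);
			cell.setCellValue("dddd");
		}
		// Progress output every 100 rows
		if (i % 100 == 0) {
			System.out.println("x - " + i);
		}
	}

	// xlsx is the path of the output file
	FileOutputStream out = new FileOutputStream(xlsx);
	wb.write(out);
	out.close();
	// Delete the temporary files backing the workbook
	wb.dispose();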
Comment 1 ppggff 2013-02-18 11:09:22 UTC
This also occurs in versions 3.8 and 3.9.
Comment 2 Nick Burch 2013-02-18 11:14:06 UTC
Can you try on another machine / JVM? You shouldn't be getting a CRC error when decompressing, as that should all be handled automatically by the JVM. That tends to imply that something is broken in your setup.
Comment 3 ppggff 2013-02-19 02:17:03 UTC
Can you reproduce it?
I compiled and ran it with:
  javac.exe -cp poi-4.0-beta1/* Test.java
  java.exe -cp poi-4.0-beta1/*;poi-4.0-beta1/ooxml-lib/*;. Test

All of the following environments failed:

1. Ubuntu
  java version "1.6.0_24"
  OpenJDK Runtime Environment (IcedTea6 1.11.5) (6b24-1.11.5-0ubuntu1~12.04.1)
  OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

2. Windows 7
  java version "1.7.0_11"
  Java(TM) SE Runtime Environment (build 1.7.0_11-b21)
  Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)

3. Windows 7
  java version "1.6.0_27"
  Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
  Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
Comment 4 ppggff 2013-03-06 08:37:44 UTC
Can anyone reproduce it?
Or do you need any other information?
Thanks a lot.
Comment 5 Piotr Przybylski 2013-03-18 10:33:00 UTC
Windows 7
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b05)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)

Windows 7
java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) Client VM (build 23.7-b01, mixed mode, sharing)

The XLSX file with 81000 rows is 203 234 945 bytes. When extracted with 7-zip, xl/worksheets/sheet1.xml is 4 305 778 271 bytes (slightly more than 4 GB) and appears to be cut off at that point.

The XLSX file with 80000 rows is 200 729 481 bytes. When extracted with 7-zip, xl/worksheets/sheet1.xml is 4 252 483 271 bytes (almost 4 GB) and looks OK.

4 GB seems to be a magic size beyond which sheet1.xml gets broken (cut off).
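
The two sizes above sit on either side of 2^32 - 1 bytes, the largest value a classic (non-zip64) ZIP header can store in its 32-bit size fields. The following is only a sketch of that arithmetic: the class and method names are made up for illustration, and the linear extrapolation between the two measured sizes is an assumption, not something measured in this bug.

```java
class SheetSizeEstimate {
    // Largest size a classic (non-zip64) ZIP size field can hold: 2^32 - 1.
    static final long ZIP32_MAX = 0xFFFFFFFFL;

    // Sizes of xl/worksheets/sheet1.xml as reported in this comment.
    static final long SIZE_80000_ROWS = 4_252_483_271L;
    static final long SIZE_81000_ROWS = 4_305_778_271L;

    // True if the size can be represented in a 32-bit ZIP size field.
    static boolean fitsIn32Bits(long size) {
        return size <= ZIP32_MAX;
    }

    // Average growth of sheet1.xml per row between the two measurements.
    static long bytesPerRow() {
        return (SIZE_81000_ROWS - SIZE_80000_ROWS) / 1000;
    }

    // Assumption: growth stays linear between 80000 and 81000 rows.
    static long estimatedMaxRows() {
        long headroom = ZIP32_MAX - SIZE_80000_ROWS;
        return 80_000 + headroom / bytesPerRow();
    }

    public static void main(String[] args) {
        System.out.println("80000 rows fit: " + fitsIn32Bits(SIZE_80000_ROWS));
        System.out.println("81000 rows fit: " + fitsIn32Bits(SIZE_81000_ROWS));
        System.out.println("bytes per row:  " + bytesPerRow());
        System.out.println("estimated max rows: " + estimatedMaxRows());
    }
}
```

Under these assumptions the limit is crossed somewhere between 80000 and 81000 rows for this particular test data, which matches the observed behaviour.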
Comment 6 ppggff 2013-05-06 09:27:36 UTC
It seems JDK builds after 1.7 b55 (2009) support zip64.
(https://blogs.oracle.com/xuemingshen/entry/zip64_support_for_4g_zipfile)

But that didn't help with this problem.
An xlsx file generated by POI with the newest JDK (1.7.0_21) still didn't work.

The resulting xlsx file can be uncompressed successfully with 7zip.
I then built a new zip file from the extracted contents with 7zip,
and the new zip file worked with Office.
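
To compare the file written by POI with the one repacked by 7zip, it may help to list the uncompressed size that each archive records for every entry. This is a minimal sketch using only java.util.zip; the class name XlsxEntrySizes and the method entrySizes are hypothetical, and what ZipFile reports for an archive written without zip64 is not something this bug establishes.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class XlsxEntrySizes {
    // Largest size a classic (non-zip64) ZIP size field can hold.
    static final long ZIP32_MAX = 0xFFFFFFFFL;

    // Returns entry name -> uncompressed size as recorded in the archive
    // (-1 if the size is not known).
    static Map<String, Long> entrySizes(Path zipPath) throws IOException {
        Map<String, Long> sizes = new LinkedHashMap<>();
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                sizes.put(entry.getName(), entry.getSize());
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the .xlsx file to inspect
        for (Map.Entry<String, Long> e : entrySizes(Paths.get(args[0])).entrySet()) {
            String note = e.getValue() > ZIP32_MAX ? "  (needs zip64)" : "";
            System.out.println(e.getValue() + "  " + e.getKey() + note);
        }
    }
}
```

Running it against both the original and the repacked file shows whether the recorded size of xl/worksheets/sheet1.xml differs between the two.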
Comment 7 Dominik Stadler 2014-09-01 14:03:23 UTC
Both the "zip" application from Cygwin on Windows and zip64 (http://sourceforge.net/projects/zip64file/) produce zip files that Excel 2013 deems invalid. Other tools can read all those files fine, so Excel seems to be much pickier about something in the file format.

Choosing the option to not compress at all with the zip64 tool also produced a file that Excel rejected.
Comment 8 Aldrin Baroi 2016-09-22 00:23:15 UTC
Apache POI version: 3.14, 3.15-BETA2
Excel 2010

Number of columns in Excel Sheet:  208
Number of rows in Excel Sheet:  410,000 or more

Excel displays "Excel found unreadable content in '...'. Do you want to recover the contents of this workbook? If you trust..." error message ONLY WHEN the number of rows exceeds 410,000 or so.
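
For comparison with the original report, the total cell counts of the two failing cases can be put side by side. This is only a rough sketch: the class name CellCounts is made up, and the cell contents behind this comment are not stated, so the counts suggest a similar order of magnitude rather than the same XML size.

```java
class CellCounts {
    // Total number of cells in a fully populated sheet.
    static long cells(long rows, long columns) {
        return rows * columns;
    }

    public static void main(String[] args) {
        // Original report: 81000 rows x 1000 columns
        System.out.println("description: " + cells(81_000, 1_000));
        // This comment: about 410,000 rows x 208 columns
        System.out.println("comment 8:   " + cells(410_000, 208));
    }
}
```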
Comment 9 Dominik Stadler 2016-11-01 20:04:28 UTC
I am closing this one as duplicate of 57342, both bugs describe the same issue.

*** This bug has been marked as a duplicate of bug 57342 ***