Bug 53130

Summary: [PATCH] SXSSF Shared Strings option support, to make generated xlsx files compatible with Google Docs or iPad
Product: POI Reporter: Duane J. May <djmay>
Component: XSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: brian.siim.andersen, kmitchener, l_alexandra2010, rahulmondal100, strangerdeepak_07, sumedh.inamdar
Priority: P2    
Version: 3.8-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: generated document does not work on Google Docs
same file opened in excel then saved, now works in Google docs
Patch to facilitate optional use of SharedStringsTable in SXSSFWorkbook
Patch to facilitate optional use of SharedStringsTable in SXSSFWorkbook with correction of minor mistake
New version with javadoc and test
updated spreadsheet howto

Description Duane J. May 2012-04-23 03:01:12 UTC
Created attachment 28655 [details]
generated document does not work on Google Docs

I have been using the XSSFWorkbook to create .xlsx files with various versions of 3.8.
I am having a problem with many of the files I am creating, in that I can not open them on Google Docs. Also when sending them to the iPad I get only some of the columns to appear. 

Excel can read the file, and if I save it once opened in excel the file opens fine in Google Docs or on the iPad. 

I have a test file that is created which does not work on Google Docs. If you open the file in Excel then save it it will work on Google Docs.
Comment 1 Duane J. May 2012-04-23 03:02:35 UTC
Created attachment 28656 [details]
same file opened in excel then saved, now works in Google docs
Comment 2 Alexandra Luca 2013-04-22 08:28:01 UTC
I upgraded POI to 3.9 version and this issue is still reproducible.
Comment 3 sumedh 2013-04-26 11:18:35 UTC
This is most likely happening because of shared string table (which SXSSF doesn't create, it uses inline strings).
Comment 4 Rahul mondal 2013-08-21 12:10:18 UTC
Whats the status of this issue?
Issue still exists with IPAD.
I can provide patch to use shared string table in SXSSFWorkbook.
Comment 5 Rahul mondal 2013-08-22 09:17:10 UTC
Created attachment 30750 [details]
Patch to facilitate optional use of SharedStringsTable in SXSSFWorkbook

Hi,
   To solve the problem with IPAD, I have modified SXSSFWorkbook that will provide an optional capacity to use SharedStringsTable instead of inline strings.

   So, that will provide developers an option to use SharedStringsTable with SXSSFWorkbook.

Thanks
Rahul Mondal
Comment 6 Darren Roberts 2013-08-22 14:10:53 UTC
Minor pedantic point but the argument in the SXSSFWorkbook constructors is misspelled - it should be useSharedStringsTable, not useSahredStringsTable.
Comment 7 Rahul mondal 2013-08-23 06:27:41 UTC
Created attachment 30753 [details]
Patch to facilitate optional use of SharedStringsTable in SXSSFWorkbook with correction of minor mistake

I have removed typo error and added a new patch.

Thanks
Rahul Mondal
Comment 8 Kirk Mitchener 2013-10-11 13:36:05 UTC
This patch has fixed our issues, allowing us to efficiently create large Excel files that can then be used on iPads and Google Docs. What's required to get this patch accepted into the next release?
Comment 9 Nick Burch 2013-10-11 13:50:58 UTC
(In reply to Kirk Mitchener from comment #8)
> This patch has fixed our issues, allowing us to efficiently create large
> Excel files that can then be used on iPads and Google Docs. What's required
> to get this patch accepted into the next release?

I think the main things still needed are:
 * JavaDoc tweak to explain pluses and minuses of shared strings table
 * Website patch to explain how to use it, and pluses/minuses
 * Unit tests for it (likely based on writing out with sxssf+shared strings, then reading back in XSSF to verify)
Comment 10 Martin Andersson 2013-11-09 09:44:00 UTC
Hi,

I need this patch too. I'm willing to dig in and help if needed.

What concerns me though is that the new contructor

SXSSFWorkbook(XSSFWorkbook workbook, int rowAccessWindowSize, boolean useSharedStringsTable)

conflicts with the old

SXSSFWorkbook(XSSFWorkbook workbook,
             int rowAccessWindowSize,
             boolean compressTmpFiles)

That's going to break things for users upgrading poi after this patch.

All these constructors are quite confusing. Why not add only one with all four options?
Comment 11 Yegor Kozlov 2013-11-09 11:01:01 UTC
> 
> All these constructors are quite confusing. Why not add only one with all
> four options?


We have to keep the old constructors to stay compatible with old versions. We can't remove them and keep just one constructor because POI users will have to re-compile the code. 

Yegor
Comment 12 Martin Andersson 2013-11-09 13:58:04 UTC
(In reply to Yegor Kozlov from comment #11)
> > 
> > All these constructors are quite confusing. Why not add only one with all
> > four options?
> 
> 
> We have to keep the old constructors to stay compatible with old versions.
> We can't remove them and keep just one constructor because POI users will
> have to re-compile the code. 
> 
> Yegor

That's my point exactly! I think the patch should _add_ _one_ constructor with all four options instead of messing with the current ones and thereby breaking the api.

Martin
Comment 13 Nick Burch 2013-11-09 20:40:29 UTC
As Yegor says, we should add new overloaded constructors/methods, not change existing ones. If need be, @deprecate the old one

In terms of next steps, we need a tweak to the patch to fix this, then the things I listed in comment #9
Comment 14 Martin Andersson 2013-11-11 08:16:11 UTC
Created attachment 31031 [details]
New version with javadoc and test

I have uploaded a new version of the patch now.

* Javadoc added
* Test added
* No fiddling with the old constructors

Updates for the site still remains. Maybe we can use the javadoc comment? Could you give me a few pointers on how to create a patch for the site?
Comment 15 Nick Burch 2013-11-11 09:47:56 UTC
The site lives in a different bit of SVN: https://svn.apache.org/repos/asf/poi/site

The site is written in a fairly simple kind of XML markup, and created using Forrest. Since you'd largely just be adding a new section, worst case is plain text or markdown would be fine. Normally the site is a little more high level than the javadocs, to get people started before they go read the latter
Comment 16 Martin Andersson 2013-11-11 12:27:44 UTC
Created attachment 31032 [details]
updated spreadsheet howto

Site patch added
Comment 17 Kirk Mitchener 2014-01-09 12:01:34 UTC
Hello again. Martin has updated with the required patches. Is there anything else needed before this can be included?
Comment 18 Andreas Beeker 2014-02-14 22:46:29 UTC
Thank you for the patch.
Applied with SVN rev r1568539
Comment 19 Val Schuman 2014-09-29 16:44:42 UTC
Has the fix been implemented in a version of POI? If so, which version?
Comment 20 Andreas Beeker 2014-09-30 15:52:25 UTC
> Has the fix been implemented in a version of POI? If so, which version?

For your convenience, always use the latest version ;)
... apart of it, the patch is contained in version 3.11-beta1 -
see http://poi.apache.org/changes.html