Bug 54750

Summary: [PATCH] Modify SXSSFWorkbook to allow modification of existing cells
Product: POI Reporter: omerhj <omerhj>
Component: SXSSF Assignee: POI Developers List <dev>
Status: NEEDINFO ---    
Severity: enhancement Keywords: PatchAvailable
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Contains a patch for SXSSFWorkbook and SXSSFSheet to allow template sheet cells to be modified.

Description omerhj 2013-03-25 15:30:00 UTC
Created attachment 30101 [details]
Contains a patch for SXSSFWorkbook and SXSSFSheet to allow template sheet cells to be modified.

The attached patch allows SXSSFWorkbook users to transparently access and modify rows and cells in a template workbook. Newly created rows are added through a sliding window as before. This removes the following limitations documented in SXSSFWorkbook:

     * Access initial cells and rows in the template. After constructing
       SXSSFWorkbook(XSSFWorkbook) all internal windows are empty and
       SXSSFSheet#getRow and SXSSFRow#getCell return null.

     * Override existing cells and rows. The API silently allows
       that but the output file is invalid and Excel cannot read it.


With this patch applied, SXSSF workbooks with existing cells that have been
modified can be opened in Excel 2007 and earlier without producing an error message like this:

"Excel found unreadable content in 'NewBook.xlsx'."

I created this patch back in January, but as of last week the affected files hadn't changed in the trunk. We've used it in production since then without any problems. There are no new test cases included with this patch. The existing tests pass (or passed, back in January).

Here's a description of the modifications made by the patch.

Modified file: src/ooxml/java/org/apache/poi/xssf/streaming/SXSSFSheet.java

Modified methods:

removeRow
  Removes the row from the template if it originated in the template worksheet.

getRow
  Returns the row from the template if the row number exists in the template
  worksheet.

getPhysicalNumberOfRows
  Includes the number of rows from the template in the total.

getFirstRowNum
  Returns the number of the first row of the template if that number is lower
  than the first row number of the sliding window (which should be the case
  unless the template is empty).

getLastRowNum
  Returns the number of the last row of the template unless the row number of
  the last row in the sliding window is higher.

rowIterator
  JavaDoc change only. This still only iterates over rows in the sliding window
  and not over rows in the template.

groupRow
  Also increases the group level of rows in the template, if applicable.
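
The row-lookup rules listed above can be sketched roughly as follows. This is a simplified, hypothetical model for illustration only, not the code from the attached patch: the class name TemplateWindowRows, the put* helpers, the use of plain strings as rows, and the -1 result for an empty sheet are all assumptions of the sketch, and it assumes a row number never exists in both the template and the sliding window.

```java
import java.util.TreeMap;

// Hypothetical model, not POI code: a "row" is just a String keyed by row number.
final class TemplateWindowRows {
    // Rows that already existed in the template worksheet.
    private final TreeMap<Integer, String> templateRows = new TreeMap<>();
    // Newly created rows still held in the sliding window.
    private final TreeMap<Integer, String> windowRows = new TreeMap<>();

    void putTemplateRow(int rowNum, String row) { templateRows.put(rowNum, row); }
    void putWindowRow(int rowNum, String row) { windowRows.put(rowNum, row); }

    // getRow: return the template row if the row number exists in the template.
    String getRow(int rowNum) {
        String row = templateRows.get(rowNum);
        return row != null ? row : windowRows.get(rowNum);
    }

    // removeRow: remove from the template if the row originated there.
    void removeRow(int rowNum) {
        if (templateRows.remove(rowNum) == null) {
            windowRows.remove(rowNum);
        }
    }

    // getPhysicalNumberOfRows: template rows count towards the total.
    int getPhysicalNumberOfRows() {
        return templateRows.size() + windowRows.size();
    }

    // getFirstRowNum: the lower of the two first row numbers (-1 if no rows; sketch assumption).
    int getFirstRowNum() {
        if (templateRows.isEmpty() && windowRows.isEmpty()) return -1;
        if (templateRows.isEmpty()) return windowRows.firstKey();
        if (windowRows.isEmpty()) return templateRows.firstKey();
        return Math.min(templateRows.firstKey(), windowRows.firstKey());
    }

    // getLastRowNum: the higher of the two last row numbers (-1 if no rows; sketch assumption).
    int getLastRowNum() {
        if (templateRows.isEmpty() && windowRows.isEmpty()) return -1;
        if (templateRows.isEmpty()) return windowRows.lastKey();
        if (windowRows.isEmpty()) return templateRows.lastKey();
        return Math.max(templateRows.lastKey(), windowRows.lastKey());
    }

    public static void main(String[] args) {
        TemplateWindowRows sheet = new TemplateWindowRows();
        sheet.putTemplateRow(0, "header");   // existed in the template
        sheet.putTemplateRow(1, "totals");   // existed in the template
        sheet.putWindowRow(2, "new data");   // created after opening
        System.out.println(sheet.getRow(0));                  // header
        System.out.println(sheet.getFirstRowNum());           // 0
        System.out.println(sheet.getLastRowNum());            // 2
        System.out.println(sheet.getPhysicalNumberOfRows());  // 3
    }
}
```

The sketch leaves out rowIterator and groupRow: per the description above, the former still only walks the sliding window, and the latter depends on row outline state that this model does not represent.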


Modified file: src/ooxml/java/org/apache/poi/xssf/streaming/SXSSFWorkbook

Removed a JavaDoc segment that describes the limitation that has been removed
by the patch to SXSSFSheet.

Thanks,
   Omer van der Horst Jansen
Comment 1 Rahul Jain 2015-01-26 11:29:08 UTC
Hi,

When will these changes be available in a stable POI version?


Thanks, 
Rahul
Comment 2 Javen O'Neal 2016-10-09 06:42:06 UTC
Ideally, additional functionality and bugfixes should be covered by unit tests to demonstrate that the new behavior is correct. This also reduces the chance of a regression in the future.

If someone is willing to rebase the patch to the trunk and write unit tests that cover the new functionality, I would be happy to review and commit these changes.
Comment 3 Loris Boiteux 2017-05-16 12:14:18 UTC
Hi,

I work for Thales Avionics and I'm using SXSSF Workbooks to manipulate very large sheets. This bug is really critical to me ;)
Could we have any information about a new release coming and including this fix?

Cheers,

Loris
Comment 4 Javen O'Neal 2017-05-17 17:41:34 UTC
POI is a volunteer project developed in people's spare time. If you have an interest in or a pressing need for a particular bug fix, the quickest way to resolution is to develop a patch yourself and submit it.

The patch from comment 0 cannot be applied to the trunk as is. If you have time, rebase the patch, add appropriate unit tests, and test your changes against the unit test suite to make sure the changes don't break another part of POI.

For bugs that affect software projects at my day job, I sometimes ask my manager if I can use company time to develop a fix. You might try this route if you don't want to sink your personal time into this.

We plan on releasing POI 3.17 beta 1 this summer, and 3.17 final will likely be Q4 2017 or Q1 2018, depending on developer availability and the volume of user contributions. As soon as a suitable patch for this bug is submitted and a developer has time to review it, the changes will be applied to trunk and included in the following release.