Bug 60737

Summary: XSSFSheetXMLHandler class's inside interface SheetContentsHandler should have an endSheet method
Product: POI Reporter: zakim <zakim2000>
Component: XSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: enhancement Keywords: PatchAvailable
Priority: P2    
Version: 3.15-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: The patch described by comment 0
patched XSSFEventBasedExcelExtractor
The patch corresponding to attachment 34759 against the code from 3.15-final

Description zakim 2017-02-16 08:20:46 UTC
interface SheetContentsHandler inside XSSFSheetXMLHandler class should have an endSheet method, which should be called from the :

public void endElement(String uri, String localName, String qName)
            throws SAXException {

method's "switch" modified as follows:
...
 } else if ("sheetData".equals(localName)) {
            // Handle any "missing" cells which had comments attached
            checkForEmptyCellComments(EmptyCellCommentsCheckType.END_OF_SHEET_DATA);
            output.endSheet();
        }
...

This is is useful when the sheet finished processing, and you need to do something with the sheet data yet available.
Comment 1 Javen O'Neal 2017-02-16 09:33:24 UTC
Created attachment 34758 [details]
The patch described by comment 0
Comment 2 Javen O'Neal 2017-02-16 09:37:12 UTC
src/ooxml/java/org/apache/poi/xssf/extractor/XSSFEventBasedExcelExtractor.java does not compile with the patch from comment 1.
The following error is given:
XSSFEventBasedExcelExtractor.SheetTextExtractor is not abstract and does not override abstract method endSheet() in SheetContentsHandler

Could you submit a tested patch that hopefully also includes an update to the documentation, a poi-examples file, or a unit test so that we understand how endSheet() is used and what it's supposed to do? The title of this bug comment 0 is a bit vague on the need to have this new feature.
Comment 3 Javen O'Neal 2017-02-16 09:37:54 UTC
(In reply to Javen O'Neal from comment #2)
> The title of this bug comment 0
correction: The title of this bug and comment 0
Comment 4 zakim 2017-02-16 14:48:23 UTC
Created attachment 34759 [details]
patched XSSFEventBasedExcelExtractor

The missing file patched.
Comment 5 zakim 2017-02-16 14:48:51 UTC
Attached the missing file.
Comment 6 Javen O'Neal 2017-02-17 04:29:59 UTC
(In reply to zakim from comment #4)
> Created attachment 34759 [details]
> patched XSSFEventBasedExcelExtractor
> 
> The missing file patched.
Please submit patches generated by "svn diff", "git-svn diff" or "ant -f patch.xml" [1]
Entire files are difficult to review and commit because other changes may be made to that file after you attached it.

Additionally, I am unclear how this contribution will help POI. Adding a method that does nothing may be a source of confusion for people reading or maintaining the code. If your intent is to have POI define a method that you plan on overriding in your own code, how do you plan on using that? Could someone else override this inner class without modifying POI source code?
If so, perhaps this change is worthy of example code or documentation so that others know how to override endSheet().

[1] https://poi.apache.org/guidelines.html#Submitting+Patches
Comment 7 zakim 2017-02-17 09:48:59 UTC
The file shows the changes added to the existing file, compared to poi version 3.15-FINAL as stated in the version field.
I havent checked out the entire project to send only the differences.
It was only a suggestion to make poi more usable which means in my opinion: 
a more usable framework should be easy to use, with less coding.
If you need to write ten classes using the poi framework then you better write the whole function without using poi.
Comment 8 Javen O'Neal 2017-02-17 10:26:31 UTC
Created attachment 34762 [details]
The patch corresponding to attachment 34759 [details] against the code from 3.15-final

(In reply to zakim from comment #7)
> The file shows the changes added to the existing file, compared to poi version > 3.15-FINAL as stated in the version field.
The version field is sometimes updated when an unfixed bug is still present in a later build, so it's better to explicitly state in a comment what the base rev for a file is. Even better is to submit a patch, which includes the rev if the patch was generated by Subversion.
It's probably a good idea to check out trunk anyways so that you can open the project in Eclipse or other IDE, make your changes, and know immediately whether there are compile-time errors with your changes.

> I haven't checked out the entire project to send only the differences.
You can perform a sparse svn checkout of POI if you don't want to wait the 10 minutes it takes to check out the trunk or one of the release tags. You could also save the original file and modified file and use any diffing program to generate a patch file (Linux/Mac: diff, Windows: TortoiseSVN, TortoiseGit)

> It was only a suggestion to make poi more usable which means in my opinion: 
> a more usable framework should be easy to use, with less coding.
> If you need to write ten classes using the poi framework then you better
> write the whole function without using poi.
This is the part that I am confused about. How does the addition of an empty endSheet() method make it easier for someone to use POI? Did I correctly receive your patch file? Was there really only an addition of 3 lines?

The only other references I see to endSheet within the POI source code and website is in BigGridDemo. Is this the inspiration for this bug?

I am genuinely interested in trying to make POI easier for people to use, but I'm not convinced that the contributions so far accomplish that. Am I missing some other files that have some useful behavior when the endSheet() event is reached?
Comment 9 zakim 2017-02-17 11:23:23 UTC
Never mind, forget about it.
It's not the BigGridDemo, I've never seen that demo.

You can simply press the Diff button near the Attachment on the bugzilla site and you can see the difference. I was referring to the officially relase in the maven repo.
If you release another version with 3.15.Final name, then you have a big problem and the diff won't help either, because your versions will be messed up.
If I knew that a suggestion is so hard to accomplish, I would never start this.
This is a feature request I've asked for, not a bug.
I've encountered this, when I was overwriting the following interface: XSSFSheetXMLHandler.SheetContentsHandler. I was saving the rows read from the xlsx file by group(1 save from multiple rows) and needed to save the last group at the end of the sheet without passing the data outside the overridden interface.
Comment 10 Dominik Stadler 2017-09-23 07:42:57 UTC
Applied via r1809371, thanks for the suggestion.