Bug 56023

Summary: [PATCH] Include cell comments in XSSFEventBasedExcelExtractor output
Product: POI Reporter: Shaun Kalley <shaun.kalley>
Component: XSSFAssignee: POI Developers List <dev>
Severity: enhancement Keywords: PatchAvailable
Priority: P2    
Version: 3.10-dev   
Target Milestone: ---   
Hardware: Macintosh   
OS: All   
Attachments: diff for this patch
test file to go in test-data/spreadsheet
diff for this patch (revised)
test file to go in test-data/spreadsheet
diff for this patch
updated diff for this patch as requested

Description Shaun Kalley 2014-01-17 05:06:59 UTC
Created attachment 31219 [details]
diff for this patch

This patch includes cell comments in the extracted text in the same format as for XSSFExcelExtractor.
Comment 1 Shaun Kalley 2014-01-17 05:07:55 UTC
Created attachment 31220 [details]
test file to go in test-data/spreadsheet
Comment 2 Shaun Kalley 2014-01-17 17:43:14 UTC
Created attachment 31225 [details]
diff for this patch (revised)

I had made an addition to XSSFReader to get the CommentsTable which turned out to be unnecessary as it was already available through the SheetIterator.
Comment 3 Shaun Kalley 2014-01-20 21:25:17 UTC
Created attachment 31243 [details]
test file to go in test-data/spreadsheet
Comment 4 Shaun Kalley 2014-01-20 21:26:29 UTC
Created attachment 31244 [details]
diff for this patch

This diff is built upon the diffs for https://issues.apache.org/bugzilla/show_bug.cgi?id=56011 and https://issues.apache.org/bugzilla/show_bug.cgi?id=56022 as the code overlaps.
Comment 5 Shaun Kalley 2014-01-20 21:29:45 UTC
This patch adds functionality to include comments in XSSFEventBasedExcelExtractor's output.  It goes beyond parity with XSSFExcelExtractor in that it (1) included comments from otherwise-empty cells, and (2) doesn't duplicate the author's name in the output.

This patch also includes a fix to the broken equals() method of CellReference as well as adding hashCode().
Comment 6 Nick Burch 2014-03-09 09:53:32 UTC
I've had a look at the patch, and it looks good to me, thanks for that!

However... when I try to apply it to trunk, it doesn't go on cleanly. Any chance you could bring your working copy up to trunk, then let us have an updated version to apply?

(I've applied the CellReference fix in r1575683)
Comment 7 Shaun Kalley 2014-03-10 17:59:19 UTC
Thanks for your effort on this!  I've generated an updated diff based on the current trunk.
Comment 8 Shaun Kalley 2014-03-10 18:00:44 UTC
Created attachment 31378 [details]
updated diff for this patch as requested
Comment 9 Nick Burch 2014-07-24 20:14:51 UTC
Applied (with some changes) in r1613266.

I'm a little worried about backwards compatibility though - anyone who currently implements SheetContentsHandler will suddenly find their code won't compile / run on 3.11

What do you think about us adding a second interface for people who want comments, to maintain backwards compatibility? 

(We need to decide ASAP, before 3.11 beta 1 gets released)
Comment 10 Nick Burch 2014-07-30 07:44:03 UTC
In r1614576 I've tweaked the changes to XSSFSheetXMLHandler.SheetContentsHandler, to require fewer backwards incompatible changes for those upgrading

Given that, I think we can close this now. Thanks for the contribution!