The current Sheet interface allows the retrieval of comments only by location (row,col). In XSLX files, it is easily possible to have a comment reference a cell that is not physically present in the file. Iterating over all rows and columns as reported by get{First,Last}RowNum and get{First,Last}CellNum with hence miss comments that reference positions in "null" rows as well as positions to the left/right of physically present cells in a given row. As discussed on poi-dev[1], I propose adding the following method to Sheet: /** * Returns all cell addresses of this sheet * having one or more comments associated. * @return A set of cell references (in no particular order). */ Set<String> getAllCellCommentRefs(); [1] http://mail-archives.apache.org/mod_mbox/poi-dev/201509.mbox/ajax/%3C55E84053.4080002%40erven.at%3E
Created attachment 33098 [details] Suggested patch Suggested patch with implementation for HSSF, XSSF and SXSSF.
I think a Set<CellAddress> would be better than Set<String> since it eliminates the burden of string parsing in client code. Unfortunately, the class CellAddress doesn't currently exist in POI. You don't want CellReference, since that has a concept of relative/absolute that doesn't apply here. You don't want CellRangeAddress since that forms an area defined by the top left and lower left cells. Taking a step back, can you get away with returning an array/list/iterator on the CellComment objects themselves instead of the anchors? I'm guessing the CellComment objects would be more useful to most people. If you need both, consider some kind of Map<CellAddress, CellComment>. If you end up writing a CellAddress class, there are a quite a few classes that could be updated with this higher-level data structure.
Created attachment 33143 [details] Suggested patch, version 2 Javen, thank you for your suggestions. I'll attach a new patch that includes a new class org.apache.poi.ss.util.CellAddress and now suggests a Sheet.getCellComments() with return type Map<CellAdress, Comment> method. Internally, XSSF's CommentsTable still uses String cell references (addresses, in fact) to manage the comments; on first sight, changing that is possible, but will bloat the patch. Also, I noticed that various XSSF implementation use CellReference(row,col) as a CellAddress replacement. I guess that's the places you meant that would also benefit from a switch to CellAddress ?
Created attachment 33144 [details] Suggested patch, version 2, with CellAdress extended usage Here's another version of the patch. XSSF's CommentsTable now also internally uses CellAddress objects to manage the comments, and hence a bunch of locations that previously used "String ref" were converted to "CellAdress ref".
Hi, is there any feedback/thoughts on the patches? Thanks! -hannes
Added in r1715743. Docs updated r1715746.
Disregard comment 6.
Updated CommentsTable in r1715784. Updated XSSFSheetXMLHandler in r1715794. Updated remainder of patch in r1715839, plus a unit test for Sheet.getCellComments. Docs and quick-guide updated in r1715840 and r1715842. Hannes, thanks for your contribution to the POI project! These changes should be available in the upcoming POI 3.14beta1 release.