Bug 58818 - Cell comment not readed(return null)
Summary: Cell comment not readed(return null)
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF (show other bugs)
Version: 3.13-FINAL
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2016-01-07 06:35 UTC by Savchenko Vitaliy
Modified: 2016-04-12 16:17 UTC (History)
0 users

template for process (20.00 KB, application/vnd.ms-excel)
2016-01-07 06:35 UTC, Savchenko Vitaliy
fix EscherAggregate.createAggregate (651 bytes, patch)
2016-02-01 04:50 UTC, Javen O'Neal
Details | Diff
failed junit (1.35 KB, text/plain)
2016-02-01 04:53 UTC, Javen O'Neal

Note You need to log in before you can comment on or make changes to this bug.
Description Savchenko Vitaliy 2016-01-07 06:35:28 UTC
Created attachment 33418 [details]
template for process

File created by MS Excel 2007.
I can't to read comment at A1.

System.out.println(wb.getSheetAt(0).getCellComment(0, 0)); => null
System.out.println(wb.getSheetAt(0).getRow(0).getCell(0).getCellComment()); =>null

Sometimes, after edit and resave the file - cell comment read correctly.
Comment 1 Javen O'Neal 2016-01-31 11:12:49 UTC
I tried the following using POI 3.14 beta 1

InputStream fis = FileInputStream("bug58818.xls");
Workbook wb = HSSFWorkbook(fis);
Sheet sh = wb.getSheetAt(0);
sh.getCellComments() -> returns empty Map
sh.getRow(0).getCell(0).getCellComment(); -> returns null
sh.getCellComment(0, 0); -> returns null

Opening the file in Excel 2013, I found a comment in cell A1 with the text `jx:area(lastCell = "C25")\n`. I was able to reproduce your problem and this is a valid bug.

The problem is that the comment doesn't have a position (HSSFComment.hasPosition returns false).

I modified HSSFSheet.findCellCommentLocations to print comment.getString().getString() on each HSSFComment it finds in the drawing patriarch, and then POI found the cell comment. Disabling the check for hasPosition() will throw a NPE at comment.getRow() or comment.getColumn().

private void findCellCommentLocations(HSSFShapeContainer container, Map<CellAddress, HSSFComment> locations) {
     for (Object object : container.getChildren()) {
         HSSFShape shape = (HSSFShape) object;
         if (shape instanceof HSSFComment) {
             // recursively search down the drawing patriarch tree
             findCellCommentLocations((HSSFShapeGroup) shape, locations);
         if (shape instanceof HSSFComment) {
             HSSFComment comment = (HSSFComment) shape;
+            System.out.println(comment.getString().getString());
             if (comment.hasPosition()) {
                 locations.put(new CellAddress(comment.getRow(), comment.getColumn()), comment);
+            else {
+                // found unanchored comment. How should this be handled?
+                // does comment contain any information that might be relevant here?
+                locations.put(CellAddress.A1, comment);
+            }

HSSFComment.hasPosition() is false here because the HSSFComment object (created with the constructor HSSFComment(EscherContainerRecord, ObjectRecord, TextObjectRecord, NoteRecord)) sets NoteRecord as null. hasPosition and getRow/getColumn rely on this NoteRecord to identify the location.

There are two likely explanations here:
* data for creating a NoteRecord exists, but the NoteRecord object is not passed to the constructor of HSSFComment
* the Excel file does not have a NoteRecord for some cell comments. When this is the case, Excel uses A1 as the default, implied location of the comment. For behavior parity, POI would need to default cell comments to A1 if no position is specified.

HSSFComment(EscherContainerRecord spContainer, ObjRecord objRecord, TextObjectRecord textObjectRecord, NoteRecord _note) 

EscherContainerRecord spContainer
org.apache.poi.ddf.EscherContainerRecord (SpContainer):
  isContainer: true
  version: 0x000F
  instance: 0x0000
  recordId: 0xF004
  numchildren: 5
   Child 0:
      RecordId: 0xF00A
      Version: 0x0002
      ShapeType: 0x00CA
      ShapeId: 1027
      Flags: HAVEANCHOR|HASSHAPETYPE (0x00000A00)
   Child 1:
      isContainer: false
      version: 0x0003
      instance: 0x000E
      recordId: 0xF00B
      numchildren: 0
        propNum: 128, RAW: 0x0080, propName: text.textid, complex: false, blipId: false, value: 62207824 (0x03B53750)
        propNum: 133, RAW: 0x0085, propName: text.wraptext, complex: false, blipId: false, value: 1 (0x00000001)
        propNum: 139, RAW: 0x008B, propName: text.bidir, complex: false, blipId: false, value: 2 (0x00000002)
        propNum: 191, RAW: 0x00BF, propName: text.sizetexttofitshape, complex: false, blipId: false, value: 655368 (0x000A0008)
        propNum: 344, RAW: 0x0158, propName: unknown, complex: false, blipId: false, value: 0 (0x00000000)
        propNum: 385, RAW: 0x0181, propName: fill.fillcolor, complex: false, blipId: false, value: 14811135 (0x00E1FFFF)
        propNum: 387, RAW: 0x0183, propName: fill.fillbackcolor, complex: false, blipId: false, value: 14811135 (0x00E1FFFF)
        propNum: 389, RAW: 0x0185, propName: fill.crmod, complex: false, blipId: false, value: 268435700 (0x100000F4)
        propNum: 447, RAW: 0x01BF, propName: fill.nofillhittest, complex: false, blipId: false, value: 1048592 (0x00100010)
        propNum: 451, RAW: 0x01C3, propName: linestyle.crmod, complex: false, blipId: false, value: 268435700 (0x100000F4)
        propNum: 513, RAW: 0x0201, propName: shadowstyle.color, complex: false, blipId: false, value: 0 (0x00000000)
        propNum: 515, RAW: 0x0203, propName: shadowstyle.crmod, complex: false, blipId: false, value: 268435700 (0x100000F4)
        propNum: 575, RAW: 0x023F, propName: shadowstyle.shadowobscured, complex: false, blipId: false, value: 196611 (0x00030003)
        propNum: 959, RAW: 0x03BF, propName: groupshape.print, complex: false, blipId: false, value: 131072 (0x00020000)
   Child 2:
      RecordId: 0xF010
      Version: 0x0000
      Instance: 0x0000
      Flag: 3
      Col1: 1
      DX1: 169
      Row1: 0
      DY1: 30
      Col2: 2
      DX2: 832
      Row2: 4
      DY2: 196
      Extra Data:
    No Data
   Child 3:
      RecordId: 0xF011
      Version: 0x0000
      Instance: 0x0000
      Extra Data:
    No Data
   Child 4:
      isContainer: false
      version: 0x0000
      instance: 0x0000
      recordId: 0xF00D
      numchildren: 0

    .objectType           = 0x0019 (25 )
    .objectId             = 0x00000003 (3 )
    .option               = 0x4011 (16401 )
         .locked                   = true
         .printable                = true
         .autofill                 = false
         .autoline                 = true
    .reserved1            = 0x03B53750 (62207824 )
    .reserved2            = 0x032FCC60 (53464160 )
    .reserved3            = 0x00000000 (0 )
  size     = 22
  reserved = [41, 68, 11, C0, EC, 21, 65, 46, 86, B6, E6, 6B, EF, 8E, 10, 2B, 00, 00, 10, 00, 00, 00]
[/ftNts ]

    .options        = 0x0212
         .isHorizontal = 1
         .isVertical   = 1
         .textLocked   = true
    .textOrientation= 0x0000
    .reserved4      = 0x0000
    .reserved5      = 0x0000
    .reserved6      = 0x0000
    .textLength     = 0x001A
    .reserved7      = 0x00000000
    .string = jx:area(lastCell = "C25")

    .textrun = 6
    .textrun = 24


This is the stack trace that identifies how the HSSFComment was created:
  File "<stdin>", line 1, in <module>
        at org.apache.poi.hssf.usermodel.HSSFComment.<init>(HSSFComment.java:62)
        at org.apache.poi.hssf.usermodel.HSSFShapeFactory.createShapeTree(HSSFShapeFactory.java:108)
        at org.apache.poi.hssf.usermodel.HSSFPatriarch.buildShapeTree(HSSFPatriarch.java:540)
        at org.apache.poi.hssf.usermodel.HSSFPatriarch.<init>(HSSFPatriarch.java:87)
        at org.apache.poi.hssf.usermodel.HSSFSheet.getPatriarch(HSSFSheet.java:1989)
        at org.apache.poi.hssf.usermodel.HSSFSheet.getDrawingPatriarch(HSSFSheet.java:1941)
        at org.apache.poi.hssf.usermodel.HSSFSheet.getCellComments(HSSFSheet.java:2348)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)

In HSSFShapeFactory.java line 108:
The _note argument to the HSSFComment constructor is null, which equals agg.getNoteRecordByObj(objRecord)). Either getNoteRecordByObj is misbehaving or there are duplicates in the `for (EscherRecord record : container.getChildRecords())` that are screwing up the lookup.

Savchenko, do you have any files where POI doesn't recognized a cell comment NOT located at A1?
Comment 2 Javen O'Neal 2016-02-01 04:50:50 UTC
Created attachment 33508 [details]
fix EscherAggregate.createAggregate

It seems like the the break statement in createAggregate [2] leaves some records unread. In attachment 33418 [details], the NoteRecord has object id 3, and is skipped by the break statement. Removing the break statement fixes the described error (HSSFSheet.getCellComments) but fails a unit test [1]. 

[1] src/testcases/org/apache/poi/hssf/model/TestDrawingAggregate.java
>        // any NoteRecords that follow the drawing block must be aggregated and and saved in the tailRec collection
>        while (loc < records.size()) {
>            if (sid(records, loc) == NoteRecord.sid) {
>                NoteRecord r = (NoteRecord) records.get(loc);
>                agg.addTailRecord(r);
>            }
>            else {
>                break;
>            }
>            loc++;
>        }
Comment 3 Javen O'Neal 2016-02-01 04:53:01 UTC
Created attachment 33509 [details]
failed junit
Comment 4 Savchenko Vitaliy 2016-02-02 03:10:08 UTC
> Savchenko, do you have any files where POI doesn't recognized a cell comment
> NOT located at A1?

No, I have not. 
I found no problems with the other cells
Comment 5 Savchenko Vitaliy 2016-04-12 16:17:29 UTC
Any news?