Created attachment 31505 [details] unit test If I insert more than 1023 comments in a sheet and after that save the workbook only the last 1023 comments are persisted. I attached a test to that inserts 1025 comments. After saving the comments for the first row and the row 1023 are not persisted.
Created attachment 31538 [details] Update unit test I have updated the unit test somewhat, but did not yet manage to find the actual problem. BiffViewer shows the items for the comments, so it seems like HSSF reading discards the comments somehow. It works for up to 1024 comments, but breaks with 1025!
I found the culprit in HSSFComment.setShapeId(int shapeId): However I am not sure why this is in there in the first place and what will happen if we remove this here, need to discuss on the dev-mailing list first... cod.setObjectId((short) (shapeId % 1024)); _note.setShapeId(shapeId % 1024);
I can only suggest digging out the relevant bit of [MS-XLS] for those record(s), and seeing if that lists any restrictions
*** Bug 56445 has been marked as a duplicate of this bug. ***
The same applies to HSSFShape.setShapeId(int shapeId) There is a cod.setObjectId((short) (shapeId%1024)); (that is called from HSSFComment.setShapeId(int shapeId) with super.setShapeId(shapeId);) Is that related with DrawingManager2 and the size of clusters?
*** Bug 55275 has been marked as a duplicate of this bug. ***
Some news or workaround available? Thanks
Workaround is to patch and compile POI yourself with the limitation to 1024 entries removed in HSSFComment and report here if it makes it work fine for you until we figure out if the MS spec mandates this limitation or if it is rather some sort of leftover from some old codebase.
Here some notes from the spec: * The id is stored in NoteRecord [2.4.179], which in this case contains a NoteSh [2.5.186] ** idObj (2 bytes): An ObjId that specifies the Obj record that specifies the comment text. *** id (2 bytes): An unsigned integer that specifies the value of the cmo.id field of an Obj in the same drawing. A value of 0 specifies that this ObjId does not reference an Obj. * The id is also stored in the CommonObjectDataSubRecord ** The spec seems to have this at FtCmo [2.5.143] *** id (2 bytes): An unsigned integer that specifies the identifier of this object. This object identifier is used by other types to refer to this object. The value of id MUST be unique among all Obj records within the Chart Sheet Substream ABNF, Macro Sheet Substream ABNF and Worksheet Substream ABNF. * Additionally ODRAW has OfficeArtFDGG [2.2.47] ** spidMax (4 bytes): An MSOSPID structure, as defined in section 2.1.2, specifying the current maximum shape identifier that is used in any drawing. This value MUST be less than 0x03FFD7FF. The only place that I could find that is limited to 1024 is the FileIdCluster(), but there are additional ones added when the last one gets to 1024, so I don't see any limitation of 1024 mandated by the spec and would therefore conclude that we can get rid of this.
With r1666505 I have removed the limitation of 1024 comments, the spec has a limit of 65535 as the counter is an unsigned short, we now check that this limit is not violated. Unit tests are adjusted accordingly, please report here if you find any regressions caused by this change.
So now the max number of comments is 65535 ? The excel can create comments more than 65535, why we limit it in 65535 ? Can we change the unsigned short ?
How to do it when the comments are more than 65535 ?